Patent application title: Alteration of Plant Embryo/Endosperm Size During Seed Development
Inventors:
Hajime Sakai (Newark, DE, US)
Nobuhiro Nagasawa (Newark, DE, US)
IPC8 Class: AA01H100FI
USPC Class:
800278
Class name: METHOD OF INTRODUCING A POLYNUCLEOTIDE MOLECULE INTO OR REARRANGEMENT OF GENETIC MATERIAL WITHIN A PLANT OR PLANT PART
Publication date: 08/27/2009
Patent application number: 20090217412
Abstract:
Isolated nucleic acid fragments and recombinant constructs comprising such
fragments useful for altering embryo/endosperm size during seed
development are disclosed along with a method of controlling
embryo/endosperm size during development in plants using such recombinant
constructs.
Claims:
1. An isolated polynucleotide comprising: (a) a nucleic acid sequence encoding a polypeptide involved in altering embryo/endosperm size during seed development, said polypeptide having at least 80% amino acid sequence identity, based on the Clustal V method of alignment, when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:37, 39, 41, 43, 45, 47, 49, 51, and 53; or (b) a nucleic acid sequence set forth in SEQ ID NO:25 wherein said sequence comprises at least one of the following modifications: (i) nucleotide 271 is a T residue instead of a C; (ii) nucleotide 110 is a T residue instead of a G; or (iii) nucleotide 75 is deleted; or (c) a nucleic acid sequence set forth in SEQ ID NO:34 wherein (i) nucleotides 4473 through 4829 correspond to a first exon, and (ii) nucleotides 5661 through 6110 correspond to a second exon, and further wherein the nucleotides of (c)(i) and/or (c)(ii) encode a polypeptide involved in altering embryo/endosperm size during seed development; or (d) a nucleic acid sequence set forth in SEQ ID NO:72; or (e) a full complement of (a), (b), (c), (d), or SEQ ID NO:34; or (f) all or part of a non-coding or coding region of the isolated polynucleotide comprising sequences of (a), (b), (c), (d), (e), or SEQ ID NO:34 for use in co-suppression or antisense suppression of endogenous nucleic acid sequences encoding polypeptides involved in altering embryo/endosperm size during seed development.
2. The isolated polynucleotide of claim 1 wherein the amino acid sequence identity is at least 85%.
3. The isolated polynucleotide of claim 1 wherein the amino acid sequence identity is at least 90%.
4. The isolated polynucleotide of claim 1 wherein the amino acid sequence identity is at least 95%.
5. The isolated polynucleotide of claim 1 wherein the amino acid sequence identity is 100%.
6. The isolated polynucleotide of claim 1 wherein the nucleotide sequence corresponds to any of the nucleotide sequences set forth in SEQ ID NOs:34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72.
7. A recombinant DNA construct comprising the isolated polynucleotide of any one of claims 1-6 operably linked to at least one regulatory sequence.
8. A plant comprising in its genome the recombinant DNA construct of claim 7.
9. Seeds and progeny thereof obtained from the plant of claim 8.
10. Oil obtained from the seeds of claim 9.
11. The plant of claim 8 wherein said plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.
12. Transformed plant tissue or plant cells comprising the recombinant DNA construct of claim 7.
13. The transformed plant tissue or plant cells of claim 12 wherein the plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.
14. A method of altering embryo/endosperm size during seed development in a plant comprising: (a) transforming plant cells or plant tissue with the recombinant DNA construct of claim 7; (b) regenerating transgenic plants from the transformed plant cells or plant tissue of (a); (c) obtaining seeds and progeny thereof from the transgenic plants of (b) having altered embryo/endosperm size based on a comparison of embryo/endosperm size of seeds obtained from non-transformed plants.
15. The method of claim 14 wherein said plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.
16. A method of mapping genetic variations related to controlling embryo/endosperm size and/or altering oil phenotype in plants comprising: (a) crossing two plant varieties; and (b) evaluating genetic variations with respect to (i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52 and 72; or (ii) a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.
17. The method of claim 16 wherein the plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.
18. A method of molecular breeding to control embryo/endosperm size and/or altering oil phenotype in plants comprising: (a) crossing two plant varieties; and (b) evaluating genetic variations with respect to (i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52 and 72; or (ii) a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 30, 32, 34, 36, 38, 40, 42, 44, and 46; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.
19. The plant of claim 18 wherein the plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.
Description:
[0001]This application claims the benefit of U.S. Provisional Application
No. 60/664,512, filed 23 Mar. 2005, the entire content of which is hereby
incorporated by reference.
FIELD OF THE INVENTION
[0002]The present invention is in the field of plant breeding and genetics and, in particular, relates to recombinant constructs useful for altering embryo/endosperm size during seed development.
BACKGROUND OF THE INVENTION
[0003]Elucidation of how the size of a developing embryo is genetically regulated is important because the final volume of endosperm as a storage organ of starch and proteins is affected by embryo size in cereal crops. Researchers have found that genes involved in embryo size contribute to the regulation of endosperm development. Investigation of these genes is important for agriculture because cereal endosperms are the staple diet in many countries.
[0004]Rice mutants having a normally differentiated shoot and radicle and either a reduced or an enlarged embryo when compared to wild type rice were identified in the early 1990s in plants obtained from the methyl-nitrosourea mutagenized Taichung 65 cultivar. Mutant plants displaying an enlarged embryo were designated giant embryo (ge) mutants while plants displaying a smaller embryo were designated reduced embryo (re) mutants (Kitano et al. 1993, Plant J. 3:607-610; Hong et al. 1995, Dev. Genet. 16:298-310).
[0005]The phenotypes of the three reduced embryo mutants were designated re1, re2, and re3 even though the gene(s) responsible for these phenotypes have not been characterized. A mutation in a different locus is responsible for each mutant phenotype. Phenotypic analysis of ge and re mutant plants led to the theory that embryo size may be determined by the interaction between embryo-specific genes and endosperm-specific genes regulating endosperm development (Hong et al. (1996) Development 122:2051-2058).
[0006]The reduced embryo size phenotype of re2 mutant plants is associated with the enlargement of the endosperm size without altering the overall seed size. This phenotype is potentially useful for improving cereal quality by increasing the amount of endosperm tissue, which is rich in starch and other nutrients. Moreover, the reduction of embryo size in seed has a potential benefit for some milling processes, where embryonic tissues are considered as waste, such as in the production of ethanol.
SUMMARY OF THE INVENTION
[0007]In a first embodiment, the invention concerns an isolated polynucleotide comprising:
[0008](a) a nucleic acid sequence encoding a polypeptide involved in altering embryo/endosperm size during seed development, said polypeptide having at least 80% amino acid sequence identity, based on the Clustal V method of alignment, when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:37, 39, 41, 43, 45, 47, 49, 51, and 53; or
[0009](b) a nucleic acid sequence set forth in SEQ ID NO:25 wherein said sequence comprises at least one of the following modifications:
[0010](i) nucleotide 271 is a T residue instead of a C;
[0011](ii) nucleotide 110 is a T residue instead of a G; or
[0012](iii) nucleotide 75 is deleted; or
[0013](c) a nucleic acid sequence set forth in SEQ ID NO:34 wherein
[0014](i) nucleotides 4473 through 4829 correspond to a first exon, and
[0015](ii) nucleotides 5661 through 6110 correspond to a second exon, and
[0016]further wherein the nucleotides of (c)(i) and/or (c)(ii) encode a polypeptide involved in altering embryo/endosperm size during seed development,
[0017](d) a nucleic acid sequence set forth in SEQ ID NO:34 or 72; or
[0018](e) the full complement of (a), (b), (c), (d), or SEQ ID NO:34; or
[0019](f) all or part of a non-coding or coding region of the isolated polynucleotide comprising sequences of (a), (b) or SEQ ID NO:34 for use in co-suppression or antisense suppression of endogenous nucleic acid sequences encoding polypeptides involved in altering embryo/endosperm size during seed development.
[0020]In a second embodiment, the invention concerns a recombinant DNA construct comprising the isolated polynucleotide of the invention operably linked to at least one regulatory sequence.
[0021]In a third embodiment, the invention concerns a plant comprising in its genome the recombinant DNA construct of the invention as well as any seeds obtained from such a plant and oil obtained from such seeds. Also of interest are transformed plant tissue or plant cells comprising the recombinant DNA construct of the invention.
[0022]In a fourth embodiment, the invention concerns a method of altering embryo/endosperm size during seed development in a plant comprising:
[0023](a) transforming plant cells or plant tissue with the recombinant DNA construct of the invention;
[0024](b) regenerating transgenic plants from the transformed plant cells or plant tissue of (a);
[0025](c) screening the transgenic plants of (b) for seeds having an altered embryo/endosperm size based on a comparison of embryo/endosperm size of seeds obtained from non-transformed plants.
[0026]In a fifth embodiment, the invention concerns a method of mapping genetic variations related to controlling embryo/endosperm size and/or altering oil phenotype in plants comprising:
[0027](a) crossing two plant varieties; and
[0028](b) evaluating genetic variations with respect to
[0029](i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or
[0030](ii) a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP (restriction fragment length polymorphism) analysis, SNP (single nucleotide polymorphism) analysis, and PCR-based analysis.
[0031]In a sixth embodiment, the invention concerns a method of molecular breeding to control embryo/endosperm size and/or altering oil phenotype in plants comprising:
[0032](a) crossing two plant varieties; and
[0033](b) evaluating genetic variations with respect to
[0034](i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or
[0035](ii) a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.
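The RFLP and PCR-based evaluations named in the mapping and molecular breeding embodiments rest on one idea: a sequence variant that creates or destroys a restriction site changes the fragment lengths observed after digestion. The following Python sketch illustrates that idea only. The two amplicon sequences are invented for illustration and are not taken from the Sequence Listing, and the function name `digest` is hypothetical. The one external fact assumed is that DraI recognizes TTTAAA and cuts between TTT and AAA.

```python
def digest(seq, site="TTTAAA", cut_offset=3):
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (3 for a blunt cut in the middle of a 6-bp site).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical amplicons from two parental varieties (toy data only).
parent_a = "GATCCGTACG" + "TTTAAA" + "CGGATTACCGGTAGC"  # site present
parent_b = "GATCCGTACG" + "TTCAAA" + "CGGATTACCGGTAGC"  # one-base change removes the site

frags_a = digest(parent_a)
frags_b = digest(parent_b)
print("parent A fragments:", frags_a)
print("parent B fragments:", frags_b)
```

In a cross, progeny showing only the cut pattern, only the uncut pattern, or both could then be scored as the corresponding genotypes at that marker.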
BRIEF DESCRIPTION OF THE FIGURES AND SEQUENCE LISTINGS
[0036]The invention can be more fully understood from the following detailed description and the accompanying drawings and Sequence Listing that form a part of this application.
[0037]FIG. 1A-D shows an alignment of the nucleotide sequences obtained for wild type RE2 (SEQ ID NO:25), and mutants re2-1 (SEQ ID NO:28), re2-2 (SEQ ID NO:30), and re2-3 (SEQ ID NO:32). Changes in the nucleotide sequence are indicated by a star below the alignment and by a box around the nucleotides at that position. Numbers at the left of the alignment indicate the nucleotide position.
[0038]FIG. 2 shows an alignment of the amino acid sequences obtained for polypeptides from wild type RE2 protein (SEQ ID NO:26), and re2-1 mutant protein (SEQ ID NO:29), and re2-2 mutant protein (SEQ ID NO:31). Changes in the amino acid sequence are indicated by a star below the alignment and by a box around the amino acids at that position. As seen in FIG. 2, mutant allele re2-1 had an isoleucine at amino acid 93 instead of the highly conserved threonine; mutant allele re2-2 had a phenylalanine instead of the conserved cysteine at amino acid 37. The deletion of a nucleotide at position 75 in mutant allele re2-3 gene produced a frame shift that results in a 127 amino acid polypeptide for the re2-3 mutant protein (set forth in SEQ ID NO:33) that is quite different than the one encoded by wild type RE2 gene or mutant genes re2-1 or re2-2. Numbers at the left of the alignment indicate the amino acid position.
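The relation between a nucleotide position in the open reading frame and the affected amino acid can be sketched in Python as follows. This is an illustrative sketch, not the applicants' analysis: the helper names and the 15-nucleotide toy ORF are hypothetical, and the actual codons of SEQ ID NO:25 are not reproduced here. It shows that nucleotide 110 falls in codon 37, in agreement with the cysteine-to-phenylalanine change described for re2-2, and that a single-base deletion shifts every downstream codon.

```python
def codon_of(nt_pos):
    """Map a 1-based ORF nucleotide position to (codon number, position within codon)."""
    index = nt_pos - 1
    return index // 3 + 1, index % 3 + 1


def codons(orf):
    """Split an ORF into complete codons, dropping any trailing partial codon."""
    usable = len(orf) - len(orf) % 3
    return [orf[i:i + 3] for i in range(0, usable, 3)]


def delete_base(orf, nt_pos):
    """Return the ORF with the base at 1-based position nt_pos removed."""
    return orf[:nt_pos - 1] + orf[nt_pos:]


substitution_site = codon_of(110)  # re2-2 substitution position
deletion_site = codon_of(75)       # re2-3 deletion position
print(substitution_site, deletion_site)

# Toy ORF (hypothetical): deleting one base changes the downstream reading frame.
toy_orf = "ATGGCTTGTGAAACC"
wild_codons = codons(toy_orf)
shifted_codons = codons(delete_base(toy_orf, 6))
print(wild_codons)
print(shifted_codons)
```

Under the standard genetic code, a G-to-T change at the second position of a cysteine codon (TGT or TGC) yields a phenylalanine codon (TTT or TTC).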
[0039]FIG. 3A-C depicts the Clustal V alignment obtained for the amino acid sequences from the rice wild type RE2 protein (SEQ ID NO:26), the O. sativa protein having NCBI General Identifier No. 18652509 (SEQ ID NO:27), the A. thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164 (SEQ ID NO:54), and the amino acid sequences of the polypeptides encoded by corn clones cef1f.pk001.f4:fis (SEQ ID NO:37), cpf1c.pk006.d18a:fis (SEQ ID NO:39), cpi1c.pk005.a12:fis (SEQ ID NO:41), and cr1n.pk0028.h3a:fis (SEQ ID NO:43), Euphorbia lagascae clone eel1c.pk003.b10:fis (SEQ ID NO:45), columbine clone eav1c.pk003.c9 (SEQ ID NO:47), guar clone lds3c.pk011.j11:fis (SEQ ID NO:49), soybean clone sdr1f.pk005.d21.f:fis (SEQ ID NO:51), and wheat clone wdr1f.pk002.l10:fis (SEQ ID NO:53). The program uses dashes to maximize the alignment. An asterisk (*) below the alignment indicates amino acids conserved among all the sequences. The C-block, a GAS-block, and a leucine zipper conserved motifs are shown boxed. Numbers at the left of the alignment indicate the amino acid position.
[0040]The following sequence descriptions and sequence listings attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825.
[0041]SEQ ID NO:1 is the nucleotide sequence of oligonucleotide primer C10 6-3 used to amplify CAPS marker C10 7.7 to identify the re2 locus.
[0042]SEQ ID NO:2 is the nucleotide sequence of oligonucleotide primer C10 6-4 used to amplify CAPS marker C10 7.7 to identify the re2 locus.
[0043]SEQ ID NO:3 is the nucleotide sequence of oligonucleotide primer C10 15.9-1 used to amplify CAPS marker C10 15.9 to identify the re2 locus.
[0044]SEQ ID NO:4 is the nucleotide sequence of oligonucleotide primer C10 15.9-2 used to amplify CAPS marker C10 15.9 to identify the re2 locus.
[0045]SEQ ID NO:5 is the nucleotide sequence of oligonucleotide primer C10-7.7 2 HPYIVF used to amplify CAPS marker C10 7.7 Hpy.
[0046]SEQ ID NO:6 is the nucleotide sequence of oligonucleotide primer C10-7.7 2 HPYIVR used to amplify CAPS marker C10 7.7 Hpy.
[0047]SEQ ID NO:7 is the nucleotide sequence of oligonucleotide primer 11.5 HpyV used to amplify CAPS marker C10 11.5.
[0048]SEQ ID NO:8 is the nucleotide sequence of oligonucleotide primer C10 11.5-9 used to amplify CAPS marker C10 11.5.
[0049]SEQ ID NO:9 is the nucleotide sequence of oligonucleotide primer C10 11-5 used to amplify CAPS marker C10 11.0.
[0050]SEQ ID NO:10 is the nucleotide sequence of oligonucleotide primer 11 HinfR used to amplify CAPS marker C10 11.0.
[0051]SEQ ID NO:11 is the nucleotide sequence of oligonucleotide primer 9.6 DraIF used to amplify CAPS marker C10 9.6.
[0052]SEQ ID NO:12 is the nucleotide sequence of oligonucleotide primer 9.6 DraIR used to amplify CAPS marker C10 9.6.
[0053]SEQ ID NO:13 is the nucleotide sequence of the oligonucleotide primer E08 93KF used to amplify CAPS marker E08 93K.
[0054]SEQ ID NO:14 is the nucleotide sequence of the oligonucleotide primer E08 93KR used to amplify CAPS marker E08 93K.
[0055]SEQ ID NO:15 is the nucleotide sequence of the oligonucleotide primer E08 46KF used to amplify CAPS marker E08 46K.
[0056]SEQ ID NO:16 is the nucleotide sequence of the oligonucleotide primer E08 46KR used to amplify CAPS marker E08 46K.
[0057]SEQ ID NO:17 is the nucleotide sequence of the oligonucleotide primer K08 21KF used to amplify CAPS marker K08 21K.
[0058]SEQ ID NO:18 is the nucleotide sequence of the oligonucleotide primer K08 21KR used to amplify CAPS marker K08 21K.
[0059]SEQ ID NO:19 is the nucleotide sequence of the oligonucleotide primer K08 46KF used to amplify SNP-based marker K08 46K.
[0060]SEQ ID NO:20 is the nucleotide sequence of the oligonucleotide primer K08 46KR used to amplify SNP-based marker K08 46K.
[0061]SEQ ID NO:21 is the nucleotide sequence of the oligonucleotide primer LOB-82F used to amplify the first exon (exon 1) of RE2 wild type gene or re2 mutant gene from genomic DNA.
[0062]SEQ ID NO:22 is the nucleotide sequence of the oligonucleotide primer LOB R1 used to amplify the first exon (exon 1) of RE2 wild type gene or re2 mutant gene from genomic DNA.
[0063]SEQ ID NO:23 is the nucleotide sequence of the oligonucleotide primer LOB F2 used to amplify the second exon (exon 2) of RE2 wild type gene or re2 mutant gene from genomic DNA.
[0064]SEQ ID NO:24 is the nucleotide sequence of the oligonucleotide primer LOB R2 used to amplify the second exon (exon 2) of RE2 wild type gene or re2 mutant gene from genomic DNA.
[0065]SEQ ID NO:25 is the nucleotide sequence of the wild-type rice RE2 gene open reading frame (ORF) identified in the instant application.
[0066]SEQ ID NO:26 is the amino acid sequence of the wild-type rice RE2 protein derived from translating nucleotides 1 through 807 of SEQ ID NO:25.
[0067]SEQ ID NO:27 is the amino acid sequence of the rice protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509.
[0068]SEQ ID NO:28 is the nucleotide sequence obtained for mutant allele re2-1 gene.
[0069]SEQ ID NO:29 is the amino acid sequence of a re2-1 mutant allele protein obtained by translating nucleotides 1 through 807 of SEQ ID NO:28.
[0070]SEQ ID NO:30 is the nucleotide sequence obtained for mutant allele re2-2 gene.
[0071]SEQ ID NO:31 is the amino acid sequence of a re2-2 mutant allele protein obtained by translating nucleotides 1 through 807 of SEQ ID NO:30.
[0072]SEQ ID NO:32 is the nucleotide sequence obtained for mutant allele re2-3 gene.
[0073]SEQ ID NO:33 is the amino acid sequence of a re2-3 mutant allele protein obtained by translating nucleotides 1 through 378 of SEQ ID NO:32.
[0074]SEQ ID NO:34 is the nucleotide sequence of the approximately 9 Kb BamH I fragment from RE2G4 which comprises the RE2 wild type gene coding region. Nucleotides 1 through 4472 are 5' of the ATG initiation codon, nucleotides 4473 through 4829 correspond to the first exon, nucleotides 4830 through 5660 correspond to an intron, and nucleotides 5661 through 6110 correspond to the second exon. Nucleotides 6111 through 6113 form a termination codon.
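The coordinates given for SEQ ID NO:34 can be checked with simple arithmetic. The Python sketch below uses only the 1-based, inclusive positions stated above; the dictionary and function names are illustrative. It shows that the two exons together span 807 nucleotides, which agrees with the 807-nucleotide open reading frame translated to give SEQ ID NO:26.

```python
# 1-based, inclusive coordinates as stated for SEQ ID NO:34.
features = {
    "upstream": (1, 4472),
    "exon1": (4473, 4829),
    "intron": (4830, 5660),
    "exon2": (5661, 6110),
    "stop": (6111, 6113),
}


def span_length(span):
    """Length of a 1-based inclusive (start, end) interval."""
    start, end = span
    return end - start + 1


lengths = {name: span_length(span) for name, span in features.items()}
cds_length = lengths["exon1"] + lengths["exon2"]  # spliced coding length, stop excluded
codon_count = cds_length // 3

print(lengths)
print("spliced coding length:", cds_length, "nt =", codon_count, "codons")
```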
[0075]SEQ ID NO:35 is the nucleotide sequence of vector pML18 used to subclone the approximately 9 Kb BamH I fragment from RE2G4 comprising the rice RE2 wild type gene coding region.
[0076]SEQ ID NO:36 is the nucleotide sequence comprising the entire cDNA insert in clone cef1f.pk001.f4:fis encoding a putative corn RE2 protein homolog.
[0077]SEQ ID NO:37 is the deduced amino acid sequence of a putative corn RE2 protein homolog derived from nucleotides 76 through 851 of SEQ ID NO:36.
[0078]SEQ ID NO:38 is the nucleotide sequence comprising the entire cDNA insert in clone cpf1c.pk006.d18a:fis encoding a putative corn RE2 protein homolog.
[0079]SEQ ID NO:39 is the deduced amino acid sequence of a putative corn RE2 protein homolog derived from nucleotides 151 through 804 of SEQ ID NO:38.
[0080]SEQ ID NO:40 is the nucleotide sequence comprising the entire cDNA insert in clone cpi1c.pk005.a12:fis encoding a putative corn RE2 protein homolog.
[0081]SEQ ID NO:41 is the deduced amino acid sequence of a putative corn RE2 protein homolog derived from nucleotides 81 through 854 of SEQ ID NO:40.
[0082]SEQ ID NO:42 is the nucleotide sequence comprising the entire cDNA insert in clone cr1n.pk0028.h3a:fis encoding a putative corn RE2 protein homolog.
[0083]SEQ ID NO:43 is the deduced amino acid sequence of a putative corn RE2 protein homolog derived from nucleotides 158 through 658 of SEQ ID NO:42.
[0084]SEQ ID NO:44 is the nucleotide sequence comprising the entire cDNA insert in clone eel1c.pk003.b10:fis encoding a putative Euphorbia RE2 protein homolog.
[0085]SEQ ID NO:45 is the deduced amino acid sequence of a putative Euphorbia RE2 protein homolog derived from nucleotides 71 through 823 of SEQ ID NO:44.
[0086]SEQ ID NO:46 is the nucleotide sequence comprising a portion of the cDNA insert in clone eav1c.pk003.c9 encoding a fragment of a putative columbine RE2 protein homolog.
[0087]SEQ ID NO:47 is the deduced amino acid sequence of a fragment of a putative columbine RE2 protein homolog derived from nucleotides 2 through 382 of SEQ ID NO:46.
[0088]SEQ ID NO:48 is the nucleotide sequence comprising the entire cDNA insert in clone lds3c.pk011.j11:fis encoding a putative guar RE2 protein homolog.
[0089]SEQ ID NO:49 is the deduced amino acid sequence of a putative guar RE2 protein homolog derived from nucleotides 146 through 898 of SEQ ID NO:48.
[0090]SEQ ID NO:50 is the nucleotide sequence comprising the entire cDNA insert in clone sdr1f.pk005.d21.f:fis encoding a putative soybean RE2 protein homolog.
[0091]SEQ ID NO:51 is the deduced amino acid sequence of a putative soybean RE2 protein homolog derived from nucleotides 971 through 1609 of SEQ ID NO:50.
[0092]SEQ ID NO:52 is the nucleotide sequence comprising the entire cDNA insert in clone wdr1f.pk002.l10:fis encoding a putative wheat RE2 protein homolog.
[0093]SEQ ID NO:53 is the deduced amino acid sequence of a putative wheat RE2 protein homolog derived from nucleotides 80 through 640 of SEQ ID NO:52.
[0094]SEQ ID NO:54 is the amino acid sequence of the Arabidopsis thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164.
[0095]SEQ ID NO:55 is the consensus amino acid sequence included in the C block of RE2 protein homologs.
[0096]SEQ ID NO:56 is the amino acid sequence of the motif at the N-terminus of the 49 amino acid GAS block of RE2 protein homologs.
[0097]SEQ ID NO:57 is the amino acid sequence of the motif at the C-terminus of the 49 amino acid GAS block of RE2 protein homologs.
[0098]SEQ ID NO:58 is the amino acid sequence of the Leucine-zipper motif of RE2 protein homologs.
[0099]SEQ ID NO:59 is the nucleotide sequence of oligonucleotide primer Cpi Bbsl F used to amplify genomic Zea mays RE2 gene.
[0100]SEQ ID NO:60 is the nucleotide sequence of oligonucleotide primer Cpi Bbsl R used to amplify genomic Zea mays RE2 gene.
[0101]SEQ ID NO:61 is the nucleotide sequence of the genomic fragment encoding a maize RE2 protein homolog obtained by amplifying a maize genomic library with primers Cpi Bbsl F and Cpi Bbsl R. Nucleotides 79 through 429 correspond to the first exon, nucleotides 430 through 1363 correspond to an intron, and nucleotides 1364 through 1783 correspond to the second exon.
[0102]SEQ ID NO:62 is the nucleotide sequence of oligonucleotide primer RE2 pro Bst 2F used for amplifying a portion of the 5' region of the OsRE2 gene.
[0103]SEQ ID NO:63 is the nucleotide sequence of oligonucleotide primer RE2 PRO R Bbsl used for amplifying a portion of the 5' region of the OsRE2 gene.
[0104]SEQ ID NO:64 is the nucleotide sequence of plasmid RE2Pro comprising a portion of the OsRE2 gene promoter region.
[0105]SEQ ID NO:65 is the nucleotide sequence of oligonucleotide primer RE2 TERM Xbal R used for amplifying a 780 bp fragment of the 3' terminator region from the OsRE2 gene.
[0106]SEQ ID NO:66 is the nucleotide sequence of oligonucleotide primer RE2 TERM EcoBspml used for amplifying a 780 bp fragment of the 3' terminator region from the OsRE2 gene.
[0107]SEQ ID NO:67 is the nucleotide sequence of plasmid RE2TERGEM comprising a portion of the OsRE2 gene terminator region.
[0108]SEQ ID NO:68 is the nucleotide sequence of an oligonucleotide primer that may be used to identify RE2 homologs from other plant species.
[0109]SEQ ID NO:69 is the nucleotide sequence of an oligonucleotide primer that may be used to identify RE2 homologs from other species.
[0110]SEQ ID NO:70 is the nucleotide sequence of an oligonucleotide primer that may be used to identify RE2 homologs from other plant species.
[0111]SEQ ID NO:71 is the nucleotide sequence of the "RE2 second exon probe" used to screen for cDNAs encoding RE2 proteins.
[0112]SEQ ID NO:72 is the nucleotide sequence of clone RE2 cDNA C1, the longest cDNA clone identified encoding an RE2 protein.
[0113]The Sequence Listing contains the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB standards described in Nucleic Acids Res. 13:3021-3030 (1985) and in the Biochemical J. 219 (No. 2):345-373 (1984) which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.
DETAILED DESCRIPTION OF THE INVENTION
[0114]The disclosures of all references, patents, and patent applications cited herein are hereby incorporated by reference.
[0115]The terms "isolated nucleic acid fragment" and "isolated polynucleotide" are used interchangeably herein. These terms refer to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA. Nucleotides (usually found in their 5'-monophosphate form) are referred to by their single letter designation as follows: "A" for adenylate or deoxyadenylate (for RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate, "G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.
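The degenerate single-letter designations listed above can be treated as sets of allowed bases. The following Python sketch, with hypothetical helper names, covers only the DNA codes named in this paragraph (A, C, G, T, R, Y, K, H, and N) and leaves out U and inosine. It shows how a degenerate pattern, such as might appear in a primer, expands to concrete sequences.

```python
# Allowed bases for each code named in the paragraph above (DNA only).
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purines
    "Y": "CT",    # pyrimidines
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def matches(pattern, seq):
    """True if seq is one of the sequences described by the degenerate pattern."""
    return len(pattern) == len(seq) and all(
        base in CODES[code] for code, base in zip(pattern, seq)
    )


def expand(pattern):
    """List every concrete sequence a degenerate pattern stands for."""
    results = [""]
    for code in pattern:
        results = [prefix + base for prefix in results for base in CODES[code]]
    return results


expanded = expand("RYK")
print(expanded)
print(matches("ATGNNH", "ATGCCA"), matches("ATGNNH", "ATGCCG"))
```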
[0116]It has been reported that the Lateral Organ Boundary (LOB) gene in Arabidopsis has a potential role in lateral organ development. See Shuai et al., (2002), Plant Phys. 129, 747-761. Shuai et al. found LOB gene expression at the base of lateral organs in the shoots and roots of Arabidopsis. In fact, 23 members of the LOB domain family (LBD) of genes were found to exhibit expression patterns in the root tissues of Arabidopsis.
[0117]The LOB domain 18 protein is considered as being in the class I group of the Lateral Organ Boundaries (LOB) domain protein plant-specific gene family. The Class I LOB domain proteins contain a C-block, a GAS-block, and a leucine zipper motif (Shuai, B. et al., 2002, Plant Phys. 129:747-761). Thus, it is expected that an Oryza sativa RE2 protein and its homologs would also contain a C-block, a GAS-block, and a leucine zipper motif. The consensus sequences of these motifs were identified using a Clustal V alignment and are indicated in FIG. 3.
[0118]FIG. 3A-C depicts the Clustal V alignment obtained for the amino acid sequences from the rice wild type RE2 protein (SEQ ID NO:26), the O. sativa protein having NCBI General Identifier No. 18652509 (SEQ ID NO:27), the A. thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164 (SEQ ID NO:54), and the amino acid sequences of the polypeptides encoded by corn clones cef1f.pk001.f4:fis (SEQ ID NO:37), cpf1c.pk006.d18a:fis (SEQ ID NO:39), cpi1c.pk005.a12:fis (SEQ ID NO:41), and cr1n.pk0028.h3a:fis (SEQ ID NO:43), Euphorbia lagascae clone eel1c.pk003.b10:fis (SEQ ID NO:45), columbine clone eav1c.pk003.c9 (SEQ ID NO:47), guar clone lds3c.pk011.j11:fis (SEQ ID NO:49), soybean clone sdr1f.pk005.d21.f:fis (SEQ ID NO:51), and wheat clone wdr1f.pk002.l10:fis (SEQ ID NO:53). The program inserts dashes (gaps) to maximize the alignment. An asterisk (*) below the alignment indicates amino acids conserved among all the sequences. The conserved C-block, GAS-block, and leucine zipper motifs are shown boxed.
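The asterisk line described for FIG. 3 can be derived from any set of already aligned sequences. The Python sketch below is illustrative only: it does not perform a Clustal V alignment, the function name is hypothetical, and the three short aligned strings are invented rather than taken from the Sequence Listing.

```python
def conservation_line(aligned):
    """Return a string with '*' under columns where every sequence has the same residue.

    Columns containing a gap ('-') in any sequence are never marked.
    """
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        conserved = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Toy aligned sequences (hypothetical), with dashes standing for gaps.
toy_alignment = [
    "MCAAC-KLR",
    "MCGACRKLR",
    "MCSAC-RLR",
]
line = conservation_line(toy_alignment)
for seq in toy_alignment:
    print(seq)
print(line)
```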
[0119]It has been found in the present invention that a single mutation of a rice gene encoding a member of a class I LOB domain protein family can lead to alteration of embryo/endosperm size during seed development.
[0120]The gene associated with the reduced embryo phenotype is named Reduced Embryo2 (RE2). Silencing or inhibition of this gene leads to a reduction of embryonic tissue, thus, resulting in a smaller embryo size and a concomitantly larger endosperm size. Reduction of embryo size will result in seeds having a reduced amount of components such as oils. On the other hand, overexpression of this gene might lead to an increase of embryonic tissue, thus, resulting in a larger embryo size and a concomitantly smaller endosperm size.
[0121]The italicized and uppercase term "RE2" as used herein refers to a genetic locus capable of expressing a Reduced Embryo 2 protein. The italicized and lowercase term "re2" as used herein refers to a mutated form of RE2. Italics are not used when referring to a protein or polypeptide encoded by the genetic locus. Thus, the non-italicized uppercase term "RE2" as used herein refers to the wild type protein, and the non-italicized lowercase term "re2" as used herein refers to a mutant protein. As was noted above, the rice RE2 isolated polynucleotide was identified in the instant application using high fidelity mapping of DNA obtained from reduced embryo 2 (re2) mutant plants. These mutant plants produce grain that has a small embryo phenotype.
[0122]The terms "Oryza sativa RE2", "OsRE2", and "rice RE2" are used interchangeably herein. These terms refer to a polynucleotide isolated from wild-type rice and whose sequence is set forth in the instant application. The rice RE2 isolated polynucleotide is the polynucleotide that, when mutated, is responsible for a reduced embryo 2, or re2, phenotype as exemplified by Hong et al. (1996, Development 122:2051-2058). Mutant rice displaying the re2 phenotype has a reduced embryo size and an enlarged endosperm size.
[0123]The terms "subfragment that is functionally equivalent" and "functionally equivalent subfragment" are used interchangeably herein. These terms refer to a portion or subsequence of an isolated nucleic acid fragment in which the ability to alter gene expression or produce a certain phenotype is retained whether or not the fragment or subfragment encodes an active enzyme. For example, the fragment or subfragment can be used in the design of recombinant DNA constructs to produce the desired phenotype in a transformed plant. Recombinant DNA constructs can be designed for use in co-suppression or antisense by linking a nucleic acid fragment or subfragment thereof, whether or not it encodes an active enzyme, in the appropriate orientation relative to a plant promoter sequence.
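The notion of linking a fragment "in the appropriate orientation" for antisense use can be illustrated with a reverse-complement operation. The Python sketch below is a toy illustration: the function `antisense_cassette` and all three sequences are hypothetical placeholders, not the promoter, fragment, or terminator of any construct described in this application.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_cassette(promoter, fragment, terminator):
    """Join promoter, the fragment in reverse (antisense) orientation, and terminator."""
    return promoter + reverse_complement(fragment) + terminator


# Toy placeholder sequences (hypothetical).
fragment = "ATGGCC"
cassette = antisense_cassette("TATAAA", fragment, "AATAAA")
print(reverse_complement(fragment))
print(cassette)
```

For a co-suppression (sense) design under the same toy assumptions, the fragment would be joined without the reverse-complement step.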
[0124]The terms "homology", "homologous", "substantially similar" and "corresponding substantially" are used interchangeably herein. They refer to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment. It is therefore understood, as those skilled in the art will appreciate, that the invention encompasses more than the specific exemplary sequences.
[0125]A "homolog" can be a second gene in the same plant type or in a different plant type that has a polynucleotide sequence that is functionally identical to a sequence in the first gene. It is believed that, in general, homologs share a common evolutionary past.
[0126]The term "RE2 homolog" refers to an isolated polynucleotide encoding a class I LOB domain polypeptide obtained from a plant species, other than rice, that functions in a manner similar to that of the rice RE2 isolated polynucleotide and that, when mutated, exhibits a reduced embryo phenotype. The corn, Euphorbia lagascae, Columbine, guar, soybean, and wheat isolated polynucleotides disclosed herein appear to encode such polypeptides, namely, these polypeptides are members of a class I LOB domain protein family, have a C-like motif, a GAS-like motif, and a leucine zipper-like motif, and are useful for altering embryo/endosperm size during seed development.
[0127]A search of GenBank and Du Pont proprietary databases using the rice RE2 gene sequence or the RE2 polypeptide sequence uncovered a number of isolated polynucleotides from plants that appeared to be homologous. RE2 homologs appear to encompass those polynucleotides isolated from plants, other than rice, which appeared to encode a polypeptide that shares sequence and/or functional similarity to the polypeptide encoded by the rice RE2 isolated polynucleotide. It is believed that such a polynucleotide would comprise a subset of the polynucleotides encoding polypeptides of the class I LOB domain family, and that alteration in the expression of this polypeptide may affect embryo/endosperm size.
[0128]"Sequence identity" or "identity" in the context of nucleic acid or polypeptide sequences refers to the nucleic acid bases or amino acid residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window.
[0129]Thus, "Percentage of sequence identity" refers to the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Useful examples of percent sequence identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%. These identities can be determined using any of the programs described herein.
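By way of illustration only, the calculation described in the preceding paragraph can be sketched in Python as follows. The function name percent_identity, the gap character "-", and the two toy sequences are assumptions made for this sketch and do not appear in the instant application. The sketch takes an already aligned pair of sequences as input; it does not perform the alignment itself and is not the Clustal V method.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences over the whole window.

    Matched positions are columns where both sequences carry the same
    residue; a column containing a gap never counts as a match. The
    denominator is the total number of positions in the comparison window.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: 10 columns, 8 of them matched.
print(percent_identity("ATGCC-TTGA", "ATGCCGTAGA"))  # 80.0
```

Alignment programs may differ in how gap columns are counted in the denominator, so values produced by a particular program need not agree exactly with this simplified calculation.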
[0130]Sequence alignments and percent identity or similarity calculations may be determined using a variety of comparison methods designed to detect homologous sequences including, but not limited to, the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences are performed using the Clustal V method of alignment (Higgins, D. G. and Sharp, P. M. (1989) Comput. Appl. Biosci. 5:151-153; Higgins, D. G. et al. (1992) Comput. Appl. Biosci. 8:189-191) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4.
[0131]It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying polypeptides, from other plant species, wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%. Indeed, any integer amino acid identity from 50%-100% may be useful in describing the present invention. Also, of interest is any full or partial complement of this isolated nucleotide fragment.
[0132]It is believed that another way to identify genes that are homologous to the rice RE2 gene is to screen by hybridization. It is possible to hybridize cDNA at 60° C. with a probe derived from the rice RE2 gene and wash at medium stringency conditions (5×SSPE, 0.5% SDS at 65° C. followed by 1×SSPE, 0.5% SDS at 65° C.). For general hybridization protocols, see Ausubel et al. 1993, "Current Protocols in Molecular Biology" John Wiley & Sons, USA, or Sambrook et al. 1989. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press. An appropriate probe with a unique sequence can be extracted, for example, from part of the exon 1 of the RE2 gene. Exon 1 of the RE2 gene has regions of sequence identity between the corn and rice RE2 nucleotide sequences. Oligonucleotide primers useful in hybridization screenings may have the sequences disclosed in SEQ ID NO:68, SEQ ID NO:69, or SEQ ID NO:70, for example. These oligonucleotide primers have the following sequences:
SEQ ID NO:68 5'-GCATCTTCGCGCCCTACTTCGACTCGG-3'
SEQ ID NO:69 5'-GCACAAGGTGTTCGGCGCCAGCAACGTGTCCAAGC-3'
SEQ ID NO:70 5'-CCGCGACCCCGTCTACGGCTGCGTCGCCCACCTC-3'
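As a non-limiting illustration, basic properties of the three oligonucleotides listed above, such as length and G+C content, can be tabulated with a short Python sketch. The sequences are copied from SEQ ID NOs:68 through 70 as given above; the dictionary name PRIMERS and the helper gc_fraction are assumptions introduced only for this sketch.

```python
# Sequences as listed above (5' to 3'); the names used here are illustrative.
PRIMERS = {
    "SEQ ID NO:68": "GCATCTTCGCGCCCTACTTCGACTCGG",
    "SEQ ID NO:69": "GCACAAGGTGTTCGGCGCCAGCAACGTGTCCAAGC",
    "SEQ ID NO:70": "CCGCGACCCCGTCTACGGCTGCGTCGCCCACCTC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C residues in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    # Report length in nucleotides and G+C content as a percentage.
    print(f"{name}: {len(seq)} nt, {100 * gc_fraction(seq):.1f}% GC")
```

G+C content is one of several factors that may influence the hybridization and wash temperatures chosen for a given probe; the sketch does not attempt to predict a melting temperature.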
[0133]Genomic DNA or cDNA clones giving significant signals may be isolated and their chromosomal origin analyzed using CAPS markers or SNP-based markers similar to those described in the present Application. DNA fragments containing the region homologous to the rice RE2 gene may be further subcloned and sequenced. Polypeptides encoded by these polynucleotides should have the C-Block, GAS Block N-end and C-end, and Leu Zipper consensus sequences described in Example 6 and as set forth in SEQ ID NOs:55 through 58.
[0134]"Gene" refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences. "Recombinant DNA construct" refers to a combination of nucleic acid fragments that are not normally found together in nature. Accordingly, a recombinant DNA construct may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that normally found in nature. A "foreign" gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or recombinant DNA constructs. A "transgene" is a gene that has been introduced into the genome by a transformation procedure.
[0135]"Coding sequence" refers to a DNA sequence that codes for a specific amino acid sequence. "Regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
[0136]"Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoter sequences can also be located within the transcribed portions of genes, and/or downstream of the transcribed sequences. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of an isolated nucleic acid fragment in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause an isolated nucleic acid fragment to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, (1989) Biochemistry of Plants 15:1-82. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity.
[0137]Specific examples of promoters that may be useful in expressing the nucleic acid fragments of the invention include, but are not limited to, the oleosin promoter (PCT Publication WO99/65479, published Dec. 12, 1999), the maize 27 kD zein promoter (Ueda et al. (1994) Mol. Cell. Biol. 14:4350-4359), the ubiquitin promoter (Christensen et al. (1992) Plant Mol. Biol. 18:675-680), the SAM synthetase promoter (PCT Publication WO00/37662, published Jun. 29, 2000), the CaMV 35S promoter (Odell et al. (1985) Nature 313:810-812), and the promoter described in PCT Publication WO02/099063, published Dec. 12, 2002.
[0138]An "intron" is an intervening sequence in a gene that does not encode a portion of the protein sequence. Thus, such sequences are transcribed into RNA but are then excised and are not translated. The term is also used for the excised RNA sequences. An "exon" is a portion of the sequence of a gene that is transcribed and is found in the mature messenger RNA derived from the gene, but is not necessarily a part of the sequence that encodes the final gene product.
[0139]The "translation leader sequence" refers to a DNA sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner, R. and Foster, G. D. (1995) Molecular Biotechnology 3:225).
[0140]The "3' non-coding sequences" refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is exemplified by Ingelbrecht et al. (1989) Plant Cell 1:671-680.
[0141]"RNA transcript" refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript or it may be a RNA sequence derived from post-transcriptional processing of the primary transcript and is referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the RNA that is without introns and that can be translated into protein by the cell. "cDNA" refers to a DNA that is complementary to and synthesized from an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into the double-stranded form using the Klenow fragment of DNA polymerase I. "Sense" RNA refers to RNA transcript that includes the mRNA and can be translated into protein within a cell or in vitro. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target isolated nucleic acid fragment (U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes. The terms "complement" and "reverse complement" are used interchangeably herein with respect to mRNA transcripts, and are meant to define the antisense RNA of the message.
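Purely as an illustrative sketch, the relationship between a message and its complement (the antisense RNA, as the terms are used in the preceding paragraph) can be expressed in Python. The function name antisense_rna and the six-nucleotide example are assumptions made for this sketch; the example sequence is hypothetical and is not taken from any SEQ ID NO of the instant application.

```python
# Translation table for RNA base pairing: A pairs with U, C pairs with G.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the reverse complement of an mRNA, written 5' to 3'.

    DNA-style input is tolerated by converting T to U first.
    """
    rna = mrna.upper().replace("T", "U")
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return rna.translate(_RNA_COMPLEMENT)[::-1]


# Hypothetical message fragment.
print(antisense_rna("AUGGCC"))  # GGCCAU
```

Applying the function twice returns the original message, which reflects the fact that the two strands are complements of one another.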
[0142]The term "endogenous RNA" refers to any RNA which is encoded by any nucleic acid sequence present in the genome of the host prior to transformation with the recombinant construct of the present invention, whether naturally-occurring or non-naturally occurring, i.e., introduced by recombinant means, mutagenesis, etc.
[0143]The term "non-naturally occurring" means artificial, not consistent with what is normally found in nature.
[0144]The term "operably linked" refers to an association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is regulated by the other. For example, a promoter is operably linked with a coding sequence when it is capable of regulating the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in a sense or antisense orientation. In another example, the complementary RNA regions of the invention can be operably linked, either directly or indirectly, 5' to the target mRNA, or 3' to the target mRNA, or within the target mRNA, or a first complementary region is 5' and its complement is 3' to the target mRNA.
[0145]Cosuppression technology constitutes the subject matter of U.S. Pat. No. 5,231,020, which issued to Jorgensen et al. on Jul. 27, 1993. The phenomenon observed by Napoli et al. in petunia was referred to as "cosuppression" since expression of both the endogenous gene and the introduced transgene was suppressed (for reviews see Vaucheret et al., Plant J. 16:651-659 (1998); and Gura, Nature 404:804-808 (2000)).
[0146]Co-suppression constructs in plants previously have been designed by focusing on overexpression of a nucleic acid sequence having homology to an endogenous mRNA, in the sense orientation, which results in the reduction of all RNA having homology to the overexpressed sequence (see Vaucheret et al. (1998) Plant J 16:651-659; and Gura (2000) Nature 404:804-808). The overall efficiency of this phenomenon is low, and the extent of the RNA reduction is widely variable. Recent work has described the use of "hairpin" structures that incorporate all, or part, of an mRNA encoding sequence in a complementary orientation that results in a potential "stem-loop" structure for the expressed RNA (PCT Publication WO 99/53050 published on Oct. 21, 1999). This increases the frequency of co-suppression in the recovered transgenic plants. Another variation describes the use of plant viral sequences to direct the suppression, or "silencing", of proximal mRNA encoding sequences (PCT Publication WO 98/36083 published on Aug. 20, 1998). Both of these co-suppressing phenomena have not been elucidated mechanistically, although recent genetic evidence has begun to unravel this complex situation (Elmayan et al. (1998) Plant Cell 10:1747-1757).
[0147]In addition to cosuppression, antisense technology has also been used to block the function of specific genes in cells. Antisense RNA is complementary to the normally expressed RNA, and presumably inhibits gene expression by interacting with the normal RNA strand. The mechanisms by which the expression of a specific gene is inhibited by either antisense or sense RNA are on their way to being understood. However, the frequencies of obtaining the desired phenotype in a transgenic plant may vary with the design of the construct, the gene, the strength and specificity of its promoter, the method of transformation and the complexity of transgene insertion events (Baulcombe, Curr. Biol. 12(3):R82-84 (2002); Tang et al., Genes Dev. 17(1):49-63 (2003); Yu et al., Plant Cell. Rep. 22(3):167-174 (2003)). Cosuppression and antisense inhibition are also referred to as "gene silencing", "post-transcriptional gene silencing" (PTGS), RNA interference or RNAi. See for example U.S. Pat. No. 6,506,559.
[0148]MicroRNAs (miRNA) are small regulatory RNAs that control gene expression. miRNAs bind to regions of target RNAs and inhibit their translation and, thus, interfere with production of the polypeptide encoded by the target RNA. miRNAs can be designed to be complementary to any region of the target sequence RNA including the 3' untranslated region, coding region, etc. miRNAs are processed from highly structured RNA precursors that are processed by the action of a ribonuclease III termed DICER. While the exact mechanism of action of miRNAs is unknown, it appears that they function to regulate expression of the target gene. See, e.g., U.S. Patent Publication No. 2004/0268441 A1 which was published on Dec. 30, 2004.
[0149]The term "expression", as used herein, refers to the production of a functional end-product, be it mRNA or translation of mRNA into a polypeptide. "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein. "Co-suppression" refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020).
[0150]"Overexpression" refers to the production of a functional end-product in transgenic organisms that exceeds levels of production when compared to expression of that functional end-product in a normal, wild type or non-transformed organism.
[0151]"Stable transformation" refers to the transfer of a nucleic acid fragment into a genome of a host organism, including both nuclear and organellar genomes, resulting in genetically stable inheritance. In contrast, "transient transformation" refers to the transfer of a nucleic acid fragment into the nucleus, or DNA-containing organelle, of a host organism resulting in gene expression without integration or stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms. The preferred method of cell transformation of rice, corn and other monocots is particle-accelerated or "gene gun" transformation technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Pat. No. 4,945,050), or an Agrobacterium-mediated method (Ishida Y. et al. (1996) Nature Biotech. 14:745-750). The term "transformation" as used herein refers to both stable transformation and transient transformation.
[0152]Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press Cold Spring Harbor, 1989 (hereinafter "Sambrook").
[0153]The term "recombinant" refers to an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis or by the manipulation of isolated segments of nucleic acids by genetic engineering techniques.
[0154]"PCR" or "Polymerase Chain Reaction" is a technique for the synthesis of large quantities of specific DNA segments and consists of a series of repetitive cycles (Perkin Elmer Cetus Instruments, Norwalk, Conn.). Typically, the double stranded DNA is heat denatured, the two primers complementary to the 3' boundaries of the target segment are annealed at low temperature and then extended at an intermediate temperature. One set of these three consecutive steps is referred to as a cycle.
[0155]Polymerase chain reaction ("PCR") is a powerful technique used to amplify DNA millions of fold, by repeated replication of a template, in a short period of time. (Mullis et al. (1986) Cold Spring Harbor Symp. Quant. Biol. 51:263-273; Erlich et al, European Patent Application 50,424; European Patent Application 84,796; European Patent Application 258,017, European Patent Application 237,362; Mullis, European Patent Application 201,184, Mullis et al U.S. Pat. No. 4,683,202; Erlich, U.S. Pat. No. 4,582,788; and Saiki et al, U.S. Pat. No. 4,683,194). The process utilizes sets of specific in vitro synthesized oligonucleotides to prime DNA synthesis. The design of the primers is dependent upon the sequences of DNA that are to be analyzed. The technique is carried out through many cycles (usually 20-50) of melting the template at high temperature, allowing the primers to anneal to complementary sequences within the template and then replicating the template with DNA polymerase.
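To illustrate why a modest number of cycles can amplify a template millions of fold, the idealized arithmetic can be sketched in Python as follows. The function name theoretical_copies and the per-cycle efficiency parameter are assumptions introduced for this sketch; the model assumes that each cycle multiplies the template by a constant factor, which real reactions only approximate, since amplification typically plateaus in later cycles.

```python
def theoretical_copies(start_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR yield: each cycle multiplies the template by (1 + efficiency).

    efficiency = 1.0 corresponds to perfect doubling in every cycle.
    """
    if start_copies < 0 or cycles < 0:
        raise ValueError("start_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


# One starting template, perfect doubling, at the low end of the usual cycle range.
print(theoretical_copies(1, 20))  # 1048576.0
```

Under these idealized assumptions, 20 cycles already give roughly a million-fold amplification of a single template, consistent with the description above.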
[0156]The products of PCR reactions are analyzed by separation in agarose gels followed by ethidium bromide staining and visualization with UV transillumination. Alternatively, radioactive dNTPs can be added to the PCR in order to incorporate label into the products. In this case the products of PCR are visualized by exposure of the gel to x-ray film. The added advantage of radiolabeling PCR products is that the levels of individual amplification products can be quantitated.
[0157]The terms "recombinant construct", "expression construct" and "recombinant expression construct" are used interchangeably herein. These terms refer to a functional unit of genetic material that can be inserted into the genome of a cell using standard methodology well known to one skilled in the art. Such a construct may be used by itself or in conjunction with a vector. If a vector is used, then the choice of vector is dependent upon the method that will be used to transform host plants, as is well known to those skilled in the art. For example, a plasmid vector can be used. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells comprising any of the isolated nucleic acid fragments of the invention. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al. (1985) EMBO J. 4:2411-2418; De Almeida et al. (1989) Mol. Gen. Genetics 218:78-86), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by Southern analysis of DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.
[0158]"Motifs" or "subsequences" refer to relatively short conserved regions of nucleic acids or amino acids that comprise part of a longer sequence. For example, it is expected that such conserved subsequences, such as those exemplified in SEQ ID NOs:49, 50, 52, and 52, would be important for function and could be used to identify new homologues of class I LOB domain proteins involved in controlling embryo/endosperm size in plants. It is expected that some or all of the elements may be found in an RE2 homolog. Also, it is expected that one or two of the conserved amino acids in any given motif may differ in a true RE2 homolog.
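As a hedged illustration of how a conserved subsequence might be located while tolerating one or two differing amino acids, as contemplated in the preceding paragraph, a simple sliding-window scan can be sketched in Python. The function name find_motif, the motif "LKELEA", and the query sequence "MALKDLEAQ" are hypothetical and were invented for this sketch; they are not the consensus sequences of the instant application.

```python
from typing import List, Tuple


def find_motif(protein: str, motif: str, max_mismatches: int = 2) -> List[Tuple[int, str, int]]:
    """Scan a protein for windows that differ from the motif at no more
    than max_mismatches positions.

    Returns (1-based start position, matching window, mismatch count).
    """
    if not motif:
        raise ValueError("motif is empty")
    hits = []
    for start in range(len(protein) - len(motif) + 1):
        window = protein[start:start + len(motif)]
        # Count positions at which the window and the motif disagree.
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((start + 1, window, mismatches))
    return hits


# Hypothetical query and motif: one window differs at a single position (D for E).
print(find_motif("MALKDLEAQ", "LKELEA", max_mismatches=1))  # [(3, 'LKDLEA', 1)]
```

A scan of this kind treats every position of the motif as equally important; a consensus in which only certain positions are conserved would call for a pattern that leaves the remaining positions unconstrained.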
[0159]Thus, in one aspect, this invention concerns an isolated polynucleotide comprising:
[0160](a) a nucleic acid sequence encoding a polypeptide involved in altering embryo/endosperm size during seed development, said polypeptide having at least 80% amino acid sequence identity, based on the Clustal V method of alignment, when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:37, 39, 41, 43, 45, 47, 49, 51, and 53; or
[0161](b) a nucleic acid sequence set forth in SEQ ID NO:25 wherein said sequence comprises at least one of the following modifications:
[0162](i) nucleotide 271 is a T residue instead of a C;
[0163](ii) nucleotide 110 is a T residue instead of a G; or
[0164](iii) nucleotide 75 is deleted; or
[0165](c) a nucleic acid sequence set forth in SEQ ID NO:34 wherein
[0166](i) nucleotides 4473 through 4829 correspond to a first exon, and
[0167](ii) nucleotides 5661 through 6110 correspond to a second exon, and
[0168]further wherein the nucleotides of (c)(i) and/or (c)(ii) encode a polypeptide involved in altering embryo/endosperm size during seed development,
[0169](d) the nucleic acid sequence set forth in SEQ ID NO:34 or 72; or
[0170](e) the full complement of (a), (b), (c), (d), or SEQ ID NO:34; or
[0171](f) all or part of a non-coding or coding region of the isolated polynucleotide comprising sequences of (a), (b) or SEQ ID NO:34 for use in co-suppression or antisense suppression of endogenous nucleic acid sequences encoding polypeptides involved in altering embryo/endosperm size during seed development.
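For orientation only, the exon coordinates recited in part (c) above can be converted into segment lengths with a short Python sketch. The sketch assumes that the recited nucleotide positions are 1-based and inclusive, which is an assumption of this illustration; the names EXONS, span_length, spliced_length and extract_exons are likewise introduced here and do not appear in the instant application. Whether the spliced span includes a stop codon or any untranslated sequence is not addressed by the sketch.

```python
from typing import List, Tuple

# Exon coordinates recited for SEQ ID NO:34, assumed 1-based and inclusive.
EXONS: List[Tuple[int, int]] = [(4473, 4829), (5661, 6110)]


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based inclusive span."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def spliced_length(exons: List[Tuple[int, int]]) -> int:
    """Total length of the exons once joined."""
    return sum(span_length(s, e) for s, e in exons)


def extract_exons(genomic: str, exons: List[Tuple[int, int]]) -> str:
    """Join exon segments from a genomic sequence (1-based inclusive spans)."""
    return "".join(genomic[s - 1:e] for s, e in exons)


print([span_length(s, e) for s, e in EXONS])  # [357, 450]
print(spliced_length(EXONS))                  # 807
print(spliced_length(EXONS) // 3)             # 269 nucleotide triplets
# Intervening sequence between the two exons under the same convention.
print(span_length(EXONS[0][1] + 1, EXONS[1][0] - 1))  # 831
```

Given the full genomic sequence as a string, extract_exons would return the joined exon sequence under the same coordinate convention.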
[0172]Also of interest are recombinant DNA constructs comprising an isolated polynucleotide comprising any of the nucleotide sequences described herein operably linked in a sense or anti-sense orientation to at least one regulatory sequence. Such constructs can then be used to transform plants, plant tissue, or plant cells. Transformation methods are well known to those skilled in the art and are described above. Any plant, dicot or monocot can be transformed with such recombinant DNA constructs.
[0173]Examples of monocots include, but are not limited to, corn, wheat, rice, sorghum, millet, barley, palm, lily, Alstroemeria, rye, and oat.
[0174]Examples of dicots include, but are not limited to, soybean, rape, sunflower, canola, grape, guayule, columbine, cotton, tobacco, peas, beans, flax, safflower, and alfalfa.
[0175]Plant tissue includes differentiated and undifferentiated tissues or plants, including but not limited to, roots, stems, shoots, leaves, pollen, seeds, tumor tissue, and various forms of cells and culture such as single cells, protoplasm, embryos, and callus tissue. The plant tissue may be in plant or in organ, tissue or cell culture.
[0176]The term "plant organ" refers to plant tissue or a group of tissues that constitute a morphologically and functionally distinct part of a plant. The term "genome" refers to the following: 1. The entire complement of genetic material (genes and non-coding sequences) that is present in each cell of an organism, or virus or organelle. 2. A complete set of chromosomes inherited as a (haploid) unit from one parent. The term "stably integrated" refers to the transfer of a nucleic acid fragment into the genome of a host organism or cell resulting in genetically stable inheritance.
[0177]Also within the scope of this invention are seeds obtained from such transformed plants and oil obtained from these seeds.
[0178]In another aspect, this invention concerns a method of altering embryo/endosperm size during seed development in a plant comprising:
[0179](a) transforming plant cells or plant tissue with the recombinant DNA construct of the invention;
[0180](b) regenerating transgenic plants from the transformed plant cells or plant tissue of (a);
[0181](c) screening the transgenic plants of (b) for seeds having an altered embryo/endosperm size based on a comparison of embryo/endosperm size of seeds obtained from non-transformed plants.
[0182]The regeneration, development, and cultivation of plants from single plant protoplast transformants or from various transformed explants is well known in the art (Weissbach and Weissbach, In: Methods for Plant Molecular Biology, (Eds.), Academic Press, Inc. San Diego, Calif., (1988)). This regeneration and growth process typically includes the steps of selection of transformed cells, culturing those individualized cells through the usual stages of embryonic development through the rooted plantlet stage. Transgenic embryos and seeds are similarly regenerated. The resulting transgenic rooted shoots are thereafter planted in an appropriate plant growth medium such as soil. Preferably, the regenerated plants are self-pollinated to provide homozygous transgenic plants. Otherwise, pollen obtained from the regenerated plants is crossed to seed-grown plants of agronomically important lines. Conversely, pollen from plants of these important lines is used to pollinate regenerated plants. A transgenic plant of the present invention containing a desired polypeptide is cultivated using methods well known to one skilled in the art.
[0183]There are a variety of methods for the regeneration of plants from plant tissue. The particular method of regeneration will depend on the starting plant tissue and the particular plant species to be regenerated.
[0184]Methods for transforming dicots, primarily using Agrobacterium tumefaciens, and obtaining transgenic plants have been published for cotton (U.S. Pat. No. 5,004,863, U.S. Pat. No. 5,159,135, U.S. Pat. No. 5,518,908); soybean (U.S. Pat. No. 5,569,834, U.S. Pat. No. 5,416,011, McCabe et al. (1988) Bio/Technology 6:923, Christou et al. (1988) Plant Physiol. 87:671-674); Brassica (U.S. Pat. No. 5,463,174); peanut (Cheng et al. (1996) Plant Cell Rep. 15:653-657, McKently et al. (1995) Plant Cell Rep. 14:699-703); papaya and pea (Grant et al. (1995) Plant Cell Rep. 15:254-258).
[0185]Transformation of monocotyledons using electroporation, particle bombardment, and Agrobacterium has also been reported. Transformation and plant regeneration have been achieved in asparagus (Bytebier et al., Proc. Natl. Acad. Sci. (USA) (1987) 84:5354); barley (Wan and Lemaux (1994) Plant Physiol. 104:37); Zea mays (Rhodes et al. (1988) Science 240:204, Gordon-Kamm et al. (1990) Plant Cell 2:603-618, Fromm et al. (1990) Bio/Technology 8:833; Koziel et al. (1993) Bio/Technology 11:194, Armstrong et al. (1995) Crop Science 35:550-557); oat (Somers et al. (1992) Bio/Technology 10:1589); orchard grass (Horn et al. (1988) Plant Cell Rep. 7:469); rice (Toriyama et al. (1986) Theor. Appl. Genet. 205:34; Park et al. (1996) Plant Mol. Biol. 32:1135-1148; Abedinia et al. (1997) Aust. J. Plant Physiol. 24:133-141; Zhang and Wu (1988) Theor. Appl. Genet. 76:835; Zhang et al. (1988) Plant Cell Rep. 7:379; Battraw and Hall (1992) Plant Sci. 86:191-202; Christou et al. (1991) Bio/Technology 9:957); rye (De la Pena et al. (1987) Nature 325:274); sugarcane (Bower and Birch (1992) Plant J. 2:409); tall fescue (Wang et al. (1992) Bio/Technology 10:691), and wheat (Vasil et al. (1992) Bio/Technology 10:667; U.S. Pat. No. 5,631,152).
[0186]"Plant" includes reference to whole plants, plant organs, plant tissues, seeds and plant cells and progeny of same. Plant cells include, without limitation, cells from seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and microspores.
[0187]"Progeny" comprises any subsequent generation of a plant.
[0188]"Transgenic plant" includes reference to a plant which comprises within its genome a heterologous polynucleotide. Preferably, the heterologous polynucleotide is stably integrated within the genome such that the polynucleotide is passed on to successive generations. The heterologous polynucleotide may be integrated into the genome alone or as part of a recombinant DNA construct.
[0189]Assays for gene expression based on the transient expression of cloned nucleic acid constructs have been developed by introducing the nucleic acid molecules into plant cells by polyethylene glycol treatment, electroporation, or particle bombardment (Marcotte et al., Nature 335:454-457 (1988); Marcotte et al., Plant Cell 1:523-532 (1989); McCarty et al., Cell 66:895-905 (1991); Hattori et al., Genes Dev. 6:609-18 (1992); Goff et al., EMBO J. 9:2517-2522 (1990)).
[0190]Transient expression systems may be used to functionally dissect isolated nucleic acid fragment constructs (see generally, Maliga et al., Methods in Plant Molecular Biology, Cold Spring Harbor Press (1995)). It is understood that any of the nucleic acid molecules of the present invention can be introduced into a plant cell in a permanent or transient manner in combination with other genetic elements such as vectors, promoters, enhancers etc.
[0191]In addition to the above discussed procedures the standard resource materials which describe specific conditions and procedures for the construction, manipulation and isolation of macromolecules (e.g., DNA molecules, plasmids, etc.), generation of recombinant organisms and screening and isolating of clones (see for example, Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (1989); Maliga et al., Methods in Plant Molecular Biology, Cold Spring Harbor Press (1995); Birren et al., Genome Analysis: Detecting Genes, 1, Cold Spring Harbor, N.Y. (1998); Birren et al., Genome Analysis: Analyzing DNA, 2, Cold Spring Harbor, N.Y. (1998); Plant Molecular Biology: A Laboratory Manual, eds. Clark, Springer, New York (1997)) are well known.
[0192]In another aspect, this invention concerns a method of mapping genetic variations related to controlling embryo/endosperm size during seed development and/or altering oil phenotypes in plants comprising: (a) crossing two plant varieties; and (b) evaluating genetic variations with respect to a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.
[0193]The terms "mapping genetic variation" or "mapping genetic variability" are used interchangeably and define the process of identifying changes in DNA sequence, whether from natural or induced causes, within a genetic region that differentiates between different plant lines, cultivars, varieties, families, or species. The genetic variability at a particular locus (gene) due to even minor base changes can alter the pattern of restriction enzyme digestion fragments that can be generated. Pathogenic alterations to the genotype can be due to deletions or insertions within the gene being analyzed or even single nucleotide substitutions that can create or delete a restriction enzyme recognition site. Restriction fragment length polymorphism (RFLP) analysis takes advantage of this and utilizes Southern blotting with a probe corresponding to the isolated nucleic acid fragment of interest.
[0194]Thus, if a polymorphism (i.e., a commonly occurring variation in a gene or segment of DNA; also, the existence of several forms of a gene (alleles) in the same species) creates or destroys a restriction endonuclease cleavage site, or if it results in the loss or insertion of DNA (e.g., a variable nucleotide tandem repeat (VNTR) polymorphism), it will alter the size or profile of the DNA fragments that are generated by digestion with that restriction endonuclease. As such, individuals that possess a variant sequence can be distinguished from those having the original sequence by restriction fragment analysis. Polymorphisms that can be identified in this manner are termed "restriction fragment length polymorphisms" ("RFLPs"). RFLPs have been widely used in human and plant genetic analyses (Glassberg, UK Patent Application 2135774; Skolnick et al., Cytogen. Cell Genet. 32:58-67 (1982); Botstein et al., Am. J. Hum. Genet. 32:314-331 (1980); Fischer et al., PCT Application WO 90/13668; Uhlen, PCT Application WO 90/11369).
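The effect of a single base change on a restriction fragment profile can be illustrated with a minimal Python sketch. The two toy alleles, the choice of the EcoRI recognition sequence GAATTC, and the helper name digest_lengths are illustrative assumptions and are not taken from the sequences of this disclosure. For simplicity the sketch cuts at the first base of the recognition site rather than at the enzyme's actual cleavage position.

```python
def digest_lengths(seq, site):
    """Return the fragment lengths obtained by cutting seq at each site.

    Simplification (assumption): the cut is placed at the first base of the
    recognition site, not at the enzyme's true cleavage position.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length fragments (a site at the very start of the sequence).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


SITE = "GAATTC"  # EcoRI recognition sequence, used only as an example

# Hypothetical alleles that differ by one base inside the recognition site.
allele_a = "ATGCCGTAGGAATTCTTAGCCGATAGGCTA"  # site present
allele_b = "ATGCCGTAGGAGTTCTTAGCCGATAGGCTA"  # A -> G destroys the site

fragments_a = digest_lengths(allele_a, SITE)
fragments_b = digest_lengths(allele_b, SITE)

print("allele A fragments:", fragments_a)
print("allele B fragments:", fragments_b)
```

In this sketch the allele carrying the site yields two fragments while the variant allele yields one uncut fragment, which is the size difference that a Southern blot or an agarose gel would reveal.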
[0195]A central attribute of "single nucleotide polymorphisms" or "SNPs" is that the site of the polymorphism is at a single nucleotide. SNPs have certain reported advantages over RFLPs or VNTRs. First, SNPs are more stable than other classes of polymorphisms. Their spontaneous mutation rate is approximately 10^-9 (Kornberg, DNA Replication, W.H. Freeman & Co., San Francisco, 1980), approximately 1,000 times less frequent than VNTRs (U.S. Pat. No. 5,679,524). Second, SNPs occur at greater frequency, and with greater uniformity than RFLPs and VNTRs. As SNPs result from sequence variation, new polymorphisms can be identified by random sequencing of genomic or cDNA molecules. SNPs can also result from deletions, point mutations and insertions. Any single base alteration, whatever the cause, can be a SNP. The greater frequency of SNPs means that they can be more readily identified than the other classes of polymorphisms.
[0196]SNPs can be characterized using any of a variety of methods. Such methods include the direct or indirect sequencing of the site, the use of restriction enzymes where the respective alleles of the site create or destroy a restriction site, the use of allele-specific hybridization probes, and the use of antibodies that are specific for the proteins encoded by the different alleles of the polymorphism or by other biochemical interpretation. SNPs can be sequenced by a number of methods. Two basic methods may be used for DNA sequencing, the chain termination method of Sanger et al., Proc. Natl. Acad. Sci. (U.S.A.) 74:5463-5467 (1977), and the chemical degradation method of Maxam and Gilbert, Proc. Natl. Acad. Sci. (U.S.A.) 74:560-564 (1977).
[0197]Furthermore, single point mutations can be detected by modified PCR techniques such as the ligase chain reaction ("LCR") and PCR-single strand conformational polymorphisms ("PCR-SSCP") analysis. The PCR technique can also be used to identify the level of expression of genes in extremely small samples of material, e.g., tissues or cells from a body. The technique is termed reverse transcription-PCR ("RT-PCR").
[0198]In another embodiment, this invention concerns a method of molecular breeding to obtain altered embryo/endosperm size during seed development and/or altered oil phenotypes in plants comprising: (a) crossing two plant varieties; and (b) evaluating genetic variations with respect to: (i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.
[0199]The term "molecular breeding" defines the process of tracking molecular markers during the breeding process. It is common for the molecular markers to be linked to phenotypic traits that are desirable. By following the segregation of the molecular marker or genetic trait, instead of scoring for a phenotype, the breeding process can be accelerated by growing fewer plants and eliminating assaying or visual inspection for phenotypic variation. The molecular markers useful in this process include, but are not limited to, any marker useful in identifying mappable genetic variations previously mentioned, as well as any closely linked genes that display synteny across plant species. The term "synteny" refers to the conservation of gene placement/order on chromosomes between different organisms. This means that two or more genetic loci, that may or may not be closely linked, are found on the same chromosome among different species. Another term for synteny is "genome colinearity".
EXAMPLES
[0200]The present invention is further defined in the following Examples, in which parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, various modifications of the invention in addition to those set forth and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.
[0201]The disclosure of each reference set forth herein is incorporated herein by reference in its entirety.
Example 1
Mapping of the Oryza sativa RE2 Locus to a Single Chromosome
[0202]Identification of the chromosome comprising the Oryza sativa RE2 locus was performed using Cleaved Amplified Polymorphic Sequence markers (CAPS markers). The Oryza sativa RE2 locus comprises the polynucleotide that, when mutated, is responsible for a reduced embryo 2, or re2, mutant phenotype as exemplified by Hong et al. (1996, Development 122:2051-2058). Mutant rice grains displaying the re2 phenotype show a reduced embryo size and an increased endosperm size. CAPS markers covering the entire rice genome were developed and, as set forth below, were used to identify the portion of the chromosome comprising the Oryza sativa RE2 locus.
Developing of CAPS Markers
[0203]Mapping of the RE2 locus to a single chromosome required first developing CAPS markers covering the entire rice genome. CAPS markers were developed as follows.
[0204]Oligonucleotide primer sets were designed based on rice genomic sequence information available in the NCBI database. Information relating to the position of the sequences in the rice chromosomes was retrieved from the web sites of the Rice Genome Research Program (RGP), Tsukuba, Japan, or the Clemson University Genomics Institute, Clemson, S.C. The oligonucleotide primer sets were used to amplify portions of genomic DNA prepared from Indica (cv. Kasalath), Japonica (cv. Taichung 65), and Japonica (cv. Kinmaze) rice. The amplified fragments were digested with restriction endonucleases and polymorphisms identified between the three wild type rice cultivars as follows.
[0205]Genomic DNA was prepared from leaves of the three rice cultivars as follows. A 3 g piece from the leaf blade was ground using a mortar and pestle and suspended in 8 mL DNA extraction buffer (0.1 M ethylenediaminetetraacetic acid [EDTA], 1% N-lauroylsarcosine, 100 μg/mL proteinase K). The suspended sample was incubated at 50° C. for 1 hour, and debris was removed by centrifuging at 3,400 rpm for 15 minutes using a RT-7 Plus centrifuge (Sorvall®) and transferring the supernatant to a fresh tube. The DNA was precipitated by adding 2 volumes of 100% ethanol and separated by centrifuging at 10,000 rpm for 15 minutes at 4° C. using an RC-5B centrifuge (Sorvall®). The DNA pellet was resuspended in 8 mL TE (10 mM tris, 1 mM EDTA) and reprecipitated with 16 mL 100% ethanol. After separation of the DNA pellet by centrifugation, it was resuspended in 3.7 mL TE, 50 μL 10 mg/mL ethidium bromide were added, the volume was brought up to 4 mL with TE, and 4.4 g CsCl were added. The solution was transferred to an OptiSeal® tube (Beckman) and centrifuged for 16 hours at 52,000 rpm at 25° C. using an NVT65.2 rotor in an L8-M centrifuge (Beckman). After centrifugation the DNA band was visualized using a UV lamp and 500 μL removed using an 18-gauge needle in a 1 mL syringe. The DNA band was transferred to a 1.5 mL tube and the ethidium bromide removed by adding 500 μL isopropanol saturated with 20×SSPE buffer, centrifuging at 14,000 rpm for 30 seconds using a 5415C centrifuge (Eppendorf), and discarding the isopropanol phase. Removal of the ethidium bromide was accomplished by repeating addition of isopropanol and centrifugation 6 times. The DNA was then precipitated by adding 100 μL TE and 500 μL 100% ethanol and separated by centrifuging at 14,000 rpm for 15 minutes. The recovered DNA pellet was resuspended in 400 μL TE and 40 μL 3 M NaOAc. The DNA was precipitated one more time with the addition of 1 mL 100% ethanol, separated by centrifuging at 14,000 rpm for 15 minutes, rinsed with 500 μL 70% ethanol, dried, and resuspended in water to a concentration of 10 ng/μL. The genomic DNA was amplified using the oligonucleotide primer sets designed above using the following PCR conditions:
[0206]Amplifications were performed in 30 μL reactions containing 1 μL DNA prepared above (at 10 ng/μL concentration), 2 μL of 2.5 mM dNTPs, 2 μL 25 mM MgCl2, 10 pmole of each primer, 0.3 μL AmpliTaq Gold (Perkin Elmer, Wellesley, Mass.), and 3 μL 10×PCR buffer. Amplification of DNA was performed by heating the reactions at 95° C. for 10 minutes followed by 40 cycles of 94° C. for 30 seconds, 56° C. for 30 seconds, and 72° C. for 30 seconds. Termination of the amplification reactions was accomplished by heating the reactions at 72° C. for 5 minutes.
[0207]Amplified DNA fragments were then digested with restriction endonucleases having 4 or 5 base recognition sites. Restriction endonuclease digestions were performed in 15 μL digestion reactions containing 2 μL of amplified DNA, 1.5 μL 10× reaction buffer, and 0.5 μL restriction enzyme. The digestion reactions were incubated for 1 hour at either 37° C. or at 60° C. depending on the restriction endonuclease being utilized. Digested DNA products were loaded on a 2.5% agarose gel and separated by electrophoresis to analyze polymorphisms. Comparison of the CAPS markers developed for Japonica and Indica rice allowed the development of 26 CAPS markers for wild type rice.
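The reaction composition and cycling program stated above imply final reagent concentrations and a minimum program length that can be worked out as a short Python sketch. The volumes, stock concentrations, temperatures and times are those given in the preceding paragraph; the function and variable names are illustrative assumptions, and the program length ignores temperature ramping between steps.

```python
TOTAL_VOLUME_UL = 30.0  # reaction volume stated in the text


def final_concentration(stock, volume_ul, total_ul=TOTAL_VOLUME_UL):
    """Dilution formula: C_final = C_stock * V_stock / V_total."""
    return stock * volume_ul / total_ul


dntp_mm = final_concentration(2.5, 2.0)    # 2 uL of 2.5 mM dNTPs
mgcl2_mm = final_concentration(25.0, 2.0)  # 2 uL of 25 mM MgCl2
buffer_x = final_concentration(10.0, 3.0)  # 3 uL of 10x PCR buffer
template_ng = 1.0 * 10.0                   # 1 uL of DNA at 10 ng/uL

# Cycling program: initial hold, 40 three-step cycles, final extension.
initial_s = 10 * 60
cycle_s = 30 + 30 + 30
final_s = 5 * 60
hold_time_min = (initial_s + 40 * cycle_s + final_s) / 60

print(f"dNTPs: {dntp_mm:.3f} mM, MgCl2: {mgcl2_mm:.3f} mM, buffer: {buffer_x:.0f}x")
print(f"template: {template_ng:.0f} ng, hold time: {hold_time_min:.0f} min")
```

Under these figures each reaction contains roughly 0.17 mM dNTPs, 1.67 mM MgCl2 and 1× buffer, and the programmed hold times total 75 minutes.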
Mapping of the Oryza sativa RE2 Locus to a Single Chromosome
[0208]Linkage between CAPS markers obtained for wild type rice and those obtained for re2 mutant plants was then analyzed. CAPS markers were prepared with genomic DNA from F3 Japonica rice plants whose F2 seed showed re2 phenotype and were compared to the CAPS markers prepared above. Two markers on chromosome 10 (markers C10 7.7 and C10 15.9) showed co-segregation with the re2-1 phenotype and were identified as follows.
[0209]Plants displaying an re2 mutant phenotype were obtained by crossing a Japonica cv. Taichung 65 mutant plant showing the re2-1 mutant phenotype with a plant of the Indica cultivar Kasalath and scoring the embryo phenotype of F2 mature seeds using a dissecting microscope. Twenty-eight (28) seeds showing re2 mutant phenotype were sterilized and sown in soil. Genomic DNA was extracted from the leaves of these 28 F3 re2 mutant plants as follows. Leaf samples, weighing 300 mg, were ground to powder in liquid nitrogen using a mortar and pestle. Each sample was then suspended in 750 μL extraction buffer containing 1.5 M NaCl, 0.2 M EDTA, 1 M tris and 3% CTAB (cetyltrimethylammonium bromide) and vortexed. Proteins were removed by adding 50 μL chloroform to the samples, shaking for 20 minutes, centrifuging briefly in a microfuge, and decanting the supernatant, containing the DNA, into a new tube. Genomic DNA was precipitated by adding 300 μL isopropanol, mixing by quick vortexing, and allowing the aqueous phase to precipitate. The pellet, containing the DNA, was recovered in H2O and used in amplification reactions as follows.
[0210]Marker C10 7.7 was amplified using oligonucleotide primers C10 6-3 and C10 6-4. Oligonucleotide primers C10 6-3 and C10 6-4 were developed as described above, have the nucleotide sequences set forth in SEQ ID NO:1 and SEQ ID NO:2, respectively, and have the sequences set forth as follows:
TABLE-US-00002
SEQ ID NO:1: 5'-TAGCAGCTGGGAAGAACAACATG-3'
SEQ ID NO:2: 5'-CGTGCACCACGTAACGTTAAGC-3'
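Basic properties of the two primers listed above (length, GC content, and a rough melting temperature) can be tabulated with a short Python sketch. The primer sequences are those of SEQ ID NO:1 and SEQ ID NO:2 as given above. The Wallace rule used here (2° C. per A or T, 4° C. per G or C) is only a coarse approximation intended for short oligonucleotides; it is an assumption of this sketch and is not stated to be the method by which the primers or the annealing temperature were chosen.

```python
PRIMERS = {
    "SEQ ID NO:1": "TAGCAGCTGGGAAGAACAACATG",
    "SEQ ID NO:2": "CGTGCACCACGTAACGTTAAGC",
}


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees Celsius by the Wallace rule."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Wallace Tm about {wallace_tm(seq)} C")
```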
[0211]Polymorphism was observed on CAPS marker C10 7.7 when the amplified DNA was digested with the restriction endonuclease Dde I, loaded on a 2.5% agarose gel, and separated by electrophoresis. Comparison of C10 7.7 CAPS markers allowed the identification of 4 recombination breakpoints between DNA prepared from wild type plants and that obtained from re2 mutant plants.
[0212]Marker C10 15.9 was amplified using oligonucleotide primers C10 15.9-1 and C10 15.9-2. Oligonucleotide primers C10 15.9-1 and C10 15.9-2 were developed as described above, have the nucleotide sequences set forth in SEQ ID NO:3 and SEQ ID NO:4, respectively, and have the sequences set forth as follows:
TABLE-US-00003
SEQ ID NO:3: 5'-CAGGGTTGTGTAAGGATCGTTG-3'
SEQ ID NO:4: 5'-GATCATCGTGTAGTACCAGGAC-3'
[0213]Polymorphism was observed on CAPS marker C10 15.9 when the amplified DNA was digested with the restriction endonuclease Msp I. This digestion produced additional bands in the Indica (Kasalath) background. Comparison of marker C10 15.9 prepared from DNA obtained from wild type plants with marker C10 15.9 prepared from DNA obtained from re2 mutant plants allowed the identification of 4 recombination breakpoints different from the ones identified with CAPS marker C10 7.7.
[0214]As explained above, comparison of CAPS markers prepared from DNA obtained from wild type rice and that obtained from F3 rice plants whose F2 seed showed re2 phenotype allowed the identification of 4 recombination breakpoints in CAPS marker C10 7.7 and 4 different recombination breakpoints in CAPS marker C10 15.9. These results indicate that the RE2 locus which contains the polynucleotide that when mutated is responsible for a re2 mutant phenotype maps to a region on chromosome 10 flanked by markers C10 7.7 and C10 15.9.
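The breakpoint counts reported above can be converted into approximate recombination fractions with the following Python sketch. It rests on two assumptions that are not stated in the text: that each of the 28 homozygous re2 plants contributes two informative chromosomes, and that each reported recombination breakpoint corresponds to one recombinant chromosome. The function name is illustrative.

```python
def recombination_fraction(recombinants, plants, chromosomes_per_plant=2):
    """Recombinant chromosomes divided by all chromosomes scored.

    Assumption: each plant is scored on two chromosomes and each
    breakpoint marks exactly one recombinant chromosome.
    """
    total = plants * chromosomes_per_plant
    if not 0 <= recombinants <= total:
        raise ValueError("recombinant count outside the scorable range")
    return recombinants / total


PLANTS = 28  # F3 re2 mutant plants analyzed in this Example
breakpoints = {"C10 7.7": 4, "C10 15.9": 4}

fractions = {m: recombination_fraction(n, PLANTS) for m, n in breakpoints.items()}
for marker, rf in fractions.items():
    print(f"{marker}: {breakpoints[marker]}/{2 * PLANTS} = {rf:.3f}")
```

Under these assumptions each marker shows recombination with the locus in about 7% of the chromosomes scored. Because the breakpoints found with the two markers are different, the locus is placed between the markers rather than to one side of both.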
Example 2
Map-Based Cloning of the Oryza sativa RE2 Gene
[0215]In Example 1 the RE2 locus, comprising the RE2 gene, was mapped to a region on chromosome 10 flanked by markers C10 7.7 and C10 15.9. This Example describes cloning of the RE2 gene from F2 recombinant plants produced by crossing a re2-1 mutant plant (Japonica cv. Taichung 65) with an Indica cultivar, Kasalath using CAPS markers as follows.
[0216]F2 seeds obtained from self-fertilized F1 plants were screened for the re2 mutant phenotype to obtain populations for cloning the RE2 gene. Seeds (308) displaying an re2 mutant phenotype were germinated on MS medium containing 0.3% gelrite and incubated in a growth chamber for 3 weeks with a 16 hour light/8 hour dark cycle. When the plants on the plates were at third leaf stage, 5-10 mm of the tip of the leaf was removed and used for DNA amplification. Direct PCR amplification reactions were carried out as described in Klimyuk et al. (1993 Plant J. 3:493-494) with a modification of extending the sample boiling time to 4 minutes after the neutralization step. Briefly, the leaf tissue was collected in a sterile vial containing 40 μL of 0.25 M NaOH and incubated 30 seconds in a boiling water bath. Samples were neutralized by adding 40 μL 0.25 M HCl and 20 μL 0.5 M Tris-HCl, pH 8.0 containing 0.25% (v/v) Nonidet P-40 and boiling for an additional 4 minutes. Tissue samples were used immediately for amplification or stored at 4° C. until needed. Each 30 μL amplification reaction contained 10 pmole of each primer, 2 μL of 2.5 mM dNTPs, 2 μL of 25 mM MgCl2, 1 μL leaf extract, 0.3 μL AmpliTaq Gold (Perkin Elmer), and 3 μL PCR buffer. The thermal cycler was set to 95° C. for 10 minutes, followed by 40 cycles of 94° C. for 4 minutes, 50° C. for 30 seconds, and 72° C. for 30 seconds followed by heating at 72° C. for 5 minutes.
[0217]DNA obtained from 44 of these 308 F2 recombinant plants contained breakpoints between CAPS markers C10 7.7 and C10 15.9 and were identified using CAPS markers C10 7.7 Hpy, C10 11.5, C10 11.0, C10 9.6, E08 93K, and E08 46K which were developed as follows.
[0218]Marker C10 7.7 Hpy was amplified using oligonucleotide primers C10-7.7 2 HPYIVF and C10-7.7 2 HPYIVR. Oligonucleotide primers C10-7.7 2 HPYIVF and C10-7.7 2 HPYIVR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:5 and SEQ ID NO:6, respectively, and have the sequences set forth as follows:
TABLE-US-00004
SEQ ID NO:5: 5'-ATTGTCTCGTGTGACAGCGC-3'
SEQ ID NO:6: 5'-CCGCAATTAATATTCCGAGC-3'
[0219]Polymorphism was observed on the C10 7.7 Hpy CAPS marker when the amplified DNA was digested with the restriction endonuclease HpyCH4 IV.
[0220]Marker C10 11.5 was amplified using oligonucleotide primers 11.5 HpyV and C10 11.5-9. Oligonucleotide primers 11.5 HpyV and C10 11.5-9 were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:7 and SEQ ID NO:8, respectively, and have the sequences set forth as follows:
TABLE-US-00005
SEQ ID NO:7: 5'-AAAGTGTGGTAGGTGTCATCCAGTTG-3'
SEQ ID NO:8: 5'-GCCACATGATCATCCACTACCAATG-3'
[0221]Polymorphism was observed on the C10 11.5 CAPS marker when the amplified DNA was digested with the restriction endonuclease HpyCH4 V.
[0222]Marker C10 11.0 was amplified using oligonucleotide primers C10 11-5 and 11 HinfR. Oligonucleotide primers C10 11-5 and 11 HinfR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:9 and SEQ ID NO:10, respectively, and have the sequences set forth as follows:
TABLE-US-00006
SEQ ID NO:9: 5'-CTTTTTCCGACCCACATGAAGGT-3'
SEQ ID NO:10: 5'-TACAAACGCTCCTAAACCACCATGT-3'
[0223]Polymorphism was observed on the C10 11.0 CAPS marker when the amplified DNA was digested with the restriction endonuclease Hinf I.
[0224]Marker C10 9.6 was amplified using oligonucleotide primers 9.6 DraIF and 9.6 DraIR. Oligonucleotide primers 9.6 DraIF and 9.6 DraIR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:11 and SEQ ID NO:12, respectively, and have the sequences set forth as follows:
TABLE-US-00007
SEQ ID NO:11: 5'-TTTGGGTGCATTAAAGTGGACCA-3'
SEQ ID NO:12: 5'-GGGGTAATTCGGATGACCATG-3'
[0225]Polymorphism was observed on the C10 9.6 CAPS marker when the amplified DNA was digested with the restriction endonuclease Dra I.
[0226]Marker E08 93K was amplified using oligonucleotide primers E08 93KF and E08 93KR. Oligonucleotide primers E08 93KF and E08 93KR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:13 and SEQ ID NO:14, respectively, and have the sequences set forth as follows:
TABLE-US-00008
SEQ ID NO:13: 5'-CTCATAGCCGCCTAGCCTCATAG-3'
SEQ ID NO:14: 5'-GAAGCAGAGAAACTCCAACCTGG-3'
[0227]Polymorphism was observed on the E08 93K CAPS marker when the amplified DNA was digested with the restriction endonuclease HpyCH4 V.
[0228]Marker E08 46K was amplified using oligonucleotide primers E08 46KF and E08 46KR. Oligonucleotide primers E08 46KF and E08 46KR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:15 and SEQ ID NO:16, respectively, and have the sequences set forth as follows:
TABLE-US-00009
SEQ ID NO:15: 5'-GTTCATAGGTGCCAAATTTGGGTG-3'
SEQ ID NO:16: 5'-CACAAGTAACCCAATGCCCAAAC-3'
[0229]Polymorphism was observed on the E08 46K CAPS marker when the amplified DNA was digested with the restriction endonuclease Rsa I.
[0230]Analysis of recombination breakpoints identified 6 recombination breakpoints between DNA obtained from re2 mutant plants and CAPS marker E08 93K and 3 recombination breakpoints between DNA obtained from re2 mutant plants and CAPS marker C10 9.6. Information relating to the position of the sequences of the CAPS markers in the rice chromosomes was retrieved from the web sites of the Rice Genome Research Program (RGP), Tsukuba, Japan, or the Clemson University Genomics Institute, Clemson, S.C. This information revealed that the sequences for CAPS markers E08 93K and C10 9.6 were derived from two overlapping BAC clones, OSJNBa0050E08 and OSJNBb0042K08, that cover 190 Kb on rice chromosome 10. At least 10 genes are found in this region.
[0231]An additional CAPS marker, K08 21K, and a single nucleotide polymorphism-based (SNP-based) marker were generated that were derived from BAC OSJNBb0042K08, and mapped 25 Kb apart.
[0232]Marker K08 21K was amplified using oligonucleotide primers K08 21KF and K08 21KR. Oligonucleotide primers K08 21KF and K08 21KR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:17 and SEQ ID NO:18, respectively, and have the sequences set forth as follows:
TABLE-US-00010
SEQ ID NO:17: 5'-GTTCACCCATTAGTGATGCCTGG-3'
SEQ ID NO:18: 5'-GTTCACTCGATAAGAGCAATCGAAC-3'
[0233]Polymorphism was observed on the K08 21K CAPS marker when the amplified DNA was digested with the restriction endonuclease Taq I.
[0234]SNP-based marker K08 46K was amplified using primers K08 46KF and K08 46KR. Oligonucleotide primers K08 46KF and K08 46KR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:19 and SEQ ID NO:20, respectively, and have the sequences set forth as follows:
TABLE-US-00011
SEQ ID NO:19: 5'-GTTATGTTGCACACCTCCAGTAGTTAC-3'
SEQ ID NO:20: 5'-GTCAAGCCTGCTGTTACCCTTTAAG-3'
[0235]Amplified DNA products were purified using a Qiagen PCR purification kit (Qiagen, Valencia, Calif.) and 100 ng of each purified DNA was used for direct sequencing. Of 9 recombination breakpoints analyzed 3 were found in marker K08 21K and 1 was found in marker K08 46K confining the RE2 gene to a 25 Kb region between these two markers.
[0236]This 25 Kb region contains DNA corresponding to two putative genes. One is gene OSJNBb0042K08.8 that is predicted to encode a myosin-like protein and is found in the NCBI database as Version AAL77142.1 having NCBI General Identifier No. 18652508. The other one is gene OSJNBb0042K08.9 that is predicted to encode a protein of unknown function and is found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509.
[0237]The regions corresponding to these two genes were sequenced in genomic DNA obtained from mutant alleles re2-1, re2-2 and re2-3 to identify the RE2 gene. Amplification of exon 1 was performed using oligonucleotide primers LOB-82F and LOB R1 and amplification of exon 2 was performed using oligonucleotide primers LOB F2 and LOB R2. Oligonucleotide primers LOB-82F, LOB R1, LOB F2, and LOB R2 have the nucleotide sequences set forth in SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, and SEQ ID NO:24, respectively, and have the sequences set forth as follows:
TABLE-US-00012
SEQ ID NO:21: 5'-GTCAAGCCTGCTGTTACCCTTTAAG-3'
SEQ ID NO:22: 5'-CCACCATGACGAACATCTAAATG-3'
SEQ ID NO:23: 5'-GTATAGCTCCCAACCATTTCTCCTC-3'
SEQ ID NO:24: 5'-CCAACATCACCATCATCGTCTTC-3'
[0238]Amplification reactions were carried out using the same conditions that were used for CAPS marker amplifications in Example 1 except that 20 ng of DNA was used per reaction and the annealing temperature was 55° C. Amplified DNA products were cloned into pGEM-T Easy Vector (Promega, Madison, Wis.) and, for each amplification reaction, plasmid DNA was prepared from at least 4 independent colonies using a Qiagen miniprep kit (Qiagen, Valencia, Calif.). Plasmids were sequenced using the M13 forward and reverse sequencing primers.
[0239]No mutation was found in the portion of DNA corresponding to the gene encoding the myosin-like protein, but mutations were found in the region encoding the unknown protein. This means that the RE2 gene has the sequence found in NCBI having locus tag OSJNBb0042K08.9 that is predicted to encode a protein of unknown function and is found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509.
[0240]The nucleotide sequence of the Oryza sativa RE2 gene is set forth in SEQ ID NO:25 and the amino acid sequence deduced from translating nucleotides 1 through 807 of SEQ ID NO:25 is set forth in SEQ ID NO:26. Nucleotides 808-810 of SEQ ID NO:25 correspond to a stop codon. The nucleotide sequence set forth in SEQ ID NO:25 is the same as the one found in the NCBI database having locus tag OSJNBb0042K08.9 that is predicted to encode a protein of unknown function. The amino acid sequence set forth in SEQ ID NO:26 is the same as the one for the protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509 that is set forth here in SEQ ID NO:27.
Identification of Mutations Responsible for an re2 Phenotype
[0241]Mutations in the RE2 gene responsible for the re2 phenotype were determined by comparing the nucleotide sequences obtained for DNA from wild-type rice with the nucleotide sequences obtained for DNA from rice exhibiting an re2 phenotype. Three re2 mutant alleles were identified and labeled re2-1, re2-2, and re2-3. The nucleotide sequence obtained for mutant allele re2-1 is set forth in SEQ ID NO:28 and the amino acid sequence obtained by translating nucleotides 1 through 807 of SEQ ID NO:28 is set forth in SEQ ID NO:29. Nucleotides 808 through 810 of SEQ ID NO:28 correspond to a stop codon. The nucleotide sequence obtained for mutant allele re2-2 is set forth in SEQ ID NO:30 and the amino acid sequence obtained by translating nucleotides 1 through 807 of SEQ ID NO:30 is set forth in SEQ ID NO:31. Nucleotides 808 through 810 of SEQ ID NO:30 correspond to a stop codon. The nucleotide sequence obtained for mutant allele re2-3 is set forth in SEQ ID NO:32 and the amino acid sequence obtained by translating nucleotides 1-378 of SEQ ID NO:32 is set forth in SEQ ID NO:33. Nucleotides 379 through 381 of SEQ ID NO:32 correspond to a stop codon.
[0242]FIG. 1A-C shows an alignment of the nucleotide sequences obtained for the coding regions of wild type RE2 (SEQ ID NO:25), and mutants re2-1 (SEQ ID NO:28), re2-2 (SEQ ID NO:30), and re2-3 (SEQ ID NO:32). Changes in the nucleotide sequence are indicated by a star below the alignment and by a box around the nucleotides at that position. As seen in FIG. 1, mutant allele re2-1 had a T residue at nucleotide 279, mutant allele re2-2 had a T residue at nucleotide 110, and mutant allele re2-3 had the C at nucleotide 75 deleted. These nucleotide changes result in changes in the amino acid sequence.
[0243]FIG. 2 shows an alignment of the amino acid sequences obtained for wild type RE2 protein (SEQ ID NO:26), and mutant proteins re2-1 (SEQ ID NO:29), and re2-2 (SEQ ID NO:31). Amino acids that change between the wild type and mutant are indicated by a box around the amino acids that are different at that position. As seen in FIG. 2, mutant allele re2-1 protein had an isoleucine at amino acid 93 instead of the highly-conserved threonine, and mutant allele re2-2 protein had a phenylalanine instead of the conserved cysteine at amino acid 37. The deletion of a nucleotide at position 75 in mutant allele re2-3 gene produced a frame shift that results in a 127 amino acid polypeptide (set forth in SEQ ID NO:33) that shares identity with the first 25 amino acids of wild type RE2 protein but whose remaining 102 amino acids share little or no homology with wild type RE2 protein or mutant proteins re2-1 or re2-2.
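The relationship between the nucleotide positions given for the three mutant alleles and the amino acid positions discussed above can be checked with a short Python sketch. It assumes that numbering begins at the first nucleotide of the initiation codon of the coding sequence of SEQ ID NO:25 and that the reading frame is uninterrupted; the function name is illustrative. The sketch reports only which codon a nucleotide falls in and does not predict the resulting amino acid.

```python
def codon_of(nucleotide):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    if nucleotide < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nucleotide - 1) // 3 + 1, (nucleotide - 1) % 3 + 1


# Nucleotide positions stated in the text for each mutant allele.
mutations = {"re2-1": 279, "re2-2": 110, "re2-3": 75}

codons = {allele: codon_of(nt) for allele, nt in mutations.items()}
for allele, (codon, pos) in codons.items():
    print(f"{allele}: nucleotide {mutations[allele]} lies in codon {codon}, "
          f"position {pos} of 3")

# Number of codons translated from nucleotides 1 through 807.
coding_codons = 807 // 3
print("codons in nucleotides 1-807:", coding_codons)
```

With this numbering, nucleotides 279, 110 and 75 fall in codons 93, 37 and 25, respectively, which are the amino acid positions named in the paragraph above.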
Example 3
Confirmation of the Function of the Oryza sativa RE2 Gene
[0244]Functional confirmation of the identity of the Oryza sativa RE2 gene identified in Example 2 was performed using genetic complementation. Rice callus cells derived from wild type and re2 mutant plants were transformed with a genomic DNA fragment comprising the RE2 gene. Restoration of the embryo size of the re2 mutant cells transformed with the genomic DNA fragment comprising the RE2 gene confirmed that the Oryza sativa RE2 gene identified in Example 2 is the sole target of mutations giving rise to the re2 phenotype. Cloning of the genomic fragment comprising the wild type RE2 gene and transformation into rice cells were performed as follows.
[0245]A genomic DNA fragment containing the wild type Oryza sativa RE2 gene was obtained from a lambda rice genomic DNA library (Stratagene) as follows. The genomic library was screened using a DNA probe obtained using primers LOB F2 (SEQ ID NO:23) and LOB R2 (SEQ ID NO:24) that, as indicated in Example 2, above, may be used to amplify exon 2 of the RE2 gene. Of 8 clones identified, one clone, named RE2G4, contained a 15 Kb insert comprising an approximately 9 Kb fragment flanked by two BamH I sites and comprising the RE2 gene. One of the BamH I sites was located 4472 bp upstream of the ATG initiation codon in the RE2 gene and the other one was located 3089 bp downstream of the termination codon of the RE2 gene. The nucleotide sequence of this approximately 9 Kb BamH I fragment is set forth in SEQ ID NO:34. In SEQ ID NO:34, nucleotides 4473 through 4829 correspond to a first exon, nucleotides 4830 through 5660 correspond to an intron, and nucleotides 5661 through 6110 correspond to the second exon. Nucleotides 6111 through 6113 form a termination codon.
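The stated exon, intron, and termination codon coordinates can be checked against the length of the RE2 coding region. The following sketch only does the arithmetic on the coordinates given above; the variable names are illustrative.

```python
# Feature coordinates within SEQ ID NO:34 (1-based, inclusive), as stated in the text.
features = {
    "exon 1": (4473, 4829),
    "intron": (4830, 5660),
    "exon 2": (5661, 6110),
    "stop codon": (6111, 6113),
}

def length(span):
    """Length of a 1-based inclusive coordinate span."""
    start, end = span
    return end - start + 1

lengths = {name: length(span) for name, span in features.items()}
coding_nt = lengths["exon 1"] + lengths["exon 2"]  # coding nucleotides, stop excluded
codons, remainder = divmod(coding_nt, 3)

print(lengths)
print(coding_nt + lengths["stop codon"])  # 810, the length of SEQ ID NO:25
print(codons, remainder)                  # 269 codons, remainder 0
```

The two exons together with the termination codon account for the 810 nucleotides of SEQ ID NO:25, and the 269 codons agree with the 269 amino acid length listed for SEQ ID NO:26.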
[0246]The approximately 9 Kb BamH I fragment comprising the RE2 coding region (set forth in SEQ ID NO:34) was removed from clone RE2G4 by digestion with BamH I and was subcloned into the BamH I site of the pML18 transformation vector to produce vector OsRE2pML18. Transformation vector pML18 is derived from the commercially available vector pGEM9z (obtained from Gibco-BRL which is owned by Invitrogen, Carlsbad, Calif.) and was modified by adding a cassette to express the bacterial hygromycin phosphotransferase gene. The bacterial hygromycin phosphotransferase gene confers resistance to the antibiotic used as selectable marker for rice transformation. A Sal I fragment, containing a cassette comprising the cauliflower mosaic virus 35S promoter, driving expression of the bacterial hygromycin phosphotransferase gene, followed by nucleotides 848 to 1550 of the 3' end of the nopaline synthase gene, was inserted at the Sal I site of vector pGEM9z to produce pML18. The nucleotide sequence of pML18 is set forth in SEQ ID NO:35.
[0247]Vector OsRE2pML18 was introduced into callus derived from wild type rice plants and from re2 mutant plants using a Biolistic PDS-1000/He gun (BioRAD Laboratories, Hercules, Calif.) and the particle bombardment technique (Klein et al. (1987) Nature (London) 327:70-73) as follows.
[0248]Embryogenic callus cultures derived from the scutellum of germinating rice seeds were used as source material for transformation experiments. This material was generated by germinating sterile rice seeds on N6-2,4D media (N6 salts, N6 vitamins, 2.0 mg/L 2,4-D, 100 mg/L myo-inositol, 300 mg/L casamino acids, and 2.7 g/L proline) in the dark at 27-28° C. Embryogenic callus proliferating from the scutellum of the embryos was then transferred to fresh N6-2,4D media. Callus cultures were maintained by routine sub-culture at two-week intervals and used for transformation within 4 weeks of initiation. The regeneration, development, and cultivation of plants from single plant protoplast transformants or from various transformed explants are well known in the art (Weissbach and Weissbach (Eds.), In: Methods for Plant Molecular Biology, Academic Press, Inc., San Diego, Calif., (1988)).
[0249]Callus was prepared for transformation by arranging 0.5-1.0 mm callus pieces approximately 1 mm apart in a circular area of about 4 cm in diameter in the center of a circle of Whatman #541 paper placed on CM media and incubating in the dark at 27-28° C. for 3-5 days. Vector OsRE2pML18 was introduced into wild type callus cells and re2 mutant rice callus cells using a Biolistic PDS-1000/He gun (BioRAD Laboratories, Hercules, Calif.).
[0250]Transformation of mutant callus with vector OsRE2pML18 produced 16 transgenic plants of which 7 transgenic plants produced seed. T2 seed from 6 plants showed a wild type to re2 mutant phenotype segregating at a 3:1 ratio. Restoration of wild type phenotype in re2 mutant plants by vector OsRE2pML18 indicates that the 9,203 bp rice genomic DNA fragment present in vector OsRE2pML18 was capable of complementing an re2 mutation. This confirms that the Oryza sativa RE2 gene has the sequence found in NCBI having locus tag OSJNBb0042K08.9 that is predicted to encode a protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509. These results also indicate that the 9,203 bp rice genomic DNA fragment in vector OsRE2pML18 used in these transformations and set forth in SEQ ID NO:34 contains the complete set of regulatory elements required for proper complementation of an re2 mutant phenotype and involved in altering embryo/endosperm size during seed development.
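A 3:1 segregation of wild type to mutant seed is the ratio expected when a single dominant locus segregates in selfed progeny. Whether observed seed counts fit that ratio is commonly assessed with a chi-square goodness-of-fit test. The sketch below is illustrative only: the source reports no seed counts, so the counts used here are hypothetical, and the function name is an assumption.

```python
def chi_square_3_to_1(wild_type, mutant):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = wild_type + mutant
    expected = (0.75 * total, 0.25 * total)
    observed = (wild_type, mutant)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_5PCT_DF1 = 3.841

# Hypothetical counts for illustration only; the source reports no seed counts.
for wt_seeds, mutant_seeds in [(152, 48), (120, 80)]:
    stat = chi_square_3_to_1(wt_seeds, mutant_seeds)
    verdict = "consistent with 3:1" if stat < CRITICAL_5PCT_DF1 else "deviates from 3:1"
    print(wt_seeds, mutant_seeds, round(stat, 2), verdict)
```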
Example 4
Composition of cDNA Libraries: Isolation and Sequencing of cDNA Clones Encoding Polypeptides Involved in Altering Embryo/Endosperm Size During Seed Development
[0251]The sequence of the Oryza sativa RE2 gene was identified in Example 2 and its function was confirmed in Example 3 as being involved in altering embryo/endosperm size during seed development. Identification of genes from other crops involved in altering embryo/endosperm size during seed development is set forth in Examples 4 and 5. cDNAs encoding polypeptides homologous to rice RE2 protein were identified by electronically screening the Du Pont proprietary database using BLAST analysis (Basic Local Alignment Search Tool; Altschul et al. (1990) J. Mol. Biol. 215:403-410). Clones derived from cDNA libraries representing mRNAs from various corn (Zea mays), Euphorbia lagascae, columbine (Aquilegia vulgaris), guar (Cyamopsis tetragonoloba), rice (Oryza sativa), soybean (Glycine max), and wheat (Triticum aestivum) tissues were identified as encoding homologs to the rice RE2 protein. The libraries were prepared as described below. The characteristics of the libraries are described in Table 1.
TABLE 1
Libraries from Corn, Euphorbia lagascae, Columbine, Guar, Rice, Soybean, and Wheat

Library | Tissue | Clone
cef1f | Corn entire fertilized ear 3 to 12 days after pollination | cef1f.pk001.f4:fis
cpf1c | Corn pooled BMS treated with chemicals related to protein synthesis (1) | cpf1c.pk006.d18a:fis
cpi1c | Corn pooled BMS treated with chemicals related to biochemical compound synthesis (2) | cpi1c.pk005.a12:fis
cr1n | Corn root from 7 day old seedlings (3) | cr1n.pk0028.h3a:fis
eel1c | Euphorbia lagascae developing seeds | eel1c.pk003.b10:fis
eav1c | Columbine developing seeds | eav1c.pk003.c9
lds3c | Guar seeds harvested 32 days after flowering | lds3c.pk011.j11:fis
sdr1f | Soybean 10 day old root | sdr1f.pk005.d21.f:fis
wdr1f | Wheat entire developing root | wdr1f.pk002.l10:fis

(1) Chemicals used included chloramphenicol, cycloheximide, aurintricarboxylic acid.
(2) Chemicals used included sorbitol, ergosterol, taxifolin, methotrexate, D-mannose, D-galactose, alpha-amino adipic acid, ancymidol.
(3) This library was normalized essentially as described in U.S. Pat. No. 5,482,845.
[0252]cDNA libraries representing mRNAs from the tissues described in Table 1 were prepared in Uni-ZAP® XR vectors according to the manufacturer's protocol (Stratagene Cloning Systems, La Jolla, Calif.). Conversion of the Uni-ZAP® XR libraries into plasmid libraries was accomplished according to the protocol provided by Stratagene. Upon conversion, cDNA inserts were contained in the plasmid vector pBluescript. cDNA inserts from randomly picked bacterial colonies containing recombinant pBluescript plasmids were amplified via polymerase chain reaction using primers specific for vector sequences flanking the inserted cDNA sequences or plasmid DNA was prepared from cultured bacterial cells. Amplified insert DNAs or plasmid DNAs were sequenced in dye-primer sequencing reactions to generate partial cDNA sequences (expressed sequence tags or "ESTs"; see Adams, M. D. et al., (1991) Science 252:1651). The resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent sequencer.
[0253]Full-insert sequence (FIS) data was generated utilizing a modified transposition protocol. Clones identified for FIS were recovered from archived glycerol stocks as single colonies, and plasmid DNAs were isolated via alkaline lysis. Isolated DNA templates were reacted with vector primed M13 forward and reverse oligonucleotides in a PCR-based sequencing reaction and loaded onto automated sequencers. Confirmation of clone identification was performed by sequence alignment to the original EST sequence from which the FIS request was made.
[0254]Confirmed templates were transposed via the Primer Island transposition kit (PE Applied Biosystems, Foster City, Calif.) which is based upon the Saccharomyces cerevisiae Ty1 transposable element (Devine and Boeke (1994) Nucleic Acids Res. 22:3765-3772). The in vitro transposition system places unique binding sites randomly throughout a population of large DNA molecules. The transposed DNA was then used to transform DH10B electro-competent cells (Gibco BRL/Life Technologies, Rockville, Md.) via electroporation. The transposable element contains an additional selectable marker (named DHFR; Fling and Richards (1983) Nucleic Acids Res. 11:5147-5158), allowing for dual selection on agar plates of only those subclones containing the integrated transposon. Multiple subclones were randomly selected from each transposition reaction, plasmid DNAs were prepared via alkaline lysis, and templates were sequenced (ABI Prism dye-terminator ReadyReaction mix) outward from the transposition event site, utilizing unique primers specific to the binding sites within the transposon.
[0255]Sequence data was collected (ABI Prism Collections) and assembled using Phred and Phrap (Ewing et al. (1998) Genome Res. 8:175-185; Ewing and Green (1998) Genome Res. 8:186-194). Phred re-reads the ABI sequence data, re-calls the bases, assigns quality values, and writes the base calls and quality values into editable output files. Phrap is a sequence assembly program that uses the quality values assigned by Phred to increase the accuracy of the assembled sequence contigs. Assemblies are viewed using the Consed sequence editor (Gordon et al. (1998) Genome Res. 8:195-202).
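The quality values assigned by Phred are conventionally defined on a logarithmic scale, Q = -10 log10(P), where P is the estimated probability that a base call is wrong. A minimal sketch of that relationship follows; the function names are illustrative and are not part of Phred or Phrap.

```python
import math

def phred_to_error_prob(q):
    """Convert a Phred quality value to the estimated base-call error probability."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Convert an estimated base-call error probability to a Phred quality value."""
    return -10 * math.log10(p)

for q in (10, 20, 30, 40):
    print(q, phred_to_error_prob(q))
```

On this scale, a quality value of 20 corresponds to roughly one miscalled base in 100 and a value of 30 to roughly one in 1,000.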
Example 5
Identification and Characterization of cDNA Clones Encoding Putative Homologs of the Oryza sativa RE2 Protein
[0256]Clones containing cDNA inserts encoding polypeptides homologous to rice RE2 protein were identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul et al. (1990) J. Mol. Biol. 215:403-410) searches for similarity to sequences contained in the Du Pont proprietary database. The sequences identified were also compared, using BLAST, to the Genbank database.
[0257]A BLASTX search was performed to identify cDNAs encoding proteins similar to those encoded by the RE2 gene. BLASTX compares the translation, in all six reading frames, of the nucleotide query sequence to a protein database. As mentioned in Example 2, the Oryza sativa RE2 gene has the sequence found in the NCBI database having locus tag OSJNBb0042K08.9 that is predicted to encode a protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509. Thus, the polypeptides encoded by the cDNAs identified in the BLASTX search are similar to the protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509.
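The six-frame translation that BLASTX applies to a nucleotide query can be sketched as follows. This is an illustration of the translation step only, not of the BLAST search itself; the function names and the nine-nucleotide test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate_frame(dna, offset):
    """Translate every complete codon starting at offset; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3))

def six_frame_translation(dna):
    """Return the three forward and three reverse-strand translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate_frame(dna, offset)
        frames[f"-{offset + 1}"] = translate_frame(rc, offset)
    return frames

# Hypothetical nine-nucleotide query used only to show the six outputs.
for name, protein in six_frame_translation("ATGGCCTGA").items():
    print(name, protein)
```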
[0258]The BLASTX search using the nucleotide sequences from the clones listed in Table 1 revealed that the polypeptides encoded by these cDNAs had similarity to the Oryza sativa protein having NCBI General Identifier No. 18652509 and the Arabidopsis thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164. Set forth in Table 2 are the BLASTX results for individual ESTs ("EST"), or for the sequences of the entire cDNA inserts comprising the indicated cDNA clones ("FIS"):
TABLE 2
BLAST Results for Sequences Encoding Polypeptides Homologous To O. sativa RE2 Protein and A. thaliana LOB Domain 18 Protein

Clone | aa SEQ ID NO: | Status | BLAST pLog Score to 18652509 | BLAST pLog Score to 17227164
rice RE2 | 26 | FIS | >180.00 | 73.70
cef1f.pk001.f4:fis | 37 | FIS | 58.00 | 58.30
cpf1c.pk006.d18a:fis | 39 | FIS | 36.00 | 38.22
cpi1c.pk005.a12:fis | 41 | FIS | 60.70 | 58.15
cr1n.pk0028.h3a:fis | 43 | FIS | 34.22 | 34.40
eel1c.pk003.b10:fis | 45 | FIS | 59.05 | 69.40
eav1c.pk003.c9 | 47 | EST | 47.00 | 51.70
lds3c.pk011.j11:fis | 49 | FIS | 53.10 | 56.40
sdr1f.pk005.d21.f:fis | 51 | FIS | 39.15 | 41.10
wdr1f.pk002.l10:fis | 53 | FIS | 36.30 | 33.70
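The pLog scores in Table 2 can be compared programmatically. The sketch below rests on an assumption that is not stated in this section: that a pLog score is the negative base-10 logarithm of the BLAST P-value, so that a larger pLog means a more significant match. The dictionary and function names are illustrative.

```python
# BLASTX pLog scores from Table 2: clone -> (vs. GI 18652509, vs. GI 17227164).
# The rice RE2 row is omitted because its first score is reported only as ">180.00".
PLOG = {
    "cef1f.pk001.f4:fis": (58.00, 58.30),
    "cpf1c.pk006.d18a:fis": (36.00, 38.22),
    "cpi1c.pk005.a12:fis": (60.70, 58.15),
    "cr1n.pk0028.h3a:fis": (34.22, 34.40),
    "eel1c.pk003.b10:fis": (59.05, 69.40),
    "eav1c.pk003.c9": (47.00, 51.70),
    "lds3c.pk011.j11:fis": (53.10, 56.40),
    "sdr1f.pk005.d21.f:fis": (39.15, 41.10),
    "wdr1f.pk002.l10:fis": (36.30, 33.70),
}

def plog_to_p(plog):
    """Assumption: pLog is the negative base-10 logarithm of the BLAST P-value."""
    return 10.0 ** (-plog)

def best_hit(clone):
    """Return the reference protein with the higher pLog score for a clone."""
    rice, arabidopsis = PLOG[clone]
    return "GI 18652509" if rice > arabidopsis else "GI 17227164"

for clone, scores in PLOG.items():
    print(clone, best_hit(clone), plog_to_p(max(scores)))
```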
[0259]The data set forth in Table 3 presents the percent identity, calculated using the Clustal V method of alignment, of the amino acid sequences set forth in SEQ ID NOs:26, 37, 39, 41, 43, 45, 47, 49, 51, and 53, with the Oryza sativa protein having NCBI General Identifier No. 18652509 (set forth in SEQ ID NO:27), and the Arabidopsis thaliana LOB domain 18 protein (NCBI General Identifier No. 17227164; set forth in SEQ ID NO:54).
TABLE 3
Percent Identity of Amino Acid Sequences Deduced From Nucleotide Sequences of cDNA Clones Encoding Putative O. sativa RE2 Homolog Polypeptides

Clone | aa SEQ ID NO. | Percent Identity to 18652509 | Percent Identity to 17227164
rice RE2 | 26 | 100.00 | 49.6
cef1f.pk001.f4:fis | 37 | 59.1 | 56.9
cpf1c.pk006.d18a:fis | 39 | 38.5 | 40.4
cpi1c.pk005.a12:fis | 41 | 57.8 | 52.7
cr1n.pk0028.h3a:fis | 43 | 45.8 | 47.6
eel1c.pk003.b10:fis | 45 | 53.4 | 58.6
eav1c.pk003.c9 | 47 | 79.2 | 85.4
lds3c.pk011.j11:fis | 49 | 41.8 | 47.4
sdr1f.pk005.d21.f:fis | 51 | 40.8 | 42.3
wdr1f.pk002.l10:fis | 53 | 43.9 | 41.2
[0260]Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal V method of alignment (Higgins, D. G. and Sharp, P. M. (1989) Comput. Appl. Biosci. 5:151-153; Higgins, D. G. et al. (1992) Comput. Appl. Biosci. 8:189-191) and the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal V method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. Sequence alignments and BLAST scores and probabilities indicate that the nucleic acid fragments comprising the instant cDNA clones encode polypeptides with homology to the O. sativa RE2 protein and the A. thaliana LOB domain 18 protein.
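Once two sequences have been aligned, percent identity reduces to counting matching columns. The sketch below shows one common convention, in which columns containing a gap are excluded from the comparison. It is an assumption that this matches the calculation performed by Megalign, whose exact formula is not stated here, and the aligned strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        matches += (a == b)
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical pre-aligned fragments, for illustration only.
print(percent_identity("PCGACKFLRRKC-VQG", "PCGACKFLRRRCAVEG"))
```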
Example 6
Structure of the Oryza sativa RE2 Protein and its Putative Homologs
[0261]As set forth in Table 3 of Example 5, the amino acid sequence of the RE2 polypeptide (SEQ ID NO:26) set forth in Example 2, above, and shown to be able to complement an re2 mutant phenotype, was identical to the Oryza sativa protein having NCBI General Identifier No. 18652509 (SEQ ID NO:27) and had sequence similarity to the Arabidopsis thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164 (set forth in SEQ ID NO:54).
[0262]The LOB domain 18 protein is considered to belong in the class I group of the Lateral Organ Boundaries (LOB) domain protein plant-specific gene family. The Class I LOB domain proteins contain a C-block, a GAS-block, and a leucine zipper motif (Shuai, B. et al., 2002, Plant Phys. 129:747-761). Thus, it is expected that the Oryza sativa RE2 protein and its homologs also contain a C-block, a GAS-block, and a leucine zipper motif. The consensus sequences of these motifs were identified using a Clustal V alignment and are indicated in FIG. 3A-C.
[0263]FIG. 3A-C depicts the Clustal V alignment obtained for the amino acid sequences from the wild type rice RE2 protein (SEQ ID NO:26), the O. sativa protein having NCBI General Identifier No. 18652509 (SEQ ID NO:27), the A. thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164 (SEQ ID NO:54), and the amino acid sequences of the polypeptides encoded by corn clones cef1f.pk001.f4:fis (SEQ ID NO:37), cpf1c.pk006.d18a:fis (SEQ ID NO:39), cpi1c.pk005.a12:fis (SEQ ID NO:41), and cr1n.pk0028.h3a:fis (SEQ ID NO:43), Euphorbia lagascae clone eel1c.pk003.b10:fis (SEQ ID NO:45), columbine clone eav1c.pk003.c9 (SEQ ID NO:47), guar clone lds3c.pk011.j11:fis (SEQ ID NO:49), soybean clone sdr1f.pk005.d21.f:fis (SEQ ID NO:51), and wheat clone wdr1f.pk002.l10:fis (SEQ ID NO:53). The program uses dashes to maximize the alignment. An asterisk (*) below the alignment indicates amino acids conserved among all the sequences. The conserved C-block, GAS-block, and leucine zipper motifs are boxed.
[0264]Table 4 sets forth the amino acid positions of the C-Block, GAS Block, and leucine zipper conserved amino acid domains in SEQ ID NOs:26, 54, 37, 39, 41, 43, 45, 47, 49, 51, and 53. The amino acids in each domain are indicated in FIG. 3A-C and the consensus sequence for each domain is described below the table.
TABLE 4
Location of the Conserved Domains in Oryza sativa RE2 and its Putative Homologs

SEQ ID NO: | C-Block | GAS Block N-end | GAS Block C-end | Leu Zipper
26/27 | 33-54 | 63-74 | 103-111 | 116-134
54 | 37-58 | 67-78 | 107-115 | 120-138
37 | 34-55 | 64-75 | 104-112 | 117-135
39 | 24-45 | 54-65 | 94-102 | 107-125
41 | 32-53 | 62-73 | 102-110 | 115-133
43 | 24-45 | 44-55 | 84-92 | 97-115
45 | 30-51 | 60-71 | 100-108 | 113-131
47 | | 22-33 | 62-70 | 75-93
49 | 20-41 | 50-61 | 90-98 | 103-121
51 | 16-37 | 46-57 | 86-94 | 99-117
53 | 12-33 | 42-53 | 82-90 | 95-113
[0265]In the following consensus sequences the amino acids are indicated with their one letter code. Positions where more than one amino acid is found are indicated in parentheses, with the alternative amino acids separated by a slash. An X is used where any amino acid may be present at a given position. The amino acids comprising the C-Block, GAS Block N-end and C-end, and Leu Zipper identified here follow:
[0266]The C block consensus sequence found in RE2 homologs is set forth in SEQ ID NO:55 and corresponds to:
SEQ ID NO:55: PCGACKFLRR(K/R)C(V/Q/A)X(G/D/E)C(V/I)FAP(Y/H)F
[0267]The GAS block has 49 amino acids that have an N-end consensus sequence set forth in SEQ ID NO:56 and a C-end consensus sequence set forth in SEQ ID NO:57.
SEQ ID NO:56: FAA(V/I)HKVFGASN
SEQ ID NO:57: RDP(V/I)(F/Y)GCV(A/S)
[0268]The consensus sequence for the Leucine Zipper domain is set forth in SEQ ID NO:58 and corresponds to:
SEQ ID NO:58: LQ(Q/H)QV(A/V/G)XLQX(E/Q)(L/V)X(Y/Q/H)(L/A/V)(Q/K/R)X(H/Q/Y)(L/V)
[0269]The C-Block, GAS Block N-end and C-end, and Leu Zipper consensus sequences set forth above were identified in a Clustal V alignment of polypeptides similar to the Oryza sativa RE2. Thus, they should be present in any polypeptide having the same function in altering embryo/endosperm size during seed development as the Oryza sativa RE2 polypeptide.
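The consensus notation used above maps directly onto regular expressions, which makes it straightforward to locate the conserved domains in a candidate polypeptide. The following sketch is illustrative rather than the method used to build Table 4: it converts SEQ ID NOs:55-58 into patterns and searches residues 1-134 of wild type rice RE2 (SEQ ID NO:26, as translated from SEQ ID NO:25). The function names are assumptions.

```python
import re

# Consensus sequences as written in the text (SEQ ID NOs:55-58).
CONSENSUS = {
    "C-Block": "PCGACKFLRR(K/R)C(V/Q/A)X(G/D/E)C(V/I)FAP(Y/H)F",
    "GAS Block N-end": "FAA(V/I)HKVFGASN",
    "GAS Block C-end": "RDP(V/I)(F/Y)GCV(A/S)",
    "Leu Zipper": "LQ(Q/H)QV(A/V/G)XLQX(E/Q)(L/V)X(Y/Q/H)(L/A/V)(Q/K/R)X(H/Q/Y)(L/V)",
}

# Residues 1-134 of wild type rice RE2 (SEQ ID NO:26), one-letter code,
# written 16 residues per line.
RE2_1_134 = (
    "MSSSVVVSASGSGSGG"
    "GGGGGGGGAGGGGGGG"
    "PCGACKFLRRKCVQGC"
    "IFAPYFDSEAGAAHFA"
    "AVHKVFGASNVSKLLQ"
    "QIPAHRRLDAVVTICY"
    "EAQARLRDPVYGCVAH"
    "IFHLQHQVAGLQSELN"
    "YLQGHL"
)

def consensus_to_regex(consensus):
    """'(K/R)' becomes '[KR]' and 'X' becomes '.' (any residue)."""
    pattern = re.sub(
        r"\(([A-Z/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )
    return pattern.replace("X", ".")

def locate(consensus, protein):
    """Return the 1-based (start, end) of the first match, or None."""
    match = re.search(consensus_to_regex(consensus), protein)
    return (match.start() + 1, match.end()) if match else None

for name, consensus in CONSENSUS.items():
    print(name, locate(consensus, RE2_1_134))
```

For the rice RE2 sequence the positions found this way are 33-54, 63-74, 103-111, and 116-134, which agree with the first row of Table 4.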
Example 7
Cloning and Sequencing of a Genomic Fragment Encoding a Maize Putative RE2 Homolog and Preparation of a Recombinant DNA Construct to Complement re2 Mutant Plants
[0270]A genomic DNA fragment encoding a corn RE2 homolog was amplified from a maize genomic library, cloned and sequenced. Then, the portion of DNA from the initiator ATG to the terminator codon of the fragment encoding the maize RE2 homolog was used to replace the portion of DNA from the initiator ATG to the terminator codon encoding the rice RE2 protein in vector OsRE2pML18 as follows.
Cloning and Sequencing of a Genomic Fragment Encoding a Maize RE2 Homolog
[0271]The polynucleotide in cDNA clone cpi1c.pk005.a12 was identified in Example 5 as encoding a polypeptide with similarity to the Oryza sativa RE2 protein. A genomic fragment comprising the open reading frame in clone cpi1c.pk005.a12 was amplified from a maize genomic library (Stratagene, Catalog No. 946102) using oligonucleotide primers Cpi Bbsl F and Cpi Bsal R. Oligonucleotide primers Cpi Bbsl F and Cpi Bsal R were designed based on the sequence of clone cpi1c.pk005.a12, are set forth in SEQ ID NO:59 and SEQ ID NO:60, respectively, and have the sequences set forth as follows:
SEQ ID NO:59: 5'-GAAGACCAATGAGCGCTGGCGGCGGCAGCAG-3'
SEQ ID NO:60: 5'-GGTCTCCTCATCTTGAGTGTGGCGGCGGGTGCTC-3'
[0272]Amplification was performed using the conditions suggested by the manufacturer of the library. The amplified DNA product comprising a maize RE2 homolog gene was named ZmRE2 ORF, was cloned into vector pGEM-T-easy, and was sequenced. The nucleotide sequence obtained for ZmRE2 ORF is set forth in SEQ ID NO:61. Nucleotides 79 through 429 correspond to the first exon, nucleotides 430 through 1363 correspond to an intron, nucleotides 1364 through 1784 correspond to the second exon, and nucleotides 1785 to 1787 correspond to a stop codon.
A. Preparation of a Recombinant DNA Construct Encoding a Putative Maize RE2 Homolog
[0273]A recombinant DNA construct was prepared in which a genomic DNA fragment encoding a maize RE2 homolog present in ZmRE2 ORF was used to replace the Oryza sativa RE2 coding region in vector OsRE2pML18 (prepared in Example 3, above). The resulting chimeric construct comprises the genomic DNA fragment encoding a maize RE2 homolog (referred to as ZmRE2 ORF) surrounded by the sequences upstream of the initiator ATG and downstream of the termination codon from vector OsRE2pML18. This chimeric construct was prepared by amplifying portions upstream of the initiator ATG and downstream of the termination signal in vector OsRE2pML18, adding these portions to the pGEM-T-easy vector containing ZmRE2 ORF and then replacing the Oryza sativa RE2 coding sequence with this chimeric fragment in vector OsRE2pML18 as follows.
B. Amplification of a Fragment 5' of the O. sativa RE2 Gene in Vector OsRE2pML18
[0274]A portion of the DNA fragment 5' of the initiator ATG in vector OsRE2pML18 was amplified using oligonucleotide primers RE2 pro Bst 2F and RE2 PRO R Bbs. Oligonucleotide primers RE2 pro Bst 2F and RE2 PRO R Bbs are set forth in SEQ ID NO:62 and SEQ ID NO:63, respectively, and have the sequences set forth as follows:
SEQ ID NO:62: 5'-CACCATCATGTCAGTGTGCCAATACGCTAAACTTAGAAGA-3'
SEQ ID NO:63: 5'-GAAGACGCTCATTCTTGGAATGAGCCCCCA-3'
[0275]The amplified fragment comprises a portion of the Oryza sativa RE2 promoter and was cloned in pGEM-T-easy (Promega) to create plasmid RE2PRO whose sequence is set forth in SEQ ID NO:64.
C. Preparation of a Chimera Comprising the Fragments Amplified in A and B Above
[0276]Digestion of the pGEM-T-easy vector containing ZmRE2 ORF (prepared in A above) with Bbs I and Aat II produced a 1760 bp fragment. Restriction endonuclease Bbs I cuts the pGEM-T-easy vector containing ZmRE2 ORF immediately upstream of the initiator ATG and Aat II cuts in the vector, downstream of the maize stop codon. Plasmid RE2PRO was digested with Bbs I which cuts immediately upstream of the initiator ATG, and with Sal I which cuts in the vector's multiple cloning region. The 4316 bp fragment obtained from plasmid RE2PRO was ligated to the 1760 bp fragment obtained from the pGEM-T-easy vector containing ZmRE2 ORF by introducing the fragments in DH10B competent cells (Invitrogen). The resulting plasmid contains a portion of the Oryza sativa RE2 promoter region operably linked to the first codon of the genomic fragment encoding a maize RE2 homolog in ZmRE2 ORF.
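The reason Bbs I digestion allows the rice promoter to be joined directly at the initiator ATG can be seen from the primer sequences. The sketch below assumes the commonly documented Bbs I cleavage geometry, GAAGAC(2/6), in which the enzyme cuts 2 nucleotides past its recognition site on one strand and 6 on the other, leaving a 4-nucleotide 5' overhang. The function names are illustrative.

```python
BBSI_SITE = "GAAGAC"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primers carrying a Bbs I site, as set forth in the text.
SEQ_ID_59 = "GAAGACCAATGAGCGCTGGCGGCGGCAGCAG"   # forward primer for ZmRE2 ORF
SEQ_ID_63 = "GAAGACGCTCATTCTTGGAATGAGCCCCCA"    # reverse primer for the rice promoter

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def bbsi_overhang(strand):
    """4-nt 5' overhang left on the fragment downstream of the Bbs I site,
    read 5'->3' on the given strand (assumes GAAGAC(2/6) cleavage)."""
    cut = strand.index(BBSI_SITE) + len(BBSI_SITE) + 2
    return strand[cut:cut + 4]

orf_overhang = bbsi_overhang(SEQ_ID_59)
promoter_overhang = bbsi_overhang(SEQ_ID_63)

print(orf_overhang)       # overhang on the maize ORF fragment
print(promoter_overhang)  # overhang on the rice promoter fragment
print(reverse_complement(promoter_overhang) == orf_overhang)  # compatible ends?
```

Under that assumption both fragments carry complementary overhangs that begin at the ATG, so ligation joins the promoter to the first codon without adding extra nucleotides.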
D. Amplification of a Fragment 3' of the O. sativa RE2 Gene in Vector OsRE2pML18
[0277]A portion of the DNA fragment 3' of the termination signal in vector OsRE2pML18 was amplified using oligonucleotide primers RE2 TERM Xbal R and RE2 TERM EcoBspml. Oligonucleotide primers RE2 TERM Xbal R and RE2 TERM EcoBspml are set forth in SEQ ID NO:65 and SEQ ID NO:66, respectively, and have the sequences set forth as follows:
SEQ ID NO:65: 5'-GTAAAAGGATCTAGACACCTGGCTCTAGCCTCCAAGTA-3'
SEQ ID NO:66: 5'-TGGAGCGAATTCACCTGCCAAGATGATCCTCCTCACTGTGTGTGATCATC-3'
The amplified DNA product comprising a portion of the Oryza sativa RE2 terminator region was cloned into vector pGEM9z to produce plasmid pRE2TERGEM whose sequence is set forth in SEQ ID NO:67.
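The cloning primers used in this Example carry restriction sites that their names allude to. As a quick consistency check, the primers can be scanned for the corresponding recognition sequences. The sketch below is illustrative: reading "EcoBspml" as Eco RI plus BspM I is an interpretation of the primer name, and only the primer strand itself is searched.

```python
# Recognition sequences of the enzymes alluded to in the primer names.
SITES = {
    "Bbs I": "GAAGAC",
    "Bsa I": "GGTCTC",
    "Xba I": "TCTAGA",
    "Eco RI": "GAATTC",
    "BspM I": "ACCTGC",
}

# Primer sequences as set forth in the text.
PRIMERS = {
    "SEQ ID NO:59": "GAAGACCAATGAGCGCTGGCGGCGGCAGCAG",
    "SEQ ID NO:60": "GGTCTCCTCATCTTGAGTGTGGCGGCGGGTGCTC",
    "SEQ ID NO:63": "GAAGACGCTCATTCTTGGAATGAGCCCCCA",
    "SEQ ID NO:65": "GTAAAAGGATCTAGACACCTGGCTCTAGCCTCCAAGTA",
    "SEQ ID NO:66": "TGGAGCGAATTCACCTGCCAAGATGATCCTCCTCACTGTGTGTGATCATC",
}

def sites_in(primer):
    """List the enzymes whose recognition sequence occurs in the primer strand."""
    return [enzyme for enzyme, site in SITES.items() if site in primer]

for name, sequence in PRIMERS.items():
    print(name, sites_in(sequence))
```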
E. Addition of the Fragment Amplified in D Above to the Chimera of C Above
[0278]The maize sequences 3' of the termination signal were replaced with rice sequences as follows.
[0279]Plasmid pRE2TERGEM was digested with Xba I and Eco RI to remove a 758 bp fragment containing only sequences from the termination region of the rice RE2 gene. This 758 bp fragment was cloned into vector pGEM7 that had been digested with Xba I and Eco RI to produce plasmid RE2TERMpGEM7.
[0280]Plasmid RE2TERMpGEM7 was digested with Bsp HI and Eco RI and an approximately 3.7 Kb fragment was recovered. The chimera prepared in C, above, was digested with Bsa I and Eco RI to remove the fragment comprising a portion of the Oryza sativa RE2 promoter region operably linked to the genomic fragment encoding a maize RE2 homolog. These two fragments were ligated to form a plasmid comprising a portion of the rice RE2 promoter operably linked to the genomic fragment encoding a maize RE2 homolog operably linked to a portion of the rice RE2 terminator region.
F. Preparation of a Vector Comprising a Genomic Fragment Encoding a Maize RE2 Homolog Under the Control of the Oryza Sativa RE2 Promoter and Terminator
[0281]A vector comprising the genomic fragment encoding the maize RE2 homolog under the control of the Oryza sativa RE2 promoter and terminator regions was assembled from vector OsRE2pML18 and the chimeric fragment prepared in part E above, as follows.
[0282]Vector OsRE2pML18 and the chimeric fragment prepared in part E above were digested with restriction endonucleases Bst EII and SexAI. Digestion of vector OsRE2pML18 removed the Oryza sativa RE2 coding region and portions of the promoter and terminator regions from vector OsRE2pML18, leaving a 12.1 Kb DNA fragment. Digestion of the fragment prepared in part E above produced a 3.1 Kb fragment comprising a fragment encoding a maize RE2 homolog between portions of the Oryza sativa RE2 promoter and terminator regions. Ligation of the 12.1 and 3.1 Kb fragments produced a vector comprising a fragment encoding a maize RE2 homolog under the control of the Oryza sativa RE2 promoter and terminator. The vector comprising the maize RE2 homolog open reading frame under the control of the Oryza sativa RE2 promoter and terminator regions was named ZmRE2pML18.
Example 8
Genetic Complementation of a Rice Re2 Mutant Plant with an RE2 Homolog from Corn
[0283]Confirmation of the function of the corn RE2 homolog, identified in Example 5 above, was performed using genetic complementation. Rice callus cells derived from rice re2 mutant plants were transformed with vector ZmRE2pML18 prepared as described in Example 7 above. Transformations were performed using a Biolistic PDS-1000/He gun and the particle bombardment technique as in Example 3 above.
[0284]Transformation of re2-1 mutant cells with vector ZmRE2pML18 produced 14 transgenic plants. Thirteen of these fourteen plants produced seeds of which ten plants produced seeds having wild type appearance. Some of the seeds produced by these 10 plants had a wild-type phenotype and some had an re2 mutant phenotype. The ratio of seeds having a wild-type appearance to seeds having an re2 mutant phenotype varied in each plant. Approximately 25% to 70% of the seeds obtained from individual re2-1 mutant plants transformed with vector ZmRE2pML18 had a wild-type appearance. Restoration of a wild-type appearance in seeds from plants regenerated from re2 mutant cells transformed with the vector comprising the fragment encoding the corn RE2 homolog indicates that the corn RE2 homolog, encoded by ZmRE2 (SEQ ID NO:61), is capable of complementing an re2 mutation. These results suggest that the corn RE2 homolog performs the same function in corn as the rice RE2 protein performs in rice.
Example 9
Identification of a cDNA Clone Encoding OsRE2
[0285]A cDNA clone encoding OsRE2 was identified by screening a rice phage cDNA library using an RE2-specific probe.
[0286]The phage cDNA library was prepared from total RNA extracted from developing rice seeds harvested 2-5 days after pollination as follows. Total RNA was extracted using TRIzol® Reagent containing phenol and guanidine thiocyanate (Life Technologies Inc., Rockville, Md.). Poly(A) mRNA was purified from the total RNA using mRNA Purification kits which consist of oligo (dT)-cellulose spin columns (Amersham Pharmacia Biotech Inc., Piscataway, N.J.). cDNA was synthesized using 5.5 μg of poly(A) mRNA and cDNA synthesis kits (Stratagene, La Jolla, Calif.), following the manufacturer's protocol with the exception of using Superscript® reverse transcriptase (Life Technologies Inc.) in the first step instead of Moloney murine leukemia virus reverse transcriptase. The cDNA was size-fractionated using BRL cDNA Size Fraction Columns (GIBCO-BRL). Fractions 1 to 13 were precipitated, resuspended, and ligated with 1 μg Uni-ZAP XR vector following the manufacturer's instructions. After incubation for two days at 4° C. the ligated DNA was packaged using Gigapack III Gold® packaging extract (Stratagene, La Jolla, Calif.). The titer of the resulting library was approximately 7.8×10^5 plaque forming units per mL (pfu/mL). The cDNA phage library was amplified following the manufacturer's instructions and 150 mL of phage cDNA library were obtained. The amplified library had a titer of 5.5×10^8 pfu/mL.
[0287]Screening for the RE2 cDNA was performed following standard protocols well known to those skilled in the art (Ausubel et al. 1993, "Current Protocols in Molecular Biology" John Wiley & Sons, USA, or Sambrook et al. 1989. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press). Briefly, 1.0×10^6 pfu were plated, transferred to nylon membranes, and subjected to hybridization with a radioactively-labeled RE2 second exon probe. The nucleotide sequence of the RE2 second exon probe is shown in SEQ ID NO:71. Following hybridization the membranes were exposed to film, where approximately 1 positive plaque was detected per 100,000 plaques plated. Eight plaques that gave a positive signal were isolated after a second round of screening. Lambda phage DNA was prepared from all 8 plaques, converted into plasmid DNA, and sequenced. Six of the eight clones contained a cDNA sequence encoding OsRE2. One of these six clones, RE2 cDNA C1, had a 5'UTR that extended 196 nucleotides upstream of the ATG start codon predicted from the genomic sequence. The nucleotide sequence of clone RE2 cDNA C1 is shown in SEQ ID NO:72.
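The library figures given in this Example can be related to one another with simple arithmetic. The sketch below reads the titers as 7.8×10^5 and 5.5×10^8 pfu/mL and assumes, which the text does not state, that the plaques screened were plated from the amplified library. The variable names are illustrative.

```python
PRIMARY_TITER = 7.8e5          # pfu/mL, library before amplification
AMPLIFIED_TITER = 5.5e8        # pfu/mL, library after amplification
AMPLIFIED_VOLUME_ML = 150      # mL of amplified library obtained
PFU_PLATED = 1.0e6             # pfu plated for screening
POSITIVE_RATE = 1 / 100_000    # about 1 positive plaque per 100,000 plated

# Total phage in the amplified stock.
total_amplified_pfu = AMPLIFIED_TITER * AMPLIFIED_VOLUME_ML

# Volume of amplified stock holding the plated pfu (assumes plating from that stock).
volume_to_plate_ul = PFU_PLATED / AMPLIFIED_TITER * 1000

# Number of positive plaques expected in the primary screen.
expected_positives = PFU_PLATED * POSITIVE_RATE

print(total_amplified_pfu)
print(round(volume_to_plate_ul, 2))
print(round(expected_positives))
```

An expectation of about ten primary positives is in line with the eight plaques reported as isolated after the second round of screening.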
Sequence CWU
1
Number of SEQ ID NOs: 72
SEQ ID NO:1 (23 nt; DNA; Artificial Sequence; Oligonucleotide primer C10 6-3): tagcagctgg gaagaacaac atg
SEQ ID NO:2 (22 nt; DNA; Artificial Sequence; Oligonucleotide primer C10 6-4): cgtgcaccac gtaacgttaa gc
SEQ ID NO:3 (22 nt; DNA; Artificial Sequence; Oligonucleotide primer C10 15.9-1): cagggttgtg taaggatcgt tg
SEQ ID NO:4 (22 nt; DNA; Artificial Sequence; Oligonucleotide primer C10 15.9-2): gatcatcgtg tagtaccagg ac
SEQ ID NO:5 (20 nt; DNA; Artificial Sequence; Oligonucleotide primer C10-7.7 2 HPYIVF): attgtctcgt gtgacagcgc
SEQ ID NO:6 (20 nt; DNA; Artificial Sequence; Oligonucleotide primer C10-7.7 2 HPYIVR): ccgcaattaa tattccgagc
SEQ ID NO:7 (26 nt; DNA; Artificial Sequence; Oligonucleotide primer 11.5 HpyV): aaagtgtggt aggtgtcatc cagttg
SEQ ID NO:8 (25 nt; DNA; Artificial Sequence; Oligonucleotide primer C10 11.5-9): gccacatgat catccactac caatg
SEQ ID NO:9 (23 nt; DNA; Artificial Sequence; Oligonucleotide primer C10 11-5): ctttttccga cccacatgaa ggt
SEQ ID NO:10 (25 nt; DNA; Artificial Sequence; Oligonucleotide primer 11 HinfR): tacaaacgct cctaaaccac catgt
SEQ ID NO:11 (23 nt; DNA; Artificial Sequence; Oligonucleotide primer 9.6 DraIF): tttgggtgca ttaaagtgga cca
SEQ ID NO:12 (21 nt; DNA; Artificial Sequence; Oligonucleotide primer 9.6 DraIR): ggggtaattc ggatgaccat g
SEQ ID NO:13 (23 nt; DNA; Artificial Sequence; Oligonucleotide primer E08 93KF): ctcatagccg cctagcctca tag
SEQ ID NO:14 (23 nt; DNA; Artificial Sequence; Oligonucleotide primer E08 93KR): gaagcagaga aactccaacc tgg
SEQ ID NO:15 (24 nt; DNA; Artificial Sequence; Oligonucleotide primer E08 46KF): gttcataggt gccaaatttg ggtg
SEQ ID NO:16 (23 nt; DNA; Artificial Sequence; Oligonucleotide primer E08 46KR): cacaagtaac ccaatgccca aac
SEQ ID NO:17 (23 nt; DNA; Artificial Sequence; Oligonucleotide primer K08 21KF): gttcacccat tagtgatgcc tgg
SEQ ID NO:18 (25 nt; DNA; Artificial Sequence; Oligonucleotide primer K08 21KR): gttcactcga taagagcaat cgaac
SEQ ID NO:19 (27 nt; DNA; Artificial Sequence; Oligonucleotide primer K08 46KF): gttatgttgc acacctccag tagttac
SEQ ID NO:20 (25 nt; DNA; Artificial Sequence; Oligonucleotide primer K08 46KR): gtcaagcctg ctgttaccct ttaag
SEQ ID NO:21 (25 nt; DNA; Artificial Sequence; Oligonucleotide primer LOB-82F): gtcaagcctg ctgttaccct ttaag
SEQ ID NO:22 (23 nt; DNA; Artificial Sequence; Oligonucleotide primer LOB R1): ccaccatgac gaacatctaa atg
SEQ ID NO:23 (25 nt; DNA; Artificial Sequence; Oligonucleotide primer LOB F2): gtatagctcc caaccatttc tcctc
SEQ ID NO:24 (23 nt; DNA; Artificial Sequence; Oligonucleotide primer LOB R2): ccaacatcac catcatcgtc ttc
SEQ ID NO:25 (810 nt; DNA; Oryza sativa):
atgagctcgt cggtggttgt gagcgcgagc ggcagcggca
gcggcggcgg aggaggagga 60ggaggtggcg gcgccggagg tggaggagga ggtgggccgt
gcggggcgtg caagttcttg 120cggcggaagt gcgtgcaggg gtgcatcttc gcgccctact
tcgactcgga ggccggggcg 180gcgcacttcg cggcggtgca caaggtgttc ggcgccagca
acgtgtccaa gctgctgcag 240cagatcccgg cgcaccgccg cctcgacgcc gtcgtcacca
tctgctacga ggcccaggcc 300cgcctccgcg accccgtcta cggctgcgtc gcccacatct
tccacctcca acaccaggtg 360gcaggtctcc agtccgagct gaactacctg caaggtcacc
tctcgacgat ggagctgccg 420tcgccgccgc cctacgtcgc cgggccgacc ctggcgccgc
cacagccaca gccactgatg 480ccgatgaccg ccgccgccaa cttcaacttc tccgacctgc
catcgtcgtc ggcggccaac 540attccggtca ccgccgacct gtccaccctc tttgacccac
tgccggcggc gcagccgcag 600tggggactat accagcagca gcaacaccac caccagcagc
tgcatcatca cccctatgac 660cggatgggcg acggctcgtc gagcagcaga ggcggcgacg
acgatggcag cgacggcggc 720gacttgcaag cgctggcgag ggagcttctt gaccgccatg
gacggtcgtc gtcgagctcc 780aagctggagc cgccacctca cacacagtga
81026269PRTOryza sativa 26Met Ser Ser Ser Val Val
Val Ser Ala Ser Gly Ser Gly Ser Gly Gly1 5
10 15Gly Gly Gly Gly Gly Gly Gly Gly Ala Gly Gly Gly
Gly Gly Gly Gly 20 25 30Pro
Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys Val Gln Gly Cys 35
40 45Ile Phe Ala Pro Tyr Phe Asp Ser Glu
Ala Gly Ala Ala His Phe Ala 50 55
60Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Gln65
70 75 80Gln Ile Pro Ala His
Arg Arg Leu Asp Ala Val Val Thr Ile Cys Tyr 85
90 95Glu Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr
Gly Cys Val Ala His 100 105
110Ile Phe His Leu Gln His Gln Val Ala Gly Leu Gln Ser Glu Leu Asn
115 120 125Tyr Leu Gln Gly His Leu Ser
Thr Met Glu Leu Pro Ser Pro Pro Pro 130 135
140Tyr Val Ala Gly Pro Thr Leu Ala Pro Pro Gln Pro Gln Pro Leu
Met145 150 155 160Pro Met
Thr Ala Ala Ala Asn Phe Asn Phe Ser Asp Leu Pro Ser Ser
165 170 175Ser Ala Ala Asn Ile Pro Val
Thr Ala Asp Leu Ser Thr Leu Phe Asp 180 185
190Pro Leu Pro Ala Ala Gln Pro Gln Trp Gly Leu Tyr Gln Gln
Gln Gln 195 200 205His His His Gln
Gln Leu His His His Pro Tyr Asp Arg Met Gly Asp 210
215 220Gly Ser Ser Ser Ser Arg Gly Gly Asp Asp Asp Gly
Ser Asp Gly Gly225 230 235
240Asp Leu Gln Ala Leu Ala Arg Glu Leu Leu Asp Arg His Gly Arg Ser
245 250 255Ser Ser Ser Ser Lys
Leu Glu Pro Pro Pro His Thr Gln 260
26527269PRTOryza sativaMISC_FEATURENCBI General Identification No.
18652509 27Met Ser Ser Ser Val Val Val Ser Ala Ser Gly Ser Gly Ser Gly
Gly1 5 10 15Gly Gly Gly
Gly Gly Gly Gly Gly Ala Gly Gly Gly Gly Gly Gly Gly 20
25 30Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg
Lys Cys Val Gln Gly Cys 35 40
45Ile Phe Ala Pro Tyr Phe Asp Ser Glu Ala Gly Ala Ala His Phe Ala 50
55 60Ala Val His Lys Val Phe Gly Ala Ser
Asn Val Ser Lys Leu Leu Gln65 70 75
80Gln Ile Pro Ala His Arg Arg Leu Asp Ala Val Val Thr Ile
Cys Tyr 85 90 95Glu Ala
Gln Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys Val Ala His 100
105 110Ile Phe His Leu Gln His Gln Val Ala
Gly Leu Gln Ser Glu Leu Asn 115 120
125Tyr Leu Gln Gly His Leu Ser Thr Met Glu Leu Pro Ser Pro Pro Pro
130 135 140Tyr Val Ala Gly Pro Thr Leu
Ala Pro Pro Gln Pro Gln Pro Leu Met145 150
155 160Pro Met Thr Ala Ala Ala Asn Phe Asn Phe Ser Asp
Leu Pro Ser Ser 165 170
175Ser Ala Ala Asn Ile Pro Val Thr Ala Asp Leu Ser Thr Leu Phe Asp
180 185 190Pro Leu Pro Ala Ala Gln
Pro Gln Trp Gly Leu Tyr Gln Gln Gln Gln 195 200
205His His His Gln Gln Leu His His His Pro Tyr Asp Arg Met
Gly Asp 210 215 220Gly Ser Ser Ser Ser
Arg Gly Gly Asp Asp Asp Gly Ser Asp Gly Gly225 230
235 240Asp Leu Gln Ala Leu Ala Arg Glu Leu Leu
Asp Arg His Gly Arg Ser 245 250
255Ser Ser Ser Ser Lys Leu Glu Pro Pro Pro His Thr Gln
260 26528810DNAOryza sativa 28atgagctcgt cggtggttgt
gagcgcgagc ggcagcggca gcggcggcgg aggaggagga 60ggaggtggcg gcgccggagg
tggaggagga ggtgggccgt gcggggcgtg caagttcttg 120cggcggaagt gcgtgcaggg
gtgcatcttc gcgccctact tcgactcgga ggccggggcg 180gcgcacttcg cggcggtgca
caaggtgttc ggcgccagca acgtgtccaa gctgctgcag 240cagatcccgg cgcaccgccg
cctcgacgcc gtcgtcatca tctgctacga ggcccaggcc 300cgcctccgcg accccgtcta
cggctgcgtc gcccacatct tccacctcca acaccaggtg 360gcaggtctcc agtccgagct
gaactacctg caaggtcacc tctcgacgat ggagctgccg 420tcgccgccgc cctacgtcgc
cgggccgacc ctggcgccgc cacagccaca gccactgatg 480ccgatgaccg ccgccgccaa
cttcaacttc tccgacctgc catcgtcgtc ggcggccaac 540attccggtca ccgccgacct
gtccaccctc tttgacccac tgccggcggc gcagccgcag 600tggggactat accagcagca
gcaacaccac caccagcagc tgcatcatca cccctatgac 660cggatgggcg acggctcgtc
gagcagcaga ggcggcgacg acgatggcag cgacggcggc 720gacttgcaag cgctggcgag
ggagcttctt gaccgccatg gacggtcgtc gtcgagctcc 780aagctggagc cgccacctca
cacacagtga 81029269PRTOryza sativa
29Met Ser Ser Ser Val Val Val Ser Ala Ser Gly Ser Gly Ser Gly Gly1
5 10 15Gly Gly Gly Gly Gly Gly
Gly Gly Ala Gly Gly Gly Gly Gly Gly Gly 20 25
30Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys Val
Gln Gly Cys 35 40 45Ile Phe Ala
Pro Tyr Phe Asp Ser Glu Ala Gly Ala Ala His Phe Ala 50
55 60Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser
Lys Leu Leu Gln65 70 75
80Gln Ile Pro Ala His Arg Arg Leu Asp Ala Val Val Ile Ile Cys Tyr
85 90 95Glu Ala Gln Ala Arg Leu
Arg Asp Pro Val Tyr Gly Cys Val Ala His 100
105 110Ile Phe His Leu Gln His Gln Val Ala Gly Leu Gln
Ser Glu Leu Asn 115 120 125Tyr Leu
Gln Gly His Leu Ser Thr Met Glu Leu Pro Ser Pro Pro Pro 130
135 140Tyr Val Ala Gly Pro Thr Leu Ala Pro Pro Gln
Pro Gln Pro Leu Met145 150 155
160Pro Met Thr Ala Ala Ala Asn Phe Asn Phe Ser Asp Leu Pro Ser Ser
165 170 175Ser Ala Ala Asn
Ile Pro Val Thr Ala Asp Leu Ser Thr Leu Phe Asp 180
185 190Pro Leu Pro Ala Ala Gln Pro Gln Trp Gly Leu
Tyr Gln Gln Gln Gln 195 200 205His
His His Gln Gln Leu His His His Pro Tyr Asp Arg Met Gly Asp 210
215 220Gly Ser Ser Ser Ser Arg Gly Gly Asp Asp
Asp Gly Ser Asp Gly Gly225 230 235
240Asp Leu Gln Ala Leu Ala Arg Glu Leu Leu Asp Arg His Gly Arg
Ser 245 250 255Ser Ser Ser
Ser Lys Leu Glu Pro Pro Pro His Thr Gln 260
26530810DNAOryza sativa 30atgagctcgt cggtggttgt gagcgcgagc ggcagcggca
gcggcggcgg aggaggagga 60ggaggtggcg gcgccggagg tggaggagga ggtgggccgt
gcggggcgtt caagttcttg 120cggcggaagt gcgtgcaggg gtgcatcttc gcgccctact
tcgactcgga ggccggggcg 180gcgcacttcg cggcggtgca caaggtgttc ggcgccagca
acgtgtccaa gctgctgcag 240cagatcccgg cgcaccgccg cctcgacgcc gtcgtcacca
tctgctacga ggcccaggcc 300cgcctccgcg accccgtcta cggctgcgtc gcccacatct
tccacctcca acaccaggtg 360gcaggtctcc agtccgagct gaactacctg caaggtcacc
tctcgacgat ggagctgccg 420tcgccgccgc cctacgtcgc cgggccgacc ctggcgccgc
cacagccaca gccactgatg 480ccgatgaccg ccgccgccaa cttcaacttc tccgacctgc
catcgtcgtc ggcggccaac 540attccggtca ccgccgacct gtccaccctc tttgacccac
tgccggcggc gcagccgcag 600tggggactat accagcagca gcaacaccac caccagcagc
tgcatcatca cccctatgac 660cggatgggcg acggctcgtc gagcagcaga ggcggcgacg
acgatggcag cgacggcggc 720gacttgcaag cgctggcgag ggagcttctt gaccgccatg
gacggtcgtc gtcgagctcc 780aagctggagc cgccacctca cacacagtga
81031269PRTOryza sativa 31Met Ser Ser Ser Val Val
Val Ser Ala Ser Gly Ser Gly Ser Gly Gly1 5
10 15Gly Gly Gly Gly Gly Gly Gly Gly Ala Gly Gly Gly
Gly Gly Gly Gly 20 25 30Pro
Cys Gly Ala Phe Lys Phe Leu Arg Arg Lys Cys Val Gln Gly Cys 35
40 45Ile Phe Ala Pro Tyr Phe Asp Ser Glu
Ala Gly Ala Ala His Phe Ala 50 55
60Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Gln65
70 75 80Gln Ile Pro Ala His
Arg Arg Leu Asp Ala Val Val Thr Ile Cys Tyr 85
90 95Glu Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr
Gly Cys Val Ala His 100 105
110Ile Phe His Leu Gln His Gln Val Ala Gly Leu Gln Ser Glu Leu Asn
115 120 125Tyr Leu Gln Gly His Leu Ser
Thr Met Glu Leu Pro Ser Pro Pro Pro 130 135
140Tyr Val Ala Gly Pro Thr Leu Ala Pro Pro Gln Pro Gln Pro Leu
Met145 150 155 160Pro Met
Thr Ala Ala Ala Asn Phe Asn Phe Ser Asp Leu Pro Ser Ser
165 170 175Ser Ala Ala Asn Ile Pro Val
Thr Ala Asp Leu Ser Thr Leu Phe Asp 180 185
190Pro Leu Pro Ala Ala Gln Pro Gln Trp Gly Leu Tyr Gln Gln
Gln Gln 195 200 205His His His Gln
Gln Leu His His His Pro Tyr Asp Arg Met Gly Asp 210
215 220Gly Ser Ser Ser Ser Arg Gly Gly Asp Asp Asp Gly
Ser Asp Gly Gly225 230 235
240Asp Leu Gln Ala Leu Ala Arg Glu Leu Leu Asp Arg His Gly Arg Ser
245 250 255Ser Ser Ser Ser Lys
Leu Glu Pro Pro Pro His Thr Gln 260
26532809DNAOryza sativa 32atgagctcgt cggtggttgt gagcgcgagc ggcagcggca
gcggcggcgg aggaggagga 60ggaggtggcg gcgcggaggt ggaggaggag gtgggccgtg
cggggcgtgc aagttcttgc 120ggcggaagtg cgtgcagggg tgcatcttcg cgccctactt
cgactcggag gccggggcgg 180cgcacttcgc ggcggtgcac aaggtgttcg gcgccagcaa
cgtgtccaag ctgctgcagc 240agatcccggc gcaccgccgc ctcgacgccg tcgtcaccat
ctgctacgag gcccaggccc 300gcctccgcga ccccgtctac ggctgcgtcg cccacatctt
ccacctccaa caccaggtgg 360caggtctcca gtccgagctg aactacctgc aaggtcacct
ctcgacgatg gagctgccgt 420cgccgccgcc ctacgtcgcc gggccgaccc tggcgccgcc
acagccacag ccactgatgc 480cgatgaccgc cgccgccaac ttcaacttct ccgacctgcc
atcgtcgtcg gcggccaaca 540ttccggtcac cgccgacctg tccaccctct ttgacccact
gccggcggcg cagccgcagt 600ggggactata ccagcagcag caacaccacc accagcagct
gcatcatcac ccctatgacc 660ggatgggcga cggctcgtcg agcagcagag gcggcgacga
cgatggcagc gacggcggcg 720acttgcaagc gctggcgagg gagcttcttg accgccatgg
acggtcgtcg tcgagctcca 780agctggagcc gccacctcac acacagtga
80933126PRTOryza sativa 33Met Ser Ser Ser Val Val
Val Ser Ala Ser Gly Ser Gly Ser Gly Gly1 5
10 15Gly Gly Gly Gly Gly Gly Gly Gly Ala Glu Val Glu
Glu Glu Val Gly 20 25 30Arg
Ala Gly Arg Ala Ser Ser Cys Gly Gly Ser Ala Cys Arg Gly Ala 35
40 45Ser Ser Arg Pro Thr Ser Thr Arg Arg
Pro Gly Arg Arg Thr Ser Arg 50 55
60Arg Cys Thr Arg Cys Ser Ala Pro Ala Thr Cys Pro Ser Cys Cys Ser65
70 75 80Arg Ser Arg Arg Thr
Ala Ala Ser Thr Pro Ser Ser Pro Ser Ala Thr 85
90 95Arg Pro Arg Pro Ala Ser Ala Thr Pro Ser Thr
Ala Ala Ser Pro Thr 100 105
110Ser Ser Thr Ser Asn Thr Arg Trp Gln Val Ser Ser Pro Ser 115
120 125349203DNAOryza sativa 34ggatccatcc
aacagtttct cctaaatatc agaataaagt tgaagtaact gctttgctgc 60cgtccaagat
atattgcaaa ggacaaaagg ttcaggagca atgcaagaca aaaaaatgtg 120atctcaactg
tatgtacatc catatatatg cctggagttc acttgacctg taaagtagta 180gtaccaaatt
ctgttgctga ccaattcatt ttaattatct taattccttg cataaaagaa 240taaataattc
agcagatgct tgctaaggaa ttaatgtgta atatatataa gcacaactaa 300taaagcaatg
gatactttca accaaaaaaa gtggtttaat ttgtctatag tagttctgtg 360gaaatggaga
acttaagaaa ggacaaaaag gaaataacca cttttggatc tatttgatgc 420atggtatttc
ttcaactcca ggggtatttt tgatatatgt atatatttag ggtataagct 480aaatatgtac
gcatattctc catatgaaag agtgcagtac tatttagcag cattctccat 540atggatggtt
agtactgaca ttgaagaatt ttgtagctag gactccatgt ttttttttat 600cagtgataca
gttgtacgtt gtcaattatt gatcatgaat tccagtttga tgtgacaatt 660aatttgatga
ttagtataac tagaaattaa cgatggatca atggacacct ggcccataat 720taattaatta
aattgtgata agattgtgtg tttgagtcac aaaaaacttt aagtggtgaa 780tttgagaggt
gtggtcaagc atgcaagttt cttacctagc cagggtgccg tcttttggct 840tccacgcatc
catctataga tctcagatgc acattatatt ctctgtgtgt atgggagaga 900gagagagaga
gaaagaaggg tttgcttggt acacactctc ctgatcaatc aggccatact 960gtacagtgat
cacacagtcc atgcatgcca gctctagtgc tgcatctacc tagaagctag 1020catatgcttt
ggttcaaact gtgcacacat cattcacaca tatttacatg ttactatatc 1080ttacccaagg
aagaggtact ctttgctgta ataacacatg tgattatgga aaaactgata 1140aaatcattgt
ccatacatat atttatgtat gacatgtttg aaaattgggt ttgagaagta 1200ttatctcact
ctttaaagga taagttttaa cctccaccgc accccatatt atcccgaccc 1260tgcatctatt
tttatttcac aatcacatcc tttgtaaccc attatcttga tatctcatac 1320atattataca
tgtattatat gtatatgaga aaatctttca tataaaaatt aaactactgc 1380atgattatat
atacgtgtta cgcgtatatg atcttggtca aatgtaacct caaggaaaca 1440taaagttttg
taagtcgtaa ctgcaggcgg tgaagtgtct agagctgatg gctggcggtg 1500attcccatgc
atgtctgggc atgcatggat cgatccatcg atgcctcgaa attcacgcaa 1560gctctgcaac
ggtttccgga tacagatggt cattgtcgtg tcctttttat ttttctcttt 1620catttgcttt
aattttcttc tctctgtttg gcttaacatg tgtggtacgt acactttgta 1680acggatgtga
gatgagcaat gcagtaagct taaggtagct agcgcgttgc agaatgcaga 1740tcagagccac
acatttactt tacttctcac ctgatcgatc gatcactgaa tgaagagagt 1800ccaaagctag
gcagcagttc ataacatgca tacgttgaca acgtacggac gcagctggca 1860gctagctata
tatttaatta agccttcata ctcaaagaat aactttttgg agctcttgaa 1920tttctatcct
tgcgttagct agatagatac gtcgaaaaaa ataactgcac ttttttagtg 1980atacaatcca
aagccagcaa aaaataataa attatatacg ctatttatga tggtaatata 2040ttactgatac
ataatccagc ccattttgct ctccatctaa ctttagatgt tcatatcaac 2100cacttcggtt
atattgcgga aattttgatt gaatgtatat atgtggcatt atagattata 2160tctatgtctg
aaaaatcata tcaccacata ttggttataa tgtgcgaaaa tatagaacaa 2220aaaactgata
ttgtcgatta gaggatgcca cctacaagct atagttttac atatattatt 2280ttatgctgtc
tactaaaaga acaataaaac catttactta caccactgca ttcaaacagt 2340aaattggaga
agttggcttt ctaccttgac actactagtc cttgctaaga taaaagtaaa 2400acaacattac
catcttatat caaatctact aattaaacca ctccatatta gatgaaatcc 2460atgttaaaga
gtctatatct atgcagtcgc tctcatgata tgtcattata tcttgatcta 2520tctatgttaa
tttagaagtt tacacccaca atcgctctaa ttttatagga ccatcgatga 2580tatataatat
ttttttcatc aggaatgaaa tagattacgt acacagttac attacgactc 2640atgacactag
aactatatct atgtttagaa gtttatctag atatggcatg attaatagaa 2700tgtatttgtg
ttagagctct aagtttagaa tatgtgaccg ataaacctac cgttttattc 2760tttttaacta
catgttttgc aaaagattaa attgttatct tacaattcat atagcactag 2820cattatgatc
tggtgtatca tatatgtcat tatccatcta tgtttagaag tttatatcca 2880cggctctaat
tatgtggaac gattaaatga tgatatatat agttgttaag aggtatggaa 2940tagattaaat
aattagttac gttacgattc gtaacactag tgctatctat atttagaagt 3000ttacatccac
aatcgctcta attatgtggg attattaaac gatgaatata tttttctgtg 3060aggaatgaat
tgaaatagat taaatagtta cgttacgatt cgtaacatta gttctatcta 3120tgtttagaag
tttatctgga caatcaccat catgtcagtg tgccaatacg ctaaacttag 3180aagatgcctc
cgataatcgt agcattagta ttatttgggg aatgaattaa aaaatataaa 3240taatgatata
ttacaattga taatctatgt ttagaaactt ttgtcggtta ctcgctcaaa 3300ttgtatgggg
taataaatcg gtgaagtata cttttatact gaatggacaa gataagctac 3360cattgatagc
attagcggtt ctatttggta tattatcccg attatccacc ctcaatttgt 3420gctaaaataa
gatttttaca tcatcctagt caatatttgg ggttaccctg tctgcattat 3480aatttatttt
tgtgcttaac tataatatat acatacacta taatttatct aaataaaagt 3540tctggtatga
ttaaaaaaac taacaatttt gtgtgtggcg tattgagtgg aagaatgtca 3600tgttaggatc
acatgggaga gagtgcatgc gacgagatca tccttgttgg tctgtgcagg 3660tggtgtgaaa
tgtgatcaat atatatggtg gtgacagaga gagaaactaa cccaaaaaaa 3720caaaaaaaga
gagatgagag cgaatggatg gatgcaattg gcattaattt tcggtctttg 3780ctgttctccc
ccagccaggc cagtttgctt cacgcaatat tctaaccctt tgagaaagag 3840aagtgtactt
gttgccaagg ccaattgcaa gcatttgcct tggctttaaa gtctcatcaa 3900tacaacggca
ccaaaaagaa aacacagaga tagaaaacca cctagtagct gatatacatt 3960tatatatgac
ctaaataaaa aaattccatt aatatgtata attccagcaa caacataaag 4020aaataaaaat
gcatttaaga aaacatagaa agaaataaaa ataaagtaaa taaagctagc 4080taggcccaaa
attggcagta attaagtagg gactagtata gaaatatatg gatatataca 4140ccagcctcca
ccaatgggat tgcaaacagc ctacttatca ctttgctgct gtatttacgc 4200ttttgccctt
cttccctcct atatgtacag ccgcccccac ctcattcctc cattcttact 4260ccacacacac
actctctctc tctctaccat ttgtgagaaa gaaaatcgat tcagttctag 4320agagagaaac
aaacaatttt cgctgtctat ctctctcttg ctactagtcg gtcgatcttg 4380agttagtttt
aaccctacac aagccaaggt aacaacatct agcaggtagg agaagagagc 4440tagagactag
gtggtggggg ctcattccaa gaatgagctc gtcggtggtt gtgagcgcga 4500gcggcagcgg
cagcggcggc ggaggaggag gaggaggtgg cggcgccgga ggtggaggag 4560gaggtgggcc
gtgcggggcg tgcaagttct tgcggcggaa gtgcgtgcag gggtgcatct 4620tcgcgcccta
cttcgactcg gaggccgggg cggcgcactt cgcggcggtg cacaaggtgt 4680tcggcgccag
caacgtgtcc aagctgctgc agcagatccc ggcgcaccgc cgcctcgacg 4740ccgtcgtcac
catctgctac gaggcccagg cccgcctccg cgaccccgtc tacggctgcg 4800tcgcccacat
cttccacctc caacaccagg tatatactac tcatactcac tcgatctcct 4860cctcctcatc
gtcgccgtcg gtggcggcga gtcatttaga tgttcgtcat ggtggttgtg 4920cgatcgatcg
agcttctatt ttggttttgg ttttggtttt ggtttcttgg gtttgatttg 4980gttggttttt
ggaggaagga tggatgtctt tttcttgaag aaggcaaagg agtccttttt 5040tggggaggag
agaaggctag caagctaagc aagggagtta atctggagaa atggacttct 5100ctctttctgt
tactactcac tactactcag gcctaccagt gatgatgtgc acatctcatc 5160atccatctca
tcattaaatc ccatcatcta ctctctctct tgttcttgct ttctcttctt 5220tcattctttc
tctgaatctt ctgatagata gattgataga tagatgcatg atgatatccc 5280catttatcac
atcattttat atcatgcatc aggttgttgt cccccccccc cctctctctc 5340tcttgctctg
aaatcaagga gggtatgcat acatgcttgg atttcacacc cacaaaagaa 5400aaatggtaat
ttagcaagcc ctagctagga attaggatgc atcaatctct agtagttctt 5460gaagctgcag
ctagtatagc tcccaaccat ttctcctctt ccttttcttt actaatatga 5520tcagcatttc
attaagattt ttttgtatat agtatagcta cctacatttt ctcttgatct 5580gattatgcca
agtactaatt ttctgtccat tttactgatg atgatctggt tcaattcccc 5640atgtgtatat
gtactctcag gtggcaggtc tccagtccga gctgaactac ctgcaaggtc 5700acctctcgac
gatggagctg ccgtcgccgc cgccctacgt cgccgggccg accctggcgc 5760cgccacagcc
acagccactg atgccgatga ccgccgccgc caacttcaac ttctccgacc 5820tgccatcgtc
gtcggcggcc aacattccgg tcaccgccga cctgtccacc ctctttgacc 5880cactgccggc
ggcgcagccg cagtggggac tataccagca gcagcaacac caccaccagc 5940agctgcatca
tcacccctat gaccggatgg gcgacggctc gtcgagcagc agaggcggcg 6000acgacgatgg
cagcgacggc ggcgacttgc aagcgctggc gagggagctt cttgaccgcc 6060atggacggtc
gtcgtcgagc tccaagctgg agccgccacc tcacacacag tgatcctcct 6120cactgtgtgt
gatcatcaat tcagcttagc tagctagctc atggactaat tgatcaggtg 6180ttaatcattc
atgaatgcat tggttgaggc aagaagagaa tttaatccca atggtgaaat 6240ttttttcacc
aaatcctcca tgtcgttgag gcgaaaaatc gaacgacgac gacgacgatg 6300gcgaggaaga
cgatgatggt gatgttgggg atggagatgg taggtaacag gcattgcccg 6360gttttcgcgt
atcatctttg ttcttgggct agggtgcaag gggtgcccac ttgcaccatt 6420ttataatgct
tgggagtttg ctccaaaaga ggaagcttgg ggatgagttc ttgttagctt 6480agctgtagcc
ctgatcactg ttccattgca acagttctaa ttgcaaaaaa caaaacctgg 6540tctaatttag
ttcaatatac aaaaaaaaaa tctttgtctc atcgcagatt aattacggtt 6600gtgtttgaag
tttgtttgat ttctgtttca aggttctaac tgaacatctg aagtgaagtg 6660tagtcagtct
taatttggga ctttctgatc tctctatgac aaatgagctt tttttttttg 6720ccataatata
tacaacagct agcaagtagc aaaatgagca tttttggggt taatggtaac 6780tgaacatata
tgtatggtgg caactgaata aagtgtgaac atatgtgagt acttggaggc 6840tagagccagg
tgttggtttc cttttacttg cttcgtgctt ctacaactac aataatgcaa 6900gtattcatat
ggtgcaacct tagtcttaga ttcagtctgg ctagctagct agctatttct 6960aatgggacac
agtacattta aaacaagcct aattaaacta gtttatttct ctatgaagag 7020tcgtggtata
tctggggcta aatgattggc aggggattat attttagagt ttgatatata 7080gatgagtgag
agacagacag gaagcatagc tttggtggga catctttgac taagaccatg 7140cagcatgcac
acaacaatgt tttctctcct tatgtttctt gaagttatat catatgccct 7200tgtattcagg
gactcctttg ttatcaattg ttggaaaatg acaagcggtt gggatatgaa 7260taatatgatg
ccataggaaa gtacatgttt cagtttagct agctctttaa tgtgtccaaa 7320ccgcattgaa
aagtttcatg attactacta gtccatgtag gtaactaatt attaccgtaa 7380tgaacatgca
tatgcatatg agttaatttt ggcatgtact ctaagctaat ttaagatgat 7440gtttctgtgg
cgaccggcca tgcgtgcaaa tacatggtac tatatatgca taataatagg 7500gatactgcta
ctagttaagt aagttaatta tgtgtctcta cattactctt tgattcgtta 7560attaattaga
caggtctttt ttttcttaag gaagatcgtc actaccgtaa attatcaaag 7620cagggtaaat
ttgacatgta actatggtaa gttagtaact attataacta gctggtacct 7680agcatcaata
atattccttg taacaatatt atttctgcaa cttttgcaca agtaagaata 7740taaacattaa
taggaaacaa gtattatttg tacaaactaa agaatgtaat aattagttgg 7800atcattagta
atgcatttaa ttagctttct tagaaataag agcattaatc aacagttgtc 7860atatatgcaa
tgaatcgtgt cattaatgtg tacattttgc gtgcaaggat cgcaaaagtt 7920ccatatagct
gttactattc tttgcaaatt tatcgtgcgt gatatcaaat gtattggatt 7980ctcctactat
tagaattgat acataagcga caccatcaca catgagaacg tcttttctta 8040atatatataa
gcacaagata ttagctagat gccaaattaa atacttgatt tccagcactt 8100catagatatt
agttaccttt ctcaagtttg gtgtgaaaaa aatgcatatc tatatatatt 8160tgacagtatt
ttagtagaat aatgtgagta gctagctgga aaagaatata tttctgcatg 8220ctgcaaatat
atggtaccaa ctgttcagtg ctaccctgaa ctgaagtaaa tgtcttatta 8280atgaagagcc
atcttatgtt gttttagaaa ttctatatgt agtgttggca ttgacttgat 8340ttacacacta
tagtgttata tatgcttgta gctgcacaaa agtagctttt agttgctggc 8400atgtcaattt
accaaacaga aaaagagacg ttcaatttgt ggatgaaaaa aaaaggaata 8460ttttttaaag
acgcaagttt aacacaattc aaaatgaagc ctttcgatgt tgagtttaag 8520taatctttaa
tttaaatata aaagaataat ggtagtcagg gttaacactg cacaaaaaag 8580tgaccttggt
cctggtccta atgaagcggt ttccttaaaa cctgctagta accacctcac 8640agttctcaat
gctatgaaga aattaaaaag tcggttctaa atcaattaag gtaaatcatg 8700gaagaatata
gatgtcttac gaaagtaatc tgccccaata agcaatagta cagtggtcag 8760tggagatcag
agaaataatt ttgtgatatg gagtttaaac attggggtag cggaattaaa 8820ctactatttg
gtgtgaaatt atattaagta ggtcagaagc atccatacat agtacttctt 8880ctgtcccaaa
atataaagag tttcggttga atgggggcat atctaagtat tgcgagtttg 8940gacaggctac
atcccgcatg gaaaaaaatt gttgtttttc ataccttttt gacaggcggg 9000tgagccaacc
acgtgaaaaa agtttttttt tcgtagtgta tccagtagac tgacatgtta 9060gcccaatggc
gaaatcggcc ctagagcaaa taacgttgga ggcaatataa tggatcaaac 9120agagatggtg
tagcagctag ccgtgtgggg gccaagggct tttggaggcg gctattgcaa 9180tttcggtttg
aaatttcgga tcc
9203356157DNAArtificial sequenceVector pML18 35gaatatgcat cactagtaag
ctttgctcta gactggaatt cgtcgactct agaggatcca 60attccaatcc cacaaaaatc
tgagcttaac agcacagttg ctcctctcag agcagaatcg 120ggtattcaac accctcatat
caactactac gttgtgtata acggtccaca tgccggtata 180tacgatgact ggggttgtac
aaaggcggca acaaacggcg ttcccggagt tgcacacaag 240aaatttgcca ctattacaga
ggcaagagca gcagctgacg cgtacacaac aagtcagcaa 300acagacaggt tgaacttcat
ccccaaagga gaagctcaac tcaagcccaa gagctttgct 360aaggccctaa caagcccacc
aaagcaaaaa gcccactggc tcacgctagg aaccaaaagg 420cccagcagtg atccagcccc
aaaagagatc tcctttgccc cggagattac aatggacgat 480ttcctctatc tttacgatct
aggaaggaag ttcgaaggtg aaggtgacga cactatgttc 540accactgata atgagaaggt
tagcctcttc aatttcagaa agaatgctga cccacagatg 600gttagagagg cctacgcagc
aggtctcatc aagacgatct acccgagtaa caatctccag 660gagatcaaat accttcccaa
gaaggttaaa gatgcagtca aaagattcag gactaattgc 720atcaagaaca cagagaaaga
catatttctc aagatcagaa gtactattcc agtatggacg 780attcaaggct tgcttcataa
accaaggcaa gtaatagaga ttggagtctc taaaaaggta 840gttcctactg aatctaaggc
catgcatgga gtctaagatt caaatcgagg atctaacaga 900actcgccgtg aagactggcg
aacagttcat acagagtctt ttacgactca atgacaagaa 960gaaaatcttc gtcaacatgg
tggagcacga cactctggtc tactccaaaa atgtcaaaga 1020tacagtctca gaagaccaaa
gggctattga gacttttcaa caaaggataa tttcgggaaa 1080cctcctcgga ttccattgcc
cagctatctg tcacttcatc gaaaggacag tagaaaagga 1140aggtggctcc tacaaatgcc
atcattgcga taaaggaaag gctatcattc aagatgcctc 1200tgccgacagt ggtcccaaag
atggaccccc acccacgagg agcatcgtgg aaaaagaaga 1260cgttccaacc acgtcttcaa
agcaagtgga ttgatgtgac atctccactg acgtaaggga 1320tgacgcacaa tcccactatc
cttcgcaaga cccttcctct atataaggaa gttcatttca 1380tttggagagg acacgctcga
gctcatttct ctattacttc agccataaca aaagaactct 1440tttctcttct tattaaacca
tgaaaaagcc tgaactcacc gcgacgtctg tcgagaagtt 1500tctgatcgaa aagttcgaca
gcgtctccga cctgatgcag ctctcggagg gcgaagaatc 1560tcgtgctttc agcttcgatg
taggagggcg tggatatgtc ctgcgggtaa atagctgcgc 1620cgatggtttc tacaaagatc
gttatgttta tcggcacttt gcatcggccg cgctcccgat 1680tccggaagtg cttgacattg
gggaattcag cgagagcctg acctattgca tctcccgccg 1740tgcacagggt gtcacgttgc
aagacctgcc tgaaaccgaa ctgcccgctg ttctgcagcc 1800ggtcgcggag gccatggatg
cgatcgctgc ggccgatctt agccagacga gcgggttcgg 1860cccattcgga ccgcaaggaa
tcggtcaata cactacatgg cgtgatttca tatgcgcgat 1920tgctgatccc catgtgtatc
actggcaaac tgtgatggac gacaccgtca gtgcgtccgt 1980cgcgcaggct ctcgatgagc
tgatgctttg ggccgaggac tgccccgaag tccggcacct 2040cgtgcacgcg gatttcggct
ccaacaatgt cctgacggac aatggccgca taacagcggt 2100cattgactgg agcgaggcga
tgttcgggga ttcccaatac gaggtcgcca acatcttctt 2160ctggaggccg tggttggctt
gtatggagca gcagacgcgc tacttcgagc ggaggcatcc 2220ggagcttgca ggatcgccgc
ggctccgggc gtatatgctc cgcattggtc ttgaccaact 2280ctatcagagc ttggttgacg
gcaatttcga tgatgcagct tgggcgcagg gtcgatgcga 2340cgcaatcgtc cgatccggag
ccgggactgt cgggcgtaca caaatcgccc gcagaagcgc 2400ggccgtctgg accgatggct
gtgtagaagt actcgccgat agtggaaacc gacgccccag 2460cactcgtccg agggcaaagg
aatagtgagg tacctaatag tgagatccaa cacttacgtt 2520tgcaacgtcc aagagcaaat
agaccacgna cgccggaagg ttgccgcagc gtgtggattg 2580cgtctcaatt ctctcttgca
ggaatgcaat gatgaatatg atactgacta tgaaactttg 2640agggaatact gcctagcacc
gtcacctcat aacgtgcatc atgcatgccc tgacaacatg 2700gaacatcgct atttttctga
agaattatgc tcgttggagg atgtcgcggc aattgcagct 2760attgccaaca tcgaactacc
cctcacgcat gcattcatca atattattca tgcggggaaa 2820ggcaagatta atccaactgg
caaatcatcc agcgtgattg gtaacttcag ttccagcgac 2880ttgattcgtt ttggtgctac
ccacgttttc aataaggacg agatggtgga gtaaagaagg 2940agtgcgtcga agcagatcgt
tcaaacattt ggcaataaag tttcttaaga ttgaatcctg 3000ttgccggtct tgcgatgatt
atcatataat ttctgttgaa ttacgttaag catgtaataa 3060ttaacatgta atgcatgacg
ttatttatga gatgggtttt tatgattaga gtcccgcaat 3120tatacattta atacgcgata
gaaaacaaaa tatagcgcgc aaactaggat aaattatcgc 3180gcgcggtgtc atctatgtta
ctagatcgat caaacttcgg tactgtgtaa tgacgatgag 3240caatcgagag gctgactaac
aaaaggtaca tcggtcgacg agctccctat agtgagtcgt 3300attagaggcc gacttggcca
aattcgtaat catggtcata gctgtttcct gtgtgaaatt 3360gttatccgct cacaattcca
cacaacatac gagccggaag cataaagtgt aaagcctggg 3420gtgcctaatg agtgagctaa
ctcacattaa ttgcgttgcg ctcactgccc gctttccagt 3480cgggaaacct gtcgtgccag
ctgcattaat gaatcggcca acgcgcgggg agaggcggtt 3540tgcgtattgg gcgctcttcc
gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc 3600tgcggcgagc ggtatcagct
cactcaaagg cggtaatacg gttatccaca gaatcagggg 3660ataacgcagg aaagaacatg
tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg 3720ccgcgttgct ggcgtttttc
cataggctcc gcccccctga cgagcatcac aaaaatcgac 3780gctcaagtca gaggtggcga
aacccgacag gactataaag ataccaggcg tttccccctg 3840gaagctccct cgtgcgctct
cctgttccga ccctgccgct taccggatac ctgtccgcct 3900ttctcccttc gggaagcgtg
gcgctttctc atagctcacg ctgtaggtat ctcagttcgg 3960tgtaggtcgt tcgctccaag
ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct 4020gcgccttatc cggtaactat
cgtcttgagt ccaacccggt aagacacgac ttatcgccac 4080tggcagcagc cactggtaac
aggattagca gagcgaggta tgtaggcggt gctacagagt 4140tcttgaagtg gtggcctaac
tacggctaca ctagaaggac agtatttggt atctgcgctc 4200tgctgaagcc agttaccttc
ggaaaaagag ttggtagctc ttgatccggc aaacaaacca 4260ccgctggtag cggtggtttt
tttgtttgca agcagcagat tacgcgcaga aaaaaaggat 4320ctcaagaaga tcctttgatc
ttttctacgg ggtctgacgc tcagtggaac gaaaactcac 4380gttaagggat tttggtcatg
agattatcaa aaaggatctt cacctagatc cttttaaatt 4440aaaaatgaag ttttaaatca
atctaaagta tatatgagta aacttggtct gacagttacc 4500aatgcttaat cagtgaggca
cctatctcag cgatctgtct atttcgttca tccatagttg 4560cctgactccc cgtcgtgtag
ataactacga tacgggaggg cttaccatct ggccccagtg 4620ctgcaatgat accgcgagac
ccacgctcac cggctccaga tttatcagca ataaaccagc 4680cagccggaag ggccgagcgc
agaagtggtc ctgcaacttt atccgcctcc atccagtcta 4740ttaattgttg ccgggaagct
agagtaagta gttcgccagt taatagtttg cgcaacgttg 4800ttgccattgc tacaggcatc
gtggtgtcac gctcgtcgtt tggtatggct tcattcagct 4860ccggttccca acgatcaagg
cgagttacat gatcccccat gttgtgcaaa aaagcggtta 4920gctccttcgg tcctccgatc
gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg 4980ttatggcagc actgcataat
tctcttactg tcatgccatc cgtaagatgc ttttctgtga 5040ctggtgagta ctcaaccaag
tcattctgag aatagtgtat gcggcgaccg agttgctctt 5100gcccggcgtc aatacgggat
aataccgcgc cacatagcag aactttaaaa gtgctcatca 5160ttggaaaacg ttcttcgggg
cgaaaactct caaggatctt accgctgttg agatccagtt 5220cgatgtaacc cactcgtgca
cccaactgat cttcagcatc ttttactttc accagcgttt 5280ctgggtgagc aaaaacagga
aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga 5340aatgttgaat actcatactc
ttcctttttc aatattattg aagcatttat cagggttatt 5400gtctcatgag cggatacata
tttgaatgta tttagaaaaa taaacaaata ggggttccgc 5460gcacatttcc ccgaaaagtg
ccacctgacg cgccctgtag cggcgcatta agcgcggcgg 5520gtgtggtggt tacgcgcagc
gtgaccgcta cacttgccag cgccctagcg cccgctcctt 5580tcgctttctt cccttccttt
ctcgccacgt tcgccggctt tccccgtcaa gctctaaatc 5640ggggcatccc tttagggttc
cgatttagtg ctttacggca cctcgacccc aaaaaacttg 5700attagggtga tggttcacgt
agtgggccat cgccctgata gacggttttt cgccctttga 5760cgttggagtc cacgttcttt
aatagtggac tcttgttcca aactggaaca acactcaacc 5820ctatctcggt ctattctttt
gatttataag ggattttgcc gatttcggcc tattggttaa 5880aaaatgagct gatttaacaa
aaatttaacg cgaattttaa caaaatatta acaaaatatt 5940aacgtttaca atttcccatt
cgccattcag gctgcgcaac tgttgggaag ggcgatcggt 6000gcgggcctct tcgctattac
gccagctggc gaaaggggga tgtgctgcaa ggcgattaag 6060ttgggtaacg ccagggtttt
cccagtcacg acgttgtaaa acgacggcca gtgccaagct 6120gacttggtca gcggccgcag
atttaggtga cactata 6157361130DNAZea
maysmisc_featurecef1f.pk001.f4fis 36cggcaggcac gcacgcaggg agagagatag
ataaaaggtc gcccccttga ggacagggca 60gggcagctga gggcaatgag cgctggcgga
ggcggcggcg gcaccagcac gcttggcggc 120gggggcccga gcggcagcgg cagcggaggc
cctggaggaa gcggcggcgg cgggccttgc 180ggcgcgtgca agttcctccg gcgcaagtgc
gtcagcggct gcatcttcgc gccctacttc 240gactcggagc agggcgcggc gcacttcgcg
gccgtgcaca aggtgttcgg cgccagcaac 300gtgtccaagc tgctgctcca gatcccggcg
cacaagcgcc tcgacgccgt cgtcaccatc 360tgctacgagg cccaggcgcg gctccgcgac
cccgtctacg gctgcgtcgc ccacatcttc 420gcgctccagc agcaggtggt gaatctccag
gccgagctga cctacctgca agcacacctc 480gccacgctcg agctgccggc cccgcccccg
ctgccggccc cgccgcagat gcccatgcca 540ggcccgttct ccatctcgga cctgccgttg
tcgaccagcg tccccaccac cgtcgacctg 600tccgcgctct tcgacccgcc accaccgcag
tgggcgacgg cgcagcagcc gcaccaccac 660catcaacagc cgccgcagca ccaccagctc
cggcaaccgg cgccgtatgg cgctggcgcg 720tccgtcaggt ccggcggcgt gaagctcgag
cacccgccgc cacactcaag atgagctgga 780tgggggagta gaaggatcaa aaacccgtgc
agaacaaggt gagagttggc gcccggcagt 840atcgagggag ataggggtcg gtgacgggcg
atgtccagca cagcaggagt aggtaagcag 900cattggccgg ttttcgcgta cccagcaccc
ctgttgttaa tcggctgggg tgcaatggcg 960gcgcccactt gcttgatata ttctccagtt
tgatcatatt tgctccaaga caaaagaaag 1020agtgctgggg atcgacgaga gtattactag
aattgacatg tattagtaaa aaaaaaaaaa 1080aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa 113037232PRTZea
maysMISC_FEATUREcef1f.pk001.f4fis 37Met Ser Ala Gly Gly Gly Gly Gly Gly
Thr Ser Thr Leu Gly Gly Gly1 5 10
15Gly Pro Ser Gly Ser Gly Ser Gly Gly Pro Gly Gly Ser Gly Gly
Gly 20 25 30Gly Pro Cys Gly
Ala Cys Lys Phe Leu Arg Arg Lys Cys Val Ser Gly 35
40 45Cys Ile Phe Ala Pro Tyr Phe Asp Ser Glu Gln Gly
Ala Ala His Phe 50 55 60Ala Ala Val
His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu65 70
75 80Leu Gln Ile Pro Ala His Lys Arg
Leu Asp Ala Val Val Thr Ile Cys 85 90
95Tyr Glu Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys
Val Ala 100 105 110His Ile Phe
Ala Leu Gln Gln Gln Val Val Asn Leu Gln Ala Glu Leu 115
120 125Thr Tyr Leu Gln Ala His Leu Ala Thr Leu Glu
Leu Pro Ala Pro Pro 130 135 140Pro Leu
Pro Ala Pro Pro Gln Met Pro Met Pro Gly Pro Phe Ser Ile145
150 155 160Ser Asp Leu Pro Leu Ser Thr
Ser Val Pro Thr Thr Val Asp Leu Ser 165
170 175Ala Leu Phe Asp Pro Pro Pro Pro Gln Trp Ala Thr
Ala Gln Gln Pro 180 185 190His
His His His Gln Gln Pro Pro Gln His His Gln Leu Arg Gln Pro 195
200 205Ala Pro Tyr Gly Ala Gly Ala Ser Val
Arg Ser Gly Gly Val Lys Leu 210 215
220Glu His Pro Pro Pro His Ser Arg225 230381038DNAZea
maysmisc_featurecpf1c.pk006.d18afis 38agaagcaggg cgcaagtcct accatagcaa
tatagcatag ctagcacacc agtagctagc 60atcggagacg atctatcgac tagctctcta
tagctagtta gctcttccct tgctagccgt 120ttgcgccggt gactgacgac gaccgacgac
atggccaacg aaggggccgc cgctgccgct 180gccgctgccg ctgctgctgc cgcgacgggc
gcggggtctc cgtgcggcgc gtgcaagttc 240ctgcgccggc ggtgcgtgcc ggagtgcgtg
ttcgcgccct acttcagcag cgaccagggc 300gccgcgcgct tcgccgccat ccacaaggtg
ttcggcgcca gcaacgcctc caagctgctg 360tcccacctcc ccgtggccga ccgctgcgag
gccgtcgtca ccatcaccta cgaggcgcag 420gccaggctcc gggaccccgt ctacggctgc
gtcgcccaga tcttcgccct ccagcagcag 480gtcgccatcc tgcaagcgca gctgatgcag
gccaaggcgc agctggcgtg cggcgtccag 540ggcgccgccg cgcactcgcc ggcgagccac
caccaccacc agtggccgga cagcgccagc 600atcagcgccc tgctccgcca ggacgcggcg
tgtagcgcca ggaggcccgg cgggcccctc 660gacgacttct tcactccgga gctcgtggcc
gggttcaggg acgacgtcgc cgccgccgcc 720gggcagcatt gcgcaggcaa ggtggatgcc
ggagagctcc agtacctggc ccaggccatg 780atgaggagcc ccaactactc cctgtagccg
tagctgtagc tgcctaggaa ggatgatgag 840aatcagacac catgcgtttt ggagccatgc
catgctgtgc catctcatct cgatctccac 900tccgctaatg caagtgttga gagatgagct
agaaattcct gcaaaaggaa gataacaact 960tgtaccagct agtgatgaag tactctcctt
gtctctctca aaaaaaaaaa aaaaaaaaaa 1020aaaaaaaaaa aaaaaaaa
103839218PRTZea
maysMISC_FEATUREcpf1c.pk006.d18afis 39Met Ala Asn Glu Gly Ala Ala Ala Ala
Ala Ala Ala Ala Ala Ala Ala1 5 10
15Ala Ala Thr Gly Ala Gly Ser Pro Cys Gly Ala Cys Lys Phe Leu
Arg 20 25 30Arg Arg Cys Val
Pro Glu Cys Val Phe Ala Pro Tyr Phe Ser Ser Asp 35
40 45Gln Gly Ala Ala Arg Phe Ala Ala Ile His Lys Val
Phe Gly Ala Ser 50 55 60Asn Ala Ser
Lys Leu Leu Ser His Leu Pro Val Ala Asp Arg Cys Glu65 70
75 80Ala Val Val Thr Ile Thr Tyr Glu
Ala Gln Ala Arg Leu Arg Asp Pro 85 90
95Val Tyr Gly Cys Val Ala Gln Ile Phe Ala Leu Gln Gln Gln
Val Ala 100 105 110Ile Leu Gln
Ala Gln Leu Met Gln Ala Lys Ala Gln Leu Ala Cys Gly 115
120 125Val Gln Gly Ala Ala Ala His Ser Pro Ala Ser
His His His His Gln 130 135 140Trp Pro
Asp Ser Ala Ser Ile Ser Ala Leu Leu Arg Gln Asp Ala Ala145
150 155 160Cys Ser Ala Arg Arg Pro Gly
Gly Pro Leu Asp Asp Phe Phe Thr Pro 165
170 175Glu Leu Val Ala Gly Phe Arg Asp Asp Val Ala Ala
Ala Ala Gly Gln 180 185 190His
Cys Ala Gly Lys Val Asp Ala Gly Glu Leu Gln Tyr Leu Ala Gln 195
200 205Ala Met Met Arg Ser Pro Asn Tyr Ser
Leu 210 215401262DNAZea
maysmisc_featurecpi1c.pk005.a12fis 40gcggcacgca cgcacgctcg cagggagaga
gatagataaa aggtcgcccc cttgagggca 60gggcagggca gctgagggca atgagcgctg
gcggcggcag cagcacgctt ggcggcgggg 120ggccgagcgg cagcagcagc ggaggccctg
gaggaagcgg cggcggcggc gggccttgcg 180gcgcgtgcaa gttcctccgg cgcaagtgcg
tcagcggctg catcttcgcg ccctacttcg 240actcggagca gggcgcggcg cacttcgcgg
ccgtgcacaa ggtgttcggc gccagcaacg 300tgtccaagct gctgctccag atcccggcgc
acaagcgcct cgacgccgtc gtcaccatct 360gctacgaggc ccaggcgcgg ctccgcgacc
ccgtctacgg ctgcgtcgcc cacatcttcg 420cgctccagca gcaggtggtg aatctccagg
ccgagctgac ctacctgcaa gcacacctcg 480ccacgctcga gctgccggcc ccgcccccgc
tgccggcccc gccgcagatg cccatgccag 540gcccgttctc catctcggac ctgccgttgt
cgaccagcgt ccccaccacc gtcgacctgt 600ccgcgctctt cgacccgcca ccaccgcagt
gggcgacggc gcagcagccg caccaccacc 660atcaacagcc gccgcagcac caccagctcc
ggcaaccggc gccgtatggc gctggcgcgt 720ccgtcaggcc cggcggcggc cccggcatgg
cagagagctc aggcggagac gagctgcagt 780cgctggcgag ggagctcctg gaccgccacc
ggtccggcgg cgtgaagctc gagcacccgc 840cgccacactc aagatgagct ggatggggga
gtagaaggat caaaaacccg tgcagaacaa 900ggtgagagtt ggcgcccggc agtatcgagg
gagatagggg tcggtgacgg gcgatgtcca 960gcacagcagg agtaggtaag cagcattggc
cggttttcgc gtacccagca cccctgttgt 1020taatcggctg gggtgcaatg gcggcgccca
cttgcttgat atattctcca gtttgatcat 1080atttgctcca agacaaaaga aagagtgctg
gggatcgacg agagtattac tagaattgac 1140atgtattagt aacattattg ttacctttga
taccgttcca ttagttgcaa gatttttatt 1200aagaaaagaa tctcaacatg gtttctaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1260aa
126241258PRTZea
maysMISC_FEATUREcpi1c.pk005.a12fis 41Met Ser Ala Gly Gly Gly Ser Ser Thr
Leu Gly Gly Gly Gly Pro Ser1 5 10
15Gly Ser Ser Ser Gly Gly Pro Gly Gly Ser Gly Gly Gly Gly Gly
Pro 20 25 30Cys Gly Ala Cys
Lys Phe Leu Arg Arg Lys Cys Val Ser Gly Cys Ile 35
40 45Phe Ala Pro Tyr Phe Asp Ser Glu Gln Gly Ala Ala
His Phe Ala Ala 50 55 60Val His Lys
Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Leu Gln65 70
75 80Ile Pro Ala His Lys Arg Leu Asp
Ala Val Val Thr Ile Cys Tyr Glu 85 90
95Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys Val Ala
His Ile 100 105 110Phe Ala Leu
Gln Gln Gln Val Val Asn Leu Gln Ala Glu Leu Thr Tyr 115
120 125Leu Gln Ala His Leu Ala Thr Leu Glu Leu Pro
Ala Pro Pro Pro Leu 130 135 140Pro Ala
Pro Pro Gln Met Pro Met Pro Gly Pro Phe Ser Ile Ser Asp145
150 155 160Leu Pro Leu Ser Thr Ser Val
Pro Thr Thr Val Asp Leu Ser Ala Leu 165
170 175Phe Asp Pro Pro Pro Pro Gln Trp Ala Thr Ala Gln
Gln Pro His His 180 185 190His
His Gln Gln Pro Pro Gln His His Gln Leu Arg Gln Pro Ala Pro 195
200 205Tyr Gly Ala Gly Ala Ser Val Arg Pro
Gly Gly Gly Pro Gly Met Ala 210 215
220Glu Ser Ser Gly Gly Asp Glu Leu Gln Ser Leu Ala Arg Glu Leu Leu225
230 235 240Asp Arg His Arg
Ser Gly Gly Val Lys Leu Glu His Pro Pro Pro His 245
250 255Ser Arg42977DNAZea
maysmisc_featurecr1n.pk0028.h3afis 42gcaacttgca gtaggtgaca ggtgttaaca
ggagctggct gagcttctct tgcttctgca 60agtagtagct gtagccgccc tgtaggcaga
gagaggagag acgacgtacg tgagggagcg 120agcgagcgac gacagcatca ggcaggcgtt
gacggccatg gcttcctccg gcagcggtgg 180cggctcgccg gggtccccgt gtggcgcctg
caagttcctg cggcgcaagt gcgcggcgga 240gtgcgtgttc gctccccact tctgcgccga
ggacggggcg gcgcagttcg cggccatcca 300caaggtgttc ggcgccagca acgcggccaa
gctgctgcag caggtggccc ccgccgaccg 360gagcgaggcg gcggccaccg tcacctacga
ggcgcaggcc aggctgcgcg accccatcta 420cggctgcgtc gcccacatct tcgcgctgca
gcaacaggtg gcgagcttgc agatgcaggt 480gctgcaggcg aaggcgcagg tggcgcagac
gatggcggcg gccgggccgc aggggggcag 540cagccctctc ctgcagcggt ggccgctgga
gcctgagtcg ctgtcgacgc agagctccgg 600gtgctacagc gacatgtact gcggcttcgg
cgaccaggag gaaggcagct acacgagatg 660aataatgaat ggatcattcg cgcgcgcgcg
cgcacgcatc gacacagata ctttcttcta 720ttagcgccaa gagacaacaa caaccgaggg
cctcaacttt cttgttggtt tgcagtgcgt 780tttgttcagt tcagcagcta gctctccggt
ttggggagga gcttaatttc gatgagattt 840cgtgcgatcc ataaacttgt atttcttgcc
ggttcgagct gtaaaatgga agtgcagctc 900atcatcatgt gtgtggttat taaacggagg
cacaaatcga ggataatttc atattcccta 960aaaaaaaaaa aaaaaaa
97743167PRTZea
maysMISC_FEATUREcr1n.pk0028.h3afis 43Met Ala Ser Ser Gly Ser Gly Gly Gly
Ser Pro Gly Ser Pro Cys Gly1 5 10
15Ala Cys Lys Phe Leu Arg Arg Lys Cys Ala Ala Glu Cys Val Phe
Ala 20 25 30Pro His Phe Cys
Ala Glu Asp Gly Ala Ala Gln Phe Ala Ala Ile His 35
40 45Lys Val Phe Gly Ala Ser Asn Ala Ala Lys Leu Leu
Gln Gln Val Ala 50 55 60Pro Ala Asp
Arg Ser Glu Ala Ala Ala Thr Val Thr Tyr Glu Ala Gln65 70
75 80Ala Arg Leu Arg Asp Pro Ile Tyr
Gly Cys Val Ala His Ile Phe Ala 85 90
95Leu Gln Gln Gln Val Ala Ser Leu Gln Met Gln Val Leu Gln
Ala Lys 100 105 110Ala Gln Val
Ala Gln Thr Met Ala Ala Ala Gly Pro Gln Gly Gly Ser 115
120 125Ser Pro Leu Leu Gln Arg Trp Pro Leu Glu Pro
Glu Ser Leu Ser Thr 130 135 140Gln Ser
Ser Gly Cys Tyr Ser Asp Met Tyr Cys Gly Phe Gly Asp Gln145
150 155 160Glu Glu Gly Ser Tyr Thr Arg
165441058DNAEuphorbia lagascaemisc_featureeel1c.pk003.b10fis
44gcttcttctt catattctgc gtctcataaa ccctaattat gctctcttct ctctccaaat
60tcgatccgaa atgagttcga cggtgcatcc tagcagcagc ggcagcagcg gcggagccgg
120aggaggagga agtggtggaa gtggcggagg gagtgggccg tgtggagcgt gtaaattttt
180gaggagaaaa tgtgtgccgg ggtgtatatt tgcgccgtac tttgattccg agcagggagc
240ggcgcatttt gcggcggtgc ataaggtttt tggtgcgagt aacgtttcga aacttcttct
300gcatattccg gtacataaac gccttgatgc ggtggttact atttgttatg aagctcaagc
360tcggcttcga gatcctgttt atggctgcgt tgctcatata ttcgctctgc aacaacaggt
420ggtgaactta caggcagagc tcacatattt gcaagcccat ttagcaacac tagagcttcc
480gtcaccaccg ccgcctcctc tcccaccaca aacactattg acaccaccac ctctatcaat
540atccgacctc ccatcatcct cttctgctcc cggttcatat gacttgcaat cgctttttga
600tccgatggca caaaattcat ggtcaatgca acaaaggcta atagatccac gccatcaatt
660cataggttcg actagtggtt catcgtcgtt aaccaccaca ggcagtggga gtggtgatct
720tcatacattg gcacgtgagc ttctccatag acatggttct ccgtcacatg gttcaatgcc
780atgtagcggc gctttatctt catctccgtc ttctatctca aaatgaaact gaccctattg
840atagaagttg ttgcaacata atttgtacta attttcaatg ggatgctagc cgaaagagct
900taagttttca tggtatttta gtttagagat ctagtgtttt aataactggt cactaatttt
960tttggccttc tgtttattat attattcatt ttctcttaaa aaaaaaaaaa aaaaaaaaaa
1020aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa
105845251PRTEuphorbia lagascaeMISC_FEATUREeel1c.pk003.b10fis 45Met Ser
Ser Thr Val His Pro Ser Ser Ser Gly Ser Ser Gly Gly Ala1 5
10 15Gly Gly Gly Gly Ser Gly Gly Ser
Gly Gly Gly Ser Gly Pro Cys Gly 20 25
30Ala Cys Lys Phe Leu Arg Arg Lys Cys Val Pro Gly Cys Ile Phe
Ala 35 40 45Pro Tyr Phe Asp Ser
Glu Gln Gly Ala Ala His Phe Ala Ala Val His 50 55
60Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Leu His
Ile Pro65 70 75 80Val
His Lys Arg Leu Asp Ala Val Val Thr Ile Cys Tyr Glu Ala Gln
85 90 95Ala Arg Leu Arg Asp Pro Val
Tyr Gly Cys Val Ala His Ile Phe Ala 100 105
110Leu Gln Gln Gln Val Val Asn Leu Gln Ala Glu Leu Thr Tyr
Leu Gln 115 120 125Ala His Leu Ala
Thr Leu Glu Leu Pro Ser Pro Pro Pro Pro Pro Leu 130
135 140Pro Pro Gln Thr Leu Leu Thr Pro Pro Pro Leu Ser
Ile Ser Asp Leu145 150 155
160Pro Ser Ser Ser Ser Ala Pro Gly Ser Tyr Asp Leu Gln Ser Leu Phe
165 170 175Asp Pro Met Ala Gln
Asn Ser Trp Ser Met Gln Gln Arg Leu Ile Asp 180
185 190Pro Arg His Gln Phe Ile Gly Ser Thr Ser Gly Ser
Ser Ser Leu Thr 195 200 205Thr Thr
Gly Ser Gly Ser Gly Asp Leu His Thr Leu Ala Arg Glu Leu 210
215 220Leu His Arg His Gly Ser Pro Ser His Gly Ser
Met Pro Cys Ser Gly225 230 235
240Ala Leu Ser Ser Ser Pro Ser Ser Ile Ser Lys 245
25046484DNAAquilegia vulgarismisc_featureeav1c.pk003.c9
46gcganagtgc gttgttggnt gtattttcgc cccatatttt gattcagaac aaggtgcaac
60acactttgca gctgttcata aggtgtttgg tgcaagtaat gtgtccaagc ttcttttaca
120catacctgtt cataagcgtt tggatgcagt tgttactatt tgttatgaag ctcaagcacg
180tttaagagat ccagtttatg ggtgtgttgc taatatcttt gctcttcaac aacaggtggg
240aaatttacaa gctgagttat cctacttgca aacataccta gcatcattgg gngcttccaa
300ctccaccanc aagctccgcc aacaccaatg cttattacaa caacacctct ctccaaaagc
360aaattttcca tcaagcttcc actaagncan gcaaaacttt tgacttggtc aactcctttt
420cganccccca aaggaacaaa tcgggggaca cttcaacaaa agacaaatgg atttttaaac
480aaat
48447127PRTAquilegia vulgarisMISC_FEATUREeav1c.pk003.c9 47Arg Xaa Cys Val
Val Gly Cys Ile Phe Ala Pro Tyr Phe Asp Ser Glu1 5
10 15Gln Gly Ala Thr His Phe Ala Ala Val His
Lys Val Phe Gly Ala Ser 20 25
30Asn Val Ser Lys Leu Leu Leu His Ile Pro Val His Lys Arg Leu Asp
35 40 45Ala Val Val Thr Ile Cys Tyr Glu
Ala Gln Ala Arg Leu Arg Asp Pro 50 55
60Val Tyr Gly Cys Val Ala Asn Ile Phe Ala Leu Gln Gln Gln Val Gly65
70 75 80Asn Leu Gln Ala Glu
Leu Ser Tyr Leu Gln Thr Tyr Leu Ala Ser Leu 85
90 95Gly Ala Ser Asn Ser Thr Xaa Lys Leu Arg Gln
His Gln Cys Leu Leu 100 105
110Gln Gln His Leu Ser Pro Lys Ala Asn Phe Pro Ser Ser Phe His 115
120 125481077DNACyamopsis
tetragonolobamisc_featurelds3c.pk011.j11fis 48gacaacacat cttgctctca
catgatacag gtagagagag aaagttgaaa ggatgatgag 60ttgtgttgca taaattgacg
aggaaggagt agcgagggca aaaaaggaat taaatttaaa 120gattaagatt cagttaaggt
ggaagatgag ttcgaaagct ggaaatggaa gtggaagtgg 180aagtggcagt ggaggcggga
gcccttgtgg ggcttgtaag tttcttcgaa ggaagtgtgt 240ggcaggatgt gtgtttgctc
catactttga ctcagagcaa ggagccactc attttgcagc 300tgtgcataag gtgtttggtg
caagcaacgt ttctaaactt ctcctcaacc ttccgctcaa 360caaaaggctt gatgctgtta
ttaccatttg ctatgaagct cagtcaagga tcagagatcc 420cgtcttcggc tgcgttgctc
acatctttgc tctccagcaa caggtggtaa gtttacaaac 480agaagtgtcg tacttacaaa
gccaccttgc tgcaatggag ttacctcagc caccacctcc 540tcctcctcca caggagacag
tggtgcaggc accggtattc tcgattgcag acataccggc 600agcaacggta gcgggcatgc
cggcgagcta tgacctgtct tcactttttg agccgacggg 660gcaacaaaat tcatgggggg
gcggcggcat agacccgcgt caatttttgg cagttggccc 720atcatcaact actgatgctg
atctccaagc aatggcacgt gacctttctg aaagacttgc 780ctctctacct ccacctgcac
ccgcacctgc atttgctcct ctacctccac ttccacctgc 840acctgcacct gcacctgcac
catcatgccc caatgcacct tcatctttat cactttctta 900attaatcatc atcatcatca
tacatgcatc tatcttcaga cttttcttca cttttatttt 960tcatcgaaaa ctagtcaggg
atcttcaatt tcgtacacgc tctaatttat gtgcgtgcgg 1020atatttcttt taattttcgc
gcttctgcct ttcaaaaaaa aaaaaaaaaa aaaaaaa 107749251PRTCyamopsis
tetragonolobaMISC_FEATURElds3c.pk011.j11fis 49Met Ser Ser Lys Ala Gly Asn
Gly Ser Gly Ser Gly Ser Gly Ser Gly1 5 10
15Gly Gly Ser Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg
Lys Cys Val 20 25 30Ala Gly
Cys Val Phe Ala Pro Tyr Phe Asp Ser Glu Gln Gly Ala Thr 35
40 45His Phe Ala Ala Val His Lys Val Phe Gly
Ala Ser Asn Val Ser Lys 50 55 60Leu
Leu Leu Asn Leu Pro Leu Asn Lys Arg Leu Asp Ala Val Ile Thr65
70 75 80Ile Cys Tyr Glu Ala Gln
Ser Arg Ile Arg Asp Pro Val Phe Gly Cys 85
90 95Val Ala His Ile Phe Ala Leu Gln Gln Gln Val Val
Ser Leu Gln Thr 100 105 110Glu
Val Ser Tyr Leu Gln Ser His Leu Ala Ala Met Glu Leu Pro Gln 115
120 125Pro Pro Pro Pro Pro Pro Pro Gln Glu
Thr Val Val Gln Ala Pro Val 130 135
140Phe Ser Ile Ala Asp Ile Pro Ala Ala Thr Val Ala Gly Met Pro Ala145
150 155 160Ser Tyr Asp Leu
Ser Ser Leu Phe Glu Pro Thr Gly Gln Gln Asn Ser 165
170 175Trp Gly Gly Gly Gly Ile Asp Pro Arg Gln
Phe Leu Ala Val Gly Pro 180 185
190Ser Ser Thr Thr Asp Ala Asp Leu Gln Ala Met Ala Arg Asp Leu Ser
195 200 205Glu Arg Leu Ala Ser Leu Pro
Pro Pro Ala Pro Ala Pro Ala Phe Ala 210 215
220Pro Leu Pro Pro Leu Pro Pro Ala Pro Ala Pro Ala Pro Ala Pro
Ser225 230 235 240Cys Pro
Asn Ala Pro Ser Ser Leu Ser Leu Ser 245
250501847DNAGlycine maxmisc_featuresdr1f.pk005.d21fis 50gtacgaggac
ccctcactct tccatactat agtcctcaga tttttagttt gcaccatttc 60ctagtgtgcc
cgtgtgccta caaattttat tcacttcctc ccactcaggt cctttctttt 120caaacataaa
atacatatct ttctctctct cggtaatgac tccaacttat tgatagtgtt 180ttatgttcag
ataatgcccg atgactttgt catgcagctc caccgatttt gagaacgaca 240gcgacttccg
tcccagccgt gccaggtgct gcctcagatt caggttatgc cgctcaattc 300gctgcgtata
tcgcttgctg attacgtgca gctttccctt caggcgggat tcatacagcg 360gccagccatc
cgtcatccat atcaccacgt caaagggtga cagcaggctc ataagacgcc 420ccagcgtcgc
catagtgcgt tcaccgaata cgtgcgcaac aaccgtcttc cggagactgt 480catacgcgta
aaacagccag cgctggcgcg atttagcccc gacatagccc cactgttcgt 540ccatttccgc
gcagacgatg acgtcactgc ccggctgtat gcgcgaggtt accgactgcg 600gcctgagttt
tttaagtgac gtaaaatcgt gttgaggcca acgcccataa tgcgggctgt 660tgcccggcat
ccaacgccat tcatggccat atcaatgatt ttctggtgcg taccgggttg 720agaagcggtg
taagtgaact gcagttgcca tgttttacgg cagtgagagc agagatagcg 780ctgatgtccg
gcggtgcttt tgccgttacg caccaccccg tcagtagctg aacaggaggg 840acagctgata
gaaacagaag ccactggagc acctcaaaaa caccatcata cactaaatca 900gtaagttggc
agcatcaccc tctctctctt tgtgtgttgg ttattagtac aattatacta 960ctactatact
atggcttctg ctagtggaaa tggtgtctct aatggctctg gctctccttg 1020cggggcatgc
aagttcctca gaagaaggtg tgcttctgat tgtatctttg caccttactt 1080ttgttcagaa
cagggccctg ctagatttgc agccatacac aaggtatttg gtgccagcaa 1140cgtttcaaag
ttgcttttgc acataccagc tcatgatcgt tgtgaagcgg ttgtcacaat 1200cacttatgag
gctcaggctc gtattagaga ccctgtctat ggctgtgtct ctcacatttt 1260tgccttacaa
caacaggtgg cacgcttgca ggcacagctg atgcaggtaa aagctcagct 1320gactcagaac
ctagtggagt ccaggaacat agagaataat catcatttgc aagggaataa 1380taacaatgtt
acaggacaac taatgaatca tccattttgt cccccttaca tgaatcctat 1440atctcctcaa
agctcacttg aatcaattga tcacagcagc atcaatgatg gaatgagcat 1500gcaagatata
caaagcagag aggatttcca aatccaagct aaagaaagac catacaacaa 1560caatgacttg
ggggagctgc aagaactggc actaaggatg atgaggaact gattaattat 1620gactaggtta
gcaccaaagc tagccttttc attttctaga agggtgttcc ttgatgttta 1680gggggggatg
gtcttttgct agtgttgtat atataatgag tgtcatgaag aaaaactggt 1740cataactgat
aataagccta aagtttaaac taagcattag gcttttttct gtttgtggat 1800tcaatccaaa
agaaaattaa ttttttgcaa aaaaaaaaaa aaaaaaa
184751213PRTGlycine maxMISC_FEATUREsdr1f.pk005.d21fis 51Met Ala Ser Ala
Ser Gly Asn Gly Val Ser Asn Gly Ser Gly Ser Pro1 5
10 15Cys Gly Ala Cys Lys Phe Leu Arg Arg Arg
Cys Ala Ser Asp Cys Ile 20 25
30Phe Ala Pro Tyr Phe Cys Ser Glu Gln Gly Pro Ala Arg Phe Ala Ala
35 40 45Ile His Lys Val Phe Gly Ala Ser
Asn Val Ser Lys Leu Leu Leu His 50 55
60Ile Pro Ala His Asp Arg Cys Glu Ala Val Val Thr Ile Thr Tyr Glu65
70 75 80Ala Gln Ala Arg Ile
Arg Asp Pro Val Tyr Gly Cys Val Ser His Ile 85
90 95Phe Ala Leu Gln Gln Gln Val Ala Arg Leu Gln
Ala Gln Leu Met Gln 100 105
110Val Lys Ala Gln Leu Thr Gln Asn Leu Val Glu Ser Arg Asn Ile Glu
115 120 125Asn Asn His His Leu Gln Gly
Asn Asn Asn Asn Val Thr Gly Gln Leu 130 135
140Met Asn His Pro Phe Cys Pro Pro Tyr Met Asn Pro Ile Ser Pro
Gln145 150 155 160Ser Ser
Leu Glu Ser Ile Asp His Ser Ser Ile Asn Asp Gly Met Ser
165 170 175Met Gln Asp Ile Gln Ser Arg
Glu Asp Phe Gln Ile Gln Ala Lys Glu 180 185
190Arg Pro Tyr Asn Asn Asn Asp Leu Gly Glu Leu Gln Glu Leu
Ala Leu 195 200 205Arg Met Met Arg
Asn 21052852DNATriticum aestivummisc_featurewdr1f.pk002.l10fis
52gcagagctcg atcataagct agctagtcag gccaggcggg cgatcggacg atcgggctat
60aatttcgact acggcgacga tggccggcgc gggcgtgacg acgacggggt cgccgtgcgg
120ggcgtgcaag ttcctgcggc gccggtgcgc ggcggagtgc gtgttcgcgc cctacttctg
180cgccgaggac ggcgcgtcgc agttcgcggc catccacaag gtgttcgggg ccagcaacgc
240ggccaagctg ctgcagcagg tggcccccgg cgaccggagc gaggcggccg ccacagtgac
300ctacgaggcg caggcccggc tgcgcgaccc cgtctacggc tgcgtcgccc acatcttcgc
360gctgcagcag caggttgtgg cgctgcaggc gcaggtggcg cacgccagga cgcaggcgca
420gctgggggcg gcgacggcga tgcacccgct gctccagcag cagctgcagc agcaggcgtg
480gcaggtggcc gccgccgcgg atcagcacga ccaccagtcc atgacgtcca cgcagagcag
540ctccggctgc tacagcggcg cccaccagcg ctccgacggc tcgtcgctgc acggcgccga
600gatgtactgc ggctacggcg agcaggagga aggcagctac taacccccag atgattgatt
660cactcgttcc tcgttcgttc ccctgagaaa cctgagacat gtgccatgaa aagtttctcc
720tttgcaacgc gcgttcgctt gagttggttc aactcttgcc ggtctcggct gtaaaggcat
780caatcggtct tgtgttgttt ggggctcaag acgaacccat aatttccaac tttgcaaaaa
840aaaaaaaaaa aa
85253187PRTTriticum aestivumMISC_FEATUREwdr1f.pk002.l10fis 53Met Ala Gly
Ala Gly Val Thr Thr Thr Gly Ser Pro Cys Gly Ala Cys1 5
10 15Lys Phe Leu Arg Arg Arg Cys Ala Ala
Glu Cys Val Phe Ala Pro Tyr 20 25
30Phe Cys Ala Glu Asp Gly Ala Ser Gln Phe Ala Ala Ile His Lys Val
35 40 45Phe Gly Ala Ser Asn Ala Ala
Lys Leu Leu Gln Gln Val Ala Pro Gly 50 55
60Asp Arg Ser Glu Ala Ala Ala Thr Val Thr Tyr Glu Ala Gln Ala Arg65
70 75 80Leu Arg Asp Pro
Val Tyr Gly Cys Val Ala His Ile Phe Ala Leu Gln 85
90 95Gln Gln Val Val Ala Leu Gln Ala Gln Val
Ala His Ala Arg Thr Gln 100 105
110Ala Gln Leu Gly Ala Ala Thr Ala Met His Pro Leu Leu Gln Gln Gln
115 120 125Leu Gln Gln Gln Ala Trp Gln
Val Ala Ala Ala Ala Asp Gln His Asp 130 135
140His Gln Ser Met Thr Ser Thr Gln Ser Ser Ser Gly Cys Tyr Ser
Gly145 150 155 160Ala His
Gln Arg Ser Asp Gly Ser Ser Leu His Gly Ala Glu Met Tyr
165 170 175Cys Gly Tyr Gly Glu Gln Glu
Glu Gly Ser Tyr 180 18554262PRTArabidopsis
thalianaMISC_FEATURENCBI General Identifier No. 17227164 54Met Ser Gly
Gly Gly Asn Thr Ile Thr Ala Val Gly Gly Gly Gly Gly1 5
10 15Gly Cys Gly Gly Gly Gly Ser Ser Gly
Gly Gly Gly Ser Ser Gly Gly 20 25
30Gly Gly Gly Gly Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys
35 40 45Val Pro Gly Cys Ile Phe Ala
Pro Tyr Phe Asp Ser Glu Gln Gly Ser 50 55
60Ala Tyr Phe Ala Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser65
70 75 80Lys Leu Leu Leu
His Ile Pro Val His Arg Arg Ser Asp Ala Val Val 85
90 95Thr Ile Cys Tyr Glu Ala Gln Ala Arg Ile
Arg Asp Pro Ile Tyr Gly 100 105
110Cys Val Ala His Ile Phe Ala Leu Gln Gln Gln Val Val Asn Leu Gln
115 120 125Ala Glu Val Ser Tyr Leu Gln
Ala His Leu Ala Ser Leu Glu Leu Pro 130 135
140Gln Pro Gln Thr Arg Pro Gln Pro Met Pro Gln Pro Gln Pro Leu
Phe145 150 155 160Phe Thr
Pro Pro Pro Pro Leu Ala Ile Thr Asp Leu Pro Ala Ser Val
165 170 175Ser Pro Leu Pro Ser Thr Tyr
Asp Leu Ala Ser Ile Phe Asp Gln Thr 180 185
190Thr Ser Ser Ser Ala Trp Ala Thr Gln Gln Arg Arg Phe Ile
Asp Pro 195 200 205Arg His Gln Tyr
Gly Val Ser Ser Ser Ser Ser Ser Val Ala Val Gly 210
215 220Leu Gly Gly Glu Asn Ser His Asp Leu Gln Ala Leu
Ala His Glu Leu225 230 235
240Leu His Arg Gln Gly Ser Pro Pro Pro Ala Ala Thr Asp His Ser Pro
245 250 255Ser Arg Thr Met Ser
Arg 260

55 22 PRT Artificial Sequence
C-block consensus
55
Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Xaa Cys Xaa Xaa Xaa Cys
1               5                   10                  15
Xaa Phe Ala Pro Xaa Phe
            20

56 12 PRT Artificial Sequence
N-terminus of the GAS block 49 amino acids
56
Phe Ala Ala Xaa His Lys Val Phe Gly Ala Ser Asn
1               5                   10

57 9 PRT Artificial Sequence
C-terminus of GAS block 49 amino acids
57
Arg Asp Pro Xaa Xaa Gly Cys Val Xaa
1               5

58 19 PRT Artificial Sequence
Leucine zipper motif
58
Leu Gln Xaa Gln Xaa Xaa Xaa Leu Gln Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1               5                   10                  15
Xaa Xaa Xaa

59 31 DNA Artificial Sequence
Oligonucleotide primer cpi BbsI F
59
gaagaccaat gagcgctggc ggcggcagca g 31

60 34 DNA Artificial Sequence
Oligonucleotide primer Cpi BsaI R
60
ggtctcctca tcttgagtgt ggcggcgggt gctc 34

61 1855 DNA Zea mays
61tatgcatcca acgcgttggg agctctccca tatggtcgac ctgcaggcgg ccgcgaattc
60actagtgatt gaagaccaat gagcgctggc ggcggcagca gcacgcttgg cggcgggggc
120ccgagcggca gcggcagcgg aggccctgga ggaagcggcg gcggcgggcc ttgcggcgcg
180tgcaagttcc tccggcgcaa gtgcgtcagc ggctgcatct tcgcgcccta cttcgactcg
240gagcagggcg cggcgcactt cgcggccgtg cacaaggtgt tcggcgccag caacgtgtcc
300aagctgctgc tccagatccc ggcgcacaag cgcctcgacg ccgtcgtcac catctgctac
360gaggcccagg cgcggctccg cgaccccgtc tacggctgcg tcgcccacat cttcgcgctc
420cagcagcagg tatatatatg agatgctagg atgatcgatt atctttgggt tgggttatat
480atatattcgg tccatccatc catgcaagat ccatccatgg gctcgctcgc tagtagcttg
540gcatgcatgc acgcatgcat ggatcgatca tggatagacg atgcctgcta gtagtaggcc
600ggcaggcgct accagcgatt attgctgcat gatttcccct tcgcattcgc gtgtggatct
660gggtcttttc tgaatccgcc gtctctgcga taagattctg ggagcggcca ggcgtgtttc
720tttctcgagg aaggcaagtc cgtccccgtc cccccctttc acgaggaaat caacactgac
780aagccaagca acggcagtgc aaaaagaagc acgccaagcg ctaatccggg aggcctgcct
840gcggcgatga atgatatgca cttctcatcc gtcgcatccg tgccgtcgat cgcattcctc
900ttctacccgt caaggcagca gccacgtaca ccatgcggat gcatgtgatg tgtgtgtgtg
960tgtgtgtgta tctccttcta tcttgggctc tgcacaaagc cttccaatgc cagtggcggt
1020gtggtgcttc ccgatctgat cgatcgatga ctcgatgagc tagccctcct tgaaaagaat
1080agaacgtcag cgccaatctc tagtattggt agcagcagta gccgtcctcc tcctaggtag
1140aagatccaaa cctgcattct tttttgtcaa tcgtgcgatg gacacctttc atttcgatcg
1200catatttgca tccgtgtgtg tgatgtgtct ttttttttct tccatattat atgcatctgt
1260atcgtgtaca aacaatgatg gcttttggtg gttccaagtt tgcacgtaac aatttactgt
1320tggatcgtcg acggtgcatg aatgtcacgt cattattccc caggtggtga atctccaggc
1380cgagctgacc tacctgcaag cacacctcgc cacgctcgag ctgccggccc cgcccccgct
1440gccggccccg ccgcagatgc ccatgccagg cccgttctcc atctcggacc tgccgttgtc
1500gaccagcgtc cccaccaccg tcgacctgtc cgcgctcttc gacccgccac caccgcagtg
1560ggcgacggcg cagcagccgc accaccacca tcaacagccg ccgcagcacc accagctccg
1620gcaaccggcg ccgtatggcg ctggcgcgtc cgtcaggccc ggcggcggcc ccggcatggc
1680agagagctca ggcggagacg agctgcagtc gctggcgagg gagctcctgg accgccaccg
1740gtccggcggc gtgaagctcg agcacccgcc gccacactca agatgaggag accaatcgaa
1800ttcccgcggc cgccatggcg gcccgggagc atgcgacgtc gggcccaatt cgccc
1855

62 40 DNA Artificial Sequence
Oligonucleotide primer RE2 pro Bst 2F
62
caccatcatg tcagtgtgcc aatacgctaa acttagaaga 40

63 30 DNA Artificial Sequence
Oligonucleotide primer RE2 PRO R BbsI
63
gaagacgctc attcttggaa tgagccccca 30

64 4359 DNA Artificial
pGEMT-easy vector with portion of rice RE2
promoter 64gggcgaattg ggcccgacgt cgcatgctcc cggccgccat ggcggccgcg
ggaattcgat 60tcaccatcat gtcagtgtgc caatacgcta aacttagaag atgcctccga
taatcgtagc 120atttgtatta tttggggaat gaattaaawa atataaataa tgatatatta
caattgataa 180tctatgttta gaaacttttg tcggttactc gctcaaattg tatggggtaa
taaatcggtg 240aagtatattt ttatactgaa tggaaaagat aagctaccat tgatagcatt
agcggttcta 300tttcgtatat tatcccgatt atccaccctc aatttgtgct aaaataagat
ttttacatca 360tcctagtcaa tatttggggt taccctgtct gcattataat ttatttttgt
gcttaactat 420aatatataca tacactataa tttatctaaa taaaagttct ggtatgattg
aaaaaaacta 480acaattttgt gtgtggcgta ttgagtggaa gaatgtcatg ttaggatcac
atgggagaga 540gtgcatgcga cgagatcatc cttgttggtc tgtgcaggtg gtgtgaaatg
tgatcaatat 600atatggtggt gacagagaga gaaactaacc caaaaaaaca aaaaaagaga
gatgagagcg 660aatggatgga tgcaattggc attaattttc ggtctttgct gttctccccc
agccaggcca 720gtttgcttca cgcaatattc taaccctttg agaaagagaa gtgtacttgt
tgccaaggcc 780aattgcaagc atttgccttg gctttaaagt ctcatcaata caacggcacc
aaaaagaaaa 840cacatagaga tagaaaacca cctagtagct gatatacatt tatatatgac
ctaaataaaa 900aaattccatt aatatrtata attccagcaa caacataaag aaataaaaat
gcatttaaga 960aaacatagaa agaaataaaa ataaagtaaa taaagctagc taggcccaaa
attggcagta 1020attaagtagg gactagtata gaaatatatg gatatataca ccagcctcca
ccaatgggat 1080tgcaaacagc ctacttatca ctttgctgct gtatttacgc ttttgccctt
cttccctcct 1140atatgtacag ccgccccmac ctcattccct ccattcttac tccacacaca
cactctctct 1200ctctaccatt tgtgagaaag aaaatcgatt cagttctaga gagagaaaca
aacaattttc 1260gctgtctatc tctctcttgc tactagtcgg tcgatcttga gttagtttta
accctacaca 1320agccaaggta acaacatcta gcaggtagga gaagagtgct agagactagg
tggtgggggc 1380tcattccaag aatgagcgtc ttcaatcact agtgaattcg cggccgcctg
caggtcgacc 1440atatgggaga gctcccaacg cgttggatgc atagcttgag tattctatag
tgtcacctaa 1500atagcttggc gtaatcatgg tcatagctgt ttcctgtgtg aaattgttat
ccgctcacaa 1560ttccacacaa catacgagcc ggaagcataa agtgtaaagc ctggggtgcc
taatgagtga 1620gctaactcac attaattgcg ttgcgctcac tgcccgcttt ccagtcggga
aacctgtcgt 1680gccagctgca ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt
attgggcgct 1740cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg
cgagcggtat 1800cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac
gcaggaaaga 1860acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg
ttgctggcgt 1920ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca
agtcagaggt 1980ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc
tccctcgtgc 2040gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc
ccttcgggaa 2100gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag
gtcgttcgct 2160ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc
ttatccggta 2220actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca
gcagccactg 2280gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg
aagtggtggc 2340ctaactacgg ctacactaga agaacagtat ttggtatctg cgctctgctg
aagccagtta 2400ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct
ggtagcggtg 2460gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa
gaagatcctt 2520tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa
gggattttgg 2580tcatgagatt atcaaaaagg atcttcacct agatcctttt aaattaaaaa
tgaagtttta 2640aatcaatcta aagtatatat gagtaaactt ggtctgacag ttaccaatgc
ttaatcagtg 2700aggcacctat ctcagcgatc tgtctatttc gttcatccat agttgcctga
ctccccgtcg 2760tgtagataac tacgatacgg gagggcttac catctggccc cagtgctgca
atgataccgc 2820gagacccacg ctcaccggct ccagatttat cagcaataaa ccagccagcc
ggaagggccg 2880agcgcagaag tggtcctgca actttatccg cctccatcca gtctattaat
tgttgccggg 2940aagctagagt aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc
attgctacag 3000gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt cagctccggt
tcccaacgat 3060caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc
ttcggtcctc 3120cgatcgttgt cagaagtaag ttggccgcag tgttatcact catggttatg
gcagcactgc 3180ataattctct tactgtcatg ccatccgtaa gatgcttttc tgtgactggt
gagtactcaa 3240ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg
gcgtcaatac 3300gggataatac cgcgccacat agcagaactt taaaagtgct catcattgga
aaacgttctt 3360cggggcgaaa actctcaagg atcttaccgc tgttgagatc cagttcgatg
taacccactc 3420gtgcacccaa ctgatcttca gcatctttta ctttcaccag cgtttctggg
tgagcaaaaa 3480caggaaggca aaatgccgca aaaaagggaa taagggcgac acggaaatgt
tgaatactca 3540tactcttcct ttttcaatat tattgaagca tttatcaggg ttattgtctc
atgagcggat 3600acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca
tttccccgaa 3660aagtgccacc tgatgcggtg tgaaataccg cacagatgcg taaggagaaa
ataccgcatc 3720aggaaattgt aagcgttaat attttgttaa aattcgcgtt aaatttttgt
taaatcagct 3780cattttttaa ccaataggcc gaaatcggca aaatccctta taaatcaaaa
gaatagaccg 3840agatagggtt gagtgttgtt ccagtttgga acaagagtcc actattaaag
aacgtggact 3900ccaacgtcaa agggcgaaaa accgtctatc agggcgatgg cccactacgt
gaaccatcac 3960cctaatcaag ttttttgggg tcgaggtgcc gtaaagcact aaatcggaac
cctaaaggga 4020gcccccgatt tagagcttga cggggaaagc cggcgaacgt ggcgagaaag
gaagggaaga 4080aagcgaaagg agcgggcgct agggcgctgg caagtgtagc ggtcacgctg
cgcgtaacca 4140ccacacccgc cgcgcttaat gcgccgctac agggcgcgtc cattcgccat
tcaggctgcg 4200caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc
tggcgaaagg 4260gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt
cacgacgttg 4320taaaacgacg gccagtgaat tgtaatacga ctcactata
4359

65 38 DNA Artificial Sequence
Oligonucleotide primer RE2 TERM XbaI R
65
gtaaaaggat ctagacacct ggctctagcc tccaagta 38

66 50 DNA Artificial Sequence
Oligonucleotide primer RE2 TERM EcoBspmI
66
tggagcgaat tcacctgcca agatgatcct cctcactgtg tgtgatcatc 50

67 3745 DNA Artificial
pGEM9z plasmid containing portion of rice RE2
terminator 67gggcgaattg ggcccgacgt cgcatgctcc tctagacacc tggctctagc
ctccaagtac 60tcacatatgt tcacacttta ttcagttgcc accatacata tatgttcagt
taccattaac 120cccaaaaatg ctcattttgc tacttgctag ctgttgtata tattatggca
aaaaaaaaaa 180gctcatttgt catagagaga tcagaaagtc ccaaattaag actgactaca
cttcacttca 240gatgttcagt tagaaccttg aaacagaaat caaacaaact tcaaacacaa
ccgtaattaa 300tctgcgatga gacaaagatt ttttttttgt atattgaact aaattagacc
aggttttgtt 360ttttgcaatt agaactgttg caatggaaca gtgatcaggg ctacagctaa
gctaacaaga 420actcatcccc aagcttcctc ttttggagca aactcccaag cattataaaa
tggtgcaagt 480gggcacccct tgcaccctag cccaagaaca aagatgatac gcgaaaaccg
ggcaatgcct 540gttacctacc atctccatcc ccaacatcac catcatcgtc ttcctcgcca
tcgtcgtcgt 600cgtcgttcga tttttcgcct caacgacatg gaggatttgg tgaaaaaaat
ttcaccattg 660ggattaaatt ctcttcttgc ctcaaccaat gcattcatga atgattaaca
cctgatcaat 720tagtccatga gctagctagc taagctgaat tgatgatcac acacagtgag
gaggatcatc 780ttggcaggtg aattcggtac cccgggttcg aaatcgataa gcttggatcc
ggagagctcc 840caacgcgttg gatgcatagc ttgagtattc tatagtgtca cctaaatagc
ttggcgtaat 900catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca
cacaacatac 960gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa
ctcacattaa 1020ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag
ctgcattaat 1080gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc
gcttcctcgc 1140tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct
cactcaaagg 1200cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg
tgagcaaaag 1260gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc
cataggctcc 1320gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga
aacccgacag 1380gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct
cctgttccga 1440ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg
gcgctttctc 1500atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag
ctgggctgtg 1560tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat
cgtcttgagt 1620ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac
aggattagca 1680gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac
tacggctaca 1740ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttaccttc
ggaaaaagag 1800ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt
tttgtttgca 1860agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc
ttttctacgg 1920ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg
agattatcaa 1980aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca
atctaaagta 2040tatatgagta aacttggtct gacagttacc aatgcttaat cagtgaggca
cctatctcag 2100cgatctgtct atttcgttca tccatagttg cctgactccc cgtcgtgtag
ataactacga 2160tacgggaggg cttaccatct ggccccagtg ctgcaatgat accgcgagac
ccacgctcac 2220cggctccaga tttatcagca ataaaccagc cagccggaag ggccgagcgc
agaagtggtc 2280ctgcaacttt atccgcctcc atccagtcta ttaattgttg ccgggaagct
agagtaagta 2340gttcgccagt taatagtttg cgcaacgttg ttgccattgc tacaggcatc
gtggtgtcac 2400gctcgtcgtt tggtatggct tcattcagct ccggttccca acgatcaagg
cgagttacat 2460gatcccccat gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc
gttgtcagaa 2520gtaagttggc cgcagtgtta tcactcatgg ttatggcagc actgcataat
tctcttactg 2580tcatgccatc cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag
tcattctgag 2640aatagtgtat gcggcgaccg agttgctctt gcccggcgtc aatacgggat
aataccgcgc 2700cacatagcag aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg
cgaaaactct 2760caaggatctt accgctgttg agatccagtt cgatgtaacc cactcgtgca
cccaactgat 2820cttcagcatc ttttactttc accagcgttt ctgggtgagc aaaaacagga
aggcaaaatg 2880ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat actcatactc
ttcctttttc 2940aatattattg aagcatttat cagggttatt gtctcatgag cggatacata
tttgaatgta 3000tttagaaaaa taaacaaata ggggttccgc gcacatttcc ccgaaaagtg
ccacctgtat 3060gcggtgtgaa ataccgcaca gatgcgtaag gagaaaatac cgcatcagga
cgcgccctgt 3120agcggcgcat taagcgcggc gggtgtggtg gttacgcgca gcgtgaccgc
tacacttgcc 3180agcgccctag cgcccgctcc tttcgctttc ttcccttcct ttctcgccac
gttcgccggc 3240tttccccgtc aagctctaaa tcgggggctc cctttagggt tccgatttag
tgctttacgg 3300cacctcgacc ccaaaaaact tgattagggt gatggttcac gtagtgggcc
atcgccctga 3360tagacggttt ttcgcccttt gacgttggag tccacgttct ttaatagtgg
actcttgttc 3420caaactggaa caacactcaa ccctatctcg gtctattctt ttgatttata
agggattttg 3480ccgatttcgg cctattggtt aaaaaatgag ctgatttaac aaaaatttaa
cgcgaatttt 3540aacaaaatat taacgcttac aatttccatt cgccattcag gctgcgcaac
tgttgggaag 3600ggcgatcggt gcgggcctct tcgctattac gccagctggc gaaaggggga
tgtgctgcaa 3660ggcgattaag ttgggtaacg ccagggtttt cccagtcacg acgttgtaaa
acgacggcca 3720gtgaattgta atacgactca ctata
3745

68 27 DNA Artificial Sequence
oligonucleotide primer that may be used to identify RE2 homologs
68
gcatcttcgc gccctacttc gactcgg 27

69 35 DNA Artificial Sequence
oligonucleotide primer that may be used to identify RE2 homologs
69
gcacaaggtg ttcggcgcca gcaacgtgtc caagc 35

70 34 DNA Artificial Sequence
oligonucleotide primer that may be used to identify RE2 homologs
70
ccgcgacccc gtctacggct gcgtcgccca cctc 34

71 448 DNA Artificial Sequence
Probe
used to identify RE2 cDNA 71gcaggtctcc agtccgagct gaactacctg caaggtcacc
tctcgacgat ggagctgccg 60tcgccgccgc cctacgtcgc cgggccgacc ctggcgccgc
cacagccaca gccactgatg 120ccgatgaccg ccgccgccaa cttcaacttc tccgacctgc
catcgtcgtc ggcggccaac 180attccggtca ccgccgacct gtccaccctc tttgacccac
tgccggcggc gcagccgcag 240tggggactat accagcagca gcaacaccac caccagcagc
tgcatcatca cccctatgac 300cggatgggcg acggctcgtc gagcagcaga ggcggcgacg
acgatggcag cgacggcggc 360gacttgcaag cgctggcgag ggagcttctt gaccgccatg
gacggtcgtc gtcgagctcc 420aagctggagc cgccacctca cacacagt
448721500DNAOryza sativa 72ggcacgaggt ctctctctac
catttgtgag aaagaaaatc gattcagttc tagagagaga 60aacaaacaat tttcgctgtc
tatctctctc ttgctactag tcggtcgatc ttgagttagt 120tttaacccta cacaagccaa
ggtaacaaca tctagcaggt aggagaagag agctagagac 180taggtggtgg gggctcattc
caagaatgag ctcgtcggtg gttgtgagcg cgagcggcag 240cggcagcggc ggcggaggag
gaggaggagg tggcggcgcc ggaggtggag gaggaggtgg 300gccgtgcggg gcgtgcaagt
tcttgcggcg gaagtgcgtg caggggtgca tcttcgcgcc 360ctacttcgac tcggaggccg
gggcggcgca cttcgcggcg gtgcacaagg tgttcggcgc 420cagcaacgtg tccaagctgc
tgcagcagat cccggcgcac cgccgcctcg acgccgtcgt 480caccatctgc tacgaggccc
aggcccgcct ccgcgacccc gtctacggct gcgtcgccca 540catcttccac ctccaacacc
aggtggcagg tctccagtcc gagctgaact acctgcaagg 600tcacctctcg acgatggagc
tgccgtcgcc gccgccctac gtcgccgggc cgaccctggc 660gccgccacag ccacagccac
tgatgccgat gaccgccgcc gccaacttca acttctccga 720cctgccatcg tcgtcggcgg
ccaacattcc ggtcaccgcc gacctgtcca ccctctttga 780cccactgccg gcggcgcagc
cgcagtgggg actataccag cagcagcaac accaccacca 840gcagctgcat catcacccct
atgaccggat gggcgacggc tcgtcgagca gcagaggcgg 900cgacgacgat ggcagcgacg
gcggcgactt gcaagcgctg gcgagggagc ttcttgaccg 960ccatggacgg tcgtcgtcga
gctccaagct ggagccgcca cctcacacac agtgatcctc 1020ctcactgtgt gtgatcatca
attcagctta gctagctagc tcatggacta attgatcagg 1080tgttaatcat tcatgaatgc
attggttgag gcaagaagag aatttaatcc caatggtgaa 1140atttttttca ccaaatcctc
catgtcgttg aggcgaaaaa tcgaacgacg acgacgacga 1200tggcgaggaa gacgatgatg
gtgatgttgg ggatggagat ggtaggtaac aggcattgcc 1260cggttttcgc gtatcatctt
tgttcttggg ctagggtgca aggggtgccc acttgcacca 1320ttttataatg cttgggagtt
tgctccaaaa gaggaagctt ggggatgagt tcttgttagc 1380ttagctgtag ccctgatcac
tgttccattg caacagttct aattgcaaaa aacaaaacct 1440ggtctaattt agttcaatat
acaaaaaaaa aatctttgtc tcaaaaaaaa aaaaaaaaaa 1500
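
The consensus entries in the listing (for example the C-block consensus of SEQ ID NO:55, where Xaa stands for any residue) can be checked against the listed polypeptides mechanically. The following is a minimal sketch of one way to do that and is not part of the application: the function names are hypothetical, and treating each Xaa as a single-residue wildcard in a regular expression is an assumption about how the consensus is meant to be read. The test sequence is the first 48 residues of SEQ ID NO:39 as listed.

```python
import re

# Standard three-letter to one-letter amino acid codes; "Xaa" marks any residue.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}

def three_to_one(seq):
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[residue] for residue in seq.split())

def consensus_to_regex(consensus):
    """Assumption: each Xaa in a consensus matches exactly one residue."""
    return re.compile(three_to_one(consensus).replace("X", "."))

def find_motif(consensus, protein):
    """Return the 1-based (start, end) of the first match, or None."""
    match = consensus_to_regex(consensus).search(three_to_one(protein))
    if match is None:
        return None
    return (match.start() + 1, match.end())

# C-block consensus, SEQ ID NO:55 (22 residues).
C_BLOCK = ("Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Xaa Cys "
           "Xaa Xaa Xaa Cys Xaa Phe Ala Pro Xaa Phe")

# Residues 1-48 of SEQ ID NO:39 (Zea mays, cpf1c.pk006.d18afis).
SEQ39_N_TERM = (
    "Met Ala Asn Glu Gly Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala "
    "Ala Ala Thr Gly Ala Gly Ser Pro Cys Gly Ala Cys Lys Phe Leu Arg "
    "Arg Arg Cys Val Pro Glu Cys Val Phe Ala Pro Tyr Phe Ser Ser Asp"
)

if __name__ == "__main__":
    print(find_motif(C_BLOCK, SEQ39_N_TERM))  # (24, 45)
```

Under these assumptions the C-block is found at residues 24 through 45 of SEQ ID NO:39. The same helper could be pointed at the other consensus entries (SEQ ID NOs:56 through 58) and the other listed polypeptides.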