Patent application title: Alteration of Plant Embryo/Endosperm Size During Seed Development

Inventors:  Hajime Sakai (Newark, DE, US)  Nobuhiro Nagasawa (Newark, DE, US)
IPC8 Class: AA01H100FI
USPC Class: 800278
Class name: METHOD OF INTRODUCING A POLYNUCLEOTIDE MOLECULE INTO OR REARRANGEMENT OF GENETIC MATERIAL WITHIN A PLANT OR PLANT PART
Publication date: 08/27/2009
Patent application number: 20090217412






Abstract:

Isolated nucleic acid fragments and recombinant constructs comprising such fragments useful for altering embryo/endosperm size during seed development are disclosed along with a method of controlling embryo/endosperm size during development in plants using such recombinant constructs.

Claims:

1. An isolated polynucleotide comprising: (a) a nucleic acid sequence encoding a polypeptide involved in altering embryo/endosperm size during seed development, said polypeptide having at least 80% amino acid sequence identity, based on the Clustal V method of alignment, when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:37, 39, 41, 43, 45, 47, 49, 51, and 53; or (b) a nucleic acid sequence set forth in SEQ ID NO:25 wherein said sequence comprises at least one of the following modifications: (i) nucleotide 271 is a T residue instead of a C; (ii) nucleotide 110 is a T residue instead of a G; or (iii) nucleotide 75 is deleted; or (c) a nucleic acid sequence set forth in SEQ ID NO:34 wherein (i) nucleotides 4473 through 4829 correspond to a first exon, and (ii) nucleotides 5661 through 6110 correspond to a second exon, and further wherein the nucleotides of (c)(i) and/or (c)(ii) encode a polypeptide involved in altering embryo/endosperm size during seed development; or (d) a nucleic acid sequence set forth in SEQ ID NO:72; or (e) a full complement of (a), (b), (c), (d), or SEQ ID NO:34; or (f) all or part of a non-coding or coding region of the isolated polynucleotide comprising sequences of (a), (b), (c), (d), (e), or SEQ ID NO:34 for use in co-suppression or antisense suppression of endogenous nucleic acid sequences encoding polypeptides involved in altering embryo/endosperm size during seed development.

2. The isolated polynucleotide of claim 1 wherein the amino acid sequence identity is at least 85%.

3. The isolated polynucleotide of claim 1 wherein the amino acid sequence identity is at least 90%.

4. The isolated polynucleotide of claim 1 wherein the amino acid sequence identity is at least 95%.

5. The isolated polynucleotide of claim 1 wherein the amino acid sequence identity is 100%.

6. The isolated polynucleotide of claim 1 wherein the nucleotide sequence corresponds to any of the nucleotide sequences set forth in SEQ ID NOs:34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72.

7. A recombinant DNA construct comprising the isolated polynucleotide of any one of claims 1-6 operably linked to at least one regulatory sequence.

8. A plant comprising in its genome the recombinant DNA construct of claim 7.

9. Seeds and progeny thereof obtained from the plant of claim 8.

10. Oil obtained from the seeds of claim 9.

11. The plant of claim 8 wherein said plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.

12. Transformed plant tissue or plant cells comprising the recombinant DNA construct of claim 7.

13. The transformed plant tissue or plant cells of claim 12 wherein the plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.

14. A method of altering embryo/endosperm size during seed development in a plant comprising: (a) transforming plant cells or plant tissue with the recombinant DNA construct of claim 7; (b) regenerating transgenic plants from the transformed plant cells or plant tissue of (a); and (c) obtaining seeds and progeny thereof from the transgenic plants of (b) having altered embryo/endosperm size based on a comparison of embryo/endosperm size of seeds obtained from non-transformed plants.

15. The method of claim 14 wherein said plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.

16. A method of mapping genetic variations related to controlling embryo/endosperm size and/or altering oil phenotype in plants comprising: (a) crossing two plant varieties; and (b) evaluating genetic variations with respect to (i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or (ii) a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.

17. The method of claim 16 wherein the plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.

18. A method of molecular breeding to control embryo/endosperm size and/or altering oil phenotype in plants comprising: (a) crossing two plant varieties; and (b) evaluating genetic variations with respect to (i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or (ii) a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.

19. The method of claim 18 wherein the plant is selected from the group consisting of rice, corn, sorghum, millet, rye, soybean, canola, wheat, barley, oat, beans, and nuts.

Description:

[0001]This application claims the benefit of U.S. Provisional Application No. 60/664,512, filed 23 Mar. 2005, the entire content of which is hereby incorporated by reference.

FIELD OF THE INVENTION

[0002]The present invention is in the field of plant breeding and genetics and, in particular, relates to recombinant constructs useful for altering embryo/endosperm size during seed development.

BACKGROUND OF THE INVENTION

[0003]Elucidation of how the size of a developing embryo is genetically regulated is important because the final volume of endosperm as a storage organ of starch and proteins is affected by embryo size in cereal crops. Researchers have found that genes involved in embryo size contribute to the regulation of endosperm development. Investigation of these genes is important for agriculture because cereal endosperms are the staple diet in many countries.

[0004]Rice mutants having normally differentiated shoot and radicle and either a reduced or an enlarged embryo when compared to wild type rice were identified in the early 1990s in plants obtained from the methyl-nitrosourea mutagenized Taichung 65 cultivar. Mutant plants displaying an enlarged embryo were designated giant embryo (ge) mutants while plants displaying a smaller embryo were designated reduced embryo (re) mutants (Kitano et al. 1993, Plant J. 3:607-610; Hong et al. 1995, Dev. Genet. 16:298-310).

[0005]The phenotypes of the three reduced embryo mutants were designated re1, re2, and re3 even though the gene(s) responsible for these phenotypes have not been characterized. A mutation in a different locus is responsible for each mutant phenotype. Phenotypic analysis of ge and re mutant plants led to the theory that embryo size may be determined by the interaction between embryo-specific genes and endosperm-specific genes regulating endosperm development (Hong et al. (1996) Development 122:2051-2058).

[0006]The reduced embryo size phenotype of re2 mutant plants is associated with the enlargement of the endosperm size without altering the overall seed size. This phenotype is potentially useful for improving cereal quality by increasing the amount of endosperm tissue, which is rich in starch and other nutrients. Moreover, the reduction of embryo size in seed has a potential benefit for some milling processes, where embryonic tissues are considered as waste, such as in the production of ethanol.

SUMMARY OF THE INVENTION

[0007]In a first embodiment, the invention concerns an isolated polynucleotide comprising:

[0008](a) a nucleic acid sequence encoding a polypeptide involved in altering embryo/endosperm size during seed development, said polypeptide having at least 80% amino acid sequence identity, based on the Clustal V method of alignment, when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:37, 39, 41, 43, 45, 47, 49, 51, and 53; or

[0009](b) a nucleic acid sequence set forth in SEQ ID NO:25 wherein said sequence comprises at least one of the following modifications:

[0010](i) nucleotide 271 is a T residue instead of a C;

[0011](ii) nucleotide 110 is a T residue instead of a G; or

[0012](iii) nucleotide 75 is deleted; or

[0013](c) a nucleic acid sequence set forth in SEQ ID NO:34 wherein

[0014](i) nucleotides 4473 through 4829 correspond to a first exon, and

[0015](ii) nucleotides 5661 through 6110 correspond to a second exon, and

[0016]further wherein the nucleotides of (c)(i) and/or (c)(ii) encode a polypeptide involved in altering embryo/endosperm size during seed development; or

[0017](d) a nucleic acid sequence set forth in SEQ ID NO:34 or 72; or

[0018](e) the full complement of (a), (b), (c), (d), or SEQ ID NO:34; or

[0019](f) all or part of a non-coding or coding region of the isolated polynucleotide comprising sequences of (a), (b) or SEQ ID NO:34 for use in co-suppression or antisense suppression of endogenous nucleic acid sequences encoding polypeptides involved in altering embryo/endosperm size during seed development.
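The identity threshold recited in (a) can be illustrated with a small calculation. The following is a minimal sketch only, assuming that two amino acid sequences have already been aligned by some alignment program (the Clustal V method itself is not reproduced here) and that gaps are written as dashes. The function name `percent_identity`, the choice to count only ungapped columns, and the two toy peptides are assumptions made for illustration; alignment programs differ in how they treat gapped columns, so the value reported by a given program may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumption; conventions vary)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned peptides (not taken from the Sequence Listing).
query = "MTGAG-SPCGACKFLRRK"
subject = "MTGSGASPCGACKFLRRR"

identity = percent_identity(query, subject)
meets_80_percent = identity >= 80.0  # threshold recited in (a)
```

In this toy case 15 of the 17 ungapped columns match, so the pair would satisfy an 80% threshold but not a 90% threshold.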

[0020]In a second embodiment, the invention concerns a recombinant DNA construct comprising the isolated polynucleotide of the invention operably linked to at least one regulatory sequence.

[0021]In a third embodiment, the invention concerns a plant comprising in its genome the recombinant DNA construct of the invention as well as any seeds obtained from such a plant and oil obtained from such seeds. Also of interest are transformed plant tissue or plant cells comprising the recombinant DNA construct of the invention.

[0022]In a fourth embodiment, the invention concerns a method of altering embryo/endosperm size during seed development in a plant comprising:

[0023](a) transforming plant cells or plant tissue with the recombinant DNA construct of the invention;

[0024](b) regenerating transgenic plants from the transformed plant cells or plant tissue of (a);

[0025](c) screening the transgenic plants of (b) for seeds having an altered embryo/endosperm size based on a comparison of embryo/endosperm size of seeds obtained from non-transformed plants.

[0026]In a fifth embodiment, the invention concerns a method of mapping genetic variations related to controlling embryo/endosperm size and/or altering oil phenotype in plants comprising:

[0027](a) crossing two plant varieties; and

[0028](b) evaluating genetic variations with respect to

[0029](i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or

[0030](ii) a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP (restriction fragment length polymorphism) analysis, SNP (single nucleotide polymorphism) analysis, and PCR-based analysis.

[0031]In a sixth embodiment, the invention concerns a method of molecular breeding to control embryo/endosperm size and/or altering oil phenotype in plants comprising:

[0032](a) crossing two plant varieties; and

[0033](b) evaluating genetic variations with respect to

[0034](i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or

[0035](ii) a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.
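By way of illustration only, the PCR-based evaluation of genetic variation referred to in the fifth and sixth embodiments may rely on markers in which a single nucleotide difference creates or destroys a restriction site, so that amplicons from the two parental alleles give different fragment sizes after digestion. The sketch below assumes a linear amplicon and a blunt-cutting enzyme; DraI (recognition site TTTAAA, cut between TTT and AAA) is used only as a familiar example. The function name `digest_fragment_lengths` and both amplicon sequences are invented for this sketch and are not taken from the Sequence Listing.

```python
def digest_fragment_lengths(amplicon: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear amplicon at every site.

    cut_offset is the number of bases of the recognition site that remain on
    the left-hand fragment (3 for a blunt cut in the middle of a 6-base site).
    """
    amplicon = amplicon.upper()
    cut_positions = []
    start = amplicon.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = amplicon.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(amplicon)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Hypothetical 30-base amplicons differing by one base inside the site.
allele_a = "ACGTACGTAC" + "TTTAAA" + "GGCCGGCCGGCCGG"  # site present
allele_b = "ACGTACGTAC" + "TTTGAA" + "GGCCGGCCGGCCGG"  # site destroyed by an A-to-G change

fragments_a = digest_fragment_lengths(allele_a, "TTTAAA", 3)
fragments_b = digest_fragment_lengths(allele_b, "TTTAAA", 3)
```

Here allele A is cut into fragments of 13 and 17 bases while allele B remains a single 30-base fragment, which is the kind of size difference scored on a gel when genotyping progeny of a cross.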

BRIEF DESCRIPTION OF THE FIGURES AND SEQUENCE LISTINGS

[0036]The invention can be more fully understood from the following detailed description and the accompanying drawings and Sequence Listing that form a part of this application.

[0037]FIG. 1A-D shows an alignment of the nucleotide sequences obtained for wild type RE2 (SEQ ID NO:25), and mutants re2-1 (SEQ ID NO:28), re2-2 (SEQ ID NO:30), and re2-3 (SEQ ID NO:32). Changes in the nucleotide sequence are indicated by a star below the alignment and by a box around the nucleotides at that position. Numbers at the left of the alignment indicate the nucleotide position.

[0038]FIG. 2 shows an alignment of the amino acid sequences obtained for polypeptides from wild type RE2 protein (SEQ ID NO:26), and re2-1 mutant protein (SEQ ID NO:29), and re2-2 mutant protein (SEQ ID NO:31). Changes in the amino acid sequence are indicated by a star below the alignment and by a box around the amino acids at that position. As seen in FIG. 2, mutant allele re2-1 had an isoleucine at amino acid 93 instead of the highly conserved threonine; mutant allele re2-2 had a phenylalanine instead of the conserved cysteine at amino acid 37. The deletion of a nucleotide at position 75 in mutant allele re2-3 gene produced a frame shift that results in a 127 amino acid polypeptide for the re2-3 mutant protein (set forth in SEQ ID NO:33) that is quite different than the one encoded by wild type RE2 gene or mutant genes re2-1 or re2-2. Numbers at the left of the alignment indicate the amino acid position.
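The difference between a single-base substitution, which changes one amino acid, and a single-base deletion, which shifts the reading frame and changes every downstream amino acid, can be sketched as follows. This is an illustrative sketch under stated assumptions: the 21-base open reading frame is invented and is not SEQ ID NO:25, the positions are chosen for the toy sequence only, and the helper names `substitute`, `delete`, and `translate` are hypothetical. The standard genetic code is used.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def substitute(seq: str, position: int, new_base: str) -> str:
    """Replace the base at a 1-based position."""
    i = position - 1
    return seq[:i] + new_base + seq[i + 1:]


def delete(seq: str, position: int) -> str:
    """Remove the base at a 1-based position."""
    i = position - 1
    return seq[:i] + seq[i + 1:]


def translate(seq: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Invented toy ORF: ATG TGC ACC GGA AAG CTT TGA
toy_orf = "ATGTGCACCGGAAAGCTTTGA"

wild_type = translate(toy_orf)                      # "MCTGKL"
missense = translate(substitute(toy_orf, 5, "T"))   # TGC -> TTC, Cys -> Phe
frameshift = translate(delete(toy_orf, 4))          # frame shifts after ATG
```

In the toy example the G-to-T substitution changes only the second residue (cysteine to phenylalanine), whereas the deletion leaves the first residue intact and alters all residues that follow.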

[0039]FIG. 3A-C depicts the Clustal V alignment obtained for the amino acid sequences from the rice wild type RE2 protein (SEQ ID NO:26), the O. sativa protein having NCBI General Identifier No. 18652509 (SEQ ID NO:27), the A. thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164 (SEQ ID NO:54), and the amino acid sequences of the polypeptides encoded by corn clones cef1f.pk001.f4:fis (SEQ ID NO:37), cpf1c.pk006.d18a:fis (SEQ ID NO:39), cpi1c.pk005.a12:fis (SEQ ID NO:41), and cr1n.pk0028.h3a:fis (SEQ ID NO:43), Euphorbia lagascae clone eel1c.pk003.b10:fis (SEQ ID NO:45), columbine clone eav1c.pk003.c9 (SEQ ID NO:47), guar clone lds3c.pk011.j11:fis (SEQ ID NO:49), soybean clone sdr1f.pk005.d21.f:fis (SEQ ID NO:51), and wheat clone wdr1f.pk002.l10:fis (SEQ ID NO:53). The program uses dashes to maximize the alignment. An asterisk (*) below the alignment indicates amino acids conserved among all the sequences. The conserved C-block, GAS-block, and leucine zipper motifs are shown boxed. Numbers at the left of the alignment indicate the amino acid position.
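The asterisk annotation described for FIG. 3 can be sketched as a simple column-wise check. The example below assumes the sequences are already aligned to equal length with dashes as gap characters; the function name `conservation_line` and the three toy sequences are assumptions for illustration and do not reproduce the alignment of FIG. 3.

```python
def conservation_line(aligned: list[str]) -> str:
    """Return '*' under each column in which every sequence has the same residue."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have equal length")
    marks = []
    for column in zip(*aligned):
        first = column[0]
        # A column containing a gap is never marked as conserved.
        conserved = first != "-" and all(residue == first for residue in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Toy, made-up alignment of three peptides.
toy_alignment = [
    "MTGFGSPCGACKFLRRKC",
    "MTG-GSPCGACKFLRRRC",
    "MSGAGSPCGACKFLRRKC",
]

annotation = conservation_line(toy_alignment)
```

Printing the three sequences followed by `annotation` gives a display in which only the fully conserved columns carry an asterisk.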

[0040]The following sequence descriptions and sequence listings attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825.

[0041]SEQ ID NO:1 is the nucleotide sequence of oligonucleotide primer C10 6-3 used to amplify CAPS marker C10 7.7 to identify the re2 locus.

[0042]SEQ ID NO:2 is the nucleotide sequence of oligonucleotide primer C10 6-4 used to amplify CAPS marker C10 7.7 to identify the re2 locus.

[0043]SEQ ID NO:3 is the nucleotide sequence of oligonucleotide primer C10 15.9-1 used to amplify CAPS marker C10 15.9 to identify the re2 locus.

[0044]SEQ ID NO:4 is the nucleotide sequence of oligonucleotide primer C10 15.9-2 used to amplify CAPS marker C10 15.9 to identify the re2 locus.

[0045]SEQ ID NO:5 is the nucleotide sequence of oligonucleotide primer C10-7.7 2 HPYIVF used to amplify CAPS marker C10 7.7 Hpy.

[0046]SEQ ID NO:6 is the nucleotide sequence of oligonucleotide primer C10-7.7 2 HPYIVR used to amplify CAPS marker C10 7.7 Hpy.

[0047]SEQ ID NO:7 is the nucleotide sequence of oligonucleotide primer 11.5 HpyV used to amplify CAPS marker C10 11.5.

[0048]SEQ ID NO:8 is the nucleotide sequence of oligonucleotide primer C10 11.5-9 used to amplify CAPS marker C10 11.5.

[0049]SEQ ID NO:9 is the nucleotide sequence of oligonucleotide primer C10 11-5 used to amplify CAPS marker C10 11.0.

[0050]SEQ ID NO:10 is the nucleotide sequence of oligonucleotide primer 11 HinfR used to amplify CAPS marker C10 11.0.

[0051]SEQ ID NO:11 is the nucleotide sequence of oligonucleotide primer 9.6 DraIF used to amplify CAPS marker C10 9.6.

[0052]SEQ ID NO:12 is the nucleotide sequence of oligonucleotide primer 9.6 DraIR used to amplify CAPS marker C10 9.6.

[0053]SEQ ID NO:13 is the nucleotide sequence of the oligonucleotide primer E08 93KF used to amplify CAPS marker E08 93K.

[0054]SEQ ID NO:14 is the nucleotide sequence of the oligonucleotide primer E08 93KR used to amplify CAPS marker E08 93K.

[0055]SEQ ID NO:15 is the nucleotide sequence of the oligonucleotide primer E08 46KF used to amplify CAPS marker E08 46K.

[0056]SEQ ID NO:16 is the nucleotide sequence of the oligonucleotide primer E08 46KR used to amplify CAPS marker E08 46K.

[0057]SEQ ID NO:17 is the nucleotide sequence of the oligonucleotide primer K08 21KF used to amplify CAPS marker K08 21K.

[0058]SEQ ID NO:18 is the nucleotide sequence of the oligonucleotide primer K08 21KR used to amplify CAPS marker K08 21K.

[0059]SEQ ID NO:19 is the nucleotide sequence of the oligonucleotide primer K08 46KF used to amplify SNP-based marker K08 46K.

[0060]SEQ ID NO:20 is the nucleotide sequence of the oligonucleotide primer K08 46KR used to amplify SNP-based marker K08 46K.

[0061]SEQ ID NO:21 is the nucleotide sequence of the oligonucleotide primer LOB-82F used to amplify the first exon (exon 1) of RE2 wild type gene or re2 mutant gene from genomic DNA.

[0062]SEQ ID NO:22 is the nucleotide sequence of the oligonucleotide primer LOB R1 used to amplify the first exon (exon 1) of RE2 wild type gene or re2 mutant gene from genomic DNA.

[0063]SEQ ID NO:23 is the nucleotide sequence of the oligonucleotide primer LOB F2 used to amplify the second exon (exon 2) of RE2 wild type gene or re2 mutant gene from genomic DNA.

[0064]SEQ ID NO:24 is the nucleotide sequence of the oligonucleotide primer LOB R2 used to amplify the second exon (exon 2) of RE2 wild type gene or re2 mutant gene from genomic DNA.

[0065]SEQ ID NO:25 is the nucleotide sequence of the wild-type rice RE2 gene open reading frame (ORF) identified in the instant application.

[0066]SEQ ID NO:26 is the amino acid sequence of the wild-type rice RE2 protein derived from translating nucleotides 1 through 807 of SEQ ID NO:25.

[0067]SEQ ID NO:27 is the amino acid sequence of the rice protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509.

[0068]SEQ ID NO:28 is the nucleotide sequence obtained for mutant allele re2-1 gene.

[0069]SEQ ID NO:29 is the amino acid sequence of a re2-1 mutant allele protein obtained by translating nucleotides 1 through 807 of SEQ ID NO:28.

[0070]SEQ ID NO:30 is the nucleotide sequence obtained for mutant allele re2-2 gene.

[0071]SEQ ID NO:31 is the amino acid sequence of a re2-2 mutant allele protein obtained by translating nucleotides 1 through 807 of SEQ ID NO:30.

[0072]SEQ ID NO:32 is the nucleotide sequence obtained for mutant allele re2-3 gene.

[0073]SEQ ID NO:33 is the amino acid sequence of a re2-3 mutant allele protein obtained by translating nucleotides 1 through 378 of SEQ ID NO:32.

[0074]SEQ ID NO:34 is the nucleotide sequence of the approximately 9 Kb BamH I fragment from RE2G4 which comprises the RE2 wild type gene coding region. Nucleotides 1 through 4472 are 5' of the ATG initiation codon, nucleotides 4473 through 4829 correspond to the first exon, nucleotides 4830 through 5660 correspond to an intron, and nucleotides 5661 through 6110 correspond to the second exon. Nucleotides 6111 through 6113 form a termination codon.
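The coordinates given for SEQ ID NO:34 can be checked arithmetically against the 807-nucleotide open reading frame described for SEQ ID NO:25 and SEQ ID NO:26. The sketch below uses the 1-based, inclusive coordinates quoted above; the helper names and the placeholder genomic string (which only marks where each region lies and contains no real sequence) are assumptions for illustration.

```python
# 1-based, inclusive coordinates as quoted for SEQ ID NO:34.
EXON_1 = (4473, 4829)
INTRON = (4830, 5660)
EXON_2 = (5661, 6110)
STOP_CODON = (6111, 6113)


def span_length(span: tuple[int, int]) -> int:
    """Length of a 1-based inclusive interval."""
    start, end = span
    return end - start + 1


def extract(seq: str, span: tuple[int, int]) -> str:
    """Slice a 1-based inclusive interval out of a sequence string."""
    start, end = span
    return seq[start - 1:end]


def splice_coding_sequence(genomic: str) -> str:
    """Join exon 1 and exon 2, leaving out the intron and the stop codon."""
    return extract(genomic, EXON_1) + extract(genomic, EXON_2)


# Placeholder only: 'A' marks exon 1, 'C' marks exon 2, 'n' everything else.
placeholder = "n" * 4472 + "A" * 357 + "n" * 831 + "C" * 450 + "nnn"

coding = splice_coding_sequence(placeholder)
coding_length = span_length(EXON_1) + span_length(EXON_2)  # 357 + 450
codon_count = coding_length // 3
```

The two exons total 807 nucleotides, or 269 codons, which agrees with the statement that the wild-type RE2 protein is derived from translating nucleotides 1 through 807 of SEQ ID NO:25.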

[0075]SEQ ID NO:35 is the nucleotide sequence of vector pML18 used to subclone the approximately 9 Kb BamH I fragment from RE2G4 comprising the rice RE2 wild type gene coding region.

[0076]SEQ ID NO:36 is the nucleotide sequence comprising the entire cDNA insert in clone cef1f.pk001.f4:fis encoding a putative corn RE2 protein homolog.

[0077]SEQ ID NO:37 is the deduced amino acid sequence of a putative corn RE2 protein homolog derived from nucleotides 76 through 851 of SEQ ID NO:36.

[0078]SEQ ID NO:38 is the nucleotide sequence comprising the entire cDNA insert in clone cpf1c.pk006.d18a:fis encoding a putative corn RE2 protein homolog.

[0079]SEQ ID NO:39 is the deduced amino acid sequence of a putative corn RE2 protein homolog derived from nucleotides 151 through 804 of SEQ ID NO:38.

[0080]SEQ ID NO:40 is the nucleotide sequence comprising the entire cDNA insert in clone cpi1c.pk005.a12:fis encoding a putative corn RE2 protein homolog.

[0081]SEQ ID NO:41 is the deduced amino acid sequence of a putative corn RE2 protein homolog derived from nucleotides 81 through 854 of SEQ ID NO:40.

[0082]SEQ ID NO:42 is the nucleotide sequence comprising the entire cDNA insert in clone cr1n.pk0028.h3a:fis encoding a putative corn RE2 protein homolog.

[0083]SEQ ID NO:43 is the deduced amino acid sequence of a putative corn RE2 protein homolog derived from nucleotides 158 through 658 of SEQ ID NO:42.

[0084]SEQ ID NO:44 is the nucleotide sequence comprising the entire cDNA insert in clone eel1c.pk003.b10:fis encoding a putative Euphorbia RE2 protein homolog.

[0085]SEQ ID NO:45 is the deduced amino acid sequence of a putative Euphorbia RE2 protein homolog derived from nucleotides 71 through 823 of SEQ ID NO:44.

[0086]SEQ ID NO:46 is the nucleotide sequence comprising a portion of the cDNA insert in clone eav1c.pk003.c9 encoding a fragment of a putative columbine RE2 protein homolog.

[0087]SEQ ID NO:47 is the deduced amino acid sequence of a fragment of a putative columbine RE2 protein homolog derived from nucleotides 2 through 382 of SEQ ID NO:46.

[0088]SEQ ID NO:48 is the nucleotide sequence comprising the entire cDNA insert in clone lds3c.pk011.j11:fis encoding a putative guar RE2 protein homolog.

[0089]SEQ ID NO:49 is the deduced amino acid sequence of a putative guar RE2 protein homolog derived from nucleotides 146 through 898 of SEQ ID NO:48.

[0090]SEQ ID NO:50 is the nucleotide sequence comprising the entire cDNA insert in clone sdr1f.pk005.d21.f:fis encoding putative soybean RE2 protein homolog.

[0091]SEQ ID NO:51 is the deduced amino acid sequence of a putative soybean RE2 protein homolog derived from nucleotides 971 through 1609 of SEQ ID NO:50.

[0092]SEQ ID NO:52 is the nucleotide sequence comprising the entire cDNA insert in clone wdr1f.pk002.l10:fis encoding a putative wheat RE2 protein homolog.

[0093]SEQ ID NO:53 is the deduced amino acid sequence of a putative wheat RE2 protein homolog derived from nucleotides 80 through 640 of SEQ ID NO:52.

[0094]SEQ ID NO:54 is the amino acid sequence of the Arabidopsis thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164.

[0095]SEQ ID NO:55 is the consensus amino acid sequence included in the C block of RE2 protein homologs.

[0096]SEQ ID NO:56 is the amino acid sequence of the motif at the N-terminus of the 49 amino acid GAS block of RE2 protein homologs.

[0097]SEQ ID NO:57 is the amino acid sequence of the motif at the C-terminus of the 49 amino acid GAS block of RE2 protein homologs.

[0098]SEQ ID NO:58 is the amino acid sequence of the Leucine-zipper motif of RE2 protein homologs.

[0099]SEQ ID NO:59 is the nucleotide sequence of oligonucleotide primer Cpi BbsI F used to amplify genomic Zea mays RE2 gene.

[0100]SEQ ID NO:60 is the nucleotide sequence of oligonucleotide primer Cpi BbsI R used to amplify genomic Zea mays RE2 gene.

[0101]SEQ ID NO:61 is the nucleotide sequence of the genomic fragment encoding a maize RE2 protein homolog obtained by amplifying a maize genomic library with primers Cpi BbsI F and Cpi BbsI R. Nucleotides 79 through 429 correspond to the first exon, nucleotides 430 through 1363 correspond to an intron, and nucleotides 1364 through 1783 correspond to the second exon.

[0102]SEQ ID NO:62 is the nucleotide sequence of oligonucleotide primer RE2 pro Bst 2F used for amplifying a portion of the 5' region of the OsRE2 gene.

[0103]SEQ ID NO:63 is the nucleotide sequence of oligonucleotide primer RE2 PRO R BbsI used for amplifying a portion of the 5' region of the OsRE2 gene.

[0104]SEQ ID NO:64 is the nucleotide sequence of plasmid RE2Pro comprising a portion of the OsRE2 gene promoter region.

[0105]SEQ ID NO:65 is the nucleotide sequence of oligonucleotide primer RE2 TERM XbaI R used for amplifying a 780 bp fragment of the 3' terminator region from the OsRE2 gene.

[0106]SEQ ID NO:66 is the nucleotide sequence of oligonucleotide primer RE2 TERM EcoBspml used for amplifying a 780 bp fragment of the 3' terminator region from the OsRE2 gene.

[0107]SEQ ID NO:67 is the nucleotide sequence of plasmid RE2TERGEM comprising a portion of the OsRE2 gene terminator region.

[0108]SEQ ID NO:68 is the nucleotide sequence of an oligonucleotide primer that may be used to identify RE2 homologs from other plant species.

[0109]SEQ ID NO:69 is the nucleotide sequence of an oligonucleotide primer that may be used to identify RE2 homologs from other species.

[0110]SEQ ID NO:70 is the nucleotide sequence of an oligonucleotide primer that may be used to identify RE2 homologs from other plant species.

[0111]SEQ ID NO:71 is the nucleotide sequence of the "RE2 second exon probe" used to screen for cDNAs encoding RE2 proteins.

[0112]SEQ ID NO:72 is the nucleotide sequence of clone RE2 cDNA C1, the longest cDNA clone identified encoding an RE2 protein.

[0113]The Sequence Listing contains the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB standards described in Nucleic Acids Res. 13:3021-3030 (1985) and in the Biochemical J. 219 (No. 2):345-373 (1984) which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

DETAILED DESCRIPTION OF THE INVENTION

[0114]The disclosures of all references, patents, and patent applications cited herein are hereby incorporated by reference.

[0115]The terms "isolated nucleic acid fragment" and "isolated polynucleotide" are used interchangeably herein. These terms refer to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA. Nucleotides (usually found in their 5'-monophosphate form) are referred to by their single letter designation as follows: "A" for adenylate or deoxyadenylate (for RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate, "G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.
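The ambiguity codes listed above can be sketched as a lookup table, which is how a degenerate oligonucleotide is usually expanded or compared against a concrete sequence. The table below covers only the DNA codes named in this paragraph; inosine ("I") is omitted because it is a modified base rather than a set of alternatives. The function names and the five-base example oligonucleotide are hypothetical and are not any primer of the Sequence Listing.

```python
from itertools import product

# Single-letter codes from the paragraph above, mapped to the DNA bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purines
    "Y": "CT",    # pyrimidines
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def expand_degenerate(oligo: str) -> list[str]:
    """List every concrete sequence represented by a degenerate oligonucleotide."""
    choices = [IUPAC[code] for code in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


def matches(oligo: str, target: str) -> bool:
    """True if target is one of the sequences the degenerate oligo represents."""
    if len(oligo) != len(target):
        return False
    return all(t in IUPAC[o] for o, t in zip(oligo.upper(), target.upper()))


# Hypothetical degenerate oligo: R, Y and K each allow two bases.
expanded = expand_degenerate("ACRYK")
```

The example oligonucleotide represents 2 x 2 x 2 = 8 concrete sequences; `matches("ACRYK", "ACGTT")` is true, while `matches("ACRYK", "ACCTT")` is false because R does not permit C.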

[0116]It has been reported that the Lateral Organ Boundary (LOB) gene in Arabidopsis has a potential role in lateral organ development. See Shuai et al. (2002) Plant Phys. 129:747-761. Shuai et al. found LOB gene expression at the base of lateral organs in the shoots and roots of Arabidopsis. In fact, 23 members of the LOB domain family (LBD) of genes were found to exhibit expression patterns in the root tissues of Arabidopsis.

[0117]The LOB domain 18 protein is considered to belong to class I of the plant-specific Lateral Organ Boundaries (LOB) domain protein family. The class I LOB domain proteins contain a C-block, a GAS-block, and a leucine zipper motif (Shuai, B. et al., 2002, Plant Phys. 129:747-761). Thus, it is expected that an Oryza sativa RE2 protein and its homologs would also contain a C-block, a GAS-block, and a leucine zipper motif. The consensus sequences of these motifs were identified using a Clustal V alignment and are indicated in FIG. 3.

[0118]FIG. 3A-C depicts the Clustal V alignment obtained for the amino acid sequences from the rice wild type RE2 protein (SEQ ID NO:26), the O. sativa protein having NCBI General Identifier No. 18652509 (SEQ ID NO:27), the A. thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164 (SEQ ID NO:54), and the amino acid sequences of the polypeptides encoded by corn clones cef1f.pk001.f4:fis (SEQ ID NO:37), cpf1c.pk006.d18a:fis (SEQ ID NO:39), cpi1c.pk005.a12:fis (SEQ ID NO:41), and cr1n.pk0028.h3a:fis (SEQ ID NO:43), Euphorbia lagascae clone eel1c.pk003.b10:fis (SEQ ID NO:45), columbine clone eav1c.pk003.c9 (SEQ ID NO:47), guar clone lds3c.pk011.j11:fis (SEQ ID NO:49), soybean clone sdr1f.pk005.d21.f:fis (SEQ ID NO:51), and wheat clone wdr1f.pk002.l10:fis (SEQ ID NO:53). The program uses dashes to maximize the alignment. An asterisk (*) below the alignment indicates amino acids conserved among all the sequences. The conserved C-block, GAS-block, and leucine zipper motifs are shown boxed.

[0119]It has been found in the present invention that a single mutation of a rice gene encoding a member of a class I LOB domain protein family can lead to alteration of embryo/endosperm size during seed development.

[0120]The gene associated with the reduced embryo phenotype is named Reduced Embryo2 (RE2). Silencing or inhibition of this gene leads to a reduction of embryonic tissue, thus resulting in a smaller embryo size and a concomitantly larger endosperm size. Reduction of embryo size will result in seeds having a reduced amount of components such as oils. On the other hand, overexpression of this gene might lead to an increase of embryonic tissue, thus resulting in a larger embryo size and a concomitantly smaller endosperm size.

[0121]The italicized and uppercase term "RE2" as used herein refers to a genetic locus capable of expressing a Reduced Embryo 2 protein. The italicized and lowercase term "re2" as used herein refers to a mutated form of RE2. Italics are not used when referring to a protein or polypeptide encoded by the genetic locus. Thus, the uppercase term "RE2" as used herein refers to the wild type protein, and the lowercase "re2" as used herein refers to a mutant protein. As was noted above, the rice RE2 isolated polynucleotide was identified in the instant application using high fidelity mapping of DNA obtained from reduced embryo 2 (re2) mutant plants. These mutant plants produce grain that has a small embryo phenotype.

[0122]The terms "Oryza sativa RE2", "OsRE2", and "rice RE2" are used interchangeably herein. These terms refer to a polynucleotide isolated from wild-type rice and whose sequence is set forth in the instant application. The rice RE2 isolated polynucleotide is the polynucleotide that, when mutated, is responsible for a reduced embryo 2, or re2, phenotype as exemplified by Hong et al. (1996, Development 122:2051-2058). Mutant rice displaying the re2 phenotype has a reduced embryo size and an enlarged endosperm size.

[0123]The terms "subfragment that is functionally equivalent" and "functionally equivalent subfragment" are used interchangeably herein. These terms refer to a portion or subsequence of an isolated nucleic acid fragment in which the ability to alter gene expression or produce a certain phenotype is retained whether or not the fragment or subfragment encodes an active enzyme. For example, the fragment or subfragment can be used in the design of recombinant DNA constructs to produce the desired phenotype in a transformed plant. Recombinant DNA constructs can be designed for use in co-suppression or antisense by linking a nucleic acid fragment or subfragment thereof, whether or not it encodes an active enzyme, in the appropriate orientation relative to a plant promoter sequence.

[0124]The terms "homology", "homologous", "substantially similar" and "corresponding substantially" are used interchangeably herein. They refer to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment. It is therefore understood, as those skilled in the art will appreciate, that the invention encompasses more than the specific exemplary sequences.

[0125]A "homolog" can be a second gene in the same plant type or in a different plant type that has a polynucleotide sequence that is functionally identical to a sequence in the first gene. It is believed that, in general, homologs share a common evolutionary past.

[0126]The term "RE2 homolog" refers to an isolated polynucleotide encoding a class I LOB domain polypeptide obtained from a plant species, other than rice, that functions in a manner similar to that of the rice RE2 isolated polynucleotide and that, when mutated, exhibits a reduced embryo phenotype. The corn, Euphorbia lagascae, Columbine, guar, soybean, and wheat isolated polynucleotides disclosed herein appear to encode such polypeptides, namely, these polypeptides are members of a class I LOB domain protein family, have a C-like motif, a GAS-like motif, and a leucine zipper-like motif, and are useful for altering embryo/endosperm size during seed development.

[0127]A search of GenBank and Du Pont proprietary databases using the rice RE2 gene sequence or the RE2 polypeptide sequence uncovered a number of isolated polynucleotides from plants that appeared to be homologous. RE2 homologs appear to encompass those polynucleotides isolated from plants, other than rice, which appeared to encode a polypeptide that shares sequence and/or functional similarity to the polypeptide encoded by the rice RE2 isolated polynucleotide. It is believed that such a polynucleotide would comprise a subset of the polynucleotides encoding polypeptides of the class I LOB domain family, and that alteration in the expression of this polypeptide may affect embryo/endosperm size.

[0128]"Sequence identity" or "identity" in the context of nucleic acid or polypeptide sequences refers to the nucleic acid bases or amino acid residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window.

[0129]Thus, "Percentage of sequence identity" refers to the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Useful examples of percent sequence identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%. These identities can be determined using any of the programs described herein.
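The calculation described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example by a Clustal-type program) and that gaps are written as "-"; the function name `percent_identity` and the toy alignment are hypothetical, and alignment programs such as Megalign may count gapped positions differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences over the whole window.

    Matched positions are those where both sequences carry the same
    base or residue; a gap never counts as a match. The denominator is
    the total number of positions in the comparison window.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: 9 positions, 7 of them identical.
print(round(percent_identity("ATGC-TAGC", "ATGCATAGG"), 1))
```

Dividing by the full window length, gaps included, follows the wording of the definition above; other conventions (for example, ignoring gapped columns) would give a different value for the same alignment.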

[0130]Sequence alignments and percent identity or similarity calculations may be determined using a variety of comparison methods designed to detect homologous sequences including, but not limited to, the Megalign program of the LASARGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences is performed using the Clustal V method of alignment (Higgins, D. G. and Sharp, P. M. (1989) Comput. Appl. Biosci. 5:151-153; Higgins, D. G. et al. (1992) Comput. Appl. Biosci. 8:189-191) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4.

[0131]It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying polypeptides from other plant species, wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 55% to 100%. Indeed, any integer amino acid identity from 50%-100% may be useful in describing the present invention. Also of interest is any full or partial complement of this isolated nucleotide fragment.

[0132]It is believed that another way to identify genes that are homologous to the rice RE2 gene is to screen by hybridization. It is possible to hybridize cDNA at 60° C. with a probe derived from the rice RE2 gene and wash at medium stringency conditions (5×SSPE, 0.5% SDS at 65° C. followed by 1×SSPE, 0.5% SDS at 65° C.). For general hybridization protocols, see Ausubel et al. 1993, "Current Protocols in Molecular Biology" John Wiley & Sons, USA, or Sambrook et al. 1989. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press. An appropriate probe with a unique sequence can be extracted, for example, from part of exon 1 of the RE2 gene. Exon 1 of the RE2 gene has regions of sequence identity between the corn and rice RE2 nucleotide sequences. Oligonucleotide primers useful in hybridization screenings may have the sequences disclosed in SEQ ID NO:68, SEQ ID NO:69, or SEQ ID NO:70, for example. These sequences are as follows:

SEQ ID NO:68 5'-GCATCTTCGCGCCCTACTTCGACTCGG-3'
SEQ ID NO:69 5'-GCACAAGGTGTTCGGCGCCAGCAACGTGTCCAAGC-3'
SEQ ID NO:70 5'-CCGCGACCCCGTCTACGGCTGCGTCGCCCACCTC-3'
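As an illustration only, basic properties of the three oligonucleotides listed above (length and G+C fraction, both of which bear on hybridization behavior) can be tabulated with a short Python sketch. The dictionary name `PROBES` and the helper `gc_fraction` are hypothetical; the sequences are copied from SEQ ID NOs:68 through 70 as given above.

```python
# Oligonucleotide sequences as listed for SEQ ID NOs:68-70 (5' to 3').
PROBES = {
    "SEQ ID NO:68": "GCATCTTCGCGCCCTACTTCGACTCGG",
    "SEQ ID NO:69": "GCACAAGGTGTTCGGCGCCAGCAACGTGTCCAAGC",
    "SEQ ID NO:70": "CCGCGACCCCGTCTACGGCTGCGTCGCCCACCTC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PROBES.items():
    # Report the length in nucleotides and the G+C content as a percentage.
    print(f"{name}: {len(seq)} nt, {100 * gc_fraction(seq):.0f}% G+C")
```

The sketch reports descriptive values only; it does not attempt to predict melting temperature, which depends on salt and probe concentrations not modeled here.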

[0133]Genomic DNA or cDNA clones giving significant signals may be isolated and their chromosomal origin analyzed using CAPS markers or SNP-based markers similar to those described in the present Application. DNA fragments containing the region homologous to the rice RE2 gene may be further subcloned and sequenced. Polypeptides encoded by these polynucleotides should have the C-Block, GAS Block N-end and C-end, and Leu Zipper consensus sequences described in Example 6 and as set forth in SEQ ID NOs:55 through 58.

[0134]"Gene" refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences. "Recombinant DNA construct" refers to a combination of nucleic acid fragments that are not normally found together in nature. Accordingly, a recombinant DNA construct may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that normally found in nature. A "foreign" gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or recombinant DNA constructs. A "transgene" is a gene that has been introduced into the genome by a transformation procedure.

[0135]"Coding sequence" refers to a DNA sequence that codes for a specific amino acid sequence. "Regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

[0136]"Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoter sequences can also be located within the transcribed portions of genes, and/or downstream of the transcribed sequences. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of an isolated nucleic acid fragment in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause an isolated nucleic acid fragment to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, (1989) Biochemistry of Plants 15:1-82. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity.

[0137]Specific examples of promoters that may be useful in expressing the nucleic acid fragments of the invention include, but are not limited to, the oleosin promoter (PCT Publication WO99/65479, published Dec. 12, 1999), the maize 27 kD zein promoter (Ueda et al. (1994) Mol. Cell. Biol. 14:4350-4359), the ubiquitin promoter (Christensen et al. (1992) Plant Mol. Biol. 18:675-680), the SAM synthetase promoter (PCT Publication WO00/37662, published Jun. 29, 2000), the CaMV 35S promoter (Odell et al. (1985) Nature 313:810-812), and the promoter described in PCT Publication WO02/099063, published Dec. 12, 2002.

[0138]An "intron" is an intervening sequence in a gene that does not encode a portion of the protein sequence. Thus, such sequences are transcribed into RNA but are then excised and are not translated. The term is also used for the excised RNA sequences. An "exon" is a portion of the sequence of a gene that is transcribed and is found in the mature messenger RNA derived from the gene, but is not necessarily a part of the sequence that encodes the final gene product.

[0139]The "translation leader sequence" refers to a DNA sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner, R. and Foster, G. D. (1995) Molecular Biotechnology 3:225).

[0140]The "3' non-coding sequences" refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is exemplified by Ingelbrecht et al. (1989) Plant Cell 1:671-680.

[0141]"RNA transcript" refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript; when it is an RNA sequence derived from post-transcriptional processing of the primary transcript, it is referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the RNA that is without introns and that can be translated into protein by the cell. "cDNA" refers to a DNA that is complementary to and synthesized from an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into the double-stranded form using the Klenow fragment of DNA polymerase I. "Sense" RNA refers to an RNA transcript that includes the mRNA and can be translated into protein within a cell or in vitro. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target isolated nucleic acid fragment (U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes. The terms "complement" and "reverse complement" are used interchangeably herein with respect to mRNA transcripts, and are meant to define the antisense RNA of the message.
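The relationship between a sense transcript and its antisense RNA, as defined above, can be sketched in a few lines of Python. The function names `transcribe` and `antisense_rna` and the six-base toy sequence are hypothetical and serve only to illustrate the reverse-complement relationship; they are not sequences of the invention.

```python
# Watson-Crick pairing in the RNA alphabet.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def transcribe(coding_strand_dna: str) -> str:
    """Sense RNA corresponding to a DNA coding strand (T replaced by U)."""
    return coding_strand_dna.upper().replace("T", "U")


def antisense_rna(sense_rna: str) -> str:
    """Reverse complement of a sense RNA, written 5' to 3'."""
    return sense_rna.upper().translate(_RNA_COMPLEMENT)[::-1]


# Hypothetical toy coding-strand fragment.
sense = transcribe("ATGGCC")
print(sense)                 # AUGGCC
print(antisense_rna(sense))  # GGCCAU
```

Applying `antisense_rna` twice returns the original sense sequence, which is a convenient sanity check when designing antisense fragments from all or part of a transcript.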

[0142]The term "endogenous RNA" refers to any RNA which is encoded by any nucleic acid sequence present in the genome of the host prior to transformation with the recombinant construct of the present invention, whether naturally-occurring or non-naturally occurring, i.e., introduced by recombinant means, mutagenesis, etc.

[0143]The term "non-naturally occurring" means artificial, not consistent with what is normally found in nature.

[0144]The term "operably linked" refers to an association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is regulated by the other. For example, a promoter is operably linked with a coding sequence when it is capable of regulating the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in a sense or antisense orientation. In another example, the complementary RNA regions of the invention can be operably linked, either directly or indirectly, 5' to the target mRNA, or 3' to the target mRNA, or within the target mRNA, or a first complementary region is 5' and its complement is 3' to the target mRNA.

[0145]Cosuppression technology constitutes the subject matter of U.S. Pat. No. 5,231,020, which issued to Jorgensen et al. on Jul. 27, 1993. The phenomenon observed by Napoli et al. in petunia was referred to as "cosuppression" since expression of both the endogenous gene and the introduced transgene was suppressed (for reviews see Vaucheret et al., Plant J. 16:651-659 (1998); and Gura, Nature 404:804-808 (2000)).

[0146]Co-suppression constructs in plants previously have been designed by focusing on overexpression of a nucleic acid sequence having homology to an endogenous mRNA, in the sense orientation, which results in the reduction of all RNA having homology to the overexpressed sequence (see Vaucheret et al. (1998) Plant J 16:651-659; and Gura (2000) Nature 404:804-808). The overall efficiency of this phenomenon is low, and the extent of the RNA reduction is widely variable. Recent work has described the use of "hairpin" structures that incorporate all, or part, of an mRNA encoding sequence in a complementary orientation that results in a potential "stem-loop" structure for the expressed RNA (PCT Publication WO 99/53050 published on Oct. 21, 1999). This increases the frequency of co-suppression in the recovered transgenic plants. Another variation describes the use of plant viral sequences to direct the suppression, or "silencing", of proximal mRNA encoding sequences (PCT Publication WO 98/36083 published on Aug. 20, 1998). Neither of these co-suppressing phenomena has been elucidated mechanistically, although recent genetic evidence has begun to unravel this complex situation (Elmayan et al. (1998) Plant Cell 10:1747-1757).

[0147]In addition to cosuppression, antisense technology has also been used to block the function of specific genes in cells. Antisense RNA is complementary to the normally expressed RNA, and presumably inhibits gene expression by interacting with the normal RNA strand. The mechanisms by which the expression of a specific gene is inhibited by either antisense or sense RNA are on their way to being understood. However, the frequencies of obtaining the desired phenotype in a transgenic plant may vary with the design of the construct, the gene, the strength and specificity of its promoter, the method of transformation and the complexity of transgene insertion events (Baulcombe, Curr. Biol. 12(3):R82-84 (2002); Tang et al., Genes Dev. 17(1):49-63 (2003); Yu et al., Plant Cell. Rep. 22(3):167-174 (2003)). Cosuppression and antisense inhibition are also referred to as "gene silencing", "post-transcriptional gene silencing" (PTGS), RNA interference or RNAi. See for example U.S. Pat. No. 6,506,559.

[0148]MicroRNAs (miRNAs) are small regulatory RNAs that control gene expression. miRNAs bind to regions of target RNAs and inhibit their translation and, thus, interfere with production of the polypeptide encoded by the target RNA. miRNAs can be designed to be complementary to any region of the target sequence RNA including the 3' untranslated region, coding region, etc. miRNAs are derived from highly structured RNA precursors that are processed by the action of a ribonuclease III termed DICER. While the exact mechanism of action of miRNAs is unknown, it appears that they function to regulate expression of the target gene. See, e.g., U.S. Patent Publication No. 2004/0268441 A1 which was published on Dec. 30, 2004.

[0149]The term "expression", as used herein, refers to the production of a functional end-product, be it mRNA or translation of mRNA into a polypeptide. "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein. "Co-suppression" refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020).

[0150]"Overexpression" refers to the production of a functional end-product in transgenic organisms that exceeds levels of production when compared to expression of that functional end-product in a normal, wild type or non-transformed organism.

[0151]"Stable transformation" refers to the transfer of a nucleic acid fragment into a genome of a host organism, including both nuclear and organellar genomes, resulting in genetically stable inheritance. In contrast, "transient transformation" refers to the transfer of a nucleic acid fragment into the nucleus, or DNA-containing organelle, of a host organism resulting in gene expression without integration or stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms. The preferred method of cell transformation of rice, corn and other monocots is using particle-accelerated or "gene gun" transformation technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Pat. No. 4,945,050), or an Agrobacterium-mediated method (Ishida Y. et al. (1996) Nature Biotech. 14:745-750). The term "transformation" as used herein refers to both stable transformation and transient transformation.

[0152]Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press Cold Spring Harbor, 1989 (hereinafter "Sambrook").

[0153]The term "recombinant" refers to an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis or by the manipulation of isolated segments of nucleic acids by genetic engineering techniques.

[0154]"PCR" or "Polymerase Chain Reaction" is a technique for the synthesis of large quantities of specific DNA segments and consists of a series of repetitive cycles (Perkin Elmer Cetus Instruments, Norwalk, Conn.). Typically, the double-stranded DNA is heat denatured, the two primers complementary to the 3' boundaries of the target segment are annealed at low temperature and then extended at an intermediate temperature. One set of these three consecutive steps is referred to as a cycle.

[0155]Polymerase chain reaction ("PCR") is a powerful technique used to amplify DNA millions of fold, by repeated replication of a template, in a short period of time. (Mullis et al. (1986) Cold Spring Harbor Symp. Quant. Biol. 51:263-273; Erlich et al, European Patent Application 50,424; European Patent Application 84,796; European Patent Application 258,017, European Patent Application 237,362; Mullis, European Patent Application 201,184, Mullis et al U.S. Pat. No. 4,683,202; Erlich, U.S. Pat. No. 4,582,788; and Saiki et al, U.S. Pat. No. 4,683,194). The process utilizes sets of specific in vitro synthesized oligonucleotides to prime DNA synthesis. The design of the primers is dependent upon the sequences of DNA that are to be analyzed. The technique is carried out through many cycles (usually 20-50) of melting the template at high temperature, allowing the primers to anneal to complementary sequences within the template and then replicating the template with DNA polymerase.
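The statement that PCR amplifies DNA "millions of fold" over 20-50 cycles follows from simple doubling arithmetic, sketched below in Python. The function name `amplification_fold` and the `efficiency` parameter are illustrative assumptions, not part of the cited methods; an efficiency of 1.0 models ideal doubling in every cycle, and real reactions typically plateau well before the theoretical value is reached.

```python
def amplification_fold(cycles: int, efficiency: float = 1.0) -> float:
    """Theoretical fold amplification after a number of PCR cycles.

    Each cycle multiplies the template by (1 + efficiency), so ideal
    doubling (efficiency = 1.0) gives 2 ** cycles.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


for n in (20, 30):
    # Compare ideal doubling with a hypothetical 90% per-cycle efficiency.
    print(n, f"{amplification_fold(n):.3g}", f"{amplification_fold(n, 0.9):.3g}")
```

Under ideal doubling, 20 cycles already give 2^20 = 1,048,576-fold amplification, which is consistent with the lower end of the cycle range quoted above.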

[0156]The products of PCR reactions are analyzed by separation in agarose gels followed by ethidium bromide staining and visualization with UV transillumination. Alternatively, radioactive dNTPs can be added to the PCR in order to incorporate label into the products. In this case the products of PCR are visualized by exposure of the gel to x-ray film. The added advantage of radiolabeling PCR products is that the levels of individual amplification products can be quantitated.

[0157]The terms "recombinant construct", "expression construct" and "recombinant expression construct" are used interchangeably herein. These terms refer to a functional unit of genetic material that can be inserted into the genome of a cell using standard methodology well known to one skilled in the art. Such a construct may be used by itself or in conjunction with a vector. If a vector is used then the choice of vector is dependent upon the method that will be used to transform host plants as is well known to those skilled in the art. For example, a plasmid vector can be used. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells comprising any of the isolated nucleic acid fragments of the invention. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al. (1985) EMBO J. 4:2411-2418; De Almeida et al. (1989) Mol. Gen. Genetics 218:78-86), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by Southern analysis of DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.

[0158]"Motifs" or "subsequences" refer to relatively short conserved regions of nucleic acids or amino acids that comprise part of a longer sequence. For example, it is expected that such conserved subsequences, such as those exemplified in SEQ ID NOs:49, 50, 52, and 52, would be important for function and could be used to identify new homologues of class I LOB domain proteins involved in controlling embryo/endosperm size in plants. It is expected that some or all of the elements may be found in an RE2 homolog. Also, it is expected that one or two of the conserved amino acids in any given motif may differ in a true RE2 homolog.

[0159]Thus, in one aspect, this invention concerns an isolated polynucleotide comprising:
[0160](a) a nucleic acid sequence encoding a polypeptide involved in altering embryo/endosperm size during seed development, said polypeptide having at least 80% amino acid sequence identity, based on the Clustal V method of alignment, when compared to an amino acid sequence selected from the group consisting of SEQ ID NOs:37, 39, 41, 43, 45, 47, 49, 51, and 53; or
[0161](b) a nucleic acid sequence set forth in SEQ ID NO:25 wherein said sequence comprises at least one of the following modifications:
[0162](i) nucleotide 271 is a T residue instead of a C;
[0163](ii) nucleotide 110 is a T residue instead of a G; or
[0164](iii) nucleotide 75 is deleted; or
[0165](c) a nucleic acid sequence set forth in SEQ ID NO:34 wherein
[0166](i) nucleotides 4473 through 4829 correspond to a first exon, and
[0167](ii) nucleotides 5661 through 6110 correspond to a second exon, and
[0168]further wherein the nucleotides of (c)(i) and/or (c)(ii) encode a polypeptide involved in altering embryo/endosperm size during seed development,
[0169](d) the nucleic acid sequence set forth in SEQ ID NO:34 or 72; or
[0170](e) the full complement of (a), (b), (c), (d), or SEQ ID NO:34; or
[0171](f) all or part of a non-coding or coding region of the isolated polynucleotide comprising sequences of (a), (b) or SEQ ID NO:34 for use in co-suppression or antisense suppression of endogenous nucleic acid sequences encoding polypeptides involved in altering embryo/endosperm size during seed development.
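The nucleotide positions recited above are 1-based and inclusive, which is easy to mishandle in software that indexes from zero. The following Python sketch shows one way such coordinates might be handled. All function names are hypothetical, and because the sequences of SEQ ID NO:25 and SEQ ID NO:34 are not reproduced in this section, a short placeholder string stands in for a real sequence.

```python
# Exon boundaries recited in (c)(i) and (c)(ii): 1-based, inclusive.
EXONS = [(4473, 4829), (5661, 6110)]


def exon_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def splice(genomic: str, exons: list[tuple[int, int]]) -> str:
    """Join exon intervals (1-based inclusive) from a genomic sequence."""
    return "".join(genomic[start - 1:end] for start, end in exons)


def substitute(seq: str, pos: int, expected: str, new: str) -> str:
    """Replace the base at 1-based position pos, checking the original base."""
    if seq[pos - 1] != expected:
        raise ValueError(f"position {pos} is {seq[pos - 1]}, not {expected}")
    return seq[:pos - 1] + new + seq[pos:]


def delete(seq: str, pos: int) -> str:
    """Remove the base at 1-based position pos."""
    return seq[:pos - 1] + seq[pos:]


lengths = [exon_length(s, e) for s, e in EXONS]
print(lengths, sum(lengths))  # [357, 450] 807

toy = "AACCGGTT"  # placeholder only, not a sequence of the invention
print(substitute(toy, 3, "C", "T"))  # AATCGGTT
print(delete(toy, 1))                # ACCGGTT
```

With the recited boundaries, the two exons are 357 and 450 nucleotides long, 807 nucleotides in total, which is a multiple of three. Checking the expected original base before substituting guards against off-by-one errors when modifications such as those of (b)(i) through (b)(iii) are applied to a real sequence.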

[0172]Also of interest are recombinant DNA constructs comprising an isolated polynucleotide comprising any of the nucleotide sequences described herein operably linked in a sense or anti-sense orientation to at least one regulatory sequence. Such constructs can then be used to transform plants, plant tissue, or plant cells. Transformation methods are well known to those skilled in the art and are described above. Any plant, dicot or monocot can be transformed with such recombinant DNA constructs.

[0173]Examples of monocots include, but are not limited to, corn, wheat, rice, sorghum, millet, barley, palm, lily, Alstroemeria, rye, and oat.

[0174]Examples of dicots include, but are not limited to, soybean, rape, sunflower, canola, grape, guayule, columbine, cotton, tobacco, peas, beans, flax, safflower, and alfalfa.

[0175]Plant tissue includes differentiated and undifferentiated tissues or plants, including but not limited to, roots, stems, shoots, leaves, pollen, seeds, tumor tissue, and various forms of cells and culture such as single cells, protoplasm, embryos, and callus tissue. The plant tissue may be in plant or in organ, tissue or cell culture.

[0176]The term "plant organ" refers to plant tissue or a group of tissues that constitute a morphologically and functionally distinct part of a plant. The term "genome" refers to the following: 1. The entire complement of genetic material (genes and non-coding sequences) present in each cell of an organism, or virus or organelle. 2. A complete set of chromosomes inherited as a (haploid) unit from one parent. The term "stably integrated" refers to the transfer of a nucleic acid fragment into the genome of a host organism or cell resulting in genetically stable inheritance.

[0177]Also within the scope of this invention are seeds obtained from such transformed plants and oil obtained from these seeds.

[0178]In another aspect, this invention concerns a method of altering embryo/endosperm size during seed development in a plant comprising:
[0179](a) transforming plant cells or plant tissue with the recombinant DNA construct of the invention;
[0180](b) regenerating transgenic plants from the transformed plant cells or plant tissue of (a);
[0181](c) screening the transgenic plants of (b) for seeds having an altered embryo/endosperm size based on a comparison of embryo/endosperm size of seeds obtained from non-transformed plants.

[0182]The regeneration, development, and cultivation of plants from single plant protoplast transformants or from various transformed explants is well known in the art (Weissbach and Weissbach (Eds.), In: Methods for Plant Molecular Biology, Academic Press, Inc., San Diego, Calif. (1988)). This regeneration and growth process typically includes the steps of selecting transformed cells and culturing those individualized cells through the usual stages of embryonic development through the rooted plantlet stage. Transgenic embryos and seeds are similarly regenerated. The resulting transgenic rooted shoots are thereafter planted in an appropriate plant growth medium such as soil. Preferably, the regenerated plants are self-pollinated to provide homozygous transgenic plants. Otherwise, pollen obtained from the regenerated plants is crossed to seed-grown plants of agronomically important lines. Conversely, pollen from plants of these important lines is used to pollinate regenerated plants. A transgenic plant of the present invention containing a desired polypeptide is cultivated using methods well known to one skilled in the art.

[0183]There are a variety of methods for the regeneration of plants from plant tissue. The particular method of regeneration will depend on the starting plant tissue and the particular plant species to be regenerated.

[0184]Methods for transforming dicots, primarily using Agrobacterium tumefaciens, and obtaining transgenic plants have been published for cotton (U.S. Pat. No. 5,004,863, U.S. Pat. No. 5,159,135, U.S. Pat. No. 5,518,908); soybean (U.S. Pat. No. 5,569,834, U.S. Pat. No. 5,416,011, McCabe et al. (1988) Bio/Technology 6:923, Christou et al. (1988) Plant Physiol. 87:671-674); Brassica (U.S. Pat. No. 5,463,174); peanut (Cheng et al. (1996) Plant Cell Rep. 15:653-657, McKently et al. (1995) Plant Cell Rep. 14:699-703); papaya and pea (Grant et al. (1995) Plant Cell Rep. 15:254-258).

[0185]Transformation of monocotyledons using electroporation, particle bombardment, and Agrobacterium has also been reported. Transformation and plant regeneration have been achieved in asparagus (Bytebier et al., Proc. Natl. Acad. Sci. (USA) (1987) 84:5354); barley (Wan and Lemaux (1994) Plant Physiol. 104:37); Zea mays (Rhodes et al. (1988) Science 240:204, Gordon-Kamm et al. (1990) Plant Cell 2:603-618, Fromm et al. (1990) Bio/Technology 8:833; Koziel et al. (1993) Bio/Technology 11:194, Armstrong et al. (1995) Crop Science 35:550-557); oat (Somers et al. (1992) Bio/Technology 10:1589); orchard grass (Horn et al. (1988) Plant Cell Rep. 7:469); rice (Toriyama et al. (1986) Theor. Appl. Genet. 205:34; Park et al. (1996) Plant Mol. Biol. 32:1135-1148; Abedinia et al. (1997) Aust. J. Plant Physiol. 24:133-141; Zhang and Wu (1988) Theor. Appl. Genet. 76:835; Zhang et al. (1988) Plant Cell Rep. 7:379; Battraw and Hall (1992) Plant Sci. 86:191-202; Christou et al. (1991) Bio/Technology 9:957); rye (De la Pena et al. (1987) Nature 325:274); sugarcane (Bower and Birch (1992) Plant J. 2:409); tall fescue (Wang et al. (1992) Bio/Technology 10:691), and wheat (Vasil et al. (1992) Bio/Technology 10:667; U.S. Pat. No. 5,631,152).

[0186]"Plant" includes reference to whole plants, plant organs, plant tissues, seeds and plant cells and progeny of same. Plant cells include, without limitation, cells from seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and microspores.

[0187]"Progeny" comprises any subsequent generation of a plant.

[0188]"Transgenic plant" includes reference to a plant which comprises within its genome a heterologous polynucleotide. Preferably, the heterologous polynucleotide is stably integrated within the genome such that the polynucleotide is passed on to successive generations. The heterologous polynucleotide may be integrated into the genome alone or as part of a recombinant DNA construct.

[0189]Assays for gene expression based on the transient expression of cloned nucleic acid constructs have been developed by introducing the nucleic acid molecules into plant cells by polyethylene glycol treatment, electroporation, or particle bombardment (Marcotte et al., Nature 335:454-457 (1988); Marcotte et al., Plant Cell 1:523-532 (1989); McCarty et al., Cell 66:895-905 (1991); Hattori et al., Genes Dev. 6:609-18 (1992); Goff et al., EMBO J. 9:2517-2522 (1990)).

[0190]Transient expression systems may be used to functionally dissect isolated nucleic acid fragment constructs (see generally, Maliga et al., Methods in Plant Molecular Biology, Cold Spring Harbor Press (1995)). It is understood that any of the nucleic acid molecules of the present invention can be introduced into a plant cell in a permanent or transient manner in combination with other genetic elements such as vectors, promoters, enhancers etc.

[0191]In addition to the above-discussed procedures, the standard resource materials which describe specific conditions and procedures for the construction, manipulation and isolation of macromolecules (e.g., DNA molecules, plasmids, etc.), generation of recombinant organisms and screening and isolating of clones (see for example, Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (1989); Maliga et al., Methods in Plant Molecular Biology, Cold Spring Harbor Press (1995); Birren et al., Genome Analysis: Detecting Genes, 1, Cold Spring Harbor, N.Y. (1998); Birren et al., Genome Analysis: Analyzing DNA, 2, Cold Spring Harbor, N.Y. (1998); Plant Molecular Biology: A Laboratory Manual, eds. Clark, Springer, New York (1997)) are well known.

[0192]In another aspect, this invention concerns a method of mapping genetic variations related to controlling embryo/endosperm size during seed development and/or altering oil phenotypes in plants comprising: (a) crossing two plant varieties; and (b) evaluating genetic variations with respect to a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.

[0193]The terms "mapping genetic variation" or "mapping genetic variability" are used interchangeably and define the process of identifying changes in DNA sequence, whether from natural or induced causes, within a genetic region that differentiates between different plant lines, cultivars, varieties, families, or species. The genetic variability at a particular locus (gene) due to even minor base changes can alter the pattern of restriction enzyme digestion fragments that can be generated. Pathogenic alterations to the genotype can be due to deletions or insertions within the gene being analyzed or even single nucleotide substitutions that can create or delete a restriction enzyme recognition site. Restriction fragment length polymorphism (RFLP) analysis takes advantage of this and utilizes Southern blotting with a probe corresponding to the isolated nucleic acid fragment of interest.

[0194]Thus, if a polymorphism (i.e., a commonly occurring variation in a gene or segment of DNA; also, the existence of several forms of a gene (alleles) in the same species) creates or destroys a restriction endonuclease cleavage site, or if it results in the loss or insertion of DNA (e.g., a variable nucleotide tandem repeat (VNTR) polymorphism), it will alter the size or profile of the DNA fragments that are generated by digestion with that restriction endonuclease. As such, individuals that possess a variant sequence can be distinguished from those having the original sequence by restriction fragment analysis. Polymorphisms that can be identified in this manner are termed "restriction fragment length polymorphisms" ("RFLPs"). RFLPs have been widely used in human and plant genetic analyses (Glassberg, UK Patent Application 2135774; Skolnick et al., Cytogen. Cell Genet. 32:58-67 (1982); Botstein et al., Am. J. Hum. Genet. 32:314-331 (1980); Fischer et al., PCT Application WO 90/13668; Uhlen, PCT Application WO 90/11369).
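The effect described above can be illustrated with a minimal in silico digest. The sketch below is not part of the disclosed methods: the two toy 50-nucleotide sequences, the function name `digest`, and the choice of the EcoRI recognition site (GAATTC, cut after the first G) are assumptions made purely for illustration.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of `site`; `cut_offset` is the cut position within the site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical alleles: the variant carries one substitution (A -> G)
# that destroys the second recognition site.
original = "A" * 10 + "GAATTC" + "T" * 20 + "GAATTC" + "C" * 8
variant = "A" * 10 + "GAATTC" + "T" * 20 + "GAGTTC" + "C" * 8

original_fragments = digest(original, "GAATTC", 1)
variant_fragments = digest(variant, "GAATTC", 1)
print("original allele fragments:", original_fragments)
print("variant allele fragments: ", variant_fragments)
```

Under these assumptions the original allele yields three fragments and the variant yields two, which is the kind of size difference that restriction fragment analysis detects on a gel or blot.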

[0195]A central attribute of "single nucleotide polymorphisms" or "SNPs" is that the site of the polymorphism is at a single nucleotide. SNPs have certain reported advantages over RFLPs or VNTRs. First, SNPs are more stable than other classes of polymorphisms. Their spontaneous mutation rate is approximately 10⁻⁹ (Kornberg, DNA Replication, W.H. Freeman & Co., San Francisco, 1980), approximately 1,000 times less frequent than VNTRs (U.S. Pat. No. 5,679,524). Second, SNPs occur at greater frequency, and with greater uniformity, than RFLPs and VNTRs. As SNPs result from sequence variation, new polymorphisms can be identified by random sequencing of genomic or cDNA molecules. SNPs can also result from deletions, point mutations and insertions. Any single base alteration, whatever the cause, can be a SNP. The greater frequency of SNPs means that they can be more readily identified than the other classes of polymorphisms.

[0196]SNPs can be characterized using any of a variety of methods. Such methods include the direct or indirect sequencing of the site, the use of restriction enzymes where the respective alleles of the site create or destroy a restriction site, the use of allele-specific hybridization probes, and the use of antibodies that are specific for the proteins encoded by the different alleles of the polymorphism or by other biochemical interpretation. SNPs can be sequenced by a number of methods. Two basic methods may be used for DNA sequencing, the chain termination method of Sanger et al., Proc. Natl. Acad. Sci. (U.S.A.) 74:5463-5467 (1977), and the chemical degradation method of Maxam and Gilbert, Proc. Natl. Acad. Sci. (U.S.A.) 74:560-564 (1977).

[0197]Furthermore, single point mutations can be detected by modified PCR techniques such as the ligase chain reaction ("LCR") and PCR-single strand conformational polymorphisms ("PCR-SSCP") analysis. The PCR technique can also be used to identify the level of expression of genes in extremely small samples of material, e.g., tissues or cells from a body. The technique is termed reverse transcription-PCR ("RT-PCR").

[0198]In another embodiment, this invention concerns a method of molecular breeding to obtain altered embryo/endosperm size during seed development and/or altered oil phenotypes in plants comprising: (a) crossing two plant varieties; and (b) evaluating genetic variations with respect to: (i) a nucleic acid sequence selected from the group consisting of SEQ ID NOs:25, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, and 72; or (ii) a nucleic acid sequence encoding a polypeptide selected from the group consisting of SEQ ID NOs:26, 29, 31, 33, 37, 39, 41, 43, 45, 47, 49, 51, and 53; in progeny plants resulting from the cross of step (a) wherein the evaluation is made using a method selected from the group consisting of RFLP analysis, SNP analysis, and PCR-based analysis.

[0199]The term "molecular breeding" defines the process of tracking molecular markers during the breeding process. It is common for the molecular markers to be linked to phenotypic traits that are desirable. By following the segregation of the molecular marker or genetic trait, instead of scoring for a phenotype, the breeding process can be accelerated by growing fewer plants and eliminating assaying or visual inspection for phenotypic variation. The molecular markers useful in this process include, but are not limited to, any marker useful in identifying mappable genetic variations previously mentioned, as well as any closely linked genes that display synteny across plant species. The term "synteny" refers to the conservation of gene placement/order on chromosomes between different organisms. This means that two or more genetic loci, that may or may not be closely linked, are found on the same chromosome among different species. Another term for synteny is "genome colinearity".

EXAMPLES

[0200]The present invention is further defined in the following Examples, in which parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, various modifications of the invention in addition to those set forth and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.

[0201]The disclosure of each reference set forth herein is incorporated herein by reference in its entirety.

Example 1

Mapping of the Oryza sativa RE2 Locus to a Single Chromosome

[0202]Identification of the chromosome comprising the Oryza sativa RE2 locus was performed using Cleaved Amplified Polymorphic Sequence markers (CAPS markers). The Oryza sativa RE2 locus comprises the polynucleotide that, when mutated, is responsible for a reduced embryo 2, or re2, mutant phenotype as exemplified by Hong et al. (1996, Development 122:2051-2058). Mutant rice grains displaying the re2 phenotype show a reduced embryo size and an increased endosperm size. CAPS markers covering the entire rice genome were developed and, as set forth below, were used to identify the portion of the chromosome comprising the Oryza sativa RE2 locus.

Development of CAPS Markers

[0203]Mapping of the RE2 locus to a single chromosome required first developing CAPS markers covering the entire rice genome. CAPS markers were developed as follows.

[0204]Oligonucleotide primer sets were designed based on rice genomic sequence information available in the NCBI database. Information relating to the position of the sequences in the rice chromosomes was retrieved from the web sites of the Rice Genome Research Program (RGP), Tsukuba, Japan, or the Clemson University Genomics Institute, Clemson, S.C. The oligonucleotide primer sets were used to amplify portions of genomic DNA prepared from Indica (cv. Kasalath), Japonica (cv. Taichung 65), and Japonica (cv. Kinmaze) rice. The amplified fragments were digested with restriction endonucleases and polymorphisms were identified between the three wild type rice cultivars as follows.

[0205]Genomic DNA was prepared from leaves of the three rice cultivars as follows. A 3 g piece from the leaf blade was ground using a mortar and pestle and suspended in 8 mL DNA extraction buffer (0.1 M ethylenediaminetetraacetic acid [EDTA], 1% N-lauroylsarcosine, 100 μg/mL proteinase K). The suspended sample was incubated at 50° C. for 1 hour, and debris was removed by centrifuging at 3,400 rpm for 15 minutes using an RT-7 Plus centrifuge (Sorvall®) and transferring the supernatant to a fresh tube. The DNA was precipitated by adding 2 volumes of 100% ethanol and separated by centrifuging at 10,000 rpm for 15 minutes at 4° C. using an RC-5B centrifuge (Sorvall®). The DNA pellet was resuspended in 8 mL TE (10 mM tris, 1 mM EDTA) and reprecipitated with 16 mL 100% ethanol. After separation of the DNA pellet by centrifugation, it was resuspended in 3.7 mL TE, 50 μL 10 mg/mL ethidium bromide were added, the volume was brought up to 4 mL with TE, and 4.4 g CsCl were added. The solution was transferred to an OptiSeal® tube (Beckman) and centrifuged for 16 hours at 52,000 rpm at 25° C. using an NVT65.2 rotor in an L8-M centrifuge (Beckman). After centrifugation the DNA band was visualized using a UV lamp and 500 μL removed using an 18-gauge needle in a 1 mL syringe. The DNA band was transferred to a 1.5 mL tube and the ethidium bromide removed by adding 500 μL isopropanol saturated with 20×SSPE buffer, centrifuging at 14,000 rpm for 30 seconds using a 5415C centrifuge (Eppendorf), and discarding the isopropanol phase. Removal of the ethidium bromide was accomplished by repeating addition of isopropanol and centrifugation 6 times. The DNA was then precipitated by adding 100 μL TE and 500 μL 100% ethanol and separated by centrifuging at 14,000 rpm for 15 minutes. The recovered DNA pellet was resuspended in 400 μL TE and 40 μL 3 M NaOAc. The DNA was precipitated one more time with the addition of 1 mL 100% ethanol, separated by centrifuging at 14,000 rpm for 15 minutes, rinsed with 500 μL 70% ethanol, dried, and resuspended in water to a concentration of 10 ng/μL. The genomic DNA was amplified using the oligonucleotide primer sets designed above using the following PCR conditions:

[0206]Amplifications were performed in 30 μL reactions containing 1 μL DNA prepared above (at 10 ng/μL concentration), 2 μL of 2.5 mM dNTPs, 2 μL 25 mM MgCl2, 10 pmole of each primer, 0.3 μL AmpliTaq Gold (Perkin Elmer, Wellesley, Mass.), and 3 μL 10×PCR buffer. Amplification of DNA was performed by heating the reactions at 95° C. for 10 minutes followed by 40 cycles of 94° C. for 30 seconds, 56° C. for 30 seconds, and 72° C. for 30 seconds. Termination of the amplification reactions was accomplished by heating the reactions at 72° C. for 5 minutes.
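For orientation, the stock volumes stated above can be converted to approximate final concentrations in the 30 μL reaction using the usual dilution relation C1·V1 = C2·V2. This is only a sketch of that arithmetic; the variable names are hypothetical, and whether "2.5 mM dNTPs" refers to each nucleotide or to the total is not stated in the text and is not assumed here.

```python
REACTION_UL = 30.0  # total reaction volume stated in the text


def final_conc(stock_conc: float, volume_ul: float,
               total_ul: float = REACTION_UL) -> float:
    """Dilution relation C1*V1 = C2*V2, solved for the final concentration."""
    return stock_conc * volume_ul / total_ul


dntp_mM = final_conc(2.5, 2.0)     # 2 uL of 2.5 mM dNTP stock
mgcl2_mM = final_conc(25.0, 2.0)   # 2 uL of 25 mM MgCl2 (buffer may add more)
buffer_x = final_conc(10.0, 3.0)   # 3 uL of 10x PCR buffer
template_ng = 10.0 * 1.0           # 1 uL of template at 10 ng/uL
primer_uM = 10.0 / REACTION_UL     # 10 pmol in 30 uL; pmol/uL equals uM

print(f"dNTPs:    {dntp_mM:.3f} mM")
print(f"MgCl2:    {mgcl2_mM:.3f} mM (added)")
print(f"buffer:   {buffer_x:.1f}x")
print(f"template: {template_ng:.0f} ng")
print(f"primers:  {primer_uM:.3f} uM each")
```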

[0207]Amplified DNA fragments were then digested with restriction endonucleases having 4 or 5 base recognition sites. Restriction endonuclease digestions were performed in 15 μL digestion reactions containing 2 μL of amplified DNA, 1.5 μL 10× reaction buffer, and 0.5 μL restriction enzyme. The digestion reactions were incubated for 1 hour at either 37° C. or 60° C., depending on the restriction endonuclease being utilized. Digested DNA products were loaded on a 2.5% agarose gel and separated by electrophoresis to analyze polymorphisms. Comparison of the CAPS markers developed for Japonica and Indica rice allowed the development of 26 CAPS markers for wild type rice.

Mapping of the Oryza sativa RE2 Locus to a Single Chromosome

[0208]Linkage between CAPS markers obtained for wild type rice and those obtained for re2 mutant plants was then analyzed. CAPS markers were prepared with genomic DNA from F3 Japonica rice plants whose F2 seed showed the re2 phenotype and were compared to the CAPS markers prepared above. Two markers on chromosome 10 (markers C10 7.7 and C10 15.9) showed co-segregation with the re2-1 phenotype and were identified as follows.

[0209]Plants displaying an re2 mutant phenotype were obtained by crossing a Japonica cv. Taichung 65 mutant plant showing the re2-1 mutant phenotype with a plant of the Indica cultivar Kasalath and scoring the embryo phenotype of F2 mature seeds using a dissecting microscope. Twenty-eight (28) seeds showing the re2 mutant phenotype were sterilized and sown in soil. Genomic DNA was extracted from the leaves of these 28 F3 re2 mutant plants as follows. Leaf samples, weighing 300 mg, were ground to powder in liquid nitrogen using a mortar and pestle. Each sample was then suspended in 750 μL extraction buffer containing 1.5 M NaCl, 0.2 M EDTA, 1 M tris and 3% CTAB (cetyltrimethylammonium bromide) and vortexed. Proteins were removed by adding 50 μL chloroform to the samples, shaking for 20 minutes, centrifuging briefly in a microfuge, and decanting the supernatant, containing the DNA, into a new tube. Genomic DNA was precipitated by adding 300 μL isopropanol, mixing by quick vortexing, and allowing the aqueous phase to precipitate. The pellet, containing the DNA, was recovered in H2O and used in amplification reactions as follows.

[0210]Marker C10 7.7 was amplified using oligonucleotide primers C10 6-3 and C10 6-4. Oligonucleotide primers C10 6-3 and C10 6-4 were developed as described above, have the nucleotide sequences set forth in SEQ ID NO:1 and SEQ ID NO:2, respectively, and have the sequences set forth as follows:

SEQ ID NO:1: 5'-TAGCAGCTGGGAAGAACAACATG-3'
SEQ ID NO:2: 5'-CGTGCACCACGTAACGTTAAGC-3'
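As a quick sanity check on the primer pair listed above, the following sketch computes length, GC content, and a rough melting temperature for SEQ ID NO:1 and SEQ ID NO:2. The Wallace rule used here (2° C. per A/T, 4° C. per G/C) is a coarse rule of thumb chosen for illustration; it is an assumption of this sketch and is not stated to be the method used to design the primers.

```python
# Primer sequences copied from SEQ ID NO:1 and SEQ ID NO:2 (5' to 3').
PRIMERS = {
    "SEQ ID NO:1": "TAGCAGCTGGGAAGAACAACATG",
    "SEQ ID NO:2": "CGTGCACCACGTAACGTTAAGC",
}


def gc_count(seq: str) -> int:
    """Number of G and C bases in the primer."""
    return sum(seq.count(base) for base in "GC")


def wallace_tm(seq: str) -> int:
    """Rough melting temperature estimate: 2*(A+T) + 4*(G+C), in deg C."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    gc_pct = 100.0 * gc_count(seq) / len(seq)
    print(f"{name}: length={len(seq)} nt, GC={gc_pct:.1f}%, "
          f"Wallace Tm ~ {wallace_tm(seq)} C")
```

A matched estimate for the two primers of a pair is generally desirable because both must anneal at the single annealing temperature used in the cycling program.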

[0211]Polymorphism was observed on CAPS marker C10 7.7 when the amplified DNA was digested with the restriction endonuclease Dde I, loaded on a 2.5% agarose gel, and separated by electrophoresis. Comparison of C10 7.7 CAPS markers allowed the identification of 4 recombination breakpoints between DNA prepared from wild type plants and that obtained from re2 mutant plants.

[0212]Marker C10 15.9 was amplified using oligonucleotide primers C10 15.9-1 and C10 15.9-2. Oligonucleotide primers C10 15.9-1 and C10 15.9-2 were developed as described above, have the nucleotide sequences set forth in SEQ ID NO:3 and SEQ ID NO:4, respectively, and have the sequences set forth as follows:

SEQ ID NO:3: 5'-CAGGGTTGTGTAAGGATCGTTG-3'
SEQ ID NO:4: 5'-GATCATCGTGTAGTACCAGGAC-3'

[0213]Polymorphism was observed on CAPS marker C10 15.9 when the amplified DNA was digested with the restriction endonuclease Msp I. This digestion produced additional bands in the Indica (Kasalath) background. Comparison of marker C10 15.9 prepared from DNA obtained from wild type plants with marker C10 15.9 prepared from DNA obtained from re2 mutant plants allowed the identification of 4 recombination breakpoints different from the ones identified with CAPS marker C10 7.7.

[0214]As explained above, comparison of CAPS markers prepared from DNA obtained from wild type rice and that obtained from F3 rice plants whose F2 seed showed re2 phenotype allowed the identification of 4 recombination breakpoints in CAPS marker C10 7.7 and 4 different recombination breakpoints in CAPS marker C10 15.9. These results indicate that the RE2 locus which contains the polynucleotide that when mutated is responsible for a re2 mutant phenotype maps to a region on chromosome 10 flanked by markers C10 7.7 and C10 15.9.
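The breakpoint counts above can be turned into an approximate recombination fraction for each flanking marker. The sketch below assumes, as is common when mapping with homozygous recessive F2 individuals, that each of the 28 plants contributes two informative chromosomes; that assumption, and the function name, are illustrative and are not taken from the text.

```python
def recombination_fraction(breakpoints: int, plants: int,
                           chromosomes_per_plant: int = 2) -> float:
    """Recombinant chromosomes divided by total chromosomes scored.

    Assumes every plant is homozygous for the mutant allele, so both
    homologous chromosomes are informative (an assumption of this sketch).
    """
    return breakpoints / (plants * chromosomes_per_plant)


PLANTS = 28  # re2 mutant plants genotyped in Example 1
rf_c10_7_7 = recombination_fraction(4, PLANTS)   # marker C10 7.7
rf_c10_15_9 = recombination_fraction(4, PLANTS)  # marker C10 15.9

print(f"C10 7.7:  {rf_c10_7_7:.3f}")
print(f"C10 15.9: {rf_c10_15_9:.3f}")
```

Because the two sets of breakpoints are distinct, the locus is placed between the two markers rather than to one side of both.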

Example 2

Map-Based Cloning of the Oryza sativa RE2 Gene

[0215]In Example 1 the RE2 locus, comprising the RE2 gene, was mapped to a region on chromosome 10 flanked by markers C10 7.7 and C10 15.9. This Example describes cloning of the RE2 gene from F2 recombinant plants produced by crossing a re2-1 mutant plant (Japonica cv. Taichung 65) with an Indica cultivar, Kasalath using CAPS markers as follows.

[0216]F2 seeds obtained from self-fertilized F1 plants were screened for the re2 mutant phenotype to obtain populations for cloning the RE2 gene. Seeds (308) displaying an re2 mutant phenotype were germinated on MS medium containing 0.3% gelrite and incubated in a growth chamber for 3 weeks with a 16 hour light/8 hour dark cycle. When the plants on the plates were at third leaf stage, 5-10 mm of the tip of the leaf was removed and used for DNA amplification. Direct PCR amplification reactions were carried out as described in Klimyuk et al. (1993 Plant J. 3:493-494) with a modification of extending the sample boiling time to 4 minutes after the neutralization step. Briefly, the leaf tissue was collected in a sterile vial containing 40 μL of 0.25 M NaOH and incubated 30 seconds in a boiling water bath. Samples were neutralized by adding 40 μL 0.25 M HCl and 20 μL 0.5 M Tris-HCl, pH 8.0 containing 0.25% (v/v) Nonidet P-40 and boiling for an additional 4 minutes. Tissue samples were used immediately for amplification or stored at 4° C. until needed. Each 30 μL amplification reaction contained 10 pmole of each primer, 2 μL of 2.5 mM dNTPs, 2 μL of 25 mM MgCl2, 1 μL leaf extract, 0.3 μL AmpliTaq Gold (Perkin Elmer), and 3 μL PCR buffer. The thermal cycler was set to 95° C. for 10 minutes, followed by 40 cycles of 94° C. for 4 minutes, 50° C. for 30 seconds, and 72° C. for 30 seconds followed by heating at 72° C. for 5 minutes.
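The neutralization step above can be checked with simple mole arithmetic. The sketch below assumes that the Nonidet P-40 is a component of the 20 μL Tris-HCl solution (the reading suggested by the wording) and ignores the volume of the leaf tissue; both points, and all variable names, are assumptions of this illustration.

```python
# Volumes in microliters, concentrations in mol/L; uL * mol/L gives umol.
NAOH_UL, NAOH_M = 40.0, 0.25
HCL_UL, HCL_M = 40.0, 0.25
TRIS_UL, TRIS_M = 20.0, 0.5
NP40_PERCENT_IN_TRIS = 0.25  # % (v/v), assumed to be in the Tris solution

naoh_umol = NAOH_UL * NAOH_M  # base added for lysis
hcl_umol = HCL_UL * HCL_M     # acid added for neutralization
net_base_umol = naoh_umol - hcl_umol

total_ul = NAOH_UL + HCL_UL + TRIS_UL
tris_final_M = TRIS_M * TRIS_UL / total_ul
np40_final_percent = NP40_PERCENT_IN_TRIS * TRIS_UL / total_ul

print(f"NaOH: {naoh_umol} umol, HCl: {hcl_umol} umol, net base: {net_base_umol} umol")
print(f"final volume: {total_ul} uL")
print(f"Tris-HCl: {tris_final_M} M, Nonidet P-40: {np40_final_percent}% (v/v)")
```

Equal molar amounts of acid and base leave the Tris-HCl buffer to set the final pH of the extract.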

[0217]DNA obtained from 44 of these 308 F2 recombinant plants contained breakpoints between CAPS markers C10 7.7 and C10 15.9 and were identified using CAPS markers C10 7.7 Hpy, C10 11.5, C10 11.0, C10 9.6, E08 93K, and E08 46K which were developed as follows.

[0218]Marker C10 7.7 Hpy was amplified using oligonucleotide primers C10-7.7 2 HPYIVF and C10-7.7 2 HPYIVR. Oligonucleotide primers C10-7.7 2 HPYIVF and C10-7.7 2 HPYIVR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:5 and SEQ ID NO:6, respectively, and have the sequences set forth as follows:

SEQ ID NO:5: 5'-ATTGTCTCGTGTGACAGCGC-3'
SEQ ID NO:6: 5'-CCGCAATTAATATTCCGAGC-3'

[0219]Polymorphism was observed on the C10 7.7 Hpy CAPS marker when the amplified DNA was digested with the restriction endonuclease HpyCH4 IV.

[0220]Marker C10 11.5 was amplified using oligonucleotide primers 11.5 HpyV and C10 11.5-9. Oligonucleotide primers 11.5 HpyV and C10 11.5-9 were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:7 and SEQ ID NO:8, respectively, and have the sequences set forth as follows:

SEQ ID NO:7: 5'-AAAGTGTGGTAGGTGTCATCCAGTTG-3'
SEQ ID NO:8: 5'-GCCACATGATCATCCACTACCAATG-3'

[0221]Polymorphism was observed on the C10 11.5 CAPS marker when the amplified DNA was digested with the restriction endonuclease HpyCH4 V.

[0222]Marker C10 11.0 was amplified using oligonucleotide primers C10 11-5 and 11 HinfR. Oligonucleotide primers C10 11-5 and 11 HinfR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:9 and SEQ ID NO:10, respectively, and have the sequences set forth as follows:

SEQ ID NO:9: 5'-CTTTTTCCGACCCACATGAAGGT-3'
SEQ ID NO:10: 5'-TACAAACGCTCCTAAACCACCATGT-3'

[0223]Polymorphism was observed on the C10 11.0 CAPS marker when the amplified DNA was digested with the restriction endonuclease Hinf I.

[0224]Marker C10 9.6 was amplified using oligonucleotide primers 9.6 DraIF and 9.6 DraIR. Oligonucleotide primers 9.6 DraIF and 9.6 DraIR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:11 and SEQ ID NO:12, respectively, and have the sequences set forth as follows:

SEQ ID NO:11: 5'-TTTGGGTGCATTAAAGTGGACCA-3'
SEQ ID NO:12: 5'-GGGGTAATTCGGATGACCATG-3'

[0225]Polymorphism was observed on the C10 9.6 CAPS marker when the amplified DNA was digested with the restriction endonuclease Dra I.

[0226]Marker E08 93K was amplified using oligonucleotide primers E08 93KF and E08 93KR. Oligonucleotide primers E08 93KF and E08 93KR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:13 and SEQ ID NO:14, respectively, and have the sequences set forth as follows:

SEQ ID NO:13: 5'-CTCATAGCCGCCTAGCCTCATAG-3'
SEQ ID NO:14: 5'-GAAGCAGAGAAACTCCAACCTGG-3'

[0227]Polymorphism was observed on the E08 93K CAPS marker when the amplified DNA was digested with the restriction endonuclease HpyCH4 V.

[0228]Marker E08 46K was amplified using oligonucleotide primers E08 46KF and E08 46KR. Oligonucleotide primers E08 46KF and E08 46KR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:15 and SEQ ID NO:16, respectively, and have the sequences set forth as follows:

SEQ ID NO:15: 5'-GTTCATAGGTGCCAAATTTGGGTG-3'
SEQ ID NO:16: 5'-CACAAGTAACCCAATGCCCAAAC-3'

[0229]Polymorphism was observed on the E08 46K CAPS marker when the amplified DNA was digested with the restriction endonuclease Rsa I.

[0230]Analysis of recombination breakpoints identified 6 recombination breakpoints between DNA obtained from re2 mutant plants and CAPS marker E08 93K and 3 recombination breakpoints between DNA obtained from re2 mutant plants and CAPS marker C10 9.6. Information relating to the position of the sequences of the CAPS markers in the rice chromosomes was retrieved from the web sites of the Rice Genome Research Program (RGP), Tsukuba, Japan, or the Clemson University Genomics Institute, Clemson, S.C. This information revealed that the sequences for CAPS markers E08 93K and C10 9.6 were derived from two overlapping BAC clones, OSJNBa0050E08 and OSJNBb0042K08, that cover 190 Kb on rice chromosome 10. At least 10 genes are found in this region.

[0231]An additional CAPS marker, K08 21K, and a single nucleotide polymorphism-based (SNP-based) marker were generated that were derived from BAC OSJNBb0042K08, and mapped 25 Kb apart.

[0232]Marker K08 21K was amplified using oligonucleotide primers K08 21KF and K08 21KR. Oligonucleotide primers K08 21KF and K08 21KR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:17 and SEQ ID NO:18, respectively, and have the sequences set forth as follows:

SEQ ID NO:17: 5'-GTTCACCCATTAGTGATGCCTGG-3'
SEQ ID NO:18: 5'-GTTCACTCGATAAGAGCAATCGAAC-3'

[0233]Polymorphism was observed on the K08 21K CAPS marker when the amplified DNA was digested with the restriction endonuclease Taq I.

[0234]SNP-based marker K08 46K was amplified using primers K08 46KF and K08 46KR. Oligonucleotide primers K08 46KF and K08 46KR were developed as described in Example 1, have the nucleotide sequences set forth in SEQ ID NO:19 and SEQ ID NO:20, respectively, and have the sequences set forth as follows:

SEQ ID NO:19: 5'-GTTATGTTGCACACCTCCAGTAGTTAC-3'
SEQ ID NO:20: 5'-GTCAAGCCTGCTGTTACCCTTTAAG-3'

[0235]Amplified DNA products were purified using a Qiagen PCR purification kit (Qiagen, Valencia, Calif.) and 100 ng of each purified DNA was used for direct sequencing. Of 9 recombination breakpoints analyzed, 3 were found in marker K08 21K and 1 was found in marker K08 46K, confining the RE2 gene to a 25 Kb region between these two markers.

[0236]This 25 Kb region contains DNA corresponding to two putative genes. One is gene OSJNBb0042K08.8 that is predicted to encode a myosin-like protein and is found in the NCBI database as Version AAL77142.1 having NCBI General Identifier No. 18652508. The other one is gene OSJNBb0042K08.9 that is predicted to encode a protein of unknown function and is found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509.

[0237]The regions corresponding to these two genes were sequenced in genomic DNA obtained from mutant alleles re2-1, re2-2 and re2-3 to identify the RE2 gene. Amplification of exon 1 was performed using oligonucleotide primers LOB-82F and LOB R1 and amplification of exon 2 was performed using oligonucleotide primers LOB F2 and LOB R2. Oligonucleotide primers LOB-82F, LOB R1, LOB F2, and LOB R2 have the nucleotide sequences set forth in SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, and SEQ ID NO:24, respectively, and have the sequences set forth as follows:

SEQ ID NO:21: 5'-GTCAAGCCTGCTGTTACCCTTTAAG-3'
SEQ ID NO:22: 5'-CCACCATGACGAACATCTAAATG-3'
SEQ ID NO:23: 5'-GTATAGCTCCCAACCATTTCTCCTC-3'
SEQ ID NO:24: 5'-CCAACATCACCATCATCGTCTTC-3'

[0238]Amplification reactions were carried out using the same conditions that were used for CAPS marker amplifications in Example 1 except that 20 ng of DNA was used per reaction and the annealing temperature was 55° C. Amplified DNA products were cloned into pGEM-T Easy Vector (Promega, Madison, Wis.) and, for each amplification reaction, plasmid DNA was prepared from at least 4 independent colonies using a Qiagen miniprep kit (Qiagen, Valencia, Calif.). Plasmids were sequenced using the M13 forward and reverse sequencing primers.

[0239]No mutation was found in the portion of DNA corresponding to the gene encoding the myosin-like protein, but mutations were found in the region encoding the unknown protein. This means that the RE2 gene has the sequence found in NCBI having locus tag OSJNBb0042K08.9 that is predicted to encode a protein of unknown function and is found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509.

[0240]The nucleotide sequence of the Oryza sativa RE2 gene is set forth in SEQ ID NO:25 and the amino acid sequence deduced from translating nucleotides 1 through 807 of SEQ ID NO:25 is set forth in SEQ ID NO:26. Nucleotides 808-810 of SEQ ID NO:25 correspond to a stop codon. The nucleotide sequence set forth in SEQ ID NO:25 is the same as the one found in the NCBI database having locus tag OSJNBb0042K08.9 that is predicted to encode a protein of unknown function. The amino acid sequence set forth in SEQ ID NO:26 is the same as the one for the protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509 that is set forth here in SEQ ID NO:27.

Identification of Mutations Responsible for an re2 Phenotype

[0241]Mutations in the RE2 gene responsible for the re2 phenotype were determined by comparing the nucleotide sequences obtained for DNA from wild-type rice with the nucleotide sequences obtained for DNA from rice exhibiting an re2 phenotype. Three re2 mutant alleles were identified and labeled re2-1, re2-2, and re2-3. The nucleotide sequence obtained for mutant allele re2-1 is set forth in SEQ ID NO:28 and the amino acid sequence obtained by translating nucleotides 1 through 807 of SEQ ID NO:28 is set forth in SEQ ID NO:29. Nucleotides 808 through 810 of SEQ ID NO:28 correspond to a stop codon. The nucleotide sequence obtained for mutant allele re2-2 is set forth in SEQ ID NO:30 and the amino acid sequence obtained by translating nucleotides 1 through 807 of SEQ ID NO:30 is set forth in SEQ ID NO:31. Nucleotides 808 through 810 of SEQ ID NO:30 correspond to a stop codon. The nucleotide sequence obtained for mutant allele re2-3 is set forth in SEQ ID NO:32 and the amino acid sequence obtained by translating nucleotides 1-378 of SEQ ID NO:32 is set forth in SEQ ID NO:33. Nucleotides 379 through 381 of SEQ ID NO:32 correspond to a stop codon.

[0242]FIG. 1A-C shows an alignment of the nucleotide sequences obtained for the coding regions of wild type RE2 (SEQ ID NO:25), and mutants re2-1 (SEQ ID NO:28), re2-2 (SEQ ID NO:30), and re2-3 (SEQ ID NO:32). Changes in the nucleotide sequence are indicated by a star below the alignment and by a box around the nucleotides at that position. As seen in FIG. 1, mutant allele re2-1 had a T residue at nucleotide 279, mutant allele re2-2 had a T residue at nucleotide 110, and mutant allele re2-3 had the C at nucleotide 75 deleted. These nucleotide changes result in changes in the amino acid sequence.

[0243]FIG. 2 shows an alignment of the amino acid sequences obtained for wild type RE2 protein (SEQ ID NO:26), and mutant proteins re2-1 (SEQ ID NO:29), and re2-2 (SEQ ID NO:31). Amino acids that change between the wild type and mutant are indicated by a box around the amino acids that are different at that position. As seen in FIG. 2, mutant allele re2-1 protein had an isoleucine at amino acid 93 instead of the highly-conserved threonine, and mutant allele re2-2 protein had a phenylalanine instead of the conserved cysteine at amino acid 37. The deletion of a nucleotide at position 75 in mutant allele re2-3 gene produced a frame shift that results in a 127 amino acid polypeptide (set forth in SEQ ID NO:33) that shares identity with the first 25 amino acids of wild type RE2 protein but whose remaining 102 amino acids share little or no homology with wild type RE2 protein or mutant proteins re2-1 or re2-2.
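The correspondence between nucleotide positions in the coding sequence and amino acid positions in the protein follows from reading the sequence in triplets from nucleotide 1. The sketch below shows only that index arithmetic for the positions named in the text; the function name is hypothetical and no sequence data are assumed.

```python
def codon_position(nucleotide: int) -> tuple[int, int]:
    """Map a 1-based coding-sequence nucleotide to
    (1-based codon number, position within the codon, 1 to 3)."""
    index = nucleotide - 1
    return index // 3 + 1, index % 3 + 1


CODING_LENGTH = 807  # nucleotides 1 through 807 of SEQ ID NO:25
codons_wild_type = CODING_LENGTH // 3

re2_2_site = codon_position(110)  # substitution reported for re2-2
re2_1_site = codon_position(279)  # position reported for re2-1
re2_3_site = codon_position(75)   # deletion reported for re2-3

print("wild type codons:", codons_wild_type)
print("nucleotide 110 ->", re2_2_site)
print("nucleotide 279 ->", re2_1_site)
print("nucleotide 75  ->", re2_3_site)
```

A single-nucleotide deletion shifts every downstream triplet by one position, which is why the re2-3 polypeptide diverges from the wild type sequence after the codon containing the deleted nucleotide.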

Example 3

Confirmation of the Function of the Oryza sativa RE2 Gene

[0244]Functional confirmation of the identity of the Oryza sativa RE2 gene identified in Example 2 was performed using genetic complementation. Rice callus cells derived from wild type and re2 mutant plants were transformed with a genomic DNA fragment comprising the RE2 gene. Restoration of the embryo size of the re2 mutant cells transformed with the genomic DNA fragment comprising the RE2 gene confirmed that the Oryza sativa RE2 gene identified in Example 2 is the sole target of mutations giving rise to the re2 phenotype. Cloning of the genomic fragment comprising the wild type RE2 gene and transformation into rice cells were performed as follows.

[0245]A genomic DNA fragment containing wild type Oryza sativa RE2 gene was obtained from a lambda rice genomic DNA library (Stratagene) as follows. The genomic library was screened using a DNA probe obtained using primers LOB F2 (SEQ ID NO:23) and LOB R2 (SEQ ID NO:24) that, as indicated in Example 2, above, may be used to amplify exon 2 of the RE2 gene. Of 8 clones identified, one clone, named RE2G4, contained a 15 Kb insert comprising an approximately 9 Kb fragment flanked by two BamH I sites and comprising the RE2 gene. One of the BamH I sites was located 4472 bp upstream of the ATG initiation codon in the RE2 gene and the other one was located 3089 bp downstream of the termination codon of the RE2 gene. Nucleotides 4473 through 4829 correspond to a first exon, nucleotides 4830 through 5660 correspond to an intron, and nucleotides 5661 through 6110 correspond to the second exon. Nucleotides 6111 through 6113 form a termination codon. The nucleotide sequence of this approximately 9 Kb BamH I fragment is set forth in SEQ ID NO:34.
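The coordinates quoted for SEQ ID NO:34 can be cross-checked with simple arithmetic. The sketch below is illustrative; the dictionary and helper names are hypothetical, and the coordinates are treated as 1-based and inclusive.

```python
# Feature coordinates (1-based, inclusive) quoted in the text for SEQ ID NO:34.
FEATURES = {
    "exon 1": (4473, 4829),
    "intron": (4830, 5660),
    "exon 2": (5661, 6110),
    "stop codon": (6111, 6113),
}


def length(span):
    """Length of a 1-based inclusive interval."""
    start, end = span
    return end - start + 1


coding_nt = length(FEATURES["exon 1"]) + length(FEATURES["exon 2"])
codons = coding_nt // 3

for name, span in FEATURES.items():
    print(f"{name}: {span[0]}-{span[1]} ({length(span)} nt)")
print(f"coding nucleotides: {coding_nt}, codons: {codons}")
```

The two exons sum to 807 coding nucleotides (269 codons), which agrees with the 807 translated nucleotides of the RE2 coding region and the 269 amino acid length listed for SEQ ID NO:26.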

[0246]The approximately 9 Kb BamH I fragment comprising the RE2 coding region (set forth in SEQ ID NO:34) was removed from clone RE2G4 by digestion with BamH I and was subcloned into the BamH I site of the pML18 transformation vector to produce vector OsRE2pML18. Transformation vector pML18 is derived from the commercially available vector pGEM9z (obtained from Gibco-BRL which is owned by Invitrogen, Carlsbad, Calif.) and was modified by adding a cassette to express the bacterial hygromycin phosphotransferase gene. The bacterial hygromycin phosphotransferase gene confers resistance to the antibiotic used as selectable marker for rice transformation. A Sal I fragment, containing a cassette comprising the cauliflower mosaic virus 35S promoter, driving expression of the bacterial hygromycin phosphotransferase gene, followed by nucleotides 848 to 1550 of the 3' end of the nopaline synthase gene, was inserted at the Sal I site of vector pGEM9z to produce pML18. The nucleotide sequence of pML18 is set forth in SEQ ID NO:35.

[0247]Vector OsRE2pML18 was introduced into callus derived from wild type rice plants and from re2 mutant plants using a Biolistic PDS-1000/He gun (BioRAD Laboratories, Hercules, Calif.) and the particle bombardment technique (Klein et al. (1987) Nature (London) 327:70-73) as follows.

[0248]Embryogenic callus cultures derived from the scutellum of germinating rice seeds were used as source material for transformation experiments. This material was generated by germinating sterile rice seeds on N6-2,4D media (N6 salts, N6 vitamins, 2.0 mg/L 2,4-D, 100 mg/L myo-inositol, 300 mg/L casamino acids, and 2.7 g/L proline) in the dark at 27-28° C. Embryogenic callus proliferating from the scutellum of the embryos was then transferred to fresh N6-2,4D media. Callus cultures were maintained by routine sub-culture at two-week intervals and used for transformation within 4 weeks of initiation. The regeneration, development, and cultivation of plants from single plant protoplast transformants or from various transformed explants are well known in the art (Weissbach and Weissbach (Eds.), In: Methods for Plant Molecular Biology, Academic Press, Inc., San Diego, Calif. (1988)).

[0249]Callus was prepared for transformation by arranging 0.5-1.0 mm callus pieces approximately 1 mm apart in a circular area of about 4 cm in diameter in the center of a circle of Whatman #541 paper placed on CM media and incubating in the dark at 27-28° C. for 3-5 days. Vector OsRE2pML18 was introduced into wild type callus cells and re2 mutant rice callus cells using a Biolistic PDS-1000/He gun (BioRAD Laboratories, Hercules, Calif.).

[0250]Transformation of mutant callus with vector OsRE2pML18 produced 16 transgenic plants of which 7 transgenic plants produced seed. T2 seed from 6 plants showed a wild type to re2 mutant phenotype segregating at a 3:1 ratio. Restoration of wild type phenotype in re2 mutant plants by vector OsRE2pML18 indicates that the 9,203 bp rice genomic DNA fragment present in vector OsRE2pML18 was capable of complementing an re2 mutation. This confirms that the Oryza sativa RE2 gene has the sequence found in NCBI having locus tag OSJNBb0042K08.9 that is predicted to encode a protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509. These results also indicate that the 9,203 bp rice genomic DNA fragment in vector OsRE2pML18 used in these transformations and set forth in SEQ ID NO:34 contains the complete set of regulatory elements required for proper complementation of an re2 mutant phenotype and involved in altering embryo/endosperm size during seed development.
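A 3:1 segregation ratio such as the one reported for the T2 seed is conventionally assessed with a chi-square goodness-of-fit test. The sketch below shows the calculation only; the seed counts are hypothetical placeholders (the text reports no individual counts), and the function name is an assumption made for illustration.

```python
def chi_square_3_to_1(wild_type, mutant):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = wild_type + mutant
    expected = (0.75 * total, 0.25 * total)
    observed = (wild_type, mutant)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_5PCT_DF1 = 3.841

# Hypothetical counts for illustration only; not data from this document.
stat = chi_square_3_to_1(wild_type=72, mutant=28)
print(f"chi-square = {stat:.2f}")
print("consistent with 3:1 at the 5% level:", stat < CRITICAL_5PCT_DF1)
```

A statistic below the critical value means the observed counts do not differ significantly from a 3:1 ratio, which is the pattern expected for a single transgene insertion segregating in the progeny of a hemizygous plant.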

Example 4

Composition of cDNA Libraries: Isolation and Sequencing of cDNA Clones Encoding Polypeptides Involved in Altering Embryo/Endosperm Size During Seed Development

[0251]The sequence of the Oryza sativa RE2 gene was identified in Example 2 and its function was confirmed in Example 3 as being involved in altering embryo/endosperm size during seed development. Identification of genes from other crops involved in altering embryo/endosperm size during seed development is set forth in Examples 4 and 5. cDNAs encoding polypeptides homologous to rice RE2 protein were identified by electronically screening the Du Pont proprietary database using BLAST analysis (Basic Local Alignment Search Tool; Altschul et al. (1990) J. Mol. Biol. 215:403-410). Clones derived from cDNA libraries representing mRNAs from various corn (Zea mays), Euphorbia lagascae, columbine (Aquilegia vulgaris), guar (Cyamopsis tetragonoloba), rice (Oryza sativa), soybean (Glycine max), and wheat (Triticum aestivum) tissues were identified as encoding homologs to the rice RE2 protein. The libraries were prepared as described below. The characteristics of the libraries are described in Table 1.

TABLE 1
Libraries from Corn, Euphorbia lagascae, Columbine, Guar, Rice, Soybean, and Wheat

Library  Tissue                                                                              Clone
cef1f    Corn entire fertilized ear 3 to 12 days after pollination                           cef1f.pk001.f4:fis
cpf1c    Corn pooled BMS treated with chemicals related to protein synthesis (1)             cpf1c.pk006.d18a:fis
cpi1c    Corn pooled BMS treated with chemicals related to biochemical compound synthesis (2)  cpi1c.pk005.a12:fis
cr1n     Corn root from 7 day old seedlings (3)                                              cr1n.pk0028.h3a:fis
eel1c    Euphorbia lagascae developing seeds                                                 eel1c.pk003.b10:fis
eav1c    Columbine developing seeds                                                          eav1c.pk003.c9
lds3c    Guar seeds harvested 32 days after flowering                                        lds3c.pk011.j11:fis
sdr1f    Soybean 10 day old root                                                             sdr1f.pk005.d21.f:fis
wdr1f    Wheat entire developing root                                                        wdr1f.pk002.l10:fis

(1) Chemicals used included chloramphenicol, cycloheximide, aurintricarboxylic acid.
(2) Chemicals used included sorbitol, ergosterol, taxifolin, methotrexate, D-mannose, D-galactose, alpha-amino adipic acid, ancymidol.
(3) This library was normalized essentially as described in U.S. Pat. No. 5,482,845.

[0252]cDNA libraries representing mRNAs from the tissues described in Table 1 were prepared in Uni-ZAP® XR vectors according to the manufacturer's protocol (Stratagene Cloning Systems, La Jolla, Calif.). Conversion of the Uni-ZAP® XR libraries into plasmid libraries was accomplished according to the protocol provided by Stratagene. Upon conversion, cDNA inserts were contained in the plasmid vector pBluescript. cDNA inserts from randomly picked bacterial colonies containing recombinant pBluescript plasmids were amplified via polymerase chain reaction using primers specific for vector sequences flanking the inserted cDNA sequences or plasmid DNA was prepared from cultured bacterial cells. Amplified insert DNAs or plasmid DNAs were sequenced in dye-primer sequencing reactions to generate partial cDNA sequences (expressed sequence tags or "ESTs"; see Adams, M. D. et al., (1991) Science 252:1651). The resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent sequencer.

[0253]Full-insert sequence (FIS) data was generated utilizing a modified transposition protocol. Clones identified for FIS were recovered from archived glycerol stocks as single colonies, and plasmid DNAs were isolated via alkaline lysis. Isolated DNA templates were reacted with vector primed M13 forward and reverse oligonucleotides in a PCR-based sequencing reaction and loaded onto automated sequencers. Confirmation of clone identification was performed by sequence alignment to the original EST sequence from which the FIS request was made.

[0254]Confirmed templates were transposed via the Primer Island transposition kit (PE Applied Biosystems, Foster City, Calif.) which is based upon the Saccharomyces cerevisiae Ty1 transposable element (Devine and Boeke (1994) Nucleic Acids Res. 22:3765-3772). The in vitro transposition system places unique binding sites randomly throughout a population of large DNA molecules. The transposed DNA was then used to transform DH10B electro-competent cells (Gibco BRL/Life Technologies, Rockville, Md.) via electroporation. The transposable element contains an additional selectable marker (named DHFR; Fling and Richards (1983) Nucleic Acids Res. 11:5147-5158), allowing for dual selection on agar plates of only those subclones containing the integrated transposon. Multiple subclones were randomly selected from each transposition reaction, plasmid DNAs were prepared via alkaline lysis, and templates were sequenced (ABI Prism dye-terminator ReadyReaction mix) outward from the transposition event site, utilizing unique primers specific to the binding sites within the transposon.

[0255]Sequence data was collected (ABI Prism Collections) and assembled using Phred and Phrap (Ewing et al. (1998) Genome Res. 8:175-185; Ewing and Green (1998) Genome Res. 8:186-194). Phred re-reads the ABI sequence data, re-calls the bases, assigns quality values, and writes the base calls and quality values into editable output files. Phrap is a sequence assembly program that uses the quality values assigned by Phred to increase the accuracy of the assembled sequence contigs. Assemblies are viewed using the Consed sequence editor (Gordon et al. (1998) Genome Res. 8:195-202).
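The quality values that Phred assigns are logarithmically related to the estimated probability that a base call is wrong, following the conventional definition Q = -10 log10(P). The sketch below illustrates that relationship only; the function names are hypothetical and are not part of Phred or Phrap.

```python
import math


def phred_to_error_probability(quality):
    """Estimated probability that a base call with this quality value is wrong."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(probability):
    """Quality value corresponding to an estimated base-call error probability."""
    return -10 * math.log10(probability)


for q in (10, 20, 30, 40):
    print(f"Q{q}: error probability {phred_to_error_probability(q):.4f}")
```

On this scale a quality value of 20 corresponds to an estimated error rate of 1 in 100 and a value of 30 to 1 in 1,000, which is why an assembler that weights bases by quality can produce more accurate contigs.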

Example 5

Identification and Characterization of cDNA Clones Encoding Putative Homologs of the Oryza sativa RE2 Protein

[0256]Clones containing cDNA inserts encoding polypeptides homologous to rice RE2 protein were identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul et al. (1990) J. Mol. Biol. 215:403-410) searches for similarity to sequences contained in the Du Pont proprietary database. The sequences identified were also compared, using BLAST, to the Genbank database.

[0257]A BLASTX search was performed to identify cDNAs encoding proteins similar to those encoded by the RE2 gene. BLASTX compares the translation, in all six reading frames, of the nucleotide query sequence to a protein database. As mentioned in Example 2, the Oryza sativa RE2 gene has the sequence found in the NCBI database having locus tag OSJNBb0042K08.9 that is predicted to encode a protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509. Thus, the polypeptides encoded by the cDNAs identified in the BLASTX search are similar to the protein of unknown function found in the NCBI database as Version AAL77143.1 having NCBI General Identifier No. 18652509.

[0258]The BLASTX search using the nucleotide sequences from the clones listed in Table 1 revealed that the polypeptides encoded by these cDNAs had similarity to the Oryza sativa protein having NCBI General Identifier No. 18652509 and the Arabidopsis thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164. Set forth in Table 2 are the BLASTX results for individual ESTs ("EST"), or for the sequences of the entire cDNA inserts comprising the indicated cDNA clones ("FIS"):

TABLE 2
BLAST Results for Sequences Encoding Polypeptides Homologous To O. sativa RE2 Protein and A. thaliana LOB Domain 18 Protein

                                                  BLAST pLog Score
Clone                  aa SEQ ID NO:  Status   18652509   17227164
rice RE2               26             FIS      >180.00    73.70
cef1f.pk001.f4:fis     37             FIS      58.00      58.30
cpf1c.pk006.d18a:fis   39             FIS      36.00      38.22
cpi1c.pk005.a12:fis    41             FIS      60.70      58.15
cr1n.pk0028.h3a:fis    43             FIS      34.22      34.40
eel1c.pk003.b10:fis    45             FIS      59.05      69.40
eav1c.pk003.c9         47             EST      47.00      51.70
lds3c.pk011.j11:fis    49             FIS      53.10      56.40
sdr1f.pk005.d21.f:fis  51             FIS      39.15      41.10
wdr1f.pk002.l10:fis    53             FIS      36.30      33.70
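The pLog scores in Table 2 can be converted back to probabilities for comparison. The sketch below assumes that a pLog value is the negative base-10 logarithm of the BLAST P-value; that definition is not given in this section and is stated here as an assumption. The dictionary holds the Table 2 scores against the rice protein (NCBI General Identifier No. 18652509), and the function name is hypothetical.

```python
# pLog scores against NCBI GI 18652509, copied from Table 2.
PLOG_VS_RICE = {
    "cef1f.pk001.f4:fis": 58.00,
    "cpf1c.pk006.d18a:fis": 36.00,
    "cpi1c.pk005.a12:fis": 60.70,
    "cr1n.pk0028.h3a:fis": 34.22,
    "eel1c.pk003.b10:fis": 59.05,
    "eav1c.pk003.c9": 47.00,
    "lds3c.pk011.j11:fis": 53.10,
    "sdr1f.pk005.d21.f:fis": 39.15,
    "wdr1f.pk002.l10:fis": 36.30,
}


def plog_to_p(plog):
    """P-value implied by a pLog score, assuming pLog = -log10(P)."""
    return 10 ** (-plog)


# Rank clones from strongest to weakest similarity (largest pLog first).
ranked = sorted(PLOG_VS_RICE, key=PLOG_VS_RICE.get, reverse=True)
for clone in ranked:
    print(f"{clone}: pLog {PLOG_VS_RICE[clone]:.2f}, P ~ {plog_to_p(PLOG_VS_RICE[clone]):.1e}")
```

Under this assumption a larger pLog means a smaller chance that the match arose at random, so the ranking places corn clone cpi1c.pk005.a12:fis first among the homologs compared with the rice protein.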

[0259]The data set forth in Table 3 present the percent identity, calculated using the Clustal V method of alignment, of the amino acid sequences set forth in SEQ ID NOs:26, 37, 39, 41, 43, 45, 47, 49, 51, and 53, with the Oryza sativa protein having NCBI General Identifier No. 18652509 (set forth in SEQ ID NO:27), and the Arabidopsis thaliana LOB domain 18 protein (NCBI General Identifier No. 17227164; set forth in SEQ ID NO:54).

TABLE 3
Percent Identity of Amino Acid Sequences Deduced From Nucleotide Sequences of cDNA Clones Encoding Putative O. sativa RE2 Homolog Polypeptides

                                       Percent Identity to
Clone                  aa SEQ ID NO.   18652509   17227164
rice RE2               26              100.00     49.6
cef1f.pk001.f4:fis     37              59.1       56.9
cpf1c.pk006.d18a:fis   39              38.5       40.4
cpi1c.pk005.a12:fis    41              57.8       52.7
cr1n.pk0028.h3a:fis    43              45.8       47.6
eel1c.pk003.b10:fis    45              53.4       58.6
eav1c.pk003.c9         47              79.2       85.4
lds3c.pk011.j11:fis    49              41.8       47.4
sdr1f.pk005.d21.f:fis  51              40.8       42.3
wdr1f.pk002.l10:fis    53              43.9       41.2

[0260]Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal V method of alignment (Higgins, D. G. and Sharp, P. M. (1989) Comput. Appl. Biosci. 5:151-153; Higgins, D. G. et al. (1992) Comput. Appl. Biosci. 8:189-191) and the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal V method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. Sequence alignments and BLAST scores and probabilities indicate that the nucleic acid fragments comprising the instant cDNA clones encode polypeptides with homology to the O. sativa RE2 protein and the A. thaliana LOB domain 18 protein.

Example 6

Structure of the Oryza sativa RE2 Protein and its Putative Homologs

[0261]As set forth in Table 3 of Example 5, the amino acid sequence of the RE2 polypeptide (SEQ ID NO:26), which is set forth in Example 2 above and is encoded by the gene shown to complement an re2 mutant phenotype, is identical to the Oryza sativa protein having NCBI General Identifier No. 18652509 (SEQ ID NO:27) and has sequence similarity to the Arabidopsis thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164 (set forth in SEQ ID NO:54).

[0262]The LOB domain 18 protein is considered to belong in the class I group of the Lateral Organ Boundaries (LOB) domain protein plant-specific gene family. The Class I LOB domain proteins contain a C-block, a GAS-block, and a leucine zipper motif (Shuai, B. et al., 2002, Plant Phys. 129:747-761). Thus, it is expected that the Oryza sativa RE2 protein and its homologs also contain a C-block, a GAS-block, and a leucine zipper motif. The consensus sequences of these motifs were identified using a Clustal V alignment and are indicated in FIG. 3A-C.

[0263]FIG. 3A-C depicts the Clustal V alignment obtained for the amino acid sequences from the wild type rice RE2 protein (SEQ ID NO:26), the O. sativa protein having NCBI General Identifier No. 18652509 (SEQ ID NO:27), the A. thaliana LOB domain 18 protein having NCBI General Identifier No. 17227164 (SEQ ID NO:54), and the amino acid sequences of the polypeptides encoded by corn clones cef1f.pk001.f4:fis (SEQ ID NO:37), cpf1c.pk006.d18a:fis (SEQ ID NO:39), cpi1c.pk005.a12:fis (SEQ ID NO:41), and cr1n.pk0028.h3a:fis (SEQ ID NO:43), Euphorbia lagascae clone eel1c.pk003.b10:fis (SEQ ID NO:45), columbine clone eav1c.pk003.c9 (SEQ ID NO:47), guar clone lds3c.pk011.j11:fis (SEQ ID NO:49), soybean clone sdr1f.pk005.d21.f:fis (SEQ ID NO:51), and wheat clone wdr1f.pk002.l10:fis (SEQ ID NO:53). The program uses dashes to maximize the alignment. An asterisk (*) below the alignment indicates amino acids conserved among all the sequences. The conserved C-block, GAS-block, and leucine zipper motifs are boxed.

[0264]Table 4 sets forth the amino acid positions of the C-block, GAS block, and leucine zipper conserved amino acid domains in SEQ ID NOs:26, 54, 37, 39, 41, 43, 45, 47, 49, 51, and 53. The amino acids in each domain are indicated in FIG. 3 and the consensus sequence for each domain is described below the table.

TABLE 4
Location of the Conserved Domains in Oryza sativa RE2 and its Putative Homologs

                         GAS Block
SEQ ID NO:  C-Block   N-end    C-end     Leu Zipper
26/27       33-54     63-74    103-111   116-134
54          37-58     67-78    107-115   120-138
37          34-55     64-75    104-112   117-135
39          24-45     54-65    94-102    107-125
41          32-53     62-73    102-110   115-133
43          24-45     44-55    84-92     97-115
45          30-51     60-71    100-108   113-131
47          -         22-33    62-70     75-93
49          20-41     50-61    90-98     103-121
51          16-37     46-57    86-94     99-117
53          12-33     42-53    82-90     95-113

[0265]In the following consensus sequences the amino acids are indicated by their one-letter codes. Where more than one amino acid is found at a position, the alternatives are enclosed in parentheses and separated by slashes. An X indicates a position where any amino acid may be present. The consensus sequences of the C-Block, GAS Block N-end and C-end, and Leu Zipper identified here follow:

[0266]The C block consensus sequence found in RE2 homologs is set forth in SEQ ID NO:55 and corresponds to:

SEQ ID NO:55: PCGACKFLRR(K/R)C(V/Q/A)X(G/D/E)C(V/I)FAP(Y/H)F

[0267]The GAS block spans 49 amino acids, with an N-end consensus sequence set forth in SEQ ID NO:56 and a C-end consensus sequence set forth in SEQ ID NO:57.

SEQ ID NO:56: FAA(V/I)HKVFGASN
SEQ ID NO:57: RDP(V/I)(F/Y)GCV(A/S)

[0268]The consensus sequence for the Leucine Zipper domain is set forth in SEQ ID NO:58 and corresponds to:

SEQ ID NO:58: LQ(Q/H)QV(A/V/G)XLQX(E/Q)(L/V)X(Y/Q/H)(L/A/V)(Q/K/R)X(H/Q/Y)(L/V)

[0269]The C-Block, GAS Block N-end and C-end, and Leu Zipper consensus sequences set forth above were identified in a Clustal V alignment of polypeptides similar to the Oryza sativa RE2 polypeptide, and thus should be present in any polypeptide having the same function in altering embryo/endosperm size during seed development as the Oryza sativa RE2 polypeptide.
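The consensus notation used above maps directly onto regular expressions, which makes it straightforward to search a candidate polypeptide for the four motifs. The Python sketch below is illustrative only: `consensus_to_regex` and `locate` are hypothetical helpers, and `RE2_1_134` is residues 1 through 134 of SEQ ID NO:26 transcribed into one-letter code from the sequence listing.

```python
import re

CONSENSUS = {
    "C-Block": "PCGACKFLRR(K/R)C(V/Q/A)X(G/D/E)C(V/I)FAP(Y/H)F",        # SEQ ID NO:55
    "GAS N-end": "FAA(V/I)HKVFGASN",                                     # SEQ ID NO:56
    "GAS C-end": "RDP(V/I)(F/Y)GCV(A/S)",                                # SEQ ID NO:57
    "Leu Zipper": "LQ(Q/H)QV(A/V/G)XLQX(E/Q)(L/V)X(Y/Q/H)(L/A/V)"
                  "(Q/K/R)X(H/Q/Y)(L/V)",                                # SEQ ID NO:58
}

# Residues 1-134 of wild type rice RE2 (SEQ ID NO:26), one-letter code.
RE2_1_134 = (
    "MSSSVVVSASGSGSGGGGGGGGGGAGGGGGGG"
    "PCGACKFLRRKCVQGCIFAPYFDSEAGAAH"
    "FAAVHKVFGASNVSKLLQQIPAHRRLDAVVTICYEAQARL"
    "RDPVYGCVAHIFH"
    "LQHQVAGLQSELNYLQGHL"
)


def consensus_to_regex(consensus):
    """Convert '(A/B)' alternatives to '[AB]' and 'X' to '.' (any residue)."""
    parts = []
    for token in re.findall(r"\([A-Z/]+\)|[A-Z]", consensus):
        if token == "X":
            parts.append(".")
        elif token.startswith("("):
            parts.append("[" + token[1:-1].replace("/", "") + "]")
        else:
            parts.append(token)
    return "".join(parts)


def locate(consensus, protein):
    """Return the 1-based inclusive (start, end) of the first match, or None."""
    match = re.search(consensus_to_regex(consensus), protein)
    return (match.start() + 1, match.end()) if match else None


for name, consensus in CONSENSUS.items():
    print(name, locate(consensus, RE2_1_134))
```

Run against the rice sequence, the four patterns match at residues 33-54, 63-74, 103-111, and 116-134, the positions listed for SEQ ID NO:26/27 in Table 4.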

Example 7

Cloning and Sequencing of a Genomic Fragment Encoding a Maize Putative RE2 Homolog and Preparation of a Recombinant DNA Construct to Complement re2 Mutant Plants

[0270]A genomic DNA fragment encoding a corn RE2 homolog was amplified from a maize genomic library, cloned and sequenced. Then, the portion of DNA from the initiator ATG to the terminator codon of the fragment encoding the maize RE2 homolog was used to replace the portion of DNA from the initiator ATG to the terminator codon encoding the rice RE2 protein in vector OsRE2pML18 as follows.

Cloning and Sequencing of a Genomic Fragment Encoding a Maize RE2 Homolog

[0271]The polynucleotide in cDNA clone cpi1c.pk005.a12 was identified in Example 5 as encoding a polypeptide with similarity to the Oryza sativa RE2 protein. A genomic fragment comprising the open reading frame in clone cpi1c.pk005.a12 was amplified from a maize genomic library (Stratagene, Catalog No. 946102) using oligonucleotide primers Cpi BbsI F and Cpi BsaI R. Oligonucleotide primers Cpi BbsI F and Cpi BsaI R were designed based on the sequence of clone cpi1c.pk005.a12, are set forth in SEQ ID NO:59 and SEQ ID NO:60, respectively, and have the sequences set forth as follows:

SEQ ID NO:59: 5'-GAAGACCAATGAGCGCTGGCGGCGGCAGCAG-3'
SEQ ID NO:60: 5'-GGTCTCCTCATCTTGAGTGTGGCGGCGGGTGCTC-3'

[0272]Amplification was performed using the conditions suggested by the manufacturer of the library. The amplified DNA product comprising a maize RE2 homolog gene was named ZmRE2 ORF, was cloned into vector pGEM-T-easy, and was sequenced. The nucleotide sequence obtained for ZmRE2 ORF is set forth in SEQ ID NO:61. Nucleotides 79 through 429 correspond to the first exon, nucleotides 430 through 1363 correspond to an intron, and nucleotides 1364 through 1784 correspond to the second exon, and nucleotides 1785 to 1787 correspond to a stop codon.

A. Preparation of a Recombinant DNA Construct Encoding a Putative Maize RE2 Homolog

[0273]A recombinant DNA construct was prepared in which a genomic DNA fragment encoding a maize RE2 homolog present in ZmRE2 ORF was used to replace the Oryza sativa RE2 coding region in vector OsRE2pML18 (prepared in Example 3, above). The resulting chimeric construct comprises the genomic DNA fragment encoding a maize RE2 homolog (referred to as ZmRE2 ORF) surrounded by the sequences upstream of the initiator ATG and downstream of the termination codon from vector OsRE2pML18. This chimeric construct was prepared by amplifying portions upstream of the initiator ATG and downstream of the termination signal in vector OsRE2pML18, adding these portions to the pGEM-T-easy vector containing ZmRE2 ORF and then replacing the Oryza sativa RE2 coding sequence with this chimeric fragment in vector OsRE2pML18 as follows.

B. Amplification of a Fragment 5' of the O. sativa RE2 Gene in Vector OsRE2pML18

[0274]A portion of the DNA fragment 5' of the initiator ATG in vector OsRE2pML18 was amplified using oligonucleotide primers RE2 pro Bst 2F and RE2 PRO R Bbs. Oligonucleotide primers RE2 pro Bst 2F and RE2 PRO R Bbs are set forth in SEQ ID NO:62 and SEQ ID NO:63, respectively, and have the sequences set forth as follows:

SEQ ID NO:62: 5'-CACCATCATGTCAGTGTGCCAATACGCTAAACTTAGAAGA-3'
SEQ ID NO:63: 5'-GAAGACGCTCATTCTTGGAATGAGCCCCCA-3'

[0275]The amplified fragment comprises a portion of the Oryza sativa RE2 promoter and was cloned in pGEM-T-easy (Promega) to create plasmid RE2PRO whose sequence is set forth in SEQ ID NO:64.

C. Preparation of a Chimera Comprising the Fragments Amplified in A and B Above

[0276]Digestion of the pGEM-T-easy vector containing ZmRE2 ORF (prepared in A above) with Bbs I and Aat II produced a 1760 bp fragment. Restriction endonuclease Bbs I cuts the pGEM-T-easy vector containing ZmRE2 ORF immediately upstream of the initiator ATG and Aat II cuts in the vector, downstream of the maize stop codon. Plasmid RE2PRO was digested with Bbs I which cuts immediately upstream of the initiator ATG, and with Sal I which cuts in the vector's multiple cloning region. The 4316 bp fragment obtained from plasmid RE2PRO was ligated to the 1760 bp fragment obtained from the pGEM-T-easy vector containing ZmRE2 ORF by introducing the fragments in DH10B competent cells (Invitrogen). The resulting plasmid contains a portion of the Oryza sativa RE2 promoter region operably linked to the first codon of the genomic fragment encoding a maize RE2 homolog in ZmRE2 ORF.
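Bbs I and Bsa I are type IIS enzymes that cut outside their recognition sequences, so the overhangs they leave are determined by the primer design. The sketch below computes the 4-nucleotide overhangs specified by the primers of SEQ ID NOs:59, 60, and 63. It is illustrative only: the function names are hypothetical, and the cut offsets (Bbs I: GAAGAC, 2 and 6 nucleotides downstream; Bsa I: GGTCTC, 1 and 5 nucleotides downstream) are the commonly documented values rather than details taken from this text.

```python
# Recognition sequence and cut offsets (top strand, bottom strand), counted in
# nucleotides downstream of the recognition sequence. Assumed, standard values.
TYPE_IIS = {
    "BbsI": ("GAAGAC", 2, 6),
    "BsaI": ("GGTCTC", 1, 5),
}

PRIMERS = {
    "SEQ ID NO:59": "GAAGACCAATGAGCGCTGGCGGCGGCAGCAG",
    "SEQ ID NO:60": "GGTCTCCTCATCTTGAGTGTGGCGGCGGGTGCTC",
    "SEQ ID NO:63": "GAAGACGCTCATTCTTGGAATGAGCCCCCA",
}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overhang(seq, enzyme):
    """Single-stranded overhang (read 5' to 3' on the given strand), or None."""
    site, top, bottom = TYPE_IIS[enzyme]
    start = seq.find(site)
    if start < 0:
        return None
    end = start + len(site)
    return seq[end + top:end + bottom]


maize_start = overhang(PRIMERS["SEQ ID NO:59"], "BbsI")
rice_promoter = overhang(PRIMERS["SEQ ID NO:63"], "BbsI")
maize_end = overhang(PRIMERS["SEQ ID NO:60"], "BsaI")

print("maize forward primer overhang:", maize_start)
print("rice promoter reverse primer overhang (coding strand):",
      reverse_complement(rice_promoter))
print("maize reverse primer overhang (coding strand):",
      reverse_complement(maize_end))
```

Under these assumptions the Bbs I cut in the maize forward primer falls immediately before the ATG, and the overhang it specifies is the same four nucleotides, read on the coding strand, as the overhang specified by the rice promoter reverse primer. This is consistent with the statement above that both cuts lie immediately upstream of the initiator ATG.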

D. Amplification of a Fragment 3' of the O. sativa RE2 Gene in Vector OsRE2pML18

[0277]A portion of the DNA fragment 3' of the termination signal in vector OsRE2pML18 was amplified using oligonucleotide primers RE2 TERM XbaI R and RE2 TERM EcoBspMI. Oligonucleotide primers RE2 TERM XbaI R and RE2 TERM EcoBspMI are set forth in SEQ ID NO:65 and SEQ ID NO:66, respectively, and have the sequences set forth as follows:

SEQ ID NO:65: 5'-GTAAAAGGATCTAGACACCTGGCTCTAGCCTCCAAGTA-3'
SEQ ID NO:66: 5'-TGGAGCGAATTCACCTGCCAAGATGATCCTCCTCACTGTGTGTGATCATC-3'

The amplified DNA product comprising a portion of the Oryza sativa RE2 terminator region was cloned into vector pGEM9z to produce plasmid pRE2TERGEM whose sequence is set forth in SEQ ID NO:67.
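The restriction sites built into the two terminator primers can be located with a simple string search. The sketch below is illustrative: `find_sites` is a hypothetical helper, only the strand as written is scanned, and the recognition sequences (Xba I: TCTAGA, Eco RI: GAATTC, Bsp MI: ACCTGC) are the standard ones rather than details given in this text.

```python
# Standard recognition sequences (assumed; not stated in the text).
SITES = {"XbaI": "TCTAGA", "EcoRI": "GAATTC", "BspMI": "ACCTGC"}

PRIMERS = {
    "SEQ ID NO:65": "GTAAAAGGATCTAGACACCTGGCTCTAGCCTCCAAGTA",
    "SEQ ID NO:66": "TGGAGCGAATTCACCTGCCAAGATGATCCTCCTCACTGTGTGTGATCATC",
}


def find_sites(seq):
    """Map each enzyme with a hit to the 1-based start positions of its site."""
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        index = seq.find(site)
        while index >= 0:
            positions.append(index + 1)
            index = seq.find(site, index + 1)
        if positions:
            hits[enzyme] = positions
    return hits


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

With these assumptions the primer of SEQ ID NO:65 carries an Xba I site and the primer of SEQ ID NO:66 carries adjacent Eco RI and Bsp MI sites, which matches the enzymes named in the primer designations.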

E. Addition of the Fragment Amplified in D Above to the Chimera of C Above

[0278]The maize sequences 3' of the termination signal were replaced with rice sequences as follows.

[0279]Plasmid pRE2TERGEM was digested with Xba I and Eco RI to remove a 758 bp fragment containing only sequences from the termination region of the rice RE2 gene. This 758 bp fragment was cloned into vector pGEM7 that had been digested with Xba I and Eco RI to produce plasmid RE2TERMpGEM7.

[0280]Plasmid RE2TERMpGEM7 was digested with Bsp HI and Eco RI and an approximately 3.7 Kb fragment was recovered. The chimera prepared in C, above, was digested with Bsa I and Eco RI to remove the fragment comprising a portion of the Oryza sativa RE2 promoter region operably linked to the genomic fragment encoding a maize RE2 homolog. These two fragments were ligated to form a plasmid comprising a portion of the rice RE2 promoter operably linked to the genomic fragment encoding a maize RE2 homolog operably linked to a portion of the rice RE2 terminator region.

F. Preparation of a Vector Comprising a Genomic Fragment Encoding a Maize RE2 Homolog Under the Control of the Oryza Sativa RE2 Promoter and Terminator

[0281]A vector comprising the genomic fragment encoding the maize RE2 homolog under the control of the Oryza sativa RE2 promoter and terminator regions was assembled from vector OsRE2pML18 and the chimeric fragment prepared in part E above, as follows.

[0282]Vector OsRE2pML18 and the chimeric fragment prepared in part E above were digested with restriction endonucleases Bst EII and SexA I. Digestion of vector OsRE2pML18 removed the Oryza sativa RE2 coding region and portions of the promoter and terminator regions from vector OsRE2pML18, leaving a 12.1 Kb DNA fragment. Digestion of the fragment prepared in part E above produced a 3.1 Kb fragment comprising a fragment encoding a maize RE2 homolog between portions of the Oryza sativa RE2 promoter and terminator regions. Ligation of the 12.1 and 3.1 Kb fragments produced a vector comprising a fragment encoding a maize RE2 homolog under the control of the Oryza sativa RE2 promoter and terminator. The vector comprising the maize RE2 homolog open reading frame under the control of the Oryza sativa RE2 promoter and terminator regions was named ZmRE2pML18.

Example 8

Genetic Complementation of a Rice re2 Mutant Plant with an RE2 Homolog from Corn

[0283]Confirmation of the function of the corn RE2 homolog, identified in Example 5 above, was performed using genetic complementation. Rice callus cells derived from rice re2 mutant plants were transformed with vector ZmRE2pML18 prepared as described in Example 7 above. Transformations were performed using a Biolistic PDS-1000/He gun and the particle bombardment technique as in Example 3 above.

[0284]Transformation of re2-1 mutant cells with vector ZmRE2pML18 produced 14 transgenic plants. Thirteen of these fourteen plants produced seeds of which ten plants produced seeds having wild type appearance. Some of the seeds produced by these 10 plants had a wild-type phenotype and some had an re2 mutant phenotype. The ratio of seeds having a wild-type appearance to seeds having an re2 mutant phenotype varied in each plant. Approximately 25% to 70% of the seeds obtained from individual re2-1 mutant plants transformed with vector ZmRE2pML18 had a wild-type appearance. Restoration of a wild-type appearance in seeds from plants regenerated from re2 mutant cells transformed with the vector comprising the fragment encoding the corn RE2 homolog indicates that the corn RE2 homolog, encoded by ZmRE2 (SEQ ID NO:61), is capable of complementing an re2 mutation. These results suggest that the corn RE2 homolog performs the same function in corn as the rice RE2 protein performs in rice.

Example 9

Identification of a cDNA Clone Encoding OsRE2

[0285]A cDNA clone encoding OsRE2 was identified by screening a rice phage cDNA library using an RE2-specific probe.

[0286]The phage cDNA library was prepared from total RNA extracted from developing rice seeds harvested 2-5 days after pollination as follows. Total RNA was extracted using TRIzol® Reagent containing phenol and guanidine thiocyanate (Life Technologies Inc., Rockville, Md.). Poly(A) mRNA was purified from the total RNA using mRNA Purification kits, which consist of oligo (dT)-cellulose spin columns (Amersham Pharmacia Biotech Inc., Piscataway, N.J.). cDNA was synthesized using 5.5 μg of poly(A) mRNA and cDNA synthesis kits (Stratagene, La Jolla, Calif.), following the manufacturer's protocol with the exception of using Superscript® reverse transcriptase (Life Technologies Inc.) in the first step instead of Moloney murine leukemia virus reverse transcriptase. The cDNA was size-fractionated using BRL cDNA Size Fraction Columns (GIBCO-BRL). Fractions 1 to 13 were precipitated, resuspended, and ligated with 1 μg Uni-ZAP XR vector following the manufacturer's instructions. After incubation for two days at 4° C. the ligated DNA was packaged using Gigapack III Gold® packaging extract (Stratagene, La Jolla, Calif.). The titer of the resulting library was approximately 7.8×10^5 plaque forming units per mL (pfu/mL). The cDNA phage library was amplified following the manufacturer's instructions and 150 mL of phage cDNA library were obtained. The amplified library had a titer of 5.5×10^8 pfu/mL.

[0287]Screening for the RE2 cDNA was performed following standard protocols well known to those skilled in the art (Ausubel et al. 1993, "Current Protocols in Molecular Biology" John Wiley & Sons, USA, or Sambrook et al. 1989. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press). Briefly, 1.0×10^6 pfu were plated, transferred to nylon membranes, and subjected to hybridization with radioactively-labeled RE2 second exon probe. The nucleotide sequence of RE2 second exon probe is shown in SEQ ID NO:71. Following hybridization the membranes were exposed to film where approximately 1 positive plaque was detected per 100,000 plaques plated. Eight plaques that gave a positive signal were isolated after a second round of screening. Lambda phage DNA was prepared from all 8 plaques, converted into plasmid DNA, and sequenced. Six of the eight clones contained a cDNA sequence encoding OsRE2. One of these six clones, RE2 cDNA C1, had a 5'UTR that extended 196 nucleotides upstream of the ATG start codon predicted from the genomic sequence. The nucleotide sequence of clone RE2 cDNA C1 is shown in SEQ ID NO:72.
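The screening figures given in this Example can be tied together with a short calculation. The sketch below is illustrative arithmetic only; the variable names are hypothetical and every number is taken from the two preceding paragraphs.

```python
# Figures quoted in the text for the amplified library and the screen.
amplified_titer_pfu_per_ml = 5.5e8
amplified_volume_ml = 150
plated_pfu = 1.0e6
plaques_per_positive = 100_000   # about 1 positive per 100,000 plaques plated
plaques_isolated = 8
clones_encoding_osre2 = 6

total_pfu_in_library = amplified_titer_pfu_per_ml * amplified_volume_ml
expected_positives = plated_pfu / plaques_per_positive
fraction_confirmed = clones_encoding_osre2 / plaques_isolated

print(f"total pfu in amplified library: {total_pfu_in_library:.2e}")
print(f"positives expected from the screen: {expected_positives:.0f}")
print(f"fraction of isolated plaques encoding OsRE2: {fraction_confirmed:.0%}")
```

At the reported frequency, plating 1.0×10^6 pfu would be expected to yield about 10 positive plaques, which is in line with the 8 plaques isolated after the second round of screening.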

Sequence CWU 1

72123DNAArtificial SequenceOligonucleotide primer C10 6-3 1tagcagctgg gaagaacaac atg 23222DNAartificial sequenceOligonucleotide primer C10 6-4 2cgtgcaccac gtaacgttaa gc 22322DNAartificial sequenceoligonucleotide primer C10 15.9-1 3cagggttgtg taaggatcgt tg 22422DNAartificial sequenceOligonucleotide primer C10 15.9-2 4gatcatcgtg tagtaccagg ac 22520DNAartificial sequenceOligonucleotide primer C10-7.7 2 HPYIVF 5attgtctcgt gtgacagcgc 20620DNAartificial sequenceOligonucleotide primer C10-7.7 2 HPYIVR 6ccgcaattaa tattccgagc 20726DNAartificial sequenceOligonucleotide primer 11.5 HpyV 7aaagtgtggt aggtgtcatc cagttg 26825DNAartificial sequenceOligonucleotide primer C10 11.5-9 8gccacatgat catccactac caatg 25923DNAartificial sequenceOligonucleotide primer C10 11-5 9ctttttccga cccacatgaa ggt 231025DNAartificial sequenceOligonucleotide primer 11 HinfR 10tacaaacgct cctaaaccac catgt 251123DNAartificial sequenceOligonucleotide primer 9.6 DraIF 11tttgggtgca ttaaagtgga cca 231221DNAartificial sequenceOligonucleotide primer 9.6 DraIR 12ggggtaattc ggatgaccat g 211323DNAArtificial sequenceOligonucleotide primer E08 93KF 13ctcatagccg cctagcctca tag 231423DNAartificial sequenceOligonucleotide primer E08 93KR 14gaagcagaga aactccaacc tgg 231524DNAArtificial sequenceOligonucleotide primer E08 46KF 15gttcataggt gccaaatttg ggtg 241623DNAArtificial SequenceOligonucleotide primer E08 46KR 16cacaagtaac ccaatgccca aac 231723DNAArtificial sequenceOligonucleotide primer K08 21KF 17gttcacccat tagtgatgcc tgg 231825DNAArtificial sequenceOligonucleotide primer K08 21KR 18gttcactcga taagagcaat cgaac 251927DNAartificial sequenceOligonucleotide primer K08 46KF 19gttatgttgc acacctccag tagttac 272025DNAArtificial sequenceOligonucleotide primer K08 46KR 20gtcaagcctg ctgttaccct ttaag 252125DNAArtificial sequenceOligonucleotide primer LOB-82F 21gtcaagcctg ctgttaccct ttaag 252223DNAArtificial sequenceOligonucleotide primer LOB R1 22ccaccatgac gaacatctaa atg 232325DNAArtificial sequenceOligonucleotide primer LOB 
F2 23gtatagctcc caaccatttc tcctc 252423DNAArtificial sequenceOligonucleotide primer LOB R2 24ccaacatcac catcatcgtc ttc 2325810DNAOryza sativa 25atgagctcgt cggtggttgt gagcgcgagc ggcagcggca gcggcggcgg aggaggagga 60ggaggtggcg gcgccggagg tggaggagga ggtgggccgt gcggggcgtg caagttcttg 120cggcggaagt gcgtgcaggg gtgcatcttc gcgccctact tcgactcgga ggccggggcg 180gcgcacttcg cggcggtgca caaggtgttc ggcgccagca acgtgtccaa gctgctgcag 240cagatcccgg cgcaccgccg cctcgacgcc gtcgtcacca tctgctacga ggcccaggcc 300cgcctccgcg accccgtcta cggctgcgtc gcccacatct tccacctcca acaccaggtg 360gcaggtctcc agtccgagct gaactacctg caaggtcacc tctcgacgat ggagctgccg 420tcgccgccgc cctacgtcgc cgggccgacc ctggcgccgc cacagccaca gccactgatg 480ccgatgaccg ccgccgccaa cttcaacttc tccgacctgc catcgtcgtc ggcggccaac 540attccggtca ccgccgacct gtccaccctc tttgacccac tgccggcggc gcagccgcag 600tggggactat accagcagca gcaacaccac caccagcagc tgcatcatca cccctatgac 660cggatgggcg acggctcgtc gagcagcaga ggcggcgacg acgatggcag cgacggcggc 720gacttgcaag cgctggcgag ggagcttctt gaccgccatg gacggtcgtc gtcgagctcc 780aagctggagc cgccacctca cacacagtga 81026269PRTOryza sativa 26Met Ser Ser Ser Val Val Val Ser Ala Ser Gly Ser Gly Ser Gly Gly1 5 10 15Gly Gly Gly Gly Gly Gly Gly Gly Ala Gly Gly Gly Gly Gly Gly Gly 20 25 30Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys Val Gln Gly Cys 35 40 45Ile Phe Ala Pro Tyr Phe Asp Ser Glu Ala Gly Ala Ala His Phe Ala 50 55 60Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Gln65 70 75 80Gln Ile Pro Ala His Arg Arg Leu Asp Ala Val Val Thr Ile Cys Tyr 85 90 95Glu Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys Val Ala His 100 105 110Ile Phe His Leu Gln His Gln Val Ala Gly Leu Gln Ser Glu Leu Asn 115 120 125Tyr Leu Gln Gly His Leu Ser Thr Met Glu Leu Pro Ser Pro Pro Pro 130 135 140Tyr Val Ala Gly Pro Thr Leu Ala Pro Pro Gln Pro Gln Pro Leu Met145 150 155 160Pro Met Thr Ala Ala Ala Asn Phe Asn Phe Ser Asp Leu Pro Ser Ser 165 170 175Ser Ala Ala Asn Ile Pro Val Thr Ala Asp Leu Ser Thr Leu Phe Asp 180 185 190Pro Leu Pro 
Ala Ala Gln Pro Gln Trp Gly Leu Tyr Gln Gln Gln Gln 195 200 205His His His Gln Gln Leu His His His Pro Tyr Asp Arg Met Gly Asp 210 215 220Gly Ser Ser Ser Ser Arg Gly Gly Asp Asp Asp Gly Ser Asp Gly Gly225 230 235 240Asp Leu Gln Ala Leu Ala Arg Glu Leu Leu Asp Arg His Gly Arg Ser 245 250 255Ser Ser Ser Ser Lys Leu Glu Pro Pro Pro His Thr Gln 260 26527269PRTOryza sativaMISC_FEATURENCBI General Identification No. 18652509 27Met Ser Ser Ser Val Val Val Ser Ala Ser Gly Ser Gly Ser Gly Gly1 5 10 15Gly Gly Gly Gly Gly Gly Gly Gly Ala Gly Gly Gly Gly Gly Gly Gly 20 25 30Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys Val Gln Gly Cys 35 40 45Ile Phe Ala Pro Tyr Phe Asp Ser Glu Ala Gly Ala Ala His Phe Ala 50 55 60Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Gln65 70 75 80Gln Ile Pro Ala His Arg Arg Leu Asp Ala Val Val Thr Ile Cys Tyr 85 90 95Glu Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys Val Ala His 100 105 110Ile Phe His Leu Gln His Gln Val Ala Gly Leu Gln Ser Glu Leu Asn 115 120 125Tyr Leu Gln Gly His Leu Ser Thr Met Glu Leu Pro Ser Pro Pro Pro 130 135 140Tyr Val Ala Gly Pro Thr Leu Ala Pro Pro Gln Pro Gln Pro Leu Met145 150 155 160Pro Met Thr Ala Ala Ala Asn Phe Asn Phe Ser Asp Leu Pro Ser Ser 165 170 175Ser Ala Ala Asn Ile Pro Val Thr Ala Asp Leu Ser Thr Leu Phe Asp 180 185 190Pro Leu Pro Ala Ala Gln Pro Gln Trp Gly Leu Tyr Gln Gln Gln Gln 195 200 205His His His Gln Gln Leu His His His Pro Tyr Asp Arg Met Gly Asp 210 215 220Gly Ser Ser Ser Ser Arg Gly Gly Asp Asp Asp Gly Ser Asp Gly Gly225 230 235 240Asp Leu Gln Ala Leu Ala Arg Glu Leu Leu Asp Arg His Gly Arg Ser 245 250 255Ser Ser Ser Ser Lys Leu Glu Pro Pro Pro His Thr Gln 260 26528810DNAOryza sativa 28atgagctcgt cggtggttgt gagcgcgagc ggcagcggca gcggcggcgg aggaggagga 60ggaggtggcg gcgccggagg tggaggagga ggtgggccgt gcggggcgtg caagttcttg 120cggcggaagt gcgtgcaggg gtgcatcttc gcgccctact tcgactcgga ggccggggcg 180gcgcacttcg cggcggtgca caaggtgttc ggcgccagca acgtgtccaa gctgctgcag 240cagatcccgg cgcaccgccg 
cctcgacgcc gtcgtcatca tctgctacga ggcccaggcc 300cgcctccgcg accccgtcta cggctgcgtc gcccacatct tccacctcca acaccaggtg 360gcaggtctcc agtccgagct gaactacctg caaggtcacc tctcgacgat ggagctgccg 420tcgccgccgc cctacgtcgc cgggccgacc ctggcgccgc cacagccaca gccactgatg 480ccgatgaccg ccgccgccaa cttcaacttc tccgacctgc catcgtcgtc ggcggccaac 540attccggtca ccgccgacct gtccaccctc tttgacccac tgccggcggc gcagccgcag 600tggggactat accagcagca gcaacaccac caccagcagc tgcatcatca cccctatgac 660cggatgggcg acggctcgtc gagcagcaga ggcggcgacg acgatggcag cgacggcggc 720gacttgcaag cgctggcgag ggagcttctt gaccgccatg gacggtcgtc gtcgagctcc 780aagctggagc cgccacctca cacacagtga 81029269PRTOryza sativa 29Met Ser Ser Ser Val Val Val Ser Ala Ser Gly Ser Gly Ser Gly Gly1 5 10 15Gly Gly Gly Gly Gly Gly Gly Gly Ala Gly Gly Gly Gly Gly Gly Gly 20 25 30Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys Val Gln Gly Cys 35 40 45Ile Phe Ala Pro Tyr Phe Asp Ser Glu Ala Gly Ala Ala His Phe Ala 50 55 60Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Gln65 70 75 80Gln Ile Pro Ala His Arg Arg Leu Asp Ala Val Val Ile Ile Cys Tyr 85 90 95Glu Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys Val Ala His 100 105 110Ile Phe His Leu Gln His Gln Val Ala Gly Leu Gln Ser Glu Leu Asn 115 120 125Tyr Leu Gln Gly His Leu Ser Thr Met Glu Leu Pro Ser Pro Pro Pro 130 135 140Tyr Val Ala Gly Pro Thr Leu Ala Pro Pro Gln Pro Gln Pro Leu Met145 150 155 160Pro Met Thr Ala Ala Ala Asn Phe Asn Phe Ser Asp Leu Pro Ser Ser 165 170 175Ser Ala Ala Asn Ile Pro Val Thr Ala Asp Leu Ser Thr Leu Phe Asp 180 185 190Pro Leu Pro Ala Ala Gln Pro Gln Trp Gly Leu Tyr Gln Gln Gln Gln 195 200 205His His His Gln Gln Leu His His His Pro Tyr Asp Arg Met Gly Asp 210 215 220Gly Ser Ser Ser Ser Arg Gly Gly Asp Asp Asp Gly Ser Asp Gly Gly225 230 235 240Asp Leu Gln Ala Leu Ala Arg Glu Leu Leu Asp Arg His Gly Arg Ser 245 250 255Ser Ser Ser Ser Lys Leu Glu Pro Pro Pro His Thr Gln 260 26530810DNAOryza sativa 30atgagctcgt cggtggttgt gagcgcgagc ggcagcggca gcggcggcgg aggaggagga 
60ggaggtggcg gcgccggagg tggaggagga ggtgggccgt gcggggcgtt caagttcttg 120cggcggaagt gcgtgcaggg gtgcatcttc gcgccctact tcgactcgga ggccggggcg 180gcgcacttcg cggcggtgca caaggtgttc ggcgccagca acgtgtccaa gctgctgcag 240cagatcccgg cgcaccgccg cctcgacgcc gtcgtcacca tctgctacga ggcccaggcc 300cgcctccgcg accccgtcta cggctgcgtc gcccacatct tccacctcca acaccaggtg 360gcaggtctcc agtccgagct gaactacctg caaggtcacc tctcgacgat ggagctgccg 420tcgccgccgc cctacgtcgc cgggccgacc ctggcgccgc cacagccaca gccactgatg 480ccgatgaccg ccgccgccaa cttcaacttc tccgacctgc catcgtcgtc ggcggccaac 540attccggtca ccgccgacct gtccaccctc tttgacccac tgccggcggc gcagccgcag 600tggggactat accagcagca gcaacaccac caccagcagc tgcatcatca cccctatgac 660cggatgggcg acggctcgtc gagcagcaga ggcggcgacg acgatggcag cgacggcggc 720gacttgcaag cgctggcgag ggagcttctt gaccgccatg gacggtcgtc gtcgagctcc 780aagctggagc cgccacctca cacacagtga 81031269PRTOryza sativa 31Met Ser Ser Ser Val Val Val Ser Ala Ser Gly Ser Gly Ser Gly Gly1 5 10 15Gly Gly Gly Gly Gly Gly Gly Gly Ala Gly Gly Gly Gly Gly Gly Gly 20 25 30Pro Cys Gly Ala Phe Lys Phe Leu Arg Arg Lys Cys Val Gln Gly Cys 35 40 45Ile Phe Ala Pro Tyr Phe Asp Ser Glu Ala Gly Ala Ala His Phe Ala 50 55 60Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Gln65 70 75 80Gln Ile Pro Ala His Arg Arg Leu Asp Ala Val Val Thr Ile Cys Tyr 85 90 95Glu Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys Val Ala His 100 105 110Ile Phe His Leu Gln His Gln Val Ala Gly Leu Gln Ser Glu Leu Asn 115 120 125Tyr Leu Gln Gly His Leu Ser Thr Met Glu Leu Pro Ser Pro Pro Pro 130 135 140Tyr Val Ala Gly Pro Thr Leu Ala Pro Pro Gln Pro Gln Pro Leu Met145 150 155 160Pro Met Thr Ala Ala Ala Asn Phe Asn Phe Ser Asp Leu Pro Ser Ser 165 170 175Ser Ala Ala Asn Ile Pro Val Thr Ala Asp Leu Ser Thr Leu Phe Asp 180 185 190Pro Leu Pro Ala Ala Gln Pro Gln Trp Gly Leu Tyr Gln Gln Gln Gln 195 200 205His His His Gln Gln Leu His His His Pro Tyr Asp Arg Met Gly Asp 210 215 220Gly Ser Ser Ser Ser Arg Gly Gly Asp Asp Asp Gly Ser Asp Gly Gly225 230 
235 240Asp Leu Gln Ala Leu Ala Arg Glu Leu Leu Asp Arg His Gly Arg Ser 245 250 255Ser Ser Ser Ser Lys Leu Glu Pro Pro Pro His Thr Gln 260 26532809DNAOryza sativa 32atgagctcgt cggtggttgt gagcgcgagc ggcagcggca gcggcggcgg aggaggagga 60ggaggtggcg gcgcggaggt ggaggaggag gtgggccgtg cggggcgtgc aagttcttgc 120ggcggaagtg cgtgcagggg tgcatcttcg cgccctactt cgactcggag gccggggcgg 180cgcacttcgc ggcggtgcac aaggtgttcg gcgccagcaa cgtgtccaag ctgctgcagc 240agatcccggc gcaccgccgc ctcgacgccg tcgtcaccat ctgctacgag gcccaggccc 300gcctccgcga ccccgtctac ggctgcgtcg cccacatctt ccacctccaa caccaggtgg 360caggtctcca gtccgagctg aactacctgc aaggtcacct ctcgacgatg gagctgccgt 420cgccgccgcc ctacgtcgcc gggccgaccc tggcgccgcc acagccacag ccactgatgc 480cgatgaccgc cgccgccaac ttcaacttct ccgacctgcc atcgtcgtcg gcggccaaca 540ttccggtcac cgccgacctg tccaccctct ttgacccact gccggcggcg cagccgcagt 600ggggactata ccagcagcag caacaccacc accagcagct gcatcatcac ccctatgacc 660ggatgggcga cggctcgtcg agcagcagag gcggcgacga cgatggcagc gacggcggcg 720acttgcaagc gctggcgagg gagcttcttg accgccatgg acggtcgtcg tcgagctcca 780agctggagcc gccacctcac acacagtga 80933126PRTOryza sativa 33Met Ser Ser Ser Val Val Val Ser Ala Ser Gly Ser Gly Ser Gly Gly1 5 10 15Gly Gly Gly Gly Gly Gly Gly Gly Ala Glu Val Glu Glu Glu Val Gly 20 25 30Arg Ala Gly Arg Ala Ser Ser Cys Gly Gly Ser Ala Cys Arg Gly Ala 35 40 45Ser Ser Arg Pro Thr Ser Thr Arg Arg Pro Gly Arg Arg Thr Ser Arg 50 55 60Arg Cys Thr Arg Cys Ser Ala Pro Ala Thr Cys Pro Ser Cys Cys Ser65 70 75 80Arg Ser Arg Arg Thr Ala Ala Ser Thr Pro Ser Ser Pro Ser Ala Thr 85 90 95Arg Pro Arg Pro Ala Ser Ala Thr Pro Ser Thr Ala Ala Ser Pro Thr 100 105 110Ser Ser Thr Ser Asn Thr Arg Trp Gln Val Ser Ser Pro Ser 115 120 125349203DNAOryza sativa 34ggatccatcc aacagtttct cctaaatatc agaataaagt tgaagtaact gctttgctgc 60cgtccaagat atattgcaaa ggacaaaagg ttcaggagca atgcaagaca aaaaaatgtg 120atctcaactg tatgtacatc catatatatg cctggagttc acttgacctg taaagtagta 180gtaccaaatt ctgttgctga ccaattcatt ttaattatct taattccttg cataaaagaa 
240taaataattc agcagatgct tgctaaggaa ttaatgtgta atatatataa gcacaactaa 300taaagcaatg gatactttca accaaaaaaa gtggtttaat ttgtctatag tagttctgtg 360gaaatggaga acttaagaaa ggacaaaaag gaaataacca cttttggatc tatttgatgc 420atggtatttc ttcaactcca ggggtatttt tgatatatgt atatatttag ggtataagct 480aaatatgtac gcatattctc catatgaaag agtgcagtac tatttagcag cattctccat 540atggatggtt agtactgaca ttgaagaatt ttgtagctag gactccatgt ttttttttat 600cagtgataca gttgtacgtt gtcaattatt gatcatgaat tccagtttga tgtgacaatt 660aatttgatga ttagtataac tagaaattaa cgatggatca atggacacct ggcccataat 720taattaatta aattgtgata agattgtgtg tttgagtcac aaaaaacttt aagtggtgaa 780tttgagaggt gtggtcaagc atgcaagttt cttacctagc cagggtgccg tcttttggct 840tccacgcatc catctataga tctcagatgc acattatatt ctctgtgtgt atgggagaga 900gagagagaga gaaagaaggg tttgcttggt acacactctc ctgatcaatc aggccatact 960gtacagtgat cacacagtcc atgcatgcca gctctagtgc tgcatctacc tagaagctag 1020catatgcttt ggttcaaact gtgcacacat cattcacaca tatttacatg ttactatatc 1080ttacccaagg aagaggtact ctttgctgta ataacacatg tgattatgga aaaactgata 1140aaatcattgt ccatacatat atttatgtat gacatgtttg aaaattgggt ttgagaagta 1200ttatctcact ctttaaagga taagttttaa cctccaccgc accccatatt atcccgaccc 1260tgcatctatt
tttatttcac aatcacatcc tttgtaaccc attatcttga tatctcatac 1320atattataca tgtattatat gtatatgaga aaatctttca tataaaaatt aaactactgc 1380atgattatat atacgtgtta cgcgtatatg atcttggtca aatgtaacct caaggaaaca 1440taaagttttg taagtcgtaa ctgcaggcgg tgaagtgtct agagctgatg gctggcggtg 1500attcccatgc atgtctgggc atgcatggat cgatccatcg atgcctcgaa attcacgcaa 1560gctctgcaac ggtttccgga tacagatggt cattgtcgtg tcctttttat ttttctcttt 1620catttgcttt aattttcttc tctctgtttg gcttaacatg tgtggtacgt acactttgta 1680acggatgtga gatgagcaat gcagtaagct taaggtagct agcgcgttgc agaatgcaga 1740tcagagccac acatttactt tacttctcac ctgatcgatc gatcactgaa tgaagagagt 1800ccaaagctag gcagcagttc ataacatgca tacgttgaca acgtacggac gcagctggca 1860gctagctata tatttaatta agccttcata ctcaaagaat aactttttgg agctcttgaa 1920tttctatcct tgcgttagct agatagatac gtcgaaaaaa ataactgcac ttttttagtg 1980atacaatcca aagccagcaa aaaataataa attatatacg ctatttatga tggtaatata 2040ttactgatac ataatccagc ccattttgct ctccatctaa ctttagatgt tcatatcaac 2100cacttcggtt atattgcgga aattttgatt gaatgtatat atgtggcatt atagattata 2160tctatgtctg aaaaatcata tcaccacata ttggttataa tgtgcgaaaa tatagaacaa 2220aaaactgata ttgtcgatta gaggatgcca cctacaagct atagttttac atatattatt 2280ttatgctgtc tactaaaaga acaataaaac catttactta caccactgca ttcaaacagt 2340aaattggaga agttggcttt ctaccttgac actactagtc cttgctaaga taaaagtaaa 2400acaacattac catcttatat caaatctact aattaaacca ctccatatta gatgaaatcc 2460atgttaaaga gtctatatct atgcagtcgc tctcatgata tgtcattata tcttgatcta 2520tctatgttaa tttagaagtt tacacccaca atcgctctaa ttttatagga ccatcgatga 2580tatataatat ttttttcatc aggaatgaaa tagattacgt acacagttac attacgactc 2640atgacactag aactatatct atgtttagaa gtttatctag atatggcatg attaatagaa 2700tgtatttgtg ttagagctct aagtttagaa tatgtgaccg ataaacctac cgttttattc 2760tttttaacta catgttttgc aaaagattaa attgttatct tacaattcat atagcactag 2820cattatgatc tggtgtatca tatatgtcat tatccatcta tgtttagaag tttatatcca 2880cggctctaat tatgtggaac gattaaatga tgatatatat agttgttaag aggtatggaa 2940tagattaaat aattagttac gttacgattc gtaacactag 
tgctatctat atttagaagt 3000ttacatccac aatcgctcta attatgtggg attattaaac gatgaatata tttttctgtg 3060aggaatgaat tgaaatagat taaatagtta cgttacgatt cgtaacatta gttctatcta 3120tgtttagaag tttatctgga caatcaccat catgtcagtg tgccaatacg ctaaacttag 3180aagatgcctc cgataatcgt agcattagta ttatttgggg aatgaattaa aaaatataaa 3240taatgatata ttacaattga taatctatgt ttagaaactt ttgtcggtta ctcgctcaaa 3300ttgtatgggg taataaatcg gtgaagtata cttttatact gaatggacaa gataagctac 3360cattgatagc attagcggtt ctatttggta tattatcccg attatccacc ctcaatttgt 3420gctaaaataa gatttttaca tcatcctagt caatatttgg ggttaccctg tctgcattat 3480aatttatttt tgtgcttaac tataatatat acatacacta taatttatct aaataaaagt 3540tctggtatga ttaaaaaaac taacaatttt gtgtgtggcg tattgagtgg aagaatgtca 3600tgttaggatc acatgggaga gagtgcatgc gacgagatca tccttgttgg tctgtgcagg 3660tggtgtgaaa tgtgatcaat atatatggtg gtgacagaga gagaaactaa cccaaaaaaa 3720caaaaaaaga gagatgagag cgaatggatg gatgcaattg gcattaattt tcggtctttg 3780ctgttctccc ccagccaggc cagtttgctt cacgcaatat tctaaccctt tgagaaagag 3840aagtgtactt gttgccaagg ccaattgcaa gcatttgcct tggctttaaa gtctcatcaa 3900tacaacggca ccaaaaagaa aacacagaga tagaaaacca cctagtagct gatatacatt 3960tatatatgac ctaaataaaa aaattccatt aatatgtata attccagcaa caacataaag 4020aaataaaaat gcatttaaga aaacatagaa agaaataaaa ataaagtaaa taaagctagc 4080taggcccaaa attggcagta attaagtagg gactagtata gaaatatatg gatatataca 4140ccagcctcca ccaatgggat tgcaaacagc ctacttatca ctttgctgct gtatttacgc 4200ttttgccctt cttccctcct atatgtacag ccgcccccac ctcattcctc cattcttact 4260ccacacacac actctctctc tctctaccat ttgtgagaaa gaaaatcgat tcagttctag 4320agagagaaac aaacaatttt cgctgtctat ctctctcttg ctactagtcg gtcgatcttg 4380agttagtttt aaccctacac aagccaaggt aacaacatct agcaggtagg agaagagagc 4440tagagactag gtggtggggg ctcattccaa gaatgagctc gtcggtggtt gtgagcgcga 4500gcggcagcgg cagcggcggc ggaggaggag gaggaggtgg cggcgccgga ggtggaggag 4560gaggtgggcc gtgcggggcg tgcaagttct tgcggcggaa gtgcgtgcag gggtgcatct 4620tcgcgcccta cttcgactcg gaggccgggg cggcgcactt cgcggcggtg cacaaggtgt 4680tcggcgccag 
caacgtgtcc aagctgctgc agcagatccc ggcgcaccgc cgcctcgacg 4740ccgtcgtcac catctgctac gaggcccagg cccgcctccg cgaccccgtc tacggctgcg 4800tcgcccacat cttccacctc caacaccagg tatatactac tcatactcac tcgatctcct 4860cctcctcatc gtcgccgtcg gtggcggcga gtcatttaga tgttcgtcat ggtggttgtg 4920cgatcgatcg agcttctatt ttggttttgg ttttggtttt ggtttcttgg gtttgatttg 4980gttggttttt ggaggaagga tggatgtctt tttcttgaag aaggcaaagg agtccttttt 5040tggggaggag agaaggctag caagctaagc aagggagtta atctggagaa atggacttct 5100ctctttctgt tactactcac tactactcag gcctaccagt gatgatgtgc acatctcatc 5160atccatctca tcattaaatc ccatcatcta ctctctctct tgttcttgct ttctcttctt 5220tcattctttc tctgaatctt ctgatagata gattgataga tagatgcatg atgatatccc 5280catttatcac atcattttat atcatgcatc aggttgttgt cccccccccc cctctctctc 5340tcttgctctg aaatcaagga gggtatgcat acatgcttgg atttcacacc cacaaaagaa 5400aaatggtaat ttagcaagcc ctagctagga attaggatgc atcaatctct agtagttctt 5460gaagctgcag ctagtatagc tcccaaccat ttctcctctt ccttttcttt actaatatga 5520tcagcatttc attaagattt ttttgtatat agtatagcta cctacatttt ctcttgatct 5580gattatgcca agtactaatt ttctgtccat tttactgatg atgatctggt tcaattcccc 5640atgtgtatat gtactctcag gtggcaggtc tccagtccga gctgaactac ctgcaaggtc 5700acctctcgac gatggagctg ccgtcgccgc cgccctacgt cgccgggccg accctggcgc 5760cgccacagcc acagccactg atgccgatga ccgccgccgc caacttcaac ttctccgacc 5820tgccatcgtc gtcggcggcc aacattccgg tcaccgccga cctgtccacc ctctttgacc 5880cactgccggc ggcgcagccg cagtggggac tataccagca gcagcaacac caccaccagc 5940agctgcatca tcacccctat gaccggatgg gcgacggctc gtcgagcagc agaggcggcg 6000acgacgatgg cagcgacggc ggcgacttgc aagcgctggc gagggagctt cttgaccgcc 6060atggacggtc gtcgtcgagc tccaagctgg agccgccacc tcacacacag tgatcctcct 6120cactgtgtgt gatcatcaat tcagcttagc tagctagctc atggactaat tgatcaggtg 6180ttaatcattc atgaatgcat tggttgaggc aagaagagaa tttaatccca atggtgaaat 6240ttttttcacc aaatcctcca tgtcgttgag gcgaaaaatc gaacgacgac gacgacgatg 6300gcgaggaaga cgatgatggt gatgttgggg atggagatgg taggtaacag gcattgcccg 6360gttttcgcgt atcatctttg ttcttgggct agggtgcaag 
gggtgcccac ttgcaccatt 6420ttataatgct tgggagtttg ctccaaaaga ggaagcttgg ggatgagttc ttgttagctt 6480agctgtagcc ctgatcactg ttccattgca acagttctaa ttgcaaaaaa caaaacctgg 6540tctaatttag ttcaatatac aaaaaaaaaa tctttgtctc atcgcagatt aattacggtt 6600gtgtttgaag tttgtttgat ttctgtttca aggttctaac tgaacatctg aagtgaagtg 6660tagtcagtct taatttggga ctttctgatc tctctatgac aaatgagctt tttttttttg 6720ccataatata tacaacagct agcaagtagc aaaatgagca tttttggggt taatggtaac 6780tgaacatata tgtatggtgg caactgaata aagtgtgaac atatgtgagt acttggaggc 6840tagagccagg tgttggtttc cttttacttg cttcgtgctt ctacaactac aataatgcaa 6900gtattcatat ggtgcaacct tagtcttaga ttcagtctgg ctagctagct agctatttct 6960aatgggacac agtacattta aaacaagcct aattaaacta gtttatttct ctatgaagag 7020tcgtggtata tctggggcta aatgattggc aggggattat attttagagt ttgatatata 7080gatgagtgag agacagacag gaagcatagc tttggtggga catctttgac taagaccatg 7140cagcatgcac acaacaatgt tttctctcct tatgtttctt gaagttatat catatgccct 7200tgtattcagg gactcctttg ttatcaattg ttggaaaatg acaagcggtt gggatatgaa 7260taatatgatg ccataggaaa gtacatgttt cagtttagct agctctttaa tgtgtccaaa 7320ccgcattgaa aagtttcatg attactacta gtccatgtag gtaactaatt attaccgtaa 7380tgaacatgca tatgcatatg agttaatttt ggcatgtact ctaagctaat ttaagatgat 7440gtttctgtgg cgaccggcca tgcgtgcaaa tacatggtac tatatatgca taataatagg 7500gatactgcta ctagttaagt aagttaatta tgtgtctcta cattactctt tgattcgtta 7560attaattaga caggtctttt ttttcttaag gaagatcgtc actaccgtaa attatcaaag 7620cagggtaaat ttgacatgta actatggtaa gttagtaact attataacta gctggtacct 7680agcatcaata atattccttg taacaatatt atttctgcaa cttttgcaca agtaagaata 7740taaacattaa taggaaacaa gtattatttg tacaaactaa agaatgtaat aattagttgg 7800atcattagta atgcatttaa ttagctttct tagaaataag agcattaatc aacagttgtc 7860atatatgcaa tgaatcgtgt cattaatgtg tacattttgc gtgcaaggat cgcaaaagtt 7920ccatatagct gttactattc tttgcaaatt tatcgtgcgt gatatcaaat gtattggatt 7980ctcctactat tagaattgat acataagcga caccatcaca catgagaacg tcttttctta 8040atatatataa gcacaagata ttagctagat gccaaattaa atacttgatt tccagcactt 8100catagatatt 
agttaccttt ctcaagtttg gtgtgaaaaa aatgcatatc tatatatatt 8160tgacagtatt ttagtagaat aatgtgagta gctagctgga aaagaatata tttctgcatg 8220ctgcaaatat atggtaccaa ctgttcagtg ctaccctgaa ctgaagtaaa tgtcttatta 8280atgaagagcc atcttatgtt gttttagaaa ttctatatgt agtgttggca ttgacttgat 8340ttacacacta tagtgttata tatgcttgta gctgcacaaa agtagctttt agttgctggc 8400atgtcaattt accaaacaga aaaagagacg ttcaatttgt ggatgaaaaa aaaaggaata 8460ttttttaaag acgcaagttt aacacaattc aaaatgaagc ctttcgatgt tgagtttaag 8520taatctttaa tttaaatata aaagaataat ggtagtcagg gttaacactg cacaaaaaag 8580tgaccttggt cctggtccta atgaagcggt ttccttaaaa cctgctagta accacctcac 8640agttctcaat gctatgaaga aattaaaaag tcggttctaa atcaattaag gtaaatcatg 8700gaagaatata gatgtcttac gaaagtaatc tgccccaata agcaatagta cagtggtcag 8760tggagatcag agaaataatt ttgtgatatg gagtttaaac attggggtag cggaattaaa 8820ctactatttg gtgtgaaatt atattaagta ggtcagaagc atccatacat agtacttctt 8880ctgtcccaaa atataaagag tttcggttga atgggggcat atctaagtat tgcgagtttg 8940gacaggctac atcccgcatg gaaaaaaatt gttgtttttc ataccttttt gacaggcggg 9000tgagccaacc acgtgaaaaa agtttttttt tcgtagtgta tccagtagac tgacatgtta 9060gcccaatggc gaaatcggcc ctagagcaaa taacgttgga ggcaatataa tggatcaaac 9120agagatggtg tagcagctag ccgtgtgggg gccaagggct tttggaggcg gctattgcaa 9180tttcggtttg aaatttcgga tcc 9203356157DNAArtificial sequenceVector pML18 35gaatatgcat cactagtaag ctttgctcta gactggaatt cgtcgactct agaggatcca 60attccaatcc cacaaaaatc tgagcttaac agcacagttg ctcctctcag agcagaatcg 120ggtattcaac accctcatat caactactac gttgtgtata acggtccaca tgccggtata 180tacgatgact ggggttgtac aaaggcggca acaaacggcg ttcccggagt tgcacacaag 240aaatttgcca ctattacaga ggcaagagca gcagctgacg cgtacacaac aagtcagcaa 300acagacaggt tgaacttcat ccccaaagga gaagctcaac tcaagcccaa gagctttgct 360aaggccctaa caagcccacc aaagcaaaaa gcccactggc tcacgctagg aaccaaaagg 420cccagcagtg atccagcccc aaaagagatc tcctttgccc cggagattac aatggacgat 480ttcctctatc tttacgatct aggaaggaag ttcgaaggtg aaggtgacga cactatgttc 540accactgata atgagaaggt tagcctcttc aatttcagaa agaatgctga 
cccacagatg 600gttagagagg cctacgcagc aggtctcatc aagacgatct acccgagtaa caatctccag 660gagatcaaat accttcccaa gaaggttaaa gatgcagtca aaagattcag gactaattgc 720atcaagaaca cagagaaaga catatttctc aagatcagaa gtactattcc agtatggacg 780attcaaggct tgcttcataa accaaggcaa gtaatagaga ttggagtctc taaaaaggta 840gttcctactg aatctaaggc catgcatgga gtctaagatt caaatcgagg atctaacaga 900actcgccgtg aagactggcg aacagttcat acagagtctt ttacgactca atgacaagaa 960gaaaatcttc gtcaacatgg tggagcacga cactctggtc tactccaaaa atgtcaaaga 1020tacagtctca gaagaccaaa gggctattga gacttttcaa caaaggataa tttcgggaaa 1080cctcctcgga ttccattgcc cagctatctg tcacttcatc gaaaggacag tagaaaagga 1140aggtggctcc tacaaatgcc atcattgcga taaaggaaag gctatcattc aagatgcctc 1200tgccgacagt ggtcccaaag atggaccccc acccacgagg agcatcgtgg aaaaagaaga 1260cgttccaacc acgtcttcaa agcaagtgga ttgatgtgac atctccactg acgtaaggga 1320tgacgcacaa tcccactatc cttcgcaaga cccttcctct atataaggaa gttcatttca 1380tttggagagg acacgctcga gctcatttct ctattacttc agccataaca aaagaactct 1440tttctcttct tattaaacca tgaaaaagcc tgaactcacc gcgacgtctg tcgagaagtt 1500tctgatcgaa aagttcgaca gcgtctccga cctgatgcag ctctcggagg gcgaagaatc 1560tcgtgctttc agcttcgatg taggagggcg tggatatgtc ctgcgggtaa atagctgcgc 1620cgatggtttc tacaaagatc gttatgttta tcggcacttt gcatcggccg cgctcccgat 1680tccggaagtg cttgacattg gggaattcag cgagagcctg acctattgca tctcccgccg 1740tgcacagggt gtcacgttgc aagacctgcc tgaaaccgaa ctgcccgctg ttctgcagcc 1800ggtcgcggag gccatggatg cgatcgctgc ggccgatctt agccagacga gcgggttcgg 1860cccattcgga ccgcaaggaa tcggtcaata cactacatgg cgtgatttca tatgcgcgat 1920tgctgatccc catgtgtatc actggcaaac tgtgatggac gacaccgtca gtgcgtccgt 1980cgcgcaggct ctcgatgagc tgatgctttg ggccgaggac tgccccgaag tccggcacct 2040cgtgcacgcg gatttcggct ccaacaatgt cctgacggac aatggccgca taacagcggt 2100cattgactgg agcgaggcga tgttcgggga ttcccaatac gaggtcgcca acatcttctt 2160ctggaggccg tggttggctt gtatggagca gcagacgcgc tacttcgagc ggaggcatcc 2220ggagcttgca ggatcgccgc ggctccgggc gtatatgctc cgcattggtc ttgaccaact 2280ctatcagagc ttggttgacg 
gcaatttcga tgatgcagct tgggcgcagg gtcgatgcga 2340cgcaatcgtc cgatccggag ccgggactgt cgggcgtaca caaatcgccc gcagaagcgc 2400ggccgtctgg accgatggct gtgtagaagt actcgccgat agtggaaacc gacgccccag 2460cactcgtccg agggcaaagg aatagtgagg tacctaatag tgagatccaa cacttacgtt 2520tgcaacgtcc aagagcaaat agaccacgna cgccggaagg ttgccgcagc gtgtggattg 2580cgtctcaatt ctctcttgca ggaatgcaat gatgaatatg atactgacta tgaaactttg 2640agggaatact gcctagcacc gtcacctcat aacgtgcatc atgcatgccc tgacaacatg 2700gaacatcgct atttttctga agaattatgc tcgttggagg atgtcgcggc aattgcagct 2760attgccaaca tcgaactacc cctcacgcat gcattcatca atattattca tgcggggaaa 2820ggcaagatta atccaactgg caaatcatcc agcgtgattg gtaacttcag ttccagcgac 2880ttgattcgtt ttggtgctac ccacgttttc aataaggacg agatggtgga gtaaagaagg 2940agtgcgtcga agcagatcgt tcaaacattt ggcaataaag tttcttaaga ttgaatcctg 3000ttgccggtct tgcgatgatt atcatataat ttctgttgaa ttacgttaag catgtaataa 3060ttaacatgta atgcatgacg ttatttatga gatgggtttt tatgattaga gtcccgcaat 3120tatacattta atacgcgata gaaaacaaaa tatagcgcgc aaactaggat aaattatcgc 3180gcgcggtgtc atctatgtta ctagatcgat caaacttcgg tactgtgtaa tgacgatgag 3240caatcgagag gctgactaac aaaaggtaca tcggtcgacg agctccctat agtgagtcgt 3300attagaggcc gacttggcca aattcgtaat catggtcata gctgtttcct gtgtgaaatt 3360gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt aaagcctggg 3420gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc gctttccagt 3480cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg agaggcggtt 3540tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc 3600tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca gaatcagggg 3660ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg 3720ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac 3780gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg 3840gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct 3900ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat ctcagttcgg 3960tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 
cccgaccgct 4020gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac 4080tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt 4140tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt atctgcgctc 4200tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca 4260ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat 4320ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac 4380gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt 4440aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc 4500aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg 4560cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct ggccccagtg 4620ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca ataaaccagc 4680cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc atccagtcta 4740ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg cgcaacgttg 4800ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct tcattcagct 4860ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa aaagcggtta 4920gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg 4980ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc ttttctgtga 5040ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg agttgctctt 5100gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa gtgctcatca 5160ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg agatccagtt 5220cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc accagcgttt 5280ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga 5340aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat cagggttatt 5400gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc 5460gcacatttcc ccgaaaagtg ccacctgacg cgccctgtag cggcgcatta agcgcggcgg 5520gtgtggtggt tacgcgcagc gtgaccgcta cacttgccag cgccctagcg cccgctcctt 5580tcgctttctt cccttccttt ctcgccacgt tcgccggctt tccccgtcaa gctctaaatc 5640ggggcatccc tttagggttc cgatttagtg ctttacggca cctcgacccc aaaaaacttg 5700attagggtga tggttcacgt 
agtgggccat cgccctgata gacggttttt cgccctttga 5760cgttggagtc cacgttcttt aatagtggac tcttgttcca aactggaaca acactcaacc 5820ctatctcggt ctattctttt gatttataag ggattttgcc gatttcggcc tattggttaa 5880aaaatgagct gatttaacaa aaatttaacg cgaattttaa caaaatatta acaaaatatt 5940aacgtttaca atttcccatt cgccattcag gctgcgcaac tgttgggaag ggcgatcggt 6000gcgggcctct tcgctattac gccagctggc gaaaggggga tgtgctgcaa ggcgattaag 6060ttgggtaacg ccagggtttt cccagtcacg acgttgtaaa acgacggcca gtgccaagct 6120gacttggtca gcggccgcag atttaggtga cactata 6157361130DNAZea maysmisc_featurecef1f.pk001.f4fis 36cggcaggcac gcacgcaggg agagagatag ataaaaggtc gcccccttga ggacagggca 60gggcagctga gggcaatgag cgctggcgga ggcggcggcg gcaccagcac gcttggcggc 120gggggcccga gcggcagcgg cagcggaggc cctggaggaa gcggcggcgg cgggccttgc 180ggcgcgtgca agttcctccg gcgcaagtgc gtcagcggct gcatcttcgc gccctacttc 240gactcggagc agggcgcggc gcacttcgcg gccgtgcaca aggtgttcgg cgccagcaac 300gtgtccaagc tgctgctcca gatcccggcg cacaagcgcc tcgacgccgt cgtcaccatc 360tgctacgagg cccaggcgcg gctccgcgac cccgtctacg gctgcgtcgc ccacatcttc 420gcgctccagc agcaggtggt gaatctccag gccgagctga cctacctgca agcacacctc 480gccacgctcg agctgccggc cccgcccccg ctgccggccc cgccgcagat gcccatgcca 540ggcccgttct ccatctcgga cctgccgttg tcgaccagcg tccccaccac cgtcgacctg 600tccgcgctct tcgacccgcc accaccgcag tgggcgacgg cgcagcagcc gcaccaccac 660catcaacagc cgccgcagca ccaccagctc cggcaaccgg cgccgtatgg cgctggcgcg 720tccgtcaggt ccggcggcgt gaagctcgag cacccgccgc cacactcaag atgagctgga 780tgggggagta gaaggatcaa aaacccgtgc
agaacaaggt gagagttggc gcccggcagt 840atcgagggag ataggggtcg gtgacgggcg atgtccagca cagcaggagt aggtaagcag 900cattggccgg ttttcgcgta cccagcaccc ctgttgttaa tcggctgggg tgcaatggcg 960gcgcccactt gcttgatata ttctccagtt tgatcatatt tgctccaaga caaaagaaag 1020agtgctgggg atcgacgaga gtattactag aattgacatg tattagtaaa aaaaaaaaaa 1080aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 113037232PRTZea maysMISC_FEATUREcef1f.pk001.f4fis 37Met Ser Ala Gly Gly Gly Gly Gly Gly Thr Ser Thr Leu Gly Gly Gly1 5 10 15Gly Pro Ser Gly Ser Gly Ser Gly Gly Pro Gly Gly Ser Gly Gly Gly 20 25 30Gly Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys Val Ser Gly 35 40 45Cys Ile Phe Ala Pro Tyr Phe Asp Ser Glu Gln Gly Ala Ala His Phe 50 55 60Ala Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu65 70 75 80Leu Gln Ile Pro Ala His Lys Arg Leu Asp Ala Val Val Thr Ile Cys 85 90 95Tyr Glu Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys Val Ala 100 105 110His Ile Phe Ala Leu Gln Gln Gln Val Val Asn Leu Gln Ala Glu Leu 115 120 125Thr Tyr Leu Gln Ala His Leu Ala Thr Leu Glu Leu Pro Ala Pro Pro 130 135 140Pro Leu Pro Ala Pro Pro Gln Met Pro Met Pro Gly Pro Phe Ser Ile145 150 155 160Ser Asp Leu Pro Leu Ser Thr Ser Val Pro Thr Thr Val Asp Leu Ser 165 170 175Ala Leu Phe Asp Pro Pro Pro Pro Gln Trp Ala Thr Ala Gln Gln Pro 180 185 190His His His His Gln Gln Pro Pro Gln His His Gln Leu Arg Gln Pro 195 200 205Ala Pro Tyr Gly Ala Gly Ala Ser Val Arg Ser Gly Gly Val Lys Leu 210 215 220Glu His Pro Pro Pro His Ser Arg225 230381038DNAZea maysmisc_featurecpf1c.pk006.d18afis 38agaagcaggg cgcaagtcct accatagcaa tatagcatag ctagcacacc agtagctagc 60atcggagacg atctatcgac tagctctcta tagctagtta gctcttccct tgctagccgt 120ttgcgccggt gactgacgac gaccgacgac atggccaacg aaggggccgc cgctgccgct 180gccgctgccg ctgctgctgc cgcgacgggc gcggggtctc cgtgcggcgc gtgcaagttc 240ctgcgccggc ggtgcgtgcc ggagtgcgtg ttcgcgccct acttcagcag cgaccagggc 300gccgcgcgct tcgccgccat ccacaaggtg ttcggcgcca gcaacgcctc caagctgctg 360tcccacctcc ccgtggccga ccgctgcgag 
gccgtcgtca ccatcaccta cgaggcgcag 420gccaggctcc gggaccccgt ctacggctgc gtcgcccaga tcttcgccct ccagcagcag 480gtcgccatcc tgcaagcgca gctgatgcag gccaaggcgc agctggcgtg cggcgtccag 540ggcgccgccg cgcactcgcc ggcgagccac caccaccacc agtggccgga cagcgccagc 600atcagcgccc tgctccgcca ggacgcggcg tgtagcgcca ggaggcccgg cgggcccctc 660gacgacttct tcactccgga gctcgtggcc gggttcaggg acgacgtcgc cgccgccgcc 720gggcagcatt gcgcaggcaa ggtggatgcc ggagagctcc agtacctggc ccaggccatg 780atgaggagcc ccaactactc cctgtagccg tagctgtagc tgcctaggaa ggatgatgag 840aatcagacac catgcgtttt ggagccatgc catgctgtgc catctcatct cgatctccac 900tccgctaatg caagtgttga gagatgagct agaaattcct gcaaaaggaa gataacaact 960tgtaccagct agtgatgaag tactctcctt gtctctctca aaaaaaaaaa aaaaaaaaaa 1020aaaaaaaaaa aaaaaaaa 103839218PRTZea maysMISC_FEATUREcpf1c.pk006.d18afis 39Met Ala Asn Glu Gly Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala1 5 10 15Ala Ala Thr Gly Ala Gly Ser Pro Cys Gly Ala Cys Lys Phe Leu Arg 20 25 30Arg Arg Cys Val Pro Glu Cys Val Phe Ala Pro Tyr Phe Ser Ser Asp 35 40 45Gln Gly Ala Ala Arg Phe Ala Ala Ile His Lys Val Phe Gly Ala Ser 50 55 60Asn Ala Ser Lys Leu Leu Ser His Leu Pro Val Ala Asp Arg Cys Glu65 70 75 80Ala Val Val Thr Ile Thr Tyr Glu Ala Gln Ala Arg Leu Arg Asp Pro 85 90 95Val Tyr Gly Cys Val Ala Gln Ile Phe Ala Leu Gln Gln Gln Val Ala 100 105 110Ile Leu Gln Ala Gln Leu Met Gln Ala Lys Ala Gln Leu Ala Cys Gly 115 120 125Val Gln Gly Ala Ala Ala His Ser Pro Ala Ser His His His His Gln 130 135 140Trp Pro Asp Ser Ala Ser Ile Ser Ala Leu Leu Arg Gln Asp Ala Ala145 150 155 160Cys Ser Ala Arg Arg Pro Gly Gly Pro Leu Asp Asp Phe Phe Thr Pro 165 170 175Glu Leu Val Ala Gly Phe Arg Asp Asp Val Ala Ala Ala Ala Gly Gln 180 185 190His Cys Ala Gly Lys Val Asp Ala Gly Glu Leu Gln Tyr Leu Ala Gln 195 200 205Ala Met Met Arg Ser Pro Asn Tyr Ser Leu 210 215401262DNAZea maysmisc_featurecpi1c.pk005.a12fis 40gcggcacgca cgcacgctcg cagggagaga gatagataaa aggtcgcccc cttgagggca 60gggcagggca gctgagggca atgagcgctg gcggcggcag cagcacgctt ggcggcgggg 
120ggccgagcgg cagcagcagc ggaggccctg gaggaagcgg cggcggcggc gggccttgcg 180gcgcgtgcaa gttcctccgg cgcaagtgcg tcagcggctg catcttcgcg ccctacttcg 240actcggagca gggcgcggcg cacttcgcgg ccgtgcacaa ggtgttcggc gccagcaacg 300tgtccaagct gctgctccag atcccggcgc acaagcgcct cgacgccgtc gtcaccatct 360gctacgaggc ccaggcgcgg ctccgcgacc ccgtctacgg ctgcgtcgcc cacatcttcg 420cgctccagca gcaggtggtg aatctccagg ccgagctgac ctacctgcaa gcacacctcg 480ccacgctcga gctgccggcc ccgcccccgc tgccggcccc gccgcagatg cccatgccag 540gcccgttctc catctcggac ctgccgttgt cgaccagcgt ccccaccacc gtcgacctgt 600ccgcgctctt cgacccgcca ccaccgcagt gggcgacggc gcagcagccg caccaccacc 660atcaacagcc gccgcagcac caccagctcc ggcaaccggc gccgtatggc gctggcgcgt 720ccgtcaggcc cggcggcggc cccggcatgg cagagagctc aggcggagac gagctgcagt 780cgctggcgag ggagctcctg gaccgccacc ggtccggcgg cgtgaagctc gagcacccgc 840cgccacactc aagatgagct ggatggggga gtagaaggat caaaaacccg tgcagaacaa 900ggtgagagtt ggcgcccggc agtatcgagg gagatagggg tcggtgacgg gcgatgtcca 960gcacagcagg agtaggtaag cagcattggc cggttttcgc gtacccagca cccctgttgt 1020taatcggctg gggtgcaatg gcggcgccca cttgcttgat atattctcca gtttgatcat 1080atttgctcca agacaaaaga aagagtgctg gggatcgacg agagtattac tagaattgac 1140atgtattagt aacattattg ttacctttga taccgttcca ttagttgcaa gatttttatt 1200aagaaaagaa tctcaacatg gtttctaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1260aa 126241258PRTZea maysMISC_FEATUREcpi1c.pk005.a12fis 41Met Ser Ala Gly Gly Gly Ser Ser Thr Leu Gly Gly Gly Gly Pro Ser1 5 10 15Gly Ser Ser Ser Gly Gly Pro Gly Gly Ser Gly Gly Gly Gly Gly Pro 20 25 30Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys Val Ser Gly Cys Ile 35 40 45Phe Ala Pro Tyr Phe Asp Ser Glu Gln Gly Ala Ala His Phe Ala Ala 50 55 60Val His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Leu Gln65 70 75 80Ile Pro Ala His Lys Arg Leu Asp Ala Val Val Thr Ile Cys Tyr Glu 85 90 95Ala Gln Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys Val Ala His Ile 100 105 110Phe Ala Leu Gln Gln Gln Val Val Asn Leu Gln Ala Glu Leu Thr Tyr 115 120 125Leu Gln Ala His Leu Ala Thr Leu Glu Leu 
Pro Ala Pro Pro Pro Leu 130 135 140Pro Ala Pro Pro Gln Met Pro Met Pro Gly Pro Phe Ser Ile Ser Asp145 150 155 160Leu Pro Leu Ser Thr Ser Val Pro Thr Thr Val Asp Leu Ser Ala Leu 165 170 175Phe Asp Pro Pro Pro Pro Gln Trp Ala Thr Ala Gln Gln Pro His His 180 185 190His His Gln Gln Pro Pro Gln His His Gln Leu Arg Gln Pro Ala Pro 195 200 205Tyr Gly Ala Gly Ala Ser Val Arg Pro Gly Gly Gly Pro Gly Met Ala 210 215 220Glu Ser Ser Gly Gly Asp Glu Leu Gln Ser Leu Ala Arg Glu Leu Leu225 230 235 240Asp Arg His Arg Ser Gly Gly Val Lys Leu Glu His Pro Pro Pro His 245 250 255Ser Arg42977DNAZea maysmisc_featurecr1n.pk0028.h3afis 42gcaacttgca gtaggtgaca ggtgttaaca ggagctggct gagcttctct tgcttctgca 60agtagtagct gtagccgccc tgtaggcaga gagaggagag acgacgtacg tgagggagcg 120agcgagcgac gacagcatca ggcaggcgtt gacggccatg gcttcctccg gcagcggtgg 180cggctcgccg gggtccccgt gtggcgcctg caagttcctg cggcgcaagt gcgcggcgga 240gtgcgtgttc gctccccact tctgcgccga ggacggggcg gcgcagttcg cggccatcca 300caaggtgttc ggcgccagca acgcggccaa gctgctgcag caggtggccc ccgccgaccg 360gagcgaggcg gcggccaccg tcacctacga ggcgcaggcc aggctgcgcg accccatcta 420cggctgcgtc gcccacatct tcgcgctgca gcaacaggtg gcgagcttgc agatgcaggt 480gctgcaggcg aaggcgcagg tggcgcagac gatggcggcg gccgggccgc aggggggcag 540cagccctctc ctgcagcggt ggccgctgga gcctgagtcg ctgtcgacgc agagctccgg 600gtgctacagc gacatgtact gcggcttcgg cgaccaggag gaaggcagct acacgagatg 660aataatgaat ggatcattcg cgcgcgcgcg cgcacgcatc gacacagata ctttcttcta 720ttagcgccaa gagacaacaa caaccgaggg cctcaacttt cttgttggtt tgcagtgcgt 780tttgttcagt tcagcagcta gctctccggt ttggggagga gcttaatttc gatgagattt 840cgtgcgatcc ataaacttgt atttcttgcc ggttcgagct gtaaaatgga agtgcagctc 900atcatcatgt gtgtggttat taaacggagg cacaaatcga ggataatttc atattcccta 960aaaaaaaaaa aaaaaaa 97743167PRTZea maysMISC_FEATUREcr1n.pk0028.h3afis 43Met Ala Ser Ser Gly Ser Gly Gly Gly Ser Pro Gly Ser Pro Cys Gly1 5 10 15Ala Cys Lys Phe Leu Arg Arg Lys Cys Ala Ala Glu Cys Val Phe Ala 20 25 30Pro His Phe Cys Ala Glu Asp Gly Ala Ala Gln Phe Ala Ala Ile 
His 35 40 45Lys Val Phe Gly Ala Ser Asn Ala Ala Lys Leu Leu Gln Gln Val Ala 50 55 60Pro Ala Asp Arg Ser Glu Ala Ala Ala Thr Val Thr Tyr Glu Ala Gln65 70 75 80Ala Arg Leu Arg Asp Pro Ile Tyr Gly Cys Val Ala His Ile Phe Ala 85 90 95Leu Gln Gln Gln Val Ala Ser Leu Gln Met Gln Val Leu Gln Ala Lys 100 105 110Ala Gln Val Ala Gln Thr Met Ala Ala Ala Gly Pro Gln Gly Gly Ser 115 120 125Ser Pro Leu Leu Gln Arg Trp Pro Leu Glu Pro Glu Ser Leu Ser Thr 130 135 140Gln Ser Ser Gly Cys Tyr Ser Asp Met Tyr Cys Gly Phe Gly Asp Gln145 150 155 160Glu Glu Gly Ser Tyr Thr Arg 165441058DNAEuphorbia lagascaemisc_featureeel1c.pk003.b10fis 44gcttcttctt catattctgc gtctcataaa ccctaattat gctctcttct ctctccaaat 60tcgatccgaa atgagttcga cggtgcatcc tagcagcagc ggcagcagcg gcggagccgg 120aggaggagga agtggtggaa gtggcggagg gagtgggccg tgtggagcgt gtaaattttt 180gaggagaaaa tgtgtgccgg ggtgtatatt tgcgccgtac tttgattccg agcagggagc 240ggcgcatttt gcggcggtgc ataaggtttt tggtgcgagt aacgtttcga aacttcttct 300gcatattccg gtacataaac gccttgatgc ggtggttact atttgttatg aagctcaagc 360tcggcttcga gatcctgttt atggctgcgt tgctcatata ttcgctctgc aacaacaggt 420ggtgaactta caggcagagc tcacatattt gcaagcccat ttagcaacac tagagcttcc 480gtcaccaccg ccgcctcctc tcccaccaca aacactattg acaccaccac ctctatcaat 540atccgacctc ccatcatcct cttctgctcc cggttcatat gacttgcaat cgctttttga 600tccgatggca caaaattcat ggtcaatgca acaaaggcta atagatccac gccatcaatt 660cataggttcg actagtggtt catcgtcgtt aaccaccaca ggcagtggga gtggtgatct 720tcatacattg gcacgtgagc ttctccatag acatggttct ccgtcacatg gttcaatgcc 780atgtagcggc gctttatctt catctccgtc ttctatctca aaatgaaact gaccctattg 840atagaagttg ttgcaacata atttgtacta attttcaatg ggatgctagc cgaaagagct 900taagttttca tggtatttta gtttagagat ctagtgtttt aataactggt cactaatttt 960tttggccttc tgtttattat attattcatt ttctcttaaa aaaaaaaaaa aaaaaaaaaa 1020aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 105845251PRTEuphorbia lagascaeMISC_FEATUREeel1c.pk003.b10fis 45Met Ser Ser Thr Val His Pro Ser Ser Ser Gly Ser Ser Gly Gly Ala1 5 10 15Gly Gly Gly Gly Ser Gly 
Gly Ser Gly Gly Gly Ser Gly Pro Cys Gly 20 25 30Ala Cys Lys Phe Leu Arg Arg Lys Cys Val Pro Gly Cys Ile Phe Ala 35 40 45Pro Tyr Phe Asp Ser Glu Gln Gly Ala Ala His Phe Ala Ala Val His 50 55 60Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Leu His Ile Pro65 70 75 80Val His Lys Arg Leu Asp Ala Val Val Thr Ile Cys Tyr Glu Ala Gln 85 90 95Ala Arg Leu Arg Asp Pro Val Tyr Gly Cys Val Ala His Ile Phe Ala 100 105 110Leu Gln Gln Gln Val Val Asn Leu Gln Ala Glu Leu Thr Tyr Leu Gln 115 120 125Ala His Leu Ala Thr Leu Glu Leu Pro Ser Pro Pro Pro Pro Pro Leu 130 135 140Pro Pro Gln Thr Leu Leu Thr Pro Pro Pro Leu Ser Ile Ser Asp Leu145 150 155 160Pro Ser Ser Ser Ser Ala Pro Gly Ser Tyr Asp Leu Gln Ser Leu Phe 165 170 175Asp Pro Met Ala Gln Asn Ser Trp Ser Met Gln Gln Arg Leu Ile Asp 180 185 190Pro Arg His Gln Phe Ile Gly Ser Thr Ser Gly Ser Ser Ser Leu Thr 195 200 205Thr Thr Gly Ser Gly Ser Gly Asp Leu His Thr Leu Ala Arg Glu Leu 210 215 220Leu His Arg His Gly Ser Pro Ser His Gly Ser Met Pro Cys Ser Gly225 230 235 240Ala Leu Ser Ser Ser Pro Ser Ser Ile Ser Lys 245 25046484DNAAquilegia vulgarismisc_featureeav1c.pk003.c9 46gcganagtgc gttgttggnt gtattttcgc cccatatttt gattcagaac aaggtgcaac 60acactttgca gctgttcata aggtgtttgg tgcaagtaat gtgtccaagc ttcttttaca 120catacctgtt cataagcgtt tggatgcagt tgttactatt tgttatgaag ctcaagcacg 180tttaagagat ccagtttatg ggtgtgttgc taatatcttt gctcttcaac aacaggtggg 240aaatttacaa gctgagttat cctacttgca aacataccta gcatcattgg gngcttccaa 300ctccaccanc aagctccgcc aacaccaatg cttattacaa caacacctct ctccaaaagc 360aaattttcca tcaagcttcc actaagncan gcaaaacttt tgacttggtc aactcctttt 420cganccccca aaggaacaaa tcgggggaca cttcaacaaa agacaaatgg atttttaaac 480aaat 48447127PRTAquilegia vulgarisMISC_FEATUREeav1c.pk003.c9 47Arg Xaa Cys Val Val Gly Cys Ile Phe Ala Pro Tyr Phe Asp Ser Glu1 5 10 15Gln Gly Ala Thr His Phe Ala Ala Val His Lys Val Phe Gly Ala Ser 20 25 30Asn Val Ser Lys Leu Leu Leu His Ile Pro Val His Lys Arg Leu Asp 35 40 45Ala Val Val Thr Ile Cys Tyr Glu Ala Gln Ala 
Arg Leu Arg Asp Pro 50 55 60Val Tyr Gly Cys Val Ala Asn Ile Phe Ala Leu Gln Gln Gln Val Gly65 70 75 80Asn Leu Gln Ala Glu Leu Ser Tyr Leu Gln Thr Tyr Leu Ala Ser Leu 85 90 95Gly Ala Ser Asn Ser Thr Xaa Lys Leu Arg Gln His Gln Cys Leu Leu 100 105 110Gln Gln His Leu Ser Pro Lys Ala Asn Phe Pro Ser Ser Phe His 115 120 125481077DNACyamopsis tetragonolobamisc_featurelds3c.pk011.j11fis 48gacaacacat cttgctctca catgatacag gtagagagag aaagttgaaa ggatgatgag 60ttgtgttgca taaattgacg aggaaggagt agcgagggca aaaaaggaat taaatttaaa 120gattaagatt cagttaaggt ggaagatgag ttcgaaagct ggaaatggaa gtggaagtgg 180aagtggcagt ggaggcggga gcccttgtgg ggcttgtaag tttcttcgaa ggaagtgtgt 240ggcaggatgt gtgtttgctc catactttga ctcagagcaa ggagccactc attttgcagc 300tgtgcataag gtgtttggtg caagcaacgt ttctaaactt ctcctcaacc ttccgctcaa 360caaaaggctt gatgctgtta ttaccatttg ctatgaagct cagtcaagga tcagagatcc 420cgtcttcggc tgcgttgctc acatctttgc tctccagcaa caggtggtaa gtttacaaac 480agaagtgtcg tacttacaaa gccaccttgc tgcaatggag ttacctcagc caccacctcc 540tcctcctcca caggagacag tggtgcaggc accggtattc tcgattgcag acataccggc 600agcaacggta gcgggcatgc cggcgagcta tgacctgtct tcactttttg agccgacggg 660gcaacaaaat tcatgggggg gcggcggcat agacccgcgt caatttttgg cagttggccc 720atcatcaact actgatgctg atctccaagc aatggcacgt gacctttctg aaagacttgc 780ctctctacct ccacctgcac ccgcacctgc atttgctcct ctacctccac ttccacctgc 840acctgcacct gcacctgcac catcatgccc caatgcacct tcatctttat cactttctta 900attaatcatc atcatcatca tacatgcatc tatcttcaga cttttcttca cttttatttt 960tcatcgaaaa ctagtcaggg atcttcaatt tcgtacacgc tctaatttat gtgcgtgcgg 1020atatttcttt taattttcgc gcttctgcct ttcaaaaaaa aaaaaaaaaa aaaaaaa 107749251PRTCyamopsis tetragonolobaMISC_FEATURElds3c.pk011.j11fis 49Met Ser Ser Lys Ala Gly Asn Gly Ser Gly Ser Gly Ser Gly Ser Gly1 5 10 15Gly Gly Ser Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys Val 20 25 30Ala Gly Cys Val Phe Ala Pro Tyr Phe Asp Ser Glu Gln Gly Ala Thr 35 40 45His Phe Ala Ala Val His Lys Val Phe Gly
Ala Ser Asn Val Ser Lys 50 55 60Leu Leu Leu Asn Leu Pro Leu Asn Lys Arg Leu Asp Ala Val Ile Thr65 70 75 80Ile Cys Tyr Glu Ala Gln Ser Arg Ile Arg Asp Pro Val Phe Gly Cys 85 90 95Val Ala His Ile Phe Ala Leu Gln Gln Gln Val Val Ser Leu Gln Thr 100 105 110Glu Val Ser Tyr Leu Gln Ser His Leu Ala Ala Met Glu Leu Pro Gln 115 120 125Pro Pro Pro Pro Pro Pro Pro Gln Glu Thr Val Val Gln Ala Pro Val 130 135 140Phe Ser Ile Ala Asp Ile Pro Ala Ala Thr Val Ala Gly Met Pro Ala145 150 155 160Ser Tyr Asp Leu Ser Ser Leu Phe Glu Pro Thr Gly Gln Gln Asn Ser 165 170 175Trp Gly Gly Gly Gly Ile Asp Pro Arg Gln Phe Leu Ala Val Gly Pro 180 185 190Ser Ser Thr Thr Asp Ala Asp Leu Gln Ala Met Ala Arg Asp Leu Ser 195 200 205Glu Arg Leu Ala Ser Leu Pro Pro Pro Ala Pro Ala Pro Ala Phe Ala 210 215 220Pro Leu Pro Pro Leu Pro Pro Ala Pro Ala Pro Ala Pro Ala Pro Ser225 230 235 240Cys Pro Asn Ala Pro Ser Ser Leu Ser Leu Ser 245 250501847DNAGlycine maxmisc_featuresdr1f.pk005.d21fis 50gtacgaggac ccctcactct tccatactat agtcctcaga tttttagttt gcaccatttc 60ctagtgtgcc cgtgtgccta caaattttat tcacttcctc ccactcaggt cctttctttt 120caaacataaa atacatatct ttctctctct cggtaatgac tccaacttat tgatagtgtt 180ttatgttcag ataatgcccg atgactttgt catgcagctc caccgatttt gagaacgaca 240gcgacttccg tcccagccgt gccaggtgct gcctcagatt caggttatgc cgctcaattc 300gctgcgtata tcgcttgctg attacgtgca gctttccctt caggcgggat tcatacagcg 360gccagccatc cgtcatccat atcaccacgt caaagggtga cagcaggctc ataagacgcc 420ccagcgtcgc catagtgcgt tcaccgaata cgtgcgcaac aaccgtcttc cggagactgt 480catacgcgta aaacagccag cgctggcgcg atttagcccc gacatagccc cactgttcgt 540ccatttccgc gcagacgatg acgtcactgc ccggctgtat gcgcgaggtt accgactgcg 600gcctgagttt tttaagtgac gtaaaatcgt gttgaggcca acgcccataa tgcgggctgt 660tgcccggcat ccaacgccat tcatggccat atcaatgatt ttctggtgcg taccgggttg 720agaagcggtg taagtgaact gcagttgcca tgttttacgg cagtgagagc agagatagcg 780ctgatgtccg gcggtgcttt tgccgttacg caccaccccg tcagtagctg aacaggaggg 840acagctgata gaaacagaag ccactggagc acctcaaaaa caccatcata cactaaatca 
900gtaagttggc agcatcaccc tctctctctt tgtgtgttgg ttattagtac aattatacta 960ctactatact atggcttctg ctagtggaaa tggtgtctct aatggctctg gctctccttg 1020cggggcatgc aagttcctca gaagaaggtg tgcttctgat tgtatctttg caccttactt 1080ttgttcagaa cagggccctg ctagatttgc agccatacac aaggtatttg gtgccagcaa 1140cgtttcaaag ttgcttttgc acataccagc tcatgatcgt tgtgaagcgg ttgtcacaat 1200cacttatgag gctcaggctc gtattagaga ccctgtctat ggctgtgtct ctcacatttt 1260tgccttacaa caacaggtgg cacgcttgca ggcacagctg atgcaggtaa aagctcagct 1320gactcagaac ctagtggagt ccaggaacat agagaataat catcatttgc aagggaataa 1380taacaatgtt acaggacaac taatgaatca tccattttgt cccccttaca tgaatcctat 1440atctcctcaa agctcacttg aatcaattga tcacagcagc atcaatgatg gaatgagcat 1500gcaagatata caaagcagag aggatttcca aatccaagct aaagaaagac catacaacaa 1560caatgacttg ggggagctgc aagaactggc actaaggatg atgaggaact gattaattat 1620gactaggtta gcaccaaagc tagccttttc attttctaga agggtgttcc ttgatgttta 1680gggggggatg gtcttttgct agtgttgtat atataatgag tgtcatgaag aaaaactggt 1740cataactgat aataagccta aagtttaaac taagcattag gcttttttct gtttgtggat 1800tcaatccaaa agaaaattaa ttttttgcaa aaaaaaaaaa aaaaaaa 184751213PRTGlycine maxMISC_FEATUREsdr1f.pk005.d21fis 51Met Ala Ser Ala Ser Gly Asn Gly Val Ser Asn Gly Ser Gly Ser Pro1 5 10 15Cys Gly Ala Cys Lys Phe Leu Arg Arg Arg Cys Ala Ser Asp Cys Ile 20 25 30Phe Ala Pro Tyr Phe Cys Ser Glu Gln Gly Pro Ala Arg Phe Ala Ala 35 40 45Ile His Lys Val Phe Gly Ala Ser Asn Val Ser Lys Leu Leu Leu His 50 55 60Ile Pro Ala His Asp Arg Cys Glu Ala Val Val Thr Ile Thr Tyr Glu65 70 75 80Ala Gln Ala Arg Ile Arg Asp Pro Val Tyr Gly Cys Val Ser His Ile 85 90 95Phe Ala Leu Gln Gln Gln Val Ala Arg Leu Gln Ala Gln Leu Met Gln 100 105 110Val Lys Ala Gln Leu Thr Gln Asn Leu Val Glu Ser Arg Asn Ile Glu 115 120 125Asn Asn His His Leu Gln Gly Asn Asn Asn Asn Val Thr Gly Gln Leu 130 135 140Met Asn His Pro Phe Cys Pro Pro Tyr Met Asn Pro Ile Ser Pro Gln145 150 155 160Ser Ser Leu Glu Ser Ile Asp His Ser Ser Ile Asn Asp Gly Met Ser 165 170 175Met Gln Asp Ile Gln Ser Arg 
Glu Asp Phe Gln Ile Gln Ala Lys Glu 180 185 190Arg Pro Tyr Asn Asn Asn Asp Leu Gly Glu Leu Gln Glu Leu Ala Leu 195 200 205Arg Met Met Arg Asn 21052852DNATriticum aestivummisc_featurewdr1f.pk002.l10fis 52gcagagctcg atcataagct agctagtcag gccaggcggg cgatcggacg atcgggctat 60aatttcgact acggcgacga tggccggcgc gggcgtgacg acgacggggt cgccgtgcgg 120ggcgtgcaag ttcctgcggc gccggtgcgc ggcggagtgc gtgttcgcgc cctacttctg 180cgccgaggac ggcgcgtcgc agttcgcggc catccacaag gtgttcgggg ccagcaacgc 240ggccaagctg ctgcagcagg tggcccccgg cgaccggagc gaggcggccg ccacagtgac 300ctacgaggcg caggcccggc tgcgcgaccc cgtctacggc tgcgtcgccc acatcttcgc 360gctgcagcag caggttgtgg cgctgcaggc gcaggtggcg cacgccagga cgcaggcgca 420gctgggggcg gcgacggcga tgcacccgct gctccagcag cagctgcagc agcaggcgtg 480gcaggtggcc gccgccgcgg atcagcacga ccaccagtcc atgacgtcca cgcagagcag 540ctccggctgc tacagcggcg cccaccagcg ctccgacggc tcgtcgctgc acggcgccga 600gatgtactgc ggctacggcg agcaggagga aggcagctac taacccccag atgattgatt 660cactcgttcc tcgttcgttc ccctgagaaa cctgagacat gtgccatgaa aagtttctcc 720tttgcaacgc gcgttcgctt gagttggttc aactcttgcc ggtctcggct gtaaaggcat 780caatcggtct tgtgttgttt ggggctcaag acgaacccat aatttccaac tttgcaaaaa 840aaaaaaaaaa aa 85253187PRTTriticum aestivumMISC_FEATUREwdr1f.pk002.l10fis 53Met Ala Gly Ala Gly Val Thr Thr Thr Gly Ser Pro Cys Gly Ala Cys1 5 10 15Lys Phe Leu Arg Arg Arg Cys Ala Ala Glu Cys Val Phe Ala Pro Tyr 20 25 30Phe Cys Ala Glu Asp Gly Ala Ser Gln Phe Ala Ala Ile His Lys Val 35 40 45Phe Gly Ala Ser Asn Ala Ala Lys Leu Leu Gln Gln Val Ala Pro Gly 50 55 60Asp Arg Ser Glu Ala Ala Ala Thr Val Thr Tyr Glu Ala Gln Ala Arg65 70 75 80Leu Arg Asp Pro Val Tyr Gly Cys Val Ala His Ile Phe Ala Leu Gln 85 90 95Gln Gln Val Val Ala Leu Gln Ala Gln Val Ala His Ala Arg Thr Gln 100 105 110Ala Gln Leu Gly Ala Ala Thr Ala Met His Pro Leu Leu Gln Gln Gln 115 120 125Leu Gln Gln Gln Ala Trp Gln Val Ala Ala Ala Ala Asp Gln His Asp 130 135 140His Gln Ser Met Thr Ser Thr Gln Ser Ser Ser Gly Cys Tyr Ser Gly145 150 155 160Ala His Gln Arg Ser 
Asp Gly Ser Ser Leu His Gly Ala Glu Met Tyr 165 170 175Cys Gly Tyr Gly Glu Gln Glu Glu Gly Ser Tyr 180 18554262PRTArabidopsis thalianaMISC_FEATURENCBI General Identifier No. 17227164 54Met Ser Gly Gly Gly Asn Thr Ile Thr Ala Val Gly Gly Gly Gly Gly1 5 10 15Gly Cys Gly Gly Gly Gly Ser Ser Gly Gly Gly Gly Ser Ser Gly Gly 20 25 30Gly Gly Gly Gly Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Lys Cys 35 40 45Val Pro Gly Cys Ile Phe Ala Pro Tyr Phe Asp Ser Glu Gln Gly Ser 50 55 60Ala Tyr Phe Ala Ala Val His Lys Val Phe Gly Ala Ser Asn Val Ser65 70 75 80Lys Leu Leu Leu His Ile Pro Val His Arg Arg Ser Asp Ala Val Val 85 90 95Thr Ile Cys Tyr Glu Ala Gln Ala Arg Ile Arg Asp Pro Ile Tyr Gly 100 105 110Cys Val Ala His Ile Phe Ala Leu Gln Gln Gln Val Val Asn Leu Gln 115 120 125Ala Glu Val Ser Tyr Leu Gln Ala His Leu Ala Ser Leu Glu Leu Pro 130 135 140Gln Pro Gln Thr Arg Pro Gln Pro Met Pro Gln Pro Gln Pro Leu Phe145 150 155 160Phe Thr Pro Pro Pro Pro Leu Ala Ile Thr Asp Leu Pro Ala Ser Val 165 170 175Ser Pro Leu Pro Ser Thr Tyr Asp Leu Ala Ser Ile Phe Asp Gln Thr 180 185 190Thr Ser Ser Ser Ala Trp Ala Thr Gln Gln Arg Arg Phe Ile Asp Pro 195 200 205Arg His Gln Tyr Gly Val Ser Ser Ser Ser Ser Ser Val Ala Val Gly 210 215 220Leu Gly Gly Glu Asn Ser His Asp Leu Gln Ala Leu Ala His Glu Leu225 230 235 240Leu His Arg Gln Gly Ser Pro Pro Pro Ala Ala Thr Asp His Ser Pro 245 250 255Ser Arg Thr Met Ser Arg 2605522PRTArtificial SequenceC-block consensus 55Pro Cys Gly Ala Cys Lys Phe Leu Arg Arg Xaa Cys Xaa Xaa Xaa Cys1 5 10 15Xaa Phe Ala Pro Xaa Phe 205612PRTArtificial SequenceN-terminus of the GAS block 49 amino acids 56Phe Ala Ala Xaa His Lys Val Phe Gly Ala Ser Asn1 5 10579PRTArtificial SequenceC-terminus of GAS block 49 amino acids 57Arg Asp Pro Xaa Xaa Gly Cys Val Xaa1 55819PRTArtificial SequenceLeucine zipper motif 58Leu Gln Xaa Gln Xaa Xaa Xaa Leu Gln Xaa Xaa Xaa Xaa Xaa Xaa Xaa1 5 10 15Xaa Xaa Xaa5931DNAArtificial SequenceOligonucleotide primer cpi BbsI F 59gaagaccaat gagcgctggc 
ggcggcagca g 316034DNAArtificial SequenceOligonucleotide primer Cpi BsaI R 60ggtctcctca tcttgagtgt ggcggcgggt gctc 34611855DNAZea mays 61tatgcatcca acgcgttggg agctctccca tatggtcgac ctgcaggcgg ccgcgaattc 60actagtgatt gaagaccaat gagcgctggc ggcggcagca gcacgcttgg cggcgggggc 120ccgagcggca gcggcagcgg aggccctgga ggaagcggcg gcggcgggcc ttgcggcgcg 180tgcaagttcc tccggcgcaa gtgcgtcagc ggctgcatct tcgcgcccta cttcgactcg 240gagcagggcg cggcgcactt cgcggccgtg cacaaggtgt tcggcgccag caacgtgtcc 300aagctgctgc tccagatccc ggcgcacaag cgcctcgacg ccgtcgtcac catctgctac 360gaggcccagg cgcggctccg cgaccccgtc tacggctgcg tcgcccacat cttcgcgctc 420cagcagcagg tatatatatg agatgctagg atgatcgatt atctttgggt tgggttatat 480atatattcgg tccatccatc catgcaagat ccatccatgg gctcgctcgc tagtagcttg 540gcatgcatgc acgcatgcat ggatcgatca tggatagacg atgcctgcta gtagtaggcc 600ggcaggcgct accagcgatt attgctgcat gatttcccct tcgcattcgc gtgtggatct 660gggtcttttc tgaatccgcc gtctctgcga taagattctg ggagcggcca ggcgtgtttc 720tttctcgagg aaggcaagtc cgtccccgtc cccccctttc acgaggaaat caacactgac 780aagccaagca acggcagtgc aaaaagaagc acgccaagcg ctaatccggg aggcctgcct 840gcggcgatga atgatatgca cttctcatcc gtcgcatccg tgccgtcgat cgcattcctc 900ttctacccgt caaggcagca gccacgtaca ccatgcggat gcatgtgatg tgtgtgtgtg 960tgtgtgtgta tctccttcta tcttgggctc tgcacaaagc cttccaatgc cagtggcggt 1020gtggtgcttc ccgatctgat cgatcgatga ctcgatgagc tagccctcct tgaaaagaat 1080agaacgtcag cgccaatctc tagtattggt agcagcagta gccgtcctcc tcctaggtag 1140aagatccaaa cctgcattct tttttgtcaa tcgtgcgatg gacacctttc atttcgatcg 1200catatttgca tccgtgtgtg tgatgtgtct ttttttttct tccatattat atgcatctgt 1260atcgtgtaca aacaatgatg gcttttggtg gttccaagtt tgcacgtaac aatttactgt 1320tggatcgtcg acggtgcatg aatgtcacgt cattattccc caggtggtga atctccaggc 1380cgagctgacc tacctgcaag cacacctcgc cacgctcgag ctgccggccc cgcccccgct 1440gccggccccg ccgcagatgc ccatgccagg cccgttctcc atctcggacc tgccgttgtc 1500gaccagcgtc cccaccaccg tcgacctgtc cgcgctcttc gacccgccac caccgcagtg 1560ggcgacggcg cagcagccgc accaccacca tcaacagccg ccgcagcacc 
accagctccg 1620gcaaccggcg ccgtatggcg ctggcgcgtc cgtcaggccc ggcggcggcc ccggcatggc 1680agagagctca ggcggagacg agctgcagtc gctggcgagg gagctcctgg accgccaccg 1740gtccggcggc gtgaagctcg agcacccgcc gccacactca agatgaggag accaatcgaa 1800ttcccgcggc cgccatggcg gcccgggagc atgcgacgtc gggcccaatt cgccc 18556240DNAArtificial SequenceOligonucleotide primer RE2 pro Bst 2F 62caccatcatg tcagtgtgcc aatacgctaa acttagaaga 406330DNAArtificial SequenceOligonucleotide primer RE2 PRO R BbsI 63gaagacgctc attcttggaa tgagccccca 30644359DNAArtificialpGEMT-easy vector with portion of rice RE2 promoter 64gggcgaattg ggcccgacgt cgcatgctcc cggccgccat ggcggccgcg ggaattcgat 60tcaccatcat gtcagtgtgc caatacgcta aacttagaag atgcctccga taatcgtagc 120atttgtatta tttggggaat gaattaaawa atataaataa tgatatatta caattgataa 180tctatgttta gaaacttttg tcggttactc gctcaaattg tatggggtaa taaatcggtg 240aagtatattt ttatactgaa tggaaaagat aagctaccat tgatagcatt agcggttcta 300tttcgtatat tatcccgatt atccaccctc aatttgtgct aaaataagat ttttacatca 360tcctagtcaa tatttggggt taccctgtct gcattataat ttatttttgt gcttaactat 420aatatataca tacactataa tttatctaaa taaaagttct ggtatgattg aaaaaaacta 480acaattttgt gtgtggcgta ttgagtggaa gaatgtcatg ttaggatcac atgggagaga 540gtgcatgcga cgagatcatc cttgttggtc tgtgcaggtg gtgtgaaatg tgatcaatat 600atatggtggt gacagagaga gaaactaacc caaaaaaaca aaaaaagaga gatgagagcg 660aatggatgga tgcaattggc attaattttc ggtctttgct gttctccccc agccaggcca 720gtttgcttca cgcaatattc taaccctttg agaaagagaa gtgtacttgt tgccaaggcc 780aattgcaagc atttgccttg gctttaaagt ctcatcaata caacggcacc aaaaagaaaa 840cacatagaga tagaaaacca cctagtagct gatatacatt tatatatgac ctaaataaaa 900aaattccatt aatatrtata attccagcaa caacataaag aaataaaaat gcatttaaga 960aaacatagaa agaaataaaa ataaagtaaa taaagctagc taggcccaaa attggcagta 1020attaagtagg gactagtata gaaatatatg gatatataca ccagcctcca ccaatgggat 1080tgcaaacagc ctacttatca ctttgctgct gtatttacgc ttttgccctt cttccctcct 1140atatgtacag ccgccccmac ctcattccct ccattcttac tccacacaca cactctctct 1200ctctaccatt tgtgagaaag aaaatcgatt 
cagttctaga gagagaaaca aacaattttc 1260gctgtctatc tctctcttgc tactagtcgg tcgatcttga gttagtttta accctacaca 1320agccaaggta acaacatcta gcaggtagga gaagagtgct agagactagg tggtgggggc 1380tcattccaag aatgagcgtc ttcaatcact agtgaattcg cggccgcctg caggtcgacc 1440atatgggaga gctcccaacg cgttggatgc atagcttgag tattctatag tgtcacctaa 1500atagcttggc gtaatcatgg tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa 1560ttccacacaa catacgagcc ggaagcataa agtgtaaagc ctggggtgcc taatgagtga 1620gctaactcac attaattgcg ttgcgctcac tgcccgcttt ccagtcggga aacctgtcgt 1680gccagctgca ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt attgggcgct 1740cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat 1800cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga 1860acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt 1920ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt 1980ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 2040gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 2100gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 2160ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 2220actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 2280gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc 2340ctaactacgg ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta 2400ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg 2460gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt 2520tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg 2580tcatgagatt atcaaaaagg atcttcacct agatcctttt aaattaaaaa tgaagtttta 2640aatcaatcta aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg 2700aggcacctat ctcagcgatc tgtctatttc gttcatccat agttgcctga ctccccgtcg 2760tgtagataac tacgatacgg gagggcttac catctggccc cagtgctgca atgataccgc 2820gagacccacg ctcaccggct ccagatttat cagcaataaa ccagccagcc ggaagggccg 2880agcgcagaag tggtcctgca actttatccg cctccatcca gtctattaat tgttgccggg 
2940aagctagagt aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc attgctacag 3000gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt cagctccggt tcccaacgat 3060caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc 3120cgatcgttgt cagaagtaag ttggccgcag tgttatcact catggttatg gcagcactgc 3180ataattctct tactgtcatg ccatccgtaa gatgcttttc tgtgactggt gagtactcaa 3240ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac 3300gggataatac cgcgccacat agcagaactt taaaagtgct catcattgga aaacgttctt 3360cggggcgaaa actctcaagg atcttaccgc tgttgagatc cagttcgatg taacccactc 3420gtgcacccaa ctgatcttca gcatctttta ctttcaccag cgtttctggg tgagcaaaaa 3480caggaaggca aaatgccgca aaaaagggaa taagggcgac acggaaatgt tgaatactca 3540tactcttcct ttttcaatat tattgaagca tttatcaggg ttattgtctc atgagcggat 3600acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa 3660aagtgccacc tgatgcggtg tgaaataccg cacagatgcg taaggagaaa ataccgcatc 3720aggaaattgt aagcgttaat attttgttaa aattcgcgtt aaatttttgt taaatcagct 3780cattttttaa ccaataggcc gaaatcggca aaatccctta taaatcaaaa
gaatagaccg 3840agatagggtt gagtgttgtt ccagtttgga acaagagtcc actattaaag aacgtggact 3900ccaacgtcaa agggcgaaaa accgtctatc agggcgatgg cccactacgt gaaccatcac 3960cctaatcaag ttttttgggg tcgaggtgcc gtaaagcact aaatcggaac cctaaaggga 4020gcccccgatt tagagcttga cggggaaagc cggcgaacgt ggcgagaaag gaagggaaga 4080aagcgaaagg agcgggcgct agggcgctgg caagtgtagc ggtcacgctg cgcgtaacca 4140ccacacccgc cgcgcttaat gcgccgctac agggcgcgtc cattcgccat tcaggctgcg 4200caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 4260gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 4320taaaacgacg gccagtgaat tgtaatacga ctcactata 43596538DNAArtificial SequenceOligonucleotide primer RE2 TERM XbaI R 65gtaaaaggat ctagacacct ggctctagcc tccaagta 386650DNAArtificial SequenceOligonucleotide primer RE2 TERM EcoBspmI 66tggagcgaat tcacctgcca agatgatcct cctcactgtg tgtgatcatc 50673745DNAArtificialpGEM9z plasmid containing portion of rice RE2 terminator 67gggcgaattg ggcccgacgt cgcatgctcc tctagacacc tggctctagc ctccaagtac 60tcacatatgt tcacacttta ttcagttgcc accatacata tatgttcagt taccattaac 120cccaaaaatg ctcattttgc tacttgctag ctgttgtata tattatggca aaaaaaaaaa 180gctcatttgt catagagaga tcagaaagtc ccaaattaag actgactaca cttcacttca 240gatgttcagt tagaaccttg aaacagaaat caaacaaact tcaaacacaa ccgtaattaa 300tctgcgatga gacaaagatt ttttttttgt atattgaact aaattagacc aggttttgtt 360ttttgcaatt agaactgttg caatggaaca gtgatcaggg ctacagctaa gctaacaaga 420actcatcccc aagcttcctc ttttggagca aactcccaag cattataaaa tggtgcaagt 480gggcacccct tgcaccctag cccaagaaca aagatgatac gcgaaaaccg ggcaatgcct 540gttacctacc atctccatcc ccaacatcac catcatcgtc ttcctcgcca tcgtcgtcgt 600cgtcgttcga tttttcgcct caacgacatg gaggatttgg tgaaaaaaat ttcaccattg 660ggattaaatt ctcttcttgc ctcaaccaat gcattcatga atgattaaca cctgatcaat 720tagtccatga gctagctagc taagctgaat tgatgatcac acacagtgag gaggatcatc 780ttggcaggtg aattcggtac cccgggttcg aaatcgataa gcttggatcc ggagagctcc 840caacgcgttg gatgcatagc ttgagtattc tatagtgtca cctaaatagc ttggcgtaat 900catggtcata gctgtttcct 
gtgtgaaatt gttatccgct cacaattcca cacaacatac 960gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa ctcacattaa 1020ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat 1080gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc 1140tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg 1200cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg tgagcaaaag 1260gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc 1320gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag 1380gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga 1440ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc 1500atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg 1560tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt 1620ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca 1680gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca 1740ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag 1800ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca 1860agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg 1920ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa 1980aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta 2040tatatgagta aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag 2100cgatctgtct atttcgttca tccatagttg cctgactccc cgtcgtgtag ataactacga 2160tacgggaggg cttaccatct ggccccagtg ctgcaatgat accgcgagac ccacgctcac 2220cggctccaga tttatcagca ataaaccagc cagccggaag ggccgagcgc agaagtggtc 2280ctgcaacttt atccgcctcc atccagtcta ttaattgttg ccgggaagct agagtaagta 2340gttcgccagt taatagtttg cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac 2400gctcgtcgtt tggtatggct tcattcagct ccggttccca acgatcaagg cgagttacat 2460gatcccccat gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc gttgtcagaa 2520gtaagttggc cgcagtgtta tcactcatgg ttatggcagc actgcataat tctcttactg 2580tcatgccatc cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag 
tcattctgag 2640aatagtgtat gcggcgaccg agttgctctt gcccggcgtc aatacgggat aataccgcgc 2700cacatagcag aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct 2760caaggatctt accgctgttg agatccagtt cgatgtaacc cactcgtgca cccaactgat 2820cttcagcatc ttttactttc accagcgttt ctgggtgagc aaaaacagga aggcaaaatg 2880ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat actcatactc ttcctttttc 2940aatattattg aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta 3000tttagaaaaa taaacaaata ggggttccgc gcacatttcc ccgaaaagtg ccacctgtat 3060gcggtgtgaa ataccgcaca gatgcgtaag gagaaaatac cgcatcagga cgcgccctgt 3120agcggcgcat taagcgcggc gggtgtggtg gttacgcgca gcgtgaccgc tacacttgcc 3180agcgccctag cgcccgctcc tttcgctttc ttcccttcct ttctcgccac gttcgccggc 3240tttccccgtc aagctctaaa tcgggggctc cctttagggt tccgatttag tgctttacgg 3300cacctcgacc ccaaaaaact tgattagggt gatggttcac gtagtgggcc atcgccctga 3360tagacggttt ttcgcccttt gacgttggag tccacgttct ttaatagtgg actcttgttc 3420caaactggaa caacactcaa ccctatctcg gtctattctt ttgatttata agggattttg 3480ccgatttcgg cctattggtt aaaaaatgag ctgatttaac aaaaatttaa cgcgaatttt 3540aacaaaatat taacgcttac aatttccatt cgccattcag gctgcgcaac tgttgggaag 3600ggcgatcggt gcgggcctct tcgctattac gccagctggc gaaaggggga tgtgctgcaa 3660ggcgattaag ttgggtaacg ccagggtttt cccagtcacg acgttgtaaa acgacggcca 3720gtgaattgta atacgactca ctata 37456827DNAArtificial Sequenceoligonucleotide primer that may be used to identify RE2 homologs 68gcatcttcgc gccctacttc gactcgg 276935DNAArtificial Sequenceoligonucleotide primer that may be used to identify RE2 homologs 69gcacaaggtg ttcggcgcca gcaacgtgtc caagc 357034DNAArtificial Sequenceoligonucleotide primer that may be used to identify RE2 homologs 70ccgcgacccc gtctacggct gcgtcgccca cctc 3471448DNAArtificial SequenceProbe used to identify RE2 cDNA 71gcaggtctcc agtccgagct gaactacctg caaggtcacc tctcgacgat ggagctgccg 60tcgccgccgc cctacgtcgc cgggccgacc ctggcgccgc cacagccaca gccactgatg 120ccgatgaccg ccgccgccaa cttcaacttc tccgacctgc catcgtcgtc ggcggccaac 180attccggtca ccgccgacct gtccaccctc 
tttgacccac tgccggcggc gcagccgcag 240tggggactat accagcagca gcaacaccac caccagcagc tgcatcatca cccctatgac 300cggatgggcg acggctcgtc gagcagcaga ggcggcgacg acgatggcag cgacggcggc 360gacttgcaag cgctggcgag ggagcttctt gaccgccatg gacggtcgtc gtcgagctcc 420aagctggagc cgccacctca cacacagt 448721500DNAOryza sativa 72ggcacgaggt ctctctctac catttgtgag aaagaaaatc gattcagttc tagagagaga 60aacaaacaat tttcgctgtc tatctctctc ttgctactag tcggtcgatc ttgagttagt 120tttaacccta cacaagccaa ggtaacaaca tctagcaggt aggagaagag agctagagac 180taggtggtgg gggctcattc caagaatgag ctcgtcggtg gttgtgagcg cgagcggcag 240cggcagcggc ggcggaggag gaggaggagg tggcggcgcc ggaggtggag gaggaggtgg 300gccgtgcggg gcgtgcaagt tcttgcggcg gaagtgcgtg caggggtgca tcttcgcgcc 360ctacttcgac tcggaggccg gggcggcgca cttcgcggcg gtgcacaagg tgttcggcgc 420cagcaacgtg tccaagctgc tgcagcagat cccggcgcac cgccgcctcg acgccgtcgt 480caccatctgc tacgaggccc aggcccgcct ccgcgacccc gtctacggct gcgtcgccca 540catcttccac ctccaacacc aggtggcagg tctccagtcc gagctgaact acctgcaagg 600tcacctctcg acgatggagc tgccgtcgcc gccgccctac gtcgccgggc cgaccctggc 660gccgccacag ccacagccac tgatgccgat gaccgccgcc gccaacttca acttctccga 720cctgccatcg tcgtcggcgg ccaacattcc ggtcaccgcc gacctgtcca ccctctttga 780cccactgccg gcggcgcagc cgcagtgggg actataccag cagcagcaac accaccacca 840gcagctgcat catcacccct atgaccggat gggcgacggc tcgtcgagca gcagaggcgg 900cgacgacgat ggcagcgacg gcggcgactt gcaagcgctg gcgagggagc ttcttgaccg 960ccatggacgg tcgtcgtcga gctccaagct ggagccgcca cctcacacac agtgatcctc 1020ctcactgtgt gtgatcatca attcagctta gctagctagc tcatggacta attgatcagg 1080tgttaatcat tcatgaatgc attggttgag gcaagaagag aatttaatcc caatggtgaa 1140atttttttca ccaaatcctc catgtcgttg aggcgaaaaa tcgaacgacg acgacgacga 1200tggcgaggaa gacgatgatg gtgatgttgg ggatggagat ggtaggtaac aggcattgcc 1260cggttttcgc gtatcatctt tgttcttggg ctagggtgca aggggtgccc acttgcacca 1320ttttataatg cttgggagtt tgctccaaaa gaggaagctt ggggatgagt tcttgttagc 1380ttagctgtag ccctgatcac tgttccattg caacagttct aattgcaaaa aacaaaacct 1440ggtctaattt agttcaatat acaaaaaaaa 
aatctttgtc tcaaaaaaaa aaaaaaaaaa 1500




