Patent application title: Plants Having Enhanced Yield-Related Traits and Method for Making the Same

Inventors: Ana Isabel Sanz Molinero (Madrid, ES) Valerie Frankard (Waterloo, BE) Steven Vandenabeele (Oudenaarde, BE)
Assignees: BASF Plant Science Company GmbH
IPC8 Class: AC12N1582FI
USPC Class: 800290
Class name: Multicellular living organisms and unmodified parts thereof and related processes method of introducing a polynucleotide molecule into or rearrangement of genetic material within a plant or plant part the polynucleotide alters plant part growth (e.g., stem or tuber length, etc.)
Publication date: 2014-02-20
Patent application number: 20140053298

Abstract:

Provided is a method for enhancing yield-related traits in plants by modulating expression of a nucleic acid encoding a bZIP-like polypeptide or a BCAT4-like polypeptide in a plant. Also provided are plants having modulated expression of a nucleic acid encoding a bZIP-like polypeptide or a BCAT4-like polypeptide, which plants have enhanced yield-related traits compared with control plants. Also provided are constructs comprising bZIP-like polypeptide-encoding nucleic acids or BCAT4-like polypeptide-encoding nucleic acids, useful in enhancing yield-related traits in plants.

Claims:

1-50. (canceled)

51. A method for enhancing yield-related traits in a plant relative to a control plant, comprising: (i) modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, wherein said bZIP-like polypeptide comprises a Basic Leucine Zipper Domain (PF00170) and a G-box binding domain of the MFMR type (PF07777) and one or more of motifs 1 to 3 of SEQ ID NO: 119, SEQ ID NO: 120 and SEQ ID NO: 121; or (ii) modulating expression in a plant of a nucleic acid encoding a BCAT4-like polypeptide, wherein said BCAT4-like polypeptide comprises the signature sequence of SEQ ID NO: 216.

52. The method of claim 51, wherein: (i) said bZIP-like polypeptide comprises one or more of motifs 4 to 6 and/or one or more of motifs 7 to 12; or (ii) said BCAT4-like polypeptide comprises one or more of Motif 13 of SEQ ID NO: 213, Motif 14 of SEQ ID NO: 214, and Motif 15 of SEQ ID NO: 215.

53. The method of claim 51, wherein said BCAT4-like polypeptide is encoded by a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of: (i) the nucleotide sequence of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209 or 211, or the complement thereof; (ii) a nucleotide sequence encoding a polypeptide comprising the amino acid sequence of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210 or 212; (iii) a nucleotide sequence having at least 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity with the nucleotide sequence of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209 or 211, and conferring enhanced yield-related traits in a plant relative to a control plant; (iv) a nucleotide sequence which hybridizes with the nucleotide sequence of (i) or (ii) under stringent hybridization conditions and confers enhanced yield-related traits in a plant relative to a control plant; and (v) a nucleotide sequence encoding a polypeptide having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to the amino acid sequence of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210 or 212, and conferring enhanced yield-related traits in a plant relative to a control plant.

54. The method of claim 51, wherein said modulated expression is effected by: (i) introducing and expressing in a plant said nucleic acid encoding said bZIP-like polypeptide; or (ii) introducing and expressing in a plant said nucleic acid encoding said BCAT4-like polypeptide.

55. The method of claim 51, wherein said enhanced yield-related traits comprise increased yield relative to a control plant, or increased biomass and/or increased seed yield relative to a control plant.

56. The method of claim 51, wherein said nucleic acid encodes a bZIP-like polypeptide, and wherein said enhanced yield-related traits are obtained without effect on flowering time of the plant.

57. The method of claim 51, wherein said nucleic acid encodes a bZIP-like polypeptide, and wherein said enhanced yield-related traits are obtained under non-stress conditions.

58. The method of claim 51, wherein: (i) said nucleic acid encodes a bZIP-like polypeptide, and wherein said enhanced yield-related traits are obtained under conditions of drought stress or nitrogen deficiency; or (ii) said nucleic acid encodes a BCAT4-like polypeptide, and wherein said enhanced yield-related traits are obtained under conditions of drought stress.

59. The method of claim 51, wherein: (i) said nucleic acid encoding a bZIP-like polypeptide is of plant origin or from a dicotyledonous plant; or (ii) said nucleic acid encoding a BCAT4-like polypeptide is of plant origin, from a dicotyledonous plant, from a plant of the family Salicaceae, from a plant of the genus Populus, or from a Populus trichocarpa plant.

60. The method of claim 51, wherein: (i) said nucleic acid encoding a bZIP-like polypeptide encodes any one of the polypeptides listed in Table A1, or is a portion of such a nucleic acid, or a nucleic acid capable of hybridizing with such a nucleic acid; or (ii) said nucleic acid encoding a BCAT4-like polypeptide encodes any one of the polypeptides listed in Table A2, or is a portion of such a nucleic acid, or a nucleic acid capable of hybridizing with such a nucleic acid.

61. The method of claim 51, wherein: (i) said nucleic acid encoding a bZIP-like polypeptide encodes an orthologue or paralogue of any of the polypeptides given in Table A1; or (ii) said nucleic acid encoding a BCAT4-like polypeptide encodes an orthologue or paralogue of any of the polypeptides given in Table A2.

62. The method of claim 51, wherein: (i) said nucleic acid encoding a bZIP-like polypeptide encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4; or (ii) said nucleic acid encoding a BCAT4-like polypeptide encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 142.

63. The method of claim 51, wherein said nucleic acid encoding a bZIP-like polypeptide or said nucleic acid encoding a BCAT4-like polypeptide is operably linked to a constitutive promoter, a medium strength constitutive promoter, a plant promoter, a GOS2 promoter, or a GOS2 promoter from rice.

64. A plant, plant cell or plant part thereof, or a seed or progeny of said plant, obtained by the method of claim 51, wherein said plant, plant cell or plant part, or said seed or progeny, comprises a recombinant nucleic acid encoding said bZIP-like polypeptide or a recombinant nucleic acid encoding a BCAT4-like polypeptide.

65. A construct comprising: (i) a nucleic acid encoding a bZIP-like polypeptide or a nucleic acid encoding a BCAT4-like polypeptide as defined in claim 51; (ii) one or more control sequences capable of driving expression of the nucleic acid of (i); and optionally (iii) a transcription termination sequence.

66. The construct of claim 65, wherein one of said control sequences is a constitutive promoter, a medium strength constitutive promoter, a plant promoter, a GOS2 promoter, or a GOS2 promoter from rice.

67. A plant, plant cell or plant part thereof, comprising the construct of claim 65.

68. A method for the production of a transgenic plant having enhanced yield-related traits relative to a control plant, preferably increased yield, increased seed yield and/or increased biomass relative to a control plant, comprising: (i) introducing and expressing in a plant cell or plant a nucleic acid encoding a bZIP-like polypeptide or a nucleic acid encoding a BCAT4-like polypeptide as defined in claim 51; and (ii) cultivating said plant cell or plant under conditions promoting plant growth and development.

69. A transgenic plant having enhanced yield-related traits relative to a control plant, preferably increased yield, increased seed yield and/or increased biomass relative to a control plant, resulting from modulated expression of a nucleic acid encoding a bZIP-like polypeptide or a nucleic acid encoding a BCAT4-like polypeptide as defined in claim 51, or a transgenic plant cell derived from said transgenic plant.

70. The transgenic plant of claim 69, or a transgenic plant cell derived therefrom, wherein said plant is a crop plant, a monocotyledonous plant or a cereal, or wherein said plant is beet, sugarbeet, alfalfa, sugarcane, rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, einkorn, teff, milo or oats.

71. Harvestable parts of the transgenic plant of claim 69, wherein said harvestable parts are preferably shoot biomass and/or seeds.

72. Products derived from the transgenic plant of claim 69 and/or from harvestable parts of said plant.

73. A method for manufacturing a product, comprising growing the transgenic plant of claim 69 and producing a product from or by said plant or part thereof, including seeds.

Description:

BACKGROUND

[0001] The present invention relates generally to the field of molecular biology and concerns a method for enhancing yield-related traits in plants by modulating expression in a plant of a nucleic acid encoding a bZIP-like (basic Leucine Zipper) polypeptide, or a BCAT4-like (Branched-Chain AminoTransferase 4-like) polypeptide. The present invention also concerns plants having modulated expression of a nucleic acid encoding a bZIP-like polypeptide, which plants have enhanced yield-related traits relative to corresponding wild type plants or other control plants. The invention also provides constructs useful in the methods of the invention.

[0002] The ever-increasing world population and the dwindling supply of arable land available for agriculture fuel research towards increasing the efficiency of agriculture. Conventional means for crop and horticultural improvements utilise selective breeding techniques to identify plants having desirable characteristics. However, such selective breeding techniques have several drawbacks, namely that these techniques are typically labour intensive and result in plants that often contain heterogeneous genetic components that may not always result in the desirable trait being passed on from parent plants. Advances in molecular biology have allowed mankind to modify the germplasm of animals and plants. Genetic engineering of plants entails the isolation and manipulation of genetic material (typically in the form of DNA or RNA) and the subsequent introduction of that genetic material into a plant. Such technology has the capacity to deliver crops or plants having various improved economic, agronomic or horticultural traits.

[0003] A trait of particular economic interest is increased yield. Yield is normally defined as the measurable produce of economic value from a crop. This may be defined in terms of quantity and/or quality. Yield is directly dependent on several factors, for example, the number and size of the organs, plant architecture (for example, the number of branches), seed production, leaf senescence and more. Root development, nutrient uptake, stress tolerance and early vigour may also be important factors in determining yield. Optimizing the abovementioned factors may therefore contribute to increasing crop yield.

[0004] Seed yield is a particularly important trait, since the seeds of many plants are important for human and animal nutrition. Crops such as corn, rice, wheat, canola and soybean account for over half the total human caloric intake, whether through direct consumption of the seeds themselves or through consumption of meat products raised on processed seeds. They are also a source of sugars, oils and many kinds of metabolites used in industrial processes. Seeds contain an embryo (the source of new shoots and roots) and an endosperm (the source of nutrients for embryo growth during germination and during early growth of seedlings). The development of a seed involves many genes, and requires the transfer of metabolites from the roots, leaves and stems into the growing seed. The endosperm, in particular, assimilates the metabolic precursors of carbohydrates, oils and proteins and synthesizes them into storage macromolecules to fill out the grain.

[0005] Another important trait for many crops is early vigour. Improving early vigour is an important objective of modern rice breeding programs in both temperate and tropical rice cultivars. Long roots are important for proper soil anchorage in water-seeded rice. Where rice is sown directly into flooded fields, and where plants must emerge rapidly through water, longer shoots are associated with vigour. Where drill-seeding is practiced, longer mesocotyls and coleoptiles are important for good seedling emergence. The ability to engineer early vigour into plants would be of great importance in agriculture. For example, poor early vigour has been a limitation to the introduction of maize (Zea mays L.) hybrids based on Corn Belt germplasm in the European Atlantic.

[0006] A further important trait is that of improved abiotic stress tolerance. Abiotic stress is a primary cause of crop loss worldwide, reducing average yields for most major crop plants by more than 50% (Wang et al., Planta 218, 1-14, 2003). Abiotic stresses may be caused by drought, salinity, extremes of temperature, chemical toxicity and oxidative stress. The ability to improve plant tolerance to abiotic stress would be of great economic advantage to farmers worldwide and would allow for the cultivation of crops during adverse conditions and in territories where cultivation of crops may not otherwise be possible.

[0007] Crop yield may therefore be increased by optimising one of the above-mentioned factors.

[0008] With respect to bZIP-like polypeptides, bZIP proteins are a group of transcription factors containing a conserved domain (ZIP) that participates in the formation of homo- or heterodimers with other bZIP proteins, and a DNA binding domain. These transcription factors form a large family and are present across fungi, animals and plants.

[0009] In plants, up to 13 groups of bZIP transcription factors have been identified using evolutionary studies on nucleotide sequences (Correa et al. 2008).

[0010] A subgroup within the group G of bZIP transcription factors contains a G-BOX motif binding domain. The genes of this group are mostly related to morphogenic responses to light maturation and LEA genes repression, ABA regulation, Adh activation and photomorphogenesis (Correa et al. 2008).

[0011] With respect to BCAT4-like polypeptides, BCAT or Branched Chain AminoTransferase, sometimes also named Branched Chain AminoTransaminase, catalyses the last step of synthesis, the transamination of the branched-chain amino acids leucine, isoleucine and valine to their respective alpha-keto acids and/or the initial step of degradation of these amino acids.

[0012] Diebold et al. (Plant Physiol, 2002, 129, 540-550) have identified seven putative BCAT genes in Arabidopsis. Maloney et al. (Plant Physiol., 2010, 153, 925-936) identified six BCAT genes from the cultivated tomato Solanum lycopersicum.

[0013] Depending on the end use, the modification of certain yield traits may be favoured over others. For example for applications such as forage or wood production, or bio-fuel resource, an increase in the vegetative parts of a plant may be desirable, and for applications such as flour, starch or oil production, an increase in seed parameters may be particularly desirable. Even amongst the seed parameters, some may be favoured over others, depending on the application. Various mechanisms may contribute to increasing seed yield, whether that is in the form of increased seed size or increased seed number.

[0014] It has now been found that various yield-related traits may be improved in plants by modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.

DETAILED DESCRIPTION OF THE INVENTION

[0015] With respect to bZIP-like polypeptides, the present invention shows that modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide gives plants having enhanced yield-related traits relative to control plants.

[0016] With respect to BCAT4-like polypeptides, the present invention shows that modulating expression of a nucleic acid encoding a BCAT4-like polypeptide as defined herein gives plants having enhanced yield-related traits, in particular increased yield or, more particularly, increased seed yield relative to control plants.

[0017] According to a first embodiment, the present invention provides a method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, and optionally selecting for plants having enhanced yield-related traits. According to another embodiment, the present invention provides a method for producing plants having enhanced yield-related traits relative to control plants, wherein said method comprises the steps of modulating expression in said plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as described herein and optionally selecting for plants having enhanced yield-related traits.

[0018] A preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, is by introducing and expressing in a plant a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.

[0019] Any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein. Any reference hereinafter to a "nucleic acid useful in the methods of the invention" is taken to mean a nucleic acid capable of encoding such a bZIP-like polypeptide, or a BCAT4-like polypeptide. The nucleic acid to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid encoding the type of protein which will now be described, hereafter also named "bZIP-like nucleic acid", or "BCAT4-like nucleic acid", or "bZIP-like gene", or "BCAT4-like gene".

[0020] A "bZIP-like polypeptide" as defined herein refers to any polypeptide comprising a Basic Leucine Zipper domain (PF00170, SM00338, PS50217) and a G-box binding domain of the MFMR type (PF07777).

[0021] The Basic Leucine Zipper Domain (bZIP domain, PF00170) is found in many DNA binding eukaryotic proteins. One part of the domain contains a region that mediates sequence specific DNA binding properties and the leucine zipper that is required for the dimerisation of two DNA binding regions. The DNA binding region comprises a number of basic amino acids such as arginine and lysine.

[0022] The G-box binding protein MFMR domain (PF07777) is found at the N-terminus of the PF00170 bZIP domain. It typically ranges in length between 150 and 200 amino acids, but may be shorter, such as in SEQ ID NO: 2. The N-terminal half is rather rich in proline residues and has been termed the PRD (proline rich domain) whereas the C-terminal half is more polar and has been named the MFMR (multifunctional mosaic region). It has been suggested that some of these motifs may be involved in mediating protein-protein interactions.

[0023] Additionally or alternatively, the bZIP-like polypeptide comprises one or more of the following motif(s):

TABLE-US-00001
Motif 1 (SEQ ID NO: 119): ELKR[EQ][KR]RKQSNRESARRSRLRKQAE[CTA]EEL
Motif 2 (SEQ ID NO: 120): [AQ][RH][KR]VE[SAV]L[TS][HAT]ENx[SAT]L[RKQ][SD]E[LI][QNS][RQ][LF]
Motif 3 (SEQ ID NO: 121): [HPL][AN][PI][HPG][PM][YD][MLV]W

[0024] In one embodiment the bZIP-like polypeptide comprises additionally or alternatively one or more of motif(s):

TABLE-US-00002
Motif 4 (SEQ ID NO: 122): RELKRQKRKQSNRESARRSRLRKQAECEELQ
Motif 5 (SEQ ID NO: 123): [AG]TNLN[IM]GMD[LV]WN
Motif 6 (SEQ ID NO: 124): MPPYGTPVPYPA[LIM]YPP

[0025] In another embodiment, the bZIP-like polypeptide comprises additionally or alternatively one or more of motif(s):

TABLE-US-00003
Motif 7 (SEQ ID NO: 125): NE[RL]ELKRE[RK]RKQSNRESARRSRLRKQAE[TA]EELA[RH][KR]V[EDQ][SAV]LT[AT]EN[LM][TAS]L[KRQ]
Motif 8 (SEQ ID NO: 126): [IA][ED][TS]P[TA]KSSGNTD[RQ]GL[MLV][NK]KLK[GE]FDGL[AT]MSIGN
Motif 9 (SEQ ID NO: 127): [PL]PQ[PH]MMPPYG[APT]PY
Motif 10 (SEQ ID NO: 128): NSGAKL[HR]QLLD[AT][SN]PR[AT]DAVAAG
Motif 11 (SEQ ID NO: 129): EI[NS][RKQ][LF]TE[NK]SEK[LM][KR][LM][EQ]N[AS][ATK]L[MRT][EV][KH]
Preferably, motif 7 extends to Motif 12 (SEQ ID NO: 130): WLQNE[RL]ELKRE[RK]RKQSNRESARRSRLRKQAE[AT]EELA[RIH][KR]V[EQ][VSA]LT[AST]EN[ML][ATS]L[QKR].

[0026] The term "bZIP-like" or "bZIP-like polypeptide" as used herein also intends to include homologues as defined hereunder of "bZIP-like polypeptide".

[0027] Motifs 1 to 12 were derived using the MEME algorithm (Bailey and Elkan, Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, Calif., 1994). At each position within a MEME motif, the residues are shown that are present in the query set of sequences with a frequency higher than 0.2. Residues within square brackets represent alternatives.

[0028] More preferably, the bZIP-like polypeptide comprises in increasing order of preference, 1, 2, 3 or more of motifs 1 to 12, preferably 4 or more, more preferably 5 or more of motifs 1 to 12, most preferably 6 or more of motifs 1 to 12.

[0029] Additionally or alternatively, the homologue of a bZIP-like protein has in increasing order of preference at least 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 2, provided that the homologous protein comprises any one or more of the conserved motifs as outlined above. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters and preferably with sequences of mature proteins (i.e. without taking into account secretion signals or transit peptides).

[0030] In one embodiment the sequence identity level is determined by comparison of the polypeptide sequences over the entire length of the sequence of SEQ ID NO: 2 or SEQ ID NO: 4.
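By way of illustration only, overall sequence identity from a global alignment may be sketched in Python as follows. This is a simplified Needleman-Wunsch implementation and is not the GAP program referred to above: the scoring scheme (match +1, mismatch -1, linear gap penalty -1), the choice to divide by the alignment length including gap columns, and the function name `global_identity` are all assumptions. A substitution matrix and different gap penalties would generally give different alignments and therefore different identity values.

```python
def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity from a Needleman-Wunsch global alignment.

    Assumptions: simple match/mismatch scores and a linear gap penalty;
    identity = identical aligned pairs / alignment length (gaps included).
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell, counting identities and columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1          # gap in b
        else:
            j -= 1          # gap in a
        columns += 1
    return 100.0 * identical / columns if columns else 0.0


# Hypothetical short fragments, used only to exercise the function.
print(global_identity("ELKRQ", "ELKRE"))
print(global_identity("ACDE", "ACE"))
```

The quadratic table makes this sketch suitable only for short sequences; for full-length polypeptides an established alignment package would normally be used.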

[0031] Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered. Preferably the motifs in a bZIP-like polypeptide have, in increasing order of preference, at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to any one or more of the motifs represented by SEQ ID NO: 119 to SEQ ID NO: 130 (Motifs 1 to 12).
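A motif-level identity of the kind described above may be illustrated with the following Python sketch. It slides an ungapped window along a candidate sequence and reports the best percentage of positions that agree with the motif. The function names, the rule that a position counts as identical when the residue is any one of the bracketed alternatives, the treatment of `x` as matching any residue, and the test sequences are assumptions for illustration; the application does not prescribe this particular calculation.

```python
import re

MOTIF_5 = "[AG]TNLN[IM]GMD[LV]WN"   # SEQ ID NO: 123, copied from the table above


def parse_motif(motif: str) -> list[set[str]]:
    """Split a bracket-notation motif into one set of allowed residues per position."""
    tokens = re.findall(r"\[([A-Z]+)\]|([A-Za-z])", motif)
    return [set(group) if group else {single} for group, single in tokens]


def best_motif_identity(motif: str, sequence: str) -> tuple[float, int]:
    """Return (best percent identity, 1-based start) over all ungapped windows.

    Returns (0.0, 0) when the sequence is shorter than the motif.
    """
    positions = parse_motif(motif)
    width = len(positions)
    best = (0.0, 0)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        hits = sum(1 for allowed, residue in zip(positions, window)
                   if "x" in allowed or residue in allowed)
        pct = 100.0 * hits / width
        if pct > best[0]:
            best = (pct, start + 1)
    return best


# Hypothetical sequences: one exact occurrence, one with a single mismatch.
print(best_motif_identity(MOTIF_5, "AAGTNLNMGMDVWNKK"))
pct, start = best_motif_identity(MOTIF_5, "GTNLNMGMDVWK")
print(round(pct, 1), start, pct >= 70.0)
```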

[0032] In other words, in another embodiment a method is provided wherein said bZIP-like polypeptide comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid E186 up to amino acid E240 in SEQ ID NO: 2 (which corresponds to the conserved domain starting with amino acid E262 up to amino acid N322 in SEQ ID NO: 4). Additionally or alternatively, the bZIP-like polypeptide useful in the methods of the present invention comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid M1 up to amino acid N99 in SEQ ID NO: 2 (which corresponds to the conserved domain starting with amino acid M1 up to amino acid R175 in SEQ ID NO: 4). Preferably said bZIP-like polypeptide comprises both conserved domains.

[0033] A "BCAT4-like polypeptide" as defined herein refers to any polypeptide comprising the signature sequence represented by AN(KREN)(RKH)W(VIT)PP(PTAQWFRH)GKG (SEQ ID NO: 216).
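As a non-limiting illustration, the signature sequence may be tested for in a candidate polypeptide with a short Python sketch. It assumes that each parenthesised group, such as (KREN), lists the alternative residues permitted at a single position, in the same way as the square brackets used for the motifs; this reading, and the function names `signature_to_regex` and `has_signature`, are assumptions of the sketch.

```python
import re

SIGNATURE = "AN(KREN)(RKH)W(VIT)PP(PTAQWFRH)GKG"  # SEQ ID NO: 216


def signature_to_regex(signature: str) -> re.Pattern:
    """Turn '(KREN)'-style alternatives into regex character classes '[KREN]'."""
    return re.compile(signature.replace("(", "[").replace(")", "]"))


def has_signature(sequence: str) -> bool:
    """True if the signature occurs anywhere in the sequence."""
    return signature_to_regex(SIGNATURE).search(sequence) is not None


# One concrete instance of Motif 13 (SEQ ID NO: 213), taking P at the [PT] position.
print(has_signature("ANKRWVPPPGKGSLYIRP"))
```

Under this reading, both concrete instances of Motif 13 (with P or with T at the bracketed position) contain the signature, since both residues appear in the (PTAQWFRH) group.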

[0034] The term "BCAT4-like" or "BCAT4-like polypeptide" as used herein also intends to include homologues as defined hereunder of "BCAT4-like polypeptide".

[0035] Preferably, the BCAT4-like polypeptide comprises one or more of the following motifs:

[0036] (i) Motif 13 represented by ANKRWVPP[PT]GKGSLYIRP (SEQ ID NO: 213),

[0037] (ii) Motif 14 represented by RP[ED]ENA[ML]RM[IQK]xGA[ED]R[ML]CM (SEQ ID NO: 214),

[0038] (iii) Motif 15 represented by LNYGQGLFEGLKAYR[KT]ED (SEQ ID NO: 215).

[0039] Motifs 13 to 15 were derived using the MEME algorithm (Bailey and Elkan, Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, Calif., 1994). At each position within a MEME motif, the residues are shown that are present in the query set of sequences with a frequency higher than 0.2. Residues within square brackets represent alternatives.

[0040] More preferably, the BCAT4-like polypeptide comprises in increasing order of preference, at least 2, or all 3 motifs.

[0041] Additionally or alternatively, the homologue of a BCAT4-like protein has in increasing order of preference at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 142, provided that the homologous protein comprises any one or more of the conserved motifs as outlined above. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters and preferably with sequences of mature proteins (i.e. without taking into account secretion signals or transit peptides).

[0042] In one embodiment the sequence identity level is determined by comparison of the polypeptide sequences over the entire length of the sequence of SEQ ID NO: 142.

[0043] Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered. Preferably the motifs in a BCAT4-like polypeptide have, in increasing order of preference, at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to any one or more of the motifs represented by SEQ ID NO: 213 to SEQ ID NO: 215 (Motifs 13 to 15).

[0044] In other words, in another embodiment a method is provided wherein said BCAT4-like polypeptide comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid 185 up to amino acid 202 in SEQ ID NO: 142, or to the conserved domain starting with amino acid 150 up to amino acid 167 in SEQ ID NO: 142, or to the conserved domain starting with amino acid 126 up to amino acid 143 in SEQ ID NO: 142.
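The residue ranges quoted above can be extracted from a sequence with a small helper, sketched here in Python. The sketch assumes that the residue numbers are 1-based and inclusive, as is conventional for protein sequences, whereas Python slices are 0-based and half-open. The function name `extract_domain` and the placeholder sequence are hypothetical; the actual sequence of SEQ ID NO: 142 is not reproduced here.

```python
def extract_domain(sequence: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(
            f"range {start}-{end} is outside a sequence of length {len(sequence)}"
        )
    return sequence[start - 1:end]


# Residue ranges quoted in the paragraph above for SEQ ID NO: 142.
BCAT4_DOMAIN_RANGES = [(185, 202), (150, 167), (126, 143)]

# Placeholder of arbitrary length standing in for a real polypeptide sequence.
placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 11)[:210]

for start, end in BCAT4_DOMAIN_RANGES:
    segment = extract_domain(placeholder, start, end)
    print(start, end, len(segment), segment)
```

Each of the three ranges spans 18 residues, which is also the length of each of Motifs 13 to 15 as written above.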

[0045] The terms "domain", "signature" and "motif" are defined in the "definitions" section herein.

[0046] With respect to bZIP-like polypeptides, the bZIP-like polypeptide sequence, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), preferably clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides.

[0047] Furthermore, bZIP-like polypeptides (at least in their native form) typically have DNA binding activity. More particularly, bZIP-like polypeptides typically bind to the G-box sequence (ABRE oligonucleotide in Example 6, SEQ ID NO: 139). Tools and techniques for measuring DNA binding activity are well known in the art, see for example the DNA binding assay in Liao et al. (Planta, 228, 225-240, 2008). Further details are provided in Example 6.

[0048] In addition, bZIP-like polypeptides, when expressed in rice according to the methods of the present invention as outlined in Examples 7 and 8, give plants having increased yield related traits, in particular increased seed yield when grown under conditions of nitrogen limitation or drought stress.

[0049] In one embodiment of the present invention the function of the nucleic acid sequences of the invention is to confer information for synthesis of the bZIP-like polypeptide that increases yield or yield-related traits, when such a nucleic acid sequence of the invention is transcribed and translated in a living plant cell.

[0050] With respect to BCAT4-like polypeptides, the BCAT4-like polypeptide is preferably encoded by a nucleic acid molecule comprising a nucleic acid molecule selected from the group consisting of:

[0051] (i) a nucleic acid represented by any one of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211;

[0052] (ii) the complement of a nucleic acid represented by any one of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211;

[0053] (iii) a nucleic acid encoding the polypeptide as represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212, preferably as a result of the degeneracy of the genetic code, said isolated nucleic acid being deducible from a polypeptide sequence as represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212, and further preferably conferring enhanced yield-related traits relative to control plants;

[0054] (iv) a nucleic acid having, in increasing order of preference at least 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity with any of the nucleic acid sequences of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211, and further preferably conferring enhanced yield-related traits relative to control plants;

[0055] (v) a first nucleic acid molecule which hybridizes with a second nucleic acid molecule of (i) to (iv) under stringent hybridization conditions and preferably confers enhanced yield-related traits relative to control plants;

[0056] (vi) a nucleic acid encoding said polypeptide having, in increasing order of preference, at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212, and preferably conferring enhanced yield-related traits relative to control plants; or

[0057] (vii) a nucleic acid comprising any combination(s) of features of (i) to (vi) above.
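
By way of illustration only, the identity thresholds recited in (iv) and (vi) can be checked computationally once two sequences have been aligned. The following Python sketch is not part of the disclosed method. It assumes that the two sequences are already aligned (gaps written as "-") and that identity is counted over the columns in which neither sequence has a gap, which is only one of several conventions in use. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are rows of a pairwise alignment of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, minimum_percent: float) -> bool:
    """True if the pair reaches the stated minimum identity, e.g. 37 or 50."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


if __name__ == "__main__":
    # Toy nucleotide rows, not sequences from the sequence listing.
    print(percent_identity("ACGT", "ACGA"))
    print(meets_threshold("ACGT", "ACGA", 37))
```

A different choice of denominator (for example, the full alignment length or the length of the shorter sequence) would give different percentages for gapped alignments, so the convention should be fixed before a threshold is applied.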

[0058] Preferably, the polypeptide sequence, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group.

[0059] Furthermore, BCAT4-like polypeptides, at least in their native form, typically have transamination activity.

[0060] In addition, BCAT4-like polypeptides, when expressed in rice according to the methods of the present invention as outlined in Examples 7 and 8, give plants having increased yield-related traits, in particular increased seed yield, more in particular an increase in total weight of the seeds, an increase in fill rate, an increase in harvest index and an increased number of seeds.

[0061] In one embodiment of the present invention the function of the nucleic acid sequences of the invention is to confer information for synthesis of the BCAT4-like polypeptide that increases yield or yield-related traits, when such a nucleic acid sequence of the invention is transcribed and translated in a living plant cell.

[0062] With respect to bZIP-like polypeptides, the present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 1, encoding the polypeptide sequence of SEQ ID NO: 2. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any bZIP-like-encoding nucleic acid or bZIP-like polypeptide as defined herein, as exemplified with the bZIP-like encoding nucleic acid of SEQ ID NO: 3.

[0063] Examples of nucleic acids encoding bZIP-like polypeptides are given in Table A1 of the Examples section herein. Such nucleic acids are useful in performing the methods of the invention. The amino acid sequences given in Table A1 of the Examples section are example sequences of orthologues and paralogues of the bZIP-like polypeptide represented by SEQ ID NO: 2, the terms "orthologues" and "paralogues" being as defined herein. Further orthologues and paralogues may readily be identified by performing a so-called reciprocal blast search as described in the definitions section; where the query sequence is SEQ ID NO: 1 or SEQ ID NO: 2, the second BLAST (back-BLAST) would be against Solanum lycopersicum sequences. Where the query sequence is SEQ ID NO: 3 or SEQ ID NO: 4, the second BLAST (back-BLAST) would be against Populus trichocarpa sequences.

[0064] The invention also provides hitherto unknown bZIP-like-encoding nucleic acids and bZIP-like polypeptides useful for conferring enhanced yield-related traits in plants relative to control plants.

[0065] With respect to BCAT4-like polypeptides, the present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 141, encoding the polypeptide sequence of SEQ ID NO: 142. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any BCAT4-like-encoding nucleic acid or BCAT4-like polypeptide as defined herein.

[0066] Examples of nucleic acids encoding BCAT4-like polypeptides are given in Table A2 of the Examples section herein. Such nucleic acids are useful in performing the methods of the invention. The amino acid sequences given in Table A2 of the Examples section are example sequences of orthologues and paralogues of the BCAT4-like polypeptide represented by SEQ ID NO: 142, the terms "orthologues" and "paralogues" being as defined herein. Further orthologues and paralogues may readily be identified by performing a so-called reciprocal blast search as described in the definitions section; where the query sequence is SEQ ID NO: 141 or SEQ ID NO: 142, the second BLAST (back-BLAST) would be against Populus trichocarpa sequences.
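
By way of illustration only, the logic of a reciprocal BLAST search can be sketched in Python as follows. The sketch does not run BLAST; it assumes that the hits of the first BLAST and of the back-BLAST have already been collected into dictionaries of (identifier, score) pairs. It also applies a simplified "best hit in both directions" criterion, which is an assumption made for the example and may be stricter than the criterion given in the definitions section. All identifiers and scores shown are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Hit = Tuple[str, float]  # (sequence identifier, score); higher score = better hit


def best_hit(hits: List[Hit]) -> Optional[str]:
    """Return the identifier of the highest-scoring hit, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])[0]


def reciprocal_best_hits(
    forward: Dict[str, List[Hit]],
    backward: Dict[str, List[Hit]],
) -> List[Tuple[str, str]]:
    """Pair each query with its best forward hit if the back-BLAST returns the query.

    forward:  query id -> hits of the first BLAST
    backward: hit id   -> hits of the second BLAST (back-BLAST) against the
              organism from which the query was derived
    """
    pairs = []
    for query, hits in forward.items():
        candidate = best_hit(hits)
        if candidate is None:
            continue
        if best_hit(backward.get(candidate, [])) == query:
            pairs.append((query, candidate))
    return pairs


if __name__ == "__main__":
    # Hypothetical identifiers and scores, for illustration only.
    forward = {"query_A": [("cand_1", 200.0), ("cand_2", 150.0)]}
    backward = {"cand_1": [("query_A", 190.0), ("other_B", 90.0)]}
    print(reciprocal_best_hits(forward, backward))
```

In practice the hit tables would be parsed from the output of the chosen BLAST program, and a less strict rule (for example, the query appearing among the highest-ranking hits rather than as the single best hit) could be substituted in `reciprocal_best_hits`.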

[0067] Nucleic acid variants may also be useful in practising the methods of the invention. Examples of such variants include nucleic acids encoding homologues and derivatives of any one of the amino acid sequences given in Table A1 and A2 of the Examples section, the terms "homologue" and "derivative" being as defined herein. Also useful in the methods of the invention are nucleic acids encoding homologues and derivatives of orthologues or paralogues of any one of the amino acid sequences given in Table A1 and A2 of the Examples section. Homologues and derivatives useful in the methods of the present invention have substantially the same biological and functional activity as the unmodified protein from which they are derived. Further variants useful in practising the methods of the invention are variants in which codon usage is optimised or in which miRNA target sites are removed.

[0068] Further nucleic acid variants useful in practising the methods of the invention include portions of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, nucleic acids hybridising to nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, splice variants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, allelic variants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, and variants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, obtained by gene shuffling. The terms hybridising sequence, splice variant, allelic variant and gene shuffling are as described herein.

[0069] Nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, need not be full-length nucleic acids, since performance of the methods of the invention does not rely on the use of full-length nucleic acid sequences. According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a portion of any one of the nucleic acid sequences given in Table A1 and A2 of the Examples section, or a portion of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1 and A2 of the Examples section.

[0070] A portion of a nucleic acid may be prepared, for example, by making one or more deletions to the nucleic acid. The portions may be used in isolated form or they may be fused to other coding (or non-coding) sequences in order to, for example, produce a protein that combines several activities. When fused to other coding sequences, the resultant polypeptide produced upon translation may be bigger than that predicted for the protein portion.

[0071] With respect to bZIP-like polypeptides, portions useful in the methods of the invention encode a bZIP-like polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequences given in Table A1 of the Examples section. Preferably, the portion is a portion of any one of the nucleic acids given in Table A1 of the Examples section, or is a portion of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A1 of the Examples section. Preferably the portion is at least 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250 consecutive nucleotides in length, the consecutive nucleotides being of any one of the nucleic acid sequences given in Table A1 of the Examples section, or of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A1 of the Examples section. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 1 or SEQ ID NO: 3. Preferably, the portion encodes a fragment of an amino acid sequence which, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.

[0072] With respect to BCAT4-like polypeptides, portions useful in the methods of the invention encode a BCAT4-like polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequences given in Table A2 of the Examples section. Preferably, the portion is a portion of any one of the nucleic acids given in Table A2 of the Examples section, or is a portion of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A2 of the Examples section. Preferably the portion is at least 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250 consecutive nucleotides in length, the consecutive nucleotides being of any one of the nucleic acid sequences given in Table A2 of the Examples section, or of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A2 of the Examples section. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 141. Preferably, the portion encodes a fragment of an amino acid sequence which, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.
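
By way of illustration only, the "consecutive nucleotides" length condition on a portion can be expressed as a simple check. The Python sketch below treats a portion as a single uninterrupted stretch of the parent sequence, which is a simplifying assumption. It tests length and contiguity only and says nothing about whether the encoded fragment retains biological activity. The function names and the toy sequence are hypothetical.

```python
def make_portion(parent: str, start: int, length: int) -> str:
    """Return `length` consecutive nucleotides of `parent`, starting at 0-based `start`."""
    if length <= 0 or start < 0 or start + length > len(parent):
        raise ValueError("requested portion lies outside the parent sequence")
    return parent[start:start + length]


def is_portion(candidate: str, parent: str, min_length: int) -> bool:
    """True if `candidate` is a contiguous stretch of `parent` of at least `min_length` nucleotides."""
    candidate = candidate.upper()
    parent = parent.upper()
    return len(candidate) >= min_length and candidate in parent


if __name__ == "__main__":
    parent = "ATGGCCAAGT"  # toy sequence, not taken from the sequence listing
    portion = make_portion(parent, 2, 5)
    # A real check would use a minimum such as 200 or 250 nucleotides.
    print(portion, is_portion(portion, parent, 5))
```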

[0073] Another nucleic acid variant useful in the methods of the invention is a nucleic acid capable of hybridising, under reduced stringency conditions, preferably under stringent conditions, with a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein, or with a portion as defined herein. According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a nucleic acid capable of hybridizing to the complement of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or to the complement of a nucleic acid encoding an orthologue, paralogue or homologue of any one of the proteins given in Table A1 and A2.

[0074] Hybridising sequences useful in the methods of the invention encode a POI polypeptide as defined herein, having substantially the same biological activity as the amino acid sequences given in Table A1 and A2 of the Examples section. Preferably, the hybridising sequence is capable of hybridising to the complement of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or to a portion of any of these sequences, a portion being as defined herein, or the hybridising sequence is capable of hybridising to the complement of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A1 and A2 of the Examples section.

[0075] With respect to bZIP-like polypeptides, the hybridising sequence is most preferably capable of hybridising to the complement of a nucleic acid as represented by SEQ ID NO: 1 or SEQ ID NO: 3, or to a portion thereof. In one embodiment, the hybridization conditions are of medium stringency, preferably of high stringency, as defined above.

[0076] Preferably, the hybridising sequence encodes a polypeptide with an amino acid sequence which, when full-length and when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.

[0077] With respect to BCAT4-like polypeptides, the hybridising sequence is most preferably capable of hybridising to the complement of a nucleic acid as represented by SEQ ID NO: 141 or to a portion thereof. In one embodiment, the hybridization conditions are of medium stringency, preferably of high stringency, as defined above.

[0078] Preferably, the hybridising sequence encodes a polypeptide with an amino acid sequence which, when full-length and used in the construction of a phylogenetic tree, such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.
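
By way of illustration only, the notion of the complement of a nucleic acid, to which a hybridising sequence binds, can be sketched in Python. The sketch computes the reverse complement of a DNA strand and tests for perfect base pairing only. It is an assumption-laden simplification: hybridisation under the stringency conditions defined herein tolerates mismatches and depends on temperature and salt concentration, none of which is modelled. The function names and toy sequences are hypothetical.

```python
# Watson-Crick pairing for unambiguous DNA bases; other symbols pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def pairs_perfectly(probe: str, target: str) -> bool:
    """True if `probe` is the exact reverse complement of `target` (no mismatches)."""
    return probe.upper() == reverse_complement(target).upper()


if __name__ == "__main__":
    coding = "ATGGCC"  # toy sequence, not taken from the sequence listing
    complement = reverse_complement(coding)
    print(complement, pairs_perfectly(coding, complement))
```

Under this simplification, a sequence that pairs perfectly with the complement of a given nucleic acid is identical to that nucleic acid, which is why hybridising sequences are closely related to the sequences of Table A1 and A2.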

[0079] Another nucleic acid variant useful in the methods of the invention is a splice variant encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined hereinabove, a splice variant being as defined herein.

[0080] In another embodiment, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a splice variant of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or a splice variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1 and A2 of the Examples section.

[0081] With respect to bZIP-like polypeptides, preferred splice variants are splice variants of a nucleic acid represented by SEQ ID NO: 1 or SEQ ID NO: 3, or a splice variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2 or SEQ ID NO: 4. Preferably, the amino acid sequence encoded by the splice variant, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.

[0082] With respect to BCAT4-like polypeptides, preferred splice variants are splice variants of a nucleic acid represented by SEQ ID NO: 141, or a splice variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 142. Preferably, the amino acid sequence encoded by the splice variant, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.

[0083] Another nucleic acid variant useful in performing the methods of the invention is an allelic variant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined hereinabove, an allelic variant being as defined herein.

[0084] In yet another embodiment, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1 and A2 of the Examples section.

[0085] With respect to bZIP-like polypeptides, the polypeptides encoded by allelic variants useful in the methods of the present invention have substantially the same biological activity as the bZIP-like polypeptide of SEQ ID NO: 2 and any of the amino acid sequences depicted in Table A1 of the Examples section. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Preferably, the allelic variant is an allelic variant of SEQ ID NO: 1 or SEQ ID NO: 3, or an allelic variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2 or SEQ ID NO: 4. Preferably, the amino acid sequence encoded by the allelic variant, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.

[0086] With respect to BCAT4-like polypeptides, the polypeptides encoded by allelic variants useful in the methods of the present invention have substantially the same biological activity as the BCAT4-like polypeptide of SEQ ID NO: 142 and any of the amino acid sequences depicted in Table A2 of the Examples section. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Preferably, the allelic variant is an allelic variant of SEQ ID NO: 141 or an allelic variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 142. Preferably, the amino acid sequence encoded by the allelic variant, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.

[0087] Gene shuffling or directed evolution may also be used to generate variants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, as defined herein; the term "gene shuffling" being as defined herein.

[0088] In yet another embodiment, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a variant of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or comprising introducing and expressing in a plant a variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1 and A2 of the Examples section, which variant nucleic acid is obtained by gene shuffling.

[0089] With respect to bZIP-like polypeptides, the amino acid sequence encoded by the variant nucleic acid obtained by gene shuffling, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), preferably clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.

[0090] With respect to BCAT4-like polypeptides, the amino acid sequence encoded by the variant nucleic acid obtained by gene shuffling, when used in the construction of a phylogenetic tree such as the one depicted in FIG. 7, preferably clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.

[0091] Furthermore, nucleic acid variants may also be obtained by site-directed mutagenesis. Several methods are available to achieve site-directed mutagenesis, the most common being PCR based methods (Current Protocols in Molecular Biology. Wiley Eds.).

[0092] bZIP-like polypeptides differing from the sequence of SEQ ID NO: 2 or of SEQ ID NO: 4 by one or several amino acids (substitution(s), insertion(s) and/or deletion(s) as defined above) may equally be useful to increase the yield of plants in the methods and constructs and plants of the invention.

[0093] BCAT4-like polypeptides differing from the sequence of SEQ ID NO: 142 by one or several amino acids (substitution(s), insertion(s) and/or deletion(s) as defined above) may equally be useful to increase the yield of plants in the methods and constructs and plants of the invention.

[0094] Concerning bZIP-like polypeptides, nucleic acids encoding bZIP-like polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the bZIP-like polypeptide-encoding nucleic acid is from a plant, further preferably from a dicotyledonous plant. In one embodiment, the bZIP-like polypeptide-encoding nucleic acid is from the family Solanaceae, preferably from Solanum lycopersicum. In another embodiment, the bZIP-like polypeptide-encoding nucleic acid is from the family Salicaceae, preferably from Populus trichocarpa.

[0095] Nucleic acids encoding BCAT4-like polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the BCAT4-like polypeptide-encoding nucleic acid is from a plant, further preferably from a dicotyledonous plant, more preferably from the family Salicaceae, most preferably the nucleic acid is from Populus trichocarpa.

[0096] In another embodiment the present invention extends to recombinant chromosomal DNA comprising a nucleic acid sequence useful in the methods of the invention, wherein said nucleic acid is present in the chromosomal DNA as a result of recombinant methods, but is not in its natural genetic environment. In a further embodiment the recombinant chromosomal DNA of the invention is comprised in a plant cell.

[0097] Performance of the methods of the invention gives plants having enhanced yield-related traits. In particular performance of the methods of the invention gives plants having increased yield, especially increased seed yield and/or increased biomass relative to control plants. It should be noted that the increased biomass was obtained without effect on flowering time, in particular, no delay in flowering time was observed. The terms "yield" and "seed yield" are described in more detail in the "definitions" section herein.

[0098] Reference herein to enhanced yield-related traits is taken to mean an increase in early vigour and/or in biomass (weight) of one or more parts of a plant, which may include (i) aboveground parts and preferably aboveground harvestable parts and/or (ii) parts below ground and preferably harvestable below ground. In particular, such harvestable parts are biomass and/or seeds, and performance of the methods of the invention results in plants having increased biomass and/or increased seed yield relative to the seed yield of control plants. In one preferred embodiment, the increased yield is increased biomass, in another preferred embodiment, the increased yield is increased seed yield, in yet another embodiment, the increased yield is increased biomass and increased seed yield.

[0099] The present invention provides a method for increasing yield-related traits, in particular yield, especially seed yield and/or biomass of plants, relative to control plants, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein.

[0100] According to a preferred feature of the present invention, performance of the methods of the invention gives plants having an increased growth rate relative to control plants. Therefore, according to the present invention, there is provided a method for increasing the growth rate of plants, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein.

[0101] Performance of the methods of the invention gives plants grown under non-stress conditions or under mild drought conditions increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under non-stress conditions or under mild drought conditions, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.

[0102] Performance of the methods of the invention gives plants grown under conditions of drought, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of drought which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.

[0103] Performance of the methods of the invention gives plants grown under conditions of nutrient deficiency, particularly under conditions of nitrogen deficiency, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of nutrient deficiency, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.

[0104] Performance of the methods of the invention gives plants grown under conditions of salt stress, increased yield-related traits relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield-related traits in plants grown under conditions of salt stress, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.

[0105] The invention also provides genetic constructs and vectors to facilitate introduction and/or expression in plants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides. The gene constructs may be inserted into vectors, which may be commercially available, suitable for transforming into plants or host cells and suitable for expression of the gene of interest in the transformed cells. The invention also provides use of a gene construct as defined herein in the methods of the invention.

[0106] More specifically, the present invention provides a construct comprising:

[0107] (a) a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined above;

[0108] (b) one or more control sequences capable of driving expression of the nucleic acid sequence of (a); and optionally

[0109] (c) a transcription termination sequence.

[0110] Preferably, the nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, is as defined above. The terms "control sequence" and "termination sequence" are as defined herein.
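
By way of illustration only, the arrangement of elements (a) to (c) of the construct can be represented as a small data structure. The Python sketch below is a bookkeeping aid under stated assumptions: elements are represented by labels rather than by sequences, the class and method names are hypothetical, and the "::" notation is borrowed from the pGOS2::BCAT4-like::t-zein designation used herein.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Construct:
    """Labels for the elements of a construct; not the sequences themselves."""

    coding_sequence: str                 # (a) nucleic acid encoding the polypeptide
    control_sequences: Tuple[str, ...]   # (b) one or more control sequences, e.g. a promoter
    terminator: Optional[str] = None     # (c) optional transcription termination sequence

    def __post_init__(self) -> None:
        if not self.coding_sequence:
            raise ValueError("element (a), the coding sequence, is required")
        if len(self.control_sequences) < 1:
            raise ValueError("element (b) requires at least one control sequence")

    def elements(self) -> List[str]:
        """Elements in 5' to 3' order: control sequence(s), coding sequence, terminator."""
        parts = list(self.control_sequences) + [self.coding_sequence]
        if self.terminator is not None:
            parts.append(self.terminator)
        return parts

    def describe(self) -> str:
        return "::".join(self.elements())


if __name__ == "__main__":
    cassette = Construct("BCAT4-like", ("pGOS2",), "t-zein")
    print(cassette.describe())
```

Making the terminator optional mirrors the wording "and optionally" in the list above, while the validation in `__post_init__` reflects that elements (a) and (b) are always present.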

[0111] The genetic construct of the invention may be comprised in a host cell, plant cell, seed, agricultural product or plant. Plants or host cells are transformed with a genetic construct such as a vector or an expression cassette comprising any of the nucleic acids described above. Thus the invention furthermore provides plants or host cells transformed with a construct as described above. In particular, the invention provides plants transformed with a construct as described above, which plants have increased yield-related traits as described herein.

[0112] In one embodiment the genetic construct of the invention confers increased yield or yield-related trait(s) to a plant when it has been introduced into said plant, which plant expresses the nucleic acid encoding the POI comprised in the genetic construct. In another embodiment the genetic construct of the invention confers increased yield or yield-related trait(s) to a plant comprising plant cells in which the construct has been introduced, which plant cells express the nucleic acid encoding the POI comprised in the genetic construct.

[0113] The skilled artisan is well aware of the genetic elements that must be present on the genetic construct in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter).

[0114] Advantageously, any type of promoter, whether natural or synthetic, may be used to drive expression of the nucleic acid sequence, but preferably the promoter is of plant origin. A constitutive promoter is particularly useful in the methods. See the "Definitions" section herein for definitions of the various promoter types.

[0115] The constitutive promoter is preferably a ubiquitous constitutive promoter of medium strength. More preferably it is a plant derived promoter, e.g. a promoter of plant chromosomal origin, such as a GOS2 promoter or a promoter of substantially the same strength and having substantially the same expression pattern (a functionally equivalent promoter), more preferably the promoter is the GOS2 promoter from rice. Further preferably the constitutive promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 131 or SEQ ID NO: 218, most preferably the constitutive promoter is as represented by SEQ ID NO: 131 or SEQ ID NO: 218. See the "Definitions" section herein for further examples of constitutive promoters.

[0116] With respect to bZIP-like polypeptides, it should be clear that the applicability of the present invention is not restricted to the bZIP-like polypeptide-encoding nucleic acid represented by SEQ ID NO: 1 or SEQ ID NO: 3, nor is the applicability of the invention restricted to the rice GOS2 promoter when expression of a bZIP-like polypeptide-encoding nucleic acid is driven by a constitutive promoter.

[0117] With respect to BCAT4-like polypeptides, it should be clear that the applicability of the present invention is not restricted to the BCAT4-like polypeptide-encoding nucleic acid represented by SEQ ID NO: 141, nor is the applicability of the invention restricted to expression of a BCAT4-like polypeptide-encoding nucleic acid when driven by a constitutive promoter.

[0118] With respect to bZIP-like polypeptides, optionally, one or more terminator sequences may be used in the construct introduced into a plant. Preferably, the construct comprises an expression cassette comprising a GOS2 promoter, substantially similar to SEQ ID NO: 131, operably linked to the nucleic acid encoding the bZIP-like polypeptide. More preferably, the construct comprises a zein terminator (t-zein) linked to the 3' end of the bZIP-like coding sequence. Most preferably, the expression cassette comprises a sequence having in increasing order of preference at least 95%, at least 96%, at least 97%, at least 98%, at least 99% identity to the sequence represented by SEQ ID NO: 132 (Le expression cassette) or SEQ ID NO: 133 (Pt expression cassette). Furthermore, one or more sequences encoding selectable markers may be present on the construct introduced into a plant.

[0119] With respect to BCAT4-like polypeptides, optionally, one or more terminator sequences may be used in the construct introduced into a plant. Preferably, the construct comprises an expression cassette comprising a GOS2 promoter, substantially similar to SEQ ID NO: 218, operably linked to the nucleic acid encoding the BCAT4-like polypeptide. More preferably, the construct comprises a zein terminator (t-zein) linked to the 3' end of the BCAT4-like polypeptide coding sequence. Most preferably, the expression cassette comprises a sequence having in increasing order of preference at least 95%, at least 96%, at least 97%, at least 98%, at least 99% identity to the sequence represented by SEQ ID NO: 217 (pGOS2::BCAT4-like::t-zein sequence). Furthermore, one or more sequences encoding selectable markers may be present on the construct introduced into a plant.
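The cassette layout described in paragraphs [0118] and [0119] (a promoter operably linked to the coding sequence, with an optional terminator at the 3' end) can be pictured as a small data structure. The following Python sketch is illustrative only: the class name `ExpressionCassette`, its fields and the `label` method are hypothetical, and the element names are placeholders rather than actual sequences.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpressionCassette:
    """Hypothetical model of a promoter::coding sequence::terminator cassette."""
    promoter: str                      # placeholder name, e.g. "pGOS2" (not a sequence)
    coding_sequence: str               # placeholder name, e.g. "BCAT4-like"
    terminator: Optional[str] = None   # terminators are optional per the text

    def label(self) -> str:
        """Join the elements in 5' to 3' order using the '::' notation."""
        parts = [self.promoter, self.coding_sequence]
        if self.terminator is not None:
            parts.append(self.terminator)
        return "::".join(parts)


bcat4_cassette = ExpressionCassette("pGOS2", "BCAT4-like", "t-zein")
print(bcat4_cassette.label())
```

The label produced for the BCAT4-like example follows the notation used in the text for SEQ ID NO: 217; a cassette without a terminator simply omits the last element.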

[0120] According to a preferred feature of the invention, the modulated expression is increased expression. Methods for increasing expression of nucleic acids or genes, or gene products, are well documented in the art and examples are provided in the definitions section.

[0121] As mentioned above, a preferred method for modulating expression of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, is by introducing and expressing in a plant a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide; however, the effects of performing the method, i.e. enhancing yield-related traits, may also be achieved using other well-known techniques, including but not limited to T-DNA activation tagging, TILLING and homologous recombination. A description of these techniques is provided in the definitions section.

[0122] The invention also provides a method for the production of transgenic plants having enhanced yield-related traits relative to control plants, comprising introduction and expression in a plant of any nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein.

[0123] With respect to bZIP-like polypeptides, the present invention more specifically provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased yield, which method comprises:

[0124] (i) introducing and expressing in a plant or plant cell a bZIP-like polypeptide-encoding nucleic acid or a genetic construct comprising a bZIP-like polypeptide-encoding nucleic acid; and

[0125] (ii) cultivating the plant cell under conditions promoting plant growth and development.

[0126] The nucleic acid of (i) may be any of the nucleic acids capable of encoding a bZIP-like polypeptide as defined herein.

[0127] Preferably, the nucleic acid encoding bZIP-like polypeptide as defined herein and to be introduced into the plant is an isolated nucleic acid or is comprised in a genetic construct.

[0128] With respect to BCAT4-like polypeptides, the present invention more specifically provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased yield, more in particular increased seed yield, which method comprises:

[0129] (i) introducing and expressing in a plant or plant cell a BCAT4-like polypeptide-encoding nucleic acid or a genetic construct comprising a BCAT4-like polypeptide-encoding nucleic acid; and

[0130] (ii) cultivating the plant cell under conditions promoting plant growth and development.

[0131] The nucleic acid of (i) may be any of the nucleic acids capable of encoding a BCAT4-like polypeptide as defined herein.

[0132] Preferably, the nucleic acid encoding BCAT4-like polypeptide as defined herein and to be introduced into the plant is an isolated nucleic acid or is comprised in a genetic construct.

[0133] Cultivating the plant cell under conditions promoting plant growth and development may or may not include regeneration and/or growth to maturity. Accordingly, in a particular embodiment of the invention, the plant cell transformed by the method according to the invention is regenerable into a transformed plant. In another particular embodiment, the plant cell transformed by the method according to the invention is not regenerable into a transformed plant, i.e. it is a cell that is not capable of regenerating into a plant using cell culture techniques known in the art. While plant cells generally have the characteristic of totipotency, some plant cells cannot be used to regenerate or propagate intact plants from said cells. In one embodiment of the invention the plant cells of the invention are such cells. In another embodiment the plant cells of the invention are plant cells that do not sustain themselves in an autotrophic way.

[0134] The nucleic acid may be introduced directly into a plant cell or into the plant itself (including introduction into a tissue, organ or any other part of a plant). According to a preferred feature of the present invention, the nucleic acid is preferably introduced into a plant or plant cell by transformation. The term "transformation" is described in more detail in the "definitions" section herein.

[0135] In one embodiment the present invention extends to any plant cell or plant produced by any of the methods described herein, and to all plant parts and propagules thereof.

[0136] The present invention encompasses plants or parts thereof (including seeds) obtainable by the methods according to the present invention. The plants or plant parts or plant cells comprise a nucleic acid transgene encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined above, preferably in a genetic construct such as an expression cassette. The present invention extends further to encompass the progeny of a primary transformed or transfected cell, tissue, organ or whole plant that has been produced by any of the aforementioned methods, the only requirement being that progeny exhibit the same genotypic and/or phenotypic characteristic(s) as those produced by the parent in the methods according to the invention.

[0137] In a further embodiment the invention extends to seeds comprising the expression cassettes of the invention, the genetic constructs of the invention, or the nucleic acids encoding the bZIP-like polypeptide, or the BCAT4-like polypeptide, and/or the bZIP-like polypeptides, or BCAT4-like polypeptides, as described above.

[0138] The invention also includes host cells containing an isolated nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined above. In one embodiment host cells according to the invention are plant cells, yeasts, bacteria or fungi. Host plants for the nucleic acids, construct, expression cassette or the vector used in the method according to the invention are, in principle, advantageously all plants which are capable of synthesizing the polypeptides used in the inventive method. In a particular embodiment the plant cells of the invention overexpress the nucleic acid molecule of the invention.

[0139] The methods of the invention are advantageously applicable to any plant, in particular to any plant as defined herein. Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs. According to an embodiment of the present invention, the plant is a crop plant. Examples of crop plants include but are not limited to chicory, carrot, cassava, trefoil, soybean, beet, sugar beet, sunflower, canola, alfalfa, rapeseed, linseed, cotton, tomato, potato and tobacco. According to another embodiment of the present invention, the plant is a monocotyledonous plant. Examples of monocotyledonous plants include sugarcane. According to another embodiment of the present invention, the plant is a cereal. Examples of cereals include rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, einkorn, teff, milo and oats. In a particular embodiment the plants used in the methods of the invention are selected from the group consisting of maize, wheat, rice, soybean, cotton, oilseed rape including canola, sugarcane, sugar beet and alfalfa. Advantageously the methods of the invention are more efficient than the known methods, because the plants of the invention have increased yield and/or tolerance to an environmental stress compared to control plants used in comparable methods.

[0140] The invention also extends to harvestable parts of a plant such as, but not limited to seeds, leaves, fruits, flowers, stems, roots, rhizomes, tubers and bulbs, which harvestable parts comprise a recombinant nucleic acid encoding a POI polypeptide. The invention furthermore relates to products derived or produced, preferably directly derived or produced, from a harvestable part of such a plant, such as dry pellets, meal or powders, oil, fat and fatty acids, starch or proteins.

[0141] The invention also includes methods for manufacturing a product comprising a) growing the plants of the invention and b) producing said product from or by the plants of the invention or parts thereof, including seeds. In a further embodiment the methods comprise the steps of a) growing the plants of the invention, b) removing the harvestable parts as described herein from the plants and c) producing said product from, or with the harvestable parts of plants according to the invention.

[0142] In one embodiment the products produced by the methods of the invention are plant products such as, but not limited to, a foodstuff, feedstuff, a food supplement, feed supplement, fiber, cosmetic or pharmaceutical. In another embodiment the methods for production are used to make agricultural products such as, but not limited to, plant extracts, proteins, amino acids, carbohydrates, fats, oils, polymers, vitamins, and the like.

[0143] In yet another embodiment the polynucleotides or the polypeptides of the invention are comprised in an agricultural product. In a particular embodiment the nucleic acid sequences and protein sequences of the invention may be used as product markers, for example where an agricultural product was produced by the methods of the invention. Such a marker can be used to identify a product to have been produced by an advantageous process resulting not only in a greater efficiency of the process but also improved quality of the product due to increased quality of the plant material and harvestable parts used in the process. Such markers can be detected by a variety of methods known in the art, for example but not limited to PCR based methods for nucleic acid detection or antibody based methods for protein detection.

[0144] The present invention also encompasses use of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, as described herein and use of these bZIP-like polypeptides, or BCAT4-like polypeptides, in enhancing any of the aforementioned yield-related traits in plants. For example, nucleic acids encoding the POI polypeptide described herein, or the bZIP-like polypeptides, or the BCAT4-like polypeptides, themselves, may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to a gene encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide. The nucleic acids/genes, or the bZIP-like polypeptides, or the BCAT4-like polypeptides, themselves may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having enhanced yield-related traits as defined herein in the methods of the invention. Furthermore, allelic variants of a nucleic acid/gene encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, may find use in marker-assisted breeding programmes. Nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, may also be used as probes for genetically and physically mapping the genes that they are a part of, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes.

[0145] Moreover, concerning the bZIP-like polypeptides, the present invention relates to the following specific embodiments:

[0146] 1. A method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, wherein said bZIP-like polypeptide comprises Basic Leucine Zipper Domain (PF00170) and a G-box binding domain of the MFMR type (PF07777) and one or more of motifs 1 to 3, represented by SEQ ID NO: 119, SEQ ID NO: 120 or SEQ ID NO: 121.

[0147] 2. Method according to embodiment 1, wherein said bZIP-like polypeptide comprises one or more of motifs 4 to 6.

[0148] 3. Method according to embodiment 1, wherein said bZIP-like polypeptide comprises one or more of motifs 7 to 12.

[0149] 4. Method according to any one of embodiments 1 to 3, wherein said modulated expression is effected by introducing and expressing in a plant said nucleic acid encoding said bZIP-like polypeptide.

[0150] 5. Method according to any one of embodiments 1 to 4, wherein said enhanced yield-related traits comprise increased yield relative to control plants, and preferably comprise increased biomass and/or increased seed yield relative to control plants.

[0151] 6. Method according to any one of embodiments 1 to 5, wherein said enhanced yield-related traits are obtained without effect on flowering time of the plant.

[0152] 7. Method according to any one of embodiments 1 to 3, wherein said enhanced yield-related traits are obtained under non-stress conditions.

[0153] 8. Method according to any one of embodiments 1 to 3, wherein said enhanced yield-related traits are obtained under conditions of drought stress or nitrogen deficiency.

[0154] 9. Method according to any one of embodiments 1 to 8, wherein said nucleic acid encoding a bZIP-like polypeptide is of plant origin, preferably from a dicotyledonous plant.

[0155] 10. Method according to any one of embodiments 1 to 8, wherein said nucleic acid encoding a bZIP-like polypeptide encodes any one of the polypeptides listed in Table A1 or is a portion of such a nucleic acid, or a nucleic acid capable of hybridising with such a nucleic acid.

[0156] 11. Method according to any one of embodiments 1 to 8, wherein said nucleic acid sequence encodes an orthologue or paralogue of any of the polypeptides given in Table A1.

[0157] 12. Method according to any one of embodiments 1 to 11, wherein said nucleic acid encodes the polypeptide represented by SEQ ID NO: 2 or SEQ ID NO: 4.

[0158] 13. Method according to any one of embodiments 1 to 12, wherein said nucleic acid is operably linked to a constitutive promoter, preferably to a medium strength constitutive promoter, preferably to a plant promoter, more preferably to a GOS2 promoter, most preferably to a GOS2 promoter from rice.

[0159] 14. Plant, plant part thereof, including seeds, or plant cell, obtainable by a method according to any one of embodiments 1 to 13, wherein said plant, plant part or plant cell comprises a recombinant nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12.

[0160] 15. Construct comprising:

[0161] (i) nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12;

[0162] (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally

[0163] (iii) a transcription termination sequence.

[0164] 16. Construct according to embodiment 15, wherein one of said control sequences is a constitutive promoter, preferably a medium strength constitutive promoter, preferably a plant promoter, more preferably a GOS2 promoter, most preferably a GOS2 promoter from rice.

[0165] 17. Use of a construct according to embodiment 15 or 16 in a method for making plants having enhanced yield-related traits, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass relative to control plants.

[0166] 18. Plant, plant part or plant cell transformed with a construct according to embodiment 15 or 16.

[0167] 19. Method for the production of a transgenic plant having enhanced yield-related traits relative to control plants, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass relative to control plants, comprising:

[0168] (i) introducing and expressing in a plant cell or plant a nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12; and

[0169] (ii) cultivating said plant cell or plant under conditions promoting plant growth and development.

[0170] 20. Transgenic plant having enhanced yield-related traits relative to control plants, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass, resulting from modulated expression of a nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12 or a transgenic plant cell derived from said transgenic plant.

[0171] 21. Transgenic plant according to embodiment 14, 18 or 20, or a transgenic plant cell derived therefrom, wherein said plant is a crop plant, such as beet, sugarbeet or alfalfa; or a monocotyledonous plant such as sugarcane; or a cereal, such as rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, einkorn, teff, milo or oats.

[0172] 22. Harvestable parts of a plant according to embodiment 21, wherein said harvestable parts are preferably shoot biomass and/or seeds.

[0173] 23. Products derived from a plant according to embodiment 21 and/or from harvestable parts of a plant according to embodiment 22.

[0174] 24. Use of a nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12, for enhancing yield-related traits in plants relative to control plants, preferably for increasing yield, and more preferably for increasing seed yield and/or for increasing biomass in plants relative to control plants.

[0175] 25. A method for manufacturing a product, comprising the steps of growing the plants according to embodiment 14, 18, 20 or 21 and producing said product from or by said plants, or parts thereof, including seeds.

[0176] Moreover, concerning the BCAT4-like polypeptides, the present invention relates to the following specific embodiments:

[0177] 1. A method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a BCAT4-like polypeptide, wherein said BCAT4-like polypeptide comprises the signature sequence represented by SEQ ID NO: 216.

[0178] 2. Method according to embodiment 1, wherein said polypeptide is encoded by a nucleic acid molecule comprising a nucleic acid molecule selected from the group consisting of:

[0179] (i) a nucleic acid represented by any one of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211;

[0180] (ii) the complement of a nucleic acid represented by any one of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211;

[0181] (iii) a nucleic acid encoding the polypeptide as represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212, preferably as a result of the degeneracy of the genetic code, said isolated nucleic acid being deducible from a polypeptide sequence as represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212 and further preferably conferring enhanced yield-related traits relative to control plants;

[0182] (iv) a nucleic acid having, in increasing order of preference at least 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity with any of the nucleic acid sequences of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211, and further preferably conferring enhanced yield-related traits relative to control plants;

[0183] (v) a first nucleic acid molecule which hybridizes with a second nucleic acid molecule of (i) to (iv) under stringent hybridization conditions and preferably confers enhanced yield-related traits relative to control plants;

[0184] (vi) a nucleic acid encoding said polypeptide having, in increasing order of preference, at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212 and preferably conferring enhanced yield-related traits relative to control plants; or

[0185] (vii) a nucleic acid comprising any combination(s) of features of (i) to (vi) above.

[0186] 3. Method according to embodiment 1 or 2, wherein said modulated expression is effected by introducing and expressing in a plant said nucleic acid encoding said BCAT4-like polypeptide.

[0187] 4. Method according to any one of embodiments 1 to 3, wherein said enhanced yield-related traits comprise increased yield relative to control plants, and preferably comprise increased biomass and/or increased seed yield relative to control plants.

[0188] 5. Method according to any one of embodiments 1 to 4, wherein said enhanced yield-related traits are obtained under conditions of drought stress.

[0189] 6. Method according to any of embodiments 1 to 5, wherein said BCAT4-like polypeptide comprises one or more of the following motifs:

[0190] (i) Motif 13 represented by SEQ ID NO: 213,

[0191] (ii) Motif 14 represented by SEQ ID NO: 214,

[0192] (iii) Motif 15 represented by SEQ ID NO: 215.

[0193] 7. Method according to any one of embodiments 1 to 6, wherein said nucleic acid encoding a BCAT4-like polypeptide is of plant origin, preferably from a dicotyledonous plant, further preferably from the family Salicaceae, more preferably from the genus Populus, most preferably from Populus trichocarpa.

[0194] 8. Method according to any one of embodiments 1 to 7, wherein said nucleic acid encoding a BCAT4-like polypeptide encodes any one of the polypeptides listed in Table A2 or is a portion of such a nucleic acid, or a nucleic acid capable of hybridising with such a nucleic acid.

[0195] 9. Method according to any one of embodiments 1 to 8, wherein said nucleic acid sequence encodes an orthologue or paralogue of any of the polypeptides given in Table A2.

[0196] 10. Method according to any one of embodiments 1 to 9, wherein said nucleic acid encodes the polypeptide represented by SEQ ID NO: 142.

[0197] 11. Method according to any one of embodiments 1 to 10, wherein said nucleic acid is operably linked to a constitutive promoter, preferably to a medium strength constitutive promoter, preferably to a plant promoter, more preferably to a GOS2 promoter, most preferably to a GOS2 promoter from rice.

[0198] 12. Plant, plant part thereof, including seeds, or plant cell, obtainable by a method according to any one of embodiments 1 to 11, wherein said plant, plant part or plant cell comprises a recombinant nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11.

[0199] 13. Construct comprising:

[0200] (i) nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11;

[0201] (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally

[0202] (iii) a transcription termination sequence.

[0203] 14. Construct according to embodiment 13, wherein one of said control sequences is a constitutive promoter, preferably a medium strength constitutive promoter, preferably a plant promoter, more preferably a GOS2 promoter, most preferably a GOS2 promoter from rice.

[0204] 15. Use of a construct according to embodiment 13 or 14 in a method for making plants having enhanced yield-related traits, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass relative to control plants.

[0205] 16. Plant, plant part or plant cell transformed with a construct according to embodiment 13 or 14.

[0206] 17. Method for the production of a transgenic plant having enhanced yield-related traits relative to control plants, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass relative to control plants, comprising:

[0207] (i) introducing and expressing in a plant cell or plant a nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11; and

[0208] (ii) cultivating said plant cell or plant under conditions promoting plant growth and development.

[0209] 18. Transgenic plant having enhanced yield-related traits relative to control plants, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass, resulting from modulated expression of a nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11 or a transgenic plant cell derived from said transgenic plant.

[0210] 19. Transgenic plant according to embodiment 12, 16 or 18, or a transgenic plant cell derived therefrom, wherein said plant is a crop plant, such as beet, sugarbeet or alfalfa; or a monocotyledonous plant such as sugarcane; or a cereal, such as rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, einkorn, teff, milo or oats.

[0211] 20. Harvestable parts of a plant according to embodiment 19, wherein said harvestable parts are preferably shoot biomass and/or seeds.

[0212] 21. Products derived from a plant according to embodiment 19 and/or from harvestable parts of a plant according to embodiment 20.

[0213] 22. Use of a nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11 for enhancing yield-related traits in plants relative to control plants, preferably for increasing yield, and more preferably for increasing seed yield and/or for increasing biomass in plants relative to control plants.

[0214] 23. A method for manufacturing a product comprising the steps of growing the plants according to embodiment 12, 16, 19 or 20 and producing said product from or by said plants, or parts thereof, including seeds.

[0215] 24. Construct according to embodiment 13 or 14 comprised in a plant cell.

[0216] 25. Recombinant chromosomal DNA comprising the construct according to embodiment 13 or 14.
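The sequence identity thresholds recited in embodiment 2, items (iv) and (vi), of the BCAT4-like embodiments above (at least 37% at the nucleic acid level, at least 50% at the polypeptide level, rising to 99%) can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned to equal length with `-` as the gap symbol, and it counts identity over ungapped columns only; this section does not prescribe an alignment method or a denominator, so both are assumptions. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal in length, and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(identity: float, minimum: float) -> bool:
    """True if the identity reaches the stated 'at least' percentage."""
    return identity >= minimum


# Toy polypeptide fragments (hypothetical, not taken from the sequence listing)
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQR"
identity = percent_identity(reference, candidate)
print(identity, meets_threshold(identity, 50.0), meets_threshold(identity, 95.0))
```

In practice the identity value would come from an alignment program, and the result depends on the alignment parameters and on how gaps are counted.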

DEFINITIONS

[0217] The following definitions will be used throughout the present application. The section captions and headings in this application are for convenience and reference purposes only and should not affect in any way the meaning or interpretation of this application. The technical terms and expressions used within the scope of this application are generally to be given the meaning commonly applied to them in the pertinent art of plant biology, molecular biology, bioinformatics and plant breeding. All of the following term definitions apply to the complete content of this application. The terms "essentially", "about", "approximately" and the like in connection with an attribute or a value, particularly also define exactly the attribute or exactly the value, respectively. The term "about" in the context of a given numeric value or range relates in particular to a value or range that is within 20%, within 10%, or within 5% of the value or range given. As used herein, the term "comprising" also encompasses the term "consisting of".
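The numeric reading of "about" given above (within 20%, within 10%, or within 5% of the stated value) amounts to a relative-tolerance check. A minimal Python sketch follows; the function name `is_about` and its parameters are hypothetical and serve only to make the arithmetic concrete.

```python
def is_about(value: float, reference: float, tolerance: float = 0.20) -> bool:
    """True if value lies within the given fraction (e.g. 0.20, 0.10, 0.05) of reference."""
    return abs(value - reference) <= tolerance * abs(reference)


# A value of 108 measured against a stated value of 100
print(is_about(108, 100, 0.20))  # inside the 20% band
print(is_about(108, 100, 0.10))  # inside the 10% band
print(is_about(108, 100, 0.05))  # outside the 5% band
```

Because the definition also covers exactly the stated value, `is_about(100, 100, 0.05)` holds as well.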

Peptide(s)/Protein(s)

[0218] The terms "peptides", "oligopeptides", "polypeptide" and "protein" are used interchangeably herein and refer to amino acids in a polymeric form of any length, linked together by peptide bonds, unless mentioned herein otherwise.

Polynucleotide(s)/Nucleic Acid(s)/Nucleic Acid Sequence(s)/Nucleotide Sequence(s)

[0219] The terms "polynucleotide(s)", "nucleic acid sequence(s)", "nucleotide sequence(s)", "nucleic acid(s)", "nucleic acid molecule" are used interchangeably herein and refer to nucleotides, either ribonucleotides or deoxyribonucleotides or a combination of both, in a polymeric unbranched form of any length.

Homologue(s)

[0220] "Homologues" of a protein encompass peptides, oligopeptides, polypeptides, proteins and enzymes having amino acid substitutions, deletions and/or insertions relative to the unmodified protein in question and having similar biological and functional activity as the unmodified protein from which they are derived.

[0221] Orthologues and paralogues are two different forms of homologues and encompass evolutionary concepts used to describe the ancestral relationships of genes. Paralogues are genes within the same species that have originated through duplication of an ancestral gene; orthologues are genes from different organisms that have originated through speciation, and are also derived from a common ancestral gene.
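The distinction drawn above can be written as a simple decision rule. The Python sketch below encodes only the two cases stated in the text (same species plus duplication, different organisms plus speciation); the `Gene` class, the function name and the example gene names are hypothetical, and the evolutionary event is supplied by the caller rather than inferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str      # hypothetical identifier
    species: str


def classify_homologues(a: Gene, b: Gene, divergence_event: str) -> str:
    """Classify two homologous genes.

    divergence_event is the event at their common ancestral gene:
    'duplication' or 'speciation' (assumed to be known to the caller).
    """
    if divergence_event == "duplication" and a.species == b.species:
        return "paralogue"
    if divergence_event == "speciation" and a.species != b.species:
        return "orthologue"
    return "unclassified"  # cases the definition above does not address


gene_1 = Gene("geneA1", "Oryza sativa")
gene_2 = Gene("geneA2", "Oryza sativa")
gene_3 = Gene("geneA", "Populus trichocarpa")
print(classify_homologues(gene_1, gene_2, "duplication"))
print(classify_homologues(gene_1, gene_3, "speciation"))
```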

[0222] A "deletion" refers to removal of one or more amino acids from a protein.

[0223] An "insertion" refers to one or more amino acid residues being introduced into a predetermined site in a protein. Insertions may comprise N-terminal and/or C-terminal fusions as well as intra-sequence insertions of single or multiple amino acids. Generally, insertions within the amino acid sequence will be smaller than N- or C-terminal fusions, of the order of about 1 to 10 residues. Examples of N- or C-terminal fusion proteins or peptides include the binding domain or activation domain of a transcriptional activator as used in the yeast two-hybrid system, phage coat proteins, (histidine)-6-tag, glutathione S-transferase-tag, protein A, maltose-binding protein, dihydrofolate reductase, Tag•100 epitope, c-myc epitope, FLAG®-epitope, lacZ, CMP (calmodulin-binding peptide), HA epitope, protein C epitope and VSV epitope.

[0224] A "substitution" refers to replacement of amino acids of the protein with other amino acids having similar properties (such as similar hydrophobicity, hydrophilicity, antigenicity, propensity to form or break α-helical structures or β-sheet structures). Amino acid substitutions are typically of single residues, but may be clustered depending upon functional constraints placed upon the polypeptide and may range from 1 to 10 amino acids. The amino acid substitutions are preferably conservative amino acid substitutions. Conservative substitution tables are well known in the art (see for example Creighton (1984) Proteins. W.H. Freeman and Company (Eds) and Table 1 below).

TABLE 1. Examples of conserved amino acid substitutions

Residue | Conservative Substitutions
Ala | Ser
Arg | Lys
Asn | Gln; His
Asp | Glu
Gln | Asn
Cys | Ser
Glu | Asp
Gly | Pro
His | Asn; Gln
Ile | Leu; Val
Leu | Ile; Val
Lys | Arg; Gln
Met | Leu; Ile
Phe | Met; Leu; Tyr
Ser | Thr; Gly
Thr | Ser; Val
Trp | Tyr
Tyr | Trp; Phe
Val | Ile; Leu
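By way of non-limiting illustration, Table 1 may be transcribed into a lookup structure for checking whether a given substitution is listed as conservative. The Python sketch below is not part of the described methods; the names `CONSERVATIVE_SUBSTITUTIONS` and `is_conservative` are hypothetical, and the table is treated as directional, exactly as printed (for example, Ala to Ser is listed, whereas Ser to Ala is not).

```python
# Table 1 transcribed as: residue -> set of listed conservative substitutes.
# Assumption: the table is read directionally, row by row, as printed.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if Table 1 lists `replacement` for `original`.

    Residues use three-letter codes. A residue without a row in Table 1
    (for example Pro) has no listed conservative substitute.
    """
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())
```

For instance, `is_conservative("Ile", "Val")` is true, while `is_conservative("Pro", "Gly")` is false because Table 1 contains no row for Pro.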

[0225] Amino acid substitutions, deletions and/or insertions may readily be made using peptide synthetic techniques known in the art, such as solid phase peptide synthesis and the like, or by recombinant DNA manipulation. Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. For example, techniques for making substitution mutations at predetermined sites in DNA are well known to those skilled in the art and include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB, Cleveland, Ohio), QuikChange Site-Directed Mutagenesis (Stratagene, San Diego, Calif.), PCR-mediated site-directed mutagenesis or other site-directed mutagenesis protocols (see Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989 and yearly updates)).

Derivatives

[0226] "Derivatives" include peptides, oligopeptides, polypeptides which may, compared to the amino acid sequence of the naturally-occurring form of the protein, such as the protein of interest, comprise substitutions of amino acids with non-naturally occurring amino acid residues, or additions of non-naturally occurring amino acid residues. "Derivatives" of a protein also encompass peptides, oligopeptides, polypeptides which comprise naturally occurring altered (glycosylated, acylated, prenylated, phosphorylated, myristoylated, sulphated etc.) or non-naturally altered amino acid residues compared to the amino acid sequence of a naturally-occurring form of the polypeptide. A derivative may also comprise one or more non-amino acid substituents or additions compared to the amino acid sequence from which it is derived, for example a reporter molecule or other ligand, covalently or non-covalently bound to the amino acid sequence, such as a reporter molecule which is bound to facilitate its detection, and non-naturally occurring amino acid residues relative to the amino acid sequence of a naturally-occurring protein. Furthermore, "derivatives" also include fusions of the naturally-occurring form of the protein with tagging peptides such as FLAG, HIS6 or thioredoxin (for a review of tagging peptides, see Terpe, Appl. Microbiol. Biotechnol. 60, 523-533, 2003).

Domain, Motif/Consensus Sequence/Signature

The term "domain" refers to a set of amino acids conserved at specific positions along an alignment of sequences of evolutionarily related proteins. While amino acids at other positions can vary between homologues, amino acids that are highly conserved at specific positions indicate amino acids that are likely essential in the structure, stability or function of a protein. Identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers to determine if any polypeptide in question belongs to a previously identified polypeptide family.

[0227] The term "motif" or "consensus sequence" or "signature" refers to a short conserved region in the sequence of evolutionarily related proteins. Motifs are frequently highly conserved parts of domains, but may also include only part of the domain, or be located outside of a conserved domain (if all of the amino acids of the motif fall outside of a defined domain).

[0228] Specialist databases exist for the identification of domains, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32:D134-D137, (2004)), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server (Swiss Institute of Bioinformatics (Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788 (2003)). Domains or motifs may also be identified using routine techniques, such as by sequence alignment.

[0229] Methods for the alignment of sequences for comparison are well known in the art; such methods include GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and Wunsch ((1970) J Mol Biol 48: 443-453) to find the global (i.e. spanning the complete sequences) alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. The BLAST algorithm (Altschul et al. (1990) J Mol Biol 215: 403-10) calculates percent sequence identity and performs a statistical analysis of the similarity between the two sequences. The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (NCBI). Homologues may readily be identified using, for example, the ClustalW multiple sequence alignment algorithm (version 1.83), with the default pairwise alignment parameters, and a scoring method in percentage. Global percentages of similarity and identity may also be determined using one of the methods available in the MatGAT software package (Campanella et al., BMC Bioinformatics. 2003 Jul. 10; 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences.). Minor manual editing may be performed to optimise alignment between conserved motifs, as would be apparent to a person skilled in the art. Furthermore, instead of using full-length sequences for the identification of homologues, specific domains may also be used. The sequence identity values may be determined over the entire nucleic acid or amino acid sequence or over selected domains or conserved motif(s), using the programs mentioned above with the default parameters. For local alignments, the Smith-Waterman algorithm is particularly useful (Smith T F, Waterman M S (1981) J. Mol. Biol. 147(1); 195-7).
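Purely as an illustration of the global alignment principle attributed above to Needleman and Wunsch, the following Python sketch computes the optimal global alignment score of two sequences by dynamic programming. It is not the GAP program itself; the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for simplicity and differ from the default parameters of the programs mentioned above.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Optimal global alignment score of sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming matrix in memory.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a prefix of a against the empty prefix of b.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]
```

Under these assumed scores, two identical sequences of length n score n, and aligning "ACGT" with "AGT" scores 2 (three matches and one gap).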

Reciprocal BLAST

[0230] Typically, this involves a first BLAST involving BLASTing a query sequence (for example using any of the sequences listed in Table A of the Examples section) against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived. The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first BLAST is from the same species as that from which the query sequence is derived; a BLAST back then ideally results in the query sequence being amongst the highest hits. An orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as that from which the query sequence is derived, and preferably results upon BLAST back in the query sequence being among the highest hits.
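The comparison of the first and second BLAST results may be sketched as follows. This Python fragment is illustrative only: it does not run BLAST, the function and parameter names are hypothetical, and it assumes, more strictly than the text requires, that a hit is classified only when the BLAST back returns the query among its highest hits.

```python
from typing import Optional, Sequence


def classify_reciprocal_hit(query_id: str,
                            query_species: str,
                            hit_species: str,
                            back_blast_top_ids: Sequence[str]) -> Optional[str]:
    """Classify one high-ranking hit of the first BLAST.

    query_id           -- identifier of the original query sequence
    query_species      -- species from which the query is derived
    hit_species        -- species of the high-ranking first-BLAST hit
    back_blast_top_ids -- identifiers of the highest hits obtained when the
                          hit is BLASTed back against the query organism

    Returns "paralogue", "orthologue", or None when the BLAST back does not
    recover the query (assumed here to mean "not confirmed").
    """
    if query_id not in back_blast_top_ids:
        return None
    if hit_species == query_species:
        return "paralogue"
    return "orthologue"
```

How many of the highest hits are passed in `back_blast_top_ids` is left to the user; the text does not fix a number.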

[0231] High-ranking hits are those having a low E-value. The lower the E-value, the more significant the score (or in other words the lower the chance that the hit was found by chance). Computation of the E-value is well known in the art. In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In the case of large families, ClustalW may be used, followed by a neighbour joining tree, to help visualize clustering of related genes and to identify orthologues and paralogues.
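As a minimal illustration of the percentage identity described above, the following Python sketch counts identical positions between two sequences that have already been aligned to equal length. The function name, the use of "-" as the gap symbol, and the choice of the full alignment length as the denominator are assumptions; alignment programs differ in how they define the "particular length".

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns holding identical, non-gap residues.

    Both inputs must be rows of the same alignment and therefore have
    equal, non-zero length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)
```

For example, the aligned pair "A-GT" and "ACGT" has three identical columns out of four, i.e. 75% identity under these assumptions.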

Hybridisation

[0232] The term "hybridisation" as defined herein is a process wherein substantially homologous complementary nucleotide sequences anneal to each other. The hybridisation process can occur entirely in solution, i.e. both complementary nucleic acids are in solution. The hybridisation process can also occur with one of the complementary nucleic acids immobilised to a matrix such as magnetic beads, Sepharose beads or any other resin. The hybridisation process can furthermore occur with one of the complementary nucleic acids immobilised to a solid support such as a nitro-cellulose or nylon membrane or immobilised by e.g. photolithography to, for example, a siliceous glass support (the latter known as nucleic acid arrays or microarrays or as nucleic acid chips). In order to allow hybridisation to occur, the nucleic acid molecules are generally thermally or chemically denatured to melt a double strand into two single strands and/or to remove hairpins or other secondary structures from single stranded nucleic acids.

[0233] The term "stringency" refers to the conditions under which a hybridisation takes place. The stringency of hybridisation is influenced by conditions such as temperature, salt concentration, ionic strength and hybridisation buffer composition. Generally, low stringency conditions are selected to be about 30° C. lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. Medium stringency conditions are when the temperature is 20° C. below T_m, and high stringency conditions are when the temperature is 10° C. below T_m. High stringency hybridisation conditions are typically used for isolating hybridising sequences that have high sequence similarity to the target nucleic acid sequence. However, nucleic acids may deviate in sequence and still encode a substantially identical polypeptide, due to the degeneracy of the genetic code. Therefore medium stringency hybridisation conditions may sometimes be needed to identify such nucleic acid molecules.
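The relationship between stringency and T_m stated above can be expressed as a simple offset table. The Python sketch below is illustrative; the names are hypothetical, and the offsets are the approximate values given in the preceding paragraph (30° C., 20° C. and 10° C. below T_m for low, medium and high stringency respectively).

```python
# Approximate offsets below T_m, in degrees Celsius, per stringency level.
STRINGENCY_OFFSET_C = {"low": 30.0, "medium": 20.0, "high": 10.0}


def hybridisation_temperature(tm_celsius: float, stringency: str) -> float:
    """Temperature corresponding to a stringency level for a given T_m."""
    try:
        offset = STRINGENCY_OFFSET_C[stringency]
    except KeyError:
        raise ValueError(f"unknown stringency level: {stringency!r}") from None
    return tm_celsius - offset
```

For a duplex with a T_m of 85° C., this gives 75° C. for high stringency and 55° C. for low stringency.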

[0234] The T_m is the temperature under defined ionic strength and pH, at which 50% of the target sequence hybridises to a perfectly matched probe. The T_m is dependent upon the solution conditions and the base composition and length of the probe. For example, longer sequences hybridise specifically at higher temperatures. The maximum rate of hybridisation is obtained from about 16° C. up to 32° C. below T_m. The presence of monovalent cations in the hybridisation solution reduces the electrostatic repulsion between the two nucleic acid strands, thereby promoting hybrid formation; this effect is visible for sodium concentrations of up to 0.4M (for higher concentrations, this effect may be ignored). Formamide reduces the melting temperature of DNA-DNA and DNA-RNA duplexes by 0.6 to 0.7° C. for each percent formamide, and addition of 50% formamide allows hybridisation to be performed at 30 to 45° C., though the rate of hybridisation will be lowered. Base pair mismatches reduce the hybridisation rate and the thermal stability of the duplexes. On average and for large probes, the T_m decreases about 1° C. per % base mismatch. The T_m may be calculated using the following equations, depending on the types of hybrids:

1) DNA-DNA hybrids (Meinkoth and Wahl, Anal. Biochem., 138: 267-284, 1984):

$$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6 \times \log_{10}[\mathrm{Na}^{+}]^{\mathrm{a}} + 0.41 \times \%[\mathrm{G/C}]^{\mathrm{b}} - 500 \times [L^{\mathrm{c}}]^{-1} - 0.61 \times \%\,\mathrm{formamide}$$

2) DNA-RNA or RNA-RNA hybrids:

$$T_m = 79.8\,^{\circ}\mathrm{C} + 18.5\,(\log_{10}[\mathrm{Na}^{+}]^{\mathrm{a}}) + 0.58\,(\%\mathrm{G/C}^{\mathrm{b}}) + 11.8\,(\%\mathrm{G/C}^{\mathrm{b}})^{2} - 820/L^{\mathrm{c}}$$

3) oligo-DNA or oligo-RNA^d hybrids:

For <20 nucleotides: $T_m = 2\,(l_n)$

For 20-35 nucleotides: $T_m = 22 + 1.46\,(l_n)$

^a or for other monovalent cation, but only accurate in the 0.01-0.4 M range.
^b only accurate for % GC in the 30% to 75% range.
^c L=length of duplex in base pairs.
^d oligo, oligonucleotide; l_n=effective length of primer=2×(no. of G/C)+(no. of A/T).
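By way of worked illustration, the DNA-DNA equation and the two oligonucleotide equations above may be evaluated as in the following Python sketch. The function names are hypothetical. Two assumptions are made that the text does not state: the oligonucleotide length thresholds are applied to the actual number of nucleotides, and U is counted together with A/T so that RNA oligonucleotides can be handled. The accuracy ranges of footnotes a and b are not enforced.

```python
import math


def tm_dna_dna(na_molar: float, gc_percent: float, length_bp: int,
               formamide_percent: float = 0.0) -> float:
    """T_m (deg C) of a DNA-DNA hybrid per the Meinkoth and Wahl equation.

    na_molar          -- monovalent cation concentration in mol/l
    gc_percent        -- G/C content in percent (e.g. 50 for 50%)
    length_bp         -- duplex length L in base pairs
    formamide_percent -- formamide concentration in percent
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 500.0 / length_bp
            - 0.61 * formamide_percent)


def effective_length(oligo: str) -> int:
    """l_n = 2 x (no. of G/C) + (no. of A/T); U is counted as A/T (assumption)."""
    seq = oligo.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    return 2 * gc + at


def tm_oligo(oligo: str) -> float:
    """T_m (deg C) of an oligonucleotide hybrid of up to 35 nucleotides."""
    n = len(oligo)
    l_n = effective_length(oligo)
    if n < 20:
        return 2.0 * l_n
    if n <= 35:
        return 22.0 + 1.46 * l_n
    raise ValueError("oligonucleotide equations cover at most 35 nucleotides")
```

For example, a 500 bp duplex with 50% G/C in 0.1 M Na+ without formamide gives 81.5 - 16.6 + 20.5 - 1.0 = 84.4° C.; adding 50% formamide lowers this by 30.5° C. to 53.9° C. A 10-mer with five G/C and five A/T residues has l_n = 15 and therefore T_m = 30° C.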

[0235] Non-specific binding may be controlled using any one of a number of known techniques such as, for example, blocking the membrane with protein-containing solutions, additions of heterologous RNA, DNA, and SDS to the hybridisation buffer, and treatment with RNase. For non-homologous probes, a series of hybridisations may be performed by either (i) progressively lowering the annealing temperature (for example from 68° C. to 42° C.) or (ii) progressively lowering the formamide concentration (for example from 50% to 0%). The skilled artisan is aware of various parameters which may be altered during hybridisation and which will either maintain or change the stringency conditions.

[0236] Besides the hybridisation conditions, specificity of hybridisation typically also depends on the function of post-hybridisation washes. To remove background resulting from non-specific hybridisation, samples are washed with dilute salt solutions. Critical factors of such washes include the ionic strength and temperature of the final wash solution: the lower the salt concentration and the higher the wash temperature, the higher the stringency of the wash. Wash conditions are typically performed at or below hybridisation stringency. A positive hybridisation gives a signal that is at least twice that of the background. Generally, suitable stringent conditions for nucleic acid hybridisation assays or gene amplification detection procedures are as set forth above. More or less stringent conditions may also be selected. The skilled artisan is aware of various parameters which may be altered during washing and which will either maintain or change the stringency conditions.

[0237] For example, typical high stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 65° C. in 1×SSC or at 42° C. in 1×SSC and 50% formamide, followed by washing at 65° C. in 0.3×SSC. Examples of medium stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 50° C. in 4×SSC or at 40° C. in 6×SSC and 50% formamide, followed by washing at 50° C. in 2×SSC. The length of the hybrid is the anticipated length for the hybridising nucleic acid. When nucleic acids of known sequence are hybridised, the hybrid length may be determined by aligning the sequences and identifying the conserved regions described herein. 1×SSC is 0.15M NaCl and 15 mM sodium citrate; the hybridisation solution and wash solutions may additionally include 5×Denhardt's reagent, 0.5-1.0% SDS, 100 μg/ml denatured, fragmented salmon sperm DNA, 0.5% sodium pyrophosphate.
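The SSC concentrations referred to above scale linearly from the 1×SSC composition. The following Python sketch is illustrative only; the names are hypothetical, and the total sodium figure rests on the assumption that the sodium citrate is the trisodium salt, contributing three Na+ per citrate.

```python
# Composition of 1xSSC as stated in the text, in mol/l.
NACL_MOLAR_1X = 0.15
CITRATE_MOLAR_1X = 0.015


def ssc_composition(fold: float) -> dict:
    """Component concentrations (mol/l) of an n-fold SSC solution.

    'sodium_molar' assumes trisodium citrate (3 Na+ per citrate); this is
    an assumption made for illustration, not a statement from the text.
    """
    nacl = fold * NACL_MOLAR_1X
    citrate = fold * CITRATE_MOLAR_1X
    return {
        "nacl_molar": nacl,
        "citrate_molar": citrate,
        "sodium_molar": nacl + 3 * citrate,
    }
```

Under that assumption, 1×SSC corresponds to about 0.195 M Na+, and the 0.3×SSC wash to about 0.045 M NaCl and 4.5 mM sodium citrate.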

[0238] For the purposes of defining the level of stringency, reference can be made to Sambrook et al. (2001) Molecular Cloning: a laboratory manual, 3rd Edition, Cold Spring Harbor Laboratory Press, CSH, New York or to Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989 and yearly updates).

Splice Variant

[0239] The term "splice variant" as used herein encompasses variants of a nucleic acid sequence in which selected introns and/or exons have been excised, replaced, displaced or added, or in which introns have been shortened or lengthened. Such variants will be ones in which the biological activity of the protein is substantially retained; this may be achieved by selectively retaining functional segments of the protein. Such splice variants may be found in nature or may be manmade. Methods for predicting and isolating such splice variants are well known in the art (see for example Foissac and Schiex (2005) BMC Bioinformatics 6: 25).

Allelic Variant

[0240] "Alleles" or "allelic variants" are alternative forms of a given gene, located at the same chromosomal position. Allelic variants encompass Single Nucleotide Polymorphisms (SNPs), as well as Small Insertion/Deletion Polymorphisms (INDELs). The size of INDELs is usually less than 100 bp. SNPs and INDELs form the largest set of sequence variants in naturally occurring polymorphic strains of most organisms.

Endogenous Gene

[0241] Reference herein to an "endogenous" gene not only refers to the gene in question as found in a plant in its natural form (i.e., without there being any human intervention), but also refers to that same gene (or a substantially homologous nucleic acid/gene) in an isolated form subsequently (re)introduced into a plant (a transgene). For example, a transgenic plant containing such a transgene may encounter a substantial reduction of the transgene expression and/or substantial reduction of expression of the endogenous gene. The isolated gene may be isolated from an organism or may be manmade, for example by chemical synthesis.

Gene Shuffling/Directed Evolution

[0242] "Gene shuffling" or "directed evolution" consists of iterations of DNA shuffling followed by appropriate screening and/or selection to generate variants of nucleic acids or portions thereof encoding proteins having a modified biological activity (Castle et al., (2004) Science 304(5674): 1151-4; U.S. Pat. Nos. 5,811,238 and 6,395,547).

Construct

[0243] Artificial DNA (such as, but not limited to, plasmids or viral DNA) capable of replication in a host cell and used for introduction of a DNA sequence of interest into a host cell or host organism. Host cells of the invention may be any cell selected from bacterial cells, such as Escherichia coli or Agrobacterium species cells, yeast cells, fungal, algal or cyanobacterial cells or plant cells. The skilled artisan is well aware of the genetic elements that must be present on the genetic construct in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter) as described herein. Additional regulatory elements may include transcriptional as well as translational enhancers. Those skilled in the art will be aware of terminator and enhancer sequences that may be suitable for use in performing the invention. An intron sequence may also be added to the 5' untranslated region (UTR) or in the coding sequence to increase the amount of the mature message that accumulates in the cytosol, as described in the definitions section. Other control sequences (besides promoter, enhancer, silencer, intron sequences, 3'UTR and/or 5'UTR regions) may be protein and/or RNA stabilizing elements. Such sequences would be known or may readily be obtained by a person skilled in the art.

[0244] The genetic constructs of the invention may further include an origin of replication sequence that is required for maintenance and/or replication in a specific cell type. One example is when a genetic construct is required to be maintained in a bacterial cell as an episomal genetic element (e.g. plasmid or cosmid molecule). Preferred origins of replication include, but are not limited to, the f1-ori and colE1.

[0245] For the detection of the successful transfer of the nucleic acid sequences as used in the methods of the invention and/or selection of transgenic plants comprising these nucleic acids, it is advantageous to use marker genes (or reporter genes). Therefore, the genetic construct may optionally comprise a selectable marker gene. Selectable markers are described in more detail in the "definitions" section herein. The marker genes may be removed or excised from the transgenic cell once they are no longer needed. Techniques for marker removal are known in the art; useful techniques are described above in the definitions section.

Regulatory Element/Control Sequence/Promoter

[0246] The terms "regulatory element", "control sequence" and "promoter" are all used interchangeably herein and are to be taken in a broad context to refer to regulatory nucleic acid sequences capable of effecting expression of the sequences to which they are ligated. The term "promoter" typically refers to a nucleic acid control sequence located upstream from the transcriptional start of a gene and which is involved in recognising and binding of RNA polymerase and other proteins, thereby directing transcription of an operably linked nucleic acid. Encompassed by the aforementioned terms are transcriptional regulatory sequences derived from a classical eukaryotic genomic gene (including the TATA box which is required for accurate transcription initiation, with or without a CCAAT box sequence) and additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) which alter gene expression in response to developmental and/or external stimuli, or in a tissue-specific manner. Also included within the term is a transcriptional regulatory sequence of a classical prokaryotic gene, in which case it may include a -35 box sequence and/or -10 box transcriptional regulatory sequences. The term "regulatory element" also encompasses a synthetic fusion molecule or derivative that confers, activates or enhances expression of a nucleic acid molecule in a cell, tissue or organ.

[0247] A "plant promoter" comprises regulatory elements, which mediate the expression of a coding sequence segment in plant cells. Accordingly, a plant promoter need not be of plant origin, but may originate from viruses or micro-organisms, for example from viruses which attack plant cells. The "plant promoter" can also originate from a plant cell, e.g. from the plant which is transformed with the nucleic acid sequence to be expressed in the inventive process and described herein. This also applies to other "plant" regulatory signals, such as "plant" terminators. The promoters upstream of the nucleotide sequences useful in the methods of the present invention can be modified by one or more nucleotide substitution(s), insertion(s) and/or deletion(s) without interfering with the functionality or activity of either the promoters, the open reading frame (ORF) or the 3'-regulatory region such as terminators or other 3' regulatory regions which are located away from the ORF. It is furthermore possible that the activity of the promoters is increased by modification of their sequence, or that they are replaced completely by more active promoters, even promoters from heterologous organisms. For expression in plants, the nucleic acid molecule must, as described above, be linked operably to or comprise a suitable promoter which expresses the gene at the right point in time and with the required spatial expression pattern.

[0248] For the identification of functionally equivalent promoters, the promoter strength and/or expression pattern of a candidate promoter may be analysed for example by operably linking the promoter to a reporter gene and assaying the expression level and pattern of the reporter gene in various tissues of the plant. Suitable well-known reporter genes include for example beta-glucuronidase or beta-galactosidase. The promoter activity is assayed by measuring the enzymatic activity of the beta-glucuronidase or beta-galactosidase. The promoter strength and/or expression pattern may then be compared to that of a reference promoter (such as the one used in the methods of the present invention). Alternatively, promoter strength may be assayed by quantifying mRNA levels or by comparing mRNA levels of the nucleic acid used in the methods of the present invention, with mRNA levels of housekeeping genes such as 18S rRNA, using methods known in the art, such as Northern blotting with densitometric analysis of autoradiograms, quantitative real-time PCR or RT-PCR (Heid et al., 1996 Genome Research 6: 986-994). Generally, by "weak promoter" is intended a promoter that drives expression of a coding sequence at a low level. By "low level" is intended at levels of about 1/10,000 transcripts to about 1/100,000 transcripts, to about 1/500,000 transcripts per cell. Conversely, a "strong promoter" drives expression of a coding sequence at high level, or at about 1/10 transcripts to about 1/100 transcripts to about 1/1000 transcripts per cell. Generally, by "medium strength promoter" is intended a promoter that drives expression of a coding sequence at a lower level than a strong promoter, in particular at a level that is in all instances below that obtained when under the control of a 35S CaMV promoter.
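The numeric ranges given above for weak and strong promoters may be illustrated with the following Python sketch. It is a simplification under stated assumptions: the function name is hypothetical, the ranges are treated as open-ended (any level at or above about 1/1000 is called strong, any level at or below about 1/10,000 is called weak), and levels between the two ranges are reported as unclassified, because a medium strength promoter is defined above relative to the 35S CaMV promoter and not by a number.

```python
# Boundaries taken from the ranges stated in the text.
STRONG_MIN_FRACTION = 1 / 1000   # lowest level of the "strong" range
WEAK_MAX_FRACTION = 1 / 10000    # highest level of the "weak" range


def classify_promoter(transcript_fraction: float) -> str:
    """Classify a promoter by the transcript level it drives.

    transcript_fraction -- level expressed as a fraction of transcripts,
                           e.g. 1/100 for "1/100 transcripts"
    """
    if transcript_fraction <= 0:
        raise ValueError("transcript fraction must be positive")
    if transcript_fraction >= STRONG_MIN_FRACTION:
        return "strong"
    if transcript_fraction <= WEAK_MAX_FRACTION:
        return "weak"
    return "unclassified"
```

For example, a level of 1/100 is classified as strong and a level of 1/50,000 as weak, whereas 1/5000 falls between the stated ranges.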

Operably Linked

[0249] The term "operably linked" as used herein refers to a functional linkage between the promoter sequence and the gene of interest, such that the promoter sequence is able to initiate transcription of the gene of interest.

Constitutive Promoter

[0250] A "constitutive promoter" refers to a promoter that is transcriptionally active during most, but not necessarily all, phases of growth and development and under most environmental conditions, in at least one cell, tissue or organ. Table 2a below gives examples of constitutive promoters.

TABLE 2a. Examples of constitutive promoters

Gene Source | Reference
Actin | McElroy et al, Plant Cell, 2: 163-171, 1990
HMGP | WO 2004/070039
CAMV 35S | Odell et al, Nature, 313: 810-812, 1985
CaMV 19S | Nilsson et al., Physiol. Plant. 100: 456-462, 1997
GOS2 | de Pater et al, Plant J Nov; 2(6): 837-44, 1992, WO 2004/065596
Ubiquitin | Christensen et al, Plant Mol. Biol. 18: 675-689, 1992
Rice cyclophilin | Buchholz et al, Plant Mol Biol. 25(5): 837-43, 1994
Maize H3 histone | Lepetit et al, Mol. Gen. Genet. 231: 276-285, 1992
Alfalfa H3 histone | Wu et al. Plant Mol. Biol. 11: 641-649, 1988
Actin 2 | An et al, Plant J. 10(1); 107-121, 1996
34S FMV | Sanger et al., Plant. Mol. Biol., 14, 1990: 433-443
Rubisco small subunit | U.S. Pat. No. 4,962,028
OCS | Leisner (1988) Proc Natl Acad Sci USA 85(5): 2553
SAD1 | Jain et al., Crop Science, 39 (6), 1999: 1696
SAD2 | Jain et al., Crop Science, 39 (6), 1999: 1696
nos | Shaw et al. (1984) Nucleic Acids Res. 12(20): 7831-7846
V-ATPase | WO 01/14572
Super promoter | WO 95/14098
G-box proteins | WO 94/12015

Ubiquitous Promoter

[0251] A "ubiquitous promoter" is active in substantially all tissues or cells of an organism.

Developmentally-Regulated Promoter

[0252] A "developmentally-regulated promoter" is active during certain developmental stages or in parts of the plant that undergo developmental changes.

Inducible Promoter

[0253] An "inducible promoter" has induced or increased transcription initiation in response to a chemical (for a review see Gatz 1997, Annu. Rev. Plant Physiol. Plant Mol. Biol., 48:89-108), environmental or physical stimulus, or may be "stress-inducible", i.e. activated when a plant is exposed to various stress conditions, or "pathogen-inducible", i.e. activated when a plant is exposed to various pathogens.

Organ-Specific/Tissue-Specific Promoter

[0254] An "organ-specific" or "tissue-specific promoter" is one that is capable of preferentially initiating transcription in certain organs or tissues, such as the leaves, roots, seed tissue etc. For example, a "root-specific promoter" is a promoter that is transcriptionally active predominantly in plant roots, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Promoters able to initiate transcription in certain cells only are referred to herein as "cell-specific".

[0255] Examples of root-specific promoters are listed in Table 2b below:

TABLE 2b. Examples of root-specific promoters

Gene Source | Reference
RCc3 | Plant Mol Biol. 1995 January; 27(2): 237-48
Arabidopsis PHT1 | Koyama et al. J Biosci Bioeng. 2005 January; 99(1): 38-42.; Mudge et al. (2002, Plant J. 31: 341)
Medicago phosphate transporter | Xiao et al., 2006, Plant Biol (Stuttg). 2006 July; 8(4): 439-49
Arabidopsis Pyk10 | Nitz et al. (2001) Plant Sci 161(2): 337-346
root-expressible genes | Tingey et al., EMBO J. 6: 1, 1987.
tobacco auxin-inducible gene | Van der Zaal et al., Plant Mol. Biol. 16, 983, 1991.
β-tubulin | Oppenheimer, et al., Gene 63: 87, 1988.
tobacco root-specific genes | Conkling, et al., Plant Physiol. 93: 1203, 1990.
B. napus G1-3b gene | U.S. Pat. No. 5,401,836
SbPRP1 | Suzuki et al., Plant Mol. Biol. 21: 109-119, 1993.
LRX1 | Baumberger et al. 2001, Genes & Dev. 15: 1128
BTG-26 Brassica napus | US 20050044585
LeAMT1 (tomato) | Lauter et al. (1996, PNAS 3: 8139)
The LeNRT1-1 (tomato) | Lauter et al. (1996, PNAS 3: 8139)
class I patatin gene (potato) | Liu et al., Plant Mol. Biol. 7 (6): 1139-1154
KDC1 (Daucus carota) | Downey et al. (2000, J. Biol. Chem. 275: 39420)
TobRB7 gene | W Song (1997) PhD Thesis, North Carolina State University, Raleigh, NC USA
OsRAB5a (rice) | Wang et al. 2002, Plant Sci. 163: 273
ALF5 (Arabidopsis) | Diener et al. (2001, Plant Cell 13: 1625)
NRT2; 1Np (N. plumbaginifolia) | Quesada et al. (1997, Plant Mol. Biol. 34: 265)

[0256] A "seed-specific promoter" is transcriptionally active predominantly in seed tissue, but not necessarily exclusively in seed tissue (in cases of leaky expression). The seed-specific promoter may be active during seed development and/or during germination. The seed specific promoter may be endosperm/aleurone/embryo specific. Examples of seed-specific promoters (endosperm/aleurone/embryo specific) are shown in Table 2c to Table 2f below. Further examples of seed-specific promoters are given in Qing Qu and Takaiwa (Plant Biotechnol. J. 2, 113-125, 2004), which disclosure is incorporated by reference herein as if fully set forth.

TABLE 2c. Examples of seed-specific promoters

Gene source | Reference
seed-specific genes | Simon et al., Plant Mol. Biol. 5: 191, 1985; Scofield et al., J. Biol. Chem. 262: 12202, 1987.; Baszczynski et al., Plant Mol. Biol. 14: 633, 1990.
Brazil Nut albumin | Pearson et al., Plant Mol. Biol. 18: 235-245, 1992.
legumin | Ellis et al., Plant Mol. Biol. 10: 203-214, 1988.
glutelin (rice) | Takaiwa et al., Mol. Gen. Genet. 208: 15-22, 1986; Takaiwa et al., FEBS Letts. 221: 43-47, 1987.
zein | Matzke et al Plant Mol Biol, 14(3): 323-32 1990
napA | Stalberg et al, Planta 199: 515-519, 1996.
wheat LMW and HMW glutenin-1 | Mol Gen Genet 216: 81-90, 1989; NAR 17: 461-2, 1989
wheat SPA | Albani et al, Plant Cell, 9: 171-184, 1997
wheat α, β, γ-gliadins | EMBO J. 3: 1409-15, 1984
barley Itr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Theor Appl Gen 98: 1253-62, 1999; Plant J 4: 343-55, 1993; Mol Gen Genet 250: 750-60, 1996
barley DOF | Mena et al, The Plant Journal, 116(1): 53-62, 1998
blz2 | EP99106056.7
synthetic promoter | Vicente-Carbajosa et al., Plant J. 13: 629-640, 1998.
rice prolamin NRP33 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice α-globulin Glb-1 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice OSH1 | Sato et al, Proc. Natl. Acad. Sci. USA, 93: 8117-8122, 1996
rice α-globulin REB/OHP-1 | Nakase et al. Plant Mol. Biol. 33: 513-522, 1997
rice ADP-glucose pyrophosphorylase | Trans Res 6: 157-68, 1997
maize ESR gene family | Plant J 12: 235-46, 1997
sorghum α-kafirin | DeRose et al., Plant Mol. Biol 32: 1029-35, 1996
KNOX | Postma-Haarsma et al, Plant Mol. Biol. 39: 257-71, 1999
rice oleosin | Wu et al, J. Biochem. 123: 386, 1998
sunflower oleosin | Cummins et al., Plant Mol. Biol. 19: 873-876, 1992
PRO0117, putative rice 40S ribosomal protein | WO 2004/070039
PRO0136, rice alanine aminotransferase | unpublished
PRO0147, trypsin inhibitor ITR1 (barley) | unpublished
PRO0151, rice WSI18 | WO 2004/070039
PRO0175, rice RAB21 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039
α-amylase (Amy32b) | Lanahan et al, Plant Cell 4: 203-211, 1992; Skriver et al, Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al, Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149; 1125-38, 1998

TABLE 2d Examples of endosperm-specific promoters

Gene source | Reference
glutelin (rice) | Takaiwa et al. (1986) Mol Gen Genet 208: 15-22; Takaiwa et al. (1987) FEBS Letts. 221: 43-47
zein | Matzke et al. (1990) Plant Mol Biol 14(3): 323-32
wheat LMW and HMW glutenin-1 | Colot et al. (1989) Mol Gen Genet 216: 81-90; Anderson et al. (1989) NAR 17: 461-2
wheat SPA | Albani et al. (1997) Plant Cell 9: 171-184
wheat gliadins | Rafalski et al. (1984) EMBO 3: 1409-15
barley Itr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Cho et al. (1999) Theor Appl Genet 98: 1253-62; Muller et al. (1993) Plant J 4: 343-55; Sorenson et al. (1996) Mol Gen Genet 250: 750-60
barley DOF | Mena et al. (1998) Plant J 116(1): 53-62
blz2 | Onate et al. (1999) J Biol Chem 274(14): 9175-82
synthetic promoter | Vicente-Carbajosa et al. (1998) Plant J 13: 629-640
rice prolamin NRP33 | Wu et al. (1998) Plant Cell Physiol 39(8) 885-889
rice globulin Glb-1 | Wu et al. (1998) Plant Cell Physiol 39(8) 885-889
rice globulin REB/OHP-1 | Nakase et al. (1997) Plant Molec Biol 33: 513-522
rice ADP-glucose pyrophosphorylase | Russell et al. (1997) Trans Res 6: 157-68
maize ESR gene family | Opsahl-Ferstad et al. (1997) Plant J 12: 235-46
sorghum kafirin | DeRose et al. (1996) Plant Mol Biol 32: 1029-35

TABLE 2e Examples of embryo-specific promoters

Gene source | Reference
rice OSH1 | Sato et al., Proc. Natl. Acad. Sci. USA 93: 8117-8122, 1996
KNOX | Postma-Haarsma et al., Plant Mol. Biol. 39: 257-71, 1999
PRO0151 | WO 2004/070039
PRO0175 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039

TABLE 2f Examples of aleurone-specific promoters

Gene source | Reference
α-amylase (Amy32b) | Lanahan et al., Plant Cell 4: 203-211, 1992; Skriver et al., Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al., Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149: 1125-38, 1998

[0257] A "green tissue-specific promoter" as defined herein is a promoter that is transcriptionally active predominantly in green tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts.

[0258] Examples of green tissue-specific promoters which may be used to perform the methods of the invention are shown in Table 2g below.

TABLE 2g Examples of green tissue-specific promoters

Gene | Expression | Reference
Maize Orthophosphate dikinase | Leaf specific | Fukayama et al., Plant Physiol. 2001 November; 127(3): 1136-46
Maize Phosphoenolpyruvate carboxylase | Leaf specific | Kausch et al., Plant Mol Biol. 2001 January; 45(1): 1-15
Rice Phosphoenolpyruvate carboxylase | Leaf specific | Lin et al., DNA Seq. 2004 August; 15(4): 269-76
Rice small subunit Rubisco | Leaf specific | Nomura et al., Plant Mol Biol. 2000 September; 44(1): 99-106
rice beta expansin EXBP9 | Shoot specific | WO 2004/070039
Pigeonpea small subunit Rubisco | Leaf specific | Panguluri et al., Indian J Exp Biol. 2005 April; 43(4): 369-72
Pea RBCS3A | Leaf specific | (no reference given)

[0259] Another example of a tissue-specific promoter is a meristem-specific promoter, which is transcriptionally active predominantly in meristematic tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Examples of meristem-specific promoters which may be used to perform the methods of the invention are shown in Table 2h below.

TABLE 2h Examples of meristem-specific promoters

Gene source | Expression pattern | Reference
rice OSH1 | Shoot apical meristem, from embryo globular stage to seedling stage | Sato et al. (1996) Proc. Natl. Acad. Sci. USA, 93: 8117-8122
Rice metallothionein | Meristem specific | BAD87835.1
WAK1 & WAK 2 | Shoot and root apical meristems, and in expanding leaves and sepals | Wagner & Kohorn (2001) Plant Cell 13(2): 303-318

Terminator

[0260] The term "terminator" encompasses a control sequence which is a DNA sequence at the end of a transcriptional unit which signals 3' processing and polyadenylation of a primary transcript and termination of transcription. The terminator can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The terminator to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.

Selectable Marker (Gene)/Reporter Gene

[0261] "Selectable marker", "selectable marker gene" or "reporter gene" includes any gene that confers a phenotype on a cell in which it is expressed to facilitate the identification and/or selection of cells that are transfected or transformed with a nucleic acid construct of the invention. These marker genes enable the identification of a successful transfer of the nucleic acid molecules via a series of different principles. Suitable markers may be selected from markers that confer antibiotic or herbicide resistance, that introduce a new metabolic trait or that allow visual selection. Examples of selectable marker genes include genes conferring resistance to antibiotics (such as nptII that phosphorylates neomycin and kanamycin, or hpt, phosphorylating hygromycin, or genes conferring resistance to, for example, bleomycin, streptomycin, tetracycline, chloramphenicol, ampicillin, gentamycin, geneticin (G418), spectinomycin or blasticidin), to herbicides (for example bar which provides resistance to Basta®; aroA or gox providing resistance against glyphosate, or the genes conferring resistance to, for example, imidazolinone, phosphinothricin or sulfonylurea), or genes that provide a metabolic trait (such as manA that allows plants to use mannose as sole carbon source or xylose isomerase for the utilisation of xylose, or antinutritive markers such as the resistance to 2-deoxyglucose). Expression of visual marker genes results in the formation of colour (for example β-glucuronidase, GUS or β-galactosidase with its coloured substrates, for example X-Gal), luminescence (such as the luciferin/luciferase system) or fluorescence (Green Fluorescent Protein, GFP, and derivatives thereof). This list represents only a small number of possible markers. The skilled worker is familiar with such markers. Different markers are preferred, depending on the organism and the selection method.

[0262] It is known that upon stable or transient integration of nucleic acids into plant cells, only a minority of the cells takes up the foreign DNA and, if desired, integrates it into its genome, depending on the expression vector used and the transfection technique used. To identify and select these integrants, a gene coding for a selectable marker (such as the ones described above) is usually introduced into the host cells together with the gene of interest. These markers can for example be used in mutants in which these genes are not functional by, for example, deletion by conventional methods. Furthermore, nucleic acid molecules encoding a selectable marker can be introduced into a host cell on the same vector that comprises the sequence encoding the polypeptides of the invention or used in the methods of the invention, or else in a separate vector. Cells which have been stably transfected with the introduced nucleic acid can be identified for example by selection (for example, cells which have integrated the selectable marker survive whereas the other cells die).

[0263] Since the marker genes, particularly genes for resistance to antibiotics and herbicides, are no longer required or are undesired in the transgenic host cell once the nucleic acids have been introduced successfully, the process according to the invention for introducing the nucleic acids advantageously employs techniques which enable the removal or excision of these marker genes. One such method is what is known as co-transformation. The co-transformation method employs two vectors simultaneously for the transformation, one vector bearing the nucleic acid according to the invention and a second bearing the marker gene(s). A large proportion of transformants receives or, in the case of plants, comprises (up to 40% or more of the transformants), both vectors. In case of transformation with Agrobacteria, the transformants usually receive only a part of the vector, i.e. the sequence flanked by the T-DNA, which usually represents the expression cassette. The marker genes can subsequently be removed from the transformed plant by performing crosses. In another method, marker genes integrated into a transposon are used for the transformation together with the desired nucleic acid (known as the Ac/Ds technology). The transformants can be crossed with a transposase source or the transformants are transformed with a nucleic acid construct conferring expression of a transposase, transiently or stably. In some cases (approx. 10%), the transposon jumps out of the genome of the host cell once transformation has taken place successfully and is lost. In a further number of cases, the transposon jumps to a different location. In these cases the marker gene must be eliminated by performing crosses. In microbiology, techniques were developed which make possible, or facilitate, the detection of such events. A further advantageous method relies on what is known as recombination systems, whose advantage is that elimination by crossing can be dispensed with.
The best-known system of this type is what is known as the Cre/lox system. Cre1 is a recombinase that removes the sequences located between the loxP sequences. If the marker gene is integrated between the loxP sequences, it is removed once transformation has taken place successfully, by expression of the recombinase. Further recombination systems are the HIN/HIX, FLP/FRT and REP/STB system (Tribble et al., J. Biol. Chem., 275, 2000: 22255-22267; Velmurugan et al., J. Cell Biol., 149, 2000: 553-566). A site-specific integration into the plant genome of the nucleic acid sequences according to the invention is possible. Naturally, these methods can also be applied to microorganisms such as yeast, fungi or bacteria.
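The recombinase-based marker removal described above can be illustrated with a toy model. The following Python sketch is not part of the disclosure: it treats a cassette as an ordered list of named elements, and the element names ("LB", "nptII", "GOI", "RB") and the function name are hypothetical placeholders. It assumes two recognition sites in the same orientation and does not model real sequences or site orientation.

```python
# Toy model of marker excision by a site-specific recombinase (illustrative only).
# Element names are placeholders, not real sequences; site orientation is not modelled.

def excise_between_sites(cassette, site="loxP"):
    """Remove everything between the first two occurrences of `site`,
    leaving a single copy of the site behind."""
    positions = [i for i, element in enumerate(cassette) if element == site]
    if len(positions) < 2:
        # Fewer than two recognition sites: nothing can be excised.
        return list(cassette)
    first, second = positions[0], positions[1]
    return list(cassette[:first + 1]) + list(cassette[second + 1:])


# Hypothetical T-DNA layout: marker gene flanked by two sites, followed by the gene of interest.
t_dna = ["LB", "loxP", "promoter", "nptII", "terminator", "loxP",
         "promoter", "GOI", "terminator", "RB"]

print(excise_between_sites(t_dna))
```

In this sketch the marker unit between the two sites is dropped while the gene of interest is retained, which mirrors the outcome described for the recombination systems.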

Transgenic/Transgene/Recombinant

[0264] For the purposes of the invention, "transgenic", "transgene" or "recombinant" means with regard to, for example, a nucleic acid sequence, an expression cassette, gene construct or a vector comprising the nucleic acid sequence or an organism transformed with the nucleic acid sequences, expression cassettes or vectors according to the invention, all those constructions brought about by recombinant methods in which either

[0265] (a) the nucleic acid sequences encoding proteins useful in the methods of the invention, or

[0266] (b) genetic control sequence(s) which is operably linked with the nucleic acid sequence according to the invention, for example a promoter, or

[0267] (c) a) and b) are not located in their natural genetic environment or have been modified by recombinant methods, it being possible for the modification to take the form of, for example, a substitution, addition, deletion, inversion or insertion of one or more nucleotide residues. The natural genetic environment is understood as meaning the natural genomic or chromosomal locus in the original plant or the presence in a genomic library. In the case of a genomic library, the natural genetic environment of the nucleic acid sequence is preferably retained, at least in part. The environment flanks the nucleic acid sequence at least on one side and has a sequence length of at least 50 bp, preferably at least 500 bp, especially preferably at least 1000 bp, most preferably at least 5000 bp. A naturally occurring expression cassette--for example the naturally occurring combination of the natural promoter of the nucleic acid sequences with the corresponding nucleic acid sequence encoding a polypeptide useful in the methods of the present invention, as defined above--becomes a transgenic expression cassette when this expression cassette is modified by non-natural, synthetic ("artificial") methods such as, for example, mutagenic treatment. Suitable methods are described, for example, in U.S. Pat. No. 5,565,350 or WO 00/15815.

[0268] A transgenic plant for the purposes of the invention is thus understood as meaning, as above, that the nucleic acids used in the method of the invention are not present in, or originating from, the genome of said plant, or are present in the genome of said plant but not at their natural locus in the genome of said plant, it being possible for the nucleic acids to be expressed homologously or heterologously. However, as mentioned, transgenic also means that, while the nucleic acids according to the invention or used in the inventive method are at their natural position in the genome of a plant, the sequence has been modified with regard to the natural sequence, and/or that the regulatory sequences of the natural sequences have been modified. Transgenic is preferably understood as meaning the expression of the nucleic acids according to the invention at an unnatural locus in the genome, i.e. homologous or, preferably, heterologous expression of the nucleic acids takes place. Preferred transgenic plants are mentioned herein.

[0269] It shall further be noted that in the context of the present invention, the term "isolated nucleic acid" or "isolated polypeptide" may in some instances be considered as a synonym for a "recombinant nucleic acid" or a "recombinant polypeptide", respectively and refers to a nucleic acid or polypeptide that is not located in its natural genetic environment and/or that has been modified by recombinant methods.

Modulation

[0270] The term "modulation" means, in relation to expression or gene expression, a process in which the expression level is changed by said gene expression in comparison to the control plant; the expression level may be increased or decreased. The original, unmodulated expression may be of any kind of expression of a structural RNA (rRNA, tRNA) or mRNA with subsequent translation. For the purposes of this invention, the original unmodulated expression may also be absence of any expression. The term "modulating the activity" shall mean any change of the expression of the inventive nucleic acid sequences or encoded proteins, which leads to increased yield and/or increased growth of the plants. The expression can increase from zero (absence of, or immeasurable expression) to a certain amount, or can decrease from a certain amount to immeasurably small amounts or zero.

Expression

[0271] The term "expression" or "gene expression" means the transcription of a specific gene or specific genes or specific genetic construct. The term "expression" or "gene expression" in particular means the transcription of a gene or genes or genetic construct into structural RNA (rRNA, tRNA) or mRNA with or without subsequent translation of the latter into a protein. The process includes transcription of DNA and processing of the resulting mRNA product.

Increased Expression/Overexpression

[0272] The term "increased expression" or "overexpression" as used herein means any form of expression that is additional to the original wild-type expression level. For the purposes of this invention, the original wild-type expression level might also be zero, i.e. absence of expression or immeasurable expression.

[0273] Methods for increasing expression of genes or gene products are well documented in the art and include, for example, overexpression driven by appropriate promoters, the use of transcription enhancers or translation enhancers. Isolated nucleic acids which serve as promoter or enhancer elements may be introduced in an appropriate position (typically upstream) of a non-heterologous form of a polynucleotide so as to upregulate expression of a nucleic acid encoding the polypeptide of interest. For example, endogenous promoters may be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., WO9322443), or isolated promoters may be introduced into a plant cell in the proper orientation and distance from a gene of the present invention so as to control the expression of the gene.

[0274] If polypeptide expression is desired, it is generally desirable to include a polyadenylation region at the 3'-end of a polynucleotide coding region. The polyadenylation region can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The 3' end sequence to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.

[0275] An intron sequence may also be added to the 5' untranslated region (UTR) or the coding sequence of the partial coding sequence to increase the amount of the mature message that accumulates in the cytosol. Inclusion of a spliceable intron in the transcription unit in both plant and animal expression constructs has been shown to increase gene expression at both the mRNA and protein levels up to 1000-fold (Buchman and Berg (1988) Mol. Cell. Biol. 8: 4395-4405; Callis et al. (1987) Genes Dev 1:1183-1200). Such intron enhancement of gene expression is typically greatest when the intron is placed near the 5' end of the transcription unit. Use of the maize introns Adh1-S intron 1, 2, and 6, and the Bronze-1 intron is known in the art. For general information see: The Maize Handbook, Chapter 116, Freeling and Walbot, Eds., Springer, N.Y. (1994).

Decreased Expression

[0276] Reference herein to "decreased expression" or "reduction or substantial elimination" of expression is taken to mean a decrease in endogenous gene expression and/or polypeptide levels and/or polypeptide activity relative to control plants. The reduction or substantial elimination is in increasing order of preference at least 10%, 20%, 30%, 40% or 50%, 60%, 70%, 80%, 85%, 90%, or 95%, 96%, 97%, 98%, 99% or more reduced compared to that of control plants.
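As a worked illustration of the percentages listed above, the following Python sketch computes the reduction of an expression level relative to a control and reports the highest listed threshold that is met. The function names and the example measurements are assumptions made for illustration; the text does not prescribe how expression levels are measured.

```python
# Illustrative arithmetic only; measurement values are hypothetical.

# Thresholds listed in the text, in increasing order of preference.
THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 85, 90, 95, 96, 97, 98, 99)


def percent_reduction(control_level, test_level):
    """Percentage by which test_level is reduced relative to control_level."""
    if control_level <= 0:
        raise ValueError("control expression level must be positive")
    return 100.0 * (control_level - test_level) / control_level


def highest_threshold_met(reduction):
    """Return the largest listed threshold that the reduction reaches, or None."""
    met = [t for t in THRESHOLDS if reduction >= t]
    return max(met) if met else None


# Hypothetical example: control plants at 200 units, silenced plants at 30 units.
reduction = percent_reduction(200.0, 30.0)
print(reduction, highest_threshold_met(reduction))
```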

[0277] For the reduction or substantial elimination of expression of an endogenous gene in a plant, a sufficient length of substantially contiguous nucleotides of a nucleic acid sequence is required. In order to perform gene silencing, this may be as little as 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10 or fewer nucleotides, alternatively this may be as much as the entire gene (including the 5' and/or 3' UTR, either in part or in whole). The stretch of substantially contiguous nucleotides may be derived from the nucleic acid encoding the protein of interest (target gene), or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest. Preferably, the stretch of substantially contiguous nucleotides is capable of forming hydrogen bonds with the target gene (either sense or antisense strand), more preferably, the stretch of substantially contiguous nucleotides has, in increasing order of preference, 50%, 60%, 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 100% sequence identity to the target gene (either sense or antisense strand). A nucleic acid sequence encoding a (functional) polypeptide is not a requirement for the various methods discussed herein for the reduction or substantial elimination of expression of an endogenous gene.
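The sequence identity between a stretch of contiguous nucleotides and a target gene can be estimated in many ways. The following Python sketch is a simplified illustration, assuming an ungapped comparison of the fragment against every window of the same length in the target; real comparisons would normally use an alignment program. The function names and the sequences are hypothetical.

```python
# Simplified, ungapped identity calculation (illustrative; no alignment gaps are considered).

def percent_identity(fragment, target_window):
    """Percentage of identical positions between two equal-length sequences."""
    if len(fragment) != len(target_window):
        raise ValueError("sequences must have equal length")
    matches = sum(a == b for a, b in zip(fragment.upper(), target_window.upper()))
    return 100.0 * matches / len(fragment)


def best_ungapped_identity(fragment, target):
    """Slide the fragment along the target and return the best percent identity."""
    n = len(fragment)
    if n == 0 or n > len(target):
        raise ValueError("fragment must be non-empty and no longer than the target")
    return max(percent_identity(fragment, target[i:i + n])
               for i in range(len(target) - n + 1))


# Hypothetical sequences: a 10-nucleotide stretch with one mismatch to the target.
target_gene = "ATGGCTAGCTTGACCGATCG"
print(best_ungapped_identity("GCTAGCATGA", target_gene))
```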

[0278] This reduction or substantial elimination of expression may be achieved using routine tools and techniques. A preferred method for the reduction or substantial elimination of endogenous gene expression is by introducing and expressing in a plant a genetic construct into which the nucleic acid (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) is cloned as an inverted repeat (in part or completely), separated by a spacer (non-coding DNA).

[0279] In such a preferred method, expression of the endogenous gene is reduced or substantially eliminated through RNA-mediated silencing using an inverted repeat of a nucleic acid or a part thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), preferably capable of forming a hairpin structure. The inverted repeat is cloned in an expression vector comprising control sequences. A non-coding DNA nucleic acid sequence (a spacer, for example a matrix attachment region fragment (MAR), an intron, a polylinker, etc.) is located between the two inverted nucleic acids forming the inverted repeat. After transcription of the inverted repeat, a chimeric RNA with a self-complementary structure is formed (partial or complete). This double-stranded RNA structure is referred to as the hairpin RNA (hpRNA). The hpRNA is processed by the plant into siRNAs that are incorporated into an RNA-induced silencing complex (RISC). The RISC further cleaves the mRNA transcripts, thereby substantially reducing the number of mRNA transcripts to be translated into polypeptides. For further general details see, for example, Grierson et al. (1998) WO 98/53083; Waterhouse et al. (1999) WO 99/53050.
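The layout of an inverted repeat separated by a spacer can be sketched in a few lines of Python. This is a minimal illustration, not a cloning protocol: the arm and spacer sequences are hypothetical placeholders, and the check only confirms that the two arms are reverse complements of each other, which is the property that allows the transcript to fold back into a hairpin.

```python
# Illustrative construction of a sense arm + spacer + antisense arm insert.
# Sequences are made-up placeholders, not sequences from the disclosure.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_inverted_repeat(arm, spacer):
    """Join the arm, a non-coding spacer, and the reverse complement of the arm."""
    return arm + spacer + reverse_complement(arm)


def arms_can_pair(insert, arm_length):
    """True if the first and last arm_length bases are reverse complements."""
    left = insert[:arm_length]
    right = insert[-arm_length:]
    return reverse_complement(right) == left


arm = "ATGGCTAGCTTGACC"      # hypothetical stretch derived from a target gene
spacer = "GTAAGTTTCTGCAG"    # hypothetical non-coding spacer
insert = build_inverted_repeat(arm, spacer)
print(insert, arms_can_pair(insert, len(arm)))
```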

[0280] Performance of the methods of the invention does not rely on introducing and expressing in a plant a genetic construct into which the nucleic acid is cloned as an inverted repeat, but any one or more of several well-known "gene silencing" methods may be used to achieve the same effects.

[0281] One such method for the reduction of endogenous gene expression is RNA-mediated silencing of gene expression (downregulation). Silencing in this case is triggered in a plant by a double stranded RNA sequence (dsRNA) that is substantially similar to the target endogenous gene. This dsRNA is further processed by the plant into fragments of about 20 to about 26 nucleotides called short interfering RNAs (siRNAs). The siRNAs are incorporated into an RNA-induced silencing complex (RISC) that cleaves the mRNA transcript of the endogenous target gene, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. Preferably, the double stranded RNA sequence corresponds to a target gene.

[0282] Another example of an RNA silencing method involves the introduction of nucleic acid sequences or parts thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) in a sense orientation into a plant. "Sense orientation" refers to a DNA sequence that is homologous to an mRNA transcript thereof. Introduced into a plant would therefore be at least one copy of the nucleic acid sequence. The additional nucleic acid sequence will reduce expression of the endogenous gene, giving rise to a phenomenon known as co-suppression. The reduction of gene expression will be more pronounced if several additional copies of a nucleic acid sequence are introduced into the plant, as there is a positive correlation between high transcript levels and the triggering of co-suppression.

[0283] Another example of an RNA silencing method involves the use of antisense nucleic acid sequences. An "antisense" nucleic acid sequence comprises a nucleotide sequence that is complementary to a "sense" nucleic acid sequence encoding a protein, i.e. complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA transcript sequence. The antisense nucleic acid sequence is preferably complementary to the endogenous gene to be silenced. The complementarity may be located in the "coding region" and/or in the "non-coding region" of a gene. The term "coding region" refers to a region of the nucleotide sequence comprising codons that are translated into amino acid residues. The term "non-coding region" refers to 5' and 3' sequences that flank the coding region that are transcribed but not translated into amino acids (also referred to as 5' and 3' untranslated regions).

[0284] Antisense nucleic acid sequences can be designed according to the rules of Watson and Crick base pairing. The antisense nucleic acid sequence may be complementary to the entire nucleic acid sequence (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), but may also be an oligonucleotide that is antisense to only a part of the nucleic acid sequence (including the mRNA 5' and 3' UTR). For example, the antisense oligonucleotide sequence may be complementary to the region surrounding the translation start site of an mRNA transcript encoding a polypeptide. The length of a suitable antisense oligonucleotide sequence is known in the art and may start from about 50, 45, 40, 35, 30, 25, 20, 15 or 10 nucleotides in length or less. An antisense nucleic acid sequence according to the invention may be constructed using chemical synthesis and enzymatic ligation reactions using methods known in the art. For example, an antisense nucleic acid sequence (e.g., an antisense oligonucleotide sequence) may be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acid sequences, e.g., phosphorothioate derivatives and acridine substituted nucleotides may be used. Examples of modified nucleotides that may be used to generate the antisense nucleic acid sequences are well known in the art. Known nucleotide modifications include methylation, cyclization and `caps` and substitution of one or more of the naturally occurring nucleotides with an analogue such as inosine. Other modifications of nucleotides are well known in the art.
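As an illustration of designing an antisense oligonucleotide by Watson and Crick base pairing, the following Python sketch takes the region surrounding a translation start site in a sense-strand cDNA and returns its reverse complement. The cDNA sequence, the flank length and the function names are assumptions chosen for the example; no chemical modifications are modelled.

```python
# Illustrative antisense oligonucleotide design around a start codon.
# The cDNA below is a made-up sequence, not one from the disclosure.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(cdna, start_codon_index, flank=10):
    """Antisense oligo covering the start codon plus `flank` bases on each side.

    The window is clipped at the ends of the sequence.
    """
    begin = max(0, start_codon_index - flank)
    end = min(len(cdna), start_codon_index + 3 + flank)
    return reverse_complement(cdna[begin:end])


cdna = "GGCACGAGCTCAAGATGGCTTCCACCGTTGCA"  # hypothetical sense-strand cDNA
start = cdna.find("ATG")                    # first ATG taken as the start site (assumption)
print(antisense_oligo(cdna, start, flank=5))
```

The returned oligonucleotide is written 5' to 3' and is complementary to the sense strand across the chosen window.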

[0285] The antisense nucleic acid sequence can be produced biologically using an expression vector into which a nucleic acid sequence has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest). Preferably, production of antisense nucleic acid sequences in plants occurs by means of a stably integrated nucleic acid construct comprising a promoter, an operably linked antisense oligonucleotide, and a terminator.

[0286] The nucleic acid molecules used for silencing in the methods of the invention (whether introduced into a plant or generated in situ) hybridize with or bind to mRNA transcripts and/or genomic DNA encoding a polypeptide to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid sequence which binds to DNA duplexes, through specific interactions in the major groove of the double helix. Antisense nucleic acid sequences may be introduced into a plant by transformation or direct injection at a specific tissue site. Alternatively, antisense nucleic acid sequences can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense nucleic acid sequences can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid sequence to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid sequences can also be delivered to cells using the vectors described herein.

[0287] According to a further aspect, the antisense nucleic acid sequence is an α-anomeric nucleic acid sequence. An α-anomeric nucleic acid sequence forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucl Ac Res 15: 6625-6641). The antisense nucleic acid sequence may also comprise a 2'-O-methylribonucleotide (Inoue et al. (1987) Nucl Ac Res 15, 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215, 327-330).

[0288] The reduction or substantial elimination of endogenous gene expression may also be performed using ribozymes. Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid sequence, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988) Nature 334, 585-591)) can be used to catalytically cleave mRNA transcripts encoding a polypeptide, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. A ribozyme having specificity for a nucleic acid sequence can be designed (see for example: Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742). Alternatively, mRNA transcripts corresponding to a nucleic acid sequence can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules (Bartel and Szostak (1993) Science 261, 1411-1418). The use of ribozymes for gene silencing in plants is known in the art (e.g., Atkins et al. (1994) WO 94/00012; Lenne et al. (1995) WO 95/03404; Lutziger et al. (2000) WO 00/00619; Prinsen et al. (1997) WO 97/13865 and Scott et al. (1997) WO 97/38116).

[0289] Gene silencing may also be achieved by insertion mutagenesis (for example, T-DNA insertion or transposon insertion) or by strategies as described by, among others, Angell and Baulcombe ((1999) Plant J 20(3): 357-62), (Amplicon VIGS WO 98/36083), or Baulcombe (WO 99/15682).

[0290] Gene silencing may also occur if there is a mutation on an endogenous gene and/or a mutation on an isolated gene/nucleic acid subsequently introduced into a plant. The reduction or substantial elimination may be caused by a non-functional polypeptide. For example, the polypeptide may bind to various interacting proteins; one or more mutation(s) and/or truncation(s) may therefore provide for a polypeptide that is still able to bind interacting proteins (such as receptor proteins) but that cannot exhibit its normal function (such as signalling ligand).

[0291] A further approach to gene silencing is by targeting nucleic acid sequences complementary to the regulatory region of the gene (e.g., the promoter and/or enhancers) to form triple helical structures that prevent transcription of the gene in target cells. See Helene, C., Anticancer Drug Des. 6, 569-84, 1991; Helene et al., Ann. N.Y. Acad. Sci. 660, 27-36, 1992; and Maher, L. J., Bioessays 14, 807-15, 1992.

[0292] Other methods, such as the use of antibodies directed to an endogenous polypeptide for inhibiting its function in planta, or interference in the signalling pathway in which a polypeptide is involved, will be well known to the skilled man. In particular, it can be envisaged that manmade molecules may be useful for inhibiting the biological function of a target polypeptide, or for interfering with the signalling pathway in which the target polypeptide is involved.

[0293] Alternatively, a screening program may be set up to identify in a plant population natural variants of a gene, which variants encode polypeptides with reduced activity. Such natural variants may also be used, for example, to perform homologous recombination.

[0294] Artificial and/or natural microRNAs (miRNAs) may be used to knock out gene expression and/or mRNA translation. Endogenous miRNAs are single-stranded small RNAs, typically 19-24 nucleotides long. They function primarily to regulate gene expression and/or mRNA translation. Most plant miRNAs have perfect or near-perfect complementarity with their target sequences. However, there are natural targets with up to five mismatches. They are processed from longer non-coding RNAs with characteristic fold-back structures by double-strand specific RNases of the Dicer family. Upon processing, they are incorporated in the RNA-induced silencing complex (RISC) by binding to its main component, an Argonaute protein. MiRNAs serve as the specificity components of RISC, since they base-pair to target nucleic acids, mostly mRNAs, in the cytoplasm. Subsequent regulatory events include target mRNA cleavage and destruction and/or translational inhibition. Effects of miRNA overexpression are thus often reflected in decreased mRNA levels of target genes.
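By way of illustration only, the notion of perfect or near-perfect complementarity can be sketched as a count of mismatched positions between a small RNA and a candidate target site. The function names and the simplifying choice to treat G:U wobble pairs as mismatches are assumptions of this sketch; it does not reproduce any published target-prediction rules.

```python
# Watson-Crick partners for RNA bases. Assumption of this sketch: G:U wobble
# pairs are simply counted as mismatches.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def _as_rna(seq: str) -> str:
    """Upper-case a sequence and write any DNA 'T' as RNA 'U'."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq: str) -> str:
    """Return the perfectly complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(_as_rna(seq)))


def count_mismatches(mirna: str, target_site: str) -> int:
    """Count positions at which a miRNA fails to pair with a target site.

    Both sequences are written 5'->3'. The strands pair antiparallel, so the
    target site is read in reverse while walking along the miRNA.
    """
    mirna, target_site = _as_rna(mirna), _as_rna(target_site)
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    return sum(
        COMPLEMENT[m] != t for m, t in zip(mirna, reversed(target_site))
    )
```

For a hypothetical 21-nucleotide sequence, `count_mismatches(seq, reverse_complement(seq))` is zero by construction; a candidate site could then be compared against a user-chosen mismatch limit.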

[0295] Artificial microRNAs (amiRNAs), which are typically 21 nucleotides in length, can be genetically engineered specifically to negatively regulate gene expression of single or multiple genes of interest. Determinants of plant microRNA target selection are well known in the art. Empirical parameters for target recognition have been defined and can be used to aid in the design of specific amiRNAs (Schwab et al., Dev. Cell 8, 517-527, 2005). Convenient tools for design and generation of amiRNAs and their precursors are also available to the public (Schwab et al., Plant Cell 18, 1121-1133, 2006).

[0296] For optimal performance, the gene silencing techniques used for reducing expression in a plant of an endogenous gene require the use of nucleic acid sequences from monocotyledonous plants for transformation of monocotyledonous plants, and from dicotyledonous plants for transformation of dicotyledonous plants. Preferably, a nucleic acid sequence from any given plant species is introduced into that same species. For example, a nucleic acid sequence from rice is transformed into a rice plant. However, it is not an absolute requirement that the nucleic acid sequence to be introduced originates from the same plant species as the plant in which it will be introduced. It is sufficient that there is substantial homology between the endogenous target gene and the nucleic acid to be introduced.

[0297] Described above are examples of various methods for the reduction or substantial elimination of expression in a plant of an endogenous gene. A person skilled in the art would readily be able to adapt the aforementioned methods for silencing so as to achieve reduction of expression of an endogenous gene in a whole plant or in parts thereof through the use of an appropriate promoter, for example.

Transformation

[0298] The term "introduction" or "transformation" as referred to herein encompasses the transfer of an exogenous polynucleotide into a host cell, irrespective of the method used for transfer. Plant tissue capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may be transformed with a genetic construct of the present invention and a whole plant regenerated therefrom. The particular tissue chosen will vary depending on the clonal propagation systems available for, and best suited to, the particular species being transformed. Exemplary tissue targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue, existing meristematic tissue (e.g., apical meristem, axillary buds, and root meristems), and induced meristem tissue (e.g., cotyledon meristem and hypocotyl meristem). The polynucleotide may be transiently or stably introduced into a host cell and may be maintained non-integrated, for example, as a plasmid. Alternatively, it may be integrated into the host genome. The resulting transformed plant cell may then be used to regenerate a transformed plant in a manner known to persons skilled in the art. Alternatively, a plant cell that cannot be regenerated into a plant may be chosen as host cell, i.e. the resulting transformed plant cell does not have the capacity to regenerate into a (whole) plant.

[0299] The transfer of foreign genes into the genome of a plant is called transformation. Transformation of plant species is now a fairly routine technique. Advantageously, any of several transformation methods may be used to introduce the gene of interest into a suitable ancestor cell. The methods described for the transformation and regeneration of plants from plant tissues or plant cells may be utilized for transient or for stable transformation. Transformation methods include the use of liposomes, electroporation, chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle gun bombardment, transformation using viruses or pollen and microprojection. Methods may be selected from the calcium/polyethylene glycol method for protoplasts (Krens, F. A. et al., (1982) Nature 296, 72-74; Negrutiu I et al. (1987) Plant Mol Biol 8: 363-373); electroporation of protoplasts (Shillito R. D. et al. (1985) Bio/Technol 3, 1099-1102); microinjection into plant material (Crossway A et al., (1986) Mol. Gen. Genet. 202: 179-185); DNA or RNA-coated particle bombardment (Klein T M et al., (1987) Nature 327: 70); infection with (non-integrative) viruses and the like. Transgenic plants, including transgenic crop plants, are preferably produced via Agrobacterium-mediated transformation. An advantageous transformation method is the transformation in planta. To this end, it is possible, for example, to allow the agrobacteria to act on plant seeds or to inoculate the plant meristem with agrobacteria. It has proved particularly expedient in accordance with the invention to allow a suspension of transformed agrobacteria to act on the intact plant or at least on the flower primordia. The plant is subsequently grown on until the seeds of the treated plant are obtained (Clough and Bent, Plant J. (1998) 16, 735-743).
Methods for Agrobacterium-mediated transformation of rice include well known methods for rice transformation, such as those described in any of the following: European patent application EP 1198985 A1, Aldemita and Hodges (Planta 199: 612-617, 1996); Chan et al. (Plant Mol Biol 22 (3): 491-506, 1993), Hiei et al. (Plant J 6 (2): 271-282, 1994), which disclosures are incorporated by reference herein as if fully set forth. In the case of corn transformation, the preferred method is as described in either Ishida et al. (Nat. Biotechnol 14(6): 745-50, 1996) or Frame et al. (Plant Physiol 129(1): 13-22, 2002), which disclosures are incorporated by reference herein as if fully set forth. Said methods are further described by way of example in B. Jenes et al., Techniques for Gene Transfer, in: Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press (1993) 128-143 and in Potrykus Annu. Rev. Plant Physiol. Plant Molec. Biol. 42 (1991) 205-225. The nucleic acids or the construct to be expressed is preferably cloned into a vector, which is suitable for transforming Agrobacterium tumefaciens, for example pBin19 (Bevan et al., Nucl. Acids Res. 12 (1984) 8711). Agrobacteria transformed by such a vector can then be used in known manner for the transformation of plants, such as plants used as a model, like Arabidopsis (Arabidopsis thaliana is within the scope of the present invention not considered as a crop plant), or crop plants such as, by way of example, tobacco plants, for example by immersing bruised leaves or chopped leaves in an agrobacterial solution and then culturing them in suitable media. The transformation of plants by means of Agrobacterium tumefaciens is described, for example, by Hofgen and Willmitzer in Nucl. Acid Res. (1988) 16, 9877 or is known inter alia from F. F. White, Vectors for Gene Transfer in Higher Plants; in Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press, 1993, pp. 15-38.

[0300] In addition to the transformation of somatic cells, which then have to be regenerated into intact plants, it is also possible to transform the cells of plant meristems and in particular those cells which develop into gametes. In this case, the transformed gametes follow the natural plant development, giving rise to transgenic plants. Thus, for example, seeds of Arabidopsis are treated with agrobacteria and seeds are obtained from the developing plants of which a certain proportion is transformed and thus transgenic [Feldmann, K A and Marks M D (1987). Mol Gen Genet. 208:1-9; Feldmann K (1992). In: C Koncz, N-H Chua and J Schell, eds, Methods in Arabidopsis Research. World Scientific, Singapore, pp. 274-289]. Alternative methods are based on the repeated removal of the inflorescences and incubation of the excision site in the center of the rosette with transformed agrobacteria, whereby transformed seeds can likewise be obtained at a later point in time (Chang (1994). Plant J. 5: 551-558; Katavic (1994). Mol Gen Genet, 245: 363-370). However, an especially effective method is the vacuum infiltration method with its modifications such as the "floral dip" method. In the case of vacuum infiltration of Arabidopsis, intact plants under reduced pressure are treated with an agrobacterial suspension [Bechtold, N (1993). C R Acad Sci Paris Life Sci, 316: 1194-1199], while in the case of the "floral dip" method the developing floral tissue is incubated briefly with a surfactant-treated agrobacterial suspension [Clough, S J and Bent A F (1998) The Plant J. 16, 735-743]. A certain proportion of transgenic seeds are harvested in both cases, and these seeds can be distinguished from non-transgenic seeds by growing under the above-described selective conditions. In addition, the stable transformation of plastids is advantageous because plastids are inherited maternally in most crops, reducing or eliminating the risk of transgene flow through pollen.
The transformation of the chloroplast genome is generally achieved by a process which has been schematically displayed in Klaus et al., 2004 [Nature Biotechnology 22 (2), 225-229]. Briefly, the sequences to be transformed are cloned together with a selectable marker gene between flanking sequences homologous to the chloroplast genome. These homologous flanking sequences direct site-specific integration into the plastome. Plastidal transformation has been described for many different plant species and an overview is given in Bock (2001) Transgenic plastids in basic research and plant biotechnology. J Mol. Biol. 2001 Sep. 21; 312 (3):425-38 or Maliga, P (2003) Progress towards commercialization of plastid transformation technology. Trends Biotechnol. 21, 20-28. Further biotechnological progress has recently been reported in the form of marker-free plastid transformants, which can be produced by a transient co-integrated marker gene (Klaus et al., 2004, Nature Biotechnology 22(2), 225-229).

[0301] The genetically modified plant cells can be regenerated via all methods with which the skilled worker is familiar. Suitable methods can be found in the abovementioned publications by S. D. Kung and R. Wu, Potrykus or Hofgen and Willmitzer. Alternatively, the genetically modified plant cells are non-regenerable into a whole plant.

[0302] Generally after transformation, plant cells or cell groupings are selected for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the gene of interest, following which the transformed material is regenerated into a whole plant. To select transformed plants, the plant material obtained in the transformation is, as a rule, subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility consists in growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Alternatively, the transformed plants are screened for the presence of a selectable marker such as the ones described above.

[0303] Following DNA transfer and regeneration, putatively transformed plants may also be evaluated, for instance using Southern analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western analysis, both techniques being well known to persons having ordinary skill in the art.

[0304] The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques. The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); or grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).

T-DNA Activation Tagging

[0305] "T-DNA activation tagging" (Hayashi et al. Science (1992) 1350-1353) involves insertion of T-DNA, usually containing a promoter (may also be a translation enhancer or an intron), in the genomic region of the gene of interest or 10 kb up- or downstream of the coding region of a gene in a configuration such that the promoter directs expression of the targeted gene. Typically, regulation of expression of the targeted gene by its natural promoter is disrupted and the gene falls under the control of the newly introduced promoter. The promoter is typically embedded in a T-DNA. This T-DNA is randomly inserted into the plant genome, for example, through Agrobacterium infection and leads to modified expression of genes near the inserted T-DNA. The resulting transgenic plants show dominant phenotypes due to modified expression of genes close to the introduced promoter.

TILLING

[0306] The term "TILLING" is an abbreviation of "Targeting Induced Local Lesions IN Genomes" and refers to a mutagenesis technology useful to generate and/or identify nucleic acids encoding proteins with modified expression and/or activity. TILLING also allows selection of plants carrying such mutant variants. These mutant variants may exhibit modified expression, either in strength or in location or in timing (if the mutations affect the promoter for example). These mutant variants may exhibit higher activity than that exhibited by the gene in its natural form. TILLING combines high-density mutagenesis with high-throughput screening methods. The steps typically followed in TILLING are: (a) EMS mutagenesis (Redei G P and Koncz C (1992) In Methods in Arabidopsis Research, Koncz C, Chua N H, Schell J, eds. Singapore, World Scientific Publishing Co, pp. 16-82; Feldmann et al., (1994) In Meyerowitz E M, Somerville C R, eds, Arabidopsis. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp 137-172; Lightner J and Caspar T (1998) In J Martinez-Zapater, J Salinas, eds, Methods in Molecular Biology, Vol. 82. Humana Press, Totowa, N.J., pp 91-104); (b) DNA preparation and pooling of individuals; (c) PCR amplification of a region of interest; (d) denaturation and annealing to allow formation of heteroduplexes; (e) DHPLC, where the presence of a heteroduplex in a pool is detected as an extra peak in the chromatogram; (f) identification of the mutant individual; and (g) sequencing of the mutant PCR product. Methods for TILLING are well known in the art (McCallum et al., (2000) Nat Biotechnol 18: 455-457; reviewed by Stemple (2004) Nat Rev Genet. 5(2): 145-50).

Homologous Recombination

[0307] "Homologous recombination" allows introduction in a genome of a selected nucleic acid at a defined selected position. Homologous recombination is a standard technology used routinely in biological sciences for lower organisms such as yeast or the moss Physcomitrella. Methods for performing homologous recombination in plants have been described not only for model plants (Offringa et al. (1990) EMBO J. 9(10): 3077-84) but also for crop plants, for example rice (Terada et al. (2002) Nat Biotech 20(10): 1030-4; Iida and Terada (2004) Curr Opin Biotech 15(2): 132-8), and approaches exist that are generally applicable regardless of the target organism (Miller et al, Nature Biotechnol. 25, 778-785, 2007).

Yield Related Trait(s)

[0308] A "Yield related trait" is a trait or feature which is related to plant yield. Yield-related traits may comprise one or more of the following non-limitative list of features: early flowering time, yield, biomass, seed yield, early vigour, greenness index, growth rate, agronomic traits, such as e.g. tolerance to submergence (which leads to yield in rice), Water Use Efficiency (WUE), Nitrogen Use Efficiency (NUE), etc.

[0309] Reference herein to enhanced yield-related traits relative to control plants is taken to mean one or more of an increase in early vigour and/or in biomass (weight) of one or more parts of a plant, which may include (i) aboveground parts and preferably aboveground harvestable parts and/or (ii) parts below ground and preferably harvestable below ground. In particular, such harvestable parts are seeds.

Yield

[0310] The term "yield" in general means a measurable produce of economic value, typically related to a specified crop, to an area, and to a period of time. Individual plant parts directly contribute to yield based on their number, size and/or weight. Alternatively, the actual yield is the yield per square meter for a crop and year, which is determined by dividing total production (including both harvested and appraised production) by planted square meters.
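The calculation of actual yield per square meter may be sketched as follows. The function name, parameter names and units (kilograms and square meters) are assumptions chosen for illustration only.

```python
def yield_per_square_meter(
    harvested_kg: float, appraised_kg: float, planted_m2: float
) -> float:
    """Total production (harvested plus appraised) divided by planted area.

    Units are illustrative: kilograms of produce and square meters planted,
    for one crop and one year.
    """
    if planted_m2 <= 0:
        raise ValueError("planted area must be positive")
    return (harvested_kg + appraised_kg) / planted_m2
```

For instance, 450 kg harvested plus 50 kg appraised on 1000 planted square meters corresponds to 0.5 kg per square meter.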

[0311] The terms "yield" of a plant and "plant yield" are used interchangeably herein and are meant to refer to vegetative biomass such as root and/or shoot biomass, to reproductive organs, and/or to propagules such as seeds of that plant.

[0312] Flowers in maize are unisexual; male inflorescences (tassels) originate from the apical stem and female inflorescences (ears) arise from axillary bud apices. The female inflorescence produces pairs of spikelets on the surface of a central axis (cob). Each of the female spikelets encloses two fertile florets, one of which will usually mature into a maize kernel once fertilized. Hence a yield increase in maize may be manifested as one or more of the following: an increase in the number of plants established per square meter, an increase in the number of ears per plant, an increase in the number of rows, number of kernels per row, kernel weight, thousand kernel weight, ear length/diameter, or an increase in the seed filling rate (which is the number of filled florets (i.e. florets containing seed) divided by the total number of florets and multiplied by 100), among others.

[0313] Inflorescences in rice plants are named panicles. The panicle bears spikelets, which are the basic units of the panicles, and which consist of a pedicel and a floret. The floret is borne on the pedicel and includes a flower that is covered by two protective glumes: a larger glume (the lemma) and a shorter glume (the palea). Hence, taking rice as an example, a yield increase may manifest itself as an increase in one or more of the following: number of plants per square meter, number of panicles per plant, panicle length, number of spikelets per panicle, number of flowers (or florets) per panicle; an increase in the seed filling rate which is the number of filled florets (i.e. florets containing seeds) divided by the total number of florets and multiplied by 100; an increase in thousand kernel weight, among others.

Early Flowering Time

[0314] Plants having an "early flowering time" as used herein are plants which start to flower earlier than control plants. Hence this term refers to plants that show an earlier start of flowering. Flowering time of plants can be assessed by counting the number of days ("time to flower") between sowing and the emergence of a first inflorescence. The "flowering time" of a plant can for instance be determined using the method as described in WO 2007/093444.

Early Vigour

[0315] "Early vigour" refers to active healthy well-balanced growth especially during early stages of plant growth, and may result from increased plant fitness due to, for example, the plants being better adapted to their environment (i.e. optimizing the use of energy resources and partitioning between shoot and root). Plants having early vigour also show increased seedling survival and a better establishment of the crop, which often results in highly uniform fields (with the crop growing in uniform manner, i.e. with the majority of plants reaching the various stages of development at substantially the same time), and often better and higher yield. Therefore, early vigour may be determined by measuring various factors, such as thousand kernel weight, percentage germination, percentage emergence, seedling growth, seedling height, root length, root and shoot biomass and many more.

Increased Growth Rate

[0316] The increased growth rate may be specific to one or more parts of a plant (including seeds), or may be throughout substantially the whole plant. Plants having an increased growth rate may have a shorter life cycle. The life cycle of a plant may be taken to mean the time needed to grow from a mature seed up to the stage where the plant has produced mature seeds, similar to the starting material. This life cycle may be influenced by factors such as speed of germination, early vigour, growth rate, greenness index, flowering time and speed of seed maturation. The increase in growth rate may take place at one or more stages in the life cycle of a plant or during substantially the whole plant life cycle. Increased growth rate during the early stages in the life cycle of a plant may reflect enhanced vigour. The increase in growth rate may alter the harvest cycle of a plant allowing plants to be sown later and/or harvested sooner than would otherwise be possible (a similar effect may be obtained with earlier flowering time). If the growth rate is sufficiently increased, it may allow for the further sowing of seeds of the same plant species (for example sowing and harvesting of rice plants followed by sowing and harvesting of further rice plants all within one conventional growing period). Similarly, if the growth rate is sufficiently increased, it may allow for the further sowing of seeds of different plant species (for example the sowing and harvesting of corn plants followed by, for example, the sowing and optional harvesting of soybean, potato or any other suitable plant). Harvesting additional times from the same rootstock in the case of some crop plants may also be possible. Altering the harvest cycle of a plant may lead to an increase in annual biomass production per square meter (due to an increase in the number of times (say in a year) that any particular plant may be grown and harvested).
An increase in growth rate may also allow for the cultivation of transgenic plants in a wider geographical area than their wild-type counterparts, since the territorial limitations for growing a crop are often determined by adverse environmental conditions either at the time of planting (early season) or at the time of harvesting (late season). Such adverse conditions may be avoided if the harvest cycle is shortened. The growth rate may be determined by deriving various parameters from growth curves; such parameters may be T-Mid (the time taken for plants to reach 50% of their maximal size) and T-90 (the time taken for plants to reach 90% of their maximal size), amongst others.
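One possible way of deriving T-Mid and T-90 from a measured growth curve is sketched below. The function name, the representation of the curve as paired lists of time points and sizes, and the use of linear interpolation between neighbouring measurements are all assumptions of this sketch rather than features of any particular measurement protocol.

```python
from typing import Sequence


def time_to_fraction(
    times: Sequence[float], sizes: Sequence[float], fraction: float
) -> float:
    """Return the first time at which size reaches `fraction` of its maximum.

    `times` must be in increasing order. Linear interpolation between the two
    measurements that bracket the threshold is an assumption of this sketch.
    """
    if len(times) != len(sizes) or not times:
        raise ValueError("times and sizes must be non-empty and equally long")
    threshold = fraction * max(sizes)
    for i, size in enumerate(sizes):
        if size >= threshold:
            if i == 0:
                return times[0]
            t0, t1 = times[i - 1], times[i]
            s0, s1 = sizes[i - 1], size
            # s0 is below the threshold and s1 is at or above it, so s1 > s0.
            return t0 + (threshold - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("threshold is never reached; fraction must be <= 1")


def t_mid(times: Sequence[float], sizes: Sequence[float]) -> float:
    """Time taken to reach 50% of maximal size."""
    return time_to_fraction(times, sizes, 0.5)


def t_90(times: Sequence[float], sizes: Sequence[float]) -> float:
    """Time taken to reach 90% of maximal size."""
    return time_to_fraction(times, sizes, 0.9)
```

With hypothetical measurements at days 0, 10, 20, 30 and 40 and sizes 0, 20, 60, 95 and 100, `t_mid` returns 17.5 days; a plant line with a lower T-Mid or T-90 than its control would be reaching the corresponding size sooner.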

Stress Resistance

[0317] An increase in yield and/or growth rate occurs whether the plant is under non-stress conditions or whether the plant is exposed to various stresses compared to control plants. Plants typically respond to exposure to stress by growing more slowly. In conditions of severe stress, the plant may even stop growing altogether. Mild stress on the other hand is defined herein as being any stress to which a plant is exposed which does not result in the plant ceasing to grow altogether without the capacity to resume growth. Mild stress in the sense of the invention leads to a reduction in the growth of the stressed plants of less than 40%, 35%, 30% or 25%, more preferably less than 20% or 15% in comparison to the control plant under non-stress conditions. Due to advances in agricultural practices (irrigation, fertilization, pesticide treatments) severe stresses are not often encountered in cultivated crop plants. As a consequence, the compromised growth induced by mild stress is often an undesirable feature for agriculture. Abiotic stresses may be due to drought or excess water, anaerobic stress, salt stress, chemical toxicity, oxidative stress and hot, cold or freezing temperatures.

[0318] "Biotic stresses" are typically those stresses caused by pathogens, such as bacteria, viruses, fungi, nematodes and insects.

[0319] The "abiotic stress" may be an osmotic stress caused by a water stress, e.g. due to drought, salt stress, or freezing stress. Abiotic stress may also be an oxidative stress or a cold stress. "Freezing stress" is intended to refer to stress due to freezing temperatures, i.e. temperatures at which available water molecules freeze and turn into ice. "Cold stress", also called "chilling stress", is intended to refer to cold temperatures, e.g. temperatures below 10° C., or preferably below 5° C., but at which water molecules do not freeze. As reported in Wang et al. (Planta (2003) 218: 1-14), abiotic stress leads to a series of morphological, physiological, biochemical and molecular changes that adversely affect plant growth and productivity. Drought, salinity, extreme temperatures and oxidative stress are known to be interconnected and may induce growth and cellular damage through similar mechanisms. Rabbani et al. (Plant Physiol (2003) 133: 1755-1767) describe a particularly high degree of "cross talk" between drought stress and high-salinity stress. For example, drought and/or salinisation are manifested primarily as osmotic stress, resulting in the disruption of homeostasis and ion distribution in the cell. Oxidative stress, which frequently accompanies high or low temperature, salinity or drought stress, may cause denaturing of functional and structural proteins. As a consequence, these diverse environmental stresses often activate similar cell signalling pathways and cellular responses, such as the production of stress proteins, up-regulation of anti-oxidants, accumulation of compatible solutes and growth arrest. "Non-stress" conditions as used herein are those environmental conditions that allow optimal growth of plants. Persons skilled in the art are aware of normal soil conditions and climatic conditions for a given location.
Plants with optimal growth conditions (grown under non-stress conditions) typically yield in increasing order of preference at least 97%, 95%, 92%, 90%, 87%, 85%, 83%, 80%, 77% or 75% of the average production of such plant in a given environment. Average production may be calculated on harvest and/or season basis. Persons skilled in the art are aware of average yield productions of a crop.

[0320] In particular, the methods of the present invention may be performed under non-stress conditions. In an example, the methods of the present invention may be performed under non-stress conditions such as mild drought to give plants having increased yield relative to control plants.

[0321] In another embodiment, the methods of the present invention may be performed under stress conditions.

[0322] In an example, the methods of the present invention may be performed under stress conditions such as drought to give plants having increased yield relative to control plants.

[0323] In another example, the methods of the present invention may be performed under stress conditions such as nutrient deficiency to give plants having increased yield relative to control plants.

[0324] Nutrient deficiency may result from a lack of nutrients such as nitrogen, phosphates and other phosphorus-containing compounds, potassium, calcium, magnesium, manganese, iron and boron, amongst others.

[0325] In yet another example, the methods of the present invention may be performed under stress conditions such as salt stress to give plants having increased yield relative to control plants. The term salt stress is not restricted to common salt (NaCl), but may be any one or more of: NaCl, KCl, LiCl, MgCl₂, CaCl₂, amongst others.

[0326] In yet another example, the methods of the present invention may be performed under stress conditions such as cold stress or freezing stress to give plants having increased yield relative to control plants.

Increase/Improve/Enhance

[0327] The terms "increase", "improve" or "enhance" are interchangeable and shall mean in the sense of the application at least a 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, preferably at least 15% or 20%, more preferably 25%, 30%, 35% or 40% more yield and/or growth in comparison to control plants as defined herein.
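As an illustration of how such a percentage may be computed from measurements, the sketch below compares a value measured on test plants with the corresponding value for control plants. The function names are hypothetical, and the default 3% cut-off is simply the lowest figure recited above.

```python
def percent_increase(test_value: float, control_value: float) -> float:
    """Difference between test and control, as a percentage of the control."""
    if control_value <= 0:
        raise ValueError("control value must be positive")
    return (test_value - control_value) / control_value * 100.0


def meets_minimum_increase(
    test_value: float, control_value: float, minimum_percent: float = 3.0
) -> bool:
    """True if the test value exceeds the control by at least the minimum."""
    return percent_increase(test_value, control_value) >= minimum_percent
```

The same comparison applies whichever yield or growth parameter is measured, provided that test and control values are expressed in the same units.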

Seed Yield

[0328] Increased seed yield may manifest itself as one or more of the following:

[0329] (a) an increase in seed biomass (total seed weight) which may be on an individual seed basis and/or per plant and/or per square meter;

[0330] (b) increased number of flowers per plant;

[0331] (c) increased number of seeds;

[0332] (d) increased seed filling rate (which is expressed as the ratio of the number of filled florets to the total number of florets);

[0333] (e) increased harvest index, which is expressed as a ratio of the yield of harvestable parts, such as seeds, divided by the biomass of aboveground plant parts; and

[0334] (f) increased thousand kernel weight (TKW), which is extrapolated from the number of seeds counted and their total weight. An increased TKW may result from an increased seed size and/or seed weight, and may also result from an increase in embryo and/or endosperm size.
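Three of the seed yield parameters listed above lend themselves to a simple worked sketch. The function names and the use of grams are assumptions made for illustration; the seed filling rate is expressed here as a percentage, in line with the maize and rice examples given earlier.

```python
def seed_filling_rate(filled_florets: int, total_florets: int) -> float:
    """Filled florets as a percentage of the total number of florets."""
    if total_florets <= 0:
        raise ValueError("total number of florets must be positive")
    return filled_florets / total_florets * 100.0


def harvest_index(harvestable_yield_g: float, aboveground_biomass_g: float) -> float:
    """Yield of harvestable parts (e.g. seeds) divided by aboveground biomass."""
    if aboveground_biomass_g <= 0:
        raise ValueError("aboveground biomass must be positive")
    return harvestable_yield_g / aboveground_biomass_g


def thousand_kernel_weight(seeds_counted: int, total_weight_g: float) -> float:
    """Extrapolate the weight of 1000 seeds from a counted and weighed sample."""
    if seeds_counted <= 0:
        raise ValueError("number of seeds counted must be positive")
    return total_weight_g / seeds_counted * 1000.0
```

For example, a sample of 400 counted seeds weighing 10 g in total extrapolates to a thousand kernel weight of 25 g.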

[0335] The terms "filled florets" and "filled seeds" may be considered synonyms.

[0336] An increase in seed yield may also be manifested as an increase in seed size and/or seed volume. Furthermore, an increase in seed yield may also manifest itself as an increase in seed area and/or seed length and/or seed width and/or seed perimeter.

Greenness Index

[0337] The "greenness index" as used herein is calculated from digital images of plants. For each pixel belonging to the plant object on the image, the ratio of the green value versus the red value (in the RGB model for encoding color) is calculated. The greenness index is expressed as the percentage of pixels for which the green-to-red ratio exceeds a given threshold. Under normal growth conditions, under salt stress growth conditions, and under reduced nutrient availability growth conditions, the greenness index of plants is measured in the last imaging before flowering. In contrast, under drought stress growth conditions, the greenness index of plants is measured in the first imaging after drought.
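The greenness index calculation may be sketched as follows. The sketch assumes that the pixels belonging to the plant object have already been separated from the background and are supplied as (red, green, blue) tuples; the function name and the threshold value used in any given experiment are hypothetical.

```python
from typing import Iterable, Tuple


def greenness_index(
    plant_pixels: Iterable[Tuple[int, int, int]], threshold: float
) -> float:
    """Percentage of plant pixels whose green-to-red ratio exceeds `threshold`.

    Each pixel is an (R, G, B) tuple. Segmentation of the plant from the
    background is assumed to have been done beforehand.
    """
    total = 0
    above = 0
    for red, green, _blue in plant_pixels:
        total += 1
        # For red > 0 this is the same test as green / red > threshold;
        # writing it as a product avoids dividing by zero when red == 0.
        if green > threshold * red:
            above += 1
    if total == 0:
        raise ValueError("no plant pixels supplied")
    return above / total * 100.0
```

With four hypothetical pixels (50, 100, 20), (100, 100, 30), (80, 60, 10) and (10, 90, 5) and a threshold of 1.2, two of the four pixels exceed the threshold, giving a greenness index of 50%.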

Biomass

[0338] The term "biomass" as used herein is intended to refer to the total weight of a plant. Within the definition of biomass, a distinction may be made between the biomass of one or more parts of a plant, which may include any one or more of the following:

[0339] aboveground parts such as but not limited to shoot biomass, seed biomass, leaf biomass, etc.;

[0340] aboveground harvestable parts such as but not limited to shoot biomass, seed biomass, leaf biomass, etc.;

[0341] parts below ground, such as but not limited to root biomass, tubers, bulbs, etc.;

[0342] harvestable parts below ground, such as but not limited to root biomass, tubers, bulbs, etc.;

[0343] harvestable parts partially below ground such as but not limited to beets and other hypocotyl areas of a plant, rhizomes, stolons or creeping rootstalks;

[0344] vegetative biomass such as root biomass, shoot biomass, etc.;

[0345] reproductive organs; and

[0346] propagules such as seed.

Marker Assisted Breeding

[0347] Such breeding programmes sometimes require introduction of allelic variation by mutagenic treatment of the plants, using for example EMS mutagenesis; alternatively, the programme may start with a collection of allelic variants of so-called "natural" origin caused unintentionally. Identification of allelic variants then takes place, for example, by PCR. This is followed by a step for selection of superior allelic variants of the sequence in question which give increased yield. Selection is typically carried out by monitoring growth performance of plants containing different allelic variants of the sequence in question. Growth performance may be monitored in a greenhouse or in the field. Further optional steps include crossing plants in which the superior allelic variant was identified with another plant. This could be used, for example, to make a combination of interesting phenotypic features.

Use as Probes in (Gene Mapping)

[0348] Use of nucleic acids encoding the protein of interest for genetically and physically mapping the genes requires only a nucleic acid sequence of at least 15 nucleotides in length. These nucleic acids may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Sambrook J, Fritsch E F and Maniatis T (1989) Molecular Cloning, A Laboratory Manual) of restriction-digested plant genomic DNA may be probed with the nucleic acids encoding the protein of interest. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al. (1987) Genomics 1: 174-181) in order to construct a genetic map. In addition, the nucleic acids may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the nucleic acid encoding the protein of interest in the genetic map previously obtained using this population (Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331).

[0349] The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol. Biol. Reporter 4: 37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.

[0350] The nucleic acid probes may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel et al. In: Non-mammalian Genomic Analysis: A Practical Guide, Academic press 1996, pp. 319-346, and references cited therein).

[0351] In another embodiment, the nucleic acid probes may be used in direct fluorescence in situ hybridisation (FISH) mapping (Trask (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favour use of large clones (several kb to several hundred kb; see Laan et al. (1995) Genome Res. 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.

[0352] A variety of nucleic acid amplification-based methods for genetic and physical mapping may be carried out using the nucleic acids. Examples include allele-specific amplification (Kazazian (1989) J. Lab. Clin. Med. 11:95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat. Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.

Plant

[0353] The term "plant" as used herein encompasses whole plants, ancestors and progeny of the plants and plant parts, including seeds, shoots, stems, leaves, roots (including tubers), flowers, and tissues and organs, wherein each of the aforementioned comprises the gene/nucleic acid of interest. The term "plant" also encompasses plant cells, suspension cultures, callus tissue, embryos, meristematic regions, gametophytes, sporophytes, pollen and microspores, again wherein each of the aforementioned comprises the gene/nucleic acid of interest.

[0354] Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs selected from the list comprising Acer spp., Actinidia spp., Abelmoschus spp., Agave sisalana, Agropyron spp., Agrostis stolonifera, Allium spp., Amaranthus spp., Ammophila arenaria, Ananas comosus, Annona spp., Apium graveolens, Arachis spp., Artocarpus spp., Asparagus officinalis, Avena spp. (e.g. Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Averrhoa carambola, Bambusa sp., Benincasa hispida, Bertholletia excelsa, Beta vulgaris, Brassica spp. (e.g. Brassica napus, Brassica rapa ssp. [canola, oilseed rape, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Ceiba pentandra, Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Corchorus sp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Elaeis (e.g. Elaeis guineensis, Elaeis oleifera), Eleusine coracana, Eragrostis tef, Erianthus sp., Eriobotrya japonica, Eucalyptus sp., Eugenia uniflora, Fagopyrum spp., Fagus spp., Festuca arundinacea, Ficus carica, Fortunella spp., Fragaria spp., Ginkgo biloba, Glycine spp. (e.g. Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g. Helianthus annuus), Hemerocallis fulva, Hibiscus spp., Hordeum spp. (e.g. Hordeum vulgare), Ipomoea batatas, Juglans spp., Lactuca sativa, Lathyrus spp., Lens culinaris, Linum usitatissimum, Litchi chinensis, Lotus spp., Luffa acutangula, Lupinus spp., Luzula sylvatica, Lycopersicon spp. (e.g. Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma spp., Malus spp., Malpighia emarginata, Mammea americana, Mangifera indica, Manihot spp., Manilkara zapota, Medicago sativa, Melilotus spp., Mentha spp., Miscanthus sinensis, Momordica spp., Morus nigra, Musa spp., Nicotiana spp., Olea spp., Opuntia spp., Ornithopus spp., Oryza spp. (e.g. Oryza sativa, Oryza latifolia), Panicum miliaceum, Panicum virgatum, Passiflora edulis, Pastinaca sativa, Pennisetum sp., Persea spp., Petroselinum crispum, Phalaris arundinacea, Phaseolus spp., Phleum pratense, Phoenix spp., Phragmites australis, Physalis spp., Pinus spp., Pistacia vera, Pisum spp., Poa spp., Populus spp., Prosopis spp., Prunus spp., Psidium spp., Punica granatum, Pyrus communis, Quercus spp., Raphanus sativus, Rheum rhabarbarum, Ribes spp., Ricinus communis, Rubus spp., Saccharum spp., Salix sp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp. (e.g. Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Sorghum bicolor, Spinacia spp., Syzygium spp., Tagetes spp., Tamarindus indica, Theobroma cacao, Trifolium spp., Tripsacum dactyloides, Triticosecale rimpaui, Triticum spp. (e.g. Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., Viola odorata, Vitis spp., Zea mays, Zizania palustris, Ziziphus spp., amongst others.

Control Plant(s)

[0355] The choice of suitable control plants is a routine part of an experimental setup and may include corresponding wild type plants or corresponding plants without the gene of interest. The control plant is typically of the same plant species or even of the same variety as the plant to be assessed. The control plant may also be a nullizygote of the plant to be assessed. Nullizygotes (or null control plants) are individuals missing the transgene by segregation. Further, control plants are grown under equal growing conditions to the growing conditions of the plants of the invention, i.e. in the vicinity of, and simultaneously with, the plants of the invention. A "control plant" as used herein refers not only to whole plants, but also to plant parts, including seeds and seed parts.

DESCRIPTION OF FIGURES

[0356] The present invention will now be described with reference to the following figures in which:

[0357] FIG. 1 represents the domain structure of SEQ ID NO: 2 and SEQ ID NO: 4 with the conserved motifs 1 to 12 indicated.

[0358] FIG. 2 represents a multiple alignment of various bZIP-like polypeptides. The consensus sequence line at the bottom of the alignment gives an indication of the conserved regions within the group of bZIP-like proteins. These alignments can be used for defining further motifs or signature sequences, when using conserved amino acids. Panel A shows an alignment of sequences comprising SEQ ID NO: 2, panel B shows an alignment of sequences comprising SEQ ID NO: 4.

[0359] FIG. 3 shows the MATGAT tables of Example 3. Panel A shows a table for sequences comprising SEQ ID NO: 2, panel B shows a table for sequences comprising SEQ ID NO: 4.

[0360] FIG. 4 represents the binary vector used for increased expression in Oryza sativa of a bZIP-like-encoding nucleic acid (such as SEQ ID NO: 1 or SEQ ID NO: 3) under the control of a rice GOS2 promoter (pGOS2).

[0361] FIG. 5 represents the domain structure of SEQ ID NO: 142 with conserved motifs.

[0362] FIG. 6 represents a multiple alignment of various BCAT4-like polypeptides. The asterisks indicate identical amino acids among the various protein sequences, colons represent highly conserved amino acid substitutions, and the dots represent less conserved amino acid substitutions; at other positions there is no sequence conservation. These alignments can be used for defining further motifs or signature sequences, when using conserved amino acids. The corresponding SEQ ID NOs for the aligned polypeptide sequences shown in FIG. 6 are:

[0363] SEQ ID NO: 160 for G.max_Glyma06g05280

[0364] SEQ ID NO: 168 for H.annuus_TC54245

[0365] SEQ ID NO: 146 for A.thaliana_AT1G10070

[0366] SEQ ID NO: 198 for P.trichocarpa_scaff_II.1054

[0367] SEQ ID NO: 144 for A.thaliana_AT1G10060

[0368] SEQ ID NO: 154 for A.thaliana_AT3G49680

[0369] SEQ ID NO: 156 for A.thaliana_AT5G65780

[0370] SEQ ID NO: 142 for P.trichocarpa_BCAT4-like

[0371] SEQ ID NO: 158 for G.max_Glyma01g40420

[0372] SEQ ID NO: 166 for G.max_Glyma11g04870

[0373] SEQ ID NO: 182 for M.truncatula_TC114768

[0374] SEQ ID NO: 202 for S.lycopersicum_TC213629

[0375] SEQ ID NO: 170 for H.vulgare_TC165564

[0376] SEQ ID NO: 188 for O.sativa_LOC_Os05g48450

[0377] SEQ ID NO: 208 for Z.mays_TA12434_4577999

[0378] SEQ ID NO: 174 for Hordeum_vulgare_PUT-169a-Horde

[0379] SEQ ID NO: 176 for Hordeum_vulgare_subsp_vulgare_

[0380] SEQ ID NO: 190 for O.sativa_Os03g0106400

[0381] SEQ ID NO: 186 for O.sativa_LOC_Os04g47190

[0382] SEQ ID NO: 204 for T.aestivum_TC320973

[0383] SEQ ID NO: 210 for Zea_mays_GRMZM2G047347_T03

[0384] SEQ ID NO: 192 for P.patens_TC31354

[0385] SEQ ID NO: 194 for P.patens_TC33668

[0386] SEQ ID NO: 172 for H.vulgare_TC186077

[0387] SEQ ID NO: 206 for T.aestivum_TC325793

[0388] SEQ ID NO: 184 for O.sativa_LOC_Os03g12890

[0389] SEQ ID NO: 212 for Zea_mays_GRMZM2G153536_T03

[0390] SEQ ID NO: 148 for A.thaliana_AT1G50090

[0391] SEQ ID NO: 150 for A.thaliana_AT1G50110

[0392] SEQ ID NO: 152 for A.thaliana_AT3G19710

[0393] SEQ ID NO: 162 for G.max_Glyma07g30510

[0394] SEQ ID NO: 164 for G.max_Glyma08g06750

[0395] SEQ ID NO: 178 for M.truncatula_AC159872_36

[0396] SEQ ID NO: 196 for P.trichocarpa_804339

[0397] SEQ ID NO: 200 for P.trichocarpa_scaff_IX.827

[0398] SEQ ID NO: 180 for M.truncatula_AC159872_55

[0399] FIG. 7 shows a phylogenetic tree of BCAT4-like polypeptides, as described in Example 2.

[0400] FIG. 8 shows the MATGAT table of Example 3.

[0401] FIG. 9 represents the binary vector used for increased expression in Oryza sativa of a BCAT4-like-encoding nucleic acid under the control of a rice GOS2 promoter (pGOS2).

EXAMPLES

[0402] The present invention will now be described with reference to the following examples, which are by way of illustration only. The following examples are not intended to limit the scope of the invention. Unless otherwise indicated, the present invention employs conventional techniques and methods of plant biology, molecular biology, bioinformatics and plant breeding.

[0403] DNA manipulation: unless otherwise stated, recombinant DNA techniques are performed according to standard protocols described in (Sambrook (2001) Molecular Cloning: a laboratory manual, 3rd Edition Cold Spring Harbor Laboratory Press, CSH, New York) or in Volumes 1 and 2 of Ausubel et al. (1994), Current Protocols in Molecular Biology, Current Protocols. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R. D. D. Croy, published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications (UK).

Example 1

Identification of Sequences Related to the Nucleic Acid Sequence Used in the Methods of the Invention

[0404] 1. bZIP-Like Polypeptides

[0405] Sequences (full length cDNA, ESTs or genomic) related to SEQ ID NO: 1 or 3 and SEQ ID NO: 2 or 4 were identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid of SEQ ID NO: 1 or SEQ ID NO: 3 was used for the TBLASTN algorithm, with default settings and the filter to ignore low complexity sequences set off. The output of the analysis was viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons were also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
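The ranking of hits by E-value and the scoring by percentage identity described above can be sketched as follows. This is an illustrative sketch only and does not call BLAST itself: the hit records, their keys ("id", "evalue", "identity") and the function names are hypothetical, and percentage identity is taken here, as an assumption, over the full length of a pre-aligned pair of sequences in which "-" denotes a gap.

```python
def percent_identity(aligned_a, aligned_b):
    """Identical positions as a percentage of the aligned length.

    Both arguments are rows of one pairwise alignment (equal length);
    '-' marks a gap and never counts as an identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)


def rank_hits(hits, max_evalue=10.0):
    """Keep hits with E-value <= max_evalue, most significant first.

    hits: list of dicts with hypothetical keys 'id', 'evalue', 'identity'.
    Ties in E-value are broken by higher percentage identity.
    """
    kept = [h for h in hits if h["evalue"] <= max_evalue]
    return sorted(kept, key=lambda h: (h["evalue"], -h["identity"]))
```

Raising `max_evalue` corresponds to the less stringent search mentioned above: a short, nearly exact match typically has a high percentage identity but a comparatively high E-value, and only appears in the ranked list once the cut-off is relaxed.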

[0406] Table A1 provides a list of nucleic acid sequences and amino acid sequences related to SEQ ID NO: 1/2 and SEQ ID NO: 3/4 respectively.

TABLE-US-00013 TABLE A1 Examples of bZIP-like nucleic acids and polypeptides:

Plant source                      Nucleic acid SEQ ID NO:  Protein SEQ ID NO:
S. lycopersicum bZIP-like         1                        2
P. trichocarpa bZIP-like          3                        4
A.lyrata_322281                   5                        6
A.lyrata_895903                   7                        8
A.lyrata_944204                   9                        10
A.thaliana_AT2G46270.2            11                       12
A.thaliana_AT4G01120.1            13                       14
A.thaliana_AT4G36730.2            15                       16
Aquilegia_sp_TC21139              17                       18
B.napus_TC63500                   19                       20
B.napus_TC63534                   21                       22
B.napus_TC63535                   23                       24
B.napus_TC63581                   25                       26
B.napus_TC89867                   27                       28
C.canephora_TC1061                29                       30
C.canephora_TC5243                31                       32
C.clementina_TC35556              33                       34
C.endivia_EL359780                35                       36
C.maculosa_EH744483               37                       38
C.maculosa_TA2192_215693          39                       40
C.roseus_AF084971                 41                       42
C.roseus_AY027510                 43                       44
C.sinensis_TC24527                45                       46
C.sinensis_TC9189                 47                       48
C.tinctorius_TA1437_4222          49                       50
E.esula_TC3985                    51                       52
F.vesca_TA11395_57918             53                       54
G.hirsutum_TC137570               55                       56
G.hirsutum_TC148476               57                       58
G.max_Glyma03g41590.4             59                       60
G.max_Glyma07g06620.1             61                       62
G.max_Glyma11g06960.1             63                       64
G.max_Glyma16g03190.1             65                       66
G.max_Glyma16g25600.2             67                       68
G.max_Glyma19g44190.1             69                       70
H.annuus_BU027457                 71                       72
H.annuus_TC42219                  73                       74
H.petiolaris_TA3844_4234          75                       76
H.tuberosus_TA2407_4233           77                       78
H.tuberosus_TA2966_4233           79                       80
M.truncatula_AC137602_8.5         81                       82
M.truncatula_AC148484_8.5         83                       84
N.tabacum_BP133908                85                       86
N.tabacum_NP917548                87                       88
N.tabacum_TC40642                 89                       90
N.tabacum_TC76189                 91                       92
P.taeda_TA23200_3352              93                       94
P.trichocarpa_719452              95                       96
P.trifoliata_TA6299_37690         97                       98
P.vulgaris_TC11785                99                       100
S.lycopersicum_TC211600           101                      102
S.lycopersicum_TC213303           103                      104
S.tuberosum_TC185019              105                      106
S.tuberosum_TC186959              107                      108
S.tuberosum_TC187892              109                      110
T.pratense_TA1890_57577           111                      112
Medicago truncatula GBF3-like     113                      114
V.vinifera_GSVIVT00014657001      115                      116
V.vinifera_GSVIVT00024984001      117                      118

2. BCAT4-Like Polypeptides

[0407] Sequences (full length cDNA, ESTs or genomic) related to SEQ ID NO: 141 and SEQ ID NO: 142 were identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid of SEQ ID NO: 141 was used for the TBLASTN algorithm, with default settings and the filter to ignore low complexity sequences set off. The output of the analysis was viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons were also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.

[0408] Table A2 provides a list of nucleic acid sequences related to SEQ ID NO: 141 and SEQ ID NO: 142.

TABLE-US-00014 TABLE A2 Examples of BCAT4-like nucleic acids and polypeptides:

Plant source                                    Nucleic acid SEQ ID NO:  Protein SEQ ID NO:
A.thaliana_AT1G10060.1                          143                      144
A.thaliana_AT1G10070.1                          145                      146
A.thaliana_AT1G50090.1                          147                      148
A.thaliana_AT1G50110.1                          149                      150
A.thaliana_AT3G19710.1                          151                      152
A.thaliana_AT3G49680.1                          153                      154
A.thaliana_AT5G65780.1                          155                      156
G.max_Glyma01g40420.1                           157                      158
G.max_Glyma06g05280.1                           159                      160
G.max_Glyma07g30510.1                           161                      162
G.max_Glyma08g06750.1                           163                      164
G.max_Glyma11g04870.1                           165                      166
H.annuus_TC54245                                167                      168
H.vulgare_TC165564                              169                      170
H.vulgare_TC186077                              171                      172
Hordeum_vulgare_PUT-169a-Hordeum_vulgare-79158  173                      174
Hordeum_vulgare_subsp_vulgare_AK251931          175                      176
M.truncatula_AC159872_36                        177                      178
M.truncatula_AC159872_55                        179                      180
M.truncatula_TC114768                           181                      182
O.sativa_LOC_Os03g12890                         183                      184
O.sativa_LOC_Os04g47190                         185                      186
O.sativa_LOC_Os05g48450                         187                      188
O.sativa_Os03g0106400                           189                      190
P.patens_TC31354                                191                      192
P.patens_TC33668                                193                      194
P.trichocarpa_804339                            195                      196
P.trichocarpa_scaff_II.1054                     197                      198
P.trichocarpa_scaff_IX.827                      199                      200
S.lycopersicum_TC213629                         201                      202
T.aestivum_TC320973                             203                      204
T.aestivum_TC325793                             205                      206
Z.mays_TA12434_4577999                          207                      208
Zea_mays_GRMZM2G047347_T03                      209                      210
Zea_mays_GRMZM2G153536_T03                      211                      212

[0409] Sequences have been tentatively assembled and publicly disclosed by research institutions, such as The Institute for Genomic Research (TIGR; beginning with TA). For instance, the Eukaryotic Gene Orthologs (EGO) database may be used to identify such related sequences, either by keyword search or by using the BLAST algorithm with the nucleic acid sequence or polypeptide sequence of interest. Special nucleic acid sequence databases have been created for particular organisms, e.g. for certain prokaryotic organisms, such as by the Joint Genome Institute. Furthermore, access to proprietary databases has allowed the identification of novel nucleic acid and polypeptide sequences.

Example 2

Alignment of Sequences to the Polypeptide Sequences Used in the Methods of the Invention

[0410] 1. bZIP-Like Polypeptides

[0411] Alignment of the polypeptide sequences was performed using the ClustalW algorithm (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003) Nucleic Acids Res 31:3497-3500) as present in Vector NTI (Invitrogen), with standard settings (gap opening penalty: 10, gap extension penalty: 0.05, gap separation penalty range: 8). Minor manual editing was done to further optimise the alignment. The bZIP-like polypeptides are aligned in FIGS. 2 A and B.

2. BCAT4-Like Polypeptides

[0412] Alignment of the polypeptide sequences was performed using the ClustalW 2.0.11 algorithm of progressive alignment (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003) Nucleic Acids Res 31:3497-3500) with standard settings (slow alignment, similarity matrix: Gonnet, gap opening penalty: 10, gap extension penalty: 0.2). Minor manual editing was done to further optimise the alignment. The BCAT4-like polypeptides are aligned in FIG. 6.

[0413] A phylogenetic tree of BCAT4-like polypeptides (FIG. 7) was constructed by aligning BCAT4-like sequences using MAFFT (Katoh and Toh (2008) Briefings in Bioinformatics 9:286-298). A neighbour-joining tree was calculated using Quick-Tree (Howe et al. (2002), Bioinformatics 18(11): 1546-7) with 100 bootstrap repetitions. The tree was drawn using Dendroscope (Huson et al. (2007), BMC Bioinformatics 8(1):460). Confidence levels for 100 bootstrap repetitions are indicated for major branchings.

Example 3

Calculation of Global Percentage Identity Between Polypeptide Sequences

[0414] Global percentages of similarity and identity between full length polypeptide sequences useful in performing the methods of the invention were determined using one of the methods available in the art, the MatGAT (Matrix Global Alignment Tool) software (BMC Bioinformatics. 2003 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. Campanella J J, Bitincka L, Smalley J; software hosted by Ledion Bitincka). MatGAT generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix.
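The principle of deriving a matrix of global percentage identities from a series of pair-wise global alignments may be sketched as follows. This sketch is not the MatGAT implementation: as a stated simplification it uses a basic Needleman-Wunsch alignment with a simple match/mismatch score and a linear gap penalty, in place of the Myers and Miller algorithm with Blosum 62 and the gap opening and extension penalties given above, and it reports identity over the number of alignment columns, which is an assumption. All function names and default scores are hypothetical.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Percent identity of a global (Needleman-Wunsch) alignment of a and b."""
    if not a or not b:
        raise ValueError("sequences must be non-empty")
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting columns and identical pairs.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            if a[i - 1] == b[j - 1]:
                identical += 1
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # residue of a aligned to a gap
        else:
            j -= 1  # residue of b aligned to a gap
    return 100.0 * identical / columns


def identity_matrix(seqs):
    """Symmetric matrix of pairwise global percent identities."""
    n = len(seqs)
    matrix = [[100.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = global_identity(seqs[i], seqs[j])
    return matrix
```

Because each pair is aligned independently, no multiple pre-alignment of the data is needed, which mirrors the property of MatGAT noted above. Values produced by this simplified scoring will generally differ from those reported in the MatGAT tables.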

1. bZIP-Like Polypeptides

[0415] Results of the MatGAT analysis are shown in FIG. 3 with global similarity and identity percentages over the full length of the polypeptide sequences. Sequence similarity is shown in the bottom half of the dividing line and sequence identity is shown in the top half of the diagonal dividing line. Parameters used in the analysis were: Scoring matrix: Blosum62, First Gap: 12, Extending Gap: 2. The sequence identity (in %) between the bZIP-like polypeptide sequences useful in performing the methods of the invention can be as low as 18% (but is generally higher than 30%) compared to SEQ ID NO: 2. For SEQ ID NO: 4, the sequence identity with other bZIP-like polypeptide sequences useful in performing the methods of the invention can be as low as 23% (but is on average higher than 48%).

2. BCAT4-Like Polypeptides

[0416] Results of the MatGAT analysis are shown in FIG. 8 with global similarity and identity percentages over the full length of the polypeptide sequences. Sequence similarity is shown in the bottom half of the dividing line and sequence identity is shown in the top half of the diagonal dividing line. Parameters used in the analysis were: Scoring matrix: Blosum62, First Gap: 12, Extending Gap: 2. The sequence identity (in %) between the BCAT4-like polypeptide sequences useful in performing the methods of the invention can be as low as 37.9% (is generally higher than 37.9%) compared to SEQ ID NO: 142.

TABLE-US-00015 TABLE B Description of proteins in FIG. 8:

1. A.thaliana_AT1G10060
2. A.thaliana_AT1G10070
3. A.thaliana_AT1G50090
4. A.thaliana_AT1G50110
5. A.thaliana_AT3G19710
6. A.thaliana_AT3G49680
7. A.thaliana_AT5G65780
8. G.max_Glyma01g40420
9. G.max_Glyma06g05280
10. G.max_Glyma07g30510
11. G.max_Glyma08g06750
12. G.max_Glyma11g04870
13. H.annuus_TC54245
14. H.vulgare_TC165564
15. H.vulgare_TC186077
16. Hordeum_vulgare_PUT-169a-Hordeum_vulgare-79158
17. Hordeum_vulgare_subsp_vulgare_AK251931
18. M.truncatula_AC159872_36
19. M.truncatula_AC159872_55
20. M.truncatula_TC114768
21. O.sativa_LOC_Os03g12890
22. O.sativa_LOC_Os04g47190
23. O.sativa_LOC_Os05g48450
24. O.sativa_Os03g0106400
25. P.patens_TC31354
26. P.patens_TC33668
27. P.trichocarpa_804339
28. P.trichocarpa_scaff_II.1054
29. P.trichocarpa_scaff_IX.827
30. S.lycopersicum_TC213629
31. P.trichocarpa_BCAT4-like
32. T.aestivum_TC320973
33. T.aestivum_TC325793
34. Z.mays_TA12434_4577999
35. Zea_mays_GRMZM2G047347_T03
36. Zea_mays_GRMZM2G153536_T03

Example 4

Identification of Domains Comprised in Polypeptide Sequences Useful in Performing the Methods of the Invention

[0417] The Integrated Resource of Protein Families, Domains and Sites (InterPro) database is an integrated interface for the commonly used signature databases for text- and sequence-based searches. The InterPro database combines these databases, which use different methodologies and varying degrees of biological information about well-characterized proteins to derive protein signatures. Collaborating databases include SWISS-PROT, PROSITE, TrEMBL, PRINTS, ProDom and Pfam, Smart and TIGRFAMs. Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families. Pfam is hosted at the Sanger Institute server in the United Kingdom. InterPro is hosted at the European Bioinformatics Institute in the United Kingdom.

1. bZIP-Like Polypeptides

[0418] The results of the InterPro scan (InterPro database, release 31.0) of the polypeptide sequences as represented by SEQ ID NO: 2 and 4 are presented in Tables C1 and C2, respectively.

TABLE-US-00016 TABLE C1 InterPro scan results (major accession numbers) of the polypeptide sequence as represented by SEQ ID NO: 2.

InterPro ID  Domain name       Short name                                            Description                                        Location
IPR004827    SM00338           no description                                        Basic-leucine zipper (bZIP) transcription factor   186-250
             PS50217           BZIP                                                  Basic-leucine zipper (bZIP) transcription factor   188-251
             PS00036           BZIP_BASIC                                            Basic-leucine zipper (bZIP) transcription factor   193-208
IPR011616    PF00170           bZIP_1                                                bZIP transcription factor, bZIP-1                  186-240
IPR012900    PF07777           MFMR                                                  G-box binding, MFMR                                001-099
No IPR ID    G3DSA:1.20.5.170  no description                                        NULL                                               180-246
             PTHR22952:SF5     CYCLIC-AMP-DEPENDENT TRANSCRIPTION FACTOR ATF-6 BETA                                                     158-232
             PTHR22952         CAMP-RESPONSE ELEMENT BINDING PROTEIN-RELATED                                                            158-232

TABLE-US-00017 TABLE C2 InterPro scan results (major accession numbers) of the polypeptide sequence as represented by SEQ ID NO: 4.

InterPro ID  Domain name       Short name                                                       Description                                        Location
NULL         G3DSA:1.20.5.170  no description                                                   NULL                                               254-319
IPR004827    PS00036           BZIP_BASIC                                                       Basic-leucine zipper (bZIP) transcription factor   267-282
IPR012900    PF07777           MFMR                                                             G-box binding, MFMR                                1-175
IPR011616    PF00170           bZIP_1                                                           bZIP transcription factor, bZIP-1                  262-322
             PTHR22952:SF73    GBF2 (G-BOX BINDING FACTOR 2); DNA BINDING/TRANSCRIPTION FACTOR  NULL                                               12-333
             PTHR22952         FAMILY NOT NAMED                                                 NULL                                               12-333
IPR004827    SM00338           no description                                                   Basic-leucine zipper (bZIP) transcription factor   260-324
IPR004827    PS50217           BZIP                                                             Basic-leucine zipper (bZIP) transcription factor   262-325

[0419] In an embodiment a bZIP-like polypeptide comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid M1 up to amino acid N99 in SEQ ID NO: 2 (which corresponds to the conserved domain starting with amino acid M1 up to amino acid R175 in SEQ ID NO: 4).

2. BCAT4-Like Polypeptides

[0420] SEQ ID NO: 142 was checked against the NCBI database for conserved domains and it was found that SEQ ID NO: 142 is part of the BCAT_beta_family and part of the multidomains PLN02782.

[0421] In an embodiment a BCAT4-like polypeptide comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid 185 up to amino acid 202 in SEQ ID NO: 142, or to the conserved domain starting with amino acid 150 up to amino acid 167 in SEQ ID NO: 142, or to the conserved domain starting with amino acid 126 up to amino acid 143 in SEQ ID NO: 142.

Example 5

Topology Prediction of the Polypeptide Sequences Useful in Performing the Methods of the Invention

[0423] TargetP 1.1 predicts the subcellular location of eukaryotic proteins. The location assignment is based on the predicted presence of any of the N-terminal pre-sequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP). Scores on which the final prediction is based are not really probabilities, and they do not necessarily add to one. However, the location with the highest score is the most likely according to TargetP, and the relationship between the scores (the reliability class) may be an indication of how certain the prediction is. The reliability class (RC) ranges from 1 to 5, where 1 indicates the strongest prediction. For the sequences predicted to contain an N-terminal presequence a potential cleavage site can also be predicted. TargetP is maintained at the server of the Technical University of Denmark.

[0424] A number of parameters must be selected before analysing a sequence, such as organism group (non-plant or plant), cutoff sets (none, predefined set of cutoffs, or user-specified set of cutoffs), and the calculation of prediction of cleavage sites (yes or no).

[0425] Many other algorithms can be used to perform such analyses, including:

[0426] ChloroP 1.1 hosted on the server of the Technical University of Denmark;

[0427] Protein Prowler Subcellular Localisation Predictor version 1.2 hosted on the server of the Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia;

[0428] PENCE Proteome Analyst PA-GOSUB 2.5 hosted on the server of the University of Alberta, Edmonton, Alberta, Canada;

[0429] TMHMM, hosted on the server of the Technical University of Denmark;

[0430] PSORT (URL: psort.org);

[0431] PLOC (Park and Kanehisa, Bioinformatics, 19, 1656-1663, 2003).

1. bZIP-Like Polypeptides

[0432] For the sequences predicted to contain an N-terminal presequence a potential cleavage site can also be predicted.

[0433] A number of parameters were selected, such as organism group (non-plant or plant), cutoff sets (none, predefined set of cutoffs, or user-specified set of cutoffs), and the calculation of prediction of cleavage sites (yes or no).

[0434] The results of TargetP 1.1 analysis of the polypeptide sequence as represented by SEQ ID NO: 2 are presented in Table D. The "plant" organism group has been selected, no cutoffs defined, and the predicted length of the transit peptide requested. The subcellular localization of the polypeptide sequence as represented by SEQ ID NO: 2 may be the cytoplasm or nucleus; no transit peptide is predicted.

TABLE-US-00018
TABLE D TargetP 1.1 analysis of the polypeptide sequence as represented by SEQ ID NO: 2 and 4.

Name           Len   cTP     mTP     SP      other   Loc   RC   TPlen
SEQ ID NO: 2   284   0.144   0.279   0.067   0.468   --    5    --
cutoff               0.000   0.000   0.000   0.000
SEQ ID NO: 4   399   0.238   0.053   0.045   0.940   --    2    --
cutoff               0.000   0.000   0.000   0.000

Abbreviations: Len, Length; cTP, Chloroplastic transit peptide; mTP, Mitochondrial transit peptide; SP, Secretory pathway signal peptide; other, Other subcellular targeting; Loc, Predicted Location; RC, Reliability class; TPlen, Predicted transit peptide length.
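The relationship between the scores and the reliability class in Table D may be illustrated by the following minimal sketch. It assumes that the reliability class is derived from the difference between the highest and the second-highest score in steps of 0.2 (RC 1 for a difference above 0.8, RC 5 for a difference below 0.2), which is how the TargetP output is commonly described; the function name and data layout are hypothetical.

```python
def predict_location(scores: dict) -> tuple:
    """Return (location with the highest score, reliability class 1-5).

    Assumption: RC depends only on the gap between the two best scores,
    binned in steps of 0.2 (a larger gap gives a stronger prediction).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_location, best_score = ranked[0]
    second_score = ranked[1][1]
    diff = best_score - second_score
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return best_location, rc


# Scores copied from Table D.
seq_id_2 = {"cTP": 0.144, "mTP": 0.279, "SP": 0.067, "other": 0.468}
seq_id_4 = {"cTP": 0.238, "mTP": 0.053, "SP": 0.045, "other": 0.940}

result_2 = predict_location(seq_id_2)
result_4 = predict_location(seq_id_4)
```

Under this assumption both sequences are assigned to "other" (no N-terminal presequence), with reliability classes 5 and 2 respectively, in agreement with the RC column of Table D.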

Example 6

Functional Assay Related to the Polypeptide Sequences Useful in Performing the Methods of the Invention

[0435] Gel-shift analysis of GmbZIP DNA-binding ability (Liao et al., 2008)

[0436] Three copies of GLM (GTGAGTCAT) (Onate et al., J Biol Chem, 274:9175-9182, 1999), ABRE (CCACGTGG) (Jakoby et al., Trends Plant Sci 7:106-111, 2002), and PB-like (TGAAAA) (Onate et al. 1999) elements are synthesised using standard techniques. Two complementary single-stranded oligonucleotides are mixed in 50 mM NaCl, heated at 70° C. for 5 min and then cooled slowly to room temperature. Each annealed element is labeled with [gamma-³²P]ATP (about 110 TBq/mmol, Amersham) using T4 polynucleotide kinase and used as a probe. Gel-shift analysis is performed with the ³²P-labeled probe and 2 pg of recombinant protein. The competition experiment is performed by adding a 50-fold excess of unlabeled element in addition to the ³²P-labeled probe. After incubation for 30 min at 25° C., the mixtures are subjected to electrophoresis in a 6% (w/v) polyacrylamide gel on ice. The gel is placed on a sheet of Whatman 3 MM filter paper, covered with plastic wrap and exposed to X-ray film overnight at -70° C. with an intensifying screen.
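The design of the two complementary oligonucleotides for each probe may be sketched as follows. The sketch assumes that the three copies of each element are arranged in direct tandem without spacer or flanking nucleotides, which the text above does not specify; the function names are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def tandem_probe(element: str, copies: int = 3) -> tuple:
    """Return (sense oligo, antisense oligo) for a tandem repeat of the element.

    Assumption: copies are joined head to tail with no spacer.
    """
    sense = element * copies
    return sense, reverse_complement(sense)


# Elements as given in the text.
ELEMENTS = {"GLM": "GTGAGTCAT", "ABRE": "CCACGTGG", "PB-like": "TGAAAA"}
PROBES = {name: tandem_probe(seq) for name, seq in ELEMENTS.items()}
```

Because the ABRE element is palindromic, its sense and antisense oligonucleotides are identical under this assumption, so a single oligonucleotide would anneal to itself.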

Example 7

Cloning of the Nucleic Acid Sequence Used in Methods of the Invention

[0437] 1. bZIP-Like Polypeptides

[0438] The nucleic acid sequence of SEQ ID NO: 1 was amplified by PCR using as template a custom-made Solanum lycopersicum seedlings cDNA library. PCR was performed using a commercially available proofreading Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were prm9943 (SEQ ID NO: 134; sense, start codon in bold): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatgcctccttatgggactc-3' and prm9944 (SEQ ID NO: 135; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggtgcttttccacttctccttaac-3', which include the AttB sites for Gateway recombination. Similarly, the nucleic acid sequence of SEQ ID NO: 3 was amplified by PCR using as template a custom-made Populus trichocarpa seedlings cDNA library using primers prm17402: 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatgggaaacattgaagaggg-3' (SEQ ID NO: 136) and prm17403: 5'-ggggaccactttgtacaagaaagctgggttgaaccagtgtcatcaaccag-3' (SEQ ID NO: 137).
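The structure of these primers may be illustrated by separating the recombination adapter from the remainder of each primer. The sketch below assumes that the adapter is the 29-nucleotide prefix shared by the sense primers (attB1) and by the reverse primers (attB2) quoted above; the exact adapter boundary, the constant names and the function names are assumptions made for illustration.

```python
# Assumed adapter prefixes: four g residues followed by the attB1/attB2 site.
ATTB1 = "ggggacaagtttgtacaaaaaagcaggct"
ATTB2 = "ggggaccactttgtacaagaaagctgggt"


def split_attb_primer(primer: str, attb: str) -> tuple:
    """Split a primer into (adapter, remainder), ignoring stray whitespace."""
    seq = "".join(primer.split()).lower()
    if not seq.startswith(attb):
        raise ValueError("primer does not start with the expected attB adapter")
    return attb, seq[len(attb):]


def start_codon_offset(remainder: str) -> int:
    """Position of the first 'atg' in the remainder of a sense primer, or -1."""
    return remainder.find("atg")


# Primers prm9943 and prm9944 as quoted in the text.
prm9943 = "ggggacaagtttgtacaaaaaagcaggcttaaacaatgcctccttatgggactc"
prm9944 = "ggggaccactttgtacaagaaagctgggtgcttttccacttctccttaac"

_, sense_tail = split_attb_primer(prm9943, ATTB1)
_, reverse_tail = split_attb_primer(prm9944, ATTB2)
offset = start_codon_offset(sense_tail)
```

For prm9943 the remainder consists of a six-nucleotide linker (taaaca) followed by the start codon and the gene-specific sequence; for prm9944 the remainder is the gene-specific reverse sequence.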

[0439] The amplified PCR fragments were also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment (comprising either SEQ ID NO: 1 or SEQ ID NO: 3) recombined in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone", pbZIP-like. Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.

[0440] The entry clone comprising SEQ ID NO: 1 or SEQ ID NO: 3 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 131) for constitutive expression was located upstream of this Gateway cassette.

[0441] After the LR recombination step, the resulting expression vector pGOS2::bZIP-like (FIG. 4) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.

2. BCAT4-Like Polypeptides

[0442] The nucleic acid sequence was amplified by PCR using as template a custom-made Populus trichocarpa seedlings cDNA library. PCR was performed using a commercially available proofreading Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were prm15099 (SEQ ID NO: 219; sense): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatggagagaagcgccgt-3' and prm15100 (SEQ ID NO: 220; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggttcactgcagtacgcctaactc-3', which include the AttB sites for Gateway recombination. The amplified PCR fragment was also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombined in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone", pBCAT4-like. Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.

[0443] The entry clone comprising SEQ ID NO: 141 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 218) for constitutive expression was located upstream of this Gateway cassette.

[0444] After the LR recombination step, the resulting expression vector pGOS2::BCAT4-like (FIG. 9) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.

Example 8

Plant Transformation

Rice Transformation

[0445] The Agrobacterium containing the expression vector was used to transform Oryza sativa plants. Mature dry seeds of the rice japonica cultivar Nipponbare were dehusked. Sterilization was carried out by incubating for one minute in 70% ethanol, followed by 30 to 60 minutes, preferably 30 minutes, in sodium hypochlorite solution (depending on the grade of contamination), followed by 3 to 6, preferably 4, washes with sterile distilled water. The sterile seeds were then germinated on a medium containing 2,4-D (callus induction medium). After incubation in light for 6 days, scutellum-derived calli were transformed with Agrobacterium as described herein below.

[0446] Agrobacterium strain LBA4404 containing the expression vector was used for co-cultivation. Agrobacterium was inoculated on AB medium with the appropriate antibiotics and cultured for 3 days at 28° C. The bacteria were then collected and suspended in liquid co-cultivation medium to a density (OD₆₀₀) of about 1. The calli were immersed in the suspension for 1 to 15 minutes. The callus tissues were then blotted dry on a filter paper, transferred to solidified co-cultivation medium and incubated for 3 days in the dark at 25° C. After washing away the Agrobacterium, the calli were grown on 2,4-D-containing medium for 10 to 14 days (growth time for indica: 3 weeks) under light at 28° C.-32° C. in the presence of a selection agent. During this period, rapidly growing resistant callus developed. After transfer of this material to regeneration media, the embryogenic potential was released and shoots developed in the next four to six weeks. Shoots were excised from the calli and incubated for 2 to 3 weeks on an auxin-containing medium from which they were transferred to soil. Hardened shoots were grown under high humidity and short days in a greenhouse.

[0447] Transformation of rice cultivar indica can also be done in a similar way as given above according to techniques well known to a skilled person.

[0448] 35 to 90 independent T0 rice transformants were generated for one construct. The primary transformants were transferred from a tissue culture chamber to a greenhouse. After a quantitative PCR analysis to verify copy number of the T-DNA insert, only single copy transgenic plants that exhibit tolerance to the selection agent were kept for harvest of T1 seed. Seeds were then harvested three to five months after transplanting. The method yielded single locus transformants at a rate of over 50% (Aldemita and Hodges 1996, Chan et al. 1993, Hiei et al. 1994).

Example 9

Transformation of Other Crops

Corn Transformation

[0449] Transformation of maize (Zea mays) is performed with a modification of the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. Transformation is genotype-dependent in corn and only specific genotypes are amenable to transformation and regeneration. The inbred line A188 (University of Minnesota) or hybrids with A188 as a parent are good sources of donor material for transformation, but other genotypes can be used successfully as well. Ears are harvested from corn plants approximately 11 days after pollination (DAP) when the length of the immature embryo is about 1 to 1.2 mm. Immature embryos are cocultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. Excised embryos are grown on callus induction medium, then maize regeneration medium, containing the selection agent (for example imidazolinone, but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to maize rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Wheat Transformation

[0450] Transformation of wheat is performed with the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. The cultivar Bobwhite (available from CIMMYT, Mexico) is commonly used in transformation. Immature embryos are co-cultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. After incubation with Agrobacterium, the embryos are grown in vitro on callus induction medium, then regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Soybean Transformation

[0451] Soybean is transformed according to a modification of the method described in the Texas A&M patent U.S. Pat. No. 5,164,310. Several commercial soybean varieties are amenable to transformation by this method. The cultivar Jack (available from the Illinois Seed foundation) is commonly used for transformation. Soybean seeds are sterilised for in vitro sowing. The hypocotyl, the radicle and one cotyledon are excised from seven-day old young seedlings. The epicotyl and the remaining cotyledon are further grown to develop axillary nodes. These axillary nodes are excised and incubated with Agrobacterium tumefaciens containing the expression vector. After the cocultivation treatment, the explants are washed and transferred to selection media. Regenerated shoots are excised and placed on a shoot elongation medium. Shoots no longer than 1 cm are placed on rooting medium until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Rapeseed/Canola Transformation

[0452] Cotyledonary petioles and hypocotyls of 5-6 day old young seedlings are used as explants for tissue culture and transformed according to Babic et al. (1998, Plant Cell Rep 17: 183-188). The commercial cultivar Westar (Agriculture Canada) is the standard variety used for transformation, but other varieties can also be used. Canola seeds are surface-sterilized for in vitro sowing. The cotyledon petiole explants with the cotyledon attached are excised from the in vitro seedlings, and inoculated with Agrobacterium (containing the expression vector) by dipping the cut end of the petiole explant into the bacterial suspension. The explants are then cultured for 2 days on MSBAP-3 medium containing 3 mg/l BAP, 3% sucrose, 0.7% Phytagar at 23° C., 16 hr light. After two days of co-cultivation with Agrobacterium, the petiole explants are transferred to MSBAP-3 medium containing 3 mg/l BAP, cefotaxime, carbenicillin, or timentin (300 mg/l) for 7 days, and then cultured on MSBAP-3 medium with cefotaxime, carbenicillin, or timentin and selection agent until shoot regeneration. When the shoots are 5-10 mm in length, they are cut and transferred to shoot elongation medium (MSBAP-0.5, containing 0.5 mg/l BAP). Shoots of about 2 cm in length are transferred to the rooting medium (MS0) for root induction. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Alfalfa Transformation

[0453] A regenerating clone of alfalfa (Medicago sativa) is transformed using the method of McKersie et al. (1999, Plant Physiol 119: 839-847). Regeneration and transformation of alfalfa is genotype dependent and therefore a regenerating plant is required. Methods to obtain regenerating plants have been described. For example, these can be selected from the cultivar Rangelander (Agriculture Canada) or any other commercial alfalfa variety as described by Brown DCW and A Atanassov (1985. Plant Cell Tissue Organ Culture 4: 111-112). Alternatively, the RA3 variety (University of Wisconsin) has been selected for use in tissue culture (Walker et al., 1978 Am J Bot 65:654-659). Petiole explants are cocultivated with an overnight culture of Agrobacterium tumefaciens C58C1 pMP90 (McKersie et al., 1999 Plant Physiol 119: 839-847) or LBA4404 containing the expression vector. The explants are cocultivated for 3 d in the dark on SH induction medium containing 288 mg/L Pro, 53 mg/L thioproline, 4.35 g/L K2SO4, and 100 μM acetosyringone. The explants are washed in half-strength Murashige-Skoog medium (Murashige and Skoog, 1962) and plated on the same SH induction medium without acetosyringone but with a suitable selection agent and suitable antibiotic to inhibit Agrobacterium growth. After several weeks, somatic embryos are transferred to BOi2Y development medium containing no growth regulators, no antibiotics, and 50 g/L sucrose. Somatic embryos are subsequently germinated on half-strength Murashige-Skoog medium. Rooted seedlings are transplanted into pots and grown in a greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.

Cotton Transformation

[0454] Cotton is transformed using Agrobacterium tumefaciens according to the method described in U.S. Pat. No. 5,159,135. Cotton seeds are surface sterilised in 3% sodium hypochlorite solution for 20 minutes and washed in distilled water with 500 μg/ml cefotaxime. The seeds are then transferred to SH-medium with 50 μg/ml benomyl for germination. Hypocotyls of 4- to 6-day-old seedlings are removed, cut into 0.5 cm pieces and placed on 0.8% agar. An Agrobacterium suspension (approx. 10⁸ cells per ml, diluted from an overnight culture transformed with the gene of interest and suitable selection markers) is used for inoculation of the hypocotyl explants. After 3 days at room temperature and lighting, the tissues are transferred to a solid medium (1.6 g/l Gelrite) with Murashige and Skoog salts with B5 vitamins (Gamborg et al., Exp. Cell Res. 50:151-158 (1968)), 0.1 mg/l 2,4-D, 0.1 mg/l 6-furfurylaminopurine and 750 μg/ml MgCl2, and with 50 to 100 μg/ml cefotaxime and 400-500 μg/ml carbenicillin to kill residual bacteria. Individual cell lines are isolated after two to three months (with subcultures every four to six weeks) and are further cultivated on selective medium for tissue amplification (30° C., 16 hr photoperiod). Transformed tissues are subsequently further cultivated on non-selective medium for 2 to 3 months to give rise to somatic embryos. Healthy looking embryos of at least 4 mm length are transferred to tubes with SH medium in fine vermiculite, supplemented with 0.1 mg/l indole acetic acid, 6-furfurylaminopurine and gibberellic acid. The embryos are cultivated at 30° C. with a photoperiod of 16 hrs, and plantlets at the 2 to 3 leaf stage are transferred to pots with vermiculite and nutrients. The plants are hardened and subsequently moved to the greenhouse for further cultivation.

Sugarbeet Transformation

[0455] Seeds of sugarbeet (Beta vulgaris L.) are sterilized in 70% ethanol for one minute followed by 20 min. shaking in 20% Hypochlorite bleach e.g. Clorox® regular bleach (commercially available from Clorox, 1221 Broadway, Oakland, Calif. 94612, USA). Seeds are rinsed with sterile water and air dried followed by plating onto germinating medium (Murashige and Skoog (MS) based medium (Murashige, T., and Skoog, . . . , 1962. Physiol. Plant, vol. 15, 473-497) including B5 vitamins (Gamborg et al.; Exp. Cell Res., vol. 50, 151-8.) supplemented with 10 g/l sucrose and 0.8% agar). Hypocotyl tissue is used essentially for the initiation of shoot cultures according to Hussey and Hepher (Hussey, G., and Hepher, A., 1978. Annals of Botany, 42, 477-9) and are maintained on MS based medium supplemented with 30 g/l sucrose plus 0.25 mg/l benzylamino purine and 0.75% agar, pH 5.8 at 23-25° C. with a 16-hour photoperiod. Agrobacterium tumefaciens strain carrying a binary plasmid harbouring a selectable marker gene, for example nptII, is used in transformation experiments. One day before transformation, a liquid LB culture including antibiotics is grown on a shaker (28° C., 150 rpm) until an optical density (O.D.) at 600 nm of ˜1 is reached. Overnight-grown bacterial cultures are centrifuged and resuspended in inoculation medium (O.D. ˜1) including Acetosyringone, pH 5.5. Shoot base tissue is cut into slices (1.0 cm×1.0 cm×2.0 mm approximately). Tissue is immersed for 30 s in liquid bacterial inoculation medium. Excess liquid is removed by filter paper blotting. Co-cultivation occurred for 24-72 hours on MS based medium incl. 30 g/l sucrose followed by a non-selective period including MS based medium, 30 g/l sucrose with 1 mg/l BAP to induce shoot development and cefotaxim for eliminating the Agrobacterium. After 3-10 days explants are transferred to similar selective medium harbouring for example kanamycin or G418 (50-100 mg/l genotype dependent). 
Tissues are transferred to fresh medium every 2-3 weeks to maintain selection pressure. The very rapid initiation of shoots (after 3-4 days) indicates regeneration of existing meristems rather than organogenesis of newly developed transgenic meristems. Small shoots are transferred after several rounds of subculture to root induction medium containing 5 mg/l NAA and kanamycin or G418. Additional steps are taken to reduce the potential of generating transformed plants that are chimeric (partially transgenic). Tissue samples from regenerated shoots are used for DNA analysis. Other transformation methods for sugarbeet are known in the art, for example those by Linsey & Gallois (Linsey, K., and Gallois, P., 1990. Journal of Experimental Botany; vol. 41, No. 226; 529-36) or the methods published in the international application published as WO9623891A.

Sugarcane Transformation

[0456] Spindles are isolated from 6-month-old field grown sugarcane plants (Arencibia et al., 1998. Transgenic Research, vol. 7, 213-22; Enriquez-Obregon et al., 1998. Planta, vol. 206, 20-27). Material is sterilized by immersion in a 20% Hypochlorite bleach e.g. Clorox® regular bleach (commercially available from Clorox, 1221 Broadway, Oakland, Calif. 94612, USA) for 20 minutes. Transverse sections around 0.5 cm are placed on the medium in the top-up direction. Plant material is cultivated for 4 weeks on MS (Murashige, T., and Skoog, 1962. Physiol. Plant, vol. 15, 473-497) based medium incl. B5 vitamins (Gamborg, O., et al., 1968. Exp. Cell Res., vol. 50, 151-8) supplemented with 20 g/l sucrose, 500 mg/l casein hydrolysate, 0.8% agar and 5 mg/l 2,4-D at 23° C. in the dark. Cultures are transferred after 4 weeks onto identical fresh medium. Agrobacterium tumefaciens strain carrying a binary plasmid harbouring a selectable marker gene, for example hpt, is used in transformation experiments. One day before transformation, a liquid LB culture including antibiotics is grown on a shaker (28° C., 150 rpm) until an optical density (O.D.) at 600 nm of ˜0.6 is reached. Overnight-grown bacterial cultures are centrifuged and resuspended in MS based inoculation medium (O.D. ˜0.4) including acetosyringone, pH 5.5. Sugarcane embryogenic callus pieces (2-4 mm) are isolated based on morphological characteristics as compact structure and yellow colour and dried for 20 min. in the flow hood followed by immersion in a liquid bacterial inoculation medium for 10-20 minutes. Excess liquid is removed by filter paper blotting. Co-cultivation occurred for 3-5 days in the dark on filter paper which is placed on top of MS based medium incl. B5 vitamins containing 1 mg/l 2,4-D. After co-cultivation calli are washed with sterile water followed by a non-selective cultivation period on similar medium containing 500 mg/l cefotaxime for eliminating remaining Agrobacterium cells. 
After 3-10 days explants are transferred to MS based selective medium incl. B5 vitamins containing 1 mg/l 2,4-D for another 3 weeks harbouring 25 mg/l of hygromycin (genotype dependent). All treatments are made at 23° C. under dark conditions. Resistant calli are further cultivated on medium lacking 2,4-D including 1 mg/l BA and 25 mg/l hygromycin under 16 h light photoperiod resulting in the development of shoot structures. Shoots are isolated and cultivated on selective rooting medium (MS based including, 20 g/l sucrose, 20 mg/l hygromycin and 500 mg/l cefotaxime). Tissue samples from regenerated shoots are used for DNA analysis. Other transformation methods for sugarcane are known in the art, for example from the international application published as WO2010/151634A and the granted European patent EP1831378.

Example 10

Phenotypic Evaluation Procedure

10.1 Evaluation Setup

[0457] 35 to 90 independent T0 rice transformants were generated. The primary transformants were transferred from a tissue culture chamber to a greenhouse for growing and harvest of T1 seed. Six events, of which the T1 progeny segregated 3:1 for presence/absence of the transgene, were retained. For each of these events, approximately 10 T1 seedlings containing the transgene (hetero- and homo-zygotes) and approximately 10 T1 seedlings lacking the transgene (nullizygotes) were selected by monitoring visual marker expression. The transgenic plants and the corresponding nullizygotes were grown side-by-side at random positions. Greenhouse conditions were of short days (12 hours light), 28° C. in the light and 22° C. in the dark, and a relative humidity of 70%. Plants grown under non-stress conditions were watered at regular intervals to ensure that water and nutrients were not limiting and to satisfy plant needs to complete growth and development, unless they were used in a stress screen.
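One common way to check whether the T1 progeny of an event is consistent with a 3:1 segregation is a chi-square goodness-of-fit test, sketched below. The text above does not state which test was applied, so the choice of test, the 5% critical value for one degree of freedom (3.841), the function names and the example counts are assumptions.

```python
# 5% critical value of the chi-square distribution with 1 degree of freedom.
CHI2_CRITICAL_5PCT_DF1 = 3.841


def chi_square_3_to_1(n_transgenic: int, n_null: int) -> float:
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = n_transgenic + n_null
    if total <= 0:
        raise ValueError("at least one seedling must be scored")
    expected_transgenic = 0.75 * total
    expected_null = 0.25 * total
    return (
        (n_transgenic - expected_transgenic) ** 2 / expected_transgenic
        + (n_null - expected_null) ** 2 / expected_null
    )


def consistent_with_3_to_1(n_transgenic: int, n_null: int) -> bool:
    """True if the 3:1 hypothesis is not rejected at the 5% level."""
    return chi_square_3_to_1(n_transgenic, n_null) < CHI2_CRITICAL_5PCT_DF1


# Hypothetical marker counts for two events.
event_a = consistent_with_3_to_1(31, 9)   # close to 3:1
event_b = consistent_with_3_to_1(20, 20)  # close to 1:1
```

An event such as the hypothetical event_a would be retained under this criterion, whereas event_b would not.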

[0458] From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.

[0459] T1 events can be further evaluated in the T2 generation following the same evaluation procedure as for the T1 generation, e.g. with fewer events and/or with more individuals per event.

Drought Screen

[0460] T1 or T2 plants were grown in potting soil under normal conditions until they approached the heading stage. They were then transferred to a "dry" section where irrigation was withheld. Soil moisture probes were inserted in randomly chosen pots to monitor the soil water content (SWC). When SWC went below certain thresholds, the plants were automatically re-watered continuously until a normal level was reached again. The plants were then transferred back to normal conditions. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress conditions. Growth and yield parameters were recorded as detailed for growth under normal conditions.

Nitrogen Use Efficiency Screen (bZIP-Like Polypeptides)

[0461] T1 or T2 plants were grown in potting soil under normal conditions except for the nutrient solution. The pots were watered from transplantation to maturation with a specific nutrient solution with a reduced nitrogen (N) content, usually 7 to 8 times lower. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress. Growth and yield parameters were recorded as detailed for growth under normal conditions.

Salt Stress Screen

[0462] T1 or T2 plants are grown on a substrate made of coco fibers and particles of baked clay (Argex) (3 to 1 ratio). A normal nutrient solution is used during the first two weeks after transplanting the plantlets in the greenhouse. After the first two weeks, 25 mM of salt (NaCl) is added to the nutrient solution, until the plants are harvested. Growth and yield parameters are recorded as detailed for growth under normal conditions.

10.2 Statistical Analysis: F Test

[0463] A two factor ANOVA (analysis of variance) was used as a statistical model for the overall evaluation of plant phenotypic characteristics. An F test was carried out on all the parameters measured of all the plants of all the events transformed with the gene of the present invention. The F test was carried out to check for an effect of the gene over all the transformation events and to verify an overall effect of the gene, also known as a global gene effect. The threshold for significance for a true global gene effect was set at a 5% probability level for the F test. A significant F test value points to a gene effect, meaning that it is not only the mere presence or position of the gene that is causing the differences in phenotype.
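A minimal sketch of the F statistic for the gene effect in a two factor ANOVA is given below. It assumes a balanced design with the factors "event" and "genotype" (transgenic or nullizygote), the same number of plants in every event-by-genotype cell, and an error term taken from the within-cell variation; the exact model used for the evaluation is not specified above, and the function name, data layout and example values are hypothetical.

```python
from statistics import mean


def gene_effect_f(data: dict) -> tuple:
    """Return (F, df_gene, df_error) for the genotype main effect.

    data maps event name -> {"transgenic": [...], "null": [...]}.
    Assumption: balanced design with at least two plants per cell.
    """
    events = list(data)
    genotypes = ["transgenic", "null"]
    n = len(data[events[0]][genotypes[0]])
    if n < 2 or any(len(data[e][g]) != n for e in events for g in genotypes):
        raise ValueError("balanced design with at least 2 plants per cell required")

    a, b = len(events), len(genotypes)
    grand = mean(y for e in events for g in genotypes for y in data[e][g])

    # Between-genotype sum of squares, pooled over all events.
    ss_gene = a * n * sum(
        (mean(y for e in events for y in data[e][g]) - grand) ** 2
        for g in genotypes
    )
    # Within-cell (error) sum of squares.
    ss_error = sum(
        (y - mean(data[e][g])) ** 2
        for e in events for g in genotypes for y in data[e][g]
    )
    if ss_error == 0:
        raise ValueError("no within-cell variation; F is undefined")

    df_gene = b - 1
    df_error = a * b * (n - 1)
    f_value = (ss_gene / df_gene) / (ss_error / df_error)
    return f_value, df_gene, df_error


# Hypothetical measurements for two events with two plants per cell.
example = {
    "event1": {"transgenic": [12.0, 14.0], "null": [10.0, 10.0]},
    "event2": {"transgenic": [13.0, 13.0], "null": [9.0, 11.0]},
}
f_value, df_gene, df_error = gene_effect_f(example)
```

The resulting F value would then be compared with the 5% critical value of the F distribution for (df_gene, df_error) degrees of freedom, taken from a statistical table or a statistics library.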

10.3 Parameters Measured

[0464] From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles as described in WO 2010/031780. These measurements were used to determine different parameters.

Biomass-Related Parameter Measurement

[0465] The plant aboveground area (or leafy biomass) was determined by counting the total number of pixels on the digital images from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from the different angles and was converted to a physical surface value expressed in square mm by calibration. Experiments show that the aboveground plant area measured this way correlates with the biomass of plant parts above ground. The above ground area is the area measured at the time point at which the plant had reached its maximal leafy biomass.
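The conversion from pixel counts to a physical surface value may be sketched as follows. The calibration factor (square mm per pixel), the function names and the example counts are hypothetical; the actual calibration of the imaging cabinet is not given here.

```python
def aboveground_area_mm2(pixel_counts: list, mm2_per_pixel: float) -> float:
    """Average the plant pixel counts from all angles at one time point
    and convert the mean to square mm with a calibration factor."""
    if not pixel_counts:
        raise ValueError("at least one image is required")
    mean_pixels = sum(pixel_counts) / len(pixel_counts)
    return mean_pixels * mm2_per_pixel


def area_max(areas_over_time: list) -> float:
    """Aboveground area at the time point of maximal leafy biomass."""
    return max(areas_over_time)


# Hypothetical pixel counts from six angles at two time points.
CALIBRATION = 0.25  # assumed square mm per pixel
time_point_1 = aboveground_area_mm2([1000, 1200, 1100, 900, 1000, 800], CALIBRATION)
time_point_2 = aboveground_area_mm2([2000, 2400, 2200, 1800, 2000, 1600], CALIBRATION)
maximum = area_max([time_point_1, time_point_2])
```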

[0466] Increase in root biomass is expressed as an increase in total root biomass (measured as maximum biomass of roots observed during the lifespan of a plant); or as an increase in the root/shoot index, measured as the ratio between root mass and shoot mass in the period of active growth of root and shoot. In other words, the root/shoot index is defined as the ratio of the rapidity of root growth to the rapidity of shoot growth in the period of active growth of root and shoot. Root biomass can be determined using a method as described in WO 2006/029987.

Parameters Related to Development Time

[0467] The early vigour is the plant aboveground area three weeks post-germination. Early vigour was determined by counting the total number of pixels from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from different angles and was converted to a physical surface value expressed in square mm by calibration.

[0468] AreaEmer is an indication of quick early development when this value is decreased compared to control plants. It is the ratio (expressed in %) between the time a plant needs to make 30% of its final biomass and the time it needs to make 90% of its final biomass.
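The AreaEmer ratio may be computed from a growth curve as in the following sketch. It assumes that the final biomass is approximated by the maximum aboveground area in the series and that the time at which a given fraction is reached is obtained by linear interpolation between imaging time points; the function names and example values are hypothetical.

```python
def time_to_fraction(days: list, areas: list, fraction: float) -> float:
    """Time at which the area first reaches the given fraction of its maximum,
    using linear interpolation between consecutive measurements."""
    target = fraction * max(areas)
    for i, area in enumerate(areas):
        if area >= target:
            if i == 0:
                return float(days[0])
            d0, d1 = days[i - 1], days[i]
            a0, a1 = areas[i - 1], areas[i]
            # a1 > a0 here because the previous point was still below target.
            return d0 + (target - a0) * (d1 - d0) / (a1 - a0)
    raise ValueError("target fraction is never reached")


def area_emer(days: list, areas: list) -> float:
    """Ratio (in %) of the time to 30% and the time to 90% of final biomass."""
    return 100.0 * time_to_fraction(days, areas, 0.30) / time_to_fraction(days, areas, 0.90)


# Hypothetical growth curve: days after sowing and aboveground area (square mm).
days = [0, 10, 20, 30, 40]
areas = [0.0, 100.0, 300.0, 900.0, 1000.0]
value = area_emer(days, areas)
```

A lower value than that of control plants indicates that the early part of the growth curve is reached comparatively quickly.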

[0469] The "time to flower" or "flowering time" of the plant can be determined using the method as described in WO 2007/093444.

Seed-Related Parameter Measurements

[0470] The mature primary panicles were harvested, counted, bagged, barcode-labelled and then dried for three days in an oven at 37° C. The panicles were then threshed and all the seeds were collected and counted. The seeds are usually covered by a dry outer covering, the husk. The filled husks (herein also named filled florets) were separated from the empty ones using an air-blowing device. The empty husks were discarded and the remaining fraction was counted again. The filled husks were weighed on an analytical balance.

[0471] The number of filled seeds was determined by counting the number of filled husks that remained after the separation step. The total seed weight was measured by weighing all filled husks harvested from a plant.

[0472] The total number of seeds (or florets) per plant was determined by counting the number of husks (whether filled or not) harvested from a plant.

[0473] Thousand Kernel Weight (TKW) is extrapolated from the number of seeds counted and their total weight.

[0474] The Harvest Index (HI) in the present invention is defined as the ratio between the total seed weight and the aboveground area (mm²), multiplied by a factor of 10⁶.

[0475] The number of flowers per panicle as defined in the present invention is the ratio of the total number of seeds to the number of mature primary panicles.

[0476] The "seed fill rate" or "seed filling rate" as defined in the present invention is the proportion (expressed as a %) of the number of filled seeds (i.e. florets containing seeds) over the total number of seeds (i.e. total number of florets). In other words, the seed filling rate is the percentage of florets that are filled with seed.
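The derived seed parameters defined above (Thousand Kernel Weight, Harvest Index, number of flowers per panicle and seed fill rate) can be illustrated with the sketch below. The class and function names, the example values and the weight unit (grams) are assumptions. The sketch also assumes that TKW is computed from the number of filled seeds, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class PlantHarvest:
    """Per-plant measurements; all field names are hypothetical."""
    n_filled_seeds: int          # filled husks counted after air separation
    n_total_seeds: int           # all husks (florets), filled or not
    total_seed_weight_g: float   # weight of all filled husks (grams assumed)
    n_primary_panicles: int      # mature primary panicles harvested
    aboveground_area_mm2: float  # aboveground area in square mm


def thousand_kernel_weight(h):
    # Extrapolate the weight of 1000 seeds from the counted (filled) seeds.
    return 1000.0 * h.total_seed_weight_g / h.n_filled_seeds


def harvest_index(h):
    # Total seed weight over aboveground area (mm^2), multiplied by 10^6.
    return h.total_seed_weight_g / h.aboveground_area_mm2 * 1e6


def flowers_per_panicle(h):
    # Total number of seeds (florets) per mature primary panicle.
    return h.n_total_seeds / h.n_primary_panicles


def seed_fill_rate(h):
    # Percentage of florets that are filled with seed.
    return 100.0 * h.n_filled_seeds / h.n_total_seeds


# Hypothetical plant, for illustration only.
plant = PlantHarvest(400, 500, 10.0, 5, 50000.0)
print(thousand_kernel_weight(plant), harvest_index(plant),
      flowers_per_panicle(plant), seed_fill_rate(plant))
```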

Example 10

Results of the Phenotypic Evaluation of the Transgenic Plants

[0477] 1. bZIP-Like Polypeptides

[0478] The results of the evaluation of transgenic rice plants in the T1 generation and expressing a nucleic acid encoding the bZIP-like polypeptide of SEQ ID NO: 2 under nutrient-stress conditions are presented below in Table E1. When grown under conditions of nitrogen deficiency, an increase of at least 5% was observed for aboveground (or green) biomass (AreaMax), for root biomass (RootMax & RootThickMax) and for seed yield (total seed weight (totalwgseeds), number of filled seeds (nrfilledseed) and total number of seeds (nrtotalseed)). No negative effect on flowering time (delayed flowering) was observed.

[0479] In addition, plants expressing the bZIP-like nucleic acid of SEQ ID NO: 1 showed an increase in Thousand Kernel Weight, fillrate, and number of first panicles in at least one of the tested lines. One of the tested lines also showed an earlier flowering time.

TABLE E1
Data summary for transgenic rice plants; for each parameter, the overall percent increase is shown for the T1 generation; for each parameter the p-value is <0.05.

Parameter       Overall increase (%)
AreaMax         14.7
RootMax         8.8
totalwgseeds    18.1
nrtotalseed     10.0
nrfilledseed    16.1
RootThickMax    5.3

[0480] A similar trend was observed for plants grown under non-stress conditions: aboveground biomass, root biomass and seed yield (total seed weight, total number of seeds, fillrate, and number of filled seeds) were increased in at least 2 of the lines tested, compared to control plants.

[0481] The results of the evaluation of transgenic rice plants in the T1 generation and expressing a nucleic acid encoding the bZIP-like polypeptide of SEQ ID NO: 4 under drought-stress conditions are presented below in Table E2. When grown under conditions of drought, an increase of at least 5% was observed for aboveground (or green) biomass (AreaMax), for root biomass (RootThickMax) and for seed yield (total seed weight (totalwgseeds), fillrate, harvest index, and number of filled seeds (nrfilledseed)). No negative effect on flowering time (delayed flowering) was observed.

TABLE E2
Data summary for transgenic rice plants; for each parameter, the overall percent increase is shown for the T1 generation; for each parameter the p-value is <0.05.

Parameter       Overall increase (%)
AreaMax         8.6
totalwgseeds    42.5
fillrate        46.6
harvestindex    33.5
nrfilledseed    44.0
RootThickMax    8.4

2. BCAT4-Like Polypeptides

[0482] The results of the evaluation of transgenic rice plants expressing a BCAT4-like nucleic acid under drought-stress conditions are presented hereunder. An increase of at least 5% was observed for total seed weight, fill rate, harvest index and number of filled seeds (Table E3).

[0483] The results of the evaluation of transgenic rice plants in the T1 generation and expressing a nucleic acid encoding the BCAT4-like polypeptide of SEQ ID NO: 142 under drought-stress conditions are presented below in Table E3. When grown under drought-stress conditions, an increase of at least 5% was observed for seed yield, including total weight of seeds (totalwgseeds), fill rate, harvest index and number of filled seeds (nrfilledseed). In addition, 2 lines were clearly positive for GravityYMax, which is the height of the gravity center of the leafy biomass of the plant.

TABLE E3
Data summary for transgenic rice plants; for each parameter, the overall percent increase is shown (T1 generation); for each parameter the p-value is <0.05.

Parameter       Overall increase (%)
totalwgseeds    34.5
fillrate        36.0
harvestindex    29.8
nrfilledseed    32.1

Sequence CWU 1

1

2221855DNASolanum lycopersicon 1atgcctcctt atgggactcc agttccatat ccagctttat atcctcctgc cggagtttat 60gctcatccta acattgccac gccggctcca aattctgtgc cggcaaatcc tgaagcagat 120gggaaggggc ctgaaggaaa ggatcggaat tcaagtaaaa agttaaaggt ctgttctggt 180ggtaaggcag gcgacaatgg gaaagttact tcaggttccg gaaatgatgg tgccacacaa 240agtgatgaaa gcagaagtga aggtacatca gatacaaatg atgaaaatga taacaatgaa 300tttgctgcaa acaagaaggg aagctttgat caaatgcttg cagatggagc cagtgcacag 360aataatcctg cgaaagagaa tcacccgact tctatacatg gaaatcctgt caccatgcct 420gcaactaacc taaatattgg aatggacgtg tggaatgcat cagctgccgg tcctggagcg 480atcaaaatac agcaaaatgc aactggtcca gttataggac atgaaggaag gatgaatgat 540cagtggattc aggaggaacg tgaacttaaa aggcaaaaga gaaagcaatc taatagggag 600tcagctagga ggtcgaggct ccgcaagcag gcagagtgtg aagagctaca acgtagagta 660gaagctttga gccatgagaa tcattcactc aaagatgagc tccaacggct ctctgaggaa 720tgtgagaagc ttacctcgga gaataattta attaaggaag agttaacgct actttgtgga 780ccagacgttg tgtctaagct ggagagaaac gataatgtca cacgtattca atctaatgtt 840gaagaagcta gttaa 8552284PRTSolanum lycopersicon 2Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Leu Tyr Pro Pro 1 5 10 15 Ala Gly Val Tyr Ala His Pro Asn Ile Ala Thr Pro Ala Pro Asn Ser 20 25 30 Val Pro Ala Asn Pro Glu Ala Asp Gly Lys Gly Pro Glu Gly Lys Asp 35 40 45 Arg Asn Ser Ser Lys Lys Leu Lys Val Cys Ser Gly Gly Lys Ala Gly 50 55 60 Asp Asn Gly Lys Val Thr Ser Gly Ser Gly Asn Asp Gly Ala Thr Gln 65 70 75 80 Ser Asp Glu Ser Arg Ser Glu Gly Thr Ser Asp Thr Asn Asp Glu Asn 85 90 95 Asp Asn Asn Glu Phe Ala Ala Asn Lys Lys Gly Ser Phe Asp Gln Met 100 105 110 Leu Ala Asp Gly Ala Ser Ala Gln Asn Asn Pro Ala Lys Glu Asn His 115 120 125 Pro Thr Ser Ile His Gly Asn Pro Val Thr Met Pro Ala Thr Asn Leu 130 135 140 Asn Ile Gly Met Asp Val Trp Asn Ala Ser Ala Ala Gly Pro Gly Ala 145 150 155 160 Ile Lys Ile Gln Gln Asn Ala Thr Gly Pro Val Ile Gly His Glu Gly 165 170 175 Arg Met Asn Asp Gln Trp Ile Gln Glu Glu Arg Glu Leu Lys Arg Gln 180 185 190 Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser 
Arg Leu Arg 195 200 205 Lys Gln Ala Glu Cys Glu Glu Leu Gln Arg Arg Val Glu Ala Leu Ser 210 215 220 His Glu Asn His Ser Leu Lys Asp Glu Leu Gln Arg Leu Ser Glu Glu 225 230 235 240 Cys Glu Lys Leu Thr Ser Glu Asn Asn Leu Ile Lys Glu Glu Leu Thr 245 250 255 Leu Leu Cys Gly Pro Asp Val Val Ser Lys Leu Glu Arg Asn Asp Asn 260 265 270 Val Thr Arg Ile Gln Ser Asn Val Glu Glu Ala Ser 275 280 31200DNAPopulus trichocarpa 3atgggaaaca ttgaagaggg aaagtcttcc acttctgata aatcttcacc tgcaccaccg 60gatcagacca atattcatgt gtatcctgat ggggcagcta tgcaggcata ttatggcccc 120cgagtggctc tcccaccata ttacaactcg gccgtggctt ctggtcatgc ccctcatcct 180tatatgtggg gcctgccaca gcctatgatg ccaccttatg gggcacctta tgcaacagtc 240tactcacatg gagtgtatgc acatccggct gttccaattg tatcccatcc tcatggtcct 300gggattgtgt catctcctgc agctggaacc cttttgagtg cagaaacacc tacaaaatct 360tcaggaaata ctgatcgagg tttagtgaat aagttgaaag gatttgatgg gcttgcaatg 420tcaataggca atggtaatgc tgagactgtc gagggtgggg gtaggctgtc tcaaagtgtg 480gagatagaag tttccagtga tggaattgat gggaatacaa ctaggggaaa gaaaaggagc 540cgtgagggaa caccaactgt tgcaacaggt ggagatacaa aaatggagtc acattccagt 600ccccttccta gagaggtgaa tgcatccact gacaatgtat tgagggcagc tgttgctcct 660ggcatgacca cagcattgga gcttaggaac cctcctagtg tgaatgctgc taagacaagt 720cctactacga ttcctcaatc tggtgtagtc ctgccctctg aagcctggtt acagaatgag 780ctggagctga aacgggagaa gaggaaacaa tcaaatcgag aatctgccag aaggtcaaga 840ttaaggaagc aggctgaggc tgaagaactt gcacacaaag ttgaagtact caccacagaa 900aacatggcac tccaatctga aataagtcaa tttacagaga aatcagagaa actaaggctt 960gaaaatgctg cattaacgga gaaactcaag aatgcacaat taggacatgc gcaagaaatg 1020attttaaaca ttgatgagca cagggcccca gctgttagta cagaaaactt gctatcaaga 1080gttaacaatt ctgcctttga agaagagagt gatctgtatg aacgaaactc aaattctggt 1140gccaagctgc atcaactctt ggatgcaagc cccagagccg atgctgtggc tgctggttga 12004399PRTPopulus trichocarpa 4Met Gly Asn Ile Glu Glu Gly Lys Ser Ser Thr Ser Asp Lys Ser Ser 1 5 10 15 Pro Ala Pro Pro Asp Gln Thr Asn Ile His Val Tyr Pro Asp Gly Ala 20 25 30 Ala Met Gln Ala Tyr Tyr 
Gly Pro Arg Val Ala Leu Pro Pro Tyr Tyr 35 40 45 Asn Ser Ala Val Ala Ser Gly His Ala Pro His Pro Tyr Met Trp Gly 50 55 60 Leu Pro Gln Pro Met Met Pro Pro Tyr Gly Ala Pro Tyr Ala Thr Val 65 70 75 80 Tyr Ser His Gly Val Tyr Ala His Pro Ala Val Pro Ile Val Ser His 85 90 95 Pro His Gly Pro Gly Ile Val Ser Ser Pro Ala Ala Gly Thr Leu Leu 100 105 110 Ser Ala Glu Thr Pro Thr Lys Ser Ser Gly Asn Thr Asp Arg Gly Leu 115 120 125 Val Asn Lys Leu Lys Gly Phe Asp Gly Leu Ala Met Ser Ile Gly Asn 130 135 140 Gly Asn Ala Glu Thr Val Glu Gly Gly Gly Arg Leu Ser Gln Ser Val 145 150 155 160 Glu Ile Glu Val Ser Ser Asp Gly Ile Asp Gly Asn Thr Thr Arg Gly 165 170 175 Lys Lys Arg Ser Arg Glu Gly Thr Pro Thr Val Ala Thr Gly Gly Asp 180 185 190 Thr Lys Met Glu Ser His Ser Ser Pro Leu Pro Arg Glu Val Asn Ala 195 200 205 Ser Thr Asp Asn Val Leu Arg Ala Ala Val Ala Pro Gly Met Thr Thr 210 215 220 Ala Leu Glu Leu Arg Asn Pro Pro Ser Val Asn Ala Ala Lys Thr Ser 225 230 235 240 Pro Thr Thr Ile Pro Gln Ser Gly Val Val Leu Pro Ser Glu Ala Trp 245 250 255 Leu Gln Asn Glu Leu Glu Leu Lys Arg Glu Lys Arg Lys Gln Ser Asn 260 265 270 Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala Glu 275 280 285 Glu Leu Ala His Lys Val Glu Val Leu Thr Thr Glu Asn Met Ala Leu 290 295 300 Gln Ser Glu Ile Ser Gln Phe Thr Glu Lys Ser Glu Lys Leu Arg Leu 305 310 315 320 Glu Asn Ala Ala Leu Thr Glu Lys Leu Lys Asn Ala Gln Leu Gly His 325 330 335 Ala Gln Glu Met Ile Leu Asn Ile Asp Glu His Arg Ala Pro Ala Val 340 345 350 Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Ser Ala Phe Glu Glu 355 360 365 Glu Ser Asp Leu Tyr Glu Arg Asn Ser Asn Ser Gly Ala Lys Leu His 370 375 380 Gln Leu Leu Asp Ala Ser Pro Arg Ala Asp Ala Val Ala Ala Gly 385 390 395 51143DNAArabidopsis lyrata 5atgggaaata gcagcgagga accaaagcct accaaatcag ataaaccatc ttcacctccg 60gtggatcaaa caaatgttca tgtgtaccct gattgggcag ctatgcaggc atattatggt 120ccaagagttg caatgcctcc ttattacaat tcagctttgg ctgcatctgg tcatcctcct 180cctccttaca tgtggaatcc tcagcatatg 
atgtcaccat atggagcacc ctatgctgcg 240gtttatcctc atggaggagg agtttacgct catcccggaa ttcccatggg atcacagcct 300caaggtcaaa agactccacc tttagcaact ccggggacgc atttgagcat cgacactcct 360actaaatcta cggggaacac agacaatgga ttgatgaaga agctgaaaga gtttgatggg 420cttgctatgt ctctaggaaa cgggaatcct gaaaatggtg cagatgaaca taaacgatca 480cggaacagct cagaaactga tggttcaact gatggaagtg atgggaatac aactggagca 540gatgaaccga aacttaaaag aagtcgagag ggaactccga caaaagatgt gaaacaattg 600gttcaatcta gctcatttca ttctgtttct ccgtcaagtg gtgataccgg cgtaaaactt 660attcaaggat cagctatact ctctcctggt gtaagtgcaa attccaaccc cttcatgtca 720caatctttag ccatggttcc tcctgaaact tggcctcaga acgagagaga actgaaacgg 780gagcgaagga aacagtctaa tagagaatct gctagaaggt caagattaag gaaacaggcc 840gagacggaag aacttgctag gaaagtcgaa gccttgacag ccgaaaacat ggcactaaga 900tctgaactaa accaacttaa tgagaaatct gataaactaa gaggagcaaa tgcaaccttg 960ttggacaaac tgaaatgttc agaacctgaa aagagagtct ccgggaaaat gttgtctaga 1020gttaaaaact caggagctgg agacaagaac aagaaccaag gagacaatga ttctaaatct 1080acaagcaaat tgtaccaact gctggataca aagccccgag ctaatgctgt agctgcgggc 1140taa 11436380PRTArabidopsis lyrata 6Met Gly Asn Ser Ser Glu Glu Pro Lys Pro Thr Lys Ser Asp Lys Pro 1 5 10 15 Ser Ser Pro Pro Val Asp Gln Thr Asn Val His Val Tyr Pro Asp Trp 20 25 30 Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Met Pro Pro Tyr 35 40 45 Tyr Asn Ser Ala Leu Ala Ala Ser Gly His Pro Pro Pro Pro Tyr Met 50 55 60 Trp Asn Pro Gln His Met Met Ser Pro Tyr Gly Ala Pro Tyr Ala Ala 65 70 75 80 Val Tyr Pro His Gly Gly Gly Val Tyr Ala His Pro Gly Ile Pro Met 85 90 95 Gly Ser Gln Pro Gln Gly Gln Lys Thr Pro Pro Leu Ala Thr Pro Gly 100 105 110 Thr His Leu Ser Ile Asp Thr Pro Thr Lys Ser Thr Gly Asn Thr Asp 115 120 125 Asn Gly Leu Met Lys Lys Leu Lys Glu Phe Asp Gly Leu Ala Met Ser 130 135 140 Leu Gly Asn Gly Asn Pro Glu Asn Gly Ala Asp Glu His Lys Arg Ser 145 150 155 160 Arg Asn Ser Ser Glu Thr Asp Gly Ser Thr Asp Gly Ser Asp Gly Asn 165 170 175 Thr Thr Gly Ala Asp Glu Pro Lys Leu Lys Arg Ser Arg Glu Gly Thr 
180 185 190 Pro Thr Lys Asp Val Lys Gln Leu Val Gln Ser Ser Ser Phe His Ser 195 200 205 Val Ser Pro Ser Ser Gly Asp Thr Gly Val Lys Leu Ile Gln Gly Ser 210 215 220 Ala Ile Leu Ser Pro Gly Val Ser Ala Asn Ser Asn Pro Phe Met Ser 225 230 235 240 Gln Ser Leu Ala Met Val Pro Pro Glu Thr Trp Pro Gln Asn Glu Arg 245 250 255 Glu Leu Lys Arg Glu Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 260 265 270 Arg Ser Arg Leu Arg Lys Gln Ala Glu Thr Glu Glu Leu Ala Arg Lys 275 280 285 Val Glu Ala Leu Thr Ala Glu Asn Met Ala Leu Arg Ser Glu Leu Asn 290 295 300 Gln Leu Asn Glu Lys Ser Asp Lys Leu Arg Gly Ala Asn Ala Thr Leu 305 310 315 320 Leu Asp Lys Leu Lys Cys Ser Glu Pro Glu Lys Arg Val Ser Gly Lys 325 330 335 Met Leu Ser Arg Val Lys Asn Ser Gly Ala Gly Asp Lys Asn Lys Asn 340 345 350 Gln Gly Asp Asn Asp Ser Lys Ser Thr Ser Lys Leu Tyr Gln Leu Leu 355 360 365 Asp Thr Lys Pro Arg Ala Asn Ala Val Ala Ala Gly 370 375 380 71122DNAArabidopsis lyrata 7atgggtagca acgaagaagg aaaacccaca aactctgata agccatcgca agctgctgct 60cctgagcaga gtaatgttca tgtgtatcat catgactggg ctgctatgca ggcatattat 120gggcctagag ttggtatacc tcaatattac aactcaaatg tggcgcctgg tcatgctcca 180ccgccttata tgtgggcgtc tccctcgcca atgatggctc cttatggggc accatatcca 240ccattttgcc ctcctggtgg agtttatgct catcctggtg ttcaaatggg ctcacaacta 300caaggtcctg tttctcaagc aacacctggt gttacaactc ctttgaccat ggatgcacca 360actaattcag ctggaaactc ggatcacggg ttcatgaaaa aactgaaaga gttcgatgga 420cttgcaatgt caataagcaa taacaaagtt gggagtgctg aacatagcag cagtgaacat 480aggagttctc agagatatat agaatctaac gtggttttga tatcaatagc tccgagaatg 540atggctctag caatggtagt gatgtattcg tctttcttac cacagggaga gcaatctcgg 600aggaaaataa ggcgagaaag atcaccaagc accggtgaaa gaccttcatc tcaaaccacg 660cctcctgtta gaggtgaaaa tgagaaagcc gatgtgacca tggggactcc cgtaatgccc 720acaacaatgg gtttccaaaa ctctgctggc atgaacggtg tcccacagcc atggaatgaa 780aaagaggtta aacgagagaa gagaaaacag tcaaaccgag aatctgctag aaggtcgaga 840ctgaggaagc aggctgaaac tgaacaacta tctgtcaaag ttgacgcact tgtagctgag 900aacatgactc tgaggtccaa 
actaggccag ctaaaaaatg agtctgagaa actgcggctg 960gagaacgaag ctttattgca tcaactgaaa gcgcaagcaa ctgggaaaac agagaacctt 1020atctctcgag ttgataagaa caactctgta tcaggtagca aaaatgtgca gcatcaactg 1080ttaaatgcaa gtccaataac tgatcctgtc gcggccagct ga 11228373PRTArabidopsis lyrata 8Met Gly Ser Asn Glu Glu Gly Lys Pro Thr Asn Ser Asp Lys Pro Ser 1 5 10 15 Gln Ala Ala Ala Pro Glu Gln Ser Asn Val His Val Tyr His His Asp 20 25 30 Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Gly Ile Pro Gln 35 40 45 Tyr Tyr Asn Ser Asn Val Ala Pro Gly His Ala Pro Pro Pro Tyr Met 50 55 60 Trp Ala Ser Pro Ser Pro Met Met Ala Pro Tyr Gly Ala Pro Tyr Pro 65 70 75 80 Pro Phe Cys Pro Pro Gly Gly Val Tyr Ala His Pro Gly Val Gln Met 85 90 95 Gly Ser Gln Leu Gln Gly Pro Val Ser Gln Ala Thr Pro Gly Val Thr 100 105 110 Thr Pro Leu Thr Met Asp Ala Pro Thr Asn Ser Ala Gly Asn Ser Asp 115 120 125 His Gly Phe Met Lys Lys Leu Lys Glu Phe Asp Gly Leu Ala Met Ser 130 135 140 Ile Ser Asn Asn Lys Val Gly Ser Ala Glu His Ser Ser Ser Glu His 145 150 155 160 Arg Ser Ser Gln Arg Tyr Ile Glu Ser Asn Val Val Leu Ile Ser Ile 165 170 175 Ala Pro Arg Met Met Ala Leu Ala Met Val Val Met Tyr Ser Ser Phe 180 185 190 Leu Pro Gln Gly Glu Gln Ser Arg Arg Lys Ile Arg Arg Glu Arg Ser 195 200 205 Pro Ser Thr Gly Glu Arg Pro Ser Ser Gln Thr Thr Pro Pro Val Arg 210 215 220 Gly Glu Asn Glu Lys Ala Asp Val Thr Met Gly Thr Pro Val Met Pro 225 230 235 240 Thr Thr Met Gly Phe Gln Asn Ser Ala Gly Met Asn Gly Val Pro Gln 245 250 255 Pro Trp Asn Glu Lys Glu Val Lys Arg Glu Lys Arg Lys Gln Ser Asn 260 265 270 Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Thr Glu 275 280 285 Gln Leu Ser Val Lys Val Asp Ala Leu Val Ala Glu Asn Met Thr Leu 290 295 300 Arg Ser Lys Leu Gly Gln Leu Lys Asn Glu Ser Glu Lys Leu Arg Leu 305 310 315 320 Glu Asn Glu Ala Leu Leu His Gln Leu Lys Ala Gln Ala Thr Gly Lys 325 330 335 Thr Glu Asn Leu Ile Ser Arg Val Asp Lys Asn Asn Ser Val Ser Gly 340 345 350 Ser Lys Asn Val Gln His Gln Leu Leu Asn Ala Ser Pro Ile Thr Asp 
355 360 365 Pro Val Ala Ala Ser 370 9939DNAArabidopsis lyrata 9atgggaacga gcgaagacaa gatgccattt aagcctacca aaccaacatc ttcggctcag 60gaagttcctc ccacaccgta tccagattgg tcaaattcaa tgcaggctta ttatggcgga 120ggaggtacgc caaatccttt tttcccatcc ccagttggat ctcctagtcc ccacgcttat 180atgtggggcg ctcaacacca tatgatgccg ccttatggga ccccagtacc gtacccagca 240atgtatcccc cgggagcagt ctattctcat cctagcatgc ccatgcctcc taattctggt 300ccaaccaaca aggagactgt gaaggaccaa gcttctggca agaagtcaaa ggggagctcg 360aaaaaaaagg gtgaaggagg tgacaaagcg ctctctggtt cagggaacga tggtgtctct 420catagtgatg acagtgtcac agcgggttca tctgatgaaa atgatgacaa tgccaatcaa 480caggaacaag gttcagttag aaagccgagc tttggacaga tgcttgcgga cgcaagttct 540caaagtacta ctggtgaaat ccaaggttcg gtgcccatga agccggtagc cccggggact 600aatctgaata tcgggatgga cttatggtct tcccaagctg gtgtacctgt gaaggatgaa 660cgagagctca agaggcagaa gaggaaacag tctaaccgtg aatctgctag gcggtctaga 720ttgcggaagc aggcggaatg cgaacaactt caacagagag tagagagttt gtcgaacgag 780aatcaaagcc tgagagatga gctacaaaga ctctcaagcg aatgtgaaaa gctcaagtct 840gagaacaact caatccagga tgagttgcag agagtgcttg gagcagaggc tgtagctaat 900ctagagcaga atgctgctga cggtgaagga aaaaattaa 93910312PRTArabidopsis lyrata 10Met Gly Thr Ser Glu Asp Lys Met Pro Phe

Lys Pro Thr Lys Pro Thr 1 5 10 15 Ser Ser Ala Gln Glu Val Pro Pro Thr Pro Tyr Pro Asp Trp Ser Asn 20 25 30 Ser Met Gln Ala Tyr Tyr Gly Gly Gly Gly Thr Pro Asn Pro Phe Phe 35 40 45 Pro Ser Pro Val Gly Ser Pro Ser Pro His Ala Tyr Met Trp Gly Ala 50 55 60 Gln His His Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala 65 70 75 80 Met Tyr Pro Pro Gly Ala Val Tyr Ser His Pro Ser Met Pro Met Pro 85 90 95 Pro Asn Ser Gly Pro Thr Asn Lys Glu Thr Val Lys Asp Gln Ala Ser 100 105 110 Gly Lys Lys Ser Lys Gly Ser Ser Lys Lys Lys Gly Glu Gly Gly Asp 115 120 125 Lys Ala Leu Ser Gly Ser Gly Asn Asp Gly Val Ser His Ser Asp Asp 130 135 140 Ser Val Thr Ala Gly Ser Ser Asp Glu Asn Asp Asp Asn Ala Asn Gln 145 150 155 160 Gln Glu Gln Gly Ser Val Arg Lys Pro Ser Phe Gly Gln Met Leu Ala 165 170 175 Asp Ala Ser Ser Gln Ser Thr Thr Gly Glu Ile Gln Gly Ser Val Pro 180 185 190 Met Lys Pro Val Ala Pro Gly Thr Asn Leu Asn Ile Gly Met Asp Leu 195 200 205 Trp Ser Ser Gln Ala Gly Val Pro Val Lys Asp Glu Arg Glu Leu Lys 210 215 220 Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg 225 230 235 240 Leu Arg Lys Gln Ala Glu Cys Glu Gln Leu Gln Gln Arg Val Glu Ser 245 250 255 Leu Ser Asn Glu Asn Gln Ser Leu Arg Asp Glu Leu Gln Arg Leu Ser 260 265 270 Ser Glu Cys Glu Lys Leu Lys Ser Glu Asn Asn Ser Ile Gln Asp Glu 275 280 285 Leu Gln Arg Val Leu Gly Ala Glu Ala Val Ala Asn Leu Glu Gln Asn 290 295 300 Ala Ala Asp Gly Glu Gly Lys Asn 305 310 111080DNAArabidopsis thaliana 11atgggaaata gcagcgagga accaaagcct cctaccaaat cagataaacc atcttcaccc 60ccggtggatc aaacaaatgt tcatgtctac cctgattggg cagctatgca ggcatattat 120ggtccaagag tagcaatgcc tccttattac aattcagcta tggctgcatc tggtcatcct 180cctcctcctt acatgtggaa tcctcagcat atgatgtcac catatggagc accctatgct 240gctgtttatc ctcatggagg aggagtttac gctcatcccg gtattcccat gggatcactg 300cctcaaggtc aaaaggatcc acctttaaca actccgggga cgcttttgag catcgacact 360cctactaaat ctacagggaa cacagacaat ggattgatga agaagctgaa agagtttgat 420gggcttgcta tgtctctagg aaatgggaat cctgaaaatg 
gtgcagatga acataaacga 480tcacggaaca gctcagaaac tgatggttct actgatggaa gtgatgggaa tacaactggg 540gcagatgaac cgaaacttaa aagaagtcga gagggaactc caacaaaaga tgggaaacaa 600ttggttcaag ctagctcatt tcattctgtt tctccgtcaa gtggtgatac cggcgtaaaa 660ctcattcaag gatctggagc tatactctct cctggtaacg agagagaact gaaacgggag 720cgaaggaaac agtctaatag agaatctgct agaaggtcaa gattaaggaa acaggccgag 780acagaagaac ttgctaggaa agtggaagcc ttgacagccg aaaacatggc attaagatct 840gaactaaacc aacttaatga gaaatctgat aaactaagag gagcaaatgc aaccttgttg 900gacaaactga aatgctcgga acccgaaaag agagtccccg caaatatgtt gtctagagtt 960aagaactcag gagctggaga taagaacaag aaccaaggag acaatgattc taactctaca 1020agcaaattgc atcaactgct cgatacgaag cctcgagcta aagcagtagc tgcaggctga 108012359PRTArabidopsis thaliana 12Met Gly Asn Ser Ser Glu Glu Pro Lys Pro Pro Thr Lys Ser Asp Lys 1 5 10 15 Pro Ser Ser Pro Pro Val Asp Gln Thr Asn Val His Val Tyr Pro Asp 20 25 30 Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Met Pro Pro 35 40 45 Tyr Tyr Asn Ser Ala Met Ala Ala Ser Gly His Pro Pro Pro Pro Tyr 50 55 60 Met Trp Asn Pro Gln His Met Met Ser Pro Tyr Gly Ala Pro Tyr Ala 65 70 75 80 Ala Val Tyr Pro His Gly Gly Gly Val Tyr Ala His Pro Gly Ile Pro 85 90 95 Met Gly Ser Leu Pro Gln Gly Gln Lys Asp Pro Pro Leu Thr Thr Pro 100 105 110 Gly Thr Leu Leu Ser Ile Asp Thr Pro Thr Lys Ser Thr Gly Asn Thr 115 120 125 Asp Asn Gly Leu Met Lys Lys Leu Lys Glu Phe Asp Gly Leu Ala Met 130 135 140 Ser Leu Gly Asn Gly Asn Pro Glu Asn Gly Ala Asp Glu His Lys Arg 145 150 155 160 Ser Arg Asn Ser Ser Glu Thr Asp Gly Ser Thr Asp Gly Ser Asp Gly 165 170 175 Asn Thr Thr Gly Ala Asp Glu Pro Lys Leu Lys Arg Ser Arg Glu Gly 180 185 190 Thr Pro Thr Lys Asp Gly Lys Gln Leu Val Gln Ala Ser Ser Phe His 195 200 205 Ser Val Ser Pro Ser Ser Gly Asp Thr Gly Val Lys Leu Ile Gln Gly 210 215 220 Ser Gly Ala Ile Leu Ser Pro Gly Asn Glu Arg Glu Leu Lys Arg Glu 225 230 235 240 Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 245 250 255 Lys Gln Ala Glu Thr Glu Glu Leu Ala Arg 
Lys Val Glu Ala Leu Thr 260 265 270 Ala Glu Asn Met Ala Leu Arg Ser Glu Leu Asn Gln Leu Asn Glu Lys 275 280 285 Ser Asp Lys Leu Arg Gly Ala Asn Ala Thr Leu Leu Asp Lys Leu Lys 290 295 300 Cys Ser Glu Pro Glu Lys Arg Val Pro Ala Asn Met Leu Ser Arg Val 305 310 315 320 Lys Asn Ser Gly Ala Gly Asp Lys Asn Lys Asn Gln Gly Asp Asn Asp 325 330 335 Ser Asn Ser Thr Ser Lys Leu His Gln Leu Leu Asp Thr Lys Pro Arg 340 345 350 Ala Lys Ala Val Ala Ala Gly 355 131083DNAArabidopsis thaliana 13atgggtagca acgaagaagg aaaccccact aacaactctg ataagccatc gcaagctgct 60gctcctgagc agagtaatgt tcatgtgtat catcatgact gggctgctat gcaggcatat 120tatgggccta gagttggtat acctcaatat tacaactcaa atttggcgcc tggtcatgct 180ccaccgcctt atatgtgggc gtctccatcg ccaatgatgg ctccttatgg agcaccatat 240ccaccatttt gccctcctgg tggagtttat gctcatcctg gtgttcaaat gggctcacaa 300ccacaaggtc ctgtttctca atcagcatct ggagttacaa cccctttgac cattgatgca 360ccagctaatt cagctggaaa ctcagatcat gggttcatga aaaagctgaa agagttcgat 420ggacttgcaa tgtcaataag caataacaaa gttgggagtg ctgaacatag cagcagtgaa 480cataggagtt ctcagagctc cgagaatgat ggctctagca atggtagtga tggtaataca 540actgggggag aacaatctag gaggaaaaga aggcaacaaa gatcaccaag cactggtgaa 600agaccctcat ctcaaaacag tctgcctctt agaggtgaaa atgagaaacc cgatgtgact 660atggggactc ctgttatgcc cacagcaatg agtttccaaa actctgctgg catgaacggt 720gtgccacagc catggaatga aaaagaggtt aaacgagaga agagaaaaca gtcaaaccga 780gaatctgcta ggaggtcaag actgaggaag caggctgaaa cagaacaact atctgtcaaa 840gttgacgcat tagtagctga gaacatgtct ctgaggtcta aactaggcca gctaaacaat 900gagtctgaga aactacggct ggagaacgaa gctatattgg atcaactgaa agcgcaagca 960acagggaaaa cagagaacct gatctctcga gttgataaga acaactctgt atcaggtagc 1020aaaactgtgc agcatcaact gttaaatgca agtccgataa ccgatcctgt cgcggctagc 1080tga 108314360PRTArabidopsis thaliana 14Met Gly Ser Asn Glu Glu Gly Asn Pro Thr Asn Asn Ser Asp Lys Pro 1 5 10 15 Ser Gln Ala Ala Ala Pro Glu Gln Ser Asn Val His Val Tyr His His 20 25 30 Asp Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Gly Ile Pro 35 40 45 Gln Tyr Tyr 
Asn Ser Asn Leu Ala Pro Gly His Ala Pro Pro Pro Tyr 50 55 60 Met Trp Ala Ser Pro Ser Pro Met Met Ala Pro Tyr Gly Ala Pro Tyr 65 70 75 80 Pro Pro Phe Cys Pro Pro Gly Gly Val Tyr Ala His Pro Gly Val Gln 85 90 95 Met Gly Ser Gln Pro Gln Gly Pro Val Ser Gln Ser Ala Ser Gly Val 100 105 110 Thr Thr Pro Leu Thr Ile Asp Ala Pro Ala Asn Ser Ala Gly Asn Ser 115 120 125 Asp His Gly Phe Met Lys Lys Leu Lys Glu Phe Asp Gly Leu Ala Met 130 135 140 Ser Ile Ser Asn Asn Lys Val Gly Ser Ala Glu His Ser Ser Ser Glu 145 150 155 160 His Arg Ser Ser Gln Ser Ser Glu Asn Asp Gly Ser Ser Asn Gly Ser 165 170 175 Asp Gly Asn Thr Thr Gly Gly Glu Gln Ser Arg Arg Lys Arg Arg Gln 180 185 190 Gln Arg Ser Pro Ser Thr Gly Glu Arg Pro Ser Ser Gln Asn Ser Leu 195 200 205 Pro Leu Arg Gly Glu Asn Glu Lys Pro Asp Val Thr Met Gly Thr Pro 210 215 220 Val Met Pro Thr Ala Met Ser Phe Gln Asn Ser Ala Gly Met Asn Gly 225 230 235 240 Val Pro Gln Pro Trp Asn Glu Lys Glu Val Lys Arg Glu Lys Arg Lys 245 250 255 Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala 260 265 270 Glu Thr Glu Gln Leu Ser Val Lys Val Asp Ala Leu Val Ala Glu Asn 275 280 285 Met Ser Leu Arg Ser Lys Leu Gly Gln Leu Asn Asn Glu Ser Glu Lys 290 295 300 Leu Arg Leu Glu Asn Glu Ala Ile Leu Asp Gln Leu Lys Ala Gln Ala 305 310 315 320 Thr Gly Lys Thr Glu Asn Leu Ile Ser Arg Val Asp Lys Asn Asn Ser 325 330 335 Val Ser Gly Ser Lys Thr Val Gln His Gln Leu Leu Asn Ala Ser Pro 340 345 350 Ile Thr Asp Pro Val Ala Ala Ser 355 360 15942DNAArabidopsis thaliana 15atgggaacga gcgaagacaa gatgccattt aagactacca aaccaacatc ttcggctcag 60gaagttcctc ccacaccgta tccagattgg caaaattcaa tgcaggctta ttatggcgga 120ggaggtactc caaatccttt tttcccatcc ccagttggat ctcctagtcc tcacccctat 180atgtggggtg ctcaacacca tatgatgccg ccttatggca ccccagttcc gtacccagca 240atgtatcccc cgggggcagt ctatgctcat cctagcatgc ccatgcctcc taattctggt 300cctaccaaca aggagcctgc gaaggaccaa gcttctggca agaagtcaaa ggggaactcg 360aaaaaaaagg ctgaaggagg tgataaagcg ctctctggtt cagggaacga tggtgcctct 
420catagtgatg aaagtgtcac agcgggttca tctgatgaaa atgatgagaa tgccaatcaa 480cagggttcaa ttcgaaagcc aagctttgga cagatgcttg ctgacgcaag ttctcaaagt 540acgactggtg aaatccaagg ttcggtgccc atgaagccgg tagccccggg gactaatctg 600aatatcggga tggacttatg gtcttcccaa gctggtgtac cagtgaagga tgaacgagag 660ctcaagcggc agaagaggaa acaatctaac cgtgaatccg ctaggcggtc tagattgcgg 720aagcaggccg aatgcgaaca acttcaacaa agagtagaga gtttgtcgaa cgagaatcaa 780agcctgagag atgagctaca gagactctca agcgaatgtg ataagctcaa gtctgagaac 840aactcaatcc aggatgagtt gcagagagta cttggagcag aggctgtagc taatctagaa 900cagaatgctg ctgggtcgaa agatggtgaa ggaacaaatt aa 94216313PRTArabidopsis thaliana 16Met Gly Thr Ser Glu Asp Lys Met Pro Phe Lys Thr Thr Lys Pro Thr 1 5 10 15 Ser Ser Ala Gln Glu Val Pro Pro Thr Pro Tyr Pro Asp Trp Gln Asn 20 25 30 Ser Met Gln Ala Tyr Tyr Gly Gly Gly Gly Thr Pro Asn Pro Phe Phe 35 40 45 Pro Ser Pro Val Gly Ser Pro Ser Pro His Pro Tyr Met Trp Gly Ala 50 55 60 Gln His His Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala 65 70 75 80 Met Tyr Pro Pro Gly Ala Val Tyr Ala His Pro Ser Met Pro Met Pro 85 90 95 Pro Asn Ser Gly Pro Thr Asn Lys Glu Pro Ala Lys Asp Gln Ala Ser 100 105 110 Gly Lys Lys Ser Lys Gly Asn Ser Lys Lys Lys Ala Glu Gly Gly Asp 115 120 125 Lys Ala Leu Ser Gly Ser Gly Asn Asp Gly Ala Ser His Ser Asp Glu 130 135 140 Ser Val Thr Ala Gly Ser Ser Asp Glu Asn Asp Glu Asn Ala Asn Gln 145 150 155 160 Gln Gly Ser Ile Arg Lys Pro Ser Phe Gly Gln Met Leu Ala Asp Ala 165 170 175 Ser Ser Gln Ser Thr Thr Gly Glu Ile Gln Gly Ser Val Pro Met Lys 180 185 190 Pro Val Ala Pro Gly Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Ser 195 200 205 Ser Gln Ala Gly Val Pro Val Lys Asp Glu Arg Glu Leu Lys Arg Gln 210 215 220 Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 225 230 235 240 Lys Gln Ala Glu Cys Glu Gln Leu Gln Gln Arg Val Glu Ser Leu Ser 245 250 255 Asn Glu Asn Gln Ser Leu Arg Asp Glu Leu Gln Arg Leu Ser Ser Glu 260 265 270 Cys Asp Lys Leu Lys Ser Glu Asn Asn Ser Ile Gln Asp Glu Leu Gln 275 280 285 
Arg Val Leu Gly Ala Glu Ala Val Ala Asn Leu Glu Gln Asn Ala Ala 290 295 300 Gly Ser Lys Asp Gly Glu Gly Thr Asn 305 310 171164DNAAquilegia sp. 17atgggttctg ctgaagagag tacacctgcc aaaccttcca gaccaagtgc ttcatctcag 60gaaacaccac caacaccttt atatcccgat tggtcaactc caatgcaggc atactacggt 120gctggagcta cccaacctcc atttttccct acaaatgttg ctagtccacc tccgtatcca 180tatatgtggg gaggccagat ggtctcacca tatggtaccc caattccata ccctgctatt 240tacccccatg gagggcttta tcctcatcct aacttggcta cggctcaggg tgcagcaatg 300ccaactacgc agacggagga aaataactct ccagtgaaaa aaatcaagag ctcgggaaac 360attggtgtgg ttggtggcaa attgaaagaa agtgggaagg cagcttctgg ctctcgaaat 420gacggtgtct cacggagtgc tgaaagtgga agtgagggct catcagatgc gagtgacgag 480ttcaatcata aagacgattc ggagaataag aacaaaagct ttgaccagat gcttgcagat 540ggagcaaatg cacagaacac cagtgctcac cacagtagta cagctgttgg gtcatcttta 600aatggaaatg gagaaccatc tgcaaatttt ccagttcctt tgccagggaa tcctgtggga 660gatattgctg caaccaattt gaatataggg atggacctct ggaatgcatc tcctgttgga 720tcggtgcctt tgagggcaag atcgaatgct tcaggtgtcg tgccagcagt tgctcctgtc 780aagagagatg ggcatgaagg cattgtgcct gaacacctat ggggtcaaga tgaacgtgaa 840ctgaaaagac agagaaggaa gctatcaaat agggagtcag ctcggagatc aagactacgc 900aaacaggctg agtgtgaaga gctacaagtg aaggtggata cattgaccga tgagaacgat 960aatctccgta aagagctgga gaggctcgcc gaggaacgcc aaaagctcac taatgaaaat 1020gcatccttag agagtgaact gagtcagttg tatggagaag aagcaatttc gaccctcaag 1080ggtaagaatg ccaacatgtc tgtgcagtct gttaatggtt ttgaacaaga cactttgatg 1140ggaaacaact ccttatctga gtag 116418387PRTAquilegia sp. 
18Met Gly Ser Ala Glu Glu Ser Thr Pro Ala Lys Pro Ser Arg Pro Ser 1 5 10 15 Ala Ser Ser Gln Glu Thr Pro Pro Thr Pro Leu Tyr Pro Asp Trp Ser 20 25 30 Thr Pro Met Gln Ala Tyr Tyr Gly Ala Gly Ala Thr Gln Pro Pro Phe 35 40 45 Phe Pro Thr Asn Val Ala Ser Pro Pro Pro Tyr Pro Tyr Met Trp Gly 50 55 60 Gly Gln Met Val Ser Pro Tyr Gly Thr Pro Ile Pro Tyr Pro Ala Ile 65 70 75 80 Tyr Pro His Gly Gly Leu Tyr Pro His Pro Asn Leu Ala Thr Ala Gln 85 90 95 Gly Ala Ala Met Pro Thr Thr Gln Thr Glu Glu Asn Asn Ser Pro Val 100 105 110 Lys Lys Ile Lys Ser Ser Gly Asn Ile Gly Val Val Gly Gly Lys Leu 115 120 125 Lys Glu Ser Gly Lys Ala Ala Ser Gly Ser Arg Asn Asp Gly Val Ser 130 135 140 Arg Ser Ala Glu Ser Gly Ser Glu Gly Ser Ser Asp Ala Ser Asp Glu 145 150 155 160 Phe Asn His Lys Asp Asp Ser Glu Asn Lys Asn Lys Ser Phe Asp Gln 165 170 175 Met Leu Ala Asp Gly Ala Asn Ala Gln Asn Thr Ser Ala His His Ser 180 185 190 Ser Thr Ala Val Gly Ser Ser Leu Asn Gly Asn Gly Glu Pro Ser Ala 195 200 205 Asn Phe Pro Val Pro Leu Pro Gly Asn Pro Val Gly Asp Ile Ala Ala 210 215 220 Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Asn Ala Ser Pro Val Gly 225 230 235 240 Ser Val Pro Leu Arg Ala Arg Ser Asn Ala Ser Gly Val Val Pro Ala 245 250 255 Val Ala Pro Val Lys Arg Asp Gly His

Glu Gly Ile Val Pro Glu His 260 265 270 Leu Trp Gly Gln Asp Glu Arg Glu Leu Lys Arg Gln Arg Arg Lys Leu 275 280 285 Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu 290 295 300 Cys Glu Glu Leu Gln Val Lys Val Asp Thr Leu Thr Asp Glu Asn Asp 305 310 315 320 Asn Leu Arg Lys Glu Leu Glu Arg Leu Ala Glu Glu Arg Gln Lys Leu 325 330 335 Thr Asn Glu Asn Ala Ser Leu Glu Ser Glu Leu Ser Gln Leu Tyr Gly 340 345 350 Glu Glu Ala Ile Ser Thr Leu Lys Gly Lys Asn Ala Asn Met Ser Val 355 360 365 Gln Ser Val Asn Gly Phe Glu Gln Asp Thr Leu Met Gly Asn Asn Ser 370 375 380 Leu Ser Glu 385 191104DNABrassica napus 19atgggaagca acgaagaagg aaagaccaca cagtctgaca agccagcaca agtacaagct 60cctcctcctc ctcctgagca aagcaatgtt catgtgtatc atcatgattg ggctgctatg 120caggcgtact atggaccaag agtagccata actcctcaat attacaactc aaatggtcat 180gctcctccac ctcctcctta tatctggggc tctccttcgc caatgatggc tccttatgga 240acaccatacc caccgttttg tcctcctggt ggagtctatg ctcatcctgc tcttcaaatg 300ggctcacaac cacaagggcc tgcttctcaa gcaacacctg ttgttgcaac tccgttgaac 360ttggaagctc atccagctaa ctcatctgga aacacggatc aggggttcat gaaaaagttg 420aaagaatttg atggacttgc aatgtctata agcaataaca aatctgggag tggtgaacat 480agcagtgaac ctaagaattc tcagagttct gagaatgatg attccagcaa tggtagtgat 540gggaatacaa ctgggggaga acagtctagg aagaaaagaa gccgggaagg atcaccaaac 600aacgatggga agccttcatc tcaaattgtt cctcttctaa gagatgaaag tgagaaacag 660gcagtgacta tggggactcc tgttatgccc acagttttgg atttcccaca gccattccct 720ggtgcgcctc atgaagtctg gaatgaaaaa gaggttaaac gagagaagag aaaacagtca 780aacagagaat ctgctagaag gtcaagactg aggaagcagg ctgaaactga agaactgtcc 840gtcaaggttg atgcactagt tgctgagaac atgactctga ggtcaaaact aggccaacta 900aacgatgagt ctgagaaact acggctggag aaccaagctt tattggatca actgaaagcg 960caagcaactg ggaaaacaga gaacctaata tctggagttg ataagaacaa cagctctgta 1020tcaggtacta gtagtagtag taagaatgcg gaacagcaac tcttaaacgt aagtctaaga 1080accgattctg tcgcggctag ctga 110420367PRTBrassica napus 20Met Gly Ser Asn Glu Glu Gly Lys Thr Thr Gln Ser Asp Lys Pro Ala 1 5 10 15 Gln Val 
Gln Ala Pro Pro Pro Pro Pro Glu Gln Ser Asn Val His Val 20 25 30 Tyr His His Asp Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val 35 40 45 Ala Ile Thr Pro Gln Tyr Tyr Asn Ser Asn Gly His Ala Pro Pro Pro 50 55 60 Pro Pro Tyr Ile Trp Gly Ser Pro Ser Pro Met Met Ala Pro Tyr Gly 65 70 75 80 Thr Pro Tyr Pro Pro Phe Cys Pro Pro Gly Gly Val Tyr Ala His Pro 85 90 95 Ala Leu Gln Met Gly Ser Gln Pro Gln Gly Pro Ala Ser Gln Ala Thr 100 105 110 Pro Val Val Ala Thr Pro Leu Asn Leu Glu Ala His Pro Ala Asn Ser 115 120 125 Ser Gly Asn Thr Asp Gln Gly Phe Met Lys Lys Leu Lys Glu Phe Asp 130 135 140 Gly Leu Ala Met Ser Ile Ser Asn Asn Lys Ser Gly Ser Gly Glu His 145 150 155 160 Ser Ser Glu Pro Lys Asn Ser Gln Ser Ser Glu Asn Asp Asp Ser Ser 165 170 175 Asn Gly Ser Asp Gly Asn Thr Thr Gly Gly Glu Gln Ser Arg Lys Lys 180 185 190 Arg Ser Arg Glu Gly Ser Pro Asn Asn Asp Gly Lys Pro Ser Ser Gln 195 200 205 Ile Val Pro Leu Leu Arg Asp Glu Ser Glu Lys Gln Ala Val Thr Met 210 215 220 Gly Thr Pro Val Met Pro Thr Val Leu Asp Phe Pro Gln Pro Phe Pro 225 230 235 240 Gly Ala Pro His Glu Val Trp Asn Glu Lys Glu Val Lys Arg Glu Lys 245 250 255 Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys 260 265 270 Gln Ala Glu Thr Glu Glu Leu Ser Val Lys Val Asp Ala Leu Val Ala 275 280 285 Glu Asn Met Thr Leu Arg Ser Lys Leu Gly Gln Leu Asn Asp Glu Ser 290 295 300 Glu Lys Leu Arg Leu Glu Asn Gln Ala Leu Leu Asp Gln Leu Lys Ala 305 310 315 320 Gln Ala Thr Gly Lys Thr Glu Asn Leu Ile Ser Gly Val Asp Lys Asn 325 330 335 Asn Ser Ser Val Ser Gly Thr Ser Ser Ser Ser Lys Asn Ala Glu Gln 340 345 350 Gln Leu Leu Asn Val Ser Leu Arg Thr Asp Ser Val Ala Ala Ser 355 360 365 211125DNABrassica napus 21atgggaaaaa gcgaggaacc aaaggttacc aaatcagaca acaaaccatc ttcaccacct 60gcggatcaaa caaatgttca tgtctaccct gattgggccg ctatgcaggc ttattatggg 120ccaagagtag caatacctcc ttattacaac tcagctatgg ctgctgcatc tggtcatcct 180cctcctcctt acatgtggaa tcctcagcat atgatgtcac catatggaac accgtatgca 240gcggtttacc ctcatggagg aggagtctac 
gctcatcctg gattccccat gcctcaaagt 300caaaagggtg ctgctttatc aactccgggg acgccattga acatagacac tcctagtaaa 360tcaacaggaa acacagagaa tgggctgatg aagaagctga aagagtttga tggacttgct 420atgtctctag gaaatggtaa taatggtgat gaaggtaaac gctcacggaa cagctcagaa 480acggatggtt ctagtgatgg aagtgacggg aataccactg gggctgatga accgaaactt 540aagagaaggc gagaaggaac tccaaccaaa gatgaggaga aacatttggt tcagtcaagc 600tcatttcggt ctgtttctca gtcaagtggt gataacgttg taaagcatag tgttcaagga 660ggaggtggag ctatagtctc tgctgctggt gtaagtgcaa attcaaaccc aaccttcatg 720tcacaatctt tagccatggt tcctcctgaa acttggcttc agaacgagag agagctgaaa 780cgggagagaa ggaaacagtc taatagagaa tctgcaagaa ggtcaagatt aaggaaacag 840gctgagactg aagaactggc taggaaagtt gaagccttga cagcagaaaa catggcgtta 900agatctgagc taaaccaact taatgagaaa tctaataatc taagaggagc taatgcaacc 960ttactggaca agctgaaaag ttcagaacct gaaaagagag ttaagagctc aggaaatgga 1020gacgacaaga acaagaagca aggagacaat gagactaact ctaccagcaa actgcatcaa 1080ctgcttgata ccaagcctcg agctgacggt gtagctgctc gctaa 112522374PRTBrassica napus 22Met Gly Lys Ser Glu Glu Pro Lys Val Thr Lys Ser Asp Asn Lys Pro 1 5 10 15 Ser Ser Pro Pro Ala Asp Gln Thr Asn Val His Val Tyr Pro Asp Trp 20 25 30 Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Ile Pro Pro Tyr 35 40 45 Tyr Asn Ser Ala Met Ala Ala Ala Ser Gly His Pro Pro Pro Pro Tyr 50 55 60 Met Trp Asn Pro Gln His Met Met Ser Pro Tyr Gly Thr Pro Tyr Ala 65 70 75 80 Ala Val Tyr Pro His Gly Gly Gly Val Tyr Ala His Pro Gly Phe Pro 85 90 95 Met Pro Gln Ser Gln Lys Gly Ala Ala Leu Ser Thr Pro Gly Thr Pro 100 105 110 Leu Asn Ile Asp Thr Pro Ser Lys Ser Thr Gly Asn Thr Glu Asn Gly 115 120 125 Leu Met Lys Lys Leu Lys Glu Phe Asp Gly Leu Ala Met Ser Leu Gly 130 135 140 Asn Gly Asn Asn Gly Asp Glu Gly Lys Arg Ser Arg Asn Ser Ser Glu 145 150 155 160 Thr Asp Gly Ser Ser Asp Gly Ser Asp Gly Asn Thr Thr Gly Ala Asp 165 170 175 Glu Pro Lys Leu Lys Arg Arg Arg Glu Gly Thr Pro Thr Lys Asp Glu 180 185 190 Glu Lys His Leu Val Gln Ser Ser Ser Phe Arg Ser Val Ser Gln Ser 195 200 205 Ser Gly 
Asp Asn Val Val Lys His Ser Val Gln Gly Gly Gly Gly Ala 210 215 220 Ile Val Ser Ala Ala Gly Val Ser Ala Asn Ser Asn Pro Thr Phe Met 225 230 235 240 Ser Gln Ser Leu Ala Met Val Pro Pro Glu Thr Trp Leu Gln Asn Glu 245 250 255 Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala 260 265 270 Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Thr Glu Glu Leu Ala Arg 275 280 285 Lys Val Glu Ala Leu Thr Ala Glu Asn Met Ala Leu Arg Ser Glu Leu 290 295 300 Asn Gln Leu Asn Glu Lys Ser Asn Asn Leu Arg Gly Ala Asn Ala Thr 305 310 315 320 Leu Leu Asp Lys Leu Lys Ser Ser Glu Pro Glu Lys Arg Val Lys Ser 325 330 335 Ser Gly Asn Gly Asp Asp Lys Asn Lys Lys Gln Gly Asp Asn Glu Thr 340 345 350 Asn Ser Thr Ser Lys Leu His Gln Leu Leu Asp Thr Lys Pro Arg Ala 355 360 365 Asp Gly Val Ala Ala Arg 370 23942DNABrassica napus 23atgggaacaa gcgaagaaaa gacgccttct aaaccagcat cctcaacaca ggacattccc 60cccacacctt atccagactg gtctaactca atgcaggctt attatggagg aggaggtact 120ccgagtcctt ttttcccatc tccagttgga tctcctagtc ctcaccctta catgtggggt 180gctcaacacc atatgatgcc gccttatggg acccccgttc cgtacccagc catgtatcct 240ccaggggcgg tctacgccca tcctggcatg cccatgcctc cttcttctgc tccaaccaac 300gagaccgtga aggaacaagc ccctggaaag aagtcaaaag ggagcttgaa aagaaagggc 360gaaggaggtg agaaggcgcc ttctggttct gggaacgatg gtgtatctca cagtgatgaa 420agtgtcacag ggggttcatc tgatgaaaac gatgagaatg ctaaccacca ggaacatggt 480tcagttagaa agcctagctt tggacaaatg ctggcggatg caagttctca gagtaatact 540accggtgaga tgatccaagg ttcagttccc atgaagccac tagcccctgg gactaatttg 600aatatgggaa tggacttatg gtcttcccag gctggtgtac ctgtgaagga tgaaagagag 660ctcaagaggc agaaaaggaa acaatctaac cgtgaatccg ccaggcggtc cagactaagg 720aagcaggcgg aatgcgaaca gcttcagcag agagtagaga gtttgactag tgagaatcag 780agcctgagag atgagttgca gagactctct ggagaatgtg agaagctcaa gactcagaac 840agttctattc aggatgagtt ggtaagagtg catggaccag aggccgtggc taatctagaa 900cagaatgctg atgggtctaa agatggcgaa ggaacagatt aa 94224313PRTBrassica napus 24Met Gly Thr Ser Glu Glu Lys Thr Pro Ser Lys Pro Ala Ser Ser Thr 1 5 10 15 Gln 
Asp Ile Pro Pro Thr Pro Tyr Pro Asp Trp Ser Asn Ser Met Gln 20 25 30 Ala Tyr Tyr Gly Gly Gly Gly Thr Pro Ser Pro Phe Phe Pro Ser Pro 35 40 45 Val Gly Ser Pro Ser Pro His Pro Tyr Met Trp Gly Ala Gln His His 50 55 60 Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Met Tyr Pro 65 70 75 80 Pro Gly Ala Val Tyr Ala His Pro Gly Met Pro Met Pro Pro Ser Ser 85 90 95 Ala Pro Thr Asn Glu Thr Val Lys Glu Gln Ala Pro Gly Lys Lys Ser 100 105 110 Lys Gly Ser Leu Lys Arg Lys Gly Glu Gly Gly Glu Lys Ala Pro Ser 115 120 125 Gly Ser Gly Asn Asp Gly Val Ser His Ser Asp Glu Ser Val Thr Gly 130 135 140 Gly Ser Ser Asp Glu Asn Asp Glu Asn Ala Asn His Gln Glu His Gly 145 150 155 160 Ser Val Arg Lys Pro Ser Phe Gly Gln Met Leu Ala Asp Ala Ser Ser 165 170 175 Gln Ser Asn Thr Thr Gly Glu Met Ile Gln Gly Ser Val Pro Met Lys 180 185 190 Pro Leu Ala Pro Gly Thr Asn Leu Asn Met Gly Met Asp Leu Trp Ser 195 200 205 Ser Gln Ala Gly Val Pro Val Lys Asp Glu Arg Glu Leu Lys Arg Gln 210 215 220 Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 225 230 235 240 Lys Gln Ala Glu Cys Glu Gln Leu Gln Gln Arg Val Glu Ser Leu Thr 245 250 255 Ser Glu Asn Gln Ser Leu Arg Asp Glu Leu Gln Arg Leu Ser Gly Glu 260 265 270 Cys Glu Lys Leu Lys Thr Gln Asn Ser Ser Ile Gln Asp Glu Leu Val 275 280 285 Arg Val His Gly Pro Glu Ala Val Ala Asn Leu Glu Gln Asn Ala Asp 290 295 300 Gly Ser Lys Asp Gly Glu Gly Thr Asp 305 310 25942DNABrassica napus 25atgggaacga gcgaggaaaa gaccccattt aagccttcca agccagcatc ctcggcacag 60gacactcctc ccacacctta tgcagactgg tcaaactcaa tgcaggctta ttatggagga 120ggaggtactc caagtccttt tttcccatcc ccagttggat ctcctagtcc tcacccttat 180atgtggggtg ctcaacacca tatgatgccg ccttatggga ctccagttcc gtatccagca 240atgtatcccc cagggactgt ctatgcccat cctggcatgc ccatgcctca ggcttctggt 300ccaaccaaca cggagaccgt gaaagctcaa gcccctggta agaagccaaa gggtaacttg 360aaaagaaaga gtggaggaag tgagaaggcg ccttctggtt cagggaacga tgctgtatct 420caaagtgaag aaagtgtcac agctggttca tctgatgaaa acgatgacaa tgccaaccac 480caggaacaag 
gttcagttag aaagccaagc ttcggacaga tgctggctga tgcaagttct 540cagagtaata ctactggtga gatccaaggt tccatgccaa tgaaaccagt ggcgccaggg 600actaatctga atatggggat ggacttatgg tcttcccaga ctggtgtagc tgtgaaggat 660gaaagagagc tcaagaggca gaaaaggaaa caatctaacc gtgaatcagc tagacggtcc 720agattgcgga agcaggcgga atgcgagcag cttcaacaga gagtagagag tttgacgagt 780gagaatcaaa gtctgagaga tgagttacag agactctccg gagaatgtga gaagctcaag 840acggagaaca acactattca ggatgagttg gtaagagtgc atggaccaga ggcagtagct 900aatctagaac agaatgctga tggatctaaa gatggtgaat ga 94226313PRTBrassica napus 26Met Gly Thr Ser Glu Glu Lys Thr Pro Phe Lys Pro Ser Lys Pro Ala 1 5 10 15 Ser Ser Ala Gln Asp Thr Pro Pro Thr Pro Tyr Ala Asp Trp Ser Asn 20 25 30 Ser Met Gln Ala Tyr Tyr Gly Gly Gly Gly Thr Pro Ser Pro Phe Phe 35 40 45 Pro Ser Pro Val Gly Ser Pro Ser Pro His Pro Tyr Met Trp Gly Ala 50 55 60 Gln His His Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala 65 70 75 80 Met Tyr Pro Pro Gly Thr Val Tyr Ala His Pro Gly Met Pro Met Pro 85 90 95 Gln Ala Ser Gly Pro Thr Asn Thr Glu Thr Val Lys Ala Gln Ala Pro 100 105 110 Gly Lys Lys Pro Lys Gly Asn Leu Lys Arg Lys Ser Gly Gly Ser Glu 115 120 125 Lys Ala Pro Ser Gly Ser Gly Asn Asp Ala Val Ser Gln Ser Glu Glu 130 135 140 Ser Val Thr Ala Gly Ser Ser Asp Glu Asn Asp Asp Asn Ala Asn His 145 150 155 160 Gln Glu Gln Gly Ser Val Arg Lys Pro Ser Phe Gly Gln Met Leu Ala 165 170 175 Asp Ala Ser Ser Gln Ser Asn Thr Thr Gly Glu Ile Gln Gly Ser Met 180 185 190 Pro Met Lys Pro Val Ala Pro Gly Thr Asn Leu Asn Met Gly Met Asp 195 200 205 Leu Trp Ser Ser Gln Thr Gly Val Ala Val Lys Asp Glu Arg Glu Leu 210 215 220 Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser 225 230 235 240 Arg Leu Arg Lys Gln Ala Glu Cys Glu Gln Leu Gln Gln Arg Val Glu 245 250 255 Ser Leu Thr Ser Glu Asn Gln Ser Leu Arg Asp Glu Leu Gln Arg Leu 260 265 270 Ser Gly Glu Cys Glu Lys Leu Lys Thr Glu Asn Asn Thr Ile Gln Asp 275 280 285 Glu Leu Val Arg Val His Gly Pro Glu Ala Val Ala Asn Leu Glu Gln 290 295 300 Asn Ala Asp 
Gly Ser Lys Asp Gly Glu 305 310 27933DNABrassica napus 27atgggaacaa gcgaagaaaa gacgccttcc aaaccagcat cctcaacaca ggacattcct 60cccacaccat atccagactg gtcaaactca atgcaggctt attatggagg aggaggtact 120ccgaatcctt ttttcccatc tcctgttgga tctcctagtc ctcaccctta catgtggggt 180gctcaacacc atatgatgcc gccttatggg accccggttc cgtatccagc catgtatcct 240ccaggggcgg tctacgctca tcctggcatg cccatgcctc ctgcttctgc tccaaccaac 300aaggagacgg tgaaggaaca agcccctggc aagaagtcaa aagggagctt gaaaagaaag 360ggtgaaggag gtgagaaggc gccttctggt tctgggaacg atggtgtatc tcacagtgat 420gaaagtgtca caggaggttc atctgatgaa aatgatgaga acgctaacca ccaggaacaa 480ggctcagtta gaaagccgag ctttggacaa atgctagcgg atgcaagttc tcagagtaat 540actactggtg agatccaagg ttccatgcca atgaaaccag tggcgccagg gactaatctg 600aatatgggga tggacttatg gtcttcccag actggtgtag ctgtgaagga tgaaagagag 660ctcaagaggc agaaaaggaa acaatctaac cgtgaatcag ctagacggtc cagattgcgg 720aagcaggcgg aatgcgagca gcttcaacag agagtagaga gtttgacgag tgagaatcaa 780agtctgagag atgagttaca gagactctcc ggagaatgtg agaagctcaa gacggagaac 840aacactattc aggatgagtt ggtaagagtg catggaccag aggcagtagc taatctagaa 900cagaatgctg
atggatctaa agatggtgaa tga 93328310PRTBrassica napus 28Met Gly Thr Ser Glu Glu Lys Thr Pro Ser Lys Pro Ala Ser Ser Thr 1 5 10 15 Gln Asp Ile Pro Pro Thr Pro Tyr Pro Asp Trp Ser Asn Ser Met Gln 20 25 30 Ala Tyr Tyr Gly Gly Gly Gly Thr Pro Asn Pro Phe Phe Pro Ser Pro 35 40 45 Val Gly Ser Pro Ser Pro His Pro Tyr Met Trp Gly Ala Gln His His 50 55 60 Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Met Tyr Pro 65 70 75 80 Pro Gly Ala Val Tyr Ala His Pro Gly Met Pro Met Pro Pro Ala Ser 85 90 95 Ala Pro Thr Asn Lys Glu Thr Val Lys Glu Gln Ala Pro Gly Lys Lys 100 105 110 Ser Lys Gly Ser Leu Lys Arg Lys Gly Glu Gly Gly Glu Lys Ala Pro 115 120 125 Ser Gly Ser Gly Asn Asp Gly Val Ser His Ser Asp Glu Ser Val Thr 130 135 140 Gly Gly Ser Ser Asp Glu Asn Asp Glu Asn Ala Asn His Gln Glu Gln 145 150 155 160 Gly Ser Val Arg Lys Pro Ser Phe Gly Gln Met Leu Ala Asp Ala Ser 165 170 175 Ser Gln Ser Asn Thr Thr Gly Glu Ile Gln Gly Ser Met Pro Met Lys 180 185 190 Pro Val Ala Pro Gly Thr Asn Leu Asn Met Gly Met Asp Leu Trp Ser 195 200 205 Ser Gln Thr Gly Val Ala Val Lys Asp Glu Arg Glu Leu Lys Arg Gln 210 215 220 Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 225 230 235 240 Lys Gln Ala Glu Cys Glu Gln Leu Gln Gln Arg Val Glu Ser Leu Thr 245 250 255 Ser Glu Asn Gln Ser Leu Arg Asp Glu Leu Gln Arg Leu Ser Gly Glu 260 265 270 Cys Glu Lys Leu Lys Thr Glu Asn Asn Thr Ile Gln Asp Glu Leu Val 275 280 285 Arg Val His Gly Pro Glu Ala Val Ala Asn Leu Glu Gln Asn Ala Asp 290 295 300 Gly Ser Lys Asp Gly Glu 305 310 29972DNACoffea canephora 29atgcaggctt actatggtgc tggagctact ccaccctttt ttgcatcaac tgttgcttct 60cctagtcccc atccctattt atggggcaac cagcatcctc tgatgccacc ttatggcact 120ccagttcctt atccagcgct atatccaggg ggagtttatg ctcatcctaa catggcaatg 180gctccaggag cggtacaggc tcctatagag tcggatgcaa aagctcctga tgggaaagac 240cggaacacaa acaaaaaact caagggtcct tcaggaaacc ctgggttgat tgctgtcaag 300gctggggaga gtgggaaagc ggcttcaggc tcaggaaatg atggtgcaac tcaaagtgct 360gaaagtggaa gtgaaggttc atctgatgga 
agcgatgaga ataataacca tgaactttct 420gcaacaaaga aaggcagctt tgatcaaatg cttgcagatg gagccactgc acagaacaat 480acttctgtag caaattttca gaattcagtg cctgggaatc ctgtagtctc cgtgcctgct 540actaatctaa atattggaat ggacttgtgg aatccatctt ctggagcttc tggagccatg 600aagatgcgtc caaatcctgg tgtctcacct gctgttgctc ctggcatgat gactgaccag 660tggattcagg atgagcgaga gttgaaaaga cagaagcgaa agcaatctaa tcgtgagtct 720gcccggagat caagattacg caaacaggct gagtgcgaag agttgcagca gagagtcgag 780tcactgaaca gtgaaaatcg tgcacttagg gatgagctac aaaaggtttc tgaggaatgc 840gagaagctta catccgaaaa taactctatt aaggaggagt tgactaggtt gtgtggacca 900gaggcagtag ctaaattaga gagcagtagc atcacccaac ttgagacaaa tggtgatgaa 960gatgaccatt ga 97230323PRTCoffea canephora 30Met Gln Ala Tyr Tyr Gly Ala Gly Ala Thr Pro Pro Phe Phe Ala Ser 1 5 10 15 Thr Val Ala Ser Pro Ser Pro His Pro Tyr Leu Trp Gly Asn Gln His 20 25 30 Pro Leu Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Leu Tyr 35 40 45 Pro Gly Gly Val Tyr Ala His Pro Asn Met Ala Met Ala Pro Gly Ala 50 55 60 Val Gln Ala Pro Ile Glu Ser Asp Ala Lys Ala Pro Asp Gly Lys Asp 65 70 75 80 Arg Asn Thr Asn Lys Lys Leu Lys Gly Pro Ser Gly Asn Pro Gly Leu 85 90 95 Ile Ala Val Lys Ala Gly Glu Ser Gly Lys Ala Ala Ser Gly Ser Gly 100 105 110 Asn Asp Gly Ala Thr Gln Ser Ala Glu Ser Gly Ser Glu Gly Ser Ser 115 120 125 Asp Gly Ser Asp Glu Asn Asn Asn His Glu Leu Ser Ala Thr Lys Lys 130 135 140 Gly Ser Phe Asp Gln Met Leu Ala Asp Gly Ala Thr Ala Gln Asn Asn 145 150 155 160 Thr Ser Val Ala Asn Phe Gln Asn Ser Val Pro Gly Asn Pro Val Val 165 170 175 Ser Val Pro Ala Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Asn Pro 180 185 190 Ser Ser Gly Ala Ser Gly Ala Met Lys Met Arg Pro Asn Pro Gly Val 195 200 205 Ser Pro Ala Val Ala Pro Gly Met Met Thr Asp Gln Trp Ile Gln Asp 210 215 220 Glu Arg Glu Leu Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser 225 230 235 240 Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln 245 250 255 Gln Arg Val Glu Ser Leu Asn Ser Glu Asn Arg Ala Leu Arg Asp Glu 260 265 270 Leu Gln Lys 
Val Ser Glu Glu Cys Glu Lys Leu Thr Ser Glu Asn Asn 275 280 285 Ser Ile Lys Glu Glu Leu Thr Arg Leu Cys Gly Pro Glu Ala Val Ala 290 295 300 Lys Leu Glu Ser Ser Ser Ile Thr Gln Leu Glu Thr Asn Gly Asp Glu 305 310 315 320 Asp Asp His 31540DNACoffea canephora 31atgggaagtg tactttctcc gaatatgact tcgactcttg aacttagaaa cccttctggt 60ggaaatatga agacaagtcc tgttagtgaa gcctggctgc agaatgagcg agagctgaag 120cgggaaagga ggaaacagtc aaatcgagaa tctgcaagga gatcaagatt gaggaaacag 180gctgagactg aagaactagc taaaaaagtt caatcactga ctgccgagaa cctaagtttg 240aagtctgaaa tacacaaatt aactgagagc tctgaacggc tgaagcttga aaatgctact 300atgatggaga aactgaaaaa cccacaactg gggcagactg gaaatttgag tttaagcaag 360tttgatgaaa tgcgactaca accagttggc acggcaaatc tactcgccag ggtaaacaac 420tctggttctg ttgacaggaa tgacgaggag ggtgaggtgt tcgagaatac aaaatccggg 480gcaaagcttc gccagctgct tgatgcaaac ccccgcacgg atgccgtggc agctggctga 54032179PRTCoffea canephora 32Met Gly Ser Val Leu Ser Pro Asn Met Thr Ser Thr Leu Glu Leu Arg 1 5 10 15 Asn Pro Ser Gly Gly Asn Met Lys Thr Ser Pro Val Ser Glu Ala Trp 20 25 30 Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln Ser Asn 35 40 45 Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Thr Glu 50 55 60 Glu Leu Ala Lys Lys Val Gln Ser Leu Thr Ala Glu Asn Leu Ser Leu 65 70 75 80 Lys Ser Glu Ile His Lys Leu Thr Glu Ser Ser Glu Arg Leu Lys Leu 85 90 95 Glu Asn Ala Thr Met Met Glu Lys Leu Lys Asn Pro Gln Leu Gly Gln 100 105 110 Thr Gly Asn Leu Ser Leu Ser Lys Phe Asp Glu Met Arg Leu Gln Pro 115 120 125 Val Gly Thr Ala Asn Leu Leu Ala Arg Val Asn Asn Ser Gly Ser Val 130 135 140 Asp Arg Asn Asp Glu Glu Gly Glu Val Phe Glu Asn Thr Lys Ser Gly 145 150 155 160 Ala Lys Leu Arg Gln Leu Leu Asp Ala Asn Pro Arg Thr Asp Ala Val 165 170 175 Ala Ala Gly 33852DNACitrus clementina 33atgcccccca tagggcaccc ccgtttccat ccacaagttt atttttttcc ggggggggtt 60tgcccttcct ggccgggttt ggagttcaaa ccgggcccca accaaaacag ggccggaagg 120aaagggccgg gagcaaagga ccggggttcg tttaaaaatt ccaggggact ccggaaggta 180aggctgggga gattgtttag 
gcaacttttg gtttctggga atgacggtgt ttttcaaagt 240ggtggaagtg gtagtgacgg ttcttttgat gcgagtgatg agaattgtaa ccagcaggag 300tttgttaggg gtaagaaagg aagctttgac aagatgcttg cagatgccaa cacggagaat 360aacacagcgg aagctgttcc aggatcagtg cccgggaagc ctgtagtttc aatgcctgca 420actaatctca atattggcat ggatttgtgg aatacatccc ctgctgctgc tggagctgca 480aaaatgagaa caaatccatt tggggcctca ccagcagttg ctcccgctgg cataatgccc 540gatcaatgga ttcaagatga acgtgaattg aaaagacaga aaaggaagca atttaatagg 600gagtcagcca gaaggtcaag gttacgcaag caggcggaat gtggggagct acaggccaga 660gtggggactt tgagcaatga gaatcgcaac cttagagatg agttacagag gctttttgag 720gaatgggaga agcttacatt tgaaaataat tccattaagg aagacttatt tcggttgtgt 780ggaccagagg cagttgttaa ttttgagcag agcaacccca ctcagttgtc cggggaagaa 840gaaaatagct aa 85234283PRTCitrus clementina 34Met Pro Pro Ile Gly His Pro Arg Phe His Pro Gln Val Tyr Phe Phe 1 5 10 15 Pro Gly Gly Val Cys Pro Ser Trp Pro Gly Leu Glu Phe Lys Pro Gly 20 25 30 Pro Asn Gln Asn Arg Ala Gly Arg Lys Gly Pro Gly Ala Lys Asp Arg 35 40 45 Gly Ser Phe Lys Asn Ser Arg Gly Leu Arg Lys Val Arg Leu Gly Arg 50 55 60 Leu Phe Arg Gln Leu Leu Val Ser Gly Asn Asp Gly Val Phe Gln Ser 65 70 75 80 Gly Gly Ser Gly Ser Asp Gly Ser Phe Asp Ala Ser Asp Glu Asn Cys 85 90 95 Asn Gln Gln Glu Phe Val Arg Gly Lys Lys Gly Ser Phe Asp Lys Met 100 105 110 Leu Ala Asp Ala Asn Thr Glu Asn Asn Thr Ala Glu Ala Val Pro Gly 115 120 125 Ser Val Pro Gly Lys Pro Val Val Ser Met Pro Ala Thr Asn Leu Asn 130 135 140 Ile Gly Met Asp Leu Trp Asn Thr Ser Pro Ala Ala Ala Gly Ala Ala 145 150 155 160 Lys Met Arg Thr Asn Pro Phe Gly Ala Ser Pro Ala Val Ala Pro Ala 165 170 175 Gly Ile Met Pro Asp Gln Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg 180 185 190 Gln Lys Arg Lys Gln Phe Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu 195 200 205 Arg Lys Gln Ala Glu Cys Gly Glu Leu Gln Ala Arg Val Gly Thr Leu 210 215 220 Ser Asn Glu Asn Arg Asn Leu Arg Asp Glu Leu Gln Arg Leu Phe Glu 225 230 235 240 Glu Trp Glu Lys Leu Thr Phe Glu Asn Asn Ser Ile Lys Glu Asp Leu 245 250 255 Phe Arg 
Leu Cys Gly Pro Glu Ala Val Val Asn Phe Glu Gln Ser Asn 260 265 270 Pro Thr Gln Leu Ser Gly Glu Glu Glu Asn Ser 275 280 35546DNACichorium endivia 35atgggactcc ggttccatac cccgctccat tatccacctg caggagttta tgctcatcct 60agtatgccta tgactccaag tcctgctcca gcaaacacag aaatggaagc aaaggcatat 120gaaggaaagg aaagggccac caataaaaag tccaagggta cttctggaaa tggaaatgtt 180ggtgttagaa ctggagatag tggcattgca gcatcaagtt cagggaatga tggtggtgcc 240acacagagtg ctgatagtgg aagtgatggt tcatcagatg gaagtgatga aaatgaccaa 300aatgaatttt ctggaggcaa gaaaggaagc ttcaatcaga tgcttgcaga tgcaaatgca 360cagaacaata actttcacac accagtagta cctgtgaatc ctgtgacttc tattcctggt 420acaaatctca tcatgagaat ggacttgcgg aatccctcca ccggaaatgc cgccatgaaa 480atgcgaacaa atcattccgg caagtcccgt ggggaggtgc cgccacctat gaagcctgaa 540tcatga 54636181PRTCichorium endivia 36Met Gly Leu Arg Phe His Thr Pro Leu His Tyr Pro Pro Ala Gly Val 1 5 10 15 Tyr Ala His Pro Ser Met Pro Met Thr Pro Ser Pro Ala Pro Ala Asn 20 25 30 Thr Glu Met Glu Ala Lys Ala Tyr Glu Gly Lys Glu Arg Ala Thr Asn 35 40 45 Lys Lys Ser Lys Gly Thr Ser Gly Asn Gly Asn Val Gly Val Arg Thr 50 55 60 Gly Asp Ser Gly Ile Ala Ala Ser Ser Ser Gly Asn Asp Gly Gly Ala 65 70 75 80 Thr Gln Ser Ala Asp Ser Gly Ser Asp Gly Ser Ser Asp Gly Ser Asp 85 90 95 Glu Asn Asp Gln Asn Glu Phe Ser Gly Gly Lys Lys Gly Ser Phe Asn 100 105 110 Gln Met Leu Ala Asp Ala Asn Ala Gln Asn Asn Asn Phe His Thr Pro 115 120 125 Val Val Pro Val Asn Pro Val Thr Ser Ile Pro Gly Thr Asn Leu Ile 130 135 140 Met Arg Met Asp Leu Arg Asn Pro Ser Thr Gly Asn Ala Ala Met Lys 145 150 155 160 Met Arg Thr Asn His Ser Gly Lys Ser Arg Gly Glu Val Pro Pro Pro 165 170 175 Met Lys Pro Glu Ser 180 37456DNACentaurea maculosa 37atggtgcccg gtgaatctct gttgcagaac gagcgggaac ttaaaaggga gaggagaaag 60caatctaatc gagaatctgc caggcggtct agattaagga aacaggcgga agcagaagaa 120cttgcgataa aagttgaatc cctcactaat gaaaatctga cccttaagtc cgaaattaac 180cgcttgactg ataattccga gaaactgaag cttcaaaatg ctaaactaat tgagaaactc 240aagaatgcac gacaagacac cgaagaccca 
cggctggacc caaacggctc gtctctgagc 300acggctaacc tcctctcgag agtcaacaac gggtctggtg ctagaactga tggagacgct 360gaagtatatg agaataacaa taaccaaaac tcgggtgcaa aactgcgtca actattggac 420gccagccctc ggaccgatgc tgttgcagcg ggctaa 45638151PRTCentaurea maculosa 38Met Val Pro Gly Glu Ser Leu Leu Gln Asn Glu Arg Glu Leu Lys Arg 1 5 10 15 Glu Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu 20 25 30 Arg Lys Gln Ala Glu Ala Glu Glu Leu Ala Ile Lys Val Glu Ser Leu 35 40 45 Thr Asn Glu Asn Leu Thr Leu Lys Ser Glu Ile Asn Arg Leu Thr Asp 50 55 60 Asn Ser Glu Lys Leu Lys Leu Gln Asn Ala Lys Leu Ile Glu Lys Leu 65 70 75 80 Lys Asn Ala Arg Gln Asp Thr Glu Asp Pro Arg Leu Asp Pro Asn Gly 85 90 95 Ser Ser Leu Ser Thr Ala Asn Leu Leu Ser Arg Val Asn Asn Gly Ser 100 105 110 Gly Ala Arg Thr Asp Gly Asp Ala Glu Val Tyr Glu Asn Asn Asn Asn 115 120 125 Gln Asn Ser Gly Ala Lys Leu Arg Gln Leu Leu Asp Ala Ser Pro Arg 130 135 140 Thr Asp Ala Val Ala Ala Gly 145 150 391197DNACentaurea maculosa 39atgggcaact gtgaagagac aaaggcttgt aaacctgaga aatcgtcttc acctccaccc 60gagcaacaac agaccaacgt tcatgcattt cctgattggg cagccatgca ggcttattat 120ggccctagaa tggctatgcc accatacttc aactcggctg ttgcatctgg tcatgcccct 180ccaccatata tgtggggacc accacagcat atgatgccgc cttatgctgc tatgtatcca 240catggaggtg tttatccaca tcccggagtt cctcttgcgg gtagtcctat gagcattgat 300tctccggcca agtcatcagg gaattctgat cgtggattgc tgaaaaagtt gaaaggattt 360gatgggttgg caatgtcaat tggcaatggc aacggtgata gtggtggagg tggaaatgag 420aatgggatct cccatagtgg ggagactgaa ggttctagtg aaggaagtga tggcaataca 480acagaggggg gtcaaaatag cgggaaaagg agccgagaag gatcgcctaa ggctcctgaa 540gttggcaaga ccgagccact aagcggacaa tttttcccta ctgaagcaaa cggagcttcc 600aagaaagtta ctggtcttac tgttaccctt cctaaggttt cgggtaaatt aggagctgcc 660gtctccgcta acttgacctc tgacttagag attaagaatt ctcccacaac tgctgctaag 720ctggcctccg caactgtcgc catggtgccc ggtgaatctc tgttgcagaa cgagcgtgaa 780cttaaaaggg agaggagaaa gcaatctaat cgagaatctg ccaggcggtc tagattaagg 840aaacaggcgg aagcagaaga acttgcgata aaagttgaat 
ccctcactaa tgaaaatctg 900acccttaagt ccgaaattaa ccgcttgagc gataattccg agaaactgaa gcttcaaaat 960gccaaactaa ttgagaaact caagaatgca cgacaagaca ccgaacaccc acggctggac 1020ccaaatggct cgtctctgag cacggctaac ctcctctcga gagtcgacaa cgggtccggt 1080gctagaactg atggagacgt tgaagtgtac gagaataaca ataaccaaaa cccgggtgca 1140aaactgcgtc aactattgca cgccagccca aggaccgatg ccgttgcagc gggctaa 119740398PRTCentaurea maculosa 40Met Gly Asn Cys Glu Glu Thr Lys Ala Cys Lys Pro Glu Lys Ser Ser 1 5 10 15 Ser Pro Pro Pro Glu Gln Gln Gln Thr Asn Val His Ala Phe Pro Asp 20 25 30 Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Met Ala Met Pro Pro 35 40 45 Tyr Phe Asn Ser Ala Val Ala Ser Gly His Ala Pro Pro Pro Tyr Met 50 55 60 Trp Gly Pro Pro Gln His Met Met Pro Pro Tyr Ala Ala Met Tyr Pro 65 70 75 80 His Gly Gly Val Tyr Pro His Pro Gly Val Pro Leu Ala Gly Ser Pro 85 90
95 Met Ser Ile Asp Ser Pro Ala Lys Ser Ser Gly Asn Ser Asp Arg Gly 100 105 110 Leu Leu Lys Lys Leu Lys Gly Phe Asp Gly Leu Ala Met Ser Ile Gly 115 120 125 Asn Gly Asn Gly Asp Ser Gly Gly Gly Gly Asn Glu Asn Gly Ile Ser 130 135 140 His Ser Gly Glu Thr Glu Gly Ser Ser Glu Gly Ser Asp Gly Asn Thr 145 150 155 160 Thr Glu Gly Gly Gln Asn Ser Gly Lys Arg Ser Arg Glu Gly Ser Pro 165 170 175 Lys Ala Pro Glu Val Gly Lys Thr Glu Pro Leu Ser Gly Gln Phe Phe 180 185 190 Pro Thr Glu Ala Asn Gly Ala Ser Lys Lys Val Thr Gly Leu Thr Val 195 200 205 Thr Leu Pro Lys Val Ser Gly Lys Leu Gly Ala Ala Val Ser Ala Asn 210 215 220 Leu Thr Ser Asp Leu Glu Ile Lys Asn Ser Pro Thr Thr Ala Ala Lys 225 230 235 240 Leu Ala Ser Ala Thr Val Ala Met Val Pro Gly Glu Ser Leu Leu Gln 245 250 255 Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln Ser Asn Arg Glu 260 265 270 Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala Glu Glu Leu 275 280 285 Ala Ile Lys Val Glu Ser Leu Thr Asn Glu Asn Leu Thr Leu Lys Ser 290 295 300 Glu Ile Asn Arg Leu Ser Asp Asn Ser Glu Lys Leu Lys Leu Gln Asn 305 310 315 320 Ala Lys Leu Ile Glu Lys Leu Lys Asn Ala Arg Gln Asp Thr Glu His 325 330 335 Pro Arg Leu Asp Pro Asn Gly Ser Ser Leu Ser Thr Ala Asn Leu Leu 340 345 350 Ser Arg Val Asp Asn Gly Ser Gly Ala Arg Thr Asp Gly Asp Val Glu 355 360 365 Val Tyr Glu Asn Asn Asn Asn Gln Asn Pro Gly Ala Lys Leu Arg Gln 370 375 380 Leu Leu His Ala Ser Pro Arg Thr Asp Ala Val Ala Ala Gly 385 390 395 411281DNACatharanthus roseus 41atgggaagta gtgaagagac aaagtcgtcg aagcctgaga aatcatcttc tcctgcaccg 60gagcagagta atgttcatgt atatcctgat tgggcagcaa tgcaggcata ttatggtccc 120cgagttgctg taccaccata tttcagctct gctgttgcat ctggtcatcc acctcaccct 180tacatgtggg gaccacctca gcctatgatg ccaccttatg gaacacctta tgctgcaatc 240tatgctcatg gaggtgttta tacccatccc ggggttcctt tgggttcaca tgccaatgct 300catgcggggg ctacatctcc tggtgcaaca gaagctattg ctgctagtcc tttgagcatt 360gatacaccta ccaagtcatc ggcaaatggc agtcaaggtc tgatgaacaa attgagaggc 420tttgatggac ttgcaatgtc aataggcaat 
ggcaacacgg acagtgccga tgggggaact 480gatcatggga tatcacagag tggtgacact gaaggttcaa gtgatggaag caatgggact 540acatccaagg caggtcaaaa gaacaagaaa cgcagccgtg aagggactcc tgctaatgat 600agggagcgca agtccctgac acctagtagt ccatcagctg ctgtcaacac aaatggttct 660tcagagaaag ctatgagggc aagtaaagtt cctgctgctg caactgaaaa ggtgatgggt 720gctgtacttt ctcctaatat gactactgca tcggagctca ggaatccttc tgctgccaat 780gctaagacaa gtccggctaa ggtttcccaa tcctgttctt ctcttccggg tgaaacttgg 840ttgcagaatg agcgagagct taagcgggaa aggaggaaac aatctaatcg tgaatctgca 900aggagatcaa gattgaggaa acaggctgag acggaagaat tagctaagaa agttcagact 960ttgactgctg aaaacatgac tttaaggtcg gaaatcaata aactaactga gaactctgag 1020catctaaggc atgagagtgc gcttttggat aagttgaaaa atgcacgggt catgcaagca 1080ggggagatga ataaatatga tgaattgcat cggcaaccaa ctggtacagc tgaccttctt 1140gcgagagtca acaattctgg ttctactgat aagagcaacg aggagggtgg tggtgatgtg 1200ttcgagaaca gaaactccgg gaccaagctt caccagttgc tcgatgccag ccctagggcg 1260gatgctgttg ccgctggttg a 128142426PRTCatharanthus roseus 42Met Gly Ser Ser Glu Glu Thr Lys Ser Ser Lys Pro Glu Lys Ser Ser 1 5 10 15 Ser Pro Ala Pro Glu Gln Ser Asn Val His Val Tyr Pro Asp Trp Ala 20 25 30 Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Val Pro Pro Tyr Phe 35 40 45 Ser Ser Ala Val Ala Ser Gly His Pro Pro His Pro Tyr Met Trp Gly 50 55 60 Pro Pro Gln Pro Met Met Pro Pro Tyr Gly Thr Pro Tyr Ala Ala Ile 65 70 75 80 Tyr Ala His Gly Gly Val Tyr Thr His Pro Gly Val Pro Leu Gly Ser 85 90 95 His Ala Asn Ala His Ala Gly Ala Thr Ser Pro Gly Ala Thr Glu Ala 100 105 110 Ile Ala Ala Ser Pro Leu Ser Ile Asp Thr Pro Thr Lys Ser Ser Ala 115 120 125 Asn Gly Ser Gln Gly Leu Met Asn Lys Leu Arg Gly Phe Asp Gly Leu 130 135 140 Ala Met Ser Ile Gly Asn Gly Asn Thr Asp Ser Ala Asp Gly Gly Thr 145 150 155 160 Asp His Gly Ile Ser Gln Ser Gly Asp Thr Glu Gly Ser Ser Asp Gly 165 170 175 Ser Asn Gly Thr Thr Ser Lys Ala Gly Gln Lys Asn Lys Lys Arg Ser 180 185 190 Arg Glu Gly Thr Pro Ala Asn Asp Arg Glu Arg Lys Ser Leu Thr Pro 195 200 205 Ser Ser Pro Ser Ala Ala 
Val Asn Thr Asn Gly Ser Ser Glu Lys Ala 210 215 220 Met Arg Ala Ser Lys Val Pro Ala Ala Ala Thr Glu Lys Val Met Gly 225 230 235 240 Ala Val Leu Ser Pro Asn Met Thr Thr Ala Ser Glu Leu Arg Asn Pro 245 250 255 Ser Ala Ala Asn Ala Lys Thr Ser Pro Ala Lys Val Ser Gln Ser Cys 260 265 270 Ser Ser Leu Pro Gly Glu Thr Trp Leu Gln Asn Glu Arg Glu Leu Lys 275 280 285 Arg Glu Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg 290 295 300 Leu Arg Lys Gln Ala Glu Thr Glu Glu Leu Ala Lys Lys Val Gln Thr 305 310 315 320 Leu Thr Ala Glu Asn Met Thr Leu Arg Ser Glu Ile Asn Lys Leu Thr 325 330 335 Glu Asn Ser Glu His Leu Arg His Glu Ser Ala Leu Leu Asp Lys Leu 340 345 350 Lys Asn Ala Arg Val Met Gln Ala Gly Glu Met Asn Lys Tyr Asp Glu 355 360 365 Leu His Arg Gln Pro Thr Gly Thr Ala Asp Leu Leu Ala Arg Val Asn 370 375 380 Asn Ser Gly Ser Thr Asp Lys Ser Asn Glu Glu Gly Gly Gly Asp Val 385 390 395 400 Phe Glu Asn Arg Asn Ser Gly Thr Lys Leu His Gln Leu Leu Asp Ala 405 410 415 Ser Pro Arg Ala Asp Ala Val Ala Ala Gly 420 425 43903DNACatharanthus roseus 43atgtggggag gccagcatcc attgatgccc ccttatggga ctccagttcc atatccagct 60ctatatcctc ctggaggtgt ttacgctcat cctactatgg caacgactcc aggaacaaca 120caagcaaatg ccgaatcaga tgcagtaaag gtctctgaag gaaaggaccg acccacaagc 180aaaaggtccc gaggagcttc agggaaccat ggcttggttg ctgcaaaagt tgcggagagt 240gggaaagcag cttcagagtc tggaaatgat ggtgctactc agagtgctga aagtggaagt 300gaaggttcat cagatggaag tgatgagaat aacaatcatg agctctctgg gaccaaaaaa 360ggaagttttg agcagatgct agctgatgga gcaacagctc agaatagcac tgcaatagca 420aacttcccga actcagttcc tggaaatcca gtagctatgc ctgcgaccaa tttgaacatt 480ggaatggact tgtggaatgc ttcctctgct gctcctggag ccatgaaaat gcgtccaagt 540catggtgtcc catctgctgt agctccgggc atggtcaatg accaatggat tcaagatgaa 600agagaattga aaagacaaaa gcgaaaacaa tctaatcggg aatcagctag gagatcaaga 660ttacgcaaac aggctgagtg cgaggaactg caacagagag tagagacatt gagcaatgaa 720aatcgtgcat tgcgagatga gctacagagg ctttctgagg aatgtgagaa gcttacatca 780gaaaataact ccattaagga cgagctaacg agggtatgcg 
gtcctgaggc agtatcgaaa 840ctagagagca gtagcataac caaacaacaa cttcagtccc gcggtaatga acatgaaagt 900taa 90344300PRTCatharanthus roseus 44Met Trp Gly Gly Gln His Pro Leu Met Pro Pro Tyr Gly Thr Pro Val 1 5 10 15 Pro Tyr Pro Ala Leu Tyr Pro Pro Gly Gly Val Tyr Ala His Pro Thr 20 25 30 Met Ala Thr Thr Pro Gly Thr Thr Gln Ala Asn Ala Glu Ser Asp Ala 35 40 45 Val Lys Val Ser Glu Gly Lys Asp Arg Pro Thr Ser Lys Arg Ser Arg 50 55 60 Gly Ala Ser Gly Asn His Gly Leu Val Ala Ala Lys Val Ala Glu Ser 65 70 75 80 Gly Lys Ala Ala Ser Glu Ser Gly Asn Asp Gly Ala Thr Gln Ser Ala 85 90 95 Glu Ser Gly Ser Glu Gly Ser Ser Asp Gly Ser Asp Glu Asn Asn Asn 100 105 110 His Glu Leu Ser Gly Thr Lys Lys Gly Ser Phe Glu Gln Met Leu Ala 115 120 125 Asp Gly Ala Thr Ala Gln Asn Ser Thr Ala Ile Ala Asn Phe Pro Asn 130 135 140 Ser Val Pro Gly Asn Pro Val Ala Met Pro Ala Thr Asn Leu Asn Ile 145 150 155 160 Gly Met Asp Leu Trp Asn Ala Ser Ser Ala Ala Pro Gly Ala Met Lys 165 170 175 Met Arg Pro Ser His Gly Val Pro Ser Ala Val Ala Pro Gly Met Val 180 185 190 Asn Asp Gln Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg Gln Lys Arg 195 200 205 Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln 210 215 220 Ala Glu Cys Glu Glu Leu Gln Gln Arg Val Glu Thr Leu Ser Asn Glu 225 230 235 240 Asn Arg Ala Leu Arg Asp Glu Leu Gln Arg Leu Ser Glu Glu Cys Glu 245 250 255 Lys Leu Thr Ser Glu Asn Asn Ser Ile Lys Asp Glu Leu Thr Arg Val 260 265 270 Cys Gly Pro Glu Ala Val Ser Lys Leu Glu Ser Ser Ser Ile Thr Lys 275 280 285 Gln Gln Leu Gln Ser Arg Gly Asn Glu His Glu Ser 290 295 300 451041DNACitrus sinensismisc_feature(1027)..(1027)n is a, c, g, or t 45atggggaaca atgaagatgg aaagtccttc aagtctgaaa aaccatcttc acctccacct 60tcggatcaag gcaatattca tatgtatact gattgggcag ctatgcaggc ttattatggc 120ccccgagttg ctattccgcc atattacaac tcacccattg catctggtca tgctcctcaa 180ccctacatgt ggggcccagc ccagcctatg atgccaccat atggagcgcc ttatgcagcc 240atctattcta ctggaggtgt ttatgcacat cctgctgttc ctttggctgt aaccccattg 300aacacagagg cacctactaa 
gtcgtcagga aatgcagatc gaggtttagc aaagaagctg 360aaagggttag atggcctggc aatgtcaata ggcaatgcta gtgctgagag tgctgagggt 420ggagcagaac aaaggccgtc acagagtgag gccgacggtt ctactgatgg aagtgatggg 480aatacagtta gggcaggtca atctagaaag aaaagaagcc gagagggaac gccaattgct 540ggaaaacccg ttggtcctgt gctttctcct ggcatgccta caaaattgga gctcaggaat 600gcacctggca tgaacgttaa ggcaagtcca accagtgttc cacagccttg tgcagtttta 660cctcctgaaa cctggattca gaatgaacgg gagctgaaac gggaaaggag gaaacaatct 720aatcgagaat ctgctagaag gtctaggttg aggaagcagg ctgaggctga agaactttct 780cgtaaagttg attccttgat tgatgagaat gcttccctca agtctgaaat aaatcaatta 840tcagagaatt ctgagaaact gaggcaagaa aacgcagcat tactggaaaa actgaagagt 900gcacaactgg gaaacaagca agagattgtt ttgaacgagg acaagagggt tacacctgtt 960agcacagaaa acctattatc tagagttaca actccggtac tgttgataga aacatggagg 1020aaaggangtc accctgtttg a 104146346PRTCitrus sinensismisc_feature(343)..(343)Xaa can be any naturally occurring amino acid 46Met Gly Asn Asn Glu Asp Gly Lys Ser Phe Lys Ser Glu Lys Pro Ser 1 5 10 15 Ser Pro Pro Pro Ser Asp Gln Gly Asn Ile His Met Tyr Thr Asp Trp 20 25 30 Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Ile Pro Pro Tyr 35 40 45 Tyr Asn Ser Pro Ile Ala Ser Gly His Ala Pro Gln Pro Tyr Met Trp 50 55 60 Gly Pro Ala Gln Pro Met Met Pro Pro Tyr Gly Ala Pro Tyr Ala Ala 65 70 75 80 Ile Tyr Ser Thr Gly Gly Val Tyr Ala His Pro Ala Val Pro Leu Ala 85 90 95 Val Thr Pro Leu Asn Thr Glu Ala Pro Thr Lys Ser Ser Gly Asn Ala 100 105 110 Asp Arg Gly Leu Ala Lys Lys Leu Lys Gly Leu Asp Gly Leu Ala Met 115 120 125 Ser Ile Gly Asn Ala Ser Ala Glu Ser Ala Glu Gly Gly Ala Glu Gln 130 135 140 Arg Pro Ser Gln Ser Glu Ala Asp Gly Ser Thr Asp Gly Ser Asp Gly 145 150 155 160 Asn Thr Val Arg Ala Gly Gln Ser Arg Lys Lys Arg Ser Arg Glu Gly 165 170 175 Thr Pro Ile Ala Gly Lys Pro Val Gly Pro Val Leu Ser Pro Gly Met 180 185 190 Pro Thr Lys Leu Glu Leu Arg Asn Ala Pro Gly Met Asn Val Lys Ala 195 200 205 Ser Pro Thr Ser Val Pro Gln Pro Cys Ala Val Leu Pro Pro Glu Thr 210 215 220 Trp Ile Gln Asn 
Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln Ser 225 230 235 240 Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala 245 250 255 Glu Glu Leu Ser Arg Lys Val Asp Ser Leu Ile Asp Glu Asn Ala Ser 260 265 270 Leu Lys Ser Glu Ile Asn Gln Leu Ser Glu Asn Ser Glu Lys Leu Arg 275 280 285 Gln Glu Asn Ala Ala Leu Leu Glu Lys Leu Lys Ser Ala Gln Leu Gly 290 295 300 Asn Lys Gln Glu Ile Val Leu Asn Glu Asp Lys Arg Val Thr Pro Val 305 310 315 320 Ser Thr Glu Asn Leu Leu Ser Arg Val Thr Thr Pro Val Leu Leu Ile 325 330 335 Glu Thr Trp Arg Lys Gly Xaa His Pro Val 340 345 47516DNACitrus sinensis 47atggggacag gggaagagaa cacttctgct aagactgcca aaacagcttc ttcaactcag 60gagataccaa ccacaccctc gtacgctgat tggtccagct ctatgcaggc tttctatggt 120gctggggcta cgccacctcc attttttgct tccaccgttg cttctccaac tcctcatccc 180tatctgtggg gaagccagca tcctttaatg ccaccatatg gcaccccagt tccataccaa 240gctatatatc ctccaggggg agtatatgca catcctagca tggctacgac tccaacagca 300gcaccaacaa atacagagcc ggaagggaag ggacctgaag caaaggaccg ggcttcagct 360aaaaaatcca agggaactcc aggaggtaag gctggagaga ttgtaaaggc aacttctggt 420tctgggaatg acggtgtctc tcaaagtgct gaaagtggta gtgacggttc atctgatgcg 480agtgatgaga atggtaaccg agcaggagtt tgctag 51648171PRTCitrus sinensis 48Met Gly Thr Gly Glu Glu Asn Thr Ser Ala Lys Thr Ala Lys Thr Ala 1 5 10 15 Ser Ser Thr Gln Glu Ile Pro Thr Thr Pro Ser Tyr Ala Asp Trp Ser 20 25 30 Ser Ser Met Gln Ala Phe Tyr Gly Ala Gly Ala Thr Pro Pro Pro Phe 35 40 45 Phe Ala Ser Thr Val Ala Ser Pro Thr Pro His Pro Tyr Leu Trp Gly 50 55 60 Ser Gln His Pro Leu Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Gln 65 70 75 80 Ala Ile Tyr Pro Pro Gly Gly Val Tyr Ala His Pro Ser Met Ala Thr 85 90 95 Thr Pro Thr Ala Ala Pro Thr Asn Thr Glu Pro Glu Gly Lys Gly Pro 100 105 110 Glu Ala Lys Asp Arg Ala Ser Ala Lys Lys Ser Lys Gly Thr Pro Gly 115 120 125 Gly Lys Ala Gly Glu Ile Val Lys Ala Thr Ser Gly Ser Gly Asn Asp 130 135 140 Gly Val Ser Gln Ser Ala Glu Ser Gly Ser Asp Gly Ser Ser Asp Ala 145 150 155 160 Ser Asp Glu Asn Gly Asn Arg Ala 
Gly Val Cys 165 170 491002DNACarthamus tinctorius 49atgccgattg gtcaaactca atgcaggctt attatggtgc tggaggcact ccaccttttt 60tttgcctcaa ctgttgcttc tccgactcct catccctaca tatggggagg ccagcatcct 120atgatgtcac catatgggac tccagttcca taccctgctc tatatccacc agcaggagtt 180tacgctcatc ctagtatgcc tatgacccca agtaccgcac caccaaatgc agaaatggaa 240gtgaaggcct atgaaggcaa ggaaagggct gcaaataaaa agtccaaggg aacttcagga 300aatggcaatg ctgctgttgt tagaactgga gagagtggga aggcggcatc aagttcaggg 360aatgatggtg ccacccagag cgctgaaagt ggaagtgatg gctcatcaga tggaagtgaa 420gaaaatgacc aacatgaata ctctggaggc aagaaaggaa gttttaatca gatgcttgca 480gatgccaatg cacagaataa caattctggg ccaaatattc agacgtcagt acctgggaac 540cctgtggtgt ctatacctgg taccaatctt aatatgggga tggacttgtg gaatccatct 600accggaagtg gaaccatgaa aattcgatca aatccttctg gtgtggctcg agcagcagtg 660ccaccaccaa tgataggacg ggaaggaatg atgcctgatc agtgggttca ggatgagcgt
720gaactgaaga gacaaaagag gaagcagtct aaccgagagt cggctaggag atcaaggttg 780cgcaagcagg cggagtgtga agagccacag gcaagagtag aggcactaag caacgagaat 840cattcactca gagatgaact gcaaaggcta tcggaggaat gcgagaagct tacttctgaa 900aataattcga taaaggatga cttaactagg ttttgtgggc ccgaggcagt atcaaagcta 960gatgcacatc ttcaatctcg ggtggacgaa agtaacagct ga 100250333PRTCarthamus tinctorius 50Met Pro Ile Gly Gln Thr Gln Cys Arg Leu Ile Met Val Leu Glu Ala 1 5 10 15 Leu His Leu Phe Phe Ala Ser Thr Val Ala Ser Pro Thr Pro His Pro 20 25 30 Tyr Ile Trp Gly Gly Gln His Pro Met Met Ser Pro Tyr Gly Thr Pro 35 40 45 Val Pro Tyr Pro Ala Leu Tyr Pro Pro Ala Gly Val Tyr Ala His Pro 50 55 60 Ser Met Pro Met Thr Pro Ser Thr Ala Pro Pro Asn Ala Glu Met Glu 65 70 75 80 Val Lys Ala Tyr Glu Gly Lys Glu Arg Ala Ala Asn Lys Lys Ser Lys 85 90 95 Gly Thr Ser Gly Asn Gly Asn Ala Ala Val Val Arg Thr Gly Glu Ser 100 105 110 Gly Lys Ala Ala Ser Ser Ser Gly Asn Asp Gly Ala Thr Gln Ser Ala 115 120 125 Glu Ser Gly Ser Asp Gly Ser Ser Asp Gly Ser Glu Glu Asn Asp Gln 130 135 140 His Glu Tyr Ser Gly Gly Lys Lys Gly Ser Phe Asn Gln Met Leu Ala 145 150 155 160 Asp Ala Asn Ala Gln Asn Asn Asn Ser Gly Pro Asn Ile Gln Thr Ser 165 170 175 Val Pro Gly Asn Pro Val Val Ser Ile Pro Gly Thr Asn Leu Asn Met 180 185 190 Gly Met Asp Leu Trp Asn Pro Ser Thr Gly Ser Gly Thr Met Lys Ile 195 200 205 Arg Ser Asn Pro Ser Gly Val Ala Arg Ala Ala Val Pro Pro Pro Met 210 215 220 Ile Gly Arg Glu Gly Met Met Pro Asp Gln Trp Val Gln Asp Glu Arg 225 230 235 240 Glu Leu Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 245 250 255 Arg Ser Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Pro Gln Ala Arg 260 265 270 Val Glu Ala Leu Ser Asn Glu Asn His Ser Leu Arg Asp Glu Leu Gln 275 280 285 Arg Leu Ser Glu Glu Cys Glu Lys Leu Thr Ser Glu Asn Asn Ser Ile 290 295 300 Lys Asp Asp Leu Thr Arg Phe Cys Gly Pro Glu Ala Val Ser Lys Leu 305 310 315 320 Asp Ala His Leu Gln Ser Arg Val Asp Glu Ser Asn Ser 325 330 51728DNAEuphorbia esulamisc_feature(631)..(632)n is a, c, g, 
or t 51atggggacag gggaagaaag cacgcctact aagacgtcta aaccagcgcc ttcaactcag 60gttcctgaaa ttccaacaac gcccgtgtat ccagattggt ccaattctat gcaggcttat 120tatggtgctg gagctactcc accgcatttt ttcgcatcaa cagttccatc tccaactccc 180cacccttatc tctggggagg tcagcacccc atgatgccac cgccctacgg gactcccgtt 240ccatatcctg ctttatatcc tgctggggga gtatattccc atcctactat gaccacgaca 300ccaaactctg caccggtaaa tgcagaattt gaaggaaaag gtcctgatgg aaaagaccgt 360gcttctgcca aaaaatctaa gggagcttca gctggcaagg gaggagagac cggaaaggca 420acctcaggtt ccggaaacga tggtgcctcc cagagcggtg aaagtggtag cgatggatcc 480tcagatggaa gtgatgagaa ctaacaggaa tatggggcga ataagaaagg aagttttgat 540cagatgcttg cggatgccaa tgctcaaaat aatggtatcc agggttcagt tccagggaag 600ccggttgcgt ccatggctgg agctaatctt nntnttggaa tggatttgtg gaatccttct 660gctgctgctc cggggactgc taaaattaga ccaaatgcat ccggtgctcc atcaggaatt 720actcctgc 72852242PRTEuphorbia esulamisc_feature(210)..(211)Xaa can be any naturally occurring amino acid 52Met Gly Thr Gly Glu Glu Ser Thr Pro Thr Lys Thr Ser Lys Pro Ala 1 5 10 15 Pro Ser Thr Gln Val Pro Glu Ile Pro Thr Thr Pro Val Tyr Pro Asp 20 25 30 Trp Ser Asn Ser Met Gln Ala Tyr Tyr Gly Ala Gly Ala Thr Pro Pro 35 40 45 His Phe Phe Ala Ser Thr Val Pro Ser Pro Thr Pro His Pro Tyr Leu 50 55 60 Trp Gly Gly Gln His Pro Met Met Pro Pro Pro Tyr Gly Thr Pro Val 65 70 75 80 Pro Tyr Pro Ala Leu Tyr Pro Ala Gly Gly Val Tyr Ser His Pro Thr 85 90 95 Met Thr Thr Thr Pro Asn Ser Ala Pro Val Asn Ala Glu Phe Glu Gly 100 105 110 Lys Gly Pro Asp Gly Lys Asp Arg Ala Ser Ala Lys Lys Ser Lys Gly 115 120 125 Ala Ser Ala Gly Lys Gly Gly Glu Thr Gly Lys Ala Thr Ser Gly Ser 130 135 140 Gly Asn Asp Gly Ala Ser Gln Ser Gly Glu Ser Gly Ser Asp Gly Ser 145 150 155 160 Ser Asp Gly Ser Asp Glu Asn Gln Glu Tyr Gly Ala Asn Lys Lys Gly 165 170 175 Ser Phe Asp Gln Met Leu Ala Asp Ala Asn Ala Gln Asn Asn Gly Ile 180 185 190 Gln Gly Ser Val Pro Gly Lys Pro Val Ala Ser Met Ala Gly Ala Asn 195 200 205 Leu Xaa Xaa Gly Met Asp Leu Trp Asn Pro Ser Ala Ala Ala Pro Gly 210 215 220 Thr 
Ala Lys Ile Arg Pro Asn Ala Ser Gly Ala Pro Ser Gly Ile Thr 225 230 235 240 Pro Ala 53828DNAFragaria vesca 53atgatgccgc cttatggaac tcctgttccg taccctgcca tatatcctcc aggtggagta 60tatgctcatc cgggtatggt cacgactcct gcctcggtac cgccaacaaa tccagagtcg 120gaagggaaga gcacagatgg gaaagagcga gcttcagcca aaaaacctaa gggagctgca 180ggattggtta gtgggaaggc tggggatggt ggaaaagcaa cttctggttc cggaaatgat 240ggtgcgtcac aaagtgctga aagtggtagc gagggttcat cagatggaag tgaggagaat 300ggtaaccacc aggagtatgg tgcaaacaag aagggaagct ttgacaagat gcttgcagat 360ggagcgaatg cacaaaataa cacgggttca gtgcctggga agcctgtagt ttctatgcct 420gcaacaagtc tgaatatggg aatggacttg tggaatccat cccctgctgg tgccggaact 480gcaaaaatga gaggaaatca atctggagcc ccatcagctg tcggtggtga ccattggatt 540caggatgaac gggaactgaa aagacagaaa aggaagcagt caaataggga gtctgctagg 600aggtcaagat tgaggaaaca ggcggagtgt gaagagctac aaaagagtgt acatggactg 660acaaatgaga atcacggcct taaagatgag ctgcagagac tctcccagga gtgcgagaag 720cttgcgtctg aaaatacttc tataaaggaa gagttgacac gattgtgtgg accagattta 780gtagcgaaca ttgaacatca atctcatggt ggtgaaggta acagttga 82854275PRTFragaria vesca 54Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Ile Tyr Pro 1 5 10 15 Pro Gly Gly Val Tyr Ala His Pro Gly Met Val Thr Thr Pro Ala Ser 20 25 30 Val Pro Pro Thr Asn Pro Glu Ser Glu Gly Lys Ser Thr Asp Gly Lys 35 40 45 Glu Arg Ala Ser Ala Lys Lys Pro Lys Gly Ala Ala Gly Leu Val Ser 50 55 60 Gly Lys Ala Gly Asp Gly Gly Lys Ala Thr Ser Gly Ser Gly Asn Asp 65 70 75 80 Gly Ala Ser Gln Ser Ala Glu Ser Gly Ser Glu Gly Ser Ser Asp Gly 85 90 95 Ser Glu Glu Asn Gly Asn His Gln Glu Tyr Gly Ala Asn Lys Lys Gly 100 105 110 Ser Phe Asp Lys Met Leu Ala Asp Gly Ala Asn Ala Gln Asn Asn Thr 115 120 125 Gly Ser Val Pro Gly Lys Pro Val Val Ser Met Pro Ala Thr Ser Leu 130 135 140 Asn Met Gly Met Asp Leu Trp Asn Pro Ser Pro Ala Gly Ala Gly Thr 145 150 155 160 Ala Lys Met Arg Gly Asn Gln Ser Gly Ala Pro Ser Ala Val Gly Gly 165 170 175 Asp His Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg Gln Lys Arg Lys 180 185 190 Gln Ser Asn 
Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala 195 200 205 Glu Cys Glu Glu Leu Gln Lys Ser Val His Gly Leu Thr Asn Glu Asn 210 215 220 His Gly Leu Lys Asp Glu Leu Gln Arg Leu Ser Gln Glu Cys Glu Lys 225 230 235 240 Leu Ala Ser Glu Asn Thr Ser Ile Lys Glu Glu Leu Thr Arg Leu Cys 245 250 255 Gly Pro Asp Leu Val Ala Asn Ile Glu His Gln Ser His Gly Gly Glu 260 265 270 Gly Asn Ser 275 55666DNAGossypium hirsutum 55atgggaacgg aagaggagag cacaccagcc aagccttcca aacctactgc ctcatcccag 60gaaatgccaa cagtgtcata tcctgattgg tcgacccaaa tgcaggctta ttatggtgct 120gcagctactc ctccattatt tgcctcaaac gttgcttcac caaccccgca tccatacata 180tggggaggcc agcatcctct aatgcctcca tatggtaccc cggttccgta cccagctgta 240tatcctccaa ggggagtgta tgcgcatcct aatatggccc caatgccaag ttctgcacgg 300aataatggtg ctgatggaaa ggatcggggt gtgaccaaaa agcccaaggg atcttcggga 360agcaaagttg gagagagtgc aaaggccact tcaggctcgg gaaacgatgg cggctctcaa 420agtggtgaaa gtggcagcga gggtacatca gacagaagtg atgagagtaa tcaacaagaa 480gtcaatgctg gcaaaaaggg aagctttgag cagatgcttg cagatgccaa tgcacagggt 540aaagctgctg gggctttagt tcccgcagaa cccatagtct ctatgcctgg caactacttt 600gaatatagga atggacctat ggagtgcttc ccctgctgcc acaggagctc caaaaaccag 660acctaa 66656221PRTGossypium hirsutum 56Met Gly Thr Glu Glu Glu Ser Thr Pro Ala Lys Pro Ser Lys Pro Thr 1 5 10 15 Ala Ser Ser Gln Glu Met Pro Thr Val Ser Tyr Pro Asp Trp Ser Thr 20 25 30 Gln Met Gln Ala Tyr Tyr Gly Ala Ala Ala Thr Pro Pro Leu Phe Ala 35 40 45 Ser Asn Val Ala Ser Pro Thr Pro His Pro Tyr Ile Trp Gly Gly Gln 50 55 60 His Pro Leu Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Val 65 70 75 80 Tyr Pro Pro Arg Gly Val Tyr Ala His Pro Asn Met Ala Pro Met Pro 85 90 95 Ser Ser Ala Arg Asn Asn Gly Ala Asp Gly Lys Asp Arg Gly Val Thr 100 105 110 Lys Lys Pro Lys Gly Ser Ser Gly Ser Lys Val Gly Glu Ser Ala Lys 115 120 125 Ala Thr Ser Gly Ser Gly Asn Asp Gly Gly Ser Gln Ser Gly Glu Ser 130 135 140 Gly Ser Glu Gly Thr Ser Asp Arg Ser Asp Glu Ser Asn Gln Gln Glu 145 150 155 160 Val Asn Ala Gly Lys Lys Gly Ser Phe 
Glu Gln Met Leu Ala Asp Ala 165 170 175 Asn Ala Gln Gly Lys Ala Ala Gly Ala Leu Val Pro Ala Glu Pro Ile 180 185 190 Val Ser Met Pro Gly Asn Tyr Phe Glu Tyr Arg Asn Gly Pro Met Glu 195 200 205 Cys Phe Pro Cys Cys His Arg Ser Ser Lys Asn Gln Thr 210 215 220 571050DNAGossypium hirsutum 57atgcaggcat attatggtcc tcatgtcaat atgccaccgt attacagttc agctgtggca 60tcaggccatg ctcctccccc ctatatgtgg ggtccaacac agcctatgat gccatcctat 120ggagcacctt atgcagcaat ctactctcat gggggagttt atgcacatcc cgcagttcct 180ctggcatcac acagtcttgg tgttccatca tcaccggcag ctgcaggtcc tgtggagaca 240cctacgaagt cccctggaaa tactgaacaa ggtttaatga agaagctgaa aggatttgat 300ggtcttgcaa tatcaatagg caatggtact gctgagaatg ctgaaggaag agctaaacct 360agaccatccc acagtgtgga gactgcaggt tcagctgatg gtagtgatgg aaatacaact 420gggacggatc aaagtagacg gaaaagaagc agggagggga caccaactat tgcaggcgaa 480gatgagaaaa ttgaggcaaa gtctaaccaa gtcgctgcgg gggaggtgac tgcaaccatt 540tctcctaaac taattggaac tgtagtttct cctggcatga ccacaggaac aatattggag 600cttaggaaca cccccaccat gaatgctatg tccagtgcta tgggtgtaca ttgtggagta 660atgcctactg aagtctggtt gcagagtgag cgggagctga aacgggagag gcgaaaacaa 720tctaatagag aatccgctag aaggtcaagg ctgaggaagc aggctgagac tgaagagctt 780gcccgtaaag ttgaatcctt aacttcagag aatgcagcac tcagatctga aataaaccaa 840ttaactgaaa tgtctgaaaa agtaaggctc gaaaatgcga tattagtgga ggaactgaaa 900aatgctcaac ttggacacgc acaggagaat attttgaaca aaaaggaaga caaggagggt 960gaaatgggtg agaaaaggtc agactccggt gccaagctgc atcaactctt ggatccgagt 1020cctagagacg atgcagtggc tgccggctga 105058349PRTGossypium hirsutum 58Met Gln Ala Tyr Tyr Gly Pro His Val Asn Met Pro Pro Tyr Tyr Ser 1 5 10 15 Ser Ala Val Ala Ser Gly His Ala Pro Pro Pro Tyr Met Trp Gly Pro 20 25 30 Thr Gln Pro Met Met Pro Ser Tyr Gly Ala Pro Tyr Ala Ala Ile Tyr 35 40 45 Ser His Gly Gly Val Tyr Ala His Pro Ala Val Pro Leu Ala Ser His 50 55 60 Ser Leu Gly Val Pro Ser Ser Pro Ala Ala Ala Gly Pro Val Glu Thr 65 70 75 80 Pro Thr Lys Ser Pro Gly Asn Thr Glu Gln Gly Leu Met Lys Lys Leu 85 90 95 Lys Gly Phe Asp Gly Leu Ala Ile 
Ser Ile Gly Asn Gly Thr Ala Glu 100 105 110 Asn Ala Glu Gly Arg Ala Lys Pro Arg Pro Ser His Ser Val Glu Thr 115 120 125 Ala Gly Ser Ala Asp Gly Ser Asp Gly Asn Thr Thr Gly Thr Asp Gln 130 135 140 Ser Arg Arg Lys Arg Ser Arg Glu Gly Thr Pro Thr Ile Ala Gly Glu 145 150 155 160 Asp Glu Lys Ile Glu Ala Lys Ser Asn Gln Val Ala Ala Gly Glu Val 165 170 175 Thr Ala Thr Ile Ser Pro Lys Leu Ile Gly Thr Val Val Ser Pro Gly 180 185 190 Met Thr Thr Gly Thr Ile Leu Glu Leu Arg Asn Thr Pro Thr Met Asn 195 200 205 Ala Met Ser Ser Ala Met Gly Val His Cys Gly Val Met Pro Thr Glu 210 215 220 Val Trp Leu Gln Ser Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln 225 230 235 240 Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu 245 250 255 Thr Glu Glu Leu Ala Arg Lys Val Glu Ser Leu Thr Ser Glu Asn Ala 260 265 270 Ala Leu Arg Ser Glu Ile Asn Gln Leu Thr Glu Met Ser Glu Lys Val 275 280 285 Arg Leu Glu Asn Ala Ile Leu Val Glu Glu Leu Lys Asn Ala Gln Leu 290 295 300 Gly His Ala Gln Glu Asn Ile Leu Asn Lys Lys Glu Asp Lys Glu Gly 305 310 315 320 Glu Met Gly Glu Lys Arg Ser Asp Ser Gly Ala Lys Leu His Gln Leu 325 330 335 Leu Asp Pro Ser Pro Arg Asp Asp Ala Val Ala Ala Gly 340 345 591125DNAGlycine max 59atgccaccat actacaactc agctgttgct tctggtcacg ctcctcaccc gtacatgtgg 60gggccaccac agcctatgat gccaccttat gggcctcctt atgcagcaat ttatccacat 120ggaggggttt atactcaccc tgcagttcct attgggccac ttacacatag tcaaggagtt 180ccgtcttcac ctgctgctgg gactcctttg agcatagaga caccacccaa atcatctgga 240aatactgatc agggtttaat gaagaaattg aaagagtttg atggacttgc aatgtcaatt 300ggcaatggcc atgctgaaag tgcagagcgt ggaggtgaaa acaggctctc acagagtgtg 360gatactgagg gttccagtga tggaagtgat ggcaacactt caggggctaa tcaatcaaga 420aggaaaagaa gccgtgaggg aacaccaacc actgatggag aagggaaaac tgagatacaa 480ggcagtccaa tttccaaaga gactgcagct tctaataaga tgttgggagt tgtccctgcc 540agtgttgcag gaacaatagt tggacatgta gtttcttcag gtatgaccac tgcactggag 600ctgagaaatc cttccagtgt tcattctaaa acaagtgccc cacaaccttg tccagtattg 660cctgcagaag cttgggtaca gaatgagcgt 
gagctgaaac gggagaggcg gaaacagtca 720aatcgagaat ctgctagaag gtccagacta aggaagcagg ctgaaactga agaactggca 780cgaaaagttg aatccttgaa tgctgagaat gcaacactga aatcagaaat taatcgactg 840actgaaagtt ctgaaaaaat gagggtggaa aatgctacat taaggggaaa acttaaaaat 900gctcaactgg gacaaaccca agagataact ttgaagataa ttgacagcca gagggctaca 960cctgtaagta cagaaaactt attatcaaga gttaataatt ccggttctaa tgatagaact 1020gtggaggatg agaatggttt ttgcgaaaat aaaccaaact ctggtgcaaa gctgcatcaa 1080ctgctggaca caagtcctag agctgatgct gtggcagctg gttga 112560374PRTGlycine max 60Met Pro Pro Tyr Tyr Asn Ser Ala Val Ala Ser Gly His Ala Pro His 1 5 10 15 Pro Tyr Met Trp Gly Pro Pro Gln Pro Met Met Pro Pro Tyr Gly Pro 20 25 30 Pro Tyr Ala Ala Ile Tyr Pro His Gly Gly Val Tyr Thr His Pro Ala 35 40 45 Val Pro Ile Gly Pro Leu Thr His Ser Gln Gly Val Pro Ser Ser Pro 50 55 60 Ala Ala Gly Thr Pro Leu Ser Ile Glu Thr Pro Pro Lys Ser Ser Gly 65 70 75
80 Asn Thr Asp Gln Gly Leu Met Lys Lys Leu Lys Glu Phe Asp Gly Leu 85 90 95 Ala Met Ser Ile Gly Asn Gly His Ala Glu Ser Ala Glu Arg Gly Gly 100 105 110 Glu Asn Arg Leu Ser Gln Ser Val Asp Thr Glu Gly Ser Ser Asp Gly 115 120 125 Ser Asp Gly Asn Thr Ser Gly Ala Asn Gln Ser Arg Arg Lys Arg Ser 130 135 140 Arg Glu Gly Thr Pro Thr Thr Asp Gly Glu Gly Lys Thr Glu Ile Gln 145 150 155 160 Gly Ser Pro Ile Ser Lys Glu Thr Ala Ala Ser Asn Lys Met Leu Gly 165 170 175 Val Val Pro Ala Ser Val Ala Gly Thr Ile Val Gly His Val Val Ser 180 185 190 Ser Gly Met Thr Thr Ala Leu Glu Leu Arg Asn Pro Ser Ser Val His 195 200 205 Ser Lys Thr Ser Ala Pro Gln Pro Cys Pro Val Leu Pro Ala Glu Ala 210 215 220 Trp Val Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln Ser 225 230 235 240 Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Thr 245 250 255 Glu Glu Leu Ala Arg Lys Val Glu Ser Leu Asn Ala Glu Asn Ala Thr 260 265 270 Leu Lys Ser Glu Ile Asn Arg Leu Thr Glu Ser Ser Glu Lys Met Arg 275 280 285 Val Glu Asn Ala Thr Leu Arg Gly Lys Leu Lys Asn Ala Gln Leu Gly 290 295 300 Gln Thr Gln Glu Ile Thr Leu Lys Ile Ile Asp Ser Gln Arg Ala Thr 305 310 315 320 Pro Val Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Ser Gly Ser 325 330 335 Asn Asp Arg Thr Val Glu Asp Glu Asn Gly Phe Cys Glu Asn Lys Pro 340 345 350 Asn Ser Gly Ala Lys Leu His Gln Leu Leu Asp Thr Ser Pro Arg Ala 355 360 365 Asp Ala Val Ala Ala Gly 370 611275DNAGlycine max 61atgggaaaca gtgaggatga gaaatctgtt aagactggaa gcccttcttc ttcacctgca 60acaactgatc agaccaacca acctaatatt catgtctatc ctgattgggc tgccatgcag 120tattatgggc caagagtcaa cattccacca tacttcaact cagctgtggc ttcaggtcat 180gctccacacc catacatgtg gggaccacca cagcctatga tgccacctta tgggccacct 240tatgcagcat tttattcgca tggaggggtt tatactcacc ctgcagttgc tattgggcca 300cacttacatg gtcaaggagt ttcatcttca cctgctgttg ggactcattc aagcatagaa 360tcaccaacca aattatctgg aaatactgat cagggtttaa tgaagaaatc aaaagggttt 420gatgggcttg caatgtcaat aggcaattgc aatgctgaga gtgctgagca tggagctgag 480aacaggcagt 
cacagagtgt ggatactgag ggttacagcg acggaagtga tggcaacact 540gcaggggcta atcaaacaaa aaggaaaaga tgccgagagg gaacactgac cactgatgga 600gaagggaaaa ctgagctaca aaatggtccg gcttccaaag agacttcatc ttccaaaaag 660attgtgtcag ctactccagc tagtgttgcc ggaacattag ttggacctgt agtttcttca 720gttatggcca caacactgga actgaggaac ccttcgactg ttgattctaa ggcaaattcc 780acaagtgccc cacaaccttg tgcaattgtg cctaatgaaa cttgcttaca gaatgagcgt 840gagctgaaac gggagaggag aaaacaatct aaccgtgaat ctgctagaag gtccaggctg 900aggaagcagg ccgagactga agaattggca cgaaaagttg atatgttaac tgctgagaat 960gtgtccctga agtcagaaat aattcaactg actgaaggtt ctgagcagat gaggatggaa 1020aattctgcat tgagggaaaa actgagaaat actcaactgg gacaaaggga agagataatt 1080ttgagtagca tcgagagtaa gagggctgca cctgtaagta cagaaaactt gttatcaaga 1140gttaataatt ctagttctaa tgacagaact acagagaatg agaatgattt ctgtgagaac 1200aaaccaaatt ctggtgcaaa gctgcatcaa ctattggata caaatcctag agcagatgct 1260gtggcagctg gttga 127562424PRTGlycine max 62Met Gly Asn Ser Glu Asp Glu Lys Ser Val Lys Thr Gly Ser Pro Ser 1 5 10 15 Ser Ser Pro Ala Thr Thr Asp Gln Thr Asn Gln Pro Asn Ile His Val 20 25 30 Tyr Pro Asp Trp Ala Ala Met Gln Tyr Tyr Gly Pro Arg Val Asn Ile 35 40 45 Pro Pro Tyr Phe Asn Ser Ala Val Ala Ser Gly His Ala Pro His Pro 50 55 60 Tyr Met Trp Gly Pro Pro Gln Pro Met Met Pro Pro Tyr Gly Pro Pro 65 70 75 80 Tyr Ala Ala Phe Tyr Ser His Gly Gly Val Tyr Thr His Pro Ala Val 85 90 95 Ala Ile Gly Pro His Leu His Gly Gln Gly Val Ser Ser Ser Pro Ala 100 105 110 Val Gly Thr His Ser Ser Ile Glu Ser Pro Thr Lys Leu Ser Gly Asn 115 120 125 Thr Asp Gln Gly Leu Met Lys Lys Ser Lys Gly Phe Asp Gly Leu Ala 130 135 140 Met Ser Ile Gly Asn Cys Asn Ala Glu Ser Ala Glu His Gly Ala Glu 145 150 155 160 Asn Arg Gln Ser Gln Ser Val Asp Thr Glu Gly Tyr Ser Asp Gly Ser 165 170 175 Asp Gly Asn Thr Ala Gly Ala Asn Gln Thr Lys Arg Lys Arg Cys Arg 180 185 190 Glu Gly Thr Leu Thr Thr Asp Gly Glu Gly Lys Thr Glu Leu Gln Asn 195 200 205 Gly Pro Ala Ser Lys Glu Thr Ser Ser Ser Lys Lys Ile Val Ser Ala 210 215 220 Thr Pro Ala 
Ser Val Ala Gly Thr Leu Val Gly Pro Val Val Ser Ser 225 230 235 240 Val Met Ala Thr Thr Leu Glu Leu Arg Asn Pro Ser Thr Val Asp Ser 245 250 255 Lys Ala Asn Ser Thr Ser Ala Pro Gln Pro Cys Ala Ile Val Pro Asn 260 265 270 Glu Thr Cys Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys 275 280 285 Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala 290 295 300 Glu Thr Glu Glu Leu Ala Arg Lys Val Asp Met Leu Thr Ala Glu Asn 305 310 315 320 Val Ser Leu Lys Ser Glu Ile Ile Gln Leu Thr Glu Gly Ser Glu Gln 325 330 335 Met Arg Met Glu Asn Ser Ala Leu Arg Glu Lys Leu Arg Asn Thr Gln 340 345 350 Leu Gly Gln Arg Glu Glu Ile Ile Leu Ser Ser Ile Glu Ser Lys Arg 355 360 365 Ala Ala Pro Val Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Ser 370 375 380 Ser Ser Asn Asp Arg Thr Thr Glu Asn Glu Asn Asp Phe Cys Glu Asn 385 390 395 400 Lys Pro Asn Ser Gly Ala Lys Leu His Gln Leu Leu Asp Thr Asn Pro 405 410 415 Arg Ala Asp Ala Val Ala Ala Gly 420 631035DNAGlycine max 63atgggaaccg gtgaagaaag cacagctaaa gttcctaaac catcttctac ttcttcaatt 60cagataccac tggcaccttc atatcctgat tggtcaagct cgatgcaggc ttactatgct 120cctggagcca ctccacctgc attttttgcc tcaaatattg cttctccaac tccccattct 180tatatgtggg gaagccagca ccctctaatt ccaccatata gtactcctgt tccatatcca 240gctatatatc ctcctgggaa tgtctatgct catcctagca tggcaatgac cctgagcacc 300acacagaatg gtacagagtt tgtaggaaag ggttctgatg aaaaagatcg ggtttctgcc 360aaaagttcaa aggctgtgtc tgcaaacaat ggttccaaag ctggagacaa tggaaaggca 420agctcaggtc ccagaaatga tggcacctca caaagtgctg aaagtggttc agagggatct 480tcggatgcta gtgatgagaa tactaaccaa caggaatcgg ctacaaacaa gaaagggagt 540tttgaccaaa tgcttgttga tggtgctaat gcacggaaca attctgtgag catcattcct 600caacctggaa atcccgctgt gtcaatgtct ccaactagtc ttaatattgg aatgaacttg 660tggaatgcat ctcctgctgg tgacgaagct gcaaaaatga gacagaatca gtcttcagga 720gctgttactc ctccaaccat aatgggacgt gaagtcgcgc tgggtgaaca ctggatacaa 780gatgaacgtg aactaaagaa acagaaaagg aaacagtcta atagggagtc tgctaggaga 840tcaagactac gcaagcaggc tgagtgtgaa gagttacaaa agagggtgga 
gtctctggga 900agtgagaatc aaactctcag agaagagctt cagagagtat ctgaagaatg caaaaaactt 960acatctgaaa atgattccat caaggaagag ttagaacggt tgtgtggacc agaagcagtt 1020gctaatcttg aataa 103564344PRTGlycine max 64Met Gly Thr Gly Glu Glu Ser Thr Ala Lys Val Pro Lys Pro Ser Ser 1 5 10 15 Thr Ser Ser Ile Gln Ile Pro Leu Ala Pro Ser Tyr Pro Asp Trp Ser 20 25 30 Ser Ser Met Gln Ala Tyr Tyr Ala Pro Gly Ala Thr Pro Pro Ala Phe 35 40 45 Phe Ala Ser Asn Ile Ala Ser Pro Thr Pro His Ser Tyr Met Trp Gly 50 55 60 Ser Gln His Pro Leu Ile Pro Pro Tyr Ser Thr Pro Val Pro Tyr Pro 65 70 75 80 Ala Ile Tyr Pro Pro Gly Asn Val Tyr Ala His Pro Ser Met Ala Met 85 90 95 Thr Leu Ser Thr Thr Gln Asn Gly Thr Glu Phe Val Gly Lys Gly Ser 100 105 110 Asp Glu Lys Asp Arg Val Ser Ala Lys Ser Ser Lys Ala Val Ser Ala 115 120 125 Asn Asn Gly Ser Lys Ala Gly Asp Asn Gly Lys Ala Ser Ser Gly Pro 130 135 140 Arg Asn Asp Gly Thr Ser Gln Ser Ala Glu Ser Gly Ser Glu Gly Ser 145 150 155 160 Ser Asp Ala Ser Asp Glu Asn Thr Asn Gln Gln Glu Ser Ala Thr Asn 165 170 175 Lys Lys Gly Ser Phe Asp Gln Met Leu Val Asp Gly Ala Asn Ala Arg 180 185 190 Asn Asn Ser Val Ser Ile Ile Pro Gln Pro Gly Asn Pro Ala Val Ser 195 200 205 Met Ser Pro Thr Ser Leu Asn Ile Gly Met Asn Leu Trp Asn Ala Ser 210 215 220 Pro Ala Gly Asp Glu Ala Ala Lys Met Arg Gln Asn Gln Ser Ser Gly 225 230 235 240 Ala Val Thr Pro Pro Thr Ile Met Gly Arg Glu Val Ala Leu Gly Glu 245 250 255 His Trp Ile Gln Asp Glu Arg Glu Leu Lys Lys Gln Lys Arg Lys Gln 260 265 270 Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu 275 280 285 Cys Glu Glu Leu Gln Lys Arg Val Glu Ser Leu Gly Ser Glu Asn Gln 290 295 300 Thr Leu Arg Glu Glu Leu Gln Arg Val Ser Glu Glu Cys Lys Lys Leu 305 310 315 320 Thr Ser Glu Asn Asp Ser Ile Lys Glu Glu Leu Glu Arg Leu Cys Gly 325 330 335 Pro Glu Ala Val Ala Asn Leu Glu 340 651275DNAGlycine max 65atgggaaaca gtgaggaaga gaaatctgtt aaaactggaa gtccttcttc ttcacctgca 60acaactgagc agaccaacca acctaatatt cacgtctatc ctgattgggc tgccatgcag 120tattatgggc 
caagagtcaa cattccacca tacttcaact cagctgtggc ttctggtcat 180gctccacacc catacatgtg ggggccgcca cagcctatga tgcaacctta tgggccacct 240tatgcagcat tttattcgca tggaggggtt tatactcacc ctgcagttgc tattgggcca 300cactcacatg gtcaaggagt tccatcttca cctgctgctg ggactccttc aagcgtagaa 360tcaccaacca aattctctgg aaatactaat cagggtttag tgaagaaatt gaaagggttt 420gatgagcttg caatgtcaat aggcaattgc aatgctgaga gtgctgagcg aggagctgaa 480aacaggctgt cacagagtgt ggatactgag ggttccagcg acggaagtga tggcaacact 540gcaggggcta atcaaacaaa aaggaaaaga agccgagaag gaacaccgat cactgatgca 600gaagggaaaa ctgagctaca aaatggtccg gcttccaaag agactgcatc ttccaaaaag 660attgtgtcag ctaccccagc tagtgttgca ggaacattag ttggacctgt agtttcttca 720ggtatggcca cagcactgga gctgaggaac ccttcgactg ttcattctaa ggcaaattcc 780acaagtgccg cacaaccttg tgcagttgtg cgtaatgaaa cttggttaca gaatgagcgt 840gagctgaaac gggagaggag aaaacaatct aaccgtgaat ctgctagaag gtccaggctg 900aggaagcagg ccgagactga agaattggca cgaaaagttg agatgttaac tgctgagaat 960gtgtctctga agtcagaaat aactcgactg actgaaggtt ctgagcagat gaggatggaa 1020aattctgcat tgagggaaaa actgataaat actcaactgg gaccaaggga agagataact 1080ttgagcagca ttgacagcaa gagggctgca cctgtaagta cagaaaactt gttatcaaga 1140gttaacaatt ccggagctaa tgatagaact gcagagaatg agaatgatat ctgcgagaac 1200aaaccaaatt ctggtgcaaa gctgcatcaa ctactggata caaatcctag agctaatgct 1260gtagcagctg gttga 127566424PRTGlycine max 66Met Gly Asn Ser Glu Glu Glu Lys Ser Val Lys Thr Gly Ser Pro Ser 1 5 10 15 Ser Ser Pro Ala Thr Thr Glu Gln Thr Asn Gln Pro Asn Ile His Val 20 25 30 Tyr Pro Asp Trp Ala Ala Met Gln Tyr Tyr Gly Pro Arg Val Asn Ile 35 40 45 Pro Pro Tyr Phe Asn Ser Ala Val Ala Ser Gly His Ala Pro His Pro 50 55 60 Tyr Met Trp Gly Pro Pro Gln Pro Met Met Gln Pro Tyr Gly Pro Pro 65 70 75 80 Tyr Ala Ala Phe Tyr Ser His Gly Gly Val Tyr Thr His Pro Ala Val 85 90 95 Ala Ile Gly Pro His Ser His Gly Gln Gly Val Pro Ser Ser Pro Ala 100 105 110 Ala Gly Thr Pro Ser Ser Val Glu Ser Pro Thr Lys Phe Ser Gly Asn 115 120 125 Thr Asn Gln Gly Leu Val Lys Lys Leu Lys Gly Phe Asp Glu 
Leu Ala 130 135 140 Met Ser Ile Gly Asn Cys Asn Ala Glu Ser Ala Glu Arg Gly Ala Glu 145 150 155 160 Asn Arg Leu Ser Gln Ser Val Asp Thr Glu Gly Ser Ser Asp Gly Ser 165 170 175 Asp Gly Asn Thr Ala Gly Ala Asn Gln Thr Lys Arg Lys Arg Ser Arg 180 185 190 Glu Gly Thr Pro Ile Thr Asp Ala Glu Gly Lys Thr Glu Leu Gln Asn 195 200 205 Gly Pro Ala Ser Lys Glu Thr Ala Ser Ser Lys Lys Ile Val Ser Ala 210 215 220 Thr Pro Ala Ser Val Ala Gly Thr Leu Val Gly Pro Val Val Ser Ser 225 230 235 240 Gly Met Ala Thr Ala Leu Glu Leu Arg Asn Pro Ser Thr Val His Ser 245 250 255 Lys Ala Asn Ser Thr Ser Ala Ala Gln Pro Cys Ala Val Val Arg Asn 260 265 270 Glu Thr Trp Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys 275 280 285 Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala 290 295 300 Glu Thr Glu Glu Leu Ala Arg Lys Val Glu Met Leu Thr Ala Glu Asn 305 310 315 320 Val Ser Leu Lys Ser Glu Ile Thr Arg Leu Thr Glu Gly Ser Glu Gln 325 330 335 Met Arg Met Glu Asn Ser Ala Leu Arg Glu Lys Leu Ile Asn Thr Gln 340 345 350 Leu Gly Pro Arg Glu Glu Ile Thr Leu Ser Ser Ile Asp Ser Lys Arg 355 360 365 Ala Ala Pro Val Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Ser 370 375 380 Gly Ala Asn Asp Arg Thr Ala Glu Asn Glu Asn Asp Ile Cys Glu Asn 385 390 395 400 Lys Pro Asn Ser Gly Ala Lys Leu His Gln Leu Leu Asp Thr Asn Pro 405 410 415 Arg Ala Asn Ala Val Ala Ala Gly 420 671014DNAGlycine max 67atgggggctg gggaagagag cacagctaaa tcttctaaat catcttcatc agctcaggat 60acaccaacag cacctgcata tcctgattgg tcaagctcca tgcaggccta ttatgctcct 120ggagccactc ctcctccctt ttttgccaca accgttgctt ccccaactcc ccatccctat 180ttatggggag gccagcatcc tttgatgccg ccatatggga ctccagtccc atatccagct 240atatatcctc ctgggagtat ctatgctcat cctagcatgg cagtgactcc aagtgctgtc 300cagcaaaata cagagattga agggaaggga gctgaaggaa aatatcggga ctcatccaaa 360aaattgaaag gaccttctgc aaatacagct tccaaagcag gagaaagtgg aaaggcaggc 420tcaggttcag gcaatgatgg catatcgcaa agtggtgaaa gtggttcaga gggttcatca 480aatgctagcg atgagaatac taaccaacag gaatcagctg caaataagaa 
gggaagcttt 540gacctgatgc ttgttgatgg agccaatgca cagaacaatt ctgctggtgc tatttctcaa 600tcttctgtgc ctggaaagcc tgttgtccca atgccagcaa ctaatcttaa cattggaatg 660gacttgtgga atgcatcttc tggtggcgct gaagctgcaa aaatgagaca taatcaatct 720ggtgccccgg gagttgccct tggtgatcaa tgggtacaag atgaacgtga gctgaaaaga 780cagaagagga aacagtcaaa ccgagagtca gccaggaggt caagattacg caagcaggct 840gagtgtgaag agttacaaaa gagggtggaa tcgctgggag gtgagaatca aactctcaga 900gaagagcttc agagactttc tgaagaatgc gagaagctta catccgaaaa taattctatc 960aaggaagagt tggagcggtt gtgtggtcca gaagcagttg ctaacctgga ttaa 101468337PRTGlycine max 68Met Gly Ala Gly Glu Glu Ser Thr Ala Lys Ser Ser Lys Ser Ser Ser 1 5 10 15 Ser Ala Gln Asp Thr Pro Thr Ala Pro Ala Tyr Pro Asp Trp Ser Ser 20 25 30 Ser Met Gln Ala Tyr Tyr Ala Pro Gly Ala Thr Pro Pro Pro Phe Phe 35 40 45 Ala Thr Thr Val Ala Ser Pro Thr Pro His Pro Tyr Leu Trp Gly Gly 50 55 60 Gln His Pro Leu Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala 65
70 75 80 Ile Tyr Pro Pro Gly Ser Ile Tyr Ala His Pro Ser Met Ala Val Thr 85 90 95 Pro Ser Ala Val Gln Gln Asn Thr Glu Ile Glu Gly Lys Gly Ala Glu 100 105 110 Gly Lys Tyr Arg Asp Ser Ser Lys Lys Leu Lys Gly Pro Ser Ala Asn 115 120 125 Thr Ala Ser Lys Ala Gly Glu Ser Gly Lys Ala Gly Ser Gly Ser Gly 130 135 140 Asn Asp Gly Ile Ser Gln Ser Gly Glu Ser Gly Ser Glu Gly Ser Ser 145 150 155 160 Asn Ala Ser Asp Glu Asn Thr Asn Gln Gln Glu Ser Ala Ala Asn Lys 165 170 175 Lys Gly Ser Phe Asp Leu Met Leu Val Asp Gly Ala Asn Ala Gln Asn 180 185 190 Asn Ser Ala Gly Ala Ile Ser Gln Ser Ser Val Pro Gly Lys Pro Val 195 200 205 Val Pro Met Pro Ala Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Asn 210 215 220 Ala Ser Ser Gly Gly Ala Glu Ala Ala Lys Met Arg His Asn Gln Ser 225 230 235 240 Gly Ala Pro Gly Val Ala Leu Gly Asp Gln Trp Val Gln Asp Glu Arg 245 250 255 Glu Leu Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 260 265 270 Arg Ser Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Lys Arg 275 280 285 Val Glu Ser Leu Gly Gly Glu Asn Gln Thr Leu Arg Glu Glu Leu Gln 290 295 300 Arg Leu Ser Glu Glu Cys Glu Lys Leu Thr Ser Glu Asn Asn Ser Ile 305 310 315 320 Lys Glu Glu Leu Glu Arg Leu Cys Gly Pro Glu Ala Val Ala Asn Leu 325 330 335 Asp 691278DNAGlycine max 69atgggaaaca gtgaggaaga gaaatctacc aagactgaaa aaccttcttc acctgtaaca 60gtggatcaag ccaatcagac caaccagacc aatattcatg tctatcctga ttgggcagcc 120atgcaggcat attatgggcc aagagtcacc atgccaccat actacaactc agctgtggct 180tctggtcacg ctcctcaccc atacatgtgg ggaccaccac agcctatgat gccaccttat 240gggcctcctt atgcagcaat ttatccacat ggaggggttt atactcaccc tgcagttcct 300attgggccac atacacatag tcaaggagtt ccatcttcac ccgccgctgg gactccttta 360agcatagaga caccacccaa atcatctgga aatactgatc agggtttaat gaagaaattg 420aaagagtttg atggacttgc aatgtcaata ggaaatggcc atgctgaaag tgcagagcct 480ggaggtgaaa acaggctgtc agagagtgtg gatactgagg gttccagtga tggaagtgat 540ggcaacactt caggggctaa tcaaacaaga aggaaaagaa gccgtgaggg aacaccaacc 600actgatggag aagggaaaac tgagatgcaa ggcagtccaa 
tttccaaaga gactgcagct 660tctaataaga tgttggcagt tgtcactgct ggtgttgcag gaacaatagt tggacctgta 720gtttcttcag gtatgaccac cacgctggag ctgagaaatc cttccagtgt tcattctaaa 780gcaagtgccc cacaaccttg tccagtattg cctgcagaaa cttggttaca gaatgagcgt 840gagctgaaac gtgagaggcg gaaacaatca aatcgagaat ctgctagaag gtccagacta 900aggaagcagg ctgaaactga agaactggca cggaaagttg aatccttgaa tgctgagaat 960gcaacactga aatcagaaat aaatcgactg accgaaagtt ctgaaaaaat gagggtggaa 1020aatgctacat taaggggaaa acttaaaaat gctcaactga gacaaacaca agagataact 1080ttgaacataa ttgacagcca gagggctaca cctataagta cagaaaactt actatcgaga 1140gttaataata attccggttc taatgataga actgtggagg atgagaatgg tttttgcgaa 1200aataaaccaa actctggtgc aaagctgcat caactactgg acacaagtcc tagagctgat 1260gctgtggcag ctggttga 127870425PRTGlycine max 70Met Gly Asn Ser Glu Glu Glu Lys Ser Thr Lys Thr Glu Lys Pro Ser 1 5 10 15 Ser Pro Val Thr Val Asp Gln Ala Asn Gln Thr Asn Gln Thr Asn Ile 20 25 30 His Val Tyr Pro Asp Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg 35 40 45 Val Thr Met Pro Pro Tyr Tyr Asn Ser Ala Val Ala Ser Gly His Ala 50 55 60 Pro His Pro Tyr Met Trp Gly Pro Pro Gln Pro Met Met Pro Pro Tyr 65 70 75 80 Gly Pro Pro Tyr Ala Ala Ile Tyr Pro His Gly Gly Val Tyr Thr His 85 90 95 Pro Ala Val Pro Ile Gly Pro His Thr His Ser Gln Gly Val Pro Ser 100 105 110 Ser Pro Ala Ala Gly Thr Pro Leu Ser Ile Glu Thr Pro Pro Lys Ser 115 120 125 Ser Gly Asn Thr Asp Gln Gly Leu Met Lys Lys Leu Lys Glu Phe Asp 130 135 140 Gly Leu Ala Met Ser Ile Gly Asn Gly His Ala Glu Ser Ala Glu Pro 145 150 155 160 Gly Gly Glu Asn Arg Leu Ser Glu Ser Val Asp Thr Glu Gly Ser Ser 165 170 175 Asp Gly Ser Asp Gly Asn Thr Ser Gly Ala Asn Gln Thr Arg Arg Lys 180 185 190 Arg Ser Arg Glu Gly Thr Pro Thr Thr Asp Gly Glu Gly Lys Thr Glu 195 200 205 Met Gln Gly Ser Pro Ile Ser Lys Glu Thr Ala Ala Ser Asn Lys Met 210 215 220 Leu Ala Val Val Thr Ala Gly Val Ala Gly Thr Ile Val Gly Pro Val 225 230 235 240 Val Ser Ser Gly Met Thr Thr Thr Leu Glu Leu Arg Asn Pro Ser Ser 245 250 255 Val His Ser Lys Ala Ser 
Ala Pro Gln Pro Cys Pro Val Leu Pro Ala 260 265 270 Glu Thr Trp Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys 275 280 285 Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala 290 295 300 Glu Thr Glu Glu Leu Ala Arg Lys Val Glu Ser Leu Asn Ala Glu Asn 305 310 315 320 Ala Thr Leu Lys Ser Glu Ile Asn Arg Leu Thr Glu Ser Ser Glu Lys 325 330 335 Met Arg Val Glu Asn Ala Thr Leu Arg Gly Lys Leu Lys Asn Ala Gln 340 345 350 Leu Arg Gln Thr Gln Glu Ile Thr Leu Asn Ile Ile Asp Ser Gln Arg 355 360 365 Ala Thr Pro Ile Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Asn 370 375 380 Ser Gly Ser Asn Asp Arg Thr Val Glu Asp Glu Asn Gly Phe Cys Glu 385 390 395 400 Asn Lys Pro Asn Ser Gly Ala Lys Leu His Gln Leu Leu Asp Thr Ser 405 410 415 Pro Arg Ala Asp Ala Val Ala Ala Gly 420 425 71558DNAHelianthus annuus 71atgtcaacga tactggtgtt agggttggag agagtgggaa gacagcttcg agttcaggga 60atgacggtgc cacacaaagc gtgtaacgta agcagtgccg aaagtggaaa taacggttca 120tctgatgcaa atgatgagga tacccaacag gaacattctg gaagcaaaaa gggaagcttc 180catcagatgc ttgcggacgc gaatgcacga aacaacaatc atgtagcttc tatgcctgct 240accaccaatc tttatatggg gatgaatatg tggaccccac ctactggttc tgtgtcacaa 300ccggtgaccc cgccaccggt aatggctgat cggtgggttc aggatgaacg agaattgaaa 360aggcagaaac ggaagcaaga caacagagag tcggctagaa gatcaaggat gcgcaagcag 420gctgagtgtg aagcgctaca agcaacagta gagacgctaa ataacgagaa tcactcactt 480agggatgagc tgcagaggct ttctgaggca tgtgggaagc ttacagctga aaatgattcg 540ataaaggatg agatttaa 55872185PRTHelianthus annuus 72Met Ser Thr Ile Leu Val Leu Gly Leu Glu Arg Val Gly Arg Gln Leu 1 5 10 15 Arg Val Gln Gly Met Thr Val Pro His Lys Ala Cys Asn Val Ser Ser 20 25 30 Ala Glu Ser Gly Asn Asn Gly Ser Ser Asp Ala Asn Asp Glu Asp Thr 35 40 45 Gln Gln Glu His Ser Gly Ser Lys Lys Gly Ser Phe His Gln Met Leu 50 55 60 Ala Asp Ala Asn Ala Arg Asn Asn Asn His Val Ala Ser Met Pro Ala 65 70 75 80 Thr Thr Asn Leu Tyr Met Gly Met Asn Met Trp Thr Pro Pro Thr Gly 85 90 95 Ser Val Ser Gln Pro Val Thr Pro Pro Pro Val Met Ala Asp Arg Trp 100 
105 110 Val Gln Asp Glu Arg Glu Leu Lys Arg Gln Lys Arg Lys Gln Asp Asn 115 120 125 Arg Glu Ser Ala Arg Arg Ser Arg Met Arg Lys Gln Ala Glu Cys Glu 130 135 140 Ala Leu Gln Ala Thr Val Glu Thr Leu Asn Asn Glu Asn His Ser Leu 145 150 155 160 Arg Asp Glu Leu Gln Arg Leu Ser Glu Ala Cys Gly Lys Leu Thr Ala 165 170 175 Glu Asn Asp Ser Ile Lys Asp Glu Ile 180 185 731047DNAHelianthus annuus 73atgggaaact gtgaagaagc aaaggattgt aagcccgaag aaacatcttc accacctgcg 60gcttattacg gcccgagaat ggctgtgcca ccatacttca gttcacctgt tgcatctggt 120catgcacctc caccttatat gtggggtcca atgccgcata tgatgccccc ttacgctgca 180atttatccac atggaggtgt ttatgcacat cccggagtta ctgtggcggg taatcctaat 240cctacactcg taccaatcga ttcttgtgcc aagtcatcag ggaatccaga tagaggcttg 300ataaagaagt tgaaaggaat tgatgagcta acaatgtcga tagggaacaa caatggaatc 360tctaatagtc gagatacgga aggttccagt gaaggcagtg atggtaatac aagttgtaaa 420aatggtcaca aaaggatccg tggaggatca tctacaagtt ctgaaggtgg caagactgag 480cgaagtgatg cgaacggggt ctctaagaaa gtgacaggtg tcaagatttt gggtaaagaa 540gttggagcgg ttatttctgg tgactcggga accgagctag agcttaagaa gtcgccagca 600tctgccaaca tggccatggt gcccaacagt cttgtgcttc agaacgaacg agaactgaaa 660agggaaaaaa gaaagcagtc taaccgagaa tcagctagga ggtcgagatt aaggaaacag 720gccgaagccg aagaacttgg tacaagagtt gaagcactca ctagtgaaaa tttgaaactc 780aagtctgaga ttaaccagtt aaccgttaat gcaacaaact tgaagcttca aaacgctaaa 840ctactggaaa aacttaagaa agctacagaa ggcccaaggg ccgacaaaaa gggttcatct 900ctgagcactg ccaacctcct ctctagggtc gacaaccgat ctggttctgt agttggagag 960gccaccagtt ctggttctgg tgcaccgctg taccaacttc tggatgctag tccccgggcc 1020gatggtgttg cggctggtgc tggctaa 104774348PRTHelianthus annuus 74Met Gly Asn Cys Glu Glu Ala Lys Asp Cys Lys Pro Glu Glu Thr Ser 1 5 10 15 Ser Pro Pro Ala Ala Tyr Tyr Gly Pro Arg Met Ala Val Pro Pro Tyr 20 25 30 Phe Ser Ser Pro Val Ala Ser Gly His Ala Pro Pro Pro Tyr Met Trp 35 40 45 Gly Pro Met Pro His Met Met Pro Pro Tyr Ala Ala Ile Tyr Pro His 50 55 60 Gly Gly Val Tyr Ala His Pro Gly Val Thr Val Ala Gly Asn Pro Asn 65 70 75 80 Pro 
Thr Leu Val Pro Ile Asp Ser Cys Ala Lys Ser Ser Gly Asn Pro 85 90 95 Asp Arg Gly Leu Ile Lys Lys Leu Lys Gly Ile Asp Glu Leu Thr Met 100 105 110 Ser Ile Gly Asn Asn Asn Gly Ile Ser Asn Ser Arg Asp Thr Glu Gly 115 120 125 Ser Ser Glu Gly Ser Asp Gly Asn Thr Ser Cys Lys Asn Gly His Lys 130 135 140 Arg Ile Arg Gly Gly Ser Ser Thr Ser Ser Glu Gly Gly Lys Thr Glu 145 150 155 160 Arg Ser Asp Ala Asn Gly Val Ser Lys Lys Val Thr Gly Val Lys Ile 165 170 175 Leu Gly Lys Glu Val Gly Ala Val Ile Ser Gly Asp Ser Gly Thr Glu 180 185 190 Leu Glu Leu Lys Lys Ser Pro Ala Ser Ala Asn Met Ala Met Val Pro 195 200 205 Asn Ser Leu Val Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Lys Arg 210 215 220 Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln 225 230 235 240 Ala Glu Ala Glu Glu Leu Gly Thr Arg Val Glu Ala Leu Thr Ser Glu 245 250 255 Asn Leu Lys Leu Lys Ser Glu Ile Asn Gln Leu Thr Val Asn Ala Thr 260 265 270 Asn Leu Lys Leu Gln Asn Ala Lys Leu Leu Glu Lys Leu Lys Lys Ala 275 280 285 Thr Glu Gly Pro Arg Ala Asp Lys Lys Gly Ser Ser Leu Ser Thr Ala 290 295 300 Asn Leu Leu Ser Arg Val Asp Asn Arg Ser Gly Ser Val Val Gly Glu 305 310 315 320 Ala Thr Ser Ser Gly Ser Gly Ala Pro Leu Tyr Gln Leu Leu Asp Ala 325 330 335 Ser Pro Arg Ala Asp Gly Val Ala Ala Gly Ala Gly 340 345 75747DNAHelianthus petiolaris 75atgggagctg gtgaaggaag ctcaactgct aagcattcta aacctacctt atctcaggaa 60acacatccac cctcctatgc tgattggtca gcctcaatgc aggcgtatta tggtggcgga 120gctcctccac cattctttcc gtcgatcgtt gcatctcctc ctcctcatcc atacatgtgg 180ggaggccagc atcctatgat gccaccatac ggggctccag ttccatatcc gactctatat 240tcaccgccgg aagtttatgc tcatcccggc acacctatgg tagtccgcca aaaggtagca 300tcaagttcag ggaacgacgg taacactacc caaagtgctg atagtgagaa tgatggttct 360tcagatgcca atttgcagaa tgacaagtcg gggaatctcg ttgttcatac gcctgctacc 420gctaaaaatg tagcagttga aatgcaatcg aatccgtctg atgctcctca aacagcggtc 480ccaccaccga taatgggaca gtggccacgc caagatgaac gagaactgaa gagacagaag 540agaaagcaat ctaatcgaga gtcagctagg agatcgagaa tgcgcaagca ggccgagtgt 
600gaagagctac aagcacgagt tgaggtacta agtaatgaaa atcacacgct caaagatgag 660ctgcagagac tctctgagga aatgcagaag ctaacgtctg aaaatgattc tatgaaggga 720aagttaagtg actattgtga accctga 74776248PRTHelianthus petiolaris 76Met Gly Ala Gly Glu Gly Ser Ser Thr Ala Lys His Ser Lys Pro Thr 1 5 10 15 Leu Ser Gln Glu Thr His Pro Pro Ser Tyr Ala Asp Trp Ser Ala Ser 20 25 30 Met Gln Ala Tyr Tyr Gly Gly Gly Ala Pro Pro Pro Phe Phe Pro Ser 35 40 45 Ile Val Ala Ser Pro Pro Pro His Pro Tyr Met Trp Gly Gly Gln His 50 55 60 Pro Met Met Pro Pro Tyr Gly Ala Pro Val Pro Tyr Pro Thr Leu Tyr 65 70 75 80 Ser Pro Pro Glu Val Tyr Ala His Pro Gly Thr Pro Met Val Val Arg 85 90 95 Gln Lys Val Ala Ser Ser Ser Gly Asn Asp Gly Asn Thr Thr Gln Ser 100 105 110 Ala Asp Ser Glu Asn Asp Gly Ser Ser Asp Ala Asn Leu Gln Asn Asp 115 120 125 Lys Ser Gly Asn Leu Val Val His Thr Pro Ala Thr Ala Lys Asn Val 130 135 140 Ala Val Glu Met Gln Ser Asn Pro Ser Asp Ala Pro Gln Thr Ala Val 145 150 155 160 Pro Pro Pro Ile Met Gly Gln Trp Pro Arg Gln Asp Glu Arg Glu Leu 165 170 175 Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser 180 185 190 Arg Met Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Ala Arg Val Glu 195 200 205 Val Leu Ser Asn Glu Asn His Thr Leu Lys Asp Glu Leu Gln Arg Leu 210 215 220 Ser Glu Glu Met Gln Lys Leu Thr Ser Glu Asn Asp Ser Met Lys Gly 225 230 235 240 Lys Leu Ser Asp Tyr Cys Glu Pro 245 771083DNAHelianthus tuberosus 77atgggaaact gtgaagaagc aaaggattgt aaacccgaag aaacatcttc acctcctgcg 60actggtcgtg tatatactga ttgggcgtcc atgcaggctt attatggccc gagaatggct 120gtgccaccat acttcaattc acctgttgca tctggtcatg cacctccacc ttatatgtgg 180ggtccaatgc cgcatatgat gcccccttac gctgcaattt atccacatgg aggtgtttat 240gcacatcccg gagttactgt ggcggataat cctaatcctg cactcatacc aatcgattct 300tctgccaagt catcagggaa tccagataga ggcttgataa agaagttgaa aggaattgat 360gagctaacaa tgtcgatagg gaacaacaat ggaatctcta atagtcgaga tacggaaggt 420tccagtgaag gcagtgatgg taatacaagt tgtaaaaatg aacacaaaag gatccgtgga 480ggatcatcta caagttctga aggtggcaag 
actgagcgaa gtgatgcgaa cggggtcact 540aagaaagtga aaggtgtcaa gattttgggt aaagaagttg gagcggttat ttctggtgac 600tcgggaaccg agctagagct taagaagtct ccagcatctc ccaacatgtc catggtgccc 660aacagtcttg tgcttcagaa cgaacgagaa ctgaaaaggg aaaaaagaaa gcagtctaac 720cgagaatcag ctaggaggtc aagattaagg aaacaggccg aagccgaaga acttggtaca 780agagttgaag cactcactag tgaaaatttg aaactcaagt ctgagattaa ccagttaact 840gttaatgcaa caaacttgaa gcttcaaaac gctaaactac tggaaaaact taagaaagct 900acagaaggcc caagggccga caaaaagggt tcatctctga gcactgctaa cctcctctct 960agggtcgaca accgttctgg ttcagtagtt ggagatgcca ccagttctgg ttctggtgca 1020ccgctgcacc aacttctgga tgctagtccc cgggccgatg ctgttgcggc tggtgctggc 1080taa 108378360PRTHelianthus tuberosus 78Met Gly Asn Cys Glu Glu Ala Lys Asp Cys Lys Pro Glu Glu Thr Ser 1 5 10 15 Ser Pro Pro Ala Thr Gly Arg Val Tyr Thr Asp Trp Ala Ser Met Gln 20 25 30 Ala Tyr Tyr Gly Pro Arg Met Ala Val Pro Pro Tyr Phe Asn Ser Pro
35 40 45 Val Ala Ser Gly His Ala Pro Pro Pro Tyr Met Trp Gly Pro Met Pro 50 55 60 His Met Met Pro Pro Tyr Ala Ala Ile Tyr Pro His Gly Gly Val Tyr 65 70 75 80 Ala His Pro Gly Val Thr Val Ala Asp Asn Pro Asn Pro Ala Leu Ile 85 90 95 Pro Ile Asp Ser Ser Ala Lys Ser Ser Gly Asn Pro Asp Arg Gly Leu 100 105 110 Ile Lys Lys Leu Lys Gly Ile Asp Glu Leu Thr Met Ser Ile Gly Asn 115 120 125 Asn Asn Gly Ile Ser Asn Ser Arg Asp Thr Glu Gly Ser Ser Glu Gly 130 135 140 Ser Asp Gly Asn Thr Ser Cys Lys Asn Glu His Lys Arg Ile Arg Gly 145 150 155 160 Gly Ser Ser Thr Ser Ser Glu Gly Gly Lys Thr Glu Arg Ser Asp Ala 165 170 175 Asn Gly Val Thr Lys Lys Val Lys Gly Val Lys Ile Leu Gly Lys Glu 180 185 190 Val Gly Ala Val Ile Ser Gly Asp Ser Gly Thr Glu Leu Glu Leu Lys 195 200 205 Lys Ser Pro Ala Ser Pro Asn Met Ser Met Val Pro Asn Ser Leu Val 210 215 220 Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Lys Arg Lys Gln Ser Asn 225 230 235 240 Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala Glu 245 250 255 Glu Leu Gly Thr Arg Val Glu Ala Leu Thr Ser Glu Asn Leu Lys Leu 260 265 270 Lys Ser Glu Ile Asn Gln Leu Thr Val Asn Ala Thr Asn Leu Lys Leu 275 280 285 Gln Asn Ala Lys Leu Leu Glu Lys Leu Lys Lys Ala Thr Glu Gly Pro 290 295 300 Arg Ala Asp Lys Lys Gly Ser Ser Leu Ser Thr Ala Asn Leu Leu Ser 305 310 315 320 Arg Val Asp Asn Arg Ser Gly Ser Val Val Gly Asp Ala Thr Ser Ser 325 330 335 Gly Ser Gly Ala Pro Leu His Gln Leu Leu Asp Ala Ser Pro Arg Ala 340 345 350 Asp Ala Val Ala Ala Gly Ala Gly 355 360 79858DNAHelianthus tuberosus 79atgatggcac cttatgggac tccggttcca tatgctgcta tgtatccacc agcaggagtt 60tatggtcatc ccggtatgcc tatgactcct gacactgtgc agccgaccac agaaatggaa 120tcgaaggccc ccaatgggaa ggacaaggtc aacaaaaagt caaaggggtc ttctggaaat 180gtcaatgcca gcggcgctaa gactagagag agtgggaagg cggcttcaag ttcagggaat 240gacggtggca cccaaagtgc tgaaagtgga aataatggtt catctgatgc aagcgatgaa 300gataaccagc aggattattc tggaggcaaa aagggaagct tccatcaaat gcttgcagat 360gcgaatgcac gaaacaactc tggaccaaat gttcagatgc cagtgcctgg 
aaatcccgta 420gtttctatgc ctgctaccaa tcttaatatg gggatggacc tgtggaaccc gtctgctgga 480tccgggtcta tgaaaatgca accaaatcac tctggtgtct cgcaaccagg ggttccacca 540ccaataatgg ctgatcagtg gggtcaggat gagagagaat tgaagaggct gaaaaggaag 600caatccaata gagagtcagc tagaagatca aggctacgca agcagactga gtgtgaagag 660ctacaggcaa gagtagagac actaaacaac gagaatcact cactcagaga tgaactgcag 720aggctttctg aggaatgtgg gaagcttaca gctgaaaatg attcaataaa ggatgaatta 780accaggtttt ttggacctga agctgtttcg aaactcgatg cacatcttca atctcgaacg 840aacgaagatg aaagttga 85880285PRTHelianthus tuberosus 80Met Met Ala Pro Tyr Gly Thr Pro Val Pro Tyr Ala Ala Met Tyr Pro 1 5 10 15 Pro Ala Gly Val Tyr Gly His Pro Gly Met Pro Met Thr Pro Asp Thr 20 25 30 Val Gln Pro Thr Thr Glu Met Glu Ser Lys Ala Pro Asn Gly Lys Asp 35 40 45 Lys Val Asn Lys Lys Ser Lys Gly Ser Ser Gly Asn Val Asn Ala Ser 50 55 60 Gly Ala Lys Thr Arg Glu Ser Gly Lys Ala Ala Ser Ser Ser Gly Asn 65 70 75 80 Asp Gly Gly Thr Gln Ser Ala Glu Ser Gly Asn Asn Gly Ser Ser Asp 85 90 95 Ala Ser Asp Glu Asp Asn Gln Gln Asp Tyr Ser Gly Gly Lys Lys Gly 100 105 110 Ser Phe His Gln Met Leu Ala Asp Ala Asn Ala Arg Asn Asn Ser Gly 115 120 125 Pro Asn Val Gln Met Pro Val Pro Gly Asn Pro Val Val Ser Met Pro 130 135 140 Ala Thr Asn Leu Asn Met Gly Met Asp Leu Trp Asn Pro Ser Ala Gly 145 150 155 160 Ser Gly Ser Met Lys Met Gln Pro Asn His Ser Gly Val Ser Gln Pro 165 170 175 Gly Val Pro Pro Pro Ile Met Ala Asp Gln Trp Gly Gln Asp Glu Arg 180 185 190 Glu Leu Lys Arg Leu Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 195 200 205 Arg Ser Arg Leu Arg Lys Gln Thr Glu Cys Glu Glu Leu Gln Ala Arg 210 215 220 Val Glu Thr Leu Asn Asn Glu Asn His Ser Leu Arg Asp Glu Leu Gln 225 230 235 240 Arg Leu Ser Glu Glu Cys Gly Lys Leu Thr Ala Glu Asn Asp Ser Ile 245 250 255 Lys Asp Glu Leu Thr Arg Phe Phe Gly Pro Glu Ala Val Ser Lys Leu 260 265 270 Asp Ala His Leu Gln Ser Arg Thr Asn Glu Asp Glu Ser 275 280 285 811278DNAMedicago truncatula 81atgggaaata gcgatgaaga gaaatctacc aagactgaaa aaccttcttc acctgtaaca 
60gtggatcaga ccaaccagac gaatgttcat gtctatcctg attgggcagc catgcaggca 120tattatggac caagagttgc catgccacct tactacaact cacctgtggc ttctggtcac 180actcctcacc catatatgtg ggggccacca cagcctatga tgccacctta tggacatcct 240tatgcagcaa tgtatccaca tggaggggtt tatactcacc ctgcagttcc tattggtcca 300catccacata gtcaaggaat ttcatcttca cctgctactg gaactccttt aagcatagag 360acacctccca aatcatctgg aaatactgat cagggtctga tgaagaaatt gaaagggttt 420gatggacttg caatgtcaat aggcaatggc catgctgaga gtgctgagcc tggagctgaa 480agcaggcaat cacagagtgt gaatactgag ggttcgagtg atggaagtga cggaaacact 540tcaggggcta atcaaacaag aaggaaaaga agccgtgagg gaacaccaac cactgatgga 600gaagggaaaa caaatacaca aggtagtcaa atttccaaag aaattgcagc ttctgataag 660atgatggcag tagcccctgc tggtgtcaca ggtcaactag ttggacctgt agcttcttca 720gcgatgacca ccgcactgga gctgagaaat tcttctagtg ttcattctaa aacaaatccc 780acaagtaccc cacaaccttc tgctgtattg cctcccgagg cttggataca gaatgagcgt 840gaactgaagc gtgagaggag gaaacaatca aatcgagaat ctgctagaag gtccagacta 900aggaagcagg ctgaggctga agaactggca cgaaaagttg aatccttgaa tgctgagagt 960gcgtcactta gatcagaaat aaaccgactt gctgagaatt ctgaaagact gaggatggaa 1020aatgctgcat taaaggaaaa atttaaaatt gctaaactgg gacaaccgaa agagataatt 1080ttgaccaaca ttgacagcca gaggaccaca cctgtaagta cagaaaactt attatcaaga 1140gttaataaca attcgggttc taatgataga actgtggagg atgagaatgg ttattgtgac 1200aacaaaccaa attctggtgc gaagttgcac caactattgg atgcaagtcc tagagctgat 1260gctgtggctg ctggttga 127882425PRTMedicago truncatula 82Met Gly Asn Ser Asp Glu Glu Lys Ser Thr Lys Thr Glu Lys Pro Ser 1 5 10 15 Ser Pro Val Thr Val Asp Gln Thr Asn Gln Thr Asn Val His Val Tyr 20 25 30 Pro Asp Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Met 35 40 45 Pro Pro Tyr Tyr Asn Ser Pro Val Ala Ser Gly His Thr Pro His Pro 50 55 60 Tyr Met Trp Gly Pro Pro Gln Pro Met Met Pro Pro Tyr Gly His Pro 65 70 75 80 Tyr Ala Ala Met Tyr Pro His Gly Gly Val Tyr Thr His Pro Ala Val 85 90 95 Pro Ile Gly Pro His Pro His Ser Gln Gly Ile Ser Ser Ser Pro Ala 100 105 110 Thr Gly Thr Pro Leu Ser Ile Glu Thr Pro 
Pro Lys Ser Ser Gly Asn 115 120 125 Thr Asp Gln Gly Leu Met Lys Lys Leu Lys Gly Phe Asp Gly Leu Ala 130 135 140 Met Ser Ile Gly Asn Gly His Ala Glu Ser Ala Glu Pro Gly Ala Glu 145 150 155 160 Ser Arg Gln Ser Gln Ser Val Asn Thr Glu Gly Ser Ser Asp Gly Ser 165 170 175 Asp Gly Asn Thr Ser Gly Ala Asn Gln Thr Arg Arg Lys Arg Ser Arg 180 185 190 Glu Gly Thr Pro Thr Thr Asp Gly Glu Gly Lys Thr Asn Thr Gln Gly 195 200 205 Ser Gln Ile Ser Lys Glu Ile Ala Ala Ser Asp Lys Met Met Ala Val 210 215 220 Ala Pro Ala Gly Val Thr Gly Gln Leu Val Gly Pro Val Ala Ser Ser 225 230 235 240 Ala Met Thr Thr Ala Leu Glu Leu Arg Asn Ser Ser Ser Val His Ser 245 250 255 Lys Thr Asn Pro Thr Ser Thr Pro Gln Pro Ser Ala Val Leu Pro Pro 260 265 270 Glu Ala Trp Ile Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys 275 280 285 Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala 290 295 300 Glu Ala Glu Glu Leu Ala Arg Lys Val Glu Ser Leu Asn Ala Glu Ser 305 310 315 320 Ala Ser Leu Arg Ser Glu Ile Asn Arg Leu Ala Glu Asn Ser Glu Arg 325 330 335 Leu Arg Met Glu Asn Ala Ala Leu Lys Glu Lys Phe Lys Ile Ala Lys 340 345 350 Leu Gly Gln Pro Lys Glu Ile Ile Leu Thr Asn Ile Asp Ser Gln Arg 355 360 365 Thr Thr Pro Val Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Asn 370 375 380 Ser Gly Ser Asn Asp Arg Thr Val Glu Asp Glu Asn Gly Tyr Cys Asp 385 390 395 400 Asn Lys Pro Asn Ser Gly Ala Lys Leu His Gln Leu Leu Asp Ala Ser 405 410 415 Pro Arg Ala Asp Ala Val Ala Ala Gly 420 425 831023DNAMedicago truncatula 83atggggacta aggaggatag cacaactaaa ccttctaaaa catcttcatc aactcaggaa 60gtaccaacac caacagtaca accatcatat ccagattggt caacctccat gcaggcctac 120tataatcctg gagccgctcc gcctccctat tatgcctcaa ctgtggcttc accaaccccg 180catccttata tgtggggagg ccagcatcct atgatggcac catacgggac tccagttccg 240tatcctgcaa tgttccctcc tgggaatatc tatgctcatc ctagcatggt agtgactcca 300agtgctatgc accaaactac agagtttgaa gggaagggac ctgatggaaa ggataaggat 360tcatctaaaa aaccgaaggg cacttctgca aatacaagcg ccaaagcagg agagggtgga 420aaggcaggat caggttcagg 
caatgatggc ttttcacata gtggtgacag tggttcagag 480ggttcatcta atgctagcga tgaaaaccaa caggaatcag ctagaaacaa gaagggaagc 540tttgacctca tgcttgttga tggagccaac gcgcagaaca atactactgg acccatttct 600caatcatctg ttcctggaaa tcctgttgtc tcgatacctg caactaatct taatattgga 660atggacttat ggaatgcatc ttctgctggt gctgaagccg ccaaaatgag acacaatcaa 720cctggtgctc ctggagctgg tgcacttggt gaacagtgga tgcaacaaga tgatcgtgag 780ttgaaaagac agaagagaaa acagtctaat cgagagtcag ccaggaggtc aagactacgc 840aagcaggccg agtgtgaaga actacaaaag agggtggagg cgctgggagg tgagaatcga 900actctcagag aagagcttca gaaactttct gaagagtgtg agaagcttac atctgaaaac 960gattctatta aggaagactt ggaacggttg tgtgggcctg aagtagttgc taaccttgaa 1020tga 102384340PRTMedicago truncatula 84Met Gly Thr Lys Glu Asp Ser Thr Thr Lys Pro Ser Lys Thr Ser Ser 1 5 10 15 Ser Thr Gln Glu Val Pro Thr Pro Thr Val Gln Pro Ser Tyr Pro Asp 20 25 30 Trp Ser Thr Ser Met Gln Ala Tyr Tyr Asn Pro Gly Ala Ala Pro Pro 35 40 45 Pro Tyr Tyr Ala Ser Thr Val Ala Ser Pro Thr Pro His Pro Tyr Met 50 55 60 Trp Gly Gly Gln His Pro Met Met Ala Pro Tyr Gly Thr Pro Val Pro 65 70 75 80 Tyr Pro Ala Met Phe Pro Pro Gly Asn Ile Tyr Ala His Pro Ser Met 85 90 95 Val Val Thr Pro Ser Ala Met His Gln Thr Thr Glu Phe Glu Gly Lys 100 105 110 Gly Pro Asp Gly Lys Asp Lys Asp Ser Ser Lys Lys Pro Lys Gly Thr 115 120 125 Ser Ala Asn Thr Ser Ala Lys Ala Gly Glu Gly Gly Lys Ala Gly Ser 130 135 140 Gly Ser Gly Asn Asp Gly Phe Ser His Ser Gly Asp Ser Gly Ser Glu 145 150 155 160 Gly Ser Ser Asn Ala Ser Asp Glu Asn Gln Gln Glu Ser Ala Arg Asn 165 170 175 Lys Lys Gly Ser Phe Asp Leu Met Leu Val Asp Gly Ala Asn Ala Gln 180 185 190 Asn Asn Thr Thr Gly Pro Ile Ser Gln Ser Ser Val Pro Gly Asn Pro 195 200 205 Val Val Ser Ile Pro Ala Thr Asn Leu Asn Ile Gly Met Asp Leu Trp 210 215 220 Asn Ala Ser Ser Ala Gly Ala Glu Ala Ala Lys Met Arg His Asn Gln 225 230 235 240 Pro Gly Ala Pro Gly Ala Gly Ala Leu Gly Glu Gln Trp Met Gln Gln 245 250 255 Asp Asp Arg Glu Leu Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu 260 265 270 Ser 
Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu 275 280 285 Gln Lys Arg Val Glu Ala Leu Gly Gly Glu Asn Arg Thr Leu Arg Glu 290 295 300 Glu Leu Gln Lys Leu Ser Glu Glu Cys Glu Lys Leu Thr Ser Glu Asn 305 310 315 320 Asp Ser Ile Lys Glu Asp Leu Glu Arg Leu Cys Gly Pro Glu Val Val 325 330 335 Ala Asn Leu Glu 340 85372DNANicotiana tabacummisc_feature(300)..(300)n is a, c, g, or t 85atgttcatgg gcggtttgac ttgtttgctt gctccaactt tctggaatcc aaaatgtact 60caacaatttc ctttaggatt ttctggcagt catgttgttt tctgtttctt atttttgcag 120gctgagtgtg aagagctgca gcgtagggta gaagctttga gcagcgagaa tcattcactc 180aaagatgagc tccaacggct ctctgaggaa tgtgagaagc ttacctcaga gaatagtttg 240ataaaggtac ttaatacctt gttaagtgtt cctctaatat gcttatcctc cttgatattn 300ggtaatattc tcaagtaccc tcttcatgtc atcccctttt tcctttccct catttacctt 360tgcactttat ag 37286123PRTNicotiana tabacummisc_feature(100)..(100)Xaa can be any naturally occurring amino acid 86Met Phe Met Gly Gly Leu Thr Cys Leu Leu Ala Pro Thr Phe Trp Asn 1 5 10 15 Pro Lys Cys Thr Gln Gln Phe Pro Leu Gly Phe Ser Gly Ser His Val 20 25 30 Val Phe Cys Phe Leu Phe Leu Gln Ala Glu Cys Glu Glu Leu Gln Arg 35 40 45 Arg Val Glu Ala Leu Ser Ser Glu Asn His Ser Leu Lys Asp Glu Leu 50 55 60 Gln Arg Leu Ser Glu Glu Cys Glu Lys Leu Thr Ser Glu Asn Ser Leu 65 70 75 80 Ile Lys Val Leu Asn Thr Leu Leu Ser Val Pro Leu Ile Cys Leu Ser 85 90 95 Ser Leu Ile Xaa Gly Asn Ile Leu Lys Tyr Pro Leu His Val Ile Pro 100 105 110 Phe Phe Leu Ser Leu Ile Tyr Leu Cys Thr Leu 115 120 871284DNANicotiana tabacum 87atgggaaata gtgaggacgg gaaatcttgt aagcctgaga aatcatcttc gaccgcacca 60gaccagagca atattcacgt gtatcctgat tgggcggcta tgcaggcata ttatggtcca 120cgggtagctg tacctccata tgttaattct cctgttgcac ctggtcaagc tcctcatcct 180tgtatgtggg gaccgctaca gcctatgatg ccaccttatg gtataccata tgcaggaatc 240tatgcgcatg gtggtgttta tgcgcaccct ggagttccta tcgtgtctcg tcctcaggct 300catgtaatga catcatctcc tgctgtcagc caaaccatgg atgctgcttc tttgagtatg 360gacccttctg ctaagacttc gggggatacg aatcaaggct tgatgagtaa gttaaaaggt 
420tctgatgggc ttggaatgtc aataggaaat tgcagcgttg acaatggcga cggtactgac 480catggacctt ctcagagtga cagtgggcaa acggaaggtt caagtgatgg aagtaacata 540cacacagcag aggtgggtga gaagagtaag aaaagaagcc gcgagacgac tcctaatacc 600tctggtgacg gaaagagtcg gacacgaagc agtccacaac ctagggaagt aaatggggct 660accaagaagg aaacttctat agcttttaat cctggtaaca tagcagagaa agtagtcgga 720acagtatttt ctccaaccat gactactact ctggaactga gaaatcctgt cggtacacta 780gtgaaagcta gtccaactaa tgtttcacga attagtcctg cagtgccggg cgaagcttgg 840ttacagaatg aacgtgagat gaagcgggag aagaggaagc agtctaatcg ggaatctgca 900aggagatcaa ggttgagaaa gcagggagaa gctgaagaat tggcaatacg agttcaatct 960ttaacctccg aaaatttggg cctcaaatca gagataaata atttcactga aaattctgcg 1020aaactaaagc ttgaaaattc cgctttaatg gagagactgc aaaataaaca acgaggacaa 1080gcagaagagg taactttagg caagattggt gataagaggc tgcaacctgt tagcacagca 1140gacctattag caagagtcaa caactctggt ccgttggata gaaccaacaa agacgatgaa 1200attcatgaga ataatacttc aggagcaaag
cttcatcaac ttcttgatgc tagccacaga 1260actgatgctg tggctgctag atga 128488427PRTNicotiana tabacum 88Met Gly Asn Ser Glu Asp Gly Lys Ser Cys Lys Pro Glu Lys Ser Ser 1 5 10 15 Ser Thr Ala Pro Asp Gln Ser Asn Ile His Val Tyr Pro Asp Trp Ala 20 25 30 Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Val Pro Pro Tyr Val 35 40 45 Asn Ser Pro Val Ala Pro Gly Gln Ala Pro His Pro Cys Met Trp Gly 50 55 60 Pro Leu Gln Pro Met Met Pro Pro Tyr Gly Ile Pro Tyr Ala Gly Ile 65 70 75 80 Tyr Ala His Gly Gly Val Tyr Ala His Pro Gly Val Pro Ile Val Ser 85 90 95 Arg Pro Gln Ala His Val Met Thr Ser Ser Pro Ala Val Ser Gln Thr 100 105 110 Met Asp Ala Ala Ser Leu Ser Met Asp Pro Ser Ala Lys Thr Ser Gly 115 120 125 Asp Thr Asn Gln Gly Leu Met Ser Lys Leu Lys Gly Ser Asp Gly Leu 130 135 140 Gly Met Ser Ile Gly Asn Cys Ser Val Asp Asn Gly Asp Gly Thr Asp 145 150 155 160 His Gly Pro Ser Gln Ser Asp Ser Gly Gln Thr Glu Gly Ser Ser Asp 165 170 175 Gly Ser Asn Ile His Thr Ala Glu Val Gly Glu Lys Ser Lys Lys Arg 180 185 190 Ser Arg Glu Thr Thr Pro Asn Thr Ser Gly Asp Gly Lys Ser Arg Thr 195 200 205 Arg Ser Ser Pro Gln Pro Arg Glu Val Asn Gly Ala Thr Lys Lys Glu 210 215 220 Thr Ser Ile Ala Phe Asn Pro Gly Asn Ile Ala Glu Lys Val Val Gly 225 230 235 240 Thr Val Phe Ser Pro Thr Met Thr Thr Thr Leu Glu Leu Arg Asn Pro 245 250 255 Val Gly Thr Leu Val Lys Ala Ser Pro Thr Asn Val Ser Arg Ile Ser 260 265 270 Pro Ala Val Pro Gly Glu Ala Trp Leu Gln Asn Glu Arg Glu Met Lys 275 280 285 Arg Glu Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg 290 295 300 Leu Arg Lys Gln Gly Glu Ala Glu Glu Leu Ala Ile Arg Val Gln Ser 305 310 315 320 Leu Thr Ser Glu Asn Leu Gly Leu Lys Ser Glu Ile Asn Asn Phe Thr 325 330 335 Glu Asn Ser Ala Lys Leu Lys Leu Glu Asn Ser Ala Leu Met Glu Arg 340 345 350 Leu Gln Asn Lys Gln Arg Gly Gln Ala Glu Glu Val Thr Leu Gly Lys 355 360 365 Ile Gly Asp Lys Arg Leu Gln Pro Val Ser Thr Ala Asp Leu Leu Ala 370 375 380 Arg Val Asn Asn Ser Gly Pro Leu Asp Arg Thr Asn Lys Asp Asp Glu 385 390 395 400 
Ile His Glu Asn Asn Thr Ser Gly Ala Lys Leu His Gln Leu Leu Asp 405 410 415 Ala Ser His Arg Thr Asp Ala Val Ala Ala Arg 420 425 89483DNANicotiana tabacum 89atgtttcaca ctcagcctgc actgccaaat gaagcctggt tacagaatga acgtgagctg 60aagcgggaga aaaggaaaca gtctaatcgg gaatctgcaa ggcgatcaag attgagaaaa 120caggctgaag ctgaagaatt ggcaatacga gttcagtctt taacagggga aaacatgaca 180ctcaaatctg agataaacaa attaatggag aactcggaga aacttaagct agaaaatgct 240gctttaatgg agaaactgaa caatgaacag ctaagcccga cagaagaagt gagtttaggt 300aagattgatg ataagagggt gcaacctgta ggcaccgcaa acctactagc aagagtcaat 360aactctggtt ccttaaatag agcaaacgag gagagtgaag tttatgagaa caatagttct 420ggagcaaagc ttcatcaact actcgattcc agccccagaa ctgatgcagt ggctgctggg 480tga 48390160PRTNicotiana tabacum 90Met Phe His Thr Gln Pro Ala Leu Pro Asn Glu Ala Trp Leu Gln Asn 1 5 10 15 Glu Arg Glu Leu Lys Arg Glu Lys Arg Lys Gln Ser Asn Arg Glu Ser 20 25 30 Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala Glu Glu Leu Ala 35 40 45 Ile Arg Val Gln Ser Leu Thr Gly Glu Asn Met Thr Leu Lys Ser Glu 50 55 60 Ile Asn Lys Leu Met Glu Asn Ser Glu Lys Leu Lys Leu Glu Asn Ala 65 70 75 80 Ala Leu Met Glu Lys Leu Asn Asn Glu Gln Leu Ser Pro Thr Glu Glu 85 90 95 Val Ser Leu Gly Lys Ile Asp Asp Lys Arg Val Gln Pro Val Gly Thr 100 105 110 Ala Asn Leu Leu Ala Arg Val Asn Asn Ser Gly Ser Leu Asn Arg Ala 115 120 125 Asn Glu Glu Ser Glu Val Tyr Glu Asn Asn Ser Ser Gly Ala Lys Leu 130 135 140 His Gln Leu Leu Asp Ser Ser Pro Arg Thr Asp Ala Val Ala Ala Gly 145 150 155 160 911041DNANicotiana tabacum 91atgggagctg gggaagagag cacccctaca aagccttcaa aaccggcttc aactcaggag 60acacaaacta caccctcata tcctgattgg tcaagctcta tgcaggctta ttatagcgct 120ggagctactc ctcccttctt tgcctcacct gttgcttctc ctgcttccca cccatacttg 180tggggaggcc agcatcctct tatgcctcct tatggggctc cagtcccgta tccagcttta 240tatcctcctg ctggagttta tgctcatcct aacatggcca cgcacactcc aaacgctgtg 300caggcaaatc ttgaatcaaa caggaaggat cctgaaggaa aggatcggag tacgaacaaa 360aagttaaagg ccagttctgg tggcaaggca ggcgacagcg ggaaagttgc ttcaggttct 
420ggaaatgatg gtgccacaca aagtgatgaa accagaagtg aaggtacatc agatacaaat 480gatgaaaatg ataaccacga atttgctgca agcaagaagg gaagctttga tcaaatgctt 540gccgatggag ccagtgcgca gaataatccc acaacagcga attaccagac ctctatgcat 600gcaaatcctg tcactgtgca tgcaactaac ctaaatattg gaatggatgt gtggaatgca 660tcatctgccg gtcctggagc gatcataata cagccaaatg tgaatggtcc agttatagga 720catgaaggaa ggatgaatga tcaatgggtt caggacgaac gtgaacttaa aagacaaaag 780agaaagcaat ctaataggga gtcagctagg aggtcaaggc tacgcaagca ggctgagtgt 840gaagagctgc agcgtagggt agaagctttg agcagcgaga atcattcact caaagatgag 900ctccaacggc tctctgagga atgtgagaag cttacctcag agaatagttt gataaaggaa 960gagttaacgc gtttatgtgg gccagatgct gtgtctaagc tagagagcaa cggcaatgcc 1020actcatgagg aagctagtta a 104192346PRTNicotiana tabacum 92Met Gly Ala Gly Glu Glu Ser Thr Pro Thr Lys Pro Ser Lys Pro Ala 1 5 10 15 Ser Thr Gln Glu Thr Gln Thr Thr Pro Ser Tyr Pro Asp Trp Ser Ser 20 25 30 Ser Met Gln Ala Tyr Tyr Ser Ala Gly Ala Thr Pro Pro Phe Phe Ala 35 40 45 Ser Pro Val Ala Ser Pro Ala Ser His Pro Tyr Leu Trp Gly Gly Gln 50 55 60 His Pro Leu Met Pro Pro Tyr Gly Ala Pro Val Pro Tyr Pro Ala Leu 65 70 75 80 Tyr Pro Pro Ala Gly Val Tyr Ala His Pro Asn Met Ala Thr His Thr 85 90 95 Pro Asn Ala Val Gln Ala Asn Leu Glu Ser Asn Arg Lys Asp Pro Glu 100 105 110 Gly Lys Asp Arg Ser Thr Asn Lys Lys Leu Lys Ala Ser Ser Gly Gly 115 120 125 Lys Ala Gly Asp Ser Gly Lys Val Ala Ser Gly Ser Gly Asn Asp Gly 130 135 140 Ala Thr Gln Ser Asp Glu Thr Arg Ser Glu Gly Thr Ser Asp Thr Asn 145 150 155 160 Asp Glu Asn Asp Asn His Glu Phe Ala Ala Ser Lys Lys Gly Ser Phe 165 170 175 Asp Gln Met Leu Ala Asp Gly Ala Ser Ala Gln Asn Asn Pro Thr Thr 180 185 190 Ala Asn Tyr Gln Thr Ser Met His Ala Asn Pro Val Thr Val His Ala 195 200 205 Thr Asn Leu Asn Ile Gly Met Asp Val Trp Asn Ala Ser Ser Ala Gly 210 215 220 Pro Gly Ala Ile Ile Ile Gln Pro Asn Val Asn Gly Pro Val Ile Gly 225 230 235 240 His Glu Gly Arg Met Asn Asp Gln Trp Val Gln Asp Glu Arg Glu Leu 245 250 255 Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg 
Glu Ser Ala Arg Arg Ser 260 265 270 Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Arg Arg Val Glu 275 280 285 Ala Leu Ser Ser Glu Asn His Ser Leu Lys Asp Glu Leu Gln Arg Leu 290 295 300 Ser Glu Glu Cys Glu Lys Leu Thr Ser Glu Asn Ser Leu Ile Lys Glu 305 310 315 320 Glu Leu Thr Arg Leu Cys Gly Pro Asp Ala Val Ser Lys Leu Glu Ser 325 330 335 Asn Gly Asn Ala Thr His Glu Glu Ala Ser 340 345 93672DNAPinus taeda 93atggctttgg ttccaccagc acctgttcca ggaatgtcaa cagttagtct acctactaca 60aatctaaata ttggaatggg cgtctataat gtgcctgccc aaggatctgt aactcctgta 120aaaggaagac agggaacaac taacagtgta tcaacaattg ttcctgcagc atctcaactc 180attccaggtc atgatggagt gccttctgaa ttatgggtcc aggatgaacg ggaattaaaa 240cgacagaagc gtaaacaatc taatcgggag tcagctaaac gctctcgaat gaagaaacag 300atggaatgcg aagagttgtc tgcaaaagtt gagacactga cttctgagaa catggcactc 360agaaatgaaa taaatcttat agcagaggaa tctgaaaagc ttgcttctga gaatgaatca 420ctaaaggtaa tgttgaggaa ttatcaaaga gaggatagag taggagattc agaaagaaat 480ggtcctagag agacacagtc acagtttatg cagcaaggag gcaatggcta tcctcaagtt 540ctctctaaaa taagcaattt gagttctagc caaagagatg agcaaaggga gagtgagatg 600agcgattcaa ctggtaaatg tcatgccgta ttggaagcaa atgttcgatc tgatagagtg 660gctgcaggtt aa 67294223PRTPinus taeda 94Met Ala Leu Val Pro Pro Ala Pro Val Pro Gly Met Ser Thr Val Ser 1 5 10 15 Leu Pro Thr Thr Asn Leu Asn Ile Gly Met Gly Val Tyr Asn Val Pro 20 25 30 Ala Gln Gly Ser Val Thr Pro Val Lys Gly Arg Gln Gly Thr Thr Asn 35 40 45 Ser Val Ser Thr Ile Val Pro Ala Ala Ser Gln Leu Ile Pro Gly His 50 55 60 Asp Gly Val Pro Ser Glu Leu Trp Val Gln Asp Glu Arg Glu Leu Lys 65 70 75 80 Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Lys Arg Ser Arg 85 90 95 Met Lys Lys Gln Met Glu Cys Glu Glu Leu Ser Ala Lys Val Glu Thr 100 105 110 Leu Thr Ser Glu Asn Met Ala Leu Arg Asn Glu Ile Asn Leu Ile Ala 115 120 125 Glu Glu Ser Glu Lys Leu Ala Ser Glu Asn Glu Ser Leu Lys Val Met 130 135 140 Leu Arg Asn Tyr Gln Arg Glu Asp Arg Val Gly Asp Ser Glu Arg Asn 145 150 155 160 Gly Pro Arg Glu Thr Gln Ser Gln Phe 
Met Gln Gln Gly Gly Asn Gly 165 170 175 Tyr Pro Gln Val Leu Ser Lys Ile Ser Asn Leu Ser Ser Ser Gln Arg 180 185 190 Asp Glu Gln Arg Glu Ser Glu Met Ser Asp Ser Thr Gly Lys Cys His 195 200 205 Ala Val Leu Glu Ala Asn Val Arg Ser Asp Arg Val Ala Ala Gly 210 215 220 95573DNAPopulus trichocarpa 95atggctcatc aggaatatgg tgcaagcaag aagggaagct tcaaccagat gcttgcagat 60gctaatgcac aaagtacctc agctggagca aatatccaag cttctgtgcc tgggaaacct 120gtggcgtcta tgcctgcaac taatttaaac attgggatgg acttatggaa tgcatcttct 180gctgctggag ctacaaaaat gagaccaaat ccatcttgtg ccacatctgg agttgttcct 240gctggattgc ctgaacaatg gattcaagat gaacgtgaat tgaaaagaca gaagaggaaa 300caatctaata gagagtcagc cagaaggtcc agattacgca aacaggcaga gtgcgaggag 360ctacaagcca gggtacagaa tttgagcagt gacaatagca atctcagaaa tgaattgcag 420agtctctctg aagaatgcaa taagcttaaa tccgaaaatg attccattaa ggaggagttg 480actcggttgt atggaccaga agttgtagct aaacttgaac agagcaaccc tgcttcggtt 540ccagagtctc atggtggtga gggagacagt tga 57396190PRTPopulus trichocarpa 96Met Ala His Gln Glu Tyr Gly Ala Ser Lys Lys Gly Ser Phe Asn Gln 1 5 10 15 Met Leu Ala Asp Ala Asn Ala Gln Ser Thr Ser Ala Gly Ala Asn Ile 20 25 30 Gln Ala Ser Val Pro Gly Lys Pro Val Ala Ser Met Pro Ala Thr Asn 35 40 45 Leu Asn Ile Gly Met Asp Leu Trp Asn Ala Ser Ser Ala Ala Gly Ala 50 55 60 Thr Lys Met Arg Pro Asn Pro Ser Cys Ala Thr Ser Gly Val Val Pro 65 70 75 80 Ala Gly Leu Pro Glu Gln Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg 85 90 95 Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu 100 105 110 Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Ala Arg Val Gln Asn Leu 115 120 125 Ser Ser Asp Asn Ser Asn Leu Arg Asn Glu Leu Gln Ser Leu Ser Glu 130 135 140 Glu Cys Asn Lys Leu Lys Ser Glu Asn Asp Ser Ile Lys Glu Glu Leu 145 150 155 160 Thr Arg Leu Tyr Gly Pro Glu Val Val Ala Lys Leu Glu Gln Ser Asn 165 170 175 Pro Ala Ser Val Pro Glu Ser His Gly Gly Glu Gly Asp Ser 180 185 190 97852DNAPoncirus trifoliata 97atgccaccat atggcacccc agttccatac caagctatat atcctccagg gggagtatat 60gcacatccta gcatggctac 
gactccaaca gttgcaccaa caaatacaga gctggaaggg 120aagggacctg aagcaaagga ccgggcttca gctaaaaaat ccaagggaac tccaggaggt 180aaggctggag agattgtaaa ggcaacttct ggttctggga acgatggtgt ctctcaaagt 240gctgaaagtg gtagtgacgg ttcatctgat gcgagtgatg agaatggtaa ccagcaggag 300tttgctgggg gtaagaaagg aagctttgac cagatgcttg cagatgccaa cacggagaat 360aacacagtag aagctgttcc aggatcagtg cccgggaagc ctgtagtctc aatgcctgca 420actaatctca atattggcat ggatttgtgg aatacatccc ctgctgctgc tggagctgca 480aaaatgagaa caaatccatc tggggcctca ccagcagttg ctccagctgg cataatacct 540gatcaatgga ttcaagatga acgtgaattg aaaagacaga aaaggaagca atctaatagg 600gagtcagcca gaaggtcaag gttacgcaag caggcggaat gtgaggagct acaggccaga 660gtggagagtt tgagcaatga gaatcgcaac cttagagatg agttgcagag gctttctgag 720gaatgcgaga agcttacatc tgaaaataat tccattaagg aagacttatc tcggttgtgt 780ggaccagagg cagttgctaa tcttgagcag agcaacccca ctcagtcgtg cggggaagaa 840gaaaatagct aa 85298283PRTPoncirus trifoliata 98Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Gln Ala Ile Tyr Pro Pro 1 5 10 15 Gly Gly Val Tyr Ala His Pro Ser Met Ala Thr Thr Pro Thr Val Ala 20 25 30 Pro Thr Asn Thr Glu Leu Glu Gly Lys Gly Pro Glu Ala Lys Asp Arg 35 40 45 Ala Ser Ala Lys Lys Ser Lys Gly Thr Pro Gly Gly Lys Ala Gly Glu 50 55 60 Ile Val Lys Ala Thr Ser Gly Ser Gly Asn Asp Gly Val Ser Gln Ser 65 70 75 80 Ala Glu Ser Gly Ser Asp Gly Ser Ser Asp Ala Ser Asp Glu Asn Gly 85 90 95 Asn Gln Gln Glu Phe Ala Gly Gly Lys Lys Gly Ser Phe Asp Gln Met 100 105 110 Leu Ala Asp Ala Asn Thr Glu Asn Asn Thr Val Glu Ala Val Pro Gly 115 120 125 Ser Val Pro Gly Lys Pro Val Val Ser Met Pro Ala Thr Asn Leu Asn 130 135 140 Ile Gly Met Asp Leu Trp Asn Thr Ser Pro Ala Ala Ala Gly Ala Ala 145 150 155 160 Lys Met Arg Thr Asn Pro Ser Gly Ala Ser Pro Ala Val Ala Pro Ala 165 170 175 Gly Ile Ile Pro Asp Gln Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg 180 185 190 Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu 195 200 205 Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Ala Arg Val Glu Ser Leu 210 215 220 Ser Asn Glu Asn Arg Asn 
Leu Arg Asp Glu Leu Gln Arg Leu Ser Glu 225 230 235 240 Glu Cys Glu Lys Leu Thr Ser Glu Asn Asn Ser Ile Lys Glu Asp Leu 245 250 255 Ser Arg Leu Cys Gly Pro Glu Ala Val Ala Asn Leu Glu Gln Ser Asn 260 265 270 Pro Thr Gln Ser Cys Gly Glu Glu Glu Asn Ser 275 280 991272DNAPhaseolus vulgaris 99atgggaaaca gtgaggaagg gaaatctgtt aaaactggaa gtccttcttc accagctact 60accaatcaga caaaccagcc taactttcat gtctatcctg attgggctgc catgcagtat 120tatgggccga gagtcaacat tcctccatac ttcaactcgg ctgtggcttc tggtcatgct

180ccacacccat acatgtgggg tccaccacag cctatgatgc caccttatgg gccaccatat 240gcagcatttt attctcctgg aggggtttat actcaccctg cagttgctat tgggccacat 300tcacacggtc aaggagttcc atccccacct gctgctggga ctccttcaag tgtagattca 360ccaacaaaat tatctggaaa tactgatcaa gggttaatga aaaaattgaa agggtttgat 420gggcttgcaa tgtcaatagg caattgcaat gctgagagtg cggagcttgg agctgaaaac 480aggctgtcgc agagtgtgga tactgagggt tctagcgatg gaagtgatgg caacactgca 540ggggctaatc aaacaaaaat gaaaagaagc cgagaggaaa catcaaccac tgatggagaa 600gggaaaactg agacacaaga tgggccagtt tccaaagaga ctacatcttc gaaaatggtt 660atgtctgcta caccagctag tgttgcagga aagttagttg gtcctgtaat ttcttcaggt 720atgaccacag cactggagct taggaaacct ttgactgttc attctaagga aaatcccacg 780agtgccccac aaccttgtgc agctgtgcct cctgaagctt ggttacagaa tgagcgtgag 840ctgaaacggg agaggaggaa acaatctaac cgtgaatctg ctagaaggtc caggctgagg 900aagcaggccg agactgaaga attggcacga aaagttgaga tgttaactgc tgaaaatgtg 960tcactgaagt cagaaataac tcaattgact gaaggttctg agcagatgag gatggaaaat 1020tctgcattga gggaaaaact gagaaatact caactgggac aaagggaaga gataattttg 1080gacagcattg acagcaagag gtctacacct gtaagtactg aaaatttgct atcaagagtt 1140aataattcca gttctaatga tagaagtgca gagaatgaga gtgatttctg tgagaacaaa 1200ccaaattctg gtgcaaagct gcatcaacta ctggatacaa atcctagagc tgatgctgtt 1260gctgctgggt ga 1272100423PRTPhaseolus vulgaris 100Met Gly Asn Ser Glu Glu Gly Lys Ser Val Lys Thr Gly Ser Pro Ser 1 5 10 15 Ser Pro Ala Thr Thr Asn Gln Thr Asn Gln Pro Asn Phe His Val Tyr 20 25 30 Pro Asp Trp Ala Ala Met Gln Tyr Tyr Gly Pro Arg Val Asn Ile Pro 35 40 45 Pro Tyr Phe Asn Ser Ala Val Ala Ser Gly His Ala Pro His Pro Tyr 50 55 60 Met Trp Gly Pro Pro Gln Pro Met Met Pro Pro Tyr Gly Pro Pro Tyr 65 70 75 80 Ala Ala Phe Tyr Ser Pro Gly Gly Val Tyr Thr His Pro Ala Val Ala 85 90 95 Ile Gly Pro His Ser His Gly Gln Gly Val Pro Ser Pro Pro Ala Ala 100 105 110 Gly Thr Pro Ser Ser Val Asp Ser Pro Thr Lys Leu Ser Gly Asn Thr 115 120 125 Asp Gln Gly Leu Met Lys Lys Leu Lys Gly Phe Asp Gly Leu Ala Met 130 135 140 Ser Ile Gly Asn Cys Asn Ala 
Glu Ser Ala Glu Leu Gly Ala Glu Asn 145 150 155 160 Arg Leu Ser Gln Ser Val Asp Thr Glu Gly Ser Ser Asp Gly Ser Asp 165 170 175 Gly Asn Thr Ala Gly Ala Asn Gln Thr Lys Met Lys Arg Ser Arg Glu 180 185 190 Glu Thr Ser Thr Thr Asp Gly Glu Gly Lys Thr Glu Thr Gln Asp Gly 195 200 205 Pro Val Ser Lys Glu Thr Thr Ser Ser Lys Met Val Met Ser Ala Thr 210 215 220 Pro Ala Ser Val Ala Gly Lys Leu Val Gly Pro Val Ile Ser Ser Gly 225 230 235 240 Met Thr Thr Ala Leu Glu Leu Arg Lys Pro Leu Thr Val His Ser Lys 245 250 255 Glu Asn Pro Thr Ser Ala Pro Gln Pro Cys Ala Ala Val Pro Pro Glu 260 265 270 Ala Trp Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln 275 280 285 Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu 290 295 300 Thr Glu Glu Leu Ala Arg Lys Val Glu Met Leu Thr Ala Glu Asn Val 305 310 315 320 Ser Leu Lys Ser Glu Ile Thr Gln Leu Thr Glu Gly Ser Glu Gln Met 325 330 335 Arg Met Glu Asn Ser Ala Leu Arg Glu Lys Leu Arg Asn Thr Gln Leu 340 345 350 Gly Gln Arg Glu Glu Ile Ile Leu Asp Ser Ile Asp Ser Lys Arg Ser 355 360 365 Thr Pro Val Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Ser Ser 370 375 380 Ser Asn Asp Arg Ser Ala Glu Asn Glu Ser Asp Phe Cys Glu Asn Lys 385 390 395 400 Pro Asn Ser Gly Ala Lys Leu His Gln Leu Leu Asp Thr Asn Pro Arg 405 410 415 Ala Asp Ala Val Ala Ala Gly 420 1011053DNASolanum lycopersicum 101atgggagctg gggaagagag cactcctaca aagacttcaa agcctccttt aactcaggag 60acaccaaccg caccttcata tcctgattgg tcaagctcta tgcaggctta ttatagtgct 120ggagctactc ctcctttttt tgcctcacct gttgcttctc ctgctcccca cccatacatg 180tggggaggtc agcatcctct tatgcctcct tatgggactc cagttccata tccagcttta 240tatcctcctg ccggagttta tgctcatcct aacattgcca cgccggctcc aaattctgtg 300ccggcaaatc ctgaagcaga tgggaagggg cctgaaggaa aggatcggaa ttcaagtaaa 360aagttaaagg tctgttctgg tggtaaggca ggcgacaatg ggaaagttac ttcaggttcc 420ggaaatgatg gtgccacaca aagtgatgaa agcagaagtg aaggtacatc agatacaaat 480gatgaaaatg ataacaatga atttgctgca aacaagaagg gaagctttga tcaaatgctt 540cgagatggag ccagtgcaca gaataatcct 
gcgaaagaga atcacccgac ttctatacat 600ggaatctgta ccatgcctgc aactaaccta aatattggaa tggacgtgtg gaatgcatca 660gctgccggtc ctggagcgat caaaatacag caaaatgcaa ctggtccagt tataggacat 720gaaggaagga tgaatgatca gtggattcag gaggaacgtg aacttaaaag gcaaaagaga 780aagcaatcta atagggagtc agctaggagg tcgaggctcc gcaagcaggc agagtgtgaa 840gagctacaac gtagagtaga agctttgagc catgagaatc attcactcaa agatgagctc 900caacggctct ctgaggaatg tgagaagctt acctcggaga ataatttaat taaggaagag 960ttaacgctac tttgtggacc agacgttgtg tctaagctgg agagaaacga taatgtcaca 1020cgtattcaat ctaatgttga agaagctagt taa 1053102350PRTSolanum lycopersicum 102Met Gly Ala Gly Glu Glu Ser Thr Pro Thr Lys Thr Ser Lys Pro Pro 1 5 10 15 Leu Thr Gln Glu Thr Pro Thr Ala Pro Ser Tyr Pro Asp Trp Ser Ser 20 25 30 Ser Met Gln Ala Tyr Tyr Ser Ala Gly Ala Thr Pro Pro Phe Phe Ala 35 40 45 Ser Pro Val Ala Ser Pro Ala Pro His Pro Tyr Met Trp Gly Gly Gln 50 55 60 His Pro Leu Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Leu 65 70 75 80 Tyr Pro Pro Ala Gly Val Tyr Ala His Pro Asn Ile Ala Thr Pro Ala 85 90 95 Pro Asn Ser Val Pro Ala Asn Pro Glu Ala Asp Gly Lys Gly Pro Glu 100 105 110 Gly Lys Asp Arg Asn Ser Ser Lys Lys Leu Lys Val Cys Ser Gly Gly 115 120 125 Lys Ala Gly Asp Asn Gly Lys Val Thr Ser Gly Ser Gly Asn Asp Gly 130 135 140 Ala Thr Gln Ser Asp Glu Ser Arg Ser Glu Gly Thr Ser Asp Thr Asn 145 150 155 160 Asp Glu Asn Asp Asn Asn Glu Phe Ala Ala Asn Lys Lys Gly Ser Phe 165 170 175 Asp Gln Met Leu Arg Asp Gly Ala Ser Ala Gln Asn Asn Pro Ala Lys 180 185 190 Glu Asn His Pro Thr Ser Ile His Gly Ile Cys Thr Met Pro Ala Thr 195 200 205 Asn Leu Asn Ile Gly Met Asp Val Trp Asn Ala Ser Ala Ala Gly Pro 210 215 220 Gly Ala Ile Lys Ile Gln Gln Asn Ala Thr Gly Pro Val Ile Gly His 225 230 235 240 Glu Gly Arg Met Asn Asp Gln Trp Ile Gln Glu Glu Arg Glu Leu Lys 245 250 255 Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg 260 265 270 Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Arg Arg Val Glu Ala 275 280 285 Leu Ser His Glu Asn His Ser Leu Lys Asp Glu 
Leu Gln Arg Leu Ser 290 295 300 Glu Glu Cys Glu Lys Leu Thr Ser Glu Asn Asn Leu Ile Lys Glu Glu 305 310 315 320 Leu Thr Leu Leu Cys Gly Pro Asp Val Val Ser Lys Leu Glu Arg Asn 325 330 335 Asp Asn Val Thr Arg Ile Gln Ser Asn Val Glu Glu Ala Ser 340 345 350 103573DNASolanum lycopersicum 103atgacgggaa caggactttc tccttgcatg acaactttgg aaatgagaaa tcctgctagt 60gcacatatga aatctagccc aactaatggt ggttcaccac tcagccctgc actgcctaat 120gaaacctggt tacagaatga gcgtgagctg aagcgggaga aaaggaaaca gtctaatcgg 180gaatctgcaa ggcgatcaag attgagaaaa caggctgaag ctgaagaatt ggcaatacga 240gttcaggctt taacaggaga gaacttgaca ctcagatccg agattaacaa attaatggac 300aactcggaga aactgaagct agacaatgcc actttaatgg agagactgaa aaatgaacag 360cttggacaga cagaagaagt aagtttaggt aagattgatg ataagagact gcaacctgta 420ggcacagtaa acctgctagc acgagtgaac aactcaggtt cctcggatac aacgaacgag 480gatggtgaag tttatgagaa caacagctct ggagcaaagc ttcatcaact acttgatacc 540agccccagaa ctgatgcagt agcagctggg tga 573104190PRTSolanum lycopersicum 104Met Thr Gly Thr Gly Leu Ser Pro Cys Met Thr Thr Leu Glu Met Arg 1 5 10 15 Asn Pro Ala Ser Ala His Met Lys Ser Ser Pro Thr Asn Gly Gly Ser 20 25 30 Pro Leu Ser Pro Ala Leu Pro Asn Glu Thr Trp Leu Gln Asn Glu Arg 35 40 45 Glu Leu Lys Arg Glu Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 50 55 60 Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala Glu Glu Leu Ala Ile Arg 65 70 75 80 Val Gln Ala Leu Thr Gly Glu Asn Leu Thr Leu Arg Ser Glu Ile Asn 85 90 95 Lys Leu Met Asp Asn Ser Glu Lys Leu Lys Leu Asp Asn Ala Thr Leu 100 105 110 Met Glu Arg Leu Lys Asn Glu Gln Leu Gly Gln Thr Glu Glu Val Ser 115 120 125 Leu Gly Lys Ile Asp Asp Lys Arg Leu Gln Pro Val Gly Thr Val Asn 130 135 140 Leu Leu Ala Arg Val Asn Asn Ser Gly Ser Ser Asp Thr Thr Asn Glu 145 150 155 160 Asp Gly Glu Val Tyr Glu Asn Asn Ser Ser Gly Ala Lys Leu His Gln 165 170 175 Leu Leu Asp Thr Ser Pro Arg Thr Asp Ala Val Ala Ala Gly 180 185 190 105516DNASolanum tuberosum 105atgactacta ctccactagt aaaatcaagt ccaacttcac gaatcagtcc tgcagtgcca 60ggcgaagtct ggttacagaa 
tgaacgtgag ctgaagcggg agaagaggaa gcagtctaat 120cgagaatctg caaggagatc aaggttgaga aaacaggcgg aagctgaaga attggcagtg 180caggttcaat ctttaacctc tgaaaatttg gcactcagat tagaaataaa caaattcacc 240gaaaactctg agaaactaaa ggttgaaaat gctgctttaa tggagagact gaaaaacaag 300caaggacaag caaaagaggt aactttaggt atgattgatg ataaaaggct gaagcctgtt 360agcacagcag acctactagc aagagtcaac aacaacaatg gttcattcaa tagaaccaac 420gaagacggtg aagttcatga tagtacatct ggagcaaagc ttcgtcaact ccttgatgcc 480agtcccagga ctgatcatgc tgtggctgct agatga 516106171PRTSolanum tuberosum 106Met Thr Thr Thr Pro Leu Val Lys Ser Ser Pro Thr Ser Arg Ile Ser 1 5 10 15 Pro Ala Val Pro Gly Glu Val Trp Leu Gln Asn Glu Arg Glu Leu Lys 20 25 30 Arg Glu Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg 35 40 45 Leu Arg Lys Gln Ala Glu Ala Glu Glu Leu Ala Val Gln Val Gln Ser 50 55 60 Leu Thr Ser Glu Asn Leu Ala Leu Arg Leu Glu Ile Asn Lys Phe Thr 65 70 75 80 Glu Asn Ser Glu Lys Leu Lys Val Glu Asn Ala Ala Leu Met Glu Arg 85 90 95 Leu Lys Asn Lys Gln Gly Gln Ala Lys Glu Val Thr Leu Gly Met Ile 100 105 110 Asp Asp Lys Arg Leu Lys Pro Val Ser Thr Ala Asp Leu Leu Ala Arg 115 120 125 Val Asn Asn Asn Asn Gly Ser Phe Asn Arg Thr Asn Glu Asp Gly Glu 130 135 140 Val His Asp Ser Thr Ser Gly Ala Lys Leu Arg Gln Leu Leu Asp Ala 145 150 155 160 Ser Pro Arg Thr Asp His Ala Val Ala Ala Arg 165 170 107657DNASolanum tuberosum 107atgggagctg gggaagagag cacccctgca aagccttcga aagttactgc aactcaggaa 60acacaagcta caccttcata tcctgattgg tcttctatgc aggcttatta tggtgctgga 120cctacacctc ccttctttcc ctcaactgtt gcttctccca ctccccaccc atacatgtgg 180ggaggccagc atccgcttat gcctccttat ggagccccag tcccatatcc tgctttatat 240cctcctgctg gagtttatgc tcatcctaat atgcctatga ctccaaacac actgcaggca 300aatccagaat cagatagtaa ggcaccagat ggtaaggacc agaatacaag caaaaaattg 360aagggatgtt caggtggcaa ggcaggagaa attgggaaag cggcttcagg ttctggaaat 420gatggtggtg ccacaagaag tgctgaaagc ggaagtgaag gttcatcaga tgaaaatgat 480gaaaatgata accatgaatt ttctgcagac aagaatagaa gctttgatct aatgcttgct 540aatggagcca 
atgctcagac caatcctgca acagggaatc cagtcgctat gcccgcaact 600aatctgaata ttgggatgga tttgtggaac gcaaccgcct ggcgggtccc ggaatga 657108218PRTSolanum tuberosum 108Met Gly Ala Gly Glu Glu Ser Thr Pro Ala Lys Pro Ser Lys Val Thr 1 5 10 15 Ala Thr Gln Glu Thr Gln Ala Thr Pro Ser Tyr Pro Asp Trp Ser Ser 20 25 30 Met Gln Ala Tyr Tyr Gly Ala Gly Pro Thr Pro Pro Phe Phe Pro Ser 35 40 45 Thr Val Ala Ser Pro Thr Pro His Pro Tyr Met Trp Gly Gly Gln His 50 55 60 Pro Leu Met Pro Pro Tyr Gly Ala Pro Val Pro Tyr Pro Ala Leu Tyr 65 70 75 80 Pro Pro Ala Gly Val Tyr Ala His Pro Asn Met Pro Met Thr Pro Asn 85 90 95 Thr Leu Gln Ala Asn Pro Glu Ser Asp Ser Lys Ala Pro Asp Gly Lys 100 105 110 Asp Gln Asn Thr Ser Lys Lys Leu Lys Gly Cys Ser Gly Gly Lys Ala 115 120 125 Gly Glu Ile Gly Lys Ala Ala Ser Gly Ser Gly Asn Asp Gly Gly Ala 130 135 140 Thr Arg Ser Ala Glu Ser Gly Ser Glu Gly Ser Ser Asp Glu Asn Asp 145 150 155 160 Glu Asn Asp Asn His Glu Phe Ser Ala Asp Lys Asn Arg Ser Phe Asp 165 170 175 Leu Met Leu Ala Asn Gly Ala Asn Ala Gln Thr Asn Pro Ala Thr Gly 180 185 190 Asn Pro Val Ala Met Pro Ala Thr Asn Leu Asn Ile Gly Met Asp Leu 195 200 205 Trp Asn Ala Thr Ala Trp Arg Val Pro Glu 210 215 109300DNASolanum tuberosum 109atgggagctg gggaagagag cacccctgca aagccttcga aagttactgc aactcaggaa 60acacaagcta caccttcata tcctgcttta tatcctcctg ctggagttta tgctcatcct 120aatatgccta tgactccaaa cacactgcag gcaaatccag aatcagatag taaggcacca 180gatggtaagg accagaatac aagcaaaaaa ttgaagggat gttcaggtgg caaggcagga 240gaaattggga aagcggcttc aggttctgga aatgattggt ggtgccacaa gaagtgctga 30011099PRTSolanum tuberosum 110Met Gly Ala Gly Glu Glu Ser Thr Pro Ala Lys Pro Ser Lys Val Thr 1 5 10 15 Ala Thr Gln Glu Thr Gln Ala Thr Pro Ser Tyr Pro Ala Leu Tyr Pro 20 25 30 Pro Ala Gly Val Tyr Ala His Pro Asn Met Pro Met Thr Pro Asn Thr 35 40 45 Leu Gln Ala Asn Pro Glu Ser Asp Ser Lys Ala Pro Asp Gly Lys Asp 50 55 60 Gln Asn Thr Ser Lys Lys Leu Lys Gly Cys Ser Gly Gly Lys Ala Gly 65 70 75 80 Glu Ile Gly Lys Ala Ala Ser Gly Ser Gly Asn Asp 
Trp Trp Cys His 85 90 95 Lys Lys Cys 1111031DNATrifolium pratense 111atggggacta aggaggatag cacaactaaa ccttctaaat catcttcatc aactcaggag 60gtaccaacag taccaccacc atatccagat tggtcgcagg cctactataa tcccggagct 120gctccgcctc cctattatgc ctcaactgtt cctcagccaa ccccccatcc gtatatgtgg 180ggaagccagc atcctttaat ggcgccatat gggactccag tcccgtatcc tgctatgtac 240cctcctggaa atatctatgc tcatcctagc atggtagtgg ctccaagtgc tatgcaccaa 300actacagagt ttgaagggaa gggaccagat ggaaaggata aggattcatc taaaaaaccg 360aagggcactt ctgcgaatac aggtgctaaa gcaggagaga gtggaaaggc aggctcaggt 420tcaggcaatg atggcttttc acaaagtggt gaaagtggtt cagagggttc atcaaatggt 480agtgatgaga accaacagga atcagcgaga aacaagaagg gaggttttga cctcatgctt 540gttaatggag caaacgtaca gaacaataac actggaccca tttctcaatc acctgttcca 600ggaaatcctg ttgtctcgat acctgctact aatcttaata tcggaatgga tttatggaat 660gcatctcctg ctaatgctga agccaccaaa ctgagacaca atcaatctag tgcccctgga 720gctggtgaac aatggatgca acaagatgat cgtgagctga aaagacagaa gagaaaacag 780tctaatcgag agtcagctag gaggtcaaga ctacgcaagc aggctgagtg tgaagagcta 840caaaagaggg ttgaggcgtt gggaggtgag aatcgaactc tcagagaaga gcttcagaaa 900ctttctgaag

aatgcgagaa gcttacatct gaaaacaatt ctatcaagga agagttggaa 960cgattgtgtg ggccggaagt agttgctaat cttgaatgaa acaaacaaaa cagttcctcc 1020atgtcgtctt t 1031112343PRTTrifolium pratensemisc_feature(343)..(343)Xaa can be any naturally occurring amino acid 112Met Gly Thr Lys Glu Asp Ser Thr Thr Lys Pro Ser Lys Ser Ser Ser 1 5 10 15 Ser Thr Gln Glu Val Pro Thr Val Pro Pro Pro Tyr Pro Asp Trp Ser 20 25 30 Gln Ala Tyr Tyr Asn Pro Gly Ala Ala Pro Pro Pro Tyr Tyr Ala Ser 35 40 45 Thr Val Pro Gln Pro Thr Pro His Pro Tyr Met Trp Gly Ser Gln His 50 55 60 Pro Leu Met Ala Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Met Tyr 65 70 75 80 Pro Pro Gly Asn Ile Tyr Ala His Pro Ser Met Val Val Ala Pro Ser 85 90 95 Ala Met His Gln Thr Thr Glu Phe Glu Gly Lys Gly Pro Asp Gly Lys 100 105 110 Asp Lys Asp Ser Ser Lys Lys Pro Lys Gly Thr Ser Ala Asn Thr Gly 115 120 125 Ala Lys Ala Gly Glu Ser Gly Lys Ala Gly Ser Gly Ser Gly Asn Asp 130 135 140 Gly Phe Ser Gln Ser Gly Glu Ser Gly Ser Glu Gly Ser Ser Asn Gly 145 150 155 160 Ser Asp Glu Asn Gln Gln Glu Ser Ala Arg Asn Lys Lys Gly Gly Phe 165 170 175 Asp Leu Met Leu Val Asn Gly Ala Asn Val Gln Asn Asn Asn Thr Gly 180 185 190 Pro Ile Ser Gln Ser Pro Val Pro Gly Asn Pro Val Val Ser Ile Pro 195 200 205 Ala Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Asn Ala Ser Pro Ala 210 215 220 Asn Ala Glu Ala Thr Lys Leu Arg His Asn Gln Ser Ser Ala Pro Gly 225 230 235 240 Ala Gly Glu Gln Trp Met Gln Gln Asp Asp Arg Glu Leu Lys Arg Gln 245 250 255 Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 260 265 270 Lys Gln Ala Glu Cys Glu Glu Leu Gln Lys Arg Val Glu Ala Leu Gly 275 280 285 Gly Glu Asn Arg Thr Leu Arg Glu Glu Leu Gln Lys Leu Ser Glu Glu 290 295 300 Cys Glu Lys Leu Thr Ser Glu Asn Asn Ser Ile Lys Glu Glu Leu Glu 305 310 315 320 Arg Leu Cys Gly Pro Glu Val Val Ala Asn Leu Glu Asn Lys Gln Asn 325 330 335 Ser Ser Ser Met Ser Ser Xaa 340 1131065DNAMedicago truncatula 113atgtggggac caccacagcc tatgatgcat ccatatgggc cgccatatgc accaccattt 60tattcacatg gaggggttta 
tactcatcct gccgttgcca tcgggtcaaa ttcaaatggt 120caaggaattt catcttcacc tgctgctggg actcctacaa gcatagagac accgaccaaa 180tcatctggaa acactgatca gggtttaatg aaaaaattga aaggatttga cgggcttgca 240atgtcaatag gcaatggcaa tgctgaaagt gctgagcgtg gagctgaaaa ccggctatca 300cggagtgtgg atactgaggg ttccagcgat ggaagcgatg gcaacactac agggaccaat 360ggaacaagga aaagaagccg ggatgggaca ccaacaacca ctgatggaga agggaaaact 420gagatgccag atagtcaagt ttccaaagag actgctgctt ccaaaaagac agtgtcagtt 480atcacaagca gtgctgcaga aaatatggtt ggacctgtac tttcttcagg tatgaccaca 540tcactggaac tgaggaaccc ttcacctatt tccaccagtg ctccacaacc ttgtggagtt 600ttgcctcctg aagcttggat gcagaatgag cgtgagctga aacgtgagag gaggaaacaa 660tcaaatcgtg aatctgctag aagatccagg cttaggaagc aggctgaggc tgaagaattg 720gcacgaagag tcgatgcgtt gactgctgag aatttggcgc tgaaatcaga aatgaatgaa 780ttggctgaaa attcggcgaa gctgaagatt gaaaatgcta cattaaagga aaagctggaa 840aacactcaac tgggacaaac agaagagata attttgaacg gcatggacaa gagggctaca 900cctgtaagta cagaaaactt actgtcaaga gttaatgatt ccaattctga tgatagagct 960gcagaggaag aaaatggttt ctgtgagaac aaacccaatt ctggtgcaaa gctgcgtcaa 1020ctactcgaca caaatcctag agctaatgct gtggccgcta gttga 1065114354PRTMedicago truncatula 114Met Trp Gly Pro Pro Gln Pro Met Met His Pro Tyr Gly Pro Pro Tyr 1 5 10 15 Ala Pro Pro Phe Tyr Ser His Gly Gly Val Tyr Thr His Pro Ala Val 20 25 30 Ala Ile Gly Ser Asn Ser Asn Gly Gln Gly Ile Ser Ser Ser Pro Ala 35 40 45 Ala Gly Thr Pro Thr Ser Ile Glu Thr Pro Thr Lys Ser Ser Gly Asn 50 55 60 Thr Asp Gln Gly Leu Met Lys Lys Leu Lys Gly Phe Asp Gly Leu Ala 65 70 75 80 Met Ser Ile Gly Asn Gly Asn Ala Glu Ser Ala Glu Arg Gly Ala Glu 85 90 95 Asn Arg Leu Ser Arg Ser Val Asp Thr Glu Gly Ser Ser Asp Gly Ser 100 105 110 Asp Gly Asn Thr Thr Gly Thr Asn Gly Thr Arg Lys Arg Ser Arg Asp 115 120 125 Gly Thr Pro Thr Thr Thr Asp Gly Glu Gly Lys Thr Glu Met Pro Asp 130 135 140 Ser Gln Val Ser Lys Glu Thr Ala Ala Ser Lys Lys Thr Val Ser Val 145 150 155 160 Ile Thr Ser Ser Ala Ala Glu Asn Met Val Gly Pro Val Leu Ser Ser 165 170 175 Gly 
Met Thr Thr Ser Leu Glu Leu Arg Asn Pro Ser Pro Ile Ser Thr 180 185 190 Ser Ala Pro Gln Pro Cys Gly Val Leu Pro Pro Glu Ala Trp Met Gln 195 200 205 Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln Ser Asn Arg Glu 210 215 220 Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala Glu Glu Leu 225 230 235 240 Ala Arg Arg Val Asp Ala Leu Thr Ala Glu Asn Leu Ala Leu Lys Ser 245 250 255 Glu Met Asn Glu Leu Ala Glu Asn Ser Ala Lys Leu Lys Ile Glu Asn 260 265 270 Ala Thr Leu Lys Glu Lys Leu Glu Asn Thr Gln Leu Gly Gln Thr Glu 275 280 285 Glu Ile Ile Leu Asn Gly Met Asp Lys Arg Ala Thr Pro Val Ser Thr 290 295 300 Glu Asn Leu Leu Ser Arg Val Asn Asp Ser Asn Ser Asp Asp Arg Ala 305 310 315 320 Ala Glu Glu Glu Asn Gly Phe Cys Glu Asn Lys Pro Asn Ser Gly Ala 325 330 335 Lys Leu Arg Gln Leu Leu Asp Thr Asn Pro Arg Ala Asn Ala Val Ala 340 345 350 Ala Ser 1151197DNAVitis vinifera 115atgggggatg gtgaggaaag cacacctccc aagtcttcta aaccacctgc ttcaacacag 60gagacaccat caaccccttc ttatcctgac tggtcaacat ctatgcaggc ctactatggt 120gctggagcta ctccgcctcc ttttttccct tctcctgttg cacccccatc ccctcatccg 180tacctatggg gaggtcagca tcctatgatg ccaccatatg gaactccact tccataccca 240gctctctatc ctcgtggggc cctctatgct catcctagca tggctacggc tcagggtgtg 300gcactgacaa ataccgacat ggaagtaaag acccctgatg gaaaagaccc agcatcaatt 360aaaaaatcaa aggcagcttc aggaaacatg ggtttgatta gtggaaaatc tggggaaagc 420ggaaaggcag cttcagtttc tggcaatgat ggtgcttcac aaagtgggga gagtggtagt 480gaggcctcat cagatgcgac tgatgagaat gctaaccaag catcttctgc agtaaagaag 540agaagcttca accttgctga tggatcaaat gcaaagggta acagtgctgc tcagtacact 600ggtggaaatc attcagcctc agttccaggc aagcctgtgg tacctatgcc tacaaccagt 660ttaaatattg ggatggacct gtggaatgca tcccctgctg gaggcacacc catgaagaca 720agaccacagt catctggtgc ctcacctcaa gtggcttcag caactatagt tggacgtgaa 780ggcatgttac aggatcatca atggattcaa gatgaacgtg aactcaaacg acaaaggaga 840aagcaatcca atagggagtc agctaggaga tcaagattgc gtaagcaggc tgaatgtgaa 900gaattacaat caaaggttga aattttgagc aatgagaatc atgtgctgag agaggagctg 960cataggcttg 
ctgagcagtg cgagaagctt acatctgaga ataattccat aatggaggag 1020ttgacacaat tgtatgggcc agaggcaaca tctagccttc aagataacaa ccacaacttg 1080gttctccatc ctatcaatgg tgaagacgat ggccatgtac aagatgcttc ccctctaaac 1140aactccagtt ccacgtctga tcaaaatggg aaattcagct ccaatgggaa gatttga 1197116398PRTVitis vinifera 116Met Gly Asp Gly Glu Glu Ser Thr Pro Pro Lys Ser Ser Lys Pro Pro 1 5 10 15 Ala Ser Thr Gln Glu Thr Pro Ser Thr Pro Ser Tyr Pro Asp Trp Ser 20 25 30 Thr Ser Met Gln Ala Tyr Tyr Gly Ala Gly Ala Thr Pro Pro Pro Phe 35 40 45 Phe Pro Ser Pro Val Ala Pro Pro Ser Pro His Pro Tyr Leu Trp Gly 50 55 60 Gly Gln His Pro Met Met Pro Pro Tyr Gly Thr Pro Leu Pro Tyr Pro 65 70 75 80 Ala Leu Tyr Pro Arg Gly Ala Leu Tyr Ala His Pro Ser Met Ala Thr 85 90 95 Ala Gln Gly Val Ala Leu Thr Asn Thr Asp Met Glu Val Lys Thr Pro 100 105 110 Asp Gly Lys Asp Pro Ala Ser Ile Lys Lys Ser Lys Ala Ala Ser Gly 115 120 125 Asn Met Gly Leu Ile Ser Gly Lys Ser Gly Glu Ser Gly Lys Ala Ala 130 135 140 Ser Val Ser Gly Asn Asp Gly Ala Ser Gln Ser Gly Glu Ser Gly Ser 145 150 155 160 Glu Ala Ser Ser Asp Ala Thr Asp Glu Asn Ala Asn Gln Ala Ser Ser 165 170 175 Ala Val Lys Lys Arg Ser Phe Asn Leu Ala Asp Gly Ser Asn Ala Lys 180 185 190 Gly Asn Ser Ala Ala Gln Tyr Thr Gly Gly Asn His Ser Ala Ser Val 195 200 205 Pro Gly Lys Pro Val Val Pro Met Pro Thr Thr Ser Leu Asn Ile Gly 210 215 220 Met Asp Leu Trp Asn Ala Ser Pro Ala Gly Gly Thr Pro Met Lys Thr 225 230 235 240 Arg Pro Gln Ser Ser Gly Ala Ser Pro Gln Val Ala Ser Ala Thr Ile 245 250 255 Val Gly Arg Glu Gly Met Leu Gln Asp His Gln Trp Ile Gln Asp Glu 260 265 270 Arg Glu Leu Lys Arg Gln Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala 275 280 285 Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Ser 290 295 300 Lys Val Glu Ile Leu Ser Asn Glu Asn His Val Leu Arg Glu Glu Leu 305 310 315 320 His Arg Leu Ala Glu Gln Cys Glu Lys Leu Thr Ser Glu Asn Asn Ser 325 330 335 Ile Met Glu Glu Leu Thr Gln Leu Tyr Gly Pro Glu Ala Thr Ser Ser 340 345 350 Leu Gln Asp Asn Asn His Asn Leu Val 
Leu His Pro Ile Asn Gly Glu 355 360 365 Asp Asp Gly His Val Gln Asp Ala Ser Pro Leu Asn Asn Ser Ser Ser 370 375 380 Thr Ser Asp Gln Asn Gly Lys Phe Ser Ser Asn Gly Lys Ile 385 390 395 1171095DNAVitis vinifera 117atgggggctg gggaagatac cacacctact aagccttcca aaccaacttc ttcagctcag 60gaaatgccaa cgacgccctc atatcctgag tggtcgagct ctatgcaggc ttattatggt 120cctggagcta caccacctcc cttttttgct ccctctgttg cttctccaac tccccatcca 180tatctgtggg gaagccagca tcctttaatt cctccatatg gaacaccagt tccatactca 240gccttatatc ctccaggagg tgtttatgca catcctaatt tggccacggc tccaagtgca 300gcacatttaa accctgagtt ggaagggaaa ggccctgagg gaaaagacaa ggcttcagca 360aagaaatcta aaggaacatc aggaaatact gttaagggtg gcgagagtgg aaaggcagct 420tcaggctcag gaaatgatgg tgcctcacca agtgctgaaa gtggaagtga gggttcatca 480gatgcaagtg atgagaatac taaccaacaa gaatttgctt ctagtaagaa gggaagtttc 540aatcagatgc ttgctgatgc caatgcacag aataacatct ctggaacaag tgttcaggct 600tcagttcctg ggaagcctgt aatatctatg cctgcaacta atctaaatat tgggatggac 660ttatggagtg catctcctgg gggctctgga gctacaaaac tgagaccaaa tccatctggc 720atctcatctt ctgttgctcc agcagcaatg gttgggcgtg aaggcgttat gcccgaccag 780tggattcaag atgaacgtga actcaaaaga caaaagagga aacaatctaa cagggagtca 840gctaggaggt cgagattacg gaagcaggcg gagtgtgagg aactacaagc aaaggtagaa 900actttgagca ctgagaatac tgcactcaga gatgagctgc agaggctttc tgaggaatgc 960gagaagctta catctgaaaa taattccatt aaggaagaat tgactcgggt atgtggagca 1020gatgcagtgg ctgcaaacct caaagagaaa aaccccacac aactccaatc tcagggcgtc 1080gagggcaaca gttga 1095118364PRTVitis vinifera 118Met Gly Ala Gly Glu Asp Thr Thr Pro Thr Lys Pro Ser Lys Pro Thr 1 5 10 15 Ser Ser Ala Gln Glu Met Pro Thr Thr Pro Ser Tyr Pro Glu Trp Ser 20 25 30 Ser Ser Met Gln Ala Tyr Tyr Gly Pro Gly Ala Thr Pro Pro Pro Phe 35 40 45 Phe Ala Pro Ser Val Ala Ser Pro Thr Pro His Pro Tyr Leu Trp Gly 50 55 60 Ser Gln His Pro Leu Ile Pro Pro Tyr Gly Thr Pro Val Pro Tyr Ser 65 70 75 80 Ala Leu Tyr Pro Pro Gly Gly Val Tyr Ala His Pro Asn Leu Ala Thr 85 90 95 Ala Pro Ser Ala Ala His Leu Asn Pro Glu Leu Glu Gly 
Lys Gly Pro 100 105 110 Glu Gly Lys Asp Lys Ala Ser Ala Lys Lys Ser Lys Gly Thr Ser Gly 115 120 125 Asn Thr Val Lys Gly Gly Glu Ser Gly Lys Ala Ala Ser Gly Ser Gly 130 135 140 Asn Asp Gly Ala Ser Pro Ser Ala Glu Ser Gly Ser Glu Gly Ser Ser 145 150 155 160 Asp Ala Ser Asp Glu Asn Thr Asn Gln Gln Glu Phe Ala Ser Ser Lys 165 170 175 Lys Gly Ser Phe Asn Gln Met Leu Ala Asp Ala Asn Ala Gln Asn Asn 180 185 190 Ile Ser Gly Thr Ser Val Gln Ala Ser Val Pro Gly Lys Pro Val Ile 195 200 205 Ser Met Pro Ala Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Ser Ala 210 215 220 Ser Pro Gly Gly Ser Gly Ala Thr Lys Leu Arg Pro Asn Pro Ser Gly 225 230 235 240 Ile Ser Ser Ser Val Ala Pro Ala Ala Met Val Gly Arg Glu Gly Val 245 250 255 Met Pro Asp Gln Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg Gln Lys 260 265 270 Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys 275 280 285 Gln Ala Glu Cys Glu Glu Leu Gln Ala Lys Val Glu Thr Leu Ser Thr 290 295 300 Glu Asn Thr Ala Leu Arg Asp Glu Leu Gln Arg Leu Ser Glu Glu Cys 305 310 315 320 Glu Lys Leu Thr Ser Glu Asn Asn Ser Ile Lys Glu Glu Leu Thr Arg 325 330 335 Val Cys Gly Ala Asp Ala Val Ala Ala Asn Leu Lys Glu Lys Asn Pro 340 345 350 Thr Gln Leu Gln Ser Gln Gly Val Glu Gly Asn Ser 355 360 11929PRTArtificial sequencemotif 1 119Glu Leu Lys Arg Xaa Xaa Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 1 5 10 15 Arg Ser Arg Leu Arg Lys Gln Ala Glu Xaa Glu Glu Leu 20 25 12021PRTArtificial sequencemotif 2 120Xaa Xaa Xaa Val Glu Xaa Leu Xaa Xaa Glu Asn Xaa Xaa Leu Xaa Xaa 1 5 10 15 Glu Xaa Xaa Xaa Xaa 20 1218PRTArtificial sequencemotif 3 121Xaa Xaa Xaa Xaa Xaa Xaa Xaa Trp 1 5 12231PRTArtificial sequencemotif 4 122Arg Glu Leu Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala 1 5 10 15 Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln 20 25 30 12312PRTArtificial sequencemotif 5 123Xaa Thr Asn Leu Asn Xaa Gly Met Asp Xaa Trp Asn 1 5 10 12416PRTArtificial sequencemotif 6 124Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Xaa Tyr Pro Pro 1 5 10 15 
12547PRTArtificial sequencemotif 7 125Asn Glu Xaa Glu Leu Lys Arg Glu Xaa Arg Lys Gln Ser Asn Arg Glu 1 5 10 15 Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Xaa Glu Glu Leu 20 25 30 Ala Xaa Xaa Val Xaa Xaa Leu Thr Xaa Glu Asn Xaa Xaa Leu Xaa 35 40 45 12631PRTArtificial sequencemotif 8 126Xaa Xaa Xaa Pro Xaa Lys Ser Ser Gly Asn Thr Asp Xaa Gly Leu Xaa 1 5 10 15 Xaa Lys Leu Lys Xaa Phe Asp Gly Leu Xaa Met Ser Ile Gly Asn 20 25 30 12713PRTArtificial sequencemotif 9 127Xaa Pro Gln Xaa Met Met Pro Pro Tyr Gly Xaa Pro Tyr 1 5 10

12822PRTArtificial sequencemotif 10 128Asn Ser Gly Ala Lys Leu Xaa Gln Leu Leu Asp Xaa Xaa Pro Arg Xaa 1 5 10 15 Asp Ala Val Ala Ala Gly 20 12922PRTArtificial sequencemotif 11 129Glu Ile Xaa Xaa Xaa Thr Glu Xaa Ser Glu Lys Xaa Xaa Xaa Xaa Asn 1 5 10 15 Xaa Xaa Leu Xaa Xaa Xaa 20 13050PRTArtificial sequencemotif 12 130Trp Leu Gln Asn Glu Xaa Glu Leu Lys Arg Glu Xaa Arg Lys Gln Ser 1 5 10 15 Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Xaa 20 25 30 Glu Glu Leu Ala Xaa Xaa Val Xaa Xaa Leu Thr Xaa Glu Asn Xaa Xaa 35 40 45 Leu Xaa 50 1312194DNAOryza sativa 131aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag 
gggatctgta tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa tgcttttata gcgttatcct 1800agctgtagtt cagttaatag gtaatacccc tatagtttag tcaggagaag aacttatccg 1860atttctgatc tccattttta attatatgaa atgaactgta gcataagcag tattcatttg 1920gattattttt tttattagct ctcacccctt cattattctg agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt ggttattcct tgactgcttg attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt atactgcttg ttcttatgat tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt caccagcaaa gttc 21941323418DNAArtificial sequenceexpression cassette with SEQ ID NO 1 132aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata 
atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa tgcttttata gcgttatcct 1800agctgtagtt cagttaatag gtaatacccc tatagtttag tcaggagaag aacttatccg 1860atttctgatc tccattttta attatatgaa atgaactgta gcataagcag tattcatttg 1920gattattttt tttattagct ctcacccctt cattattctg agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt ggttattcct tgactgcttg attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt atactgcttg ttcttatgat tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt caccagcaaa gttcatttaa atcaactagg gatatcacaa 2220gtttgtacaa aaaagcaggc ttaaacaatg cctccttatg ggactccagt tccatatcca 2280gctttatatc ctcctgccgg agtttatgct catcctaaca ttgccacgcc ggctccaaat 2340tctgtgccgg caaatcctga agcagatggg aaggggcctg aaggaaagga tcggaattca 2400agtaaaaagt 
taaaggtctg ttctggtggt aaggcaggcg acaatgggaa agttacttca 2460ggttccggaa atgatggtgc cacacaaagt gatgaaagca gaagtgaagg tacatcagat 2520acaaatgatg aaaatgataa caatgaattt gctgcaaaca agaagggaag ctttgatcaa 2580atgcttgcag atggagccag tgcacagaat aatcctgcga aagagaatca cccgacttct 2640atacatggaa atcctgtcac catgcctgca actaacctaa atattggaat ggacgtgtgg 2700aatgcatcag ctgccggtcc tggagcgatc aaaatacagc aaaatgcaac tggtccagtt 2760ataggacatg aaggaaggat gaatgatcag tggattcagg aggaacgtga acttaaaagg 2820caaaagagaa agcaatctaa tagggagtca gctaggaggt cgaggctccg caagcaggca 2880gagtgtgaag agctacaacg tagagtagaa gctttgagcc atgagaatca ttcactcaaa 2940gatgagctcc aacggctctc tgaggaatgt gagaagctta cctcggagaa taatttaatt 3000aaggaagagt taacgctact ttgtggacca gacgttgtgt ctaagctgga gagaaacgat 3060aatgtcacac gtattcaatc taatgttgaa gaagctagtt aaggagaagt ggaaaagcac 3120ccagctttct tgtacaaagt ggtgatatca caagcccggg cggtcttcta gggataacag 3180ggtaattata tccctctaga tcacaagccc gggcggtctt ctacgatgat tgagtaataa 3240tgtgtcacgc atcaccatgg gtggcagtgt cagtgtgagc aatgacctga atgaacaatt 3300gaaatgaaaa gaaaaaaagt actccatctg ttccaaatta aaattggttt taacctttta 3360ataggtttat acaataattg atatatgttt tctgtatatg tctaatttgt tatcatcc 34181333760DNAArtificial sequenceexpression cassette with SEQ ID NO 3 133aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg 
tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa tgcttttata gcgttatcct 1800agctgtagtt cagttaatag gtaatacccc tatagtttag tcaggagaag aacttatccg 1860atttctgatc tccattttta attatatgaa atgaactgta gcataagcag tattcatttg 1920gattattttt tttattagct ctcacccctt cattattctg agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt ggttattcct tgactgcttg attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt atactgcttg ttcttatgat tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt caccagcaaa gttcatttaa atcaactagg gatatcacaa 2220gtttgtacaa aaaagcaggc ttaaacaatg ggaaacattg aagagggaaa gtcttccact 2280tctgataaat cttcacctgc accaccggat cagaccaata ttcatgtgta tcctgatggg 2340gcagctatgc aggcatatta 
tggcccccga gtggctctcc caccatatta caactcggcc 2400gtggcttctg gtcatgcccc tcatccttat atgtggggcc tgccacagcc tatgatgcca 2460ccttatgggg caccttatgc aacagtctac tcacatggag tgtatgcaca tccggctgtt 2520ccaattgtat cccatcctca tggtcctggg attgtgtcat ctcctgcagc tggaaccctt 2580ttgagtgcag aaacacctac aaaatcttca ggaaatactg atcgaggttt agtgaataag 2640ttgaaaggat ttgatgggct tgcaatgtca ataggcaatg gtaatgctga gactgtcgag 2700ggtgggggta ggctgtctca aagtgtggag atagaagttt ccagtgatgg aattgatggg 2760aatacaacta ggggaaagaa aaggagccgt gagggaacac caactgttgc aacaggtgga 2820gatacaaaaa tggagtcaca ttccagtccc cttcctagag aggtgaatgc atccactgac 2880aatgtattga gggcagctgt tgctcctggc atgaccacag cattggagct taggaaccct 2940cctagtgtga atgctgctaa gacaagtcct actacgattc ctcaatctgg tgtagtcctg 3000ccctctgaag cctggttaca gaatgagctg gagctgaaac gggagaagag gaaacaatca 3060aatcgagaat ctgccagaag gtcaagatta aggaagcagg ctgaggctga agaacttgca 3120cacaaagttg aagtactcac cacagaaaac atggcactcc aatctgaaat aagtcaattt 3180acagagaaat cagagaaact aaggcttgaa aatgctgcat taacggagaa actcaagaat 3240gcacaattag gacatgcgca agaaatgatt ttaaacattg atgagcacag ggccccagct 3300gttagtacag aaaacttgct atcaagagtt aacaattctg cctttgaaga agagagtgat 3360ctgtatgaac gaaactcaaa ttctggtgcc aagctgcatc aactcttgga tgcaagcccc 3420agagccgatg ctgtggctgc tggttgatga cactggttca acccagcttt cttgtacaaa 3480gtggtgatat cacaagcccg ggcggtcttc tagggataac agggtaatta tatccctcta 3540gatcacaagc ccgggcggtc ttctacgatg attgagtaat aatgtgtcac gcatcaccat 3600gggtggcagt gtcagtgtga gcaatgacct gaatgaacaa ttgaaatgaa aagaaaaaaa 3660gtactccatc tgttccaaat taaaattggt tttaaccttt taataggttt atacaataat 3720tgatatatgt tttctgtata tgtctaattt gttatcatcc 376013454DNAArtificial sequenceprimer prm009943 134ggggacaagt ttgtacaaaa aagcaggctt aaacaatgcc tccttatggg actc 5413550DNAArtificial sequenceprimer prm009944 135ggggaccact ttgtacaaga aagctgggtg cttttccact tctccttaac 5013655DNAArtificial sequenceprimer prm17402 136ggggacaagt ttgtacaaaa aagcaggctt aaacaatggg aaacattgaa gaggg 5513750DNAArtificial sequenceprimer 
prm17403 137ggggaccact ttgtacaaga aagctgggtt gaaccagtgt catcaaccag 501389DNAArtificial sequenceGLM oligonucleotide 138gtgagtcat 91398DNAArtificial sequenceABRE oligonucleotide 139ccacgtgg 81406DNAArtificial sequencePB-like oligonucleotide 140tgaaaa 61411242DNAPopulus trichocarpa 141atggagagaa gcgccgtctt tggtggtctg caaccaaatt accttcttta cccctcaccc 60aactcttcat cccttccttt ctcagaccac cgcgctagac ttccaaattt ctctcctcct 120ccctctctgt ctctcaagat acataagcag gtttcttctt gttttaaagc tgtgtctcct 180tttaagcgtg gagctgcgtt ttctgataca cacagtgaca catttgaatt agctgacata 240gactgggatg accttggatt tgcatacgtt cccactgatt atatgtattc aatgaaatgc 300actaaaggtg gaaacttttc caaaggtgaa ttacagagat atggaaacat tgaactgaac 360ccttctgctg gcgtcttaaa ttatggccag ggattgtttg aaggtctgaa agcctacagg 420aaagaagatg gtaaccttct tctatttcgt cctgaggaaa atgctatgcg gatgataatg 480ggtgcagaga ggatgtgcat gccatcaccg acaattgatc agtttgtgga tgcagtaaaa 540gcaactgttt tagcaaacaa acgttgggtt cctcctccag gtaaaggttc cttatatatc 600agaccattgc taatggggag tggagctgtt cttggtcttg cacctgctcc tgagtatacc 660tttctcattt atgtttcacc ggtggggaac tattttaagg aaggtgtggc accaattcat 720ttaattgtgg agcatgaact tcatcgagca actcctggtg gcactggagg tgtgaagact 780atagggaatt atgctgcggt tctcaaggca caatctgctg caaaagccag aggtttttct 840gacgttttat atcttgattg tgtacataaa aagtatctag aagaggtttc ctcttgcaac 900atttttgttg tgaagggtaa cagcatctcc actcctgcaa taaaagggac aatcctacca 960ggaattacaa ggaagagcat aattgatgtt gctcgaagcc aaggatttca ggttgaggaa 1020cggcttgtga cagtagatga attgcttgat gctgatgagg ttttttgtac cggaacagct 1080gttgttgtgt cacctgtggg aagcatcacc tacaagggta aaagggtgtc ttatggcgta 1140gaaggttttg gtgctgtctc gcaacaactc tatagtgtgc taaccaagct acagatgggc 1200cttatagagg acaagatgaa ttggactgtg gagctgagtt ag 1242142413PRTPopulus trichocarpa 142Met Glu Arg Ser Ala Val Phe Gly Gly Leu Gln Pro Asn Tyr Leu Leu 1 5 10 15 Tyr Pro Ser Pro Asn Ser Ser Ser Leu Pro Phe Ser Asp His Arg Ala 20 25 30 Arg Leu Pro Asn Phe Ser Pro Pro Pro Ser Leu Ser Leu Lys Ile His 35 40 45 Lys Gln Val Ser Ser Cys Phe 
Lys Ala Val Ser Pro Phe Lys Arg Gly 50 55 60 Ala Ala Phe Ser Asp Thr His Ser Asp Thr Phe Glu Leu Ala Asp Ile 65 70 75 80 Asp Trp Asp Asp Leu Gly Phe Ala Tyr Val Pro Thr Asp Tyr Met Tyr 85 90 95 Ser Met Lys Cys Thr Lys Gly Gly Asn Phe Ser Lys Gly Glu Leu Gln 100 105 110 Arg Tyr Gly Asn Ile Glu Leu Asn Pro Ser Ala Gly Val Leu Asn Tyr 115 120 125 Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Glu Asp Gly 130 135 140 Asn Leu Leu Leu Phe Arg Pro Glu Glu Asn Ala Met Arg Met Ile Met 145 150 155 160 Gly Ala Glu Arg Met Cys Met Pro Ser Pro Thr Ile Asp Gln Phe Val 165 170 175 Asp Ala Val Lys Ala Thr Val Leu Ala Asn Lys Arg Trp Val Pro Pro 180 185 190 Pro Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Met Gly Ser Gly 195 200 205 Ala Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr 210 215 220 Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Val Ala Pro Ile His 225 230 235 240 Leu Ile Val Glu His Glu Leu His Arg Ala Thr Pro Gly Gly Thr Gly 245 250 255 Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala Val Leu Lys Ala Gln Ser 260 265 270 Ala Ala Lys Ala Arg Gly Phe Ser Asp Val Leu Tyr Leu Asp Cys Val 275 280 285 His Lys Lys Tyr Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val 290 295 300 Lys Gly Asn Ser Ile Ser Thr Pro Ala Ile Lys Gly Thr Ile Leu Pro 305 310 315 320 Gly Ile Thr Arg Lys Ser Ile Ile Asp Val Ala Arg Ser Gln Gly Phe 325 330 335 Gln Val Glu Glu Arg Leu Val Thr Val Asp Glu Leu Leu Asp Ala Asp 340 345 350 Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Ser Pro Val Gly Ser 355 360 365 Ile Thr Tyr Lys Gly Lys Arg Val Ser Tyr Gly Val Glu Gly Phe Gly 370 375 380 Ala Val Ser Gln Gln Leu Tyr Ser Val Leu Thr Lys Leu Gln Met Gly 385 390 395 400 Leu Ile Glu Asp Lys Met Asn Trp Thr Val Glu Leu Ser 405 410 143930DNAArabidopsis thaliana 143atggctcttc gtcgctgctt
acctcaatat tcaacaactt catcttatct ctccaagatc 60tggggatttc gtatgcatgg gaccaaggca gcagcttctg ttgtagaaga acatgtctcg 120ggggcagaac gtgaggatga agaatatgct gatgtagatt gggacaacct tggattcagt 180cttgtacgga cagatttcat gttcgccacc aaaagttgca gagacggaaa cttcgaacag 240ggttacctta gccgttacgg caacatcgag ctcaaccctg ctgctggaat tctcaactat 300ggccagggac taatagaggg gatgaaagcg tacagaggag aagacggtag ggttcttctc 360ttccgtccag agctaaacgc gatgcgtatg aagataggag ctgagagaat gtgtatgcat 420tctccttctg ttcatcagtt tattgaaggt gttaagcaga ccgttcttgc aaacaggcgt 480tgggttcctc ctccgggcaa aggctcgttg tatctcagac cgttgttgtt cggaagtgga 540gcaagcttgg gtgtggctgc agcatcagag tacacgtttc ttgtgtttgg ctctcctgtt 600caaaactact tcaaggaagg cacagcggcg ttgaacctgt atgtggagga ggtgattcca 660cgcgcttatc ttggaggaac tggtggtgta aaggcaatat ccaattacgg tccagtgctt 720gaagtgatga gaagagcaaa atcaagaggg ttttcggatg ttttgtatct tgatgcagat 780actgggaaga acatagaaga agtctctgct gctaatatat tccttgtgaa gggcaataca 840atagtgacac cagctacgag cggaacgatt ctcggaggga tcacacgtgg aagaacgaag 900tgttccggta gaagaactga aggaagctga 930144309PRTArabidopsis thaliana 144Met Ala Leu Arg Arg Cys Leu Pro Gln Tyr Ser Thr Thr Ser Ser Tyr 1 5 10 15 Leu Ser Lys Ile Trp Gly Phe Arg Met His Gly Thr Lys Ala Ala Ala 20 25 30 Ser Val Val Glu Glu His Val Ser Gly Ala Glu Arg Glu Asp Glu Glu 35 40 45 Tyr Ala Asp Val Asp Trp Asp Asn Leu Gly Phe Ser Leu Val Arg Thr 50 55 60 Asp Phe Met Phe Ala Thr Lys Ser Cys Arg Asp Gly Asn Phe Glu Gln 65 70 75 80 Gly Tyr Leu Ser Arg Tyr Gly Asn Ile Glu Leu Asn Pro Ala Ala Gly 85 90 95 Ile Leu Asn Tyr Gly Gln Gly Leu Ile Glu Gly Met Lys Ala Tyr Arg 100 105 110 Gly Glu Asp Gly Arg Val Leu Leu Phe Arg Pro Glu Leu Asn Ala Met 115 120 125 Arg Met Lys Ile Gly Ala Glu Arg Met Cys Met His Ser Pro Ser Val 130 135 140 His Gln Phe Ile Glu Gly Val Lys Gln Thr Val Leu Ala Asn Arg Arg 145 150 155 160 Trp Val Pro Pro Pro Gly Lys Gly Ser Leu Tyr Leu Arg Pro Leu Leu 165 170 175 Phe Gly Ser Gly Ala Ser Leu Gly Val Ala Ala Ala Ser Glu Tyr Thr 180 185 190 Phe Leu Val Phe 
Gly Ser Pro Val Gln Asn Tyr Phe Lys Glu Gly Thr 195 200 205 Ala Ala Leu Asn Leu Tyr Val Glu Glu Val Ile Pro Arg Ala Tyr Leu 210 215 220 Gly Gly Thr Gly Gly Val Lys Ala Ile Ser Asn Tyr Gly Pro Val Leu 225 230 235 240 Glu Val Met Arg Arg Ala Lys Ser Arg Gly Phe Ser Asp Val Leu Tyr 245 250 255 Leu Asp Ala Asp Thr Gly Lys Asn Ile Glu Glu Val Ser Ala Ala Asn 260 265 270 Ile Phe Leu Val Lys Gly Asn Thr Ile Val Thr Pro Ala Thr Ser Gly 275 280 285 Thr Ile Leu Gly Gly Ile Thr Arg Gly Arg Thr Lys Cys Ser Gly Arg 290 295 300 Arg Thr Glu Gly Ser 305 1451167DNAArabidopsis thaliana 145atgatcaaaa caatcacatc tctacgcaaa actctggttc tacctcttca tttacatatt 60cgtacgctac aaactttcgc caagtacaac gcacaagctg catcggcttt gcgagaagag 120cgtaagaaac ctctttatca aaatggagat gatgtatatg cggatttgga ttgggataat 180ctcgggtttg gtctaaatcc agctgattac atgtatgtca tgaaatgctc aaaagacggc 240gaattcactc aaggagaact tagtccctat gggaatattc agctaagtcc ttctgctgga 300gtcttaaact atggacaggc gatatacgaa ggtacaaaag catacaggaa agaaaatggg 360aagcttcttt tgtttcgtcc ggatcacaac gctatccgga tgaagcttgg cgctgaacgg 420atgctcatgc cttctccttc ggttgatcag tttgttaatg cagttaaaca aaccgctctt 480gcaaacaaac gttgggttcc tcctgcaggg aaagggactt tgtacattag gcctttgttg 540atgggaagtg gtccaatact tggtttaggt cctgcacctg aatatacatt cattgtctat 600gcatctccag ttggtaacta cttcaaggaa gggatggctg ctcttaacct ctatgttgag 660gaagaatatg tccgagcggc tcctggtgga gctggaggcg tcaagagcat cacaaattat 720gcgccagttt tgaaagcact gagcagagcc aagagtcggg ggttttcaga cgttctttat 780ctcgactctg tcaagaagaa gtacttagag gaggcttctt cttgcaacgt ctttgttgtc 840aagggtcgga caatctcaac tcctgcaact aatggaacaa ttcttgaagg gattacgcgg 900aaaagtgtga tggagatcgc aagtgatcaa ggttatcagg tagtagagaa ggcagttcat 960gtggatgaag taatggatgc agatgaagtt ttttgcaccg gaactgctgt agtagttgct 1020cccgtgggca ctatcacata tcaggaaaaa agagtagagt ataaaaccgg ggatgaatct 1080gtctgccaga aactgcgttc agtcctcgta ggtatccaga caggattgat tgaagataac 1140aagggatggg tcacagatat caactga 1167146388PRTArabidopsis thaliana 146Met Ile Lys Thr Ile Thr Ser Leu Arg Lys 
Thr Leu Val Leu Pro Leu 1 5 10 15 His Leu His Ile Arg Thr Leu Gln Thr Phe Ala Lys Tyr Asn Ala Gln 20 25 30 Ala Ala Ser Ala Leu Arg Glu Glu Arg Lys Lys Pro Leu Tyr Gln Asn 35 40 45 Gly Asp Asp Val Tyr Ala Asp Leu Asp Trp Asp Asn Leu Gly Phe Gly 50 55 60 Leu Asn Pro Ala Asp Tyr Met Tyr Val Met Lys Cys Ser Lys Asp Gly 65 70 75 80 Glu Phe Thr Gln Gly Glu Leu Ser Pro Tyr Gly Asn Ile Gln Leu Ser 85 90 95 Pro Ser Ala Gly Val Leu Asn Tyr Gly Gln Ala Ile Tyr Glu Gly Thr 100 105 110 Lys Ala Tyr Arg Lys Glu Asn Gly Lys Leu Leu Leu Phe Arg Pro Asp 115 120 125 His Asn Ala Ile Arg Met Lys Leu Gly Ala Glu Arg Met Leu Met Pro 130 135 140 Ser Pro Ser Val Asp Gln Phe Val Asn Ala Val Lys Gln Thr Ala Leu 145 150 155 160 Ala Asn Lys Arg Trp Val Pro Pro Ala Gly Lys Gly Thr Leu Tyr Ile 165 170 175 Arg Pro Leu Leu Met Gly Ser Gly Pro Ile Leu Gly Leu Gly Pro Ala 180 185 190 Pro Glu Tyr Thr Phe Ile Val Tyr Ala Ser Pro Val Gly Asn Tyr Phe 195 200 205 Lys Glu Gly Met Ala Ala Leu Asn Leu Tyr Val Glu Glu Glu Tyr Val 210 215 220 Arg Ala Ala Pro Gly Gly Ala Gly Gly Val Lys Ser Ile Thr Asn Tyr 225 230 235 240 Ala Pro Val Leu Lys Ala Leu Ser Arg Ala Lys Ser Arg Gly Phe Ser 245 250 255 Asp Val Leu Tyr Leu Asp Ser Val Lys Lys Lys Tyr Leu Glu Glu Ala 260 265 270 Ser Ser Cys Asn Val Phe Val Val Lys Gly Arg Thr Ile Ser Thr Pro 275 280 285 Ala Thr Asn Gly Thr Ile Leu Glu Gly Ile Thr Arg Lys Ser Val Met 290 295 300 Glu Ile Ala Ser Asp Gln Gly Tyr Gln Val Val Glu Lys Ala Val His 305 310 315 320 Val Asp Glu Val Met Asp Ala Asp Glu Val Phe Cys Thr Gly Thr Ala 325 330 335 Val Val Val Ala Pro Val Gly Thr Ile Thr Tyr Gln Glu Lys Arg Val 340 345 350 Glu Tyr Lys Thr Gly Asp Glu Ser Val Cys Gln Lys Leu Arg Ser Val 355 360 365 Leu Val Gly Ile Gln Thr Gly Leu Ile Glu Asp Asn Lys Gly Trp Val 370 375 380 Thr Asp Ile Asn 385 1471104DNAArabidopsis thaliana 147atggctcctt ctgtgcaccc ttcttcatca cctcttttta caagtaaagc cgatgaaaag 60tatgcgaatg taaaatggga tgagctcgga ttcgcactgg ttccaacaga ttatatgtat 120gtggcgaaat gcaaacaagg 
agagagcttt tcaacaggag agattgttcc ttatggggat 180atttctataa gcccttgtgc tgggattctc aattatggcc agggactatt tgaaggtctc 240aaggcttaca ggacagaaga cggtcggatc acactcttcc gacctgacca aaacgctatt 300cgtatgcaaa caggtgcaga taggctttgt atgacacctc cttccccgga gcaattcgtt 360gaagcagtta agcaaactgt gcttgccaac aacaaatggg tacctcctcc ggggaaagga 420gctttgtata ttaggcctct actcataggt actggtgctg tccttggagt agcttcagct 480cctgaatata cgttcctcat ttacacatct cccgtgggaa attatcacaa ggcaagctca 540ggcttgaacc tcaaagttga tcataaccat cgccgagccc acttcggtgg aacagggggt 600gtgaagagct gcacaaatta ttctccagtt gtaaaatcgt tgatcgaagc aaagtcttcg 660ggtttctctg atgtcttgtt cctggatgcg gcaactggta aaaacatcga agaggtttct 720acttgtaaca tcttcattct aaagggaaac attgtatcca ctcccccaac ttcaggaacc 780attttaccag gaatcacaag gaagagcata tgtgagctag cccgtgacat tggctatgag 840gttcaagaac gtgatctttc tgtggatgag ctattagagg cagaggaagt tttttgcacg 900gggacggcag tggtcattaa agctgttgaa accgtgacat tccatgacaa aagggtaaaa 960tatagaacag gagaagaagc attctctacg aagcttcact tgatattaac taatattcaa 1020atgggagttg tcgaagataa gaagggttgg atgatggaga tcgatcattt ggttggaaca 1080gattcgtttc ctgatgaaac ataa 1104148367PRTArabidopsis thaliana 148Met Ala Pro Ser Val His Pro Ser Ser Ser Pro Leu Phe Thr Ser Lys 1 5 10 15 Ala Asp Glu Lys Tyr Ala Asn Val Lys Trp Asp Glu Leu Gly Phe Ala 20 25 30 Leu Val Pro Thr Asp Tyr Met Tyr Val Ala Lys Cys Lys Gln Gly Glu 35 40 45 Ser Phe Ser Thr Gly Glu Ile Val Pro Tyr Gly Asp Ile Ser Ile Ser 50 55 60 Pro Cys Ala Gly Ile Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu 65 70 75 80 Lys Ala Tyr Arg Thr Glu Asp Gly Arg Ile Thr Leu Phe Arg Pro Asp 85 90 95 Gln Asn Ala Ile Arg Met Gln Thr Gly Ala Asp Arg Leu Cys Met Thr 100 105 110 Pro Pro Ser Pro Glu Gln Phe Val Glu Ala Val Lys Gln Thr Val Leu 115 120 125 Ala Asn Asn Lys Trp Val Pro Pro Pro Gly Lys Gly Ala Leu Tyr Ile 130 135 140 Arg Pro Leu Leu Ile Gly Thr Gly Ala Val Leu Gly Val Ala Ser Ala 145 150 155 160 Pro Glu Tyr Thr Phe Leu Ile Tyr Thr Ser Pro Val Gly Asn Tyr His 165 170 175 Lys Ala Ser Ser Gly Leu 
Asn Leu Lys Val Asp His Asn His Arg Arg 180 185 190 Ala His Phe Gly Gly Thr Gly Gly Val Lys Ser Cys Thr Asn Tyr Ser 195 200 205 Pro Val Val Lys Ser Leu Ile Glu Ala Lys Ser Ser Gly Phe Ser Asp 210 215 220 Val Leu Phe Leu Asp Ala Ala Thr Gly Lys Asn Ile Glu Glu Val Ser 225 230 235 240 Thr Cys Asn Ile Phe Ile Leu Lys Gly Asn Ile Val Ser Thr Pro Pro 245 250 255 Thr Ser Gly Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Cys Glu 260 265 270 Leu Ala Arg Asp Ile Gly Tyr Glu Val Gln Glu Arg Asp Leu Ser Val 275 280 285 Asp Glu Leu Leu Glu Ala Glu Glu Val Phe Cys Thr Gly Thr Ala Val 290 295 300 Val Ile Lys Ala Val Glu Thr Val Thr Phe His Asp Lys Arg Val Lys 305 310 315 320 Tyr Arg Thr Gly Glu Glu Ala Phe Ser Thr Lys Leu His Leu Ile Leu 325 330 335 Thr Asn Ile Gln Met Gly Val Val Glu Asp Lys Lys Gly Trp Met Met 340 345 350 Glu Ile Asp His Leu Val Gly Thr Asp Ser Phe Pro Asp Glu Thr 355 360 365 1491071DNAArabidopsis thaliana 149atggctcctt cttcatcacc tcttcgtact acaagtgaaa cagatgaaaa atatgcgaat 60gtcaaatggg aagagcttgg attcgctctg actccaatag attatatgta tgtagccaaa 120tgcagacaag gagagagctt tacacaaggg aagattgttc cttatggcga catttcaatt 180agcccttgtt ctccgattct caattacggc cagggactat ttgaaggtct caaagcttac 240agaacagaag acgaccggat taggattttc cggcctgacc aaaacgctct tcgcatgcaa 300actggtgcgg agaggctttg tatgacacct cctactctag aacaatttgt cgaggcagtt 360aagcaaactg tgcttgccaa caagaaatgg gttcctcctc cgggtaaagg aactctgtat 420ataaggcctc tgctactagg gagtggtgct acccttggag tagctccagc acctgaatac 480acttttctca tatatgcatc tcccgtagga gattaccata aggtaagctc aggcttgaac 540ctcaaagttg atcataagta tcaccgagcc cattcaggtg gaacgggggg tgtcaagagc 600tgcacaaact attctccagt tgtgaaatcg ttactcgaag caaagtcagc gggtttctct 660gatgtcctgt tcctggatgc agcaactggt agaaacatcg aagagcttac tgcttgtaac 720atcttcattg tcaagggaaa cattgtatcc accccaccaa cttcaggaac cattttacct 780ggagtcacga ggaaaagcat aagtgagctg gctcatgata ttggctacca ggtcgaagaa 840cgcgatgtat ctgtggatga gctactagag gcagaagaag ttttctgcac agggactgca 900gtggtcgtta aagctgttga aactgtgacc 
ttccatgaca aaaaggtaaa atacaggaca 960ggagaagcag cattgtctac gaagcttcac tcgatgttga ccaatattca gatgggagtt 1020gttgaagata agaaaggttg gatggtggac attgatcctt gtcaaggttg a 1071150356PRTArabidopsis thaliana 150Met Ala Pro Ser Ser Ser Pro Leu Arg Thr Thr Ser Glu Thr Asp Glu 1 5 10 15 Lys Tyr Ala Asn Val Lys Trp Glu Glu Leu Gly Phe Ala Leu Thr Pro 20 25 30 Ile Asp Tyr Met Tyr Val Ala Lys Cys Arg Gln Gly Glu Ser Phe Thr 35 40 45 Gln Gly Lys Ile Val Pro Tyr Gly Asp Ile Ser Ile Ser Pro Cys Ser 50 55 60 Pro Ile Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr 65 70 75 80 Arg Thr Glu Asp Asp Arg Ile Arg Ile Phe Arg Pro Asp Gln Asn Ala 85 90 95 Leu Arg Met Gln Thr Gly Ala Glu Arg Leu Cys Met Thr Pro Pro Thr 100 105 110 Leu Glu Gln Phe Val Glu Ala Val Lys Gln Thr Val Leu Ala Asn Lys 115 120 125 Lys Trp Val Pro Pro Pro Gly Lys Gly Thr Leu Tyr Ile Arg Pro Leu 130 135 140 Leu Leu Gly Ser Gly Ala Thr Leu Gly Val Ala Pro Ala Pro Glu Tyr 145 150 155 160 Thr Phe Leu Ile Tyr Ala Ser Pro Val Gly Asp Tyr His Lys Val Ser 165 170 175 Ser Gly Leu Asn Leu Lys Val Asp His Lys Tyr His Arg Ala His Ser 180 185 190 Gly Gly Thr Gly Gly Val Lys Ser Cys Thr Asn Tyr Ser Pro Val Val 195 200 205 Lys Ser Leu Leu Glu Ala Lys Ser Ala Gly Phe Ser Asp Val Leu Phe 210 215 220 Leu Asp Ala Ala Thr Gly Arg Asn Ile Glu Glu Leu Thr Ala Cys Asn 225 230 235 240 Ile Phe Ile Val Lys Gly Asn Ile Val Ser Thr Pro Pro Thr Ser Gly 245 250 255 Thr Ile Leu Pro Gly Val Thr Arg Lys Ser Ile Ser Glu Leu Ala His 260 265 270 Asp Ile Gly Tyr Gln Val Glu Glu Arg Asp Val Ser Val Asp Glu Leu 275 280 285 Leu Glu Ala Glu Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Lys 290 295 300 Ala Val Glu Thr Val Thr Phe His Asp Lys Lys Val Lys Tyr Arg Thr 305 310 315 320 Gly Glu Ala Ala Leu Ser Thr Lys Leu His Ser Met Leu Thr Asn Ile 325 330 335 Gln Met Gly Val Val Glu Asp Lys Lys Gly Trp Met Val Asp Ile Asp 340 345 350 Pro Cys Gln Gly 355 1511065DNAArabidopsis thaliana 151atggctcctt ctgcgcaacc tcttcctgtg agtgtttcgg atgaaaaata tgcgaatgtc 
60aagtgggaag agttggcatt caagtttgtt cgtacggatt atatgtatgt tgcgaagtgc 120aatcatggag agagttttca agaggggaag attcttcctt ttgctgattt gcaacttaac 180ccttgcgctg ctgttcttca gtatggccag ggtttatatg aaggactgaa agcttacagg 240acagaagatg gtcggattct gctattccga ccagaccaaa acggtctccg ccttcaagcc 300ggagctgaca gactctatat gccttatcct tcggtcgatc aattcgtctc cgccatcaaa 360caagttgctc ttgccaacaa gaaatggatt cctcctccgg ggaaaggaac attgtatatt 420aggcctatct tgtttgggag tggtccgatt cttggttcat ttcccattcc tgagaccacc 480ttcacagctt ttgcctgtcc tgttggacgt tatcataagg ataactctgg tttgaatctg 540aaaatcgaag atcagtttcg tcgagctttt cctagtggaa ctggtggtgt gaagagcatc 600acaaactatt gtcctgtttg gataccattg gcagaggcga aaaaacaagg tttctctgat 660attttgtttt tggatgctgc aactggcaaa aacattgaag aacttttcgc agctaatgtt 720tttatgctca agggcaatgt tgtatcgaca ccaacaattg caggaactat tttgcccgga 780gtcactcgaa actgcgtaat ggaattgtgt cgtgatttcg gctaccaggt cgaggaacgt 840acgattcctc tagtggactt tctcgatgcg gacgaagctt tctgtactgg cactgcttcc 900attgtgacta gtattgcatc cgtaaccttt aaagacaaaa agaccggatt caaaacaggg 960gaagaaacat tggctgcgaa gctatacgag acgttaagtg atatccagac gggtcgggtc 1020gaggatacca agggatggac ggtggagatt gaccgccagg gctga 1065152354PRTArabidopsis thaliana 152Met Ala Pro Ser Ala Gln Pro Leu Pro Val Ser Val Ser Asp Glu Lys 1
5 10 15 Tyr Ala Asn Val Lys Trp Glu Glu Leu Ala Phe Lys Phe Val Arg Thr 20 25 30 Asp Tyr Met Tyr Val Ala Lys Cys Asn His Gly Glu Ser Phe Gln Glu 35 40 45 Gly Lys Ile Leu Pro Phe Ala Asp Leu Gln Leu Asn Pro Cys Ala Ala 50 55 60 Val Leu Gln Tyr Gly Gln Gly Leu Tyr Glu Gly Leu Lys Ala Tyr Arg 65 70 75 80 Thr Glu Asp Gly Arg Ile Leu Leu Phe Arg Pro Asp Gln Asn Gly Leu 85 90 95 Arg Leu Gln Ala Gly Ala Asp Arg Leu Tyr Met Pro Tyr Pro Ser Val 100 105 110 Asp Gln Phe Val Ser Ala Ile Lys Gln Val Ala Leu Ala Asn Lys Lys 115 120 125 Trp Ile Pro Pro Pro Gly Lys Gly Thr Leu Tyr Ile Arg Pro Ile Leu 130 135 140 Phe Gly Ser Gly Pro Ile Leu Gly Ser Phe Pro Ile Pro Glu Thr Thr 145 150 155 160 Phe Thr Ala Phe Ala Cys Pro Val Gly Arg Tyr His Lys Asp Asn Ser 165 170 175 Gly Leu Asn Leu Lys Ile Glu Asp Gln Phe Arg Arg Ala Phe Pro Ser 180 185 190 Gly Thr Gly Gly Val Lys Ser Ile Thr Asn Tyr Cys Pro Val Trp Ile 195 200 205 Pro Leu Ala Glu Ala Lys Lys Gln Gly Phe Ser Asp Ile Leu Phe Leu 210 215 220 Asp Ala Ala Thr Gly Lys Asn Ile Glu Glu Leu Phe Ala Ala Asn Val 225 230 235 240 Phe Met Leu Lys Gly Asn Val Val Ser Thr Pro Thr Ile Ala Gly Thr 245 250 255 Ile Leu Pro Gly Val Thr Arg Asn Cys Val Met Glu Leu Cys Arg Asp 260 265 270 Phe Gly Tyr Gln Val Glu Glu Arg Thr Ile Pro Leu Val Asp Phe Leu 275 280 285 Asp Ala Asp Glu Ala Phe Cys Thr Gly Thr Ala Ser Ile Val Thr Ser 290 295 300 Ile Ala Ser Val Thr Phe Lys Asp Lys Lys Thr Gly Phe Lys Thr Gly 305 310 315 320 Glu Glu Thr Leu Ala Ala Lys Leu Tyr Glu Thr Leu Ser Asp Ile Gln 325 330 335 Thr Gly Arg Val Glu Asp Thr Lys Gly Trp Thr Val Glu Ile Asp Arg 340 345 350 Gln Gly 1531242DNAArabidopsis thaliana 153atggagagag cagcaattct cccgagtgtt aatcaaaatt acctactttg tccttcacgc 60gccttctcca cgcgcctcca ctcctctact cgtaacttat cgccgccgtc atttgcctcc 120atcaagcttc agcattcttc ttcctctgtt tcttctaatg gtggaatctc tcttactcga 180tgcaacgctg tttcgtccaa ttcttccagt acgttggtaa ctgaattagc cgacatagat 240tgggataccg ttggatttgg gcttaagcca gctgattata tgtatgtgat gaaatgtaac 
300attgatggag agttctcaaa aggtgagttg caacgttttg ggaatattga aattagccca 360tctgctggtg tactcaacta tggacaggga ttgtttgaag ggctaaaagc ttacagaaag 420aaagatggta ataacatcct cctctttcgt cctgaggaga atgcaaagcg tatgagaaat 480ggtgctgaga ggatgtgtat gcctgctcca accgttgagc agtttgtaga agctgtgaca 540gaaactgtac tagcaaacaa acgttgggtt ccaccaccag gtaaaggttc cttatatgtt 600agaccattgc taatgggaac aggagctgtt cttggtcttg cgcctgcacc agaatatact 660ttcattatct atgtttcgcc tgttgggaac tacttcaagg aaggtgtggc acctatcaat 720ttgattgtgg agaatgaatt tcaccgtgca actcctggtg gtaccggagg tgttaaaacc 780ataggcaatt atgctgcagt actgaaggca cagtcaattg cgaaagctaa aggatattcc 840gatgttttgt accttgattg catttacaaa agatatcttg aggaggtctc gtcttgcaat 900attttcatcg tgaaggacaa tgtgatatct actcctgaaa taaaaggaac cattttaccc 960ggtattactc gaaaaagtat gatagacgtg gctcgaacac aagggtttca ggtggaggaa 1020cggaatgtga cagtggatga attgttagaa gcagacgagg ttttctgcac aggaaccgcc 1080gtggttgtct ctcctgttgg aagcgtcact tacaaaggca aaagagtgtc ttacggagaa 1140ggtaccttcg gaactgtgtc gaagcaactc tacaccgttc tgacaagctt gcagatgggt 1200ctgattgaag acaacatgaa atggactgtg aatcttagtt aa 1242154413PRTArabidopsis thaliana 154Met Glu Arg Ala Ala Ile Leu Pro Ser Val Asn Gln Asn Tyr Leu Leu 1 5 10 15 Cys Pro Ser Arg Ala Phe Ser Thr Arg Leu His Ser Ser Thr Arg Asn 20 25 30 Leu Ser Pro Pro Ser Phe Ala Ser Ile Lys Leu Gln His Ser Ser Ser 35 40 45 Ser Val Ser Ser Asn Gly Gly Ile Ser Leu Thr Arg Cys Asn Ala Val 50 55 60 Ser Ser Asn Ser Ser Ser Thr Leu Val Thr Glu Leu Ala Asp Ile Asp 65 70 75 80 Trp Asp Thr Val Gly Phe Gly Leu Lys Pro Ala Asp Tyr Met Tyr Val 85 90 95 Met Lys Cys Asn Ile Asp Gly Glu Phe Ser Lys Gly Glu Leu Gln Arg 100 105 110 Phe Gly Asn Ile Glu Ile Ser Pro Ser Ala Gly Val Leu Asn Tyr Gly 115 120 125 Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Lys Asp Gly Asn 130 135 140 Asn Ile Leu Leu Phe Arg Pro Glu Glu Asn Ala Lys Arg Met Arg Asn 145 150 155 160 Gly Ala Glu Arg Met Cys Met Pro Ala Pro Thr Val Glu Gln Phe Val 165 170 175 Glu Ala Val Thr Glu Thr Val Leu Ala Asn Lys Arg 
Trp Val Pro Pro 180 185 190 Pro Gly Lys Gly Ser Leu Tyr Val Arg Pro Leu Leu Met Gly Thr Gly 195 200 205 Ala Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Ile Ile Tyr 210 215 220 Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Val Ala Pro Ile Asn 225 230 235 240 Leu Ile Val Glu Asn Glu Phe His Arg Ala Thr Pro Gly Gly Thr Gly 245 250 255 Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala Val Leu Lys Ala Gln Ser 260 265 270 Ile Ala Lys Ala Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Cys Ile 275 280 285 Tyr Lys Arg Tyr Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Ile Val 290 295 300 Lys Asp Asn Val Ile Ser Thr Pro Glu Ile Lys Gly Thr Ile Leu Pro 305 310 315 320 Gly Ile Thr Arg Lys Ser Met Ile Asp Val Ala Arg Thr Gln Gly Phe 325 330 335 Gln Val Glu Glu Arg Asn Val Thr Val Asp Glu Leu Leu Glu Ala Asp 340 345 350 Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Ser Pro Val Gly Ser 355 360 365 Val Thr Tyr Lys Gly Lys Arg Val Ser Tyr Gly Glu Gly Thr Phe Gly 370 375 380 Thr Val Ser Lys Gln Leu Tyr Thr Val Leu Thr Ser Leu Gln Met Gly 385 390 395 400 Leu Ile Glu Asp Asn Met Lys Trp Thr Val Asn Leu Ser 405 410 1551248DNAArabidopsis thaliana 155atggagagaa gcgccgttgc ctcaggtttt catagaaatt acatcctctg tgcttcacgc 60gccgccactt ccacgacgcg cctccactct ttgtcctccc tcagaaactt tccctcttcc 120tctctcagga ttcgtcactg tccttctccc atctcttcca atttcatcgt tagtgaagtt 180tcccgaaacc gacgatgcga cgccgtttct tccagcacca ccgatgtgac tgaattagcc 240gaaattgatt gggacaagat tgattttggg cttaaaccaa cggattacat gtacgccatg 300aaatgtagcc gtgatggtga attctctcaa ggtcaattgc aaccttttgg taacattgac 360attaacccag cagctggtgt tctcaactat ggacaaggtt tgtttgaagg tctaaaagct 420tacagaaaac aagatgggaa tattctactc ttccgtcctg aggagaatgc gatccgaatg 480agaaatggcg ctgaaagaat gtgtatgcct tctccaaccg ttgaacagtt tgttgaggct 540gtgaaaacta ctgtattagc taacaaacgc tggattccac ctccaggtaa aggatcatta 600tacataaggc cattgctaat gggaactgga gctgttcttg gtcttgctcc tgctcctgaa 660tacactttcc ttatctttgt ttcacctgtc gggaactact tcaaggaagg tgttgcgccg 720atcaacttaa ttgttgaaac tgaattccat cgtgcaactc 
ccggcggtac tggaggtgtt 780aaaaccatcg gtaattatgc tgcagtcttg aaggctcagt cgattgcgaa agctaaaggg 840tattctgatg ttttatacct tgattgcctt cacaaaagat atcttgagga ggtttcatcg 900tgcaatattt tcattgtgaa ggataatgtg atatctactc ctgaaattaa aggaaccatc 960ttgcctggaa ttacccggaa gagtatcatc gaagtagctc gtagccaagg tttcaaggtg 1020gaggaacgaa atgtgacagt tgatgaattg gtagaagcag acgaggtttt ctgcacagga 1080accgccgttg ttttatctcc ggttggaagc atcacttaca aaagccaaag gttttcttat 1140ggagaagatg gctttggaac agtctcgaaa caactctaca cttccttgac gagcctgcaa 1200atgggtctga gcgaagataa catgaactgg actgttcaat tgagttaa 1248156415PRTArabidopsis thaliana 156Met Glu Arg Ser Ala Val Ala Ser Gly Phe His Arg Asn Tyr Ile Leu 1 5 10 15 Cys Ala Ser Arg Ala Ala Thr Ser Thr Thr Arg Leu His Ser Leu Ser 20 25 30 Ser Leu Arg Asn Phe Pro Ser Ser Ser Leu Arg Ile Arg His Cys Pro 35 40 45 Ser Pro Ile Ser Ser Asn Phe Ile Val Ser Glu Val Ser Arg Asn Arg 50 55 60 Arg Cys Asp Ala Val Ser Ser Ser Thr Thr Asp Val Thr Glu Leu Ala 65 70 75 80 Glu Ile Asp Trp Asp Lys Ile Asp Phe Gly Leu Lys Pro Thr Asp Tyr 85 90 95 Met Tyr Ala Met Lys Cys Ser Arg Asp Gly Glu Phe Ser Gln Gly Gln 100 105 110 Leu Gln Pro Phe Gly Asn Ile Asp Ile Asn Pro Ala Ala Gly Val Leu 115 120 125 Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Gln 130 135 140 Asp Gly Asn Ile Leu Leu Phe Arg Pro Glu Glu Asn Ala Ile Arg Met 145 150 155 160 Arg Asn Gly Ala Glu Arg Met Cys Met Pro Ser Pro Thr Val Glu Gln 165 170 175 Phe Val Glu Ala Val Lys Thr Thr Val Leu Ala Asn Lys Arg Trp Ile 180 185 190 Pro Pro Pro Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Met Gly 195 200 205 Thr Gly Ala Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu 210 215 220 Ile Phe Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Val Ala Pro 225 230 235 240 Ile Asn Leu Ile Val Glu Thr Glu Phe His Arg Ala Thr Pro Gly Gly 245 250 255 Thr Gly Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala Val Leu Lys Ala 260 265 270 Gln Ser Ile Ala Lys Ala Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp 275 280 285 Cys Leu His Lys Arg Tyr Leu Glu Glu 
Val Ser Ser Cys Asn Ile Phe 290 295 300 Ile Val Lys Asp Asn Val Ile Ser Thr Pro Glu Ile Lys Gly Thr Ile 305 310 315 320 Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile Glu Val Ala Arg Ser Gln 325 330 335 Gly Phe Lys Val Glu Glu Arg Asn Val Thr Val Asp Glu Leu Val Glu 340 345 350 Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val Leu Ser Pro Val 355 360 365 Gly Ser Ile Thr Tyr Lys Ser Gln Arg Phe Ser Tyr Gly Glu Asp Gly 370 375 380 Phe Gly Thr Val Ser Lys Gln Leu Tyr Thr Ser Leu Thr Ser Leu Gln 385 390 395 400 Met Gly Leu Ser Glu Asp Asn Met Asn Trp Thr Val Gln Leu Ser 405 410 415 1571242DNAGlycine max 157atgggaaaac agaaacagaa aatggagagc attcgactaa tttacccgat ctgcccctct 60cgacattctt cctttcttct ctctcatcaa tctcccttcc tatgcgaacc ttctctctct 120ctcaagcttc gaaagcagtt tcctctcact tcgcagaatg ttctggaagc cgcctctcct 180ctcaggcctt ccgccactct gtcttctgat ccctacagtg agacgattga attagctgat 240atagaatggg acaaccttgg gtttgggctt caacccactg attatatgta tatcatgaaa 300tgcacacgag gtggaacctt ttccaaaggt gaattgcagc gttttgggaa catcgagttg 360aacccctccg ctggagtttt aaactatggc cagggattat ttgagggttt gaaagcatac 420cgcaaacaag atgggagtat actcctcttc cgtccggaag aaaatggttt gcggatgcag 480ataggtgcgg agcggatgtg catgccatca cctactatgg agcagtttgt ggaagctgtg 540aaggatactg ttttagctaa caaacgttgg gttccccctg caggtaaagg ttccttgtat 600attagacctt tgttaatggg aagtggacct gtacttggtg ttgcacctgc accagagtac 660acatttctaa tatatgtttc acctgttggg aactacttca aggaaggttt ggccccaatc 720aatttgattg tagaaaatga attccatcgt gcaactcctg gtggcactgg aggtgtgaag 780accattggaa actatgctgc agttctgaag gcacagtctg aagcaaaagc taaaggctac 840tctgatgttt tataccttga ctgtgtgcac aaaagatatt tggaggaggt ttcttcatgc 900aacatttttg ttgttaaggg taacattatt tcaactccag ctattaaagg gacaatccta 960cctggcatta ctcgcaaaag tataattgat gtggctcgaa gcgaagggtt tcaggttgag 1020gagcgattag tttcagtgga tgaattgcta gatgctgatg aggttttctg cacgggaaca 1080gctgtggttg tatcacctgt tggcagtatt acttatcttg gcaagagggt aacatatggg 1140gatggtattg gcgtggttgc acagcaactt tatactgtcc ttaccagatt acagatgggt 1200cttacggagg 
atgagatgaa ttggactgtt gagctgagat aa 1242158413PRTGlycine max 158Met Gly Lys Gln Lys Gln Lys Met Glu Ser Ile Arg Leu Ile Tyr Pro 1 5 10 15 Ile Cys Pro Ser Arg His Ser Ser Phe Leu Leu Ser His Gln Ser Pro 20 25 30 Phe Leu Cys Glu Pro Ser Leu Ser Leu Lys Leu Arg Lys Gln Phe Pro 35 40 45 Leu Thr Ser Gln Asn Val Leu Glu Ala Ala Ser Pro Leu Arg Pro Ser 50 55 60 Ala Thr Leu Ser Ser Asp Pro Tyr Ser Glu Thr Ile Glu Leu Ala Asp 65 70 75 80 Ile Glu Trp Asp Asn Leu Gly Phe Gly Leu Gln Pro Thr Asp Tyr Met 85 90 95 Tyr Ile Met Lys Cys Thr Arg Gly Gly Thr Phe Ser Lys Gly Glu Leu 100 105 110 Gln Arg Phe Gly Asn Ile Glu Leu Asn Pro Ser Ala Gly Val Leu Asn 115 120 125 Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Gln Asp 130 135 140 Gly Ser Ile Leu Leu Phe Arg Pro Glu Glu Asn Gly Leu Arg Met Gln 145 150 155 160 Ile Gly Ala Glu Arg Met Cys Met Pro Ser Pro Thr Met Glu Gln Phe 165 170 175 Val Glu Ala Val Lys Asp Thr Val Leu Ala Asn Lys Arg Trp Val Pro 180 185 190 Pro Ala Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Met Gly Ser 195 200 205 Gly Pro Val Leu Gly Val Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile 210 215 220 Tyr Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile 225 230 235 240 Asn Leu Ile Val Glu Asn Glu Phe His Arg Ala Thr Pro Gly Gly Thr 245 250 255 Gly Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala Val Leu Lys Ala Gln 260 265 270 Ser Glu Ala Lys Ala Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Cys 275 280 285 Val His Lys Arg Tyr Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Val 290 295 300 Val Lys Gly Asn Ile Ile Ser Thr Pro Ala Ile Lys Gly Thr Ile Leu 305 310 315 320 Pro Gly Ile Thr Arg Lys Ser Ile Ile Asp Val Ala Arg Ser Glu Gly 325 330 335 Phe Gln Val Glu Glu Arg Leu Val Ser Val Asp Glu Leu Leu Asp Ala 340 345 350 Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Ser Pro Val Gly 355 360 365 Ser Ile Thr Tyr Leu Gly Lys Arg Val Thr Tyr Gly Asp Gly Ile Gly 370 375 380 Val Val Ala Gln Gln Leu Tyr Thr Val Leu Thr Arg Leu Gln Met Gly 385 390 395 400 Leu Thr Glu Asp Glu Met Asn Trp 
Thr Val Glu Leu Arg 405 410 1591167DNAGlycine max 159atgattcaaa gaactgtgtc ctttcccagt ttaaggaaat tgcttcttcg ggctggttgt 60tctaaatctg cttcgtccaa gatcggaact tacaattgct ttgcttctca gtcctcccct 120ctaccgagcc acaaccctag ttaccgtgat gacgagtatg ctgatgtgga ctgggacagt 180cttggatttg gactgatgcc cactgattat atgtatatta ctaaatgttg tgagggccaa 240aattttggac aaggacaact cagtcgttat gggaacattg aactcagtcc atcagctggt 300gtcctaaatt atggtcaggg tttattcgaa ggcacgaaag catacagaaa agaaaatggg 360ggcttgctac tcttccgtcc agaagaaaat gccattcgca tgaagactgg tgcccaaaga 420atgtgcatgg catcgccttc cattgatcat tttgttgatg ctttgaagca aactgtcttg 480gctaataagc gttgggttcc tccaccgggc aaaggatcct tgtaccttag gcctctgctc 540ctaggaactg gtccggtttt gggtttggct cctgcacctg aatacacatt cctcatattt 600gcttcccctg ttcgcaacta tttcaaggag ggctctgctc cactcaactt gtacgtggag 660gaaaactttg accgtgcttc tagccgcggc actggaaacg ttaaaaccat ttccaattat 720gcaccggtct tgatggcaca
aattcaagcc aagaaaagag gattttcgga tgtgctatac 780cttgattcag acaccaagaa aaatctcgag gaggtctctt cttgtaacat ttttattgcc 840aagggcaaat gcatctcaac acctgctact aatggaacta ttctttccgg aattacccga 900aaaagtgtca ttgaaattgc tcgcgatcat ggctatcagg tagaagagcg tgctgttgcc 960gtggatgaat tgattgaggc tgatgaagtt ttctgcacag gaactgcggt cggtgttgct 1020ccagtaggga gtatcacata ccaggataaa aggatggaat atataacagg ttctggaacc 1080atttgtcaag agctgaacaa taccatttca ggaattcaaa cgggtactat tgaagataag 1140aagggatgga ttgtcgaagt tgattaa 1167160388PRTGlycine max 160Met Ile Gln Arg Thr Val Ser Phe Pro Ser Leu Arg Lys Leu Leu Leu 1 5 10 15 Arg Ala Gly Cys Ser Lys Ser Ala Ser Ser Lys Ile Gly Thr Tyr Asn 20 25 30 Cys Phe Ala Ser Gln Ser Ser Pro Leu Pro Ser His Asn Pro Ser Tyr 35 40 45 Arg Asp Asp Glu Tyr Ala Asp Val Asp Trp Asp Ser Leu Gly Phe Gly 50 55 60 Leu Met Pro Thr Asp Tyr Met Tyr Ile Thr Lys Cys Cys Glu Gly Gln 65 70 75 80 Asn Phe Gly Gln Gly Gln Leu Ser Arg Tyr Gly Asn Ile Glu Leu Ser 85 90 95 Pro Ser Ala Gly Val Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Thr 100 105 110 Lys Ala Tyr Arg Lys Glu Asn Gly Gly Leu Leu Leu Phe Arg Pro Glu 115 120 125 Glu Asn Ala Ile Arg Met Lys Thr Gly Ala Gln Arg Met Cys Met Ala 130 135 140 Ser Pro Ser Ile Asp His Phe Val Asp Ala Leu Lys Gln Thr Val Leu 145 150 155 160 Ala Asn Lys Arg Trp Val Pro Pro Pro Gly Lys Gly Ser Leu Tyr Leu 165 170 175 Arg Pro Leu Leu Leu Gly Thr Gly Pro Val Leu Gly Leu Ala Pro Ala 180 185 190 Pro Glu Tyr Thr Phe Leu Ile Phe Ala Ser Pro Val Arg Asn Tyr Phe 195 200 205 Lys Glu Gly Ser Ala Pro Leu Asn Leu Tyr Val Glu Glu Asn Phe Asp 210 215 220 Arg Ala Ser Ser Arg Gly Thr Gly Asn Val Lys Thr Ile Ser Asn Tyr 225 230 235 240 Ala Pro Val Leu Met Ala Gln Ile Gln Ala Lys Lys Arg Gly Phe Ser 245 250 255 Asp Val Leu Tyr Leu Asp Ser Asp Thr Lys Lys Asn Leu Glu Glu Val 260 265 270 Ser Ser Cys Asn Ile Phe Ile Ala Lys Gly Lys Cys Ile Ser Thr Pro 275 280 285 Ala Thr Asn Gly Thr Ile Leu Ser Gly Ile Thr Arg Lys Ser Val Ile 290 295 300 Glu Ile Ala Arg Asp His Gly Tyr Gln Val 
Glu Glu Arg Ala Val Ala 305 310 315 320 Val Asp Glu Leu Ile Glu Ala Asp Glu Val Phe Cys Thr Gly Thr Ala 325 330 335 Val Gly Val Ala Pro Val Gly Ser Ile Thr Tyr Gln Asp Lys Arg Met 340 345 350 Glu Tyr Ile Thr Gly Ser Gly Thr Ile Cys Gln Glu Leu Asn Asn Thr 355 360 365 Ile Ser Gly Ile Gln Thr Gly Thr Ile Glu Asp Lys Lys Gly Trp Ile 370 375 380 Val Glu Val Asp 385 1611080DNAGlycine max 161atgtctcccc cttctatgtt aggcaaccgc aaagacggtt ctgaaattgc tgttgcggaa 60aactatgctg acattaattg ggatgagctt ggatttagtc tagttccaac agattacatg 120tatgtcatga aatgtgcaaa aggagataag ttttcacaag gatccatcgt tccatttgga 180aacatagaga tcagcccttc tgctggaatc ttaaattatg gacagggact ctttgagggg 240ctaaaagcac atagaactga agatgggcat gtacttttat ttcgaccaga tgagaatgct 300caacgcatga aacgaggtgc agatagattg tgtatgccat ccccatctcc tggccaattt 360gttaatgctg taaagcagat agttattgcc aacaaacgtt gggtgcctcc accagggaaa 420gggtcactat atattaggcc attgctgata ggaacaggag cattgttagg ggtggcacct 480gcccctgagt atacatttct tatttattgt tctccagttg gcagctacca gaagggtgca 540ctaaatttaa aggttgagga taaactatat agagcaatat ctggctgtgg tggaactgga 600gggatcaaaa gtgtcaccaa ttatgcccct gtttatactg caatggctga tgcaaaggcc 660aacggattct ctgatgtgct gttcttagac tcagcaactg gaaaacatat agaggaggcc 720tcagcatgca atgtatttgt tttgaaggac aatgctatct ccactcctgc aatagatgga 780accatcctac ctggtatcac ccgaaaatcc atcattgaca ttgccattga tttgggttat 840caggtcatgg aacgttccgt atcagtggag gagatgctag gtgctgatga aatgttctgc 900actggaactg cagttgttgt caactctgtt gcatctgtaa cttataagga aacaagggtg 960gattacaaaa caggcccagc aacattgtcc tcaaaactac ggaaaacact tgttggaatt 1020caaacagggt gtcttgagga caaaaaatca tggacagtcc gagtagattc aacaatatag 1080162359PRTGlycine max 162Met Ser Pro Pro Ser Met Leu Gly Asn Arg Lys Asp Gly Ser Glu Ile 1 5 10 15 Ala Val Ala Glu Asn Tyr Ala Asp Ile Asn Trp Asp Glu Leu Gly Phe 20 25 30 Ser Leu Val Pro Thr Asp Tyr Met Tyr Val Met Lys Cys Ala Lys Gly 35 40 45 Asp Lys Phe Ser Gln Gly Ser Ile Val Pro Phe Gly Asn Ile Glu Ile 50 55 60 Ser Pro Ser Ala Gly Ile Leu Asn Tyr Gly Gln Gly 
Leu Phe Glu Gly 65 70 75 80 Leu Lys Ala His Arg Thr Glu Asp Gly His Val Leu Leu Phe Arg Pro 85 90 95 Asp Glu Asn Ala Gln Arg Met Lys Arg Gly Ala Asp Arg Leu Cys Met 100 105 110 Pro Ser Pro Ser Pro Gly Gln Phe Val Asn Ala Val Lys Gln Ile Val 115 120 125 Ile Ala Asn Lys Arg Trp Val Pro Pro Pro Gly Lys Gly Ser Leu Tyr 130 135 140 Ile Arg Pro Leu Leu Ile Gly Thr Gly Ala Leu Leu Gly Val Ala Pro 145 150 155 160 Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Cys Ser Pro Val Gly Ser Tyr 165 170 175 Gln Lys Gly Ala Leu Asn Leu Lys Val Glu Asp Lys Leu Tyr Arg Ala 180 185 190 Ile Ser Gly Cys Gly Gly Thr Gly Gly Ile Lys Ser Val Thr Asn Tyr 195 200 205 Ala Pro Val Tyr Thr Ala Met Ala Asp Ala Lys Ala Asn Gly Phe Ser 210 215 220 Asp Val Leu Phe Leu Asp Ser Ala Thr Gly Lys His Ile Glu Glu Ala 225 230 235 240 Ser Ala Cys Asn Val Phe Val Leu Lys Asp Asn Ala Ile Ser Thr Pro 245 250 255 Ala Ile Asp Gly Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile 260 265 270 Asp Ile Ala Ile Asp Leu Gly Tyr Gln Val Met Glu Arg Ser Val Ser 275 280 285 Val Glu Glu Met Leu Gly Ala Asp Glu Met Phe Cys Thr Gly Thr Ala 290 295 300 Val Val Val Asn Ser Val Ala Ser Val Thr Tyr Lys Glu Thr Arg Val 305 310 315 320 Asp Tyr Lys Thr Gly Pro Ala Thr Leu Ser Ser Lys Leu Arg Lys Thr 325 330 335 Leu Val Gly Ile Gln Thr Gly Cys Leu Glu Asp Lys Lys Ser Trp Thr 340 345 350 Val Arg Val Asp Ser Thr Ile 355 1631203DNAGlycine max 163atggtgcctc ctcatggaaa aggagcgttg tacattaggc ctttattatt tggaagtgga 60tctgttatgg gtattgcacc tgcaccacat tgcaccttcc taatatacac taatccaatt 120tccaacgctt acaaatgcag ggttgagttc aaaacaggag ccgacactgt gacccagaat 180actgctgggg aaaactatgc tgacattaat tgggatgagc ttggatttag tctagttcca 240acagattaca tgtatgtcat gaaatgtgca aagggagata agttttcaca aggatccatc 300cttccctatg gaaacttaga gattaaccct tctgctggaa tcttaaatta tgggcaggga 360atctttgagg gactaaaggc atatagaact gaagatgggt gcatccttct gtttagacca 420gaagagaatg ctcaacgaat gaagatagga gcagacagat tgtgcatgcc atccccatcc 480attgaccagt ttgttgctgc tgtgaagcag acagttcttg ccaacaaacg 
ttgggtgcct 540ccaccaggga aagggtcact atatattagg ccattgctca tgggaaccgg agcttctttg 600aacttgtctc cagcacctga gtacacatta cttatttatt gttctcctgt cactaattac 660cacaagggtt cactgaactt aaaagtggag agtaagttct accgagcaat atctggcact 720ggtggaaccg gagggatcaa gagtgttacc aactatgccc ctgtttatgc tgcaagcatt 780gaagcaaagg ccagtggatt ctctgatgtt ttgttcttgg actcagcaac tggaaaaaat 840atagaggagg tttctgcatg caatgtgttt gttgtgaagg gtaatgctat ctgcaccccg 900gcaacaaatg gagccatcct ccctgggatc acacgaaaat ccatcattga gattgcctta 960gatatgggtt atcaggtcac ggaacgtgcc atatcagtgg aggaaatgct agatgctgat 1020gaagtgttct gcacaggaac tgcagttgtt gtcaactctg tttcatctgt aacctacaaa 1080gaaacaagaa ctgagtataa aacaggacca gaaacattgt cccaaaaact gcgcaaaaca 1140ctggttggaa ttcaaactgg gtgtattgag gacacaaagg gctggacagt tcgaatagat 1200tga 1203164400PRTGlycine max 164Met Val Pro Pro His Gly Lys Gly Ala Leu Tyr Ile Arg Pro Leu Leu 1 5 10 15 Phe Gly Ser Gly Ser Val Met Gly Ile Ala Pro Ala Pro His Cys Thr 20 25 30 Phe Leu Ile Tyr Thr Asn Pro Ile Ser Asn Ala Tyr Lys Cys Arg Val 35 40 45 Glu Phe Lys Thr Gly Ala Asp Thr Val Thr Gln Asn Thr Ala Gly Glu 50 55 60 Asn Tyr Ala Asp Ile Asn Trp Asp Glu Leu Gly Phe Ser Leu Val Pro 65 70 75 80 Thr Asp Tyr Met Tyr Val Met Lys Cys Ala Lys Gly Asp Lys Phe Ser 85 90 95 Gln Gly Ser Ile Leu Pro Tyr Gly Asn Leu Glu Ile Asn Pro Ser Ala 100 105 110 Gly Ile Leu Asn Tyr Gly Gln Gly Ile Phe Glu Gly Leu Lys Ala Tyr 115 120 125 Arg Thr Glu Asp Gly Cys Ile Leu Leu Phe Arg Pro Glu Glu Asn Ala 130 135 140 Gln Arg Met Lys Ile Gly Ala Asp Arg Leu Cys Met Pro Ser Pro Ser 145 150 155 160 Ile Asp Gln Phe Val Ala Ala Val Lys Gln Thr Val Leu Ala Asn Lys 165 170 175 Arg Trp Val Pro Pro Pro Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu 180 185 190 Leu Met Gly Thr Gly Ala Ser Leu Asn Leu Ser Pro Ala Pro Glu Tyr 195 200 205 Thr Leu Leu Ile Tyr Cys Ser Pro Val Thr Asn Tyr His Lys Gly Ser 210 215 220 Leu Asn Leu Lys Val Glu Ser Lys Phe Tyr Arg Ala Ile Ser Gly Thr 225 230 235 240 Gly Gly Thr Gly Gly Ile Lys Ser Val Thr Asn Tyr Ala Pro 
Val Tyr 245 250 255 Ala Ala Ser Ile Glu Ala Lys Ala Ser Gly Phe Ser Asp Val Leu Phe 260 265 270 Leu Asp Ser Ala Thr Gly Lys Asn Ile Glu Glu Val Ser Ala Cys Asn 275 280 285 Val Phe Val Val Lys Gly Asn Ala Ile Cys Thr Pro Ala Thr Asn Gly 290 295 300 Ala Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile Glu Ile Ala Leu 305 310 315 320 Asp Met Gly Tyr Gln Val Thr Glu Arg Ala Ile Ser Val Glu Glu Met 325 330 335 Leu Asp Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Asn 340 345 350 Ser Val Ser Ser Val Thr Tyr Lys Glu Thr Arg Thr Glu Tyr Lys Thr 355 360 365 Gly Pro Glu Thr Leu Ser Gln Lys Leu Arg Lys Thr Leu Val Gly Ile 370 375 380 Gln Thr Gly Cys Ile Glu Asp Thr Lys Gly Trp Thr Val Arg Ile Asp 385 390 395 400 1651236DNAGlycine max 165atggagagca gcgccgtcta cggaagcatt cgaccaagtt acccgatctg cccctctcga 60cgttcttcct ctcttctctc tcaccaatct cccttcctat tcgagccttc tctttctctc 120aagcttcgca agcagtttcc tctcatttcg cagaattttc tggaagccgc ttctcctctc 180aggccttctg ccactttgtc ttctgattcc taccgtgaga cgattgaatt agctgatata 240gaatgggaca accttggttt tgggcttcaa cccacggatt atatgtattc catgaaatgc 300acacgaggtg gaaccttctc caaaggtgaa ctgcagcgtt ttggtaacat tgaattgaac 360ccctcagcag gagttttaaa ctatggtcag ggattatttg agggtttgaa agcgtatcgg 420aaacaagatg ggagtatact cctcttccgt ccggaagaaa atggtttgcg gatgcagata 480ggtgcagaga ggatgtgcat gccatcgcct actgttgagc agtttgtgga agctgtaaag 540gagacagttt tagcaaacaa acgttgggtt ccccctgcag gtaaaggttc cctgtatatt 600agacctttgc taatgggaag tggaccggta cttggtcttg cacctgctcc agagtacacc 660tttctaatat acgtttcacc tgttgggaac tacttcaagg aaggtttggc cccaatcaat 720ttgattgtgg aaaatgaatt acatcgtgca actcctggtg gcactggagg tgtgaagacc 780attggaaact atgctgcagt tctgaaggca cagtctgaag caaaagctaa aggctactct 840gatgttttat accttgactg tgtgcacaaa agatatttgg aggaggtttc ttcatgcaac 900atttttgttg ttaagggtaa cgttatttca actccagcta ttaaagggac aatcctacct 960ggcattactc gcaaaagtat aattgatgtt gctcgaagcc aagggttcca ggttgaggag 1020cgattagtgt cagtggatga attgctagat gctgacgagg tcttctgcac gggaacagct 1080gtggttgtat 
cacctgttgg cagtattact tatcttgaca agagtttatt tctattaatt 1140tgttgtgatt ttactttatt ttatttttta tacactgaaa gactaaggcc atatgaattt 1200ggtccatatt tatctccatt ctcttcacct cattaa 1236166411PRTGlycine max 166Met Glu Ser Ser Ala Val Tyr Gly Ser Ile Arg Pro Ser Tyr Pro Ile 1 5 10 15 Cys Pro Ser Arg Arg Ser Ser Ser Leu Leu Ser His Gln Ser Pro Phe 20 25 30 Leu Phe Glu Pro Ser Leu Ser Leu Lys Leu Arg Lys Gln Phe Pro Leu 35 40 45 Ile Ser Gln Asn Phe Leu Glu Ala Ala Ser Pro Leu Arg Pro Ser Ala 50 55 60 Thr Leu Ser Ser Asp Ser Tyr Arg Glu Thr Ile Glu Leu Ala Asp Ile 65 70 75 80 Glu Trp Asp Asn Leu Gly Phe Gly Leu Gln Pro Thr Asp Tyr Met Tyr 85 90 95 Ser Met Lys Cys Thr Arg Gly Gly Thr Phe Ser Lys Gly Glu Leu Gln 100 105 110 Arg Phe Gly Asn Ile Glu Leu Asn Pro Ser Ala Gly Val Leu Asn Tyr 115 120 125 Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Gln Asp Gly 130 135 140 Ser Ile Leu Leu Phe Arg Pro Glu Glu Asn Gly Leu Arg Met Gln Ile 145 150 155 160 Gly Ala Glu Arg Met Cys Met Pro Ser Pro Thr Val Glu Gln Phe Val 165 170 175 Glu Ala Val Lys Glu Thr Val Leu Ala Asn Lys Arg Trp Val Pro Pro 180 185 190 Ala Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Met Gly Ser Gly 195 200 205 Pro Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr 210 215 220 Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn 225 230 235 240 Leu Ile Val Glu Asn Glu Leu His Arg Ala Thr Pro Gly Gly Thr Gly 245 250 255 Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala Val Leu Lys Ala Gln Ser 260 265 270 Glu Ala Lys Ala Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Cys Val 275 280 285 His Lys Arg Tyr Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val 290 295 300 Lys Gly Asn Val Ile Ser Thr Pro Ala Ile Lys Gly Thr Ile Leu Pro 305 310 315 320 Gly Ile Thr Arg Lys Ser Ile Ile Asp Val Ala Arg Ser Gln Gly Phe 325 330 335 Gln Val Glu Glu Arg Leu Val Ser Val Asp Glu Leu Leu Asp Ala Asp 340 345 350 Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Ser Pro Val Gly Ser 355 360 365 Ile Thr Tyr Leu Asp Lys Ser Leu Phe Leu Leu Ile Cys Cys 
Asp Phe 370 375 380 Thr Leu Phe Tyr Phe Leu Tyr Thr Glu Arg Leu Arg Pro Tyr Glu Phe 385 390 395 400 Gly Pro Tyr Leu Ser Pro Phe Ser Ser Pro His 405 410 1671143DNAHelianthus annuus 167atgatgatcc gacaaagccc cttccttctt ggtttgattc agacttccat atcaaagatt 60tctgcaagat gtttgacggc acaggctgcc tcggcgcttc aggaaaatga acctatgatc 120agacacgagg aatatgctgc tgatattgat tggaacaact tgggttttgg tataaaacaa 180accgattaca tgtacaagtc taaatgcaca aagaacaaca cttttgagca aggacaacta 240gttaattatg gaaacttaga attgagcccg gctgctggag ttttaaacta cggccaggga 300ctcttcgaag gtacaaaagc cgtgagggga gaagacggtc gccttttgct ttttcgaccc 360gatcaaaacg ccatccgaat gcaaatcgga gccgagcgaa tgtgcatgca atccccatct 420atagaacagg ttgtagatgc agttaaacaa acagctttag ccaataaacg ttggattcca 480cctccaggaa aagggtcgct ttacatcagg cctttgctca ttggaactgg gcctatattg 540ggcttatctc ctgctcatga gtacacattt ttagtatatg cctccccagt tggcaactat 600ttcaaggaag gtacggcacc gttaaactta tacgttaaca acgagtttca
tcgtacaact 660cgtggtggag cgggaggggt caaaaccatt acaaattatg ccccggtatt gaaaccgtta 720ttaagagcaa aggaacaagg gttctcagat gtagtgtacc ttgattcggt ccataaaaag 780tacatcgagg aagttagttc ttgtaatatt ttcattgtta agggtgatgt tatttcaacg 840ccttcaacgg taggtactat cctcgaagga atccccagaa agagcatcat tgatattgca 900cgtgccttag gatacaaggt tgaagaacgt ttagttgcag tagatgaatt gatggaagcc 960gacgaagttt tcactactgg aaccgcggtt actgttgcca ctgttggtag cattacatac 1020aatggtcgaa gagtggcgta tagaacaggt gatgggttgg tgagtcagaa tttattcaaa 1080aggctagtag gaattcaagt tgggaaagtt gaagacaaat acacctggat agttgatatt 1140taa 1143168380PRTHelianthus annuus 168Met Met Ile Arg Gln Ser Pro Phe Leu Leu Gly Leu Ile Gln Thr Ser 1 5 10 15 Ile Ser Lys Ile Ser Ala Arg Cys Leu Thr Ala Gln Ala Ala Ser Ala 20 25 30 Leu Gln Glu Asn Glu Pro Met Ile Arg His Glu Glu Tyr Ala Ala Asp 35 40 45 Ile Asp Trp Asn Asn Leu Gly Phe Gly Ile Lys Gln Thr Asp Tyr Met 50 55 60 Tyr Lys Ser Lys Cys Thr Lys Asn Asn Thr Phe Glu Gln Gly Gln Leu 65 70 75 80 Val Asn Tyr Gly Asn Leu Glu Leu Ser Pro Ala Ala Gly Val Leu Asn 85 90 95 Tyr Gly Gln Gly Leu Phe Glu Gly Thr Lys Ala Val Arg Gly Glu Asp 100 105 110 Gly Arg Leu Leu Leu Phe Arg Pro Asp Gln Asn Ala Ile Arg Met Gln 115 120 125 Ile Gly Ala Glu Arg Met Cys Met Gln Ser Pro Ser Ile Glu Gln Val 130 135 140 Val Asp Ala Val Lys Gln Thr Ala Leu Ala Asn Lys Arg Trp Ile Pro 145 150 155 160 Pro Pro Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Ile Gly Thr 165 170 175 Gly Pro Ile Leu Gly Leu Ser Pro Ala His Glu Tyr Thr Phe Leu Val 180 185 190 Tyr Ala Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Thr Ala Pro Leu 195 200 205 Asn Leu Tyr Val Asn Asn Glu Phe His Arg Thr Thr Arg Gly Gly Ala 210 215 220 Gly Gly Val Lys Thr Ile Thr Asn Tyr Ala Pro Val Leu Lys Pro Leu 225 230 235 240 Leu Arg Ala Lys Glu Gln Gly Phe Ser Asp Val Val Tyr Leu Asp Ser 245 250 255 Val His Lys Lys Tyr Ile Glu Glu Val Ser Ser Cys Asn Ile Phe Ile 260 265 270 Val Lys Gly Asp Val Ile Ser Thr Pro Ser Thr Val Gly Thr Ile Leu 275 280 285 Glu Gly Ile Pro Arg Lys Ser 
Ile Ile Asp Ile Ala Arg Ala Leu Gly 290 295 300 Tyr Lys Val Glu Glu Arg Leu Val Ala Val Asp Glu Leu Met Glu Ala 305 310 315 320 Asp Glu Val Phe Thr Thr Gly Thr Ala Val Thr Val Ala Thr Val Gly 325 330 335 Ser Ile Thr Tyr Asn Gly Arg Arg Val Ala Tyr Arg Thr Gly Asp Gly 340 345 350 Leu Val Ser Gln Asn Leu Phe Lys Arg Leu Val Gly Ile Gln Val Gly 355 360 365 Lys Val Glu Asp Lys Tyr Thr Trp Ile Val Asp Ile 370 375 380 1691029DNAHordeum vulgare 169atggattgcg gcacggcctc gcacggcgcc ctactcgccg ccgcgccgct cgccggccgg 60cggccccggc tgctgcccct ctcgccgccg ccatcgacgc cgtccattca gattcagaat 120cgactttatt cgatgtcact gcttccgctt cgaaaggctc gtggcatggg aagatgcgag 180gcttctctag caagtaacta cacgcagaca tcagagtttg ctgatttgga ttgggagaac 240cttggttttg gacttgtgca aactgactat atgtatactg caaaatgtgg gccggatggg 300aactttgaca agggtggaat ggtgccgttt gggccgatag aaatgaaccc agcatccgga 360gtcctgaatt atggacaggg attgttcgag ggcctaaagg cgtataggaa aaccgatgga 420tccatcctgt tgtttcgccc aatggaaaat gcaatgcgga tgcaaactgg tgctgagagg 480atgtgcatgc ctgcacctcc tgtcgagcaa tttgtgaacg cagtaaaaca aaccgtttta 540gcaaacaaga gatgggtgcc tcctacgggt aaaggttctt tgtatattag gccactactc 600gtgggaagtg gagctgttct tggtctcgca cctgctcctg agtacacatt cattattttt 660gcctcccctg ttgggaacta ctttaaggaa ggattagccc caataaattt gatagttgaa 720gacaagtttc atcgggccac ccctggtgga actggaggtg ttaagaccat tgggaattat 780gcctcggtct tgatggcaca gaagattgca aaggagaaag gttattctga tgttctctac 840ttggatgctg ttgagaaaaa gtaccttgaa gaagtatctt cgtgtaatat ttttgttgtg 900aagggcaatg ttatttcaac tccagcaata aaaggaacaa tactaccggg catcacaagg 960aaaagtataa ttgatgttgc tctgagtaaa ggcttccagg ttgaggagcg gcctcgtgtc 1020cgtggatga 1029170342PRTHordeum vulgare 170Met Asp Cys Gly Thr Ala Ser His Gly Ala Leu Leu Ala Ala Ala Pro 1 5 10 15 Leu Ala Gly Arg Arg Pro Arg Leu Leu Pro Leu Ser Pro Pro Pro Ser 20 25 30 Thr Pro Ser Ile Gln Ile Gln Asn Arg Leu Tyr Ser Met Ser Leu Leu 35 40 45 Pro Leu Arg Lys Ala Arg Gly Met Gly Arg Cys Glu Ala Ser Leu Ala 50 55 60 Ser Asn Tyr Thr Gln Thr Ser Glu Phe Ala Asp 
Leu Asp Trp Glu Asn 65 70 75 80 Leu Gly Phe Gly Leu Val Gln Thr Asp Tyr Met Tyr Thr Ala Lys Cys 85 90 95 Gly Pro Asp Gly Asn Phe Asp Lys Gly Gly Met Val Pro Phe Gly Pro 100 105 110 Ile Glu Met Asn Pro Ala Ser Gly Val Leu Asn Tyr Gly Gln Gly Leu 115 120 125 Phe Glu Gly Leu Lys Ala Tyr Arg Lys Thr Asp Gly Ser Ile Leu Leu 130 135 140 Phe Arg Pro Met Glu Asn Ala Met Arg Met Gln Thr Gly Ala Glu Arg 145 150 155 160 Met Cys Met Pro Ala Pro Pro Val Glu Gln Phe Val Asn Ala Val Lys 165 170 175 Gln Thr Val Leu Ala Asn Lys Arg Trp Val Pro Pro Thr Gly Lys Gly 180 185 190 Ser Leu Tyr Ile Arg Pro Leu Leu Val Gly Ser Gly Ala Val Leu Gly 195 200 205 Leu Ala Pro Ala Pro Glu Tyr Thr Phe Ile Ile Phe Ala Ser Pro Val 210 215 220 Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn Leu Ile Val Glu 225 230 235 240 Asp Lys Phe His Arg Ala Thr Pro Gly Gly Thr Gly Gly Val Lys Thr 245 250 255 Ile Gly Asn Tyr Ala Ser Val Leu Met Ala Gln Lys Ile Ala Lys Glu 260 265 270 Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Ala Val Glu Lys Lys Tyr 275 280 285 Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val Lys Gly Asn Val 290 295 300 Ile Ser Thr Pro Ala Ile Lys Gly Thr Ile Leu Pro Gly Ile Thr Arg 305 310 315 320 Lys Ser Ile Ile Asp Val Ala Leu Ser Lys Gly Phe Gln Val Glu Glu 325 330 335 Arg Pro Arg Val Arg Gly 340 1711194DNAHordeum vulgare 171atggctgtgc tgtcttctgc gaagcgcgtc ctcccgtgcg cctcggccgg cggggtcagc 60ggcggcctcc gagctctact cgggacggac ggaggcggcc gctctcttct cccgtcccgg 120tggaagtcgt cgctgccgca gctggacccc gtcgacaggt ccgacgagga gagcggcggc 180gacatcgact gggacaacct cggcttcggg ctcaccccga cggactacat gtacgtcatg 240cggtgctcgc gggaggaggg cggcttctcc cgcggcgagc tcgcccgcta cggcaacatc 300gagctcagcc cctcctccgg cgtcctcaac tacggccagg ggctgttcga ggggctgaag 360gcgtacaggc ggtcggacgg ggccgggtac atgctgttcc ggccggagga gaacgcgcgg 420cggatgcagc acggcgcggg ccgcatgtgc atgccgtccc cgtctgtcga gcagttcgtg 480cacgccgtca agcagaccgt cctcgccaac aggcgctggg tgccgccgca aggcaaggga 540gcgctgtaca tcaggccgct gctcatcggg agcggggcga tcctcgggct ggcgccggcc 
600cccgagtaca ccttcatgat ctacgccgcg cctgtgggga catatttcaa ggaaggcatg 660gcggcgataa acctgctggt cgaggaggag atccaccgcg cgatgccggg cggcaccggc 720ggggtcaaga ccatctccaa ctacgcgccg gtgctcaagc cgcagatgga cgcgaaaagc 780aaggggttcg cggacgtgct gtacctggac gcggtccaca agaggtacgt cgaggaggcc 840tcctcctgca acctcttcgt cgtcaagggc ggcgccgtcg cgacgccggc gacgacggca 900gggaccatcc tgccgggtgt cacgcgcagg agcatcatcg agctcgccag ggatgacggc 960taccaggtcg aagagcgcct cgtctccatc gacgatctcg tcggcgcaga cgaagtgttc 1020tgcacgggaa cggccgtcgg cgtcaccccg gtgtcgacca tcacctacca agggacaagg 1080cacgagttca ggactgggga agacacgttg tcgaggaaat tgtacacgac tctcacatcg 1140atccagatgg ggctggcaga ggacaagaaa ggatggacgg tagcgattga ttga 1194172397PRTHordeum vulgare 172Met Ala Val Leu Ser Ser Ala Lys Arg Val Leu Pro Cys Ala Ser Ala 1 5 10 15 Gly Gly Val Ser Gly Gly Leu Arg Ala Leu Leu Gly Thr Asp Gly Gly 20 25 30 Gly Arg Ser Leu Leu Pro Ser Arg Trp Lys Ser Ser Leu Pro Gln Leu 35 40 45 Asp Pro Val Asp Arg Ser Asp Glu Glu Ser Gly Gly Asp Ile Asp Trp 50 55 60 Asp Asn Leu Gly Phe Gly Leu Thr Pro Thr Asp Tyr Met Tyr Val Met 65 70 75 80 Arg Cys Ser Arg Glu Glu Gly Gly Phe Ser Arg Gly Glu Leu Ala Arg 85 90 95 Tyr Gly Asn Ile Glu Leu Ser Pro Ser Ser Gly Val Leu Asn Tyr Gly 100 105 110 Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Arg Ser Asp Gly Ala 115 120 125 Gly Tyr Met Leu Phe Arg Pro Glu Glu Asn Ala Arg Arg Met Gln His 130 135 140 Gly Ala Gly Arg Met Cys Met Pro Ser Pro Ser Val Glu Gln Phe Val 145 150 155 160 His Ala Val Lys Gln Thr Val Leu Ala Asn Arg Arg Trp Val Pro Pro 165 170 175 Gln Gly Lys Gly Ala Leu Tyr Ile Arg Pro Leu Leu Ile Gly Ser Gly 180 185 190 Ala Ile Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Met Ile Tyr 195 200 205 Ala Ala Pro Val Gly Thr Tyr Phe Lys Glu Gly Met Ala Ala Ile Asn 210 215 220 Leu Leu Val Glu Glu Glu Ile His Arg Ala Met Pro Gly Gly Thr Gly 225 230 235 240 Gly Val Lys Thr Ile Ser Asn Tyr Ala Pro Val Leu Lys Pro Gln Met 245 250 255 Asp Ala Lys Ser Lys Gly Phe Ala Asp Val Leu Tyr Leu Asp Ala Val 260 
265 270 His Lys Arg Tyr Val Glu Glu Ala Ser Ser Cys Asn Leu Phe Val Val 275 280 285 Lys Gly Gly Ala Val Ala Thr Pro Ala Thr Thr Ala Gly Thr Ile Leu 290 295 300 Pro Gly Val Thr Arg Arg Ser Ile Ile Glu Leu Ala Arg Asp Asp Gly 305 310 315 320 Tyr Gln Val Glu Glu Arg Leu Val Ser Ile Asp Asp Leu Val Gly Ala 325 330 335 Asp Glu Val Phe Cys Thr Gly Thr Ala Val Gly Val Thr Pro Val Ser 340 345 350 Thr Ile Thr Tyr Gln Gly Thr Arg His Glu Phe Arg Thr Gly Glu Asp 355 360 365 Thr Leu Ser Arg Lys Leu Tyr Thr Thr Leu Thr Ser Ile Gln Met Gly 370 375 380 Leu Ala Glu Asp Lys Lys Gly Trp Thr Val Ala Ile Asp 385 390 395 1731098DNAHordeum vulgare 173atgccgactc tccatcacaa ggcccatacc acagtaggat gccaggcttc tgtagcctct 60aaatacatgg aaacacctga gatagtcgat ttggactggg aaaaccttgg ctttggcctt 120gtcaataccg actttatgta catggccaaa tgtgggccag atgggaactt ttccaaagga 180gaaattctgc catttggacc catagcacta agcccgtctg ctggagtctt aaattatgga 240cagggactgt ttgagggcct aaaagcatat aggaaaactg atggttctgt cctattattc 300cgtccggagg agaatgccgt acggatgaag aatggttcag ataggatgtg catgcctgca 360ccgactgttg agcagttcgt ggacgcagtg aaacaaaccg ttttggcaaa taaaagatgg 420gtgcctccta ctggtaaagg ttccttgtat atcaggccac tacttattgg aagcggggct 480attcttggtc ttgcacctgc tcctgagtac accttcctta tttatgtctc acctgttgga 540aactatttca aggaaggttt agctcctatt aacttgatta ttgaagataa ctttcaccgt 600gcggcccctg gtggaactgg aggcgtgaaa accattggaa actatgcctc ggtgttgaaa 660gcacagagaa ccgcaaagga gaaaggatat tctgatgtcc tctatttgga cgccgttcac 720aacaaatatc tggaagaagt ttcttcgtgc aatattttcg ttgtgaaagg caatgctatt 780tgcactccag caatagaagg aacgatactg cctggtatca caaggaaaag tatcatcgaa 840gtagccgaga gcaaaggcta caaggtggag gaacgccatg tgtccgtaga cgaactgctt 900gacgctgacg aagttttctg cacgggaaca gctgttgtgg tttcacccgt ggggagtatt 960acctataagg ggaaaagggt aaaatacgac ggcaaccaag gagtcggtgt ggtgtcgcag 1020cagctctaca cctcgctgac gagcctccag atgggtcatg cagaggaccc gatgggctgg 1080accgtgcaac tgaattaa 1098174365PRTHordeum vulgare 174Met Pro Thr Leu His His Lys Ala His Thr Thr Val Gly Cys Gln Ala 1 5 
10 15 Ser Val Ala Ser Lys Tyr Met Glu Thr Pro Glu Ile Val Asp Leu Asp 20 25 30 Trp Glu Asn Leu Gly Phe Gly Leu Val Asn Thr Asp Phe Met Tyr Met 35 40 45 Ala Lys Cys Gly Pro Asp Gly Asn Phe Ser Lys Gly Glu Ile Leu Pro 50 55 60 Phe Gly Pro Ile Ala Leu Ser Pro Ser Ala Gly Val Leu Asn Tyr Gly 65 70 75 80 Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Thr Asp Gly Ser 85 90 95 Val Leu Leu Phe Arg Pro Glu Glu Asn Ala Val Arg Met Lys Asn Gly 100 105 110 Ser Asp Arg Met Cys Met Pro Ala Pro Thr Val Glu Gln Phe Val Asp 115 120 125 Ala Val Lys Gln Thr Val Leu Ala Asn Lys Arg Trp Val Pro Pro Thr 130 135 140 Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Ile Gly Ser Gly Ala 145 150 155 160 Ile Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Val 165 170 175 Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn Leu 180 185 190 Ile Ile Glu Asp Asn Phe His Arg Ala Ala Pro Gly Gly Thr Gly Gly 195 200 205 Val Lys Thr Ile Gly Asn Tyr Ala Ser Val Leu Lys Ala Gln Arg Thr 210 215 220 Ala Lys Glu Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Ala Val His 225 230 235 240 Asn Lys Tyr Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val Lys 245 250 255 Gly Asn Ala Ile Cys Thr Pro Ala Ile Glu Gly Thr Ile Leu Pro Gly 260 265 270 Ile Thr Arg Lys Ser Ile Ile Glu Val Ala Glu Ser Lys Gly Tyr Lys 275 280 285 Val Glu Glu Arg His Val Ser Val Asp Glu Leu Leu Asp Ala Asp Glu 290 295 300 Val Phe Cys Thr Gly Thr Ala Val Val Val Ser Pro Val Gly Ser Ile 305 310 315 320 Thr Tyr Lys Gly Lys Arg Val Lys Tyr Asp Gly Asn Gln Gly Val Gly 325 330 335 Val Val Ser Gln Gln Leu Tyr Thr Ser Leu Thr Ser Leu Gln Met Gly 340 345 350 His Ala Glu Asp Pro Met Gly Trp Thr Val Gln Leu Asn 355 360 365 1751098DNAHordeum vulgare 175atgccgactc tccatcacaa ggcccatacc acagtaggat gccaggcttc tgtagcctct 60aaatacatgg aaacacctga gatagtcgat ttggactggg aaaaccttgg ctttggcctt 120gtcaataccg actttatgta catggccaaa tgtgggccag atgggaactt ttccaaagga 180gaaattctgc catttggacc catagcacta agcccgtctg ctggagtctt aaattatgga 240cagggactgt ttgagggcct 
aaaagcatat aggaaaactg atggttctgt cctattattc 300cgtccggagg agaatgccgt acggatgaag aatggttcag ataggatgtg catgcctgca 360ccgactgttg agcagttcgt ggacgcagtg aaacaaaccg ttttggcaaa taaaagatgg 420gtgcctccta ctggtaaagg ttccttgtat atcaggccac tacttattgg aagcggggct 480attcttggtc ttgcacctgc tcctgagtac accttcctta tttatgtctc acctgttgga 540aactatttca aggaaggttt agctcctatt aacttgatta ttgaagataa ctttcaccgt 600gcggcccctg gtggaactgg aggcgtgaaa accattggaa actatgcctc ggtgttgaaa 660gcacagagaa ccgcaaagga gaaaggatat tctgatgtcc tctatttgga cgccgttcac 720aacaaatatc tggaagaagt ttcttcgtgc aatattttcg ttgtgaaagg caatgctatt 780tgcactccag caatagaagg aacgatactg cctggtatca caaggaaaag tatcatcgaa 840gtagccgaga gcaaaggcta caaggtggag gaacgccatg tgtccgtaga cgaactgctt 900gacgctgacg aagttttctg cacgggaaca gctgttgtgg tttcacccgt ggggagtatt 960acctataagg ggaaaagggt aaaatacgac ggcaaccaag gagtcggtgt ggtgtcgcag 1020cagctctaca cctcgctgac gagcctccag atgggtcatg cagagggccc gatgggctgg 1080accgtgcaac tgaattaa 1098176365PRTHordeum vulgare 176Met Pro Thr Leu His His Lys Ala His Thr
Thr Val Gly Cys Gln Ala 1 5 10 15 Ser Val Ala Ser Lys Tyr Met Glu Thr Pro Glu Ile Val Asp Leu Asp 20 25 30 Trp Glu Asn Leu Gly Phe Gly Leu Val Asn Thr Asp Phe Met Tyr Met 35 40 45 Ala Lys Cys Gly Pro Asp Gly Asn Phe Ser Lys Gly Glu Ile Leu Pro 50 55 60 Phe Gly Pro Ile Ala Leu Ser Pro Ser Ala Gly Val Leu Asn Tyr Gly 65 70 75 80 Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Thr Asp Gly Ser 85 90 95 Val Leu Leu Phe Arg Pro Glu Glu Asn Ala Val Arg Met Lys Asn Gly 100 105 110 Ser Asp Arg Met Cys Met Pro Ala Pro Thr Val Glu Gln Phe Val Asp 115 120 125 Ala Val Lys Gln Thr Val Leu Ala Asn Lys Arg Trp Val Pro Pro Thr 130 135 140 Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Ile Gly Ser Gly Ala 145 150 155 160 Ile Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Val 165 170 175 Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn Leu 180 185 190 Ile Ile Glu Asp Asn Phe His Arg Ala Ala Pro Gly Gly Thr Gly Gly 195 200 205 Val Lys Thr Ile Gly Asn Tyr Ala Ser Val Leu Lys Ala Gln Arg Thr 210 215 220 Ala Lys Glu Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Ala Val His 225 230 235 240 Asn Lys Tyr Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val Lys 245 250 255 Gly Asn Ala Ile Cys Thr Pro Ala Ile Glu Gly Thr Ile Leu Pro Gly 260 265 270 Ile Thr Arg Lys Ser Ile Ile Glu Val Ala Glu Ser Lys Gly Tyr Lys 275 280 285 Val Glu Glu Arg His Val Ser Val Asp Glu Leu Leu Asp Ala Asp Glu 290 295 300 Val Phe Cys Thr Gly Thr Ala Val Val Val Ser Pro Val Gly Ser Ile 305 310 315 320 Thr Tyr Lys Gly Lys Arg Val Lys Tyr Asp Gly Asn Gln Gly Val Gly 325 330 335 Val Val Ser Gln Gln Leu Tyr Thr Ser Leu Thr Ser Leu Gln Met Gly 340 345 350 His Ala Glu Gly Pro Met Gly Trp Thr Val Gln Leu Asn 355 360 365 1771074DNAMedicago truncatula 177atggctcctc cttctatttt aagggacact gaagatggtt ctgaaagtga tatgggtgaa 60aattatgctg acatcaattg ggaaggactt agttttagtc tgactcaaac agattacatg 120catgtcatga aatgcacaaa aggagaaaag ttttctcaag gatccctcat tcgctacgga 180aacattgaga taagcccggc tgctggtatc ataaactatg gacagggaat cttcgaggga 
240ctaaaagcat atagaacaga agatgggcga atccttcttt tccgaccgga ggagaatgct 300ctacgcatga agatgggggc tgataggttg tgtatgccgt caccatcggt tgagcagttt 360gttgatgctg ttaagcaaac agttcttgcc aataaacgtt gggtacctcc tccagggaaa 420gggacgcttt atcttaggcc tttgctgatg ggaacaggag ctgcattagg cctggctcca 480tcacctgagt acacatttct catttattgc tcccctgttg gaaagtatca cgagggagga 540agactaaact taaaagtgga ggataaattt catcgatcaa tagctggcag cggtggaaca 600ggaggaatca agagtgttac taattatgcc ccaatatata ctgcagtaac tgaagcaaaa 660gccaatggat tttctgatgt cttgttcttg gattcagcaa ctggtaaaaa tattgaggag 720gctactgcgt gcaatatatt tgttgtgaag gaaaatgata tcttcactcc ggcaatagat 780ggatctattc tgcctggggt cacacgaaaa tccatcatag acattgccat tgatttgggt 840tataaggtca tagaacgttc catatcagtg gaggaaatga tgagcgctga tgaagtgttc 900tgcacaggaa ctgcagtggt tgttacctct gttgcatctg taacatataa ggaaacaaga 960gctgaatata aaacaggcgc agaaacgttg tctcaaaaac tacaaggaat actggttgga 1020atacaaacag ggtgtattga ggacaaaaag tcatggacag tccaagtaga ttga 1074178357PRTMedicago truncatula 178Met Ala Pro Pro Ser Ile Leu Arg Asp Thr Glu Asp Gly Ser Glu Ser 1 5 10 15 Asp Met Gly Glu Asn Tyr Ala Asp Ile Asn Trp Glu Gly Leu Ser Phe 20 25 30 Ser Leu Thr Gln Thr Asp Tyr Met His Val Met Lys Cys Thr Lys Gly 35 40 45 Glu Lys Phe Ser Gln Gly Ser Leu Ile Arg Tyr Gly Asn Ile Glu Ile 50 55 60 Ser Pro Ala Ala Gly Ile Ile Asn Tyr Gly Gln Gly Ile Phe Glu Gly 65 70 75 80 Leu Lys Ala Tyr Arg Thr Glu Asp Gly Arg Ile Leu Leu Phe Arg Pro 85 90 95 Glu Glu Asn Ala Leu Arg Met Lys Met Gly Ala Asp Arg Leu Cys Met 100 105 110 Pro Ser Pro Ser Val Glu Gln Phe Val Asp Ala Val Lys Gln Thr Val 115 120 125 Leu Ala Asn Lys Arg Trp Val Pro Pro Pro Gly Lys Gly Thr Leu Tyr 130 135 140 Leu Arg Pro Leu Leu Met Gly Thr Gly Ala Ala Leu Gly Leu Ala Pro 145 150 155 160 Ser Pro Glu Tyr Thr Phe Leu Ile Tyr Cys Ser Pro Val Gly Lys Tyr 165 170 175 His Glu Gly Gly Arg Leu Asn Leu Lys Val Glu Asp Lys Phe His Arg 180 185 190 Ser Ile Ala Gly Ser Gly Gly Thr Gly Gly Ile Lys Ser Val Thr Asn 195 200 205 Tyr Ala Pro Ile Tyr Thr 
Ala Val Thr Glu Ala Lys Ala Asn Gly Phe 210 215 220 Ser Asp Val Leu Phe Leu Asp Ser Ala Thr Gly Lys Asn Ile Glu Glu 225 230 235 240 Ala Thr Ala Cys Asn Ile Phe Val Val Lys Glu Asn Asp Ile Phe Thr 245 250 255 Pro Ala Ile Asp Gly Ser Ile Leu Pro Gly Val Thr Arg Lys Ser Ile 260 265 270 Ile Asp Ile Ala Ile Asp Leu Gly Tyr Lys Val Ile Glu Arg Ser Ile 275 280 285 Ser Val Glu Glu Met Met Ser Ala Asp Glu Val Phe Cys Thr Gly Thr 290 295 300 Ala Val Val Val Thr Ser Val Ala Ser Val Thr Tyr Lys Glu Thr Arg 305 310 315 320 Ala Glu Tyr Lys Thr Gly Ala Glu Thr Leu Ser Gln Lys Leu Gln Gly 325 330 335 Ile Leu Val Gly Ile Gln Thr Gly Cys Ile Glu Asp Lys Lys Ser Trp 340 345 350 Thr Val Gln Val Asp 355 1791077DNAMedicago truncatula 179atggcaacat cccatcaact acccaacaat ggtaaagctt ccaacaggga gactgaaaaa 60atatatgcca atatggattg ggacaaactt acatgtggag tgattccaac tgattatatg 120tacataatta aatccaatga agaccgaacc tattcaaacg gtactctcgt gccttttgga 180accattgata tcaacccaca ttctgctgtt ataaattatg gacagggatt atttgagggc 240atgaaggctt acagaacaaa agacggcaat gtgcaactat tccgaccgga agaaaatgcg 300ctgcgcatgc agatgggagc agagaggctg ctgatgccat caccttctgt tgagcagtac 360attgatgctg taaaacaagt tgttcatgca aataaacgtt gggtgcctcc ttggggaaaa 420ggaacattgt acattaggcc tttactattt ggaagcggac ctgttctggg tattggacca 480gcacctcaat gcaccctctt aatattcact aatccaatta gcaacattta caagggacaa 540acatcagcct tgaatttgtt gattaatgaa aactttcctc gtgcatatcc tggtggaact 600ggtggagtaa aaagtattag taattatcca cttgttttcc aagttgtaaa agaagcaaaa 660gccaaaggat tttccgatgt gctttttcta gatgcagtgg aacataaata cattgaagag 720gtatcttcgt gtaatgcttt cattgtgaag ggtaaggttc tttcaactgc acctacactt 780ggaactattc ttcctggagt cacaaggaaa agtgtcattg aacttgcacg tgatttgggt 840tacgaggtga tggaacgcaa ggtctcggta gaagaactgc ttgaagctga tgaggttttc 900tgcactggaa ctgctgttgg gatttctgct gttggaagtg taacatacaa gaataaaagg 960tgtgttacgt tcaaaacagg ggcagatact gtgactaaga agttgtatga tttgattacg 1020ggcatccaga caggtctctt ggaagataag aaaggatggg tggtcaagat tgattga 1077180358PRTMedicago truncatula 
180Met Ala Thr Ser His Gln Leu Pro Asn Asn Gly Lys Ala Ser Asn Arg 1 5 10 15 Glu Thr Glu Lys Ile Tyr Ala Asn Met Asp Trp Asp Lys Leu Thr Cys 20 25 30 Gly Val Ile Pro Thr Asp Tyr Met Tyr Ile Ile Lys Ser Asn Glu Asp 35 40 45 Arg Thr Tyr Ser Asn Gly Thr Leu Val Pro Phe Gly Thr Ile Asp Ile 50 55 60 Asn Pro His Ser Ala Val Ile Asn Tyr Gly Gln Gly Leu Phe Glu Gly 65 70 75 80 Met Lys Ala Tyr Arg Thr Lys Asp Gly Asn Val Gln Leu Phe Arg Pro 85 90 95 Glu Glu Asn Ala Leu Arg Met Gln Met Gly Ala Glu Arg Leu Leu Met 100 105 110 Pro Ser Pro Ser Val Glu Gln Tyr Ile Asp Ala Val Lys Gln Val Val 115 120 125 His Ala Asn Lys Arg Trp Val Pro Pro Trp Gly Lys Gly Thr Leu Tyr 130 135 140 Ile Arg Pro Leu Leu Phe Gly Ser Gly Pro Val Leu Gly Ile Gly Pro 145 150 155 160 Ala Pro Gln Cys Thr Leu Leu Ile Phe Thr Asn Pro Ile Ser Asn Ile 165 170 175 Tyr Lys Gly Gln Thr Ser Ala Leu Asn Leu Leu Ile Asn Glu Asn Phe 180 185 190 Pro Arg Ala Tyr Pro Gly Gly Thr Gly Gly Val Lys Ser Ile Ser Asn 195 200 205 Tyr Pro Leu Val Phe Gln Val Val Lys Glu Ala Lys Ala Lys Gly Phe 210 215 220 Ser Asp Val Leu Phe Leu Asp Ala Val Glu His Lys Tyr Ile Glu Glu 225 230 235 240 Val Ser Ser Cys Asn Ala Phe Ile Val Lys Gly Lys Val Leu Ser Thr 245 250 255 Ala Pro Thr Leu Gly Thr Ile Leu Pro Gly Val Thr Arg Lys Ser Val 260 265 270 Ile Glu Leu Ala Arg Asp Leu Gly Tyr Glu Val Met Glu Arg Lys Val 275 280 285 Ser Val Glu Glu Leu Leu Glu Ala Asp Glu Val Phe Cys Thr Gly Thr 290 295 300 Ala Val Gly Ile Ser Ala Val Gly Ser Val Thr Tyr Lys Asn Lys Arg 305 310 315 320 Cys Val Thr Phe Lys Thr Gly Ala Asp Thr Val Thr Lys Lys Leu Tyr 325 330 335 Asp Leu Ile Thr Gly Ile Gln Thr Gly Leu Leu Glu Asp Lys Lys Gly 340 345 350 Trp Val Val Lys Ile Asp 355 1811227DNAMedicago truncatula 181atggagagca gcgccgcact aactagcatt cgactcactt cctcgatccg tccttcccgt 60ttttcttccc cttttctttc ccccgcattt cctcccaaac ccacttctct atccctcaag 120ctccaaaagc agtttccttt cacttcccag aatgttctcc aagcttctaa tgctctcaga 180ccttctgctt ctgtttctgc tagtgaggcg attgagttgg cagacataga 
ttgggacaac 240ctaggatttg gtcttcagcc tactgattat atgtatttca tgaaatgtga tcaaggtgga 300accttttcta agggtgaatt aaagcgtttt gggaacattg aattgaaccc ttctgctggt 360gttttaaact atggacaggg attatttgag ggtttgaaag cttaccgtaa agatgatggg 420aacatactcc tctttcgtcc ggaagaaaac gctttacgga tgaagacggg tgcagagcga 480atgtgcatgc catcacctag tgtagaacag ttcgtggaag ctgtgaaaga tactgtttta 540gcaaacaaac gttggatccc ccctcagggt aaaggttcat tgtatattag acctttgcta 600atgggaagtg gagctgtact tgggcttgca cctgctccag agtacacctt tctaatatat 660gtttcacctg ttgggaacta cttcaaggaa ggtttggctc caatcaattt gattgtggag 720agtgaactac atcgtgcaac tcccggtggc actggaggtg tgaagaccat tggaaactat 780gctgcagttc ttaaggcaca gtctgcagcc aaggcgaaag gctactctga tgttttgtac 840cttgactgcg tgcacaaaag atatttggag gaggtttctt cctgcaatat atttgttgtt 900aagggtaatg ttatttcaac tccatccatc aaagggacta tcctgcctgg cattactcga 960aagagtataa ttgacgttgc tcgaagccaa ggattcgagg ttgaggagcg attagtggca 1020gtggacgaat tgctcgaggc agacgaggtc ttctgcacag gaacagctgt ggttgtatca 1080cctgttggca gtattacata tcttggcgag aagaaatctt atggagatgg tgttggagca 1140gtttcacagc aactttatac tggccttacc agactacaga tgggtcttgc agaggataac 1200atgaattgga ctgttgagct gagataa 1227182408PRTMedicago truncatula 182Met Glu Ser Ser Ala Ala Leu Thr Ser Ile Arg Leu Thr Ser Ser Ile 1 5 10 15 Arg Pro Ser Arg Phe Ser Ser Pro Phe Leu Ser Pro Ala Phe Pro Pro 20 25 30 Lys Pro Thr Ser Leu Ser Leu Lys Leu Gln Lys Gln Phe Pro Phe Thr 35 40 45 Ser Gln Asn Val Leu Gln Ala Ser Asn Ala Leu Arg Pro Ser Ala Ser 50 55 60 Val Ser Ala Ser Glu Ala Ile Glu Leu Ala Asp Ile Asp Trp Asp Asn 65 70 75 80 Leu Gly Phe Gly Leu Gln Pro Thr Asp Tyr Met Tyr Phe Met Lys Cys 85 90 95 Asp Gln Gly Gly Thr Phe Ser Lys Gly Glu Leu Lys Arg Phe Gly Asn 100 105 110 Ile Glu Leu Asn Pro Ser Ala Gly Val Leu Asn Tyr Gly Gln Gly Leu 115 120 125 Phe Glu Gly Leu Lys Ala Tyr Arg Lys Asp Asp Gly Asn Ile Leu Leu 130 135 140 Phe Arg Pro Glu Glu Asn Ala Leu Arg Met Lys Thr Gly Ala Glu Arg 145 150 155 160 Met Cys Met Pro Ser Pro Ser Val Glu Gln Phe Val Glu Ala Val 
Lys 165 170 175 Asp Thr Val Leu Ala Asn Lys Arg Trp Ile Pro Pro Gln Gly Lys Gly 180 185 190 Ser Leu Tyr Ile Arg Pro Leu Leu Met Gly Ser Gly Ala Val Leu Gly 195 200 205 Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Val Ser Pro Val 210 215 220 Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn Leu Ile Val Glu 225 230 235 240 Ser Glu Leu His Arg Ala Thr Pro Gly Gly Thr Gly Gly Val Lys Thr 245 250 255 Ile Gly Asn Tyr Ala Ala Val Leu Lys Ala Gln Ser Ala Ala Lys Ala 260 265 270 Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Cys Val His Lys Arg Tyr 275 280 285 Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val Lys Gly Asn Val 290 295 300 Ile Ser Thr Pro Ser Ile Lys Gly Thr Ile Leu Pro Gly Ile Thr Arg 305 310 315 320 Lys Ser Ile Ile Asp Val Ala Arg Ser Gln Gly Phe Glu Val Glu Glu 325 330 335 Arg Leu Val Ala Val Asp Glu Leu Leu Glu Ala Asp Glu Val Phe Cys 340 345 350 Thr Gly Thr Ala Val Val Val Ser Pro Val Gly Ser Ile Thr Tyr Leu 355 360 365 Gly Glu Lys Lys Ser Tyr Gly Asp Gly Val Gly Ala Val Ser Gln Gln 370 375 380 Leu Tyr Thr Gly Leu Thr Arg Leu Gln Met Gly Leu Ala Glu Asp Asn 385 390 395 400 Met Asn Trp Thr Val Glu Leu Arg 405 1831251DNAOryza sativa 183atggctgctg ctgctgctgc tgcgtcgtcc gcgaagcgcg cgctcctccc gtgggcacgc 60gacgcccacc acgcgctggc cagggccctg cagggatgcg gcggcggcgg cggcctcggt 120ctccgcgggg cgctcccgac ggccggaggc aggtggtctc tgctccagtg ccggtggagg 180tcgtcgctgc cgcagctcga ctccgccgac aggtccgatg aggaaagcgg cggcgaaatc 240gactgggaca acctggggtt cgggctgacg ccgaccgact acatgtacgt catgcggtgc 300tcgctggagg acggcgtctt ctcccgcggc gagctcagcc gctacggcaa catcgagctc 360agcccctcct ccggcgtcat caactacggc caggggctct tcgagggtct gaaggcgtac 420agggcggcga accaacaggg gtcgtacatg ctgttccggc cggaggagaa cgcgcggcgg 480atgcagcacg gcgccgagcg catgtgcatg ccgtcgccgt cggtggagca gttcgtccac 540gccgtcaagc agaccgtcct cgccaaccgc cgctgggtgc caccgcaagg aaagggggcg 600ctgtacatca ggccgctgct catcgggagc ggaccgattc tcgggctggc tcccgccccg 660gagtacacgt tcctcatcta cgccgcaccg gttggaacgt acttcaagga gggtctagcg 720ccgataaacc 
ttgtcgtaga ggactcgata caccgggcca tgccgggcgg caccggcggg 780gtcaagacga tcaccaacta cgcgccggtg ctcaaggcgc agatggacgc caagagcaga 840gggttcactg acgtgctgta cctcgacgcg gtgcacaaga cgtacctgga ggaggcctcc 900tcctgcaacc tcttcatcgt caaggacggc gtcgtcgcca cgccggccac cgtgggaacc 960atcctgccgg ggatcacgcg caagagcgtc atcgagctcg ccagggaccg cggctatcag 1020caggttgaag aacggctcgt ctccatcgac gatctggtcg gcgcagacga ggtgttctgc 1080accggaacag cggtggtcgt tgccccagta tcgagtgtta cttaccatgg gcaaaggtac 1140gagttcagga ctggacatga cacgttatcg cagacactgc acacgactct gacgtccatc 1200cagatgggcc tggctgagga caagaaagga tggacagtgg caatagatta a 1251184416PRTOryza sativa 184Met Ala Ala Ala Ala Ala Ala Ala Ser Ser Ala Lys Arg Ala Leu Leu 1 5 10 15 Pro Trp Ala Arg Asp Ala His His Ala Leu Ala Arg Ala Leu Gln Gly 20 25 30 Cys Gly Gly Gly Gly Gly Leu Gly Leu Arg Gly Ala Leu Pro Thr Ala 35 40 45 Gly Gly Arg Trp Ser Leu Leu Gln Cys Arg Trp Arg Ser Ser Leu Pro 50 55 60 Gln Leu Asp Ser Ala Asp Arg Ser Asp Glu Glu Ser Gly Gly Glu Ile 65
70 75 80 Asp Trp Asp Asn Leu Gly Phe Gly Leu Thr Pro Thr Asp Tyr Met Tyr 85 90 95 Val Met Arg Cys Ser Leu Glu Asp Gly Val Phe Ser Arg Gly Glu Leu 100 105 110 Ser Arg Tyr Gly Asn Ile Glu Leu Ser Pro Ser Ser Gly Val Ile Asn 115 120 125 Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Ala Ala Asn 130 135 140 Gln Gln Gly Ser Tyr Met Leu Phe Arg Pro Glu Glu Asn Ala Arg Arg 145 150 155 160 Met Gln His Gly Ala Glu Arg Met Cys Met Pro Ser Pro Ser Val Glu 165 170 175 Gln Phe Val His Ala Val Lys Gln Thr Val Leu Ala Asn Arg Arg Trp 180 185 190 Val Pro Pro Gln Gly Lys Gly Ala Leu Tyr Ile Arg Pro Leu Leu Ile 195 200 205 Gly Ser Gly Pro Ile Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe 210 215 220 Leu Ile Tyr Ala Ala Pro Val Gly Thr Tyr Phe Lys Glu Gly Leu Ala 225 230 235 240 Pro Ile Asn Leu Val Val Glu Asp Ser Ile His Arg Ala Met Pro Gly 245 250 255 Gly Thr Gly Gly Val Lys Thr Ile Thr Asn Tyr Ala Pro Val Leu Lys 260 265 270 Ala Gln Met Asp Ala Lys Ser Arg Gly Phe Thr Asp Val Leu Tyr Leu 275 280 285 Asp Ala Val His Lys Thr Tyr Leu Glu Glu Ala Ser Ser Cys Asn Leu 290 295 300 Phe Ile Val Lys Asp Gly Val Val Ala Thr Pro Ala Thr Val Gly Thr 305 310 315 320 Ile Leu Pro Gly Ile Thr Arg Lys Ser Val Ile Glu Leu Ala Arg Asp 325 330 335 Arg Gly Tyr Gln Gln Val Glu Glu Arg Leu Val Ser Ile Asp Asp Leu 340 345 350 Val Gly Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Ala 355 360 365 Pro Val Ser Ser Val Thr Tyr His Gly Gln Arg Tyr Glu Phe Arg Thr 370 375 380 Gly His Asp Thr Leu Ser Gln Thr Leu His Thr Thr Leu Thr Ser Ile 385 390 395 400 Gln Met Gly Leu Ala Glu Asp Lys Lys Gly Trp Thr Val Ala Ile Asp 405 410 415 1851233DNAOryza sativa 185atggagctcc tcccgcgtgt gggtgtggcc gcccccggtc ccggacgcgg cggcgcgtcg 60ccgtccccga cgcgccgtca tcgcgcgccc tctcacccca ttctgaagcg atcggcggcg 120gtttgcggcg cagtcgccgt ctgcagagga ggggctgtcg ccaggaggag ccggtggtca 180actctggtga ccgcagcata ttacacagga actgctgaac tggtcgactt taactgggaa 240actcttgggt ttcaacccgt gccgactgac tttatgtatg tgatgagatg ttccgaggaa 300ggggtgttca 
ccaagggtga attggtgcca tatgggccaa tagaactgaa cccagcagct 360ggagtgttga attatggtca gggtttactt gaaggtctgc gagcacatag aaaagaggat 420ggatcagtcc ttctatttcg tcctgatgaa aatgctttac ggatgagagt aggcgcagac 480cggttatgta tgcctgcacc aagtgtagag cagttcctag aagctataaa gctaacaatt 540ttagcaaaca agcgctgggt accccctact ggcaaaggtt ctttatatat cagaccgctg 600ctgattggaa gtggggctat cctcggtgtt gcaccagccc cagagtacac atttgttgtc 660tttgcttgcc cagttgggca ctattttaag gatggcttat ctccaatcag cttgttaacc 720gaggaagaat atcagtgtgc ggcaccaggt ggaactggtg atataaagac tatcggaaat 780tatgcttcag ccgtttatgc taaagaaaga gctaaggaga gaggtcattc tgatgttctt 840tacttggatc cagtgcataa aaagtttgtt gaggaacttt cgtcctgtaa tatattcatg 900gtgaaggaca acattatttc tactccacta ttaacgggaa cagttcttcc tggcatcaca 960agaagaagta taattgaata cgcccgtagc cttggatttc aggttgaaga gtgtcttatt 1020acaatagatg agttgcttga cgctgatgaa gttttctgta ctggaacttc tgtggtacta 1080tcctctgttg gttgcatagt gtaccagggg agaagagtgg agtatgggaa ccagaagttc 1140agaactgtgt ctcagcaact ctattcagca cttacggcta tccagaaagg cctcgtggag 1200gacagtatgg gatggactgt gcaactgaat tag 1233186410PRTOryza sativa 186Met Glu Leu Leu Pro Arg Val Gly Val Ala Ala Pro Gly Pro Gly Arg 1 5 10 15 Gly Gly Ala Ser Pro Ser Pro Thr Arg Arg His Arg Ala Pro Ser His 20 25 30 Pro Ile Leu Lys Arg Ser Ala Ala Val Cys Gly Ala Val Ala Val Cys 35 40 45 Arg Gly Gly Ala Val Ala Arg Arg Ser Arg Trp Ser Thr Leu Val Thr 50 55 60 Ala Ala Tyr Tyr Thr Gly Thr Ala Glu Leu Val Asp Phe Asn Trp Glu 65 70 75 80 Thr Leu Gly Phe Gln Pro Val Pro Thr Asp Phe Met Tyr Val Met Arg 85 90 95 Cys Ser Glu Glu Gly Val Phe Thr Lys Gly Glu Leu Val Pro Tyr Gly 100 105 110 Pro Ile Glu Leu Asn Pro Ala Ala Gly Val Leu Asn Tyr Gly Gln Gly 115 120 125 Leu Leu Glu Gly Leu Arg Ala His Arg Lys Glu Asp Gly Ser Val Leu 130 135 140 Leu Phe Arg Pro Asp Glu Asn Ala Leu Arg Met Arg Val Gly Ala Asp 145 150 155 160 Arg Leu Cys Met Pro Ala Pro Ser Val Glu Gln Phe Leu Glu Ala Ile 165 170 175 Lys Leu Thr Ile Leu Ala Asn Lys Arg Trp Val Pro Pro Thr Gly Lys 180 185 190 Gly 
Ser Leu Tyr Ile Arg Pro Leu Leu Ile Gly Ser Gly Ala Ile Leu 195 200 205 Gly Val Ala Pro Ala Pro Glu Tyr Thr Phe Val Val Phe Ala Cys Pro 210 215 220 Val Gly His Tyr Phe Lys Asp Gly Leu Ser Pro Ile Ser Leu Leu Thr 225 230 235 240 Glu Glu Glu Tyr Gln Cys Ala Ala Pro Gly Gly Thr Gly Asp Ile Lys 245 250 255 Thr Ile Gly Asn Tyr Ala Ser Ala Val Tyr Ala Lys Glu Arg Ala Lys 260 265 270 Glu Arg Gly His Ser Asp Val Leu Tyr Leu Asp Pro Val His Lys Lys 275 280 285 Phe Val Glu Glu Leu Ser Ser Cys Asn Ile Phe Met Val Lys Asp Asn 290 295 300 Ile Ile Ser Thr Pro Leu Leu Thr Gly Thr Val Leu Pro Gly Ile Thr 305 310 315 320 Arg Arg Ser Ile Ile Glu Tyr Ala Arg Ser Leu Gly Phe Gln Val Glu 325 330 335 Glu Cys Leu Ile Thr Ile Asp Glu Leu Leu Asp Ala Asp Glu Val Phe 340 345 350 Cys Thr Gly Thr Ser Val Val Leu Ser Ser Val Gly Cys Ile Val Tyr 355 360 365 Gln Gly Arg Arg Val Glu Tyr Gly Asn Gln Lys Phe Arg Thr Val Ser 370 375 380 Gln Gln Leu Tyr Ser Ala Leu Thr Ala Ile Gln Lys Gly Leu Val Glu 385 390 395 400 Asp Ser Met Gly Trp Thr Val Gln Leu Asn 405 410 1871230DNAOryza sativa 187atggagtacg gtgcagcaac gcgtggcgcg ctcctcgcgg ccgccccgct ctccggcgcc 60cggcgtagct ggttgcccct ctcatcgccg ccgtcgccgc cctctattca gattcagaat 120cgactttatt cgatatcgtc gcttccacta aaggctcgag gcgtgagaag atgcgaggct 180tctctagcaa gtgactacac gaaggcatct gaggtagctg atttagattg ggagaacctt 240ggttttggaa tcgtgcagac cgactacatg tatatcacaa aatgcggaca ggacgggaat 300ttttctgagg gtgaaatgat tccatttgga cctatagcgc tgaacccatc ttctggagtc 360cttaattacg gacagggatt atttgaaggt ctaaaagcat atagaacaac agatgactct 420atcttattat ttcgcccgga ggaaaatgca ctgagaatga gaacaggtgc agaaagaatg 480tgcatgcctg cgcctagtgt tgagcagttt gtggatgcag taaagcaaac tgttttagca 540aacaagagat gggtgcctcc taccggtaaa ggttctttgt atattagacc gctactcatg 600ggtagtggtg ctgttcttgg tcttgcacct gctcctgagt atacgttcat tatatttgtc 660tcgcctgtgg ggaactactt taaggaaggt ttagctccaa taaatttgat agttgaagat 720aagtttcatc gtgcaacccc tggtggaact ggaagtgtga agaccatagg aaattatgcc 780tcggtcttga tggcacagaa gattgcaaaa 
gaaaagggct attctgatgt tctctacttg 840gatgctgttc acaaaaagta tcttgaagaa gtttcttcat gtaatatttt tgttgtcaag 900ggcaatgtca tttcaactcc agcagtaaaa ggaacaatat tgccaggcat cacaaggaaa 960agtatcattg atgttgctct gagcaagggt ttccaggtcg aggagcgact tgtgtcagta 1020gatgagctgc ttgaagctga tgaggttttc tgcacaggaa ctgctgtcgt agtgtctcct 1080gtgggtagta ttacctatca agggaaaagg gtcgaatatg ctggcaacaa aggagttggt 1140gtcgtgtctc agcagctata tacttcatta acaagcctgc agatgggcca ggcagaagat 1200tggctagtct ggactgtgca actgagttag 1230188409PRTOryza sativa 188Met Glu Tyr Gly Ala Ala Thr Arg Gly Ala Leu Leu Ala Ala Ala Pro 1 5 10 15 Leu Ser Gly Ala Arg Arg Ser Trp Leu Pro Leu Ser Ser Pro Pro Ser 20 25 30 Pro Pro Ser Ile Gln Ile Gln Asn Arg Leu Tyr Ser Ile Ser Ser Leu 35 40 45 Pro Leu Lys Ala Arg Gly Val Arg Arg Cys Glu Ala Ser Leu Ala Ser 50 55 60 Asp Tyr Thr Lys Ala Ser Glu Val Ala Asp Leu Asp Trp Glu Asn Leu 65 70 75 80 Gly Phe Gly Ile Val Gln Thr Asp Tyr Met Tyr Ile Thr Lys Cys Gly 85 90 95 Gln Asp Gly Asn Phe Ser Glu Gly Glu Met Ile Pro Phe Gly Pro Ile 100 105 110 Ala Leu Asn Pro Ser Ser Gly Val Leu Asn Tyr Gly Gln Gly Leu Phe 115 120 125 Glu Gly Leu Lys Ala Tyr Arg Thr Thr Asp Asp Ser Ile Leu Leu Phe 130 135 140 Arg Pro Glu Glu Asn Ala Leu Arg Met Arg Thr Gly Ala Glu Arg Met 145 150 155 160 Cys Met Pro Ala Pro Ser Val Glu Gln Phe Val Asp Ala Val Lys Gln 165 170 175 Thr Val Leu Ala Asn Lys Arg Trp Val Pro Pro Thr Gly Lys Gly Ser 180 185 190 Leu Tyr Ile Arg Pro Leu Leu Met Gly Ser Gly Ala Val Leu Gly Leu 195 200 205 Ala Pro Ala Pro Glu Tyr Thr Phe Ile Ile Phe Val Ser Pro Val Gly 210 215 220 Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn Leu Ile Val Glu Asp 225 230 235 240 Lys Phe His Arg Ala Thr Pro Gly Gly Thr Gly Ser Val Lys Thr Ile 245 250 255 Gly Asn Tyr Ala Ser Val Leu Met Ala Gln Lys Ile Ala Lys Glu Lys 260 265 270 Gly Tyr Ser Asp Val Leu Tyr Leu Asp Ala Val His Lys Lys Tyr Leu 275 280 285 Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val Lys Gly Asn Val Ile 290 295 300 Ser Thr Pro Ala Val Lys Gly Thr Ile Leu Pro 
Gly Ile Thr Arg Lys 305 310 315 320 Ser Ile Ile Asp Val Ala Leu Ser Lys Gly Phe Gln Val Glu Glu Arg 325 330 335 Leu Val Ser Val Asp Glu Leu Leu Glu Ala Asp Glu Val Phe Cys Thr 340 345 350 Gly Thr Ala Val Val Val Ser Pro Val Gly Ser Ile Thr Tyr Gln Gly 355 360 365 Lys Arg Val Glu Tyr Ala Gly Asn Lys Gly Val Gly Val Val Ser Gln 370 375 380 Gln Leu Tyr Thr Ser Leu Thr Ser Leu Gln Met Gly Gln Ala Glu Asp 385 390 395 400 Trp Leu Val Trp Thr Val Gln Leu Ser 405 1891218DNAOryza sativa 189atggagctcc acctcacctc ccgcggcgcc ctcccgctgt ctccgccgct cgccggccag 60cggcgtcctc acctctctct ctccacgccg tcgcttccga tcaagaatca cacttattca 120gtgccacctc ctttctccaa ggctcactgc gcgataggat gccaagcttc tctagcaact 180aactacatgg aaacctctgc ggtggctgat ttggactggg agaacctcgg ttttggcctt 240gtccagacag attttatgta tattgcaaaa tgcgggccag atgggaactt ttccaaagga 300gaaatggtac catttggacc tatagaactg agcccatctg ctggagtctt aaattatgga 360cagggcttgt ttgagggctt aaaggcatat agaaaaacag atggatacat tctgctgttt 420cgtccggagg agaatgccat aaggatgaga aatggtgcag agaggatgtg tatgcctgca 480ccaactcttg aacaatttgt ggatgcagta aagcaaaccg ttttggcaaa taaaagatgg 540gtgcccccaa ccggtaaagg ctccctgtat ataaggccgc tgcttatggg aagtggagct 600gtccttggtc ttgcacctgc tcctgagtat acctttatga tttttgtctc ccctgttggg 660aactatttca aggaaggttt agcccctatt aacttgatta tagaagaaaa ctttcaccgt 720gctgcccctg gtggaactgg cggagtgaaa accattggaa actatgcctc ggtattaaaa 780gcacagagga ttgcaaaaca gaaaggatat tcagatgtcc tctatctaga tgccgttcac 840aagaaatatc tggaagaagt gtcttcgtgc aatatcttta ttgtgaaagg caatgttatt 900tctactccag caataaaagg aaccatactg cctggtataa caaggaaaag tattcttgaa 960gttgctcaga gaaaaggctt catggttgag gagcgccttg tgtcagtgga tgagcttctt 1020gaagctgatg aagttttctg cacgggaaca gctgttgtgg tgtcccctgt ggggagcata 1080acttatctgg ggcaaagggt ggaatatggc aaccaaggag tgggcgtggt gtgtcagcag 1140ctgtatactt cacttacaag cctccagatg ggtcatgtgg acgattgtat gggctggact 1200gtggaactaa accagtga 1218190405PRTOryza sativa 190Met Glu Leu His Leu Thr Ser Arg Gly Ala Leu Pro Leu Ser Pro Pro 1 5 10 15 Leu Ala 
Gly Gln Arg Arg Pro His Leu Ser Leu Ser Thr Pro Ser Leu 20 25 30 Pro Ile Lys Asn His Thr Tyr Ser Val Pro Pro Pro Phe Ser Lys Ala 35 40 45 His Cys Ala Ile Gly Cys Gln Ala Ser Leu Ala Thr Asn Tyr Met Glu 50 55 60 Thr Ser Ala Val Ala Asp Leu Asp Trp Glu Asn Leu Gly Phe Gly Leu 65 70 75 80 Val Gln Thr Asp Phe Met Tyr Ile Ala Lys Cys Gly Pro Asp Gly Asn 85 90 95 Phe Ser Lys Gly Glu Met Val Pro Phe Gly Pro Ile Glu Leu Ser Pro 100 105 110 Ser Ala Gly Val Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys 115 120 125 Ala Tyr Arg Lys Thr Asp Gly Tyr Ile Leu Leu Phe Arg Pro Glu Glu 130 135 140 Asn Ala Ile Arg Met Arg Asn Gly Ala Glu Arg Met Cys Met Pro Ala 145 150 155 160 Pro Thr Leu Glu Gln Phe Val Asp Ala Val Lys Gln Thr Val Leu Ala 165 170 175 Asn Lys Arg Trp Val Pro Pro Thr Gly Lys Gly Ser Leu Tyr Ile Arg 180 185 190 Pro Leu Leu Met Gly Ser Gly Ala Val Leu Gly Leu Ala Pro Ala Pro 195 200 205 Glu Tyr Thr Phe Met Ile Phe Val Ser Pro Val Gly Asn Tyr Phe Lys 210 215 220 Glu Gly Leu Ala Pro Ile Asn Leu Ile Ile Glu Glu Asn Phe His Arg 225 230 235 240 Ala Ala Pro Gly Gly Thr Gly Gly Val Lys Thr Ile Gly Asn Tyr Ala 245 250 255 Ser Val Leu Lys Ala Gln Arg Ile Ala Lys Gln Lys Gly Tyr Ser Asp 260 265 270 Val Leu Tyr Leu Asp Ala Val His Lys Lys Tyr Leu Glu Glu Val Ser 275 280 285 Ser Cys Asn Ile Phe Ile Val Lys Gly Asn Val Ile Ser Thr Pro Ala 290 295 300 Ile Lys Gly Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Leu Glu 305 310 315 320 Val Ala Gln Arg Lys Gly Phe Met Val Glu Glu Arg Leu Val Ser Val 325 330 335 Asp Glu Leu Leu Glu Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val 340 345 350 Val Val Ser Pro Val Gly Ser Ile Thr Tyr Leu Gly Gln Arg Val Glu 355 360 365 Tyr Gly Asn Gln Gly Val Gly Val Val Cys Gln Gln Leu Tyr Thr Ser 370 375 380 Leu Thr Ser Leu Gln Met Gly His Val Asp Asp Cys Met Gly Trp Thr 385 390 395 400 Val Glu Leu Asn Gln 405 1911317DNAPhyscomitrella patens 191atggcggtga tgtgcgggat cgggttggct tcctccttgt tgcagcagga gagttacatg 60agcgtggcgt cgtctgaagc ggggagagct gatgttaagc 
gcgtctcatc gtcttcgtcg 120cctcagcttc ttcagaatgg cgtcggtttg aggaggactt gtcggatgcc ggcttttttt 180gtgacagagg agaggcttag gtcttcgttg tcgcataact caacctacca tgtacgcgga 240tcgaaagtgc agcaattgca tgcagtagca gatgctctga accaaaccag tgacttggat 300acattggaag ggatcgactg ggacaatttt ggttttggtc tgcgtccaac tgatttcatg 360tttgttatga agggcgacct agagggcaat tggcaaaaag gacagctacg accgtttggg 420aatttggaag tcagtccatc tgctggagtg ttaaattatg gacagggtgt gtttgaaggt 480atgaaggcat ataggacagc tgatgatcgc atattaattt tccgtccaga ggagaatgcc 540atgcgtatga taaatggggc tgagcggatg agcatgcctg ccccagatgt tgacacattt 600gttgatgctg tgaaaaaaac ggttctggca aacaaacgtt gggtgccccc gacaggcaag 660ggatcacttt acatccgtcc cttgctcatt ggcactggcc ctattttggg cttagcacct 720gcaccagaat atacctttct aatttatgta tctcctgttg gaacatattt caaggggggc 780ttatctccaa ttgacctgaa agtagagact tatttccatc gtgctgctcc tggtggaact 840ggtggagtaa aaactatatc caattatgcc
ccagtgctga agactcaact gatggctaaa 900gggaacggct attcagatgt cttgtaccta gacgcaatag agaacaagta tgtggaggaa 960gtctcgtctt gcaacatatt catggtcaag ggcaaagtga tctcaactcc tgaattggct 1020ggaacgatcc tgccgggaat cacaagaaag agcattattc agttagcacg cagtcgcggt 1080tatgaggtaa atgagcgacc agtgtcggtg gatgaactgc tagctgccga tgaggtgttt 1140tgcactggaa cggctgtggt tgtaaatccc gtaggaagca tcactcatgg cacaaacagg 1200gtgcagtaca ataatggagc tgtgggaaga gtatcacaag agctttatga agccctgaca 1260accttacaaa tgggtgtatc caaagatgaa tttgactggg tagtagaatt ggtgtaa 1317192438PRTPhyscomitrella patens 192Met Ala Val Met Cys Gly Ile Gly Leu Ala Ser Ser Leu Leu Gln Gln 1 5 10 15 Glu Ser Tyr Met Ser Val Ala Ser Ser Glu Ala Gly Arg Ala Asp Val 20 25 30 Lys Arg Val Ser Ser Ser Ser Ser Pro Gln Leu Leu Gln Asn Gly Val 35 40 45 Gly Leu Arg Arg Thr Cys Arg Met Pro Ala Phe Phe Val Thr Glu Glu 50 55 60 Arg Leu Arg Ser Ser Leu Ser His Asn Ser Thr Tyr His Val Arg Gly 65 70 75 80 Ser Lys Val Gln Gln Leu His Ala Val Ala Asp Ala Leu Asn Gln Thr 85 90 95 Ser Asp Leu Asp Thr Leu Glu Gly Ile Asp Trp Asp Asn Phe Gly Phe 100 105 110 Gly Leu Arg Pro Thr Asp Phe Met Phe Val Met Lys Gly Asp Leu Glu 115 120 125 Gly Asn Trp Gln Lys Gly Gln Leu Arg Pro Phe Gly Asn Leu Glu Val 130 135 140 Ser Pro Ser Ala Gly Val Leu Asn Tyr Gly Gln Gly Val Phe Glu Gly 145 150 155 160 Met Lys Ala Tyr Arg Thr Ala Asp Asp Arg Ile Leu Ile Phe Arg Pro 165 170 175 Glu Glu Asn Ala Met Arg Met Ile Asn Gly Ala Glu Arg Met Ser Met 180 185 190 Pro Ala Pro Asp Val Asp Thr Phe Val Asp Ala Val Lys Lys Thr Val 195 200 205 Leu Ala Asn Lys Arg Trp Val Pro Pro Thr Gly Lys Gly Ser Leu Tyr 210 215 220 Ile Arg Pro Leu Leu Ile Gly Thr Gly Pro Ile Leu Gly Leu Ala Pro 225 230 235 240 Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Val Ser Pro Val Gly Thr Tyr 245 250 255 Phe Lys Gly Gly Leu Ser Pro Ile Asp Leu Lys Val Glu Thr Tyr Phe 260 265 270 His Arg Ala Ala Pro Gly Gly Thr Gly Gly Val Lys Thr Ile Ser Asn 275 280 285 Tyr Ala Pro Val Leu Lys Thr Gln Leu Met Ala Lys Gly Asn Gly Tyr 290 295 300 Ser 
Asp Val Leu Tyr Leu Asp Ala Ile Glu Asn Lys Tyr Val Glu Glu 305 310 315 320 Val Ser Ser Cys Asn Ile Phe Met Val Lys Gly Lys Val Ile Ser Thr 325 330 335 Pro Glu Leu Ala Gly Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile 340 345 350 Ile Gln Leu Ala Arg Ser Arg Gly Tyr Glu Val Asn Glu Arg Pro Val 355 360 365 Ser Val Asp Glu Leu Leu Ala Ala Asp Glu Val Phe Cys Thr Gly Thr 370 375 380 Ala Val Val Val Asn Pro Val Gly Ser Ile Thr His Gly Thr Asn Arg 385 390 395 400 Val Gln Tyr Asn Asn Gly Ala Val Gly Arg Val Ser Gln Glu Leu Tyr 405 410 415 Glu Ala Leu Thr Thr Leu Gln Met Gly Val Ser Lys Asp Glu Phe Asp 420 425 430 Trp Val Val Glu Leu Val 435 1931317DNAPhyscomitrella patens 193atggggatgg cgtgtgggaa tcggcttgca tcctctctgt tgcaagagag tctcatgacg 60gtggcctcgt ccgaagctag aagaaaggat gccaggcgcg tctcgttgtc gtcgtcgccg 120tctcagcttc gtcaaagtgc ttctggtggc ttgagacggt gccgagtgcc tgcatttgcc 180ctgatagagg acgattctag gtcatcagtt tcgcagaatg caacctgcca tgtacgcggg 240tcgaaagtgc agcccttgaa tgcagtagca gatgccctca accaaaccag tgatttggat 300acgttgcaag ggattgattg ggacaacttt ggatttggtc tgcgtcctac tgatttcatg 360tacgtaaaga agggcgacat tgcgggaaat tggcaagagg gggagctagt accatatggg 420aatttggaaa tcagtccatc tgctggagtg ttaaattatg gacagggtgt gtttgaaggc 480ctgaaggcgt acaggacagc tgatgatagc atattaatgt tccgtccaga ggagaatgct 540ttgcgcatgg ttcatggggc tgagcgtatg agtatgcctg ctcctgatgt tgacacattc 600atcaatgctg taaagcaaac tgttctggcg aataaacgtt gggtgccccc gactggaaaa 660ggatcacttt acatccgtcc cttgcttatt ggcactggtc ctattttggg cttagcacca 720gctccagagt atacctttct cgtatatgtg tctcctgtcg gaacctactt caagggaggg 780ctatctccta ttgacctgaa agtggaaact tatttccatc gtgctgctcc tggtgggact 840gggggagtta aaaccatctc caattatgct ccagtgctca agactcaata tacagctaaa 900gggaaaggct attcagatgt cgtatattta gacgcaaaag agaacaagta tgtggaggag 960gtttcgtctt gcaacatatt cgtggttaag gacaaagtga tctcaacccc ggaattggct 1020ggaacaatcc tgccgggaat tacaaggaat agcattattc aattagctcg gagtcgtggt 1080tatgaggtga atgagcgacc agtatccgtg gatgagctgc tagctgctga tgaggtgttt 1140tgcactggaa 
cggctgtggt tgtaaatcct gtgggcagcg tcactcacgg cacgaagcgg 1200gtgctgtata atcacggagt tgttggagga gtatcgcaag agctttatga agccctaaca 1260tccatacaaa tgggtgtatc caaagatgag tttgattggg tagtagaatt ggcgtaa 1317194438PRTPhyscomitrella patens 194Met Gly Met Ala Cys Gly Asn Arg Leu Ala Ser Ser Leu Leu Gln Glu 1 5 10 15 Ser Leu Met Thr Val Ala Ser Ser Glu Ala Arg Arg Lys Asp Ala Arg 20 25 30 Arg Val Ser Leu Ser Ser Ser Pro Ser Gln Leu Arg Gln Ser Ala Ser 35 40 45 Gly Gly Leu Arg Arg Cys Arg Val Pro Ala Phe Ala Leu Ile Glu Asp 50 55 60 Asp Ser Arg Ser Ser Val Ser Gln Asn Ala Thr Cys His Val Arg Gly 65 70 75 80 Ser Lys Val Gln Pro Leu Asn Ala Val Ala Asp Ala Leu Asn Gln Thr 85 90 95 Ser Asp Leu Asp Thr Leu Gln Gly Ile Asp Trp Asp Asn Phe Gly Phe 100 105 110 Gly Leu Arg Pro Thr Asp Phe Met Tyr Val Lys Lys Gly Asp Ile Ala 115 120 125 Gly Asn Trp Gln Glu Gly Glu Leu Val Pro Tyr Gly Asn Leu Glu Ile 130 135 140 Ser Pro Ser Ala Gly Val Leu Asn Tyr Gly Gln Gly Val Phe Glu Gly 145 150 155 160 Leu Lys Ala Tyr Arg Thr Ala Asp Asp Ser Ile Leu Met Phe Arg Pro 165 170 175 Glu Glu Asn Ala Leu Arg Met Val His Gly Ala Glu Arg Met Ser Met 180 185 190 Pro Ala Pro Asp Val Asp Thr Phe Ile Asn Ala Val Lys Gln Thr Val 195 200 205 Leu Ala Asn Lys Arg Trp Val Pro Pro Thr Gly Lys Gly Ser Leu Tyr 210 215 220 Ile Arg Pro Leu Leu Ile Gly Thr Gly Pro Ile Leu Gly Leu Ala Pro 225 230 235 240 Ala Pro Glu Tyr Thr Phe Leu Val Tyr Val Ser Pro Val Gly Thr Tyr 245 250 255 Phe Lys Gly Gly Leu Ser Pro Ile Asp Leu Lys Val Glu Thr Tyr Phe 260 265 270 His Arg Ala Ala Pro Gly Gly Thr Gly Gly Val Lys Thr Ile Ser Asn 275 280 285 Tyr Ala Pro Val Leu Lys Thr Gln Tyr Thr Ala Lys Gly Lys Gly Tyr 290 295 300 Ser Asp Val Val Tyr Leu Asp Ala Lys Glu Asn Lys Tyr Val Glu Glu 305 310 315 320 Val Ser Ser Cys Asn Ile Phe Val Val Lys Asp Lys Val Ile Ser Thr 325 330 335 Pro Glu Leu Ala Gly Thr Ile Leu Pro Gly Ile Thr Arg Asn Ser Ile 340 345 350 Ile Gln Leu Ala Arg Ser Arg Gly Tyr Glu Val Asn Glu Arg Pro Val 355 360 365 Ser Val Asp Glu Leu Leu 
Ala Ala Asp Glu Val Phe Cys Thr Gly Thr 370 375 380 Ala Val Val Val Asn Pro Val Gly Ser Val Thr His Gly Thr Lys Arg 385 390 395 400 Val Leu Tyr Asn His Gly Val Val Gly Gly Val Ser Gln Glu Leu Tyr 405 410 415 Glu Ala Leu Thr Ser Ile Gln Met Gly Val Ser Lys Asp Glu Phe Asp 420 425 430 Trp Val Val Glu Leu Ala 435 1951020DNAPopulus trichocarpa 195atgcaacaag gatttgtgcc attgcatatc aattgggata atgttggttt tggtctaact 60cccacggatt tcatgttctt aatgaaatgc cctgttggag acaaatattc agaaggacac 120cttgttccct atggaaatct tgagataagc ccatcctctt cagtgttaaa ctacggacag 180gggttacttg aaggcttaaa ggcatataga ggtgatgata accgtattcg actcttccgg 240ccagaacaaa atgctctacg catgcaaatg ggggcggaaa gaatgtgcat gtcatcacca 300actgctgagc aatttgttag ttcaataaag caaactgctt tggccaataa aagatgggta 360cctcctccag gaaaaggatc gctctatatt aggcccttgc tcctgggaac agggccaatt 420ctaggtgtgg cgccatctcc agaatacacc ttcctagcat atgcttctcc agttggcaac 480tatttcaatg gtcccatgca cttctctgtt gaagataagg tctatcgagc aattcctgga 540ggaactggtg gcattaaatc tatcactaac tattcgccta tttacaaggc aatcactcaa 600gcaaaggcca aaggcttcac cgatgctata ttccttgatg cagcaactgg caaaaatata 660gaggaggcta ctgcatgtaa tatcttcgtt gtgaagggaa atgtcatctc aactcctcca 720atagccggaa ctattctgcc tggaatcaca agaaaaagca tcattgaagt tgcttcctgg 780ctcggatatc aaattgagga acgtgctatc ccactggagg agttgataaa tgttgatgaa 840gctttctgct caggaactgc gatagcaatt aagcctgttg gcagtgtaac ctatcaggga 900caaagggttg aatataaaac aggcgagggt actgtatctg agaaactatg tagaacactg 960acaggaattc aaactggtct cattgaggac actatgggat gggtcgtgga gattgaataa 1020196339PRTPopulus trichocarpa 196Met Gln Gln Gly Phe Val Pro Leu His Ile Asn Trp Asp Asn Val Gly 1 5 10 15 Phe Gly Leu Thr Pro Thr Asp Phe Met Phe Leu Met Lys Cys Pro Val 20 25 30 Gly Asp Lys Tyr Ser Glu Gly His Leu Val Pro Tyr Gly Asn Leu Glu 35 40 45 Ile Ser Pro Ser Ser Ser Val Leu Asn Tyr Gly Gln Gly Leu Leu Glu 50 55 60 Gly Leu Lys Ala Tyr Arg Gly Asp Asp Asn Arg Ile Arg Leu Phe Arg 65 70 75 80 Pro Glu Gln Asn Ala Leu Arg Met Gln Met Gly Ala Glu Arg Met Cys 85 90 95 Met Ser 
Ser Pro Thr Ala Glu Gln Phe Val Ser Ser Ile Lys Gln Thr 100 105 110 Ala Leu Ala Asn Lys Arg Trp Val Pro Pro Pro Gly Lys Gly Ser Leu 115 120 125 Tyr Ile Arg Pro Leu Leu Leu Gly Thr Gly Pro Ile Leu Gly Val Ala 130 135 140 Pro Ser Pro Glu Tyr Thr Phe Leu Ala Tyr Ala Ser Pro Val Gly Asn 145 150 155 160 Tyr Phe Asn Gly Pro Met His Phe Ser Val Glu Asp Lys Val Tyr Arg 165 170 175 Ala Ile Pro Gly Gly Thr Gly Gly Ile Lys Ser Ile Thr Asn Tyr Ser 180 185 190 Pro Ile Tyr Lys Ala Ile Thr Gln Ala Lys Ala Lys Gly Phe Thr Asp 195 200 205 Ala Ile Phe Leu Asp Ala Ala Thr Gly Lys Asn Ile Glu Glu Ala Thr 210 215 220 Ala Cys Asn Ile Phe Val Val Lys Gly Asn Val Ile Ser Thr Pro Pro 225 230 235 240 Ile Ala Gly Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile Glu 245 250 255 Val Ala Ser Trp Leu Gly Tyr Gln Ile Glu Glu Arg Ala Ile Pro Leu 260 265 270 Glu Glu Leu Ile Asn Val Asp Glu Ala Phe Cys Ser Gly Thr Ala Ile 275 280 285 Ala Ile Lys Pro Val Gly Ser Val Thr Tyr Gln Gly Gln Arg Val Glu 290 295 300 Tyr Lys Thr Gly Glu Gly Thr Val Ser Glu Lys Leu Cys Arg Thr Leu 305 310 315 320 Thr Gly Ile Gln Thr Gly Leu Ile Glu Asp Thr Met Gly Trp Val Val 325 330 335 Glu Ile Glu 1971185DNAPopulus trichocarpa 197atgattcaaa cgaattctgg cttacgcagt ttggttcaat ctttacgacc catcacttcc 60tctttatcgg agcttatagt tgttgctaca catcacaagc agcatctgct cttcaacaag 120ttagcagacc atattcaaac agaagctttc ggatttttgc tgttcgttgg tggtgattct 180agcgaggatg agtatgctaa agtggactgg gataatctca gatttggcat cacaccagct 240gattacatgt acacaatgaa atgttccagt gatgggaagt ttgaacaagg gcagcttgct 300ccatacggaa atgttgaatt gagcccttca gcagcaggac tttatgaagg cacaaaagca 360tatagaacag aagatgggcg cctgcttctc tttcgtctgg atcaaaatgc cacgcggatg 420aagatgggcg ctgacagatt gtgcatggct tgcccctcca tttatcaaat tattgacgcg 480gtcaaacaaa ctgctctcgc taacaagcgc tggaccccac ctcgagggaa agggactttg 540tatatcaggc ctttgctaat gggaagtggt cctattctgg gattagcacc agcacctgaa 600tacacattcc tcatatatgc ttctcctgtc ggcaattatt tcaaggaggg tttgaaaccc 660ttgaacctat atgttgagga tgagtttcat cgggctactc 
gaggaggagc tggaggcgtt 720aaatccatca caaattatgc accagtttta aaagcaatgg ccagagcaaa aagcagagga 780ttttctgatg ttttgtacct cgactcggcc aataagaaaa atctggaaga agtctcttct 840tgcaacattt tccttgtgaa gggcaatata atttctagtc ctgctacaag tgggactatt 900cttccagggg tcactcgaag aagcatcatt gaaattgctc tcgatcatgg ctatcaggtc 960gaggaacgtg caattccatt ggacgaattg atggatgccg atgaagtttt ttgcacggga 1020actgcagtag gtgttgcccc tgtgggcacg attacatatc aggataggag agttgagtac 1080aacgtcggag aagagtccgt gtctcagaag ctttactcga ttcttgaagg aattaaaacg 1140ggagtcatcg aggataagaa aggctggact attgagatcc agtga 1185198394PRTPopulus trichocarpa 198Met Ile Gln Thr Asn Ser Gly Leu Arg Ser Leu Val Gln Ser Leu Arg 1 5 10 15 Pro Ile Thr Ser Ser Leu Ser Glu Leu Ile Val Val Ala Thr His His 20 25 30 Lys Gln His Leu Leu Phe Asn Lys Leu Ala Asp His Ile Gln Thr Glu 35 40 45 Ala Phe Gly Phe Leu Leu Phe Val Gly Gly Asp Ser Ser Glu Asp Glu 50 55 60 Tyr Ala Lys Val Asp Trp Asp Asn Leu Arg Phe Gly Ile Thr Pro Ala 65 70 75 80 Asp Tyr Met Tyr Thr Met Lys Cys Ser Ser Asp Gly Lys Phe Glu Gln 85 90 95 Gly Gln Leu Ala Pro Tyr Gly Asn Val Glu Leu Ser Pro Ser Ala Ala 100 105 110 Gly Leu Tyr Glu Gly Thr Lys Ala Tyr Arg Thr Glu Asp Gly Arg Leu 115 120 125 Leu Leu Phe Arg Leu Asp Gln Asn Ala Thr Arg Met Lys Met Gly Ala 130 135 140 Asp Arg Leu Cys Met Ala Cys Pro Ser Ile Tyr Gln Ile Ile Asp Ala 145 150 155 160 Val Lys Gln Thr Ala Leu Ala Asn Lys Arg Trp Thr Pro Pro Arg Gly 165 170 175 Lys Gly Thr Leu Tyr Ile Arg Pro Leu Leu Met Gly Ser Gly Pro Ile 180 185 190 Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Ala Ser 195 200 205 Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu Lys Pro Leu Asn Leu Tyr 210 215 220 Val Glu Asp Glu Phe His Arg Ala Thr Arg Gly Gly Ala Gly Gly Val 225 230 235 240 Lys Ser Ile Thr Asn Tyr Ala Pro Val Leu Lys Ala Met Ala Arg Ala 245 250 255 Lys Ser Arg Gly Phe Ser Asp Val Leu Tyr Leu Asp Ser Ala Asn Lys 260 265 270 Lys Asn Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Leu Val Lys Gly 275 280 285 Asn Ile Ile Ser Ser Pro Ala Thr Ser Gly 
Thr Ile Leu Pro Gly Val 290 295 300 Thr Arg Arg Ser Ile Ile Glu Ile Ala Leu Asp His Gly Tyr Gln Val 305 310 315 320 Glu Glu Arg Ala Ile Pro Leu Asp Glu Leu Met Asp Ala Asp Glu Val 325 330 335 Phe Cys Thr Gly Thr Ala Val Gly Val Ala Pro Val Gly Thr Ile Thr 340 345 350 Tyr Gln Asp Arg Arg Val Glu Tyr Asn Val Gly Glu Glu Ser Val Ser 355 360 365 Gln Lys Leu Tyr Ser Ile Leu Glu Gly Ile Lys Thr Gly Val Ile Glu 370 375 380 Asp Lys Lys Gly Trp Thr Ile Glu Ile Gln 385 390 1991257DNAPopulus trichocarpa 199atggcttcat caacctcagg cccaaaaaaa gtgctgtcaa ggttccgcca agtcgttggg 60ctattgcttc cgcattcgaa atctacacca ccacctgtta gcagtactga tgatgacaaa 120gtgaccactg atgatcacca agtgagtact gatgatcaca aagtgagtac tgatgagaaa 180gtgaagtctg gtggtgaaga tatcaattgg gataatgttg gttttggtct aactcccacg 240gatttcatgt tcttaatgaa atgccctgtt ggagacaaat attcagaagg acaccttgtt
300ccctatggaa atcttgagat aagcccatcc tcttcagtgt taaactacgg acagggattg 360tttgaaggga tgaaagtata taggagagaa gatgacagaa tcatgatctt taggccagaa 420gaaaatgctc gacgcatgca aatgggagca gagagactgc tgatgcaagc accaacgacc 480gagcaattta ttgatgctgt gaagaaaact gcccttgcaa acgagcgttg ggtgcctccc 540catgggacgg gaacattgta cctgaggcct ttgctaatgg gaagtggagc tgttttgggt 600attggaccag ctcctgaatg cacattcctt atctttgcat ctcctatccg caactcttac 660aagagtggga tcgacgcctt taacttgtct atcgagacca aacttcatcg agcttcccct 720ggtggaactg gaggtatcaa aagcattacc aactatgctc cggtaagcat agtttgcaac 780tggtttgatt ttgatactgt ttatgaagtc gcagtggttt ttgaatcagt gaagcgagcg 840aaggctgcag ggtttgatga tgtcctgttc ttggatggag aaactggaaa gcatattgaa 900gaggcttctt cgtgtaatgt tttcatgttg aagggtaatg tcatttcaac ccccaccata 960ctcgggacaa ttttgcctgg aattactaga aaaagcatcc tggagattgc tcaagattgt 1020ggttatgagg tcgaagaagg acgtattcca gttgaggatg tgcttgctgc ggatgaggta 1080ttttgcacag gaactgcagt tgtagtcact tctgttgcca gcataaccta tcaggaacaa 1140agggtggaat ataaaacagg agagaacaca gtgtgtcacg aactgcgaac agcccttaca 1200ggaattcaaa ctggacttgt tgaggacaag aagggatgga ctgtccacct taattaa 1257200418PRTPopulus trichocarpa 200Met Ala Ser Ser Thr Ser Gly Pro Lys Lys Val Leu Ser Arg Phe Arg 1 5 10 15 Gln Val Val Gly Leu Leu Leu Pro His Ser Lys Ser Thr Pro Pro Pro 20 25 30 Val Ser Ser Thr Asp Asp Asp Lys Val Thr Thr Asp Asp His Gln Val 35 40 45 Ser Thr Asp Asp His Lys Val Ser Thr Asp Glu Lys Val Lys Ser Gly 50 55 60 Gly Glu Asp Ile Asn Trp Asp Asn Val Gly Phe Gly Leu Thr Pro Thr 65 70 75 80 Asp Phe Met Phe Leu Met Lys Cys Pro Val Gly Asp Lys Tyr Ser Glu 85 90 95 Gly His Leu Val Pro Tyr Gly Asn Leu Glu Ile Ser Pro Ser Ser Ser 100 105 110 Val Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Met Lys Val Tyr Arg 115 120 125 Arg Glu Asp Asp Arg Ile Met Ile Phe Arg Pro Glu Glu Asn Ala Arg 130 135 140 Arg Met Gln Met Gly Ala Glu Arg Leu Leu Met Gln Ala Pro Thr Thr 145 150 155 160 Glu Gln Phe Ile Asp Ala Val Lys Lys Thr Ala Leu Ala Asn Glu Arg 165 170 175 Trp Val Pro Pro His Gly Thr Gly 
Thr Leu Tyr Leu Arg Pro Leu Leu 180 185 190 Met Gly Ser Gly Ala Val Leu Gly Ile Gly Pro Ala Pro Glu Cys Thr 195 200 205 Phe Leu Ile Phe Ala Ser Pro Ile Arg Asn Ser Tyr Lys Ser Gly Ile 210 215 220 Asp Ala Phe Asn Leu Ser Ile Glu Thr Lys Leu His Arg Ala Ser Pro 225 230 235 240 Gly Gly Thr Gly Gly Ile Lys Ser Ile Thr Asn Tyr Ala Pro Val Ser 245 250 255 Ile Val Cys Asn Trp Phe Asp Phe Asp Thr Val Tyr Glu Val Ala Val 260 265 270 Val Phe Glu Ser Val Lys Arg Ala Lys Ala Ala Gly Phe Asp Asp Val 275 280 285 Leu Phe Leu Asp Gly Glu Thr Gly Lys His Ile Glu Glu Ala Ser Ser 290 295 300 Cys Asn Val Phe Met Leu Lys Gly Asn Val Ile Ser Thr Pro Thr Ile 305 310 315 320 Leu Gly Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Leu Glu Ile 325 330 335 Ala Gln Asp Cys Gly Tyr Glu Val Glu Glu Gly Arg Ile Pro Val Glu 340 345 350 Asp Val Leu Ala Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val 355 360 365 Val Thr Ser Val Ala Ser Ile Thr Tyr Gln Glu Gln Arg Val Glu Tyr 370 375 380 Lys Thr Gly Glu Asn Thr Val Cys His Glu Leu Arg Thr Ala Leu Thr 385 390 395 400 Gly Ile Gln Thr Gly Leu Val Glu Asp Lys Lys Gly Trp Thr Val His 405 410 415 Leu Asn 2011254DNASolanum lycopersicum 201atggagagcg ccgccgtatt tgcagggctt caccctattc ccggtcacca taaccacctt 60ctgggtccat cacgaactgc tattaagctt cttcctcctt ccattgataa aatcaatttt 120tctcctttgc ccctcaagtt tcagaagcag tcgcatttca cttcttatat tggtaatagt 180gccataaaca gtggaaattc atttcgtgtg gcatctcctg caagcgacgt tgcatctgaa 240ttagccgaca tcgattggga taaccttggc tttggcttta tgcctactga ttatatgtat 300agcatgaaat gctctcaggg tgaaaacttt tctaagggtg aattacagcg tttcggtaac 360attgagttga gtccgtctgc tggaatatta aattatggtc agggattgtt cgaaggttta 420aaagcatatc gaaaacatga cggcaatata ttgttgtttc gacctgagga aaatgctacg 480cgtttgaaga tgggtgctga acgtatgtgt atgccttcac cgtctgttga acagtttgta 540gaagcagtga aagccactgt gttagctaat gaaagatgga ttcctcctcc cggtaaaggc 600tcattataca taagacctct gcttatgggg agtggagctg ttcttggtct tgctcctgct 660cctgagtaca cattcctgat ttatgtgtca cctgttggaa attattttaa ggaaggtttg 720gcaccaataa 
atttggtagt tgagactgaa atgcaccgtg caacacctgg tggtactgga 780ggcgttaaga ctattggaaa ttatgctgca gttctgaagg cacagagtgc tgctaaagca 840aaaggctatt ctgatgttct gtaccttgat tgtgttcaga aaaaatatct cgaagaggtt 900tcctcttgca atgtctttat tgtgaagggt aatctgatag taactcctgc aattaaagga 960accattctac ctggaattac gcgaaaaagc ataatcgacg tagctattag tcaaggattc 1020gaggttgagg aacgacaggt gtctgtggac gaattgcttg atgctgacga agttttctgt 1080acgggaactg ccgtggtagt atctcctgtt ggtagcatta ctcatcaagg gagaagggtg 1140acatatggaa atgatggtgt tggtcttgtg tcgcagcagt tatactctgc acttactagc 1200ctacaaatgg ggctctcaga ggataagatg ggttggattg ttgagctcaa atga 1254202417PRTSolanum lycopersicum 202Met Glu Ser Ala Ala Val Phe Ala Gly Leu His Pro Ile Pro Gly His 1 5 10 15 His Asn His Leu Leu Gly Pro Ser Arg Thr Ala Ile Lys Leu Leu Pro 20 25 30 Pro Ser Ile Asp Lys Ile Asn Phe Ser Pro Leu Pro Leu Lys Phe Gln 35 40 45 Lys Gln Ser His Phe Thr Ser Tyr Ile Gly Asn Ser Ala Ile Asn Ser 50 55 60 Gly Asn Ser Phe Arg Val Ala Ser Pro Ala Ser Asp Val Ala Ser Glu 65 70 75 80 Leu Ala Asp Ile Asp Trp Asp Asn Leu Gly Phe Gly Phe Met Pro Thr 85 90 95 Asp Tyr Met Tyr Ser Met Lys Cys Ser Gln Gly Glu Asn Phe Ser Lys 100 105 110 Gly Glu Leu Gln Arg Phe Gly Asn Ile Glu Leu Ser Pro Ser Ala Gly 115 120 125 Ile Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg 130 135 140 Lys His Asp Gly Asn Ile Leu Leu Phe Arg Pro Glu Glu Asn Ala Thr 145 150 155 160 Arg Leu Lys Met Gly Ala Glu Arg Met Cys Met Pro Ser Pro Ser Val 165 170 175 Glu Gln Phe Val Glu Ala Val Lys Ala Thr Val Leu Ala Asn Glu Arg 180 185 190 Trp Ile Pro Pro Pro Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu 195 200 205 Met Gly Ser Gly Ala Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr 210 215 220 Phe Leu Ile Tyr Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu 225 230 235 240 Ala Pro Ile Asn Leu Val Val Glu Thr Glu Met His Arg Ala Thr Pro 245 250 255 Gly Gly Thr Gly Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala Val Leu 260 265 270 Lys Ala Gln Ser Ala Ala Lys Ala Lys Gly Tyr Ser Asp Val Leu Tyr 275 280 
285 Leu Asp Cys Val Gln Lys Lys Tyr Leu Glu Glu Val Ser Ser Cys Asn 290 295 300 Val Phe Ile Val Lys Gly Asn Leu Ile Val Thr Pro Ala Ile Lys Gly 305 310 315 320 Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile Asp Val Ala Ile 325 330 335 Ser Gln Gly Phe Glu Val Glu Glu Arg Gln Val Ser Val Asp Glu Leu 340 345 350 Leu Asp Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Ser 355 360 365 Pro Val Gly Ser Ile Thr His Gln Gly Arg Arg Val Thr Tyr Gly Asn 370 375 380 Asp Gly Val Gly Leu Val Ser Gln Gln Leu Tyr Ser Ala Leu Thr Ser 385 390 395 400 Leu Gln Met Gly Leu Ser Glu Asp Lys Met Gly Trp Ile Val Glu Leu 405 410 415 Lys 2031233DNATriticum aestivum 203atggaactcc gcctccgcgc cccggcgtcc cccgcttccg cctctccgcg cggcacgtcg 60gtctccccca gccccaggcc gcatccgcgc ctaccctcgc aacccattca gaagcgattg 120tccggcagcg ccgtctccgt ctccaggcga ggcaccgcgg caaggagcag cctgtgttcc 180gccctgatgg cggcatcata caacacagga actccggacc tagtcgactt cgactgggag 240actcttggat ttcaactggt cccgacggac tttatgtata taatgaaatg ttcgtcagat 300ggagtgttca ccaagggtga attggttcca tatgggccaa tcgagctgaa ccctgctgct 360gcagttttaa attacggcca gggattgctc gaaggtctta gagcacacag aaaggaggat 420ggttcagtaa ttgtttttcg ccccaaggaa aacgcgttgc ggatgaggat aggtgcagat 480cggctatgca tgcctgcacc aagcgttgag cagttcctat cagctgtcaa gcaaactata 540ttggcaaaca agcgttgggt accccccact ggcaaaggtt ctttatatat caggccgctg 600ctgattggaa gtggagctat gctaggtgta gcacctgccc cggagtatac atttgtcgtg 660tatgtttgcc cagttggtca ctatttcaag gatggcctgt ctcctattag cttattgact 720gaggaagaat atcaccgcgc tgcacctggt ggaactggtg atattaagac aattggaaat 780tatgcttcgg ttgttagtgc tcagagaaga gccaaggaga aaggtcattc tgatgttctt 840tacttggatc ccgtgcataa gaagtttgtg gaggaagttt cttcctgtaa tatattcatg 900gtgaaggata atgttatttc tactccacta ttaacgggaa caatccttcc tggaatcaca 960agaagaagta taatcgaaat tgccagcaat cttggaattc aggttgaaga gcgccttatt 1020gcgatagatg agttgcttga tgctgatgaa gtcttctgta cagggactgc cgttgtacta 1080tcacctgttg gttccatagt gtaccacgga agaagagtgg agtatggggg cgggaaggtc 1140ggagctgtgt cccagcaact 
gtactcagca cttacagcta tccagaaagg ccttgtggag 1200gacagtatgg gatggagtgt gcaattgaat tag 1233204410PRTTriticum aestivum 204Met Glu Leu Arg Leu Arg Ala Pro Ala Ser Pro Ala Ser Ala Ser Pro 1 5 10 15 Arg Gly Thr Ser Val Ser Pro Ser Pro Arg Pro His Pro Arg Leu Pro 20 25 30 Ser Gln Pro Ile Gln Lys Arg Leu Ser Gly Ser Ala Val Ser Val Ser 35 40 45 Arg Arg Gly Thr Ala Ala Arg Ser Ser Leu Cys Ser Ala Leu Met Ala 50 55 60 Ala Ser Tyr Asn Thr Gly Thr Pro Asp Leu Val Asp Phe Asp Trp Glu 65 70 75 80 Thr Leu Gly Phe Gln Leu Val Pro Thr Asp Phe Met Tyr Ile Met Lys 85 90 95 Cys Ser Ser Asp Gly Val Phe Thr Lys Gly Glu Leu Val Pro Tyr Gly 100 105 110 Pro Ile Glu Leu Asn Pro Ala Ala Ala Val Leu Asn Tyr Gly Gln Gly 115 120 125 Leu Leu Glu Gly Leu Arg Ala His Arg Lys Glu Asp Gly Ser Val Ile 130 135 140 Val Phe Arg Pro Lys Glu Asn Ala Leu Arg Met Arg Ile Gly Ala Asp 145 150 155 160 Arg Leu Cys Met Pro Ala Pro Ser Val Glu Gln Phe Leu Ser Ala Val 165 170 175 Lys Gln Thr Ile Leu Ala Asn Lys Arg Trp Val Pro Pro Thr Gly Lys 180 185 190 Gly Ser Leu Tyr Ile Arg Pro Leu Leu Ile Gly Ser Gly Ala Met Leu 195 200 205 Gly Val Ala Pro Ala Pro Glu Tyr Thr Phe Val Val Tyr Val Cys Pro 210 215 220 Val Gly His Tyr Phe Lys Asp Gly Leu Ser Pro Ile Ser Leu Leu Thr 225 230 235 240 Glu Glu Glu Tyr His Arg Ala Ala Pro Gly Gly Thr Gly Asp Ile Lys 245 250 255 Thr Ile Gly Asn Tyr Ala Ser Val Val Ser Ala Gln Arg Arg Ala Lys 260 265 270 Glu Lys Gly His Ser Asp Val Leu Tyr Leu Asp Pro Val His Lys Lys 275 280 285 Phe Val Glu Glu Val Ser Ser Cys Asn Ile Phe Met Val Lys Asp Asn 290 295 300 Val Ile Ser Thr Pro Leu Leu Thr Gly Thr Ile Leu Pro Gly Ile Thr 305 310 315 320 Arg Arg Ser Ile Ile Glu Ile Ala Ser Asn Leu Gly Ile Gln Val Glu 325 330 335 Glu Arg Leu Ile Ala Ile Asp Glu Leu Leu Asp Ala Asp Glu Val Phe 340 345 350 Cys Thr Gly Thr Ala Val Val Leu Ser Pro Val Gly Ser Ile Val Tyr 355 360 365 His Gly Arg Arg Val Glu Tyr Gly Gly Gly Lys Val Gly Ala Val Ser 370 375 380 Gln Gln Leu Tyr Ser Ala Leu Thr Ala Ile Gln Lys Gly Leu 
Val Glu 385 390 395 400 Asp Ser Met Gly Trp Ser Val Gln Leu Asn 405 410 2051200DNATriticum aestivum 205atggacgtgc tgtcgtctgc gaagcgcgcc ctcccgtggg gccgcacctc ggccggcggg 60gtcatcggcg gcctccgagc tctactcggg acggacggag gcggcggccg ctctcttctc 120ccgtcccggt ggaagtcgtc gcagccgcag ctggaccccg tcgacaggtc cgacgaggag 180ggcggcggcg acatcgactg ggacaacctc ggcttcgggc tcaccccgac cgactacatg 240tacgtcatgc ggtgctcgca ggaggagggc ggcttctccc gcggcgagct cgcccgctac 300ggcaacatcg agctcagccc ctcctccggc gtgctcaact acgggcaggg gctgttcgag 360gggctcaagg cgtaccggag ggcggacggg cccgggtaca tgctgttccg gccggaggag 420aacgcgcggc ggatgcagca cggcgccggg cgcatgtgca tgccggcccc gtccgtcgag 480cagttcgtgc acgccgtcaa gcagaccgtc ctcgccaaca ggcgctgggt gccgccgcag 540ggaaagggag cgctgtacct caggccgctg ctcatcggga gcggggcgat cctcgggctg 600gcgccggcgc cggagtacac cttcatgatc tacgccgcgc ctgtggggac atatttcaag 660gaaggcatgg cggcgataaa cctgctggtc gaggaggaga tccaccgcgc catgccgggc 720ggcaccggcg gggtcaagag catctccaac tacgcgccgg tgctcaaggc gcagatggac 780gcgaggagca aggggttcgc ggacgtgctg tacctggact cggtgcacaa gaggtacgtg 840gaggaggcct cctcctgcaa cctcttcgtc gtgaagggcg gcgccatcgc gacgccggcg 900acggagggga ccatcctgcc gggggtcacg cgcaggagca tcatcgagct cgccagagac 960agcggctacc aggtggaaga gcgcctcgtc tccatcgacg atctgatcag tgcagacgaa 1020gtgttctgca cgggaacggc cgtcggcatc accccggtgt cgaccatcac ctaccaaggg 1080acaaggtacg agttcaggac cggggaggac acgttgtcga agaagcttta cacggctctg 1140acgtcgatcc agatgggcct ggcggaggac aagaagggat ggacggtcgc ggttgattga 1200206399PRTTriticum aestivum 206Met Asp Val Leu Ser Ser Ala Lys Arg Ala Leu Pro Trp Gly Arg Thr 1 5 10 15 Ser Ala Gly Gly Val Ile Gly Gly Leu Arg Ala Leu Leu Gly Thr Asp 20 25 30 Gly Gly Gly Gly Arg Ser Leu Leu Pro Ser Arg Trp Lys Ser Ser Gln 35 40 45 Pro Gln Leu Asp Pro Val Asp Arg Ser Asp Glu Glu Gly Gly Gly Asp 50 55 60 Ile Asp Trp Asp Asn Leu Gly Phe Gly Leu Thr Pro Thr Asp Tyr Met 65 70 75 80 Tyr Val Met Arg Cys Ser Gln Glu Glu Gly Gly Phe Ser Arg Gly Glu 85 90 95 Leu Ala Arg Tyr Gly Asn Ile Glu Leu Ser 
Pro Ser Ser Gly Val Leu 100 105 110 Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Arg Ala 115 120 125 Asp Gly Pro Gly Tyr Met Leu Phe Arg Pro Glu Glu Asn Ala Arg Arg 130 135 140 Met Gln His Gly Ala Gly Arg Met Cys Met Pro Ala Pro Ser Val Glu 145 150 155 160 Gln Phe Val His Ala Val Lys Gln Thr Val Leu Ala Asn Arg Arg Trp 165 170 175 Val Pro Pro Gln Gly Lys Gly Ala Leu Tyr Leu Arg Pro Leu Leu Ile 180 185 190 Gly Ser Gly Ala Ile Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe 195 200 205 Met Ile Tyr Ala Ala Pro Val Gly Thr Tyr Phe Lys Glu Gly Met Ala 210 215 220 Ala Ile Asn Leu Leu Val Glu Glu Glu Ile His Arg Ala Met Pro Gly 225 230 235 240 Gly Thr Gly Gly Val Lys Ser Ile Ser Asn Tyr Ala Pro Val Leu Lys 245 250 255 Ala Gln Met Asp Ala Arg Ser Lys Gly Phe Ala Asp Val Leu Tyr Leu 260 265 270 Asp Ser Val His Lys Arg Tyr Val Glu Glu Ala Ser Ser Cys Asn Leu 275 280 285 Phe Val Val Lys Gly Gly Ala Ile Ala Thr Pro Ala Thr Glu Gly Thr 290 295 300 Ile Leu Pro Gly Val Thr Arg Arg Ser Ile Ile Glu Leu Ala Arg Asp 305
310 315 320 Ser Gly Tyr Gln Val Glu Glu Arg Leu Val Ser Ile Asp Asp Leu Ile 325 330 335 Ser Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Gly Ile Thr Pro 340 345 350 Val Ser Thr Ile Thr Tyr Gln Gly Thr Arg Tyr Glu Phe Arg Thr Gly 355 360 365 Glu Asp Thr Leu Ser Lys Lys Leu Tyr Thr Ala Leu Thr Ser Ile Gln 370 375 380 Met Gly Leu Ala Glu Asp Lys Lys Gly Trp Thr Val Ala Val Asp 385 390 395 2071206DNAZea mays 207atggaatacg gcgccgtcct cgccgccgcg ccgctcgtcg cacggccgaa ctggctcctc 60ctctcgccgc cgccactggc gccgtctatt cagattcaga atcgtcttta ttcgatctcg 120tcattcccac taaaggctgg acctgtaagg gcatgcagag ctttagcaag caactacacg 180caaacatctg aaacagttga tttggactgg gagaacctgg gttttgggat tgtgcaaact 240gattatatgt atattgctaa gtgcgggaca gacgggaatt tttctgaggg tgaaatggtg 300ccttttggac ctatagcgct gagtccatct tctggagtcc taaattatgg acagggattg 360tttgagggcc taaaggcgta taagaaaact gatggatcca tcctattatt tcgcccagag 420gaaaatgctg agaggatgcg gacaggtgct gagaggatgt gcatgcctgc accctctgtc 480gagcagttta ttgatgcagt aaaacaaacc gttcttgcaa ataagagatg gattcctcct 540actggtaaag gttctctgta tattaggccc ttacttatgg gaagtggggc tgttcttggt 600cttgcacctg ctcctgagta tacattcatt atatttgtct ctcctgttgg aaactacttt 660aaggaaggtt tagcaccaat aaatttgata gttgtagaca agttccatcg tgctactcct 720ggtggtactg ggggtgtgaa gaccatagga aattatgctt cggtgttgat ggcacagaaa 780attgcaaagg aaaagggtta ttctgatgtc ctctacttgg acgctgttca caagaagtac 840cttgaagaag tttcttcatg caatgttttt gttgtcaagg acaatgttat ttctacccca 900gcaataaaag gaacaatatt acctggtatc acaaggaaaa gtataattga cgttgctttg 960agtaaaggct tccaggttga ggagcggctt gtttcagtgg atgaactgct tgatgctgat 1020gaggtattct gcacaggaac tgctgttgtg gtgtctcctg ttgggagcat tacatatcaa 1080gggaaaagag tggaatacgg ccaccaaggt gttggcgttg tgtcccagca gctgtacact 1140tcactgacga gtcttcagat gggtcaaacc gaggattgga tgggctggac tgtgcaactg 1200aattag 1206208401PRTZea mays 208Met Glu Tyr Gly Ala Val Leu Ala Ala Ala Pro Leu Val Ala Arg Pro 1 5 10 15 Asn Trp Leu Leu Leu Ser Pro Pro Pro Leu Ala Pro Ser Ile Gln Ile 20 25 30 Gln Asn Arg Leu Tyr Ser Ile 
Ser Ser Phe Pro Leu Lys Ala Gly Pro 35 40 45 Val Arg Ala Cys Arg Ala Leu Ala Ser Asn Tyr Thr Gln Thr Ser Glu 50 55 60 Thr Val Asp Leu Asp Trp Glu Asn Leu Gly Phe Gly Ile Val Gln Thr 65 70 75 80 Asp Tyr Met Tyr Ile Ala Lys Cys Gly Thr Asp Gly Asn Phe Ser Glu 85 90 95 Gly Glu Met Val Pro Phe Gly Pro Ile Ala Leu Ser Pro Ser Ser Gly 100 105 110 Val Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Lys 115 120 125 Lys Thr Asp Gly Ser Ile Leu Leu Phe Arg Pro Glu Glu Asn Ala Glu 130 135 140 Arg Met Arg Thr Gly Ala Glu Arg Met Cys Met Pro Ala Pro Ser Val 145 150 155 160 Glu Gln Phe Ile Asp Ala Val Lys Gln Thr Val Leu Ala Asn Lys Arg 165 170 175 Trp Ile Pro Pro Thr Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu 180 185 190 Met Gly Ser Gly Ala Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr 195 200 205 Phe Ile Ile Phe Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu 210 215 220 Ala Pro Ile Asn Leu Ile Val Val Asp Lys Phe His Arg Ala Thr Pro 225 230 235 240 Gly Gly Thr Gly Gly Val Lys Thr Ile Gly Asn Tyr Ala Ser Val Leu 245 250 255 Met Ala Gln Lys Ile Ala Lys Glu Lys Gly Tyr Ser Asp Val Leu Tyr 260 265 270 Leu Asp Ala Val His Lys Lys Tyr Leu Glu Glu Val Ser Ser Cys Asn 275 280 285 Val Phe Val Val Lys Asp Asn Val Ile Ser Thr Pro Ala Ile Lys Gly 290 295 300 Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile Asp Val Ala Leu 305 310 315 320 Ser Lys Gly Phe Gln Val Glu Glu Arg Leu Val Ser Val Asp Glu Leu 325 330 335 Leu Asp Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Ser 340 345 350 Pro Val Gly Ser Ile Thr Tyr Gln Gly Lys Arg Val Glu Tyr Gly His 355 360 365 Gln Gly Val Gly Val Val Ser Gln Gln Leu Tyr Thr Ser Leu Thr Ser 370 375 380 Leu Gln Met Gly Gln Thr Glu Asp Trp Met Gly Trp Thr Val Gln Leu 385 390 395 400 Asn 2091218DNAZea mays 209atggagctct gcacctgcgc ggcccgcagt tctgcgcccc cctcttctag ctgccgcgcg 60gcgccattcc cgcgagttct ctcccatcgc atttggagcc gatcgggata ctgcactgtc 120tgctttaccc cgtcaagccc tgttgctagg agtcgattct ctactctaat gacgactgca 180cacaacacag ggaccccaga tctagttgac 
ttcaattggg atgatcttgg gtttcaactg 240atcccaacgg acttcatgta tttaatgagc tgttcttcag atggggtgtt tatgaatggt 300aaattagtgc catatgggtc aattgagctg aatccagcag ccgccgtgct gaattatggt 360cagggattgc ttgaaggtct acgatcacat agaaaagagg atggatcaat ccttcttttt 420cgtccacatg aaaatgcacg gcggatggaa attggtgcag accggttatg catgcctgca 480ccaagtgtag agcaattcct agaagctgtg aaactaactg ttctggcaaa caagcattgg 540gtgcctcctt ttggtaaagg ttctttgtat atcagaccgc agctaattgg aagtggggct 600atgcttggtg tggcacctgc cccacagtac acattcattg tgtttgtttg cccagttggg 660cattatttca agggtggtct agctccaatc agcttgttaa ctgaggaaga ataccaccgt 720gctgcacctg gtggaactgg tgatataaaa actattggga actatgcttc ggttgtcagt 780gctcagagaa gatccaagga aaaaggccat tctgatgttt tatacttaga tccactccat 840aataagtttg ttgaggaagt ttcttcttgt aatatattca tggtgaagga caatattatt 900tctactccac tgttaacggg gacaattctt cctggaatca caaggagaag tgtgattgaa 960atttctcaga atcttggatt tcaggttgag gagcgtctta tcacaataga tgaactgctt 1020ggggctgatg aagtcttttg tacaggaaca gctgttgtat tgtcacctgt tgggagcatc 1080acttaccgtg gaagaagagt ggagtatggg aagaaccagg aggccggagt cgtgtcccaa 1140caactctatg ccgcactcac agctatccag aaaggtctca cggaggacag catgggatgg 1200acgttgcagc taacttag 1218210405PRTZea mays 210Met Glu Leu Cys Thr Cys Ala Ala Arg Ser Ser Ala Pro Pro Ser Ser 1 5 10 15 Ser Cys Arg Ala Ala Pro Phe Pro Arg Val Leu Ser His Arg Ile Trp 20 25 30 Ser Arg Ser Gly Tyr Cys Thr Val Cys Phe Thr Pro Ser Ser Pro Val 35 40 45 Ala Arg Ser Arg Phe Ser Thr Leu Met Thr Thr Ala His Asn Thr Gly 50 55 60 Thr Pro Asp Leu Val Asp Phe Asn Trp Asp Asp Leu Gly Phe Gln Leu 65 70 75 80 Ile Pro Thr Asp Phe Met Tyr Leu Met Ser Cys Ser Ser Asp Gly Val 85 90 95 Phe Met Asn Gly Lys Leu Val Pro Tyr Gly Ser Ile Glu Leu Asn Pro 100 105 110 Ala Ala Ala Val Leu Asn Tyr Gly Gln Gly Leu Leu Glu Gly Leu Arg 115 120 125 Ser His Arg Lys Glu Asp Gly Ser Ile Leu Leu Phe Arg Pro His Glu 130 135 140 Asn Ala Arg Arg Met Glu Ile Gly Ala Asp Arg Leu Cys Met Pro Ala 145 150 155 160 Pro Ser Val Glu Gln Phe Leu Glu Ala Val Lys Leu Thr Val Leu 
Ala 165 170 175 Asn Lys His Trp Val Pro Pro Phe Gly Lys Gly Ser Leu Tyr Ile Arg 180 185 190 Pro Gln Leu Ile Gly Ser Gly Ala Met Leu Gly Val Ala Pro Ala Pro 195 200 205 Gln Tyr Thr Phe Ile Val Phe Val Cys Pro Val Gly His Tyr Phe Lys 210 215 220 Gly Gly Leu Ala Pro Ile Ser Leu Leu Thr Glu Glu Glu Tyr His Arg 225 230 235 240 Ala Ala Pro Gly Gly Thr Gly Asp Ile Lys Thr Ile Gly Asn Tyr Ala 245 250 255 Ser Val Val Ser Ala Gln Arg Arg Ser Lys Glu Lys Gly His Ser Asp 260 265 270 Val Leu Tyr Leu Asp Pro Leu His Asn Lys Phe Val Glu Glu Val Ser 275 280 285 Ser Cys Asn Ile Phe Met Val Lys Asp Asn Ile Ile Ser Thr Pro Leu 290 295 300 Leu Thr Gly Thr Ile Leu Pro Gly Ile Thr Arg Arg Ser Val Ile Glu 305 310 315 320 Ile Ser Gln Asn Leu Gly Phe Gln Val Glu Glu Arg Leu Ile Thr Ile 325 330 335 Asp Glu Leu Leu Gly Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val 340 345 350 Val Leu Ser Pro Val Gly Ser Ile Thr Tyr Arg Gly Arg Arg Val Glu 355 360 365 Tyr Gly Lys Asn Gln Glu Ala Gly Val Val Ser Gln Gln Leu Tyr Ala 370 375 380 Ala Leu Thr Ala Ile Gln Lys Gly Leu Thr Glu Asp Ser Met Gly Trp 385 390 395 400 Thr Leu Gln Leu Thr 405 2111239DNAZea mays 211atggccgcgt tgacatctgc gaagggcgct ctccttccgt cgtgggctcg cagcagcagc 60agcggccatg gcggcgactt gtggagggtc ctggggaagg cgttggctac ggccggagga 120ggaggcggcg gcggatgctc ccttctgctc ccgcgccggt ggcagtcgtc gctgccgcag 180ctggaccacg tcgccgacag gtccaacgag gagagcggcg gcgagatcga ctgggacaac 240ctcggcttcg gcctcacccc gaccgactac atgtacgtca cgcggtgctc gccggaggac 300cgcggcgact tcccccgcgg cgagctctgc cgctacggca acatcgagct cagcccctcc 360tccggcgttc taaactacgc ccagggcctg ttcgagggaa tgaaggcgta ccggcggccg 420gaccgggccg ggtacacgct gttccggccg gaggagaacg cgcggcggat gcagcgcggc 480gccgagcgca tgtgcatgcc ggcgccgtcg gtggagcagt tcgtccacgc cgtcaggcag 540acagtcctcg ccaacaggcg ctgggtgccg ccgcagggga agggagccct gtacctccgg 600cctctgctcg tggggagcgg cccgatcctt gggctggctc cggcccccga gtacaccttc 660ctcatctacg ccgcacccgt tgggaactac ttcaaggagg gcctggcgcc catcaacctg 720gtggtgcatg acgagttcca ccgcgcgatg 
cccggcggca ccggcggggt caagaccatc 780gccaactacg cgccggtgct gagggcgcag atggacgcca agagcaaggg gttcacggac 840gtgctgtacc tggactcggt ccacaagcgg tacctggagg aggtgtcgtc gtgcaacgtg 900ttcgtcgtca agggcggcgt ggtcgccacg ccggacaccc ggggcaccat cctgccgggc 960atcacgcgca agagcgtcat cgagctcgcc agggaccgcg gatacaaggt tgaggaacgc 1020ctggtttcca tcgacgatct ggtggccgca gacgaggtgt tctgcaccgg gaccgcggtg 1080gtggttgctc ccgtgtcgac agtcacgtac cagggcgaga ggtatgagtt cagaacgggg 1140ccggacacgg tgtcgcagga gctgtacacg acgctgacat ccattcagat gggcatggcc 1200gccgaggaca gcaagggatg gacagtagca gtagagtag 1239212412PRTZea mays 212Met Ala Ala Leu Thr Ser Ala Lys Gly Ala Leu Leu Pro Ser Trp Ala 1 5 10 15 Arg Ser Ser Ser Ser Gly His Gly Gly Asp Leu Trp Arg Val Leu Gly 20 25 30 Lys Ala Leu Ala Thr Ala Gly Gly Gly Gly Gly Gly Gly Cys Ser Leu 35 40 45 Leu Leu Pro Arg Arg Trp Gln Ser Ser Leu Pro Gln Leu Asp His Val 50 55 60 Ala Asp Arg Ser Asn Glu Glu Ser Gly Gly Glu Ile Asp Trp Asp Asn 65 70 75 80 Leu Gly Phe Gly Leu Thr Pro Thr Asp Tyr Met Tyr Val Thr Arg Cys 85 90 95 Ser Pro Glu Asp Arg Gly Asp Phe Pro Arg Gly Glu Leu Cys Arg Tyr 100 105 110 Gly Asn Ile Glu Leu Ser Pro Ser Ser Gly Val Leu Asn Tyr Ala Gln 115 120 125 Gly Leu Phe Glu Gly Met Lys Ala Tyr Arg Arg Pro Asp Arg Ala Gly 130 135 140 Tyr Thr Leu Phe Arg Pro Glu Glu Asn Ala Arg Arg Met Gln Arg Gly 145 150 155 160 Ala Glu Arg Met Cys Met Pro Ala Pro Ser Val Glu Gln Phe Val His 165 170 175 Ala Val Arg Gln Thr Val Leu Ala Asn Arg Arg Trp Val Pro Pro Gln 180 185 190 Gly Lys Gly Ala Leu Tyr Leu Arg Pro Leu Leu Val Gly Ser Gly Pro 195 200 205 Ile Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Ala 210 215 220 Ala Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn Leu 225 230 235 240 Val Val His Asp Glu Phe His Arg Ala Met Pro Gly Gly Thr Gly Gly 245 250 255 Val Lys Thr Ile Ala Asn Tyr Ala Pro Val Leu Arg Ala Gln Met Asp 260 265 270 Ala Lys Ser Lys Gly Phe Thr Asp Val Leu Tyr Leu Asp Ser Val His 275 280 285 Lys Arg Tyr Leu Glu Glu Val Ser Ser Cys Asn 
Val Phe Val Val Lys 290 295 300 Gly Gly Val Val Ala Thr Pro Asp Thr Arg Gly Thr Ile Leu Pro Gly 305 310 315 320 Ile Thr Arg Lys Ser Val Ile Glu Leu Ala Arg Asp Arg Gly Tyr Lys 325 330 335 Val Glu Glu Arg Leu Val Ser Ile Asp Asp Leu Val Ala Ala Asp Glu 340 345 350 Val Phe Cys Thr Gly Thr Ala Val Val Val Ala Pro Val Ser Thr Val 355 360 365 Thr Tyr Gln Gly Glu Arg Tyr Glu Phe Arg Thr Gly Pro Asp Thr Val 370 375 380 Ser Gln Glu Leu Tyr Thr Thr Leu Thr Ser Ile Gln Met Gly Met Ala 385 390 395 400 Ala Glu Asp Ser Lys Gly Trp Thr Val Ala Val Glu 405 410 21318PRTArtificial sequencemotif 13 213Ala Asn Lys Arg Trp Val Pro Pro Xaa Gly Lys Gly Ser Leu Tyr Ile 1 5 10 15 Arg Pro 21418PRTArtificial sequencemotif 14 214Arg Pro Xaa Glu Asn Ala Xaa Arg Met Xaa Xaa Gly Ala Xaa Arg Xaa 1 5 10 15 Cys Met 21518PRTArtificial sequencemotif 15 215Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Xaa 1 5 10 15 Glu Asp 21612PRTArtificial sequencesignature sequence 216Ala Asn Xaa Xaa Trp Xaa Pro Pro Xaa Gly Lys Gly 1 5 10 2173803DNAArtificial sequenceexpression cassette 217aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc 
caatctccca tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa tgcttttata gcgttatcct 1800agctgtagtt cagttaatag gtaatacccc tatagtttag tcaggagaag aacttatccg 1860atttctgatc tccattttta attatatgaa atgaactgta gcataagcag tattcatttg 1920gattattttt tttattagct ctcacccctt cattattctg agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt
ggttattcct tgactgcttg attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt atactgcttg ttcttatgat tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt caccagcaaa gttcatttaa atcaactagg gatatcacaa 2220gtttgtacaa aaaagcaggc ttaaacaatg gagagaagcg ccgtctttgg tggtctgcaa 2280ccaaattacc ttctttaccc ctcacccaac tcttcatccc ttcctttctc agaccaccgc 2340gctagacttc caaatttctc tcctcctccc tctctgtctc tcaagataca taagcaggtt 2400tcttcttgtt ttaaagctgt gtctcctttt aagcgtggag ctgcgttttc tgatacacac 2460agtgacacat ttgaattagc tgacatagac tgggatgacc ttggatttgc atacgttccc 2520actgattata tgtattcaat gaaatgcact aaaggtggaa acttttccaa aggtgaatta 2580cagagatatg gaaacattga actgaaccct tctgctggcg tcttaaatta tggccaggga 2640ttgtttgaag gtctgaaagc ctacaggaaa gaagatggta accttcttct atttcgtcct 2700gaggaaaatg ctatgcggat gataatgggt gcagagagga tgtgcatgcc atcaccgaca 2760attgatcagt ttgtggatgc agtaaaagca actgttttag caaacaaacg ttgggttcct 2820cctccaggta aaggttcctt atatatcaga ccattgctaa tggggagtgg agctgttctt 2880ggtcttgcac ctgctcctga gtataccttt ctcatttatg tttcaccggt ggggaactat 2940tttaaggaag gtgtggcacc aattcattta attgtggagc atgaacttca tcgagcaact 3000cctggtggca ctggaggtgt gaagactata gggaattatg ctgcggttct caaggcacaa 3060tctgctgcaa aagccagagg tttttctgac gttttatatc ttgattgtgt acataaaaag 3120tatctagaag aggtttcctc ttgcaacatt tttgttgtga agggtaacag catctccact 3180cctgcaataa aagggacaat cctaccagga attacaagga agagcataat tgatgttgct 3240cgaagccaag gatttcaggt tgaggaacgg cttgtgacag tagatgaatt gcttgatgct 3300gatgaggttt tttgtaccgg aacagctgtt gttgtgtcac ctgtgggaag catcacctac 3360aagggtaaaa gggtgtctta tggcgtagaa ggttttggtg ctgtctcgca acaactctat 3420agtgtgctaa ccaagctaca gatgggcctt atagaggaca agatgaattg gactgtggag 3480ctgagttagg cgtactgcag tgaacccagc tttcttgtac aaagtggtga tatcacaagc 3540ccgggcggtc ttctagggat aacagggtaa ttatatccct ctagatcaca agcccgggcg 3600gtcttctacg atgattgagt aataatgtgt cacgcatcac catgggtggc agtgtcagtg 3660tgagcaatga cctgaatgaa caattgaaat gaaaagaaaa aaagtactcc atctgttcca 3720aattaaaatt ggttttaacc ttttaatagg tttatacaat aattgatata 
tgttttctgt 3780atatgtctaa tttgttatca tcc 38032182194DNAOryza sativa 218aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga 
acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa tgcttttata gcgttatcct 1800agctgtagtt cagttaatag gtaatacccc tatagtttag tcaggagaag aacttatccg 1860atttctgatc tccattttta attatatgaa atgaactgta gcataagcag tattcatttg 1920gattattttt tttattagct ctcacccctt cattattctg agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt ggttattcct tgactgcttg attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt atactgcttg ttcttatgat tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt caccagcaaa gttc 219421952DNAArtificial sequenceprimer prm15099 219ggggacaagt ttgtacaaaa aagcaggctt aaacaatgga gagaagcgcc gt 5222050DNAArtificial sequenceprimer prm15100 220ggggaccact ttgtacaaga aagctgggtt cactgcagta cgcctaactc 50221380PRTArtificial sequenceConsensus 221Met Gly Xaa Xaa Glu Glu Xaa Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 15 Xaa Ser Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Tyr Xaa 20 25 30 Asp Trp Ser Xaa Ser Met Gln Ala Tyr Tyr Xaa Xaa Gly Ala Xaa Pro 35 40 45 Xaa Xaa Phe Phe Xaa Ser Xaa Val Ala Ser Pro Thr Pro His Pro Tyr 50 55 60 Met Trp Gly Gly Gln His Xaa Met Met Pro Pro Xaa Tyr Gly Thr Pro 65 70 75 80 Val Pro Tyr Pro Ala Leu Tyr Pro Pro Gly Gly Val Tyr Ala His Pro 85 90 95 Xaa Met Xaa Met Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 100 105 110 Xaa Xaa Lys Xaa Xaa Asp Gly Lys Asp Arg Xaa Ser Xaa Lys Lys Xaa 115 120 125 Lys Gly Xaa Ser Gly Xaa Xaa Gly Xaa Xaa Xaa Xaa Xaa Xaa Gly Glu 130 135 140 Xaa Gly Lys Ala Xaa Ser Gly Ser Gly Asn Asp Gly Xaa Ser Xaa Xaa 145 150 155 160 Ser Glu Xaa Ser Gly Ser Glu Gly Ser Ser Asp Ala Ser Asp Glu Asn 165 170 175 Xaa Asn Xaa Gln Glu Xaa Ala Ala Xaa Lys Lys Gly Ser Phe Xaa Gln 180 185 190 Met Leu Ala Asp Ala Ala Xaa Xaa Gln Asn Xaa Xaa Xaa Xaa Xaa Xaa 195 200 205 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215 220 Xaa Xaa Xaa Xaa Xaa Val Pro Gly Xaa Xaa Val Val Ser Met Pro Ala 225 230 235 
240 Thr Asn Xaa Leu Asn Ile Gly Met Asp Leu Trp Asn Ala Ser Xaa Ala 245 250 255 Gly Xaa Xaa Xaa Xaa Lys Xaa Xaa Xaa Met Xaa Xaa Asn Xaa Xaa Xaa 260 265 270 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 275 280 285 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asp Xaa Xaa Glu Leu Lys Arg Gln 290 295 300 Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 305 310 315 320 Lys Gln Ala Glu Cys Glu Glu Leu Gln Xaa Arg Val Glu Xaa Leu Ser 325 330 335 Xaa Glu Asn Xaa Ser Leu Arg Asp Glu Leu Gln Arg Leu Ser Glu Glu 340 345 350 Cys Glu Lys Leu Thr Ser Glu Asn Xaa Ser Ile Lys Glu Glu Leu Xaa 355 360 365 Arg Leu Xaa Gly Pro Glu Ala Val Ala Xaa Leu Glu 370 375 380 222463PRTArtificial sequenceConsensus 222Met Gly Asn Xaa Glu Glu Xaa Lys Xaa Xaa Xaa Xaa Xaa Xaa Lys Xaa 1 5 10 15 Xaa Xaa Xaa Ser Ser Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 20 25 30 Xaa Xaa Xaa Xaa Xaa His Val Tyr Xaa Asp Trp Ala Ala Met Gln Ala 35 40 45 Tyr Tyr Gly Pro Arg Val Ala Ile Pro Pro Tyr Tyr Asn Ser Ala Val 50 55 60 Ala Ser Gly Xaa Xaa His Ala Pro Xaa Pro Tyr Met Trp Gly Pro Pro 65 70 75 80 Gln Pro Met Met Pro Pro Tyr Gly Xaa Pro Tyr Ala Ala Xaa Xaa Xaa 85 90 95 Xaa Xaa Gly Xaa Val Tyr Xaa His Pro Ala Val Xaa Ile Gly Xaa Xaa 100 105 110 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Ser Xaa Pro Xaa Xaa Xaa Xaa Xaa Xaa 115 120 125 Xaa Gly Thr Xaa Leu Ser Ile Asp Thr Pro Xaa Lys Ser Ser Gly Asn 130 135 140 Thr Asp Gln Gly Leu Met Lys Lys Leu Lys Gly Phe Asp Gly Leu Ala 145 150 155 160 Met Ser Ile Gly Asn Xaa Xaa Xaa Glu Ser Ala Glu Xaa Xaa Ala Xaa 165 170 175 Xaa Xaa Arg Xaa Ser Gln Ser Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 180 185 190 Xaa Xaa Xaa Xaa Xaa Xaa Asp Thr Glu Gly Ser Ser Asp Gly Ser Asp 195 200 205 Gly Asn Thr Thr Gly Ala Xaa Gln Xaa Arg Xaa Lys Arg Ser Arg Glu 210 215 220 Gly Thr Pro Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 225 230 235 240 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gly Xaa Ser Xaa Lys Xaa Xaa Xaa 245 250 255 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Gly Xaa Val 260 265 
270 Val Ser Xaa Xaa Met Xaa Thr Xaa Xaa Xaa Leu Glu Leu Arg Asn Xaa 275 280 285 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 290 295 300 Xaa Ala Val Val Pro Xaa Glu Xaa Trp Leu Gln Asn Glu Arg Glu Leu 305 310 315 320 Lys Arg Glu Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser 325 330 335 Arg Leu Arg Lys Gln Ala Glu Thr Glu Glu Leu Ala Arg Lys Val Glu 340 345 350 Ser Leu Thr Ala Glu Asn Leu Thr Leu Lys Ser Glu Ile Asn Gln Leu 355 360 365 Thr Glu Xaa Ser Glu Lys Leu Arg Leu Glu Asn Ala Ala Leu Leu Glu 370 375 380 Lys Leu Lys Asn Ala Gln Leu Gly Xaa Xaa Xaa Glu Ile Xaa Leu Xaa 385 390 395 400 Xaa Xaa Asp Xaa Xaa Arg Xaa Xaa Pro Val Ser Thr Glu Asn Leu Leu 405 410 415 Ser Arg Val Asn Asn Xaa Xaa Gly Ser Xaa Asp Arg Xaa Xaa Glu Xaa 420 425 430 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asn Ser Gly Ala Lys Leu His 435 440 445 Gln Leu Leu Asp Ala Ser Pro Arg Ala Asp Ala Val Ala Ala Gly 450 455 460






