Patent application title: Plants Having Enhanced Yield-Related Traits and Method for Making the Same
Inventors:
Ana Isabel Sanz Molinero (Madrid, ES)
Valerie Frankard (Waterloo, BE)
Steven Vandenabeele (Oudenaarde, BE)
Assignees:
BASF Plant Science Company GmbH
IPC8 Class: AC12N1582FI
USPC Class:
800290
Class name: Multicellular living organisms and unmodified parts thereof and related processes method of introducing a polynucleotide molecule into or rearrangement of genetic material within a plant or plant part the polynucleotide alters plant part growth (e.g., stem or tuber length, etc.)
Publication date: 2014-02-20
Patent application number: 20140053298
Abstract:
Provided is a method for enhancing yield-related traits in plants by
modulating expression of a nucleic acid encoding a bZIP-like polypeptide
or a BCAT4-like polypeptide in a plant. Also provided are plants having
modulated expression of a nucleic acid encoding a bZIP-like polypeptide
or a BCAT4-like polypeptide, which plants have enhanced yield-related
traits compared with control plants. Also provided are constructs
comprising bZIP-like polypeptide-encoding nucleic acids or BCAT4-like
polypeptide-encoding nucleic acids, useful in enhancing yield-related
traits in plants.
Claims:
1-50. (canceled)
51. A method for enhancing yield-related traits in a plant relative to a control plant, comprising: (i) modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, wherein said bZIP-like polypeptide comprises a Basic Leucine Zipper Domain (PF00170) and a G-box binding domain of the MFMR type (PF07777) and one or more of motifs 1 to 3 of SEQ ID NO: 119, SEQ ID NO: 120 and SEQ ID NO: 121; or (ii) modulating expression in a plant of a nucleic acid encoding a BCAT4-like polypeptide, wherein said BCAT4-like polypeptide comprises the signature sequence of SEQ ID NO: 216.
52. The method of claim 51, wherein: (i) said bZIP-like polypeptide comprises one or more of motifs 4 to 6 and/or one or more of motifs 7 to 12; or (ii) said BCAT4-like polypeptide comprises one or more of Motif 13 of SEQ ID NO: 213, Motif 14 of SEQ ID NO: 214, and Motif 15 of SEQ ID NO: 215.
53. The method of claim 51, wherein said BCAT4-like polypeptide is encoded by a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of: (i) the nucleotide sequence of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209 or 211, or the complement thereof; (ii) a nucleotide sequence encoding a polypeptide comprising the amino acid sequence of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210 or 212; (iii) a nucleotide sequence having at least 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity with the nucleotide sequence of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209 or 211, and conferring enhanced yield-related traits in a plant relative to a control plant; (iv) a nucleotide sequence which hybridizes with the nucleotide sequence of (i) or (ii) under stringent hybridization conditions and confers enhanced yield-related traits in a plant relative to a control plant; and (v) a nucleotide sequence encoding a polypeptide having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to the amino acid sequence of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210 or 212, and conferring enhanced yield-related traits in a plant relative to a control plant.
54. The method of claim 51, wherein said modulated expression is effected by: (i) introducing and expressing in a plant said nucleic acid encoding said bZIP-like polypeptide; or (ii) introducing and expressing in a plant said nucleic acid encoding said BCAT4-like polypeptide.
55. The method of claim 51, wherein said enhanced yield-related traits comprise increased yield relative to a control plant, or increased biomass and/or increased seed yield relative to a control plant.
56. The method of claim 51, wherein said nucleic acid encodes a bZIP-like polypeptide, and wherein said enhanced yield-related traits are obtained without effect on flowering time of the plant.
57. The method of claim 51, wherein said nucleic acid encodes a bZIP-like polypeptide, and wherein said enhanced yield-related traits are obtained under non-stress conditions.
58. The method of claim 51, wherein: (i) said nucleic acid encodes a bZIP-like polypeptide, and wherein said enhanced yield-related traits are obtained under conditions of drought stress or nitrogen deficiency; or (ii) said nucleic acid encodes a BCAT4-like polypeptide, and wherein said enhanced yield-related traits are obtained under conditions of drought stress.
59. The method of claim 51, wherein: (i) said nucleic acid encoding a bZIP-like polypeptide is of plant origin or from a dicotyledonous plant; or (ii) said nucleic acid encoding a BCAT4-like polypeptide is of plant origin, from a dicotyledonous plant, from a plant of the family Salicaceae, from a plant of the genus Populus, or from a Populus trichocarpa plant.
60. The method of claim 51, wherein: (i) said nucleic acid encoding a bZIP-like polypeptide encodes any one of the polypeptides listed in Table A1, or is a portion of such a nucleic acid, or a nucleic acid capable of hybridizing with such a nucleic acid; or (ii) said nucleic acid encoding a BCAT4-like polypeptide encodes any one of the polypeptides listed in Table A2, or is a portion of such a nucleic acid, or a nucleic acid capable of hybridizing with such a nucleic acid.
61. The method of claim 51, wherein: (i) said nucleic acid encoding a bZIP-like polypeptide encodes an orthologue or paralogue of any of the polypeptides given in Table A1; or (ii) said nucleic acid encoding a BCAT4-like polypeptide encodes an orthologue or paralogue of any of the polypeptides given in Table A2.
62. The method of claim 51, wherein: (i) said nucleic acid encoding a bZIP-like polypeptide encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4; or (ii) said nucleic acid encoding a BCAT4-like polypeptide encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 142.
63. The method of claim 51, wherein said nucleic acid encoding a bZIP-like polypeptide or said nucleic acid encoding a BCAT4-like polypeptide is operably linked to a constitutive promoter, a medium strength constitutive promoter, a plant promoter, a GOS2 promoter, or a GOS2 promoter from rice.
64. A plant, plant cell or plant part thereof, or a seed or progeny of said plant, obtained by the method of claim 51, wherein said plant, plant cell or plant part, or said seed or progeny, comprises a recombinant nucleic acid encoding said bZIP-like polypeptide or a recombinant nucleic acid encoding a BCAT4-like polypeptide.
65. A construct comprising: (i) a nucleic acid encoding a bZIP-like polypeptide or a nucleic acid encoding a BCAT4-like polypeptide as defined in claim 51; (ii) one or more control sequences capable of driving expression of the nucleic acid of (i); and optionally (iii) a transcription termination sequence.
66. The construct of claim 65, wherein one of said control sequences is a constitutive promoter, a medium strength constitutive promoter, a plant promoter, a GOS2 promoter, or a GOS2 promoter from rice.
67. A plant, plant cell or plant part thereof, comprising the construct of claim 65.
68. A method for the production of a transgenic plant having enhanced yield-related traits relative to a control plant, preferably increased yield, increased seed yield and/or increased biomass relative to a control plant, comprising: (i) introducing and expressing in a plant cell or plant a nucleic acid encoding a bZIP-like polypeptide or a nucleic acid encoding a BCAT4-like polypeptide as defined in claim 51; and (ii) cultivating said plant cell or plant under conditions promoting plant growth and development.
69. A transgenic plant having enhanced yield-related traits relative to a control plant, preferably increased yield, increased seed yield and/or increased biomass relative to a control plant, resulting from modulated expression of a nucleic acid encoding a bZIP-like polypeptide or a nucleic acid encoding a BCAT4-like polypeptide as defined in claim 51, or a transgenic plant cell derived from said transgenic plant.
70. The transgenic plant of claim 69, or a transgenic plant cell derived therefrom, wherein said plant is a crop plant, a monocotyledonous plant or a cereal, or wherein said plant is beet, sugarbeet, alfalfa, sugarcane, rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, einkorn, teff, milo or oats.
71. Harvestable parts of the transgenic plant of claim 69, wherein said harvestable parts are preferably shoot biomass and/or seeds.
72. Products derived from the transgenic plant of claim 69 and/or from harvestable parts of said plant.
73. A method for manufacturing a product, comprising growing the transgenic plant of claim 69 and producing a product from or by said plant or part thereof, including seeds.
Description:
BACKGROUND
[0001] The present invention relates generally to the field of molecular biology and concerns a method for enhancing yield-related traits in plants by modulating expression in a plant of a nucleic acid encoding a bZIP-like (basic Leucine Zipper) polypeptide, or a BCAT4-like (Branched-Chain AminoTransferase 4-like) polypeptide. The present invention also concerns plants having modulated expression of a nucleic acid encoding a bZIP-like polypeptide, which plants have enhanced yield-related traits relative to corresponding wild type plants or other control plants. The invention also provides constructs useful in the methods of the invention.
[0002] The ever-increasing world population and the dwindling supply of arable land available for agriculture fuel research towards increasing the efficiency of agriculture. Conventional means for crop and horticultural improvements utilise selective breeding techniques to identify plants having desirable characteristics. However, such selective breeding techniques have several drawbacks, namely that these techniques are typically labour intensive and result in plants that often contain heterogeneous genetic components that may not always result in the desirable trait being passed on from parent plants. Advances in molecular biology have allowed mankind to modify the germplasm of animals and plants. Genetic engineering of plants entails the isolation and manipulation of genetic material (typically in the form of DNA or RNA) and the subsequent introduction of that genetic material into a plant. Such technology has the capacity to deliver crops or plants having various improved economic, agronomic or horticultural traits.
[0003] A trait of particular economic interest is increased yield. Yield is normally defined as the measurable produce of economic value from a crop. This may be defined in terms of quantity and/or quality. Yield is directly dependent on several factors, for example, the number and size of the organs, plant architecture (for example, the number of branches), seed production, leaf senescence and more. Root development, nutrient uptake, stress tolerance and early vigour may also be important factors in determining yield. Optimizing the abovementioned factors may therefore contribute to increasing crop yield.
[0004] Seed yield is a particularly important trait, since the seeds of many plants are important for human and animal nutrition. Crops such as corn, rice, wheat, canola and soybean account for over half the total human caloric intake, whether through direct consumption of the seeds themselves or through consumption of meat products raised on processed seeds. They are also a source of sugars, oils and many kinds of metabolites used in industrial processes. Seeds contain an embryo (the source of new shoots and roots) and an endosperm (the source of nutrients for embryo growth during germination and during early growth of seedlings). The development of a seed involves many genes, and requires the transfer of metabolites from the roots, leaves and stems into the growing seed. The endosperm, in particular, assimilates the metabolic precursors of carbohydrates, oils and proteins and synthesizes them into storage macromolecules to fill out the grain.
[0005] Another important trait for many crops is early vigour. Improving early vigour is an important objective of modern rice breeding programs in both temperate and tropical rice cultivars. Long roots are important for proper soil anchorage in water-seeded rice. Where rice is sown directly into flooded fields, and where plants must emerge rapidly through water, longer shoots are associated with vigour. Where drill-seeding is practiced, longer mesocotyls and coleoptiles are important for good seedling emergence. The ability to engineer early vigour into plants would be of great importance in agriculture. For example, poor early vigour has been a limitation to the introduction of maize (Zea mays L.) hybrids based on Corn Belt germplasm in the European Atlantic.
[0006] A further important trait is that of improved abiotic stress tolerance. Abiotic stress is a primary cause of crop loss worldwide, reducing average yields for most major crop plants by more than 50% (Wang et al., Planta 218, 1-14, 2003). Abiotic stresses may be caused by drought, salinity, extremes of temperature, chemical toxicity and oxidative stress. The ability to improve plant tolerance to abiotic stress would be of great economic advantage to farmers worldwide and would allow for the cultivation of crops during adverse conditions and in territories where cultivation of crops may not otherwise be possible.
[0007] Crop yield may therefore be increased by optimising one of the above-mentioned factors.
[0008] With respect to bZIP-like polypeptides, bZIP proteins are a group of transcription factors, containing a conserved domain (ZIP) that participates in the formation of homo or heterodimers with other bZIP proteins and a DNA binding domain. These transcription factors form a large family and are present across fungi, animals and plants.
[0009] In plants, up to 13 groups of bZIP transcription factors have been identified using evolutionary studies on nucleotide sequences (Correa et al. 2008).
[0010] A subgroup within the group G of bZIP transcription factors contains a G-BOX motif binding domain. The genes of this group are mostly related to morphogenic responses to light maturation and LEA genes repression, ABA regulation, Adh activation and photomorphogenesis (Correa et al. 2008).
[0011] With respect to BCAT4-like polypeptides, BCAT or Branched Chain AminoTransferase, sometimes also named Branched Chain AminoTransaminase, catalyses the last step of synthesis, the transamination of the branched-chain amino acids leucine, isoleucine and valine to their respective alpha-keto acids and/or the initial step of degradation of these amino acids.
[0012] Diebold et al. (Plant Physiol., 2002, 129, 540-550) have identified seven putative BCAT genes in Arabidopsis. Maloney et al. (Plant Physiol., 2010, 153, 925-936) identified six BCAT genes from the cultivated tomato Solanum lycopersicum.
[0013] Depending on the end use, the modification of certain yield traits may be favoured over others. For example for applications such as forage or wood production, or bio-fuel resource, an increase in the vegetative parts of a plant may be desirable, and for applications such as flour, starch or oil production, an increase in seed parameters may be particularly desirable. Even amongst the seed parameters, some may be favoured over others, depending on the application. Various mechanisms may contribute to increasing seed yield, whether that is in the form of increased seed size or increased seed number.
[0014] It has now been found that various yield-related traits may be improved in plants by modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.
DETAILED DESCRIPTION OF THE INVENTION
[0015] With respect to bZIP-like polypeptides, the present invention shows that modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide gives plants having enhanced yield-related traits relative to control plants.
[0016] With respect to BCAT4-like polypeptides, the present invention shows that modulating expression of a nucleic acid encoding a BCAT4-like polypeptide as defined herein gives plants having enhanced yield-related traits, in particular increased yield or, more particularly, increased seed yield relative to control plants.
[0017] According to a first embodiment, the present invention provides a method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, and optionally selecting for plants having enhanced yield-related traits. According to another embodiment, the present invention provides a method for producing plants having enhanced yield-related traits relative to control plants, wherein said method comprises the steps of modulating expression in said plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as described herein and optionally selecting for plants having enhanced yield-related traits.
[0018] A preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, is by introducing and expressing in a plant a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.
[0019] Any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein. Any reference hereinafter to a "nucleic acid useful in the methods of the invention" is taken to mean a nucleic acid capable of encoding such a bZIP-like polypeptide, or a BCAT4-like polypeptide. The nucleic acid to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid encoding the type of protein which will now be described, hereafter also named "bZIP-like nucleic acid", or "BCAT4-like nucleic acid", or "bZIP-like gene", or "BCAT4-like gene".
[0020] A "bZIP-like polypeptide" as defined herein refers to any polypeptide comprising a Basic Leucine Zipper domain (PF00170, SM00338, PS50217) and a G-box binding domain of the MFMR type (PF07777).
[0021] The Basic Leucine Zipper Domain (bZIP domain, PF00170) is found in many DNA binding eukaryotic proteins. One part of the domain contains a region that mediates sequence specific DNA binding properties and the leucine zipper that is required for the dimerisation of two DNA binding regions. The DNA binding region comprises a number of basic amino acids such as arginine and lysine.
[0022] The G-box binding protein MFMR domain (PF07777) is found at the N-terminus of the PF00170 bZIP domain. It typically ranges in length between 150 and 200 amino acids, but may be shorter, such as in SEQ ID NO: 2. The N-terminal half is rather rich in proline residues and has been termed the PRD (proline rich domain) whereas the C-terminal half is more polar and has been named the MFMR (multifunctional mosaic region). It has been suggested that some of these motifs may be involved in mediating protein-protein interactions.
[0023] Additionally or alternatively, the bZIP-like polypeptide comprises one or more of the following motif(s):
TABLE-US-00001
Motif 1 (SEQ ID NO: 119): ELKR[EQ][KR]RKQSNRESARRSRLRKQAE[CTA]EEL
Motif 2 (SEQ ID NO: 120): [AQ][RH][KR]VE[SAV]L[TS][HAT]ENx[SAT]L[RKQ][SD]E[LI][QNS][RQ][LF]
Motif 3 (SEQ ID NO: 121): [HPL][AN][PI][HPG][PM][YD][MLV]W
[0024] In one embodiment the bZIP-like polypeptide comprises additionally or alternatively one or more of motif(s):
TABLE-US-00002
Motif 4 (SEQ ID NO: 122): RELKRQKRKQSNRESARRSRLRKQAECEELQ
Motif 5 (SEQ ID NO: 123): [AG]TNLN[IM]GMD[LV]WN
Motif 6 (SEQ ID NO: 124): MPPYGTPVPYPA[LIM]YPP
[0025] In another embodiment, the bZIP-like polypeptide comprises additionally or alternatively one or more of motif(s):
TABLE-US-00003
Motif 7 (SEQ ID NO: 125): NE[RL]ELKRE[RK]RKQSNRESARRSRLRKQAE[TA]EELA[RH][KR]V[EDQ][SAV]LT[AT]EN[LM][TAS]L[KRQ]
Motif 8 (SEQ ID NO: 126): [IA][ED][TS]P[TA]KSSGNTD[RQ]GL[MLV][NK]KLK[GE]FDGL[AT]MSIGN
Motif 9 (SEQ ID NO: 127): [PL]PQ[PH]MMPPYG[APT]PY
Motif 10 (SEQ ID NO: 128): NSGAKL[HR]QLLD[AT][SN]PR[AT]DAVAAG
Motif 11 (SEQ ID NO: 129): EI[NS][RKQ][LF]TE[NK]SEK[LM][KR][LM][EQ]N[AS][ATK]L[MRT][EV][KH]
Preferably, motif 7 extends to (Motif 12, SEQ ID NO: 130) WLQNE[RL]ELKRE[RK]RKQSNRESARRSRLRKQAE[AT]EELA[RIH][KR]V[EQ][VSA]LT[AST]EN[ML][ATS]L[QKR].
[0026] The term "bZIP-like" or "bZIP-like polypeptide" as used herein also intends to include homologues as defined hereunder of "bZIP-like polypeptide".
[0027] Motifs 1 to 12 were derived using the MEME algorithm (Bailey and Elkan, Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, Calif., 1994). At each position within a MEME motif, the residues are shown that are present in the query set of sequences with a frequency higher than 0.2. Residues within square brackets represent alternatives.
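The bracket notation described above maps closely onto regular-expression character classes, so a motif can be used to scan a candidate sequence. The following is a minimal sketch only: it assumes that a lowercase "x" stands for any residue and that spaces inside a motif string are layout artefacts; the function names and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
import re
from typing import List

# Motif 3 (SEQ ID NO: 121) as listed in the text.
MOTIF_3 = "[HPL][AN][PI][HPG][PM][YD][MLV]W"


def motif_to_regex(motif: str) -> str:
    """Translate a bracketed motif into a regular-expression pattern.

    Bracketed groups such as [EQ] are already valid character classes.
    Assumption: a lowercase 'x' means "any residue"; spaces are ignored.
    """
    return motif.replace(" ", "").replace("x", ".")


def find_motif(motif: str, sequence: str) -> List[int]:
    """Return the 1-based start position of every occurrence of the motif."""
    pattern = motif_to_regex(motif)
    # The lookahead lets overlapping occurrences be reported as well.
    return [m.start() + 1 for m in re.finditer("(?=" + pattern + ")", sequence)]


# Hypothetical toy sequence used only to exercise the function.
toy_sequence = "MGSSHAPHPYMWGT"
print(find_motif(MOTIF_3, toy_sequence))  # prints [5]
```

Counting how many of motifs 1 to 12 return a non-empty list for a given polypeptide gives a simple way to apply the "one or more motifs" criterion.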
[0028] More preferably, the bZIP-like polypeptide comprises in increasing order of preference, 1, 2, 3 or more of motifs 1 to 12, preferably 4 or more, more preferably 5 or more of motifs 1 to 12, most preferably 6 or more of motifs 1 to 12.
[0029] Additionally or alternatively, the homologue of a bZIP-like protein has in increasing order of preference at least 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 2, provided that the homologous protein comprises any one or more of the conserved motifs as outlined above. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters and preferably with sequences of mature proteins (i.e. without taking into account secretion signals or transit peptides).
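To illustrate what an overall identity figure from a global alignment means, the following sketch implements a basic Needleman-Wunsch alignment and derives a percentage from it. It is not the GAP program and does not reproduce its default parameters: the match, mismatch and linear gap scores are illustrative assumptions, and identity is computed here over the full alignment length including gap columns, which is only one of several conventions in use.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> Tuple[str, str]:
    """Globally align a and b; return the two gapped strings.

    Assumption: simple match/mismatch scores and a linear gap penalty,
    chosen for illustration rather than taken from any real program.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical columns as a percentage of the alignment length."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


# Hypothetical short peptides differing at one position.
print(percent_identity("ELKRQKRK", "ELKREKRK"))  # prints 87.5
```

A candidate homologue would then be compared against a chosen threshold from the list above, for example `percent_identity(candidate, reference) >= 17.0`.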
[0030] In one embodiment the sequence identity level is determined by comparison of the polypeptide sequences over the entire length of the sequence of SEQ ID NO: 2 or SEQ ID NO: 4.
[0031] Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered. Preferably the motifs in a bZIP-like polypeptide have, in increasing order of preference, at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to any one or more of the motifs represented by SEQ ID NO: 119 to SEQ ID NO: 130 (Motifs 1 to 12).
[0032] In other words, in another embodiment a method is provided wherein said bZIP-like polypeptide comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid E186 up to amino acid E240 in SEQ ID NO: 2 (which corresponds to the conserved domain starting with amino acid E262 up to amino acid N322 in SEQ ID NO: 4). Additionally or alternatively, the bZIP-like polypeptide useful in the methods of the present invention comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid M1 up to amino acid N99 in SEQ ID NO: 2 (which corresponds to the conserved domain starting with amino acid M1 up to amino acid R175 in SEQ ID NO: 4). Preferably said bZIP-like polypeptide comprises both conserved domains.
[0033] A "BCAT4-like polypeptide" as defined herein refers to any polypeptide comprising the signature sequence represented by AN(KREN)(RKH)W(VIT)PP(PTAQWFRH)GKG (SEQ ID NO: 216).
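As a hedged illustration, the signature sequence can be turned into a pattern and checked against the sequences permitted by Motif 13 (SEQ ID NO: 213). The sketch assumes that each parenthesised group in SEQ ID NO: 216 lists the alternative residues allowed at a single position, in the same way as the square brackets used for the MEME motifs; the helper names are hypothetical.

```python
import re
from itertools import product
from typing import List

SIGNATURE = "AN(KREN)(RKH)W(VIT)PP(PTAQWFRH)GKG"  # SEQ ID NO: 216
MOTIF_13 = "ANKRWVPP[PT]GKGSLYIRP"                # SEQ ID NO: 213


def signature_to_regex(signature: str) -> "re.Pattern[str]":
    """Compile the signature into a regular expression.

    Assumption: (KREN) means "K, R, E or N at this position".
    """
    return re.compile(signature.replace("(", "[").replace(")", "]"))


def expand(motif: str) -> List[str]:
    """Enumerate every concrete sequence allowed by a bracketed motif."""
    parts = re.findall(r"\[([A-Z]+)\]|([A-Z])", motif)
    choices = [group if group else single for group, single in parts]
    return ["".join(combo) for combo in product(*choices)]


signature = signature_to_regex(SIGNATURE)
for candidate in expand(MOTIF_13):
    # Both expansions of Motif 13 begin with a stretch matching the signature.
    print(candidate, bool(signature.search(candidate)))
```

Under that reading, every sequence matching Motif 13 also contains the signature, whereas the signature alone admits more variation at four positions.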
[0034] The term "BCAT4-like" or "BCAT4-like polypeptide" as used herein also intends to include homologues as defined hereunder of "BCAT4-like polypeptide".
[0035] Preferably, the BCAT4-like polypeptide comprises one or more of the following motifs:
[0036] (i) Motif 13 represented by ANKRWVPP[PT]GKGSLYIRP (SEQ ID NO: 213),
[0037] (ii) Motif 14 represented by RP[ED]ENA[ML]RM[IQK]xGA[ED]R[ML]CM (SEQ ID NO: 214),
[0038] (iii) Motif 15 represented by LNYGQGLFEGLKAYR[KT]ED (SEQ ID NO: 215).
[0039] Motifs 13 to 15 were derived using the MEME algorithm (Bailey and Elkan, Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, Calif., 1994). At each position within a MEME motif, the residues are shown that are present in the query set of sequences with a frequency higher than 0.2. Residues within square brackets represent alternatives.
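The display rule described above (residues shown only when their frequency at a position exceeds 0.2) can be sketched as follows. This is not the MEME algorithm itself, which discovers motifs by expectation maximisation; it only shows how a set of already aligned motif instances could be summarised in bracket notation. The example instances are hypothetical, and writing "x" where no residue exceeds the threshold is an assumption.

```python
from collections import Counter
from typing import List


def consensus_pattern(instances: List[str], threshold: float = 0.2) -> str:
    """Summarise equal-length motif instances in bracket notation."""
    if not instances or len({len(s) for s in instances}) != 1:
        raise ValueError("instances must be non-empty and of equal length")
    pattern = []
    for column in zip(*instances):
        counts = Counter(column)
        # Keep residues whose frequency is strictly above the threshold,
        # most frequent first, ties broken alphabetically.
        kept = sorted(
            (res for res, c in counts.items() if c / len(column) > threshold),
            key=lambda res: (-counts[res], res),
        )
        if not kept:
            pattern.append("x")  # assumption: no residue frequent enough
        elif len(kept) == 1:
            pattern.append(kept[0])
        else:
            pattern.append("[" + "".join(kept) + "]")
    return "".join(pattern)


# Hypothetical aligned instances, not taken from the sequence listing.
toy_instances = ["ANKRW", "ANKKW", "ANRRW", "ANRHW"]
print(consensus_pattern(toy_instances))  # prints AN[KR][RHK]W
```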
[0040] More preferably, the BCAT4-like polypeptide comprises in increasing order of preference, at least 2, or all 3 motifs.
[0041] Additionally or alternatively, the homologue of a BCAT4-like protein has in increasing order of preference at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 142, provided that the homologous protein comprises any one or more of the conserved motifs as outlined above. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters and preferably with sequences of mature proteins (i.e. without taking into account secretion signals or transit peptides).
[0042] In one embodiment the sequence identity level is determined by comparison of the polypeptide sequences over the entire length of the sequence of SEQ ID NO: 142.
[0043] Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered. Preferably the motifs in a BCAT4-like polypeptide have, in increasing order of preference, at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to any one or more of the motifs represented by SEQ ID NO: 213 to SEQ ID NO: 215 (Motifs 13 to 15).
[0044] In other words, in another embodiment a method is provided wherein said BCAT4-like polypeptide comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid 185 up to amino acid 202 in SEQ ID NO: 142, or to the conserved domain starting with amino acid 150 up to amino acid 167 in SEQ ID NO: 142, or to the conserved domain starting with amino acid 126 up to amino acid 143 in SEQ ID NO: 142.
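Each of the three regions of SEQ ID NO: 142 named above spans 18 residues, which is also the number of positions in each of Motifs 13 to 15 as listed. A small sketch of how such 1-based, inclusive coordinates could be applied and how an ungapped identity over a region could be computed is given below. The region labels, the function names and the choice of a position-by-position (ungapped) comparison are assumptions made for illustration; the text does not state how identity to a conserved domain is to be calculated.

```python
from typing import Dict, Tuple

# 1-based inclusive coordinates in SEQ ID NO: 142, as given in the text.
# The labels are hypothetical.
BCAT4_REGIONS: Dict[str, Tuple[int, int]] = {
    "region_185_202": (185, 202),
    "region_150_167": (150, 167),
    "region_126_143": (126, 143),
}


def region_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Slice a sequence using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


def ungapped_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length segments."""
    if not a or len(a) != len(b):
        raise ValueError("segments must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


for name, (start, end) in BCAT4_REGIONS.items():
    print(name, region_length(start, end))

# The two sequences permitted by Motif 13 differ at one of 18 positions.
print(ungapped_identity("ANKRWVPPPGKGSLYIRP", "ANKRWVPPTGKGSLYIRP") >= 70.0)
```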
[0045] The terms "domain", "signature" and "motif" are defined in the "definitions" section herein.
[0046] With respect to bZIP-like polypeptides, the bZIP-like polypeptide sequence, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), preferably clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides.
[0047] Furthermore, bZIP-like polypeptides (at least in their native form) typically have DNA binding activity. More particularly, bZIP-like polypeptides typically bind to the G-box sequence (ABRE oligonucleotide in Example 6, SEQ ID NO: 139). Tools and techniques for measuring DNA binding activity are well known in the art; see, for example, the DNA binding assay in Liao et al. (Planta, 228, 225-240, 2008). Further details are provided in Example 6.
[0048] In addition, bZIP-like polypeptides, when expressed in rice according to the methods of the present invention as outlined in Examples 7 and 8, give plants having increased yield related traits, in particular increased seed yield when grown under conditions of nitrogen limitation or drought stress.
[0049] In one embodiment of the present invention the function of the nucleic acid sequences of the invention is to confer information for synthesis of the bZIP-like polypeptide that increases yield or yield related traits, when such a nucleic acid sequence of the invention is transcribed and translated in a living plant cell.
[0050] With respect to BCAT4-like polypeptides, the BCAT4-like polypeptide is preferably encoded by a nucleic acid molecule comprising a nucleic acid molecule selected from the group consisting of:
[0051] (i) a nucleic acid represented by any one of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211;
[0052] (ii) the complement of a nucleic acid represented by any one of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211;
[0053] (iii) a nucleic acid encoding the polypeptide as represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212, preferably as a result of the degeneracy of the genetic code, said isolated nucleic acid being deducible from a polypeptide sequence as represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212, and further preferably conferring enhanced yield-related traits relative to control plants;
[0054] (iv) a nucleic acid having, in increasing order of preference at least 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity with any of the nucleic acid sequences of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211, and further preferably conferring enhanced yield-related traits relative to control plants;
[0055] (v) a first nucleic acid molecule which hybridizes with a second nucleic acid molecule of (i) to (iv) under stringent hybridization conditions and preferably confers enhanced yield-related traits relative to control plants;
[0056] (vi) a nucleic acid encoding said polypeptide having, in increasing order of preference, at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212, and preferably conferring enhanced yield-related traits relative to control plants; or
[0057] (vii) a nucleic acid comprising any combination(s) of features of (i) to (vi) above.
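The percentage sequence identity thresholds recited in items (iv) and (vi) may be illustrated by the following minimal sketch. It is offered by way of example only and is not the method of the invention: the alignment procedure is not specified in this passage, so the sketch assumes that the two sequences have already been aligned to equal length with "-" as the gap character, and it assumes that identity is counted over all alignment columns other than those in which both sequences carry a gap. The function names percent_identity and meets_threshold are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: columns where both sequences carry a gap are skipped;
    every other column counts towards the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no residue in either sequence
        columns += 1
        if x == y:
            matches += 1  # identical residues (x cannot be a gap here)
    if columns == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, minimum_percent: float) -> bool:
    """True if the pair reaches at least the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent
```

Under these assumptions, a candidate would satisfy a given threshold of item (iv) or (vi) if meets_threshold returns True for the corresponding minimum percentage.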
[0058] Preferably, the polypeptide sequence, when used in the construction of a phylogenetic tree such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group.
[0059] Furthermore, BCAT4-like polypeptides, at least in their native form, typically have transamination activity.
[0060] In addition, BCAT4-like polypeptides, when expressed in rice according to the methods of the present invention as outlined in Examples 7 and 8, give plants having increased yield-related traits, in particular increased seed yield, more in particular an increase in total weight of the seeds, an increase in fill rate, an increase in harvest index and an increased number of seeds.
[0061] In one embodiment of the present invention the function of the nucleic acid sequences of the invention is to confer information for synthesis of the BCAT4-like polypeptide that increases yield or yield-related traits, when such a nucleic acid sequence of the invention is transcribed and translated in a living plant cell.
[0062] With respect to bZIP-like polypeptides, the present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 1, encoding the polypeptide sequence of SEQ ID NO: 2. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any bZIP-like-encoding nucleic acid or bZIP-like polypeptide as defined herein, as exemplified with the bZIP-like encoding nucleic acid of SEQ ID NO: 3.
[0063] Examples of nucleic acids encoding bZIP-like polypeptides are given in Table A1 of the Examples section herein. Such nucleic acids are useful in performing the methods of the invention. The amino acid sequences given in Table A1 of the Examples section are example sequences of orthologues and paralogues of the bZIP-like polypeptide represented by SEQ ID NO: 2, the terms "orthologues" and "paralogues" being as defined herein. Further orthologues and paralogues may readily be identified by performing a so-called reciprocal blast search as described in the definitions section; where the query sequence is SEQ ID NO: 1 or SEQ ID NO: 2, the second BLAST (back-BLAST) would be against Solanum lycopersicum sequences. Where the query sequence is SEQ ID NO: 3 or SEQ ID NO: 4, the second BLAST (back-BLAST) would be against Populus trichocarpa sequences.
[0064] The invention also provides hitherto unknown bZIP-like-encoding nucleic acids and bZIP-like polypeptides useful for conferring enhanced yield-related traits in plants relative to control plants.
[0065] With respect to BCAT4-like polypeptides, the present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 141, encoding the polypeptide sequence of SEQ ID NO: 142. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any BCAT4-like-encoding nucleic acid or BCAT4-like polypeptide as defined herein.
[0066] Examples of nucleic acids encoding BCAT4-like polypeptides are given in Table A2 of the Examples section herein. Such nucleic acids are useful in performing the methods of the invention. The amino acid sequences given in Table A2 of the Examples section are example sequences of orthologues and paralogues of the BCAT4-like polypeptide represented by SEQ ID NO: 142, the terms "orthologues" and "paralogues" being as defined herein. Further orthologues and paralogues may readily be identified by performing a so-called reciprocal blast search as described in the definitions section; where the query sequence is SEQ ID NO: 141 or SEQ ID NO: 142, the second BLAST (back-BLAST) would be against Populus trichocarpa sequences.
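The decision logic of a reciprocal BLAST search may be sketched as follows, by way of illustration only. The sketch assumes that both BLAST searches have already been run and their hits collected as (subject identifier, score) pairs, and that a candidate is retained when the best hit of the back-BLAST is the original query; the precise criteria are those of the definitions section, which is not reproduced in this passage. All function names and sequence identifiers below are hypothetical.

```python
def best_hit(hits):
    """Return the subject identifier with the highest score, or None if empty.

    hits: list of (subject_id, score) tuples from one BLAST search.
    """
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])[0]


def reciprocal_best_hit(query_id, forward_hits, back_hits):
    """Return the candidate orthologue identifier, or None.

    forward_hits: hits of the query against the target database.
    back_hits: dict mapping a candidate identifier to the hits obtained when
               that candidate is searched against the query's source organism
               (for SEQ ID NO: 141 or 142, Populus trichocarpa sequences).
    """
    candidate = best_hit(forward_hits)
    if candidate is None:
        return None
    # Keep the candidate only if the back-BLAST returns the original query.
    if best_hit(back_hits.get(candidate, [])) == query_id:
        return candidate
    return None
```

For example, with hypothetical forward hits [("Os01", 250.0), ("Os02", 180.0)] for a query "PtBCAT4" and back hits {"Os01": [("PtBCAT4", 240.0)]}, the sketch would return "Os01".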
[0067] Nucleic acid variants may also be useful in practising the methods of the invention. Examples of such variants include nucleic acids encoding homologues and derivatives of any one of the amino acid sequences given in Table A1 and A2 of the Examples section, the terms "homologue" and "derivative" being as defined herein. Also useful in the methods of the invention are nucleic acids encoding homologues and derivatives of orthologues or paralogues of any one of the amino acid sequences given in Table A1 and A2 of the Examples section. Homologues and derivatives useful in the methods of the present invention have substantially the same biological and functional activity as the unmodified protein from which they are derived. Further variants useful in practising the methods of the invention are variants in which codon usage is optimised or in which miRNA target sites are removed.
[0068] Further nucleic acid variants useful in practising the methods of the invention include portions of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, nucleic acids hybridising to nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, splice variants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, allelic variants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, and variants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, obtained by gene shuffling. The terms hybridising sequence, splice variant, allelic variant and gene shuffling are as described herein.
[0069] Nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, need not be full-length nucleic acids, since performance of the methods of the invention does not rely on the use of full-length nucleic acid sequences. According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a portion of any one of the nucleic acid sequences given in Table A1 and A2 of the Examples section, or a portion of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1 and A2 of the Examples section.
[0070] A portion of a nucleic acid may be prepared, for example, by making one or more deletions to the nucleic acid. The portions may be used in isolated form or they may be fused to other coding (or non-coding) sequences in order to, for example, produce a protein that combines several activities. When fused to other coding sequences, the resultant polypeptide produced upon translation may be bigger than that predicted for the protein portion.
[0071] With respect to bZIP-like polypeptides, portions useful in the methods of the invention, encode a bZIP-like polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequences given in Table A1 of the Examples section. Preferably, the portion is a portion of any one of the nucleic acids given in Table A1 of the Examples section, or is a portion of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A1 of the Examples section. Preferably the portion is at least 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250 consecutive nucleotides in length, the consecutive nucleotides being of any one of the nucleic acid sequences given in Table A1 of the Examples section, or of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A1 of the Examples section. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 1 or SEQ ID NO: 3. Preferably, the portion encodes a fragment of an amino acid sequence which, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.
[0072] With respect to BCAT4-like polypeptides, portions useful in the methods of the invention, encode a BCAT4-like polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequences given in Table A2 of the Examples section. Preferably, the portion is a portion of any one of the nucleic acids given in Table A2 of the Examples section, or is a portion of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A2 of the Examples section. Preferably the portion is at least 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250 consecutive nucleotides in length, the consecutive nucleotides being of any one of the nucleic acid sequences given in Table A2 of the Examples section, or of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A2 of the Examples section. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 141. Preferably, the portion encodes a fragment of an amino acid sequence which, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.
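The length criterion for a portion, namely a run of at least a stated number of consecutive nucleotides taken from a reference nucleic acid, may be checked as in the following minimal sketch. It is illustrative only: it tests an exact, ungapped match on the given strand, it does not assess biological activity or any of the other preferred features, and the function name is_portion is hypothetical.

```python
def is_portion(candidate: str, reference: str, min_length: int) -> bool:
    """True if candidate is a run of at least min_length consecutive
    nucleotides of reference (exact match, same strand, case-insensitive)."""
    cand = candidate.upper()
    ref = reference.upper()
    return len(cand) >= min_length and cand in ref
```

For instance, a 210-nucleotide fragment excised from a reference sequence would satisfy a minimum of 200 consecutive nucleotides but not a minimum of 250.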
[0073] Another nucleic acid variant useful in the methods of the invention is a nucleic acid capable of hybridising, under reduced stringency conditions, preferably under stringent conditions, with a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein, or with a portion as defined herein. According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a nucleic acid capable of hybridizing to the complement of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or to the complement of a nucleic acid encoding an orthologue, paralogue or homologue of any one of the proteins given in Table A1 and A2.
[0074] Hybridising sequences useful in the methods of the invention encode a POI polypeptide as defined herein, having substantially the same biological activity as the amino acid sequences given in Table A1 and A2 of the Examples section. Preferably, the hybridising sequence is capable of hybridising to the complement of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or to a portion of any of these sequences, a portion being as defined herein, or the hybridising sequence is capable of hybridising to the complement of a nucleic acid encoding an orthologue or paralogue of any one of the amino acid sequences given in Table A1 and A2 of the Examples section.
[0075] With respect to bZIP-like polypeptides, the hybridising sequence is most preferably capable of hybridising to the complement of a nucleic acid as represented by SEQ ID NO: 1 or SEQ ID NO: 3, or to a portion thereof. In one embodiment, the hybridization conditions are of medium stringency, preferably of high stringency, as defined above.
[0076] Preferably, the hybridising sequence encodes a polypeptide with an amino acid sequence which, when full-length and when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.
[0077] With respect to BCAT4-like polypeptides, the hybridising sequence is most preferably capable of hybridising to the complement of a nucleic acid as represented by SEQ ID NO: 141 or to a portion thereof. In one embodiment, the hybridization conditions are of medium stringency, preferably of high stringency, as defined above.
[0078] Preferably, the hybridising sequence encodes a polypeptide with an amino acid sequence which, when full-length and used in the construction of a phylogenetic tree, such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.
[0079] Another nucleic acid variant useful in the methods of the invention is a splice variant encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined hereinabove, a splice variant being as defined herein.
[0080] In another embodiment, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a splice variant of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or a splice variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1 and A2 of the Examples section.
[0081] With respect to bZIP-like polypeptides, preferred splice variants are splice variants of a nucleic acid represented by SEQ ID NO: 1 or SEQ ID NO: 3, or a splice variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2 or SEQ ID NO: 4. Preferably, the amino acid sequence encoded by the splice variant, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.
[0082] With respect to BCAT4-like polypeptides, preferred splice variants are splice variants of a nucleic acid represented by SEQ ID NO: 141, or a splice variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 142. Preferably, the amino acid sequence encoded by the splice variant, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.
[0083] Another nucleic acid variant useful in performing the methods of the invention is an allelic variant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined hereinabove, an allelic variant being as defined herein.
[0084] In yet another embodiment, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1 and A2 of the Examples section.
[0085] With respect to bZIP-like polypeptides, the polypeptides encoded by allelic variants useful in the methods of the present invention have substantially the same biological activity as the bZIP-like polypeptide of SEQ ID NO: 2 and any of the amino acid sequences depicted in Table A1 of the Examples section. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Preferably, the allelic variant is an allelic variant of SEQ ID NO: 1 or SEQ ID NO: 3, or an allelic variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2 or SEQ ID NO: 4. Preferably, the amino acid sequence encoded by the allelic variant, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.
[0086] With respect to BCAT4-like polypeptides, the polypeptides encoded by allelic variants useful in the methods of the present invention have substantially the same biological activity as the BCAT4-like polypeptide of SEQ ID NO: 142 and any of the amino acid sequences depicted in Table A2 of the Examples section. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Preferably, the allelic variant is an allelic variant of SEQ ID NO: 141 or an allelic variant of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 142. Preferably, the amino acid sequence encoded by the allelic variant, when used in the construction of a phylogenetic tree, such as the one depicted in FIG. 7, clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.
[0087] Gene shuffling or directed evolution may also be used to generate variants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, as defined herein; the term "gene shuffling" being as defined herein.
[0088] In yet another embodiment, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a variant of a nucleic acid encoding any one of the proteins given in Table A1 and A2 of the Examples section, or comprising introducing and expressing in a plant a variant of a nucleic acid encoding an orthologue, paralogue or homologue of any of the amino acid sequences given in Table A1 and A2 of the Examples section, which variant nucleic acid is obtained by gene shuffling.
[0089] With respect to bZIP-like polypeptides, the amino acid sequence encoded by the variant nucleic acid obtained by gene shuffling, when used in the construction of the phylogenetic tree as provided in Correa et al. (2008), preferably clusters with the group G of bZIP-like polypeptides rather than with any other group of bZIP-like polypeptides, and/or comprises one or more of the motifs 1 to 12 as described above, and/or has DNA binding activity, and/or has at least 17% sequence identity to SEQ ID NO: 2, or at least 23% sequence identity to SEQ ID NO: 4.
[0090] With respect to BCAT4-like polypeptides, the amino acid sequence encoded by the variant nucleic acid obtained by gene shuffling, when used in the construction of a phylogenetic tree such as the one depicted in FIG. 7, preferably clusters with the group of BCAT4-like polypeptides comprising the amino acid sequence represented by SEQ ID NO: 142 rather than with any other group, and/or comprises the signature peptide as represented by SEQ ID NO: 216, and/or comprises at least one of the motifs 13 to 15 as represented in SEQ ID NO: 213 to 215, respectively, and/or has aminotransferase biological activity, and/or has at least 80% sequence identity to SEQ ID NO: 142.
[0091] Furthermore, nucleic acid variants may also be obtained by site-directed mutagenesis. Several methods are available to achieve site-directed mutagenesis, the most common being PCR based methods (Current Protocols in Molecular Biology. Wiley Eds.).
[0092] bZIP-like polypeptides differing from the sequence of SEQ ID NO: 2 or of SEQ ID NO: 4 by one or several amino acids (substitution(s), insertion(s) and/or deletion(s) as defined above) may equally be useful to increase the yield of plants in the methods and constructs and plants of the invention.
[0093] BCAT4-like polypeptides differing from the sequence of SEQ ID NO: 142 by one or several amino acids (substitution(s), insertion(s) and/or deletion(s) as defined above) may equally be useful to increase the yield of plants in the methods and constructs and plants of the invention.
[0094] Concerning bZIP-like polypeptides, nucleic acids encoding bZIP-like polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the bZIP-like polypeptide-encoding nucleic acid is from a plant, further preferably from a dicotyledonous plant. In one embodiment, the bZIP-like polypeptide-encoding nucleic acid is from the family Solanaceae, preferably from Solanum lycopersicum. In another embodiment, the bZIP-like polypeptide-encoding nucleic acid is from the family Salicaceae, preferably from Populus trichocarpa.
[0095] Nucleic acids encoding BCAT4-like polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the BCAT4-like polypeptide-encoding nucleic acid is from a plant, further preferably from a dicotyledonous plant, more preferably from the family Salicaceae, most preferably the nucleic acid is from Populus trichocarpa.
[0096] In another embodiment the present invention extends to recombinant chromosomal DNA comprising a nucleic acid sequence useful in the methods of the invention, wherein said nucleic acid is present in the chromosomal DNA as a result of recombinant methods, but is not in its natural genetic environment. In a further embodiment the recombinant chromosomal DNA of the invention is comprised in a plant cell.
[0097] Performance of the methods of the invention gives plants having enhanced yield-related traits. In particular performance of the methods of the invention gives plants having increased yield, especially increased seed yield and/or increased biomass relative to control plants. It should be noted that the increased biomass was obtained without effect on flowering time, in particular, no delay in flowering time was observed. The terms "yield" and "seed yield" are described in more detail in the "definitions" section herein.
[0098] Reference herein to enhanced yield-related traits is taken to mean an increase in early vigour and/or in biomass (weight) of one or more parts of a plant, which may include (i) aboveground parts and preferably aboveground harvestable parts and/or (ii) parts below ground and preferably harvestable below ground. In particular, such harvestable parts are biomass and/or seeds, and performance of the methods of the invention results in plants having increased biomass and/or increased seed yield relative to the seed yield of control plants. In one preferred embodiment, the increased yield is increased biomass, in another preferred embodiment, the increased yield is increased seed yield, in yet another embodiment, the increased yield is increased biomass and increased seed yield.
[0099] The present invention provides a method for increasing yield-related traits, in particular yield, especially seed yield and/or biomass of plants, relative to control plants, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein.
[0100] According to a preferred feature of the present invention, performance of the methods of the invention gives plants having an increased growth rate relative to control plants. Therefore, according to the present invention, there is provided a method for increasing the growth rate of plants, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein.
[0101] Performance of the methods of the invention gives plants grown under non-stress conditions or under mild drought conditions increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under non-stress conditions or under mild drought conditions, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.
[0102] Performance of the methods of the invention gives plants grown under conditions of drought, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of drought which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.
[0103] Performance of the methods of the invention gives plants grown under conditions of nutrient deficiency, particularly under conditions of nitrogen deficiency, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of nutrient deficiency, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.
[0104] Performance of the methods of the invention gives plants grown under conditions of salt stress, increased yield-related traits relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield-related traits in plants grown under conditions of salt stress, which method comprises modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide.
[0105] The invention also provides genetic constructs and vectors to facilitate introduction and/or expression in plants of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides. The gene constructs may be inserted into vectors, which may be commercially available, suitable for transforming into plants or host cells and suitable for expression of the gene of interest in the transformed cells. The invention also provides use of a gene construct as defined herein in the methods of the invention.
[0106] More specifically, the present invention provides a construct comprising:
[0107] (a) a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined above;
[0108] (b) one or more control sequences capable of driving expression of the nucleic acid sequence of (a); and optionally
[0109] (c) a transcription termination sequence.
[0110] Preferably, the nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, is as defined above. The terms "control sequence" and "termination sequence" are as defined herein.
[0111] The genetic construct of the invention may be comprised in a host cell, plant cell, seed, agricultural product or plant. Plants or host cells are transformed with a genetic construct such as a vector or an expression cassette comprising any of the nucleic acids described above. Thus the invention furthermore provides plants or host cells transformed with a construct as described above. In particular, the invention provides plants transformed with a construct as described above, which plants have increased yield-related traits as described herein.
[0112] In one embodiment the genetic construct of the invention confers increased yield or yield-related trait(s) to a plant when it has been introduced into said plant, which plant expresses the nucleic acid encoding the POI comprised in the genetic construct. In another embodiment the genetic construct of the invention confers increased yield or yield-related trait(s) to a plant comprising plant cells in which the construct has been introduced, which plant cells express the nucleic acid encoding the POI comprised in the genetic construct.
[0113] The skilled artisan is well aware of the genetic elements that must be present on the genetic construct in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter).
[0114] Advantageously, any type of promoter, whether natural or synthetic, may be used to drive expression of the nucleic acid sequence, but preferably the promoter is of plant origin. A constitutive promoter is particularly useful in the methods. See the "Definitions" section herein for definitions of the various promoter types.
[0115] The constitutive promoter is preferably a ubiquitous constitutive promoter of medium strength. More preferably it is a plant derived promoter, e.g. a promoter of plant chromosomal origin, such as a GOS2 promoter or a promoter of substantially the same strength and having substantially the same expression pattern (a functionally equivalent promoter), more preferably the promoter is the GOS2 promoter from rice. Further preferably the constitutive promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 131 or SEQ ID NO: 218, most preferably the constitutive promoter is as represented by SEQ ID NO: 131 or SEQ ID NO: 218. See the "Definitions" section herein for further examples of constitutive promoters.
[0116] With respect to bZIP-like polypeptides, it should be clear that the applicability of the present invention is not restricted to the bZIP-like polypeptide-encoding nucleic acid represented by SEQ ID NO: 1 or SEQ ID NO: 3, nor is the applicability of the invention restricted to the rice GOS2 promoter when expression of a bZIP-like polypeptide-encoding nucleic acid is driven by a constitutive promoter.
[0117] With respect to BCAT4-like polypeptides, it should be clear that the applicability of the present invention is not restricted to the BCAT4-like polypeptide-encoding nucleic acid represented by SEQ ID NO: 141, nor is the applicability of the invention restricted to expression of a BCAT4-like polypeptide-encoding nucleic acid when driven by a constitutive promoter.
[0118] With respect to bZIP-like polypeptides, optionally, one or more terminator sequences may be used in the construct introduced into a plant. Preferably, the construct comprises an expression cassette comprising a GOS2 promoter, substantially similar to SEQ ID NO: 131, operably linked to the nucleic acid encoding the bZIP-like polypeptide. More preferably, the construct comprises a zein terminator (t-zein) linked to the 3' end of the bZIP-like coding sequence. Most preferably, the expression cassette comprises a sequence having in increasing order of preference at least 95%, at least 96%, at least 97%, at least 98%, at least 99% identity to the sequence represented by SEQ ID NO: 132 (Le expression cassette) or SEQ ID NO: 133 (Pt expression cassette). Furthermore, one or more sequences encoding selectable markers may be present on the construct introduced into a plant.
[0119] With respect to BCAT4-like polypeptides, optionally, one or more terminator sequences may be used in the construct introduced into a plant. Preferably, the construct comprises an expression cassette comprising a GOS2 promoter, substantially similar to SEQ ID NO: 218, operably linked to the nucleic acid encoding the BCAT4-like polypeptide. More preferably, the construct comprises a zein terminator (t-zein) linked to the 3' end of the BCAT4-like polypeptide coding sequence. Most preferably, the expression cassette comprises a sequence having in increasing order of preference at least 95%, at least 96%, at least 97%, at least 98%, at least 99% identity to the sequence represented by SEQ ID NO: 217 (pGOS2::BCAT4-like::t-zein sequence). Furthermore, one or more sequences encoding selectable markers may be present on the construct introduced into a plant.
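The cassette layout described in the two preceding paragraphs (a promoter operably linked to the coding sequence, optionally followed by a terminator at the 3' end) can be pictured with a short Python sketch. This is an illustration only: the class name `ExpressionCassette`, its fields and the `layout` method are hypothetical names chosen for this example, and the sketch handles element labels, not actual nucleotide sequences.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpressionCassette:
    """Hypothetical record of the ordered elements of an expression cassette."""

    promoter: str                      # e.g. "GOS2"
    coding_sequence: str               # e.g. "BCAT4-like"
    terminator: Optional[str] = None   # optional, e.g. "zein"

    def layout(self) -> str:
        """Return the elements in 5' to 3' order, joined with '::'."""
        parts = [f"p{self.promoter}", self.coding_sequence]
        if self.terminator is not None:
            parts.append(f"t-{self.terminator}")
        return "::".join(parts)


# Labels taken from the text; no sequence data is involved.
bcat4_cassette = ExpressionCassette("GOS2", "BCAT4-like", "zein")
print(bcat4_cassette.layout())  # pGOS2::BCAT4-like::t-zein

# The terminator is optional, so a cassette without one is also valid here.
print(ExpressionCassette("GOS2", "bZIP-like").layout())  # pGOS2::bZIP-like
```

The `::` notation mirrors the "pGOS2::BCAT4-like::t-zein" label used above for SEQ ID NO: 217; selectable markers and other vector elements are deliberately left out of the sketch.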
[0120] According to a preferred feature of the invention, the modulated expression is increased expression. Methods for increasing expression of nucleic acids or genes, or gene products, are well documented in the art and examples are provided in the definitions section.
[0121] As mentioned above, a preferred method for modulating expression of a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, is by introducing and expressing in a plant a nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide; however, the effects of performing the method, i.e. enhancing yield-related traits, may also be achieved using other well-known techniques, including but not limited to T-DNA activation tagging, TILLING and homologous recombination. A description of these techniques is provided in the definitions section.
[0122] The invention also provides a method for the production of transgenic plants having enhanced yield-related traits relative to control plants, comprising introduction and expression in a plant of any nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined herein.
[0123] With respect to bZIP-like polypeptides, the present invention more specifically provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased yield, which method comprises:
[0124] (i) introducing and expressing in a plant or plant cell a bZIP-like polypeptide-encoding nucleic acid or a genetic construct comprising a bZIP-like polypeptide-encoding nucleic acid; and
[0125] (ii) cultivating the plant cell under conditions promoting plant growth and development.
[0126] The nucleic acid of (i) may be any of the nucleic acids capable of encoding a bZIP-like polypeptide as defined herein.
[0127] Preferably, the nucleic acid encoding bZIP-like polypeptide as defined herein and to be introduced into the plant is an isolated nucleic acid or is comprised in a genetic construct.
[0128] With respect to BCAT4-like polypeptides, the present invention more specifically provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased yield, more in particular increased seed yield, which method comprises:
[0129] (i) introducing and expressing in a plant or plant cell a BCAT4-like polypeptide-encoding nucleic acid or a genetic construct comprising a BCAT4-like polypeptide-encoding nucleic acid; and
[0130] (ii) cultivating the plant cell under conditions promoting plant growth and development.
[0131] The nucleic acid of (i) may be any of the nucleic acids capable of encoding a BCAT4-like polypeptide as defined herein.
[0132] Preferably, the nucleic acid encoding BCAT4-like polypeptide as defined herein and to be introduced into the plant is an isolated nucleic acid or is comprised in a genetic construct.
[0133] Cultivating the plant cell under conditions promoting plant growth and development may or may not include regeneration and/or growth to maturity. Accordingly, in a particular embodiment of the invention, the plant cell transformed by the method according to the invention is regenerable into a transformed plant. In another particular embodiment, the plant cell transformed by the method according to the invention is not regenerable into a transformed plant, i.e. it is a cell that is not capable of regenerating into a plant using cell culture techniques known in the art. While plant cells generally have the characteristic of totipotency, some plant cells cannot be used to regenerate or propagate intact plants from said cells. In one embodiment of the invention the plant cells of the invention are such cells. In another embodiment the plant cells of the invention are plant cells that do not sustain themselves in an autotrophic way.
[0134] The nucleic acid may be introduced directly into a plant cell or into the plant itself (including introduction into a tissue, organ or any other part of a plant). According to a preferred feature of the present invention, the nucleic acid is preferably introduced into a plant or plant cell by transformation. The term "transformation" is described in more detail in the "definitions" section herein.
[0135] In one embodiment the present invention extends to any plant cell or plant produced by any of the methods described herein, and to all plant parts and propagules thereof.
[0136] The present invention encompasses plants or parts thereof (including seeds) obtainable by the methods according to the present invention. The plants or plant parts or plant cells comprise a nucleic acid transgene encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined above, preferably in a genetic construct such as an expression cassette. The present invention extends further to encompass the progeny of a primary transformed or transfected cell, tissue, organ or whole plant that has been produced by any of the aforementioned methods, the only requirement being that progeny exhibit the same genotypic and/or phenotypic characteristic(s) as those produced by the parent in the methods according to the invention.
[0137] In a further embodiment the invention extends to seeds comprising the expression cassettes of the invention, the genetic constructs of the invention, or the nucleic acids encoding the bZIP-like polypeptide, or the BCAT4-like polypeptide, and/or the bZIP-like polypeptides, or BCAT4-like polypeptides, as described above.
[0138] The invention also includes host cells containing an isolated nucleic acid encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, as defined above. In one embodiment host cells according to the invention are plant cells, yeasts, bacteria or fungi. Host plants for the nucleic acids, construct, expression cassette or the vector used in the method according to the invention are, in principle, advantageously all plants which are capable of synthesizing the polypeptides used in the inventive method. In a particular embodiment the plant cells of the invention overexpress the nucleic acid molecule of the invention.
[0139] The methods of the invention are advantageously applicable to any plant, in particular to any plant as defined herein. Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs. According to an embodiment of the present invention, the plant is a crop plant. Examples of crop plants include but are not limited to chicory, carrot, cassava, trefoil, soybean, beet, sugar beet, sunflower, canola, alfalfa, rapeseed, linseed, cotton, tomato, potato and tobacco. According to another embodiment of the present invention, the plant is a monocotyledonous plant. Examples of monocotyledonous plants include sugarcane. According to another embodiment of the present invention, the plant is a cereal. Examples of cereals include rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, einkorn, teff, milo and oats. In a particular embodiment the plants used in the methods of the invention are selected from the group consisting of maize, wheat, rice, soybean, cotton, oilseed rape including canola, sugarcane, sugar beet and alfalfa. Advantageously the methods of the invention are more efficient than the known methods, because the plants of the invention have increased yield and/or tolerance to an environmental stress compared to control plants used in comparable methods.
[0140] The invention also extends to harvestable parts of a plant such as, but not limited to seeds, leaves, fruits, flowers, stems, roots, rhizomes, tubers and bulbs, which harvestable parts comprise a recombinant nucleic acid encoding a POI polypeptide. The invention furthermore relates to products derived or produced, preferably directly derived or produced, from a harvestable part of such a plant, such as dry pellets, meal or powders, oil, fat and fatty acids, starch or proteins.
[0141] The invention also includes methods for manufacturing a product comprising a) growing the plants of the invention and b) producing said product from or by the plants of the invention or parts thereof, including seeds. In a further embodiment the methods comprise the steps of a) growing the plants of the invention, b) removing the harvestable parts as described herein from the plants and c) producing said product from, or with the harvestable parts of plants according to the invention.
[0142] In one embodiment the products produced by the methods of the invention are plant products such as, but not limited to, a foodstuff, feedstuff, a food supplement, feed supplement, fiber, cosmetic or pharmaceutical. In another embodiment the methods for production are used to make agricultural products such as, but not limited to, plant extracts, proteins, amino acids, carbohydrates, fats, oils, polymers, vitamins, and the like.
[0143] In yet another embodiment the polynucleotides or the polypeptides of the invention are comprised in an agricultural product. In a particular embodiment the nucleic acid sequences and protein sequences of the invention may be used as product markers, for example where an agricultural product was produced by the methods of the invention. Such a marker can be used to identify a product to have been produced by an advantageous process resulting not only in a greater efficiency of the process but also improved quality of the product due to increased quality of the plant material and harvestable parts used in the process. Such markers can be detected by a variety of methods known in the art, for example but not limited to PCR based methods for nucleic acid detection or antibody based methods for protein detection.
[0144] The present invention also encompasses use of nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, as described herein and use of these bZIP-like polypeptides, or BCAT4-like polypeptides, in enhancing any of the aforementioned yield-related traits in plants. For example, nucleic acids encoding POI polypeptide described herein, or the bZIP-like polypeptides, or the BCAT4-like polypeptides, themselves, may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to a gene encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide. The nucleic acids/genes, or the bZIP-like polypeptides, or the BCAT4-like polypeptides, themselves may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having enhanced yield-related traits as defined herein in the methods of the invention. Furthermore, allelic variants of a nucleic acid/gene encoding a bZIP-like polypeptide, or a BCAT4-like polypeptide, may find use in marker-assisted breeding programmes. Nucleic acids encoding bZIP-like polypeptides, or BCAT4-like polypeptides, may also be used as probes for genetically and physically mapping the genes that they are a part of, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes.
[0145] Moreover, concerning the bZIP-like polypeptides, the present invention relates to the following specific embodiments:
[0146] 1. A method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a bZIP-like polypeptide, wherein said bZIP-like polypeptide comprises a Basic Leucine Zipper Domain (PF00170) and a G-box binding domain of the MFMR type (PF07777) and one or more of motifs 1 to 3, represented by SEQ ID NO: 119, SEQ ID NO: 120 or SEQ ID NO: 121.
[0147] 2. Method according to embodiment 1, wherein said bZIP-like polypeptide comprises one or more of motifs 4 to 6.
[0148] 3. Method according to embodiment 1, wherein said bZIP-like polypeptide comprises one or more of motifs 7 to 12.
[0149] 4. Method according to any one of embodiments 1 to 3, wherein said modulated expression is effected by introducing and expressing in a plant said nucleic acid encoding said bZIP-like polypeptide.
[0150] 5. Method according to any one of embodiments 1 to 4, wherein said enhanced yield-related traits comprise increased yield relative to control plants, and preferably comprise increased biomass and/or increased seed yield relative to control plants.
[0151] 6. Method according to any one of embodiments 1 to 5, wherein said enhanced yield-related traits are obtained without effect on flowering time of the plant.
[0152] 7. Method according to any one of embodiments 1 to 3, wherein said enhanced yield-related traits are obtained under non-stress conditions.
[0153] 8. Method according to any one of embodiments 1 to 3, wherein said enhanced yield-related traits are obtained under conditions of drought stress or nitrogen deficiency.
[0154] 9. Method according to any one of embodiments 1 to 8, wherein said nucleic acid encoding a bZIP-like polypeptide is of plant origin, preferably from a dicotyledonous plant.
[0155] 10. Method according to any one of embodiments 1 to 8, wherein said nucleic acid encoding a bZIP-like polypeptide encodes any one of the polypeptides listed in Table A1 or is a portion of such a nucleic acid, or a nucleic acid capable of hybridising with such a nucleic acid.
[0156] 11. Method according to any one of embodiments 1 to 8, wherein said nucleic acid sequence encodes an orthologue or paralogue of any of the polypeptides given in Table A1.
[0157] 12. Method according to any one of embodiments 1 to 11, wherein said nucleic acid encodes the polypeptide represented by SEQ ID NO: 2 or SEQ ID NO: 4.
[0158] 13. Method according to any one of embodiments 1 to 12, wherein said nucleic acid is operably linked to a constitutive promoter, preferably to a medium strength constitutive promoter, preferably to a plant promoter, more preferably to a GOS2 promoter, most preferably to a GOS2 promoter from rice.
[0159] 14. Plant, plant part thereof, including seeds, or plant cell, obtainable by a method according to any one of embodiments 1 to 13, wherein said plant, plant part or plant cell comprises a recombinant nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12.
[0160] 15. Construct comprising:
[0161] (i) nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12;
[0162] (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally
[0163] (iii) a transcription termination sequence.
[0164] 16. Construct according to embodiment 15, wherein one of said control sequences is a constitutive promoter, preferably a medium strength constitutive promoter, preferably a plant promoter, more preferably a GOS2 promoter, most preferably a GOS2 promoter from rice.
[0165] 17. Use of a construct according to embodiment 15 or 16 in a method for making plants having enhanced yield-related traits, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass relative to control plants.
[0166] 18. Plant, plant part or plant cell transformed with a construct according to embodiment 15 or 16.
[0167] 19. Method for the production of a transgenic plant having enhanced yield-related traits relative to control plants, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass relative to control plants, comprising:
[0168] (i) introducing and expressing in a plant cell or plant a nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12; and
[0169] (ii) cultivating said plant cell or plant under conditions promoting plant growth and development.
[0170] 20. Transgenic plant having enhanced yield-related traits relative to control plants, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass, resulting from modulated expression of a nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12 or a transgenic plant cell derived from said transgenic plant.
[0171] 21. Transgenic plant according to embodiment 14, 18 or 20, or a transgenic plant cell derived therefrom, wherein said plant is a crop plant, such as beet, sugarbeet or alfalfa; or a monocotyledonous plant such as sugarcane; or a cereal, such as rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, einkorn, teff, milo or oats.
[0172] 22. Harvestable parts of a plant according to embodiment 21, wherein said harvestable parts are preferably shoot biomass and/or seeds.
[0173] 23. Products derived from a plant according to embodiment 21 and/or from harvestable parts of a plant according to embodiment 22.
[0174] 24. Use of a nucleic acid encoding a bZIP-like polypeptide as defined in any of embodiments 1 to 3 and 9 to 12, for enhancing yield-related traits in plants relative to control plants, preferably for increasing yield, and more preferably for increasing seed yield and/or for increasing biomass in plants relative to control plants.
[0175] 25. A method for manufacturing a product, comprising the steps of growing the plants according to embodiment 14, 18, 20 or 21 and producing said product from or by said plants; or parts thereof, including seeds.
[0176] Moreover, concerning the BCAT4-like polypeptides, the present invention relates to the following specific embodiments:
[0177] 1. A method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a BCAT4-like polypeptide, wherein said BCAT4-like polypeptide comprises the signature sequence represented by SEQ ID NO: 216.
[0178] 2. Method according to embodiment 1, wherein said polypeptide is encoded by a nucleic acid molecule comprising a nucleic acid molecule selected from the group consisting of:
[0179] (i) a nucleic acid represented by any one of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211;
[0180] (ii) the complement of a nucleic acid represented by any one of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211;
[0181] (iii) a nucleic acid encoding the polypeptide as represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212, preferably as a result of the degeneracy of the genetic code, said isolated nucleic acid being deducible from a polypeptide sequence as represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212 and further preferably conferring enhanced yield-related traits relative to control plants;
[0182] (iv) a nucleic acid having, in increasing order of preference at least 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity with any of the nucleic acid sequences of SEQ ID NO: 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, or 211, and further preferably conferring enhanced yield-related traits relative to control plants;
[0183] (v) a first nucleic acid molecule which hybridizes with a second nucleic acid molecule of (i) to (iv) under stringent hybridization conditions and preferably confers enhanced yield-related traits relative to control plants;
[0184] (vi) a nucleic acid encoding said polypeptide having, in increasing order of preference, at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence represented by any one of SEQ ID NO: 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, or 212 and preferably conferring enhanced yield-related traits relative to control plants; or
[0185] (vii) a nucleic acid comprising any combination(s) of features of (i) to (vi) above.
[0186] 3. Method according to embodiment 1 or 2, wherein said modulated expression is effected by introducing and expressing in a plant said nucleic acid encoding said BCAT4-like polypeptide.
[0187] 4. Method according to any one of embodiments 1 to 3, wherein said enhanced yield-related traits comprise increased yield relative to control plants, and preferably comprise increased biomass and/or increased seed yield relative to control plants.
[0188] 5. Method according to any one of embodiments 1 to 4, wherein said enhanced yield-related traits are obtained under conditions of drought stress.
[0189] 6. Method according to any of embodiments 1 to 5, wherein said BCAT4-like polypeptide comprises one or more of the following motifs:
[0190] (i) Motif 13 represented by SEQ ID NO: 213,
[0191] (ii) Motif 14 represented by SEQ ID NO: 214,
[0192] (iii) Motif 15 represented by SEQ ID NO: 215.
[0193] 7. Method according to any one of embodiments 1 to 6, wherein said nucleic acid encoding a BCAT4-like polypeptide is of plant origin, preferably from a dicotyledonous plant, further preferably from the family Salicaceae, more preferably from the genus Populus, most preferably from Populus trichocarpa.
[0194] 8. Method according to any one of embodiments 1 to 7, wherein said nucleic acid encoding a BCAT4-like polypeptide encodes any one of the polypeptides listed in Table A2 or is a portion of such a nucleic acid, or a nucleic acid capable of hybridising with such a nucleic acid.
[0195] 9. Method according to any one of embodiments 1 to 8, wherein said nucleic acid sequence encodes an orthologue or paralogue of any of the polypeptides given in Table A2.
[0196] 10. Method according to any one of embodiments 1 to 9, wherein said nucleic acid encodes the polypeptide represented by SEQ ID NO: 142.
[0197] 11. Method according to any one of embodiments 1 to 10, wherein said nucleic acid is operably linked to a constitutive promoter, preferably to a medium strength constitutive promoter, preferably to a plant promoter, more preferably to a GOS2 promoter, most preferably to a GOS2 promoter from rice.
[0198] 12. Plant, plant part thereof, including seeds, or plant cell, obtainable by a method according to any one of embodiments 1 to 11, wherein said plant, plant part or plant cell comprises a recombinant nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11.
[0199] 13. Construct comprising:
[0200] (i) nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11;
[0201] (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally
[0202] (iii) a transcription termination sequence.
[0203] 14. Construct according to embodiment 13, wherein one of said control sequences is a constitutive promoter, preferably a medium strength constitutive promoter, preferably a plant promoter, more preferably a GOS2 promoter, most preferably a GOS2 promoter from rice.
[0204] 15. Use of a construct according to embodiment 13 or 14 in a method for making plants having enhanced yield-related traits, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass relative to control plants.
[0205] 16. Plant, plant part or plant cell transformed with a construct according to embodiment 13 or 14.
[0206] 17. Method for the production of a transgenic plant having enhanced yield-related traits relative to control plants, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass relative to control plants, comprising:
[0207] (i) introducing and expressing in a plant cell or plant a nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11; and
[0208] (ii) cultivating said plant cell or plant under conditions promoting plant growth and development.
[0209] 18. Transgenic plant having enhanced yield-related traits relative to control plants, preferably increased yield relative to control plants, and more preferably increased seed yield and/or increased biomass, resulting from modulated expression of a nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11 or a transgenic plant cell derived from said transgenic plant.
[0210] 19. Transgenic plant according to embodiment 12, 16 or 18, or a transgenic plant cell derived therefrom, wherein said plant is a crop plant, such as beet, sugarbeet or alfalfa; or a monocotyledonous plant such as sugarcane; or a cereal, such as rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, einkorn, teff, milo or oats.
[0211] 20. Harvestable parts of a plant according to embodiment 19, wherein said harvestable parts are preferably shoot biomass and/or seeds.
[0212] 21. Products derived from a plant according to embodiment 19 and/or from harvestable parts of a plant according to embodiment 20.
[0213] 22. Use of a nucleic acid encoding a BCAT4-like polypeptide as defined in any of embodiments 1, 2 and 6 to 11 for enhancing yield-related traits in plants relative to control plants, preferably for increasing yield, and more preferably for increasing seed yield and/or for increasing biomass in plants relative to control plants.
[0214] 23. A method for manufacturing a product comprising the steps of growing the plants according to embodiment 12, 16, 19 or 20 and producing said product from or by said plants; or parts thereof, including seeds.
[0215] 24. Construct according to embodiment 13 or 14 comprised in a plant cell.
[0216] 25. Recombinant chromosomal DNA comprising the construct according to embodiment 13 or 14.
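The sequence identity thresholds recited in items (iv) and (vi) of embodiment 2 of the BCAT4-like embodiments above can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been aligned by some external tool and are supplied as equal-length strings with "-" marking gaps, and it assumes one particular convention for the denominator (all alignment columns except those where both sequences have a gap). Other tools use other conventions, so this is not necessarily how identity was computed for the sequences of the sequence listing. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: columns in which both sequences carry a gap are ignored;
    every other column counts towards the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == "-" and residue_b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, minimum_percent: float) -> bool:
    """True if the identity is at least the stated minimum (e.g. 50 or 95)."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


# Toy sequences invented for illustration, not taken from the sequence listing.
query = "MKT-LLA"
reference = "MKTALIA"
print(f"{percent_identity(query, reference):.2f}")  # 71.43
print(meets_threshold(query, reference, 50))        # True
print(meets_threshold(query, reference, 95))        # False
```

In this toy alignment five of the seven compared columns match, so the pair would satisfy an "at least 50%" threshold but not an "at least 95%" threshold.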
DEFINITIONS
[0217] The following definitions will be used throughout the present application. The section captions and headings in this application are for convenience and reference purposes only and should not affect in any way the meaning or interpretation of this application. The technical terms and expressions used within the scope of this application are generally to be given the meaning commonly applied to them in the pertinent art of plant biology, molecular biology, bioinformatics and plant breeding. All of the following term definitions apply to the complete content of this application. The terms "essentially", "about", "approximately" and the like, in connection with an attribute or a value, in particular also define exactly the attribute or exactly the value, respectively. The term "about" in the context of a given numeric value or range relates in particular to a value or range that is within 20%, within 10%, or within 5% of the value or range given. As used herein, the term "comprising" also encompasses the term "consisting of".
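As a small arithmetic illustration of the term "about", the following Python sketch checks whether a value lies within a stated fraction (0.20, 0.10 or 0.05) of a reference value. The function name `is_about` is hypothetical, and treating the bounds as inclusive is an assumption of this sketch; the definition above does not say how boundary values are handled.

```python
def is_about(value: float, reference: float, tolerance: float = 0.20) -> bool:
    """True if value lies within `tolerance` (as a fraction) of reference.

    Assumption: the bounds are inclusive.
    """
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return abs(value - reference) <= tolerance * abs(reference)


print(is_about(112, 100))        # True: within 20% of 100
print(is_about(112, 100, 0.10))  # False: not within 10% of 100
print(is_about(103, 100, 0.05))  # True: within 5% of 100
```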
Peptide(s)/Protein(s)
[0218] The terms "peptides", "oligopeptides", "polypeptide" and "protein" are used interchangeably herein and refer to amino acids in a polymeric form of any length, linked together by peptide bonds, unless mentioned herein otherwise.
Polynucleotide(s)/Nucleic Acid(s)/Nucleic Acid Sequence(s)/Nucleotide Sequence(s)
[0219] The terms "polynucleotide(s)", "nucleic acid sequence(s)", "nucleotide sequence(s)", "nucleic acid(s)", "nucleic acid molecule" are used interchangeably herein and refer to nucleotides, either ribonucleotides or deoxyribonucleotides or a combination of both, in a polymeric unbranched form of any length.
Homologue(s)
[0220] "Homologues" of a protein encompass peptides, oligopeptides, polypeptides, proteins and enzymes having amino acid substitutions, deletions and/or insertions relative to the unmodified protein in question and having similar biological and functional activity as the unmodified protein from which they are derived.
[0221] Orthologues and paralogues are two different forms of homologues and encompass evolutionary concepts used to describe the ancestral relationships of genes. Paralogues are genes within the same species that have originated through duplication of an ancestral gene; orthologues are genes from different organisms that have originated through speciation, and are also derived from a common ancestral gene.
[0222] A "deletion" refers to removal of one or more amino acids from a protein.
[0223] An "insertion" refers to one or more amino acid residues being introduced into a predetermined site in a protein. Insertions may comprise N-terminal and/or C-terminal fusions as well as intra-sequence insertions of single or multiple amino acids. Generally, insertions within the amino acid sequence will be smaller than N- or C-terminal fusions, of the order of about 1 to 10 residues. Examples of N- or C-terminal fusion proteins or peptides include the binding domain or activation domain of a transcriptional activator as used in the yeast two-hybrid system, phage coat proteins, (histidine)-6-tag, glutathione S-transferase-tag, protein A, maltose-binding protein, dihydrofolate reductase, Tag•100 epitope, c-myc epitope, FLAG®-epitope, lacZ, CMP (calmodulin-binding peptide), HA epitope, protein C epitope and VSV epitope.
[0224] A "substitution" refers to replacement of amino acids of the protein with other amino acids having similar properties (such as similar hydrophobicity, hydrophilicity, antigenicity, propensity to form or break α-helical structures or β-sheet structures). Amino acid substitutions are typically of single residues, but may be clustered depending upon functional constraints placed upon the polypeptide and may range from 1 to 10 amino acids. The amino acid substitutions are preferably conservative amino acid substitutions. Conservative substitution tables are well known in the art (see for example Creighton (1984) Proteins. W.H. Freeman and Company (Eds) and Table 1 below).
TABLE 1 Examples of conserved amino acid substitutions
Residue | Conservative Substitutions
Ala | Ser
Arg | Lys
Asn | Gln; His
Asp | Glu
Gln | Asn
Cys | Ser
Glu | Asp
Gly | Pro
His | Asn; Gln
Ile | Leu, Val
Leu | Ile; Val
Lys | Arg; Gln
Met | Leu; Ile
Phe | Met; Leu; Tyr
Ser | Thr; Gly
Thr | Ser; Val
Trp | Tyr
Tyr | Trp; Phe
Val | Ile; Leu
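As an illustration only, Table 1 can be expressed as a lookup that reports whether a given replacement is listed as conservative. The dictionary below is transcribed from Table 1; the function name `is_conservative` is a hypothetical helper introduced for this sketch and is not part of the described methods.

```python
# Conservative substitutions transcribed from Table 1
# (original residue -> replacements listed as conservative).
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if Table 1 lists `replacement` for `original` (hypothetical helper)."""
    return replacement in CONSERVATIVE.get(original, set())
```

The lookup is directional, as is the table: Gly to Pro is listed, but Pro has no row of its own, so any substitution starting from Pro returns False in this sketch.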
[0225] Amino acid substitutions, deletions and/or insertions may readily be made using peptide synthetic techniques known in the art, such as solid phase peptide synthesis and the like, or by recombinant DNA manipulation. Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. For example, techniques for making substitution mutations at predetermined sites in DNA are well known to those skilled in the art and include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB, Cleveland, Ohio), QuickChange Site Directed mutagenesis (Stratagene, San Diego, Calif.), PCR-mediated site-directed mutagenesis or other site-directed mutagenesis protocols (see Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989 and yearly updates)).
Derivatives
[0226] "Derivatives" include peptides, oligopeptides, polypeptides which may, compared to the amino acid sequence of the naturally-occurring form of the protein, such as the protein of interest, comprise substitutions of amino acids with non-naturally occurring amino acid residues, or additions of non-naturally occurring amino acid residues. "Derivatives" of a protein also encompass peptides, oligopeptides, polypeptides which comprise naturally occurring altered (glycosylated, acylated, prenylated, phosphorylated, myristoylated, sulphated etc.) or non-naturally altered amino acid residues compared to the amino acid sequence of a naturally-occurring form of the polypeptide. A derivative may also comprise one or more non-amino acid substituents or additions compared to the amino acid sequence from which it is derived, for example a reporter molecule or other ligand, covalently or non-covalently bound to the amino acid sequence, such as a reporter molecule which is bound to facilitate its detection, and non-naturally occurring amino acid residues relative to the amino acid sequence of a naturally-occurring protein. Furthermore, "derivatives" also include fusions of the naturally-occurring form of the protein with tagging peptides such as FLAG, HIS6 or thioredoxin (for a review of tagging peptides, see Terpe, Appl. Microbiol. Biotechnol. 60, 523-533, 2003).
Domain, Motif/Consensus Sequence/Signature
The term "domain" refers to a set of amino acids conserved at specific positions along an alignment of sequences of evolutionarily related proteins. While amino acids at other positions can vary between homologues, amino acids that are highly conserved at specific positions indicate amino acids that are likely essential in the structure, stability or function of a protein. Identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers to determine if any polypeptide in question belongs to a previously identified polypeptide family.
[0227] The term "motif" or "consensus sequence" or "signature" refers to a short conserved region in the sequence of evolutionarily related proteins. Motifs are frequently highly conserved parts of domains, but may also include only part of the domain, or be located outside of a conserved domain (if all of the amino acids of the motif fall outside of a defined domain).
[0228] Specialist databases exist for the identification of domains, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32:D134-D137, (2004)), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server (Swiss Institute of Bioinformatics; Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788 (2003)). Domains or motifs may also be identified using routine techniques, such as by sequence alignment.
[0229] Methods for the alignment of sequences for comparison are well known in the art; such methods include GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and Wunsch ((1970) J Mol Biol 48: 443-453) to find the global (i.e. spanning the complete sequences) alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. The BLAST algorithm (Altschul et al. (1990) J Mol Biol 215: 403-10) calculates percent sequence identity and performs a statistical analysis of the similarity between the two sequences. The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (NCBI). Homologues may readily be identified using, for example, the ClustalW multiple sequence alignment algorithm (version 1.83), with the default pairwise alignment parameters, and a scoring method in percentage. Global percentages of similarity and identity may also be determined using one of the methods available in the MatGAT software package (Campanella et al., BMC Bioinformatics. 2003 Jul. 10; 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences.). Minor manual editing may be performed to optimise alignment between conserved motifs, as would be apparent to a person skilled in the art. Furthermore, instead of using full-length sequences for the identification of homologues, specific domains may also be used. The sequence identity values may be determined over the entire nucleic acid or amino acid sequence or over selected domains or conserved motif(s), using the programs mentioned above with the default parameters. For local alignments, the Smith-Waterman algorithm is particularly useful (Smith T F, Waterman M S (1981) J. Mol. Biol. 147(1); 195-7).
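To make the idea of a global alignment and a percentage identity concrete, the following is a minimal sketch of a Needleman-Wunsch style alignment. It is not the GAP program: the scoring values (match +1, mismatch -1, linear gap penalty -1) and the function names are assumptions chosen for illustration, and real tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences with a simple linear gap penalty (illustrative scoring)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(x == y and x != "-" for x, y in zip(aln_a, aln_b))
    return 100.0 * matches / len(aln_a)
```

Note that the denominator used for percentage identity (alignment length here) is itself a convention; other tools divide by the shorter sequence length or by the number of aligned, non-gap columns.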
Reciprocal BLAST
[0230] Typically, this involves a first BLAST in which a query sequence (for example any of the sequences listed in Table A of the Examples section) is BLASTed against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived. The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first BLAST is from the same species as that from which the query sequence is derived; a BLAST back then ideally results in the query sequence being amongst the highest hits. An orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as that from which the query sequence is derived, and preferably results upon BLAST back in the query sequence being among the highest hits.
[0231] High-ranking hits are those having a low E-value. The lower the E-value, the more significant the score (or in other words the lower the chance that the hit was found by chance). Computation of the E-value is well known in the art. In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In the case of large families, ClustalW may be used, followed by a neighbour joining tree, to help visualize clustering of related genes and to identify orthologues and paralogues.
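The comparison of the first and second BLAST results can be sketched as below. This is only an illustration of the decision logic; it does not run BLAST. The `Hit` record, the function `classify_hits`, and the cut-offs `max_e` and `top_n` are hypothetical names and values introduced for this sketch, and the hit lists are assumed to have been parsed from BLAST output beforehand.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    seq_id: str
    species: str
    e_value: float


def classify_hits(query_id: str,
                  query_species: str,
                  first_hits: List[Hit],
                  back_hits: Dict[str, List[Hit]],
                  max_e: float = 1e-5,   # assumed cut-off for a "high-ranking" hit
                  top_n: int = 3         # assumed size of "among the highest hits"
                  ) -> Dict[str, str]:
    """Label first-BLAST hits as 'paralogue' or 'orthologue' when the BLAST back
    returns the query among its top hits."""
    labels = {}
    # Lower E-value means a more significant hit, so sort ascending.
    ranked = sorted((h for h in first_hits if h.e_value <= max_e),
                    key=lambda h: h.e_value)
    for hit in ranked:
        if hit.seq_id == query_id:
            continue  # the query matching itself is not informative
        back = sorted(back_hits.get(hit.seq_id, []), key=lambda h: h.e_value)[:top_n]
        if not any(b.seq_id == query_id for b in back):
            continue  # BLAST back did not recover the query
        same_species = hit.species == query_species
        labels[hit.seq_id] = "paralogue" if same_species else "orthologue"
    return labels
```

In this sketch a hit is only labelled when the reciprocal condition holds; the text treats the BLAST back result as "ideally" or "preferably" returning the query, so a less strict implementation is equally consistent with the description.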
Hybridisation
[0232] The term "hybridisation" as defined herein is a process wherein substantially homologous complementary nucleotide sequences anneal to each other. The hybridisation process can occur entirely in solution, i.e. both complementary nucleic acids are in solution. The hybridisation process can also occur with one of the complementary nucleic acids immobilised to a matrix such as magnetic beads, Sepharose beads or any other resin. The hybridisation process can furthermore occur with one of the complementary nucleic acids immobilised to a solid support such as a nitro-cellulose or nylon membrane or immobilised by e.g. photolithography to, for example, a siliceous glass support (the latter known as nucleic acid arrays or microarrays or as nucleic acid chips). In order to allow hybridisation to occur, the nucleic acid molecules are generally thermally or chemically denatured to melt a double strand into two single strands and/or to remove hairpins or other secondary structures from single stranded nucleic acids.
[0233] The term "stringency" refers to the conditions under which a hybridisation takes place. The stringency of hybridisation is influenced by conditions such as temperature, salt concentration, ionic strength and hybridisation buffer composition. Generally, low stringency conditions are selected to be about 30° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. Medium stringency conditions are when the temperature is 20° C. below Tm, and high stringency conditions are when the temperature is 10° C. below Tm. High stringency hybridisation conditions are typically used for isolating hybridising sequences that have high sequence similarity to the target nucleic acid sequence. However, nucleic acids may deviate in sequence and still encode a substantially identical polypeptide, due to the degeneracy of the genetic code. Therefore medium stringency hybridisation conditions may sometimes be needed to identify such nucleic acid molecules.
[0234] The Tm is the temperature under defined ionic strength and pH, at which 50% of the target sequence hybridises to a perfectly matched probe. The Tm is dependent upon the solution conditions and the base composition and length of the probe. For example, longer sequences hybridise specifically at higher temperatures. The maximum rate of hybridisation is obtained from about 16° C. up to 32° C. below Tm. The presence of monovalent cations in the hybridisation solution reduces the electrostatic repulsion between the two nucleic acid strands, thereby promoting hybrid formation; this effect is visible for sodium concentrations of up to 0.4M (for higher concentrations, this effect may be ignored). Formamide reduces the melting temperature of DNA-DNA and DNA-RNA duplexes by 0.6 to 0.7° C. for each percent formamide, and addition of 50% formamide allows hybridisation to be performed at 30 to 45° C., though the rate of hybridisation will be lowered. Base pair mismatches reduce the hybridisation rate and the thermal stability of the duplexes. On average and for large probes, the Tm decreases about 1° C. per % base mismatch. The Tm may be calculated using the following equations, depending on the types of hybrids:
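1) DNA-DNA hybrids (Meinkoth and Wahl, Anal. Biochem., 138: 267-284, 1984):
$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6 \times \log_{10}[\mathrm{Na}^{+}]^{a} + 0.41 \times \%[\mathrm{G/C}]^{b} - 500 \times [L^{c}]^{-1} - 0.61 \times \%\,\mathrm{formamide}$
2) DNA-RNA or RNA-RNA hybrids:
$T_m = 79.8\,^{\circ}\mathrm{C} + 18.5\,(\log_{10}[\mathrm{Na}^{+}]^{a}) + 0.58\,(\%\mathrm{G/C}^{b}) + 11.8\,(\%\mathrm{G/C}^{b})^{2} - 820/L^{c}$
3) oligo-DNA or oligo-RNA hybrids (see note d):
For <20 nucleotides: $T_m = 2\,(l_n)$
For 20-35 nucleotides: $T_m = 22 + 1.46\,(l_n)$
Notes (the superscripts a, b and c in the equations are footnote markers, not exponents):
a: or for other monovalent cation, but only accurate in the 0.01-0.4 M range.
b: only accurate for % GC in the 30% to 75% range.
c: L = length of duplex in base pairs.
d: oligo, oligonucleotide; $l_n$ = effective length of primer = 2×(no. of G/C)+(no. of A/T).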
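As a worked illustration, the DNA-DNA and oligonucleotide equations above, together with the stringency offsets stated earlier (about 30° C., 20° C. and 10° C. below Tm for low, medium and high stringency), can be written as short functions. The function and parameter names are assumptions introduced for this sketch; the DNA-RNA equation is omitted, and the validity ranges in the footnotes are not enforced.

```python
import math


def tm_dna_dna(na_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Tm (deg C) for DNA-DNA hybrids, following equation 1) above."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 500.0 / length_bp
            - 0.61 * formamide_percent)


def effective_length(seq):
    """l_n = 2 x (no. of G/C) + (no. of A/T), as defined in the footnote."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T") + s.count("U")
    return 2 * gc + at


def tm_oligo(seq):
    """Tm (deg C) for oligonucleotides, following equation 3) above."""
    n = len(seq)
    ln = effective_length(seq)
    if n < 20:
        return 2 * ln
    if n <= 35:
        return 22 + 1.46 * ln
    raise ValueError("equation 3) is only given for oligonucleotides up to 35 nt")


def hybridisation_temperatures(tm):
    """Approximate temperatures for low, medium and high stringency."""
    return {"low": tm - 30.0, "medium": tm - 20.0, "high": tm - 10.0}
```

For example, a 500 bp duplex with 50% G/C in 0.1 M sodium gives 81.5 - 16.6 + 20.5 - 1.0 = 84.4° C. without formamide; adding 50% formamide lowers the estimate by 30.5° C.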
[0235] Non-specific binding may be controlled using any one of a number of known techniques such as, for example, blocking the membrane with protein-containing solutions, additions of heterologous RNA, DNA, and SDS to the hybridisation buffer, and treatment with RNase. For non-homologous probes, a series of hybridisations may be performed by either (i) progressively lowering the annealing temperature (for example from 68° C. to 42° C.) or (ii) progressively lowering the formamide concentration (for example from 50% to 0%). The skilled artisan is aware of various parameters which may be altered during hybridisation and which will either maintain or change the stringency conditions.
[0236] Besides the hybridisation conditions, specificity of hybridisation typically also depends on the function of post-hybridisation washes. To remove background resulting from non-specific hybridisation, samples are washed with dilute salt solutions. Critical factors of such washes include the ionic strength and temperature of the final wash solution: the lower the salt concentration and the higher the wash temperature, the higher the stringency of the wash. Washes are typically performed at or below hybridisation stringency. A positive hybridisation gives a signal that is at least twice that of the background. Generally, suitable stringent conditions for nucleic acid hybridisation assays or gene amplification detection procedures are as set forth above. More or less stringent conditions may also be selected. The skilled artisan is aware of various parameters which may be altered during washing and which will either maintain or change the stringency conditions.
[0237] For example, typical high stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 65° C. in 1×SSC or at 42° C. in 1×SSC and 50% formamide, followed by washing at 65° C. in 0.3×SSC. Examples of medium stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 50° C. in 4×SSC or at 40° C. in 6×SSC and 50% formamide, followed by washing at 50° C. in 2×SSC. The length of the hybrid is the anticipated length for the hybridising nucleic acid. When nucleic acids of known sequence are hybridised, the hybrid length may be determined by aligning the sequences and identifying the conserved regions described herein. 1×SSC is 0.15M NaCl and 15 mM sodium citrate; the hybridisation solution and wash solutions may additionally include 5×Denhardt's reagent, 0.5-1.0% SDS, 100 μg/ml denatured, fragmented salmon sperm DNA, 0.5% sodium pyrophosphate.
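The SSC strengths quoted above scale linearly from the 1×SSC definition (0.15 M NaCl, 15 mM sodium citrate). The sketch below computes the composition of an n×SSC solution. The sodium ion estimate additionally assumes that the citrate is supplied as trisodium citrate (three sodium ions per citrate), which the text does not state; the function name is hypothetical.

```python
SSC_1X_NACL_M = 0.15       # from the definition of 1xSSC above
SSC_1X_CITRATE_M = 0.015   # 15 mM sodium citrate


def ssc_composition(fold):
    """Molar composition of an n-fold SSC solution (e.g. fold=0.3 for 0.3xSSC)."""
    nacl = fold * SSC_1X_NACL_M
    citrate = fold * SSC_1X_CITRATE_M
    return {
        "NaCl_M": nacl,
        "sodium_citrate_M": citrate,
        # Assumption: trisodium citrate, so each citrate contributes three Na+.
        "Na_total_M_assuming_trisodium": nacl + 3 * citrate,
    }
```

Under that assumption, 1×SSC corresponds to roughly 0.195 M sodium and the 0.3×SSC high stringency wash to roughly 0.06 M, values that fall within the 0.01-0.4 M range for which the DNA-DNA Tm equation is said to be accurate.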
[0238] For the purposes of defining the level of stringency, reference can be made to Sambrook et al. (2001) Molecular Cloning: a laboratory manual, 3rd Edition, Cold Spring Harbor Laboratory Press, CSH, New York or to Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989 and yearly updates).
Splice Variant
[0239] The term "splice variant" as used herein encompasses variants of a nucleic acid sequence in which selected introns and/or exons have been excised, replaced, displaced or added, or in which introns have been shortened or lengthened. Such variants will be ones in which the biological activity of the protein is substantially retained; this may be achieved by selectively retaining functional segments of the protein. Such splice variants may be found in nature or may be manmade. Methods for predicting and isolating such splice variants are well known in the art (see for example Foissac and Schiex (2005) BMC Bioinformatics 6: 25).
Allelic Variant
[0240] "Alleles" or "allelic variants" are alternative forms of a given gene, located at the same chromosomal position. Allelic variants encompass Single Nucleotide Polymorphisms (SNPs), as well as Small Insertion/Deletion Polymorphisms (INDELs). The size of INDELs is usually less than 100 bp. SNPs and INDELs form the largest set of sequence variants in naturally occurring polymorphic strains of most organisms.
Endogenous Gene
[0241] Reference herein to an "endogenous" gene not only refers to the gene in question as found in a plant in its natural form (i.e., without there being any human intervention), but also refers to that same gene (or a substantially homologous nucleic acid/gene) in an isolated form subsequently (re)introduced into a plant (a transgene). For example, a transgenic plant containing such a transgene may encounter a substantial reduction of the transgene expression and/or substantial reduction of expression of the endogenous gene. The isolated gene may be isolated from an organism or may be manmade, for example by chemical synthesis.
Gene Shuffling/Directed Evolution
[0242] "Gene shuffling" or "directed evolution" consists of iterations of DNA shuffling followed by appropriate screening and/or selection to generate variants of nucleic acids or portions thereof encoding proteins having a modified biological activity (Castle et al., (2004) Science 304(5674): 1151-4; U.S. Pat. Nos. 5,811,238 and 6,395,547).
Construct
[0243] A "construct" is artificial DNA (such as, but not limited to, plasmids or viral DNA) capable of replication in a host cell and used for introduction of a DNA sequence of interest into a host cell or host organism. Host cells of the invention may be any cell selected from bacterial cells, such as Escherichia coli or Agrobacterium species cells, yeast cells, fungal, algal or cyanobacterial cells or plant cells. The skilled artisan is well aware of the genetic elements that must be present on the genetic construct in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter) as described herein. Additional regulatory elements may include transcriptional as well as translational enhancers. Those skilled in the art will be aware of terminator and enhancer sequences that may be suitable for use in performing the invention. An intron sequence may also be added to the 5' untranslated region (UTR) or in the coding sequence to increase the amount of the mature message that accumulates in the cytosol, as described in the definitions section. Other control sequences (besides promoter, enhancer, silencer, intron sequences, 3'UTR and/or 5'UTR regions) may be protein and/or RNA stabilizing elements. Such sequences would be known or may readily be obtained by a person skilled in the art.
[0244] The genetic constructs of the invention may further include an origin of replication sequence that is required for maintenance and/or replication in a specific cell type. One example is when a genetic construct is required to be maintained in a bacterial cell as an episomal genetic element (e.g. plasmid or cosmid molecule). Preferred origins of replication include, but are not limited to, the f1-ori and colE1.
[0245] For the detection of the successful transfer of the nucleic acid sequences as used in the methods of the invention and/or selection of transgenic plants comprising these nucleic acids, it is advantageous to use marker genes (or reporter genes). Therefore, the genetic construct may optionally comprise a selectable marker gene. Selectable markers are described in more detail in the "definitions" section herein. The marker genes may be removed or excised from the transgenic cell once they are no longer needed. Techniques for marker removal are known in the art; useful techniques are described above in the definitions section.
Regulatory Element/Control Sequence/Promoter
[0246] The terms "regulatory element", "control sequence" and "promoter" are all used interchangeably herein and are to be taken in a broad context to refer to regulatory nucleic acid sequences capable of effecting expression of the sequences to which they are ligated. The term "promoter" typically refers to a nucleic acid control sequence located upstream from the transcriptional start of a gene and which is involved in recognising and binding of RNA polymerase and other proteins, thereby directing transcription of an operably linked nucleic acid. Encompassed by the aforementioned terms are transcriptional regulatory sequences derived from a classical eukaryotic genomic gene (including the TATA box which is required for accurate transcription initiation, with or without a CCAAT box sequence) and additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) which alter gene expression in response to developmental and/or external stimuli, or in a tissue-specific manner. Also included within the term is a transcriptional regulatory sequence of a classical prokaryotic gene, in which case it may include a -35 box sequence and/or -10 box transcriptional regulatory sequences. The term "regulatory element" also encompasses a synthetic fusion molecule or derivative that confers, activates or enhances expression of a nucleic acid molecule in a cell, tissue or organ.
[0247] A "plant promoter" comprises regulatory elements, which mediate the expression of a coding sequence segment in plant cells. Accordingly, a plant promoter need not be of plant origin, but may originate from viruses or micro-organisms, for example from viruses which attack plant cells. The "plant promoter" can also originate from a plant cell, e.g. from the plant which is transformed with the nucleic acid sequence to be expressed in the inventive process and described herein. This also applies to other "plant" regulatory signals, such as "plant" terminators. The promoters upstream of the nucleotide sequences useful in the methods of the present invention can be modified by one or more nucleotide substitution(s), insertion(s) and/or deletion(s) without interfering with the functionality or activity of either the promoters, the open reading frame (ORF) or the 3'-regulatory region such as terminators or other 3' regulatory regions which are located away from the ORF. It is furthermore possible that the activity of the promoters is increased by modification of their sequence, or that they are replaced completely by more active promoters, even promoters from heterologous organisms. For expression in plants, the nucleic acid molecule must, as described above, be linked operably to or comprise a suitable promoter which expresses the gene at the right point in time and with the required spatial expression pattern.
[0248] For the identification of functionally equivalent promoters, the promoter strength and/or expression pattern of a candidate promoter may be analysed for example by operably linking the promoter to a reporter gene and assaying the expression level and pattern of the reporter gene in various tissues of the plant. Suitable well-known reporter genes include for example beta-glucuronidase or beta-galactosidase. The promoter activity is assayed by measuring the enzymatic activity of the beta-glucuronidase or beta-galactosidase. The promoter strength and/or expression pattern may then be compared to that of a reference promoter (such as the one used in the methods of the present invention). Alternatively, promoter strength may be assayed by quantifying mRNA levels or by comparing mRNA levels of the nucleic acid used in the methods of the present invention, with mRNA levels of housekeeping genes such as 18S rRNA, using methods known in the art, such as Northern blotting with densitometric analysis of autoradiograms, quantitative real-time PCR or RT-PCR (Heid et al., 1996 Genome Research 6: 986-994). Generally, by "weak promoter" is intended a promoter that drives expression of a coding sequence at a low level. By "low level" is intended at levels of about 1/10,000 transcripts to about 1/100,000 transcripts, to about 1/500,000 transcripts per cell. Conversely, a "strong promoter" drives expression of a coding sequence at high level, or at about 1/10 transcripts to about 1/100 transcripts to about 1/1000 transcripts per cell. Generally, by "medium strength promoter" is intended a promoter that drives expression of a coding sequence at a lower level than a strong promoter, in particular at a level that is in all instances below that obtained when under the control of a 35S CaMV promoter.
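One common way to compare the mRNA level of a gene of interest with that of a housekeeping gene from quantitative real-time PCR data is the 2^-ΔCt calculation. The text does not prescribe this calculation; it is shown here only as a hedged sketch, and it assumes an amplification efficiency of about 100% (a doubling per cycle) for both amplicons. The function and parameter names are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Target mRNA level relative to a reference (housekeeping) gene.

    Assumes perfect doubling per PCR cycle for both amplicons, so a
    difference of one threshold cycle (Ct) corresponds to a two-fold
    difference in starting template.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def promoter_ratio(ct_candidate, ct_ref_promoter, ct_housekeeping_a, ct_housekeeping_b):
    """Ratio of normalised expression driven by a candidate promoter versus a
    reference promoter, each normalised to the housekeeping gene measured in
    the same sample."""
    candidate = relative_expression(ct_candidate, ct_housekeeping_a)
    reference = relative_expression(ct_ref_promoter, ct_housekeeping_b)
    return candidate / reference
```

For instance, a target Ct of 25 against a housekeeping Ct of 20 gives 2^-5, about 0.03 of the reference level under the stated assumption; a ratio below 1 from `promoter_ratio` would indicate that the candidate promoter drives lower normalised expression than the reference promoter.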
Operably Linked
[0249] The term "operably linked" as used herein refers to a functional linkage between the promoter sequence and the gene of interest, such that the promoter sequence is able to initiate transcription of the gene of interest.
Constitutive Promoter
[0250] A "constitutive promoter" refers to a promoter that is transcriptionally active during most, but not necessarily all, phases of growth and development and under most environmental conditions, in at least one cell, tissue or organ. Table 2a below gives examples of constitutive promoters.
TABLE 2a Examples of constitutive promoters
Gene Source | Reference
Actin | McElroy et al, Plant Cell, 2: 163-171, 1990
HMGP | WO 2004/070039
CAMV 35S | Odell et al, Nature, 313: 810-812, 1985
CaMV 19S | Nilsson et al., Physiol. Plant. 100: 456-462, 1997
GOS2 | de Pater et al, Plant J Nov; 2(6): 837-44, 1992, WO 2004/065596
Ubiquitin | Christensen et al, Plant Mol. Biol. 18: 675-689, 1992
Rice cyclophilin | Buchholz et al, Plant Mol Biol. 25(5): 837-43, 1994
Maize H3 histone | Lepetit et al, Mol. Gen. Genet. 231: 276-285, 1992
Alfalfa H3 histone | Wu et al. Plant Mol. Biol. 11: 641-649, 1988
Actin 2 | An et al, Plant J. 10(1); 107-121, 1996
34S FMV | Sanger et al., Plant. Mol. Biol., 14, 1990: 433-443
Rubisco small subunit | U.S. Pat. No. 4,962,028
OCS | Leisner (1988) Proc Natl Acad Sci USA 85(5): 2553
SAD1 | Jain et al., Crop Science, 39 (6), 1999: 1696
SAD2 | Jain et al., Crop Science, 39 (6), 1999: 1696
nos | Shaw et al. (1984) Nucleic Acids Res. 12(20): 7831-7846
V-ATPase | WO 01/14572
Super promoter | WO 95/14098
G-box proteins | WO 94/12015
Ubiquitous Promoter
[0251] A "ubiquitous promoter" is active in substantially all tissues or cells of an organism.
Developmentally-Regulated Promoter
[0252] A "developmentally-regulated promoter" is active during certain developmental stages or in parts of the plant that undergo developmental changes.
Inducible Promoter
[0253] An "inducible promoter" has induced or increased transcription initiation in response to a chemical (for a review see Gatz 1997, Annu. Rev. Plant Physiol. Plant Mol. Biol., 48:89-108), environmental or physical stimulus, or may be "stress-inducible", i.e. activated when a plant is exposed to various stress conditions, or "pathogen-inducible", i.e. activated when a plant is exposed to various pathogens.
Organ-Specific/Tissue-Specific Promoter
[0254] An "organ-specific" or "tissue-specific promoter" is one that is capable of preferentially initiating transcription in certain organs or tissues, such as the leaves, roots, seed tissue etc. For example, a "root-specific promoter" is a promoter that is transcriptionally active predominantly in plant roots, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Promoters able to initiate transcription in certain cells only are referred to herein as "cell-specific".
[0255] Examples of root-specific promoters are listed in Table 2b below:
TABLE 2b Examples of root-specific promoters
Gene Source | Reference
RCc3 | Plant Mol Biol. 1995 January; 27(2): 237-48
Arabidopsis PHT1 | Koyama et al. J Biosci Bioeng. 2005 January; 99(1): 38-42.; Mudge et al. (2002, Plant J. 31: 341)
Medicago phosphate transporter | Xiao et al., 2006, Plant Biol (Stuttg). 2006 July; 8(4): 439-49
Arabidopsis Pyk10 | Nitz et al. (2001) Plant Sci 161(2): 337-346
root-expressible genes | Tingey et al., EMBO J. 6: 1, 1987.
tobacco auxin-inducible gene | Van der Zaal et al., Plant Mol. Biol. 16, 983, 1991.
β-tubulin | Oppenheimer, et al., Gene 63: 87, 1988.
tobacco root-specific genes | Conkling, et al., Plant Physiol. 93: 1203, 1990.
B. napus G1-3b gene | U.S. Pat. No. 5,401,836
SbPRP1 | Suzuki et al., Plant Mol. Biol. 21: 109-119, 1993.
LRX1 | Baumberger et al. 2001, Genes & Dev. 15: 1128
BTG-26 Brassica napus | US 20050044585
LeAMT1 (tomato) | Lauter et al. (1996, PNAS 3: 8139)
The LeNRT1-1 (tomato) | Lauter et al. (1996, PNAS 3: 8139)
class I patatin gene (potato) | Liu et al., Plant Mol. Biol. 7 (6): 1139-1154
KDC1 (Daucus carota) | Downey et al. (2000, J. Biol. Chem. 275: 39420)
TobRB7 gene | W Song (1997) PhD Thesis, North Carolina State University, Raleigh, NC USA
OsRAB5a (rice) | Wang et al. 2002, Plant Sci. 163: 273
ALF5 (Arabidopsis) | Diener et al. (2001, Plant Cell 13: 1625)
NRT2;1Np (N. plumbaginifolia) | Quesada et al. (1997, Plant Mol. Biol. 34: 265)
[0256] A "seed-specific promoter" is transcriptionally active predominantly in seed tissue, but not necessarily exclusively in seed tissue (in cases of leaky expression). The seed-specific promoter may be active during seed development and/or during germination. The seed specific promoter may be endosperm/aleurone/embryo specific. Examples of seed-specific promoters (endosperm/aleurone/embryo specific) are shown in Table 2c to Table 2f below. Further examples of seed-specific promoters are given in Qing Qu and Takaiwa (Plant Biotechnol. J. 2, 113-125, 2004), which disclosure is incorporated by reference herein as if fully set forth.
TABLE 2c Examples of seed-specific promoters
Gene source | Reference
seed-specific genes | Simon et al., Plant Mol. Biol. 5: 191, 1985; Scofield et al., J. Biol. Chem. 262: 12202, 1987.; Baszczynski et al., Plant Mol. Biol. 14: 633, 1990.
Brazil Nut albumin | Pearson et al., Plant Mol. Biol. 18: 235-245, 1992.
legumin | Ellis et al., Plant Mol. Biol. 10: 203-214, 1988.
glutelin (rice) | Takaiwa et al., Mol. Gen. Genet. 208: 15-22, 1986; Takaiwa et al., FEBS Letts. 221: 43-47, 1987.
zein | Matzke et al Plant Mol Biol, 14(3): 323-32 1990
napA | Stalberg et al, Planta 199: 515-519, 1996.
wheat LMW and HMW glutenin-1 | Mol Gen Genet 216: 81-90, 1989; NAR 17: 461-2, 1989
wheat SPA | Albani et al, Plant Cell, 9: 171-184, 1997
wheat α, β, γ-gliadins | EMBO J. 3: 1409-15, 1984
barley Itr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Theor Appl Gen 98: 1253-62, 1999; Plant J 4: 343-55, 1993; Mol Gen Genet 250: 750-60, 1996
barley DOF | Mena et al, The Plant Journal, 116(1): 53-62, 1998
blz2 | EP99106056.7
synthetic promoter | Vicente-Carbajosa et al., Plant J. 13: 629-640, 1998.
rice prolamin NRP33 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice α-globulin Glb-1 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice OSH1 | Sato et al, Proc. Natl. Acad. Sci. USA, 93: 8117-8122, 1996
rice α-globulin REB/OHP-1 | Nakase et al. Plant Mol. Biol. 33: 513-522, 1997
rice ADP-glucose pyrophosphorylase | Trans Res 6: 157-68, 1997
maize ESR gene family | Plant J 12: 235-46, 1997
sorghum α-kafirin | DeRose et al., Plant Mol. Biol 32: 1029-35, 1996
KNOX | Postma-Haarsma et al, Plant Mol. Biol. 39: 257-71, 1999
rice oleosin | Wu et al, J. Biochem. 123: 386, 1998
sunflower oleosin | Cummins et al., Plant Mol. Biol. 19: 873-876, 1992
PRO0117, putative rice 40S ribosomal protein | WO 2004/070039
PRO0136, rice alanine aminotransferase | unpublished
PRO0147, trypsin inhibitor ITR1 (barley) | unpublished
PRO0151, rice WSI18 | WO 2004/070039
PRO0175, rice RAB21 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039
α-amylase (Amy32b) | Lanahan et al, Plant Cell 4: 203-211, 1992; Skriver et al, Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al, Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149; 1125-38, 1998
TABLE-US-00008 TABLE 2d Examples of endosperm-specific promoters
Gene source | Reference
glutelin (rice) | Takaiwa et al. (1986) Mol Gen Genet 208: 15-22; Takaiwa et al. (1987) FEBS Letts. 221: 43-47
zein | Matzke et al. (1990) Plant Mol Biol 14(3): 323-32
wheat LMW and HMW glutenin-1 | Colot et al. (1989) Mol Gen Genet 216: 81-90, Anderson et al. (1989) NAR 17: 461-2
wheat SPA | Albani et al. (1997) Plant Cell 9: 171-184
wheat gliadins | Rafalski et al. (1984) EMBO 3: 1409-15
barley Itr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Cho et al. (1999) Theor Appl Genet 98: 1253-62; Muller et al. (1993) Plant J 4: 343-55; Sorenson et al. (1996) Mol Gen Genet 250: 750-60
barley DOF | Mena et al. (1998) Plant J 116(1): 53-62
blz2 | Onate et al. (1999) J Biol Chem 274(14): 9175-82
synthetic promoter | Vicente-Carbajosa et al. (1998) Plant J 13: 629-640
rice prolamin NRP33 | Wu et al. (1998) Plant Cell Physiol 39(8) 885-889
rice globulin Glb-1 | Wu et al. (1998) Plant Cell Physiol 39(8) 885-889
rice globulin REB/OHP-1 | Nakase et al. (1997) Plant Molec Biol 33: 513-522
rice ADP-glucose pyrophosphorylase | Russell et al. (1997) Trans Res 6: 157-68
maize ESR gene family | Opsahl-Ferstad et al. (1997) Plant J 12: 235-46
sorghum kafirin | DeRose et al. (1996) Plant Mol Biol 32: 1029-35
TABLE-US-00009 TABLE 2e Examples of embryo-specific promoters
Gene source | Reference
rice OSH1 | Sato et al, Proc. Natl. Acad. Sci. USA, 93: 8117-8122, 1996
KNOX | Postma-Haarsma et al, Plant Mol. Biol. 39: 257-71, 1999
PRO0151 | WO 2004/070039
PRO0175 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039
TABLE-US-00010 TABLE 2f Examples of aleurone-specific promoters
Gene source | Reference
α-amylase (Amy32b) | Lanahan et al, Plant Cell 4: 203-211, 1992; Skriver et al, Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al, Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149: 1125-38, 1998
[0257] A "green tissue-specific promoter" as defined herein is a promoter that is transcriptionally active predominantly in green tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts.
[0258] Examples of green tissue-specific promoters which may be used to perform the methods of the invention are shown in Table 2g below.
TABLE-US-00011 TABLE 2g Examples of green tissue-specific promoters
Gene | Expression | Reference
Maize Orthophosphate dikinase | Leaf specific | Fukayama et al., Plant Physiol. 2001 November; 127(3): 1136-46
Maize Phosphoenolpyruvate carboxylase | Leaf specific | Kausch et al., Plant Mol Biol. 2001 January; 45(1): 1-15
Rice Phosphoenolpyruvate carboxylase | Leaf specific | Lin et al., 2004 DNA Seq. 2004 August; 15(4): 269-76
Rice small subunit Rubisco | Leaf specific | Nomura et al., Plant Mol Biol. 2000 September; 44(1): 99-106
rice beta expansin EXBP9 | Shoot specific | WO 2004/070039
Pigeonpea small subunit Rubisco | Leaf specific | Panguluri et al., Indian J Exp Biol. 2005 April; 43(4): 369-72
Pea RBCS3A | Leaf specific |
[0259] Another example of a tissue-specific promoter is a meristem-specific promoter, which is transcriptionally active predominantly in meristematic tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Examples of meristem-specific promoters which may be used to perform the methods of the invention are shown in Table 2h below.
TABLE-US-00012 TABLE 2h Examples of meristem-specific promoters
Gene source | Expression pattern | Reference
rice OSH1 | Shoot apical meristem, from embryo globular stage to seedling stage | Sato et al. (1996) Proc. Natl. Acad. Sci. USA, 93: 8117-8122
Rice metallothionein | Meristem specific | BAD87835.1
WAK1 & WAK 2 | Shoot and root apical meristems, and in expanding leaves and sepals | Wagner & Kohorn (2001) Plant Cell 13(2): 303-318
Terminator
[0260] The term "terminator" encompasses a control sequence which is a DNA sequence at the end of a transcriptional unit which signals 3' processing and polyadenylation of a primary transcript and termination of transcription. The terminator can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The terminator to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.
Selectable Marker (Gene)/Reporter Gene
[0261] "Selectable marker", "selectable marker gene" or "reporter gene" includes any gene that confers a phenotype on a cell in which it is expressed to facilitate the identification and/or selection of cells that are transfected or transformed with a nucleic acid construct of the invention. These marker genes enable the identification of a successful transfer of the nucleic acid molecules via a series of different principles. Suitable markers may be selected from markers that confer antibiotic or herbicide resistance, that introduce a new metabolic trait or that allow visual selection. Examples of selectable marker genes include genes conferring resistance to antibiotics (such as nptII that phosphorylates neomycin and kanamycin, or hpt, phosphorylating hygromycin, or genes conferring resistance to, for example, bleomycin, streptomycin, tetracycline, chloramphenicol, ampicillin, gentamycin, geneticin (G418), spectinomycin or blasticidin), to herbicides (for example bar which provides resistance to Basta®; aroA or gox providing resistance against glyphosate, or the genes conferring resistance to, for example, imidazolinone, phosphinothricin or sulfonylurea), or genes that provide a metabolic trait (such as manA that allows plants to use mannose as sole carbon source or xylose isomerase for the utilisation of xylose, or antinutritive markers such as the resistance to 2-deoxyglucose). Expression of visual marker genes results in the formation of colour (for example β-glucuronidase, GUS or β-galactosidase with its coloured substrates, for example X-Gal), luminescence (such as the luciferin/luciferase system) or fluorescence (Green Fluorescent Protein, GFP, and derivatives thereof). This list represents only a small number of possible markers. The skilled worker is familiar with such markers. Different markers are preferred, depending on the organism and the selection method.
[0262] It is known that upon stable or transient integration of nucleic acids into plant cells, only a minority of the cells takes up the foreign DNA and, if desired, integrates it into its genome, depending on the expression vector used and the transfection technique used. To identify and select these integrants, a gene coding for a selectable marker (such as the ones described above) is usually introduced into the host cells together with the gene of interest. These markers can for example be used in mutants in which these genes are not functional by, for example, deletion by conventional methods. Furthermore, nucleic acid molecules encoding a selectable marker can be introduced into a host cell on the same vector that comprises the sequence encoding the polypeptides of the invention or used in the methods of the invention, or else in a separate vector. Cells which have been stably transfected with the introduced nucleic acid can be identified for example by selection (for example, cells which have integrated the selectable marker survive whereas the other cells die).
[0263] Since the marker genes, particularly genes for resistance to antibiotics and herbicides, are no longer required or are undesired in the transgenic host cell once the nucleic acids have been introduced successfully, the process according to the invention for introducing the nucleic acids advantageously employs techniques which enable the removal or excision of these marker genes. One such method is what is known as co-transformation. The co-transformation method employs two vectors simultaneously for the transformation, one vector bearing the nucleic acid according to the invention and a second bearing the marker gene(s). A large proportion of transformants receives or, in the case of plants, comprises (up to 40% or more of the transformants), both vectors. In case of transformation with Agrobacteria, the transformants usually receive only a part of the vector, i.e. the sequence flanked by the T-DNA, which usually represents the expression cassette. The marker genes can subsequently be removed from the transformed plant by performing crosses. In another method, marker genes integrated into a transposon are used for the transformation together with the desired nucleic acid (known as the Ac/Ds technology). The transformants can be crossed with a transposase source or the transformants are transformed with a nucleic acid construct conferring expression of a transposase, transiently or stably. In some cases (approx. 10%), the transposon jumps out of the genome of the host cell once transformation has taken place successfully and is lost. In a further number of cases, the transposon jumps to a different location. In these cases the marker gene must be eliminated by performing crosses. In microbiology, techniques were developed which make possible, or facilitate, the detection of such events. A further advantageous method relies on what are known as recombination systems, whose advantage is that elimination by crossing can be dispensed with. 
The best-known system of this type is what is known as the Cre/lox system. Cre is a recombinase that removes the sequences located between the loxP sequences. If the marker gene is integrated between the loxP sequences, it is removed once transformation has taken place successfully, by expression of the recombinase. Further recombination systems are the HIN/HIX, FLP/FRT and REP/STB systems (Tribble et al., J. Biol. Chem., 275, 2000: 22255-22267; Velmurugan et al., J. Cell Biol., 149, 2000: 553-566). A site-specific integration into the plant genome of the nucleic acid sequences according to the invention is possible. Naturally, these methods can also be applied to microorganisms such as yeast, fungi or bacteria.
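By way of illustration only, the excision of a marker gene located between two recombination sites can be sketched as a string operation. The function name `excise_between_sites`, the stand-in site token `[lox]` and the labelled construct are hypothetical and are not taken from the disclosure; the sketch assumes two sites in direct orientation and ignores every other aspect of recombinase biology.

```python
def excise_between_sites(construct: str, site: str) -> str:
    """Remove the segment lying between the first two copies of `site`.

    Simplified model: one copy of the site remains after excision, and a
    construct with fewer than two sites is returned unchanged.
    """
    first = construct.find(site)
    if first == -1:
        return construct
    second = construct.find(site, first + len(site))
    if second == -1:
        return construct
    # Keep everything up to and including the first site, then resume
    # after the second site, dropping the intervening marker segment.
    return construct[: first + len(site)] + construct[second + len(site):]


# Hypothetical construct: "[lox]" stands in for a recombination site.
SITE = "[lox]"
construct = "promoter-gene-terminator" + SITE + "marker" + SITE + "border"
print(excise_between_sites(construct, SITE))
```

In this toy model the marker segment disappears while the gene of interest and a single site remain, which mirrors the outcome described above.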
Transgenic/Transgene/Recombinant
[0264] For the purposes of the invention, "transgenic", "transgene" or "recombinant" means with regard to, for example, a nucleic acid sequence, an expression cassette, gene construct or a vector comprising the nucleic acid sequence or an organism transformed with the nucleic acid sequences, expression cassettes or vectors according to the invention, all those constructions brought about by recombinant methods in which either
[0265] (a) the nucleic acid sequences encoding proteins useful in the methods of the invention, or
[0266] (b) genetic control sequence(s) which is operably linked with the nucleic acid sequence according to the invention, for example a promoter, or
[0267] (c) a) and b) are not located in their natural genetic environment or have been modified by recombinant methods, it being possible for the modification to take the form of, for example, a substitution, addition, deletion, inversion or insertion of one or more nucleotide residues. The natural genetic environment is understood as meaning the natural genomic or chromosomal locus in the original plant or the presence in a genomic library. In the case of a genomic library, the natural genetic environment of the nucleic acid sequence is preferably retained, at least in part. The environment flanks the nucleic acid sequence at least on one side and has a sequence length of at least 50 bp, preferably at least 500 bp, especially preferably at least 1000 bp, most preferably at least 5000 bp. A naturally occurring expression cassette--for example the naturally occurring combination of the natural promoter of the nucleic acid sequences with the corresponding nucleic acid sequence encoding a polypeptide useful in the methods of the present invention, as defined above--becomes a transgenic expression cassette when this expression cassette is modified by non-natural, synthetic ("artificial") methods such as, for example, mutagenic treatment. Suitable methods are described, for example, in U.S. Pat. No. 5,565,350 or WO 00/15815.
[0268] A transgenic plant for the purposes of the invention is thus understood as meaning, as above, that the nucleic acids used in the method of the invention are not present in, or originating from, the genome of said plant, or are present in the genome of said plant but not at their natural locus in the genome of said plant, it being possible for the nucleic acids to be expressed homologously or heterologously. However, as mentioned, transgenic also means that, while the nucleic acids according to the invention or used in the inventive method are at their natural position in the genome of a plant, the sequence has been modified with regard to the natural sequence, and/or that the regulatory sequences of the natural sequences have been modified. Transgenic is preferably understood as meaning the expression of the nucleic acids according to the invention at an unnatural locus in the genome, i.e. homologous or, preferably, heterologous expression of the nucleic acids takes place. Preferred transgenic plants are mentioned herein.
[0269] It shall further be noted that in the context of the present invention, the term "isolated nucleic acid" or "isolated polypeptide" may in some instances be considered as a synonym for a "recombinant nucleic acid" or a "recombinant polypeptide", respectively and refers to a nucleic acid or polypeptide that is not located in its natural genetic environment and/or that has been modified by recombinant methods.
Modulation
[0270] The term "modulation" means in relation to expression or gene expression, a process in which the expression level is changed by said gene expression in comparison to the control plant; the expression level may be increased or decreased. The original, unmodulated expression may be of any kind of expression of a structural RNA (rRNA, tRNA) or mRNA with subsequent translation. For the purposes of this invention, the original unmodulated expression may also be absence of any expression. The term "modulating the activity" shall mean any change of the expression of the inventive nucleic acid sequences or encoded proteins, which leads to increased yield and/or increased growth of the plants. The expression can increase from zero (absence of, or immeasurable expression) to a certain amount, or can decrease from a certain amount to immeasurably small amounts or zero.
Expression
[0271] The term "expression" or "gene expression" means the transcription of a specific gene or specific genes or specific genetic construct. The term "expression" or "gene expression" in particular means the transcription of a gene or genes or genetic construct into structural RNA (rRNA, tRNA) or mRNA with or without subsequent translation of the latter into a protein. The process includes transcription of DNA and processing of the resulting mRNA product.
Increased Expression/Overexpression
[0272] The term "increased expression" or "overexpression" as used herein means any form of expression that is additional to the original wild-type expression level. For the purposes of this invention, the original wild-type expression level might also be zero, i.e. absence of expression or immeasurable expression.
[0273] Methods for increasing expression of genes or gene products are well documented in the art and include, for example, overexpression driven by appropriate promoters, the use of transcription enhancers or translation enhancers. Isolated nucleic acids which serve as promoter or enhancer elements may be introduced in an appropriate position (typically upstream) of a non-heterologous form of a polynucleotide so as to upregulate expression of a nucleic acid encoding the polypeptide of interest. For example, endogenous promoters may be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., WO9322443), or isolated promoters may be introduced into a plant cell in the proper orientation and distance from a gene of the present invention so as to control the expression of the gene.
[0274] If polypeptide expression is desired, it is generally desirable to include a polyadenylation region at the 3'-end of a polynucleotide coding region. The polyadenylation region can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The 3' end sequence to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.
[0275] An intron sequence may also be added to the 5' untranslated region (UTR) or the coding sequence of the partial coding sequence to increase the amount of the mature message that accumulates in the cytosol. Inclusion of a spliceable intron in the transcription unit in both plant and animal expression constructs has been shown to increase gene expression at both the mRNA and protein levels up to 1000-fold (Buchman and Berg (1988) Mol. Cell. Biol. 8: 4395-4405; Callis et al. (1987) Genes Dev 1:1183-1200). Such intron enhancement of gene expression is typically greatest when the intron is placed near the 5' end of the transcription unit. Use of the maize introns Adh1-S intron 1, 2, and 6, and the Bronze-1 intron is known in the art. For general information see: The Maize Handbook, Chapter 116, Freeling and Walbot, Eds., Springer, N.Y. (1994).
Decreased Expression
[0276] Reference herein to "decreased expression" or "reduction or substantial elimination" of expression is taken to mean a decrease in endogenous gene expression and/or polypeptide levels and/or polypeptide activity relative to control plants. The reduction or substantial elimination is in increasing order of preference at least 10%, 20%, 30%, 40% or 50%, 60%, 70%, 80%, 85%, 90%, or 95%, 96%, 97%, 98%, 99% or more reduced compared to that of control plants.
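As a worked illustration of the percentages recited above, the reduction relative to a control plant can be computed from two measured levels. The function names and the numerical values below are hypothetical; the measurement method and units are not specified here and are assumed to be the same for both plants.

```python
def percent_reduction(control_level: float, plant_level: float) -> float:
    """Percentage by which plant_level is reduced relative to control_level."""
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return 100.0 * (control_level - plant_level) / control_level


def meets_threshold(control_level: float, plant_level: float,
                    threshold_percent: float) -> bool:
    """True if the reduction is at least threshold_percent."""
    return percent_reduction(control_level, plant_level) >= threshold_percent


# Hypothetical expression levels in arbitrary units.
reduction = percent_reduction(control_level=200.0, plant_level=50.0)
print(reduction)                             # 75.0
print(meets_threshold(200.0, 50.0, 70.0))    # True
print(meets_threshold(200.0, 50.0, 80.0))    # False
```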
[0277] For the reduction or substantial elimination of expression of an endogenous gene in a plant, a sufficient length of substantially contiguous nucleotides of a nucleic acid sequence is required. In order to perform gene silencing, this may be as little as 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10 or fewer nucleotides; alternatively this may be as much as the entire gene (including the 5' and/or 3' UTR, either in part or in whole). The stretch of substantially contiguous nucleotides may be derived from the nucleic acid encoding the protein of interest (target gene), or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest. Preferably, the stretch of substantially contiguous nucleotides is capable of forming hydrogen bonds with the target gene (either sense or antisense strand), more preferably, the stretch of substantially contiguous nucleotides has, in increasing order of preference, 50%, 60%, 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 100% sequence identity to the target gene (either sense or antisense strand). A nucleic acid sequence encoding a (functional) polypeptide is not a requirement for the various methods discussed herein for the reduction or substantial elimination of expression of an endogenous gene.
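The sequence identity between a short stretch and a target gene can be illustrated with a minimal ungapped comparison. This is a sketch under stated assumptions: it slides the stretch along one strand of the target without gaps and reports the best window, whereas identity is in practice usually determined with an alignment program. The function names and the sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_ungapped_identity(stretch: str, target: str) -> float:
    """Best identity of `stretch` against any same-length window of `target`."""
    n = len(stretch)
    if n == 0 or n > len(target):
        raise ValueError("stretch must be non-empty and no longer than target")
    return max(
        percent_identity(stretch, target[i:i + n])
        for i in range(len(target) - n + 1)
    )


# Hypothetical sequences for illustration only.
target = "ATGGCTAGCTAGGATCCGATCGATCGTAGC"
print(best_ungapped_identity("GGATCCGATC", target))  # exact 10-nt match
print(best_ungapped_identity("GGATCAGATC", target))  # one mismatch in 10 nt
```

To compare against the antisense strand as well, the same function could be applied to the reverse complement of the target.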
[0278] This reduction or substantial elimination of expression may be achieved using routine tools and techniques. A preferred method for the reduction or substantial elimination of endogenous gene expression is by introducing and expressing in a plant a genetic construct into which the nucleic acid (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) is cloned as an inverted repeat (in part or completely), separated by a spacer (non-coding DNA).
[0279] In such a preferred method, expression of the endogenous gene is reduced or substantially eliminated through RNA-mediated silencing using an inverted repeat of a nucleic acid or a part thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), preferably capable of forming a hairpin structure. The inverted repeat is cloned in an expression vector comprising control sequences. A non-coding DNA nucleic acid sequence (a spacer, for example a matrix attachment region fragment (MAR), an intron, a polylinker, etc.) is located between the two inverted nucleic acids forming the inverted repeat. After transcription of the inverted repeat, a chimeric RNA with a self-complementary structure is formed (partial or complete). This double-stranded RNA structure is referred to as the hairpin RNA (hpRNA). The hpRNA is processed by the plant into siRNAs that are incorporated into an RNA-induced silencing complex (RISC). The RISC further cleaves the mRNA transcripts, thereby substantially reducing the number of mRNA transcripts to be translated into polypeptides. For further general details see, for example, Grierson et al. (1998) WO 98/53083; Waterhouse et al. (1999) WO 99/53050.
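The layout of such an inverted repeat (fragment, spacer, reverse complement of the fragment) can be sketched as follows. The function names, the fragment and the spacer are hypothetical placeholders rather than sequences of the invention, and the check only confirms that the two arms are reverse complements of one another; it does not predict RNA folding.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_inverted_repeat(fragment: str, spacer: str) -> str:
    """Sense arm, then spacer, then the antisense (reverse-complement) arm."""
    return fragment + spacer + reverse_complement(fragment)


def arms_are_self_complementary(construct: str, arm_length: int) -> bool:
    """True if the last arm_length bases are the reverse complement of the first."""
    left = construct[:arm_length]
    right = construct[-arm_length:]
    return right == reverse_complement(left)


# Hypothetical fragment and spacer for illustration only.
fragment = "ATGGCCTTAGC"
spacer = "TTTTTT"
hairpin_dna = build_inverted_repeat(fragment, spacer)
print(hairpin_dna)
print(arms_are_self_complementary(hairpin_dna, len(fragment)))
```

A transcript of such a construct carries two arms that can pair with each other across the spacer, which is the self-complementary structure referred to above.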
[0280] Performance of the methods of the invention does not rely on introducing and expressing in a plant a genetic construct into which the nucleic acid is cloned as an inverted repeat, but any one or more of several well-known "gene silencing" methods may be used to achieve the same effects.
[0281] One such method for the reduction of endogenous gene expression is RNA-mediated silencing of gene expression (downregulation). Silencing in this case is triggered in a plant by a double stranded RNA sequence (dsRNA) that is substantially similar to the target endogenous gene. This dsRNA is further processed by the plant into fragments of about 20 to about 26 nucleotides called short interfering RNAs (siRNAs). The siRNAs are incorporated into an RNA-induced silencing complex (RISC) that cleaves the mRNA transcript of the endogenous target gene, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. Preferably, the double stranded RNA sequence corresponds to a target gene.
[0282] Another example of an RNA silencing method involves the introduction of nucleic acid sequences or parts thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) in a sense orientation into a plant. "Sense orientation" refers to a DNA sequence that is homologous to an mRNA transcript thereof. Introduced into a plant would therefore be at least one copy of the nucleic acid sequence. The additional nucleic acid sequence will reduce expression of the endogenous gene, giving rise to a phenomenon known as co-suppression. The reduction of gene expression will be more pronounced if several additional copies of a nucleic acid sequence are introduced into the plant, as there is a positive correlation between high transcript levels and the triggering of co-suppression.
[0283] Another example of an RNA silencing method involves the use of antisense nucleic acid sequences. An "antisense" nucleic acid sequence comprises a nucleotide sequence that is complementary to a "sense" nucleic acid sequence encoding a protein, i.e. complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA transcript sequence. The antisense nucleic acid sequence is preferably complementary to the endogenous gene to be silenced. The complementarity may be located in the "coding region" and/or in the "non-coding region" of a gene. The term "coding region" refers to a region of the nucleotide sequence comprising codons that are translated into amino acid residues. The term "non-coding region" refers to 5' and 3' sequences that flank the coding region that are transcribed but not translated into amino acids (also referred to as 5' and 3' untranslated regions).
[0284] Antisense nucleic acid sequences can be designed according to the rules of Watson and Crick base pairing. The antisense nucleic acid sequence may be complementary to the entire nucleic acid sequence (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), but may also be an oligonucleotide that is antisense to only a part of the nucleic acid sequence (including the mRNA 5' and 3' UTR). For example, the antisense oligonucleotide sequence may be complementary to the region surrounding the translation start site of an mRNA transcript encoding a polypeptide. The length of a suitable antisense oligonucleotide sequence is known in the art and may start from about 50, 45, 40, 35, 30, 25, 20, 15 or 10 nucleotides in length or less. An antisense nucleic acid sequence according to the invention may be constructed using chemical synthesis and enzymatic ligation reactions using methods known in the art. For example, an antisense nucleic acid sequence (e.g., an antisense oligonucleotide sequence) may be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acid sequences, e.g., phosphorothioate derivatives and acridine substituted nucleotides may be used. Examples of modified nucleotides that may be used to generate the antisense nucleic acid sequences are well known in the art. Known nucleotide modifications include methylation, cyclization and `caps` and substitution of one or more of the naturally occurring nucleotides with an analogue such as inosine. Other modifications of nucleotides are well known in the art.
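As a minimal sketch of Watson-Crick-based design, an antisense oligonucleotide complementary to the region surrounding a translation start site can be derived by taking the reverse complement of that region. The function names, the flank length and the example transcript are hypothetical; the transcript is written in the DNA alphabet (as a cDNA coding strand), and no chemical modifications are modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_around_start(cdna: str, flank: int = 5) -> str:
    """Antisense oligo covering the first ATG plus `flank` bases on each side."""
    start = cdna.find("ATG")
    if start == -1:
        raise ValueError("no ATG start codon found")
    region = cdna[max(0, start - flank): start + 3 + flank]
    return reverse_complement(region)


# Hypothetical transcript for illustration only.
cdna = "GGCACGAGCTAAACAATGGCTTCTTCCATGCTC"
oligo = antisense_around_start(cdna, flank=5)
print(oligo)        # GAAGCCATTGTTT
print(len(oligo))   # 13
```

The resulting 13-nucleotide oligonucleotide falls within the length range mentioned above and contains CAT, the complement of the ATG start codon.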
[0285] The antisense nucleic acid sequence can be produced biologically using an expression vector into which a nucleic acid sequence has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest). Preferably, production of antisense nucleic acid sequences in plants occurs by means of a stably integrated nucleic acid construct comprising a promoter, an operably linked antisense oligonucleotide, and a terminator.
[0286] The nucleic acid molecules used for silencing in the methods of the invention (whether introduced into a plant or generated in situ) hybridize with or bind to mRNA transcripts and/or genomic DNA encoding a polypeptide to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid sequence which binds to DNA duplexes, through specific interactions in the major groove of the double helix. Antisense nucleic acid sequences may be introduced into a plant by transformation or direct injection at a specific tissue site. Alternatively, antisense nucleic acid sequences can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense nucleic acid sequences can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid sequence to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid sequences can also be delivered to cells using the vectors described herein.
[0287] According to a further aspect, the antisense nucleic acid sequence is an α-anomeric nucleic acid sequence. An α-anomeric nucleic acid sequence forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucl Ac Res 15: 6625-6641). The antisense nucleic acid sequence may also comprise a 2'-O-methylribonucleotide (Inoue et al. (1987) Nucl Ac Res 15, 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215, 327-330).
[0288] The reduction or substantial elimination of endogenous gene expression may also be performed using ribozymes. Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid sequence, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes, described in Haselhoff and Gerlach (1988) Nature 334, 585-591) can be used to catalytically cleave mRNA transcripts encoding a polypeptide, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. A ribozyme having specificity for a nucleic acid sequence can be designed (see for example: Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742). Alternatively, mRNA transcripts corresponding to a nucleic acid sequence can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules (Bartel and Szostak (1993) Science 261, 1411-1418). The use of ribozymes for gene silencing in plants is known in the art (e.g., Atkins et al. (1994) WO 94/00012; Lenne et al. (1995) WO 95/03404; Lutziger et al. (2000) WO 00/00619; Prinsen et al. (1997) WO 97/13865 and Scott et al. (1997) WO 97/38116).
[0289] Gene silencing may also be achieved by insertion mutagenesis (for example, T-DNA insertion or transposon insertion) or by strategies as described by, among others, Angell and Baulcombe ((1999) Plant J 20(3): 357-62), (Amplicon VIGS WO 98/36083), or Baulcombe (WO 99/15682).
[0290] Gene silencing may also occur if there is a mutation in an endogenous gene and/or a mutation in an isolated gene/nucleic acid subsequently introduced into a plant. The reduction or substantial elimination may be caused by a non-functional polypeptide. For example, the polypeptide may bind to various interacting proteins; one or more mutation(s) and/or truncation(s) may therefore provide for a polypeptide that is still able to bind interacting proteins (such as receptor proteins) but that cannot exhibit its normal function (such as signalling ligand).
[0291] A further approach to gene silencing is by targeting nucleic acid sequences complementary to the regulatory region of the gene (e.g., the promoter and/or enhancers) to form triple helical structures that prevent transcription of the gene in target cells. See Helene, C., Anticancer Drug Des. 6, 569-84, 1991; Helene et al., Ann. N.Y. Acad. Sci. 660, 27-36, 1992; and Maher, L. J., BioEssays 14, 807-15, 1992.
[0292] Other methods, such as the use of antibodies directed to an endogenous polypeptide for inhibiting its function in planta, or interference in the signalling pathway in which a polypeptide is involved, will be well known to the skilled man. In particular, it can be envisaged that manmade molecules may be useful for inhibiting the biological function of a target polypeptide, or for interfering with the signalling pathway in which the target polypeptide is involved.
[0293] Alternatively, a screening program may be set up to identify in a plant population natural variants of a gene, which variants encode polypeptides with reduced activity. Such natural variants may also be used, for example, to perform homologous recombination.
[0294] Artificial and/or natural microRNAs (miRNAs) may be used to knock out gene expression and/or mRNA translation. Endogenous miRNAs are single-stranded small RNAs, typically 19-24 nucleotides long. They function primarily to regulate gene expression and/or mRNA translation. Most plant miRNAs have perfect or near-perfect complementarity with their target sequences. However, there are natural targets with up to five mismatches. MiRNAs are processed from longer non-coding RNAs with characteristic fold-back structures by double-strand specific RNases of the Dicer family. Upon processing, they are incorporated in the RNA-induced silencing complex (RISC) by binding to its main component, an Argonaute protein. MiRNAs serve as the specificity components of RISC, since they base-pair to target nucleic acids, mostly mRNAs, in the cytoplasm. Subsequent regulatory events include target mRNA cleavage and destruction and/or translational inhibition. Effects of miRNA overexpression are thus often reflected in decreased mRNA levels of target genes.
[0295] Artificial microRNAs (amiRNAs), which are typically 21 nucleotides in length, can be genetically engineered specifically to negatively regulate gene expression of single or multiple genes of interest. Determinants of plant microRNA target selection are well known in the art. Empirical parameters for target recognition have been defined and can be used to aid in the design of specific amiRNAs (Schwab et al., Dev. Cell 8, 517-527, 2005). Convenient tools for design and generation of amiRNAs and their precursors are also available to the public (Schwab et al., Plant Cell 18, 1121-1133, 2006).
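By way of non-limiting illustration, the complementarity between a candidate small RNA and a target site may be checked computationally. The following Python sketch counts mismatched positions between a 21-nucleotide sequence and a target site of equal length. The function names and the example sequence are hypothetical, G:U wobble pairs are counted as mismatches for simplicity, and the sketch does not reproduce the empirical design parameters of Schwab et al.

```python
# Watson-Crick pairing for RNA. G:U wobble pairs are treated as mismatches
# here, which is a simplifying assumption of this sketch.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def count_mismatches(small_rna: str, target_site: str) -> int:
    """Count positions at which target_site deviates from perfect
    complementarity to small_rna (both written 5'->3')."""
    if len(small_rna) != len(target_site):
        raise ValueError("sequences must have the same length")
    perfect_site = reverse_complement(small_rna)
    return sum(1 for a, b in zip(perfect_site, target_site.upper()) if a != b)


# Hypothetical 21-nucleotide amiRNA used only for illustration.
amirna = "UAGCUAGGCAUCGAUCCGAUA"
perfect_target = reverse_complement(amirna)
# Introduce a single mismatch at the first position of the target site.
mutated_target = "A" + perfect_target[1:]

print(count_mismatches(amirna, perfect_target))
print(count_mismatches(amirna, mutated_target))
```

A candidate could, for example, be retained only if the count does not exceed a chosen limit; the appropriate limit is a design decision and is not fixed by this sketch.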
[0296] For optimal performance, the gene silencing techniques used for reducing expression in a plant of an endogenous gene require the use of nucleic acid sequences from monocotyledonous plants for transformation of monocotyledonous plants, and from dicotyledonous plants for transformation of dicotyledonous plants. Preferably, a nucleic acid sequence from any given plant species is introduced into that same species. For example, a nucleic acid sequence from rice is transformed into a rice plant. However, it is not an absolute requirement that the nucleic acid sequence to be introduced originates from the same plant species as the plant in which it will be introduced. It is sufficient that there is substantial homology between the endogenous target gene and the nucleic acid to be introduced.
[0297] Described above are examples of various methods for the reduction or substantial elimination of expression in a plant of an endogenous gene. A person skilled in the art would readily be able to adapt the aforementioned methods for silencing so as to achieve reduction of expression of an endogenous gene in a whole plant or in parts thereof through the use of an appropriate promoter, for example.
Transformation
[0298] The term "introduction" or "transformation" as referred to herein encompasses the transfer of an exogenous polynucleotide into a host cell, irrespective of the method used for transfer. Plant tissue capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may be transformed with a genetic construct of the present invention and a whole plant regenerated therefrom. The particular tissue chosen will vary depending on the clonal propagation systems available for, and best suited to, the particular species being transformed. Exemplary tissue targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue, existing meristematic tissue (e.g., apical meristem, axillary buds, and root meristems), and induced meristem tissue (e.g., cotyledon meristem and hypocotyl meristem). The polynucleotide may be transiently or stably introduced into a host cell and may be maintained non-integrated, for example, as a plasmid. Alternatively, it may be integrated into the host genome. The resulting transformed plant cell may then be used to regenerate a transformed plant in a manner known to persons skilled in the art. Alternatively, a plant cell that cannot be regenerated into a plant may be chosen as host cell, i.e. the resulting transformed plant cell does not have the capacity to regenerate into a (whole) plant.
[0299] The transfer of foreign genes into the genome of a plant is called transformation. Transformation of plant species is now a fairly routine technique. Advantageously, any of several transformation methods may be used to introduce the gene of interest into a suitable ancestor cell. The methods described for the transformation and regeneration of plants from plant tissues or plant cells may be utilized for transient or for stable transformation. Transformation methods include the use of liposomes, electroporation, chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle gun bombardment, transformation using viruses or pollen and microprojection. Methods may be selected from the calcium/polyethylene glycol method for protoplasts (Krens, F. A. et al., (1982) Nature 296, 72-74; Negrutiu I et al. (1987) Plant Mol Biol 8: 363-373); electroporation of protoplasts (Shillito R. D. et al. (1985) Bio/Technol 3, 1099-1102); microinjection into plant material (Crossway A et al., (1986) Mol. Gen. Genet. 202: 179-185); DNA or RNA-coated particle bombardment (Klein T M et al., (1987) Nature 327: 70); infection with (non-integrative) viruses and the like. Transgenic plants, including transgenic crop plants, are preferably produced via Agrobacterium-mediated transformation. An advantageous transformation method is the transformation in planta. To this end, it is possible, for example, to allow the agrobacteria to act on plant seeds or to inoculate the plant meristem with agrobacteria. It has proved particularly expedient in accordance with the invention to allow a suspension of transformed agrobacteria to act on the intact plant or at least on the flower primordia. The plant is subsequently grown on until the seeds of the treated plant are obtained (Clough and Bent, Plant J. (1998) 16, 735-743).
Methods for Agrobacterium-mediated transformation of rice include well known methods for rice transformation, such as those described in any of the following: European patent application EP 1198985 A1, Aldemita and Hodges (Planta 199: 612-617, 1996); Chan et al. (Plant Mol Biol 22 (3): 491-506, 1993), Hiei et al. (Plant J 6 (2): 271-282, 1994), which disclosures are incorporated by reference herein as if fully set forth. In the case of corn transformation, the preferred method is as described in either Ishida et al. (Nat. Biotechnol 14(6): 745-50, 1996) or Frame et al. (Plant Physiol 129(1): 13-22, 2002), which disclosures are incorporated by reference herein as if fully set forth. Said methods are further described by way of example in B. Jenes et al., Techniques for Gene Transfer, in: Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press (1993) 128-143 and in Potrykus, Annu. Rev. Plant Physiol. Plant Molec. Biol. 42 (1991) 205-225. The nucleic acids or the construct to be expressed is preferably cloned into a vector, which is suitable for transforming Agrobacterium tumefaciens, for example pBin19 (Bevan et al., Nucl. Acids Res. 12 (1984) 8711). Agrobacteria transformed by such a vector can then be used in known manner for the transformation of plants, such as plants used as a model, like Arabidopsis (Arabidopsis thaliana is within the scope of the present invention not considered as a crop plant), or crop plants such as, by way of example, tobacco plants, for example by immersing bruised leaves or chopped leaves in an agrobacterial solution and then culturing them in suitable media. The transformation of plants by means of Agrobacterium tumefaciens is described, for example, by Hofgen and Willmitzer in Nucl. Acids Res. (1988) 16, 9877 or is known inter alia from F. F. White, Vectors for Gene Transfer in Higher Plants; in Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press, 1993, pp. 15-38.
[0300] In addition to the transformation of somatic cells, which then have to be regenerated into intact plants, it is also possible to transform the cells of plant meristems and in particular those cells which develop into gametes. In this case, the transformed gametes follow the natural plant development, giving rise to transgenic plants. Thus, for example, seeds of Arabidopsis are treated with agrobacteria and seeds are obtained from the developing plants of which a certain proportion is transformed and thus transgenic [Feldmann, K A and Marks M D (1987). Mol Gen Genet. 208:1-9; Feldmann K (1992). In: C Koncz, N-H Chua and J Schell, eds, Methods in Arabidopsis Research. World Scientific, Singapore, pp. 274-289]. Alternative methods are based on the repeated removal of the inflorescences and incubation of the excision site in the center of the rosette with transformed agrobacteria, whereby transformed seeds can likewise be obtained at a later point in time (Chang (1994). Plant J. 5: 551-558; Katavic (1994). Mol Gen Genet, 245: 363-370). However, an especially effective method is the vacuum infiltration method with its modifications such as the "floral dip" method. In the case of vacuum infiltration of Arabidopsis, intact plants under reduced pressure are treated with an agrobacterial suspension [Bechtold, N (1993). C R Acad Sci Paris Life Sci, 316: 1194-1199], while in the case of the "floral dip" method the developing floral tissue is incubated briefly with a surfactant-treated agrobacterial suspension [Clough, S J and Bent A F (1998) The Plant J. 16, 735-743]. A certain proportion of transgenic seeds are harvested in both cases, and these seeds can be distinguished from non-transgenic seeds by growing under the above-described selective conditions. In addition, the stable transformation of plastids is advantageous because plastids are inherited maternally in most crops, reducing or eliminating the risk of transgene flow through pollen.
The transformation of the chloroplast genome is generally achieved by a process which has been schematically displayed in Klaus et al., 2004 [Nature Biotechnology 22 (2), 225-229]. Briefly, the sequences to be transformed are cloned together with a selectable marker gene between flanking sequences homologous to the chloroplast genome. These homologous flanking sequences direct site-specific integration into the plastome. Plastidal transformation has been described for many different plant species and an overview is given in Bock (2001) Transgenic plastids in basic research and plant biotechnology. J Mol. Biol. 2001 Sep. 21; 312 (3):425-38 or Maliga, P (2003) Progress towards commercialization of plastid transformation technology. Trends Biotechnol. 21, 20-28. Further biotechnological progress has recently been reported in the form of marker-free plastid transformants, which can be produced by a transient co-integrated marker gene (Klaus et al., 2004, Nature Biotechnology 22(2), 225-229).
[0301] The genetically modified plant cells can be regenerated via all methods with which the skilled worker is familiar. Suitable methods can be found in the abovementioned publications by S. D. Kung and R. Wu, Potrykus or Hofgen and Willmitzer. Alternatively, the genetically modified plant cells are non-regenerable into a whole plant.
[0302] Generally after transformation, plant cells or cell groupings are selected for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the gene of interest, following which the transformed material is regenerated into a whole plant. To select transformed plants, the plant material obtained in the transformation is, as a rule, subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility consists in growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Alternatively, the transformed plants are screened for the presence of a selectable marker such as the ones described above.
[0303] Following DNA transfer and regeneration, putatively transformed plants may also be evaluated, for instance using Southern analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western analysis, both techniques being well known to persons having ordinary skill in the art.
[0304] The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques. The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).
T-DNA Activation Tagging
[0305] "T-DNA activation tagging" (Hayashi et al. Science (1992) 1350-1353) involves insertion of T-DNA, usually containing a promoter (which may also be a translation enhancer or an intron), in the genomic region of the gene of interest or 10 kb up- or downstream of the coding region of a gene in a configuration such that the promoter directs expression of the targeted gene. Typically, regulation of expression of the targeted gene by its natural promoter is disrupted and the gene falls under the control of the newly introduced promoter. The promoter is typically embedded in a T-DNA. This T-DNA is randomly inserted into the plant genome, for example, through Agrobacterium infection and leads to modified expression of genes near the inserted T-DNA. The resulting transgenic plants show dominant phenotypes due to modified expression of genes close to the introduced promoter.
TILLING
[0306] The term "TILLING" is an abbreviation of "Targeted Induced Local Lesions In Genomes" and refers to a mutagenesis technology useful to generate and/or identify nucleic acids encoding proteins with modified expression and/or activity. TILLING also allows selection of plants carrying such mutant variants. These mutant variants may exhibit modified expression, either in strength or in location or in timing (if the mutations affect the promoter for example). These mutant variants may exhibit higher activity than that exhibited by the gene in its natural form. TILLING combines high-density mutagenesis with high-throughput screening methods. The steps typically followed in TILLING are: (a) EMS mutagenesis (Redei G P and Koncz C (1992) In Methods in Arabidopsis Research, Koncz C, Chua N H, Schell J, eds. Singapore, World Scientific Publishing Co, pp. 16-82; Feldmann et al., (1994) In Meyerowitz E M, Somerville C R, eds, Arabidopsis. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp 137-172; Lightner J and Caspar T (1998) In J Martinez-Zapater, J Salinas, eds, Methods in Molecular Biology, Vol. 82. Humana Press, Totowa, N.J., pp 91-104); (b) DNA preparation and pooling of individuals; (c) PCR amplification of a region of interest; (d) denaturation and annealing to allow formation of heteroduplexes; (e) DHPLC, where the presence of a heteroduplex in a pool is detected as an extra peak in the chromatogram; (f) identification of the mutant individual; and (g) sequencing of the mutant PCR product. Methods for TILLING are well known in the art (McCallum et al., (2000) Nat Biotechnol 18: 455-457; reviewed by Stemple (2004) Nat Rev Genet. 5(2): 145-50).
Homologous Recombination
[0307] "Homologous recombination" allows introduction in a genome of a selected nucleic acid at a defined selected position. Homologous recombination is a standard technology used routinely in biological sciences for lower organisms such as yeast or the moss Physcomitrella. Methods for performing homologous recombination in plants have been described not only for model plants (Offringa et al. (1990) EMBO J. 9(10): 3077-84) but also for crop plants, for example rice (Terada et al. (2002) Nat Biotech 20(10): 1030-4; Iida and Terada (2004) Curr Opin Biotech 15(2): 132-8), and approaches exist that are generally applicable regardless of the target organism (Miller et al, Nature Biotechnol. 25, 778-785, 2007).
Yield Related Trait(s)
[0308] A "Yield related trait" is a trait or feature which is related to plant yield. Yield-related traits may comprise one or more of the following non-limitative list of features: early flowering time, yield, biomass, seed yield, early vigour, greenness index, growth rate, agronomic traits, such as e.g. tolerance to submergence (which leads to yield in rice), Water Use Efficiency (WUE), Nitrogen Use Efficiency (NUE), etc.
[0309] Reference herein to enhanced yield-related traits relative to control plants is taken to mean one or more of an increase in early vigour and/or in biomass (weight) of one or more parts of a plant, which may include (i) aboveground parts and preferably aboveground harvestable parts and/or (ii) parts below ground and preferably harvestable below ground. In particular, such harvestable parts are seeds.
Yield
[0310] The term "yield" in general means a measurable produce of economic value, typically related to a specified crop, to an area, and to a period of time. Individual plant parts directly contribute to yield based on their number, size and/or weight, or the actual yield is the yield per square meter for a crop and year, which is determined by dividing total production (includes both harvested and appraised production) by planted square meters.
[0311] The terms "yield" of a plant and "plant yield" are used interchangeably herein and are meant to refer to vegetative biomass such as root and/or shoot biomass, to reproductive organs, and/or to propagules such as seeds of that plant.
[0312] Flowers in maize are unisexual; male inflorescences (tassels) originate from the apical stem and female inflorescences (ears) arise from axillary bud apices. The female inflorescence produces pairs of spikelets on the surface of a central axis (cob). Each of the female spikelets encloses two fertile florets, one of which will usually mature into a maize kernel once fertilized. Hence a yield increase in maize may be manifested as one or more of the following: an increase in the number of plants established per square meter, an increase in the number of ears per plant, an increase in the number of rows, number of kernels per row, kernel weight, thousand kernel weight, ear length/diameter, an increase in the seed filling rate, which is the number of filled florets (i.e. florets containing seed) divided by the total number of florets and multiplied by 100, among others.
[0313] Inflorescences in rice plants are named panicles. The panicle bears spikelets, which are the basic units of the panicles, and which consist of a pedicel and a floret. The floret is borne on the pedicel and includes a flower that is covered by two protective glumes: a larger glume (the lemma) and a shorter glume (the palea). Hence, taking rice as an example, a yield increase may manifest itself as an increase in one or more of the following: number of plants per square meter, number of panicles per plant, panicle length, number of spikelets per panicle, number of flowers (or florets) per panicle; an increase in the seed filling rate which is the number of filled florets (i.e. florets containing seeds) divided by the total number of florets and multiplied by 100; an increase in thousand kernel weight, among others.
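By way of illustration only, the seed filling rate defined above (the number of filled florets divided by the total number of florets, multiplied by 100) may be computed as in the following Python sketch. The function name and the example counts are hypothetical.

```python
def seed_filling_rate(filled_florets: int, total_florets: int) -> float:
    """Seed filling rate in percent: filled florets / total florets * 100."""
    if total_florets <= 0:
        raise ValueError("total_florets must be positive")
    if not 0 <= filled_florets <= total_florets:
        raise ValueError("filled_florets must lie between 0 and total_florets")
    return 100.0 * filled_florets / total_florets


# Hypothetical panicle count: 120 filled florets out of 150 florets in total.
print(seed_filling_rate(120, 150))  # 80.0
```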
Early Flowering Time
[0314] Plants having an "early flowering time" as used herein are plants which start to flower earlier than control plants. Hence this term refers to plants that show an earlier start of flowering. Flowering time of plants can be assessed by counting the number of days ("time to flower") between sowing and the emergence of a first inflorescence. The "flowering time" of a plant can for instance be determined using the method as described in WO 2007/093444.
Early Vigour
[0315] "Early vigour" refers to active healthy well-balanced growth especially during early stages of plant growth, and may result from increased plant fitness due to, for example, the plants being better adapted to their environment (i.e. optimizing the use of energy resources and partitioning between shoot and root). Plants having early vigour also show increased seedling survival and a better establishment of the crop, which often results in highly uniform fields (with the crop growing in uniform manner, i.e. with the majority of plants reaching the various stages of development at substantially the same time), and often better and higher yield. Therefore, early vigour may be determined by measuring various factors, such as thousand kernel weight, percentage germination, percentage emergence, seedling growth, seedling height, root length, root and shoot biomass and many more.
Increased Growth Rate
[0316] The increased growth rate may be specific to one or more parts of a plant (including seeds), or may be throughout substantially the whole plant. Plants having an increased growth rate may have a shorter life cycle. The life cycle of a plant may be taken to mean the time needed to grow from a mature seed up to the stage where the plant has produced mature seeds, similar to the starting material. This life cycle may be influenced by factors such as speed of germination, early vigour, growth rate, greenness index, flowering time and speed of seed maturation. The increase in growth rate may take place at one or more stages in the life cycle of a plant or during substantially the whole plant life cycle. Increased growth rate during the early stages in the life cycle of a plant may reflect enhanced vigour. The increase in growth rate may alter the harvest cycle of a plant allowing plants to be sown later and/or harvested sooner than would otherwise be possible (a similar effect may be obtained with earlier flowering time). If the growth rate is sufficiently increased, it may allow for the further sowing of seeds of the same plant species (for example sowing and harvesting of rice plants followed by sowing and harvesting of further rice plants all within one conventional growing period). Similarly, if the growth rate is sufficiently increased, it may allow for the further sowing of seeds of different plants species (for example the sowing and harvesting of corn plants followed by, for example, the sowing and optional harvesting of soybean, potato or any other suitable plant). Harvesting additional times from the same rootstock in the case of some crop plants may also be possible. Altering the harvest cycle of a plant may lead to an increase in annual biomass production per square meter (due to an increase in the number of times (say in a year) that any particular plant may be grown and harvested). 
An increase in growth rate may also allow for the cultivation of transgenic plants in a wider geographical area than their wild-type counterparts, since the territorial limitations for growing a crop are often determined by adverse environmental conditions either at the time of planting (early season) or at the time of harvesting (late season). Such adverse conditions may be avoided if the harvest cycle is shortened. The growth rate may be determined by deriving various parameters from growth curves, such parameters may be: T-Mid (the time taken for plants to reach 50% of their maximal size) and T-90 (time taken for plants to reach 90% of their maximal size), amongst others.
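As a non-limiting sketch of how parameters such as T-Mid and T-90 might be derived from a measured growth curve, the following Python example takes the largest measured size as the maximal size and linearly interpolates between successive measurements. Both choices are assumptions made for illustration (a fitted growth model could equally be used), and the function name and the example data are hypothetical.

```python
from typing import Sequence


def time_to_fraction(times: Sequence[float], sizes: Sequence[float],
                     fraction: float) -> float:
    """Interpolated time at which size first reaches `fraction` of its maximum."""
    if len(times) != len(sizes) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Assumption: the maximal size is the largest value actually measured.
    target = fraction * max(sizes)
    if sizes[0] >= target:
        return float(times[0])
    for i in range(1, len(sizes)):
        if sizes[i] >= target:
            # Linear interpolation between the two measurements that
            # bracket the target size.
            t0, t1 = times[i - 1], times[i]
            s0, s1 = sizes[i - 1], sizes[i]
            return t0 + (target - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("target size was never reached")


# Hypothetical growth curve: days after sowing and measured plant size.
days = [0, 7, 14, 21, 28]
size = [0.0, 10.0, 40.0, 80.0, 100.0]

t_mid = time_to_fraction(days, size, 0.5)  # time to reach 50% of maximal size
t_90 = time_to_fraction(days, size, 0.9)   # time to reach 90% of maximal size
print(t_mid, t_90)
```

With these example data, a plant line having a smaller T-Mid or T-90 than the control would be one that reaches the corresponding size earlier.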
Stress Resistance
[0317] An increase in yield and/or growth rate occurs whether the plant is under non-stress conditions or whether the plant is exposed to various stresses compared to control plants. Plants typically respond to exposure to stress by growing more slowly. In conditions of severe stress, the plant may even stop growing altogether. Mild stress on the other hand is defined herein as being any stress to which a plant is exposed which does not result in the plant ceasing to grow altogether without the capacity to resume growth. Mild stress in the sense of the invention leads to a reduction in the growth of the stressed plants of less than 40%, 35%, 30% or 25%, more preferably less than 20% or 15% in comparison to the control plant under non-stress conditions. Due to advances in agricultural practices (irrigation, fertilization, pesticide treatments) severe stresses are not often encountered in cultivated crop plants. As a consequence, the compromised growth induced by mild stress is often an undesirable feature for agriculture. Abiotic stresses may be due to drought or excess water, anaerobic stress, salt stress, chemical toxicity, oxidative stress and hot, cold or freezing temperatures.
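The growth reduction criterion for mild stress given above may be illustrated by the following Python sketch, which expresses the growth of stressed plants as a percentage reduction relative to control plants under non-stress conditions and compares it with a chosen limit. The function names, the default limit of 40% (the broadest of the values listed above) and the example figures are assumptions for illustration; the sketch does not test the further condition that the plant retains the capacity to resume growth.

```python
def growth_reduction_percent(control_growth: float, stressed_growth: float) -> float:
    """Reduction in growth of stressed plants relative to control plants, in percent."""
    if control_growth <= 0:
        raise ValueError("control_growth must be positive")
    return 100.0 * (control_growth - stressed_growth) / control_growth


def is_mild_stress(control_growth: float, stressed_growth: float,
                   limit_percent: float = 40.0) -> bool:
    """True if the growth reduction is below the chosen limit."""
    return growth_reduction_percent(control_growth, stressed_growth) < limit_percent


# Hypothetical growth measurements (any consistent unit): control 50, stressed 40.
print(growth_reduction_percent(50.0, 40.0))             # 20.0
print(is_mild_stress(50.0, 40.0))                       # True with the 40% limit
print(is_mild_stress(50.0, 40.0, limit_percent=15.0))   # False with the 15% limit
```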
[0318] "Biotic stresses" are typically those stresses caused by pathogens, such as bacteria, viruses, fungi, nematodes and insects.
[0319] The "abiotic stress" may be an osmotic stress caused by a water stress, e.g. due to drought, salt stress, or freezing stress. Abiotic stress may also be an oxidative stress or a cold stress. "Freezing stress" is intended to refer to stress due to freezing temperatures, i.e. temperatures at which available water molecules freeze and turn into ice. "Cold stress", also called "chilling stress", is intended to refer to cold temperatures, e.g. temperatures below 10° C., or preferably below 5° C., but at which water molecules do not freeze. As reported in Wang et al. (Planta (2003) 218: 1-14), abiotic stress leads to a series of morphological, physiological, biochemical and molecular changes that adversely affect plant growth and productivity. Drought, salinity, extreme temperatures and oxidative stress are known to be interconnected and may induce growth and cellular damage through similar mechanisms. Rabbani et al. (Plant Physiol (2003) 133: 1755-1767) describe a particularly high degree of "cross talk" between drought stress and high-salinity stress. For example, drought and/or salinisation are manifested primarily as osmotic stress, resulting in the disruption of homeostasis and ion distribution in the cell. Oxidative stress, which frequently accompanies high or low temperature, salinity or drought stress, may cause denaturing of functional and structural proteins. As a consequence, these diverse environmental stresses often activate similar cell signalling pathways and cellular responses, such as the production of stress proteins, up-regulation of anti-oxidants, accumulation of compatible solutes and growth arrest. The term "non-stress" conditions as used herein refers to those environmental conditions that allow optimal growth of plants. Persons skilled in the art are aware of normal soil conditions and climatic conditions for a given location.
Plants with optimal growth conditions, (grown under non-stress conditions) typically yield in increasing order of preference at least 97%, 95%, 92%, 90%, 87%, 85%, 83%, 80%, 77% or 75% of the average production of such plant in a given environment. Average production may be calculated on harvest and/or season basis. Persons skilled in the art are aware of average yield productions of a crop.
[0320] In particular, the methods of the present invention may be performed under non-stress conditions. In an example, the methods of the present invention may be performed under non-stress conditions such as mild drought to give plants having increased yield relative to control plants.
[0321] In another embodiment, the methods of the present invention may be performed under stress conditions.
[0322] In an example, the methods of the present invention may be performed under stress conditions such as drought to give plants having increased yield relative to control plants.
[0323] In another example, the methods of the present invention may be performed under stress conditions such as nutrient deficiency to give plants having increased yield relative to control plants.
[0324] Nutrient deficiency may result from a lack of nutrients such as nitrogen, phosphates and other phosphorous-containing compounds, potassium, calcium, magnesium, manganese, iron and boron, amongst others.
[0325] In yet another example, the methods of the present invention may be performed under stress conditions such as salt stress to give plants having increased yield relative to control plants. The term salt stress is not restricted to common salt (NaCl), but may be any one or more of: NaCl, KCl, LiCl, MgCl2, CaCl2, amongst others.
[0326] In yet another example, the methods of the present invention may be performed under stress conditions such as cold stress or freezing stress to give plants having increased yield relative to control plants.
Increase/Improve/Enhance
[0327] The terms "increase", "improve" or "enhance" are interchangeable and shall mean in the sense of the application at least a 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, preferably at least 15% or 20%, more preferably 25%, 30%, 35% or 40% more yield and/or growth in comparison to control plants as defined herein.
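For illustration only, an increase within the meaning of the above definition may be evaluated as the percentage difference between the mean value measured for the plants of interest and the mean value measured for control plants. The following Python sketch assumes replicate measurements of a single yield or growth parameter; the function names, the use of the arithmetic mean and the example values are assumptions, and no statistical significance test is included.

```python
from statistics import mean
from typing import Sequence


def percent_increase(control_values: Sequence[float],
                     test_values: Sequence[float]) -> float:
    """Percentage by which the mean of test_values exceeds the mean of control_values."""
    control_mean = mean(control_values)
    if control_mean <= 0:
        raise ValueError("mean of control values must be positive")
    return 100.0 * (mean(test_values) - control_mean) / control_mean


def is_increased(control_values: Sequence[float], test_values: Sequence[float],
                 minimum_percent: float = 3.0) -> bool:
    """True if the increase is at least minimum_percent (3% is the lowest value listed above)."""
    return percent_increase(control_values, test_values) >= minimum_percent


# Hypothetical seed weights per plant (g) for control and transgenic plants.
control = [10.0, 10.0, 10.0]
transgenic = [11.0, 11.5, 12.0]
print(percent_increase(control, transgenic))  # 15.0
print(is_increased(control, transgenic))      # True
```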
Seed Yield
[0328] Increased seed yield may manifest itself as one or more of the following:
[0329] (a) an increase in seed biomass (total seed weight) which may be on an individual seed basis and/or per plant and/or per square meter;
[0330] (b) increased number of flowers per plant;
[0331] (c) increased number of seeds;
[0332] (d) increased seed filling rate (which is expressed as the ratio of the number of filled florets to the total number of florets);
[0333] (e) increased harvest index, which is expressed as a ratio of the yield of harvestable parts, such as seeds, divided by the biomass of aboveground plant parts; and
[0334] (f) increased thousand kernel weight (TKW), which is extrapolated from the number of seeds counted and their total weight. An increased TKW may result from an increased seed size and/or seed weight, and may also result from an increase in embryo and/or endosperm size.
[0335] The terms "filled florets" and "filled seeds" may be considered synonyms.
[0336] An increase in seed yield may also be manifested as an increase in seed size and/or seed volume. Furthermore, an increase in seed yield may also manifest itself as an increase in seed area and/or seed length and/or seed width and/or seed perimeter.
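By way of example, two of the seed yield parameters listed above, the harvest index and the thousand kernel weight (TKW), may be computed as in the following Python sketch. The function names, the units (grams) and the example values are hypothetical, and the TKW is extrapolated linearly from the counted seeds.

```python
def harvest_index(harvestable_yield: float, aboveground_biomass: float) -> float:
    """Ratio of the yield of harvestable parts (e.g. seeds) to aboveground biomass."""
    if aboveground_biomass <= 0:
        raise ValueError("aboveground_biomass must be positive")
    return harvestable_yield / aboveground_biomass


def thousand_kernel_weight(seed_count: int, total_seed_weight: float) -> float:
    """Weight of 1000 seeds, extrapolated from a counted sample and its total weight."""
    if seed_count <= 0:
        raise ValueError("seed_count must be positive")
    return 1000.0 * total_seed_weight / seed_count


# Hypothetical plant: 30 g of seed from 75 g of aboveground biomass,
# and a counted sample of 250 seeds weighing 6 g in total.
print(harvest_index(30.0, 75.0))
print(thousand_kernel_weight(250, 6.0))  # 24.0
```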
Greenness Index
[0337] The "greenness index" as used herein is calculated from digital images of plants. For each pixel belonging to the plant object on the image, the ratio of the green value versus the red value (in the RGB model for encoding color) is calculated. The greenness index is expressed as the percentage of pixels for which the green-to-red ratio exceeds a given threshold. Under normal growth conditions, under salt stress growth conditions, and under reduced nutrient availability growth conditions, the greenness index of plants is measured in the last imaging before flowering. In contrast, under drought stress growth conditions, the greenness index of plants is measured in the first imaging after drought.
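The calculation of the greenness index described above may be sketched in Python as follows. The sketch assumes that the pixels belonging to the plant object have already been segmented and are supplied as (red, green, blue) tuples; the function name, the threshold value and the example pixels are hypothetical. The comparison is written as green > threshold x red so that pixels with a red value of zero do not cause a division by zero, which is an implementation choice of this sketch.

```python
from typing import Iterable, Tuple


def greenness_index(plant_pixels: Iterable[Tuple[int, int, int]],
                    threshold: float) -> float:
    """Percentage of plant pixels whose green-to-red ratio exceeds the threshold."""
    pixels = list(plant_pixels)
    if not pixels:
        raise ValueError("no plant pixels supplied")
    # green / red > threshold is evaluated as green > threshold * red,
    # which avoids dividing by a red value of zero.
    above = sum(1 for red, green, _blue in pixels if green > threshold * red)
    return 100.0 * above / len(pixels)


# Hypothetical plant pixels as (R, G, B) values and a hypothetical threshold.
pixels = [(50, 100, 20), (80, 90, 30), (100, 60, 40), (40, 120, 10)]
print(greenness_index(pixels, threshold=1.5))  # 50.0
```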
Biomass
[0338] The term "biomass" as used herein is intended to refer to the total weight of a plant. Within the definition of biomass, a distinction may be made between the biomass of one or more parts of a plant, which may include any one or more of the following:
[0339] aboveground parts such as but not limited to shoot biomass, seed biomass, leaf biomass, etc.;
[0340] aboveground harvestable parts such as but not limited to shoot biomass, seed biomass, leaf biomass, etc.;
[0341] parts below ground, such as but not limited to root biomass, tubers, bulbs, etc.;
[0342] harvestable parts below ground, such as but not limited to root biomass, tubers, bulbs, etc.;
[0343] harvestable parts partially below ground such as but not limited to beets and other hypocotyl areas of a plant, rhizomes, stolons or creeping rootstalks;
[0344] vegetative biomass such as root biomass, shoot biomass, etc.;
[0345] reproductive organs; and
[0346] propagules such as seed.
Marker Assisted Breeding
[0347] Such breeding programmes sometimes require introduction of allelic variation by mutagenic treatment of the plants, using for example EMS mutagenesis; alternatively, the programme may start with a collection of allelic variants of so-called "natural" origin caused unintentionally. Identification of allelic variants then takes place, for example, by PCR. This is followed by a step for selecting superior allelic variants of the sequence in question which give increased yield. Selection is typically carried out by monitoring growth performance of plants containing different allelic variants of the sequence in question. Growth performance may be monitored in a greenhouse or in the field. Further optional steps include crossing plants in which the superior allelic variant was identified with another plant. This could be used, for example, to make a combination of interesting phenotypic features.
Use as Probes in Gene Mapping
[0348] Use of nucleic acids encoding the protein of interest for genetically and physically mapping the genes requires only a nucleic acid sequence of at least 15 nucleotides in length. These nucleic acids may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Sambrook J, Fritsch E F and Maniatis T (1989) Molecular Cloning, A Laboratory Manual) of restriction-digested plant genomic DNA may be probed with the nucleic acids encoding the protein of interest. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al. (1987) Genomics 1: 174-181) in order to construct a genetic map. In addition, the nucleic acids may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the nucleic acid encoding the protein of interest in the genetic map previously obtained using this population (Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331).
[0349] The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol. Biol. Reporter 4: 37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.
[0350] The nucleic acid probes may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel et al. In: Non-mammalian Genomic Analysis: A Practical Guide, Academic press 1996, pp. 319-346, and references cited therein).
[0351] In another embodiment, the nucleic acid probes may be used in direct fluorescence in situ hybridisation (FISH) mapping (Trask (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favour use of large clones (several kb to several hundred kb; see Laan et al. (1995) Genome Res. 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.
[0352] A variety of nucleic acid amplification-based methods for genetic and physical mapping may be carried out using the nucleic acids. Examples include allele-specific amplification (Kazazian (1989) J. Lab. Clin. Med. 11:95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat. Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.
Plant
[0353] The term "plant" as used herein encompasses whole plants, ancestors and progeny of the plants and plant parts, including seeds, shoots, stems, leaves, roots (including tubers), flowers, and tissues and organs, wherein each of the aforementioned comprise the gene/nucleic acid of interest. The term "plant" also encompasses plant cells, suspension cultures, callus tissue, embryos, meristematic regions, gametophytes, sporophytes, pollen and microspores, again wherein each of the aforementioned comprises the gene/nucleic acid of interest.
[0354] Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs selected from the list comprising Acer spp., Actinidia spp., Abelmoschus spp., Agave sisalana, Agropyron spp., Agrostis stolonifera, Allium spp., Amaranthus spp., Ammophila arenaria, Ananas comosus, Annona spp., Apium graveolens, Arachis spp., Artocarpus spp., Asparagus officinalis, Avena spp. (e.g. Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Averrhoa carambola, Bambusa sp., Benincasa hispida, Bertholletia excelsa, Beta vulgaris, Brassica spp. (e.g. Brassica napus, Brassica rapa ssp. [canola, oilseed rape, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Ceiba pentandra, Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Corchorus sp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Elaeis (e.g. Elaeis guineensis, Elaeis oleifera), Eleusine coracana, Eragrostis tef, Erianthus sp., Eriobotrya japonica, Eucalyptus sp., Eugenia uniflora, Fagopyrum spp., Fagus spp., Festuca arundinacea, Ficus carica, Fortunella spp., Fragaria spp., Ginkgo biloba, Glycine spp. (e.g. Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g. Helianthus annuus), Hemerocallis fulva, Hibiscus spp., Hordeum spp. (e.g. Hordeum vulgare), Ipomoea batatas, Juglans spp., Lactuca sativa, Lathyrus spp., Lens culinaris, Linum usitatissimum, Litchi chinensis, Lotus spp., Luffa acutangula, Lupinus spp., Luzula sylvatica, Lycopersicon spp. (e.g. Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma spp., Malus spp., Malpighia emarginata, Mammea americana, Mangifera indica, Manihot spp., Manilkara zapota, Medicago sativa, Melilotus spp., Mentha spp., Miscanthus sinensis, Momordica spp., Morus nigra, Musa spp., Nicotiana spp., Olea spp., Opuntia spp., Ornithopus spp., Oryza spp. (e.g. Oryza sativa, Oryza latifolia), Panicum miliaceum, Panicum virgatum, Passiflora edulis, Pastinaca sativa, Pennisetum sp., Persea spp., Petroselinum crispum, Phalaris arundinacea, Phaseolus spp., Phleum pratense, Phoenix spp., Phragmites australis, Physalis spp., Pinus spp., Pistacia vera, Pisum spp., Poa spp., Populus spp., Prosopis spp., Prunus spp., Psidium spp., Punica granatum, Pyrus communis, Quercus spp., Raphanus sativus, Rheum rhabarbarum, Ribes spp., Ricinus communis, Rubus spp., Saccharum spp., Salix sp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp. (e.g. Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Sorghum bicolor, Spinacia spp., Syzygium spp., Tagetes spp., Tamarindus indica, Theobroma cacao, Trifolium spp., Tripsacum dactyloides, Triticosecale rimpaui, Triticum spp. (e.g. Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., Viola odorata, Vitis spp., Zea mays, Zizania palustris, Ziziphus spp., amongst others.
Control Plant(s)
[0355] The choice of suitable control plants is a routine part of an experimental setup and may include corresponding wild type plants or corresponding plants without the gene of interest. The control plant is typically of the same plant species or even of the same variety as the plant to be assessed. The control plant may also be a nullizygote of the plant to be assessed. Nullizygotes (or null control plants) are individuals missing the transgene by segregation. Further, control plants are grown under the same growing conditions as the plants of the invention, i.e. in the vicinity of, and simultaneously with, the plants of the invention. A "control plant" as used herein refers not only to whole plants, but also to plant parts, including seeds and seed parts.
DESCRIPTION OF FIGURES
[0356] The present invention will now be described with reference to the following figures in which:
[0357] FIG. 1 represents the domain structure of SEQ ID NO: 2 and SEQ ID NO: 4 with the conserved motifs 1 to 12 indicated.
[0358] FIG. 2 represents a multiple alignment of various bZIP-like polypeptides. The consensus sequence line at the bottom of the alignment gives an indication of the conserved regions within the group of bZIP-like proteins. These alignments can be used for defining further motifs or signature sequences, when using conserved amino acids. Panel A shows an alignment of sequences comprising SEQ ID NO: 2, panel B shows an alignment of sequences comprising SEQ ID NO: 4.
[0359] FIG. 3 shows the MATGAT tables of Example 3. Panel A shows a table for sequences comprising SEQ ID NO: 2, panel B shows a table for sequences comprising SEQ ID NO: 4.
[0360] FIG. 4 represents the binary vector used for increased expression in Oryza sativa of a bZIP-like-encoding nucleic acid (such as SEQ ID NO: 1 or SEQ ID NO: 3) under the control of a rice GOS2 promoter (pGOS2).
[0361] FIG. 5 represents the domain structure of SEQ ID NO: 142 with conserved motifs.
[0362] FIG. 6 represents a multiple alignment of various BCAT4-like polypeptides. The asterisks indicate identical amino acids among the various protein sequences, colons represent highly conserved amino acid substitutions, and the dots represent less conserved amino acid substitutions; at other positions there is no sequence conservation. These alignments can be used for defining further motifs or signature sequences, when using conserved amino acids. The corresponding SEQ ID NOs for the aligned polypeptide sequences shown in FIG. 6 are:
[0363] SEQ ID NO: 160 for G.max_Glyma06g05280
[0364] SEQ ID NO: 168 for H.annuus_TC54245
[0365] SEQ ID NO: 146 for A.thaliana_AT1G10070
[0366] SEQ ID NO: 198 for P.trichocarpa_scaff_II.1054
[0367] SEQ ID NO: 144 for A.thaliana_AT1G10060
[0368] SEQ ID NO: 154 for A.thaliana_AT3G49680
[0369] SEQ ID NO: 156 for A.thaliana_AT5G65780
[0370] SEQ ID NO: 142 for P.trichocarpa_BCAT4-like
[0371] SEQ ID NO: 158 for G.max_Glyma01g40420
[0372] SEQ ID NO: 166 for G.max_Glyma11g04870
[0373] SEQ ID NO: 182 for M.truncatula_TC114768
[0374] SEQ ID NO: 202 for S.lycopersicum_TC213629
[0375] SEQ ID NO: 170 for H.vulgare_TC165564
[0376] SEQ ID NO: 188 for O.sativa_LOC_Os05g48450
[0377] SEQ ID NO: 208 for Z.mays_TA12434_4577999
[0378] SEQ ID NO: 174 for Hordeum_vulgare_PUT-169a-Horde
[0379] SEQ ID NO: 176 for Hordeum_vulgare_subsp_vulgare_
[0380] SEQ ID NO: 190 for O.sativa_Os03g0106400
[0381] SEQ ID NO: 186 for O.sativa_LOC_Os04g47190
[0382] SEQ ID NO: 204 for T.aestivum_TC320973
[0383] SEQ ID NO: 210 for Zea_mays_GRMZM2G047347_T03
[0384] SEQ ID NO: 192 for P.patens_TC31354
[0385] SEQ ID NO: 194 for P.patens_TC33668
[0386] SEQ ID NO: 172 for H.vulgare_TC186077
[0387] SEQ ID NO: 206 for T.aestivum_TC325793
[0388] SEQ ID NO: 184 for O.sativa_LOC_Os03g12890
[0389] SEQ ID NO: 212 for Zea_mays_GRMZM2G153536_T03
[0390] SEQ ID NO: 148 for A.thaliana_AT1G50090
[0391] SEQ ID NO: 150 for A.thaliana_AT1G50110
[0392] SEQ ID NO: 152 for A.thaliana_AT3G19710
[0393] SEQ ID NO: 162 for G.max_Glyma07g30510
[0394] SEQ ID NO: 164 for G.max_Glyma08g06750
[0395] SEQ ID NO: 178 for M.truncatula_AC159872_36
[0396] SEQ ID NO: 196 for P.trichocarpa_804339
[0397] SEQ ID NO: 200 for P.trichocarpa_scaff_IX.827
[0398] SEQ ID NO: 180 for M.truncatula_AC159872_55
[0399] FIG. 7 shows a phylogenetic tree of BCAT4-like polypeptides, as described in Example 2.
[0400] FIG. 8 shows the MATGAT table of Example 3.
[0401] FIG. 9 represents the binary vector used for increased expression in Oryza sativa of a BCAT4-like-encoding nucleic acid under the control of a rice GOS2 promoter (pGOS2).
EXAMPLES
[0402] The present invention will now be described with reference to the following examples, which are by way of illustration only. The following examples are not intended to limit the scope of the invention. Unless otherwise indicated, the present invention employs conventional techniques and methods of plant biology, molecular biology, bioinformatics and plant breeding.
[0403] DNA manipulation: unless otherwise stated, recombinant DNA techniques are performed according to standard protocols described in (Sambrook (2001) Molecular Cloning: a laboratory manual, 3rd Edition Cold Spring Harbor Laboratory Press, CSH, New York) or in Volumes 1 and 2 of Ausubel et al. (1994), Current Protocols in Molecular Biology, Current Protocols. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R. D. D. Croy, published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications (UK).
Example 1
Identification of Sequences Related to the Nucleic Acid Sequence Used in the Methods of the Invention
[0404] 1. bZIP-Like Polypeptides
[0405] Sequences (full length cDNA, ESTs or genomic) related to SEQ ID NO: 1 or 3 and SEQ ID NO: 2 or 4 were identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid of SEQ ID NO: 1 or SEQ ID NO: 3 was used for the TBLASTN algorithm, with default settings and the filter to ignore low complexity sequences set off. The output of the analysis was viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons were also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
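The ranking of hits by E-value, and the effect of raising the E-value cutoff, may be sketched as follows. This is an illustration only and not the procedure actually used: it assumes hits exported in the 12-column tab-separated tabular format of BLAST+ (query, subject, percentage identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score). The function names, the default cutoff and all hit values in the sample rows are hypothetical.

```python
def parse_tabular_hits(text):
    """Parse 12-column tab-separated BLAST hit lines into dictionaries."""
    hits = []
    for line in text.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append({
            "subject": fields[1],
            "pident": float(fields[2]),
            "length": int(fields[3]),
            "evalue": float(fields[10]),
            "bitscore": float(fields[11]),
        })
    return hits


def rank_hits(hits, max_evalue=10.0):
    """Keep hits at or below the E-value cutoff; most significant first.

    Ties in E-value are broken by higher percentage identity.
    """
    kept = [h for h in hits if h["evalue"] <= max_evalue]
    return sorted(kept, key=lambda h: (h["evalue"], -h["pident"]))


# Hypothetical hits, invented for illustration only.
SAMPLE_ROWS = [
    ("query1", "subject_A", "62.5", "240", "90", "2", "1", "240", "10", "729", "1e-80", "250"),
    ("query1", "subject_B", "95.0", "20", "1", "0", "5", "24", "100", "159", "0.5", "30"),
    ("query1", "subject_C", "45.0", "230", "120", "5", "3", "232", "40", "729", "1e-40", "150"),
    ("query1", "subject_D", "100.0", "12", "0", "0", "8", "19", "7", "42", "25", "20"),
]
SAMPLE_TEXT = "\n".join("\t".join(row) for row in SAMPLE_ROWS)

default_ranking = rank_hits(parse_tabular_hits(SAMPLE_TEXT))
relaxed_ranking = rank_hits(parse_tabular_hits(SAMPLE_TEXT), max_evalue=100.0)
```

With the relaxed cutoff, the short exact match (subject_D in the invented data) is retained at the bottom of the ranking, which mirrors the statement that increasing the E-value allows short nearly exact matches to be identified.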
[0406] Table A1 provides a list of nucleic acid sequences and amino acid sequences related to SEQ ID NO: 1/2 and SEQ ID NO: 3/4 respectively.
TABLE A1 Examples of bZIP-like nucleic acids and polypeptides:
Plant source                    Nucleic acid SEQ ID NO:  Protein SEQ ID NO:
S. lycopersicum bZIP-like       1                        2
P. trichocarpa bZIP-like        3                        4
A.lyrata_322281                 5                        6
A.lyrata_895903                 7                        8
A.lyrata_944204                 9                        10
A.thaliana_AT2G46270.2          11                       12
A.thaliana_AT4G01120.1          13                       14
A.thaliana_AT4G36730.2          15                       16
Aquilegia_sp_TC21139            17                       18
B.napus_TC63500                 19                       20
B.napus_TC63534                 21                       22
B.napus_TC63535                 23                       24
B.napus_TC63581                 25                       26
B.napus_TC89867                 27                       28
C.canephora_TC1061              29                       30
C.canephora_TC5243              31                       32
C.clementina_TC35556            33                       34
C.endivia_EL359780              35                       36
C.maculosa_EH744483             37                       38
C.maculosa_TA2192_215693        39                       40
C.roseus_AF084971               41                       42
C.roseus_AY027510               43                       44
C.sinensis_TC24527              45                       46
C.sinensis_TC9189               47                       48
C.tinctorius_TA1437_4222        49                       50
E.esula_TC3985                  51                       52
F.vesca_TA11395_57918           53                       54
G.hirsutum_TC137570             55                       56
G.hirsutum_TC148476             57                       58
G.max_Glyma03g41590.4           59                       60
G.max_Glyma07g06620.1           61                       62
G.max_Glyma11g06960.1           63                       64
G.max_Glyma16g03190.1           65                       66
G.max_Glyma16g25600.2           67                       68
G.max_Glyma19g44190.1           69                       70
H.annuus_BU027457               71                       72
H.annuus_TC42219                73                       74
H.petiolaris_TA3844_4234        75                       76
H.tuberosus_TA2407_4233         77                       78
H.tuberosus_TA2966_4233         79                       80
M.truncatula_AC137602_8.5       81                       82
M.truncatula_AC148484_8.5       83                       84
N.tabacum_BP133908              85                       86
N.tabacum_NP917548              87                       88
N.tabacum_TC40642               89                       90
N.tabacum_TC76189               91                       92
P.taeda_TA23200_3352            93                       94
P.trichocarpa_719452            95                       96
P.trifoliata_TA6299_37690       97                       98
P.vulgaris_TC11785              99                       100
S.lycopersicum_TC211600         101                      102
S.lycopersicum_TC213303         103                      104
S.tuberosum_TC185019            105                      106
S.tuberosum_TC186959            107                      108
S.tuberosum_TC187892            109                      110
T.pratense_TA1890_57577         111                      112
Medicago truncatula GBF3-like   113                      114
V.vinifera_GSVIVT00014657001    115                      116
V.vinifera_GSVIVT00024984001    117                      118
2. BCAT4-Like Polypeptides
[0407] Sequences (full length cDNA, ESTs or genomic) related to SEQ ID NO: 141 and SEQ ID NO: 142 were identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid of SEQ ID NO: 141 was used for the TBLASTN algorithm, with default settings and the filter to ignore low complexity sequences set off. The output of the analysis was viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons were also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
[0408] Table A2 provides a list of nucleic acid sequences and amino acid sequences related to SEQ ID NO: 141 and SEQ ID NO: 142 respectively.
TABLE A2 Examples of BCAT4-like nucleic acids and polypeptides:
Plant source                                      Nucleic acid SEQ ID NO:  Protein SEQ ID NO:
A.thaliana_AT1G10060.1                            143                      144
A.thaliana_AT1G10070.1                            145                      146
A.thaliana_AT1G50090.1                            147                      148
A.thaliana_AT1G50110.1                            149                      150
A.thaliana_AT3G19710.1                            151                      152
A.thaliana_AT3G49680.1                            153                      154
A.thaliana_AT5G65780.1                            155                      156
G.max_Glyma01g40420.1                             157                      158
G.max_Glyma06g05280.1                             159                      160
G.max_Glyma07g30510.1                             161                      162
G.max_Glyma08g06750.1                             163                      164
G.max_Glyma11g04870.1                             165                      166
H.annuus_TC54245                                  167                      168
H.vulgare_TC165564                                169                      170
H.vulgare_TC186077                                171                      172
Hordeum_vulgare_PUT-169a-Hordeum_vulgare-79158    173                      174
Hordeum_vulgare_subsp_vulgare_AK251931            175                      176
M.truncatula_AC159872_36                          177                      178
M.truncatula_AC159872_55                          179                      180
M.truncatula_TC114768                             181                      182
O.sativa_LOC_Os03g12890                           183                      184
O.sativa_LOC_Os04g47190                           185                      186
O.sativa_LOC_Os05g48450                           187                      188
O.sativa_Os03g0106400                             189                      190
P.patens_TC31354                                  191                      192
P.patens_TC33668                                  193                      194
P.trichocarpa_804339                              195                      196
P.trichocarpa_scaff_II.1054                       197                      198
P.trichocarpa_scaff_IX.827                        199                      200
S.lycopersicum_TC213629                           201                      202
T.aestivum_TC320973                               203                      204
T.aestivum_TC325793                               205                      206
Z.mays_TA12434_4577999                            207                      208
Zea_mays_GRMZM2G047347_T03                        209                      210
Zea_mays_GRMZM2G153536_T03                        211                      212
[0409] Sequences have been tentatively assembled and publicly disclosed by research institutions, such as The Institute for Genomic Research (TIGR; beginning with TA). For instance, the Eukaryotic Gene Orthologs (EGO) database may be used to identify such related sequences, either by keyword search or by using the BLAST algorithm with the nucleic acid sequence or polypeptide sequence of interest. Special nucleic acid sequence databases have been created for particular organisms, e.g. for certain prokaryotic organisms, such as by the Joint Genome Institute. Furthermore, access to proprietary databases has allowed the identification of novel nucleic acid and polypeptide sequences.
Example 2
Alignment of Sequences to the Polypeptide Sequences Used in the Methods of the Invention
[0410] 1. bZIP-Like Polypeptides
[0411] Alignment of the polypeptide sequences was performed using the ClustalW algorithm (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003) Nucleic Acids Res 31:3497-3500) as present in Vector NTI (Invitrogen), with standard settings (gap opening penalty: 10, gap extension penalty: 0.05, gap separation penalty range: 8). Minor manual editing was done to further optimise the alignment. The bZIP-like polypeptides are aligned in FIGS. 2 A and B.
2. BCAT4-Like Polypeptides
[0412] Alignment of the polypeptide sequences was performed using the ClustalW 2.0.11 algorithm of progressive alignment (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003) Nucleic Acids Res 31:3497-3500) with standard settings (slow alignment, similarity matrix: Gonnet, gap opening penalty: 10, gap extension penalty: 0.2). Minor manual editing was done to further optimise the alignment. The BCAT4-like polypeptides are aligned in FIG. 6.
[0413] A phylogenetic tree of BCAT4-like polypeptides (FIG. 7) was constructed by aligning BCAT4-like sequences using MAFFT (Katoh and Toh (2008)--Briefings in Bioinformatics 9:286-298). A neighbour-joining tree was calculated using Quick-Tree (Howe et al. (2002), Bioinformatics 18(11): 1546-7), 100 bootstrap repetitions. The tree was drawn using Dendroscope (Huson et al. (2007), BMC Bioinformatics 8(1):460). Confidence levels for 100 bootstrap repetitions are indicated for major branchings.
Example 3
Calculation of Global Percentage Identity Between Polypeptide Sequences
[0414] Global percentages of similarity and identity between full length polypeptide sequences useful in performing the methods of the invention were determined using one of the methods available in the art, the MatGAT (Matrix Global Alignment Tool) software (BMC Bioinformatics. 2003 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. Campanella J J, Bitincka L, Smalley J; software hosted by Ledion Bitincka). MatGAT generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix.
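The principle of deriving a percentage identity from a global pairwise alignment may be sketched as follows. This is a simplified illustration and not the MatGAT implementation: it uses a basic Needleman-Wunsch alignment with a simple match/mismatch score and a linear gap penalty in place of the Myers and Miller algorithm, the Blosum 62 matrix and the gap opening/extension penalties named above. The scoring values, the function names and the choice of the alignment length as denominator are all assumptions.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical aligned positions as a percentage of the alignment length."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)
```

Applying both functions to every pair in a set of sequences and storing the results in a square table would give an identity matrix of the kind shown in FIG. 3 and FIG. 8, although the actual values depend on the scoring scheme used.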
1. bZIP-Like Polypeptides
[0415] Results of the MatGAT analysis are shown in FIG. 3 with global similarity and identity percentages over the full length of the polypeptide sequences. Sequence similarity is shown in the bottom half of the dividing line and sequence identity is shown in the top half of the diagonal dividing line. Parameters used in the analysis were: Scoring matrix: Blosum62, First Gap: 12, Extending Gap: 2. The sequence identity (in %) between the bZIP-like polypeptide sequences useful in performing the methods of the invention can be as low as 18% (but is generally higher than 30%) compared to SEQ ID NO: 2. For SEQ ID NO: 4, the sequence identity with other bZIP-like polypeptide sequences useful in performing the methods of the invention can be as low as 23% (but is on average higher than 48%).
2. BCAT4-Like Polypeptides
[0416] Results of the MatGAT analysis are shown in FIG. 8 with global similarity and identity percentages over the full length of the polypeptide sequences. Sequence similarity is shown in the bottom half of the dividing line and sequence identity is shown in the top half of the diagonal dividing line. Parameters used in the analysis were: Scoring matrix: Blosum62, First Gap: 12, Extending Gap: 2. The sequence identity (in %) between the BCAT4-like polypeptide sequences useful in performing the methods of the invention can be as low as 37.9% (is generally higher than 37.9%) compared to SEQ ID NO: 142.
TABLE B Description of proteins in FIG. 8:
1. A.thaliana_AT1G10060
2. A.thaliana_AT1G10070
3. A.thaliana_AT1G50090
4. A.thaliana_AT1G50110
5. A.thaliana_AT3G19710
6. A.thaliana_AT3G49680
7. A.thaliana_AT5G65780
8. G.max_Glyma01g40420
9. G.max_Glyma06g05280
10. G.max_Glyma07g30510
11. G.max_Glyma08g06750
12. G.max_Glyma11g04870
13. H.annuus_TC54245
14. H.vulgare_TC165564
15. H.vulgare_TC186077
16. Hordeum_vulgare_PUT-169a-Hordeum_vulgare-79158
17. Hordeum_vulgare_subsp_vulgare_AK251931
18. M.truncatula_AC159872_36
19. M.truncatula_AC159872_55
20. M.truncatula_TC114768
21. O.sativa_LOC_Os03g12890
22. O.sativa_LOC_Os04g47190
23. O.sativa_LOC_Os05g48450
24. O.sativa_Os03g0106400
25. P.patens_TC31354
26. P.patens_TC33668
27. P.trichocarpa_804339
28. P.trichocarpa_scaff_II.1054
29. P.trichocarpa_scaff_IX.827
30. S.lycopersicum_TC213629
31. P.trichocarpa_BCAT4-like
32. T.aestivum_TC320973
33. T.aestivum_TC325793
34. Z.mays_TA12434_4577999
35. Zea_mays_GRMZM2G047347_T03
36. Zea_mays_GRMZM2G153536_T03
Example 4
Identification of Domains Comprised in Polypeptide Sequences Useful in Performing the Methods of the Invention
[0417] The Integrated Resource of Protein Families, Domains and Sites (InterPro) database is an integrated interface for the commonly used signature databases for text- and sequence-based searches. The InterPro database combines these databases, which use different methodologies and varying degrees of biological information about well-characterized proteins to derive protein signatures. Collaborating databases include SWISS-PROT, PROSITE, TrEMBL, PRINTS, ProDom, Pfam, SMART and TIGRFAMs. Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families. Pfam is hosted at the Sanger Institute server in the United Kingdom. InterPro is hosted at the European Bioinformatics Institute in the United Kingdom.
1. bZIP-Like Polypeptides
[0418] The results of the InterPro scan (InterPro database, release 31.0) of the polypeptide sequences as represented by SEQ ID NO: 2 and 4 are presented in Tables C1 and C2, respectively.
TABLE C1 InterPro scan results (major accession numbers) of the polypeptide sequence as represented by SEQ ID NO: 2.
InterPro ID  Domain name        Short name                                             Description                                       Location
IPR004827    SM00338            no description                                         Basic-leucine zipper (bZIP) transcription factor  186-250
IPR004827    PS50217            BZIP                                                   Basic-leucine zipper (bZIP) transcription factor  188-251
IPR004827    PS00036            BZIP_BASIC                                             Basic-leucine zipper (bZIP) transcription factor  193-208
IPR011616    PF00170            bZIP_1                                                 bZIP transcription factor, bZIP-1                 186-240
IPR012900    PF07777            MFMR                                                   G-box binding, MFMR                               001-099
No IPR ID    G3DSA:1.20.5.170   no description                                         NULL                                              180-246
No IPR ID    PTHR22952:SF5      CYCLIC-AMP-DEPENDENT TRANSCRIPTION FACTOR ATF-6 BETA                                                     158-232
No IPR ID    PTHR22952          CAMP-RESPONSE ELEMENT BINDING PROTEIN-RELATED                                                            158-232
TABLE C2 InterPro scan results (major accession numbers) of the polypeptide sequence as represented by SEQ ID NO: 4.
InterPro ID  Domain name        Short name                                                       Description                                       Location
NULL         G3DSA:1.20.5.170   no description                                                   NULL                                              254-319
IPR004827    PS00036            BZIP_BASIC                                                       Basic-leucine zipper (bZIP) transcription factor  267-282
IPR012900    PF07777            MFMR                                                             G-box binding, MFMR                               1-175
IPR011616    PF00170            bZIP_1                                                           bZIP transcription factor, bZIP-1                 262-322
             PTHR22952:SF73     GBF2 (G-BOX BINDING FACTOR 2); DNA BINDING/TRANSCRIPTION FACTOR  NULL                                              12-333
             PTHR22952          FAMILY NOT NAMED                                                 NULL                                              12-333
IPR004827    SM00338            no description                                                   Basic-leucine zipper (bZIP) transcription factor  260-324
IPR004827    PS50217            BZIP                                                             Basic-leucine zipper (bZIP) transcription factor  262-325
[0419] In an embodiment a bZIP-like polypeptide comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid M1 up to amino acid N99 in SEQ ID NO: 2 (which corresponds to the conserved domain starting with amino acid M1 up to amino acid R175 in SEQ ID NO: 4).
2. BCAT4-Like Polypeptides
[0420] SEQ ID NO: 142 was checked against the NCBI database for conserved domains and it was found that SEQ ID NO: 142 is part of the BCAT_beta_family and part of the multidomains PLN02782.
[0421] In an embodiment a BCAT4-like polypeptide comprises a conserved domain (or motif) with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the conserved domain starting with amino acid 185 up to amino acid 202 in SEQ ID NO: 142, or to the conserved domain starting with amino acid 150 up to amino acid 167 in SEQ ID NO: 142, or to the conserved domain starting with amino acid 126 up to amino acid 143 in SEQ ID NO: 142.
Example 5
Topology Prediction of the Polypeptide Sequences Useful in Performing the Methods of the Invention
[0423] TargetP 1.1 predicts the subcellular location of eukaryotic proteins. The location assignment is based on the predicted presence of any of the N-terminal pre-sequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP). Scores on which the final prediction is based are not really probabilities, and they do not necessarily add to one. However, the location with the highest score is the most likely according to TargetP, and the relationship between the scores (the reliability class) may be an indication of how certain the prediction is. The reliability class (RC) ranges from 1 to 5, where 1 indicates the strongest prediction. For the sequences predicted to contain an N-terminal presequence a potential cleavage site can also be predicted. TargetP is maintained at the server of the Technical University of Denmark.
[0424] A number of parameters must be selected before analysing a sequence, such as organism group (non-plant or plant), cutoff sets (none, predefined set of cutoffs, or user-specified set of cutoffs), and the calculation of prediction of cleavage sites (yes or no).
[0425] Many other algorithms can be used to perform such analyses, including:
[0426] ChloroP 1.1 hosted on the server of the Technical University of Denmark;
[0427] Protein Prowler Subcellular Localisation Predictor version 1.2 hosted on the server of the Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia;
[0428] PENCE Proteome Analyst PA-GOSUB 2.5 hosted on the server of the University of Alberta, Edmonton, Alberta, Canada;
[0429] TMHMM, hosted on the server of the Technical University of Denmark
[0430] PSORT (URL: psort.org)
[0431] PLOC (Park and Kanehisa, Bioinformatics, 19, 1656-1663, 2003).
1. bZIP-Like Polypeptides
[0432] For the sequences predicted to contain an N-terminal presequence a potential cleavage site can also be predicted.
[0433] A number of parameters were selected, such as organism group (non-plant or plant), cutoff sets (none, predefined set of cutoffs, or user-specified set of cutoffs), and the calculation of prediction of cleavage sites (yes or no).
[0434] The results of TargetP 1.1 analysis of the polypeptide sequence as represented by SEQ ID NO: 2 are presented in Table D. The "plant" organism group was selected, no cutoffs were defined, and the predicted length of the transit peptide was requested. The subcellular localization of the polypeptide sequence as represented by SEQ ID NO: 2 may be the cytoplasm or nucleus; no transit peptide is predicted.
TABLE-US-00018 TABLE D TargetP 1.1 analysis of the polypeptide sequence as represented by SEQ ID NO: 2 and 4.
Name          Len  cTP    mTP    SP     other  Loc  RC  TPlen
SEQ ID NO: 2  284  0.144  0.279  0.067  0.468  --   5   --
cutoff             0.000  0.000  0.000  0.000
SEQ ID NO: 4  399  0.238  0.053  0.045  0.940  --   2   --
cutoff             0.000  0.000  0.000  0.000
Abbreviations: Len, Length; cTP, Chloroplastic transit peptide; mTP, Mitochondrial transit peptide; SP, Secretory pathway signal peptide; other, Other subcellular targeting; Loc, Predicted Location; RC, Reliability class; TPlen, Predicted transit peptide length.
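The reliability classes reported in Table D can be reproduced from the four scores. The following is a minimal sketch, assuming the RC definition given in the TargetP 1.1 documentation: RC is set by the difference between the highest and second-highest score, from RC 1 for a difference above 0.8 down to RC 5 for a difference below 0.2. The function name and the dictionary layout are illustrative and are not part of TargetP.

```python
def reliability_class(scores):
    """Return (best_location, RC) from a dict of TargetP-style scores.

    Assumption: RC thresholds follow the TargetP 1.1 documentation
    (difference between the two highest scores, in steps of 0.2).
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_loc, best), (_, second) = ranked[0], ranked[1]
    diff = best - second
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return best_loc, rc


# Scores copied from Table D.
TABLE_D = {
    "SEQ ID NO: 2": {"cTP": 0.144, "mTP": 0.279, "SP": 0.067, "other": 0.468},
    "SEQ ID NO: 4": {"cTP": 0.238, "mTP": 0.053, "SP": 0.045, "other": 0.940},
}

for name, scores in TABLE_D.items():
    print(name, reliability_class(scores))
```

Under these assumptions, both sequences score highest for "other", with reliability classes matching the RC column of Table D.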
Example 6
Functional Assay Related to the Polypeptide Sequences Useful in Performing the Methods of the Invention
[0435] Gel-shift analysis of GmbZIP DNA-binding ability (Liao et al., 2008)
[0436] Three copies of GLM (GTGAGTCAT) (Onate et al., J Biol Chem, 274:9175-9182, 1999), ABRE (CCACGTGG) (Jakoby et al., Trends Plant Sci 7:106-111, 2002), and PB-like (TGAAAA) (Onate et al. 1999) elements are synthesised using standard techniques. Two complementary single-stranded oligonucleotides are mixed in 50 mM NaCl, heated at 70° C. for 5 min and then cooled slowly to room temperature. Each annealed element is labeled with [gamma-32P]ATP (about 110 TBq/mmol, Amersham) using T4 polynucleotide kinase and used as a probe. Gelshift analysis is performed with the 32P-labeled probe and 2 pg of recombinant protein. The competitive experiment is performed by adding an excess of 50× unlabeled element in addition to the 32P-labeled probe. After incubation for 30 min at 25° C., the mixtures are subjected to electrophoresis in 6% (w/v) polyacrylamide gel on ice. The gel is placed on a sheet of Whatman 3 MM filter paper, covered with plastic wrap and exposed to X-ray film overnight at -70° C. with an intensifying screen.
Example 7
Cloning of the Nucleic Acid Sequence Used in Methods of the Invention
[0437] 1. bZIP-Like Polypeptides
[0438] The nucleic acid sequence of SEQ ID NO: 1 was amplified by PCR using as template a custom-made Solanum lycopersicum seedlings cDNA library. PCR was performed using a commercially available proofreading Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were prm9943 (SEQ ID NO: 134; sense, start codon in bold): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatgcctccttatgggactc-3' and prm9944 (SEQ ID NO: 135; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggtgcttttccacttctccttaac-3', which include the AttB sites for Gateway recombination. Similarly, the nucleic acid sequence of SEQ ID NO: 3 was amplified by PCR using as template a custom-made Populus trichocarpa seedlings cDNA library using primers prm17402: 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatgggaaacattgaagaggg-3' (SEQ ID NO: 136) and prm17403: 5'-ggggaccactttgtacaagaaagctgggttgaaccagtgtcatcaaccag-3' (SEQ ID NO: 137).
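The structure of these primers (AttB adapter followed by a gene-specific part) can be checked programmatically. The sketch below assumes that the adapters are the standard Gateway attB1 (sense primers) and attB2 (reverse primers) sequences; the helper names are hypothetical. Whitespace is stripped before comparison, so a primer broken across lines is still handled.

```python
# Assumed standard Gateway adapter sequences (not stated explicitly in the text).
ATTB1 = "ggggacaagtttgtacaaaaaagcaggct"  # 5' end of sense primers
ATTB2 = "ggggaccactttgtacaagaaagctgggt"  # 5' end of reverse primers

# Primer sequences as given for SEQ ID NO: 134 to 137.
PRIMERS = {
    "prm9943": ("sense", "ggggacaagtttgtacaaaaaagcaggcttaaacaatgcctccttatgggactc"),
    "prm9944": ("reverse", "ggggaccactttgtacaagaaagctgggtgcttttccacttctccttaac"),
    "prm17402": ("sense", "ggggacaagtttgtacaaaaaagcaggcttaaacaatgggaaacattgaagaggg"),
    "prm17403": ("reverse", "ggggaccactttgtacaagaaagctgggttgaaccagtgtcatcaaccag"),
}


def clean(seq):
    """Remove whitespace and normalise to lower case."""
    return "".join(seq.split()).lower()


def check_primer(orientation, seq):
    """Return (has_adapter, gene_specific_part, start_codon_offset).

    start_codon_offset is the position of the first 'atg' after the attB1
    adapter for sense primers, and -1 for reverse primers.
    """
    seq = clean(seq)
    adapter = ATTB1 if orientation == "sense" else ATTB2
    has_adapter = seq.startswith(adapter)
    gene_specific = seq[len(adapter):] if has_adapter else seq
    offset = gene_specific.find("atg") if orientation == "sense" else -1
    return has_adapter, gene_specific, offset


for name, (orientation, seq) in PRIMERS.items():
    print(name, check_primer(orientation, seq))
```

In both sense primers the start codon follows the adapter after the six-nucleotide spacer "taaaca".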
[0439] The amplified PCR fragments were purified also using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment (comprising either SEQ ID NO: 1 or SEQ ID NO: 3) recombined in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone", pbZIP-like. Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
[0440] The entry clone comprising SEQ ID NO: 1 or SEQ ID NO: 3 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 131) for constitutive expression was located upstream of this Gateway cassette.
[0441] After the LR recombination step, the resulting expression vector pGOS2::bZIP-like (FIG. 4) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.
2. BCAT4-Like Polypeptides
[0442] The nucleic acid sequence was amplified by PCR using as template a custom-made Populus trichocarpa seedlings cDNA library. PCR was performed using a commercially available proofreading Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were prm15099 (SEQ ID NO: 219; sense): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatggagagaagcgccgt-3' and prm15100 (SEQ ID NO: 220; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggttcactgcagtacgcctaactc-3', which include the AttB sites for Gateway recombination. The amplified PCR fragment was purified also using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombined in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone", pBCAT4-like. Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
[0443] The entry clone comprising SEQ ID NO: 141 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 218) for constitutive expression was located upstream of this Gateway cassette.
[0444] After the LR recombination step, the resulting expression vector pGOS2::BCAT4-like (FIG. 9) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.
Example 8
Plant Transformation
Rice Transformation
[0445] The Agrobacterium containing the expression vector was used to transform Oryza sativa plants. Mature dry seeds of the rice japonica cultivar Nipponbare were dehusked. Sterilization was carried out by incubating for one minute in 70% ethanol, followed by 30 to 60 minutes, preferably 30 minutes, in sodium hypochlorite solution (depending on the grade of contamination), followed by 3 to 6, preferably 4, washes with sterile distilled water. The sterile seeds were then germinated on a medium containing 2,4-D (callus induction medium). After incubation in light for 6 days, scutellum-derived calli were transformed with Agrobacterium as described herein below.
[0446] Agrobacterium strain LBA4404 containing the expression vector was used for co-cultivation. Agrobacterium was inoculated on AB medium with the appropriate antibiotics and cultured for 3 days at 28° C. The bacteria were then collected and suspended in liquid co-cultivation medium to a density (OD600) of about 1. The calli were immersed in the suspension for 1 to 15 minutes. The callus tissues were then blotted dry on a filter paper and transferred to solidified, co-cultivation medium and incubated for 3 days in the dark at 25° C. After washing away the Agrobacterium, the calli were grown on 2,4-D-containing medium for 10 to 14 days (growth time for indica: 3 weeks) under light at 28° C.-32° C. in the presence of a selection agent. During this period, rapidly growing resistant callus developed. After transfer of this material to regeneration media, the embryogenic potential was released and shoots developed in the next four to six weeks. Shoots were excised from the calli and incubated for 2 to 3 weeks on an auxin-containing medium from which they were transferred to soil. Hardened shoots were grown under high humidity and short days in a greenhouse.
[0447] Transformation of rice cultivar indica can also be done in a similar way as given above according to techniques well known to a skilled person.
[0448] 35 to 90 independent T0 rice transformants were generated for one construct. The primary transformants were transferred from a tissue culture chamber to a greenhouse. After a quantitative PCR analysis to verify copy number of the T-DNA insert, only single copy transgenic plants that exhibit tolerance to the selection agent were kept for harvest of T1 seed. Seeds were then harvested three to five months after transplanting. The method yielded single locus transformants at a rate of over 50% (Aldemita and Hodges 1996, Chan et al. 1993, Hiei et al. 1994).
Example 9
Transformation of Other Crops
Corn Transformation
[0449] Transformation of maize (Zea mays) is performed with a modification of the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. Transformation is genotype-dependent in corn and only specific genotypes are amenable to transformation and regeneration. The inbred line A188 (University of Minnesota) or hybrids with A188 as a parent are good sources of donor material for transformation, but other genotypes can be used successfully as well. Ears are harvested from corn plants approximately 11 days after pollination (DAP) when the length of the immature embryo is about 1 to 1.2 mm. Immature embryos are cocultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. Excised embryos are grown on callus induction medium, then maize regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to maize rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Wheat Transformation
[0450] Transformation of wheat is performed with the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. The cultivar Bobwhite (available from CIMMYT, Mexico) is commonly used in transformation. Immature embryos are co-cultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. After incubation with Agrobacterium, the embryos are grown in vitro on callus induction medium, then regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Soybean Transformation
[0451] Soybean is transformed according to a modification of the method described in the Texas A&M patent U.S. Pat. No. 5,164,310. Several commercial soybean varieties are amenable to transformation by this method. The cultivar Jack (available from the Illinois Seed foundation) is commonly used for transformation. Soybean seeds are sterilised for in vitro sowing. The hypocotyl, the radicle and one cotyledon are excised from seven-day old young seedlings. The epicotyl and the remaining cotyledon are further grown to develop axillary nodes. These axillary nodes are excised and incubated with Agrobacterium tumefaciens containing the expression vector. After the cocultivation treatment, the explants are washed and transferred to selection media. Regenerated shoots are excised and placed on a shoot elongation medium. Shoots no longer than 1 cm are placed on rooting medium until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Rapeseed/Canola Transformation
[0452] Cotyledonary petioles and hypocotyls of 5-6 day old seedlings are used as explants for tissue culture and transformed according to Babic et al. (1998, Plant Cell Rep 17: 183-188). The commercial cultivar Westar (Agriculture Canada) is the standard variety used for transformation, but other varieties can also be used. Canola seeds are surface-sterilized for in vitro sowing. The cotyledon petiole explants with the cotyledon attached are excised from the in vitro seedlings, and inoculated with Agrobacterium (containing the expression vector) by dipping the cut end of the petiole explant into the bacterial suspension. The explants are then cultured for 2 days on MSBAP-3 medium containing 3 mg/l BAP, 3% sucrose, 0.7% Phytagar at 23° C., 16 hr light. After two days of co-cultivation with Agrobacterium, the petiole explants are transferred to MSBAP-3 medium containing 3 mg/l BAP, cefotaxime, carbenicillin, or timentin (300 mg/l) for 7 days, and then cultured on MSBAP-3 medium with cefotaxime, carbenicillin, or timentin and selection agent until shoot regeneration. When the shoots are 5-10 mm in length, they are cut and transferred to shoot elongation medium (MSBAP-0.5, containing 0.5 mg/l BAP). Shoots of about 2 cm in length are transferred to the rooting medium (MS0) for root induction. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Alfalfa Transformation
[0453] A regenerating clone of alfalfa (Medicago sativa) is transformed using the method of McKersie et al. (1999, Plant Physiol 119: 839-847). Regeneration and transformation of alfalfa is genotype dependent and therefore a regenerating plant is required. Methods to obtain regenerating plants have been described. For example, these can be selected from the cultivar Rangelander (Agriculture Canada) or any other commercial alfalfa variety as described by Brown DCW and A Atanassov (1985. Plant Cell Tissue Organ Culture 4: 111-112). Alternatively, the RA3 variety (University of Wisconsin) has been selected for use in tissue culture (Walker et al., 1978 Am J Bot 65:654-659). Petiole explants are cocultivated with an overnight culture of Agrobacterium tumefaciens C58C1 pMP90 (McKersie et al., 1999 Plant Physiol 119: 839-847) or LBA4404 containing the expression vector. The explants are cocultivated for 3 d in the dark on SH induction medium containing 288 mg/L Pro, 53 mg/L thioproline, 4.35 g/L K2SO4, and 100 μM acetosyringone. The explants are washed in half-strength Murashige-Skoog medium (Murashige and Skoog, 1962) and plated on the same SH induction medium without acetosyringone but with a suitable selection agent and suitable antibiotic to inhibit Agrobacterium growth. After several weeks, somatic embryos are transferred to BOi2Y development medium containing no growth regulators, no antibiotics, and 50 g/L sucrose. Somatic embryos are subsequently germinated on half-strength Murashige-Skoog medium. Rooted seedlings are transplanted into pots and grown in a greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Cotton Transformation
[0454] Cotton is transformed using Agrobacterium tumefaciens according to the method described in U.S. Pat. No. 5,159,135. Cotton seeds are surface sterilised in 3% sodium hypochlorite solution for 20 minutes and washed in distilled water with 500 μg/ml cefotaxime. The seeds are then transferred to SH-medium with 50 μg/ml benomyl for germination. Hypocotyls of 4 to 6 day old seedlings are removed, cut into 0.5 cm pieces and placed on 0.8% agar. An Agrobacterium suspension (approx. 10^8 cells per ml, diluted from an overnight culture transformed with the gene of interest and suitable selection markers) is used for inoculation of the hypocotyl explants. After 3 days at room temperature and lighting, the tissues are transferred to a solid medium (1.6 g/l Gelrite) with Murashige and Skoog salts with B5 vitamins (Gamborg et al., Exp. Cell Res. 50:151-158 (1968)), 0.1 mg/l 2,4-D, 0.1 mg/l 6-furfurylaminopurine and 750 μg/ml MgCl2, and with 50 to 100 μg/ml cefotaxime and 400-500 μg/ml carbenicillin to kill residual bacteria. Individual cell lines are isolated after two to three months (with subcultures every four to six weeks) and are further cultivated on selective medium for tissue amplification (30° C., 16 hr photoperiod). Transformed tissues are subsequently further cultivated on non-selective medium for 2 to 3 months to give rise to somatic embryos. Healthy looking embryos of at least 4 mm length are transferred to tubes with SH medium in fine vermiculite, supplemented with 0.1 mg/l indole acetic acid, 6-furfurylaminopurine and gibberellic acid. The embryos are cultivated at 30° C. with a photoperiod of 16 hrs, and plantlets at the 2 to 3 leaf stage are transferred to pots with vermiculite and nutrients. The plants are hardened and subsequently moved to the greenhouse for further cultivation.
Sugarbeet Transformation
[0455] Seeds of sugarbeet (Beta vulgaris L.) are sterilized in 70% ethanol for one minute followed by 20 min. shaking in 20% Hypochlorite bleach e.g. Clorox® regular bleach (commercially available from Clorox, 1221 Broadway, Oakland, Calif. 94612, USA). Seeds are rinsed with sterile water and air dried followed by plating onto germinating medium (Murashige and Skoog (MS) based medium (Murashige, T., and Skoog, . . . , 1962. Physiol. Plant, vol. 15, 473-497) including B5 vitamins (Gamborg et al.; Exp. Cell Res., vol. 50, 151-8.) supplemented with 10 g/l sucrose and 0.8% agar). Hypocotyl tissue is used essentially for the initiation of shoot cultures according to Hussey and Hepher (Hussey, G., and Hepher, A., 1978. Annals of Botany, 42, 477-9) and are maintained on MS based medium supplemented with 30 g/l sucrose plus 0.25 mg/l benzylamino purine and 0.75% agar, pH 5.8 at 23-25° C. with a 16-hour photoperiod. Agrobacterium tumefaciens strain carrying a binary plasmid harbouring a selectable marker gene, for example nptII, is used in transformation experiments. One day before transformation, a liquid LB culture including antibiotics is grown on a shaker (28° C., 150 rpm) until an optical density (O.D.) at 600 nm of ˜1 is reached. Overnight-grown bacterial cultures are centrifuged and resuspended in inoculation medium (O.D. ˜1) including Acetosyringone, pH 5.5. Shoot base tissue is cut into slices (1.0 cm×1.0 cm×2.0 mm approximately). Tissue is immersed for 30 s in liquid bacterial inoculation medium. Excess liquid is removed by filter paper blotting. Co-cultivation occurred for 24-72 hours on MS based medium incl. 30 g/l sucrose followed by a non-selective period including MS based medium, 30 g/l sucrose with 1 mg/l BAP to induce shoot development and cefotaxim for eliminating the Agrobacterium. After 3-10 days explants are transferred to similar selective medium harbouring for example kanamycin or G418 (50-100 mg/l genotype dependent). 
Tissues are transferred to fresh medium every 2-3 weeks to maintain selection pressure. The very rapid initiation of shoots (after 3-4 days) indicates regeneration of existing meristems rather than organogenesis of newly developed transgenic meristems. Small shoots are transferred after several rounds of subculture to root induction medium containing 5 mg/l NAA and kanamycin or G418. Additional steps are taken to reduce the potential of generating transformed plants that are chimeric (partially transgenic). Tissue samples from regenerated shoots are used for DNA analysis. Other transformation methods for sugarbeet are known in the art, for example those by Linsey & Gallois (Linsey, K., and Gallois, P., 1990. Journal of Experimental Botany; vol. 41, No. 226; 529-36) or the methods published in the international application published as WO9623891A.
Sugarcane Transformation
[0456] Spindles are isolated from 6-month-old field grown sugarcane plants (Arencibia et al., 1998. Transgenic Research, vol. 7, 213-22; Enriquez-Obregon et al., 1998. Planta, vol. 206, 20-27). Material is sterilized by immersion in a 20% Hypochlorite bleach e.g. Clorox® regular bleach (commercially available from Clorox, 1221 Broadway, Oakland, Calif. 94612, USA) for 20 minutes. Transverse sections around 0.5 cm are placed on the medium in the top-up direction. Plant material is cultivated for 4 weeks on MS (Murashige, T., and Skoog, 1962. Physiol. Plant, vol. 15, 473-497) based medium incl. B5 vitamins (Gamborg, O., et al., 1968. Exp. Cell Res., vol. 50, 151-8) supplemented with 20 g/l sucrose, 500 mg/l casein hydrolysate, 0.8% agar and 5 mg/l 2,4-D at 23° C. in the dark. Cultures are transferred after 4 weeks onto identical fresh medium. Agrobacterium tumefaciens strain carrying a binary plasmid harbouring a selectable marker gene, for example hpt, is used in transformation experiments. One day before transformation, a liquid LB culture including antibiotics is grown on a shaker (28° C., 150 rpm) until an optical density (O.D.) at 600 nm of ˜0.6 is reached. Overnight-grown bacterial cultures are centrifuged and resuspended in MS based inoculation medium (O.D. ˜0.4) including acetosyringone, pH 5.5. Sugarcane embryogenic callus pieces (2-4 mm) are isolated based on morphological characteristics as compact structure and yellow colour and dried for 20 min. in the flow hood followed by immersion in a liquid bacterial inoculation medium for 10-20 minutes. Excess liquid is removed by filter paper blotting. Co-cultivation occurred for 3-5 days in the dark on filter paper which is placed on top of MS based medium incl. B5 vitamins containing 1 mg/l 2,4-D. After co-cultivation calli are washed with sterile water followed by a non-selective cultivation period on similar medium containing 500 mg/l cefotaxime for eliminating remaining Agrobacterium cells. 
After 3-10 days explants are transferred to MS based selective medium incl. B5 vitamins containing 1 mg/l 2,4-D for another 3 weeks harbouring 25 mg/l of hygromycin (genotype dependent). All treatments are made at 23° C. under dark conditions. Resistant calli are further cultivated on medium lacking 2,4-D including 1 mg/l BA and 25 mg/l hygromycin under 16 h light photoperiod resulting in the development of shoot structures. Shoots are isolated and cultivated on selective rooting medium (MS based including, 20 g/l sucrose, 20 mg/l hygromycin and 500 mg/l cefotaxime). Tissue samples from regenerated shoots are used for DNA analysis. Other transformation methods for sugarcane are known in the art, for example from the international application published as WO2010/151634A and the granted European patent EP1831378.
Example 10
Phenotypic Evaluation Procedure
10.1 Evaluation Setup
[0457] 35 to 90 independent T0 rice transformants were generated. The primary transformants were transferred from a tissue culture chamber to a greenhouse for growing and harvest of T1 seed. Six events, of which the T1 progeny segregated 3:1 for presence/absence of the transgene, were retained. For each of these events, approximately 10 T1 seedlings containing the transgene (hetero- and homo-zygotes) and approximately 10 T1 seedlings lacking the transgene (nullizygotes) were selected by monitoring visual marker expression. The transgenic plants and the corresponding nullizygotes were grown side-by-side at random positions. Greenhouse conditions were of short days (12 hours light), 28° C. in the light and 22° C. in the dark, and a relative humidity of 70%. Plants grown under non-stress conditions were watered at regular intervals to ensure that water and nutrients were not limiting and to satisfy plant needs to complete growth and development, unless they were used in a stress screen.
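Retaining events whose T1 progeny segregate 3:1 implies a goodness-of-fit check on the seedling counts. The text does not state which test was used; the sketch below assumes a chi-square test with one degree of freedom at the 5% level, and the seedling counts are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_DF1_5PCT = 3.841


def chi_square_3_to_1(n_transgenic, n_null):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = n_transgenic + n_null
    expected_transgenic = 0.75 * total
    expected_null = 0.25 * total
    return ((n_transgenic - expected_transgenic) ** 2 / expected_transgenic
            + (n_null - expected_null) ** 2 / expected_null)


def fits_3_to_1(n_transgenic, n_null):
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(n_transgenic, n_null) < CHI2_CRIT_DF1_5PCT


# Hypothetical counts of marker-positive and marker-negative T1 seedlings.
for counts in [(30, 10), (33, 7), (38, 2)]:
    print(counts, round(chi_square_3_to_1(*counts), 3), fits_3_to_1(*counts))
```

A fit to 3:1 is consistent with a single segregating locus, which is why such events are convenient for comparing transgenic plants with their nullizygous siblings.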
[0458] From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.
[0459] T1 events can be further evaluated in the T2 generation following the same evaluation procedure as for the T1 generation, e.g. with fewer events and/or with more individuals per event.
Drought Screen
[0460] T1 or T2 plants were grown in potting soil under normal conditions until they approached the heading stage. They were then transferred to a "dry" section where irrigation was withheld. Soil moisture probes were inserted in randomly chosen pots to monitor the soil water content (SWC). When SWC went below certain thresholds, the plants were automatically re-watered continuously until a normal level was reached again. The plants were then re-transferred again to normal conditions. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress conditions. Growth and yield parameters were recorded as detailed for growth under normal conditions.
Nitrogen Use Efficiency Screen (bZIP-Like Polypeptides)
[0461] T1 or T2 plants were grown in potting soil under normal conditions except for the nutrient solution. The pots were watered from transplantation to maturation with a specific nutrient solution containing a reduced nitrogen (N) content, usually 7 to 8 times less. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress. Growth and yield parameters were recorded as detailed for growth under normal conditions.
Salt Stress Screen
[0462] T1 or T2 plants are grown on a substrate made of coco fibers and particles of baked clay (Argex) (3 to 1 ratio). A normal nutrient solution is used during the first two weeks after transplanting the plantlets in the greenhouse. After the first two weeks, 25 mM of salt (NaCl) is added to the nutrient solution, until the plants are harvested. Growth and yield parameters are recorded as detailed for growth under normal conditions.
10.2 Statistical Analysis: F Test
[0463] A two factor ANOVA (analysis of variance) was used as a statistical model for the overall evaluation of plant phenotypic characteristics. An F test was carried out on all the parameters measured of all the plants of all the events transformed with the gene of the present invention. The F test was carried out to check for an effect of the gene over all the transformation events and to verify for an overall effect of the gene, also known as a global gene effect. The threshold for significance for a true global gene effect was set at a 5% probability level for the F test. A significant F test value points to a gene effect, meaning that it is not only the mere presence or position of the gene that is causing the differences in phenotype.
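The two factors of the ANOVA are not named in the text. As an illustration only, the sketch below assumes a balanced fixed-effects design with the factors "gene" (transgenic versus nullizygote) and "event", and computes the F statistic for the gene main effect against the within-cell error. The function name and all data values are invented.

```python
def gene_effect_f(data):
    """F statistic for the gene main effect in a balanced two-factor design.

    data[genotype][event] is a list of measurements, with the same number of
    replicates in every cell. Returns (F, df_gene, df_error).
    """
    genotypes = list(data)
    events = list(data[genotypes[0]])
    n_rep = len(data[genotypes[0]][events[0]])

    all_values = [y for g in genotypes for e in events for y in data[g][e]]
    grand_mean = sum(all_values) / len(all_values)

    # Sum of squares for the gene factor (between genotype means).
    ss_gene = 0.0
    for g in genotypes:
        values = [y for e in events for y in data[g][e]]
        g_mean = sum(values) / len(values)
        ss_gene += len(values) * (g_mean - grand_mean) ** 2

    # Error sum of squares (variation within each genotype x event cell).
    ss_error = 0.0
    for g in genotypes:
        for e in events:
            cell = data[g][e]
            cell_mean = sum(cell) / len(cell)
            ss_error += sum((y - cell_mean) ** 2 for y in cell)

    df_gene = len(genotypes) - 1
    df_error = len(genotypes) * len(events) * (n_rep - 1)
    f_stat = (ss_gene / df_gene) / (ss_error / df_error)
    return f_stat, df_gene, df_error


# Invented measurements: two genotypes, two events, two plants per cell.
example = {
    "transgenic": {"event1": [11.0, 13.0], "event2": [12.0, 14.0]},
    "nullizygote": {"event1": [9.0, 11.0], "event2": [10.0, 12.0]},
}
print(gene_effect_f(example))
```

The p-value follows from the F distribution with (df_gene, df_error) degrees of freedom; the 5% threshold named above would be applied to that p-value.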
10.3 Parameters Measured
[0464] From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles as described in WO 2010/031780. These measurements were used to determine different parameters.
Biomass-Related Parameter Measurement
[0465] The plant aboveground area (or leafy biomass) was determined by counting the total number of pixels on the digital images from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from the different angles and was converted to a physical surface value expressed in square mm by calibration. Experiments show that the aboveground plant area measured this way correlates with the biomass of plant parts above ground. The above ground area is the area measured at the time point at which the plant had reached its maximal leafy biomass.
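The pixel-based area measurement can be sketched as follows. The calibration factor (square mm per pixel), the function names and the pixel counts are hypothetical; the imaging system's actual calibration is not given in the text.

```python
def aboveground_area_mm2(pixel_counts, mm2_per_pixel):
    """Average the plant pixel counts from all angles at one time point and
    convert the mean to square mm with a calibration factor."""
    mean_pixels = sum(pixel_counts) / len(pixel_counts)
    return mean_pixels * mm2_per_pixel


def area_max(pixel_counts_by_time, mm2_per_pixel):
    """Aboveground area at the time point of maximal leafy biomass."""
    return max(aboveground_area_mm2(counts, mm2_per_pixel)
               for counts in pixel_counts_by_time)


# Hypothetical pixel counts: six angles per time point, three time points.
series = [
    [1000, 1100, 900, 1000, 1050, 950],
    [4000, 4200, 3800, 4000, 4100, 3900],
    [3500, 3600, 3400, 3500, 3550, 3450],
]
print(area_max(series, mm2_per_pixel=0.25))
```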
[0466] Increase in root biomass is expressed as an increase in total root biomass (measured as maximum biomass of roots observed during the lifespan of a plant); or as an increase in the root/shoot index, measured as the ratio between root mass and shoot mass in the period of active growth of root and shoot. In other words, the root/shoot index is defined as the ratio of the rapidity of root growth to the rapidity of shoot growth in the period of active growth of root and shoot. Root biomass can be determined using a method as described in WO 2006/029987.
Parameters Related to Development Time
[0467] The early vigour is the plant aboveground area three weeks post-germination. Early vigour was determined by counting the total number of pixels from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from different angles and was converted to a physical surface value expressed in square mm by calibration.
[0468] AreaEmer is an indication of quick early development when this value is decreased compared to control plants. It is the ratio (expressed in %) between the time a plant needs to make 30% of its final biomass and the time it needs to make 90% of its final biomass.
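As a minimal sketch of this ratio, the code below assumes that biomass is represented by the aboveground area at successive imaging time points, that the final biomass is the maximum observed area, and that the times at which 30% and 90% are reached are obtained by linear interpolation between time points. None of these choices is specified in the text, and the data are invented.

```python
def time_to_fraction(days, areas, fraction):
    """Time at which the area first reaches fraction * max(areas),
    linearly interpolated between imaging time points."""
    target = fraction * max(areas)
    for i, area in enumerate(areas):
        if area >= target:
            if i == 0:
                return float(days[0])
            t0, t1 = days[i - 1], days[i]
            a0, a1 = areas[i - 1], area
            return t0 + (target - a0) / (a1 - a0) * (t1 - t0)
    raise ValueError("target area never reached")


def area_emer(days, areas):
    """Ratio (in %) of the time to 30% and the time to 90% of final area."""
    t30 = time_to_fraction(days, areas, 0.30)
    t90 = time_to_fraction(days, areas, 0.90)
    return 100.0 * t30 / t90


# Invented growth curve: days after sowing and aboveground area (square mm).
days = [0, 10, 20, 30, 40]
areas = [0, 100, 300, 900, 1000]
print(round(area_emer(days, areas), 1))
```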
[0469] The "time to flower" or "flowering time" of the plant can be determined using the method as described in WO 2007/093444.
Seed-Related Parameter Measurements
[0470] The mature primary panicles were harvested, counted, bagged, barcode-labelled and then dried for three days in an oven at 37° C. The panicles were then threshed and all the seeds were collected and counted. The seeds are usually covered by a dry outer covering, the husk. The filled husks (herein also named filled florets) were separated from the empty ones using an air-blowing device. The empty husks were discarded and the remaining fraction was counted again. The filled husks were weighed on an analytical balance.
[0471] The number of filled seeds was determined by counting the number of filled husks that remained after the separation step. The total seed weight was measured by weighing all filled husks harvested from a plant.
[0472] The total number of seeds (or florets) per plant was determined by counting the number of husks (whether filled or not) harvested from a plant.
[0473] Thousand Kernel Weight (TKW) is extrapolated from the number of seeds counted and their total weight.
[0474] The Harvest Index (HI) in the present invention is defined as the ratio between the total seed weight and the above ground area (mm^2), multiplied by a factor 10^6.
[0475] The number of flowers per panicle as defined in the present invention is the ratio of the total number of seeds to the number of mature primary panicles.
[0476] The "seed fill rate" or "seed filling rate" as defined in the present invention is the proportion (expressed as a %) of the number of filled seeds (i.e. florets containing seeds) over the total number of seeds (i.e. total number of florets). In other words, the seed filling rate is the percentage of florets that are filled with seed.
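The two count-based parameters defined above, flowers per panicle and seed fill rate, may be illustrated with the following minimal sketch. The function names and the example counts are hypothetical.

```python
def flowers_per_panicle(total_florets, primary_panicles):
    """Total number of seeds (florets, filled or not) per mature primary panicle."""
    if primary_panicles <= 0:
        raise ValueError("number of primary panicles must be positive")
    return total_florets / primary_panicles


def seed_fill_rate(filled_florets, total_florets):
    """Percentage of florets that contain a seed."""
    if total_florets <= 0:
        raise ValueError("total number of florets must be positive")
    if filled_florets > total_florets:
        raise ValueError("filled florets cannot exceed total florets")
    return 100.0 * filled_florets / total_florets


# Hypothetical plant: 5 primary panicles, 500 florets in total, 400 of them filled.
print(flowers_per_panicle(500, 5))
print(seed_fill_rate(400, 500))
```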
Example 10
Results of the Phenotypic Evaluation of the Transgenic Plants
[0477] 1. bZIP-Like Polypeptides
[0478] The results of the evaluation of transgenic rice plants in the T1 generation and expressing a nucleic acid encoding the bZIP-like polypeptide of SEQ ID NO: 2 under nutrient-stress conditions are presented below in Table E1. When grown under conditions of nitrogen deficiency, an increase of at least 5% was observed for aboveground (or green) biomass (AreaMax), for root biomass (RootMax & RootThickMax) and for seed yield (total seed weight (totalwgseeds), number of filled seeds (nrfilledseed) and total number of seeds (nrtotalseed)). No negative effect on flowering time (delayed flowering) was observed.
[0479] In addition, plants expressing the bZIP-like nucleic acid of SEQ ID NO: 1 showed an increase in Thousand Kernel Weight, fillrate, and number of first panicles in at least one of the tested lines. One of the tested lines also showed an earlier flowering time.
TABLE E1 Data summary for transgenic rice plants; for each parameter, the overall percent increase is shown for the T1 generation; for each parameter the p-value is <0.05.

Parameter       Overall increase (%)
AreaMax         14.7
RootMax         8.8
totalwgseeds    18.1
nrtotalseed     10.0
nrfilledseed    16.1
RootThickMax    5.3
[0480] A similar trend was observed for plants grown under non-stress conditions: aboveground biomass, root biomass and seed yield (total seed weight, total number of seeds, fillrate, and number of filled seeds) were increased in at least two of the lines tested, compared to control plants.
[0481] The results of the evaluation of transgenic rice plants in the T1 generation and expressing a nucleic acid encoding the bZIP-like polypeptide of SEQ ID NO: 4 under drought-stress conditions are presented below in Table E2. When grown under conditions of drought, an increase of at least 5% was observed for aboveground (or green) biomass (AreaMax), for root biomass (RootThickMax) and for seed yield (total seed weight (totalwgseeds), fillrate, harvest index, and number of filled seeds (nrfilledseed)). No negative effect on flowering time (delayed flowering) was observed.
TABLE E2 Data summary for transgenic rice plants; for each parameter, the overall percent increase is shown for the T1 generation; for each parameter the p-value is <0.05.

Parameter       Overall increase (%)
AreaMax         8.6
totalwgseeds    42.5
fillrate        46.6
harvestindex    33.5
nrfilledseed    44.0
RootThickMax    8.4
2. BCAT4-Like Polypeptides
[0482] The results of the evaluation of transgenic rice plants expressing a BCAT4-like nucleic acid under drought-stress conditions are presented hereunder. An increase of at least 5% was observed for total seed weight, fill rate, harvest index and number of seeds (Table E3).
[0483] The results of the evaluation of transgenic rice plants in the T1 generation and expressing a nucleic acid encoding the BCAT4-like polypeptide of SEQ ID NO: 142 under drought stress conditions are presented below in Table E3. When grown under drought stress conditions, an increase of at least 5% was observed for seed yield, including total weight of seeds (totalwgseeds), fill rate, harvest index and number of filled seeds (nrfilledseed). In addition, two lines were clearly positive for GravityYMax, which is the height of the gravity center of the leafy biomass of the plant.
TABLE E3 Data summary for transgenic rice plants; for each parameter, the overall percent increase is shown (T1 generation); for each parameter the p-value is <0.05.

Parameter       Overall increase (%)
totalwgseeds    34.5
fillrate        36.0
harvestindex    29.8
nrfilledseed    32.1
Sequence CWU
1
1
2221855DNASolanum lycopersicon 1atgcctcctt atgggactcc agttccatat
ccagctttat atcctcctgc cggagtttat 60gctcatccta acattgccac gccggctcca
aattctgtgc cggcaaatcc tgaagcagat 120gggaaggggc ctgaaggaaa ggatcggaat
tcaagtaaaa agttaaaggt ctgttctggt 180ggtaaggcag gcgacaatgg gaaagttact
tcaggttccg gaaatgatgg tgccacacaa 240agtgatgaaa gcagaagtga aggtacatca
gatacaaatg atgaaaatga taacaatgaa 300tttgctgcaa acaagaaggg aagctttgat
caaatgcttg cagatggagc cagtgcacag 360aataatcctg cgaaagagaa tcacccgact
tctatacatg gaaatcctgt caccatgcct 420gcaactaacc taaatattgg aatggacgtg
tggaatgcat cagctgccgg tcctggagcg 480atcaaaatac agcaaaatgc aactggtcca
gttataggac atgaaggaag gatgaatgat 540cagtggattc aggaggaacg tgaacttaaa
aggcaaaaga gaaagcaatc taatagggag 600tcagctagga ggtcgaggct ccgcaagcag
gcagagtgtg aagagctaca acgtagagta 660gaagctttga gccatgagaa tcattcactc
aaagatgagc tccaacggct ctctgaggaa 720tgtgagaagc ttacctcgga gaataattta
attaaggaag agttaacgct actttgtgga 780ccagacgttg tgtctaagct ggagagaaac
gataatgtca cacgtattca atctaatgtt 840gaagaagcta gttaa
8552284PRTSolanum lycopersicon 2Met Pro
Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala Leu Tyr Pro Pro 1 5
10 15 Ala Gly Val Tyr Ala His Pro
Asn Ile Ala Thr Pro Ala Pro Asn Ser 20 25
30 Val Pro Ala Asn Pro Glu Ala Asp Gly Lys Gly Pro
Glu Gly Lys Asp 35 40 45
Arg Asn Ser Ser Lys Lys Leu Lys Val Cys Ser Gly Gly Lys Ala Gly
50 55 60 Asp Asn Gly
Lys Val Thr Ser Gly Ser Gly Asn Asp Gly Ala Thr Gln 65
70 75 80 Ser Asp Glu Ser Arg Ser Glu
Gly Thr Ser Asp Thr Asn Asp Glu Asn 85
90 95 Asp Asn Asn Glu Phe Ala Ala Asn Lys Lys Gly
Ser Phe Asp Gln Met 100 105
110 Leu Ala Asp Gly Ala Ser Ala Gln Asn Asn Pro Ala Lys Glu Asn
His 115 120 125 Pro
Thr Ser Ile His Gly Asn Pro Val Thr Met Pro Ala Thr Asn Leu 130
135 140 Asn Ile Gly Met Asp Val
Trp Asn Ala Ser Ala Ala Gly Pro Gly Ala 145 150
155 160 Ile Lys Ile Gln Gln Asn Ala Thr Gly Pro Val
Ile Gly His Glu Gly 165 170
175 Arg Met Asn Asp Gln Trp Ile Gln Glu Glu Arg Glu Leu Lys Arg Gln
180 185 190 Lys Arg
Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 195
200 205 Lys Gln Ala Glu Cys Glu Glu
Leu Gln Arg Arg Val Glu Ala Leu Ser 210 215
220 His Glu Asn His Ser Leu Lys Asp Glu Leu Gln Arg
Leu Ser Glu Glu 225 230 235
240 Cys Glu Lys Leu Thr Ser Glu Asn Asn Leu Ile Lys Glu Glu Leu Thr
245 250 255 Leu Leu Cys
Gly Pro Asp Val Val Ser Lys Leu Glu Arg Asn Asp Asn 260
265 270 Val Thr Arg Ile Gln Ser Asn Val
Glu Glu Ala Ser 275 280
31200DNAPopulus trichocarpa 3atgggaaaca ttgaagaggg aaagtcttcc acttctgata
aatcttcacc tgcaccaccg 60gatcagacca atattcatgt gtatcctgat ggggcagcta
tgcaggcata ttatggcccc 120cgagtggctc tcccaccata ttacaactcg gccgtggctt
ctggtcatgc ccctcatcct 180tatatgtggg gcctgccaca gcctatgatg ccaccttatg
gggcacctta tgcaacagtc 240tactcacatg gagtgtatgc acatccggct gttccaattg
tatcccatcc tcatggtcct 300gggattgtgt catctcctgc agctggaacc cttttgagtg
cagaaacacc tacaaaatct 360tcaggaaata ctgatcgagg tttagtgaat aagttgaaag
gatttgatgg gcttgcaatg 420tcaataggca atggtaatgc tgagactgtc gagggtgggg
gtaggctgtc tcaaagtgtg 480gagatagaag tttccagtga tggaattgat gggaatacaa
ctaggggaaa gaaaaggagc 540cgtgagggaa caccaactgt tgcaacaggt ggagatacaa
aaatggagtc acattccagt 600ccccttccta gagaggtgaa tgcatccact gacaatgtat
tgagggcagc tgttgctcct 660ggcatgacca cagcattgga gcttaggaac cctcctagtg
tgaatgctgc taagacaagt 720cctactacga ttcctcaatc tggtgtagtc ctgccctctg
aagcctggtt acagaatgag 780ctggagctga aacgggagaa gaggaaacaa tcaaatcgag
aatctgccag aaggtcaaga 840ttaaggaagc aggctgaggc tgaagaactt gcacacaaag
ttgaagtact caccacagaa 900aacatggcac tccaatctga aataagtcaa tttacagaga
aatcagagaa actaaggctt 960gaaaatgctg cattaacgga gaaactcaag aatgcacaat
taggacatgc gcaagaaatg 1020attttaaaca ttgatgagca cagggcccca gctgttagta
cagaaaactt gctatcaaga 1080gttaacaatt ctgcctttga agaagagagt gatctgtatg
aacgaaactc aaattctggt 1140gccaagctgc atcaactctt ggatgcaagc cccagagccg
atgctgtggc tgctggttga 12004399PRTPopulus trichocarpa 4Met Gly Asn Ile
Glu Glu Gly Lys Ser Ser Thr Ser Asp Lys Ser Ser 1 5
10 15 Pro Ala Pro Pro Asp Gln Thr Asn Ile
His Val Tyr Pro Asp Gly Ala 20 25
30 Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Leu Pro Pro
Tyr Tyr 35 40 45
Asn Ser Ala Val Ala Ser Gly His Ala Pro His Pro Tyr Met Trp Gly 50
55 60 Leu Pro Gln Pro Met
Met Pro Pro Tyr Gly Ala Pro Tyr Ala Thr Val 65 70
75 80 Tyr Ser His Gly Val Tyr Ala His Pro Ala
Val Pro Ile Val Ser His 85 90
95 Pro His Gly Pro Gly Ile Val Ser Ser Pro Ala Ala Gly Thr Leu
Leu 100 105 110 Ser
Ala Glu Thr Pro Thr Lys Ser Ser Gly Asn Thr Asp Arg Gly Leu 115
120 125 Val Asn Lys Leu Lys Gly
Phe Asp Gly Leu Ala Met Ser Ile Gly Asn 130 135
140 Gly Asn Ala Glu Thr Val Glu Gly Gly Gly Arg
Leu Ser Gln Ser Val 145 150 155
160 Glu Ile Glu Val Ser Ser Asp Gly Ile Asp Gly Asn Thr Thr Arg Gly
165 170 175 Lys Lys
Arg Ser Arg Glu Gly Thr Pro Thr Val Ala Thr Gly Gly Asp 180
185 190 Thr Lys Met Glu Ser His Ser
Ser Pro Leu Pro Arg Glu Val Asn Ala 195 200
205 Ser Thr Asp Asn Val Leu Arg Ala Ala Val Ala Pro
Gly Met Thr Thr 210 215 220
Ala Leu Glu Leu Arg Asn Pro Pro Ser Val Asn Ala Ala Lys Thr Ser 225
230 235 240 Pro Thr Thr
Ile Pro Gln Ser Gly Val Val Leu Pro Ser Glu Ala Trp 245
250 255 Leu Gln Asn Glu Leu Glu Leu Lys
Arg Glu Lys Arg Lys Gln Ser Asn 260 265
270 Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala
Glu Ala Glu 275 280 285
Glu Leu Ala His Lys Val Glu Val Leu Thr Thr Glu Asn Met Ala Leu 290
295 300 Gln Ser Glu Ile
Ser Gln Phe Thr Glu Lys Ser Glu Lys Leu Arg Leu 305 310
315 320 Glu Asn Ala Ala Leu Thr Glu Lys Leu
Lys Asn Ala Gln Leu Gly His 325 330
335 Ala Gln Glu Met Ile Leu Asn Ile Asp Glu His Arg Ala Pro
Ala Val 340 345 350
Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Ser Ala Phe Glu Glu
355 360 365 Glu Ser Asp Leu
Tyr Glu Arg Asn Ser Asn Ser Gly Ala Lys Leu His 370
375 380 Gln Leu Leu Asp Ala Ser Pro Arg
Ala Asp Ala Val Ala Ala Gly 385 390 395
51143DNAArabidopsis lyrata 5atgggaaata gcagcgagga
accaaagcct accaaatcag ataaaccatc ttcacctccg 60gtggatcaaa caaatgttca
tgtgtaccct gattgggcag ctatgcaggc atattatggt 120ccaagagttg caatgcctcc
ttattacaat tcagctttgg ctgcatctgg tcatcctcct 180cctccttaca tgtggaatcc
tcagcatatg atgtcaccat atggagcacc ctatgctgcg 240gtttatcctc atggaggagg
agtttacgct catcccggaa ttcccatggg atcacagcct 300caaggtcaaa agactccacc
tttagcaact ccggggacgc atttgagcat cgacactcct 360actaaatcta cggggaacac
agacaatgga ttgatgaaga agctgaaaga gtttgatggg 420cttgctatgt ctctaggaaa
cgggaatcct gaaaatggtg cagatgaaca taaacgatca 480cggaacagct cagaaactga
tggttcaact gatggaagtg atgggaatac aactggagca 540gatgaaccga aacttaaaag
aagtcgagag ggaactccga caaaagatgt gaaacaattg 600gttcaatcta gctcatttca
ttctgtttct ccgtcaagtg gtgataccgg cgtaaaactt 660attcaaggat cagctatact
ctctcctggt gtaagtgcaa attccaaccc cttcatgtca 720caatctttag ccatggttcc
tcctgaaact tggcctcaga acgagagaga actgaaacgg 780gagcgaagga aacagtctaa
tagagaatct gctagaaggt caagattaag gaaacaggcc 840gagacggaag aacttgctag
gaaagtcgaa gccttgacag ccgaaaacat ggcactaaga 900tctgaactaa accaacttaa
tgagaaatct gataaactaa gaggagcaaa tgcaaccttg 960ttggacaaac tgaaatgttc
agaacctgaa aagagagtct ccgggaaaat gttgtctaga 1020gttaaaaact caggagctgg
agacaagaac aagaaccaag gagacaatga ttctaaatct 1080acaagcaaat tgtaccaact
gctggataca aagccccgag ctaatgctgt agctgcgggc 1140taa
11436380PRTArabidopsis lyrata
6Met Gly Asn Ser Ser Glu Glu Pro Lys Pro Thr Lys Ser Asp Lys Pro 1
5 10 15 Ser Ser Pro Pro
Val Asp Gln Thr Asn Val His Val Tyr Pro Asp Trp 20
25 30 Ala Ala Met Gln Ala Tyr Tyr Gly Pro
Arg Val Ala Met Pro Pro Tyr 35 40
45 Tyr Asn Ser Ala Leu Ala Ala Ser Gly His Pro Pro Pro Pro
Tyr Met 50 55 60
Trp Asn Pro Gln His Met Met Ser Pro Tyr Gly Ala Pro Tyr Ala Ala 65
70 75 80 Val Tyr Pro His Gly
Gly Gly Val Tyr Ala His Pro Gly Ile Pro Met 85
90 95 Gly Ser Gln Pro Gln Gly Gln Lys Thr Pro
Pro Leu Ala Thr Pro Gly 100 105
110 Thr His Leu Ser Ile Asp Thr Pro Thr Lys Ser Thr Gly Asn Thr
Asp 115 120 125 Asn
Gly Leu Met Lys Lys Leu Lys Glu Phe Asp Gly Leu Ala Met Ser 130
135 140 Leu Gly Asn Gly Asn Pro
Glu Asn Gly Ala Asp Glu His Lys Arg Ser 145 150
155 160 Arg Asn Ser Ser Glu Thr Asp Gly Ser Thr Asp
Gly Ser Asp Gly Asn 165 170
175 Thr Thr Gly Ala Asp Glu Pro Lys Leu Lys Arg Ser Arg Glu Gly Thr
180 185 190 Pro Thr
Lys Asp Val Lys Gln Leu Val Gln Ser Ser Ser Phe His Ser 195
200 205 Val Ser Pro Ser Ser Gly Asp
Thr Gly Val Lys Leu Ile Gln Gly Ser 210 215
220 Ala Ile Leu Ser Pro Gly Val Ser Ala Asn Ser Asn
Pro Phe Met Ser 225 230 235
240 Gln Ser Leu Ala Met Val Pro Pro Glu Thr Trp Pro Gln Asn Glu Arg
245 250 255 Glu Leu Lys
Arg Glu Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 260
265 270 Arg Ser Arg Leu Arg Lys Gln Ala
Glu Thr Glu Glu Leu Ala Arg Lys 275 280
285 Val Glu Ala Leu Thr Ala Glu Asn Met Ala Leu Arg Ser
Glu Leu Asn 290 295 300
Gln Leu Asn Glu Lys Ser Asp Lys Leu Arg Gly Ala Asn Ala Thr Leu 305
310 315 320 Leu Asp Lys Leu
Lys Cys Ser Glu Pro Glu Lys Arg Val Ser Gly Lys 325
330 335 Met Leu Ser Arg Val Lys Asn Ser Gly
Ala Gly Asp Lys Asn Lys Asn 340 345
350 Gln Gly Asp Asn Asp Ser Lys Ser Thr Ser Lys Leu Tyr Gln
Leu Leu 355 360 365
Asp Thr Lys Pro Arg Ala Asn Ala Val Ala Ala Gly 370
375 380 71122DNAArabidopsis lyrata 7atgggtagca acgaagaagg
aaaacccaca aactctgata agccatcgca agctgctgct 60cctgagcaga gtaatgttca
tgtgtatcat catgactggg ctgctatgca ggcatattat 120gggcctagag ttggtatacc
tcaatattac aactcaaatg tggcgcctgg tcatgctcca 180ccgccttata tgtgggcgtc
tccctcgcca atgatggctc cttatggggc accatatcca 240ccattttgcc ctcctggtgg
agtttatgct catcctggtg ttcaaatggg ctcacaacta 300caaggtcctg tttctcaagc
aacacctggt gttacaactc ctttgaccat ggatgcacca 360actaattcag ctggaaactc
ggatcacggg ttcatgaaaa aactgaaaga gttcgatgga 420cttgcaatgt caataagcaa
taacaaagtt gggagtgctg aacatagcag cagtgaacat 480aggagttctc agagatatat
agaatctaac gtggttttga tatcaatagc tccgagaatg 540atggctctag caatggtagt
gatgtattcg tctttcttac cacagggaga gcaatctcgg 600aggaaaataa ggcgagaaag
atcaccaagc accggtgaaa gaccttcatc tcaaaccacg 660cctcctgtta gaggtgaaaa
tgagaaagcc gatgtgacca tggggactcc cgtaatgccc 720acaacaatgg gtttccaaaa
ctctgctggc atgaacggtg tcccacagcc atggaatgaa 780aaagaggtta aacgagagaa
gagaaaacag tcaaaccgag aatctgctag aaggtcgaga 840ctgaggaagc aggctgaaac
tgaacaacta tctgtcaaag ttgacgcact tgtagctgag 900aacatgactc tgaggtccaa
actaggccag ctaaaaaatg agtctgagaa actgcggctg 960gagaacgaag ctttattgca
tcaactgaaa gcgcaagcaa ctgggaaaac agagaacctt 1020atctctcgag ttgataagaa
caactctgta tcaggtagca aaaatgtgca gcatcaactg 1080ttaaatgcaa gtccaataac
tgatcctgtc gcggccagct ga 11228373PRTArabidopsis
lyrata 8Met Gly Ser Asn Glu Glu Gly Lys Pro Thr Asn Ser Asp Lys Pro Ser 1
5 10 15 Gln Ala Ala
Ala Pro Glu Gln Ser Asn Val His Val Tyr His His Asp 20
25 30 Trp Ala Ala Met Gln Ala Tyr Tyr
Gly Pro Arg Val Gly Ile Pro Gln 35 40
45 Tyr Tyr Asn Ser Asn Val Ala Pro Gly His Ala Pro Pro
Pro Tyr Met 50 55 60
Trp Ala Ser Pro Ser Pro Met Met Ala Pro Tyr Gly Ala Pro Tyr Pro 65
70 75 80 Pro Phe Cys Pro
Pro Gly Gly Val Tyr Ala His Pro Gly Val Gln Met 85
90 95 Gly Ser Gln Leu Gln Gly Pro Val Ser
Gln Ala Thr Pro Gly Val Thr 100 105
110 Thr Pro Leu Thr Met Asp Ala Pro Thr Asn Ser Ala Gly Asn
Ser Asp 115 120 125
His Gly Phe Met Lys Lys Leu Lys Glu Phe Asp Gly Leu Ala Met Ser 130
135 140 Ile Ser Asn Asn Lys
Val Gly Ser Ala Glu His Ser Ser Ser Glu His 145 150
155 160 Arg Ser Ser Gln Arg Tyr Ile Glu Ser Asn
Val Val Leu Ile Ser Ile 165 170
175 Ala Pro Arg Met Met Ala Leu Ala Met Val Val Met Tyr Ser Ser
Phe 180 185 190 Leu
Pro Gln Gly Glu Gln Ser Arg Arg Lys Ile Arg Arg Glu Arg Ser 195
200 205 Pro Ser Thr Gly Glu Arg
Pro Ser Ser Gln Thr Thr Pro Pro Val Arg 210 215
220 Gly Glu Asn Glu Lys Ala Asp Val Thr Met Gly
Thr Pro Val Met Pro 225 230 235
240 Thr Thr Met Gly Phe Gln Asn Ser Ala Gly Met Asn Gly Val Pro Gln
245 250 255 Pro Trp
Asn Glu Lys Glu Val Lys Arg Glu Lys Arg Lys Gln Ser Asn 260
265 270 Arg Glu Ser Ala Arg Arg Ser
Arg Leu Arg Lys Gln Ala Glu Thr Glu 275 280
285 Gln Leu Ser Val Lys Val Asp Ala Leu Val Ala Glu
Asn Met Thr Leu 290 295 300
Arg Ser Lys Leu Gly Gln Leu Lys Asn Glu Ser Glu Lys Leu Arg Leu 305
310 315 320 Glu Asn Glu
Ala Leu Leu His Gln Leu Lys Ala Gln Ala Thr Gly Lys 325
330 335 Thr Glu Asn Leu Ile Ser Arg Val
Asp Lys Asn Asn Ser Val Ser Gly 340 345
350 Ser Lys Asn Val Gln His Gln Leu Leu Asn Ala Ser Pro
Ile Thr Asp 355 360 365
Pro Val Ala Ala Ser 370 9939DNAArabidopsis lyrata
9atgggaacga gcgaagacaa gatgccattt aagcctacca aaccaacatc ttcggctcag
60gaagttcctc ccacaccgta tccagattgg tcaaattcaa tgcaggctta ttatggcgga
120ggaggtacgc caaatccttt tttcccatcc ccagttggat ctcctagtcc ccacgcttat
180atgtggggcg ctcaacacca tatgatgccg ccttatggga ccccagtacc gtacccagca
240atgtatcccc cgggagcagt ctattctcat cctagcatgc ccatgcctcc taattctggt
300ccaaccaaca aggagactgt gaaggaccaa gcttctggca agaagtcaaa ggggagctcg
360aaaaaaaagg gtgaaggagg tgacaaagcg ctctctggtt cagggaacga tggtgtctct
420catagtgatg acagtgtcac agcgggttca tctgatgaaa atgatgacaa tgccaatcaa
480caggaacaag gttcagttag aaagccgagc tttggacaga tgcttgcgga cgcaagttct
540caaagtacta ctggtgaaat ccaaggttcg gtgcccatga agccggtagc cccggggact
600aatctgaata tcgggatgga cttatggtct tcccaagctg gtgtacctgt gaaggatgaa
660cgagagctca agaggcagaa gaggaaacag tctaaccgtg aatctgctag gcggtctaga
720ttgcggaagc aggcggaatg cgaacaactt caacagagag tagagagttt gtcgaacgag
780aatcaaagcc tgagagatga gctacaaaga ctctcaagcg aatgtgaaaa gctcaagtct
840gagaacaact caatccagga tgagttgcag agagtgcttg gagcagaggc tgtagctaat
900ctagagcaga atgctgctga cggtgaagga aaaaattaa
93910312PRTArabidopsis lyrata 10Met Gly Thr Ser Glu Asp Lys Met Pro Phe
Lys Pro Thr Lys Pro Thr 1 5 10
15 Ser Ser Ala Gln Glu Val Pro Pro Thr Pro Tyr Pro Asp Trp Ser
Asn 20 25 30 Ser
Met Gln Ala Tyr Tyr Gly Gly Gly Gly Thr Pro Asn Pro Phe Phe 35
40 45 Pro Ser Pro Val Gly Ser
Pro Ser Pro His Ala Tyr Met Trp Gly Ala 50 55
60 Gln His His Met Met Pro Pro Tyr Gly Thr Pro
Val Pro Tyr Pro Ala 65 70 75
80 Met Tyr Pro Pro Gly Ala Val Tyr Ser His Pro Ser Met Pro Met Pro
85 90 95 Pro Asn
Ser Gly Pro Thr Asn Lys Glu Thr Val Lys Asp Gln Ala Ser 100
105 110 Gly Lys Lys Ser Lys Gly Ser
Ser Lys Lys Lys Gly Glu Gly Gly Asp 115 120
125 Lys Ala Leu Ser Gly Ser Gly Asn Asp Gly Val Ser
His Ser Asp Asp 130 135 140
Ser Val Thr Ala Gly Ser Ser Asp Glu Asn Asp Asp Asn Ala Asn Gln 145
150 155 160 Gln Glu Gln
Gly Ser Val Arg Lys Pro Ser Phe Gly Gln Met Leu Ala 165
170 175 Asp Ala Ser Ser Gln Ser Thr Thr
Gly Glu Ile Gln Gly Ser Val Pro 180 185
190 Met Lys Pro Val Ala Pro Gly Thr Asn Leu Asn Ile Gly
Met Asp Leu 195 200 205
Trp Ser Ser Gln Ala Gly Val Pro Val Lys Asp Glu Arg Glu Leu Lys 210
215 220 Arg Gln Lys Arg
Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg 225 230
235 240 Leu Arg Lys Gln Ala Glu Cys Glu Gln
Leu Gln Gln Arg Val Glu Ser 245 250
255 Leu Ser Asn Glu Asn Gln Ser Leu Arg Asp Glu Leu Gln Arg
Leu Ser 260 265 270
Ser Glu Cys Glu Lys Leu Lys Ser Glu Asn Asn Ser Ile Gln Asp Glu
275 280 285 Leu Gln Arg Val
Leu Gly Ala Glu Ala Val Ala Asn Leu Glu Gln Asn 290
295 300 Ala Ala Asp Gly Glu Gly Lys Asn
305 310 111080DNAArabidopsis thaliana
11atgggaaata gcagcgagga accaaagcct cctaccaaat cagataaacc atcttcaccc
60ccggtggatc aaacaaatgt tcatgtctac cctgattggg cagctatgca ggcatattat
120ggtccaagag tagcaatgcc tccttattac aattcagcta tggctgcatc tggtcatcct
180cctcctcctt acatgtggaa tcctcagcat atgatgtcac catatggagc accctatgct
240gctgtttatc ctcatggagg aggagtttac gctcatcccg gtattcccat gggatcactg
300cctcaaggtc aaaaggatcc acctttaaca actccgggga cgcttttgag catcgacact
360cctactaaat ctacagggaa cacagacaat ggattgatga agaagctgaa agagtttgat
420gggcttgcta tgtctctagg aaatgggaat cctgaaaatg gtgcagatga acataaacga
480tcacggaaca gctcagaaac tgatggttct actgatggaa gtgatgggaa tacaactggg
540gcagatgaac cgaaacttaa aagaagtcga gagggaactc caacaaaaga tgggaaacaa
600ttggttcaag ctagctcatt tcattctgtt tctccgtcaa gtggtgatac cggcgtaaaa
660ctcattcaag gatctggagc tatactctct cctggtaacg agagagaact gaaacgggag
720cgaaggaaac agtctaatag agaatctgct agaaggtcaa gattaaggaa acaggccgag
780acagaagaac ttgctaggaa agtggaagcc ttgacagccg aaaacatggc attaagatct
840gaactaaacc aacttaatga gaaatctgat aaactaagag gagcaaatgc aaccttgttg
900gacaaactga aatgctcgga acccgaaaag agagtccccg caaatatgtt gtctagagtt
960aagaactcag gagctggaga taagaacaag aaccaaggag acaatgattc taactctaca
1020agcaaattgc atcaactgct cgatacgaag cctcgagcta aagcagtagc tgcaggctga
108012359PRTArabidopsis thaliana 12Met Gly Asn Ser Ser Glu Glu Pro Lys
Pro Pro Thr Lys Ser Asp Lys 1 5 10
15 Pro Ser Ser Pro Pro Val Asp Gln Thr Asn Val His Val Tyr
Pro Asp 20 25 30
Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Met Pro Pro
35 40 45 Tyr Tyr Asn Ser
Ala Met Ala Ala Ser Gly His Pro Pro Pro Pro Tyr 50
55 60 Met Trp Asn Pro Gln His Met Met
Ser Pro Tyr Gly Ala Pro Tyr Ala 65 70
75 80 Ala Val Tyr Pro His Gly Gly Gly Val Tyr Ala His
Pro Gly Ile Pro 85 90
95 Met Gly Ser Leu Pro Gln Gly Gln Lys Asp Pro Pro Leu Thr Thr Pro
100 105 110 Gly Thr Leu
Leu Ser Ile Asp Thr Pro Thr Lys Ser Thr Gly Asn Thr 115
120 125 Asp Asn Gly Leu Met Lys Lys Leu
Lys Glu Phe Asp Gly Leu Ala Met 130 135
140 Ser Leu Gly Asn Gly Asn Pro Glu Asn Gly Ala Asp Glu
His Lys Arg 145 150 155
160 Ser Arg Asn Ser Ser Glu Thr Asp Gly Ser Thr Asp Gly Ser Asp Gly
165 170 175 Asn Thr Thr Gly
Ala Asp Glu Pro Lys Leu Lys Arg Ser Arg Glu Gly 180
185 190 Thr Pro Thr Lys Asp Gly Lys Gln Leu
Val Gln Ala Ser Ser Phe His 195 200
205 Ser Val Ser Pro Ser Ser Gly Asp Thr Gly Val Lys Leu Ile
Gln Gly 210 215 220
Ser Gly Ala Ile Leu Ser Pro Gly Asn Glu Arg Glu Leu Lys Arg Glu 225
230 235 240 Arg Arg Lys Gln Ser
Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 245
250 255 Lys Gln Ala Glu Thr Glu Glu Leu Ala Arg
Lys Val Glu Ala Leu Thr 260 265
270 Ala Glu Asn Met Ala Leu Arg Ser Glu Leu Asn Gln Leu Asn Glu
Lys 275 280 285 Ser
Asp Lys Leu Arg Gly Ala Asn Ala Thr Leu Leu Asp Lys Leu Lys 290
295 300 Cys Ser Glu Pro Glu Lys
Arg Val Pro Ala Asn Met Leu Ser Arg Val 305 310
315 320 Lys Asn Ser Gly Ala Gly Asp Lys Asn Lys Asn
Gln Gly Asp Asn Asp 325 330
335 Ser Asn Ser Thr Ser Lys Leu His Gln Leu Leu Asp Thr Lys Pro Arg
340 345 350 Ala Lys
Ala Val Ala Ala Gly 355 131083DNAArabidopsis
thaliana 13atgggtagca acgaagaagg aaaccccact aacaactctg ataagccatc
gcaagctgct 60gctcctgagc agagtaatgt tcatgtgtat catcatgact gggctgctat
gcaggcatat 120tatgggccta gagttggtat acctcaatat tacaactcaa atttggcgcc
tggtcatgct 180ccaccgcctt atatgtgggc gtctccatcg ccaatgatgg ctccttatgg
agcaccatat 240ccaccatttt gccctcctgg tggagtttat gctcatcctg gtgttcaaat
gggctcacaa 300ccacaaggtc ctgtttctca atcagcatct ggagttacaa cccctttgac
cattgatgca 360ccagctaatt cagctggaaa ctcagatcat gggttcatga aaaagctgaa
agagttcgat 420ggacttgcaa tgtcaataag caataacaaa gttgggagtg ctgaacatag
cagcagtgaa 480cataggagtt ctcagagctc cgagaatgat ggctctagca atggtagtga
tggtaataca 540actgggggag aacaatctag gaggaaaaga aggcaacaaa gatcaccaag
cactggtgaa 600agaccctcat ctcaaaacag tctgcctctt agaggtgaaa atgagaaacc
cgatgtgact 660atggggactc ctgttatgcc cacagcaatg agtttccaaa actctgctgg
catgaacggt 720gtgccacagc catggaatga aaaagaggtt aaacgagaga agagaaaaca
gtcaaaccga 780gaatctgcta ggaggtcaag actgaggaag caggctgaaa cagaacaact
atctgtcaaa 840gttgacgcat tagtagctga gaacatgtct ctgaggtcta aactaggcca
gctaaacaat 900gagtctgaga aactacggct ggagaacgaa gctatattgg atcaactgaa
agcgcaagca 960acagggaaaa cagagaacct gatctctcga gttgataaga acaactctgt
atcaggtagc 1020aaaactgtgc agcatcaact gttaaatgca agtccgataa ccgatcctgt
cgcggctagc 1080tga
108314360PRTArabidopsis thaliana 14Met Gly Ser Asn Glu Glu Gly
Asn Pro Thr Asn Asn Ser Asp Lys Pro 1 5
10 15 Ser Gln Ala Ala Ala Pro Glu Gln Ser Asn Val
His Val Tyr His His 20 25
30 Asp Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Gly Ile
Pro 35 40 45 Gln
Tyr Tyr Asn Ser Asn Leu Ala Pro Gly His Ala Pro Pro Pro Tyr 50
55 60 Met Trp Ala Ser Pro Ser
Pro Met Met Ala Pro Tyr Gly Ala Pro Tyr 65 70
75 80 Pro Pro Phe Cys Pro Pro Gly Gly Val Tyr Ala
His Pro Gly Val Gln 85 90
95 Met Gly Ser Gln Pro Gln Gly Pro Val Ser Gln Ser Ala Ser Gly Val
100 105 110 Thr Thr
Pro Leu Thr Ile Asp Ala Pro Ala Asn Ser Ala Gly Asn Ser 115
120 125 Asp His Gly Phe Met Lys Lys
Leu Lys Glu Phe Asp Gly Leu Ala Met 130 135
140 Ser Ile Ser Asn Asn Lys Val Gly Ser Ala Glu His
Ser Ser Ser Glu 145 150 155
160 His Arg Ser Ser Gln Ser Ser Glu Asn Asp Gly Ser Ser Asn Gly Ser
165 170 175 Asp Gly Asn
Thr Thr Gly Gly Glu Gln Ser Arg Arg Lys Arg Arg Gln 180
185 190 Gln Arg Ser Pro Ser Thr Gly Glu
Arg Pro Ser Ser Gln Asn Ser Leu 195 200
205 Pro Leu Arg Gly Glu Asn Glu Lys Pro Asp Val Thr Met
Gly Thr Pro 210 215 220
Val Met Pro Thr Ala Met Ser Phe Gln Asn Ser Ala Gly Met Asn Gly 225
230 235 240 Val Pro Gln Pro
Trp Asn Glu Lys Glu Val Lys Arg Glu Lys Arg Lys 245
250 255 Gln Ser Asn Arg Glu Ser Ala Arg Arg
Ser Arg Leu Arg Lys Gln Ala 260 265
270 Glu Thr Glu Gln Leu Ser Val Lys Val Asp Ala Leu Val Ala
Glu Asn 275 280 285
Met Ser Leu Arg Ser Lys Leu Gly Gln Leu Asn Asn Glu Ser Glu Lys 290
295 300 Leu Arg Leu Glu Asn
Glu Ala Ile Leu Asp Gln Leu Lys Ala Gln Ala 305 310
315 320 Thr Gly Lys Thr Glu Asn Leu Ile Ser Arg
Val Asp Lys Asn Asn Ser 325 330
335 Val Ser Gly Ser Lys Thr Val Gln His Gln Leu Leu Asn Ala Ser
Pro 340 345 350 Ile
Thr Asp Pro Val Ala Ala Ser 355 360
15942DNAArabidopsis thaliana 15atgggaacga gcgaagacaa gatgccattt
aagactacca aaccaacatc ttcggctcag 60gaagttcctc ccacaccgta tccagattgg
caaaattcaa tgcaggctta ttatggcgga 120ggaggtactc caaatccttt tttcccatcc
ccagttggat ctcctagtcc tcacccctat 180atgtggggtg ctcaacacca tatgatgccg
ccttatggca ccccagttcc gtacccagca 240atgtatcccc cgggggcagt ctatgctcat
cctagcatgc ccatgcctcc taattctggt 300cctaccaaca aggagcctgc gaaggaccaa
gcttctggca agaagtcaaa ggggaactcg 360aaaaaaaagg ctgaaggagg tgataaagcg
ctctctggtt cagggaacga tggtgcctct 420catagtgatg aaagtgtcac agcgggttca
tctgatgaaa atgatgagaa tgccaatcaa 480cagggttcaa ttcgaaagcc aagctttgga
cagatgcttg ctgacgcaag ttctcaaagt 540acgactggtg aaatccaagg ttcggtgccc
atgaagccgg tagccccggg gactaatctg 600aatatcggga tggacttatg gtcttcccaa
gctggtgtac cagtgaagga tgaacgagag 660ctcaagcggc agaagaggaa acaatctaac
cgtgaatccg ctaggcggtc tagattgcgg 720aagcaggccg aatgcgaaca acttcaacaa
agagtagaga gtttgtcgaa cgagaatcaa 780agcctgagag atgagctaca gagactctca
agcgaatgtg ataagctcaa gtctgagaac 840aactcaatcc aggatgagtt gcagagagta
cttggagcag aggctgtagc taatctagaa 900cagaatgctg ctgggtcgaa agatggtgaa
ggaacaaatt aa 94216313PRTArabidopsis thaliana 16Met
Gly Thr Ser Glu Asp Lys Met Pro Phe Lys Thr Thr Lys Pro Thr 1
5 10 15 Ser Ser Ala Gln Glu Val
Pro Pro Thr Pro Tyr Pro Asp Trp Gln Asn 20
25 30 Ser Met Gln Ala Tyr Tyr Gly Gly Gly Gly
Thr Pro Asn Pro Phe Phe 35 40
45 Pro Ser Pro Val Gly Ser Pro Ser Pro His Pro Tyr Met Trp
Gly Ala 50 55 60
Gln His His Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala 65
70 75 80 Met Tyr Pro Pro Gly
Ala Val Tyr Ala His Pro Ser Met Pro Met Pro 85
90 95 Pro Asn Ser Gly Pro Thr Asn Lys Glu Pro
Ala Lys Asp Gln Ala Ser 100 105
110 Gly Lys Lys Ser Lys Gly Asn Ser Lys Lys Lys Ala Glu Gly Gly
Asp 115 120 125 Lys
Ala Leu Ser Gly Ser Gly Asn Asp Gly Ala Ser His Ser Asp Glu 130
135 140 Ser Val Thr Ala Gly Ser
Ser Asp Glu Asn Asp Glu Asn Ala Asn Gln 145 150
155 160 Gln Gly Ser Ile Arg Lys Pro Ser Phe Gly Gln
Met Leu Ala Asp Ala 165 170
175 Ser Ser Gln Ser Thr Thr Gly Glu Ile Gln Gly Ser Val Pro Met Lys
180 185 190 Pro Val
Ala Pro Gly Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Ser 195
200 205 Ser Gln Ala Gly Val Pro Val
Lys Asp Glu Arg Glu Leu Lys Arg Gln 210 215
220 Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg
Ser Arg Leu Arg 225 230 235
240 Lys Gln Ala Glu Cys Glu Gln Leu Gln Gln Arg Val Glu Ser Leu Ser
245 250 255 Asn Glu Asn
Gln Ser Leu Arg Asp Glu Leu Gln Arg Leu Ser Ser Glu 260
265 270 Cys Asp Lys Leu Lys Ser Glu Asn
Asn Ser Ile Gln Asp Glu Leu Gln 275 280
285 Arg Val Leu Gly Ala Glu Ala Val Ala Asn Leu Glu Gln
Asn Ala Ala 290 295 300
Gly Ser Lys Asp Gly Glu Gly Thr Asn 305 310
171164DNAAquilegia sp. 17atgggttctg ctgaagagag tacacctgcc aaaccttcca
gaccaagtgc ttcatctcag 60gaaacaccac caacaccttt atatcccgat tggtcaactc
caatgcaggc atactacggt 120gctggagcta cccaacctcc atttttccct acaaatgttg
ctagtccacc tccgtatcca 180tatatgtggg gaggccagat ggtctcacca tatggtaccc
caattccata ccctgctatt 240tacccccatg gagggcttta tcctcatcct aacttggcta
cggctcaggg tgcagcaatg 300ccaactacgc agacggagga aaataactct ccagtgaaaa
aaatcaagag ctcgggaaac 360attggtgtgg ttggtggcaa attgaaagaa agtgggaagg
cagcttctgg ctctcgaaat 420gacggtgtct cacggagtgc tgaaagtgga agtgagggct
catcagatgc gagtgacgag 480ttcaatcata aagacgattc ggagaataag aacaaaagct
ttgaccagat gcttgcagat 540ggagcaaatg cacagaacac cagtgctcac cacagtagta
cagctgttgg gtcatcttta 600aatggaaatg gagaaccatc tgcaaatttt ccagttcctt
tgccagggaa tcctgtggga 660gatattgctg caaccaattt gaatataggg atggacctct
ggaatgcatc tcctgttgga 720tcggtgcctt tgagggcaag atcgaatgct tcaggtgtcg
tgccagcagt tgctcctgtc 780aagagagatg ggcatgaagg cattgtgcct gaacacctat
ggggtcaaga tgaacgtgaa 840ctgaaaagac agagaaggaa gctatcaaat agggagtcag
ctcggagatc aagactacgc 900aaacaggctg agtgtgaaga gctacaagtg aaggtggata
cattgaccga tgagaacgat 960aatctccgta aagagctgga gaggctcgcc gaggaacgcc
aaaagctcac taatgaaaat 1020gcatccttag agagtgaact gagtcagttg tatggagaag
aagcaatttc gaccctcaag 1080ggtaagaatg ccaacatgtc tgtgcagtct gttaatggtt
ttgaacaaga cactttgatg 1140ggaaacaact ccttatctga gtag
116418387PRTAquilegia sp. 18Met Gly Ser Ala Glu Glu
Ser Thr Pro Ala Lys Pro Ser Arg Pro Ser 1 5
10 15 Ala Ser Ser Gln Glu Thr Pro Pro Thr Pro Leu
Tyr Pro Asp Trp Ser 20 25
30 Thr Pro Met Gln Ala Tyr Tyr Gly Ala Gly Ala Thr Gln Pro Pro
Phe 35 40 45 Phe
Pro Thr Asn Val Ala Ser Pro Pro Pro Tyr Pro Tyr Met Trp Gly 50
55 60 Gly Gln Met Val Ser Pro
Tyr Gly Thr Pro Ile Pro Tyr Pro Ala Ile 65 70
75 80 Tyr Pro His Gly Gly Leu Tyr Pro His Pro Asn
Leu Ala Thr Ala Gln 85 90
95 Gly Ala Ala Met Pro Thr Thr Gln Thr Glu Glu Asn Asn Ser Pro Val
100 105 110 Lys Lys
Ile Lys Ser Ser Gly Asn Ile Gly Val Val Gly Gly Lys Leu 115
120 125 Lys Glu Ser Gly Lys Ala Ala
Ser Gly Ser Arg Asn Asp Gly Val Ser 130 135
140 Arg Ser Ala Glu Ser Gly Ser Glu Gly Ser Ser Asp
Ala Ser Asp Glu 145 150 155
160 Phe Asn His Lys Asp Asp Ser Glu Asn Lys Asn Lys Ser Phe Asp Gln
165 170 175 Met Leu Ala
Asp Gly Ala Asn Ala Gln Asn Thr Ser Ala His His Ser 180
185 190 Ser Thr Ala Val Gly Ser Ser Leu
Asn Gly Asn Gly Glu Pro Ser Ala 195 200
205 Asn Phe Pro Val Pro Leu Pro Gly Asn Pro Val Gly Asp
Ile Ala Ala 210 215 220
Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Asn Ala Ser Pro Val Gly 225
230 235 240 Ser Val Pro Leu
Arg Ala Arg Ser Asn Ala Ser Gly Val Val Pro Ala 245
250 255 Val Ala Pro Val Lys Arg Asp Gly His
Glu Gly Ile Val Pro Glu His 260 265
270 Leu Trp Gly Gln Asp Glu Arg Glu Leu Lys Arg Gln Arg Arg
Lys Leu 275 280 285
Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu 290
295 300 Cys Glu Glu Leu Gln
Val Lys Val Asp Thr Leu Thr Asp Glu Asn Asp 305 310
315 320 Asn Leu Arg Lys Glu Leu Glu Arg Leu Ala
Glu Glu Arg Gln Lys Leu 325 330
335 Thr Asn Glu Asn Ala Ser Leu Glu Ser Glu Leu Ser Gln Leu Tyr
Gly 340 345 350 Glu
Glu Ala Ile Ser Thr Leu Lys Gly Lys Asn Ala Asn Met Ser Val 355
360 365 Gln Ser Val Asn Gly Phe
Glu Gln Asp Thr Leu Met Gly Asn Asn Ser 370 375
380 Leu Ser Glu 385 191104DNABrassica
napus 19atgggaagca acgaagaagg aaagaccaca cagtctgaca agccagcaca agtacaagct
60cctcctcctc ctcctgagca aagcaatgtt catgtgtatc atcatgattg ggctgctatg
120caggcgtact atggaccaag agtagccata actcctcaat attacaactc aaatggtcat
180gctcctccac ctcctcctta tatctggggc tctccttcgc caatgatggc tccttatgga
240acaccatacc caccgttttg tcctcctggt ggagtctatg ctcatcctgc tcttcaaatg
300ggctcacaac cacaagggcc tgcttctcaa gcaacacctg ttgttgcaac tccgttgaac
360ttggaagctc atccagctaa ctcatctgga aacacggatc aggggttcat gaaaaagttg
420aaagaatttg atggacttgc aatgtctata agcaataaca aatctgggag tggtgaacat
480agcagtgaac ctaagaattc tcagagttct gagaatgatg attccagcaa tggtagtgat
540gggaatacaa ctgggggaga acagtctagg aagaaaagaa gccgggaagg atcaccaaac
600aacgatggga agccttcatc tcaaattgtt cctcttctaa gagatgaaag tgagaaacag
660gcagtgacta tggggactcc tgttatgccc acagttttgg atttcccaca gccattccct
720ggtgcgcctc atgaagtctg gaatgaaaaa gaggttaaac gagagaagag aaaacagtca
780aacagagaat ctgctagaag gtcaagactg aggaagcagg ctgaaactga agaactgtcc
840gtcaaggttg atgcactagt tgctgagaac atgactctga ggtcaaaact aggccaacta
900aacgatgagt ctgagaaact acggctggag aaccaagctt tattggatca actgaaagcg
960caagcaactg ggaaaacaga gaacctaata tctggagttg ataagaacaa cagctctgta
1020tcaggtacta gtagtagtag taagaatgcg gaacagcaac tcttaaacgt aagtctaaga
1080accgattctg tcgcggctag ctga
110420367PRTBrassica napus 20Met Gly Ser Asn Glu Glu Gly Lys Thr Thr Gln
Ser Asp Lys Pro Ala 1 5 10
15 Gln Val Gln Ala Pro Pro Pro Pro Pro Glu Gln Ser Asn Val His Val
20 25 30 Tyr His
His Asp Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val 35
40 45 Ala Ile Thr Pro Gln Tyr Tyr
Asn Ser Asn Gly His Ala Pro Pro Pro 50 55
60 Pro Pro Tyr Ile Trp Gly Ser Pro Ser Pro Met Met
Ala Pro Tyr Gly 65 70 75
80 Thr Pro Tyr Pro Pro Phe Cys Pro Pro Gly Gly Val Tyr Ala His Pro
85 90 95 Ala Leu Gln
Met Gly Ser Gln Pro Gln Gly Pro Ala Ser Gln Ala Thr 100
105 110 Pro Val Val Ala Thr Pro Leu Asn
Leu Glu Ala His Pro Ala Asn Ser 115 120
125 Ser Gly Asn Thr Asp Gln Gly Phe Met Lys Lys Leu Lys
Glu Phe Asp 130 135 140
Gly Leu Ala Met Ser Ile Ser Asn Asn Lys Ser Gly Ser Gly Glu His 145
150 155 160 Ser Ser Glu Pro
Lys Asn Ser Gln Ser Ser Glu Asn Asp Asp Ser Ser 165
170 175 Asn Gly Ser Asp Gly Asn Thr Thr Gly
Gly Glu Gln Ser Arg Lys Lys 180 185
190 Arg Ser Arg Glu Gly Ser Pro Asn Asn Asp Gly Lys Pro Ser
Ser Gln 195 200 205
Ile Val Pro Leu Leu Arg Asp Glu Ser Glu Lys Gln Ala Val Thr Met 210
215 220 Gly Thr Pro Val Met
Pro Thr Val Leu Asp Phe Pro Gln Pro Phe Pro 225 230
235 240 Gly Ala Pro His Glu Val Trp Asn Glu Lys
Glu Val Lys Arg Glu Lys 245 250
255 Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg
Lys 260 265 270 Gln
Ala Glu Thr Glu Glu Leu Ser Val Lys Val Asp Ala Leu Val Ala 275
280 285 Glu Asn Met Thr Leu Arg
Ser Lys Leu Gly Gln Leu Asn Asp Glu Ser 290 295
300 Glu Lys Leu Arg Leu Glu Asn Gln Ala Leu Leu
Asp Gln Leu Lys Ala 305 310 315
320 Gln Ala Thr Gly Lys Thr Glu Asn Leu Ile Ser Gly Val Asp Lys Asn
325 330 335 Asn Ser
Ser Val Ser Gly Thr Ser Ser Ser Ser Lys Asn Ala Glu Gln 340
345 350 Gln Leu Leu Asn Val Ser Leu
Arg Thr Asp Ser Val Ala Ala Ser 355 360
365
SEQ ID NO: 21; length: 1125; type: DNA; organism: Brassica napus; sequence 21:
atgggaaaaa gcgaggaacc
aaaggttacc aaatcagaca acaaaccatc ttcaccacct 60gcggatcaaa caaatgttca
tgtctaccct gattgggccg ctatgcaggc ttattatggg 120ccaagagtag caatacctcc
ttattacaac tcagctatgg ctgctgcatc tggtcatcct 180cctcctcctt acatgtggaa
tcctcagcat atgatgtcac catatggaac accgtatgca 240gcggtttacc ctcatggagg
aggagtctac gctcatcctg gattccccat gcctcaaagt 300caaaagggtg ctgctttatc
aactccgggg acgccattga acatagacac tcctagtaaa 360tcaacaggaa acacagagaa
tgggctgatg aagaagctga aagagtttga tggacttgct 420atgtctctag gaaatggtaa
taatggtgat gaaggtaaac gctcacggaa cagctcagaa 480acggatggtt ctagtgatgg
aagtgacggg aataccactg gggctgatga accgaaactt 540aagagaaggc gagaaggaac
tccaaccaaa gatgaggaga aacatttggt tcagtcaagc 600tcatttcggt ctgtttctca
gtcaagtggt gataacgttg taaagcatag tgttcaagga 660ggaggtggag ctatagtctc
tgctgctggt gtaagtgcaa attcaaaccc aaccttcatg 720tcacaatctt tagccatggt
tcctcctgaa acttggcttc agaacgagag agagctgaaa 780cgggagagaa ggaaacagtc
taatagagaa tctgcaagaa ggtcaagatt aaggaaacag 840gctgagactg aagaactggc
taggaaagtt gaagccttga cagcagaaaa catggcgtta 900agatctgagc taaaccaact
taatgagaaa tctaataatc taagaggagc taatgcaacc 960ttactggaca agctgaaaag
ttcagaacct gaaaagagag ttaagagctc aggaaatgga 1020gacgacaaga acaagaagca
aggagacaat gagactaact ctaccagcaa actgcatcaa 1080ctgcttgata ccaagcctcg
agctgacggt gtagctgctc gctaa 1125
SEQ ID NO: 22; length: 374; type: PRT; organism: Brassica napus; sequence 22:
Met Gly Lys Ser Glu Glu Pro Lys Val Thr Lys Ser Asp Asn Lys Pro 1
5 10 15 Ser Ser Pro Pro
Ala Asp Gln Thr Asn Val His Val Tyr Pro Asp Trp 20
25 30 Ala Ala Met Gln Ala Tyr Tyr Gly Pro
Arg Val Ala Ile Pro Pro Tyr 35 40
45 Tyr Asn Ser Ala Met Ala Ala Ala Ser Gly His Pro Pro Pro
Pro Tyr 50 55 60
Met Trp Asn Pro Gln His Met Met Ser Pro Tyr Gly Thr Pro Tyr Ala 65
70 75 80 Ala Val Tyr Pro His
Gly Gly Gly Val Tyr Ala His Pro Gly Phe Pro 85
90 95 Met Pro Gln Ser Gln Lys Gly Ala Ala Leu
Ser Thr Pro Gly Thr Pro 100 105
110 Leu Asn Ile Asp Thr Pro Ser Lys Ser Thr Gly Asn Thr Glu Asn
Gly 115 120 125 Leu
Met Lys Lys Leu Lys Glu Phe Asp Gly Leu Ala Met Ser Leu Gly 130
135 140 Asn Gly Asn Asn Gly Asp
Glu Gly Lys Arg Ser Arg Asn Ser Ser Glu 145 150
155 160 Thr Asp Gly Ser Ser Asp Gly Ser Asp Gly Asn
Thr Thr Gly Ala Asp 165 170
175 Glu Pro Lys Leu Lys Arg Arg Arg Glu Gly Thr Pro Thr Lys Asp Glu
180 185 190 Glu Lys
His Leu Val Gln Ser Ser Ser Phe Arg Ser Val Ser Gln Ser 195
200 205 Ser Gly Asp Asn Val Val Lys
His Ser Val Gln Gly Gly Gly Gly Ala 210 215
220 Ile Val Ser Ala Ala Gly Val Ser Ala Asn Ser Asn
Pro Thr Phe Met 225 230 235
240 Ser Gln Ser Leu Ala Met Val Pro Pro Glu Thr Trp Leu Gln Asn Glu
245 250 255 Arg Glu Leu
Lys Arg Glu Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala 260
265 270 Arg Arg Ser Arg Leu Arg Lys Gln
Ala Glu Thr Glu Glu Leu Ala Arg 275 280
285 Lys Val Glu Ala Leu Thr Ala Glu Asn Met Ala Leu Arg
Ser Glu Leu 290 295 300
Asn Gln Leu Asn Glu Lys Ser Asn Asn Leu Arg Gly Ala Asn Ala Thr 305
310 315 320 Leu Leu Asp Lys
Leu Lys Ser Ser Glu Pro Glu Lys Arg Val Lys Ser 325
330 335 Ser Gly Asn Gly Asp Asp Lys Asn Lys
Lys Gln Gly Asp Asn Glu Thr 340 345
350 Asn Ser Thr Ser Lys Leu His Gln Leu Leu Asp Thr Lys Pro
Arg Ala 355 360 365
Asp Gly Val Ala Ala Arg 370
SEQ ID NO: 23; length: 942; type: DNA; organism: Brassica napus; sequence 23:
atgggaacaa gcgaagaaaa gacgccttct aaaccagcat cctcaacaca ggacattccc
60cccacacctt atccagactg gtctaactca atgcaggctt attatggagg aggaggtact
120ccgagtcctt ttttcccatc tccagttgga tctcctagtc ctcaccctta catgtggggt
180gctcaacacc atatgatgcc gccttatggg acccccgttc cgtacccagc catgtatcct
240ccaggggcgg tctacgccca tcctggcatg cccatgcctc cttcttctgc tccaaccaac
300gagaccgtga aggaacaagc ccctggaaag aagtcaaaag ggagcttgaa aagaaagggc
360gaaggaggtg agaaggcgcc ttctggttct gggaacgatg gtgtatctca cagtgatgaa
420agtgtcacag ggggttcatc tgatgaaaac gatgagaatg ctaaccacca ggaacatggt
480tcagttagaa agcctagctt tggacaaatg ctggcggatg caagttctca gagtaatact
540accggtgaga tgatccaagg ttcagttccc atgaagccac tagcccctgg gactaatttg
600aatatgggaa tggacttatg gtcttcccag gctggtgtac ctgtgaagga tgaaagagag
660ctcaagaggc agaaaaggaa acaatctaac cgtgaatccg ccaggcggtc cagactaagg
720aagcaggcgg aatgcgaaca gcttcagcag agagtagaga gtttgactag tgagaatcag
780agcctgagag atgagttgca gagactctct ggagaatgtg agaagctcaa gactcagaac
840agttctattc aggatgagtt ggtaagagtg catggaccag aggccgtggc taatctagaa
900cagaatgctg atgggtctaa agatggcgaa ggaacagatt aa
942
SEQ ID NO: 24; length: 313; type: PRT; organism: Brassica napus; sequence 24:
Met Gly Thr Ser Glu Glu Lys Thr Pro Ser Lys
Pro Ala Ser Ser Thr 1 5 10
15 Gln Asp Ile Pro Pro Thr Pro Tyr Pro Asp Trp Ser Asn Ser Met Gln
20 25 30 Ala Tyr
Tyr Gly Gly Gly Gly Thr Pro Ser Pro Phe Phe Pro Ser Pro 35
40 45 Val Gly Ser Pro Ser Pro His
Pro Tyr Met Trp Gly Ala Gln His His 50 55
60 Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro
Ala Met Tyr Pro 65 70 75
80 Pro Gly Ala Val Tyr Ala His Pro Gly Met Pro Met Pro Pro Ser Ser
85 90 95 Ala Pro Thr
Asn Glu Thr Val Lys Glu Gln Ala Pro Gly Lys Lys Ser 100
105 110 Lys Gly Ser Leu Lys Arg Lys Gly
Glu Gly Gly Glu Lys Ala Pro Ser 115 120
125 Gly Ser Gly Asn Asp Gly Val Ser His Ser Asp Glu Ser
Val Thr Gly 130 135 140
Gly Ser Ser Asp Glu Asn Asp Glu Asn Ala Asn His Gln Glu His Gly 145
150 155 160 Ser Val Arg Lys
Pro Ser Phe Gly Gln Met Leu Ala Asp Ala Ser Ser 165
170 175 Gln Ser Asn Thr Thr Gly Glu Met Ile
Gln Gly Ser Val Pro Met Lys 180 185
190 Pro Leu Ala Pro Gly Thr Asn Leu Asn Met Gly Met Asp Leu
Trp Ser 195 200 205
Ser Gln Ala Gly Val Pro Val Lys Asp Glu Arg Glu Leu Lys Arg Gln 210
215 220 Lys Arg Lys Gln Ser
Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 225 230
235 240 Lys Gln Ala Glu Cys Glu Gln Leu Gln Gln
Arg Val Glu Ser Leu Thr 245 250
255 Ser Glu Asn Gln Ser Leu Arg Asp Glu Leu Gln Arg Leu Ser Gly
Glu 260 265 270 Cys
Glu Lys Leu Lys Thr Gln Asn Ser Ser Ile Gln Asp Glu Leu Val 275
280 285 Arg Val His Gly Pro Glu
Ala Val Ala Asn Leu Glu Gln Asn Ala Asp 290 295
300 Gly Ser Lys Asp Gly Glu Gly Thr Asp 305
310
SEQ ID NO: 25; length: 942; type: DNA; organism: Brassica napus; sequence 25:
atgggaacga
gcgaggaaaa gaccccattt aagccttcca agccagcatc ctcggcacag 60gacactcctc
ccacacctta tgcagactgg tcaaactcaa tgcaggctta ttatggagga 120ggaggtactc
caagtccttt tttcccatcc ccagttggat ctcctagtcc tcacccttat 180atgtggggtg
ctcaacacca tatgatgccg ccttatggga ctccagttcc gtatccagca 240atgtatcccc
cagggactgt ctatgcccat cctggcatgc ccatgcctca ggcttctggt 300ccaaccaaca
cggagaccgt gaaagctcaa gcccctggta agaagccaaa gggtaacttg 360aaaagaaaga
gtggaggaag tgagaaggcg ccttctggtt cagggaacga tgctgtatct 420caaagtgaag
aaagtgtcac agctggttca tctgatgaaa acgatgacaa tgccaaccac 480caggaacaag
gttcagttag aaagccaagc ttcggacaga tgctggctga tgcaagttct 540cagagtaata
ctactggtga gatccaaggt tccatgccaa tgaaaccagt ggcgccaggg 600actaatctga
atatggggat ggacttatgg tcttcccaga ctggtgtagc tgtgaaggat 660gaaagagagc
tcaagaggca gaaaaggaaa caatctaacc gtgaatcagc tagacggtcc 720agattgcgga
agcaggcgga atgcgagcag cttcaacaga gagtagagag tttgacgagt 780gagaatcaaa
gtctgagaga tgagttacag agactctccg gagaatgtga gaagctcaag 840acggagaaca
acactattca ggatgagttg gtaagagtgc atggaccaga ggcagtagct 900aatctagaac
agaatgctga tggatctaaa gatggtgaat ga
942
SEQ ID NO: 26; length: 313; type: PRT; organism: Brassica napus; sequence 26:
Met Gly Thr Ser Glu Glu Lys Thr Pro Phe Lys
Pro Ser Lys Pro Ala 1 5 10
15 Ser Ser Ala Gln Asp Thr Pro Pro Thr Pro Tyr Ala Asp Trp Ser Asn
20 25 30 Ser Met
Gln Ala Tyr Tyr Gly Gly Gly Gly Thr Pro Ser Pro Phe Phe 35
40 45 Pro Ser Pro Val Gly Ser Pro
Ser Pro His Pro Tyr Met Trp Gly Ala 50 55
60 Gln His His Met Met Pro Pro Tyr Gly Thr Pro Val
Pro Tyr Pro Ala 65 70 75
80 Met Tyr Pro Pro Gly Thr Val Tyr Ala His Pro Gly Met Pro Met Pro
85 90 95 Gln Ala Ser
Gly Pro Thr Asn Thr Glu Thr Val Lys Ala Gln Ala Pro 100
105 110 Gly Lys Lys Pro Lys Gly Asn Leu
Lys Arg Lys Ser Gly Gly Ser Glu 115 120
125 Lys Ala Pro Ser Gly Ser Gly Asn Asp Ala Val Ser Gln
Ser Glu Glu 130 135 140
Ser Val Thr Ala Gly Ser Ser Asp Glu Asn Asp Asp Asn Ala Asn His 145
150 155 160 Gln Glu Gln Gly
Ser Val Arg Lys Pro Ser Phe Gly Gln Met Leu Ala 165
170 175 Asp Ala Ser Ser Gln Ser Asn Thr Thr
Gly Glu Ile Gln Gly Ser Met 180 185
190 Pro Met Lys Pro Val Ala Pro Gly Thr Asn Leu Asn Met Gly
Met Asp 195 200 205
Leu Trp Ser Ser Gln Thr Gly Val Ala Val Lys Asp Glu Arg Glu Leu 210
215 220 Lys Arg Gln Lys Arg
Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser 225 230
235 240 Arg Leu Arg Lys Gln Ala Glu Cys Glu Gln
Leu Gln Gln Arg Val Glu 245 250
255 Ser Leu Thr Ser Glu Asn Gln Ser Leu Arg Asp Glu Leu Gln Arg
Leu 260 265 270 Ser
Gly Glu Cys Glu Lys Leu Lys Thr Glu Asn Asn Thr Ile Gln Asp 275
280 285 Glu Leu Val Arg Val His
Gly Pro Glu Ala Val Ala Asn Leu Glu Gln 290 295
300 Asn Ala Asp Gly Ser Lys Asp Gly Glu 305
310
SEQ ID NO: 27; length: 933; type: DNA; organism: Brassica napus; sequence 27:
atgggaacaa
gcgaagaaaa gacgccttcc aaaccagcat cctcaacaca ggacattcct 60cccacaccat
atccagactg gtcaaactca atgcaggctt attatggagg aggaggtact 120ccgaatcctt
ttttcccatc tcctgttgga tctcctagtc ctcaccctta catgtggggt 180gctcaacacc
atatgatgcc gccttatggg accccggttc cgtatccagc catgtatcct 240ccaggggcgg
tctacgctca tcctggcatg cccatgcctc ctgcttctgc tccaaccaac 300aaggagacgg
tgaaggaaca agcccctggc aagaagtcaa aagggagctt gaaaagaaag 360ggtgaaggag
gtgagaaggc gccttctggt tctgggaacg atggtgtatc tcacagtgat 420gaaagtgtca
caggaggttc atctgatgaa aatgatgaga acgctaacca ccaggaacaa 480ggctcagtta
gaaagccgag ctttggacaa atgctagcgg atgcaagttc tcagagtaat 540actactggtg
agatccaagg ttccatgcca atgaaaccag tggcgccagg gactaatctg 600aatatgggga
tggacttatg gtcttcccag actggtgtag ctgtgaagga tgaaagagag 660ctcaagaggc
agaaaaggaa acaatctaac cgtgaatcag ctagacggtc cagattgcgg 720aagcaggcgg
aatgcgagca gcttcaacag agagtagaga gtttgacgag tgagaatcaa 780agtctgagag
atgagttaca gagactctcc ggagaatgtg agaagctcaa gacggagaac 840aacactattc
aggatgagtt ggtaagagtg catggaccag aggcagtagc taatctagaa 900cagaatgctg
atggatctaa agatggtgaa tga
933
SEQ ID NO: 28; length: 310; type: PRT; organism: Brassica napus; sequence 28:
Met Gly Thr Ser Glu Glu Lys Thr Pro Ser Lys
Pro Ala Ser Ser Thr 1 5 10
15 Gln Asp Ile Pro Pro Thr Pro Tyr Pro Asp Trp Ser Asn Ser Met Gln
20 25 30 Ala Tyr
Tyr Gly Gly Gly Gly Thr Pro Asn Pro Phe Phe Pro Ser Pro 35
40 45 Val Gly Ser Pro Ser Pro His
Pro Tyr Met Trp Gly Ala Gln His His 50 55
60 Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro
Ala Met Tyr Pro 65 70 75
80 Pro Gly Ala Val Tyr Ala His Pro Gly Met Pro Met Pro Pro Ala Ser
85 90 95 Ala Pro Thr
Asn Lys Glu Thr Val Lys Glu Gln Ala Pro Gly Lys Lys 100
105 110 Ser Lys Gly Ser Leu Lys Arg Lys
Gly Glu Gly Gly Glu Lys Ala Pro 115 120
125 Ser Gly Ser Gly Asn Asp Gly Val Ser His Ser Asp Glu
Ser Val Thr 130 135 140
Gly Gly Ser Ser Asp Glu Asn Asp Glu Asn Ala Asn His Gln Glu Gln 145
150 155 160 Gly Ser Val Arg
Lys Pro Ser Phe Gly Gln Met Leu Ala Asp Ala Ser 165
170 175 Ser Gln Ser Asn Thr Thr Gly Glu Ile
Gln Gly Ser Met Pro Met Lys 180 185
190 Pro Val Ala Pro Gly Thr Asn Leu Asn Met Gly Met Asp Leu
Trp Ser 195 200 205
Ser Gln Thr Gly Val Ala Val Lys Asp Glu Arg Glu Leu Lys Arg Gln 210
215 220 Lys Arg Lys Gln Ser
Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 225 230
235 240 Lys Gln Ala Glu Cys Glu Gln Leu Gln Gln
Arg Val Glu Ser Leu Thr 245 250
255 Ser Glu Asn Gln Ser Leu Arg Asp Glu Leu Gln Arg Leu Ser Gly
Glu 260 265 270 Cys
Glu Lys Leu Lys Thr Glu Asn Asn Thr Ile Gln Asp Glu Leu Val 275
280 285 Arg Val His Gly Pro Glu
Ala Val Ala Asn Leu Glu Gln Asn Ala Asp 290 295
300 Gly Ser Lys Asp Gly Glu 305
310
SEQ ID NO: 29; length: 972; type: DNA; organism: Coffea canephora; sequence 29:
atgcaggctt actatggtgc tggagctact
ccaccctttt ttgcatcaac tgttgcttct 60cctagtcccc atccctattt atggggcaac
cagcatcctc tgatgccacc ttatggcact 120ccagttcctt atccagcgct atatccaggg
ggagtttatg ctcatcctaa catggcaatg 180gctccaggag cggtacaggc tcctatagag
tcggatgcaa aagctcctga tgggaaagac 240cggaacacaa acaaaaaact caagggtcct
tcaggaaacc ctgggttgat tgctgtcaag 300gctggggaga gtgggaaagc ggcttcaggc
tcaggaaatg atggtgcaac tcaaagtgct 360gaaagtggaa gtgaaggttc atctgatgga
agcgatgaga ataataacca tgaactttct 420gcaacaaaga aaggcagctt tgatcaaatg
cttgcagatg gagccactgc acagaacaat 480acttctgtag caaattttca gaattcagtg
cctgggaatc ctgtagtctc cgtgcctgct 540actaatctaa atattggaat ggacttgtgg
aatccatctt ctggagcttc tggagccatg 600aagatgcgtc caaatcctgg tgtctcacct
gctgttgctc ctggcatgat gactgaccag 660tggattcagg atgagcgaga gttgaaaaga
cagaagcgaa agcaatctaa tcgtgagtct 720gcccggagat caagattacg caaacaggct
gagtgcgaag agttgcagca gagagtcgag 780tcactgaaca gtgaaaatcg tgcacttagg
gatgagctac aaaaggtttc tgaggaatgc 840gagaagctta catccgaaaa taactctatt
aaggaggagt tgactaggtt gtgtggacca 900gaggcagtag ctaaattaga gagcagtagc
atcacccaac ttgagacaaa tggtgatgaa 960gatgaccatt ga
972
SEQ ID NO: 30; length: 323; type: PRT; organism: Coffea canephora; sequence 30:
Met Gln
Ala Tyr Tyr Gly Ala Gly Ala Thr Pro Pro Phe Phe Ala Ser 1 5
10 15 Thr Val Ala Ser Pro Ser Pro
His Pro Tyr Leu Trp Gly Asn Gln His 20 25
30 Pro Leu Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr
Pro Ala Leu Tyr 35 40 45
Pro Gly Gly Val Tyr Ala His Pro Asn Met Ala Met Ala Pro Gly Ala
50 55 60 Val Gln Ala
Pro Ile Glu Ser Asp Ala Lys Ala Pro Asp Gly Lys Asp 65
70 75 80 Arg Asn Thr Asn Lys Lys Leu
Lys Gly Pro Ser Gly Asn Pro Gly Leu 85
90 95 Ile Ala Val Lys Ala Gly Glu Ser Gly Lys Ala
Ala Ser Gly Ser Gly 100 105
110 Asn Asp Gly Ala Thr Gln Ser Ala Glu Ser Gly Ser Glu Gly Ser
Ser 115 120 125 Asp
Gly Ser Asp Glu Asn Asn Asn His Glu Leu Ser Ala Thr Lys Lys 130
135 140 Gly Ser Phe Asp Gln Met
Leu Ala Asp Gly Ala Thr Ala Gln Asn Asn 145 150
155 160 Thr Ser Val Ala Asn Phe Gln Asn Ser Val Pro
Gly Asn Pro Val Val 165 170
175 Ser Val Pro Ala Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Asn Pro
180 185 190 Ser Ser
Gly Ala Ser Gly Ala Met Lys Met Arg Pro Asn Pro Gly Val 195
200 205 Ser Pro Ala Val Ala Pro Gly
Met Met Thr Asp Gln Trp Ile Gln Asp 210 215
220 Glu Arg Glu Leu Lys Arg Gln Lys Arg Lys Gln Ser
Asn Arg Glu Ser 225 230 235
240 Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln
245 250 255 Gln Arg Val
Glu Ser Leu Asn Ser Glu Asn Arg Ala Leu Arg Asp Glu 260
265 270 Leu Gln Lys Val Ser Glu Glu Cys
Glu Lys Leu Thr Ser Glu Asn Asn 275 280
285 Ser Ile Lys Glu Glu Leu Thr Arg Leu Cys Gly Pro Glu
Ala Val Ala 290 295 300
Lys Leu Glu Ser Ser Ser Ile Thr Gln Leu Glu Thr Asn Gly Asp Glu 305
310 315 320 Asp Asp His
SEQ ID NO: 31; length: 540; type: DNA; organism: Coffea canephora; sequence 31:
atgggaagtg tactttctcc gaatatgact tcgactcttg
aacttagaaa cccttctggt 60ggaaatatga agacaagtcc tgttagtgaa gcctggctgc
agaatgagcg agagctgaag 120cgggaaagga ggaaacagtc aaatcgagaa tctgcaagga
gatcaagatt gaggaaacag 180gctgagactg aagaactagc taaaaaagtt caatcactga
ctgccgagaa cctaagtttg 240aagtctgaaa tacacaaatt aactgagagc tctgaacggc
tgaagcttga aaatgctact 300atgatggaga aactgaaaaa cccacaactg gggcagactg
gaaatttgag tttaagcaag 360tttgatgaaa tgcgactaca accagttggc acggcaaatc
tactcgccag ggtaaacaac 420tctggttctg ttgacaggaa tgacgaggag ggtgaggtgt
tcgagaatac aaaatccggg 480gcaaagcttc gccagctgct tgatgcaaac ccccgcacgg
atgccgtggc agctggctga 540
SEQ ID NO: 32; length: 179; type: PRT; organism: Coffea canephora; sequence 32:
Met Gly Ser Val
Leu Ser Pro Asn Met Thr Ser Thr Leu Glu Leu Arg 1 5
10 15 Asn Pro Ser Gly Gly Asn Met Lys Thr
Ser Pro Val Ser Glu Ala Trp 20 25
30 Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln
Ser Asn 35 40 45
Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Thr Glu 50
55 60 Glu Leu Ala Lys Lys
Val Gln Ser Leu Thr Ala Glu Asn Leu Ser Leu 65 70
75 80 Lys Ser Glu Ile His Lys Leu Thr Glu Ser
Ser Glu Arg Leu Lys Leu 85 90
95 Glu Asn Ala Thr Met Met Glu Lys Leu Lys Asn Pro Gln Leu Gly
Gln 100 105 110 Thr
Gly Asn Leu Ser Leu Ser Lys Phe Asp Glu Met Arg Leu Gln Pro 115
120 125 Val Gly Thr Ala Asn Leu
Leu Ala Arg Val Asn Asn Ser Gly Ser Val 130 135
140 Asp Arg Asn Asp Glu Glu Gly Glu Val Phe Glu
Asn Thr Lys Ser Gly 145 150 155
160 Ala Lys Leu Arg Gln Leu Leu Asp Ala Asn Pro Arg Thr Asp Ala Val
165 170 175 Ala Ala
Gly
SEQ ID NO: 33; length: 852; type: DNA; organism: Citrus clementina; sequence 33:
atgcccccca tagggcaccc ccgtttccat
ccacaagttt atttttttcc ggggggggtt 60tgcccttcct ggccgggttt ggagttcaaa
ccgggcccca accaaaacag ggccggaagg 120aaagggccgg gagcaaagga ccggggttcg
tttaaaaatt ccaggggact ccggaaggta 180aggctgggga gattgtttag gcaacttttg
gtttctggga atgacggtgt ttttcaaagt 240ggtggaagtg gtagtgacgg ttcttttgat
gcgagtgatg agaattgtaa ccagcaggag 300tttgttaggg gtaagaaagg aagctttgac
aagatgcttg cagatgccaa cacggagaat 360aacacagcgg aagctgttcc aggatcagtg
cccgggaagc ctgtagtttc aatgcctgca 420actaatctca atattggcat ggatttgtgg
aatacatccc ctgctgctgc tggagctgca 480aaaatgagaa caaatccatt tggggcctca
ccagcagttg ctcccgctgg cataatgccc 540gatcaatgga ttcaagatga acgtgaattg
aaaagacaga aaaggaagca atttaatagg 600gagtcagcca gaaggtcaag gttacgcaag
caggcggaat gtggggagct acaggccaga 660gtggggactt tgagcaatga gaatcgcaac
cttagagatg agttacagag gctttttgag 720gaatgggaga agcttacatt tgaaaataat
tccattaagg aagacttatt tcggttgtgt 780ggaccagagg cagttgttaa ttttgagcag
agcaacccca ctcagttgtc cggggaagaa 840gaaaatagct aa
852
SEQ ID NO: 34; length: 283; type: PRT; organism: Citrus clementina; sequence 34:
Met Pro
Pro Ile Gly His Pro Arg Phe His Pro Gln Val Tyr Phe Phe 1 5
10 15 Pro Gly Gly Val Cys Pro Ser
Trp Pro Gly Leu Glu Phe Lys Pro Gly 20 25
30 Pro Asn Gln Asn Arg Ala Gly Arg Lys Gly Pro Gly
Ala Lys Asp Arg 35 40 45
Gly Ser Phe Lys Asn Ser Arg Gly Leu Arg Lys Val Arg Leu Gly Arg
50 55 60 Leu Phe Arg
Gln Leu Leu Val Ser Gly Asn Asp Gly Val Phe Gln Ser 65
70 75 80 Gly Gly Ser Gly Ser Asp Gly
Ser Phe Asp Ala Ser Asp Glu Asn Cys 85
90 95 Asn Gln Gln Glu Phe Val Arg Gly Lys Lys Gly
Ser Phe Asp Lys Met 100 105
110 Leu Ala Asp Ala Asn Thr Glu Asn Asn Thr Ala Glu Ala Val Pro
Gly 115 120 125 Ser
Val Pro Gly Lys Pro Val Val Ser Met Pro Ala Thr Asn Leu Asn 130
135 140 Ile Gly Met Asp Leu Trp
Asn Thr Ser Pro Ala Ala Ala Gly Ala Ala 145 150
155 160 Lys Met Arg Thr Asn Pro Phe Gly Ala Ser Pro
Ala Val Ala Pro Ala 165 170
175 Gly Ile Met Pro Asp Gln Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg
180 185 190 Gln Lys
Arg Lys Gln Phe Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu 195
200 205 Arg Lys Gln Ala Glu Cys Gly
Glu Leu Gln Ala Arg Val Gly Thr Leu 210 215
220 Ser Asn Glu Asn Arg Asn Leu Arg Asp Glu Leu Gln
Arg Leu Phe Glu 225 230 235
240 Glu Trp Glu Lys Leu Thr Phe Glu Asn Asn Ser Ile Lys Glu Asp Leu
245 250 255 Phe Arg Leu
Cys Gly Pro Glu Ala Val Val Asn Phe Glu Gln Ser Asn 260
265 270 Pro Thr Gln Leu Ser Gly Glu Glu
Glu Asn Ser 275 280
SEQ ID NO: 35; length: 546; type: DNA; organism: Cichorium endivia; sequence 35:
atgggactcc ggttccatac cccgctccat tatccacctg caggagttta
tgctcatcct 60agtatgccta tgactccaag tcctgctcca gcaaacacag aaatggaagc
aaaggcatat 120gaaggaaagg aaagggccac caataaaaag tccaagggta cttctggaaa
tggaaatgtt 180ggtgttagaa ctggagatag tggcattgca gcatcaagtt cagggaatga
tggtggtgcc 240acacagagtg ctgatagtgg aagtgatggt tcatcagatg gaagtgatga
aaatgaccaa 300aatgaatttt ctggaggcaa gaaaggaagc ttcaatcaga tgcttgcaga
tgcaaatgca 360cagaacaata actttcacac accagtagta cctgtgaatc ctgtgacttc
tattcctggt 420acaaatctca tcatgagaat ggacttgcgg aatccctcca ccggaaatgc
cgccatgaaa 480atgcgaacaa atcattccgg caagtcccgt ggggaggtgc cgccacctat
gaagcctgaa 540tcatga
546
SEQ ID NO: 36; length: 181; type: PRT; organism: Cichorium endivia; sequence 36:
Met Gly Leu Arg Phe His Thr
Pro Leu His Tyr Pro Pro Ala Gly Val 1 5
10 15 Tyr Ala His Pro Ser Met Pro Met Thr Pro Ser
Pro Ala Pro Ala Asn 20 25
30 Thr Glu Met Glu Ala Lys Ala Tyr Glu Gly Lys Glu Arg Ala Thr
Asn 35 40 45 Lys
Lys Ser Lys Gly Thr Ser Gly Asn Gly Asn Val Gly Val Arg Thr 50
55 60 Gly Asp Ser Gly Ile Ala
Ala Ser Ser Ser Gly Asn Asp Gly Gly Ala 65 70
75 80 Thr Gln Ser Ala Asp Ser Gly Ser Asp Gly Ser
Ser Asp Gly Ser Asp 85 90
95 Glu Asn Asp Gln Asn Glu Phe Ser Gly Gly Lys Lys Gly Ser Phe Asn
100 105 110 Gln Met
Leu Ala Asp Ala Asn Ala Gln Asn Asn Asn Phe His Thr Pro 115
120 125 Val Val Pro Val Asn Pro Val
Thr Ser Ile Pro Gly Thr Asn Leu Ile 130 135
140 Met Arg Met Asp Leu Arg Asn Pro Ser Thr Gly Asn
Ala Ala Met Lys 145 150 155
160 Met Arg Thr Asn His Ser Gly Lys Ser Arg Gly Glu Val Pro Pro Pro
165 170 175 Met Lys Pro
Glu Ser 180
SEQ ID NO: 37; length: 456; type: DNA; organism: Centaurea maculosa; sequence 37:
atggtgcccg
gtgaatctct gttgcagaac gagcgggaac ttaaaaggga gaggagaaag 60caatctaatc
gagaatctgc caggcggtct agattaagga aacaggcgga agcagaagaa 120cttgcgataa
aagttgaatc cctcactaat gaaaatctga cccttaagtc cgaaattaac 180cgcttgactg
ataattccga gaaactgaag cttcaaaatg ctaaactaat tgagaaactc 240aagaatgcac
gacaagacac cgaagaccca cggctggacc caaacggctc gtctctgagc 300acggctaacc
tcctctcgag agtcaacaac gggtctggtg ctagaactga tggagacgct 360gaagtatatg
agaataacaa taaccaaaac tcgggtgcaa aactgcgtca actattggac 420gccagccctc
ggaccgatgc tgttgcagcg ggctaa
456
SEQ ID NO: 38; length: 151; type: PRT; organism: Centaurea maculosa; sequence 38:
Met Val Pro Gly Glu Ser Leu Leu Gln Asn
Glu Arg Glu Leu Lys Arg 1 5 10
15 Glu Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg
Leu 20 25 30 Arg
Lys Gln Ala Glu Ala Glu Glu Leu Ala Ile Lys Val Glu Ser Leu 35
40 45 Thr Asn Glu Asn Leu Thr
Leu Lys Ser Glu Ile Asn Arg Leu Thr Asp 50 55
60 Asn Ser Glu Lys Leu Lys Leu Gln Asn Ala Lys
Leu Ile Glu Lys Leu 65 70 75
80 Lys Asn Ala Arg Gln Asp Thr Glu Asp Pro Arg Leu Asp Pro Asn Gly
85 90 95 Ser Ser
Leu Ser Thr Ala Asn Leu Leu Ser Arg Val Asn Asn Gly Ser 100
105 110 Gly Ala Arg Thr Asp Gly Asp
Ala Glu Val Tyr Glu Asn Asn Asn Asn 115 120
125 Gln Asn Ser Gly Ala Lys Leu Arg Gln Leu Leu Asp
Ala Ser Pro Arg 130 135 140
Thr Asp Ala Val Ala Ala Gly 145 150
SEQ ID NO: 39; length: 1197; type: DNA; organism: Centaurea maculosa; sequence 39:
atgggcaact gtgaagagac aaaggcttgt aaacctgaga
aatcgtcttc acctccaccc 60gagcaacaac agaccaacgt tcatgcattt cctgattggg
cagccatgca ggcttattat 120ggccctagaa tggctatgcc accatacttc aactcggctg
ttgcatctgg tcatgcccct 180ccaccatata tgtggggacc accacagcat atgatgccgc
cttatgctgc tatgtatcca 240catggaggtg tttatccaca tcccggagtt cctcttgcgg
gtagtcctat gagcattgat 300tctccggcca agtcatcagg gaattctgat cgtggattgc
tgaaaaagtt gaaaggattt 360gatgggttgg caatgtcaat tggcaatggc aacggtgata
gtggtggagg tggaaatgag 420aatgggatct cccatagtgg ggagactgaa ggttctagtg
aaggaagtga tggcaataca 480acagaggggg gtcaaaatag cgggaaaagg agccgagaag
gatcgcctaa ggctcctgaa 540gttggcaaga ccgagccact aagcggacaa tttttcccta
ctgaagcaaa cggagcttcc 600aagaaagtta ctggtcttac tgttaccctt cctaaggttt
cgggtaaatt aggagctgcc 660gtctccgcta acttgacctc tgacttagag attaagaatt
ctcccacaac tgctgctaag 720ctggcctccg caactgtcgc catggtgccc ggtgaatctc
tgttgcagaa cgagcgtgaa 780cttaaaaggg agaggagaaa gcaatctaat cgagaatctg
ccaggcggtc tagattaagg 840aaacaggcgg aagcagaaga acttgcgata aaagttgaat
ccctcactaa tgaaaatctg 900acccttaagt ccgaaattaa ccgcttgagc gataattccg
agaaactgaa gcttcaaaat 960gccaaactaa ttgagaaact caagaatgca cgacaagaca
ccgaacaccc acggctggac 1020ccaaatggct cgtctctgag cacggctaac ctcctctcga
gagtcgacaa cgggtccggt 1080gctagaactg atggagacgt tgaagtgtac gagaataaca
ataaccaaaa cccgggtgca 1140aaactgcgtc aactattgca cgccagccca aggaccgatg
ccgttgcagc gggctaa 1197
SEQ ID NO: 40; length: 398; type: PRT; organism: Centaurea maculosa; sequence 40:
Met Gly Asn Cys
Glu Glu Thr Lys Ala Cys Lys Pro Glu Lys Ser Ser 1 5
10 15 Ser Pro Pro Pro Glu Gln Gln Gln Thr
Asn Val His Ala Phe Pro Asp 20 25
30 Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Met Ala Met
Pro Pro 35 40 45
Tyr Phe Asn Ser Ala Val Ala Ser Gly His Ala Pro Pro Pro Tyr Met 50
55 60 Trp Gly Pro Pro Gln
His Met Met Pro Pro Tyr Ala Ala Met Tyr Pro 65 70
75 80 His Gly Gly Val Tyr Pro His Pro Gly Val
Pro Leu Ala Gly Ser Pro 85 90
95 Met Ser Ile Asp Ser Pro Ala Lys Ser Ser Gly Asn Ser Asp Arg
Gly 100 105 110 Leu
Leu Lys Lys Leu Lys Gly Phe Asp Gly Leu Ala Met Ser Ile Gly 115
120 125 Asn Gly Asn Gly Asp Ser
Gly Gly Gly Gly Asn Glu Asn Gly Ile Ser 130 135
140 His Ser Gly Glu Thr Glu Gly Ser Ser Glu Gly
Ser Asp Gly Asn Thr 145 150 155
160 Thr Glu Gly Gly Gln Asn Ser Gly Lys Arg Ser Arg Glu Gly Ser Pro
165 170 175 Lys Ala
Pro Glu Val Gly Lys Thr Glu Pro Leu Ser Gly Gln Phe Phe 180
185 190 Pro Thr Glu Ala Asn Gly Ala
Ser Lys Lys Val Thr Gly Leu Thr Val 195 200
205 Thr Leu Pro Lys Val Ser Gly Lys Leu Gly Ala Ala
Val Ser Ala Asn 210 215 220
Leu Thr Ser Asp Leu Glu Ile Lys Asn Ser Pro Thr Thr Ala Ala Lys 225
230 235 240 Leu Ala Ser
Ala Thr Val Ala Met Val Pro Gly Glu Ser Leu Leu Gln 245
250 255 Asn Glu Arg Glu Leu Lys Arg Glu
Arg Arg Lys Gln Ser Asn Arg Glu 260 265
270 Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala
Glu Glu Leu 275 280 285
Ala Ile Lys Val Glu Ser Leu Thr Asn Glu Asn Leu Thr Leu Lys Ser 290
295 300 Glu Ile Asn Arg
Leu Ser Asp Asn Ser Glu Lys Leu Lys Leu Gln Asn 305 310
315 320 Ala Lys Leu Ile Glu Lys Leu Lys Asn
Ala Arg Gln Asp Thr Glu His 325 330
335 Pro Arg Leu Asp Pro Asn Gly Ser Ser Leu Ser Thr Ala Asn
Leu Leu 340 345 350
Ser Arg Val Asp Asn Gly Ser Gly Ala Arg Thr Asp Gly Asp Val Glu
355 360 365 Val Tyr Glu Asn
Asn Asn Asn Gln Asn Pro Gly Ala Lys Leu Arg Gln 370
375 380 Leu Leu His Ala Ser Pro Arg Thr
Asp Ala Val Ala Ala Gly 385 390 395
SEQ ID NO: 41; length: 1281; type: DNA; organism: Catharanthus roseus; sequence 41:
atgggaagta gtgaagagac aaagtcgtcg
aagcctgaga aatcatcttc tcctgcaccg 60gagcagagta atgttcatgt atatcctgat
tgggcagcaa tgcaggcata ttatggtccc 120cgagttgctg taccaccata tttcagctct
gctgttgcat ctggtcatcc acctcaccct 180tacatgtggg gaccacctca gcctatgatg
ccaccttatg gaacacctta tgctgcaatc 240tatgctcatg gaggtgttta tacccatccc
ggggttcctt tgggttcaca tgccaatgct 300catgcggggg ctacatctcc tggtgcaaca
gaagctattg ctgctagtcc tttgagcatt 360gatacaccta ccaagtcatc ggcaaatggc
agtcaaggtc tgatgaacaa attgagaggc 420tttgatggac ttgcaatgtc aataggcaat
ggcaacacgg acagtgccga tgggggaact 480gatcatggga tatcacagag tggtgacact
gaaggttcaa gtgatggaag caatgggact 540acatccaagg caggtcaaaa gaacaagaaa
cgcagccgtg aagggactcc tgctaatgat 600agggagcgca agtccctgac acctagtagt
ccatcagctg ctgtcaacac aaatggttct 660tcagagaaag ctatgagggc aagtaaagtt
cctgctgctg caactgaaaa ggtgatgggt 720gctgtacttt ctcctaatat gactactgca
tcggagctca ggaatccttc tgctgccaat 780gctaagacaa gtccggctaa ggtttcccaa
tcctgttctt ctcttccggg tgaaacttgg 840ttgcagaatg agcgagagct taagcgggaa
aggaggaaac aatctaatcg tgaatctgca 900aggagatcaa gattgaggaa acaggctgag
acggaagaat tagctaagaa agttcagact 960ttgactgctg aaaacatgac tttaaggtcg
gaaatcaata aactaactga gaactctgag 1020catctaaggc atgagagtgc gcttttggat
aagttgaaaa atgcacgggt catgcaagca 1080ggggagatga ataaatatga tgaattgcat
cggcaaccaa ctggtacagc tgaccttctt 1140gcgagagtca acaattctgg ttctactgat
aagagcaacg aggagggtgg tggtgatgtg 1200ttcgagaaca gaaactccgg gaccaagctt
caccagttgc tcgatgccag ccctagggcg 1260gatgctgttg ccgctggttg a
1281
SEQ ID NO: 42; length: 426; type: PRT; organism: Catharanthus roseus; sequence 42:
Met
Gly Ser Ser Glu Glu Thr Lys Ser Ser Lys Pro Glu Lys Ser Ser 1
5 10 15 Ser Pro Ala Pro Glu Gln
Ser Asn Val His Val Tyr Pro Asp Trp Ala 20
25 30 Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val
Ala Val Pro Pro Tyr Phe 35 40
45 Ser Ser Ala Val Ala Ser Gly His Pro Pro His Pro Tyr Met
Trp Gly 50 55 60
Pro Pro Gln Pro Met Met Pro Pro Tyr Gly Thr Pro Tyr Ala Ala Ile 65
70 75 80 Tyr Ala His Gly Gly
Val Tyr Thr His Pro Gly Val Pro Leu Gly Ser 85
90 95 His Ala Asn Ala His Ala Gly Ala Thr Ser
Pro Gly Ala Thr Glu Ala 100 105
110 Ile Ala Ala Ser Pro Leu Ser Ile Asp Thr Pro Thr Lys Ser Ser
Ala 115 120 125 Asn
Gly Ser Gln Gly Leu Met Asn Lys Leu Arg Gly Phe Asp Gly Leu 130
135 140 Ala Met Ser Ile Gly Asn
Gly Asn Thr Asp Ser Ala Asp Gly Gly Thr 145 150
155 160 Asp His Gly Ile Ser Gln Ser Gly Asp Thr Glu
Gly Ser Ser Asp Gly 165 170
175 Ser Asn Gly Thr Thr Ser Lys Ala Gly Gln Lys Asn Lys Lys Arg Ser
180 185 190 Arg Glu
Gly Thr Pro Ala Asn Asp Arg Glu Arg Lys Ser Leu Thr Pro 195
200 205 Ser Ser Pro Ser Ala Ala Val
Asn Thr Asn Gly Ser Ser Glu Lys Ala 210 215
220 Met Arg Ala Ser Lys Val Pro Ala Ala Ala Thr Glu
Lys Val Met Gly 225 230 235
240 Ala Val Leu Ser Pro Asn Met Thr Thr Ala Ser Glu Leu Arg Asn Pro
245 250 255 Ser Ala Ala
Asn Ala Lys Thr Ser Pro Ala Lys Val Ser Gln Ser Cys 260
265 270 Ser Ser Leu Pro Gly Glu Thr Trp
Leu Gln Asn Glu Arg Glu Leu Lys 275 280
285 Arg Glu Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg
Arg Ser Arg 290 295 300
Leu Arg Lys Gln Ala Glu Thr Glu Glu Leu Ala Lys Lys Val Gln Thr 305
310 315 320 Leu Thr Ala Glu
Asn Met Thr Leu Arg Ser Glu Ile Asn Lys Leu Thr 325
330 335 Glu Asn Ser Glu His Leu Arg His Glu
Ser Ala Leu Leu Asp Lys Leu 340 345
350 Lys Asn Ala Arg Val Met Gln Ala Gly Glu Met Asn Lys Tyr
Asp Glu 355 360 365
Leu His Arg Gln Pro Thr Gly Thr Ala Asp Leu Leu Ala Arg Val Asn 370
375 380 Asn Ser Gly Ser Thr
Asp Lys Ser Asn Glu Glu Gly Gly Gly Asp Val 385 390
395 400 Phe Glu Asn Arg Asn Ser Gly Thr Lys Leu
His Gln Leu Leu Asp Ala 405 410
415 Ser Pro Arg Ala Asp Ala Val Ala Ala Gly 420
425
SEQ ID NO: 43; length: 903; type: DNA; organism: Catharanthus roseus; sequence 43:
atgtggggag gccagcatcc
attgatgccc ccttatggga ctccagttcc atatccagct 60ctatatcctc ctggaggtgt
ttacgctcat cctactatgg caacgactcc aggaacaaca 120caagcaaatg ccgaatcaga
tgcagtaaag gtctctgaag gaaaggaccg acccacaagc 180aaaaggtccc gaggagcttc
agggaaccat ggcttggttg ctgcaaaagt tgcggagagt 240gggaaagcag cttcagagtc
tggaaatgat ggtgctactc agagtgctga aagtggaagt 300gaaggttcat cagatggaag
tgatgagaat aacaatcatg agctctctgg gaccaaaaaa 360ggaagttttg agcagatgct
agctgatgga gcaacagctc agaatagcac tgcaatagca 420aacttcccga actcagttcc
tggaaatcca gtagctatgc ctgcgaccaa tttgaacatt 480ggaatggact tgtggaatgc
ttcctctgct gctcctggag ccatgaaaat gcgtccaagt 540catggtgtcc catctgctgt
agctccgggc atggtcaatg accaatggat tcaagatgaa 600agagaattga aaagacaaaa
gcgaaaacaa tctaatcggg aatcagctag gagatcaaga 660ttacgcaaac aggctgagtg
cgaggaactg caacagagag tagagacatt gagcaatgaa 720aatcgtgcat tgcgagatga
gctacagagg ctttctgagg aatgtgagaa gcttacatca 780gaaaataact ccattaagga
cgagctaacg agggtatgcg gtcctgaggc agtatcgaaa 840ctagagagca gtagcataac
caaacaacaa cttcagtccc gcggtaatga acatgaaagt 900taa
903
SEQ ID NO: 44; length: 300; type: PRT; organism: Catharanthus roseus; sequence 44:
Met Trp Gly Gly Gln His Pro Leu Met Pro Pro Tyr Gly Thr Pro Val
1 5 10 15 Pro Tyr
Pro Ala Leu Tyr Pro Pro Gly Gly Val Tyr Ala His Pro Thr 20
25 30 Met Ala Thr Thr Pro Gly Thr
Thr Gln Ala Asn Ala Glu Ser Asp Ala 35 40
45 Val Lys Val Ser Glu Gly Lys Asp Arg Pro Thr Ser
Lys Arg Ser Arg 50 55 60
Gly Ala Ser Gly Asn His Gly Leu Val Ala Ala Lys Val Ala Glu Ser 65
70 75 80 Gly Lys Ala
Ala Ser Glu Ser Gly Asn Asp Gly Ala Thr Gln Ser Ala 85
90 95 Glu Ser Gly Ser Glu Gly Ser Ser
Asp Gly Ser Asp Glu Asn Asn Asn 100 105
110 His Glu Leu Ser Gly Thr Lys Lys Gly Ser Phe Glu Gln
Met Leu Ala 115 120 125
Asp Gly Ala Thr Ala Gln Asn Ser Thr Ala Ile Ala Asn Phe Pro Asn 130
135 140 Ser Val Pro Gly
Asn Pro Val Ala Met Pro Ala Thr Asn Leu Asn Ile 145 150
155 160 Gly Met Asp Leu Trp Asn Ala Ser Ser
Ala Ala Pro Gly Ala Met Lys 165 170
175 Met Arg Pro Ser His Gly Val Pro Ser Ala Val Ala Pro Gly
Met Val 180 185 190
Asn Asp Gln Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg Gln Lys Arg
195 200 205 Lys Gln Ser Asn
Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln 210
215 220 Ala Glu Cys Glu Glu Leu Gln Gln
Arg Val Glu Thr Leu Ser Asn Glu 225 230
235 240 Asn Arg Ala Leu Arg Asp Glu Leu Gln Arg Leu Ser
Glu Glu Cys Glu 245 250
255 Lys Leu Thr Ser Glu Asn Asn Ser Ile Lys Asp Glu Leu Thr Arg Val
260 265 270 Cys Gly Pro
Glu Ala Val Ser Lys Leu Glu Ser Ser Ser Ile Thr Lys 275
280 285 Gln Gln Leu Gln Ser Arg Gly Asn
Glu His Glu Ser 290 295 300
SEQ ID NO: 45; length: 1041; type: DNA; organism: Citrus sinensis; feature: misc_feature (1027)..(1027), n is a, c, g, or t; sequence 45:
atggggaaca atgaagatgg aaagtccttc aagtctgaaa aaccatcttc acctccacct
60tcggatcaag gcaatattca tatgtatact gattgggcag ctatgcaggc ttattatggc
120ccccgagttg ctattccgcc atattacaac tcacccattg catctggtca tgctcctcaa
180ccctacatgt ggggcccagc ccagcctatg atgccaccat atggagcgcc ttatgcagcc
240atctattcta ctggaggtgt ttatgcacat cctgctgttc ctttggctgt aaccccattg
300aacacagagg cacctactaa gtcgtcagga aatgcagatc gaggtttagc aaagaagctg
360aaagggttag atggcctggc aatgtcaata ggcaatgcta gtgctgagag tgctgagggt
420ggagcagaac aaaggccgtc acagagtgag gccgacggtt ctactgatgg aagtgatggg
480aatacagtta gggcaggtca atctagaaag aaaagaagcc gagagggaac gccaattgct
540ggaaaacccg ttggtcctgt gctttctcct ggcatgccta caaaattgga gctcaggaat
600gcacctggca tgaacgttaa ggcaagtcca accagtgttc cacagccttg tgcagtttta
660cctcctgaaa cctggattca gaatgaacgg gagctgaaac gggaaaggag gaaacaatct
720aatcgagaat ctgctagaag gtctaggttg aggaagcagg ctgaggctga agaactttct
780cgtaaagttg attccttgat tgatgagaat gcttccctca agtctgaaat aaatcaatta
840tcagagaatt ctgagaaact gaggcaagaa aacgcagcat tactggaaaa actgaagagt
900gcacaactgg gaaacaagca agagattgtt ttgaacgagg acaagagggt tacacctgtt
960agcacagaaa acctattatc tagagttaca actccggtac tgttgataga aacatggagg
1020aaaggangtc accctgtttg a
1041
46 346 PRT Citrus sinensis misc_feature (343)..(343) Xaa can be any naturally occurring amino acid
46 Met Gly Asn Asn Glu Asp Gly Lys Ser Phe
Lys Ser Glu Lys Pro Ser 1 5 10
15 Ser Pro Pro Pro Ser Asp Gln Gly Asn Ile His Met Tyr Thr Asp
Trp 20 25 30 Ala
Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Ile Pro Pro Tyr 35
40 45 Tyr Asn Ser Pro Ile Ala
Ser Gly His Ala Pro Gln Pro Tyr Met Trp 50 55
60 Gly Pro Ala Gln Pro Met Met Pro Pro Tyr Gly
Ala Pro Tyr Ala Ala 65 70 75
80 Ile Tyr Ser Thr Gly Gly Val Tyr Ala His Pro Ala Val Pro Leu Ala
85 90 95 Val Thr
Pro Leu Asn Thr Glu Ala Pro Thr Lys Ser Ser Gly Asn Ala 100
105 110 Asp Arg Gly Leu Ala Lys Lys
Leu Lys Gly Leu Asp Gly Leu Ala Met 115 120
125 Ser Ile Gly Asn Ala Ser Ala Glu Ser Ala Glu Gly
Gly Ala Glu Gln 130 135 140
Arg Pro Ser Gln Ser Glu Ala Asp Gly Ser Thr Asp Gly Ser Asp Gly 145
150 155 160 Asn Thr Val
Arg Ala Gly Gln Ser Arg Lys Lys Arg Ser Arg Glu Gly 165
170 175 Thr Pro Ile Ala Gly Lys Pro Val
Gly Pro Val Leu Ser Pro Gly Met 180 185
190 Pro Thr Lys Leu Glu Leu Arg Asn Ala Pro Gly Met Asn
Val Lys Ala 195 200 205
Ser Pro Thr Ser Val Pro Gln Pro Cys Ala Val Leu Pro Pro Glu Thr 210
215 220 Trp Ile Gln Asn
Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln Ser 225 230
235 240 Asn Arg Glu Ser Ala Arg Arg Ser Arg
Leu Arg Lys Gln Ala Glu Ala 245 250
255 Glu Glu Leu Ser Arg Lys Val Asp Ser Leu Ile Asp Glu Asn
Ala Ser 260 265 270
Leu Lys Ser Glu Ile Asn Gln Leu Ser Glu Asn Ser Glu Lys Leu Arg
275 280 285 Gln Glu Asn Ala
Ala Leu Leu Glu Lys Leu Lys Ser Ala Gln Leu Gly 290
295 300 Asn Lys Gln Glu Ile Val Leu Asn
Glu Asp Lys Arg Val Thr Pro Val 305 310
315 320 Ser Thr Glu Asn Leu Leu Ser Arg Val Thr Thr Pro
Val Leu Leu Ile 325 330
335 Glu Thr Trp Arg Lys Gly Xaa His Pro Val 340
345
47 516 DNA Citrus sinensis
47 atggggacag gggaagagaa cacttctgct
aagactgcca aaacagcttc ttcaactcag 60gagataccaa ccacaccctc gtacgctgat
tggtccagct ctatgcaggc tttctatggt 120gctggggcta cgccacctcc attttttgct
tccaccgttg cttctccaac tcctcatccc 180tatctgtggg gaagccagca tcctttaatg
ccaccatatg gcaccccagt tccataccaa 240gctatatatc ctccaggggg agtatatgca
catcctagca tggctacgac tccaacagca 300gcaccaacaa atacagagcc ggaagggaag
ggacctgaag caaaggaccg ggcttcagct 360aaaaaatcca agggaactcc aggaggtaag
gctggagaga ttgtaaaggc aacttctggt 420tctgggaatg acggtgtctc tcaaagtgct
gaaagtggta gtgacggttc atctgatgcg 480agtgatgaga atggtaaccg agcaggagtt
tgctag 516
48 171 PRT Citrus sinensis
48 Met Gly
Thr Gly Glu Glu Asn Thr Ser Ala Lys Thr Ala Lys Thr Ala 1 5
10 15 Ser Ser Thr Gln Glu Ile Pro
Thr Thr Pro Ser Tyr Ala Asp Trp Ser 20 25
30 Ser Ser Met Gln Ala Phe Tyr Gly Ala Gly Ala Thr
Pro Pro Pro Phe 35 40 45
Phe Ala Ser Thr Val Ala Ser Pro Thr Pro His Pro Tyr Leu Trp Gly
50 55 60 Ser Gln His
Pro Leu Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Gln 65
70 75 80 Ala Ile Tyr Pro Pro Gly Gly
Val Tyr Ala His Pro Ser Met Ala Thr 85
90 95 Thr Pro Thr Ala Ala Pro Thr Asn Thr Glu Pro
Glu Gly Lys Gly Pro 100 105
110 Glu Ala Lys Asp Arg Ala Ser Ala Lys Lys Ser Lys Gly Thr Pro
Gly 115 120 125 Gly
Lys Ala Gly Glu Ile Val Lys Ala Thr Ser Gly Ser Gly Asn Asp 130
135 140 Gly Val Ser Gln Ser Ala
Glu Ser Gly Ser Asp Gly Ser Ser Asp Ala 145 150
155 160 Ser Asp Glu Asn Gly Asn Arg Ala Gly Val Cys
165 170
49 1002 DNA Carthamus tinctorius
49 atgccgattg gtcaaactca atgcaggctt attatggtgc tggaggcact ccaccttttt
60tttgcctcaa ctgttgcttc tccgactcct catccctaca tatggggagg ccagcatcct
120atgatgtcac catatgggac tccagttcca taccctgctc tatatccacc agcaggagtt
180tacgctcatc ctagtatgcc tatgacccca agtaccgcac caccaaatgc agaaatggaa
240gtgaaggcct atgaaggcaa ggaaagggct gcaaataaaa agtccaaggg aacttcagga
300aatggcaatg ctgctgttgt tagaactgga gagagtggga aggcggcatc aagttcaggg
360aatgatggtg ccacccagag cgctgaaagt ggaagtgatg gctcatcaga tggaagtgaa
420gaaaatgacc aacatgaata ctctggaggc aagaaaggaa gttttaatca gatgcttgca
480gatgccaatg cacagaataa caattctggg ccaaatattc agacgtcagt acctgggaac
540cctgtggtgt ctatacctgg taccaatctt aatatgggga tggacttgtg gaatccatct
600accggaagtg gaaccatgaa aattcgatca aatccttctg gtgtggctcg agcagcagtg
660ccaccaccaa tgataggacg ggaaggaatg atgcctgatc agtgggttca ggatgagcgt
720gaactgaaga gacaaaagag gaagcagtct aaccgagagt cggctaggag atcaaggttg
780cgcaagcagg cggagtgtga agagccacag gcaagagtag aggcactaag caacgagaat
840cattcactca gagatgaact gcaaaggcta tcggaggaat gcgagaagct tacttctgaa
900aataattcga taaaggatga cttaactagg ttttgtgggc ccgaggcagt atcaaagcta
960gatgcacatc ttcaatctcg ggtggacgaa agtaacagct ga
1002
50 333 PRT Carthamus tinctorius
50 Met Pro Ile Gly Gln Thr Gln Cys Arg
Leu Ile Met Val Leu Glu Ala 1 5 10
15 Leu His Leu Phe Phe Ala Ser Thr Val Ala Ser Pro Thr Pro
His Pro 20 25 30
Tyr Ile Trp Gly Gly Gln His Pro Met Met Ser Pro Tyr Gly Thr Pro
35 40 45 Val Pro Tyr Pro
Ala Leu Tyr Pro Pro Ala Gly Val Tyr Ala His Pro 50
55 60 Ser Met Pro Met Thr Pro Ser Thr
Ala Pro Pro Asn Ala Glu Met Glu 65 70
75 80 Val Lys Ala Tyr Glu Gly Lys Glu Arg Ala Ala Asn
Lys Lys Ser Lys 85 90
95 Gly Thr Ser Gly Asn Gly Asn Ala Ala Val Val Arg Thr Gly Glu Ser
100 105 110 Gly Lys Ala
Ala Ser Ser Ser Gly Asn Asp Gly Ala Thr Gln Ser Ala 115
120 125 Glu Ser Gly Ser Asp Gly Ser Ser
Asp Gly Ser Glu Glu Asn Asp Gln 130 135
140 His Glu Tyr Ser Gly Gly Lys Lys Gly Ser Phe Asn Gln
Met Leu Ala 145 150 155
160 Asp Ala Asn Ala Gln Asn Asn Asn Ser Gly Pro Asn Ile Gln Thr Ser
165 170 175 Val Pro Gly Asn
Pro Val Val Ser Ile Pro Gly Thr Asn Leu Asn Met 180
185 190 Gly Met Asp Leu Trp Asn Pro Ser Thr
Gly Ser Gly Thr Met Lys Ile 195 200
205 Arg Ser Asn Pro Ser Gly Val Ala Arg Ala Ala Val Pro Pro
Pro Met 210 215 220
Ile Gly Arg Glu Gly Met Met Pro Asp Gln Trp Val Gln Asp Glu Arg 225
230 235 240 Glu Leu Lys Arg Gln
Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 245
250 255 Arg Ser Arg Leu Arg Lys Gln Ala Glu Cys
Glu Glu Pro Gln Ala Arg 260 265
270 Val Glu Ala Leu Ser Asn Glu Asn His Ser Leu Arg Asp Glu Leu
Gln 275 280 285 Arg
Leu Ser Glu Glu Cys Glu Lys Leu Thr Ser Glu Asn Asn Ser Ile 290
295 300 Lys Asp Asp Leu Thr Arg
Phe Cys Gly Pro Glu Ala Val Ser Lys Leu 305 310
315 320 Asp Ala His Leu Gln Ser Arg Val Asp Glu Ser
Asn Ser 325 330
51 728 DNA Euphorbia esula misc_feature (631)..(632) n is a, c, g, or t
51 atggggacag gggaagaaag cacgcctact aagacgtcta aaccagcgcc ttcaactcag
60gttcctgaaa ttccaacaac gcccgtgtat ccagattggt ccaattctat gcaggcttat
120tatggtgctg gagctactcc accgcatttt ttcgcatcaa cagttccatc tccaactccc
180cacccttatc tctggggagg tcagcacccc atgatgccac cgccctacgg gactcccgtt
240ccatatcctg ctttatatcc tgctggggga gtatattccc atcctactat gaccacgaca
300ccaaactctg caccggtaaa tgcagaattt gaaggaaaag gtcctgatgg aaaagaccgt
360gcttctgcca aaaaatctaa gggagcttca gctggcaagg gaggagagac cggaaaggca
420acctcaggtt ccggaaacga tggtgcctcc cagagcggtg aaagtggtag cgatggatcc
480tcagatggaa gtgatgagaa ctaacaggaa tatggggcga ataagaaagg aagttttgat
540cagatgcttg cggatgccaa tgctcaaaat aatggtatcc agggttcagt tccagggaag
600ccggttgcgt ccatggctgg agctaatctt nntnttggaa tggatttgtg gaatccttct
660gctgctgctc cggggactgc taaaattaga ccaaatgcat ccggtgctcc atcaggaatt
720actcctgc
728
52 242 PRT Euphorbia esula misc_feature (210)..(211) Xaa can be any naturally occurring amino acid
52 Met Gly Thr Gly Glu Glu Ser Thr Pro Thr
Lys Thr Ser Lys Pro Ala 1 5 10
15 Pro Ser Thr Gln Val Pro Glu Ile Pro Thr Thr Pro Val Tyr Pro
Asp 20 25 30 Trp
Ser Asn Ser Met Gln Ala Tyr Tyr Gly Ala Gly Ala Thr Pro Pro 35
40 45 His Phe Phe Ala Ser Thr
Val Pro Ser Pro Thr Pro His Pro Tyr Leu 50 55
60 Trp Gly Gly Gln His Pro Met Met Pro Pro Pro
Tyr Gly Thr Pro Val 65 70 75
80 Pro Tyr Pro Ala Leu Tyr Pro Ala Gly Gly Val Tyr Ser His Pro Thr
85 90 95 Met Thr
Thr Thr Pro Asn Ser Ala Pro Val Asn Ala Glu Phe Glu Gly 100
105 110 Lys Gly Pro Asp Gly Lys Asp
Arg Ala Ser Ala Lys Lys Ser Lys Gly 115 120
125 Ala Ser Ala Gly Lys Gly Gly Glu Thr Gly Lys Ala
Thr Ser Gly Ser 130 135 140
Gly Asn Asp Gly Ala Ser Gln Ser Gly Glu Ser Gly Ser Asp Gly Ser 145
150 155 160 Ser Asp Gly
Ser Asp Glu Asn Gln Glu Tyr Gly Ala Asn Lys Lys Gly 165
170 175 Ser Phe Asp Gln Met Leu Ala Asp
Ala Asn Ala Gln Asn Asn Gly Ile 180 185
190 Gln Gly Ser Val Pro Gly Lys Pro Val Ala Ser Met Ala
Gly Ala Asn 195 200 205
Leu Xaa Xaa Gly Met Asp Leu Trp Asn Pro Ser Ala Ala Ala Pro Gly 210
215 220 Thr Ala Lys Ile
Arg Pro Asn Ala Ser Gly Ala Pro Ser Gly Ile Thr 225 230
235 240 Pro Ala
53 828 DNA Fragaria vesca
53 atgatgccgc cttatggaac tcctgttccg taccctgcca tatatcctcc aggtggagta
60tatgctcatc cgggtatggt cacgactcct gcctcggtac cgccaacaaa tccagagtcg
120gaagggaaga gcacagatgg gaaagagcga gcttcagcca aaaaacctaa gggagctgca
180ggattggtta gtgggaaggc tggggatggt ggaaaagcaa cttctggttc cggaaatgat
240ggtgcgtcac aaagtgctga aagtggtagc gagggttcat cagatggaag tgaggagaat
300ggtaaccacc aggagtatgg tgcaaacaag aagggaagct ttgacaagat gcttgcagat
360ggagcgaatg cacaaaataa cacgggttca gtgcctggga agcctgtagt ttctatgcct
420gcaacaagtc tgaatatggg aatggacttg tggaatccat cccctgctgg tgccggaact
480gcaaaaatga gaggaaatca atctggagcc ccatcagctg tcggtggtga ccattggatt
540caggatgaac gggaactgaa aagacagaaa aggaagcagt caaataggga gtctgctagg
600aggtcaagat tgaggaaaca ggcggagtgt gaagagctac aaaagagtgt acatggactg
660acaaatgaga atcacggcct taaagatgag ctgcagagac tctcccagga gtgcgagaag
720cttgcgtctg aaaatacttc tataaaggaa gagttgacac gattgtgtgg accagattta
780gtagcgaaca ttgaacatca atctcatggt ggtgaaggta acagttga
828
54 275 PRT Fragaria vesca
54 Met Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr
Pro Ala Ile Tyr Pro 1 5 10
15 Pro Gly Gly Val Tyr Ala His Pro Gly Met Val Thr Thr Pro Ala Ser
20 25 30 Val Pro
Pro Thr Asn Pro Glu Ser Glu Gly Lys Ser Thr Asp Gly Lys 35
40 45 Glu Arg Ala Ser Ala Lys Lys
Pro Lys Gly Ala Ala Gly Leu Val Ser 50 55
60 Gly Lys Ala Gly Asp Gly Gly Lys Ala Thr Ser Gly
Ser Gly Asn Asp 65 70 75
80 Gly Ala Ser Gln Ser Ala Glu Ser Gly Ser Glu Gly Ser Ser Asp Gly
85 90 95 Ser Glu Glu
Asn Gly Asn His Gln Glu Tyr Gly Ala Asn Lys Lys Gly 100
105 110 Ser Phe Asp Lys Met Leu Ala Asp
Gly Ala Asn Ala Gln Asn Asn Thr 115 120
125 Gly Ser Val Pro Gly Lys Pro Val Val Ser Met Pro Ala
Thr Ser Leu 130 135 140
Asn Met Gly Met Asp Leu Trp Asn Pro Ser Pro Ala Gly Ala Gly Thr 145
150 155 160 Ala Lys Met Arg
Gly Asn Gln Ser Gly Ala Pro Ser Ala Val Gly Gly 165
170 175 Asp His Trp Ile Gln Asp Glu Arg Glu
Leu Lys Arg Gln Lys Arg Lys 180 185
190 Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys
Gln Ala 195 200 205
Glu Cys Glu Glu Leu Gln Lys Ser Val His Gly Leu Thr Asn Glu Asn 210
215 220 His Gly Leu Lys Asp
Glu Leu Gln Arg Leu Ser Gln Glu Cys Glu Lys 225 230
235 240 Leu Ala Ser Glu Asn Thr Ser Ile Lys Glu
Glu Leu Thr Arg Leu Cys 245 250
255 Gly Pro Asp Leu Val Ala Asn Ile Glu His Gln Ser His Gly Gly
Glu 260 265 270 Gly
Asn Ser 275
55 666 DNA Gossypium hirsutum
55 atgggaacgg aagaggagag
cacaccagcc aagccttcca aacctactgc ctcatcccag 60gaaatgccaa cagtgtcata
tcctgattgg tcgacccaaa tgcaggctta ttatggtgct 120gcagctactc ctccattatt
tgcctcaaac gttgcttcac caaccccgca tccatacata 180tggggaggcc agcatcctct
aatgcctcca tatggtaccc cggttccgta cccagctgta 240tatcctccaa ggggagtgta
tgcgcatcct aatatggccc caatgccaag ttctgcacgg 300aataatggtg ctgatggaaa
ggatcggggt gtgaccaaaa agcccaaggg atcttcggga 360agcaaagttg gagagagtgc
aaaggccact tcaggctcgg gaaacgatgg cggctctcaa 420agtggtgaaa gtggcagcga
gggtacatca gacagaagtg atgagagtaa tcaacaagaa 480gtcaatgctg gcaaaaaggg
aagctttgag cagatgcttg cagatgccaa tgcacagggt 540aaagctgctg gggctttagt
tcccgcagaa cccatagtct ctatgcctgg caactacttt 600gaatatagga atggacctat
ggagtgcttc ccctgctgcc acaggagctc caaaaaccag 660acctaa
666
56 221 PRT Gossypium hirsutum
56 Met Gly Thr Glu Glu Glu Ser Thr Pro Ala Lys Pro Ser Lys Pro
Thr 1 5 10 15 Ala
Ser Ser Gln Glu Met Pro Thr Val Ser Tyr Pro Asp Trp Ser Thr
20 25 30 Gln Met Gln Ala Tyr
Tyr Gly Ala Ala Ala Thr Pro Pro Leu Phe Ala 35
40 45 Ser Asn Val Ala Ser Pro Thr Pro His
Pro Tyr Ile Trp Gly Gly Gln 50 55
60 His Pro Leu Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr
Pro Ala Val 65 70 75
80 Tyr Pro Pro Arg Gly Val Tyr Ala His Pro Asn Met Ala Pro Met Pro
85 90 95 Ser Ser Ala Arg
Asn Asn Gly Ala Asp Gly Lys Asp Arg Gly Val Thr 100
105 110 Lys Lys Pro Lys Gly Ser Ser Gly Ser
Lys Val Gly Glu Ser Ala Lys 115 120
125 Ala Thr Ser Gly Ser Gly Asn Asp Gly Gly Ser Gln Ser Gly
Glu Ser 130 135 140
Gly Ser Glu Gly Thr Ser Asp Arg Ser Asp Glu Ser Asn Gln Gln Glu 145
150 155 160 Val Asn Ala Gly Lys
Lys Gly Ser Phe Glu Gln Met Leu Ala Asp Ala 165
170 175 Asn Ala Gln Gly Lys Ala Ala Gly Ala Leu
Val Pro Ala Glu Pro Ile 180 185
190 Val Ser Met Pro Gly Asn Tyr Phe Glu Tyr Arg Asn Gly Pro Met
Glu 195 200 205 Cys
Phe Pro Cys Cys His Arg Ser Ser Lys Asn Gln Thr 210
215 220
57 1050 DNA Gossypium hirsutum
57 atgcaggcat
attatggtcc tcatgtcaat atgccaccgt attacagttc agctgtggca 60tcaggccatg
ctcctccccc ctatatgtgg ggtccaacac agcctatgat gccatcctat 120ggagcacctt
atgcagcaat ctactctcat gggggagttt atgcacatcc cgcagttcct 180ctggcatcac
acagtcttgg tgttccatca tcaccggcag ctgcaggtcc tgtggagaca 240cctacgaagt
cccctggaaa tactgaacaa ggtttaatga agaagctgaa aggatttgat 300ggtcttgcaa
tatcaatagg caatggtact gctgagaatg ctgaaggaag agctaaacct 360agaccatccc
acagtgtgga gactgcaggt tcagctgatg gtagtgatgg aaatacaact 420gggacggatc
aaagtagacg gaaaagaagc agggagggga caccaactat tgcaggcgaa 480gatgagaaaa
ttgaggcaaa gtctaaccaa gtcgctgcgg gggaggtgac tgcaaccatt 540tctcctaaac
taattggaac tgtagtttct cctggcatga ccacaggaac aatattggag 600cttaggaaca
cccccaccat gaatgctatg tccagtgcta tgggtgtaca ttgtggagta 660atgcctactg
aagtctggtt gcagagtgag cgggagctga aacgggagag gcgaaaacaa 720tctaatagag
aatccgctag aaggtcaagg ctgaggaagc aggctgagac tgaagagctt 780gcccgtaaag
ttgaatcctt aacttcagag aatgcagcac tcagatctga aataaaccaa 840ttaactgaaa
tgtctgaaaa agtaaggctc gaaaatgcga tattagtgga ggaactgaaa 900aatgctcaac
ttggacacgc acaggagaat attttgaaca aaaaggaaga caaggagggt 960gaaatgggtg
agaaaaggtc agactccggt gccaagctgc atcaactctt ggatccgagt 1020cctagagacg
atgcagtggc tgccggctga
1050
58 349 PRT Gossypium hirsutum
58 Met Gln Ala Tyr Tyr Gly Pro His Val Asn
Met Pro Pro Tyr Tyr Ser 1 5 10
15 Ser Ala Val Ala Ser Gly His Ala Pro Pro Pro Tyr Met Trp Gly
Pro 20 25 30 Thr
Gln Pro Met Met Pro Ser Tyr Gly Ala Pro Tyr Ala Ala Ile Tyr 35
40 45 Ser His Gly Gly Val Tyr
Ala His Pro Ala Val Pro Leu Ala Ser His 50 55
60 Ser Leu Gly Val Pro Ser Ser Pro Ala Ala Ala
Gly Pro Val Glu Thr 65 70 75
80 Pro Thr Lys Ser Pro Gly Asn Thr Glu Gln Gly Leu Met Lys Lys Leu
85 90 95 Lys Gly
Phe Asp Gly Leu Ala Ile Ser Ile Gly Asn Gly Thr Ala Glu 100
105 110 Asn Ala Glu Gly Arg Ala Lys
Pro Arg Pro Ser His Ser Val Glu Thr 115 120
125 Ala Gly Ser Ala Asp Gly Ser Asp Gly Asn Thr Thr
Gly Thr Asp Gln 130 135 140
Ser Arg Arg Lys Arg Ser Arg Glu Gly Thr Pro Thr Ile Ala Gly Glu 145
150 155 160 Asp Glu Lys
Ile Glu Ala Lys Ser Asn Gln Val Ala Ala Gly Glu Val 165
170 175 Thr Ala Thr Ile Ser Pro Lys Leu
Ile Gly Thr Val Val Ser Pro Gly 180 185
190 Met Thr Thr Gly Thr Ile Leu Glu Leu Arg Asn Thr Pro
Thr Met Asn 195 200 205
Ala Met Ser Ser Ala Met Gly Val His Cys Gly Val Met Pro Thr Glu 210
215 220 Val Trp Leu Gln
Ser Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys Gln 225 230
235 240 Ser Asn Arg Glu Ser Ala Arg Arg Ser
Arg Leu Arg Lys Gln Ala Glu 245 250
255 Thr Glu Glu Leu Ala Arg Lys Val Glu Ser Leu Thr Ser Glu
Asn Ala 260 265 270
Ala Leu Arg Ser Glu Ile Asn Gln Leu Thr Glu Met Ser Glu Lys Val
275 280 285 Arg Leu Glu Asn
Ala Ile Leu Val Glu Glu Leu Lys Asn Ala Gln Leu 290
295 300 Gly His Ala Gln Glu Asn Ile Leu
Asn Lys Lys Glu Asp Lys Glu Gly 305 310
315 320 Glu Met Gly Glu Lys Arg Ser Asp Ser Gly Ala Lys
Leu His Gln Leu 325 330
335 Leu Asp Pro Ser Pro Arg Asp Asp Ala Val Ala Ala Gly
340 345
59 1125 DNA Glycine max
59 atgccaccat
actacaactc agctgttgct tctggtcacg ctcctcaccc gtacatgtgg 60gggccaccac
agcctatgat gccaccttat gggcctcctt atgcagcaat ttatccacat 120ggaggggttt
atactcaccc tgcagttcct attgggccac ttacacatag tcaaggagtt 180ccgtcttcac
ctgctgctgg gactcctttg agcatagaga caccacccaa atcatctgga 240aatactgatc
agggtttaat gaagaaattg aaagagtttg atggacttgc aatgtcaatt 300ggcaatggcc
atgctgaaag tgcagagcgt ggaggtgaaa acaggctctc acagagtgtg 360gatactgagg
gttccagtga tggaagtgat ggcaacactt caggggctaa tcaatcaaga 420aggaaaagaa
gccgtgaggg aacaccaacc actgatggag aagggaaaac tgagatacaa 480ggcagtccaa
tttccaaaga gactgcagct tctaataaga tgttgggagt tgtccctgcc 540agtgttgcag
gaacaatagt tggacatgta gtttcttcag gtatgaccac tgcactggag 600ctgagaaatc
cttccagtgt tcattctaaa acaagtgccc cacaaccttg tccagtattg 660cctgcagaag
cttgggtaca gaatgagcgt gagctgaaac gggagaggcg gaaacagtca 720aatcgagaat
ctgctagaag gtccagacta aggaagcagg ctgaaactga agaactggca 780cgaaaagttg
aatccttgaa tgctgagaat gcaacactga aatcagaaat taatcgactg 840actgaaagtt
ctgaaaaaat gagggtggaa aatgctacat taaggggaaa acttaaaaat 900gctcaactgg
gacaaaccca agagataact ttgaagataa ttgacagcca gagggctaca 960cctgtaagta
cagaaaactt attatcaaga gttaataatt ccggttctaa tgatagaact 1020gtggaggatg
agaatggttt ttgcgaaaat aaaccaaact ctggtgcaaa gctgcatcaa 1080ctgctggaca
caagtcctag agctgatgct gtggcagctg gttga
1125
60 374 PRT Glycine max
60 Met Pro Pro Tyr Tyr Asn Ser Ala Val Ala Ser Gly
His Ala Pro His 1 5 10
15 Pro Tyr Met Trp Gly Pro Pro Gln Pro Met Met Pro Pro Tyr Gly Pro
20 25 30 Pro Tyr Ala
Ala Ile Tyr Pro His Gly Gly Val Tyr Thr His Pro Ala 35
40 45 Val Pro Ile Gly Pro Leu Thr His
Ser Gln Gly Val Pro Ser Ser Pro 50 55
60 Ala Ala Gly Thr Pro Leu Ser Ile Glu Thr Pro Pro Lys
Ser Ser Gly 65 70 75
80 Asn Thr Asp Gln Gly Leu Met Lys Lys Leu Lys Glu Phe Asp Gly Leu
85 90 95 Ala Met Ser Ile
Gly Asn Gly His Ala Glu Ser Ala Glu Arg Gly Gly 100
105 110 Glu Asn Arg Leu Ser Gln Ser Val Asp
Thr Glu Gly Ser Ser Asp Gly 115 120
125 Ser Asp Gly Asn Thr Ser Gly Ala Asn Gln Ser Arg Arg Lys
Arg Ser 130 135 140
Arg Glu Gly Thr Pro Thr Thr Asp Gly Glu Gly Lys Thr Glu Ile Gln 145
150 155 160 Gly Ser Pro Ile Ser
Lys Glu Thr Ala Ala Ser Asn Lys Met Leu Gly 165
170 175 Val Val Pro Ala Ser Val Ala Gly Thr Ile
Val Gly His Val Val Ser 180 185
190 Ser Gly Met Thr Thr Ala Leu Glu Leu Arg Asn Pro Ser Ser Val
His 195 200 205 Ser
Lys Thr Ser Ala Pro Gln Pro Cys Pro Val Leu Pro Ala Glu Ala 210
215 220 Trp Val Gln Asn Glu Arg
Glu Leu Lys Arg Glu Arg Arg Lys Gln Ser 225 230
235 240 Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg
Lys Gln Ala Glu Thr 245 250
255 Glu Glu Leu Ala Arg Lys Val Glu Ser Leu Asn Ala Glu Asn Ala Thr
260 265 270 Leu Lys
Ser Glu Ile Asn Arg Leu Thr Glu Ser Ser Glu Lys Met Arg 275
280 285 Val Glu Asn Ala Thr Leu Arg
Gly Lys Leu Lys Asn Ala Gln Leu Gly 290 295
300 Gln Thr Gln Glu Ile Thr Leu Lys Ile Ile Asp Ser
Gln Arg Ala Thr 305 310 315
320 Pro Val Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Ser Gly Ser
325 330 335 Asn Asp Arg
Thr Val Glu Asp Glu Asn Gly Phe Cys Glu Asn Lys Pro 340
345 350 Asn Ser Gly Ala Lys Leu His Gln
Leu Leu Asp Thr Ser Pro Arg Ala 355 360
365 Asp Ala Val Ala Ala Gly 370
61 1275 DNA Glycine max
61 atgggaaaca gtgaggatga gaaatctgtt aagactggaa
gcccttcttc ttcacctgca 60acaactgatc agaccaacca acctaatatt catgtctatc
ctgattgggc tgccatgcag 120tattatgggc caagagtcaa cattccacca tacttcaact
cagctgtggc ttcaggtcat 180gctccacacc catacatgtg gggaccacca cagcctatga
tgccacctta tgggccacct 240tatgcagcat tttattcgca tggaggggtt tatactcacc
ctgcagttgc tattgggcca 300cacttacatg gtcaaggagt ttcatcttca cctgctgttg
ggactcattc aagcatagaa 360tcaccaacca aattatctgg aaatactgat cagggtttaa
tgaagaaatc aaaagggttt 420gatgggcttg caatgtcaat aggcaattgc aatgctgaga
gtgctgagca tggagctgag 480aacaggcagt cacagagtgt ggatactgag ggttacagcg
acggaagtga tggcaacact 540gcaggggcta atcaaacaaa aaggaaaaga tgccgagagg
gaacactgac cactgatgga 600gaagggaaaa ctgagctaca aaatggtccg gcttccaaag
agacttcatc ttccaaaaag 660attgtgtcag ctactccagc tagtgttgcc ggaacattag
ttggacctgt agtttcttca 720gttatggcca caacactgga actgaggaac ccttcgactg
ttgattctaa ggcaaattcc 780acaagtgccc cacaaccttg tgcaattgtg cctaatgaaa
cttgcttaca gaatgagcgt 840gagctgaaac gggagaggag aaaacaatct aaccgtgaat
ctgctagaag gtccaggctg 900aggaagcagg ccgagactga agaattggca cgaaaagttg
atatgttaac tgctgagaat 960gtgtccctga agtcagaaat aattcaactg actgaaggtt
ctgagcagat gaggatggaa 1020aattctgcat tgagggaaaa actgagaaat actcaactgg
gacaaaggga agagataatt 1080ttgagtagca tcgagagtaa gagggctgca cctgtaagta
cagaaaactt gttatcaaga 1140gttaataatt ctagttctaa tgacagaact acagagaatg
agaatgattt ctgtgagaac 1200aaaccaaatt ctggtgcaaa gctgcatcaa ctattggata
caaatcctag agcagatgct 1260gtggcagctg gttga
1275
62 424 PRT Glycine max
62 Met Gly Asn Ser Glu Asp
Glu Lys Ser Val Lys Thr Gly Ser Pro Ser 1 5
10 15 Ser Ser Pro Ala Thr Thr Asp Gln Thr Asn Gln
Pro Asn Ile His Val 20 25
30 Tyr Pro Asp Trp Ala Ala Met Gln Tyr Tyr Gly Pro Arg Val Asn
Ile 35 40 45 Pro
Pro Tyr Phe Asn Ser Ala Val Ala Ser Gly His Ala Pro His Pro 50
55 60 Tyr Met Trp Gly Pro Pro
Gln Pro Met Met Pro Pro Tyr Gly Pro Pro 65 70
75 80 Tyr Ala Ala Phe Tyr Ser His Gly Gly Val Tyr
Thr His Pro Ala Val 85 90
95 Ala Ile Gly Pro His Leu His Gly Gln Gly Val Ser Ser Ser Pro Ala
100 105 110 Val Gly
Thr His Ser Ser Ile Glu Ser Pro Thr Lys Leu Ser Gly Asn 115
120 125 Thr Asp Gln Gly Leu Met Lys
Lys Ser Lys Gly Phe Asp Gly Leu Ala 130 135
140 Met Ser Ile Gly Asn Cys Asn Ala Glu Ser Ala Glu
His Gly Ala Glu 145 150 155
160 Asn Arg Gln Ser Gln Ser Val Asp Thr Glu Gly Tyr Ser Asp Gly Ser
165 170 175 Asp Gly Asn
Thr Ala Gly Ala Asn Gln Thr Lys Arg Lys Arg Cys Arg 180
185 190 Glu Gly Thr Leu Thr Thr Asp Gly
Glu Gly Lys Thr Glu Leu Gln Asn 195 200
205 Gly Pro Ala Ser Lys Glu Thr Ser Ser Ser Lys Lys Ile
Val Ser Ala 210 215 220
Thr Pro Ala Ser Val Ala Gly Thr Leu Val Gly Pro Val Val Ser Ser 225
230 235 240 Val Met Ala Thr
Thr Leu Glu Leu Arg Asn Pro Ser Thr Val Asp Ser 245
250 255 Lys Ala Asn Ser Thr Ser Ala Pro Gln
Pro Cys Ala Ile Val Pro Asn 260 265
270 Glu Thr Cys Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg
Arg Lys 275 280 285
Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala 290
295 300 Glu Thr Glu Glu Leu
Ala Arg Lys Val Asp Met Leu Thr Ala Glu Asn 305 310
315 320 Val Ser Leu Lys Ser Glu Ile Ile Gln Leu
Thr Glu Gly Ser Glu Gln 325 330
335 Met Arg Met Glu Asn Ser Ala Leu Arg Glu Lys Leu Arg Asn Thr
Gln 340 345 350 Leu
Gly Gln Arg Glu Glu Ile Ile Leu Ser Ser Ile Glu Ser Lys Arg 355
360 365 Ala Ala Pro Val Ser Thr
Glu Asn Leu Leu Ser Arg Val Asn Asn Ser 370 375
380 Ser Ser Asn Asp Arg Thr Thr Glu Asn Glu Asn
Asp Phe Cys Glu Asn 385 390 395
400 Lys Pro Asn Ser Gly Ala Lys Leu His Gln Leu Leu Asp Thr Asn Pro
405 410 415 Arg Ala
Asp Ala Val Ala Ala Gly 420
63 1035 DNA Glycine max
63 atgggaaccg gtgaagaaag cacagctaaa gttcctaaac catcttctac ttcttcaatt
60cagataccac tggcaccttc atatcctgat tggtcaagct cgatgcaggc ttactatgct
120cctggagcca ctccacctgc attttttgcc tcaaatattg cttctccaac tccccattct
180tatatgtggg gaagccagca ccctctaatt ccaccatata gtactcctgt tccatatcca
240gctatatatc ctcctgggaa tgtctatgct catcctagca tggcaatgac cctgagcacc
300acacagaatg gtacagagtt tgtaggaaag ggttctgatg aaaaagatcg ggtttctgcc
360aaaagttcaa aggctgtgtc tgcaaacaat ggttccaaag ctggagacaa tggaaaggca
420agctcaggtc ccagaaatga tggcacctca caaagtgctg aaagtggttc agagggatct
480tcggatgcta gtgatgagaa tactaaccaa caggaatcgg ctacaaacaa gaaagggagt
540tttgaccaaa tgcttgttga tggtgctaat gcacggaaca attctgtgag catcattcct
600caacctggaa atcccgctgt gtcaatgtct ccaactagtc ttaatattgg aatgaacttg
660tggaatgcat ctcctgctgg tgacgaagct gcaaaaatga gacagaatca gtcttcagga
720gctgttactc ctccaaccat aatgggacgt gaagtcgcgc tgggtgaaca ctggatacaa
780gatgaacgtg aactaaagaa acagaaaagg aaacagtcta atagggagtc tgctaggaga
840tcaagactac gcaagcaggc tgagtgtgaa gagttacaaa agagggtgga gtctctggga
900agtgagaatc aaactctcag agaagagctt cagagagtat ctgaagaatg caaaaaactt
960acatctgaaa atgattccat caaggaagag ttagaacggt tgtgtggacc agaagcagtt
1020gctaatcttg aataa
1035
64 344 PRT Glycine max
64 Met Gly Thr Gly Glu Glu Ser Thr Ala Lys Val Pro
Lys Pro Ser Ser 1 5 10
15 Thr Ser Ser Ile Gln Ile Pro Leu Ala Pro Ser Tyr Pro Asp Trp Ser
20 25 30 Ser Ser Met
Gln Ala Tyr Tyr Ala Pro Gly Ala Thr Pro Pro Ala Phe 35
40 45 Phe Ala Ser Asn Ile Ala Ser Pro
Thr Pro His Ser Tyr Met Trp Gly 50 55
60 Ser Gln His Pro Leu Ile Pro Pro Tyr Ser Thr Pro Val
Pro Tyr Pro 65 70 75
80 Ala Ile Tyr Pro Pro Gly Asn Val Tyr Ala His Pro Ser Met Ala Met
85 90 95 Thr Leu Ser Thr
Thr Gln Asn Gly Thr Glu Phe Val Gly Lys Gly Ser 100
105 110 Asp Glu Lys Asp Arg Val Ser Ala Lys
Ser Ser Lys Ala Val Ser Ala 115 120
125 Asn Asn Gly Ser Lys Ala Gly Asp Asn Gly Lys Ala Ser Ser
Gly Pro 130 135 140
Arg Asn Asp Gly Thr Ser Gln Ser Ala Glu Ser Gly Ser Glu Gly Ser 145
150 155 160 Ser Asp Ala Ser Asp
Glu Asn Thr Asn Gln Gln Glu Ser Ala Thr Asn 165
170 175 Lys Lys Gly Ser Phe Asp Gln Met Leu Val
Asp Gly Ala Asn Ala Arg 180 185
190 Asn Asn Ser Val Ser Ile Ile Pro Gln Pro Gly Asn Pro Ala Val
Ser 195 200 205 Met
Ser Pro Thr Ser Leu Asn Ile Gly Met Asn Leu Trp Asn Ala Ser 210
215 220 Pro Ala Gly Asp Glu Ala
Ala Lys Met Arg Gln Asn Gln Ser Ser Gly 225 230
235 240 Ala Val Thr Pro Pro Thr Ile Met Gly Arg Glu
Val Ala Leu Gly Glu 245 250
255 His Trp Ile Gln Asp Glu Arg Glu Leu Lys Lys Gln Lys Arg Lys Gln
260 265 270 Ser Asn
Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu 275
280 285 Cys Glu Glu Leu Gln Lys Arg
Val Glu Ser Leu Gly Ser Glu Asn Gln 290 295
300 Thr Leu Arg Glu Glu Leu Gln Arg Val Ser Glu Glu
Cys Lys Lys Leu 305 310 315
320 Thr Ser Glu Asn Asp Ser Ile Lys Glu Glu Leu Glu Arg Leu Cys Gly
325 330 335 Pro Glu Ala
Val Ala Asn Leu Glu 340
65 1275 DNA Glycine max
65 atgggaaaca gtgaggaaga gaaatctgtt aaaactggaa gtccttcttc ttcacctgca
60acaactgagc agaccaacca acctaatatt cacgtctatc ctgattgggc tgccatgcag
120tattatgggc caagagtcaa cattccacca tacttcaact cagctgtggc ttctggtcat
180gctccacacc catacatgtg ggggccgcca cagcctatga tgcaacctta tgggccacct
240tatgcagcat tttattcgca tggaggggtt tatactcacc ctgcagttgc tattgggcca
300cactcacatg gtcaaggagt tccatcttca cctgctgctg ggactccttc aagcgtagaa
360tcaccaacca aattctctgg aaatactaat cagggtttag tgaagaaatt gaaagggttt
420gatgagcttg caatgtcaat aggcaattgc aatgctgaga gtgctgagcg aggagctgaa
480aacaggctgt cacagagtgt ggatactgag ggttccagcg acggaagtga tggcaacact
540gcaggggcta atcaaacaaa aaggaaaaga agccgagaag gaacaccgat cactgatgca
600gaagggaaaa ctgagctaca aaatggtccg gcttccaaag agactgcatc ttccaaaaag
660attgtgtcag ctaccccagc tagtgttgca ggaacattag ttggacctgt agtttcttca
720ggtatggcca cagcactgga gctgaggaac ccttcgactg ttcattctaa ggcaaattcc
780acaagtgccg cacaaccttg tgcagttgtg cgtaatgaaa cttggttaca gaatgagcgt
840gagctgaaac gggagaggag aaaacaatct aaccgtgaat ctgctagaag gtccaggctg
900aggaagcagg ccgagactga agaattggca cgaaaagttg agatgttaac tgctgagaat
960gtgtctctga agtcagaaat aactcgactg actgaaggtt ctgagcagat gaggatggaa
1020aattctgcat tgagggaaaa actgataaat actcaactgg gaccaaggga agagataact
1080ttgagcagca ttgacagcaa gagggctgca cctgtaagta cagaaaactt gttatcaaga
1140gttaacaatt ccggagctaa tgatagaact gcagagaatg agaatgatat ctgcgagaac
1200aaaccaaatt ctggtgcaaa gctgcatcaa ctactggata caaatcctag agctaatgct
1260gtagcagctg gttga
1275
66 424 PRT Glycine max
66 Met Gly Asn Ser Glu Glu Glu Lys Ser Val Lys Thr
Gly Ser Pro Ser 1 5 10
15 Ser Ser Pro Ala Thr Thr Glu Gln Thr Asn Gln Pro Asn Ile His Val
20 25 30 Tyr Pro Asp
Trp Ala Ala Met Gln Tyr Tyr Gly Pro Arg Val Asn Ile 35
40 45 Pro Pro Tyr Phe Asn Ser Ala Val
Ala Ser Gly His Ala Pro His Pro 50 55
60 Tyr Met Trp Gly Pro Pro Gln Pro Met Met Gln Pro Tyr
Gly Pro Pro 65 70 75
80 Tyr Ala Ala Phe Tyr Ser His Gly Gly Val Tyr Thr His Pro Ala Val
85 90 95 Ala Ile Gly Pro
His Ser His Gly Gln Gly Val Pro Ser Ser Pro Ala 100
105 110 Ala Gly Thr Pro Ser Ser Val Glu Ser
Pro Thr Lys Phe Ser Gly Asn 115 120
125 Thr Asn Gln Gly Leu Val Lys Lys Leu Lys Gly Phe Asp Glu
Leu Ala 130 135 140
Met Ser Ile Gly Asn Cys Asn Ala Glu Ser Ala Glu Arg Gly Ala Glu 145
150 155 160 Asn Arg Leu Ser Gln
Ser Val Asp Thr Glu Gly Ser Ser Asp Gly Ser 165
170 175 Asp Gly Asn Thr Ala Gly Ala Asn Gln Thr
Lys Arg Lys Arg Ser Arg 180 185
190 Glu Gly Thr Pro Ile Thr Asp Ala Glu Gly Lys Thr Glu Leu Gln
Asn 195 200 205 Gly
Pro Ala Ser Lys Glu Thr Ala Ser Ser Lys Lys Ile Val Ser Ala 210
215 220 Thr Pro Ala Ser Val Ala
Gly Thr Leu Val Gly Pro Val Val Ser Ser 225 230
235 240 Gly Met Ala Thr Ala Leu Glu Leu Arg Asn Pro
Ser Thr Val His Ser 245 250
255 Lys Ala Asn Ser Thr Ser Ala Ala Gln Pro Cys Ala Val Val Arg Asn
260 265 270 Glu Thr
Trp Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys 275
280 285 Gln Ser Asn Arg Glu Ser Ala
Arg Arg Ser Arg Leu Arg Lys Gln Ala 290 295
300 Glu Thr Glu Glu Leu Ala Arg Lys Val Glu Met Leu
Thr Ala Glu Asn 305 310 315
320 Val Ser Leu Lys Ser Glu Ile Thr Arg Leu Thr Glu Gly Ser Glu Gln
325 330 335 Met Arg Met
Glu Asn Ser Ala Leu Arg Glu Lys Leu Ile Asn Thr Gln 340
345 350 Leu Gly Pro Arg Glu Glu Ile Thr
Leu Ser Ser Ile Asp Ser Lys Arg 355 360
365 Ala Ala Pro Val Ser Thr Glu Asn Leu Leu Ser Arg Val
Asn Asn Ser 370 375 380
Gly Ala Asn Asp Arg Thr Ala Glu Asn Glu Asn Asp Ile Cys Glu Asn 385
390 395 400 Lys Pro Asn Ser
Gly Ala Lys Leu His Gln Leu Leu Asp Thr Asn Pro 405
410 415 Arg Ala Asn Ala Val Ala Ala Gly
420
67 1014 DNA Glycine max
67 atgggggctg gggaagagag
cacagctaaa tcttctaaat catcttcatc agctcaggat 60acaccaacag cacctgcata
tcctgattgg tcaagctcca tgcaggccta ttatgctcct 120ggagccactc ctcctccctt
ttttgccaca accgttgctt ccccaactcc ccatccctat 180ttatggggag gccagcatcc
tttgatgccg ccatatggga ctccagtccc atatccagct 240atatatcctc ctgggagtat
ctatgctcat cctagcatgg cagtgactcc aagtgctgtc 300cagcaaaata cagagattga
agggaaggga gctgaaggaa aatatcggga ctcatccaaa 360aaattgaaag gaccttctgc
aaatacagct tccaaagcag gagaaagtgg aaaggcaggc 420tcaggttcag gcaatgatgg
catatcgcaa agtggtgaaa gtggttcaga gggttcatca 480aatgctagcg atgagaatac
taaccaacag gaatcagctg caaataagaa gggaagcttt 540gacctgatgc ttgttgatgg
agccaatgca cagaacaatt ctgctggtgc tatttctcaa 600tcttctgtgc ctggaaagcc
tgttgtccca atgccagcaa ctaatcttaa cattggaatg 660gacttgtgga atgcatcttc
tggtggcgct gaagctgcaa aaatgagaca taatcaatct 720ggtgccccgg gagttgccct
tggtgatcaa tgggtacaag atgaacgtga gctgaaaaga 780cagaagagga aacagtcaaa
ccgagagtca gccaggaggt caagattacg caagcaggct 840gagtgtgaag agttacaaaa
gagggtggaa tcgctgggag gtgagaatca aactctcaga 900gaagagcttc agagactttc
tgaagaatgc gagaagctta catccgaaaa taattctatc 960aaggaagagt tggagcggtt
gtgtggtcca gaagcagttg ctaacctgga ttaa 1014
68 337 PRT Glycine max
68 Met Gly Ala Gly Glu Glu Ser Thr Ala Lys Ser Ser Lys Ser Ser Ser 1
5 10 15 Ser Ala Gln Asp
Thr Pro Thr Ala Pro Ala Tyr Pro Asp Trp Ser Ser 20
25 30 Ser Met Gln Ala Tyr Tyr Ala Pro Gly
Ala Thr Pro Pro Pro Phe Phe 35 40
45 Ala Thr Thr Val Ala Ser Pro Thr Pro His Pro Tyr Leu Trp
Gly Gly 50 55 60
Gln His Pro Leu Met Pro Pro Tyr Gly Thr Pro Val Pro Tyr Pro Ala 65
70 75 80 Ile Tyr Pro Pro Gly
Ser Ile Tyr Ala His Pro Ser Met Ala Val Thr 85
90 95 Pro Ser Ala Val Gln Gln Asn Thr Glu Ile
Glu Gly Lys Gly Ala Glu 100 105
110 Gly Lys Tyr Arg Asp Ser Ser Lys Lys Leu Lys Gly Pro Ser Ala
Asn 115 120 125 Thr
Ala Ser Lys Ala Gly Glu Ser Gly Lys Ala Gly Ser Gly Ser Gly 130
135 140 Asn Asp Gly Ile Ser Gln
Ser Gly Glu Ser Gly Ser Glu Gly Ser Ser 145 150
155 160 Asn Ala Ser Asp Glu Asn Thr Asn Gln Gln Glu
Ser Ala Ala Asn Lys 165 170
175 Lys Gly Ser Phe Asp Leu Met Leu Val Asp Gly Ala Asn Ala Gln Asn
180 185 190 Asn Ser
Ala Gly Ala Ile Ser Gln Ser Ser Val Pro Gly Lys Pro Val 195
200 205 Val Pro Met Pro Ala Thr Asn
Leu Asn Ile Gly Met Asp Leu Trp Asn 210 215
220 Ala Ser Ser Gly Gly Ala Glu Ala Ala Lys Met Arg
His Asn Gln Ser 225 230 235
240 Gly Ala Pro Gly Val Ala Leu Gly Asp Gln Trp Val Gln Asp Glu Arg
245 250 255 Glu Leu Lys
Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 260
265 270 Arg Ser Arg Leu Arg Lys Gln Ala
Glu Cys Glu Glu Leu Gln Lys Arg 275 280
285 Val Glu Ser Leu Gly Gly Glu Asn Gln Thr Leu Arg Glu
Glu Leu Gln 290 295 300
Arg Leu Ser Glu Glu Cys Glu Lys Leu Thr Ser Glu Asn Asn Ser Ile 305
310 315 320 Lys Glu Glu Leu
Glu Arg Leu Cys Gly Pro Glu Ala Val Ala Asn Leu 325
330 335 Asp
69 1278 DNA Glycine max
69 atgggaaaca gtgaggaaga gaaatctacc aagactgaaa aaccttcttc acctgtaaca
60gtggatcaag ccaatcagac caaccagacc aatattcatg tctatcctga ttgggcagcc
120atgcaggcat attatgggcc aagagtcacc atgccaccat actacaactc agctgtggct
180tctggtcacg ctcctcaccc atacatgtgg ggaccaccac agcctatgat gccaccttat
240gggcctcctt atgcagcaat ttatccacat ggaggggttt atactcaccc tgcagttcct
300attgggccac atacacatag tcaaggagtt ccatcttcac ccgccgctgg gactccttta
360agcatagaga caccacccaa atcatctgga aatactgatc agggtttaat gaagaaattg
420aaagagtttg atggacttgc aatgtcaata ggaaatggcc atgctgaaag tgcagagcct
480ggaggtgaaa acaggctgtc agagagtgtg gatactgagg gttccagtga tggaagtgat
540ggcaacactt caggggctaa tcaaacaaga aggaaaagaa gccgtgaggg aacaccaacc
600actgatggag aagggaaaac tgagatgcaa ggcagtccaa tttccaaaga gactgcagct
660tctaataaga tgttggcagt tgtcactgct ggtgttgcag gaacaatagt tggacctgta
720gtttcttcag gtatgaccac cacgctggag ctgagaaatc cttccagtgt tcattctaaa
780gcaagtgccc cacaaccttg tccagtattg cctgcagaaa cttggttaca gaatgagcgt
840gagctgaaac gtgagaggcg gaaacaatca aatcgagaat ctgctagaag gtccagacta
900aggaagcagg ctgaaactga agaactggca cggaaagttg aatccttgaa tgctgagaat
960gcaacactga aatcagaaat aaatcgactg accgaaagtt ctgaaaaaat gagggtggaa
1020aatgctacat taaggggaaa acttaaaaat gctcaactga gacaaacaca agagataact
1080ttgaacataa ttgacagcca gagggctaca cctataagta cagaaaactt actatcgaga
1140gttaataata attccggttc taatgataga actgtggagg atgagaatgg tttttgcgaa
1200aataaaccaa actctggtgc aaagctgcat caactactgg acacaagtcc tagagctgat
1260gctgtggcag ctggttga
127870425PRTGlycine max 70Met Gly Asn Ser Glu Glu Glu Lys Ser Thr Lys Thr
Glu Lys Pro Ser 1 5 10
15 Ser Pro Val Thr Val Asp Gln Ala Asn Gln Thr Asn Gln Thr Asn Ile
20 25 30 His Val Tyr
Pro Asp Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg 35
40 45 Val Thr Met Pro Pro Tyr Tyr Asn
Ser Ala Val Ala Ser Gly His Ala 50 55
60 Pro His Pro Tyr Met Trp Gly Pro Pro Gln Pro Met Met
Pro Pro Tyr 65 70 75
80 Gly Pro Pro Tyr Ala Ala Ile Tyr Pro His Gly Gly Val Tyr Thr His
85 90 95 Pro Ala Val Pro
Ile Gly Pro His Thr His Ser Gln Gly Val Pro Ser 100
105 110 Ser Pro Ala Ala Gly Thr Pro Leu Ser
Ile Glu Thr Pro Pro Lys Ser 115 120
125 Ser Gly Asn Thr Asp Gln Gly Leu Met Lys Lys Leu Lys Glu
Phe Asp 130 135 140
Gly Leu Ala Met Ser Ile Gly Asn Gly His Ala Glu Ser Ala Glu Pro 145
150 155 160 Gly Gly Glu Asn Arg
Leu Ser Glu Ser Val Asp Thr Glu Gly Ser Ser 165
170 175 Asp Gly Ser Asp Gly Asn Thr Ser Gly Ala
Asn Gln Thr Arg Arg Lys 180 185
190 Arg Ser Arg Glu Gly Thr Pro Thr Thr Asp Gly Glu Gly Lys Thr
Glu 195 200 205 Met
Gln Gly Ser Pro Ile Ser Lys Glu Thr Ala Ala Ser Asn Lys Met 210
215 220 Leu Ala Val Val Thr Ala
Gly Val Ala Gly Thr Ile Val Gly Pro Val 225 230
235 240 Val Ser Ser Gly Met Thr Thr Thr Leu Glu Leu
Arg Asn Pro Ser Ser 245 250
255 Val His Ser Lys Ala Ser Ala Pro Gln Pro Cys Pro Val Leu Pro Ala
260 265 270 Glu Thr
Trp Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys 275
280 285 Gln Ser Asn Arg Glu Ser Ala
Arg Arg Ser Arg Leu Arg Lys Gln Ala 290 295
300 Glu Thr Glu Glu Leu Ala Arg Lys Val Glu Ser Leu
Asn Ala Glu Asn 305 310 315
320 Ala Thr Leu Lys Ser Glu Ile Asn Arg Leu Thr Glu Ser Ser Glu Lys
325 330 335 Met Arg Val
Glu Asn Ala Thr Leu Arg Gly Lys Leu Lys Asn Ala Gln 340
345 350 Leu Arg Gln Thr Gln Glu Ile Thr
Leu Asn Ile Ile Asp Ser Gln Arg 355 360
365 Ala Thr Pro Ile Ser Thr Glu Asn Leu Leu Ser Arg Val
Asn Asn Asn 370 375 380
Ser Gly Ser Asn Asp Arg Thr Val Glu Asp Glu Asn Gly Phe Cys Glu 385
390 395 400 Asn Lys Pro Asn
Ser Gly Ala Lys Leu His Gln Leu Leu Asp Thr Ser 405
410 415 Pro Arg Ala Asp Ala Val Ala Ala Gly
420 425 71558DNAHelianthus annuus
71atgtcaacga tactggtgtt agggttggag agagtgggaa gacagcttcg agttcaggga
60atgacggtgc cacacaaagc gtgtaacgta agcagtgccg aaagtggaaa taacggttca
120tctgatgcaa atgatgagga tacccaacag gaacattctg gaagcaaaaa gggaagcttc
180catcagatgc ttgcggacgc gaatgcacga aacaacaatc atgtagcttc tatgcctgct
240accaccaatc tttatatggg gatgaatatg tggaccccac ctactggttc tgtgtcacaa
300ccggtgaccc cgccaccggt aatggctgat cggtgggttc aggatgaacg agaattgaaa
360aggcagaaac ggaagcaaga caacagagag tcggctagaa gatcaaggat gcgcaagcag
420gctgagtgtg aagcgctaca agcaacagta gagacgctaa ataacgagaa tcactcactt
480agggatgagc tgcagaggct ttctgaggca tgtgggaagc ttacagctga aaatgattcg
540ataaaggatg agatttaa
55872185PRTHelianthus annuus 72Met Ser Thr Ile Leu Val Leu Gly Leu Glu
Arg Val Gly Arg Gln Leu 1 5 10
15 Arg Val Gln Gly Met Thr Val Pro His Lys Ala Cys Asn Val Ser
Ser 20 25 30 Ala
Glu Ser Gly Asn Asn Gly Ser Ser Asp Ala Asn Asp Glu Asp Thr 35
40 45 Gln Gln Glu His Ser Gly
Ser Lys Lys Gly Ser Phe His Gln Met Leu 50 55
60 Ala Asp Ala Asn Ala Arg Asn Asn Asn His Val
Ala Ser Met Pro Ala 65 70 75
80 Thr Thr Asn Leu Tyr Met Gly Met Asn Met Trp Thr Pro Pro Thr Gly
85 90 95 Ser Val
Ser Gln Pro Val Thr Pro Pro Pro Val Met Ala Asp Arg Trp 100
105 110 Val Gln Asp Glu Arg Glu Leu
Lys Arg Gln Lys Arg Lys Gln Asp Asn 115 120
125 Arg Glu Ser Ala Arg Arg Ser Arg Met Arg Lys Gln
Ala Glu Cys Glu 130 135 140
Ala Leu Gln Ala Thr Val Glu Thr Leu Asn Asn Glu Asn His Ser Leu 145
150 155 160 Arg Asp Glu
Leu Gln Arg Leu Ser Glu Ala Cys Gly Lys Leu Thr Ala 165
170 175 Glu Asn Asp Ser Ile Lys Asp Glu
Ile 180 185 731047DNAHelianthus annuus
73atgggaaact gtgaagaagc aaaggattgt aagcccgaag aaacatcttc accacctgcg
60gcttattacg gcccgagaat ggctgtgcca ccatacttca gttcacctgt tgcatctggt
120catgcacctc caccttatat gtggggtcca atgccgcata tgatgccccc ttacgctgca
180atttatccac atggaggtgt ttatgcacat cccggagtta ctgtggcggg taatcctaat
240cctacactcg taccaatcga ttcttgtgcc aagtcatcag ggaatccaga tagaggcttg
300ataaagaagt tgaaaggaat tgatgagcta acaatgtcga tagggaacaa caatggaatc
360tctaatagtc gagatacgga aggttccagt gaaggcagtg atggtaatac aagttgtaaa
420aatggtcaca aaaggatccg tggaggatca tctacaagtt ctgaaggtgg caagactgag
480cgaagtgatg cgaacggggt ctctaagaaa gtgacaggtg tcaagatttt gggtaaagaa
540gttggagcgg ttatttctgg tgactcggga accgagctag agcttaagaa gtcgccagca
600tctgccaaca tggccatggt gcccaacagt cttgtgcttc agaacgaacg agaactgaaa
660agggaaaaaa gaaagcagtc taaccgagaa tcagctagga ggtcgagatt aaggaaacag
720gccgaagccg aagaacttgg tacaagagtt gaagcactca ctagtgaaaa tttgaaactc
780aagtctgaga ttaaccagtt aaccgttaat gcaacaaact tgaagcttca aaacgctaaa
840ctactggaaa aacttaagaa agctacagaa ggcccaaggg ccgacaaaaa gggttcatct
900ctgagcactg ccaacctcct ctctagggtc gacaaccgat ctggttctgt agttggagag
960gccaccagtt ctggttctgg tgcaccgctg taccaacttc tggatgctag tccccgggcc
1020gatggtgttg cggctggtgc tggctaa
104774348PRTHelianthus annuus 74Met Gly Asn Cys Glu Glu Ala Lys Asp Cys
Lys Pro Glu Glu Thr Ser 1 5 10
15 Ser Pro Pro Ala Ala Tyr Tyr Gly Pro Arg Met Ala Val Pro Pro
Tyr 20 25 30 Phe
Ser Ser Pro Val Ala Ser Gly His Ala Pro Pro Pro Tyr Met Trp 35
40 45 Gly Pro Met Pro His Met
Met Pro Pro Tyr Ala Ala Ile Tyr Pro His 50 55
60 Gly Gly Val Tyr Ala His Pro Gly Val Thr Val
Ala Gly Asn Pro Asn 65 70 75
80 Pro Thr Leu Val Pro Ile Asp Ser Cys Ala Lys Ser Ser Gly Asn Pro
85 90 95 Asp Arg
Gly Leu Ile Lys Lys Leu Lys Gly Ile Asp Glu Leu Thr Met 100
105 110 Ser Ile Gly Asn Asn Asn Gly
Ile Ser Asn Ser Arg Asp Thr Glu Gly 115 120
125 Ser Ser Glu Gly Ser Asp Gly Asn Thr Ser Cys Lys
Asn Gly His Lys 130 135 140
Arg Ile Arg Gly Gly Ser Ser Thr Ser Ser Glu Gly Gly Lys Thr Glu 145
150 155 160 Arg Ser Asp
Ala Asn Gly Val Ser Lys Lys Val Thr Gly Val Lys Ile 165
170 175 Leu Gly Lys Glu Val Gly Ala Val
Ile Ser Gly Asp Ser Gly Thr Glu 180 185
190 Leu Glu Leu Lys Lys Ser Pro Ala Ser Ala Asn Met Ala
Met Val Pro 195 200 205
Asn Ser Leu Val Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Lys Arg 210
215 220 Lys Gln Ser Asn
Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln 225 230
235 240 Ala Glu Ala Glu Glu Leu Gly Thr Arg
Val Glu Ala Leu Thr Ser Glu 245 250
255 Asn Leu Lys Leu Lys Ser Glu Ile Asn Gln Leu Thr Val Asn
Ala Thr 260 265 270
Asn Leu Lys Leu Gln Asn Ala Lys Leu Leu Glu Lys Leu Lys Lys Ala
275 280 285 Thr Glu Gly Pro
Arg Ala Asp Lys Lys Gly Ser Ser Leu Ser Thr Ala 290
295 300 Asn Leu Leu Ser Arg Val Asp Asn
Arg Ser Gly Ser Val Val Gly Glu 305 310
315 320 Ala Thr Ser Ser Gly Ser Gly Ala Pro Leu Tyr Gln
Leu Leu Asp Ala 325 330
335 Ser Pro Arg Ala Asp Gly Val Ala Ala Gly Ala Gly 340
345 75747DNAHelianthus petiolaris 75atgggagctg
gtgaaggaag ctcaactgct aagcattcta aacctacctt atctcaggaa 60acacatccac
cctcctatgc tgattggtca gcctcaatgc aggcgtatta tggtggcgga 120gctcctccac
cattctttcc gtcgatcgtt gcatctcctc ctcctcatcc atacatgtgg 180ggaggccagc
atcctatgat gccaccatac ggggctccag ttccatatcc gactctatat 240tcaccgccgg
aagtttatgc tcatcccggc acacctatgg tagtccgcca aaaggtagca 300tcaagttcag
ggaacgacgg taacactacc caaagtgctg atagtgagaa tgatggttct 360tcagatgcca
atttgcagaa tgacaagtcg gggaatctcg ttgttcatac gcctgctacc 420gctaaaaatg
tagcagttga aatgcaatcg aatccgtctg atgctcctca aacagcggtc 480ccaccaccga
taatgggaca gtggccacgc caagatgaac gagaactgaa gagacagaag 540agaaagcaat
ctaatcgaga gtcagctagg agatcgagaa tgcgcaagca ggccgagtgt 600gaagagctac
aagcacgagt tgaggtacta agtaatgaaa atcacacgct caaagatgag 660ctgcagagac
tctctgagga aatgcagaag ctaacgtctg aaaatgattc tatgaaggga 720aagttaagtg
actattgtga accctga
74776248PRTHelianthus petiolaris 76Met Gly Ala Gly Glu Gly Ser Ser Thr
Ala Lys His Ser Lys Pro Thr 1 5 10
15 Leu Ser Gln Glu Thr His Pro Pro Ser Tyr Ala Asp Trp Ser
Ala Ser 20 25 30
Met Gln Ala Tyr Tyr Gly Gly Gly Ala Pro Pro Pro Phe Phe Pro Ser
35 40 45 Ile Val Ala Ser
Pro Pro Pro His Pro Tyr Met Trp Gly Gly Gln His 50
55 60 Pro Met Met Pro Pro Tyr Gly Ala
Pro Val Pro Tyr Pro Thr Leu Tyr 65 70
75 80 Ser Pro Pro Glu Val Tyr Ala His Pro Gly Thr Pro
Met Val Val Arg 85 90
95 Gln Lys Val Ala Ser Ser Ser Gly Asn Asp Gly Asn Thr Thr Gln Ser
100 105 110 Ala Asp Ser
Glu Asn Asp Gly Ser Ser Asp Ala Asn Leu Gln Asn Asp 115
120 125 Lys Ser Gly Asn Leu Val Val His
Thr Pro Ala Thr Ala Lys Asn Val 130 135
140 Ala Val Glu Met Gln Ser Asn Pro Ser Asp Ala Pro Gln
Thr Ala Val 145 150 155
160 Pro Pro Pro Ile Met Gly Gln Trp Pro Arg Gln Asp Glu Arg Glu Leu
165 170 175 Lys Arg Gln Lys
Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser 180
185 190 Arg Met Arg Lys Gln Ala Glu Cys Glu
Glu Leu Gln Ala Arg Val Glu 195 200
205 Val Leu Ser Asn Glu Asn His Thr Leu Lys Asp Glu Leu Gln
Arg Leu 210 215 220
Ser Glu Glu Met Gln Lys Leu Thr Ser Glu Asn Asp Ser Met Lys Gly 225
230 235 240 Lys Leu Ser Asp Tyr
Cys Glu Pro 245 771083DNAHelianthus tuberosus
77atgggaaact gtgaagaagc aaaggattgt aaacccgaag aaacatcttc acctcctgcg
60actggtcgtg tatatactga ttgggcgtcc atgcaggctt attatggccc gagaatggct
120gtgccaccat acttcaattc acctgttgca tctggtcatg cacctccacc ttatatgtgg
180ggtccaatgc cgcatatgat gcccccttac gctgcaattt atccacatgg aggtgtttat
240gcacatcccg gagttactgt ggcggataat cctaatcctg cactcatacc aatcgattct
300tctgccaagt catcagggaa tccagataga ggcttgataa agaagttgaa aggaattgat
360gagctaacaa tgtcgatagg gaacaacaat ggaatctcta atagtcgaga tacggaaggt
420tccagtgaag gcagtgatgg taatacaagt tgtaaaaatg aacacaaaag gatccgtgga
480ggatcatcta caagttctga aggtggcaag actgagcgaa gtgatgcgaa cggggtcact
540aagaaagtga aaggtgtcaa gattttgggt aaagaagttg gagcggttat ttctggtgac
600tcgggaaccg agctagagct taagaagtct ccagcatctc ccaacatgtc catggtgccc
660aacagtcttg tgcttcagaa cgaacgagaa ctgaaaaggg aaaaaagaaa gcagtctaac
720cgagaatcag ctaggaggtc aagattaagg aaacaggccg aagccgaaga acttggtaca
780agagttgaag cactcactag tgaaaatttg aaactcaagt ctgagattaa ccagttaact
840gttaatgcaa caaacttgaa gcttcaaaac gctaaactac tggaaaaact taagaaagct
900acagaaggcc caagggccga caaaaagggt tcatctctga gcactgctaa cctcctctct
960agggtcgaca accgttctgg ttcagtagtt ggagatgcca ccagttctgg ttctggtgca
1020ccgctgcacc aacttctgga tgctagtccc cgggccgatg ctgttgcggc tggtgctggc
1080taa
108378360PRTHelianthus tuberosus 78Met Gly Asn Cys Glu Glu Ala Lys Asp
Cys Lys Pro Glu Glu Thr Ser 1 5 10
15 Ser Pro Pro Ala Thr Gly Arg Val Tyr Thr Asp Trp Ala Ser
Met Gln 20 25 30
Ala Tyr Tyr Gly Pro Arg Met Ala Val Pro Pro Tyr Phe Asn Ser Pro
35 40 45 Val Ala Ser Gly
His Ala Pro Pro Pro Tyr Met Trp Gly Pro Met Pro 50
55 60 His Met Met Pro Pro Tyr Ala Ala
Ile Tyr Pro His Gly Gly Val Tyr 65 70
75 80 Ala His Pro Gly Val Thr Val Ala Asp Asn Pro Asn
Pro Ala Leu Ile 85 90
95 Pro Ile Asp Ser Ser Ala Lys Ser Ser Gly Asn Pro Asp Arg Gly Leu
100 105 110 Ile Lys Lys
Leu Lys Gly Ile Asp Glu Leu Thr Met Ser Ile Gly Asn 115
120 125 Asn Asn Gly Ile Ser Asn Ser Arg
Asp Thr Glu Gly Ser Ser Glu Gly 130 135
140 Ser Asp Gly Asn Thr Ser Cys Lys Asn Glu His Lys Arg
Ile Arg Gly 145 150 155
160 Gly Ser Ser Thr Ser Ser Glu Gly Gly Lys Thr Glu Arg Ser Asp Ala
165 170 175 Asn Gly Val Thr
Lys Lys Val Lys Gly Val Lys Ile Leu Gly Lys Glu 180
185 190 Val Gly Ala Val Ile Ser Gly Asp Ser
Gly Thr Glu Leu Glu Leu Lys 195 200
205 Lys Ser Pro Ala Ser Pro Asn Met Ser Met Val Pro Asn Ser
Leu Val 210 215 220
Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Lys Arg Lys Gln Ser Asn 225
230 235 240 Arg Glu Ser Ala Arg
Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala Glu 245
250 255 Glu Leu Gly Thr Arg Val Glu Ala Leu Thr
Ser Glu Asn Leu Lys Leu 260 265
270 Lys Ser Glu Ile Asn Gln Leu Thr Val Asn Ala Thr Asn Leu Lys
Leu 275 280 285 Gln
Asn Ala Lys Leu Leu Glu Lys Leu Lys Lys Ala Thr Glu Gly Pro 290
295 300 Arg Ala Asp Lys Lys Gly
Ser Ser Leu Ser Thr Ala Asn Leu Leu Ser 305 310
315 320 Arg Val Asp Asn Arg Ser Gly Ser Val Val Gly
Asp Ala Thr Ser Ser 325 330
335 Gly Ser Gly Ala Pro Leu His Gln Leu Leu Asp Ala Ser Pro Arg Ala
340 345 350 Asp Ala
Val Ala Ala Gly Ala Gly 355 360
79858DNAHelianthus tuberosus 79atgatggcac cttatgggac tccggttcca
tatgctgcta tgtatccacc agcaggagtt 60tatggtcatc ccggtatgcc tatgactcct
gacactgtgc agccgaccac agaaatggaa 120tcgaaggccc ccaatgggaa ggacaaggtc
aacaaaaagt caaaggggtc ttctggaaat 180gtcaatgcca gcggcgctaa gactagagag
agtgggaagg cggcttcaag ttcagggaat 240gacggtggca cccaaagtgc tgaaagtgga
aataatggtt catctgatgc aagcgatgaa 300gataaccagc aggattattc tggaggcaaa
aagggaagct tccatcaaat gcttgcagat 360gcgaatgcac gaaacaactc tggaccaaat
gttcagatgc cagtgcctgg aaatcccgta 420gtttctatgc ctgctaccaa tcttaatatg
gggatggacc tgtggaaccc gtctgctgga 480tccgggtcta tgaaaatgca accaaatcac
tctggtgtct cgcaaccagg ggttccacca 540ccaataatgg ctgatcagtg gggtcaggat
gagagagaat tgaagaggct gaaaaggaag 600caatccaata gagagtcagc tagaagatca
aggctacgca agcagactga gtgtgaagag 660ctacaggcaa gagtagagac actaaacaac
gagaatcact cactcagaga tgaactgcag 720aggctttctg aggaatgtgg gaagcttaca
gctgaaaatg attcaataaa ggatgaatta 780accaggtttt ttggacctga agctgtttcg
aaactcgatg cacatcttca atctcgaacg 840aacgaagatg aaagttga
85880285PRTHelianthus tuberosus 80Met
Met Ala Pro Tyr Gly Thr Pro Val Pro Tyr Ala Ala Met Tyr Pro 1
5 10 15 Pro Ala Gly Val Tyr Gly
His Pro Gly Met Pro Met Thr Pro Asp Thr 20
25 30 Val Gln Pro Thr Thr Glu Met Glu Ser Lys
Ala Pro Asn Gly Lys Asp 35 40
45 Lys Val Asn Lys Lys Ser Lys Gly Ser Ser Gly Asn Val Asn
Ala Ser 50 55 60
Gly Ala Lys Thr Arg Glu Ser Gly Lys Ala Ala Ser Ser Ser Gly Asn 65
70 75 80 Asp Gly Gly Thr Gln
Ser Ala Glu Ser Gly Asn Asn Gly Ser Ser Asp 85
90 95 Ala Ser Asp Glu Asp Asn Gln Gln Asp Tyr
Ser Gly Gly Lys Lys Gly 100 105
110 Ser Phe His Gln Met Leu Ala Asp Ala Asn Ala Arg Asn Asn Ser
Gly 115 120 125 Pro
Asn Val Gln Met Pro Val Pro Gly Asn Pro Val Val Ser Met Pro 130
135 140 Ala Thr Asn Leu Asn Met
Gly Met Asp Leu Trp Asn Pro Ser Ala Gly 145 150
155 160 Ser Gly Ser Met Lys Met Gln Pro Asn His Ser
Gly Val Ser Gln Pro 165 170
175 Gly Val Pro Pro Pro Ile Met Ala Asp Gln Trp Gly Gln Asp Glu Arg
180 185 190 Glu Leu
Lys Arg Leu Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 195
200 205 Arg Ser Arg Leu Arg Lys Gln
Thr Glu Cys Glu Glu Leu Gln Ala Arg 210 215
220 Val Glu Thr Leu Asn Asn Glu Asn His Ser Leu Arg
Asp Glu Leu Gln 225 230 235
240 Arg Leu Ser Glu Glu Cys Gly Lys Leu Thr Ala Glu Asn Asp Ser Ile
245 250 255 Lys Asp Glu
Leu Thr Arg Phe Phe Gly Pro Glu Ala Val Ser Lys Leu 260
265 270 Asp Ala His Leu Gln Ser Arg Thr
Asn Glu Asp Glu Ser 275 280 285
811278DNAMedicago truncatula 81atgggaaata gcgatgaaga gaaatctacc
aagactgaaa aaccttcttc acctgtaaca 60gtggatcaga ccaaccagac gaatgttcat
gtctatcctg attgggcagc catgcaggca 120tattatggac caagagttgc catgccacct
tactacaact cacctgtggc ttctggtcac 180actcctcacc catatatgtg ggggccacca
cagcctatga tgccacctta tggacatcct 240tatgcagcaa tgtatccaca tggaggggtt
tatactcacc ctgcagttcc tattggtcca 300catccacata gtcaaggaat ttcatcttca
cctgctactg gaactccttt aagcatagag 360acacctccca aatcatctgg aaatactgat
cagggtctga tgaagaaatt gaaagggttt 420gatggacttg caatgtcaat aggcaatggc
catgctgaga gtgctgagcc tggagctgaa 480agcaggcaat cacagagtgt gaatactgag
ggttcgagtg atggaagtga cggaaacact 540tcaggggcta atcaaacaag aaggaaaaga
agccgtgagg gaacaccaac cactgatgga 600gaagggaaaa caaatacaca aggtagtcaa
atttccaaag aaattgcagc ttctgataag 660atgatggcag tagcccctgc tggtgtcaca
ggtcaactag ttggacctgt agcttcttca 720gcgatgacca ccgcactgga gctgagaaat
tcttctagtg ttcattctaa aacaaatccc 780acaagtaccc cacaaccttc tgctgtattg
cctcccgagg cttggataca gaatgagcgt 840gaactgaagc gtgagaggag gaaacaatca
aatcgagaat ctgctagaag gtccagacta 900aggaagcagg ctgaggctga agaactggca
cgaaaagttg aatccttgaa tgctgagagt 960gcgtcactta gatcagaaat aaaccgactt
gctgagaatt ctgaaagact gaggatggaa 1020aatgctgcat taaaggaaaa atttaaaatt
gctaaactgg gacaaccgaa agagataatt 1080ttgaccaaca ttgacagcca gaggaccaca
cctgtaagta cagaaaactt attatcaaga 1140gttaataaca attcgggttc taatgataga
actgtggagg atgagaatgg ttattgtgac 1200aacaaaccaa attctggtgc gaagttgcac
caactattgg atgcaagtcc tagagctgat 1260gctgtggctg ctggttga
127882425PRTMedicago truncatula 82Met
Gly Asn Ser Asp Glu Glu Lys Ser Thr Lys Thr Glu Lys Pro Ser 1
5 10 15 Ser Pro Val Thr Val Asp
Gln Thr Asn Gln Thr Asn Val His Val Tyr 20
25 30 Pro Asp Trp Ala Ala Met Gln Ala Tyr Tyr
Gly Pro Arg Val Ala Met 35 40
45 Pro Pro Tyr Tyr Asn Ser Pro Val Ala Ser Gly His Thr Pro
His Pro 50 55 60
Tyr Met Trp Gly Pro Pro Gln Pro Met Met Pro Pro Tyr Gly His Pro 65
70 75 80 Tyr Ala Ala Met Tyr
Pro His Gly Gly Val Tyr Thr His Pro Ala Val 85
90 95 Pro Ile Gly Pro His Pro His Ser Gln Gly
Ile Ser Ser Ser Pro Ala 100 105
110 Thr Gly Thr Pro Leu Ser Ile Glu Thr Pro Pro Lys Ser Ser Gly
Asn 115 120 125 Thr
Asp Gln Gly Leu Met Lys Lys Leu Lys Gly Phe Asp Gly Leu Ala 130
135 140 Met Ser Ile Gly Asn Gly
His Ala Glu Ser Ala Glu Pro Gly Ala Glu 145 150
155 160 Ser Arg Gln Ser Gln Ser Val Asn Thr Glu Gly
Ser Ser Asp Gly Ser 165 170
175 Asp Gly Asn Thr Ser Gly Ala Asn Gln Thr Arg Arg Lys Arg Ser Arg
180 185 190 Glu Gly
Thr Pro Thr Thr Asp Gly Glu Gly Lys Thr Asn Thr Gln Gly 195
200 205 Ser Gln Ile Ser Lys Glu Ile
Ala Ala Ser Asp Lys Met Met Ala Val 210 215
220 Ala Pro Ala Gly Val Thr Gly Gln Leu Val Gly Pro
Val Ala Ser Ser 225 230 235
240 Ala Met Thr Thr Ala Leu Glu Leu Arg Asn Ser Ser Ser Val His Ser
245 250 255 Lys Thr Asn
Pro Thr Ser Thr Pro Gln Pro Ser Ala Val Leu Pro Pro 260
265 270 Glu Ala Trp Ile Gln Asn Glu Arg
Glu Leu Lys Arg Glu Arg Arg Lys 275 280
285 Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg
Lys Gln Ala 290 295 300
Glu Ala Glu Glu Leu Ala Arg Lys Val Glu Ser Leu Asn Ala Glu Ser 305
310 315 320 Ala Ser Leu Arg
Ser Glu Ile Asn Arg Leu Ala Glu Asn Ser Glu Arg 325
330 335 Leu Arg Met Glu Asn Ala Ala Leu Lys
Glu Lys Phe Lys Ile Ala Lys 340 345
350 Leu Gly Gln Pro Lys Glu Ile Ile Leu Thr Asn Ile Asp Ser
Gln Arg 355 360 365
Thr Thr Pro Val Ser Thr Glu Asn Leu Leu Ser Arg Val Asn Asn Asn 370
375 380 Ser Gly Ser Asn Asp
Arg Thr Val Glu Asp Glu Asn Gly Tyr Cys Asp 385 390
395 400 Asn Lys Pro Asn Ser Gly Ala Lys Leu His
Gln Leu Leu Asp Ala Ser 405 410
415 Pro Arg Ala Asp Ala Val Ala Ala Gly 420
425 831023DNAMedicago truncatula 83atggggacta aggaggatag
cacaactaaa ccttctaaaa catcttcatc aactcaggaa 60gtaccaacac caacagtaca
accatcatat ccagattggt caacctccat gcaggcctac 120tataatcctg gagccgctcc
gcctccctat tatgcctcaa ctgtggcttc accaaccccg 180catccttata tgtggggagg
ccagcatcct atgatggcac catacgggac tccagttccg 240tatcctgcaa tgttccctcc
tgggaatatc tatgctcatc ctagcatggt agtgactcca 300agtgctatgc accaaactac
agagtttgaa gggaagggac ctgatggaaa ggataaggat 360tcatctaaaa aaccgaaggg
cacttctgca aatacaagcg ccaaagcagg agagggtgga 420aaggcaggat caggttcagg
caatgatggc ttttcacata gtggtgacag tggttcagag 480ggttcatcta atgctagcga
tgaaaaccaa caggaatcag ctagaaacaa gaagggaagc 540tttgacctca tgcttgttga
tggagccaac gcgcagaaca atactactgg acccatttct 600caatcatctg ttcctggaaa
tcctgttgtc tcgatacctg caactaatct taatattgga 660atggacttat ggaatgcatc
ttctgctggt gctgaagccg ccaaaatgag acacaatcaa 720cctggtgctc ctggagctgg
tgcacttggt gaacagtgga tgcaacaaga tgatcgtgag 780ttgaaaagac agaagagaaa
acagtctaat cgagagtcag ccaggaggtc aagactacgc 840aagcaggccg agtgtgaaga
actacaaaag agggtggagg cgctgggagg tgagaatcga 900actctcagag aagagcttca
gaaactttct gaagagtgtg agaagcttac atctgaaaac 960gattctatta aggaagactt
ggaacggttg tgtgggcctg aagtagttgc taaccttgaa 1020tga
102384340PRTMedicago
truncatula 84Met Gly Thr Lys Glu Asp Ser Thr Thr Lys Pro Ser Lys Thr Ser
Ser 1 5 10 15 Ser
Thr Gln Glu Val Pro Thr Pro Thr Val Gln Pro Ser Tyr Pro Asp
20 25 30 Trp Ser Thr Ser Met
Gln Ala Tyr Tyr Asn Pro Gly Ala Ala Pro Pro 35
40 45 Pro Tyr Tyr Ala Ser Thr Val Ala Ser
Pro Thr Pro His Pro Tyr Met 50 55
60 Trp Gly Gly Gln His Pro Met Met Ala Pro Tyr Gly Thr
Pro Val Pro 65 70 75
80 Tyr Pro Ala Met Phe Pro Pro Gly Asn Ile Tyr Ala His Pro Ser Met
85 90 95 Val Val Thr Pro
Ser Ala Met His Gln Thr Thr Glu Phe Glu Gly Lys 100
105 110 Gly Pro Asp Gly Lys Asp Lys Asp Ser
Ser Lys Lys Pro Lys Gly Thr 115 120
125 Ser Ala Asn Thr Ser Ala Lys Ala Gly Glu Gly Gly Lys Ala
Gly Ser 130 135 140
Gly Ser Gly Asn Asp Gly Phe Ser His Ser Gly Asp Ser Gly Ser Glu 145
150 155 160 Gly Ser Ser Asn Ala
Ser Asp Glu Asn Gln Gln Glu Ser Ala Arg Asn 165
170 175 Lys Lys Gly Ser Phe Asp Leu Met Leu Val
Asp Gly Ala Asn Ala Gln 180 185
190 Asn Asn Thr Thr Gly Pro Ile Ser Gln Ser Ser Val Pro Gly Asn
Pro 195 200 205 Val
Val Ser Ile Pro Ala Thr Asn Leu Asn Ile Gly Met Asp Leu Trp 210
215 220 Asn Ala Ser Ser Ala Gly
Ala Glu Ala Ala Lys Met Arg His Asn Gln 225 230
235 240 Pro Gly Ala Pro Gly Ala Gly Ala Leu Gly Glu
Gln Trp Met Gln Gln 245 250
255 Asp Asp Arg Glu Leu Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu
260 265 270 Ser Ala
Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu 275
280 285 Gln Lys Arg Val Glu Ala Leu
Gly Gly Glu Asn Arg Thr Leu Arg Glu 290 295
300 Glu Leu Gln Lys Leu Ser Glu Glu Cys Glu Lys Leu
Thr Ser Glu Asn 305 310 315
320 Asp Ser Ile Lys Glu Asp Leu Glu Arg Leu Cys Gly Pro Glu Val Val
325 330 335 Ala Asn Leu
Glu 340 85372DNANicotiana tabacummisc_feature(300)..(300)n is
a, c, g, or t 85atgttcatgg gcggtttgac ttgtttgctt gctccaactt tctggaatcc
aaaatgtact 60caacaatttc ctttaggatt ttctggcagt catgttgttt tctgtttctt
atttttgcag 120gctgagtgtg aagagctgca gcgtagggta gaagctttga gcagcgagaa
tcattcactc 180aaagatgagc tccaacggct ctctgaggaa tgtgagaagc ttacctcaga
gaatagtttg 240ataaaggtac ttaatacctt gttaagtgtt cctctaatat gcttatcctc
cttgatattn 300ggtaatattc tcaagtaccc tcttcatgtc atcccctttt tcctttccct
catttacctt 360tgcactttat ag
37286123PRTNicotiana tabacummisc_feature(100)..(100)Xaa can
be any naturally occurring amino acid 86Met Phe Met Gly Gly Leu Thr Cys
Leu Leu Ala Pro Thr Phe Trp Asn 1 5 10
15 Pro Lys Cys Thr Gln Gln Phe Pro Leu Gly Phe Ser Gly
Ser His Val 20 25 30
Val Phe Cys Phe Leu Phe Leu Gln Ala Glu Cys Glu Glu Leu Gln Arg
35 40 45 Arg Val Glu Ala
Leu Ser Ser Glu Asn His Ser Leu Lys Asp Glu Leu 50
55 60 Gln Arg Leu Ser Glu Glu Cys Glu
Lys Leu Thr Ser Glu Asn Ser Leu 65 70
75 80 Ile Lys Val Leu Asn Thr Leu Leu Ser Val Pro Leu
Ile Cys Leu Ser 85 90
95 Ser Leu Ile Xaa Gly Asn Ile Leu Lys Tyr Pro Leu His Val Ile Pro
100 105 110 Phe Phe Leu
Ser Leu Ile Tyr Leu Cys Thr Leu 115 120
871284DNANicotiana tabacum 87atgggaaata gtgaggacgg gaaatcttgt
aagcctgaga aatcatcttc gaccgcacca 60gaccagagca atattcacgt gtatcctgat
tgggcggcta tgcaggcata ttatggtcca 120cgggtagctg tacctccata tgttaattct
cctgttgcac ctggtcaagc tcctcatcct 180tgtatgtggg gaccgctaca gcctatgatg
ccaccttatg gtataccata tgcaggaatc 240tatgcgcatg gtggtgttta tgcgcaccct
ggagttccta tcgtgtctcg tcctcaggct 300catgtaatga catcatctcc tgctgtcagc
caaaccatgg atgctgcttc tttgagtatg 360gacccttctg ctaagacttc gggggatacg
aatcaaggct tgatgagtaa gttaaaaggt 420tctgatgggc ttggaatgtc aataggaaat
tgcagcgttg acaatggcga cggtactgac 480catggacctt ctcagagtga cagtgggcaa
acggaaggtt caagtgatgg aagtaacata 540cacacagcag aggtgggtga gaagagtaag
aaaagaagcc gcgagacgac tcctaatacc 600tctggtgacg gaaagagtcg gacacgaagc
agtccacaac ctagggaagt aaatggggct 660accaagaagg aaacttctat agcttttaat
cctggtaaca tagcagagaa agtagtcgga 720acagtatttt ctccaaccat gactactact
ctggaactga gaaatcctgt cggtacacta 780gtgaaagcta gtccaactaa tgtttcacga
attagtcctg cagtgccggg cgaagcttgg 840ttacagaatg aacgtgagat gaagcgggag
aagaggaagc agtctaatcg ggaatctgca 900aggagatcaa ggttgagaaa gcagggagaa
gctgaagaat tggcaatacg agttcaatct 960ttaacctccg aaaatttggg cctcaaatca
gagataaata atttcactga aaattctgcg 1020aaactaaagc ttgaaaattc cgctttaatg
gagagactgc aaaataaaca acgaggacaa 1080gcagaagagg taactttagg caagattggt
gataagaggc tgcaacctgt tagcacagca 1140gacctattag caagagtcaa caactctggt
ccgttggata gaaccaacaa agacgatgaa 1200attcatgaga ataatacttc aggagcaaag
cttcatcaac ttcttgatgc tagccacaga 1260actgatgctg tggctgctag atga
128488427PRTNicotiana tabacum 88Met Gly
Asn Ser Glu Asp Gly Lys Ser Cys Lys Pro Glu Lys Ser Ser 1 5
10 15 Ser Thr Ala Pro Asp Gln Ser
Asn Ile His Val Tyr Pro Asp Trp Ala 20 25
30 Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Ala Val
Pro Pro Tyr Val 35 40 45
Asn Ser Pro Val Ala Pro Gly Gln Ala Pro His Pro Cys Met Trp Gly
50 55 60 Pro Leu Gln
Pro Met Met Pro Pro Tyr Gly Ile Pro Tyr Ala Gly Ile 65
70 75 80 Tyr Ala His Gly Gly Val Tyr
Ala His Pro Gly Val Pro Ile Val Ser 85
90 95 Arg Pro Gln Ala His Val Met Thr Ser Ser Pro
Ala Val Ser Gln Thr 100 105
110 Met Asp Ala Ala Ser Leu Ser Met Asp Pro Ser Ala Lys Thr Ser
Gly 115 120 125 Asp
Thr Asn Gln Gly Leu Met Ser Lys Leu Lys Gly Ser Asp Gly Leu 130
135 140 Gly Met Ser Ile Gly Asn
Cys Ser Val Asp Asn Gly Asp Gly Thr Asp 145 150
155 160 His Gly Pro Ser Gln Ser Asp Ser Gly Gln Thr
Glu Gly Ser Ser Asp 165 170
175 Gly Ser Asn Ile His Thr Ala Glu Val Gly Glu Lys Ser Lys Lys Arg
180 185 190 Ser Arg
Glu Thr Thr Pro Asn Thr Ser Gly Asp Gly Lys Ser Arg Thr 195
200 205 Arg Ser Ser Pro Gln Pro Arg
Glu Val Asn Gly Ala Thr Lys Lys Glu 210 215
220 Thr Ser Ile Ala Phe Asn Pro Gly Asn Ile Ala Glu
Lys Val Val Gly 225 230 235
240 Thr Val Phe Ser Pro Thr Met Thr Thr Thr Leu Glu Leu Arg Asn Pro
245 250 255 Val Gly Thr
Leu Val Lys Ala Ser Pro Thr Asn Val Ser Arg Ile Ser 260
265 270 Pro Ala Val Pro Gly Glu Ala Trp
Leu Gln Asn Glu Arg Glu Met Lys 275 280
285 Arg Glu Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg
Arg Ser Arg 290 295 300
Leu Arg Lys Gln Gly Glu Ala Glu Glu Leu Ala Ile Arg Val Gln Ser 305
310 315 320 Leu Thr Ser Glu
Asn Leu Gly Leu Lys Ser Glu Ile Asn Asn Phe Thr 325
330 335 Glu Asn Ser Ala Lys Leu Lys Leu Glu
Asn Ser Ala Leu Met Glu Arg 340 345
350 Leu Gln Asn Lys Gln Arg Gly Gln Ala Glu Glu Val Thr Leu
Gly Lys 355 360 365
Ile Gly Asp Lys Arg Leu Gln Pro Val Ser Thr Ala Asp Leu Leu Ala 370
375 380 Arg Val Asn Asn Ser
Gly Pro Leu Asp Arg Thr Asn Lys Asp Asp Glu 385 390
395 400 Ile His Glu Asn Asn Thr Ser Gly Ala Lys
Leu His Gln Leu Leu Asp 405 410
415 Ala Ser His Arg Thr Asp Ala Val Ala Ala Arg 420
425 89483DNANicotiana tabacum 89atgtttcaca
ctcagcctgc actgccaaat gaagcctggt tacagaatga acgtgagctg 60aagcgggaga
aaaggaaaca gtctaatcgg gaatctgcaa ggcgatcaag attgagaaaa 120caggctgaag
ctgaagaatt ggcaatacga gttcagtctt taacagggga aaacatgaca 180ctcaaatctg
agataaacaa attaatggag aactcggaga aacttaagct agaaaatgct 240gctttaatgg
agaaactgaa caatgaacag ctaagcccga cagaagaagt gagtttaggt 300aagattgatg
ataagagggt gcaacctgta ggcaccgcaa acctactagc aagagtcaat 360aactctggtt
ccttaaatag agcaaacgag gagagtgaag tttatgagaa caatagttct 420ggagcaaagc
ttcatcaact actcgattcc agccccagaa ctgatgcagt ggctgctggg 480tga
48390160PRTNicotiana tabacum 90Met Phe His Thr Gln Pro Ala Leu Pro Asn
Glu Ala Trp Leu Gln Asn 1 5 10
15 Glu Arg Glu Leu Lys Arg Glu Lys Arg Lys Gln Ser Asn Arg Glu
Ser 20 25 30 Ala
Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu Ala Glu Glu Leu Ala 35
40 45 Ile Arg Val Gln Ser Leu
Thr Gly Glu Asn Met Thr Leu Lys Ser Glu 50 55
60 Ile Asn Lys Leu Met Glu Asn Ser Glu Lys Leu
Lys Leu Glu Asn Ala 65 70 75
80 Ala Leu Met Glu Lys Leu Asn Asn Glu Gln Leu Ser Pro Thr Glu Glu
85 90 95 Val Ser
Leu Gly Lys Ile Asp Asp Lys Arg Val Gln Pro Val Gly Thr 100
105 110 Ala Asn Leu Leu Ala Arg Val
Asn Asn Ser Gly Ser Leu Asn Arg Ala 115 120
125 Asn Glu Glu Ser Glu Val Tyr Glu Asn Asn Ser Ser
Gly Ala Lys Leu 130 135 140
His Gln Leu Leu Asp Ser Ser Pro Arg Thr Asp Ala Val Ala Ala Gly 145
150 155 160
911041DNANicotiana tabacum 91atgggagctg gggaagagag cacccctaca aagccttcaa
aaccggcttc aactcaggag 60acacaaacta caccctcata tcctgattgg tcaagctcta
tgcaggctta ttatagcgct 120ggagctactc ctcccttctt tgcctcacct gttgcttctc
ctgcttccca cccatacttg 180tggggaggcc agcatcctct tatgcctcct tatggggctc
cagtcccgta tccagcttta 240tatcctcctg ctggagttta tgctcatcct aacatggcca
cgcacactcc aaacgctgtg 300caggcaaatc ttgaatcaaa caggaaggat cctgaaggaa
aggatcggag tacgaacaaa 360aagttaaagg ccagttctgg tggcaaggca ggcgacagcg
ggaaagttgc ttcaggttct 420ggaaatgatg gtgccacaca aagtgatgaa accagaagtg
aaggtacatc agatacaaat 480gatgaaaatg ataaccacga atttgctgca agcaagaagg
gaagctttga tcaaatgctt 540gccgatggag ccagtgcgca gaataatccc acaacagcga
attaccagac ctctatgcat 600gcaaatcctg tcactgtgca tgcaactaac ctaaatattg
gaatggatgt gtggaatgca 660tcatctgccg gtcctggagc gatcataata cagccaaatg
tgaatggtcc agttatagga 720catgaaggaa ggatgaatga tcaatgggtt caggacgaac
gtgaacttaa aagacaaaag 780agaaagcaat ctaataggga gtcagctagg aggtcaaggc
tacgcaagca ggctgagtgt 840gaagagctgc agcgtagggt agaagctttg agcagcgaga
atcattcact caaagatgag 900ctccaacggc tctctgagga atgtgagaag cttacctcag
agaatagttt gataaaggaa 960gagttaacgc gtttatgtgg gccagatgct gtgtctaagc
tagagagcaa cggcaatgcc 1020actcatgagg aagctagtta a
104192346PRTNicotiana tabacum 92Met Gly Ala Gly Glu
Glu Ser Thr Pro Thr Lys Pro Ser Lys Pro Ala 1 5
10 15 Ser Thr Gln Glu Thr Gln Thr Thr Pro Ser
Tyr Pro Asp Trp Ser Ser 20 25
30 Ser Met Gln Ala Tyr Tyr Ser Ala Gly Ala Thr Pro Pro Phe Phe
Ala 35 40 45 Ser
Pro Val Ala Ser Pro Ala Ser His Pro Tyr Leu Trp Gly Gly Gln 50
55 60 His Pro Leu Met Pro Pro
Tyr Gly Ala Pro Val Pro Tyr Pro Ala Leu 65 70
75 80 Tyr Pro Pro Ala Gly Val Tyr Ala His Pro Asn
Met Ala Thr His Thr 85 90
95 Pro Asn Ala Val Gln Ala Asn Leu Glu Ser Asn Arg Lys Asp Pro Glu
100 105 110 Gly Lys
Asp Arg Ser Thr Asn Lys Lys Leu Lys Ala Ser Ser Gly Gly 115
120 125 Lys Ala Gly Asp Ser Gly Lys
Val Ala Ser Gly Ser Gly Asn Asp Gly 130 135
140 Ala Thr Gln Ser Asp Glu Thr Arg Ser Glu Gly Thr
Ser Asp Thr Asn 145 150 155
160 Asp Glu Asn Asp Asn His Glu Phe Ala Ala Ser Lys Lys Gly Ser Phe
165 170 175 Asp Gln Met
Leu Ala Asp Gly Ala Ser Ala Gln Asn Asn Pro Thr Thr 180
185 190 Ala Asn Tyr Gln Thr Ser Met His
Ala Asn Pro Val Thr Val His Ala 195 200
205 Thr Asn Leu Asn Ile Gly Met Asp Val Trp Asn Ala Ser
Ser Ala Gly 210 215 220
Pro Gly Ala Ile Ile Ile Gln Pro Asn Val Asn Gly Pro Val Ile Gly 225
230 235 240 His Glu Gly Arg
Met Asn Asp Gln Trp Val Gln Asp Glu Arg Glu Leu 245
250 255 Lys Arg Gln Lys Arg Lys Gln Ser Asn
Arg Glu Ser Ala Arg Arg Ser 260 265
270 Arg Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Arg Arg
Val Glu 275 280 285
Ala Leu Ser Ser Glu Asn His Ser Leu Lys Asp Glu Leu Gln Arg Leu 290
295 300 Ser Glu Glu Cys Glu
Lys Leu Thr Ser Glu Asn Ser Leu Ile Lys Glu 305 310
315 320 Glu Leu Thr Arg Leu Cys Gly Pro Asp Ala
Val Ser Lys Leu Glu Ser 325 330
335 Asn Gly Asn Ala Thr His Glu Glu Ala Ser 340
345 93672DNAPinus taeda 93atggctttgg ttccaccagc
acctgttcca ggaatgtcaa cagttagtct acctactaca 60aatctaaata ttggaatggg
cgtctataat gtgcctgccc aaggatctgt aactcctgta 120aaaggaagac agggaacaac
taacagtgta tcaacaattg ttcctgcagc atctcaactc 180attccaggtc atgatggagt
gccttctgaa ttatgggtcc aggatgaacg ggaattaaaa 240cgacagaagc gtaaacaatc
taatcgggag tcagctaaac gctctcgaat gaagaaacag 300atggaatgcg aagagttgtc
tgcaaaagtt gagacactga cttctgagaa catggcactc 360agaaatgaaa taaatcttat
agcagaggaa tctgaaaagc ttgcttctga gaatgaatca 420ctaaaggtaa tgttgaggaa
ttatcaaaga gaggatagag taggagattc agaaagaaat 480ggtcctagag agacacagtc
acagtttatg cagcaaggag gcaatggcta tcctcaagtt 540ctctctaaaa taagcaattt
gagttctagc caaagagatg agcaaaggga gagtgagatg 600agcgattcaa ctggtaaatg
tcatgccgta ttggaagcaa atgttcgatc tgatagagtg 660gctgcaggtt aa
67294223PRTPinus taeda 94Met
Ala Leu Val Pro Pro Ala Pro Val Pro Gly Met Ser Thr Val Ser 1
5 10 15 Leu Pro Thr Thr Asn Leu
Asn Ile Gly Met Gly Val Tyr Asn Val Pro 20
25 30 Ala Gln Gly Ser Val Thr Pro Val Lys Gly
Arg Gln Gly Thr Thr Asn 35 40
45 Ser Val Ser Thr Ile Val Pro Ala Ala Ser Gln Leu Ile Pro
Gly His 50 55 60
Asp Gly Val Pro Ser Glu Leu Trp Val Gln Asp Glu Arg Glu Leu Lys 65
70 75 80 Arg Gln Lys Arg Lys
Gln Ser Asn Arg Glu Ser Ala Lys Arg Ser Arg 85
90 95 Met Lys Lys Gln Met Glu Cys Glu Glu Leu
Ser Ala Lys Val Glu Thr 100 105
110 Leu Thr Ser Glu Asn Met Ala Leu Arg Asn Glu Ile Asn Leu Ile
Ala 115 120 125 Glu
Glu Ser Glu Lys Leu Ala Ser Glu Asn Glu Ser Leu Lys Val Met 130
135 140 Leu Arg Asn Tyr Gln Arg
Glu Asp Arg Val Gly Asp Ser Glu Arg Asn 145 150
155 160 Gly Pro Arg Glu Thr Gln Ser Gln Phe Met Gln
Gln Gly Gly Asn Gly 165 170
175 Tyr Pro Gln Val Leu Ser Lys Ile Ser Asn Leu Ser Ser Ser Gln Arg
180 185 190 Asp Glu
Gln Arg Glu Ser Glu Met Ser Asp Ser Thr Gly Lys Cys His 195
200 205 Ala Val Leu Glu Ala Asn Val
Arg Ser Asp Arg Val Ala Ala Gly 210 215
220 95573DNAPopulus trichocarpa 95atggctcatc aggaatatgg
tgcaagcaag aagggaagct tcaaccagat gcttgcagat 60gctaatgcac aaagtacctc
agctggagca aatatccaag cttctgtgcc tgggaaacct 120gtggcgtcta tgcctgcaac
taatttaaac attgggatgg acttatggaa tgcatcttct 180gctgctggag ctacaaaaat
gagaccaaat ccatcttgtg ccacatctgg agttgttcct 240gctggattgc ctgaacaatg
gattcaagat gaacgtgaat tgaaaagaca gaagaggaaa 300caatctaata gagagtcagc
cagaaggtcc agattacgca aacaggcaga gtgcgaggag 360ctacaagcca gggtacagaa
tttgagcagt gacaatagca atctcagaaa tgaattgcag 420agtctctctg aagaatgcaa
taagcttaaa tccgaaaatg attccattaa ggaggagttg 480actcggttgt atggaccaga
agttgtagct aaacttgaac agagcaaccc tgcttcggtt 540ccagagtctc atggtggtga
gggagacagt tga 57396190PRTPopulus
trichocarpa 96Met Ala His Gln Glu Tyr Gly Ala Ser Lys Lys Gly Ser Phe Asn
Gln 1 5 10 15 Met
Leu Ala Asp Ala Asn Ala Gln Ser Thr Ser Ala Gly Ala Asn Ile
20 25 30 Gln Ala Ser Val Pro
Gly Lys Pro Val Ala Ser Met Pro Ala Thr Asn 35
40 45 Leu Asn Ile Gly Met Asp Leu Trp Asn
Ala Ser Ser Ala Ala Gly Ala 50 55
60 Thr Lys Met Arg Pro Asn Pro Ser Cys Ala Thr Ser Gly
Val Val Pro 65 70 75
80 Ala Gly Leu Pro Glu Gln Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg
85 90 95 Gln Lys Arg Lys
Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu 100
105 110 Arg Lys Gln Ala Glu Cys Glu Glu Leu
Gln Ala Arg Val Gln Asn Leu 115 120
125 Ser Ser Asp Asn Ser Asn Leu Arg Asn Glu Leu Gln Ser Leu
Ser Glu 130 135 140
Glu Cys Asn Lys Leu Lys Ser Glu Asn Asp Ser Ile Lys Glu Glu Leu 145
150 155 160 Thr Arg Leu Tyr Gly
Pro Glu Val Val Ala Lys Leu Glu Gln Ser Asn 165
170 175 Pro Ala Ser Val Pro Glu Ser His Gly Gly
Glu Gly Asp Ser 180 185 190
97852DNAPoncirus trifoliata 97atgccaccat atggcacccc agttccatac caagctatat
atcctccagg gggagtatat 60gcacatccta gcatggctac gactccaaca gttgcaccaa
caaatacaga gctggaaggg 120aagggacctg aagcaaagga ccgggcttca gctaaaaaat
ccaagggaac tccaggaggt 180aaggctggag agattgtaaa ggcaacttct ggttctggga
acgatggtgt ctctcaaagt 240gctgaaagtg gtagtgacgg ttcatctgat gcgagtgatg
agaatggtaa ccagcaggag 300tttgctgggg gtaagaaagg aagctttgac cagatgcttg
cagatgccaa cacggagaat 360aacacagtag aagctgttcc aggatcagtg cccgggaagc
ctgtagtctc aatgcctgca 420actaatctca atattggcat ggatttgtgg aatacatccc
ctgctgctgc tggagctgca 480aaaatgagaa caaatccatc tggggcctca ccagcagttg
ctccagctgg cataatacct 540gatcaatgga ttcaagatga acgtgaattg aaaagacaga
aaaggaagca atctaatagg 600gagtcagcca gaaggtcaag gttacgcaag caggcggaat
gtgaggagct acaggccaga 660gtggagagtt tgagcaatga gaatcgcaac cttagagatg
agttgcagag gctttctgag 720gaatgcgaga agcttacatc tgaaaataat tccattaagg
aagacttatc tcggttgtgt 780ggaccagagg cagttgctaa tcttgagcag agcaacccca
ctcagtcgtg cggggaagaa 840gaaaatagct aa
85298283PRTPoncirus trifoliata 98Met Pro Pro Tyr
Gly Thr Pro Val Pro Tyr Gln Ala Ile Tyr Pro Pro 1 5
10 15 Gly Gly Val Tyr Ala His Pro Ser Met
Ala Thr Thr Pro Thr Val Ala 20 25
30 Pro Thr Asn Thr Glu Leu Glu Gly Lys Gly Pro Glu Ala Lys
Asp Arg 35 40 45
Ala Ser Ala Lys Lys Ser Lys Gly Thr Pro Gly Gly Lys Ala Gly Glu 50
55 60 Ile Val Lys Ala Thr
Ser Gly Ser Gly Asn Asp Gly Val Ser Gln Ser 65 70
75 80 Ala Glu Ser Gly Ser Asp Gly Ser Ser Asp
Ala Ser Asp Glu Asn Gly 85 90
95 Asn Gln Gln Glu Phe Ala Gly Gly Lys Lys Gly Ser Phe Asp Gln
Met 100 105 110 Leu
Ala Asp Ala Asn Thr Glu Asn Asn Thr Val Glu Ala Val Pro Gly 115
120 125 Ser Val Pro Gly Lys Pro
Val Val Ser Met Pro Ala Thr Asn Leu Asn 130 135
140 Ile Gly Met Asp Leu Trp Asn Thr Ser Pro Ala
Ala Ala Gly Ala Ala 145 150 155
160 Lys Met Arg Thr Asn Pro Ser Gly Ala Ser Pro Ala Val Ala Pro Ala
165 170 175 Gly Ile
Ile Pro Asp Gln Trp Ile Gln Asp Glu Arg Glu Leu Lys Arg 180
185 190 Gln Lys Arg Lys Gln Ser Asn
Arg Glu Ser Ala Arg Arg Ser Arg Leu 195 200
205 Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Ala Arg
Val Glu Ser Leu 210 215 220
Ser Asn Glu Asn Arg Asn Leu Arg Asp Glu Leu Gln Arg Leu Ser Glu 225
230 235 240 Glu Cys Glu
Lys Leu Thr Ser Glu Asn Asn Ser Ile Lys Glu Asp Leu 245
250 255 Ser Arg Leu Cys Gly Pro Glu Ala
Val Ala Asn Leu Glu Gln Ser Asn 260 265
270 Pro Thr Gln Ser Cys Gly Glu Glu Glu Asn Ser
275 280 991272DNAPhaseolus vulgaris
99atgggaaaca gtgaggaagg gaaatctgtt aaaactggaa gtccttcttc accagctact
60accaatcaga caaaccagcc taactttcat gtctatcctg attgggctgc catgcagtat
120tatgggccga gagtcaacat tcctccatac ttcaactcgg ctgtggcttc tggtcatgct
180ccacacccat acatgtgggg tccaccacag cctatgatgc caccttatgg gccaccatat
240gcagcatttt attctcctgg aggggtttat actcaccctg cagttgctat tgggccacat
300tcacacggtc aaggagttcc atccccacct gctgctggga ctccttcaag tgtagattca
360ccaacaaaat tatctggaaa tactgatcaa gggttaatga aaaaattgaa agggtttgat
420gggcttgcaa tgtcaatagg caattgcaat gctgagagtg cggagcttgg agctgaaaac
480aggctgtcgc agagtgtgga tactgagggt tctagcgatg gaagtgatgg caacactgca
540ggggctaatc aaacaaaaat gaaaagaagc cgagaggaaa catcaaccac tgatggagaa
600gggaaaactg agacacaaga tgggccagtt tccaaagaga ctacatcttc gaaaatggtt
660atgtctgcta caccagctag tgttgcagga aagttagttg gtcctgtaat ttcttcaggt
720atgaccacag cactggagct taggaaacct ttgactgttc attctaagga aaatcccacg
780agtgccccac aaccttgtgc agctgtgcct cctgaagctt ggttacagaa tgagcgtgag
840ctgaaacggg agaggaggaa acaatctaac cgtgaatctg ctagaaggtc caggctgagg
900aagcaggccg agactgaaga attggcacga aaagttgaga tgttaactgc tgaaaatgtg
960tcactgaagt cagaaataac tcaattgact gaaggttctg agcagatgag gatggaaaat
1020tctgcattga gggaaaaact gagaaatact caactgggac aaagggaaga gataattttg
1080gacagcattg acagcaagag gtctacacct gtaagtactg aaaatttgct atcaagagtt
1140aataattcca gttctaatga tagaagtgca gagaatgaga gtgatttctg tgagaacaaa
1200ccaaattctg gtgcaaagct gcatcaacta ctggatacaa atcctagagc tgatgctgtt
1260gctgctgggt ga
1272100423PRTPhaseolus vulgaris 100Met Gly Asn Ser Glu Glu Gly Lys Ser
Val Lys Thr Gly Ser Pro Ser 1 5 10
15 Ser Pro Ala Thr Thr Asn Gln Thr Asn Gln Pro Asn Phe His
Val Tyr 20 25 30
Pro Asp Trp Ala Ala Met Gln Tyr Tyr Gly Pro Arg Val Asn Ile Pro
35 40 45 Pro Tyr Phe Asn
Ser Ala Val Ala Ser Gly His Ala Pro His Pro Tyr 50
55 60 Met Trp Gly Pro Pro Gln Pro Met
Met Pro Pro Tyr Gly Pro Pro Tyr 65 70
75 80 Ala Ala Phe Tyr Ser Pro Gly Gly Val Tyr Thr His
Pro Ala Val Ala 85 90
95 Ile Gly Pro His Ser His Gly Gln Gly Val Pro Ser Pro Pro Ala Ala
100 105 110 Gly Thr Pro
Ser Ser Val Asp Ser Pro Thr Lys Leu Ser Gly Asn Thr 115
120 125 Asp Gln Gly Leu Met Lys Lys Leu
Lys Gly Phe Asp Gly Leu Ala Met 130 135
140 Ser Ile Gly Asn Cys Asn Ala Glu Ser Ala Glu Leu Gly
Ala Glu Asn 145 150 155
160 Arg Leu Ser Gln Ser Val Asp Thr Glu Gly Ser Ser Asp Gly Ser Asp
165 170 175 Gly Asn Thr Ala
Gly Ala Asn Gln Thr Lys Met Lys Arg Ser Arg Glu 180
185 190 Glu Thr Ser Thr Thr Asp Gly Glu Gly
Lys Thr Glu Thr Gln Asp Gly 195 200
205 Pro Val Ser Lys Glu Thr Thr Ser Ser Lys Met Val Met Ser
Ala Thr 210 215 220
Pro Ala Ser Val Ala Gly Lys Leu Val Gly Pro Val Ile Ser Ser Gly 225
230 235 240 Met Thr Thr Ala Leu
Glu Leu Arg Lys Pro Leu Thr Val His Ser Lys 245
250 255 Glu Asn Pro Thr Ser Ala Pro Gln Pro Cys
Ala Ala Val Pro Pro Glu 260 265
270 Ala Trp Leu Gln Asn Glu Arg Glu Leu Lys Arg Glu Arg Arg Lys
Gln 275 280 285 Ser
Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu 290
295 300 Thr Glu Glu Leu Ala Arg
Lys Val Glu Met Leu Thr Ala Glu Asn Val 305 310
315 320 Ser Leu Lys Ser Glu Ile Thr Gln Leu Thr Glu
Gly Ser Glu Gln Met 325 330
335 Arg Met Glu Asn Ser Ala Leu Arg Glu Lys Leu Arg Asn Thr Gln Leu
340 345 350 Gly Gln
Arg Glu Glu Ile Ile Leu Asp Ser Ile Asp Ser Lys Arg Ser 355
360 365 Thr Pro Val Ser Thr Glu Asn
Leu Leu Ser Arg Val Asn Asn Ser Ser 370 375
380 Ser Asn Asp Arg Ser Ala Glu Asn Glu Ser Asp Phe
Cys Glu Asn Lys 385 390 395
400 Pro Asn Ser Gly Ala Lys Leu His Gln Leu Leu Asp Thr Asn Pro Arg
405 410 415 Ala Asp Ala
Val Ala Ala Gly 420 1011053DNASolanum
lycopersicum 101atgggagctg gggaagagag cactcctaca aagacttcaa agcctccttt
aactcaggag 60acaccaaccg caccttcata tcctgattgg tcaagctcta tgcaggctta
ttatagtgct 120ggagctactc ctcctttttt tgcctcacct gttgcttctc ctgctcccca
cccatacatg 180tggggaggtc agcatcctct tatgcctcct tatgggactc cagttccata
tccagcttta 240tatcctcctg ccggagttta tgctcatcct aacattgcca cgccggctcc
aaattctgtg 300ccggcaaatc ctgaagcaga tgggaagggg cctgaaggaa aggatcggaa
ttcaagtaaa 360aagttaaagg tctgttctgg tggtaaggca ggcgacaatg ggaaagttac
ttcaggttcc 420ggaaatgatg gtgccacaca aagtgatgaa agcagaagtg aaggtacatc
agatacaaat 480gatgaaaatg ataacaatga atttgctgca aacaagaagg gaagctttga
tcaaatgctt 540cgagatggag ccagtgcaca gaataatcct gcgaaagaga atcacccgac
ttctatacat 600ggaatctgta ccatgcctgc aactaaccta aatattggaa tggacgtgtg
gaatgcatca 660gctgccggtc ctggagcgat caaaatacag caaaatgcaa ctggtccagt
tataggacat 720gaaggaagga tgaatgatca gtggattcag gaggaacgtg aacttaaaag
gcaaaagaga 780aagcaatcta atagggagtc agctaggagg tcgaggctcc gcaagcaggc
agagtgtgaa 840gagctacaac gtagagtaga agctttgagc catgagaatc attcactcaa
agatgagctc 900caacggctct ctgaggaatg tgagaagctt acctcggaga ataatttaat
taaggaagag 960ttaacgctac tttgtggacc agacgttgtg tctaagctgg agagaaacga
taatgtcaca 1020cgtattcaat ctaatgttga agaagctagt taa
1053102350PRTSolanum lycopersicum 102Met Gly Ala Gly Glu Glu
Ser Thr Pro Thr Lys Thr Ser Lys Pro Pro 1 5
10 15 Leu Thr Gln Glu Thr Pro Thr Ala Pro Ser Tyr
Pro Asp Trp Ser Ser 20 25
30 Ser Met Gln Ala Tyr Tyr Ser Ala Gly Ala Thr Pro Pro Phe Phe
Ala 35 40 45 Ser
Pro Val Ala Ser Pro Ala Pro His Pro Tyr Met Trp Gly Gly Gln 50
55 60 His Pro Leu Met Pro Pro
Tyr Gly Thr Pro Val Pro Tyr Pro Ala Leu 65 70
75 80 Tyr Pro Pro Ala Gly Val Tyr Ala His Pro Asn
Ile Ala Thr Pro Ala 85 90
95 Pro Asn Ser Val Pro Ala Asn Pro Glu Ala Asp Gly Lys Gly Pro Glu
100 105 110 Gly Lys
Asp Arg Asn Ser Ser Lys Lys Leu Lys Val Cys Ser Gly Gly 115
120 125 Lys Ala Gly Asp Asn Gly Lys
Val Thr Ser Gly Ser Gly Asn Asp Gly 130 135
140 Ala Thr Gln Ser Asp Glu Ser Arg Ser Glu Gly Thr
Ser Asp Thr Asn 145 150 155
160 Asp Glu Asn Asp Asn Asn Glu Phe Ala Ala Asn Lys Lys Gly Ser Phe
165 170 175 Asp Gln Met
Leu Arg Asp Gly Ala Ser Ala Gln Asn Asn Pro Ala Lys 180
185 190 Glu Asn His Pro Thr Ser Ile His
Gly Ile Cys Thr Met Pro Ala Thr 195 200
205 Asn Leu Asn Ile Gly Met Asp Val Trp Asn Ala Ser Ala
Ala Gly Pro 210 215 220
Gly Ala Ile Lys Ile Gln Gln Asn Ala Thr Gly Pro Val Ile Gly His 225
230 235 240 Glu Gly Arg Met
Asn Asp Gln Trp Ile Gln Glu Glu Arg Glu Leu Lys 245
250 255 Arg Gln Lys Arg Lys Gln Ser Asn Arg
Glu Ser Ala Arg Arg Ser Arg 260 265
270 Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Arg Arg Val
Glu Ala 275 280 285
Leu Ser His Glu Asn His Ser Leu Lys Asp Glu Leu Gln Arg Leu Ser 290
295 300 Glu Glu Cys Glu Lys
Leu Thr Ser Glu Asn Asn Leu Ile Lys Glu Glu 305 310
315 320 Leu Thr Leu Leu Cys Gly Pro Asp Val Val
Ser Lys Leu Glu Arg Asn 325 330
335 Asp Asn Val Thr Arg Ile Gln Ser Asn Val Glu Glu Ala Ser
340 345 350 103573DNASolanum
lycopersicum 103atgacgggaa caggactttc tccttgcatg acaactttgg aaatgagaaa
tcctgctagt 60gcacatatga aatctagccc aactaatggt ggttcaccac tcagccctgc
actgcctaat 120gaaacctggt tacagaatga gcgtgagctg aagcgggaga aaaggaaaca
gtctaatcgg 180gaatctgcaa ggcgatcaag attgagaaaa caggctgaag ctgaagaatt
ggcaatacga 240gttcaggctt taacaggaga gaacttgaca ctcagatccg agattaacaa
attaatggac 300aactcggaga aactgaagct agacaatgcc actttaatgg agagactgaa
aaatgaacag 360cttggacaga cagaagaagt aagtttaggt aagattgatg ataagagact
gcaacctgta 420ggcacagtaa acctgctagc acgagtgaac aactcaggtt cctcggatac
aacgaacgag 480gatggtgaag tttatgagaa caacagctct ggagcaaagc ttcatcaact
acttgatacc 540agccccagaa ctgatgcagt agcagctggg tga
573104190PRTSolanum lycopersicum 104Met Thr Gly Thr Gly Leu
Ser Pro Cys Met Thr Thr Leu Glu Met Arg 1 5
10 15 Asn Pro Ala Ser Ala His Met Lys Ser Ser Pro
Thr Asn Gly Gly Ser 20 25
30 Pro Leu Ser Pro Ala Leu Pro Asn Glu Thr Trp Leu Gln Asn Glu
Arg 35 40 45 Glu
Leu Lys Arg Glu Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 50
55 60 Arg Ser Arg Leu Arg Lys
Gln Ala Glu Ala Glu Glu Leu Ala Ile Arg 65 70
75 80 Val Gln Ala Leu Thr Gly Glu Asn Leu Thr Leu
Arg Ser Glu Ile Asn 85 90
95 Lys Leu Met Asp Asn Ser Glu Lys Leu Lys Leu Asp Asn Ala Thr Leu
100 105 110 Met Glu
Arg Leu Lys Asn Glu Gln Leu Gly Gln Thr Glu Glu Val Ser 115
120 125 Leu Gly Lys Ile Asp Asp Lys
Arg Leu Gln Pro Val Gly Thr Val Asn 130 135
140 Leu Leu Ala Arg Val Asn Asn Ser Gly Ser Ser Asp
Thr Thr Asn Glu 145 150 155
160 Asp Gly Glu Val Tyr Glu Asn Asn Ser Ser Gly Ala Lys Leu His Gln
165 170 175 Leu Leu Asp
Thr Ser Pro Arg Thr Asp Ala Val Ala Ala Gly 180
185 190 105516DNASolanum tuberosum 105atgactacta
ctccactagt aaaatcaagt ccaacttcac gaatcagtcc tgcagtgcca 60ggcgaagtct
ggttacagaa tgaacgtgag ctgaagcggg agaagaggaa gcagtctaat 120cgagaatctg
caaggagatc aaggttgaga aaacaggcgg aagctgaaga attggcagtg 180caggttcaat
ctttaacctc tgaaaatttg gcactcagat tagaaataaa caaattcacc 240gaaaactctg
agaaactaaa ggttgaaaat gctgctttaa tggagagact gaaaaacaag 300caaggacaag
caaaagaggt aactttaggt atgattgatg ataaaaggct gaagcctgtt 360agcacagcag
acctactagc aagagtcaac aacaacaatg gttcattcaa tagaaccaac 420gaagacggtg
aagttcatga tagtacatct ggagcaaagc ttcgtcaact ccttgatgcc 480agtcccagga
ctgatcatgc tgtggctgct agatga
516106171PRTSolanum tuberosum 106Met Thr Thr Thr Pro Leu Val Lys Ser Ser
Pro Thr Ser Arg Ile Ser 1 5 10
15 Pro Ala Val Pro Gly Glu Val Trp Leu Gln Asn Glu Arg Glu Leu
Lys 20 25 30 Arg
Glu Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg 35
40 45 Leu Arg Lys Gln Ala Glu
Ala Glu Glu Leu Ala Val Gln Val Gln Ser 50 55
60 Leu Thr Ser Glu Asn Leu Ala Leu Arg Leu Glu
Ile Asn Lys Phe Thr 65 70 75
80 Glu Asn Ser Glu Lys Leu Lys Val Glu Asn Ala Ala Leu Met Glu Arg
85 90 95 Leu Lys
Asn Lys Gln Gly Gln Ala Lys Glu Val Thr Leu Gly Met Ile 100
105 110 Asp Asp Lys Arg Leu Lys Pro
Val Ser Thr Ala Asp Leu Leu Ala Arg 115 120
125 Val Asn Asn Asn Asn Gly Ser Phe Asn Arg Thr Asn
Glu Asp Gly Glu 130 135 140
Val His Asp Ser Thr Ser Gly Ala Lys Leu Arg Gln Leu Leu Asp Ala 145
150 155 160 Ser Pro Arg
Thr Asp His Ala Val Ala Ala Arg 165 170
107657DNASolanum tuberosum 107atgggagctg gggaagagag cacccctgca
aagccttcga aagttactgc aactcaggaa 60acacaagcta caccttcata tcctgattgg
tcttctatgc aggcttatta tggtgctgga 120cctacacctc ccttctttcc ctcaactgtt
gcttctccca ctccccaccc atacatgtgg 180ggaggccagc atccgcttat gcctccttat
ggagccccag tcccatatcc tgctttatat 240cctcctgctg gagtttatgc tcatcctaat
atgcctatga ctccaaacac actgcaggca 300aatccagaat cagatagtaa ggcaccagat
ggtaaggacc agaatacaag caaaaaattg 360aagggatgtt caggtggcaa ggcaggagaa
attgggaaag cggcttcagg ttctggaaat 420gatggtggtg ccacaagaag tgctgaaagc
ggaagtgaag gttcatcaga tgaaaatgat 480gaaaatgata accatgaatt ttctgcagac
aagaatagaa gctttgatct aatgcttgct 540aatggagcca atgctcagac caatcctgca
acagggaatc cagtcgctat gcccgcaact 600aatctgaata ttgggatgga tttgtggaac
gcaaccgcct ggcgggtccc ggaatga 657108218PRTSolanum tuberosum 108Met
Gly Ala Gly Glu Glu Ser Thr Pro Ala Lys Pro Ser Lys Val Thr 1
5 10 15 Ala Thr Gln Glu Thr Gln
Ala Thr Pro Ser Tyr Pro Asp Trp Ser Ser 20
25 30 Met Gln Ala Tyr Tyr Gly Ala Gly Pro Thr
Pro Pro Phe Phe Pro Ser 35 40
45 Thr Val Ala Ser Pro Thr Pro His Pro Tyr Met Trp Gly Gly
Gln His 50 55 60
Pro Leu Met Pro Pro Tyr Gly Ala Pro Val Pro Tyr Pro Ala Leu Tyr 65
70 75 80 Pro Pro Ala Gly Val
Tyr Ala His Pro Asn Met Pro Met Thr Pro Asn 85
90 95 Thr Leu Gln Ala Asn Pro Glu Ser Asp Ser
Lys Ala Pro Asp Gly Lys 100 105
110 Asp Gln Asn Thr Ser Lys Lys Leu Lys Gly Cys Ser Gly Gly Lys
Ala 115 120 125 Gly
Glu Ile Gly Lys Ala Ala Ser Gly Ser Gly Asn Asp Gly Gly Ala 130
135 140 Thr Arg Ser Ala Glu Ser
Gly Ser Glu Gly Ser Ser Asp Glu Asn Asp 145 150
155 160 Glu Asn Asp Asn His Glu Phe Ser Ala Asp Lys
Asn Arg Ser Phe Asp 165 170
175 Leu Met Leu Ala Asn Gly Ala Asn Ala Gln Thr Asn Pro Ala Thr Gly
180 185 190 Asn Pro
Val Ala Met Pro Ala Thr Asn Leu Asn Ile Gly Met Asp Leu 195
200 205 Trp Asn Ala Thr Ala Trp Arg
Val Pro Glu 210 215 109300DNASolanum
tuberosum 109atgggagctg gggaagagag cacccctgca aagccttcga aagttactgc
aactcaggaa 60acacaagcta caccttcata tcctgcttta tatcctcctg ctggagttta
tgctcatcct 120aatatgccta tgactccaaa cacactgcag gcaaatccag aatcagatag
taaggcacca 180gatggtaagg accagaatac aagcaaaaaa ttgaagggat gttcaggtgg
caaggcagga 240gaaattggga aagcggcttc aggttctgga aatgattggt ggtgccacaa
gaagtgctga 30011099PRTSolanum tuberosum 110Met Gly Ala Gly Glu Glu Ser
Thr Pro Ala Lys Pro Ser Lys Val Thr 1 5
10 15 Ala Thr Gln Glu Thr Gln Ala Thr Pro Ser Tyr
Pro Ala Leu Tyr Pro 20 25
30 Pro Ala Gly Val Tyr Ala His Pro Asn Met Pro Met Thr Pro Asn
Thr 35 40 45 Leu
Gln Ala Asn Pro Glu Ser Asp Ser Lys Ala Pro Asp Gly Lys Asp 50
55 60 Gln Asn Thr Ser Lys Lys
Leu Lys Gly Cys Ser Gly Gly Lys Ala Gly 65 70
75 80 Glu Ile Gly Lys Ala Ala Ser Gly Ser Gly Asn
Asp Trp Trp Cys His 85 90
95 Lys Lys Cys 1111031DNATrifolium pratense 111atggggacta
aggaggatag cacaactaaa ccttctaaat catcttcatc aactcaggag 60gtaccaacag
taccaccacc atatccagat tggtcgcagg cctactataa tcccggagct 120gctccgcctc
cctattatgc ctcaactgtt cctcagccaa ccccccatcc gtatatgtgg 180ggaagccagc
atcctttaat ggcgccatat gggactccag tcccgtatcc tgctatgtac 240cctcctggaa
atatctatgc tcatcctagc atggtagtgg ctccaagtgc tatgcaccaa 300actacagagt
ttgaagggaa gggaccagat ggaaaggata aggattcatc taaaaaaccg 360aagggcactt
ctgcgaatac aggtgctaaa gcaggagaga gtggaaaggc aggctcaggt 420tcaggcaatg
atggcttttc acaaagtggt gaaagtggtt cagagggttc atcaaatggt 480agtgatgaga
accaacagga atcagcgaga aacaagaagg gaggttttga cctcatgctt 540gttaatggag
caaacgtaca gaacaataac actggaccca tttctcaatc acctgttcca 600ggaaatcctg
ttgtctcgat acctgctact aatcttaata tcggaatgga tttatggaat 660gcatctcctg
ctaatgctga agccaccaaa ctgagacaca atcaatctag tgcccctgga 720gctggtgaac
aatggatgca acaagatgat cgtgagctga aaagacagaa gagaaaacag 780tctaatcgag
agtcagctag gaggtcaaga ctacgcaagc aggctgagtg tgaagagcta 840caaaagaggg
ttgaggcgtt gggaggtgag aatcgaactc tcagagaaga gcttcagaaa 900ctttctgaag
aatgcgagaa gcttacatct gaaaacaatt ctatcaagga agagttggaa 960cgattgtgtg
ggccggaagt agttgctaat cttgaatgaa acaaacaaaa cagttcctcc 1020atgtcgtctt t
1031112343PRTTrifolium pratensemisc_feature(343)..(343)Xaa can be any
naturally occurring amino acid 112Met Gly Thr Lys Glu Asp Ser Thr Thr Lys
Pro Ser Lys Ser Ser Ser 1 5 10
15 Ser Thr Gln Glu Val Pro Thr Val Pro Pro Pro Tyr Pro Asp Trp
Ser 20 25 30 Gln
Ala Tyr Tyr Asn Pro Gly Ala Ala Pro Pro Pro Tyr Tyr Ala Ser 35
40 45 Thr Val Pro Gln Pro Thr
Pro His Pro Tyr Met Trp Gly Ser Gln His 50 55
60 Pro Leu Met Ala Pro Tyr Gly Thr Pro Val Pro
Tyr Pro Ala Met Tyr 65 70 75
80 Pro Pro Gly Asn Ile Tyr Ala His Pro Ser Met Val Val Ala Pro Ser
85 90 95 Ala Met
His Gln Thr Thr Glu Phe Glu Gly Lys Gly Pro Asp Gly Lys 100
105 110 Asp Lys Asp Ser Ser Lys Lys
Pro Lys Gly Thr Ser Ala Asn Thr Gly 115 120
125 Ala Lys Ala Gly Glu Ser Gly Lys Ala Gly Ser Gly
Ser Gly Asn Asp 130 135 140
Gly Phe Ser Gln Ser Gly Glu Ser Gly Ser Glu Gly Ser Ser Asn Gly 145
150 155 160 Ser Asp Glu
Asn Gln Gln Glu Ser Ala Arg Asn Lys Lys Gly Gly Phe 165
170 175 Asp Leu Met Leu Val Asn Gly Ala
Asn Val Gln Asn Asn Asn Thr Gly 180 185
190 Pro Ile Ser Gln Ser Pro Val Pro Gly Asn Pro Val Val
Ser Ile Pro 195 200 205
Ala Thr Asn Leu Asn Ile Gly Met Asp Leu Trp Asn Ala Ser Pro Ala 210
215 220 Asn Ala Glu Ala
Thr Lys Leu Arg His Asn Gln Ser Ser Ala Pro Gly 225 230
235 240 Ala Gly Glu Gln Trp Met Gln Gln Asp
Asp Arg Glu Leu Lys Arg Gln 245 250
255 Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg
Leu Arg 260 265 270
Lys Gln Ala Glu Cys Glu Glu Leu Gln Lys Arg Val Glu Ala Leu Gly
275 280 285 Gly Glu Asn Arg
Thr Leu Arg Glu Glu Leu Gln Lys Leu Ser Glu Glu 290
295 300 Cys Glu Lys Leu Thr Ser Glu Asn
Asn Ser Ile Lys Glu Glu Leu Glu 305 310
315 320 Arg Leu Cys Gly Pro Glu Val Val Ala Asn Leu Glu
Asn Lys Gln Asn 325 330
335 Ser Ser Ser Met Ser Ser Xaa 340
1131065DNAMedicago truncatula 113atgtggggac caccacagcc tatgatgcat
ccatatgggc cgccatatgc accaccattt 60tattcacatg gaggggttta tactcatcct
gccgttgcca tcgggtcaaa ttcaaatggt 120caaggaattt catcttcacc tgctgctggg
actcctacaa gcatagagac accgaccaaa 180tcatctggaa acactgatca gggtttaatg
aaaaaattga aaggatttga cgggcttgca 240atgtcaatag gcaatggcaa tgctgaaagt
gctgagcgtg gagctgaaaa ccggctatca 300cggagtgtgg atactgaggg ttccagcgat
ggaagcgatg gcaacactac agggaccaat 360ggaacaagga aaagaagccg ggatgggaca
ccaacaacca ctgatggaga agggaaaact 420gagatgccag atagtcaagt ttccaaagag
actgctgctt ccaaaaagac agtgtcagtt 480atcacaagca gtgctgcaga aaatatggtt
ggacctgtac tttcttcagg tatgaccaca 540tcactggaac tgaggaaccc ttcacctatt
tccaccagtg ctccacaacc ttgtggagtt 600ttgcctcctg aagcttggat gcagaatgag
cgtgagctga aacgtgagag gaggaaacaa 660tcaaatcgtg aatctgctag aagatccagg
cttaggaagc aggctgaggc tgaagaattg 720gcacgaagag tcgatgcgtt gactgctgag
aatttggcgc tgaaatcaga aatgaatgaa 780ttggctgaaa attcggcgaa gctgaagatt
gaaaatgcta cattaaagga aaagctggaa 840aacactcaac tgggacaaac agaagagata
attttgaacg gcatggacaa gagggctaca 900cctgtaagta cagaaaactt actgtcaaga
gttaatgatt ccaattctga tgatagagct 960gcagaggaag aaaatggttt ctgtgagaac
aaacccaatt ctggtgcaaa gctgcgtcaa 1020ctactcgaca caaatcctag agctaatgct
gtggccgcta gttga 1065114354PRTMedicago truncatula
114Met Trp Gly Pro Pro Gln Pro Met Met His Pro Tyr Gly Pro Pro Tyr 1
5 10 15 Ala Pro Pro Phe
Tyr Ser His Gly Gly Val Tyr Thr His Pro Ala Val 20
25 30 Ala Ile Gly Ser Asn Ser Asn Gly Gln
Gly Ile Ser Ser Ser Pro Ala 35 40
45 Ala Gly Thr Pro Thr Ser Ile Glu Thr Pro Thr Lys Ser Ser
Gly Asn 50 55 60
Thr Asp Gln Gly Leu Met Lys Lys Leu Lys Gly Phe Asp Gly Leu Ala 65
70 75 80 Met Ser Ile Gly Asn
Gly Asn Ala Glu Ser Ala Glu Arg Gly Ala Glu 85
90 95 Asn Arg Leu Ser Arg Ser Val Asp Thr Glu
Gly Ser Ser Asp Gly Ser 100 105
110 Asp Gly Asn Thr Thr Gly Thr Asn Gly Thr Arg Lys Arg Ser Arg
Asp 115 120 125 Gly
Thr Pro Thr Thr Thr Asp Gly Glu Gly Lys Thr Glu Met Pro Asp 130
135 140 Ser Gln Val Ser Lys Glu
Thr Ala Ala Ser Lys Lys Thr Val Ser Val 145 150
155 160 Ile Thr Ser Ser Ala Ala Glu Asn Met Val Gly
Pro Val Leu Ser Ser 165 170
175 Gly Met Thr Thr Ser Leu Glu Leu Arg Asn Pro Ser Pro Ile Ser Thr
180 185 190 Ser Ala
Pro Gln Pro Cys Gly Val Leu Pro Pro Glu Ala Trp Met Gln 195
200 205 Asn Glu Arg Glu Leu Lys Arg
Glu Arg Arg Lys Gln Ser Asn Arg Glu 210 215
220 Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala Glu
Ala Glu Glu Leu 225 230 235
240 Ala Arg Arg Val Asp Ala Leu Thr Ala Glu Asn Leu Ala Leu Lys Ser
245 250 255 Glu Met Asn
Glu Leu Ala Glu Asn Ser Ala Lys Leu Lys Ile Glu Asn 260
265 270 Ala Thr Leu Lys Glu Lys Leu Glu
Asn Thr Gln Leu Gly Gln Thr Glu 275 280
285 Glu Ile Ile Leu Asn Gly Met Asp Lys Arg Ala Thr Pro
Val Ser Thr 290 295 300
Glu Asn Leu Leu Ser Arg Val Asn Asp Ser Asn Ser Asp Asp Arg Ala 305
310 315 320 Ala Glu Glu Glu
Asn Gly Phe Cys Glu Asn Lys Pro Asn Ser Gly Ala 325
330 335 Lys Leu Arg Gln Leu Leu Asp Thr Asn
Pro Arg Ala Asn Ala Val Ala 340 345
350 Ala Ser 1151197DNAVitis vinifera 115atgggggatg
gtgaggaaag cacacctccc aagtcttcta aaccacctgc ttcaacacag 60gagacaccat
caaccccttc ttatcctgac tggtcaacat ctatgcaggc ctactatggt 120gctggagcta
ctccgcctcc ttttttccct tctcctgttg cacccccatc ccctcatccg 180tacctatggg
gaggtcagca tcctatgatg ccaccatatg gaactccact tccataccca 240gctctctatc
ctcgtggggc cctctatgct catcctagca tggctacggc tcagggtgtg 300gcactgacaa
ataccgacat ggaagtaaag acccctgatg gaaaagaccc agcatcaatt 360aaaaaatcaa
aggcagcttc aggaaacatg ggtttgatta gtggaaaatc tggggaaagc 420ggaaaggcag
cttcagtttc tggcaatgat ggtgcttcac aaagtgggga gagtggtagt 480gaggcctcat
cagatgcgac tgatgagaat gctaaccaag catcttctgc agtaaagaag 540agaagcttca
accttgctga tggatcaaat gcaaagggta acagtgctgc tcagtacact 600ggtggaaatc
attcagcctc agttccaggc aagcctgtgg tacctatgcc tacaaccagt 660ttaaatattg
ggatggacct gtggaatgca tcccctgctg gaggcacacc catgaagaca 720agaccacagt
catctggtgc ctcacctcaa gtggcttcag caactatagt tggacgtgaa 780ggcatgttac
aggatcatca atggattcaa gatgaacgtg aactcaaacg acaaaggaga 840aagcaatcca
atagggagtc agctaggaga tcaagattgc gtaagcaggc tgaatgtgaa 900gaattacaat
caaaggttga aattttgagc aatgagaatc atgtgctgag agaggagctg 960cataggcttg
ctgagcagtg cgagaagctt acatctgaga ataattccat aatggaggag 1020ttgacacaat
tgtatgggcc agaggcaaca tctagccttc aagataacaa ccacaacttg 1080gttctccatc
ctatcaatgg tgaagacgat ggccatgtac aagatgcttc ccctctaaac 1140aactccagtt
ccacgtctga tcaaaatggg aaattcagct ccaatgggaa gatttga
1197116398PRTVitis vinifera 116Met Gly Asp Gly Glu Glu Ser Thr Pro Pro
Lys Ser Ser Lys Pro Pro 1 5 10
15 Ala Ser Thr Gln Glu Thr Pro Ser Thr Pro Ser Tyr Pro Asp Trp
Ser 20 25 30 Thr
Ser Met Gln Ala Tyr Tyr Gly Ala Gly Ala Thr Pro Pro Pro Phe 35
40 45 Phe Pro Ser Pro Val Ala
Pro Pro Ser Pro His Pro Tyr Leu Trp Gly 50 55
60 Gly Gln His Pro Met Met Pro Pro Tyr Gly Thr
Pro Leu Pro Tyr Pro 65 70 75
80 Ala Leu Tyr Pro Arg Gly Ala Leu Tyr Ala His Pro Ser Met Ala Thr
85 90 95 Ala Gln
Gly Val Ala Leu Thr Asn Thr Asp Met Glu Val Lys Thr Pro 100
105 110 Asp Gly Lys Asp Pro Ala Ser
Ile Lys Lys Ser Lys Ala Ala Ser Gly 115 120
125 Asn Met Gly Leu Ile Ser Gly Lys Ser Gly Glu Ser
Gly Lys Ala Ala 130 135 140
Ser Val Ser Gly Asn Asp Gly Ala Ser Gln Ser Gly Glu Ser Gly Ser 145
150 155 160 Glu Ala Ser
Ser Asp Ala Thr Asp Glu Asn Ala Asn Gln Ala Ser Ser 165
170 175 Ala Val Lys Lys Arg Ser Phe Asn
Leu Ala Asp Gly Ser Asn Ala Lys 180 185
190 Gly Asn Ser Ala Ala Gln Tyr Thr Gly Gly Asn His Ser
Ala Ser Val 195 200 205
Pro Gly Lys Pro Val Val Pro Met Pro Thr Thr Ser Leu Asn Ile Gly 210
215 220 Met Asp Leu Trp
Asn Ala Ser Pro Ala Gly Gly Thr Pro Met Lys Thr 225 230
235 240 Arg Pro Gln Ser Ser Gly Ala Ser Pro
Gln Val Ala Ser Ala Thr Ile 245 250
255 Val Gly Arg Glu Gly Met Leu Gln Asp His Gln Trp Ile Gln
Asp Glu 260 265 270
Arg Glu Leu Lys Arg Gln Arg Arg Lys Gln Ser Asn Arg Glu Ser Ala
275 280 285 Arg Arg Ser Arg
Leu Arg Lys Gln Ala Glu Cys Glu Glu Leu Gln Ser 290
295 300 Lys Val Glu Ile Leu Ser Asn Glu
Asn His Val Leu Arg Glu Glu Leu 305 310
315 320 His Arg Leu Ala Glu Gln Cys Glu Lys Leu Thr Ser
Glu Asn Asn Ser 325 330
335 Ile Met Glu Glu Leu Thr Gln Leu Tyr Gly Pro Glu Ala Thr Ser Ser
340 345 350 Leu Gln Asp
Asn Asn His Asn Leu Val Leu His Pro Ile Asn Gly Glu 355
360 365 Asp Asp Gly His Val Gln Asp Ala
Ser Pro Leu Asn Asn Ser Ser Ser 370 375
380 Thr Ser Asp Gln Asn Gly Lys Phe Ser Ser Asn Gly Lys
Ile 385 390 395
1171095DNAVitis vinifera 117atgggggctg gggaagatac cacacctact aagccttcca
aaccaacttc ttcagctcag 60gaaatgccaa cgacgccctc atatcctgag tggtcgagct
ctatgcaggc ttattatggt 120cctggagcta caccacctcc cttttttgct ccctctgttg
cttctccaac tccccatcca 180tatctgtggg gaagccagca tcctttaatt cctccatatg
gaacaccagt tccatactca 240gccttatatc ctccaggagg tgtttatgca catcctaatt
tggccacggc tccaagtgca 300gcacatttaa accctgagtt ggaagggaaa ggccctgagg
gaaaagacaa ggcttcagca 360aagaaatcta aaggaacatc aggaaatact gttaagggtg
gcgagagtgg aaaggcagct 420tcaggctcag gaaatgatgg tgcctcacca agtgctgaaa
gtggaagtga gggttcatca 480gatgcaagtg atgagaatac taaccaacaa gaatttgctt
ctagtaagaa gggaagtttc 540aatcagatgc ttgctgatgc caatgcacag aataacatct
ctggaacaag tgttcaggct 600tcagttcctg ggaagcctgt aatatctatg cctgcaacta
atctaaatat tgggatggac 660ttatggagtg catctcctgg gggctctgga gctacaaaac
tgagaccaaa tccatctggc 720atctcatctt ctgttgctcc agcagcaatg gttgggcgtg
aaggcgttat gcccgaccag 780tggattcaag atgaacgtga actcaaaaga caaaagagga
aacaatctaa cagggagtca 840gctaggaggt cgagattacg gaagcaggcg gagtgtgagg
aactacaagc aaaggtagaa 900actttgagca ctgagaatac tgcactcaga gatgagctgc
agaggctttc tgaggaatgc 960gagaagctta catctgaaaa taattccatt aaggaagaat
tgactcgggt atgtggagca 1020gatgcagtgg ctgcaaacct caaagagaaa aaccccacac
aactccaatc tcagggcgtc 1080gagggcaaca gttga
1095118364PRTVitis vinifera 118Met Gly Ala Gly Glu
Asp Thr Thr Pro Thr Lys Pro Ser Lys Pro Thr 1 5
10 15 Ser Ser Ala Gln Glu Met Pro Thr Thr Pro
Ser Tyr Pro Glu Trp Ser 20 25
30 Ser Ser Met Gln Ala Tyr Tyr Gly Pro Gly Ala Thr Pro Pro Pro
Phe 35 40 45 Phe
Ala Pro Ser Val Ala Ser Pro Thr Pro His Pro Tyr Leu Trp Gly 50
55 60 Ser Gln His Pro Leu Ile
Pro Pro Tyr Gly Thr Pro Val Pro Tyr Ser 65 70
75 80 Ala Leu Tyr Pro Pro Gly Gly Val Tyr Ala His
Pro Asn Leu Ala Thr 85 90
95 Ala Pro Ser Ala Ala His Leu Asn Pro Glu Leu Glu Gly Lys Gly Pro
100 105 110 Glu Gly
Lys Asp Lys Ala Ser Ala Lys Lys Ser Lys Gly Thr Ser Gly 115
120 125 Asn Thr Val Lys Gly Gly Glu
Ser Gly Lys Ala Ala Ser Gly Ser Gly 130 135
140 Asn Asp Gly Ala Ser Pro Ser Ala Glu Ser Gly Ser
Glu Gly Ser Ser 145 150 155
160 Asp Ala Ser Asp Glu Asn Thr Asn Gln Gln Glu Phe Ala Ser Ser Lys
165 170 175 Lys Gly Ser
Phe Asn Gln Met Leu Ala Asp Ala Asn Ala Gln Asn Asn 180
185 190 Ile Ser Gly Thr Ser Val Gln Ala
Ser Val Pro Gly Lys Pro Val Ile 195 200
205 Ser Met Pro Ala Thr Asn Leu Asn Ile Gly Met Asp Leu
Trp Ser Ala 210 215 220
Ser Pro Gly Gly Ser Gly Ala Thr Lys Leu Arg Pro Asn Pro Ser Gly 225
230 235 240 Ile Ser Ser Ser
Val Ala Pro Ala Ala Met Val Gly Arg Glu Gly Val 245
250 255 Met Pro Asp Gln Trp Ile Gln Asp Glu
Arg Glu Leu Lys Arg Gln Lys 260 265
270 Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu
Arg Lys 275 280 285
Gln Ala Glu Cys Glu Glu Leu Gln Ala Lys Val Glu Thr Leu Ser Thr 290
295 300 Glu Asn Thr Ala Leu
Arg Asp Glu Leu Gln Arg Leu Ser Glu Glu Cys 305 310
315 320 Glu Lys Leu Thr Ser Glu Asn Asn Ser Ile
Lys Glu Glu Leu Thr Arg 325 330
335 Val Cys Gly Ala Asp Ala Val Ala Ala Asn Leu Lys Glu Lys Asn
Pro 340 345 350 Thr
Gln Leu Gln Ser Gln Gly Val Glu Gly Asn Ser 355
360 11929PRTArtificial sequencemotif 1 119Glu Leu Lys Arg
Xaa Xaa Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg 1 5
10 15 Arg Ser Arg Leu Arg Lys Gln Ala Glu
Xaa Glu Glu Leu 20 25
12021PRTArtificial sequencemotif 2 120Xaa Xaa Xaa Val Glu Xaa Leu Xaa Xaa
Glu Asn Xaa Xaa Leu Xaa Xaa 1 5 10
15 Glu Xaa Xaa Xaa Xaa 20
1218PRTArtificial sequencemotif 3 121Xaa Xaa Xaa Xaa Xaa Xaa Xaa Trp 1
5 12231PRTArtificial sequencemotif 4 122Arg Glu
Leu Lys Arg Gln Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala 1 5
10 15 Arg Arg Ser Arg Leu Arg Lys
Gln Ala Glu Cys Glu Glu Leu Gln 20 25
30 12312PRTArtificial sequencemotif 5 123Xaa Thr Asn Leu
Asn Xaa Gly Met Asp Xaa Trp Asn 1 5 10
12416PRTArtificial sequencemotif 6 124Met Pro Pro Tyr Gly Thr Pro
Val Pro Tyr Pro Ala Xaa Tyr Pro Pro 1 5
10 15 12547PRTArtificial sequencemotif 7 125Asn Glu
Xaa Glu Leu Lys Arg Glu Xaa Arg Lys Gln Ser Asn Arg Glu 1 5
10 15 Ser Ala Arg Arg Ser Arg Leu
Arg Lys Gln Ala Glu Xaa Glu Glu Leu 20 25
30 Ala Xaa Xaa Val Xaa Xaa Leu Thr Xaa Glu Asn Xaa
Xaa Leu Xaa 35 40 45
12631PRTArtificial sequencemotif 8 126Xaa Xaa Xaa Pro Xaa Lys Ser Ser Gly
Asn Thr Asp Xaa Gly Leu Xaa 1 5 10
15 Xaa Lys Leu Lys Xaa Phe Asp Gly Leu Xaa Met Ser Ile Gly
Asn 20 25 30
12713PRTArtificial sequencemotif 9 127Xaa Pro Gln Xaa Met Met Pro Pro Tyr
Gly Xaa Pro Tyr 1 5 10
12822PRTArtificial sequencemotif 10 128Asn Ser Gly Ala Lys Leu Xaa Gln
Leu Leu Asp Xaa Xaa Pro Arg Xaa 1 5 10
15 Asp Ala Val Ala Ala Gly 20
12922PRTArtificial sequencemotif 11 129Glu Ile Xaa Xaa Xaa Thr Glu Xaa
Ser Glu Lys Xaa Xaa Xaa Xaa Asn 1 5 10
15 Xaa Xaa Leu Xaa Xaa Xaa 20
13050PRTArtificial sequencemotif 12 130Trp Leu Gln Asn Glu Xaa Glu Leu
Lys Arg Glu Xaa Arg Lys Gln Ser 1 5 10
15 Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln
Ala Glu Xaa 20 25 30
Glu Glu Leu Ala Xaa Xaa Val Xaa Xaa Leu Thr Xaa Glu Asn Xaa Xaa
35 40 45 Leu Xaa 50
1312194DNAOryza sativa 131aatccgaaaa gtttctgcac cgttttcacc ccctaactaa
caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact
agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag
agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc
ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt
catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag
atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata
taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct
tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc
ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag
tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa
tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata
atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt
taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca
tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg
atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag
agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc
ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca
cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg
gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc
ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta
tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc
ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat
ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag
caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg
atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca
actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt
tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata
atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt
cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa
tgcttttata gcgttatcct 1800agctgtagtt cagttaatag gtaatacccc tatagtttag
tcaggagaag aacttatccg 1860atttctgatc tccattttta attatatgaa atgaactgta
gcataagcag tattcatttg 1920gattattttt tttattagct ctcacccctt cattattctg
agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta
tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt ggttattcct tgactgcttg
attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt atactgcttg ttcttatgat
tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt caccagcaaa gttc
2194
132 3418 DNA Artificial sequence expression cassette with SEQ ID NO 1
132
aatccgaaaa gtttctgcac cgttttcacc ccctaactaa
caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact
agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag
agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc
ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt
catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag
atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata
taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct
tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc
ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag
tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa
tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata
atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt
taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca
tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg
atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag
agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc
ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca
cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg
gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc
ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta
tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc
ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat
ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag
caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg
atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca
actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt
tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata
atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt
cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa
tgcttttata gcgttatcct 1800agctgtagtt cagttaatag gtaatacccc tatagtttag
tcaggagaag aacttatccg 1860atttctgatc tccattttta attatatgaa atgaactgta
gcataagcag tattcatttg 1920gattattttt tttattagct ctcacccctt cattattctg
agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta
tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt ggttattcct tgactgcttg
attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt atactgcttg ttcttatgat
tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt caccagcaaa gttcatttaa
atcaactagg gatatcacaa 2220gtttgtacaa aaaagcaggc ttaaacaatg cctccttatg
ggactccagt tccatatcca 2280gctttatatc ctcctgccgg agtttatgct catcctaaca
ttgccacgcc ggctccaaat 2340tctgtgccgg caaatcctga agcagatggg aaggggcctg
aaggaaagga tcggaattca 2400agtaaaaagt taaaggtctg ttctggtggt aaggcaggcg
acaatgggaa agttacttca 2460ggttccggaa atgatggtgc cacacaaagt gatgaaagca
gaagtgaagg tacatcagat 2520acaaatgatg aaaatgataa caatgaattt gctgcaaaca
agaagggaag ctttgatcaa 2580atgcttgcag atggagccag tgcacagaat aatcctgcga
aagagaatca cccgacttct 2640atacatggaa atcctgtcac catgcctgca actaacctaa
atattggaat ggacgtgtgg 2700aatgcatcag ctgccggtcc tggagcgatc aaaatacagc
aaaatgcaac tggtccagtt 2760ataggacatg aaggaaggat gaatgatcag tggattcagg
aggaacgtga acttaaaagg 2820caaaagagaa agcaatctaa tagggagtca gctaggaggt
cgaggctccg caagcaggca 2880gagtgtgaag agctacaacg tagagtagaa gctttgagcc
atgagaatca ttcactcaaa 2940gatgagctcc aacggctctc tgaggaatgt gagaagctta
cctcggagaa taatttaatt 3000aaggaagagt taacgctact ttgtggacca gacgttgtgt
ctaagctgga gagaaacgat 3060aatgtcacac gtattcaatc taatgttgaa gaagctagtt
aaggagaagt ggaaaagcac 3120ccagctttct tgtacaaagt ggtgatatca caagcccggg
cggtcttcta gggataacag 3180ggtaattata tccctctaga tcacaagccc gggcggtctt
ctacgatgat tgagtaataa 3240tgtgtcacgc atcaccatgg gtggcagtgt cagtgtgagc
aatgacctga atgaacaatt 3300gaaatgaaaa gaaaaaaagt actccatctg ttccaaatta
aaattggttt taacctttta 3360ataggtttat acaataattg atatatgttt tctgtatatg
tctaatttgt tatcatcc 3418
133 3760 DNA Artificial sequence expression cassette with SEQ ID NO 3
133
aatccgaaaa gtttctgcac cgttttcacc ccctaactaa
caatataggg aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact
agaactatgc aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag
agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc
ttagaatata cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt
catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag
atttttttta aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata
taattatata attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct
tactccatcc caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc
ttttttcgat tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag
tgcacctcct caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa
tacatttagg tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata
atatctaaaa tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt
taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca
tattgggcac acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg
atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag
agaggaggag aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc
ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca
cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg
gtcgatctct tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc
ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta
tctgtgatga ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc
ggttcggttt gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat
ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag
caccggtgat tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg
atgcttctcg atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca
actttgaaga cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt
tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata
atcctcagga acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt
cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa
tgcttttata gcgttatcct 1800agctgtagtt cagttaatag gtaatacccc tatagtttag
tcaggagaag aacttatccg 1860atttctgatc tccattttta attatatgaa atgaactgta
gcataagcag tattcatttg 1920gattattttt tttattagct ctcacccctt cattattctg
agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta
tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt ggttattcct tgactgcttg
attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt atactgcttg ttcttatgat
tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt caccagcaaa gttcatttaa
atcaactagg gatatcacaa 2220gtttgtacaa aaaagcaggc ttaaacaatg ggaaacattg
aagagggaaa gtcttccact 2280tctgataaat cttcacctgc accaccggat cagaccaata
ttcatgtgta tcctgatggg 2340gcagctatgc aggcatatta tggcccccga gtggctctcc
caccatatta caactcggcc 2400gtggcttctg gtcatgcccc tcatccttat atgtggggcc
tgccacagcc tatgatgcca 2460ccttatgggg caccttatgc aacagtctac tcacatggag
tgtatgcaca tccggctgtt 2520ccaattgtat cccatcctca tggtcctggg attgtgtcat
ctcctgcagc tggaaccctt 2580ttgagtgcag aaacacctac aaaatcttca ggaaatactg
atcgaggttt agtgaataag 2640ttgaaaggat ttgatgggct tgcaatgtca ataggcaatg
gtaatgctga gactgtcgag 2700ggtgggggta ggctgtctca aagtgtggag atagaagttt
ccagtgatgg aattgatggg 2760aatacaacta ggggaaagaa aaggagccgt gagggaacac
caactgttgc aacaggtgga 2820gatacaaaaa tggagtcaca ttccagtccc cttcctagag
aggtgaatgc atccactgac 2880aatgtattga gggcagctgt tgctcctggc atgaccacag
cattggagct taggaaccct 2940cctagtgtga atgctgctaa gacaagtcct actacgattc
ctcaatctgg tgtagtcctg 3000ccctctgaag cctggttaca gaatgagctg gagctgaaac
gggagaagag gaaacaatca 3060aatcgagaat ctgccagaag gtcaagatta aggaagcagg
ctgaggctga agaacttgca 3120cacaaagttg aagtactcac cacagaaaac atggcactcc
aatctgaaat aagtcaattt 3180acagagaaat cagagaaact aaggcttgaa aatgctgcat
taacggagaa actcaagaat 3240gcacaattag gacatgcgca agaaatgatt ttaaacattg
atgagcacag ggccccagct 3300gttagtacag aaaacttgct atcaagagtt aacaattctg
cctttgaaga agagagtgat 3360ctgtatgaac gaaactcaaa ttctggtgcc aagctgcatc
aactcttgga tgcaagcccc 3420agagccgatg ctgtggctgc tggttgatga cactggttca
acccagcttt cttgtacaaa 3480gtggtgatat cacaagcccg ggcggtcttc tagggataac
agggtaatta tatccctcta 3540gatcacaagc ccgggcggtc ttctacgatg attgagtaat
aatgtgtcac gcatcaccat 3600gggtggcagt gtcagtgtga gcaatgacct gaatgaacaa
ttgaaatgaa aagaaaaaaa 3660gtactccatc tgttccaaat taaaattggt tttaaccttt
taataggttt atacaataat 3720tgatatatgt tttctgtata tgtctaattt gttatcatcc
3760
134 54 DNA Artificial sequence primer prm009943
134
ggggacaagt ttgtacaaaa aagcaggctt aaacaatgcc tccttatggg actc 54
135 50 DNA Artificial sequence primer prm009944
135
ggggaccact ttgtacaaga aagctgggtg cttttccact tctccttaac 50
136 55 DNA Artificial sequence primer prm17402
136
ggggacaagt ttgtacaaaa aagcaggctt aaacaatggg aaacattgaa gaggg 55
137 50 DNA Artificial sequence primer prm17403
137
ggggaccact ttgtacaaga aagctgggtt gaaccagtgt catcaaccag 50
138 9 DNA Artificial sequence GLM oligonucleotide
138
gtgagtcat 9
139 8 DNA Artificial sequence ABRE oligonucleotide
139
ccacgtgg 8
140 6 DNA Artificial sequence PB-like oligonucleotide
140
tgaaaa 6
141 1242 DNA Populus trichocarpa
141
atggagagaa gcgccgtctt
tggtggtctg caaccaaatt accttcttta cccctcaccc 60aactcttcat cccttccttt
ctcagaccac cgcgctagac ttccaaattt ctctcctcct 120ccctctctgt ctctcaagat
acataagcag gtttcttctt gttttaaagc tgtgtctcct 180tttaagcgtg gagctgcgtt
ttctgataca cacagtgaca catttgaatt agctgacata 240gactgggatg accttggatt
tgcatacgtt cccactgatt atatgtattc aatgaaatgc 300actaaaggtg gaaacttttc
caaaggtgaa ttacagagat atggaaacat tgaactgaac 360ccttctgctg gcgtcttaaa
ttatggccag ggattgtttg aaggtctgaa agcctacagg 420aaagaagatg gtaaccttct
tctatttcgt cctgaggaaa atgctatgcg gatgataatg 480ggtgcagaga ggatgtgcat
gccatcaccg acaattgatc agtttgtgga tgcagtaaaa 540gcaactgttt tagcaaacaa
acgttgggtt cctcctccag gtaaaggttc cttatatatc 600agaccattgc taatggggag
tggagctgtt cttggtcttg cacctgctcc tgagtatacc 660tttctcattt atgtttcacc
ggtggggaac tattttaagg aaggtgtggc accaattcat 720ttaattgtgg agcatgaact
tcatcgagca actcctggtg gcactggagg tgtgaagact 780atagggaatt atgctgcggt
tctcaaggca caatctgctg caaaagccag aggtttttct 840gacgttttat atcttgattg
tgtacataaa aagtatctag aagaggtttc ctcttgcaac 900atttttgttg tgaagggtaa
cagcatctcc actcctgcaa taaaagggac aatcctacca 960ggaattacaa ggaagagcat
aattgatgtt gctcgaagcc aaggatttca ggttgaggaa 1020cggcttgtga cagtagatga
attgcttgat gctgatgagg ttttttgtac cggaacagct 1080gttgttgtgt cacctgtggg
aagcatcacc tacaagggta aaagggtgtc ttatggcgta 1140gaaggttttg gtgctgtctc
gcaacaactc tatagtgtgc taaccaagct acagatgggc 1200cttatagagg acaagatgaa
ttggactgtg gagctgagtt ag 1242
142 413 PRT Populus trichocarpa
142
Met Glu Arg Ser Ala Val Phe Gly Gly Leu Gln Pro Asn Tyr
Leu Leu 1 5 10 15
Tyr Pro Ser Pro Asn Ser Ser Ser Leu Pro Phe Ser Asp His Arg Ala
20 25 30 Arg Leu Pro Asn Phe
Ser Pro Pro Pro Ser Leu Ser Leu Lys Ile His 35
40 45 Lys Gln Val Ser Ser Cys Phe Lys Ala
Val Ser Pro Phe Lys Arg Gly 50 55
60 Ala Ala Phe Ser Asp Thr His Ser Asp Thr Phe Glu Leu
Ala Asp Ile 65 70 75
80 Asp Trp Asp Asp Leu Gly Phe Ala Tyr Val Pro Thr Asp Tyr Met Tyr
85 90 95 Ser Met Lys Cys
Thr Lys Gly Gly Asn Phe Ser Lys Gly Glu Leu Gln 100
105 110 Arg Tyr Gly Asn Ile Glu Leu Asn Pro
Ser Ala Gly Val Leu Asn Tyr 115 120
125 Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Glu
Asp Gly 130 135 140
Asn Leu Leu Leu Phe Arg Pro Glu Glu Asn Ala Met Arg Met Ile Met 145
150 155 160 Gly Ala Glu Arg Met
Cys Met Pro Ser Pro Thr Ile Asp Gln Phe Val 165
170 175 Asp Ala Val Lys Ala Thr Val Leu Ala Asn
Lys Arg Trp Val Pro Pro 180 185
190 Pro Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Met Gly Ser
Gly 195 200 205 Ala
Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr 210
215 220 Val Ser Pro Val Gly Asn
Tyr Phe Lys Glu Gly Val Ala Pro Ile His 225 230
235 240 Leu Ile Val Glu His Glu Leu His Arg Ala Thr
Pro Gly Gly Thr Gly 245 250
255 Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala Val Leu Lys Ala Gln Ser
260 265 270 Ala Ala
Lys Ala Arg Gly Phe Ser Asp Val Leu Tyr Leu Asp Cys Val 275
280 285 His Lys Lys Tyr Leu Glu Glu
Val Ser Ser Cys Asn Ile Phe Val Val 290 295
300 Lys Gly Asn Ser Ile Ser Thr Pro Ala Ile Lys Gly
Thr Ile Leu Pro 305 310 315
320 Gly Ile Thr Arg Lys Ser Ile Ile Asp Val Ala Arg Ser Gln Gly Phe
325 330 335 Gln Val Glu
Glu Arg Leu Val Thr Val Asp Glu Leu Leu Asp Ala Asp 340
345 350 Glu Val Phe Cys Thr Gly Thr Ala
Val Val Val Ser Pro Val Gly Ser 355 360
365 Ile Thr Tyr Lys Gly Lys Arg Val Ser Tyr Gly Val Glu
Gly Phe Gly 370 375 380
Ala Val Ser Gln Gln Leu Tyr Ser Val Leu Thr Lys Leu Gln Met Gly 385
390 395 400 Leu Ile Glu Asp
Lys Met Asn Trp Thr Val Glu Leu Ser 405
410
143 930 DNA Arabidopsis thaliana
143
atggctcttc gtcgctgctt
acctcaatat tcaacaactt catcttatct ctccaagatc 60tggggatttc gtatgcatgg
gaccaaggca gcagcttctg ttgtagaaga acatgtctcg 120ggggcagaac gtgaggatga
agaatatgct gatgtagatt gggacaacct tggattcagt 180cttgtacgga cagatttcat
gttcgccacc aaaagttgca gagacggaaa cttcgaacag 240ggttacctta gccgttacgg
caacatcgag ctcaaccctg ctgctggaat tctcaactat 300ggccagggac taatagaggg
gatgaaagcg tacagaggag aagacggtag ggttcttctc 360ttccgtccag agctaaacgc
gatgcgtatg aagataggag ctgagagaat gtgtatgcat 420tctccttctg ttcatcagtt
tattgaaggt gttaagcaga ccgttcttgc aaacaggcgt 480tgggttcctc ctccgggcaa
aggctcgttg tatctcagac cgttgttgtt cggaagtgga 540gcaagcttgg gtgtggctgc
agcatcagag tacacgtttc ttgtgtttgg ctctcctgtt 600caaaactact tcaaggaagg
cacagcggcg ttgaacctgt atgtggagga ggtgattcca 660cgcgcttatc ttggaggaac
tggtggtgta aaggcaatat ccaattacgg tccagtgctt 720gaagtgatga gaagagcaaa
atcaagaggg ttttcggatg ttttgtatct tgatgcagat 780actgggaaga acatagaaga
agtctctgct gctaatatat tccttgtgaa gggcaataca 840atagtgacac cagctacgag
cggaacgatt ctcggaggga tcacacgtgg aagaacgaag 900tgttccggta gaagaactga
aggaagctga 930
144 309 PRT Arabidopsis thaliana
144
Met Ala Leu Arg Arg Cys Leu Pro Gln Tyr Ser Thr Thr Ser Ser
Tyr 1 5 10 15 Leu
Ser Lys Ile Trp Gly Phe Arg Met His Gly Thr Lys Ala Ala Ala
20 25 30 Ser Val Val Glu Glu
His Val Ser Gly Ala Glu Arg Glu Asp Glu Glu 35
40 45 Tyr Ala Asp Val Asp Trp Asp Asn Leu
Gly Phe Ser Leu Val Arg Thr 50 55
60 Asp Phe Met Phe Ala Thr Lys Ser Cys Arg Asp Gly Asn
Phe Glu Gln 65 70 75
80 Gly Tyr Leu Ser Arg Tyr Gly Asn Ile Glu Leu Asn Pro Ala Ala Gly
85 90 95 Ile Leu Asn Tyr
Gly Gln Gly Leu Ile Glu Gly Met Lys Ala Tyr Arg 100
105 110 Gly Glu Asp Gly Arg Val Leu Leu Phe
Arg Pro Glu Leu Asn Ala Met 115 120
125 Arg Met Lys Ile Gly Ala Glu Arg Met Cys Met His Ser Pro
Ser Val 130 135 140
His Gln Phe Ile Glu Gly Val Lys Gln Thr Val Leu Ala Asn Arg Arg 145
150 155 160 Trp Val Pro Pro Pro
Gly Lys Gly Ser Leu Tyr Leu Arg Pro Leu Leu 165
170 175 Phe Gly Ser Gly Ala Ser Leu Gly Val Ala
Ala Ala Ser Glu Tyr Thr 180 185
190 Phe Leu Val Phe Gly Ser Pro Val Gln Asn Tyr Phe Lys Glu Gly
Thr 195 200 205 Ala
Ala Leu Asn Leu Tyr Val Glu Glu Val Ile Pro Arg Ala Tyr Leu 210
215 220 Gly Gly Thr Gly Gly Val
Lys Ala Ile Ser Asn Tyr Gly Pro Val Leu 225 230
235 240 Glu Val Met Arg Arg Ala Lys Ser Arg Gly Phe
Ser Asp Val Leu Tyr 245 250
255 Leu Asp Ala Asp Thr Gly Lys Asn Ile Glu Glu Val Ser Ala Ala Asn
260 265 270 Ile Phe
Leu Val Lys Gly Asn Thr Ile Val Thr Pro Ala Thr Ser Gly 275
280 285 Thr Ile Leu Gly Gly Ile Thr
Arg Gly Arg Thr Lys Cys Ser Gly Arg 290 295
300 Arg Thr Glu Gly Ser 305
145 1167 DNA Arabidopsis thaliana
145
atgatcaaaa caatcacatc tctacgcaaa
actctggttc tacctcttca tttacatatt 60cgtacgctac aaactttcgc caagtacaac
gcacaagctg catcggcttt gcgagaagag 120cgtaagaaac ctctttatca aaatggagat
gatgtatatg cggatttgga ttgggataat 180ctcgggtttg gtctaaatcc agctgattac
atgtatgtca tgaaatgctc aaaagacggc 240gaattcactc aaggagaact tagtccctat
gggaatattc agctaagtcc ttctgctgga 300gtcttaaact atggacaggc gatatacgaa
ggtacaaaag catacaggaa agaaaatggg 360aagcttcttt tgtttcgtcc ggatcacaac
gctatccgga tgaagcttgg cgctgaacgg 420atgctcatgc cttctccttc ggttgatcag
tttgttaatg cagttaaaca aaccgctctt 480gcaaacaaac gttgggttcc tcctgcaggg
aaagggactt tgtacattag gcctttgttg 540atgggaagtg gtccaatact tggtttaggt
cctgcacctg aatatacatt cattgtctat 600gcatctccag ttggtaacta cttcaaggaa
gggatggctg ctcttaacct ctatgttgag 660gaagaatatg tccgagcggc tcctggtgga
gctggaggcg tcaagagcat cacaaattat 720gcgccagttt tgaaagcact gagcagagcc
aagagtcggg ggttttcaga cgttctttat 780ctcgactctg tcaagaagaa gtacttagag
gaggcttctt cttgcaacgt ctttgttgtc 840aagggtcgga caatctcaac tcctgcaact
aatggaacaa ttcttgaagg gattacgcgg 900aaaagtgtga tggagatcgc aagtgatcaa
ggttatcagg tagtagagaa ggcagttcat 960gtggatgaag taatggatgc agatgaagtt
ttttgcaccg gaactgctgt agtagttgct 1020cccgtgggca ctatcacata tcaggaaaaa
agagtagagt ataaaaccgg ggatgaatct 1080gtctgccaga aactgcgttc agtcctcgta
ggtatccaga caggattgat tgaagataac 1140aagggatggg tcacagatat caactga
1167
146 388 PRT Arabidopsis thaliana
146
Met
Ile Lys Thr Ile Thr Ser Leu Arg Lys Thr Leu Val Leu Pro Leu 1
5 10 15 His Leu His Ile Arg Thr
Leu Gln Thr Phe Ala Lys Tyr Asn Ala Gln 20
25 30 Ala Ala Ser Ala Leu Arg Glu Glu Arg Lys
Lys Pro Leu Tyr Gln Asn 35 40
45 Gly Asp Asp Val Tyr Ala Asp Leu Asp Trp Asp Asn Leu Gly
Phe Gly 50 55 60
Leu Asn Pro Ala Asp Tyr Met Tyr Val Met Lys Cys Ser Lys Asp Gly 65
70 75 80 Glu Phe Thr Gln Gly
Glu Leu Ser Pro Tyr Gly Asn Ile Gln Leu Ser 85
90 95 Pro Ser Ala Gly Val Leu Asn Tyr Gly Gln
Ala Ile Tyr Glu Gly Thr 100 105
110 Lys Ala Tyr Arg Lys Glu Asn Gly Lys Leu Leu Leu Phe Arg Pro
Asp 115 120 125 His
Asn Ala Ile Arg Met Lys Leu Gly Ala Glu Arg Met Leu Met Pro 130
135 140 Ser Pro Ser Val Asp Gln
Phe Val Asn Ala Val Lys Gln Thr Ala Leu 145 150
155 160 Ala Asn Lys Arg Trp Val Pro Pro Ala Gly Lys
Gly Thr Leu Tyr Ile 165 170
175 Arg Pro Leu Leu Met Gly Ser Gly Pro Ile Leu Gly Leu Gly Pro Ala
180 185 190 Pro Glu
Tyr Thr Phe Ile Val Tyr Ala Ser Pro Val Gly Asn Tyr Phe 195
200 205 Lys Glu Gly Met Ala Ala Leu
Asn Leu Tyr Val Glu Glu Glu Tyr Val 210 215
220 Arg Ala Ala Pro Gly Gly Ala Gly Gly Val Lys Ser
Ile Thr Asn Tyr 225 230 235
240 Ala Pro Val Leu Lys Ala Leu Ser Arg Ala Lys Ser Arg Gly Phe Ser
245 250 255 Asp Val Leu
Tyr Leu Asp Ser Val Lys Lys Lys Tyr Leu Glu Glu Ala 260
265 270 Ser Ser Cys Asn Val Phe Val Val
Lys Gly Arg Thr Ile Ser Thr Pro 275 280
285 Ala Thr Asn Gly Thr Ile Leu Glu Gly Ile Thr Arg Lys
Ser Val Met 290 295 300
Glu Ile Ala Ser Asp Gln Gly Tyr Gln Val Val Glu Lys Ala Val His 305
310 315 320 Val Asp Glu Val
Met Asp Ala Asp Glu Val Phe Cys Thr Gly Thr Ala 325
330 335 Val Val Val Ala Pro Val Gly Thr Ile
Thr Tyr Gln Glu Lys Arg Val 340 345
350 Glu Tyr Lys Thr Gly Asp Glu Ser Val Cys Gln Lys Leu Arg
Ser Val 355 360 365
Leu Val Gly Ile Gln Thr Gly Leu Ile Glu Asp Asn Lys Gly Trp Val 370
375 380 Thr Asp Ile Asn 385
147 1104 DNA Arabidopsis thaliana
147
atggctcctt ctgtgcaccc
ttcttcatca cctcttttta caagtaaagc cgatgaaaag 60tatgcgaatg taaaatggga
tgagctcgga ttcgcactgg ttccaacaga ttatatgtat 120gtggcgaaat gcaaacaagg
agagagcttt tcaacaggag agattgttcc ttatggggat 180atttctataa gcccttgtgc
tgggattctc aattatggcc agggactatt tgaaggtctc 240aaggcttaca ggacagaaga
cggtcggatc acactcttcc gacctgacca aaacgctatt 300cgtatgcaaa caggtgcaga
taggctttgt atgacacctc cttccccgga gcaattcgtt 360gaagcagtta agcaaactgt
gcttgccaac aacaaatggg tacctcctcc ggggaaagga 420gctttgtata ttaggcctct
actcataggt actggtgctg tccttggagt agcttcagct 480cctgaatata cgttcctcat
ttacacatct cccgtgggaa attatcacaa ggcaagctca 540ggcttgaacc tcaaagttga
tcataaccat cgccgagccc acttcggtgg aacagggggt 600gtgaagagct gcacaaatta
ttctccagtt gtaaaatcgt tgatcgaagc aaagtcttcg 660ggtttctctg atgtcttgtt
cctggatgcg gcaactggta aaaacatcga agaggtttct 720acttgtaaca tcttcattct
aaagggaaac attgtatcca ctcccccaac ttcaggaacc 780attttaccag gaatcacaag
gaagagcata tgtgagctag cccgtgacat tggctatgag 840gttcaagaac gtgatctttc
tgtggatgag ctattagagg cagaggaagt tttttgcacg 900gggacggcag tggtcattaa
agctgttgaa accgtgacat tccatgacaa aagggtaaaa 960tatagaacag gagaagaagc
attctctacg aagcttcact tgatattaac taatattcaa 1020atgggagttg tcgaagataa
gaagggttgg atgatggaga tcgatcattt ggttggaaca 1080gattcgtttc ctgatgaaac
ataa 1104
148 367 PRT Arabidopsis thaliana
148
Met Ala Pro Ser Val His Pro Ser Ser Ser Pro Leu Phe Thr Ser
Lys 1 5 10 15 Ala
Asp Glu Lys Tyr Ala Asn Val Lys Trp Asp Glu Leu Gly Phe Ala
20 25 30 Leu Val Pro Thr Asp
Tyr Met Tyr Val Ala Lys Cys Lys Gln Gly Glu 35
40 45 Ser Phe Ser Thr Gly Glu Ile Val Pro
Tyr Gly Asp Ile Ser Ile Ser 50 55
60 Pro Cys Ala Gly Ile Leu Asn Tyr Gly Gln Gly Leu Phe
Glu Gly Leu 65 70 75
80 Lys Ala Tyr Arg Thr Glu Asp Gly Arg Ile Thr Leu Phe Arg Pro Asp
85 90 95 Gln Asn Ala Ile
Arg Met Gln Thr Gly Ala Asp Arg Leu Cys Met Thr 100
105 110 Pro Pro Ser Pro Glu Gln Phe Val Glu
Ala Val Lys Gln Thr Val Leu 115 120
125 Ala Asn Asn Lys Trp Val Pro Pro Pro Gly Lys Gly Ala Leu
Tyr Ile 130 135 140
Arg Pro Leu Leu Ile Gly Thr Gly Ala Val Leu Gly Val Ala Ser Ala 145
150 155 160 Pro Glu Tyr Thr Phe
Leu Ile Tyr Thr Ser Pro Val Gly Asn Tyr His 165
170 175 Lys Ala Ser Ser Gly Leu Asn Leu Lys Val
Asp His Asn His Arg Arg 180 185
190 Ala His Phe Gly Gly Thr Gly Gly Val Lys Ser Cys Thr Asn Tyr
Ser 195 200 205 Pro
Val Val Lys Ser Leu Ile Glu Ala Lys Ser Ser Gly Phe Ser Asp 210
215 220 Val Leu Phe Leu Asp Ala
Ala Thr Gly Lys Asn Ile Glu Glu Val Ser 225 230
235 240 Thr Cys Asn Ile Phe Ile Leu Lys Gly Asn Ile
Val Ser Thr Pro Pro 245 250
255 Thr Ser Gly Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Cys Glu
260 265 270 Leu Ala
Arg Asp Ile Gly Tyr Glu Val Gln Glu Arg Asp Leu Ser Val 275
280 285 Asp Glu Leu Leu Glu Ala Glu
Glu Val Phe Cys Thr Gly Thr Ala Val 290 295
300 Val Ile Lys Ala Val Glu Thr Val Thr Phe His Asp
Lys Arg Val Lys 305 310 315
320 Tyr Arg Thr Gly Glu Glu Ala Phe Ser Thr Lys Leu His Leu Ile Leu
325 330 335 Thr Asn Ile
Gln Met Gly Val Val Glu Asp Lys Lys Gly Trp Met Met 340
345 350 Glu Ile Asp His Leu Val Gly Thr
Asp Ser Phe Pro Asp Glu Thr 355 360
365
149 1071 DNA Arabidopsis thaliana
149
atggctcctt cttcatcacc
tcttcgtact acaagtgaaa cagatgaaaa atatgcgaat 60gtcaaatggg aagagcttgg
attcgctctg actccaatag attatatgta tgtagccaaa 120tgcagacaag gagagagctt
tacacaaggg aagattgttc cttatggcga catttcaatt 180agcccttgtt ctccgattct
caattacggc cagggactat ttgaaggtct caaagcttac 240agaacagaag acgaccggat
taggattttc cggcctgacc aaaacgctct tcgcatgcaa 300actggtgcgg agaggctttg
tatgacacct cctactctag aacaatttgt cgaggcagtt 360aagcaaactg tgcttgccaa
caagaaatgg gttcctcctc cgggtaaagg aactctgtat 420ataaggcctc tgctactagg
gagtggtgct acccttggag tagctccagc acctgaatac 480acttttctca tatatgcatc
tcccgtagga gattaccata aggtaagctc aggcttgaac 540ctcaaagttg atcataagta
tcaccgagcc cattcaggtg gaacgggggg tgtcaagagc 600tgcacaaact attctccagt
tgtgaaatcg ttactcgaag caaagtcagc gggtttctct 660gatgtcctgt tcctggatgc
agcaactggt agaaacatcg aagagcttac tgcttgtaac 720atcttcattg tcaagggaaa
cattgtatcc accccaccaa cttcaggaac cattttacct 780ggagtcacga ggaaaagcat
aagtgagctg gctcatgata ttggctacca ggtcgaagaa 840cgcgatgtat ctgtggatga
gctactagag gcagaagaag ttttctgcac agggactgca 900gtggtcgtta aagctgttga
aactgtgacc ttccatgaca aaaaggtaaa atacaggaca 960ggagaagcag cattgtctac
gaagcttcac tcgatgttga ccaatattca gatgggagtt 1020gttgaagata agaaaggttg
gatggtggac attgatcctt gtcaaggttg a 1071
150 356 PRT Arabidopsis thaliana
150
Met Ala Pro Ser Ser Ser Pro Leu Arg Thr Thr Ser Glu Thr Asp
Glu 1 5 10 15 Lys
Tyr Ala Asn Val Lys Trp Glu Glu Leu Gly Phe Ala Leu Thr Pro
20 25 30 Ile Asp Tyr Met Tyr
Val Ala Lys Cys Arg Gln Gly Glu Ser Phe Thr 35
40 45 Gln Gly Lys Ile Val Pro Tyr Gly Asp
Ile Ser Ile Ser Pro Cys Ser 50 55
60 Pro Ile Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu
Lys Ala Tyr 65 70 75
80 Arg Thr Glu Asp Asp Arg Ile Arg Ile Phe Arg Pro Asp Gln Asn Ala
85 90 95 Leu Arg Met Gln
Thr Gly Ala Glu Arg Leu Cys Met Thr Pro Pro Thr 100
105 110 Leu Glu Gln Phe Val Glu Ala Val Lys
Gln Thr Val Leu Ala Asn Lys 115 120
125 Lys Trp Val Pro Pro Pro Gly Lys Gly Thr Leu Tyr Ile Arg
Pro Leu 130 135 140
Leu Leu Gly Ser Gly Ala Thr Leu Gly Val Ala Pro Ala Pro Glu Tyr 145
150 155 160 Thr Phe Leu Ile Tyr
Ala Ser Pro Val Gly Asp Tyr His Lys Val Ser 165
170 175 Ser Gly Leu Asn Leu Lys Val Asp His Lys
Tyr His Arg Ala His Ser 180 185
190 Gly Gly Thr Gly Gly Val Lys Ser Cys Thr Asn Tyr Ser Pro Val
Val 195 200 205 Lys
Ser Leu Leu Glu Ala Lys Ser Ala Gly Phe Ser Asp Val Leu Phe 210
215 220 Leu Asp Ala Ala Thr Gly
Arg Asn Ile Glu Glu Leu Thr Ala Cys Asn 225 230
235 240 Ile Phe Ile Val Lys Gly Asn Ile Val Ser Thr
Pro Pro Thr Ser Gly 245 250
255 Thr Ile Leu Pro Gly Val Thr Arg Lys Ser Ile Ser Glu Leu Ala His
260 265 270 Asp Ile
Gly Tyr Gln Val Glu Glu Arg Asp Val Ser Val Asp Glu Leu 275
280 285 Leu Glu Ala Glu Glu Val Phe
Cys Thr Gly Thr Ala Val Val Val Lys 290 295
300 Ala Val Glu Thr Val Thr Phe His Asp Lys Lys Val
Lys Tyr Arg Thr 305 310 315
320 Gly Glu Ala Ala Leu Ser Thr Lys Leu His Ser Met Leu Thr Asn Ile
325 330 335 Gln Met Gly
Val Val Glu Asp Lys Lys Gly Trp Met Val Asp Ile Asp 340
345 350 Pro Cys Gln Gly 355
151 1065 DNA Arabidopsis thaliana
151
atggctcctt ctgcgcaacc tcttcctgtg
agtgtttcgg atgaaaaata tgcgaatgtc 60aagtgggaag agttggcatt caagtttgtt
cgtacggatt atatgtatgt tgcgaagtgc 120aatcatggag agagttttca agaggggaag
attcttcctt ttgctgattt gcaacttaac 180ccttgcgctg ctgttcttca gtatggccag
ggtttatatg aaggactgaa agcttacagg 240acagaagatg gtcggattct gctattccga
ccagaccaaa acggtctccg ccttcaagcc 300ggagctgaca gactctatat gccttatcct
tcggtcgatc aattcgtctc cgccatcaaa 360caagttgctc ttgccaacaa gaaatggatt
cctcctccgg ggaaaggaac attgtatatt 420aggcctatct tgtttgggag tggtccgatt
cttggttcat ttcccattcc tgagaccacc 480ttcacagctt ttgcctgtcc tgttggacgt
tatcataagg ataactctgg tttgaatctg 540aaaatcgaag atcagtttcg tcgagctttt
cctagtggaa ctggtggtgt gaagagcatc 600acaaactatt gtcctgtttg gataccattg
gcagaggcga aaaaacaagg tttctctgat 660attttgtttt tggatgctgc aactggcaaa
aacattgaag aacttttcgc agctaatgtt 720tttatgctca agggcaatgt tgtatcgaca
ccaacaattg caggaactat tttgcccgga 780gtcactcgaa actgcgtaat ggaattgtgt
cgtgatttcg gctaccaggt cgaggaacgt 840acgattcctc tagtggactt tctcgatgcg
gacgaagctt tctgtactgg cactgcttcc 900attgtgacta gtattgcatc cgtaaccttt
aaagacaaaa agaccggatt caaaacaggg 960gaagaaacat tggctgcgaa gctatacgag
acgttaagtg atatccagac gggtcgggtc 1020gaggatacca agggatggac ggtggagatt
gaccgccagg gctga 1065
152 354 PRT Arabidopsis thaliana
152
Met Ala Pro Ser Ala Gln Pro Leu Pro Val Ser Val Ser Asp Glu Lys 1
5 10 15 Tyr Ala Asn Val
Lys Trp Glu Glu Leu Ala Phe Lys Phe Val Arg Thr 20
25 30 Asp Tyr Met Tyr Val Ala Lys Cys Asn
His Gly Glu Ser Phe Gln Glu 35 40
45 Gly Lys Ile Leu Pro Phe Ala Asp Leu Gln Leu Asn Pro Cys
Ala Ala 50 55 60
Val Leu Gln Tyr Gly Gln Gly Leu Tyr Glu Gly Leu Lys Ala Tyr Arg 65
70 75 80 Thr Glu Asp Gly Arg
Ile Leu Leu Phe Arg Pro Asp Gln Asn Gly Leu 85
90 95 Arg Leu Gln Ala Gly Ala Asp Arg Leu Tyr
Met Pro Tyr Pro Ser Val 100 105
110 Asp Gln Phe Val Ser Ala Ile Lys Gln Val Ala Leu Ala Asn Lys
Lys 115 120 125 Trp
Ile Pro Pro Pro Gly Lys Gly Thr Leu Tyr Ile Arg Pro Ile Leu 130
135 140 Phe Gly Ser Gly Pro Ile
Leu Gly Ser Phe Pro Ile Pro Glu Thr Thr 145 150
155 160 Phe Thr Ala Phe Ala Cys Pro Val Gly Arg Tyr
His Lys Asp Asn Ser 165 170
175 Gly Leu Asn Leu Lys Ile Glu Asp Gln Phe Arg Arg Ala Phe Pro Ser
180 185 190 Gly Thr
Gly Gly Val Lys Ser Ile Thr Asn Tyr Cys Pro Val Trp Ile 195
200 205 Pro Leu Ala Glu Ala Lys Lys
Gln Gly Phe Ser Asp Ile Leu Phe Leu 210 215
220 Asp Ala Ala Thr Gly Lys Asn Ile Glu Glu Leu Phe
Ala Ala Asn Val 225 230 235
240 Phe Met Leu Lys Gly Asn Val Val Ser Thr Pro Thr Ile Ala Gly Thr
245 250 255 Ile Leu Pro
Gly Val Thr Arg Asn Cys Val Met Glu Leu Cys Arg Asp 260
265 270 Phe Gly Tyr Gln Val Glu Glu Arg
Thr Ile Pro Leu Val Asp Phe Leu 275 280
285 Asp Ala Asp Glu Ala Phe Cys Thr Gly Thr Ala Ser Ile
Val Thr Ser 290 295 300
Ile Ala Ser Val Thr Phe Lys Asp Lys Lys Thr Gly Phe Lys Thr Gly 305
310 315 320 Glu Glu Thr Leu
Ala Ala Lys Leu Tyr Glu Thr Leu Ser Asp Ile Gln 325
330 335 Thr Gly Arg Val Glu Asp Thr Lys Gly
Trp Thr Val Glu Ile Asp Arg 340 345
350 Gln Gly
153 1242 DNA Arabidopsis thaliana
153
atggagagag
cagcaattct cccgagtgtt aatcaaaatt acctactttg tccttcacgc 60gccttctcca
cgcgcctcca ctcctctact cgtaacttat cgccgccgtc atttgcctcc 120atcaagcttc
agcattcttc ttcctctgtt tcttctaatg gtggaatctc tcttactcga 180tgcaacgctg
tttcgtccaa ttcttccagt acgttggtaa ctgaattagc cgacatagat 240tgggataccg
ttggatttgg gcttaagcca gctgattata tgtatgtgat gaaatgtaac 300attgatggag
agttctcaaa aggtgagttg caacgttttg ggaatattga aattagccca 360tctgctggtg
tactcaacta tggacaggga ttgtttgaag ggctaaaagc ttacagaaag 420aaagatggta
ataacatcct cctctttcgt cctgaggaga atgcaaagcg tatgagaaat 480ggtgctgaga
ggatgtgtat gcctgctcca accgttgagc agtttgtaga agctgtgaca 540gaaactgtac
tagcaaacaa acgttgggtt ccaccaccag gtaaaggttc cttatatgtt 600agaccattgc
taatgggaac aggagctgtt cttggtcttg cgcctgcacc agaatatact 660ttcattatct
atgtttcgcc tgttgggaac tacttcaagg aaggtgtggc acctatcaat 720ttgattgtgg
agaatgaatt tcaccgtgca actcctggtg gtaccggagg tgttaaaacc 780ataggcaatt
atgctgcagt actgaaggca cagtcaattg cgaaagctaa aggatattcc 840gatgttttgt
accttgattg catttacaaa agatatcttg aggaggtctc gtcttgcaat 900attttcatcg
tgaaggacaa tgtgatatct actcctgaaa taaaaggaac cattttaccc 960ggtattactc
gaaaaagtat gatagacgtg gctcgaacac aagggtttca ggtggaggaa 1020cggaatgtga
cagtggatga attgttagaa gcagacgagg ttttctgcac aggaaccgcc 1080gtggttgtct
ctcctgttgg aagcgtcact tacaaaggca aaagagtgtc ttacggagaa 1140ggtaccttcg
gaactgtgtc gaagcaactc tacaccgttc tgacaagctt gcagatgggt 1200ctgattgaag
acaacatgaa atggactgtg aatcttagtt aa
1242
154 413 PRT Arabidopsis thaliana
154
Met Glu Arg Ala Ala Ile Leu Pro Ser
Val Asn Gln Asn Tyr Leu Leu 1 5 10
15 Cys Pro Ser Arg Ala Phe Ser Thr Arg Leu His Ser Ser Thr
Arg Asn 20 25 30
Leu Ser Pro Pro Ser Phe Ala Ser Ile Lys Leu Gln His Ser Ser Ser
35 40 45 Ser Val Ser Ser
Asn Gly Gly Ile Ser Leu Thr Arg Cys Asn Ala Val 50
55 60 Ser Ser Asn Ser Ser Ser Thr Leu
Val Thr Glu Leu Ala Asp Ile Asp 65 70
75 80 Trp Asp Thr Val Gly Phe Gly Leu Lys Pro Ala Asp
Tyr Met Tyr Val 85 90
95 Met Lys Cys Asn Ile Asp Gly Glu Phe Ser Lys Gly Glu Leu Gln Arg
100 105 110 Phe Gly Asn
Ile Glu Ile Ser Pro Ser Ala Gly Val Leu Asn Tyr Gly 115
120 125 Gln Gly Leu Phe Glu Gly Leu Lys
Ala Tyr Arg Lys Lys Asp Gly Asn 130 135
140 Asn Ile Leu Leu Phe Arg Pro Glu Glu Asn Ala Lys Arg
Met Arg Asn 145 150 155
160 Gly Ala Glu Arg Met Cys Met Pro Ala Pro Thr Val Glu Gln Phe Val
165 170 175 Glu Ala Val Thr
Glu Thr Val Leu Ala Asn Lys Arg Trp Val Pro Pro 180
185 190 Pro Gly Lys Gly Ser Leu Tyr Val Arg
Pro Leu Leu Met Gly Thr Gly 195 200
205 Ala Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Ile
Ile Tyr 210 215 220
Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Val Ala Pro Ile Asn 225
230 235 240 Leu Ile Val Glu Asn
Glu Phe His Arg Ala Thr Pro Gly Gly Thr Gly 245
250 255 Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala
Val Leu Lys Ala Gln Ser 260 265
270 Ile Ala Lys Ala Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Cys
Ile 275 280 285 Tyr
Lys Arg Tyr Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Ile Val 290
295 300 Lys Asp Asn Val Ile Ser
Thr Pro Glu Ile Lys Gly Thr Ile Leu Pro 305 310
315 320 Gly Ile Thr Arg Lys Ser Met Ile Asp Val Ala
Arg Thr Gln Gly Phe 325 330
335 Gln Val Glu Glu Arg Asn Val Thr Val Asp Glu Leu Leu Glu Ala Asp
340 345 350 Glu Val
Phe Cys Thr Gly Thr Ala Val Val Val Ser Pro Val Gly Ser 355
360 365 Val Thr Tyr Lys Gly Lys Arg
Val Ser Tyr Gly Glu Gly Thr Phe Gly 370 375
380 Thr Val Ser Lys Gln Leu Tyr Thr Val Leu Thr Ser
Leu Gln Met Gly 385 390 395
400 Leu Ile Glu Asp Asn Met Lys Trp Thr Val Asn Leu Ser
405 410
155 1248 DNA Arabidopsis thaliana
155 atggagagaa gcgccgttgc ctcaggtttt catagaaatt acatcctctg tgcttcacgc
60gccgccactt ccacgacgcg cctccactct ttgtcctccc tcagaaactt tccctcttcc
120tctctcagga ttcgtcactg tccttctccc atctcttcca atttcatcgt tagtgaagtt
180tcccgaaacc gacgatgcga cgccgtttct tccagcacca ccgatgtgac tgaattagcc
240gaaattgatt gggacaagat tgattttggg cttaaaccaa cggattacat gtacgccatg
300aaatgtagcc gtgatggtga attctctcaa ggtcaattgc aaccttttgg taacattgac
360attaacccag cagctggtgt tctcaactat ggacaaggtt tgtttgaagg tctaaaagct
420tacagaaaac aagatgggaa tattctactc ttccgtcctg aggagaatgc gatccgaatg
480agaaatggcg ctgaaagaat gtgtatgcct tctccaaccg ttgaacagtt tgttgaggct
540gtgaaaacta ctgtattagc taacaaacgc tggattccac ctccaggtaa aggatcatta
600tacataaggc cattgctaat gggaactgga gctgttcttg gtcttgctcc tgctcctgaa
660tacactttcc ttatctttgt ttcacctgtc gggaactact tcaaggaagg tgttgcgccg
720atcaacttaa ttgttgaaac tgaattccat cgtgcaactc ccggcggtac tggaggtgtt
780aaaaccatcg gtaattatgc tgcagtcttg aaggctcagt cgattgcgaa agctaaaggg
840tattctgatg ttttatacct tgattgcctt cacaaaagat atcttgagga ggtttcatcg
900tgcaatattt tcattgtgaa ggataatgtg atatctactc ctgaaattaa aggaaccatc
960ttgcctggaa ttacccggaa gagtatcatc gaagtagctc gtagccaagg tttcaaggtg
1020gaggaacgaa atgtgacagt tgatgaattg gtagaagcag acgaggtttt ctgcacagga
1080accgccgttg ttttatctcc ggttggaagc atcacttaca aaagccaaag gttttcttat
1140ggagaagatg gctttggaac agtctcgaaa caactctaca cttccttgac gagcctgcaa
1200atgggtctga gcgaagataa catgaactgg actgttcaat tgagttaa
1248
156 415 PRT Arabidopsis thaliana
156 Met Glu Arg Ser Ala Val Ala Ser Gly
Phe His Arg Asn Tyr Ile Leu 1 5 10
15 Cys Ala Ser Arg Ala Ala Thr Ser Thr Thr Arg Leu His Ser
Leu Ser 20 25 30
Ser Leu Arg Asn Phe Pro Ser Ser Ser Leu Arg Ile Arg His Cys Pro
35 40 45 Ser Pro Ile Ser
Ser Asn Phe Ile Val Ser Glu Val Ser Arg Asn Arg 50
55 60 Arg Cys Asp Ala Val Ser Ser Ser
Thr Thr Asp Val Thr Glu Leu Ala 65 70
75 80 Glu Ile Asp Trp Asp Lys Ile Asp Phe Gly Leu Lys
Pro Thr Asp Tyr 85 90
95 Met Tyr Ala Met Lys Cys Ser Arg Asp Gly Glu Phe Ser Gln Gly Gln
100 105 110 Leu Gln Pro
Phe Gly Asn Ile Asp Ile Asn Pro Ala Ala Gly Val Leu 115
120 125 Asn Tyr Gly Gln Gly Leu Phe Glu
Gly Leu Lys Ala Tyr Arg Lys Gln 130 135
140 Asp Gly Asn Ile Leu Leu Phe Arg Pro Glu Glu Asn Ala
Ile Arg Met 145 150 155
160 Arg Asn Gly Ala Glu Arg Met Cys Met Pro Ser Pro Thr Val Glu Gln
165 170 175 Phe Val Glu Ala
Val Lys Thr Thr Val Leu Ala Asn Lys Arg Trp Ile 180
185 190 Pro Pro Pro Gly Lys Gly Ser Leu Tyr
Ile Arg Pro Leu Leu Met Gly 195 200
205 Thr Gly Ala Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr
Phe Leu 210 215 220
Ile Phe Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Val Ala Pro 225
230 235 240 Ile Asn Leu Ile Val
Glu Thr Glu Phe His Arg Ala Thr Pro Gly Gly 245
250 255 Thr Gly Gly Val Lys Thr Ile Gly Asn Tyr
Ala Ala Val Leu Lys Ala 260 265
270 Gln Ser Ile Ala Lys Ala Lys Gly Tyr Ser Asp Val Leu Tyr Leu
Asp 275 280 285 Cys
Leu His Lys Arg Tyr Leu Glu Glu Val Ser Ser Cys Asn Ile Phe 290
295 300 Ile Val Lys Asp Asn Val
Ile Ser Thr Pro Glu Ile Lys Gly Thr Ile 305 310
315 320 Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile Glu
Val Ala Arg Ser Gln 325 330
335 Gly Phe Lys Val Glu Glu Arg Asn Val Thr Val Asp Glu Leu Val Glu
340 345 350 Ala Asp
Glu Val Phe Cys Thr Gly Thr Ala Val Val Leu Ser Pro Val 355
360 365 Gly Ser Ile Thr Tyr Lys Ser
Gln Arg Phe Ser Tyr Gly Glu Asp Gly 370 375
380 Phe Gly Thr Val Ser Lys Gln Leu Tyr Thr Ser Leu
Thr Ser Leu Gln 385 390 395
400 Met Gly Leu Ser Glu Asp Asn Met Asn Trp Thr Val Gln Leu Ser
405 410 415
157 1242 DNA Glycine max
157 atgggaaaac agaaacagaa aatggagagc attcgactaa tttacccgat ctgcccctct
60cgacattctt cctttcttct ctctcatcaa tctcccttcc tatgcgaacc ttctctctct
120ctcaagcttc gaaagcagtt tcctctcact tcgcagaatg ttctggaagc cgcctctcct
180ctcaggcctt ccgccactct gtcttctgat ccctacagtg agacgattga attagctgat
240atagaatggg acaaccttgg gtttgggctt caacccactg attatatgta tatcatgaaa
300tgcacacgag gtggaacctt ttccaaaggt gaattgcagc gttttgggaa catcgagttg
360aacccctccg ctggagtttt aaactatggc cagggattat ttgagggttt gaaagcatac
420cgcaaacaag atgggagtat actcctcttc cgtccggaag aaaatggttt gcggatgcag
480ataggtgcgg agcggatgtg catgccatca cctactatgg agcagtttgt ggaagctgtg
540aaggatactg ttttagctaa caaacgttgg gttccccctg caggtaaagg ttccttgtat
600attagacctt tgttaatggg aagtggacct gtacttggtg ttgcacctgc accagagtac
660acatttctaa tatatgtttc acctgttggg aactacttca aggaaggttt ggccccaatc
720aatttgattg tagaaaatga attccatcgt gcaactcctg gtggcactgg aggtgtgaag
780accattggaa actatgctgc agttctgaag gcacagtctg aagcaaaagc taaaggctac
840tctgatgttt tataccttga ctgtgtgcac aaaagatatt tggaggaggt ttcttcatgc
900aacatttttg ttgttaaggg taacattatt tcaactccag ctattaaagg gacaatccta
960cctggcatta ctcgcaaaag tataattgat gtggctcgaa gcgaagggtt tcaggttgag
1020gagcgattag tttcagtgga tgaattgcta gatgctgatg aggttttctg cacgggaaca
1080gctgtggttg tatcacctgt tggcagtatt acttatcttg gcaagagggt aacatatggg
1140gatggtattg gcgtggttgc acagcaactt tatactgtcc ttaccagatt acagatgggt
1200cttacggagg atgagatgaa ttggactgtt gagctgagat aa
1242
158 413 PRT Glycine max
158 Met Gly Lys Gln Lys Gln Lys Met Glu Ser Ile
Arg Leu Ile Tyr Pro 1 5 10
15 Ile Cys Pro Ser Arg His Ser Ser Phe Leu Leu Ser His Gln Ser Pro
20 25 30 Phe Leu
Cys Glu Pro Ser Leu Ser Leu Lys Leu Arg Lys Gln Phe Pro 35
40 45 Leu Thr Ser Gln Asn Val Leu
Glu Ala Ala Ser Pro Leu Arg Pro Ser 50 55
60 Ala Thr Leu Ser Ser Asp Pro Tyr Ser Glu Thr Ile
Glu Leu Ala Asp 65 70 75
80 Ile Glu Trp Asp Asn Leu Gly Phe Gly Leu Gln Pro Thr Asp Tyr Met
85 90 95 Tyr Ile Met
Lys Cys Thr Arg Gly Gly Thr Phe Ser Lys Gly Glu Leu 100
105 110 Gln Arg Phe Gly Asn Ile Glu Leu
Asn Pro Ser Ala Gly Val Leu Asn 115 120
125 Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg
Lys Gln Asp 130 135 140
Gly Ser Ile Leu Leu Phe Arg Pro Glu Glu Asn Gly Leu Arg Met Gln 145
150 155 160 Ile Gly Ala Glu
Arg Met Cys Met Pro Ser Pro Thr Met Glu Gln Phe 165
170 175 Val Glu Ala Val Lys Asp Thr Val Leu
Ala Asn Lys Arg Trp Val Pro 180 185
190 Pro Ala Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Met
Gly Ser 195 200 205
Gly Pro Val Leu Gly Val Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile 210
215 220 Tyr Val Ser Pro Val
Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile 225 230
235 240 Asn Leu Ile Val Glu Asn Glu Phe His Arg
Ala Thr Pro Gly Gly Thr 245 250
255 Gly Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala Val Leu Lys Ala
Gln 260 265 270 Ser
Glu Ala Lys Ala Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Cys 275
280 285 Val His Lys Arg Tyr Leu
Glu Glu Val Ser Ser Cys Asn Ile Phe Val 290 295
300 Val Lys Gly Asn Ile Ile Ser Thr Pro Ala Ile
Lys Gly Thr Ile Leu 305 310 315
320 Pro Gly Ile Thr Arg Lys Ser Ile Ile Asp Val Ala Arg Ser Glu Gly
325 330 335 Phe Gln
Val Glu Glu Arg Leu Val Ser Val Asp Glu Leu Leu Asp Ala 340
345 350 Asp Glu Val Phe Cys Thr Gly
Thr Ala Val Val Val Ser Pro Val Gly 355 360
365 Ser Ile Thr Tyr Leu Gly Lys Arg Val Thr Tyr Gly
Asp Gly Ile Gly 370 375 380
Val Val Ala Gln Gln Leu Tyr Thr Val Leu Thr Arg Leu Gln Met Gly 385
390 395 400 Leu Thr Glu
Asp Glu Met Asn Trp Thr Val Glu Leu Arg 405
410
159 1167 DNA Glycine max
159 atgattcaaa gaactgtgtc
ctttcccagt ttaaggaaat tgcttcttcg ggctggttgt 60tctaaatctg cttcgtccaa
gatcggaact tacaattgct ttgcttctca gtcctcccct 120ctaccgagcc acaaccctag
ttaccgtgat gacgagtatg ctgatgtgga ctgggacagt 180cttggatttg gactgatgcc
cactgattat atgtatatta ctaaatgttg tgagggccaa 240aattttggac aaggacaact
cagtcgttat gggaacattg aactcagtcc atcagctggt 300gtcctaaatt atggtcaggg
tttattcgaa ggcacgaaag catacagaaa agaaaatggg 360ggcttgctac tcttccgtcc
agaagaaaat gccattcgca tgaagactgg tgcccaaaga 420atgtgcatgg catcgccttc
cattgatcat tttgttgatg ctttgaagca aactgtcttg 480gctaataagc gttgggttcc
tccaccgggc aaaggatcct tgtaccttag gcctctgctc 540ctaggaactg gtccggtttt
gggtttggct cctgcacctg aatacacatt cctcatattt 600gcttcccctg ttcgcaacta
tttcaaggag ggctctgctc cactcaactt gtacgtggag 660gaaaactttg accgtgcttc
tagccgcggc actggaaacg ttaaaaccat ttccaattat 720gcaccggtct tgatggcaca
aattcaagcc aagaaaagag gattttcgga tgtgctatac 780cttgattcag acaccaagaa
aaatctcgag gaggtctctt cttgtaacat ttttattgcc 840aagggcaaat gcatctcaac
acctgctact aatggaacta ttctttccgg aattacccga 900aaaagtgtca ttgaaattgc
tcgcgatcat ggctatcagg tagaagagcg tgctgttgcc 960gtggatgaat tgattgaggc
tgatgaagtt ttctgcacag gaactgcggt cggtgttgct 1020ccagtaggga gtatcacata
ccaggataaa aggatggaat atataacagg ttctggaacc 1080atttgtcaag agctgaacaa
taccatttca ggaattcaaa cgggtactat tgaagataag 1140aagggatgga ttgtcgaagt
tgattaa 1167
160 388 PRT Glycine max
160 Met Ile Gln Arg Thr Val Ser Phe Pro Ser Leu Arg Lys Leu Leu Leu 1
5 10 15 Arg Ala Gly Cys
Ser Lys Ser Ala Ser Ser Lys Ile Gly Thr Tyr Asn 20
25 30 Cys Phe Ala Ser Gln Ser Ser Pro Leu
Pro Ser His Asn Pro Ser Tyr 35 40
45 Arg Asp Asp Glu Tyr Ala Asp Val Asp Trp Asp Ser Leu Gly
Phe Gly 50 55 60
Leu Met Pro Thr Asp Tyr Met Tyr Ile Thr Lys Cys Cys Glu Gly Gln 65
70 75 80 Asn Phe Gly Gln Gly
Gln Leu Ser Arg Tyr Gly Asn Ile Glu Leu Ser 85
90 95 Pro Ser Ala Gly Val Leu Asn Tyr Gly Gln
Gly Leu Phe Glu Gly Thr 100 105
110 Lys Ala Tyr Arg Lys Glu Asn Gly Gly Leu Leu Leu Phe Arg Pro
Glu 115 120 125 Glu
Asn Ala Ile Arg Met Lys Thr Gly Ala Gln Arg Met Cys Met Ala 130
135 140 Ser Pro Ser Ile Asp His
Phe Val Asp Ala Leu Lys Gln Thr Val Leu 145 150
155 160 Ala Asn Lys Arg Trp Val Pro Pro Pro Gly Lys
Gly Ser Leu Tyr Leu 165 170
175 Arg Pro Leu Leu Leu Gly Thr Gly Pro Val Leu Gly Leu Ala Pro Ala
180 185 190 Pro Glu
Tyr Thr Phe Leu Ile Phe Ala Ser Pro Val Arg Asn Tyr Phe 195
200 205 Lys Glu Gly Ser Ala Pro Leu
Asn Leu Tyr Val Glu Glu Asn Phe Asp 210 215
220 Arg Ala Ser Ser Arg Gly Thr Gly Asn Val Lys Thr
Ile Ser Asn Tyr 225 230 235
240 Ala Pro Val Leu Met Ala Gln Ile Gln Ala Lys Lys Arg Gly Phe Ser
245 250 255 Asp Val Leu
Tyr Leu Asp Ser Asp Thr Lys Lys Asn Leu Glu Glu Val 260
265 270 Ser Ser Cys Asn Ile Phe Ile Ala
Lys Gly Lys Cys Ile Ser Thr Pro 275 280
285 Ala Thr Asn Gly Thr Ile Leu Ser Gly Ile Thr Arg Lys
Ser Val Ile 290 295 300
Glu Ile Ala Arg Asp His Gly Tyr Gln Val Glu Glu Arg Ala Val Ala 305
310 315 320 Val Asp Glu Leu
Ile Glu Ala Asp Glu Val Phe Cys Thr Gly Thr Ala 325
330 335 Val Gly Val Ala Pro Val Gly Ser Ile
Thr Tyr Gln Asp Lys Arg Met 340 345
350 Glu Tyr Ile Thr Gly Ser Gly Thr Ile Cys Gln Glu Leu Asn
Asn Thr 355 360 365
Ile Ser Gly Ile Gln Thr Gly Thr Ile Glu Asp Lys Lys Gly Trp Ile 370
375 380 Val Glu Val Asp 385
161 1080 DNA Glycine max
161 atgtctcccc cttctatgtt aggcaaccgc
aaagacggtt ctgaaattgc tgttgcggaa 60aactatgctg acattaattg ggatgagctt
ggatttagtc tagttccaac agattacatg 120tatgtcatga aatgtgcaaa aggagataag
ttttcacaag gatccatcgt tccatttgga 180aacatagaga tcagcccttc tgctggaatc
ttaaattatg gacagggact ctttgagggg 240ctaaaagcac atagaactga agatgggcat
gtacttttat ttcgaccaga tgagaatgct 300caacgcatga aacgaggtgc agatagattg
tgtatgccat ccccatctcc tggccaattt 360gttaatgctg taaagcagat agttattgcc
aacaaacgtt gggtgcctcc accagggaaa 420gggtcactat atattaggcc attgctgata
ggaacaggag cattgttagg ggtggcacct 480gcccctgagt atacatttct tatttattgt
tctccagttg gcagctacca gaagggtgca 540ctaaatttaa aggttgagga taaactatat
agagcaatat ctggctgtgg tggaactgga 600gggatcaaaa gtgtcaccaa ttatgcccct
gtttatactg caatggctga tgcaaaggcc 660aacggattct ctgatgtgct gttcttagac
tcagcaactg gaaaacatat agaggaggcc 720tcagcatgca atgtatttgt tttgaaggac
aatgctatct ccactcctgc aatagatgga 780accatcctac ctggtatcac ccgaaaatcc
atcattgaca ttgccattga tttgggttat 840caggtcatgg aacgttccgt atcagtggag
gagatgctag gtgctgatga aatgttctgc 900actggaactg cagttgttgt caactctgtt
gcatctgtaa cttataagga aacaagggtg 960gattacaaaa caggcccagc aacattgtcc
tcaaaactac ggaaaacact tgttggaatt 1020caaacagggt gtcttgagga caaaaaatca
tggacagtcc gagtagattc aacaatatag 1080
162 359 PRT Glycine max
162 Met Ser
Pro Pro Ser Met Leu Gly Asn Arg Lys Asp Gly Ser Glu Ile 1 5
10 15 Ala Val Ala Glu Asn Tyr Ala
Asp Ile Asn Trp Asp Glu Leu Gly Phe 20 25
30 Ser Leu Val Pro Thr Asp Tyr Met Tyr Val Met Lys
Cys Ala Lys Gly 35 40 45
Asp Lys Phe Ser Gln Gly Ser Ile Val Pro Phe Gly Asn Ile Glu Ile
50 55 60 Ser Pro Ser
Ala Gly Ile Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly 65
70 75 80 Leu Lys Ala His Arg Thr Glu
Asp Gly His Val Leu Leu Phe Arg Pro 85
90 95 Asp Glu Asn Ala Gln Arg Met Lys Arg Gly Ala
Asp Arg Leu Cys Met 100 105
110 Pro Ser Pro Ser Pro Gly Gln Phe Val Asn Ala Val Lys Gln Ile
Val 115 120 125 Ile
Ala Asn Lys Arg Trp Val Pro Pro Pro Gly Lys Gly Ser Leu Tyr 130
135 140 Ile Arg Pro Leu Leu Ile
Gly Thr Gly Ala Leu Leu Gly Val Ala Pro 145 150
155 160 Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Cys Ser
Pro Val Gly Ser Tyr 165 170
175 Gln Lys Gly Ala Leu Asn Leu Lys Val Glu Asp Lys Leu Tyr Arg Ala
180 185 190 Ile Ser
Gly Cys Gly Gly Thr Gly Gly Ile Lys Ser Val Thr Asn Tyr 195
200 205 Ala Pro Val Tyr Thr Ala Met
Ala Asp Ala Lys Ala Asn Gly Phe Ser 210 215
220 Asp Val Leu Phe Leu Asp Ser Ala Thr Gly Lys His
Ile Glu Glu Ala 225 230 235
240 Ser Ala Cys Asn Val Phe Val Leu Lys Asp Asn Ala Ile Ser Thr Pro
245 250 255 Ala Ile Asp
Gly Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile 260
265 270 Asp Ile Ala Ile Asp Leu Gly Tyr
Gln Val Met Glu Arg Ser Val Ser 275 280
285 Val Glu Glu Met Leu Gly Ala Asp Glu Met Phe Cys Thr
Gly Thr Ala 290 295 300
Val Val Val Asn Ser Val Ala Ser Val Thr Tyr Lys Glu Thr Arg Val 305
310 315 320 Asp Tyr Lys Thr
Gly Pro Ala Thr Leu Ser Ser Lys Leu Arg Lys Thr 325
330 335 Leu Val Gly Ile Gln Thr Gly Cys Leu
Glu Asp Lys Lys Ser Trp Thr 340 345
350 Val Arg Val Asp Ser Thr Ile 355
163 1203 DNA Glycine max
163 atggtgcctc ctcatggaaa aggagcgttg tacattaggc
ctttattatt tggaagtgga 60tctgttatgg gtattgcacc tgcaccacat tgcaccttcc
taatatacac taatccaatt 120tccaacgctt acaaatgcag ggttgagttc aaaacaggag
ccgacactgt gacccagaat 180actgctgggg aaaactatgc tgacattaat tgggatgagc
ttggatttag tctagttcca 240acagattaca tgtatgtcat gaaatgtgca aagggagata
agttttcaca aggatccatc 300cttccctatg gaaacttaga gattaaccct tctgctggaa
tcttaaatta tgggcaggga 360atctttgagg gactaaaggc atatagaact gaagatgggt
gcatccttct gtttagacca 420gaagagaatg ctcaacgaat gaagatagga gcagacagat
tgtgcatgcc atccccatcc 480attgaccagt ttgttgctgc tgtgaagcag acagttcttg
ccaacaaacg ttgggtgcct 540ccaccaggga aagggtcact atatattagg ccattgctca
tgggaaccgg agcttctttg 600aacttgtctc cagcacctga gtacacatta cttatttatt
gttctcctgt cactaattac 660cacaagggtt cactgaactt aaaagtggag agtaagttct
accgagcaat atctggcact 720ggtggaaccg gagggatcaa gagtgttacc aactatgccc
ctgtttatgc tgcaagcatt 780gaagcaaagg ccagtggatt ctctgatgtt ttgttcttgg
actcagcaac tggaaaaaat 840atagaggagg tttctgcatg caatgtgttt gttgtgaagg
gtaatgctat ctgcaccccg 900gcaacaaatg gagccatcct ccctgggatc acacgaaaat
ccatcattga gattgcctta 960gatatgggtt atcaggtcac ggaacgtgcc atatcagtgg
aggaaatgct agatgctgat 1020gaagtgttct gcacaggaac tgcagttgtt gtcaactctg
tttcatctgt aacctacaaa 1080gaaacaagaa ctgagtataa aacaggacca gaaacattgt
cccaaaaact gcgcaaaaca 1140ctggttggaa ttcaaactgg gtgtattgag gacacaaagg
gctggacagt tcgaatagat 1200tga
1203
164 400 PRT Glycine max
164 Met Val Pro Pro His Gly
Lys Gly Ala Leu Tyr Ile Arg Pro Leu Leu 1 5
10 15 Phe Gly Ser Gly Ser Val Met Gly Ile Ala Pro
Ala Pro His Cys Thr 20 25
30 Phe Leu Ile Tyr Thr Asn Pro Ile Ser Asn Ala Tyr Lys Cys Arg
Val 35 40 45 Glu
Phe Lys Thr Gly Ala Asp Thr Val Thr Gln Asn Thr Ala Gly Glu 50
55 60 Asn Tyr Ala Asp Ile Asn
Trp Asp Glu Leu Gly Phe Ser Leu Val Pro 65 70
75 80 Thr Asp Tyr Met Tyr Val Met Lys Cys Ala Lys
Gly Asp Lys Phe Ser 85 90
95 Gln Gly Ser Ile Leu Pro Tyr Gly Asn Leu Glu Ile Asn Pro Ser Ala
100 105 110 Gly Ile
Leu Asn Tyr Gly Gln Gly Ile Phe Glu Gly Leu Lys Ala Tyr 115
120 125 Arg Thr Glu Asp Gly Cys Ile
Leu Leu Phe Arg Pro Glu Glu Asn Ala 130 135
140 Gln Arg Met Lys Ile Gly Ala Asp Arg Leu Cys Met
Pro Ser Pro Ser 145 150 155
160 Ile Asp Gln Phe Val Ala Ala Val Lys Gln Thr Val Leu Ala Asn Lys
165 170 175 Arg Trp Val
Pro Pro Pro Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu 180
185 190 Leu Met Gly Thr Gly Ala Ser Leu
Asn Leu Ser Pro Ala Pro Glu Tyr 195 200
205 Thr Leu Leu Ile Tyr Cys Ser Pro Val Thr Asn Tyr His
Lys Gly Ser 210 215 220
Leu Asn Leu Lys Val Glu Ser Lys Phe Tyr Arg Ala Ile Ser Gly Thr 225
230 235 240 Gly Gly Thr Gly
Gly Ile Lys Ser Val Thr Asn Tyr Ala Pro Val Tyr 245
250 255 Ala Ala Ser Ile Glu Ala Lys Ala Ser
Gly Phe Ser Asp Val Leu Phe 260 265
270 Leu Asp Ser Ala Thr Gly Lys Asn Ile Glu Glu Val Ser Ala
Cys Asn 275 280 285
Val Phe Val Val Lys Gly Asn Ala Ile Cys Thr Pro Ala Thr Asn Gly 290
295 300 Ala Ile Leu Pro Gly
Ile Thr Arg Lys Ser Ile Ile Glu Ile Ala Leu 305 310
315 320 Asp Met Gly Tyr Gln Val Thr Glu Arg Ala
Ile Ser Val Glu Glu Met 325 330
335 Leu Asp Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val Val
Asn 340 345 350 Ser
Val Ser Ser Val Thr Tyr Lys Glu Thr Arg Thr Glu Tyr Lys Thr 355
360 365 Gly Pro Glu Thr Leu Ser
Gln Lys Leu Arg Lys Thr Leu Val Gly Ile 370 375
380 Gln Thr Gly Cys Ile Glu Asp Thr Lys Gly Trp
Thr Val Arg Ile Asp 385 390 395
400
165 1236 DNA Glycine max
165 atggagagca gcgccgtcta cggaagcatt
cgaccaagtt acccgatctg cccctctcga 60cgttcttcct ctcttctctc tcaccaatct
cccttcctat tcgagccttc tctttctctc 120aagcttcgca agcagtttcc tctcatttcg
cagaattttc tggaagccgc ttctcctctc 180aggccttctg ccactttgtc ttctgattcc
taccgtgaga cgattgaatt agctgatata 240gaatgggaca accttggttt tgggcttcaa
cccacggatt atatgtattc catgaaatgc 300acacgaggtg gaaccttctc caaaggtgaa
ctgcagcgtt ttggtaacat tgaattgaac 360ccctcagcag gagttttaaa ctatggtcag
ggattatttg agggtttgaa agcgtatcgg 420aaacaagatg ggagtatact cctcttccgt
ccggaagaaa atggtttgcg gatgcagata 480ggtgcagaga ggatgtgcat gccatcgcct
actgttgagc agtttgtgga agctgtaaag 540gagacagttt tagcaaacaa acgttgggtt
ccccctgcag gtaaaggttc cctgtatatt 600agacctttgc taatgggaag tggaccggta
cttggtcttg cacctgctcc agagtacacc 660tttctaatat acgtttcacc tgttgggaac
tacttcaagg aaggtttggc cccaatcaat 720ttgattgtgg aaaatgaatt acatcgtgca
actcctggtg gcactggagg tgtgaagacc 780attggaaact atgctgcagt tctgaaggca
cagtctgaag caaaagctaa aggctactct 840gatgttttat accttgactg tgtgcacaaa
agatatttgg aggaggtttc ttcatgcaac 900atttttgttg ttaagggtaa cgttatttca
actccagcta ttaaagggac aatcctacct 960ggcattactc gcaaaagtat aattgatgtt
gctcgaagcc aagggttcca ggttgaggag 1020cgattagtgt cagtggatga attgctagat
gctgacgagg tcttctgcac gggaacagct 1080gtggttgtat cacctgttgg cagtattact
tatcttgaca agagtttatt tctattaatt 1140tgttgtgatt ttactttatt ttatttttta
tacactgaaa gactaaggcc atatgaattt 1200ggtccatatt tatctccatt ctcttcacct
cattaa 1236
166 411 PRT Glycine max
166 Met Glu
Ser Ser Ala Val Tyr Gly Ser Ile Arg Pro Ser Tyr Pro Ile 1 5
10 15 Cys Pro Ser Arg Arg Ser Ser
Ser Leu Leu Ser His Gln Ser Pro Phe 20 25
30 Leu Phe Glu Pro Ser Leu Ser Leu Lys Leu Arg Lys
Gln Phe Pro Leu 35 40 45
Ile Ser Gln Asn Phe Leu Glu Ala Ala Ser Pro Leu Arg Pro Ser Ala
50 55 60 Thr Leu Ser
Ser Asp Ser Tyr Arg Glu Thr Ile Glu Leu Ala Asp Ile 65
70 75 80 Glu Trp Asp Asn Leu Gly Phe
Gly Leu Gln Pro Thr Asp Tyr Met Tyr 85
90 95 Ser Met Lys Cys Thr Arg Gly Gly Thr Phe Ser
Lys Gly Glu Leu Gln 100 105
110 Arg Phe Gly Asn Ile Glu Leu Asn Pro Ser Ala Gly Val Leu Asn
Tyr 115 120 125 Gly
Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Gln Asp Gly 130
135 140 Ser Ile Leu Leu Phe Arg
Pro Glu Glu Asn Gly Leu Arg Met Gln Ile 145 150
155 160 Gly Ala Glu Arg Met Cys Met Pro Ser Pro Thr
Val Glu Gln Phe Val 165 170
175 Glu Ala Val Lys Glu Thr Val Leu Ala Asn Lys Arg Trp Val Pro Pro
180 185 190 Ala Gly
Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Met Gly Ser Gly 195
200 205 Pro Val Leu Gly Leu Ala Pro
Ala Pro Glu Tyr Thr Phe Leu Ile Tyr 210 215
220 Val Ser Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu
Ala Pro Ile Asn 225 230 235
240 Leu Ile Val Glu Asn Glu Leu His Arg Ala Thr Pro Gly Gly Thr Gly
245 250 255 Gly Val Lys
Thr Ile Gly Asn Tyr Ala Ala Val Leu Lys Ala Gln Ser 260
265 270 Glu Ala Lys Ala Lys Gly Tyr Ser
Asp Val Leu Tyr Leu Asp Cys Val 275 280
285 His Lys Arg Tyr Leu Glu Glu Val Ser Ser Cys Asn Ile
Phe Val Val 290 295 300
Lys Gly Asn Val Ile Ser Thr Pro Ala Ile Lys Gly Thr Ile Leu Pro 305
310 315 320 Gly Ile Thr Arg
Lys Ser Ile Ile Asp Val Ala Arg Ser Gln Gly Phe 325
330 335 Gln Val Glu Glu Arg Leu Val Ser Val
Asp Glu Leu Leu Asp Ala Asp 340 345
350 Glu Val Phe Cys Thr Gly Thr Ala Val Val Val Ser Pro Val
Gly Ser 355 360 365
Ile Thr Tyr Leu Asp Lys Ser Leu Phe Leu Leu Ile Cys Cys Asp Phe 370
375 380 Thr Leu Phe Tyr Phe
Leu Tyr Thr Glu Arg Leu Arg Pro Tyr Glu Phe 385 390
395 400 Gly Pro Tyr Leu Ser Pro Phe Ser Ser Pro
His 405 410
167 1143 DNA Helianthus annuus
167 atgatgatcc gacaaagccc cttccttctt ggtttgattc agacttccat
atcaaagatt 60tctgcaagat gtttgacggc acaggctgcc tcggcgcttc aggaaaatga
acctatgatc 120agacacgagg aatatgctgc tgatattgat tggaacaact tgggttttgg
tataaaacaa 180accgattaca tgtacaagtc taaatgcaca aagaacaaca cttttgagca
aggacaacta 240gttaattatg gaaacttaga attgagcccg gctgctggag ttttaaacta
cggccaggga 300ctcttcgaag gtacaaaagc cgtgagggga gaagacggtc gccttttgct
ttttcgaccc 360gatcaaaacg ccatccgaat gcaaatcgga gccgagcgaa tgtgcatgca
atccccatct 420atagaacagg ttgtagatgc agttaaacaa acagctttag ccaataaacg
ttggattcca 480cctccaggaa aagggtcgct ttacatcagg cctttgctca ttggaactgg
gcctatattg 540ggcttatctc ctgctcatga gtacacattt ttagtatatg cctccccagt
tggcaactat 600ttcaaggaag gtacggcacc gttaaactta tacgttaaca acgagtttca
tcgtacaact 660cgtggtggag cgggaggggt caaaaccatt acaaattatg ccccggtatt
gaaaccgtta 720ttaagagcaa aggaacaagg gttctcagat gtagtgtacc ttgattcggt
ccataaaaag 780tacatcgagg aagttagttc ttgtaatatt ttcattgtta agggtgatgt
tatttcaacg 840ccttcaacgg taggtactat cctcgaagga atccccagaa agagcatcat
tgatattgca 900cgtgccttag gatacaaggt tgaagaacgt ttagttgcag tagatgaatt
gatggaagcc 960gacgaagttt tcactactgg aaccgcggtt actgttgcca ctgttggtag
cattacatac 1020aatggtcgaa gagtggcgta tagaacaggt gatgggttgg tgagtcagaa
tttattcaaa 1080aggctagtag gaattcaagt tgggaaagtt gaagacaaat acacctggat
agttgatatt 1140taa
1143
168 380 PRT Helianthus annuus
168 Met Met Ile Arg Gln Ser Pro
Phe Leu Leu Gly Leu Ile Gln Thr Ser 1 5
10 15 Ile Ser Lys Ile Ser Ala Arg Cys Leu Thr Ala
Gln Ala Ala Ser Ala 20 25
30 Leu Gln Glu Asn Glu Pro Met Ile Arg His Glu Glu Tyr Ala Ala
Asp 35 40 45 Ile
Asp Trp Asn Asn Leu Gly Phe Gly Ile Lys Gln Thr Asp Tyr Met 50
55 60 Tyr Lys Ser Lys Cys Thr
Lys Asn Asn Thr Phe Glu Gln Gly Gln Leu 65 70
75 80 Val Asn Tyr Gly Asn Leu Glu Leu Ser Pro Ala
Ala Gly Val Leu Asn 85 90
95 Tyr Gly Gln Gly Leu Phe Glu Gly Thr Lys Ala Val Arg Gly Glu Asp
100 105 110 Gly Arg
Leu Leu Leu Phe Arg Pro Asp Gln Asn Ala Ile Arg Met Gln 115
120 125 Ile Gly Ala Glu Arg Met Cys
Met Gln Ser Pro Ser Ile Glu Gln Val 130 135
140 Val Asp Ala Val Lys Gln Thr Ala Leu Ala Asn Lys
Arg Trp Ile Pro 145 150 155
160 Pro Pro Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Ile Gly Thr
165 170 175 Gly Pro Ile
Leu Gly Leu Ser Pro Ala His Glu Tyr Thr Phe Leu Val 180
185 190 Tyr Ala Ser Pro Val Gly Asn Tyr
Phe Lys Glu Gly Thr Ala Pro Leu 195 200
205 Asn Leu Tyr Val Asn Asn Glu Phe His Arg Thr Thr Arg
Gly Gly Ala 210 215 220
Gly Gly Val Lys Thr Ile Thr Asn Tyr Ala Pro Val Leu Lys Pro Leu 225
230 235 240 Leu Arg Ala Lys
Glu Gln Gly Phe Ser Asp Val Val Tyr Leu Asp Ser 245
250 255 Val His Lys Lys Tyr Ile Glu Glu Val
Ser Ser Cys Asn Ile Phe Ile 260 265
270 Val Lys Gly Asp Val Ile Ser Thr Pro Ser Thr Val Gly Thr
Ile Leu 275 280 285
Glu Gly Ile Pro Arg Lys Ser Ile Ile Asp Ile Ala Arg Ala Leu Gly 290
295 300 Tyr Lys Val Glu Glu
Arg Leu Val Ala Val Asp Glu Leu Met Glu Ala 305 310
315 320 Asp Glu Val Phe Thr Thr Gly Thr Ala Val
Thr Val Ala Thr Val Gly 325 330
335 Ser Ile Thr Tyr Asn Gly Arg Arg Val Ala Tyr Arg Thr Gly Asp
Gly 340 345 350 Leu
Val Ser Gln Asn Leu Phe Lys Arg Leu Val Gly Ile Gln Val Gly 355
360 365 Lys Val Glu Asp Lys Tyr
Thr Trp Ile Val Asp Ile 370 375 380
169 1029 DNA Hordeum vulgare
169 atggattgcg gcacggcctc gcacggcgcc ctactcgccg
ccgcgccgct cgccggccgg 60cggccccggc tgctgcccct ctcgccgccg ccatcgacgc
cgtccattca gattcagaat 120cgactttatt cgatgtcact gcttccgctt cgaaaggctc
gtggcatggg aagatgcgag 180gcttctctag caagtaacta cacgcagaca tcagagtttg
ctgatttgga ttgggagaac 240cttggttttg gacttgtgca aactgactat atgtatactg
caaaatgtgg gccggatggg 300aactttgaca agggtggaat ggtgccgttt gggccgatag
aaatgaaccc agcatccgga 360gtcctgaatt atggacaggg attgttcgag ggcctaaagg
cgtataggaa aaccgatgga 420tccatcctgt tgtttcgccc aatggaaaat gcaatgcgga
tgcaaactgg tgctgagagg 480atgtgcatgc ctgcacctcc tgtcgagcaa tttgtgaacg
cagtaaaaca aaccgtttta 540gcaaacaaga gatgggtgcc tcctacgggt aaaggttctt
tgtatattag gccactactc 600gtgggaagtg gagctgttct tggtctcgca cctgctcctg
agtacacatt cattattttt 660gcctcccctg ttgggaacta ctttaaggaa ggattagccc
caataaattt gatagttgaa 720gacaagtttc atcgggccac ccctggtgga actggaggtg
ttaagaccat tgggaattat 780gcctcggtct tgatggcaca gaagattgca aaggagaaag
gttattctga tgttctctac 840ttggatgctg ttgagaaaaa gtaccttgaa gaagtatctt
cgtgtaatat ttttgttgtg 900aagggcaatg ttatttcaac tccagcaata aaaggaacaa
tactaccggg catcacaagg 960aaaagtataa ttgatgttgc tctgagtaaa ggcttccagg
ttgaggagcg gcctcgtgtc 1020cgtggatga
1029
170 342 PRT Hordeum vulgare
170 Met Asp Cys Gly Thr
Ala Ser His Gly Ala Leu Leu Ala Ala Ala Pro 1 5
10 15 Leu Ala Gly Arg Arg Pro Arg Leu Leu Pro
Leu Ser Pro Pro Pro Ser 20 25
30 Thr Pro Ser Ile Gln Ile Gln Asn Arg Leu Tyr Ser Met Ser Leu
Leu 35 40 45 Pro
Leu Arg Lys Ala Arg Gly Met Gly Arg Cys Glu Ala Ser Leu Ala 50
55 60 Ser Asn Tyr Thr Gln Thr
Ser Glu Phe Ala Asp Leu Asp Trp Glu Asn 65 70
75 80 Leu Gly Phe Gly Leu Val Gln Thr Asp Tyr Met
Tyr Thr Ala Lys Cys 85 90
95 Gly Pro Asp Gly Asn Phe Asp Lys Gly Gly Met Val Pro Phe Gly Pro
100 105 110 Ile Glu
Met Asn Pro Ala Ser Gly Val Leu Asn Tyr Gly Gln Gly Leu 115
120 125 Phe Glu Gly Leu Lys Ala Tyr
Arg Lys Thr Asp Gly Ser Ile Leu Leu 130 135
140 Phe Arg Pro Met Glu Asn Ala Met Arg Met Gln Thr
Gly Ala Glu Arg 145 150 155
160 Met Cys Met Pro Ala Pro Pro Val Glu Gln Phe Val Asn Ala Val Lys
165 170 175 Gln Thr Val
Leu Ala Asn Lys Arg Trp Val Pro Pro Thr Gly Lys Gly 180
185 190 Ser Leu Tyr Ile Arg Pro Leu Leu
Val Gly Ser Gly Ala Val Leu Gly 195 200
205 Leu Ala Pro Ala Pro Glu Tyr Thr Phe Ile Ile Phe Ala
Ser Pro Val 210 215 220
Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn Leu Ile Val Glu 225
230 235 240 Asp Lys Phe His
Arg Ala Thr Pro Gly Gly Thr Gly Gly Val Lys Thr 245
250 255 Ile Gly Asn Tyr Ala Ser Val Leu Met
Ala Gln Lys Ile Ala Lys Glu 260 265
270 Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Ala Val Glu Lys
Lys Tyr 275 280 285
Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val Lys Gly Asn Val 290
295 300 Ile Ser Thr Pro Ala
Ile Lys Gly Thr Ile Leu Pro Gly Ile Thr Arg 305 310
315 320 Lys Ser Ile Ile Asp Val Ala Leu Ser Lys
Gly Phe Gln Val Glu Glu 325 330
335 Arg Pro Arg Val Arg Gly 340
171 1194 DNA Hordeum vulgare
171 atggctgtgc tgtcttctgc gaagcgcgtc ctcccgtgcg
cctcggccgg cggggtcagc 60ggcggcctcc gagctctact cgggacggac ggaggcggcc
gctctcttct cccgtcccgg 120tggaagtcgt cgctgccgca gctggacccc gtcgacaggt
ccgacgagga gagcggcggc 180gacatcgact gggacaacct cggcttcggg ctcaccccga
cggactacat gtacgtcatg 240cggtgctcgc gggaggaggg cggcttctcc cgcggcgagc
tcgcccgcta cggcaacatc 300gagctcagcc cctcctccgg cgtcctcaac tacggccagg
ggctgttcga ggggctgaag 360gcgtacaggc ggtcggacgg ggccgggtac atgctgttcc
ggccggagga gaacgcgcgg 420cggatgcagc acggcgcggg ccgcatgtgc atgccgtccc
cgtctgtcga gcagttcgtg 480cacgccgtca agcagaccgt cctcgccaac aggcgctggg
tgccgccgca aggcaaggga 540gcgctgtaca tcaggccgct gctcatcggg agcggggcga
tcctcgggct ggcgccggcc 600cccgagtaca ccttcatgat ctacgccgcg cctgtgggga
catatttcaa ggaaggcatg 660gcggcgataa acctgctggt cgaggaggag atccaccgcg
cgatgccggg cggcaccggc 720ggggtcaaga ccatctccaa ctacgcgccg gtgctcaagc
cgcagatgga cgcgaaaagc 780aaggggttcg cggacgtgct gtacctggac gcggtccaca
agaggtacgt cgaggaggcc 840tcctcctgca acctcttcgt cgtcaagggc ggcgccgtcg
cgacgccggc gacgacggca 900gggaccatcc tgccgggtgt cacgcgcagg agcatcatcg
agctcgccag ggatgacggc 960taccaggtcg aagagcgcct cgtctccatc gacgatctcg
tcggcgcaga cgaagtgttc 1020tgcacgggaa cggccgtcgg cgtcaccccg gtgtcgacca
tcacctacca agggacaagg 1080cacgagttca ggactgggga agacacgttg tcgaggaaat
tgtacacgac tctcacatcg 1140atccagatgg ggctggcaga ggacaagaaa ggatggacgg
tagcgattga ttga 1194
172 397 PRT Hordeum vulgare
172 Met Ala Val Leu
Ser Ser Ala Lys Arg Val Leu Pro Cys Ala Ser Ala 1 5
10 15 Gly Gly Val Ser Gly Gly Leu Arg Ala
Leu Leu Gly Thr Asp Gly Gly 20 25
30 Gly Arg Ser Leu Leu Pro Ser Arg Trp Lys Ser Ser Leu Pro
Gln Leu 35 40 45
Asp Pro Val Asp Arg Ser Asp Glu Glu Ser Gly Gly Asp Ile Asp Trp 50
55 60 Asp Asn Leu Gly Phe
Gly Leu Thr Pro Thr Asp Tyr Met Tyr Val Met 65 70
75 80 Arg Cys Ser Arg Glu Glu Gly Gly Phe Ser
Arg Gly Glu Leu Ala Arg 85 90
95 Tyr Gly Asn Ile Glu Leu Ser Pro Ser Ser Gly Val Leu Asn Tyr
Gly 100 105 110 Gln
Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Arg Ser Asp Gly Ala 115
120 125 Gly Tyr Met Leu Phe Arg
Pro Glu Glu Asn Ala Arg Arg Met Gln His 130 135
140 Gly Ala Gly Arg Met Cys Met Pro Ser Pro Ser
Val Glu Gln Phe Val 145 150 155
160 His Ala Val Lys Gln Thr Val Leu Ala Asn Arg Arg Trp Val Pro Pro
165 170 175 Gln Gly
Lys Gly Ala Leu Tyr Ile Arg Pro Leu Leu Ile Gly Ser Gly 180
185 190 Ala Ile Leu Gly Leu Ala Pro
Ala Pro Glu Tyr Thr Phe Met Ile Tyr 195 200
205 Ala Ala Pro Val Gly Thr Tyr Phe Lys Glu Gly Met
Ala Ala Ile Asn 210 215 220
Leu Leu Val Glu Glu Glu Ile His Arg Ala Met Pro Gly Gly Thr Gly 225
230 235 240 Gly Val Lys
Thr Ile Ser Asn Tyr Ala Pro Val Leu Lys Pro Gln Met 245
250 255 Asp Ala Lys Ser Lys Gly Phe Ala
Asp Val Leu Tyr Leu Asp Ala Val 260 265
270 His Lys Arg Tyr Val Glu Glu Ala Ser Ser Cys Asn Leu
Phe Val Val 275 280 285
Lys Gly Gly Ala Val Ala Thr Pro Ala Thr Thr Ala Gly Thr Ile Leu 290
295 300 Pro Gly Val Thr
Arg Arg Ser Ile Ile Glu Leu Ala Arg Asp Asp Gly 305 310
315 320 Tyr Gln Val Glu Glu Arg Leu Val Ser
Ile Asp Asp Leu Val Gly Ala 325 330
335 Asp Glu Val Phe Cys Thr Gly Thr Ala Val Gly Val Thr Pro
Val Ser 340 345 350
Thr Ile Thr Tyr Gln Gly Thr Arg His Glu Phe Arg Thr Gly Glu Asp
355 360 365 Thr Leu Ser Arg
Lys Leu Tyr Thr Thr Leu Thr Ser Ile Gln Met Gly 370
375 380 Leu Ala Glu Asp Lys Lys Gly Trp
Thr Val Ala Ile Asp 385 390 395
173 1098 DNA Hordeum vulgare
173 atgccgactc tccatcacaa ggcccatacc acagtaggat
gccaggcttc tgtagcctct 60aaatacatgg aaacacctga gatagtcgat ttggactggg
aaaaccttgg ctttggcctt 120gtcaataccg actttatgta catggccaaa tgtgggccag
atgggaactt ttccaaagga 180gaaattctgc catttggacc catagcacta agcccgtctg
ctggagtctt aaattatgga 240cagggactgt ttgagggcct aaaagcatat aggaaaactg
atggttctgt cctattattc 300cgtccggagg agaatgccgt acggatgaag aatggttcag
ataggatgtg catgcctgca 360ccgactgttg agcagttcgt ggacgcagtg aaacaaaccg
ttttggcaaa taaaagatgg 420gtgcctccta ctggtaaagg ttccttgtat atcaggccac
tacttattgg aagcggggct 480attcttggtc ttgcacctgc tcctgagtac accttcctta
tttatgtctc acctgttgga 540aactatttca aggaaggttt agctcctatt aacttgatta
ttgaagataa ctttcaccgt 600gcggcccctg gtggaactgg aggcgtgaaa accattggaa
actatgcctc ggtgttgaaa 660gcacagagaa ccgcaaagga gaaaggatat tctgatgtcc
tctatttgga cgccgttcac 720aacaaatatc tggaagaagt ttcttcgtgc aatattttcg
ttgtgaaagg caatgctatt 780tgcactccag caatagaagg aacgatactg cctggtatca
caaggaaaag tatcatcgaa 840gtagccgaga gcaaaggcta caaggtggag gaacgccatg
tgtccgtaga cgaactgctt 900gacgctgacg aagttttctg cacgggaaca gctgttgtgg
tttcacccgt ggggagtatt 960acctataagg ggaaaagggt aaaatacgac ggcaaccaag
gagtcggtgt ggtgtcgcag 1020cagctctaca cctcgctgac gagcctccag atgggtcatg
cagaggaccc gatgggctgg 1080accgtgcaac tgaattaa
1098
174 365 PRT Hordeum vulgare
174 Met Pro Thr Leu His
His Lys Ala His Thr Thr Val Gly Cys Gln Ala 1 5
10 15 Ser Val Ala Ser Lys Tyr Met Glu Thr Pro
Glu Ile Val Asp Leu Asp 20 25
30 Trp Glu Asn Leu Gly Phe Gly Leu Val Asn Thr Asp Phe Met Tyr
Met 35 40 45 Ala
Lys Cys Gly Pro Asp Gly Asn Phe Ser Lys Gly Glu Ile Leu Pro 50
55 60 Phe Gly Pro Ile Ala Leu
Ser Pro Ser Ala Gly Val Leu Asn Tyr Gly 65 70
75 80 Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg
Lys Thr Asp Gly Ser 85 90
95 Val Leu Leu Phe Arg Pro Glu Glu Asn Ala Val Arg Met Lys Asn Gly
100 105 110 Ser Asp
Arg Met Cys Met Pro Ala Pro Thr Val Glu Gln Phe Val Asp 115
120 125 Ala Val Lys Gln Thr Val Leu
Ala Asn Lys Arg Trp Val Pro Pro Thr 130 135
140 Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Ile
Gly Ser Gly Ala 145 150 155
160 Ile Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Val
165 170 175 Ser Pro Val
Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn Leu 180
185 190 Ile Ile Glu Asp Asn Phe His Arg
Ala Ala Pro Gly Gly Thr Gly Gly 195 200
205 Val Lys Thr Ile Gly Asn Tyr Ala Ser Val Leu Lys Ala
Gln Arg Thr 210 215 220
Ala Lys Glu Lys Gly Tyr Ser Asp Val Leu Tyr Leu Asp Ala Val His 225
230 235 240 Asn Lys Tyr Leu
Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val Lys 245
250 255 Gly Asn Ala Ile Cys Thr Pro Ala Ile
Glu Gly Thr Ile Leu Pro Gly 260 265
270 Ile Thr Arg Lys Ser Ile Ile Glu Val Ala Glu Ser Lys Gly
Tyr Lys 275 280 285
Val Glu Glu Arg His Val Ser Val Asp Glu Leu Leu Asp Ala Asp Glu 290
295 300 Val Phe Cys Thr Gly
Thr Ala Val Val Val Ser Pro Val Gly Ser Ile 305 310
315 320 Thr Tyr Lys Gly Lys Arg Val Lys Tyr Asp
Gly Asn Gln Gly Val Gly 325 330
335 Val Val Ser Gln Gln Leu Tyr Thr Ser Leu Thr Ser Leu Gln Met
Gly 340 345 350 His
Ala Glu Asp Pro Met Gly Trp Thr Val Gln Leu Asn 355
360 365
175 1098 DNA Hordeum vulgare
175 atgccgactc
tccatcacaa ggcccatacc acagtaggat gccaggcttc tgtagcctct 60aaatacatgg
aaacacctga gatagtcgat ttggactggg aaaaccttgg ctttggcctt 120gtcaataccg
actttatgta catggccaaa tgtgggccag atgggaactt ttccaaagga 180gaaattctgc
catttggacc catagcacta agcccgtctg ctggagtctt aaattatgga 240cagggactgt
ttgagggcct aaaagcatat aggaaaactg atggttctgt cctattattc 300cgtccggagg
agaatgccgt acggatgaag aatggttcag ataggatgtg catgcctgca 360ccgactgttg
agcagttcgt ggacgcagtg aaacaaaccg ttttggcaaa taaaagatgg 420gtgcctccta
ctggtaaagg ttccttgtat atcaggccac tacttattgg aagcggggct 480attcttggtc
ttgcacctgc tcctgagtac accttcctta tttatgtctc acctgttgga 540aactatttca
aggaaggttt agctcctatt aacttgatta ttgaagataa ctttcaccgt 600gcggcccctg
gtggaactgg aggcgtgaaa accattggaa actatgcctc ggtgttgaaa 660gcacagagaa
ccgcaaagga gaaaggatat tctgatgtcc tctatttgga cgccgttcac 720aacaaatatc
tggaagaagt ttcttcgtgc aatattttcg ttgtgaaagg caatgctatt 780tgcactccag
caatagaagg aacgatactg cctggtatca caaggaaaag tatcatcgaa 840gtagccgaga
gcaaaggcta caaggtggag gaacgccatg tgtccgtaga cgaactgctt 900gacgctgacg
aagttttctg cacgggaaca gctgttgtgg tttcacccgt ggggagtatt 960acctataagg
ggaaaagggt aaaatacgac ggcaaccaag gagtcggtgt ggtgtcgcag 1020cagctctaca
cctcgctgac gagcctccag atgggtcatg cagagggccc gatgggctgg 1080accgtgcaac
tgaattaa
1098
176 365 PRT Hordeum vulgare
176 Met Pro Thr Leu His His Lys Ala His Thr
Thr Val Gly Cys Gln Ala 1 5 10
15 Ser Val Ala Ser Lys Tyr Met Glu Thr Pro Glu Ile Val Asp Leu
Asp 20 25 30 Trp
Glu Asn Leu Gly Phe Gly Leu Val Asn Thr Asp Phe Met Tyr Met 35
40 45 Ala Lys Cys Gly Pro Asp
Gly Asn Phe Ser Lys Gly Glu Ile Leu Pro 50 55
60 Phe Gly Pro Ile Ala Leu Ser Pro Ser Ala Gly
Val Leu Asn Tyr Gly 65 70 75
80 Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Lys Thr Asp Gly Ser
85 90 95 Val Leu
Leu Phe Arg Pro Glu Glu Asn Ala Val Arg Met Lys Asn Gly 100
105 110 Ser Asp Arg Met Cys Met Pro
Ala Pro Thr Val Glu Gln Phe Val Asp 115 120
125 Ala Val Lys Gln Thr Val Leu Ala Asn Lys Arg Trp
Val Pro Pro Thr 130 135 140
Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu Ile Gly Ser Gly Ala 145
150 155 160 Ile Leu Gly
Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Val 165
170 175 Ser Pro Val Gly Asn Tyr Phe Lys
Glu Gly Leu Ala Pro Ile Asn Leu 180 185
190 Ile Ile Glu Asp Asn Phe His Arg Ala Ala Pro Gly Gly
Thr Gly Gly 195 200 205
Val Lys Thr Ile Gly Asn Tyr Ala Ser Val Leu Lys Ala Gln Arg Thr 210
215 220 Ala Lys Glu Lys
Gly Tyr Ser Asp Val Leu Tyr Leu Asp Ala Val His 225 230
235 240 Asn Lys Tyr Leu Glu Glu Val Ser Ser
Cys Asn Ile Phe Val Val Lys 245 250
255 Gly Asn Ala Ile Cys Thr Pro Ala Ile Glu Gly Thr Ile Leu
Pro Gly 260 265 270
Ile Thr Arg Lys Ser Ile Ile Glu Val Ala Glu Ser Lys Gly Tyr Lys
275 280 285 Val Glu Glu Arg
His Val Ser Val Asp Glu Leu Leu Asp Ala Asp Glu 290
295 300 Val Phe Cys Thr Gly Thr Ala Val
Val Val Ser Pro Val Gly Ser Ile 305 310
315 320 Thr Tyr Lys Gly Lys Arg Val Lys Tyr Asp Gly Asn
Gln Gly Val Gly 325 330
335 Val Val Ser Gln Gln Leu Tyr Thr Ser Leu Thr Ser Leu Gln Met Gly
340 345 350 His Ala Glu
Gly Pro Met Gly Trp Thr Val Gln Leu Asn 355 360
365
177 1074 DNA Medicago truncatula
177 atggctcctc cttctatttt
aagggacact gaagatggtt ctgaaagtga tatgggtgaa 60aattatgctg acatcaattg
ggaaggactt agttttagtc tgactcaaac agattacatg 120catgtcatga aatgcacaaa
aggagaaaag ttttctcaag gatccctcat tcgctacgga 180aacattgaga taagcccggc
tgctggtatc ataaactatg gacagggaat cttcgaggga 240ctaaaagcat atagaacaga
agatgggcga atccttcttt tccgaccgga ggagaatgct 300ctacgcatga agatgggggc
tgataggttg tgtatgccgt caccatcggt tgagcagttt 360gttgatgctg ttaagcaaac
agttcttgcc aataaacgtt gggtacctcc tccagggaaa 420gggacgcttt atcttaggcc
tttgctgatg ggaacaggag ctgcattagg cctggctcca 480tcacctgagt acacatttct
catttattgc tcccctgttg gaaagtatca cgagggagga 540agactaaact taaaagtgga
ggataaattt catcgatcaa tagctggcag cggtggaaca 600ggaggaatca agagtgttac
taattatgcc ccaatatata ctgcagtaac tgaagcaaaa 660gccaatggat tttctgatgt
cttgttcttg gattcagcaa ctggtaaaaa tattgaggag 720gctactgcgt gcaatatatt
tgttgtgaag gaaaatgata tcttcactcc ggcaatagat 780ggatctattc tgcctggggt
cacacgaaaa tccatcatag acattgccat tgatttgggt 840tataaggtca tagaacgttc
catatcagtg gaggaaatga tgagcgctga tgaagtgttc 900tgcacaggaa ctgcagtggt
tgttacctct gttgcatctg taacatataa ggaaacaaga 960gctgaatata aaacaggcgc
agaaacgttg tctcaaaaac tacaaggaat actggttgga 1020atacaaacag ggtgtattga
ggacaaaaag tcatggacag tccaagtaga ttga 1074
178 357 PRT Medicago truncatula
178 Met Ala Pro Pro Ser Ile Leu Arg Asp Thr Glu Asp Gly Ser Glu
Ser 1 5 10 15 Asp
Met Gly Glu Asn Tyr Ala Asp Ile Asn Trp Glu Gly Leu Ser Phe
20 25 30 Ser Leu Thr Gln Thr
Asp Tyr Met His Val Met Lys Cys Thr Lys Gly 35
40 45 Glu Lys Phe Ser Gln Gly Ser Leu Ile
Arg Tyr Gly Asn Ile Glu Ile 50 55
60 Ser Pro Ala Ala Gly Ile Ile Asn Tyr Gly Gln Gly Ile
Phe Glu Gly 65 70 75
80 Leu Lys Ala Tyr Arg Thr Glu Asp Gly Arg Ile Leu Leu Phe Arg Pro
85 90 95 Glu Glu Asn Ala
Leu Arg Met Lys Met Gly Ala Asp Arg Leu Cys Met 100
105 110 Pro Ser Pro Ser Val Glu Gln Phe Val
Asp Ala Val Lys Gln Thr Val 115 120
125 Leu Ala Asn Lys Arg Trp Val Pro Pro Pro Gly Lys Gly Thr
Leu Tyr 130 135 140
Leu Arg Pro Leu Leu Met Gly Thr Gly Ala Ala Leu Gly Leu Ala Pro 145
150 155 160 Ser Pro Glu Tyr Thr
Phe Leu Ile Tyr Cys Ser Pro Val Gly Lys Tyr 165
170 175 His Glu Gly Gly Arg Leu Asn Leu Lys Val
Glu Asp Lys Phe His Arg 180 185
190 Ser Ile Ala Gly Ser Gly Gly Thr Gly Gly Ile Lys Ser Val Thr
Asn 195 200 205 Tyr
Ala Pro Ile Tyr Thr Ala Val Thr Glu Ala Lys Ala Asn Gly Phe 210
215 220 Ser Asp Val Leu Phe Leu
Asp Ser Ala Thr Gly Lys Asn Ile Glu Glu 225 230
235 240 Ala Thr Ala Cys Asn Ile Phe Val Val Lys Glu
Asn Asp Ile Phe Thr 245 250
255 Pro Ala Ile Asp Gly Ser Ile Leu Pro Gly Val Thr Arg Lys Ser Ile
260 265 270 Ile Asp
Ile Ala Ile Asp Leu Gly Tyr Lys Val Ile Glu Arg Ser Ile 275
280 285 Ser Val Glu Glu Met Met Ser
Ala Asp Glu Val Phe Cys Thr Gly Thr 290 295
300 Ala Val Val Val Thr Ser Val Ala Ser Val Thr Tyr
Lys Glu Thr Arg 305 310 315
320 Ala Glu Tyr Lys Thr Gly Ala Glu Thr Leu Ser Gln Lys Leu Gln Gly
325 330 335 Ile Leu Val
Gly Ile Gln Thr Gly Cys Ile Glu Asp Lys Lys Ser Trp 340
345 350 Thr Val Gln Val Asp 355
179 1077 DNA Medicago truncatula
179 atggcaacat cccatcaact acccaacaat
ggtaaagctt ccaacaggga gactgaaaaa 60atatatgcca atatggattg ggacaaactt
acatgtggag tgattccaac tgattatatg 120tacataatta aatccaatga agaccgaacc
tattcaaacg gtactctcgt gccttttgga 180accattgata tcaacccaca ttctgctgtt
ataaattatg gacagggatt atttgagggc 240atgaaggctt acagaacaaa agacggcaat
gtgcaactat tccgaccgga agaaaatgcg 300ctgcgcatgc agatgggagc agagaggctg
ctgatgccat caccttctgt tgagcagtac 360attgatgctg taaaacaagt tgttcatgca
aataaacgtt gggtgcctcc ttggggaaaa 420ggaacattgt acattaggcc tttactattt
ggaagcggac ctgttctggg tattggacca 480gcacctcaat gcaccctctt aatattcact
aatccaatta gcaacattta caagggacaa 540acatcagcct tgaatttgtt gattaatgaa
aactttcctc gtgcatatcc tggtggaact 600ggtggagtaa aaagtattag taattatcca
cttgttttcc aagttgtaaa agaagcaaaa 660gccaaaggat tttccgatgt gctttttcta
gatgcagtgg aacataaata cattgaagag 720gtatcttcgt gtaatgcttt cattgtgaag
ggtaaggttc tttcaactgc acctacactt 780ggaactattc ttcctggagt cacaaggaaa
agtgtcattg aacttgcacg tgatttgggt 840tacgaggtga tggaacgcaa ggtctcggta
gaagaactgc ttgaagctga tgaggttttc 900tgcactggaa ctgctgttgg gatttctgct
gttggaagtg taacatacaa gaataaaagg 960tgtgttacgt tcaaaacagg ggcagatact
gtgactaaga agttgtatga tttgattacg 1020ggcatccaga caggtctctt ggaagataag
aaaggatggg tggtcaagat tgattga 1077
180 358 PRT Medicago truncatula
180 Met Ala Thr Ser His Gln Leu Pro Asn Asn Gly Lys Ala Ser Asn Arg 1
5 10 15 Glu Thr Glu Lys
Ile Tyr Ala Asn Met Asp Trp Asp Lys Leu Thr Cys 20
25 30 Gly Val Ile Pro Thr Asp Tyr Met Tyr
Ile Ile Lys Ser Asn Glu Asp 35 40
45 Arg Thr Tyr Ser Asn Gly Thr Leu Val Pro Phe Gly Thr Ile
Asp Ile 50 55 60
Asn Pro His Ser Ala Val Ile Asn Tyr Gly Gln Gly Leu Phe Glu Gly 65
70 75 80 Met Lys Ala Tyr Arg
Thr Lys Asp Gly Asn Val Gln Leu Phe Arg Pro 85
90 95 Glu Glu Asn Ala Leu Arg Met Gln Met Gly
Ala Glu Arg Leu Leu Met 100 105
110 Pro Ser Pro Ser Val Glu Gln Tyr Ile Asp Ala Val Lys Gln Val
Val 115 120 125 His
Ala Asn Lys Arg Trp Val Pro Pro Trp Gly Lys Gly Thr Leu Tyr 130
135 140 Ile Arg Pro Leu Leu Phe
Gly Ser Gly Pro Val Leu Gly Ile Gly Pro 145 150
155 160 Ala Pro Gln Cys Thr Leu Leu Ile Phe Thr Asn
Pro Ile Ser Asn Ile 165 170
175 Tyr Lys Gly Gln Thr Ser Ala Leu Asn Leu Leu Ile Asn Glu Asn Phe
180 185 190 Pro Arg
Ala Tyr Pro Gly Gly Thr Gly Gly Val Lys Ser Ile Ser Asn 195
200 205 Tyr Pro Leu Val Phe Gln Val
Val Lys Glu Ala Lys Ala Lys Gly Phe 210 215
220 Ser Asp Val Leu Phe Leu Asp Ala Val Glu His Lys
Tyr Ile Glu Glu 225 230 235
240 Val Ser Ser Cys Asn Ala Phe Ile Val Lys Gly Lys Val Leu Ser Thr
245 250 255 Ala Pro Thr
Leu Gly Thr Ile Leu Pro Gly Val Thr Arg Lys Ser Val 260
265 270 Ile Glu Leu Ala Arg Asp Leu Gly
Tyr Glu Val Met Glu Arg Lys Val 275 280
285 Ser Val Glu Glu Leu Leu Glu Ala Asp Glu Val Phe Cys
Thr Gly Thr 290 295 300
Ala Val Gly Ile Ser Ala Val Gly Ser Val Thr Tyr Lys Asn Lys Arg 305
310 315 320 Cys Val Thr Phe
Lys Thr Gly Ala Asp Thr Val Thr Lys Lys Leu Tyr 325
330 335 Asp Leu Ile Thr Gly Ile Gln Thr Gly
Leu Leu Glu Asp Lys Lys Gly 340 345
350 Trp Val Val Lys Ile Asp 355
181 1227 DNA Medicago truncatula
181 atggagagca gcgccgcact aactagcatt
cgactcactt cctcgatccg tccttcccgt 60ttttcttccc cttttctttc ccccgcattt
cctcccaaac ccacttctct atccctcaag 120ctccaaaagc agtttccttt cacttcccag
aatgttctcc aagcttctaa tgctctcaga 180ccttctgctt ctgtttctgc tagtgaggcg
attgagttgg cagacataga ttgggacaac 240ctaggatttg gtcttcagcc tactgattat
atgtatttca tgaaatgtga tcaaggtgga 300accttttcta agggtgaatt aaagcgtttt
gggaacattg aattgaaccc ttctgctggt 360gttttaaact atggacaggg attatttgag
ggtttgaaag cttaccgtaa agatgatggg 420aacatactcc tctttcgtcc ggaagaaaac
gctttacgga tgaagacggg tgcagagcga 480atgtgcatgc catcacctag tgtagaacag
ttcgtggaag ctgtgaaaga tactgtttta 540gcaaacaaac gttggatccc ccctcagggt
aaaggttcat tgtatattag acctttgcta 600atgggaagtg gagctgtact tgggcttgca
cctgctccag agtacacctt tctaatatat 660gtttcacctg ttgggaacta cttcaaggaa
ggtttggctc caatcaattt gattgtggag 720agtgaactac atcgtgcaac tcccggtggc
actggaggtg tgaagaccat tggaaactat 780gctgcagttc ttaaggcaca gtctgcagcc
aaggcgaaag gctactctga tgttttgtac 840cttgactgcg tgcacaaaag atatttggag
gaggtttctt cctgcaatat atttgttgtt 900aagggtaatg ttatttcaac tccatccatc
aaagggacta tcctgcctgg cattactcga 960aagagtataa ttgacgttgc tcgaagccaa
ggattcgagg ttgaggagcg attagtggca 1020gtggacgaat tgctcgaggc agacgaggtc
ttctgcacag gaacagctgt ggttgtatca 1080cctgttggca gtattacata tcttggcgag
aagaaatctt atggagatgg tgttggagca 1140gtttcacagc aactttatac tggccttacc
agactacaga tgggtcttgc agaggataac 1200atgaattgga ctgttgagct gagataa
1227
182 408 PRT Medicago truncatula
182 Met
Glu Ser Ser Ala Ala Leu Thr Ser Ile Arg Leu Thr Ser Ser Ile 1
5 10 15 Arg Pro Ser Arg Phe Ser
Ser Pro Phe Leu Ser Pro Ala Phe Pro Pro 20
25 30 Lys Pro Thr Ser Leu Ser Leu Lys Leu Gln
Lys Gln Phe Pro Phe Thr 35 40
45 Ser Gln Asn Val Leu Gln Ala Ser Asn Ala Leu Arg Pro Ser
Ala Ser 50 55 60
Val Ser Ala Ser Glu Ala Ile Glu Leu Ala Asp Ile Asp Trp Asp Asn 65
70 75 80 Leu Gly Phe Gly Leu
Gln Pro Thr Asp Tyr Met Tyr Phe Met Lys Cys 85
90 95 Asp Gln Gly Gly Thr Phe Ser Lys Gly Glu
Leu Lys Arg Phe Gly Asn 100 105
110 Ile Glu Leu Asn Pro Ser Ala Gly Val Leu Asn Tyr Gly Gln Gly
Leu 115 120 125 Phe
Glu Gly Leu Lys Ala Tyr Arg Lys Asp Asp Gly Asn Ile Leu Leu 130
135 140 Phe Arg Pro Glu Glu Asn
Ala Leu Arg Met Lys Thr Gly Ala Glu Arg 145 150
155 160 Met Cys Met Pro Ser Pro Ser Val Glu Gln Phe
Val Glu Ala Val Lys 165 170
175 Asp Thr Val Leu Ala Asn Lys Arg Trp Ile Pro Pro Gln Gly Lys Gly
180 185 190 Ser Leu
Tyr Ile Arg Pro Leu Leu Met Gly Ser Gly Ala Val Leu Gly 195
200 205 Leu Ala Pro Ala Pro Glu Tyr
Thr Phe Leu Ile Tyr Val Ser Pro Val 210 215
220 Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn
Leu Ile Val Glu 225 230 235
240 Ser Glu Leu His Arg Ala Thr Pro Gly Gly Thr Gly Gly Val Lys Thr
245 250 255 Ile Gly Asn
Tyr Ala Ala Val Leu Lys Ala Gln Ser Ala Ala Lys Ala 260
265 270 Lys Gly Tyr Ser Asp Val Leu Tyr
Leu Asp Cys Val His Lys Arg Tyr 275 280
285 Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Val Val Lys
Gly Asn Val 290 295 300
Ile Ser Thr Pro Ser Ile Lys Gly Thr Ile Leu Pro Gly Ile Thr Arg 305
310 315 320 Lys Ser Ile Ile
Asp Val Ala Arg Ser Gln Gly Phe Glu Val Glu Glu 325
330 335 Arg Leu Val Ala Val Asp Glu Leu Leu
Glu Ala Asp Glu Val Phe Cys 340 345
350 Thr Gly Thr Ala Val Val Val Ser Pro Val Gly Ser Ile Thr
Tyr Leu 355 360 365
Gly Glu Lys Lys Ser Tyr Gly Asp Gly Val Gly Ala Val Ser Gln Gln 370
375 380 Leu Tyr Thr Gly Leu
Thr Arg Leu Gln Met Gly Leu Ala Glu Asp Asn 385 390
395 400 Met Asn Trp Thr Val Glu Leu Arg
405
183 1251 DNA Oryza sativa
183 atggctgctg ctgctgctgc
tgcgtcgtcc gcgaagcgcg cgctcctccc gtgggcacgc 60gacgcccacc acgcgctggc
cagggccctg cagggatgcg gcggcggcgg cggcctcggt 120ctccgcgggg cgctcccgac
ggccggaggc aggtggtctc tgctccagtg ccggtggagg 180tcgtcgctgc cgcagctcga
ctccgccgac aggtccgatg aggaaagcgg cggcgaaatc 240gactgggaca acctggggtt
cgggctgacg ccgaccgact acatgtacgt catgcggtgc 300tcgctggagg acggcgtctt
ctcccgcggc gagctcagcc gctacggcaa catcgagctc 360agcccctcct ccggcgtcat
caactacggc caggggctct tcgagggtct gaaggcgtac 420agggcggcga accaacaggg
gtcgtacatg ctgttccggc cggaggagaa cgcgcggcgg 480atgcagcacg gcgccgagcg
catgtgcatg ccgtcgccgt cggtggagca gttcgtccac 540gccgtcaagc agaccgtcct
cgccaaccgc cgctgggtgc caccgcaagg aaagggggcg 600ctgtacatca ggccgctgct
catcgggagc ggaccgattc tcgggctggc tcccgccccg 660gagtacacgt tcctcatcta
cgccgcaccg gttggaacgt acttcaagga gggtctagcg 720ccgataaacc ttgtcgtaga
ggactcgata caccgggcca tgccgggcgg caccggcggg 780gtcaagacga tcaccaacta
cgcgccggtg ctcaaggcgc agatggacgc caagagcaga 840gggttcactg acgtgctgta
cctcgacgcg gtgcacaaga cgtacctgga ggaggcctcc 900tcctgcaacc tcttcatcgt
caaggacggc gtcgtcgcca cgccggccac cgtgggaacc 960atcctgccgg ggatcacgcg
caagagcgtc atcgagctcg ccagggaccg cggctatcag 1020caggttgaag aacggctcgt
ctccatcgac gatctggtcg gcgcagacga ggtgttctgc 1080accggaacag cggtggtcgt
tgccccagta tcgagtgtta cttaccatgg gcaaaggtac 1140gagttcagga ctggacatga
cacgttatcg cagacactgc acacgactct gacgtccatc 1200cagatgggcc tggctgagga
caagaaagga tggacagtgg caatagatta a 1251
184 416 PRT Oryza sativa
184 Met Ala Ala Ala Ala Ala Ala Ala Ser Ser Ala Lys Arg Ala Leu Leu 1
5 10 15 Pro Trp Ala Arg
Asp Ala His His Ala Leu Ala Arg Ala Leu Gln Gly 20
25 30 Cys Gly Gly Gly Gly Gly Leu Gly Leu
Arg Gly Ala Leu Pro Thr Ala 35 40
45 Gly Gly Arg Trp Ser Leu Leu Gln Cys Arg Trp Arg Ser Ser
Leu Pro 50 55 60
Gln Leu Asp Ser Ala Asp Arg Ser Asp Glu Glu Ser Gly Gly Glu Ile 65
70 75 80 Asp Trp Asp Asn Leu
Gly Phe Gly Leu Thr Pro Thr Asp Tyr Met Tyr 85
90 95 Val Met Arg Cys Ser Leu Glu Asp Gly Val
Phe Ser Arg Gly Glu Leu 100 105
110 Ser Arg Tyr Gly Asn Ile Glu Leu Ser Pro Ser Ser Gly Val Ile
Asn 115 120 125 Tyr
Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Ala Ala Asn 130
135 140 Gln Gln Gly Ser Tyr Met
Leu Phe Arg Pro Glu Glu Asn Ala Arg Arg 145 150
155 160 Met Gln His Gly Ala Glu Arg Met Cys Met Pro
Ser Pro Ser Val Glu 165 170
175 Gln Phe Val His Ala Val Lys Gln Thr Val Leu Ala Asn Arg Arg Trp
180 185 190 Val Pro
Pro Gln Gly Lys Gly Ala Leu Tyr Ile Arg Pro Leu Leu Ile 195
200 205 Gly Ser Gly Pro Ile Leu Gly
Leu Ala Pro Ala Pro Glu Tyr Thr Phe 210 215
220 Leu Ile Tyr Ala Ala Pro Val Gly Thr Tyr Phe Lys
Glu Gly Leu Ala 225 230 235
240 Pro Ile Asn Leu Val Val Glu Asp Ser Ile His Arg Ala Met Pro Gly
245 250 255 Gly Thr Gly
Gly Val Lys Thr Ile Thr Asn Tyr Ala Pro Val Leu Lys 260
265 270 Ala Gln Met Asp Ala Lys Ser Arg
Gly Phe Thr Asp Val Leu Tyr Leu 275 280
285 Asp Ala Val His Lys Thr Tyr Leu Glu Glu Ala Ser Ser
Cys Asn Leu 290 295 300
Phe Ile Val Lys Asp Gly Val Val Ala Thr Pro Ala Thr Val Gly Thr 305
310 315 320 Ile Leu Pro Gly
Ile Thr Arg Lys Ser Val Ile Glu Leu Ala Arg Asp 325
330 335 Arg Gly Tyr Gln Gln Val Glu Glu Arg
Leu Val Ser Ile Asp Asp Leu 340 345
350 Val Gly Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val
Val Ala 355 360 365
Pro Val Ser Ser Val Thr Tyr His Gly Gln Arg Tyr Glu Phe Arg Thr 370
375 380 Gly His Asp Thr Leu
Ser Gln Thr Leu His Thr Thr Leu Thr Ser Ile 385 390
395 400 Gln Met Gly Leu Ala Glu Asp Lys Lys Gly
Trp Thr Val Ala Ile Asp 405 410
415
185 1233 DNA Oryza sativa
185 atggagctcc tcccgcgtgt gggtgtggcc
gcccccggtc ccggacgcgg cggcgcgtcg 60ccgtccccga cgcgccgtca tcgcgcgccc
tctcacccca ttctgaagcg atcggcggcg 120gtttgcggcg cagtcgccgt ctgcagagga
ggggctgtcg ccaggaggag ccggtggtca 180actctggtga ccgcagcata ttacacagga
actgctgaac tggtcgactt taactgggaa 240actcttgggt ttcaacccgt gccgactgac
tttatgtatg tgatgagatg ttccgaggaa 300ggggtgttca ccaagggtga attggtgcca
tatgggccaa tagaactgaa cccagcagct 360ggagtgttga attatggtca gggtttactt
gaaggtctgc gagcacatag aaaagaggat 420ggatcagtcc ttctatttcg tcctgatgaa
aatgctttac ggatgagagt aggcgcagac 480cggttatgta tgcctgcacc aagtgtagag
cagttcctag aagctataaa gctaacaatt 540ttagcaaaca agcgctgggt accccctact
ggcaaaggtt ctttatatat cagaccgctg 600ctgattggaa gtggggctat cctcggtgtt
gcaccagccc cagagtacac atttgttgtc 660tttgcttgcc cagttgggca ctattttaag
gatggcttat ctccaatcag cttgttaacc 720gaggaagaat atcagtgtgc ggcaccaggt
ggaactggtg atataaagac tatcggaaat 780tatgcttcag ccgtttatgc taaagaaaga
gctaaggaga gaggtcattc tgatgttctt 840tacttggatc cagtgcataa aaagtttgtt
gaggaacttt cgtcctgtaa tatattcatg 900gtgaaggaca acattatttc tactccacta
ttaacgggaa cagttcttcc tggcatcaca 960agaagaagta taattgaata cgcccgtagc
cttggatttc aggttgaaga gtgtcttatt 1020acaatagatg agttgcttga cgctgatgaa
gttttctgta ctggaacttc tgtggtacta 1080tcctctgttg gttgcatagt gtaccagggg
agaagagtgg agtatgggaa ccagaagttc 1140agaactgtgt ctcagcaact ctattcagca
cttacggcta tccagaaagg cctcgtggag 1200gacagtatgg gatggactgt gcaactgaat
tag 1233
186 410 PRT Oryza sativa
186 Met Glu
Leu Leu Pro Arg Val Gly Val Ala Ala Pro Gly Pro Gly Arg 1 5
10 15 Gly Gly Ala Ser Pro Ser Pro
Thr Arg Arg His Arg Ala Pro Ser His 20 25
30 Pro Ile Leu Lys Arg Ser Ala Ala Val Cys Gly Ala
Val Ala Val Cys 35 40 45
Arg Gly Gly Ala Val Ala Arg Arg Ser Arg Trp Ser Thr Leu Val Thr
50 55 60 Ala Ala Tyr
Tyr Thr Gly Thr Ala Glu Leu Val Asp Phe Asn Trp Glu 65
70 75 80 Thr Leu Gly Phe Gln Pro Val
Pro Thr Asp Phe Met Tyr Val Met Arg 85
90 95 Cys Ser Glu Glu Gly Val Phe Thr Lys Gly Glu
Leu Val Pro Tyr Gly 100 105
110 Pro Ile Glu Leu Asn Pro Ala Ala Gly Val Leu Asn Tyr Gly Gln
Gly 115 120 125 Leu
Leu Glu Gly Leu Arg Ala His Arg Lys Glu Asp Gly Ser Val Leu 130
135 140 Leu Phe Arg Pro Asp Glu
Asn Ala Leu Arg Met Arg Val Gly Ala Asp 145 150
155 160 Arg Leu Cys Met Pro Ala Pro Ser Val Glu Gln
Phe Leu Glu Ala Ile 165 170
175 Lys Leu Thr Ile Leu Ala Asn Lys Arg Trp Val Pro Pro Thr Gly Lys
180 185 190 Gly Ser
Leu Tyr Ile Arg Pro Leu Leu Ile Gly Ser Gly Ala Ile Leu 195
200 205 Gly Val Ala Pro Ala Pro Glu
Tyr Thr Phe Val Val Phe Ala Cys Pro 210 215
220 Val Gly His Tyr Phe Lys Asp Gly Leu Ser Pro Ile
Ser Leu Leu Thr 225 230 235
240 Glu Glu Glu Tyr Gln Cys Ala Ala Pro Gly Gly Thr Gly Asp Ile Lys
245 250 255 Thr Ile Gly
Asn Tyr Ala Ser Ala Val Tyr Ala Lys Glu Arg Ala Lys 260
265 270 Glu Arg Gly His Ser Asp Val Leu
Tyr Leu Asp Pro Val His Lys Lys 275 280
285 Phe Val Glu Glu Leu Ser Ser Cys Asn Ile Phe Met Val
Lys Asp Asn 290 295 300
Ile Ile Ser Thr Pro Leu Leu Thr Gly Thr Val Leu Pro Gly Ile Thr 305
310 315 320 Arg Arg Ser Ile
Ile Glu Tyr Ala Arg Ser Leu Gly Phe Gln Val Glu 325
330 335 Glu Cys Leu Ile Thr Ile Asp Glu Leu
Leu Asp Ala Asp Glu Val Phe 340 345
350 Cys Thr Gly Thr Ser Val Val Leu Ser Ser Val Gly Cys Ile
Val Tyr 355 360 365
Gln Gly Arg Arg Val Glu Tyr Gly Asn Gln Lys Phe Arg Thr Val Ser 370
375 380 Gln Gln Leu Tyr Ser
Ala Leu Thr Ala Ile Gln Lys Gly Leu Val Glu 385 390
395 400 Asp Ser Met Gly Trp Thr Val Gln Leu Asn
405 410
187 1230 DNA Oryza sativa
187 atggagtacg gtgcagcaac gcgtggcgcg ctcctcgcgg ccgccccgct ctccggcgcc
60cggcgtagct ggttgcccct ctcatcgccg ccgtcgccgc cctctattca gattcagaat
120cgactttatt cgatatcgtc gcttccacta aaggctcgag gcgtgagaag atgcgaggct
180tctctagcaa gtgactacac gaaggcatct gaggtagctg atttagattg ggagaacctt
240ggttttggaa tcgtgcagac cgactacatg tatatcacaa aatgcggaca ggacgggaat
300ttttctgagg gtgaaatgat tccatttgga cctatagcgc tgaacccatc ttctggagtc
360cttaattacg gacagggatt atttgaaggt ctaaaagcat atagaacaac agatgactct
420atcttattat ttcgcccgga ggaaaatgca ctgagaatga gaacaggtgc agaaagaatg
480tgcatgcctg cgcctagtgt tgagcagttt gtggatgcag taaagcaaac tgttttagca
540aacaagagat gggtgcctcc taccggtaaa ggttctttgt atattagacc gctactcatg
600ggtagtggtg ctgttcttgg tcttgcacct gctcctgagt atacgttcat tatatttgtc
660tcgcctgtgg ggaactactt taaggaaggt ttagctccaa taaatttgat agttgaagat
720aagtttcatc gtgcaacccc tggtggaact ggaagtgtga agaccatagg aaattatgcc
780tcggtcttga tggcacagaa gattgcaaaa gaaaagggct attctgatgt tctctacttg
840gatgctgttc acaaaaagta tcttgaagaa gtttcttcat gtaatatttt tgttgtcaag
900ggcaatgtca tttcaactcc agcagtaaaa ggaacaatat tgccaggcat cacaaggaaa
960agtatcattg atgttgctct gagcaagggt ttccaggtcg aggagcgact tgtgtcagta
1020gatgagctgc ttgaagctga tgaggttttc tgcacaggaa ctgctgtcgt agtgtctcct
1080gtgggtagta ttacctatca agggaaaagg gtcgaatatg ctggcaacaa aggagttggt
1140gtcgtgtctc agcagctata tacttcatta acaagcctgc agatgggcca ggcagaagat
1200tggctagtct ggactgtgca actgagttag
1230
188 409 PRT Oryza sativa
188 Met Glu Tyr Gly Ala Ala Thr Arg Gly Ala Leu
Leu Ala Ala Ala Pro 1 5 10
15 Leu Ser Gly Ala Arg Arg Ser Trp Leu Pro Leu Ser Ser Pro Pro Ser
20 25 30 Pro Pro
Ser Ile Gln Ile Gln Asn Arg Leu Tyr Ser Ile Ser Ser Leu 35
40 45 Pro Leu Lys Ala Arg Gly Val
Arg Arg Cys Glu Ala Ser Leu Ala Ser 50 55
60 Asp Tyr Thr Lys Ala Ser Glu Val Ala Asp Leu Asp
Trp Glu Asn Leu 65 70 75
80 Gly Phe Gly Ile Val Gln Thr Asp Tyr Met Tyr Ile Thr Lys Cys Gly
85 90 95 Gln Asp Gly
Asn Phe Ser Glu Gly Glu Met Ile Pro Phe Gly Pro Ile 100
105 110 Ala Leu Asn Pro Ser Ser Gly Val
Leu Asn Tyr Gly Gln Gly Leu Phe 115 120
125 Glu Gly Leu Lys Ala Tyr Arg Thr Thr Asp Asp Ser Ile
Leu Leu Phe 130 135 140
Arg Pro Glu Glu Asn Ala Leu Arg Met Arg Thr Gly Ala Glu Arg Met 145
150 155 160 Cys Met Pro Ala
Pro Ser Val Glu Gln Phe Val Asp Ala Val Lys Gln 165
170 175 Thr Val Leu Ala Asn Lys Arg Trp Val
Pro Pro Thr Gly Lys Gly Ser 180 185
190 Leu Tyr Ile Arg Pro Leu Leu Met Gly Ser Gly Ala Val Leu
Gly Leu 195 200 205
Ala Pro Ala Pro Glu Tyr Thr Phe Ile Ile Phe Val Ser Pro Val Gly 210
215 220 Asn Tyr Phe Lys Glu
Gly Leu Ala Pro Ile Asn Leu Ile Val Glu Asp 225 230
235 240 Lys Phe His Arg Ala Thr Pro Gly Gly Thr
Gly Ser Val Lys Thr Ile 245 250
255 Gly Asn Tyr Ala Ser Val Leu Met Ala Gln Lys Ile Ala Lys Glu
Lys 260 265 270 Gly
Tyr Ser Asp Val Leu Tyr Leu Asp Ala Val His Lys Lys Tyr Leu 275
280 285 Glu Glu Val Ser Ser Cys
Asn Ile Phe Val Val Lys Gly Asn Val Ile 290 295
300 Ser Thr Pro Ala Val Lys Gly Thr Ile Leu Pro
Gly Ile Thr Arg Lys 305 310 315
320 Ser Ile Ile Asp Val Ala Leu Ser Lys Gly Phe Gln Val Glu Glu Arg
325 330 335 Leu Val
Ser Val Asp Glu Leu Leu Glu Ala Asp Glu Val Phe Cys Thr 340
345 350 Gly Thr Ala Val Val Val Ser
Pro Val Gly Ser Ile Thr Tyr Gln Gly 355 360
365 Lys Arg Val Glu Tyr Ala Gly Asn Lys Gly Val Gly
Val Val Ser Gln 370 375 380
Gln Leu Tyr Thr Ser Leu Thr Ser Leu Gln Met Gly Gln Ala Glu Asp 385
390 395 400 Trp Leu Val
Trp Thr Val Gln Leu Ser 405
189 1218 DNA Oryza sativa
189 atggagctcc acctcacctc ccgcggcgcc ctcccgctgt
ctccgccgct cgccggccag 60cggcgtcctc acctctctct ctccacgccg tcgcttccga
tcaagaatca cacttattca 120gtgccacctc ctttctccaa ggctcactgc gcgataggat
gccaagcttc tctagcaact 180aactacatgg aaacctctgc ggtggctgat ttggactggg
agaacctcgg ttttggcctt 240gtccagacag attttatgta tattgcaaaa tgcgggccag
atgggaactt ttccaaagga 300gaaatggtac catttggacc tatagaactg agcccatctg
ctggagtctt aaattatgga 360cagggcttgt ttgagggctt aaaggcatat agaaaaacag
atggatacat tctgctgttt 420cgtccggagg agaatgccat aaggatgaga aatggtgcag
agaggatgtg tatgcctgca 480ccaactcttg aacaatttgt ggatgcagta aagcaaaccg
ttttggcaaa taaaagatgg 540gtgcccccaa ccggtaaagg ctccctgtat ataaggccgc
tgcttatggg aagtggagct 600gtccttggtc ttgcacctgc tcctgagtat acctttatga
tttttgtctc ccctgttggg 660aactatttca aggaaggttt agcccctatt aacttgatta
tagaagaaaa ctttcaccgt 720gctgcccctg gtggaactgg cggagtgaaa accattggaa
actatgcctc ggtattaaaa 780gcacagagga ttgcaaaaca gaaaggatat tcagatgtcc
tctatctaga tgccgttcac 840aagaaatatc tggaagaagt gtcttcgtgc aatatcttta
ttgtgaaagg caatgttatt 900tctactccag caataaaagg aaccatactg cctggtataa
caaggaaaag tattcttgaa 960gttgctcaga gaaaaggctt catggttgag gagcgccttg
tgtcagtgga tgagcttctt 1020gaagctgatg aagttttctg cacgggaaca gctgttgtgg
tgtcccctgt ggggagcata 1080acttatctgg ggcaaagggt ggaatatggc aaccaaggag
tgggcgtggt gtgtcagcag 1140ctgtatactt cacttacaag cctccagatg ggtcatgtgg
acgattgtat gggctggact 1200gtggaactaa accagtga
1218
190 405 PRT Oryza sativa
190 Met Glu Leu His Leu
Thr Ser Arg Gly Ala Leu Pro Leu Ser Pro Pro 1 5
10 15 Leu Ala Gly Gln Arg Arg Pro His Leu Ser
Leu Ser Thr Pro Ser Leu 20 25
30 Pro Ile Lys Asn His Thr Tyr Ser Val Pro Pro Pro Phe Ser Lys
Ala 35 40 45 His
Cys Ala Ile Gly Cys Gln Ala Ser Leu Ala Thr Asn Tyr Met Glu 50
55 60 Thr Ser Ala Val Ala Asp
Leu Asp Trp Glu Asn Leu Gly Phe Gly Leu 65 70
75 80 Val Gln Thr Asp Phe Met Tyr Ile Ala Lys Cys
Gly Pro Asp Gly Asn 85 90
95 Phe Ser Lys Gly Glu Met Val Pro Phe Gly Pro Ile Glu Leu Ser Pro
100 105 110 Ser Ala
Gly Val Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys 115
120 125 Ala Tyr Arg Lys Thr Asp Gly
Tyr Ile Leu Leu Phe Arg Pro Glu Glu 130 135
140 Asn Ala Ile Arg Met Arg Asn Gly Ala Glu Arg Met
Cys Met Pro Ala 145 150 155
160 Pro Thr Leu Glu Gln Phe Val Asp Ala Val Lys Gln Thr Val Leu Ala
165 170 175 Asn Lys Arg
Trp Val Pro Pro Thr Gly Lys Gly Ser Leu Tyr Ile Arg 180
185 190 Pro Leu Leu Met Gly Ser Gly Ala
Val Leu Gly Leu Ala Pro Ala Pro 195 200
205 Glu Tyr Thr Phe Met Ile Phe Val Ser Pro Val Gly Asn
Tyr Phe Lys 210 215 220
Glu Gly Leu Ala Pro Ile Asn Leu Ile Ile Glu Glu Asn Phe His Arg 225
230 235 240 Ala Ala Pro Gly
Gly Thr Gly Gly Val Lys Thr Ile Gly Asn Tyr Ala 245
250 255 Ser Val Leu Lys Ala Gln Arg Ile Ala
Lys Gln Lys Gly Tyr Ser Asp 260 265
270 Val Leu Tyr Leu Asp Ala Val His Lys Lys Tyr Leu Glu Glu
Val Ser 275 280 285
Ser Cys Asn Ile Phe Ile Val Lys Gly Asn Val Ile Ser Thr Pro Ala 290
295 300 Ile Lys Gly Thr Ile
Leu Pro Gly Ile Thr Arg Lys Ser Ile Leu Glu 305 310
315 320 Val Ala Gln Arg Lys Gly Phe Met Val Glu
Glu Arg Leu Val Ser Val 325 330
335 Asp Glu Leu Leu Glu Ala Asp Glu Val Phe Cys Thr Gly Thr Ala
Val 340 345 350 Val
Val Ser Pro Val Gly Ser Ile Thr Tyr Leu Gly Gln Arg Val Glu 355
360 365 Tyr Gly Asn Gln Gly Val
Gly Val Val Cys Gln Gln Leu Tyr Thr Ser 370 375
380 Leu Thr Ser Leu Gln Met Gly His Val Asp Asp
Cys Met Gly Trp Thr 385 390 395
400 Val Glu Leu Asn Gln 405
191 1317 DNA Physcomitrella patens
191 atggcggtga tgtgcgggat cgggttggct
tcctccttgt tgcagcagga gagttacatg 60agcgtggcgt cgtctgaagc ggggagagct
gatgttaagc gcgtctcatc gtcttcgtcg 120cctcagcttc ttcagaatgg cgtcggtttg
aggaggactt gtcggatgcc ggcttttttt 180gtgacagagg agaggcttag gtcttcgttg
tcgcataact caacctacca tgtacgcgga 240tcgaaagtgc agcaattgca tgcagtagca
gatgctctga accaaaccag tgacttggat 300acattggaag ggatcgactg ggacaatttt
ggttttggtc tgcgtccaac tgatttcatg 360tttgttatga agggcgacct agagggcaat
tggcaaaaag gacagctacg accgtttggg 420aatttggaag tcagtccatc tgctggagtg
ttaaattatg gacagggtgt gtttgaaggt 480atgaaggcat ataggacagc tgatgatcgc
atattaattt tccgtccaga ggagaatgcc 540atgcgtatga taaatggggc tgagcggatg
agcatgcctg ccccagatgt tgacacattt 600gttgatgctg tgaaaaaaac ggttctggca
aacaaacgtt gggtgccccc gacaggcaag 660ggatcacttt acatccgtcc cttgctcatt
ggcactggcc ctattttggg cttagcacct 720gcaccagaat atacctttct aatttatgta
tctcctgttg gaacatattt caaggggggc 780ttatctccaa ttgacctgaa agtagagact
tatttccatc gtgctgctcc tggtggaact 840ggtggagtaa aaactatatc caattatgcc
ccagtgctga agactcaact gatggctaaa 900gggaacggct attcagatgt cttgtaccta
gacgcaatag agaacaagta tgtggaggaa 960gtctcgtctt gcaacatatt catggtcaag
ggcaaagtga tctcaactcc tgaattggct 1020ggaacgatcc tgccgggaat cacaagaaag
agcattattc agttagcacg cagtcgcggt 1080tatgaggtaa atgagcgacc agtgtcggtg
gatgaactgc tagctgccga tgaggtgttt 1140tgcactggaa cggctgtggt tgtaaatccc
gtaggaagca tcactcatgg cacaaacagg 1200gtgcagtaca ataatggagc tgtgggaaga
gtatcacaag agctttatga agccctgaca 1260accttacaaa tgggtgtatc caaagatgaa
tttgactggg tagtagaatt ggtgtaa 1317
192 438 PRT Physcomitrella patens
192 Met Ala Val Met Cys Gly Ile Gly Leu Ala Ser Ser Leu Leu Gln Gln 1
5 10 15 Glu Ser Tyr Met
Ser Val Ala Ser Ser Glu Ala Gly Arg Ala Asp Val 20
25 30 Lys Arg Val Ser Ser Ser Ser Ser Pro
Gln Leu Leu Gln Asn Gly Val 35 40
45 Gly Leu Arg Arg Thr Cys Arg Met Pro Ala Phe Phe Val Thr
Glu Glu 50 55 60
Arg Leu Arg Ser Ser Leu Ser His Asn Ser Thr Tyr His Val Arg Gly 65
70 75 80 Ser Lys Val Gln Gln
Leu His Ala Val Ala Asp Ala Leu Asn Gln Thr 85
90 95 Ser Asp Leu Asp Thr Leu Glu Gly Ile Asp
Trp Asp Asn Phe Gly Phe 100 105
110 Gly Leu Arg Pro Thr Asp Phe Met Phe Val Met Lys Gly Asp Leu
Glu 115 120 125 Gly
Asn Trp Gln Lys Gly Gln Leu Arg Pro Phe Gly Asn Leu Glu Val 130
135 140 Ser Pro Ser Ala Gly Val
Leu Asn Tyr Gly Gln Gly Val Phe Glu Gly 145 150
155 160 Met Lys Ala Tyr Arg Thr Ala Asp Asp Arg Ile
Leu Ile Phe Arg Pro 165 170
175 Glu Glu Asn Ala Met Arg Met Ile Asn Gly Ala Glu Arg Met Ser Met
180 185 190 Pro Ala
Pro Asp Val Asp Thr Phe Val Asp Ala Val Lys Lys Thr Val 195
200 205 Leu Ala Asn Lys Arg Trp Val
Pro Pro Thr Gly Lys Gly Ser Leu Tyr 210 215
220 Ile Arg Pro Leu Leu Ile Gly Thr Gly Pro Ile Leu
Gly Leu Ala Pro 225 230 235
240 Ala Pro Glu Tyr Thr Phe Leu Ile Tyr Val Ser Pro Val Gly Thr Tyr
245 250 255 Phe Lys Gly
Gly Leu Ser Pro Ile Asp Leu Lys Val Glu Thr Tyr Phe 260
265 270 His Arg Ala Ala Pro Gly Gly Thr
Gly Gly Val Lys Thr Ile Ser Asn 275 280
285 Tyr Ala Pro Val Leu Lys Thr Gln Leu Met Ala Lys Gly
Asn Gly Tyr 290 295 300
Ser Asp Val Leu Tyr Leu Asp Ala Ile Glu Asn Lys Tyr Val Glu Glu 305
310 315 320 Val Ser Ser Cys
Asn Ile Phe Met Val Lys Gly Lys Val Ile Ser Thr 325
330 335 Pro Glu Leu Ala Gly Thr Ile Leu Pro
Gly Ile Thr Arg Lys Ser Ile 340 345
350 Ile Gln Leu Ala Arg Ser Arg Gly Tyr Glu Val Asn Glu Arg
Pro Val 355 360 365
Ser Val Asp Glu Leu Leu Ala Ala Asp Glu Val Phe Cys Thr Gly Thr 370
375 380 Ala Val Val Val Asn
Pro Val Gly Ser Ile Thr His Gly Thr Asn Arg 385 390
395 400 Val Gln Tyr Asn Asn Gly Ala Val Gly Arg
Val Ser Gln Glu Leu Tyr 405 410
415 Glu Ala Leu Thr Thr Leu Gln Met Gly Val Ser Lys Asp Glu Phe
Asp 420 425 430 Trp
Val Val Glu Leu Val 435
193 1317 DNA Physcomitrella patens
193 atggggatgg cgtgtgggaa tcggcttgca tcctctctgt tgcaagagag
tctcatgacg 60gtggcctcgt ccgaagctag aagaaaggat gccaggcgcg tctcgttgtc
gtcgtcgccg 120tctcagcttc gtcaaagtgc ttctggtggc ttgagacggt gccgagtgcc
tgcatttgcc 180ctgatagagg acgattctag gtcatcagtt tcgcagaatg caacctgcca
tgtacgcggg 240tcgaaagtgc agcccttgaa tgcagtagca gatgccctca accaaaccag
tgatttggat 300acgttgcaag ggattgattg ggacaacttt ggatttggtc tgcgtcctac
tgatttcatg 360tacgtaaaga agggcgacat tgcgggaaat tggcaagagg gggagctagt
accatatggg 420aatttggaaa tcagtccatc tgctggagtg ttaaattatg gacagggtgt
gtttgaaggc 480ctgaaggcgt acaggacagc tgatgatagc atattaatgt tccgtccaga
ggagaatgct 540ttgcgcatgg ttcatggggc tgagcgtatg agtatgcctg ctcctgatgt
tgacacattc 600atcaatgctg taaagcaaac tgttctggcg aataaacgtt gggtgccccc
gactggaaaa 660ggatcacttt acatccgtcc cttgcttatt ggcactggtc ctattttggg
cttagcacca 720gctccagagt atacctttct cgtatatgtg tctcctgtcg gaacctactt
caagggaggg 780ctatctccta ttgacctgaa agtggaaact tatttccatc gtgctgctcc
tggtgggact 840gggggagtta aaaccatctc caattatgct ccagtgctca agactcaata
tacagctaaa 900gggaaaggct attcagatgt cgtatattta gacgcaaaag agaacaagta
tgtggaggag 960gtttcgtctt gcaacatatt cgtggttaag gacaaagtga tctcaacccc
ggaattggct 1020ggaacaatcc tgccgggaat tacaaggaat agcattattc aattagctcg
gagtcgtggt 1080tatgaggtga atgagcgacc agtatccgtg gatgagctgc tagctgctga
tgaggtgttt 1140tgcactggaa cggctgtggt tgtaaatcct gtgggcagcg tcactcacgg
cacgaagcgg 1200gtgctgtata atcacggagt tgttggagga gtatcgcaag agctttatga
agccctaaca 1260tccatacaaa tgggtgtatc caaagatgag tttgattggg tagtagaatt
ggcgtaa 1317
194 438 PRT Physcomitrella patens
194 Met Gly Met Ala Cys
Gly Asn Arg Leu Ala Ser Ser Leu Leu Gln Glu 1 5
10 15 Ser Leu Met Thr Val Ala Ser Ser Glu Ala
Arg Arg Lys Asp Ala Arg 20 25
30 Arg Val Ser Leu Ser Ser Ser Pro Ser Gln Leu Arg Gln Ser Ala
Ser 35 40 45 Gly
Gly Leu Arg Arg Cys Arg Val Pro Ala Phe Ala Leu Ile Glu Asp 50
55 60 Asp Ser Arg Ser Ser Val
Ser Gln Asn Ala Thr Cys His Val Arg Gly 65 70
75 80 Ser Lys Val Gln Pro Leu Asn Ala Val Ala Asp
Ala Leu Asn Gln Thr 85 90
95 Ser Asp Leu Asp Thr Leu Gln Gly Ile Asp Trp Asp Asn Phe Gly Phe
100 105 110 Gly Leu
Arg Pro Thr Asp Phe Met Tyr Val Lys Lys Gly Asp Ile Ala 115
120 125 Gly Asn Trp Gln Glu Gly Glu
Leu Val Pro Tyr Gly Asn Leu Glu Ile 130 135
140 Ser Pro Ser Ala Gly Val Leu Asn Tyr Gly Gln Gly
Val Phe Glu Gly 145 150 155
160 Leu Lys Ala Tyr Arg Thr Ala Asp Asp Ser Ile Leu Met Phe Arg Pro
165 170 175 Glu Glu Asn
Ala Leu Arg Met Val His Gly Ala Glu Arg Met Ser Met 180
185 190 Pro Ala Pro Asp Val Asp Thr Phe
Ile Asn Ala Val Lys Gln Thr Val 195 200
205 Leu Ala Asn Lys Arg Trp Val Pro Pro Thr Gly Lys Gly
Ser Leu Tyr 210 215 220
Ile Arg Pro Leu Leu Ile Gly Thr Gly Pro Ile Leu Gly Leu Ala Pro 225
230 235 240 Ala Pro Glu Tyr
Thr Phe Leu Val Tyr Val Ser Pro Val Gly Thr Tyr 245
250 255 Phe Lys Gly Gly Leu Ser Pro Ile Asp
Leu Lys Val Glu Thr Tyr Phe 260 265
270 His Arg Ala Ala Pro Gly Gly Thr Gly Gly Val Lys Thr Ile
Ser Asn 275 280 285
Tyr Ala Pro Val Leu Lys Thr Gln Tyr Thr Ala Lys Gly Lys Gly Tyr 290
295 300 Ser Asp Val Val Tyr
Leu Asp Ala Lys Glu Asn Lys Tyr Val Glu Glu 305 310
315 320 Val Ser Ser Cys Asn Ile Phe Val Val Lys
Asp Lys Val Ile Ser Thr 325 330
335 Pro Glu Leu Ala Gly Thr Ile Leu Pro Gly Ile Thr Arg Asn Ser
Ile 340 345 350 Ile
Gln Leu Ala Arg Ser Arg Gly Tyr Glu Val Asn Glu Arg Pro Val 355
360 365 Ser Val Asp Glu Leu Leu
Ala Ala Asp Glu Val Phe Cys Thr Gly Thr 370 375
380 Ala Val Val Val Asn Pro Val Gly Ser Val Thr
His Gly Thr Lys Arg 385 390 395
400 Val Leu Tyr Asn His Gly Val Val Gly Gly Val Ser Gln Glu Leu Tyr
405 410 415 Glu Ala
Leu Thr Ser Ile Gln Met Gly Val Ser Lys Asp Glu Phe Asp 420
425 430 Trp Val Val Glu Leu Ala
435
SEQ ID NO: 195, length 1020, DNA, Populus trichocarpa
atgcaacaag
gatttgtgcc attgcatatc aattgggata atgttggttt tggtctaact 60cccacggatt
tcatgttctt aatgaaatgc cctgttggag acaaatattc agaaggacac 120cttgttccct
atggaaatct tgagataagc ccatcctctt cagtgttaaa ctacggacag 180gggttacttg
aaggcttaaa ggcatataga ggtgatgata accgtattcg actcttccgg 240ccagaacaaa
atgctctacg catgcaaatg ggggcggaaa gaatgtgcat gtcatcacca 300actgctgagc
aatttgttag ttcaataaag caaactgctt tggccaataa aagatgggta 360cctcctccag
gaaaaggatc gctctatatt aggcccttgc tcctgggaac agggccaatt 420ctaggtgtgg
cgccatctcc agaatacacc ttcctagcat atgcttctcc agttggcaac 480tatttcaatg
gtcccatgca cttctctgtt gaagataagg tctatcgagc aattcctgga 540ggaactggtg
gcattaaatc tatcactaac tattcgccta tttacaaggc aatcactcaa 600gcaaaggcca
aaggcttcac cgatgctata ttccttgatg cagcaactgg caaaaatata 660gaggaggcta
ctgcatgtaa tatcttcgtt gtgaagggaa atgtcatctc aactcctcca 720atagccggaa
ctattctgcc tggaatcaca agaaaaagca tcattgaagt tgcttcctgg 780ctcggatatc
aaattgagga acgtgctatc ccactggagg agttgataaa tgttgatgaa 840gctttctgct
caggaactgc gatagcaatt aagcctgttg gcagtgtaac ctatcaggga 900caaagggttg
aatataaaac aggcgagggt actgtatctg agaaactatg tagaacactg 960acaggaattc
aaactggtct cattgaggac actatgggat gggtcgtgga gattgaataa
1020
SEQ ID NO: 196, length 339, PRT, Populus trichocarpa
Met Gln Gln Gly Phe Val Pro Leu His
Ile Asn Trp Asp Asn Val Gly 1 5 10
15 Phe Gly Leu Thr Pro Thr Asp Phe Met Phe Leu Met Lys Cys
Pro Val 20 25 30
Gly Asp Lys Tyr Ser Glu Gly His Leu Val Pro Tyr Gly Asn Leu Glu
35 40 45 Ile Ser Pro Ser
Ser Ser Val Leu Asn Tyr Gly Gln Gly Leu Leu Glu 50
55 60 Gly Leu Lys Ala Tyr Arg Gly Asp
Asp Asn Arg Ile Arg Leu Phe Arg 65 70
75 80 Pro Glu Gln Asn Ala Leu Arg Met Gln Met Gly Ala
Glu Arg Met Cys 85 90
95 Met Ser Ser Pro Thr Ala Glu Gln Phe Val Ser Ser Ile Lys Gln Thr
100 105 110 Ala Leu Ala
Asn Lys Arg Trp Val Pro Pro Pro Gly Lys Gly Ser Leu 115
120 125 Tyr Ile Arg Pro Leu Leu Leu Gly
Thr Gly Pro Ile Leu Gly Val Ala 130 135
140 Pro Ser Pro Glu Tyr Thr Phe Leu Ala Tyr Ala Ser Pro
Val Gly Asn 145 150 155
160 Tyr Phe Asn Gly Pro Met His Phe Ser Val Glu Asp Lys Val Tyr Arg
165 170 175 Ala Ile Pro Gly
Gly Thr Gly Gly Ile Lys Ser Ile Thr Asn Tyr Ser 180
185 190 Pro Ile Tyr Lys Ala Ile Thr Gln Ala
Lys Ala Lys Gly Phe Thr Asp 195 200
205 Ala Ile Phe Leu Asp Ala Ala Thr Gly Lys Asn Ile Glu Glu
Ala Thr 210 215 220
Ala Cys Asn Ile Phe Val Val Lys Gly Asn Val Ile Ser Thr Pro Pro 225
230 235 240 Ile Ala Gly Thr Ile
Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile Glu 245
250 255 Val Ala Ser Trp Leu Gly Tyr Gln Ile Glu
Glu Arg Ala Ile Pro Leu 260 265
270 Glu Glu Leu Ile Asn Val Asp Glu Ala Phe Cys Ser Gly Thr Ala
Ile 275 280 285 Ala
Ile Lys Pro Val Gly Ser Val Thr Tyr Gln Gly Gln Arg Val Glu 290
295 300 Tyr Lys Thr Gly Glu Gly
Thr Val Ser Glu Lys Leu Cys Arg Thr Leu 305 310
315 320 Thr Gly Ile Gln Thr Gly Leu Ile Glu Asp Thr
Met Gly Trp Val Val 325 330
335 Glu Ile Glu
SEQ ID NO: 197, length 1185, DNA, Populus trichocarpa
atgattcaaa
cgaattctgg cttacgcagt ttggttcaat ctttacgacc catcacttcc 60tctttatcgg
agcttatagt tgttgctaca catcacaagc agcatctgct cttcaacaag 120ttagcagacc
atattcaaac agaagctttc ggatttttgc tgttcgttgg tggtgattct 180agcgaggatg
agtatgctaa agtggactgg gataatctca gatttggcat cacaccagct 240gattacatgt
acacaatgaa atgttccagt gatgggaagt ttgaacaagg gcagcttgct 300ccatacggaa
atgttgaatt gagcccttca gcagcaggac tttatgaagg cacaaaagca 360tatagaacag
aagatgggcg cctgcttctc tttcgtctgg atcaaaatgc cacgcggatg 420aagatgggcg
ctgacagatt gtgcatggct tgcccctcca tttatcaaat tattgacgcg 480gtcaaacaaa
ctgctctcgc taacaagcgc tggaccccac ctcgagggaa agggactttg 540tatatcaggc
ctttgctaat gggaagtggt cctattctgg gattagcacc agcacctgaa 600tacacattcc
tcatatatgc ttctcctgtc ggcaattatt tcaaggaggg tttgaaaccc 660ttgaacctat
atgttgagga tgagtttcat cgggctactc gaggaggagc tggaggcgtt 720aaatccatca
caaattatgc accagtttta aaagcaatgg ccagagcaaa aagcagagga 780ttttctgatg
ttttgtacct cgactcggcc aataagaaaa atctggaaga agtctcttct 840tgcaacattt
tccttgtgaa gggcaatata atttctagtc ctgctacaag tgggactatt 900cttccagggg
tcactcgaag aagcatcatt gaaattgctc tcgatcatgg ctatcaggtc 960gaggaacgtg
caattccatt ggacgaattg atggatgccg atgaagtttt ttgcacggga 1020actgcagtag
gtgttgcccc tgtgggcacg attacatatc aggataggag agttgagtac 1080aacgtcggag
aagagtccgt gtctcagaag ctttactcga ttcttgaagg aattaaaacg 1140ggagtcatcg
aggataagaa aggctggact attgagatcc agtga
1185
SEQ ID NO: 198, length 394, PRT, Populus trichocarpa
Met Ile Gln Thr Asn Ser Gly Leu Arg
Ser Leu Val Gln Ser Leu Arg 1 5 10
15 Pro Ile Thr Ser Ser Leu Ser Glu Leu Ile Val Val Ala Thr
His His 20 25 30
Lys Gln His Leu Leu Phe Asn Lys Leu Ala Asp His Ile Gln Thr Glu
35 40 45 Ala Phe Gly Phe
Leu Leu Phe Val Gly Gly Asp Ser Ser Glu Asp Glu 50
55 60 Tyr Ala Lys Val Asp Trp Asp Asn
Leu Arg Phe Gly Ile Thr Pro Ala 65 70
75 80 Asp Tyr Met Tyr Thr Met Lys Cys Ser Ser Asp Gly
Lys Phe Glu Gln 85 90
95 Gly Gln Leu Ala Pro Tyr Gly Asn Val Glu Leu Ser Pro Ser Ala Ala
100 105 110 Gly Leu Tyr
Glu Gly Thr Lys Ala Tyr Arg Thr Glu Asp Gly Arg Leu 115
120 125 Leu Leu Phe Arg Leu Asp Gln Asn
Ala Thr Arg Met Lys Met Gly Ala 130 135
140 Asp Arg Leu Cys Met Ala Cys Pro Ser Ile Tyr Gln Ile
Ile Asp Ala 145 150 155
160 Val Lys Gln Thr Ala Leu Ala Asn Lys Arg Trp Thr Pro Pro Arg Gly
165 170 175 Lys Gly Thr Leu
Tyr Ile Arg Pro Leu Leu Met Gly Ser Gly Pro Ile 180
185 190 Leu Gly Leu Ala Pro Ala Pro Glu Tyr
Thr Phe Leu Ile Tyr Ala Ser 195 200
205 Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu Lys Pro Leu Asn
Leu Tyr 210 215 220
Val Glu Asp Glu Phe His Arg Ala Thr Arg Gly Gly Ala Gly Gly Val 225
230 235 240 Lys Ser Ile Thr Asn
Tyr Ala Pro Val Leu Lys Ala Met Ala Arg Ala 245
250 255 Lys Ser Arg Gly Phe Ser Asp Val Leu Tyr
Leu Asp Ser Ala Asn Lys 260 265
270 Lys Asn Leu Glu Glu Val Ser Ser Cys Asn Ile Phe Leu Val Lys
Gly 275 280 285 Asn
Ile Ile Ser Ser Pro Ala Thr Ser Gly Thr Ile Leu Pro Gly Val 290
295 300 Thr Arg Arg Ser Ile Ile
Glu Ile Ala Leu Asp His Gly Tyr Gln Val 305 310
315 320 Glu Glu Arg Ala Ile Pro Leu Asp Glu Leu Met
Asp Ala Asp Glu Val 325 330
335 Phe Cys Thr Gly Thr Ala Val Gly Val Ala Pro Val Gly Thr Ile Thr
340 345 350 Tyr Gln
Asp Arg Arg Val Glu Tyr Asn Val Gly Glu Glu Ser Val Ser 355
360 365 Gln Lys Leu Tyr Ser Ile Leu
Glu Gly Ile Lys Thr Gly Val Ile Glu 370 375
380 Asp Lys Lys Gly Trp Thr Ile Glu Ile Gln 385
390
SEQ ID NO: 199, length 1257, DNA, Populus trichocarpa
atggcttcat caacctcagg cccaaaaaaa gtgctgtcaa ggttccgcca agtcgttggg
60ctattgcttc cgcattcgaa atctacacca ccacctgtta gcagtactga tgatgacaaa
120gtgaccactg atgatcacca agtgagtact gatgatcaca aagtgagtac tgatgagaaa
180gtgaagtctg gtggtgaaga tatcaattgg gataatgttg gttttggtct aactcccacg
240gatttcatgt tcttaatgaa atgccctgtt ggagacaaat attcagaagg acaccttgtt
300ccctatggaa atcttgagat aagcccatcc tcttcagtgt taaactacgg acagggattg
360tttgaaggga tgaaagtata taggagagaa gatgacagaa tcatgatctt taggccagaa
420gaaaatgctc gacgcatgca aatgggagca gagagactgc tgatgcaagc accaacgacc
480gagcaattta ttgatgctgt gaagaaaact gcccttgcaa acgagcgttg ggtgcctccc
540catgggacgg gaacattgta cctgaggcct ttgctaatgg gaagtggagc tgttttgggt
600attggaccag ctcctgaatg cacattcctt atctttgcat ctcctatccg caactcttac
660aagagtggga tcgacgcctt taacttgtct atcgagacca aacttcatcg agcttcccct
720ggtggaactg gaggtatcaa aagcattacc aactatgctc cggtaagcat agtttgcaac
780tggtttgatt ttgatactgt ttatgaagtc gcagtggttt ttgaatcagt gaagcgagcg
840aaggctgcag ggtttgatga tgtcctgttc ttggatggag aaactggaaa gcatattgaa
900gaggcttctt cgtgtaatgt tttcatgttg aagggtaatg tcatttcaac ccccaccata
960ctcgggacaa ttttgcctgg aattactaga aaaagcatcc tggagattgc tcaagattgt
1020ggttatgagg tcgaagaagg acgtattcca gttgaggatg tgcttgctgc ggatgaggta
1080ttttgcacag gaactgcagt tgtagtcact tctgttgcca gcataaccta tcaggaacaa
1140agggtggaat ataaaacagg agagaacaca gtgtgtcacg aactgcgaac agcccttaca
1200ggaattcaaa ctggacttgt tgaggacaag aagggatgga ctgtccacct taattaa
1257
SEQ ID NO: 200, length 418, PRT, Populus trichocarpa
Met Ala Ser Ser Thr Ser Gly Pro Lys
Lys Val Leu Ser Arg Phe Arg 1 5 10
15 Gln Val Val Gly Leu Leu Leu Pro His Ser Lys Ser Thr Pro
Pro Pro 20 25 30
Val Ser Ser Thr Asp Asp Asp Lys Val Thr Thr Asp Asp His Gln Val
35 40 45 Ser Thr Asp Asp
His Lys Val Ser Thr Asp Glu Lys Val Lys Ser Gly 50
55 60 Gly Glu Asp Ile Asn Trp Asp Asn
Val Gly Phe Gly Leu Thr Pro Thr 65 70
75 80 Asp Phe Met Phe Leu Met Lys Cys Pro Val Gly Asp
Lys Tyr Ser Glu 85 90
95 Gly His Leu Val Pro Tyr Gly Asn Leu Glu Ile Ser Pro Ser Ser Ser
100 105 110 Val Leu Asn
Tyr Gly Gln Gly Leu Phe Glu Gly Met Lys Val Tyr Arg 115
120 125 Arg Glu Asp Asp Arg Ile Met Ile
Phe Arg Pro Glu Glu Asn Ala Arg 130 135
140 Arg Met Gln Met Gly Ala Glu Arg Leu Leu Met Gln Ala
Pro Thr Thr 145 150 155
160 Glu Gln Phe Ile Asp Ala Val Lys Lys Thr Ala Leu Ala Asn Glu Arg
165 170 175 Trp Val Pro Pro
His Gly Thr Gly Thr Leu Tyr Leu Arg Pro Leu Leu 180
185 190 Met Gly Ser Gly Ala Val Leu Gly Ile
Gly Pro Ala Pro Glu Cys Thr 195 200
205 Phe Leu Ile Phe Ala Ser Pro Ile Arg Asn Ser Tyr Lys Ser
Gly Ile 210 215 220
Asp Ala Phe Asn Leu Ser Ile Glu Thr Lys Leu His Arg Ala Ser Pro 225
230 235 240 Gly Gly Thr Gly Gly
Ile Lys Ser Ile Thr Asn Tyr Ala Pro Val Ser 245
250 255 Ile Val Cys Asn Trp Phe Asp Phe Asp Thr
Val Tyr Glu Val Ala Val 260 265
270 Val Phe Glu Ser Val Lys Arg Ala Lys Ala Ala Gly Phe Asp Asp
Val 275 280 285 Leu
Phe Leu Asp Gly Glu Thr Gly Lys His Ile Glu Glu Ala Ser Ser 290
295 300 Cys Asn Val Phe Met Leu
Lys Gly Asn Val Ile Ser Thr Pro Thr Ile 305 310
315 320 Leu Gly Thr Ile Leu Pro Gly Ile Thr Arg Lys
Ser Ile Leu Glu Ile 325 330
335 Ala Gln Asp Cys Gly Tyr Glu Val Glu Glu Gly Arg Ile Pro Val Glu
340 345 350 Asp Val
Leu Ala Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val 355
360 365 Val Thr Ser Val Ala Ser Ile
Thr Tyr Gln Glu Gln Arg Val Glu Tyr 370 375
380 Lys Thr Gly Glu Asn Thr Val Cys His Glu Leu Arg
Thr Ala Leu Thr 385 390 395
400 Gly Ile Gln Thr Gly Leu Val Glu Asp Lys Lys Gly Trp Thr Val His
405 410 415 Leu Asn
SEQ ID NO: 201, length 1254, DNA, Solanum lycopersicum
atggagagcg ccgccgtatt tgcagggctt
caccctattc ccggtcacca taaccacctt 60ctgggtccat cacgaactgc tattaagctt
cttcctcctt ccattgataa aatcaatttt 120tctcctttgc ccctcaagtt tcagaagcag
tcgcatttca cttcttatat tggtaatagt 180gccataaaca gtggaaattc atttcgtgtg
gcatctcctg caagcgacgt tgcatctgaa 240ttagccgaca tcgattggga taaccttggc
tttggcttta tgcctactga ttatatgtat 300agcatgaaat gctctcaggg tgaaaacttt
tctaagggtg aattacagcg tttcggtaac 360attgagttga gtccgtctgc tggaatatta
aattatggtc agggattgtt cgaaggttta 420aaagcatatc gaaaacatga cggcaatata
ttgttgtttc gacctgagga aaatgctacg 480cgtttgaaga tgggtgctga acgtatgtgt
atgccttcac cgtctgttga acagtttgta 540gaagcagtga aagccactgt gttagctaat
gaaagatgga ttcctcctcc cggtaaaggc 600tcattataca taagacctct gcttatgggg
agtggagctg ttcttggtct tgctcctgct 660cctgagtaca cattcctgat ttatgtgtca
cctgttggaa attattttaa ggaaggtttg 720gcaccaataa atttggtagt tgagactgaa
atgcaccgtg caacacctgg tggtactgga 780ggcgttaaga ctattggaaa ttatgctgca
gttctgaagg cacagagtgc tgctaaagca 840aaaggctatt ctgatgttct gtaccttgat
tgtgttcaga aaaaatatct cgaagaggtt 900tcctcttgca atgtctttat tgtgaagggt
aatctgatag taactcctgc aattaaagga 960accattctac ctggaattac gcgaaaaagc
ataatcgacg tagctattag tcaaggattc 1020gaggttgagg aacgacaggt gtctgtggac
gaattgcttg atgctgacga agttttctgt 1080acgggaactg ccgtggtagt atctcctgtt
ggtagcatta ctcatcaagg gagaagggtg 1140acatatggaa atgatggtgt tggtcttgtg
tcgcagcagt tatactctgc acttactagc 1200ctacaaatgg ggctctcaga ggataagatg
ggttggattg ttgagctcaa atga 1254
SEQ ID NO: 202, length 417, PRT, Solanum lycopersicum
Met Glu Ser Ala Ala Val Phe Ala Gly Leu His Pro Ile Pro Gly His 1
5 10 15 His Asn His Leu
Leu Gly Pro Ser Arg Thr Ala Ile Lys Leu Leu Pro 20
25 30 Pro Ser Ile Asp Lys Ile Asn Phe Ser
Pro Leu Pro Leu Lys Phe Gln 35 40
45 Lys Gln Ser His Phe Thr Ser Tyr Ile Gly Asn Ser Ala Ile
Asn Ser 50 55 60
Gly Asn Ser Phe Arg Val Ala Ser Pro Ala Ser Asp Val Ala Ser Glu 65
70 75 80 Leu Ala Asp Ile Asp
Trp Asp Asn Leu Gly Phe Gly Phe Met Pro Thr 85
90 95 Asp Tyr Met Tyr Ser Met Lys Cys Ser Gln
Gly Glu Asn Phe Ser Lys 100 105
110 Gly Glu Leu Gln Arg Phe Gly Asn Ile Glu Leu Ser Pro Ser Ala
Gly 115 120 125 Ile
Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg 130
135 140 Lys His Asp Gly Asn Ile
Leu Leu Phe Arg Pro Glu Glu Asn Ala Thr 145 150
155 160 Arg Leu Lys Met Gly Ala Glu Arg Met Cys Met
Pro Ser Pro Ser Val 165 170
175 Glu Gln Phe Val Glu Ala Val Lys Ala Thr Val Leu Ala Asn Glu Arg
180 185 190 Trp Ile
Pro Pro Pro Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu 195
200 205 Met Gly Ser Gly Ala Val Leu
Gly Leu Ala Pro Ala Pro Glu Tyr Thr 210 215
220 Phe Leu Ile Tyr Val Ser Pro Val Gly Asn Tyr Phe
Lys Glu Gly Leu 225 230 235
240 Ala Pro Ile Asn Leu Val Val Glu Thr Glu Met His Arg Ala Thr Pro
245 250 255 Gly Gly Thr
Gly Gly Val Lys Thr Ile Gly Asn Tyr Ala Ala Val Leu 260
265 270 Lys Ala Gln Ser Ala Ala Lys Ala
Lys Gly Tyr Ser Asp Val Leu Tyr 275 280
285 Leu Asp Cys Val Gln Lys Lys Tyr Leu Glu Glu Val Ser
Ser Cys Asn 290 295 300
Val Phe Ile Val Lys Gly Asn Leu Ile Val Thr Pro Ala Ile Lys Gly 305
310 315 320 Thr Ile Leu Pro
Gly Ile Thr Arg Lys Ser Ile Ile Asp Val Ala Ile 325
330 335 Ser Gln Gly Phe Glu Val Glu Glu Arg
Gln Val Ser Val Asp Glu Leu 340 345
350 Leu Asp Ala Asp Glu Val Phe Cys Thr Gly Thr Ala Val Val
Val Ser 355 360 365
Pro Val Gly Ser Ile Thr His Gln Gly Arg Arg Val Thr Tyr Gly Asn 370
375 380 Asp Gly Val Gly Leu
Val Ser Gln Gln Leu Tyr Ser Ala Leu Thr Ser 385 390
395 400 Leu Gln Met Gly Leu Ser Glu Asp Lys Met
Gly Trp Ile Val Glu Leu 405 410
415 Lys
SEQ ID NO: 203, length 1233, DNA, Triticum aestivum
atggaactcc gcctccgcgc
cccggcgtcc cccgcttccg cctctccgcg cggcacgtcg 60gtctccccca gccccaggcc
gcatccgcgc ctaccctcgc aacccattca gaagcgattg 120tccggcagcg ccgtctccgt
ctccaggcga ggcaccgcgg caaggagcag cctgtgttcc 180gccctgatgg cggcatcata
caacacagga actccggacc tagtcgactt cgactgggag 240actcttggat ttcaactggt
cccgacggac tttatgtata taatgaaatg ttcgtcagat 300ggagtgttca ccaagggtga
attggttcca tatgggccaa tcgagctgaa ccctgctgct 360gcagttttaa attacggcca
gggattgctc gaaggtctta gagcacacag aaaggaggat 420ggttcagtaa ttgtttttcg
ccccaaggaa aacgcgttgc ggatgaggat aggtgcagat 480cggctatgca tgcctgcacc
aagcgttgag cagttcctat cagctgtcaa gcaaactata 540ttggcaaaca agcgttgggt
accccccact ggcaaaggtt ctttatatat caggccgctg 600ctgattggaa gtggagctat
gctaggtgta gcacctgccc cggagtatac atttgtcgtg 660tatgtttgcc cagttggtca
ctatttcaag gatggcctgt ctcctattag cttattgact 720gaggaagaat atcaccgcgc
tgcacctggt ggaactggtg atattaagac aattggaaat 780tatgcttcgg ttgttagtgc
tcagagaaga gccaaggaga aaggtcattc tgatgttctt 840tacttggatc ccgtgcataa
gaagtttgtg gaggaagttt cttcctgtaa tatattcatg 900gtgaaggata atgttatttc
tactccacta ttaacgggaa caatccttcc tggaatcaca 960agaagaagta taatcgaaat
tgccagcaat cttggaattc aggttgaaga gcgccttatt 1020gcgatagatg agttgcttga
tgctgatgaa gtcttctgta cagggactgc cgttgtacta 1080tcacctgttg gttccatagt
gtaccacgga agaagagtgg agtatggggg cgggaaggtc 1140ggagctgtgt cccagcaact
gtactcagca cttacagcta tccagaaagg ccttgtggag 1200gacagtatgg gatggagtgt
gcaattgaat tag 1233
SEQ ID NO: 204, length 410, PRT, Triticum aestivum
Met Glu Leu Arg Leu Arg Ala Pro Ala Ser Pro Ala Ser Ala Ser
Pro 1 5 10 15 Arg
Gly Thr Ser Val Ser Pro Ser Pro Arg Pro His Pro Arg Leu Pro
20 25 30 Ser Gln Pro Ile Gln
Lys Arg Leu Ser Gly Ser Ala Val Ser Val Ser 35
40 45 Arg Arg Gly Thr Ala Ala Arg Ser Ser
Leu Cys Ser Ala Leu Met Ala 50 55
60 Ala Ser Tyr Asn Thr Gly Thr Pro Asp Leu Val Asp Phe
Asp Trp Glu 65 70 75
80 Thr Leu Gly Phe Gln Leu Val Pro Thr Asp Phe Met Tyr Ile Met Lys
85 90 95 Cys Ser Ser Asp
Gly Val Phe Thr Lys Gly Glu Leu Val Pro Tyr Gly 100
105 110 Pro Ile Glu Leu Asn Pro Ala Ala Ala
Val Leu Asn Tyr Gly Gln Gly 115 120
125 Leu Leu Glu Gly Leu Arg Ala His Arg Lys Glu Asp Gly Ser
Val Ile 130 135 140
Val Phe Arg Pro Lys Glu Asn Ala Leu Arg Met Arg Ile Gly Ala Asp 145
150 155 160 Arg Leu Cys Met Pro
Ala Pro Ser Val Glu Gln Phe Leu Ser Ala Val 165
170 175 Lys Gln Thr Ile Leu Ala Asn Lys Arg Trp
Val Pro Pro Thr Gly Lys 180 185
190 Gly Ser Leu Tyr Ile Arg Pro Leu Leu Ile Gly Ser Gly Ala Met
Leu 195 200 205 Gly
Val Ala Pro Ala Pro Glu Tyr Thr Phe Val Val Tyr Val Cys Pro 210
215 220 Val Gly His Tyr Phe Lys
Asp Gly Leu Ser Pro Ile Ser Leu Leu Thr 225 230
235 240 Glu Glu Glu Tyr His Arg Ala Ala Pro Gly Gly
Thr Gly Asp Ile Lys 245 250
255 Thr Ile Gly Asn Tyr Ala Ser Val Val Ser Ala Gln Arg Arg Ala Lys
260 265 270 Glu Lys
Gly His Ser Asp Val Leu Tyr Leu Asp Pro Val His Lys Lys 275
280 285 Phe Val Glu Glu Val Ser Ser
Cys Asn Ile Phe Met Val Lys Asp Asn 290 295
300 Val Ile Ser Thr Pro Leu Leu Thr Gly Thr Ile Leu
Pro Gly Ile Thr 305 310 315
320 Arg Arg Ser Ile Ile Glu Ile Ala Ser Asn Leu Gly Ile Gln Val Glu
325 330 335 Glu Arg Leu
Ile Ala Ile Asp Glu Leu Leu Asp Ala Asp Glu Val Phe 340
345 350 Cys Thr Gly Thr Ala Val Val Leu
Ser Pro Val Gly Ser Ile Val Tyr 355 360
365 His Gly Arg Arg Val Glu Tyr Gly Gly Gly Lys Val Gly
Ala Val Ser 370 375 380
Gln Gln Leu Tyr Ser Ala Leu Thr Ala Ile Gln Lys Gly Leu Val Glu 385
390 395 400 Asp Ser Met Gly
Trp Ser Val Gln Leu Asn 405 410
SEQ ID NO: 205, length 1200, DNA, Triticum aestivum
atggacgtgc tgtcgtctgc gaagcgcgcc
ctcccgtggg gccgcacctc ggccggcggg 60gtcatcggcg gcctccgagc tctactcggg
acggacggag gcggcggccg ctctcttctc 120ccgtcccggt ggaagtcgtc gcagccgcag
ctggaccccg tcgacaggtc cgacgaggag 180ggcggcggcg acatcgactg ggacaacctc
ggcttcgggc tcaccccgac cgactacatg 240tacgtcatgc ggtgctcgca ggaggagggc
ggcttctccc gcggcgagct cgcccgctac 300ggcaacatcg agctcagccc ctcctccggc
gtgctcaact acgggcaggg gctgttcgag 360gggctcaagg cgtaccggag ggcggacggg
cccgggtaca tgctgttccg gccggaggag 420aacgcgcggc ggatgcagca cggcgccggg
cgcatgtgca tgccggcccc gtccgtcgag 480cagttcgtgc acgccgtcaa gcagaccgtc
ctcgccaaca ggcgctgggt gccgccgcag 540ggaaagggag cgctgtacct caggccgctg
ctcatcggga gcggggcgat cctcgggctg 600gcgccggcgc cggagtacac cttcatgatc
tacgccgcgc ctgtggggac atatttcaag 660gaaggcatgg cggcgataaa cctgctggtc
gaggaggaga tccaccgcgc catgccgggc 720ggcaccggcg gggtcaagag catctccaac
tacgcgccgg tgctcaaggc gcagatggac 780gcgaggagca aggggttcgc ggacgtgctg
tacctggact cggtgcacaa gaggtacgtg 840gaggaggcct cctcctgcaa cctcttcgtc
gtgaagggcg gcgccatcgc gacgccggcg 900acggagggga ccatcctgcc gggggtcacg
cgcaggagca tcatcgagct cgccagagac 960agcggctacc aggtggaaga gcgcctcgtc
tccatcgacg atctgatcag tgcagacgaa 1020gtgttctgca cgggaacggc cgtcggcatc
accccggtgt cgaccatcac ctaccaaggg 1080acaaggtacg agttcaggac cggggaggac
acgttgtcga agaagcttta cacggctctg 1140acgtcgatcc agatgggcct ggcggaggac
aagaagggat ggacggtcgc ggttgattga 1200
SEQ ID NO: 206, length 399, PRT, Triticum aestivum
Met
Asp Val Leu Ser Ser Ala Lys Arg Ala Leu Pro Trp Gly Arg Thr 1
5 10 15 Ser Ala Gly Gly Val Ile
Gly Gly Leu Arg Ala Leu Leu Gly Thr Asp 20
25 30 Gly Gly Gly Gly Arg Ser Leu Leu Pro Ser
Arg Trp Lys Ser Ser Gln 35 40
45 Pro Gln Leu Asp Pro Val Asp Arg Ser Asp Glu Glu Gly Gly
Gly Asp 50 55 60
Ile Asp Trp Asp Asn Leu Gly Phe Gly Leu Thr Pro Thr Asp Tyr Met 65
70 75 80 Tyr Val Met Arg Cys
Ser Gln Glu Glu Gly Gly Phe Ser Arg Gly Glu 85
90 95 Leu Ala Arg Tyr Gly Asn Ile Glu Leu Ser
Pro Ser Ser Gly Val Leu 100 105
110 Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Arg
Ala 115 120 125 Asp
Gly Pro Gly Tyr Met Leu Phe Arg Pro Glu Glu Asn Ala Arg Arg 130
135 140 Met Gln His Gly Ala Gly
Arg Met Cys Met Pro Ala Pro Ser Val Glu 145 150
155 160 Gln Phe Val His Ala Val Lys Gln Thr Val Leu
Ala Asn Arg Arg Trp 165 170
175 Val Pro Pro Gln Gly Lys Gly Ala Leu Tyr Leu Arg Pro Leu Leu Ile
180 185 190 Gly Ser
Gly Ala Ile Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe 195
200 205 Met Ile Tyr Ala Ala Pro Val
Gly Thr Tyr Phe Lys Glu Gly Met Ala 210 215
220 Ala Ile Asn Leu Leu Val Glu Glu Glu Ile His Arg
Ala Met Pro Gly 225 230 235
240 Gly Thr Gly Gly Val Lys Ser Ile Ser Asn Tyr Ala Pro Val Leu Lys
245 250 255 Ala Gln Met
Asp Ala Arg Ser Lys Gly Phe Ala Asp Val Leu Tyr Leu 260
265 270 Asp Ser Val His Lys Arg Tyr Val
Glu Glu Ala Ser Ser Cys Asn Leu 275 280
285 Phe Val Val Lys Gly Gly Ala Ile Ala Thr Pro Ala Thr
Glu Gly Thr 290 295 300
Ile Leu Pro Gly Val Thr Arg Arg Ser Ile Ile Glu Leu Ala Arg Asp 305
310 315 320 Ser Gly Tyr Gln
Val Glu Glu Arg Leu Val Ser Ile Asp Asp Leu Ile 325
330 335 Ser Ala Asp Glu Val Phe Cys Thr Gly
Thr Ala Val Gly Ile Thr Pro 340 345
350 Val Ser Thr Ile Thr Tyr Gln Gly Thr Arg Tyr Glu Phe Arg
Thr Gly 355 360 365
Glu Asp Thr Leu Ser Lys Lys Leu Tyr Thr Ala Leu Thr Ser Ile Gln 370
375 380 Met Gly Leu Ala Glu
Asp Lys Lys Gly Trp Thr Val Ala Val Asp 385 390
395
SEQ ID NO: 207, length 1206, DNA, Zea mays
atggaatacg gcgccgtcct
cgccgccgcg ccgctcgtcg cacggccgaa ctggctcctc 60ctctcgccgc cgccactggc
gccgtctatt cagattcaga atcgtcttta ttcgatctcg 120tcattcccac taaaggctgg
acctgtaagg gcatgcagag ctttagcaag caactacacg 180caaacatctg aaacagttga
tttggactgg gagaacctgg gttttgggat tgtgcaaact 240gattatatgt atattgctaa
gtgcgggaca gacgggaatt tttctgaggg tgaaatggtg 300ccttttggac ctatagcgct
gagtccatct tctggagtcc taaattatgg acagggattg 360tttgagggcc taaaggcgta
taagaaaact gatggatcca tcctattatt tcgcccagag 420gaaaatgctg agaggatgcg
gacaggtgct gagaggatgt gcatgcctgc accctctgtc 480gagcagttta ttgatgcagt
aaaacaaacc gttcttgcaa ataagagatg gattcctcct 540actggtaaag gttctctgta
tattaggccc ttacttatgg gaagtggggc tgttcttggt 600cttgcacctg ctcctgagta
tacattcatt atatttgtct ctcctgttgg aaactacttt 660aaggaaggtt tagcaccaat
aaatttgata gttgtagaca agttccatcg tgctactcct 720ggtggtactg ggggtgtgaa
gaccatagga aattatgctt cggtgttgat ggcacagaaa 780attgcaaagg aaaagggtta
ttctgatgtc ctctacttgg acgctgttca caagaagtac 840cttgaagaag tttcttcatg
caatgttttt gttgtcaagg acaatgttat ttctacccca 900gcaataaaag gaacaatatt
acctggtatc acaaggaaaa gtataattga cgttgctttg 960agtaaaggct tccaggttga
ggagcggctt gtttcagtgg atgaactgct tgatgctgat 1020gaggtattct gcacaggaac
tgctgttgtg gtgtctcctg ttgggagcat tacatatcaa 1080gggaaaagag tggaatacgg
ccaccaaggt gttggcgttg tgtcccagca gctgtacact 1140tcactgacga gtcttcagat
gggtcaaacc gaggattgga tgggctggac tgtgcaactg 1200aattag
1206
SEQ ID NO: 208, length 401, PRT, Zea mays
Met
Glu Tyr Gly Ala Val Leu Ala Ala Ala Pro Leu Val Ala Arg Pro 1
5 10 15 Asn Trp Leu Leu Leu Ser
Pro Pro Pro Leu Ala Pro Ser Ile Gln Ile 20
25 30 Gln Asn Arg Leu Tyr Ser Ile Ser Ser Phe
Pro Leu Lys Ala Gly Pro 35 40
45 Val Arg Ala Cys Arg Ala Leu Ala Ser Asn Tyr Thr Gln Thr
Ser Glu 50 55 60
Thr Val Asp Leu Asp Trp Glu Asn Leu Gly Phe Gly Ile Val Gln Thr 65
70 75 80 Asp Tyr Met Tyr Ile
Ala Lys Cys Gly Thr Asp Gly Asn Phe Ser Glu 85
90 95 Gly Glu Met Val Pro Phe Gly Pro Ile Ala
Leu Ser Pro Ser Ser Gly 100 105
110 Val Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr
Lys 115 120 125 Lys
Thr Asp Gly Ser Ile Leu Leu Phe Arg Pro Glu Glu Asn Ala Glu 130
135 140 Arg Met Arg Thr Gly Ala
Glu Arg Met Cys Met Pro Ala Pro Ser Val 145 150
155 160 Glu Gln Phe Ile Asp Ala Val Lys Gln Thr Val
Leu Ala Asn Lys Arg 165 170
175 Trp Ile Pro Pro Thr Gly Lys Gly Ser Leu Tyr Ile Arg Pro Leu Leu
180 185 190 Met Gly
Ser Gly Ala Val Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr 195
200 205 Phe Ile Ile Phe Val Ser Pro
Val Gly Asn Tyr Phe Lys Glu Gly Leu 210 215
220 Ala Pro Ile Asn Leu Ile Val Val Asp Lys Phe His
Arg Ala Thr Pro 225 230 235
240 Gly Gly Thr Gly Gly Val Lys Thr Ile Gly Asn Tyr Ala Ser Val Leu
245 250 255 Met Ala Gln
Lys Ile Ala Lys Glu Lys Gly Tyr Ser Asp Val Leu Tyr 260
265 270 Leu Asp Ala Val His Lys Lys Tyr
Leu Glu Glu Val Ser Ser Cys Asn 275 280
285 Val Phe Val Val Lys Asp Asn Val Ile Ser Thr Pro Ala
Ile Lys Gly 290 295 300
Thr Ile Leu Pro Gly Ile Thr Arg Lys Ser Ile Ile Asp Val Ala Leu 305
310 315 320 Ser Lys Gly Phe
Gln Val Glu Glu Arg Leu Val Ser Val Asp Glu Leu 325
330 335 Leu Asp Ala Asp Glu Val Phe Cys Thr
Gly Thr Ala Val Val Val Ser 340 345
350 Pro Val Gly Ser Ile Thr Tyr Gln Gly Lys Arg Val Glu Tyr
Gly His 355 360 365
Gln Gly Val Gly Val Val Ser Gln Gln Leu Tyr Thr Ser Leu Thr Ser 370
375 380 Leu Gln Met Gly Gln
Thr Glu Asp Trp Met Gly Trp Thr Val Gln Leu 385 390
395 400 Asn
SEQ ID NO: 209, length 1218, DNA, Zea mays
atggagctct
gcacctgcgc ggcccgcagt tctgcgcccc cctcttctag ctgccgcgcg 60gcgccattcc
cgcgagttct ctcccatcgc atttggagcc gatcgggata ctgcactgtc 120tgctttaccc
cgtcaagccc tgttgctagg agtcgattct ctactctaat gacgactgca 180cacaacacag
ggaccccaga tctagttgac ttcaattggg atgatcttgg gtttcaactg 240atcccaacgg
acttcatgta tttaatgagc tgttcttcag atggggtgtt tatgaatggt 300aaattagtgc
catatgggtc aattgagctg aatccagcag ccgccgtgct gaattatggt 360cagggattgc
ttgaaggtct acgatcacat agaaaagagg atggatcaat ccttcttttt 420cgtccacatg
aaaatgcacg gcggatggaa attggtgcag accggttatg catgcctgca 480ccaagtgtag
agcaattcct agaagctgtg aaactaactg ttctggcaaa caagcattgg 540gtgcctcctt
ttggtaaagg ttctttgtat atcagaccgc agctaattgg aagtggggct 600atgcttggtg
tggcacctgc cccacagtac acattcattg tgtttgtttg cccagttggg 660cattatttca
agggtggtct agctccaatc agcttgttaa ctgaggaaga ataccaccgt 720gctgcacctg
gtggaactgg tgatataaaa actattggga actatgcttc ggttgtcagt 780gctcagagaa
gatccaagga aaaaggccat tctgatgttt tatacttaga tccactccat 840aataagtttg
ttgaggaagt ttcttcttgt aatatattca tggtgaagga caatattatt 900tctactccac
tgttaacggg gacaattctt cctggaatca caaggagaag tgtgattgaa 960atttctcaga
atcttggatt tcaggttgag gagcgtctta tcacaataga tgaactgctt 1020ggggctgatg
aagtcttttg tacaggaaca gctgttgtat tgtcacctgt tgggagcatc 1080acttaccgtg
gaagaagagt ggagtatggg aagaaccagg aggccggagt cgtgtcccaa 1140caactctatg
ccgcactcac agctatccag aaaggtctca cggaggacag catgggatgg 1200acgttgcagc
taacttag 1218
SEQ ID NO: 210, length 405, PRT, Zea mays
Met Glu Leu Cys Thr Cys Ala Ala Arg Ser Ser Ala Pro Pro Ser Ser 1
5 10 15 Ser Cys Arg
Ala Ala Pro Phe Pro Arg Val Leu Ser His Arg Ile Trp 20
25 30 Ser Arg Ser Gly Tyr Cys Thr Val
Cys Phe Thr Pro Ser Ser Pro Val 35 40
45 Ala Arg Ser Arg Phe Ser Thr Leu Met Thr Thr Ala His
Asn Thr Gly 50 55 60
Thr Pro Asp Leu Val Asp Phe Asn Trp Asp Asp Leu Gly Phe Gln Leu 65
70 75 80 Ile Pro Thr Asp
Phe Met Tyr Leu Met Ser Cys Ser Ser Asp Gly Val 85
90 95 Phe Met Asn Gly Lys Leu Val Pro Tyr
Gly Ser Ile Glu Leu Asn Pro 100 105
110 Ala Ala Ala Val Leu Asn Tyr Gly Gln Gly Leu Leu Glu Gly
Leu Arg 115 120 125
Ser His Arg Lys Glu Asp Gly Ser Ile Leu Leu Phe Arg Pro His Glu 130
135 140 Asn Ala Arg Arg Met
Glu Ile Gly Ala Asp Arg Leu Cys Met Pro Ala 145 150
155 160 Pro Ser Val Glu Gln Phe Leu Glu Ala Val
Lys Leu Thr Val Leu Ala 165 170
175 Asn Lys His Trp Val Pro Pro Phe Gly Lys Gly Ser Leu Tyr Ile
Arg 180 185 190 Pro
Gln Leu Ile Gly Ser Gly Ala Met Leu Gly Val Ala Pro Ala Pro 195
200 205 Gln Tyr Thr Phe Ile Val
Phe Val Cys Pro Val Gly His Tyr Phe Lys 210 215
220 Gly Gly Leu Ala Pro Ile Ser Leu Leu Thr Glu
Glu Glu Tyr His Arg 225 230 235
240 Ala Ala Pro Gly Gly Thr Gly Asp Ile Lys Thr Ile Gly Asn Tyr Ala
245 250 255 Ser Val
Val Ser Ala Gln Arg Arg Ser Lys Glu Lys Gly His Ser Asp 260
265 270 Val Leu Tyr Leu Asp Pro Leu
His Asn Lys Phe Val Glu Glu Val Ser 275 280
285 Ser Cys Asn Ile Phe Met Val Lys Asp Asn Ile Ile
Ser Thr Pro Leu 290 295 300
Leu Thr Gly Thr Ile Leu Pro Gly Ile Thr Arg Arg Ser Val Ile Glu 305
310 315 320 Ile Ser Gln
Asn Leu Gly Phe Gln Val Glu Glu Arg Leu Ile Thr Ile 325
330 335 Asp Glu Leu Leu Gly Ala Asp Glu
Val Phe Cys Thr Gly Thr Ala Val 340 345
350 Val Leu Ser Pro Val Gly Ser Ile Thr Tyr Arg Gly Arg
Arg Val Glu 355 360 365
Tyr Gly Lys Asn Gln Glu Ala Gly Val Val Ser Gln Gln Leu Tyr Ala 370
375 380 Ala Leu Thr Ala
Ile Gln Lys Gly Leu Thr Glu Asp Ser Met Gly Trp 385 390
395 400 Thr Leu Gln Leu Thr
405
SEQ ID NO: 211, length 1239, DNA, Zea mays
atggccgcgt tgacatctgc gaagggcgct ctccttccgt
cgtgggctcg cagcagcagc 60agcggccatg gcggcgactt gtggagggtc ctggggaagg
cgttggctac ggccggagga 120ggaggcggcg gcggatgctc ccttctgctc ccgcgccggt
ggcagtcgtc gctgccgcag 180ctggaccacg tcgccgacag gtccaacgag gagagcggcg
gcgagatcga ctgggacaac 240ctcggcttcg gcctcacccc gaccgactac atgtacgtca
cgcggtgctc gccggaggac 300cgcggcgact tcccccgcgg cgagctctgc cgctacggca
acatcgagct cagcccctcc 360tccggcgttc taaactacgc ccagggcctg ttcgagggaa
tgaaggcgta ccggcggccg 420gaccgggccg ggtacacgct gttccggccg gaggagaacg
cgcggcggat gcagcgcggc 480gccgagcgca tgtgcatgcc ggcgccgtcg gtggagcagt
tcgtccacgc cgtcaggcag 540acagtcctcg ccaacaggcg ctgggtgccg ccgcagggga
agggagccct gtacctccgg 600cctctgctcg tggggagcgg cccgatcctt gggctggctc
cggcccccga gtacaccttc 660ctcatctacg ccgcacccgt tgggaactac ttcaaggagg
gcctggcgcc catcaacctg 720gtggtgcatg acgagttcca ccgcgcgatg cccggcggca
ccggcggggt caagaccatc 780gccaactacg cgccggtgct gagggcgcag atggacgcca
agagcaaggg gttcacggac 840gtgctgtacc tggactcggt ccacaagcgg tacctggagg
aggtgtcgtc gtgcaacgtg 900ttcgtcgtca agggcggcgt ggtcgccacg ccggacaccc
ggggcaccat cctgccgggc 960atcacgcgca agagcgtcat cgagctcgcc agggaccgcg
gatacaaggt tgaggaacgc 1020ctggtttcca tcgacgatct ggtggccgca gacgaggtgt
tctgcaccgg gaccgcggtg 1080gtggttgctc ccgtgtcgac agtcacgtac cagggcgaga
ggtatgagtt cagaacgggg 1140ccggacacgg tgtcgcagga gctgtacacg acgctgacat
ccattcagat gggcatggcc 1200gccgaggaca gcaagggatg gacagtagca gtagagtag
1239
SEQ ID NO: 212, length 412, PRT, Zea mays
Met Ala Ala Leu Thr Ser
Ala Lys Gly Ala Leu Leu Pro Ser Trp Ala 1 5
10 15 Arg Ser Ser Ser Ser Gly His Gly Gly Asp Leu
Trp Arg Val Leu Gly 20 25
30 Lys Ala Leu Ala Thr Ala Gly Gly Gly Gly Gly Gly Gly Cys Ser
Leu 35 40 45 Leu
Leu Pro Arg Arg Trp Gln Ser Ser Leu Pro Gln Leu Asp His Val 50
55 60 Ala Asp Arg Ser Asn Glu
Glu Ser Gly Gly Glu Ile Asp Trp Asp Asn 65 70
75 80 Leu Gly Phe Gly Leu Thr Pro Thr Asp Tyr Met
Tyr Val Thr Arg Cys 85 90
95 Ser Pro Glu Asp Arg Gly Asp Phe Pro Arg Gly Glu Leu Cys Arg Tyr
100 105 110 Gly Asn
Ile Glu Leu Ser Pro Ser Ser Gly Val Leu Asn Tyr Ala Gln 115
120 125 Gly Leu Phe Glu Gly Met Lys
Ala Tyr Arg Arg Pro Asp Arg Ala Gly 130 135
140 Tyr Thr Leu Phe Arg Pro Glu Glu Asn Ala Arg Arg
Met Gln Arg Gly 145 150 155
160 Ala Glu Arg Met Cys Met Pro Ala Pro Ser Val Glu Gln Phe Val His
165 170 175 Ala Val Arg
Gln Thr Val Leu Ala Asn Arg Arg Trp Val Pro Pro Gln 180
185 190 Gly Lys Gly Ala Leu Tyr Leu Arg
Pro Leu Leu Val Gly Ser Gly Pro 195 200
205 Ile Leu Gly Leu Ala Pro Ala Pro Glu Tyr Thr Phe Leu
Ile Tyr Ala 210 215 220
Ala Pro Val Gly Asn Tyr Phe Lys Glu Gly Leu Ala Pro Ile Asn Leu 225
230 235 240 Val Val His Asp
Glu Phe His Arg Ala Met Pro Gly Gly Thr Gly Gly 245
250 255 Val Lys Thr Ile Ala Asn Tyr Ala Pro
Val Leu Arg Ala Gln Met Asp 260 265
270 Ala Lys Ser Lys Gly Phe Thr Asp Val Leu Tyr Leu Asp Ser
Val His 275 280 285
Lys Arg Tyr Leu Glu Glu Val Ser Ser Cys Asn Val Phe Val Val Lys 290
295 300 Gly Gly Val Val Ala
Thr Pro Asp Thr Arg Gly Thr Ile Leu Pro Gly 305 310
315 320 Ile Thr Arg Lys Ser Val Ile Glu Leu Ala
Arg Asp Arg Gly Tyr Lys 325 330
335 Val Glu Glu Arg Leu Val Ser Ile Asp Asp Leu Val Ala Ala Asp
Glu 340 345 350 Val
Phe Cys Thr Gly Thr Ala Val Val Val Ala Pro Val Ser Thr Val 355
360 365 Thr Tyr Gln Gly Glu Arg
Tyr Glu Phe Arg Thr Gly Pro Asp Thr Val 370 375
380 Ser Gln Glu Leu Tyr Thr Thr Leu Thr Ser Ile
Gln Met Gly Met Ala 385 390 395
400 Ala Glu Asp Ser Lys Gly Trp Thr Val Ala Val Glu
405 410
SEQ ID NO: 213 (length 18, PRT, Artificial sequence, motif 13)
Ala Asn Lys Arg Trp Val Pro Pro Xaa Gly Lys Gly Ser Leu Tyr Ile Arg Pro
SEQ ID NO: 214 (length 18, PRT, Artificial sequence, motif 14)
Arg Pro Xaa Glu Asn Ala Xaa Arg Met Xaa Xaa Gly Ala Xaa Arg Xaa Cys Met
SEQ ID NO: 215 (length 18, PRT, Artificial sequence, motif 15)
Leu Asn Tyr Gly Gln Gly Leu Phe Glu Gly Leu Lys Ala Tyr Arg Xaa Glu Asp
SEQ ID NO: 216 (length 12, PRT, Artificial sequence, signature sequence)
Ala Asn Xaa Xaa Trp Xaa Pro Pro Xaa Gly Lys Gly
SEQ ID NO: 217 (length 3803, DNA, Artificial sequence, expression cassette)
aatccgaaaa gtttctgcac
cgttttcacc ccctaactaa caatataggg aacgtgtgct 60aaatataaaa tgagacctta
tatatgtagc gctgataact agaactatgc aagaaaaact 120catccaccta ctttagtggc
aatcgggcta aataaaaaag agtcgctaca ctagtttcgt 180tttccttagt aattaagtgg
gaaaatgaaa tcattattgc ttagaatata cgttcacatc 240tctgtcatga agttaaatta
ttcgaggtag ccataattgt catcaaactc ttcttgaata 300aaaaaatctt tctagctgaa
ctcaatgggt aaagagagag atttttttta aaaaaataga 360atgaagatat tctgaacgta
ttggcaaaga tttaaacata taattatata attttatagt 420ttgtgcattc gtcatatcgc
acatcattaa ggacatgtct tactccatcc caatttttat 480ttagtaatta aagacaattg
acttattttt attatttatc ttttttcgat tagatgcaag 540gtacttacgc acacactttg
tgctcatgtg catgtgtgag tgcacctcct caatacacgt 600tcaactagca acacatctct
aatatcactc gcctatttaa tacatttagg tagcaatatc 660tgaattcaag cactccacca
tcaccagacc acttttaata atatctaaaa tacaaaaaat 720aattttacag aatagcatga
aaagtatgaa acgaactatt taggtttttc acatacaaaa 780aaaaaaagaa ttttgctcgt
gcgcgagcgc caatctccca tattgggcac acaggcaaca 840acagagtggc tgcccacaga
acaacccaca aaaaacgatg atctaacgga ggacagcaag 900tccgcaacaa ccttttaaca
gcaggctttg cggccaggag agaggaggag aggcaaagaa 960aaccaagcat cctccttctc
ccatctataa attcctcccc ccttttcccc tctctatata 1020ggaggcatcc aagccaagaa
gagggagagc accaaggaca cgcgactagc agaagccgag 1080cgaccgcctt ctcgatccat
atcttccggt cgagttcttg gtcgatctct tccctcctcc 1140acctcctcct cacagggtat
gtgcctccct tcggttgttc ttggatttat tgttctaggt 1200tgtgtagtac gggcgttgat
gttaggaaag gggatctgta tctgtgatga ttcctgttct 1260tggatttggg atagaggggt
tcttgatgtt gcatgttatc ggttcggttt gattagtagt 1320atggttttca atcgtctgga
gagctctatg gaaatgaaat ggtttaggga tcggaatctt 1380gcgattttgt gagtaccttt
tgtttgaggt aaaatcagag caccggtgat tttgcttggt 1440gtaataaagt acggttgttt
ggtcctcgat tctggtagtg atgcttctcg atttgacgaa 1500gctatccttt gtttattccc
tattgaacaa aaataatcca actttgaaga cggtcccgtt 1560gatgagattg aatgattgat
tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga 1620tacagtagtc cccatcacga
aattcatgga aacagttata atcctcagga acaggggatt 1680ccctgttctt ccgatttgct
ttagtcccag aatttttttt cccaaatatc ttaaaaagtc 1740actttctggt tcagttcaat
gaattgattg ctacaaataa tgcttttata gcgttatcct 1800agctgtagtt cagttaatag
gtaatacccc tatagtttag tcaggagaag aacttatccg 1860atttctgatc tccattttta
attatatgaa atgaactgta gcataagcag tattcatttg 1920gattattttt tttattagct
ctcacccctt cattattctg agctgaaagt ctggcatgaa 1980ctgtcctcaa ttttgttttc
aaattcacat cgattatcta tgcattatcc tcttgtatct 2040acctgtagaa gtttcttttt
ggttattcct tgactgcttg attacagaaa gaaatttatg 2100aagctgtaat cgggatagtt
atactgcttg ttcttatgat tcatttcctt tgtgcagttc 2160ttggtgtagc ttgccacttt
caccagcaaa gttcatttaa atcaactagg gatatcacaa 2220gtttgtacaa aaaagcaggc
ttaaacaatg gagagaagcg ccgtctttgg tggtctgcaa 2280ccaaattacc ttctttaccc
ctcacccaac tcttcatccc ttcctttctc agaccaccgc 2340gctagacttc caaatttctc
tcctcctccc tctctgtctc tcaagataca taagcaggtt 2400tcttcttgtt ttaaagctgt
gtctcctttt aagcgtggag ctgcgttttc tgatacacac 2460agtgacacat ttgaattagc
tgacatagac tgggatgacc ttggatttgc atacgttccc 2520actgattata tgtattcaat
gaaatgcact aaaggtggaa acttttccaa aggtgaatta 2580cagagatatg gaaacattga
actgaaccct tctgctggcg tcttaaatta tggccaggga 2640ttgtttgaag gtctgaaagc
ctacaggaaa gaagatggta accttcttct atttcgtcct 2700gaggaaaatg ctatgcggat
gataatgggt gcagagagga tgtgcatgcc atcaccgaca 2760attgatcagt ttgtggatgc
agtaaaagca actgttttag caaacaaacg ttgggttcct 2820cctccaggta aaggttcctt
atatatcaga ccattgctaa tggggagtgg agctgttctt 2880ggtcttgcac ctgctcctga
gtataccttt ctcatttatg tttcaccggt ggggaactat 2940tttaaggaag gtgtggcacc
aattcattta attgtggagc atgaacttca tcgagcaact 3000cctggtggca ctggaggtgt
gaagactata gggaattatg ctgcggttct caaggcacaa 3060tctgctgcaa aagccagagg
tttttctgac gttttatatc ttgattgtgt acataaaaag 3120tatctagaag aggtttcctc
ttgcaacatt tttgttgtga agggtaacag catctccact 3180cctgcaataa aagggacaat
cctaccagga attacaagga agagcataat tgatgttgct 3240cgaagccaag gatttcaggt
tgaggaacgg cttgtgacag tagatgaatt gcttgatgct 3300gatgaggttt tttgtaccgg
aacagctgtt gttgtgtcac ctgtgggaag catcacctac 3360aagggtaaaa gggtgtctta
tggcgtagaa ggttttggtg ctgtctcgca acaactctat 3420agtgtgctaa ccaagctaca
gatgggcctt atagaggaca agatgaattg gactgtggag 3480ctgagttagg cgtactgcag
tgaacccagc tttcttgtac aaagtggtga tatcacaagc 3540ccgggcggtc ttctagggat
aacagggtaa ttatatccct ctagatcaca agcccgggcg 3600gtcttctacg atgattgagt
aataatgtgt cacgcatcac catgggtggc agtgtcagtg 3660tgagcaatga cctgaatgaa
caattgaaat gaaaagaaaa aaagtactcc atctgttcca 3720aattaaaatt ggttttaacc
ttttaatagg tttatacaat aattgatata tgttttctgt 3780atatgtctaa tttgttatca
tcc 3803
SEQ ID NO: 218 (length 2194, DNA, Oryza sativa)
aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct
60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact
120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt
180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc
240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata
300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga
360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt
420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat
480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag
540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt
600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc
660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat
720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa
780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca
840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag
900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa
960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata
1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag
1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc
1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt
1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct
1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt
1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt
1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt
1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa
1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt
1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga
1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga acaggggatt
1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc
1740actttctggt tcagttcaat gaattgattg ctacaaataa tgcttttata gcgttatcct
1800agctgtagtt cagttaatag gtaatacccc tatagtttag tcaggagaag aacttatccg
1860atttctgatc tccattttta attatatgaa atgaactgta gcataagcag tattcatttg
1920gattattttt tttattagct ctcacccctt cattattctg agctgaaagt ctggcatgaa
1980ctgtcctcaa ttttgttttc aaattcacat cgattatcta tgcattatcc tcttgtatct
2040acctgtagaa gtttcttttt ggttattcct tgactgcttg attacagaaa gaaatttatg
2100aagctgtaat cgggatagtt atactgcttg ttcttatgat tcatttcctt tgtgcagttc
2160ttggtgtagc ttgccacttt caccagcaaa gttc
2194
SEQ ID NO: 219 (length 52, DNA, Artificial sequence, primer prm15099)
ggggacaagt ttgtacaaaa aagcaggctt aaacaatgga gagaagcgcc gt 52
SEQ ID NO: 220 (length 50, DNA, Artificial sequence, primer prm15100)
ggggaccact ttgtacaaga aagctgggtt cactgcagta cgcctaactc 50
SEQ ID NO: 221 (length 380, PRT, Artificial sequence, Consensus)
Met
Gly Xaa Xaa Glu Glu Xaa Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1
5 10 15 Xaa Ser Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Tyr Xaa 20
25 30 Asp Trp Ser Xaa Ser Met Gln Ala Tyr Tyr
Xaa Xaa Gly Ala Xaa Pro 35 40
45 Xaa Xaa Phe Phe Xaa Ser Xaa Val Ala Ser Pro Thr Pro His
Pro Tyr 50 55 60
Met Trp Gly Gly Gln His Xaa Met Met Pro Pro Xaa Tyr Gly Thr Pro 65
70 75 80 Val Pro Tyr Pro Ala
Leu Tyr Pro Pro Gly Gly Val Tyr Ala His Pro 85
90 95 Xaa Met Xaa Met Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa 100 105
110 Xaa Xaa Lys Xaa Xaa Asp Gly Lys Asp Arg Xaa Ser Xaa Lys Lys
Xaa 115 120 125 Lys
Gly Xaa Ser Gly Xaa Xaa Gly Xaa Xaa Xaa Xaa Xaa Xaa Gly Glu 130
135 140 Xaa Gly Lys Ala Xaa Ser
Gly Ser Gly Asn Asp Gly Xaa Ser Xaa Xaa 145 150
155 160 Ser Glu Xaa Ser Gly Ser Glu Gly Ser Ser Asp
Ala Ser Asp Glu Asn 165 170
175 Xaa Asn Xaa Gln Glu Xaa Ala Ala Xaa Lys Lys Gly Ser Phe Xaa Gln
180 185 190 Met Leu
Ala Asp Ala Ala Xaa Xaa Gln Asn Xaa Xaa Xaa Xaa Xaa Xaa 195
200 205 Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 210 215
220 Xaa Xaa Xaa Xaa Xaa Val Pro Gly Xaa Xaa Val Val
Ser Met Pro Ala 225 230 235
240 Thr Asn Xaa Leu Asn Ile Gly Met Asp Leu Trp Asn Ala Ser Xaa Ala
245 250 255 Gly Xaa Xaa
Xaa Xaa Lys Xaa Xaa Xaa Met Xaa Xaa Asn Xaa Xaa Xaa 260
265 270 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 275 280
285 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asp Xaa Xaa Glu Leu
Lys Arg Gln 290 295 300
Lys Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg 305
310 315 320 Lys Gln Ala Glu
Cys Glu Glu Leu Gln Xaa Arg Val Glu Xaa Leu Ser 325
330 335 Xaa Glu Asn Xaa Ser Leu Arg Asp Glu
Leu Gln Arg Leu Ser Glu Glu 340 345
350 Cys Glu Lys Leu Thr Ser Glu Asn Xaa Ser Ile Lys Glu Glu
Leu Xaa 355 360 365
Arg Leu Xaa Gly Pro Glu Ala Val Ala Xaa Leu Glu 370
375 380
SEQ ID NO: 222 (length 463, PRT, Artificial sequence, Consensus)
Met Gly
Asn Xaa Glu Glu Xaa Lys Xaa Xaa Xaa Xaa Xaa Xaa Lys Xaa 1 5
10 15 Xaa Xaa Xaa Ser Ser Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 20 25
30 Xaa Xaa Xaa Xaa Xaa His Val Tyr Xaa Asp Trp Ala
Ala Met Gln Ala 35 40 45
Tyr Tyr Gly Pro Arg Val Ala Ile Pro Pro Tyr Tyr Asn Ser Ala Val
50 55 60 Ala Ser Gly
Xaa Xaa His Ala Pro Xaa Pro Tyr Met Trp Gly Pro Pro 65
70 75 80 Gln Pro Met Met Pro Pro Tyr
Gly Xaa Pro Tyr Ala Ala Xaa Xaa Xaa 85
90 95 Xaa Xaa Gly Xaa Val Tyr Xaa His Pro Ala Val
Xaa Ile Gly Xaa Xaa 100 105
110 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Ser Xaa Pro Xaa Xaa Xaa Xaa Xaa
Xaa 115 120 125 Xaa
Gly Thr Xaa Leu Ser Ile Asp Thr Pro Xaa Lys Ser Ser Gly Asn 130
135 140 Thr Asp Gln Gly Leu Met
Lys Lys Leu Lys Gly Phe Asp Gly Leu Ala 145 150
155 160 Met Ser Ile Gly Asn Xaa Xaa Xaa Glu Ser Ala
Glu Xaa Xaa Ala Xaa 165 170
175 Xaa Xaa Arg Xaa Ser Gln Ser Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
180 185 190 Xaa Xaa
Xaa Xaa Xaa Xaa Asp Thr Glu Gly Ser Ser Asp Gly Ser Asp 195
200 205 Gly Asn Thr Thr Gly Ala Xaa
Gln Xaa Arg Xaa Lys Arg Ser Arg Glu 210 215
220 Gly Thr Pro Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa 225 230 235
240 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gly Xaa Ser Xaa Lys Xaa Xaa Xaa
245 250 255 Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Gly Xaa Val 260
265 270 Val Ser Xaa Xaa Met Xaa Thr Xaa
Xaa Xaa Leu Glu Leu Arg Asn Xaa 275 280
285 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa 290 295 300
Xaa Ala Val Val Pro Xaa Glu Xaa Trp Leu Gln Asn Glu Arg Glu Leu 305
310 315 320 Lys Arg Glu Arg
Arg Lys Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser 325
330 335 Arg Leu Arg Lys Gln Ala Glu Thr Glu
Glu Leu Ala Arg Lys Val Glu 340 345
350 Ser Leu Thr Ala Glu Asn Leu Thr Leu Lys Ser Glu Ile Asn
Gln Leu 355 360 365
Thr Glu Xaa Ser Glu Lys Leu Arg Leu Glu Asn Ala Ala Leu Leu Glu 370
375 380 Lys Leu Lys Asn Ala
Gln Leu Gly Xaa Xaa Xaa Glu Ile Xaa Leu Xaa 385 390
395 400 Xaa Xaa Asp Xaa Xaa Arg Xaa Xaa Pro Val
Ser Thr Glu Asn Leu Leu 405 410
415 Ser Arg Val Asn Asn Xaa Xaa Gly Ser Xaa Asp Arg Xaa Xaa Glu
Xaa 420 425 430 Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Asn Ser Gly Ala Lys Leu His 435
440 445 Gln Leu Leu Asp Ala Ser
Pro Arg Ala Asp Ala Val Ala Ala Gly 450 455
460