Patent application title: Plants Having Enhanced Yield-Related Traits and a Method for Making the Same
Inventors:
Yves Hatzfeld (Lille, FR)
Assignees:
BASF Plant Science GmbH
IPC8 Class: AC12N1582FI
USPC Class:
800290
Class name: Multicellular living organisms and unmodified parts thereof and related processes method of introducing a polynucleotide molecule into or rearrangement of genetic material within a plant or plant part the polynucleotide alters plant part growth (e.g., stem or tuber length, etc.)
Publication date: 2013-11-14
Patent application number: 20130305414
Abstract:
The present invention relates generally to the field of molecular biology
and concerns a method for enhancing various economically important
yield-related traits in plants. More specifically, the present invention
concerns a method for enhancing yield-related traits in plants by
modulating expression in a plant of a nucleic acid encoding a NITR
(Nitrite Reductase) polypeptide or an ASNS (Asparagine Synthase)
polypeptide. The present invention also concerns plants having modulated
expression of a nucleic acid encoding a NITR polypeptide or an ASNS
polypeptide, which plants have enhanced yield-related traits relative to
control plants. The invention also provides constructs comprising
NITR-encoding nucleic acids or ASNS-encoding nucleic acids, useful in
performing the methods of the invention.
Claims:
1. A method for enhancing yield-related traits in plants relative to
control plants, comprising modulating expression in a plant of a nucleic
acid encoding an ASNS, wherein said ASNS is represented by SEQ ID NO: 63
or an orthologue or paralogue thereof.
2. The method according to claim 1, wherein said modulated expression is effected by introducing and expressing in a plant a nucleic acid encoding said ASNS polypeptide.
3. The method according to claim 1, wherein said nucleic acid encoding said ASNS polypeptide is a portion of SEQ ID NO: 62, or a nucleic acid capable of hybridizing with such a nucleic acid.
4. The method according to claim 1, wherein said nucleic acid sequence encodes SEQ ID NO: 63.
5. The method according to claim 1, wherein said enhanced yield-related traits comprise increased early vigour, increased yield, increased root thickness, and/or increased seed yield, relative to control plants.
6. The method according to claim 1, wherein said enhanced yield-related traits are obtained under non-stress conditions or under conditions of reduced nutrient availability.
7. The method according to claim 2, wherein said nucleic acid is operably linked to a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice.
8. The method according to claim 1, wherein said nucleic acid encoding an ASNS polypeptide is of plant origin, from a monocotyledonous plant, from the family Poaceae, from the genus Oryza, or from Oryza sativa.
9. A plant or part thereof, including seeds, obtained by the method according to claim 1, wherein said plant or part thereof comprises a recombinant nucleic acid encoding an ASNS polypeptide.
10. A construct comprising: (i) the nucleic acid encoding an ASNS polypeptide as defined in claim 1; (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally (iii) a transcription termination sequence.
11. The construct according to claim 10, wherein one of said control sequences is a constitutive promoter, a GOS2 promoter, or a GOS2 promoter from rice.
12. A method for making plants having increased yield-related traits comprising transforming a plant with the construct according to claim 10.
13. A plant, plant part or plant cell transformed with the construct according to claim 10.
14. A method for the production of a transgenic plant having increased yield, increased biomass, and/or increased seed yield relative to control plants, comprising: (i) introducing and expressing in a plant the nucleic acid encoding an ASNS polypeptide as defined in claim 1; and (ii) cultivating the plant under conditions promoting plant growth and development.
15. A transgenic plant having increased yield-related traits resulting from increased expression of the nucleic acid encoding an ASNS polypeptide as defined in claim 1, or a transgenic plant cell derived from said transgenic plant.
16. The transgenic plant of claim 15, wherein the increased yield-related traits are selected from the group consisting of increased early vigour, increased root thickness, and increased seed yield relative to control plants.
17. The plant according to claim 9, or a transgenic plant cell derived thereof, wherein said plant is a crop plant, a monocot, a cereal, rice, maize, wheat, barley, millet, rye, triticale, sorghum, or oats.
18. Harvestable parts of the plant according to claim 17.
19. The harvestable parts of claim 18, wherein said harvestable parts are root biomass and/or seeds.
20. Products derived from the plant according to claim 17 and/or from harvestable parts of said plant.
Description:
RELATED APPLICATIONS
[0001] This application is a divisional of U.S. application Ser. No. 12/669,596, filed Jan. 19, 2010, which is a national stage application (under 35 U.S.C. §371) of PCT/EP2008/060030, filed Jul. 31, 2008, which claims benefit of European application 07113568.5, filed Jul. 31, 2007 and European application 07113569.3, filed Jul. 31, 2007, the entire contents of each of which are hereby incorporated by reference in this application.
SUBMISSION OF SEQUENCE LISTING
[0002] The Sequence Listing associated with this application is filed in electronic format via EFS-Web and hereby incorporated by reference into the specification in its entirety. The name of the text file containing the Sequence Listing is Sequence_Listing--32279--00061_US. The size of the text file is 455 KB, and the text file was created on Jul. 10, 2013.
[0003] The present invention relates generally to the field of molecular biology and concerns a method for improving various plant growth characteristics by modulating expression in a plant of a nucleic acid encoding a NITR (Nitrite Reductase). The present invention also concerns plants having modulated expression of a nucleic acid encoding a NITR, which plants have improved growth characteristics relative to corresponding wild type plants or other control plants. The present invention furthermore concerns a method for improving various plant growth characteristics by modulating expression in a plant of a nucleic acid encoding an ASNS (Asparagine Synthase). The present invention also concerns plants having modulated expression of a nucleic acid encoding an ASNS, which plants have improved growth characteristics relative to corresponding wild type plants or other control plants. The invention also provides constructs useful in the methods of the invention.
[0004] The ever-increasing world population and the dwindling supply of arable land available for agriculture fuel research towards increasing the efficiency of agriculture. Conventional means for crop and horticultural improvements utilise selective breeding techniques to identify plants having desirable characteristics. However, such selective breeding techniques have several drawbacks, namely that these techniques are typically labour intensive and result in plants that often contain heterogeneous genetic components that may not always result in the desirable trait being passed on from parent plants. Advances in molecular biology have allowed mankind to modify the germplasm of animals and plants. Genetic engineering of plants entails the isolation and manipulation of genetic material (typically in the form of DNA or RNA) and the subsequent introduction of that genetic material into a plant. Such technology has the capacity to deliver crops or plants having various improved economic, agronomic or horticultural traits.
[0005] A trait of particular economic interest is increased yield. Yield is normally defined as the measurable produce of economic value from a crop. This may be defined in terms of quantity and/or quality. Yield is directly dependent on several factors, for example, the number and size of the organs, plant architecture (for example, the number of branches), seed production, leaf senescence and more. Root development, nutrient uptake, stress tolerance and early vigour may also be important factors in determining yield. Optimizing the abovementioned factors may therefore contribute to increasing crop yield.
[0006] Seed yield is a particularly important trait, since the seeds of many plants are important for human and animal nutrition. Crops such as corn, rice, wheat, canola and soybean account for over half the total human caloric intake, whether through direct consumption of the seeds themselves or through consumption of meat products raised on processed seeds. They are also a source of sugars, oils and many kinds of metabolites used in industrial processes. Seeds contain an embryo (the source of new shoots and roots) and an endosperm (the source of nutrients for embryo growth during germination and during early growth of seedlings). The development of a seed involves many genes, and requires the transfer of metabolites from the roots, leaves and stems into the growing seed. The endosperm, in particular, assimilates the metabolic precursors of carbohydrates, oils and proteins and synthesizes them into storage macromolecules to fill out the grain.
[0007] Another important trait for many crops is early vigour. Improving early vigour is an important objective of modern rice breeding programs in both temperate and tropical rice cultivars. Long roots are important for proper soil anchorage in water-seeded rice. Where rice is sown directly into flooded fields, and where plants must emerge rapidly through water, longer shoots are associated with vigour. Where drill-seeding is practiced, longer mesocotyls and coleoptiles are important for good seedling emergence. The ability to engineer early vigour into plants would be of great importance in agriculture. For example, poor early vigor has been a limitation to the introduction of maize (Zea mays L.) hybrids based on Corn Belt germplasm in the European Atlantic.
[0008] A further important trait is that of improved abiotic stress tolerance. Abiotic stress is a primary cause of crop loss worldwide, reducing average yields for most major crop plants by more than 50% (Wang et al., Planta (2003) 218: 1-14). Abiotic stresses may be caused by drought, salinity, extremes of temperature, chemical toxicity and oxidative stress. The ability to improve plant tolerance to abiotic stress would be of great economic advantage to farmers worldwide and would allow for the cultivation of crops during adverse conditions and in territories where cultivation of crops may not otherwise be possible.
[0009] Crop yield may therefore be increased by optimising one of the above-mentioned factors.
[0010] Depending on the end use, the modification of certain yield traits may be favoured over others. For example, for applications such as forage or wood production, or bio-fuel resource, an increase in the vegetative parts of a plant may be desirable, and for applications such as flour, starch or oil production, an increase in seed parameters may be particularly desirable. Even amongst the seed parameters, some may be favoured over others, depending on the application. Various mechanisms may contribute to increasing seed yield, whether that is in the form of increased seed size or increased seed number.
[0011] One approach to increasing yield (seed yield and/or biomass) in plants may be through modification of the inherent growth mechanisms of a plant, such as the cell cycle or various signalling pathways involved in plant growth or in defense mechanisms.
[0012] Surprisingly, it has now been found that modulating expression of a nucleic acid encoding a NITR polypeptide or an ASNS polypeptide gives plants having enhanced yield-related traits relative to control plants.
[0013] According to one embodiment, there is provided a method for improving yield-related traits of a plant relative to control plants, comprising modulating expression of a nucleic acid encoding a NITR polypeptide in a plant. The improved yield-related traits comprise one or more of increased biomass, increased early vigour, and increased seed yield.
[0014] According to another embodiment, there is provided a method for improving yield-related traits of a plant relative to control plants, comprising modulating expression of a nucleic acid encoding an ASNS polypeptide in a plant. The improved yield-related traits comprise one or more of increased biomass, increased early vigour, and increased seed yield.
DEFINITIONS
Polypeptide(s)/Protein(s)
[0015] The terms "polypeptide" and "protein" are used interchangeably herein and refer to amino acids in a polymeric form of any length, linked together by peptide bonds.
Polynucleotide(s)/Nucleic Acid(s)/Nucleic Acid Sequence(s)/Nucleotide Sequence(s)
[0016] The terms "polynucleotide(s)", "nucleic acid sequence(s)", "nucleotide sequence(s)", "nucleic acid(s)", "nucleic acid molecule" are used interchangeably herein and refer to nucleotides, either ribonucleotides or deoxyribonucleotides or a combination of both, in a polymeric unbranched form of any length.
Control Plant(s)
[0017] The choice of suitable control plants is a routine part of an experimental setup and may include corresponding wild type plants or corresponding plants without the gene of interest. The control plant is typically of the same plant species or even of the same variety as the plant to be assessed. The control plant may also be a nullizygote of the plant to be assessed. Nullizygotes are individuals missing the transgene by segregation. A "control plant" as used herein refers not only to whole plants, but also to plant parts, including seeds and seed parts.
Homologue(s)
[0018] "Homologues" of a protein encompass peptides, oligopeptides, polypeptides, proteins and enzymes having amino acid substitutions, deletions and/or insertions relative to the unmodified protein in question and having similar biological and functional activity as the unmodified protein from which they are derived.
[0019] A deletion refers to removal of one or more amino acids from a protein.
[0020] An insertion refers to one or more amino acid residues being introduced into a predetermined site in a protein. Insertions may comprise N-terminal and/or C-terminal fusions as well as intra-sequence insertions of single or multiple amino acids. Generally, insertions within the amino acid sequence will be smaller than N- or C-terminal fusions, of the order of about 1 to 10 residues. Examples of N- or C-terminal fusion proteins or peptides include the binding domain or activation domain of a transcriptional activator as used in the yeast two-hybrid system, phage coat proteins, (histidine)-6-tag, glutathione S-transferase-tag, protein A, maltose-binding protein, dihydrofolate reductase, Tag•100 epitope, c-myc epitope, FLAG®-epitope, lacZ, CMP (calmodulin-binding peptide), HA epitope, protein C epitope and VSV epitope.
[0021] A substitution refers to replacement of amino acids of the protein with other amino acids having similar properties (such as similar hydrophobicity, hydrophilicity, antigenicity, propensity to form or break α-helical structures or β-sheet structures). Amino acid substitutions are typically of single residues, but may be clustered depending upon functional constraints placed upon the polypeptide; insertions will usually be of the order of about 1 to 10 amino acid residues. The amino acid substitutions are preferably conservative amino acid substitutions. Conservative substitution tables are well known in the art (see for example Creighton (1984) Proteins. W.H. Freeman and Company (Eds) and Table 1 below).
TABLE 1 Examples of conserved amino acid substitutions
Residue   Conservative Substitutions   Residue   Conservative Substitutions
Ala       Ser                          Leu       Ile; Val
Arg       Lys                          Lys       Arg; Gln
Asn       Gln; His                     Met       Leu; Ile
Asp       Glu                          Phe       Met; Leu; Tyr
Gln       Asn                          Ser       Thr; Gly
Cys       Ser                          Thr       Ser; Val
Glu       Asp                          Trp       Tyr
Gly       Pro                          Tyr       Trp; Phe
His       Asn; Gln                     Val       Ile; Leu
Ile       Leu; Val
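As an illustration only, Table 1 can be transcribed into a lookup and queried programmatically. The following Python sketch is not part of the described methods; the names CONSERVATIVE_SUBSTITUTIONS and is_conservative are hypothetical, and the lookup is deliberately directional, returning a match only where Table 1 lists the replacement for the given residue.

```python
# Table 1 transcribed as a lookup: residue -> substitutions listed as conservative.
# The names used here are illustrative and do not appear in the specification.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if Table 1 lists `replacement` as a substitution for `original`.

    Residues use three-letter codes. Residues absent from Table 1 (for
    example Pro as the original residue) simply return False.
    """
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())
```

For example, is_conservative("Ala", "Ser") is true, whereas is_conservative("Ser", "Ala") is false because Table 1 does not list Ala among the substitutions for Ser.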
[0022] Amino acid substitutions, deletions and/or insertions may readily be made using peptide synthetic techniques well known in the art, such as solid phase peptide synthesis and the like, or by recombinant DNA manipulation. Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. For example, techniques for making substitution mutations at predetermined sites in DNA are well known to those skilled in the art and include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB, Cleveland, Ohio), QuikChange Site Directed mutagenesis (Stratagene, San Diego, Calif.), PCR-mediated site-directed mutagenesis or other site-directed mutagenesis protocols.
Derivatives
[0023] "Derivatives" include peptides, oligopeptides, polypeptides which may, compared to the amino acid sequence of the naturally-occurring form of the protein, such as the protein of interest, comprise substitutions of amino acids with non-naturally occurring amino acid residues, or additions of non-naturally occurring amino acid residues. "Derivatives" of a protein also encompass peptides, oligopeptides, polypeptides which comprise naturally occurring altered (glycosylated, acylated, prenylated, phosphorylated, myristoylated, sulphated etc.) or non-naturally altered amino acid residues compared to the amino acid sequence of a naturally-occurring form of the polypeptide. A derivative may also comprise one or more non-amino acid substituents or additions compared to the amino acid sequence from which it is derived, for example a reporter molecule or other ligand, covalently or non-covalently bound to the amino acid sequence, such as a reporter molecule which is bound to facilitate its detection, and non-naturally occurring amino acid residues relative to the amino acid sequence of a naturally-occurring protein. Furthermore, "derivatives" also include fusions of the naturally-occurring form of the protein with tagging peptides such as FLAG, HIS6 or thioredoxin (for a review of tagging peptides, see Terpe, Appl. Microbiol. Biotechnol. 60, 523-533, 2003).
Orthologue(s)/Paralogue(s)
[0024] Orthologues and paralogues encompass evolutionary concepts used to describe the ancestral relationships of genes. Paralogues are genes within the same species that have originated through duplication of an ancestral gene; orthologues are genes from different organisms that have originated through speciation, and are also derived from a common ancestral gene.
Domain
[0025] The term "domain" refers to a set of amino acids conserved at specific positions along an alignment of sequences of evolutionarily related proteins. While amino acids at other positions can vary between homologues, amino acids that are highly conserved at specific positions indicate amino acids that are likely essential in the structure, stability or function of a protein. Identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers to determine if any polypeptide in question belongs to a previously identified polypeptide family.
Motif/Consensus Sequence/Signature
[0026] The term "motif" or "consensus sequence" or "signature" refers to a short conserved region in the sequence of evolutionarily related proteins. Motifs are frequently highly conserved parts of domains, but may also include only part of the domain, or be located outside of a conserved domain (if all of the amino acids of the motif fall outside of a defined domain).
Hybridisation
[0027] The term "hybridisation" as defined herein is a process wherein substantially homologous complementary nucleotide sequences anneal to each other. The hybridisation process can occur entirely in solution, i.e. both complementary nucleic acids are in solution. The hybridisation process can also occur with one of the complementary nucleic acids immobilised to a matrix such as magnetic beads, Sepharose beads or any other resin. The hybridisation process can furthermore occur with one of the complementary nucleic acids immobilised to a solid support such as a nitro-cellulose or nylon membrane or immobilised by e.g. photolithography to, for example, a siliceous glass support (the latter known as nucleic acid arrays or microarrays or as nucleic acid chips). In order to allow hybridisation to occur, the nucleic acid molecules are generally thermally or chemically denatured to melt a double strand into two single strands and/or to remove hairpins or other secondary structures from single stranded nucleic acids.
[0028] The term "stringency" refers to the conditions under which a hybridisation takes place. The stringency of hybridisation is influenced by conditions such as temperature, salt concentration, ionic strength and hybridisation buffer composition. Generally, low stringency conditions are selected to be about 30° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. Medium stringency conditions are when the temperature is 20° C. below Tm, and high stringency conditions are when the temperature is 10° C. below Tm. High stringency hybridisation conditions are typically used for isolating hybridising sequences that have high sequence similarity to the target nucleic acid sequence. However, nucleic acids may deviate in sequence and still encode a substantially identical polypeptide, due to the degeneracy of the genetic code. Therefore medium stringency hybridisation conditions may sometimes be needed to identify such nucleic acid molecules.
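The relationship between Tm and the stringency levels described above can be sketched as simple arithmetic. The following Python fragment is a minimal illustration, assuming the approximate offsets stated above (about 30° C., 20° C. and 10° C. below Tm); the function and constant names are hypothetical.

```python
# Approximate offsets below Tm (in degrees C) for each stringency level,
# taken from the paragraph above. Names are illustrative only.
STRINGENCY_OFFSETS_C = {"low": 30.0, "medium": 20.0, "high": 10.0}


def hybridisation_temperature(tm_celsius: float, stringency: str) -> float:
    """Return the approximate hybridisation temperature for a stringency level."""
    if stringency not in STRINGENCY_OFFSETS_C:
        raise ValueError(f"unknown stringency level: {stringency!r}")
    return tm_celsius - STRINGENCY_OFFSETS_C[stringency]
```

Under these assumptions, a sequence with a Tm of 75° C. would be hybridised at about 65° C. for high stringency and at about 45° C. for low stringency.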
[0029] The Tm is the temperature under defined ionic strength and pH, at which 50% of the target sequence hybridises to a perfectly matched probe. The Tm is dependent upon the solution conditions and the base composition and length of the probe. For example, longer sequences hybridise specifically at higher temperatures. The maximum rate of hybridisation is obtained from about 16° C. up to 32° C. below Tm. The presence of monovalent cations in the hybridisation solution reduces the electrostatic repulsion between the two nucleic acid strands thereby promoting hybrid formation; this effect is visible for sodium concentrations of up to 0.4M (for higher concentrations, this effect may be ignored). Formamide reduces the melting temperature of DNA-DNA and DNA-RNA duplexes by 0.6 to 0.7° C. for each percent formamide, and addition of 50% formamide allows hybridisation to be performed at 30 to 45° C., though the rate of hybridisation will be lowered. Base pair mismatches reduce the hybridisation rate and the thermal stability of the duplexes. On average and for large probes, the Tm decreases about 1° C. per % base mismatch. The Tm may be calculated using the following equations, depending on the types of hybrids:
[0030] 1) DNA-DNA hybrids (Meinkoth and Wahl, Anal. Biochem., 138: 267-284, 1984):
[0030] $T_m = 81.5\,^{\circ}\mathrm{C} + 16.6 \times \log_{10}[\mathrm{Na}^{+}]^{a} + 0.41 \times \%[\mathrm{G/C}^{b}] - 500 \times [L^{c}]^{-1} - 0.61 \times \%\,\mathrm{formamide}$
[0031] 2) DNA-RNA or RNA-RNA hybrids:
[0031] $T_m = 79.8 + 18.5\,(\log_{10}[\mathrm{Na}^{+}]^{a}) + 0.58\,(\%\,\mathrm{G/C}^{b}) + 11.8\,(\%\,\mathrm{G/C}^{b})^{2} - 820/L^{c}$
[0032] 3) oligo-DNA or oligo-RNA$^{d}$ hybrids:
[0033] For <20 nucleotides: $T_m = 2\,(l_n)$
[0034] For 20-35 nucleotides: $T_m = 22 + 1.46\,(l_n)$
[0035] $^{a}$ or for other monovalent cation, but only accurate in the 0.01-0.4 M range.
[0036] $^{b}$ only accurate for % GC in the 30% to 75% range.
[0037] $^{c}$ $L$ = length of duplex in base pairs.
[0038] $^{d}$ oligo, oligonucleotide; $l_n$ = effective length of primer = 2×(no. of G/C) + (no. of A/T).
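By way of a worked example, the DNA-DNA and oligonucleotide formulas above can be evaluated as in the following Python sketch. It is an illustration under stated assumptions rather than a validated tool: the function names are hypothetical, the accuracy ranges of footnotes a and b are noted in comments but not enforced, and U is counted together with A/T for oligo-RNA, which is an assumption not stated in the text.

```python
import math


def tm_dna_dna(na_molar: float, gc_percent: float, length_bp: int,
               formamide_percent: float = 0.0) -> float:
    """Tm (degrees C) for DNA-DNA hybrids per formula 1) above.

    Footnotes a and b: only accurate for 0.01-0.4 M monovalent cation
    and 30-75 % GC; these ranges are not checked here.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 500.0 / length_bp
            - 0.61 * formamide_percent)


def effective_length(seq: str) -> int:
    """Effective primer length: 2 x (no. of G/C) + (no. of A/T)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    # Assumption: U in oligo-RNA is counted like A/T.
    at = sum(seq.count(base) for base in "ATU")
    if gc + at != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T, U")
    return 2 * gc + at


def tm_oligo(seq: str) -> float:
    """Tm (degrees C) for oligonucleotide hybrids per formula 3) above."""
    n = len(seq)
    l_n = effective_length(seq)
    if n < 20:
        return 2.0 * l_n
    if n <= 35:
        return 22.0 + 1.46 * l_n
    raise ValueError("oligonucleotide formulas cover at most 35 nucleotides")
```

For instance, a 500 bp DNA-DNA duplex with 50% GC in 0.1 M sodium evaluates to about 84.4° C. without formamide and to about 53.9° C. with 50% formamide.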
[0039] Non-specific binding may be controlled using any one of a number of known techniques such as, for example, blocking the membrane with protein containing solutions, additions of heterologous RNA, DNA, and SDS to the hybridisation buffer, and treatment with RNase. For non-homologous probes, a series of hybridizations may be performed by varying one of (i) progressively lowering the annealing temperature (for example from 68° C. to 42° C.) or (ii) progressively lowering the formamide concentration (for example from 50% to 0%). The skilled artisan is aware of various parameters which may be altered during hybridisation and which will either maintain or change the stringency conditions.
[0040] Besides the hybridisation conditions, specificity of hybridisation typically also depends on the function of post-hybridisation washes. To remove background resulting from non-specific hybridisation, samples are washed with dilute salt solutions. Critical factors of such washes include the ionic strength and temperature of the final wash solution: the lower the salt concentration and the higher the wash temperature, the higher the stringency of the wash. Wash conditions are typically performed at or below hybridisation stringency. A positive hybridisation gives a signal that is at least twice that of the background. Generally, suitable stringent conditions for nucleic acid hybridisation assays or gene amplification detection procedures are as set forth above. More or less stringent conditions may also be selected. The skilled artisan is aware of various parameters which may be altered during washing and which will either maintain or change the stringency conditions.
[0041] For example, typical high stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 65° C. in 1×SSC or at 42° C. in 1×SSC and 50% formamide, followed by washing at 65° C. in 0.3×SSC. Examples of medium stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 50° C. in 4×SSC or at 40° C. in 6×SSC and 50% formamide, followed by washing at 50° C. in 2×SSC. The length of the hybrid is the anticipated length for the hybridising nucleic acid. When nucleic acids of known sequence are hybridised, the hybrid length may be determined by aligning the sequences and identifying the conserved regions described herein. 1×SSC is 0.15M NaCl and 15 mM sodium citrate; the hybridisation solution and wash solutions may additionally include 5×Denhardt's reagent, 0.5-1.0% SDS, 100 μg/ml denatured, fragmented salmon sperm DNA, 0.5% sodium pyrophosphate.
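The salt concentrations implied by the SSC dilutions mentioned above follow directly from the stated definition of 1×SSC (0.15M NaCl and 15 mM sodium citrate). A minimal Python sketch of that arithmetic is given below; the function name and dictionary keys are hypothetical.

```python
# Composition of 1x SSC as defined in the paragraph above.
NACL_MOLAR_PER_1X = 0.15
SODIUM_CITRATE_MM_PER_1X = 15.0


def ssc_composition(fold: float) -> dict:
    """Return NaCl (M) and sodium citrate (mM) for an n-fold SSC solution."""
    if fold < 0:
        raise ValueError("SSC strength cannot be negative")
    return {
        "NaCl_M": fold * NACL_MOLAR_PER_1X,
        "sodium_citrate_mM": fold * SODIUM_CITRATE_MM_PER_1X,
    }
```

On this basis, the 0.3×SSC high stringency wash corresponds to about 0.045 M NaCl and 4.5 mM sodium citrate, and the 2×SSC medium stringency wash to about 0.3 M NaCl and 30 mM sodium citrate.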
[0042] For the purposes of defining the level of stringency, reference can be made to Sambrook et al. (2001) Molecular Cloning: a laboratory manual, 3rd Edition, Cold Spring Harbor Laboratory Press, CSH, New York or to Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989 and yearly updates).
Splice Variant
[0043] The term "splice variant" as used herein encompasses variants of a nucleic acid sequence in which selected introns and/or exons have been excised, replaced, displaced or added, or in which introns have been shortened or lengthened. Such variants will be ones in which the biological activity of the protein is substantially retained; this may be achieved by selectively retaining functional segments of the protein. Such splice variants may be found in nature or may be manmade. Methods for predicting and isolating such splice variants are well known in the art (see for example Foissac and Schiex (2005) BMC Bioinformatics 6: 25).
Allelic Variant
[0044] Alleles or allelic variants are alternative forms of a given gene, located at the same chromosomal position. Allelic variants encompass Single Nucleotide Polymorphisms (SNPs), as well as Small Insertion/Deletion Polymorphisms (INDELs). The size of INDELs is usually less than 100 bp. SNPs and INDELs form the largest set of sequence variants in naturally occurring polymorphic strains of most organisms.
Gene Shuffling/Directed Evolution
[0045] Gene shuffling or directed evolution consists of iterations of DNA shuffling followed by appropriate screening and/or selection to generate variants of nucleic acids or portions thereof encoding proteins having a modified biological activity (Castle et al., (2004) Science 304(5674): 1151-4; U.S. Pat. Nos. 5,811,238 and 6,395,547).
Regulatory Element/Control Sequence/Promoter
[0046] The terms "regulatory element", "control sequence" and "promoter" are all used interchangeably herein and are to be taken in a broad context to refer to regulatory nucleic acid sequences capable of effecting expression of the sequences to which they are ligated. The term "promoter" typically refers to a nucleic acid control sequence located upstream from the transcriptional start of a gene and which is involved in recognising and binding of RNA polymerase and other proteins, thereby directing transcription of an operably linked nucleic acid. Encompassed by the aforementioned terms are transcriptional regulatory sequences derived from a classical eukaryotic genomic gene (including the TATA box which is required for accurate transcription initiation, with or without a CCAAT box sequence) and additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) which alter gene expression in response to developmental and/or external stimuli, or in a tissue-specific manner. Also included within the term is a transcriptional regulatory sequence of a classical prokaryotic gene, in which case it may include a -35 box sequence and/or -10 box transcriptional regulatory sequences. The term "regulatory element" also encompasses a synthetic fusion molecule or derivative that confers, activates or enhances expression of a nucleic acid molecule in a cell, tissue or organ.
[0047] A "plant promoter" comprises regulatory elements, which mediate the expression of a coding sequence segment in plant cells. Accordingly, a plant promoter need not be of plant origin, but may originate from viruses or micro-organisms, for example from viruses which attack plant cells. The "plant promoter" can also originate from a plant cell, e.g. from the plant which is transformed with the nucleic acid sequence to be expressed in the inventive process and described herein. This also applies to other "plant" regulatory signals, such as "plant" terminators. The promoters upstream of the nucleotide sequences useful in the methods of the present invention can be modified by one or more nucleotide substitution(s), insertion(s) and/or deletion(s) without interfering with the functionality or activity of either the promoters, the open reading frame (ORF) or the 3'-regulatory region such as terminators or other 3' regulatory regions which are located away from the ORF. It is furthermore possible that the activity of the promoters is increased by modification of their sequence, or that they are replaced completely by more active promoters, even promoters from heterologous organisms. For expression in plants, the nucleic acid molecule must, as described above, be linked operably to or comprise a suitable promoter which expresses the gene at the right point in time and with the required spatial expression pattern.
[0048] For the identification of functionally equivalent promoters, the promoter strength and/or expression pattern of a candidate promoter may be analysed for example by operably linking the promoter to a reporter gene and assaying the expression level and pattern of the reporter gene in various tissues of the plant. Suitable well-known reporter genes include for example beta-glucuronidase or beta-galactosidase. The promoter activity is assayed by measuring the enzymatic activity of the beta-glucuronidase or beta-galactosidase. The promoter strength and/or expression pattern may then be compared to that of a reference promoter (such as the one used in the methods of the present invention). Alternatively, promoter strength may be assayed by quantifying mRNA levels or by comparing mRNA levels of the nucleic acid used in the methods of the present invention, with mRNA levels of housekeeping genes such as 18S rRNA, using methods known in the art, such as Northern blotting with densitometric analysis of autoradiograms, quantitative real-time PCR or RT-PCR (Heid et al., 1996 Genome Methods 6: 986-994). Generally by "weak promoter" is intended a promoter that drives expression of a coding sequence at a low level. By "low level" is intended at levels of about 1/10,000 transcripts to about 1/100,000 transcripts, to about 1/500,000 transcripts per cell. Conversely, a "strong promoter" drives expression of a coding sequence at high level, or at about 1/10 transcripts to about 1/100 transcripts to about 1/1000 transcripts per cell. Generally, by "medium strength promoter" is intended a promoter that drives expression of a coding sequence at a lower level than a strong promoter, in particular at a level that is in all instances below that obtained when under the control of a 35S CaMV promoter.
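The approximate transcript levels given above for weak and strong promoters can be expressed as a simple classification, sketched below in Python purely for illustration. The thresholds are taken from the upper bound of the "low level" range (about 1/10,000) and the lower bound of the "high level" range (about 1/1000); the function name, the treatment of levels above 1/10 as strong, and the "unclassified" label for intermediate values are assumptions and not part of the definitions above. The medium strength definition, which is relative to a 35S CaMV promoter, is not modelled.

```python
# Thresholds as fractions of transcripts, from the approximate ranges above.
WEAK_MAX_FRACTION = 1 / 10_000    # upper bound of the "low level" range
STRONG_MIN_FRACTION = 1 / 1_000   # lower bound of the "high level" range


def classify_promoter(transcript_fraction: float) -> str:
    """Classify a promoter as 'strong', 'weak' or 'unclassified'.

    `transcript_fraction` is the fraction of transcripts attributable to the
    coding sequence driven by the promoter (for example 1/100).
    """
    if not 0 < transcript_fraction <= 1:
        raise ValueError("transcript_fraction must be in the interval (0, 1]")
    if transcript_fraction >= STRONG_MIN_FRACTION:
        return "strong"  # assumption: levels above 1/10 also count as strong
    if transcript_fraction <= WEAK_MAX_FRACTION:
        return "weak"
    return "unclassified"  # falls between the two stated ranges
```

A level of about 1/100 transcripts would thus be labelled strong, and a level of about 1/100,000 transcripts weak.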
Operably Linked
[0049] The term "operably linked" as used herein refers to a functional linkage between the promoter sequence and the gene of interest, such that the promoter sequence is able to initiate transcription of the gene of interest.
Constitutive Promoter
[0050] A "constitutive promoter" refers to a promoter that is transcriptionally active during most, but not necessarily all, phases of growth and development and under most environmental conditions, in at least one cell, tissue or organ. Table 2a below gives examples of constitutive promoters.
TABLE 2a Examples of constitutive promoters
Gene Source | Reference
Actin | McElroy et al, Plant Cell, 2: 163-171, 1990
HMGP | WO 2004/070039
CaMV 35S | Odell et al, Nature, 313: 810-812, 1985
CaMV 19S | Nilsson et al., Physiol. Plant. 100: 456-462, 1997
GOS2 | de Pater et al, Plant J Nov; 2(6): 837-44, 1992, WO 2004/065596
Ubiquitin | Christensen et al, Plant Mol. Biol. 18: 675-689, 1992
Rice cyclophilin | Buchholz et al, Plant Mol Biol. 25(5): 837-43, 1994
Maize H3 histone | Lepetit et al, Mol. Gen. Genet. 231: 276-285, 1992
Alfalfa H3 histone | Wu et al. Plant Mol. Biol. 11: 641-649, 1988
Actin 2 | An et al, Plant J. 10(1); 107-121, 1996
34S FMV | Sanger et al., Plant. Mol. Biol., 14, 1990: 433-443
Rubisco small subunit | U.S. Pat. No. 4,962,028
OCS | Leisner (1988) Proc Natl Acad Sci USA 85(5): 2553
SAD1 | Jain et al., Crop Science, 39 (6), 1999: 1696
SAD2 | Jain et al., Crop Science, 39 (6), 1999: 1696
nos | Shaw et al. (1984) Nucleic Acids Res. 12(20): 7831-7846
V-ATPase | WO 01/14572
Super promoter | WO 95/14098
G-box proteins | WO 94/12015
Ubiquitous Promoter
[0051] A ubiquitous promoter is active in substantially all tissues or cells of an organism.
Developmentally-Regulated Promoter
[0052] A developmentally-regulated promoter is active during certain developmental stages or in parts of the plant that undergo developmental changes.
Inducible Promoter
[0053] An inducible promoter has induced or increased transcription initiation in response to a chemical (for a review see Gatz 1997, Annu. Rev. Plant Physiol. Plant Mol. Biol., 48:89-108), environmental or physical stimulus, or may be "stress-inducible", i.e. activated when a plant is exposed to various stress conditions, or "pathogen-inducible", i.e. activated when a plant is exposed to various pathogens.
Organ-Specific/Tissue-Specific Promoter
[0054] An organ-specific or tissue-specific promoter is one that is capable of preferentially initiating transcription in certain organs or tissues, such as the leaves, roots, seed tissue etc. For example, a "root-specific promoter" is a promoter that is transcriptionally active predominantly in plant roots, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Promoters able to initiate transcription in certain cells only are referred to herein as "cell-specific".
[0055] Examples of root-specific promoters are listed in Table 2b below:
TABLE 2b Examples of root-specific promoters
Gene Source | Reference
RCc3 | Plant Mol Biol. 1995 Jan; 27(2): 237-48
Arabidopsis PHT1 | Koyama et al., 2005; Mudge et al. (2002, Plant J. 31: 341)
Medicago phosphate transporter | Xiao et al., 2006
Arabidopsis Pyk10 | Nitz et al. (2001) Plant Sci 161(2): 337-346
root-expressible genes | Tingey et al., EMBO J. 6: 1, 1987.
tobacco auxin-inducible gene | Van der Zaal et al., Plant Mol. Biol. 16, 983, 1991.
β-tubulin | Oppenheimer, et al., Gene 63: 87, 1988.
tobacco root-specific genes | Conkling, et al., Plant Physiol. 93: 1203, 1990.
B. napus G1-3b gene | U.S. Pat. No. 5,401,836
SbPRP1 | Suzuki et al., Plant Mol. Biol. 21: 109-119, 1993.
LRX1 | Baumberger et al. 2001, Genes & Dev. 15: 1128
BTG-26 Brassica napus | US 20050044585
LeAMT1 (tomato) | Lauter et al. (1996, PNAS 93: 8139)
LeNRT1-1 (tomato) | Lauter et al. (1996, PNAS 93: 8139)
class I patatin gene (potato) | Liu et al., Plant Mol. Biol. 153: 386-395, 1991.
KDC1 (Daucus carota) | Downey et al. (2000, J. Biol. Chem. 275: 39420)
TobRB7 gene | W Song (1997) PhD Thesis, North Carolina State University, Raleigh, NC USA
OsRAB5a (rice) | Wang et al. 2002, Plant Sci. 163: 273
ALF5 (Arabidopsis) | Diener et al. (2001, Plant Cell 13: 1625)
NRT2;1Np (N. plumbaginifolia) | Quesada et al. (1997, Plant Mol. Biol. 34: 265)
[0056] A seed-specific promoter is transcriptionally active predominantly in seed tissue, but not necessarily exclusively in seed tissue (in cases of leaky expression). The seed-specific promoter may be active during seed development and/or during germination. Examples of seed-specific promoters are shown in Table 2c to Table 2f below. Further examples of seed-specific promoters are given in Qing Qu and Takaiwa (Plant Biotechnol. J. 2, 113-125, 2004), which disclosure is incorporated by reference herein as if fully set forth.
TABLE 2c Examples of seed-specific promoters
Gene source | Reference
seed-specific genes | Simon et al., Plant Mol. Biol. 5: 191, 1985; Scofield et al., J. Biol. Chem. 262: 12202, 1987; Baszczynski et al., Plant Mol. Biol. 14: 633, 1990.
Brazil Nut albumin | Pearson et al., Plant Mol. Biol. 18: 235-245, 1992.
legumin | Ellis et al., Plant Mol. Biol. 10: 203-214, 1988.
glutelin (rice) | Takaiwa et al., Mol. Gen. Genet. 208: 15-22, 1986; Takaiwa et al., FEBS Letts. 221: 43-47, 1987.
zein | Matzke et al Plant Mol Biol, 14(3): 323-32 1990
napA | Stalberg et al, Planta 199: 515-519, 1996.
wheat LMW and HMW glutenin-1 | Mol Gen Genet 216: 81-90, 1989; NAR 17: 461-2, 1989
wheat SPA | Albani et al, Plant Cell, 9: 171-184, 1997
wheat α, β, γ-gliadins | EMBO J. 3: 1409-15, 1984
barley ltr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Theor Appl Gen 98: 1253-62, 1999; Plant J 4: 343-55, 1993; Mol Gen Genet 250: 750-60, 1996
barley DOF | Mena et al, The Plant Journal, 116(1): 53-62, 1998
blz2 | EP99106056.7
synthetic promoter | Vicente-Carbajosa et al., Plant J. 13: 629-640, 1998.
rice prolamin NRP33 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice α-globulin Glb-1 | Wu et al, Plant Cell Physiology 39(8) 885-889, 1998
rice OSH1 | Sato et al, Proc. Natl. Acad. Sci. USA, 93: 8117-8122, 1996
rice α-globulin REB/OHP-1 | Nakase et al. Plant Mol. Biol. 33: 513-522, 1997
rice ADP-glucose pyrophosphorylase | Trans Res 6: 157-68, 1997
maize ESR gene family | Plant J 12: 235-46, 1997
sorghum α-kafirin | DeRose et al., Plant Mol. Biol 32: 1029-35, 1996
KNOX | Postma-Haarsma et al, Plant Mol. Biol. 39: 257-71, 1999
rice oleosin | Wu et al, J. Biochem. 123: 386, 1998
sunflower oleosin | Cummins et al., Plant Mol. Biol. 19: 873-876, 1992
PRO0117, putative rice 40S ribosomal protein | WO 2004/070039
PRO0136, rice alanine aminotransferase | unpublished
PRO0147, trypsin inhibitor ITR1 (barley) | unpublished
PRO0151, rice WSI18 | WO 2004/070039
PRO0175, rice RAB21 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039
α-amylase (Amy32b) | Lanahan et al, Plant Cell 4: 203-211, 1992; Skriver et al, Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al, Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149; 1125-38, 1998
TABLE 2d Examples of endosperm-specific promoters
Gene source | Reference
glutelin (rice) | Takaiwa et al. (1986) Mol Gen Genet 208: 15-22; Takaiwa et al. (1987) FEBS Letts. 221: 43-47
zein | Matzke et al., (1990) Plant Mol Biol 14(3): 323-32
wheat LMW and HMW glutenin-1 | Colot et al. (1989) Mol Gen Genet 216: 81-90, Anderson et al. (1989) NAR 17: 461-2
wheat SPA | Albani et al. (1997) Plant Cell 9: 171-184
wheat gliadins | Rafalski et al. (1984) EMBO 3: 1409-15
barley ltr1 promoter | Diaz et al. (1995) Mol Gen Genet 248(5): 592-8
barley B1, C, D, hordein | Cho et al. (1999) Theor Appl Genet 98: 1253-62; Muller et al. (1993) Plant J 4: 343-55; Sorenson et al. (1996) Mol Gen Genet 250: 750-60
barley DOF | Mena et al, (1998) Plant J 116(1): 53-62
blz2 | Onate et al. (1999) J Biol Chem 274(14): 9175-82
synthetic promoter | Vicente-Carbajosa et al. (1998) Plant J 13: 629-640
rice prolamin NRP33 | Wu et al, (1998) Plant Cell Physiol 39(8) 885-889
rice globulin Glb-1 | Wu et al. (1998) Plant Cell Physiol 39(8) 885-889
rice globulin REB/OHP-1 | Nakase et al. (1997) Plant Molec Biol 33: 513-522
rice ADP-glucose pyrophosphorylase | Russell et al. (1997) Trans Res 6: 157-68
maize ESR gene family | Opsahl-Ferstad et al. (1997) Plant J 12: 235-46
sorghum kafirin | DeRose et al. (1996) Plant Mol Biol 32: 1029-35
TABLE 2e Examples of embryo-specific promoters
Gene source | Reference
rice OSH1 | Sato et al, Proc. Natl. Acad. Sci. USA, 93: 8117-8122, 1996
KNOX | Postma-Haarsma et al, Plant Mol. Biol. 39: 257-71, 1999
PRO0151 | WO 2004/070039
PRO0175 | WO 2004/070039
PRO005 | WO 2004/070039
PRO0095 | WO 2004/070039
TABLE 2f Examples of aleurone-specific promoters
Gene source | Reference
α-amylase (Amy32b) | Lanahan et al, Plant Cell 4: 203-211, 1992; Skriver et al, Proc Natl Acad Sci USA 88: 7266-7270, 1991
cathepsin β-like gene | Cejudo et al, Plant Mol Biol 20: 849-856, 1992
Barley Ltp2 | Kalla et al., Plant J. 6: 849-60, 1994
Chi26 | Leah et al., Plant J. 4: 579-89, 1994
Maize B-Peru | Selinger et al., Genetics 149; 1125-38, 1998
[0057] A green tissue-specific promoter as defined herein is a promoter that is transcriptionally active predominantly in green tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts.
[0058] Examples of green tissue-specific promoters which may be used to perform the methods of the invention are shown in Table 2g below.
TABLE 2g Examples of green tissue-specific promoters
Gene | Expression | Reference
Maize Orthophosphate dikinase | Leaf specific | Fukayama et al., 2001
Maize Phosphoenolpyruvate carboxylase | Leaf specific | Kausch et al., 2001
Rice Phosphoenolpyruvate carboxylase | Leaf specific | Liu et al., 2003
Rice small subunit Rubisco | Leaf specific | Nomura et al., 2000
rice beta expansin EXBP9 | Shoot specific | WO 2004/070039
Pigeonpea small subunit Rubisco | Leaf specific | Panguluri et al., 2005
Pea RBCS3A | Leaf specific |
[0059] Another example of a tissue-specific promoter is a meristem-specific promoter, which is transcriptionally active predominantly in meristematic tissue, substantially to the exclusion of any other parts of a plant, whilst still allowing for any leaky expression in these other plant parts. Examples of meristem-specific promoters which may be used to perform the methods of the invention are shown in Table 2h below.
TABLE 2h Examples of meristem-specific promoters
Gene source | Expression pattern | Reference
rice OSH1 | Shoot apical meristem, from embryo globular stage to seedling stage | Sato et al. (1996) Proc. Natl. Acad. Sci. USA, 93: 8117-8122
Rice metallothionein | Meristem specific | BAD87835.1
WAK1 & WAK 2 | Shoot and root apical meristems, and in expanding leaves and sepals | Wagner & Kohorn (2001) Plant Cell 13(2): 303-318
Terminator
[0060] The term "terminator" encompasses a control sequence which is a DNA sequence at the end of a transcriptional unit which signals 3' processing and polyadenylation of a primary transcript and termination of transcription. The terminator can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The terminator to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.
Modulation
[0061] The term "modulation" means, in relation to expression or gene expression, a process in which the expression level is changed by said gene expression in comparison to the control plant; the expression level may be increased or decreased. The original, unmodulated expression may be of any kind of expression of a structural RNA (rRNA, tRNA) or mRNA with subsequent translation. The term "modulating the activity" shall mean any change of the expression of the inventive nucleic acid sequences or encoded proteins, which leads to increased yield and/or increased growth of the plants.
Expression
[0062] The term "expression" or "gene expression" means the transcription of a specific gene or specific genes or specific genetic construct. The term "expression" or "gene expression" in particular means the transcription of a gene or genes or genetic construct into structural RNA (rRNA, tRNA) or mRNA with or without subsequent translation of the latter into a protein. The process includes transcription of DNA and processing of the resulting mRNA product.
Increased Expression/Overexpression
[0063] The term "increased expression" or "overexpression" as used herein means any form of expression that is additional to the original wild-type expression level.
[0064] Methods for increasing expression of genes or gene products are well documented in the art and include, for example, overexpression driven by appropriate promoters, the use of transcription enhancers or translation enhancers. Isolated nucleic acids which serve as promoter or enhancer elements may be introduced in an appropriate position (typically upstream) of a non-heterologous form of a polynucleotide so as to upregulate expression of a nucleic acid encoding the polypeptide of interest. For example, endogenous promoters may be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., WO9322443), or isolated promoters may be introduced into a plant cell in the proper orientation and distance from a gene of the present invention so as to control the expression of the gene.
[0065] If polypeptide expression is desired, it is generally desirable to include a polyadenylation region at the 3'-end of a polynucleotide coding region. The polyadenylation region can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The 3' end sequence to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.
[0066] An intron sequence may also be added to the 5' untranslated region (UTR) or the coding sequence of the partial coding sequence to increase the amount of the mature message that accumulates in the cytosol. Inclusion of a spliceable intron in the transcription unit in both plant and animal expression constructs has been shown to increase gene expression at both the mRNA and protein levels up to 1000-fold (Buchman and Berg (1988) Mol. Cell Biol. 8: 4395-4405; Callis et al. (1987) Genes Dev 1:1183-1200). Such intron enhancement of gene expression is typically greatest when the intron is placed near the 5' end of the transcription unit. Use of the maize introns Adh1-S intron 1, 2, and 6, and of the Bronze-1 intron, is known in the art. For general information see: The Maize Handbook, Chapter 116, Freeling and Walbot, Eds., Springer, N.Y. (1994).
Endogenous Gene
[0067] Reference herein to an "endogenous" gene not only refers to the gene in question as found in a plant in its natural form (i.e., without there being any human intervention), but also refers to that same gene (or a substantially homologous nucleic acid/gene) in an isolated form subsequently (re)introduced into a plant (a transgene). For example, a transgenic plant containing such a transgene may encounter a substantial reduction of the transgene expression and/or substantial reduction of expression of the endogenous gene. The isolated gene may be isolated from an organism or may be manmade, for example by chemical synthesis.
Decreased Expression
[0068] Reference herein to "decreased expression" or "reduction or substantial elimination" of expression is taken to mean a decrease in endogenous gene expression and/or polypeptide levels and/or polypeptide activity relative to control plants. The reduction or substantial elimination is in increasing order of preference at least 10%, 20%, 30%, 40% or 50%, 60%, 70%, 80%, 85%, 90%, or 95%, 96%, 97%, 98%, 99% or more reduced compared to that of control plants.
[0069] For the reduction or substantial elimination of expression of an endogenous gene in a plant, a sufficient length of substantially contiguous nucleotides of a nucleic acid sequence is required. In order to perform gene silencing, this may be as little as 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10 or fewer nucleotides; alternatively this may be as much as the entire gene (including the 5' and/or 3' UTR, either in part or in whole). The stretch of substantially contiguous nucleotides may be derived from the nucleic acid encoding the protein of interest (target gene), or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest. Preferably, the stretch of substantially contiguous nucleotides is capable of forming hydrogen bonds with the target gene (either sense or antisense strand); more preferably, the stretch of substantially contiguous nucleotides has, in increasing order of preference, 50%, 60%, 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, 100% sequence identity to the target gene (either sense or antisense strand). A nucleic acid sequence encoding a (functional) polypeptide is not a requirement for the various methods discussed herein for the reduction or substantial elimination of expression of an endogenous gene.
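By way of non-limiting illustration, the percentage sequence identity between a stretch of contiguous nucleotides and the corresponding region of a target gene may be sketched in Python as follows. The sketch compares two pre-aligned sequences of equal length without gaps, which is a simplifying assumption; in practice identity is usually determined with an alignment program. The function name and the example sequences are hypothetical.

```python
def percent_identity(stretch: str, target_region: str) -> float:
    """Percentage of identical positions between two equal-length sequences.

    Simplification: the sequences are assumed to be already aligned and to
    contain no gaps.
    """
    if not stretch or len(stretch) != len(target_region):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        1 for a, b in zip(stretch.upper(), target_region.upper()) if a == b
    )
    return 100.0 * matches / len(stretch)


# Hypothetical 10-nucleotide stretch differing from the target at one position.
identity = percent_identity("ATGGCCATTG", "ATGGCTATTG")
print(identity)  # 90.0
```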
[0070] This reduction or substantial elimination of expression may be achieved using routine tools and techniques. A preferred method for the reduction or substantial elimination of endogenous gene expression is by introducing and expressing in a plant a genetic construct into which the nucleic acid (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) is cloned as an inverted repeat (in part or completely), separated by a spacer (non-coding DNA).
[0071] In such a preferred method, expression of the endogenous gene is reduced or substantially eliminated through RNA-mediated silencing using an inverted repeat of a nucleic acid or a part thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), preferably capable of forming a hairpin structure. The inverted repeat is cloned in an expression vector comprising control sequences. A non-coding DNA nucleic acid sequence (a spacer, for example a matrix attachment region fragment (MAR), an intron, a polylinker, etc.) is located between the two inverted nucleic acids forming the inverted repeat. After transcription of the inverted repeat, a chimeric RNA with a self-complementary structure is formed (partial or complete). This double-stranded RNA structure is referred to as the hairpin RNA (hpRNA). The hpRNA is processed by the plant into siRNAs that are incorporated into an RNA-induced silencing complex (RISC). The RISC further cleaves the mRNA transcripts, thereby substantially reducing the number of mRNA transcripts to be translated into polypeptides. For further general details see for example, Grierson et al. (1998) WO 98/53083; Waterhouse et al. (1999) WO 99/53050.
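By way of non-limiting illustration, the arrangement of an inverted repeat separated by a spacer may be sketched in Python as follows. The sketch simply joins a fragment, a spacer and the reverse complement of the fragment, so that the two arms of the transcript can base-pair into a hairpin. The function names and the very short example sequences are hypothetical and are chosen only for readability; they do not represent any sequence of the invention.

```python
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_DNA_COMPLEMENT)[::-1]


def inverted_repeat_insert(fragment: str, spacer: str) -> str:
    """Build sense arm + spacer + antisense arm for a hairpin construct."""
    return fragment + spacer + reverse_complement(fragment)


# Hypothetical 6-nucleotide fragment and 4-nucleotide spacer.
insert = inverted_repeat_insert("ATGGCC", "TTTT")
print(insert)  # ATGGCCTTTTGGCCAT

# The second arm is complementary to the first, which permits self-pairing.
assert reverse_complement(insert[-6:]) == insert[:6]
```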
[0072] Performance of the methods of the invention does not rely on introducing and expressing in a plant a genetic construct into which the nucleic acid is cloned as an inverted repeat, but any one or more of several well-known "gene silencing" methods may be used to achieve the same effects.
[0073] One such method for the reduction of endogenous gene expression is RNA-mediated silencing of gene expression (downregulation). Silencing in this case is triggered in a plant by a double stranded RNA sequence (dsRNA) that is substantially similar to the target endogenous gene. This dsRNA is further processed by the plant into fragments of about 20 to about 26 nucleotides called short interfering RNAs (siRNAs). The siRNAs are incorporated into an RNA-induced silencing complex (RISC) that cleaves the mRNA transcript of the endogenous target gene, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. Preferably, the double stranded RNA sequence corresponds to a target gene.
[0074] Another example of an RNA silencing method involves the introduction of nucleic acid sequences or parts thereof (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest) in a sense orientation into a plant. "Sense orientation" refers to a DNA sequence that is homologous to an mRNA transcript thereof. Introduced into a plant would therefore be at least one copy of the nucleic acid sequence. The additional nucleic acid sequence will reduce expression of the endogenous gene, giving rise to a phenomenon known as co-suppression. The reduction of gene expression will be more pronounced if several additional copies of a nucleic acid sequence are introduced into the plant, as there is a positive correlation between high transcript levels and the triggering of co-suppression.
[0075] Another example of an RNA silencing method involves the use of antisense nucleic acid sequences. An "antisense" nucleic acid sequence comprises a nucleotide sequence that is complementary to a "sense" nucleic acid sequence encoding a protein, i.e. complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA transcript sequence. The antisense nucleic acid sequence is preferably complementary to the endogenous gene to be silenced. The complementarity may be located in the "coding region" and/or in the "non-coding region" of a gene. The term "coding region" refers to a region of the nucleotide sequence comprising codons that are translated into amino acid residues. The term "non-coding region" refers to 5' and 3' sequences that flank the coding region that are transcribed but not translated into amino acids (also referred to as 5' and 3' untranslated regions).
[0076] Antisense nucleic acid sequences can be designed according to the rules of Watson and Crick base pairing. The antisense nucleic acid sequence may be complementary to the entire nucleic acid sequence (in this case a stretch of substantially contiguous nucleotides derived from the gene of interest, or from any nucleic acid capable of encoding an orthologue, paralogue or homologue of the protein of interest), but may also be an oligonucleotide that is antisense to only a part of the nucleic acid sequence (including the mRNA 5' and 3' UTR). For example, the antisense oligonucleotide sequence may be complementary to the region surrounding the translation start site of an mRNA transcript encoding a polypeptide. The length of a suitable antisense oligonucleotide sequence is known in the art and may start from about 50, 45, 40, 35, 30, 25, 20, 15 or 10 nucleotides in length or less. An antisense nucleic acid sequence according to the invention may be constructed using chemical synthesis and enzymatic ligation reactions using methods known in the art. For example, an antisense nucleic acid sequence (e.g., an antisense oligonucleotide sequence) may be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acid sequences, e.g., phosphorothioate derivatives and acridine substituted nucleotides may be used. Examples of modified nucleotides that may be used to generate the antisense nucleic acid sequences are well known in the art. Known nucleotide modifications include methylation, cyclization and "caps" and substitution of one or more of the naturally occurring nucleotides with an analogue such as inosine. Other modifications of nucleotides are well known in the art.
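By way of non-limiting illustration, the design of an antisense oligonucleotide complementary to the region surrounding the translation start site may be sketched in Python as follows. The sketch takes a window around the first AUG of a transcript and returns its reverse complement according to Watson and Crick base pairing. The function names, the flank length and the short example transcript are hypothetical, and the use of the first AUG as the start site is an assumption made only for this example.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return seq.upper().translate(_RNA_COMPLEMENT)[::-1]


def antisense_around_start(mrna: str, flank: int) -> str:
    """Antisense oligonucleotide covering the start codon plus `flank`
    nucleotides on either side (fewer if the transcript end is reached)."""
    start = mrna.upper().find("AUG")  # assumption: first AUG is the start site
    if start < 0:
        raise ValueError("no AUG start codon found")
    begin = max(0, start - flank)
    end = min(len(mrna), start + 3 + flank)
    return reverse_complement_rna(mrna[begin:end])


# Hypothetical 15-nucleotide transcript; the window is ACCAUGGCU.
oligo = antisense_around_start("GGCACCAUGGCUUCA", flank=3)
print(oligo)  # AGCCAUGGU
```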
[0077] The antisense nucleic acid sequence can be produced biologically using an expression vector into which a nucleic acid sequence has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest). Preferably, production of antisense nucleic acid sequences in plants occurs by means of a stably integrated nucleic acid construct comprising a promoter, an operably linked antisense oligonucleotide, and a terminator.
[0078] The nucleic acid molecules used for silencing in the methods of the invention (whether introduced into a plant or generated in situ) hybridize with or bind to mRNA transcripts and/or genomic DNA encoding a polypeptide to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid sequence which binds to DNA duplexes, through specific interactions in the major groove of the double helix. Antisense nucleic acid sequences may be introduced into a plant by transformation or direct injection at a specific tissue site. Alternatively, antisense nucleic acid sequences can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense nucleic acid sequences can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid sequence to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid sequences can also be delivered to cells using the vectors described herein.
[0079] According to a further aspect, the antisense nucleic acid sequence is an α-anomeric nucleic acid sequence. An α-anomeric nucleic acid sequence forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucl Ac Res 15: 6625-6641). The antisense nucleic acid sequence may also comprise a 2'-O-methylribonucleotide (Inoue et al. (1987) Nucl Ac Res 15, 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215, 327-330).
[0080] The reduction or substantial elimination of endogenous gene expression may also be performed using ribozymes. Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid sequence, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes, described in Haselhoff and Gerlach (1988) Nature 334, 585-591) can be used to catalytically cleave mRNA transcripts encoding a polypeptide, thereby substantially reducing the number of mRNA transcripts to be translated into a polypeptide. A ribozyme having specificity for a nucleic acid sequence can be designed (see for example: Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742). Alternatively, mRNA transcripts corresponding to a nucleic acid sequence can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules (Bartel and Szostak (1993) Science 261, 1411-1418). The use of ribozymes for gene silencing in plants is known in the art (e.g., Atkins et al. (1994) WO 94/00012; Lenne et al. (1995) WO 95/03404; Lutziger et al. (2000) WO 00/00619; Prinsen et al. (1997) WO 97/13865 and Scott et al. (1997) WO 97/38116).
[0081] Gene silencing may also be achieved by insertion mutagenesis (for example, T-DNA insertion or transposon insertion) or by strategies as described by, among others, Angell and Baulcombe ((1999) Plant J 20(3): 357-62), (Amplicon VIGS WO 98/36083), or Baulcombe (WO 99/15682).
[0082] Gene silencing may also occur if there is a mutation on an endogenous gene and/or a mutation on an isolated gene/nucleic acid subsequently introduced into a plant. The reduction or substantial elimination may be caused by a non-functional polypeptide. For example, the polypeptide may bind to various interacting proteins; one or more mutation(s) and/or truncation(s) may therefore provide for a polypeptide that is still able to bind interacting proteins (such as receptor proteins) but that cannot exhibit its normal function (such as signalling ligand).
[0083] A further approach to gene silencing is by targeting nucleic acid sequences complementary to the regulatory region of the gene (e.g., the promoter and/or enhancers) to form triple helical structures that prevent transcription of the gene in target cells. See Helene, C., Anticancer Drug Des. 6, 569-84, 1991; Helene et al., Ann. N.Y. Acad. Sci. 660, 27-36, 1992; and Maher, L. J., BioEssays 14, 807-15, 1992.
[0084] Other methods, such as the use of antibodies directed to an endogenous polypeptide for inhibiting its function in planta, or interference in the signalling pathway in which a polypeptide is involved, will be well known to the skilled man. In particular, it can be envisaged that manmade molecules may be useful for inhibiting the biological function of a target polypeptide, or for interfering with the signalling pathway in which the target polypeptide is involved.
[0085] Alternatively, a screening program may be set up to identify in a plant population natural variants of a gene, which variants encode polypeptides with reduced activity. Such natural variants may also be used for example, to perform homologous recombination.
[0086] Artificial and/or natural microRNAs (miRNAs) may be used to knock out gene expression and/or mRNA translation. Endogenous miRNAs are single-stranded small RNAs, typically 19-24 nucleotides long. They function primarily to regulate gene expression and/or mRNA translation. Most plant miRNAs have perfect or near-perfect complementarity with their target sequences. However, there are natural targets with up to five mismatches. They are processed from longer non-coding RNAs with characteristic fold-back structures by double-strand specific RNases of the Dicer family. Upon processing, they are incorporated in the RNA-induced silencing complex (RISC) by binding to its main component, an Argonaute protein. MiRNAs serve as the specificity components of RISC, since they base-pair to target nucleic acids, mostly mRNAs, in the cytoplasm. Subsequent regulatory events include target mRNA cleavage and destruction and/or translational inhibition. Effects of miRNA overexpression are thus often reflected in decreased mRNA levels of target genes.
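By way of non-limiting illustration, the degree of complementarity between a miRNA and a candidate target site may be sketched in Python as follows. The sketch counts the positions at which the target site differs from the perfect reverse complement of the miRNA and compares the count with the figure of up to five mismatches mentioned above. It treats G:U wobble pairs as mismatches and ignores positional effects, both of which are simplifying assumptions; the function names and the 20-nucleotide example sequence are hypothetical.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return seq.upper().translate(_RNA_COMPLEMENT)[::-1]


def count_mismatches(mirna: str, target_site: str) -> int:
    """Number of positions where the target site is not Watson-Crick
    complementary to the miRNA (G:U wobble counted as a mismatch)."""
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have equal length")
    perfect_site = reverse_complement_rna(mirna)
    return sum(1 for a, b in zip(perfect_site, target_site.upper()) if a != b)


def within_tolerance(mirna: str, target_site: str, max_mismatches: int = 5) -> bool:
    """True if the site has no more than `max_mismatches` mismatches."""
    return count_mismatches(mirna, target_site) <= max_mismatches


# Hypothetical 20-nucleotide miRNA, a perfectly complementary site and a
# site altered at its first position.
example_mirna = "UGACAGAAGAGAGUGAGCAC"
perfect = reverse_complement_rna(example_mirna)
one_off = "C" + perfect[1:]

print(count_mismatches(example_mirna, perfect))  # 0
print(count_mismatches(example_mirna, one_off))  # 1
```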
[0087] Artificial microRNAs (amiRNAs), which are typically 21 nucleotides in length, can be genetically engineered specifically to negatively regulate gene expression of single or multiple genes of interest. Determinants of plant microRNA target selection are well known in the art. Empirical parameters for target recognition have been defined and can be used to aid in the design of specific amiRNAs (Schwab et al., Dev. Cell 8, 517-527, 2005). Convenient tools for design and generation of amiRNAs and their precursors are also available to the public (Schwab et al., Plant Cell 18, 1121-1133, 2006).
[0088] For optimal performance, the gene silencing techniques used for reducing expression in a plant of an endogenous gene require the use of nucleic acid sequences from monocotyledonous plants for transformation of monocotyledonous plants, and from dicotyledonous plants for transformation of dicotyledonous plants. Preferably, a nucleic acid sequence from any given plant species is introduced into that same species. For example, a nucleic acid sequence from rice is transformed into a rice plant. However, it is not an absolute requirement that the nucleic acid sequence to be introduced originates from the same plant species as the plant in which it will be introduced. It is sufficient that there is substantial homology between the endogenous target gene and the nucleic acid to be introduced.
[0089] Described above are examples of various methods for the reduction or substantial elimination of expression in a plant of an endogenous gene. A person skilled in the art would readily be able to adapt the aforementioned methods for silencing so as to achieve reduction of expression of an endogenous gene in a whole plant or in parts thereof through the use of an appropriate promoter, for example.
Selectable Marker (Gene)/Reporter Gene
[0090] "Selectable marker", "selectable marker gene" or "reporter gene" includes any gene that confers a phenotype on a cell in which it is expressed to facilitate the identification and/or selection of cells that are transfected or transformed with a nucleic acid construct of the invention. These marker genes enable the identification of a successful transfer of the nucleic acid molecules via a series of different principles. Suitable markers may be selected from markers that confer antibiotic or herbicide resistance, that introduce a new metabolic trait or that allow visual selection. Examples of selectable marker genes include genes conferring resistance to antibiotics (such as nptII that phosphorylates neomycin and kanamycin, or hpt, phosphorylating hygromycin, or genes conferring resistance to, for example, bleomycin, streptomycin, tetracycline, chloramphenicol, ampicillin, gentamycin, geneticin (G418), spectinomycin or blasticidin), to herbicides (for example bar which provides resistance to Basta®; aroA or gox providing resistance against glyphosate, or the genes conferring resistance to, for example, imidazolinone, phosphinothricin or sulfonylurea), or genes that provide a metabolic trait (such as manA that allows plants to use mannose as sole carbon source or xylose isomerase for the utilisation of xylose, or antinutritive markers such as the resistance to 2-deoxyglucose). Expression of visual marker genes results in the formation of colour (for example β-glucuronidase, GUS or β-galactosidase with its coloured substrates, for example X-Gal), luminescence (such as the luciferin/luciferase system) or fluorescence (Green Fluorescent Protein, GFP, and derivatives thereof). This list represents only a small number of possible markers. The skilled worker is familiar with such markers. Different markers are preferred, depending on the organism and the selection method.
[0091] It is known that upon stable or transient integration of nucleic acids into plant cells, only a minority of the cells takes up the foreign DNA and, if desired, integrates it into its genome, depending on the expression vector used and the transfection technique used. To identify and select these integrants, a gene coding for a selectable marker (such as the ones described above) is usually introduced into the host cells together with the gene of interest. These markers can for example be used in mutants in which these genes are not functional by, for example, deletion by conventional methods. Furthermore, nucleic acid molecules encoding a selectable marker can be introduced into a host cell on the same vector that comprises the sequence encoding the polypeptides of the invention or used in the methods of the invention, or else in a separate vector. Cells which have been stably transfected with the introduced nucleic acid can be identified for example by selection (for example, cells which have integrated the selectable marker survive whereas the other cells die).
[0092] Since the marker genes, particularly genes for resistance to antibiotics and herbicides, are no longer required or are undesired in the transgenic host cell once the nucleic acids have been introduced successfully, the process according to the invention for introducing the nucleic acids advantageously employs techniques which enable the removal or excision of these marker genes. One such method is what is known as co-transformation. The co-transformation method employs two vectors simultaneously for the transformation, one vector bearing the nucleic acid according to the invention and a second bearing the marker gene(s). A large proportion of transformants receives or, in the case of plants, comprises (up to 40% or more of the transformants), both vectors. In case of transformation with Agrobacteria, the transformants usually receive only a part of the vector, i.e. the sequence flanked by the T-DNA, which usually represents the expression cassette. The marker genes can subsequently be removed from the transformed plant by performing crosses. In another method, marker genes integrated into a transposon are used for the transformation together with desired nucleic acid (known as the Ac/Ds technology). The transformants can be crossed with a transposase source or the transformants are transformed with a nucleic acid construct conferring expression of a transposase, transiently or stably. In some cases (approx. 10%), the transposon jumps out of the genome of the host cell once transformation has taken place successfully and is lost. In a further number of cases, the transposon jumps to a different location. In these cases the marker gene must be eliminated by performing crosses. In microbiology, techniques were developed which make possible, or facilitate, the detection of such events. A further advantageous method relies on what is known as recombination systems, whose advantage is that elimination by crossing can be dispensed with.
The best-known system of this type is what is known as the Cre/lox system. Cre1 is a recombinase that removes the sequences located between the loxP sequences. If the marker gene is integrated between the loxP sequences, it is removed once transformation has taken place successfully, by expression of the recombinase. Further recombination systems are the HIN/HIX, FLP/FRT and REP/STB system (Tribble et al., J. Biol. Chem., 275, 2000: 22255-22267; Velmurugan et al., J. Cell Biol., 149, 2000: 553-566). A site-specific integration into the plant genome of the nucleic acid sequences according to the invention is possible. Naturally, these methods can also be applied to microorganisms such as yeast, fungi or bacteria.
Transgenic/Transgene/Recombinant
[0093] For the purposes of the invention, "transgenic", "transgene" or "recombinant" means with regard to, for example, a nucleic acid sequence, an expression cassette, gene construct or a vector comprising the nucleic acid sequence or an organism transformed with the nucleic acid sequences, expression cassettes or vectors according to the invention, all those constructions brought about by recombinant methods in which either
[0094] (a) the nucleic acid sequences encoding proteins useful in the methods of the invention, or
[0095] (b) genetic control sequence(s) which is operably linked with the nucleic acid sequence according to the invention, for example a promoter, or
[0096] (c) a) and b) are not located in their natural genetic environment or have been modified by recombinant methods, it being possible for the modification to take the form of, for example, a substitution, addition, deletion, inversion or insertion of one or more nucleotide residues. The natural genetic environment is understood as meaning the natural genomic or chromosomal locus in the original plant or the presence in a genomic library. In the case of a genomic library, the natural genetic environment of the nucleic acid sequence is preferably retained, at least in part. The environment flanks the nucleic acid sequence at least on one side and has a sequence length of at least 50 bp, preferably at least 500 bp, especially preferably at least 1000 bp, most preferably at least 5000 bp. A naturally occurring expression cassette--for example the naturally occurring combination of the natural promoter of the nucleic acid sequences with the corresponding nucleic acid sequence encoding a polypeptide useful in the methods of the present invention, as defined above--becomes a transgenic expression cassette when this expression cassette is modified by non-natural, synthetic ("artificial") methods such as, for example, mutagenic treatment. Suitable methods are described, for example, in U.S. Pat. No. 5,565,350 or WO 00/15815.
[0097] A transgenic plant for the purposes of the invention is thus understood as meaning, as above, that the nucleic acids used in the method of the invention are not at their natural locus in the genome of said plant, it being possible for the nucleic acids to be expressed homologously or heterologously. However, as mentioned, transgenic also means that, while the nucleic acids according to the invention or used in the inventive method are at their natural position in the genome of a plant, the sequence has been modified with regard to the natural sequence, and/or that the regulatory sequences of the natural sequences have been modified. Transgenic is preferably understood as meaning the expression of the nucleic acids according to the invention at an unnatural locus in the genome, i.e. homologous or, preferably, heterologous expression of the nucleic acids takes place. Preferred transgenic plants are mentioned herein.
Transformation
[0098] The term "introduction" or "transformation" as referred to herein encompasses the transfer of an exogenous polynucleotide into a host cell, irrespective of the method used for transfer. Plant tissue capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may be transformed with a genetic construct of the present invention and a whole plant regenerated therefrom. The particular tissue chosen will vary depending on the clonal propagation systems available for, and best suited to, the particular species being transformed. Exemplary tissue targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue, existing meristematic tissue (e.g., apical meristem, axillary buds, and root meristems), and induced meristem tissue (e.g., cotyledon meristem and hypocotyl meristem). The polynucleotide may be transiently or stably introduced into a host cell and may be maintained non-integrated, for example, as a plasmid. Alternatively, it may be integrated into the host genome. The resulting transformed plant cell may then be used to regenerate a transformed plant in a manner known to persons skilled in the art.
[0099] The transfer of foreign genes into the genome of a plant is called transformation. Transformation of plant species is now a fairly routine technique. Advantageously, any of several transformation methods may be used to introduce the gene of interest into a suitable ancestor cell. The methods described for the transformation and regeneration of plants from plant tissues or plant cells may be utilized for transient or for stable transformation. Transformation methods include the use of liposomes, electroporation, chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle gun bombardment, transformation using viruses or pollen and microprojection. Methods may be selected from the calcium/polyethylene glycol method for protoplasts (Krens, F. A. et al., (1982) Nature 296, 72-74; Negrutiu I et al. (1987) Plant Mol Biol 8: 363-373); electroporation of protoplasts (Shillito R. D. et al. (1985) Bio/Technol 3, 1099-1102); microinjection into plant material (Crossway A et al., (1986) Mol. Gen Genet 202: 179-185); DNA or RNA-coated particle bombardment (Klein T M et al., (1987) Nature 327: 70); infection with (non-integrative) viruses and the like. Transgenic plants, including transgenic crop plants, are preferably produced via Agrobacterium-mediated transformation. An advantageous transformation method is the transformation in planta. To this end, it is possible, for example, to allow the agrobacteria to act on plant seeds or to inoculate the plant meristem with agrobacteria. It has proved particularly expedient in accordance with the invention to allow a suspension of transformed agrobacteria to act on the intact plant or at least on the flower primordia. The plant is subsequently grown on until the seeds of the treated plant are obtained (Clough and Bent, Plant J. (1998) 16, 735-743).
Methods for Agrobacterium-mediated transformation of rice include well known methods for rice transformation, such as those described in any of the following: European patent application EP 1198985 A1, Aldemita and Hodges (Planta 199: 612-617, 1996); Chan et al. (Plant Mol Biol 22 (3): 491-506, 1993), Hiei et al. (Plant J 6 (2): 271-282, 1994), which disclosures are incorporated by reference herein as if fully set forth. In the case of corn transformation, the preferred method is as described in either Ishida et al. (Nat. Biotechnol 14(6): 745-50, 1996) or Frame et al. (Plant Physiol 129(1): 13-22, 2002), which disclosures are incorporated by reference herein as if fully set forth. Said methods are further described by way of example in B. Jenes et al., Techniques for Gene Transfer, in: Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press (1993) 128-143 and in Potrykus Annu. Rev. Plant Physiol. Plant Molec. Biol. 42 (1991) 205-225. The nucleic acids or the construct to be expressed is preferably cloned into a vector, which is suitable for transforming Agrobacterium tumefaciens, for example pBin19 (Bevan et al., Nucl. Acids Res. 12 (1984) 8711). Agrobacteria transformed by such a vector can then be used in known manner for the transformation of plants, such as plants used as a model, like Arabidopsis (Arabidopsis thaliana is within the scope of the present invention not considered as a crop plant), or crop plants such as, by way of example, tobacco plants, for example by immersing bruised leaves or chopped leaves in an agrobacterial solution and then culturing them in suitable media. The transformation of plants by means of Agrobacterium tumefaciens is described, for example, by Hofgen and Willmitzer in Nucl. Acid Res. (1988) 16, 9877 or is known inter alia from F. F. White, Vectors for Gene Transfer in Higher Plants; in Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press, 1993, pp. 15-38.
[0100] In addition to the transformation of somatic cells, which then have to be regenerated into intact plants, it is also possible to transform the cells of plant meristems and in particular those cells which develop into gametes. In this case, the transformed gametes follow the natural plant development, giving rise to transgenic plants. Thus, for example, seeds of Arabidopsis are treated with agrobacteria and seeds are obtained from the developing plants of which a certain proportion is transformed and thus transgenic [Feldmann, K A and Marks M D (1987). Mol Gen Genet 208:274-289; Feldmann K (1992). In: C Koncz, N-H Chua and J Schell, eds, Methods in Arabidopsis Research. World Scientific, Singapore, pp. 274-289]. Alternative methods are based on the repeated removal of the inflorescences and incubation of the excision site in the center of the rosette with transformed agrobacteria, whereby transformed seeds can likewise be obtained at a later point in time (Chang (1994). Plant J. 5: 551-558; Katavic (1994). Mol Gen Genet, 245: 363-370). However, an especially effective method is the vacuum infiltration method with its modifications such as the "floral dip" method. In the case of vacuum infiltration of Arabidopsis, intact plants under reduced pressure are treated with an agrobacterial suspension [Bechtold, N (1993). C R Acad Sci Paris Life Sci, 316: 1194-1199], while in the case of the "floral dip" method the developing floral tissue is incubated briefly with a surfactant-treated agrobacterial suspension [Clough, S J and Bent A F (1998) The Plant J. 16, 735-743]. A certain proportion of transgenic seeds are harvested in both cases, and these seeds can be distinguished from non-transgenic seeds by growing under the above-described selective conditions. In addition, the stable transformation of plastids is advantageous because plastids are inherited maternally in most crops, reducing or eliminating the risk of transgene flow through pollen.
The transformation of the chloroplast genome is generally achieved by a process which has been schematically displayed in Klaus et al., 2004 [Nature Biotechnology 22 (2), 225-229]. Briefly, the sequences to be transformed are cloned together with a selectable marker gene between flanking sequences homologous to the chloroplast genome. These homologous flanking sequences direct site-specific integration into the plastome. Plastidal transformation has been described for many different plant species and an overview is given in Bock (2001) Transgenic plastids in basic research and plant biotechnology. J Mol Biol. 2001 Sep. 21; 312 (3):425-38 or Maliga, P (2003) Progress towards commercialization of plastid transformation technology. Trends Biotechnol. 21, 20-28. Further biotechnological progress has recently been reported in the form of marker-free plastid transformants, which can be produced by a transient co-integrated marker gene (Klaus et al., 2004, Nature Biotechnology 22(2), 225-229).
T-DNA Activation Tagging
[0101] T-DNA activation tagging (Hayashi et al. Science (1992) 1350-1353), involves insertion of T-DNA, usually containing a promoter (may also be a translation enhancer or an intron), in the genomic region of the gene of interest or 10 kb up- or downstream of the coding region of a gene in a configuration such that the promoter directs expression of the targeted gene. Typically, regulation of expression of the targeted gene by its natural promoter is disrupted and the gene falls under the control of the newly introduced promoter. The promoter is typically embedded in a T-DNA. This T-DNA is randomly inserted into the plant genome, for example, through Agrobacterium infection and leads to modified expression of genes near the inserted T-DNA. The resulting transgenic plants show dominant phenotypes due to modified expression of genes close to the introduced promoter.
TILLING
[0102] The term "TILLING" is an abbreviation of "Targeted Induced Local Lesions In Genomes" and refers to a mutagenesis technology useful to generate and/or identify nucleic acids encoding proteins with modified expression and/or activity. TILLING also allows selection of plants carrying such mutant variants. These mutant variants may exhibit modified expression, either in strength or in location or in timing (if the mutations affect the promoter for example). These mutant variants may exhibit higher activity than that exhibited by the gene in its natural form. TILLING combines high-density mutagenesis with high-throughput screening methods. The steps typically followed in TILLING are: (a) EMS mutagenesis (Redei G P and Koncz C (1992) In Methods in Arabidopsis Research, Koncz C, Chua N H, Schell J, eds. Singapore, World Scientific Publishing Co, pp. 16-82; Feldmann et al., (1994) In Meyerowitz E M, Somerville C R, eds, Arabidopsis. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp 137-172; Lightner J and Caspar T (1998) In J Martinez-Zapater, J Salinas, eds, Methods on Molecular Biology, Vol. 82. Humana Press, Totowa, N.J., pp 91-104); (b) DNA preparation and pooling of individuals; (c) PCR amplification of a region of interest; (d) denaturation and annealing to allow formation of heteroduplexes; (e) DHPLC, where the presence of a heteroduplex in a pool is detected as an extra peak in the chromatogram; (f) identification of the mutant individual; and (g) sequencing of the mutant PCR product. Methods for TILLING are well known in the art (McCallum et al., (2000) Nat Biotechnol 18: 455-457; reviewed by Stemple (2004) Nat Rev Genet 5(2): 145-50).
Homologous Recombination
[0103] Homologous recombination allows introduction in a genome of a selected nucleic acid at a defined selected position. Homologous recombination is a standard technology used routinely in biological sciences for lower organisms such as yeast or the moss Physcomitrella. Methods for performing homologous recombination in plants have been described not only for model plants (Offringa et al. (1990) EMBO J 9(10): 3077-84) but also for crop plants, for example rice (Terada et al. (2002) Nat Biotech 20(10): 1030-4; Iida and Terada (2004) Curr Opin Biotech 15(2): 132-8), and approaches exist that are generally applicable regardless of the target organism (Miller et al, Nature Biotechnol. 25, 778-785, 2007).
Yield
[0104] The term "yield" in general means a measurable produce of economic value, typically related to a specified crop, to an area, and to a period of time. Individual plant parts directly contribute to yield based on their number, size and/or weight, or the actual yield is the yield per square meter for a crop and year, which is determined by dividing total production (includes both harvested and appraised production) by planted square meters. The term "yield" of a plant may relate to vegetative biomass (root and/or shoot biomass), to reproductive organs, and/or to propagules (such as seeds) of that plant.
Early Vigour
[0105] "Early vigour" refers to active healthy well-balanced growth especially during early stages of plant growth, and may result from increased plant fitness due to, for example, the plants being better adapted to their environment (i.e. optimizing the use of energy resources and partitioning between shoot and root). Plants having early vigour also show increased seedling survival and a better establishment of the crop, which often results in highly uniform fields (with the crop growing in uniform manner, i.e. with the majority of plants reaching the various stages of development at substantially the same time), and often better and higher yield. Therefore, early vigour may be determined by measuring various factors, such as thousand kernel weight, percentage germination, percentage emergence, seedling growth, seedling height, root length, root and shoot biomass and many more.
Increase/Improve/Enhance
[0106] The terms "increase", "improve" or "enhance" are interchangeable and shall mean in the sense of the application at least a 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, preferably at least 15% or 20%, more preferably 25%, 30%, 35% or 40% more yield and/or growth in comparison to control plants as defined herein.
Seed Yield
[0107] Increased seed yield may manifest itself as one or more of the following: a) an increase in seed biomass (total seed weight) which may be on an individual seed basis and/or per plant and/or per square meter; b) increased number of flowers per plant; c) increased number of (filled) seeds; d) increased seed filling rate (which is expressed as the ratio of the number of filled seeds to the total number of seeds); e) increased harvest index, which is expressed as the ratio of the yield of harvestable parts, such as seeds, to the total biomass; and f) increased thousand kernel weight (TKW), which is extrapolated from the number of filled seeds counted and their total weight. An increased TKW may result from an increased seed size and/or seed weight, and may also result from an increase in embryo and/or endosperm size.
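The ratios defined in items d) to f) above may be computed, for example, as in the following minimal sketch. The function and parameter names are hypothetical, and the use of grams as the weight unit is an assumption made for illustration only.

```python
def seed_fill_rate(filled_seeds: int, total_seeds: int) -> float:
    """Ratio of the number of filled seeds to the total number of seeds."""
    if total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    return filled_seeds / total_seeds


def harvest_index(harvestable_yield_g: float, total_biomass_g: float) -> float:
    """Ratio of the yield of harvestable parts (e.g. seeds) to total biomass."""
    if total_biomass_g <= 0:
        raise ValueError("total_biomass_g must be positive")
    return harvestable_yield_g / total_biomass_g


def thousand_kernel_weight(filled_seeds: int, filled_seed_weight_g: float) -> float:
    """Extrapolate the weight of 1000 kernels from a counted, weighed sample."""
    if filled_seeds <= 0:
        raise ValueError("filled_seeds must be positive")
    return 1000.0 * filled_seed_weight_g / filled_seeds
```

As a usage example, a sample of 200 filled seeds weighing 5.0 g in total would be passed as thousand_kernel_weight(200, 5.0).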
[0108] An increase in seed yield may also be manifested as an increase in seed size and/or seed volume. Furthermore, an increase in seed yield may also manifest itself as an increase in seed area and/or seed length and/or seed width and/or seed perimeter. Increased yield may also result in modified architecture, or may occur because of modified architecture.
Greenness Index
[0109] The "greenness index" as used herein is calculated from digital images of plants. For each pixel belonging to the plant object on the image, the ratio of the green value versus the red value (in the RGB model for encoding color) is calculated. The greenness index is expressed as the percentage of pixels for which the green-to-red ratio exceeds a given threshold. Under normal growth conditions, under salt stress growth conditions, and under reduced nutrient availability growth conditions, the greenness index of plants is measured in the last imaging before flowering. In contrast, under drought stress growth conditions, the greenness index of plants is measured in the first imaging after drought.
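The calculation described above may be sketched as follows. This is an illustrative example only: it assumes that the pixels belonging to the plant object have already been separated from the background, the function name is hypothetical, and the threshold value is left to the caller because no particular value is specified herein.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue) values of one plant pixel


def greenness_index(plant_pixels: Iterable[Pixel], threshold: float) -> float:
    """Percentage of plant pixels whose green-to-red ratio exceeds the threshold.

    The comparison green > threshold * red is equivalent to
    green / red > threshold for red > 0 and avoids division by zero.
    """
    pixels = list(plant_pixels)
    if not pixels:
        raise ValueError("no plant pixels supplied")
    above = sum(1 for red, green, _blue in pixels if green > threshold * red)
    return 100.0 * above / len(pixels)
```

The returned value lies between 0 and 100 and can be compared between transgenic and control plants imaged at the same time point.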
Plant
[0110] The term "plant" as used herein encompasses whole plants, ancestors and progeny of the plants and plant parts, including seeds, shoots, stems, leaves, roots (including tubers), flowers, and tissues and organs, wherein each of the aforementioned comprises the gene/nucleic acid of interest. The term "plant" also encompasses plant cells, suspension cultures, callus tissue, embryos, meristematic regions, gametophytes, sporophytes, pollen and microspores, again wherein each of the aforementioned comprises the gene/nucleic acid of interest.
[0111] Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs selected from the list comprising Acer spp., Actinidia spp., Abelmoschus spp., Agave sisalana, Agropyron spp., Agrostis stolonifera, Allium spp., Amaranthus spp., Ammophila arenaria, Ananas comosus, Annona spp., Apium graveolens, Arachis spp, Artocarpus spp., Asparagus officinalis, Avena spp. (e.g. Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Averrhoa carambola, Bambusa sp., Benincasa hispida, Bertholletia excelsa, Beta vulgaris, Brassica spp. (e.g. Brassica napus, Brassica rapa ssp. [canola, oilseed rape, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Ceiba pentandra, Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Corchorus sp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Elaeis (e.g. Elaeis guineensis, Elaeis oleifera), Eleusine coracana, Eragrostis tef, Erianthus sp., Eriobotrya japonica, Eucalyptus sp., Eugenia uniflora, Fagopyrum spp., Fagus spp., Festuca arundinacea, Ficus carica, Fortunella spp., Fragaria spp., Ginkgo biloba, Glycine spp. (e.g. Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g. Helianthus annuus), Hemerocallis fulva, Hibiscus spp., Hordeum spp. (e.g. Hordeum vulgare), Ipomoea batatas, Juglans spp., Lactuca sativa, Lathyrus spp., Lens culinaris, Linum usitatissimum, Litchi chinensis, Lotus spp., Luffa acutangula, Lupinus spp., Luzula sylvatica, Lycopersicon spp. (e.g. Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma spp., Malus spp., Malpighia emarginata, Mammea americana, Mangifera indica, Manihot spp., Manilkara zapota, Medicago sativa, Melilotus spp., Mentha spp., Miscanthus sinensis, Momordica spp., Morus nigra, Musa spp., Nicotiana spp., Olea spp., Opuntia spp., Ornithopus spp., Oryza spp. (e.g. Oryza sativa, Oryza latifolia), Panicum miliaceum, Panicum virgatum, Passiflora edulis, Pastinaca sativa, Pennisetum sp., Persea spp., Petroselinum crispum, Phalaris arundinacea, Phaseolus spp., Phleum pratense, Phoenix spp., Phragmites australis, Physalis spp., Pinus spp., Pistacia vera, Pisum spp., Poa spp., Populus spp., Prosopis spp., Prunus spp., Psidium spp., Punica granatum, Pyrus communis, Quercus spp., Raphanus sativus, Rheum rhabarbarum, Ribes spp., Ricinus communis, Rubus spp., Saccharum spp., Salix sp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp. (e.g. Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Sorghum bicolor, Spinacia spp., Syzygium spp., Tagetes spp., Tamarindus indica, Theobroma cacao, Trifolium spp., Tripsacum dactyloides, Triticosecale rimpaui, Triticum spp. (e.g. Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., Viola odorata, Vitis spp., Zea mays, Zizania palustris, Ziziphus spp., amongst others.
DETAILED DESCRIPTION OF THE INVENTION
I NITR
[0112] Surprisingly, it has now been found that modulating expression in a plant of a nucleic acid encoding a NITR polypeptide gives plants having enhanced yield-related traits relative to control plants. According to a first embodiment, the present invention provides a method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding a NITR polypeptide.
[0113] A preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding a NITR polypeptide is by introducing and expressing in a plant a nucleic acid encoding a NITR polypeptide.
[0114] Any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean a NITR polypeptide as defined herein. Any reference hereinafter to a "nucleic acid useful in the methods of the invention" is taken to mean a nucleic acid capable of encoding such a NITR polypeptide. The nucleic acid to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid encoding the type of protein which will now be described, hereafter also named "NITR nucleic acid" or "NITR gene".
[0115] A "NITR polypeptide" as defined herein refers to the nitrite reductase protein represented by SEQ ID NO: 2 and to homologues (orthologues and paralogues) thereof. Nitrite reductases belong to the enzyme class EC 1.7.7.1 and catalyse the reduction of nitrite to ammonium.
[0116] Preferably, the homologues of SEQ ID NO: 2 have a NIR_SIR domain. NIR_SIR domains (Pfam entry PF01077, Nitrite and sulphite reductase 4Fe-4S region) are well known in the art and may readily be identified by persons skilled in the art. Preferably, the NITR polypeptides also comprise one or more of the following domains:
[0117] InterPro: IPR005117 (Nitrite/sulphite reductase, hemoprotein beta-component, ferredoxin-like)
[0118] PFAM: PF03460 (NIR_SIR_ferr)
[0119] InterPro: IPR006066 (Nitrite and sulphite reductase iron-sulphur/siroheme-binding site)
[0120] PRINTS: PR00397 (SIROHAEM)
[0121] PROSITE: PS00365 (NIR_SIR)
[0122] InterPro: IPR006067 (Nitrite and sulphite reductase 4Fe-4S region)
[0123] GENE3D: G3DSA:3.30.413.10 (G3DSA:3.30.413.10)
[0124] Alternatively, the homologue of a NITR protein has in increasing order of preference at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 2, provided that the homologous protein comprises the conserved domains as outlined above. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters and preferably with sequences of mature proteins (i.e. without taking into account secretion signals or transit peptides). Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered.
[0125] Preferably, the polypeptide sequence, when used in the construction of a phylogenetic tree such as the one depicted in FIG. 3, clusters with the group of NITR polypeptides comprising the amino acid sequence represented by SEQ ID NO: 2 rather than with Sulfite Reductases or any other group.
[0126] The terms "domain" and "motif" are defined in the "definitions" section herein. Specialist databases exist for the identification of domains, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32:D134-D137, (2004)), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server (Swiss Institute of Bioinformatics; Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788 (2003)). Domains or motifs may also be identified using routine techniques, such as by sequence alignment.
[0127] Methods for the alignment of sequences for comparison are well known in the art; such methods include GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and Wunsch ((1970) J Mol Biol 48: 443-453) to find the global (i.e. spanning the complete sequences) alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. The BLAST algorithm (Altschul et al. (1990) J Mol Biol 215: 403-10) calculates percent sequence identity and performs a statistical analysis of the similarity between the two sequences. The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (NCBI). Homologues may readily be identified using, for example, the ClustalW multiple sequence alignment algorithm (version 1.83), with the default pairwise alignment parameters, and a scoring method in percentage. Global percentages of similarity and identity may also be determined using one of the methods available in the MatGAT software package (Campanella et al., BMC Bioinformatics. 2003 Jul. 10; 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences.). Minor manual editing may be performed to optimise alignment between conserved motifs, as would be apparent to a person skilled in the art. Furthermore, instead of using full-length sequences for the identification of homologues, specific domains may also be used. The sequence identity values may be determined over the entire nucleic acid or amino acid sequence or over selected domains or conserved motif(s), using the programs mentioned above with the default parameters.
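By way of illustration only, a global alignment of the Needleman and Wunsch type may be sketched as follows. This sketch is not the GAP program: it assumes a simple match/mismatch score and a linear gap penalty (GAP itself uses a substitution matrix and separate gap creation and extension penalties), and the function names and scoring values are hypothetical. Percentage identity is computed here over the full alignment length, which is one of several conventions in use.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of sequences a and b.

    Returns (aligned_a, aligned_b, score). Scoring values are
    illustrative assumptions, not the defaults of any real program.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)
```

For example, aligning the toy sequences "ACGT" and "AGT" with these settings places a single gap opposite the C, and three of the four alignment columns are identical.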
[0128] Furthermore, NITR polypeptides (at least in their native form), as far as SEQ ID NO: 2 and its homologues are concerned, typically have oxidoreductase activity. Tools and techniques for measuring oxidoreductase activity are well known in the art, see for example Ferrari and Varner, Plant Physiol., 47(6), 790-794 (1971).
[0129] Nitrite reductases group together with Sulfite Reductases (EC 1.8.1.2, Hilz et al., Biochem. Z. 332, 151-166, 1959), which catalyse the reaction:
hydrogen sulfide + 3 NADP+ + 3 H2O = sulfite + 3 NADPH + 3 H+
However, it should be noted that the group of Sulfite Reductases is not encompassed by the term NITR polypeptides as used in the present invention.
[0130] The present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 1, encoding the polypeptide sequence of SEQ ID NO: 2. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any NITR-encoding nucleic acid or NITR polypeptide as defined herein (thereby excluding the Sulfite Reductases).
[0131] Examples of nucleic acids encoding NITR polypeptides (such as those provided in FIG. 2 or in the sequence listing) may be found in databases known in the art. Such nucleic acids are useful in performing the methods of the invention. Orthologues and paralogues, the terms "orthologues" and "paralogues" being as defined herein, may readily be identified by performing a so-called reciprocal BLAST search. Typically, this involves a first BLAST in which a query sequence (for example SEQ ID NO: 1 or SEQ ID NO: 2) is BLASTed against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived (where the query sequence is SEQ ID NO: 1 or SEQ ID NO: 2, the second BLAST would therefore be against Arabidopsis thaliana sequences). The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first BLAST is from the same species as that from which the query sequence is derived; a BLAST back then ideally results in the query sequence being amongst the highest hits. An orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as that from which the query sequence is derived, and the BLAST back preferably results in the query sequence being amongst the highest hits.
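The comparison of the first and second BLAST results described above may be sketched, purely for illustration, as follows. The sketch does not run BLAST; it assumes that hit tables have already been obtained and parsed into simple records. The record type, the function names, the cut-off of three top hits and all identifiers in the usage note are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    """One parsed BLAST hit (hypothetical record layout)."""
    subject_id: str
    species: str
    evalue: float


def top_hits(hits: List[Hit], n: int = 3) -> List[Hit]:
    """High-ranking hits are those with the lowest E-values."""
    return sorted(hits, key=lambda h: h.evalue)[:n]


def classify_reciprocal(query_id: str,
                        query_species: str,
                        first_blast: List[Hit],
                        back_blast: Dict[str, List[Hit]],
                        n: int = 3) -> Dict[str, str]:
    """Label high-ranking first-BLAST hits as orthologue or paralogue.

    first_blast: hits of the query against a general database.
    back_blast:  for each first-BLAST subject, its hits against
                 sequences of the query species (the "BLAST back").
    A hit is kept only if the query is among its top back-hits.
    """
    labels = {}
    for hit in top_hits(first_blast, n):
        if hit.subject_id == query_id:
            continue  # the query trivially hits itself
        back = top_hits(back_blast.get(hit.subject_id, []), n)
        if query_id not in {h.subject_id for h in back}:
            continue  # BLAST back does not recover the query
        same_species = hit.species == query_species
        labels[hit.subject_id] = "paralogue" if same_species else "orthologue"
    return labels
```

With invented records, a rice hit whose BLAST back returns the Arabidopsis query among its best hits would be labelled an orthologue, and an Arabidopsis hit meeting the same condition would be labelled a paralogue.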
[0132] High-ranking hits are those having a low E-value. The lower the E-value, the more significant the score (or in other words the lower the chance that the hit was found by chance). Computation of the E-value is well known in the art. In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In the case of large families, ClustalW may be used, followed by a neighbour joining tree, to help visualize clustering of related genes and to identify orthologues and paralogues.
[0133] Nucleic acid variants encoding homologues and derivatives of SEQ ID NO: 2 may also be useful in practising the methods of the invention, the terms "homologue" and "derivative" being as defined herein. Also useful in the methods of the invention are nucleic acids encoding homologues and derivatives of orthologues or paralogues of SEQ ID NO: 2. Homologues and derivatives useful in the methods of the present invention have substantially the same biological and functional activity as the unmodified protein from which they are derived.
[0134] Further nucleic acid variants useful in practising the methods of the invention include portions of nucleic acids encoding NITR polypeptides, nucleic acids hybridising to nucleic acids encoding NITR polypeptides, splice variants of nucleic acids encoding NITR polypeptides, allelic variants of nucleic acids encoding NITR polypeptides and variants of nucleic acids encoding NITR polypeptides obtained by gene shuffling. The terms hybridising sequence, splice variant, allelic variant and gene shuffling are as described herein.
[0135] Nucleic acids encoding NITR polypeptides need not be full-length nucleic acids, since performance of the methods of the invention does not rely on the use of full-length nucleic acid sequences. According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a portion of SEQ ID NO: 1, or a portion of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 2.
[0136] A portion of a nucleic acid may be prepared, for example, by making one or more deletions to the nucleic acid. The portions may be used in isolated form or they may be fused to other coding (or non-coding) sequences in order to, for example, produce a protein that combines several activities. When fused to other coding sequences, the resultant polypeptide produced upon translation may be bigger than that predicted for the protein portion.
[0137] Portions useful in the methods of the invention encode a NITR polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequence given in SEQ ID NO: 2. Preferably, the portion is a portion of the nucleic acid given in SEQ ID NO: 1, or is a portion of a nucleic acid encoding an orthologue or paralogue of the amino acid sequence given in SEQ ID NO: 2. Preferably the portion is at least 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1450, 1500, 1550, 1600, 1650, 1700, 1750 consecutive nucleotides in length, the consecutive nucleotides being of SEQ ID NO: 1, or of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 1.
[0138] Another nucleic acid variant useful in the methods of the invention is a nucleic acid capable of hybridising, under reduced stringency conditions, preferably under stringent conditions, with a nucleic acid encoding a NITR polypeptide as defined herein, or with a portion as defined herein.
[0139] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a nucleic acid capable of hybridising to SEQ ID NO: 1, or comprising introducing and expressing in a plant a nucleic acid capable of hybridising to a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 2.
[0140] Hybridising sequences useful in the methods of the invention encode a NITR polypeptide as defined herein, having substantially the same biological activity as the amino acid sequences given in SEQ ID NO: 2. Preferably, the hybridising sequence is capable of hybridising to SEQ ID NO: 1, or to a portion of any of these sequences, a portion being as defined above, or the hybridising sequence is capable of hybridising to a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 2.
[0141] Another nucleic acid variant useful in the methods of the invention is a splice variant encoding a NITR polypeptide as defined hereinabove, a splice variant being as defined herein.
[0142] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a splice variant of SEQ ID NO: 1, or a splice variant of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 2.
[0143] Another nucleic acid variant useful in performing the methods of the invention is an allelic variant of a nucleic acid encoding a NITR polypeptide as defined hereinabove, an allelic variant being as defined herein.
[0144] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant an allelic variant of SEQ ID NO: 1, or comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding an orthologue, paralogue or homologue of the amino acid sequences represented by SEQ ID NO: 2.
[0145] The allelic variants useful in the methods of the present invention have substantially the same biological activity as the NITR polypeptide of SEQ ID NO: 2. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Gene shuffling or directed evolution may also be used to generate variants of nucleic acids encoding NITR polypeptides as defined above; the term "gene shuffling" being as defined herein.
[0146] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a variant of SEQ ID NO: 1, or comprising introducing and expressing in a plant a variant of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 2, which variant nucleic acid is obtained by gene shuffling.
[0147] Furthermore, nucleic acid variants may also be obtained by site-directed mutagenesis. Several methods are available to achieve site-directed mutagenesis, the most common being PCR-based methods (Current Protocols in Molecular Biology. Wiley Eds.).
[0148] Nucleic acids encoding NITR polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the NITR polypeptide-encoding nucleic acid is from a plant. In the case of SEQ ID NO: 1, the NITR polypeptide-encoding nucleic acid is preferably from a dicotyledonous plant, more preferably from the family Brassicaceae, most preferably the nucleic acid is from Arabidopsis thaliana.
[0149] Performance of the methods of the invention gives plants having enhanced yield-related traits. In particular performance of the methods of the invention gives plants having increased early vigour and increased yield, especially increased biomass and increased seed yield relative to control plants. The terms "yield" and "seed yield" are described in more detail in the "definitions" section herein.
[0150] Reference herein to enhanced yield-related traits is taken to mean an increase in early vigour and/or in biomass (weight) of one or more parts of a plant, which may include aboveground (harvestable) parts and/or (harvestable) parts below ground. In particular, such harvestable parts are biomass and/or seeds, and performance of the methods of the invention results in plants having increased early vigour, biomass and/or seed yield relative to the early vigour, biomass or seed yield of control plants.
[0151] Taking corn as an example, a yield increase may be manifested as one or more of the following: increase in the number of plants established per square meter, an increase in the number of ears per plant, an increase in the number of rows, number of kernels per row, kernel weight, thousand kernel weight, ear length/diameter, increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), among others. Taking rice as an example, a yield increase may manifest itself as an increase in one or more of the following: number of plants per square meter, number of panicles per plant, number of spikelets per panicle, number of flowers (florets) per panicle (which is expressed as a ratio of the number of filled seeds over the number of primary panicles), increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), increase in thousand kernel weight, among others.
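The seed filling rate defined above, and a thousand kernel weight extrapolated from a counted sample, may be computed as in the following minimal sketch. The function names and the choice of grams as the unit are illustrative assumptions and do not reflect any particular measurement protocol.

```python
def seed_filling_rate(filled_seeds, total_seeds):
    """Number of filled seeds divided by total seeds, multiplied by 100."""
    if total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    return 100.0 * filled_seeds / total_seeds


def thousand_kernel_weight(sample_weight_g, kernels_counted):
    """Extrapolate the weight of 1000 kernels from a counted, weighed sample."""
    if kernels_counted <= 0:
        raise ValueError("kernels_counted must be positive")
    return 1000.0 * sample_weight_g / kernels_counted
```

For instance, a plant with 80 filled seeds out of 100 seeds in total has a seed filling rate of 80%.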
[0152] The present invention provides a method for increasing yield, especially biomass and/or seed yield of plants, relative to control plants, which method comprises modulating expression, preferably increasing expression, in a plant of a nucleic acid encoding a NITR polypeptide as defined herein.
[0153] Since the transgenic plants according to the present invention have increased yield, it is likely that these plants exhibit an increased growth rate (during at least part of their life cycle), relative to the growth rate of control plants at a corresponding stage in their life cycle.
[0154] The increased growth rate may be specific to one or more parts of a plant (including seeds), or may be throughout substantially the whole plant. Plants having an increased growth rate may have a shorter life cycle. The life cycle of a plant may be taken to mean the time needed to grow from a dry mature seed up to the stage where the plant has produced dry mature seeds, similar to the starting material. This life cycle may be influenced by factors such as early vigour, growth rate, greenness index, flowering time and speed of seed maturation. The increase in growth rate may take place at one or more stages in the life cycle of a plant or during substantially the whole plant life cycle. Increased growth rate during the early stages in the life cycle of a plant may reflect enhanced vigour. The increase in growth rate may alter the harvest cycle of a plant allowing plants to be sown later and/or harvested sooner than would otherwise be possible (a similar effect may be obtained with earlier flowering time). If the growth rate is sufficiently increased, it may allow for the further sowing of seeds of the same plant species (for example sowing and harvesting of rice plants followed by sowing and harvesting of further rice plants all within one conventional growing period). Similarly, if the growth rate is sufficiently increased, it may allow for the further sowing of seeds of different plant species (for example the sowing and harvesting of corn plants followed by, for example, the sowing and optional harvesting of soybean, potato or any other suitable plant). Harvesting additional times from the same rootstock in the case of some crop plants may also be possible. Altering the harvest cycle of a plant may lead to an increase in annual biomass production per square meter (due to an increase in the number of times (say in a year) that any particular plant may be grown and harvested).
An increase in growth rate may also allow for the cultivation of transgenic plants in a wider geographical area than their wild-type counterparts, since the territorial limitations for growing a crop are often determined by adverse environmental conditions either at the time of planting (early season) or at the time of harvesting (late season). Such adverse conditions may be avoided if the harvest cycle is shortened. The growth rate may be determined by deriving various parameters from growth curves, such parameters may be: T-Mid (the time taken for plants to reach 50% of their maximal size) and T-90 (time taken for plants to reach 90% of their maximal size), amongst others.
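One possible way of deriving T-Mid and T-90 from a measured growth curve is sketched below. The sketch assumes that plant size was recorded at increasing time points, takes the largest recorded size as the maximal size, and interpolates linearly between neighbouring measurements; the function name and these choices are illustrative assumptions rather than a prescribed procedure.

```python
def time_to_fraction(times, sizes, fraction):
    """Earliest time at which size reaches `fraction` of the maximal size.

    times, sizes: equal-length sequences, times strictly increasing.
    The maximal size is assumed to be the largest observed value.
    """
    if len(times) != len(sizes) or not times:
        raise ValueError("times and sizes must be non-empty and equal in length")
    target = fraction * max(sizes)
    for k in range(len(times)):
        if sizes[k] >= target:
            if k == 0:
                return float(times[0])
            # Linear interpolation between the two bracketing measurements
            t0, t1 = times[k - 1], times[k]
            s0, s1 = sizes[k - 1], sizes[k]
            return t0 + (target - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("target size never reached")


def t_mid(times, sizes):
    """Time taken to reach 50% of the maximal size."""
    return time_to_fraction(times, sizes, 0.5)


def t_90(times, sizes):
    """Time taken to reach 90% of the maximal size."""
    return time_to_fraction(times, sizes, 0.9)
```

A lower T-Mid or T-90 for transgenic plants than for control plants measured in the same way would then indicate a faster approach to maximal size.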
[0155] According to a preferred feature of the present invention, performance of the methods of the invention gives plants having an increased growth rate relative to control plants. Therefore, according to the present invention, there is provided a method for increasing the growth rate of plants, which method comprises modulating expression, preferably increasing expression, in a plant of a nucleic acid encoding a NITR polypeptide as defined herein. In a particular embodiment, performance of the methods of the present invention gives plants with increased early vigour.
[0156] An increase in yield and/or growth rate occurs whether the plant is under non-stress conditions or whether the plant is exposed to various stresses compared to control plants. Plants typically respond to exposure to stress by growing more slowly. In conditions of severe stress, the plant may even stop growing altogether. Mild stress on the other hand is defined herein as being any stress to which a plant is exposed which does not result in the plant ceasing to grow altogether without the capacity to resume growth. Mild stress in the sense of the invention leads to a reduction in the growth of the stressed plants of less than 40%, 35% or 30%, preferably less than 25%, 20% or 15%, more preferably less than 14%, 13%, 12%, 11% or 10% or less in comparison to the control plant under non-stress conditions. Due to advances in agricultural practices (irrigation, fertilization, pesticide treatments) severe stresses are not often encountered in cultivated crop plants. As a consequence, the compromised growth induced by mild stress is often an undesirable feature for agriculture. Mild stresses are the everyday biotic and/or abiotic (environmental) stresses to which a plant is exposed. Abiotic stresses may be due to drought or excess water, anaerobic stress, salt stress, chemical toxicity, oxidative stress and hot, cold or freezing temperatures. The abiotic stress may be an osmotic stress caused by a water stress (particularly due to drought), salt stress, oxidative stress or an ionic stress. Biotic stresses are typically those stresses caused by pathogens, such as bacteria, viruses, fungi and insects.
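The growth-reduction criterion for mild stress given above may be expressed as a simple calculation. In this sketch the growth values are any consistent growth measure (for example biomass), the default threshold of 40% corresponds to the broadest value recited above, and the function names are hypothetical.

```python
def growth_reduction_percent(control_growth, stressed_growth):
    """Reduction in growth of stressed plants relative to non-stressed controls."""
    if control_growth <= 0:
        raise ValueError("control_growth must be positive")
    return 100.0 * (control_growth - stressed_growth) / control_growth


def is_mild_stress(control_growth, stressed_growth, threshold_percent=40.0):
    """True if growth continues and the reduction is below the threshold."""
    if stressed_growth <= 0:
        return False  # growth has ceased altogether
    reduction = growth_reduction_percent(control_growth, stressed_growth)
    return reduction < threshold_percent
```

The stricter preferred thresholds (for example 25% or 10%) may be applied by passing a different `threshold_percent`.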
[0157] In particular, the methods of the present invention may be performed under non-stress conditions or under conditions of mild drought to give plants having increased yield relative to control plants. As reported in Wang et al. (Planta (2003) 218: 1-14), abiotic stress leads to a series of morphological, physiological, biochemical and molecular changes that adversely affect plant growth and productivity. Drought, salinity, extreme temperatures and oxidative stress are known to be interconnected and may induce growth and cellular damage through similar mechanisms. Rabbani et al. (Plant Physiol (2003) 133: 1755-1767) describe a particularly high degree of "cross talk" between drought stress and high-salinity stress. For example, drought and/or salinisation are manifested primarily as osmotic stress, resulting in the disruption of homeostasis and ion distribution in the cell. Oxidative stress, which frequently accompanies high or low temperature, salinity or drought stress, may cause denaturing of functional and structural proteins. As a consequence, these diverse environmental stresses often activate similar cell signalling pathways and cellular responses, such as the production of stress proteins, up-regulation of anti-oxidants, accumulation of compatible solutes and growth arrest. The term "non-stress conditions" as used herein refers to those environmental conditions that allow optimal growth of plants. Persons skilled in the art are aware of normal soil conditions and climatic conditions for a given location.
[0158] Performance of the methods of the invention gives plants grown under non-stress conditions or under mild drought conditions increased yield and/or increased early vigour, relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield and/or early vigour in plants grown under non-stress conditions or under mild drought conditions, which method comprises increasing expression in a plant of a nucleic acid encoding a NITR polypeptide.
[0159] Performance of the methods of the invention gives plants grown under conditions of nutrient deficiency, particularly under conditions of nitrogen deficiency, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of nutrient deficiency, which method comprises increasing expression in a plant of a nucleic acid encoding a NITR polypeptide. Nutrient deficiency may result from a lack of nutrients such as nitrogen, phosphates and other phosphorus-containing compounds, potassium, calcium, cadmium, magnesium, manganese, iron and boron, amongst others.
[0160] Performance of the methods of the invention gives plants grown under conditions of salt stress increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of salt stress, which method comprises modulating expression in a plant of a nucleic acid encoding a NITR polypeptide. The term salt stress is not restricted to common salt (NaCl), but may be any one or more of: NaCl, KCl, LiCl, MgCl2, CaCl2, amongst others.
[0161] The present invention encompasses plants or parts thereof (including seeds) obtainable by the methods according to the present invention. The plants or parts thereof comprise a nucleic acid transgene encoding a NITR polypeptide as defined above.
[0162] The invention also provides genetic constructs and vectors to facilitate introduction and/or expression in plants of nucleic acids encoding NITR polypeptides. The gene constructs may be inserted into vectors, which may be commercially available, suitable for transforming into plants and suitable for expression of the gene of interest in the transformed cells. The invention also provides use of a gene construct as defined herein in the methods of the invention.
[0163] More specifically, the present invention provides a construct comprising:
[0164] (a) a nucleic acid encoding a NITR polypeptide as defined above;
[0165] (b) one or more control sequences capable of driving expression of the nucleic acid sequence of (a); and optionally
[0166] (c) a transcription termination sequence.
[0167] Preferably, the nucleic acid encoding a NITR polypeptide is as defined above. The terms "control sequence" and "termination sequence" are as defined herein.
[0168] Plants are transformed with a vector comprising any of the nucleic acids described above. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter).
[0169] Advantageously, any type of promoter, whether natural or synthetic, may be used to drive expression of the nucleic acid sequence. A constitutive promoter is particularly useful in the methods of the invention. Preferably the constitutive promoter is also a ubiquitous promoter. See the "Definitions" section herein for definitions of the various promoter types.
[0170] It should be clear that the applicability of the present invention is not restricted to the NITR polypeptide-encoding nucleic acid represented by SEQ ID NO: 1, nor is the applicability of the invention restricted to expression of a NITR polypeptide-encoding nucleic acid when driven by a constitutive promoter.
[0171] The constitutive promoter is preferably a medium strength promoter of plant origin, preferably a GOS2 promoter, more preferably a GOS2 promoter from rice. Further preferably the constitutive promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 3, most preferably the constitutive promoter is as represented by SEQ ID NO: 3. See Table 2 in the "Definitions" section herein for further examples of constitutive promoters.
[0172] Optionally, one or more terminator sequences may be used in the construct introduced into a plant. Additional regulatory elements may include transcriptional as well as translational enhancers. Those skilled in the art will be aware of terminator and enhancer sequences that may be suitable for use in performing the invention. An intron sequence may also be added to the 5' untranslated region (UTR) or in the coding sequence to increase the amount of the mature message that accumulates in the cytosol, as described in the definitions section. Other control sequences (besides promoter, enhancer, silencer, intron sequences, 3'UTR and/or 5'UTR regions) may be protein and/or RNA stabilizing elements. Such sequences would be known or may readily be obtained by a person skilled in the art.
[0173] The genetic constructs of the invention may further include an origin of replication sequence that is required for maintenance and/or replication in a specific cell type. One example is when a genetic construct is required to be maintained in a bacterial cell as an episomal genetic element (e.g. plasmid or cosmid molecule). Preferred origins of replication include, but are not limited to, the f1-ori and colE1.
[0174] For the detection of the successful transfer of the nucleic acid sequences as used in the methods of the invention and/or selection of transgenic plants comprising these nucleic acids, it is advantageous to use marker genes (or reporter genes). Therefore, the genetic construct may optionally comprise a selectable marker gene. Selectable markers are described in more detail in the "definitions" section herein. The marker genes may be removed or excised from the transgenic cell once they are no longer needed. Techniques for marker removal are known in the art; useful techniques are described above in the definitions section.
[0175] The invention also provides a method for the production of transgenic plants having enhanced yield-related traits relative to control plants, comprising introduction and expression in a plant of any nucleic acid encoding a NITR polypeptide as defined hereinabove.
[0176] More specifically, the present invention provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased early vigour and/or increased yield, which method comprises:
[0177] (i) introducing and expressing in a plant or plant cell a NITR polypeptide-encoding nucleic acid; and
[0178] (ii) cultivating the plant cell under conditions promoting plant growth and development.
[0179] The nucleic acid of (i) may be any of the nucleic acids capable of encoding a NITR polypeptide as defined herein.
[0180] The nucleic acid may be introduced directly into a plant cell or into the plant itself (including introduction into a tissue, organ or any other part of a plant). According to a preferred feature of the present invention, the nucleic acid is preferably introduced into a plant by transformation. The term "transformation" is described in more detail in the "definitions" section herein.
[0181] The genetically modified plant cells can be regenerated via all methods with which the skilled worker is familiar. Suitable methods can be found in the abovementioned publications by S. D. Kung and R. Wu, Potrykus or Hofgen and Willmitzer.
[0182] Generally after transformation, plant cells or cell groupings are selected for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the gene of interest, following which the transformed material is regenerated into a whole plant. To select transformed plants, the plant material obtained in the transformation is, as a rule, subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility consists in growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Alternatively, the transformed plants are screened for the presence of a selectable marker such as the ones described above.
[0183] Following DNA transfer and regeneration, putatively transformed plants may also be evaluated, for instance using Southern analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western analysis, both techniques being well known to persons having ordinary skill in the art.
[0184] The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques. The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).
[0185] The present invention clearly extends to any plant cell or plant produced by any of the methods described herein, and to all plant parts and propagules thereof. The present invention extends further to encompass the progeny of a primary transformed or transfected cell, tissue, organ or whole plant that has been produced by any of the aforementioned methods, the only requirement being that progeny exhibit the same genotypic and/or phenotypic characteristic(s) as those produced by the parent in the methods according to the invention.
[0186] The invention also includes host cells containing an isolated nucleic acid encoding a NITR polypeptide as defined hereinabove. Preferred host cells according to the invention are plant cells. Host plants for the nucleic acids or the vector used in the method according to the invention, the expression cassette or construct or vector are, in principle, advantageously all plants, which are capable of synthesizing the polypeptides used in the inventive method.
[0187] The methods of the invention are advantageously applicable to any plant. Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs. According to a preferred embodiment of the present invention, the plant is a crop plant. Examples of crop plants include soybean, sunflower, canola, alfalfa, rapeseed, cotton, tomato, potato and tobacco. Further preferably, the plant is a monocotyledonous plant. Examples of monocotyledonous plants include sugarcane. More preferably the plant is a cereal. Examples of cereals include rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, secale, einkorn, teff, milo and oats.
[0188] The invention also extends to harvestable parts of a plant such as, but not limited to seeds, leaves, fruits, flowers, stems, roots, rhizomes, tubers and bulbs, which harvestable parts comprise a recombinant nucleic acid encoding a NITR polypeptide. The invention furthermore relates to products derived, preferably directly derived, from a harvestable part of such a plant, such as dry pellets or powders, oil, fat and fatty acids, starch or proteins.
[0189] According to a preferred feature of the invention, the modulated expression is increased expression. Methods for increasing expression of nucleic acids or genes, or gene products, are well documented in the art and examples are provided in the definitions section.
[0190] As mentioned above, a preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding a NITR polypeptide is by introducing and expressing in a plant a nucleic acid encoding a NITR polypeptide; however the effects of performing the method, i.e. enhancing yield-related traits may also be achieved using other well known techniques, including but not limited to T-DNA activation tagging, TILLING, homologous recombination. A description of these techniques is provided in the definitions section.
[0191] The present invention also encompasses use of nucleic acids encoding NITR polypeptides as described herein and use of these NITR polypeptides in enhancing any of the aforementioned yield-related traits in plants.
[0192] Nucleic acids encoding the NITR polypeptides described herein, or the NITR polypeptides themselves, may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to a NITR polypeptide-encoding gene. The nucleic acids/genes, or the NITR polypeptides themselves, may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having enhanced yield-related traits as defined hereinabove in the methods of the invention.
[0193] Allelic variants of a NITR polypeptide-encoding nucleic acid/gene may also find use in marker-assisted breeding programmes. Such breeding programmes sometimes require introduction of allelic variation by mutagenic treatment of the plants, using for example EMS mutagenesis; alternatively, the programme may start with a collection of allelic variants of so called "natural" origin caused unintentionally. Identification of allelic variants then takes place, for example, by PCR. This is followed by a step for selection of superior allelic variants of the sequence in question and which give increased yield. Selection is typically carried out by monitoring growth performance of plants containing different allelic variants of the sequence in question. Growth performance may be monitored in a greenhouse or in the field. Further optional steps include crossing plants in which the superior allelic variant was identified with another plant. This could be used, for example, to make a combination of interesting phenotypic features.
[0194] Nucleic acids encoding NITR polypeptides may also be used as probes for genetically and physically mapping the genes that they are a part of, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. Such use of NITR polypeptide-encoding nucleic acids requires only a nucleic acid sequence of at least 15 nucleotides in length. The NITR polypeptide-encoding nucleic acids may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Sambrook J, Fritsch E F and Maniatis T (1989) Molecular Cloning, A Laboratory Manual) of restriction-digested plant genomic DNA may be probed with the NITR-encoding nucleic acids. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al. (1987) Genomics 1: 174-181) in order to construct a genetic map. In addition, the nucleic acids may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the NITR polypeptide-encoding nucleic acid in the genetic map previously obtained using this population (Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331).
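By way of illustration only, the segregation data from such a cross could be reduced to a pairwise map distance roughly as sketched below. The sketch assumes a backcross population, a hypothetical one-letter genotype coding ('A' or 'B' per individual) and the Haldane mapping function; none of these choices is taken from the present description, and programs such as MapMaker perform a fuller multipoint analysis.

```python
import math


def recombination_fraction(marker_a, marker_b):
    """Fraction of progeny in which the two markers show different parental alleles.

    marker_a, marker_b: equal-length strings, one character per individual,
    'A' = allele from parent 1, 'B' = allele from parent 2 (hypothetical coding).
    """
    if len(marker_a) != len(marker_b) or not marker_a:
        raise ValueError("marker score strings must be non-empty and equal in length")
    recombinants = sum(1 for a, b in zip(marker_a, marker_b) if a != b)
    return recombinants / len(marker_a)


def haldane_cm(r):
    """Haldane map distance in centimorgans for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Toy scores for ten backcross individuals (invented data, not from the application).
probe_scores = "AABBABABBA"
neighbour_scores = "AABBABABBB"
r = recombination_fraction(probe_scores, neighbour_scores)
distance = haldane_cm(r)
```

With these invented scores one individual in ten is recombinant, so r is 0.1 and the Haldane distance is about 11 cM.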
[0195] The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol. Biol. Reporter 4: 37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.
[0196] The nucleic acid probes may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel et al. In: Non-mammalian Genomic Analysis: A Practical Guide, Academic press 1996, pp. 319-346, and references cited therein).
[0197] In another embodiment, the nucleic acid probes may be used in direct fluorescence in situ hybridisation (FISH) mapping (Trask (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favour use of large clones (several kb to several hundred kb; see Laan et al. (1995) Genome Res. 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.
[0198] A variety of nucleic acid amplification-based methods for genetic and physical mapping may be carried out using the nucleic acids. Examples include allele-specific amplification (Kazazian (1989) J. Lab. Clin. Med 11:95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat. Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.
[0199] The methods according to the present invention result in plants having enhanced yield-related traits, as described hereinbefore. These traits may also be combined with other economically advantageous traits, such as further yield-enhancing traits, tolerance to other abiotic and biotic stresses, traits modifying various architectural features and/or biochemical and/or physiological features.
II ASNS
[0200] Surprisingly, it has now been found that modulating expression in a plant of a nucleic acid encoding an ASNS polypeptide gives plants having enhanced yield-related traits relative to control plants. According to a first embodiment, the present invention provides a method for enhancing yield-related traits in plants relative to control plants, comprising modulating expression in a plant of a nucleic acid encoding an ASNS polypeptide.
[0201] A preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding an ASNS polypeptide is by introducing and expressing in a plant a nucleic acid encoding an ASNS polypeptide.
[0202] Any reference hereinafter to a "protein useful in the methods of the invention" is taken to mean an ASNS polypeptide as defined herein. Any reference hereinafter to a "nucleic acid useful in the methods of the invention" is taken to mean a nucleic acid capable of encoding such an ASNS polypeptide. The nucleic acid to be introduced into a plant (and therefore useful in performing the methods of the invention) is any nucleic acid encoding the type of protein which will now be described, hereafter also named "ASNS nucleic acid" or "ASNS gene".
[0203] An "ASNS polypeptide" as defined herein refers to the Asparagine synthetase represented by SEQ ID NO: 63 and to homologues (orthologues and paralogues) thereof. SEQ ID NO: 63 comprises, compared to the wild type sequence (Os06g0265000, SEQ ID NO: 67), two point mutations: R382G and S165G (FIG. 6). Arginine on position 382 in SEQ ID NO: 67 is highly conserved among Asparagine synthetases and may be part of a large alpha-helix which delimits the molecular tunnel between the two active sites. It is also close to the AMP binding site. Serine on position 165 may be located in a distorted alpha-helix region, on the external side of the glutamine binding site according to the structure derived from E. coli. It is postulated that the S165G mutation will probably have little impact on the structure of this region.
[0204] Therefore, ASNS polypeptides useful in the methods of the present invention preferably have a substitution of the Arginine residue that corresponds to R382 in SEQ ID NO: 67, into an amino acid that distorts the alpha-helix, preferably into a Glycine. Optionally ASNS polypeptides useful in the methods of the present invention additionally have a substitution of the Serine residue that corresponds to S165 in SEQ ID NO: 67, into another amino acid, preferably into a Glycine. Arg residues corresponding to R382 in SEQ ID NO: 67 or Ser residues corresponding to S165 can be identified by aligning the amino acid sequence to the one of SEQ ID NO: 67, see for example the multiple alignment in FIG. 4. Such alignment methods are well known in the art.
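As a minimal sketch of how a residue corresponding to a given reference position might be read off a pairwise alignment, the following walks the alignment columns and counts ungapped residues in each row. The function name and the toy aligned strings are hypothetical and are not taken from SEQ ID NO: 67 or from FIG. 4; the gap character '-' is an assumed convention.

```python
def residue_at_reference_position(aligned_ref, aligned_query, ref_pos):
    """Return (query_pos, query_residue) aligned to 1-based position ref_pos of the reference.

    aligned_ref / aligned_query: two rows of the same alignment, '-' marks a gap.
    Returns (None, '-') when the query has a gap in that column.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned rows must have equal length")
    ref_count = 0
    query_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != '-':
            ref_count += 1
        if query_char != '-':
            query_count += 1
        # Stop at the column holding the requested reference residue.
        if ref_char != '-' and ref_count == ref_pos:
            if query_char == '-':
                return None, '-'
            return query_count, query_char
    raise ValueError("ref_pos lies beyond the reference sequence")


# Toy alignment rows (invented): the reference has a gap where the query carries 'A'.
toy_ref = "MK-RLS"
toy_query = "MKARGS"
position, residue = residue_at_reference_position(toy_ref, toy_query, 3)
```

In this toy case the third reference residue (R) aligns with the fourth residue of the query, so a substitution at the corresponding position would be checked at query position 4.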
[0205] Preferably, the homologues of SEQ ID NO: 63 have an Asn_synthase domain. Asn_synthase domains (Pfam entry PF00733) are well known in the art and may readily be identified by persons skilled in the art. Besides the Asn_synthase domain, ASNS polypeptides preferably also have a Glutamine amidotransferase, class-II domain (InterPro IPR000583; GATase_2 (HMMPfam entry PF00310), GATase_Type_II (PROSITE entry PS00443)) and/or an Asparagine synthase, glutamine-hydrolyzing domain (asn_synth_AEB: asparagine synthase (glutami (TIGRFAMs entry TIGR01536)).
[0206] Alternatively, the homologue of an ASNS protein has in increasing order of preference at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% overall sequence identity to the amino acid sequence represented by SEQ ID NO: 63, provided that the homologous protein comprises the conserved motifs as outlined above and the substitution of the Arg residue that corresponds to R382 in SEQ ID NO: 67. The overall sequence identity is determined using a global alignment algorithm, such as the Needleman Wunsch algorithm in the program GAP (GCG Wisconsin Package, Accelrys), preferably with default parameters. Compared to overall sequence identity, the sequence identity will generally be higher when only conserved domains or motifs are considered.
[0207] The terms "domain" and "motif" are defined in the "definitions" section herein. Specialist databases exist for the identification of domains, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32:D134-D137, (2004)), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server (Swiss Institute of Bioinformatics; Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788 (2003)). Domains or motifs may also be identified using routine techniques, such as by sequence alignment.
[0208] Methods for the alignment of sequences for comparison are well known in the art; such methods include GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and Wunsch ((1970) J Mol Biol 48: 443-453) to find the global (i.e. spanning the complete sequences) alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. The BLAST algorithm (Altschul et al. (1990) J Mol Biol 215: 403-10) calculates percent sequence identity and performs a statistical analysis of the similarity between the two sequences. The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (NCBI). Homologues may readily be identified using, for example, the ClustalW multiple sequence alignment algorithm (version 1.83), with the default pairwise alignment parameters, and a scoring method in percentage. Global percentages of similarity and identity may also be determined using one of the methods available in the MatGAT software package (Campanella et al., BMC Bioinformatics. 2003 Jul. 10; 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences.). Minor manual editing may be performed to optimise alignment between conserved motifs, as would be apparent to a person skilled in the art. Furthermore, instead of using full-length sequences for the identification of homologues, specific domains may also be used. The sequence identity values may be determined over the entire nucleic acid or amino acid sequence or over selected domains or conserved motif(s), using the programs mentioned above with the default parameters. For local alignments, the Smith-Waterman algorithm is particularly useful (Smith T F, Waterman M S (1981) J. Mol. Biol 147(1): 195-7).
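The global alignment principle referred to above may be illustrated by the following minimal sketch. It is not the GAP program: it assumes a simple scoring scheme (match +1, mismatch -1, linear gap penalty -1) in place of a substitution matrix and affine gap penalties, and it reports identity as identical columns divided by the alignment length, which is only one of several conventions in use. All function names are hypothetical.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Global alignment with a linear gap penalty; returns the two gapped rows."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + step,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner, preferring the diagonal move.
    out_a, out_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(seq_a[i - 1])
                out_b.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append('-')
            i -= 1
        else:
            out_a.append('-')
            out_b.append(seq_b[j - 1])
            j -= 1
    return ''.join(reversed(out_a)), ''.join(reversed(out_b))


def percent_identity(row_a, row_b):
    """Identical columns as a percentage of all alignment columns (assumed convention)."""
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != '-')
    return 100.0 * matches / len(row_a)


# Toy sequences (invented); real use would pass full-length polypeptide sequences.
row_a, row_b = needleman_wunsch("ACGT", "AGT")
identity = percent_identity(row_a, row_b)
```

For the toy input the alignment is ACGT over A-GT, giving three identical columns out of four.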
[0209] Furthermore, ASNS polypeptides (at least in their native form), as far as SEQ ID NO: 2 and its homologues are concerned, typically have asparagine synthetase activity (Patterson and Orr, J. Biol. Chem. 243, 376-380, 1968; Enzyme Catalogue 6.3.5.4, reaction scheme:
ATP + L-aspartate + L-glutamine + H2O = AMP + diphosphate + L-asparagine + L-glutamate).
Tools and techniques for measuring asparagine synthetase activity are well known in the art.
[0210] The present invention is illustrated by transforming plants with the nucleic acid sequence represented by SEQ ID NO: 62, encoding the polypeptide sequence of SEQ ID NO: 63. However, performance of the invention is not restricted to these sequences; the methods of the invention may advantageously be performed using any ASNS-encoding nucleic acid or ASNS polypeptide as defined herein.
[0211] Examples of nucleic acids encoding ASNS polypeptides may be found in databases known in the art, and some of them are listed in FIG. 5. Such nucleic acids are useful in performing the methods of the invention. Orthologues and paralogues, the terms "orthologues" and "paralogues" being as defined herein, may readily be identified by performing a so-called reciprocal BLAST search. Typically, this involves a first BLAST in which a query sequence (for example SEQ ID NO: 63) is BLASTed against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) are generally used when starting from a nucleotide sequence, and BLASTP or TBLASTN (using standard default values) when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived (where the query sequence is SEQ ID NO: 62 or SEQ ID NO: 63, the second BLAST would therefore be against Oryza sativa sequences). The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the first BLAST is from the same species as that from which the query sequence is derived; a BLAST back then ideally results in the query sequence being amongst the highest hits. An orthologue is identified if a high-ranking hit in the first BLAST is not from the same species as that from which the query sequence is derived, and preferably results upon BLAST back in the query sequence being among the highest hits. Examples of orthologues and paralogues of SEQ ID NO: 63 or SEQ ID NO: 67 are listed in FIG. 5.
[0212] High-ranking hits are those having a low E-value. The lower the E-value, the more significant the score (or in other words the lower the chance that the hit was found by chance). Computation of the E-value is well known in the art. In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In the case of large families, ClustalW may be used, followed by a neighbour joining tree, to help visualize clustering of related genes and to identify orthologues and paralogues.
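The reciprocal BLAST comparison may be sketched as follows, assuming the two BLAST searches have already been run and their results parsed into simple Python structures. The data layout, the identifiers, the E-values and the `top_n` cut-off for "among the highest hits" are all hypothetical choices made for the example; no BLAST program is invoked here.

```python
def classify_hits(query_id, query_species, first_hits, back_hits, top_n=3):
    """Label first-BLAST hits as putative orthologues or paralogues.

    first_hits: list of (subject_id, subject_species, e_value) from the first BLAST.
    back_hits:  dict subject_id -> list of (hit_id, e_value) from the second BLAST
                against sequences of the query organism.
    top_n:      number of best back-hits that count as "among the highest hits"
                (an assumed cut-off).
    """
    result = {}
    # Lower E-values rank higher, so sort ascending by E-value.
    for subject_id, species, _e_value in sorted(first_hits, key=lambda h: h[2]):
        if subject_id == query_id:
            continue  # the query trivially finds itself
        ranked = sorted(back_hits.get(subject_id, []), key=lambda h: h[1])
        best_ids = [hit_id for hit_id, _ in ranked[:top_n]]
        if query_id not in best_ids:
            continue  # not confirmed by the BLAST back
        result[subject_id] = "paralogue" if species == query_species else "orthologue"
    return result


# Invented identifiers and E-values, for illustration only.
first_hits = [
    ("zmA", "Zea mays", 1e-150),
    ("osB", "Oryza sativa", 1e-90),
    ("atC", "Arabidopsis thaliana", 1e-80),
]
back_hits = {
    "zmA": [("osQ", 1e-150), ("osB", 1e-95)],
    "osB": [("osB", 0.0), ("osQ", 1e-90)],
    "atC": [("osX", 1e-70), ("osY", 1e-60), ("osZ", 1e-50), ("osQ", 1e-10)],
}
labels = classify_hits("osQ", "Oryza sativa", first_hits, back_hits)
```

In this toy data set the maize hit is labelled a putative orthologue and the rice hit a putative paralogue, while the third hit is dropped because the query is not among its three best back-hits.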
[0213] Nucleic acid variants encoding homologues and derivatives of SEQ ID NO: 63 may also be useful in practising the methods of the invention, the terms "homologue" and "derivative" being as defined herein. Also useful in the methods of the invention are nucleic acids encoding homologues and derivatives of orthologues or paralogues of SEQ ID NO: 63. Homologues and derivatives useful in the methods of the present invention have substantially the same biological and functional activity as the unmodified protein from which they are derived.
[0214] Further nucleic acid variants useful in practising the methods of the invention include portions of nucleic acids encoding ASNS polypeptides, nucleic acids hybridising to nucleic acids encoding ASNS polypeptides, splice variants of nucleic acids encoding ASNS polypeptides, allelic variants of nucleic acids encoding ASNS polypeptides and variants of nucleic acids encoding ASNS polypeptides obtained by gene shuffling. The terms hybridising sequence, splice variant, allelic variant and gene shuffling are as described herein.
[0215] Nucleic acids encoding ASNS polypeptides need not be full-length nucleic acids, since performance of the methods of the invention does not rely on the use of full-length nucleic acid sequences. According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a portion of SEQ ID NO: 62, or a portion of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 63.
[0216] A portion of a nucleic acid may be prepared, for example, by making one or more deletions to the nucleic acid. The portions may be used in isolated form or they may be fused to other coding (or non-coding) sequences in order to, for example, produce a protein that combines several activities. When fused to other coding sequences, the resultant polypeptide produced upon translation may be bigger than that predicted for the protein portion.
[0217] Portions useful in the methods of the invention encode an ASNS polypeptide as defined herein, and have substantially the same biological activity as the amino acid sequence given in SEQ ID NO: 63. Preferably, the portion is a portion of the nucleic acid given in SEQ ID NO: 62, or is a portion of a nucleic acid encoding an orthologue or paralogue of the amino acid sequence given in SEQ ID NO: 63. Preferably the portion is at least 800, 900, 1000, 1100, 1200, 1300, 1400, 1450, 1500, 1550, 1600, 1650, 1700 or 1750 consecutive nucleotides in length, the consecutive nucleotides being of SEQ ID NO: 62, or of a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 63. Most preferably the portion is a portion of the nucleic acid of SEQ ID NO: 62.
[0218] Another nucleic acid variant useful in the methods of the invention is a nucleic acid capable of hybridising, under reduced stringency conditions, preferably under stringent conditions, with a nucleic acid encoding an ASNS polypeptide as defined herein, or with a portion as defined herein.
[0219] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a nucleic acid capable of hybridising to SEQ ID NO: 62, or comprising introducing and expressing in a plant a nucleic acid capable of hybridising to a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 63.
[0220] Hybridising sequences useful in the methods of the invention encode an ASNS polypeptide as defined herein, having substantially the same biological activity as the amino acid sequences given in SEQ ID NO: 63. Preferably, the hybridising sequence is capable of hybridising to SEQ ID NO: 62, or to a portion of any of these sequences, a portion being as defined above, or the hybridising sequence is capable of hybridising to a nucleic acid encoding an orthologue or paralogue of SEQ ID NO: 63.
[0221] Another nucleic acid variant useful in the methods of the invention is a splice variant encoding an ASNS polypeptide as defined hereinabove, a splice variant being as defined herein.
[0222] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a splice variant of SEQ ID NO: 62, or a splice variant of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 63.
[0223] Another nucleic acid variant useful in performing the methods of the invention is an allelic variant of a nucleic acid encoding an ASNS polypeptide as defined hereinabove, an allelic variant being as defined herein.
[0224] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant an allelic variant of SEQ ID NO: 62, or comprising introducing and expressing in a plant an allelic variant of a nucleic acid encoding an orthologue, paralogue or homologue of the amino acid sequences represented by SEQ ID NO: 63.
[0225] The allelic variants useful in the methods of the present invention have substantially the same biological activity as the ASNS polypeptide of SEQ ID NO: 63. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Gene shuffling or directed evolution may also be used to generate variants of nucleic acids encoding ASNS polypeptides as defined above; the term "gene shuffling" being as defined herein.
[0226] According to the present invention, there is provided a method for enhancing yield-related traits in plants, comprising introducing and expressing in a plant a variant of SEQ ID NO: 62, or comprising introducing and expressing in a plant a variant of a nucleic acid encoding an orthologue, paralogue or homologue of SEQ ID NO: 63, which variant nucleic acid is obtained by gene shuffling.
[0227] Furthermore, nucleic acid variants may also be obtained by site-directed mutagenesis. Several methods are available to achieve site-directed mutagenesis, the most common being PCR based methods (Current Protocols in Molecular Biology. Wiley Eds.).
[0228] Nucleic acids encoding ASNS polypeptides may be derived from any natural or artificial source. The nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. Preferably the ASNS polypeptide-encoding nucleic acid is from a plant. In the case of SEQ ID NO: 62, the ASNS polypeptide encoding nucleic acid is preferably from a monocotyledonous plant, more preferably from the family Poaceae, most preferably the nucleic acid is from Oryza sativa.
[0229] Performance of the methods of the invention gives plants having enhanced yield-related traits. In particular performance of the methods of the invention gives plants having increased early vigour and increased yield, especially increased biomass and increased seed yield relative to control plants. The terms "yield" and "seed yield" are described in more detail in the "definitions" section herein.
[0230] Reference herein to enhanced yield-related traits is taken to mean an increase in early vigour and/or in biomass (weight) of one or more parts of a plant, which may include aboveground (harvestable) parts and/or (harvestable) parts below ground. In particular, such harvestable parts are biomass and/or seeds, and performance of the methods of the invention results in plants having increased early vigour, biomass and/or seed yield relative to the early vigour, biomass or seed yield of control plants.
[0231] Taking corn as an example, a yield increase may be manifested as one or more of the following: increase in the number of plants established per square meter, an increase in the number of ears per plant, an increase in the number of rows, number of kernels per row, kernel weight, thousand kernel weight, ear length/diameter, increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), among others. Taking rice as an example, a yield increase may manifest itself as an increase in one or more of the following: number of plants per square meter, number of panicles per plant, number of spikelets per panicle, number of flowers (florets) per panicle (which is expressed as a ratio of the number of filled seeds over the number of primary panicles), increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), increase in thousand kernel weight, among others.
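The seed filling rate defined above is a simple ratio, which may be sketched as follows. The function name and the example counts are hypothetical and serve only to make the arithmetic explicit.

```python
def seed_filling_rate(filled_seeds, total_seeds):
    """Seed filling rate in percent: filled seeds divided by total seeds, times 100."""
    if total_seeds <= 0:
        raise ValueError("total_seeds must be positive")
    if not 0 <= filled_seeds <= total_seeds:
        raise ValueError("filled_seeds must lie between 0 and total_seeds")
    return 100.0 * filled_seeds / total_seeds


# Invented counts for a single plant.
rate = seed_filling_rate(filled_seeds=412, total_seeds=515)
```

With these invented counts, 412 filled seeds out of 515 give a filling rate of 80%.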
[0232] The present invention provides a method for increasing yield, especially biomass and/or seed yield of plants, relative to control plants, which method comprises modulating expression, preferably increasing expression, in a plant of a nucleic acid encoding an ASNS polypeptide as defined herein.
[0233] Since the transgenic plants according to the present invention have increased yield, it is likely that these plants exhibit an increased growth rate (during at least part of their life cycle), relative to the growth rate of control plants at a corresponding stage in their life cycle.
[0234] The increased growth rate may be specific to one or more parts of a plant (including seeds), or may be throughout substantially the whole plant. Plants having an increased growth rate may have a shorter life cycle. The life cycle of a plant may be taken to mean the time needed to grow from a dry mature seed up to the stage where the plant has produced dry mature seeds, similar to the starting material. This life cycle may be influenced by factors such as early vigour, growth rate, greenness index, flowering time and speed of seed maturation. The increase in growth rate may take place at one or more stages in the life cycle of a plant or during substantially the whole plant life cycle. Increased growth rate during the early stages in the life cycle of a plant may reflect enhanced vigour. The increase in growth rate may alter the harvest cycle of a plant allowing plants to be sown later and/or harvested sooner than would otherwise be possible (a similar effect may be obtained with earlier flowering time). If the growth rate is sufficiently increased, it may allow for the further sowing of seeds of the same plant species (for example sowing and harvesting of rice plants followed by sowing and harvesting of further rice plants all within one conventional growing period). Similarly, if the growth rate is sufficiently increased, it may allow for the further sowing of seeds of different plant species (for example the sowing and harvesting of corn plants followed by, for example, the sowing and optional harvesting of soybean, potato or any other suitable plant). Harvesting additional times from the same rootstock in the case of some crop plants may also be possible. Altering the harvest cycle of a plant may lead to an increase in annual biomass production per square meter (due to an increase in the number of times (say in a year) that any particular plant may be grown and harvested).
An increase in growth rate may also allow for the cultivation of transgenic plants in a wider geographical area than their wild-type counterparts, since the territorial limitations for growing a crop are often determined by adverse environmental conditions either at the time of planting (early season) or at the time of harvesting (late season). Such adverse conditions may be avoided if the harvest cycle is shortened. The growth rate may be determined by deriving various parameters from growth curves, such parameters may be: T-Mid (the time taken for plants to reach 50% of their maximal size) and T-90 (time taken for plants to reach 90% of their maximal size), amongst others.
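Parameters such as T-Mid and T-90 may be derived from a measured growth curve roughly as in the following sketch. It assumes that plant size was recorded at discrete time points and that linear interpolation between neighbouring measurements is acceptable; the function name, the interpolation choice and the example measurements are assumptions made for illustration.

```python
def time_to_fraction(times, sizes, fraction):
    """First time at which size reaches fraction * max(sizes), by linear interpolation.

    times: measurement times in increasing order (e.g. days after sowing).
    sizes: plant size at each time point (e.g. leaf area in arbitrary units).
    """
    if len(times) != len(sizes) or not times:
        raise ValueError("times and sizes must be non-empty and equal in length")
    target = fraction * max(sizes)
    if sizes[0] >= target:
        return float(times[0])
    for k in range(1, len(sizes)):
        if sizes[k] >= target:
            t0, t1 = times[k - 1], times[k]
            s0, s1 = sizes[k - 1], sizes[k]
            # s0 < target <= s1 here, so the denominator is positive.
            return t0 + (target - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("target size is never reached")


# Invented growth curve: size measured every ten days.
days = [0, 10, 20, 30]
size = [0, 20, 80, 100]
t_mid = time_to_fraction(days, size, 0.5)
t_90 = time_to_fraction(days, size, 0.9)
```

For this invented curve, T-Mid falls at day 15 and T-90 at day 25; a plant line with a higher growth rate would show smaller values for both parameters.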
[0235] According to a preferred feature of the present invention, performance of the methods of the invention gives plants having an increased growth rate relative to control plants. Therefore, according to the present invention, there is provided a method for increasing the growth rate of plants, which method comprises modulating expression, preferably increasing expression, in a plant of a nucleic acid encoding an ASNS polypeptide as defined herein. In a particular embodiment, performance of the methods of the present invention gives plants with increased early vigour.
[0236] An increase in yield and/or growth rate occurs whether the plant is under non-stress conditions or whether the plant is exposed to various stresses compared to control plants. Plants typically respond to exposure to stress by growing more slowly. In conditions of severe stress, the plant may even stop growing altogether. Mild stress on the other hand is defined herein as being any stress to which a plant is exposed which does not result in the plant ceasing to grow altogether without the capacity to resume growth. Mild stress in the sense of the invention leads to a reduction in the growth of the stressed plants of less than 40%, 35% or 30%, preferably less than 25%, 20% or 15%, more preferably less than 14%, 13%, 12%, 11% or 10% or less in comparison to the control plant under non-stress conditions. Due to advances in agricultural practices (irrigation, fertilization, pesticide treatments) severe stresses are not often encountered in cultivated crop plants. As a consequence, the compromised growth induced by mild stress is often an undesirable feature for agriculture. Mild stresses are the everyday biotic and/or abiotic (environmental) stresses to which a plant is exposed. Abiotic stresses may be due to drought or excess water, anaerobic stress, salt stress, chemical toxicity, oxidative stress and hot, cold or freezing temperatures. The abiotic stress may be an osmotic stress caused by a water stress (particularly due to drought), salt stress, oxidative stress or an ionic stress. Biotic stresses are typically those stresses caused by pathogens, such as bacteria, viruses, fungi and insects.
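The quantitative part of the above definition of mild stress may be illustrated by the following sketch. The function names and the example growth values are hypothetical; the sketch only evaluates the percentage reduction in growth and does not capture the further requirement that the plant retains the capacity to resume growth.

```python
def growth_reduction_percent(control_growth: float, stressed_growth: float) -> float:
    """Reduction in growth of the stressed plant relative to the non-stressed control, in percent."""
    if control_growth <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * (control_growth - stressed_growth) / control_growth


def is_mild_stress(control_growth: float, stressed_growth: float,
                   threshold_percent: float = 40.0) -> bool:
    """True if the growth reduction is below the chosen threshold (40% by default)."""
    return growth_reduction_percent(control_growth, stressed_growth) < threshold_percent


# Hypothetical growth values (arbitrary units) for a control and a stressed plant.
reduction = growth_reduction_percent(50.0, 44.0)
print(f"reduction = {reduction:.1f}%, mild at 40% threshold: {is_mild_stress(50.0, 44.0)}")
```

The threshold argument may be set to any of the preferred values recited above (for example 25%, 15% or 10%).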
[0237] In particular, the methods of the present invention may be performed under non-stress conditions or under conditions of mild drought to give plants having increased yield relative to control plants. As reported in Wang et al. (Planta (2003) 218: 1-14), abiotic stress leads to a series of morphological, physiological, biochemical and molecular changes that adversely affect plant growth and productivity. Drought, salinity, extreme temperatures and oxidative stress are known to be interconnected and may induce growth and cellular damage through similar mechanisms. Rabbani et al. (Plant Physiol (2003) 133: 1755-1767) describe a particularly high degree of "cross talk" between drought stress and high-salinity stress. For example, drought and/or salinisation are manifested primarily as osmotic stress, resulting in the disruption of homeostasis and ion distribution in the cell. Oxidative stress, which frequently accompanies high or low temperature, salinity or drought stress, may cause denaturing of functional and structural proteins. As a consequence, these diverse environmental stresses often activate similar cell signalling pathways and cellular responses, such as the production of stress proteins, up-regulation of anti-oxidants, accumulation of compatible solutes and growth arrest. "Non-stress" conditions as used herein are those environmental conditions that allow optimal growth of plants. Persons skilled in the art are aware of normal soil conditions and climatic conditions for a given location.
[0238] Performance of the methods of the invention gives plants grown under non-stress conditions or under mild drought conditions increased yield and/or increased early vigour, relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield and/or early vigour in plants grown under non-stress conditions or under mild drought conditions, which method comprises increasing expression in a plant of a nucleic acid encoding an ASNS polypeptide.
[0239] Performance of the methods of the invention gives plants grown under conditions of nutrient deficiency, particularly under conditions of nitrogen deficiency, increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of nutrient deficiency, which method comprises increasing expression in a plant of a nucleic acid encoding an ASNS polypeptide. Nutrient deficiency may result from a lack of nutrients such as nitrogen, phosphates and other phosphorus-containing compounds, potassium, calcium, cadmium, magnesium, manganese, iron and boron, amongst others.
[0240] Performance of the methods of the invention gives plants grown under conditions of salt stress increased yield relative to control plants grown under comparable conditions. Therefore, according to the present invention, there is provided a method for increasing yield in plants grown under conditions of salt stress, which method comprises modulating expression in a plant of a nucleic acid encoding an ASNS polypeptide. The term salt stress is not restricted to common salt (NaCl), but may be any one or more of: NaCl, KCl, LiCl, MgCl2, CaCl2, amongst others.
[0241] The present invention encompasses plants or parts thereof (including seeds) obtainable by the methods according to the present invention. The plants or parts thereof comprise a nucleic acid transgene encoding an ASNS polypeptide as defined above.
[0242] The invention also provides genetic constructs and vectors to facilitate introduction and/or expression in plants of nucleic acids encoding ASNS polypeptides. The gene constructs may be inserted into vectors, which may be commercially available, suitable for transforming into plants and suitable for expression of the gene of interest in the transformed cells. The invention also provides use of a gene construct as defined herein in the methods of the invention.
[0243] More specifically, the present invention provides a construct comprising:
[0244] (a) a nucleic acid encoding an ASNS polypeptide as defined above;
[0245] (b) one or more control sequences capable of driving expression of the nucleic acid sequence of (a); and optionally
[0246] (c) a transcription termination sequence.
[0247] Preferably, the nucleic acid encoding an ASNS polypeptide is as defined above. The terms "control sequence" and "termination sequence" are as defined herein.
[0248] Plants are transformed with a vector comprising any of the nucleic acids described above. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter).
[0249] Advantageously, any type of promoter, whether natural or synthetic, may be used to drive expression of the nucleic acid sequence, but preferably the promoter is of plant origin. A constitutive promoter is particularly useful in the methods of the invention. Preferably the constitutive promoter is also a ubiquitous promoter of medium strength. See the "Definitions" section herein for definitions of the various promoter types.
[0250] It should be clear that the applicability of the present invention is not restricted to the ASNS polypeptide-encoding nucleic acid represented by SEQ ID NO: 62, nor is the applicability of the invention restricted to expression of an ASNS polypeptide-encoding nucleic acid when driven by a constitutive specific promoter.
[0251] The constitutive promoter is preferably a GOS2 promoter, preferably a GOS2 promoter from rice. Further preferably the constitutive promoter is represented by a nucleic acid sequence substantially similar to SEQ ID NO: 64, most preferably the constitutive promoter is as represented by SEQ ID NO: 64. See Table 2 in the "Definitions" section herein for further examples of constitutive promoters.
[0252] Optionally, one or more terminator sequences may be used in the construct introduced into a plant. Additional regulatory elements may include transcriptional as well as translational enhancers. Those skilled in the art will be aware of terminator and enhancer sequences that may be suitable for use in performing the invention. An intron sequence may also be added to the 5' untranslated region (UTR) or in the coding sequence to increase the amount of the mature message that accumulates in the cytosol, as described in the definitions section. Other control sequences (besides promoter, enhancer, silencer, intron sequences, 3'UTR and/or 5'UTR regions) may be protein and/or RNA stabilizing elements. Such sequences would be known or may readily be obtained by a person skilled in the art.
[0253] The genetic constructs of the invention may further include an origin of replication sequence that is required for maintenance and/or replication in a specific cell type. One example is when a genetic construct is required to be maintained in a bacterial cell as an episomal genetic element (e.g. plasmid or cosmid molecule). Preferred origins of replication include, but are not limited to, the f1-ori and colE1.
[0254] For the detection of the successful transfer of the nucleic acid sequences as used in the methods of the invention and/or selection of transgenic plants comprising these nucleic acids, it is advantageous to use marker genes (or reporter genes). Therefore, the genetic construct may optionally comprise a selectable marker gene. Selectable markers are described in more detail in the "definitions" section herein. The marker genes may be removed or excised from the transgenic cell once they are no longer needed. Techniques for marker removal are known in the art; useful techniques are described above in the definitions section.
[0255] The invention also provides a method for the production of transgenic plants having enhanced yield-related traits relative to control plants, comprising introduction and expression in a plant of any nucleic acid encoding an ASNS polypeptide as defined hereinabove.
[0256] More specifically, the present invention provides a method for the production of transgenic plants having enhanced yield-related traits, particularly increased early vigour and/or increased yield, which method comprises:
[0257] (i) introducing and expressing in a plant or plant cell an ASNS polypeptide-encoding nucleic acid; and
[0258] (ii) cultivating the plant cell under conditions promoting plant growth and development.
[0259] The nucleic acid of (i) may be any of the nucleic acids capable of encoding an ASNS polypeptide as defined herein.
[0260] The nucleic acid may be introduced directly into a plant cell or into the plant itself (including introduction into a tissue, organ or any other part of a plant). According to a preferred feature of the present invention, the nucleic acid is preferably introduced into a plant by transformation. The term "transformation" is described in more detail in the "definitions" section herein.
[0261] The genetically modified plant cells can be regenerated via all methods with which the skilled worker is familiar. Suitable methods can be found in the abovementioned publications by S. D. Kung and R. Wu, Potrykus or Hofgen and Willmitzer.
[0262] Generally after transformation, plant cells or cell groupings are selected for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the gene of interest, following which the transformed material is regenerated into a whole plant. To select transformed plants, the plant material obtained in the transformation is, as a rule, subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility consists in growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Alternatively, the transformed plants are screened for the presence of a selectable marker such as the ones described above.
[0263] Following DNA transfer and regeneration, putatively transformed plants may also be evaluated, for instance using Southern analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western analysis, both techniques being well known to persons having ordinary skill in the art.
[0264] The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques. The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).
[0265] The present invention clearly extends to any plant cell or plant produced by any of the methods described herein, and to all plant parts and propagules thereof. The present invention extends further to encompass the progeny of a primary transformed or transfected cell, tissue, organ or whole plant that has been produced by any of the aforementioned methods, the only requirement being that progeny exhibit the same genotypic and/or phenotypic characteristic(s) as those produced by the parent in the methods according to the invention.
[0266] The invention also includes host cells containing an isolated nucleic acid encoding an ASNS polypeptide as defined hereinabove. Preferred host cells according to the invention are plant cells. Host plants for the nucleic acids or the vector used in the method according to the invention, the expression cassette or construct or vector are, in principle, advantageously all plants, which are capable of synthesizing the polypeptides used in the inventive method.
[0267] The methods of the invention are advantageously applicable to any plant. Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs. According to a preferred embodiment of the present invention, the plant is a crop plant. Examples of crop plants include soybean, sunflower, canola, alfalfa, rapeseed, cotton, tomato, potato and tobacco. Further preferably, the plant is a monocotyledonous plant. Examples of monocotyledonous plants include sugarcane. More preferably the plant is a cereal. Examples of cereals include rice, maize, wheat, barley, millet, rye, triticale, sorghum, emmer, spelt, secale, einkorn, teff, milo and oats.
[0268] The invention also extends to harvestable parts of a plant such as, but not limited to seeds, leaves, fruits, flowers, stems, roots, rhizomes, tubers and bulbs. The invention furthermore relates to products derived, preferably directly derived, from a harvestable part of such a plant, such as dry pellets or powders, oil, fat and fatty acids, starch or proteins.
[0269] According to a preferred feature of the invention, the modulated expression is increased expression. Methods for increasing expression of nucleic acids or genes, or gene products, are well documented in the art and examples are provided in the definitions section.
[0270] As mentioned above, a preferred method for modulating (preferably, increasing) expression of a nucleic acid encoding an ASNS polypeptide is by introducing and expressing in a plant a nucleic acid encoding an ASNS polypeptide; however the effects of performing the method, i.e. enhancing yield-related traits, may also be achieved using other well known techniques, including but not limited to T-DNA activation tagging, TILLING and homologous recombination. A description of these techniques is provided in the definitions section.
[0271] The present invention also encompasses use of nucleic acids encoding ASNS polypeptides as described herein and use of these ASNS polypeptides in enhancing any of the aforementioned yield-related traits in plants.
[0272] Nucleic acids encoding the ASNS polypeptides described herein, or the ASNS polypeptides themselves, may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to an ASNS polypeptide-encoding gene. The nucleic acids/genes, or the ASNS polypeptides themselves, may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having enhanced yield-related traits as defined hereinabove in the methods of the invention.
[0273] Allelic variants of an ASNS polypeptide-encoding nucleic acid/gene may also find use in marker-assisted breeding programmes. Such breeding programmes sometimes require introduction of allelic variation by mutagenic treatment of the plants, using for example EMS mutagenesis; alternatively, the programme may start with a collection of allelic variants of so called "natural" origin caused unintentionally. Identification of allelic variants then takes place, for example, by PCR. This is followed by a step for selection of superior allelic variants of the sequence in question and which give increased yield. Selection is typically carried out by monitoring growth performance of plants containing different allelic variants of the sequence in question. Growth performance may be monitored in a greenhouse or in the field. Further optional steps include crossing plants in which the superior allelic variant was identified with another plant. This could be used, for example, to make a combination of interesting phenotypic features.
[0274] Nucleic acids encoding ASNS polypeptides may also be used as probes for genetically and physically mapping the genes that they are a part of, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. Such use of ASNS polypeptide-encoding nucleic acids requires only a nucleic acid sequence of at least 15 nucleotides in length. The ASNS polypeptide-encoding nucleic acids may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Sambrook J, Fritsch E F and Maniatis T (1989) Molecular Cloning, A Laboratory Manual) of restriction-digested plant genomic DNA may be probed with the ASNS-encoding nucleic acids. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al. (1987) Genomics 1: 174-181) in order to construct a genetic map. In addition, the nucleic acids may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the ASNS polypeptide-encoding nucleic acid in the genetic map previously obtained using this population (Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331).
[0275] The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol. Biol. Reporter 4: 37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.
[0276] The nucleic acid probes may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel et al. In: Non-mammalian Genomic Analysis: A Practical Guide, Academic press 1996, pp. 319-346, and references cited therein).
[0277] In another embodiment, the nucleic acid probes may be used in direct fluorescence in situ hybridisation (FISH) mapping (Trask (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favour use of large clones (several kb to several hundred kb; see Laan et al. (1995) Genome Res. 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.
[0278] A variety of nucleic acid amplification-based methods for genetic and physical mapping may be carried out using the nucleic acids. Examples include allele-specific amplification (Kazazian (1989) J. Lab. Clin. Med 11:95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat. Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art.
[0279] In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.
[0280] The methods according to the present invention result in plants having enhanced yield-related traits, as described hereinbefore. These traits may also be combined with other economically advantageous traits, such as further yield-enhancing traits, tolerance to other abiotic and biotic stresses, traits modifying various architectural features and/or biochemical and/or physiological features.
DESCRIPTION OF FIGURES
[0281] The present invention will now be described with reference to the following figures in which:
[0282] FIG. 1 represents the binary vector for increased expression in Oryza sativa of a NITR-encoding nucleic acid under the control of a rice GOS2 promoter (pGOS2::NITR).
[0283] FIG. 2 details examples of sequences useful in performing the methods according to the present invention.
[0284] FIG. 3 gives a phylogenetic tree of the NITR protein sequences listed in FIG. 2, in which tree the outgroup is represented by Sulfite Reductases, exemplified by SEQ ID NO: 9 (C. reinhardtii 59303), SEQ ID NO: 11 (C. reinhardtii 192232) and SEQ ID NO: 33 (A. thaliana At5g04590).
[0285] FIG. 4 represents the binary vector for increased expression in Oryza sativa of an ASNS-encoding nucleic acid under the control of a rice GOS2 promoter (pGOS2::ASNS).
[0286] FIG. 5 details examples of sequences useful in performing the methods according to the present invention.
[0287] FIG. 6 shows an alignment between SEQ ID NO: 63 and SEQ ID NO: 67. The S165G and R382G mutations are indicated.
[0288] FIG. 7 is a multiple alignment of examples of ASNS polypeptides. The asterisks represent amino acids that are identical in all sequences, the colons indicate highly conserved residues, and the dots represent conserved residues. The Arg residue corresponding to R382 in SEQ ID NO: 67 is shown in bold.
EXAMPLES
[0289] The present invention will now be described with reference to the following examples, which are by way of illustration alone. The following examples are not intended to completely define or otherwise limit the scope of the invention.
[0290] DNA manipulation: unless otherwise stated, recombinant DNA techniques are performed according to standard protocols described in (Sambrook (2001) Molecular Cloning: a laboratory manual, 3rd Edition Cold Spring Harbor Laboratory Press, CSH, New York) or in Volumes 1 and 2 of Ausubel et al. (1994), Current Protocols in Molecular Biology, Current Protocols. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R. D. D. Croy, published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications (UK).
Example 1
Identification of Sequences Related to the Nucleic Acid Sequence Used in the Methods of the Invention
[0291] Sequences (full length cDNA, ESTs or genomic) related to the nucleic acid sequence used in the methods of the present invention are identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acid used in the present invention is used for the TBLASTN algorithm, with default settings and the filter to ignore low complexity sequences set off. The output of the analysis is viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example, the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
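The ranking by E-value and the scoring by percentage identity described above may be sketched as follows. The hit names, E-values and aligned sequences are hypothetical, and taking the full alignment length (gap columns included) as the "particular length" is an assumption of this sketch; the sketch does not reproduce the BLAST program itself.

```python
from typing import List, Optional, Tuple


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical residues as a percentage of the alignment length (gap columns included)."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


def rank_hits(hits: List[Tuple[str, float]],
              max_evalue: Optional[float] = None) -> List[Tuple[str, float]]:
    """Sort hits by ascending E-value (most significant first), optionally applying a cutoff."""
    kept = [h for h in hits if max_evalue is None or h[1] <= max_evalue]
    return sorted(kept, key=lambda h: h[1])


# Hypothetical search results: (hit name, E-value).
hits = [("hit_A", 1e-5), ("hit_B", 3e-40), ("hit_C", 0.2)]
print(rank_hits(hits))                    # all hits, most significant first
print(rank_hits(hits, max_evalue=1e-3))   # a more stringent cutoff drops hit_C

# Hypothetical pairwise alignment of two short polypeptide fragments.
print(round(percent_identity("MKT-AYIAKQR", "MKTLAYLAKQR"), 1))
```

Raising `max_evalue` corresponds to the less stringent search mentioned above, in which weaker matches are also retained.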
[0292] In some instances, related sequences may tentatively be assembled and publicly disclosed by research institutions, such as The Institute for Genomic Research (TIGR). The Eukaryotic Gene Orthologs (EGO) database may be used to identify such related sequences, either by keyword search or by using the BLAST algorithm with the nucleic acid or polypeptide sequence of interest.
Example 2
Alignment of NITR Polypeptide Sequences
[0293] Alignment of polypeptide sequences is performed using the AlignX programme from the Vector NTI package (Invitrogen) which is based on the popular Clustal W algorithm of progressive alignment (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003). Nucleic Acids Res 31:3497-3500). Default values are a gap open penalty of 10 and a gap extension penalty of 0.1, and the selected weight matrix is Blosum 62 (if polypeptides are aligned). Minor manual editing may be done to further optimise the alignment.
[0294] A phylogenetic tree of NITR polypeptides is constructed using a neighbour-joining clustering algorithm as provided in the AlignX programme from Vector NTI (Invitrogen).
[0295] For the construction of the phylogenetic tree of FIG. 3, the proteins of FIG. 2 were aligned using MUSCLE (Edgar (2004), Nucleic Acids Research 32(5): 1792-97). A Neighbour-Joining tree was calculated using QuickTree (Howe et al. (2002), Bioinformatics 18(11): 1546-7). Support of the major branching is indicated for 100 bootstrap repetitions. A circular phylogram was drawn using Dendroscope (Huson et al. (2007), BMC Bioinformatics 8(1):460).
Example 3
Calculation of Global Percentage Identity Between Polypeptide Sequences Useful in Performing the Methods of the Invention
[0296] Global percentages of similarity and identity between full length polypeptide sequences useful in performing the methods of the invention are determined using one of the methods available in the art, the MatGAT (Matrix Global Alignment Tool) software (Campanella J J, Bitincka L, Smalley J (2003) MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. BMC Bioinformatics 4:29; software hosted by Ledion Bitincka). MatGAT software generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix. Sequence similarity is shown in the half below the diagonal dividing line and sequence identity is shown in the half above the diagonal dividing line.
[0297] Parameters used in the comparison were:
[0298] Scoring matrix: Blosum62
[0299] First Gap: 12
[0300] Extending gap: 2
[0301] A MATGAT table for local alignment of a specific domain, or data on % identity/similarity between specific domains may also be generated.
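A pair-wise global alignment with the gap penalties given above (first gap 12, extending gap 2) may be sketched as follows. This sketch is not the MatGAT implementation: it uses a plain quadratic-memory dynamic programme with affine gaps rather than the linear-space Myers and Miller algorithm, and a simple match/mismatch score (+5/-4) is assumed in place of the Blosum 62 matrix to keep the example short. The function names and the example sequences are hypothetical.

```python
NEG_INF = float("-inf")


def global_align(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=2):
    """Global alignment with affine gaps: a gap of length k costs gap_open + (k - 1) * gap_extend."""
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap; Y: b[j-1] aligned to a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + (i - 1) * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + (j - 1) * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - gap_open, Y[i - 1][j] - gap_open, X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open, X[i][j - 1] - gap_open, Y[i][j - 1] - gap_extend)
    tables = {"M": M, "X": X, "Y": Y}
    state = max(tables, key=lambda k: tables[k][n][m])
    score = tables[state][n][m]
    out_a, out_b = [], []
    i, j = n, m
    # Trace back by finding a predecessor state whose value explains the current cell.
    while i > 0 or j > 0:
        cur = tables[state][i][j]
        if state == "M":
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            s = match if a[i - 1] == b[j - 1] else mismatch
            i, j = i - 1, j - 1
            state = next(k for k in "MXY" if tables[k][i][j] + s == cur)
        elif state == "X":
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
            state = next(k for k in "MXY"
                         if tables[k][i][j] - (gap_extend if k == "X" else gap_open) == cur)
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
            state = next(k for k in "MXY"
                         if tables[k][i][j] - (gap_extend if k == "Y" else gap_open) == cur)
    return score, "".join(reversed(out_a)), "".join(reversed(out_b))


def identity_percent(aligned_a, aligned_b):
    """Identical positions as a percentage of the alignment length (an assumed convention)."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


# Hypothetical short polypeptide fragments.
score, al_a, al_b = global_align("MKTAYIAK", "MKTYIAK")
print(score, al_a, al_b, identity_percent(al_a, al_b))
```

Repeating this calculation for every pair of sequences and writing the identity values above the diagonal and the similarity values below it would give a matrix laid out as described in this Example.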
Example 4
Identification of Domains Comprised in Polypeptide Sequences Useful in Performing the Methods of the Invention
[0302] The Integrated Resource of Protein Families, Domains and Sites (InterPro) database is an integrated interface for the commonly used signature databases for text- and sequence-based searches. The InterPro database combines these databases, which use different methodologies and varying degrees of biological information about well-characterized proteins to derive protein signatures. Collaborating databases include SWISS-PROT, PROSITE, TrEMBL, PRINTS, ProDom, Pfam, SMART and TIGRFAMs. Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families. Pfam is hosted at the Sanger Institute server in the United Kingdom. InterPro is hosted at the European Bioinformatics Institute in the United Kingdom.
[0303] The protein sequences representing the NITR polypeptides are used as queries to search the InterPro database.
Example 5
Topology Prediction of the Polypeptide Sequences Useful in Performing the Methods of the Invention
[0304] TargetP 1.1 predicts the subcellular location of eukaryotic proteins. The location assignment is based on the predicted presence of any of the N-terminal pre-sequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP). Scores on which the final prediction is based are not really probabilities, and they do not necessarily add to one. However, the location with the highest score is the most likely according to TargetP, and the relationship between the scores (the reliability class) may be an indication of how certain the prediction is. The reliability class (RC) ranges from 1 to 5, where 1 indicates the strongest prediction. TargetP is maintained at the server of the Technical University of Denmark.
[0305] For the sequences predicted to contain an N-terminal presequence a potential cleavage site can also be predicted.
[0306] A number of parameters were selected, such as organism group (non-plant or plant), cutoff sets (none, predefined set of cutoffs, or user-specified set of cutoffs), and the calculation of prediction of cleavage sites (yes or no).
[0307] The protein sequence represented by SEQ ID NO: 2 was used to query TargetP 1.1. The "plant" organism group was selected, no cutoffs were defined, and the predicted length of the transit peptide was requested. The protein has a predicted location in the chloroplast (probability 0.793, reliability class 3).
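The relationship between the scores and the reliability class may be illustrated by the following sketch. The thresholds used (differences of 0.8, 0.6, 0.4 and 0.2 between the highest and the second highest score) are assumed from the TargetP output description and should be checked against the server documentation. Only the chloroplast score of 0.793 is taken from the present Example; the remaining scores are hypothetical values chosen for illustration.

```python
from typing import Dict, Tuple


def reliability_class(scores: Dict[str, float]) -> Tuple[str, int]:
    """Return the top-scoring location and a reliability class from 1 (strongest) to 5.

    The class is derived from the difference between the two highest scores;
    the threshold values are an assumption of this sketch.
    """
    if len(scores) < 2:
        raise ValueError("at least two location scores are needed")
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_location, best), (_, second) = ranked[0], ranked[1]
    diff = best - second
    if diff > 0.8:
        rc = 1
    elif diff > 0.6:
        rc = 2
    elif diff > 0.4:
        rc = 3
    elif diff > 0.2:
        rc = 4
    else:
        rc = 5
    return best_location, rc


# cTP score from the Example; mTP, SP and "other" scores are hypothetical.
scores = {"cTP": 0.793, "mTP": 0.250, "SP": 0.020, "other": 0.100}
print(reliability_class(scores))
```

Because the scores are not true probabilities, the class depends only on how far the winning score is ahead of the runner-up, not on the scores summing to one.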
[0308] Many other algorithms can be used to perform such analyses, including:
[0309] ChloroP 1.1 hosted on the server of the Technical University of Denmark;
[0310] Protein Prowler Subcellular Localisation Predictor version 1.2 hosted on the server of the Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia;
[0311] PENCE Proteome Analyst PA-GOSUB 2.5 hosted on the server of the University of Alberta, Edmonton, Alberta, Canada;
[0312] TMHMM, hosted on the server of the Technical University of Denmark.
Example 6
Cloning of the Nucleic Acid Sequence Used in the Methods of the Invention
Cloning of SEQ ID NO: 1:
[0313] The NITR encoding nucleic acid sequence SEQ ID NO: 1 used in the methods of the invention was amplified by PCR using as template a custom-made Arabidopsis thaliana seedlings cDNA library (in pCMV Sport 6.0; Invitrogen, Paisley, UK). PCR was performed using Hifi Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were prm07073 (SEQ ID NO: 4; sense, start codon in bold): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatgacttctttctctctcactt-3' and prm07074 (SEQ ID NO: 5; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggtcaatagcttttgaatcaatct-3', which include the AttB sites for Gateway recombination. The amplified PCR fragment was also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombines in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone". Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
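The structure of the two primers can be made explicit with a short sketch that separates the AttB adapter from the gene-specific part. The adapter sequences used here are the standard Gateway attB1 and attB2 primer extensions, which is an assumption about how the primers were designed; the function and variable names are hypothetical. The primer sequences are those given above, with the spaces removed.

```python
# Standard Gateway adapter sequences (assumed, not stated in the text).
ATTB1 = "ggggacaagtttgt" "acaaaaaagcaggct"
ATTB2 = "ggggaccactttgtacaagaaagctgggt"

# Primers prm07073 (sense) and prm07074 (reverse) as given in the text.
PRM07073 = "ggggacaagtttgt" "acaaaaaagcaggcttaaacaatgacttctttctctctcactt"
PRM07074 = "ggggaccactttgtacaagaaagctgggtcaatagct" "tttgaatcaatct"


def split_gateway_primer(primer, adapter):
    """Return (adapter, remainder) or raise if the adapter is absent."""
    primer = primer.replace(" ", "").lower()
    if not primer.startswith(adapter):
        raise ValueError("primer does not start with the expected adapter")
    return adapter, primer[len(adapter):]


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


# Sense primer: adapter, short linker, then the start codon and coding sequence.
_, fwd_tail = split_gateway_primer(PRM07073, ATTB1)
start = fwd_tail.find("atg")
fwd_linker, fwd_gene = fwd_tail[:start], fwd_tail[start:]

# Reverse primer: the gene-specific part is shown on the coding strand.
_, rev_tail = split_gateway_primer(PRM07074, ATTB2)
rev_on_coding_strand = reverse_complement(rev_tail)

print("linker:", fwd_linker)
print("gene-specific (sense):", fwd_gene)
print("gene-specific (reverse, coding strand):", rev_on_coding_strand)
```

The same decomposition applies to any primer pair built on these adapters; only the gene-specific tails change.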
[0314] The entry clone comprising SEQ ID NO: 1 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 3) for constitutive expression was located upstream of this Gateway cassette.
[0315] After the LR recombination step, the resulting expression vector pGOS2::NITR (FIG. 1) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.
Example 7
Plant Transformation
Rice Transformation
[0316] The Agrobacterium containing the expression vector was used to transform Oryza sativa plants. Mature dry seeds of the rice japonica cultivar Nipponbare were dehusked. Sterilization was carried out by incubating for one minute in 70% ethanol, followed by 30 minutes in 0.2% HgCl2, followed by a 6 times 15 minutes wash with sterile distilled water. The sterile seeds were then germinated on a medium containing 2,4-D (callus induction medium). After incubation in the dark for four weeks, embryogenic, scutellum-derived calli were excised and propagated on the same medium. After two weeks, the calli were multiplied or propagated by subculture on the same medium for another 2 weeks. Embryogenic callus pieces were sub-cultured on fresh medium 3 days before co-cultivation (to boost cell division activity).
[0317] Agrobacterium strain LBA4404 containing the expression vector was used for co-cultivation. Agrobacterium was inoculated on AB medium with the appropriate antibiotics and cultured for 3 days at 28° C. The bacteria were then collected and suspended in liquid co-cultivation medium to a density (OD600) of about 1. The suspension was then transferred to a Petri dish and the calli immersed in the suspension for 15 minutes. The callus tissues were then blotted dry on a filter paper and transferred to solidified, co-cultivation medium and incubated for 3 days in the dark at 25° C. Co-cultivated calli were grown on 2,4-D-containing medium for 4 weeks in the dark at 28° C. in the presence of a selection agent. During this period, rapidly growing resistant callus islands developed. After transfer of this material to a regeneration medium and incubation in the light, the embryogenic potential was released and shoots developed in the next four to five weeks. Shoots were excised from the calli and incubated for 2 to 3 weeks on an auxin-containing medium from which they were transferred to soil. Hardened shoots were grown under high humidity and short days in a greenhouse.
[0318] Approximately 35 independent T0 rice transformants were generated for one construct. The primary transformants were transferred from a tissue culture chamber to a greenhouse. After a quantitative PCR analysis to verify copy number of the T-DNA insert, only single copy transgenic plants that exhibit tolerance to the selection agent were kept for harvest of T1 seed. Seeds were then harvested three to five months after transplanting. The method yielded single locus transformants at a rate of over 50% (Aldemita and Hodges 1996, Chan et al. 1993, Hiei et al. 1994).
Corn Transformation
[0319] Transformation of maize (Zea mays) is performed with a modification of the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. Transformation is genotype-dependent in corn and only specific genotypes are amenable to transformation and regeneration. The inbred line A188 (University of Minnesota) or hybrids with A188 as a parent are good sources of donor material for transformation, but other genotypes can be used successfully as well. Ears are harvested from corn plants approximately 11 days after pollination (DAP) when the length of the immature embryo is about 1 to 1.2 mm. Immature embryos are cocultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. Excised embryos are grown on callus induction medium, then maize regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to maize rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Wheat Transformation
[0320] Transformation of wheat is performed with the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. The cultivar Bobwhite (available from CIMMYT, Mexico) is commonly used in transformation. Immature embryos are co-cultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. After incubation with Agrobacterium, the embryos are grown in vitro on callus induction medium, then regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Soybean Transformation
[0321] Soybean is transformed according to a modification of the method described in the Texas A&M patent U.S. Pat. No. 5,164,310. Several commercial soybean varieties are amenable to transformation by this method. The cultivar Jack (available from the Illinois Seed foundation) is commonly used for transformation. Soybean seeds are sterilised for in vitro sowing. The hypocotyl, the radicle and one cotyledon are excised from seven-day old young seedlings. The epicotyl and the remaining cotyledon are further grown to develop axillary nodes. These axillary nodes are excised and incubated with Agrobacterium tumefaciens containing the expression vector. After the cocultivation treatment, the explants are washed and transferred to selection media. Regenerated shoots are excised and placed on a shoot elongation medium. Shoots no longer than 1 cm are placed on rooting medium until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Rapeseed/Canola Transformation
[0322] Cotyledonary petioles and hypocotyls of 5-6 day old young seedlings are used as explants for tissue culture and transformed according to Babic et al. (1998, Plant Cell Rep 17: 183-188). The commercial cultivar Westar (Agriculture Canada) is the standard variety used for transformation, but other varieties can also be used. Canola seeds are surface-sterilized for in vitro sowing. The cotyledon petiole explants with the cotyledon attached are excised from the in vitro seedlings, and inoculated with Agrobacterium (containing the expression vector) by dipping the cut end of the petiole explant into the bacterial suspension. The explants are then cultured for 2 days on MSBAP-3 medium containing 3 mg/l BAP, 3% sucrose, 0.7% Phytagar at 23° C., 16 hr light. After two days of co-cultivation with Agrobacterium, the petiole explants are transferred to MSBAP-3 medium containing 3 mg/l BAP, cefotaxime, carbenicillin, or timentin (300 mg/l) for 7 days, and then cultured on MSBAP-3 medium with cefotaxime, carbenicillin, or timentin and selection agent until shoot regeneration. When the shoots are 5-10 mm in length, they are cut and transferred to shoot elongation medium (MSBAP-0.5, containing 0.5 mg/l BAP). Shoots of about 2 cm in length are transferred to the rooting medium (MS0) for root induction. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Alfalfa Transformation
[0323] A regenerating clone of alfalfa (Medicago sativa) is transformed using the method of McKersie et al. (1999, Plant Physiol 119: 839-847). Regeneration and transformation of alfalfa is genotype dependent and therefore a regenerating plant is required. Methods to obtain regenerating plants have been described. For example, these can be selected from the cultivar Rangelander (Agriculture Canada) or any other commercial alfalfa variety as described by Brown D C W and A Atanassov (1985. Plant Cell Tissue Organ Culture 4: 111-112). Alternatively, the RA3 variety (University of Wisconsin) has been selected for use in tissue culture (Walker et al., 1978 Am J Bot 65:654-659). Petiole explants are cocultivated with an overnight culture of Agrobacterium tumefaciens C58C1 pMP90 (McKersie et al., 1999 Plant Physiol 119: 839-847) or LBA4404 containing the expression vector. The explants are cocultivated for 3 d in the dark on SH induction medium containing 288 mg/L Pro, 53 mg/L thioproline, 4.35 g/L K2SO4, and 100 μM acetosyringone. The explants are washed in half-strength Murashige-Skoog medium (Murashige and Skoog, 1962) and plated on the same SH induction medium without acetosyringone but with a suitable selection agent and suitable antibiotic to inhibit Agrobacterium growth. After several weeks, somatic embryos are transferred to BOi2Y development medium containing no growth regulators, no antibiotics, and 50 g/L sucrose. Somatic embryos are subsequently germinated on half-strength Murashige-Skoog medium. Rooted seedlings are transplanted into pots and grown in a greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Example 8
Phenotypic Evaluation Procedure
8.1 Evaluation Setup
[0324] Approximately 35 independent T0 rice transformants were generated. The primary transformants were transferred from a tissue culture chamber to a greenhouse for growing and harvest of T1 seed. Five events, of which the T1 progeny segregated 3:1 for presence/absence of the transgene, were retained. For each of these events, approximately 10 T1 seedlings containing the transgene (hetero- and homo-zygotes) and approximately 10 T1 seedlings lacking the transgene (nullizygotes) were selected by monitoring visual marker expression. The transgenic plants and the corresponding nullizygotes were grown side-by-side at random positions. Greenhouse conditions were of short days (12 hours light), 28° C. in the light and 22° C. in the dark, and a relative humidity of 70%.
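Whether the T1 progeny of an event is consistent with 3:1 segregation can be checked, for example, with a chi-square goodness-of-fit test. The sketch below is one common way to do this and is not described in the text as the method actually used; the function name and the seedling counts are hypothetical. For one degree of freedom the chi-square tail probability has the closed form erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math


def segregation_chi_square(n_transgenic, n_null, ratio=(3, 1)):
    """Chi-square goodness-of-fit test against an expected segregation ratio.

    Returns (chi2, p_value) with one degree of freedom.
    """
    total = n_transgenic + n_null
    expected_transgenic = total * ratio[0] / sum(ratio)
    expected_null = total * ratio[1] / sum(ratio)
    chi2 = ((n_transgenic - expected_transgenic) ** 2 / expected_transgenic
            + (n_null - expected_null) ** 2 / expected_null)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts of marker-positive and marker-negative T1 seedlings.
print(segregation_chi_square(30, 10))  # fits 3:1 exactly
print(segregation_chi_square(20, 20))  # deviates strongly from 3:1
```

An event whose p value falls below the chosen threshold would not be regarded as segregating 3:1 under this check.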
Drought Screen
[0325] Plants from T2 seeds are grown in potting soil under normal conditions until they approach the heading stage. They are then transferred to a "dry" section where irrigation is withheld. Humidity probes are inserted in randomly chosen pots to monitor the soil water content (SWC). When SWC goes below certain thresholds, the plants are automatically re-watered continuously until a normal level is reached again. The plants are then transferred back to normal conditions. The rest of the cultivation (plant maturation, seed harvest) is the same as for plants not grown under abiotic stress conditions. Growth and yield parameters are recorded as detailed for growth under normal conditions.
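The automatic re-watering rule described above behaves like a simple two-threshold (hysteresis) controller. The following sketch shows that logic only; the threshold values, the SWC readings and the function name are hypothetical, since the text does not specify them.

```python
def rewatering_states(swc_readings, low_threshold, normal_level):
    """Return, for each SWC reading, whether re-watering is switched on.

    Watering starts when SWC drops below `low_threshold` and continues
    until SWC has returned to `normal_level`.
    """
    watering = False
    states = []
    for swc in swc_readings:
        if not watering and swc < low_threshold:
            watering = True   # soil too dry: start re-watering
        elif watering and swc >= normal_level:
            watering = False  # normal level reached: stop re-watering
        states.append(watering)
    return states


# Hypothetical soil water content readings (arbitrary units).
readings = [60, 45, 25, 18, 30, 50, 62, 58]
print(rewatering_states(readings, low_threshold=20, normal_level=60))
```

Using two thresholds rather than one avoids switching the irrigation on and off repeatedly around a single set point.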
Nitrogen Use Efficiency Screen
[0326] Rice plants from T2 seeds were grown in potting soil under normal conditions except for the nutrient solution. The pots were watered from transplantation to maturation with a specific nutrient solution containing a reduced nitrogen (N) content, usually between 7 and 8 times less. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress. Growth and yield parameters were recorded as detailed for growth under normal conditions.
Salt Stress Screen
[0327] Plants are grown on a substrate made of coco fibers and argex (3 to 1 ratio). A normal nutrient solution is used during the first two weeks after transplanting the plantlets in the greenhouse. After the first two weeks, 25 mM of salt (NaCl) is added to the nutrient solution, until the plants are harvested. Seed-related parameters are then measured.
8.2 Statistical Analysis: F Test
[0328] A two factor ANOVA (analysis of variance) was used as a statistical model for the overall evaluation of plant phenotypic characteristics. An F test was carried out on all the parameters measured of all the plants of all the events transformed with the gene of the present invention. The F test was carried out to check for an effect of the gene over all the transformation events and to verify for an overall effect of the gene, also known as a global gene effect. The threshold for significance for a true global gene effect was set at a 5% probability level for the F test. A significant F test value points to a gene effect, meaning that it is not only the mere presence or position of the gene that is causing the differences in phenotype.
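The F statistic for a global gene effect can be sketched for the simplest case. The example assumes a balanced, fully crossed design with two fixed factors (transformation event and presence or absence of the transgene) and tests the gene factor against the within-cell error; the exact model used for the evaluation is not specified beyond "two factor ANOVA", so this is an illustration only. The function name, labels and measurements are hypothetical, and the p value would still have to be read from an F distribution with the returned degrees of freedom.

```python
from statistics import mean


def gene_effect_f(data):
    """F statistic for the gene factor in a balanced two-factor design.

    `data` maps (event, status) to a list of measurements, where status
    distinguishes transgenic plants from nullizygotes.
    Returns (F, df_gene, df_error).
    """
    events = sorted({event for event, _ in data})
    statuses = sorted({status for _, status in data})
    n = len(next(iter(data.values())))
    if (len(data) != len(events) * len(statuses)
            or any(len(values) != n for values in data.values())):
        raise ValueError("a balanced, fully crossed design is required")

    all_values = [x for values in data.values() for x in values]
    grand_mean = mean(all_values)

    # Sum of squares for the gene factor (across all events).
    ss_gene = sum(
        len(events) * n
        * (mean([x for e in events for x in data[(e, s)]]) - grand_mean) ** 2
        for s in statuses
    )
    # Within-cell (error) sum of squares.
    ss_error = sum((x - mean(values)) ** 2
                   for values in data.values() for x in values)

    df_gene = len(statuses) - 1
    df_error = len(all_values) - len(data)
    f_value = (ss_gene / df_gene) / (ss_error / df_error)
    return f_value, df_gene, df_error


# Hypothetical measurements for two events, two plants per cell.
example = {
    ("E1", "transgenic"): [12, 14], ("E1", "null"): [10, 12],
    ("E2", "transgenic"): [16, 18], ("E2", "null"): [14, 16],
}
print(gene_effect_f(example))
```

In this toy data set the transgenic plants are two units above the nullizygotes in both events, so the gene factor carries the effect while the event-by-gene interaction is zero.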
[0329] Because two experiments with overlapping events were carried out, a combined analysis was performed. This is useful to check consistency of the effects over the two experiments, and if this is the case, to accumulate evidence from both experiments in order to increase confidence in the conclusion. The method used was a mixed-model approach that takes into account the multilevel structure of the data (i.e. experiment-event-segregants). P values were obtained by comparing the likelihood ratio test statistic to a chi-square distribution.
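The last step, turning a likelihood ratio into a P value, can be sketched as follows. The mixed models themselves are not fitted here; the two log-likelihoods are hypothetical numbers standing in for a model with and without the gene effect. Closed-form chi-square tail probabilities are used for one and two degrees of freedom, which keeps the example within the standard library.

```python
import math


def likelihood_ratio_p_value(loglik_reduced, loglik_full, df=1):
    """Compare two nested models by a likelihood ratio test.

    Returns (statistic, p_value), where the statistic is referred to a
    chi-square distribution with `df` degrees of freedom (1 or 2 only).
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    if statistic < 0:
        raise ValueError("the full model cannot fit worse than the reduced one")
    if df == 1:
        p_value = math.erfc(math.sqrt(statistic / 2.0))
    elif df == 2:
        p_value = math.exp(-statistic / 2.0)
    else:
        raise NotImplementedError("only df = 1 or 2 are handled in this sketch")
    return statistic, p_value


# Hypothetical log-likelihoods of the reduced and the full mixed model.
print(likelihood_ratio_p_value(-105.0, -103.0, df=1))
```

For other degrees of freedom a statistics library would normally supply the chi-square tail probability.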
8.3 Parameters Measured
Biomass-Related Parameter Measurement
[0330] From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.
[0331] The plant aboveground area (or leafy biomass) was determined by counting the total number of pixels on the digital images from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from the different angles and was converted to a physical surface value expressed in square mm by calibration. Experiments show that the aboveground plant area measured this way correlates with the biomass of plant parts above ground. The above ground area is the area measured at the time point at which the plant had reached its maximal leafy biomass. The early vigour is the plant (seedling) aboveground area three weeks post-germination. Increase in root biomass is expressed as an increase in total root biomass (measured as maximum biomass of roots observed during the lifespan of a plant); or as an increase in the root/shoot index (measured as the ratio between root mass and shoot mass in the period of active growth of root and shoot).
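The pixel-based area measurement can be summarised in a few lines. This is a minimal sketch: the calibration factor (square mm per pixel), the pixel counts and the function names are hypothetical, and the segmentation of plant pixels from the background is assumed to have been done already.

```python
def aboveground_area_mm2(pixel_counts, mm2_per_pixel):
    """Average the plant pixel counts from all angles at one time point
    and convert the mean to square mm with a calibration factor."""
    if not pixel_counts:
        raise ValueError("at least one image is required")
    return sum(pixel_counts) / len(pixel_counts) * mm2_per_pixel


def maximum_area_mm2(counts_by_time_point, mm2_per_pixel):
    """Area at the time point where the plant reached its maximal size."""
    return max(aboveground_area_mm2(counts, mm2_per_pixel)
               for counts in counts_by_time_point)


# Hypothetical pixel counts from six angles at two time points.
week_3 = [30000, 29000, 31000, 30500, 29500, 30000]
week_10 = [120000, 118000, 122000, 119000, 121000, 120000]
print(aboveground_area_mm2(week_3, mm2_per_pixel=0.25))   # early vigour
print(maximum_area_mm2([week_3, week_10], mm2_per_pixel=0.25))
```

Averaging over several angles reduces the influence of leaf orientation on the measured area.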
[0332] Early vigour was determined by counting the total number of pixels from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from different angles and was converted to a physical surface value expressed in square mm by calibration. The results described below are for plants three weeks post-germination.
Seed-Related Parameter Measurements
[0333] The mature primary panicles were harvested, counted, bagged, barcode-labelled and then dried for three days in an oven at 37° C. The panicles were then threshed and all the seeds were collected and counted. The filled husks were separated from the empty ones using an air-blowing device. The empty husks were discarded and the remaining fraction was counted again. The filled husks were weighed on an analytical balance. The number of filled seeds was determined by counting the number of filled husks that remained after the separation step. The total seed yield was measured by weighing all filled husks harvested from a plant. Total seed number per plant was measured by counting the number of husks harvested from a plant. The Harvest Index (HI) in the present invention is defined as the ratio between the total seed yield and the above ground area (mm²), multiplied by a factor 10⁶. The total number of flowers per panicle as defined in the present invention is the ratio between the total number of seeds and the number of mature primary panicles.
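The two derived seed parameters defined above reduce to simple ratios, shown here as a minimal sketch. The scaling factor is read as 10 to the power 6; the function names and the input values are hypothetical, and the unit of the seed weight is left as measured because the text does not state it.

```python
def harvest_index(total_seed_yield, aboveground_area_mm2):
    """Ratio of total seed yield to aboveground area (mm2), times 10**6."""
    return total_seed_yield / aboveground_area_mm2 * 1e6


def flowers_per_panicle(total_seed_number, primary_panicle_number):
    """Ratio of the total number of seeds to the number of mature
    primary panicles."""
    return total_seed_number / primary_panicle_number


# Hypothetical values for a single plant.
print(harvest_index(total_seed_yield=15.0, aboveground_area_mm2=30000.0))
print(flowers_per_panicle(total_seed_number=600, primary_panicle_number=6))
```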
Example 9
Results of the Phenotypic Evaluation of the Transgenic Plants
[0334] The transgenic rice plants expressing the NITR nucleic acid represented by SEQ ID NO: 1 under control of the GOS2 promoter showed an increase of more than 5% for biomass (root and shoot), early vigour, total weight of seeds, number of filled seeds, harvest index, total number of seeds and number of flowers per panicle when grown under nitrogen deficiency-stress conditions. When evaluated over two generations (T1 and T2) the following data were obtained (Table 3):
TABLE 3
Yield increase for transgenic plants expressing the NITR nucleic acid compared to the control plants. For each parameter the p value is ≤ 0.05.

Parameter                  Overall increase (%)
Early vigour               17.5
Root/Shoot index            9.0
Total weight of seeds       6.1
Number of filled seeds      5.8
Total number of seeds       5.0
Example 10
Identification of Sequences Related to the Nucleic Acid Sequence Used in the Methods of the Invention
[0335] Sequences (full length cDNA, ESTs or genomic) related to the nucleic acid sequence used in the methods of the present invention are identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information (NCBI) using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). The program is used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. For example, the polypeptide encoded by the nucleic acids used in the present invention is used for the TBLASTN algorithm, with default settings and the filter to ignore low complexity sequences set off. The output of the analysis is viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search. For example the E-value may be increased to show less stringent matches. This way, short nearly exact matches may be identified.
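Percentage identity over an existing pairwise alignment can be computed as in the sketch below. It assumes the two sequences are already aligned to equal length with "-" as the gap character, and it divides by the number of alignment columns; other tools use different denominators (for example the length of the shorter sequence), so the values are not directly comparable with BLAST output. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns holding identical residues.

    Columns where both sequences have a gap are ignored; columns where
    only one sequence has a gap count as non-identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b)
               for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment contains no residues")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


# Hypothetical aligned nucleotide sequences: 4 identical columns out of 6.
print(percent_identity("ACGT-A", "ACCTTA"))
```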
[0336] In some instances, related sequences may tentatively be assembled and publicly disclosed by research institutions, such as The Institute for Genomic Research (TIGR). The Eukaryotic Gene Orthologs (EGO) database may be used to identify such related sequences, either by keyword search or by using the BLAST algorithm with the nucleic acid or polypeptide sequence of interest.
Example 11
Alignment of ASNS Polypeptide Sequences
[0337] Alignment of polypeptide sequences is performed using the AlignX programme from the Vector NTI package (Invitrogen) which is based on the popular Clustal W algorithm of progressive alignment (Thompson et al. (1997) Nucleic Acids Res 25:4876-4882; Chenna et al. (2003). Nucleic Acids Res 31:3497-3500). Default values are a gap open penalty of 10 and a gap extension penalty of 0.1, and the selected weight matrix is Blosum 62 (if polypeptides are aligned). Minor manual editing may be done to further optimise the alignment. For the alignment of FIG. 4, the ClustalW 2.0 algorithm was used with default parameters (Matrix: Gonnet, Gap-opening penalty: 10, Gap-extension penalty: 0.1).
[0338] A phylogenetic tree of ASNS polypeptides is constructed using a neighbour-joining clustering algorithm as provided in the AlignX programme from Vector NTI (Invitrogen).
Example 12
Calculation of Global Percentage Identity Between Polypeptide Sequences Useful in Performing the Methods of the Invention
[0339] Global percentages of similarity and identity between full length polypeptide sequences useful in performing the methods of the invention are determined using one of the methods available in the art, the MatGAT (Matrix Global Alignment Tool) software (Campanella J J, Bitincka L, Smalley J (2003) MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. BMC Bioinformatics 4:29; software hosted by Ledion Bitincka). MatGAT software generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 12, and a gap extension penalty of 2), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix. Sequence similarity is shown in the bottom half, below the diagonal dividing line, and sequence identity is shown in the top half, above the diagonal dividing line.
[0340] Parameters used in the comparison were:
[0341] Scoring matrix: Blosum62
[0342] First Gap: 12
[0343] Extending gap: 2
[0344] A MATGAT table for local alignment of a specific domain, or data on % identity/similarity between specific domains may also be generated.
Example 13
Identification of Domains Comprised in Polypeptide Sequences Useful in Performing the Methods of the Invention
[0345] The Integrated Resource of Protein Families, Domains and Sites (InterPro) database is an integrated interface for the commonly used signature databases for text- and sequence-based searches. The InterPro database combines these databases, which use different methodologies and varying degrees of biological information about well-characterized proteins to derive protein signatures. Collaborating databases include SWISS-PROT, PROSITE, TrEMBL, PRINTS, ProDom, Pfam, SMART and TIGRFAMs. Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families. Pfam is hosted at the Sanger Institute server in the United Kingdom. InterPro is hosted at the European Bioinformatics Institute in the United Kingdom.
[0346] The protein sequence represented by SEQ ID NO: 63 was used as query to search the InterPro database.
Example 14
Topology Prediction of the Polypeptide Sequences Useful in Performing the Methods of the Invention
[0347] TargetP 1.1 predicts the subcellular location of eukaryotic proteins. The location assignment is based on the predicted presence of any of the N-terminal pre-sequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP). Scores on which the final prediction is based are not really probabilities, and they do not necessarily add to one. However, the location with the highest score is the most likely according to TargetP, and the relationship between the scores (the reliability class) may be an indication of how certain the prediction is. The reliability class (RC) ranges from 1 to 5, where 1 indicates the strongest prediction. TargetP is maintained at the server of the Technical University of Denmark.
[0348] For the sequences predicted to contain an N-terminal presequence a potential cleavage site can also be predicted.
[0349] A number of parameters were selected, such as organism group (non-plant or plant), cutoff sets (none, predefined set of cutoffs, or user-specified set of cutoffs), and the calculation of prediction of cleavage sites (yes or no).
[0350] The protein sequence of SEQ ID NO: 63 was used to query TargetP 1.1. The "plant" organism group was selected, no cutoffs were defined, and the predicted length of the transit peptide was requested. No clear subcellular location was predicted by TargetP, but SubLoc (Hua and Sun, Bioinformatics) predicted a cytoplasmic localisation.
[0351] Many other algorithms can be used to perform such analyses, including:
[0352] ChloroP 1.1 hosted on the server of the Technical University of Denmark;
[0353] Protein Prowler Subcellular Localisation Predictor version 1.2 hosted on the server of the Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia;
[0354] PENCE Proteome Analyst PA-GOSUB 2.5 hosted on the server of the University of Alberta, Edmonton, Alberta, Canada;
[0355] TMHMM, hosted on the server of the Technical University of Denmark.
Example 15
Cloning of the Nucleic Acid Sequence Used in the Methods of the Invention
Cloning of SEQ ID NO: 62:
[0356] The nucleic acid sequence SEQ ID NO: 62 used in the methods of the invention was amplified by PCR using as template a custom-made Oryza sativa seedlings cDNA library (in pCMV Sport 6.0; Invitrogen, Paisley, UK). PCR was performed using Hifi Taq DNA polymerase in standard conditions, using 200 ng of template in a 50 μl PCR mix. The primers used were prm06049 (SEQ ID NO: 65; sense, start codon in bold): 5'-ggggacaagtttgtacaaaaaagcaggcttaaacaatgtgtggcatcctcgccgtgctcg-3' and prm06050 (SEQ ID NO: 66; reverse, complementary): 5'-ggggaccactttgtacaagaaagctgggtgcgacgatagaaagttaaacggcag-3', which include the AttB sites for Gateway recombination. The amplified PCR fragment was also purified using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombines in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an "entry clone". Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
[0357] The entry clone comprising SEQ ID NO: 62 was then used in an LR reaction with a destination vector used for Oryza sativa transformation. This vector contained as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the nucleic acid sequence of interest already cloned in the entry clone. A rice GOS2 promoter (SEQ ID NO: 64) for constitutive expression was located upstream of this Gateway cassette.
[0358] After the LR recombination step, the resulting expression vector pGOS2::ASNS (FIG. 4) was transformed into Agrobacterium strain LBA4404 according to methods well known in the art.
Example 16
Plant Transformation
Rice Transformation
[0359] The Agrobacterium containing the expression vector was used to transform Oryza sativa plants. Mature dry seeds of the rice japonica cultivar Nipponbare were dehusked. Sterilization was carried out by incubating for one minute in 70% ethanol, followed by 30 minutes in 0.2% HgCl2, followed by a 6 times 15 minutes wash with sterile distilled water. The sterile seeds were then germinated on a medium containing 2,4-D (callus induction medium). After incubation in the dark for four weeks, embryogenic, scutellum-derived calli were excised and propagated on the same medium. After two weeks, the calli were multiplied or propagated by subculture on the same medium for another 2 weeks. Embryogenic callus pieces were sub-cultured on fresh medium 3 days before co-cultivation (to boost cell division activity).
[0360] Agrobacterium strain LBA4404 containing the expression vector was used for co-cultivation. Agrobacterium was inoculated on AB medium with the appropriate antibiotics and cultured for 3 days at 28° C. The bacteria were then collected and suspended in liquid co-cultivation medium to a density (OD600) of about 1. The suspension was then transferred to a Petri dish and the calli immersed in the suspension for 15 minutes. The callus tissues were then blotted dry on a filter paper and transferred to solidified, co-cultivation medium and incubated for 3 days in the dark at 25° C. Co-cultivated calli were grown on 2,4-D-containing medium for 4 weeks in the dark at 28° C. in the presence of a selection agent. During this period, rapidly growing resistant callus islands developed. After transfer of this material to a regeneration medium and incubation in the light, the embryogenic potential was released and shoots developed in the next four to five weeks. Shoots were excised from the calli and incubated for 2 to 3 weeks on an auxin-containing medium from which they were transferred to soil. Hardened shoots were grown under high humidity and short days in a greenhouse.
[0361] Approximately 35 independent T0 rice transformants were generated for one construct. The primary transformants were transferred from a tissue culture chamber to a greenhouse. After a quantitative PCR analysis to verify copy number of the T-DNA insert, only single copy transgenic plants that exhibit tolerance to the selection agent were kept for harvest of T1 seed. Seeds were then harvested three to five months after transplanting. The method yielded single locus transformants at a rate of over 50% (Aldemita and Hodges 1996, Chan et al. 1993, Hiei et al. 1994).
Corn Transformation
[0362] Transformation of maize (Zea mays) is performed with a modification of the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. Transformation is genotype-dependent in corn and only specific genotypes are amenable to transformation and regeneration. The inbred line A188 (University of Minnesota) or hybrids with A188 as a parent are good sources of donor material for transformation, but other genotypes can be used successfully as well. Ears are harvested from corn plants approximately 11 days after pollination (DAP) when the length of the immature embryo is about 1 to 1.2 mm. Immature embryos are cocultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. Excised embryos are grown on callus induction medium, then maize regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to maize rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Wheat Transformation
[0363] Transformation of wheat is performed with the method described by Ishida et al. (1996) Nature Biotech 14(6): 745-50. The cultivar Bobwhite (available from CIMMYT, Mexico) is commonly used in transformation. Immature embryos are co-cultivated with Agrobacterium tumefaciens containing the expression vector, and transgenic plants are recovered through organogenesis. After incubation with Agrobacterium, the embryos are grown in vitro on callus induction medium, then regeneration medium, containing the selection agent (for example imidazolinone but various selection markers can be used). The Petri plates are incubated in the light at 25° C. for 2-3 weeks, or until shoots develop. The green shoots are transferred from each embryo to rooting medium and incubated at 25° C. for 2-3 weeks, until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Soybean Transformation
[0364] Soybean is transformed according to a modification of the method described in the Texas A&M patent U.S. Pat. No. 5,164,310. Several commercial soybean varieties are amenable to transformation by this method. The cultivar Jack (available from the Illinois Seed foundation) is commonly used for transformation. Soybean seeds are sterilised for in vitro sowing. The hypocotyl, the radicle and one cotyledon are excised from seven-day old young seedlings. The epicotyl and the remaining cotyledon are further grown to develop axillary nodes. These axillary nodes are excised and incubated with Agrobacterium tumefaciens containing the expression vector. After the cocultivation treatment, the explants are washed and transferred to selection media. Regenerated shoots are excised and placed on a shoot elongation medium. Shoots no longer than 1 cm are placed on rooting medium until roots develop. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Rapeseed/Canola Transformation
[0365] Cotyledonary petioles and hypocotyls of 5- to 6-day-old seedlings are used as explants for tissue culture and transformed according to Babic et al. (1998, Plant Cell Rep 17: 183-188). The commercial cultivar Westar (Agriculture Canada) is the standard variety used for transformation, but other varieties can also be used. Canola seeds are surface-sterilized for in vitro sowing. The cotyledon petiole explants with the cotyledon attached are excised from the in vitro seedlings, and inoculated with Agrobacterium (containing the expression vector) by dipping the cut end of the petiole explant into the bacterial suspension. The explants are then cultured for 2 days on MSBAP-3 medium containing 3 mg/l BAP, 3% sucrose, 0.7% Phytagar at 23° C., 16 hr light. After two days of co-cultivation with Agrobacterium, the petiole explants are transferred to MSBAP-3 medium containing 3 mg/l BAP, cefotaxime, carbenicillin, or timentin (300 mg/l) for 7 days, and then cultured on MSBAP-3 medium with cefotaxime, carbenicillin, or timentin and selection agent until shoot regeneration. When the shoots are 5-10 mm in length, they are cut and transferred to shoot elongation medium (MSBAP-0.5, containing 0.5 mg/l BAP). Shoots of about 2 cm in length are transferred to the rooting medium (MS0) for root induction. The rooted shoots are transplanted to soil in the greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Alfalfa Transformation
[0366] A regenerating clone of alfalfa (Medicago sativa) is transformed using the method of McKersie et al. (1999, Plant Physiol 119: 839-847). Regeneration and transformation of alfalfa are genotype-dependent and therefore a regenerating plant is required. Methods to obtain regenerating plants have been described. For example, these can be selected from the cultivar Rangelander (Agriculture Canada) or any other commercial alfalfa variety as described by Brown D C W and A Atanassov (1985. Plant Cell Tissue Organ Culture 4: 111-112). Alternatively, the RA3 variety (University of Wisconsin) has been selected for use in tissue culture (Walker et al., 1978 Am J Bot 65:654-659). Petiole explants are cocultivated with an overnight culture of Agrobacterium tumefaciens C58C1 pMP90 (McKersie et al., 1999 Plant Physiol 119: 839-847) or LBA4404 containing the expression vector. The explants are cocultivated for 3 d in the dark on SH induction medium containing 288 mg/L Pro, 53 mg/L thioproline, 4.35 g/L K2SO4, and 100 μM acetosyringone. The explants are washed in half-strength Murashige-Skoog medium (Murashige and Skoog, 1962) and plated on the same SH induction medium without acetosyringone but with a suitable selection agent and suitable antibiotic to inhibit Agrobacterium growth. After several weeks, somatic embryos are transferred to BOi2Y development medium containing no growth regulators, no antibiotics, and 50 g/L sucrose. Somatic embryos are subsequently germinated on half-strength Murashige-Skoog medium. Rooted seedlings are transplanted into pots and grown in a greenhouse. T1 seeds are produced from plants that exhibit tolerance to the selection agent and that contain a single copy of the T-DNA insert.
Example 17
Phenotypic Evaluation Procedure
17.1 Evaluation Setup
[0367] Approximately 35 independent T0 rice transformants were generated. The primary transformants were transferred from a tissue culture chamber to a greenhouse for growing and harvest of T1 seed. Six events, of which the T1 progeny segregated 3:1 for presence/absence of the transgene, were retained. For each of these events, approximately 10 T1 seedlings containing the transgene (hetero- and homo-zygotes) and approximately 10 T1 seedlings lacking the transgene (nullizygotes) were selected by monitoring visual marker expression. The transgenic plants and the corresponding nullizygotes were grown side-by-side at random positions. Greenhouse conditions were short days (12 hours light), 28° C. in the light and 22° C. in the dark, and a relative humidity of 70%.
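The 3:1 segregation criterion used to retain events can be illustrated with a chi-square goodness-of-fit check. The text does not state how the ratio was assessed, so the following is only a minimal sketch under that assumption; the function name `segregation_chi_square` and the example counts are hypothetical.

```python
import math


def segregation_chi_square(n_transgenic, n_null):
    """Chi-square goodness-of-fit test of observed counts against a 3:1 ratio.

    Returns (statistic, p_value). With two classes there is 1 degree of
    freedom, for which the chi-square survival function has the closed
    form erfc(sqrt(x / 2)).
    """
    total = n_transgenic + n_null
    if total <= 0:
        raise ValueError("at least one seedling must be scored")
    expected_transgenic = total * 3.0 / 4.0
    expected_null = total * 1.0 / 4.0
    statistic = (
        (n_transgenic - expected_transgenic) ** 2 / expected_transgenic
        + (n_null - expected_null) ** 2 / expected_null
    )
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: an event scored as 75 marker-positive and
# 25 marker-negative seedlings matches 3:1 exactly.
stat, p = segregation_chi_square(75, 25)
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

Under this sketch, an event would be considered consistent with single-locus 3:1 segregation when the p value stays above the chosen significance level.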
[0368] Four T1 events were further evaluated in the T2 generation following the same evaluation procedure as for the T1 generation but with more individuals per event.
Drought Screen
[0369] Plants from T2 seeds are grown in potting soil under normal conditions until they approach the heading stage. They are then transferred to a "dry" section where irrigation is withheld. Humidity probes are inserted in randomly chosen pots to monitor the soil water content (SWC). When SWC goes below certain thresholds, the plants are automatically re-watered continuously until a normal level is reached again. The plants are then transferred back to normal conditions. The rest of the cultivation (plant maturation, seed harvest) is the same as for plants not grown under abiotic stress conditions. Growth and yield parameters are recorded as detailed for growth under normal conditions.
Nitrogen Use Efficiency Screen
[0370] Rice plants from T2 seeds were grown in potting soil under normal conditions except for the nutrient solution. The pots were watered from transplantation to maturation with a specific nutrient solution containing a reduced nitrogen (N) content, usually 7 to 8 times lower. The rest of the cultivation (plant maturation, seed harvest) was the same as for plants not grown under abiotic stress. Growth and yield parameters were recorded as detailed for growth under normal conditions.
Salt Stress Screen
[0371] Plants are grown on a substrate made of coco fibers and argex (3 to 1 ratio). A normal nutrient solution is used during the first two weeks after transplanting the plantlets in the greenhouse. After the first two weeks, 25 mM of salt (NaCl) is added to the nutrient solution, until the plants are harvested. Seed-related parameters are then measured.
17.2 Statistical Analysis: F Test
[0372] A two-factor ANOVA (analysis of variance) was used as a statistical model for the overall evaluation of plant phenotypic characteristics. An F test was carried out on all the parameters measured for all the plants of all the events transformed with the gene of the present invention. The F test was carried out to check for an effect of the gene over all the transformation events and to verify an overall effect of the gene, also known as a global gene effect. The threshold for significance for a true global gene effect was set at a 5% probability level for the F test. A significant F test value points to a gene effect, meaning that it is not only the mere presence or position of the gene that is causing the differences in phenotype.
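The F test for a global gene effect can be sketched as follows. This is an illustration only: it assumes a balanced, fully crossed design with the two factors "event" and "genotype" (transgenic versus nullizygote), fixed effects, and the within-cell residual as the denominator. The exact model used in the evaluation is not specified beyond "two factor ANOVA", and the function name `gene_effect_f` and the data are hypothetical.

```python
from statistics import mean


def gene_effect_f(data):
    """F statistic for the genotype main effect in a balanced two-way layout.

    `data` maps (event, genotype) -> list of measurements, with the same
    number of plants (at least 2) in every cell.
    Returns (f_statistic, df_gene, df_error).
    """
    events = sorted({event for event, _ in data})
    genotypes = sorted({genotype for _, genotype in data})
    n = len(next(iter(data.values())))
    balanced = all(len(values) == n for values in data.values())
    if not balanced or len(data) != len(events) * len(genotypes) or n < 2:
        raise ValueError("balanced, fully crossed design with n >= 2 required")

    grand_mean = mean(y for values in data.values() for y in values)

    # Between-genotype sum of squares, pooled over all events.
    ss_gene = 0.0
    for genotype in genotypes:
        genotype_mean = mean(y for e in events for y in data[(e, genotype)])
        ss_gene += len(events) * n * (genotype_mean - grand_mean) ** 2

    # Residual (within-cell) sum of squares.
    ss_error = 0.0
    for values in data.values():
        cell_mean = mean(values)
        ss_error += sum((y - cell_mean) ** 2 for y in values)

    df_gene = len(genotypes) - 1
    df_error = len(events) * len(genotypes) * (n - 1)
    f_statistic = (ss_gene / df_gene) / (ss_error / df_error)
    return f_statistic, df_gene, df_error


# Hypothetical measurements for two events, two plants per cell.
example = {
    ("event1", "transgenic"): [11, 13],
    ("event1", "null"): [9, 11],
    ("event2", "transgenic"): [13, 15],
    ("event2", "null"): [11, 13],
}
print(gene_effect_f(example))
```

The returned statistic would then be compared with the F distribution on (df_gene, df_error) degrees of freedom at the 5% level; that lookup is left out here to keep the sketch free of external libraries.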
[0373] Because two experiments with overlapping events were carried out, a combined analysis was performed. This is useful to check the consistency of the effects over the two experiments and, if the effects are consistent, to accumulate evidence from both experiments in order to increase confidence in the conclusion. The method used was a mixed-model approach that takes into account the multilevel structure of the data (i.e. experiment-event-segregants). P values were obtained by comparing the likelihood ratio test statistic to chi-square distributions.
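The last step, turning a likelihood ratio into a P value, can be sketched as below. Fitting the mixed model itself is outside the scope of this illustration; the log-likelihood values are hypothetical inputs, and the helper `lrt_p_value` is an assumed name. Only 1 and 2 degrees of freedom are handled because those have simple closed forms.

```python
import math


def lrt_p_value(loglik_full, loglik_reduced, df=1):
    """P value of a likelihood ratio test against a chi-square distribution.

    The statistic is 2 * (loglik_full - loglik_reduced). Closed-form
    chi-square survival functions are used for df = 1 and df = 2.
    Returns (statistic, p_value).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    if df == 1:
        p_value = math.erfc(math.sqrt(statistic / 2.0))
    elif df == 2:
        p_value = math.exp(-statistic / 2.0)
    else:
        raise NotImplementedError("only df = 1 or df = 2 in this sketch")
    return statistic, p_value


# Hypothetical log-likelihoods of a model with and without the gene effect.
stat, p = lrt_p_value(-100.0, -101.92, df=1)
print(f"LRT statistic = {stat:.2f}, p = {p:.4f}")
```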
17.3 Parameters Measured
Biomass-Related Parameter Measurement
[0374] From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.
[0375] The plant aboveground area (or leafy biomass) was determined by counting the total number of pixels on the digital images from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from the different angles and was converted to a physical surface value expressed in square mm by calibration. Experiments show that the aboveground plant area measured this way correlates with the biomass of plant parts above ground. The above ground area is the area measured at the time point at which the plant had reached its maximal leafy biomass. The early vigour is the plant (seedling) aboveground area three weeks post-germination.
[0376] Early vigour was determined by counting the total number of pixels from aboveground plant parts discriminated from the background. This value was averaged for the pictures taken on the same time point from different angles and was converted to a physical surface value expressed in square mm by calibration. The results described below are for plants three weeks post-germination.
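The conversion from pixel counts to aboveground area described above can be sketched as follows. The calibration factor (square mm per pixel), the function names and the example pixel counts are assumptions for illustration; the segmentation of plant pixels from the background is taken as already done.

```python
def aboveground_area_mm2(pixel_counts, mm2_per_pixel):
    """Average the plant pixel counts from all angles at one time point
    and convert the mean to square mm using a calibration factor."""
    if not pixel_counts:
        raise ValueError("at least one image is required")
    mean_pixels = sum(pixel_counts) / len(pixel_counts)
    return mean_pixels * mm2_per_pixel


def maximal_aboveground_area_mm2(counts_by_time_point, mm2_per_pixel):
    """Area at the time point where the plant reached its maximal leafy
    biomass, i.e. the largest per-time-point area."""
    return max(
        aboveground_area_mm2(counts, mm2_per_pixel)
        for counts in counts_by_time_point.values()
    )


# Hypothetical pixel counts from six angles at two time points (in days).
counts = {
    21: [1000, 1200, 1100, 900, 1000, 800],
    60: [4000, 4200, 4100, 3900, 4000, 3800],
}
early_vigour = aboveground_area_mm2(counts[21], mm2_per_pixel=0.25)
max_area = maximal_aboveground_area_mm2(counts, mm2_per_pixel=0.25)
print(early_vigour, max_area)
```

In this sketch, early vigour is simply the area computed at the three-week time point, while the aboveground area parameter is the maximum over all time points.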
Seed-Related Parameter Measurements
[0377] The mature primary panicles were harvested, counted, bagged, barcode-labelled and then dried for three days in an oven at 37° C. The panicles were then threshed and all the seeds were collected and counted. The filled husks were separated from the empty ones using an air-blowing device. The empty husks were discarded and the remaining fraction was counted again. The filled husks were weighed on an analytical balance. The number of filled seeds was determined by counting the number of filled husks that remained after the separation step. The total seed yield was measured by weighing all filled husks harvested from a plant. Total seed number per plant was measured by counting the number of husks harvested from a plant. Thousand Kernel Weight (TKW) is extrapolated from the number of filled seeds counted and their total weight. The Harvest Index (HI) in the present invention is defined as the ratio between the total seed yield and the above ground area (mm²), multiplied by a factor 10⁶. The total number of flowers per panicle as defined in the present invention is the ratio between the total number of seeds and the number of mature primary panicles. The seed fill rate as defined in the present invention is the proportion (expressed as a %) of the number of filled seeds over the total number of seeds (or florets). Increase in root biomass is expressed as root thickness, which is the maximum biomass of roots above a certain thickness threshold observed during the lifespan of a plant (obtained by a root-imaging system).
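The derived seed parameters defined above follow directly from the raw counts and weights, as the sketch below shows. The field names, the use of grams for seed weight and the example values are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PlantHarvest:
    filled_seed_count: int            # filled husks after air separation
    total_seed_count: int             # all husks harvested from the plant
    total_seed_yield_g: float         # weight of all filled husks (grams assumed)
    primary_panicle_count: int        # mature primary panicles
    max_aboveground_area_mm2: float   # from the imaging measurements


def seed_parameters(h):
    """Compute TKW, Harvest Index, flowers per panicle and fill rate."""
    return {
        # Weight of 1000 filled seeds, extrapolated from count and weight.
        "tkw_g": 1000.0 * h.total_seed_yield_g / h.filled_seed_count,
        # Seed yield over aboveground area (mm2), multiplied by 10^6.
        "harvest_index": h.total_seed_yield_g * 1e6 / h.max_aboveground_area_mm2,
        # Total seeds (florets) divided by mature primary panicles.
        "flowers_per_panicle": h.total_seed_count / h.primary_panicle_count,
        # Filled seeds as a percentage of total seeds.
        "fill_rate_pct": 100.0 * h.filled_seed_count / h.total_seed_count,
    }


# Hypothetical plant: 400 filled seeds out of 500, weighing 10 g in total.
plant = PlantHarvest(
    filled_seed_count=400,
    total_seed_count=500,
    total_seed_yield_g=10.0,
    primary_panicle_count=5,
    max_aboveground_area_mm2=50000.0,
)
for name, value in seed_parameters(plant).items():
    print(f"{name}: {value:.1f}")
```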
Example 18
Results of the Phenotypic Evaluation of the Transgenic Plants
[0378] The transgenic rice plants expressing the ASNS nucleic acid represented by SEQ ID NO: 62 under control of the GOS2 promoter, whether grown under non-stress conditions or under conditions of reduced nitrogen availability, showed an increase of more than 5% for at least one of the following parameters: early vigour, total weight of seeds, number of filled seeds, fill rate, number of flowers per panicle, Harvest Index, total number of seeds and root thickness. For Thousand Kernel Weight the observed increase was at least 3%.
Sequence CWU
1
1
12211807DNAArabidopsis thaliana 1ggcttaaaca atgacttctt tctctctcac
tttcacatct cctctcctcc cttcctcctc 60caccaaaccc aaaagatccg tccttgtcgc
cgccgctcag accacagctc cggccgaatc 120caccgcctct gttgacgcag atcgtctcga
gccaagagtt gagttgaaag atggtttttt 180tattctcaag gagaagtttc gaaaagggat
caatcctcag gagaaggtta agatcgagag 240agagcccatg aagttgttta tggagaatgg
tattgaagag cttgctaaga aatctatgga 300agagcttgat agtgaaaagt cttctaaaga
tgatattgat gttagactca agtggcttgg 360tctctttcac cgtagaaagc atcagtatgg
gaagtttatg atgaggttga agttaccaaa 420tggtgtgact acaagtgcac agactcggta
tttagcgagt gtgattagga agtatggtga 480agatgggtgt gctgatgtga ctactagaca
gaattggcag atccgtggtg ttgtgttgcc 540tgatgtgcct gagatcttga aaggtcttgc
ttctgttggt ttaacgagtc ttcaaagtgg 600tatggataac gtgaggaacc cggttgggaa
tcctatagct gggattgatc cggaggagat 660tgttgacacg aggccttaca cgaatctcct
ttcgcagttt atcaccgcta attcacaagg 720aaaccccgat ttcaccaact tgccaagaaa
gtggaatgtg tgtgtggtgg ggactcatga 780tctctatgag catccacata tcaatgattt
ggcctacatg cctgctaata aagatggacg 840gtttggattc aatttgcttg tgggaggatt
ctttagtccc aaaagatgtg aagaagcgat 900tcctcttgat gcttgggtcc ctgctgatga
cgttcttcca ctctgcaaag ctgttctaga 960ggcttacaga gatcttggaa ctcgaggaaa
ccgacagaag acaagaatga tgtggcttat 1020cgacgaactt ggtgttgaag gatttagaac
tgaggtagag aagagaatgc caaatgggaa 1080actcgagaga ggatcttcag aggatcttgt
gaacaaacag tgggagagga gagactattt 1140cggagtcaac cctcagaaac aagaaggtct
tagcttcgtg gggcttcacg ttccggttgg 1200taggctacaa gctgatgaca tggatgagct
tgctcggtta gctgatacct acgggtcagg 1260tgagctaaga ctcacagtag agcaaaacat
catcatccca aatgtagaaa cctcgaaaac 1320cgaagctttg cttcaagagc cgtttctcaa
gaaccgtttc tcccctgaac catctatcct 1380aatgaaaggc ttagttgctt gtaccggtag
ccagttctgc ggacaagcga taatcgagac 1440taagctaaga gctttaaaag tgacagaaga
agtagagaga cttgtatctg tgccaagacc 1500gataaggatg cattggacag gatgtcccaa
tacttgcgga caagtccaag tagcagatat 1560cggattcatg ggatgcttaa cacgaggcga
ggaaggaaag ccagtcgagg gtgctgacgt 1620gtacgtcggg ggacgaatag gaagtgactc
gcatatcgga gagatctata agaaaggtgt 1680tcgtgtcacg gagttggttc cattggtggc
tgagattctg atcaaagaat ttggtgctgt 1740gcctagagaa agagaagaga atgaagattg
attcaaaagc tattgaccca gctttcttgt 1800acaaagt
18072586PRTArabidopsis thaliana 2Met Thr
Ser Phe Ser Leu Thr Phe Thr Ser Pro Leu Leu Pro Ser Ser 1 5
10 15 Ser Thr Lys Pro Lys Arg Ser
Val Leu Val Ala Ala Ala Gln Thr Thr 20 25
30 Ala Pro Ala Glu Ser Thr Ala Ser Val Asp Ala Asp
Arg Leu Glu Pro 35 40 45
Arg Val Glu Leu Lys Asp Gly Phe Phe Ile Leu Lys Glu Lys Phe Arg
50 55 60 Lys Gly Ile
Asn Pro Gln Glu Lys Val Lys Ile Glu Arg Glu Pro Met 65
70 75 80 Lys Leu Phe Met Glu Asn Gly
Ile Glu Glu Leu Ala Lys Lys Ser Met 85
90 95 Glu Glu Leu Asp Ser Glu Lys Ser Ser Lys Asp
Asp Ile Asp Val Arg 100 105
110 Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His Gln Tyr Gly
Lys 115 120 125 Phe
Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Ala Gln 130
135 140 Thr Arg Tyr Leu Ala Ser
Val Ile Arg Lys Tyr Gly Glu Asp Gly Cys 145 150
155 160 Ala Asp Val Thr Thr Arg Gln Asn Trp Gln Ile
Arg Gly Val Val Leu 165 170
175 Pro Asp Val Pro Glu Ile Leu Lys Gly Leu Ala Ser Val Gly Leu Thr
180 185 190 Ser Leu
Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro 195
200 205 Ile Ala Gly Ile Asp Pro Glu
Glu Ile Val Asp Thr Arg Pro Tyr Thr 210 215
220 Asn Leu Leu Ser Gln Phe Ile Thr Ala Asn Ser Gln
Gly Asn Pro Asp 225 230 235
240 Phe Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly Thr His
245 250 255 Asp Leu Tyr
Glu His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala 260
265 270 Asn Lys Asp Gly Arg Phe Gly Phe
Asn Leu Leu Val Gly Gly Phe Phe 275 280
285 Ser Pro Lys Arg Cys Glu Glu Ala Ile Pro Leu Asp Ala
Trp Val Pro 290 295 300
Ala Asp Asp Val Leu Pro Leu Cys Lys Ala Val Leu Glu Ala Tyr Arg 305
310 315 320 Asp Leu Gly Thr
Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu 325
330 335 Ile Asp Glu Leu Gly Val Glu Gly Phe
Arg Thr Glu Val Glu Lys Arg 340 345
350 Met Pro Asn Gly Lys Leu Glu Arg Gly Ser Ser Glu Asp Leu
Val Asn 355 360 365
Lys Gln Trp Glu Arg Arg Asp Tyr Phe Gly Val Asn Pro Gln Lys Gln 370
375 380 Glu Gly Leu Ser Phe
Val Gly Leu His Val Pro Val Gly Arg Leu Gln 385 390
395 400 Ala Asp Asp Met Asp Glu Leu Ala Arg Leu
Ala Asp Thr Tyr Gly Ser 405 410
415 Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn
Val 420 425 430 Glu
Thr Ser Lys Thr Glu Ala Leu Leu Gln Glu Pro Phe Leu Lys Asn 435
440 445 Arg Phe Ser Pro Glu Pro
Ser Ile Leu Met Lys Gly Leu Val Ala Cys 450 455
460 Thr Gly Ser Gln Phe Cys Gly Gln Ala Ile Ile
Glu Thr Lys Leu Arg 465 470 475
480 Ala Leu Lys Val Thr Glu Glu Val Glu Arg Leu Val Ser Val Pro Arg
485 490 495 Pro Ile
Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Gly Gln Val 500
505 510 Gln Val Ala Asp Ile Gly Phe
Met Gly Cys Leu Thr Arg Gly Glu Glu 515 520
525 Gly Lys Pro Val Glu Gly Ala Asp Val Tyr Val Gly
Gly Arg Ile Gly 530 535 540
Ser Asp Ser His Ile Gly Glu Ile Tyr Lys Lys Gly Val Arg Val Thr 545
550 555 560 Glu Leu Val
Pro Leu Val Ala Glu Ile Leu Ile Lys Glu Phe Gly Ala 565
570 575 Val Pro Arg Glu Arg Glu Glu Asn
Glu Asp 580 585 33246DNAOryza sativa
3aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg aacgtgtgct
60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc aagaaaaact
120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca ctagtttcgt
180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata cgttcacatc
240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc ttcttgaata
300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta aaaaaataga
360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata attttatagt
420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc caatttttat
480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat tagatgcaag
540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct caatacacgt
600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg tagcaatatc
660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa tacaaaaaat
720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc acatacaaaa
780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac acaggcaaca
840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga ggacagcaag
900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag aggcaaagaa
960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc tctctatata
1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc agaagccgag
1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct tccctcctcc
1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat tgttctaggt
1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga ttcctgttct
1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt gattagtagt
1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga tcggaatctt
1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat tttgcttggt
1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg atttgacgaa
1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga cggtcccgtt
1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg cttgtttaga
1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga acaggggatt
1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc ttaaaaagtc
1740actttctggt tcagttcaat gaattgattg ctacaaataa tggtgcaaat caggtctata
1800tgattgattt tgggctggcc aagaagtata gagactcatc aactcatcag catattccgt
1860atagagaaaa caaaaatttg acaggaactg ctagatacgc aagcatgaat actcatcttg
1920gcattgaaca aagtcgaagg gatgatttgg aatcgctggg ttatgtttta atgtacttct
1980taagaggaag tctcccttgg caggggctga aagcaggcac taagaaacag aagtatgaga
2040agatcagtga gaagaaagta tcaacatcaa tagagacctt gtgtagggga tatcctgcag
2100agtttgcatc atattttcat tactgtcgat cactaagatt tgatgataaa ccagattatg
2160cttatctgaa gagaattttc cgtgatcttt tcattcgtga agggtttcaa tttgattata
2220tatttgactg gaccattttg aaatatcagc aatcacagct tgccaatcct ccatctcgtg
2280ctcttggtgg tactgctggg ccaagctcag ggatgcctca tgctcttgtt aatgttgaga
2340ggcaatcagg tggagatgaa ggtcgaccaa ctggttggtc ttcatcaaat cttacacgta
2400ataagagcac ggggctgcat ttcaattctg gaagcttatt gaagcaaaaa ggcacagttg
2460ctaatgattt atccatgggt aaagagttat ccagttctaa ttttttccgg tcaagtggac
2520cattgaggcg tccagttgtc tctagcatcc gagacccagt gattgcaggg ggtgaacctg
2580acccctccgg cactctgaca aaagatgcaa gcccgggacc attgcgtaaa gtatccagtg
2640ctgcacggag gagttcacca gttgtgtcct cagatcacaa gcgcagctcc tctatcaaaa
2700atgccaacat aaagaattta gagtccaccg tcaagggaat agagggttta agttttcgat
2760gatgagggac tgcattagta gctgtgcttt gtctcagttc tccgttcact gtaaattttg
2820gcacaccaac ttggggagta agagttctga tattagttgc tgtcaggaag taccataaag
2880ctgaattata caattaaaat ttgggatcca atcgcaaaag cacattaagg atatgatggg
2940gttgcagatc caaactcaca gattccagtt tatgctcgtc catacagtta taggcacttt
3000ccatattctt ttctttaatc tctgtctctt gcttgttatt gttatgtcgt ggtattcttg
3060ttgaggtcat gtttgtgaat tgcgaagatg gtcatgtata attgccgaga aatcatgtac
3120tagtttgttt taaacatgag caaactgtta ttttgttcaa gctactttaa tatcaaaaaa
3180aaaaaaaaaa gggcggccgc tctagagtat ccctcgaggg gcccaagctt acgcgtaccc
3240agcttt
3246457DNAArtificial sequenceprimer prm07073 4ggggacaagt ttgtacaaaa
aagcaggctt aaacaatgac ttctttctct ctcactt 57550DNAArtificial
sequenceprimer prm07074 5ggggaccact ttgtacaaga aagctgggtc aatagctttt
gaatcaatct 506600PRTAquilegia formosa 6Ser Lys Asn Glu Leu
Cys Arg Leu Ser Ser Thr Phe Leu Ser Thr Met 1 5
10 15 Ala Ser Leu Gln Phe Leu Ala Pro Ser Ser
Ser Pro Leu Gln Ser Asn 20 25
30 Arg Leu Met Val Arg Ala Thr Ser Ser Thr Ser Pro Ser Val Asn
Gln 35 40 45 Thr
Met Val Ala Pro Asp Leu Ser Arg Leu Glu Pro Arg Val Glu Glu 50
55 60 Arg Glu Gly Gly Tyr Trp
Val Leu Lys Glu Lys Tyr Arg Glu Lys Ile 65 70
75 80 Asn Pro Gln Glu Lys Ile Lys Ile Glu Lys Glu
Pro Met Lys Phe Val 85 90
95 Thr Glu Gly Gly Ile His Glu Leu Ala Lys Thr Pro Phe Glu Glu Leu
100 105 110 Glu Lys
Ala Lys Leu Thr Lys Asp Asp Ile Asp Val Arg Leu Lys Trp 115
120 125 Leu Gly Leu Phe His Arg Arg
Lys Asn His Tyr Gly Arg Phe Met Met 130 135
140 Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Glu
Gln Thr Arg Tyr 145 150 155
160 Leu Ala Ser Val Ile Arg Arg Tyr Gly Lys Asp Gly Cys Ala Asp Val
165 170 175 Thr Thr Arg
Gln Asn Trp Gln Ile Arg Gly Val Glu Leu Pro His Val 180
185 190 Pro Glu Ile Met Lys Gly Leu Asn
Gln Val Gly Leu Thr Ser Leu Gln 195 200
205 Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro
Leu Ala Gly 210 215 220
Ile Asp Pro Leu Glu Ile Val Asp Thr Arg Pro Tyr Asn Asp Gln Leu 225
230 235 240 Ser Arg Phe Ile
Thr Gly Asn Phe Lys Gly Asn Leu Ala Phe Thr Asn 245
250 255 Leu Pro Arg Lys Trp Asn Val Cys Val
Val Gly Ser His Asp Leu Phe 260 265
270 Glu His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Thr
Lys Asn 275 280 285
Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Pro Lys 290
295 300 Arg Cys Ala Glu Ala
Ile Pro Leu Asp Ala Trp Val Ser Gly Glu Asp 305 310
315 320 Val Ile Pro Val Cys Lys Ala Ile Leu Glu
Ala Tyr Arg Asp Leu Gly 325 330
335 Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp
Glu 340 345 350 Leu
Gly Val Glu Gly Phe Arg Ser Glu Val Val Lys Arg Met Pro Glu 355
360 365 Gln Glu Leu Glu Arg Ser
Ser Thr Glu Glu Leu Val Gln Lys Gln Trp 370 375
380 Glu Arg Arg Asp Leu Ile Gly Val His Ala Gln
Lys Gln Ala Gly Tyr 385 390 395
400 Ser Phe Val Gly Leu His Ile Pro Val Gly Arg Leu Gln Ala Asp Asp
405 410 415 Met Asp
Glu Leu Ala Arg Ile Ala Asp Glu Tyr Gly Ser Gly Glu Leu 420
425 430 Arg Leu Thr Val Glu Gln Asn
Ile Ile Ile Pro Asn Val Glu Asn Ser 435 440
445 Arg Val Glu Ala Leu Leu Lys Glu Ala Leu Leu Arg
Asp Arg Phe Ser 450 455 460
Pro Thr Pro Pro Leu Leu Met Lys Gly Leu Val Ala Cys Thr Gly Asn 465
470 475 480 Gln Phe Cys
Gly Gln Ala Ile Ile Glu Thr Lys Ala Arg Ala Leu Lys 485
490 495 Val Thr Glu Glu Val Glu Arg Leu
Val Ala Val Thr Lys Pro Val Arg 500 505
510 Met His Trp Thr Gly Cys Pro Asn Thr Cys Ala Gln Val
Gln Val Ala 515 520 525
Asp Ile Gly Phe Met Gly Cys Met Ala Arg Asp Glu Asn Gly Lys Pro 530
535 540 Cys Glu Gly Ala
Asp Val Tyr Leu Gly Gly Arg Ile Gly Ser Asp Ser 545 550
555 560 His Leu Gly Asp Ile Tyr Lys Lys Ser
Val Pro Cys Lys Asp Leu Val 565 570
575 Pro Leu Val Val Asp Ile Leu Ile Glu Arg Phe Gly Ala Val
Pro Arg 580 585 590
Glu Arg Glu Glu Asp Gly Glu Asp 595 600
7583PRTBetula pendula 7Met Ser Ser Leu Ser Val Arg Phe Leu Ser Pro Pro
Leu Phe Ser Ser 1 5 10
15 Thr Pro Ala Trp Pro Arg Thr Gly Leu Ala Ala Thr Gln Ala Val Pro
20 25 30 Pro Val Val
Ala Glu Val Asp Ala Gly Arg Leu Glu Pro Arg Val Glu 35
40 45 Glu Arg Glu Gly Tyr Trp Val Leu
Lys Glu Lys Phe Arg Glu Gly Ile 50 55
60 Asn Pro Gln Glu Lys Leu Lys Leu Glu Arg Glu Pro Met
Lys Leu Phe 65 70 75
80 Met Glu Gly Gly Ile Glu Asp Leu Ala Lys Met Ser Leu Glu Glu Ile
85 90 95 Asp Lys Asp Lys
Ile Ser Lys Ser Asp Ile Asp Val Arg Leu Lys Trp 100
105 110 Leu Gly Leu Phe His Arg Arg Lys His
His Tyr Gly Arg Phe Met Met 115 120
125 Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Ala Gln Thr
Arg Tyr 130 135 140
Leu Ala Ser Val Ile Arg Lys Tyr Gly Lys Asp Gly Cys Ala Asp Val 145
150 155 160 Thr Thr Arg Gln Asn
Trp Gln Ile Arg Gly Val Val Leu Ser Asp Val 165
170 175 Pro Glu Ile Leu Lys Gly Leu Asp Glu Val
Gly Leu Thr Ser Leu Gln 180 185
190 Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro Leu Ala
Gly 195 200 205 Ile
Asp Ile His Glu Ile Val Ala Thr Arg Pro Tyr Asn Asn Leu Leu 210
215 220 Ser Gln Phe Ile Thr Ala
Asn Ser Arg Gly Asn Leu Ala Phe Thr Asn 225 230
235 240 Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly
Ser His Asp Leu Phe 245 250
255 Glu His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Ile Lys Asp
260 265 270 Gly Arg
Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Pro Arg 275
280 285 Arg Cys Ala Glu Ala Val Pro
Leu Asp Ala Trp Val Ser Ala Asp Asp 290 295
300 Ile Ile Leu Val Cys Lys Ala Ile Leu Glu Ala Tyr
Arg Asp Leu Gly 305 310 315
320 Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu
325 330 335 Leu Gly Ile
Glu Gly Phe Arg Ser Glu Val Val Lys Arg Met Pro Asn 340
345 350 Gln Glu Leu Glu Arg Ala Ala Pro
Glu Asp Leu Ile Glu Lys Gln Trp 355 360
365 Glu Arg Arg Glu Leu Ile Gly Val His Pro Gln Lys Gln
Glu Gly Leu 370 375 380
Ser Tyr Val Gly Leu His Ile Pro Val Gly Arg Val Gln Ala Asp Asp 385
390 395 400 Met Asp Glu Leu
Ala Arg Leu Ala Asp Thr Tyr Gly Cys Gly Glu Leu 405
410 415 Arg Leu Thr Val Glu Gln Asn Ile Ile
Ile Pro Asn Ile Glu Asn Ser 420 425
430 Lys Leu Glu Ala Leu Leu Gly Glu Pro Leu Leu Lys Asp Arg
Phe Ser 435 440 445
Pro Glu Pro Pro Ile Leu Met Lys Gly Leu Val Ala Cys Thr Gly Asn 450
455 460 Gln Phe Cys Gly Gln
Ala Ile Ile Glu Thr Lys Ala Arg Ala Leu Lys 465 470
475 480 Val Thr Glu Glu Val Gln Arg Gln Val Ala
Val Thr Arg Pro Val Arg 485 490
495 Met His Trp Thr Gly Cys Pro Asn Ser Cys Gly Gln Val Gln Val
Ala 500 505 510 Asp
Ile Gly Phe Met Gly Cys Met Ala Arg Asp Glu Asn Gly Lys Pro 515
520 525 Cys Glu Gly Ala Ala Val
Phe Leu Gly Gly Arg Ile Gly Ser Asp Ser 530 535
540 His Leu Gly Asn Leu Tyr Lys Lys Gly Val Pro
Cys Lys Asn Leu Val 545 550 555
560 Pro Leu Val Val Asp Ile Leu Val Lys His Phe Gly Ala Val Pro Arg
565 570 575 Glu Arg
Glu Glu Ser Glu Asp 580 8612PRTCapsicum annuum
8Met Thr Ala Thr Ile Ile Thr Thr Leu Asn Asn Gln Glu Ser Thr Lys 1
5 10 15 Phe Leu Asn Ser
Lys Phe Gly Glu Met Ala Ser Phe Ser Val Lys Phe 20
25 30 Ser Ala Thr Ser Ser Leu Thr Ser Ser
Lys Arg Phe Ser Lys Leu His 35 40
45 Ala Thr Pro Pro Gln Thr Val Ala Val Pro Pro Ser Gly Ala
Val Glu 50 55 60
Val Ala Ala Glu Arg Leu Glu Pro Arg Leu Glu Glu Arg Asp Gly Tyr 65
70 75 80 Trp Val Leu Lys Glu
Lys Phe Arg Lys Gly Ile Asn Pro Ala Glu Lys 85
90 95 Ala Lys Ile Glu Lys Glu Pro Met Lys Leu
Phe Thr Glu Asn Gly Ile 100 105
110 Glu Asp Ile Ala Lys Ile Ser Leu Glu Glu Ile Glu Lys Ser Lys
Leu 115 120 125 Ala
Lys Asp Asp Ile Asp Val Arg Leu Lys Trp Leu Gly Leu Phe His 130
135 140 Arg Arg Lys His Gln Tyr
Gly Arg Phe Met Met Arg Leu Lys Leu Pro 145 150
155 160 Asn Gly Ile Thr Thr Ser Ala Gln Thr Arg Tyr
Leu Ala Ser Val Ile 165 170
175 Arg Lys Tyr Gly Lys Asp Gly Cys Ala Asp Val Thr Thr Arg Gln Asn
180 185 190 Trp Gln
Ile Arg Gly Val Val Leu Pro Asp Val Pro Glu Ile Leu Lys 195
200 205 Gly Leu Asp Glu Val Gly Leu
Thr Ser Leu Gln Ser Gly Met Asp Asn 210 215
220 Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly Ile
Asp Pro Gln Glu 225 230 235
240 Ile Val Asp Thr Arg Pro Tyr Ala Asn Leu Leu Ser Asn Leu Leu Ser
245 250 255 Gln Tyr Val
Thr Ala Asn Phe Arg Gly Asn Leu Ser Val His Asn Leu 260
265 270 Pro Arg Lys Trp Asn Val Cys Val
Ile Gly Ser His Asp Leu Tyr Glu 275 280
285 His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Thr
Lys Asp Gly 290 295 300
Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Pro Lys Arg 305
310 315 320 Cys Ala Glu Ala
Ile Pro Leu Asp Ala Trp Val Pro Ala Asp Asp Val 325
330 335 Val Pro Val Cys Lys Thr Ile Leu Glu
Ala Tyr Arg Asp Leu Gly Thr 340 345
350 Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp
Glu Leu 355 360 365
Gly Val Glu Gly Phe Arg Ala Glu Val Val Lys Arg Met Pro Gln Lys 370
375 380 Lys Leu Glu Arg Glu
Ser Thr Glu Asp Leu Val Gln Lys Gln Trp Glu 385 390
395 400 Arg Arg Glu Tyr Leu Gly Val Asn Pro Gln
Lys Gln Glu Gly Tyr Ser 405 410
415 Phe Val Gly Leu His Ile Pro Val Gly Arg Val Gln Ala Asp Asp
Met 420 425 430 Asp
Glu Leu Ala Arg Leu Ala Glu Glu Tyr Gly Ser Gly Glu Leu Arg 435
440 445 Leu Thr Val Glu Gln Asn
Ile Ile Ile Pro Asn Ile Glu Asn Ser Lys 450 455
460 Ile Asp Ala Leu Leu Asn Glu Pro Leu Leu Lys
Gln Ile Ser Pro Asp 465 470 475
480 Pro Pro Ile Leu Met Arg Asn Leu Val Ala Cys Thr Gly Asn Gln Phe
485 490 495 Cys Gly
Gln Ala Ile Ile Glu Thr Lys Ala Arg Ser Met Lys Ile Thr 500
505 510 Glu Glu Val Gln Arg Leu Val
Ser Val Thr Gln Pro Val Arg Met His 515 520
525 Trp Thr Gly Cys Pro Asn Ser Cys Gly Gln Val Gln
Val Ala Asp Ile 530 535 540
Gly Phe Met Gly Cys Leu Thr Arg Lys Glu Gly Lys Thr Val Glu Gly 545
550 555 560 Ala Asp Val
Phe Leu Gly Gly Arg Ile Gly Thr Asp Ser His Leu Gly 565
570 575 Asp Ile Tyr Lys Lys Ser Val Pro
Cys Glu Asp Leu Val Pro Ile Ile 580 585
590 Val Asp Leu Leu Val Asn Asn Phe Gly Ala Val Pro Arg
Glu Arg Glu 595 600 605
Glu Ala Glu Asp 610 9612PRTChlamydomonas reinhardtii 9Met
Leu Leu His Ala Pro His Val Lys Pro Leu Gly Gln Arg Ser Ser 1
5 10 15 Ile Arg Arg Gly Asn Leu
Val Val Ala Asn Val Ala Cys Thr Ala Gly 20
25 30 Lys Asn Pro Thr Ser Arg Pro Ala Lys Arg
Ser Lys Val Glu Phe Ile 35 40
45 Lys Glu Asn Ser Asp His Leu Arg His Pro Leu Met Glu Glu
Leu Val 50 55 60
Asn Asp Glu Thr Phe Ile Thr Glu Asp Ser Val Gln Leu Met Lys Phe 65
70 75 80 His Gly Ser Tyr Gln
Gln Asp Asn Arg Glu Lys Arg Ala Phe Gly Gln 85
90 95 Gly Lys Ala Tyr Ser Phe Leu Met Arg Thr
Arg Gln Pro Ala Gly Val 100 105
110 Val Pro Asn Arg Leu Tyr Leu Val Met Asp Asp Leu Ala Asp Gln
Phe 115 120 125 Gly
Asn Gly Thr Leu Arg Leu Thr Thr Arg Gln Ala Tyr Gln Leu His 130
135 140 Gly Val Leu Lys Lys Asp
Leu Lys Thr Val Phe Ser Ser Val Ile Lys 145 150
155 160 Asn Met Gly Ser Thr Leu Ala Ala Cys Gly Asp
Val Asn Arg Asn Val 165 170
175 Met Gly Pro Ala Ala Pro Phe Thr Asn Arg Pro Asp Tyr Leu Ala Ala
180 185 190 Gln Lys
Ala Ala Leu Asp Leu Ala Asp Leu Leu Thr Pro Gln Ser Gly 195
200 205 Ala Tyr Tyr Asp Val Trp Leu
Asp Gly Glu Lys Phe Met Ser Ser Tyr 210 215
220 Lys Glu Asp Pro Ala Val Thr Glu Ala Arg Ala Phe
Asn Gly Phe Gly 225 230 235
240 Thr Asn Phe Asp Asn Ser Pro Glu Pro Ile Tyr Gly Ser Gln Tyr Leu
245 250 255 Pro Arg Lys
Phe Lys Ile Ala Thr Thr Val Pro Gly Asp Asn Ser Val 260
265 270 Asp Leu Phe Thr Gln Asp Leu Gly
Val Val Val Gln Gly Tyr Asn Leu 275 280
285 Tyr Val Gly Gly Gly Gln Gly Arg Ser His Arg Asp Ala
Asp Thr Phe 290 295 300
Pro Arg Leu Ala Asp Pro Leu Gly Tyr Val Ala Ala Ala Asp Leu Phe 305
310 315 320 Ala Ala Ala Lys
Ala Val Val Ala Val Phe Arg Asp Tyr Gly Arg Arg 325
330 335 Asp Asn Arg Lys Gln Ala Arg Thr Arg
His Met Leu Ala Glu Trp Gly 340 345
350 Val Asp Lys Phe Arg Ser Val Ala Glu Gln Tyr Leu Gly Lys
Arg Phe 355 360 365
Gln Glu Pro Val Pro Leu Pro Pro Trp Gln Tyr Lys Asp Tyr Leu Gly 370
375 380 Trp Gly Glu Gln Gly
Asp Gly Arg Leu Tyr Cys Gly Val Tyr Val Gln 385 390
395 400 Asn Gly Arg Ile Lys Gly Glu Ala Lys Arg
Ala Leu Arg Ala Ala Ile 405 410
415 Glu Arg Tyr Ser Leu Pro Val Val Leu Thr Pro His Gln Asn Leu
Val 420 425 430 Leu
Arg Asp Val Arg Pro Glu Asp Arg Glu Asp Ile Glu Gln Leu Leu 435
440 445 Arg Ala Gly Gly Val Lys
Glu Leu Val Glu Trp Asp Gly Leu Asp Arg 450 455
460 Leu Ser Met Ala Cys Pro Ala Leu Pro Leu Cys
Gly Leu Ala Val Thr 465 470 475
480 Glu Ala Glu Arg Ala Leu Pro Asp Val Asn Thr Arg Ile Arg Ala Met
485 490 495 Leu Thr
Arg Ala Gly Leu Pro Pro Ser Gln Pro Leu His Val Arg Met 500
505 510 Thr Gly Cys Pro Asn Gly Cys
Val Arg Pro Tyr Met Ala Glu Leu Gly 515 520
525 Leu Val Gly Asp Gly Pro Asn Ser Tyr Gln Leu Trp
Leu Gly Gly Gly 530 535 540
Pro Ala Gln Thr Arg Leu Ala Gln Pro Tyr Ala Glu Arg Val Lys Val 545
550 555 560 Lys Asp Leu
Glu Ser Thr Leu Glu Pro Leu Phe Gly Ala Trp Arg Ala 565
570 575 Gly Arg Gln Pro Asp Glu Ala Phe
Gly Asp Trp Val Ala Arg Leu Gly 580 585
590 Phe Asp Ala Val Arg Gln Gln Ala Ala Ala Ala Ala Ala
Ala Ala Pro 595 600 605
Val Gly Thr Ala 610
SEQ ID NO: 10, length 589, PRT, Chlamydomonas reinhardtii
Met
Gln Ser Arg Gln Cys Leu Asn Arg Lys Ala Ser Gly Ala Arg Pro 1
5 10 15 Cys Ala Asn Ser Arg Ser
Leu Thr Ala Arg Val Leu Ala Thr Ala Ala 20
25 30 Pro Val Ala Pro Ser Ala Thr Pro Ala Ser
Ala Pro Leu Pro Leu Pro 35 40
45 Asp Gly Val Gly Glu His Ser Gly Leu Lys His Leu Pro Glu
Ala Ala 50 55 60
Arg Thr Arg Ala Leu Asp Lys Lys Ala Asn Lys Phe Glu Lys Val Lys 65
70 75 80 Val Glu Lys Cys Gly
Ser Arg Ala Trp Asn Asp Val Phe Glu Leu Ser 85
90 95 Ser Leu Leu Lys Glu Gly Lys Thr Lys Trp
Glu Asp Leu Asn Leu Asp 100 105
110 Asp Val Asp Ile Arg Leu Lys Trp Ala Gly Leu Phe His Arg Gly
Lys 115 120 125 Arg
Thr Pro Gly Lys Phe Met Met Arg Leu Lys Val Pro Asn Gly Glu 130
135 140 Leu Thr Ala Ala Gln Leu
Arg Phe Leu Ala Ser Ser Ile Ala Pro Tyr 145 150
155 160 Gly Ala Asp Gly Cys Ala Asp Ile Thr Thr Arg
Ala Asn Ile Gln Leu 165 170
175 Arg Gly Val Thr Met Glu Asp Ser Glu Thr Val Ile Lys Gly Leu Trp
180 185 190 Asp Val
Gly Leu Thr Ser Phe Gln Ser Gly Met Asp Ser Val Arg Asn 195
200 205 Leu Thr Gly Asn Pro Ile Ala
Gly Val Asp Pro His Glu Leu Val Asp 210 215
220 Thr Arg Pro Leu Leu Arg Asp Met Glu Ala Met Leu
Phe Asn Asn Gly 225 230 235
240 Lys Gly Arg Glu Glu Phe Ala Asn Leu Pro Arg Lys Leu Asn Ile Cys
245 250 255 Ile Ser Ser
Thr Arg Asp Asp Phe Pro His Thr His Ile Asn Asp Val 260
265 270 Gly Tyr Glu Ala Val Ala Lys Pro
Asn Gly Glu Val Val Tyr Asn Val 275 280
285 Val Val Gly Gly Tyr Phe Ser Ile Lys Arg Asn Ile Met
Ser Ile Pro 290 295 300
Leu Gly Cys Ser Ile Thr Gln Asp Gln Leu Met Pro Phe Thr Glu Ala 305
310 315 320 Leu Leu Arg Val
Phe Arg Asp His Gly Pro Arg Gly Asp Arg Gln Gln 325
330 335 Thr Arg Leu Met Trp Leu Val Glu Ala
Val Gly Val Asp Lys Phe Arg 340 345
350 Gln Leu Leu Ser Glu Tyr Met Gly Gly Ala Thr Phe Gly Glu
Pro Val 355 360 365
His Val His His Asp Gln Pro Trp Glu Arg Arg Asn Leu Leu Gly Val 370
375 380 His Arg Gln Arg Gln
Ala Gly Leu Asn Trp Val Gly Ala Cys Val Pro 385 390
395 400 Ala Gly Arg Leu His Ala Ala Asp Phe Glu
Glu Ile Ala Ala Val Ala 405 410
415 Glu Lys Tyr Gly Asp Gly Thr Val Arg Ile Thr Cys Glu Glu Asn
Val 420 425 430 Ile
Phe Thr Asn Val Pro Asp Ala Lys Leu Glu Ala Met Lys Ala Glu 435
440 445 Pro Leu Phe Gln Arg Phe
Pro Ile Phe Pro Gly Val Leu Leu Ser Gly 450 455
460 Met Val Ser Cys Thr Gly Asn Gln Phe Cys Gly
Phe Gly Leu Ala Glu 465 470 475
480 Thr Lys Ala Lys Ala Val Lys Val Val Glu Ala Leu Asp Ala Gln Leu
485 490 495 Glu Leu
Ser Arg Pro Val Arg Ile His Phe Thr Gly Cys Pro Asn Ser 500
505 510 Cys Gly Gln Ala Gln Val Gly
Asp Ile Gly Leu Met Gly Ala Pro Ala 515 520
525 Lys His Glu Gly Lys Ala Val Glu Gly Tyr Lys Ile
Phe Leu Gly Gly 530 535 540
Lys Ile Gly Glu Asn Pro Ala Leu Ala Thr Glu Phe Ala Gln Gly Val 545
550 555 560 Pro Ala Ile
Glu Ser Val Leu Val Pro Arg Leu Lys Glu Ile Leu Ile 565
570 575 Ser Glu Phe Gly Ala Lys Glu Arg
Ala Thr Ala Thr Ala 580 585
SEQ ID NO: 11, length 469, PRT, Chlamydomonas reinhardtii
Met Leu Leu Lys Gly Ile Thr Thr Pro
Met Leu Gly Gln Gln Arg Pro 1 5 10
15 Thr Arg Gly Gln Leu His Val Val Asn Val Ala Thr Pro Ser
Lys Asn 20 25 30
Pro Ser Ser Arg Leu Ala Lys Arg Ser Lys Val Glu Ile Ile Lys Glu
35 40 45 Lys Ser Asp Tyr
Leu Arg His Pro Leu Met Glu Glu Leu Val Asn Asp 50
55 60 Ala Thr Phe Ile Thr Glu Asp Ser
Val Gln Leu Met Lys Phe His Gly 65 70
75 80 Ser Tyr Gln Gln Asp His Arg Glu Lys Arg Ala Phe
Gly Gln Gly Lys 85 90
95 Ala Tyr Cys Phe Met Met Arg Thr Arg Gln Pro Ala Gly Val Val Pro
100 105 110 Asn Arg Leu
Tyr Leu Val Met Asp Asp Leu Ala Asp Gln Tyr Gly Asn 115
120 125 Gly Thr Leu Arg Leu Thr Thr Arg
Gln Ala Tyr Gln Leu His Gly Val 130 135
140 Leu Lys Lys Asp Leu Lys Thr Val Phe Ser Ser Val Ile
Lys Asn Met 145 150 155
160 Gly Ser Thr Leu Ala Ala Cys Gly Asp Val Asn Arg Asn Val Met Gly
165 170 175 Pro Ser Ala Pro
Phe Thr Asn Arg Pro Asp Tyr Val Ala Ala Gln Lys 180
185 190 Ala Ala Asn Asp Ile Ala Asp Leu Leu
Thr Pro Gln Ser Gly Ala Tyr 195 200
205 Tyr Asp Val Trp Leu Asp Gly Glu Lys Phe Met Ser Ala Tyr
Lys Glu 210 215 220
Asp Pro Lys Val Thr Ala Asp Arg Ala Tyr Asn Gly Phe Gly Thr Asn 225
230 235 240 Phe Glu Asn Ser Pro
Glu Pro Ile Tyr Gly Ala Gln Phe Leu Pro Arg 245
250 255 Lys Phe Lys Val Ala Thr Thr Val Pro Gly
Asp Asn Ser Val Asp Leu 260 265
270 Phe Thr Gln Asp Leu Gly Val Val Val Ile Met Asp Glu Ser Gly
Lys 275 280 285 Glu
Val Lys Gly Tyr Asn Leu Thr Val Gly Gly Gly Met Gly Arg Thr 290
295 300 His Arg Asp Asp Glu Thr
Phe Pro Arg Leu Ala Asp Pro Leu Gly Tyr 305 310
315 320 Val Asp Lys Asp Asp Leu Phe His Ala Val Lys
Ala Val Val Ala Val 325 330
335 Gln Arg Asp Tyr Gly Arg Arg Asp Asn Arg Lys Gln Ala Arg Leu Lys
340 345 350 Tyr Leu
Val Gly Leu Pro Ala Asp Gln Glu Leu His Val Arg Met Thr 355
360 365 Gly Cys Pro Asn Gly Cys Ala
Arg Pro Tyr Met Ala Glu Leu Gly Phe 370 375
380 Val Gly Asp Gly Pro Asn Ser Tyr Gln Leu Tyr Phe
Gly Gly Asn Val 385 390 395
400 Asn Gln Thr Arg Leu Ala Gln Leu Phe Ala Asp Arg Val Lys Val Lys
405 410 415 Asp Leu Glu
Ser Thr Leu Glu Pro Ile Phe Ala Ala Trp Lys Ala Ser 420
425 430 Arg Arg Pro Lys Glu Ser Phe Gly
Asp Trp Val Ser Arg Pro Ser Gln 435 440
445 Asp Pro Lys Asn Leu Ser Ser Val Gln Gln Gly Thr Gln
His Glu Ser 450 455 460
Ala Val Val Ala His 465
SEQ ID NO: 12, length 588, PRT, Gossypium hirsutum
Met Ser Ser Leu Ser Val Arg Phe Phe Ala Pro Gln Gln Pro Leu Leu 1
5 10 15 Pro Ser Thr Ala
Ser Ser Phe Lys Pro Lys Thr Trp Val Met Ala Ala 20
25 30 Pro Thr Thr Ala Pro Ala Thr Ser Val
Asp Val Asp Gly Gly Arg Leu 35 40
45 Glu Pro Arg Val Glu Glu Arg Glu Gly Tyr Phe Val Leu Lys
Glu Lys 50 55 60
Phe Arg Asp Gly Ile Asn Pro Gln Glu Lys Ile Lys Ile Glu Lys Asp 65
70 75 80 Pro Leu Lys Leu Phe
Met Glu Ala Gly Ile Asp Glu Leu Ala Lys Met 85
90 95 Ser Phe Glu Asp Leu Asp Lys Ala Lys Ala
Thr Lys Asp Asp Ile Asp 100 105
110 Val Arg Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His Gln
Tyr 115 120 125 Gly
Arg Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser 130
135 140 Ala Gln Thr Arg Tyr Leu
Ala Ser Val Ile Arg Lys Tyr Gly Lys Glu 145 150
155 160 Gly Cys Ala Asp Val Thr Thr Arg Gln Asn Trp
Gln Ile Arg Gly Ala 165 170
175 Val Leu Pro Asp Val Pro Glu Ile Leu Lys Gly Leu Asp Glu Val Gly
180 185 190 Leu Thr
Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly 195
200 205 Asn Pro Leu Ala Gly Ile Asp
Pro Glu Glu Ile Val Asp Thr Arg Pro 210 215
220 Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr Ala Asn
Ser Arg Gly Asn 225 230 235
240 Pro Ala Val Ala Asn Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly
245 250 255 Ser His Asp
Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met 260
265 270 Pro Ala Thr Lys Asn Gly Arg Phe
Gly Phe Asn Leu Leu Val Gly Gly 275 280
285 Phe Phe Ser Ala Lys Arg Cys Asp Glu Ala Ile Pro Leu
Asp Ala Trp 290 295 300
Val Ser Ala Asp Asp Val Ile Pro Leu Cys Lys Ala Val Leu Glu Ala 305
310 315 320 Tyr Arg Asp Leu
Gly Tyr Arg Gly Asn Arg Gln Lys Thr Arg Met Met 325
330 335 Trp Leu Ile Asp Glu Leu Gly Ile Glu
Val Phe Arg Ser Glu Val Ala 340 345
350 Lys Arg Met Pro Gln Lys Glu Leu Glu Arg Ala Ser Asp Glu
Asp Leu 355 360 365
Val Gln Lys Gln Trp Glu Arg Arg Asp Tyr Leu Gly Val His Pro Gln 370
375 380 Lys Gln Glu Gly Phe
Ser Tyr Ile Gly Ile His Ile Pro Val Gly Arg 385 390
395 400 Val Gln Ala Asp Asp Met Asp Glu Leu Ala
Arg Leu Ala Asp Thr Tyr 405 410
415 Gly Ser Gly Glu Phe Arg Leu Thr Val Glu Gln Asn Ile Ile Ile
Pro 420 425 430 Asn
Val Glu Asn Ser Lys Leu Glu Ala Leu Leu Asn Glu Pro Leu Leu 435
440 445 Lys Asp Arg Phe Ser Pro
Gln Pro Ser Ile Leu Met Lys Gly Leu Val 450 455
460 Ala Cys Thr Gly Asn Gln Phe Cys Gly Gln Ala
Ile Ile Glu Thr Lys 465 470 475
480 Ala Arg Ala Leu Lys Val Thr Glu Glu Val Glu Arg Leu Val Ser Val
485 490 495 Ser Arg
Pro Val Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Gly 500
505 510 Gln Val Gln Val Ala Asp Ile
Gly Phe Met Gly Cys Met Ala Arg Asp 515 520
525 Glu Asn Gly Lys Pro Cys Glu Gly Ala Asp Ile Phe
Leu Gly Gly Arg 530 535 540
Ile Gly Ser Asp Ser His Leu Gly Glu Leu Tyr Lys Lys Gly Val Pro 545
550 555 560 Cys Lys Asn
Leu Val Pro Val Val Ala Asp Ile Leu Val Glu Pro Phe 565
570 575 Gly Ala Val Pro Arg Gln Arg Glu
Glu Gly Glu Asp 580 585
SEQ ID NO: 13, length 473, PRT, Hordeum vulgare; UNSURE (473)..(473): unknown amino acid
Met Ala Ser
Ser Ala Ser Leu Gln Ser Phe Leu Pro Pro Ser Ala His 1 5
10 15 Ala Ala Thr Ser Ser Ser Arg Leu
Arg Pro Ser Arg Ala Arg Pro Val 20 25
30 Gln Cys Ala Ala Val Ser Ala Pro Ser Ser Ser Ser Ser
Ser Ala Ser 35 40 45
Pro Ser Ala Ser Ala Val Pro Ser Glu Arg Leu Glu Pro Arg Val Glu 50
55 60 Gln Arg Glu Gly
Gly Tyr Trp Val Leu Lys Glu Lys Tyr Arg Thr Ser 65 70
75 80 Leu Asn Pro Gln Glu Lys Val Lys Leu
Gly Lys Glu Pro Met Ala Leu 85 90
95 Phe Thr Glu Gly Gly Ile Asn Asp Leu Ala Lys Leu Pro Met
Glu Gln 100 105 110
Ile Asp Ala Asp Lys Leu Thr Lys Glu Asp Val Asp Val Arg Leu Lys
115 120 125 Trp Leu Gly Leu
Phe His Arg Arg Lys Gln Gln Tyr Gly Arg Phe Met 130
135 140 Met Arg Leu Lys Leu Pro Asn Gly
Val Thr Thr Ser Glu Gln Thr Arg 145 150
155 160 Tyr Leu Ala Ser Val Ile Asp Lys Tyr Gly Glu Glu
Gly Cys Ala Asp 165 170
175 Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val Thr Leu Pro Asp
180 185 190 Val Pro Glu
Ile Leu Asp Gly Leu Arg Ser Val Gly Leu Thr Ser Leu 195
200 205 Gln Ser Gly Met Asp Asn Val Arg
Asn Pro Val Gly Ser Pro Leu Ala 210 215
220 Gly Ile Asp Pro Leu Glu Ile Val Asp Thr Arg Pro Tyr
Thr Asn Leu 225 230 235
240 Leu Ser Ser Tyr Ile Thr Asn Asn Ser Glu Gly Asn Leu Ala Ile Thr
245 250 255 Asn Leu Pro Arg
Lys Trp Asn Val Cys Val Ile Gly Thr His Asp Leu 260
265 270 Tyr Glu His Pro His Ile Asn Asp Leu
Ala Tyr Met Pro Ala Glu Lys 275 280
285 Asp Gly Lys Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Ile
Ser Pro 290 295 300
Lys Arg Trp Gly Glu Ala Leu Pro Leu Asp Ala Trp Val Pro Gly Asp 305
310 315 320 Asp Ile Ile Pro Val
Cys Lys Ala Val Leu Glu Ala Phe Arg Asp Leu 325
330 335 Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg
Met Met Trp Leu Ile Asp 340 345
350 Glu Leu Gly Met Glu Ala Phe Arg Ser Glu Ile Glu Lys Arg Met
Pro 355 360 365 Asn
Gly Val Leu Glu Arg Ala Ala Pro Glu Asp Leu Ile Asp Lys Lys 370
375 380 Trp Glu Arg Arg Asp Tyr
Leu Gly Val His Pro Gln Lys Gln Glu Gly 385 390
395 400 Leu Ser Phe Val Gly Leu His Val Pro Val Gly
Arg Leu Gln Ala Ala 405 410
415 Asp Met Phe Glu Leu Ala Arg Leu Ala Asp Glu Tyr Gly Ser Gly Glu
420 425 430 Leu Arg
Leu Thr Val Glu Gln Asn Ile Val Leu Pro Asn Val Lys Asn 435
440 445 Glu Lys Val Glu Ala Leu Leu
Ala Glu Pro Leu Leu His Lys Phe Ser 450 455
460 Ala His Pro Ser Leu Leu Met Lys Xaa 465
470
SEQ ID NO: 14, length 582, PRT, Lotus japonicus
Met Ser Ser Ser Phe Ser
Ile Arg Phe Leu Ala Pro Pro Phe Pro Ser 1 5
10 15 Thr Ser Arg Pro Lys Ser Cys Leu Ser Ala Ala
Thr Pro Ala Val Ala 20 25
30 Pro Thr Asp Ala Ala Val Ser Arg Leu Glu Pro Arg Val Glu Glu
Arg 35 40 45 Asn
Gly Tyr Trp Val Leu Lys Glu Glu His Arg Gly Gly Ile Asn Pro 50
55 60 Gln Glu Lys Val Lys Leu
Glu Lys Glu Pro Met Ala Leu Phe Met Glu 65 70
75 80 Gly Gly Ile Asp Glu Leu Ala Lys Val Ser Ile
Glu Glu Leu Asp Ser 85 90
95 Ser Lys Leu Thr Lys Asp Asp Val Asp Val Arg Leu Lys Trp Leu Gly
100 105 110 Leu Phe
His Arg Arg Lys His Gln Tyr Gly Arg Phe Met Met Arg Leu 115
120 125 Lys Leu Pro Asn Gly Val Thr
Thr Ser Ala Gln Thr Arg Tyr Leu Ala 130 135
140 Ser Val Ile Arg Lys Tyr Gly Lys Asp Gly Cys Ala
Asp Val Thr Thr 145 150 155
160 Arg His Asn Trp Gln Ile Arg Gly Val Val Leu Pro Asp Val Pro Glu
165 170 175 Ile Leu Lys
Gly Leu Ala Glu Val Gly Leu Thr Ser Leu Gln Ser Gly 180
185 190 Met Asp Asn Val Arg Asn Pro Val
Gly Asn Pro Leu Ala Gly Ile Asp 195 200
205 Pro Asp Glu Ile Val Asp Thr Arg Pro Tyr Thr Asn Leu
Leu Ser His 210 215 220
Phe Ile Thr Ala Asn Ser Arg Gly Asn Pro Thr Val Ser Asn Leu Pro 225
230 235 240 Arg Lys Trp Asn
Val Cys Val Val Gly Ser His Asp Leu Phe Glu His 245
250 255 Pro His Ile Asn Asp Leu Ala Tyr Met
Pro Ala Asn Lys Asp Gly Arg 260 265
270 Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Pro Lys
Arg Cys 275 280 285
Ala Glu Ala Ile Pro Leu Asp Ala Trp Val Ser Ala Glu Asp Val Ile 290
295 300 Pro Val Cys Lys Ala
Ile Leu Glu Met Tyr Arg Asp Leu Gly Thr Arg 305 310
315 320 Gly Asn Arg Gln Lys Thr Arg Met Met Trp
Leu Ile Asp Glu Leu Gly 325 330
335 Ile Glu Val Phe Arg Ser Glu Val Val Lys Arg Met Pro Leu Gly
Gln 340 345 350 Gln
Leu Glu Arg Ala Ser Gln Glu Asp Leu Val Gln Lys Gln Trp Glu 355
360 365 Arg Arg Asp Tyr Phe Gly
Ala Asn Pro Gln Lys Gln Glu Gly Leu Ser 370 375
380 Tyr Val Gly Ile His Ile Pro Val Gly Arg Ile
Gln Ala Asp Glu Met 385 390 395
400 Asp Glu Leu Ala Arg Leu Ala Asp Glu Tyr Gly Thr Gly Glu Leu Arg
405 410 415 Leu Thr
Val Glu Gln Asn Ile Ile Ile Pro Asn Val Glu Asn Ser Lys 420
425 430 Leu Ser Ala Leu Leu Asn Glu
Pro Leu Leu Lys Glu Lys Phe Ser Pro 435 440
445 Glu Pro Ser Leu Leu Met Lys Thr Leu Val Ala Cys
Thr Gly Ser Gln 450 455 460
Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Ala Arg Ala Leu Lys Val 465
470 475 480 Thr Glu Glu
Val Glu Arg Leu Val Ala Val Thr Arg Pro Val Arg Met 485
490 495 His Trp Thr Gly Cys Pro Asn Thr
Cys Gly Gln Val Gln Val Ala Asp 500 505
510 Ile Gly Phe Met Gly Cys Met Ala Arg Asp Glu Asn Gly
Lys Pro Gly 515 520 525
Glu Gly Val Asp Ile Phe Leu Gly Gly Arg Ile Gly Ser Asp Ser His 530
535 540 Leu Ala Glu Val
Tyr Lys Lys Ala Val Pro Cys Lys Asp Leu Val Pro 545 550
555 560 Ile Val Ala Asp Ile Leu Val Lys His
Phe Gly Ala Val Gln Arg Asn 565 570
575 Arg Glu Glu Gly Asp Asp 580
SEQ ID NO: 15, length 587, PRT, Nicotiana tabacum
Met Ala Ser Phe Ser Val Lys Phe Ser Ala Thr
Ser Leu Pro Asn Pro 1 5 10
15 Asn Arg Phe Ser Arg Thr Ala Lys Leu His Ala Thr Pro Pro Gln Thr
20 25 30 Val Ala
Val Pro Pro Ser Gly Glu Ala Glu Ile Ala Ser Glu Arg Leu 35
40 45 Glu Pro Arg Val Glu Glu Lys
Asp Gly Tyr Trp Val Leu Lys Glu Lys 50 55
60 Phe Arg Gln Gly Ile Asn Pro Ala Glu Lys Ala Lys
Ile Glu Lys Glu 65 70 75
80 Pro Met Lys Leu Phe Met Glu Asn Gly Ile Glu Asp Leu Ala Lys Ile
85 90 95 Ser Leu Glu
Glu Ile Glu Gly Ser Lys Leu Thr Lys Asp Asp Ile Asp 100
105 110 Val Arg Leu Lys Trp Leu Gly Leu
Phe His Arg Arg Lys His His Tyr 115 120
125 Gly Arg Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val
Thr Thr Ser 130 135 140
Ala Gln Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Lys Asp 145
150 155 160 Gly Cys Gly Asp
Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val 165
170 175 Val Leu Pro Asp Val Pro Glu Ile Leu
Lys Gly Leu Asp Glu Val Gly 180 185
190 Leu Thr Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro
Val Gly 195 200 205
Asn Pro Leu Ala Gly Ile Asp Pro His Glu Ile Val Asp Thr Arg Pro 210
215 220 Tyr Thr Asn Leu Leu
Ser Gln Tyr Val Thr Ala Asn Phe Arg Gly Asn 225 230
235 240 Pro Ala Val Thr Asn Leu Pro Arg Lys Trp
Asn Val Cys Val Ile Gly 245 250
255 Ser His Asp Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr
Met 260 265 270 Pro
Ala Ser Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly 275
280 285 Phe Phe Ser Pro Lys Arg
Cys Ala Glu Ala Val Pro Leu Asp Ala Trp 290 295
300 Val Pro Ala Asp Asp Val Val Pro Val Cys Lys
Ala Ile Leu Glu Ala 305 310 315
320 Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met
325 330 335 Trp Leu
Val Asp Glu Leu Gly Val Glu Gly Phe Arg Ala Glu Val Val 340
345 350 Lys Arg Met Pro Gln Gln Lys
Leu Asp Arg Glu Ser Thr Glu Asp Leu 355 360
365 Val Gln Lys Gln Trp Glu Arg Arg Glu Tyr Leu Gly
Val His Pro Gln 370 375 380
Lys Gln Glu Gly Tyr Ser Phe Val Gly Leu His Ile Pro Val Gly Arg 385
390 395 400 Val Gln Ala
Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Asn Tyr 405
410 415 Gly Ser Gly Glu Leu Arg Leu Thr
Val Glu Gln Asn Ile Ile Ile Pro 420 425
430 Asn Val Glu Asn Ser Lys Ile Glu Ser Leu Leu Asn Glu
Pro Leu Leu 435 440 445
Lys Asn Arg Phe Ser Thr Asn Pro Pro Ile Leu Met Lys Asn Leu Val 450
455 460 Ala Cys Thr Gly
Asn Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys 465 470
475 480 Ala Arg Ser Met Lys Ile Thr Glu Glu
Val Gln Arg Leu Val Ser Val 485 490
495 Thr Lys Pro Val Arg Met His Trp Thr Gly Cys Pro Asn Ser
Cys Gly 500 505 510
Gln Val Gln Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Lys
515 520 525 Glu Gly Lys Thr
Val Glu Gly Ala Asp Val Tyr Leu Gly Gly Arg Ile 530
535 540 Gly Ser Asp Ser His Leu Gly Asp
Val Tyr Lys Lys Ser Val Pro Cys 545 550
555 560 Glu Asp Leu Val Pro Ile Ile Val Asp Leu Leu Val
Asn Asn Phe Gly 565 570
575 Ala Val Pro Arg Glu Arg Glu Glu Ala Glu Asp 580
585
SEQ ID NO: 16, length 587, PRT, Nicotiana tabacum
Met Ala Ser Phe Ser Ile
Lys Phe Leu Ala Pro Ser Leu Pro Asn Pro 1 5
10 15 Ala Arg Phe Ser Lys Asn Ala Val Lys Leu His
Ala Thr Pro Pro Ser 20 25
30 Val Ala Ala Pro Pro Thr Gly Ala Pro Glu Val Ala Ala Glu Arg
Leu 35 40 45 Glu
Pro Arg Val Glu Glu Lys Asp Gly Tyr Trp Ile Leu Lys Glu Gln 50
55 60 Phe Arg Lys Gly Ile Asn
Pro Gln Glu Lys Val Lys Ile Glu Lys Gln 65 70
75 80 Pro Met Lys Leu Phe Met Glu Asn Gly Ile Glu
Glu Leu Ala Lys Ile 85 90
95 Pro Ile Glu Glu Ile Asp Gln Ser Lys Leu Thr Lys Asp Asp Ile Asp
100 105 110 Val Arg
Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys Asn Gln Tyr 115
120 125 Gly Arg Phe Met Met Arg Leu
Lys Leu Pro Asn Gly Val Thr Thr Ser 130 135
140 Ala Gln Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys
Tyr Gly Lys Glu 145 150 155
160 Gly Cys Ala Asp Ile Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val
165 170 175 Val Leu Pro
Asp Val Pro Glu Ile Leu Lys Gly Leu Ala Glu Val Gly 180
185 190 Leu Thr Ser Leu Gln Ser Gly Met
Asp Asn Val Arg Asn Pro Val Gly 195 200
205 Asn Pro Leu Ala Gly Ile Asp Pro Glu Glu Ile Val Asp
Thr Arg Pro 210 215 220
Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr Gly Asn Ser Arg Gly Asn 225
230 235 240 Pro Ala Val Ser
Asn Leu Pro Arg Lys Trp Asn Pro Cys Val Val Gly 245
250 255 Ser His Asp Leu Tyr Glu His Pro His
Ile Asn Asp Leu Ala Tyr Met 260 265
270 Pro Ala Thr Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val
Gly Gly 275 280 285
Phe Phe Ser Ala Lys Arg Cys Asp Glu Ala Ile Pro Leu Asp Ala Trp 290
295 300 Val Pro Ala Asp Asp
Val Val Pro Val Cys Lys Ala Ile Leu Glu Ala 305 310
315 320 Phe Arg Asp Leu Gly Phe Arg Gly Asn Arg
Gln Lys Cys Arg Met Met 325 330
335 Trp Leu Ile Asp Glu Leu Gly Val Glu Gly Phe Arg Ala Glu Val
Glu 340 345 350 Lys
Arg Met Pro Gln Gln Gln Leu Glu Arg Ala Ser Pro Glu Asp Leu 355
360 365 Val Gln Lys Gln Trp Glu
Arg Arg Asp Tyr Leu Gly Val His Pro Gln 370 375
380 Lys Gln Glu Gly Tyr Ser Phe Ile Gly Leu His
Ile Pro Val Gly Arg 385 390 395
400 Val Gln Ala Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Glu Tyr
405 410 415 Gly Ser
Gly Glu Ile Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro 420
425 430 Asn Ile Glu Asn Ser Lys Ile
Glu Ala Leu Leu Lys Glu Pro Val Leu 435 440
445 Ser Thr Phe Ser Pro Asp Pro Pro Ile Leu Met Lys
Gly Leu Val Ala 450 455 460
Cys Thr Gly Asn Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Ala 465
470 475 480 Arg Ser Leu
Met Ile Thr Glu Glu Val Gln Arg Gln Val Ser Leu Thr 485
490 495 Arg Pro Val Arg Met His Trp Thr
Gly Cys Pro Asn Thr Cys Ala Gln 500 505
510 Val Gln Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr
Arg Asp Lys 515 520 525
Asn Gly Lys Thr Val Glu Gly Ala Asp Val Phe Leu Gly Gly Arg Ile 530
535 540 Gly Ser Asp Ser
His Leu Gly Glu Val Tyr Lys Lys Ala Val Pro Cys 545 550
555 560 Asp Asp Leu Val Pro Leu Val Val Asp
Leu Leu Val Asn Asn Phe Gly 565 570
575 Ala Val Pro Arg Glu Arg Glu Glu Thr Glu Asp
580 585
SEQ ID NO: 17, length 584, PRT, Nicotiana tabacum
Met Ala Ser
Phe Ser Val Lys Phe Ser Ala Thr Ser Leu Pro Asn His 1 5
10 15 Lys Arg Phe Ser Lys Leu His Ala
Thr Pro Pro Gln Thr Val Ala Val 20 25
30 Ala Pro Ser Gly Ala Ala Glu Ile Ala Ser Glu Arg Leu
Glu Pro Arg 35 40 45
Val Glu Glu Lys Asp Gly Tyr Trp Val Leu Lys Glu Lys Phe Arg Gln 50
55 60 Gly Ile Asn Pro
Ala Glu Lys Ala Lys Ile Glu Lys Glu Pro Met Lys 65 70
75 80 Leu Phe Met Glu Asn Gly Ile Glu Asp
Leu Ala Lys Ile Ser Leu Glu 85 90
95 Glu Ile Glu Gly Ser Lys Leu Thr Lys Asp Asp Ile Asp Val
Arg Leu 100 105 110
Lys Trp Leu Gly Leu Phe His Arg Arg Lys His His Tyr Gly Arg Phe
115 120 125 Met Met Arg Leu
Lys Leu Pro Asn Gly Val Thr Thr Ser Ser Gln Thr 130
135 140 Arg Tyr Leu Ala Ser Val Ile Arg
Lys Tyr Gly Lys Asp Gly Cys Ala 145 150
155 160 Asp Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly
Val Val Leu Pro 165 170
175 Asp Val Pro Glu Ile Leu Lys Gly Leu Asp Glu Val Gly Leu Thr Ser
180 185 190 Leu Gln Ser
Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro Leu 195
200 205 Ala Gly Ile Asp Pro His Glu Ile
Val Asp Thr Arg Pro Tyr Thr Asn 210 215
220 Leu Leu Ser Gln Tyr Val Thr Ala Asn Phe Arg Gly Asn
Pro Ala Val 225 230 235
240 Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Ile Gly Ser His Asp
245 250 255 Leu Tyr Glu His
Pro Gln Ile Asn Asp Leu Ala Tyr Met Pro Ala Thr 260
265 270 Lys Asp Gly Arg Phe Gly Phe Asn Leu
Leu Val Gly Gly Phe Phe Ser 275 280
285 Pro Lys Arg Cys Ala Glu Ala Val Pro Leu Asp Ala Trp Val
Pro Ala 290 295 300
Asp Asp Val Val Pro Val Cys Lys Ala Ile Leu Glu Ala Tyr Arg Asp 305
310 315 320 Leu Gly Thr Arg Gly
Asn Arg Gln Lys Thr Arg Met Met Trp Leu Val 325
330 335 Asp Glu Leu Gly Val Glu Gly Phe Arg Ala
Glu Val Val Lys Arg Met 340 345
350 Pro Gln Gln Lys Leu Asp Arg Glu Ser Thr Glu Asp Leu Val Gln
Lys 355 360 365 Gln
Trp Glu Arg Arg Glu Tyr Leu Gly Val His Pro Gln Lys Gln Glu 370
375 380 Gly Tyr Ser Phe Val Gly
Leu His Ile Pro Val Gly Arg Val Gln Ala 385 390
395 400 Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp
Glu Tyr Gly Ser Gly 405 410
415 Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Val Lys
420 425 430 Asn Ser
Lys Ile Glu Ala Leu Leu Asn Glu Pro Leu Leu Lys Asn Arg 435
440 445 Phe Ser Thr Asp Pro Pro Ile
Leu Met Lys Asn Leu Val Ala Cys Thr 450 455
460 Gly Asn Gln Phe Cys Gly Lys Ala Ile Ile Glu Thr
Lys Ala Arg Ser 465 470 475
480 Met Lys Ile Thr Glu Glu Val Gln Leu Leu Val Ser Ile Thr Gln Pro
485 490 495 Val Arg Met
His Trp Thr Gly Cys Pro Asn Ser Cys Ala Gln Val Gln 500
505 510 Val Ala Asp Ile Gly Phe Met Gly
Cys Leu Thr Arg Lys Glu Gly Lys 515 520
525 Thr Val Glu Gly Ala Asp Val Tyr Leu Gly Gly Arg Ile
Gly Ser Asp 530 535 540
Ser His Leu Gly Asp Val Tyr Lys Lys Ser Val Pro Cys Glu Asp Leu 545
550 555 560 Val Pro Ile Ile
Val Asp Leu Leu Val Asp Asn Phe Gly Ala Val Pro 565
570 575 Arg Glu Arg Glu Glu Ala Glu Asp
580
SEQ ID NO: 18, length 594, PRT, Oryza sativa
Met Ala Ser Ser Ala
Ser Leu Gln Arg Phe Leu Pro Pro Tyr Pro His 1 5
10 15 Ala Ala Ala Ser Arg Cys Arg Pro Pro Gly
Val Arg Ala Arg Pro Val 20 25
30 Gln Ser Ser Thr Val Ser Ala Pro Ser Ser Ser Thr Pro Ala Ala
Asp 35 40 45 Glu
Ala Val Ser Ala Glu Arg Leu Glu Pro Arg Val Glu Gln Arg Glu 50
55 60 Gly Arg Tyr Trp Val Leu
Lys Glu Lys Tyr Arg Thr Gly Leu Asn Pro 65 70
75 80 Gln Glu Lys Val Lys Leu Gly Lys Glu Pro Met
Ser Leu Phe Met Glu 85 90
95 Gly Gly Ile Lys Glu Leu Ala Lys Met Pro Met Glu Glu Ile Glu Ala
100 105 110 Asp Lys
Leu Ser Lys Glu Asp Ile Asp Val Arg Leu Lys Trp Leu Gly 115
120 125 Leu Phe His Arg Arg Lys His
Gln Tyr Gly Arg Phe Met Met Arg Leu 130 135
140 Lys Leu Pro Asn Gly Val Thr Thr Ser Glu Gln Thr
Arg Tyr Leu Ala 145 150 155
160 Ser Val Ile Glu Ala Tyr Gly Lys Glu Gly Cys Ala Asp Val Thr Thr
165 170 175 Arg Arg Gln
Ile Arg Gly Val Thr Leu Pro Asp Val Pro Ala Ile Leu 180
185 190 Asp Gly Leu Asn Ala Val Gly Leu
Thr Ser Leu Gln Ser Gly Met Asp 195 200
205 Asn Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly Ile
Asp Pro Asp 210 215 220
Glu Ile Val Asp Thr Arg Ser Tyr Thr Asn Leu Leu Ser Ser Tyr Ile 225
230 235 240 Thr Ser Asn Phe
Gln Gly Asn Pro Thr Ile Thr Asn Leu Pro Arg Lys 245
250 255 Trp Asn Val Cys Val Ile Gly Ser His
Asp Leu Tyr Glu His Pro His 260 265
270 Ile Asn Asp Leu Ala Tyr Met Pro Ala Val Lys Gly Gly Lys
Phe Gly 275 280 285
Phe Asn Leu Leu Val Gly Gly Phe Ile Ser Pro Lys Arg Trp Glu Glu 290
295 300 Ala Leu Pro Leu Asp
Ala Trp Val Pro Gly Asp Asp Ile Ile Pro Val 305 310
315 320 Cys Lys Ala Val Leu Glu Ala Tyr Arg Asp
Leu Gly Thr Arg Gly Asn 325 330
335 Arg Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu Leu Gly Met
Glu 340 345 350 Ala
Phe Arg Ser Glu Val Glu Lys Arg Met Pro Asn Gly Val Leu Glu 355
360 365 Arg Ala Ala Pro Glu Asp
Leu Ile Asp Lys Lys Trp Gln Arg Arg Asp 370 375
380 Tyr Leu Gly Val His Pro Gln Lys Gln Glu Gly
Met Ser Tyr Val Gly 385 390 395
400 Leu His Val Pro Val Gly Arg Val Gln Ala Ala Asp Met Phe Glu Leu
405 410 415 Ala Arg
Leu Ala Asp Glu Tyr Gly Ser Gly Glu Leu Arg Leu Thr Val 420
425 430 Glu Gln Asn Ile Val Ile Pro
Asn Val Lys Asn Glu Lys Val Glu Ala 435 440
445 Leu Leu Ser Glu Pro Leu Leu Gln Lys Phe Ser Pro
Gln Pro Ser Leu 450 455 460
Leu Leu Lys Gly Leu Val Ala Cys Thr Gly Asn Gln Phe Cys Gly Gln 465
470 475 480 Ala Ile Ile
Glu Thr Lys Gln Arg Ala Leu Leu Val Thr Ser Gln Val 485
490 495 Glu Lys Leu Val Ser Val Pro Arg
Ala Val Arg Met His Trp Thr Gly 500 505
510 Cys Pro Asn Ser Cys Gly Gln Val Gln Val Ala Asp Ile
Gly Phe Met 515 520 525
Gly Cys Leu Thr Lys Asp Ser Ala Gly Lys Ile Val Glu Ala Ala Asp 530
535 540 Ile Phe Val Gly
Gly Arg Val Gly Ser Asp Ser His Leu Ala Gly Ala 545 550
555 560 Tyr Lys Lys Ser Val Pro Cys Asp Glu
Leu Ala Pro Ile Val Ala Asp 565 570
575 Ile Leu Val Glu Arg Phe Gly Ala Val Arg Arg Glu Arg Glu
Glu Asp 580 585 590
Glu Glu
SEQ ID NO: 19, length 602, PRT, Physcomitrella patens
Met Gln Gly Ala Met Gln Thr Lys
Met Trp Arg Gly Glu Leu Ile Ser 1 5 10
15 Thr Ser Thr His Phe Ile Gly Gly Thr Arg Leu Gln Pro
Lys Leu Asn 20 25 30
Gln Asp Ala Arg Lys Pro Thr Lys Ser Glu Asn Cys Ile Val Arg Val
35 40 45 Ser Met Glu Arg
Glu Val Lys Ala Lys Ala Ala Val Ser Pro Pro Ala 50
55 60 Val Ala Ala Asp Arg Leu Thr Pro
Arg Val Gln Glu Arg Asp Gly Tyr 65 70
75 80 Tyr Val Leu Lys Glu Glu Phe Arg Gln Gly Ile Asn
Pro Gln Glu Lys 85 90
95 Ile Lys Leu Gly Lys Glu Pro Met Lys Phe Phe Ile Glu Asn Glu Ile
100 105 110 Glu Glu Leu
Ala Lys Thr Pro Phe Ala Glu Leu Asp Ser Ser Lys Pro 115
120 125 Gly Lys Asp Asp Ile Asp Val Arg
Leu Lys Trp Leu Gly Leu Phe His 130 135
140 Arg Arg Lys His Gln Tyr Gly Arg Phe Met Met Arg Phe
Lys Leu Pro 145 150 155
160 Asn Gly Ile Thr Asn Ser Thr Gln Thr Arg Phe Leu Ala Glu Thr Ile
165 170 175 Ser Lys Tyr Gly
Lys Glu Gly Cys Ala Asp Leu Thr Thr Arg Gln Asn 180
185 190 Trp Gln Ile Arg Gly Ile Met Leu Glu
Asp Val Pro Ser Leu Leu Lys 195 200
205 Gly Leu Glu Ser Val Gly Leu Ser Ser Leu Gln Ser Gly Met
Asp Asn 210 215 220
Val Arg Asn Ala Val Gly Asn Pro Leu Ala Gly Ile Asp Pro Asp Glu 225
230 235 240 Ile Val Asp Thr Ile
Pro Ile Cys Gln Ala Leu Asn Asp Tyr Ile Ile 245
250 255 Asn Arg Gly Lys Gly Asn Thr Glu Ile Thr
Asn Leu Pro Arg Lys Trp 260 265
270 Asn Val Cys Val Val Gly Thr His Asp Leu Phe Glu His Pro His
Ile 275 280 285 Asn
Asp Leu Ala Tyr Val Pro Ala Thr Lys Asn Gly Val Phe Gly Phe 290
295 300 Asn Ile Leu Val Gly Gly
Phe Phe Ser Ser Lys Arg Cys Ala Glu Ala 305 310
315 320 Ile Pro Met Asp Ala Trp Val Pro Thr Asp Asp
Val Val Pro Leu Cys 325 330
335 Lys Ala Ile Leu Glu Thr Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg
340 345 350 Gln Lys
Thr Arg Met Met Trp Leu Ile Asp Glu Met Gly Val Glu Glu 355
360 365 Phe Arg Ala Glu Val Glu Arg
Arg Met Pro Ser Gly Thr Ile Arg Arg 370 375
380 Ala Gly Gln Asp Leu Ile Asp Pro Ser Trp Lys Arg
Arg Ser Phe Phe 385 390 395
400 Gly Val Asn Pro Gln Lys Gln Ala Gly Leu Asn Tyr Val Gly Leu His
405 410 415 Val Pro Val
Gly Arg Leu His Ala Pro Glu Met Phe Glu Leu Ala Arg 420
425 430 Ile Ala Asp Glu Tyr Gly Asn Gly
Glu Ile Arg Ile Thr Val Glu Gln 435 440
445 Asn Leu Ile Leu Pro Asn Ile Pro Thr Glu Lys Ile Asp
Lys Leu Met 450 455 460
Gln Glu Pro Leu Leu Gln Lys Tyr Ser Pro Asn Pro Thr Pro Leu Leu 465
470 475 480 Ala Asn Leu Val
Ala Cys Thr Gly Ser Gln Phe Cys Gly Gln Ala Ile 485
490 495 Ala Glu Thr Lys Ala Leu Ser Leu Gln
Leu Thr Gln Gln Leu Glu Asp 500 505
510 Thr Met Glu Thr Thr Arg Pro Ile Arg Leu His Phe Thr Gly
Cys Pro 515 520 525
Asn Thr Cys Ala Gln Ile Gln Val Ala Asp Ile Gly Phe Met Gly Thr 530
535 540 Met Ala Arg Asp Glu
Asn Arg Lys Pro Val Glu Gly Phe Asp Ile Tyr 545 550
555 560 Leu Gly Gly Arg Ile Gly Ser Asp Ser His
Leu Gly Glu Leu Val Val 565 570
575 Pro Gly Val Pro Ala Thr Lys Leu Leu Pro Val Val Gln Glu Leu
Met 580 585 590 Ile
Gln His Phe Gly Ala Lys Arg Lys Pro 595 600
SEQ ID NO: 20, length 602, PRT, Physcomitrella patens
Met Gln Gly Thr Met Gln Ser Gln Met Trp
Arg Gly Gln Val Ser Gly 1 5 10
15 Ala Ser Leu His Phe Thr Gly Ala Thr Arg Val Gln Gly Asn Ser
His 20 25 30 Gln
Asp Leu Val Tyr Pro Thr Gln Phe His Lys His Gly Val Arg Ala 35
40 45 Ser Ala Glu Arg Glu Val
Lys Ala Lys Ala Val Ala Ala Pro Pro Thr 50 55
60 Ile Ala Ala Asp Arg Leu Val Pro Arg Val Glu
Glu Arg Asp Gly Tyr 65 70 75
80 Tyr Val Leu Lys Glu Glu Phe Arg Gln Gly Ile Asn Pro Ser Glu Lys
85 90 95 Ile Lys
Ile Ala Lys Glu Pro Met Lys Phe Phe Met Glu Asn Glu Ile 100
105 110 Glu Glu Leu Ala Lys Thr Pro
Phe Ala Glu Leu Asp Ser Ser Lys Ala 115 120
125 Gly Lys Asp Asp Ile Asp Val Arg Leu Lys Trp Leu
Gly Leu Phe His 130 135 140
Arg Arg Lys His Gln Tyr Gly Arg Phe Met Met Arg Phe Lys Leu Pro 145
150 155 160 Asn Gly Ile
Thr Asn Ser Ser Gln Thr Arg Phe Leu Ala Glu Thr Ile 165
170 175 Ser Lys Tyr Gly Glu Tyr Gly Cys
Ala Asp Leu Thr Thr Arg Gln Asn 180 185
190 Trp Gln Ile Arg Gly Ile Val Leu Glu Asp Val Pro Ala
Leu Leu Lys 195 200 205
Gly Leu Glu Ser Val Gly Leu Ser Ser Leu Gln Ser Gly Met Asp Asn 210
215 220 Val Arg Asn Pro
Val Gly Asn Pro Leu Ala Gly Ile Asp Pro Asp Glu 225 230
235 240 Ile Val Asp Thr Ala Pro Phe Cys Lys
Val Leu Ser Asp Tyr Ile Ile 245 250
255 Asn Arg Gly Gln Gly Asn Pro Gln Ile Thr Asn Leu Pro Arg
Lys Trp 260 265 270
Asn Val Cys Val Val Gly Thr His Asp Leu Phe Glu His Pro His Ile
275 280 285 Asn Asp Leu Ala
Tyr Met Pro Ala Thr Lys Asn Gly Val Phe Gly Phe 290
295 300 Asn Ile Leu Val Gly Gly Phe Phe
Ser Pro Lys Arg Cys Ala Glu Ala 305 310
315 320 Ile Pro Met Asp Ala Trp Val Pro Ala Asp Asp Val
Val Pro Leu Cys 325 330
335 Lys Ala Ile Leu Glu Thr Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg
340 345 350 Gln Lys Thr
Arg Met Met Trp Leu Ile Asp Glu Met Gly Ile Glu Glu 355
360 365 Phe Arg Ala Glu Val Glu Arg Arg
Met Pro Gly Gly Ser Ile Leu Arg 370 375
380 Ala Gly Lys Asp Leu Val Asp Pro Ser Trp Thr Arg Arg
Ser Phe Tyr 385 390 395
400 Gly Val Asn Pro Gln Lys Gln Pro Gly Leu Asn Tyr Val Gly Leu His
405 410 415 Ile Pro Val Gly
Arg Leu His Ala Pro Glu Met Phe Glu Leu Ala Arg 420
425 430 Ile Ala Asp Glu Tyr Gly Asn Gly Glu
Ile Arg Ile Ser Val Glu Gln 435 440
445 Asn Leu Ile Leu Pro Asn Val Pro Thr Glu Lys Ile Glu Lys
Leu Leu 450 455 460
Lys Glu Pro Leu Leu Glu Lys Tyr Ser Pro Asn Pro Thr Pro Leu Leu 465
470 475 480 Ala Asn Leu Val Ala
Cys Thr Gly Ser Gln Phe Cys Gly Gln Ala Ile 485
490 495 Ala Glu Thr Lys Ala Arg Ser Leu Gln Leu
Thr Gln Glu Leu Glu Ala 500 505
510 Thr Met Glu Thr Thr Arg Pro Ile Arg Leu His Phe Thr Gly Cys
Pro 515 520 525 Asn
Thr Cys Ala Gln Ile Gln Val Ala Asp Ile Gly Phe Met Gly Thr 530
535 540 Met Ala Arg Asp Glu Asn
Arg Lys Pro Val Glu Gly Phe Asp Ile Tyr 545 550
555 560 Leu Gly Gly Arg Ile Gly Ser Asp Ser His Leu
Gly Glu Leu Val Val 565 570
575 Pro Gly Val Pro Ala Thr Lys Leu Leu Pro Val Val Gln Asp Leu Met
580 585 590 Ile Gln
His Phe Gly Ala Lys Arg Lys Thr 595 600
SEQ ID NO: 21, length 616, PRT, Pinus taeda
Met Asn Leu Ser Ser Pro Val Arg Phe Asp Glu Ile Arg
Pro Leu Ala 1 5 10 15
His Val Val Tyr Asn Pro Val Cys Cys Gly His Lys Pro Asn Arg Leu
20 25 30 Arg Leu Met Thr
Ala Ile Gln Val Arg Ala Val Asn His Gly Gly Arg 35
40 45 Asn Ser Glu Ile Ser Thr Asp Gly Asn
Ser Lys Gly Thr Thr Ala Lys 50 55
60 Ala Val Ala Ser Pro Ala Gly Ser His Val Ala Val Asp
Ala Ser Arg 65 70 75
80 Leu Glu Ala Arg Val Glu Glu Arg Asp Gly Tyr Trp Val Leu Lys Glu
85 90 95 Glu Phe Arg Ala
Gly Ile Asn Pro Gln Glu Lys Ile Lys Leu Gln Arg 100
105 110 Glu Pro Met Lys Leu Phe Met Glu Asn
Glu Ile Glu Glu Leu Ala Lys 115 120
125 Lys Pro Phe Ala Glu Ile Glu Ser Glu Lys Val Asn Lys Asp
Asp Ile 130 135 140
Asp Val Arg Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His His 145
150 155 160 Tyr Gly Arg Phe Met
Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr 165
170 175 Ser Leu Gln Thr Arg Tyr Leu Ala Ser Val
Ile Gln Gln Tyr Gly Pro 180 185
190 Glu Gly Cys Ala Asp Ile Thr Thr Arg Gln Asn Trp Gln Ile Arg
Gly 195 200 205 Val
Val Leu Asp Asp Val Pro Ala Ile Leu Lys Gly Leu Lys Glu Val 210
215 220 Gly Leu Ser Ser Leu Gln
Ser Gly Met Asp Asn Val Arg Asn Pro Val 225 230
235 240 Gly Asn Pro Leu Ala Gly Ile Asp Ala Asp Glu
Ile Ile Asp Thr Arg 245 250
255 Pro Tyr Thr Lys Val Leu Thr Asp Tyr Ile Val Asn Asn Gly Lys Gly
260 265 270 Asn Pro
Ser Ile Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val Val 275
280 285 Gly Thr His Asp Leu Phe Glu
His Pro His Ile Asn Asp Leu Ala Tyr 290 295
300 Ile Pro Ala Met Asn Ser Gly Arg Phe Gly Phe Asn
Leu Leu Val Gly 305 310 315
320 Gly Phe Phe Ser Pro Lys Arg Cys Glu Glu Ala Val Pro Leu Asp Ala
325 330 335 Trp Val Ala
Gly Glu Asp Val Val Pro Val Cys Arg Ala Ile Leu Glu 340
345 350 Val Tyr Arg Asp Leu Gly Thr Arg
Gly Asn Arg Gln Lys Thr Arg Met 355 360
365 Met Trp Leu Ile Asp Glu Leu Gly Ile Glu Gly Phe Arg
Ser Glu Val 370 375 380
Val Lys Arg Met Pro Gly Glu Lys Leu Glu Arg Ala Ala Thr Glu Asp 385
390 395 400 Met Leu Asp Lys
Ser Trp Glu Arg Arg Ser Tyr Leu Gly Val His Pro 405
410 415 Gln Lys Gln Glu Gly Leu Asn Phe Val
Gly Leu His Val Pro Val Gly 420 425
430 Arg Leu Gln Ala Glu Asp Met Leu Glu Leu Ala Arg Leu Ala
Glu Gln 435 440 445
Tyr Gly Thr Gln Glu Leu Arg Leu Thr Val Glu Gln Asn Ala Ile Ile 450
455 460 Pro Asn Val Pro Thr
Asp Lys Ile Glu Ala Leu Leu Gln Glu Pro Leu 465 470
475 480 Leu Gln Lys Phe Ser Pro Ser Pro Pro Leu
Leu Val Ser Thr Leu Val 485 490
495 Ala Cys Thr Gly Asn Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr
Lys 500 505 510 Ala
Arg Ala Leu Lys Ile Thr Glu Glu Leu Asp Arg Thr Met Glu Val 515
520 525 Pro Lys Pro Val Arg Met
His Trp Thr Gly Cys Pro Asn Thr Cys Gly 530 535
540 Gln Val Gln Val Ala Asp Ile Gly Phe Met Gly
Cys Met Thr Arg Asp 545 550 555
560 Glu Asn Lys Lys Val Val Glu Gly Val Asp Ile Phe Ile Gly Gly Arg
565 570 575 Val Gly
Ala Asp Ser His Leu Gly Asp Leu Ile His Lys Gly Val Pro 580
585 590 Cys Lys Asp Val Val Pro Val
Val Gln Glu Leu Leu Ile Lys His Phe 595 600
605 Gly Ala Ile Arg Lys Thr Asp Met 610
615
22 588 PRT Populus trichocarpa
22 Met Ser Ser Leu Ser Val Arg
Phe Leu Thr Pro Gln Leu Ser Pro Thr 1 5
10 15 Val Pro Ser Ser Ser Ala Arg Pro Arg Thr Arg
Leu Phe Ala Gly Pro 20 25
30 Pro Thr Val Ala Gln Pro Ala Glu Thr Gly Val Asp Ala Gly Arg
Leu 35 40 45 Glu
Pro Arg Val Glu Lys Lys Asp Gly Tyr Tyr Val Leu Lys Glu Lys 50
55 60 Phe Arg Gln Gly Ile Asn
Pro Gln Glu Lys Val Lys Ile Glu Lys Glu 65 70
75 80 Pro Met Lys Leu Phe Met Glu Asn Gly Ile Glu
Glu Leu Ala Lys Leu 85 90
95 Ser Met Glu Glu Ile Asp Lys Glu Lys Ser Thr Lys Asp Asp Ile Asp
100 105 110 Val Arg
Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys His Gln Tyr 115
120 125 Gly Arg Phe Met Met Arg Leu
Lys Leu Pro Asn Gly Val Thr Thr Ser 130 135
140 Ala Gln Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys
Tyr Gly Lys Asp 145 150 155
160 Gly Cys Ala Asp Val Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val
165 170 175 Val Leu Pro
Asp Val Pro Glu Ile Leu Arg Gly Leu Ala Glu Val Gly 180
185 190 Leu Thr Ser Leu Gln Ser Gly Met
Asp Asn Val Arg Asn Pro Val Gly 195 200
205 Asn Pro Leu Ala Gly Ile Asp Pro Asp Glu Ile Val Asp
Thr Arg Pro 210 215 220
Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr Ala Asn Ser Arg Gly Asn 225
230 235 240 Pro Glu Phe Thr
Asn Leu Pro Arg Lys Trp Asn Val Cys Val Val Gly 245
250 255 Ser His Asp Leu Tyr Glu His Pro His
Ile Asn Asp Leu Ala Tyr Met 260 265
270 Pro Ala Met Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val
Gly Gly 275 280 285
Phe Phe Ser Pro Lys Arg Cys Ala Glu Ala Ile Pro Leu Asp Ala Trp 290
295 300 Val Ser Ala Asp Asp
Val Leu Pro Ser Cys Lys Ala Val Leu Glu Ala 305 310
315 320 Tyr Arg Asp Leu Gly Thr Arg Gly Asn Arg
Gln Lys Thr Arg Met Met 325 330
335 Trp Leu Ile Asp Glu Leu Gly Ile Glu Gly Phe Arg Ser Glu Val
Val 340 345 350 Lys
Arg Met Pro Arg Gln Glu Leu Glu Arg Glu Ser Ser Glu Asp Leu 355
360 365 Val Gln Lys Gln Trp Glu
Arg Arg Asp Tyr Phe Gly Val His Pro Gln 370 375
380 Lys Gln Glu Gly Leu Ser Tyr Ala Gly Leu His
Ile Pro Val Gly Arg 385 390 395
400 Val Gln Ala Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Ile Tyr
405 410 415 Gly Thr
Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile Pro 420
425 430 Asn Ile Glu Asp Ser Lys Ile
Glu Ala Leu Leu Lys Glu Pro Leu Leu 435 440
445 Lys Asp Arg Phe Ser Pro Glu Pro Pro Leu Leu Met
Gln Gly Leu Val 450 455 460
Ala Cys Thr Gly Lys Glu Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys 465
470 475 480 Ala Arg Ala
Met Lys Val Thr Glu Glu Val Gln Arg Leu Val Ser Val 485
490 495 Ser Lys Pro Val Arg Met His Trp
Thr Gly Cys Pro Asn Thr Cys Gly 500 505
510 Gln Val Gln Val Ala Asp Ile Gly Phe Met Gly Cys Met
Ala Arg Asp 515 520 525
Glu Asn Gly Lys Ile Cys Glu Gly Ala Asp Val Tyr Val Gly Gly Arg 530
535 540 Val Gly Ser Asp
Ser His Leu Gly Glu Leu Tyr Lys Lys Ser Val Pro 545 550
555 560 Cys Lys Asp Leu Val Pro Leu Val Val
Asp Ile Leu Val Lys Gln Phe 565 570
575 Gly Ala Val Pro Arg Glu Arg Glu Glu Val Asp Asp
580 585
23 587 PRT Solanum lycopersicum
23 Met Ala Ser Phe Ser Ile Lys Phe Leu Ala Pro Ser Leu Pro Asn Pro 1
5 10 15 Thr Arg Phe Ser
Lys Ser Ser Ile Val Lys Leu Asn Ala Thr Pro Pro 20
25 30 Gln Thr Val Ala Ala Ala Gly Pro Pro
Glu Val Ala Ala Glu Arg Leu 35 40
45 Glu Pro Arg Val Glu Glu Lys Asp Gly Tyr Trp Ile Leu Lys
Glu Gln 50 55 60
Phe Arg Gln Gly Ile Asn Pro Gln Glu Lys Val Lys Ile Glu Lys Glu 65
70 75 80 Pro Met Lys Leu Phe
Met Glu Asn Gly Ile Glu Glu Leu Ala Lys Ile 85
90 95 Pro Ile Glu Glu Ile Asp Gln Ser Lys Leu
Thr Lys Asp Asp Ile Asp 100 105
110 Val Arg Leu Lys Trp Leu Gly Leu Phe His Arg Arg Lys Asn Gln
Tyr 115 120 125 Gly
Arg Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser 130
135 140 Ala Gln Thr Arg Tyr Leu
Ala Ser Val Ile Arg Lys Tyr Gly Glu Glu 145 150
155 160 Gly Cys Ala Asp Ile Thr Thr Arg Gln Asn Trp
Gln Ile Arg Gly Val 165 170
175 Val Leu Pro Asp Val Pro Glu Ile Leu Lys Gly Leu Glu Glu Val Gly
180 185 190 Leu Thr
Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly 195
200 205 Asn Pro Leu Ala Gly Ile Asp
Pro Glu Glu Ile Val Asp Thr Arg Pro 210 215
220 Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr Gly Asn
Ser Arg Gly Asn 225 230 235
240 Pro Ala Val Ser Asn Leu Pro Arg Lys Trp Asn Pro Cys Val Val Gly
245 250 255 Ser His Asp
Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met 260
265 270 Pro Ala Ile Lys Asp Gly Arg Phe
Gly Phe Asn Leu Leu Val Gly Gly 275 280
285 Phe Phe Ser Ala Lys Arg Cys Asp Glu Ala Ile Pro Leu
Asp Ala Trp 290 295 300
Val Pro Ala Asp Asp Val Val Pro Val Cys Lys Ala Ile Leu Glu Ala 305
310 315 320 Phe Arg Asp Leu
Gly Phe Arg Gly Asn Arg Gln Lys Cys Arg Met Met 325
330 335 Trp Leu Ile Asp Glu Leu Gly Val Glu
Gly Phe Arg Ala Glu Val Val 340 345
350 Lys Arg Met Pro Gln Gln Glu Leu Glu Arg Ala Ser Pro Glu
Asp Leu 355 360 365
Val Gln Lys Gln Trp Glu Arg Arg Asp Tyr Leu Gly Val His Pro Gln 370
375 380 Lys Gln Glu Gly Tyr
Ser Phe Ile Gly Leu His Ile Pro Val Gly Arg 385 390
395 400 Val Gln Ala Asp Asp Met Asp Asp Leu Ala
Arg Leu Ala Asp Glu Tyr 405 410
415 Gly Ser Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Ile Ile Ile
Pro 420 425 430 Asn
Ile Glu Asn Ser Lys Ile Asp Ala Leu Leu Lys Glu Pro Ile Leu 435
440 445 Ser Lys Phe Ser Pro Asp
Pro Pro Ile Leu Met Lys Gly Leu Val Ala 450 455
460 Cys Thr Gly Asn Gln Phe Cys Gly Gln Ala Ile
Ile Glu Thr Lys Ala 465 470 475
480 Arg Ser Leu Lys Ile Thr Glu Glu Val Gln Arg Gln Val Ser Leu Thr
485 490 495 Arg Pro
Val Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Ala Gln 500
505 510 Val Gln Val Ala Asp Ile Gly
Phe Met Gly Cys Leu Thr Arg Asp Lys 515 520
525 Asp Lys Lys Thr Val Glu Gly Ala Asp Val Phe Leu
Gly Gly Arg Ile 530 535 540
Gly Ser Asp Ser His Leu Gly Glu Val Tyr Lys Lys Ala Val Pro Cys 545
550 555 560 Asp Glu Leu
Val Pro Leu Ile Val Asp Leu Leu Ile Lys Asn Phe Gly 565
570 575 Ala Val Pro Arg Glu Arg Glu Glu
Thr Glu Asp 580 585
24 584 PRT Solanum lycopersicum
24 Met Thr Ser Phe Ser Val Lys Phe Ser Ala Thr Ser Leu Pro
Asn Ser 1 5 10 15
Asn Arg Phe Ser Lys Leu His Ala Thr Pro Pro Gln Thr Val Ala Val
20 25 30 Pro Ser Tyr Gly Ala
Ala Glu Ile Ala Ala Glu Arg Leu Glu Pro Arg 35
40 45 Val Glu Gln Arg Asp Gly Tyr Trp Val
Val Lys Asp Lys Phe Arg Gln 50 55
60 Gly Ile Asn Pro Ala Glu Lys Ala Lys Ile Glu Lys Glu
Pro Met Lys 65 70 75
80 Leu Phe Thr Glu Asn Gly Ile Glu Asp Leu Ala Lys Ile Ser Leu Glu
85 90 95 Glu Ile Glu Lys
Ser Lys Leu Thr Lys Glu Asp Ile Asp Ile Arg Leu 100
105 110 Lys Trp Leu Gly Leu Phe His Arg Arg
Lys His His Tyr Gly Arg Phe 115 120
125 Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr Thr Ser Asp
Gln Thr 130 135 140
Arg Tyr Leu Gly Ser Val Ile Arg Lys Tyr Gly Lys Asp Gly Cys Gly 145
150 155 160 Asp Val Thr Thr Arg
Gln Asn Trp Gln Ile Arg Gly Val Val Leu Pro 165
170 175 Asp Val Pro Glu Ile Leu Lys Gly Leu Asp
Glu Val Gly Leu Thr Ser 180 185
190 Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val Gly Asn Pro
Leu 195 200 205 Ala
Gly Ile Asp Leu His Glu Ile Val Asp Thr Arg Pro Tyr Thr Asn 210
215 220 Leu Leu Ser Gln Tyr Val
Thr Ala Asn Phe Arg Gly Asn Val Asp Val 225 230
235 240 Thr Asn Leu Pro Arg Lys Trp Asn Val Cys Val
Ile Gly Ser His Asp 245 250
255 Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met Pro Ala Thr
260 265 270 Lys Asp
Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe Ser 275
280 285 Pro Lys Arg Cys Ala Glu Ala
Ile Pro Leu Asp Ala Trp Val Pro Ala 290 295
300 Asp Asp Val Val Pro Val Cys Lys Ala Ile Leu Glu
Ala Tyr Arg Asp 305 310 315
320 Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg Met Met Trp Leu Ile
325 330 335 Asp Glu Leu
Gly Val Glu Gly Phe Arg Ala Glu Val Val Lys Arg Met 340
345 350 Pro Gln Lys Lys Leu Asp Arg Glu
Ser Ser Glu Asp Leu Val Leu Lys 355 360
365 Gln Trp Glu Arg Arg Glu Tyr Leu Gly Val His Pro Gln
Lys Gln Glu 370 375 380
Gly Tyr Ser Phe Val Gly Leu His Ile Pro Val Gly Arg Val Gln Ala 385
390 395 400 Asp Asp Met Asp
Glu Leu Ala Arg Leu Ala Asp Glu Tyr Gly Ser Gly 405
410 415 Glu Leu Arg Leu Thr Val Glu Gln Asn
Ile Ile Ile Pro Asn Ile Glu 420 425
430 Asn Ser Lys Ile Asp Ala Leu Leu Asn Glu Pro Leu Leu Lys
Asn Arg 435 440 445
Phe Ser Pro Asp Pro Pro Ile Leu Met Arg Asn Leu Val Ala Cys Thr 450
455 460 Gly Asn Gln Phe Cys
Gly Gln Ala Ile Ile Glu Thr Lys Ala Arg Ser 465 470
475 480 Met Lys Ile Thr Glu Glu Val Gln Arg Leu
Val Ser Val Thr Gln Pro 485 490
495 Val Arg Met His Trp Thr Gly Cys Pro Asn Thr Cys Gly Gln Val
Gln 500 505 510 Val
Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Lys Glu Gly Lys 515
520 525 Thr Val Glu Gly Ala Asp
Val Phe Leu Gly Gly Arg Ile Gly Ser Asp 530 535
540 Ser His Leu Gly Glu Val Tyr Lys Lys Ser Val
Pro Cys Glu Asp Leu 545 550 555
560 Val Pro Ile Ile Val Asp Leu Leu Ile Asn Asn Phe Gly Ala Val Pro
565 570 575 Arg Glu
Arg Glu Glu Thr Glu Glu 580
25 586 PRT Arabidopsis thaliana
25 Met Thr Ser Phe Ser Leu Thr Phe Thr Ser
Pro Leu Leu Pro Ser Ser 1 5 10
15 Ser Thr Lys Pro Lys Arg Ser Val Leu Val Ala Ala Ala Gln Thr
Thr 20 25 30 Ala
Pro Ala Glu Ser Thr Ala Ser Val Asp Ala Asp Arg Leu Glu Pro 35
40 45 Arg Val Glu Leu Lys Asp
Gly Phe Phe Ile Leu Lys Glu Lys Phe Arg 50 55
60 Lys Gly Ile Asn Pro Gln Glu Lys Val Lys Ile
Glu Arg Glu Pro Met 65 70 75
80 Lys Leu Phe Met Glu Asn Gly Ile Glu Glu Leu Ala Lys Lys Ser Met
85 90 95 Glu Glu
Leu Asp Ser Glu Lys Ser Ser Lys Asp Asp Ile Asp Val Arg 100
105 110 Leu Lys Trp Leu Gly Leu Phe
His Arg Arg Lys His Gln Tyr Gly Lys 115 120
125 Phe Met Met Arg Leu Lys Leu Pro Asn Gly Val Thr
Thr Ser Ala Gln 130 135 140
Thr Arg Tyr Leu Ala Ser Val Ile Arg Lys Tyr Gly Glu Asp Gly Cys 145
150 155 160 Ala Asp Val
Thr Thr Arg Gln Asn Trp Gln Ile Arg Gly Val Val Leu 165
170 175 Pro Asp Val Pro Glu Ile Leu Lys
Gly Leu Ala Ser Val Gly Leu Thr 180 185
190 Ser Leu Gln Ser Gly Met Asp Asn Val Arg Asn Pro Val
Gly Asn Pro 195 200 205
Ile Ala Gly Ile Asp Pro Glu Glu Ile Val Asp Thr Arg Pro Tyr Thr 210
215 220 Asn Leu Leu Ser
Gln Phe Ile Thr Ala Asn Ser Gln Gly Asn Pro Asp 225 230
235 240 Phe Thr Asn Leu Pro Arg Lys Trp Asn
Val Cys Val Val Gly Thr His 245 250
255 Asp Leu Tyr Glu His Pro His Ile Asn Asp Leu Ala Tyr Met
Pro Ala 260 265 270
Asn Lys Asp Gly Arg Phe Gly Phe Asn Leu Leu Val Gly Gly Phe Phe
275 280 285 Ser Pro Lys Arg
Cys Glu Glu Ala Ile Pro Leu Asp Ala Trp Val Pro 290
295 300 Ala Asp Asp Val Leu Pro Leu Cys
Lys Ala Val Leu Glu Ala Tyr Arg 305 310
315 320 Asp Leu Gly Thr Arg Gly Asn Arg Gln Lys Thr Arg
Met Met Trp Leu 325 330
335 Ile Asp Glu Leu Gly Val Glu Gly Phe Arg Thr Glu Val Glu Lys Arg
340 345 350 Met Pro Asn
Gly Lys Leu Glu Arg Gly Ser Ser Glu Asp Leu Val Asn 355
360 365 Lys Gln Trp Glu Arg Arg Asp Tyr
Phe Gly Val Asn Pro Gln Lys Gln 370 375
380 Glu Gly Leu Ser Phe Val Gly Leu His Val Pro Val Gly
Arg Leu Gln 385 390 395
400 Ala Asp Asp Met Asp Glu Leu Ala Arg Leu Ala Asp Thr Tyr Gly Ser
405 410 415 Gly Glu Leu Arg
Leu Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Val 420
425 430 Glu Thr Ser Lys Thr Glu Ala Leu Leu
Gln Glu Pro Phe Leu Lys Asn 435 440
445 Arg Phe Ser Pro Glu Pro Ser Ile Leu Met Lys Gly Leu Val
Ala Cys 450 455 460
Thr Gly Ser Gln Phe Cys Gly Gln Ala Ile Ile Glu Thr Lys Leu Arg 465
470 475 480 Ala Leu Lys Val Thr
Glu Glu Val Glu Arg Leu Val Ser Val Pro Arg 485
490 495 Pro Ile Arg Met His Trp Thr Gly Cys Pro
Asn Thr Cys Gly Gln Val 500 505
510 Gln Val Ala Asp Ile Gly Phe Met Gly Cys Leu Thr Arg Gly Glu
Glu 515 520 525 Gly
Lys Pro Val Glu Gly Ala Asp Val Tyr Val Gly Gly Arg Ile Gly 530
535 540 Ser Asp Ser His Ile Gly
Glu Ile Tyr Lys Lys Gly Val Arg Val Thr 545 550
555 560 Glu Leu Val Pro Leu Val Ala Glu Ile Leu Ile
Lys Glu Phe Gly Ala 565 570
575 Val Pro Arg Glu Arg Glu Glu Asn Glu Asp 580
585
26 595 PRT Vitis vinifera
26 Met Ala Ser Ile Ser Val Pro Phe
Leu Ser Gln Ala Pro Thr His Leu 1 5 10
15 Ser Asn Ser Thr Ser Leu Arg Leu Lys Thr Arg Ile Ser
Ala Thr Pro 20 25 30
Thr Pro Thr Pro Thr Pro Thr Thr Val Ala Pro Ser Ser Thr Ala Ala
35 40 45 Val Asp Ala Ser
Arg Met Glu Pro Arg Val Glu Glu Arg Gly Gly Tyr 50
55 60 Trp Val Leu Lys Glu Lys Phe Arg
Glu Gly Ile Asn Pro Gln Glu Lys 65 70
75 80 Val Lys Ile Glu Lys Asp Pro Met Lys Leu Phe Ile
Glu Asp Gly Phe 85 90
95 Asn Glu Leu Ala Ser Met Ser Phe Glu Glu Ile Glu Lys Ser Lys His
100 105 110 Thr Lys Asp
Asp Ile Asp Val Arg Leu Lys Trp Leu Gly Leu Phe His 115
120 125 Arg Arg Lys His Gln Tyr Gly Arg
Phe Met Met Arg Leu Lys Leu Pro 130 135
140 Asn Gly Val Thr Ser Ser Ala Gln Thr Arg Tyr Leu Ala
Ser Ala Ile 145 150 155
160 Arg Gln Tyr Gly Lys Glu Gly Cys Ala Asp Val Thr Thr Arg Gln Asn
165 170 175 Trp Gln Ile Arg
Gly Val Val Leu Pro Asp Val Pro Glu Ile Leu Lys 180
185 190 Gly Leu Ser Glu Val Gly Leu Thr Ser
Leu Gln Ser Gly Met Asp Asn 195 200
205 Val Arg Asn Pro Val Gly Asn Pro Leu Ala Gly Ile Asp Pro
His Glu 210 215 220
Ile Val Asp Thr Arg Pro Tyr Thr Asn Leu Leu Ser Gln Phe Ile Thr 225
230 235 240 Ala Asn Ala Arg Gly
Asn Thr Ala Phe Thr Asn Leu Pro Arg Lys Trp 245
250 255 Asn Val Cys Val Val Gly Ser His Asp Leu
Tyr Glu His Pro His Ile 260 265
270 Asn Asp Leu Ala Tyr Met Pro Ala Thr Lys Lys Gly Arg Phe Gly
Phe 275 280 285 Asn
Leu Leu Val Gly Gly Phe Phe Ser Pro Lys Arg Cys Ala Asp Ala 290
295 300 Ile Pro Leu Asp Ala Trp
Ile Pro Ala Asp Asp Val Leu Pro Val Cys 305 310
315 320 Gln Ala Val Leu Glu Ala Tyr Arg Asp Leu Gly
Thr Arg Gly Asn Arg 325 330
335 Gln Lys Thr Arg Met Met Trp Leu Ile Asp Glu Leu Gly Ile Glu Gln
340 345 350 Phe Arg
Ala Glu Val Val Lys Arg Met Pro Gln Gln Glu Leu Glu Arg 355
360 365 Ser Ser Ser Glu Asp Leu Val
Gln Lys Gln Trp Glu Arg Arg Asp Tyr 370 375
380 Leu Gly Val His Pro Gln Lys Gln Glu Gly Phe Ser
Phe Val Gly Ile 385 390 395
400 His Ile Pro Val Gly Arg Val Gln Ala Asp Asp Met Asp Glu Leu Ala
405 410 415 Arg Leu Ala
Asp Glu Tyr Gly Ser Gly Glu Leu Arg Leu Thr Val Glu 420
425 430 Gln Asn Ile Ile Ile Pro Asn Val
Glu Asn Ser Arg Leu Glu Ala Leu 435 440
445 Leu Lys Glu Pro Leu Leu Arg Asp Arg Phe Ser Pro Glu
Pro Pro Ile 450 455 460
Leu Met Lys Gly Leu Val Ala Cys Thr Gly Asn Gln Phe Cys Gly Gln 465
470 475 480 Ala Ile Ile Glu
Thr Lys Ala Arg Ala Leu Lys Val Thr Glu Asp Val 485
490 495 Gly Arg Leu Val Ser Val Thr Gln Pro
Val Arg Met His Trp Thr Gly 500 505
510 Cys Pro Asn Ser Cys Gly Gln Val Gln Val Ala Asp Ile Gly
Phe Met 515 520 525
Gly Cys Met Thr Arg Asp Glu Asn Gly Asn Val Cys Glu Gly Ala Asp 530
535 540 Val Phe Leu Gly Gly
Arg Ile Gly Ser Asp Cys His Leu Gly Glu Val 545 550
555 560 Tyr Lys Lys Arg Val Pro Cys Lys Asp Leu
Val Pro Leu Val Ala Glu 565 570
575 Ile Leu Val Asn His Phe Gly Gly Val Pro Arg Glu Arg Glu Glu
Glu 580 585 590 Ala
Glu Asp 595
27 580 PRT Volvox sp.
27 Met Gln Ser Gln Ser Leu Ser Arg
Arg Thr Cys Thr Arg Thr Leu Gly 1 5 10
15 Arg Gly Leu Val Thr Pro Val Leu Ala Thr Ala Ala Pro
Ala Ser Ala 20 25 30
Ala Gln Ala Ala Asp Gly Ile Asn Ala His Ser Gly Leu Lys His Leu
35 40 45 Pro Glu Ala Ala
Arg Val Arg Ala Leu Asp Arg Lys Ala Asn Lys Phe 50
55 60 Glu Lys Val Lys Val Glu Lys Cys
Gly Ser Arg Ala Trp Thr Asp Val 65 70
75 80 Phe Glu Leu Ser Arg Leu Leu Lys Glu Gly Asn Thr
Lys Trp Glu Asp 85 90
95 Leu Asp Leu Asp Asp Ile Asp Ile Arg Met Lys Trp Ala Gly Leu Phe
100 105 110 His Arg Gly
Lys Arg Thr Pro Gly Lys Phe Met Met Arg Leu Lys Val 115
120 125 Pro Asn Gly Glu Leu Asp Ala Arg
Gln Leu Arg Phe Leu Ala Ser Ala 130 135
140 Ile Ala Pro Tyr Gly Ala Asp Gly Cys Ala Asp Ile Thr
Thr Arg Ala 145 150 155
160 Asn Ile Gln Leu Arg Gly Val Thr Leu Ala Asp Ala Asp Ala Ile Ile
165 170 175 Arg Gly Leu Trp
Asp Val Gly Leu Thr Ser Phe Gln Ser Gly Met Asp 180
185 190 Ser Val Arg Asn Leu Thr Gly Asn Pro
Ile Ala Gly Val Asp Pro His 195 200
205 Glu Leu Ile Asp Thr Arg Pro Leu Leu Arg Glu Met Glu Ala
Met Leu 210 215 220
Phe Asn Asn Gly Lys Gly Arg Glu Glu Phe Ala Asn Leu Pro Arg Lys 225
230 235 240 Leu Asn Ile Cys Ile
Ser Ser Thr Arg Asp Asp Phe Pro His Thr His 245
250 255 Ile Asn Asp Val Gly Phe Glu Ala Val Arg
Arg Pro Asp Asp Gly Glu 260 265
270 Val Val Phe Asn Val Val Val Gly Gly Phe Phe Ser Ile Lys Arg
Asn 275 280 285 Val
Met Ser Ile Pro Leu Gly Cys Ser Val Thr Gln Asp Gln Leu Met 290
295 300 Pro Phe Thr Glu Ala Leu
Leu Arg Val Phe Arg Asp His Gly Pro Arg 305 310
315 320 Gly Asp Arg Gln Gln Thr Arg Leu Met Trp Met
Val Asp Ala Ile Gly 325 330
335 Val Glu Lys Phe Arg Gln Leu Leu Ser Glu Tyr Met Gly Gly Ala Glu
340 345 350 Leu Ala
Pro Pro Val His Val His His Glu Gly Pro Trp Glu Arg Arg 355
360 365 Asp Val Leu Gly Val His Pro
Gln Lys Gln Pro Gly Leu Asn Trp Val 370 375
380 Gly Ala Cys Val Pro Ala Gly Arg Leu Gln Ala Ala
Asp Phe Asp Glu 385 390 395
400 Phe Ala Arg Ile Ala Glu Thr Tyr Gly Asp Gly Thr Val Arg Ile Thr
405 410 415 Cys Glu Glu
Asn Val Ile Phe Thr Asn Val Pro Asp Ala Lys Leu Pro 420
425 430 Asp Met Leu Ala Glu Pro Leu Phe
Gln Arg Phe Lys Val Asn Pro Gly 435 440
445 Leu Leu Leu Arg Gly Leu Val Ser Cys Thr Gly Asn Gln
Phe Cys Gly 450 455 460
Phe Gly Leu Ala Glu Thr Lys Ala Arg Ala Val Lys Val Val Glu Met 465
470 475 480 Leu Glu Glu Gln
Leu Glu Leu Thr Arg Pro Val Arg Ile His Phe Thr 485
490 495 Gly Cys Pro Asn Ser Cys Gly Gln Ala
Gln Gln Val Gly Asp Ile Gly 500 505
510 Leu Met Gly Ala Pro Ala Lys Leu Asp Gly Lys Ala Val Glu
Gly Tyr 515 520 525
Lys Ile Phe Leu Gly Gly Lys Ile Gly Glu Asn Pro Gln Leu Ala Thr 530
535 540 Glu Phe Ala Gln Gly
Ile Pro Ala Val Glu Ser His Leu Val Pro Lys 545 550
555 560 Leu Lys Glu Ile Leu Ile Lys Glu Phe Gly
Ala Lys Glu Lys Glu Thr 565 570
575 Ala Val Val Val 580
28 594 PRT Spinacia oleracea
28 Met Ala Ser Leu Pro Val Asn Lys Ile Ile Pro Ser Ser Thr Thr Leu 1
5 10 15 Leu Ser Ser Ser
Asn Asn Asn Arg Arg Arg Asn Asn Ser Ser Ile Arg 20
25 30 Cys Gln Lys Ala Val Ser Pro Ala Ala
Glu Thr Ala Ala Val Ser Pro 35 40
45 Ser Val Asp Ala Ala Arg Leu Glu Pro Arg Val Glu Glu Arg
Asp Gly 50 55 60
Phe Trp Val Leu Lys Glu Glu Phe Arg Ser Gly Ile Asn Pro Ala Glu 65
70 75 80 Lys Val Lys Ile Glu
Lys Asp Pro Met Lys Leu Phe Ile Glu Asp Gly 85
90 95 Ile Ser Asp Leu Ala Thr Leu Ser Met Glu
Glu Val Asp Lys Ser Lys 100 105
110 His Asn Lys Asp Asp Ile Asp Val Arg Leu Lys Trp Leu Gly Leu
Phe 115 120 125 His
Arg Arg Lys His His Tyr Gly Arg Phe Met Met Arg Leu Lys Leu 130
135 140 Pro Asn Gly Val Thr Thr
Ser Glu Gln Thr Arg Tyr Leu Ala Ser Val 145 150
155 160 Ile Lys Lys Tyr Gly Lys Asp Gly Cys Ala Asp
Val Thr Thr Arg Gln 165 170
175 Asn Trp Gln Ile Arg Gly Val Val Leu Pro Asp Val Pro Glu Ile Ile
180 185 190 Lys Gly
Leu Glu Ser Val Gly Leu Thr Ser Leu Gln Ser Gly Met Asp 195
200 205 Asn Val Arg Asn Pro Val Gly
Asn Pro Leu Ala Gly Ile Asp Pro His 210 215
220 Glu Ile Val Asp Thr Arg Pro Phe Thr Asn Leu Ile
Ser Gln Phe Val 225 230 235
240 Thr Ala Asn Ser Arg Gly Asn Leu Ser Ile Thr Asn Leu Pro Arg Lys
245 250 255 Trp Asn Pro
Cys Val Ile Gly Ser His Asp Leu Tyr Glu His Pro His 260
265 270 Ile Asn Asp Leu Ala Tyr Met Pro
Ala Thr Lys Asn Gly Lys Phe Gly 275 280
285 Phe Asn Leu Leu Val Gly Gly Phe Phe Ser Ile Lys Arg
Cys Glu Glu 290 295 300
Ala Ile Pro Leu Asp Ala Trp Val Ser Ala Glu Asp Val Val Pro Val 305
310 315 320 Cys Lys Ala Met
Leu Glu Ala Phe Arg Asp Leu Gly Phe Arg Gly Asn 325
330 335 Arg Gln Lys Cys Arg Met Met Trp Leu
Ile Asp Glu Leu Gly Met Glu 340 345
350 Ala Phe Arg Gly Glu Val Glu Lys Arg Met Pro Glu Gln Val
Leu Glu 355 360 365
Arg Ala Ser Ser Glu Glu Leu Val Gln Lys Asp Trp Glu Arg Arg Glu 370
375 380 Tyr Leu Gly Val His
Pro Gln Lys Gln Gln Gly Leu Ser Phe Val Gly 385 390
395 400 Leu His Ile Pro Val Gly Arg Leu Gln Ala
Asp Glu Met Glu Glu Leu 405 410
415 Ala Arg Ile Ala Asp Val Tyr Gly Ser Gly Glu Leu Arg Leu Thr
Val 420 425 430 Glu
Gln Asn Ile Ile Ile Pro Asn Val Glu Asn Ser Lys Ile Asp Ser 435
440 445 Leu Leu Asn Glu Pro Leu
Leu Lys Glu Arg Tyr Ser Pro Glu Pro Pro 450 455
460 Ile Leu Met Lys Gly Leu Val Ala Cys Thr Gly
Ser Gln Phe Cys Gly 465 470 475
480 Gln Ala Ile Ile Glu Thr Lys Ala Arg Ala Leu Lys Val Thr Glu Glu
485 490 495 Val Gln
Arg Leu Val Ser Val Thr Arg Pro Val Arg Met His Trp Thr 500
505 510 Gly Cys Pro Asn Ser Cys Gly
Gln Val Gln Val Ala Asp Ile Gly Phe 515 520
525 Met Gly Cys Met Thr Arg Asp Glu Asn Gly Lys Pro
Cys Glu Gly Ala 530 535 540
Asp Val Phe Val Gly Gly Arg Ile Gly Ser Asp Ser His Leu Gly Asp 545
550 555 560 Ile Tyr Lys
Lys Ala Val Pro Cys Lys Asp Leu Val Pro Val Val Ala 565
570 575 Glu Ile Leu Ile Asn Gln Phe Gly
Ala Val Pro Arg Glu Arg Glu Glu 580 585
590 Ala Glu
29 536 PRT Nostoc sp.
29 Met Thr Asp Thr Val
Thr Thr Pro Lys Ala Ser Leu Asn Lys Phe Glu 1 5
10 15 Lys Phe Lys Ala Glu Lys Asp Gly Leu Ala
Ile Lys Ser Glu Ile Glu 20 25
30 Lys Ile Ala Ser Leu Gly Trp Glu Ala Met Asp Ala Thr Asp Arg
Asp 35 40 45 His
Arg Leu Lys Trp Val Gly Val Phe Phe Arg Pro Val Thr Pro Gly 50
55 60 Lys Phe Met Met Arg Met
Arg Met Pro Asn Gly Ile Leu Thr Ser Asp 65 70
75 80 Gln Met Arg Val Leu Ala Glu Val Val Gln Arg
Tyr Gly Asp Asp Gly 85 90
95 Asn Ala Asp Ile Thr Thr Arg Gln Asn Ile Gln Leu Arg Gly Ile Arg
100 105 110 Ile Glu
Asp Leu Pro His Ile Phe Asn Lys Phe His Ala Val Gly Leu 115
120 125 Thr Ser Val Gln Ser Gly Met
Asp Asn Ile Arg Asn Ile Thr Gly Asp 130 135
140 Pro Ile Ala Gly Leu Asp Ala Asp Glu Leu Tyr Asp
Thr Arg Glu Leu 145 150 155
160 Val Gln Gln Ile Gln Asp Met Leu Thr Asn Lys Gly Glu Gly Asn Arg
165 170 175 Glu Phe Ser
Asn Leu Pro Arg Lys Phe Asn Ile Ala Ile Ala Gly Gly 180
185 190 Arg Asp Asn Ser Val His Ala Glu
Ile Asn Asp Leu Ala Phe Val Pro 195 200
205 Ala Phe Lys Glu Gly Ile Gly Asp Trp Val Leu Gly Asn
Gly Glu Glu 210 215 220
Ser Ser Thr Tyr Gln Lys Val Phe Gly Phe Asn Val Leu Val Gly Gly 225
230 235 240 Phe Phe Ser Ala
Lys Arg Cys Glu Ala Ala Ile Pro Leu Asn Ala Trp 245
250 255 Val Thr Pro Glu Glu Val Leu Pro Leu
Cys Arg Ala Ile Leu Glu Val 260 265
270 Tyr Arg Asp Asn Gly Leu Arg Ala Asn Arg Leu Lys Ser Arg
Leu Met 275 280 285
Trp Leu Ile Asp Glu Trp Gly Ile Asp Lys Phe Arg Ala Glu Val Glu 290
295 300 Gln Arg Leu Gly Lys
Ser Leu Leu Pro Ala Ala Pro Lys Asp Glu Ile 305 310
315 320 Asp Trp Glu Lys Arg Asp His Ile Gly Val
Tyr Lys Gln Lys Gln Glu 325 330
335 Gly Leu Asn Tyr Val Gly Leu His Ile Pro Val Gly Arg Leu Tyr
Ala 340 345 350 Glu
Asp Met Phe Glu Leu Ala Arg Ile Ala Asp Val Tyr Gly Ser Gly 355
360 365 Glu Ile Arg Met Thr Val
Glu Gln Asn Ile Ile Ile Pro Asn Ile Thr 370 375
380 Asp Ser Arg Leu Arg Thr Leu Leu Thr Asp Pro
Leu Leu Glu Arg Phe 385 390 395
400 Ser Leu Asp Pro Gly Ala Leu Thr Arg Ser Leu Val Ser Cys Thr Gly
405 410 415 Ala Gln
Phe Cys Asn Phe Ala Leu Ile Glu Thr Lys Asn Arg Ala Leu 420
425 430 Glu Met Ile Lys Gly Leu Glu
Ala Glu Leu Thr Phe Thr Arg Pro Val 435 440
445 Arg Ile His Trp Thr Gly Cys Pro Asn Ser Cys Gly
Gln Pro Gln Val 450 455 460
Ala Asp Ile Gly Leu Met Gly Thr Lys Ala Arg Lys Asn Gly Lys Ala 465
470 475 480 Val Glu Gly
Val Asp Ile Tyr Met Gly Gly Lys Val Gly Lys Asp Ala 485
490 495 His Leu Gly Ser Cys Val Gln Lys
Gly Ile Pro Cys Glu Asp Leu His 500 505
510 Leu Val Leu Arg Asp Leu Leu Ile Thr Asn Phe Gly Ala
Lys Pro Arg 515 520 525
Gln Glu Ala Leu Val Thr Ser Gln 530 535
30 654 PRT Plectonema boryanum
30 Met Thr Asp Thr Leu Ala Ala Pro Thr Leu Asn
Lys Phe Glu Lys Leu 1 5 10
15 Lys Ala Glu Lys Asp Gly Leu Ala Val Lys Ala Glu Leu Glu His Phe
20 25 30 Ala Arg
Leu Gly Trp Glu Ala Met Asp Glu Thr Asp Arg Asp His Arg 35
40 45 Leu Lys Trp Leu Gly Val Phe
Phe Arg Pro Val Thr Pro Gly Lys Phe 50 55
60 Met Leu Arg Met Arg Val Pro Asn Gly Ile Ile Thr
Ser Gly Gln Thr 65 70 75
80 Arg Val Leu Gly Glu Ile Leu Gln Arg Tyr Gly Asp Asp Gly Asn Ala
85 90 95 Asp Ile Thr
Thr Arg Gln Asn Phe Gln Leu Arg Gly Ile Arg Ile Glu 100
105 110 Asp Leu Pro Glu Ile Phe Arg Lys
Phe Asp Gln Ala Gly Leu Thr Ser 115 120
125 Ile Gln Ser Gly Met Asp Asn Val Arg Asn Ile Thr Gly
Ser Pro Val 130 135 140
Ala Gly Ile Asp Ala Asp Glu Leu Ile Asp Thr Arg Gly Leu Val Arg 145
150 155 160 Lys Val Gln Asp
Met Ile Thr Asn Asn Gly Arg Gly Asn Ser Ser Phe 165
170 175 Ser Asn Leu Pro Arg Lys Phe Asn Ile
Ala Ile Ala Gly Cys Arg Asp 180 185
190 Asn Ser Val His Ala Glu Ile Asn Asp Ile Ala Phe Val Pro
Ala Phe 195 200 205
Lys Asp Gly Thr Leu Gly Phe Asn Ile Leu Val Gly Gly Phe Phe Ser 210
215 220 Gly Lys Arg Cys Glu
Ala Ala Ile Pro Leu Asn Ala Trp Val Asp Pro 225 230
235 240 Arg Asp Val Val Ala Val Cys Glu Ala Ile
Leu Thr Val Tyr Arg Asn 245 250
255 Leu Gly Leu Arg Ala Asn Arg Gln Lys Ala Arg Leu Met Trp Leu
Ile 260 265 270 Asp
Glu Met Gly Leu Glu Pro Phe Arg Glu Ala Val Glu Lys Gln Leu 275
280 285 Gly Tyr Ala Phe Thr Pro
Ala Ala Ala Lys Asp Glu Ile Leu Trp Asp 290 295
300 Lys Arg Asp His Ile Gly Ile His Ala Gln Lys
Gln Pro Gly Leu Asn 305 310 315
320 Tyr Val Gly Leu His Val Pro Val Gly Arg Leu Tyr Ala Gln Asp Leu
325 330 335 Phe Asp
Leu Ala Arg Ile Ala Glu Val Tyr Gly Ser Gly Glu Ile Arg 340
345 350 Leu Thr Val Glu Gln Asn Val
Ile Ile Pro Asn Val Pro Asp Ser Arg 355 360
365 Val Ser Ala Leu Leu Arg Glu Pro Ile Val Lys Arg
Phe Ser Ile Glu 370 375 380
Pro Gln Asn Leu Ser Arg Ala Leu Val Ser Cys Thr Gly Ala Gln Phe 385
390 395 400 Cys Asn Phe
Ala Leu Ile Glu Thr Lys Asn Arg Ala Val Ala Leu Met 405
410 415 Gln Glu Leu Glu Gln Asp Leu Tyr
Cys Pro Arg Pro Val Arg Ile His 420 425
430 Trp Thr Gly Cys Pro Asn Ser Cys Gly Gln Pro Gln Val
Ala Asp Ile 435 440 445
Gly Leu Met Gly Thr Lys Val Arg Lys Asp Gly Lys Thr Val Glu Gly 450
455 460 Val Asp Leu Tyr
Met Gly Gly Lys Val Gly Lys His Ala Glu Leu Gly 465 470
475 480 Thr Cys Val Arg Lys Ser Ile Pro Cys
Glu Asp Leu Lys Pro Ile Leu 485 490
495 Gln Glu Ile Leu Ile Glu Gln Phe Gly Ala Arg Leu Trp Ser
Asp Leu 500 505 510
Pro Glu Ser Ala Arg Pro Asn Pro Thr Ala Leu Ile Thr Leu Asp Arg
515 520 525 Pro Thr Val Glu
Thr Pro Asn Gly Lys Ser Thr Thr Val Gln Glu Leu 530
535 540 Asn Ala Gln Glu Phe Asp Tyr Val
Leu Ser Ala Pro Pro Val Val Lys 545 550
555 560 Ala Pro Thr Glu Ile Ala Ala Pro Ala Thr Ile Arg
Phe Ala Gln Ser 565 570
575 Gly Lys Glu Ile Thr Cys Thr Gln Asp Asp Leu Ile Leu Asp Ile Ala
580 585 590 Asp Gln Ala
Glu Val Ala Ile Glu Ser Ser Cys Arg Ser Gly Thr Cys 595
600 605 Gly Ser Cys Lys Cys Thr Leu Leu
Glu Gly Glu Val Ser Tyr Asp Ser 610 615
620 Glu Pro Asp Val Leu Asp Glu His Asp Arg Ala Ser Gly
Gln Ile Leu 625 630 635
640 Thr Cys Ile Ala Arg Pro Val Gly Arg Ile Leu Leu Asp Ala 645
650
31 536 PRT Anabaena variabilis
31 Met Thr
Asp Thr Ala Thr Thr Pro Lys Ala Ser Leu Asn Lys Phe Glu 1 5
10 15 Lys Phe Lys Ala Glu Lys Asp
Gly Leu Ala Ile Lys Ser Glu Ile Glu 20 25
30 Lys Ile Ala Ser Leu Gly Trp Glu Ala Met Asp Glu
Thr Asp Arg Asp 35 40 45
His Arg Leu Lys Trp Val Gly Val Phe Phe Arg Pro Val Thr Pro Gly
50 55 60 Lys Phe Met
Met Arg Met Arg Met Pro Asn Gly Ile Leu Thr Ser Asp 65
70 75 80 Gln Met Arg Val Leu Ala Glu
Val Val Gln Arg Tyr Gly Asp Asp Gly 85
90 95 Asn Ala Asp Ile Thr Thr Arg Gln Asn Ile Gln
Leu Arg Gly Ile Arg 100 105
110 Ile Glu Asp Leu Pro His Ile Phe Asn Lys Phe His Ala Val Gly
Leu 115 120 125 Thr
Ser Val Gln Ser Gly Met Asp Asn Ile Arg Asn Ile Thr Gly Asp 130
135 140 Pro Ile Ala Gly Leu Asp
Ala Asp Glu Leu Tyr Asp Thr Arg Glu Leu 145 150
155 160 Val Gln Gln Ile Gln Asp Met Leu Thr Asn Lys
Gly Glu Gly Asn Arg 165 170
175 Glu Phe Ser Asn Leu Pro Arg Lys Phe Asn Ile Ala Ile Ala Gly Gly
180 185 190 Arg Asp
Asn Ser Val His Ala Glu Ile Asn Asp Leu Ala Phe Val Pro 195
200 205 Ala Phe Lys Glu Gly Ile Gly
Asp Trp Val Leu Gly Gly Gly Glu Glu 210 215
220 Ser Ser Thr His Gln Lys Val Phe Gly Phe Asn Val
Leu Val Gly Gly 225 230 235
240 Phe Phe Ser Ala Lys Arg Cys Glu Ala Ala Ile Pro Leu Asn Ala Trp
245 250 255 Val Thr Ala
Glu Glu Val Val Ala Leu Cys Arg Ala Val Leu Glu Val 260
265 270 Tyr Arg Asp Asn Gly Leu Arg Ala
Asn Arg Leu Lys Ser Arg Leu Met 275 280
285 Trp Leu Ile Asp Glu Trp Gly Ile Asp Lys Phe Arg Ala
Glu Val Glu 290 295 300
Gln Arg Leu Gly Lys Ser Leu Leu Tyr Ala Ala Pro Lys Asp Glu Ile 305
310 315 320 Asp Trp Glu Lys
Arg Asp His Ile Gly Val Tyr Lys Gln Lys Gln Glu 325
330 335 Gly Leu Asn Tyr Val Gly Leu His Ile
Pro Val Gly Arg Leu Tyr Ala 340 345
350 Glu Asp Met Phe Glu Leu Ala Arg Ile Ala Asp Val Tyr Gly
Ser Gly 355 360 365
Glu Ile Arg Met Thr Val Glu Gln Asn Ile Ile Ile Pro Asn Ile Thr 370
375 380 Asp Ser Arg Leu Lys
Thr Leu Leu Thr Asp Pro Leu Leu Glu Arg Phe 385 390
395 400 Ser Leu Asp Pro Gly Ala Leu Thr Arg Ser
Leu Val Ser Cys Thr Gly 405 410
415 Ala Gln Phe Cys Asn Phe Ala Leu Ile Glu Thr Lys Asn Arg Ala
Leu 420 425 430 Glu
Met Ile Lys Gly Leu Glu Ala Glu Leu Thr Phe Thr Arg Pro Val 435
440 445 Arg Ile His Trp Thr Gly
Cys Pro Asn Ser Cys Gly Gln Pro Gln Val 450 455
460 Ala Asp Ile Gly Leu Met Gly Thr Lys Ala Arg
Lys Asn Gly Lys Ala 465 470 475
480 Val Glu Gly Val Asp Ile Tyr Met Gly Gly Lys Val Gly Lys Asp Ala
485 490 495 His Leu
Gly Ser Cys Val Gln Lys Gly Ile Pro Cys Glu Asp Leu His 500
505 510 Leu Val Leu Arg Asp Leu Leu
Ile Thr Asn Phe Gly Ala Lys Pro Arg 515 520
525 Gln Glu Ala Leu Val Ser Ser Gln 530
535
32 515 PRT Synechococcus sp.
32 Met Ala Asn Gln Phe Glu Arg
Leu Lys Ser Glu Lys Asp Gly Leu Ala 1 5
10 15 Val Lys Ala Glu Leu Glu Ala Phe Ala Arg Met
Gly Trp Glu Asn Ile 20 25
30 Pro Glu Asp Asp Arg Asp His Arg Leu Lys Trp Leu Gly Ile Phe
Phe 35 40 45 Arg
Lys Arg Thr Pro Gly Gln Phe Met Leu Arg Leu Arg Leu Pro Asn 50
55 60 Gly Ile Leu Thr Ser Gly
Gln Met Arg Met Leu Gly Ala Ile Ile His 65 70
75 80 Pro Tyr Gly Glu Gln Gly Val Ala Asp Ile Thr
Thr Arg Gln Asn Leu 85 90
95 Gln Leu Arg Gly Ile Pro Ile Glu Glu Met Pro Gln Ile Leu Gly Tyr
100 105 110 Leu Lys
Glu Val Gly Leu Thr Ser Ile Gln Ser Gly Met Asp Asn Val 115
120 125 Arg Asn Ile Thr Gly Ser Pro
Leu Ala Gly Ile Asp Pro Asp Glu Leu 130 135
140 Ile Asp Val Arg Gly Leu Thr Arg Lys Val Gln Asp
Met Val Thr Asn 145 150 155
160 Asn Gly Glu Gly Asn Pro Ser Phe Ser Asn Leu Pro Arg Lys Phe Asn
165 170 175 Ile Ala Ile
Cys Gly Cys Arg Asp Asn Ser Val His Ala Glu Ile Asn 180
185 190 Asp Leu Ala Phe Val Pro Ala Phe
Lys Asn Gly Arg Leu Gly Phe Asn 195 200
205 Val Leu Val Gly Gly Phe Phe Ser Ala Arg Arg Cys Ala
Glu Ala Ile 210 215 220
Gly Leu Asp Val Trp Val Asp Pro Arg Asp Val Val Pro Leu Cys Glu 225
230 235 240 Ala Val Leu Leu
Val Tyr Arg Asp His Gly Leu Arg Ala Asn Arg Gln 245
250 255 Lys Ala Arg Leu Met Trp Leu Ile Asp
Glu Trp Gly Leu Glu Lys Phe 260 265
270 Arg Ala Ala Val Glu Arg Gln Ile Gly His Pro Leu Pro Arg
Ala Ala 275 280 285
Glu Lys Asp Glu Val Val Trp His Lys Arg Asp Leu Leu Gly Val His 290
295 300 Ala Gln Lys Gln Pro
Gly Leu Asn Phe Val Gly Leu His Val Pro Val 305 310
315 320 Gly Arg Leu Asn Ala Leu Glu Met Met Glu
Leu Ala Arg Leu Ala Glu 325 330
335 Val Tyr Gly Ser Gly Glu Leu Arg Leu Thr Val Glu Gln Asn Val
Leu 340 345 350 Ile
Pro Asn Val Pro Asp Ser Arg Val Ala Pro Leu Leu Lys Glu Pro 355
360 365 Leu Leu Lys Lys Phe Ser
Pro Asn Pro Gly Pro Leu Gln Arg Gly Leu 370 375
380 Val Ser Cys Thr Gly Asn Gln Phe Cys Asn Phe
Ala Leu Ile Glu Thr 385 390 395
400 Lys Asn Arg Ala Val Ala Leu Met Glu Glu Leu Glu Ala Glu Leu Glu
405 410 415 Ile Pro
Gln Thr Val Arg Ile His Trp Thr Gly Cys Pro Asn Ser Cys 420
425 430 Gly Gln Pro Gln Val Ala Asp
Ile Gly Leu Met Gly Thr Thr Ala Arg 435 440
445 Lys Asp Gly Arg Val Val Glu Ala Val Asp Ile Tyr
Met Gly Gly Glu 450 455 460
Val Gly Lys Asp Ala Lys Leu Gly Glu Cys Val Arg Lys Gly Ile Pro 465
470 475 480 Cys Glu Asp
Leu Lys Pro Val Leu Val Glu Leu Leu Ile Glu His Phe 485
490 495 Gly Ala Lys Pro Arg Gln His Pro
Ser Ala Ala Gln Ala Ser Val Leu 500 505
510 Val Thr Arg 515
33 642 PRT Arabidopsis thaliana 33
Met Ser Ser Thr Phe Arg Ala Pro Ala Gly Ala Ala Thr Val Phe
Thr 1 5 10 15 Ala
Asp Gln Lys Ile Arg Leu Gly Arg Leu Asp Ala Leu Arg Ser Ser
20 25 30 His Ser Val Phe Leu
Gly Arg Tyr Gly Arg Gly Gly Val Pro Val Pro 35
40 45 Pro Ser Ala Ser Ser Ser Ser Ser Ser
Pro Ile Gln Ala Val Ser Thr 50 55
60 Pro Ala Lys Pro Glu Thr Ala Thr Lys Arg Ser Lys Val
Glu Ile Ile 65 70 75
80 Lys Glu Lys Ser Asn Phe Ile Arg Tyr Pro Leu Asn Glu Glu Leu Leu
85 90 95 Thr Glu Ala Pro
Asn Val Asn Glu Ser Ala Val Gln Leu Ile Lys Phe 100
105 110 His Gly Ser Tyr Gln Gln Tyr Asn Arg
Glu Glu Arg Gly Gly Arg Ser 115 120
125 Tyr Ser Phe Met Leu Arg Thr Lys Asn Pro Ser Gly Lys Val
Pro Asn 130 135 140
Gln Leu Tyr Leu Thr Met Asp Asp Leu Ala Asp Glu Phe Gly Ile Gly 145
150 155 160 Thr Leu Arg Leu Thr
Thr Arg Gln Thr Phe Gln Leu His Gly Val Leu 165
170 175 Lys Gln Asn Leu Lys Thr Val Met Ser Ser
Ile Ile Lys Asn Met Gly 180 185
190 Ser Thr Leu Gly Ala Cys Gly Asp Leu Asn Arg Asn Val Leu Ala
Pro 195 200 205 Ala
Ala Pro Tyr Val Lys Lys Asp Tyr Leu Phe Ala Gln Glu Thr Ala 210
215 220 Asp Asn Ile Ala Ala Leu
Leu Ser Pro Gln Ser Gly Phe Tyr Tyr Asp 225 230
235 240 Met Trp Val Asp Gly Glu Gln Phe Met Thr Ala
Glu Pro Pro Glu Val 245 250
255 Val Lys Ala Arg Asn Asp Asn Ser His Gly Thr Asn Phe Val Asp Ser
260 265 270 Pro Glu
Pro Ile Tyr Gly Thr Gln Phe Leu Pro Arg Lys Phe Lys Val 275
280 285 Ala Val Thr Val Pro Thr Asp
Asn Ser Val Asp Leu Leu Thr Asn Asp 290 295
300 Ile Gly Val Val Val Val Ser Asp Glu Asn Gly Glu
Pro Gln Gly Phe 305 310 315
320 Asn Ile Tyr Val Gly Gly Gly Met Gly Arg Thr His Arg Met Glu Ser
325 330 335 Thr Phe Ala
Arg Leu Ala Glu Pro Ile Gly Tyr Val Pro Lys Glu Asp 340
345 350 Ile Leu Tyr Ala Val Lys Ala Ile
Val Val Thr Gln Arg Glu His Gly 355 360
365 Arg Arg Asp Asp Arg Lys Tyr Ser Arg Met Lys Tyr Leu
Ile Ser Ser 370 375 380
Trp Gly Ile Glu Lys Phe Arg Asp Val Val Glu Gln Tyr Tyr Gly Lys 385
390 395 400 Lys Phe Glu Pro
Ser Arg Glu Leu Pro Glu Trp Glu Phe Lys Ser Tyr 405
410 415 Leu Gly Trp His Glu Gln Gly Asp Gly
Ala Trp Phe Cys Gly Leu His 420 425
430 Val Asp Ser Gly Arg Val Gly Gly Ile Met Lys Lys Thr Leu
Arg Glu 435 440 445
Val Ile Glu Lys Tyr Lys Ile Asp Val Arg Ile Thr Pro Asn Gln Asn 450
455 460 Ile Val Leu Cys Asp
Ile Lys Thr Glu Trp Lys Arg Pro Ile Thr Thr 465 470
475 480 Val Leu Ala Gln Ala Gly Leu Leu Gln Pro
Glu Phe Val Asp Pro Leu 485 490
495 Asn Gln Thr Ala Met Ala Cys Pro Ala Phe Pro Leu Cys Pro Leu
Ala 500 505 510 Ile
Thr Glu Ala Glu Arg Gly Ile Pro Ser Ile Leu Lys Arg Val Arg 515
520 525 Ala Met Phe Glu Lys Val
Gly Leu Asp Tyr Asp Glu Ser Val Val Ile 530 535
540 Arg Val Thr Gly Cys Pro Asn Gly Cys Ala Arg
Pro Tyr Met Ala Glu 545 550 555
560 Leu Gly Leu Val Gly Asp Gly Pro Asn Ser Tyr Gln Val Trp Leu Gly
565 570 575 Gly Thr
Pro Asn Leu Thr Gln Ile Ala Arg Ser Phe Met Asp Lys Val 580
585 590 Lys Val His Asp Leu Glu Lys
Val Cys Glu Pro Leu Phe Tyr His Trp 595 600
605 Lys Leu Glu Arg Gln Thr Lys Glu Ser Phe Gly Glu
Tyr Thr Thr Arg 610 615 620
Met Gly Phe Glu Lys Leu Lys Glu Leu Ile Asp Thr Tyr Lys Gly Val 625
630 635 640 Ser Gln
34 2041 DNA Aquilegia formosa 34
ctgatccaag aatgaactct gcagactttc ttctaccttt
ctttcaacaa tggcttcatt 60acagtttctt gcaccttcat catcaccttt gcaatccaac
cgactcatgg ttcgagccac 120tagtagtact agtccatcag tcaaccagac catggttgca
ccagacttat caagattgga 180accaagagtt gaagaaagag aaggtggtta ttgggttttg
aaagagaaat atagagagaa 240aataaatcca caagagaaaa tcaaaataga gaaagaacca
atgaagtttg ttactgaagg 300tggtatacat gaattagcaa aaactccatt tgaagaactt
gagaaagcta aacttactaa 360agatgatatt gatgttagac tcaagtggct tggtcttttt
catagaagaa aaaatcatta 420tggtagattt atgatgagat tgaagttgcc taatggagtt
acaactagtg aacaaacgcg 480atatcttgcg agtgttatta gaaggtatgg aaaggatgga
tgtgctgatg ttacaactag 540acagaactgg caaattcgcg gtgttgagtt acctcatgtg
cctgagataa tgaaaggatt 600aaatcaagtt ggattaacta gtcttcagag tggtatggat
aatgtgcgta atcctgttgg 660taatccactt gctggtattg acccactaga gattgtcgat
actagaccct acaatgatca 720gctatctcga tttattactg gcaattttaa agggaacctg
gcttttacta atctgccgag 780gaaatggaat gtatgtgtgg tgggctctca tgatcttttt
gagcatcccc acatcaatga 840tcttgcttac atgccagcca caaagaatgg ccgttttggg
tttaatctgt tagtaggtgg 900tttcttcagt ccaaaaagat gtgcagaggc aattcctctc
gatgcctggg tttcaggaga 960agacgtgatc ccagtttgca aagctatact tgaggcatac
agagatcttg gcaccagagg 1020aaaccgacag aaaacacgaa tgatgtggtt gattgatgaa
cttggggtag aaggatttag 1080gtcagaagtg gtgaaaagga tgcctgaaca agagctggag
agatcttcca ctgaagagtt 1140ggttcaaaag caatgggaga ggagagatct aatcggtgtc
catgcgcaaa agcaggcagg 1200ctacagtttt gttggtctcc acataccagt aggcaggctt
caggctgatg acatggatga 1260actagcccgg atagctgatg agtatggctc aggggagctc
cgtctcactg tggaacaaaa 1320tatcataatt cctaatgttg agaactcaag agttgaagct
ttgctgaagg aagccctatt 1380gagggacagg ttttcaccca ctccacctct tctaatgaaa
ggacttgtgg cctgcacagg 1440caaccagttc tgtggacaag ccatcattga gacaaaggca
cgagcactga aggtgacaga 1500agaggttgaa agactggtgg cagtgactaa accagtaaga
atgcattgga caggatgccc 1560aaacacctgc gcgcaggtgc aagtagctga tattgggttc
atggggtgca tggcaagaga 1620tgaaaacggg aaaccgtgtg aaggagcaga tgtttactta
ggtgggagga ttggtagtga 1680ttctcatttg ggagatatat ataagaaatc tgtgccttgt
aaggacttgg ttcctctggt 1740agttgacatc ttgattgagc gctttggagc tgtccctagg
gagagagaag aagatggcga 1800agactagatt atcaaattcc taaccgaaag ccctttctga
ttttaataaa ctaatttgga 1860aggtgaatgc acatagacaa tttggatgaa taaaagccat
gcagaagtgg ttctttttgg 1920acttgagttg aggaagcaac tttattgttg tatcagaaga
caggttattt taaatttcaa 1980ttcgttctta tgtactcaga atacttggat catatctcta
gacattctta atcaccgttt 2040t
2041
35 2472 DNA Betula pendula 35
aaaagctgct agagtatgga
aacatgcttg tccaggagca ggacaatgtg aagagagttc 60aactggcaga cacgtacttg
agccaagcag ctcttggaga tgcaaacgag gattcgatca 120agcggggaac tttctatggc
aaggcaggcc aacaagttaa tgtacccgtt cctgaaggtt 180gcaccgatcc atctgctagt
aactttgatc caacagctag gagcgataat ggtagctgcc 240agtattgagg ctaagccatt
tctagccttc tacctgctag gctatataaa tgctgtatga 300ggttgggaga actattcatt
tccactattg cttgctttct cgatacggag aagtattcct 360aattttgttg taatgaacgt
ataattttat cttaatcaca accacgacta aaattaccat 420tacaagcttc agtttattac
catgtcgtcg ctctcagtgc gctttctttc acctcccctt 480ttttcttcca cccctgcatg
gccaagaaca gggcttgccg ccactcaggc ggtgccaccg 540gttgtggcgg aggtggacgc
ggggaggctg gagccgagag tggaggagag agaagggtac 600tgggtgttga aggagaagtt
cagagaaggc ataaatcctc aggagaaatt gaagctcgag 660agagagccta tgaagctttt
catggaaggt gggatagaag atttggccaa gatgtcgctc 720gaggaaattg acaaggataa
gatttcaaag agtgatattg atgtaaggct caagtggctt 780ggtctcttcc ataggagaaa
gcatcattat ggtagattta tgatgagact gaagctacct 840aatggggtaa caacaagtgc
acaaactcga tacttagcga gtgtgattag gaaatatgga 900aaggacgggt gcgcagatgt
gaccaccagg caaaattggc aaattcgtgg tgtggtactg 960tctgatgtgc cagaaatact
taaaggtctt gatgaagttg gcttgacaag cctgcagagt 1020ggaatggata atgtgagaaa
ccctgttggg aacccccttg caggcattga catacatgag 1080attgttgcta cacggcctta
caacaacttg ttatcacaat ttatcactgc taattcgcgc 1140ggtaatctgg ccttcactaa
cttgccaagg aagtggaatg tgtgtgtagt gggttctcat 1200gatctctttg agcatcctca
catcaatgat cttgcttaca tgcctgctat aaaggatgga 1260aggtttggtt tcaatctgct
ggttggtggc ttctttagtc ccaggcgatg tgcagaagca 1320gtccctctcg atgcctgggt
ctcagcggat gacataatcc tcgtgtgcaa agccatactg 1380gaggcttata gggatcttgg
caccagaggg aacagacaga aaacaagaat gatgtggttg 1440attgatgaac ttggaataga
aggattcagg tctgaggtag tgaaaagaat gcccaaccaa 1500gagctggaga gagctgctcc
tgaagatcta attgagaagc aatgggaaag gagagagtta 1560attggtgtcc atccacagaa
acaagaaggc cttagttacg tgggtcttca cattccggtg 1620ggtcgagtcc aagcagatga
catggatgaa cttgctcgtt tagccgacac atatggctgt 1680ggcgaacttc ggctcactgt
ggagcaaaac atcataattc ccaacattga gaactcaaag 1740ctcgaagcct tactcggaga
gcctctattg aaagacagat tttcaccaga accgcctatt 1800ctcatgaaag ggttggtggc
ttgcactggc aatcagttct gtgggcaagc cattatagag 1860acaaaggcca gggccttgaa
ggtgactgag gaagttcaac ggcaagtggc agtgactcgg 1920ccggttagga tgcactggac
aggctgtcca aatagctgtg ggcaggttca agtggctgat 1980attggtttca tggggtgtat
ggcaagggat gagaatggga agccttgtga aggtgctgct 2040gtttttctgg gaggcagaat
tgggagcgac tcacatttgg gaaatcttta caaaaagggt 2100gttccttgca agaacttggt
gccattggta gtggacattc ttgttaaaca ttttggagct 2160gtaccaaggg agagggaaga
gagcgaggat tgattcaaac agcaagatta cttcttcttt 2220taccattttg gatgactccc
tgcaaagcat ttgttctggg agagggaacg tgatgcatca 2280aagaaatcct tatgggacta
aaatttgtga gagggaggca cattttagtg ctatacccag 2340cttttaacat gttggtttta
taggtttggt acgctataag tactctgttt gaattaactt 2400atgtattaaa acagctaaga
gttgaattgt aatatgaaag taataaaata ggaggctttt 2460ggtgcaaaaa aa
2472
36 2242 DNA Capsicum annuum 36
cccacctcac cccaccttac gactacaaaa atgatcttat ttcgccattt taaccatgac
60cgccacgatc atcaccaccc tcaataatca agaatcaact aaattcctca attccaaatt
120tggcgaaatg gcatcttttt ctgttaaatt ttcagcaact tcttcgctga caagttctaa
180gagattttcc aagcttcatg ccactccacc gcagacagtg gcagtacctc catctggggc
240agtggaggta gctgcagaga gactagagcc tagactggag gaaagagatg ggtattgggt
300acttaaggaa aagttcagaa aaggcataaa tcctgctgaa aaggccaaga ttgaaaagga
360acctatgaaa ttgttcactg aaaatggtat tgaagatatt gctaagatct cacttgaaga
420gatcgaaaaa tctaagcttg ctaaggatga tattgatgtt aggctcaagt ggcttggcct
480cttccatagg agaaagcatc aatatggacg attcatgatg cgactgaagc ttccaaatgg
540gataacgacg agtgcccaaa ctcgatattt agcaagtgtg attaggaaat atgggaaaga
600tggatgtgca gatgtgacta caaggcaaaa ttggcagatt cgtggggttg tgctacctga
660tgtgcctgag attctaaagg gactggatga agttggcttg accagtctgc aaagtggcat
720ggacaatgtt agaaatcccg tggggaaccc tctggcgggg attgatccac aagaaattgt
780ggacacaagg ccttacgcta atttgctatc caatttgcta tcccaatatg tcactgccaa
840ttttcgtggc aatctgtccg tgcataactt gccaaggaag tggaatgtat gtgtaatagg
900gtcacacgat ctttatgagc atccccatat caatgatctt gcctatatgc ctgcaacgaa
960agatggacga tttggattca acctgcttgt gggtggattc ttcagtccga agcgatgtgc
1020agaggcaatt cctcttgatg catgggttcc agctgatgat gtagtccctg tttgcaaaac
1080aatattagaa gcttatagag atcttggtac cagagggaac aggcagaaaa caagaatgat
1140gtggttaatt gacgaactgg gtgttgaagg attcagggca gaagttgtga agagaatgcc
1200tcaaaagaag ctagagagag aatccacaga ggatttggtg cagaaacaat gggaaaggag
1260agagtatctt ggggttaatc cacagaaaca ggaaggttac agctttgttg gtcttcacat
1320tccagtgggt cgtgtccaag cagatgacat ggatgagctt gctcgtttag cagaagagta
1380tggttcagga gagctccggc tgactgttga gcaaaacatc attattccga acattgagaa
1440ctcaaagatt gatgcattgc tcaatgaacc tcttctgaaa cagatttcac ccgatccacc
1500tattctcatg agaaatttgg tggcttgtac tggtaaccaa ttctgtgggc aagccataat
1560cgagactaaa gcacgttcaa tgaagataac tgaggaggtt caacggctag tctctgtgac
1620tcagcccgtg aggatgcact ggactggttg cccaaattca tgtggacaag ttcaagttgc
1680agatatcgga tttatgggat gcctgacaag aaaggaagga aagacagtgg aaggcgctga
1740tgttttcttg ggtggcagaa tagggactga ctcacacttg ggagatattt ataagaagtc
1800tgtcccctgt gaagatttgg taccaataat tgtggactta ctagttaaca actttggtgc
1860tgttccaaga gagagagaag aagcagaaga ttaatctcaa catttcagaa tcagctcgtg
1920gctttactca acatagtaaa ttggacgttg atggaatgtg cttaccatat taagatattt
1980ccaaggtaca gaactggtgg agctgttgtt ggaagttagt agaataatca gaacatgagc
2040tgttcttgac atgctatgtg tgacattcca cgatgcaaat acttgtactt gtttcagaat
2100attcacccgg tgtattgttt tggaaaagag ctgatccaaa ctaaaaggtt tttgaattgt
2160gggattccta ataatagatt ttttaaaaat gtaatttaat aatcatacat ttcaattttt
2220acctattatt atattctttg tt
2242
37 2459 DNA Chlamydomonas reinhardtii 37
ttgcatcgtt atctccttcg accaccttga
attgcctgcg ggccccttga cctcatccga 60cgcagccatg cttctgcacg cgccgcatgt
taagcccctg gggcagcgta gttcgatacg 120gcgtggaaat ttggtggttg cgaacgtagc
gtgcacggcg ggcaagaacc cgacgtcgcg 180gccagcgaaa cgctccaagg tggagttcat
caaggagaac agcgaccacc tgcgccaccc 240gctcatggaa gagctggtga atgacgagac
attcatcacc gaggactcgg tgcagctgat 300gaaatttcac ggctcctacc aacaagacaa
ccgtgagaaa cgcgccttcg gccaaggcaa 360agcttactca ttcctgatgc ggactcggca
gcccgctggc gttgtgccca accggctcta 420cctggtgatg gacgacctcg ccgaccagtt
cggcaacggc acgctgcgcc tgaccacgcg 480ccaggcctac cagctgcacg gcgtgctgaa
gaaggacctc aagacggtgt tcagctccgt 540catcaagaac atgggatcca cactggccgc
atgcggcgac gtcaaccgca acgtgatggg 600gcccgcagcg cccttcacca accgccccga
ctacctggcc gcccagaagg cggcgctgga 660cctggcggat ctgctaacgc cgcagtcggg
cgcctactac gacgtgtggc tggacggcga 720gaagttcatg agcagctaca aggaggaccc
cgctgtgacc gaggcccgtg ccttcaacgg 780cttcggaacc aatttcgaca acagccccga
gcccatctac ggctcccagt acctcccccg 840caagttcaag atcgccacca cggtgcctgg
tgacaacagt gtggacctgt tcactcagga 900cctgggcgtg gtggttcagg gctacaacct
gtatgtgggc ggtgggcagg gccgcagcca 960cagagacgca gacaccttcc cgcgcctggc
ggacccgctg ggctacgtgg ccgccgccga 1020cctgttcgcc gcggccaagg cggtggtggc
ggtgttccgc gactacggcc gccgtgacaa 1080ccgcaagcag gcgcgaacac ggcacatgct
ggcggagtgg ggcgtggaca agttccgctc 1140ggtggcggag cagtacctgg gcaagcgctt
ccaggagccg gtgccgctgc cgccctggca 1200gtacaaggac tacctgggct ggggcgagca
gggcgacggg cggctgtact gcggcgtgta 1260tgtgcagaac gggcgcatca agggcgaggc
caagcgggcg ctgcgtgcgg ccattgagcg 1320ctacagcctg ccggtggtac tcacgccgca
ccagaacctg gtcctgcggg acgtgcggcc 1380cgaggaccgg gaggacattg agcagctgct
gcgggccggc ggcgtcaagg agctggtgga 1440gtgggacggg ctggaccggc tgtccatggc
ctgccccgcg ctgccgctgt gcggcctggc 1500ggtcacggag gcggagcggg cgctgccgga
cgtcaacacg cgcatccggg ccatgttgac 1560acgggcgggc ctgcctccct cccagccgct
gcacgtgcgc atgacgggct gccccaacgg 1620ctgcgtgcgg ccctacatgg ccgagttggg
gctggtgggc gacggaccca acagctacca 1680gctgtggctg ggcggcgggc cggcgcagac
acgcctggcg cagccgtacg cggagagggt 1740caaggtgaag gacttggagt ccacgctgga
gcccctgttt ggcgcctgga gggccgggcg 1800ccagccggac gaggcctttg gagattgggt
ggcgcggctc ggatttgacg ccgtgcggca 1860gcaggcggcg gcggcggcgg cggcggctcc
tgtcggcacc gcgtgaggcg gcggctcggg 1920gctttcccgg tgcaaacgta cgtgcgtgcg
tatgcgtgtt tacgtgtgtg taagtatgta 1980tctgtgtatg tgtaccgtat gtgtacgaga
agcgaaaatg gtggacgacg actgcacagt 2040cgcagcaccg gcggcttgtg gggtaggctg
tggctacctc tcgcaatgcg gccacgtaat 2100ggtattgcaa aatgcccctg cgtcaatgat
aagagattgc gtattcatgc acgtgactga 2160ggagaaacgg ttcacaacga aaccctgcag
cccggcaatg ccatgttcta gataggtcac 2220gcacgcaatc cgcatgcagc gcggtcttcg
tatgtactat gtagcactac cctgtgcgca 2280gtgcaccatt tatatgcttt gctagcagca
agcggttttg cttgaggttc cttttgcctg 2340gattcgcctg ccagccctcc gggagctagg
ggtgctctgt agcgatcatg caaaagtaag 2400atgagttctg tttgggttgc gcggaagtgc
tgaggcgctc ttgtgcaata cgagtacgg 2459
38 3102 DNA Chlamydomonas reinhardtii 38
cttgtaactt gacaaccaag gacaaccaag gaccagccgc ttataatcac tagggttgcg
60ctccagtcgg tgtcttgtga gcgttgattc ctcgctgaaa gctttatctt gagcaccata
120ctagttgagt cgtgattgca ttcgcaaggg caaaataacc cgaggcttgt gactacaatc
180aacaaacggc aatgcagtcg cgccagtgct tgaaccgcaa ggccagcggc gcgcggccct
240gcgctaactc gcgcagcctc acagctcgcg tactcgctac ggccgcgcct gtcgcgccgt
300ccgccacacc cgcctccgcc cccctgcccc tccccgatgg cgttggcgag cacagcggcc
360tgaagcacct gcccgaggcc gcccgcactc gtgcgctcga caagaaggcc aacaagtttg
420agaaggttaa ggtcgagaag tgcggctcgc gcgcctggaa cgacgtgttt gagctgtctt
480ccctgctgaa ggagggcaag accaagtggg aggaccttaa cctcgatgat gtcgacatcc
540gtctcaagtg ggccggcctg ttccaccgcg gcaagcgcac ccccggcaag ttcatgatgc
600gtctcaaggt gcccaacggc gagctcaccg ccgcgcagct gcgcttcctg gcctcctcca
660tcgcgcccta cggcgctgac ggctgcgccg acatcaccac ccgcgccaac atccagctgc
720gcggcgtcac catggaggac tcggagacgg tcatcaaggg gctgtgggat gtgggcctga
780cgtccttcca gtcgggcatg gactccgtgc gcaacctcac cggcaacccc atcgccggag
840tcgacccaca cgagctggtg gacacgcggc cgctgctgcg cgacatggag gcgatgctgt
900tcaacaacgg caagggccgc gaggagtttg ccaacctgcc gcgcaagctg aacatctgca
960tctcctccac ccgcgacgac ttcccgcaca cccacatcaa cgacgttggc tacgaggccg
1020tggccaagcc caacggcgag gtggtgtaca atgtggtggt gggcggctac ttctccatca
1080agcgcaacat catgtccatc ccgctgggct gctccatcac ccaggaccag ctgatgccct
1140tcactgaggc cctgctgcgc gtgttccggg atcacggccc gcgcggcgac cggcagcaga
1200cgcggctgat gtggctggtg gaggcggtgg gcgtggacaa gttccgccag ctgctgtcgg
1260agtacatggg cggcgccacc ttcggcgagc ccgtgcacgt tcaccacgac cagccctggg
1320agcggcgcaa cctgctgggc gtgcaccggc agaggcaggc cggcctgaac tgggtcggcg
1380cctgcgtgcc cgcgggccgc ctgcacgccg ccgactttga ggagatcgcg gctgtggctg
1440agaagtacgg cgacggcacg gtgcgcatca cgtgcgagga gaacgtgatc ttcaccaacg
1500tgcccgacgc caagctggag gcgatgaagg cggagccgct gttccagcgc ttccccatct
1560tccccggcgt gctgctgtcg ggcatggtgt cctgcaccgg caaccagttc tgcggcttcg
1620gtctggctga gaccaaggcg aaggccgtga aggtggtgga ggcgctggac gcgcagctgg
1680agctgagccg gcccgtgcgc atccacttca ccggctgccc caactcatgc ggccaggcgc
1740aggtgggcga catcgggctg atgggcgcgc ccgccaagca cgagggcaag gccgtggagg
1800gctacaagat cttcctgggc ggcaagatcg gcgagaaccc cgcgctcgcc accgagttcg
1860cgcagggtgt gccggccatt gagagcgtgc tggtgcctcg gctaaaggag attctgatct
1920ccgagttcgg tgccaaggag cgcgccaccg ccaccgccta agagcgtggt gtcacgagcg
1980tggcggcagt ggaacgtgct tgcagcgttg gtgtttggag cgagctcctc agagcgtgag
2040tgccttgttg aacacgccgg cgttgcgtga tgggaaggtg ggattggtgg tcgccctgag
2100gtgcatgaag catgcagggc aggggagtgg gattggttgg agaggaaaat gagtaggagt
2160gatgcgcacc tgcggctgcc tatataacat aaggaagtaa gcgtgatgga tgcacgggct
2220gtgttttgct tgaagcggca gagccctgca ggagccagac ggccgacatg tactgctaag
2280gcaggagcca gttcctgcgt tgagaagaag cgtgcttgct tgccggcgga ggccgtcttg
2340cgtgccatac cagggcacgg cagcgctgga agactgcatg cgacgcagcg atcggagcac
2400gctgtggttc tttaccctcg ttttacatat gcgttgtcgt gttccttgtg tgtatgtacg
2460tgtgtgtgtg tgtgtgtgta cggtgtgtat acggcgtgcg gggcaggcag gcggaggctg
2520caaagggagc gcagatgcgc atccttaggg aaaggtacgt aggagccgcc gctgcgtgta
2580tgtatgtact agcagctaga tatgcacgtg gtgacctgca gcgctgtgct caatgcgtgc
2640tgtggcacca gcgcaggggc aagaagcgta ggcatttcgg tagtacggta ttgtgtgcgc
2700gtgctggcgc tgggaggcgg tgcagtggtg caggtttgtt ggcgccggcc gctgcacctg
2760ctgcgcttgc gactaggcag gcgccgtacg gtaatagggg ttgaggcaca ttgcgcatgc
2820atttgtctag ataatggtat gcggcgccga caagtggcaa ctagcgttag ggtggcttgt
2880ctgtactaaa ccacggccca taccgcagtg cggcgtgtgg ctgcaacacc cgtgccggcg
2940tgtaggagga gctgacgtgt gatctagagt gaataccaat ggtactggaa gaggtaacag
3000actttgcgac gagcgttgca atgcgaggcg cccgccgggg caggcgtgca cacaaccacc
3060tagatggctg catcccgggc gaatgtaaca acaccggaag ga
3102
39 1467 DNA Chlamydomonas reinhardtii 39
ctgtttcgtc acgtcgttat tgaattctat
taagtggttt aaccgtaggt agcagccatg 60cttctcaagg gcattacaac cccgatgctg
gggcagcagc gccccactcg cggccagctg 120cacgtcgtga acgtggctac gccctccaag
aatccctcct ctcgcctggc gaagcgcagc 180aaggtggaga ttattaagga gaagagcgac
tacctgcggc acccactcat ggaggagctg 240gttaacgacg ccaccttcat caccgaggac
tcggtgcagc tcatgaagtt ccacggctcg 300taccagcagg accaccgcga gaagcgcgcg
tttggtcagg gcaaggctta ctgctttatg 360atgcgcacgc gtcagcccgc tggtgtcgtg
cccaaccgcc tgtacctggt gatggacgac 420ctggccgatc agtacggcaa cggcacgctg
cgcctgacta cgcgccaggc ctaccagctg 480cacggcgtgc tgaagaagga cctcaagacg
gtgttcagct ccgtcatcaa gaacatggga 540tccaccctgg ccgcctgcgg cgacgtcaac
cgcaacgtta tgggcccctc cgcgcccttc 600accaaccgcc ccgactacgt ggccgcccag
aaggccgcca acgacatcgc cgacctgctg 660acgccgcagt cgggcgccta ctacgacgtg
tggctggacg gcgagaagtt catgtcggct 720tacaaggagg accccaaggt gaccgccgac
cgtgcctaca acggcttcgg caccaacttt 780gagaacagcc ccgagcctat ctacggcgcg
cagttcctgc cccgcaagtt caaggtggcc 840accacggtgc cgggcgacaa cagcgtggac
ctgttcaccc aggacctggg cgtggtggtc 900atcatggacg agagcggcaa ggaggtcaag
ggctacaacc tgacggtggg cggcggcatg 960ggccgcacac accgcgacga tgagaccttc
ccgcgtctgg ctgacccgct gggctacgtg 1020gacaaggacg acctgttcca cgccgtcaag
gcggttgttg cggttcagcg cgactacggc 1080cgccgcgaca accgcaagca ggcgcgcctc
aagtacctgg tgggcctgcc cgccgaccag 1140gagctgcacg tgcgcatgac gggctgcccc
aacggctgcg cgcggcccta catggccgag 1200ctgggcttcg tgggcgacgg ccccaacagc
taccagctct acttcggcgg caacgtcaac 1260cagacgcgcc tggcgcagct gttcgcggac
agggtcaagg tgaaggacct ggagtccacg 1320ctggagccca tcttcgccgc ctggaaggcc
agccgccggc caaaggagtc gttcggcgac 1380tgggtgtcgc ggccgtccca agatcccaag
aatctcagtt ctgtacaaca gggcacgcag 1440cacgagagcg ccgtcgtcgc gcactaa
1467
40 2080 DNA Gossypium hirsutum 40
tatcccttca cttatctttc caccaccaca attccaccag ttccaagctt cttttcaaac
60aacaaaaccc cacatgtctt ccttgtcggt ccgtttcttt gctccacaac agccgttact
120gccgtccaca gcttcctctt tcaagcccaa aacatgggtt atggcagctc ccacgacggc
180gccggcgact tcggtggatg tcgacggggg gaggttggaa ccccgagttg aagaacgaga
240ggggtacttc gtgttgaaag agaagttcag agatggcatc aaccctcagg agaaaataaa
300gatcgagaaa gaccctttga agcttttcat ggaagctggg attgatgaac tcgctaagat
360gtcgttcgag gatcttgata aagctaaggc tacaaaggac gacattgatg ttagacttaa
420atggctcggc ttgttccata ggagaaaaca tcaatatggg agatttatga tgagactaaa
480actaccaaat ggtgtaacaa caagtgcaca aacacggtac ttagccagtg tgataaggaa
540atacggcaaa gaagggtgtg ccgatgttac gacaaggcaa aactggcaaa tccgtggagc
600ggtgttgcct gatgtgcctg aaatacttaa gggtctcgac gaagtaggct tgacgagcct
660acagagtggc atggacaatg tgaggaaccc tgtcggtaat cctcttgccg gcatcgaccc
720cgaagagatt gtcgatactc gaccttatac caacttgtta tctcagttca tcaccgccaa
780ttcccgcggc aatccggctg ttgccaactt gcctaggaaa tggaatgtct gtgtcgtggg
840gtctcatgat ctttacgaac atccccatat caatgatctc gcttatatgc cggcgacgaa
900aaacggacga tttgggttta atttgctggt tggtgggttc tttagtgcca agagatgtga
960tgaggccatt cctcttgatg cttgggtctc agctgatgat gtgattccat tgtgcaaagc
1020tgtgttagaa gcctataggg atcttggata caggggcaat aggcaaaaga ctagaatgat
1080gtggctgatt gatgaactgg gtattgaagt gttcagatca gaagtagcca aaagaatgcc
1140tcagaaagag ttggagagag catctgatga agatttggtt caaaagcaat gggaaaggag
1200agactacctt ggtgtccatc cgcaaaagca agaaggtttc agctacatcg gcattcacat
1260cccagtcggt cgagtccaag ccgacgacat ggacgaacta gcccggttag ccgacacgta
1320tggctcgggc gaattcagac tcactgtgga gcaaaacatc ataatcccca acgttgagaa
1380ctcgaaacta gaagcattac taaacgagcc tctattgaaa gaccggtttt caccccaacc
1440aagtattctc atgaaagggc tagtagcttg tactggtaac cagttttgcg gacaagccat
1500tattgaaaca aaagctagag ccttgaaggt gacggaagag gttgaaaggc tagtgtcggt
1560gagccggccg gtgaggatgc attggaccgg ttgccccaac acgtgtggtc aagtccaagt
1620ggcggatata ggtttcatgg ggtgcatggc aagggatgag aatgggaaac catgtgaagg
1680ggcagacata ttcttgggag ggagaattgg gagtgactca catttaggag agctttataa
1740gaagggtgtc ccttgtaaga acttggtacc tgtagttgct gacattttgg tggaaccctt
1800tggagctgtc cctaggcaaa gggaagaagg ggaagattga ttcaaaatca acttcatttc
1860attccattac ttttatattt gttttatttt ttttttttaa taaccaagaa aaatgaaggg
1920tttgaaagat actggggagg attaaatttg gagaatattg atcaatggca tgatgatgaa
1980gggctttgta ttataaaata tgtaacattt tcagcatatg tattagaata aagttactgg
2040taatatattt tcagttaaaa tttagagatg atcatgtttg
2080
41 1482 DNA Hordeum vulgare 41
accaccatca ccgccacaga gcagcagcag
cggcaccacc accaccgcaa ccacaagcag 60catccatggc gtcctcggcc tccctgcaga
gcttcctccc gccctcggcc cacgcggcga 120cgtcgtcgtc ccggctccgg cccagccgcg
cccgccccgt ccagtgcgct gccgtctccg 180cgccgtcgtc gtcgtcgtcg tccgcatcgc
cgtcggcctc ggccgtcccg tcggagcggc 240tggagccgcg ggtggagcag cgggagggcg
gctactgggt gctcaaggag aagtaccgca 300ccagcctgaa cccgcaggag aaggtgaagc
tgggcaagga gcccatggcg ctcttcaccg 360agggcggcat caacgacctc gccaagctgc
ccatggagca gatcgacgcc gacaagctca 420ccaaggagga cgtcgacgtg cgcctcaagt
ggctcggcct cttccaccgc cgcaagcagc 480agtatgggcg gttcatgatg cggctgaagc
tgcccaacgg cgtgacgacg agcgagcaga 540cgaggtacct ggcgagcgtg atcgacaagt
acggcgagga ggggtgcgcc gacgtgacga 600cccggcagaa ctggcagatc cgcggcgtga
cgctgccgga cgtgccggag atcctggacg 660ggctccgctc cgtcggcctc accagcctgc
agagcggcat ggacaacgtg cgcaaccccg 720tcggcagccc gctcgccggc atcgaccccc
tcgagatcgt cgacacgcgc ccctacacca 780acctcctctc ctcctacatc accaacaact
ccgagggcaa cctcgccatc accaaccttc 840ctaggaagtg gaacgtgtgc gtgatcggca
cacatgatct gtacgagcac ccgcacatca 900acgacctggc gtacatgccg gccgagaagg
acggcaagtt cgggttcaac ctgctcgtgg 960gcgggttcat cagccccaag aggtggggtg
aggccctgcc gctcgacgcc tgggtccccg 1020gcgacgacat catcccggtc tgcaaggccg
tcctcgaggc gttccgcgac ctcggcacca 1080ggggcaaccg ccagaagacg cgcatgatgt
ggctcatcga cgagctcggg atggaggcgt 1140tccggtcgga gatcgagaag aggatgccca
acggcgtgct ggagcgcgcg gcgccggagg 1200acctgatcga caagaagtgg gagaggcgcg
actacctcgg cgtgcacccg cagaagcagg 1260aggggctctc cttcgtcggc cttcacgtgc
ccgtcggccg gctgcaggcc gcggacatgt 1320tcgagctggc ccgcctcgcc gacgagtacg
gctccggcga gctccgcctc acggtggagc 1380agaacatcgt gctgcccaac gtgaagaacg
agaaggtgga ggcgctgctg gcggagccgc 1440tgctgcacaa gttctcggcg cacccgtcgc
tgctgatgaa gg 1482
42 2092 DNA Lotus japonicus 42
tcaccatgtc ttcttccttc tccattcgct tcctcgctcc tccatttccc tccacctctc
60gccccaagtc atgtctctcc gccgccacgc cggctgtggc tccaaccgat gcggcggtgt
120cgaggttgga gcccagagtg gaggagagaa atgggtactg ggttttgaag gaagagcaca
180ggggtggcat taatccgcag gaaaaggtga agctggagaa agagcctatg gcccttttta
240tggaaggtgg gattgatgag ttggctaagg tttctattga agagcttgat agctctaagc
300ttactaagga tgatgttgat gttaggctca aatggcttgg tctttttcat aggagaaagc
360atcagtatgg tagatttatg atgaggctga aacttccaaa tggggtgaca acgagtgcgc
420agacacgata cttggcgagt gtgatcagga agtacgggaa agatgggtgt gctgatgtga
480ccacaaggca taattggcaa attcgtggtg tagtgctacc tgatgttcct gaaattctta
540agggccttgc agaggttggc ttgactagtc tgcagagtgg tatggacaat gtaagaaacc
600ctgtgggtaa ccctcttgca ggcattgacc ctgatgagat tgttgatacc cgaccttaca
660cgaacttgtt gtcccatttc atcactgcca attcacgtgg caacccaacc gtctcaaact
720tgccaaggaa gtggaatgta tgcgttgtgg gttctcatga tctctttgag catccccaca
780taaatgatct tgcttacatg cctgctaaca aagatggtcg ttttggattc aacttattgg
840tggggggttt ctttagtccc aagcgatgtg cagaggcaat tccacttgat gcatgggtct
900ctgcagaaga tgtaatccca gtttgtaaag caatcctcga gatgtacagg gatcttggca
960ccagaggaaa cagacagaaa acaagaatga tgtggttgat tgacgaactg gggatagaag
1020tattcaggtc agaggtggta aaaagaatgc cattagggca gcagctggag agagcatccc
1080aggaagatct ggttcagaaa caatgggaaa gaagagatta ctttggtgcc aatccacaga
1140aacaagaggg cttaagctat gttgggattc acattccagt tggtaggatc caagcagatg
1200agatggacga gctggcccgt ctggccgatg aatacggcac tggtgaactg aggctcactg
1260tagagcaaaa cataataatc ccaaatgtgg aaaactcaaa actcagtgcc ctgctcaatg
1320agcctctctt gaaagaaaag ttctcacctg aaccttccct tctaatgaaa acactggtgg
1380catgcactgg tagccaattt tgtgggcaag ccataattga gacaaaggcg agggcattga
1440aggtgactga agaagtggag agactagtgg cagtgactag gcctgtgaga atgcactgga
1500ctgggtgtcc caacacctgc gggcaagtgc aggttgctga tattggtttc atggggtgca
1560tggccagaga tgagaatggt aagcctggtg aaggtgtgga tattttcctg ggagggagga
1620taggaagtga ttcacactta gctgaggttt ataagaaggc tgttccttgc aaggacttgg
1680tgcccatagt ggcagacata ctagtaaaac attttggagc tgtccagagg aatagagaag
1740aaggagatga ttaagttatt taggtttaac ttttgaaatt aaaccttctg ttgtatctat
1800gacaaaatat cattttcttg tccaaaattt ataatagtag taagggtgat caagtgagat
1860ataccacatg tgccaatggg gaaaaaaagt cggatatgaa agttgtaatc ttacatgagt
1920ggttttgaaa ttacatgaca catttttatt gatcggacgg aaaagaagat ccaaacaaat
1980gtgtaagaaa tttttcttag tttctaattt ccactttcta ttcataaata aatgtgtaag
2040ctatggttct tactttgtga catttgttaa aataaatatt ttcacttttt tt
2092
43 2149 DNA Nicotiana tabacum 43
atggcatctt tttctgttaa attctcagca
acttcattgc caaatcctaa cagattttcc 60aggactgcta agcttcatgc aacaccgccg
cagacggtgg cagtaccacc atctggggag 120gcggagatag cttccgagag gctagagcct
agagtagagg aaaaagatgg gtattgggta 180ctcaaggaaa aattcagaca agggataaat
ccagctgaaa aggccaagat tgagaaagaa 240ccaatgaaat tatttatgga aaatggtatt
gaagatcttg ctaagatctc acttgaagag 300atcgaagggt ctaagcttac taaagatgat
attgatgtta ggctcaagtg gcttggcctt 360ttccatagga gaaagcatca ttatggccga
ttcatgatgc gattgaagct tccaaatggg 420gtaacaacga gtgcccaaac tcgatactta
gccagtgtga taaggaaata tggaaaagat 480ggatgtggtg atgtgactac aaggcaaaat
tggcagattc gcggggttgt actacctgat 540gtacccgaga ttctaaaggg actggatgaa
gttggcttga ccagtctgca aagtggcatg 600gacaacgttc gaaatccggt gggaaatcct
ctggcgggga ttgatccaca tgaaattgta 660gacacaaggc cttacactaa tttgctctcc
caatatgtta ctgccaattt tcgtggcaat 720ccggctgtta ctaacttgcc aaggaagtgg
aatgtatgtg taatagggtc acatgatctt 780tatgagcatc cccatatcaa tgatcttgcc
tatatgccgg catcaaaaga tggacgattt 840ggattcaacc tgcttgtggg tggattcttc
agtccgaagc gatgtgcaga ggcagttcct 900ctagatgcat gggttccagc tgatgacgtg
gtccctgttt gcaaagcaat attagaagct 960tatagagatc ttggtaccag agggaacagg
caaaaaacaa gaatgatgtg gttagttgat 1020gaactgggcg ttgaaggatt cagggcagag
gtcgtaaaga gaatgcctca acaaaagcta 1080gatagagaat caacagagga cttggttcaa
aaacaatggg aaaggagaga ataccttggc 1140gtgcatccgc agaaacaaga aggatacagc
tttgttggcc ttcacattcc ggtaggtcgt 1200gtccaagcag atgacatgga cgagctagct
cgtttagcgg ataactatgg ttcaggagag 1260ctccggttga ctgttgaaca gaacatcatt
attcccaacg ttgagaactc aaagatcgag 1320tcattgctca atgagcctct cttaaagaac
agattttcga ccaatccacc tattctcatg 1380aaaaatctgg tggcttgtac tggtaaccaa
ttttgcgggc aagccataat tgagactaaa 1440gcgcgttcca tgaagataac tgaggaggta
caacgactag tttctgtgac aaagccggtg 1500aggatgcatt ggactggttg cccgaattca
tgtggacaag ttcaagtcgc ggatattgga 1560tttatgggat gcttgacaag aaaagaagga
aaaactgtag aaggtgctga tgtttatttg 1620ggaggcagaa tagggagtga ctcacatttg
ggagatgttt ataagaaatc agtaccttgt 1680gaggatttgg tgccaataat tgtggactta
ctagttaaca actttggtgc tgttccaaga 1740gaaagagaag aagcagaaga ttaatttcaa
gatttcataa cagctcgcgg atcgcgctgc 1800agaattggac attaatggaa tgtgcacacc
atatcaagtt atttcgaagg tacagaaatg 1860gtgacactga tcctgaaaac caaggttttc
tttattgaaa gttagttgaa taattggtat 1920atgtgccgtt attaacatgc tcatgtgtga
tatagcacga cagaaatatt tgtacttgtt 1980tcagaataat tatattgtgt attcttttgg
aaaaactgat acaaaccaaa aggcttttaa 2040accacccttc agttgggatt ctaataatcc
atctttacat accaattaat catgttgttg 2100tattcttaat catattgtta tattataata
atccattcgg tttgatgcc 2149
44 1902 DNA Nicotiana tabacum 44
atggcatctt tttctattaa atttctggca ccttcattgc caaatccagc tagattttcc
60aagaatgctg tcaagctcca cgcaacaccg ccgtctgtgg cagcgccgcc aactggtgct
120ccagaggttg ctgctgagag gctagaaccc agagttgagg aaaaagatgg ttattggata
180ctcaaagagc agtttagaaa aggcataaat cctcaagaaa aggtcaagat tgagaagcaa
240cctatgaagt tgttcatgga aaatggtatt gaagagcttg ctaagatacc cattgaagag
300atagatcagt ccaagcttac taaggatgat attgatgtta ggcttaagtg gcttggcctc
360ttccatagga gaaagaacca atatgggcgg ttcatgatga gattgaagct tccaaatgga
420gtaacaacga gtgcacagac tcgatactta gcgagtgtga taaggaaata cgggaaggaa
480ggatgtgctg atattacgac aaggcaaaat tggcagattc gtggagttgt actgcctgat
540gtgccggaga tactaaaggg actagcagaa gttgggttga ccagtttgca gagtggcatg
600gacaatgtca ggaatccagt aggaaatcct ctggctggaa ttgatccaga agaaatagta
660gacacaaggc cttacactaa tttgctctcc caatttatca ctggcaattc acgaggcaat
720cccgcagttt ctaacttgcc aaggaagtgg aatccgtgtg tagtaggctc tcatgatctt
780tatgagcatc cccatatcaa cgatctcgcg tacatgcctg ccacgaaaga cgggcgattt
840ggattcaacc tgcttgtggg agggttcttc agtgcaaaaa gatgtgatga ggcaattcct
900cttgatgcat gggttccagc cgatgatgtt gttccggttt gcaaagcaat actggaagct
960tttagagatc ttggtttcag agggaacaga cagaaatgta gaatgatgtg gttaatcgat
1020gaactgggtg tagaaggatt cagggcagag gtcgagaaga gaatgccaca gcaacaacta
1080gagagagcat ctccagagga cttggttcag aaacaatggg aaagaagaga ttatcttggt
1140gtacatccac aaaaacaaga aggctacagc tttattggtc ttcacattcc agtgggtcgt
1200gttcaagcag acgatatgga tgagctagct cgtttagctg atgagtatgg ttcaggagag
1260atccggctta ctgtggaaca aaacattatt attcccaaca ttgagaactc aaagattgag
1320gcactgctca aagagcctgt tctgagcaca ttttcacctg atccacctat tctcatgaaa
1380ggtttagtgg cttgtactgg taaccagttt tgtggacaag ccataatcga gactaaagct
1440cgttccctga tgataactga agaggttcaa cggcaagttt ctttgacacg gccagtgagg
1500atgcactgga caggctgccc gaatacgtgt gcacaagttc aagttgcgga cattggattc
1560atgggatgcc tgactagaga taagaatgga aagactgtgg aaggcgccga tgttttctta
1620ggaggcagaa tagggagtga ttcacatttg ggagaagtat ataagaaggc tgttccttgt
1680gatgatttgg taccacttgt tgtggactta ctagttaaca actttggtgc agttccacga
1740gaaagagaag aaacagaaga ctaataaaat ttagaatagt tggtgatttt gctgtgttca
1800taacatgtaa tgtatgataa atcaatgcaa acatttctac ctacgtgaga attattacat
1860gctacatata ttcttttgaa gaaaattaca tgcgtactcc tc
1902
45 1755 DNA Nicotiana tabacum 45
atggcatctt tttctgttaa attctcagct
acttcattac caaatcataa aagattttca 60aagctacatg caacaccgcc gcagacggtg
gctgtagccc catctggggc ggcggagata 120gcatcggaga ggttagagcc tagagtagaa
gaaaaagatg ggtattgggt acttaaggaa 180aaattcagac aagggataaa tccagctgaa
aaagctaaga ttgagaagga accaatgaaa 240ttgtttatgg aaaatggtat tgaagatcta
gctaagatct cacttgaaga gatcgaaggg 300tctaagctta ctaaagatga tattgatgtt
aggctcaagt ggcttggcct tttccatagg 360agaaagcatc actatggccg attcatgatg
agattgaagc ttccaaatgg ggtaacaacg 420agttcccaaa ctcgatactt agccagtgtg
ataaggaaat atgggaaaga tggatgtgct 480gatgtgacga caaggcaaaa ttggcagatt
cgtggggttg tactacctga tgtacccgag 540attctaaagg gactggatga agttggctta
accagtctgc agagtggcat ggacaatgtt 600agaaatccgg tgggaaatcc tctggcgggg
attgatccac atgaaattgt agacacaagg 660ccttacacta atttgctctc ccaatatgtt
actgccaatt ttcgtggcaa tccggctgtg 720actaacttgc caaggaagtg gaatgtatgt
gtaatagggt cacacgatct ttatgagcat 780ccccagatca acgatcttgc ctatatgccg
gcaacaaaag atggacgatt tggattcaac 840ctgcttgtgg gtggattctt cagtccgaag
cgatgtgcag aggcagttcc tcttgatgca 900tgggttccag ctgatgacgt agtccctgtt
tgcaaagcaa tattagaagc ttatagagat 960cttggcacca gagggaacag gcagaaaaca
agaatgatgt ggttagttga tgaactgggc 1020gttgaaggat tcagggcaga ggttgtaaag
agaatgcctc aacaaaagct agatagagaa 1080tcaacagagg acttggttca aaaacaatgg
gaaaggagag aataccttgg cgtgcatcca 1140cagaaacaag aagggtacag ctttgttggt
cttcacattc cagtgggtcg tgtccaagca 1200gatgacatgg acgagctagc tcgtttggcc
gatgagtatg gttccggaga gctccggctg 1260actgttgaac aaaacatcat tattcccaat
gttaagaact caaagatcga ggcattgctc 1320aatgaacctc tcttaaagaa cagattttca
accgatccac ctattctcat gaaaaatttg 1380gtcgcttgta ctggtaacca attttgcggg
aaagccataa ttgagactaa ggcacgatcc 1440atgaaaataa ctgaggaggt tcaactacta
gtttctataa cgcagcctgt gaggatgcat 1500tggactggtt gcccgaattc atgtgcacaa
gttcaggtcg cggatattgg atttatggga 1560tgcttgacaa gaaaagaagg aaaaactgta
gaaggtgctg atgtttattt gggaggcaga 1620atagggagtg actcacattt gggagatgtt
tataagaaat cagtaccttg tgaggatttg 1680gtgccaataa ttgtggactt actagttgac
aactttggtg ctgttccaag agaaagagaa 1740gaagcagaag attaa
1755
SEQ ID NO: 46 (length 2496, DNA, Oryza sativa)
gaaccttatc
tccttctctc tcgtcgcttt ctgcgtctcc ccgtctctcc ttcgccaaca 60gccgagaaga
ggcagagaga gcgccgcccc ccgtccctct ctctccctct cgtcctcgcc 120cccatccctc
tcgtctttcc cttgccggca gcagaggagg cggcagcgac ggcttcagct 180gctcccacgg
gccggatcgg gcagtggcgg tggcgtcggc ggcttccgct ggcgaatccg 240gcgggtggat
acaaatcagt gttccgatag gtaaaaccct gctctcagca tctgcccttt 300tgaattcgcc
aagagccagc atctgccctt ttgaattcgc caagggccag catctgccca 360tttgattttg
aattcgccaa gagccagcaa cagcgccccc gcgccccctc cctcctccgc 420aataaacagc
cacacgcgcc gcccccatgt ccaccctcat cgccacagcg caccaccacc 480accaccacca
ccaccaccac caccgtctcc agccatggcc tcctccgcct ccctgcagcg 540cttcctcccc
ccgtaccccc acgcggcagc atcccgctgc cgccctcccg gcgtccgcgc 600ccgccccgtg
cagtcgtcga cggtgtccgc accgtcctcc tcgactccgg cggcggacga 660ggccgtgtcg
gcggagcggc tggagccgcg ggtggagcag cgggagggcc ggtactgggt 720gctcaaggag
aagtaccgga cggggctgaa cccgcaggag aaggtgaagc tggggaagga 780gcccatgtca
ttgttcatgg agggcggcat caaggagctc gccaagatgc ccatggagga 840gatcgaggcc
gacaagctct ccaaggagga catcgacgtg cggctcaagt ggctcggcct 900cttccaccgc
cgcaagcatc agtatgggcg gttcatgatg cggctgaagc tgccaaacgg 960tgtgacgacg
agcgagcaga cgaggtacct ggcgagcgtg atcgaggcgt acggcaagga 1020gggctgcgcc
gacgtgacaa cccgccggca gatccgcggc gtcacgctcc ccgacgtgcc 1080ggccatcctc
gacgggctca acgccgtcgg cctcaccagc ctccagagcg gcatggacaa 1140cgtccgcaac
cccgtcggca acccgctcgc cggcatcgac cccgacgaga tcgtcgacac 1200gcgatcctac
accaacctcc tctcctccta catcaccagc aacttccagg gcaaccccac 1260catcaccaac
ctgccgagga agtggaacgt gtgcgtgatc gggtcgcacg atctgtacga 1320gcacccacac
atcaacgacc tcgcgtacat gccggcggtg aagggcggca agttcgggtt 1380caacctcctc
gtcggcgggt tcataagccc caagaggtgg gaggaggcgc tgccgctcga 1440cgcctgggtc
cccggcgacg acatcatccc ggtgtgcaag gccgttctcg aggcgtaccg 1500cgacctcggc
accaggggca accgccagaa gacccgcatg atgtggctca tcgacgaact 1560tggaatggag
gcttttcggt cggaggtgga gaagaggatg ccgaacggcg tgctggagcg 1620cgcggcgccg
gaggacctca tcgacaagaa atggcagagg agggactacc tcggcgtgca 1680cccgcagaag
caggaaggga tgtcctacgt cggcctgcac gtgcccgtcg gccgggtgca 1740ggcggcggac
atgttcgagc tcgcacgcct cgccgacgag tacggctccg gcgagctccg 1800cctcaccgtg
gagcagaaca tcgtgatccc gaacgtcaag aacgagaagg tggaggcgct 1860gctctccgag
ccgctgcttc agaagttctc cccgcagccg tcgctgctgc tcaagggcct 1920cgtcgcgtgc
accggcaacc agttctgcgg ccaggccatc atcgagacga agcagcgggc 1980gctgctggtg
acgtcgcagg tggagaagct cgtgtcggtg ccccgggcgg tgcggatgca 2040ctggaccggc
tgccccaaca gctgcggcca ggtgcaggtc gccgacatcg gcttcatggg 2100ctgcctcacc
aaggacagcg ccggcaagat cgttgaggcg gccgacatct tcgtcggcgg 2160ccgcgtcggc
agcgactcgc acctcgccgg cgcgtacaag aagtccgtgc cgtgcgacga 2220gctggcgccg
atcgtcgccg acatcctggt cgagcggttc ggggccgtgc ggagggagag 2280ggaggaggac
gaggagtagg aacacagact ggggtgtttt gcttgctccg gtgatctctc 2340gccgtccttg
taaagtagac gacaatatgc cttcgcccat ggcacgcttg tactgtcacg 2400ttttggtttg
atcttgtagc ccaaaagttg tgttcattct cgttacagtc ttacagagga 2460tgattgattg
ataaataaag aagaaacaga ttctgc
2496
SEQ ID NO: 47 (length 2265, DNA, Physcomitrella patens)
attagagagt tgatggacat cgtttgatcg
ttaactgcag cgaaataagt ccatggggtt 60tttaggaagt ggagtgatac atcgtcgcat
agttactggg aaaattgtaa ttgctcgtgc 120tcaggctgga atttcaagca agttgaggat
tgcaggcgaa atttactgaa gtaaaattcg 180ccaggcgcaa tgcaaggtgc aatgcagaca
aagatgtgga ggggagagct gatcagcaca 240tcgacccact ttataggcgg cactcgactg
cagcccaaac taaaccagga tgcaaggaaa 300cccacgaaaa gtgaaaattg tatcgttcga
gtctccatgg agcgtgaggt caaggctaag 360gccgcggttt ctccacccgc tgttgctgca
gaccgtctca ctccacgagt gcaagaaaga 420gatggctact acgttctcaa agaggaattc
cgacaaggaa ttaaccccca agagaagatc 480aaacttggga aagagccgat gaaattcttc
atagagaacg agatagagga gcttgcaaag 540acgccgttcg cggagctaga cagctcgaag
cctgggaagg acgatatcga tgttagactc 600aagtggttgg gtctcttcca ccgccgcaaa
catcaatatg gaaggttcat gatgcggttc 660aagcttccga atggaatcac gaacagtaca
cagacgaggt ttttggccga gaccatctca 720aaatacggaa aggaagggtg tgcagatttg
acgacaagac agaactggca aattcgtggg 780attatgctcg aagatgtgcc ctcccttctg
aaaggactgg aatccgtggg cctatcgtct 840ctgcagagcg ggatggacaa tgtaagaaat
gcggtcggta accctcttgc tggaatcgac 900cccgacgaaa tcgtcgacac cattcctatc
tgtcaggcgc tgaacgacta catcatcaac 960agagggaaag gaaatactga gatcaccaac
ttacctcgga agtggaacgt gtgcgtggtc 1020gggacgcacg acttatttga acatccgcac
atcaacgatc ttgcgtacgt tcccgcaacc 1080aagaacggcg tcttcggttt caacattctt
gttggaggat tcttcagctc aaagcggtgc 1140gccgaagcta ttccgatgga cgcttgggtg
ccgacagacg acgtcgtccc gttgtgcaaa 1200gcaattctgg agacttatcg agacctcggg
actcgcggca accgacagaa gactcgcatg 1260atgtggttga tcgatgagat gggagtcgag
gagttcagag ccgaggtgga aaggcgcatg 1320cccagcggca ctatccggcg agccggacag
gatctgatag acccgtcgtg gaagcgccgg 1380agcttcttcg gagtaaaccc ccagaagcaa
gcagggctga actacgttgg tcttcacgtc 1440ccggtcgggc gtttgcacgc tccagagatg
ttcgagctgg ctcgcattgc cgatgagtac 1500ggcaacggcg agatccggat cactgtggag
cagaacctga ttctgcccaa catcccgacg 1560gagaaaattg acaagttgat gcaggagccc
ctcttgcaga aatactctcc gaatcccacc 1620cccttgttgg cgaacttggt ggcctgcact
ggcagccagt tctgcggcca agcgatcgcg 1680gagacgaagg ccctgtccct gcaactcacg
cagcagctcg aagacaccat ggaaacgact 1740cgcccgatcc gattgcactt cacgggatgc
cccaacacat gcgctcaaat ccaggttgcg 1800gatatcggat tcatgggcac catggctcga
gatgaaaacc gaaagcccgt tgaagggttc 1860gacatctacc tcggaggccg catcggctcc
gactctcact tgggagagct tgtcgtgcct 1920ggtgtgcctg ccaccaagct gcttccggtg
gtgcaagagc tgatgatcca gcatttcggc 1980gctaaaagga aaccttgaga tgcaaatctg
ggtatagtaa caaaaaatca ctactcgtca 2040cacacacaca cacaccgctg atgtataatt
tacgtaaaac caatctatcg aatagcacga 2100ttcacagtta cgaaactctg ggtaaaaccc
ggttataaat tgatgaccat tcattcgtct 2160tgtgcagcct tccagtgaca ttgtcagtgt
cggtgggcat gagctctgtc gctaatcccc 2220acttctccaa taaagtttcg gcaaatctgt
gcccacatga atcat 2265
SEQ ID NO: 48 (length 1809, DNA, Physcomitrella patens)
atgcaaggca ctatgcagtc acaaatgtgg aggggacagg tgagcggcgc atcgctccac
60ttcacaggcg caacccgagt gcagggtaac agccaccagg atttagtata tcccacgcaa
120tttcacaaac atggcgttcg ggcctctgcg gagcgcgagg tcaaggccaa ggctgtagct
180gccccaccta ccatcgctgc agaccgcctc gtgccacgcg tggaagaacg agatggttat
240tacgttctta aggaggaatt tcgacagggc atcaacccgt cggagaagat aaaaatcgcc
300aaagaaccca tgaaattctt catggagaac gagatagaag agctggcgaa aacgccgttc
360gccgagctcg atagttcgaa ggcaggaaag gacgacattg atgtgagatt gaagtggttg
420ggcctcttcc accgtcgcaa acatcaatat gggagattca tgatgcggtt caagcttcca
480aatgggatca cgaatagctc gcagacgcgg ttcttggctg agacaatctc caagtacgga
540gagtatgggt gcgctgattt gacgacacgt caaaactggc aaatcagggg gattgttctc
600gaagacgtgc ctgctcttct gaagggattg gaatcagtag gcctgtcatc tttgcagagc
660ggcatggaca acgttaggaa cccagttggt aaccctcttg caggaatcga ccctgacgaa
720attgtcgaca ctgccccgtt ctgcaaggta ctcagcgatt acatcatcaa ccgagggcaa
780ggaaatcctc agatcaccaa tttacctcgg aaatggaacg tgtgcgtggt tggaacacat
840gacttgttcg agcacccgca catcaacgac ctggcgtaca tgccagccac aaagaacggt
900gtcttcggtt tcaacatcct ggtgggagga ttctttagcc ctaagcggtg tgcggaagca
960attcccatgg atgcttgggt gccagcagat gatgtcgttc ccttgtgcaa ggcaattctg
1020gaaacctacc gagaccttgg aacccgaggc aaccgacaga agacccgcat gatgtggttg
1080atcgacgaga tgggaattga ggaattcaga gccgaggtag agaggcgcat gcccggtggg
1140tccattctta gagccgggaa ggacctggtc gatccatcct ggacgcgccg gagcttctat
1200ggagtgaacc cgcagaagca accgggctta aactacgtag gcctccacat tcccgtcggc
1260cggctgcatg ctccagagat gttcgagctt gcgcgcattg cagacgagta cggcaacggg
1320gagattcgga tctcggtgga gcagaacctg atcctgccca acgtccccac ggagaaaatc
1380gagaagctat tgaaggagcc cctcctggag aaatactccc cgaatcccac ccctctgctc
1440gccaacttgg tggcctgcac aggcagccag ttctgtggcc aggccatcgc ggagaccaag
1500gcccggtcgt tgcagctcac gcaagagctg gaagccacca tggaaaccac tcgtcctatt
1560cggttgcact tcaccggatg ccccaacaca tgcgcccaaa tccaggttgc ggatattggc
1620ttcatgggta caatggcacg agacgaaaat agaaagcccg tggaggggtt tgacatctac
1680cttggaggtc gtatcggctc cgactcacat ttgggagagc tcgtggtgcc gggcgtgcct
1740gcgaccaagc tgctccccgt tgtgcaagac ctcatgatcc agcatttcgg cgccaagcgt
1800aagacttaa
1809
SEQ ID NO: 49 (length 2270, DNA, Pinus taeda)
cggccggggg agacaagccc tcatcataga tttaattact
gatctttgca tcttggattt 60gtaatcggag tagtcaggat gaatctctct agtccagtca
gattcgatga gattcgtccc 120ttggcccatg tcgtttacaa tcctgtttgc tgtgggcata
agccgaatcg gctcaggttg 180atgacagcaa tccaggttcg tgctgttaat catggtggac
gcaattctga gatcagtaca 240gatgggaata gcaaagggac aacagccaag gctgtagcca
gtcctgctgg ctctcatgtg 300gctgtagatg cctcaaggct ggaggctaga gttgaggaga
gggatggata ctgggttctc 360aaagaggaat tcagggctgg aatcaaccct caggagaaga
ttaagttgca gagggagccc 420atgaaattgt tcatggagaa tgagatcgaa gaacttgcaa
agaagccctt cgctgaaatt 480gagagtgaga aggttaataa agatgatata gatgtacgcc
tgaagtggtt gggtctcttt 540caccgaagaa aacatcacta tgggagattc atgatgagac
ttaagcttcc gaatggagtg 600actaccagtc tccaaactcg atatttggca agcgtgattc
aacaatatgg accagaggga 660tgcgcagata taacaactcg gcagaattgg cagattcgtg
gagttgtgct ggatgacgtg 720cctgccatat tgaaagggct gaaggaggtt ggactgtcta
gcttgcagag tggaatggac 780aacgttagaa accctgtggg aaatccttta gcagggattg
atgctgatga aatcattgac 840acaaggccat atacaaaggt tctgactgac tacattgtca
acaatggaaa gggcaatcca 900tccataacca acctgccacg taaatggaat gtctgtgttg
tgggtacaca tgacttgttt 960gagcatcccc acatcaatga cctcgcctac attcctgcaa
tgaatagtgg gagatttggt 1020ttcaatctgc tcgttggtgg attctttagt ccaaaacgct
gtgaagaagc agttccactt 1080gatgcttggg ttgctggaga ggatgttgta ccagtatgca
gagccatttt ggaggtttat 1140agagatctgg gcacccgggg aaatcgccag aaaactcgaa
tgatgtggct gattgatgag 1200ttgggcatag agggcttccg ttcagaagtg gtgaagagaa
tgccaggaga gaagttggaa 1260agagcagcaa cagaagacat gttagataaa tcatgggagc
gcaggagtta tcttggtgtg 1320cacccacaga agcaggaagg cttgaatttc gtaggtctcc
atgttccagt gggtcgactt 1380caggcagaag atatgttaga actggctcgt cttgcagaac
aatatggcac gcaggaactc 1440cgcctcacag tagaacaaaa tgccatcatt ccaaacgtac
ctacagataa gatagaggca 1500cttttacagg aacccctcct ccaaaaattc tccccttccc
ctcctcttct tgttagcaca 1560ttagtggctt gtaccggcaa ccagttctgt ggtcaggcaa
tcatcgaaac aaaagcaaga 1620gccttgaaaa tcacagagga attggataga accatggaag
ttcccaagcc tgtgagaatg 1680cactggacag gatgccctaa tacatgtgga caagtgcagg
ttgcagacat tggcttcatg 1740ggttgcatga ctagggatga aaacaagaaa gttgttgagg
gagtggacat attcattgga 1800ggtagggtgg gagcagattc acatctaggg gatttaatcc
acaagggagt accttgcaag 1860gacgtggtac ctgtggttca agaactactt attaaacact
ttggagccat caggaaaaca 1920gacatgtgaa aatgaattcc aatttctcat ccatcgccat
cttcagtgga ggacaatcac 1980cagattgcta aggttctgag cgggtatcca actcattgaa
atctgaataa ataaatgtag 2040agatgcaatg tatagatgta ttgtttacga agtccaacgt
gttcagaaat aaaatagctg 2100attactgtgt tcacagcagg gtttttttac attaaactcg
tcttgcactt ttgaacagta 2160tggaatacaa ataaaaacgg attagcccaa aaaaataatg
gaataataga aattccagta 2220agattatgat aaaatctgta gaatttttga aaatctgagt
ttcactggtg 2270
SEQ ID NO: 50 (length 1877, DNA, Populus trichocarpa)
acacttctct
agaaactatc taccatcatt atgtcatcac tttcagttcg ttttctcacg 60ccacaattgt
cacccacagt tccaagctcc tctgcaagac caagaacaag actctttgct 120ggacctccca
cagtggctca gccagcggag acgggggtgg atgcagggag gttggaacct 180agagtggaga
agaaagacgg atactatgtg ttgaaagaga agtttaggca aggtattaat 240cctcaagaga
aagtgaagat agagaaagag ccaatgaagc ttttcatgga aaatgggatc 300gaggagcttg
ctaaattgtc gatggaagag attgacaaag agaagagcac taaagatgat 360attgatgtta
gactcaagtg gctcggtctc tttcacagaa ggaagcacca atatggtaga 420tttatgatga
gactaaagct accaaatggg gtaacaacaa gtgcacaaac aagatacttg 480gcaagcgtga
tcaggaaata tgggaaagat ggctgtgcag atgtaacaac aagacaaaac 540tggcaaattc
gtggagtggt gttgcctgat gtgccagaaa tactaagggg tctagctgaa 600gttggtctga
caagcctgca gagtggcatg gacaacgtga gaaaccccgt cggaaatccg 660cttgcaggaa
ttgatccgga tgagattgtt gataccagac cttataccaa cttgttgtcc 720caatttatca
ctgccaattc tcgtggaaat cctgagttca ctaacttgcc aaggaagtgg 780aatgtatgtg
tcgtgggttc tcatgatctt tatgagcatc ctcatatcaa tgatcttgct 840tacatgcctg
ccatgaagga cgggcggttt ggattcaatt tgctggttgg tgggttcttt 900agtcccaagc
gatgtgctga ggcaattcct cttgatgctt gggtttcagc tgatgatgtg 960ctcccatctt
gcaaagcagt gttagaggcc tacagagatc ttggcaccag agggaacagg 1020caaaagacta
gaatgatgtg gctgatcgac gagcttggca ttgaaggatt caggtcagaa 1080gtagtaaaaa
gaatgccacg tcaagagcta gagagagaat cttctgaaga tttggttcaa 1140aagcaatggg
aaaggaggga ctatttcggt gtccatccac agaagcaaga aggccttagc 1200tatgcaggtc
ttcacattcc tgtcggtcgc gtccaagcag atgacatgga tgagctagct 1260cgtttagctg
atatttatgg cactggcgaa ctcagactca ctgtggagca gaacatcata 1320attcccaaca
ttgaggactc aaagattgaa gccctactta aagaacctct attaaaagac 1380aggttctcac
ctgagccacc tcttctcatg caagggttgg tagcatgcac tggcaaagag 1440ttttgcgggc
aagcaataat tgaaacaaag gctagggcca tgaaggtaac tgaggaggtg 1500cagaggttag
tgtcggtgtc taaaccagtg agaatgcact ggacaggctg tcctaatacc 1560tgtgggcagg
tacaagttgc cgatattggg ttcatgggtt gcatggcaag agatgaaaat 1620gggaaaatct
gtgaaggagc agatgtgtac gtaggaggaa gagttgggag tgactcacat 1680ttgggagagc
tttataagaa aagtgttcca tgcaaggact tggtgccttt ggttgtggac 1740attttagtta
aacaattcgg agctgtacct agggagaggg aagaggtgga tgattagttc 1800atttaatcaa
aatgttcatt cttgtttcat tgcaaattcg gaggggatct aatgcatgct 1860tttggaatcg
gaaatga
1877
SEQ ID NO: 51 (length 2000, DNA, Solanum lycopersicum)
caacaatcaa gagtccacta aacgttttgc
cacacatcca tttactccca cagctctaca 60aaatgctctg acatctcttt tgcaacttcc
aaaatggcat ctttttctat caaatttttg 120gcaccttcat tgccaaatcc aactagattt
tccaagagta gtattgtcaa gctcaatgca 180actccgccgc agacagtggc tgcggcgggg
cctccagagg ttgctgctga gagactagaa 240ccaagagttg aggaaaaaga tggatattgg
atactaaaag agcagtttag gcaaggaatt 300aatcctcaag agaaggtgaa gattgagaag
gaacctatga agttgttcat ggaaaatggt 360attgaggagt tagctaagat tccaattgaa
gagatagatc aatcaaagct tactaaggat 420gacattgatg ttaggctcaa gtggcttggc
ctcttccata ggagaaagaa tcaatatggg 480agattcatga tgaggttgaa acttccaaat
ggagtaacaa caagtgctca gactcgatat 540ttggcgagtg tgataaggaa atatggagag
gaaggatgtg ctgatattac gacaaggcaa 600aattggcaga ttcgtggagt agtgctgcct
gatgtgcctg agattctaaa gggacttgaa 660gaagttggct tgactagttt gcagagtggc
atggataatg tcaggaatcc agttggaaat 720cctctggctg gaattgatcc tgaagaaata
gttgacacaa gaccttacac taatttgctc 780tcccaattta tcactggtaa ttcacgaggc
aatccggctg tttctaactt gccaaggaag 840tggaatccgt gtgtagtagg gtctcatgat
ctttatgagc accctcatat caatgatctt 900gcatacatgc ctgccataaa agatggacga
tttggattca acctgcttgt gggagggttc 960ttcagtgcca aaagatgtga tgaggcaatt
cctcttgatg catgggttcc agccgatgat 1020gttgttccgg tttgcaaggc aatactggaa
gcttttagag accttgggtt cagagggaac 1080aggcagaagt gtagaatgat gtggttgatc
gatgaactgg gtgtagaagg attcagggca 1140gaggtcgtaa agagaatgcc tcagcaagag
ctagagagag catctccgga agacttggtt 1200cagaaacaat gggaaagaag agattatctt
ggtgtacatc cacagaaaca ggaaggctat 1260agctttattg gtcttcacat tccagtgggt
cgtgtacaag cagacgacat ggatgatcta 1320gctcgtttgg ctgatgagta cggctcagga
gagctacggc tgactgtgga acagaacatt 1380attattccca acattgagaa ctcaaagatt
gacgcactgc taaaagagcc tattttgagc 1440aaattttcac ctgatccacc tattctcatg
aaaggtttag tggcttgtac tggtaaccag 1500ttttgtggac aagccattat tgaaacgaaa
gctcgttccc tgaagatcac cgaagaggtt 1560caaaggcaag tatctctaac gaggccagta
aggatgcact ggacaggctg cccaaatacg 1620tgtgcacaag ttcaagttgc agacattgga
ttcatgggat gcctgactag agataaagac 1680aagaagactg tggaaggcgc cgatgttttc
ttaggaggca gaatagggag tgactcacat 1740ttgggtgaag tatacaagaa ggcagttcct
tgtgatgaat tagtaccact tattgtggac 1800ttacttatta agaactttgg tgcagttcca
cgagaaagag aagaaacaga agattaataa 1860aatttggatt agatcataat gatggaatgt
gcaattatgt ttagtgatta tggaggtata 1920tagctaagag ctggtttgaa taatcagaaa
tatgttgtgt tcatatcatt tattgtacga 1980taaatcaaca caaacattcc
2000
SEQ ID NO: 52 (length 2135, DNA, Solanum lycopersicum)
gacgatcacc gctacctcaa tcgactaaat tctcaatttt aagttggttt tgtaacttag
60ttgttctttt taatttgtcg aaatgacttc tttttctgtt aaattttcag ctacttcact
120tccaaattct aatagatttt ccaaacttca tgctactcca ccgcagacgg tggcggtacc
180gtcgtacggg gcggcggaga tagctgctga aagactagag cctagagttg agcaaagaga
240tgggtattgg gtagttaagg ataagttcag acaaggcata aatccagctg aaaaggcgaa
300gattgaaaag gaaccaatga aactattcac tgaaaatggt atcgaagatc ttgctaagat
360ctcgcttgaa gagatcgaga aatcaaagct aactaaagaa gatattgata ttcgcctcaa
420gtggcttgga ctcttccatc ggagaaaaca ccactatggt cgattcatga tgcgattgaa
480gcttccaaat ggagtaacga cgagtgatca aactcgatat ttaggtagtg tgattaggaa
540atatgggaaa gatggatgtg gtgatgtgac tacaaggcaa aattggcaga ttcgtggggt
600tgtgttacct gatgtgcctg agattctaaa ggggcttgat gaagttggct tgactagtct
660gcagagtggc atggataatg ttcgaaatcc ggtggggaat cctctcgcag ggattgatct
720tcatgaaatt gtagacacaa ggccttacac taatttgctg tcccaatatg tcaccgccaa
780ttttcgtggc aatgtggatg tgactaactt gccaaggaag tggaatgtat gtgtaatagg
840gtcacatgat ctttatgagc atccgcatat caatgatctt gcgtatatgc ctgcaaccaa
900agatggacga tttggattca acctgcttgt gggtggattc ttcagtccga agcgatgtgc
960agaggcaatt cctcttgatg catgggttcc agctgatgat gtagtccctg tttgcaaagc
1020tatattagaa gcttatagag atcttggtac ccgagggaac aggcagaaaa caagaatgat
1080gtggttaatt gacgaactgg gtgttgaagg attcagggca gaagttgtga agagaatgcc
1140ccaaaagaag ctagatagag aatcttcaga ggatttggtc ctgaaacaat gggaaaggag
1200agagtacctt ggcgtgcatc cgcagaaaca ggaaggatac agctttgttg gtcttcacat
1260tccggttggt cgtgtccaag cagatgacat ggacgagcta gctcgtttgg ctgatgagta
1320tggttcagga gaactccggt tgactgttga acagaacatc attattccca acatcgagaa
1380ctcaaagatc gatgcattac tcaatgagcc tctcctaaag aacagatttt cacctgatcc
1440acctattctc atgagaaatt tggtggcttg tactggtaac caattctgtg ggcaagcaat
1500aatcgagact aaagcacgtt caatgaagat aaccgaggag gttcaacgtc tagtctctgt
1560gacacagcca gtgaggatgc actggacagg ttgcccaaat acatgtggac aagttcaagt
1620tgccgatatc ggattcatgg gatgcctgac tagaaaggaa ggcaaaactg ttgaaggtgc
1680tgatgttttc ttgggtggca gaatagggag cgactcgcat ttaggagaag tttataagaa
1740gtctgtacca tgtgaggatt tggtaccaat aatcgtcgac ttactaatta acaactttgg
1800tgctgttcca agagaaagag aagaaacaga ggagtaatct aaaatcttca gaatgtactt
1860tttatgatat tgaaatattt ccaaggtaca gcattgtaag ttagtaaaat aatcacaaca
1920tgagatgttg ttaacatgtt catgtgtgac atagcatgat gcaaatactt gaacttgttt
1980caaaatataa tcacattgtg tattcttttg gaaatactca tccaaactaa aaggcttttg
2040aattgttgaa ttcctaataa tacatttttt aaaatgtaat ttgatattca tttgttttga
2100ttattatatt cttaaaataa tttacttatt ctctc
2135
SEQ ID NO: 53 (length 2110, DNA, Arabidopsis thaliana)
aagagctcat ctcttccctc tacaaaaatg
gccgcacgtc tccaaccttc tcccaactcc 60ttcttccgcc atcatcatga cttctttctc
tctcactttc acatctcctc tcctcccttc 120ctcctccacc aaacccaaaa gatccgtcct
tgtcgccgcc gctcagacca cagctccggc 180cgaatccacc gcctctgttg acgcagatcg
tctcgagcca agagttgagt tgaaagatgg 240tttttttatt ctcaaggaga agtttcgaaa
agggatcaat cctcaggaga aggttaagat 300cgagagagag cccatgaagt tgtttatgga
gaatggtatt gaagagcttg ctaagaaatc 360tatggaagag cttgatagtg aaaagtcttc
taaagatgat attgatgtta gactcaagtg 420gcttggtctc tttcaccgta gaaagcatca
gtatgggaag tttatgatga ggttgaagtt 480accaaatggt gtgactacaa gtgcacagac
tcggtattta gcgagtgtga ttaggaagta 540tggtgaagat gggtgtgctg atgtgactac
tagacagaat tggcagatcc gtggtgttgt 600gttgcctgat gtgcctgaga tcttgaaagg
tcttgcttct gttggtttaa cgagtcttca 660aagtggtatg gataacgtga ggaacccggt
tgggaatcct atagctggga ttgatccgga 720ggagattgtt gacacgaggc cttacacgaa
tctcctttcg cagtttatca ccgctaattc 780acaaggaaac cccgatttca ccaacttgcc
aagaaagtgg aatgtgtgtg tggtggggac 840tcatgatctc tatgagcatc cacatatcaa
tgatttggcc tacatgcctg ctaataaaga 900tggacggttt ggattcaatt tgcttgtggg
aggattcttt agtcccaaaa gatgtgaaga 960agcgattcct cttgatgctt gggtccctgc
tgatgacgtt cttccactct gcaaagctgt 1020tctagaggct tacagagatc ttggaactcg
aggaaaccga cagaagacaa gaatgatgtg 1080gcttatcgac gaacttggtg ttgaaggatt
tagaactgag gtagagaaga gaatgccaaa 1140tgggaaactc gagagaggat cttcagagga
tcttgtgaac aaacagtggg agaggagaga 1200ctatttcgga gtcaaccctc agaaacaaga
aggtcttagc ttcgtggggc ttcacgttcc 1260ggttggtagg ctacaagctg atgacatgga
tgagcttgct cggttagctg atacctacgg 1320gtcaggtgag ctaagactca cagtagagca
aaacatcatc atcccaaatg tagaaacctc 1380gaaaaccgaa gctttgcttc aagagccgtt
tctcaagaac cgtttctccc ctgaaccatc 1440tatcctaatg aaaggcttag ttgcttgtac
cggtagccag ttctgcggac aagcgataat 1500cgagactaag ctaagagctt taaaagtgac
agaagaagta gagagacttg tatctgtgcc 1560aagaccgata aggatgcatt ggacaggatg
tcccaacact tgcggacaag tccaagtagc 1620agatatcgga ttcatgggat gcttaacacg
aggcgaggaa ggaaagccag tcgagggtgc 1680tgacgtgtac gtcgggggac gaataggaag
tgactcgcat atcggagaga tctataagaa 1740aggtgttcgt gtcacggagt tggttccatt
ggtggctgag attctgatca aagaatttgg 1800tgctgtgcct agagaaagag aagagaatga
agattgattc aaaagctatt ggattcttaa 1860taagtcaaga gacctatgaa tggttctctc
tctggtttca gactttgata cttgatactt 1920gtatttgtat tgtgcccata attttgggtt
ttgtagctct ctcctttgtt gtaacctgta 1980actttgtcct tggttgtttt gtaatatctt
gttttttagt aatagtagta taatctgatt 2040ttttgtcata tattgtcttg atttctctgt
gatatttata agaaataaac atttgtttct 2100ttttacctcc
2110
SEQ ID NO: 54 (length 1788, DNA, Vitis vinifera)
atggcttcta
tctctgttcc tttcctctct caggcaccca cccacctttc aaactccact 60tctctccgtc
tcaaaaccag gatctctgcc accccgactc cgactccaac tccaaccacg 120gttgcaccgt
cgtccacggc ggcggtggac gcctccagga tggagcccag ggtggaggag 180agagggggtt
actgggtttt gaaggagaag ttcagggaag gtataaatcc acaggagaag 240gtgaagattg
agaaggatcc tatgaagctc ttcatagaag atgggttcaa tgagctggcc 300agcatgtctt
ttgaagaaat tgaaaagtct aagcatacta aggatgatat tgatgtgagg 360ctcaagtggc
ttggactgtt tcataggagg aagcatcaat atggtagatt tatgatgaga 420ttgaagctgc
caaatggggt gacatcaagt gcacaaactc gttacctggc cagtgcaata 480aggcaatacg
ggaaggaggg atgtgccgat gtgactacgc ggcaaaactg gcaaattcga 540ggtgtggtac
tgcctgatgt gcctgaaata ctaaagggtc tttcagaggt tggtttgacg 600agcctgcaga
gtggcatgga caatgtgagg aatcctgttg gaaatcctct tgcaggcatt 660gaccctcatg
agattgttga tacacgacct tacaccaact tgttatccca attcattact 720gccaatgctc
gtgggaatac agccttcact aacttgccga ggaagtggaa tgtgtgtgtt 780gtaggctccc
atgatctcta tgagcatccc cacatcaatg atctggcgta catgcctgcc 840acaaagaaag
gaagatttgg attcaatctg ctagtaggcg ggttctttag tcccaaacgt 900tgtgctgatg
ctattcctct cgatgcctgg atccctgccg acgatgtcct cccagtttgt 960caagcagtac
tagaggctta cagggatctt ggtaccagag gaaaccgcca aaagacaaga 1020atgatgtggt
taattgatga gctgggcata gagcagttcc gggcagaggt ggtgaaaaga 1080atgccccaac
aagagctgga aagatcatct tctgaagacc tggttcagaa gcaatgggag 1140aggagagatt
accttggtgt ccatccccag aaacaggaag gctttagctt tgtgggtatt 1200cacattccag
tgggtcgagt ccaggcagat gacatggacg agctagctcg attggcagac 1260gaatatggct
caggcgagct ccggctcact gtagagcaga acatcataat tcccaatgtg 1320gagaactcaa
gacttgaagc cttgctcaaa gagcctctct tgagagacag attctctccg 1380gagcctccta
ttctcatgaa aggcttggtg gcctgcaccg gcaatcagtt ttgtggacag 1440gccattatcg
agaccaaggc cagagcattg aaggtgacgg aggatgtggg gcggctggtt 1500tcagtgaccc
agccagtgag gatgcactgg accggctgcc caaactcctg cggccaggtg 1560caagtggcgg
atatcggatt catggggtgc atgacaaggg acgagaatgg gaacgtttgt 1620gaaggggcag
atgtattctt aggaggtaga attgggagcg actgtcattt gggagaggtt 1680tataagaagc
gtgttccttg caaagactta gtgcccttgg ttgctgaaat tttggtaaat 1740cactttggag
gagtccccag ggagagggaa gaagaagctg aagactga
1788
SEQ ID NO: 55 (length 2433, DNA, Volvox sp.)
atgcagtcgc agtcgctgtc ccgccgcacc tgcacccgta
ctcttggccg cggcctcgtc 60acccctgtcc tggcaaccgc ggcaccggct tcagcagcgc
aagcggccga tggcatcaac 120gcgcatagcg ggctgaagca cctgccagag gctgctcgcg
ttcgcgctct cgaccgcaag 180gccaataagt ttgagaaggt caaggttgag aagtgcggat
cacgcgcatg gacagatgtc 240ttcgagctgt cacggctgct gaaagaggga aacaccaagt
gggaggattt ggatttggac 300gacatagaca tccgcatgaa gtgggcgggc ctgttccatc
gcggaaagcg cacgcccggc 360aagttcatga tgcgcctcaa ggttcccaac ggcgagctgg
atgcccgcca gctgcgcttc 420ctcgcctcgg caatcgcgcc atacggcgcc gacggctgcg
ccgacatcac cacgcgcgcc 480aacatccagc tccgaggcgt gacgctggcg gacgccgacg
ccatcattcg cggtctttgg 540gacgttggcc tcacgtcctt ccagagcggt atggacagcg
tacggaactt gacgggcaac 600cccatcgcgg gtgtggaccc ccatgagctc atagataccc
gtccgctgct gcgggaaatg 660gaggccatgc tgttcaacaa cggcaagggc cgcgaggagt
ttgcgaacct gcctcgcaag 720ctcaacatct gcatttcctc aacccgcgac gacttcccgc
acacgcacat caacgacgtg 780ggcttcgaag cggtgcgccg ccccgatgat ggcgaggtgg
tgttcaatgt ggtcgttggc 840ggcttcttct ccatcaagcg caacgttatg tccatccctc
ttggctgctc tgtcactcaa 900gaccagctga tgcccttcac ggaggctctg ctgcgggtgt
tccgcgacca cgggccccgc 960ggggaccgcc agcagactcg cctgatgtgg atggtagatg
cgattggcgt ggagaagttc 1020cgccagctgc tttcggagta catgggcggc gcggagctgg
cgccgccggt gcacgtgcat 1080cacgaggggc cctgggagcg ccgtgacgtg ctgggtgtgc
acccccagaa gcagccgggg 1140ctgaattggg tgggcgcctg tgttccggct ggcaggctgc
aggctgccga ctttgacgag 1200ttcgcccgca tcgcggagac gtacggcgac ggcaccgtac
ggatcacgtg cgaggagaac 1260gtgatcttta ccaacgtccc cgacgccaag ctgccggaca
tgcttgctga gcccctgttc 1320cagcgcttca aagtcaatcc ggggctgctg ctccgggggc
ttgtgtcctg cacgggcaac 1380cagttttgcg gcttcggtct ggcggagaca aaggcgcggg
cggtcaaggt agttgagatg 1440ctggaggagc agttggagct cacccggcct gtcaggatcc
acttcaccgg atgccccaac 1500agttgcggcc aagcgcagca ggttggcgac attgggttaa
tgggagcccc cgccaagctg 1560gatggcaagg cggtggaggg ctacaagatc tttttgggcg
ggaagattgg ggagaacccg 1620cagctggcca cggaattcgc tcaagggatc ccggctgtgg
agtctcatct ggtgcccaaa 1680ctcaaggaga tccttattaa ggagtttggt gccaaggaaa
aggagactgc cgttgtcgtc 1740taaataggcg tcgttgcgta attaggtgct tataacggag
aagggggaat gatagcttgg 1800tgtaagtgtt acataggatt ggggagggag tggtaggcac
gggtttgatg cgtgatatac 1860tacatgtgac ctgatgtcgt attttgcata caagtatctt
gtccggcgct tctcatgcgt 1920gtgcgtgtct gtttgttctg tttcggctag cagggcggcc
aagtcgttta tgttcgggga 1980ttcctactac gggcgcaatt gcaatgataa aagaaggatg
cgtgtcttgt ctggggcctg 2040tgaatcactc cttccgatat gccgcgacgt ttgctgtgcg
cgcggcgtgc aggtcagggt 2100ttgtcgatag gtagcgtttg cacgtcgcgt ccgtgagtat
ctatatcaga gcagcttgcg 2160catgtatgtg ttaaccaagt tttttttatt ggcgtgggaa
ctgtgctccc gggcgaatta 2220tgctcgccag cgctgccggt ggtctgtgat tgattaggca
ttggtcatct gtatccattc 2280gacttatcag acttatcatg tctcgcgatc ggatgttgtg
ctgccttgtt ccattctttt 2340gcacatccgt tgtgtcgatg gcgtgggaag atgccgaggc
tacgatgaag agtgtagata 2400gagggtcgcg ttcgtggtga tggtgccgca cag
2433
SEQ ID NO: 56 (length 2062, DNA, Spinacia oleracea)
catcatcttc
atcttcatct tcatcattca tagttgcaag aaacagagca accaaaaaaa 60atggcatcac
ttccagtcaa caagatcata ccatcatcaa cgacattact gtcatcgtcg 120aacaacaaca
gaagaagaaa taactcatca attcgatgcc agaaggcggt ttcacccgcg 180gcagaaacgg
ctgcagtgtc gccgtctgtg gacgcggcga ggctggagcc gagagtggag 240gagagagatg
ggttttgggt attgaaggag gaatttagga gtgggattaa cccagctgag 300aaagttaaga
ttgagaaaga cccaatgaag ttgtttattg aggatgggat tagtgatctt 360gctactttgt
caatggagga agttgataaa tctaagcata ataaggatga tattgatgtt 420agactcaagt
ggcttggact tttccatcgc cgtaaacatc actatgggag attcatgatg 480aggttgaagc
tgccgaatgg ggtaacaacg agtgagcaga cacggtacct agcaagcgtg 540atcaagaagt
acggaaaaga tggatgtgcg gatgtaacaa caaggcaaaa ctggcaaatt 600agaggagttg
ttctgcctga tgtgccagag atcatcaaag ggctggaatc cgttggtctt 660accagcttac
agagtgggat ggacaatgta aggaaccctg taggtaaccc tcttgcaggg 720attgaccctc
atgaaattgt tgacacccga ccttttacca acctaatttc ccaatttgtc 780actgccaatt
cgcgtggaaa cctttctatt accaatctgc caaggaagtg gaatccatgt 840gttattgggt
cccatgatct ttatgagcat ccacacatca atgaccttgc ttacatgcct 900gctacaaaga
atgggaaatt cgggtttaat ttgttggttg gaggattctt tagcatcaaa 960agatgtgaag
aggcaatccc actagacgct tgggtctcag cagaagatgt ggttcctgta 1020tgcaaagcta
tgcttgaagc tttcagggac cttggcttta gaggaaacag gcagaagtgc 1080agaatgatgt
ggcttattga tgagcttggt atggaagcat tcaggggaga ggttgagaag 1140agaatgcctg
agcaagttct agaaagagca tcctcagaag agctggttca gaaggactgg 1200gagagaagag
aatacttagg agttcaccct cagaaacaac aaggacttag ctttgtgggt 1260ctccacattc
ctgtgggccg tctgcaagct gatgagatgg aagagttagc ccgtatagct 1320gatgtgtatg
gatcagggga gctccgtctg acagtagagc agaacataat catcccaaat 1380gttgaaaact
caaagataga ttcactacta aacgagcctc tgttaaaaga gcgttactcc 1440cctgaaccac
ccatcttgat gaaggggctt gtggcctgta cggggagcca attttgtgga 1500caagccatta
tcgagaccaa ggctagggca ctcaaggtga cagaagaggt acaacgacta 1560gtgtctgtaa
cacggcctgt taggatgcat tggaccgggt gtcctaatag ttgtggtcaa 1620gtacaagtgg
ctgatattgg gttcatgggt tgcatgacta gggatgagaa cggtaagcct 1680tgtgaaggag
ctgatgtgtt tgtaggagga cgtataggaa gtgactcgca tctaggagac 1740atttacaaga
aggcagtccc atgtaaagat ttggtgcctg ttgttgctga gatattgatc 1800aaccaattcg
gtgctgttcc tagggagagg gaagaggcag agtagtagct agactgtttt 1860gggtgcctgt
tcttgttaac tgttatcggt attcggtaat tacttgtaat atttgcattt 1920tttttcaagc
atataattaa attgcataaa gatcccttgt atgtctgcat aacaagatac 1980tcagttatgt
aatgtcaata gcaggtttac tttgtttatt caataggcac tgtgaaaggg 2040aaagttcatt
attcatttct ca
2062
SEQ ID NO: 57 (length 1611, DNA, Nostoc sp.)
atgacagata cagtaactac ccccaaagcc agcctcaata
agtttgagaa attcaaagcc 60gaaaaagatg gacttgccat caagtcagag atcgaaaaaa
ttgcctcttt gggatgggaa 120gcaatggacg caacagaccg agatcatcgc ctcaaatggg
tgggtgtatt ctttcgccca 180gtcacccctg gtaaatttat gatgcggatg cggatgccga
atggtatcct caccagcgat 240cagatgcgtg ttttagccga agtggtgcag cgttacggag
atgacggcaa cgctgatatt 300acaactaggc agaatattca actacgaggt atcagaatag
aagacttacc gcacatattc 360aataaatttc atgcagtagg tttaaccagt gtgcagtcag
ggatggacaa catccgtaac 420atcacaggcg acccgatagc ggggttagat gcggatgagt
tgtatgacac ccgtgagtta 480gtgcagcaaa ttcaggatat gctcaccaac aaaggagaag
gcaatcgaga gtttagtaat 540ttgcctcgta aatttaatat tgcgatcgcc ggtggacggg
ataattcagt tcatgcggaa 600atcaacgatt tagcctttgt tccagcattt aaagaaggga
ttggagattg ggtattgggg 660aatggggaag aatcatctac ttaccaaaaa gtctttggat
ttaacgtgtt agttggtggt 720ttcttttctg ctaaacgctg tgaggcggcg attcctttga
atgcttgggt aactccggaa 780gaagtcttac ccttatgtag agcaatttta gaggtctatc
gtgacaatgg actcagggct 840aatcggctca agtctcgctt gatgtggcta attgatgaat
ggggtataga taagtttcgg 900gcagaagtcg aacagcgttt gggtaaatcc ttactccccg
cagcccccaa agacgaaatt 960gattgggaaa aacgcgacca tatcggagtc tataagcaaa
agcaagaggg attgaactat 1020gtagggttac acatccctgt aggtagattg tatgccgagg
atatgtttga attggctcgg 1080atagccgatg tatacggtag cggtgaaatc cgcatgactg
ttgaacaaaa catcatcatt 1140cccaacatta ccgactcgcg gttaaggact ttgttgacag
atcccttact agagagattt 1200tctcttgatc ctggagcatt gacgcgatcg ctagtttcct
gcacgggcgc acaattttgc 1260aacttcgccc tcatcgaaac caaaaaccgc gccctagaaa
tgattaaagg cttagaagca 1320gaattgacct ttactcgtcc agtgcgaatc cattggacag
gttgccccaa ctcctgcgga 1380cagccccaag ttgcagacat tggcttaatg ggaacaaaag
ctcgtaaaaa cggtaaagcc 1440gtggaaggtg ttgacatcta tatgggtggc aaagtcggca
aagatgcaca tttaggtagc 1500tgtgtacaaa aaggcatccc ctgcgaagac ttgcacctag
tattacgaga cttactcatt 1560actaattttg gagccaaacc cagacaggaa gccttagtta
ccagccaata a 1611
SEQ ID NO: 58 (length 2700, DNA, Plectonema boryanum)
aacactgccg
gaactcgact catgacccat ccaacgcttg cccacgatag aaatgttctc 60cgacgcatga
ggttctccta aagaacgata gaggaatagt gagtagggag tggggagtag 120ggtaaatcct
ttctatctcc cactcctccc ccgctcccca ccaaattaca actatttcta 180aagtacgccc
ttccccctct tcccgccgac agatgacgaa aacgaatcgg ctttatgcag 240aaacgtcata
ttatgaaaag ttttgtaaca acagatacga atgtcctctg tgatcccgat 300tacctttact
cagtaatcac cgcgaatcat caaacggttc cgcagttgat atcgatttgt 360gttcgctctg
gaacacctta tattcatagg ctcaatccat gacagacacc cttgcagcac 420cgaccctcaa
taagtttgaa aaactcaaag cagagaaaga tggtcttgcg gtgaaagcag 480aactcgagca
ctttgctcgg ctcggctggg aagcaatgga tgaaaccgat cgtgatcatc 540gcttgaagtg
gctcggtgtg ttctttcgcc ccgtaactcc tggcaaattt atgctgagaa 600tgcgggttcc
gaatggcatt atcacgagcg gacaaacccg ggtgctagga gaaatccttc 660agcgctatgg
agatgatggc aatgcagaca tcacgactcg ccagaacttt caactgcgag 720gaattcggat
tgaagacctt cccgaaattt ttcgtaagtt tgaccaagct ggattgacga 780gcattcaatc
cgggatggat aacgttcgta acattaccgg atcgcctgtt gctggcattg 840atgcagatga
gctaattgat actcgtgggc tagttcgcaa agttcaagac atgatcacga 900acaatggtcg
tggtaattcg agctttagta acttgcctcg gaaattcaat attgcgatcg 960cagggtgccg
cgataactca gttcatgctg aaatcaatga cattgctttc gttcccgctt 1020tcaaagatgg
cacattagga ttcaatatcc tagttggcgg attcttctct gggaaacgct 1080gcgaagctgc
aattccactc aatgcttggg ttgacccgcg cgatgtcgtt gcggtctgcg 1140aagcaatttt
aacggtctat cggaacttgg gactgagagc aaatcgtcaa aaagctcgct 1200taatgtggct
gattgatgag atgggattgg aaccgttccg cgaagcggtt gaaaaacaat 1260tgggatatgc
ttttacgcct gctgctgcca aagacgagat cctttgggac aagcgagatc 1320acattgggat
tcatgcccaa aaacagcctg gattaaacta tgtgggcttg catgttccag 1380tgggacggtt
atacgcgcaa gatttgtttg atttagctcg gatcgctgaa gtttacggca 1440gtggtgaaat
tcgcttaact gtcgagcaga atgtgatcat tccgaatgtt ccggattcac 1500gagtttctgc
attgctcaga gaacccattg tcaaacggtt ctcgatcgag cctcagaatc 1560tttcacgggc
attagtgtct tgtactggcg cacagttttg taacttcgca ctgattgaaa 1620ctaaaaatcg
tgcggttgct ttaatgcaag agctagaaca agacctgtac tgtcctcgtc 1680cagtgcgcat
tcattggaca ggttgcccga actcttgtgg acaacctcaa gttgcagata 1740tcggactgat
gggcacaaaa gtccgcaaag atggcaaaac agtcgaaggc gtggatctct 1800atatgggggg
caaagttggc aaacatgctg aacttggaac ctgtgtgaga aaaagcattc 1860cctgtgaaga
tctcaaaccg attctgcaag agattttgat cgagcaattt ggggcgcgtc 1920tctggtcaga
cctgcccgaa tccgctcgtc caaatccgac cgccttgatc acgctcgatc 1980gtcccacggt
ggaaacaccg aacgggaaat caacaaccgt gcaagagctt aatgcacaag 2040agtttgacta
tgtgctgagt gcgccacctg ttgtaaaagc gccaacagaa atcgcagctc 2100cagcaacgat
tcgttttgct cagtcaggaa aagaaatcac ctgcacccag gatgatttga 2160ttctagacat
tgcagaccaa gccgaagtcg cgatcgaaag ttcttgccga tcaggaacgt 2220gtggaagttg
taaatgcacc ttactcgaag gtgaagtcag ctatgacagc gaacccgatg 2280tgctcgatga
gcacgatcgc gcttcgggtc agattctcac ctgtattgct cgtcctgtcg 2340gtcgtatctt
gctcgatgct tgatccctaa gttttgttgc tccgctcatt gttctcacat 2400gcgccagctt
tttgctgtgc ttccttttcc ttcagtacat tctctaaaaa ggacgatcca 2460tgtcttctaa
tctttcaaga cgtaagttca ttttgaccgc aggcgcaacc gcagcaggcg 2520cagtgattgt
gaatggttgt agcacaggtc taaataaaag tgcttctagc ggtgcgtcct 2580ctcctgctgc
ctctcctgct gcaaatatca gtgcggcaga tgcaccagaa gtcacaacgg 2640ctaaattagg
ctttatcgcc ctgaccgatt cggctccatt gatcattgcg ttagagaaag
2700
SEQ ID NO: 59; length: 1611; type: DNA; organism: Anabaena variabilis
atgacagata cagcaactac ccccaaagcc
agtctcaata agtttgagaa attcaaagcc 60gaaaaagatg gccttgccat caagtcagag
attgaaaaaa ttgcctcttt gggatgggaa 120gcaatggacg aaacagaccg agaccatcgc
ctcaaatggg tgggtgtatt ctttcgtcca 180gtcacccctg gcaaattcat gatgcggatg
cggatgccta atggtattct caccagcgat 240caaatgcgtg ttttagctga agtggtgcag
cgttacggag atgatggcaa cgctgatatt 300acaactaggc agaatatcca actacgggga
atcagaatag aagacttacc gcacatattc 360aataaatttc atgcagtagg tttaactagt
gtgcagtcgg ggatggacaa tatccgcaat 420attacaggcg accccatagc agggttggat
gcagatgaat tgtatgatac ccgtgagtta 480gtgcagcaaa tccaagatat gctcaccaac
aagggagaag gtaatcgaga gtttagtaat 540ttaccacgga aatttaatat tgcgatcgct
ggtggacggg ataattcagt tcatgcagaa 600atcaacgatt tagcttttgt tcccgcattc
aaagaaggga ttggggattg ggtattggga 660ggtggtgaag aatcttctac tcaccaaaaa
gtctttggat ttaacgtgtt agttggtggc 720ttcttttctg ccaaacgttg tgaagcggca
attcctttaa atgcttgggt aacagctgaa 780gaagtcgtag ccttatgtag agcagttctg
gaagtctatc gtgacaacgg acttagagct 840aatcggctta agtctcgctt gatgtggcta
attgatgaat ggggtataga taagttccgt 900gcagaagtcg aacagcgttt gggtaaatcc
ttactatacg ctgcacccaa agacgaaatt 960gattgggaaa aacgcgacca tatcggagtc
tataaacaaa agcaagaggg attgaactat 1020gtaggcttac acatacccgt aggtagattg
tatgccgaag atatgtttga actagctcgg 1080atagccgatg tttacggtag cggtgaaatc
cgtatgactg ttgaacaaaa catcatcatt 1140cccaacatta ccgactcgcg gttaaagact
ttgttgacag atcctttact agagagattt 1200tctcttgatc cgggagcatt gacgcgatcg
ctagtttcct gcacaggcgc acaattttgc 1260aacttcgccc tcatcgaaac caaaaaccgc
gccctagaaa tgattaaagg cttagaagca 1320gagttaacat tcacccgtcc agtgcgaatc
cattggacag gttgccccaa ctcctgcgga 1380caaccccaag ttgcagacat cggtttaatg
ggaacaaaag cccgtaagaa cggtaaagcc 1440gtcgaaggtg ttgacatcta tatggggggc
aaagtcggca aagacgcaca tttaggtagt 1500tgtgtacaaa aaggcatccc ctgcgaagac
ttgcacctag tattacgaga cttgctgatt 1560actaattttg gagccaaacc caggcaggaa
gccttagtta gtagccagta g 1611
SEQ ID NO: 60; length: 1548; type: DNA; organism: Synechococcus sp.
atggcgaacc aatttgaacg cctcaaaagc gaaaaggatg ggctggcggt caaggccgag
60ctggaggcgt ttgcccggat gggttgggag aacattcctg aagacgaccg ggatcaccgc
120ctcaagtggc tggggatctt ctttcgcaag cgcaccccag gtcagttcat gctgcggctg
180cgcctgccca atgggatcct aaccagcggc caaatgcgga tgttgggcgc aatcatccac
240ccctatggag aacagggcgt agccgacatc accacccggc agaacctgca actgcgcggc
300atccccattg aagaaatgcc ccagatcctg ggctacctga aagaggtagg cctgaccagc
360atccagtcgg gcatggacaa cgtgcgcaac atcacgggat cccctctggc cggtattgac
420ccggatgagc tgatcgatgt gcgcggtctc acccgcaagg tgcaggacat ggttaccaac
480aacggcgagg gcaacccttc cttcagcaac ctgccgcgca agttcaacat cgccatctgc
540ggttgtcgcg acaactccgt gcatgcggag atcaacgacc tggcctttgt gcccgccttc
600aaaaatggcc gcctgggctt caacgtcctg gtgggcggct ttttctcggc tcgccgctgc
660gccgaggcaa ttggcctaga tgtctgggtg gatccccgcg atgtagttcc cctgtgcgag
720gcggtgctgc tggtctaccg ggatcacggc ctgcgggcca accggcaaaa ggcgcggttg
780atgtggctca ttgacgagtg gggcctagag aagttccggg cggctgtgga gcgccagata
840ggccaccctc tgcccagggc agcggaaaaa gacgaggtgg tctggcacaa gcgggatctg
900ctgggggtgc atgcccagaa gcagccgggc ctcaactttg tcggcctgca tgtgccggtg
960gggcggctca acgccctgga gatgatggag ctggcccgct tggcggaggt gtacggctcc
1020ggggagctgc ggctgacagt ggagcagaac gtgctcatcc ccaatgtgcc cgactcccga
1080gtggccccgc tcctcaaaga gccgctcttg aagaagttct cccccaaccc agggcccttg
1140cagcgggggt tggtgtcctg cacgggcaac cagttctgca actttgccct tatcgagacc
1200aaaaaccggg ctgtggcctt gatggaggag ctggaggcgg agctggagat cccccaaacg
1260gtgcgcatcc actggacggg ctgccccaac tcctgcggcc aaccccaagt agccgatatc
1320ggccttatgg gcaccactgc tcgcaaggac ggcagggtgg tggaggccgt ggacatctac
1380atggggggag aggtgggcaa agacgccaag ctgggcgaat gcgtgcgcaa agggatccct
1440tgcgaagacc tcaagccggt cttggtggag ctgctcattg aacactttgg ggccaagccg
1500cgtcagcatc cgtccgccgc ccaggcttct gttttggtaa cccgctag
1548
SEQ ID NO: 61; length: 2203; type: DNA; organism: Arabidopsis thaliana
tctcacccac ccaaagccac tcactctctc
ttctctctct ctgaagcgat gtcatcgacg 60tttcgagctc cggcgggagc cgctactgtg
tttacggcgg atcagaagat cagacttggg 120aggctcgacg ctctgagatc ctctcattct
gttttcttag gaagatatgg acgcggcggc 180gtcccggttc ctccttccgc ttcttcgtcg
agttcttcgc ctattcaagc cgtctccact 240cctgcgaagc ctgagactgc gaccaagcgg
agcaaagtcg aaattatcaa ggagaagagt 300aatttcataa ggtatccttt gaacgaggag
cttttaacag aggctccaaa tgtcaacgag 360tcagccgtgc agcttatcaa gttccacggt
agctaccaac agtacaacag agaagaacgt 420ggtggaagat cttactcctt catgcttcga
actaagaatc catctgggaa ggtccctaac 480cagctctatt tgactatgga tgacttagct
gatgagtttg gaattggtac tcttcgtttg 540accacaaggc agacgtttca gcttcatggt
gttctgaagc agaatcttaa gactgtgatg 600agctcgatta ttaaaaatat gggtagcacg
cttggtgcat gtggtgatct gaacagaaat 660gttcttgctc ctgctgcacc ttatgtgaag
aaagactatc tctttgcaca agaaactgct 720gacaacattg cggctcttct ttctcctcaa
tcagggttct attatgatat gtgggttgat 780ggagagcagt tcatgactgc tgaacctcca
gaggtagtga aggctcgaaa tgataactcc 840catggaacta actttgtcga ctctcctgag
cccatctatg gcacccagtt cttgcctaga 900aagttcaagg tcgctgtaac tgttcctaca
gataattccg tcgacctcct caccaatgac 960attggcgttg ttgttgtttc agatgaaaat
ggggaaccac agggtttcaa tatttatgtt 1020ggtgggggta tgggaagaac acacagaatg
gagtctactt ttgcccgcct ggcagaacca 1080ataggttatg ttccaaagga agatattttg
tatgctgtga aggccattgt agtcacacag 1140cgagaacacg ggagacgaga tgatcgtaaa
tatagcagaa tgaaatattt gatcagctcc 1200tggggaattg agaagttcag agatgttgtt
gagcaatatt atggtaaaaa gtttgagcct 1260tcccgtgaac ttccagagtg ggagttcaag
agttacttgg gatggcatga acagggagat 1320ggtgcatggt tttgtgggct tcacgtagac
agtggtcgtg ttggaggtat aatgaagaag 1380acgctgagag aagtaataga gaaatacaaa
attgatgtcc gcatcacacc aaaccaaaac 1440attgtcttgt gtgatataaa gactgaatgg
aagcgtccca tcaccacagt acttgctcag 1500gccggcttac tgcaacctga gtttgtcgac
ccattaaacc aaactgcaat ggcttgccca 1560gcttttcctt tgtgccctct ggcaataact
gaggcagagc gcgggatccc cagcattcta 1620aagagagtta gggcaatgtt tgaaaaggtt
ggtctggact acgacgagtc tgttgtgata 1680agagtaaccg gttgtccaaa cggctgtgca
agaccgtaca tggctgagct cggtctagtc 1740ggggatggtc ccaacagcta tcaggtttgg
ctaggaggaa caccgaacct gacccagata 1800gcgagaagtt tcatggataa ggttaaggtt
cacgacttag agaaagtctg cgagccattg 1860ttctatcact ggaaactaga gaggcaaact
aaagaatcat ttggagaata cacaacccgc 1920atgggattcg agaaactgaa ggagctgata
gatacataca aaggagtttc tcaatgagca 1980caacagagat catctttcgt tttataattc
atgtaatgta atgtctctgt ctgaactgtt 2040actcttcggt aactctgatg gagaacttgt
tctcgttttg gtttgatttt gtaccctctt 2100tttttttttt gtttttttgg attgctttgt
ctttgattgg ataatgaagc attactgtat 2160caaggctaat tagcccatca ataagccttt
ttaaagctct gga 2203
SEQ ID NO: 62; length: 2195; type: DNA; organism: Oryza sativa
ttttttataa
tgccaacttt gtacaaaaaa gcaggcttaa acaatgtgtg gcatcctcgc 60cgtgctcggc
gtcgcagacg tctccctcgc caagcgctcc cgcatcatcg agctatcccg 120ccggttacgt
catagaggcc ctgattggag tggtatacac tgctatcagg attgctatct 180tgcacaccag
cggttggcta ttgttgatcc cacatccgga gaccagccgt tgtacaatga 240ggacaaatct
gttgttgtga cggtgaatgg agagatctat aaccatgaag aattgaaagc 300taacctgaaa
tctcataaat tccaaactgc tagcgattgt gaagttattg ctcatctgta 360tgaggaatat
ggggaggaat ttgtggatat gttggatggg atgttcgctt ttgttcttct 420tgacacacgt
gataaaagct tcattgcagc ccgtgatgct attggcattt gtcctttata 480catgggctgg
ggtcttgatg gttcggtttg gttttcgtca gagatgaagg cattaggtga 540tgattgcgag
cgattcatat ccttcccccc tgggcacttg tactccagca aaacaggtgg 600cctaaggaga
tggtacaacc caccatggtt ttctgaaagc attccctcca ccccgtacaa 660tcctcttctt
ctccgacaga gctttgagaa ggctattatt aagaggctaa tgacagatgt 720gccatttggt
gttctcttgt ctggtggact ggactcttct ttggttgcat ctgttgtttc 780gcggcacttg
gcagaggcaa aagttgccgc acagtgggga aacaaactgc atacattttg 840cattggtttg
aaaggttctc ctgatcttag agctgctaag gaagttgcag actaccttgg 900tactgttcat
cacgaactcc acttcacagt gcaggaaggc attgatgcac tggaggaagt 960catttaccat
gttgagacat atgatgtaac gacaattaga gcaagcaccc caatgttctt 1020gatgtcacgt
aaaattaaat ctttgggggt gaagatggtt ctttcgggag aaggttctga 1080tgagatattt
ggcggttacc tttattttca caaggcacca aacaagaagg aattccatga 1140ggaaacatgt
cggaagataa aagcccttca tttatatgat tgcttgggag cgaacaaatc 1200aacttctgca
tggggtgttg aggcccgtgt tccgttcctt gacaaaaact tcatcaatgt 1260agctatggac
attgatcctg aatggaaaat gataaaacgt gatcttggcc gtattgagaa 1320atgggttctc
cggaatgcat ttgatgatga ggagaagccc tatttaccta agcacattct 1380atacaggcaa
aaggagcaat tcagtgatgg tgttgggtac agttggattg atggattgaa 1440ggatcatgca
aatgaacatg tatcagattc catgatgatg aacgctagct ttgtttaccc 1500agaaaacact
ccagttacaa aagaagcgta ctattatagg acaatattcg agaaattctt 1560tcccaagaat
gctgctaggt tgacagtacc tggaggtcct agcgtcgcgt gcagcactgc 1620taaagctgtt
gaatgggacg cagcctggtc caaaaacctt gatccatctg gtcgtgctgc 1680tcttggtgtt
catgatgctg catatgaaga tactctacaa aaatctcctg cctctgccaa 1740tcctgtcttg
gataacggct ttggtccagc ccttggggaa agcatggtca aaaccgttgc 1800ttcagccact
gccgtttaac tttctatcgt cgcacccagc tttcttgtac aaagttggca 1860ttataagaaa
gcattgctta tcaatttgtt gcaacgaaca ggtcactatc agtcaaaata 1920atatcattat
ttgccatcca gctgcagctc tggcccgtgt ctcaaaatct ctgatgttac 1980attgcacaag
ataaaaatat atcatcatga acaataaaac tgtctgctta cataaacagt 2040aatacaaggg
gtgttatgag ccatattcaa cgggaaacgt cgaggccgcg attaaattcc 2100aacatggatg
ctgatttata tgggtataaa tgggctcgcg ataatgtcgg gcaatcaggt 2160gcgacaatct
atcgcttgta tgggaagccc gatga
2195
SEQ ID NO: 63; length: 591; type: PRT; organism: Oryza sativa
Met Cys Gly Ile Leu Ala Val Leu Gly Val Ala
Asp Val Ser Leu Ala 1 5 10
15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly
20 25 30 Pro Asp
Trp Ser Gly Ile His Cys Tyr Gln Asp Cys Tyr Leu Ala His 35
40 45 Gln Arg Leu Ala Ile Val Asp
Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55
60 Asn Glu Asp Lys Ser Val Val Val Thr Val Asn Gly
Glu Ile Tyr Asn 65 70 75
80 His Glu Glu Leu Lys Ala Asn Leu Lys Ser His Lys Phe Gln Thr Ala
85 90 95 Ser Asp Cys
Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Glu 100
105 110 Phe Val Asp Met Leu Asp Gly Met
Phe Ala Phe Val Leu Leu Asp Thr 115 120
125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly
Ile Cys Pro 130 135 140
Leu Tyr Met Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ser Ser Glu 145
150 155 160 Met Lys Ala Leu
Gly Asp Asp Cys Glu Arg Phe Ile Ser Phe Pro Pro 165
170 175 Gly His Leu Tyr Ser Ser Lys Thr Gly
Gly Leu Arg Arg Trp Tyr Asn 180 185
190 Pro Pro Trp Phe Ser Glu Ser Ile Pro Ser Thr Pro Tyr Asn
Pro Leu 195 200 205
Leu Leu Arg Gln Ser Phe Glu Lys Ala Ile Ile Lys Arg Leu Met Thr 210
215 220 Asp Val Pro Phe Gly
Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230
235 240 Val Ala Ser Val Val Ser Arg His Leu Ala
Glu Ala Lys Val Ala Ala 245 250
255 Gln Trp Gly Asn Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly
Ser 260 265 270 Pro
Asp Leu Arg Ala Ala Lys Glu Val Ala Asp Tyr Leu Gly Thr Val 275
280 285 His His Glu Leu His Phe
Thr Val Gln Glu Gly Ile Asp Ala Leu Glu 290 295
300 Glu Val Ile Tyr His Val Glu Thr Tyr Asp Val
Thr Thr Ile Arg Ala 305 310 315
320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val
325 330 335 Lys Met
Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340
345 350 Leu Tyr Phe His Lys Ala Pro
Asn Lys Lys Glu Phe His Glu Glu Thr 355 360
365 Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys
Leu Gly Ala Asn 370 375 380
Lys Ser Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385
390 395 400 Lys Asn Phe
Ile Asn Val Ala Met Asp Ile Asp Pro Glu Trp Lys Met 405
410 415 Ile Lys Arg Asp Leu Gly Arg Ile
Glu Lys Trp Val Leu Arg Asn Ala 420 425
430 Phe Asp Asp Glu Glu Lys Pro Tyr Leu Pro Lys His Ile
Leu Tyr Arg 435 440 445
Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450
455 460 Leu Lys Asp His
Ala Asn Glu His Val Ser Asp Ser Met Met Met Asn 465 470
475 480 Ala Ser Phe Val Tyr Pro Glu Asn Thr
Pro Val Thr Lys Glu Ala Tyr 485 490
495 Tyr Tyr Arg Thr Ile Phe Glu Lys Phe Phe Pro Lys Asn Ala
Ala Arg 500 505 510
Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala
515 520 525 Val Glu Trp Asp
Ala Ala Trp Ser Lys Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly Val His Asp Ala
Ala Tyr Glu Asp Thr Leu Gln Lys 545 550
555 560 Ser Pro Ala Ser Ala Asn Pro Val Leu Asp Asn Gly
Phe Gly Pro Ala 565 570
575 Leu Gly Glu Ser Met Val Lys Thr Val Ala Ser Ala Thr Ala Val
580 585 590
SEQ ID NO: 64; length: 3246; type: DNA; organism: Oryza sativa
aatccgaaaa gtttctgcac cgttttcacc ccctaactaa caatataggg
aacgtgtgct 60aaatataaaa tgagacctta tatatgtagc gctgataact agaactatgc
aagaaaaact 120catccaccta ctttagtggc aatcgggcta aataaaaaag agtcgctaca
ctagtttcgt 180tttccttagt aattaagtgg gaaaatgaaa tcattattgc ttagaatata
cgttcacatc 240tctgtcatga agttaaatta ttcgaggtag ccataattgt catcaaactc
ttcttgaata 300aaaaaatctt tctagctgaa ctcaatgggt aaagagagag atttttttta
aaaaaataga 360atgaagatat tctgaacgta ttggcaaaga tttaaacata taattatata
attttatagt 420ttgtgcattc gtcatatcgc acatcattaa ggacatgtct tactccatcc
caatttttat 480ttagtaatta aagacaattg acttattttt attatttatc ttttttcgat
tagatgcaag 540gtacttacgc acacactttg tgctcatgtg catgtgtgag tgcacctcct
caatacacgt 600tcaactagca acacatctct aatatcactc gcctatttaa tacatttagg
tagcaatatc 660tgaattcaag cactccacca tcaccagacc acttttaata atatctaaaa
tacaaaaaat 720aattttacag aatagcatga aaagtatgaa acgaactatt taggtttttc
acatacaaaa 780aaaaaaagaa ttttgctcgt gcgcgagcgc caatctccca tattgggcac
acaggcaaca 840acagagtggc tgcccacaga acaacccaca aaaaacgatg atctaacgga
ggacagcaag 900tccgcaacaa ccttttaaca gcaggctttg cggccaggag agaggaggag
aggcaaagaa 960aaccaagcat cctccttctc ccatctataa attcctcccc ccttttcccc
tctctatata 1020ggaggcatcc aagccaagaa gagggagagc accaaggaca cgcgactagc
agaagccgag 1080cgaccgcctt ctcgatccat atcttccggt cgagttcttg gtcgatctct
tccctcctcc 1140acctcctcct cacagggtat gtgcctccct tcggttgttc ttggatttat
tgttctaggt 1200tgtgtagtac gggcgttgat gttaggaaag gggatctgta tctgtgatga
ttcctgttct 1260tggatttggg atagaggggt tcttgatgtt gcatgttatc ggttcggttt
gattagtagt 1320atggttttca atcgtctgga gagctctatg gaaatgaaat ggtttaggga
tcggaatctt 1380gcgattttgt gagtaccttt tgtttgaggt aaaatcagag caccggtgat
tttgcttggt 1440gtaataaagt acggttgttt ggtcctcgat tctggtagtg atgcttctcg
atttgacgaa 1500gctatccttt gtttattccc tattgaacaa aaataatcca actttgaaga
cggtcccgtt 1560gatgagattg aatgattgat tcttaagcct gtccaaaatt tcgcagctgg
cttgtttaga 1620tacagtagtc cccatcacga aattcatgga aacagttata atcctcagga
acaggggatt 1680ccctgttctt ccgatttgct ttagtcccag aatttttttt cccaaatatc
ttaaaaagtc 1740actttctggt tcagttcaat gaattgattg ctacaaataa tggtgcaaat
caggtctata 1800tgattgattt tgggctggcc aagaagtata gagactcatc aactcatcag
catattccgt 1860atagagaaaa caaaaatttg acaggaactg ctagatacgc aagcatgaat
actcatcttg 1920gcattgaaca aagtcgaagg gatgatttgg aatcgctggg ttatgtttta
atgtacttct 1980taagaggaag tctcccttgg caggggctga aagcaggcac taagaaacag
aagtatgaga 2040agatcagtga gaagaaagta tcaacatcaa tagagacctt gtgtagggga
tatcctgcag 2100agtttgcatc atattttcat tactgtcgat cactaagatt tgatgataaa
ccagattatg 2160cttatctgaa gagaattttc cgtgatcttt tcattcgtga agggtttcaa
tttgattata 2220tatttgactg gaccattttg aaatatcagc aatcacagct tgccaatcct
ccatctcgtg 2280ctcttggtgg tactgctggg ccaagctcag ggatgcctca tgctcttgtt
aatgttgaga 2340ggcaatcagg tggagatgaa ggtcgaccaa ctggttggtc ttcatcaaat
cttacacgta 2400ataagagcac ggggctgcat ttcaattctg gaagcttatt gaagcaaaaa
ggcacagttg 2460ctaatgattt atccatgggt aaagagttat ccagttctaa ttttttccgg
tcaagtggac 2520cattgaggcg tccagttgtc tctagcatcc gagacccagt gattgcaggg
ggtgaacctg 2580acccctccgg cactctgaca aaagatgcaa gcccgggacc attgcgtaaa
gtatccagtg 2640ctgcacggag gagttcacca gttgtgtcct cagatcacaa gcgcagctcc
tctatcaaaa 2700atgccaacat aaagaattta gagtccaccg tcaagggaat agagggttta
agttttcgat 2760gatgagggac tgcattagta gctgtgcttt gtctcagttc tccgttcact
gtaaattttg 2820gcacaccaac ttggggagta agagttctga tattagttgc tgtcaggaag
taccataaag 2880ctgaattata caattaaaat ttgggatcca atcgcaaaag cacattaagg
atatgatggg 2940gttgcagatc caaactcaca gattccagtt tatgctcgtc catacagtta
taggcacttt 3000ccatattctt ttctttaatc tctgtctctt gcttgttatt gttatgtcgt
ggtattcttg 3060ttgaggtcat gtttgtgaat tgcgaagatg gtcatgtata attgccgaga
aatcatgtac 3120tagtttgttt taaacatgag caaactgtta ttttgttcaa gctactttaa
tatcaaaaaa 3180aaaaaaaaaa gggcggccgc tctagagtat ccctcgaggg gcccaagctt
acgcgtaccc 3240agcttt
3246
SEQ ID NO: 65; length: 60; type: DNA; organism: Artificial sequence; feature: primer prm06049
ggggacaagt
ttgtacaaaa aagcaggctt aaacaatgtg tggcatcctc gccgtgctcg
60
SEQ ID NO: 66; length: 54; type: DNA; organism: Artificial sequence; feature: primer prm06050
ggggaccact ttgtacaaga
aagctgggtg cgacgataga aagttaaacg gcag 54
SEQ ID NO: 67; length: 591; type: PRT; organism: Oryza sativa
Met Cys Gly Ile Leu Ala Val Leu Gly Val Ala Asp Val Ser Leu Ala 1
5 10 15 Lys Arg Ser Arg
Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20
25 30 Pro Asp Trp Ser Gly Ile His Cys Tyr
Gln Asp Cys Tyr Leu Ala His 35 40
45 Gln Arg Leu Ala Ile Val Asp Pro Thr Ser Gly Asp Gln Pro
Leu Tyr 50 55 60
Asn Glu Asp Lys Ser Val Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65
70 75 80 His Glu Glu Leu Lys
Ala Asn Leu Lys Ser His Lys Phe Gln Thr Ala 85
90 95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr
Glu Glu Tyr Gly Glu Glu 100 105
110 Phe Val Asp Met Leu Asp Gly Met Phe Ala Phe Val Leu Leu Asp
Thr 115 120 125 Arg
Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Cys Pro 130
135 140 Leu Tyr Met Gly Trp Gly
Leu Asp Gly Ser Val Trp Phe Ser Ser Glu 145 150
155 160 Met Lys Ala Leu Ser Asp Asp Cys Glu Arg Phe
Ile Ser Phe Pro Pro 165 170
175 Gly His Leu Tyr Ser Ser Lys Thr Gly Gly Leu Arg Arg Trp Tyr Asn
180 185 190 Pro Pro
Trp Phe Ser Glu Ser Ile Pro Ser Thr Pro Tyr Asn Pro Leu 195
200 205 Leu Leu Arg Gln Ser Phe Glu
Lys Ala Ile Ile Lys Arg Leu Met Thr 210 215
220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu
Asp Ser Ser Leu 225 230 235
240 Val Ala Ser Val Val Ser Arg His Leu Ala Glu Ala Lys Val Ala Ala
245 250 255 Gln Trp Gly
Asn Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly Ser 260
265 270 Pro Asp Leu Arg Ala Ala Lys Glu
Val Ala Asp Tyr Leu Gly Thr Val 275 280
285 His His Glu Leu His Phe Thr Val Gln Glu Gly Ile Asp
Ala Leu Glu 290 295 300
Glu Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305
310 315 320 Ser Thr Pro Met
Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val 325
330 335 Lys Met Val Leu Ser Gly Glu Gly Ser
Asp Glu Ile Phe Gly Gly Tyr 340 345
350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Glu
Glu Thr 355 360 365
Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys Leu Arg Ala Asn 370
375 380 Lys Ser Thr Ser Ala
Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385 390
395 400 Lys Asn Phe Ile Asn Val Ala Met Asp Ile
Asp Pro Glu Trp Lys Met 405 410
415 Ile Lys Arg Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn
Ala 420 425 430 Phe
Asp Asp Glu Glu Lys Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg 435
440 445 Gln Lys Glu Gln Phe Ser
Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450 455
460 Leu Lys Asp His Ala Asn Glu His Val Ser Asp
Ser Met Met Met Asn 465 470 475
480 Ala Ser Phe Val Tyr Pro Glu Asn Thr Pro Val Thr Lys Glu Ala Tyr
485 490 495 Tyr Tyr
Arg Thr Ile Phe Glu Lys Phe Phe Pro Lys Asn Ala Ala Arg 500
505 510 Leu Thr Val Pro Gly Gly Pro
Ser Val Ala Cys Ser Thr Ala Lys Ala 515 520
525 Val Glu Trp Asp Ala Ala Trp Ser Lys Asn Leu Asp
Pro Ser Gly Arg 530 535 540
Ala Ala Leu Gly Val His Asp Ala Ala Tyr Glu Asp Thr Leu Gln Lys 545
550 555 560 Ser Pro Ala
Ser Ala Asn Pro Val Leu Asp Asn Gly Phe Gly Pro Ala 565
570 575 Leu Gly Glu Ser Met Val Lys Thr
Val Ala Ser Ala Thr Ala Val 580 585
590
SEQ ID NO: 68; length: 591; type: PRT; organism: Aquilegia formosa
Met Cys Gly Ile Leu Ala Val Leu
Gly Cys Ser Asp Asp Ser Gln Ala 1 5 10
15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys
His Arg Gly 20 25 30
Pro Asp Trp Ser Gly Leu Tyr Gln His Gly Asp Asn Phe Leu Ser His
35 40 45 Gln Arg Leu Ala
Val Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Tyr 50
55 60 Asn Glu Asp Lys Ser Ile Val Val
Thr Val Asn Gly Glu Ile Tyr Asn 65 70
75 80 His Glu Ala Leu Arg Lys Arg Leu Pro Asn His Lys
Phe Arg Thr Gly 85 90
95 Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu Glu Phe Gly Glu Asp
100 105 110 Phe Val Asp
Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115
120 125 Arg Asp Asn Ser Phe Leu Val Ala
Arg Asp Ala Ile Gly Ile Thr Ser 130 135
140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Ile Trp Ile
Ser Ser Glu 145 150 155
160 Met Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Cys Phe Pro Pro
165 170 175 Gly His Leu Tyr
Ser Ser Lys Asn Ser Gly Phe Arg Arg Trp Tyr Asn 180
185 190 Pro Ser Trp Phe Ser Glu Ala Val Pro
Ser Thr Pro Tyr Asp Pro Leu 195 200
205 Val Leu Arg Arg Ala Phe Glu Asn Ala Val Val Lys Arg Leu
Met Thr 210 215 220
Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225
230 235 240 Val Ala Ser Ile Thr
Ala Arg His Leu Ala Glu Thr Lys Ala Ala Lys 245
250 255 Gln Trp Gly Ala Gln Leu His Ser Phe Cys
Val Gly Leu Glu Gly Ser 260 265
270 Pro Asp Leu Lys Ala Gly Lys Glu Val Ala Asp Tyr Leu Gly Thr
Val 275 280 285 His
His Glu Phe His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290
295 300 Asp Val Ile Tyr His Val
Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310
315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile
Lys Ser Leu Gly Val 325 330
335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr
340 345 350 Leu Tyr
Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Arg Glu Thr 355
360 365 Cys His Lys Ile Lys Ala Leu
His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ser Thr Ser Ala Trp Gly Leu Glu Ala Arg Val
Pro Phe Leu Asp 385 390 395
400 Lys Glu Phe Ile Asn Val Ala Met Ala Ile Asp Pro Glu Trp Lys Met
405 410 415 Ile Lys Arg
Asp Gln Gly Arg Ile Glu Lys Trp Val Leu Arg Arg Ala 420
425 430 Phe Asp Asp Glu Asp His Pro Tyr
Leu Pro Lys His Ile Leu Tyr Arg 435 440
445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp
Ile Asp Gly 450 455 460
Leu Lys Ala His Ala Ala Ser His Val Thr Asp Lys Met Met Arg Asn 465
470 475 480 Ala Lys Asn Ile
Phe Leu His Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485
490 495 Tyr Tyr Arg Met Ile Phe Glu Arg Phe
Phe Pro Gln Asn Ser Ala Lys 500 505
510 Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala
Lys Ala 515 520 525
Val Glu Trp Asp Ala Ser Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly Val
His Ala Ser Ala Tyr Glu Ala Gln Leu Ser Ala 545 550
555 560 Pro Leu Ala Asn Gly Asn Val Pro Val Lys
Ile Phe Asn Asn Val Pro 565 570
575 Arg Met Val Glu Val Gly Ala Pro Ala Ser Leu Thr Ile Arg Ser
580 585 590
SEQ ID NO: 69; length: 590; type: PRT; organism: Asparagus officinalis
Met Cys Gly Ile Leu Ala Val Leu Gly Cys
Ser Asp Asp Ser Gln Ala 1 5 10
15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg
Gly 20 25 30 Pro
Asp Trp Ser Gly Leu Cys Gln His Gly Asp Cys Phe Leu Ser His 35
40 45 Gln Arg Leu Ala Ile Ile
Asp Pro Ala Ser Gly Asp Gln Pro Leu Tyr 50 55
60 Asn Glu Asp Lys Ser Ile Val Val Thr Val Asn
Gly Glu Ile Tyr Asn 65 70 75
80 His Glu Glu Leu Arg Arg Arg Leu Pro Asp His Lys Tyr Arg Thr Gly
85 90 95 Ser Asp
Cys Glu Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asp 100
105 110 Phe Val Asp Met Leu Asp Gly
Met Phe Ser Phe Val Leu Leu Asp Thr 115 120
125 Arg Asn Asn Cys Phe Val Ala Ala Arg Asp Ala Val
Gly Ile Thr Pro 130 135 140
Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Leu Ser Ser Glu 145
150 155 160 Met Lys Gly
Leu Asn Asp Asp Cys Glu His Phe Glu Val Phe Pro Pro 165
170 175 Gly Asn Leu Tyr Ser Ser Arg Ser
Gly Ser Phe Arg Arg Trp Tyr Asn 180 185
190 Pro Gln Trp Tyr Asn Glu Thr Ile Pro Ser Ala Pro Tyr
Asp Pro Leu 195 200 205
Val Leu Arg Lys Ala Phe Glu Asp Ala Val Ile Lys Arg Leu Met Thr 210
215 220 Asp Val Pro Phe
Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230
235 240 Val Ala Ala Val Thr Ala Arg His Leu
Ala Gly Ser Lys Ala Ala Glu 245 250
255 Gln Trp Gly Thr Gln Leu His Ser Phe Cys Val Gly Leu Glu
Gly Ser 260 265 270
Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Glu Tyr Leu Gly Thr Val
275 280 285 His His Glu Phe
His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290
295 300 Asp Val Ile Phe His Ile Glu Thr
Tyr Asp Val Thr Thr Ile Arg Ala 305 310
315 320 Ser Thr Pro Met Phe Leu Met Ala Arg Lys Ile Lys
Ser Leu Gly Val 325 330
335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr
340 345 350 Leu Tyr Phe
His Lys Ala Pro Asn Lys Glu Glu Phe His His Glu Thr 355
360 365 Cys Arg Lys Ile Lys Ala Leu His
Gln Tyr Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro
Phe Leu Asp 385 390 395
400 Lys Glu Phe Met Asp Val Ala Met Ser Ile Asp Pro Glu Ser Lys Met
405 410 415 Ile Lys Pro Asp
Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Lys Ala 420
425 430 Phe Asp Asp Glu Glu Asn Pro Tyr Leu
Pro Lys His Ile Leu Tyr Arg 435 440
445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile
Asp Gly 450 455 460
Leu Lys Ala His Ala Ala Lys His Val Thr Asp Arg Met Met Leu Asn 465
470 475 480 Ala Ala Arg Ile Tyr
Pro His Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485
490 495 Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe
Pro Gln Asn Ser Ala Arg 500 505
510 Phe Thr Val Pro Gly Gly Pro Ser Ile Ala Cys Ser Thr Ala Lys
Ala 515 520 525 Ile
Glu Trp Asp Ala Arg Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly Val His
Asp Ser Ala Tyr Asp Pro Pro Leu Pro Ser 545 550
555 560 Ser Ile Ser Ala Gly Lys Gly Ala Ala Met Ile
Thr Asn Lys Lys Pro 565 570
575 Arg Ile Val Asp Val Ala Thr Pro Gly Val Val Ile Ser Thr
580 585 590
SEQ ID NO: 70; length: 586; type: PRT; organism: Brassica oleracea
Met Cys Gly Ile Leu Ala Leu Leu Gly Cys Ser Asp Asp Ser Gln Ala 1
5 10 15 Lys Arg Val Arg
Val Leu Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20
25 30 Pro Asp Trp Ser Gly Ile Tyr Gln Asn
Gly Phe Asn Tyr Leu Ala His 35 40
45 Gln Arg Leu Ala Ile Ile Asp Pro Asp Ser Gly Asp Gln Pro
Leu Phe 50 55 60
Asn Glu Asp Lys Ser Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65
70 75 80 His Glu Glu Leu Arg
Lys Gly Leu Lys Asn His Lys Phe His Thr Gly 85
90 95 Ser Asp Cys Asp Val Ile Ala His Leu Tyr
Glu Glu His Gly Glu Asn 100 105
110 Phe Val Asp Met Leu Asp Gly Ile Phe Ser Phe Val Leu Leu Asp
Thr 115 120 125 Arg
Asp Asn Ser Phe Met Val Ala Arg Asp Ala Val Gly Val Thr Ser 130
135 140 Leu Tyr Ile Gly Trp Gly
Leu Asp Gly Ser Leu Trp Val Ser Ser Glu 145 150
155 160 Met Lys Gly Leu His Glu Asp Cys Glu His Phe
Glu Ala Phe Pro Pro 165 170
175 Gly His Leu Tyr Ser Ser Lys Ser Gly Gly Gly Phe Lys Gln Trp Tyr
180 185 190 Asn Pro
Pro Trp Phe Asn Glu Ser Val Pro Ser Thr Pro Tyr Glu Pro 195
200 205 Leu Ala Ile Arg Ser Ala Phe
Glu Asp Ala Val Ile Lys Arg Leu Met 210 215
220 Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly
Leu Asp Ser Ser 225 230 235
240 Leu Val Ala Ser Ile Thr Ala Arg His Leu Ala Gly Thr Lys Ala Ala
245 250 255 Lys Arg Trp
Gly Pro Gln Leu His Ser Phe Cys Val Gly Leu Glu Gly 260
265 270 Ser Pro Asp Leu Lys Ala Gly Lys
Glu Val Ala Glu Tyr Leu Gly Thr 275 280
285 Val His His Glu Phe His Phe Thr Val Gln Asp Gly Ile
Asp Ala Ile 290 295 300
Glu Asp Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg 305
310 315 320 Ala Ser Thr Pro
Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly 325
330 335 Val Lys Met Val Leu Ser Gly Glu Gly
Ser Asp Glu Ile Phe Gly Gly 340 345
350 Tyr Leu Tyr Phe His Lys Ala Pro Asn Lys Gln Glu Phe His
Gln Glu 355 360 365
Thr Cys Arg Lys Ile Lys Ala Leu His Lys Tyr Asp Cys Leu Arg Ala 370
375 380 Asn Lys Ala Thr Ser
Ala Phe Gly Leu Glu Ala Arg Val Pro Phe Leu 385 390
395 400 Asp Lys Glu Phe Ile Asn Thr Ala Met Ser
Leu Asp Pro Glu Ser Lys 405 410
415 Met Ile Lys Pro Glu Glu Gly Arg Ile Glu Lys Trp Val Leu Arg
Arg 420 425 430 Ala
Phe Asp Asp Glu Glu Arg Pro Tyr Leu Pro Lys His Ile Leu Tyr 435
440 445 Arg Gln Lys Glu Gln Phe
Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp 450 455
460 Gly Leu Lys Ala His Ala Ala Glu Asn Val Asn
Asp Lys Met Met Ser 465 470 475
480 Lys Ala Ala Phe Ile Phe Pro His Asn Thr Pro Leu Thr Lys Glu Ala
485 490 495 Tyr Tyr
Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala 500
505 510 Arg Leu Thr Val Pro Gly Gly
Ala Thr Val Ala Cys Ser Thr Ala Lys 515 520
525 Ala Val Glu Trp Asp Ala Ser Trp Ser Asn Asn Met
Asp Pro Ser Gly 530 535 540
Arg Ala Ala Ile Gly Val His Leu Ser Ala Tyr Asp Gly Ser Lys Val 545
550 555 560 Ala Leu Pro
Leu Pro Ala Pro His Lys Ala Ile Asp Asp Ile Pro Met 565
570 575 Met Met Gly Gln Glu Val Val Ile
Gln Thr 580 585
SEQ ID NO: 71; length: 578; type: PRT; organism: Chlamydomonas reinhardtii
Met Cys Gly Ile Leu Ala Val Leu Asn Thr Thr Asp Asp Ser Gln
Ala 1 5 10 15 Met
Arg Ser Arg Val Leu Ala Leu Ser Arg Arg Gln Arg His Arg Gly
20 25 30 Pro Asp Trp Ser Gly
Met His Gln Phe Gly Asn Asn Phe Leu Ala His 35
40 45 Glu Arg Leu Ala Ile Met Asp Pro Ala
Ser Gly Asp Gln Pro Leu Phe 50 55
60 Asn Glu Asp Arg Thr Ile Val Val Thr Val Asn Gly Glu
Ile Tyr Asn 65 70 75
80 Tyr Lys Glu Leu Arg Gln Gln Ile Thr Asp Ala Cys Pro Gly Lys Lys
85 90 95 Phe Ala Thr Asn
Ser Asp Cys Glu Val Ile Ser His Leu Tyr Glu Leu 100
105 110 His Gly Glu Lys Val Ala Ser Met Leu
Asp Gly Phe Phe Ala Phe Val 115 120
125 Val Leu Asp Thr Arg Asn Asn Thr Phe Tyr Ala Ala Arg Asp
Pro Ile 130 135 140
Gly Ile Thr Cys Met Tyr Ile Gly Trp Gly Arg Asp Gly Ser Val Trp 145
150 155 160 Leu Ser Ser Glu Met
Lys Cys Leu Lys Asp Asp Cys Thr Arg Phe Gln 165
170 175 Gln Phe Pro Pro Gly His Phe Tyr Asn Ser
Lys Thr Gly Glu Phe Thr 180 185
190 Arg Tyr Tyr Asn Pro Lys Tyr Phe Leu Asp Phe Glu Ala Lys Pro
Gln 195 200 205 Arg
Phe Pro Ser Ala Pro Tyr Asp Pro Val Ala Leu Arg Gln Ala Phe 210
215 220 Glu Gln Ser Val Glu Lys
Arg Met Met Ser Asp Val Pro Phe Gly Val 225 230
235 240 Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu Val
Ala Ser Ile Ala Ala 245 250
255 Arg Lys Ile Lys Arg Glu Gly Ser Val Trp Gly Lys Leu His Ser Phe
260 265 270 Cys Val
Gly Leu Pro Gly Ser Pro Asp Leu Lys Ala Gly Ala Gln Val 275
280 285 Ala Glu Phe Leu Gly Thr Asp
His His Glu Phe His Phe Thr Val Gln 290 295
300 Glu Gly Ile Asp Ala Ile Ser Glu Val Ile Tyr His
Ile Glu Thr Phe 305 310 315
320 Asp Val Thr Thr Ile Arg Ala Ser Thr Pro Met Phe Leu Met Ser Arg
325 330 335 Lys Ile Lys
Ala Leu Gly Val Lys Met Val Leu Ser Gly Glu Gly Ser 340
345 350 Asp Glu Val Phe Gly Gly Tyr Leu
Tyr Phe His Lys Ala Pro Asn Lys 355 360
365 Glu Glu Phe Gln Ser Glu Thr Val Arg Lys Ile Gln Asp
Leu Tyr Lys 370 375 380
Tyr Asp Cys Leu Arg Ala Asn Lys Ser Thr Met Ala Trp Gly Val Glu 385
390 395 400 Ala Arg Val Pro
Phe Leu Asp Arg His Phe Leu Asp Val Ala Met Glu 405
410 415 Ile Asp Pro Ala Glu Lys Met Ile Asp
Lys Ser Lys Gly Arg Ile Glu 420 425
430 Lys Tyr Ile Leu Arg Lys Ala Phe Asp Thr Pro Glu Asp Pro
Tyr Leu 435 440 445
Pro Asn Glu Val Leu Trp Arg Gln Lys Glu Gln Phe Ser Asp Gly Val 450
455 460 Gly Tyr Asn Trp Ile
Asp Gly Leu Lys Ala His Ala Asp Ser Gln Val 465 470
475 480 Ser Asp Asp Met Met Lys Thr Ala Ala His
Arg Tyr Pro Asp Asn Thr 485 490
495 Pro Arg Thr Lys Glu Ala Tyr Trp Tyr Arg Ser Ile Phe Glu Thr
His 500 505 510 Phe
Pro Gln Arg Ala Ala Val Glu Thr Val Pro Gly Gly Pro Ser Val 515
520 525 Ala Cys Ser Thr Ala Thr
Ala Ala Leu Trp Asp Ala Thr Trp Ala Gly 530 535
540 Lys Glu Asp Pro Ser Gly Arg Ala Val Ala Gly
Val His Asp Ser Ala 545 550 555
560 Tyr Asp Ala Ala Ala Ala Ala Asn Gly Glu Pro Ala Ala Lys Lys Ala
565 570 575 Lys Lys
72 579 PRT Glycine max
72 Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Ser
Ser Gln Ala 1 5 10 15
Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly
20 25 30 Pro Asp Trp Ser
Gly Leu His Gln Tyr Gly Asp Asn Tyr Leu Ala His 35
40 45 Gln Arg Leu Ala Ile Val Asp Pro Ala
Ser Gly Asp Gln Pro Leu Phe 50 55
60 Asn Glu Asp Lys Thr Val Val Val Thr Val Asn Gly Glu
Ile Tyr Asn 65 70 75
80 His Glu Glu Leu Arg Lys Gln Leu Pro Asn His Thr Phe Arg Thr Gly
85 90 95 Ser Asp Cys Asp
Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asn 100
105 110 Phe Val Asp Met Leu Asp Gly Ile Phe
Ser Phe Val Leu Leu Asp Thr 115 120
125 Arg Asp Asn Ser Phe Ile Val Ala Arg Asp Ala Ile Gly Val
Thr Ser 130 135 140
Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145
150 155 160 Leu Lys Gly Leu Asn
Asp Asp Cys Glu His Phe Glu Ser Phe Pro Pro 165
170 175 Gly His Leu Tyr Ser Ser Lys Glu Arg Ala
Phe Arg Arg Trp Tyr Asn 180 185
190 Pro Pro Trp Phe Ser Glu Ala Ile Pro Ser Ala Pro Tyr Asp Pro
Leu 195 200 205 Ala
Leu Arg His Ala Phe Glu Lys Ala Val Val Lys Arg Leu Met Thr 210
215 220 Asp Val Pro Phe Gly Val
Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230
235 240 Val Ala Ala Val Thr Ala Arg Tyr Leu Ala Gly
Thr Asn Ala Ala Lys 245 250
255 Gln Trp Gly Thr Lys Leu His Ser Phe Cys Val Gly Leu Glu Gly Ala
260 265 270 Pro Asp
Leu Lys Ala Ala Lys Glu Val Ala Asp Tyr Ile Gly Thr Val 275
280 285 His His Glu Phe His Tyr Thr
Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295
300 Asp Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr
Thr Ile Arg Ala 305 310 315
320 Ser Ile Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val
325 330 335 Lys Trp Val
Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340
345 350 Leu Tyr Phe His Lys Ala Pro Asn
Lys Glu Glu Phe His Gln Glu Thr 355 360
365 Cys Arg Lys Ile Lys Ala Leu His Lys Tyr Asp Cys Leu
Arg Ala Asn 370 375 380
Lys Ser Thr Phe Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385
390 395 400 Lys Asp Phe Ile
Arg Val Ala Met Asn Ile Asp Pro Asp Tyr Lys Met 405
410 415 Ile Lys Lys Glu Glu Gly Arg Ile Glu
Lys Trp Val Leu Arg Arg Ala 420 425
430 Phe Asp Asp Glu Glu His Pro Tyr Leu Pro Lys His Ile Leu
Tyr Arg 435 440 445
Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Gly Trp Ile Asp Gly 450
455 460 Leu Lys Ala His Ala
Glu Lys His Val Thr Asp Arg Met Met Leu Asn 465 470
475 480 Ala Ala Asn Ile Phe Pro Phe Asn Thr Pro
Thr Thr Lys Glu Ala Tyr 485 490
495 Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala
Arg 500 505 510 Leu
Ser Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515
520 525 Val Glu Trp Asp Ala Ala
Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530 535
540 Ala Ala Leu Gly Val His Ala Ser Ala Tyr Gly
Asn Gln Val Lys Ala 545 550 555
560 Val Glu Pro Glu Lys Ile Ile Pro Lys Met Glu Val Ser Pro Leu Gly
565 570 575 Val Ala
Ile
73 579 PRT Glycine max
73 Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser
Asp Ser Ser Gln Ala 1 5 10
15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly
20 25 30 Pro Asp
Trp Ser Gly Leu His Gln Tyr Gly Asp Asn Tyr Leu Ala His 35
40 45 Gln Arg Leu Ala Ile Val Asp
Pro Ala Ser Gly Asp Gln Pro Leu Phe 50 55
60 Asn Glu Asp Lys Thr Val Val Val Thr Val Asn Gly
Glu Ile Tyr Asn 65 70 75
80 His Glu Glu Leu Arg Lys Gln Leu Pro Asn His Thr Phe Arg Thr Gly
85 90 95 Ser Asp Cys
Asp Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asn 100
105 110 Phe Met Asp Met Leu Asp Gly Ile
Ser Ser Phe Val Leu Leu Asp Thr 115 120
125 Arg Asp Asn Ser Phe Ile Val Ala Arg Asp Ala Ile Gly
Val Thr Ser 130 135 140
Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145
150 155 160 Leu Lys Gly Leu
Asn Asp Asp Cys Glu His Phe Glu Ser Phe Pro Pro 165
170 175 Gly His Leu Tyr Ser Ser Lys Glu Arg
Ala Phe Arg Arg Trp Tyr Asn 180 185
190 Pro Pro Trp Leu Ser Leu Ala Ile Pro Ser Ala Pro Tyr Asp
Pro Leu 195 200 205
Ala Leu Arg His Ala Phe Glu Lys Leu Trp Ile Lys Arg Leu Met Thr 210
215 220 Asp Val Pro Phe Gly
Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230
235 240 Val Ala Ala Val Thr Ala Arg Tyr Leu Ala
Gly Thr Lys Ala Ala Lys 245 250
255 Gln Trp Gly Thr Lys Leu His Ser Phe Cys Val Gly Leu Glu Gly
Ala 260 265 270 Pro
Asp Leu Lys Ala Thr Lys Glu Val Ala Glu Tyr Ile Gly Thr Val 275
280 285 His His Glu Phe His Tyr
Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295
300 Asp Val Ile Tyr His Ile Glu Thr Tyr Asp Val
Thr Thr Ile Arg Ala 305 310 315
320 Ser Ile Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val
325 330 335 Lys Trp
Val Ile Ser Gly Glu Gly Ser Asp Val Phe Phe Gly Gly Tyr 340
345 350 Leu Tyr Phe His Lys Ala Pro
Asn Lys Glu Glu Phe His Gln Glu Thr 355 360
365 Cys Arg Thr Ile Ile Val Leu His Arg Tyr Asp Cys
Ser Arg Ala Asn 370 375 380
Lys Ser Thr Phe Val Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385
390 395 400 Lys Glu Phe
Ile Arg Val Ala Met Asn Ile Asp Pro Glu Cys Lys Met 405
410 415 Ile Lys Lys Glu Glu Gly Arg Ile
Glu Lys Trp Ala Leu Arg Arg Ala 420 425
430 Phe Asp Asp Glu Glu His Pro Tyr Leu Pro Lys His Ile
Leu Tyr Arg 435 440 445
Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Gly Trp Ile Asp Gly 450
455 460 Leu Lys Ala His
Ala Glu Lys His Val Thr Asp Arg Met Met Leu Asn 465 470
475 480 Ala Ala Asn Ile Phe Pro Phe Asn Thr
Pro Thr Thr Lys Glu Ala Tyr 485 490
495 His Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser
Cys Arg 500 505 510
Leu Thr Val Pro Gly Gly Thr Ser Val Ala Cys Ser Thr Ala Lys Ala
515 520 525 Val Glu Trp Asp
Ala Ala Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly Val His Ala Ser
Ala Tyr Gly Asn Gln Val Lys Ala 545 550
555 560 Val Glu Pro Glu Lys Ile Ile Pro Lys Met Glu Val
Ser Pro Leu Gly 565 570
575 Val Ala Ile
74 581 PRT Glycine max
74 Met Cys Gly Ile Leu Ala Val
Leu Gly Cys Ser Asp Asp Ser Arg Ala 1 5
10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg
Leu Lys His Arg Gly 20 25
30 Pro Asp Trp Ser Gly Leu His Gln His Gly Asp Cys Phe Leu Ala
His 35 40 45 Gln
Arg Leu Ala Ile Val Asp Pro Ala Ser Gly Asp Gln Pro Leu Phe 50
55 60 Asn Glu Asp Lys Ser Val
Ile Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70
75 80 His Glu Glu Leu Arg Lys Gln Leu Pro Asn His
Asn Phe Arg Thr Gly 85 90
95 Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asp
100 105 110 Phe Val
Asp Met Leu Asp Gly Ile Phe Ser Phe Val Leu Leu Asp Thr 115
120 125 Arg Asp Asn Ser Phe Ile Val
Ala Arg Asp Ala Ile Gly Val Thr Ser 130 135
140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp
Ile Ser Ser Glu 145 150 155
160 Met Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Cys Phe Pro Pro
165 170 175 Gly His Leu
Tyr Ser Ser Lys Glu Arg Gly Phe Arg Arg Trp Tyr Asn 180
185 190 Pro Pro Trp Phe Ser Glu Ala Ile
Pro Ser Ala Pro Tyr Asp Pro Leu 195 200
205 Val Leu Arg His Ala Phe Glu Gln Ala Val Ile Lys Arg
Leu Met Thr 210 215 220
Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225
230 235 240 Val Ala Ser Ile
Thr Ser Arg Tyr Leu Ala Asn Thr Lys Ala Ala Glu 245
250 255 Gln Trp Gly Ser Lys Leu His Ser Phe
Cys Val Gly Leu Glu Gly Ser 260 265
270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Asp Tyr Leu Gly
Thr Val 275 280 285
His His Glu Phe Thr Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290
295 300 Asp Val Ile Tyr His
Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310
315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys
Ile Lys Ser Leu Gly Val 325 330
335 Lys Trp Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly
Tyr 340 345 350 Leu
Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Arg Glu Thr 355
360 365 Cys Arg Lys Ile Lys Ala
Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ser Thr Phe Ala Trp Gly Leu Glu Ala Arg
Val Pro Phe Leu Asp 385 390 395
400 Lys Ala Phe Ile Asn Ala Ala Met Ser Ile Asp Pro Glu Trp Lys Met
405 410 415 Ile Lys
Arg Asp Glu Gly Arg Ile Glu Lys Trp Ile Leu Arg Arg Ala 420
425 430 Phe Asp Asp Glu Glu His Pro
Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440
445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser
Trp Ile Asp Gly 450 455 460
Leu Lys Ala His Ala Ala Lys His Val Thr Glu Lys Met Met Leu Asn 465
470 475 480 Ala Gly Asn
Ile Tyr Pro His Asn Thr Pro Lys Thr Lys Glu Ala Tyr 485
490 495 Tyr Tyr Arg Met Ile Phe Glu Arg
Phe Phe Pro Gln Asn Ser Ala Arg 500 505
510 Leu Thr Val Pro Gly Gly Ala Ser Val Ala Cys Ser Thr
Ala Lys Ala 515 520 525
Val Glu Trp Asp Ala Ala Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly
Val His Ile Ser Ala Tyr Glu Asn Gln Asn Asn Lys 545 550
555 560 Gly Val Glu Ile Glu Lys Ile Ile Pro
Met Asp Ala Ala Pro Leu Gly 565 570
575 Val Ala Ile Gln Gly 580
75 569 PRT Glycine max
75 Met Cys Gly Ile Leu Ala Val Leu Gly Cys Val Asp Asn Ser Gln Thr 1
5 10 15 Lys Arg Ala
Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly 20
25 30 Pro Asp Trp Ser Gly Ile His Cys
Tyr Glu Asp Cys Tyr Leu Ala His 35 40
45 Gln Arg Leu Ala Ile Val Asp Pro Thr Ser Gly Asp Gln
Pro Leu Tyr 50 55 60
Asn Glu Asp Lys Thr Ile Ile Val Thr Val Asn Gly Glu Ile Tyr Asn 65
70 75 80 His Lys Gln Leu
Arg Gln Lys Leu Ser Ser His Gln Phe Arg Thr Gly 85
90 95 Ser Asp Cys Glu Val Ile Ala His Leu
Tyr Glu Glu His Gly Glu Glu 100 105
110 Phe Val Asn Met Leu Asp Gly Met Phe Ala Phe Ile Leu Leu
Asp Thr 115 120 125
Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile Thr Pro 130
135 140 Leu Tyr Leu Gly Trp
Gly His Asp Gly Ser Thr Trp Phe Ala Ser Glu 145 150
155 160 Met Lys Ala Leu Ser Asp Asp Cys Glu Arg
Phe Ile Ser Phe Pro Pro 165 170
175 Gly His Ile Tyr Ser Ser Lys Gln Gly Gly Leu Arg Arg Trp Tyr
Asn 180 185 190 Pro
Pro Trp Phe Ser Glu Asp Ile Pro Ser Thr Pro Tyr Asp Pro Thr 195
200 205 Leu Leu Arg Glu Thr Phe
Glu Arg Ala Val Val Lys Arg Met Met Thr 210 215
220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly
Leu Asp Ser Ser Leu 225 230 235
240 Val Ala Ala Val Val Asn Arg Tyr Leu Ala Glu Ser Glu Ser Ala Arg
245 250 255 Gln Trp
Gly Ser Gln Leu His Thr Phe Cys Ile Gly Leu Lys Gly Ser 260
265 270 Pro Asp Leu Lys Ala Ala Lys
Glu Val Ala Asp Tyr Leu Gly Thr Arg 275 280
285 His His Glu Leu Tyr Phe Thr Val Gln Glu Gly Ile
Asp Ala Leu Glu 290 295 300
Glu Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305
310 315 320 Ser Thr Ala
Met Phe Leu Met Ser Arg Lys Ile Lys Ala Leu Gly Val 325
330 335 Lys Met Val Leu Ser Gly Glu Gly
Ser Asp Glu Ile Phe Gly Gly Tyr 340 345
350 Leu Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His
Glu Glu Thr 355 360 365
Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys Leu Arg Ala Asn 370
375 380 Lys Ser Thr Ala
Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385 390
395 400 Lys Glu Phe Ile Asn Val Ala Met Ser
Ile Asp Pro Glu Trp Lys Met 405 410
415 Ile Arg Pro Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg
Asn Ala 420 425 430
Phe Asp Asp Asp Lys Asn Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg
435 440 445 Gln Lys Glu Gln
Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450
455 460 Leu Lys Asp His Ala Asn Lys Gln
Val Thr Asp Ala Thr Met Met Ala 465 470
475 480 Ala Asn Phe Ile Tyr Pro Glu Asn Thr Pro Thr Thr
Lys Glu Gly Tyr 485 490
495 Leu Tyr Arg Thr Ile Phe Glu Lys Phe Phe Pro Lys Asn Ala Ala Lys
500 505 510 Ala Thr Val
Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala 515
520 525 Val Glu Trp Asp Ala Ala Trp Ser
Lys Asn Leu Asp Pro Ser Gly Arg 530 535
540 Ala Ala Leu Gly Ile His Asp Ala Ala Tyr Asp Ala Val
Asp Thr Lys 545 550 555
560 Ile Asp Glu Pro Lys Asn Gly Thr Leu 565
76 581 PRT Physcomitrella patens
76 Met Cys Gly Ile Leu Ala Ile Leu Gly Ser
His Asp Ala Ser Pro Ala 1 5 10
15 Arg Arg Asp Arg Ile Leu Glu Leu Ser Arg Arg Leu Arg His Arg
Gly 20 25 30 Pro
Asp Trp Ser Gly Leu Phe Ala Gly Gln Lys Cys Trp Cys Tyr Leu 35
40 45 Ala His Glu Arg Leu Ala
Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro 50 55
60 Leu Tyr Asn Glu Asn Lys Asp Ile Val Val Ala
Ala Asn Gly Glu Ile 65 70 75
80 Tyr Asn His Glu Ala Leu Lys Lys Ser Met Lys Pro His Lys Tyr His
85 90 95 Thr Gln
Ser Asp Cys Glu Val Ile Ala His Leu Phe Glu Asp Val Gly 100
105 110 Glu Asp Val Val Asn Met Leu
Asp Gly Met Phe Ser Phe Val Leu Val 115 120
125 Asp Asn Arg Asp Asn Ser Phe Ile Ala Ala Arg Asp
Pro Ile Gly Ile 130 135 140
Thr Pro Leu Tyr Tyr Gly Trp Gly Ala Asp Gly Ser Val Trp Phe Ala 145
150 155 160 Ser Glu Met
Lys Ala Leu Lys Asp Asp Cys Glu Arg Phe Glu Ile Phe 165
170 175 Pro Pro Gly His Ile Tyr Ser Ser
Lys Ala Gly Gly Leu Arg Arg Tyr 180 185
190 Tyr Asn Pro Ala Trp Phe Ser Glu Thr Phe Val Pro Ser
Thr Pro Tyr 195 200 205
Gln Ser Leu Val Leu Arg Ala Ala Phe Glu Lys Ala Val Ile Lys Arg 210
215 220 Leu Met Thr Asp
Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp 225 230
235 240 Ser Ser Leu Val Ala Ala Val Ala Ser
Arg His Ile Ala Gly Thr Lys 245 250
255 Ala Ala Asn Ile Trp Gly Lys Gln Leu His Ser Phe Cys Val
Gly Leu 260 265 270
Gln Gly Ser Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asn Tyr Ile
275 280 285 Gly Thr Gln His
His Glu Phe His Phe Thr Val Gln Glu Gly Leu Asp 290
295 300 Ala Leu Ser Asp Val Ile Tyr His
Val Glu Thr Tyr Asp Val Thr Thr 305 310
315 320 Ile Arg Ala Ser Thr Pro Met Phe Leu Met Thr Arg
Lys Ile Lys Ala 325 330
335 Leu Gly Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe
340 345 350 Gly Gly Tyr
Leu Tyr Phe His Lys Ala Pro Asn Arg Glu Glu Phe His 355
360 365 His Glu Leu Val Arg Lys Ile Lys
Ala Leu His Met Tyr Asp Cys Gln 370 375
380 Arg Ala Asn Lys Ser Thr Ser Ala Trp Gly Leu Glu Ala
Arg Val Pro 385 390 395
400 Phe Leu Asp Lys Glu Phe Met Glu Val Ala Met Ala Ile Asp Pro Ala
405 410 415 Glu Lys Leu Ile
Arg Lys Asp Gln Gly Arg Ile Glu Lys Trp Val Leu 420
425 430 Arg Lys Ala Phe Tyr Asp Glu Lys Asn
Pro Tyr Leu Pro Lys His Ile 435 440
445 Leu Tyr Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr
Ser Trp 450 455 460
Ile Asp Gly Leu Lys Ala His Ala Gln Ser His Val Ser Asp Gln Met 465
470 475 480 Leu Lys His Ala Lys
His Val Tyr Pro Tyr Asn Thr Pro Gln Thr Lys 485
490 495 Glu Ala Tyr Tyr Tyr Arg Met Leu Phe Glu
Lys His Phe Pro Gln Gln 500 505
510 Ser Ala Arg Leu Thr Val Pro Gly Gly Ala Ser Val Ala Cys Ser
Thr 515 520 525 Ala
Thr Ala Val Ala Trp Asp Lys Ser Trp Ala Gly Asn Leu Asp Pro 530
535 540 Ser Gly Arg Ala Ala Leu
Gly Cys His Asp Ala Ala Tyr Thr Glu Asn 545 550
555 560 Ser Ala Ala Met Ser Tyr Ile Thr Lys Asn Met
Ser Asn Val Gly Gln 565 570
575 Lys Met Thr Ile His 580
77 603 PRT Physcomitrella patens
77 Met Cys Gly Ile Leu Ala Ile Leu Gly Ala Asp Gly Ala Val Pro Ser
1 5 10 15 Ala Gly
Arg Asp Arg Ala Leu Ala Leu Ser Arg Arg Leu Arg His Arg 20
25 30 Gly Pro Asp Trp Ser Gly Leu
Phe Glu Gly Lys Asp Ser Trp Cys Tyr 35 40
45 Leu Ala His Glu Arg Leu Ala Ile Ile Asp Pro Ala
Ser Gly Asp Gln 50 55 60
Pro Leu Tyr Asn Gly Thr Lys Asp Ile Val Val Ala Ala Asn Gly Glu 65
70 75 80 Ile Tyr Asn
His Glu Leu Leu Lys Lys Asn Met Lys Pro His Glu Tyr 85
90 95 His Thr Gln Ser Asp Cys Glu Val
Ile Ala His Leu Tyr Glu Asp Val 100 105
110 Gly Glu Glu Val Val Asn Met Leu Asp Gly Met Trp Ser
Phe Val Leu 115 120 125
Val Asp Ser Arg Asp Asn Ser Phe Ile Ala Ala Arg Asp Pro Ile Gly 130
135 140 Ile Thr Pro Leu
Tyr Leu Gly Trp Gly Ala Asp Gly Arg Thr Val Trp 145 150
155 160 Phe Ala Ser Glu Met Lys Ala Leu Lys
Asp Asp Cys Glu Arg Leu Glu 165 170
175 Val Phe Pro Pro Gly His Ile Tyr Ser Ser Lys Ala Gly Gly
Leu Arg 180 185 190
Arg Tyr Tyr Asn Pro Gln Trp Phe Ser Glu Thr Phe Val Pro Glu Thr
195 200 205 Pro Tyr Gln Pro
Leu Glu Leu Arg Ser Ala Phe Glu Lys Ala Val Val 210
215 220 Lys Arg Leu Met Thr Asp Val Pro
Phe Gly Val Leu Leu Ser Gly Gly 225 230
235 240 Leu Asp Ser Ser Leu Val Ala Ser Val Ala Ala Arg
His Leu Ala Glu 245 250
255 Thr Lys Ala Val Arg Ile Trp Gly Asn Glu Leu His Ser Phe Cys Val
260 265 270 Gly Leu Glu
Gly Ser Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Lys 275
280 285 Tyr Ile Gly Thr Arg His His Glu
Phe Asn Phe Thr Val Gln Glu Gly 290 295
300 Leu Asp Ala Leu Ser Asp Val Ile Tyr His Val Glu Thr
Tyr Asp Val 305 310 315
320 Thr Thr Ile Arg Ala Ser Thr Pro Met Phe Leu Met Thr Arg Lys Ile
325 330 335 Lys Ala Leu Gly
Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu 340
345 350 Ile Phe Gly Gly Tyr Leu Tyr Phe His
Lys Ala Pro Asn Arg Glu Glu 355 360
365 Phe His His Glu Leu Val Arg Lys Ile Lys Ala Leu His Leu
Tyr Asp 370 375 380
Cys Gln Arg Ala Asn Lys Ser Thr Ser Ala Trp Gly Leu Glu Ala Arg 385
390 395 400 Val Pro Phe Leu Asp
Lys Glu Phe Met Asp Val Ala Met Met Ile Asp 405
410 415 Pro Ser Glu Lys Met Ile Arg Lys Asp Leu
Gly Arg Ile Glu Lys Trp 420 425
430 Val Leu Arg Lys Ala Phe Asp Asp Glu Glu Arg Pro Tyr Leu Pro
Lys 435 440 445 His
Ile Leu Tyr Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr 450
455 460 Ser Trp Ile Asp Gly Leu
Lys Glu Tyr Ala Glu Ser His Val Thr Asp 465 470
475 480 Gln Met Met Lys His Ala Lys His Val Tyr Pro
Phe Asn Thr Pro Asn 485 490
495 Thr Lys Glu Gly Tyr Tyr Tyr Arg Met Ile Phe Glu Lys His Phe Pro
500 505 510 Gln Gln
Ser Ala Arg Met Thr Val Pro Gly Gly Pro Ser Val Ala Cys 515
520 525 Ser Thr Ala Thr Ala Val Ala
Trp Asp Glu Ala Trp Ala Asn Asn Leu 530 535
540 Asp Pro Ser Gly Arg Ala Ala Leu Gly Cys His Asp
Ser Ala Tyr Thr 545 550 555
560 Asp Lys His Ser Glu Lys Ala Ala Pro Ala Ala Glu Ala Asn Gly Thr
565 570 575 Ala Ser His
Glu Asn Gly His Thr Phe Ser Lys Pro Lys Ser Thr Leu 580
585 590 Asp Ala Thr Ile Leu Lys Thr Gln
Ala Val His 595 600
78 592 PRT Physcomitrella patens
78 Met Cys Gly Ile Leu Ala Ile Leu Gly Cys
His Asp Lys Ser Val Thr 1 5 10
15 Arg Arg His Arg Cys Leu Glu Leu Ser Arg Arg Leu Arg His Arg
Gly 20 25 30 Pro
Asp Trp Ser Gly Leu Phe Val Asp Glu Ala Ser Gly Cys Tyr Leu 35
40 45 Ala His Glu Arg Leu Ala
Ile Ile Asp Pro Thr Ser Gly Asp Gln Pro 50 55
60 Leu Phe Asn Glu Asn Lys Asp Ile Val Val Ala
Val Asn Gly Glu Ile 65 70 75
80 Tyr Asn His Glu Ala Leu Lys Ala Ser Met Lys Ala His Lys Tyr His
85 90 95 Thr Gln
Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Ile Gly 100
105 110 Glu Glu Val Val Glu Lys Leu
Asp Gly Met Phe Ser Phe Val Leu Val 115 120
125 Asp Leu Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp
Pro Leu Gly Ile 130 135 140
Thr Pro Leu Tyr Leu Gly Trp Gly Asn Asp Gly Ser Val Trp Phe Ala 145
150 155 160 Ser Glu Met
Lys Ala Leu Lys Asp Asp Cys Glu Arg Phe Glu Ser Phe 165
170 175 Pro Pro Gly His Met Tyr Ser Ser
Lys Gln Gly Gly Leu Arg Arg Tyr 180 185
190 Tyr Asn Pro Pro Trp Phe Asn Glu Ser Ile Pro Ala Glu
Pro Tyr Asp 195 200 205
Pro Leu Ile Leu Arg His Ala Phe Glu Lys Ser Val Ile Lys Arg Leu 210
215 220 Met Thr Asp Val
Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser 225 230
235 240 Ser Leu Val Ala Ala Val Ala Gln Arg
His Leu Ala Gly Ser Thr Ala 245 250
255 Ala Lys Gln Trp Gly Asn Lys Leu His Ser Phe Cys Val Gly
Leu Glu 260 265 270
Gly Ser Pro Asp Leu Lys Ala Gly Arg Glu Val Ala Asp Tyr Ile Gly
275 280 285 Thr Val His Lys
Glu Phe His Phe Thr Val Gln Glu Gly Leu Asp Ala 290
295 300 Ile Ser Asp Val Ile Tyr His Ile
Glu Thr Tyr Asp Val Thr Thr Ile 305 310
315 320 Arg Ala Ser Thr Pro Met Phe Leu Met Ser Arg Lys
Ile Lys Ala Leu 325 330
335 Gly Val Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly
340 345 350 Gly Tyr Leu
Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Lys 355
360 365 Glu Thr Cys Arg Lys Leu Lys Ala
Leu His Leu Tyr Asp Cys Leu Arg 370 375
380 Ala Asn Lys Ser Thr Ser Ala Trp Gly Leu Glu Ala Arg
Val Pro Phe 385 390 395
400 Leu Asp Arg Asp Phe Val Asn Leu Ala Met Ser Ile Asp Pro Ala Glu
405 410 415 Lys Met Ile Asn
Lys Lys Glu Gly Lys Ile Glu Lys Trp Ile Ile Arg 420
425 430 Lys Ala Phe Asp Asp Glu Glu Asn Pro
Tyr Leu Pro Lys His Ile Leu 435 440
445 Tyr Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser
Trp Ile 450 455 460
Asp Gly Leu Lys Asp His Ala Ala Ser Gln Val Ser Asp Gln Met Leu 465
470 475 480 Ala Asn Ala Lys His
Ile Tyr Pro His Asn Thr Pro Gly Thr Lys Glu 485
490 495 Gly Tyr Tyr Tyr Arg Met Ile Phe Glu Arg
Cys Phe Pro Gln Glu Ser 500 505
510 Ala Arg Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr
Ala 515 520 525 Ala
Ala Ile Ala Trp Asp Lys Ala Trp Ala Asn Asn Leu Asp Pro Ser 530
535 540 Gly Arg Ala Ala Thr Gly
Val His Asp Ser Ala Tyr Glu Gly Gly Glu 545 550
555 560 Val Glu Ser Ser Ala Val Ser His Lys Glu Gly
Gly Glu Asp Gly Leu 565 570
575 Ala Asn Ser Lys Val Gly Asp Lys Val Gln Glu Ala Ile Ala Val Ala
580 585 590
79 589 PRT Populus trichocarpa
79 Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser
Asp Asp Ser Gln Ala 1 5 10
15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly
20 25 30 Pro Asp
Trp Ser Gly Leu Tyr Gln Cys Gly Asp Phe Tyr Leu Ala His 35
40 45 Gln Arg Leu Ala Ile Ile Asp
Pro Ala Ser Gly Asp Gln Pro Leu Phe 50 55
60 Asn Glu Asp Gln Ala Ile Val Val Thr Val Asn Gly
Glu Ile Tyr Asn 65 70 75
80 His Glu Glu Leu Arg Lys Arg Leu Pro Asn His Lys Phe Arg Thr Gly
85 90 95 Ser Asp Cys
Asp Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Asn 100
105 110 Phe Val Asp Met Leu Asp Gly Met
Phe Ser Phe Val Leu Leu Asp Thr 115 120
125 Arg Asp Asn Ser Phe Ile Val Ala Arg Asp Ala Ile Gly
Ile Thr Pro 130 135 140
Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145
150 155 160 Leu Lys Gly Leu
Asn Asp Asp Cys Glu His Phe Glu Cys Phe Pro Pro 165
170 175 Gly His Leu Tyr Ser Ser Lys Ser Gly
Gly Leu Arg Arg Trp Tyr Asn 180 185
190 Pro Pro Trp Phe Cys Glu Ala Ile Pro Ser Thr Pro Tyr Asp
Pro Leu 195 200 205
Val Leu Arg Arg Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210
215 220 Asp Val Pro Phe Gly
Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230
235 240 Val Ala Ala Val Thr Ala Arg His Leu Ala
Gly Thr Lys Ala Ala Arg 245 250
255 Gln Trp Gly Ala Gln Leu His Ser Phe Cys Val Gly Leu Glu Asn
Ser 260 265 270 Pro
Asp Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Gly Thr Val 275
280 285 His His Glu Phe Tyr Phe
Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295
300 Asp Val Ile Tyr His Ile Glu Thr Tyr Asp Val
Thr Thr Ile Arg Ala 305 310 315
320 Ser Thr Pro Met Phe Leu Met Ala Arg Lys Ile Lys Ala Leu Gly Val
325 330 335 Lys Met
Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340
345 350 Leu Tyr Phe His Lys Ala Pro
Asn Lys Glu Glu Leu His Arg Glu Thr 355 360
365 Cys Arg Lys Ile Lys Ala Leu His Gln Tyr Asp Cys
Leu Arg Ala Asn 370 375 380
Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385
390 395 400 Lys Asp Phe
Ile Asn Val Ala Met Ala Ile Asp Pro Glu Trp Lys Met 405
410 415 Ile Lys Pro Gly Gln Gly His Ile
Glu Lys Trp Val Leu Arg Lys Ala 420 425
430 Phe Asp Asp Glu Glu His Pro Tyr Leu Pro Lys His Ile
Leu Tyr Arg 435 440 445
Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450
455 460 Leu Lys Ala His
Ala Ala Gln His Val Thr Asp Lys Met Met Gln Asn 465 470
475 480 Ala Glu His Ile Phe Pro His Asn Thr
Pro Thr Thr Lys Glu Ala Tyr 485 490
495 Tyr Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser
Ala Arg 500 505 510
Leu Ser Val Pro Gly Gly Ala Ser Val Ala Cys Ser Thr Ala Lys Ala
515 520 525 Val Glu Trp Asp
Ala Ala Trp Ser Asn Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly Val His Leu Ser
Asp Tyr Asp Gln Gln Ala Ala Leu 545 550
555 560 Ala Asn Ala Gly Val Val Pro Pro Lys Ile Ile Asp
Thr Leu Pro Arg 565 570
575 Met Leu Glu Val Ser Ala Ser Gly Val Ala Ile His Ser
580 585
80 587 PRT Populus trichocarpa
80 Met
Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Asp Ser Gln Ala 1
5 10 15 Lys Arg Phe Arg Val Leu
Glu Leu Ser Arg Arg Leu Lys His Arg Gly 20
25 30 Pro Asp Trp Ser Gly Leu Phe Gln His Gly
Asp Phe Tyr Leu Ala His 35 40
45 Gln Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro
Leu Phe 50 55 60
Asn Glu Asp Gln Ala Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65
70 75 80 His Glu Glu Leu Arg
Lys Arg Leu Pro Asn His Lys Phe Arg Thr Gly 85
90 95 Ser Asp Cys Asp Val Ile Ser His Leu Tyr
Glu Glu Tyr Gly Glu Asn 100 105
110 Phe Val Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp
Thr 115 120 125 Arg
Asp Asn Ser Phe Ile Val Ala Arg Asp Ala Ile Gly Ile Thr Ser 130
135 140 Leu Tyr Ile Gly Trp Gly
Leu Asp Gly Ser Val Trp Ile Ser Ser Glu 145 150
155 160 Leu Lys Gly Leu Asn Asp Asp Cys Glu His Phe
Lys Cys Phe Pro Pro 165 170
175 Gly His Ile Tyr Ser Ser Lys Ser Gly Gly Leu Arg Arg Trp Tyr Asn
180 185 190 Pro Leu
Trp Phe Ser Glu Ala Ile Pro Ser Thr Pro Tyr Asp Pro Leu 195
200 205 Ala Leu Arg Arg Ala Phe Glu
Lys Ala Val Ile Lys Arg Leu Met Thr 210 215
220 Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu
Asp Ser Ser Leu 225 230 235
240 Val Ala Ala Val Thr Ala Arg His Leu Ala Gly Thr Gln Ala Ala Arg
245 250 255 Gln Trp Gly
Ala His Leu His Ser Phe Cys Val Gly Leu Glu Asn Ser 260
265 270 Pro Asp Leu Lys Ala Ala Arg Glu
Val Ala Asp Tyr Leu Gly Thr Ile 275 280
285 His His Glu Phe His Phe Thr Val Gln Asp Gly Ile Asp
Ala Ile Glu 290 295 300
Asp Val Ile Tyr His Val Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305
310 315 320 Ser Thr Pro Met
Phe Leu Leu Ala Arg Lys Ile Lys Ala Leu Gly Val 325
330 335 Lys Met Val Ile Ser Gly Glu Gly Ser
Asp Glu Ile Phe Gly Gly Tyr 340 345
350 Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Leu His Gly
Glu Thr 355 360 365
Cys Arg Lys Ile Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370
375 380 Lys Ala Thr Ser Ala
Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385 390
395 400 Lys Asp Phe Ile Asn Val Ala Met Ala Ile
Asp Pro Glu Trp Lys Met 405 410
415 Ile Lys Pro Gly Arg Ile Glu Lys Trp Val Leu Arg Lys Ala Phe
Asp 420 425 430 Asp
Glu Glu His Pro Tyr Leu Pro Lys His Ile Leu Tyr Arg Gln Lys 435
440 445 Glu Gln Phe Ser Asp Gly
Val Gly Tyr Ser Trp Ile Asp Gly Leu Lys 450 455
460 Ala His Ala Glu Leu His Val His Asp Lys Met
Met Gln Asn Ala Glu 465 470 475
480 His Ile Phe Pro His Asn Thr Pro Thr Thr Lys Glu Ala Tyr Tyr Tyr
485 490 495 Arg Met
Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser Ala Arg Leu Thr 500
505 510 Val Pro Gly Gly Ala Ser Val
Ala Cys Ser Thr Ala Lys Ala Val Glu 515 520
525 Trp Asp Ala Ser Trp Ser Asn Asn Leu Asp Pro Ser
Gly Arg Ala Ala 530 535 540
Leu Gly Val His Leu Ser Ala Tyr Glu Gln Gln Ala Ala Leu Ala Ser 545
550 555 560 Ala Gly Val
Val Pro Pro Glu Ile Ile Asp Asn Leu Pro Arg Met Met 565
570 575 Lys Val Gly Ala Pro Gly Val Ala
Ile Gln Ser 580 585
81 578 PRT Arabidopsis thaliana
81 Met Cys Gly Ile Leu Ala Val Leu Gly Cys
Ile Asp Asn Ser Gln Ala 1 5 10
15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg
Gly 20 25 30 Pro
Asp Trp Ser Gly Leu His Cys Tyr Glu Asp Cys Tyr Leu Ala His 35
40 45 Glu Arg Leu Ala Ile Ile
Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50 55
60 Asn Glu Asp Lys Thr Val Ala Val Thr Val Asn
Gly Glu Ile Tyr Asn 65 70 75
80 His Lys Ile Leu Arg Glu Lys Leu Lys Ser His Gln Phe Arg Thr Gly
85 90 95 Ser Asp
Cys Glu Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Glu 100
105 110 Phe Ile Asp Met Leu Asp Gly
Met Phe Ala Phe Val Leu Leu Asp Thr 115 120
125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile
Gly Ile Thr Pro 130 135 140
Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ala Ser Glu 145
150 155 160 Met Lys Ala
Leu Ser Asp Asp Cys Glu Gln Phe Met Ser Phe Pro Pro 165
170 175 Gly His Ile Tyr Ser Ser Lys Gln
Gly Gly Leu Arg Arg Trp Tyr Asn 180 185
190 Pro Pro Trp Tyr Asn Glu Gln Val Pro Ser Thr Pro Tyr
Asp Pro Leu 195 200 205
Val Leu Arg Asn Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210
215 220 Asp Val Pro Phe
Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230
235 240 Val Ala Ala Val Ala Leu Arg His Leu
Glu Lys Ser Glu Ala Ala Arg 245 250
255 Gln Trp Gly Ser Gln Leu His Thr Phe Cys Ile Gly Leu Gln
Gly Ser 260 265 270
Pro Asp Leu Lys Ala Gly Arg Glu Val Ala Asp Tyr Leu Gly Thr Arg
275 280 285 His His Glu Phe
Gln Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290
295 300 Glu Val Ile Tyr His Ile Glu Thr
Tyr Asp Val Thr Thr Ile Arg Ala 305 310
315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys
Ser Leu Gly Val 325 330
335 Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Leu Gly Gly Tyr
340 345 350 Leu Tyr Phe
His Lys Ala Pro Asn Lys Lys Glu Phe His Glu Glu Thr 355
360 365 Cys Arg Lys Ile Lys Ala Leu His
Gln Phe Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ser Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro
Phe Leu Asp 385 390 395
400 Lys Glu Phe Leu Asn Val Ala Met Ser Ile Asp Pro Glu Trp Lys Leu
405 410 415 Ile Lys Pro Asp
Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn Ala 420
425 430 Phe Asp Asp Glu Glu Arg Pro Tyr Leu
Pro Lys His Ile Leu Tyr Arg 435 440
445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile
Asp Gly 450 455 460
Leu Lys Asp His Ala Asn Lys His Val Ser Asp Thr Met Leu Ser Asn 465
470 475 480 Ala Ser Phe Val Phe
Pro Asp Asn Thr Pro Leu Thr Lys Glu Ala Tyr 485
490 495 Tyr Tyr Arg Thr Ile Phe Glu Lys Phe Phe
Pro Lys Ser Ala Ala Arg 500 505
510 Ala Thr Val Pro Gly Gly Pro Ser Ile Ala Cys Ser Thr Ala Lys
Ala 515 520 525 Val
Glu Trp Asp Ala Thr Trp Ser Lys Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly Val His
Val Ala Ala Tyr Glu Glu Asp Lys Ala Ala 545 550
555 560 Ala Ala Ala Lys Ala Gly Ser Asp Leu Val Asp
Pro Leu Pro Lys Asn 565 570
575 Gly Thr
82 584 PRT Arabidopsis thaliana
82 Met Cys Gly Ile Leu Ala
Val Leu Gly Cys Ser Asp Asp Ser Gln Ala 1 5
10 15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg
Leu Arg His Arg Gly 20 25
30 Pro Asp Trp Ser Gly Leu Tyr Gln Asn Gly Asp Asn Tyr Leu Ala
His 35 40 45 Gln
Arg Leu Ala Val Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Phe 50
55 60 Asn Glu Asp Lys Thr Ile
Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70
75 80 His Glu Glu Leu Arg Lys Arg Leu Lys Asn His
Lys Phe Arg Thr Gly 85 90
95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Val Asp
100 105 110 Phe Val
Asp Met Leu Asp Gly Ile Phe Ser Phe Val Leu Leu Asp Thr 115
120 125 Arg Asp Asn Ser Phe Met Val
Ala Arg Asp Ala Ile Gly Val Thr Ser 130 135
140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp
Ile Ser Ser Glu 145 150 155
160 Met Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Thr Phe Pro Pro
165 170 175 Gly His Phe
Tyr Ser Ser Lys Leu Gly Gly Phe Lys Gln Trp Tyr Asn 180
185 190 Pro Pro Trp Phe Asn Glu Ser Val
Pro Ser Thr Pro Tyr Glu Pro Leu 195 200
205 Ala Ile Arg Arg Ala Phe Glu Asn Ala Val Ile Lys Arg
Leu Met Thr 210 215 220
Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225
230 235 240 Val Ala Ser Ile
Thr Ala Arg His Leu Ala Gly Thr Lys Ala Ala Lys 245
250 255 Gln Trp Gly Pro Gln Leu His Ser Phe
Cys Val Gly Leu Glu Gly Ser 260 265
270 Pro Asp Leu Lys Ala Gly Lys Glu Val Ala Glu Tyr Leu Gly
Thr Val 275 280 285
His His Glu Phe His Phe Ser Val Gln Asp Gly Ile Asp Ala Ile Glu 290
295 300 Asp Val Ile Tyr His
Val Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310
315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys
Ile Lys Ser Leu Gly Val 325 330
335 Lys Met Val Leu Ser Gly Glu Gly Ala Asp Glu Ile Phe Gly Gly
Tyr 340 345 350 Leu
Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Gln Glu Thr 355
360 365 Cys Arg Lys Ile Lys Ala
Leu His Lys Tyr Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ser Thr Ser Ala Phe Gly Leu Glu Ala Arg
Val Pro Phe Leu Asp 385 390 395
400 Lys Asp Phe Ile Asn Thr Ala Met Ser Leu Asp Pro Glu Ser Lys Met
405 410 415 Ile Lys
Pro Glu Glu Gly Arg Ile Glu Lys Trp Val Leu Arg Arg Ala 420
425 430 Phe Asp Asp Glu Glu Arg Pro
Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440
445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser
Trp Ile Asp Gly 450 455 460
Leu Lys Asp His Ala Ala Gln Asn Val Asn Asp Lys Met Met Ser Asn 465
470 475 480 Ala Gly His
Ile Phe Pro His Asn Thr Pro Asn Thr Lys Glu Ala Tyr 485
490 495 Tyr Tyr Arg Met Ile Phe Glu Arg
Phe Phe Pro Gln Asn Ser Ala Arg 500 505
510 Leu Thr Val Pro Gly Gly Ala Thr Val Ala Cys Ser Thr
Ala Lys Ala 515 520 525
Val Glu Trp Asp Ala Ser Trp Ser Asn Asn Met Asp Pro Ser Gly Arg 530
535 540 Ala Ala Ile Gly
Val His Leu Ser Ala Tyr Asp Gly Lys Asn Val Ala 545 550
555 560 Leu Thr Ile Pro Pro Leu Lys Ala Ile
Asp Asn Met Pro Met Met Met 565 570
575 Gly Gln Gly Val Val Ile Gln Ser 580
83 578 PRT Arabidopsis thaliana
83 Met Cys Gly Ile Leu Ala Val Leu
Gly Cys Val Asp Asn Ser Gln Ala 1 5 10
15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg
His Arg Gly 20 25 30
Pro Asp Trp Ser Gly Leu His Cys Tyr Glu Asp Cys Tyr Leu Ala His
35 40 45 Glu Arg Leu Ala
Ile Val Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50
55 60 Asn Glu Asp Lys Thr Ile Ala Val
Thr Val Asn Gly Glu Ile Tyr Asn 65 70
75 80 His Lys Ala Leu Arg Glu Asn Leu Lys Ser His Gln
Phe Arg Thr Gly 85 90
95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Glu
100 105 110 Phe Val Asp
Met Leu Asp Gly Met Phe Ala Phe Val Leu Leu Asp Thr 115
120 125 Arg Asp Lys Ser Phe Ile Ala Ala
Arg Asp Ala Ile Gly Ile Thr Pro 130 135
140 Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Phe
Ala Ser Glu 145 150 155
160 Met Lys Ala Leu Ser Asp Asp Cys Glu Gln Phe Met Cys Phe Pro Pro
165 170 175 Gly His Ile Tyr
Ser Ser Lys Gln Gly Gly Leu Arg Arg Trp Tyr Asn 180
185 190 Pro Pro Trp Phe Ser Glu Val Val Pro
Ser Thr Pro Tyr Asp Pro Leu 195 200
205 Val Val Arg Asn Thr Phe Glu Lys Ala Val Ile Lys Arg Leu
Met Thr 210 215 220
Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225
230 235 240 Val Ala Ser Val Ala
Leu Arg His Leu Glu Lys Ser Glu Ala Ala Cys 245
250 255 Gln Trp Gly Ser Lys Leu His Thr Phe Cys
Ile Gly Leu Lys Gly Ser 260 265
270 Pro Asp Leu Lys Ala Gly Arg Glu Val Ala Asp Tyr Leu Gly Thr
Arg 275 280 285 His
His Glu Leu His Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290
295 300 Glu Val Ile Tyr His Val
Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310
315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile
Lys Ser Leu Gly Val 325 330
335 Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr
340 345 350 Leu Tyr
Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Glu Glu Thr 355
360 365 Cys Arg Lys Ile Lys Ala Leu
His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ser Thr Ser Ala Trp Gly Val Glu Ala Arg Val
Pro Phe Leu Asp 385 390 395
400 Lys Glu Phe Ile Asn Val Ala Met Ser Ile Asp Pro Glu Trp Lys Met
405 410 415 Ile Arg Pro
Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn Ala 420
425 430 Phe Asp Asp Glu Lys Asn Pro Tyr
Leu Pro Lys His Ile Leu Tyr Arg 435 440
445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp
Ile Asp Gly 450 455 460
Leu Lys Asp His Ala Asn Lys His Val Ser Glu Thr Met Leu Met Asn 465
470 475 480 Ala Ser Phe Val
Phe Pro Asp Asn Thr Pro Leu Thr Lys Glu Ala Tyr 485
490 495 Tyr Tyr Arg Thr Ile Phe Glu Lys Phe
Phe Pro Lys Ser Ala Ala Arg 500 505
510 Ala Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala
Lys Ala 515 520 525
Val Glu Trp Asp Ala Ala Trp Ser Gln Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly Val
His Val Ser Ala Tyr Gly Glu Asp Lys Thr Glu 545 550
555 560 Asp Ser Arg Pro Glu Lys Leu Gln Lys Leu
Ala Glu Lys Thr Pro Ala 565 570
575 Ile Val
84 581 PRT Triticum aestivum
84 Met Cys Gly Ile Leu
Ala Val Leu Gly Cys Gly Asp Glu Ser Gln Gly 1 5
10 15 Lys Arg Val His Val Leu Glu Leu Ser Arg
Arg Leu Lys His Arg Gly 20 25
30 Pro Asp Trp Ser Gly Leu His Gln Val Ala Asp Asn Tyr Leu Cys
His 35 40 45 Gln
Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly Asp Gln Pro Leu Tyr 50
55 60 Asn Glu Asp Lys Ser Ile
Ala Val Ala Val Asn Gly Glu Val Tyr Asn 65 70
75 80 His Glu Glu Leu Arg Ala Arg Leu Ser Gly His
Arg Phe Arg Thr Gly 85 90
95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Ser
100 105 110 Phe Ile
Asp Met Leu Asp Gly Val Phe Ser Phe Val Leu Leu Asp Ala 115
120 125 Arg Asp Asn Ser Phe Ile Ala
Ala Arg Asp Ala Ile Gly Val Thr Pro 130 135
140 Leu Tyr Ile Gly Trp Gly Ile Asp Gly Ser Val Trp
Ile Ser Ser Glu 145 150 155
160 Met Lys Gly Leu Asn Asp Asp Cys Glu His Phe Glu Ile Phe Pro Pro
165 170 175 Gly Asn Leu
Tyr Ser Ser Lys Glu Lys Ser Phe Lys Arg Trp Tyr Asn 180
185 190 Pro Pro Trp Phe Ser Glu Val Ile
Pro Ser Val Pro Tyr Asp Pro Leu 195 200
205 Arg Leu Arg Ser Ala Phe Glu Lys Ala Val Ile Lys Arg
Leu Met Thr 210 215 220
Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225
230 235 240 Val Ala Ala Val
Ala Ala Arg His Phe Ala Gly Thr Lys Ala Ala Lys 245
250 255 Arg Trp Gly Thr Arg Leu His Ser Phe
Cys Val Gly Leu Glu Gly Ser 260 265
270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Asp His Leu Gly
Thr Val 275 280 285
His His Glu Phe Asn Phe Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290
295 300 Asp Val Ile Tyr His
Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310
315 320 Ser Thr Leu Met Phe Gln Met Ser Arg Lys
Ile Lys Ala Leu Gly Val 325 330
335 Lys Met Val Ile Ser Gly Glu Gly Ala Asp Glu Ile Phe Gly Gly
Tyr 340 345 350 Leu
Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Phe His Gln Glu Thr 355
360 365 Cys Arg Lys Ile Lys Ala
Leu His Gln Tyr Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ala Thr Ser Ala Trp Gly Leu Glu Val Arg
Val Pro Phe Leu Asp 385 390 395
400 Lys Glu Phe Ile Asn Glu Ala Met Ser Ile Asp Pro Glu Trp Lys Met
405 410 415 Ile Arg
Pro Asp Leu Gly Arg Ile Glu Lys Trp Ile Leu Arg Lys Ala 420
425 430 Phe Asp Asp Glu Glu Arg Pro
Phe Leu Pro Lys His Ile Leu Tyr Arg 435 440
445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser
Trp Ile Asp Gly 450 455 460
Leu Lys Asp His Ala Ala Ser Asn Val Ser Asp Lys Met Met Ser Asn 465
470 475 480 Ala Lys Phe
Ile Tyr Pro His Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485
490 495 Tyr Tyr Arg Met Ile Phe Glu Arg
Tyr Phe Pro Gln Ser Ser Ala Ile 500 505
510 Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr
Ala Lys Ala 515 520 525
Ile Glu Trp Asp Ala Gln Trp Ser Gly Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly
Val His Leu Ser Ala Tyr Glu Gln Asp Thr Val Ala 545 550
555 560 Val Gly Gly Ser Asn Lys Pro Gly Val
Met Asn Thr Val Val Pro Gly 565 570
575 Val Ala Ile Glu Thr 580
85 585 PRT Triticum aestivum
85 Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ala
Asp Asp Thr Gln Gly 1 5 10
15 Lys Arg Val Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His Arg Gly
20 25 30 Pro Asp
Trp Ser Gly Met His Gln Val Gly Asp Cys Tyr Leu Ser His 35
40 45 Gln Arg Leu Ala Ile Ile Asp
Pro Ala Ser Gly Asp Gln Pro Leu Tyr 50 55
60 Asn Glu Asp Lys Ser Ile Val Val Thr Val Asn Gly
Glu Ile Tyr Asn 65 70 75
80 His Glu Gln Leu Arg Ala Gln Leu Ser Ser His Thr Phe Arg Thr Gly
85 90 95 Ser Asp Cys
Glu Val Ile Ala His Leu Tyr Glu Glu His Gly Glu Asn 100
105 110 Phe Ile Asp Met Leu Asp Gly Val
Phe Ser Phe Val Leu Leu Asp Thr 115 120
125 Arg Asp Asn Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly
Val Thr Pro 130 135 140
Leu Tyr Ile Gly Trp Gly Ile Asp Gly Ser Val Trp Ile Ser Ser Glu 145
150 155 160 Met Lys Gly Leu
Asn Asp Asp Cys Glu His Phe Glu Ile Phe Pro Pro 165
170 175 Gly His Leu Tyr Ser Ser Lys Gln Gly
Gly Phe Lys Arg Trp Tyr Asn 180 185
190 Pro Pro Trp Phe Ser Glu Val Ile Pro Ser Val Pro Tyr Asp
Pro Leu 195 200 205
Ala Leu Arg Lys Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210
215 220 Asp Val Pro Phe Gly
Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230
235 240 Val Ala Ala Val Thr Val Arg His Leu Ala
Gly Thr Lys Ala Ala Lys 245 250
255 Arg Trp Gly Thr Lys Leu His Ser Phe Cys Val Gly Leu Glu Gly
Ser 260 265 270 Pro
Asp Leu Lys Ala Ala Lys Glu Val Ala Asn Tyr Leu Gly Thr Met 275
280 285 His His Glu Phe Thr Phe
Thr Val Gln Asp Gly Ile Asp Ala Ile Glu 290 295
300 Asp Val Ile Tyr His Thr Glu Thr Tyr Asp Val
Thr Thr Ile Arg Ala 305 310 315
320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val
325 330 335 Lys Met
Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340
345 350 Leu Tyr Phe His Lys Ala Pro
Asn Lys Glu Glu Leu His Arg Glu Thr 355 360
365 Cys Gln Lys Ile Lys Ala Leu His Gln Tyr Asp Cys
Leu Arg Ala Asn 370 375 380
Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu Asp 385
390 395 400 Lys Glu Phe
Ile Asn Glu Ala Met Ser Ile Asp Pro Glu Trp Lys Met 405
410 415 Ile Arg Pro Asp Leu Gly Arg Ile
Glu Lys Trp Met Leu Arg Lys Ala 420 425
430 Phe Asp Asp Glu Glu Gln Pro Phe Leu Pro Lys His Ile
Leu Tyr Arg 435 440 445
Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp Gly 450
455 460 Leu Lys Ala His
Ala Glu Ser Asn Val Thr Asp Lys Met Met Ser Asn 465 470
475 480 Ala Lys Phe Ile Tyr Pro His Asn Thr
Pro Thr Thr Lys Glu Ala Tyr 485 490
495 Cys Tyr Arg Met Ile Phe Glu Arg Phe Phe Pro Gln Asn Ser
Ala Ile 500 505 510
Leu Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys Ala
515 520 525 Val Glu Trp Asp
Ala Gln Trp Ser Gly Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly Val His Leu Ser
Ala Tyr Glu Gln Glu His Leu Pro 545 550
555 560 Ala Thr Ile Met Ala Gly Thr Ser Lys Lys Pro Arg
Met Ile Glu Val 565 570
575 Ala Ala Pro Gly Val Ala Ile Glu Ser 580
585
86 589 PRT Vitis vinifera
86 Met Cys Gly Ile Leu Ala Val Leu Gly Cys
Ser Asp Asp Ser Gln Ala 1 5 10
15 Lys Arg Val Arg Leu Phe Tyr His Cys Tyr Leu Cys Phe Cys Asp
Arg 20 25 30 Leu
Lys His Arg Gly Pro Asp Trp Ser Gly Leu Tyr Gln His Gly Asp 35
40 45 Cys Tyr Leu Ala His Gln
Arg Leu Ala Ile Ile Asp Pro Ala Ser Gly 50 55
60 Asp Gln Pro Leu Tyr Asn Glu Asn Gln Ala Ile
Val Val Thr Val Asn 65 70 75
80 Gly Glu Ile Tyr Asn His Glu Glu Leu Arg Lys Ser Met Pro Asn His
85 90 95 Lys Phe
Arg Thr Gly Ser Asp Cys Asp Val Ile Ala His Leu Tyr Glu 100
105 110 Glu His Gly Glu Asn Phe Val
Asp Met Leu Asp Gly Met Phe Ser Phe 115 120
125 Val Leu Leu Asp Thr Arg Asp Asp Ser Phe Ile Val
Ala Arg Asp Ala 130 135 140
Ile Gly Ile Thr Ser Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Ser 145
150 155 160 Val Trp Ile
Ser Ser Glu Leu Lys Gly Leu Asn Asp Asp Cys Glu His 165
170 175 Phe Glu Ser Phe Pro Pro Gly His
Met Tyr Ser Ser Lys Glu Gly Gly 180 185
190 Phe Lys Arg Trp Tyr Asn Pro Pro Trp Phe Ser Glu Ala
Ile Pro Ser 195 200 205
Ala Pro Tyr Asp Pro Leu Val Leu Arg Arg Ala Phe Glu Asn Ala Val 210
215 220 Ile Lys Arg Leu
Met Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly 225 230
235 240 Gly Leu Asp Ser Ser Leu Val Ala Ser
Ile Thr Ala Arg His Leu Ala 245 250
255 Gly Thr Lys Ala Ala Lys Gln Trp Gly Ala Gln Leu His Ser
Phe Cys 260 265 270
Val Gly Leu Glu Gly Ser Pro Asp Leu Lys Ala Ala Lys Glu Val Ala
275 280 285 Asp Tyr Leu Gly
Thr Val His His Glu Phe His Phe Thr Val Gln Asp 290
295 300 Gly Ile Asp Ala Ile Glu Asp Val
Ile Tyr His Ile Glu Thr Tyr Asp 305 310
315 320 Val Thr Thr Ile Arg Ala Ser Thr Pro Met Phe Leu
Met Ser Arg Lys 325 330
335 Ile Lys Ser Leu Gly Val Lys Met Val Ile Ser Gly Glu Gly Ser Asp
340 345 350 Glu Ile Phe
Gly Gly Tyr Leu Tyr Phe His Lys Ala Pro Asn Lys Glu 355
360 365 Glu Phe His Arg Glu Thr Cys Arg
Lys Ile Lys Ala Leu Tyr Gln Tyr 370 375
380 Asp Cys Leu Arg Ala Asn Lys Ser Thr Ser Ala Trp Gly
Leu Glu Ala 385 390 395
400 Arg Val Pro Phe Leu Asp Lys Glu Phe Ile Lys Val Ala Met Asp Ile
405 410 415 Asp Pro Glu Trp
Lys Met Ile Lys Pro Glu Gln Gly Arg Ile Glu Lys 420
425 430 Trp Val Leu Arg Arg Ala Phe Asp Asp
Glu Glu Gln Pro Tyr Leu Pro 435 440
445 Lys His Ile Leu Tyr Arg Gln Lys Glu Gln Phe Ser Asp Gly
Val Gly 450 455 460
Tyr Ser Trp Ile Asp Gly Leu Lys Ala His Ala Ser Gln His Val Thr 465
470 475 480 Asp Lys Met Met Leu
Asn Ala Ser His Ile Phe Pro His Asn Thr Pro 485
490 495 Thr Thr Lys Glu Ala Tyr Tyr Tyr Arg Met
Ile Phe Glu Arg Phe Phe 500 505
510 Pro Gln Asn Ser Ala Arg Leu Thr Val Pro Gly Gly Ala Ser Val
Ala 515 520 525 Cys
Ser Thr Ala Lys Ala Val Glu Trp Asp Ser Ala Trp Ser Asn Asn 530
535 540 Leu Asp Pro Ser Gly Arg
Ala Ala Leu Gly Val His Leu Ser Ala Tyr 545 550
555 560 Asp Gln Lys Leu Thr Thr Val Ser Ala Ala Asn
Val Pro Thr Lys Ile 565 570
575 Ile Asp Asn Met Pro Arg Ile Met Glu Val Thr Ala Pro
580 585
87 578 PRT Volvox carteri
87 Met Cys
Gly Ile Leu Ala Val Leu Asn Ser Thr Asp Asp Ser Pro Ala 1 5
10 15 Met Arg Ala Lys Val Leu Ala
Leu Ser Arg Arg Gln Lys His Arg Gly 20 25
30 Pro Asp Trp Ser Gly Met His Gln Phe Gly Asn Asn
Phe Leu Ala His 35 40 45
Glu Arg Leu Ala Ile Met Asp Pro Ser Ser Gly Asp Gln Pro Leu Tyr
50 55 60 Asn Glu Asp
Lys Ser Ile Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65
70 75 80 Tyr Lys Glu Leu Arg Lys Glu
Ile Ser Asp Lys Cys Pro Gly Lys Lys 85
90 95 Phe Arg Thr Asn Ser Asp Cys Glu Val Ile Ser
His Leu Tyr Glu Leu 100 105
110 Tyr Gly Glu Ala Val Ala Asn Lys Leu Asp Gly Phe Phe Ala Phe
Val 115 120 125 Leu
Leu Asp Thr Arg Asn Asn Thr Phe Phe Ala Ala Arg Asp Pro Leu 130
135 140 Gly Val Thr Cys Met Tyr
Ile Gly Trp Gly Arg Asp Gly Ser Val Trp 145 150
155 160 Leu Ser Ser Glu Met Lys Cys Leu Lys Asp Asp
Cys Ala Arg Phe Gln 165 170
175 Gln Phe Pro Pro Gly His Tyr Tyr Ser Ser Lys Thr Gly Glu Phe Val
180 185 190 Arg Tyr
Phe Asn Pro Gln Phe Tyr Leu Asp Phe Glu Ala Glu Pro Gln 195
200 205 Val Phe Pro Ser Val Pro Tyr
Asp Pro Val Thr Leu Arg Thr Ala Phe 210 215
220 Glu Ala Ala Val Glu Lys Arg Met Met Ser Asp Val
Pro Phe Gly Val 225 230 235
240 Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu Val Ala Ser Ile Ala Ala
245 250 255 Arg Lys Ile
Lys Arg Glu Gly Ser Val Trp Gly Lys Leu His Ser Phe 260
265 270 Cys Val Gly Leu Glu Gly Ser Pro
Asp Leu Lys Ala Gly Ala Ala Val 275 280
285 Ala Glu Phe Leu Gly Thr Asp His His Glu Phe His Phe
Thr Val Gln 290 295 300
Glu Gly Ile Asp Ala Ile Ser Glu Val Ile Tyr His Ile Glu Thr Phe 305
310 315 320 Asp Val Thr Thr
Ile Arg Ala Ser Thr Pro Met Phe Leu Met Ser Arg 325
330 335 Lys Ile Lys Ala Leu Gly Val Lys Met
Val Leu Ser Gly Glu Gly Ser 340 345
350 Asp Glu Val Phe Gly Gly Tyr Leu Tyr Phe His Lys Ala Pro
Ser Lys 355 360 365
Asp Glu Phe His Ser Glu Thr Val Arg Lys Leu Lys Asp Leu Phe Lys 370
375 380 Tyr Asp Cys Leu Arg
Ala Asn Lys Ala Thr Met Ala Trp Gly Val Glu 385 390
395 400 Ala Arg Val Pro Phe Leu Asp Arg Ala Phe
Leu Asp Val Ala Met Ser 405 410
415 Ile Asp Pro Ala Glu Lys Met Ile Asp Lys Ser Lys Gly Arg Ile
Glu 420 425 430 Lys
Tyr Ile Leu Arg Lys Ala Phe Asp Thr Pro Glu Asp Pro Tyr Leu 435
440 445 Pro Lys Glu Val Leu Trp
Arg Gln Lys Glu Gln Phe Ser Asp Gly Val 450 455
460 Gly Tyr Asn Trp Ile Asp Gly Leu Lys Ala His
Ala Glu Ser Gln Val 465 470 475
480 Ser Asp Glu Met Leu Lys Asn Ala Val His Arg Phe Pro Asp Asn Thr
485 490 495 Pro Arg
Thr Lys Glu Ala Tyr Trp Tyr Arg Ser Ile Phe Glu Ser His 500
505 510 Phe Pro Gln Arg Ala Ala Met
Glu Thr Val Pro Gly Gly Pro Ser Val 515 520
525 Ala Cys Ser Thr Ala Thr Ala Ala Leu Trp Asp Ala
Ala Trp Ala Gly 530 535 540
Lys Glu Asp Pro Ser Gly Arg Ala Val Ala Gly Val His Asp Ala Ala 545
550 555 560 Tyr Glu Glu
Gly Ala Glu Ala Asn Gly Glu Pro Ala Ser Lys Lys Gln 565
570 575 Lys Val
88 588 PRT Zea mays
88 Met
Cys Gly Ile Leu Ala Val Leu Gly Cys Ser Asp Trp Ser Gln Ala 1
5 10 15 Lys Arg Ala Arg Ile Leu
Ala Cys Ser Arg Arg Leu Lys His Arg Gly 20
25 30 Pro Asp Trp Ser Gly Leu Tyr Gln His Glu
Gly Asn Phe Leu Ala Gln 35 40
45 Gln Arg Leu Ala Val Val Ser Pro Leu Ser Gly Asp Gln Pro
Leu Phe 50 55 60
Asn Glu Asp Arg Thr Val Val Val Val Ala Asn Gly Glu Ile Tyr Asn 65
70 75 80 His Lys Asn Val Arg
Lys Gln Phe Thr Gly Thr His Asn Phe Ser Thr 85
90 95 Gly Ser Asp Cys Glu Val Ile Ile Pro Leu
Tyr Glu Lys Tyr Gly Glu 100 105
110 Asn Phe Val Asp Met Leu Asp Gly Val Phe Ala Phe Val Leu Tyr
Asp 115 120 125 Thr
Arg Asp Arg Thr Tyr Val Ala Ala Arg Asp Ala Ile Gly Val Asn 130
135 140 Pro Leu Tyr Ile Gly Trp
Gly Ser Asp Gly Ser Val Trp Ile Ala Ser 145 150
155 160 Glu Met Lys Ala Leu Asn Glu Asp Cys Val Arg
Phe Glu Ile Phe Pro 165 170
175 Pro Gly His Leu Tyr Ser Ser Ala Gly Gly Gly Phe Arg Arg Trp Tyr
180 185 190 Thr Pro
His Trp Phe Gln Glu Gln Val Pro Arg Met Pro Tyr Gln Pro 195
200 205 Leu Val Leu Arg Glu Ala Phe
Glu Lys Ala Val Ile Lys Arg Leu Met 210 215
220 Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly
Leu Asp Ser Ser 225 230 235
240 Leu Val Ala Ser Val Thr Lys Arg His Leu Val Glu Thr Glu Ala Ala
245 250 255 Glu Lys Phe
Gly Thr Glu Leu His Ser Phe Val Val Gly Leu Glu Gly 260
265 270 Ser Pro Asp Leu Lys Ala Ala Arg
Glu Val Ala Asp Tyr Leu Gly Thr 275 280
285 Ile His His Glu Phe His Phe Thr Val Gln Asp Gly Ile
Asp Ala Ile 290 295 300
Glu Glu Val Ile Tyr His Asp Glu Thr Tyr Asp Val Thr Thr Ile Arg 305
310 315 320 Ala Ser Thr Pro
Met Phe Leu Met Ala Arg Lys Ile Lys Ser Leu Gly 325
330 335 Val Lys Met Val Leu Ser Gly Glu Gly
Ser Asp Glu Leu Leu Gly Gly 340 345
350 Tyr Leu Tyr Phe His Phe Ala Pro Asn Lys Glu Glu Phe His
Arg Glu 355 360 365
Thr Cys Arg Lys Val Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala 370
375 380 Asn Lys Ala Thr Ser
Ala Trp Gly Leu Glu Val Arg Val Pro Phe Leu 385 390
395 400 Asp Lys Glu Phe Ile Asn Val Ala Met Gly
Met Asp Pro Glu Trp Lys 405 410
415 Met Tyr Asp Lys Asn Leu Gly Arg Ile Glu Lys Trp Val Met Arg
Lys 420 425 430 Ala
Phe Asp Asp Asp Glu His Pro Tyr Leu Pro Lys His Ile Leu Tyr 435
440 445 Arg Gln Lys Glu Gln Phe
Ser Asp Gly Val Gly Tyr Asn Trp Ile Asp 450 455
460 Gly Leu Lys Ser Phe Thr Glu Gln Gln Val Thr
Asp Glu Met Met Asn 465 470 475
480 Asn Ala Ala Gln Met Phe Pro Tyr Asn Thr Pro Val Asn Lys Glu Ala
485 490 495 Tyr Tyr
Tyr Arg Met Ile Phe Glu Arg Leu Phe Pro Gln Asp Ser Ala 500
505 510 Arg Glu Thr Val Pro Trp Gly
Pro Ser Ile Ala Cys Ser Thr Pro Ala 515 520
525 Ala Ile Glu Trp Val Glu Gln Trp Lys Ala Ser Asn
Asp Pro Ser Gly 530 535 540
Arg Phe Ile Ser Ser His Asp Ser Ala Ala Thr Asp His Thr Gly Gly 545
550 555 560 Lys Pro Ala
Val Ala Asn Gly Gly Gly His Gly Ala Ala Asn Gly Thr 565
570 575 Val Asn Gly Lys Asp Val Ala Val
Ala Ile Ala Val 580 585
89 586 PRT Zea mays
89 Met Cys Gly Ile Leu Ala Val Leu Gly Val Val Glu Val
Ser Leu Ala 1 5 10 15
Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg Leu Arg His Arg Gly
20 25 30 Pro Asp Trp Ser
Gly Leu His Cys His Glu Asp Cys Tyr Leu Ala His 35
40 45 Gln Arg Leu Ala Ile Ile Asp Pro Thr
Ser Gly Asp Gln Pro Leu Tyr 50 55
60 Asn Glu Asp Lys Thr Val Val Val Thr Val Asn Gly Glu
Ile Tyr Asn 65 70 75
80 His Glu Glu Leu Lys Ala Lys Leu Lys Thr His Glu Phe Gln Thr Gly
85 90 95 Ser Asp Cys Glu
Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Glu 100
105 110 Phe Val Asp Met Leu Asp Gly Met Phe
Ser Phe Val Leu Leu Asp Thr 115 120
125 Arg Asp Lys Ser Phe Ile Ala Ala Arg Asp Ala Ile Gly Ile
Cys Pro 130 135 140
Leu Tyr Met Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ser Ser Glu 145
150 155 160 Met Lys Ala Leu Ser
Asp Asp Cys Glu Arg Phe Ile Thr Phe Pro Pro 165
170 175 Gly His Leu Tyr Ser Ser Lys Thr Gly Gly
Leu Arg Arg Trp Tyr Asn 180 185
190 Pro Pro Trp Phe Ser Glu Thr Val Pro Ser Thr Pro Tyr Asn Ala
Leu 195 200 205 Phe
Leu Arg Glu Met Phe Glu Lys Ala Val Ile Lys Arg Leu Met Thr 210
215 220 Asp Val Pro Phe Gly Val
Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225 230
235 240 Val Ala Ser Val Ala Ser Arg His Leu Asn Glu
Thr Lys Val Asp Arg 245 250
255 Gln Trp Gly Asn Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly Ser
260 265 270 Pro Asp
Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Ser Thr Val 275
280 285 His His Glu Phe His Phe Thr
Val Gln Glu Gly Ile Asp Ala Leu Glu 290 295
300 Glu Val Ile Tyr His Ile Glu Thr Tyr Asp Val Thr
Thr Ile Arg Ala 305 310 315
320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly Val
325 330 335 Lys Met Val
Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly Tyr 340
345 350 Leu Tyr Phe His Lys Ala Pro Asn
Lys Lys Glu Phe Leu Glu Glu Thr 355 360
365 Cys Arg Lys Ile Lys Ala Leu His Leu Tyr Asp Cys Leu
Arg Ala Asn 370 375 380
Lys Ala Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu Asp 385
390 395 400 Lys Ser Phe Ile
Ser Val Ala Met Asp Ile Asp Pro Glu Trp Asn Met 405
410 415 Ile Lys Arg Asp Leu Gly Arg Ile Glu
Lys Trp Val Met Arg Lys Ala 420 425
430 Phe Asp Asp Asp Glu His Pro Tyr Leu Pro Lys His Ile Leu
Tyr Arg 435 440 445
Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Asn Trp Ile Asp Gly 450
455 460 Leu Lys Ser Phe Thr
Glu Gln Gln Val Thr Asp Glu Met Met Asn Asn 465 470
475 480 Ala Ala Gln Met Phe Pro Tyr Asn Thr Pro
Val Asn Lys Glu Ala Tyr 485 490
495 Tyr Tyr Arg Met Ile Phe Glu Arg Leu Phe Pro Gln Asp Ser Ala
Arg 500 505 510 Glu
Thr Val Pro Trp Gly Pro Ser Ile Ala Cys Ser Thr Pro Ala Ala 515
520 525 Ile Glu Trp Val Glu Gln
Trp Lys Ala Ser Asn Asp Pro Ser Gly Arg 530 535
540 Phe Ile Ser Ser His Asp Ser Ala Ala Thr Asp
His Thr Ala Val Ser 545 550 555
560 Arg Arg Trp Pro Thr Ala Ala Ala Arg Pro Ala Asn Gly Thr Val Asn
565 570 575 Gly Lys
Asp Val Pro Val Pro Ile Ala Val 580 585
90 606 PRT Zea mays
90 Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ala Asp Glu
Ala Lys Gly 1 5 10 15
Ser Ser Lys Arg Ser Arg Val Leu Glu Leu Ser Arg Arg Leu Lys His
20 25 30 Arg Gly Pro Asp
Trp Ser Gly Leu Arg Gln Val Gly Asp Cys Tyr Leu 35
40 45 Ser His Gln Arg Leu Ala Ile Ile Asp
Pro Ala Ser Gly Asp Gln Pro 50 55
60 Leu Tyr Asn Glu Asp Gln Ser Val Val Val Ala Val Asn
Gly Glu Ile 65 70 75
80 Tyr Asn His Leu Asp Leu Arg Ser Arg Leu Ala Gly Ala Gly His Ser
85 90 95 Phe Arg Thr Gly
Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu 100
105 110 His Gly Glu Glu Phe Val Asp Met Leu
Asp Gly Val Phe Ser Phe Val 115 120
125 Leu Leu Asp Thr Arg His Gly Asp Arg Ala Gly Ser Ser Phe
Phe Met 130 135 140
Ala Ala Arg Asp Ala Ile Gly Val Thr Pro Leu Tyr Ile Gly Trp Gly 145
150 155 160 Val Asp Gly Ser Val
Trp Ile Ser Ser Glu Met Lys Ala Leu His Asp 165
170 175 Glu Cys Glu His Phe Glu Ile Phe Pro Pro
Gly His Leu Tyr Ser Ser 180 185
190 Asn Thr Gly Gly Phe Ser Arg Trp Tyr Asn Pro Pro Trp Tyr Asp
Asp 195 200 205 Asp
Asp Asp Glu Glu Ala Val Val Thr Pro Ser Val Pro Tyr Asp Pro 210
215 220 Leu Ala Leu Arg Lys Ala
Phe Glu Lys Ala Val Val Lys Arg Leu Met 225 230
235 240 Thr Asp Val Pro Phe Gly Val Leu Leu Ser Gly
Gly Leu Asp Ser Ser 245 250
255 Leu Val Ala Thr Val Ala Val Arg His Leu Ala Arg Thr Glu Ala Ala
260 265 270 Arg Arg
Trp Gly Thr Lys Leu His Ser Phe Cys Val Gly Leu Glu Gly 275
280 285 Ser Pro Asp Leu Lys Ala Ala
Arg Glu Val Ala Glu Tyr Leu Gly Thr 290 295
300 Leu His His Glu Phe His Phe Thr Val Gln Asp Gly
Ile Asp Ala Ile 305 310 315
320 Glu Asp Val Ile Tyr His Thr Glu Thr Tyr Asp Val Thr Thr Ile Arg
325 330 335 Ala Ser Thr
Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly 340
345 350 Val Lys Met Val Ile Ser Gly Glu
Gly Ser Asp Glu Leu Phe Gly Gly 355 360
365 Tyr Leu Tyr Phe His Lys Ala Pro Asn Lys Glu Glu Leu
His Arg Glu 370 375 380
Thr Cys Arg Lys Val Lys Ala Leu His Gln Tyr Asp Cys Leu Arg Ala 385
390 395 400 Asn Lys Ala Thr
Ser Ala Trp Gly Leu Glu Ala Arg Val Pro Phe Leu 405
410 415 Asp Lys Glu Phe Ile Asn Ala Ala Met
Ser Ile Asp Pro Glu Trp Lys 420 425
430 Met Val Gln Pro Asp Leu Gly Arg Ile Glu Lys Trp Val Leu
Arg Lys 435 440 445
Ala Phe Asp Asp Glu Glu Gln Pro Phe Leu Pro Lys His Ile Leu Tyr 450
455 460 Arg Gln Lys Glu Gln
Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp 465 470
475 480 Gly Leu Lys Ala His Ala Thr Ser Asn Val
Thr Asp Lys Met Leu Ser 485 490
495 Asn Ala Lys Phe Ile Phe Pro His Asn Thr Pro Thr Thr Lys Glu
Ala 500 505 510 Tyr
Tyr Tyr Arg Met Val Phe Glu Arg Phe Phe Pro Gln Lys Ser Ala 515
520 525 Ile Leu Thr Val Pro Gly
Gly Pro Ser Val Ala Cys Ser Thr Ala Lys 530 535
540 Ala Ile Glu Trp Asp Ala Gln Trp Ser Gly Asn
Leu Asp Pro Ser Gly 545 550 555
560 Arg Ala Ala Leu Gly Val His Leu Ala Ala Tyr Glu His Gln His Asp
565 570 575 Pro Glu
His Val Pro Ala Ala Ile Ala Ala Gly Ser Gly Lys Lys Pro 580
585 590 Arg Thr Ile Arg Val Ala Pro
Pro Gly Val Ala Ile Glu Gly 595 600
605
91 606 PRT Zea mays
91 Met Cys Gly Ile Leu Ala Val Leu Gly Cys Ser
Asp Cys Ser Gln Ala 1 5 10
15 Arg Arg Ala Arg Ile Leu Ala Cys Ser Arg Arg Leu Lys His Arg Gly
20 25 30 Pro Asp
Trp Ser Gly Leu Tyr Gln His Glu Gly Asn Phe Leu Ala Gln 35
40 45 Gln Arg Leu Ala Ile Val Ser
Pro Leu Ser Gly Asp Gln Pro Leu Phe 50 55
60 Asn Glu Asp Arg Thr Val Val Val Val Ala Asn Gly
Glu Ile Tyr Asn 65 70 75
80 His Lys Asn Val Arg Lys Gln Phe Thr Gly Ala His Ser Phe Ser Thr
85 90 95 Gly Ser Asp
Cys Glu Val Ile Ile Pro Leu Tyr Glu Lys Tyr Gly Glu 100
105 110 Asn Phe Val Asp Met Leu Asp Gly
Val Phe Ala Phe Val Leu Tyr Asp 115 120
125 Thr Arg Asp Arg Thr Tyr Val Ala Ala Arg Asp Ala Ile
Gly Val Asn 130 135 140
Pro Leu Tyr Ile Gly Trp Gly Ser Asp Gly Ser Val Trp Met Ser Ser 145
150 155 160 Glu Met Lys Ala
Leu Asn Glu Asp Cys Val Arg Phe Glu Ile Phe Pro 165
170 175 Pro Gly His Leu Tyr Ser Ser Ala Ala
Gly Gly Phe Arg Arg Trp Tyr 180 185
190 Thr Pro His Trp Phe Gln Glu Gln Val Pro Arg Thr Pro Tyr
Gln Pro 195 200 205
Leu Val Leu Arg Glu Ala Phe Glu Lys Ala Val Ile Lys Arg Leu Met 210
215 220 Thr Asp Val Pro Phe
Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser 225 230
235 240 Leu Val Ala Ser Val Thr Lys Arg His Leu
Val Lys Thr Asp Ala Ala 245 250
255 Gly Lys Phe Gly Thr Glu Leu His Ser Phe Val Val Gly Leu Glu
Gly 260 265 270 Ser
Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Gly Thr 275
280 285 Thr His His Glu Phe His
Phe Thr Val Gln Asp Gly Ile Asp Ala Ile 290 295
300 Glu Glu Val Ile Tyr His Asp Glu Thr Tyr Asp
Val Thr Thr Ile Arg 305 310 315
320 Ala Ser Thr Pro Met Phe Leu Met Ala Arg Lys Ile Lys Ser Leu Gly
325 330 335 Val Lys
Met Val Leu Ser Gly Glu Gly Ser Asp Glu Leu Leu Gly Gly 340
345 350 Tyr Leu Tyr Phe His Phe Ala
Pro Asn Arg Glu Glu Leu His Arg Glu 355 360
365 Thr Cys Arg Lys Val Lys Ala Leu His Gln Tyr Asp
Cys Leu Arg Ala 370 375 380
Asn Lys Ala Thr Ser Ala Trp Gly Leu Glu Val Arg Val Pro Phe Leu 385
390 395 400 Asp Lys Glu
Phe Val Asp Val Ala Met Gly Met Asp Pro Glu Trp Lys 405
410 415 Met Tyr Asp Lys Asn Leu Gly Arg
Ile Glu Lys Trp Val Leu Arg Lys 420 425
430 Ala Phe Asp Asp Glu Glu His Pro Tyr Leu Pro Glu His
Ile Leu Tyr 435 440 445
Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Asn Trp Ile Asp 450
455 460 Gly Leu Lys Ala
Phe Thr Glu Gln Gln Val Asp Gly Arg Arg Arg Ser 465 470
475 480 Leu Thr Ser Ala Asp Val Pro Pro His
Val Gln Val Thr Asp Glu Met 485 490
495 Met Asn Ser Ala Ala Gln Met Phe Pro Tyr Asn Thr Pro Val
Asn Lys 500 505 510
Glu Ala Tyr Tyr Tyr Arg Met Ile Phe Glu Arg Leu Phe Pro Gln Asp
515 520 525 Ser Ala Arg Glu
Thr Val Pro Trp Gly Pro Ser Ile Ala Cys Ser Thr 530
535 540 Pro Ala Ala Ile Glu Trp Val Glu
Gln Trp Lys Ala Ser Asn Asp Pro 545 550
555 560 Ser Gly Arg Phe Ile Ser Ser His Asp Ser Ala Ala
Thr Asp Arg Thr 565 570
575 Gly Asp Lys Leu Ala Val Val Asn Gly Asp Gly His Gly Ala Ala Asn
580 585 590 Gly Thr Val
Asn Gly Asn Asp Val Ala Val Ala Ile Ala Val 595
600 605
92 591 PRT Zea mays
92 Met Cys Gly Ile Leu Ala
Val Leu Gly Val Ala Glu Val Ser Leu Ala 1 5
10 15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg Arg
Leu Arg His Arg Gly 20 25
30 Pro Asp Trp Ser Gly Leu His Cys His Glu Asp Cys Tyr Leu Ala
His 35 40 45 Gln
Arg Leu Ala Ile Ile Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50
55 60 Asn Glu Asp Lys Thr Val
Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70
75 80 His Glu Glu Leu Lys Ala Lys Leu Lys Thr His
Glu Phe Gln Thr Gly 85 90
95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Glu
100 105 110 Phe Val
Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115
120 125 Arg Asp Lys Ser Phe Ile Ala
Ala Arg Asp Ala Ile Gly Ile Cys Pro 130 135
140 Leu Tyr Met Gly Trp Gly Leu Asp Gly Ser Val Trp
Phe Ser Ser Glu 145 150 155
160 Met Lys Ala Leu Ser Asp Asp Cys Glu Arg Phe Ile Thr Phe Pro Pro
165 170 175 Gly His Leu
Tyr Ser Ser Lys Thr Gly Gly Leu Arg Arg Trp Tyr Asn 180
185 190 Pro Pro Trp Phe Ser Glu Thr Val
Pro Ser Thr Pro Tyr Asn Ala Leu 195 200
205 Phe Leu Arg Glu Met Phe Glu Lys Ala Val Ile Lys Arg
Leu Met Thr 210 215 220
Asp Val Pro Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225
230 235 240 Val Ala Ser Val
Ala Ser Arg His Phe Asn Glu Thr Lys Gly Asp Arg 245
250 255 Gln Trp Gly Asn Lys Leu His Thr Phe
Cys Ile Gly Leu Lys Gly Ser 260 265
270 Pro Asp Leu Lys Ala Ala Arg Glu Val Ala Asp Tyr Leu Ser
Thr Val 275 280 285
His His Glu Phe His Phe Thr Val Gln Glu Gly Ile Asp Ala Leu Glu 290
295 300 Glu Val Ile Tyr His
Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310
315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys
Ile Lys Ser Leu Gly Val 325 330
335 Lys Met Val Ile Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly
Tyr 340 345 350 Leu
Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Phe His Glu Glu Thr 355
360 365 Cys Arg Lys Ile Lys Ala
Leu His Leu Tyr Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ala Thr Ser Ala Trp Gly Val Glu Ala Arg
Val Pro Phe Leu Asp 385 390 395
400 Lys Ser Phe Ile Ser Val Ala Met Asp Ile Asp Pro Asp Trp Lys Met
405 410 415 Ile Lys
Arg Asp Leu Gly Arg Ile Glu Lys Trp Val Ile Arg Asn Ala 420
425 430 Phe Asp Asp Asp Glu Arg Pro
Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440
445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser
Trp Ile Asp Gly 450 455 460
Leu Lys Asp His Ala Ser Gln His Val Ser Asp Ser Met Met Met Asn 465
470 475 480 Ala Gly Phe
Val Tyr Pro Glu Asn Thr Pro Thr Thr Lys Glu Gly Tyr 485
490 495 Tyr Tyr Arg Met Ile Phe Glu Lys
Phe Phe Pro Lys Pro Ala Ala Arg 500 505
510 Ser Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr
Ala Lys Ala 515 520 525
Val Glu Trp Asp Ala Ser Trp Ser Lys Asn Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly
Val His Asp Ala Ala Tyr Glu Asp Thr Ala Gly Lys 545 550
555 560 Thr Pro Ala Ser Ala Asp Pro Val Ser
Asp Lys Gly Leu Arg Pro Ala 565 570
575 Ile Gly Glu Ser Leu Gly Thr Pro Val Ala Ser Ala Thr Ala
Val 580 585 590
93 580 PRT Brassica napus
93 Met Cys Gly Ile Leu Ala Val Leu Gly Cys Val Asp
Asn Ser Gln Ala 1 5 10
15 Thr Arg Ser Arg Ile Ile Lys Leu Ser Arg Arg Leu Arg His Arg Gly
20 25 30 Pro Asp Trp
Ser Gly Leu His Cys Tyr Glu Asp Cys Tyr Leu Ala His 35
40 45 Glu Arg Leu Ala Ile Ile Asp Pro
Ile Ser Gly Asp Gln Pro Leu Tyr 50 55
60 Ser Glu Asp Lys Thr Val Val Val Thr Val Asn Gly Glu
Ile Tyr Asn 65 70 75
80 His Lys Ala Leu Arg Glu Ser Glu Ser Leu Lys Ser His Lys Tyr His
85 90 95 Thr Gly Ser Asp
Cys Glu Val Leu Ala His Leu Tyr Glu Glu His Gly 100
105 110 Glu Glu Phe Ile Asn Met Leu Asp Gly
Met Phe Ala Phe Val Leu Leu 115 120
125 Asp Thr Lys Asp Lys Ser Tyr Ile Ala Val Arg Asp Ala Ile
Gly Val 130 135 140
Ile Pro Leu Tyr Ile Gly Trp Gly Leu Asp Gly Ser Val Trp Phe Ala 145
150 155 160 Ser Glu Met Lys Ala
Leu Ser Asp Asp Cys Glu Gln Phe Met Ala Phe 165
170 175 Pro Pro Gly His Ile Tyr Ser Ser Lys Gln
Gly Gly Leu Arg Arg Trp 180 185
190 Tyr Asn Pro Pro Trp Phe Ser Glu Leu Val Pro Ser Thr Pro Tyr
Asp 195 200 205 Pro
Leu Val Leu Arg Asp Thr Phe Glu Lys Ala Val Ile Lys Arg Leu 210
215 220 Met Thr Asp Val Pro Phe
Gly Val Leu Leu Ser Gly Gly Leu Asp Ser 225 230
235 240 Ser Leu Val Ala Ser Val Ala Ile Arg His Leu
Glu Lys Ser Asp Ala 245 250
255 Arg Gln Trp Gly Ser Lys Leu His Thr Phe Cys Ile Gly Leu Lys Gly
260 265 270 Ser Pro
Asp Leu Lys Ala Gly Lys Glu Val Ala Asp Tyr Leu Gly Thr 275
280 285 Arg His His Glu Leu His Phe
Thr Val Gln Glu Gly Ile Asp Ala Ile 290 295
300 Glu Glu Val Ile Tyr His Val Glu Thr Tyr Asp Val
Thr Thr Ile Arg 305 310 315
320 Ala Ser Thr Pro Met Phe Leu Met Ser Arg Lys Ile Lys Ser Leu Gly
325 330 335 Val Lys Met
Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly 340
345 350 Tyr Leu Tyr Phe His Lys Ala Pro
Asn Lys Lys Glu Leu His Glu Glu 355 360
365 Thr Cys Arg Lys Ile Lys Ala Leu Tyr Gln Tyr Asp Cys
Leu Arg Ala 370 375 380
Asn Lys Ser Thr Ser Ala Trp Gly Val Glu Ala Arg Val Pro Phe Leu 385
390 395 400 Asp Lys Ala Phe
Leu Asp Val Ala Met Gly Ile Asp Pro Glu Trp Lys 405
410 415 Met Ile Arg Pro Asp Leu Gly Arg Ile
Glu Lys Trp Val Leu Arg Asn 420 425
430 Ala Phe Asp Asp Glu Lys Asn Pro Tyr Leu Pro Lys His Ile
Leu Tyr 435 440 445
Arg Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser Trp Ile Asp 450
455 460 Gly Leu Lys Asp His
Ala Asn Lys His Val Ser Asp Ala Met Leu Thr 465 470
475 480 Asn Ala Asn Phe Val Phe Pro Glu Asn Thr
Pro Leu Thr Lys Glu Ala 485 490
495 Tyr Tyr Tyr Arg Ala Ile Phe Glu Lys Phe Phe Pro Lys Ser Ala
Ala 500 505 510 Arg
Ala Thr Val Pro Gly Gly Pro Ser Val Ala Cys Ser Thr Ala Lys 515
520 525 Ala Val Glu Trp Asp Ala
Ala Trp Lys Gly Asn Leu Asp Pro Ser Gly 530 535
540 Arg Ala Ala Leu Gly Val His Val Ala Ala Tyr
Glu Gly Asp Lys Ala 545 550 555
560 Glu Asp Pro Arg Pro Glu Lys Val Gln Lys Leu Ala Glu Lys Thr Ala
565 570 575 Glu Ala
Ile Val 580
94 589 PRT Triticum aestivum
94 Met Cys Gly Ile Leu
Ala Val Leu Gly Val Gly Asp Val Ser Leu Ala 1 5
10 15 Lys Arg Ser Arg Ile Ile Glu Leu Ser Arg
Arg Leu Arg His Arg Gly 20 25
30 Pro Asp Trp Ser Gly Ile His Ser Phe Glu Asp Cys Tyr Leu Ala
His 35 40 45 Gln
Arg Leu Ala Ile Val Asp Pro Thr Ser Gly Asp Gln Pro Leu Tyr 50
55 60 Asn Glu Asp Lys Thr Val
Val Val Thr Val Asn Gly Glu Ile Tyr Asn 65 70
75 80 His Glu Glu Leu Lys Ala Lys Leu Lys Ser His
Gln Phe Gln Thr Gly 85 90
95 Ser Asp Cys Glu Val Ile Ala His Leu Tyr Glu Glu Tyr Gly Glu Glu
100 105 110 Phe Val
Asp Met Leu Asp Gly Met Phe Ser Phe Val Leu Leu Asp Thr 115
120 125 Arg Asp Lys Ser Phe Ile Ala
Ala Arg Asp Ala Ile Gly Ile Cys Pro 130 135
140 Leu Tyr Met Gly Trp Gly Leu Asp Gly Ser Val Trp
Phe Ser Ser Glu 145 150 155
160 Met Lys Ala Leu Ser Asp Asp Cys Glu Arg Phe Ile Ser Phe Pro Pro
165 170 175 Gly His Leu
Tyr Ser Ser Lys Thr Gly Gly Leu Arg Arg Trp Tyr Asn 180
185 190 Pro Pro Trp Phe Ser Glu Ser Ile
Pro Ser Ala Pro Tyr Asp Pro Leu 195 200
205 Leu Ile Arg Glu Ser Ile Glu Lys Ala Ala Ile Lys Arg
Leu Met Thr 210 215 220
Asp Val Thr Phe Gly Val Leu Leu Ser Gly Gly Leu Asp Ser Ser Leu 225
230 235 240 Val Ala Ser Val
Val Ser Arg Tyr Leu Ala Glu Thr Lys Val Ala Arg 245
250 255 Gln Trp Arg Asn Lys Leu His Thr Phe
Cys Ile Gly Met Lys Gly Ser 260 265
270 Pro Asp Leu Lys Ala Ala Lys Glu Val Ala Asp Tyr Leu Gly
Thr Val 275 280 285
His His Glu Leu His Phe Thr Val Gln Glu Gly Ile Asp Ala Leu Glu 290
295 300 Glu Val Ile Tyr His
Ile Glu Thr Tyr Asp Val Thr Thr Ile Arg Ala 305 310
315 320 Ser Thr Pro Met Phe Leu Met Ser Arg Lys
Ile Lys Ser Leu Gly Val 325 330
335 Lys Met Val Leu Ser Gly Glu Gly Ser Asp Glu Ile Phe Gly Gly
Tyr 340 345 350 Leu
Tyr Phe His Lys Ala Pro Asn Lys Lys Glu Leu His Glu Glu Thr 355
360 365 Cys Arg Lys Ile Lys Ala
Leu His Leu Tyr Asp Cys Leu Arg Ala Asn 370 375
380 Lys Ala Thr Ser Ala Trp Gly Leu Glu Ala Arg
Val Pro Phe Leu Asp 385 390 395
400 Lys Asn Phe Ile Asn Val Ala Met Asp Leu Asp Pro Glu Cys Lys Met
405 410 415 Ile Arg
Arg Asp Leu Gly Arg Ile Glu Lys Trp Val Leu Arg Asn Ala 420
425 430 Phe Asp Asp Glu Glu Lys Pro
Tyr Leu Pro Lys His Ile Leu Tyr Arg 435 440
445 Gln Lys Glu Gln Phe Ser Asp Gly Val Gly Tyr Ser
Trp Ile Asp Gly 450 455 460
Leu Lys Asp His Ala Lys Ala His Val Ser Asp Ser Met Met Thr Asn 465
470 475 480 Ala Ser Phe
Val Tyr Pro Glu Asn Thr Pro Thr Thr Lys Glu Ala Tyr 485
490 495 Tyr Tyr Arg Thr Val Phe Glu Lys
Phe Tyr Pro Lys Asn Ala Ala Arg 500 505
510 Leu Thr Val Pro Gly Gly Pro Ser Ile Ala Cys Ser Thr
Ala Lys Ala 515 520 525
Val Glu Trp Asp Ala Ala Trp Ser Lys Leu Leu Asp Pro Ser Gly Arg 530
535 540 Ala Ala Leu Gly
Val His Asp Ala Ala Tyr Lys Glu Lys Ala Pro Ala 545 550
555 560 Ser Val Asp Pro Ala Val Asp Asn Val
Ser Arg Ser Pro Ala His Asp 565 570
575 Val Lys Arg Leu Lys Thr Ala Ile Ser Ala Ala Ala Val
580 585
95 2264 DNA Oryza sativa
95 gcggattcca ttctcctctt ggcatcacga ggcggcgccg cttggtctag ctagtagcca
60cagggagagg tggtagccgc agccgccgcc gacgagacct cgccgccggg gggagggcac
120catgtgtggc atcctcgccg tgctcggcgt cgcagacgtc tccctcgcca agcgctcccg
180catcatcgag ctatcccgcc ggttacgtca tagaggccct gattggagtg gtatacactg
240ctatcaggat tgctatcttg cacaccagcg gttggctatt gttgatccca catccggaga
300ccagccgttg tacaatgagg acaaatctgt tgttgtgacg gtgaatggag agatctataa
360ccatgaagaa ttgaaagcta acctgaaatc tcataaattc caaactgcta gcgattgtga
420agttattgct catctgtatg aggaatatgg ggaggaattt gtggatatgt tggatgggat
480gttcgctttt gttcttcttg acacacgtga taaaagcttc attgcagccc gtgatgctat
540tggcatttgt cctttataca tgggctgggg tcttgatggt tcggtttggt tttcgtcaga
600gatgaaggca ttaagtgatg attgcgagcg attcatatcc ttcccccctg ggcacttgta
660ctccagcaaa acaggtggcc taaggagatg gtacaaccca ccatggtttt ctgaaagcat
720tccctccacc ccgtacaatc ctcttcttct ccgacagagc tttgagaagg ctattattaa
780gaggctaatg acagatgtgc catttggtgt tctcttgtct ggtggactgg actcttcttt
840ggttgcatct gttgtttcgc ggcacttggc agaggcaaaa gttgccgcac agtggggaaa
900caaactgcat acattttgca ttggtttgaa aggttctcct gatcttagag ctgctaagga
960agttgcagac taccttggta ctgttcatca cgaactccac ttcacagtgc aggaaggcat
1020tgatgcactg gaggaagtca tttaccatgt tgagacatat gatgtaacga caattagagc
1080aagcacccca atgttcttga tgtcacgtaa aattaaatct ttgggggtga agatggttct
1140ttcgggagaa ggttctgatg aaatatttgg cggttacctt tattttcaca aggcaccaaa
1200caagaaggaa ttccatgagg aaacatgtcg gaagataaaa gcccttcatt tatatgattg
1260cttgagagcg aacaaatcaa cttctgcatg gggtgttgag gcccgtgttc cgttccttga
1320caaaaacttc atcaatgtag ctatggacat tgatcctgaa tggaaaatga taaaacgtga
1380tcttggccgt attgagaaat gggttctccg gaatgcattt gatgatgagg agaagcccta
1440tttacctaag cacattctat acaggcaaaa ggagcaattc agtgatggtg ttgggtacag
1500ttggattgat ggattgaagg atcatgcaaa tgaacatgta tcagattcca tgatgatgaa
1560cgctagcttt gtttacccag aaaacactcc agttacaaaa gaagcgtact attataggac
1620aatattcgag aaattctttc ccaagaatgc tgctaggttg acagtacctg gaggtcctag
1680cgtcgcgtgc agcactgcta aagctgttga atgggacgca gcctggtcca aaaaccttga
1740tccatctggt cgtgctgctc ttggtgttca tgatgctgca tatgaagata ctctacaaaa
1800atctcctgcc tctgccaatc ctgtcttgga taacggcttt ggtccagccc ttggggaaag
1860catggtcaaa accgttgctt cagccactgc cgtttaactt tctatcgtcg cataaaactc
1920cgtagtttgt tgttcttggt tcaatcccag cttctttcag atgtcgttag tttcttcaaa
1980catgtaatgg agatgcgtgc ttttcctggc ttgttagtta ctgtatgctt gtcatcgtgt
2040atgttttctt ttcttttcca atatgcaaac tgtttggtcg tggactgatc agaacattgt
2100aaatatgaat aaccgcgact gatatcctca agttgctttt ggtttgcaat agttctaatc
2160ttgatgttct gctgggaatc ggaagatgtt atgcagtatg cgtattgttg gggtgtaacc
2220gtgtaagtgc atctgaaatg aagttacggg cgatggtaac tggg
2264
96 2332 DNA Aquilegia formosa
96 caagtgatta aatcatccac atttcttctt
tctttctttc tttctttttt tttctttttt 60tctttgttct ccttgtttat agaatcttat
tatttattaa cagagcaaaa gcatttctct 120agctagctag ctttatttct ttgtgatcat
caatcaataa atatatataa ttcatcatca 180tgtgtggaat tctagctgtt ttgggttgtt
ctgatgattc tcaagccaaa agagttcgtg 240ttcttgagct ttctcgcaga ttgaagcacc
gtgggcctga ttggagtggt ctgtatcagc 300atggtgacaa ttttctatct catcaaaggc
ttgcagtcat tgatcctgct tctggggatc 360agcctcttta taatgaagac aaatcaattg
tcgtaactgt gaatggagaa atttataacc 420atgaagcctt gaggaagcgc ttgccaaatc
acaaatttcg aactggaagt gactgtgatg 480ttattgctca tctgtatgaa gaattcgggg
aggattttgt tgacatgttg gacgggatgt 540tctcatttgt tttattggac acccgcgata
acagcttcct tgtcgcccgg gatgccattg 600ggattacctc cctttatatt ggttggggac
ttgatggttc aatttggatt tcatctgaga 660tgaaaggact aaatgatgac tgtgaacact
ttgaatgctt tcctcctggt cacctttact 720cgagcaaaaa tagtggtttt cgtaggtggt
acaatccctc atggttctca gaagctgttc 780catctacacc atatgatcca ctcgtcctca
gacgtgcatt tgaaaatgct gtagttaaga 840ggctaatgac tgatgtacca tttggagttc
tcctatctgg tggccttgat tcatcattag 900ttgcctccat cacggcacgc cacttggcag
agacaaaggc tgccaagcaa tggggggcac 960aacttcattc cttctgtgtt ggtctggagg
gctcacctga tttaaaggct ggaaaagagg 1020ttgccgatta tttgggtacc gttcaccatg
agtttcactt cactgttcag gatggtatcg 1080atgccattga agatgtgatt taccatgtag
aaacatatga tgtaacgact atccgggcga 1140gcacacctat gtttcttatg tctcgcaaga
tcaagtcact aggagtgaag atggttatct 1200ctggagaagg ctccgatgaa atatttggtg
ggtacttata tttccacaag gctcctaaca 1260aggaggagtt tcatcgcgag acatgtcata
agataaaggc tcttcatcag tatgattgct 1320tgagagctaa taaatcgacc tctgcttggg
gtctggaagc tcgggtgcca ttcttagaca 1380aagaattcat caatgttgca atggctattg
accctgaatg gaagatgatt aaacgtgatc 1440aaggccgtat tgaaaagtgg gtactcagga
gggcttttga tgatgaggac cacccctacc 1500tgccaaagca cattctctac aggcagaaag
aacaatttag tgatggtgtt ggatatagtt 1560ggatcgatgg actcaaggcc cacgctgcat
cacatgttac ggataagatg atgcgcaatg 1620ccaagaacat tttcctacac aacacaccaa
ctaccaaaga agcctactac tacagaatga 1680tttttgagag gtttttccct cagaactcgg
caaaattaac agttccaggt ggtccaagtg 1740ttgcttgcag cactgccaag gctgtcgaat
gggatgcttc ttggtcaaat aatttggacc 1800cttctggcag ggctgcatta ggtgtccatg
cttcagcata tgaagcacaa ctgtctgctc 1860ctcttgctaa tggtaatgtt ccagttaaga
tttttaacaa tgtaccaaga atggttgaag 1920taggtgctcc agctagcctc acgatccgca
gctaatattt ctggtgaatg tgccttattt 1980tgtatggatt tgaagttaag aggccatagt
atgcaaggtt cttttttttt cttttttttt 2040ttcagtgtgc agtgtgtata tgtactagta
gtccatatgt gaaggaagat gaaacaaaac 2100tatgtaaaag tccatgtctt ttatatttct
gaaaaaagaa ggttcttgtg atttcttttt 2160tgctacaaat aggcataaaa tagctgattc
catgtatcgg gcacccctgg caaacaccaa 2220tgtatgcagt ctgcatagcg ttgtggatca
gccttctgct catcggtcaa cactttccct 2280tgttgttctg tgtaaactga tgtatgtgca
tcaatccgat attcagatat tt 2332
97 1925 DNA Asparagus officinalis
97 tctgcttgca ccttttgaga gagagggaga gagagagaga gagagagaga ggatcatgtg
60tgggatactt gcagtgctcg gttgctccga tgactctcag gcgaagaggg ttcgagttct
120cgagctctct cgcaggttga agcacagggg cccagattgg agcgggcttt gccaacatgg
180agattgtttc ttgtctcatc agagattggc gatcattgat cccgcctctg gtgatcaacc
240cctgtacaac gaggacaagt ccatcgttgt cacggtaaac ggagagattt acaaccacga
300agagctaagg cgacgcctgc ctgatcataa atacagaact ggaagcgact gtgaagtcat
360cgctcatctg tatgaggaac acggagaaga tttcgtcgat atgttggatg gaatgttctc
420cttcgttcta ttggacaccc gaaacaattg cttcgttgcg gcaagggatg cagtgggaat
480aacccccctc tacattggct ggggattaga cggctctgtt tggctctcgt cggaaatgaa
540aggattaaac gatgactgcg aacattttga agtatttcca cctggaaacc tgtactcaag
600cagatcaggc agcttcagaa gatggtataa tcctcagtgg tacaatgaga ctatcccttc
660ggccccctat gatcctcttg ttctgaggaa agcttttgag gatgctgtta taaagaggct
720gatgactgat gtgccatttg gggttctgtt atctggtggc ctcgattcct cgttggtcgc
780cgctgttact gctcggcatc ttgcaggaag taaagctgca gagcaatggg gaactcagct
840ccattctttc tgtgttggct tagagggatc accagatctc aaggctgcaa aagaggttgc
900agagtatctg ggtactgtcc accatgagtt tcacttcaca gttcaggatg gaattgatgc
960cattgaggat gtaatcttcc acattgaaac gtacgatgtg acaacaatca gggcaagcac
1020tccaatgttc ctcatggcca gaaaaatcaa gtccttagga gtaaaaatgg tgatctcagg
1080cgaaggctcg gatgaaatct ttggcgggta cttgtatttt cacaaagcac ctaacaaaga
1140agaattccat cacgaaacat gtcgaaagat caaagctctg catcagtatg actgcctcag
1200agccaacaaa gcaacatcag catgggggct ggaagctcga gtgccatttt tagacaagga
1260gttcatggat gttgctatga gtatagatcc tgaatcgaaa atgattaagc ctgatctcgg
1320gaggatcgag aagtgggtac tgaggaaagc ttttgatgat gaagagaatc cctatcttcc
1380aaagcatatt ctctataggc aaaaggagca gttcagtgat ggtgttggat atagttggat
1440tgatgggctg aaggctcatg ctgcaaaaca tgtaactgat agaatgatgc tgaatgcagc
1500acgtatttac ccccacaaca caccaaccac aaaagaggct tattactaca gaatgatctt
1560tgaaaggttc ttccctcaga actcggcgag atttactgtc cctggaggtc caagcattgc
1620ttgcagcacg gcgaaggcta tcgaatggga cgctcgctgg tcgaacaatt tggatccgtc
1680ggggagagca gctctcggcg tccatgactc tgcctacgat cctcctcttc cttcttcgat
1740ttctgcagga aaaggagctg caatgatcac taacaagaag ccgaggattg tggatgtagc
1800aactccggga gttgttatta gtacctgatg ttggtttggt ttggtttggt tttgatgtac
1860aagttaaaat aaatgtgtgg ggcgttgtat tttggatgga gggtactaaa gcgtgtaatt
1920tgctg
1925
98 2026 DNA Brassica oleracea
98 ttgcgattaa ataagaaaaa tgtgtggaat
acttgccctt ttaggatgct ccgacgattc 60tcaggccaag agagtacgcg ttcttgagct
ttctcgcaga ttgaggcaca gaggacctga 120ttggagcgga atatatcaga acgggttcaa
ttacttggcc catcaacgtc ttgctatcat 180cgatcctgat tccggtgatc aacctctctt
taacgaggac aagtccattg ttgtcacggt 240gaacggagag atttataacc atgaggagct
gagaaagggt ttgaagaatc acaagttcca 300caccggtagt gattgtgacg tcatagctca
cctgtacgag gagcatggtg agaattttgt 360ggacatgttg gatggaatct tctcctttgt
gttgctggac acaagagata actcattcat 420ggttgctcgt gacgcggttg gtgtcacttc
gctctacatt ggttggggat tagatggatc 480tctgtgggtc tcttccgaga tgaaaggctt
acacgaagat tgtgagcatt tcgaagcctt 540tcctccaggt catttgtatt caagcaaatc
aggaggaggg tttaagcaat ggtacaatcc 600tccttggttc aatgaatctg ttccttctac
gccttatgag cctctcgcaa ttagaagcgc 660ctttgaagac gctgtgataa agcggttgat
gactgatgtc ccatttggag ttttgctatc 720tggtggtctt gattcttctc ttgttgcatc
catcactgcc cgtcacttgg ccggtactaa 780ggccgctaag cgatggggtc ctcagctcca
ttccttttgt gtcggtcttg agggctcgcc 840ggacttgaag gcggggaaag aagtggcgga
gtatttgggg acggtgcacc atgagttcca 900tttcacggtg caagacggga ttgatgcgat
tgaggatgtg atctaccatg tcgagacata 960tgatgtgacg acaattagag ctagcacacc
catgttcttg atgtccagga aaatcaagtc 1020tctaggtgtt aagatggttc tttccggtga
aggttctgat gagatctttg gagggtatct 1080ttacttccac aaggcaccta acaagcaaga
atttcaccaa gaaacttgtc gcaagatcaa 1140ggctcttcac aaatacgatt gtttaagagc
caacaaagct acctctgctt ttggtctaga 1200ggcgcgtgtt ccttttctgg acaaggagtt
tatcaacacc gctatgtctc tcgaccctga 1260atccaagatg atcaaaccag aggaagggag
gatcgagaag tgggttctaa ggagagcctt 1320tgatgatgaa gaacgtcctt atttgccaaa
acacattctc tacagacaga aagagcagtt 1380tagtgatggt gttggctaca gctggatcga
tggcctcaaa gcccacgctg ctgaaaatgt 1440taatgacaag atgatgtcga aagctgcttt
tatcttccct cacaacaccc cactcaccaa 1500agaagcatac tattacagaa tgatctttga
gaggttcttc ccacagaact cggcaaggct 1560aactgttccc ggaggtgcga ccgtggcttg
ctcgaccgca aaagcggtgg agtgggatgc 1620aagctggtcc aacaatatgg atccatctgg
aagagctgcg attggagttc acctctcggc 1680ctacgacggc agcaaagtgg cattgccctt
gccggcgcca cataaggcaa tcgacgacat 1740cccaatgatg atgggacaag aagttgtgat
tcagacatga gtttgaagga tatatagggg 1800aattggagtt ctttaaagtt gtcctaatgg
gtttaagtgt ttttgtatga tttcaaaata 1860aaattggttt cgtgttctta gggaaatatg
aatgcataaa ttatttttct tgtactatta 1920gtaaatattc gaatgtactg tttctgcaaa
atcgatgtac atcaatctta ttataattat 1980atgtattgta atatgatatg aaaaatgtga
ttttgcttgt tttcac 2026
99 3203 DNA Chlamydomonas reinhardtii
99 cccctcccgc tccctccccg acatatgatc cagcattgat gggtgataca gacgaagcgc
60agaagcagca atccggtgtg tacgcatatg ggcacggcag cagctgctgg cagccccgga
120cgaaatccct agctgcactt tcgggcccgc gccagtccct tccaagcgct ttgtgacgtc
180ttctggctac ttacttgctc agcgtatcgc gcacgcccgg ctgcgccccg ctttgccctt
240gcgccacttc cgcacgaagg gtctgcacct tctccaggtc atccgctgca tcgtctgctt
300ccctgtccga gtacgttgcc cttatataag tcagcagcgg tgttttgatg tccacagtct
360ccgtcttctt gcaatgtatc gctaacataa ccgattgagc ggtcggcatt tttcaagagg
420cccttcgtga gcgtgccttg ctagatctgg ctagaggttg cagcgcgggt gtgaaaacgc
480agtgagggtt tggttgaatc gacatgcagc cccgtgcgcc catgcaactg tctttccgcg
540cgcagcaggg ccgatggatt cctttccttt acgcccaaac tacgctgggc acacacatct
600ttttgggtag ggctcttacg gtagccaaat tcttatagag tttggggagt gcgggtagca
660ctcaaaaatg tgcggcattc ttgccgtcct caacacgacg gatgacagcc aggctatgcg
720ctcgagggtg ctggccctga gccgtcgcca gcgtcaccgt ggccccgact ggtctggcat
780gcaccagttc ggcaacaact tccttgccca tgagcgcctt gcgattatgg accccgcctc
840gggtgaccag cccctgttca acgaggaccg cacaatcgtg gtcaccgtga acggtgagat
900ctacaactac aaggagctgc gccagcagat cacggatgcc tgccccggca agaagttcgc
960caccaacagc gattgcgagg tgattagcca cctgtacgag ctgcacggcg agaaggtggc
1020ctccatgctg gacggcttct tcgccttcgt ggtgctggac acccgcaaca acaccttcta
1080cgccgcgcgc gaccccattg gcatcacctg catgtacatc ggctggggcc gtgacggcag
1140cgtgtggctg tcgagcgaga tgaagtgcct gaaggatgac tgcacccgct tccagcagtt
1200ccctcccggg cacttctaca actccaagac gggtgagttc acccgctact acaaccccaa
1260gtacttcctg gacttcgagg ccaagccgca gcgtttcccc agcgctccct acgaccccgt
1320cgcgctgcgt caggcgttcg agcagtccgt ggagaagcgc atgatgtcgg atgtgccgtt
1380cggcgtgctg ctgtcgggcg gcctggacag ctcgctggtg gcgtccatcg cggcgcgcaa
1440gattaagcgt gagggcagcg tgtggggcaa gctgcacagc ttctgcgtgg gcctgcccgg
1500cagccctgac ctgaaggctg gcgcccaggt ggctgagttc ctgggcaccg accaccacga
1560gttccacttc acggtgcagg agggcattga cgccatcagc gaggtcatct accacattga
1620gacctttgac gtcaccacca tccgcgcctc cacgcccatg ttcctgatga gccgcaagat
1680caaggcgctg ggcgtgaaga tggtgctgtc aggcgagggt tccgacgagg tgttcggcgg
1740ctacctgtac ttccacaagg cgcccaacaa ggaggagttc cagtcggaga ctgtgcgcaa
1800gatccaggac ctgtacaagt acgactgcct gcgcgccaac aagtccacca tggcttgggg
1860cgtggaggcg cgcgtgccgt tcctggaccg ccacttcctg gacgtggcca tggagatcga
1920ccccgccgag aagatgattg acaagagcaa gggccgcatc gagaagtaca tcctccggaa
1980agccttcgat acccccgagg acccctacct gcccaacgag gtgctctggc gccagaagga
2040gcagttcagc gacggcgtgg gctacaactg gatcgacggc ctcaaggcgc acgcggacag
2100ccaggtcagc gacgacatga tgaagacggc cgcgcatcgg taccccgaca acacgccccg
2160caccaaggag gcgtactggt accgcagcat cttcgagacc cacttccccc agcgtgccgc
2220cgtggagacg gtgccgggcg gcccctcggt ggcctgctcc accgccaccg ccgcgctgtg
2280ggacgccacc tgggctggca aggaggaccc ctcgggccgc gccgtggccg gcgtgcacga
2340ctcggcctac gacgccgccg ccgccgccaa cggcgagccg gctgccaaga aggccaagaa
2400gtaaacgggc cttgtccacc acttgcggtc ccgactgcgg cagctgagac tagctgtcag
2460aggttgctgc gcatggggcc gcggcgtgcg tcgctaccgg gaagcagcgt gctgtggggg
2520agtttgatgt gcttcctgat cagcatcgtg ctcgcggagt agcgagagcg agtccggatc
2580atgcacgcga tgcggctgca tgcataaaga gcagcacctc agctgcaccg ccgtctgtgc
2640atgcatggcc agtgattcca ccaggtgcac ggccttgcgt ttttgagcga agagcacacg
2700tcacggatgt caacgcgtta ttcgggggct acgagcctgc gcgctattgt gtcgtgtttt
2760actggcgtgg agtgtcgtgg atgctgtttc tgacagatgt ctttcactgc gagtgtgaat
2820cataggggtg acttgacggt caatgtagac gaggaacggg gagacgacat gcccattgac
2880aggatgacta ggtcttgacg gtggaggatg ggtcacgggc ggcacaagac gcgggggaac
2940aggcggtgcg aagtccagca catggattaa ttagataaag gggcgccagc aacttggcgc
3000ccgcgtagaa agtcatgaag ccatgctagg cggtagtcgc aaggaagcga gaacgggatg
3060ggacgcagct gcacacgtgc ggcggtgggg agccgctgaa gctctttaag aagacgttcc
3120gcagactctc tgatcccaac tgccattctg ccaacccgtt ttgcacgccg aaaacctggc
3180acactggaag cgctcatcac gct
3203
100 2238 DNA Glycine max
100 tggaaccctt ctacgtgttc tccattccct ctctcactcc
tccatctacg tttcttaaat 60catttccttc tttctctctt tctttatctt ctcattttcc
tcattacact cttttttttt 120tctctcaact tttctcttat taaccatagt tcacatatta
tatcatcaca tatcatagtg 180atatattata tcatatcaca atgtgtggca tacttgctgt
gcttggttgc tctgattcat 240ctcaagccaa aagggttcgc gtccttgagc tttctcgcag
attgaagcac cgtggtcctg 300actggagtgg gctccaccaa tatggtgata actatttggc
tcatcaaagg ttagccatag 360ttgatccagc ttctggtgat caacccctct tcaatgaaga
caaaactgtc gtggttacgg 420tgaatggaga gatctacaat catgaagaac tcaggaaaca
gttgcctaat cacaccttcc 480gtacaggaag tgactgtgat gttattgctc acctgtatga
ggagcacgga gaaaactttg 540tggacatgct tgatggtata ttttcgtttg ttctgctaga
tactcgtgac aacagtttta 600tagtggcacg agatgcaatt ggggtcactt ccttgtacat
tggttggggt ctagatggct 660ctgtctggat ttcatcagaa ttgaaggggt tgaatgatga
ttgcgaacat tttgagtctt 720ttccacctgg tcacttgtac tctagcaaag agagagcgtt
ccgcagatgg tacaatcctc 780catggttctc tgaggctatt ccctcagcac cttatgatcc
tcttgctttg aggcatgcct 840ttgagaaggc tgtggtaaaa aggttgatga ctgatgttcc
ctttggtgtt ttgctctctg 900gaggtttgga ctcttcattg gttgcagccg tcacggctcg
ctacctggca ggcacaaatg 960ctgccaagca atggggaacc aaattacact ctttctgtgt
aggccttgag ggtgcacctg 1020acctaaaggc agcaaaggaa gtagcagact acataggaac
tgtacatcat gaatttcact 1080acactgttca ggatggcata gatgccattg aggatgtgat
ctatcacatt gaaacatatg 1140atgtgacaac aattagagca agcattccca tgtttcttat
gtctcgtaag atcaagtcat 1200tgggagtcaa atgggttata tctggagaag gatctgatga
gatctttgga gggtatctat 1260atttccacaa ggcaccaaac aaagaagagt ttcatcaaga
aacatgccgc aagattaaag 1320cactccacaa atatgattgc ttgcgagcca ataaatcgac
ctttgcctgg ggtctagaag 1380ccagagtgcc atttttggac aaagatttta tcagagttgc
aatgaacatt gatcctgatt 1440ataaaatgat taaaaaggaa gaagggcgaa ttgagaaatg
ggtactgagg agggcctttg 1500atgatgaaga acatccttat ctgccaaagc acattttata
caggcagaaa gaacaattca 1560gtgatggagt tggctatggt tggattgatg gccttaaagc
tcatgctgag aaacatgtga 1620ctgacagaat gatgctcaat gctgctaaca ttttcccctt
caacacacca accaccaaag 1680aagcatacta ctatagaatg atatttgaga ggttcttccc
tcagaactca gccaggctga 1740gtgttcctgg aggaccaagt gttgcatgta gcacagccaa
agctgtagag tgggatgctg 1800cttggtctaa caaccttgat ccatctggta gggcagcact
tggagtgcat gcatcagctt 1860atggaaatca ggtcaaagct gtagaaccag agaagatcat
accaaagatg gaagtttccc 1920cactaggagt tgccatatag agctagtatg agccatagca
aaaactagta gttgccctag 1980aaccaaaata tattattata ctagtcatca atgactcatt
aatcatcata aatgaaaatt 2040tggcctgctg tgtagtttat tcaggcaagg ctatatataa
atagataagg ctctctatct 2100agctgtctta agtgttgttc catccacatc ttgtcttcgt
tttctattta tgtcatctga 2160gcactatcat gatgtactgg atttccaaga aaatgttcag
ttaaatttga atgcaaagtt 2220cactatttca gactttca
2238
101 2108 DNA Glycine max
101 ggggcattgg attctcacca
acgtttgcgt tactcaagcc gacattctcg cttccgttgg 60aaccgttctt cgtgttctcc
attccctctc tcactccttc atctacttca catattatat 120catcacatat catagtgata
tcatatcaca atgtgtggca tacttgctgt gcttggttgc 180tctgattcat ctcaagccaa
aagggtccgc gtccttgagc tttctcgcag attgaagcac 240cgtggtcctg actggagtgg
gctccaccaa tatggtgata actatttggc tcatcaacgg 300ttagccatag ttgatccagc
ttctggtgat caacccctct tcaatgaaga caaaactgtt 360gttgttacgg tgaatggaga
gatctacaat catgaagaac tcaggaaaca attgcctaat 420cacaccttcc gtacaggaag
tgattgtgat gttattgctc acctgtatga ggagcacgga 480gaaaacttta tggacatgct
tgatggtata tcttcatttg ttctgctgga tactcgtgac 540aacagtttta tagtggcgcg
ggatgcaatt ggggtcactt ccttgtacat tggttggggt 600ttagatggct ctgtctggat
ttcctctgaa ttgaaggggt tgaatgatga ttgcgaacat 660tttgagtctt ttccacctgg
tcacttgtat tctagcaaag agagagcgtt ccgcagatgg 720tacaatcctc catggttgtc
tctggctatt ccatctgccc cttatgatcc tcttgctttg 780agacatgcct ttgagaagct
gtggataaaa aggttgatga ctgatgtgcc ctttggtgtt 840ttgctctctg gaggtttgga
ctcttcattg gttgcagccg tcacggctcg ctacctggca 900ggcacaaaag ctgcgaagca
atggggaact aaattacact ctttctgtgt aggccttgag 960ggtgcacccg acctaaaggc
tacaaaggaa gtagcagagt acataggaac tgtccatcat 1020gaatttcact acactgttca
ggatggcata gatgccatcg aagatgtgat ctatcacatt 1080gagacatatg atgtgacaac
aattagagca agcattccca tgtttcttat gtctcggaag 1140atcaagtcat tgggagtcaa
atgggttatc tctggagaag gatctgatgt tttttttgga 1200gggtatctat atttccacaa
ggcacccaac aaagaagagt tccaccaaga aacatgccgc 1260acaattattg tactccacag
gtatgattgc tcgcgagcca ataaatcgac ctttgtctgg 1320ggtctagaag ccagagtacc
atttttggac aaagagttta tcagagttgc aatgaacatt 1380gatcctgagt gtaaaatgat
aaaaaaggaa gaagggcgaa ttgagaaatg ggcactgagg 1440agggcctttg atgatgaaga
acatccttat ctgccaaagc acattttata taggcagaaa 1500gaacaattca gtgatggagt
tggctatggt tggattgatg gccttaaagc tcatgctgag 1560aaacatgtga ctgacagaat
gatgctcaat gctgccaaca ttttcccctt caacactcca 1620accaccaaag aagcatacca
ctatagaatg atatttgaga ggttcttccc tcagaactca 1680tgcaggctca ctgttcctgg
aggaacaagt gttgcatgta gcacagcaaa agctgttgag 1740tgggatgctg cttggtctaa
caaccttgat ccatcaggta gagcagcact tggagtgcat 1800gcatcagctt atggaaacca
ggtcaaagct gtagaaccag agaagatcat acccaagatg 1860gaagtttctc cactaggagt
tgccatatag agctagtatg agccatagca aggactagta 1920gttgccctag aaccagcata
tattattatt atactaatca tcaaatcatg aaacatcagg 1980ttgctttgta gttatccagg
gaatggtata taaatagata aggatctcta tctatctggc 2040tctctttctg ggccacccag
atctagcctc aacttgcttt cgatgtcacc tgatgcacaa 2100tcataaag
2108
102 2134 DNA Glycine max
102 ggcacgagct tcaacttcac ccattcatac gtggtgttgt tactgctgct cttttctctt
60ttcttttctc tttagttctc tcttcccctt tctttttctt tttcttcttc ttctgagctt
120gtttaagctt ttcttccatt aacatattat cacaatgtgt ggtattcttg ctgttcttgg
180ttgttctgat gactctcgag ccaaaagggt ccgcgtgctt gagctctctc gcagattgaa
240gcaccgtggc cctgactgga gtgggctcca tcaacatggt gactgctttt tggcacatca
300acggttagcc atagttgatc ctgcttctgg ggatcaacct ctctttaacg aggacaaatc
360cgtcattgtt acggtaaatg gagagattta caaccatgaa gagctcagga aacagctgcc
420taatcacaac ttccgaactg gaagtgattg tgatgttatt gcacacctgt acgaggaaca
480tggagaagac tttgtggaca tgctggatgg tatcttctca tttgttctac tggacacccg
540tgacaacagt tttatagtgg ctcgggatgc tattggggtc acttccttgt acattggatg
600ggggttagat ggctctgttt ggatttcatc agaaatgaaa ggcctgaatg atgattgtga
660acactttgag tgttttccac ctggtcactt gtactctagc aaagaaagag ggttccgcag
720atggtacaat cctccttggt tctctgaggc tattccatct gccccttatg atcctcttgt
780tttaagacac gcctttgagc aggcagtcat aaaaaggttg atgactgatg tgccttttgg
840tgttctactc tctggaggtt tggactcttc tttggttgca tccatcactt ctcgttactt
900ggccaacaca aaggctgctg agcagtgggg atcaaagtta cattcattct gtgtaggcct
960tgagggctca ccagatttga aggctgcaaa agaggttgct gactatctag gcactgtcca
1020ccatgagttt accttcactg ttcaggatgg aatagatgcc attgaagatg ttatctacca
1080tattgaaaca tatgatgtga ctacaattag agcaagcaca cctatgtttc tcatgtctcg
1140gaagattaaa tcacttggtg tcaaatgggt tatctcagga gaaggatctg atgagatctt
1200tggagggtat ttgtacttcc acaaggcacc caacaaggag gagttccaca gagaaacatg
1260ccgcaagatc aaagcacttc accaatatga ttgcttgcga gccaataaat caacatttgc
1320ttggggtcta gaagcccgtg taccattttt ggacaaggcg tttatcaatg ctgcaatgag
1380tattgaccct gagtggaaga tgataaaaag agatgaagga cgaattgaga agtggattct
1440gaggagagcc tttgatgatg aagagcatcc ttatctgcca aagcacattt tatacaggca
1500gaaagaacaa ttcagtgatg gagttggcta tagttggatt gatggcctta aggcccatgc
1560tgcaaaacat gtgactgaaa aaatgatgct taatgctggt aacatttacc cccacaacac
1620cccaaaaacc aaggaagcat attactacag aatgatcttt gagaggttct tccctcagaa
1680ctcagctagg ctcactgttc ctggaggagc aagtgttgca tgtagcacag ccaaagctgt
1740tgagtgggat gctgcttggt ctaacaacct tgatccctct ggtagagcag cacttggagt
1800gcacatttca gcctatgaaa accagaacaa caagggtgta gaaattgaga agataatacc
1860tatggatgct gctccccttg gtgttgccat ccagggctaa tacaaagatg tgacaaagaa
1920taatttgggc gacaatgaag ataactaagc taaaggtgaa tgaaaaattt gcctgcagtg
1980taatttcatc tgggcaaagc ttttatagtt tatagttata aggctttcta aaaagtgttg
2040cgtattgtat tatcttgaat gctgtgattt gaagtcttaa taaaagtgtt tcctttatca
2100gttcataatg aatgcaaagt ccattatttt aaaa
2134
103 1986 DNA Glycine max
103 agcagtggta tcaacgcaga gtacgcggga gttctgttgt
tgtgttgtgt tgtgtgtctt 60cccttgtgtg ttccagtttt tatttgcagc cgccatgtgc
ggaatcctcg cagtgttggg 120ttgcgtcgac aactctcaga ccaagcgcgc tcgcatcatc
gaattgtctc gcaggttgcg 180gcatagaggt cctgattgga gtggcataca ttgctatgag
gattgttacc tagctcatca 240acgccttgct attgttgacc ctacttcagg ggaccaacct
ttgtacaacg aagacaaaac 300tattattgtc actgtaaatg gggagatata caatcacaag
caattgaggc agaaactgag 360ttcccatcaa tttcgaactg gtagtgattg tgaagtgatt
gcccatcttt atgaagaaca 420tggagaagaa tttgttaata tgctggatgg gatgtttgcc
tttattcttc ttgatactag 480ggataaaagt tttattgctg ctcgtgatgc tattggcatt
acccctctat acttgggctg 540gggtcatgat ggatcaacat ggtttgcatc tgaaatgaaa
gctctgagtg atgattgtga 600gagattcata tcttttcctc cagggcacat ctattccagc
aaacagggag gattaagaag 660gtggtacaat ccaccatggt tttcagagga tattccatca
actccctatg atccaaccct 720tttgcgtgag accttcgaga gggctgtagt taagagaatg
atgactgatg taccttttgg 780agttcttttg tctggaggat tggactcatc acttgttgct
gcagtggtca atcgttattt 840ggctgaatct gaatctgctc gtcaatgggg atcacagtta
catactttct gcattggttt 900aaagggctct cctgacttga aagctgcaaa agaggtagca
gattaccttg gtactcgtca 960ccatgaactt tatttcacgg ttcaggaagg tatagatgca
cttgaagaag tcatttacca 1020tattgaaaca tatgatgtaa cgactatcag agcaagtact
gcaatgtttc ttatgtccag 1080aaaaattaaa gccttgggag tgaaaatggt actttctgga
gaaggttcag atgaaatatt 1140tggaggttac ctgtattttc acaaggcacc taataagaaa
gagtttcatg aagaaacatg 1200tcgaaaaatt aaagctcttc atctttatga ctgcctgaga
gccaataaat caactgcagc 1260atggggtgta gaggcacgtg taccattctt ggataaagaa
tttatcaacg tagccatgag 1320tatagatccg gaatggaaaa tgataaggcc tgatcttgga
aggatagaga agtgggtatt 1380acgcaatgca tttgatgacg ataagaatcc atatttacca
aagcacatat tgtacaggca 1440gaaggaacaa ttcagtgatg gggttggtta cagctggatt
gatggcttga aggatcacgc 1500aaacaaacaa gtcacagatg cgacgatgat ggctgccaat
tttatttacc ctgaaaacac 1560tcctaccaca aaagaaggat acctctacag gacaattttt
gagaagttct ttccaaagaa 1620tgcagcaaag gcaacagtgc caggaggtcc tagtgtggca
tgcagtactg caaaagctgt 1680ggaatgggat gcagcatggt caaaaaatct tgatccttct
ggtcgtgccg cacttggtat 1740tcatgatgct gcatatgatg cagtggatac caaaattgac
gagcccaaaa atggaaccct 1800ttaaggccca taatcgattg tcaagagaaa aaaatgtatg
caacaactgt ctagtgggga 1860tttaaacttc tagtaggcaa aactaatgag aagtgggatt
gtttttattt tcagctcaaa 1920ttaatatgta ggttttgaac tgtttgtggg ttattttaaa
taaatatcta tatttaaatt 1980ttgtag
1986
104 1746 DNA Physcomitrella patens
104 atgtgcggaa
ttttggctat tctcggttcc cacgacgcgt cgcctgcgcg acgtgatcgc 60attctggagc
tttcccgcag gctgcgccac cgcggtcccg actggagtgg gctgttcgca 120gggcagaagt
gctggtgtta tctggctcat gagcgcttgg ccatcattga tcccgcctcg 180ggcgaccaac
ctctgtacaa tgagaacaaa gatatcgtcg tcgctgccaa tggagaaatc 240tacaaccacg
aggccttgaa gaagagcatg aagcctcaca agtatcacac gcagtccgac 300tgtgaagtta
ttgctcatct ctttgaagat gtcggcgagg acgtggtcaa catgctggac 360ggcatgttct
cattcgtgtt ggtcgacaac cgcgataatt ccttcatcgc cgcccgggat 420cccattggca
tcacccctct ctactacggc tggggtgcgg atggaagtgt ttggtttgca 480tcggagatga
aggccttgaa ggacgattgc gagcggttcg agattttccc acccggtcac 540atctactcta
gcaaagctgg agggcttcgg cgatattaca acccagcttg gttctctgag 600acttttgtcc
ccagcacccc ttaccagtct cttgttctcc gcgcagcctt cgagaaggct 660gtaatcaaga
gactgatgac cgacgtgccc ttcggtgtac tcctatccgg agggctggat 720tcttcattag
tggcagcagt ggcatcccgt catatcgcag gaactaaagc tgccaacatc 780tggggcaagc
agcttcactc tttctgcgtc ggacttcagg gttctcctga cctgaaggct 840gctcgggaag
tcgccaacta catcggcacc cagcaccacg agttccactt tactgtccaa 900gaaggtttgg
acgctctgtc ggatgtgatc tatcatgtgg agacttacga cgtgaccacc 960atccgagcta
gcacgcccat gttcctcatg acacgcaaga ttaaggctct gggtgtaaag 1020atggtgttgt
ctggggaggg atccgatgaa atttttggtg gttacctcta tttccataaa 1080gcgcccaaca
gggaggagtt ccaccatgag cttgttcgca agatcaaggc gctgcatatg 1140tatgattgcc
agagagccaa taagtcgacg tctgcctggg gtttggaggc gcgtgttccc 1200ttcctagaca
aagaatttat ggaagttgcc atggctatcg atcctgcgga aaagctgatc 1260aggaaggacc
aaggaagaat agagaagtgg gtgctccgaa aagctttcta cgacgaaaag 1320aatccttacc
tgcccaagca cattttgtat cgccagaagg agcaattcag cgatggcgtt 1380ggctacagct
ggattgacgg cctcaaggct catgcacaga gccatgtatc cgaccaaatg 1440ctgaagcatg
caaagcacgt gtacccctac aacacgccgc agactaaaga agcatactat 1500taccgaatgc
tcttcgagaa acacttcccg cagcaatccg ctcgcttgac ggtccccgga 1560ggtgctagcg
tcgcatgtag cacggccaca gcagttgcat gggacaagtc ctgggcgggc 1620aacctggacc
catctggccg agcagcattg ggatgccacg acgcggccta cacggaaaac 1680agcgctgcaa
tgagttacat aacaaaaaac atgtcaaatg ttggacaaaa aatgaccata 1740cattga
1746
105 2199 DNA Physcomitrella patens
105 atgtgtggaa ttctagcgat tctcggtgcc
gacggcgccg ttccgtctgc cggacgtgat 60cgcgctctag cgctgtcccg aaggctgcgc
catcgaggac ctgactggag tggactcttt 120gagggcaagg attcctggtg ttacctcgct
catgagcgcc tggctatcat cgatccggct 180tcgggtgatc aacccctcta caatggcact
aaggacatcg ttgtcgctgc taacggagag 240atttacaacc acgagttgtt gaagaagaac
atgaaaccac acgagtacca cacgcagtcc 300gattgcgaag tcattgctca tctttatgag
gatgtaggtg aggaggttgt gaacatgctt 360gacggcatgt ggtcgttcgt gctggtggac
agccgagaca actccttcat cgcagcccgc 420gaccccatcg gcatcactcc tctctatctt
ggttggggag ccgatggtag aactgtgtgg 480tttgcctcgg agatgaaagc cttgaaggac
gattgcgaac ggcttgaggt ctttccacca 540ggccacatct actcaagcaa agctggaggg
ctccgtcgct actacaaccc acagtggttc 600tcagagactt ttgttcccga aactccttac
cagcctctgg aactacgttc agccttcgag 660aaggctgtgg taaagaggct catgaccgac
gtccccttcg gtgtgctcct ttccggaggc 720ttggattctt ccttggtggc atcagtggca
gcccgacatc ttgccgaaac caaagctgtc 780agaatctggg gcaacgagct ccactccttc
tgtgttggcc ttgagggttc tcccgacctg 840aaggctgcga gggaagttgc caagtacatc
ggcacccgcc accacgaatt taacttcacc 900gtccaggaag gattggacgc tctgtctgac
gtgatctacc atgtggagac ctacgacgtg 960accaccatta gggcgagcac accaatgttc
ctcatgacac ggaagatcaa ggctctgggt 1020gtgaagatgg tgttgtctgg ggagggatcc
gacgagatct ttggtggtta cctctacttc 1080cacaaagctc ccaacaggga ggagtttcac
cacgaactag tccgcaagat caaggcgcta 1140cacttgtacg attgccagag agccaacaaa
tcaacctctg cttggggtct ggaagctcgt 1200gttcccttcc ttgacaagga gttcatggac
gttgcgatga tgatcgaccc tagcgagaag 1260atgatcagga aggacctggg cagaattgag
aagtgggtgc tgcgtaaagc tttcgatgac 1320gaagagagac catacttgcc caagcacatt
ttgtacaggc aaaaggagca attcagcgat 1380ggagtgggct acagctggat tgatggactc
aaggaatatg cggagagcca tgtgacggat 1440cagatgatga agcacgcgaa gcatgtgtac
cccttcaaca cgcccaacac caaagaagga 1500tattactacc gaatgatctt cgagaagcat
ttcccccaac aatccgcccg gatgacggtc 1560cccggaggtc cttcggtagc atgcagcacc
gccacagctg tggcatggga cgaagcatgg 1620gccaacaact tggacccctc cggcagagca
gcattgggat gccatgactc agcttacaca 1680gacaaacaca gtgagaaagc tgcaccagcg
gcagaagcta acggcacggc ttctcacgag 1740aacggccaca cattctccaa gcccaaatcc
acactggatg ccaccattct gaaaactcag 1800gccgtgcact aatctctagc aagacacacg
tttcagtagt tatctaagtg gcagcaactg 1860caaccaagcc tcagaatggg ctcccaacaa
gctgggtttc catgtgaaga gctggagctt 1920gaattgcaac atgcgccctg taacaataat
agaaaactcg ctcaaaacaa acgtagaaaa 1980atagaataaa gagtactgga ctgaaagacc
gaagaccttt gcttgagtcc tctgaggcgc 2040tggtatggat ataaaccgga cagtgtatgg
caaatagtgc gaggaaagta attttaataa 2100gttagcagct atagtttgag ctatggcagt
cacagaccca tatctgtaca agcttcactt 2160cccctaagtt atgaattccc tcgtttccag
tttcatata 2199
106 2252 DNA Physcomitrella patens
106 actgtgtggg cttgggtggt tgtggtgaag gaggacgagg aagagtaaga ggaagaggcg
60gattctgcat caagggttta tgatgctctt tgcacgacaa acctacgaat cctgacccag
120ctggtcgctt gtcgtccccc ctccttcctt tttggcttct ctcttgtctt tccgttagcg
180cttttgagga gacttgagcc gccgtcacaa tgtgtggaat tttagccatc cttgggtgcc
240atgacaagag cgtcacgcgg cggcatcgct gcctggagct ctctcgcagg ttgcggcacc
300ggggacctga ctggagtggt ttgttcgtgg acgaggcgtc gggatgttat ctggcgcacg
360aaaggttggc aattatcgat cccacgtcgg gcgaccagcc gttgttcaac gagaacaagg
420acattgtcgt cgcggtgaat ggcgagattt acaaccatga ggccctcaag gcgagcatga
480aggcacataa ataccacact cagagtgatt gtgaagttat tgcacatctg tacgaggaaa
540ttggggagga ggtggttgag aagctggatg gcatgttttc atttgtattg gtagacttgc
600gcgataagtc attcattgct gctcgcgatc cccttggaat cacaccactc tacctcgggt
660ggggcaatga tgggtctgta tggtttgcgt ctgagatgaa ggctttgaag gacgattgtg
720agcgctttga gtcgttccct ccaggtcaca tgtattccag caagcaaggt ggtctgcgta
780ggtattacaa cccaccttgg ttcaacgaaa gcatcccagc agaaccttat gacccgctca
840tactacgaca tgcctttgag aaatcagtca tcaaacggtt aatgacggat gtgccgtttg
900gagtgctgct gtcgggtggc cttgattcct cgttggtagc tgcggttgct caacgacatc
960tagccggcag tacagcagcc aagcaatggg ggaataagct tcattctttc tgtgttggac
1020tggagggctc tcccgatttg aaggctggac gggaagttgc tgattacatc ggtacggtgc
1080acaaagagtt tcatttcact gtccaggaag gtctggatgc catttctgat gtaatatatc
1140acattgaaac gtatgatgtc actacaattc gagctagtac acccatgttc ctcatgtctc
1200gaaaaatcaa agcccttggc gtgaagatgg ttctttctgg agagggttca gacgagatat
1260ttgggggtta cctttacttc cacaaagctc ctaacaagga ggagtttcac aaggaaactt
1320gtaggaagtt gaaggcactg cacttgtacg attgtttgag ggcaaacaaa tcaacatcag
1380cctggggttt ggaagctcgt gtaccattct tggataggga cttcgtaaac ctcgccatgt
1440cgatcgaccc tgctgagaaa atgataaaca agaaggaagg gaaaatcgag aagtggatca
1500tccgtaaagc ttttgatgat gaagagaacc catacctgcc caagcatatt ttgtacagac
1560agaaggagca gttcagtgac ggtgttggct acagttggat tgatggcttg aaggaccatg
1620cagccagtca ggtttctgac cagatgctgg caaatgctaa acacatttat ccccacaaca
1680ctccaggaac aaaggaaggt tactactacc gcatgatctt cgagagatgc ttcccacagg
1740agtcagcaag gcttacagtt ccaggaggac ctagtgtagc ttgcagtact gctgctgcca
1800ttgcctggga caaggcatgg gccaataact tggatccctc aggcagggca gctacaggtg
1860ttcacgattc cgcatatgaa ggtggtgagg tggagagctc agcagtgagc cacaaagaag
1920gtggtgagga tggtttggcc aactcgaaag tgggcgacaa ggttcaggaa gccatagctg
1980ttgcctgagg tgacgcatgg tgttctttga ttaggatgct cattgtaagc tgacccacct
2040actgtactgc aagcaattgt agctttatat gtattggtga acaattgcca ttttagagtg
2100atcagttttc atttccgttt actttgagat aaatgcctta tgtgtatttg agtaggaact
2160ggttaaagga cttttaaatt tgttgttgac cgtgaaagag atcaaccttc aggtatatat
2220tgttttcgaa tgagcttgtt tttcaaaccc tc
2252
107 2289 DNA Populus trichocarpa
107 ctcattcaac aataacaaaa caagctcttg
ctctacgtgt tggtgtttcc tattaacagc 60ccatctcctt ctcctgccac ctcgctttcc
tttttattac cagattttct tctttcatta 120ctacccaatt tcatctctat agtttatcca
tccatttttc tctgtctttg tttttaagat 180atacatatct agcaaaatct tcttttatct
gctatatcgt ttttttttaa gaaacgacga 240tgtgtgggat acttgctgtt ttgggttgtt
ctgacgactc tcaggccaag agggttcggg 300tgctagagct ctctcgcagg ttgaagcacc
gtggtccaga ttggagtggg ctctatcagt 360gcggtgactt ttacttggct catcaaaggc
tggctattat cgatcctgct tctggtgacc 420agccactctt taatgaggac caagccatcg
ttgtcacggt gaacggagaa atttacaacc 480atgaagaact aaggaagcgt ttgccaaatc
acaagttccg aacaggcagt gactgtgatg 540ttatcgccca tctgtacgag gaatatggcg
aaaattttgt ggacatgttg gatggaatgt 600tttcatttgt tctgctggat actcgtgaca
acagtttcat tgttgctcgt gacgccattg 660ggatcacccc cctctatatt ggctggggac
ttgatgggtc cgtgtggatt tcatctgaac 720tgaaaggtct gaatgacgac tgtgaacatt
ttgagtgctt tcctcctggt catttgtact 780cgagtaaatc gggtggatta cgtcgttggt
acaatcctcc ttggttctgc gaggccattc 840cctcaacccc atatgatcca cttgttctga
gacgtgcatt tgaaaaggct gtgattaaaa 900ggctaatgac tgatgtgcct tttggagttc
ttttatctgg aggcctagat tcatcactgg 960ttgctgctgt tactgctcgc catttggcag
gtacaaaggc tgccagacaa tggggggcac 1020aactccattc cttctgtgtt ggcctagaga
attcaccaga tttgaaggct gcaagagaag 1080ttgcagatta tctgggaacc gtccaccatg
aattttactt cacggttcag gatggtatag 1140atgccattga ggatgtcata taccatatag
aaacatatga tgttacaacc atcagagcaa 1200gtacccctat gttcctaatg gctcgtaaga
tcaaggcact aggagtgaag atggttattt 1260ctggtgaagg ttctgatgag atttttggtg
ggtatttgta ctttcataag gcacctaaca 1320aagaagagtt acaccgcgaa acatgtcgca
agataaaggc ccttcatcaa tatgattgct 1380tgagagctaa caaggcaaca tctgcttggg
gtttagaagc ccgtgtcccc ttcttggaca 1440aggattttat taatgttgca atggctattg
atcctgaatg gaagatgatc aaacctggac 1500aaggccatat tgagaaatgg gtccttagga
aagcctttga cgacgaggag catccttatc 1560tgcctaagca tattctttac aggcagaaag
agcaatttag cgatggtgtt ggctatagct 1620ggatcgatgg tctcaaagct catgctgccc
aacatgtgac tgacaagatg atgcaaaatg 1680ctgagcacat ctttccacat aataccccta
ccaccaaaga agcctattac tacagaatga 1740tttttgagag gttcttccca cagaactcag
ccaggctgtc tgttcctgga ggagccagtg 1800tagcatgcag cacagctaaa gctgttgaat
gggatgctgc ctggtccaat aatctggatc 1860cttctggacg ggctgcattg ggtgtacatc
tctctgatta tgatcagcag gcagctcttg 1920ccaatgcagg agtggtgcca ccaaaaatta
ttgacactct tcctcgaatg ttggaagtta 1980gtgcttcggg agttgcgatc cacagttagc
gcctgctgga ggactaagta ttggtgaatt 2040tgatatctat agccttggta ttatttaaac
ttgtgttgcc ttgtatatgt aaaaatctta 2100gaggtcatat gtagatgtta caaataatga
tccgtggtcc tttgaagtcg tgtgttgtca 2160ttactttgtg gtttttgtac aaggtaattc
atgtatgtta tcaatgccct gtagctgttt 2220aaagctgcaa ggcaaccttt cctactgttt
aaagctgtaa tgcaaccttt cctatggttt 2280ctttgcttc
2289
108 2205 DNA Populus trichocarpa
108 gctcattcaa caataacata acaggctatt actctacgga ttatggtttc ctgttaacac
60tccatctccc tctcctcctg cttctttgtt ttcccttttt ttttcccagt attattctct
120cgttattacc tggttccatc tttatcttcg atcttaagat atacttaagc tacttctatc
180ttcaatatcg aacgttttat ttttgaaaaa caaagaagga tgtgtgggat acttgctgtt
240ttgggttgtt ctgatgactc tcaggccaaa aggtttcgag tgcttgagct ctctcgcaga
300ttgaagcacc gtggtcctga ttggagtggg ctctttcagc acggtgactt ctacttggct
360catcaaaggc tagccattat tgatccggct tctggtgatc agcctctctt taatgaagac
420caagccatcg ttgtcacggt gaacggagaa atttacaatc atgaagaact gaggaagcgc
480ttgccaaatc acaagtttcg aacaggcagt gactgtgatg ttatctccca tttgtacgag
540gaatatggcg agaattttgt ggacatgttg gatggaatgt tttcatttgt tctgctggat
600actcgtgaca acagtttcat tgtcgcccga gacgccattg ggatcacctc cctctacatt
660ggctggggac ttgatgggtc tgtgtggatt tcgtcggaat tgaaaggtct gaatgatgac
720tgcgaacatt tcaagtgctt tccacctggt catatatact cgagcaaatc cggtggatta
780aggcgttggt ataatcctct ttggttctct gaggctattc cctcgacccc atatgaccca
840cttgctctga gaagggcatt tgaaaaggct gtgattaaga ggctgatgac tgatgttcct
900tttggagtgc ttttatccgg gggactagat tcgtcattgg ttgctgctgt gactgcccgg
960catttggcag gtacacaggc tgccagacaa tggggggcac atctccattc cttctgtgta
1020ggcctagaga attctccaga tctgaaggct gctagagaag ttgcagatta tttgggcacc
1080atccaccatg aatttcactt cacagttcag gatggtattg atgccattga agatgtcata
1140taccatgttg aaacatatga tgttacaacc atcagagcaa gtacccctat gttccttttg
1200gctcgtaaga tcaaggcgct aggagtgaag atggttattt ccggtgaagg ttctgatgag
1260atttttggtg ggtatttgta ctttcacaag gcacctaata aggaagagct ccacggcgaa
1320acatgtcgca agataaaagc ccttcatcaa tatgactgct tgagagctaa caaagcaaca
1380tctgcttggg gtctagaagc ccgcgtcccc ttcttggaca aggattttat taatgttgca
1440atggctattg atcctgaatg gaagatgatc aaacctggac gtatcgagaa atgggttctt
1500aggaaagcct ttgacgacga ggagcatcct tatctgccaa agcatattct gtacaggcag
1560aaagagcaat ttagtgatgg cgttggctac agttggattg atggtctcaa agctcatgct
1620gaattacatg tgcacgacaa gatgatgcaa aatgctgagc acatctttcc acataatacc
1680cctaccacca aagaggccta ttactacaga atgatttttg agaggttctt cccacagaac
1740tcagcgaggc tgactgttcc tggaggagcc agtgtagcat gcagcacagc taaagctgtt
1800gaatgggatg cttcctggtc caacaatctc gatccttccg gccgtgctgc attgggtgtg
1860catctttctg cttatgaaca gcaggcagct cttgccagtg ctggagtggt gccaccggag
1920attattgaca atcttcctcg aatgatgaaa gttggtgctc caggagttgc aatccaaagt
1980tagcttctgc tggaggaccg aagtacatgc cttgtacatg tataaatcat atagatcatg
2040tgtagaagtt acgaataata atctctgctc gtttgtagta gtgttggcac cttgttgttt
2100ctgtacaagg caattcaagt gtgcaatcga tgttctgtag ctgtttaaag ttgtaatgca
2160acctttcctc tggtttcctt acttcataga cgaatccttt gtttt
2205
109 2069 DNA Arabidopsis thaliana
109 ccattgttat ttgttttcgt tgccactcta
acacaatgtg tgggattctc gctgttcttg 60gttgcatcga caactctcaa gctaaacgtt
ctcgtatcat cgaactctct cgcagattga 120ggcacagagg tcctgattgg agtggactcc
attgttatga agattgttat cttgcccatg 180agcgtttagc catcattgac cctacttcag
gagaccaacc tctctataac gaagacaaga 240ccgtcgctgt cactgtaaat ggagagatat
acaaccacaa gattttgcgt gaaaagctta 300agtctcatca gttccgtact ggtagtgact
gtgaagtgat tgcacatctt tacgaagaac 360atggagagga atttatcgac atgttggatg
gaatgttcgc gtttgtcctt cttgatactc 420gcgacaaaag ttttattgct gcaagggacg
ctattggtat cactccactt tacattggat 480ggggtcttga tggttctgtc tggtttgctt
cggagatgaa agcgcttagt gatgattgtg 540aacagtttat gtcttttcct cctggccaca
tctactcaag taaacaagga gggcttagga 600ggtggtacaa tcctccgtgg tacaatgagc
aggttccttc aaccccatat gatcctttag 660ttctgcgcaa tgctttcgag aaggctgtaa
taaagagact tatgactgat gtgccttttg 720gagttctcct atctggagga ttggactcgt
ctctcgttgc tgcagtagca ttacgccatt 780tggaaaaatc agaagctgct cgtcaatggg
gttcacaatt gcacacgttt tgcatcggtt 840tgcagggatc gccagatctt aaagctggca
gagaagttgc tgactatctt ggaacacgcc 900accacgagtt tcagtttaca gttcaggacg
ggatagacgc gatagaggaa gtcatttacc 960atattgagac ttatgacgtt actacaataa
gagctagcac cccaatgttt cttatgtcca 1020gaaaaattaa atctttaggt gtaaagatgg
ttctttctgg ggaaggttct gatgaaatac 1080tggggggata cttgtacttc cataaggctc
ccaacaagaa agaatttcat gaagaaacat 1140gccgaaagat caaagctctc caccaatttg
attgtttgag agctaacaaa tcaacttctg 1200cgtggggtgt cgaagctcgt gtgcctttcc
tagataaaga atttttaaat gttgcaatga 1260gcatcgatcc agagtggaag ttgatcaagc
ctgatctcgg aaggatcgag aagtgggtgc 1320tacgcaatgc ctttgatgat gaagaacgac
cttatctacc aaagcacatt ctatatagac 1380agaaagaaca gtttagtgat ggagttgggt
atagctggat agatggtctg aaagatcatg 1440caaataaaca tgtctctgat actatgctgt
caaacgcaag ctttgtcttc ccggataaca 1500cacctctgac aaaagaagcg tactactaca
gaaccatctt cgagaagttc ttcccgaaga 1560gtgctgctag agcgactgta ccaggaggtc
caagtatagc ttgcagtacc gcgaaagctg 1620tagaatggga tgcaacttgg tcaaagaatc
ttgatccgtc aggccgtgcg gctcttggag 1680ttcatgttgc agcttatgaa gaggataaag
cagctgctgc tgctaaggct ggatcggatt 1740tagtagatcc tcttcctaag aatggaacat
aagagaacaa cactacaggc attgaggata 1800taagcaaatg ttttattctt ctacacagag
agatcgttat cttctagagg gatcaatgaa 1860taaaagcttc gtccatttct agctggagat
tccatggatc tccagttagt gcaagtgata 1920cacgttgtct acatttgtac ctaagtttct
gcattttttg tcgttctttt gtgttagaca 1980agtcttggac cctagatgat acttcagttt
cttagacgtt aaatttgatg aatccgaact 2040tgtttgattt caaagcctgg cctttctgc
2069
110 2207 DNA Arabidopsis thaliana
110 agacatcaaa aacacgaata tcgatagtac acttctacgt gcaattttct cctttctctt
60cctggacatc tgtctgttta ttacattttc ttgtaatctc tttttggggt tttacaatat
120ctatccccta aagtttcgga aaattctgtt tttctgttct cattcttcgt gatctttttc
180actttcttca aaaaaaaaac atgtgtggaa tacttgccgt gttaggatgt tccgatgatt
240ctcaggccaa gagagttcgt gttcttgagc tttctcgcag attgaggcac agaggacctg
300actggagtgg cttatatcag aacggagata attacttggc ccatcaacgt cttgccgtca
360tcgatcctgc ttccggtgat caacctcttt tcaacgagga caagaccatt gttgtcacgg
420tgaacggaga gatttataac catgaggagc tgagaaaacg tctgaagaat cacaagttcc
480gtactggtag tgattgtgaa gtcattgctc acttgtacga ggagtatggt gtggattttg
540ttgatatgtt ggatggaata ttctcctttg tgttgctcga cacacgagat aactctttca
600tggtggctcg tgatgcgatt ggtgtcactt cgctctacat tggttgggga ctagacggat
660ctgtgtggat atcttcagag atgaaaggcc taaacgatga ctgtgagcat ttcgaaacgt
720ttcctccagg tcatttttat tcaagcaaat taggagggtt taagcaatgg tataatcctc
780cttggttcaa tgaatctgtt ccttcaacgc cttatgagcc tcttgcgata agacgcgcct
840ttgaaaacgc tgtgattaag cggttgatga ctgatgttcc atttggagtt ttgctctctg
900gtggtcttga ttcttccctt gttgcctcca tcactgcacg tcacttggcc ggtactaagg
960cggctaagca atggggtcct cagctccatt ccttttgcgt tggtcttgag ggctcaccgg
1020acttgaaggc agggaaagag gtggcggaat atttggggac ggtgcaccac gagttccact
1080tctcggtgca ggacgggatt gatgcgatag aggatgtgat ttaccatgtt gagacctatg
1140atgtgacgac tatcagagcg agcacaccga tgttcttgat gtcccggaaa atcaagtctc
1200taggggtcaa gatggttctc tccggcgaag gtgcggacga gatctttgga gggtacctct
1260atttccacaa ggcacctaac aagaaagagt ttcaccaaga aacttgtcgc aagatcaagg
1320ctcttcacaa gtatgactgt ctaagagcca acaaatctac ctctgccttt ggactagagg
1380cacgtgttcc tttccttgac aaagacttca tcaacacagc tatgtctctc gaccctgaat
1440ccaagatgat caagccagag gaaggaagga tcgagaaatg ggttctaagg agagcctttg
1500acgacgaaga acgtccttat ctaccaaaac acattctcta cagacagaaa gaacagttca
1560gtgatggtgt tggctacagt tggatcgatg gcctgaaaga tcacgctgct caaaatgtca
1620atgacaagat gatgtcgaac gcggggcata tcttccctca caacactcca aacactaaag
1680aagcttacta ctacagaatg atctttgaaa ggttcttccc gcagaactct gcgagactaa
1740cggttcctgg aggtgccacc gtggcttgtt cgactgcaaa ggcagtggag tgggatgcaa
1800gctggtccaa caatatggat ccatcaggaa gagccgctat cggagttcac ctttcggcct
1860acgatggcaa gaacgtggca ttgaccatac caccacttaa ggcaattgac aacatgccga
1920tgatgatggg tcaaggagtt gtgattcagt cataacttcg aaggagaaat ggatgaaata
1980tgtgttatat cttcccaatg ggtgaagtgt tttgtatgat tttaaaaata agaatgtgat
2040cctttttttt tcctatgaag atctgaatgt ataatctatc ttgtaaaaat ttgtttcttt
2100gtaagatttg aatgtaccgc ttttacgtag atcgatgtac atcaatctta taagtttcaa
2160ttatgtatta tattatgtcg atttgccaaa aataaatcta aaacctc
2207
111 2030 DNA Arabidopsis thaliana
111 tccatttctc tgaagccgtt gtgttctctt
attgccgcca ccaccaccat gtgtgggatt 60ctcgctgtgt taggctgcgt cgataactct
caagctaaac gttcccgtat catcgaactc 120tctcgcagat tgaggcatag aggtcctgac
tggagtggtc tacattgtta tgaggattgt 180tatttggctc atgagcgttt ggctatcgtt
gaccccactt ctggagatca accactctat 240aacgaagata agaccattgc tgtcacggtc
aatggagaga tttacaacca caaggctttg 300cgtgaaaatt tgaagtctca ccaattccgt
actgggagtg attgtgaagt gattgcccat 360ctttacgaag aacatggaga ggaatttgtc
gacatgttgg atggcatgtt tgcatttgtg 420cttcttgata cccgagacaa aagctttatt
gctgcaaggg atgccattgg tatcactcca 480ctctacatcg ggtggggtct cgatggttct
gtttggtttg cttccgagat gaaagcactt 540agtgatgatt gtgagcagtt tatgtgcttc
cccccaggcc acatctattc aagtaaacaa 600ggtgggctta ggaggtggta caaccccccg
tggttctctg aggttgttcc ttcaacccca 660tatgatcccc tagtggtgcg caatactttt
gagaaggctg ttataaaacg actaatgact 720gatgtgcctt ttggtgtcct cctatctggt
ggattagatt catcccttgt tgcttcagta 780gcattacgcc atctggaaaa atcagaagct
gcttgtcagt ggggttcaaa gttgcacaca 840ttttgtatcg gtttgaaggg atccccggat
cttaaagctg gcagagaagt cgctgactat 900ttaggaactc gccaccacga gttacacttt
acagttcagg acggaataga tgccatagaa 960gaagtcatct accatgttga gacctatgat
gtgactacta ttagagccag cactccaatg 1020tttcttatgt cgcgaaaaat caaatcgctt
ggtgtaaaga tggttctttc tggggaaggc 1080tctgatgaaa tttttggagg atatttgtac
ttccataaag ctcccaacaa gaaggaattt 1140catgaggaaa catgtcgaaa gatcaaagct
cttcatcaat atgactgctt gagggctaac 1200aaatcaactt ctgcatgggg tgttgaggct
cgtgtacctt tcctcgataa agaatttata 1260aatgtcgcaa tgagcatcga tccagagtgg
aagatgatta ggcctgattt gggaaggatc 1320gagaaatggg tgttacgcaa tgcctttgat
gatgagaaaa atccttacct accaaagcac 1380attctatata ggcagaaaga acagttcagt
gatggagttg gatacagctg gattgatggt 1440ctaaaagatc atgcaaacaa acatgtctct
gagacaatgc tgatgaacgc aagctttgtc 1500ttccctgata acacaccttt gacaaaagaa
gcttactact acagaaccat ctttgaaaag 1560ttcttcccta agagtgctgc tagagcaact
gtaccaggag gtccaagtgt ggcatgtagc 1620acagcaaaag ctgtggaatg ggacgcagct
tggtcacaga atcttgaccc atcaggtcgt 1680gcggctcttg gagttcatgt ttcagcttat
ggggaagata aaaccgaaga ttctcgtccc 1740gagaagctac agaaactagc agagaagact
ccagccattg tttgaggata aacaaacaag 1800gtttcagcta atgttgaatc gtgcaatact
cttattgtct caaagacaat agatatcctc 1860ttctataggt tctaaaaagg ctttcttttt
ttcttgtttt ctggggttct ttggatgtgt 1920acctaataag ttcctggtga atttctgtgt
ttagtgttat tagacaatcc atgaaagctt 1980gatacttcag attatgaacg ttatttttca
tgaatccgat tctttctttc 2030
112 2141 DNA Triticum aestivum
112 gcgacgtgta gccctgctct ccgccatctc cggccaggca tctatctacc tacaagtaga
60gccaagccat tcctgcacac ctccatacag aaacacaatt cagatcgact agctcgctgc
120tggctgtaga ggacgatcga cgacgatcca gaggagcagc ataaccgagg agagcggagc
180atgtgcggca tactagcggt gctggggtgc ggcgacgagt cgcaggggaa gagggtccac
240gtgctagagc tctcgcgcag gctcaagcac cggggcccgg actggagcgg cctgcaccag
300gtcgccgaca actacctctg ccaccagcgc ctcgccatca tcgacccggc ctccggcgac
360cagccgctct acaacgagga caagtccatc gccgttgccg tcaacgggga ggtctacaac
420catgaggagc ttcgggcacg gctctccgga cacaggttcc ggaccggcag tgactgcgag
480gtcatcgccc atctgtatga ggaatacgga gaaagcttca ttgacatgtt ggatggtgtt
540ttctccttcg tgttacttga cgcacgagat aacagcttca ttgctgctcg tgatgccatt
600ggtgtcacgc ctctctacat tggctgggga attgatggtt cagtgtggat atcttcggag
660atgaaaggac taaacgatga ttgtgagcac ttcgagatct tcccgcctgg taatctttac
720tccagcaaag agaagtcctt caagagatgg tataaccctc cttggttctc tgaggtcatc
780ccctcggttc cctatgaccc actgcgtctc agatcggcat ttgaaaaggc tgttatcaag
840aggctcatga cagatgttcc atttggcgtc ctcctctccg gtggtctcga ctcatcattg
900gtggctgctg tcgcagcccg tcatttcgct gggacgaagg ctgcaaagcg ctggggaact
960aggctccact ccttctgtgt ggggcttgag ggatcaccag atctcaaggc tgcaaaggag
1020gtcgcggatc acctgggtac cgtgcaccac gagttcaact tcacagttca ggatggcatc
1080gatgcaattg aagatgtgat ataccacatt gaaacatatg atgtgacgac gatcagggca
1140agcacactga tgttccagat gtcacgcaag atcaaggcgc ttggagtcaa gatggtcatc
1200tccggtgagg gtgccgatga gatcttcgga gggtacttgt atttccacaa ggcccctaac
1260aaggaggagt tccaccagga aacatgtcgg aagataaaag ctctccatca gtacgattgc
1320ttgagggcca acaaagcaac atctgcatgg ggccttgagg ttcgtgtgcc attcttggac
1380aaggagttca tcaatgaggc tatgagcata gatcccgaat ggaagatgat ccggcctgat
1440cttggaagaa ttgagaaatg gatactgagg aaagcgttcg atgatgagga gcgacccttc
1500ctgccgaagc atattctgta caggcagaag gagcagttta gtgatggtgt tgggtatagc
1560tggattgatg gcctgaagga tcatgcagcc tcaaatgtga gtgataagat gatgtccaat
1620gcaaagttca tctacccaca caacacccca acaactaaag aggcctacta ttacaggatg
1680atctttgaga ggtacttccc ccagagctcg gcgatcctca cggtgccagg cgggccaagc
1740gtggcgtgca gcacagccaa ggctatagag tgggatgccc aatggtctgg gaacctggac
1800ccctctggga gagcagcgct tggagtccat ctctcagcct acgagcagga cacggtcgct
1860gtgggaggta gcaacaagcc tggggtgatg aacaccgtgg tacctggtgt tgccattgag
1920acttgatgaa tggtacatgt atcatatcgt gtcctactaa aggcaaataa gaacggttgt
1980gtgcatcgct tcatgtagag gccgggcata ctccttttca aaaaaaaaag agaaaataag
2040atgcatatgt tcttgtcagc gttgtaataa gacgggccta tgttttgcta tttaattaaa
2100gggttaatta tccttttgcc ttgagtgatg tctgtgtgct c
2141
113 2032 DNA Triticum aestivum
113 gcacgaggcc catcctcctt cagaagcaca
gagagagatc ttctagctac atactgttgc 60cgtcgatcca ggaaaatgtg cggcatactg
gcggtgctgg gctgcgctga tgacacccag 120gggaagagag tgcgcgtgct cgagctctcg
cgcaggctca agcaccgcgg ccccgactgg 180agcggcatgc accaggttgg cgactgctac
ctctcccacc agcgcctcgc cattatcgac 240cctgcctctg gcgaccagcc gctctacaac
gaggacaagt ccatcgtcgt cacagtgaat 300ggagagatct acaaccatga acagctccgg
gcgcagctct cctcccacac gttcaggaca 360ggcagcgact gcgaggtcat cgcacacctg
tacgaggagc atggggagaa cttcatcgac 420atgctggatg gtgtcttctc cttcgtcttg
ctcgatacac gcgacaacag cttcattgct 480gcacgtgatg ccattggcgt cacacccctc
tatattggct ggggaattga tgggtcggta 540tggatatcat cagagatgaa gggcctgaat
gatgattgtg agcactttga gatctttcct 600cctggccatc tctactccag caagcaggga
ggcttcaaga gatggtacaa cccaccttgg 660ttctccgagg tcattccttc agtgccatat
gacccacttg ctctcaggaa ggctttcgaa 720aaggctgtca tcaagaggct tatgacggac
gttccattcg gtgttctact ctctggtggc 780cttgactcat cattggttgc agccgttaca
gttcgtcacc tggcaggaac aaaggctgca 840aagcgctggg ggactaagct tcactctttt
tgtgtcggac ttgaggggtc acctgatctg 900aaggctgcaa aggaggtagc caattacctg
ggcaccatgc accatgagtt caccttcact 960gttcaggacg gcattgatgc aattgaggat
gtgatttatc acaccgaaac atatgatgtg 1020acgacaatca gggcaagcac gccaatgttc
ctgatgtcac gcaagatcaa gtcacttggc 1080gtcaagatgg tcatctctgg tgagggttcc
gatgagattt tcggagggta cctctacttc 1140cacaaggcac ccaacaaaga ggagctccac
cgtgagacat gtcaaaagat caaagctctg 1200catcagtacg attgcttgag ggccaacaag
gcaacatctg catggggcct cgaagcacgt 1260gtgccattct tggacaagga gtttatcaat
gaggcaatga gcattgatcc tgagtggaag 1320atgatccggc ctgatcttgg aagaattgag
aaatggatgc tgaggaaagc atttgatgac 1380gaggagcaac cattcctgcc gaagcacatt
ctgtacaggc agaaagagca gttcagtgat 1440ggtgttggct acagctggat tgatggccta
aaggctcacg cagaatcaaa tgtgacagat 1500aagatgatgt caaatgcaaa gttcatctac
ccacacaaca ccccgactac aaaagaggcc 1560tactgttaca ggatgatatt tgagaggttc
ttcccccaga actcggcgat cctgacggtg 1620ccaggtgggc caagcgttgc atgcagcacg
gcgaaggcgg tagagtggga tgcccagtgg 1680tcagggaacc tggatccctc agggagagca
gcacttggag tccatctctc ggcctatgaa 1740caggagcatc tcccagcaac catcatggca
ggaaccagca agaagccaag gatgatcgag 1800gttgcggcgc ctggtgtcgc aattgagagt
tgatggtgtc ctgtcctgct tgccgtttct 1860gataagaaat aagatgtacc tggtcttgcc
attagagtgg tgcagaccta aggtttgagt 1920gaagattgtg cattaatgtt tctattgttc
ttatgaaatc ggagaccggt gatttctaat 1980cctttctggc aacttccatc aaaacattat
tacatgatgg ttattatttg ac 2032
114 1770 DNA Vitis vinifera
114 atgtgcggaa tacttgcagt tctgggttgt tctgatgatt cccaggccaa aagggtccga
60ttgttttacc attgttattt atgcttctgt gataggttga agcatcgtgg tcctgactgg
120agtgggctat accaacatgg agattgttat ttagctcatc agcggctagc aatcatcgat
180ccagcttctg gtgatcagcc tctatataat gaaaaccaag ccattgttgt gacggtgaat
240ggagaaattt ataaccatga ggagttgagg aagagcatgc caaatcacaa gttcaggacc
300gggagcgatt gcgatgtcat tgcccatttg tacgaggagc atggggaaaa ttttgtggac
360atgttggatg gaatgttctc atttgtcctg ctggataccc gtgatgatag cttcattgtt
420gcccgagatg ccatcggaat cacctccctc tatattggtt ggggacttga tggtagctcg
480gtatggattt catctgagct caaaggtttg aatgatgact gtgaacattt tgagagcttt
540ccacctggtc acatgtactc tagcaaagag ggtggattca aaagatggta caacccccct
600tggttctctg aggctattcc atcggcacca tatgaccctc ttgttctgag gcgagctttt
660gagaatgccg tgatcaagag gttaatgacc gatgttcctt ttggggttct gctgtcagga
720ggtctggatt catccttagt tgcctctatt accgctcgcc acttagcagg cacaaaggct
780gctaagcagt ggggagcaca gctccattcc ttctgtgttg ggctagaggg ctcaccggat
840ctgaaggctg caaaagaagt tgcagactat ttgggcaccg ttcaccacga gtttcacttc
900accgttcagg atggtatcga tgccattgag gatgttattt accatattga aacttatgat
960gtgacaacga tccgagcaag tacccctatg tttctcatgt cgcgtaagat taagtcacta
1020ggagtgaaga tggtgatatc cggagagggc tctgatgaga tttttggtgg gtacttatac
1080tttcacaagg cgcccaacaa ggaagagttc catagggaaa catgtcgcaa gataaaggca
1140ctctaccagt atgattgctt gagagctaat aaatcaacat ctgcatgggg tttggaagcc
1200cgggtcccct ttttagacaa ggaattcatt aaagttgcaa tggatattga ccctgagtgg
1260aagatgataa agccagaaca agggcgaatt gagaaatggg ttctgaggag ggcttttgat
1320gatgaggaac aaccctatct gccaaagcat attctctaca ggcaaaaaga gcaattcagt
1380gatggtgtcg gctacagttg gattgatggg ctcaaagccc atgcgtcaca acatgtgacc
1440gataaaatga tgctcaatgc ttcacatatc ttcccacaca atacccctac cacaaaagaa
1500gcctactatt accgaatgat ctttgagagg ttcttcccac agaactcagc taggctgact
1560gttccgggag gagcaagcgt tgcatgcagc actgccaaag cagttgaatg ggattctgcg
1620tggtcaaata accttgatcc ttctggcagg gcggcattag gagtccatct ttcagcttat
1680gaccagaagt taaccacagt cagtgctgca aatgtgccaa caaagatcat tgataatatg
1740ccgcggatta tggaagtaac cgcaccctga
1770
115 1737 DNA Volvox carteri
115 atgtgcggaa tcctagctgt gctcaactcc
acggatgata gcccggcgat gagggcgaag 60gtgctggcgc ttagtcgtcg ccagaagcat
cgtggccccg actggtcggg gatgcaccag 120tttggcaaca acttcctggc gcatgagcgg
cttgcgatta tggatcccag ctcgggcgat 180cagccgctgt acaacgagga caagtctatc
gtcgtgacgg tgaacggcga gatctacaat 240tataaggaac tgcgcaagga gatctctgac
aagtgccctg gcaagaagtt ccgcaccaac 300agcgactgtg aggtgatcag ccacctgtac
gaactgtacg gcgaggcagt tgccaacaag 360ctggacggct tctttgcctt tgtactgctg
gacactcgca acaacacctt cttcgcggcg 420cgcgatccgt tgggcgtcac ctgcatgtac
attggctggg gccgggatgg cagcgtgtgg 480ctgtcctccg agatgaaatg tctcaaggac
gactgtgcgc gcttccagca attccctccc 540ggccattatt actcgtccaa gacaggcgag
tttgtgcggt acttcaaccc ccagttttac 600ctggactttg aggcagagcc gcaggttttc
ccctcggtgc cctacgaccc cgtcacgttg 660cgcacggcgt ttgaggcggc cgtggagaag
cgcatgatga gcgacgtgcc cttcggtgtg 720ctgctgagtg gcggtctgga cagcagcctt
gtggcctcta tcgcggcccg caaaatcaag 780cgggagggca gtgtgtgggg caagctgcac
agcttctgcg ttggtctgga gggcagcccc 840gacctcaagg caggtgccgc tgtggctgag
tttctgggca ccgaccacca cgagttccac 900tttacagtgc aggagggcat tgacgccatc
tcggaggtca tttaccacat cgagacgttt 960gacgtgacca cgatccgcgc ctccactccc
atgttcctga tgagccgcaa gatcaaggcc 1020ctgggtgtca agatggtgct gtccggagaa
ggctcggatg aggtgttcgg gggttacctc 1080tacttccata aggctcccag caaggatgag
ttccacagcg aaacggttcg caagctgaag 1140gacctgttca agtacgactg cctgcgagcc
aacaaggcca ccatggcctg gggtgtagag 1200gcgcgtgtgc ccttcctgga ccgggcattc
ctggatgtgg ccatgtccat tgacccggcg 1260gagaagatga ttgacaagag caagggccgg
atcgagaaat acattctccg gaaagccttc 1320gatacgcccg aggatccata cctgcctaag
gaggtactgt ggcgccagaa ggagcagttc 1380agcgacggcg tgggctacaa ctggattgat
gggctcaagg cgcatgctga gagccaagtc 1440agcgatgaga tgctcaagaa cgccgtgcac
agattcccgg acaacacccc gcgcaccaag 1500gaggcctact ggtaccgctc tatctttgag
agccacttcc cgcagcgtgc tgctatggag 1560acggtgccgg gtggtccctc agtggcttgc
tccaccgcga cagccgccct gtgggatgca 1620gcgtgggccg ggaaggagga cccgtcgggc
cgcgccgtgg cgggcgttca tgacgctgct 1680tacgaggaag gcgcggaagc caatggcgag
cccgcatcca aaaagcaaaa ggtctga 1737
116 2197 DNA Zea mays
116 gatcgtctcg
tctccctccc aaaaaaaaaa aaaaaaactg ctcggttgct gctcctgctc 60cgccgcgccg
gcatcatgtg tggcatctta gccgtgctcg gttgctccga ctggtctcag 120gcaaagaggg
ctcgcatcct cgcctgctcc agaaggttga agcacagggg ccccgactgg 180tcgggcctct
accagcacga gggcaacttc ctggcgcagc agcggctcgc cgtcgtctcc 240ccgctgtccg
gcgaccagcc gctgttcaac gaggaccgca ccgtcgtggt ggtggccaat 300ggagagatct
acaaccacaa gaacgtccgg aagcagttca ccggcacaca caacttcagc 360acgggcagtg
actgcgaggt catcatcccc ctgtacgaga agtacggcga gaacttcgtg 420gacatgctgg
acggggtgtt cgcgttcgtg ctctacgaca cccgcgacag gacctacgtg 480gcggcgcgcg
acgccatcgg cgtcaacccg ctctacatcg gctggggcag tgacggttcc 540gtctggatcg
cgtccgagat gaaggcgctg aacgaggact gcgtgcgctt cgagatcttc 600ccgccgggcc
acctctactc cagcgccggc ggcgggttcc ggcggtggta caccccgcac 660tggttccagg
agcaggtgcc ccggatgccg taccagccgc tcgtcctcag agaggccttc 720gagaaggcgg
tcatcaagag gctcatgact gacgtcccgt tcggggtcct cctctccggc 780ggcctcgact
cctcgctcgt cgcctccgtc accaagcgcc acctcgtcga gaccgaggcc 840gccgagaagt
tcggcaccga gctccactcc tttgtcgtcg gcctcgaggg ctctcctgac 900ctgaaggccg
cacgagaggt cgctgactac cttggaacca tccatcacga gttccacttc 960accgtacagg
acggcatcga cgcgatcgag gaggtgatct accacgacga gacgtacgac 1020gtgacgacga
tccgggccag cacgcccatg ttcctgatgg ctcgcaagat caagtcgctg 1080ggcgtgaaga
tggtgctgtc cggggagggc tccgacgagc tcctgggcgg ctacctctac 1140ttccacttcg
cccccaacaa ggaggagttc cacagggaga cctgccgcaa ggtgaaggcc 1200ctgcaccagt
acgactgcct gcgcgccaac aaggccacgt cggcgtgggg cctggaggtc 1260cgcgtgccgt
tcctcgacaa ggagttcatc aacgtcgcga tgggcatgga ccccgaatgg 1320aaaatgtacg
acaagaacct gggccgcatc gagaagtggg tcatgaggaa ggcgttcgac 1380gacgacgagc
acccttacct gcccaagcat attctctaca ggcagaaaga acagttcagt 1440gacggcgttg
gctacaactg gatcgatggc ctcaaatcct tcactgaaca gcaggtgacg 1500gatgagatga
tgaacaacgc cgcccagatg ttcccctaca acacgcccgt caacaaggag 1560gcctactact
accggatgat attcgagagg ctcttccctc aggactcggc gagggagacg 1620gtgccgtggg
gcccgagcat cgcctgcagc acgcccgcgg ccatcgagtg ggtggagcag 1680tggaaggcct
ccaacgaccc ctccggccgc ttcatctcct cccacgactc cgccgccacc 1740gaccacaccg
gcggtaagcc ggcggtggcc aacggcggcg gccacggcgc cgcgaacggc 1800acggtcaacg
gcaaggacgt cgcagtcgcg atcgcggtct gacgagagta cgtgctcgcg 1860cacctccctg
ctagcttcta ccgggctgca gcctgcagca tgcactgtgc gagcacagcc 1920gatcagcgcc
aataaactgg aggataagaa cgactggtag gtgtgtgtgt gtgtcgtgcg 1980tgcccaccgg
ccatatcccg gtgcggcagc acgtgctatt gttacgtgtt gtactgccgc 2040cagcgtacgt
gtctgtgtgt ctcgatcata tctgtacgtt tttagattta gaagaaaaaa 2100aaaaggcatg
tccgtgtctg tatgtctgga tcatatctgt acgttcttag atttagaaga 2160aagaagaaaa
acattatata cgtacgtcca tgtctct
2197
117 2081 DNA Zea mays
117 atgtgtggga ttctggcggt gctgggcgtc gttgaggtct
ccctcgccaa gcgctcccgc 60atcattgagc tctcgcgcag gttacggcac cgagggcctg
attggagtgg tttgcactgt 120catgaggatt gttaccttgc acaccagcgg ttggctatta
tcgatcctac atctggagac 180cagcctttgt acaatgagga taaaacagtt gttgtaacgg
tgaacggcga aatttacaat 240catgaagaat tgaaagctaa gttgaaaact catgagttcc
aaactggcag tgattgtgaa 300gttatagccc atctttacga agaatatggc gaagaatttg
tggatatgtt ggatggaatg 360ttctcctttg ttcttcttga tacacgtgat aaaagcttca
tcgcagctcg tgatgctatt 420ggcatctgcc ctttatacat gggatggggt cttgatggat
cagtctggtt ttcttcagag 480atgaaggcat tgagtgatga ttgtgaacgc ttcataacat
ttcccccagg gcatctctac 540tccagcaaga caggtggtct aaggagatgg tacaacccac
catggttttc agagactgtc 600ccttcaaccc cttacaatgc tctcttcctc cgggagatgt
ttgagaaggc tgttattaag 660aggctgatga ctgatgtgcc atttggtgtg cttttatctg
gtggactcga ctcttctttg 720gttgcatctg ttgcttcgcg gcacttaaac gaaacaaagg
ttgacaggca gtggggaaat 780aaattgcata ctttctgtat aggcttgaag ggttctcctg
atcttaaagc tgctagagaa 840gttgctgatt acctcagcac tgtacatcat gagttccact
tcacagtgca ggaggggatt 900gatgccttgg aagaagtcat ctaccatatt gagacatatg
atgttacaac aatcagagca 960agtaccccaa tgtttttgat gtcacgcaaa atcaaatctt
tgggtgtgaa gatggttatt 1020tctggcgaag gttcagatga aatttttggt ggttaccttt
attttcacaa ggcaccaaac 1080aagaaagaat tcctagagga aacatgtcgg aagataaaag
cactacatct gtatgactgc 1140ttgagagcta acaaagcaac ttctgcctgg ggtgttgagg
ctcgtgttcc attccttgac 1200aaaagtttca tcagtgtagc aatggacatt gatcctgaat
ggaacatgat aaaacgtgac 1260ctcggtcgaa ttgagaagtg ggtcatgagg aaggcgttcg
acgacgacga gcacccttac 1320ctgcccaagc atattctcta caggcagaaa gaacagttca
gtgacggcgt tggctacaac 1380tggatcgatg gcctcaaatc cttcactgaa cagcaggtga
cggatgagat gatgaacaac 1440gccgcccaga tgttccccta caacacgccc gtcaacaagg
aggcctacta ctaccggatg 1500atattcgaga ggctcttccc tcaggactcg gcgagggaga
cggtgccgtg gggcccgagc 1560atcgcctgca gcacgcccgc ggccatcgag tgggtggagc
agtggaaggc ctccaacgac 1620ccctccggcc gcttcatctc ctcccacgac tccgccgcca
ccgaccacac ggcggtaagc 1680cggcggtggc caacggcggc ggcacggccg gcgaacggca
cggtcaacgg caaggacgtg 1740ccagtgccga tcgcggtctg acgagagtac gtgctcgcgc
acctccctgc tagcttctac 1800cgggctgcag cctgcagcat gcactgtgcg agcacagccg
atcagcgcca ataaactgga 1860ggataagaac gactggtagc tgtgtgtgtg tgtgtcgtgc
gtgcccaccg gccatatccc 1920ggtgcggcag cacgtgctat tgttacgtgt tgtactgcca
ccagcgtacg tgtctgtgtg 1980tctcgatcat atctgtacgt ttttagattt agaagagaaa
aaaaaagtat gcccgtgtct 2040gtatgtctgg atcatatctg tacgttctta gatttagaag a
2081
118 2257 DNA Zea mays
118 ggaattcccc gggatcaagg
agcaccgtct gctgctcgct ctataaaacg aacggaggct 60gcagagcaga gcagagcaga
gcaagaagct ttacagtgaa cgagtgagta tgtgcggcat 120acttgctgtg ctcgggtgcg
ccgacgaggc caagggcagc agcaagaggt cccgggtgct 180ggagctgtcg cggcggctga
agcaccgggg ccccgactgg agcggcctcc ggcaggtggg 240cgactgctac ctctctcacc
agcgcctcgc catcatcgac ccggcctctg gcgaccagcc 300cctctacaac gaggaccagt
cggtggtcgt cgccgtcaac ggcgagatct acaaccacct 360ggacctcagg agccgcctcg
ccggcgcagg ccacagcttc aggaccggca gcgactgcga 420ggtcatcgcg cacctgtacg
aggagcatgg agaagagttc gtggacatgc tggacggcgt 480cttctccttc gtgctgctgg
acactcgcca tggcgaccgc gcgggcagca gcttcttcat 540ggctgctcgc gacgccatcg
gtgtgacgcc cctctacatc ggatggggag tcgatgggtc 600ggtgtggatt tcgtcggaga
tgaaggccct gcacgacgag tgtgagcact tcgagatctt 660ccctccgggg catctctact
ccagcaacac cggcggattc agcaggtggt acaaccctcc 720ttggtacgac gacgacgacg
acgaggaggc cgtcgtcacc ccctccgtcc cctacgaccc 780gctggcgcta aggaaggcgt
tcgagaaggc cgtggtgaag cggctgatga cagacgtccc 840gttcggcgtc ctgctctccg
gcgggctgga ctcgtcgctg gtggcgaccg tcgccgtgcg 900ccacctcgcc cggacagagg
ccgccaggcg ctggggcacc aagctccact ccttctgcgt 960gggcctggag gggtcccctg
acctcaaggc ggccagggag gtggcggagt acctgggcac 1020cctgcaccat gagttccact
tcactgttca ggacggcatc gacgccatcg aggacgtgat 1080ctaccacacg gagacgtacg
acgtcaccac gatcagggcg agcacgccca tgttcctcat 1140gtcgcgcaag atcaagtcgc
tcggggtcaa gatggtcatc tccggcgagg gctccgacga 1200gctcttcgga ggctacctct
acttccacaa ggcgcccaac aaggaggagt tgcaccgaga 1260gacgtgtagg aaggttaagg
ctctgcatca gtacgactgc ctgagagcca acaaggcgac 1320atcagcttgg ggcctggagg
ctcgcgtccc gttcctggac aaggagttca tcaatgcggc 1380catgagcatc gatcctgagt
ggaagatggt ccagcctgat cttggaagga ttgagaagtg 1440ggtgctgagg aaggcattcg
acgacgagga gcagccattc ctgcccaagc atatcctcta 1500cagacagaag gagcagttca
gtgacggcgt tgggtacagc tggatcgatg gcctgaaggc 1560tcatgcaaca tcaaatgtga
ctgacaagat gctgtcaaat gcaaagttca tcttcccaca 1620caacactccg accaccaagg
aggcctacta ctacaggatg gtcttcgaga ggttcttccc 1680acagaaatct gctatcctga
cggtacctgg tgggccaagt gtggcgtgca gcacagccaa 1740ggccatcgag tgggacgcac
aatggtcagg aaatctggac ccctcgggaa gggcggcact 1800gggcgtccat ctcgccgcct
acgaacacca acatgatccc gagcatgtcc cggcggccat 1860tgcagcagga agcggcaaga
agccaaggac gattagggtg gcaccgcctg gcgttgccat 1920cgagggatag acgacgacgc
atatataagc ttcctacttt tgtttcaatg catgcatgct 1980atgtatctgt gtccaccggc
tgtctagcct tatcatcatc actgtctgca acaaattaat 2040aatcaagtgg tatggggtac
ctacgtttaa tgtatacgga gtattgtatt gcttgtgtgt 2100ggtatgctta ggttggccgt
gagtaaggga ttacaagtat tcgatatcgg gtgtttctat 2160aggttgaagt gctcataaag
ggctccctat cctctatggt catgtttgta atagtttttt 2220ttcttaaaga gcttttctat
gaatttggat tcctgtt 2257
119 3700 DNA Zea mays
119 cgagcgctca gcgtctcgtc tcctcctccc cacaaaaagc cgctgaattg ctccgtcggc
60gtcatgtgtg gcatcttagc cgtgctcgga tgctccgact gctcccaggc caggagggct
120cgcatcctcg cctgctccag aaggctgaag cacaggggcc ccgactggtc gggcctctac
180cagcacgagg gcaacttcct ggcgcagcag cggctcgcca tcgtctcccc gctgtccggc
240gaccagccgc tgttcaacga ggaccgcacc gtcgtggtgg tggccaatgg agagatctac
300aaccacaaga acgtccggaa gcagttcacc ggcgcgcaca gcttcagcac cggcagtgac
360tgcgaggtca tcatccccct gtacgagaag tacggcgaga acttcgtgga catgctggac
420ggagtcttcg cgttcgtgct ctacgacacg cgagacagga cctacgtggc ggcacgcgac
480gccatcggcg tcaacccgct ctacatcggc tggggcagcg acggttccgt ctggatgtcg
540tccgagatga aggcgctgaa cgaggactgc gtgcgcttcg agatcttccc gccggggcac
600ctctactcca gcgccgccgg cgggttccgc cggtggtaca ccccgcactg gttccaggag
660caggtgcccc ggacgccgta ccagccgctc gtccttagag aggccttcga gaaggcggtt
720atcaagaggc tcatgaccga cgtcccgttc ggggtcctcc tctccggcgg cctcgactcc
780tccctcgtcg cctccgtcac caagcgccac ctcgtcaaga ccgacgccgc cggaaagttc
840ggcacagagc tccactcctt cgtcgtcggc ctcgagggct cccctgacct gaaggccgca
900cgagaggtcg ctgactacct cggaaccacc catcacgagt tccatttcac cgtacaggac
960ggcatcgacg cgatcgagga ggtgatctac cacgacgaga cgtacgacgt gacgacgatc
1020cgggccagca cgcccatgtt cctgatggct cgcaagatca agtcgctggg cgtgaagatg
1080gtgctgtccg gggagggctc cgacgagctc ctgggcggct acctctactt ccacttcgcc
1140cccaacaggg aggagctcca cagggagacc tgccgcaagg tgaaggccct gcaccagtac
1200gactgcctgc gcgccaacaa ggcgacgtcg gcgtggggcc tggaggtccg cgtgccgttc
1260ctcgacaagg agttcgtcga cgtcgcgatg ggcatggacc ccgaatggaa aatgtacgac
1320aagaacctgg gtcgcatcga gaagtgggtc ctgaggaagg cgttcgacga cgaggagcac
1380ccttacctgc ccgagcatat tctgtacagg cagaaagaac agttcagtga cggagtgggc
1440tacaactgga tcgatggact caaagccttc accgaacagc aggttgatgg tcgtcgtcga
1500agttagctaa ccagcgctga cgttcccccc catgtccagg tgacggatga gatgatgaac
1560agcgccgccc agatgttccc gtacaacacg cccgtcaaca aggaggccta ctactaccgg
1620atgatattcg agaggctctt ccctcaggac tcggcgaggg agacggtgcc gtggggcccg
1680agcatcgcct gcagcacgcc cgcggccatc gagtgggtgg agcagtggaa ggcctccaac
1740gacccctccg gccgcttcat ctcctcccac gactccgccg ccaccgaccg caccggagac
1800aagctggcgg tggtcaacgg cgacgggcac ggcgcggcga acggcacggt caacggcaac
1860gacgtcgctg tcgccatcgc ggtgtaacag taatgaactg gaggataggg acgaacgaac
1920gactggtagg tgtggcgtac ctgccgcgtg cccaccggcc ggccatatat atcgaatccc
1980ggcccggcgc ggcagcacgt gctattgtta cgtgtcacca gcgtacgtgt ctgtgtagtg
2040cctcgatcgt atctgtacgt ctttaggaaa aggtgtgtcc gtgtgtattg tatgtgtgtg
2100agcaagcgtg cgtgacgcgc tctgcctgtg tgacaaagca gagcagtaca agctcaggca
2160ttttctgtcc gagcgatgat ttgaactgga tctatcatct ctgaattgaa ctcggccgga
2220cgacgaccta ccgctaaaat tattcccagc tggatttcgg tacgtgtccc cgttgttcgt
2280tctcgcggct gtgactgtga ccgaacctgc tgctacaagt gcgcgtaaag gatctggttc
2340cacgtgtccg gcacgccggg cacgcaccag tggatgcagt ccgtgtacgt ctgcgggtcg
2400gccctctgcg cgtcggtgag cagctcgccg ccggtctcgg tgtacacgga gacgtgggcg
2460tcgacgcggt gctccgtcag ctgcgtgacg ttgagcagcg tcacgggcac ccgcatccgg
2520cccaccacgt cggacatcac ctccatcatc cgccggtccg cgccgctgcc ccagtacccc
2580ttccgcgtga cgggcaacgt ctcgttgtag caccggatgc cgccctcccg gccccagtcc
2640tcgctcctca tgtgcgtggt ggagatcgac atgaagaaga cccttgtggc gttgggatcg
2700atgttggcgt ccacccagtt ggcccatgtc ttgagtccaa gccggaacgc gacccaggcg
2760tccagctcct cgtacccgtc gtccccgaac gacccccaca ctgatttgat cctgctgccg
2820gtcatccacc acacgtagct gtcgaagacg aggatgtcca cgcccttcca gtgcctagcg
2880tgcagctcga cggcgtcgac gtggagcacg cggccgtcgg cccccagccg gatgttgcgg
2940tcggagttgg cctccaccag gtacggcgcc cagtagaact cgatcgtcgc gttgtactcc
3000gtggcggtga agacggacag ggtggtgctg cgctccatgg accgcgcggt gtagggcacg
3060gcggagttga cgaggcagac catggagagc cactgcccca tctgcagcga gtcgccaacg
3120aacatcatcc gcttcccccg cagcgtctcc agcaacgcca ccgggtcgaa ccttgggaga
3180ctgcagtcgt ctaggtgcca atcccagcgc aggtagtcgc tgtccggcct gccgttcctc
3240tggcaagaga cctgcctgtc gatgaacggg catgtttggt ccgtgtaaag cagctccttg
3300gacctgttgt acgcccagta cccctccgtc acgctgcacc ggctcgggtc gaacgctgcc
3360ttcgccggct gcggcggcgg cgttagcggc atcttcctcg tcgtcgtccc ggcatgtaga
3420gacgtcgtcc tcttgtgctc cttcgctttc cccttcttct ccatgatctc agtgagtgag
3480cggaggtcgt cggtgaagat gacgcccgct agagccagcc cgccgatgat tgccaccacc
3540actgacagcg gggcccgccc cttcatccgc ttcaccgctg ctatctgaat ctgaaccatg
3600aagctcagat gctacgtgga tgctggcatg cagcaatgct agcttgttgc aggctcaagg
3660tgtgagacgg cttatcgatt tatttgcagc tgctctttgt
3700
SEQ ID NO: 120, length 2189, DNA, Zea mays, misc_feature (2173)..(2173), n is a, c, g, or t
120ccgaggcggc gcttttgggg tcggaagcga cacgggcgcc gggcgggtcc gcgggtggtg
60gtgctactgc tagcaagcag cagcaggcga cgctaggcga gagccccagt cggagcaggc
120caccatgtgc ggcatcctcg ctgtcctcgg cgtcgctgag gtctccctcg ccaagcgctc
180ccgcatcatt gagctctcgc gcaggttacg gcaccgaggg cctgattgga gtggtttgca
240ctgtcatgag gattgttacc ttgcacacca gcggttggct attatcgatc ctacatctgg
300agaccagcct ttgtacaatg aggataaaac agttgttgta acggtgaacg gagagatcta
360taaccatgaa gaattgaaag ctaagttgaa aactcatgag ttccaaactg gcagtgattg
420tgaagttata gcccatcttt acgaagaata tggcgaagaa tttgtggata tgttggatgg
480aatgttctcc tttgttcttc ttgatacacg tgataaaagc ttcatcgcag ctcgtgatgc
540tattggcatc tgccctttat acatgggatg gggtcttgat ggatcagtct ggttttcttc
600agagatgaag gcattgagtg atgattgtga acgcttcata acatttcccc cagggcatct
660ctactccagc aagacaggtg gtctaaggag atggtacaac ccaccatggt tttcagagac
720ggtcccttca accccttaca atgctctctt cctccgggag atgtttgaga aggctgttat
780taagaggctg atgactgatg tgccatttgg tgtgctttta tctggtggac tcgactcttc
840tttggttgca tctgttgctt cgcggcactt taacgaaaca aagggtgaca ggcagtgggg
900aaataaattg catactttct gtataggctt gaagggttct cctgatctta aagctgctag
960agaagttgct gattacctca gcactgtaca tcatgagttc cacttcacag tgcaggaggg
1020cattgatgcc ttggaagaag tcatctacca tattgagaca tatgatgtta caacaatcag
1080agcaagtacc ccaatgtttt tgatgtcacg caaaatcaaa tctttgggtg tgaagatggt
1140tatttctggc gaaggttcag atgaaatttt tggtggttac ctttattttc acaaggcacc
1200aaacaagaaa gaattccatg aggaaacatg tcggaagata aaagcactac atctgtatga
1260ctgcttgaga gctaacaaag caacttctgc ctggggtgtt gaggctcgtg ttccattcct
1320tgacaaaagt ttcatcagtg tagcaatgga cattgatcct gattggaaga tgataaaacg
1380tgacctcggt cgaattgaga aatgggttat ccgtaatgca tttgatgatg atgagaggcc
1440ctatttacct aagcacattc tctacaggca aaaggaacag ttcagtgatg gtgttgggta
1500tagttggatc gatggattga aggaccatgc cagccaacat gtctccgatt ccatgatgat
1560gaatgctggc tttgtttacc cagagaacac acccacaaca aaagaagggt actactacag
1620aatgatattc gagaaattct ttcccaagcc tgcagcaagg tcaactgttc ctggaggtcc
1680tagtgtggcc tgcagcactg ccaaagctgt tgaatgggac gcatcctggt ccaagaacct
1740tgatccttct ggccgtgctg ctttgggtgt tcacgatgct gcgtatgaag acactgcagg
1800gaaaactcct gcctctgctg atcctgtctc agacaagggc cttcgtccag ctattggcga
1860aagcctaggg acacccgttg cttcagccac agctgtctaa ccttatgttt atcacccagc
1920aatgcttgaa acagcaaagg ttgtccattg cttgtttcag tttccttccg atcatgtttt
1980tagttccatc aatcaagcaa tggagacatg cttgtgcttc atacttggca gcatcgtgtt
2040tgggttttca ctgggcagta ctgtttaatt tttatggact gaaaagactc agttttgtaa
2100atattcgtca ctgtgaccaa ttcctgtggt ggtttatgtg atttgcagat tgcagtggtt
2160agtgtatctt ccncaatttt cactccttt
2189
SEQ ID NO: 121, length 2083, DNA, Brassica napus
121gaattctccg ggtcgacgat ttcgtacgaa
atcgtcattg ccgccaccat ccatcaacca 60tgtgtgggat tctcgctgtt ctaggctgcg
tcgataactc tcaagccaca cgttctcgta 120tcatcaaact ctctcgcaga ttgaggcata
gaggtcctga ttggagcggg cttcattgtt 180atgaggattg ttacttggct catgagcgtt
tggccatcat tgaccccatt tctggagacc 240agcctctcta cagcgaagat aagaccgtcg
ttgtcacggt gaatggagag atatacaatc 300acaaggcatt gcgtgaaagt gaaagtctga
agtctcacaa gtaccatacc gggagtgatt 360gtgaagtgct tgcccatctt tatgaagaac
atggagagga atttatcaac atgttggacg 420gcatgtttgc atttgtcctt cttgatacta
aggacaaaag ttatattgct gtaagggatg 480ccattggtgt catcccactc tacattggct
ggggtctcga tggttctgtc tggtttgctt 540ctgagatgaa agcacttagt gatgattgtg
aacagtttat ggctttccca ccaggccaca 600tctattccag taaacaaggt ggtcttagga
ggtggtacaa ccctccatgg ttctctgagc 660tcgttccttc aaccccttat gatcccttag
tattgcgaga tactttcgag aaggctgtaa 720taaagagact aatgaccgat gtgccttttg
gtgtcctact ctctggagga ctagactcat 780ctcttgttgc ttcagtggct atacgccatt
tggaaaagtc agatgctcgt cagtggggtt 840ccaagctgca caccttttgc attggtttaa
agggatctcc ggatcttaaa gctggtaaag 900aagttgctga ctatctagga actcgccacc
acgagctcca ctttacagtt caggaaggga 960tagacgccat agaagaagtt atataccatg
ttgagaccta tgacgtgact accataagag 1020caagcactcc catgtttctc atgtcgagaa
aaatcaaatc gcttggtgtg aagatggttc 1080tctctggtga aggctctgat gagatctttg
gagggtattt gtacttccac aaagcaccta 1140acaagaagga gttacacgag gaaacatgcc
gaaagatcaa agcactttat caatatgatt 1200gcttgagggc taacaaatca acttctgcgt
ggggtgttga ggctcgtgtg cctttccttg 1260ataaagcgtt tctagatgta gcaatgggca
ttgatccaga gtggaagatg atcaggcctg 1320acttgggaag gattgagaaa tgggtgttac
gcaatgcctt tgatgatgag aagaatcctt 1380atctaccaaa gcacattctg tacaggcaga
aggaacagtt cagtgatgga gttggataca 1440gctggattga cggtctgaaa gatcatgcaa
acaaacatgt ctctgacgca atgctgacga 1500acgcaaactt tgtcttcccg gagaacacac
ctttgacaaa ggaggcttac tactacagag 1560ccatctttga aaagttcttc cctaagagcg
ctgctagagc aactgtacca ggaggtccaa 1620gtgtagcatg tagtactgca aaagctgtgg
agtgggacgc agcttggaaa gggaaccttg 1680acccgtcggg tcgtgcggct cttggagttc
atgttgcagc ttatgaagga gataaagctg 1740aagatcctcg tcctgagaag gtacagaagc
tggcagagaa aactgcagaa gccattgttt 1800gaggatgaaa cgaatgtttg agtcgtgcgt
ttcttttatt ttctcataag acaatacgtt 1860attatcatct tccgtaggat caataagtac
aataagttgt ctctctttaa ctgaattgag 1920gtgggagtgt ctgaggttgt acctaagttg
ttggtgattt tctggttctt tcatttgtca 1980caaagttttc agcgtttctt ttatgtatga
tgtatcgttc acccctgtta atctagattt 2040ggttcagttc aaaaaaaaaa aaaaaaaaag
cggacgctct aga 2083
SEQ ID NO: 122, length 2288, DNA, Triticum aestivum
122ggcctggccc gctacgaacc ccaaacgcgc atctctccta gccccctccc tgctgctcta
60ccaccaccgt gccgccgtag aacgccgtac ctgacccccc accaccacct gcgcctgcgt
120cgccgccggc gccgtcgccg tcgcccgtcc gtactagtcg gggcatcgcc ggtgattagt
180caaatcacct tcggagctcg cgaccaccca aatcacccgc ggagtctcgc caacgagcag
240ggaccgcccg ccggccgcca ccatgtgcgg catcctcgcc gtcctcggcg tcggcgacgt
300ctccctcgcc aagcgctccc gcatcatcga gctctcccgc cgattacggc acagaggccc
360tgattggagt ggtatacaca gctttgagga ttgctatctt gcacaccagc ggttggctat
420tgttgatcct acatctggag accagccatt gtacaacgag gacaaaacag ttgttgtgac
480ggtgaatgga gagatctata atcatgaaga actgaaagct aagctgaaat ctcatcaatt
540ccaaactggt agtgattgtg aagttattgc tcacctatat gaggaatacg gggaggaatt
600tgtggatatg ctggatggca tgttctcgtt tgtgcttctt gacacacgtg ataaaagctt
660cattgctgcc cgtgatgcta ttggcatctg tcctttgtac atgggctggg gtcttgatgg
720gtcagtttgg ttttcttcag agatgaaggc attgagtgat gattgcgagc gcttcatatc
780gttccctcct ggacacttgt actcaagcaa aacaggtggc ctaaggaggt ggtacaaccc
840cccatggttt tcagaaagca ttccctcagc cccctatgat cctctcctca tccgagagag
900tattgagaag gctgctatta agaggctaat gactgatgtg acatttggcg ttctcttgtc
960tggtgggctt gactcttctt tggtggcttc tgttgtttca cgctacttgg cagaaacaaa
1020agttgctagg cagtggcgaa acaaactgca caccttttgc atcggcatga agggttctcc
1080tgatcttaaa gctgctaagg aagttgctga ctaccttggc acagtccatc atgaattaca
1140cttcacagtg caggagggca ttgatgcttt ggaagaagtt atatatcaca tcgagacgta
1200tgatgtcacg accattagag caagtacccc aatgtttcta atgtctcgga aaatcaaatc
1260gttgggtgtg aagatggttc tttcgggaga aggctccgat gaaatatttg gtggttatct
1320ttattttcac aaggcaccaa acaaaaagga actacatgag gaaacatgta ggaagataaa
1380agctctccat ttatatgatt gtttgagagc gaacaaagca acttctgcct ggggtctcga
1440ggctcgtgtt ccattcctcg acaaaaactt catcaatgta gcaatggacc tggatccgga
1500atgtaagatg ataagacgtg atcttggccg gatcgagaaa tgggttctgc gtaatgcatt
1560tgatgatgag gagaagccct atttacccaa gcacattctt tacaggcaaa aagaacaatt
1620cagcgatggg gttgggtaca gttggattga tggattgaag gaccatgcta aagcacatgt
1680gtcggattcc atgatgacga acgccagctt tgtttaccct gaaaacacac ccacaacaaa
1740agaggcctac tattacagga ccgtattcga gaagttctat cccaagaatg ctgctaggct
1800aacggtgcca ggaggtccca gcatcgcatg cagcaccgct aaagctgtcg aatgggacgc
1860cgcctggtcc aagctcctcg acccgtctgg ccgcgccgct cttggcgtgc acgatgcggc
1920gtacaaagaa aaggctcctg catcggtcga tcctgccgtg gataacgtct cacgttcacc
1980tgcacatgac gtcaaaagac tcaaaaccgc catttcagca gctgctgtat aaccttccat
2040tccatggttc caaaaatgcc gtcgcttagt tttaatccta gcaatcctgt ctgtagttca
2100ttcagtcatg cagtgcagaa atcgctttgc tctacttttt cgttcatgtt gtgctttcgc
2160atgtatgtac caagttagtt tgtttatgca gcgagcgttt gcgtcgtaaa taaatatttc
2220accgtggttg atatccttgt gttgctcagt gtttggtttg caagctgcaa attgcactaa
2280taaattcc
2288