Patent application title: CLONING GENES FROM STREPTOMYCES CYANEOGRISEUS SUBSP. NONCYANOGENUS FOR BIOSYNTHESIS OF ANTIBIOTICS AND METHODS OF USE

Inventors:  Chengjin Huang (Fort Dodge, IA, US)  Deborah T. Chaleff (Pennington, NJ, US)  Mark E. Ruppen (Garnerville, NY, US)  Jerome Stephens (Mentone, AL, US)
Assignees:  Wyeth
IPC8 Class: AC07K14195FI
USPC Class: 530350
Class name: Chemistry: natural resins or derivatives; peptides or proteins; lignins or reaction products thereof proteins, i.e., more than 100 amino acid residues
Publication date: 2009-07-09
Patent application number: 20090176969






Abstract:

The present invention relates to the complete biosynthetic pathway for the formation of the LL-F28249 compounds and, most importantly, the major component LL-F28249α. The purified and isolated nucleic acid molecule encoding the proteins of the biosynthetic pathway, which is isolated from a wild-type or mutant Streptomyces, is fully described in FIG. 6 to FIG. 6-39 and SEQ ID NO:1. The DNA gene cluster and its expression in a suitable host enable the efficient production of the highly active natural metabolites and semisynthetic derivatives. The invention further concerns plasmids, vectors and host cells that contain and express the novel nucleic acid molecule. Of particular interest, the entire biosynthetic pathway fits compactly in three plasmids, Cos11, Cos36 and Cos40. The invention also concerns the purified and isolated biosynthesis proteins that are encoded by the whole DNA gene cluster. Additionally, the invention involves a new efficient, biochemical method of preparing moxidectin.

Claims:

1. A purified and isolated nucleic acid molecule encoding at least one protein of the biosynthetic pathway for producing an LL-F28249 compound, wherein said nucleic acid molecule is isolated from an antibiotic-producing wild-type or mutant Streptomyces.

2. The nucleic acid molecule according to claim 1, wherein the nucleic acid molecule is isolated from an antibiotic-producing wild-type or mutant Streptomyces cyaneogriseus subsp. noncyanogenus.

3. The nucleic acid molecule according to claim 1, wherein the LL-F28249 compound is LL-F28249α.

4. The nucleic acid molecule according to claim 1, wherein the molecule has the nucleotide sequence set forth in SEQ ID NO:1 or its complementary strand.

5. A nucleic acid sequence which hybridizes to the sequence of the nucleic acid molecule of claim 4 and encodes a protein of the biosynthetic pathway for producing an LL-F28249 compound.

6. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 7697-10465 of SEQ ID NO:1.

7. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 10791-11570 of SEQ ID NO:1.

8. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 11659-12462 of SEQ ID NO:1.

9. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 12850-19875 of SEQ ID NO:1.

10. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 19865-31036 of SEQ ID NO:1.

11. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 31115-49246 of SEQ ID NO:1.

12. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 50449-51303 of SEQ ID NO:1.

13. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 51300-52706 of SEQ ID NO:1.

14. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 52809-69833 of SEQ ID NO:1.

15. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 69929-85429 of SEQ ID NO:1.

16. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 85574-86338 of SEQ ID NO:1.

17. A biologically functional plasmid or vector containing the nucleic acid molecule according to claim 1.

18. The plasmid or vector according to claim 17, wherein the plasmid or vector comprises Cos11 having ATCC Designation Number PTA-4392, Cos36 having ATCC Designation Number PTA-4393 or Cos40 having ATCC Designation Number PTA-4394.

19. A suitable host cell stably transformed or transfected by the plasmid or vector according to claim 17.

20. The host cell according to claim 19, wherein the host is Escherichia, Actinomycetales, Bacillus, Corynebacteria or Thermoactinomyces.

21. The host cell according to claim 20, wherein the host is Escherichia coli, Streptomyces lividans, Streptomyces coelicolor, Streptomyces griseofuscus or Streptomyces ambofaciens.

22. A biosynthesis protein encoded by the nucleic acid molecule according to claim 1.

23. The protein according to claim 22, wherein the amino acid sequence is set forth in any one of SEQ ID NO:2 to SEQ ID NO:12, or a biologically active variant thereof.

24. A process for the production of a protein involved in the biosynthesis of an LL-F28249 compound, said process comprising: growing, under suitable nutrient conditions, a prokaryotic or eukaryotic host cell transformed or transfected with a nucleic acid molecule according to claim 1 in a manner allowing expression of the protein product, and isolating the desired protein product of the expression of the nucleic acid molecule.

25. A protein product of the expression of the nucleic acid molecule in a prokaryotic or eukaryotic host cell according to claim 24.

26. A plasmid or a combination of two or three plasmids for cloning the nucleic acid molecule which encodes the proteins of the biosynthetic pathway of an LL-F28249 compound, wherein said plasmid or combination contains the nucleic acid molecule that spans the entire biosynthetic gene cluster and encodes type I polyketide synthase that is responsible for producing the LL-F28249 compound.

27. The combination according to claim 26, which comprises Cos11 having ATCC Designation Number PTA-4392, Cos36 having ATCC Designation Number PTA-4393 and Cos40 having ATCC Designation Number PTA-4394.

28-35. (canceled)

Description:

CROSS-REFERENCE TO RELATED U.S. APPLICATIONS

[0001]This nonprovisional application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 60/471,256, filed on May 16, 2003. The prior application is incorporated herein by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002]Not Applicable

REFERENCE TO A "SEQUENCE LISTING"

[0003]The material on a single compact disc containing a Sequence Listing file provided in this application is incorporated by reference. The date of creation is ______ and the size is approximately ______.

BACKGROUND OF THE INVENTION

[0004]1. Field of the Invention

[0005]The present invention concerns the novel biosynthetic genes for encoding the proteins responsible for producing the LL-F28249 compounds and the use thereof to make the active metabolites from the fermentation of Streptomyces cyaneogriseus subsp. noncyanogenus. The invention further concerns the genetic manipulation of the biosynthetic pathway to make active semisynthetic derivatives of the natural metabolites.

[0006]2. Description of the Related Art

[0007]All patents and publications cited in this specification are hereby incorporated by reference in their entirety.

[0008]Streptomyces are producers of a wide variety of commercially important secondary metabolites, including the majority of active antibiotics known as the β-lactams and the macrocyclic lactone compounds or macrolides. Because of the commercial importance of the secondary metabolites produced by Streptomyces, there has been considerable recent investment in the development of methods for molecular genetic manipulation of Streptomyces. Procedures have been developed for the introduction of genetic material by polyethylene glycol mediated transformation and by conjugal transfer from Escherichia coli. Vectors have been developed including high and low copy number vectors, integrative vectors, and E. coli-Streptomyces shuttle vectors. These methods for molecular genetic manipulation of Streptomyces have been summarized in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985). In many cases, the genes for the production of secondary metabolites are clustered in Streptomyces. Thus, the identification of a single gene in a biosynthetic gene cluster may lead to the identification of all of the genes responsible for the biosynthesis of the metabolite. This observation has proven to be tremendously valuable, and secondary metabolite biosynthetic gene clusters have been cloned by reverse genetics, complementation of blocked mutants, resistance and use of heterologous probes. Using these methods, nucleotide and predicted amino acid sequence data have been obtained for many macrolide biosynthetic gene clusters including those directing the synthesis of erythromycin (see S. Donadio et al., Science 252:675-679 (1991) and S. F. Haydock et al., Molecular and General Genetics 230:120-128 (1991)); rapamycin (see T. Schwecke et al., Proceedings of the National Academy of Sciences USA 92:7839-7843 (1995) and X. Ruan et al., Gene 203:1-9 (1997)); FK506 (H. Motamedi and A. Shafiee, European Journal of Biochemistry 256:528-534 (1998)); oleandomycin (D. G. Swan et al., Molecular and General Genetics 242:358-362 (1994)) and rifamycin (see P. R. August et al., Chemistry & Biology 5:69-79 (1998)). However, the complete biosynthetic gene cluster for the macrocyclic lactone compounds known as the LL-F28249 compounds has not yet been described in the art.

[0009]There are many reports that molecular genetic manipulations can be used to alter the course of polyketide biosynthesis (see S. Donadio et al., Science 252:675-679 (1991) and S. Donadio et al., Proceedings of the National Academy of Sciences USA 90:7119-7123 (1993)). In those studies, erythromycin-related lactones were produced following manipulation of the 6-deoxyerythronolide B synthase ("DEBS") gene cluster (the core polyketide synthase gene cluster responsible for erythromycin biosynthesis) such that either the module 4 enoylreductase or the module 5 ketoreductase domains were nonfunctional. Strains containing these variant DEBS gene clusters produced the expected erythromycin-related lactones. These pioneering studies have since been repeated and expanded upon, and the results of many such studies have been reviewed in the literature (see, for example, L. Katz and S. Donadio, Annual Review of Microbiology 47:875-912 (1993); C. R. Hutchinson and I. Fujii, Annual Review of Microbiology 49:201-238 (1995); D. A. Hopwood, Chemical Reviews 97:2465-2497 (1997); and C. W. Carreras and D. V. Santi, Current Opinion in Biotechnology 9:403-411 (1998)).

[0010]Data summarized in the literature suggest that the organization of catalytic domains in type I polyketide synthase ("PKS") modules is conserved, and many highly conserved amino acid sequence motifs have also been described in those biosynthetic gene clusters. For example, the organization of the biosynthetic gene cluster of avermectin, which is produced by S. avermitilis, has been reported (see D. J. MacNeil et al., Gene 115:119-125 (1992) and D. J. MacNeil et al., Annals of the New York Academy of Sciences 721:123-132 (1994)); and partial nucleotide sequences of that biosynthetic gene cluster have been reported or are otherwise available. MacNeil and colleagues have also predicted the modular organization and reported a limited restriction endonuclease map of the wild-type S. cyaneogriseus (NRRL 15773) nemadectin biosynthetic gene cluster (see D. J. MacNeil et al., Annals of the New York Academy of Sciences 721:123-132 (1994)), but their restriction map was incomplete. Their analysis only indicated the presence of nine modular repeats of PKS function and required six overlapping clones to define the 75 kb region of the S. cyaneogriseus genome. MacNeil et al. did not complete the DNA sequencing of the whole biosynthetic gene cluster. Instead, the authors sequenced only the ends of selected cosmids. From the limited sequence information, they could only generate a very sketchy restriction endonuclease map. Further C-13 labeling studies have been conducted, and a mechanism for synthesis of the LL-F28249α compound from its constituent acyl units has been proposed (H. R. Tsou et al., Journal of Antibiotics (Tokyo) 42:398-406 (1989)).

[0011]The highly active LL-F28249 compounds, which are natural endectocidal agents widely used for treatment of nematode and arthropod parasites, including the control or prevention of helminthic, arthropod ectoparasitic and acarid infections, are isolated from the fermentation broth of Streptomyces cyaneogriseus subsp. noncyanogenus (hereinafter referred to as "S. cyaneogriseus"). The series of anti-parasitic LL-F28249 compounds produced from S. cyaneogriseus are structurally similar to, but patentably distinct from, the well-characterized avermectins. U.S. Pat. No. 5,106,994 and its continuation U.S. Pat. No. 5,169,956 describe the preparation of the major and minor components, LL-F28249α-λ. The LL-F28249 family of compounds further includes, but is not limited to, the semisynthetic 23-oxo derivatives and 23-imino derivatives of LL-F28249α-λ, which are shown in U.S. Pat. No. 4,916,154. Moxidectin, chemically known as 23-(O-methyloxime)-LL-F28249α, is a particularly potent 23-imino derivative. Other examples of LL-F28249 derivatives include, but are not limited to, 23-(O-methyloxime)-5-(phenoxyacetoxy)-LL-F28249α, 23-(semicarbazone)-LL-F28249α and 23-(thiosemicarbazone)-LL-F28249α.

[0012]One of the major nemadectin metabolites, LL-F28249α (hereinafter referred to as "Fα"), is converted to the commercially important compound moxidectin using a four-step chemical process. The determination of the biosynthetic gene cluster of Fα, heretofore unknown, would be of great commercial significance. Not only would isolation of the gene be highly desirable to make the active Fα compound and other natural members of the LL-F28249 family of compounds, but also to prepare the commercially potent semisynthetic derivatives such as moxidectin more quickly and efficiently.

[0013]It is therefore an important object of the present invention to isolate and characterize the entire nucleotide sequence encoding the proteins responsible for producing the LL-F28249 compounds, preferably the LL-F28249α metabolite, and then to isolate and determine the function of the amino acid sequences comprising the biosynthesis proteins.

[0014]Another object is to provide a new process for isolating natural and semisynthetic derivatives directly from the fermentation broth of bioengineered strains of Streptomyces cyaneogriseus subsp. noncyanogenus.

[0015]A further object is to provide a new method for the preparation of moxidectin in an efficient process with fewer steps than heretofore achievable.

[0016]Further purposes and objects of the present invention will appear as the specification proceeds.

[0017]The foregoing objects are accomplished by providing a new, purified and isolated nucleic acid molecule that encodes the proteins connected with the entire biosynthetic pathway for producing the LL-F28249 compounds.

BRIEF SUMMARY OF THE INVENTION

[0018]The present invention concerns the unique cloning and characterization of the complete biosynthetic pathway for the formation of the LL-F28249 compounds and, most importantly, the highly active, major component LL-F28249α. The full DNA gene cluster and its expression in a suitable host enable the efficient production of the highly active natural metabolites and semisynthetic derivatives. Remarkably, the whole biosynthetic pathway is efficiently contained in only three plasmids identified as Cosmid Numbers 11, 36 and 40 (hereinafter referred to as "Cos11," "Cos36" and "Cos40," respectively).

BRIEF DESCRIPTION OF THE DRAWINGS

[0019]The background of the invention and its departure from the art will be further described hereinbelow with reference to the accompanying drawings, wherein:

[0020]FIG. 1 illustrates the construction of the biosynthetic gene cluster for making the LL-F28249 compounds via the gene segments contained within cosmids made according to the present invention. S. cyaneogriseus cosmid libraries are constructed by ligating Sau3A fragments of S. cyaneogriseus genomic DNA into the BamHI site of cosmid vector pSuperCos 1. The resultant cosmid libraries are transformed into E. coli VCS257. Various cosmids are identified by a hybridization technique using the avermectin ketoacyl synthase probe or by a "walking" technique as described herein. The cosmids are characterized by restriction endonuclease mapping and DNA sequencing. The BamHI restriction map of the Fα gene cluster is obtained from analyzing overlapping cosmids and confirmed by DNA sequencing. B denotes a BamHI site.

[0021]FIG. 2 illustrates the biosynthesis proteins and their positions encoded by the cloned biosynthetic gene cluster for making the LL-F28249 compounds. A contiguous nucleotide sequence of approximately 88 Kbp containing the entire Fα polyketide synthase gene cluster is obtained by sequencing overlapping cosmids and the subclones thereof. The 13 modules and respective domains are identified using BLAST alignment analysis. Other biosynthetic genes are identified in the same way. The following abbreviations are used in the figure: ACP, acyl carrier protein; DH, dehydratase; ER, enoylreductase; KR, ketoreductase; KS, ketoacyl synthase; LD, loading domain; TE, thioesterase; MT, methyl transferase; AT, acyl transferase.

[0022]FIG. 3 shows the structure of the components of the vector designated pKR0.9, which is the 900 bp BstEII-AatII fragment of pNE57 (and contains the desired region of the Fα module 3 ketoreductase domain), in the BstEII-AatII sites of pSL301 (Invitrogen, Carlsbad, Calif.). The following abbreviations are used in the figure: mod3 KR, Fα module 3 ketoreductase domain; amp, the ampicillin resistance marker.

[0023]FIG. 4 shows the structure of the plasmid components of the pFDmod3/5.2 series. These plasmids are constructed to combine the site-directed mutations of the Fα module 3 ketoreductase domain with flanking DNA to facilitate homologous integration. The backbone vector is the E. coli-Streptomyces shuttle vector pKC1132. The following abbreviations are used in the figure: mod3 KS, module 3 ketoacyl synthase domain; mod3 AT, module 3 acyl transferase; mod3 DH, module 3 dehydratase; mod3 ER, module 3 enoylreductase; mod3 KR, module 3 ketoreductase domain; apra, apramycin resistance marker.

[0024]FIG. 5 shows the structure of the plasmid components of the pFDmod3/4.2 series. These plasmids are derived from the pFDmod3/5.2 series by removing approximately 1 Kbp of flanking DNA to minimize aberrant integration. The following abbreviations are used in the figure: mod3 AT, module 3 acyl transferase; mod3 DH, module 3 dehydratase; mod3 ER, module 3 enoylreductase; mod3 KR, module 3 ketoreductase domain; apra, apramycin resistance marker.

[0025]FIG. 6 to FIG. 6-39 show the full-length nucleotide sequence (88400 bp) of the biosynthetic genes for making the LL-F28249 compounds (which corresponds to SEQ ID NO:1).

[0026]FIG. 7 represents the putative amino acid sequence (922 aa) of the regulatory protein encoded by the ORF1 gene (which corresponds to SEQ ID NO:2).

[0027]FIG. 8 represents the putative amino acid sequence (259 aa) of the thioesterase protein encoded by the ORF2 gene (which corresponds to SEQ ID NO:3).

[0028]FIG. 9 represents the putative amino acid sequence (267 aa) of the reductase protein encoded by the ORF3 gene (which corresponds to SEQ ID NO:4).

[0029]FIG. 10 to FIG. 10-1 represent the putative amino acid sequence (2341 aa) of the loading domain protein for Mod1 encoded by the ORF4 gene (which corresponds to SEQ ID NO:5).

[0030]FIG. 11 to FIG. 11-2 represent the putative amino acid sequence (3723 aa) of the loading domain protein for Mod2-Mod3 encoded by the ORF5 gene (which corresponds to SEQ ID NO:6).

[0031]FIG. 12 to FIG. 12-3 represent the putative amino acid sequence (6043 aa) of the loading domain protein for Mod4-Mod7 encoded by the ORF6 gene (which corresponds to SEQ ID NO:7).

[0032]FIG. 13 represents the putative amino acid sequence (284 aa) of the methyltransferase protein encoded by the ORF7 gene (which corresponds to SEQ ID NO:8).

[0033]FIG. 14 represents the putative amino acid sequence (468 aa) of the p450 protein encoded by the ORF8 gene (which corresponds to SEQ ID NO:9).

[0034]FIG. 15 to FIG. 15-3 represent the putative amino acid sequence (5674 aa) of the loading domain protein for Mod8-Mod10 encoded by the ORF9 gene (which corresponds to SEQ ID NO:10).

[0035]FIG. 16 to FIG. 16-3 represent the putative amino acid sequence (5166 aa) of the loading domain protein for Mod11-Mod13 encoded by the ORF10 gene (which corresponds to SEQ ID NO:11).

[0036]FIG. 17 represents the putative amino acid sequence (254 aa) of the oxidoreductase protein encoded by the ORF11 gene (which corresponds to SEQ ID NO:12).

DETAILED DESCRIPTION OF THE INVENTION

[0037]In accordance with the present invention, there is provided a novel, purified and isolated nucleic acid molecule encoding the proteins of the entire biosynthetic pathway for producing the LL-F28249 compounds. The nucleic acid molecule of this invention is isolated from an antibiotic-producing wild-type or mutant Streptomyces. Surprisingly, the complete DNA for encoding all of the essential biosynthetic proteins is efficiently packaged in only three cosmids. These three cosmids, Cos11, Cos36 and Cos40, which have been constructed to contain the nucleic acid molecule according to the invention, are sufficient to regenerate the entire biosynthetic pathway for producing the LL-F28249 compounds. Thus, the present invention uniquely provides the entire biosynthetic gene cluster in three cosmids, as a preferred embodiment, which enables a substantially more efficient means for making the active anti-parasitic LL-F28249 compounds, particularly moxidectin, in fewer steps than previously contemplated. The success of this invention has overcome the prior failed attempts by others to isolate the full biosynthetic gene and satisfies a long-standing need.

[0038]The nucleotide sequence of this complete DNA gene cluster is fully described in FIG. 6 to FIG. 6-39 (which corresponds to SEQ ID NO:1). The scope of the invention also embraces its complementary strand, that is, those nucleotides that are the complement nucleotides (for example, A substituted for T, C substituted for G and vice versa) and/or reverse nucleotide sequences (i.e., a descending order instead of the forward or ascending strand, for example, changing the direction from reading 5' to the 3' end to reading 3' to the 5' end).
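The relationship between a strand, its base-by-base complement and its reverse complement can be illustrated with a minimal Python sketch. The function names and the six-base example fragment are hypothetical and are not taken from SEQ ID NO:1.

```python
# Watson-Crick pairing table; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def complement(seq: str) -> str:
    """Return the base-by-base complement, kept in the same left-to-right order."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written in its own 5'-to-3' direction."""
    return complement(seq)[::-1]


example = "ATGGCC"  # hypothetical fragment, for illustration only
print(complement(example))          # TACCGG
print(reverse_complement(example))  # GGCCAT
```

Applying reverse_complement twice returns the original sequence, which is a convenient sanity check when handling either strand of a long sequence.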

[0039]The present invention further includes the nucleic acid sequence that hybridizes to the sequence of the nucleic acid molecule of SEQ ID NO:1 isolated from the microbial source or its complementary strand and encodes a protein of the biosynthetic pathway for producing the LL-F28249 compounds. Typical hybridization procedures and conditions, which are well known to those of ordinary skill in the art, are illustrated in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). While standard or stringent conditions are employed for homologous probes, less stringent hybridization conditions may be used for partially homologous probes that have less than 100% homology with the target nucleic acid sequence. In the latter case of partially homologous probes, a series of Southern and Northern hybridizations may be readily carried out at different stringencies. For instance, when hybridization is carried out in formamide-containing solvents, preferred conditions employ a constant temperature of about 42° C. and a constant ionic strength, with a solution containing 6×SSC and 50% formamide. Less stringent hybridization conditions may use the same temperature and ionic strength but progressively lower amounts of formamide in the annealing buffer, in a range of about 45% to 0%. Alternatively, hybridization may be carried out in aqueous solutions containing no formamide. Usually for aqueous hybridization, the ionic strength of the solution is kept the same, often at about 1 M Na+, while the temperature of annealing may be lowered from about 68° C. to 42° C.
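The effect of formamide on stringency at a fixed temperature can be illustrated with a rough calculation. The sketch below uses one widely quoted empirical approximation for the melting temperature of long DNA:DNA hybrids (often attributed to Meinkoth and Wahl); the coefficients differ slightly between sources, and the formula is not part of this specification. The Na+ concentration, GC content and probe length are assumed placeholder values.

```python
import math


def approx_tm_celsius(na_molar, gc_percent, formamide_percent, length_bp):
    """Empirical Tm estimate for long DNA:DNA hybrids (an approximation only)."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


# Assumed illustrative inputs, not values taken from the specification:
NA_MOLAR = 1.0      # about 1 M Na+, as quoted above for aqueous hybridization
GC_PERCENT = 70.0   # placeholder for GC-rich Streptomyces DNA
PROBE_BP = 500      # hypothetical probe length in base pairs
ANNEAL_C = 42.0     # annealing temperature held constant

for formamide in (50, 45, 25, 0):
    tm = approx_tm_celsius(NA_MOLAR, GC_PERCENT, formamide, PROBE_BP)
    print(f"{formamide:>2}% formamide: Tm ~ {tm:.1f} C, "
          f"Tm minus annealing temperature ~ {tm - ANNEAL_C:.1f} C")
```

Under this approximation, each percentage point of formamide removed raises the estimated Tm by about 0.6° C., so annealing at the same 42° C. lies further below Tm and tolerates more mismatch between probe and target.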

[0040]In general, the isolation and characterization of the genomic DNA and the cloned, recombinant DNA from suitable host cells may be done via standard or stringent hybridization techniques, utilizing all or a portion of a nucleotide sequence as a probe to screen an appropriate library. As an alternative approach, oligonucleotide primers, which are constructed on the basis of other related, known DNA and protein sequences, can be used in polymerase chain reactions to amplify and identify other identical or related sequences. The nucleotides and proteins described herein are isolated and purified by routine methods to varying degrees. Preferably, the proteins are obtained in substantially pure form but a lower range of about 80% to about 90% pure is acceptable. It is contemplated that the scope of the invention also includes the DNA and proteins that are made by chemical synthesis, which have the same or substantially the same structures as those derived directly from the antibiotic-producing wild-type or mutant Streptomyces and are confirmed by routine testing or standard assays to be involved in the biosynthetic pathway of the LL-F28249 compounds.

[0041]Additionally, the invention encompasses and fully describes the isolated biosynthesis proteins comprising the amino acid sequences that include, but are not limited to, the regulatory protein encoded by the ORF1 gene (which corresponds to SEQ ID NO:2), the thioesterase protein encoded by the ORF2 gene (which corresponds to SEQ ID NO:3), the reductase protein encoded by the ORF3 gene (which corresponds to SEQ ID NO:4), the loading domain protein for Mod1 encoded by the ORF4 gene (which corresponds to SEQ ID NO:5), the loading domain protein for Mod2-Mod3 encoded by the ORF5 gene (which corresponds to SEQ ID NO:6), the loading domain protein for Mod4-Mod7 encoded by the ORF6 gene (which corresponds to SEQ ID NO:7), the methyltransferase protein encoded by the ORF7 gene (which corresponds to SEQ ID NO:8), the p450 protein encoded by the ORF8 gene (which corresponds to SEQ ID NO:9), the loading domain protein for Mod8-Mod10 encoded by the ORF9 gene (which corresponds to SEQ ID NO:10), the loading domain protein for Mod11-Mod13 encoded by the ORF10 gene (which corresponds to SEQ ID NO:11) and the oxidoreductase protein encoded by the ORF11 gene (which corresponds to SEQ ID NO:12).
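As a cross-check, the nucleotide ranges recited in claims 6 to 16 can be compared with the protein lengths given in the descriptions of FIG. 7 to FIG. 17. The following Python sketch assumes that the eleven ranges correspond to ORF1 through ORF11 in the order listed, and that each range includes its stop codon; neither assumption is stated explicitly in the text.

```python
# (ORF label, start, end, reported protein length in amino acids)
# Start and end are the ranges recited in claims 6-16; pairing them with
# ORF1-ORF11 in listed order is an assumption made for this check.
ORFS = [
    ("ORF1", 7697, 10465, 922),
    ("ORF2", 10791, 11570, 259),
    ("ORF3", 11659, 12462, 267),
    ("ORF4", 12850, 19875, 2341),
    ("ORF5", 19865, 31036, 3723),
    ("ORF6", 31115, 49246, 6043),
    ("ORF7", 50449, 51303, 284),
    ("ORF8", 51300, 52706, 468),
    ("ORF9", 52809, 69833, 5674),
    ("ORF10", 69929, 85429, 5166),
    ("ORF11", 85574, 86338, 254),
]


def predicted_residues(start, end):
    """Residues encoded by an inclusive range, assuming it ends in a stop codon."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return length // 3 - 1  # subtract the stop codon


for name, start, end, reported in ORFS:
    predicted = predicted_residues(start, end)
    status = "consistent" if predicted == reported else "MISMATCH"
    print(f"{name}: {end - start + 1} nt -> {predicted} aa "
          f"(reported {reported}) {status}")
```

Under these assumptions, all eleven ranges are whole numbers of codons and agree with the reported protein lengths once the stop codon is subtracted. The check says nothing about strand orientation, which the claims do not specify.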

[0042]The open reading frames of the genomic DNA cluster, which encode the biosynthesis proteins, may be identified using a variety of art-recognized techniques. The techniques include, but are not limited to, computer analysis to locate known start and stop codons, putative reading frame locations based on codon frequencies, similarity alignments to expressed genes in other known Streptomyces strains and the like. In this fashion, the proteins of the invention are identified using the nucleotide sequence of the present invention and the open reading frames or the encoded proteins may then be isolated and purified or, alternatively, synthesized by chemical means. Expressible genetic constructs based on the open reading frames and appropriate promoters, initiators, terminators and the like may be designed and introduced into a suitable host cell to express the protein encoded by the open reading frame.
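The start-and-stop-codon analysis mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the method actually used to annotate SEQ ID NO:1: it scans only the forward strand, treats ATG, GTG and TTG as candidate start codons (an assumption typical for bacterial genes), and ignores codon-frequency information. The function name, the min_codons parameter and the toy sequence are hypothetical.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Yield (start, end, frame) for forward-strand ORFs, 1-based and inclusive.

    For each stop codon, the ORF is reported from the earliest in-frame start
    codon since the previous stop. The stop codon is included in the range,
    and min_codons counts the stop codon as well.
    """
    seq = seq.upper()
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon in START_CODONS:
                orf_start = i
            elif orf_start is not None and codon in STOP_CODONS:
                n_codons = (i + 3 - orf_start) // 3
                if n_codons >= min_codons:
                    yield (orf_start + 1, i + 3, frame + 1)
                orf_start = None


toy = "CCATGAAATAGGG"  # hypothetical sequence, for illustration only
print(list(find_orfs(toy)))  # [(3, 11, 3)]
```

Candidate reading frames on the opposite strand can be found by running the same scan on the reverse complement of the sequence. In practice, candidates from such a scan would be filtered using codon-usage statistics and similarity alignments, as the paragraph above describes.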

[0043]As used herein, the term "proteins" means the polypeptides, the enzymes and the like, as those terms are commonly used in the art, which are encoded by the nucleic acid molecule comprising the biosynthetic pathway for producing the LL-F28249 compounds. The proteins of the invention encompass amino acid chains of varying length, including full-length, wherein the amino acid residues are linked by covalent peptide bonds, as well as the biologically active variants thereof. The proteins may be natural, recombinant or synthetic. For example, the biosynthesis proteins may be made through conventional recombinant technology by inserting a nucleotide sequence that encodes the protein into an appropriate expression vector and expressing the protein in a suitable host cell or through standard chemical synthesis by the Merrifield solid-phase synthesis method described in Merrifield, J. Am. Chem. Soc. 85:2149-2154 (1963), in which the amino acids are individually and sequentially attached to an amino acid chain. Alternatively, modern equipment is commercially available from a variety of manufacturers such as Perkin-Elmer, Inc. (Wellesley, Mass.) for the automated synthesis of proteins.

[0044]The biologically active variants that are included within the scope of the present invention comprise, at a minimum, the biologically functional portion of the amino acid sequence encoded by the nucleic acid molecule of the invention. As used herein, the "biologically functional portion" is that part of the protein structure which still retains the active function of the protein, for example, that part of the regulatory protein molecule encoded by the ORF1 gene which has the same or substantially the same activity and/or binding properties, i.e., at least about 90%, and more preferably, about 95%, similarities or potencies. The biologically active variants of the proteins include active amino acid structures having deleted, substituted or added amino acid residues, naturally occurring alleles, etc. The biologically functional portion may be easily identified by subjecting the full-length protein to chemical or enzymatic digestion to prepare fragments and then testing those fragments in standard assays to analyze which part of the amino acid structure retains the same or substantially the same biological activity as the full-length protein.

[0045]The determination of the full biosynthesis gene cluster of Fα, heretofore unknown, is of great commercial significance. The isolation and complete description of the gene according to the present methods permit the enhanced production of the active Fα compound and other natural members of the LL-F28249 family of compounds. Furthermore, the information about the gene enables an improved method for preparing the commercially potent semisynthetic derivatives such as moxidectin more quickly and efficiently than the prior chemical process of manufacture. As a direct and beneficial consequence of the cloning and characterization of the novel Fα biosynthesis gene cluster, which is described herein, unique processes for the direct fermentative production of moxidectin and other important LL-F28249 derivatives using bioengineered strains of S. cyaneogriseus are now obtainable.

[0046]One advantage of the present invention is the ability to enhance the production of the highly active Fα from the fermentation broth of S. cyaneogriseus. Cos11 contains a putative transcription activator gene (ORF1) for the PKS cluster. Increasing the expression level of the activator can result in a higher yield of Fα. This is achieved by increasing the copy number of the gene or by enhancing the regulatory sequence elements for this gene according to known techniques (see, for example, Perez-Llarena et al., Journal of Bacteriology 179:2053-2059 (1997)).

[0047]Another benefit derived from obtaining the full biosynthetic gene cluster of the present invention is to enable the efficient fermentative production and manufacture of the natural and semisynthetic derivatives of the LL-F28249 family of compounds such as, for example, LL-F28249α, LL-F28249β, LL-F28249γ, 23-(O-methyloxime)-LL-F28249α (moxidectin), 23-(O-methyloxime)-5-(phenoxyacetoxy)-LL-F28249α, 23-(semicarbazone)-LL-F28249α, 23-(thiosemicarbazone)-LL-F28249α, etc. Through the identification of the biosynthesis genes encoding the proteins responsible for the production of the LL-F28249 compounds and, desirably, the Fα metabolite as the major product, additional cloning and mutagenesis of the pathway readily produces other metabolites as by-products of the fermentation process. The biosynthesis genes are particularly useful to minimize the number of chemical reaction steps in preparing other semisynthetic members of the family.

[0048]The highly preferred utility of this invention involves the preparation of the commercially important compound moxidectin in fewer steps than previously done via known chemical processes. Moxidectin is currently produced by a four-step chemical process from Fα, which is first obtained by fermentation of Streptomyces cyaneogriseus subsp. noncyanogenus. The conversion of the natural metabolite Fα to moxidectin involves the following chemical reactions: (1) protection of the 5-hydroxyl group; (2) oxidation of the 23-hydroxyl group to a keto function; (3) conversion of the 23-keto to 23-O-methyloxime group; and (4) deprotection of the 5-hydroxyl group. The efficient method of the present invention now permits the chemical conversion of 23-keto Fα to moxidectin to be accomplished in a single step.

[0049]By generating mutants of the biosynthesis gene cluster, the specific activity responsible for reduction of the keto function at position 23 of the LL-F28249 compound structure is eliminated and the chemical synthesis is reduced to one step. Surprisingly, the remainder of the modular polyketide synthase remains functional and recognizes the unnatural polyketide intermediate. The unique bioengineered strain is then capable of being used, cloned and re-used for the direct fermentative production of 23-keto Fα, further reducing the normal processing time.

[0050]In the below examples, selective mutagenesis illustrates how to modify Fα biosynthesis and to obtain the desired metabolites according to the present methods. Basically, mutants of the module 3 ketoreductase domain of the S. cyaneogriseus Fα biosynthetic gene cluster are generated by site-directed mutagenesis. These ketoreductase variants are designed by comparing the predicted amino acid sequence of the Fα module 3 ketoreductase domain to ketoreductase domains from a number of biologically active ketoreductase domains and several "cryptic" ketoreductase domains. The module 3 ketoreductase domain of the S. cyaneogriseus Fα biosynthetic gene cluster is then replaced with these variant domains by homologous recombination in order to alter Fα biosynthesis and obtain the desired metabolite.

[0051]Generally speaking, the site-directed mutagenesis introduces a small deletion or point mutation in the 23-keto (oxo) reductase gene (23-KR gene) to render the 23-ketoreductase domain nonfunctional while it retains the functions of other domains of the polyketide synthase. Mutations in the 23-KR gene are introduced by standard methods into a wild-type Streptomyces cyaneogriseus subsp. noncyanogenus strain or the mutant Fα production strain 142, resulting in the direct fermentative production of 23-keto (oxo) Fα. In addition, the whole Fα PKS gene cluster carrying mutations in the 23-KR gene may be introduced into a suitable host cell such as S. lividans, S. coelicolor, E. coli and the like to produce 23-keto Fα. The transformed host cells are used as the source of DNA for conjugal transfer to S. cyaneogriseus using methods described herein for the further fermentative production of 23-keto Fα.

[0052]The imino derivatives (23-oxime) of the 23-oxo compounds are then readily prepared by standard techniques such as procedures described by S. M. McElvain in The Characterization of Organic Compounds, published by MacMillian Company, New York, 1953, pages 204-205 and incorporated herein by reference. Typically, the 23-oxo compound is stirred in alcohol, such as methanol or ethanol, or dioxane in the presence of acetic acid and an excess of the amino derivatizing agent, such as hydroxylamine hydrochloride, O-methylhydroxylamine hydrochloride, semicarbazide hydrochloride and the like along with an equivalent amount of sodium acetate, at room temperature to about 50° C. The reaction is usually complete in several hours to several days at room temperature but can be readily speeded by heating. This subsequent conversion to moxidectin via the 23-keto Fα compound is surprisingly and beneficially the only necessary chemical reaction to take place.

[0053]It is further contemplated that the genetic material contained within the three cosmids, Cos11, Cos36 and Cos40, may be reduced to fit into two plasmids or a single plasmid through genetic manipulations known to those of ordinary skill in the art. For example, the cloned Fα biosynthesis genes that are present in the Cos11, Cos36 and Cos40 prepared according to the methods of the present invention would be used to assemble the entire polyketide synthase (PKS) gene cluster on two plasmids or a single plasmid. The assembling can be achieved by use of cloning, PCR or synthetic genes, or a combination of any of these art-recognized techniques. The assembled Fα PKS gene cluster can be introduced into a suitable host cell such as S. lividans, S. coelicolor, E. coli and the like to produce Fα. Thereafter, the assembled PKS gene cluster can be used in a cell-free expression system such as, for example, a cell-free expression system described by Olsthoorn-Tieleman et al., Eur. J. Biochem. 268:3807-3815 (2001), to produce further amounts of Fα and related products.

[0054]Using the modular organization of the core LL-F28249α polyketide synthase and the functional domains within those modules, the biosynthesis gene cluster described herein is cloned and fully characterized. Generally, for the isolation of the biosynthetic genes, a cosmid library of S. cyaneogriseus genomic DNA is prepared in the commercially available vector pSuperCos (Stratagene, La Jolla, Calif.). This cosmid library is probed with fragments of DNA corresponding to the avermectin module 1 ketoacyl synthase, which has been amplified from S. avermitilis genomic DNA using the polymerase chain reaction. Subsequently, several regions of the Fα biosynthetic gene cluster, which have been amplified from previously characterized cosmids using the polymerase chain reaction, are used as probes to isolate additional cosmids. Using these methods, a series of cosmids are isolated that collectively span over 100 Kbp of genomic DNA. Complete restriction endonuclease mapping and thorough nucleotide sequence analysis identify the cosmids and result in a definitive, unambiguous contiguous nucleotide sequence spanning nearly 88 Kbp. Analysis of this nucleotide sequence reveals the presence of 13 complete modules of a modular polyketide synthase together with at least six additional genes involved in the biosynthesis or in the regulation of the biosynthesis of Fα.

[0055]The invention further embraces biologically functional plasmids or vectors containing the nucleic acid molecule of the present invention. The particular plasmids of the invention are selected for their ability to incorporate large DNA gene clusters but they are conventional and are derived from commonly available vectors, for example, pKR0.9, the pFDmod3/5.2 series, the pFDmod3/4.2 series and the like.

[0056]Although E. coli is used as the heterologous host in the examples, the heterologous expression of antibiotic biosynthetic genes is expected in a wide number of Actinomycetales, Bacillus, Corynebacteria, Thermoactinomyces and the like so long as they are capable of being transformed with the relatively large plasmid constructs described herein. Those that are transformed include, but are not limited to, Streptomyces lividans, Streptomyces coelicolor, Streptomyces griseofuscus and Streptomyces ambofaciens, which are known to be relatively non-restricting. Preferably, the suitable host cell that is stably transformed or transfected by the plasmid or vector is Streptomyces coelicolor or an Escherichia coli-Streptomyces cosmid vehicle. In vitro expression of the proteins may be performed, if desirable, using standard art methods.

[0057]The following section highlights general methods and materials, available to those of ordinary skill in the art, which have been used to successfully clone and characterize the entire, large biosynthetic pathway of the present invention.

General Methods and Materials

A. Materials, Plasmids and Bacterial Strains

[0058]An E. coli-Streptomyces shuttle vector that contains elements required for replication and selection in E. coli and in Streptomyces, including antibiotic resistance markers for selection with apramycin, pKC1132, is used throughout this work (see M. Bierman et al., Gene 116:43-49 (1992)). In addition to pKC1132, commercially available cloning vectors are used as indicated herein. Those of ordinary skill in the art will be able to select other well known cloning vectors, which can readily be substituted for the exemplified vectors, and avoid or minimize instability problems encountered with certain older strains of the cosmid-harboring E. coli using standard techniques.

[0059]Plasmid DNA is manipulated using procedures similar to those established by work on other plasmids. Typical procedures are presented in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). Typical procedures for Streptomyces are presented in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985). Specific methods used in this work are described herein unless they are identical to methods presented in the above-referenced laboratory manuals.

[0060]E. coli JM109 and DH5α, common laboratory strains used throughout this work, are readily available from a number of commercial sources (for example, Stratagene, La Jolla, Calif.). E. coli XL1-Blue MRF' strain is obtained from Stratagene (La Jolla, Calif.). E. coli ETS12567 (pUZ8002) is obtained from Professor Heinz Floss, of the Department of Chemistry, University of Washington (Seattle, Wash.). E. coli VCS257 is obtained from Stratagene (La Jolla, Calif.). S. avermitilis is obtained from the American Type Culture Collection under ATCC Deposit Accession No. 31,267 but it can also be obtained from the Agricultural Research Culture Collection (NRRL), 1815 N. University Street, Peoria, Ill. 61604, under NRRL 8165. "Wild-type" Streptomyces cyaneogriseus subsp. noncyanogenus LL-F28249 (NRRL 15773) and the mutant Fα production strain of S. cyaneogriseus designated "S. cyaneogriseus strain 142" are used separately throughout this written disclosure of the present invention but they are interchangeable and may substitute for each other in any given step of the disclosed process. Strain 142, which is derived from the wild-type strain, has undergone classic genetic manipulations to enhance antibiotic production but it retains the same polyketide synthase DNA sequence as the wild-type strain. Because their polyketide synthase sequences are identical, all of the plasmids described herein, including but not limited to Cos11, Cos36 and Cos40, can be derived from wild-type Streptomyces cyaneogriseus subsp. noncyanogenus or S. cyaneogriseus strain 142 with the same result.

B. Restriction Analysis of Plasmid DNA

[0061]Procedures for restriction analysis of plasmid DNA, procedures for agarose gel electrophoresis, and other standard techniques of recombinant DNA technology are described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). Plasmid DNA is digested with restriction endonucleases according to the manufacturer's procedures. Enzymes are obtained from New England Biolabs (Beverly, Mass.), Life Technologies (Rockville, Md.) or Promega (Madison, Wis.). Restriction digests are analyzed by electrophoresis in 0.8% w/v agarose using 40 mM tris-acetate, 1 mM EDTA as a buffer. The size of the fragments is determined by comparison to DNA fragments of known molecular weight (1 Kb ladder, Life Technologies, Rockville, Md.).
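
The comparison of fragments to a ladder of known sizes can be illustrated numerically. The following is a minimal Python sketch, not part of the disclosed procedure. It assumes that the logarithm of fragment size is approximately linear in migration distance between adjacent ladder bands, which is a common approximation. The function name, ladder sizes and migration distances are hypothetical values chosen only for illustration.

```python
import math

def estimate_size_bp(distance_mm, ladder):
    """Estimate a fragment size by log-linear interpolation between ladder bands.

    ladder: list of (size_bp, distance_mm) pairs for bands of known size.
    """
    # Sort by migration distance (larger fragments migrate less far).
    pts = sorted(ladder, key=lambda p: p[1])
    if not pts[0][1] <= distance_mm <= pts[-1][1]:
        raise ValueError("distance outside the range covered by the ladder")
    for (s1, d1), (s2, d2) in zip(pts, pts[1:]):
        if d1 <= distance_mm <= d2:
            frac = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size

# Hypothetical ladder readings: (size in bp, migration distance in mm).
LADDER = [(10000, 10.0), (5000, 18.0), (2000, 30.0), (1000, 40.0)]

if __name__ == "__main__":
    print(round(estimate_size_bp(24.0, LADDER)))
```

Bands that migrate outside the range bracketed by the ladder raise an error rather than being extrapolated, since the log-linear approximation is least reliable there.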

C. Preparation of Hybridization Probes

[0062]Hybridization probes are isolated from plasmids following restriction digestion or are generated using the polymerase chain reaction as described herein. Probes are radiolabeled to high specific radioactivity using EasyTides® α-32P-dCTP (3000 Ci/mmol) from New England Nuclear (Boston, Mass.) and the Rediprime® II random prime labeling system from Amersham Pharmacia Biotech (Piscataway, N.J.) according to procedures provided by the manufacturer.

[0063]Hybridization probes are used to identify cosmids containing the Fα biosynthetic gene cluster (from both S. cyaneogriseus strain 142 and wild-type S. cyaneogriseus cosmid libraries), to confirm and characterize transconjugants and excisants, and to facilitate the generation of accurate restriction maps of the Fα biosynthetic gene cluster that confirm the identity of the gene. These hybridization probes are either generated by PCR amplification or the probes are excised from clones as summarized in the following Table 1.

TABLE 1

Probe | PCR Primer Sequence or Restriction Sites | Use
--- | --- | ---
Avermectin KS1 | F: GCCGAATTCCTTCGGCATCAGCCCC; R: GCTCGCACCGTCCTGGTTGACCGC | To Isolate Cosmids Containing the Fα Biosynthetic Gene Cluster (S. cyaneogriseus strain 142)
NE5.7 | 5.7 Kbp NotI/EcoRI Fragment of Cos7 (Contains Fα Module 3) | To Isolate Cosmids Containing the Fα Biosynthetic Gene Cluster (wild-type S. cyaneogriseus)
Apramycin | 750 bp SacI Fragment of pKC1132 | To Confirm and Characterize Transconjugants
Mod3 | F: GACAACGTCGGTCCGG; R: CGCGGTGACTCGCTTGAGGTATTC | To Confirm and Characterize Transconjugants, and in Restriction Mapping
Thioesterase | F: GCTTCACCGACCCCTCGGCTATGACC; R: GTGAAGTGGTTGCCGTCGGTTTCGAGG | To Restriction Map the Right End of the Fα Biosynthetic Gene Cluster
p450 | F: GATGACGTGCTCACCGATGTCGGTGAGC; R: GACGTGGAAATCATGTACAGCTCGTACG | To Restriction Map the Right End of the Fα Biosynthetic Gene Cluster
Cos36 (end) | 500 bp NotI Fragment of Cos36 | To Restriction Map the Right End of the Fα Biosynthetic Gene Cluster
Cos12 (end) | 1.1 Kbp BamHI/EcoRI Fragment of Cos12 | To Restriction Map the Left End of the Fα Biosynthetic Gene Cluster
B5.5 | 5.5 Kbp BamHI Fragment of Cos11 | To Restriction Map the Left End of the Fα Biosynthetic Gene Cluster, and To Isolate Cosmids Containing the Fα Biosynthetic Gene Cluster (wild-type S. cyaneogriseus)
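
Basic properties of the PCR primers listed in Table 1 can be checked with a few lines of code. The following Python sketch is illustrative only. It computes length, GC fraction and reverse complement for two of the primer pairs copied from Table 1. The helper names and the dictionary layout are assumptions made for this example and do not come from the disclosure.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Return the fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

# Primer sequences copied from Table 1; the dictionary keys are illustrative labels.
PRIMERS = {
    "Avermectin KS1 F": "GCCGAATTCCTTCGGCATCAGCCCC",
    "Avermectin KS1 R": "GCTCGCACCGTCCTGGTTGACCGC",
    "Mod3 F": "GACAACGTCGGTCCGG",
    "Mod3 R": "CGCGGTGACTCGCTTGAGGTATTC",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.0%}, "
              f"reverse complement = {reverse_complement(seq)}")
```

The high GC fractions reported by such a check are consistent with the GC-rich genomes typical of Streptomyces. The avermectin forward primer also contains the hexamer GAATTC, which is the EcoRI recognition sequence.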

Isolation, Maintenance and Propagation of Plasmids

A. Plasmid Isolation

[0064]E. coli strains, both untransformed and those transformed with vectors as described herein, are grown using well-established methods similar to those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).

[0065]Plasmid DNA is isolated from E. coli cultures using reagents and materials obtained from QIAGEN (Valencia, Calif.). Depending on the number of strains being analyzed, the miniprep plasmid isolation systems used include the QIAprep® Spin Miniprep Kits (for plasmid isolation from relatively small numbers of strains); the QIAprep® 8 Turbo Miniprep Kits (for higher-throughput plasmid isolation from somewhat larger numbers of strains); or the QIAprep® 96 Turbo Miniprep Kits (for partially automated isolation of plasmids from strains in 96-well blocks). For the isolation of larger quantities of plasmid DNA from E. coli, reagents and materials included in the QIAGEN Plasmid Midi (up to 100 μg) and Maxi (up to 500 μg) kits, or reagents and materials included in the Nucleobond AX-100 (up to 100 μg) kit from Clontech (Palo Alto, Calif.) are used.

B. Transformation of Escherichia coli by Plasmid DNA

[0066]Plasmid DNA is transformed into electrocompetent E. coli strains by electroporation or into chemically competent E. coli strains by heat shock using well-established procedures similar to those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). Transformants are selected using appropriate antibiotics, and after plasmids are isolated using methods described herein, they are characterized following digestion with restriction endonucleases, again using well-established methods described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).

C. Conjugal Transfer of Plasmid DNA from Escherichia coli to Streptomyces cyaneogriseus

[0067]In all cases, the plasmids of interest are first transformed into the E. coli strain designated ETS12567 (pUZ8002) by electroporation as described herein. This strain is cmr, tetr, dam-, and dcm-1. Additionally, pUZ8002, which is an oriT- version of the plasmid pRK2 (see R. Meyer et al., Science 190:1226-1228 (1975)), confers kanr. The transformed cells are maintained in the presence of appropriate antibiotic selection, including 5 μg/ml kanamycin and 100 μg/ml apramycin. The conjugal transfer of plasmid DNA from these E. coli transformants to S. cyaneogriseus is accomplished using the following procedures, both of which are modified from a procedure described by M. Bierman et al., Gene 116:43-49 (1992).

[0068]Conjugation Method #1: 3 ml of LB media supplemented with 5 μg/ml kanamycin, 5 μg/ml chloramphenicol and 50 μg/ml apramycin is inoculated with a single well-isolated transformed E. coli colony, and the culture is incubated at 37° C., with shaking at 220 rpm, for 16 hours. 10 ml TSB (27.5 g/L tryptic soy broth, 5 g/L yeast extract, 5 g/L KH2PO4, pH 7.0, 100 ml/L of a sterile solution of 20% (w/v) glucose added after autoclaving) media is inoculated with 100 μl of a frozen stock of S. cyaneogriseus mycelial fragments, and the culture is incubated at 31° C., with shaking at 220 rpm, for 16 hours. The next day, 10 ml LB media supplemented with 50 μg/ml apramycin is inoculated with a 100 μl aliquot of the overnight E. coli culture. At the same time, a 2 ml aliquot of the S. cyaneogriseus overnight culture is vortexed in a tube containing sterile glass beads for 2 minutes. The suspension is sonicated (3×, 5 second bursts at 100% output); and 1 ml of this suspension of mycelial fragments is transferred to 9 ml of TSB (27.5 g/L tryptic soy broth, 5 g/L yeast extract, 5 g/L KH2PO4, pH 7.0, 100 ml/L of a sterile solution of 20% (w/v) glucose added after autoclaving). Both cultures are incubated at 37° C., with shaking at 220 rpm, until the absorbance at 600 nm of the E. coli culture reaches 0.4-0.6. The cells in each culture are collected by centrifugation, washed 2× with LB, and suspended in 500 μl 2XYT (16 g/L tryptone, 10 g/L yeast extract, 5 g/L NaCl, pH 7.0). Aliquots (100 μl) of the two preparations are combined; the mixture is incubated at 50° C. for 5 minutes; and the cells are collected by centrifugation. The supernatant is removed, and the cell pellet is suspended in 100 ml of 2XYT (16 g/L tryptone, 10 g/L yeast extract, 5 g/L NaCl, pH 7.0), and plated onto SFM (25 g/L soybean flour nutrisoy, 25 g/L mannitol, 20 g/L agar, 0.462 g/L L-cysteine, 0.462 g/L L-arginine, 0.462 g/L L-proline) plates. These plates are incubated at 37° C. for 16 hours, and then overlaid with 1 ml of sterile water containing 0.5 mg of nalidixic acid and 1 mg of apramycin (final concentrations 20 μg/ml and 40 μg/ml, respectively). The plates are incubated at 37° C. until colonies are well established.

[0069]Conjugation Method #2: 3 ml of LB media supplemented with 5 μg/ml kanamycin, 5 μg/ml chloramphenicol and 100 μg/ml apramycin is inoculated with a single well-isolated transformed E. coli colony, and the culture is incubated at 37° C., with shaking at 220 rpm, for 16 hours. 25 ml KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8, and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O) is inoculated with 1 ml of a frozen stock of S. cyaneogriseus, and the culture is incubated at 31° C., with shaking at 220 rpm, for 16 hours. The next day, 1 ml of the overnight E. coli culture is combined with 9 ml of LB supplemented with 50 μg/ml apramycin. At the same time, a 5 ml aliquot of the S. cyaneogriseus overnight culture is vortexed in a tube containing sterile glass beads for 2 minutes. A 2.5 ml aliquot of the homogenized culture is inoculated into 25 ml of KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8 and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O), and both cultures are incubated at 37° C. for 3 hours. The cells in each culture are collected by centrifugation, and washed 2× with water. The E. coli and S. cyaneogriseus cell pellets are suspended in 1 ml and 2 ml, respectively, of TSB (27.5 g/L tryptic soy broth, 5 g/L yeast extract, 5 g/L KH2PO4, pH 7.0, 100 ml/L of a sterile solution of 20% (w/v) glucose added after autoclaving). 10 μl of the S. cyaneogriseus suspension, and 100 μl of the E. coli suspension are combined with 890 μl of TSB (27.5 g/L tryptic soy broth, 5 g/L yeast extract, 5 g/L KH2PO4, pH 7.0, 100 ml/L of a sterile solution of 20% (w/v) glucose added after autoclaving), and 100 μl of the mixture is plated onto AS-1 plates (1 g/L yeast extract, 0.2 g/L L-alanine, 0.2 g/L L-arginine, 0.5 g/L L-asparagine, 5 g/L soluble starch, 2.5 g/L NaCl, 10 g/L Na2SO4, 20 g/L agar, pH 7.5) supplemented with 10 mM MgCl2. These plates are incubated at 37° C. for 16 hours, and then overlaid with 3 ml of R2 agar (100 g/L sucrose, 10 g/L glucose, 10 g/L MgCl2, 0.25 g/L K2SO4, 0.1 g/L casamino acids, 25 g/L agar). At use, the following solutions are added to each 80 ml flask of R2 agar: 1 ml of 0.5% K2HPO4; 8 ml of 3.68% CaCl2.2H2O; 1.5 ml of 20% L-proline; 10 ml of 5.73% TES, pH 7.2; 0.5 ml of 1N NaOH; and 1 ml of a trace elements solution (containing 40 mg/L ZnCl2, 200 mg/L FeCl3.6H2O, 10 mg/L CuCl2.2H2O, 10 mg/L MnCl2.4H2O, 10 mg/L Na2B4O7.10H2O, 10 mg/L (NH4)6Mo7O24.4H2O). The solution is also supplemented to 100 μg/ml apramycin and 100 μg/ml nalidixic acid (final concentrations). The plates are incubated at 37° C. until colonies are well established.
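
The overlay arithmetic in Conjugation Method #1 can be checked with a short calculation. The following Python sketch is illustrative only. It assumes that the overlaid antibiotics distribute evenly through the whole plate, so that the final concentration equals the mass added divided by the total plate volume. The plate volume itself is not stated in the text and is inferred here from the stated masses and final concentrations; the function names are hypothetical.

```python
def final_concentration_ug_per_ml(mass_mg, plate_volume_ml):
    """Final concentration, assuming even diffusion through the whole plate volume."""
    return mass_mg * 1000.0 / plate_volume_ml

def implied_plate_volume_ml(mass_mg, final_ug_per_ml):
    """Plate volume implied by a stated mass and a stated final concentration."""
    return mass_mg * 1000.0 / final_ug_per_ml

if __name__ == "__main__":
    # Values stated for Method #1: 0.5 mg nalidixic acid -> 20 ug/ml,
    # and 1 mg apramycin -> 40 ug/ml.
    print(implied_plate_volume_ml(0.5, 20.0))
    print(implied_plate_volume_ml(1.0, 40.0))
```

Both stated pairs imply the same total volume under this assumption, which suggests the two final concentrations in Method #1 are mutually consistent.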

[0070]Using either method, putative transconjugants are repetitively picked onto fresh plates, in the presence of 100 μg/ml apramycin and 100 μg/ml nalidixic acid until cured of visible contamination by the E. coli strain used as the source of the plasmid.

[0071]The purified DNA derived from Streptomyces cyaneogriseus subsp. noncyanogenus, which encodes the entire biosynthetic pathway for the production of the LL-F28249 compounds, has been deposited in connection with the present patent application under the conditions mandated by 37 C.F.R. § 1.808 and maintained pursuant to the Budapest Treaty in the American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, U.S.A. More specifically, the purified cosmid DNA, described herein fully and identified as Cos11, Cos36 and Cos40, was deposited in the ATCC on May 24, 2002 and assigned ATCC Patent Deposit Designation Numbers PTA-4392, PTA-4393 and PTA-4394, respectively. It should be appreciated that related purified DNA, other cosmids or plasmids containing related nucleotide sequences, which may be readily constructed using site-directed mutagenesis and the techniques described herein, are also encompassed within the scope of the present invention.

[0072]The following examples demonstrate certain aspects of the present invention. However, it is to be understood that these examples are for illustration only and do not purport to be wholly definitive as to conditions and scope of this invention. It should be appreciated that when typical reaction conditions (e.g., temperature, reaction times, etc.) have been given, the conditions both above and below the specified ranges can also be used, though generally less conveniently. The examples are conducted at room temperature (about 23° C. to about 28° C.) and at atmospheric pressure. All parts and percents referred to herein are on a weight basis and all temperatures are expressed in degrees centigrade unless otherwise specified.

[0073]A further understanding of the invention may be obtained from the non-limiting examples that follow below.

Example 1

Characterization of the Biosynthetic Gene Cluster for Making LL-F28249 Compounds

A. Isolation and Characterization of Cosmids Containing the Fα Biosynthetic Gene Cluster

[0074]1. Construction of Streptomyces cyaneogriseus Cosmid Libraries

[0075]Genomic DNA was isolated from S. cyaneogriseus (both wild-type and the Fα production strain designated 142) using a method presented in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985) ("Isolation of Streptomyces 'Total' DNA: Procedure 3"). The S. cyaneogriseus genomic DNA preparation was subjected to partial restriction endonuclease digestion with Sau3AI as follows. A reaction mixture was prepared containing Sau3AI and genomic DNA, and at time points (0, 5, 10, 15, 20, 30, and 45 minutes) aliquots were removed and the reactions were quenched by the addition of EDTA to a final concentration of 10 mM. A portion of each quenched reaction time point was resolved by electrophoresis through 0.3% w/v agarose at 25 volts for 16 hours. The reaction time point containing DNA fragments that were predominantly between 23 Kbp and 50 Kbp was selected for the cosmid library. At the same time, pSuperCos 1 (Stratagene, La Jolla, Calif.) was digested with the restriction endonuclease XbaI; dephosphorylated using calf intestine alkaline phosphatase; and after ethanol precipitation, the linear vector was digested with the restriction endonuclease BamHI in order to remove one of the Cos sites. The Sau3AI fragments of S. cyaneogriseus genomic DNA were ligated into linearized, BamHI-treated pSuperCos 1 according to procedures provided by the manufacturer. The resultant recombinant cosmid DNA preparation was packaged using Gigapack® III XL Packaging Extract, and after lysis of the resultant lambda phage particles with chloroform, the cosmid DNA library was transformed into E. coli VCS257. These manipulations were all conducted using reagents, materials, and procedures provided by the manufacturer (Stratagene, La Jolla, Calif.).
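
The size of cosmid library needed to represent a genome can be estimated with the standard Clarke-Carbon relation, N = ln(1 - P) / ln(1 - I/G), where I is the insert size, G is the genome size and P is the desired probability that any given sequence is present. The following Python sketch is illustrative only and is not part of the disclosed method. The genome size of 8 Mbp is an assumed round figure for a Streptomyces genome, and the 35 Kbp insert is an assumed value within the 23 Kbp to 50 Kbp range selected above.

```python
import math

def clones_needed(genome_bp, insert_bp, probability):
    """Clarke-Carbon estimate of the number of clones needed so that a given
    sequence is present in the library with the stated probability."""
    fraction = insert_bp / genome_bp  # fraction of the genome in one clone
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))

if __name__ == "__main__":
    # Assumed values: 8 Mbp genome, 35 Kbp average insert, 99% probability.
    print(clones_needed(8_000_000, 35_000, 0.99))
```

Under these assumptions the estimate is on the order of a thousand clones. Smaller average inserts or a higher target probability increase the number required.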

[0076]2. Isolation of Cosmids Containing the Fα Biosynthetic Gene Cluster

[0077]Genomic DNA was isolated from S. avermitilis using a method presented in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985) ("Isolation of Streptomyces 'Total' DNA: Procedure 3"). This genomic DNA preparation was used as a template for amplification of a region of the module 1 ketoacyl synthase domain of the avermectin biosynthetic gene cluster using the polymerase chain reaction. The oligonucleotide primers used were designed on the basis of nucleotide sequences of the avermectin biosynthetic gene cluster that have been deposited into public databases. Colony lifts of the S. cyaneogriseus strain 142 cosmid library were screened for hybridization to the avermectin ketoacyl synthase probe, and more than 30 cosmids potentially containing type I polyketide synthase DNA were isolated. Initially, these cosmids were analyzed following digestion with BamHI, by agarose gel electrophoresis, by Southern blot using the avermectin module 1 ketoacyl synthase probe, and by limited nucleotide sequence analysis. Comparison of these data to data reported by MacNeil and colleagues (see D. J. MacNeil et al., Gene 115:119-125 (1992) and D. J. MacNeil et al., Annals of the New York Academy of Sciences 721:123-132 (1994)) suggested that two of these cosmids (designated Cos7 and Cos11) appeared to span the majority of the Fα biosynthetic gene cluster. The limited data presented by MacNeil and his colleagues were also used as the initial basis to support the isolation of a 5.7 Kbp NotI-EcoRI fragment that included most of module 3. A clone of this 5.7 Kbp NotI-EcoRI fragment was prepared (designated pNE57). The nucleotide sequence of this 5.7 Kbp fragment was determined in its entirety. This fragment of the Fα biosynthetic gene cluster (from genomic DNA isolated from the Fα production strain) was then used as a probe to screen the wild-type S. cyaneogriseus cosmid library and 45 cosmids potentially containing type I polyketide synthase DNA were isolated. These cosmids were extensively mapped with BamHI, NotI, and EcoRI using methods described herein, and on the basis of comparison of those restriction maps to the incomplete data presented by MacNeil and his colleagues, two cosmids (designated Cos36 and Cos40 from the wild-type strain), that appeared to span the majority of the Fα biosynthetic gene cluster, were identified.

[0078]In order to identify cosmids spanning the "ends" of the Fα biosynthetic gene cluster, but not containing significant stretches of core polyketide synthase DNA, the following strategy was employed. A 5.5 Kbp BamHI fragment isolated from Cos11 (from S. cyaneogriseus strain 142) was used to reprobe the wild-type S. cyaneogriseus cosmids that had been selected previously in order to identify additional cosmids that would extend the cluster to the "left." A number of cosmids were identified that hybridized to the probe, and after restriction mapping, one of these, Cos14, was identified that would support extending the cluster the furthest to the left. A 500 bp NotI fragment isolated from the 3' end of Cos36 was used to reprobe the wild-type S. cyaneogriseus cosmid library in order to identify additional cosmids that would extend the cluster to the "right." A number of additional cosmids were identified that hybridized to the probe, and after restriction mapping, one of these, Cos50, was identified that would support extending the cluster the furthest to the "right."

[0079]3. Restriction Mapping Cosmids Containing the Fα Biosynthetic Gene Cluster

[0080]Initially, more than 30 cosmids from the S. cyaneogriseus strain 142 cosmid library that hybridized to the avermectin ketoacyl synthase probe, and 45 cosmids from the wild-type S. cyaneogriseus cosmid library that hybridized to the Fα module 3 probe (pNE57), were mapped following digestion with BamHI, NotI, and EcoRI. On the basis of this preliminary analysis, and on the basis of comparison of the restriction maps to the incomplete data presented by MacNeil and his colleagues (see D. J. MacNeil et al., Gene 115:119-125 (1992) and D. J. MacNeil et al., Annals of the New York Academy of Sciences 721:123-132 (1994)), several cosmids were selected for more comprehensive analysis. These cosmids (designated Cos7 and Cos11 from S. cyaneogriseus strain 142; and Cos12, Cos14, Cos36, Cos40 and Cos50 from wild-type S. cyaneogriseus) were carefully mapped following digestion with BamHI, NotI, and EcoRI and double-digestion with BamHI/MluI, NotI/EcoRI, BamHI/EcoRI, SacI/EcoRI, and NotI/MluI. To resolve ambiguity in the restriction maps that were observed, subclones of these cosmids were constructed as summarized in the following Table 2, and these subclones were extensively mapped as described above.

TABLE 2

Designation | Subcloned from | Vector | Restriction Sites/Size
--- | --- | --- | ---
pB5.5 | Cos11 | pZeroBlunt | BamHI/5.5 Kbp
pB18.0 | Cos11 | pUC19 | BamHI/18.0 Kbp
PBE15.0 | Cos12 | pBluescript KS | BamHI/EcoRI/15.0 Kbp
pB2.5 | Cos14 | pBluescript KS | BamHI/2.5 Kbp
pB5.5 | Cos14 | pZeroBlunt | BamHI/5.5 Kbp
PBB14.0 | Cos14 | pBluescript KS | BamHI/BglII/14.0 Kbp
PM14.0 | Cos14 | pLitmus38 | MluI/14.0 Kbp
PN2.0 | Cos14 | pBluescript KS | NotI/2.0 Kbp
PN4.3 | Cos14 | pBluescript KS | NotI/4.3 Kbp
pS1.45 | Cos14 | pBluescript KS | SacI/1.45 Kbp
pS8.2 | Cos14 | pBluescript KS | SacI/8.2 Kbp
pS2.0 | Cos14 | pLitmus38 | SphI/2.0 Kbp
pB11.5 | Cos36 | pBluescript KS | BamHI/11.5 Kbp
PBE4.8 | Cos36 | pBluescript KS | BamHI/EcoRI/4.8 Kbp
PM4.6 | Cos36 | pLitmus38 | MluI/4.6 Kbp
PN1.6 | Cos36 | pBluescript KS | NotI/1.6 Kbp
PN4.8 | Cos36 | pBluescript KS | NotI/4.8 Kbp
PBE5.3 | Cos40 | pBluescript KS | BamHI/EcoRI/5.3 Kbp
PN5.2 | Cos50 | pBluescript KS | NotI/5.2 Kbp
PN10.0 | Cos50 | pBluescript KS | NotI/10.0 Kbp
pS3.3 | Cos50 | pBluescript KS | SacI/3.3 Kbp
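
The logic of single and double digests used in restriction mapping can be illustrated in silico. The following Python sketch is a simplified illustration and not the mapping procedure actually used. It locates recognition sequences for several of the enzymes named above in a linear sequence and reports the resulting fragment lengths. As labeled simplifications, it cuts at the first base of each recognition site rather than at the true cleavage position, and it treats the molecule as linear although cosmids are circular. The toy sequence and function names are hypothetical.

```python
# Recognition sequences for enzymes named in the text.
SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "NotI": "GCGGCCGC",
    "MluI": "ACGCGT",
    "SacI": "GAGCTC",
}

def site_positions(seq, enzymes):
    """Return sorted start positions of all recognition sites for the given enzymes."""
    seq = seq.upper()
    positions = []
    for enzyme in enzymes:
        site = SITES[enzyme]
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
    return sorted(positions)

def fragment_lengths(seq, enzymes):
    """Fragment lengths for a linear molecule, cutting at the start of each site
    (a simplification of the real cleavage position)."""
    cuts = [0] + site_positions(seq, enzymes) + [len(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:]) if b > a]

# Hypothetical toy sequence with one BamHI site and one EcoRI site.
TOY = "AAAA" + "GGATCC" + "TTTTTTTT" + "GAATTC" + "CC"

if __name__ == "__main__":
    print(fragment_lengths(TOY, ["BamHI"]))
    print(fragment_lengths(TOY, ["EcoRI"]))
    print(fragment_lengths(TOY, ["BamHI", "EcoRI"]))
```

Comparing the single-digest and double-digest fragment lists shows which fragment of one digest is subdivided by the second enzyme, which is the reasoning that lets the relative order of sites be deduced.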

B. Nucleotide Sequence of the Fα Biosynthetic Gene Cluster

[0081]1. Sequencing Strategy

[0082]The vast majority of the nucleotide sequence data was obtained by end-sequencing random, size-selected sublibraries of cosmid DNA that were prepared as described herein. Random sublibraries were sequenced until sufficient coverage (8-10× redundancy) was expected over the entire fragment of DNA. In order to obtain nucleotide sequence data for regions of the biosynthetic gene cluster that were underrepresented in the random sublibraries, or that for other reasons were difficult to sequence, two other sequencing strategies were used. In the first, products were generated using the polymerase chain reaction in such a way as to span the region of interest of the gene cluster. These PCR products were sequenced directly using the PCR primers as sequencing primers, or the products were cloned into the commercially available PCR product cloning vector pTOPO TA (Invitrogen, Carlsbad, Calif.), and sequenced using universal primers. In the second, sequencing primers were synthesized which facilitated obtaining nucleotide sequence by "walking" through regions of interest on cosmids or subclones prepared from the cosmids. Throughout, nucleotide sequence was obtained on Applied Biosystems Model 377 Automated sequencers, using ABI PRISM® BigDye® Terminator Cycle Sequencing Ready Reaction reagents and materials according to detailed procedures provided by the manufacturer (Applied Biosystems, a Division of Perkin Elmer, Foster City, Calif.). Nucleotide sequence data was collected and analyzed using standard "Collection" and "Sequencing Analysis" algorithms (Applied Biosystems, a Division of Perkin Elmer, Foster City, Calif.). Nucleotide sequence assemblies were generated using the SeqMan® II sequence analysis package that is commercially available from DNASTAR (Madison, Wis.), and using the custom Finch®-300 Assembly Server developed for us by Geospiza (Seattle, Wash.).
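
The 8-10× redundancy target can be turned into an approximate read count with simple arithmetic. The following is a minimal sketch, not the procedure used here: the 500-base usable read length is an assumption (no read length is stated), the 10.0 Kbp target is simply the size of the NotI insert listed for pN10.0 in Table 2, and the function names are hypothetical.

```python
import math


def reads_needed(target_bp: int, redundancy: float, read_length_bp: int) -> int:
    """Smallest read count N with N * read_length / target >= redundancy."""
    return math.ceil(redundancy * target_bp / read_length_bp)


def clones_needed(reads: int, reads_per_clone: int = 2) -> int:
    """Clones to pick if each clone is end-sequenced from both ends (assumed)."""
    return math.ceil(reads / reads_per_clone)


if __name__ == "__main__":
    insert_bp = 10_000  # 10.0 Kbp NotI insert of pN10.0 (Table 2)
    read_bp = 500       # ASSUMED usable read length; not given in the text
    for fold in (8, 10):
        n = reads_needed(insert_bp, fold, read_bp)
        print(f"{fold}x over {insert_bp} bp: {n} reads, {clones_needed(n)} clones")
```

Average redundancy says nothing about how evenly reads are distributed, which is consistent with the need for the PCR and primer-walking strategies for underrepresented regions.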

[0083]Two cosmids (designated Cos36 and Cos40) that appeared on the basis of extensive restriction mapping to span the majority of the Fα biosynthetic gene cluster were isolated from the wild-type S. cyaneogriseus cosmid library. These cosmids were sequenced in their entirety by end-sequencing random, size selected sublibraries that were prepared as described herein. In addition, random, size selected sublibraries prepared from the inserts in several subclones (as summarized in the following Table 3) were also sequenced. Finally, the majority of the subclones generated to support comprehensive restriction mapping of the Fα biosynthetic gene cluster were end-sequenced using universal primers.

TABLE 3

Designation   Subcloned from Cosmid                     Restriction Sites/Size
pNE57         Cos7 (S. cyaneogriseus strain 142)        NotI-EcoRI/5.7 Kbp
pNE57         Cos40 (wild-type S. cyaneogriseus)        NotI-EcoRI/5.7 Kbp
pB5.5         Cos14                                     BamHI/5.5 Kbp
pN4.3         Cos14                                     NotI/4.3 Kbp
pN10.0        Cos50                                     NotI/10.0 Kbp
pS8.2         Cos14                                     SacI/8.2 Kbp

[0084]2. Construction of Sublibraries for Nucleotide Sequence Analysis

[0085]To generate large quantities of the inserts present in cosmids and in the subclones derived from those cosmids, large quantities of plasmid DNA were required. Media (typically 1 L) were inoculated with the clone of interest, and incubated at 37° C. overnight. Plasmid (cosmid) DNA was isolated from these cultures using materials and reagents included in the QIAGEN Plasmid Midi (up to 100 μg) and Maxi (up to 500 μg) kits, or reagents and materials included in the Nucleobond AX-100 (up to 100 μg) kit from Clontech (Palo Alto, Calif.). The inserts present in these plasmids (cosmids) were excised by digestion with appropriate restriction endonucleases, and the fragments were resolved by electrophoresis through 0.8% w/v agarose. The desired fragments were excised from these gels, and the DNA contained in those bands was isolated using reagents, materials, and procedures included in the QIAEX II® (for fragments larger than 10 Kbp) or QIAquick II (for fragments smaller than 10 Kbp) Gel Extraction Systems from QIAGEN (Valencia, Calif.). Then, the DNA was randomly sheared by sonication using a Microson cell disrupter at 10% output. Sonication times were optimized in order to generate fragments of the desired size (typically about 18 seconds for larger inserts isolated from cosmids, and about eight seconds for the smaller fragments isolated from plasmid subclones of those cosmids). Following ethanol precipitation, the DNA fragments were "blunted" using T4 DNA polymerase (New England Biolabs, Beverly, Mass.) in 25 μl reaction volumes containing 2.5 μl of 10×T4 DNA polymerase reaction buffer, 1 μl of 25 μg/ml BSA, and 1.5 μl of T4 DNA polymerase. The reaction mixtures were incubated at 16° C. for 20 minutes, and resolved by electrophoresis through 0.8% w/v agarose. 
The region of the gel containing DNA between 1.5 Kbp and 2.5 Kbp (by comparison to DNA fragments of known molecular weight) was excised, and the DNA was extracted from the agarose using reagents, materials, and procedures included in the QIAquick II Gel Extraction System from QIAGEN (Valencia, Calif.). Purified DNA was collected by ethanol precipitation and resuspended in 8 μl of water. These DNA fragments were then cloned into pCR®-Blunt, and the ligated products were transformed into chemically competent E. coli TOP10 using reagents, materials and procedures provided by the manufacturer (Invitrogen, Carlsbad, Calif.). Colonies were picked and used to inoculate 2 ml LB media supplemented with 50 μg/ml kanamycin, in 96-well deep well blocks. Plasmid DNA was purified from each of these cultures using reagents, materials and procedures included in QIAprep® 96 Turbo Miniprep Kits. Although the frequency of clones with insert generally exceeded 90%, each plasmid was digested with EcoRI and the fragments were resolved by electrophoresis through 0.8% w/v agarose in order to determine whether an insert of the desired size was present. Clones that did contain desired inserts were sequenced using universal sequencing primers as described herein.
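
The composition of the 25 μl blunting reaction described above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the volumes and stock concentrations are those given in the text, the helper names are hypothetical, and the remaining volume is assumed to be taken up by the sheared DNA and water.

```python
TOTAL_UL = 25.0

# Component volumes (in microliters) as stated for the T4 DNA polymerase reaction.
COMPONENTS_UL = {
    "10x T4 DNA polymerase reaction buffer": 2.5,
    "BSA (25 ug/ml stock)": 1.0,
    "T4 DNA polymerase": 1.5,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after diluting `volume_ul` of stock into `total_ul`."""
    return stock * volume_ul / total_ul


def remaining_volume(total_ul: float, components_ul: dict) -> float:
    """Volume left for DNA plus water (assumed to make up the balance)."""
    return total_ul - sum(components_ul.values())


if __name__ == "__main__":
    print("buffer strength (x):", final_concentration(10.0, 2.5, TOTAL_UL))
    print("BSA (ug/ml):", final_concentration(25.0, 1.0, TOTAL_UL))
    print("DNA + water (ul):", remaining_volume(TOTAL_UL, COMPONENTS_UL))
```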

[0086]3. Identification of Biosynthetic Modules and Domains within Modules

[0087]Many modular polyketide biosynthetic gene clusters have been characterized and manipulated. In addition, a large number of nucleotide sequences of modular polyketide biosynthetic gene clusters have been deposited in the public databases. In general, modules of modular polyketide biosynthetic gene clusters, and the domains within those modules, can be identified by performing BLAST searches against the public databases, and extensive use of those public databases was made to facilitate the present analysis of the Fα biosynthetic gene cluster (see S. F. Altschul et al., Nucleic Acids Research 25:3389-3402 (1997)). In addition, use was made of a recent literature reference that summarizes methods for identification of modular polyketide synthase domains and, in particular, describes the differentiation of malonyl-class from methylmalonyl-class acyltransferase domains (see S. J. Kakavas et al., Journal of Bacteriology 179:7515-7522 (1997)). Leadlay and colleagues originally described methods for differentiation of malonyl-class from methylmalonyl-class acyltransferase domains (see T. Schwecke et al., Proceedings of the National Academy of Sciences USA 92:7839-7843 (1995)).

[0088]A description of five open reading frames, which together encode the loading domain and the 13 modules of the polyketide synthase, is given in Table 4 below. For each open reading frame, the position in the Fα biosynthetic gene cluster (in nucleotides) and the length (in amino acids) of the predicted gene product are shown. In addition, the approximate location of each biosynthetic domain within that predicted gene product (again in amino acids) is also displayed. Abbreviations used are as follows: ACP, acyl carrier protein; ATm, malonyl-class acyltransferase; ATmm, methylmalonyl-class acyltransferase; DH, dehydratase; ER, enoylreductase; KR, ketoreductase; KS, ketoacyl synthase; LD, loading domain; TE, thioesterase.

TABLE 4

ORF4: nt 12850-19875 (2341 aa). Designation: Loading Domain-Mod1
  ATmm-LD aa 22-350; ACP-LD aa 365-450
  KS-1 aa 473-897; ATmm-1 aa 1006-1339; DH-1 aa 1359-1547; KR-1 aa 1865-2052; ACP-1 aa 2137-2223

ORF5: nt 19865-31036 (3723 aa). Designation: Mod2-Mod3
  KS-2 aa 34-466; ATmm-2 aa 574-908; KR-2 aa 1211-1391; ACP-2 aa 1473-1559
  KS-3 aa 1578-2005; ATm-3 aa 2136-2476; DH-3 aa 2486-2667; ER-3 aa 2925-3279; KR-3 aa 3287-3466; ACP-3 aa 3556-3640

ORF6: nt 31115-49246 (6043 aa). Designation: Mod4-Mod7
  KS-4 aa 34-456; ATm-4 aa 582-907; ACP-4 aa 950-1031
  KS-5 aa 1055-1481; ATm-5 aa 1613-1938; KR-5 aa 2247-2427; ACP-5 aa 2516-2601
  KS-6 aa 2621-3047; ATm-6 aa 3168-3493; KR-6 aa 3802-3983; ACP-6 aa 4078-4164
  KS-7 aa 4189-4615; ATmm-7 aa 4727-5056; DH-7 aa 5078-5257; KR-7 aa 5588-5768; ACP-7 aa 5868-5952

ORF9: nt 52809-69833 (5674 aa). Designation: Mod8-Mod10
  KS-8 aa 39-465; ATmm-8 aa 574-904; DH-8 aa 926-1106; ER-8 aa 1366-1718; KR-8 aa 1726-1908; ACP-8 aa 1995-2080
  KS-9 aa 2102-2529; ATm-9 aa 2661-2986; DH-9 aa 3009-3188; KR-9 aa 3492-3674; ACP-9 aa 3753-3842
  KS-10 aa 3864-4290; ATmm-10 aa 4402-4732; DH-10 aa 4753-4928; KR-10 aa 5234-5416; ACP-10 aa 5499-5586

ORF10: nt 69929-85429 (5166 aa). Designation: Mod11-Mod13
  KS-11 aa 34-456; ATm-11 aa 578-916; KR-11 aa 1199-1380; ACP-11 aa 1464-1549
  KS-12 aa 1570-1996; ATmm-12 aa 2105-2442; KR-12 aa 2724-2906; ACP-12 aa 2992-3076
  KS-13 aa 3096-3519; ATm-13 aa 3631-3975; DH-13 aa 4003-4188; KR-13 aa 4505-4687; ACP-13 aa 4780-4866; TE-13 aa 4893-5167
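
The domain organization in Table 4 lends itself to a simple tabulation. The sketch below transcribes the domain order of each of the 13 modules from Table 4 and reports, per module, the acyltransferase class and which reductive domains (DH, ER, KR) are annotated. It is a bookkeeping aid only: the presence of an annotated domain does not show that the domain is catalytically active, and the data structure and function name are hypothetical.

```python
# Domain order per module, transcribed from Table 4 (loading domain omitted).
MODULES = {
    1: ["KS", "ATmm", "DH", "KR", "ACP"],
    2: ["KS", "ATmm", "KR", "ACP"],
    3: ["KS", "ATm", "DH", "ER", "KR", "ACP"],
    4: ["KS", "ATm", "ACP"],
    5: ["KS", "ATm", "KR", "ACP"],
    6: ["KS", "ATm", "KR", "ACP"],
    7: ["KS", "ATmm", "DH", "KR", "ACP"],
    8: ["KS", "ATmm", "DH", "ER", "KR", "ACP"],
    9: ["KS", "ATm", "DH", "KR", "ACP"],
    10: ["KS", "ATmm", "DH", "KR", "ACP"],
    11: ["KS", "ATm", "KR", "ACP"],
    12: ["KS", "ATmm", "KR", "ACP"],
    13: ["KS", "ATm", "DH", "KR", "ACP", "TE"],
}

REDUCTIVE = ("DH", "ER", "KR")


def summarize(modules):
    """Map module number -> (AT class, tuple of annotated reductive domains)."""
    summary = {}
    for number, domains in modules.items():
        at_class = next(d for d in domains if d.startswith("AT"))
        reductive = tuple(d for d in domains if d in REDUCTIVE)
        summary[number] = (at_class, reductive)
    return summary


if __name__ == "__main__":
    for number, (at_class, reductive) in summarize(MODULES).items():
        print(f"Mod{number}: {at_class}; reductive domains: {', '.join(reductive) or 'none'}")
```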

[0089]4. Identification of Other Biosynthetic Pathway Genes

[0090]Whether the other open reading frames that were found to be clustered with the core modular polyketide synthase genes play a role in Fα biosynthesis and, if so, what that role might be, was assessed by BLAST comparison of the nucleotide and predicted amino acid sequences of these open reading frames to sequences that have been deposited in the public databases (see S. F. Altschul et al., Nucleic Acids Research 25:3389-3402 (1997)). Using those methods, a tentative identification of at least six other genes that could be involved in Fα biosynthesis was made.

[0091]A description of six additional open reading frames, which encode genes that could be involved in Fα biosynthesis, is given in Table 5 below. For each open reading frame, the position in the Fα biosynthetic gene cluster (in nucleotides) and the length (in amino acids) of the predicted gene product are shown. In addition, a brief description of the BLAST results used to assign a putative functional role in Fα biosynthesis is included below for each of the open reading frames.

TABLE 5

ORF     Position (nt)   Length (aa)   Designation
ORFA    382-2514        711           K+-Translocating ATPase, Subunit B (Not related to Fα Biosynthetic Gene Cluster)
ORFB    2511-4175       555           K+-Translocating ATPase, Subunit A (Not related to Fα Biosynthetic Gene Cluster)
ORF1    7697-10465      922           Regulatory Protein
ORF2    10791-11570     259           Thioesterase
ORF3    11659-12462     267           Reductase
ORF7    50449-51303     284           Methyltransferase
ORF8    51300-52706     468           p450
ORF11   85574-86338     254           Oxidoreductase
ORFX    87037-88293     419           Endo-1,3-β-glucosidase (Not related to Fα Biosynthetic Gene Cluster)
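
The nucleotide coordinates and amino acid lengths in Table 5 can be cross-checked arithmetically. The sketch below assumes that coordinates are inclusive; the text does not state whether the stop codon is counted in the nucleotide span, so the hypothetical helper `classify_orf` reports which of the two conventions each entry is consistent with rather than treating either as an error.

```python
# (start nt, end nt, predicted length in aa), transcribed from Table 5.
ORFS = {
    "ORFA": (382, 2514, 711),
    "ORFB": (2511, 4175, 555),
    "ORF1": (7697, 10465, 922),
    "ORF2": (10791, 11570, 259),
    "ORF3": (11659, 12462, 267),
    "ORF7": (50449, 51303, 284),
    "ORF8": (51300, 52706, 468),
    "ORF11": (85574, 86338, 254),
    "ORFX": (87037, 88293, 419),
}


def classify_orf(start: int, end: int, length_aa: int) -> str:
    """Compare an inclusive nucleotide span with a stated protein length."""
    span = end - start + 1
    if span % 3 != 0:
        return "span is not a multiple of 3"
    codons = span // 3
    if codons - 1 == length_aa:
        return "consistent, stop codon included in span"
    if codons == length_aa:
        return "consistent, stop codon excluded from span"
    return "inconsistent"


if __name__ == "__main__":
    for name, (start, end, length_aa) in ORFS.items():
        print(f"{name}: {classify_orf(start, end, length_aa)}")
```

With the values in Table 5, the six candidate pathway ORFs fall in the first category and ORFA, ORFB and ORFX in the second; the arithmetic alone cannot say why the two groups differ.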

[0092]ORFA and ORFB: BLAST results reveal considerable homology between ORFA and ORFB and K+-translocating ATPase subunits B and A, respectively, particularly the Mycobacterium tuberculosis genes (nucleotide sequences of which were directly submitted to the public databases). These genes are unrelated to the Fα biosynthetic gene cluster.

[0093]ORF1: BLAST results suggest that at the nucleotide level, ORF1 is related to a putative transcriptional activator in the pikCD operon of a macrolide biosynthetic gene cluster from S. venezuelae (see Y. Xue et al., Proceedings of the National Academy of Sciences USA 95:12111-12116 (1998)), and a putative regulatory protein in a Type-I polyketide synthase biosynthetic gene cluster from the rapamycin producing organism, S. hygroscopicus (see X. Ruan et al., Gene 203: 1-9 (1997)). At the predicted amino acid sequence level, the gene product exhibits limited homology to a family of hypothetical transcriptional activators related to the E. coli narL gene product. On the basis of these BLAST results, ORF1 appears to encode a transcriptional activator.

[0094]ORF2: BLAST results reveal significant homology between ORF2 and thioesterases at both the nucleotide and predicted amino acid sequence levels, including thioesterases in the Amycolatopsis mediterranei rifamycin biosynthetic gene cluster (see P. R. August et al., Chemistry & Biology 5:69-79 (1998)), and the S. griseus candicidin biosynthetic gene cluster (see L. M. Criado et al., Gene 126:135-139 (1993)). On the basis of these BLAST results, ORF2 appears to encode a thioesterase.

[0095]ORF3: An analysis of BLAST results suggests that ORF3 is homologous to reductases in the S. cyanogenus S136 landomycin biosynthetic gene cluster (see L. Westrich et al., FEMS Microbiological Letters 170:381-387 (1999)). At the predicted amino acid sequence level, BLAST results reveal homology between the ORF3 gene product and an oxidoreductase responsible for the conversion of versicolorin A to sterigmatocystin in the Aspergillus parasiticus aflatoxin biosynthetic pathway (see C. D. Skory et al., Applied and Environmental Microbiology 58:3527-3537 (1992)). On the basis of these BLAST results, ORF3 appears to encode a reductase.

[0096]ORF7: BLAST results reveal significant homology between ORF7 and methyltransferases at the nucleotide level, including methyltransferases in the S. lavendulae mitomycin C biosynthetic gene cluster (see Y. Q. Mao et al., Chemistry & Biology 6:251-263 (1999)) and the Saccharopolyspora erythraea erythromycin biosynthetic gene cluster (see S. F. Haydock et al., Molecular and General Genetics 230:120-128 (1991)). On the basis of these BLAST results, ORF7 appears to encode a methyltransferase.

[0097]ORF8: BLAST results reveal limited homology between ORF8 and putative cytochrome P450's, including P450's in the S. roseofulvus frenolicin biosynthetic gene cluster and the S. pristinaespiralis pristinamycin biosynthetic gene cluster (see V. de Crecy-Lagard et al., Journal of Bacteriology 179:705-713 (1997)). At the predicted amino acid sequence level, ORF8 exhibits homology to a large family of mammalian cytochrome P450's. On the basis of these BLAST results, ORF8 appears to encode a cytochrome P450.

[0098]ORF11: BLAST results reveal significant homology between ORF11 and oxidoreductases at both the nucleotide and predicted amino acid sequence levels, including oxidoreductases in the S. violaceoruber granaticin biosynthetic gene cluster (see D. H. Sherman et al., EMBO Journal 8:2717-2725 (1989)), and the S. cinnamonensis monensin biosynthetic gene cluster (see T. J. Arrowsmith et al., Molecular and General Genetics 234:254-264 (1992)). On the basis of these BLAST results, ORF11 appears to encode an oxidoreductase.

[0099]ORFX: BLAST results reveal homology between ORFX and a glucan endo-1,3-β-glucosidase from Oerskovia xanthineolytica (see S. H. Shen et al., Journal of Biological Chemistry 266:1058-1063 (1991)). This gene is unrelated to the Fα biosynthetic gene cluster.

[0100]There are several open reading frames in the 3.5 Kbp region between characterized ORFB and ORF1, which on the basis of nucleotide sequence characteristics (G+C content, potential ribosome binding sites) appear to encode proteins. BLAST analysis, however, does not reveal significant homology between the predicted amino acid sequences of these hypothetical proteins and sequences of proteins that have been deposited in public databases. Consequently, ascribing a functional role to these hypothetical proteins in the biosynthesis of Fα is not possible on the basis of their nucleotide (or predicted amino acid) sequence alone. In addition, there are a number of open reading frames in the 7.8 Kbp region between characterized ORFX and the end of the nucleotide sequence that has now been obtained. Since ORFX encodes a gene that does not appear to play a role in Fα biosynthesis, and since macrolide biosynthetic genes are typically clustered, the hypothetical proteins encoded by the open reading frames beyond ORFX are not expected to participate in Fα biosynthesis.

Example 2

Gene Replacement, Characterization of Integrants and Excisants

A. Gene Replacement

[0101]In order to develop an S. cyaneogriseus strain capable of direct fermentative production of 23-keto-Fα, derivatives of the Fα production strain in which the module 3 ketoreductase domain had been replaced with nonfunctional variants were sought. A series of directed amino acid substitutions, each designed to disrupt ketoreductase activity while minimally affecting the rest of the polyketide synthase, was designed as follows. A multiple amino acid sequence alignment was generated in which the predicted amino acid sequence of the module 3 ketoreductase domain from the S. cyaneogriseus Fα biosynthetic gene cluster was aligned with the predicted amino acid sequences of a large number of biologically active ketoreductase domains. These ketoreductase domain sequences were from the S. avermitilis avermectin biosynthetic gene cluster, the Saccharopolyspora erythraea erythromycin biosynthetic gene cluster, the S. hygroscopicus rapamycin biosynthetic gene cluster, the S. caelestis niddamycin biosynthetic gene cluster, and the Amycolatopsis mediterranei rifamycin biosynthetic gene cluster. Three ketoreductase domains known to be nonfunctional (so-called "cryptic" ketoreductase domains from module 3 of the Saccharopolyspora erythraea erythromycin biosynthetic gene cluster, module 4 of the S. caelestis niddamycin biosynthetic gene cluster, and module 3 of the Amycolatopsis mediterranei rifamycin biosynthetic gene cluster) were also included in the sequence alignment. This multiple amino acid sequence alignment readily supported the identification of relatively invariant amino acid sequences common to the majority of biologically active ketoreductase domains, but absent from (or altered in) nonfunctional ketoreductase domains.
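
The comparison described above (positions shared by most active ketoreductase domains but altered in nonfunctional ones) can be sketched as a column scan over a pre-aligned set of sequences. This is an illustrative sketch only, not the analysis actually performed: the function name, the 0.8 majority threshold, and the seven-residue toy rows are all hypothetical, and real input would be the full aligned domain sequences.

```python
from collections import Counter


def diagnostic_columns(active, inactive, min_fraction=0.8):
    """Return (column, residue) pairs where one residue is shared by at least
    `min_fraction` of the active rows and by none of the inactive rows.

    All rows must come from the same alignment and have equal length;
    columns whose consensus is a gap ('-') are skipped.
    """
    hits = []
    for col in range(len(active[0])):
        residue, count = Counter(seq[col] for seq in active).most_common(1)[0]
        if residue == "-" or count / len(active) < min_fraction:
            continue
        if all(seq[col] != residue for seq in inactive):
            hits.append((col, residue))
    return hits


if __name__ == "__main__":
    # Toy rows invented for illustration; not taken from any real alignment.
    toy_active = ["GGTGTLG", "GGTGALG", "GGTGTLG"]
    toy_inactive = ["GAASTLG", "GSTDTLG"]
    print(diagnostic_columns(toy_active, toy_inactive))
```

Columns flagged this way are candidates for substitution; choosing which ones to change while minimally affecting the rest of the polyketide synthase still requires judgment that the sketch does not capture.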

[0102]Methods were also developed for gene replacement in S. cyaneogriseus by homologous recombination such that the native module 3 ketoreductase domain of the Fα biosynthetic gene cluster could be replaced with the desired engineered variants of that domain, as described herein.

[0103]1. Construction of Plasmids for Site-Directed Mutagenesis

[0104]The QuikChange® site-directed mutagenesis procedure is a double-stranded method based on the polymerase chain reaction that requires two mutagenic oligonucleotides, one corresponding to each strand of the double stranded region of DNA. The method is less efficient when large plasmids, particularly large plasmids containing high G+C content DNA, are used. Consequently, site-directed mutagenesis of the Fα module 3 ketoreductase domain was performed in a vector designated pKR0.9 (see FIG. 3), which is the 900 bp BstEII-AatII fragment of pNE57 (and contains the desired region of the Fα module 3 ketoreductase domain), in the BstEII-AatII sites of pSL301 (Invitrogen, Carlsbad, Calif.).

[0105]2. Site-Directed Mutagenesis

[0106]Five variants of the Fα module 3 ketoreductase domain were generated by site-directed mutagenesis using reagents, materials and procedures provided by the manufacturer of the QuikChange® Site-Directed Mutagenesis kit (Stratagene, La Jolla, Calif.). The following amino acid substitutions were generated in pKR0.9, using the mutagenic oligonucleotides indicated below:

"179": GGTGTLG (SEQ ID NO: 13) to GAASTLG (SEQ ID NO: 14)
  5'-CTGGTGACGGGCGCTGCAAGCACTCTGGGGGCG (SEQ ID NO: 15)
  3'-GACCACTGCCCGCGACGTTCGTGAGACCCCCGC (SEQ ID NO: 16)

"204": LVSRRGM (SEQ ID NO: 17) to LVAAAGM (SEQ ID NO: 18)
  5'-GCGGCATCTGCTGCTGGTGGCAGCGGCAGGCATGGCCGCCGCCGGTG (SEQ ID NO: 19)
  3'-CGCCGTAGACGACGACCACCGTCGCCGTCCGTACCGGCGGCGGCCAC (SEQ ID NO: 20)

"260": HTAGVLD (SEQ ID NO: 21) to HTPPLLD (SEQ ID NO: 22)
  5'-GACCGCTGTGGTGCACACGCCACCTCTCCTGGACGACGCCACCGTG (SEQ ID NO: 23)
  3'-CTGGCGACACCACGTGTGCGGTGGAGAGGACCTGCTGCGGTGGCAC (SEQ ID NO: 24)

"283": GAKVD (SEQ ID NO: 25) to GAAVD (SEQ ID NO: 26)
  5'-GATGCGGTGCTCGGGGCGGCTGTGGACGGTGCCCTGCAC (SEQ ID NO: 27)
  3'-CTACGCCACGAGCCCCGCCGACACCTGCCACGGGACGTG (SEQ ID NO: 28)

"306": VLFSSAA (SEQ ID NO: 29) to VLFAAAA (SEQ ID NO: 30)
  5'-GTCGGCGTTCGTGCTGTTCGCAGCGGCCGCCGGGGTCCTGG (SEQ ID NO: 31)
  3'-CAGCCGCAAGCACGACAAGCGTCGCCGGCGGCCCCAGGACC (SEQ ID NO: 32)
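
Two properties of the mutagenic oligonucleotide pairs listed above can be checked mechanically: that the strand written 3' to 5' is the base-by-base complement of the strand written 5' to 3', and that the 5' strand encodes the intended variant peptide in some reading frame. The sketch below does this for the "179" and "283" pairs using the standard genetic code; the function names are hypothetical and the check says nothing about mutagenesis efficiency.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.translate(COMPLEMENT)


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons of `seq` starting at offset `frame`."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def encodes(seq: str, peptide: str) -> bool:
    """True if `peptide` appears in any of the three forward frames."""
    return any(peptide in translate(seq, f) for f in range(3))


# (5' strand, strand written 3' to 5', intended variant peptide)
OLIGOS = {
    "179": ("CTGGTGACGGGCGCTGCAAGCACTCTGGGGGCG",
            "GACCACTGCCCGCGACGTTCGTGAGACCCCCGC", "GAASTLG"),
    "283": ("GATGCGGTGCTCGGGGCGGCTGTGGACGGTGCCCTGCAC",
            "CTACGCCACGAGCCCCGCCGACACCTGCCACGGGACGTG", "GAAVD"),
}

if __name__ == "__main__":
    for name, (top, bottom, peptide) in OLIGOS.items():
        print(name, complement(top) == bottom, encodes(top, peptide))
```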

[0107]The QuikChange® mutagenesis reactions contained 125 ng of each of the mutagenic oligonucleotides, 50 ng of pKR0.9, 0.7 μl of Pfu DNA polymerase, and 2.5% DMSO in final reaction volumes of 50 μl. The reactions were subjected to 22 cycles of amplification (95° C. for 45 seconds, 63° C. for 1 minute, and 70° C. for 10 minutes), and amplified products were cloned according to detailed procedures provided by the manufacturer. After completing the site-directed mutagenesis procedure, colonies were picked and used to inoculate 2 ml LB media supplemented with 100 μg/ml carbenicillin. Plasmid DNA was purified from each of these cultures using reagents, materials and procedures included in the QIAprep® 8 Turbo Miniprep Kits, and the mutated 900 bp BstEII-AatII region of the Fα module 3 ketoreductase domain was sequenced in its entirety in order to confirm that the desired changes had been made.

[0108]3. Construction of Plasmids for Integration

[0109]A three-way ligation was used to combine the five site-directed mutants of the Fα module 3 ketoreductase domain with flanking DNA to facilitate homologous integration using the pKC1132 backbone. The three components included: the 4.3 Kbp NotI-BstEII fragment of pNE57 (containing the majority of the Fα module 3 adjacent to the regions mutagenized); the 1.1 Kbp BstEII-PstI fragments of six pKR0.9 constructs (containing the five site-directed mutants of the Fα module 3 ketoreductase domain, and the wild-type Fα module 3 ketoreductase domain); and the 3.6 Kbp PstI-NotI fragment of pKC1132 (containing all of the elements necessary for selection and replication of the resultant plasmid in E. coli and Streptomyces). These manipulations resulted in the generation of the pFDmod3/5.2 plasmid series. These plasmids were then used to construct versions of the plasmids for integration from which approximately 1 Kbp of flanking DNA had been removed. These plasmids were constructed by digesting each of the pFDmod3/5.2 plasmids with EcoRI. This EcoRI site is immediately adjacent to the NotI site in pKC1132 that was used to introduce the 4.3 Kbp NotI-BstEII fragment of pNE57 (containing the majority of the Fα module 3). The 5' overhang was filled in using T4 DNA polymerase under standard reaction conditions, and the linearized plasmids were digested with MscI. The digests were resolved by electrophoresis through 0.8% w/v agarose, the desired fragments were excised from the gel, and the DNA was extracted from the agarose using reagents, materials and procedures included in the QIAquick II Gel Extraction System from QIAGEN (Valencia, Calif.). Purified DNA was collected by ethanol precipitation and ligated to generate the pFDmod3/4.2 plasmid series (see FIG. 5).

[0110]Plasmids of the pFDmod3/5.2 series (see FIG. 4) and the pFDmod3/4.2 series (see FIG. 5) were transformed into E. coli ETS12567 (pUZ8002) using methods described herein. Then, these transformed E. coli strains were used as the source of DNA for conjugal transfer to S. cyaneogriseus using methods described herein.

[0111]4. Isolation and Analysis of Genomic DNA from S. cyaneogriseus Transconjugants and Excisants

[0112]A method modified from methods presented in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985) ("Isolation of Streptomyces "Total" DNA": Procedure 4) was used for the isolation of small amounts of genomic DNA from S. cyaneogriseus strains. Putative S. cyaneogriseus transconjugants and excisants were picked and used to inoculate 3 ml KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8 and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O). The cultures were incubated at 31° C., with shaking at 220 rpm, for 24-28 hours. The cells in 500 μl aliquots of these cultures were collected by centrifugation in a microfuge at 13,000 rpm for 5 minutes, and the supernatant was discarded. After washing the cell pellets with water, they were suspended in 450 μl of SET (0.3 M sucrose, 25 mM EDTA, 25 mM Tris, pH 8.0, containing 4 mg/ml lysozyme and 50 μg/ml RNaseA), and the suspensions were incubated at 37° C. for 2-4 hours. 250 μl of a 2% solution of SDS was added, and the samples were vortexed for 1 minute. The samples were extracted with 250 μl of phenol:CHCl3 (1:1) and the phases were resolved by centrifugation in a microfuge at 13,000 rpm for 5 minutes. The aqueous layer was removed to a new tube, and after adding 1/10th volume 3 M sodium acetate, the DNA was precipitated by adding an equal volume of isopropanol. Precipitated DNA was collected by centrifugation in a microfuge at 13,000 rpm for 5 minutes, washed with -20° C. 70% ethanol, and suspended in 100 μl of water.

[0113]For the isolation of larger amounts of genomic DNA from S. cyaneogriseus strains, 25 ml KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8 and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O) was inoculated with mycelial fragments of the strain of interest. The cultures were incubated at 31° C., with shaking at 220 rpm, for 24-28 hours. The cells in 3 ml aliquots of these cultures were collected by centrifugation in a microfuge at 13,000 rpm for 5 minutes, and the supernatant was discarded. After washing the cell pellets with water, genomic DNA was isolated using reagents, materials and procedures included in the DNeasy® system for the isolation of total (plant) DNA from QIAGEN (Valencia, Calif.).

[0114]5. Characterization of Transconjugants

[0115]Putative transconjugants were plated on CM agar (5 g/L corn steep liquor, 5 g/L Bacto-peptone, 10 g/L soluble starch, 0.5 g/L NaCl, 0.5 g/L CaCl2.2H2O, 20 g/L Bacto-agar) plates containing 100 μg/ml apramycin, 30 μg/ml nalidixic acid, 50 μg/ml cycloheximide, and 50 μg/ml nystatin A. These plates were incubated at 31° C. until the colonies were well-established. Genomic DNA was then isolated from the putative transconjugants using methods described herein, for analysis by Southern blot and nucleotide sequence analysis as follows. Aliquots of the genomic DNA preparations were digested with HindIII/StuI and with SalI. The fragments were resolved by electrophoresis through 0.8% w/v agarose, and blotted onto Nytran® membranes (commercially available from Schleicher & Schuell BioScience, Inc. USA, Keene, N.H.) for Southern analysis according to well-established procedures similar to those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). Typically, these Southern blots were probed with the mod3-specific probe, which was generated as described herein. The expected sizes of the fragments were:

Strain                                                                 HindIII/StuI   SalI
S. cyaneogriseus production strain 142                                 10.8 Kbp       4.6 Kbp
S. cyaneogriseus production strain 142/pFDmod3/5.2 transconjugants     13.3 Kbp       4.6 Kbp + 3.3 Kbp
S. cyaneogriseus production strain 142/pFDmod3/4.2 transconjugants     12.3 Kbp       4.6 Kbp + 3.3 Kbp
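
The expected fragment sizes tabulated above can be used as a lookup when scoring Southern blots. The sketch below is one possible way to do that and is not part of the described procedure: the 0.3 Kbp sizing tolerance is an assumption (gel resolution is not discussed in the text), and the function and dictionary names are hypothetical.

```python
# Expected hybridizing fragments (Kbp) per digest, transcribed from the table above.
EXPECTED = {
    "production strain 142": {"HindIII/StuI": [10.8], "SalI": [4.6]},
    "142/pFDmod3/5.2 transconjugant": {"HindIII/StuI": [13.3], "SalI": [4.6, 3.3]},
    "142/pFDmod3/4.2 transconjugant": {"HindIII/StuI": [12.3], "SalI": [4.6, 3.3]},
}


def bands_match(observed, expected, tol_kbp):
    """True if both band lists have the same length and pair up within tolerance."""
    if len(observed) != len(expected):
        return False
    return all(abs(o - e) <= tol_kbp
               for o, e in zip(sorted(observed), sorted(expected)))


def matching_strains(observed_by_digest, tol_kbp=0.3):
    """Names of strains whose expected pattern fits every observed digest."""
    return [
        name for name, pattern in EXPECTED.items()
        if all(bands_match(observed_by_digest[d], pattern[d], tol_kbp)
               for d in pattern)
    ]


if __name__ == "__main__":
    # Hypothetical band estimates read off a blot, for illustration only.
    blot = {"HindIII/StuI": [13.2], "SalI": [4.5, 3.3]}
    print(matching_strains(blot))
```

The SalI pattern alone does not separate the two transconjugant series; in this table only the HindIII/StuI fragment (13.3 Kbp versus 12.3 Kbp) distinguishes them.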

[0116]The region of interest of transconjugants that appeared to be correct on the basis of the Southern analysis was amplified using standard polymerase chain reaction (PCR), and the PCR products were sequenced to confirm that the desired sequence had been obtained. Two primer pairs were used to characterize the transconjugants. Each pair comprised one mod3-specific primer and one primer specific for vector-derived sequences. In addition, the primer pairs were designed such that one pair would amplify products from the "right side of the cassette" and the other pair would amplify products from the "left side of the cassette." The primer pairs used were:

Left:
  (mod70F)   5'-TACTGCGCCACACGGAGCCCGAG (SEQ ID NO: 33) and
  (P6568B)   5'-TGGGTAACGCCAGGGTTTTC (SEQ ID NO: 34)
Right:
  (PECOR1F)  5'-GGAAACAGCTATGACATGATTACG (SEQ ID NO: 35) and
  (mod3633B) 5'-TCGGAGCCGCTCCACCTGAG (SEQ ID NO: 36)

[0117]With genomic DNA isolated from a "correct" transconjugant as a template, these PCR primers would direct the amplification of 6.4 Kbp and 5.7 Kbp products, respectively. The region of these PCR products containing the ketoreductase domain was sequenced to confirm that the desired sequence had been obtained, using the following oligonucleotide sequencing primers:

"179" Transconjugants:
  Forward 5'-CCTGATGGACGCGGGTGCGC (SEQ ID NO: 37)
  Reverse 5'-GACACCGAAACCCCTG (SEQ ID NO: 38)
"204" Transconjugants:
  Forward 5'-CCTGATGGACGCGGGTGCGC (SEQ ID NO: 39)
  Reverse 5'-GCCGTGTGCACCACAGCGGTCAG (SEQ ID NO: 40)
"260", "283", "306" Transconjugants:
  Forward 5'-GTGTGATGTCGCCGACCGCGCCCAGGTC (SEQ ID NO: 41)
  Reverse 5'-GCGCTGGTGGGCCAGGGCGTCC (SEQ ID NO: 42)

[0118]6. Excision and Characterization of Excisants

[0119]Transconjugants that had been verified by Southern analysis and by nucleotide sequence analysis of PCR products as described herein were used to inoculate 25 ml of KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8 and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O), and the cultures were incubated at 31° C. with shaking at 220 rpm, for 48 hours. A 500 μl aliquot of the culture was crossed into a fresh 25 ml of KB3 medium, and incubation was continued at 31° C. with shaking at 220 rpm, for an additional 48 hours. This process was continued for many such rounds, in the absence of selection, in order to allow for the excision event to occur. After rounds 3-6, serial dilutions of the cultures were prepared from 10⁻¹ to 10⁻⁵, and 250 μl aliquots of the 10⁻³ to 10⁻⁵ dilutions were plated onto 140 mm diameter CM agar plates (5 g/L corn steep liquor, 5 g/L Bacto-peptone, 10 g/L soluble starch, 0.5 g/L NaCl, 0.5 g/L CaCl2.2H2O, 20 g/L Bacto-agar). These plates were incubated at 31° C. for 48-96 hours, until colonies were well-established. Individual colonies were then picked, and patched in replicate onto CM plates, and CM plates supplemented with 100 μg/ml apramycin. These plates were incubated at 31° C. for up to 5 days, at which time colonies sensitive to apramycin, but capable of growing normally in the absence of selection, were identified. Genomic DNA was then isolated from these putative excisants using methods described herein. Using these genomic DNA preparations as templates, the region of interest was amplified using the polymerase chain reaction (PCR), and the PCR products were sequenced to confirm that the desired sequence had been obtained. The primer pair used for amplification was:

  (mod70F)   5'-TACTGCGCCACACGGAGCCCGAG (SEQ ID NO: 33) and
  (mod3633B) 5'-TCGGAGCCGCTCCACCTGAG (SEQ ID NO: 36)

[0120]With genomic DNA isolated from a "correct" excisant as a template, these PCR primers would direct the amplification of a 6.6 Kbp product. The region of these PCR products containing the ketoreductase domain was sequenced to confirm that the desired sequence had been obtained, using the following oligonucleotide sequencing primers:

"179" Excisants:
Forward 5'-CCTGATGGACGCGGGTGCGC (SEQ ID NO: 37)
Reverse 5'-GACACCGAAACCCCTG (SEQ ID NO: 38)

"204" Excisants:
Forward 5'-CCTGATGGACGCGGGTGCGC (SEQ ID NO: 39)
Reverse 5'-GCCGTGTGCACCACAGCGGTCAG (SEQ ID NO: 40)

"260", "283", "306" Excisants:
Forward 5'-GTGTGATGTCGCCGACCGCGCCCAGGTC (SEQ ID NO: 41)
Reverse 5'-GCGCTGGTGGGCCAGGGCGTCC (SEQ ID NO: 42)
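The expected size of the amplification product described in paragraph [0120] can, in principle, be predicted from a template sequence by locating the forward primer and the reverse complement of the reverse primer. The following is a minimal sketch under stated assumptions: it uses exact matching only (no mismatch tolerance), and the template `TOY_TEMPLATE` is a hypothetical construct built for illustration, not the S. cyaneogriseus sequence. The function names are likewise hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str):
    """Return the predicted product length in bp, or None if a primer
    site is not found on the given strand (exact matches only)."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its site on
    # this strand reads as the reverse complement of the primer.
    rev_site = reverse_complement(reverse)
    end = template.find(rev_site, start + len(forward))
    if end == -1:
        return None
    return end + len(rev_site) - start


# Primer sequences from paragraph [0119] (5' to 3').
MOD70F = "TACTGCGCCACACGGAGCCCGAG"
MOD3633B = "TCGGAGCCGCTCCACCTGAG"

# Hypothetical toy template: flank + forward site + 100 bp spacer +
# reverse site + flank. Not derived from SEQ ID NO:1.
TOY_TEMPLATE = (
    "GATTACA" + MOD70F + "ACGT" * 25 + reverse_complement(MOD3633B) + "TTGACC"
)

if __name__ == "__main__":
    print(amplicon_length(TOY_TEMPLATE, MOD70F, MOD3633B))
```

With genomic DNA from a correct excisant as the template, the same calculation would be expected to give a value near the 6.6 kbp stated above; the toy template here only demonstrates the arithmetic.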

B. Fermentation and Analysis of Fermentation Products

[0121]Seed flasks containing 25 ml of KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8 and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O) were inoculated with 500 μl of a suspension of S. cyaneogriseus mycelial fragments (either fresh or frozen) and the cultures were incubated at 31° C. with shaking at 220 rpm for 48 hours. A 500 μl aliquot of the seed culture was crossed into production flasks containing 25 ml of SD2 production medium (85.5 g/L glucose, 0.36 g/L KCl, 0.72 g/L MgSO4.7H2O, 7.2 g/L CaCO3, 4.86 g/L (NH4)2SO4, 0.72 g/L K2HPO4, 7.2 g/L pharmamedia, and 1.8 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O) and the cultures were incubated at 31° C. for 10 days. Starting at (typically) 120 hours, and continuing through the end of the fermentation, 100 μl aliquots of the production culture were removed and combined with 900 μl of methanol. The suspensions were vortexed for 1 minute, clarified by centrifugation in a microfuge at 13,000 rpm for 10 minutes, and 10 μl aliquots of the extract were analyzed by reversed phase HPLC.
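The sampling arithmetic in paragraph [0121] can be sketched as follows. Mixing 100 μl of culture with 900 μl of methanol gives a 10-fold dilution, so a 10 μl injection corresponds to about 1 μl of the original culture. This sketch assumes the volumes are simply additive and ignores the volume of suspended solids. The 24-hour sampling interval is a hypothetical value chosen for illustration; the text states only the typical start (120 hours) and the end of the fermentation (10 days).

```python
CULTURE_UL = 100.0    # culture aliquot, from paragraph [0121]
METHANOL_UL = 900.0   # methanol added, from paragraph [0121]
INJECTION_UL = 10.0   # extract volume analyzed by HPLC, from paragraph [0121]


def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when a sample is mixed with diluent (additive volumes)."""
    return (sample_ul + diluent_ul) / sample_ul


def culture_equivalent_ul(injection_ul: float, factor: float) -> float:
    """Volume of original culture represented by one injection."""
    return injection_ul / factor


def sampling_times(start_h: int = 120, end_h: int = 240, interval_h: int = 24):
    """Sampling time points in hours; interval_h is an assumed value."""
    return list(range(start_h, end_h + 1, interval_h))


def culture_concentration(extract_conc: float, factor: float) -> float:
    """Convert a concentration measured in the extract back to the culture."""
    return extract_conc * factor


if __name__ == "__main__":
    factor = dilution_factor(CULTURE_UL, METHANOL_UL)
    print("dilution factor:", factor)
    print("culture per injection (ul):", culture_equivalent_ul(INJECTION_UL, factor))
    print("sampling times (h):", sampling_times())
```

Any titer read from the extract would need to be multiplied by the dilution factor to express it per volume of culture, as `culture_concentration` does.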

[0122]For analysis by reversed phase HPLC, samples were subjected to chromatography on a Waters Model 625 Liquid Chromatography Station equipped with a Waters Model 996 Photodiode Array Detector, a Waters Model 717 Autosampler, and a Waters Nova-Pak C18 column (8 mm × 100 mm). The column was equilibrated in and eluted with a mobile phase containing 60% (v/v) acetonitrile and 40% (v/v) 100 mM ammonium acetate, pH 4.5, at a flow rate of 2 ml/min. The compounds of interest, Fα and 23-keto Fα (precursor of moxidectin), were detected by monitoring their absorbance at 242 nm, and retention times were compared to those of authentic samples.
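For illustration, the mobile phase composition in paragraph [0122] can be converted into preparation volumes and solvent usage. This is a minimal sketch assuming ideal (additive) mixing, which acetonitrile-water mixtures only approximate; the batch volume and run time passed in the example are hypothetical values, since the text does not state them.

```python
ACETONITRILE_PERCENT = 60.0   # % (v/v), from paragraph [0122]
BUFFER_PERCENT = 40.0         # % (v/v) of 100 mM ammonium acetate, pH 4.5
BUFFER_STOCK_MM = 100.0       # ammonium acetate stock concentration, mM
FLOW_ML_PER_MIN = 2.0         # flow rate, from paragraph [0122]


def mobile_phase_volumes(total_ml: float):
    """Return (acetonitrile ml, buffer ml) for a batch of mobile phase."""
    acetonitrile_ml = total_ml * ACETONITRILE_PERCENT / 100.0
    buffer_ml = total_ml * BUFFER_PERCENT / 100.0
    return acetonitrile_ml, buffer_ml


def final_buffer_mm() -> float:
    """Ammonium acetate concentration in the mixed mobile phase, in mM."""
    return BUFFER_STOCK_MM * BUFFER_PERCENT / 100.0


def solvent_per_run_ml(run_min: float) -> float:
    """Mobile phase consumed in one run; run_min is supplied by the user."""
    return FLOW_ML_PER_MIN * run_min


if __name__ == "__main__":
    # 1000 ml batch and 15 min run are hypothetical example inputs.
    print(mobile_phase_volumes(1000.0))
    print(final_buffer_mm())
    print(solvent_per_run_ml(15.0))
```

Under these assumptions the mixed mobile phase contains roughly 40 mM ammonium acetate.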

[0123]In the foregoing, there has been provided a detailed description of particular embodiments of the present invention for the purpose of illustration and not limitation. It is to be understood that all other modifications, ramifications and equivalents obvious to those having skill in the art based on this disclosure are intended to be included within the scope of the invention as claimed.

Sequence CWU 1

42188400DNAbacteria 1gagctcttcg ctcccgccgg accggttggt cgcgccggag aggacgagcc ggtagcgggt 60gttgatctcg ttggtgccga gcccgttggc cgggcggccg tggaaccacc tcggatcggg 120ctcgggggtc tcctggccct tcttcagcgg cagatggtac ggctggccga tcagcgagga 180gccgacggcc ctgccgtccg ccgtgatctc ggagccgtcg gcccggtcgc ggaagagtgc 240ctgggcgacg ccggtgacga ccagcgggta gccggcgccc gtcaccaggg tcagcacgag 300gagggcccgc aggcccgccc cgagcagccg gacggtgtgg gtggcggagt tgttcatggc 360ggtcagcacg ctttcgtgac gtcacggccc gggaacgagg gagatgaaca ggtcgatgat 420cttgatgcct atgaagggcg ccaccaggcc gcccaggccg tagatcccga ggttgcgccg 480cagcatccgg tccgcgctca ccggccggta ccgcacgccc ctcagggaca gcggcaccag 540cgccacgatg accagcgcgt tgaagatcac cgcggagagg atcgcggagt cgggtgagga 600caggcccatg acgtcgagtc gctccaggcc gggatgggcc ggcgcgaaca gcgccgggat 660gatcgcgaag tacttcgcga cgtcgttggc cagggagaag gtcgtcagtg cgccgcgtgt 720gatcagcagt tgcttgccga tctccacgat ctcgatcagt ttggtgggat cggagtcgag 780gtcgaccatg ttgccggcct ccttcgcggc cgacgtaccg gtgttcatcg ccacgccgac 840gtccgcctgg gccagagccg gggcgtcgtt ggtgccgtcc ccggtcatgg cgaccagcct 900gccgcctgcc tgctcccgcc tgatcagcgc catcttgtcc tcgggagtcg cctccgcgag 960gtagtcgtcg acgcccgcct cgcgcgcgac ggcctgcgcg gtcagcgggt tgtcacccgt 1020gatcatgacg gtcctgatgc ccatgcggcg cagttcctcg aaccgcgcgc gcatgccgtc 1080cttgacgacg tccttgaggt ggacggctcc cagcacccgg gcgccccgct cgtcccgcgc 1140ggcgaccagc aggggcgtgc cccccgatcc ggcgatgcgg tcggcgatgg ccttcgcgtc 1200ctgggcggcc tcaccgccct gctcctcgac ccaggcgagg atggaaccgg ccgcgccctt 1260gcggatcctg cggccgccga cgtccacgcc cgacatgcgg gtccgggcgg tgaacgcgat 1320ccattcggcg ccggcgagtt cgccccggtg ccgctcgcgc agtccgtact gctccttcgc 1380caggacgacg acggaccggc cctcgggcgt ctcgtccgcg agcgaggaga gctgcgcggc 1440gtccgccacc tcggcctccg tggtgccgga caccggcacg aacccggccg cccgccggtt 1500gccgagcgtg atcgtgccgg tcttgtccag cagcagcgtg gagacgtcgc ccgcggcctc 1560gaccgcccgg cccgacacgg ccagcacatt gcgctgcacc aggcggtcca tgcccgcgat 1620gccgatcgcc gagagcagcg cgccgatcgt ggtcgggatg aggcagacca gcagcgccac 1680cagcaccgtc ggtgtcaggt gggtgcccgc 
gtgatccgcg aagggcggca gcgtggcgca 1740gaccagcagg aagacgatgg tcagcgaggc cagcaggatg ttcagcgcga tttcgttagg 1800cgtcttctgc cgggccgcgc cttcgacgag gtcgatcatc cggtcgatga aggtctcacc 1860gggcttggtc gtgatccgga tgacgacacg gtcggacagg accttggtgc cgccggtgac 1920ggcgctccgg tctccccccg actcgcggat gacgggtgcc gactcgccgg tgatggcgga 1980ctcgtcgacg gacgcgacgc cctcgacgac atcaccgtcg ccggggatga cgtccccggc 2040ctcgcagacc accagatcgc cgatcctcag tccggtgccc ggcacccgct cctccgagcc 2100gtcctcgcgc aggcggcggg cgacggtgcc ggtcctggtc ttgcgcaggg tgtcggcctg 2160tgccttgccg cggccttcgg cgaccgcctc cgcgaggttg gcgaagagca cggtcatcca 2220gagccaggcg gagacggtcc agccgaaccg gtcgccggga tccatgaggg cgaagacggt 2280ggtgaggacc gagccgatcc acaccacgaa catcacgggc gtcttgatct gcacccgcgg 2340gtccagcttg cggaaggcgt ccggcaacga cctgacgagc tggcccgggt cgaacagacc 2400gccgccgacc cgcctttcgg acggctggtg accggtgggg gcgtcgcgct gcggcgtccg 2460ggcgggagtg atcgtggaca tcgggttccc ttggtcgtcc gggtgtgcgc tcatgccgcc 2520agcccttcgg cgagcggccc cagcgccagg gccgggaagt acgtcaagcc ggcgaggatc 2580aggatcgcgc ccaccatcag gccgctgaac agcggcttgt cggtgcgcag ggtgccggtg 2640gtgaccggca cgggccgttg cccggcgagc gagccggcca gcgccaggac gaacaccatc 2700ggcaggaagc ggccgagcag catcgccagt ccgatggtgg tgttgaacca ctgcgtgtcc 2760gcgtcgagac cggcgaaggc cgagccgttg ttgttggcgc cggaggtgta ggcgtagagg 2820atctcggaga acccgtgcgc gccgctgccg gtcgtcgagt tcaccggcgt cggcagggcc 2880atcgcgcacg cggtgaggat caggaccagc gccggggtga ccagcaggtg gcaagcggcc 2940agtttgatct cgcgggtgcc gatcttcttg cccaggtact cgggcgtgcg gccgaccatc 3000agaccggcga tgaacaccgc tgtgacggcc atgacgagca tgccgtagag gccggatccc 3060accccgcccg gagcgatctc gcccagcatc atgccgagca tcgcgatgcc tccgccaagg 3120ccggtgaagg aggagtggaa ggagtccacc gcgccggtcg aggtgagcgt ggtcgacacc 3180gcgaagatgg acgaggcgcc gacaccgaag cggacctcct tgccctccat cgcgccaccg 3240gcgatctcga gcgccgggcc gcggtgcgag aactcggtcc acatcatcag ggcgacgaag 3300gcgatccaga aggtggccat cgtcgccagg atcgcgtagc cctgcctgac cgagccgacc 3360atgacgccga acgtccgggt gatcgagaag gggatcacca ggatcaggaa gatctcgaag 
3420aggttggtga agggcgtcgg gttctcgaac gggtgggcgc tgttggcgtt gaaatagccg 3480ccgccgttgg tgcccagcag tttgatggcc tcctgggagg cgaccgcgcc cccgttccac 3540tgctgcgagc cgcccgtgaa ccggccgacc tcgtggatgc cggagaagtt ctggatgacc 3600ccgcaggcgg ccagcaccac ggcgccgagg gtggccagcg gcaccaggac gcggaccgtt 3660ccgcgcacca gatcggccca gaggtttccc agttcaccgg tgcgggagcg cgcgaacccg 3720cgcaccagcg cgaccgcgac ggccatgccc acggcggccg aggtgaagtt ctgcacggcc 3780aggccggcgg tctgcacgag gtgtcccatg gcctgttcgc cggagtacga ctgccagttg 3840gtgttggtca cgaaggacac ggccgtgttg aacgcctggt ccgggtcgac ggctcgaaag 3900ccgagggaca gcggcaggac gccttgcgcc cgctggacca ggtacaggaa gaggacgccg 3960gccacggaga aggccagcac accgcggagg tacgcgggcc agcgcatctg ggcgccgggg 4020tcgacaccga tgccccggta gatccatctc tcgacgcgcc agtgctcgtc ggaggagtag 4080accttggcca tgtggttgcc gaggggtttg tggacgagtg ccagagcact cgtcagggcg 4140agcagttgga gcacgccggc gagtacggga cccatggctg ctctcagaac ctctccggga 4200agatcagggc gaggacgaga tagcccagca gggagacggc cacgaccagg ccgacgacgg 4260tctcggcggt cacagcttcg tcacccccct ggcgacaaca gccaccagcg cgaagagcgc 4320gagcgtggtg acgacgaagg ccgtatcggc catcgcggac tcctggaatg aggtgcggtg 4380gaaacggacc cttgcaggta agcgcctcac cgaccgaaac aggacgtccg ttgacgtttc 4440ccttacggcg tgacgtacgt ctttgacgga actcttacgc ctgaggtccg tgtccatgcc 4500cctcggtccg ttgcggacca tgcccccgcg gccaccggag agggcggcgt ccccctcagc 4560gggccccgcc cgtctccccc ggggcctccg tctcccccac cttctccacg accgtctcct 4620ccggcccgac cggccgtccg tccgcggcac ggatccgcaa ggggcgcagc gggcggccgg 4680tgcggcggtc gagcaggcgg gagtggtcct cgccgggggc gaagcactgc tcctcgcccc 4740actggcgcag ggcgacgatc accgggaaca aggcgcggcc cttgtccgtg aggacgtact 4800cacggtggga accgccgtcc ggcgcgggca cgttgcgcag taccccggcc tcctccagcg 4860cgcgcagccg cgccgtcagg atgttcttgg cgatgccgag gctgcgctgg aactcgccga 4920agcggcgact gccgtcgaag gcgtcccgca cgatcagcag cgaccaccag tcgccgatgg 4980cgttcaccga ccgggcgacg ggacaggggt cggcgtcgaa acgggtgcgg gcgaccatcc 5040gcgtctctcc tctcctccgg caccccggat ccctccaggg atggttgcaa catgctacct 5100cgtacggcta ccgtcctcgc cggtagcaag 
atgcaaccga gtgagaggtg tgacggtatg 5160gcggtccagt gctccggtgc ggacggcgga tgcggcgaag ccggtggtcg cggagcggcc 5220ggcacggcgc cgcccgcgcg gctcgtgccc ctgctcgccc tggcctgtgg cagctccgtc 5280gccaccgtct acttcgccca ccccctgctg gtgaccctcg gtgagcgctt cgcgctcggc 5340cccgggctgc tcggcgcgat cgtcaccgtg acgcaactcg gttacgcggt gggcctgctg 5400acactcgtgc cgctcggcga cctgctcggc caccggcggc tggtcaccgc tcagctcgga 5460ctgctggcac tggcgctgct ggccgccggg ctggcgccgg gcgcggctgc gctgctcggc 5520gcgctcgccg cggtcgggct gctcgccgtc gtcgcccaga cgatggtcgc ggctgccgcc 5580gccctgagcc cgcccgaccg gcgaggccgc gccgtgggaa ccgtcaccgg cggcatcgtc 5640accggcatcc tgctggcgcg cgccgccgcg ggcgtcctcg ccgacctcgc cggctggcgg 5700gcggtctacc tggcgtcggc gggcgtcacc gccgtcctcg ccgtgctgct gcgccgtgcg 5760ctgcccccgg gatcgccgtc cgcaaaggct cgcgagacgt cgtacgtacg gctggtggcc 5820tcgaccgtca ccctgttcgc ccgccatccg ctgctgcgga tccggggggc cctggccctg 5880ctggtgttcg cggccttcag cacgctgtgg agcggcgtgg cccagccctt gagcgatccg 5940ccgtggtcgc tgtcgcacac cgcgatcggc gcgttcgggc tcgccggggc ggccggagcc 6000gtcgccgcac aggtggccgg gcgctggaac gaccgggggc tcgcccggcg cacgaccggc 6060gccggcctcg cgctgctggc gctctcctgg ctcccgatcg ccctgacccg gcaatcgctg 6120tgggcgctgg cgatcggcgc cgtcctgctg gacttcgccg tgcaggccgt ccacgtcacc 6180aaccagaccc tcatccacgc cgtccggccc gaggcgggca gcaggatcat cggtggttac 6240atggtcttct actccgcggg cagcagcctc ggcgccctcg gttcctccct cgcctacgcc 6300acggcgggct ggccggccgt gacggccctg ggcgcgtcgt tcagcgtcgc cgcgctgctg 6360ctgtggacgg cgacccgtcg tacggggctg cccggcgacg acccggcggc cgaacggacg 6420gaccctggcc gtccgtccgg ggacagggct gccgggaggc ccgcccgcag ccgctcttcc 6480ggcccccggt gaacgtctgg ggtggcgcgg ggcgcgcgat aggggtccgc cgagagtcag 6540gggttgtccc gcctgtacag cggcatcggc ggtcggtcca atggaaggcg tgcctccgga 6600tgcggggcgg gtcgtcgacg accctgtgcc ggcggcggat tcccaccctc ccccggaagg 6660cgaggtcgtg atgaccgacg tccgtcatga cagcaggcag acgggtccgg cgctgcgcgc 6720gctcagcgcg gcccggcggg cgcgggcctt ggcgtccgcg atggcggcgg ccgccgcgga 6780gacgcggcag gccgtcgagg ccgcggacgg cacggaccgc gcggccgccg tagccgagat 
6840cggcgcggta ctggaggacg cctcccggca cacggacgcc gccgccgagg ccgccgcctc 6900ggctgccgag gccgccgccc gggccgagac ggccgaggcc gcccgcacgg tggccgccga 6960gtcggccgag gccgtcgtcg ccgccgccga gacggcggtc cgggcggccc gggtcaccga 7020ggccgccacg agcgccgccg cccaggccgc ggccggtacg gacgcggcgg gcgtgatggc 7080ggacgccgcg gcgcacaccc ggcaggccac cgccgagacc gcggcgatcg ccgaggccgc 7140cgccgcggcg gccagcgcgg cccgggccgc cgtcggcgac gaggcggcgg acggcgcgga 7200cccgtgccga cgggctgacg aggcggaggc cgcggccctg cggctgtgcg aggacacgcc 7260gtggctgcgc aggcacctcc ccgacgtgtg aggcagggtg cccggcggcg ccggcgcgag 7320atggaacccg ggccgggcgg ccctcttccc tccgggtgcc ggcgacgaga ccgtcgcccc 7380cacctcgaac cgggccttcc ggtcgctgta ccggcatccg gagatcgagc ggcggccgca 7440ctccgagggg gaccgggtgc tcggcggccg ggccgccacc tggacggatc cgccgtcgct 7500ggagctgacg ggccgggtgg tccacgacgc gctgcgcctg ttccccggac gggctgctca 7560ccgggtcatc accgaggaca cggggcgcgc agggcgcgcg ccgccggccg gcggcgtcgt 7620cgcctgggtc gatcagcagc cccccgcggc gggttgctca cgttcggcgg cgggcaggcg 7680tccgcccggt gcgcgttcag accgggcttt cgatgtgcgc gcacagccgg gcgggcagtt 7740cctgacggcg gctgatctcc agcttgcggt aggcgcgcgt caggtgctgc tccacggtgc 7800tgacggtgat gcagagccgg gcggcgatct cccggttggt gtgcccgttg gcggcgagtt 7860ccacgaccct gagctccgat ccgctcagcg ccgcctcggt ccgccccgtg ccgtccccga 7920acgactgccg cccaggacct cccggcagga tccgctcgca cagagcccgc gccccgcagt 7980cgttcgccag gtgccaggcg cggcggatcg tggcgcccgc ccgggtcgac tcgccgcgct 8040cccggtaggc ggcgcccaga tcggccaggg cacccgccag ggccagccgg tcgccgctgc 8100tcttcaggtg gttcacggcc tcggtcagca ggttcagccg atccggcggt tcggcgatct 8160gcgcgcgcag ccgcagcgag acgccgcgca catgaggatc gtcgtccggg gtccgggcca 8220gttgttcccg caggagccga tcggccctcc tcggctcgca cagccgcagg aacgcctccg 8280ccgcgtccga gcgccacggc atcagcgtcg gccggtcgat cccccagcgc cgcagcagac 8340ggccggcgcc gaggaagtcg cggacggcgg cgaggggccg gtccagggcg aggtagtagt 8400ggccccgggc gcgcaggtag gcggggccgt acacgctgcg gaacagggcc tccggcaccg 8460ggtggtcgag ctgccgggtg gcctcgtcgt agcgccccat cgcggtggcc gcgaacaccc 8520ggctggccag cggcccgccg atgaagacgc 
tgcggctgca cctcggcacg caggccaggg 8580cctgacgggc gtactcctcg gtgtcggcga gcagcccccg gcacagcgcg atctcggcct 8640ggagggcgag gaactgcgcc ttccagcccg gcagccggcg accggcggcc tcgccgagca 8700gacgtgtgca ccagagcgcc gcggtctcgt acgagccggt gcggcacagg acccggaccg 8760cgttgacgac gatcaccagg gtggtgtcgg tgagcggcag gctcctcagg acgtgctccg 8820ccgcgtcgga ggccgaggcg ttggtgccgt cgccggggag gtcccagatc ccggtcaccg 8880gcatccgcgg gcgggggctc tcctcgtccc cgggctccgg gtccgtccgg ggacggatca 8940gcggctccca gagcgcggag gcgtggaagc ccgtctccag ccggggagtt cgcgggtcgc 9000cgtgggggcc cggccgtccc atgacctccg tggcctcctc cagccgtccg cagccgagca 9060gcaggtgacc cagccgttcg gtctcggcgc tcgtcagtcg tccggcccgc agctcggtga 9120cgagttcggc gaggtggtcc tccgccgccg ccgggtcggt gcgccgggtg gcgacggtca 9180ggcgcagcag gatctcggcg cgccggggcc ctcccgcaca ggagcgccgg gccagttcga 9240gacaggagac ggcggtcagg acgtcgtccc gcatcagcag ctgctcggcg gcgtcccgca 9300gcacggacat cgcccagggc ccggccgcgt gccgggcggc gagcaggtgc cgggccacct 9360cgtccggttc cgcgccgacg tcgtacagca gcgcggcggc gcggcggtgg aggtgtgcgc 9420ggtggtcgtg gtccagggtg tccagggcgg ccgcctcgac cacggggtgc cggaagcggc 9480cggacgccgt caggccggtc gcctccaggg cgcgcagccc gcgggccgcc atggcgcggc 9540cgatgccgag cagccgggcg atcacctcgg cgcagccgga gtcaccgagg acggcgagcg 9600cgccggcgct gtgcctaacc aggctgtcgg tgcgggacag tgaggcgagg acggcctggt 9660agaaccgccc gccgatgacg ggcgacgctg ctctccggcg ccggccggca cgctcgtcgg 9720tgtgcccttg ggtgtggctt tcgacgagtt cttcgagcag ggcatgcacc agcagcggat 9780tgccgccggt gacggcgagg aggtcgtccg cgggcagggc ctcgacggcc ggtccggggc 9840gggcggcgcg cagtccggac acggcgcgca gggagaggcg gccgagcatg acgcgctgga 9900gggccggctg gcacagcagc tcggcctcca cggcggggtc ggccgccagg ccggacggca 9960gcgcggtgca gaccagcagc agtcgggtgg cgcggggatg gtcgacggcc tggagcaggc 10020agtgcaggga ctgcgggtcc gcgtggtgca ggtcgtcgat gccgatgacg accggcgcgg 10080cgccggtgag ctggtggagc gccgcccgca cacgctgcgc ggccggggtc tccgtcccga 10140cggcgtcctg gagcagtgag cgctgggcgt ccgggatgtc ggggtcgacc gccagttgcc 10200gcaggaggtc gaaggggcgc cggccctccg gcgggcttcc ggcggaccgg aggaccagga 
10260agcccgatgc cgccgcgtgc ttgagcgcct cccccaggaa cgcgcttttc ccgcagccga 10320gtcctccttc gacgaccagc acccgcaccc ggccggcggc gcacgcctcg agcgccgttc 10380tcacggcatg ggactgcctg cccaggccga ggaacgtgag cccttgcggc tcccgcacgg 10440acaccgaagg ggaaacgccc cgcataatct ccctctgact ccctcccccg aagaccgggg 10500gctttacgga ttcgtaccaa caggaaagcc cacaagtcga cgagatactg cccctctccc 10560gaagccgcca cacgcgcacc ccgatacgag aatgagccaa tgagcaagcg tggtggccga 10620gttgatacga acccgtgaat ttacgttatt tcgctcaccc tttcgagcgt gtggagagtc 10680ctcggaatgg gcggccggga ggttgggcag cctccgcggg acggcgagcc attcgcgagg 10740tcacgcggac acgcgtgttg cgataatcgc acttaaggag aggacgagcg atgcccgacc 10800tttgcgagac cgaatccctc tggctccggc ggttccagcc ggctcccgcg gcccggacgc 10860ggctcatgtg cttcccgcac gcgggcgggt ccgccagcgc ctatctgcgc ctggcccggt 10920ccctcgcccc cggcatcgag gtcctggcgg tccagtaccc cggacgacag gaccggcgcg 10980ccgagccctg cccggactcc gtcgaaggcc tggcggacga tctgttcgcg gccgtccggc 11040accgcgtgga cgcgtcgacc gcgctgttcg gacacagcat gggcgcggtc ctcgccttcg 11100agctggcccg gcggctggag cgcgacgcgg gggtccgctg cgcccggatc ttcgcctcgg 11160ggcgccgggc accctcccgg ttccgtgacg actccgcccc ggccgccagc gacgcctcga 11220tgctcgccga gatgcggact ctcggcggaa ccgacctgcg ggtgctccag gacgaggaac 11280tgctgatcgc cgcgctgccc gcgctgcgcg ccgactaccg cgcgatcggg acctaccgcg 11340ccgccgacga cgccgtggtc ggctgcccgg tcaccgtgct ggtcggtgac gccgatccga 11400ggaccagcct cgacgacgcc cacgcctgga gcgcccacac cacggcggag tccgaggtgc 11460tcaccttctc cggcgggcac ttcttcctcg acgcccacca cgacgcggtg gtggaggtcg 11520tcaccgcgcg cctgcggcag gaccgcgcgc cccggccgga ccgggtgtga gggggcccgg 11580cccgaagggc cgggccgctc cgcgcgtctg ccggcaccgg gccgcaccgg acccggcgcc 11640ggcagacgcg cggcgacctc acatcatggc gggcgccagg gccattcccc cgctggcgtc 11700cagcagttgg ccggtgatcc agcgggcgtc gtcggagacc aggaaggcga cgatgccggc 11760gatgtcgttc ggccggccca gccggccgag cgcggtcagg gccgagatgc ccgcctcggc 11820ccccggggtc tcgcgcaccc agcggttcat gtcggtgtcc gtgatgccgg gggccacggt 11880gttgacggtg atgccgcgcg aaccgagttc gttggcgagc cggggagcca tcatctccag 
11940cgcccccttg gtcatggcgt agggcagcag cggccaggcg atccgggtga cggccgagga 12000gacattgacg atgcgtccgc cgtcggccat cagtgacagg gcccgctggg tcacgaagaa 12060cggtgcccgg acgttgatgc ggtacacgcg gtcgaactcc tcgggcgtgg tgtccgacag 12120gccggggaca tagccgtcct gtgccgcgag cgccgggtcg ccgggggcgg gggcgacggc 12180cgcgttgttc accaggatgt gcagcggacg cccctccagc tcccgctcca gtgcggtgaa 12240gagctcatcc acggcgtcgt cccggaggag gtccgcccgg accgcgaagg cccgtccccc 12300cgcgcgttcg atcgtctcca ccgtctcctg ggcgctcttt tcctgcgttc cgtagtgcac 12360ggcgacccgg acgccctcgg cggcgagtcg ctgggcgatg gcttttccga tgccgcgcga 12420ggcacccgtg accaaggccg tcctgtcgtt caattccggc atcccgaatc cccttctgcc 12480gattatctta cttttcctct tgatgcatgg ggtcggaccc gaggccagat ccgcaccccg 12540gccacgcgtg aggtcgcgac ctcaccgatt actgtgccag agtccaggcg acacacggga 12600gggcgggaat gcgatcgatt tccgcacccg gaactcgtag ggggagcaag aagatcggcc 12660gaatacccct ggggtggata gggggtacca ggaccgtcgg gcgatcacta ttttgaaaca 12720cgactccggc gcgcggccgg cggcgaaagt cctctccatg ccgggctgtc ccctgcctcg 12780aaatacctgc ggcgactttc gccctgcgat gcggccgccc atccctgccg agcggtgagg 12840agacgacaag tgcacgagac acacgcgcac ggcgaggaag ggtcgtccga cgggtccgcg 12900gacgcagtgg tcttcgtctt ccccggacag gggtctcagt ggccggggat gggtgcggaa 12960ctgtgggaca cctccccggt gttccgcgag agtgtgcgcg cctgcgccga cgcgctcgcc 13020ccgtacctcg actggtccgt cgaaggcgtc ctgcgcggcg ccccggacgc cccggccggc 13080ccggcgctcg atcgcgccga cgtcgcgcag ccggccctgt tcaccctcat ggtgtcgctg 13140gccgagctct ggcgctcgca cggagtcgaa ccctgcgccg tcctcgggca cagcctcggc 13200gagatcgccg ccgcgcatgt ggccggcgcc ctgaccctgg ccgacgccgc ccgggtggcg 13260gccctgtgga gccgggccca ggccacgctg tcgggcaccg gcacccttct cgcggccaag 13320gccgcccccg aggaactggc accgcacctt cagcggtgga acggcgacga ccggcacggc 13380acccggctcg cgatcgccgg cgtcaacggg cccggcagca cggtggtggc gggggacctc 13440gacgcgatcg ccgcgctggc cgccgacctg gcctcggcgg gggtgcggac ccgccgggtc 13500gccgtcgacg tgcccaccca ctcccccgcg atgcggaccc tgcgggaacg gatcctcacc 13560gacctggcct ccgtcgcccc gtgcgtctcc cgtctcccct tccactcctc gctcaccggc 
13620ggtctggtgg acacccgcgg gctggacgcc gactactggt accgcaacat cagcgagacc 13680gcgcgcttcg acctcgccgc ccgcggtctc ctggccgacg gacaccggac gttcgtggag 13740ctgagcccgc acccgatact caccctgggc ctgcaagcgc tcgccgacga cgtccccggc 13800gccgccgacg cgctcgtgac gggcacgctg cgccgcgggc gcggcggaat gcggcagttc 13860caggacgcgc tcggccggct cagcgtcccc gcgggcgggc ggcccggccg cgaggtgagc 13920gccgcggccc tggccggccg gctggcgccg ctctccccgg cgcagcagga gcatctgctg 13980gtggaattgg tctgcgccca cttcgccgca ctcgtcggcg gcgacggcgg ggcgccgccg 14040acggtgcggc cgtcggccgc cttcaccgat cagggctgcg actccgccac cgccctggag 14100ctgcgcgacc ggctccgcga ggcgaccggg ctgcgcctgc ccgccacgct ggtcttcgac 14160cacccgacgc cggccgcggt cgccggccgg ttgcgccgac tcgccctcgg gatcgaggag 14220acggcggaca cggcaccggt cgccgtccgc ggccaccggg agggcgaacc gatcgcgatc 14280gtcgggatgg cctgccgctt cccgggaggt gtccggtcgc cggaggacct gtggcggctg 14340gtcaccgaag gcggtgacgc gctcgggccg ttccccaccg accgcggctg ggacaccggc 14400cgccacgcgg aggacccggc cacacccggc acctacgtcc agggcgaggg cggattcctg 14460tacgacgcgg gcgagttcga cgccgagttc ttcgggatct ccccgcgtga ggcgctggcc 14520atggacccgc agcagcggtt gctgctggag atggcgtggg agaccttcga acgggcggga 14580atcgatccca cctcggcccg gggatcgcgt accggcgtct tcgccggggt cctcccgctc 14640ggctacggcc cccgcatgga cgagacggac cagggcaccg ccgacctcca gggccatctc 14700ctcaccggca cactgcccag cgtcgcctcg ggccgcatct cctacaccct cggcctggag 14760ggcccggcgg tgtcggtgga gacggcctgc tcgtcgtcgc tcgtcgccct ccacctcgcc 14820tgccgctcgc tgcgggcggg cgagtgcgac ctcgccctga cggggggcgt ctcggtgctg 14880gccaccctcg gcctgttcgt cgagttctcc cggcagcgtg gactgtcggc ggacggccgg 14940tgcaaggcgt acgcggcggc ggccgacggg accggatgga gcgagggtgc cgggctgctg 15000ctggtcgaac ggctctccga cgcacggcgg ctggggcacc

gggtgctcgc ggtggtccgg 15060ggcagcgcga tcaaccagga cggcgcgtcg aacgggctga ccgcccccag cgggccgtcc 15120cagcagcggg tcatccgcga ggccctggcc gacgcgggcc tgacggcggc ggacgtcgac 15180gcggtggagg ggcacgggac cggcacacga ctgggcgacc cgatcgagat cgaggcgctg 15240ctcgccacct acggacaggg acgcgcccgg gaacggccgc tgtggctcgg atcgctgaag 15300tcgaacatcg gtcacaccat ggccgcggcg ggggtgggcg gggtcatcaa gatggtgatg 15360gcgctgcggc acggggagct gccccgcacc ctgcacgtgg acgcgccctc gccccgggcc 15420gactggtcgg cgggcgaggt acggctgctg acggaggccg tcgcgtggcc cgcggcggcg 15480gacggtgagc cgcggcgggc cggggtgtcg tccttcggcg tgagcggcac caacgcgcac 15540gccatcctgg aggaggcgcc cgccccggag gacgaggaac cggcgccgcc ggacggtgaa 15600gcactactgc cgtgggcggt gtccacgcgg tcggaggccg cactgcggac gcaggcacgg 15660atgctggcgg acgtcgtacg cgacgacccc ggagtcggac tcgccgatgt gggtgcggag 15720ctggcccggg ggcgggcggc tctcgagcac cgggccgtcg tcatcgcctc cgggcgcgcg 15780gagttcgcgc gggcgctgga ggcggtggcg tccggcgagc cgcacccggc cgtggtccgg 15840ggccacgcgg ggagcgagcg cggcggagtg gtgttcgtct tcccgggcca gggcggtcag 15900tgggccggca tgggactcga cctcctgcga agctcaccgg tgttcgcgga gcacatcgcg 15960gcctgcggca aagctctggc cccgtgggtg aagtggtcgc tcacggaggt gctgcaccgg 16020gacgccgagg atccggtctg ggaccgggcc gacgtcgtcc agccggtgct gttctcggtc 16080atgacgtcgc tggcggcgct gtggcgctcg tacggcgtcg agccggacgc cgtgaccggg 16140cactcgcagg gggagatcgc cgccgcgtac gtctgcggag cgctcggtct ggaggacgcc 16200gcacggacgg tggcgctgcg cagccgcgcc ctggtggcgc tgcgcgggcg gggcggcatg 16260gcgtccgtcg cctccgccgc cccggacgtc gaggagctca tcgcgcggcg ctggcccggc 16320cggctgtggg tcgcggcgtt caacggcccc ggcgcggtga ccgtttccgg ggacggtgat 16380gcgctggagg agttcctggg ccactgcgcg gacacggagg tgagggctcg gcgcgtcccg 16440gtggactacg cctcccactg cccgcacacg gaggcgatcg agcgggaact gctcgacgcc 16500ctggaggaca tcaccccccg gccggcggcg gtcccgttct attcgacggt cgacgacgcg 16560tggctggaca ccacacggct ggacgcctcc tactggtacc gcaacctgcg ccggcccgtc 16620cgtttcagcc aggccgtgcg cgccctcacg gacggcggcc accgcgtctt catcgaggcg 16680agcccgcatc ccaccctcgt ccccgccatc gaggaccacg gcgacgtcac 
cgccctcggc 16740accctgcgcc gccacggcga cgacaccgag cggttcctca ccgccctcgc ccacctccat 16800gtcaccggag ccgccggcca ggacctctgg cgccaccact acgcccggct caggcccgcc 16860ccccgccacg tcgacctgcc cacctacgcc ttccagcgcg accggtactg gtggagcggc 16920ggcgccgggc gcggggacgt caccaccgcc ggtctgcacc ccggcggcca tcccctcctc 16980ggcgccgcgc tggacctcgc cgacggcggc ggccgcctcc acaccggccg tgtctccctg 17040cgcacccacc cctggatcgc cgaccacggc gtcgcgggca tcaccctcct gcccggcacc 17100gccttcctcg aactcgccct gcacacgggc gagtcgggga acgtgcggga actcaccctg 17160cacgcgcccc tggtcgttcc cgacgaggag ggcgtcgacc tgcaagtgca cctcgcccgg 17220cccgacgaag cgggcctgcg cgccctgacc cgtcttctcc cgggccgcgg ggtgccgacc 17280ccgagagccc cctggcagcc ccacgccacc ggccttctcg ggccggccga ccgagcaccc 17340ggctcctccg gcctcgagcc gcacgacctg ggcggcgcct ggcctccgcc gggggcggtc 17400cccctcgtcc ccggcgaact cggcgacgtg cccggctgct acgcccgcct ggccgacgag 17460gggttcgagt acgggccggc cttccggggg ctgcgtgcgg tgtggcgccg cggcacggag 17520atcttcgccg aggtcgccct cccggccggc gacggctccg tgttccggct gcatccggcg 17580ctgctggacg ccgtgctgca ccccgtcgta ctcgggctgg tggacggcgt gccggcccgt 17640ccgctgccct tctcctggaa cggcgtggcg ctgcacgccc ccgcgagcgg cgcgctgcgg 17700gtgcgcctcg cgccggccga cgacggcgct gtcggcatca cggccgcgac ggccgccggt 17760gagccggtgc tctcggtcgc cgcgctggcc ctgcggtccg cctcggcgga gcagttgcgc 17820gcggcgatcc gctccgcggc gggctcgcgc gacgccctct acgagctgga ctggctgccg 17880ctcccggcgg accgggccgc ttcgcccggt ggggccgaca tcgcggccct gggcacatcg 17940gagctgccct gccgtacgta cgagaccatc gcggagctgt cgcaggccct cgccgacggt 18000gctcccgccc ccgacgccgt cgtctccgac gtcggcgccg tcggcgggcc gctggacacc 18060gtgagcctgc acggcctctg ccggcgcggg ctggaactcg tgcaagcctg gctgggcgag 18120ccccggacgg ccgacacgcg gctggtgctc gtgacgcgtg gggcggtcgg ctgtgccccg 18180gccgagccgg tcgccgatcc ggccgcggcc gcgctgtggg ggctggtgcg gtccgcgcag 18240gcggagcacc ccggacggct gctcctgctg gacctcgacc ccgccgggtc gcggcccgtc 18300tccggccgcc tggtggaaca ggcggtggcc tgcggtgagc cgcacatcgc cgtacggggc 18360gacggcctgc gcgtcccccg gttgtcccgc gcgacggccg cccccgcaca ccctcccgcc 
18420ggtggccggg aagcgcagtg ggacccggaa gggaccgtcc tcatcaccgg cggcaccgga 18480agtctcggcg cgctgttcgc ccggcatctg gtgaccgcgc acggggtacg gcggctgctc 18540ctcgccagcc gcagtggccc cggcgccccc ggcgccgccg ggctgcggga cgaactgacc 18600gctcacggag ccaccgtcac cgtcgccgcc tgtgatgtgg ccgaccggga ggccgtcgcc 18660gccctcctgg cgtccgtgcc gtccgagcac ccgctgaccg ccgtagtgca caccgccggc 18720gtgctggacg acggcgtact cgcctcgctc accgccgacc ggctggcccg cgtcctgcgt 18780gccaaggccg acgccgcgct ccacctgcac gatctcaccc gcgatctgcc gctcgccgcc 18840ttcgtcctct tctcctccgt cacggcgacg ctcggcacac ccggccaggc caactacacc 18900gccgccaacg cgttcctcga cgcgctcgcc cggcatcggc gcgccgcggg cctgcccgcc 18960gtctcactcg cctgggggct gtgggagcag accggcgggc tgaccgatca cctcggatcg 19020gtcgacctgc ggcggatggc ccgcaacggc ctggtcgcgc tgcccgccga cgccggcctg 19080gcgctcttcg acaccgcgct ggccctggac cgcgccaacc tggtcccggc gcggctcgac 19140ctgcccgcgc tgcgccgcgc cacacacgtg ccgcccgttc tgcggcggct ggtcgaggtg 19200ccgggggcgc cgagcgcgga ccggtccgcc gggtccggcg gcgaggtgag gccgctgcgt 19260gagacgctgg ccgggctgga cgaccggaaa cgccccgctg ccgtctcccg cctggtccgc 19320aggcacgtcg cgtgggtgct cggcgccgac ggtccggagt cggtggacga ggaccgcagc 19380ttccgcgacc tcggcttcga ctcgctgatg gccgtcgaac tgcgcaacca gctcaacacc 19440gccgccggca tccggctcgc ggccaccctc gtcttcgacc acccgacacc gtcggccgtg 19500gcgcggcacc tcctcgaccg gtgctcgccg gacccggccg ccccggccgc tccctcgggt 19560acggcggtcg cgtcggcgct cgccactctg gccgagctgg agacggcttt gaacggcatc 19620ccggccgagg agtggacggc cgccgggggc ccggcccggc tgatgacgct ggcgtcctcg 19680ctgcccgcgc ccgcgtccgt ccctcggaca ccggcggccg gcgaagccgc cgagaagctc 19740gcccacgcct cgcgcgacga gatcttcgcg ttcatcgatc gggagctggg gcgtgactcc 19800gggccagcct caccctctcg cctcggtccg cagacccccg actcgacaga caaggcgccc 19860tttcatggag aatgaggaaa agctcctgga ctacctcaag tgggtcaccg ccgatctgca 19920ccgctcgcgg gaacgcgtca ccgagctgga ggaggccggc cgggagccga tcgccatcgt 19980cgggatggcc tgccggttcc cgggcgaggt gcggtcgccg gaggagctgt gggggctggt 20040cgcctcgggc ggcgacgcga tcggggcgtt cccggacgac cgcgggtggg atctggacgg 
20100gctgttcgac cccgacccgg agcgtgcggg cacctcgtac acccggcgcg gcggtttcct 20160gtacgacgcg gcggagttcg acgcgggctt cttcgggatc tccccgcgtg aggcgatggc 20220gatggacccg cagcagcggc tgctgctgga gacctcgtgg gaggctttcg agcgggccgg 20280catcgacccg tcctcggtac gcgggtcccg ggtcggtgtc ttcgccggcc tcatgtacca 20340cgactacgcg gcggcccagg gcagcaccgg ggacggagac ggggagccgg acttcgaggg 20400ctacctcggc gacggcagcg tcagcagcat cgcctcgggc cgtatcgcct acaccctcgg 20460gctcgcgggc gcggcgatca ccgtcgacac ggcctgctcc tcttccctgg tcgccctgca 20520cctcgcctgc caggcgctgc gcaccggcga ctccgagctg gccctggccg gcggggtcag 20580cgtcatgtcc accccccgca ccttcgtcca gttctcgcgg cagcggggcc tgtcggcgga 20640cggccggtgc aaggcgtacg cggcggcggc cgacgggacg gggttctccg agggcgtcgg 20700catggtgctg gtcgaacggc tctccgacgc ccggcggctg gggcatccgg tactggcggt 20760cgtgcggggc agcgcggtca accaggacgg cgcgtcgaac ggtctgacgg cgcccaacgg 20820accgtcgcag gagagggtga tccgcgaggc gctggccaac gcgggcctga cggcggcgga 20880cgtcgacgcg gtggaggggc acgggaccgg gacacggctg ggtgacccga tcgagttgca 20940ggcgctgctc gccacctacg gacagggacg cgcccgggag cggccgctgt ggctcggatc 21000ggtgaagtcc aacatcggtc acgcgcaggc ggcggcgggg gtgggcggcg tcatcaagat 21060ggtgatggcg ctgcggcacg gggagctgcc gcgcaccctg cacgtggacg cgccctcgcc 21120ccgggtcgac tggtcggcgg gcgaggtacg gctgctgacg gaggccgtcg cgtggcccgc 21180ggcggcggac ggtgagccgc ggcgggccgg ggtgtcgtcc ttcggggtga gcggcaccaa 21240cgcccatgtg atcctggagg aggcgcccgc gtcggagggc gaggaagctc cgccgccgga 21300gcccgggtcg ccgttgccgt gggtggtgtc cggtcactcg gaggcgggct tgcgcgccca 21360ggcgcaggct ctggcggagt tcgcacggac cgcgcccggg gccgaactcg tggacgtggg 21420agcggcgttg gcccgggggc gggcggcgct ggggcatcgg gcggtcgtcg tcgcctcgga 21480gcgtgaggag ttcgagcggg cgctggccgc gctggcctgt ggcgaaccgc acccgtgtgt 21540ggtcgacggg tcggcggacg gccggcgcga ggacggtgtg gtgttcgtct tcccgggcca 21600gggcggtcag tgggccggca tgggactcga tctgctgacg acctcggggg tgttcgccga 21660acatatcggt gcgtgtgaac gcgcgctggc gccgtgggtg gagtggtcgc tgacggagat 21720gctccaccgc gaggcggagg acccggtgtg ggagcgggcg gacatcgtcc agccggtgct 
21780gttctcggtc atggtgtccc tggccgcgct gtggcggtcc tacggcatcg aacccgacgc 21840ggtggtcggc cactcccagg gcgagatcgc cgccgcccac gtctgcggcg ccctcaccct 21900cgaagacgcc gcgaaagtcg tggcactgcg cagccgggcc ctggccgcac tgcggggccg 21960cggcggcatg gtctccctct cgctgtcgac cgcggatgcc ggggagctgg tggagcggcg 22020gtgggccggg cggctgtggg tcgcggcgct caacgggccg gaggcgacga cggtctcggg 22080ggacgtcgac gcgctggagg agctcctggc ccactgcgcg aaaagcgagg tgcgagcgcg 22140gcgcgtcccg gtggactacg cctcccactg cccgcacacg gaagcgatcg cggaagagat 22200cgtcgattca ctcggggaca tcacgccccg ggccgccacc gttccgttct actcgacggt 22260cgacgacatg tggttggaca ccacacggct ggacgcctcc tactggtacc gcaacctgcg 22320cctcccggtc cgcttcagcc aggccgtgcg cgccctcacg gaagaaggcc accgcctctt 22380catcgagacg agcccgcatc ccaccctcgt ccccgccatc gaggaccacg gcgacgtcac 22440cgccctcggg accctgcgcc gccacggcga cgacaccgag cggttcctca ccgccctcgc 22500ccacctccat gtcaccggag ccgccggcca ggacctctgg cgccaccact acgccaggct 22560caggcccgcc ccccgccacg tcgacctgcc cacctacccc ttccaacgcc ggcgctactg 22620gctggagaaa cccgacccgc agaccaggcc ccagcggtcc cgctccaccg ccccggacct 22680cgacaggctg gaggcggagt tctggcaggc cgtcgaggaa accgacaccg acaccctcgc 22740ccacaccctc cacctcgaca cccagaccct cgaacccgtc ctccccgccc tcgccacctg 22800gcaccaacaa caacgcgacc acgcccgcat caacacctgg acctaccagg aaacctggaa 22860accactccac ctccccacca cccgacccac cacccccacc agctggctca tcgccatccc 22920cgaaacccac cgcaaccacc cccacaccac caacctcctc accaacctcc cccaccacaa 22980catcaccccc atccccctca ccatcaacca caccaccgac ctccaccacg cctaccacca 23040cgcccaccac cacaccaccc cacccatcac cgccgtcctc tccctcctcg ccctcgacga 23100aacaccccac ccccaccacc cccacacccc caccggcacc ctcctcaacc tcaccctcac 23160ccaaacccac acccaaaccc acccaccaac ccccctctgg tacctcacca cccaagccac 23220caccacccac cccaacgacc ccctcaccca ccccacccaa gcccaaacca tcggactcgc 23280ccgcaccacc cacctcgaac acccccacca caccggcgga cacatcgacc tccccaccac 23340accccacccc aacaccctca cccaactcat caccgccctc acccaccccc accaccaaca 23400caacctcacc atccgcaccc acaccaccca cacccgacga ctcaccccca ccaccctcca 
23460acccaccacc cccacaccac ccaccaaccc ccacggcacc accctcatca ccggcggcac 23520cggcgccctc gccaccaccc tcgcccacca cctcgccacc accggcaccc aacacctcct 23580cctcaccagc cgacgcggcc cccacacccc cggcgcccga caactccaca cccaactcac 23640ccaactcggc accaacacca ccatcaccgc ctgcgacctc tccgaccccg accaactcac 23700ccacctcctc acccacatcc cccccgaaca ccccctcacc accgtcatcc acaccgccgg 23760catcctcgac gacgccaccc tcaccaacct cacccccacc caactcgaca acgtcctgcg 23820cgccaaagcc cacaccgccc acctcctcca ccacgccacc ctccacaccc ccctcgacca 23880cttcgtcctc tactcctccg ccgccgccac cctcggcgcc cccggccaag ccaactacgc 23940agccgccaac gcctacctcg acgccctcgc ccaccaccgc cacacccaca acctccccgc 24000caccaccatc gcctggggaa cctggcaagg aaacggcctc gccgactcgg acaaggcccg 24060cgccaacctc gaccgccggg gcttcctgcc catgcccgag acgctggccg cagccgcggc 24120cgtgcgggcg atcgagagca ggcggccgtc cgtggtcatc gccgccatcg actgggccag 24180agccgagcgc acccccgacg tcgaggatct cctccccgcg gccgacgagg ggtcgtcgag 24240tggcaagccg gaggccgcgc cggtggacct gcgcggtacc ttgagccggc agtccgccgc 24300cgaccaacag gccacactgc tcggcctggt gcggacccag gcagccgtcg tactgcgcca 24360cacggagccc gaggcgctcg ccccgggcca ggccttccgg gcgctcggct tcgactccct 24420caccgccgtc gaactccgca accgactggc caaggccacg gacctcgcgc tgcccgcctc 24480actggtcttc gatcacccga ctccggtgaa gctcgcggag ttcctgcgca ccgagctgct 24540cggcaccgca ccagctacca ccgccgccgt cccggccctc caggcacaca ccgacgaacc 24600catcgccatc atcggcatgg cctgccgctt ccccggcgcc gtcaccacac ccgaacacct 24660gtggaacctc atcgccaccg aacaagacgc catcggcgag ttccccaccg accgcggctg 24720ggacctggac aacctctacc accccgaccc cgaccacccc ggcaccacct acacccgcca 24780cggcggattc ctccacgacg ccggcgactt cgacgccgac ttcttcggca tcaacccacg 24840cgaagccctc gccatggacc cccaacaacg actcctcctc gaaaccgcct gggaagccat 24900cgaacacgcc ggcatcctcc ccgacgccct gcacggcacc cccaccggcg tcttcaccgg 24960cgtcaacgcc caggactacg ccgcacacac ccacacctcc ccccacacca ccgagggcta 25020caccctcacc ggaaccgccg gcagcatcgc ctccggccgc atcgcctacg tcctcggact 25080cgaaggcccc gccgtcacca tcgacaccgc ctgctcctcc tccctcgtcg ccctccacct 
25140cgcctgccag gccctgcgag caggcgaatg caccacagcc ctcgccagcg gcatcagcat 25200catgaccaca ccgctggcct tcaccgagtt ctcccggcag cggggtctgg cggcggacgg 25260ccggtgcaag gcgttcgcgg cggccgccga cggtaccggc tggtcggagg gggtggggac 25320gctgctgttg gagcggttgt cggacgccga gcggaacggg caccgggttc tggcggtggt 25380gcggggcagc gcggtcaacc aggacggcgc ctccaacggg ctgacggcgc cgaacggtcc 25440gtcccagcag cgtgtgatcc gccaggccct ggtcaacgcg aacctctccg cagttgatgt 25500cgacgccgtc gaagcccacg gcacggggac caagctgggc gacccgatcg aagcccaggc 25560cctgctcgcc acctacggcc agggacgtgc gcaggaacag ccactgtggc tcggttcggt 25620caaatccaac ctgggtcaca cccaggcggc ggcaggcatg gccggcctga tcaagatggt 25680gatggcgctg cggcacgagt cgttgccgcg gacgttgcat gtggatgagc cgtcgccgga 25740ggtggactgg tcgtcggggg cggtgagtct gctgaccgag gcgcggccct ggccgcgggt 25800cgaggaccgg ccccggcggg ccggggtgtc ctcgttcggg gtgagcggga cgaacgccca 25860cgtcatcgtg gaggaggcgc ccgcgccgac gggagtggag gcggtggaag ccgcgccggc 25920gggggtggag actgcggcgg ctgcggcggt ggtggtggag acggacggtg cgggccgggt 25980gtcggcggat ctgccgttgg tgtgggtggc gtcgggcaag tcgcaggccg cgatacgcgc 26040ccaagccgcc gccctgcacg cccacgtcct ggaccacccc gaacaggacg cggacgacat 26100cggctacagc ctggccacca cccgcgccct gttcgaccac cgcgccaccc tcatcgcccc 26160cgaccgccac accgtcccgg agcccctcac cgggctgggc gacggacgca cgcaccccca 26220cctcatcccc acacccccca ccgaacccgg ccacacccac aaaatcgcct tcctctgctc 26280cggacaaggc acccaacgcc ccggcatggc caccggcctc taccacacct accccgcctt 26340cgccgccgcc ctcgacgaaa cctgcgccca cttcgacccc cacctcgacc accccctgca 26400cgacctcctc ctcaaccacg accccaccga cctcctcacc cacaccctct acgcccagcc 26460cgccctcttc accctccaaa aagccctcca ccacctcatc accgaaacct acggcatcac 26520cccccactac ctcgccggac actccctcgg cgaaatcacc gccgcccacc tcgccggcat 26580cctcaccctc cccgacgcca cccacctcat caccacccgc gcccgcctca tgcaaaccat 26640gccccccggc accatgacca ccctccacac cacccccgaa cacatccaac ccctcctcga 26700ccaacacccc ggcaaagccg ccatcgccgc cgtcaacagc ccccactccc tcgtcatcag 26760cggcgacccc gacaccatcc accacatcac caccacctgc cacaaccaag gcatcaccac 
26820caaacccctc gccaccaacc acgccttcca ctccccccac accgacacca tcctcgaaca 26880actcgacacc accacccaca ccctcaccta ccaccaaccc cacacccccc tcatcaccag 26940cacccccggc gaccccctca ccccccacta ctggacccac cagacccgcc aacccgtcca 27000ctggaccgac accatccaca ccctccacac ccacggcgtg accacgtaca tcgcactcgg 27060accagagcac accctcacca ccctcaccca ccacaacgtc ccccaccacc aacccaccgc 27120catcaccctc acccaccccc accacaaccc cacccaccac ctcctcaccg cactcgccca 27180cctccacaca acccaaccca ccggccccaa catctggcac caccactaca ccccagtcgc 27240acccgccccc cgccacgtcg acctgcccac ctaccccttc ccacgccggc gctactgggt 27300gcaggcgtcc gccggtacgg gtgacgtgtc ggctgccggg ctccagcgac cggaccaccc 27360actgctcggc gcggtgatgg agctcgcgga cggggacgga atcgtcctca ccgggcgctt 27420gtccctgcac acccacccct ggctcgccga ccacagcgtc ggcggcgtcg ccctccttcc 27480cggtaccgct ctgctggagc tggcttttca ggctggtctg cgtgcgggtt gtcctggtgt 27540cgatgagctg actctccatg ctcctctggt ggttccggag tcggggcatg tggtggtgca 27600ggtgtcggtt tcggtgccgg gcgaggcggg tcgtcgtggt gtgagtgtgt acgggcggct 27660ggtggaggac ggggggctgg agggtgagtg gacgcggcat gccgagggtg tggtgtgtcc 27720gtctgttcct ggggagtcgg tggttgtgga gccggtggcg gacggggtgt ggccgccgtc 27780cggtgcgcag ccggtggatc ttgaggagtt ctacggtcgt ctggcgggtg ggggttttgt 27840ctacggtccg gtgttccagg gtttgtgtgc ggcctggcgg gacggggacg acgtggtggc 27900cgaggtgcgt ctgccggacg aggggctggc cgatgtcgcg ggcttcgggg tgcatccggc 27960gctcctggac gcggccgtgc aggcagtcac cctcctgttc ccggaccagc agcaagccgg 28020tctcgcggcc cacacatgga acggtgtctc gctccacgcc cggggcgcca ccgtcctgcg 28080cctgcgcatg actcccaccg acgcgacctc gaccgccgtt cgcctgcacg ccaccgacga 28140gaccggagca cccgttctca ccctcgactc gctcctgatg cgtccggtgc cgttggaggg 28200gctgggggcg ggggtgcggc gtggctcgtt gttcgagctg gggtgggtgc cggtggaggg 28260gatgccggcc tcggtggccg gtgggggcgg ggagttggtg gcgtgggagt gcccgggtgg 28320tggggtggcc gaggtcacgg ccgcggcgtt gggagtggtg caggagtggc tcgccgatga 28380gcgggagggg gacgcgcggc tggtcgtggt gacgcgtggt gcggtcgcgg tggatgcggg 28440ggagccggtg cgggacgtgg cgggggccgc tgtgtggggg ctggtccgct cggcccagtc 
28500cgagcatccc gaccggttcg ccctgctcga cctcgacccc gacaccaaga ccgaccccgg 28560catcgacacc gacggggaca ccgacgtgtc cgccgacgcg aaggtcggca ccggtgatgg 28620tctcgacgat gccgccgtcg cgtccgctct ggcccgcggt gagagccaac tcgccgtacg 28680cgacggggtg gttcgcgtag cgcggttggg gggtttggtt ggggggttgt cgttgcctgg 28740tggggtgggg tggcggctgg atggtggtgg gtcggggttg ttggaggggg tgggtgtggt 28800tgcttcggat gcggctgggg tggtgctggg tcgggggcag gtgcgggtgg cggtgcgggc 28860tgccggggtg aacttccggg atgttctggt ggcgttgggg atggtgccgg gtcaggtggg 28920ggtgggcagt gagggtgcgg gggtggtggt ggaggtgggg cccggggtgg agggcctggt 28980ggtgggggac cgggtgttcg gggtgttcgg ggacgcgttc gcgccggtgg tggtggcgca 29040ggaggtgttg ctggcccgta tcccggaggg ctggtcgttc gcgcaggcgg cttcggtgcc 29100ggtggtgttc gctaccgctt acctgggact ggtcgatctg gcgggggtgc ggcgggggga 29160gagtgtgctg gtccatgcgg cggccggcgg ggtcggtacc gccgcggtgc agctcgcccg 29220tcatctgggg gcggaggtgt atgcgacggc cagtgaggcg aagtgggcgc gtctgcgggc 29280ggcgggtgtc gcgccgcagc ggatcgcgtc ctcgcggagt gtggagttcg agtcccgttt 29340ccgccgggcc agtggcggcc ggggtgtgga tgtggtgctg aactgtctgg cgggtgagta 29400caccgatgcc tcgttgcggc tgtgttcgcc gcaggggggc cggttcctgg agctgggcaa 29460gaccgacatc cgtgatgccg gtgaggtcgc cgctcggttc ccgggggtgt cctaccgggc 29520gtatgacctg atggacgcgg gtgcgcagcg ggtgggggag atcctgcaca cggtggtgga 29580tctgttccgg cgcggggtgc tggagccgtt gccggtcacc gcgtgggacg tgcgccaggc 29640ccatcaggca ctgcggtcga tgcggtcggg cctgcacgtc ggcaagaacg tgctcaccct 29700gcccgtgccc ctggatgcgg aggggacggt gctggtgacg ggcgggaccg gcactctggg 29760ggcggcggtc gcgcgccatc tggccgccgg gcacggggtg cggcatctgc tgctggtgag 29820ccggcgcggc atggccgccg ccggtgccga aaaactgtgt gcggaactgg gtcaggcagg 29880ggtttcggtg tcggtggccg ggtgtgatgt cgccgaccgc gcccaggtcg ccgccctgct 29940ggagcaggtg cccgcggagc atccgctgac cgctgtggtg cacacggccg gtgtcctgga 30000cgacgccacc gtgacctgcc tggaccggaa caagatcgat gcggtgctcg gggcgaaggt 30060ggacggtgcc ctgcacctgc acgagctgac cgcggggatg

gacctgtcgg cgttcgtgct 30120gttctcctcc gccgccgggg tcctgggctc gccggggcag ggcaactacg ccgccgccaa 30180cgccgccctg gacgccctgg cccaccagcg ccgcgccgcc ggtctgcccg ccctctccct 30240ggcctgggga ctgtgggaag aggccagcgg gatgaccggc catctggatg ccgctgaccg 30300tcaccgcatc acccgctcgg ggctgcatcc cctgaccacc cccgacgccc tcgccctcct 30360cgacaccgcc ctggccgccg gacgccccgc actcctgccc gccgacctac gccccaccca 30420ccccgcaccg cccctcctgg aacacctcgc gcccgcccgc accagccacc gcaccgcaca 30480caccagcacc gcaaccggcg tgggccagga cgtctccctc accgaccgcc tcgccaccct 30540gacccccgaa cagcggcacg acaccctgct ggcgctggcc cgtacccaca tcgccgccgt 30600cctgggccac cccagccccg acaccatcga ccccgaacgc accttccgcg acctcggctt 30660cgactccctc accgccgtcg aactccgcaa ccggctcacc cgcgccaccg gcctgcgcct 30720gcccgccacc ctcgccttcg accaccccac ccccaccgca ctcacccacc acctcaccac 30780cctcctcaac cccaacgaca acgacaacgt cggtccggta ctgatggagc tcgaaagact 30840ggaatccgct ctcgccgcgc tggacaggga cgacagcgcc tgcgagcggg tcactctgcg 30900actgcaatcg ctgatgctca ggtggagcgg ctccgagcgg cagtcagccg aaaacacgga 30960cgactccagc aggttcgcgt cggcgaccgc ggaggagcta ctcgaattca tcgaccgaga 31020cctgggtctt tcctgaacca gctcggtctt ccctgaacca gctcgacgac gcggttttcc 31080cgtgcgcgac ggactccaag gacgtgaacc agacgtggcg aatgacgaga aggtgctcga 31140atacctcaag cgagtcaccg cggatttgga ccggaccagg cggcgcctgt acgaagtcgt 31200cgagcgggag caggagccga tcgccatcgt ggggatggct tgccgttatc cgggcggggc 31260cgggtcgccc gcaggtctct gggacctcgt cagctccggt acggacgcca tcggggagtt 31320ccccaccgat cgtggctggg atctggaacg tctctacgac cccgaccccg atcacccggg 31380caccacgtac acccgccacg gcggattcct cgacggcgta ggtgagttcg acgcggagtt 31440cttcggcgtc agcccgcgtg aggccctggc gatggacccc cagcagcggc tcctcctcga 31500aaccgcctgg gaagccatcg aacacgccgg catcgtcccc gagtcgctgc gcggcacgtc 31560caccggcgtc ttcgccggta tcaacccgca ggactacacc atcagtcagt acggacggga 31620ttcggagatc gagggctatc tgctgaccgg ggcagccgcc agtatcgcct ccggccgtat 31680ctcctacacc ctcggcctcg aaggcccagc cgtcaccatc gacaccgcct gctcctcctc 31740cctcgtcgcc ctccacctgg cttgccaagc gctgcgcgca ggggagtgca 
ccatggccct 31800ggcgggcggc gcctcggtcc tgtccacacc gctgatcttc gtcgagttcg ctcgccatca 31860cggcctgtcg gtcgacggcc ggtgcaaggc gttctccgct tcggccgacg gcacgggctg 31920gggcgagggc gccggcctgc tcctcctcga acggctctcc gacgccaagc gcaacggccg 31980ccgcatcctc gctctcgtac gggggagcgc ggtcaaccag gacggcgcct cgaacgggct 32040gacggcgccg aacggaccct cccagtgcag ggtcatccgc cgggccttgg ccaacgccca 32100tctcgccccg gccgacatcg atgccgtgga agctcacggc accggcacca ccctgggcga 32160ccccatcgaa gcccaggccc tccaggaagc gtacggcgcg gaccgacccg acgatcggcc 32220gctctgggtc ggcacgctca agtcgaacat cggccactcg atcgccgcgg cgggtgtggg 32280cggggtcatc aagatggtga tggcgctgcg gcacgagtcg ttgccgcgga ccttgcatgt 32340ggatgagccg tcgccgcagg tggactggtc gtcgggtgcg gtgagtctgc tgaccgaagc 32400gcggccctgg ccgcgggacg aggaccggcc ccggcgggcc ggggtgtcct cgttcggggt 32460gagcgggacc aacgcgcacg tgatcctgga ggaagcgccc gcgccggcgg aggtgcaggc 32520ggtagaaact gcgccggtgg tgcgggtgga tggtggggag cgttccgcac cggcggatgt 32580gccgttggtg tgggtcgtgt cgggcaagtc gcaggccgcg ctacgcgccc aggccgccgc 32640cctgcacgcc cacgtcctgg accaccccga acaggacgcg gccgacatcg gctacagcct 32700ggccaccacc cgcgccctgt tcgaccaccg cgccaccctc atcgcccccg accgcgacac 32760cctcctggac gccctcaccg ccctggccga cggccgcacc cacccccacc tcgtccccgc 32820accccccacc gaacccggcc acgcccacaa aatcgccttc ctctgctccg gacagggcac 32880ccaacgcccc ggcatggcca ccggcctcta ccacacctac cccgccttcg ccgccgccct 32940cgacgaaacc tgcgcccact tcgaccccca cctcgaccac cccctgcgcg acctcctcct 33000caaccacgac cccaccggcc tcctcaccca caccctctac gcccagcccg ccctcttcac 33060cctccaaaaa gccctccacc acctcatcac cgaaacctac ggcatcaccc cccactacct 33120cgccggacac tccctcggcg aaatcaccgc cgcccacctc gccggcatcc tcaccctccc 33180cgacgccacc cacctcatca ccacccgcgc ccgcctcatg caaaccatgc cccccggcac 33240catgaccacc ctccacacca cccccgaaca catccaaccc ctcctcgacc aacaccccgg 33300caaagccacc atcgccgccg tcaacagccc ccactccctc gtcatcagcg gcgaccccga 33360caccatccac cacatcacca ccacctgcca cacccaaggc atcaccacca aacccctcac 33420caccaaccac gccttccact ccccccacac cgacaccatc ctcgaacaac tcgacaccac 
33480cacccacacc ctcacctacc acccacccca cacccccctc atcaccagca cccccggcga 33540ccccctcacc ccccactact ggacccacca gacccgccaa cccgtccact ggaccgacac 33600catccacacc ctccacacca acggcgtcac cacctacatc gaactcggac ccgaccacac 33660cctcaccacc ctcacccacc acaacctccc ccaccaccaa cccaccgcca tcaccctcac 33720ccacccccac cacaacccca cccaccacct cctcaccgca ctcgcccaca cccccaccac 33780ctggcacacc caccaccaca cccacaccaa cccccacccc cacaccatcc ccgacctccc 33840cacctacccc ttccaacgcc ggcactactg gctccaggcg cccaccacca gcaccgatca 33900gccggtggcc ccgacgaacg acgacgcccc cgcgcctcga gcgacatcgc tccgggacac 33960tcttgccgga cgaagccctc aagagcgcga agaagtgctc ctggatctcg tactgaccca 34020ggtcgccgcc gtgctcggcc acaccgcgcc tgaggtggtg gatccccaaa gggcgttcaa 34080ggacctcggc ttcgactcac tggccgccat caaactccgc aacaggctcg ccgcagccac 34140cggactcgag ctgccgacca cccttgtctt cgaccacccc acgccggtgg cactccgcca 34200gtacttccag tcgcagatcc tcggagcgga ggcggacgcc cccaaccgtc tgcccctccg 34260ggcggcgacc accgacgaac ccatcgcgat cgtcggcatg gcgtgccgct tcccgggcgg 34320cgttcggacg gccgacgacc tgtggcagct cctgagcgac gaacacgatg cggtcggcgg 34380cttccccacc aaccggggtt gggacgtggc gaacctctac gacccggacc cggatcgcca 34440cggcaccacg tacacccagc agggcggctt cctctacgaa gcgggggagt tcgacgccga 34500gttcttcggc atcagcccgc gtgaggccct ggcgatggac ccccagcagc ggctcctcct 34560cgaaaccgcc tgggaagcca tcgaacacgc cggcatcaac cccgatgccc tgcgcaacac 34620gtccaccggt gttttcgccg gggtcatcta ccacgactac gcgagccggt tcctcaccgc 34680gccggccggt tacgagggct acctcggcca cgggagtgcc ggcagcatcg cgtcgggccg 34740tgtcgcgtac gtgctgggtc tcgagggtcc cgcggtcacg gtcgacaccg cgtgttcgtc 34800gtcgctcgtc gcgctgcatc tggcctgtca ggcactgcgg tcgggcgagt gcacgatggc 34860tctggcgggc ggcgcgacgg tgatgtcgac cccgcaggcg ttcgtggagt tctcccggca 34920gcggggtctg gcggcggacg gccggtgcaa ggcgttctcc gctgcggccg acggcacggg 34980ctggggcgag ggcgccggcc tgcttctcct cgaacggctc tccgaggccg agcggaacgg 35040acaccgggtt ctggcggtgg tgcggggcag cgcggtcaac caggacggcg cctcgaacgg 35100gctgacggcg ccgaacggtc cgtcccagca gcgcgtgatc cgccaagctt tggccaactc 
35160gggcctgacc ggcgccgatg tcgacgccgt cgaagcccac ggcacgggga ccaagctggg 35220cgacccgatc gaagcccagg ccctgctcgc cacctacggc caggaacacc accccgacca 35280gccgctctgg ctcggctccc tgaagtccaa catcggccac gcccaagcgg cagcaggcgt 35340gggcagcatc atcaagatga tcatggctat gcgcaacgag tcgctgccgc ggacgttgca 35400cgtggatgag ccgtcacccc atgtggactg gtcgtcgggg gcggtgagtc tgctgaccga 35460gccacgcccc tggccacgcc gggaagaccg gccccggcga gcgggaatct cctccttcgg 35520agtcagcggg acgaacgccc acgtcatcgt ggaggagccg cctgcgcggg cggaggtgga 35580ggcggtggaa gccgcgccgg cgggggtgga gactgcggcg gctgccgcgg tggtggtgga 35640gacagacggt gcgggccggg tgtcctccga tgtgccgttg gtgtgggtgg tgtccggcaa 35700gtcgcaggcc gcgctacgcg cccaggccgc cgccctgcac gcccacgtcc tggaccaccc 35760cgaacaggac gcggccgaca tcggctacag cctggccacc acccgcgccc tgttcgacca 35820ccgcgccacc ctcatcgccc ccgaccgcga caccctcctg gacgccctca ccgccctggc 35880cgacggccgc acccaccccc acctcatccc cacacccccc accgaacccg gccacaccca 35940caaaatcgcc ttcctctgct ccggacaagg cacccaacgc cccggcatgg ccaccggcct 36000ctaccacacc taccccgcct tcgccgccgc cctcgacgaa acctgcgccc acttcgaccc 36060ccacctcgac caccccctgc gcgacctcct cctcaaccac gaccccaccg acctcctcac 36120ccacaccctc tacgcccagc ccgccctctt caccctccaa aaagccctcc accacctcat 36180caccgaaacc tacggcatca ccccccacta cctcgccgga cactccctcg gcgaaatcac 36240cgccgcccac ctcgccggca tcctcaccct ccccgacgcc acccacctca tcaccacccg 36300cgcccgcctc atgcaaacca tgccccccgg caccatgacc accctccaca ccacccccga 36360acacatccaa cccctcctcg accaacaccc cggcaaagcc accatcgccg ccgtcaacag 36420cccccactcc ctcgtcatca gcggcgaccc cgacaccatc caccacatca ccaccacctg 36480ccacaaccaa ggcatcacca ccaaacccct caccaccaac cacgccttcc actcccccca 36540caccaacacc atcctcgaac aactcgacac caccacccac accctcacct accacccacc 36600ccacaccccc ctcatcacca gcacccccgg caaccccctc accccccact actggaccca 36660ccagacccgc caacccgtcc actgggcgga caccatccac accctccaca ccaacggcgt 36720caccacctac atcggactcg gacccgacca caccctctcc accctcaccc accacaacct 36780cccccaacac caacccaccg ccatcaccct cacccacccc caccacaacc ccacccacca 
36840cctcctcacc gcactcgccc acacccccac cacctggcac acccaccacc acacccacac 36900caacccccac ccccacacca tccccgacct ccccacctac cccttccaac gccggcacta 36960ctggctggag gtcccgaagc cgactgccga agcatccgcc tcagccagtg gcccggggcg 37020gaaccgggcc gccaaactct cagcgctcga ggcggagttc tggcaggccg tcgaggaaac 37080cgacaccgac accctcgccc acaccctcga cctcgacacc cagaccctcg aacccgtcct 37140ccccgccctc gccacctggc accaacaaca acgcgaccac gcccgcatca acacctggac 37200ctaccaggaa acctggaaac cactccacct ccccaccacc cgacccacca cccccaccag 37260ctggctcatc gccatccccg aaacccaccg caaccacccc cacaccacca acctcctcac 37320caacctcccc caccacaaca tcacccccat ccccctcacc atcaaccaca ccaccgacct 37380ccaccacgcc taccaccacg cccaccacca caccacccca cccatcaccg ccgtcctctc 37440cctcctcgcc ctcgacgaaa caccccaccc ccaccacccc cacaccccca ccggcaccct 37500cctcaacctc accctcaccc aaacccacac ccaaacccac ccaccaaccc ccctctggta 37560cctcaccacc caagccacca ccacccaccc caacgacccc ctcacccacc ccacccaagc 37620ccaaaccatc ggactcgccc gcaccaccca cctcgaacac ccccaccaca ccggcggaca 37680catcgacctc cccaccacac cccaccccaa caccctcacc caactcatca ccgccctcac 37740ccacccccac caccaacaca acctcaccat ccgcacccac accacccaca cccgacgact 37800cacccccacc accctccaac ccaccacccc cacaccaccc accaaccccc acggcaccac 37860cctcatcacc ggcggcaccg gcgccctcgc caccaccctc gcccaccacc tcgccaccac 37920cggcacccaa cacctcctcc tcaccagccg acgcggcccc cacacccccg gcgcccgaca 37980actccacacc caactcaccc aactcggcac caacaccacc atcaccgcct gcgacctctc 38040cgaccccgac caactcaccc acctcctcac ccacatcccc cccgaacacc ccctcaccac 38100cgtcatccac accgccggca tcctcgacga cgccaccctc accaacctca cccccaccca 38160actcgacaac gtcctgcgcg ccaaagccca caccgcccac ctcctccacc acgccaccct 38220ccacaccccc ctcgaccact tcgtcctcta ctcctccgcc gccgccaccc tcggcgcccc 38280cggccaagcc aactacgcag ccgccaacgc ctacctcgac gccctcgccc accaccgcca 38340cacccacaac ctccccgcca ccaccatcgc ctggggaacc tggcaaggaa acggcctcgc 38400gagcggtgac atcggcgagc atctgcgccg ccgcgggatg atcccgctgg atcccgagtc 38460cgctgtcggt gccttcgacc gggcggtcgc gagcgatcgg cccagcgtct tcgtcgcgga 
38520catcgactgg cccaccttcg gccgcaacac ctccagcggt cttcgcgccc tcttcgagga 38580cattccggag gccacacagc ctgagccgac cgcccggagc gcggaccagc cgaacgggca 38640cggtagcctc caggaacttc tcgcccgcca gtccccggcc gagcaggccg aaacgctcct 38700ggcattggtc cggacgcatt ccgcgaccgt cctcgggcgt gacggggccg atgccgtcgc 38760cgccgaacgt cccttcaggg acctgggatt cgactcactg tccgccgtcg agctccgcaa 38820tcatctgacg gccgacacgg agctcgctct gccgacaacg ctggtcttcg atcacccgac 38880tccggtgaag ctcgcggagt tcctgcgcac cgagctgctc ggcaccgcac cagccaccac 38940cgccgccgtc ccggccctcc agtcccacac cgacgaaccc atcgccatca tcggcatggc 39000ctgccgcttc cccggcgccg tcaccacacc cgaacacctg tggaacctca tcgccaccga 39060acaagacgcc atcggcgagt tccccaccga ccgcggctgg gacctggaca acctctacca 39120ccccgacccc gaccaccccg gcaccaccta cacccgccac ggtggtttcc tctacgacgc 39180cggcgacttc gacgccgagt tcttcggcat caacccacgc gaagccctcg ccatggaccc 39240ccagcaacga ctcctcctgg aaaccgcctg ggaagccatc gaacacgccg gcatcctccc 39300cgacgccctg cacggcaccc ccaccggcgt cttcaccggc gtcaacgccc aggactacgc 39360cgcacacacc cacgcctccc cccacaccac cgagggctac accctcaccg gaaccgccgg 39420cagcatcgcc tccggccgca tcgcctacac cctcggactc gaaggccccg ccgtcaccat 39480cgacaccgcc tgctcctcct ccctcgtcgc cctccacctc gcctgccagg ccctgcgagc 39540aggcgaatgc accacagccc tcgccagcgg catcaccgtc atgaccagcc cggtcacgtt 39600caccgagttc tcccggcagc gagggctcgc ccccgacgga cactgcaagg cgttctccgc 39660ctcggccgac ggcaccggct ggagcgaggg cgtgggcacc atcctcgtcg aacggctctc 39720cgacgccgag cggaacgggc accggattct ggcggtggtg cggggcagcg cggtcaacca 39780ggacggcgcc tccaacggcc tgacggcgcc gaacggcccc tcccagcaac gcgtcatccg 39840ccaggccctg gccaactccg gcctgaccgg cgccgatgtc gacgccgtcg aagcccacgg 39900cacgggaacc aaactcggcg accccatcga agcccaggcc ctgctcgcca cctacggcca 39960gggacgtgcg caggaacagc cactgtggct cggctcggtc aaatccaacc tcggccacac 40020ccaggcagcg gcaggcatgg ccggcctgat caagatggtg atggcgctgc ggcacgagtc 40080gttgccgcgg acgttgcatg tggatgagcc gtcgccgcag gtggactggt cgtcgggtgc 40140ggtcagcctg ctgaccgagg cgcggccctg gccacgccgg gaggaccggc cccggcgagc 
40200gggaatctcg tccttcgggg tgagcgggac gaacgcgcac gtgatcctgg aggaggcgcc 40260cgcgccggcg gaggcggtgg agacggaaca gggtgtggtg ccgcagggcg accaggagtg 40320ttccgcgccg gtgggtgtgc cgttggtgtg ggtggtgtcc ggcaagtcgc aggccgcgct 40380acgcgcccag gccgccgccc tgcacgccca cgtcctggac caccccgaac aggacgcggc 40440cgacatcggc tacagcctgg ccaccacccg cgccctgttc gaccaccgcg ccaccctcat 40500cgcccccgac cgcgacaccc tcctggacgc cctcaccgcc ctggccgacg gccgcaccca 40560cccccacctc atccccacac cccccaccga acccggccac acccacaaaa tcgccttcct 40620ctgctccgga caaggcaccc aacgccccgg catggccacc ggcctctacc acacctaccc 40680cgccttcgcc gccgccctcg acgaaacctg cgcccacttc gacccccacc tcgaccaccc 40740cctgcgcgac ctcctcctca accacgaccc caccgacctc ctcacccaca ccctctacgc 40800ccaacccgcc ctcttcaccc tccaaaaagc cctccaccac ctcatcaccg aaacctacgg 40860catcaccccc cactacctcg ccggacactc cctcggcgaa atcaccgccg cccacctcgc 40920cggcatcctc accctccccg acgccaccca cctcatcacc acccgcgccc gcctcatgca 40980aaccatgccc cccggcacca tgaccaccct ccacaccacc cccgaacaca tccaacccct 41040cctcgaccaa caccccggca aagccaccat cgccgccgtc aacagccccc actccctcgt 41100catcagcggc gaccccgaca ccatccacca catcaccacc acctgccaca cccaaggcat 41160caccaccaaa cccctcacca ccaaccacgc cttccactcc ccccacaccg acaccatcct 41220cgaacaactc gacaccacca cccacaccct cacctaccac caaccccaca cccccctcat 41280caccagcacc cccggcgacc ccctcacccc ccactactgg acccaccaga cccgccaacc 41340cgtccactgg gcggacacca tccacaccct ccacaccaac ggcgtcacca cctacatcgg 41400actcggaccc gaccacaccc tctccaccct cacccaccac aacctccccc aacaccaacc 41460caccgccatc accctcaccc acccccacca caaccccacc caccacctcc tcaccgcact 41520cgcccacacc cccaccacct ggcacaccca ccaccacacc cacaccaacc cccaccccca 41580caccatcccc gacctcccca cctacccctt ccaacgccgg cactactggc tggaggtccc 41640gaagccgact gccgaagcat ccgcctcagc cagtggcccg gggcggaacc gggccgccaa 41700actctcagcg ctcgaggcgg agttctggca ggccgtcgag gaaaccgaca ccgacaccct 41760cgcccacacc ctcgacctcg acacccagac cctcgaaccc gtcctccccg ccctcgccac 41820ctggcaccaa caacaacgcg accacgcccg catcaacacc tggacctacc aggaaacctg 
41880gaaaccactc cacctcccca ccacccgacc caccaccccc accagctggc tcatcgccat 41940ccccgaaacc caccgcaacc acccccacac caccaacctc ctcaccaacc tcccccacca 42000caacatcacc cccatccccc tcaccatcaa ccacaccacc gacctccacc acgcctacca 42060ccacgcccac caccacacca ccccacccat caccgccgtc ctctccctcc tcgccctcga 42120cgaaacaccc cacccccacc acccccacac ccccaccggc accctcctca acctcaccct 42180cacccaaacc cacacccaaa cccacccacc aacccccctc tggtacctca ccacccaagc 42240caccaccacc caccccaacg accccctcac ccaccccacc caagcccaaa ccatcggact 42300cgcccgcacc acccacctcg aacaccccca ccacaccggc ggacacatcg acctccccac 42360cacaccccac cccaacaccc tcacccaact catcaccgcc ctcacccacc cccaccacca 42420acacaacctc accatccgca cccacaccac ccacacccga cgactcaccc ccaccaccct 42480ccaacccacc acccccacac cacccaccaa cccccacggc accaccctca tcaccggcgg 42540caccggcgcc ctcgccacca ccctcgccca ccacctcgcc accaccggca cccaacacct 42600cctcctcacc agccgacgcg gcccccacac ccccggcgcc cgacaactcc acacccaact 42660cacccaactc ggcaccaaca ccaccatcac cgcctgcgac ctctccgacc ccgaccaact 42720cacccacatc ctcacccaca tcccccccga acaccccctc accaccgtca tccacaccgc 42780cggcgtcaac cattacgctc ccgtggcggc gaccgacccg tccacgttcg cgtccgtcct 42840cgccgcgaag gcggccggcg cggcacacct gcatgaactc ctgctggagc tggacacggt 42900cgagcagttc atcctcttct cctccggttc gggggcctgg ggcagcggca accagtgcgc 42960gtacgcggct gccaacgcct acctcgatgc gctggcggcg caccgccagg cccgcggcct 43020gcctggcatg tcgctcgcct gggggccttg ggacggtgac gggatgtcgg ccggagagga 43080cgcccagcgg tacctccgtg agcggggcgt actgcccatg gatccgcggc tcgccgtcgc 43140ggccttcgac gaggcggtcc gggcgcggcc gaactccaac ctcgtcgtcg cggacatcga 43200ctgggagcgt ttcgtcccga cgttcaccgc gcggggccac aaccccctga tcgaggacat 43260ccccgaagtc cgccggctgg ccgcggaggc cgaggccgcc cagaccacga ccgccgccac 43320ggacgccccc gcccttctca accgactctc aggtctgtcg gccactcagc agaagcagca 43380tcttctccgg ctggtgcggt cacacatggg cgaggtcctc ggccgcgagg acgtcgacac 43440gctcgacgag cgccacacct tccgggacct gggcttcgac tcgctcacct cggcccgatt 43500cagccagcgg ctcgccaagg acacggggct gcaccttcct gccaccctcg tcttcgacca 
43560cccgacgccc gccgactgcg tggctcatct gcgggatcaa cttctgggtg aaacggacga 43620catgactccg aggaagcgag atcacctcgg ggaggaccgg cgggcggcca ccgcggacga 43680cccgatcgcg atcgtcggga tggcgtgccg gttcccgggc ggcgtgcggt ccgccgatga 43740tctgtgggac ctgctgtcgt cgggcaccga cgccatcagc ggcttcccca ccgatcgcgg 43800ctgggacatc gagagcctct acgaccccga ccccgaccgc tccggcacca cgtacacccg 43860ccacggtggt ttcctctacg acgccgggca gttcgacgcc gagttcttcg gcatcagccc 43920gcgtgaggcc ctggccatgg atccccagca gcggctcctt ctcgaaaccg cctgggaggc 43980cgtcgaacac gcaggcatca acccgcagac actccacggc acccccaccg gcgtcttcac 44040gggcgtcaac gcccaggact acgcagccca cctgcgccag gcgtcgggca acgtcgaggg 44100gtacgccctg accggaagct cgggcagtgt cgtgtcgggt cgggtggctt acaccttcgg 44160tttcgagggg ccggccgtct cggtcgacac cgcgtgctcg tcgtcgctcg tcgcactgca 44220cctcgcaggc caagccctgc ggtccggcga gtgcacgatg gccctcgccg gcggcgtcat 44280ggtgatgtcc tcccctgaga cgttcgtgga gttctcgcgg cagcggggtt tgtcggtgga 44340cgggcggtgc aagtccttcg cggccgcggc cgacggtacc ggctggggcg agggcgtggg 44400catgctgctc gtggagcggt tgtcggacgc cgagcgcaac gggcaccggg ttctggcggt 44460ggtgcggggc agcgcggtca accaggacgg cgcctccaac ggcctgaccg caccgaacgg 44520cccctcccag cagcgcgtga tccgccaggc cctggccaac tccggcctga ccggcgccga 44580tgtcgacgcc gtcgaagccc acggcacagg aaccaaactc ggcgacccca tcgaagccca 44640ggccctgctc gccacctacg gccaggaaca ccaccccgac cagccgctct ggctcggctc 44700cctgaagtcc aacatcggcc acgcccaagc agcggcaggt gtcggcggga tcatcaaaat 44760ggtgatggca ctgcgccacg agacgctgcc gcgcacgctg cacatcgacg agccgacccc 44820ccaggtcgac tggtcgtccg gcgcggtcag cctgctgacc gagccccgcc cctggccacg 44880ccagggggac cggccccgac gcgccggcat ctcctccttc ggagtcagcg gaaccaacgc 44940ccacgtcatc ctggaagagg cacccgccca gccggccggg gaccccgccc cagaagacgg 45000cgccccggtg ccctgggcga tgtcggcgcg ttcaaacgcc gcgctgcggg cacaggccgc 45060actcctgcgt gacttcctcc aaggccccgg caccgacacc gcactacggg cggtcggagc 45120cgaactcgcc catggcaggg ccgtcctgga acaccgcgcc

gtgatcgtgg cacgggaacg 45180gacagagttc gaagacgcgc tggaagcact ggcctcgggt gaaccgcacc ccgcactcat 45240cgaagacacg accggcagcc agaccaacag ccactccggt ggcggggtgg tgttcgtctt 45300ccccggccag ggcggtcagt gggccggcat gggactcgac ctgctgcgcg actcccaggt 45360gttcgccgac catgtcggtg cgtgtgaacg cgcgctggcg ccgtgggtgg agtggtcgct 45420caccgaaatg ctccaccggg acgcggagga tccggtgtgg gagcgggcgg atgtggtcca 45480gccggtgctg ttctcggtca tggtgtccct ggcggcgctg tggcggtcct acggcatcga 45540acccgaagcg gtggtcggcc actcccaggg cgagatcgcc gccgcccacg tctgcggcgc 45600actcaccctg gaggacgccg cgaagatcgt ggcactgcgc agccgggccc tggccgcgct 45660gcggggccac ggcggcatgg cctcactcgc cctgaccgga accgaggccg aggacctcat 45720caccacccac tggccaggac ggctgtggac ggccgcgttc aacgggccac gggccaccac 45780cgtctccggc gacaccgacg ccctggacga actcctcacc cactgcaccg aaaccggggt 45840acgggcccgc cgcatccccg tggactacgc atcccactgc ccccacaccg aaaccatcga 45900acacgacctg ctccacatgc tccacggcat caccccccag cccggcagca tcccgttcta 45960ctccaccgtc gaggacgcct ggaccgacac caccaccctg gacgccgcct actggtaccg 46020caacctgcgc cggcccgtcc gcttcaccca cgccgtccgc accctcaccg cccagggcca 46080ccgcctcttc atcgagacca gcccccaccc caccctgacc cccgccatcg aagaccacga 46140ccacaccacc gccctgggca ccctgcgccg ccacgacaac gacacccacc gcttcctcac 46200cgccctcgcc cacgcccaca ccaccggcca caccgtcacc tggaccaccc actaccccac 46260caccccccac acccccgcca tcgacctgcc cacctacccc ttccaacacc accactactg 46320gctccacaca cccaccacca gcaccggcga cgtctccgcc gccggactgc accccaccga 46380gcaccctctc ctcggcgcca ccgtggaact cgccgacgga gacggaacct tgctcaccgg 46440gcgcctgtcc ctgcacaccc acccctggct cgccgaccac agcgtcggcg gcatcgtcct 46500cctccccggc accgccctcc tcgaactcgc cctcgaagcc gggacgcgca ccggttgccc 46560ccacgtccag gaactcaccc tgcacacgcc cctggtgatt cccgagaccg gacacgtcgt 46620cttccagctg acggtctcgg caccggacga gaccgggcag cgcccgttca ccgtccattt 46680ccgttccgag gccgtcaccg gcgcggacga tccggcggac cggacctgga cgcggtgcgc 46740caccggtgcg ctctcgaccg cggccgcccc cgatcactcc gaagccgcca cctggccgcc 46800gccgtccgct cagccgctgg acctcgacgg tctgtacgac cgcatggcgg 
aggcgggtct 46860ggtctacggt ccggtgttcc aggggctccg cgaggcttgg ctcgatggcg aggacatcgt 46920cgccgaggtg cgcctgccgc aggaggcggc cgccgacacg cagggcttcg gcctgcatcc 46980cgccctgctc gacgccgctc tgcatgtgac ggcgctgacc tcacaggccg gtacagcgga 47040cgaagacgcg caggaacggc gtcggttgcc gttcgcgtgg gccggtgtct ccctgttcgc 47100cagggagtgc gcggcgctgc gtgtgcgggt ggcgccgtgt gcgccgcacc cgggggacgc 47160cgtggcgatc acagccaccg acgaggacgg ccgtccggtg ctggcggtgg aatcgctcac 47220cctccggccc gtctcccccg accagttgcg ggcggcggcc ccggccgccg ggcgggattc 47280gctgttccgc ctggagtggg taccggtcac ggcctccgcc tccgcctccg cccggccgac 47340cgggccctgg gccgccatcg gcaccggtcc ggcggtggcc ggcctggccg gccacgcaga 47400cctgacggtg tacgcggagg ccggcgatct gctccgggat ctggacggag gggcccccgc 47460gcccgctgtg gtcgtgctca gcgtcacgcc cgatgccgac gaattcgcca ctccccgtgc 47520ggcgaccggc cgggccctct ccgtccttca ggcctggctg gcggacgagc gcctggccga 47580cagccggctc gtggccgtca cttctggggc ggtcgtcgcc gcgcccgggg acgacacggt 47640cgacgtcccg ggtgccgccg tgtggggctt ggtgcgttcc gggcagtccg agcacccgga 47700ccgcatcacg ctgctcgact gtgcgagcgg cgcccggccc gggccggacc tcgtcgccgc 47760cgccctcgcc tcgggcgagc cgcagctcgc cgcccgcgcc ggggtcctct acacgccccg 47820gctggccagg ccgcaccgcg acgcctcggc cgtaccgcgg tcgctgccgt cccacggcac 47880cgtgctcatc accggcggca ccggtctgct gggcgggttg gtcgcccggc gcctggtgga 47940ggcgcacggt gtccgccgcc ttctcctggc cggccgcagg ggtccggcgg cggaggggct 48000ggactcgctg acgtccgagt tgcgtgagcg cggggcgacc gtcgaggtcg ccgcgtgcga 48060cgcggccgac cgcacacagt tggaggcgct gctggccggg gtgcccgagg agcatcccct 48120gtccgcggtc gtgcacgccg cgggtgtgct cgacgacggg gttctcacgt ccctgacgaa 48180cgagcggctg ggagctgtcc tgcgggcgaa ggcggattcg gcgctgcttc tgcacgagct 48240cactcaggac ctcgacctgt ccgccttcgt cctgttctcc tccgccgccg gcgtcctcgg 48300ctctcccggc cagggcagct acgccgccgc caacgccgtg ctcgacgcac tcgcccacca 48360gcgcagcgcc gccggtctgc ccgctctctc cctggcctgg gggctgtggg cggagggcag 48420cgggatgacc gggcacctcg acgccgacga ccgctcccgg atcaaccggg ccggtatggc 48480gccgctcccg acgcccgatg ccctggatct gttcgacgcc gcgctgtcgt cggacgaacc 
48540cttcctggta ccggctcgct tcgacctttc cgccgtacgc accaggaccg cgtacggccc 48600gctcccgccg ctgctgcgcg gcctggtccg gacctcgggc gcgcaccggg tccggggcgc 48660agtcggcgaa gcccgggcgg ccggcgtgga cgaggccgga cggctgcggg aacggctggc 48720ccgccagagt gacgccgaac gccggaacac cttgctgcgg ctcgtgcagt cgaacgtcgc 48780ggcggtgctc ggtcaccgcg gcacggggac cgtcgccgag acacgcgcct tccgtgagct 48840gggcttcgac tcgctcacgg cggtggagct gcggaaccgg ctgaaggtcg ccacagggct 48900ggcgctgcgg gccacggtcg ccttcgactt cccgactccg gcggcgctgg ccgagcatct 48960gggtgcccgc ctgcttccgc cggacggcgc cgtgtccgag gcggtgggcg agaaggagct 49020gcgcgggctc ttgacgtcga tcccgatcgg ccggctgcgg gaggcggggc tgatcgaccg 49080cctcctggcg ctcgccgctg cggcgccaga ctccgccgat cagacggcgg agcagccctc 49140ccggtccgtg tcggtcgagg acatcgacgc catggacgtc gacagcctca tcggcctggc 49200ccacgacacc ggcaccgact ccggtcacgc cccctgcgag ggctgacctc cacttcacgg 49260atgcgagaga cgacatgacg cagattccgc caaccggtca cgacgccgtg gcagccgggc 49320ccgcccccgg cgccgcggaa cagaaacgag gacggaaacg gaaaccagga cgggagcccc 49380ggccagagca tcgacgggaa caggaacgag ggcagggagc agggctgggg caggggcagg 49440aacgcgcgcg gcccgcggac ggtggtcggc ggctcgtgct tggctgggcg gcgctcggcg 49500cggtgtgcct ggccctgcag gcgtacgtgc tcgtccgctg ggcggccgac ggtgggtatc 49560gcctggtgga cgtacccggt gagggcggcg cggagcgtgg ccaccgaagg gtcctcgaca 49620tcgtgttccc ggcgctgtcg gcggcaggtg tcgtggggct ggcgctgtgg ctctaccgca 49680ggtgccgcgc ggagcggcgg gtgtcgttcg acgccctgct gttcgccgga gtgctgttcg 49740cgggctggct gagcccgctg atgaactggt tccatcccgt cctggtctcc aacacgcacg 49800tgtggggcgc ggtcggctcc tgggggccgt acacgccggg atggcagggg tccgcccccg 49860ggatggaggc cgagctgccg ctggtgacgt tcagcgtgtg ctcgacagcg ctcctgggtg 49920tgctggcctg ctgtcacgtg ctgtcccgcg tccgggaccg gtggcccggg gtccgcccgt 49980ggcaactgat cggggtggcc gtcgccaccg cggtggccct ggacctgtcg gagccggcga 50040tctccttgat cggtctagtc cgtctggtcg aaggcgctgc cggaggtgtc gctgtggagc 50100ggtgcctggt accagttccc tctgtaccag ctcctgaccg cggccctggc cagcgggttg 50160ctgagcgcgc tccggttctt ccgcgacgag cgggacgaga cgctggtgga gcgcggtgcc 
50220tggcgcctgc cgggccgtgt ccgcctctgg gcgcggttcc tggccgtcgt cggcggcgtc 50280catgtcgtga tgggcggcta tacggccctt catgtgctgc tctcgttggt cggcggccaa 50340ccgccggacg cgttgccggg gttcttccgt ccgccggccg tctactgagg gcggggcgga 50400cggcacgcaa cgaggggagg ggccggcgtc tcatgctctg ctgtccggtc agacctcagc 50460gcgctggcac ggcgcggtca ggacgacgta cccgatgtcc tccgtgtacc actggctgca 50520cttgcccacg aacgtctcca ggtcgtccgc cgtcatctcc agggcttcgg cgtaggcgtg 50580ggcgttggcg cgtacgtcgt cacccatcgc gctgtacgac ggggcgatga cgtgctcacc 50640gatgtcggtg agctcggtca gccggagtcc ggcgtcgctg atcatcccgg cataggcggt 50700gatggggatc agcgagggga cggcgagttc gctggacgac cagtccgccc ccgtcggctg 50760tgatgcgcgg agtgtgacgt ccatggccgc cagccgccca ccggggcgca gcacacgggc 50820catctcctgg aacacccggg ccgggtcggg catgtgcagc aggcactcga gggcccagac 50880ggcgtcgaag gaggcgtcgg ggaagggcag gtccatggcg tcggcgcact cgaagcggac 50940ccggttcgcg agtccggacc gctcggcgag cgcggtggcc agctcgacct gccgggggct 51000gatggtgatg ccgacgatgt ccaccggctc gctgtgcgcc aggcgcaggg ccggccggcc 51060ggaaccgcag ccgacgtcca gcacacgtct gaccgggcgc ccggtgtgtt cccgaagctt 51120gccgatcata tggtcggtga ggcggtcgga ggcctggccg agtgtgctgc cgtcgtccgg 51180gtgcggccag tatccgaggt gcgtgttgcc gcccagggcc cggttcaaca ggctggtcat 51240gcggtcgtag tagtcaccga cgtccgcggg ggtcggtgat ccctggtgag gcgccttggt 51300catggttccg gcagctcctt cggtcgtgcg gcggcctcaa gggaggcgtc cgcgggggcg 51360tggccgcgag ggatggcggg ggtcctgggc tcggctatca tccgcaggcg gtcggggaag 51420acgtgggtcg ccttggcgac cgggcggacg cggtcgccct tgaggggacg cagacgccag 51480cgcgaggcga tgaccgcgac ggcgacggcc gtctccatga gggcgaagtt gtcgccgatg 51540cacttgtagg tgccgagcgc gaagggaacc caggcgccct tcggaacgtc gcgcgtggtc 51600tctttcgact cccagcggtc ggggtcgagc ttctccggat cacggtacca gcgggggtca 51660cgctggagcg cgtacgagct gtacatgatt tccacgtcgg ccggcagctc gtgttccccg 51720agccggacgg ggcgcaccgt gcgccgcgag cccacccagc cggggtactt gcgcagcgcc 51780tccttgacca ggcgctgggt gtacgggagg cgcgggaggt ccgcgctggt ggggagccgg 51840cctccgagga cggtgtcgat ttcggcgtgc agcctctgtt cgatgaggtg gtcgtgagcg 
51900agttcgtgga agatccacgc ggtgagagcg gccggcccac cgattccggc gaccgcgagc 51960cccatgatct cgttgtgcac ctcgtcgtcc gtcatggtgt tgccctcggc gtcccgcgcg 52020cgcagcatcg tcgagagcag gtcgccgtgg tcgcggccgt cggcgcggta ggcggtgacc 52080gcctcccgga tggcggcgct ggtgcggccc atgtggcgct tggcggcagt gggcagggag 52140gtgtagagct gcggggcgag cgcgctcagc ctggccacct tcaggatgtc gtgccccgtg 52200gtgcgcagtt ccgcctcggc cgccgcaccc aggtcggact ggaacaacgc cttcgtgatc 52260atggccagtg agaggtcgct cgccatcttc gggacatcca cgacctggcc cggccgccag 52320gaatcggcgg tctcctcggc ggcggcggac atgctgatga cgtagtggtc gagcttgccc 52380cggtggaatc cgggttgcat catccgccgc tggcggcggt gcgagtcccc ggagacggcc 52440acgaggatgg ggccgatgaa ccggctggcg cccgccgcgc ccttgctgcg ggtgaagtcc 52500gccgcgccgg acaccagcat ggtccgcacg atttcggggt gggtggcgag gtagacggtg 52560ttgtggccga ggcggatgcg gaagaggtcc ccgcgttccg tgacggcgga caggaagccc 52620agggggtcgc ggaggagggc cggcaggtgg ccgaggaccg gccaggcacc gggggcctcg 52680gggatggtcg acggaggtga ggacactgtt gctcctgagg ggagggccgg gcgagtcggc 52740gtggggtggg gtgaggtgtg cggtcgggca ggtggtcgcg tcgccggtgg tcggcgacgg 52800gtggtgggtc agggggatcc ggtttcctgg tcgatgagcg cgaacatctc ctcgtccgtc 52860gcctccccga ggtcgggacg cggcgcctcc tccccgccga gcacctgggc gagtgagcga 52920agccgcgacg ccagccggga ccgggcttcc tctcccaggc cctgcgcccc ggggagcggc 52980gacgcgacgg aggagagcac cgcttccagc cggccgatct ccgcgaagag ggactgctcg 53040ggcggcgccg tggccgcgtc gtcggggagg agccgggtca gcaggtgccg ggtgagcgcg 53100gtggcggtgg ggtggtcgaa ggcgagggtg gcgggcaggc gcaggccggt ggcgcgggag 53160agccggttgc ggagttcgac ggcggtgaga gagtcgaagc cgaggtcccg gaaggccgag 53220tccgcaggga ccgcctccgg cgtctgatgg ccgaggaccg tggcgatctg ggtgcgcacc 53280acggtgaaca gggtgtcgtg ttgttgctcg ggggtcaggg tggcgaggcg gtcggcgagg 53340gagacgtcct ggcctgcgcc tgcactggtg ccggtgtgtg cggtgcgggg gctggtgcgg 53400gcgggcgcga ggtgttccag aaggggcggt gcggggtggg tggggcgtag gtcggcgggc 53460aggagtgcgg gacgtccggt gaccagggcg gtgtcgagga gggcgagggc gtcgggggtg 53520gtcagggggt gcagccccga gcgggtgatg cggtgacggt caccggcgtc cagatgcccg 
53580gtcatcccgc tggcctcttc ccacagtccc caggccaggg agagggcggg caggccggcg 53640gcgcggcgct ggtgggccag ggcgtccagg gcggcgttgg cggcggcgta gttgccctgc 53700cccggcgagc ccaggacacc ggccgcggag gagaacagca cgaacgccga caggtccatc 53760cccgcggtca gctcgtgcag atgcagggca ccgtccacct tcgccccgac caccgcatcg 53820atcttctccc ggttcagaca ggccacggtg gcgtcgtcca ggacaccggc cgtgtgcacc 53880acagccgtca gcggatgctc cgcgggcacc tgctccagca gggcggcgac ctgggcgcgg 53940tcggcgacat cgcacgccgc caccgacacc gacacccctg cctgacccag ttccgcacac 54000agttcttcgg caccggtggc ggccatgccg cgccggctca ccagcagcag atgccgcacc 54060ccgtgcccgg cggccagatg gcgcgcgacc gccgctccca gagtgccggt cccacccgtc 54120accagcaccg tcccctccgc atcgaaccgg acggcatccg acgagtcggc gggcaccggc 54180acatgtgcca agcgcgccgc cagcagtcgc ccgccacgca ccgccaactg gggttcgtcg 54240caggccagag ccgccgtgac ggtggtgtcg tcggagaggt cggcatccag caggacgaac 54300cgccccgggt gttccgactg cgccgaacgc agcaggcccc agaccgccgc tccggcgacg 54360tccgtcacct cctcaccggt ccgggtggcc acggcaccgt gtgtcaccac cacgagccgg 54420gcctcggcaa ggcgatcgtc ggccagccag tcctgcacca cgctcagcac ttcgccgagg 54480acgtccgcca ccgcgccctg ggagcacgtc agcagcacgg cgtcggggac gggggcgtcg 54540tcggtgtcca gcccggacag caggccggag aggtccgccg ccgcccggtc atgggtgagt 54600acggtcgacc ggacggccgt gtccgggggc ggggtgcccg gtgtgacgtc cttccaggcc 54660acgtcgaaca gcgccgcgcg gccggccgcc tgggcagagg ctcgcagttc gccggtgtcc 54720agcggtcgta cggcgagaga gtcgaccgac aacacgcccc ggccggtttc gtcggccagc 54780gacacggaga cggcggtccg ttcgccgtcc cgcccggccg gcgccacccg gacccgtacc 54840gctgccgcct tcaccgcgtg cagggtcaca ccactgaacg agaacggcac cgcccccggc 54900ggcaggcccg tcgccgctcc gagcgccacc gcgtgcaggg cggcgtccag cagagctggg 54960tgcaggttgt accgggacgc ctcgtcgagc acggactccg gcaggcggac ctccgcgaag 55020acctcttcgc cccgccgcca agccgcacgc agcccccgga acgccggtcc gtaggcgaac 55080ccgcgggcct cctgtgccgc gtagaagctt tccagttcgt ccgccgcgca cggcagggcc 55140ccttccggcg gccagctgcg cagggcgtcg ccgtcggcgg agggctgggc gtccagcagg 55200ccggtggcgt ggtgctgcca gggatcctcc gggcgggcgt gctcgctccg ggaggagacg 
55260gtgagggtgc gggccccggt gtcgtcgggc gccgacacgc ggacctggag gtcgacggcc 55320gcgtcgtgcg ggacggcgag gggcgcgtga agggtgagct ctcgcacgtg cgcggcaccg 55380ccggcttgga gggcgagttc gaggagggcg gtgccgggga ggaggacgat gccgccgacg 55440ctgtggtcgg cgagccaggg gtgggtgtgc agggacaggc ggccggtgag caaggttccg 55500tctccgtcgg cgagttccac ggtggcgccg aggagagggt gctcggtggg gtgcagtccg 55560gcggcggaga cgtcgccggt gctggtggtg ggtgtgtgga gccagtagtg gtggtgttgg 55620aaggggtagg tgggcaggtc gatggcgggg gtgtgggggg tggtggggta gtgggtggtc 55680caggtgacgg tgtggccggt ggtgtgggcg tgggcgaggg cggtgaggaa gcggtgggtg 55740tcgttgtcgt ggcggcgcag ggtgcccagg gcggtggtgt ggtcgtggtc ttcgatggcg 55800ggggtcaggg tggggtgggg gctggtctcg atgaagaggc ggtggccctg ggcggtgagg 55860gtgcggacgg cgtgggtgaa gcggacgggc cggcgcaggt tgcggtacca gtaggcggcg 55920tccagggtgg tggtgtcggt ccaggcgtct tcgacggtgg agtagaacgg gatgctgccg 55980ggctgggggg tgatgccgtg gagcatgtgg agcaggtcgt gttcgatggt ttcggtgtgg 56040gggcagtggg atgcgtagtc cacggggatg cggcgggccc gtaccccggt ttcggtgcag 56100tgggtgagga gttcgtccag ggcgtcggtg tcgccggaga cggtggtggc ccgtggcccg 56160ttgaacgcgg ccctccacag ccgtcccggc cagtgggtgg tgatgaggtc ctcggcctcg 56220gttccggtca gggcgagtga ggccatgccg ccgtggcccc gcagcgcggc cagggcccgg 56280ctgcgcagtg ccacgatctt cgcggcgtcc tccagggtga gtgcgccgca gacgtgggcg 56340gcggcgatct cgccctggga gtggccgacc accgcgtcgg gttcgatgcc gtaggaccgc 56400cacagcgccg ccagggacac catgaccgag aacagcaccg gctgcaccac atccgcccgc 56460tcccacaccg gatcctccgc gtcccggtgg agcatttcgg tgagcgacca ctccacccac 56520ggcgccagcg cgcgttcaca cgcaccgaca tggtcggcga acacctggga gtcgcgcagc 56580aggtcgagtc ccatgccggc ccactgaccg ccctggccgg ggaagacgaa caccaccccg 56640ccaccggagt ggctgttggt ctggctgccg gtcgtgtctt cgatgagtgc ggggtgcggt 56700tcacccgagg ccagtgcttc cagcgcgtct tcgaactctg tccgttcccg tgccacgatc 56760acggcgcggt gttccaggac ggccctgcca tgggcgagtt cggctccgac cgcccgtagt 56820gcggtgtcgg tgccggggcc ttggaggaag tcacgcagga gtgcggcctg tgcccgcagc 56880gcggcgtttg aacgcgccga catcgcccag ggcaccgggg cgccgtcttc tggggcgggg 
56940tccccggccg gctgggcggg tgcctcttcc aggatgacgt gggcgttggt tccgctgact 57000ccgaaggagg agatgccggc gcgtcggggc cggtccccct ggcgtggcca ggggcggggc 57060tcggtcagca ggctgaccgc gccggacgac cagtcgacct ggggggtcgg ctcgtcgatg 57120tgcagcgtgc gcggcagcgt ctcgtggcgc agtgccatca ccatcttgat gatcccgccg 57180acacctgccg ctgcttgggc gtggccgatg ttggacttca gggagccgag ccagagcggc 57240tggtcggggt ggtgttcctg gccgtaggtg gcgagcaggg cctgggcttc gatggggtcg 57300ccgagtttgg ttcctgtgcc gtgggcttcg acggcgtcga catcggcgcc ggtcaggccg 57360gagttggcca gggcctggcg gatcacgcgc tgctgggagg ggccgttcgg tgcggtcagg 57420ccgttggagg cgccgtcctg gttgaccgcg ctgccccgca ccaccgccag aacccggtgc 57480ccgttgcgct cggcgtccga caaccgctcc acgagcagca tgcccacgcc ctcgccccag 57540ccggtaccgt cggccgcggc cgcgaaggac ttgcaccgcc cgtccaccga caaaccccgc 57600tgccgggaga agtcgatgaa ggtgcccggt gaagacatca ccgtcacgcc gccggcgagg 57660gccatcgagc attcgcccga tcgcagggct tggcctgcga ggtgcagtgc gacgagcgac 57720gacgagcacg cggtgtcgac cgagacggcc ggcccctcga aaccgaaggt gtaagccacc 57780cgacccgaca cgacactgcc cgcgtttccg ttgccgatgt agccctccgc cccttcggga 57840acggcggtca aacgggcggc gtagtcgtgg tacatcacac ccgcgaacac gcccgttcgg 57900gaaccgcgta cggcagcggg atcgatcccc gcgtgttcga gggtctccca gacggtttcg 57960aggaggagcc gctgctgggg gtccatggca agggcctcac gcgggctgat accgaagaac 58020tcggcgtcga actgcccggc gtcgtagagg aaaccaccgt gccgggtgta cgacgctccg 58080gcccgctccg ggtccgggtc gaacagcccg gccaggtccc acccgcggtc ggccgggaac 58140tccccgatcg cgtcaccgcc cgaagccacc agcccccaca actcctccgg cgaccgcaca 58200ccgcccggga agcggcacgc catcccgacg atcgccagcg gctcgtcact gccgacggct 58260gtggtttcgg cgtacggcga agtgctgtcc gcggcgtcgt ccccgagcag ttccgtgcgc 58320agcaggcggg ccacggccgc ggggctgggc tggtcgaaga ccaggctcgc cggcagtcgc 58380agtcccgtct ccgcgctcag gcggtttcgc agatccacgg ccgtcaggga gtcgaagccg 58440aggtcgcgga aggccgagtc gaccgggatg gcttccggtg cttggtggcc gaggacggtg 58500gcgacatgcg agcggaccag cccgagcagg gcctggtact gctgttcggg tgtccgtccc 58560gcaagccgtg cccgcagcga cgcaccgctg tcagtggtgg ggagggtggt gcggtggctg 
58620gtgcgggcgg gcgcgaggtg ttccagaagg ggcggtgcgg gatgggtggg gcgtaggtcg 58680gcgggcagga gtgcgggacg tccggcggcc agggcggtgt cgaggagggc gagggcgtcg 58740ggggtggtca ggggatgcag ccccgagcgg gtgatgcggt gacggtcacc ggcatccaga 58800tgcccggtca tcccgctggt ctcttcccac agtccccagg ccagggagag ggcgggcaga 58860ccggcggcac ggcgctggtg ggccagggcg tccagggcgg cgttggcggc ggcgtagttc 58920ccctgccccg gcgagcccag gacacctgcg gcggaggaga acagcacgaa cgccgacagg 58980tccatccccg cggtcagctc gtgcagatgc agggcaccgt ccaccttcgc cccgaccacc 59040gcatcgatct tctcccggtc cagacacgtc acggtggcgt cgtccaggac accggccgta 59100tgcaccacag ccgtcagcgg atgctccgcg ggcacctgct ccagcagggc ggcgacctgg 59160gcacggtcgg cgacatcgca cgccgccacc gacaccgaca cccccgcccc acccagttcc 59220gcacacagtt cttcggcacc ggtagcggcc atgccgcgcc ggctcaccag cagcagatgc 59280cgcaccccgt gcccggcggc cagatggcgc gcgaccaccg ctcccagagt gccggtccca 59340cccgtcacca gcaccgtccc ctccgcatcg aaccggacgg catccgacga ctccgacagc 59400ggcggcaccc gcttcaaccg tggtacgcga accaccccgt cgcgtacggc gagttggctc 59460tcaccgcggg ccagagcgga cgcgacggcg gcatcgtcga gaccggcgcc ggtgccgacc 59520ttcgcgtcgg cggacacgtc ggtgtccccg tcggtgtcgg tgtcggtgtc ggtgtcgggg 59580tcggtcttgg tgtcggggtc gaggtccagc aggacgaacc ggtcgggatg ctcggactgg 59640gccgagcgga ccagccccca cacagcggcc cccgccacat cccgcaccgg ctcacccgca 59700tccaccgcga ccgcaccacg cgtcaccacg accagccgcg catccccctc ccgctcatcg 59760gcgagccact cccgcaccac acccaacgcc gcggccgtga cctcggccac cccaccaccc 59820gggcactccc acgccaccaa ctccccgccc ccaccggcca ccgaggccgg caccccctcc 59880accggcaccc accccagctc gaacaacgac ccacgccgca cccgagcccc cagcccctcc 59940aacggcaccg gacgcatcag gagcgactcg agggtgagaa cgggtgctcc ggtctcgtcg 60000gtggcgtgca ggcgaacggc ggtcgaggtc gcgtcggtgg gagtcatgcg caggcgcagg 60060acggtggcgc cccgggcgtg gagcgagaca ccgttccatg tgtgagggac gagcccggcc 60120tgctgctggt ccgccagcag cagggtgacc gattgcacgg ccgcgtccag gagcgccgga 60180tgcaccccga agcccgcgac atcggccagc ccctcgtccg
gcagacgcac ctcggccacc 60240acgtcgtccc cgtcccgcca ggccgcacac aaaccctgga acaccggacc gtagacaaaa 60300cccccacccg ccagacggtc gtagagaccg tcgagaacga ccggtcgggc gccgggtggc 60360ggccactccc cggccgccgc tgcctccgcg ttcggatcgt cttcggtgga cggggacagc 60420acaccctcgg catgccgtgt ccactctccg tccgtctcct cgtccccggc cggccgcgcg 60480tacacattca cggcgcgccg ccccgcctcg tccggcaccg acaccgacac ctgcaccacc 60540acgtgccccg actcgggaat caccagaggg gcgtggagag tgagctcgtc gacacgagga 60600cagccggtac gcagaccggc ctgaaaagcc agatccagga gggcggtgcc cgggaggagg 60660acgattccgc cgacactgtg gtcggcgagc caggtgtggg tgcgcaggga caggctgccg 60720gtgaggacga tcccgtcccc gtccgcgagc tccatcaccg cacccagcag cggatggtcc 60780ggccgctgga gcccggcagc cgacacatcg cccgcaccgg caccgggagt cgcctggagc 60840cagtagtgcc ggcgttggaa ggggtaggtg gggaggtcgg ggatggtgtg ggggtggggg 60900ttggtgtggg tgtggtggtg ggtgtgccag gtggtggggg tgtgggcgag tgcggtgagg 60960aggtggtggg tggggttgtg gtgggggtgg gtgagggtga tggcggtggg ttggtggtgg 61020gggaggttgt ggtgggtgag ggtggtgagg gtgtggtcgg gtccgagttc gatgtaggtg 61080gtgacgccgt tggtgtggag ggtgtggatg gtgtcggtcc agtggacggg ttggcgggtc 61140tggtgggtcc agtagtgggg ggtgaggggg tcgccggggg tgctggtgat gaggggggtg 61200tggggtgggt ggtaggtgag ggtgtgggtg gtggtgtcga gttgttcgag gatggtgtcg 61260gtgtgggggg agtggaaggc gtggttggtg gtgaggggtt tggtggtgat gccttgggtg 61320tggcaggtgg tggtgatgtg gtggatggtg tcggggtcgc cgctgatgac gagggagtgg 61380gggctgttga cggcggcgat ggtggctttg ccggggtgtt ggtcgaggag gggttggatg 61440tgttcggggg tggtgtggag ggtggtcatg gtgccggggg gcatggtttg catgaggcgg 61500gcgcgggtgg tgatgaggtg ggtggcgtcg gggagggtga ggatgccggc gaggtgggcg 61560gcggtgattt cgccgaggga gtgtccggcg aggtagtggg gggtgatgcc gtaggtttcg 61620gtgatgaggt ggtggagggc tttttggagg gtgaagaggg cgggttgggc gtagagggtg 61680tgggtgagga ggtcggtggg gtcgtggttg aggaggaggt cgcgcagggg gtggtcgagg 61740tgggggtcga agtgggcgca ggtttcgtcg agggcgtcgg cgaaggcggg gtaggtgtgg 61800tagaggccgg tggccatgcc ggggcgttgg gtgccttgtc cggagcagag gaaggcgatt 61860ttgtgggtgt ggccgggttc ggtggggggt gtggggatga ggtgggggtg 
ggtgcggccg 61920tcggccaggg cggtgagggc gtccaggagg gtgtcgcggt cgggggcgat gagggtggcg 61980cggtggtcga acagggcgcg ggtggtggcc aggctgtagc cgatgtcggc cgcgtcctgt 62040tcggggtggt ccaggacgtg ggcgtgcagg gcggcggcct gggcgcgtag cgcggcctgc 62100gacttgcccg acacgaccca caccaacggc acatccgccg acacccggcc cgcaccgtcc 62160gtctccacca ccaccgccgc agccgccgca gtctccaccc ccgccggcgc ggcttccacc 62220gcctccacct ccgcccgcgc gggcgcctcc tccaggatca cgtgcgcgtt cgtcccgctc 62280accccgaagg acgagattcc cgctcgccgg ggccggtcct cccggcgtgg ccagggccgc 62340gcctcggtca gcaggctcac cgctcccgac gaccagtcca cctgcggcga cggctcatcc 62400acatgcaacg tccgcggcaa cgactcgtgc cgcaacgcca tcaccatctt gatgatcccg 62460ccgacacctg ctgccgcttg ggcgtggccg atgttggact tcagggagcc gagccagagc 62520ggctggtcgg ggtggtgttc ctggccgtag gtggcgagca gggcctgggc ttcgatcggg 62580tcgcccagct tggtccccgt gccatgggct tcgacggcgt cgacatcaac tgcggagagg 62640ttcgcgttgg ccagggcctg gcggatcaca cgctgctggg acggaccgtt cggcgccgtc 62700agcccgttcg aggcgccgtc ctggttgacc gcgctgcccc gcaccaccgc cagaacccgg 62760tgcccgttgc gctcggcgtc cgacaaccgc tccagcagga gcatcccggc tccctccgac 62820cagccggtac cgtcggccga ggcggagaac gccttgcacc ggccgtccgc cgccaggccc 62880cgctgccgcg agaactccag gaacgcggta ggcgtggaca tcaccgtcgc accgcccgcc 62940aaggccatgg tgcactcgcc cgaccgcagt gcctgacagg ccagatgcag cgcgacgagc 63000gacgacgagc acgcggtgtc cacggacacg gcggggcctt cgagcccgaa cgtgtaggcg 63060acccggcccg acgccacgct tccggacgtg ccggtgagaa cgtacccgtc gacgtcggcg 63120gcgacatggt gccgtgagcg ggacgcgtac tcctgaggca tgacgccggc gaacacgccc 63180gtctggctgc cgcgcacggc accggggtcg atacccgccc gctcgaacgc ctcccacgtc 63240gtctccagca acagccgctg ctgggggtcc atcgcgagcg cctcgcgcgg ggagatcccg 63300aagaatcccg cgtcgaactc ccccgcgtcg tagaggaatc ccccgtgacg ggtgtacgag 63360gtgccccgct gcccgggctc cgggtcgtag agcgcctcca cgtcccagcc acggtcggcc 63420gggaactccc ccaccgcgtc gccgccggag gcgacgagtt gccagaggtc ctcggccgag 63480gcgacacctc ccggataccg gcatcccaca ccgatgatcg cgatgggctc gtgctgcccg 63540gccttcggtt cggcggcggc aggtgccgaa ggcgtcttgg tgtcgttggg gttgaggagg 
63600gtggtgaggt ggtgggtgag tgcggtgggg gtggggtggt cgaaggcgag ggtggtgggc 63660aggcgcaggc cggtggcgcg ggtgagccgg ttgcggagtt cgacggcggt gagggagtcg 63720aagccgaggt cgcggaaggt gcgttcgggg tcgatggtgt cgggggtggg gtggcccagg 63780acggcggcga tgtgggtacg ggccagggcc agcagggtgg cgtgccgctg ttcggaggtc 63840agggtggcga ggcggtcggc gagggagacg tcctggcctg cgcctgcact ggtgccggtg 63900tgtgcggtgc gggggctggt gcgggcgggc gcgaggtgtt ccaggagggg tggtgcgggg 63960tgggtggggc gtaggtcggc gggcaggagt gcgggacgtc cggtggccag ggcggtgtcg 64020aggagggcga gggcgtcggg ggtggtcagg ggatgcagtc ccgagcgggt gatgcggtga 64080cggtcaccgg cgtccaggtg gccggtcatc ccgctggcct cttcccacag tccccaggcc 64140agggagaggg cgggcagacc ggcggcgcgg cgctggtggg ccagggcgtc cagggcggcg 64200ttggcggcgg cgtagttgcc ctgccccggc gagcccagga cacctgcggc ggaggagaac 64260agcacgaacg ccgacaggtc catccccgcg gtcagctcgt gcagatgcag ggcaccgtcc 64320accttcgccc cgaccaccgc atcgatcttc tcccggtcca gacacgtcac ggtggcgtcg 64380tccaggacac cggccgtatg caccacagcc gtcagcggat gctccgcggg cacctgctcc 64440agcagagcgg cgacctgggc gcggtcggcg acatcgcacg ccgccaccga caccgacacc 64500cctgcctgac ccagttccgc acacagttct tcggcaccgg cggcggccat gccgcgccgg 64560ctcaccagca gcagatgccg caccccgtgc ccggcggcca gatggcgcgc gaccgccgcc 64620cccagagtgc cggtcccacc cgtcaccagc accgtcccct ccgcatccag gggcacaggc 64680agggtcagca cgttcttgcc gacatgcagg cccgaccgca tcgaccgcag cgcctggcgg 64740gcctggcgca cgtcccacgc ggtgaccggc aacggctcca gcaccccgcg ccggaacaga 64800tccaccaccg tgtgcaggat ctcccccacc cgctgcgcac ccgcgtccat caggtcatac 64860gcccggtagg acacccccgg gaaccgagcg gcgacctcac cggcatcacg gatgtcggtc 64920ttgcccagct ccaggaaccg gcccccctgc ggcgaacaca gccgcaacga ggcatcggtg 64980tactcacccg ccagacagtt cagcaccaca tccacacccc gcccgccact ggcccggcgg 65040aaacgcgact cgaactccac actccgcgag gaagcgatcc gctgcggcgc gacacccgcc 65100gcccgcagac gcgcccactt cgcctcactc gccgtcgcat acacctccgc ccccagatga 65160cgggcgagct gcaccgccgc cgtaccgacc ccgccggccg ccgcatggac cagcacactc 65220tccccccgcc gcacccccgc cagatcgacc agccccaggt aagcggtagc gaacaccacc 
65280ggcaccgaag ccgcctgcgc gaacgaccag ccctccggga tacgggccag caacacctcc 65340tgcgccacca ccaccggcgc gaacgcgtcc ccgaacaccc cgaacacccg gtctcccacc 65400accaggccct ccaccccggg ccccacctcc accaccaccc ccgcaccctc actgcccacc 65460cccacctgac ccggcaccat ccccaacgcc accagaacat cacggaagtt caccccggca 65520gcccgcaccg ccacccgcac ctgcccccga cccagcacca ccccagccgc atccgaagca 65580accacaccca ccccctccaa caaccccgac ccaccaccat ccagccgcca ccccacccca 65640ccaggcaacg acaacccttc acccgcaccc ccaagccgcc cccaccgctc caaccgctcc 65700aaccgcggca cccgcacgac cccaccacgc acggcaacct gtgcctcgcc acacgcgaca 65760aacccggcca catcgacacc agcaccaaca ccggcgccca tgtcctcatc ggcatcgacg 65820acggtctcca cgccggtgcc ggggtcgagg tccagcagga cgaaccggtc gggatgctcg 65880gactgggccg agcggaccag cccccacaca gcggcccccg ccacgtcccg caccggctcc 65940cccgcatcca ccgcgaccgc accacgcgtc accacgacca gccgcgcgtc cccctcccgc 66000tcatcggcga gccactcccg caccacaccc aacgccgcgg ccgtgacctc ggccacccca 66060ccacccgggc actcccacac caccaactcc ccgcccccac cggccaacga ggccggcacc 66120ccctccaccg gcacccaccc caactcgaac aacgacccac gccgcacccc cgcccccagc 66180ccttccaacg gcaccggacg cagaaccaga gactccagcg cgagcaccag cgcaccggtc 66240tcatcggcaa cccgaagact cacggtcgtt ccggccgcgt cgacagacgt cacccggact 66300cgcagtgccc tggcaccccg ggcgtggagg gaagcaccgt tccatgtgta aggcagcaga 66360ccggcctctt ggtcctcggg cagcaggagg gtgaccgtct gcacggccgc gtccaggagc 66420gccggatgca ccccgaagcc cgcgacatcg gccagcccct cgtccggcag acgcacctcg 66480gccaccacgt cgtccccgtc ccgccaggcc gcacacaaac cctggaacac cggaccgtag 66540acaaaacccc cacccgccag acgaccgtag aactcatcga gatccaccgg ctgcgcaccg 66600gacggcggcc acaccccgtc cgccaccggc tcaacaacca ccgactcccc aggaacagac 66660ggacacacca caccctcggc atgccgcgtc cactcaccct ccagccctcc gtcctccacc 66720agccgcccgt acacactcac accacgacga cccgcctcgt ccggcaccga aaccgacacc 66780tgcaccacca catgccccga ctccggaacc accagaggag catggagagt cagctcatcg 66840acaccaggac aacccgcacg cagaccagcc tgaaaagcca gctccagcag agcggtaccg 66900ggcagcagga cgacgccgcc gacgctgtgg tcggcgagcc aggggtgggt gtgcagggac 
66960aggcgcccgg tgaggacgat tccgtccccg tccgcgagct ccatcaccgc gccgagcagt 67020gggtggtccg gtcgctggag tccggcggcg gagacgtcgc cggtgctggt ggtgggtgtg 67080tggagccagt agtggtggtg ttggaagggg taggtgggca ggtcgatggc gggggtgtgg 67140ggggtggtgg ggtagtgggt ggtccaggtg acggtgtggc cggtggtgtg ggcgtgggcg 67200agggccgtga ggaagcggtg ggtgtcgttg tcgtggcggc gcagggtgcc cagggcggtg 67260gtgtggtcgt ggtcttcgat ggcgggggtc agggtggggt gggggctggt ctcgatgaag 67320aggcggtggc cctgggcggt gagggtgcgg acggcgtggg tgaagcggac gggccggcgc 67380aggttgcggt accagtaggc ggcgtccagg gtggtggtgt cggtccaggc gtcctcgacg 67440gtggagtaga acgggatgct gccgggctgg ggggtgatgc cgtggagcat gtggagcagg 67500tcgtgttcga tggtttcggt gtgggggcag tgggaggcgt agtccacggg gatgcggcgg 67560gcccgtaccc cggtttcggt gcagtgggtg aggagttcgt ccagggcgtc ggtgtcgccg 67620gagacggtgg tggcccgtgg cccgttgaac gcggccgtcc acagccgtcc cggccagtgg 67680gtggtgatga ggtcctcggc ctcggttccg gtcagggcga gtgaggccat gccgccgtgg 67740ccccgcagcg cggccagggc ccggctgcgc agtgccacga ccttcgcggc gtcctccagg 67800gtgagggcgc cgcagacgtg ggcggcggcg atctcgccct gggagtggcc gaccaccgcg 67860tcgggttcga tgccgtagga ccgccacagc gccgccaggg agaccatgac cgagaacagc 67920accggctgga ccacatcggc ccgctcccac accggatcct ccgcctcgcg gtggagcatc 67980tcggtgagcg accactccac ccacggcgcc agcgcgcgtt cacacgcacc gatatggtcg 68040gcgaacaccc ccgaggtcgt cagcagatca agtcccatgc cggcccactg accaccctgg 68100cccgggaaca cgaacaccac cccgccaccg gaatggctgt ggctgccggt cgcgtcttcg 68160atgagtgcgg ggtgcggctc acccgaggcc agtgcttcca gcgcgccttc gaactccgcc 68220cgctcccgtg ccacgatcac cgcgcggtgc tccagcacgg ccctgccacg agccaactct 68280gccccgatat cccgcacccc ggcatccgta ccggggccgc gcaggaactc acgcaagacc 68340atggcctgcg cccgcaacgc cgcacccgaa cgcgccgaca ccacccaggg caccggagcc 68400ccgtcctcta ccgcagcctc cccgggccga cgggcgggtg cctcctccag gatcacgtgc 68460gcgttggttc cgctcacccc gaacgaggac accccggccc gccggggccg gtcctcccga 68520cgcggccagg gccgcgcctc ggacagcagg ctcaccgccc ccgacgacca gtccacctgc 68580ggtgacggct catccacatg caacgtccgc ggcagcgact cgtgccgcag cgccatcacc 
68640atcttgatga tcccgcccac acccgctgcc gcctgggcgt ggccgatgtt ggacttcacc 68700gagcccagcc acaacggctg ttccccggaa cgctcctggc catatgtgtc gagcagggcc 68760tgggcttcga tcgggtcacc gagccgtgtg ccggtcccgt gcccctccac cgcgtcgacg 68820tccgccaccg tcagccccgc gttggccagt gcctcgcgga tcacgcgctc ctgcgagggg 68880ccgttcggtg cggtcaggcc gttggaggcg ccgtcctggt tgaccgcgct gccccgcacc 68940accgccagaa cccggtgccc gttgcgctcg gcgtccgaca accgctcgac cagcagcatg 69000cccacgccct cgcccatgcc ggtaccgtcg gccgcggccg cgaaggactt gcaccgcccg 69060tccaccgaca gaccccgctg ccgggagaac tccacgaaca ggagcggggt ggacatcacg 69120gtgaccccac cggcgagggc gagatcgcac tcgcccgtcc gcagcgactg gcaggccagg 69180tgcagcgcca ccagcgacga cgaacacgcc gtgtcgacgg tgacggccgg gccttccaga 69240ccgagcgtgt aggcgacgcg cccggaggcg acggcgccac cgctgccgtt gccgatgtag 69300ccctcgaacc cttcggggat ggtggcgagc cgggaggcgt agtcgtggta catcatgccg 69360gtgaagacac ccgctcgggc tccccggacc gaggaggggt cgatcccggc ccgctcgaag 69420acctcccacg aggtctccag cagcaagcgc tgctgggggt ccatggccag ggcctcacgc 69480ggactgatgc cgaaaagctc ggcgtcgaac tgtccggcgt cgtagaggaa accaccgtgg 69540cgggtgtacg acgctccggc ccgctccggg tccgggtcga acagcccggc caggtcccac 69600ccgcggtcgg ccgggaactc cccgatcgcg tcaccgcccg aagccaccag cccccacaac 69660tcctccggcg accgcacacc gcccgggaac cggcacgcca tcccgacgat cgccagcggc 69720tcctgtgccg cctcgacggc cgctgtgagc tgctggttgc gccgccgcag ggcctcattg 69780gccttcaggg atgcccgcag cgcctcgacg agcttctcgc tgggcgtagc catcggtgtc 69840tccaagtctg cgaatccggc aggtgcggac gcggtggtgt ggacggggcg ggggtcggcg 69900gggaccgcgg cgggcgactc gggtggtgtc agcgacgccg ctgctcggtg agcccggcca 69960gccaggtgtg gacgtgccgg gccgtcgact ccgcgtgctc ttcgagcatc gtgaagtggt 70020tgccgtcggt ttcgaggacg gtgtgcggct cgccccacac cggcggcggc tgttcgctct 70080cacgggcgcg gaggaagagg gtgggtgtct cgagggcggg cggccgccag cccgcgaaga 70140tgcggaagta gccgcccatc gccaccaggc gggcgtagtc caggtcgatg aactcggtga 70200cgcggtcgaa gatttcgctg gtgagggcgg cggcgacggg ggccatcccc tcgtcgggca 70260ggtaggcgtc catgaccacc acggcctgcg gccggacgcc caggtgttcc aggcggctcg 
70320tgacggtgtg ggtgaaccag ccgccggcgg agtgtccggc gagggcgaag ggctcgccgt 70380cggtgtggcg gaggatggcg tcggtgaaca gccgggtgat ggtgtcgacg tcggcgggga 70440ggggctcgcc gtcggcgaag ccgggcgccg gcacgtacca gacgtcgcgg agcccgtcga 70500gggccgccgc gaagcgggag tactggtaga cgctggacac ggcggcgacg gtgggcaggc 70560agatcagcgc gggcccggtg tcgccctggg cgacgcggac gaaggggggt cgggtcatag 70620ccgaggggtc ggtgaagcag ggccggaagg cggaggccgc cgacagcagg gccatggact 70680cctcgacgcg gccgctgtcg tgaccgatcc agaacagggc ttccaccgtg tcggcggacg 70740ggccgctccc ggctcgcgac gaggcggtgg catcgcgctc cccaggggcg ccggccgttt 70800cggcggtcat atcggaggcc ggctcggcgg cgaggagcct ttcgaggtgg tcggcgagcg 70860ctgccggggt cgggtggtcg aagacgagcg tggtggccag gcgcagcccg gtcgctgcgt 70920tgaggcggtt gcgcagttcc acggcggtca gggagtcgaa gccgaactcg cggaactcgc 70980cgtcggcggt gacggtgtcg gtgccgccgt ggccgaggac ggccgcggcg tgggtgcgga 71040ccacctccgt cagcagggcg gtgcgctcgg cgggcttcgg ggtcccggcc agtcgcccgc 71100ggagttcggc ggcggcgtcg gccccgacgc cgtggtcggc ggtccggcgg gccggggtgc 71160ggaccaggcc cctgaggacg ggcggcaggg tgccgacggc cgcctgctca cggagggtgc 71220ccgggtcgag gggggtggcg aggagcagcg gttcgtcgag ggcgagggcg gtgtcgaaca 71280gggcgagccc gtgggcgttg gtgagcggga gcaggccgct gcgggtcatc cgggcgacgt 71340cggcggcggc gaggtgctcg gccatgccgc ccgcctcggc ccagcgtccc caggcgaggg 71400agcggccggg caggccgagg gcgtgccgtt gctgcatcag agcgtcgagg aaggcgttcg 71460cggccgtgta gttggcctgt ccggggctgc cgaaggaggc ggcggcggac gagaaggcga 71520tgaacgcgtc gagcccggcg tcgcgggtga ggtcgtgcag gtgggcggcg ccgtgggcct 71580tggcgctcag gacggcgtcc aggcggtcgg gggtcaggga ggtgaggacg ccgtcgtcga 71640cgacgccggc ggtgtggagc acggccttga gcggatgccg tgccgggatc tcggccagca 71700gcgccgcgac ggcccgccgg tcggcgaggt cgcaggcgac ggccgtcgtc cgggcgccca 71760gctcggcgag ttcggcgacg agttcggcgg tgccgggagc ggtggggccg ctgcggctgg 71820tcagcagcag gtgccgtacg ccgtgggtga cgacgaggtg gcgggcgagg agccggccga 71880ggtagccggt gccgccggtg atgaggacag tggcgtcggg gtcccagtgt ccgctgtccg 71940ctcgggcgcc gaccgggatg cgggccagcc gcggggtgtg ggcgcggccc tcgcgcagga 
72000cggtctgcgg ctcaccggag agcagggccg cggccagggc gcgccggctg gcgtcggtgt 72060cgtcgaggtc ggtgaggacg aaccggccgg ggttctcggt ctgggcggag cggaccatgc 72120cccagacggc ggcgtgcgcg aggtcgggga cggagtcgcc cggtgcggcg gcgaccgcgc 72180cgtgggtgac gaacgcgagc cgggagtccg cgaaccggtc gtcggcgagc cagctctgca 72240gcaggtgcag gacgcggacg gtggcccgcc gggtggcgtc ggccgcgtcg gcggcgccgt 72300cgcggtgcgg gcaggggacg acgaccacgt cgggtacggg tgtgccggcc gaggccagtt 72360cctccagatc cgcgtatgtg ctccacggca cgccgggggc gtcggggcac tcggcttcgg 72420agccgatcag cgccaggcgc gtcttcgacg acggtgtcct gggcagcggt acgggcgccc 72480agtcgagccg gaagagggcg tcgtggtggg cggtgcgggc cgagtggagc tgtccggccg 72540tgacgggccg gaacgcgagt gactcggccg tgacgacggt gtgtcccgtg ctgtccgtgg 72600ccagcagggc gatcgtgtcg ggcgaccgcc gactgaggcg gacgcgcagc gccgatgccc 72660cggaggccgt gacggtgacg ccgctccagg agaagggcag ccagccgtgg ccctcgtccg 72720gctcgtcctc ggcgaagccg aggaccaccg ggtggagtgc cgcgtcgagc agcgccgggt 72780ggacggcgta gcggtcggcg tcgcccgacg gtccgtcggg cagtgcgacc tcggcgtaca 72840ggtcgtcccc gtgccgccag gcggcccgca gtccctggaa cgcgggtccg tatccgaggc 72900ccgcgtcggc cagtgtcccg taccagtggt cgaggtcgac ggggaccgcg tccgtgggcg 72960gccacggtgc ggcggtgtcg tgggcggtct ccgcgcgccg tgtcaggacg cccgtcgcgt 73020ggcaggtcca gccggtcccg tccgtgccgg tcggcgcggc gggggtgagt ccgtcgtctt 73080cccgcgcgta gagcgtgaag gggcgccgct ccaccccgtc cggcgccgtc tcggtcgcgc 73140cgacggagag ctgcaggacg accgatccgc gctccggcag gaccagcggg acctggagtg 73200ccagttcctc gacggtgtcg cagccgactt cgtcgcccgc gcggacggcc agttcgagga 73260tggccgtgcc gggcagcagg acggtgccga agacggcgtg gtcggcgagc cagggatggg 73320tgcgcagcga gatcctgccg gtgagcaggc attcctgcga ctgcggtgat ccggccaggg 73380gtacggcgga gccgagcagg gggtgtccgg ctgcggtgag tccggcggcc gagacgtccc 73440cggacaggct ggtgtcggcg tccagccagt agcggcggcg gtcgaagggg taggtcggca 73500ggtcgaggtg gcgggcgcgt tccggtgtcg cgccgatgag ggccggccag tggacggccg 73560tcccgccctt cggtgtgccc tgcacgtgga ggtgggcgag cgcggtgagc agcgccaggg 73620gttcggggcg gtcggcgcgc agcagcggga ccagggcggg accgggctcg gtggtgttgt 
73680cgtcggcggg caggcactct ccggcgaggg cgcagagggt tccgtccggg ccgagttcca 73740ggaaggtgcg taccccgtcg tcgtcgtgga ggcggcgtac ggcgtcgccg aagcgtacgg 73800tgcggcgtag ctggcggacc cagtactccg ggtcggtgag cgtgccggcc gtggcgcggt 73860cgccggtgac ggtggagacc accgggatcg tcggttcggc gtaggcgatg ccggtcgcga 73920cctgccggaa ctcctccagg atcggttcca tcagcgggga gtggaaggcg cggtcggtgc 73980tgaggcgttt ggtgcgcagg ccctgctcgg cgaaggcggc ggcggcttcc agtacgtccg 74040gctcggctcc ggagatcacc accgacgtgg ggccgttgac agcggcgacg gccacccgtg 74100cctcccggcc ggcgagcatc cgggtgacct gttcctcgct cgcgcggacc gccagcatgg 74160ctccgccggg cggcagttgt gtctgcgcca gccggccccg ggccgcgacc agccgggccg 74220cgtcggtcag tgagaggacg ccggcgacgt gcgcggcggc cagttcgccg atcgagtggc 74280cggcgacgtg gtccgggcgg atcccggcgc tctccagcag gcggaagagg gcgacctgga 74340gggcgaagag cgccggctgc gcgtcgccgg tgcggtccag cggctgcggc tcgtcgagga 74400ggagcgggcg caggggccgg tcgagatggg gttcgagctc cgcgagtacg tcgtccaacg 74460cctgggcgaa ggcggggtgg gcggcgtaca gctctcggcc catgccgggc cgttgggttc 74520cctggccgga gaagaggaag gcgaccttgc cgtgggcggt gcgggcgggg gattcgatca 74580ggccgggggc cgtggtgccc tcggccagtg cgtccagggt gcgcaggagt ccctcgtggt 74640cctcggcgac cagcacggcc cggcgttcga atgtcgaccg gccggtggcc agggcgtgtg 74700cgacgtcacc gatcgggatg tcggggttgg cggcgaggta gtcgcgcagc cgcctggcct 74760gggcgcgcag ggcggtgtcg gtcctggccg agaggagaca gggcaccgtc gcgggtccgg 74820cctcgtcctg cgacggtgcc tcctccggcc gtacctcttc ctcctgcggt gcctcttcca 74880ggatgacgtg tgcgttggtg ccgctcaccc cgaacgacga cacacccgca cgacgcggac 74940gctcaccccg ctcccacacc acctcctccg tcagcaaccg caccgcaccc gacgaccaat 75000ccacatgcgg cgacggctca tccacatgca acgtccgcgg cagacgaccc cgccacaacg 75060ccatcaccat cttgatcaca ccagccaccc ccgccgcagc ctgcgtatga cccagattcg 75120acttcaccga ccccaaccac aacggcaccc cacgaccccg cccatacgcc gccagcaccg 75180cctgcgcctc gatcggatca cccaacgacg tccccgtccc atgcccctcc accacatcca 75240cctcagcagc cgacaacccc gcacacacca acgcctgacc
gatcacccgc tgctgagacg 75300gaccattcgg cgccgtcaac ccattcgacg caccatcctg attcaccgca ctcccccgca 75360ccaccgccaa cacccgatgc cccagccgcc gcgcatccga caaccgctcc accaacaaca 75420cacccacacc ctcggaccac cccaccccgt ccgccgccgc cgcgaacgcc ttgcaccgcc 75480cgtccgccga caaaccccgc tgccgcgaga actccacgaa cgcccccggc gtcgacatca 75540ccgtcacacc ccccgccaac gccaactccg actcacccga ccgcaacgac tgacacgcca 75600gatgcaacgc caccaacgac gacgaacacg ccgtgtccac cgtcaccgcc ggaccctcca 75660acccgaacgt gtacgacaac cgccccgaca acacactccc cgacacaccc gtcagcgcat 75720acccctccag atcccggcca ccccgacgca ccaactccgc ataatcctga ttcgccaccc 75780ccgcgaacac acccgtacga ctcccccgca acgacaccgg atcgatcccc gcccgctcca 75840gcgcctccca ggacacctcc agcaacaacc gctgctgcgg atccatcgcc aacgcctcac 75900gcggcgaaat cccgaaaaac cccgcatcga actccgccgc acccgccaaa aacccacccg 75960accgcgtata cgacgacccc ggccgccccg cctccggatc gtaaagaccc tccacatccc 76020agccccggtc caccgggaag ccgccgatcg cgtccccacc cgaggcgacc aactcccaca 76080aatcctccgg cgaccacaca ccccccggaa aacgacacgc catccccacg atcgccaccg 76140gctcatccac cacaccgggt cgggccgcga cgggcggtgt cgccggggcg gttccgcaca 76200gctggtcccg gaggtgccgg gccaggacgg cggggcgcgg gtagtcgaag accagtgtcg 76260tgggaaggcg cagtccggtg gccgtgttga ggcggttgcg cagttcgacg gcggtgagcg 76320agtcgaagcc gagttcgcgg aaggcccggt cggccgggac ggtgtcggcc tcccggtggc 76380cgagcaccgc ggccgtgtgg gtgctgacca gttccagcag gacgtccgtc cgggcgtccg 76440gttccagggc ggcgagccga tcgcgcaggt cggtgccgtg ggcggtgccg gtgccggtgg 76500cgcgggtgag cgcccggacg tcggggaggt cgccgatcag gggcaggcgg ctgccggcgg 76560tgtgggcggt gaagcgctcc cagtcgatgt cggcgacggt caagccgctc tcgtcgtggt 76620ccagcacccg ggccagggcc gccagggcga gttcgggcgc catctccgtc agtccccggc 76680ggtgcagccg ggcggccgcg tccggccggc cggcggcgct gtgtccgcgc caggggcccc 76740aggccaccgc ggtggaaggc agtccgagac cgcgccggtg gacggcgaga gcctcgacat 76800aggcgttggc cgccacatag gcaccctggc caccggaacc gaacgtggcg gcggccgagg 76860agaacaccac gaacgccgaa agatccgcac cccgcgtcag ctcgtgcagg ttccgcgcac 76920ccaccgccct ggccgccagc accccctcca gccgctccgg cgtcaacgcg 
tccagcacac 76980cgtcgtccac cacccccgcc gtatgcacga ccgcccccag cggacaatcc tcgggaaccg 77040ccgtggccag cagctccgcg agcgcccccc gatcggccac atcacaggcg gcgatggtca 77100cccgggcgcc ccactggtcc gtcgtgtcgg cgaggccgaa gccgtcgccg gaggtactga 77160tcagcagcag gtgttcggct ccgcggtcgg ccagccatcg ggcgaggtgg gcggccggct 77220gctcggggtc ggcgttctcg ccggtgatca ggacggtgcc gcgcggccgc cacccaccag 77280cctccgctcc ccctcccgga gcacgcacca gacgccgcac gaacaccccc gacgcccgga 77340ccgcgacctc cccctcaccc ccgccccccg acagcacacc cagcaaaccc tccaccaccc 77400gctcgtcgac cacctccggc agatcgacca ccccacccca ccgatccggc aactccaacc 77460ccgccacccg gcccaaaccc cacaccacag ccccacccgg atcccccagc cgatccccct 77520cccccaccga caccgcaccc cgcgtcacac accacaacgg cacccccaac ccctccaccg 77580cctggaccaa ccccagcacc aacccggcaa cacccacccc acccccacac acagccagca 77640cccccaccgg cccctcacca ccccacacct cacgcaaccg ctcccccaac accacccgat 77700ccgcacaacc cccctccaca gccaccaccc gcacacacac ccccgcccac tccaaaccct 77760ccaccacagc agcagcaccc accaccccct caggcaccac caccacccac accccacccg 77820acaccacacc accacccgac cgcgacaccg gacgccacac cacccgatac cgccagccat 77880cgaccaccgc acgctcccga acaccccgac gccagtcgcc gagcgcggac accagcgcgt 77940cgagcggcgc gtcctcgtcc acggcgagca gcgcggccac ggccgccggg tcctctcgtt 78000cgacggcttc ccacagcggg ccgtcctccg tggtggccgg cgcggccggt gtctcctccg 78060ggtccagcca gtaccgctca cgctcgaagg cgtacgtcgg cagctccacc cggccggcgg 78120tcccggacgg tttgccgccc agtacggccg cccagtccac ccgtaccccg cgcacggaca 78180gctccgccgc ggaggccagg aagcgccgca gaccgccctc gccccggcgc agtgagccga 78240ccaccagggt gtcggcggcg ccgaggtcgt cgagcgtctc ctggaccgcg acggagaccg 78300cggggtgcgg gccggcctcg acgaagacgg tgtgcccgtc gcgggcgagg gcccgggtgg 78360cgtcccggaa ccggacgggc tcgcgcaggt tgcggtacca gtacgcggcg tcgagtgcgg 78420tgccgtcgac gggctcgccg gtgaccgtgg agtagagcgg gatgtcggcg gggcgggggg 78480tgacgggagc gagaaggccg agcaggtctg cgcggatcgc ctcgacctgc ggggagtgcg 78540aggcccagtc gaccttcagc aggcgggccg ggacgccgtc ccgggtcagg tcgtcgacca 78600gggcggtgac cgcgtccggg gagccggaga ccacggccga gcgggccccg ttgtcggcgg 
78660cgaccaccag gctcgggtcc acggcggcga gccgcggttc caggtcctcg gccggcagac 78720cgaccgaggc catggccccc tgtccggcga gcgcggcgag ggcctggctg cgcagggcgg 78780tgacgcgcgc cgcgtcctcc agggagaggg caccggcgac gcaggccgcc gcgatctcgc 78840cctgggagtg tccggcgacg gcgtcggggc ggacgccgta ggagcgccag agggccgcca 78900gggacaccat gaccgcgaag agcacgggct ggacgacgtc gacccggtcc agcggcgggg 78960cgtccggttc gccgcgcagg acgtcgagca gttcccagtc gaggtacgga cgcagggcgt 79020cggcgcattc ggtcatgcgc tgggcgaaga ccggtgaaga gtccaggagt tcggcggcca 79080tgccgtccca ctgggtgccc tggccgccga agagcagcgc gattttgccg tccgcctcgg 79140ggccggtgcg tccggccacg actccggccg tcggcaggcc ggtggcgagg gcgtcgaggc 79200cgtgccggaa accgtcgagg tcctcggcga gcacgaccgc ccggtgctcc agccacgccc 79260gctccaccgc cagcgcacgc ccgacctcca ccggagccgc ccccgcccca tcggcgaaca 79320cccgcaaccg ccgcgcctgc ccccgcaacg ccgactccga acgagccgac accacccacg 79380gcaccaccgc gggtccggcc tcgccctgcg acggtgcctc ctccggccgt acctcttcct 79440cctgcggtgc ctcctccaga atcacatgcg cgttggtgcc gctcaccccg aacgacgaca 79500cacccgcacg ccgcgggcgc tcaccccgct cccacaccac ctcctccgtc agcaaccgca 79560ccgcacccga cgaccaatcc acatgcggcg acggctcatc cacatgcaac gtccgcggca 79620gacgaccccg ccacaacacc atcaccatct tgatcacacc agccaccccc gccgcagcct 79680gcgtatgacc cagattcgac ttcaccgacc ccaaccacaa cggcacccca cgaccccgcc 79740catacgccgc cagcaccgcc tgcgcctcga tcggatcacc caacgacgtc cccgtcccat 79800gcccctccac cacatccacc tcagcagccg acaaccccgc acacaccaac gcctgaccga 79860tcacccgctg ctgagacgga ccattcggcg ccgtcaaccc attcgacgca ccatcctgat 79920tcaccgcact cccccgcacc accggcaaca cccgatgccc cagccggcgc gcatccgaca 79980accgctccac caacaacaca cccacaccct cggaccaccc caccccgttc gtcggcgtcg 80040cgaacgcctt gcaccggccg tccggcgaca aaccccgctg tcgcgagaac tccacgaacg 80100cccccggcgt cgacatcacc gtcacacccc ccggcaacgc caacttcgac tcacccgaac 80160gcaacgactg acacgccaga tgcaacgcca ccaacgacga cgaacacgcc gtgttcaccg 80220tcaccggcgg acccttcaac ccgaacgtgt acgacaaccg ccccgacaac acactccccg 80280acacacccgt cagcgcatac ccctccagat cccggccacc ccgacgcacc aactccgcat 
80340aatcctgatt cgccaccccc gcgaacacac ccgtacgact cccccgcaac gacaccggat 80400cgatccccgc ccgctccagc gcctcccagg acacctccag caacaaccgc tgctgcggat 80460ccatcgccaa cgcctcacgc ggcgaaatcc cgaaaaaccc cgcatcgaac tccgccgcac 80520ccgccaaaaa cccacccgcc cgcgtatacg acgaccccgg ccgccccgcc tccggatcgt 80580aaagaccctc cacatcccag ccccggtcca ccgggaagcc gccgatcgcg tccccgcccg 80640aggcgaccaa ctcccacaaa tcctccggcg accacacacc ccccggaaaa cggcacgcca 80700tccccacgat cgccaccggc tcatccacca caccggaccg gatgaaggcg ggccggccgg 80760ccggggcttc cccgccggtg ctcagcagtg tgccgaggtg tgtggccagg gcggacgggt 80820tggggtagtc gaacaccagc gtgctgggca accgcagtcc ggtggccgtg ttgaggccgt 80880tgcgcagttc gacggcgttg agcgagccga agccgagctc ccggaaggcc cggtcggccg 80940ggacggcggt ggcggtgcgg tggccgagga cggtggcggt gtgggtgcgg acgaggtcga 81000gcagggcgcg gtcccgttcg gccggttcca gggcggccag gcgtgcgcgg agcgagccgg 81060gggcctccgt gccggtggcg ggccgggcga gccgggcctc ggggatgtcg gagagcagcc 81120gggcgaggcc gtcggcggcg ggcaggcgct cccagtcgat gtcggcgatg gtgaggcagg 81180tctcgttgcg gtccagtacc tggccgaggg cggacagcgc gggctcggtg tccatgggcc 81240ggatcccgcg gcggtccatc cgcgtggcgg cctccgcgtc cgcggccatg cccccgcccg 81300cccaggcgcc ccaggccacc gcggtggagg gcagtccgag accgcgccgg tggacggcga 81360gagcctcgac ataggcgttg gccgccacat aggcaccctg gccaccggaa ccgaacgtgg 81420cggcggccga ggagaacacc acgaacgccg acagatccgc accccgcgtc agttcgtgca 81480ggttccgcgc acccaccgcc ttggccgcca gcaccccctc cagccgctcc ggcgtcaacg 81540cgtccagcac accgtcgtcc accacccccg ccgtatgcac gacggcaccc agcggacaat 81600cctcgggaac cgccgtggcc agcagctccg cgagcgcccc ccggtcggcc acatcacagg 81660cggcgatggt cacccgggcg cccatcgcgg tgagttccgc gcggagctcc ccggcaccct 81720tggcctcgcg tccgctccgg ctgaccagca gcaggtgctc ggccccgcgc cggaccatcc 81780agcgggcgac gtgtgctccc agagcgccgg tgccgccggt gatcaggacg gtgccgcgcg 81840gccgccaccc accagcctcc gctccccctc ccggagcacg caccagacgc cgcacgaaca 81900cccccgacgc ccggaccgcg acctccccct cacccccgcc ccccgacagc acacccagca 81960aaccctccac cacccgctcg tcgaccacct ccggcagatc gaccacccca ccccaccgat 
82020ccggcaactc caaccccgcc acccggccca aaccccacac cacagcccca cccggatccc 82080ccagccgatc cccctccccc accgacaccg caccccgcgt cacacaccac aacggcaccc 82140ccaacccctc caccgcctgg accagcccca gcaccaaccc ggcaacaccc accccacccc 82200cacacacagc cagcaccccc accggcccct caccaccaca cacctcacgc aaccgctccc 82260ccaacaccac ccgatccgca caacccccct ccacagccac cacccgcaca cacacccccg 82320cccgctccaa accctccacc acagcagcag cacccaccac cccctcaggc accaccacca 82380cccacacccc acccgacacc acaccaccac ccgaccgcga caccggacgc cacaccaccc 82440gataccgcca gccatcgacc accgcacgct cccgaacacc ccgacgccag tcgccgagcg 82500cggacaccac ggatcccaaa ggcgcgtcct cgtcgacttc cagcagggcc gcgaccgccg 82560gcaggtccgc gcgctcgacc gcctgccaca gcgggccgtc ctccccggcc ggcagcgcgg 82620ccggtgtctc gcccgcgtcc agccagtacc gctcacgctc gaacgcgtac gtcggcagct 82680ccacccggcc ggcggtcccg gacggtgcgc cgcccagcac ggccgcccag tccacccgca 82740ccccgcgcac ggacagcccg gccacggcgg cgagcacgga cacggcctcc ggccggtccg 82800gtcgcagtgc ggggagcagg ggggcgggtt cggtgagggc gtcctggccg agggcgcaga 82860gtgtgccgtc cgggccgagt tcgaggtagg cggtgacgcc ctgggcctgg agccaggcga 82920ggccgtcgcc gaagcggacg gtgtggcggg cgtgctggac ccagtagtca gcggtgccca 82980tggtgtcggc ggagacgggg gcgccggtga ggttggtgac cacggggatg cgcggcgggg 83040cgaagacgac ctgctccgcg acgcggcgga agtcgtccag tacggcgtcc aggtgcgggg 83100agtggaaggc gtggctggtg cgcagccgcc gggtccggcg gccctgttcc gcccagtggc 83160gggcgagcgt gagtacggcg tcctcgtcgc cggcgaggac gaccgcgcgc gggccgttga 83220cggcggccag gtccgcgcgc ccctcggcat cctggagcag cggccggact tcctcctccg 83280tcgcctcgac ggcgaccatg gcgccggtgt ccggcagcgc ctgcatcagc cggccccggg 83340ccgtcaccag ggcggccgcg tcggggaggg agagcatccc ggcgacgtgt gcggcggcca 83400gttcaccgac ggagtgcccc aggaggtagt cgggtgtcac cccccagttc tcgaccagcc 83460ggtacagcgc gacctcgacg gcgaacaggg cgggctgggc gtattccgtc tgttcgatca 83520gctcggcgcc gggggatccg ggggccgcga acacgatgtc gcgcagggtg tggcctgctt 83580ccccgatcgg gccgaagtgg gcgcacacct cgtcgaaggc gtccgcgaag gcggggaagt 83640gcgcgtggag ttcgcggccc atggccgggc gctgtgtgcc ctggcccgcg aagaggaacg 
83700ccagcgggcc ttcgtcggtc gcggtgccgg tgacgacttc cggggcggga cggccggtgg 83760cgagggcgtc gaggccgtgc cggaaaccgt cgaggtcctc ggcgagcacg accgcccggt 83820gctccagcca cgcccgctcc accgccagcg cacgcccgac ctccaccgga gccgcccccg 83880ccccatcggc gaacacccgc aaccgccgcg cctgcccccg caacgccgac tccgaacgag 83940ccgacaccac ccacggcacc accgcgggtc cggccccgtc ccccgacgga accaccaccg 84000gcccgacgcc gtcccccgac ggtgcctcct ccggccgtac ctcttcctcc tgcggtgcct 84060cctccagaat cacatgcgcg ttggtaccgc tcaccccgaa cgacgacaca cccgcacgcc 84120gcggacgctc accccgctcc cacaccacct cctccgtcag caaccgcacc gcacccgacg 84180accaatccac atgcggcgac ggctcatcca catgcaacgt ccgcggcaga cgaccccgcc 84240acaacgccat caccatcttg atcacaccag ccacccccgc cgcagcctgc gtatgaccca 84300gattcgactt caccgacccc aaccacaacg gcaccccacg accccgccca tacgccgcca 84360gcaccgcctg cgcctcgatc ggatcaccca acgacgtccc cgtcccatgc ccctccacca 84420catccacctc agcagccgac aaccccgcac acaccaacgc ctgaccgatc acccgctgct 84480gagacggacc attcggcgcc gtcaacccat tcgacgcacc atcctgattc accgcactcc 84540cccgcaccac cgccaacacc cgatgcccca gccgccgcgc atccgacaac cgctccacca 84600gcagcacacc gacaccctcg gacatcccgg tgccgtcggc cgcggcggcg tagggcttgc 84660agcggccgtc cgccgacagg ccccgttggc gcgagaactc cacgaacatg gcgggggtgg 84720acatcacggt gaccccgccg gcgagtgcga gggaggattc gcccgacctg acggactggc 84780aggccaggtg cagtgccacc agcgatgacg agcacgccgt gtcgaccgtc accgcggggc 84840cttcgaaccc gaaggtgtag gagagccggc cggacaggac gctcgccgcg ttgccgttgc 84900cgaggaagcc ctgaaggtga tccggaacgg acagcagacg ggtggcgtag tcgtgcgaca 84960tcatccccgc gaagacgccg gtgcggctgc cgcgcagggt ggccgggtcg atcccggccc 85020gctccagcgc ctcccaggac acctccagca tcaaccgctg ctgcgggtcc atcgccagcg 85080cctcgcgcgg ggagagaccg aagaatcccg cgtcgaactc cgctgcctcg tgcaggaatc 85140cgcccgatcg cgtgtacgac cgccctgccc ggcccggctc cgggtcgtag aggtcctcga 85200cgtcccagcc ccggtccacc gggaagtcgc cgatcgcgtc cccgcccgag gcgaccagct 85260cccacaggtc ctcgggcgat cgcacacctc ccgggaaccg gcacgccatg cccacgatcg 85320cgaccggctc ctgcctgccc gactcgacct gctccagccg gcgccggacc cgcagcagat 
85380cggcggtcgc gcgcttgagg tactcgcgga gcatttcctc gttggccatg acggggtctc 85440ctcgccgctg cgctggaggt ggcacggaac cccgccagat tagggtgggc aagtcaaccc 85500gaataccccc tatacacccc agactggcta cgtgaagcga atacccgttc aaataggggg 85560aagagccgca ggcatggatc gttacgcgaa gcgtttcgag gaccggctgg tcctggtcac 85620gggggcgggg agcggcatcg ggcgggcgac ggcctgccgg ttcggtgccg ccggggcgcg 85680gctggtgtgt gtggaccggg acgggcccgg cgcggaggcg accgccgaac tggcgcgtgc 85740gcggggggcg cgggcggcgt gcgccgaggt ggccgacgtc tcggacgagg tggcgatgga 85800gcggctcgcc gcgcgcgtca cggccgcgca cggcgtgctg gacgtgctcg tgaacaatgc 85860cggtatcggc atgtcggggc ggtttctcga cacgtcggcc gaggactggc gccgcaccct 85920gggggtgaat ctgtggggcg tcatccacgg gtgccggctc ctcggccggg gcatggccga 85980gcgccggcag ggcggtcaca tcgtgacggt ggcctcggcg gccgcgttcc agccgacccg 86040ggtcgttccg gtgtacgcca ccagcaaggc cgcggccctg atgctgagcg agtgtctgcg 86100cgcggagttg gcggagttcg gcatcggtgt gagcgtggtc tgccccggcc tggtccgtac 86160gccgttcgcg tccgcgatgt acttcgccgg cgcgtccccc gacgagcaca cccggctgcg 86220tgagtcctcc gcccgccgct tcgcgggccg cggctgcccg ccggagaagg tcgcggacgc 86280cgtcctgcgc gcgatcatgc ggacggcctt gccgacggtg accgggtcga cgccgtagag 86340ctggatcagc gcggtctcct cgcccgtctc cggcttgacc tcgaagtacg cgagcggctc 86400ggcgtcggcg gctgccgcgt cgtacagcag gatgcgcaga tccggaagtc ctgctcttcg 86460acgagccgtt cagcgcgctg gacccgctga tccctttagt gagggttaat tgcggccgcg 86520ttccagccga cccgggtcgt tccggtgtac gccaccagca aggccgcggc cctgatgctg 86580agcgagtgtc tgcgcgcgga gttggcggag ttcggcatcg gtgtgagcgt ggtctgcccc 86640ggcctggtcc gtacgccgtt cgcgtccgcg atgtacttcg ccggcgcgtc ccccgacgag 86700cacacccggc tgcgtgagtc ctccgcccgc cgcttcgcgg gccgcggctg cccgccggag 86760aaggtcgcgg acgccgtcct gcgcgcgatc gtccgcaata cggcggtggt cgccgtcacc 86820cccgacgccc gcgccgtccg tctgatgagc cgcttcgcgc cccgcctccg cgccgtcgtg 86880gcccggctgg acccgtaggc agggcccgta cgggcagcgg gcgtccggtt cgggccaccg 86940gccgcggtat ccgcgcccct gcccggagct gtgccgctcc gggcaggggc gcgcggacga 87000ggcggtccgg cccggcggcc cggacctggc ggtccgttac tcaaaccgcg tgagcgtcag 
87060ccggatcccg gtgggagcgg tgtcctggat gtaggaggcg aagtcggcca cgtcgtcgaa 87120ggcgaagccg taggctcggc cgtcctccgt gatcgcgtgc atcgccttgg cgtagtggtt 87180ggtcagcgcg gtcctgtaga aggccgcggg gtcggtcgtg ggctgggcgg cggaggtgag 87240cagggtcgag cggttgaatc cggcgccgag gaccgcggcg acgggaccgg tggtgccgtc 87300gttcggcgcg gcgagggcac cgtggcagaa gagcacgtcg cgcgtggtgg gcttggcgaa 87360ggacacctgg gcgggcccgt cgaaggtcag ccgctcgccg cgcacccggc cggtgaaggt 87420cccggcgttg gtggtgaccg tgaggtccct ggcggtgtag gtgctccaca cctcgtcgat 87480gtacggagcg aggtagtcct tcgggaacag gccggcgtcc agcccgtgcc cgggggcgat 87540cacacggagg tcgtccagga ccagtggcgc gaactccgcg acgcggcgga ccgcttcgaa 87600cgccgccgcc cggccgccgg cccgcacggt gccggtcgtc tggtccttcg cgcccgtcag 87660ccggatactg agcggcacgc tgaacatgtc caccatggtc gtgttgcaga acatgccgga 87720ggggttgtag gtgaactcgg cgcagtcgtg cagcaccctg tagttcggat cggacgcgac 87780ccagccggcc gggtactgca gcgcggcgtt cccggctccg tccgtgacca ccttgaactt 87840gagtttctgt ccgagcgcga catagatccg gccggacatg tacggcaggg agagccgggt 87900ctcgccgctg ccggccagtg tgatcgcgta gtccgtgaag ccgtcggggc cgttgtccga 87960cagggcgacg ggcgcgaggg tgccgtcggg cgtgagccgt acctgtcggc cgtcctggtt 88020ccccacgacg tagacatgga cgtcgccgtt gccgaagacg ccggtgtcgt tgacgaccgt 88080cagcggcagg gcgcccgccg tggtcccttc cgcgtcgcgg tcccggccgg ggccggcgag 88140ggcgtgcggt gccagggctg cgacggcggg agcggccatc gcggcgccgc cgagggcgac 88200gagcagggtg cggcggccga ggctgcgctg gtgtcgagga gtcatgtggg gggcctcctg 88260gtgggcttgc cgatgttcta atgacgggaa catgacaggt gagaagcgtg ggagcgctcc 88320tcagggcccg atggtacgca cggggaggcg tcccgcgtcc ccgtgccggg accgcttaac 88380cgacgcttaa gggccgttta 88400
SEQ ID NO:2 (length 922, PRT, bacteria)
Met Arg Gly Val Ser Pro Ser Val Ser Val Arg Glu Pro Gln Gly Leu1 5 10 15Thr Phe Leu Gly Leu Gly Arg Gln Ser His Ala Val Arg Thr Ala Leu 20 25 30Glu Ala Cys Ala Ala Gly Arg Val Arg Val Leu Val Val Glu Gly Gly 35 40 45Leu Gly Cys Gly Lys Ser Ala Phe Leu Gly Glu Ala Leu Lys His Ala 50 55 60Ala Ala Ser Gly Phe Leu Val Leu Arg Ser Ala Gly Ser Pro Pro Glu65 70 75 80Gly Arg Arg Pro Phe Asp 
Leu Leu Arg Gln Leu Ala Val Asp Pro Asp 85 90 95Ile Pro Asp Ala Gln Arg Ser Leu Leu Gln Asp Ala Val Gly Thr Glu 100 105 110Thr Pro Ala Ala Gln Arg Val Arg Ala Ala Leu His Gln Leu Thr Gly 115 120 125Ala Ala Pro Val Val Ile Gly Ile Asp Asp Leu His His Ala Asp Pro 130 135 140Gln Ser Leu His Cys Leu Leu Gln Ala Val Asp His Pro Arg Ala Thr145 150 155 160Arg Leu Leu Leu Val Cys Thr Ala Leu Pro Ser Gly Leu Ala Ala Asp 165 170 175Pro Ala Val Glu Ala Glu Leu Leu Cys Gln Pro Ala Leu Gln Arg Val 180 185 190Met Leu Gly Arg Leu Ser Leu Arg Ala Val Ser Gly Leu Arg Ala Ala 195 200 205Arg Pro Gly Pro Ala Val Glu Ala Leu Pro Ala Asp Asp Leu Leu Ala 210 215 220Val Thr Gly Gly Asn Pro Leu Leu Val His Ala Leu Leu Glu Glu Leu225 230 235 240Val Glu Ser His Thr Gln Gly His Thr Asp Glu Arg Ala Gly Arg Arg 245 250 255Arg Arg Ala Ala Ser Pro Val Ile Gly Gly Arg Phe Tyr Gln Ala Val 260 265 270Leu Ala Ser Leu Ser Arg Thr Asp Ser Leu Val Arg His Ser Ala Gly 275 280 285Ala Leu Ala Val Leu Gly Asp Ser Gly Cys Ala Glu Val Ile Ala Arg 290 295 300Leu Leu Gly Ile Gly Arg Ala Met Ala Ala Arg Gly
Leu Arg Ala Leu305 310 315 320Glu Ala Thr Gly Leu Thr Ala Ser Gly Arg Phe Arg His Pro Val Val 325 330 335Glu Ala Ala Ala Leu Asp Thr Leu Asp His Asp His Arg Ala His Leu 340 345 350His Arg Arg Ala Ala Ala Leu Leu Tyr Asp Val Gly Ala Glu Pro Asp 355 360 365Glu Val Ala Arg His Leu Leu Ala Ala Arg His Ala Ala Gly Pro Trp 370 375 380Ala Met Ser Val Leu Arg Asp Ala Ala Glu Gln Leu Leu Met Arg Asp385 390 395 400Asp Val Leu Thr Ala Val Ser Cys Leu Glu Leu Ala Arg Arg Ser Cys 405 410 415Ala Gly Gly Pro Arg Arg Ala Glu Ile Leu Leu Arg Leu Thr Val Ala 420 425 430Thr Arg Arg Thr Asp Pro Ala Ala Ala Glu Asp His Leu Ala Glu Leu 435 440 445Val Thr Glu Leu Arg Ala Gly Arg Leu Thr Ser Ala Glu Thr Glu Arg 450 455 460Leu Gly His Leu Leu Leu Gly Cys Gly Arg Leu Glu Glu Ala Thr Glu465 470 475 480Val Met Gly Arg Pro Gly Pro His Gly Asp Pro Arg Thr Pro Arg Leu 485 490 495Glu Thr Gly Phe His Ala Ser Ala Leu Trp Glu Pro Leu Ile Arg Pro 500 505 510Arg Thr Asp Pro Glu Pro Gly Asp Glu Glu Ser Pro Arg Pro Arg Met 515 520 525Pro Val Thr Gly Ile Trp Asp Leu Pro Gly Asp Gly Thr Asn Ala Ser 530 535 540Ala Ser Asp Ala Ala Glu His Val Leu Arg Ser Leu Pro Leu Thr Asp545 550 555 560Thr Thr Leu Val Ile Val Val Asn Ala Val Arg Val Leu Cys Arg Thr 565 570 575Gly Ser Tyr Glu Thr Ala Ala Leu Trp Cys Thr Arg Leu Leu Gly Glu 580 585 590Ala Ala Gly Arg Arg Leu Pro Gly Trp Lys Ala Gln Phe Leu Ala Leu 595 600 605Gln Ala Glu Ile Ala Leu Cys Arg Gly Leu Leu Ala Asp Thr Glu Glu 610 615 620Tyr Ala Arg Gln Ala Leu Ala Cys Val Pro Arg Cys Ser Arg Ser Val625 630 635 640Phe Ile Gly Gly Pro Leu Ala Ser Arg Val Phe Ala Ala Thr Ala Met 645 650 655Gly Arg Tyr Asp Glu Ala Thr Arg Gln Leu Asp His Pro Val Pro Glu 660 665 670Ala Leu Phe Arg Ser Val Tyr Gly Pro Ala Tyr Leu Arg Ala Arg Gly 675 680 685His Tyr Tyr Leu Ala Leu Asp Arg Pro Leu Ala Ala Val Arg Asp Phe 690 695 700Leu Gly Ala Gly Arg Leu Leu Arg Arg Trp Gly Ile Asp Arg Pro Thr705 710 715 720Leu Met Pro Trp Arg Ser Asp Ala Ala Glu Ala Phe Leu Arg Leu Cys 725 730 735Glu 
Pro Arg Arg Ala Asp Arg Leu Leu Arg Glu Gln Leu Ala Arg Thr 740 745 750Pro Asp Asp Asp Pro His Val Arg Gly Val Ser Leu Arg Leu Arg Ala 755 760 765Gln Ile Ala Glu Pro Pro Asp Arg Leu Asn Leu Leu Thr Glu Ala Val 770 775 780Asn His Leu Lys Ser Ser Gly Asp Arg Leu Ala Leu Ala Gly Ala Leu785 790 795 800Ala Asp Leu Gly Ala Ala Tyr Arg Glu Arg Gly Glu Ser Thr Arg Ala 805 810 815Gly Ala Thr Ile Arg Arg Ala Trp His Leu Ala Asn Asp Cys Gly Ala 820 825 830Arg Ala Leu Cys Glu Arg Ile Leu Pro Gly Gly Pro Gly Arg Gln Ser 835 840 845Phe Gly Asp Gly Thr Gly Arg Thr Glu Ala Ala Leu Ser Gly Ser Glu 850 855 860Leu Arg Val Val Glu Leu Ala Ala Asn Gly His Thr Asn Arg Glu Ile865 870 875 880Ala Ala Arg Leu Cys Ile Thr Val Ser Thr Val Glu Gln His Leu Thr 885 890 895Arg Ala Tyr Arg Lys Leu Glu Ile Ser Arg Arg Gln Glu Leu Pro Ala 900 905 910Arg Leu Cys Ala His Ile Glu Ser Pro Val 915 920
SEQ ID NO:3 (length 259, PRT, bacteria)
Met Pro Asp Leu Cys Glu Thr Glu Ser Leu Trp Leu Arg Arg Phe Gln1 5 10 15Pro Ala Pro Ala Ala Arg Thr Arg Leu Met Cys Phe Pro His Ala Gly 20 25 30Gly Ser Ala Ser Ala Tyr Leu Arg Leu Ala Arg Ser Leu Ala Pro Gly 35 40 45Ile Glu Val Leu Ala Val Gln Tyr Pro Gly Arg Gln Asp Arg Arg Ala 50 55 60Glu Pro Cys Pro Asp Ser Val Glu Gly Leu Ala Asp Asp Leu Phe Ala65 70 75 80Ala Val Arg His Arg Val Asp Ala Ser Thr Ala Leu Phe Gly His Ser 85 90 95Met Gly Ala Val Leu Ala Phe Glu Leu Ala Arg Arg Leu Glu Arg Asp 100 105 110Ala Gly Val Arg Cys Ala Arg Ile Phe Ala Ser Gly Arg Arg Ala Pro 115 120 125Ser Arg Phe Arg Asp Asp Ser Ala Pro Ala Ala Ser Asp Ala Ser Met 130 135 140Leu Ala Glu Met Arg Thr Leu Gly Gly Thr Asp Leu Arg Val Leu Gln145 150 155 160Asp Glu Glu Leu Leu Ile Ala Ala Leu Pro Ala Leu Arg Ala Asp Tyr 165 170 175Arg Ala Ile Gly Thr Tyr Arg Ala Ala Asp Asp Ala Val Val Gly Cys 180 185 190Pro Val Thr Val Leu Val Gly Asp Ala Asp Pro Arg Thr Ser Leu Asp 195 200 205Asp Ala His Ala Trp Ser Ala His Thr Thr Ala Glu Ser Glu Val Leu 210 215 220Thr Phe Ser Gly Gly His Phe Phe Leu Asp Ala His His Asp Ala Val225 230 
235 240Val Glu Val Val Thr Ala Arg Leu Arg Gln Asp Arg Ala Pro Arg Pro 245 250 255Asp Arg Val
SEQ ID NO:4 (length 267, PRT, bacteria)
Met Pro Glu Leu Asn Asp Arg Thr Ala Leu Val Thr Gly Ala Ser Arg1 5 10 15Gly Ile Gly Lys Ala Ile Ala Gln Arg Leu Ala Ala Glu Gly Val Arg 20 25 30Val Ala Val His Tyr Gly Thr Gln Glu Lys Ser Ala Gln Glu Thr Val 35 40 45Glu Thr Ile Glu Arg Ala Gly Gly Arg Ala Phe Ala Val Arg Ala Asp 50 55 60Leu Leu Arg Asp Asp Ala Val Asp Glu Leu Phe Thr Ala Leu Glu Arg65 70 75 80Glu Leu Glu Gly Arg Pro Leu His Ile Leu Val Asn Asn Ala Ala Val 85 90 95Ala Pro Ala Pro Gly Asp Pro Ala Leu Ala Ala Gln Asp Gly Tyr Val 100 105 110Pro Gly Leu Ser Asp Thr Thr Pro Glu Glu Phe Asp Arg Val Tyr Arg 115 120 125Ile Asn Val Arg Ala Pro Phe Phe Val Thr Gln Arg Ala Leu Ser Leu 130 135 140Met Ala Asp Gly Gly Arg Ile Val Asn Val Ser Ser Ala Val Thr Arg145 150 155 160Ile Ala Trp Pro Leu Leu Pro Tyr Ala Met Thr Lys Gly Ala Leu Glu 165 170 175Met Met Ala Pro Arg Leu Ala Asn Glu Leu Gly Ser Arg Gly Ile Thr 180 185 190Val Asn Thr Val Ala Pro Gly Ile Thr Asp Thr Asp Met Asn Arg Trp 195 200 205Val Arg Glu Thr Pro Gly Ala Glu Ala Gly Ile Ser Ala Leu Thr Ala 210 215 220Leu Gly Arg Leu Gly Arg Pro Asn Asp Ile Ala Gly Ile Val Ala Phe225 230 235 240Leu Val Ser Asp Asp Ala Arg Trp Ile Thr Gly Gln Leu Leu Asp Ala 245 250 255Ser Gly Gly Met Ala Leu Ala Pro Ala Met Met 260 265
SEQ ID NO:5 (length 2341, PRT, bacteria)
Val His Glu Thr His Ala His Gly Glu Glu Gly Ser Ser Asp Gly Ser1 5 10 15Ala Asp Ala Val Val Phe Val Phe Pro Gly Gln Gly Ser Gln Trp Pro 20 25 30Gly Met Gly Ala Glu Leu Trp Asp Thr Ser Pro Val Phe Arg Glu Ser 35 40 45Val Arg Ala Cys Ala Asp Ala Leu Ala Pro Tyr Leu Asp Trp Ser Val 50 55 60Glu Gly Val Leu Arg Gly Ala Pro Asp Ala Pro Ala Gly Pro Ala Leu65 70 75 80Asp Arg Ala Asp Val Ala Gln Pro Ala Leu Phe Thr Leu Met Val Ser 85 90 95Leu Ala Glu Leu Trp Arg Ser His Gly Val Glu Pro Cys Ala Val Leu 100 105 110Gly His Ser Leu Gly Glu Ile Ala Ala Ala His Val Ala Gly Ala Leu 115 120 125Thr Leu Ala Asp Ala Ala Arg Val Ala Ala Leu 
Trp Ser Arg Ala Gln 130 135 140Ala Thr Leu Ser Gly Thr Gly Thr Leu Leu Ala Ala Lys Ala Ala Pro145 150 155 160Glu Glu Leu Ala Pro His Leu Gln Arg Trp Asn Gly Asp Asp Arg His 165 170 175Gly Thr Arg Leu Ala Ile Ala Gly Val Asn Gly Pro Gly Ser Thr Val 180 185 190Val Ala Gly Asp Leu Asp Ala Ile Ala Ala Leu Ala Ala Asp Leu Ala 195 200 205Ser Ala Gly Val Arg Thr Arg Arg Val Ala Val Asp Val Pro Thr His 210 215 220Ser Pro Ala Met Arg Thr Leu Arg Glu Arg Ile Leu Thr Asp Leu Ala225 230 235 240Ser Val Ala Pro Cys Val Ser Arg Leu Pro Phe His Ser Ser Leu Thr 245 250 255Gly Gly Leu Val Asp Thr Arg Gly Leu Asp Ala Asp Tyr Trp Tyr Arg 260 265 270Asn Ile Ser Glu Thr Ala Arg Phe Asp Leu Ala Ala Arg Gly Leu Leu 275 280 285Ala Asp Gly His Arg Thr Phe Val Glu Leu Ser Pro His Pro Ile Leu 290 295 300Thr Leu Gly Leu Gln Ala Leu Ala Asp Asp Val Pro Gly Ala Ala Asp305 310 315 320Ala Leu Val Thr Gly Thr Leu Arg Arg Gly Arg Gly Gly Met Arg Gln 325 330 335Phe Gln Asp Ala Leu Gly Arg Leu Ser Val Pro Ala Gly Gly Arg Pro 340 345 350Gly Arg Glu Val Ser Ala Ala Ala Leu Ala Gly Arg Leu Ala Pro Leu 355 360 365Ser Pro Ala Gln Gln Glu His Leu Leu Val Glu Leu Val Cys Ala His 370 375 380Phe Ala Ala Leu Val Gly Gly Asp Gly Gly Ala Pro Pro Thr Val Arg385 390 395 400Pro Ser Ala Ala Phe Thr Asp Gln Gly Cys Asp Ser Ala Thr Ala Leu 405 410 415Glu Leu Arg Asp Arg Leu Arg Glu Ala Thr Gly Leu Arg Leu Pro Ala 420 425 430Thr Leu Val Phe Asp His Pro Thr Pro Ala Ala Val Ala Gly Arg Leu 435 440 445Arg Arg Leu Ala Leu Gly Ile Glu Glu Thr Ala Asp Thr Ala Pro Val 450 455 460Ala Val Arg Gly His Arg Glu Gly Glu Pro Ile Ala Ile Val Gly Met465 470 475 480Ala Cys Arg Phe Pro Gly Gly Val Arg Ser Pro Glu Asp Leu Trp Arg 485 490 495Leu Val Thr Glu Gly Gly Asp Ala Leu Gly Pro Phe Pro Thr Asp Arg 500 505 510Gly Trp Asp Thr Gly Arg His Ala Glu Asp Pro Ala Thr Pro Gly Thr 515 520 525Tyr Val Gln Gly Glu Gly Gly Phe Leu Tyr Asp Ala Gly Glu Phe Asp 530 535 540Ala Glu Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro545 550 555 
560Gln Gln Arg Leu Leu Leu Glu Met Ala Trp Glu Thr Phe Glu Arg Ala 565 570 575Gly Ile Asp Pro Thr Ser Ala Arg Gly Ser Arg Thr Gly Val Phe Ala 580 585 590Gly Val Leu Pro Leu Gly Tyr Gly Pro Arg Met Asp Glu Thr Asp Gln 595 600 605Gly Thr Ala Asp Leu Gln Gly His Leu Leu Thr Gly Thr Leu Pro Ser 610 615 620Val Ala Ser Gly Arg Ile Ser Tyr Thr Leu Gly Leu Glu Gly Pro Ala625 630 635 640Val Ser Val Glu Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu 645 650 655Ala Cys Arg Ser Leu Arg Ala Gly Glu Cys Asp Leu Ala Leu Thr Gly 660 665 670Gly Val Ser Val Leu Ala Thr Leu Gly Leu Phe Val Glu Phe Ser Arg 675 680 685Gln Arg Gly Leu Ser Ala Asp Gly Arg Cys Lys Ala Tyr Ala Ala Ala 690 695 700Ala Asp Gly Thr Gly Trp Ser Glu Gly Ala Gly Leu Leu Leu Val Glu705 710 715 720Arg Leu Ser Asp Ala Arg Arg Leu Gly His Arg Val Leu Ala Val Val 725 730 735Arg Gly Ser Ala Ile Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala 740 745 750Pro Ser Gly Pro Ser Gln Gln Arg Val Ile Arg Glu Ala Leu Ala Asp 755 760 765Ala Gly Leu Thr Ala Ala Asp Val Asp Ala Val Glu Gly His Gly Thr 770 775 780Gly Thr Arg Leu Gly Asp Pro Ile Glu Ile Glu Ala Leu Leu Ala Thr785 790 795 800Tyr Gly Gln Gly Arg Ala Arg Glu Arg Pro Leu Trp Leu Gly Ser Leu 805 810 815Lys Ser Asn Ile Gly His Thr Met Ala Ala Ala Gly Val Gly Gly Val 820 825 830Ile Lys Met Val Met Ala Leu Arg His Gly Glu Leu Pro Arg Thr Leu 835 840 845His Val Asp Ala Pro Ser Pro Arg Ala Asp Trp Ser Ala Gly Glu Val 850 855 860Arg Leu Leu Thr Glu Ala Val Ala Trp Pro Ala Ala Ala Asp Gly Glu865 870 875 880Pro Arg Arg Ala Gly Val Ser Ser Phe Gly Val Ser Gly Thr Asn Ala 885 890 895His Ala Ile Leu Glu Glu Ala Pro Ala Pro Glu Asp Glu Glu Pro Ala 900 905 910Pro Pro Asp Gly Glu Ala Leu Leu Pro Trp Ala Val Ser Thr Arg Ser 915 920 925Glu Ala Ala Leu Arg Thr Gln Ala Arg Met Leu Ala Asp Val Val Arg 930 935 940Asp Asp Pro Gly Val Gly Leu Ala Asp Val Gly Ala Glu Leu Ala Arg945 950 955 960Gly Arg Ala Ala Leu Glu His Arg Ala Val Val Ile Ala Ser Gly Arg 965 970 975Ala Glu Phe Ala Arg Ala Leu Glu 
Ala Val Ala Ser Gly Glu Pro His 980 985 990Pro Ala Val Val Arg Gly His Ala Gly Ser Glu Arg Gly Gly Val Val 995 1000 1005Phe Val Phe Pro Gly Gln Gly Gly Gln Trp Ala Gly Met Gly Leu 1010 1015 1020Asp Leu Leu Arg Ser Ser Pro Val Phe Ala Glu His Ile Ala Ala 1025 1030 1035Cys Gly Lys Ala Leu Ala Pro Trp Val Lys Trp Ser Leu Thr Glu 1040 1045 1050Val Leu His Arg Asp Ala Glu Asp Pro Val Trp Asp Arg Ala Asp 1055 1060 1065Val Val Gln Pro Val Leu Phe Ser Val Met Thr Ser Leu Ala Ala 1070 1075 1080Leu Trp Arg Ser Tyr Gly Val Glu Pro Asp Ala Val Thr Gly His 1085 1090 1095Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Cys Gly Ala Leu Gly 1100 1105 1110Leu Glu Asp Ala Ala Arg Thr Val Ala Leu Arg Ser Arg Ala Leu 1115 1120 1125Val Ala Leu Arg Gly Arg Gly Gly Met Ala Ser Val Ala Ser Ala 1130 1135 1140Ala Pro Asp Val Glu Glu Leu Ile Ala Arg Arg Trp Pro Gly Arg 1145 1150 1155Leu Trp Val Ala Ala Phe Asn Gly Pro Gly Ala Val Thr Val Ser 1160 1165 1170Gly Asp Gly Asp Ala Leu Glu Glu Phe Leu Gly His Cys Ala Asp 1175 1180 1185Thr Glu Val Arg Ala Arg Arg Val Pro Val Asp Tyr Ala Ser His 1190 1195 1200Cys Pro His Thr Glu Ala Ile Glu Arg Glu Leu Leu Asp Ala Leu 1205 1210 1215Glu Asp Ile Thr Pro Arg Pro Ala Ala Val Pro Phe Tyr Ser Thr 1220 1225 1230Val Asp Asp Ala Trp Leu Asp Thr Thr Arg Leu Asp Ala Ser Tyr 1235 1240 1245Trp Tyr Arg Asn Leu Arg Arg Pro Val Arg Phe Ser Gln Ala Val 1250 1255 1260Arg Ala Leu Thr Asp Gly Gly His Arg Val Phe Ile Glu Ala Ser 1265 1270 1275Pro His Pro Thr Leu Val Pro Ala Ile Glu Asp His Gly Asp Val 1280 1285 1290Thr Ala Leu Gly Thr Leu Arg Arg His Gly Asp Asp Thr Glu Arg 1295 1300 1305Phe Leu Thr Ala Leu Ala His Leu His Val Thr Gly Ala Ala Gly 1310 1315 1320Gln
Asp Leu Trp Arg His His Tyr Ala Arg Leu Arg Pro Ala Pro 1325 1330 1335Arg His Val Asp Leu Pro Thr Tyr Ala Phe Gln Arg Asp Arg Tyr 1340 1345 1350Trp Trp Ser Gly Gly Ala Gly Arg Gly Asp Val Thr Thr Ala Gly 1355 1360 1365Leu His Pro Gly Gly His Pro Leu Leu Gly Ala Ala Leu Asp Leu 1370 1375 1380Ala Asp Gly Gly Gly Arg Leu His Thr Gly Arg Val Ser Leu Arg 1385 1390 1395Thr His Pro Trp Ile Ala Asp His Gly Val Ala Gly Ile Thr Leu 1400 1405 1410Leu Pro Gly Thr Ala Phe Leu Glu Leu Ala Leu His Thr Gly Glu 1415 1420 1425Ser Gly Asn Val Arg Glu Leu Thr Leu His Ala Pro Leu Val Val 1430 1435 1440Pro Asp Glu Glu Gly Val Asp Leu Gln Val His Leu Ala Arg Pro 1445 1450 1455Asp Glu Ala Gly Leu Arg Ala Leu Thr Arg Leu Leu Pro Gly Arg 1460 1465 1470Gly Val Pro Thr Pro Arg Ala Pro Trp Gln Pro His Ala Thr Gly 1475 1480 1485Leu Leu Gly Pro Ala Asp Arg Ala Pro Gly Ser Ser Gly Leu Glu 1490 1495 1500Pro His Asp Leu Gly Gly Ala Trp Pro Pro Pro Gly Ala Val Pro 1505 1510 1515Leu Val Pro Gly Glu Leu Gly Asp Val Pro Gly Cys Tyr Ala Arg 1520 1525 1530Leu Ala Asp Glu Gly Phe Glu Tyr Gly Pro Ala Phe Arg Gly Leu 1535 1540 1545Arg Ala Val Trp Arg Arg Gly Thr Glu Ile Phe Ala Glu Val Ala 1550 1555 1560Leu Pro Ala Gly Asp Gly Ser Val Phe Arg Leu His Pro Ala Leu 1565 1570 1575Leu Asp Ala Val Leu His Pro Val Val Leu Gly Leu Val Asp Gly 1580 1585 1590Val Pro Ala Arg Pro Leu Pro Phe Ser Trp Asn Gly Val Ala Leu 1595 1600 1605His Ala Pro Ala Ser Gly Ala Leu Arg Val Arg Leu Ala Pro Ala 1610 1615 1620Asp Asp Gly Ala Val Gly Ile Thr Ala Ala Thr Ala Ala Gly Glu 1625 1630 1635Pro Val Leu Ser Val Ala Ala Leu Ala Leu Arg Ser Ala Ser Ala 1640 1645 1650Glu Gln Leu Arg Ala Ala Ile Arg Ser Ala Ala Gly Ser Arg Asp 1655 1660 1665Ala Leu Tyr Glu Leu Asp Trp Leu Pro Leu Pro Ala Asp Arg Ala 1670 1675 1680Ala Ser Pro Gly Gly Ala Asp Ile Ala Ala Leu Gly Thr Ser Glu 1685 1690 1695Leu Pro Cys Arg Thr Tyr Glu Thr Ile Ala Glu Leu Ser Gln Ala 1700 1705 1710Leu Ala Asp Gly Ala Pro Ala Pro Asp Ala Val Val Ser Asp Val 1715 1720 1725Gly 
Ala Val Gly Gly Pro Leu Asp Thr Val Ser Leu His Gly Leu 1730 1735 1740Cys Arg Arg Gly Leu Glu Leu Val Gln Ala Trp Leu Gly Glu Pro 1745 1750 1755Arg Thr Ala Asp Thr Arg Leu Val Leu Val Thr Arg Gly Ala Val 1760 1765 1770Gly Cys Ala Pro Ala Glu Pro Val Ala Asp Pro Ala Ala Ala Ala 1775 1780 1785Leu Trp Gly Leu Val Arg Ser Ala Gln Ala Glu His Pro Gly Arg 1790 1795 1800Leu Leu Leu Leu Asp Leu Asp Pro Ala Gly Ser Arg Pro Val Ser 1805 1810 1815Gly Arg Leu Val Glu Gln Ala Val Ala Cys Gly Glu Pro His Ile 1820 1825 1830Ala Val Arg Gly Asp Gly Leu Arg Val Pro Arg Leu Ser Arg Ala 1835 1840 1845Thr Ala Ala Pro Ala His Pro Pro Ala Gly Gly Arg Glu Ala Gln 1850 1855 1860Trp Asp Pro Glu Gly Thr Val Leu Ile Thr Gly Gly Thr Gly Ser 1865 1870 1875Leu Gly Ala Leu Phe Ala Arg His Leu Val Thr Ala His Gly Val 1880 1885 1890Arg Arg Leu Leu Leu Ala Ser Arg Ser Gly Pro Gly Ala Pro Gly 1895 1900 1905Ala Ala Gly Leu Arg Asp Glu Leu Thr Ala His Gly Ala Thr Val 1910 1915 1920Thr Val Ala Ala Cys Asp Val Ala Asp Arg Glu Ala Val Ala Ala 1925 1930 1935Leu Leu Ala Ser Val Pro Ser Glu His Pro Leu Thr Ala Val Val 1940 1945 1950His Thr Ala Gly Val Leu Asp Asp Gly Val Leu Ala Ser Leu Thr 1955 1960 1965Ala Asp Arg Leu Ala Arg Val Leu Arg Ala Lys Ala Asp Ala Ala 1970 1975 1980Leu His Leu His Asp Leu Thr Arg Asp Leu Pro Leu Ala Ala Phe 1985 1990 1995Val Leu Phe Ser Ser Val Thr Ala Thr Leu Gly Thr Pro Gly Gln 2000 2005 2010Ala Asn Tyr Thr Ala Ala Asn Ala Phe Leu Asp Ala Leu Ala Arg 2015 2020 2025His Arg Arg Ala Ala Gly Leu Pro Ala Val Ser Leu Ala Trp Gly 2030 2035 2040Leu Trp Glu Gln Thr Gly Gly Leu Thr Asp His Leu Gly Ser Val 2045 2050 2055Asp Leu Arg Arg Met Ala Arg Asn Gly Leu Val Ala Leu Pro Ala 2060 2065 2070Asp Ala Gly Leu Ala Leu Phe Asp Thr Ala Leu Ala Leu Asp Arg 2075 2080 2085Ala Asn Leu Val Pro Ala Arg Leu Asp Leu Pro Ala Leu Arg Arg 2090 2095 2100Ala Thr His Val Pro Pro Val Leu Arg Arg Leu Val Glu Val Pro 2105 2110 2115Gly Ala Pro Ser Ala Asp Arg Ser Ala Gly Ser Gly Gly Glu Val 2120 2125 2130Arg 
Pro Leu Arg Glu Thr Leu Ala Gly Leu Asp Asp Arg Lys Arg 2135 2140 2145Pro Ala Ala Val Ser Arg Leu Val Arg Arg His Val Ala Trp Val 2150 2155 2160Leu Gly Ala Asp Gly Pro Glu Ser Val Asp Glu Asp Arg Ser Phe 2165 2170 2175Arg Asp Leu Gly Phe Asp Ser Leu Met Ala Val Glu Leu Arg Asn 2180 2185 2190Gln Leu Asn Thr Ala Ala Gly Ile Arg Leu Ala Ala Thr Leu Val 2195 2200 2205Phe Asp His Pro Thr Pro Ser Ala Val Ala Arg His Leu Leu Asp 2210 2215 2220Arg Cys Ser Pro Asp Pro Ala Ala Pro Ala Ala Pro Ser Gly Thr 2225 2230 2235Ala Val Ala Ser Ala Leu Ala Thr Leu Ala Glu Leu Glu Thr Ala 2240 2245 2250Leu Asn Gly Ile Pro Ala Glu Glu Trp Thr Ala Ala Gly Gly Pro 2255 2260 2265Ala Arg Leu Met Thr Leu Ala Ser Ser Leu Pro Ala Pro Ala Ser 2270 2275 2280Val Pro Arg Thr Pro Ala Ala Gly Glu Ala Ala Glu Lys Leu Ala 2285 2290 2295His Ala Ser Arg Asp Glu Ile Phe Ala Phe Ile Asp Arg Glu Leu 2300 2305 2310Gly Arg Asp Ser Gly Pro Ala Ser Pro Ser Arg Leu Gly Pro Gln 2315 2320 2325Thr Pro Asp Ser Thr Asp Lys Ala Pro Phe His Gly Glu 2330 2335 2340
SEQ ID NO:6 (length 3723, PRT, bacteria)
Met Glu Asn Glu Glu Lys Leu Leu Asp Tyr Leu Lys Trp Val Thr Ala1 5 10 15Asp Leu His Arg Ser Arg Glu Arg Val Thr Glu Leu Glu Glu Ala Gly 20 25 30Arg Glu Pro Ile Ala Ile Val Gly Met Ala Cys Arg Phe Pro Gly Glu 35 40 45Val Arg Ser Pro Glu Glu Leu Trp Gly Leu Val Ala Ser Gly Gly Asp 50 55 60Ala Ile Gly Ala Phe Pro Asp Asp Arg Gly Trp Asp Leu Asp Gly Leu65 70 75 80Phe Asp Pro Asp Pro Glu Arg Ala Gly Thr Ser Tyr Thr Arg Arg Gly 85 90 95Gly Phe Leu Tyr Asp Ala Ala Glu Phe Asp Ala Gly Phe Phe Gly Ile 100 105 110Ser Pro Arg Glu Ala Met Ala Met Asp Pro Gln Gln Arg Leu Leu Leu 115 120 125Glu Thr Ser Trp Glu Ala Phe Glu Arg Ala Gly Ile Asp Pro Ser Ser 130 135 140Val Arg Gly Ser Arg Val Gly Val Phe Ala Gly Leu Met Tyr His Asp145 150 155 160Tyr Ala Ala Ala Gln Gly Ser Thr Gly Asp Gly Asp Gly Glu Pro Asp 165 170 175Phe Glu Gly Tyr Leu Gly Asp Gly Ser Val Ser Ser Ile Ala Ser Gly 180 185 190Arg Ile Ala Tyr Thr Leu Gly Leu Ala Gly Ala Ala Ile Thr Val Asp 195 
200 205Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Cys Gln Ala 210 215 220Leu Arg Thr Gly Asp Ser Glu Leu Ala Leu Ala Gly Gly Val Ser Val225 230 235 240Met Ser Thr Pro Arg Thr Phe Val Gln Phe Ser Arg Gln Arg Gly Leu 245 250 255Ser Ala Asp Gly Arg Cys Lys Ala Tyr Ala Ala Ala Ala Asp Gly Thr 260 265 270Gly Phe Ser Glu Gly Val Gly Met Val Leu Val Glu Arg Leu Ser Asp 275 280 285Ala Arg Arg Leu Gly His Pro Val Leu Ala Val Val Arg Gly Ser Ala 290 295 300Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro305 310 315 320Ser Gln Glu Arg Val Ile Arg Glu Ala Leu Ala Asn Ala Gly Leu Thr 325 330 335Ala Ala Asp Val Asp Ala Val Glu Gly His Gly Thr Gly Thr Arg Leu 340 345 350Gly Asp Pro Ile Glu Leu Gln Ala Leu Leu Ala Thr Tyr Gly Gln Gly 355 360 365Arg Ala Arg Glu Arg Pro Leu Trp Leu Gly Ser Val Lys Ser Asn Ile 370 375 380Gly His Ala Gln Ala Ala Ala Gly Val Gly Gly Val Ile Lys Met Val385 390 395 400Met Ala Leu Arg His Gly Glu Leu Pro Arg Thr Leu His Val Asp Ala 405 410 415Pro Ser Pro Arg Val Asp Trp Ser Ala Gly Glu Val Arg Leu Leu Thr 420 425 430Glu Ala Val Ala Trp Pro Ala Ala Ala Asp Gly Glu Pro Arg Arg Ala 435 440 445Gly Val Ser Ser Phe Gly Val Ser Gly Thr Asn Ala His Val Ile Leu 450 455 460Glu Glu Ala Pro Ala Ser Glu Gly Glu Glu Ala Pro Pro Pro Glu Pro465 470 475 480Gly Ser Pro Leu Pro Trp Val Val Ser Gly His Ser Glu Ala Gly Leu 485 490 495Arg Ala Gln Ala Gln Ala Leu Ala Glu Phe Ala Arg Thr Ala Pro Gly 500 505 510Ala Glu Leu Val Asp Val Gly Ala Ala Leu Ala Arg Gly Arg Ala Ala 515 520 525Leu Gly His Arg Ala Val Val Val Ala Ser Glu Arg Glu Glu Phe Glu 530 535 540Arg Ala Leu Ala Ala Leu Ala Cys Gly Glu Pro His Pro Cys Val Val545 550 555 560Asp Gly Ser Ala Asp Gly Arg Arg Glu Asp Gly Val Val Phe Val Phe 565 570 575Pro Gly Gln Gly Gly Gln Trp Ala Gly Met Gly Leu Asp Leu Leu Thr 580 585 590Thr Ser Gly Val Phe Ala Glu His Ile Gly Ala Cys Glu Arg Ala Leu 595 600 605Ala Pro Trp Val Glu Trp Ser Leu Thr Glu Met Leu His Arg Glu Ala 610 615 620Glu Asp Pro Val Trp Glu Arg 
Ala Asp Ile Val Gln Pro Val Leu Phe625 630 635 640Ser Val Met Val Ser Leu Ala Ala Leu Trp Arg Ser Tyr Gly Ile Glu 645 650 655Pro Asp Ala Val Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala His 660 665 670Val Cys Gly Ala Leu Thr Leu Glu Asp Ala Ala Lys Val Val Ala Leu 675 680 685Arg Ser Arg Ala Leu Ala Ala Leu Arg Gly Arg Gly Gly Met Val Ser 690 695 700Leu Ser Leu Ser Thr Ala Asp Ala Gly Glu Leu Val Glu Arg Arg Trp705 710 715 720Ala Gly Arg Leu Trp Val Ala Ala Leu Asn Gly Pro Glu Ala Thr Thr 725 730 735Val Ser Gly Asp Val Asp Ala Leu Glu Glu Leu Leu Ala His Cys Ala 740 745 750Lys Ser Glu Val Arg Ala Arg Arg Val Pro Val Asp Tyr Ala Ser His 755 760 765Cys Pro His Thr Glu Ala Ile Ala Glu Glu Ile Val Asp Ser Leu Gly 770 775 780Asp Ile Thr Pro Arg Ala Ala Thr Val Pro Phe Tyr Ser Thr Val Asp785 790 795 800Asp Met Trp Leu Asp Thr Thr Arg Leu Asp Ala Ser Tyr Trp Tyr Arg 805 810 815Asn Leu Arg Leu Pro Val Arg Phe Ser Gln Ala Val Arg Ala Leu Thr 820 825 830Glu Glu Gly His Arg Leu Phe Ile Glu Thr Ser Pro His Pro Thr Leu 835 840 845Val Pro Ala Ile Glu Asp His Gly Asp Val Thr Ala Leu Gly Thr Leu 850 855 860Arg Arg His Gly Asp Asp Thr Glu Arg Phe Leu Thr Ala Leu Ala His865 870 875 880Leu His Val Thr Gly Ala Ala Gly Gln Asp Leu Trp Arg His His Tyr 885 890 895Ala Arg Leu Arg Pro Ala Pro Arg His Val Asp Leu Pro Thr Tyr Pro 900 905 910Phe Gln Arg Arg Arg Tyr Trp Leu Glu Lys Pro Asp Pro Gln Thr Arg 915 920 925Pro Gln Arg Ser Arg Ser Thr Ala Pro Asp Leu Asp Arg Leu Glu Ala 930 935 940Glu Phe Trp Gln Ala Val Glu Glu Thr Asp Thr Asp Thr Leu Ala His945 950 955 960Thr Leu His Leu Asp Thr Gln Thr Leu Glu Pro Val Leu Pro Ala Leu 965 970 975Ala Thr Trp His Gln Gln Gln Arg Asp His Ala Arg Ile Asn Thr Trp 980 985 990Thr Tyr Gln Glu Thr Trp Lys Pro Leu His Leu Pro Thr Thr Arg Pro 995 1000 1005Thr Thr Pro Thr Ser Trp Leu Ile Ala Ile Pro Glu Thr His Arg 1010 1015 1020Asn His Pro His Thr Thr Asn Leu Leu Thr Asn Leu Pro His His 1025 1030 1035Asn Ile Thr Pro Ile Pro Leu Thr Ile Asn His Thr Thr Asp Leu 
1040 1045 1050His His Ala Tyr His His Ala His His His Thr Thr Pro Pro Ile 1055 1060 1065Thr Ala Val Leu Ser Leu Leu Ala Leu Asp Glu Thr Pro His Pro 1070 1075 1080His His Pro His Thr Pro Thr Gly Thr Leu Leu Asn Leu Thr Leu 1085 1090 1095Thr Gln Thr His Thr Gln Thr His Pro Pro Thr Pro Leu Trp Tyr 1100 1105 1110Leu Thr Thr Gln Ala Thr Thr Thr His Pro Asn Asp Pro Leu Thr 1115 1120 1125His Pro Thr Gln Ala Gln Thr Ile Gly Leu Ala Arg Thr Thr His 1130 1135 1140Leu Glu His Pro His His Thr Gly Gly His Ile Asp Leu Pro Thr 1145 1150 1155Thr Pro His Pro Asn Thr Leu Thr Gln Leu Ile Thr Ala Leu Thr 1160 1165 1170His Pro His His Gln His Asn Leu Thr Ile Arg Thr His Thr Thr 1175 1180 1185His Thr Arg Arg Leu Thr Pro Thr Thr Leu Gln Pro Thr Thr Pro 1190 1195 1200Thr Pro Pro Thr Asn Pro His Gly Thr Thr Leu Ile Thr Gly Gly 1205 1210 1215Thr Gly Ala Leu Ala Thr Thr Leu Ala His His Leu Ala Thr Thr 1220 1225 1230Gly Thr Gln His Leu Leu Leu Thr Ser Arg Arg Gly Pro His Thr 1235 1240 1245Pro Gly Ala Arg Gln Leu His Thr Gln Leu Thr Gln Leu Gly Thr 1250 1255 1260Asn Thr Thr Ile Thr Ala Cys Asp Leu Ser Asp Pro Asp Gln Leu 1265 1270 1275Thr His Leu Leu Thr His Ile Pro Pro Glu His Pro Leu Thr Thr 1280 1285 1290Val Ile His Thr Ala Gly Ile Leu Asp Asp Ala Thr Leu Thr Asn 1295 1300 1305Leu Thr Pro Thr Gln Leu Asp Asn Val Leu Arg Ala Lys Ala His 1310 1315 1320Thr Ala His Leu Leu His His Ala Thr Leu His Thr Pro Leu Asp 1325 1330 1335His Phe Val Leu Tyr Ser Ser Ala Ala Ala Thr Leu Gly Ala Pro 1340 1345 1350Gly Gln Ala Asn Tyr Ala Ala Ala Asn Ala Tyr Leu Asp Ala Leu 1355 1360 1365Ala His His Arg His Thr His Asn Leu Pro Ala Thr Thr Ile Ala 1370 1375 1380Trp Gly Thr Trp Gln Gly Asn Gly Leu Ala Asp Ser Asp Lys Ala 1385 1390 1395Arg Ala Asn Leu Asp Arg Arg Gly Phe Leu Pro Met Pro Glu Thr 1400 1405 1410Leu Ala Ala Ala Ala Ala Val Arg Ala Ile Glu Ser Arg Arg Pro 1415 1420 1425Ser Val Val Ile

Ala Ala Ile Asp Trp Ala Arg Ala Glu Arg Thr 1430 1435 1440Pro Asp Val Glu Asp Leu Leu Pro Ala Ala Asp Glu Gly Ser Ser 1445 1450 1455Ser Gly Lys Pro Glu Ala Ala Pro Val Asp Leu Arg Gly Thr Leu 1460 1465 1470Ser Arg Gln Ser Ala Ala Asp Gln Gln Ala Thr Leu Leu Gly Leu 1475 1480 1485Val Arg Thr Gln Ala Ala Val Val Leu Arg His Thr Glu Pro Glu 1490 1495 1500Ala Leu Ala Pro Gly Gln Ala Phe Arg Ala Leu Gly Phe Asp Ser 1505 1510 1515Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Ala Lys Ala Thr Asp 1520 1525 1530Leu Ala Leu Pro Ala Ser Leu Val Phe Asp His Pro Thr Pro Val 1535 1540 1545Lys Leu Ala Glu Phe Leu Arg Thr Glu Leu Leu Gly Thr Ala Pro 1550 1555 1560Ala Thr Thr Ala Ala Val Pro Ala Leu Gln Ala His Thr Asp Glu 1565 1570 1575Pro Ile Ala Ile Ile Gly Met Ala Cys Arg Phe Pro Gly Ala Val 1580 1585 1590Thr Thr Pro Glu His Leu Trp Asn Leu Ile Ala Thr Glu Gln Asp 1595 1600 1605Ala Ile Gly Glu Phe Pro Thr Asp Arg Gly Trp Asp Leu Asp Asn 1610 1615 1620Leu Tyr His Pro Asp Pro Asp His Pro Gly Thr Thr Tyr Thr Arg 1625 1630 1635His Gly Gly Phe Leu His Asp Ala Gly Asp Phe Asp Ala Asp Phe 1640 1645 1650Phe Gly Ile Asn Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln 1655 1660 1665Arg Leu Leu Leu Glu Thr Ala Trp Glu Ala Ile Glu His Ala Gly 1670 1675 1680Ile Leu Pro Asp Ala Leu His Gly Thr Pro Thr Gly Val Phe Thr 1685 1690 1695Gly Val Asn Ala Gln Asp Tyr Ala Ala His Thr His Thr Ser Pro 1700 1705 1710His Thr Thr Glu Gly Tyr Thr Leu Thr Gly Thr Ala Gly Ser Ile 1715 1720 1725Ala Ser Gly Arg Ile Ala Tyr Val Leu Gly Leu Glu Gly Pro Ala 1730 1735 1740Val Thr Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His 1745 1750 1755Leu Ala Cys Gln Ala Leu Arg Ala Gly Glu Cys Thr Thr Ala Leu 1760 1765 1770Ala Ser Gly Ile Ser Ile Met Thr Thr Pro Leu Ala Phe Thr Glu 1775 1780 1785Phe Ser Arg Gln Arg Gly Leu Ala Ala Asp Gly Arg Cys Lys Ala 1790 1795 1800Phe Ala Ala Ala Ala Asp Gly Thr Gly Trp Ser Glu Gly Val Gly 1805 1810 1815Thr Leu Leu Leu Glu Arg Leu Ser Asp Ala Glu Arg Asn Gly His 1820 1825 1830Arg Val Leu Ala 
Val Val Arg Gly Ser Ala Val Asn Gln Asp Gly 1835 1840 1845Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg 1850 1855 1860Val Ile Arg Gln Ala Leu Val Asn Ala Asn Leu Ser Ala Val Asp 1865 1870 1875Val Asp Ala Val Glu Ala His Gly Thr Gly Thr Lys Leu Gly Asp 1880 1885 1890Pro Ile Glu Ala Gln Ala Leu Leu Ala Thr Tyr Gly Gln Gly Arg 1895 1900 1905Ala Gln Glu Gln Pro Leu Trp Leu Gly Ser Val Lys Ser Asn Leu 1910 1915 1920Gly His Thr Gln Ala Ala Ala Gly Met Ala Gly Leu Ile Lys Met 1925 1930 1935Val Met Ala Leu Arg His Glu Ser Leu Pro Arg Thr Leu His Val 1940 1945 1950Asp Glu Pro Ser Pro Glu Val Asp Trp Ser Ser Gly Ala Val Ser 1955 1960 1965Leu Leu Thr Glu Ala Arg Pro Trp Pro Arg Val Glu Asp Arg Pro 1970 1975 1980Arg Arg Ala Gly Val Ser Ser Phe Gly Val Ser Gly Thr Asn Ala 1985 1990 1995His Val Ile Val Glu Glu Ala Pro Ala Pro Thr Gly Val Glu Ala 2000 2005 2010Val Glu Ala Ala Pro Ala Gly Val Glu Thr Ala Ala Ala Ala Ala 2015 2020 2025Val Val Val Glu Thr Asp Gly Ala Gly Arg Val Ser Ala Asp Leu 2030 2035 2040Pro Leu Val Trp Val Ala Ser Gly Lys Ser Gln Ala Ala Ile Arg 2045 2050 2055Ala Gln Ala Ala Ala Leu His Ala His Val Leu Asp His Pro Glu 2060 2065 2070Gln Asp Ala Asp Asp Ile Gly Tyr Ser Leu Ala Thr Thr Arg Ala 2075 2080 2085Leu Phe Asp His Arg Ala Thr Leu Ile Ala Pro Asp Arg His Thr 2090 2095 2100Val Pro Glu Pro Leu Thr Gly Leu Gly Asp Gly Arg Thr His Pro 2105 2110 2115His Leu Ile Pro Thr Pro Pro Thr Glu Pro Gly His Thr His Lys 2120 2125 2130Ile Ala Phe Leu Cys Ser Gly Gln Gly Thr Gln Arg Pro Gly Met 2135 2140 2145Ala Thr Gly Leu Tyr His Thr Tyr Pro Ala Phe Ala Ala Ala Leu 2150 2155 2160Asp Glu Thr Cys Ala His Phe Asp Pro His Leu Asp His Pro Leu 2165 2170 2175His Asp Leu Leu Leu Asn His Asp Pro Thr Asp Leu Leu Thr His 2180 2185 2190Thr Leu Tyr Ala Gln Pro Ala Leu Phe Thr Leu Gln Lys Ala Leu 2195 2200 2205His His Leu Ile Thr Glu Thr Tyr Gly Ile Thr Pro His Tyr Leu 2210 2215 2220Ala Gly His Ser Leu Gly Glu Ile Thr Ala Ala His Leu Ala Gly 2225 2230 2235Ile Leu Thr Leu 
Pro Asp Ala Thr His Leu Ile Thr Thr Arg Ala 2240 2245 2250Arg Leu Met Gln Thr Met Pro Pro Gly Thr Met Thr Thr Leu His 2255 2260 2265Thr Thr Pro Glu His Ile Gln Pro Leu Leu Asp Gln His Pro Gly 2270 2275 2280Lys Ala Ala Ile Ala Ala Val Asn Ser Pro His Ser Leu Val Ile 2285 2290 2295Ser Gly Asp Pro Asp Thr Ile His His Ile Thr Thr Thr Cys His 2300 2305 2310Asn Gln Gly Ile Thr Thr Lys Pro Leu Ala Thr Asn His Ala Phe 2315 2320 2325His Ser Pro His Thr Asp Thr Ile Leu Glu Gln Leu Asp Thr Thr 2330 2335 2340Thr His Thr Leu Thr Tyr His Gln Pro His Thr Pro Leu Ile Thr 2345 2350 2355Ser Thr Pro Gly Asp Pro Leu Thr Pro His Tyr Trp Thr His Gln 2360 2365 2370Thr Arg Gln Pro Val His Trp Thr Asp Thr Ile His Thr Leu His 2375 2380 2385Thr His Gly Val Thr Thr Tyr Ile Ala Leu Gly Pro Glu His Thr 2390 2395 2400Leu Thr Thr Leu Thr His His Asn Val Pro His His Gln Pro Thr 2405 2410 2415Ala Ile Thr Leu Thr His Pro His His Asn Pro Thr His His Leu 2420 2425 2430Leu Thr Ala Leu Ala His Leu His Thr Thr Gln Pro Thr Gly Pro 2435 2440 2445Asn Ile Trp His His His Tyr Thr Pro Val Ala Pro Ala Pro Arg 2450 2455 2460His Val Asp Leu Pro Thr Tyr Pro Phe Pro Arg Arg Arg Tyr Trp 2465 2470 2475Val Gln Ala Ser Ala Gly Thr Gly Asp Val Ser Ala Ala Gly Leu 2480 2485 2490Gln Arg Pro Asp His Pro Leu Leu Gly Ala Val Met Glu Leu Ala 2495 2500 2505Asp Gly Asp Gly Ile Val Leu Thr Gly Arg Leu Ser Leu His Thr 2510 2515 2520His Pro Trp Leu Ala Asp His Ser Val Gly Gly Val Ala Leu Leu 2525 2530 2535Pro Gly Thr Ala Leu Leu Glu Leu Ala Phe Gln Ala Gly Leu Arg 2540 2545 2550Ala Gly Cys Pro Gly Val Asp Glu Leu Thr Leu His Ala Pro Leu 2555 2560 2565Val Val Pro Glu Ser Gly His Val Val Val Gln Val Ser Val Ser 2570 2575 2580Val Pro Gly Glu Ala Gly Arg Arg Gly Val Ser Val Tyr Gly Arg 2585 2590 2595Leu Val Glu Asp Gly Gly Leu Glu Gly Glu Trp Thr Arg His Ala 2600 2605 2610Glu Gly Val Val Cys Pro Ser Val Pro Gly Glu Ser Val Val Val 2615 2620 2625Glu Pro Val Ala Asp Gly Val Trp Pro Pro Ser Gly Ala Gln Pro 2630 2635 2640Val Asp Leu Glu 
Glu Phe Tyr Gly Arg Leu Ala Gly Gly Gly Phe 2645 2650 2655Val Tyr Gly Pro Val Phe Gln Gly Leu Cys Ala Ala Trp Arg Asp 2660 2665 2670Gly Asp Asp Val Val Ala Glu Val Arg Leu Pro Asp Glu Gly Leu 2675 2680 2685Ala Asp Val Ala Gly Phe Gly Val His Pro Ala Leu Leu Asp Ala 2690 2695 2700Ala Val Gln Ala Val Thr Leu Leu Phe Pro Asp Gln Gln Gln Ala 2705 2710 2715Gly Leu Ala Ala His Thr Trp Asn Gly Val Ser Leu His Ala Arg 2720 2725 2730Gly Ala Thr Val Leu Arg Leu Arg Met Thr Pro Thr Asp Ala Thr 2735 2740 2745Ser Thr Ala Val Arg Leu His Ala Thr Asp Glu Thr Gly Ala Pro 2750 2755 2760Val Leu Thr Leu Asp Ser Leu Leu Met Arg Pro Val Pro Leu Glu 2765 2770 2775Gly Leu Gly Ala Gly Val Arg Arg Gly Ser Leu Phe Glu Leu Gly 2780 2785 2790Trp Val Pro Val Glu Gly Met Pro Ala Ser Val Ala Gly Gly Gly 2795 2800 2805Gly Glu Leu Val Ala Trp Glu Cys Pro Gly Gly Gly Val Ala Glu 2810 2815 2820Val Thr Ala Ala Ala Leu Gly Val Val Gln Glu Trp Leu Ala Asp 2825 2830 2835Glu Arg Glu Gly Asp Ala Arg Leu Val Val Val Thr Arg Gly Ala 2840 2845 2850Val Ala Val Asp Ala Gly Glu Pro Val Arg Asp Val Ala Gly Ala 2855 2860 2865Ala Val Trp Gly Leu Val Arg Ser Ala Gln Ser Glu His Pro Asp 2870 2875 2880Arg Phe Ala Leu Leu Asp Leu Asp Pro Asp Thr Lys Thr Asp Pro 2885 2890 2895Gly Ile Asp Thr Asp Gly Asp Thr Asp Val Ser Ala Asp Ala Lys 2900 2905 2910Val Gly Thr Gly Asp Gly Leu Asp Asp Ala Ala Val Ala Ser Ala 2915 2920 2925Leu Ala Arg Gly Glu Ser Gln Leu Ala Val Arg Asp Gly Val Val 2930 2935 2940Arg Val Ala Arg Leu Gly Gly Leu Val Gly Gly Leu Ser Leu Pro 2945 2950 2955Gly Gly Val Gly Trp Arg Leu Asp Gly Gly Gly Ser Gly Leu Leu 2960 2965 2970Glu Gly Val Gly Val Val Ala Ser Asp Ala Ala Gly Val Val Leu 2975 2980 2985Gly Arg Gly Gln Val Arg Val Ala Val Arg Ala Ala Gly Val Asn 2990 2995 3000Phe Arg Asp Val Leu Val Ala Leu Gly Met Val Pro Gly Gln Val 3005 3010 3015Gly Val Gly Ser Glu Gly Ala Gly Val Val Val Glu Val Gly Pro 3020 3025 3030Gly Val Glu Gly Leu Val Val Gly Asp Arg Val Phe Gly Val Phe 3035 3040 3045Gly Asp Ala Phe 
Ala Pro Val Val Val Ala Gln Glu Val Leu Leu 3050 3055 3060Ala Arg Ile Pro Glu Gly Trp Ser Phe Ala Gln Ala Ala Ser Val 3065 3070 3075Pro Val Val Phe Ala Thr Ala Tyr Leu Gly Leu Val Asp Leu Ala 3080 3085 3090Gly Val Arg Arg Gly Glu Ser Val Leu Val His Ala Ala Ala Gly 3095 3100 3105Gly Val Gly Thr Ala Ala Val Gln Leu Ala Arg His Leu Gly Ala 3110 3115 3120Glu Val Tyr Ala Thr Ala Ser Glu Ala Lys Trp Ala Arg Leu Arg 3125 3130 3135Ala Ala Gly Val Ala Pro Gln Arg Ile Ala Ser Ser Arg Ser Val 3140 3145 3150Glu Phe Glu Ser Arg Phe Arg Arg Ala Ser Gly Gly Arg Gly Val 3155 3160 3165Asp Val Val Leu Asn Cys Leu Ala Gly Glu Tyr Thr Asp Ala Ser 3170 3175 3180Leu Arg Leu Cys Ser Pro Gln Gly Gly Arg Phe Leu Glu Leu Gly 3185 3190 3195Lys Thr Asp Ile Arg Asp Ala Gly Glu Val Ala Ala Arg Phe Pro 3200 3205 3210Gly Val Ser Tyr Arg Ala Tyr Asp Leu Met Asp Ala Gly Ala Gln 3215 3220 3225Arg Val Gly Glu Ile Leu His Thr Val Val Asp Leu Phe Arg Arg 3230 3235 3240Gly Val Leu Glu Pro Leu Pro Val Thr Ala Trp Asp Val Arg Gln 3245 3250 3255Ala His Gln Ala Leu Arg Ser Met Arg Ser Gly Leu His Val Gly 3260 3265 3270Lys Asn Val Leu Thr Leu Pro Val Pro Leu Asp Ala Glu Gly Thr 3275 3280 3285Val Leu Val Thr Gly Gly Thr Gly Thr Leu Gly Ala Ala Val Ala 3290 3295 3300Arg His Leu Ala Ala Gly His Gly Val Arg His Leu Leu Leu Val 3305 3310 3315Ser Arg Arg Gly Met Ala Ala Ala Gly Ala Glu Lys Leu Cys Ala 3320 3325 3330Glu Leu Gly Gln Ala Gly Val Ser Val Ser Val Ala Gly Cys Asp 3335 3340 3345Val Ala Asp Arg Ala Gln Val Ala Ala Leu Leu Glu Gln Val Pro 3350 3355 3360Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu 3365 3370 3375Asp Asp Ala Thr Val Thr Cys Leu Asp Arg Asn Lys Ile Asp Ala 3380 3385 3390Val Leu Gly Ala Lys Val Asp Gly Ala Leu His Leu His Glu Leu 3395 3400 3405Thr Ala Gly Met Asp Leu Ser Ala Phe Val Leu Phe Ser Ser Ala 3410 3415 3420Ala Gly Val Leu Gly Ser Pro Gly Gln Gly Asn Tyr Ala Ala Ala 3425 3430 3435Asn Ala Ala Leu Asp Ala Leu Ala His Gln Arg Arg Ala Ala Gly 3440 3445 3450Leu Pro Ala Leu 
Ser Leu Ala Trp Gly Leu Trp Glu Glu Ala Ser 3455 3460 3465Gly Met Thr Gly His Leu Asp Ala Ala Asp Arg His Arg Ile Thr 3470 3475 3480Arg Ser Gly Leu His Pro Leu Thr Thr Pro Asp Ala Leu Ala Leu 3485 3490 3495Leu Asp Thr Ala Leu Ala Ala Gly Arg Pro Ala Leu Leu Pro Ala 3500 3505 3510Asp Leu Arg Pro Thr His Pro Ala Pro Pro Leu Leu Glu His Leu 3515 3520 3525Ala Pro Ala Arg Thr Ser His Arg Thr Ala His Thr Ser Thr Ala 3530 3535 3540Thr Gly Val Gly Gln Asp Val Ser Leu Thr Asp Arg Leu Ala Thr 3545 3550 3555Leu Thr Pro Glu Gln Arg His Asp Thr Leu Leu Ala Leu Ala Arg 3560 3565 3570Thr His Ile Ala Ala Val Leu Gly His Pro Ser Pro Asp Thr Ile 3575 3580 3585Asp Pro Glu Arg Thr Phe Arg Asp Leu Gly Phe Asp Ser Leu Thr 3590 3595 3600Ala Val Glu Leu Arg Asn Arg Leu Thr Arg Ala Thr Gly Leu Arg 3605 3610 3615Leu Pro Ala Thr Leu Ala Phe Asp His Pro Thr Pro Thr Ala Leu 3620 3625 3630Thr His His Leu Thr Thr Leu Leu Asn Pro Asn Asp Asn Asp Asn 3635 3640 3645Val Gly Pro Val Leu Met Glu Leu Glu Arg Leu Glu Ser Ala Leu 3650 3655 3660Ala Ala Leu Asp Arg Asp Asp Ser Ala Cys Glu Arg Val Thr Leu 3665 3670 3675Arg Leu Gln Ser Leu Met Leu Arg Trp Ser Gly Ser Glu Arg Gln 3680 3685 3690Ser Ala Glu Asn Thr Asp Asp Ser Ser Arg Phe Ala Ser Ala Thr 3695 3700 3705Ala Glu Glu Leu Leu Glu Phe Ile Asp Arg Asp Leu Gly Leu Ser 3710 3715 3720

SEQ ID NO: 7; length: 6043; type: PRT; organism: bacteria

Val Ala Asn Asp Glu Lys Val Leu Glu Tyr Leu Lys Arg Val Thr Ala1 5 10 15Asp Leu Asp Arg Thr Arg Arg Arg Leu Tyr Glu Val Val Glu Arg Glu 20 25 30Gln Glu Pro Ile Ala Ile Val Gly Met Ala Cys Arg Tyr Pro Gly Gly 35 40 45Ala Gly Ser Pro Ala Gly Leu Trp Asp Leu Val Ser Ser Gly Thr Asp 50 55 60Ala Ile Gly Glu Phe Pro Thr Asp Arg Gly Trp Asp Leu Glu Arg Leu65 70 75 80Tyr Asp Pro Asp Pro Asp His Pro Gly Thr Thr Tyr Thr Arg His Gly 85 90 95Gly Phe Leu Asp Gly Val Gly Glu Phe Asp Ala Glu Phe Phe Gly Val 100 105 110Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu 115 120 125Glu Thr Ala Trp Glu Ala Ile Glu His Ala Gly Ile Val Pro Glu Ser 130 135 140Leu Arg

Gly Thr Ser Thr Gly Val Phe Ala Gly Ile Asn Pro Gln Asp145 150 155 160Tyr Thr Ile Ser Gln Tyr Gly Arg Asp Ser Glu Ile Glu Gly Tyr Leu 165 170 175Leu Thr Gly Ala Ala Ala Ser Ile Ala Ser Gly Arg Ile Ser Tyr Thr 180 185 190Leu Gly Leu Glu Gly Pro Ala Val Thr Ile Asp Thr Ala Cys Ser Ser 195 200 205Ser Leu Val Ala Leu His Leu Ala Cys Gln Ala Leu Arg Ala Gly Glu 210 215 220Cys Thr Met Ala Leu Ala Gly Gly Ala Ser Val Leu Ser Thr Pro Leu225 230 235 240Ile Phe Val Glu Phe Ala Arg His His Gly Leu Ser Val Asp Gly Arg 245 250 255Cys Lys Ala Phe Ser Ala Ser Ala Asp Gly Thr Gly Trp Gly Glu Gly 260 265 270Ala Gly Leu Leu Leu Leu Glu Arg Leu Ser Asp Ala Lys Arg Asn Gly 275 280 285Arg Arg Ile Leu Ala Leu Val Arg Gly Ser Ala Val Asn Gln Asp Gly 290 295 300Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Cys Arg Val305 310 315 320Ile Arg Arg Ala Leu Ala Asn Ala His Leu Ala Pro Ala Asp Ile Asp 325 330 335Ala Val Glu Ala His Gly Thr Gly Thr Thr Leu Gly Asp Pro Ile Glu 340 345 350Ala Gln Ala Leu Gln Glu Ala Tyr Gly Ala Asp Arg Pro Asp Asp Arg 355 360 365Pro Leu Trp Val Gly Thr Leu Lys Ser Asn Ile Gly His Ser Ile Ala 370 375 380Ala Ala Gly Val Gly Gly Val Ile Lys Met Val Met Ala Leu Arg His385 390 395 400Glu Ser Leu Pro Arg Thr Leu His Val Asp Glu Pro Ser Pro Gln Val 405 410 415Asp Trp Ser Ser Gly Ala Val Ser Leu Leu Thr Glu Ala Arg Pro Trp 420 425 430Pro Arg Asp Glu Asp Arg Pro Arg Arg Ala Gly Val Ser Ser Phe Gly 435 440 445Val Ser Gly Thr Asn Ala His Val Ile Leu Glu Glu Ala Pro Ala Pro 450 455 460Ala Glu Val Gln Ala Val Glu Thr Ala Pro Val Val Arg Val Asp Gly465 470 475 480Gly Glu Arg Ser Ala Pro Ala Asp Val Pro Leu Val Trp Val Val Ser 485 490 495Gly Lys Ser Gln Ala Ala Leu Arg Ala Gln Ala Ala Ala Leu His Ala 500 505 510His Val Leu Asp His Pro Glu Gln Asp Ala Ala Asp Ile Gly Tyr Ser 515 520 525Leu Ala Thr Thr Arg Ala Leu Phe Asp His Arg Ala Thr Leu Ile Ala 530 535 540Pro Asp Arg Asp Thr Leu Leu Asp Ala Leu Thr Ala Leu Ala Asp Gly545 550 555 560Arg Thr His Pro His Leu Val Pro Ala Pro 
Pro Thr Glu Pro Gly His 565 570 575Ala His Lys Ile Ala Phe Leu Cys Ser Gly Gln Gly Thr Gln Arg Pro 580 585 590Gly Met Ala Thr Gly Leu Tyr His Thr Tyr Pro Ala Phe Ala Ala Ala 595 600 605Leu Asp Glu Thr Cys Ala His Phe Asp Pro His Leu Asp His Pro Leu 610 615 620Arg Asp Leu Leu Leu Asn His Asp Pro Thr Gly Leu Leu Thr His Thr625 630 635 640Leu Tyr Ala Gln Pro Ala Leu Phe Thr Leu Gln Lys Ala Leu His His 645 650 655Leu Ile Thr Glu Thr Tyr Gly Ile Thr Pro His Tyr Leu Ala Gly His 660 665 670Ser Leu Gly Glu Ile Thr Ala Ala His Leu Ala Gly Ile Leu Thr Leu 675 680 685Pro Asp Ala Thr His Leu Ile Thr Thr Arg Ala Arg Leu Met Gln Thr 690 695 700Met Pro Pro Gly Thr Met Thr Thr Leu His Thr Thr Pro Glu His Ile705 710 715 720Gln Pro Leu Leu Asp Gln His Pro Gly Lys Ala Thr Ile Ala Ala Val 725 730 735Asn Ser Pro His Ser Leu Val Ile Ser Gly Asp Pro Asp Thr Ile His 740 745 750His Ile Thr Thr Thr Cys His Thr Gln Gly Ile Thr Thr Lys Pro Leu 755 760 765Thr Thr Asn His Ala Phe His Ser Pro His Thr Asp Thr Ile Leu Glu 770 775 780Gln Leu Asp Thr Thr Thr His Thr Leu Thr Tyr His Pro Pro His Thr785 790 795 800Pro Leu Ile Thr Ser Thr Pro Gly Asp Pro Leu Thr Pro His Tyr Trp 805 810 815Thr His Gln Thr Arg Gln Pro Val His Trp Thr Asp Thr Ile His Thr 820 825 830Leu His Thr Asn Gly Val Thr Thr Tyr Ile Glu Leu Gly Pro Asp His 835 840 845Thr Leu Thr Thr Leu Thr His His Asn Leu Pro His His Gln Pro Thr 850 855 860Ala Ile Thr Leu Thr His Pro His His Asn Pro Thr His His Leu Leu865 870 875 880Thr Ala Leu Ala His Thr Pro Thr Thr Trp His Thr His His His Thr 885 890 895His Thr Asn Pro His Pro His Thr Ile Pro Asp Leu Pro Thr Tyr Pro 900 905 910Phe Gln Arg Arg His Tyr Trp Leu Gln Ala Pro Thr Thr Ser Thr Asp 915 920 925Gln Pro Val Ala Pro Thr Asn Asp Asp Ala Pro Ala Pro Arg Ala Thr 930 935 940Ser Leu Arg Asp Thr Leu Ala Gly Arg Ser Pro Gln Glu Arg Glu Glu945 950 955 960Val Leu Leu Asp Leu Val Leu Thr Gln Val Ala Ala Val Leu Gly His 965 970 975Thr Ala Pro Glu Val Val Asp Pro Gln Arg Ala Phe Lys Asp Leu Gly 980 985 
990Phe Asp Ser Leu Ala Ala Ile Lys Leu Arg Asn Arg Leu Ala Ala Ala 995 1000 1005Thr Gly Leu Glu Leu Pro Thr Thr Leu Val Phe Asp His Pro Thr 1010 1015 1020Pro Val Ala Leu Arg Gln Tyr Phe Gln Ser Gln Ile Leu Gly Ala 1025 1030 1035Glu Ala Asp Ala Pro Asn Arg Leu Pro Leu Arg Ala Ala Thr Thr 1040 1045 1050Asp Glu Pro Ile Ala Ile Val Gly Met Ala Cys Arg Phe Pro Gly 1055 1060 1065Gly Val Arg Thr Ala Asp Asp Leu Trp Gln Leu Leu Ser Asp Glu 1070 1075 1080His Asp Ala Val Gly Gly Phe Pro Thr Asn Arg Gly Trp Asp Val 1085 1090 1095Ala Asn Leu Tyr Asp Pro Asp Pro Asp Arg His Gly Thr Thr Tyr 1100 1105 1110Thr Gln Gln Gly Gly Phe Leu Tyr Glu Ala Gly Glu Phe Asp Ala 1115 1120 1125Glu Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro 1130 1135 1140Gln Gln Arg Leu Leu Leu Glu Thr Ala Trp Glu Ala Ile Glu His 1145 1150 1155Ala Gly Ile Asn Pro Asp Ala Leu Arg Asn Thr Ser Thr Gly Val 1160 1165 1170Phe Ala Gly Val Ile Tyr His Asp Tyr Ala Ser Arg Phe Leu Thr 1175 1180 1185Ala Pro Ala Gly Tyr Glu Gly Tyr Leu Gly His Gly Ser Ala Gly 1190 1195 1200Ser Ile Ala Ser Gly Arg Val Ala Tyr Val Leu Gly Leu Glu Gly 1205 1210 1215Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala 1220 1225 1230Leu His Leu Ala Cys Gln Ala Leu Arg Ser Gly Glu Cys Thr Met 1235 1240 1245Ala Leu Ala Gly Gly Ala Thr Val Met Ser Thr Pro Gln Ala Phe 1250 1255 1260Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Ala Asp Gly Arg Cys 1265 1270 1275Lys Ala Phe Ser Ala Ala Ala Asp Gly Thr Gly Trp Gly Glu Gly 1280 1285 1290Ala Gly Leu Leu Leu Leu Glu Arg Leu Ser Glu Ala Glu Arg Asn 1295 1300 1305Gly His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val Asn Gln 1310 1315 1320Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln 1325 1330 1335Gln Arg Val Ile Arg Gln Ala Leu Ala Asn Ser Gly Leu Thr Gly 1340 1345 1350Ala Asp Val Asp Ala Val Glu Ala His Gly Thr Gly Thr Lys Leu 1355 1360 1365Gly Asp Pro Ile Glu Ala Gln Ala Leu Leu Ala Thr Tyr Gly Gln 1370 1375 1380Glu His His Pro Asp Gln Pro Leu Trp Leu Gly Ser Leu Lys Ser 1385 1390 
1395Asn Ile Gly His Ala Gln Ala Ala Ala Gly Val Gly Ser Ile Ile 1400 1405 1410Lys Met Ile Met Ala Met Arg Asn Glu Ser Leu Pro Arg Thr Leu 1415 1420 1425His Val Asp Glu Pro Ser Pro His Val Asp Trp Ser Ser Gly Ala 1430 1435 1440Val Ser Leu Leu Thr Glu Pro Arg Pro Trp Pro Arg Arg Glu Asp 1445 1450 1455Arg Pro Arg Arg Ala Gly Ile Ser Ser Phe Gly Val Ser Gly Thr 1460 1465 1470Asn Ala His Val Ile Val Glu Glu Pro Pro Ala Arg Ala Glu Val 1475 1480 1485Glu Ala Val Glu Ala Ala Pro Ala Gly Val Glu Thr Ala Ala Ala 1490 1495 1500Ala Ala Val Val Val Glu Thr Asp Gly Ala Gly Arg Val Ser Ser 1505 1510 1515Asp Val Pro Leu Val Trp Val Val Ser Gly Lys Ser Gln Ala Ala 1520 1525 1530Leu Arg Ala Gln Ala Ala Ala Leu His Ala His Val Leu Asp His 1535 1540 1545Pro Glu Gln Asp Ala Ala Asp Ile Gly Tyr Ser Leu Ala Thr Thr 1550 1555 1560Arg Ala Leu Phe Asp His Arg Ala Thr Leu Ile Ala Pro Asp Arg 1565 1570 1575Asp Thr Leu Leu Asp Ala Leu Thr Ala Leu Ala Asp Gly Arg Thr 1580 1585 1590His Pro His Leu Ile Pro Thr Pro Pro Thr Glu Pro Gly His Thr 1595 1600 1605His Lys Ile Ala Phe Leu Cys Ser Gly Gln Gly Thr Gln Arg Pro 1610 1615 1620Gly Met Ala Thr Gly Leu Tyr His Thr Tyr Pro Ala Phe Ala Ala 1625 1630 1635Ala Leu Asp Glu Thr Cys Ala His Phe Asp Pro His Leu Asp His 1640 1645 1650Pro Leu Arg Asp Leu Leu Leu Asn His Asp Pro Thr Asp Leu Leu 1655 1660 1665Thr His Thr Leu Tyr Ala Gln Pro Ala Leu Phe Thr Leu Gln Lys 1670 1675 1680Ala Leu His His Leu Ile Thr Glu Thr Tyr Gly Ile Thr Pro His 1685 1690 1695Tyr Leu Ala Gly His Ser Leu Gly Glu Ile Thr Ala Ala His Leu 1700 1705 1710Ala Gly Ile Leu Thr Leu Pro Asp Ala Thr His Leu Ile Thr Thr 1715 1720 1725Arg Ala Arg Leu Met Gln Thr Met Pro Pro Gly Thr Met Thr Thr 1730 1735 1740Leu His Thr Thr Pro Glu His Ile Gln Pro Leu Leu Asp Gln His 1745 1750 1755Pro Gly Lys Ala Thr Ile Ala Ala Val Asn Ser Pro His Ser Leu 1760 1765 1770Val Ile Ser Gly Asp Pro Asp Thr Ile His His Ile Thr Thr Thr 1775 1780 1785Cys His Asn Gln Gly Ile Thr Thr Lys Pro Leu Thr Thr Asn His 1790 1795 
1800Ala Phe His Ser Pro His Thr Asn Thr Ile Leu Glu Gln Leu Asp 1805 1810 1815Thr Thr Thr His Thr Leu Thr Tyr His Pro Pro His Thr Pro Leu 1820 1825 1830Ile Thr Ser Thr Pro Gly Asn Pro Leu Thr Pro His Tyr Trp Thr 1835 1840 1845His Gln Thr Arg Gln Pro Val His Trp Ala Asp Thr Ile His Thr 1850 1855 1860Leu His Thr Asn Gly Val Thr Thr Tyr Ile Gly Leu Gly Pro Asp 1865 1870 1875His Thr Leu Ser Thr Leu Thr His His Asn Leu Pro Gln His Gln 1880 1885 1890Pro Thr Ala Ile Thr Leu Thr His Pro His His Asn Pro Thr His 1895 1900 1905His Leu Leu Thr Ala Leu Ala His Thr Pro Thr Thr Trp His Thr 1910 1915 1920His His His Thr His Thr Asn Pro His Pro His Thr Ile Pro Asp 1925 1930 1935Leu Pro Thr Tyr Pro Phe Gln Arg Arg His Tyr Trp Leu Glu Val 1940 1945 1950Pro Lys Pro Thr Ala Glu Ala Ser Ala Ser Ala Ser Gly Pro Gly 1955 1960 1965Arg Asn Arg Ala Ala Lys Leu Ser Ala Leu Glu Ala Glu Phe Trp 1970 1975 1980Gln Ala Val Glu Glu Thr Asp Thr Asp Thr Leu Ala His Thr Leu 1985 1990 1995Asp Leu Asp Thr Gln Thr Leu Glu Pro Val Leu Pro Ala Leu Ala 2000 2005 2010Thr Trp His Gln Gln Gln Arg Asp His Ala Arg Ile Asn Thr Trp 2015 2020 2025Thr Tyr Gln Glu Thr Trp Lys Pro Leu His Leu Pro Thr Thr Arg 2030 2035 2040Pro Thr Thr Pro Thr Ser Trp Leu Ile Ala Ile Pro Glu Thr His 2045 2050 2055Arg Asn His Pro His Thr Thr Asn Leu Leu Thr Asn Leu Pro His 2060 2065 2070His Asn Ile Thr Pro Ile Pro Leu Thr Ile Asn His Thr Thr Asp 2075 2080 2085Leu His His Ala Tyr His His Ala His His His Thr Thr Pro Pro 2090 2095 2100Ile Thr Ala Val Leu Ser Leu Leu Ala Leu Asp Glu Thr Pro His 2105 2110 2115Pro His His Pro His Thr Pro Thr Gly Thr Leu Leu Asn Leu Thr 2120 2125 2130Leu Thr Gln Thr His Thr Gln Thr His Pro Pro Thr Pro Leu Trp 2135 2140 2145Tyr Leu Thr Thr Gln Ala Thr Thr Thr His Pro Asn Asp Pro Leu 2150 2155 2160Thr His Pro Thr Gln Ala Gln Thr Ile Gly Leu Ala Arg Thr Thr 2165 2170 2175His Leu Glu His Pro His His Thr Gly Gly His Ile Asp Leu Pro 2180 2185 2190Thr Thr Pro His Pro Asn Thr Leu Thr Gln Leu Ile Thr Ala Leu 2195 2200 
2205Thr His Pro His His Gln His Asn Leu Thr Ile Arg Thr His Thr 2210 2215 2220Thr His Thr Arg Arg Leu Thr Pro Thr Thr Leu Gln Pro Thr Thr 2225 2230 2235Pro Thr Pro Pro Thr Asn Pro His Gly Thr Thr Leu Ile Thr Gly 2240 2245 2250Gly Thr Gly Ala Leu Ala Thr Thr Leu Ala His His Leu Ala Thr 2255 2260 2265Thr Gly Thr Gln His Leu Leu Leu Thr Ser Arg Arg Gly Pro His 2270 2275 2280Thr Pro Gly Ala Arg Gln Leu His Thr Gln Leu Thr Gln Leu Gly 2285 2290 2295Thr Asn Thr Thr Ile Thr Ala Cys Asp Leu Ser Asp Pro Asp Gln 2300 2305 2310Leu Thr His Leu Leu Thr His Ile Pro Pro Glu His Pro Leu Thr 2315 2320 2325Thr Val Ile His Thr Ala Gly Ile Leu Asp Asp Ala Thr Leu Thr 2330 2335 2340Asn Leu Thr Pro Thr Gln Leu Asp Asn Val Leu Arg Ala Lys Ala 2345 2350 2355His Thr Ala His Leu Leu His His Ala Thr Leu His Thr Pro Leu 2360 2365 2370Asp His Phe Val Leu Tyr Ser Ser Ala Ala Ala Thr Leu Gly Ala 2375 2380 2385Pro Gly Gln Ala Asn Tyr Ala Ala Ala Asn Ala Tyr Leu Asp Ala 2390 2395 2400Leu Ala His His Arg His Thr His Asn Leu Pro Ala Thr Thr Ile 2405 2410 2415Ala Trp Gly Thr Trp Gln Gly Asn Gly Leu Ala Ser Gly Asp Ile 2420 2425 2430Gly Glu His Leu Arg Arg Arg Gly Met Ile Pro Leu Asp Pro Glu 2435 2440 2445Ser Ala Val Gly Ala Phe Asp Arg Ala Val Ala Ser Asp Arg Pro 2450 2455 2460Ser Val Phe Val Ala Asp Ile Asp Trp Pro Thr Phe Gly Arg Asn 2465 2470 2475Thr Ser Ser Gly Leu Arg Ala Leu Phe Glu Asp Ile Pro Glu Ala 2480 2485 2490Thr Gln Pro Glu Pro Thr Ala Arg Ser Ala Asp Gln Pro Asn Gly 2495 2500 2505His Gly Ser Leu Gln Glu Leu Leu Ala Arg Gln Ser Pro Ala Glu 2510 2515 2520Gln Ala Glu Thr Leu Leu Ala Leu Val Arg Thr His Ser Ala Thr 2525 2530 2535Val Leu Gly Arg Asp Gly Ala Asp Ala Val Ala Ala Glu Arg Pro 2540 2545 2550Phe Arg Asp Leu Gly Phe Asp Ser Leu Ser Ala Val Glu Leu Arg 2555 2560 2565Asn His Leu Thr Ala Asp Thr Glu Leu Ala Leu Pro Thr Thr Leu 2570 2575 2580Val Phe Asp His Pro Thr Pro Val Lys Leu Ala Glu Phe Leu Arg

2585 2590 2595Thr Glu Leu Leu Gly Thr Ala Pro Ala Thr Thr Ala Ala Val Pro 2600 2605 2610Ala Leu Gln Ser His Thr Asp Glu Pro Ile Ala Ile Ile Gly Met 2615 2620 2625Ala Cys Arg Phe Pro Gly Ala Val Thr Thr Pro Glu His Leu Trp 2630 2635 2640Asn Leu Ile Ala Thr Glu Gln Asp Ala Ile Gly Glu Phe Pro Thr 2645 2650 2655Asp Arg Gly Trp Asp Leu Asp Asn Leu Tyr His Pro Asp Pro Asp 2660 2665 2670His Pro Gly Thr Thr Tyr Thr Arg His Gly Gly Phe Leu Tyr Asp 2675 2680 2685Ala Gly Asp Phe Asp Ala Glu Phe Phe Gly Ile Asn Pro Arg Glu 2690 2695 2700Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Thr Ala 2705 2710 2715Trp Glu Ala Ile Glu His Ala Gly Ile Leu Pro Asp Ala Leu His 2720 2725 2730Gly Thr Pro Thr Gly Val Phe Thr Gly Val Asn Ala Gln Asp Tyr 2735 2740 2745Ala Ala His Thr His Ala Ser Pro His Thr Thr Glu Gly Tyr Thr 2750 2755 2760Leu Thr Gly Thr Ala Gly Ser Ile Ala Ser Gly Arg Ile Ala Tyr 2765 2770 2775Thr Leu Gly Leu Glu Gly Pro Ala Val Thr Ile Asp Thr Ala Cys 2780 2785 2790Ser Ser Ser Leu Val Ala Leu His Leu Ala Cys Gln Ala Leu Arg 2795 2800 2805Ala Gly Glu Cys Thr Thr Ala Leu Ala Ser Gly Ile Thr Val Met 2810 2815 2820Thr Ser Pro Val Thr Phe Thr Glu Phe Ser Arg Gln Arg Gly Leu 2825 2830 2835Ala Pro Asp Gly His Cys Lys Ala Phe Ser Ala Ser Ala Asp Gly 2840 2845 2850Thr Gly Trp Ser Glu Gly Val Gly Thr Ile Leu Val Glu Arg Leu 2855 2860 2865Ser Asp Ala Glu Arg Asn Gly His Arg Ile Leu Ala Val Val Arg 2870 2875 2880Gly Ser Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala 2885 2890 2895Pro Asn Gly Pro Ser Gln Gln Arg Val Ile Arg Gln Ala Leu Ala 2900 2905 2910Asn Ser Gly Leu Thr Gly Ala Asp Val Asp Ala Val Glu Ala His 2915 2920 2925Gly Thr Gly Thr Lys Leu Gly Asp Pro Ile Glu Ala Gln Ala Leu 2930 2935 2940Leu Ala Thr Tyr Gly Gln Gly Arg Ala Gln Glu Gln Pro Leu Trp 2945 2950 2955Leu Gly Ser Val Lys Ser Asn Leu Gly His Thr Gln Ala Ala Ala 2960 2965 2970Gly Met Ala Gly Leu Ile Lys Met Val Met Ala Leu Arg His Glu 2975 2980 2985Ser Leu Pro Arg Thr Leu His Val Asp Glu Pro Ser Pro Gln Val 
2990 2995 3000Asp Trp Ser Ser Gly Ala Val Ser Leu Leu Thr Glu Ala Arg Pro 3005 3010 3015Trp Pro Arg Arg Glu Asp Arg Pro Arg Arg Ala Gly Ile Ser Ser 3020 3025 3030Phe Gly Val Ser Gly Thr Asn Ala His Val Ile Leu Glu Glu Ala 3035 3040 3045Pro Ala Pro Ala Glu Ala Val Glu Thr Glu Gln Gly Val Val Pro 3050 3055 3060Gln Gly Asp Gln Glu Cys Ser Ala Pro Val Gly Val Pro Leu Val 3065 3070 3075Trp Val Val Ser Gly Lys Ser Gln Ala Ala Leu Arg Ala Gln Ala 3080 3085 3090Ala Ala Leu His Ala His Val Leu Asp His Pro Glu Gln Asp Ala 3095 3100 3105Ala Asp Ile Gly Tyr Ser Leu Ala Thr Thr Arg Ala Leu Phe Asp 3110 3115 3120His Arg Ala Thr Leu Ile Ala Pro Asp Arg Asp Thr Leu Leu Asp 3125 3130 3135Ala Leu Thr Ala Leu Ala Asp Gly Arg Thr His Pro His Leu Ile 3140 3145 3150Pro Thr Pro Pro Thr Glu Pro Gly His Thr His Lys Ile Ala Phe 3155 3160 3165Leu Cys Ser Gly Gln Gly Thr Gln Arg Pro Gly Met Ala Thr Gly 3170 3175 3180Leu Tyr His Thr Tyr Pro Ala Phe Ala Ala Ala Leu Asp Glu Thr 3185 3190 3195Cys Ala His Phe Asp Pro His Leu Asp His Pro Leu Arg Asp Leu 3200 3205 3210Leu Leu Asn His Asp Pro Thr Asp Leu Leu Thr His Thr Leu Tyr 3215 3220 3225Ala Gln Pro Ala Leu Phe Thr Leu Gln Lys Ala Leu His His Leu 3230 3235 3240Ile Thr Glu Thr Tyr Gly Ile Thr Pro His Tyr Leu Ala Gly His 3245 3250 3255Ser Leu Gly Glu Ile Thr Ala Ala His Leu Ala Gly Ile Leu Thr 3260 3265 3270Leu Pro Asp Ala Thr His Leu Ile Thr Thr Arg Ala Arg Leu Met 3275 3280 3285Gln Thr Met Pro Pro Gly Thr Met Thr Thr Leu His Thr Thr Pro 3290 3295 3300Glu His Ile Gln Pro Leu Leu Asp Gln His Pro Gly Lys Ala Thr 3305 3310 3315Ile Ala Ala Val Asn Ser Pro His Ser Leu Val Ile Ser Gly Asp 3320 3325 3330Pro Asp Thr Ile His His Ile Thr Thr Thr Cys His Thr Gln Gly 3335 3340 3345Ile Thr Thr Lys Pro Leu Thr Thr Asn His Ala Phe His Ser Pro 3350 3355 3360His Thr Asp Thr Ile Leu Glu Gln Leu Asp Thr Thr Thr His Thr 3365 3370 3375Leu Thr Tyr His Gln Pro His Thr Pro Leu Ile Thr Ser Thr Pro 3380 3385 3390Gly Asp Pro Leu Thr Pro His Tyr Trp Thr His Gln Thr Arg Gln 
3395 3400 3405Pro Val His Trp Ala Asp Thr Ile His Thr Leu His Thr Asn Gly 3410 3415 3420Val Thr Thr Tyr Ile Gly Leu Gly Pro Asp His Thr Leu Ser Thr 3425 3430 3435Leu Thr His His Asn Leu Pro Gln His Gln Pro Thr Ala Ile Thr 3440 3445 3450Leu Thr His Pro His His Asn Pro Thr His His Leu Leu Thr Ala 3455 3460 3465Leu Ala His Thr Pro Thr Thr Trp His Thr His His His Thr His 3470 3475 3480Thr Asn Pro His Pro His Thr Ile Pro Asp Leu Pro Thr Tyr Pro 3485 3490 3495Phe Gln Arg Arg His Tyr Trp Leu Glu Val Pro Lys Pro Thr Ala 3500 3505 3510Glu Ala Ser Ala Ser Ala Ser Gly Pro Gly Arg Asn Arg Ala Ala 3515 3520 3525Lys Leu Ser Ala Leu Glu Ala Glu Phe Trp Gln Ala Val Glu Glu 3530 3535 3540Thr Asp Thr Asp Thr Leu Ala His Thr Leu Asp Leu Asp Thr Gln 3545 3550 3555Thr Leu Glu Pro Val Leu Pro Ala Leu Ala Thr Trp His Gln Gln 3560 3565 3570Gln Arg Asp His Ala Arg Ile Asn Thr Trp Thr Tyr Gln Glu Thr 3575 3580 3585Trp Lys Pro Leu His Leu Pro Thr Thr Arg Pro Thr Thr Pro Thr 3590 3595 3600Ser Trp Leu Ile Ala Ile Pro Glu Thr His Arg Asn His Pro His 3605 3610 3615Thr Thr Asn Leu Leu Thr Asn Leu Pro His His Asn Ile Thr Pro 3620 3625 3630Ile Pro Leu Thr Ile Asn His Thr Thr Asp Leu His His Ala Tyr 3635 3640 3645His His Ala His His His Thr Thr Pro Pro Ile Thr Ala Val Leu 3650 3655 3660Ser Leu Leu Ala Leu Asp Glu Thr Pro His Pro His His Pro His 3665 3670 3675Thr Pro Thr Gly Thr Leu Leu Asn Leu Thr Leu Thr Gln Thr His 3680 3685 3690Thr Gln Thr His Pro Pro Thr Pro Leu Trp Tyr Leu Thr Thr Gln 3695 3700 3705Ala Thr Thr Thr His Pro Asn Asp Pro Leu Thr His Pro Thr Gln 3710 3715 3720Ala Gln Thr Ile Gly Leu Ala Arg Thr Thr His Leu Glu His Pro 3725 3730 3735His His Thr Gly Gly His Ile Asp Leu Pro Thr Thr Pro His Pro 3740 3745 3750Asn Thr Leu Thr Gln Leu Ile Thr Ala Leu Thr His Pro His His 3755 3760 3765Gln His Asn Leu Thr Ile Arg Thr His Thr Thr His Thr Arg Arg 3770 3775 3780Leu Thr Pro Thr Thr Leu Gln Pro Thr Thr Pro Thr Pro Pro Thr 3785 3790 3795Asn Pro His Gly Thr Thr Leu Ile Thr Gly Gly Thr Gly Ala Leu 
3800 3805 3810Ala Thr Thr Leu Ala His His Leu Ala Thr Thr Gly Thr Gln His 3815 3820 3825Leu Leu Leu Thr Ser Arg Arg Gly Pro His Thr Pro Gly Ala Arg 3830 3835 3840Gln Leu His Thr Gln Leu Thr Gln Leu Gly Thr Asn Thr Thr Ile 3845 3850 3855Thr Ala Cys Asp Leu Ser Asp Pro Asp Gln Leu Thr His Ile Leu 3860 3865 3870Thr His Ile Pro Pro Glu His Pro Leu Thr Thr Val Ile His Thr 3875 3880 3885Ala Gly Val Asn His Tyr Ala Pro Val Ala Ala Thr Asp Pro Ser 3890 3895 3900Thr Phe Ala Ser Val Leu Ala Ala Lys Ala Ala Gly Ala Ala His 3905 3910 3915Leu His Glu Leu Leu Leu Glu Leu Asp Thr Val Glu Gln Phe Ile 3920 3925 3930Leu Phe Ser Ser Gly Ser Gly Ala Trp Gly Ser Gly Asn Gln Cys 3935 3940 3945Ala Tyr Ala Ala Ala Asn Ala Tyr Leu Asp Ala Leu Ala Ala His 3950 3955 3960Arg Gln Ala Arg Gly Leu Pro Gly Met Ser Leu Ala Trp Gly Pro 3965 3970 3975Trp Asp Gly Asp Gly Met Ser Ala Gly Glu Asp Ala Gln Arg Tyr 3980 3985 3990Leu Arg Glu Arg Gly Val Leu Pro Met Asp Pro Arg Leu Ala Val 3995 4000 4005Ala Ala Phe Asp Glu Ala Val Arg Ala Arg Pro Asn Ser Asn Leu 4010 4015 4020Val Val Ala Asp Ile Asp Trp Glu Arg Phe Val Pro Thr Phe Thr 4025 4030 4035Ala Arg Gly His Asn Pro Leu Ile Glu Asp Ile Pro Glu Val Arg 4040 4045 4050Arg Leu Ala Ala Glu Ala Glu Ala Ala Gln Thr Thr Thr Ala Ala 4055 4060 4065Thr Asp Ala Pro Ala Leu Leu Asn Arg Leu Ser Gly Leu Ser Ala 4070 4075 4080Thr Gln Gln Lys Gln His Leu Leu Arg Leu Val Arg Ser His Met 4085 4090 4095Gly Glu Val Leu Gly Arg Glu Asp Val Asp Thr Leu Asp Glu Arg 4100 4105 4110His Thr Phe Arg Asp Leu Gly Phe Asp Ser Leu Thr Ser Ala Arg 4115 4120 4125Phe Ser Gln Arg Leu Ala Lys Asp Thr Gly Leu His Leu Pro Ala 4130 4135 4140Thr Leu Val Phe Asp His Pro Thr Pro Ala Asp Cys Val Ala His 4145 4150 4155Leu Arg Asp Gln Leu Leu Gly Glu Thr Asp Asp Met Thr Pro Arg 4160 4165 4170Lys Arg Asp His Leu Gly Glu Asp Arg Arg Ala Ala Thr Ala Asp 4175 4180 4185Asp Pro Ile Ala Ile Val Gly Met Ala Cys Arg Phe Pro Gly Gly 4190 4195 4200Val Arg Ser Ala Asp Asp Leu Trp Asp Leu Leu Ser Ser Gly Thr 
4205 4210 4215Asp Ala Ile Ser Gly Phe Pro Thr Asp Arg Gly Trp Asp Ile Glu 4220 4225 4230Ser Leu Tyr Asp Pro Asp Pro Asp Arg Ser Gly Thr Thr Tyr Thr 4235 4240 4245Arg His Gly Gly Phe Leu Tyr Asp Ala Gly Gln Phe Asp Ala Glu 4250 4255 4260Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln 4265 4270 4275Gln Arg Leu Leu Leu Glu Thr Ala Trp Glu Ala Val Glu His Ala 4280 4285 4290Gly Ile Asn Pro Gln Thr Leu His Gly Thr Pro Thr Gly Val Phe 4295 4300 4305Thr Gly Val Asn Ala Gln Asp Tyr Ala Ala His Leu Arg Gln Ala 4310 4315 4320Ser Gly Asn Val Glu Gly Tyr Ala Leu Thr Gly Ser Ser Gly Ser 4325 4330 4335Val Val Ser Gly Arg Val Ala Tyr Thr Phe Gly Phe Glu Gly Pro 4340 4345 4350Ala Val Ser Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu 4355 4360 4365His Leu Ala Gly Gln Ala Leu Arg Ser Gly Glu Cys Thr Met Ala 4370 4375 4380Leu Ala Gly Gly Val Met Val Met Ser Ser Pro Glu Thr Phe Val 4385 4390 4395Glu Phe Ser Arg Gln Arg Gly Leu Ser Val Asp Gly Arg Cys Lys 4400 4405 4410Ser Phe Ala Ala Ala Ala Asp Gly Thr Gly Trp Gly Glu Gly Val 4415 4420 4425Gly Met Leu Leu Val Glu Arg Leu Ser Asp Ala Glu Arg Asn Gly 4430 4435 4440His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val Asn Gln Asp 4445 4450 4455Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln 4460 4465 4470Arg Val Ile Arg Gln Ala Leu Ala Asn Ser Gly Leu Thr Gly Ala 4475 4480 4485Asp Val Asp Ala Val Glu Ala His Gly Thr Gly Thr Lys Leu Gly 4490 4495 4500Asp Pro Ile Glu Ala Gln Ala Leu Leu Ala Thr Tyr Gly Gln Glu 4505 4510 4515His His Pro Asp Gln Pro Leu Trp Leu Gly Ser Leu Lys Ser Asn 4520 4525 4530Ile Gly His Ala Gln Ala Ala Ala Gly Val Gly Gly Ile Ile Lys 4535 4540 4545Met Val Met Ala Leu Arg His Glu Thr Leu Pro Arg Thr Leu His 4550 4555 4560Ile Asp Glu Pro Thr Pro Gln Val Asp Trp Ser Ser Gly Ala Val 4565 4570 4575Ser Leu Leu Thr Glu Pro Arg Pro Trp Pro Arg Gln Gly Asp Arg 4580 4585 4590Pro Arg Arg Ala Gly Ile Ser Ser Phe Gly Val Ser Gly Thr Asn 4595 4600 4605Ala His Val Ile Leu Glu Glu Ala Pro Ala Gln Pro Ala Gly Asp 
4610 4615 4620Pro Ala Pro Glu Asp Gly Ala Pro Val Pro Trp Ala Met Ser Ala 4625 4630 4635Arg Ser Asn Ala Ala Leu Arg Ala Gln Ala Ala Leu Leu Arg Asp 4640 4645 4650Phe Leu Gln Gly Pro Gly Thr Asp Thr Ala Leu Arg Ala Val Gly 4655 4660 4665Ala Glu Leu Ala His Gly Arg Ala Val Leu Glu His Arg Ala Val 4670 4675 4680Ile Val Ala Arg Glu Arg Thr Glu Phe Glu Asp Ala Leu Glu Ala 4685 4690 4695Leu Ala Ser Gly Glu Pro His Pro Ala Leu Ile Glu Asp Thr Thr 4700 4705 4710Gly Ser Gln Thr Asn Ser His Ser Gly Gly Gly Val Val Phe Val 4715 4720 4725Phe Pro Gly Gln Gly Gly Gln Trp Ala Gly Met Gly Leu Asp Leu 4730 4735 4740Leu Arg Asp Ser Gln Val Phe Ala Asp His Val Gly Ala Cys Glu 4745 4750 4755Arg Ala Leu Ala Pro Trp Val Glu Trp Ser Leu Thr Glu Met Leu 4760 4765 4770His Arg Asp Ala Glu Asp Pro Val Trp Glu Arg Ala Asp Val Val 4775 4780 4785Gln Pro Val Leu Phe Ser Val Met Val Ser Leu Ala Ala Leu Trp 4790 4795 4800Arg Ser Tyr Gly Ile Glu Pro Glu Ala Val Val Gly His Ser Gln 4805 4810 4815Gly Glu Ile Ala Ala Ala His Val Cys Gly Ala Leu Thr Leu Glu 4820 4825 4830Asp Ala Ala Lys Ile Val Ala Leu Arg Ser Arg Ala Leu Ala Ala 4835 4840 4845Leu Arg Gly His Gly Gly Met Ala Ser Leu Ala Leu Thr Gly Thr 4850 4855 4860Glu Ala Glu Asp Leu Ile Thr Thr His Trp Pro Gly Arg Leu Trp 4865 4870 4875Thr Ala Ala Phe Asn Gly Pro Arg Ala Thr Thr Val Ser Gly Asp 4880 4885 4890Thr Asp Ala Leu Asp Glu Leu Leu Thr His Cys Thr Glu Thr Gly 4895 4900 4905Val Arg Ala Arg Arg Ile Pro Val Asp Tyr Ala Ser His Cys Pro 4910 4915 4920His Thr Glu Thr Ile Glu His Asp Leu Leu His Met Leu His Gly 4925 4930 4935Ile Thr Pro Gln Pro Gly Ser Ile Pro Phe Tyr Ser Thr Val Glu 4940 4945 4950Asp Ala Trp Thr Asp Thr Thr Thr Leu Asp Ala Ala Tyr Trp Tyr 4955 4960 4965Arg Asn Leu Arg Arg Pro Val Arg Phe Thr His Ala Val Arg Thr 4970 4975 4980Leu Thr Ala Gln Gly His Arg Leu Phe Ile Glu Thr Ser Pro His 4985 4990 4995Pro Thr Leu Thr Pro Ala Ile Glu Asp His Asp His Thr Thr Ala 5000 5005 5010Leu Gly Thr Leu Arg Arg His Asp Asn Asp Thr His Arg Phe Leu 
5015 5020 5025Thr

Ala Leu Ala His Ala His Thr Thr Gly His Thr Val Thr Trp 5030 5035 5040Thr Thr His Tyr Pro Thr Thr Pro His Thr Pro Ala Ile Asp Leu 5045 5050 5055Pro Thr Tyr Pro Phe Gln His His His Tyr Trp Leu His Thr Pro 5060 5065 5070Thr Thr Ser Thr Gly Asp Val Ser Ala Ala Gly Leu His Pro Thr 5075 5080 5085Glu His Pro Leu Leu Gly Ala Thr Val Glu Leu Ala Asp Gly Asp 5090 5095 5100Gly Thr Leu Leu Thr Gly Arg Leu Ser Leu His Thr His Pro Trp 5105 5110 5115Leu Ala Asp His Ser Val Gly Gly Ile Val Leu Leu Pro Gly Thr 5120 5125 5130Ala Leu Leu Glu Leu Ala Leu Glu Ala Gly Thr Arg Thr Gly Cys 5135 5140 5145Pro His Val Gln Glu Leu Thr Leu His Thr Pro Leu Val Ile Pro 5150 5155 5160Glu Thr Gly His Val Val Phe Gln Leu Thr Val Ser Ala Pro Asp 5165 5170 5175Glu Thr Gly Gln Arg Pro Phe Thr Val His Phe Arg Ser Glu Ala 5180 5185 5190Val Thr Gly Ala Asp Asp Pro Ala Asp Arg Thr Trp Thr Arg Cys 5195 5200 5205Ala Thr Gly Ala Leu Ser Thr Ala Ala Ala Pro Asp His Ser Glu 5210 5215 5220Ala Ala Thr Trp Pro Pro Pro Ser Ala Gln Pro Leu Asp Leu Asp 5225 5230 5235Gly Leu Tyr Asp Arg Met Ala Glu Ala Gly Leu Val Tyr Gly Pro 5240 5245 5250Val Phe Gln Gly Leu Arg Glu Ala Trp Leu Asp Gly Glu Asp Ile 5255 5260 5265Val Ala Glu Val Arg Leu Pro Gln Glu Ala Ala Ala Asp Thr Gln 5270 5275 5280Gly Phe Gly Leu His Pro Ala Leu Leu Asp Ala Ala Leu His Val 5285 5290 5295Thr Ala Leu Thr Ser Gln Ala Gly Thr Ala Asp Glu Asp Ala Gln 5300 5305 5310Glu Arg Arg Arg Leu Pro Phe Ala Trp Ala Gly Val Ser Leu Phe 5315 5320 5325Ala Arg Glu Cys Ala Ala Leu Arg Val Arg Val Ala Pro Cys Ala 5330 5335 5340Pro His Pro Gly Asp Ala Val Ala Ile Thr Ala Thr Asp Glu Asp 5345 5350 5355Gly Arg Pro Val Leu Ala Val Glu Ser Leu Thr Leu Arg Pro Val 5360 5365 5370Ser Pro Asp Gln Leu Arg Ala Ala Ala Pro Ala Ala Gly Arg Asp 5375 5380 5385Ser Leu Phe Arg Leu Glu Trp Val Pro Val Thr Ala Ser Ala Ser 5390 5395 5400Ala Ser Ala Arg Pro Thr Gly Pro Trp Ala Ala Ile Gly Thr Gly 5405 5410 5415Pro Ala Val Ala Gly Leu Ala Gly His Ala Asp Leu Thr Val Tyr 5420 5425 5430Ala 
Glu Ala Gly Asp Leu Leu Arg Asp Leu Asp Gly Gly Ala Pro 5435 5440 5445Ala Pro Ala Val Val Val Leu Ser Val Thr Pro Asp Ala Asp Glu 5450 5455 5460Phe Ala Thr Pro Arg Ala Ala Thr Gly Arg Ala Leu Ser Val Leu 5465 5470 5475Gln Ala Trp Leu Ala Asp Glu Arg Leu Ala Asp Ser Arg Leu Val 5480 5485 5490Ala Val Thr Ser Gly Ala Val Val Ala Ala Pro Gly Asp Asp Thr 5495 5500 5505Val Asp Val Pro Gly Ala Ala Val Trp Gly Leu Val Arg Ser Gly 5510 5515 5520Gln Ser Glu His Pro Asp Arg Ile Thr Leu Leu Asp Cys Ala Ser 5525 5530 5535Gly Ala Arg Pro Gly Pro Asp Leu Val Ala Ala Ala Leu Ala Ser 5540 5545 5550Gly Glu Pro Gln Leu Ala Ala Arg Ala Gly Val Leu Tyr Thr Pro 5555 5560 5565Arg Leu Ala Arg Pro His Arg Asp Ala Ser Ala Val Pro Arg Ser 5570 5575 5580Leu Pro Ser His Gly Thr Val Leu Ile Thr Gly Gly Thr Gly Leu 5585 5590 5595Leu Gly Gly Leu Val Ala Arg Arg Leu Val Glu Ala His Gly Val 5600 5605 5610Arg Arg Leu Leu Leu Ala Gly Arg Arg Gly Pro Ala Ala Glu Gly 5615 5620 5625Leu Asp Ser Leu Thr Ser Glu Leu Arg Glu Arg Gly Ala Thr Val 5630 5635 5640Glu Val Ala Ala Cys Asp Ala Ala Asp Arg Thr Gln Leu Glu Ala 5645 5650 5655Leu Leu Ala Gly Val Pro Glu Glu His Pro Leu Ser Ala Val Val 5660 5665 5670His Ala Ala Gly Val Leu Asp Asp Gly Val Leu Thr Ser Leu Thr 5675 5680 5685Asn Glu Arg Leu Gly Ala Val Leu Arg Ala Lys Ala Asp Ser Ala 5690 5695 5700Leu Leu Leu His Glu Leu Thr Gln Asp Leu Asp Leu Ser Ala Phe 5705 5710 5715Val Leu Phe Ser Ser Ala Ala Gly Val Leu Gly Ser Pro Gly Gln 5720 5725 5730Gly Ser Tyr Ala Ala Ala Asn Ala Val Leu Asp Ala Leu Ala His 5735 5740 5745Gln Arg Ser Ala Ala Gly Leu Pro Ala Leu Ser Leu Ala Trp Gly 5750 5755 5760Leu Trp Ala Glu Gly Ser Gly Met Thr Gly His Leu Asp Ala Asp 5765 5770 5775Asp Arg Ser Arg Ile Asn Arg Ala Gly Met Ala Pro Leu Pro Thr 5780 5785 5790Pro Asp Ala Leu Asp Leu Phe Asp Ala Ala Leu Ser Ser Asp Glu 5795 5800 5805Pro Phe Leu Val Pro Ala Arg Phe Asp Leu Ser Ala Val Arg Thr 5810 5815 5820Arg Thr Ala Tyr Gly Pro Leu Pro Pro Leu Leu Arg Gly Leu Val 5825 5830 5835Arg 
Thr Ser Gly Ala His Arg Val Arg Gly Ala Val Gly Glu Ala 5840 5845 5850Arg Ala Ala Gly Val Asp Glu Ala Gly Arg Leu Arg Glu Arg Leu 5855 5860 5865Ala Arg Gln Ser Asp Ala Glu Arg Arg Asn Thr Leu Leu Arg Leu 5870 5875 5880Val Gln Ser Asn Val Ala Ala Val Leu Gly His Arg Gly Thr Gly 5885 5890 5895Thr Val Ala Glu Thr Arg Ala Phe Arg Glu Leu Gly Phe Asp Ser 5900 5905 5910Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Lys Val Ala Thr Gly 5915 5920 5925Leu Ala Leu Arg Ala Thr Val Ala Phe Asp Phe Pro Thr Pro Ala 5930 5935 5940Ala Leu Ala Glu His Leu Gly Ala Arg Leu Leu Pro Pro Asp Gly 5945 5950 5955Ala Val Ser Glu Ala Val Gly Glu Lys Glu Leu Arg Gly Leu Leu 5960 5965 5970Thr Ser Ile Pro Ile Gly Arg Leu Arg Glu Ala Gly Leu Ile Asp 5975 5980 5985Arg Leu Leu Ala Leu Ala Ala Ala Ala Pro Asp Ser Ala Asp Gln 5990 5995 6000Thr Ala Glu Gln Pro Ser Arg Ser Val Ser Val Glu Asp Ile Asp 6005 6010 6015Ala Met Asp Val Asp Ser Leu Ile Gly Leu Ala His Asp Thr Gly 6020 6025 6030Thr Asp Ser Gly His Ala Pro Cys Glu Gly 6035 6040

SEQ ID NO: 8 (length: 284; type: PRT; organism: bacteria)
Met Thr Lys Ala Pro His Gln Gly Ser Pro Thr Pro Ala Asp Val Gly1 5 10 15Asp Tyr Tyr Asp Arg Met Thr Ser Leu Leu Asn Arg Ala Leu Gly Gly 20 25 30Asn Thr His Leu Gly Tyr Trp Pro His Pro Asp Asp Gly Ser Thr Leu 35 40 45Gly Gln Ala Ser Asp Arg Leu Thr Asp His Met Ile Gly Lys Leu Arg 50 55 60Glu His Thr Gly Arg Pro Val Arg Arg Val Leu Asp Val Gly Cys Gly65 70 75 80Ser Gly Arg Pro Ala Leu Arg Leu Ala His Ser Glu Pro Val Asp Ile 85 90 95Val Gly Ile Thr Ile Ser Pro Arg Gln Val Glu Leu Ala Thr Ala Leu 100 105 110Ala Glu Arg Ser Gly Leu Ala Asn Arg Val Arg Phe Glu Cys Ala Asp 115 120 125Ala Met Asp Leu Pro Phe Pro Asp Ala Ser Phe Asp Ala Val Trp Ala 130 135 140Leu Glu Cys Leu Leu His Met Pro Asp Pro Ala Arg Val Phe Gln Glu145 150 155 160Met Ala Arg Val Leu Arg Pro Gly Gly Arg Leu Ala Ala Met Asp Val 165 170 175Thr Leu Arg Ala Ser Gln Pro Thr Gly Ala Asp Trp Ser Ser Ser Glu 180 185 190Leu Ala Val Pro Ser Leu Ile Pro Ile Thr Ala Tyr Ala Gly Met Ile 195 200 205Ser Asp Ala
Gly Leu Arg Leu Thr Glu Leu Thr Asp Ile Gly Glu His 210 215 220Val Ile Ala Pro Ser Tyr Ser Ala Met Gly Asp Asp Val Arg Ala Asn225 230 235 240Ala His Ala Tyr Ala Glu Ala Leu Glu Met Thr Ala Asp Asp Leu Glu 245 250 255Thr Phe Val Gly Lys Cys Ser Gln Trp Tyr Thr Glu Asp Ile Gly Tyr 260 265 270Val Val Leu Thr Ala Pro Cys Gln Arg Ala Glu Val 275 280

SEQ ID NO: 9 (length: 468; type: PRT; organism: bacteria)
Val Ser Ser Pro Pro Ser Thr Ile Pro Glu Ala Pro Gly Ala Trp Pro1 5 10 15Val Leu Gly His Leu Pro Ala Leu Leu Arg Asp Pro Leu Gly Phe Leu 20 25 30Ser Ala Val Thr Glu Arg Gly Asp Leu Phe Arg Ile Arg Leu Gly His 35 40 45Asn Thr Val Tyr Leu Ala Thr His Pro Glu Ile Val Arg Thr Met Leu 50 55 60Val Ser Gly Ala Ala Asp Phe Thr Arg Ser Lys Gly Ala Ala Gly Ala65 70 75 80Ser Arg Phe Ile Gly Pro Ile Leu Val Ala Val Ser Gly Asp Ser His 85 90 95Arg Arg Gln Arg Arg Met Met Gln Pro Gly Phe His Arg Gly Lys Leu 100 105 110Asp His Tyr Val Ile Ser Met Ser Ala Ala Ala Glu Glu Thr Ala Asp 115 120 125Ser Trp Arg Pro Gly Gln Val Val Asp Val Pro Lys Met Ala Ser Asp 130 135 140Leu Ser Leu Ala Met Ile Thr Lys Ala Leu Phe Gln Ser Asp Leu Gly145 150 155 160Ala Ala Ala Glu Ala Glu Leu Arg Thr Thr Gly His Asp Ile Leu Lys 165 170 175Val Ala Arg Leu Ser Ala Leu Ala Pro Gln Leu Tyr Thr Ser Leu Pro 180 185 190Thr Ala Ala Lys Arg His Met Gly Arg Thr Ser Ala Ala Ile Arg Glu 195 200 205Ala Val Thr Ala Tyr Arg Ala Asp Gly Arg Asp His Gly Asp Leu Leu 210 215 220Ser Thr Met Leu Arg Ala Arg Asp Ala Glu Gly Asn Thr Met Thr Asp225 230 235 240Asp Glu Val His Asn Glu Ile Met Gly Leu Ala Val Ala Gly Ile Gly 245 250 255Gly Pro Ala Ala Leu Thr Ala Trp Ile Phe His Glu Leu Ala His Asp 260 265 270His Leu Ile Glu Gln Arg Leu His Ala Glu Ile Asp Thr Val Leu Gly 275 280 285Gly Arg Leu Pro Thr Ser Ala Asp Leu Pro Arg Leu Pro Tyr Thr Gln 290 295 300Arg Leu Val Lys Glu Ala Leu Arg Lys Tyr Pro Gly Trp Val Gly Ser305 310 315 320Arg Arg Thr Val Arg Pro Val Arg Leu Gly Glu His Glu Leu Pro Ala 325 330 335Asp Val Glu Ile Met Tyr Ser Ser Tyr Ala Leu Gln Arg Asp Pro Arg 340
345 350Trp Tyr Arg Asp Pro Glu Lys Leu Asp Pro Asp Arg Trp Glu Ser Lys 355 360 365Glu Thr Thr Arg Asp Val Pro Lys Gly Ala Trp Val Pro Phe Ala Leu 370 375 380Gly Thr Tyr Lys Cys Ile Gly Asp Asn Phe Ala Leu Met Glu Thr Ala385 390 395 400Val Ala Val Ala Val Ile Ala Ser Arg Trp Arg Leu Arg Pro Leu Lys 405 410 415Gly Asp Arg Val Arg Pro Val Ala Lys Ala Thr His Val Phe Pro Asp 420 425 430Arg Leu Arg Met Ile Ala Glu Pro Arg Thr Pro Ala Ile Pro Arg Gly 435 440 445His Ala Pro Ala Asp Ala Ser Leu Glu Ala Ala Ala Arg Pro Lys Glu 450 455 460Leu Pro Glu Pro465

SEQ ID NO: 10 (length: 5674; type: PRT; organism: bacteria)
Met Ala Thr Pro Ser Glu Lys Leu Val Glu Ala Leu Arg Ala Ser Leu1 5 10 15Lys Ala Asn Glu Ala Leu Arg Arg Arg Asn Gln Gln Leu Thr Ala Ala 20 25 30Val Glu Ala Ala Gln Glu Pro Leu Ala Ile Val Gly Met Ala Cys Arg 35 40 45Phe Pro Gly Gly Val Arg Ser Pro Glu Glu Leu Trp Gly Leu Val Ala 50 55 60Ser Gly Gly Asp Ala Ile Gly Glu Phe Pro Ala Asp Arg Gly Trp Asp65 70 75 80Leu Ala Gly Leu Phe Asp Pro Asp Pro Glu Arg Ala Gly Ala Ser Tyr 85 90 95Thr Arg His Gly Gly Phe Leu Tyr Asp Ala Gly Gln Phe Asp Ala Glu 100 105 110Leu Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln 115 120 125Arg Leu Leu Leu Glu Thr Ser Trp Glu Val Phe Glu Arg Ala Gly Ile 130 135 140Asp Pro Ser Ser Val Arg Gly Ala Arg Ala Gly Val Phe Thr Gly Met145 150 155 160Met Tyr His Asp Tyr Ala Ser Arg Leu Ala Thr Ile Pro Glu Gly Phe 165 170 175Glu Gly Tyr Ile Gly Asn Gly Ser Gly Gly Ala Val Ala Ser Gly Arg 180 185 190Val Ala Tyr Thr Leu Gly Leu Glu Gly Pro Ala Val Thr Val Asp Thr 195 200 205Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Cys Gln Ser Leu 210 215 220Arg Thr Gly Glu Cys Asp Leu Ala Leu Ala Gly Gly Val Thr Val Met225 230 235 240Ser Thr Pro Leu Leu Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ser 245 250 255Val Asp Gly Arg Cys Lys Ser Phe Ala Ala Ala Ala Asp Gly Thr Gly 260 265 270Met Gly Glu Gly Val Gly Met Leu Leu Val Glu Arg Leu Ser Asp Ala 275 280 285Glu Arg Asn Gly His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val 290 295 300Asn Gln Asp
Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser305 310 315 320Gln Glu Arg Val Ile Arg Glu Ala Leu Ala Asn Ala Gly Leu Thr Val 325 330 335Ala Asp Val Asp Ala Val Glu Gly His Gly Thr Gly Thr Arg Leu Gly 340 345 350Asp Pro Ile Glu Ala Gln Ala Leu Leu Asp Thr Tyr Gly Gln Glu Arg 355 360 365Ser Gly Glu Gln Pro Leu Trp Leu Gly Ser Val Lys Ser Asn Ile Gly 370 375 380His Ala Gln Ala Ala Ala Gly Val Gly Gly Ile Ile Lys Met Val Met385 390 395 400Ala Leu Arg His Glu Ser Leu Pro Arg Thr Leu His Val Asp Glu Pro 405 410 415Ser Pro Gln Val Asp Trp Ser Ser Gly Ala Val Ser Leu Leu Ser Glu 420 425 430Ala Arg Pro Trp Pro Arg Arg Glu Asp Arg Pro Arg Arg Ala Gly Val 435 440 445Ser Ser Phe Gly Val Ser Gly Thr Asn Ala His Val Ile Leu Glu Glu 450 455 460Ala Pro Ala Arg Arg Pro Gly Glu Ala Ala Val Glu Asp Gly Ala Pro465 470 475 480Val Pro Trp Val Val Ser Ala Arg Ser Gly Ala Ala Leu Arg Ala Gln 485 490 495Ala Met Val Leu Arg Glu Phe Leu Arg Gly Pro Gly Thr Asp Ala Gly 500 505 510Val Arg Asp Ile Gly Ala Glu Leu Ala Arg Gly Arg Ala Val Leu Glu 515 520 525His Arg Ala Val Ile Val Ala Arg Glu Arg Ala Glu Phe Glu Gly Ala 530 535 540Leu Glu Ala Leu Ala Ser Gly Glu Pro His Pro Ala Leu Ile Glu Asp545 550 555 560Ala Thr Gly Ser His Ser His Ser Gly Gly Gly Val Val Phe Val Phe 565 570 575Pro Gly Gln Gly Gly Gln Trp Ala Gly Met Gly Leu Asp Leu Leu Thr 580 585 590Thr Ser Gly Val Phe Ala Asp His Ile Gly Ala Cys Glu Arg Ala Leu 595 600 605Ala Pro Trp Val Glu Trp Ser Leu Thr Glu Met Leu His Arg Glu Ala 610 615 620Glu Asp Pro Val Trp Glu Arg Ala Asp Val Val Gln Pro Val Leu Phe625 630 635 640Ser Val Met Val Ser Leu Ala Ala Leu Trp Arg Ser Tyr Gly Ile Glu 645 650 655Pro Asp Ala Val Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala His 660 665 670Val Cys Gly Ala Leu Thr Leu Glu Asp Ala Ala Lys Val Val Ala Leu 675 680 685Arg Ser Arg Ala

Leu Ala Ala Leu Arg Gly His Gly Gly Met Ala Ser 690 695 700Leu Ala Leu Thr Gly Thr Glu Ala Glu Asp Leu Ile Thr Thr His Trp705 710 715 720Pro Gly Arg Leu Trp Thr Ala Ala Phe Asn Gly Pro Arg Ala Thr Thr 725 730 735Val Ser Gly Asp Thr Asp Ala Leu Asp Glu Leu Leu Thr His Cys Thr 740 745 750Glu Thr Gly Val Arg Ala Arg Arg Ile Pro Val Asp Tyr Ala Ser His 755 760 765Cys Pro His Thr Glu Thr Ile Glu His Asp Leu Leu His Met Leu His 770 775 780Gly Ile Thr Pro Gln Pro Gly Ser Ile Pro Phe Tyr Ser Thr Val Glu785 790 795 800Asp Ala Trp Thr Asp Thr Thr Thr Leu Asp Ala Ala Tyr Trp Tyr Arg 805 810 815Asn Leu Arg Arg Pro Val Arg Phe Thr His Ala Val Arg Thr Leu Thr 820 825 830Ala Gln Gly His Arg Leu Phe Ile Glu Thr Ser Pro His Pro Thr Leu 835 840 845Thr Pro Ala Ile Glu Asp His Asp His Thr Thr Ala Leu Gly Thr Leu 850 855 860Arg Arg His Asp Asn Asp Thr His Arg Phe Leu Thr Ala Leu Ala His865 870 875 880Ala His Thr Thr Gly His Thr Val Thr Trp Thr Thr His Tyr Pro Thr 885 890 895Thr Pro His Thr Pro Ala Ile Asp Leu Pro Thr Tyr Pro Phe Gln His 900 905 910His His Tyr Trp Leu His Thr Pro Thr Thr Ser Thr Gly Asp Val Ser 915 920 925Ala Ala Gly Leu Gln Arg Pro Asp His Pro Leu Leu Gly Ala Val Met 930 935 940Glu Leu Ala Asp Gly Asp Gly Ile Val Leu Thr Gly Arg Leu Ser Leu945 950 955 960His Thr His Pro Trp Leu Ala Asp His Ser Val Gly Gly Val Val Leu 965 970 975Leu Pro Gly Thr Ala Leu Leu Glu Leu Ala Phe Gln Ala Gly Leu Arg 980 985 990Ala Gly Cys Pro Gly Val Asp Glu Leu Thr Leu His Ala Pro Leu Val 995 1000 1005Val Pro Glu Ser Gly His Val Val Val Gln Val Ser Val Ser Val 1010 1015 1020Pro Asp Glu Ala Gly Arg Arg Gly Val Ser Val Tyr Gly Arg Leu 1025 1030 1035Val Glu Asp Gly Gly Leu Glu Gly Glu Trp Thr Arg His Ala Glu 1040 1045 1050Gly Val Val Cys Pro Ser Val Pro Gly Glu Ser Val Val Val Glu 1055 1060 1065Pro Val Ala Asp Gly Val Trp Pro Pro Ser Gly Ala Gln Pro Val 1070 1075 1080Asp Leu Asp Glu Phe Tyr Gly Arg Leu Ala Gly Gly Gly Phe Val 1085 1090 1095Tyr Gly Pro Val Phe Gln Gly Leu Cys Ala Ala Trp Arg Asp 
Gly 1100 1105 1110Asp Asp Val Val Ala Glu Val Arg Leu Pro Asp Glu Gly Leu Ala 1115 1120 1125Asp Val Ala Gly Phe Gly Val His Pro Ala Leu Leu Asp Ala Ala 1130 1135 1140Val Gln Thr Val Thr Leu Leu Leu Pro Glu Asp Gln Glu Ala Gly 1145 1150 1155Leu Leu Pro Tyr Thr Trp Asn Gly Ala Ser Leu His Ala Arg Gly 1160 1165 1170Ala Arg Ala Leu Arg Val Arg Val Thr Ser Val Asp Ala Ala Gly 1175 1180 1185Thr Thr Val Ser Leu Arg Val Ala Asp Glu Thr Gly Ala Leu Val 1190 1195 1200Leu Ala Leu Glu Ser Leu Val Leu Arg Pro Val Pro Leu Glu Gly 1205 1210 1215Leu Gly Ala Gly Val Arg Arg Gly Ser Leu Phe Glu Leu Gly Trp 1220 1225 1230Val Pro Val Glu Gly Val Pro Ala Ser Leu Ala Gly Gly Gly Gly 1235 1240 1245Glu Leu Val Val Trp Glu Cys Pro Gly Gly Gly Val Ala Glu Val 1250 1255 1260Thr Ala Ala Ala Leu Gly Val Val Arg Glu Trp Leu Ala Asp Glu 1265 1270 1275Arg Glu Gly Asp Ala Arg Leu Val Val Val Thr Arg Gly Ala Val 1280 1285 1290Ala Val Asp Ala Gly Glu Pro Val Arg Asp Val Ala Gly Ala Ala 1295 1300 1305Val Trp Gly Leu Val Arg Ser Ala Gln Ser Glu His Pro Asp Arg 1310 1315 1320Phe Val Leu Leu Asp Leu Asp Pro Gly Thr Gly Val Glu Thr Val 1325 1330 1335Val Asp Ala Asp Glu Asp Met Gly Ala Gly Val Gly Ala Gly Val 1340 1345 1350Asp Val Ala Gly Phe Val Ala Cys Gly Glu Ala Gln Val Ala Val 1355 1360 1365Arg Gly Gly Val Val Arg Val Pro Arg Leu Glu Arg Leu Glu Arg 1370 1375 1380Trp Gly Arg Leu Gly Gly Ala Gly Glu Gly Leu Ser Leu Pro Gly 1385 1390 1395Gly Val Gly Trp Arg Leu Asp Gly Gly Gly Ser Gly Leu Leu Glu 1400 1405 1410Gly Val Gly Val Val Ala Ser Asp Ala Ala Gly Val Val Leu Gly 1415 1420 1425Arg Gly Gln Val Arg Val Ala Val Arg Ala Ala Gly Val Asn Phe 1430 1435 1440Arg Asp Val Leu Val Ala Leu Gly Met Val Pro Gly Gln Val Gly 1445 1450 1455Val Gly Ser Glu Gly Ala Gly Val Val Val Glu Val Gly Pro Gly 1460 1465 1470Val Glu Gly Leu Val Val Gly Asp Arg Val Phe Gly Val Phe Gly 1475 1480 1485Asp Ala Phe Ala Pro Val Val Val Ala Gln Glu Val Leu Leu Ala 1490 1495 1500Arg Ile Pro Glu Gly Trp Ser Phe Ala Gln Ala Ala Ser Val 
Pro 1505 1510 1515Val Val Phe Ala Thr Ala Tyr Leu Gly Leu Val Asp Leu Ala Gly 1520 1525 1530Val Arg Arg Gly Glu Ser Val Leu Val His Ala Ala Ala Gly Gly 1535 1540 1545Val Gly Thr Ala Ala Val Gln Leu Ala Arg His Leu Gly Ala Glu 1550 1555 1560Val Tyr Ala Thr Ala Ser Glu Ala Lys Trp Ala Arg Leu Arg Ala 1565 1570 1575Ala Gly Val Ala Pro Gln Arg Ile Ala Ser Ser Arg Ser Val Glu 1580 1585 1590Phe Glu Ser Arg Phe Arg Arg Ala Ser Gly Gly Arg Gly Val Asp 1595 1600 1605Val Val Leu Asn Cys Leu Ala Gly Glu Tyr Thr Asp Ala Ser Leu 1610 1615 1620Arg Leu Cys Ser Pro Gln Gly Gly Arg Phe Leu Glu Leu Gly Lys 1625 1630 1635Thr Asp Ile Arg Asp Ala Gly Glu Val Ala Ala Arg Phe Pro Gly 1640 1645 1650Val Ser Tyr Arg Ala Tyr Asp Leu Met Asp Ala Gly Ala Gln Arg 1655 1660 1665Val Gly Glu Ile Leu His Thr Val Val Asp Leu Phe Arg Arg Gly 1670 1675 1680Val Leu Glu Pro Leu Pro Val Thr Ala Trp Asp Val Arg Gln Ala 1685 1690 1695Arg Gln Ala Leu Arg Ser Met Arg Ser Gly Leu His Val Gly Lys 1700 1705 1710Asn Val Leu Thr Leu Pro Val Pro Leu Asp Ala Glu Gly Thr Val 1715 1720 1725Leu Val Thr Gly Gly Thr Gly Thr Leu Gly Ala Ala Val Ala Arg 1730 1735 1740His Leu Ala Ala Gly His Gly Val Arg His Leu Leu Leu Val Ser 1745 1750 1755Arg Arg Gly Met Ala Ala Ala Gly Ala Glu Glu Leu Cys Ala Glu 1760 1765 1770Leu Gly Gln Ala Gly Val Ser Val Ser Val Ala Ala Cys Asp Val 1775 1780 1785Ala Asp Arg Ala Gln Val Ala Ala Leu Leu Glu Gln Val Pro Ala 1790 1795 1800Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu Asp 1805 1810 1815Asp Ala Thr Val Thr Cys Leu Asp Arg Glu Lys Ile Asp Ala Val 1820 1825 1830Val Gly Ala Lys Val Asp Gly Ala Leu His Leu His Glu Leu Thr 1835 1840 1845Ala Gly Met Asp Leu Ser Ala Phe Val Leu Phe Ser Ser Ala Ala 1850 1855 1860Gly Val Leu Gly Ser Pro Gly Gln Gly Asn Tyr Ala Ala Ala Asn 1865 1870 1875Ala Ala Leu Asp Ala Leu Ala His Gln Arg Arg Ala Ala Gly Leu 1880 1885 1890Pro Ala Leu Ser Leu Ala Trp Gly Leu Trp Glu Glu Ala Ser Gly 1895 1900 1905Met Thr Gly His Leu Asp Ala Gly Asp Arg His Arg Ile Thr 
Arg 1910 1915 1920Ser Gly Leu His Pro Leu Thr Thr Pro Asp Ala Leu Ala Leu Leu 1925 1930 1935Asp Thr Ala Leu Ala Thr Gly Arg Pro Ala Leu Leu Pro Ala Asp 1940 1945 1950Leu Arg Pro Thr His Pro Ala Pro Pro Leu Leu Glu His Leu Ala 1955 1960 1965Pro Ala Arg Thr Ser Pro Arg Thr Ala His Thr Gly Thr Ser Ala 1970 1975 1980Gly Ala Gly Gln Asp Val Ser Leu Ala Asp Arg Leu Ala Thr Leu 1985 1990 1995Thr Ser Glu Gln Arg His Ala Thr Leu Leu Ala Leu Ala Arg Thr 2000 2005 2010His Ile Ala Ala Val Leu Gly His Pro Thr Pro Asp Thr Ile Asp 2015 2020 2025Pro Glu Arg Thr Phe Arg Asp Leu Gly Phe Asp Ser Leu Thr Ala 2030 2035 2040Val Glu Leu Arg Asn Arg Leu Thr Arg Ala Thr Gly Leu Arg Leu 2045 2050 2055Pro Thr Thr Leu Ala Phe Asp His Pro Thr Pro Thr Ala Leu Thr 2060 2065 2070His His Leu Thr Thr Leu Leu Asn Pro Asn Asp Thr Lys Thr Pro 2075 2080 2085Ser Ala Pro Ala Ala Ala Glu Pro Lys Ala Gly Gln His Glu Pro 2090 2095 2100Ile Ala Ile Ile Gly Val Gly Cys Arg Tyr Pro Gly Gly Val Ala 2105 2110 2115Ser Ala Glu Asp Leu Trp Gln Leu Val Ala Ser Gly Gly Asp Ala 2120 2125 2130Val Gly Glu Phe Pro Ala Asp Arg Gly Trp Asp Val Glu Ala Leu 2135 2140 2145Tyr Asp Pro Glu Pro Gly Gln Arg Gly Thr Ser Tyr Thr Arg His 2150 2155 2160Gly Gly Phe Leu Tyr Asp Ala Gly Glu Phe Asp Ala Gly Phe Phe 2165 2170 2175Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg 2180 2185 2190Leu Leu Leu Glu Thr Thr Trp Glu Ala Phe Glu Arg Ala Gly Ile 2195 2200 2205Asp Pro Gly Ala Val Arg Gly Ser Gln Thr Gly Val Phe Ala Gly 2210 2215 2220Val Met Pro Gln Glu Tyr Ala Ser Arg Ser Arg His His Val Ala 2225 2230 2235Ala Asp Val Asp Gly Tyr Val Leu Thr Gly Thr Ser Gly Ser Val 2240 2245 2250Ala Ser Gly Arg Val Ala Tyr Thr Phe Gly Leu Glu Gly Pro Ala 2255 2260 2265Val Ser Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His 2270 2275 2280Leu Ala Cys Gln Ala Leu Arg Ser Gly Glu Cys Thr Met Ala Leu 2285 2290 2295Ala Gly Gly Ala Thr Val Met Ser Thr Pro Thr Ala Phe Leu Glu 2300 2305 2310Phe Ser Arg Gln Arg Gly Leu Ala Ala Asp Gly Arg Cys Lys 
Ala 2315 2320 2325Phe Ser Ala Ser Ala Asp Gly Thr Gly Trp Ser Glu Gly Ala Gly 2330 2335 2340Met Leu Leu Leu Glu Arg Leu Ser Asp Ala Glu Arg Asn Gly His 2345 2350 2355Arg Val Leu Ala Val Val Arg Gly Ser Ala Val Asn Gln Asp Gly 2360 2365 2370Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg 2375 2380 2385Val Ile Arg Gln Ala Leu Ala Asn Ala Asn Leu Ser Ala Val Asp 2390 2395 2400Val Asp Ala Val Glu Ala His Gly Thr Gly Thr Lys Leu Gly Asp 2405 2410 2415Pro Ile Glu Ala Gln Ala Leu Leu Ala Thr Tyr Gly Gln Glu His 2420 2425 2430His Pro Asp Gln Pro Leu Trp Leu Gly Ser Leu Lys Ser Asn Ile 2435 2440 2445Gly His Ala Gln Ala Ala Ala Gly Val Gly Gly Ile Ile Lys Met 2450 2455 2460Val Met Ala Leu Arg His Glu Ser Leu Pro Arg Thr Leu His Val 2465 2470 2475Asp Glu Pro Ser Pro Gln Val Asp Trp Ser Ser Gly Ala Val Ser 2480 2485 2490Leu Leu Thr Glu Ala Arg Pro Trp Pro Arg Arg Glu Asp Arg Pro 2495 2500 2505Arg Arg Ala Gly Ile Ser Ser Phe Gly Val Ser Gly Thr Asn Ala 2510 2515 2520His Val Ile Leu Glu Glu Ala Pro Ala Arg Ala Glu Val Glu Ala 2525 2530 2535Val Glu Ala Ala Pro Ala Gly Val Glu Thr Ala Ala Ala Ala Ala 2540 2545 2550Val Val Val Glu Thr Asp Gly Ala Gly Arg Val Ser Ala Asp Val 2555 2560 2565Pro Leu Val Trp Val Val Ser Gly Lys Ser Gln Ala Ala Leu Arg 2570 2575 2580Ala Gln Ala Ala Ala Leu His Ala His Val Leu Asp His Pro Glu 2585 2590 2595Gln Asp Ala Ala Asp Ile Gly Tyr Ser Leu Ala Thr Thr Arg Ala 2600 2605 2610Leu Phe Asp His Arg Ala Thr Leu Ile Ala Pro Asp Arg Asp Thr 2615 2620 2625Leu Leu Asp Ala Leu Thr Ala Leu Ala Asp Gly Arg Thr His Pro 2630 2635 2640His Leu Ile Pro Thr Pro Pro Thr Glu Pro Gly His Thr His Lys 2645 2650 2655Ile Ala Phe Leu Cys Ser Gly Gln Gly Thr Gln Arg Pro Gly Met 2660 2665 2670Ala Thr Gly Leu Tyr His Thr Tyr Pro Ala Phe Ala Asp Ala Leu 2675 2680 2685Asp Glu Thr Cys Ala His Phe Asp Pro His Leu Asp His Pro Leu 2690 2695 2700Arg Asp Leu Leu Leu Asn His Asp Pro Thr Asp Leu Leu Thr His 2705 2710 2715Thr Leu Tyr Ala Gln Pro Ala Leu Phe Thr Leu Gln Lys Ala 
Leu 2720 2725 2730His His Leu Ile Thr Glu Thr Tyr Gly Ile Thr Pro His Tyr Leu 2735 2740 2745Ala Gly His Ser Leu Gly Glu Ile Thr Ala Ala His Leu Ala Gly 2750 2755 2760Ile Leu Thr Leu Pro Asp Ala Thr His Leu Ile Thr Thr Arg Ala 2765 2770 2775Arg Leu Met Gln Thr Met Pro Pro Gly Thr Met Thr Thr Leu His 2780 2785 2790Thr Thr Pro Glu His Ile Gln Pro Leu Leu Asp Gln His Pro Gly 2795 2800 2805Lys Ala Thr Ile Ala Ala Val Asn Ser Pro His Ser Leu Val Ile 2810 2815 2820Ser Gly Asp Pro Asp Thr Ile His His Ile Thr Thr Thr Cys His 2825 2830 2835Thr Gln Gly Ile Thr Thr Lys Pro Leu Thr Thr Asn His Ala Phe 2840 2845 2850His Ser Pro His Thr Asp Thr Ile Leu Glu Gln Leu Asp Thr Thr 2855 2860 2865Thr His Thr Leu Thr Tyr His Pro Pro His Thr Pro Leu Ile Thr 2870 2875 2880Ser Thr Pro Gly Asp Pro Leu Thr Pro His Tyr Trp Thr His Gln 2885 2890 2895Thr Arg Gln Pro Val His Trp Thr Asp Thr Ile His Thr Leu His 2900 2905 2910Thr Asn Gly Val Thr Thr Tyr Ile Glu Leu Gly Pro Asp His Thr 2915 2920 2925Leu Thr Thr Leu Thr His His Asn Leu Pro His His Gln Pro Thr 2930 2935 2940Ala Ile Thr Leu Thr His Pro His His Asn Pro Thr His His Leu 2945 2950 2955Leu Thr Ala Leu Ala His Thr Pro Thr Thr Trp His Thr His His 2960 2965 2970His Thr His Thr Asn Pro His Pro His Thr Ile Pro Asp Leu Pro 2975 2980 2985Thr Tyr Pro Phe Gln Arg Arg His Tyr Trp Leu Gln Ala Thr Pro 2990 2995 3000Gly Ala Gly Ala Gly Asp Val Ser Ala Ala Gly Leu Gln Arg Pro 3005 3010 3015Asp His Pro Leu Leu Gly Ala Val Met Glu Leu Ala Asp Gly Asp 3020 3025 3030Gly Ile Val Leu Thr Gly Ser Leu Ser Leu Arg Thr His Thr Trp 3035 3040 3045Leu Ala Asp His Ser Val Gly Gly Ile Val Leu Leu Pro Gly Thr 3050 3055 3060Ala Leu Leu Asp Leu Ala Phe Gln Ala Gly Leu Arg Thr Gly Cys 3065 3070 3075Pro Arg Val Asp Glu Leu Thr Leu His Ala Pro Leu Val Ile Pro 3080 3085 3090Glu Ser Gly His Val Val Val Gln Val Ser Val Ser Val Pro Asp 3095 3100 3105Glu Ala Gly Arg Arg Ala Val Asn Val Tyr Ala Arg Pro Ala Gly 3110 3115 3120Asp Glu Glu Thr Asp Gly Glu Trp Thr Arg His Ala Glu Gly

Val 3125 3130 3135Leu Ser Pro Ser Thr Glu Asp Asp Pro Asn Ala Glu Ala Ala Ala 3140 3145 3150Ala Gly Glu Trp Pro Pro Pro Gly Ala Arg Pro Val Val Leu Asp 3155 3160 3165Gly Leu Tyr Asp Arg Leu Ala Gly Gly Gly Phe Val Tyr Gly Pro 3170 3175 3180Val Phe Gln Gly Leu Cys Ala Ala Trp Arg Asp Gly Asp Asp Val 3185 3190 3195Val Ala Glu Val Arg Leu Pro Asp Glu Gly Leu Ala Asp Val Ala 3200 3205 3210Gly Phe Gly Val His Pro Ala Leu Leu Asp Ala Ala Val Gln Ser 3215 3220 3225Val Thr Leu Leu Leu Ala Asp Gln Gln Gln Ala Gly Leu Val Pro 3230 3235 3240His Thr Trp Asn Gly Val Ser Leu His Ala Arg Gly Ala Thr Val 3245 3250 3255Leu Arg Leu Arg Met Thr Pro Thr Asp Ala Thr Ser Thr Ala Val 3260 3265 3270Arg Leu His Ala Thr Asp Glu Thr Gly Ala Pro Val Leu Thr Leu 3275 3280 3285Glu Ser Leu Leu Met Arg Pro Val Pro Leu Glu Gly Leu Gly Ala 3290 3295 3300Arg Val Arg Arg Gly Ser Leu Phe Glu Leu Gly Trp Val Pro Val 3305 3310 3315Glu Gly Val Pro Ala Ser Val Ala Gly Gly Gly Gly Glu Leu Val 3320 3325 3330Ala Trp Glu Cys Pro Gly Gly Gly Val Ala Glu Val Thr Ala Ala 3335 3340 3345Ala Leu Gly Val Val Arg Glu Trp Leu Ala Asp Glu Arg Glu Gly 3350 3355 3360Asp Ala Arg Leu Val Val Val Thr Arg Gly Ala Val Ala Val Asp 3365 3370 3375Ala Gly Glu Pro Val Arg Asp Val Ala Gly Ala Ala Val Trp Gly 3380 3385 3390Leu Val Arg Ser Ala Gln Ser Glu His Pro Asp Arg Phe Val Leu 3395 3400 3405Leu Asp Leu Asp Pro Asp Thr Lys Thr Asp Pro Asp Thr Asp Thr 3410 3415 3420Asp Thr Asp Thr Asp Gly Asp Thr Asp Val Ser Ala Asp Ala Lys 3425 3430 3435Val Gly Thr Gly Ala Gly Leu Asp Asp Ala Ala Val Ala Ser Ala 3440 3445 3450Leu Ala Arg Gly Glu Ser Gln Leu Ala Val Arg Asp Gly Val Val 3455 3460 3465Arg Val Pro Arg Leu Lys Arg Val Pro Pro Leu Ser Glu Ser Ser 3470 3475 3480Asp Ala Val Arg Phe Asp Ala Glu Gly Thr Val Leu Val Thr Gly 3485 3490 3495Gly Thr Gly Thr Leu Gly Ala Val Val Ala Arg His Leu Ala Ala 3500 3505 3510Gly His Gly Val Arg His Leu Leu Leu Val Ser Arg Arg Gly Met 3515 3520 3525Ala Ala Thr Gly Ala Glu Glu Leu Cys Ala Glu Leu Gly Gly 
Ala 3530 3535 3540Gly Val Ser Val Ser Val Ala Ala Cys Asp Val Ala Asp Arg Ala 3545 3550 3555Gln Val Ala Ala Leu Leu Glu Gln Val Pro Ala Glu His Pro Leu 3560 3565 3570Thr Ala Val Val His Thr Ala Gly Val Leu Asp Asp Ala Thr Val 3575 3580 3585Thr Cys Leu Asp Arg Glu Lys Ile Asp Ala Val Val Gly Ala Lys 3590 3595 3600Val Asp Gly Ala Leu His Leu His Glu Leu Thr Ala Gly Met Asp 3605 3610 3615Leu Ser Ala Phe Val Leu Phe Ser Ser Ala Ala Gly Val Leu Gly 3620 3625 3630Ser Pro Gly Gln Gly Asn Tyr Ala Ala Ala Asn Ala Ala Leu Asp 3635 3640 3645Ala Leu Ala His Gln Arg Arg Ala Ala Gly Leu Pro Ala Leu Ser 3650 3655 3660Leu Ala Trp Gly Leu Trp Glu Glu Thr Ser Gly Met Thr Gly His 3665 3670 3675Leu Asp Ala Gly Asp Arg His Arg Ile Thr Arg Ser Gly Leu His 3680 3685 3690Pro Leu Thr Thr Pro Asp Ala Leu Ala Leu Leu Asp Thr Ala Leu 3695 3700 3705Ala Ala Gly Arg Pro Ala Leu Leu Pro Ala Asp Leu Arg Pro Thr 3710 3715 3720His Pro Ala Pro Pro Leu Leu Glu His Leu Ala Pro Ala Arg Thr 3725 3730 3735Ser His Arg Thr Thr Leu Pro Thr Thr Asp Ser Gly Ala Ser Leu 3740 3745 3750Arg Ala Arg Leu Ala Gly Arg Thr Pro Glu Gln Gln Tyr Gln Ala 3755 3760 3765Leu Leu Gly Leu Val Arg Ser His Val Ala Thr Val Leu Gly His 3770 3775 3780Gln Ala Pro Glu Ala Ile Pro Val Asp Ser Ala Phe Arg Asp Leu 3785 3790 3795Gly Phe Asp Ser Leu Thr Ala Val Asp Leu Arg Asn Arg Leu Ser 3800 3805 3810Ala Glu Thr Gly Leu Arg Leu Pro Ala Ser Leu Val Phe Asp Gln 3815 3820 3825Pro Ser Pro Ala Ala Val Ala Arg Leu Leu Arg Thr Glu Leu Leu 3830 3835 3840Gly Asp Asp Ala Ala Asp Ser Thr Ser Pro Tyr Ala Glu Thr Thr 3845 3850 3855Ala Val Gly Ser Asp Glu Pro Leu Ala Ile Val Gly Met Ala Cys 3860 3865 3870Arg Phe Pro Gly Gly Val Arg Ser Pro Glu Glu Leu Trp Gly Leu 3875 3880 3885Val Ala Ser Gly Gly Asp Ala Ile Gly Glu Phe Pro Ala Asp Arg 3890 3895 3900Gly Trp Asp Leu Ala Gly Leu Phe Asp Pro Asp Pro Glu Arg Ala 3905 3910 3915Gly Ala Ser Tyr Thr Arg His Gly Gly Phe Leu Tyr Asp Ala Gly 3920 3925 3930Gln Phe Asp Ala Glu Phe Phe Gly Ile Ser Pro Arg Glu Ala 
Leu 3935 3940 3945Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Thr Val Trp Glu 3950 3955 3960Thr Leu Glu His Ala Gly Ile Asp Pro Ala Ala Val Arg Gly Ser 3965 3970 3975Arg Thr Gly Val Phe Ala Gly Val Met Tyr His Asp Tyr Ala Ala 3980 3985 3990Arg Leu Thr Ala Val Pro Glu Gly Ala Glu Gly Tyr Ile Gly Asn 3995 4000 4005Gly Asn Ala Gly Ser Val Val Ser Gly Arg Val Ala Tyr Thr Phe 4010 4015 4020Gly Phe Glu Gly Pro Ala Val Ser Val Asp Thr Ala Cys Ser Ser 4025 4030 4035Ser Leu Val Ala Leu His Leu Ala Gly Gln Ala Leu Arg Ser Gly 4040 4045 4050Glu Cys Ser Met Ala Leu Ala Gly Gly Val Thr Val Met Ser Ser 4055 4060 4065Pro Gly Thr Phe Ile Asp Phe Ser Arg Gln Arg Gly Leu Ser Val 4070 4075 4080Asp Gly Arg Cys Lys Ser Phe Ala Ala Ala Ala Asp Gly Thr Gly 4085 4090 4095Trp Gly Glu Gly Val Gly Met Leu Leu Val Glu Arg Leu Ser Asp 4100 4105 4110Ala Glu Arg Asn Gly His Arg Val Leu Ala Val Val Arg Gly Ser 4115 4120 4125Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn 4130 4135 4140Gly Pro Ser Gln Gln Arg Val Ile Arg Gln Ala Leu Ala Asn Ser 4145 4150 4155Gly Leu Thr Gly Ala Asp Val Asp Ala Val Glu Ala His Gly Thr 4160 4165 4170Gly Thr Lys Leu Gly Asp Pro Ile Glu Ala Gln Ala Leu Leu Ala 4175 4180 4185Thr Tyr Gly Gln Glu His His Pro Asp Gln Pro Leu Trp Leu Gly 4190 4195 4200Ser Leu Lys Ser Asn Ile Gly His Ala Gln Ala Ala Ala Gly Val 4205 4210 4215Gly Gly Ile Ile Lys Met Val Met Ala Leu Arg His Glu Thr Leu 4220 4225 4230Pro Arg Thr Leu His Ile Asp Glu Pro Thr Pro Gln Val Asp Trp 4235 4240 4245Ser Ser Gly Ala Val Ser Leu Leu Thr Glu Pro Arg Pro Trp Pro 4250 4255 4260Arg Gln Gly Asp Arg Pro Arg Arg Ala Gly Ile Ser Ser Phe Gly 4265 4270 4275Val Ser Gly Thr Asn Ala His Val Ile Leu Glu Glu Ala Pro Ala 4280 4285 4290Gln Pro Ala Gly Asp Pro Ala Pro Glu Asp Gly Ala Pro Val Pro 4295 4300 4305Trp Ala Met Ser Ala Arg Ser Asn Ala Ala Leu Arg Ala Gln Ala 4310 4315 4320Ala Leu Leu Arg Asp Phe Leu Gln Gly Pro Gly Thr Asp Thr Ala 4325 4330 4335Leu Arg Ala Val Gly Ala Glu Leu Ala His Gly Arg Ala Val 
Leu 4340 4345 4350Glu His Arg Ala Val Ile Val Ala Arg Glu Arg Thr Glu Phe Glu 4355 4360 4365Asp Ala Leu Glu Ala Leu Ala Ser Gly Glu Pro His Pro Ala Leu 4370 4375 4380Ile Glu Asp Thr Thr Gly Ser Gln Thr Asn Ser His Ser Gly Gly 4385 4390 4395Gly Val Val Phe Val Phe Pro Gly Gln Gly Gly Gln Trp Ala Gly 4400 4405 4410Met Gly Leu Asp Leu Leu Arg Asp Ser Gln Val Phe Ala Asp His 4415 4420 4425Val Gly Ala Cys Glu Arg Ala Leu Ala Pro Trp Val Glu Trp Ser 4430 4435 4440Leu Thr Glu Met Leu His Arg Asp Ala Glu Asp Pro Val Trp Glu 4445 4450 4455Arg Ala Asp Val Val Gln Pro Val Leu Phe Ser Val Met Val Ser 4460 4465 4470Leu Ala Ala Leu Trp Arg Ser Tyr Gly Ile Glu Pro Asp Ala Val 4475 4480 4485Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala His Val Cys Gly 4490 4495 4500Ala Leu Thr Leu Glu Asp Ala Ala Lys Ile Val Ala Leu Arg Ser 4505 4510 4515Arg Ala Leu Ala Ala Leu Arg Gly His Gly Gly Met Ala Ser Leu 4520 4525 4530Ala Leu Thr Gly Thr Glu Ala Glu Asp Leu Ile Thr Thr His Trp 4535 4540 4545Pro Gly Arg Leu Trp Arg Ala Ala Phe Asn Gly Pro Arg Ala Thr 4550 4555 4560Thr Val Ser Gly Asp Thr Asp Ala Leu Asp Glu Leu Leu Thr His 4565 4570 4575Cys Thr Glu Thr Gly Val Arg Ala Arg Arg Ile Pro Val Asp Tyr 4580 4585 4590Ala Ser His Cys Pro His Thr Glu Thr Ile Glu His Asp Leu Leu 4595 4600 4605His Met Leu His Gly Ile Thr Pro Gln Pro Gly Ser Ile Pro Phe 4610 4615 4620Tyr Ser Thr Val Glu Asp Ala Trp Thr Asp Thr Thr Thr Leu Asp 4625 4630 4635Ala Ala Tyr Trp Tyr Arg Asn Leu Arg Arg Pro Val Arg Phe Thr 4640 4645 4650His Ala Val Arg Thr Leu Thr Ala Gln Gly His Arg Leu Phe Ile 4655 4660 4665Glu Thr Ser Pro His Pro Thr Leu Thr Pro Ala Ile Glu Asp His 4670 4675 4680Asp His Thr Thr Ala Leu Gly Thr Leu Arg Arg His Asp Asn Asp 4685 4690 4695Thr His Arg Phe Leu Thr Ala Leu Ala His Ala His Thr Thr Gly 4700 4705 4710His Thr Val Thr Trp Thr Thr His Tyr Pro Thr Thr Pro His Thr 4715 4720 4725Pro Ala Ile Asp Leu Pro Thr Tyr Pro Phe Gln His His His Tyr 4730 4735 4740Trp Leu His Thr Pro Thr Thr Ser Thr Gly Asp Val Ser Ala 
Ala 4745 4750 4755Gly Leu His Pro Thr Glu His Pro Leu Leu Gly Ala Thr Val Glu 4760 4765 4770Leu Ala Asp Gly Asp Gly Thr Leu Leu Thr Gly Arg Leu Ser Leu 4775 4780 4785His Thr His Pro Trp Leu Ala Asp His Ser Val Gly Gly Ile Val 4790 4795 4800Leu Leu Pro Gly Thr Ala Leu Leu Glu Leu Ala Leu Gln Ala Gly 4805 4810 4815Gly Ala Ala His Val Arg Glu Leu Thr Leu His Ala Pro Leu Ala 4820 4825 4830Val Pro His Asp Ala Ala Val Asp Leu Gln Val Arg Val Ser Ala 4835 4840 4845Pro Asp Asp Thr Gly Ala Arg Thr Leu Thr Val Ser Ser Arg Ser 4850 4855 4860Glu His Ala Arg Pro Glu Asp Pro Trp Gln His His Ala Thr Gly 4865 4870 4875Leu Leu Asp Ala Gln Pro Ser Ala Asp Gly Asp Ala Leu Arg Ser 4880 4885 4890Trp Pro Pro Glu Gly Ala Leu Pro Cys Ala Ala Asp Glu Leu Glu 4895 4900 4905Ser Phe Tyr Ala Ala Gln Glu Ala Arg Gly Phe Ala Tyr Gly Pro 4910 4915 4920Ala Phe Arg Gly Leu Arg Ala Ala Trp Arg Arg Gly Glu Glu Val 4925 4930 4935Phe Ala Glu Val Arg Leu Pro Glu Ser Val Leu Asp Glu Ala Ser 4940 4945 4950Arg Tyr Asn Leu His Pro Ala Leu Leu Asp Ala Ala Leu His Ala 4955 4960 4965Val Ala Leu Gly Ala Ala Thr Gly Leu Pro Pro Gly Ala Val Pro 4970 4975 4980Phe Ser Phe Ser Gly Val Thr Leu His Ala Val Lys Ala Ala Ala 4985 4990 4995Val Arg Val Arg Val Ala Pro Ala Gly Arg Asp Gly Glu Arg Thr 5000 5005 5010Ala Val Ser Val Ser Leu Ala Asp Glu Thr Gly Arg Gly Val Leu 5015 5020 5025Ser Val Asp Ser Leu Ala Val Arg Pro Leu Asp Thr Gly Glu Leu 5030 5035 5040Arg Ala Ser Ala Gln Ala Ala Gly Arg Ala Ala Leu Phe Asp Val 5045 5050 5055Ala Trp Lys Asp Val Thr Pro Gly Thr Pro Pro Pro Asp Thr Ala 5060 5065 5070Val Arg Ser Thr Val Leu Thr His Asp Arg Ala Ala Ala Asp Leu 5075 5080 5085Ser Gly Leu Leu Ser Gly Leu Asp Thr Asp Asp Ala Pro Val Pro 5090 5095 5100Asp Ala Val Leu Leu Thr Cys Ser Gln Gly Ala Val Ala Asp Val 5105 5110 5115Leu Gly Glu Val Leu Ser Val Val Gln Asp Trp Leu Ala Asp Asp 5120 5125 5130Arg Leu Ala Glu Ala Arg Leu Val Val Val Thr His Gly Ala Val 5135 5140 5145Ala Thr Arg Thr Gly Glu Glu Val Thr Asp Val Ala Gly Ala 
Ala 5150 5155 5160Val Trp Gly Leu Leu Arg Ser Ala Gln Ser Glu His Pro Gly Arg 5165 5170 5175Phe Val Leu Leu Asp Ala Asp Leu Ser Asp Asp Thr Thr Val Thr 5180 5185 5190Ala Ala Leu Ala Cys Asp Glu Pro Gln Leu Ala Val Arg Gly Gly 5195 5200 5205Arg Leu Leu Ala Ala Arg Leu Ala His Val Pro Val Pro Ala Asp 5210 5215 5220Ser Ser Asp Ala Val Arg Phe Asp Ala Glu Gly Thr Val Leu Val 5225 5230 5235Thr Gly Gly Thr Gly Thr Leu Gly Ala Ala Val Ala Arg His Leu 5240 5245 5250Ala Ala Gly His Gly Val Arg His Leu Leu Leu Val Ser Arg Arg 5255 5260 5265Gly Met Ala Ala Thr Gly Ala Glu Glu Leu Cys Ala Glu Leu Gly 5270 5275 5280Gln Ala Gly Val Ser Val Ser Val Ala Ala Cys Asp Val Ala Asp 5285 5290 5295Arg Ala Gln Val Ala Ala Leu Leu Glu Gln Val Pro Ala Glu His 5300 5305 5310Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu Asp Asp Ala 5315 5320 5325Thr Val Ala Cys Leu Asn Arg Glu Lys Ile Asp Ala Val Val Gly 5330 5335 5340Ala Lys Val Asp Gly Ala Leu His Leu His Glu Leu Thr Ala Gly 5345 5350 5355Met Asp Leu Ser Ala Phe Val Leu Phe Ser Ser Ala Ala Gly Val 5360 5365 5370Leu Gly Ser Pro Gly Gln Gly Asn Tyr Ala Ala Ala Asn Ala Ala 5375 5380 5385Leu Asp Ala Leu Ala His Gln Arg Arg Ala Ala Gly Leu Pro Ala 5390 5395 5400Leu Ser Leu Ala Trp Gly Leu Trp Glu Glu Ala Ser Gly Met Thr 5405 5410 5415Gly His Leu Asp Ala Gly Asp Arg His Arg Ile Thr Arg Ser Gly 5420 5425 5430Leu His Pro Leu Thr Thr Pro Asp Ala Leu Ala Leu Leu Asp Thr 5435 5440 5445Ala Leu Val Thr Gly Arg Pro Ala Leu Leu Pro Ala Asp Leu Arg 5450 5455 5460Pro Thr His Pro Ala Pro Pro Leu Leu Glu His Leu Ala Pro Ala 5465 5470 5475Arg Thr Ser Pro Arg Thr Ala His Thr Gly Thr Ser Ala Gly Ala 5480 5485 5490Gly Gln Asp Val Ser Leu Ala Asp Arg Leu Ala Thr Leu Thr Pro 5495 5500 5505Glu Gln Gln His Asp Thr Leu Phe Thr Val Val Arg Thr Gln Ile 5510 5515 5520Ala Thr Val Leu Gly His Gln Thr Pro Glu Ala Val Pro Ala Asp 5525 5530 5535Ser Ala Phe Arg Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu 5540 5545 5550Leu Arg Asn Arg Leu Ser Arg Ala Thr Gly Leu Arg Leu Pro 
Ala 5555 5560

5565Thr Leu Ala Phe Asp His Pro Thr Ala Thr Ala Leu Thr Arg His 5570 5575 5580Leu Leu Thr Arg Leu Leu Pro Asp Asp Ala Ala Thr Ala Pro Pro 5585 5590 5595Glu Gln Ser Leu Phe Ala Glu Ile Gly Arg Leu Glu Ala Val Leu 5600 5605 5610Ser Ser Val Ala Ser Pro Leu Pro Gly Ala Gln Gly Leu Gly Glu 5615 5620 5625Glu Ala Arg Ser Arg Leu Ala Ser Arg Leu Arg Ser Leu Ala Gln 5630 5635 5640Val Leu Gly Gly Glu Glu Ala Pro Arg Pro Asp Leu Gly Glu Ala 5645 5650 5655Thr Asp Glu Glu Met Phe Ala Leu Ile Asp Gln Glu Thr Gly Ser 5660 5665 5670Pro115166PRTbacteria 11Met Ala Asn Glu Glu Met Leu Arg Glu Tyr Leu Lys Arg Ala Thr Ala1 5 10 15Asp Leu Leu Arg Val Arg Arg Arg Leu Glu Gln Val Glu Ser Gly Arg 20 25 30Gln Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Phe Pro Gly Gly 35 40 45Val Arg Ser Pro Glu Asp Leu Trp Glu Leu Val Ala Ser Gly Gly Asp 50 55 60Ala Ile Gly Asp Phe Pro Val Asp Arg Gly Trp Asp Val Glu Asp Leu65 70 75 80Tyr Asp Pro Glu Pro Gly Arg Ala Gly Arg Ser Tyr Thr Arg Ser Gly 85 90 95Gly Phe Leu His Glu Ala Ala Glu Phe Asp Ala Gly Phe Phe Gly Leu 100 105 110Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Met Leu 115 120 125Glu Val Ser Trp Glu Ala Leu Glu Arg Ala Gly Ile Asp Pro Ala Thr 130 135 140Leu Arg Gly Ser Arg Thr Gly Val Phe Ala Gly Met Met Ser His Asp145 150 155 160Tyr Ala Thr Arg Leu Leu Ser Val Pro Asp His Leu Gln Gly Phe Leu 165 170 175Gly Asn Gly Asn Ala Ala Ser Val Leu Ser Gly Arg Leu Ser Tyr Thr 180 185 190Phe Gly Phe Glu Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser 195 200 205Ser Leu Val Ala Leu His Leu Ala Cys Gln Ser Val Arg Ser Gly Glu 210 215 220Ser Ser Leu Ala Leu Ala Gly Gly Val Thr Val Met Ser Thr Pro Ala225 230 235 240Met Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ser Ala Asp Gly Arg 245 250 255Cys Lys Pro Tyr Ala Ala Ala Ala Asp Gly Thr Gly Met Ser Glu Gly 260 265 270Val Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arg Leu Gly 275 280 285His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val Asn Gln Asp Gly 290 295 300Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly 
Pro Ser Gln Gln Arg Val305 310 315 320Ile Gly Gln Ala Leu Val Cys Ala Gly Leu Ser Ala Ala Glu Val Asp 325 330 335Val Val Glu Gly His Gly Thr Gly Thr Ser Leu Gly Asp Pro Ile Glu 340 345 350Ala Gln Ala Val Leu Ala Ala Tyr Gly Arg Gly Arg Gly Val Pro Leu 355 360 365Trp Leu Gly Ser Val Lys Ser Asn Leu Gly His Thr Gln Ala Ala Ala 370 375 380Gly Val Ala Gly Val Ile Lys Met Val Met Ala Leu Trp Arg Gly Arg385 390 395 400Leu Pro Arg Thr Leu His Val Asp Glu Pro Ser Pro His Val Asp Trp 405 410 415Ser Ser Gly Ala Val Arg Leu Leu Thr Glu Glu Val Val Trp Glu Arg 420 425 430Gly Glu Arg Pro Arg Arg Ala Gly Val Ser Ser Phe Gly Val Ser Gly 435 440 445Thr Asn Ala His Val Ile Leu Glu Glu Ala Pro Gln Glu Glu Glu Val 450 455 460Arg Pro Glu Glu Ala Pro Ser Gly Asp Gly Val Gly Pro Val Val Val465 470 475 480Pro Ser Gly Asp Gly Ala Gly Pro Ala Val Val Pro Trp Val Val Ser 485 490 495Ala Arg Ser Glu Ser Ala Leu Arg Gly Gln Ala Arg Arg Leu Arg Val 500 505 510Phe Ala Asp Gly Ala Gly Ala Ala Pro Val Glu Val Gly Arg Ala Leu 515 520 525Ala Val Glu Arg Ala Trp Leu Glu His Arg Ala Val Val Leu Ala Glu 530 535 540Asp Leu Asp Gly Phe Arg His Gly Leu Asp Ala Leu Ala Thr Gly Arg545 550 555 560Pro Ala Pro Glu Val Val Thr Gly Thr Ala Thr Asp Glu Gly Pro Leu 565 570 575Ala Phe Leu Phe Ala Gly Gln Gly Thr Gln Arg Pro Ala Met Gly Arg 580 585 590Glu Leu His Ala His Phe Pro Ala Phe Ala Asp Ala Phe Asp Glu Val 595 600 605Cys Ala His Phe Gly Pro Ile Gly Glu Ala Gly His Thr Leu Arg Asp 610 615 620Ile Val Phe Ala Ala Pro Gly Ser Pro Gly Ala Glu Leu Ile Glu Gln625 630 635 640Thr Glu Tyr Ala Gln Pro Ala Leu Phe Ala Val Glu Val Ala Leu Tyr 645 650 655Arg Leu Val Glu Asn Trp Gly Val Thr Pro Asp Tyr Leu Leu Gly His 660 665 670Ser Val Gly Glu Leu Ala Ala Ala His Val Ala Gly Met Leu Ser Leu 675 680 685Pro Asp Ala Ala Ala Leu Val Thr Ala Arg Gly Arg Leu Met Gln Ala 690 695 700Leu Pro Asp Thr Gly Ala Met Val Ala Val Glu Ala Thr Glu Glu Glu705 710 715 720Val Arg Pro Leu Leu Gln Asp Ala Glu Gly Arg Ala Asp Leu Ala Ala 725 730 
735Val Asn Gly Pro Arg Ala Val Val Leu Ala Gly Asp Glu Asp Ala Val 740 745 750Leu Thr Leu Ala Arg His Trp Ala Glu Gln Gly Arg Arg Thr Arg Arg 755 760 765Leu Arg Thr Ser His Ala Phe His Ser Pro His Leu Asp Ala Val Leu 770 775 780Asp Asp Phe Arg Arg Val Ala Glu Gln Val Val Phe Ala Pro Pro Arg785 790 795 800Ile Pro Val Val Thr Asn Leu Thr Gly Ala Pro Val Ser Ala Asp Thr 805 810 815Met Gly Thr Ala Asp Tyr Trp Val Gln His Ala Arg His Thr Val Arg 820 825 830Phe Gly Asp Gly Leu Ala Trp Leu Gln Ala Gln Gly Val Thr Ala Tyr 835 840 845Leu Glu Leu Gly Pro Asp Gly Thr Leu Cys Ala Leu Gly Gln Asp Ala 850 855 860Leu Thr Glu Pro Ala Pro Leu Leu Pro Ala Leu Arg Pro Asp Arg Pro865 870 875 880Glu Ala Val Ser Val Leu Ala Ala Val Ala Gly Leu Ser Val Arg Gly 885 890 895Val Arg Val Asp Trp Ala Ala Val Leu Gly Gly Ala Pro Ser Gly Thr 900 905 910Ala Gly Arg Val Glu Leu Pro Thr Tyr Ala Phe Glu Arg Glu Arg Tyr 915 920 925Trp Leu Asp Ala Gly Glu Thr Pro Ala Ala Leu Pro Ala Gly Glu Asp 930 935 940Gly Pro Leu Trp Gln Ala Val Glu Arg Ala Asp Leu Pro Ala Val Ala945 950 955 960Ala Leu Leu Glu Val Asp Glu Asp Ala Pro Leu Gly Ser Val Val Ser 965 970 975Ala Leu Gly Asp Trp Arg Arg Gly Val Arg Glu Arg Ala Val Val Asp 980 985 990Gly Trp Arg Tyr Arg Val Val Trp Arg Pro Val Ser Arg Ser Gly Gly 995 1000 1005Gly Val Val Ser Gly Gly Val Trp Val Val Val Val Pro Glu Gly 1010 1015 1020Val Val Gly Ala Ala Ala Val Val Glu Gly Leu Glu Arg Ala Gly 1025 1030 1035Val Cys Val Arg Val Val Ala Val Glu Gly Gly Cys Ala Asp Arg 1040 1045 1050Val Val Leu Gly Glu Arg Leu Arg Glu Val Cys Gly Gly Glu Gly 1055 1060 1065Pro Val Gly Val Leu Ala Val Cys Gly Gly Gly Val Gly Val Ala 1070 1075 1080Gly Leu Val Leu Gly Leu Val Gln Ala Val Glu Gly Leu Gly Val 1085 1090 1095Pro Leu Trp Cys Val Thr Arg Gly Ala Val Ser Val Gly Glu Gly 1100 1105 1110Asp Arg Leu Gly Asp Pro Gly Gly Ala Val Val Trp Gly Leu Gly 1115 1120 1125Arg Val Ala Gly Leu Glu Leu Pro Asp Arg Trp Gly Gly Val Val 1130 1135 1140Asp Leu Pro Glu Val Val Asp Glu Arg Val Val 
Glu Gly Leu Leu 1145 1150 1155Gly Val Leu Ser Gly Gly Gly Gly Glu Gly Glu Val Ala Val Arg 1160 1165 1170Ala Ser Gly Val Phe Val Arg Arg Leu Val Arg Ala Pro Gly Gly 1175 1180 1185Gly Ala Glu Ala Gly Gly Trp Arg Pro Arg Gly Thr Val Leu Ile 1190 1195 1200Thr Gly Gly Thr Gly Ala Leu Gly Ala His Val Ala Arg Trp Met 1205 1210 1215Val Arg Arg Gly Ala Glu His Leu Leu Leu Val Ser Arg Ser Gly 1220 1225 1230Arg Glu Ala Lys Gly Ala Gly Glu Leu Arg Ala Glu Leu Thr Ala 1235 1240 1245Met Gly Ala Arg Val Thr Ile Ala Ala Cys Asp Val Ala Asp Arg 1250 1255 1260Gly Ala Leu Ala Glu Leu Leu Ala Thr Ala Val Pro Glu Asp Cys 1265 1270 1275Pro Leu Gly Ala Val Val His Thr Ala Gly Val Val Asp Asp Gly 1280 1285 1290Val Leu Asp Ala Leu Thr Pro Glu Arg Leu Glu Gly Val Leu Ala 1295 1300 1305Ala Lys Ala Val Gly Ala Arg Asn Leu His Glu Leu Thr Arg Gly 1310 1315 1320Ala Asp Leu Ser Ala Phe Val Val Phe Ser Ser Ala Ala Ala Thr 1325 1330 1335Phe Gly Ser Gly Gly Gln Gly Ala Tyr Val Ala Ala Asn Ala Tyr 1340 1345 1350Val Glu Ala Leu Ala Val His Arg Arg Gly Leu Gly Leu Pro Ser 1355 1360 1365Thr Ala Val Ala Trp Gly Ala Trp Ala Gly Gly Gly Met Ala Ala 1370 1375 1380Asp Ala Glu Ala Ala Thr Arg Met Asp Arg Arg Gly Ile Arg Pro 1385 1390 1395Met Asp Thr Glu Pro Ala Leu Ser Ala Leu Gly Gln Val Leu Asp 1400 1405 1410Arg Asn Glu Thr Cys Leu Thr Ile Ala Asp Ile Asp Trp Glu Arg 1415 1420 1425Leu Pro Ala Ala Asp Gly Leu Ala Arg Leu Leu Ser Asp Ile Pro 1430 1435 1440Glu Ala Arg Leu Ala Arg Pro Ala Thr Gly Thr Glu Ala Pro Gly 1445 1450 1455Ser Leu Arg Ala Arg Leu Ala Ala Leu Glu Pro Ala Glu Arg Asp 1460 1465 1470Arg Ala Leu Leu Asp Leu Val Arg Thr His Thr Ala Thr Val Leu 1475 1480 1485Gly His Arg Thr Ala Thr Ala Val Pro Ala Asp Arg Ala Phe Arg 1490 1495 1500Glu Leu Gly Phe Gly Ser Leu Asn Ala Val Glu Leu Arg Asn Gly 1505 1510 1515Leu Asn Thr Ala Thr Gly Leu Arg Leu Pro Ser Thr Leu Val Phe 1520 1525 1530Asp Tyr Pro Asn Pro Ser Ala Leu Ala Thr His Leu Gly Thr Leu 1535 1540 1545Leu Ser Thr Gly Gly Glu Ala Pro Ala Gly Arg 
Pro Ala Phe Ile 1550 1555 1560Arg Ser Gly Val Val Asp Glu Pro Val Ala Ile Val Gly Met Ala 1565 1570 1575Cys Arg Phe Pro Gly Gly Val Trp Ser Pro Glu Asp Leu Trp Glu 1580 1585 1590Leu Val Ala Ser Gly Gly Asp Ala Ile Gly Gly Phe Pro Val Asp 1595 1600 1605Arg Gly Trp Asp Val Glu Gly Leu Tyr Asp Pro Glu Ala Gly Arg 1610 1615 1620Pro Gly Ser Ser Tyr Thr Arg Ala Gly Gly Phe Leu Ala Gly Ala 1625 1630 1635Ala Glu Phe Asp Ala Gly Phe Phe Gly Ile Ser Pro Arg Glu Ala 1640 1645 1650Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Val Ser Trp 1655 1660 1665Glu Ala Leu Glu Arg Ala Gly Ile Asp Pro Val Ser Leu Arg Gly 1670 1675 1680Ser Arg Thr Gly Val Phe Ala Gly Val Ala Asn Gln Asp Tyr Ala 1685 1690 1695Glu Leu Val Arg Arg Gly Gly Arg Asp Leu Glu Gly Tyr Ala Leu 1700 1705 1710Thr Gly Val Ser Gly Ser Val Leu Ser Gly Arg Leu Ser Tyr Thr 1715 1720 1725Phe Gly Leu Lys Gly Pro Pro Val Thr Val Asn Thr Ala Cys Ser 1730 1735 1740Ser Ser Leu Val Ala Leu His Leu Ala Cys Gln Ser Leu Arg Ser 1745 1750 1755Gly Glu Ser Lys Leu Ala Leu Pro Gly Gly Val Thr Val Met Ser 1760 1765 1770Thr Pro Gly Ala Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ser 1775 1780 1785Pro Asp Gly Arg Cys Lys Ala Phe Ala Thr Pro Thr Asn Gly Val 1790 1795 1800Gly Trp Ser Glu Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser 1805 1810 1815Asp Ala Arg Arg Leu Gly His Arg Val Leu Pro Val Val Arg Gly 1820 1825 1830Ser Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro 1835 1840 1845Asn Gly Pro Ser Gln Gln Arg Val Ile Gly Gln Ala Leu Val Cys 1850 1855 1860Ala Gly Leu Ser Ala Ala Glu Val Asp Val Val Glu Gly His Gly 1865 1870 1875Thr Gly Thr Ser Leu Gly Asp Pro Ile Glu Ala Gln Ala Val Leu 1880 1885 1890Ala Ala Tyr Gly Arg Gly Arg Gly Val Pro Leu Trp Leu Gly Ser 1895 1900 1905Val Lys Ser Asn Leu Gly His Thr Gln Ala Ala Ala Gly Val Ala 1910 1915 1920Gly Val Ile Lys Met Val Met Val Leu Trp Arg Gly Arg Leu Pro 1925 1930 1935Arg Thr Leu His Val Asp Glu Pro Ser Pro His Val Asp Trp Ser 1940 1945 1950Ser Gly Ala Val Arg Leu Leu Thr Glu Glu Val 
Val Trp Glu Arg 1955 1960 1965Gly Glu Arg Pro Arg Arg Ala Gly Val Ser Ser Phe Gly Val Ser 1970 1975 1980Gly Thr Asn Ala His Val Ile Leu Glu Glu Ala Pro Gln Glu Glu 1985 1990 1995Glu Val Arg Pro Glu Glu Ala Pro Ser Gln Gly Glu Ala Gly Pro 2000 2005 2010Ala Val Val Pro Trp Val Val Ser Ala Arg Ser Glu Ser Ala Leu 2015 2020 2025Arg Gly Gln Ala Arg Arg Leu Arg Val Phe Ala Asp Gly Ala Gly 2030 2035 2040Ala Ala Pro Val Glu Val Gly Arg Ala Leu Ala Val Glu Arg Ala 2045 2050 2055Trp Leu Glu His Arg Ala Val Val Leu Ala Glu Asp Leu Asp Gly 2060 2065 2070Phe Arg His Gly Leu Asp Ala Leu Ala Thr Gly Leu Pro Thr Ala 2075 2080 2085Gly Val Val Ala Gly Arg Thr Gly Pro Glu Ala Asp Gly Lys Ile 2090 2095 2100Ala Leu Leu Phe Gly Gly Gln Gly Thr Gln Trp Asp Gly Met Ala 2105 2110 2115Ala Glu Leu Leu Asp Ser Ser Pro Val Phe Ala Gln Arg Met Thr 2120 2125 2130Glu Cys Ala Asp Ala Leu Arg Pro Tyr Leu Asp Trp Glu Leu Leu 2135 2140 2145Asp Val Leu Arg Gly Glu Pro Asp Ala Pro Pro Leu Asp Arg Val 2150 2155 2160Asp Val Val Gln Pro Val Leu Phe Ala Val Met Val Ser Leu Ala 2165 2170 2175Ala Leu Trp Arg Ser Tyr Gly Val Arg Pro Asp Ala Val Ala Gly 2180 2185 2190His Ser Gln Gly Glu Ile Ala Ala Ala Cys Val Ala Gly Ala Leu 2195 2200 2205Ser Leu Glu Asp Ala Ala Arg Val Thr Ala Leu Arg Ser Gln Ala 2210 2215 2220Leu Ala Ala Leu Ala Gly Gln Gly Ala Met Ala Ser Val Gly Leu 2225 2230 2235Pro Ala Glu Asp Leu Glu Pro Arg Leu Ala Ala Val Asp Pro Ser 2240 2245 2250Leu Val Val Ala Ala Asp Asn Gly Ala Arg Ser Ala Val Val Ser 2255 2260 2265Gly Ser Pro Asp Ala Val Thr Ala Leu Val Asp Asp Leu Thr Arg 2270 2275 2280Asp Gly Val Pro Ala Arg Leu Leu Lys Val Asp Trp Ala Ser His 2285 2290 2295Ser Pro Gln Val Glu Ala Ile Arg Ala Asp Leu Leu Gly Leu Leu 2300 2305 2310Ala Pro Val Thr Pro Arg Pro Ala Asp Ile Pro Leu Tyr Ser Thr 2315 2320 2325Val Thr Gly Glu Pro Val Asp Gly Thr Ala Leu Asp Ala Ala Tyr 2330 2335

2340Trp Tyr Arg Asn Leu Arg Glu Pro Val Arg Phe Arg Asp Ala Thr 2345 2350 2355Arg Ala Leu Ala Arg Asp Gly His Thr Val Phe Val Glu Ala Gly 2360 2365 2370Pro His Pro Ala Val Ser Val Ala Val Gln Glu Thr Leu Asp Asp 2375 2380 2385Leu Gly Ala Ala Asp Thr Leu Val Val Gly Ser Leu Arg Arg Gly 2390 2395 2400Glu Gly Gly Leu Arg Arg Phe Leu Ala Ser Ala Ala Glu Leu Ser 2405 2410 2415Val Arg Gly Val Arg Val Asp Trp Ala Ala Val Leu Gly Gly Lys 2420 2425 2430Pro Ser Gly Thr Ala Gly Arg Val Glu Leu Pro Thr Tyr Ala Phe 2435 2440 2445Glu Arg Glu Arg Tyr Trp Leu Asp Pro Glu Glu Thr Pro Ala Ala 2450 2455 2460Pro Ala Thr Thr Glu Asp Gly Pro Leu Trp Glu Ala Val Glu Arg 2465 2470 2475Glu Asp Pro Ala Ala Val Ala Ala Leu Leu Ala Val Asp Glu Asp 2480 2485 2490Ala Pro Leu Asp Ala Leu Val Ser Ala Leu Gly Asp Trp Arg Arg 2495 2500 2505Gly Val Arg Glu Arg Ala Val Val Asp Gly Trp Arg Tyr Arg Val 2510 2515 2520Val Trp Arg Pro Val Ser Arg Ser Gly Gly Gly Val Val Ser Gly 2525 2530 2535Gly Val Trp Val Val Val Val Pro Glu Gly Val Val Gly Ala Ala 2540 2545 2550Ala Val Val Glu Gly Leu Glu Trp Ala Gly Val Cys Val Arg Val 2555 2560 2565Val Ala Val Glu Gly Gly Cys Ala Asp Arg Val Val Leu Gly Glu 2570 2575 2580Arg Leu Arg Glu Val Trp Gly Gly Glu Gly Pro Val Gly Val Leu 2585 2590 2595Ala Val Cys Gly Gly Gly Val Gly Val Ala Gly Leu Val Leu Gly 2600 2605 2610Leu Val Gln Ala Val Glu Gly Leu Gly Val Pro Leu Trp Cys Val 2615 2620 2625Thr Arg Gly Ala Val Ser Val Gly Glu Gly Asp Arg Leu Gly Asp 2630 2635 2640Pro Gly Gly Ala Val Val Trp Gly Leu Gly Arg Val Ala Gly Leu 2645 2650 2655Glu Leu Pro Asp Arg Trp Gly Gly Val Val Asp Leu Pro Glu Val 2660 2665 2670Val Asp Glu Arg Val Val Glu Gly Leu Leu Gly Val Leu Ser Gly 2675 2680 2685Gly Gly Gly Glu Gly Glu Val Ala Val Arg Ala Ser Gly Val Phe 2690 2695 2700Val Arg Arg Leu Val Arg Ala Pro Gly Gly Gly Ala Glu Ala Gly 2705 2710 2715Gly Trp Arg Pro Arg Gly Thr Val Leu Ile Thr Gly Glu Asn Ala 2720 2725 2730Asp Pro Glu Gln Pro Ala Ala His Leu Ala Arg Trp Leu Ala Asp 2735 2740 
2745Arg Gly Ala Glu His Leu Leu Leu Ile Ser Thr Ser Gly Asp Gly 2750 2755 2760Phe Gly Leu Ala Asp Thr Thr Asp Gln Trp Gly Ala Arg Val Thr 2765 2770 2775Ile Ala Ala Cys Asp Val Ala Asp Arg Gly Ala Leu Ala Glu Leu 2780 2785 2790Leu Ala Thr Ala Val Pro Glu Asp Cys Pro Leu Gly Ala Val Val 2795 2800 2805His Thr Ala Gly Val Val Asp Asp Gly Val Leu Asp Ala Leu Thr 2810 2815 2820Pro Glu Arg Leu Glu Gly Val Leu Ala Ala Arg Ala Val Gly Ala 2825 2830 2835Arg Asn Leu His Glu Leu Thr Arg Gly Ala Asp Leu Ser Ala Phe 2840 2845 2850Val Val Phe Ser Ser Ala Ala Ala Thr Phe Gly Ser Gly Gly Gln 2855 2860 2865Gly Ala Tyr Val Ala Ala Asn Ala Tyr Val Glu Ala Leu Ala Val 2870 2875 2880His Arg Arg Gly Leu Gly Leu Pro Ser Thr Ala Val Ala Trp Gly 2885 2890 2895Pro Trp Arg Gly His Ser Ala Ala Gly Arg Pro Asp Ala Ala Ala 2900 2905 2910Arg Leu His Arg Arg Gly Leu Thr Glu Met Ala Pro Glu Leu Ala 2915 2920 2925Leu Ala Ala Leu Ala Arg Val Leu Asp His Asp Glu Ser Gly Leu 2930 2935 2940Thr Val Ala Asp Ile Asp Trp Glu Arg Phe Thr Ala His Thr Ala 2945 2950 2955Gly Ser Arg Leu Pro Leu Ile Gly Asp Leu Pro Asp Val Arg Ala 2960 2965 2970Leu Thr Arg Ala Thr Gly Thr Gly Thr Ala His Gly Thr Asp Leu 2975 2980 2985Arg Asp Arg Leu Ala Ala Leu Glu Pro Asp Ala Arg Thr Asp Val 2990 2995 3000Leu Leu Glu Leu Val Ser Thr His Thr Ala Ala Val Leu Gly His 3005 3010 3015Arg Glu Ala Asp Thr Val Pro Ala Asp Arg Ala Phe Arg Glu Leu 3020 3025 3030Gly Phe Asp Ser Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Asn 3035 3040 3045Thr Ala Thr Gly Leu Arg Leu Pro Thr Thr Leu Val Phe Asp Tyr 3050 3055 3060Pro Arg Pro Ala Val Leu Ala Arg His Leu Arg Asp Gln Leu Cys 3065 3070 3075Gly Thr Ala Pro Ala Thr Pro Pro Val Ala Ala Arg Pro Gly Val 3080 3085 3090Val Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Phe Pro 3095 3100 3105Gly Gly Val Trp Ser Pro Glu Asp Leu Trp Glu Leu Val Ala Ser 3110 3115 3120Gly Gly Asp Ala Ile Gly Gly Phe Pro Val Asp Arg Gly Trp Asp 3125 3130 3135Val Glu Gly Leu Tyr Asp Pro Glu Ala Gly Arg Pro Gly Ser Ser 3140 3145 
3150Tyr Thr Arg Ser Gly Gly Phe Leu Ala Gly Ala Ala Glu Phe Asp 3155 3160 3165Ala Gly Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp 3170 3175 3180Pro Gln Gln Arg Leu Leu Leu Glu Val Ser Trp Glu Ala Leu Glu 3185 3190 3195Arg Ala Gly Ile Asp Pro Val Ser Leu Arg Gly Ser Arg Thr Gly 3200 3205 3210Val Phe Ala Gly Val Ala Asn Gln Asp Tyr Ala Glu Leu Val Arg 3215 3220 3225Arg Gly Gly Arg Asp Leu Glu Gly Tyr Ala Leu Thr Gly Val Ser 3230 3235 3240Gly Ser Val Leu Ser Gly Arg Leu Ser Tyr Thr Phe Gly Leu Glu 3245 3250 3255Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val 3260 3265 3270Ala Leu His Leu Ala Cys Gln Ser Leu Arg Ser Gly Glu Ser Glu 3275 3280 3285Leu Ala Leu Ala Gly Gly Val Thr Val Met Ser Thr Pro Gly Ala 3290 3295 3300Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ser Ala Asp Gly Arg 3305 3310 3315Cys Lys Ala Phe Ala Ala Ala Ala Asp Gly Val Gly Trp Ser Glu 3320 3325 3330Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arg 3335 3340 3345Leu Gly His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val Asn 3350 3355 3360Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser 3365 3370 3375Gln Gln Arg Val Ile Gly Gln Ala Leu Val Cys Ala Gly Leu Ser 3380 3385 3390Ala Ala Glu Val Asp Val Val Glu Gly His Gly Thr Gly Thr Ser 3395 3400 3405Leu Gly Asp Pro Ile Glu Ala Gln Ala Val Leu Ala Ala Tyr Gly 3410 3415 3420Arg Gly Arg Gly Val Pro Leu Trp Leu Gly Ser Val Lys Ser Asn 3425 3430 3435Leu Gly His Thr Gln Ala Ala Ala Gly Val Ala Gly Val Ile Lys 3440 3445 3450Met Val Met Ala Leu Trp Arg Gly Arg Leu Pro Arg Thr Leu His 3455 3460 3465Val Asp Glu Pro Ser Pro His Val Asp Trp Ser Ser Gly Ala Val 3470 3475 3480Arg Leu Leu Thr Glu Glu Val Val Trp Glu Arg Gly Glu Arg Pro 3485 3490 3495Arg Arg Ala Gly Val Ser Ser Phe Gly Val Ser Gly Thr Asn Ala 3500 3505 3510His Val Ile Leu Glu Glu Ala Pro Gln Glu Glu Glu Val Arg Pro 3515 3520 3525Glu Glu Ala Pro Ser Gln Asp Glu Ala Gly Pro Ala Thr Val Pro 3530 3535 3540Cys Leu Leu Ser Ala Arg Thr Asp Thr Ala Leu Arg Ala Gln Ala 3545 3550 
3555Arg Arg Leu Arg Asp Tyr Leu Ala Ala Asn Pro Asp Ile Pro Ile 3560 3565 3570Gly Asp Val Ala His Ala Leu Ala Thr Gly Arg Ser Thr Phe Glu 3575 3580 3585Arg Arg Ala Val Leu Val Ala Glu Asp His Glu Gly Leu Leu Arg 3590 3595 3600Thr Leu Asp Ala Leu Ala Glu Gly Thr Thr Ala Pro Gly Leu Ile 3605 3610 3615Glu Ser Pro Ala Arg Thr Ala His Gly Lys Val Ala Phe Leu Phe 3620 3625 3630Ser Gly Gln Gly Thr Gln Arg Pro Gly Met Gly Arg Glu Leu Tyr 3635 3640 3645Ala Ala His Pro Ala Phe Ala Gln Ala Leu Asp Asp Val Leu Ala 3650 3655 3660Glu Leu Glu Pro His Leu Asp Arg Pro Leu Arg Pro Leu Leu Leu 3665 3670 3675Asp Glu Pro Gln Pro Leu Asp Arg Thr Gly Asp Ala Gln Pro Ala 3680 3685 3690Leu Phe Ala Leu Gln Val Ala Leu Phe Arg Leu Leu Glu Ser Ala 3695 3700 3705Gly Ile Arg Pro Asp His Val Ala Gly His Ser Ile Gly Glu Leu 3710 3715 3720Ala Ala Ala His Val Ala Gly Val Leu Ser Leu Thr Asp Ala Ala 3725 3730 3735Arg Leu Val Ala Ala Arg Gly Arg Leu Ala Gln Thr Gln Leu Pro 3740 3745 3750Pro Gly Gly Ala Met Leu Ala Val Arg Ala Ser Glu Glu Gln Val 3755 3760 3765Thr Arg Met Leu Ala Gly Arg Glu Ala Arg Val Ala Val Ala Ala 3770 3775 3780Val Asn Gly Pro Thr Ser Val Val Ile Ser Gly Ala Glu Pro Asp 3785 3790 3795Val Leu Glu Ala Ala Ala Ala Phe Ala Glu Gln Gly Leu Arg Thr 3800 3805 3810Lys Arg Leu Ser Thr Asp Arg Ala Phe His Ser Pro Leu Met Glu 3815 3820 3825Pro Ile Leu Glu Glu Phe Arg Gln Val Ala Thr Gly Ile Ala Tyr 3830 3835 3840Ala Glu Pro Thr Ile Pro Val Val Ser Thr Val Thr Gly Asp Arg 3845 3850 3855Ala Thr Ala Gly Thr Leu Thr Asp Pro Glu Tyr Trp Val Arg Gln 3860 3865 3870Leu Arg Arg Thr Val Arg Phe Gly Asp Ala Val Arg Arg Leu His 3875 3880 3885Asp Asp Asp Gly Val Arg Thr Phe Leu Glu Leu Gly Pro Asp Gly 3890 3895 3900Thr Leu Cys Ala Leu Ala Gly Glu Cys Leu Pro Ala Asp Asp Asn 3905 3910 3915Thr Thr Glu Pro Gly Pro Ala Leu Val Pro Leu Leu Arg Ala Asp 3920 3925 3930Arg Pro Glu Pro Leu Ala Leu Leu Thr Ala Leu Ala His Leu His 3935 3940 3945Val Gln Gly Thr Pro Lys Gly Gly Thr Ala Val His Trp Pro Ala 3950 3955 
3960Leu Ile Gly Ala Thr Pro Glu Arg Ala Arg His Leu Asp Leu Pro 3965 3970 3975Thr Tyr Pro Phe Asp Arg Arg Arg Tyr Trp Leu Asp Ala Asp Thr 3980 3985 3990Ser Leu Ser Gly Asp Val Ser Ala Ala Gly Leu Thr Ala Ala Gly 3995 4000 4005His Pro Leu Leu Gly Ser Ala Val Pro Leu Ala Gly Ser Pro Gln 4010 4015 4020Ser Gln Glu Cys Leu Leu Thr Gly Arg Ile Ser Leu Arg Thr His 4025 4030 4035Pro Trp Leu Ala Asp His Ala Val Phe Gly Thr Val Leu Leu Pro 4040 4045 4050Gly Thr Ala Ile Leu Glu Leu Ala Val Arg Ala Gly Asp Glu Val 4055 4060 4065Gly Cys Asp Thr Val Glu Glu Leu Ala Leu Gln Val Pro Leu Val 4070 4075 4080Leu Pro Glu Arg Gly Ser Val Val Leu Gln Leu Ser Val Gly Ala 4085 4090 4095Thr Glu Thr Ala Pro Asp Gly Val Glu Arg Arg Pro Phe Thr Leu 4100 4105 4110Tyr Ala Arg Glu Asp Asp Gly Leu Thr Pro Ala Ala Pro Thr Gly 4115 4120 4125Thr Asp Gly Thr Gly Trp Thr Cys His Ala Thr Gly Val Leu Thr 4130 4135 4140Arg Arg Ala Glu Thr Ala His Asp Thr Ala Ala Pro Trp Pro Pro 4145 4150 4155Thr Asp Ala Val Pro Val Asp Leu Asp His Trp Tyr Gly Thr Leu 4160 4165 4170Ala Asp Ala Gly Leu Gly Tyr Gly Pro Ala Phe Gln Gly Leu Arg 4175 4180 4185Ala Ala Trp Arg His Gly Asp Asp Leu Tyr Ala Glu Val Ala Leu 4190 4195 4200Pro Asp Gly Pro Ser Gly Asp Ala Asp Arg Tyr Ala Val His Pro 4205 4210 4215Ala Leu Leu Asp Ala Ala Leu His Pro Val Val Leu Gly Phe Ala 4220 4225 4230Glu Asp Glu Pro Asp Glu Gly His Gly Trp Leu Pro Phe Ser Trp 4235 4240 4245Ser Gly Val Thr Val Thr Ala Ser Gly Ala Ser Ala Leu Arg Val 4250 4255 4260Arg Leu Ser Arg Arg Ser Pro Asp Thr Ile Ala Leu Leu Ala Thr 4265 4270 4275Asp Ser Thr Gly His Thr Val Val Thr Ala Glu Ser Leu Ala Phe 4280 4285 4290Arg Pro Val Thr Ala Gly Gln Leu His Ser Ala Arg Thr Ala His 4295 4300 4305His Asp Ala Leu Phe Arg Leu Asp Trp Ala Pro Val Pro Leu Pro 4310 4315 4320Arg Thr Pro Ser Ser Lys Thr Arg Leu Ala Leu Ile Gly Ser Glu 4325 4330 4335Ala Glu Cys Pro Asp Ala Pro Gly Val Pro Trp Ser Thr Tyr Ala 4340 4345 4350Asp Leu Glu Glu Leu Ala Ser Ala Gly Thr Pro Val Pro Asp Val 4355 4360 
4365Val Val Val Pro Cys Pro His Arg Asp Gly Ala Ala Asp Ala Ala 4370 4375 4380Asp Ala Thr Arg Arg Ala Thr Val Arg Val Leu His Leu Leu Gln 4385 4390 4395Ser Trp Leu Ala Asp Asp Arg Phe Ala Asp Ser Arg Leu Ala Phe 4400 4405 4410Val Thr His Gly Ala Val Ala Ala Ala Pro Gly Asp Ser Val Pro 4415 4420 4425Asp Leu Ala His Ala Ala Val Trp Gly Met Val Arg Ser Ala Gln 4430 4435 4440Thr Glu Asn Pro Gly Arg Phe Val Leu Thr Asp Leu Asp Asp Thr 4445 4450 4455Asp Ala Ser Arg Arg Ala Leu Ala Ala Ala Leu Leu Ser Gly Glu 4460 4465 4470Pro Gln Thr Val Leu Arg Glu Gly Arg Ala His Thr Pro Arg Leu 4475 4480 4485Ala Arg Ile Pro Val Gly Ala Arg Ala Asp Ser Gly His Trp Asp 4490 4495 4500Pro Asp Ala Thr Val Leu Ile Thr Gly Gly Thr Gly Tyr Leu Gly 4505 4510 4515Arg Leu Leu Ala Arg His Leu Val Val Thr His Gly Val Arg His 4520 4525 4530Leu Leu Leu Thr Ser Arg Ser Gly Pro Thr Ala Pro Gly Thr Ala 4535 4540 4545Glu Leu Val Ala Glu Leu Ala Glu Leu Gly Ala Arg Thr Thr Ala 4550 4555 4560Val Ala Cys Asp Leu Ala Asp Arg Arg Ala Val Ala Ala Leu Leu 4565 4570 4575Ala Glu Ile Pro Ala Arg His Pro Leu Lys Ala Val Leu His Thr 4580 4585 4590Ala Gly Val Val Asp Asp Gly Val Leu Thr Ser Leu Thr Pro Asp 4595 4600 4605Arg Leu Asp Ala Val Leu Ser Ala Lys Ala His Gly Ala Ala His 4610 4615 4620Leu His Asp Leu Thr Arg Asp Ala Gly Leu Asp Ala Phe Ile Ala 4625 4630 4635Phe Ser Ser Ala Ala Ala Ser Phe Gly Ser Pro Gly Gln Ala Asn 4640 4645 4650Tyr Thr Ala Ala Asn Ala Phe Leu Asp Ala Leu Met Gln Gln Arg 4655 4660 4665His Ala Leu Gly Leu Pro Gly Arg Ser Leu Ala Trp Gly Arg Trp 4670 4675 4680Ala Glu Ala Gly Gly Met Ala Glu His Leu Ala Ala Ala Asp Val 4685 4690 4695Ala Arg Met Thr Arg Ser Gly Leu Leu Pro Leu Thr Asn Ala His 4700 4705 4710Gly Leu Ala Leu Phe Asp Thr Ala Leu Ala Leu Asp Glu Pro Leu 4715 4720 4725Leu Leu Ala Thr Pro Leu Asp Pro Gly Thr Leu Arg Glu Gln Ala 4730 4735 4740Ala Val Gly Thr Leu Pro Pro Val Leu Arg Gly Leu Val Arg Thr 4745 4750 4755Pro Ala Arg Arg Thr Ala Asp His Gly Val Gly Ala Asp Ala Ala 4760 4765 
4770Ala Glu Leu Arg Gly Arg Leu Ala Gly

Thr Pro Lys Pro Ala Glu 4775 4780 4785Arg Thr Ala Leu Leu Thr Glu Val Val Arg Thr His Ala Ala Ala 4790 4795 4800Val Leu Gly His Gly Gly Thr Asp Thr Val Thr Ala Asp Gly Glu 4805 4810 4815Phe Arg Glu Phe Gly Phe Asp Ser Leu Thr Ala Val Glu Leu Arg 4820 4825 4830Asn Arg Leu Asn Ala Ala Thr Gly Leu Arg Leu Ala Thr Thr Leu 4835 4840 4845Val Phe Asp His Pro Thr Pro Ala Ala Leu Ala Asp His Leu Glu 4850 4855 4860Arg Leu Leu Ala Ala Glu Pro Ala Ser Asp Met Thr Ala Glu Thr 4865 4870 4875Ala Gly Ala Pro Gly Glu Arg Asp Ala Thr Ala Ser Ser Arg Ala 4880 4885 4890Gly Ser Gly Pro Ser Ala Asp Thr Val Glu Ala Leu Phe Trp Ile 4895 4900 4905Gly His Asp Ser Gly Arg Val Glu Glu Ser Met Ala Leu Leu Ser 4910 4915 4920Ala Ala Ser Ala Phe Arg Pro Cys Phe Thr Asp Pro Ser Ala Met 4925 4930 4935Thr Arg Pro Pro Phe Val Arg Val Ala Gln Gly Asp Thr Gly Pro 4940 4945 4950Ala Leu Ile Cys Leu Pro Thr Val Ala Ala Val Ser Ser Val Tyr 4955 4960 4965Gln Tyr Ser Arg Phe Ala Ala Ala Leu Asp Gly Leu Arg Asp Val 4970 4975 4980Trp Tyr Val Pro Ala Pro Gly Phe Ala Asp Gly Glu Pro Leu Pro 4985 4990 4995Ala Asp Val Asp Thr Ile Thr Arg Leu Phe Thr Asp Ala Ile Leu 5000 5005 5010Arg His Thr Asp Gly Glu Pro Phe Ala Leu Ala Gly His Ser Ala 5015 5020 5025Gly Gly Trp Phe Thr His Thr Val Thr Ser Arg Leu Glu His Leu 5030 5035 5040Gly Val Arg Pro Gln Ala Val Val Val Met Asp Ala Tyr Leu Pro 5045 5050 5055Asp Glu Gly Met Ala Pro Val Ala Ala Ala Leu Thr Ser Glu Ile 5060 5065 5070Phe Asp Arg Val Thr Glu Phe Ile Asp Leu Asp Tyr Ala Arg Leu 5075 5080 5085Val Ala Met Gly Gly Tyr Phe Arg Ile Phe Ala Gly Trp Arg Pro 5090 5095 5100Pro Ala Leu Glu Thr Pro Thr Leu Phe Leu Arg Ala Arg Glu Ser 5105 5110 5115Glu Gln Pro Pro Pro Val Trp Gly Glu Pro His Thr Val Leu Glu 5120 5125 5130Thr Asp Gly Asn His Phe Thr Met Leu Glu Glu His Ala Glu Ser 5135 5140 5145Thr Ala Arg His Val His Thr Trp Leu Ala Gly Leu Thr Glu Gln 5150 5155 5160Arg Arg Arg 516512254PRTbacteria 12Met Asp Arg Tyr Ala Lys Arg Phe Glu Asp Arg Leu Val Leu Val Thr1 5 10 15Gly 
Ala Gly Ser Gly Ile Gly Arg Ala Thr Ala Cys Arg Phe Gly Ala 20 25 30Ala Gly Ala Arg Leu Val Cys Val Asp Arg Asp Gly Pro Gly Ala Glu 35 40 45Ala Thr Ala Glu Leu Ala Arg Ala Arg Gly Ala Arg Ala Ala Cys Ala 50 55 60Glu Val Ala Asp Val Ser Asp Glu Val Ala Met Glu Arg Leu Ala Ala65 70 75 80Arg Val Thr Ala Ala His Gly Val Leu Asp Val Leu Val Asn Asn Ala 85 90 95Gly Ile Gly Met Ser Gly Arg Phe Leu Asp Thr Ser Ala Glu Asp Trp 100 105 110Arg Arg Thr Leu Gly Val Asn Leu Trp Gly Val Ile His Gly Cys Arg 115 120 125Leu Leu Gly Arg Gly Met Ala Glu Arg Arg Gln Gly Gly His Ile Val 130 135 140Thr Val Ala Ser Ala Ala Ala Phe Gln Pro Thr Arg Val Val Pro Val145 150 155 160Tyr Ala Thr Ser Lys Ala Ala Ala Leu Met Leu Ser Glu Cys Leu Arg 165 170 175Ala Glu Leu Ala Glu Phe Gly Ile Gly Val Ser Val Val Cys Pro Gly 180 185 190Leu Val Arg Thr Pro Phe Ala Ser Ala Met Tyr Phe Ala Gly Ala Ser 195 200 205Pro Asp Glu His Thr Arg Leu Arg Glu Ser Ser Ala Arg Arg Phe Ala 210 215 220Gly Arg Gly Cys Pro Pro Glu Lys Val Ala Asp Ala Val Leu Arg Ala225 230 235 240Ile Met Arg Thr Ala Leu Pro Thr Val Thr Gly Ser Thr Pro 245 250

SEQ ID NO:13 (7 aa, PRT, bacteria): Gly Gly Thr Gly Thr Leu Gly
SEQ ID NO:14 (7 aa, PRT, bacteria): Gly Ala Ala Ser Thr Leu Gly
SEQ ID NO:15 (33 nt, DNA, bacteria): ctggtgacgg gcgctgcaag cactctgggg gcg
SEQ ID NO:16 (33 nt, DNA, bacteria): gaccactgcc cgcgacgttc gtgagacccc cgc
SEQ ID NO:17 (7 aa, PRT, bacteria): Leu Val Ser Arg Arg Gly Met
SEQ ID NO:18 (7 aa, PRT, bacteria): Leu Val Ala Ala Ala Gly Met
SEQ ID NO:19 (47 nt, DNA, bacteria): gcggcatctg ctgctggtgg cagcggcagg catggccgcc gccggtg
SEQ ID NO:20 (47 nt, DNA, bacteria): cgccgtagac gacgaccacc gtcgccgtcc gtaccggcgg cggccac
SEQ ID NO:21 (7 aa, PRT, bacteria): His Thr Ala Gly Val Leu Asp
SEQ ID NO:22 (7 aa, PRT, bacteria): His Thr Pro Pro Leu Leu Asp
SEQ ID NO:23 (46 nt, DNA, bacteria): gaccgctgtg gtgcacacgc cacctctcct ggacgacgcc accgtg
SEQ ID NO:24 (46 nt, DNA, bacteria): ctggcgacac cacgtgtgcg gtggagagga cctgctgcgg tggcac
SEQ ID NO:25 (5 aa, PRT, bacteria): Gly Ala Lys Val Asp
SEQ ID NO:26 (5 aa, PRT, bacteria): Gly Ala Ala Val Asp
SEQ ID NO:27 (39 nt, DNA, bacteria): gatgcggtgc tcggggcggc tgtggacggt gccctgcac
SEQ ID NO:28 (39 nt, DNA, bacteria): ctacgccacg agccccgccg acacctgcca cgggacgtg
SEQ ID NO:29 (7 aa, PRT, bacteria): Val Leu Phe Ser Ser Ala Ala
SEQ ID NO:30 (7 aa, PRT, bacteria): Val Leu Phe Ala Ala Ala Ala
SEQ ID NO:31 (41 nt, DNA, bacteria): gtcggcgttc gtgctgttcg cagcggccgc cggggtcctg g
SEQ ID NO:32 (41 nt, DNA, bacteria): cagccgcaag cacgacaagc gtcgccggcg gccccaggac c
SEQ ID NO:33 (23 nt, DNA, bacteria): tactgcgcca cacggagccc gag
SEQ ID NO:34 (20 nt, DNA, bacteria): tgggtaacgc cagggttttc
SEQ ID NO:35 (24 nt, DNA, bacteria): ggaaacagct atgacatgat tacg
SEQ ID NO:36 (20 nt, DNA, bacteria): tcggagccgc tccacctgag
SEQ ID NO:37 (20 nt, DNA, bacteria): cctgatggac gcgggtgcgc
SEQ ID NO:38 (16 nt, DNA, bacteria): gacaccgaaa cccctg
SEQ ID NO:39 (20 nt, DNA, bacteria): cctgatggac gcgggtgcgc
SEQ ID NO:40 (23 nt, DNA, bacteria): gccgtgtgca ccacagcggt cag
SEQ ID NO:41 (28 nt, DNA, bacteria): gtgtgatgtc gccgaccgcg cccaggtc
SEQ ID NO:42 (22 nt, DNA, bacteria): gcgctggtgg gccagggcgt cc

