Patent application title: CLONING GENES FROM STREPTOMYCES CYANEOGRISEUS SUBSP. NONCYANOGENUS FOR BIOSYNTHESIS OF ANTIBIOTICS AND METHODS OF USE
Inventors:
Chengjin Huang (Fort Dodge, IA, US)
Deborah T. Chaleff (Pennington, NJ, US)
Mark E. Ruppen (Garnerville, NY, US)
Jerome Stephens (Mentone, AL, US)
Assignees:
Wyeth
IPC8 Class: AC07K14195FI
USPC Class:
530350
Class name: Chemistry: natural resins or derivatives; peptides or proteins; lignins or reaction products thereof proteins, i.e., more than 100 amino acid residues
Publication date: 2009-07-09
Patent application number: 20090176969
Agents:
WYETH;PATENT LAW GROUP
Origin: MADISON, NJ US
Abstract:
The present invention relates to the complete biosynthetic pathway for the
formation of the LL-F28249 compounds and, most importantly, the major
component LL-F28249α. The purified and isolated nucleic acid
molecule encoding the proteins of the biosynthetic pathway, which is
isolated from a wild-type or mutant Streptomyces, is fully described in
FIG. 6 to FIG. 6-39 and SEQ ID NO:1. The DNA gene cluster and its
expression in a suitable host enable the efficient production of the
highly active natural metabolites and semisynthetic derivatives. The
invention further concerns plasmids, vectors and host cells that contain
and express the novel nucleic acid molecule. Of particular interest, the
entire biosynthetic pathway fits compactly in three plasmids, Cos11,
Cos36 and Cos40. The invention also concerns the purified and isolated
biosynthesis proteins that are encoded by the whole DNA gene cluster.
Additionally, the invention involves a new efficient, biochemical method
of preparing moxidectin.
Claims:
1. A purified and isolated nucleic acid molecule encoding at least one
protein of the biosynthetic pathway for producing an LL-F28249 compound,
wherein said nucleic acid molecule is isolated from an
antibiotic-producing wild-type or mutant Streptomyces.
2. The nucleic acid molecule according to claim 1, wherein the nucleic acid molecule is isolated from an antibiotic-producing wild-type or mutant Streptomyces cyaneogriseus subsp. noncyanogenus.
3. The nucleic acid molecule according to claim 1, wherein the LL-F28249 compound is LL-F28249α.
4. The nucleic acid molecule according to claim 1, wherein the molecule has the nucleotide sequence set forth in SEQ ID NO:1 or its complementary strand.
5. A nucleic acid sequence which hybridizes to the sequence of the nucleic acid molecule of claim 4 and encodes a protein of the biosynthetic pathway for producing an LL-F28249 compound.
6. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 7697-10465 of SEQ ID NO:1.
7. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 10791-11570 of SEQ ID NO:1.
8. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 11659-12462 of SEQ ID NO:1.
9. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 12850-19875 of SEQ ID NO:1.
10. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 19865-31036 of SEQ ID NO:1.
11. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 31115-49246 of SEQ ID NO:1.
12. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 50449-51303 of SEQ ID NO:1.
13. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 51300-52706 of SEQ ID NO:1.
14. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 52809-69833 of SEQ ID NO:1.
15. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 69929-85429 of SEQ ID NO:1.
16. The nucleic acid molecule of claim 4, wherein the molecule comprises nucleotides 85574-86338 of SEQ ID NO:1.
17. A biologically functional plasmid or vector containing the nucleic acid molecule according to claim 1.
18. The plasmid or vector according to claim 17, wherein the plasmid or vector comprises Cos11 having ATCC Designation Number PTA-4392, Cos36 having ATCC Designation Number PTA-4393 or Cos40 having ATCC Designation Number PTA-4394.
19. A suitable host cell stably transformed or transfected by the plasmid or vector according to claim 17.
20. The host cell according to claim 19, wherein the host is Escherichia, Actinomycetales, Bacillus, Corynebacteria or Thermoactinomyces.
21. The host cell according to claim 20, wherein the host is Escherichia coli, Streptomyces lividans, Streptomyces coelicolor, Streptomyces griseofuscus or Streptomyces ambofaciens.
22. A biosynthesis protein encoded by the nucleic acid molecule according to claim 1.
23. The protein according to claim 22, wherein the amino acid sequence is set forth in any one of SEQ ID NO:2 to SEQ ID NO:12, or a biologically active variant thereof.
24. A process for the production of a protein involved in the biosynthesis of an LL-F28249 compound, said process comprising: growing, under suitable nutrient conditions, a prokaryotic or eukaryotic host cell transformed or transfected with a nucleic acid molecule according to claim 1 in a manner allowing expression of the protein product, and isolating the desired protein product of the expression of the nucleic acid molecule.
25. A protein product of the expression of the nucleic acid molecule in a prokaryotic or eukaryotic host cell according to claim 24.
26. A plasmid or a combination of two or three plasmids for cloning the nucleic acid molecule which encodes the proteins of the biosynthetic pathway of an LL-F28249 compound, wherein said plasmid or combination contains the nucleic acid molecule that spans the entire biosynthetic gene cluster and encodes type I polyketide synthase that is responsible for producing the LL-F28249 compound.
27. The combination according to claim 26, which comprises Cos11 having ATCC Designation Number PTA-4392, Cos36 having ATCC Designation Number PTA-4393 and Cos40 having ATCC Designation Number PTA-4394.
28-35. (canceled)
Description:
CROSS-REFERENCE TO RELATED U.S. APPLICATIONS
[0001]This nonprovisional application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 60/471,256, filed on May 16, 2003. The prior application is incorporated herein by reference in its entirety.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
[0002]Not Applicable
REFERENCE TO A "SEQUENCE LISTING"
[0003]The material on a single compact disc containing a Sequence Listing file provided in this application is incorporated by reference. The date of creation is ______ and the size is approximately ______.
BACKGROUND OF THE INVENTION
[0004]1. Field of the Invention
[0005]The present invention concerns the novel biosynthetic genes for encoding the proteins responsible for producing the LL-F28249 compounds and the use thereof to make the active metabolites from the fermentation of Streptomyces cyaneogriseus subsp. noncyanogenus. The invention further concerns the genetic manipulation of the biosynthetic pathway to make active semisynthetic derivatives of the natural metabolites.
[0006]2. Description of the Related Art
[0007]All patents and publications cited in this specification are hereby incorporated by reference in their entirety.
[0008]Streptomyces are producers of a wide variety of commercially important secondary metabolites, including the majority of active antibiotics known as the β-lactams and the macrocyclic lactone compounds or macrolides. Because of the commercial importance of the secondary metabolites produced by Streptomyces, there has been considerable recent investment in the development of methods for molecular genetic manipulation of Streptomyces. Procedures have been developed for the introduction of genetic material by polyethylene glycol mediated transformation and by conjugal transfer from Escherichia coli. Vectors have been developed including high and low copy number vectors, integrative vectors, and E. coli-Streptomyces shuttle vectors. These methods for molecular genetic manipulation of Streptomyces have been summarized in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985). In many cases, the genes for the production of secondary metabolites are clustered in Streptomyces. Thus, the identification of a single gene in a biosynthetic gene cluster may lead to the identification of all of the genes responsible for the biosynthesis of the metabolite. This observation has proven to be tremendously valuable, and secondary metabolite biosynthetic gene clusters have been cloned by reverse genetics, complementation of blocked mutants, resistance and use of heterologous probes. Using these methods, nucleotide and predicted amino acid sequence data have been obtained for many macrolide biosynthetic gene clusters including those directing the synthesis of erythromycin (see S. Donadio et al., Science 252:675-679 (1991) and S. F. Haydock et al., Molecular and General Genetics 230:120-128 (1991)); rapamycin (see T. Schwecke et al., Proceedings of the National Academy of Sciences USA 92:7839-7843 (1995) and X. Ruan et al., Gene 203:1-9 (1997)); FK506 (H. Motamedi and A. Shafiee, European Journal of Biochemistry 256:528-534 (1998)); oleandomycin (D. G. Swan et al., Molecular and General Genetics 242:358-362 (1994)) and rifamycin (see P. R. August et al., Chemistry & Biology 5:69-79 (1998)). However, the complete biosynthetic gene cluster for the macrocyclic lactone compounds known as the LL-F28249 compounds has not yet been described in the art.
[0009]There are many reports that molecular genetic manipulations can be used to alter the course of polyketide biosynthesis (see S. Donadio et al., Science 252:675-679 (1991) and S. Donadio et al., Proceedings of the National Academy of Sciences USA 90: 7119-7123 (1993)). In those studies, erythromycin-related lactones were produced following manipulation of the 6-deoxyerythronolide B synthase ("DEBS") gene cluster (the core polyketide synthase gene cluster responsible for erythromycin biosynthesis) such that either the module 4 enoylreductase or the module 5 ketoreductase domains were nonfunctional. Strains containing these variant DEBS gene clusters produced the expected erythromycin-related lactones. These pioneering studies have since been repeated and expanded upon, and the results of many such studies have been reviewed in the literature (see, for example, L. Katz and S. Donadio, Annual Reviews of Microbiology 47:875-912 (1993); C. R. Hutchinson and I. Fujii, Annual Reviews of Microbiology 49:201-238 (1995); D. A. Hopwood, Chemical Reviews 97:2465-2497 (1997); and C. W. Carreras and D. V. Santi, Current Opinions in Biotechnology 9: 403-411 (1998)).
[0010]Data summarized in the literature suggest that the organization of catalytic domains in type I polyketide synthase ("PKS") modules is conserved, and many highly conserved amino acid sequence motifs have also been described in those biosynthetic gene clusters. For example, the organization of the biosynthetic gene cluster of avermectin, which is produced by S. avermitilis, has been reported (see D. J. MacNeil et al., Gene 115:119-125 (1992) and D. J. MacNeil et al., Annals of the New York Academy of Sciences 721:123-132 (1994)); and partial nucleotide sequences of that biosynthetic gene cluster have been reported or are otherwise available. MacNeil and colleagues have also predicted the modular organization and reported a limited restriction endonuclease map of the wild-type S. cyaneogriseus (NRRL 15773) nemadectin biosynthetic gene cluster (see D. J. MacNeil et al., Annals of the New York Academy of Sciences 721:123-132 (1994)), but their restriction map was incomplete. Their analysis only indicated the presence of nine modular repeats of PKS function and required six overlapping clones to define the 75 kb region of the S. cyaneogriseus genome. MacNeil et al. did not complete the DNA sequencing of the whole biosynthetic gene cluster. Instead, the authors sequenced only the ends of selected cosmids. From the limited sequence information, they could only generate a very sketchy restriction endonuclease map. Further C-13 labeling studies have been conducted, and a mechanism for synthesis of the LL-F28249α compound from its constituent acyl units has been proposed (H. R. Tsou et al., Journal of Antibiotics (Tokyo) 42:398-406 (1989)).
[0011]The highly active LL-F28249 compounds, which are natural endectocidal agents widely used for treatment of nematode and arthropod parasites, including the control or prevention of helmintic, arthropod ectoparasitic and acaridal infections, are isolated from the fermentation broth of Streptomyces cyaneogriseus subsp. noncyanogenus (hereinafter referred to as "S. cyaneogriseus"). The series of anti-parasitic LL-F28249 compounds produced from S. cyaneogriseus are structurally similar to, but patentably distinct from, the well-characterized avermectins. U.S. Pat. No. 5,106,994 and its continuation U.S. Pat. No. 5,169,956 describe the preparation of the major and minor components, LL-F28249α-λ. The LL-F28249 family of compounds further includes, but is not limited to, the semisynthetic 23-oxo derivatives and 23-imino derivatives of LL-F28249α-λ, which are shown in U.S. Pat. No. 4,916,154. Moxidectin, chemically known as 23-(O-methyloxime)-LL-F28249α, is a particularly potent 23-imino derivative. Other examples of LL-F28249 derivatives include, but are not limited to, 23-(O-methyloxime)-5-(phenoxyacetoxy)-LL-F28249α, 23-(semicarbazone)-LL-F28249α and 23-(thiosemicarbazone)-LL-F28249α.
[0012]One of the major nemadectin metabolites, LL-F28249α (hereinafter referred to as "Fα"), is converted to the commercially important compound moxidectin using a four-step chemical process. The determination of the biosynthetic gene cluster of Fα, heretofore unknown, would be of great commercial significance. Not only would isolation of the gene be highly desirable to make the active Fα compound and other natural members of the LL-F28249 family of compounds, but also to prepare the commercially potent semisynthetic derivatives such as moxidectin more quickly and efficiently.
[0013]It is therefore an important object of the present invention to isolate and characterize the entire nucleotide sequence encoding the proteins responsible for producing the LL-F28249 compounds, preferably the LL-F28249α metabolite, and then to isolate and determine the function of the amino acid sequences comprising the biosynthesis proteins.
[0014]Another object is to provide a new process for isolating natural and semisynthetic derivatives directly from the fermentation broth of bioengineered strains of Streptomyces cyaneogriseus subsp. noncyanogenus.
[0015]A further object is to provide a new method for the preparation of moxidectin in an efficient process with fewer steps than heretofore achievable.
[0016]Further purposes and objects of the present invention will appear as the specification proceeds.
[0017]The foregoing objects are accomplished by providing a new, purified and isolated nucleic acid molecule that encodes the proteins connected with the entire biosynthetic pathway for producing the LL-F28249 compounds.
BRIEF SUMMARY OF THE INVENTION
[0018]The present invention concerns the unique cloning and characterization of the complete biosynthetic pathway for the formation of the LL-F28249 compounds and, most importantly, the highly active, major component LL-F28249α. The full DNA gene cluster and its expression in a suitable host enable the efficient production of the highly active natural metabolites and semisynthetic derivatives. Remarkably, the whole biosynthetic pathway is efficiently contained in only three plasmids identified as Cosmid Numbers 11, 36 and 40 (hereinafter referred to as "Cos11," "Cos36" and "Cos40," respectively).
BRIEF DESCRIPTION OF THE DRAWINGS
[0019]The background of the invention and its departure from the art will be further described hereinbelow with reference to the accompanying drawings, wherein:
[0020]FIG. 1 illustrates the construction of the biosynthetic gene cluster for making the LL-F28249 compounds via the gene segments contained within cosmids made according to the present invention. S. cyaneogriseus cosmid libraries are constructed by ligating Sau3A fragments of S. cyaneogriseus genomic DNA into the BamH1 site of cosmid vector pSuperCos 1. The resultant cosmid libraries are transformed into E. coli VCS257. Various cosmids are identified by a hybridization technique using the avermectin ketoacyl synthase probe or by a "walking" technique as described herein. The cosmids are characterized by restriction endonuclease mapping and DNA sequencing. The BamH1 restriction map of the Fα gene cluster is obtained from analyzing overlapping cosmids and confirmed by DNA sequencing. B denotes a BamH1 site.
[0021]FIG. 2 illustrates the biosynthesis proteins and their positions encoded by the cloned biosynthetic gene cluster for making the LL-F28249 compounds. A contiguous nucleotide sequence of approximately 88 Kbp containing the entire Fα polyketide synthase gene cluster is obtained by sequencing overlapping cosmids and the subclones thereof. The 13 modules and respective domains are identified using BLAST alignment analysis. Other biosynthetic genes are identified in the same way. The following abbreviations are used in the figure: ACP, acyl carrier protein; DH, dehydratase; ER, enoylreductase; KR, ketoreductase; KS, ketoacyl synthase; LD, loading domain; TE, thioesterase; MT, methyl transferase; AT, acyl transferase.
[0022]FIG. 3 shows the structure of the components of the vector designated pKR0.9, which is the 900 bp BstEII-AatII fragment of pNE57 (and contains the desired region of the Fα module 3 ketoreductase domain), in the BstEII-AatII sites of pSL301 (Invitrogen, Carlsbad, Calif.). The following abbreviations are used in the figure: mod3 KR, Fα module 3 ketoreductase domain; amp, the ampicillin resistance marker.
[0023]FIG. 4 shows the structure of the plasmid components of the pFDmod3/5.2 series. These plasmids are constructed to combine the site-directed mutations of the Fα module 3 ketoreductase domain with flanking DNA to facilitate homologous integration. The backbone vector is E. coli-Streptomyces shuttle vector pKC1132. The following abbreviations are used in the figure: mod3 KS, module 3 ketoacyl synthase domain; mod3 AT, module 3 acyl transferase; mod3 DH, module 3 dehydratase; mod3 ER, module 3 enoylreductase; mod3 KR, module 3 ketoreductase domain; apra, apramycin resistance marker.
[0024]FIG. 5 shows the structure of the plasmid components of the pFDmod3/4.2 series. These plasmids are derived from the pFDmod3/5.2 series by removing approximately 1 Kbp of flanking DNA to minimize aberrant integration. The following abbreviations are used in the figure: mod3 AT, module 3 acyl transferase; mod3 DH, module 3 dehydratase; mod3 ER, module 3 enoylreductase; mod3 KR, module 3 ketoreductase domain; apra, apramycin resistance marker.
[0025]FIG. 6 to FIG. 6-39 show the full-length nucleotide sequence (88400 bp) of the biosynthetic genes for making the LL-F28249 compounds (which corresponds to SEQ ID NO:1).
[0026]FIG. 7 represents the putative amino acid sequence (922 aa) of the regulatory protein encoded by the ORF1 gene (which corresponds to SEQ ID NO:2).
[0027]FIG. 8 represents the putative amino acid sequence (259 aa) of the thioesterase protein encoded by the ORF2 gene (which corresponds to SEQ ID NO:3).
[0028]FIG. 9 represents the putative amino acid sequence (267 aa) of the reductase protein encoded by the ORF3 gene (which corresponds to SEQ ID NO:4).
[0029]FIG. 10 to FIG. 10-1 represent the putative amino acid sequence (2341 aa) of the loading domain protein for Mod1 encoded by the ORF4 gene (which corresponds to SEQ ID NO:5).
[0030]FIG. 11 to FIG. 11-2 represent the putative amino acid sequence (3723 aa) of the loading domain protein for Mod2-Mod3 encoded by the ORF5 gene (which corresponds to SEQ ID NO:6).
[0031]FIG. 12 to FIG. 12-3 represent the putative amino acid sequence (6043 aa) of the loading domain protein for Mod4-Mod7 encoded by the ORF6 gene (which corresponds to SEQ ID NO:7).
[0032]FIG. 13 represents the putative amino acid sequence (284 aa) of the methyltransferase protein encoded by the ORF7 gene (which corresponds to SEQ ID NO:8).
[0033]FIG. 14 represents the putative amino acid sequence (468 aa) of the p450 protein encoded by the ORF8 gene (which corresponds to SEQ ID NO:9).
[0034]FIG. 15 to FIG. 15-3 represent the putative amino acid sequence (5674 aa) of the loading domain protein for Mod8-Mod10 encoded by the ORF9 gene (which corresponds to SEQ ID NO:10).
[0035]FIG. 16 to FIG. 16-3 represent the putative amino acid sequence (5166 aa) of the loading domain protein for Mod11-Mod13 encoded by the ORF10 gene (which corresponds to SEQ ID NO:11).
[0036]FIG. 17 represents the putative amino acid sequence (254 aa) of the oxidoreductase protein encoded by the ORF11 gene (which corresponds to SEQ ID NO:12).
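The protein lengths stated for FIG. 7 to FIG. 17 can be cross-checked against the nucleotide ranges recited in claims 6 to 16. The following is a minimal sketch of that arithmetic, not part of the application itself. It assumes that claims 6 to 16 correspond, in order, to ORF1 through ORF11, that the ranges are 1-based and inclusive on SEQ ID NO:1, and that each range ends with a stop codon. The names ORFS, protein_length and check_all are hypothetical.

```python
# Coordinates are taken from claims 6-16; amino acid counts from FIG. 7-17.
# ASSUMPTION: claims 6-16 map in order to ORF1-ORF11, and each inclusive
# range ends with one stop codon that is not counted as an amino acid.
ORFS = {
    "ORF1": (7697, 10465, 922),
    "ORF2": (10791, 11570, 259),
    "ORF3": (11659, 12462, 267),
    "ORF4": (12850, 19875, 2341),
    "ORF5": (19865, 31036, 3723),
    "ORF6": (31115, 49246, 6043),
    "ORF7": (50449, 51303, 284),
    "ORF8": (51300, 52706, 468),
    "ORF9": (52809, 69833, 5674),
    "ORF10": (69929, 85429, 5166),
    "ORF11": (85574, 86338, 254),
}


def protein_length(start, end):
    """Amino acids encoded by an inclusive range that includes a stop codon."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return span // 3 - 1  # subtract the terminal stop codon


def check_all(orfs=ORFS):
    """Map each ORF name to True when the range matches the stated length."""
    return {name: protein_length(s, e) == aa for name, (s, e, aa) in orfs.items()}


if __name__ == "__main__":
    for name, ok in check_all().items():
        print(name, "consistent" if ok else "MISMATCH")
```

Under these assumptions, every range agrees with the length given in the corresponding figure description.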
DETAILED DESCRIPTION OF THE INVENTION
[0037]In accordance with the present invention, there is provided a novel, purified and isolated nucleic acid molecule encoding the proteins of the entire biosynthetic pathway for producing the LL-F28249 compounds. The nucleic acid molecule of this invention is isolated from an antibiotic-producing wild-type or mutant Streptomyces. Surprisingly, the complete DNA for encoding all of the essential biosynthetic proteins is efficiently packaged in only three cosmids. These three cosmids, Cos11, Cos36 and Cos40, which have been constructed to contain the nucleic acid molecule according to the invention, are sufficient to regenerate the entire biosynthetic pathway for producing the LL-F28249 compounds. Thus, the present invention uniquely provides the entire biosynthetic gene cluster in three cosmids, as a preferred embodiment, which enables a substantially more efficient means for making the active anti-parasitic LL-F28249 compounds, particularly moxidectin, in fewer steps than previously contemplated. The success of this invention has overcome the prior failed attempts by others to isolate the full biosynthetic gene and satisfies a long-standing need.
[0038]The nucleotide sequence of this complete DNA gene cluster is fully described in FIG. 6 to FIG. 6-39 (which corresponds to SEQ ID NO:1). The scope of the invention also embraces its complementary strand, that is, those nucleotides that are the complement nucleotides (for example, A substituted for T, C substituted for G and vice versa) and/or reverse nucleotide sequences (i.e., a descending order instead of the forward or ascending strand, for example, changing the direction from reading 5' to the 3' end to reading 3' to the 5' end).
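The complement and reverse operations described above can be sketched in a few lines. This is an illustrative example only; the function names are hypothetical and the short test sequence is not taken from SEQ ID NO:1.

```python
# Base-pairing table: A pairs with T and C pairs with G (both cases handled).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq):
    """Swap each base for its pairing partner, keeping the original order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Return the complementary strand written in its own 5' to 3' direction."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    example = "ATGC"  # illustrative sequence only
    print(complement(example))
    print(reverse_complement(example))
```

Applying reverse_complement twice returns the original sequence, which is a convenient sanity check when handling either strand of a cloned insert.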
[0039]The present invention further includes the nucleic acid sequence that hybridizes to the sequence of the nucleic acid molecule of SEQ ID NO:1 isolated from the microbial source or its complementary strand and encodes a protein of the biosynthetic pathway for producing the LL-F28249 compounds. Typical hybridization procedures and conditions, which are well known to those of ordinary skill in the art, are illustrated in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). While standard or stringent conditions are employed for homologous probes, less stringent hybridization conditions may be used for partially homologous probes that have less than 100% homology with the target nucleic acid sequence. In the latter case of partially homologous probes, a series of Southern and Northern hybridizations may be readily carried out at different stringencies. For instance, when hybridization is carried out in formamide-containing solvents, preferred conditions hold the temperature and ionic strength constant at about 42° C. and 6×SSC, with 50% formamide. Less stringent hybridization conditions may use the same temperature and ionic strength but progressively lower amounts of formamide in the annealing buffer, in a range of about 45% to 0%. Alternatively, hybridization may be carried out in aqueous solutions containing no formamide. Usually for aqueous hybridization, the ionic strength of the solution is kept the same, often at about 1 M Na+, while the temperature of annealing may be lowered from about 68° C. to 42° C.
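The trade-off between formamide concentration and annealing temperature described above can be illustrated with a commonly cited empirical approximation for the melting temperature of long DNA duplexes. The formula and its coefficients are not taken from this application; they are an assumption introduced here for illustration, as are the 70% G+C content, the 500 bp probe length and the function names. The approximation is usually quoted for salt concentrations below those discussed here, so only the direction of the trend should be read from it, not absolute values.

```python
import math


def estimate_tm(gc_percent, na_molar, formamide_percent, length_bp):
    """Rough duplex melting temperature in degrees C.

    ASSUMPTION: empirical approximation with textbook coefficients;
    it is not a formula given in the application.
    """
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length_bp
    )


def stringency_margin(hyb_temp_c, gc_percent, na_molar, formamide_percent, length_bp):
    """Degrees below the estimated Tm; a smaller margin means higher stringency."""
    tm = estimate_tm(gc_percent, na_molar, formamide_percent, length_bp)
    return tm - hyb_temp_c


if __name__ == "__main__":
    # Hypothetical probe: 70% G+C, 500 bp, about 1 M Na+, annealed at 42 C.
    for formamide in (50, 45, 30, 15, 0):
        margin = stringency_margin(42.0, 70.0, 1.0, formamide, 500)
        print(f"{formamide:>2}% formamide: margin {margin:.1f} C")
```

In this sketch, lowering the formamide content at a fixed 42° C. widens the margin below the estimated melting temperature, which corresponds to the less stringent conditions suggested for partially homologous probes.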
[0040]In general, the isolation and characterization of the genomic DNA and the cloned, recombinant DNA from suitable host cells may be done via standard or stringent hybridization techniques, utilizing all or a portion of a nucleotide sequence as a probe to screen an appropriate library. As an alternative approach, oligonucleotide primers, which are constructed on the basis of other related, known DNA and protein sequences, can be used in polymerase chain reactions to amplify and identify other identical or related sequences. The nucleotides and proteins described herein are isolated and purified by routine methods to varying degrees. Preferably, the proteins are obtained in substantially pure form but a lower range of about 80% to about 90% pure is acceptable. It is contemplated that the scope of the invention also includes the DNA and proteins that are made by chemical synthesis, which have the same or substantially the same structures as those derived directly from the antibiotic-producing wild-type or mutant Streptomyces and are confirmed by routine testing or standard assays to be involved in the biosynthetic pathway of the LL-F28249 compounds.
[0041]Additionally, the invention encompasses and fully describes the isolated biosynthesis proteins comprising the amino acid sequences that include, but are not limited to, the regulatory protein encoded by the ORF1 gene (which corresponds to SEQ ID NO:2), the thioesterase protein encoded by the ORF2 gene (which corresponds to SEQ ID NO:3), the reductase protein encoded by the ORF3 gene (which corresponds to SEQ ID NO:4), the loading domain protein for Mod1 encoded by the ORF4 gene (which corresponds to SEQ ID NO:5), the loading domain protein for Mod2-Mod3 encoded by the ORF5 gene (which corresponds to SEQ ID NO:6), the loading domain protein for Mod4-Mod7 encoded by the ORF6 gene (which corresponds to SEQ ID NO:7), the methyltransferase protein encoded by the ORF7 gene (which corresponds to SEQ ID NO:8), the p450 protein encoded by the ORF8 gene (which corresponds to SEQ ID NO:9), the loading domain protein for Mod8-Mod10 encoded by the ORF9 gene (which corresponds to SEQ ID NO:10), the loading domain protein for Mod11-Mod13 encoded by the ORF10 gene (which corresponds to SEQ ID NO:11) and the oxidoreductase protein encoded by the ORF11 gene (which corresponds to SEQ ID NO:12).
[0042]The open reading frames of the genomic DNA cluster, which encode the biosynthesis proteins, may be identified using a variety of art-recognized techniques. The techniques include, but are not limited to, computer analysis to locate known start and stop codons, putative reading frame locations based on codon frequencies, similarity alignments to expressed genes in other known Streptomyces strains and the like. In this fashion, the proteins of the invention are identified using the nucleotide sequence of the present invention and the open reading frames or the encoded proteins may then be isolated and purified or, alternatively, synthesized by chemical means. Expressible genetic constructs based on the open reading frames and appropriate promoters, initiators, terminators and the like may be designed and introduced into a suitable host cell to express the protein encoded by the open reading frame.
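The computer analysis for locating start and stop codons mentioned above can be sketched as a simple forward-strand scan. This is a minimal illustration and not the method actually used in the application. It assumes ATG, GTG and TTG as candidate start codons and TAA, TAG and TGA as stop codons, ignores codon-frequency and similarity evidence, and uses hypothetical names throughout. The reverse strand can be scanned by passing in the reverse complement of the sequence.

```python
# ASSUMPTION: candidate start and stop codons for a bacterial genome.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    Coordinates are 1-based and inclusive, and the end includes the stop
    codon. min_codons counts coding codons, excluding the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3, frame))
                start = None  # resume scanning after the stop codon
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CCATGAAATTTTAGCC"  # illustrative sequence only
    print(find_orfs(toy))
```

A real analysis of a high G+C Streptomyces sequence would combine such a scan with codon-usage statistics and alignments to known genes, as the paragraph above indicates.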
[0043]As used herein, the term "proteins" means the polypeptides, the enzymes and the like, as those terms are commonly used in the art, which are encoded by the nucleic acid molecule comprising the biosynthetic pathway for producing the LL-F28249 compounds. The proteins of the invention encompass amino acid chains of varying length, including full-length, wherein the amino acid residues are linked by covalent peptide bonds, as well as the biologically active variants thereof. The proteins may be natural, recombinant or synthetic. For example, the biosynthesis proteins may be made through conventional recombinant technology by inserting a nucleotide sequence that encodes the protein into an appropriate expression vector and expressing the protein in a suitable host cell or through standard chemical synthesis by the Merrifield solid-phase synthesis method described in Merrifield, J. Am. Chem. Soc. 85:2149-2154 (1963), in which the amino acids are individually and sequentially attached to an amino acid chain. Alternatively, modern equipment is commercially available from a variety of manufacturers such as Perkin-Elmer, Inc. (Wellesley, Mass.) for the automated synthesis of proteins.
[0044]The biologically active variants that are included within the scope of the present invention comprise, at a minimum, the biologically functional portion of the amino acid sequence encoded by the nucleic acid molecule of the invention. As used herein, the "biologically functional portion" is that part of the protein structure which still retains the active function of the protein, for example, that part of the regulatory protein molecule encoded by the ORF1 gene which has the same or substantially the same activity and/or binding properties, i.e., at least about 90%, and more preferably, about 95%, similarities or potencies. The biologically active variants of the proteins include active amino acid structures having deleted, substituted or added amino acid residues, naturally occurring alleles, etc. The biologically functional portion may be easily identified by subjecting the full-length protein to chemical or enzymatic digestion to prepare fragments and then testing those fragments in standard assays to analyze which part of the amino acid structure retains the same or substantially the same biological activity as the full-length protein.
[0045]The determination of the full biosynthesis gene cluster of Fα, heretofore unknown, is of great commercial significance. The isolation and complete description of the gene according to the present methods permit the enhanced production of the active Fα compound and other natural members of the LL-F28249 family of compounds. Furthermore, the information about the gene enables an improved method for preparing the commercially potent semisynthetic derivatives such as moxidectin more quickly and efficiently than the prior chemical process of manufacture. As a direct and beneficial consequence of the cloning and characterization of the novel Fα biosynthesis gene cluster, which is described herein, unique processes for the direct fermentative production of moxidectin and other important LL-F28249 derivatives using bioengineered strains of S. cyaneogriseus are now obtainable.
[0046]One advantage of the present invention is the ability to enhance the production of the highly active Fα from the fermentation broth of S. cyaneogriseus. Cos11 contains a putative transcription activator gene (ORF1) for the PKS cluster. Increasing the expression level of the activator can result in a higher yield of Fα. This is achieved by increasing the copy number of the gene or by enhancing the regulatory sequence elements for this gene according to known techniques (see, for example, Perez-Llarena et al., Journal of Bacteriology 179:2053-2059 (1997)).
[0047]Another benefit derived from obtaining the full biosynthetic gene cluster of the present invention is to enable the efficient fermentative production and manufacture of the natural and semisynthetic derivatives of the LL-F28249 family of compounds such as, for example, LL-F28249α, LL-F28249β, LL-F28249γ, 23-(O-methyloxime)-LL-F28249α (moxidectin), 23-(O-methyloxime)-5-(phenoxyacetoxy)-LL-F28249α, 23-(semicarbazone)-LL-F28249α, 23-(thiosemicarbazone)-LL-F28249α, etc. Through the identification of the biosynthesis genes encoding the proteins responsible for the production of the LL-F28249 compounds and, desirably, the Fα metabolite as the major product, additional cloning and mutagenesis of the pathway readily produces other metabolites as by-products of the fermentation process. The biosynthesis genes are particularly useful to minimize the number of chemical reaction steps in preparing other semisynthetic members of the family.
[0048]The highly preferred utility of this invention involves the preparation of the commercially important compound moxidectin in fewer steps than previously done via known chemical processes. Moxidectin is currently produced by a four-step chemical process from Fα, which is first obtained by fermentation of Streptomyces cyaneogriseus subsp. noncyanogenus. The conversion of the natural metabolite Fα to moxidectin involves the following chemical reactions: (1) protection of the 5-hydroxyl group; (2) oxidation of the 23-hydroxyl group to a keto function; (3) conversion of the 23-keto to 23-O-methyloxime group; and (4) deprotection of the 5-hydroxyl group. The efficient method of the present invention now permits the chemical conversion of 23-keto Fα to moxidectin to be accomplished in a single step.
[0049]By generating mutants of the biosynthesis gene cluster, the specific activity responsible for reduction of the keto function at position 23 of the LL-F28249 compound structure is eliminated and the chemical synthesis is reduced to one step. Surprisingly, the remainder of the modular polyketide synthase remains functional and recognizes the unnatural polyketide intermediate. The unique bioengineered strain is then capable of being used, cloned and re-used for the direct fermentative production of 23-keto Fα, further reducing the normal processing time.
[0050]In the below examples, selective mutagenesis illustrates how to modify Fα biosynthesis and to obtain the desired metabolites according to the present methods. Basically, mutants of the module 3 ketoreductase domain of the S. cyaneogriseus Fα biosynthetic gene cluster are generated by site-directed mutagenesis. These ketoreductase variants are designed by comparing the predicted amino acid sequence of the Fα module 3 ketoreductase domain to ketoreductase domains from a number of biologically active ketoreductase domains and several "cryptic" ketoreductase domains. The module 3 ketoreductase domain of the S. cyaneogriseus Fα biosynthetic gene cluster is then replaced with these variant domains by homologous recombination in order to alter Fα biosynthesis and obtain the desired metabolite.
[0051]Generally speaking, the site-directed mutagenesis introduces a small deletion or point mutation in the 23-keto (oxo) reductase gene (23-KR gene) to render the 23-ketoreductase domain nonfunctional while it retains the functions of other domains of the polyketide synthase. Mutations in the 23-KR gene are introduced by standard methods into a wild-type Streptomyces cyaneogriseus subsp. noncyanogenus strain or the mutant Fα production strain 142, resulting in the direct fermentative production of 23-keto (oxo) Fα. In addition, the whole Fα PKS gene cluster carrying mutations in the 23-KR gene may be introduced into a suitable host cell such as S. lividans, S. coelicolor, E. coli and the like to produce 23-keto Fα. The transformed host cells are used as the source of DNA for conjugal transfer to S. cyaneogriseus using methods described herein for the further fermentative production of 23-keto Fα.
[0052]The imino derivatives (23-oxime) of the 23-oxo compounds are then readily prepared by standard techniques such as procedures described by S. M. McElvain in The Characterization of Organic Compounds, published by Macmillan Company, New York, 1953, pages 204-205 and incorporated herein by reference. Typically, the 23-oxo compound is stirred in alcohol, such as methanol or ethanol, or dioxane in the presence of acetic acid and an excess of the amino derivatizing agent, such as hydroxylamine hydrochloride, O-methylhydroxylamine hydrochloride, semicarbazide hydrochloride and the like along with an equivalent amount of sodium acetate, at room temperature to about 50° C. The reaction is usually complete in several hours to several days at room temperature but can be readily accelerated by heating. This subsequent conversion to moxidectin via the 23-keto Fα compound is surprisingly and beneficially the only necessary chemical reaction to take place.
[0053]It is further contemplated that the genetic material contained within the three cosmids, Cos11, Cos36 and Cos40, may be reduced to fit into two plasmids or a single plasmid through genetic manipulations known to those of ordinary skill in the art. For example, the cloned Fα biosynthesis genes that are present in the Cos11, Cos36 and Cos40 prepared according to the methods of the present invention would be used to assemble the entire polyketide synthase (PKS) gene cluster on two plasmids or a single plasmid. The assembling can be achieved by use of cloning, PCR or synthetic genes, or a combination of any of these art-recognized techniques. The assembled Fα PKS gene cluster can be introduced into a suitable host cell such as S. lividans, S. coelicolor, E. coli and the like to produce Fα. Thereafter, the assembled PKS gene cluster can be used in a cell-free expression system such as, for example, a cell-free expression system described by Olsthoorn-Tieleman et al., Eur. J. Biochem. 268:3807-3815 (2001), to produce further amounts of Fα and related products.
[0054]Using the modular organization of the core LL-F28249α polyketide synthase and the functional domains within those modules, the biosynthesis gene cluster described herein is cloned and fully characterized. Generally, for the isolation of the biosynthetic genes, a cosmid library of S. cyaneogriseus genomic DNA is prepared in the commercially available vector pSuperCos (Stratagene, La Jolla, Calif.). This cosmid library is probed with fragments of DNA corresponding to the avermectin module 1 ketoacyl synthase, which has been amplified from S. avermitilis genomic DNA using the polymerase chain reaction. Subsequently, several regions of the Fα biosynthetic gene cluster, which have been amplified from previously characterized cosmids using the polymerase chain reaction, are used as probes to isolate additional cosmids. Using these methods, a series of cosmids are isolated that collectively span over 100 Kbp of genomic DNA. Complete restriction endonuclease mapping and thorough nucleotide sequence analysis identify the cosmids and result in a definitive, unambiguous contiguous nucleotide sequence spanning nearly 88 Kbp. Analysis of this nucleotide sequence reveals the presence of 13 complete modules of a modular polyketide synthase together with at least six additional genes involved in the biosynthesis or in the regulation of the biosynthesis of Fα.
[0055]The invention further embraces biologically functional plasmids or vectors containing the nucleic acid molecule of the present invention. The particular plasmids of the invention are selected for their ability to incorporate large DNA gene clusters but they are conventional and are derived from commonly available vectors, for example, pKR0.9, the pFDmod3/5.2 series, the pFDmod3/4.2 series and the like.
[0056]Although E. coli is used as the heterologous host in the examples, the heterologous expression of antibiotic biosynthetic genes is expected in a wide number of Actinomycetales, Bacillus, Corynebacteria, Thermoactinomyces and the like so long as they are capable of being transformed with the relatively large plasmid constructs described herein. Those that are transformed include, but are not limited to, Streptomyces lividans, Streptomyces coelicolor, Streptomyces griseofuscus and Streptomyces ambofaciens, which are known to be relatively non-restricting. Preferably, the suitable host cell that is stably transformed or transfected by the plasmid or vector is Streptomyces coelicolor or an Escherichia coli-Streptomyces cosmid vehicle. In vitro expression of the proteins may be performed, if desirable, using standard art methods.
[0057]The following section highlights general methods and materials, available to those of ordinary skill in the art, which have been used to successfully clone and characterize the entire, large biosynthetic pathway of the present invention.
General Methods and Materials
A. Materials, Plasmids and Bacterial Strains
[0058]The E. coli-Streptomyces shuttle vector pKC1132, which contains elements required for replication and selection in E. coli and in Streptomyces, including antibiotic resistance markers for selection with apramycin, is used throughout this work (see M. Bierman et al., Gene 116:43-49 (1992)). In addition to pKC1132, commercially available cloning vectors are used as indicated herein. Those of ordinary skill in the art will be able to select other well known cloning vectors, which can readily be substituted for the exemplified vectors, and to avoid or minimize, using standard techniques, the instability problems encountered with certain older strains of the cosmid-harboring E. coli.
[0059]Plasmid DNA is manipulated using procedures similar to those established by work on other plasmids. Typical procedures are presented in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). Typical procedures for Streptomyces are presented in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985). Specific methods used in this work are described herein unless they are identical to methods presented in the above-referenced laboratory manuals.
[0060]E. coli JM109 and DH5α, common laboratory strains used throughout this work, are readily available from a number of commercial sources (for example, Stratagene, La Jolla, Calif.). E. coli XL1-Blue MRF' strain is obtained from Stratagene (La Jolla, Calif.). E. coli ETS12567 (pUZ8002) is obtained from Professor Heinz Floss, of the Department of Chemistry, University of Washington (Seattle, Wash.). E. coli VCS257 is obtained from Stratagene (La Jolla, Calif.). S. avermitilis is obtained from the American Type Culture Collection under ATCC Deposit Accession No. 31,267 but it can also be obtained from the Agricultural Research Culture Collection (NRRL), 1815 N. University Street, Peoria, Ill. 61604, under NRRL 8165. "Wild-type" Streptomyces cyaneogriseus subsp. noncyanogenus LL-F28249 (NRRL 15773) and the mutant Fα production strain of S. cyaneogriseus designated "S. cyaneogriseus strain 142" are used separately throughout this written disclosure of the present invention but they are interchangeable and may substitute for each other in any given step of the disclosed process. Strain 142, which is derived from the wild-type strain, has undergone classic genetic manipulations to enhance antibiotic production but it retains the same polyketide synthase DNA sequence as the wild-type strain. Because their polyketide synthase sequences are identical, all of the plasmids described herein, including but not limited to Cos11, Cos36 and Cos40, can be derived from wild-type Streptomyces cyaneogriseus subsp. noncyanogenus or S. cyaneogriseus strain 142 with the same result.
B. Restriction Analysis of Plasmid DNA
[0061]Procedures for restriction analysis of plasmid DNA, procedures for agarose gel electrophoresis, and other standard techniques of recombinant DNA technology are described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). Plasmid DNA is digested with restriction endonucleases according to the manufacturer's procedures. Enzymes are obtained from New England Biolabs (Beverly, Mass.), Life Technologies (Rockville, Md.) or Promega (Madison, Wis.). Restriction digests are analyzed by electrophoresis in 0.8% w/v agarose using 40 mM tris-acetate, 1 mM EDTA as a buffer. The size of the fragments is determined by comparison to DNA fragments of known molecular weight (1 Kb ladder, Life Technologies, Rockville, Md.).
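The size determination by comparison to a ladder, as described above, is commonly carried out by interpolating on a semi-log plot of marker size against migration distance. The following is a minimal sketch of that calculation, not the procedure actually used in this work. The function name and the ladder sizes and migration distances are hypothetical illustration values, not measurements from this disclosure.

```python
import math

# Hypothetical ladder: (fragment size in bp, migration distance in mm).
# These numbers are illustrative only and are not taken from the disclosure.
HYPOTHETICAL_LADDER = [
    (10000, 10.0),
    (5000, 16.0),
    (2000, 24.0),
    (1000, 30.0),
    (500, 36.0),
]


def estimate_size_bp(distance_mm, ladder):
    """Estimate a fragment size by interpolating log10(size) against distance.

    Within the range of the ladder, log10(size) is treated as piecewise
    linear in migration distance between neighbouring marker bands.
    """
    points = sorted(ladder, key=lambda p: p[1])  # ascending distance
    if not points[0][1] <= distance_mm <= points[-1][1]:
        raise ValueError("migration distance lies outside the ladder range")
    for (size1, dist1), (size2, dist2) in zip(points, points[1:]):
        if dist1 <= distance_mm <= dist2:
            frac = (distance_mm - dist1) / (dist2 - dist1)
            log_size = math.log10(size1) + frac * (
                math.log10(size2) - math.log10(size1)
            )
            return 10 ** log_size


if __name__ == "__main__":
    # A band halfway between the 2000 bp and 1000 bp markers.
    print(round(estimate_size_bp(27.0, HYPOTHETICAL_LADDER)))
```

Interpolation of this kind is only reliable inside the range of the ladder, which is why the sketch raises an error for bands that migrate beyond the outermost markers.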
C. Preparation of Hybridization Probes
[0062]Hybridization probes are isolated from plasmids following restriction digestion or are generated using the polymerase chain reaction as described herein. Probes are radiolabeled to high specific radioactivity using EasyTides® α32 P-dCTP (3000 Ci/mmol) from New England Nuclear (Boston, Mass.) and the Rediprime® II random prime labeling system from Amersham Pharmacia Biotech (Piscataway, N.J.) according to procedures provided by the manufacturer.
[0063]Hybridization probes are used to identify cosmids containing the Fα biosynthetic gene cluster (from both S. cyaneogriseus strain 142 and wild-type S. cyaneogriseus cosmid libraries), to confirm and characterize transconjugants and excisants, and to facilitate the generation of accurate restriction maps of the Fα biosynthetic gene cluster that confirm the identity of the gene. These hybridization probes are either generated by PCR amplification or the probes are excised from clones as summarized in the following Table 1.
TABLE 1
Probe | PCR Primer Sequence or Restriction Sites | Use
--- | --- | ---
Avermectin KS1 | F: GCCGAATTCCTTCGGCATCAGCCCC; R: GCTCGCACCGTCCTGGTTGACCGC | To Isolate Cosmids Containing the Fα Biosynthetic Gene Cluster (S. cyaneogriseus strain 142)
NE5.7 | 5.7 Kbp NotI/EcoRI Fragment of Cos7 (Contains Fα Module 3) | To Isolate Cosmids Containing the Fα Biosynthetic Gene Cluster (wild-type S. cyaneogriseus)
Apramycin | 750 bp SacI Fragment of pKC1132 | To Confirm and Characterize Transconjugants
Mod3 | F: GACAACGTCGGTCCGG; R: CGCGGTGACTCGCTTGAGGTATTC | To Confirm and Characterize Transconjugants, and in Restriction Mapping
Thioesterase | F: GCTTCACCGACCCCTCGGCTATGACC; R: GTGAAGTGGTTGCCGTCGGTTTCGAGG | To Restriction Map the Right End of the Fα Biosynthetic Gene Cluster
p450 | F: GATGACGTGCTCACCGATGTCGGTGAGC; R: GACGTGGAAATCATGTACAGCTCGTACG | To Restriction Map the Right End of the Fα Biosynthetic Gene Cluster
Cos36 (end) | 500 bp NotI Fragment of Cos36 | To Restriction Map the Right End of the Fα Biosynthetic Gene Cluster
Cos12 (end) | 1.1 Kbp BamHI/EcoRI Fragment of Cos12 | To Restriction Map the Left End of the Fα Biosynthetic Gene Cluster
B5.5 | 5.5 Kbp BamHI Fragment of Cos11 | To Restriction Map the Left End of the Fα Biosynthetic Gene Cluster, and To Isolate Cosmids Containing the Fα Biosynthetic Gene Cluster (wild-type S. cyaneogriseus)
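Basic properties of the PCR primers listed in Table 1 can be checked with a few lines of code. The sketch below computes length, GC content and reverse complement for two of the primer pairs, using the sequences exactly as printed in Table 1. It assumes the sequences are written 5' to 3', which the table does not state, and the helper names are illustrative only. It also looks for the EcoRI recognition sequence GAATTC, which occurs in the Avermectin KS1 forward primer as printed.

```python
# Primer sequences copied from Table 1; 5'->3' orientation is an assumption.
PRIMERS = {
    "Avermectin KS1 F": "GCCGAATTCCTTCGGCATCAGCCCC",
    "Avermectin KS1 R": "GCTCGCACCGTCCTGGTTGACCGC",
    "Mod3 F": "GACAACGTCGGTCCGG",
    "Mod3 R": "CGCGGTGACTCGCTTGAGGTATTC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def gc_percent(seq):
    """Return the percentage of G and C bases in the sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(
            name,
            "length:", len(seq),
            "GC%:", round(gc_percent(seq), 2),
            "EcoRI site:", ECORI_SITE in seq,
            "revcomp:", reverse_complement(seq),
        )
```

The high GC content of these primers is in keeping with their Streptomyces targets; any annealing temperature derived from such figures would still need to be established empirically.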
Isolation, Maintenance and Propagation of Plasmids
A. Plasmid Isolation
[0064]E. coli strains, both untransformed and those transformed with vectors as described herein, are grown using well-established methods similar to those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).
[0065]Plasmid DNA is isolated from E. coli cultures using reagents and materials obtained from QIAGEN (Valencia, Calif.). Depending on the numbers of strains being analyzed, the miniprep plasmid isolation systems used included the QIAprep® Spin Miniprep Kits (for plasmid isolation from relatively small numbers of strains); the QIAprep® 8 Turbo Miniprep Kits (for higher-throughput plasmid isolation from somewhat larger numbers of strains); or the QIAprep® 96 Turbo Miniprep Kits (for partially automated isolation of plasmids from strains in 96-well blocks). For the isolation of larger quantities of plasmid DNA from E. coli, reagents and materials included in the QIAGEN Plasmid Midi (up to 100 μg) and Maxi (up to 500 μg) kits, or reagents and materials included in the Nucleobond AX-100 (up to 100 μg) kit from Clontech (Palo Alto, Calif.) are used.
B. Transformation of Escherichia coli by Plasmid DNA
[0066]Plasmid DNA is transformed into electrocompetent E. coli strains by electroporation or into chemically competent E. coli strains by heat shock using well-established procedures similar to those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). Transformants are selected using appropriate antibiotics, and after plasmids are isolated using methods described herein, they are characterized following digestion with restriction endonucleases, again using well-established methods described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).
C. Conjugal Transfer of Plasmid DNA from Escherichia coli to Streptomyces cyaneogriseus
[0067]In all cases, the plasmids of interest are first transformed into the E. coli strain designated ETS12567 (pUZ8002) by electroporation as described herein. This strain is chloramphenicol resistant (cmr), tetracycline resistant (tetr), dam⁻, and dcm-1. Additionally, pUZ8002, which is an oriT⁻ version of the plasmid pRK2 (see R. Meyer et al., Science 190:1226-1228 (1975)), confers kanamycin resistance (kanr). The transformed cells are maintained in the presence of appropriate antibiotic selection, including 5 μg/ml kanamycin and 100 μg/ml apramycin. The conjugal transfer of plasmid DNA from these E. coli transformants to S. cyaneogriseus is accomplished using the following procedures, both of which are modified from a procedure described by M. Bierman et al., Gene 116:43-49 (1992).
[0068]Conjugation Method #1: 3 ml of LB medium supplemented with 5 μg/ml kanamycin, 5 μg/ml chloramphenicol, 50 μg/ml apramycin is inoculated with a single well-isolated transformed E. coli colony, and the culture is incubated at 37° C., with shaking at 220 rpm, for 16 hours. 10 ml of TSB medium (27.5 g/L tryptic soy broth, 5 g/L yeast extract, 5 g/L KH2PO4, pH 7.0, 100 ml/L of a sterile solution of 20% (w/v) glucose added after autoclaving) is inoculated with 100 μl of a frozen stock of S. cyaneogriseus mycelial fragments, and the culture is incubated at 31° C., with shaking at 220 rpm, for 16 hours. The next day, 10 ml of LB medium supplemented with 50 μg/ml apramycin is inoculated with a 100 μl aliquot of the overnight E. coli culture. At the same time, a 2 ml aliquot of the S. cyaneogriseus overnight culture is vortexed in a tube containing sterile glass beads for 2 minutes. The suspension is sonicated (3×, 5 second bursts at 100% output); and 1 ml of this suspension of mycelial fragments is transferred to 9 ml of TSB (27.5 g/L tryptic soy broth, 5 g/L yeast extract, 5 g/L KH2PO4, pH 7.0, 100 ml/L of a sterile solution of 20% (w/v) glucose added after autoclaving). Both cultures are incubated at 37° C., with shaking at 220 rpm, until the absorbance at 600 nm of the E. coli culture reaches 0.4-0.6. The cells in each culture are collected by centrifugation, washed 2× with LB, and suspended in 500 μl of 2XYT (16 g/L tryptone, 10 g/L yeast extract, 5 g/L NaCl, pH 7.0). Aliquots (100 μl) of the two preparations are combined; the mixture is incubated at 50° C. for 5 minutes; and the cells are collected by centrifugation. The supernatant is removed, and the cell pellet is suspended in 100 ml of 2XYT (16 g/L tryptone, 10 g/L yeast extract, 5 g/L NaCl, pH 7.0), and plated onto SFM (25 g/L soybean flour nutrisoy, 25 g/L mannitol, 20 g/L agar, 0.462 g/L L-cysteine, 0.462 g/L L-arginine, 0.462 g/L L-proline) plates. These plates are incubated at 37° C. for 16 hours, and then overlaid with 1 ml of sterile water containing 0.5 mg of nalidixic acid and 1 mg of apramycin (final concentrations 20 μg/ml and 40 μg/ml, respectively). The plates are incubated at 37° C. until colonies are well established.
[0069]Conjugation Method #2: 3 ml of LB medium supplemented with 5 μg/ml kanamycin, 5 μg/ml chloramphenicol, 100 μg/ml apramycin is inoculated with a single well-isolated transformed E. coli colony, and the culture is incubated at 37° C., with shaking at 220 rpm, for 16 hours. 25 ml of KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8, and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O) is inoculated with 1 ml of a frozen stock of S. cyaneogriseus, and the culture is incubated at 31° C., with shaking at 220 rpm, for 16 hours. The next day, 1 ml of the overnight E. coli culture is combined with 9 ml of LB supplemented with 50 μg/ml apramycin. At the same time, a 5 ml aliquot of the S. cyaneogriseus overnight culture is vortexed in a tube containing sterile glass beads for 2 minutes. A 2.5 ml aliquot of the homogenized culture is inoculated into 25 ml of KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8, and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O), and both cultures are incubated at 37° C. for 3 hours. The cells in each culture are collected by centrifugation, and washed 2× with water. The E. coli and S. cyaneogriseus cell pellets are suspended in 1 ml and 2 ml, respectively, of TSB (27.5 g/L tryptic soy broth, 5 g/L yeast extract, 5 g/L KH2PO4, pH 7.0, 100 ml/L of a sterile solution of 20% (w/v) glucose added after autoclaving). 10 μl of the S. cyaneogriseus suspension and 100 μl of the E. coli suspension are combined with 890 μl of TSB (27.5 g/L tryptic soy broth, 5 g/L yeast extract, 5 g/L KH2PO4, pH 7.0, 100 ml/L of a sterile solution of 20% (w/v) glucose added after autoclaving), and 100 μl of the mixture is plated onto AS-1 plates (1 g/L yeast extract, 0.2 g/L L-alanine, 0.2 g/L L-arginine, 0.5 g/L L-asparagine, 5 g/L soluble starch, 2.5 g/L NaCl, 10 g/L Na2SO4, 20 g/L agar, pH 7.5) supplemented with 10 mM MgCl2. These plates are incubated at 37° C. for 16 hours, and then overlaid with 3 ml of R2 agar (100 g/L sucrose, 10 g/L glucose, 10 g/L MgCl2, 0.25 g/L K2SO4, 0.1 g/L casamino acids, 25 g/L agar). At use, the following solutions are added to each 80 ml flask of R2 agar: 1 ml of 0.5% K2HPO4; 8 ml of 3.68% CaCl2.2H2O; 1.5 ml of 20% L-proline; 10 ml of 5.73% TES, pH 7.2; 0.5 ml of 1N NaOH; and 1 ml of a trace elements solution containing 40 mg/L ZnCl2, 200 mg/L FeCl3.6H2O, 10 mg/L CuCl2.2H2O, 10 mg/L MnCl2.4H2O, 10 mg/L Na2B4O7.10H2O, 10 mg/L (NH4)6Mo7O24.4H2O. The solution is also supplemented to 100 μg/ml apramycin and 100 μg/ml nalidixic acid (final concentrations). The plates are incubated at 37° C. until colonies are well established.
[0070]Using either method, putative transconjugants are repetitively picked onto fresh plates, in the presence of 100 μg/ml apramycin and 100 μg/ml nalidixic acid until cured of visible contamination by the E. coli strain used as the source of the plasmid.
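The final antibiotic concentrations quoted for the overlay in Conjugation Method #1 (0.5 mg of nalidixic acid and 1 mg of apramycin giving 20 μg/ml and 40 μg/ml, respectively) can be checked with simple arithmetic. The sketch below back-calculates the plate volume implied by those figures and then recomputes both concentrations. The resulting volume of about 25 ml per plate is an inference from the stated numbers, not a value given in this disclosure, and the 1 ml overlay volume is neglected. The function names are illustrative.

```python
def implied_plate_volume_ml(mass_mg, final_conc_ug_per_ml):
    """Volume (ml) over which a mass must be diluted to reach a concentration."""
    return mass_mg * 1000.0 / final_conc_ug_per_ml


def final_conc_ug_per_ml(mass_mg, plate_volume_ml):
    """Final concentration (ug/ml) of a mass diluted throughout a plate."""
    return mass_mg * 1000.0 / plate_volume_ml


if __name__ == "__main__":
    # Inferred, not stated: volume implied by 0.5 mg nalidixic acid at 20 ug/ml.
    volume_ml = implied_plate_volume_ml(0.5, 20.0)
    print("implied plate volume (ml):", volume_ml)
    print("nalidixic acid (ug/ml):", final_conc_ug_per_ml(0.5, volume_ml))
    print("apramycin (ug/ml):", final_conc_ug_per_ml(1.0, volume_ml))
```

Both stated concentrations are consistent with the same implied plate volume, so the two figures agree with each other.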
[0071]The purified DNA derived from Streptomyces cyaneogriseus subsp. noncyanogenus, which encodes the entire biosynthetic pathway for the production of the LL-F28249 compounds, has been deposited in connection with the present patent application under the conditions mandated by 37 C.F.R. § 1.808 and maintained pursuant to the Budapest Treaty in the American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, U.S.A. More specifically, the purified cosmid DNA, described herein fully and identified as Cos11, Cos36 and Cos40, was deposited in the ATCC on May 24, 2002 and assigned ATCC Patent Deposit Designation Numbers PTA-4392, PTA-4393 and PTA-4394, respectively. It should be appreciated that related purified DNA, other cosmids or plasmids containing related nucleotide sequences, which may be readily constructed using site-directed mutagenesis and the techniques described herein, are also encompassed within the scope of the present invention.
[0072]The following examples demonstrate certain aspects of the present invention. However, it is to be understood that these examples are for illustration only and do not purport to be wholly definitive as to conditions and scope of this invention. It should be appreciated that when typical reaction conditions (e.g., temperature, reaction times, etc.) have been given, the conditions both above and below the specified ranges can also be used, though generally less conveniently. The examples are conducted at room temperature (about 23° C. to about 28° C.) and at atmospheric pressure. All parts and percents referred to herein are on a weight basis and all temperatures are expressed in degrees centigrade unless otherwise specified.
[0073]A further understanding of the invention may be obtained from the non-limiting examples that follow below.
Example 1
Characterization of the Biosynthetic Gene Cluster for Making LL-F28249 Compounds
A. Isolation and Characterization of Cosmids Containing the Fα Biosynthetic Gene Cluster
[0074]1. Construction of Streptomyces cyaneogriseus Cosmid Libraries
[0075]Genomic DNA was isolated from S. cyaneogriseus (both wild-type and the Fα production strain designated 142) using a method presented in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985) ("Isolation of Streptomyces 'Total' DNA: Procedure 3"). The S. cyaneogriseus genomic DNA preparation was subjected to partial restriction endonuclease digestion with Sau3AI as follows. A reaction mixture was prepared containing Sau3AI and genomic DNA, and at time points (0, 5, 10, 15, 20, 30, and 45 minutes) aliquots were removed and the reactions were quenched by the addition of EDTA to a final concentration of 10 mM. A portion of each quenched reaction time point was resolved by electrophoresis through 0.3% w/v agarose at 25 volts for 16 hours. The reaction time point containing DNA fragments that were predominantly between 23 Kbp and 50 Kbp was selected for the cosmid library. At the same time, pSuperCos 1 (Stratagene, La Jolla, Calif.) was digested with the restriction endonuclease XbaI; dephosphorylated using calf intestine alkaline phosphatase; and after ethanol precipitation, the linear vector was digested with the restriction endonuclease BamHI in order to remove one of the Cos sites. The Sau3AI fragments of S. cyaneogriseus genomic DNA were ligated into linearized, BamHI-treated pSuperCos 1 according to procedures provided by the manufacturer. The resultant recombinant cosmid DNA preparation was packaged using Gigapack® III XL Packaging Extract, and after lysis of the resultant lambda phage particles with chloroform, the cosmid DNA library was transformed into E. coli VCS257. These manipulations were all conducted using reagents, materials, and procedures provided by the manufacturer (Stratagene, La Jolla, Calif.).
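Sau3AI recognizes the four-base sequence GATC and cuts immediately 5' of the G, leaving a GATC overhang that is compatible with BamHI-cut vector ends. The sketch below locates Sau3AI sites and lists the fragments of a complete digest of a linear molecule. It is an illustration only: the input sequence is a made-up toy string, the function names are hypothetical, and a partial digest such as the one described above would cut only a subset of these sites.

```python
SAU3AI_SITE = "GATC"  # Sau3AI cuts immediately 5' of the G in GATC


def sau3ai_cut_positions(seq):
    """Return 0-based top-strand cut positions (the index of each GATC)."""
    seq = seq.upper()
    positions = []
    index = seq.find(SAU3AI_SITE)
    while index != -1:
        positions.append(index)
        index = seq.find(SAU3AI_SITE, index + 1)
    return positions


def complete_digest_fragments(seq):
    """Return top-strand fragments of a complete digest of a linear molecule."""
    seq = seq.upper()
    # A site at position 0 would produce an empty leading fragment; skip it.
    cuts = [c for c in sau3ai_cut_positions(seq) if c > 0]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "AAGATCCCGGATCTT"  # toy sequence, not from the gene cluster
    print(sau3ai_cut_positions(toy))
    print(complete_digest_fragments(toy))
```

Every internal fragment begins with GATC on the top strand, which reflects the cohesive end that allows ligation into the BamHI site of the cosmid vector.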
[0076]2. Isolation of Cosmids Containing the Fα Biosynthetic Gene Cluster
[0077]Genomic DNA was isolated from S. avermitilis using a method presented in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985) ("Isolation of Streptomyces 'Total' DNA: Procedure 3"). This genomic DNA preparation was used as a template for amplification of a region of the module 1 ketoacyl synthase domain of the avermectin biosynthetic gene cluster using the polymerase chain reaction. The oligonucleotide primers used were designed on the basis of nucleotide sequences of the avermectin biosynthetic gene cluster that have been deposited into public databases. Colony lifts of the S. cyaneogriseus strain 142 cosmid library were screened for hybridization to the avermectin ketoacyl synthase probe, and more than 30 cosmids potentially containing type I polyketide synthase DNA were isolated. Initially, these cosmids were analyzed following digestion with BamHI, by agarose gel electrophoresis, by Southern blot using the avermectin module 1 ketoacyl synthase probe, and by limited nucleotide sequence analysis. Comparison of these data to data reported by MacNeil and colleagues (see D. J. MacNeil et al., Gene 115:119-125 (1992) and D. J. MacNeil et al., Annals of the New York Academy of Sciences 721:123-132 (1994)) suggested that two of these cosmids (designated Cos7 and Cos11) appeared to span the majority of the Fα biosynthetic gene cluster. The limited data presented by MacNeil and his colleagues were also used as the initial basis to support the isolation of a 5.7 Kbp NotI-EcoRI fragment that included most of module 3. A clone of this 5.7 Kbp NotI-EcoRI fragment was prepared (designated pNE57). The nucleotide sequence of this 5.7 Kbp fragment was determined in its entirety. This fragment of the Fα biosynthetic gene cluster (from genomic DNA isolated from the Fα production strain) was then used as a probe to screen the wild-type S. cyaneogriseus cosmid library, and 45 cosmids potentially containing type I polyketide synthase DNA were isolated. These cosmids were extensively mapped with BamHI, NotI, and EcoRI using methods described herein, and on the basis of comparison of those restriction maps to the incomplete data presented by MacNeil and his colleagues, two cosmids (designated Cos36 and Cos40 from the wild-type strain) that appeared to span the majority of the Fα biosynthetic gene cluster were identified.
[0078]In order to identify cosmids spanning the "ends" of the Fα biosynthetic gene cluster, but not containing significant stretches of core polyketide synthase DNA, the following strategy was employed. A 5.5 Kbp BamHI fragment isolated from Cos11 (from S. cyaneogriseus strain 142) was used to reprobe the wild-type S. cyaneogriseus cosmids that had been selected previously in order to identify additional cosmids that would extend the cluster to the "left." A number of cosmids were identified that hybridized to the probe, and after restriction mapping, one of these, Cos14, was identified that would support extending the cluster the furthest to the left. A 500 bp NotI fragment isolated from the 3' end of Cos36 was used to reprobe the wild-type S. cyaneogriseus cosmid library in order to identify additional cosmids that would extend the cluster to the "right." A number of additional cosmids were identified that hybridized to the probe, and after restriction mapping, one of these, Cos50, was identified that would support extending the cluster the furthest to the "right."
[0079]3. Restriction Mapping Cosmids Containing the Fα Biosynthetic Gene Cluster
[0080]Initially, more than 30 cosmids from the S. cyaneogriseus strain 142 cosmid library that hybridized to the avermectin ketoacyl synthase probe, and 45 cosmids from the wild-type S. cyaneogriseus cosmid library that hybridized to the Fα module 3 probe (pNE57), were mapped following digestion with BamHI, NotI, and EcoRI. On the basis of this preliminary analysis, and on the basis of comparison of the restriction maps to the incomplete data presented by MacNeil and his colleagues (see D. J. MacNeil et al., Gene 115:119-125 (1992) and D. J. MacNeil et al., Annals of the New York Academy of Sciences 721:123-132 (1994)), several cosmids were selected for more comprehensive analysis. These cosmids (designated Cos7 and Cos11 from S. cyaneogriseus strain 142; and Cos12, Cos14, Cos36, Cos40 and Cos50 from wild-type S. cyaneogriseus) were carefully mapped following digestion with BamHI, NotI, and EcoRI and double-digestion with BamHI/MluI, NotI/EcoRI, BamHI/EcoRI, SacI/EcoRI, and NotI/MluI. To resolve ambiguity in the restriction maps that were observed, subclones of these cosmids were constructed as summarized in the following Table 2, and these subclones were extensively mapped as described above.
TABLE-US-00002 TABLE 2
Designation   Subcloned from:   Vector           Restriction Sites/Size
pB5.5         Cos11             pZeroBlunt       BamHI/5.5 Kbp
pB18.0        Cos11             pUC19            BamHI/18.0 Kbp
pBE15.0       Cos12             pBluescript KS   BamHI/EcoRI/15.0 Kbp
pB2.5         Cos14             pBluescript KS   BamHI/2.5 Kbp
pB5.5         Cos14             pZeroBlunt       BamHI/5.5 Kbp
pBB14.0       Cos14             pBluescript KS   BamHI/BglII/14.0 Kbp
pM14.0        Cos14             pLitmus38        MluI/14.0 Kbp
pN2.0         Cos14             pBluescript KS   NotI/2.0 Kbp
pN4.3         Cos14             pBluescript KS   NotI/4.3 Kbp
pS1.45        Cos14             pBluescript KS   SacI/1.45 Kbp
pS8.2         Cos14             pBluescript KS   SacI/8.2 Kbp
pS2.0         Cos14             pLitmus38        SphI/2.0 Kbp
pB11.5        Cos36             pBluescript KS   BamHI/11.5 Kbp
pBE4.8        Cos36             pBluescript KS   BamHI/EcoRI/4.8 Kbp
pM4.6         Cos36             pLitmus38        MluI/4.6 Kbp
pN1.6         Cos36             pBluescript KS   NotI/1.6 Kbp
pN4.8         Cos36             pBluescript KS   NotI/4.8 Kbp
pBE5.3        Cos40             pBluescript KS   BamHI/EcoRI/5.3 Kbp
pN5.2         Cos50             pBluescript KS   NotI/5.2 Kbp
pN10.0        Cos50             pBluescript KS   NotI/10.0 Kbp
pS3.3         Cos50             pBluescript KS   SacI/3.3 Kbp
B. Nucleotide Sequence of the Fα Biosynthetic Gene Cluster
[0081]1. Sequencing Strategy
[0082]The vast majority of the nucleotide sequence data was obtained by end-sequencing random, size-selected sublibraries of cosmid DNA that were prepared as described herein. Random sublibraries were sequenced until coverage over the entire fragment of DNA was expected to be sufficient (8-10× redundancy). In order to obtain nucleotide sequence data for regions of the biosynthetic gene cluster that were underrepresented in the random sublibraries, or that for other reasons were difficult to sequence, two other sequencing strategies were used. In the first, products were generated using the polymerase chain reaction in such a way as to span the region of interest of the gene cluster. These PCR products were sequenced directly using the PCR primers as sequencing primers, or the products were cloned into the commercially available PCR product cloning vector pTOPO TA (Invitrogen, Carlsbad, Calif.), and sequenced using universal primers. In the second, sequencing primers were synthesized which facilitated obtaining nucleotide sequence by "walking" through regions of interest on cosmids or subclones prepared from the cosmids. Throughout, nucleotide sequence was obtained on Applied Biosystems Model 377 Automated sequencers, using ABI PRISM® BigDye® Terminator Cycle Sequencing Ready Reaction reagents and materials according to detailed procedures provided by the manufacturer (Applied Biosystems, a Division of Perkin Elmer, Foster City, Calif.). Nucleotide sequence data was collected and analyzed using standard "Collection" and "Sequencing Analysis" algorithms (Applied Biosystems, a Division of Perkin Elmer, Foster City, Calif.). Nucleotide sequence assemblies were generated using the SeqMan® II sequence analysis package that is commercially available from DNASTAR (Madison, Wis.), and using the custom Finch®-300 Assembly Server developed for us by Geospiza (Seattle, Wash.).
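The 8-10× redundancy target mentioned above can be related to a number of sequencing reads with a simple estimate. The following is a minimal sketch only, not the procedure actually used: the insert size (40,000 bp) and usable read length (500 bp) are assumptions that are not stated in the text, and the Poisson (Lander-Waterman) estimate of unsequenced bases is a standard idealization that ignores cloning bias.

```python
import math


def reads_for_coverage(target_bp: int, read_len_bp: int, fold: float) -> int:
    """Smallest number of reads N such that N * read_len_bp / target_bp >= fold."""
    return math.ceil(fold * target_bp / read_len_bp)


def fraction_unsequenced(fold: float) -> float:
    """Poisson (Lander-Waterman) estimate of the fraction of bases covered by no read."""
    return math.exp(-fold)


if __name__ == "__main__":
    insert_bp = 40_000  # ASSUMPTION: cosmid insert size, not given in the text
    read_bp = 500       # ASSUMPTION: usable read length, not given in the text
    for fold in (8, 10):
        n_reads = reads_for_coverage(insert_bp, read_bp, fold)
        print(f"{fold}x coverage: {n_reads} reads, "
              f"expected uncovered fraction {fraction_unsequenced(fold):.2e}")
```

Under these assumed values the estimate shows why regions can still be underrepresented in practice: the idealized model predicts almost no gaps at this redundancy, so remaining gaps point to non-random cloning or difficult templates, which is consistent with the supplementary PCR and primer-walking strategies described above.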
[0083]Two cosmids (designated Cos36 and Cos40) that appeared on the basis of extensive restriction mapping to span the majority of the Fα biosynthetic gene cluster were isolated from the wild-type S. cyaneogriseus cosmid library. These cosmids were sequenced in their entirety by end-sequencing random, size selected sublibraries that were prepared as described herein. In addition, random, size selected sublibraries prepared from the inserts in several subclones (as summarized in the following Table 3) were also sequenced. Finally, the majority of the subclones generated to support comprehensive restriction mapping of the Fα biosynthetic gene cluster were end-sequenced using universal primers.
TABLE-US-00003 TABLE 3
Designation   Subcloned from Cosmid                   Restriction Sites/Size
pNE57         Cos7 (S. cyaneogriseus strain 142)      NotI-EcoRI/5.7 Kbp
pNE57         Cos40 (wild-type S. cyaneogriseus)      NotI-EcoRI/5.7 Kbp
pB5.5         Cos14                                   BamHI/5.5 Kbp
pN4.3         Cos14                                   NotI/4.3 Kbp
pN10.0        Cos50                                   NotI/10.0 Kbp
pS8.2         Cos14                                   SacI/8.2 Kbp
[0084]2. Construction of Sublibraries for Nucleotide Sequence Analysis
[0085]To generate large quantities of the inserts present in cosmids and in the subclones derived from those cosmids, large quantities of plasmid DNA were required. Media (typically 1 L) were inoculated with the clone of interest, and incubated at 37° C. overnight. Plasmid (cosmid) DNA was isolated from these cultures using materials and reagents included in the QIAGEN Plasmid Midi (up to 100 μg) and Maxi (up to 500 μg) kits, or reagents and materials included in the Nucleobond AX-100 (up to 100 μg) kit from Clontech (Palo Alto, Calif.). The inserts present in these plasmids (cosmids) were excised by digestion with appropriate restriction endonucleases, and the fragments were resolved by electrophoresis through 0.8% w/v agarose. The desired fragments were excised from these gels, and the DNA contained in those bands was isolated using reagents, materials, and procedures included in the QIAEX II® (for fragments larger than 10 Kbp) or QIAquick II (for fragments smaller than 10 Kbp) Gel Extraction Systems from QIAGEN (Valencia, Calif.). Then, the DNA was randomly sheared by sonication using a Microson cell disrupter at 10% output. Sonication times were optimized in order to generate fragments of the desired size (typically about 18 seconds for larger inserts isolated from cosmids, and about eight seconds for the smaller fragments isolated from plasmid subclones of those cosmids). Following ethanol precipitation, the DNA fragments were "blunted" using T4 DNA polymerase (New England Biolabs, Beverly, Mass.) in 25 μl reaction volumes containing 2.5 μl of 10×T4 DNA polymerase reaction buffer, 1 μl of 25 μg/ml BSA, and 1.5 μl of T4 DNA polymerase. The reaction mixtures were incubated at 16° C. for 20 minutes, and resolved by electrophoresis through 0.8% w/v agarose. 
The region of the gel containing DNA between 1.5 Kbp and 2.5 Kbp (by comparison to DNA fragments of known molecular weight) was excised, and the DNA was extracted from the agarose using reagents, materials, and procedures included in the QIAquick II Gel Extraction System from QIAGEN (Valencia, Calif.). Purified DNA was collected by ethanol precipitation and resuspended in 8 μl of water. These DNA fragments were then cloned into pCR®-Blunt, and the ligated products were transformed into chemically competent E. coli TOP10 using reagents, materials and procedures provided by the manufacturer (Invitrogen, Carlsbad, Calif.). Colonies were picked and used to inoculate 2 ml LB media supplemented with 50 μg/ml kanamycin, in 96-well deep well blocks. Plasmid DNA was purified from each of these cultures using reagents, materials and procedures included in QIAprep® 96 Turbo Miniprep Kits. Although the frequency of clones with insert generally exceeded 90%, each plasmid was digested with EcoRI and the fragments were resolved by electrophoresis through 0.8% w/v agarose in order to determine whether an insert of the desired size was present. Clones that did contain desired inserts were sequenced using universal sequencing primers as described herein.
[0086]3. Identification of Biosynthetic Modules and Domains within Modules
[0087]Many modular polyketide biosynthetic gene clusters have been characterized and manipulated. In addition, a large number of nucleotide sequences of modular polyketide biosynthetic gene clusters have been deposited in the public databases. In general, modules of modular polyketide biosynthetic gene clusters, and the domains within those modules, can be identified by performing BLAST searches against the public databases, and extensive use of those public databases was made to facilitate the present analysis of the Fα biosynthetic gene cluster (see S. F. Altschul et al., Nucleic Acids Research 25:3389-3402 (1997)). In addition, a recent literature reference was used that summarizes methods for identification of modular polyketide synthase domains and, in particular, describes the differentiation of malonyl-class from methylmalonyl-class acyltransferase domains (S. J. Kakavas et al., Journal of Bacteriology 179:7515-7522 (1997)). Leadlay and colleagues originally described methods for differentiation of malonyl-class from methylmalonyl-class acyltransferase domains (see T. Schwecke et al., Proceedings of the National Academy of Sciences USA 92:7839-7843 (1995)).
[0088]A description of five open reading frames, which together encode the loading domain and the 13 modules of the polyketide synthase, is presented in Table 4 below. For each open reading frame, the position in the Fα biosynthetic gene cluster (in nucleotides) and the length (in amino acids) of the predicted gene product are shown. In addition, the approximate location of each biosynthetic domain within that predicted gene product (again in amino acids) is also displayed. Abbreviations used are as follows: ACP, acyl carrier protein; ATm, malonyl-class acyltransferase; ATmm, methylmalonyl-class acyltransferase; DH, dehydratase; ER, enoylreductase; KR, ketoreductase; KS, ketoacyl synthase; LD, loading domain; TE, thioesterase.
TABLE-US-00004 TABLE 4
ORF4: nt 12850-19875 (2341 aa)   Designation: Loading Domain-Mod1
  ATmm-LD   aa 22-350
  ACP-LD    aa 365-450
  KS-1      aa 473-897
  ATmm-1    aa 1006-1339
  DH-1      aa 1359-1547
  KR-1      aa 1865-2052
  ACP-1     aa 2137-2223
ORF5: nt 19865-31036 (3723 aa)   Designation: Mod2-Mod3
  KS-2      aa 34-466
  ATmm-2    aa 574-908
  KR-2      aa 1211-1391
  ACP-2     aa 1473-1559
  KS-3      aa 1578-2005
  ATm-3     aa 2136-2476
  DH-3      aa 2486-2667
  ER-3      aa 2925-3279
  KR-3      aa 3287-3466
  ACP-3     aa 3556-3640
ORF6: nt 31115-49246 (6043 aa)   Designation: Mod4-Mod7
  KS-4      aa 34-456
  ATm-4     aa 582-907
  ACP-4     aa 950-1031
  KS-5      aa 1055-1481
  ATm-5     aa 1613-1938
  KR-5      aa 2247-2427
  ACP-5     aa 2516-2601
  KS-6      aa 2621-3047
  ATm-6     aa 3168-3493
  KR-6      aa 3802-3983
  ACP-6     aa 4078-4164
  KS-7      aa 4189-4615
  ATmm-7    aa 4727-5056
  DH-7      aa 5078-5257
  KR-7      aa 5588-5768
  ACP-7     aa 5868-5952
ORF9: nt 52809-69833 (5674 aa)   Designation: Mod8-Mod10
  KS-8      aa 39-465
  ATmm-8    aa 574-904
  DH-8      aa 926-1106
  ER-8      aa 1366-1718
  KR-8      aa 1726-1908
  ACP-8     aa 1995-2080
  KS-9      aa 2102-2529
  ATm-9     aa 2661-2986
  DH-9      aa 3009-3188
  KR-9      aa 3492-3674
  ACP-9     aa 3753-3842
  KS-10     aa 3864-4290
  ATmm-10   aa 4402-4732
  DH-10     aa 4753-4928
  KR-10     aa 5234-5416
  ACP-10    aa 5499-5586
ORF10: nt 69929-85429 (5166 aa)   Designation: Mod11-Mod13
  KS-11     aa 34-456
  ATm-11    aa 578-916
  KR-11     aa 1199-1380
  ACP-11    aa 1464-1549
  KS-12     aa 1570-1996
  ATmm-12   aa 2105-2442
  KR-12     aa 2724-2906
  ACP-12    aa 2992-3076
  KS-13     aa 3096-3519
  ATm-13    aa 3631-3975
  DH-13     aa 4003-4188
  KR-13     aa 4505-4687
  ACP-13    aa 4780-4866
  TE-13     aa 4893-5167
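The nucleotide coordinates and amino acid lengths in Table 4 can be cross-checked arithmetically. The sketch below is illustrative only and rests on one assumption that the text does not state: that each nucleotide range is inclusive and ends with the stop codon, so that the predicted length is the number of codons minus one. Under that assumption all five stated lengths are reproduced, and the check also flags that the last annotated domain of ORF10 (TE-13, ending at aa 5167) extends one position past the stated 5166 aa length; the sketch does not attempt to decide which value is intended.

```python
# Values transcribed from Table 4:
# name -> (first nt, last nt, stated length in aa, end of last annotated domain in aa)
PKS_ORFS = {
    "ORF4": (12850, 19875, 2341, 2223),
    "ORF5": (19865, 31036, 3723, 3640),
    "ORF6": (31115, 49246, 6043, 5952),
    "ORF9": (52809, 69833, 5674, 5586),
    "ORF10": (69929, 85429, 5166, 5167),
}


def predicted_length_aa(first_nt: int, last_nt: int) -> int:
    """Codons in the inclusive range, minus one for an ASSUMED terminal stop codon."""
    span = last_nt - first_nt + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3 - 1


def check_orfs(orfs):
    """Return (ORFs whose stated length disagrees, ORFs whose last domain overruns)."""
    length_mismatches = []
    domain_overruns = []
    for name, (first_nt, last_nt, stated_aa, last_domain_end) in orfs.items():
        if predicted_length_aa(first_nt, last_nt) != stated_aa:
            length_mismatches.append(name)
        if last_domain_end > stated_aa:
            domain_overruns.append(name)
    return length_mismatches, domain_overruns


if __name__ == "__main__":
    mismatches, overruns = check_orfs(PKS_ORFS)
    print("length mismatches:", mismatches)
    print("domain end beyond stated length:", overruns)
```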
[0089]4. Identification of Other Biosynthetic Pathway Genes
[0090]Whether the other open reading frames that were found to be clustered with the core modular polyketide synthase genes play a role in Fα biosynthesis, and if so, what that role might be, was assessed on the basis of a BLAST comparison of the nucleotide and predicted amino acid sequences of these open reading frames to sequences that have been deposited in the public databases (see S. F. Altschul et al., Nucleic Acids Research 25:3389-3402 (1997)). Using those methods, a tentative identification of at least six other genes that could be involved in Fα biosynthesis was made.
[0091]A description of six additional open reading frames, which encode genes that could be involved in Fα biosynthesis, is presented in Table 5 below, together with three flanking open reading frames that are not related to the cluster. For each open reading frame, the position in the Fα biosynthetic gene cluster (in nucleotides) and the length (in amino acids) of the predicted gene product are shown. In addition, a brief description of the BLAST results used to assign a putative functional role in Fα biosynthesis is included below for each of the open reading frames.
TABLE-US-00005 TABLE 5
ORF     Position (nt)   Length   Designation
ORFA    382-2514        711 aa   K+-Translocating ATPase, Subunit B (Not related to Fα Biosynthetic Gene Cluster)
ORFB    2511-4175       555 aa   K+-Translocating ATPase, Subunit A (Not related to Fα Biosynthetic Gene Cluster)
ORF1    7697-10465      922 aa   Regulatory Protein
ORF2    10791-11570     259 aa   Thioesterase
ORF3    11659-12462     267 aa   Reductase
ORF7    50449-51303     284 aa   Methyltransferase
ORF8    51300-52706     468 aa   P450
ORF11   85574-86338     254 aa   Oxidoreductase
ORFX    87037-88293     419 aa   Endo-1,3-β-glucosidase (Not related to Fα Biosynthetic Gene Cluster)
[0092]ORFA and ORFB: BLAST results reveal considerable homology between ORFA and ORFB and K+-translocating ATPase subunits B and A, respectively, particularly the Mycobacterium tuberculosis genes (nucleotide sequences of which were directly submitted to the public databases). These genes are unrelated to the Fα biosynthetic gene cluster.
[0093]ORF1: BLAST results suggest that at the nucleotide level, ORF1 is related to a putative transcriptional activator in the pikCD operon of a macrolide biosynthetic gene cluster from S. venezuelae (see Y. Xue et al., Proceedings of the National Academy of Sciences USA 95:12111-12116 (1998)), and a putative regulatory protein in a Type-I polyketide synthase biosynthetic gene cluster from the rapamycin producing organism, S. hygroscopicus (see X. Ruan et al., Gene 203: 1-9 (1997)). At the predicted amino acid sequence level, the gene product exhibits limited homology to a family of hypothetical transcriptional activators related to the E. coli narL gene product. On the basis of these BLAST results, ORF1 appears to encode a transcriptional activator.
[0094]ORF2: BLAST results reveal significant homology between ORF2 and thioesterases at both the nucleotide and predicted amino acid sequence levels, including thioesterases in the Amycolatopsis mediterranei rifamycin biosynthetic gene cluster (see P. R. August et al., Chemistry & Biology 5:69-79 (1998)), and the S. griseus candicidin biosynthetic gene cluster (see L. M. Criado et al., Gene 126:135-139 (1993)). On the basis of these BLAST results, ORF2 appears to encode a thioesterase.
[0095]ORF3: An analysis of BLAST results suggests that ORF3 is homologous to reductases in the S. cyanogenus S136 landomycin biosynthetic gene cluster (see L. Westrich et al., FEMS Microbiology Letters 170:381-387 (1999)). At the predicted amino acid sequence level, BLAST results reveal homology between the ORF3 gene product and an oxidoreductase responsible for the conversion of versicolorin A to sterigmatocystin in the Aspergillus parasiticus aflatoxin biosynthetic pathway (see C. D. Skory et al., Applied and Environmental Microbiology 58:3527-3537 (1992)). On the basis of these BLAST results, ORF3 appears to encode a reductase.
[0096]ORF7: BLAST results reveal significant homology between ORF7 and methyltransferases at the nucleotide level, including methyltransferases in the S. lavendulae mitomycin C biosynthetic gene cluster (see Y. Q. Mao et al., Chemistry & Biology 6:251-263 (1999)) and the Saccharopolyspora erythraea erythromycin biosynthetic gene cluster (see S. F. Haydock et al., Molecular and General Genetics 230:120-128 (1991)). On the basis of these BLAST results, ORF7 appears to encode a methyltransferase.
[0097]ORF8: BLAST results reveal limited homology between ORF8 and putative cytochrome P450s, including P450s in the S. roseofulvus frenolicin biosynthetic gene cluster and the S. pristinaespiralis pristinamycin biosynthetic gene cluster (see V. de Crecy-Lagard et al., Journal of Bacteriology 179:705-713 (1997)). At the predicted amino acid sequence level, ORF8 exhibits homology to a large family of mammalian cytochrome P450s. On the basis of these BLAST results, ORF8 appears to encode a cytochrome P450.
[0098]ORF11: BLAST results reveal significant homology between ORF11 and oxidoreductases at both the nucleotide and predicted amino acid sequence levels, including oxidoreductases in the S. violaceoruber granaticin biosynthetic gene cluster (see D. H. Sherman et al., EMBO Journal 8:2717-2725 (1989)), and the S. cinnamonensis monensin biosynthetic gene cluster (see T. J. Arrowsmith et al., Molecular and General Genetics 234:254-264 (1992)). On the basis of these BLAST results, ORF11 appears to encode an oxidoreductase.
[0099]ORFX: BLAST results reveal homology between ORFX and a glucan endo-1,3-β-glucosidase from Oerskovia xanthineolytica (see S. H. Shen et al., Journal of Biological Chemistry 266:1058-1063 (1991)). This gene is unrelated to the Fα biosynthetic gene cluster.
[0100]There are several open reading frames in the 3.5 Kbp region between characterized ORFB and ORF1, which on the basis of nucleotide sequence characteristics (G+C content, potential ribosome binding sites) appear to encode proteins. BLAST analysis, however, does not reveal significant homology between the predicted amino acid sequences of these hypothetical proteins and sequences of proteins that have been deposited in public databases. Consequently, ascribing a functional role to these hypothetical proteins in the biosynthesis of Fα is not possible on the basis of their nucleotide (or predicted amino acid) sequence alone. In addition, there are a number of open reading frames in the 7.8 Kbp region between characterized ORFX and the end of the nucleotide sequence that has now been obtained. Since ORFX encodes a gene that does not appear to play a role in Fα biosynthesis, and since macrolide biosynthetic genes are typically clustered, the hypothetical proteins encoded by the open reading frames beyond ORFX are not expected to participate in Fα biosynthesis.
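The spacing between neighboring open reading frames, including the approximately 3.5 Kbp region between ORFB and ORF1 discussed above, follows directly from the coordinates in Tables 4 and 5. The sketch below is a simple illustration that assumes inclusive coordinates on a single numbering of the cluster; a negative result is read as an overlap of that many nucleotides. Strand orientation is not considered, since it is not stated in the tables.

```python
def gap_nt(upstream_last_nt: int, downstream_first_nt: int) -> int:
    """Nucleotides strictly between two features; negative means they overlap."""
    return downstream_first_nt - upstream_last_nt - 1


# (upstream ORF, downstream ORF, last nt of upstream, first nt of downstream),
# transcribed from Tables 4 and 5.
NEIGHBORS = [
    ("ORFA", "ORFB", 2514, 2511),
    ("ORFB", "ORF1", 4175, 7697),
    ("ORF4", "ORF5", 19875, 19865),
    ("ORF7", "ORF8", 51303, 51300),
    ("ORF11", "ORFX", 86338, 87037),
]

if __name__ == "__main__":
    for upstream, downstream, last_nt, first_nt in NEIGHBORS:
        gap = gap_nt(last_nt, first_nt)
        kind = "gap" if gap >= 0 else "overlap"
        print(f"{upstream} -> {downstream}: {kind} of {abs(gap)} nt")
```

With the tabulated values, the ORFB to ORF1 interval comes to 3,521 nt, in agreement with the 3.5 Kbp figure given in the text.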
Example 2
Gene Replacement, Characterization of Integrants and Excisants
A. Gene Replacement
[0101]In order to develop an S. cyaneogriseus strain capable of direct fermentative production of 23-keto-Fα, derivatives of the Fα production strain in which the module 3 ketoreductase domain had been replaced with nonfunctional variants were sought. A series of directed amino acid substitutions, each intended to disrupt ketoreductase activity while minimally affecting the rest of the polyketide synthase, was designed as follows. A multiple amino acid sequence alignment was generated in which the predicted amino acid sequence of the module 3 ketoreductase domain from the S. cyaneogriseus Fα biosynthetic gene cluster was aligned with the predicted amino acid sequences of a large number of biologically active ketoreductase domains. These ketoreductase domain sequences were from the S. avermitilis avermectin biosynthetic gene cluster, the Saccharopolyspora erythraea erythromycin biosynthetic gene cluster, the S. hygroscopicus rapamycin biosynthetic gene cluster, the S. caelestis niddamycin biosynthetic gene cluster, and the Amycolatopsis mediterranei rifamycin biosynthetic gene cluster. Three ketoreductase domains known to be nonfunctional (so-called "cryptic" ketoreductase domains from module 3 of the Saccharopolyspora erythraea erythromycin biosynthetic gene cluster, module 4 of the S. caelestis niddamycin biosynthetic gene cluster, and module 3 of the Amycolatopsis mediterranei rifamycin biosynthetic gene cluster) were also included in the sequence alignment. This multiple amino acid sequence alignment readily supported the identification of relatively invariant amino acid sequences common to the majority of biologically active ketoreductase domains, but absent from (or altered in) nonfunctional ketoreductase domains.
[0102]Methods were also developed for gene replacement in S. cyaneogriseus by homologous recombination such that the native module 3 ketoreductase domain of the Fα biosynthetic gene cluster could be replaced with the desired engineered variants of that domain, as described herein.
[0103]1. Construction of Plasmids for Site-Directed Mutagenesis
[0104]The QuikChange® site-directed mutagenesis procedure is a double-stranded method based on the polymerase chain reaction that requires two mutagenic oligonucleotides, one corresponding to each strand of the double stranded region of DNA. The method is less efficient when large plasmids, particularly large plasmids containing high G+C content DNA, are used. Consequently, site-directed mutagenesis of the Fα module 3 ketoreductase domain was performed in a vector designated pKR0.9 (see FIG. 3), which is the 900 bp BstEII-AatII fragment of pNE57 (and contains the desired region of the Fα module 3 ketoreductase domain), in the BstEII-AatII sites of pSL301 (Invitrogen, Carlsbad, Calif.).
[0105]2. Site-Directed Mutagenesis
[0106]Five variants of the Fα module 3 ketoreductase domain were generated by site-directed mutagenesis using reagents, materials and procedures provided by the manufacturer of the QuikChange® Site-Directed Mutagenesis kit (Stratagene, La Jolla, Calif.). The following amino acid substitutions were generated in pKR0.9, using the mutagenic oligonucleotides indicated below:
TABLE-US-00006
"179"   GGTGTLG (SEQ ID NO: 13) to GAASTLG (SEQ ID NO: 14)
  5'-CTGGTGACGGGCGCTGCAAGCACTCTGGGGGCG (SEQ ID NO: 15)
  3'-GACCACTGCCCGCGACGTTCGTGAGACCCCCGC (SEQ ID NO: 16)
"204"   LVSRRGM (SEQ ID NO: 17) to LVAAAGM (SEQ ID NO: 18)
  5'-GCGGCATCTGCTGCTGGTGGCAGCGGCAGGCATGGCCGCCGCCGGTG (SEQ ID NO: 19)
  3'-CGCCGTAGACGACGACCACCGTCGCCGTCCGTACCGGCGGCGGCCAC (SEQ ID NO: 20)
"260"   HTAGVLD (SEQ ID NO: 21) to HTPPLLD (SEQ ID NO: 22)
  5'-GACCGCTGTGGTGCACACGCCACCTCTCCTGGACGACGCCACCGTG (SEQ ID NO: 23)
  3'-CTGGCGACACCACGTGTGCGGTGGAGAGGACCTGCTGCGGTGGCAC (SEQ ID NO: 24)
"283"   GAKVD (SEQ ID NO: 25) to GAAVD (SEQ ID NO: 26)
  5'-GATGCGGTGCTCGGGGCGGCTGTGGACGGTGCCCTGCAC (SEQ ID NO: 27)
  3'-CTACGCCACGAGCCCCGCCGACACCTGCCACGGGACGTG (SEQ ID NO: 28)
"306"   VLFSSAA (SEQ ID NO: 29) to VLFAAAA (SEQ ID NO: 30)
  5'-GTCGGCGTTCGTGCTGTTCGCAGCGGCCGCCGGGGTCCTGG (SEQ ID NO: 31)
  3'-CAGCCGCAAGCACGACAAGCGTCGCCGGCGGCCCCAGGACC (SEQ ID NO: 32)
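Two properties of a mutagenic oligonucleotide pair lend themselves to a quick check: the two strands should be exact complements, and the 5' strand should translate, in the appropriate reading frame, to the intended substituted peptide. The sketch below illustrates this for the "179" and "204" pairs as printed above, using the standard genetic code. The reading frames (0 and 1) were chosen for this example by inspection and are not stated in the text; the second strand is written 3' to 5', so a plain base-by-base complement is compared rather than a reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def complement(seq: str) -> str:
    """Base-by-base complement, preserving the left-to-right order of the input."""
    return seq.translate(COMPLEMENT)


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons of seq starting at offset frame (0, 1, or 2)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


# (label, 5' strand, strand written 3'->5', intended peptide, ASSUMED frame)
OLIGOS = [
    ("179", "CTGGTGACGGGCGCTGCAAGCACTCTGGGGGCG",
     "GACCACTGCCCGCGACGTTCGTGAGACCCCCGC", "GAASTLG", 0),
    ("204", "GCGGCATCTGCTGCTGGTGGCAGCGGCAGGCATGGCCGCCGCCGGTG",
     "CGCCGTAGACGACGACCACCGTCGCCGTCCGTACCGGCGGCGGCCAC", "LVAAAGM", 1),
]

if __name__ == "__main__":
    for label, top, bottom, peptide, frame in OLIGOS:
        strands_pair = complement(top) == bottom
        encodes = peptide in translate(top, frame)
        print(f'"{label}": strands complementary={strands_pair}, '
              f"encodes {peptide}={encodes}")
```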
[0107]The QuikChange® mutagenesis reactions contained 125 ng of each of the mutagenic oligonucleotides, 50 ng of pKR0.9, 0.7 μl of Pfu DNA polymerase, and 2.5% DMSO in final reaction volumes of 50 μl. The reactions were subjected to 22 cycles of amplification (95° C. for 45 seconds, 63° C. for 1 minute, and 70° C. for 10 minutes), and amplified products were cloned according to detailed procedures provided by the manufacturer. After completing the site-directed mutagenesis procedure, colonies were picked and used to inoculate 2 ml LB media supplemented with 100 μg/ml carbenicillin. Plasmid DNA was purified from each of these cultures using reagents, materials and procedures included in the QIAprep® 8 Turbo Miniprep Kits, and the mutated 900 bp BstEII-AatII region of the Fα module 3 ketoreductase domain was sequenced in its entirety in order to confirm that the desired changes had been made.
[0108]3. Construction of Plasmids for Integration
[0109]A three-way ligation was used to combine the five site-directed mutants of the Fα module 3 ketoreductase domain with flanking DNA to facilitate homologous integration using the pKC1132 backbone. The three components included: the 4.3 Kbp NotI-BstEII fragment of pNE57 (containing the majority of the Fα module 3 adjacent to the regions mutagenized); the 1.1 Kbp BstEII-PstI fragments of six pKR0.9 constructs (containing the five site-directed mutants of the Fα module 3 ketoreductase domain, and the wild-type Fα module 3 ketoreductase domain); and the 3.6 Kbp PstI-NotI fragment of pKC1132 (containing all of the elements necessary for selection and replication of the resultant plasmid in E. coli and Streptomyces). These manipulations resulted in the generation of the pFDmod3/5.2 plasmid series. These plasmids were then used to construct versions of the plasmids for integration from which approximately 1 Kbp of flanking DNA had been removed. These plasmids were constructed by digesting each of the pFDmod3/5.2 plasmids with EcoRI. This EcoRI site is immediately adjacent to the NotI site in pKC1132 that was used to introduce the 4.3 Kbp NotI-BstEII fragment of pNE57 (containing the majority of the Fα module 3). The 5' overhang generated by EcoRI was filled in using T4 DNA polymerase under standard reaction conditions, and the linearized plasmids were digested with MscI. The digests were resolved by electrophoresis through 0.8% w/v agarose, the desired fragments were excised from the gel, and the DNA was extracted from the agarose using reagents, materials and procedures included in the QIAquick II Gel Extraction System from QIAGEN (Valencia, Calif.). Purified DNA was collected by ethanol precipitation and ligated to generate the pFDmod3/4.2 plasmid series (see FIG. 5).
[0110]Plasmids of the pFDmod3/5.2 series (see FIG. 4) and the pFDmod3/4.2 series (see FIG. 5) were transformed into E. coli ETS12567 (pUZ8002) using methods described herein. Then, these transformed E. coli strains were used as the source of DNA for conjugal transfer to S. cyaneogriseus using methods described herein.
[0111]4. Isolation and Analysis of Genomic DNA from S. cyaneogriseus Transconjugants and Excisants
[0112]A method modified from methods presented in D. A. Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, John Innes Foundation Press, Norwich, UK (1985) ("Isolation of Streptomyces "Total" DNA": Procedure 4) was used for the isolation of small amounts of genomic DNA from S. cyaneogriseus strains. Putative S. cyaneogriseus transconjugants and excisants were picked and used to inoculate 3 ml KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8 and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O). The cultures were incubated at 31° C., with shaking at 220 rpm, for 24-28 hours. The cells in 500 μl aliquots of these cultures were collected by centrifugation in a microfuge at 13,000 rpm for 5 minutes, and the supernatant was discarded. After washing the cell pellets with water, they were suspended in 450 μl of SET (0.3 M sucrose, 25 mM EDTA, 25 mM Tris, pH 8.0, containing 4 mg/ml lysozyme and 50 μg/ml RNaseA), and the suspensions were incubated at 37° C. for 2-4 hours. 250 μl of a 2% solution of SDS was added, and the samples were vortexed for 1 minute. The samples were extracted with 250 μl of phenol:CHCl3 (1:1) and the phases were resolved by centrifugation in a microfuge at 13,000 rpm for 5 minutes. The aqueous layer was removed to a new tube, and after adding 1/10th volume 3 M sodium acetate, the DNA was precipitated by adding an equal volume of isopropanol. Precipitated DNA was collected by centrifugation in a microfuge at 13,000 rpm for 5 minutes, washed with -20° C. 70% ethanol, and suspended in 100 μl of water.
[0113]For the isolation of larger amounts of genomic DNA from S. cyaneogriseus strains, 25 ml KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8 and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O) was inoculated with mycelial fragments of the strain of interest. The cultures were incubated at 31° C., with shaking at 220 rpm, for 24-28 hours. The cells in 3 ml aliquots of these cultures were collected by centrifugation in a microfuge at 13,000 rpm for 5 minutes, and the supernatant was discarded. After washing the cell pellets with water, genomic DNA was isolated using reagents, materials and procedures included in the DNeasy® system for the isolation of total (plant) DNA from QIAGEN (Valencia, Calif.).
[0114]5. Characterization of Transconjugants
[0115]Putative transconjugants were plated on CM agar (5 g/L corn steep liquor, 5 g/L Bacto-peptone, 10 g/L soluble starch, 0.5 g/L NaCl, 0.5 g/L CaCl2.2H2O, 20 g/L Bacto-agar) plates containing 100 μg/ml apramycin, 30 μg/ml nalidixic acid, 50 μg/ml cycloheximide, and 50 μg/ml nystatin A. These plates were incubated at 31° C. until the colonies were well-established. Genomic DNA was then isolated from the putative transconjugants using methods described herein, for analysis by Southern blot and nucleotide sequence analysis as follows. Aliquots of the genomic DNA preparations were digested with HindIII/StuI and with SalI. The fragments were resolved by electrophoresis through 0.8% w/v agarose, and blotted onto Nytran® membranes (commercially available from Schleicher & Schuell BioScience, Inc. USA, Keene, N.H.) for Southern analysis according to well-established procedures similar to those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). Typically, these Southern blots were probed with the mod3-specific probe, which was generated as described herein. The expected sizes of the fragments were:
TABLE-US-00007
Strain                                                                 HindIII/StuI   SalI
S. cyaneogriseus production strain 142                                 10.8 Kbp       4.6 Kbp
S. cyaneogriseus production strain 142/pFDmod3/5.2 transconjugants     13.3 Kbp       4.6 Kbp + 3.3 Kbp
S. cyaneogriseus production strain 142/pFDmod3/4.2 transconjugants     12.3 Kbp       4.6 Kbp + 3.3 Kbp
[0116]The region of interest of transconjugants that appeared to be correct on the basis of the Southern analysis was amplified using standard polymerase chain reaction (PCR), and the PCR products were sequenced to confirm that the desired sequence had been obtained. Two primer sets were used to characterize the transconjugants. Each pair comprised one mod3-specific primer and one primer specific for vector-derived sequences. In addition, the primer pairs were designed such that one pair would amplify products from the "right side of the cassette" and the other pair would amplify products from the "left side of the cassette." The primer pairs used were:
TABLE-US-00008
Left
  (mod70F)    5'-TACTGCGCCACACGGAGCCCGAG (SEQ ID NO: 33) and
  (P6568B)    5'-TGGGTAACGCCAGGGTTTTC (SEQ ID NO: 34)
Right
  (PECOR1F)   5'-GGAAACAGCTATGACATGATTACG (SEQ ID NO: 35) and
  (mod3633B)  5'-TCGGAGCCGCTCCACCTGAG (SEQ ID NO: 36)
[0117]With genomic DNA isolated from a "correct" transconjugant as a template, these PCR primers would direct the amplification of 6.4 Kbp and 5.7 Kbp products, respectively. The region of these PCR products containing the ketoreductase domain was sequenced to confirm that the desired sequence had been obtained, using the following oligonucleotide sequencing primers:
TABLE-US-00009
"179" Transconjugants:
  Forward  5'-CCTGATGGACGCGGGTGCGC (SEQ ID NO: 37)
  Reverse  5'-GACACCGAAACCCCTG (SEQ ID NO: 38)
"204" Transconjugants:
  Forward  5'-CCTGATGGACGCGGGTGCGC (SEQ ID NO: 39)
  Reverse  5'-GCCGTGTGCACCACAGCGGTCAG (SEQ ID NO: 40)
"260", "283", "306" Transconjugants:
  Forward  5'-GTGTGATGTCGCCGACCGCGCCCAGGTC (SEQ ID NO: 41)
  Reverse  5'-GCGCTGGTGGGCCAGGGCGTCC (SEQ ID NO: 42)
[0118]6. Excision and Characterization of Excisants
[0119]Transconjugants that had been verified by Southern analysis and by nucleotide sequence analysis of PCR products as described herein were used to inoculate 25 ml of KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8 and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O), and the cultures were incubated at 31° C. with shaking at 220 rpm, for 48 hours. A 500 μl aliquot of the culture was crossed into a fresh 25 ml of KB3 medium, and incubation was continued at 31° C. with shaking at 220 rpm, for an additional 48 hours. This process was continued for many such rounds, in the absence of selection, in order to allow for the excision event to occur. After rounds 3-6, serial dilutions of the cultures were prepared from 10⁻¹ to 10⁻⁵, and 250 μl aliquots of the 10⁻³ to 10⁻⁵ dilutions were plated onto 140 mm diameter CM agar plates (5 g/L corn steep liquor, 5 g/L Bacto-peptone, 10 g/L soluble starch, 0.5 g/L NaCl, 0.5 g/L CaCl2.2H2O, 20 g/L Bacto-agar). These plates were incubated at 31° C. for 48-96 hours, until colonies were well-established. Individual colonies were then picked, and patched in replicate onto CM plates, and CM plates supplemented with 100 μg/ml apramycin. These plates were incubated at 31° C. for up to 5 days, at which time colonies sensitive to apramycin, but capable of growing normally in the absence of selection, were identified. Genomic DNA was then isolated from these putative excisants using methods described herein. Using these genomic DNA preparations as templates, the region of interest was amplified using the polymerase chain reaction (PCR), and the PCR products were sequenced to confirm that the desired sequence had been obtained. The primer pair used for amplification was:
TABLE-US-00010
(mod70F) 5'-TACTGCGCCACACGGAGCCCGAG (SEQ ID NO: 33)
and
(mod3633B) 5'-TCGGAGCCGCTCCACCTGAG (SEQ ID NO: 36)
[0120]With genomic DNA isolated from a "correct" excisant as a template, these PCR primers would direct the amplification of a 6.6 kbp product. The region of these PCR products containing the ketoreductase domain was sequenced to confirm that the desired sequence had been obtained, using the following oligonucleotide sequencing primers:
TABLE-US-00011
"179" Excisants:
Forward 5'-CCTGATGGACGCGGGTGCGC (SEQ ID NO: 37)
Reverse 5'-GACACCGAAACCCCTG (SEQ ID NO: 38)
"204" Excisants:
Forward 5'-CCTGATGGACGCGGGTGCGC (SEQ ID NO: 39)
Reverse 5'-GCCGTGTGCACCACAGCGGTCAG (SEQ ID NO: 40)
"260", "283", "306" Excisants:
Forward 5'-GTGTGATGTCGCCGACCGCGCCCAGGTC (SEQ ID NO: 41)
Reverse 5'-GCGCTGGTGGGCCAGGGCGTCC (SEQ ID NO: 42)
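By way of illustration only, and not as part of the original disclosure, the sequencing primers listed above may be checked for length and G+C content before use. The following Python sketch assumes nothing beyond the primer sequences given in the tables; the helper names gc_fraction, reverse_complement and summarize_primers are hypothetical.

```python
# Illustrative sketch only; the helper names are hypothetical.
# Primer sequences are copied from the sequencing-primer table (SEQ ID NOS: 37-42).
PRIMERS = {
    "SEQ ID NO: 37": "CCTGATGGACGCGGGTGCGC",
    "SEQ ID NO: 38": "GACACCGAAACCCCTG",
    "SEQ ID NO: 39": "CCTGATGGACGCGGGTGCGC",
    "SEQ ID NO: 40": "GCCGTGTGCACCACAGCGGTCAG",
    "SEQ ID NO: 41": "GTGTGATGTCGCCGACCGCGCCCAGGTC",
    "SEQ ID NO: 42": "GCGCTGGTGGGCCAGGGCGTCC",
}

# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def summarize_primers(primers: dict) -> dict:
    """Map each primer name to a (length, G+C fraction) tuple."""
    return {
        name: (len(seq), round(gc_fraction(seq), 3))
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc) in summarize_primers(PRIMERS).items():
        print(f"{name}: {length} nt, G+C fraction {gc}")
```

As listed, SEQ ID NO: 37 and SEQ ID NO: 39 have the same nucleotide sequence, so the sketch reports identical values for both. The high G+C fractions are in keeping with primers designed against Streptomyces DNA.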
B. Fermentation and Analysis of Fermentation Products
[0121]Seed flasks containing 25 ml of KB3 medium (10 g/L Bacto-tryptone, 5 g/L yeast extract, 3 g/L beef extract, 1 g/L KH2PO4, 1 g/L K2HPO4, 1.5 g/L Difco agar, pH 6.8 and 0.5 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O) were inoculated with 500 μl of a suspension of S. cyaneogriseus mycelial fragments (either fresh or frozen), and the cultures were incubated at 31° C. with shaking at 220 rpm for 48 hours. A 500 μl aliquot of the seed culture was transferred into production flasks containing 25 ml of SD2 production medium (85.5 g/L glucose, 0.36 g/L KCl, 0.72 g/L MgSO4.7H2O, 7.2 g/L CaCO3, 4.86 g/L (NH4)2SO4, 0.72 g/L K2HPO4, 7.2 g/L pharmamedia, and 1.8 ml/L of a trace metal solution containing 30 g/L FeSO4, 30 g/L ZnSO4.7H2O, 4 g/L MnSO4, 4 g/L CuCl2.5H2O, 0.4 g/L CoCl2.6H2O), and the cultures were incubated at 31° C. for 10 days. Starting typically at 120 hours, and continuing through the end of the fermentation, 100 μl aliquots of the production culture were removed and combined with 900 μl of methanol. The suspensions were vortexed for 1 minute and clarified by centrifugation in a microfuge at 13,000 rpm for 10 minutes, and 10 μl aliquots of the extract were analyzed by reversed phase HPLC.
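The arithmetic implied by the media recipes and the methanol extraction above can be sketched as follows. This Python example is illustrative only and is not part of the original disclosure. It assumes that volumes are additive on mixing (only approximately true for methanol and aqueous culture) and that the volume of trace metal solution is negligible relative to one liter of medium; the function names are hypothetical.

```python
# Illustrative sketch only; function names are hypothetical.
# Trace metal stock concentrations (g/L) are taken from the KB3 / SD2 recipes.
TRACE_STOCK_G_PER_L = {
    "FeSO4": 30.0,
    "ZnSO4.7H2O": 30.0,
    "MnSO4": 4.0,
    "CuCl2.5H2O": 4.0,
    "CoCl2.6H2O": 0.4,
}


def final_trace_mg_per_l(stock_ml_per_l: float) -> dict:
    """Final concentration (mg/L) of each salt in the medium.

    A stock at X g/L contains X mg per ml, so adding V ml of stock per liter
    of medium gives X * V mg/L (assuming the added volume is negligible).
    """
    return {salt: conc * stock_ml_per_l for salt, conc in TRACE_STOCK_G_PER_L.items()}


def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when a sample is combined with a diluent (additive volumes assumed)."""
    return (sample_ul + diluent_ul) / sample_ul


def culture_equivalent_ul(injection_ul: float, sample_ul: float, diluent_ul: float) -> float:
    """Volume of original culture represented by one HPLC injection."""
    return injection_ul / dilution_factor(sample_ul, diluent_ul)


if __name__ == "__main__":
    print("KB3 (0.5 ml/L stock):", final_trace_mg_per_l(0.5))
    print("SD2 (1.8 ml/L stock):", final_trace_mg_per_l(1.8))
    print("Extract dilution:", dilution_factor(100, 900))
    print("Culture per 10 ul injection (ul):", culture_equivalent_ul(10, 100, 900))
```

Under these assumptions, the 100 μl culture plus 900 μl methanol extract is a 10-fold dilution, so each 10 μl injection corresponds to about 1 μl of production culture. Any titer read from the chromatogram would therefore be multiplied by 10 to express it per volume of culture.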
[0122]For analysis by reversed phase HPLC, samples were subjected to chromatography on a Waters Model 625 Liquid Chromatography Station equipped with a Waters Model 996 Photodiode Array Detector, a Waters Model 717 Autosampler, and a Waters Nova-Pak C18 column (8 mm×100 mm). The column was equilibrated in and eluted with a mobile phase containing 60% (v/v) acetonitrile and 40% (v/v) 100 mM ammonium acetate, pH 4.5, at a flow rate of 2 ml/min. The compounds of interest, Fα and 23-keto Fα (a precursor of moxidectin), were detected by monitoring their absorbance at 242 nm, and retention times were compared to those of authentic samples.
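The comparison of retention times with those of authentic samples can be sketched as below. This Python example is illustrative only and is not part of the original disclosure. The retention times, the matching tolerance and the function name assign_peaks are all hypothetical, since the specification does not state numerical retention times.

```python
# Illustrative sketch only. All numerical values below are hypothetical
# placeholders; the specification does not report retention times.
from typing import Dict, List, Optional


def assign_peaks(
    peaks_min: List[float],
    standards_min: Dict[str, float],
    tolerance_min: float = 0.1,
) -> Dict[str, Optional[float]]:
    """Match each authentic standard to the closest observed peak.

    For every standard, the observed peak whose retention time lies within
    `tolerance_min` minutes of the standard and is nearest to it is reported;
    None is reported when no observed peak falls inside the window.
    """
    assignments: Dict[str, Optional[float]] = {}
    for name, expected in standards_min.items():
        candidates = [p for p in peaks_min if abs(p - expected) <= tolerance_min]
        if candidates:
            assignments[name] = min(candidates, key=lambda p: abs(p - expected))
        else:
            assignments[name] = None
    return assignments


if __name__ == "__main__":
    # Hypothetical retention times (minutes) for authentic samples.
    standards = {"F-alpha": 6.20, "23-keto F-alpha": 4.85}
    # Hypothetical peaks detected at 242 nm in a fermentation extract.
    observed = [2.10, 4.90, 6.18, 9.40]
    print(assign_peaks(observed, standards))
```

In practice the tolerance would be chosen from the run-to-run reproducibility of the authentic samples on the same column and mobile phase, and a match on retention time alone would ordinarily be supported by the photodiode array spectrum.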
[0123]In the foregoing, there has been provided a detailed description of particular embodiments of the present invention for the purpose of illustration and not limitation. It is to be understood that all other modifications, ramifications and equivalents obvious to those having skill in the art based on this disclosure are intended to be included within the scope of the invention as claimed.
Sequence CWU 1
Number of SEQ ID NOS: 42
SEQ ID NO: 1; Length: 88400; Type: DNA; Organism: bacteria
Sequence: 1
gagctcttcg ctcccgccgg accggttggt cgcgccggag
aggacgagcc ggtagcgggt 60gttgatctcg ttggtgccga gcccgttggc cgggcggccg
tggaaccacc tcggatcggg 120ctcgggggtc tcctggccct tcttcagcgg cagatggtac
ggctggccga tcagcgagga 180gccgacggcc ctgccgtccg ccgtgatctc ggagccgtcg
gcccggtcgc ggaagagtgc 240ctgggcgacg ccggtgacga ccagcgggta gccggcgccc
gtcaccaggg tcagcacgag 300gagggcccgc aggcccgccc cgagcagccg gacggtgtgg
gtggcggagt tgttcatggc 360ggtcagcacg ctttcgtgac gtcacggccc gggaacgagg
gagatgaaca ggtcgatgat 420cttgatgcct atgaagggcg ccaccaggcc gcccaggccg
tagatcccga ggttgcgccg 480cagcatccgg tccgcgctca ccggccggta ccgcacgccc
ctcagggaca gcggcaccag 540cgccacgatg accagcgcgt tgaagatcac cgcggagagg
atcgcggagt cgggtgagga 600caggcccatg acgtcgagtc gctccaggcc gggatgggcc
ggcgcgaaca gcgccgggat 660gatcgcgaag tacttcgcga cgtcgttggc cagggagaag
gtcgtcagtg cgccgcgtgt 720gatcagcagt tgcttgccga tctccacgat ctcgatcagt
ttggtgggat cggagtcgag 780gtcgaccatg ttgccggcct ccttcgcggc cgacgtaccg
gtgttcatcg ccacgccgac 840gtccgcctgg gccagagccg gggcgtcgtt ggtgccgtcc
ccggtcatgg cgaccagcct 900gccgcctgcc tgctcccgcc tgatcagcgc catcttgtcc
tcgggagtcg cctccgcgag 960gtagtcgtcg acgcccgcct cgcgcgcgac ggcctgcgcg
gtcagcgggt tgtcacccgt 1020gatcatgacg gtcctgatgc ccatgcggcg cagttcctcg
aaccgcgcgc gcatgccgtc 1080cttgacgacg tccttgaggt ggacggctcc cagcacccgg
gcgccccgct cgtcccgcgc 1140ggcgaccagc aggggcgtgc cccccgatcc ggcgatgcgg
tcggcgatgg ccttcgcgtc 1200ctgggcggcc tcaccgccct gctcctcgac ccaggcgagg
atggaaccgg ccgcgccctt 1260gcggatcctg cggccgccga cgtccacgcc cgacatgcgg
gtccgggcgg tgaacgcgat 1320ccattcggcg ccggcgagtt cgccccggtg ccgctcgcgc
agtccgtact gctccttcgc 1380caggacgacg acggaccggc cctcgggcgt ctcgtccgcg
agcgaggaga gctgcgcggc 1440gtccgccacc tcggcctccg tggtgccgga caccggcacg
aacccggccg cccgccggtt 1500gccgagcgtg atcgtgccgg tcttgtccag cagcagcgtg
gagacgtcgc ccgcggcctc 1560gaccgcccgg cccgacacgg ccagcacatt gcgctgcacc
aggcggtcca tgcccgcgat 1620gccgatcgcc gagagcagcg cgccgatcgt ggtcgggatg
aggcagacca gcagcgccac 1680cagcaccgtc ggtgtcaggt gggtgcccgc gtgatccgcg
aagggcggca gcgtggcgca 1740gaccagcagg aagacgatgg tcagcgaggc cagcaggatg
ttcagcgcga tttcgttagg 1800cgtcttctgc cgggccgcgc cttcgacgag gtcgatcatc
cggtcgatga aggtctcacc 1860gggcttggtc gtgatccgga tgacgacacg gtcggacagg
accttggtgc cgccggtgac 1920ggcgctccgg tctccccccg actcgcggat gacgggtgcc
gactcgccgg tgatggcgga 1980ctcgtcgacg gacgcgacgc cctcgacgac atcaccgtcg
ccggggatga cgtccccggc 2040ctcgcagacc accagatcgc cgatcctcag tccggtgccc
ggcacccgct cctccgagcc 2100gtcctcgcgc aggcggcggg cgacggtgcc ggtcctggtc
ttgcgcaggg tgtcggcctg 2160tgccttgccg cggccttcgg cgaccgcctc cgcgaggttg
gcgaagagca cggtcatcca 2220gagccaggcg gagacggtcc agccgaaccg gtcgccggga
tccatgaggg cgaagacggt 2280ggtgaggacc gagccgatcc acaccacgaa catcacgggc
gtcttgatct gcacccgcgg 2340gtccagcttg cggaaggcgt ccggcaacga cctgacgagc
tggcccgggt cgaacagacc 2400gccgccgacc cgcctttcgg acggctggtg accggtgggg
gcgtcgcgct gcggcgtccg 2460ggcgggagtg atcgtggaca tcgggttccc ttggtcgtcc
gggtgtgcgc tcatgccgcc 2520agcccttcgg cgagcggccc cagcgccagg gccgggaagt
acgtcaagcc ggcgaggatc 2580aggatcgcgc ccaccatcag gccgctgaac agcggcttgt
cggtgcgcag ggtgccggtg 2640gtgaccggca cgggccgttg cccggcgagc gagccggcca
gcgccaggac gaacaccatc 2700ggcaggaagc ggccgagcag catcgccagt ccgatggtgg
tgttgaacca ctgcgtgtcc 2760gcgtcgagac cggcgaaggc cgagccgttg ttgttggcgc
cggaggtgta ggcgtagagg 2820atctcggaga acccgtgcgc gccgctgccg gtcgtcgagt
tcaccggcgt cggcagggcc 2880atcgcgcacg cggtgaggat caggaccagc gccggggtga
ccagcaggtg gcaagcggcc 2940agtttgatct cgcgggtgcc gatcttcttg cccaggtact
cgggcgtgcg gccgaccatc 3000agaccggcga tgaacaccgc tgtgacggcc atgacgagca
tgccgtagag gccggatccc 3060accccgcccg gagcgatctc gcccagcatc atgccgagca
tcgcgatgcc tccgccaagg 3120ccggtgaagg aggagtggaa ggagtccacc gcgccggtcg
aggtgagcgt ggtcgacacc 3180gcgaagatgg acgaggcgcc gacaccgaag cggacctcct
tgccctccat cgcgccaccg 3240gcgatctcga gcgccgggcc gcggtgcgag aactcggtcc
acatcatcag ggcgacgaag 3300gcgatccaga aggtggccat cgtcgccagg atcgcgtagc
cctgcctgac cgagccgacc 3360atgacgccga acgtccgggt gatcgagaag gggatcacca
ggatcaggaa gatctcgaag 3420aggttggtga agggcgtcgg gttctcgaac gggtgggcgc
tgttggcgtt gaaatagccg 3480ccgccgttgg tgcccagcag tttgatggcc tcctgggagg
cgaccgcgcc cccgttccac 3540tgctgcgagc cgcccgtgaa ccggccgacc tcgtggatgc
cggagaagtt ctggatgacc 3600ccgcaggcgg ccagcaccac ggcgccgagg gtggccagcg
gcaccaggac gcggaccgtt 3660ccgcgcacca gatcggccca gaggtttccc agttcaccgg
tgcgggagcg cgcgaacccg 3720cgcaccagcg cgaccgcgac ggccatgccc acggcggccg
aggtgaagtt ctgcacggcc 3780aggccggcgg tctgcacgag gtgtcccatg gcctgttcgc
cggagtacga ctgccagttg 3840gtgttggtca cgaaggacac ggccgtgttg aacgcctggt
ccgggtcgac ggctcgaaag 3900ccgagggaca gcggcaggac gccttgcgcc cgctggacca
ggtacaggaa gaggacgccg 3960gccacggaga aggccagcac accgcggagg tacgcgggcc
agcgcatctg ggcgccgggg 4020tcgacaccga tgccccggta gatccatctc tcgacgcgcc
agtgctcgtc ggaggagtag 4080accttggcca tgtggttgcc gaggggtttg tggacgagtg
ccagagcact cgtcagggcg 4140agcagttgga gcacgccggc gagtacggga cccatggctg
ctctcagaac ctctccggga 4200agatcagggc gaggacgaga tagcccagca gggagacggc
cacgaccagg ccgacgacgg 4260tctcggcggt cacagcttcg tcacccccct ggcgacaaca
gccaccagcg cgaagagcgc 4320gagcgtggtg acgacgaagg ccgtatcggc catcgcggac
tcctggaatg aggtgcggtg 4380gaaacggacc cttgcaggta agcgcctcac cgaccgaaac
aggacgtccg ttgacgtttc 4440ccttacggcg tgacgtacgt ctttgacgga actcttacgc
ctgaggtccg tgtccatgcc 4500cctcggtccg ttgcggacca tgcccccgcg gccaccggag
agggcggcgt ccccctcagc 4560gggccccgcc cgtctccccc ggggcctccg tctcccccac
cttctccacg accgtctcct 4620ccggcccgac cggccgtccg tccgcggcac ggatccgcaa
ggggcgcagc gggcggccgg 4680tgcggcggtc gagcaggcgg gagtggtcct cgccgggggc
gaagcactgc tcctcgcccc 4740actggcgcag ggcgacgatc accgggaaca aggcgcggcc
cttgtccgtg aggacgtact 4800cacggtggga accgccgtcc ggcgcgggca cgttgcgcag
taccccggcc tcctccagcg 4860cgcgcagccg cgccgtcagg atgttcttgg cgatgccgag
gctgcgctgg aactcgccga 4920agcggcgact gccgtcgaag gcgtcccgca cgatcagcag
cgaccaccag tcgccgatgg 4980cgttcaccga ccgggcgacg ggacaggggt cggcgtcgaa
acgggtgcgg gcgaccatcc 5040gcgtctctcc tctcctccgg caccccggat ccctccaggg
atggttgcaa catgctacct 5100cgtacggcta ccgtcctcgc cggtagcaag atgcaaccga
gtgagaggtg tgacggtatg 5160gcggtccagt gctccggtgc ggacggcgga tgcggcgaag
ccggtggtcg cggagcggcc 5220ggcacggcgc cgcccgcgcg gctcgtgccc ctgctcgccc
tggcctgtgg cagctccgtc 5280gccaccgtct acttcgccca ccccctgctg gtgaccctcg
gtgagcgctt cgcgctcggc 5340cccgggctgc tcggcgcgat cgtcaccgtg acgcaactcg
gttacgcggt gggcctgctg 5400acactcgtgc cgctcggcga cctgctcggc caccggcggc
tggtcaccgc tcagctcgga 5460ctgctggcac tggcgctgct ggccgccggg ctggcgccgg
gcgcggctgc gctgctcggc 5520gcgctcgccg cggtcgggct gctcgccgtc gtcgcccaga
cgatggtcgc ggctgccgcc 5580gccctgagcc cgcccgaccg gcgaggccgc gccgtgggaa
ccgtcaccgg cggcatcgtc 5640accggcatcc tgctggcgcg cgccgccgcg ggcgtcctcg
ccgacctcgc cggctggcgg 5700gcggtctacc tggcgtcggc gggcgtcacc gccgtcctcg
ccgtgctgct gcgccgtgcg 5760ctgcccccgg gatcgccgtc cgcaaaggct cgcgagacgt
cgtacgtacg gctggtggcc 5820tcgaccgtca ccctgttcgc ccgccatccg ctgctgcgga
tccggggggc cctggccctg 5880ctggtgttcg cggccttcag cacgctgtgg agcggcgtgg
cccagccctt gagcgatccg 5940ccgtggtcgc tgtcgcacac cgcgatcggc gcgttcgggc
tcgccggggc ggccggagcc 6000gtcgccgcac aggtggccgg gcgctggaac gaccgggggc
tcgcccggcg cacgaccggc 6060gccggcctcg cgctgctggc gctctcctgg ctcccgatcg
ccctgacccg gcaatcgctg 6120tgggcgctgg cgatcggcgc cgtcctgctg gacttcgccg
tgcaggccgt ccacgtcacc 6180aaccagaccc tcatccacgc cgtccggccc gaggcgggca
gcaggatcat cggtggttac 6240atggtcttct actccgcggg cagcagcctc ggcgccctcg
gttcctccct cgcctacgcc 6300acggcgggct ggccggccgt gacggccctg ggcgcgtcgt
tcagcgtcgc cgcgctgctg 6360ctgtggacgg cgacccgtcg tacggggctg cccggcgacg
acccggcggc cgaacggacg 6420gaccctggcc gtccgtccgg ggacagggct gccgggaggc
ccgcccgcag ccgctcttcc 6480ggcccccggt gaacgtctgg ggtggcgcgg ggcgcgcgat
aggggtccgc cgagagtcag 6540gggttgtccc gcctgtacag cggcatcggc ggtcggtcca
atggaaggcg tgcctccgga 6600tgcggggcgg gtcgtcgacg accctgtgcc ggcggcggat
tcccaccctc ccccggaagg 6660cgaggtcgtg atgaccgacg tccgtcatga cagcaggcag
acgggtccgg cgctgcgcgc 6720gctcagcgcg gcccggcggg cgcgggcctt ggcgtccgcg
atggcggcgg ccgccgcgga 6780gacgcggcag gccgtcgagg ccgcggacgg cacggaccgc
gcggccgccg tagccgagat 6840cggcgcggta ctggaggacg cctcccggca cacggacgcc
gccgccgagg ccgccgcctc 6900ggctgccgag gccgccgccc gggccgagac ggccgaggcc
gcccgcacgg tggccgccga 6960gtcggccgag gccgtcgtcg ccgccgccga gacggcggtc
cgggcggccc gggtcaccga 7020ggccgccacg agcgccgccg cccaggccgc ggccggtacg
gacgcggcgg gcgtgatggc 7080ggacgccgcg gcgcacaccc ggcaggccac cgccgagacc
gcggcgatcg ccgaggccgc 7140cgccgcggcg gccagcgcgg cccgggccgc cgtcggcgac
gaggcggcgg acggcgcgga 7200cccgtgccga cgggctgacg aggcggaggc cgcggccctg
cggctgtgcg aggacacgcc 7260gtggctgcgc aggcacctcc ccgacgtgtg aggcagggtg
cccggcggcg ccggcgcgag 7320atggaacccg ggccgggcgg ccctcttccc tccgggtgcc
ggcgacgaga ccgtcgcccc 7380cacctcgaac cgggccttcc ggtcgctgta ccggcatccg
gagatcgagc ggcggccgca 7440ctccgagggg gaccgggtgc tcggcggccg ggccgccacc
tggacggatc cgccgtcgct 7500ggagctgacg ggccgggtgg tccacgacgc gctgcgcctg
ttccccggac gggctgctca 7560ccgggtcatc accgaggaca cggggcgcgc agggcgcgcg
ccgccggccg gcggcgtcgt 7620cgcctgggtc gatcagcagc cccccgcggc gggttgctca
cgttcggcgg cgggcaggcg 7680tccgcccggt gcgcgttcag accgggcttt cgatgtgcgc
gcacagccgg gcgggcagtt 7740cctgacggcg gctgatctcc agcttgcggt aggcgcgcgt
caggtgctgc tccacggtgc 7800tgacggtgat gcagagccgg gcggcgatct cccggttggt
gtgcccgttg gcggcgagtt 7860ccacgaccct gagctccgat ccgctcagcg ccgcctcggt
ccgccccgtg ccgtccccga 7920acgactgccg cccaggacct cccggcagga tccgctcgca
cagagcccgc gccccgcagt 7980cgttcgccag gtgccaggcg cggcggatcg tggcgcccgc
ccgggtcgac tcgccgcgct 8040cccggtaggc ggcgcccaga tcggccaggg cacccgccag
ggccagccgg tcgccgctgc 8100tcttcaggtg gttcacggcc tcggtcagca ggttcagccg
atccggcggt tcggcgatct 8160gcgcgcgcag ccgcagcgag acgccgcgca catgaggatc
gtcgtccggg gtccgggcca 8220gttgttcccg caggagccga tcggccctcc tcggctcgca
cagccgcagg aacgcctccg 8280ccgcgtccga gcgccacggc atcagcgtcg gccggtcgat
cccccagcgc cgcagcagac 8340ggccggcgcc gaggaagtcg cggacggcgg cgaggggccg
gtccagggcg aggtagtagt 8400ggccccgggc gcgcaggtag gcggggccgt acacgctgcg
gaacagggcc tccggcaccg 8460ggtggtcgag ctgccgggtg gcctcgtcgt agcgccccat
cgcggtggcc gcgaacaccc 8520ggctggccag cggcccgccg atgaagacgc tgcggctgca
cctcggcacg caggccaggg 8580cctgacgggc gtactcctcg gtgtcggcga gcagcccccg
gcacagcgcg atctcggcct 8640ggagggcgag gaactgcgcc ttccagcccg gcagccggcg
accggcggcc tcgccgagca 8700gacgtgtgca ccagagcgcc gcggtctcgt acgagccggt
gcggcacagg acccggaccg 8760cgttgacgac gatcaccagg gtggtgtcgg tgagcggcag
gctcctcagg acgtgctccg 8820ccgcgtcgga ggccgaggcg ttggtgccgt cgccggggag
gtcccagatc ccggtcaccg 8880gcatccgcgg gcgggggctc tcctcgtccc cgggctccgg
gtccgtccgg ggacggatca 8940gcggctccca gagcgcggag gcgtggaagc ccgtctccag
ccggggagtt cgcgggtcgc 9000cgtgggggcc cggccgtccc atgacctccg tggcctcctc
cagccgtccg cagccgagca 9060gcaggtgacc cagccgttcg gtctcggcgc tcgtcagtcg
tccggcccgc agctcggtga 9120cgagttcggc gaggtggtcc tccgccgccg ccgggtcggt
gcgccgggtg gcgacggtca 9180ggcgcagcag gatctcggcg cgccggggcc ctcccgcaca
ggagcgccgg gccagttcga 9240gacaggagac ggcggtcagg acgtcgtccc gcatcagcag
ctgctcggcg gcgtcccgca 9300gcacggacat cgcccagggc ccggccgcgt gccgggcggc
gagcaggtgc cgggccacct 9360cgtccggttc cgcgccgacg tcgtacagca gcgcggcggc
gcggcggtgg aggtgtgcgc 9420ggtggtcgtg gtccagggtg tccagggcgg ccgcctcgac
cacggggtgc cggaagcggc 9480cggacgccgt caggccggtc gcctccaggg cgcgcagccc
gcgggccgcc atggcgcggc 9540cgatgccgag cagccgggcg atcacctcgg cgcagccgga
gtcaccgagg acggcgagcg 9600cgccggcgct gtgcctaacc aggctgtcgg tgcgggacag
tgaggcgagg acggcctggt 9660agaaccgccc gccgatgacg ggcgacgctg ctctccggcg
ccggccggca cgctcgtcgg 9720tgtgcccttg ggtgtggctt tcgacgagtt cttcgagcag
ggcatgcacc agcagcggat 9780tgccgccggt gacggcgagg aggtcgtccg cgggcagggc
ctcgacggcc ggtccggggc 9840gggcggcgcg cagtccggac acggcgcgca gggagaggcg
gccgagcatg acgcgctgga 9900gggccggctg gcacagcagc tcggcctcca cggcggggtc
ggccgccagg ccggacggca 9960gcgcggtgca gaccagcagc agtcgggtgg cgcggggatg
gtcgacggcc tggagcaggc 10020agtgcaggga ctgcgggtcc gcgtggtgca ggtcgtcgat
gccgatgacg accggcgcgg 10080cgccggtgag ctggtggagc gccgcccgca cacgctgcgc
ggccggggtc tccgtcccga 10140cggcgtcctg gagcagtgag cgctgggcgt ccgggatgtc
ggggtcgacc gccagttgcc 10200gcaggaggtc gaaggggcgc cggccctccg gcgggcttcc
ggcggaccgg aggaccagga 10260agcccgatgc cgccgcgtgc ttgagcgcct cccccaggaa
cgcgcttttc ccgcagccga 10320gtcctccttc gacgaccagc acccgcaccc ggccggcggc
gcacgcctcg agcgccgttc 10380tcacggcatg ggactgcctg cccaggccga ggaacgtgag
cccttgcggc tcccgcacgg 10440acaccgaagg ggaaacgccc cgcataatct ccctctgact
ccctcccccg aagaccgggg 10500gctttacgga ttcgtaccaa caggaaagcc cacaagtcga
cgagatactg cccctctccc 10560gaagccgcca cacgcgcacc ccgatacgag aatgagccaa
tgagcaagcg tggtggccga 10620gttgatacga acccgtgaat ttacgttatt tcgctcaccc
tttcgagcgt gtggagagtc 10680ctcggaatgg gcggccggga ggttgggcag cctccgcggg
acggcgagcc attcgcgagg 10740tcacgcggac acgcgtgttg cgataatcgc acttaaggag
aggacgagcg atgcccgacc 10800tttgcgagac cgaatccctc tggctccggc ggttccagcc
ggctcccgcg gcccggacgc 10860ggctcatgtg cttcccgcac gcgggcgggt ccgccagcgc
ctatctgcgc ctggcccggt 10920ccctcgcccc cggcatcgag gtcctggcgg tccagtaccc
cggacgacag gaccggcgcg 10980ccgagccctg cccggactcc gtcgaaggcc tggcggacga
tctgttcgcg gccgtccggc 11040accgcgtgga cgcgtcgacc gcgctgttcg gacacagcat
gggcgcggtc ctcgccttcg 11100agctggcccg gcggctggag cgcgacgcgg gggtccgctg
cgcccggatc ttcgcctcgg 11160ggcgccgggc accctcccgg ttccgtgacg actccgcccc
ggccgccagc gacgcctcga 11220tgctcgccga gatgcggact ctcggcggaa ccgacctgcg
ggtgctccag gacgaggaac 11280tgctgatcgc cgcgctgccc gcgctgcgcg ccgactaccg
cgcgatcggg acctaccgcg 11340ccgccgacga cgccgtggtc ggctgcccgg tcaccgtgct
ggtcggtgac gccgatccga 11400ggaccagcct cgacgacgcc cacgcctgga gcgcccacac
cacggcggag tccgaggtgc 11460tcaccttctc cggcgggcac ttcttcctcg acgcccacca
cgacgcggtg gtggaggtcg 11520tcaccgcgcg cctgcggcag gaccgcgcgc cccggccgga
ccgggtgtga gggggcccgg 11580cccgaagggc cgggccgctc cgcgcgtctg ccggcaccgg
gccgcaccgg acccggcgcc 11640ggcagacgcg cggcgacctc acatcatggc gggcgccagg
gccattcccc cgctggcgtc 11700cagcagttgg ccggtgatcc agcgggcgtc gtcggagacc
aggaaggcga cgatgccggc 11760gatgtcgttc ggccggccca gccggccgag cgcggtcagg
gccgagatgc ccgcctcggc 11820ccccggggtc tcgcgcaccc agcggttcat gtcggtgtcc
gtgatgccgg gggccacggt 11880gttgacggtg atgccgcgcg aaccgagttc gttggcgagc
cggggagcca tcatctccag 11940cgcccccttg gtcatggcgt agggcagcag cggccaggcg
atccgggtga cggccgagga 12000gacattgacg atgcgtccgc cgtcggccat cagtgacagg
gcccgctggg tcacgaagaa 12060cggtgcccgg acgttgatgc ggtacacgcg gtcgaactcc
tcgggcgtgg tgtccgacag 12120gccggggaca tagccgtcct gtgccgcgag cgccgggtcg
ccgggggcgg gggcgacggc 12180cgcgttgttc accaggatgt gcagcggacg cccctccagc
tcccgctcca gtgcggtgaa 12240gagctcatcc acggcgtcgt cccggaggag gtccgcccgg
accgcgaagg cccgtccccc 12300cgcgcgttcg atcgtctcca ccgtctcctg ggcgctcttt
tcctgcgttc cgtagtgcac 12360ggcgacccgg acgccctcgg cggcgagtcg ctgggcgatg
gcttttccga tgccgcgcga 12420ggcacccgtg accaaggccg tcctgtcgtt caattccggc
atcccgaatc cccttctgcc 12480gattatctta cttttcctct tgatgcatgg ggtcggaccc
gaggccagat ccgcaccccg 12540gccacgcgtg aggtcgcgac ctcaccgatt actgtgccag
agtccaggcg acacacggga 12600gggcgggaat gcgatcgatt tccgcacccg gaactcgtag
ggggagcaag aagatcggcc 12660gaatacccct ggggtggata gggggtacca ggaccgtcgg
gcgatcacta ttttgaaaca 12720cgactccggc gcgcggccgg cggcgaaagt cctctccatg
ccgggctgtc ccctgcctcg 12780aaatacctgc ggcgactttc gccctgcgat gcggccgccc
atccctgccg agcggtgagg 12840agacgacaag tgcacgagac acacgcgcac ggcgaggaag
ggtcgtccga cgggtccgcg 12900gacgcagtgg tcttcgtctt ccccggacag gggtctcagt
ggccggggat gggtgcggaa 12960ctgtgggaca cctccccggt gttccgcgag agtgtgcgcg
cctgcgccga cgcgctcgcc 13020ccgtacctcg actggtccgt cgaaggcgtc ctgcgcggcg
ccccggacgc cccggccggc 13080ccggcgctcg atcgcgccga cgtcgcgcag ccggccctgt
tcaccctcat ggtgtcgctg 13140gccgagctct ggcgctcgca cggagtcgaa ccctgcgccg
tcctcgggca cagcctcggc 13200gagatcgccg ccgcgcatgt ggccggcgcc ctgaccctgg
ccgacgccgc ccgggtggcg 13260gccctgtgga gccgggccca ggccacgctg tcgggcaccg
gcacccttct cgcggccaag 13320gccgcccccg aggaactggc accgcacctt cagcggtgga
acggcgacga ccggcacggc 13380acccggctcg cgatcgccgg cgtcaacggg cccggcagca
cggtggtggc gggggacctc 13440gacgcgatcg ccgcgctggc cgccgacctg gcctcggcgg
gggtgcggac ccgccgggtc 13500gccgtcgacg tgcccaccca ctcccccgcg atgcggaccc
tgcgggaacg gatcctcacc 13560gacctggcct ccgtcgcccc gtgcgtctcc cgtctcccct
tccactcctc gctcaccggc 13620ggtctggtgg acacccgcgg gctggacgcc gactactggt
accgcaacat cagcgagacc 13680gcgcgcttcg acctcgccgc ccgcggtctc ctggccgacg
gacaccggac gttcgtggag 13740ctgagcccgc acccgatact caccctgggc ctgcaagcgc
tcgccgacga cgtccccggc 13800gccgccgacg cgctcgtgac gggcacgctg cgccgcgggc
gcggcggaat gcggcagttc 13860caggacgcgc tcggccggct cagcgtcccc gcgggcgggc
ggcccggccg cgaggtgagc 13920gccgcggccc tggccggccg gctggcgccg ctctccccgg
cgcagcagga gcatctgctg 13980gtggaattgg tctgcgccca cttcgccgca ctcgtcggcg
gcgacggcgg ggcgccgccg 14040acggtgcggc cgtcggccgc cttcaccgat cagggctgcg
actccgccac cgccctggag 14100ctgcgcgacc ggctccgcga ggcgaccggg ctgcgcctgc
ccgccacgct ggtcttcgac 14160cacccgacgc cggccgcggt cgccggccgg ttgcgccgac
tcgccctcgg gatcgaggag 14220acggcggaca cggcaccggt cgccgtccgc ggccaccggg
agggcgaacc gatcgcgatc 14280gtcgggatgg cctgccgctt cccgggaggt gtccggtcgc
cggaggacct gtggcggctg 14340gtcaccgaag gcggtgacgc gctcgggccg ttccccaccg
accgcggctg ggacaccggc 14400cgccacgcgg aggacccggc cacacccggc acctacgtcc
agggcgaggg cggattcctg 14460tacgacgcgg gcgagttcga cgccgagttc ttcgggatct
ccccgcgtga ggcgctggcc 14520atggacccgc agcagcggtt gctgctggag atggcgtggg
agaccttcga acgggcggga 14580atcgatccca cctcggcccg gggatcgcgt accggcgtct
tcgccggggt cctcccgctc 14640ggctacggcc cccgcatgga cgagacggac cagggcaccg
ccgacctcca gggccatctc 14700ctcaccggca cactgcccag cgtcgcctcg ggccgcatct
cctacaccct cggcctggag 14760ggcccggcgg tgtcggtgga gacggcctgc tcgtcgtcgc
tcgtcgccct ccacctcgcc 14820tgccgctcgc tgcgggcggg cgagtgcgac ctcgccctga
cggggggcgt ctcggtgctg 14880gccaccctcg gcctgttcgt cgagttctcc cggcagcgtg
gactgtcggc ggacggccgg 14940tgcaaggcgt acgcggcggc ggccgacggg accggatgga
gcgagggtgc cgggctgctg 15000ctggtcgaac ggctctccga cgcacggcgg ctggggcacc
gggtgctcgc ggtggtccgg 15060ggcagcgcga tcaaccagga cggcgcgtcg aacgggctga
ccgcccccag cgggccgtcc 15120cagcagcggg tcatccgcga ggccctggcc gacgcgggcc
tgacggcggc ggacgtcgac 15180gcggtggagg ggcacgggac cggcacacga ctgggcgacc
cgatcgagat cgaggcgctg 15240ctcgccacct acggacaggg acgcgcccgg gaacggccgc
tgtggctcgg atcgctgaag 15300tcgaacatcg gtcacaccat ggccgcggcg ggggtgggcg
gggtcatcaa gatggtgatg 15360gcgctgcggc acggggagct gccccgcacc ctgcacgtgg
acgcgccctc gccccgggcc 15420gactggtcgg cgggcgaggt acggctgctg acggaggccg
tcgcgtggcc cgcggcggcg 15480gacggtgagc cgcggcgggc cggggtgtcg tccttcggcg
tgagcggcac caacgcgcac 15540gccatcctgg aggaggcgcc cgccccggag gacgaggaac
cggcgccgcc ggacggtgaa 15600gcactactgc cgtgggcggt gtccacgcgg tcggaggccg
cactgcggac gcaggcacgg 15660atgctggcgg acgtcgtacg cgacgacccc ggagtcggac
tcgccgatgt gggtgcggag 15720ctggcccggg ggcgggcggc tctcgagcac cgggccgtcg
tcatcgcctc cgggcgcgcg 15780gagttcgcgc gggcgctgga ggcggtggcg tccggcgagc
cgcacccggc cgtggtccgg 15840ggccacgcgg ggagcgagcg cggcggagtg gtgttcgtct
tcccgggcca gggcggtcag 15900tgggccggca tgggactcga cctcctgcga agctcaccgg
tgttcgcgga gcacatcgcg 15960gcctgcggca aagctctggc cccgtgggtg aagtggtcgc
tcacggaggt gctgcaccgg 16020gacgccgagg atccggtctg ggaccgggcc gacgtcgtcc
agccggtgct gttctcggtc 16080atgacgtcgc tggcggcgct gtggcgctcg tacggcgtcg
agccggacgc cgtgaccggg 16140cactcgcagg gggagatcgc cgccgcgtac gtctgcggag
cgctcggtct ggaggacgcc 16200gcacggacgg tggcgctgcg cagccgcgcc ctggtggcgc
tgcgcgggcg gggcggcatg 16260gcgtccgtcg cctccgccgc cccggacgtc gaggagctca
tcgcgcggcg ctggcccggc 16320cggctgtggg tcgcggcgtt caacggcccc ggcgcggtga
ccgtttccgg ggacggtgat 16380gcgctggagg agttcctggg ccactgcgcg gacacggagg
tgagggctcg gcgcgtcccg 16440gtggactacg cctcccactg cccgcacacg gaggcgatcg
agcgggaact gctcgacgcc 16500ctggaggaca tcaccccccg gccggcggcg gtcccgttct
attcgacggt cgacgacgcg 16560tggctggaca ccacacggct ggacgcctcc tactggtacc
gcaacctgcg ccggcccgtc 16620cgtttcagcc aggccgtgcg cgccctcacg gacggcggcc
accgcgtctt catcgaggcg 16680agcccgcatc ccaccctcgt ccccgccatc gaggaccacg
gcgacgtcac cgccctcggc 16740accctgcgcc gccacggcga cgacaccgag cggttcctca
ccgccctcgc ccacctccat 16800gtcaccggag ccgccggcca ggacctctgg cgccaccact
acgcccggct caggcccgcc 16860ccccgccacg tcgacctgcc cacctacgcc ttccagcgcg
accggtactg gtggagcggc 16920ggcgccgggc gcggggacgt caccaccgcc ggtctgcacc
ccggcggcca tcccctcctc 16980ggcgccgcgc tggacctcgc cgacggcggc ggccgcctcc
acaccggccg tgtctccctg 17040cgcacccacc cctggatcgc cgaccacggc gtcgcgggca
tcaccctcct gcccggcacc 17100gccttcctcg aactcgccct gcacacgggc gagtcgggga
acgtgcggga actcaccctg 17160cacgcgcccc tggtcgttcc cgacgaggag ggcgtcgacc
tgcaagtgca cctcgcccgg 17220cccgacgaag cgggcctgcg cgccctgacc cgtcttctcc
cgggccgcgg ggtgccgacc 17280ccgagagccc cctggcagcc ccacgccacc ggccttctcg
ggccggccga ccgagcaccc 17340ggctcctccg gcctcgagcc gcacgacctg ggcggcgcct
ggcctccgcc gggggcggtc 17400cccctcgtcc ccggcgaact cggcgacgtg cccggctgct
acgcccgcct ggccgacgag 17460gggttcgagt acgggccggc cttccggggg ctgcgtgcgg
tgtggcgccg cggcacggag 17520atcttcgccg aggtcgccct cccggccggc gacggctccg
tgttccggct gcatccggcg 17580ctgctggacg ccgtgctgca ccccgtcgta ctcgggctgg
tggacggcgt gccggcccgt 17640ccgctgccct tctcctggaa cggcgtggcg ctgcacgccc
ccgcgagcgg cgcgctgcgg 17700gtgcgcctcg cgccggccga cgacggcgct gtcggcatca
cggccgcgac ggccgccggt 17760gagccggtgc tctcggtcgc cgcgctggcc ctgcggtccg
cctcggcgga gcagttgcgc 17820gcggcgatcc gctccgcggc gggctcgcgc gacgccctct
acgagctgga ctggctgccg 17880ctcccggcgg accgggccgc ttcgcccggt ggggccgaca
tcgcggccct gggcacatcg 17940gagctgccct gccgtacgta cgagaccatc gcggagctgt
cgcaggccct cgccgacggt 18000gctcccgccc ccgacgccgt cgtctccgac gtcggcgccg
tcggcgggcc gctggacacc 18060gtgagcctgc acggcctctg ccggcgcggg ctggaactcg
tgcaagcctg gctgggcgag 18120ccccggacgg ccgacacgcg gctggtgctc gtgacgcgtg
gggcggtcgg ctgtgccccg 18180gccgagccgg tcgccgatcc ggccgcggcc gcgctgtggg
ggctggtgcg gtccgcgcag 18240gcggagcacc ccggacggct gctcctgctg gacctcgacc
ccgccgggtc gcggcccgtc 18300tccggccgcc tggtggaaca ggcggtggcc tgcggtgagc
cgcacatcgc cgtacggggc 18360gacggcctgc gcgtcccccg gttgtcccgc gcgacggccg
cccccgcaca ccctcccgcc 18420ggtggccggg aagcgcagtg ggacccggaa gggaccgtcc
tcatcaccgg cggcaccgga 18480agtctcggcg cgctgttcgc ccggcatctg gtgaccgcgc
acggggtacg gcggctgctc 18540ctcgccagcc gcagtggccc cggcgccccc ggcgccgccg
ggctgcggga cgaactgacc 18600gctcacggag ccaccgtcac cgtcgccgcc tgtgatgtgg
ccgaccggga ggccgtcgcc 18660gccctcctgg cgtccgtgcc gtccgagcac ccgctgaccg
ccgtagtgca caccgccggc 18720gtgctggacg acggcgtact cgcctcgctc accgccgacc
ggctggcccg cgtcctgcgt 18780gccaaggccg acgccgcgct ccacctgcac gatctcaccc
gcgatctgcc gctcgccgcc 18840ttcgtcctct tctcctccgt cacggcgacg ctcggcacac
ccggccaggc caactacacc 18900gccgccaacg cgttcctcga cgcgctcgcc cggcatcggc
gcgccgcggg cctgcccgcc 18960gtctcactcg cctgggggct gtgggagcag accggcgggc
tgaccgatca cctcggatcg 19020gtcgacctgc ggcggatggc ccgcaacggc ctggtcgcgc
tgcccgccga cgccggcctg 19080gcgctcttcg acaccgcgct ggccctggac cgcgccaacc
tggtcccggc gcggctcgac 19140ctgcccgcgc tgcgccgcgc cacacacgtg ccgcccgttc
tgcggcggct ggtcgaggtg 19200ccgggggcgc cgagcgcgga ccggtccgcc gggtccggcg
gcgaggtgag gccgctgcgt 19260gagacgctgg ccgggctgga cgaccggaaa cgccccgctg
ccgtctcccg cctggtccgc 19320aggcacgtcg cgtgggtgct cggcgccgac ggtccggagt
cggtggacga ggaccgcagc 19380ttccgcgacc tcggcttcga ctcgctgatg gccgtcgaac
tgcgcaacca gctcaacacc 19440gccgccggca tccggctcgc ggccaccctc gtcttcgacc
acccgacacc gtcggccgtg 19500gcgcggcacc tcctcgaccg gtgctcgccg gacccggccg
ccccggccgc tccctcgggt 19560acggcggtcg cgtcggcgct cgccactctg gccgagctgg
agacggcttt gaacggcatc 19620ccggccgagg agtggacggc cgccgggggc ccggcccggc
tgatgacgct ggcgtcctcg 19680ctgcccgcgc ccgcgtccgt ccctcggaca ccggcggccg
gcgaagccgc cgagaagctc 19740gcccacgcct cgcgcgacga gatcttcgcg ttcatcgatc
gggagctggg gcgtgactcc 19800gggccagcct caccctctcg cctcggtccg cagacccccg
actcgacaga caaggcgccc 19860tttcatggag aatgaggaaa agctcctgga ctacctcaag
tgggtcaccg ccgatctgca 19920ccgctcgcgg gaacgcgtca ccgagctgga ggaggccggc
cgggagccga tcgccatcgt 19980cgggatggcc tgccggttcc cgggcgaggt gcggtcgccg
gaggagctgt gggggctggt 20040cgcctcgggc ggcgacgcga tcggggcgtt cccggacgac
cgcgggtggg atctggacgg 20100gctgttcgac cccgacccgg agcgtgcggg cacctcgtac
acccggcgcg gcggtttcct 20160gtacgacgcg gcggagttcg acgcgggctt cttcgggatc
tccccgcgtg aggcgatggc 20220gatggacccg cagcagcggc tgctgctgga gacctcgtgg
gaggctttcg agcgggccgg 20280catcgacccg tcctcggtac gcgggtcccg ggtcggtgtc
ttcgccggcc tcatgtacca 20340cgactacgcg gcggcccagg gcagcaccgg ggacggagac
ggggagccgg acttcgaggg 20400ctacctcggc gacggcagcg tcagcagcat cgcctcgggc
cgtatcgcct acaccctcgg 20460gctcgcgggc gcggcgatca ccgtcgacac ggcctgctcc
tcttccctgg tcgccctgca 20520cctcgcctgc caggcgctgc gcaccggcga ctccgagctg
gccctggccg gcggggtcag 20580cgtcatgtcc accccccgca ccttcgtcca gttctcgcgg
cagcggggcc tgtcggcgga 20640cggccggtgc aaggcgtacg cggcggcggc cgacgggacg
gggttctccg agggcgtcgg 20700catggtgctg gtcgaacggc tctccgacgc ccggcggctg
gggcatccgg tactggcggt 20760cgtgcggggc agcgcggtca accaggacgg cgcgtcgaac
ggtctgacgg cgcccaacgg 20820accgtcgcag gagagggtga tccgcgaggc gctggccaac
gcgggcctga cggcggcgga 20880cgtcgacgcg gtggaggggc acgggaccgg gacacggctg
ggtgacccga tcgagttgca 20940ggcgctgctc gccacctacg gacagggacg cgcccgggag
cggccgctgt ggctcggatc 21000ggtgaagtcc aacatcggtc acgcgcaggc ggcggcgggg
gtgggcggcg tcatcaagat 21060ggtgatggcg ctgcggcacg gggagctgcc gcgcaccctg
cacgtggacg cgccctcgcc 21120ccgggtcgac tggtcggcgg gcgaggtacg gctgctgacg
gaggccgtcg cgtggcccgc 21180ggcggcggac ggtgagccgc ggcgggccgg ggtgtcgtcc
ttcggggtga gcggcaccaa 21240cgcccatgtg atcctggagg aggcgcccgc gtcggagggc
gaggaagctc cgccgccgga 21300gcccgggtcg ccgttgccgt gggtggtgtc cggtcactcg
gaggcgggct tgcgcgccca 21360ggcgcaggct ctggcggagt tcgcacggac cgcgcccggg
gccgaactcg tggacgtggg 21420agcggcgttg gcccgggggc gggcggcgct ggggcatcgg
gcggtcgtcg tcgcctcgga 21480gcgtgaggag ttcgagcggg cgctggccgc gctggcctgt
ggcgaaccgc acccgtgtgt 21540ggtcgacggg tcggcggacg gccggcgcga ggacggtgtg
gtgttcgtct tcccgggcca 21600gggcggtcag tgggccggca tgggactcga tctgctgacg
acctcggggg tgttcgccga 21660acatatcggt gcgtgtgaac gcgcgctggc gccgtgggtg
gagtggtcgc tgacggagat 21720gctccaccgc gaggcggagg acccggtgtg ggagcgggcg
gacatcgtcc agccggtgct 21780gttctcggtc atggtgtccc tggccgcgct gtggcggtcc
tacggcatcg aacccgacgc 21840ggtggtcggc cactcccagg gcgagatcgc cgccgcccac
gtctgcggcg ccctcaccct 21900cgaagacgcc gcgaaagtcg tggcactgcg cagccgggcc
ctggccgcac tgcggggccg 21960cggcggcatg gtctccctct cgctgtcgac cgcggatgcc
ggggagctgg tggagcggcg 22020gtgggccggg cggctgtggg tcgcggcgct caacgggccg
gaggcgacga cggtctcggg 22080ggacgtcgac gcgctggagg agctcctggc ccactgcgcg
aaaagcgagg tgcgagcgcg 22140gcgcgtcccg gtggactacg cctcccactg cccgcacacg
gaagcgatcg cggaagagat 22200cgtcgattca ctcggggaca tcacgccccg ggccgccacc
gttccgttct actcgacggt 22260cgacgacatg tggttggaca ccacacggct ggacgcctcc
tactggtacc gcaacctgcg 22320cctcccggtc cgcttcagcc aggccgtgcg cgccctcacg
gaagaaggcc accgcctctt 22380catcgagacg agcccgcatc ccaccctcgt ccccgccatc
gaggaccacg gcgacgtcac 22440cgccctcggg accctgcgcc gccacggcga cgacaccgag
cggttcctca ccgccctcgc 22500ccacctccat gtcaccggag ccgccggcca ggacctctgg
cgccaccact acgccaggct 22560caggcccgcc ccccgccacg tcgacctgcc cacctacccc
ttccaacgcc ggcgctactg 22620gctggagaaa cccgacccgc agaccaggcc ccagcggtcc
cgctccaccg ccccggacct 22680cgacaggctg gaggcggagt tctggcaggc cgtcgaggaa
accgacaccg acaccctcgc 22740ccacaccctc cacctcgaca cccagaccct cgaacccgtc
ctccccgccc tcgccacctg 22800gcaccaacaa caacgcgacc acgcccgcat caacacctgg
acctaccagg aaacctggaa 22860accactccac ctccccacca cccgacccac cacccccacc
agctggctca tcgccatccc 22920cgaaacccac cgcaaccacc cccacaccac caacctcctc
accaacctcc cccaccacaa 22980catcaccccc atccccctca ccatcaacca caccaccgac
ctccaccacg cctaccacca 23040cgcccaccac cacaccaccc cacccatcac cgccgtcctc
tccctcctcg ccctcgacga 23100aacaccccac ccccaccacc cccacacccc caccggcacc
ctcctcaacc tcaccctcac 23160ccaaacccac acccaaaccc acccaccaac ccccctctgg
tacctcacca cccaagccac 23220caccacccac cccaacgacc ccctcaccca ccccacccaa
gcccaaacca tcggactcgc 23280ccgcaccacc cacctcgaac acccccacca caccggcgga
cacatcgacc tccccaccac 23340accccacccc aacaccctca cccaactcat caccgccctc
acccaccccc accaccaaca 23400caacctcacc atccgcaccc acaccaccca cacccgacga
ctcaccccca ccaccctcca 23460acccaccacc cccacaccac ccaccaaccc ccacggcacc
accctcatca ccggcggcac 23520cggcgccctc gccaccaccc tcgcccacca cctcgccacc
accggcaccc aacacctcct 23580cctcaccagc cgacgcggcc cccacacccc cggcgcccga
caactccaca cccaactcac 23640ccaactcggc accaacacca ccatcaccgc ctgcgacctc
tccgaccccg accaactcac 23700ccacctcctc acccacatcc cccccgaaca ccccctcacc
accgtcatcc acaccgccgg 23760catcctcgac gacgccaccc tcaccaacct cacccccacc
caactcgaca acgtcctgcg 23820cgccaaagcc cacaccgccc acctcctcca ccacgccacc
ctccacaccc ccctcgacca 23880cttcgtcctc tactcctccg ccgccgccac cctcggcgcc
cccggccaag ccaactacgc 23940agccgccaac gcctacctcg acgccctcgc ccaccaccgc
cacacccaca acctccccgc 24000caccaccatc gcctggggaa cctggcaagg aaacggcctc
gccgactcgg acaaggcccg 24060cgccaacctc gaccgccggg gcttcctgcc catgcccgag
acgctggccg cagccgcggc 24120cgtgcgggcg atcgagagca ggcggccgtc cgtggtcatc
gccgccatcg actgggccag 24180agccgagcgc acccccgacg tcgaggatct cctccccgcg
gccgacgagg ggtcgtcgag 24240tggcaagccg gaggccgcgc cggtggacct gcgcggtacc
ttgagccggc agtccgccgc 24300cgaccaacag gccacactgc tcggcctggt gcggacccag
gcagccgtcg tactgcgcca 24360cacggagccc gaggcgctcg ccccgggcca ggccttccgg
gcgctcggct tcgactccct 24420caccgccgtc gaactccgca accgactggc caaggccacg
gacctcgcgc tgcccgcctc 24480actggtcttc gatcacccga ctccggtgaa gctcgcggag
ttcctgcgca ccgagctgct 24540cggcaccgca ccagctacca ccgccgccgt cccggccctc
caggcacaca ccgacgaacc 24600catcgccatc atcggcatgg cctgccgctt ccccggcgcc
gtcaccacac ccgaacacct 24660gtggaacctc atcgccaccg aacaagacgc catcggcgag
ttccccaccg accgcggctg 24720ggacctggac aacctctacc accccgaccc cgaccacccc
ggcaccacct acacccgcca 24780cggcggattc ctccacgacg ccggcgactt cgacgccgac
ttcttcggca tcaacccacg 24840cgaagccctc gccatggacc cccaacaacg actcctcctc
gaaaccgcct gggaagccat 24900cgaacacgcc ggcatcctcc ccgacgccct gcacggcacc
cccaccggcg tcttcaccgg 24960cgtcaacgcc caggactacg ccgcacacac ccacacctcc
ccccacacca ccgagggcta 25020caccctcacc ggaaccgccg gcagcatcgc ctccggccgc
atcgcctacg tcctcggact 25080cgaaggcccc gccgtcacca tcgacaccgc ctgctcctcc
tccctcgtcg ccctccacct 25140cgcctgccag gccctgcgag caggcgaatg caccacagcc
ctcgccagcg gcatcagcat 25200catgaccaca ccgctggcct tcaccgagtt ctcccggcag
cggggtctgg cggcggacgg 25260ccggtgcaag gcgttcgcgg cggccgccga cggtaccggc
tggtcggagg gggtggggac 25320gctgctgttg gagcggttgt cggacgccga gcggaacggg
caccgggttc tggcggtggt 25380gcggggcagc gcggtcaacc aggacggcgc ctccaacggg
ctgacggcgc cgaacggtcc 25440gtcccagcag cgtgtgatcc gccaggccct ggtcaacgcg
aacctctccg cagttgatgt 25500cgacgccgtc gaagcccacg gcacggggac caagctgggc
gacccgatcg aagcccaggc 25560cctgctcgcc acctacggcc agggacgtgc gcaggaacag
ccactgtggc tcggttcggt 25620caaatccaac ctgggtcaca cccaggcggc ggcaggcatg
gccggcctga tcaagatggt 25680gatggcgctg cggcacgagt cgttgccgcg gacgttgcat
gtggatgagc cgtcgccgga 25740ggtggactgg tcgtcggggg cggtgagtct gctgaccgag
gcgcggccct ggccgcgggt 25800cgaggaccgg ccccggcggg ccggggtgtc ctcgttcggg
gtgagcggga cgaacgccca 25860cgtcatcgtg gaggaggcgc ccgcgccgac gggagtggag
gcggtggaag ccgcgccggc 25920gggggtggag actgcggcgg ctgcggcggt ggtggtggag
acggacggtg cgggccgggt 25980gtcggcggat ctgccgttgg tgtgggtggc gtcgggcaag
tcgcaggccg cgatacgcgc 26040ccaagccgcc gccctgcacg cccacgtcct ggaccacccc
gaacaggacg cggacgacat 26100cggctacagc ctggccacca cccgcgccct gttcgaccac
cgcgccaccc tcatcgcccc 26160cgaccgccac accgtcccgg agcccctcac cgggctgggc
gacggacgca cgcaccccca 26220cctcatcccc acacccccca ccgaacccgg ccacacccac
aaaatcgcct tcctctgctc 26280cggacaaggc acccaacgcc ccggcatggc caccggcctc
taccacacct accccgcctt 26340cgccgccgcc ctcgacgaaa cctgcgccca cttcgacccc
cacctcgacc accccctgca 26400cgacctcctc ctcaaccacg accccaccga cctcctcacc
cacaccctct acgcccagcc 26460cgccctcttc accctccaaa aagccctcca ccacctcatc
accgaaacct acggcatcac 26520cccccactac ctcgccggac actccctcgg cgaaatcacc
gccgcccacc tcgccggcat 26580cctcaccctc cccgacgcca cccacctcat caccacccgc
gcccgcctca tgcaaaccat 26640gccccccggc accatgacca ccctccacac cacccccgaa
cacatccaac ccctcctcga 26700ccaacacccc ggcaaagccg ccatcgccgc cgtcaacagc
ccccactccc tcgtcatcag 26760cggcgacccc gacaccatcc accacatcac caccacctgc
cacaaccaag gcatcaccac 26820caaacccctc gccaccaacc acgccttcca ctccccccac
accgacacca tcctcgaaca 26880actcgacacc accacccaca ccctcaccta ccaccaaccc
cacacccccc tcatcaccag 26940cacccccggc gaccccctca ccccccacta ctggacccac
cagacccgcc aacccgtcca 27000ctggaccgac accatccaca ccctccacac ccacggcgtg
accacgtaca tcgcactcgg 27060accagagcac accctcacca ccctcaccca ccacaacgtc
ccccaccacc aacccaccgc 27120catcaccctc acccaccccc accacaaccc cacccaccac
ctcctcaccg cactcgccca 27180cctccacaca acccaaccca ccggccccaa catctggcac
caccactaca ccccagtcgc 27240acccgccccc cgccacgtcg acctgcccac ctaccccttc
ccacgccggc gctactgggt 27300gcaggcgtcc gccggtacgg gtgacgtgtc ggctgccggg
ctccagcgac cggaccaccc 27360actgctcggc gcggtgatgg agctcgcgga cggggacgga
atcgtcctca ccgggcgctt 27420gtccctgcac acccacccct ggctcgccga ccacagcgtc
ggcggcgtcg ccctccttcc 27480cggtaccgct ctgctggagc tggcttttca ggctggtctg
cgtgcgggtt gtcctggtgt 27540cgatgagctg actctccatg ctcctctggt ggttccggag
tcggggcatg tggtggtgca 27600ggtgtcggtt tcggtgccgg gcgaggcggg tcgtcgtggt
gtgagtgtgt acgggcggct 27660ggtggaggac ggggggctgg agggtgagtg gacgcggcat
gccgagggtg tggtgtgtcc 27720gtctgttcct ggggagtcgg tggttgtgga gccggtggcg
gacggggtgt ggccgccgtc 27780cggtgcgcag ccggtggatc ttgaggagtt ctacggtcgt
ctggcgggtg ggggttttgt 27840ctacggtccg gtgttccagg gtttgtgtgc ggcctggcgg
gacggggacg acgtggtggc 27900cgaggtgcgt ctgccggacg aggggctggc cgatgtcgcg
ggcttcgggg tgcatccggc 27960gctcctggac gcggccgtgc aggcagtcac cctcctgttc
ccggaccagc agcaagccgg 28020tctcgcggcc cacacatgga acggtgtctc gctccacgcc
cggggcgcca ccgtcctgcg 28080cctgcgcatg actcccaccg acgcgacctc gaccgccgtt
cgcctgcacg ccaccgacga 28140gaccggagca cccgttctca ccctcgactc gctcctgatg
cgtccggtgc cgttggaggg 28200gctgggggcg ggggtgcggc gtggctcgtt gttcgagctg
gggtgggtgc cggtggaggg 28260gatgccggcc tcggtggccg gtgggggcgg ggagttggtg
gcgtgggagt gcccgggtgg 28320tggggtggcc gaggtcacgg ccgcggcgtt gggagtggtg
caggagtggc tcgccgatga 28380gcgggagggg gacgcgcggc tggtcgtggt gacgcgtggt
gcggtcgcgg tggatgcggg 28440ggagccggtg cgggacgtgg cgggggccgc tgtgtggggg
ctggtccgct cggcccagtc 28500cgagcatccc gaccggttcg ccctgctcga cctcgacccc
gacaccaaga ccgaccccgg 28560catcgacacc gacggggaca ccgacgtgtc cgccgacgcg
aaggtcggca ccggtgatgg 28620tctcgacgat gccgccgtcg cgtccgctct ggcccgcggt
gagagccaac tcgccgtacg 28680cgacggggtg gttcgcgtag cgcggttggg gggtttggtt
ggggggttgt cgttgcctgg 28740tggggtgggg tggcggctgg atggtggtgg gtcggggttg
ttggaggggg tgggtgtggt 28800tgcttcggat gcggctgggg tggtgctggg tcgggggcag
gtgcgggtgg cggtgcgggc 28860tgccggggtg aacttccggg atgttctggt ggcgttgggg
atggtgccgg gtcaggtggg 28920ggtgggcagt gagggtgcgg gggtggtggt ggaggtgggg
cccggggtgg agggcctggt 28980ggtgggggac cgggtgttcg gggtgttcgg ggacgcgttc
gcgccggtgg tggtggcgca 29040ggaggtgttg ctggcccgta tcccggaggg ctggtcgttc
gcgcaggcgg cttcggtgcc 29100ggtggtgttc gctaccgctt acctgggact ggtcgatctg
gcgggggtgc ggcgggggga 29160gagtgtgctg gtccatgcgg cggccggcgg ggtcggtacc
gccgcggtgc agctcgcccg 29220tcatctgggg gcggaggtgt atgcgacggc cagtgaggcg
aagtgggcgc gtctgcgggc 29280ggcgggtgtc gcgccgcagc ggatcgcgtc ctcgcggagt
gtggagttcg agtcccgttt 29340ccgccgggcc agtggcggcc ggggtgtgga tgtggtgctg
aactgtctgg cgggtgagta 29400caccgatgcc tcgttgcggc tgtgttcgcc gcaggggggc
cggttcctgg agctgggcaa 29460gaccgacatc cgtgatgccg gtgaggtcgc cgctcggttc
ccgggggtgt cctaccgggc 29520gtatgacctg atggacgcgg gtgcgcagcg ggtgggggag
atcctgcaca cggtggtgga 29580tctgttccgg cgcggggtgc tggagccgtt gccggtcacc
gcgtgggacg tgcgccaggc 29640ccatcaggca ctgcggtcga tgcggtcggg cctgcacgtc
ggcaagaacg tgctcaccct 29700gcccgtgccc ctggatgcgg aggggacggt gctggtgacg
ggcgggaccg gcactctggg 29760ggcggcggtc gcgcgccatc tggccgccgg gcacggggtg
cggcatctgc tgctggtgag 29820ccggcgcggc atggccgccg ccggtgccga aaaactgtgt
gcggaactgg gtcaggcagg 29880ggtttcggtg tcggtggccg ggtgtgatgt cgccgaccgc
gcccaggtcg ccgccctgct 29940ggagcaggtg cccgcggagc atccgctgac cgctgtggtg
cacacggccg gtgtcctgga 30000cgacgccacc gtgacctgcc tggaccggaa caagatcgat
gcggtgctcg gggcgaaggt 30060ggacggtgcc ctgcacctgc acgagctgac cgcggggatg
gacctgtcgg cgttcgtgct 30120gttctcctcc gccgccgggg tcctgggctc gccggggcag
ggcaactacg ccgccgccaa 30180cgccgccctg gacgccctgg cccaccagcg ccgcgccgcc
ggtctgcccg ccctctccct 30240ggcctgggga ctgtgggaag aggccagcgg gatgaccggc
catctggatg ccgctgaccg 30300tcaccgcatc acccgctcgg ggctgcatcc cctgaccacc
cccgacgccc tcgccctcct 30360cgacaccgcc ctggccgccg gacgccccgc actcctgccc
gccgacctac gccccaccca 30420ccccgcaccg cccctcctgg aacacctcgc gcccgcccgc
accagccacc gcaccgcaca 30480caccagcacc gcaaccggcg tgggccagga cgtctccctc
accgaccgcc tcgccaccct 30540gacccccgaa cagcggcacg acaccctgct ggcgctggcc
cgtacccaca tcgccgccgt 30600cctgggccac cccagccccg acaccatcga ccccgaacgc
accttccgcg acctcggctt 30660cgactccctc accgccgtcg aactccgcaa ccggctcacc
cgcgccaccg gcctgcgcct 30720gcccgccacc ctcgccttcg accaccccac ccccaccgca
ctcacccacc acctcaccac 30780cctcctcaac cccaacgaca acgacaacgt cggtccggta
ctgatggagc tcgaaagact 30840ggaatccgct ctcgccgcgc tggacaggga cgacagcgcc
tgcgagcggg tcactctgcg 30900actgcaatcg ctgatgctca ggtggagcgg ctccgagcgg
cagtcagccg aaaacacgga 30960cgactccagc aggttcgcgt cggcgaccgc ggaggagcta
ctcgaattca tcgaccgaga 31020cctgggtctt tcctgaacca gctcggtctt ccctgaacca
gctcgacgac gcggttttcc 31080cgtgcgcgac ggactccaag gacgtgaacc agacgtggcg
aatgacgaga aggtgctcga 31140atacctcaag cgagtcaccg cggatttgga ccggaccagg
cggcgcctgt acgaagtcgt 31200cgagcgggag caggagccga tcgccatcgt ggggatggct
tgccgttatc cgggcggggc 31260cgggtcgccc gcaggtctct gggacctcgt cagctccggt
acggacgcca tcggggagtt 31320ccccaccgat cgtggctggg atctggaacg tctctacgac
cccgaccccg atcacccggg 31380caccacgtac acccgccacg gcggattcct cgacggcgta
ggtgagttcg acgcggagtt 31440cttcggcgtc agcccgcgtg aggccctggc gatggacccc
cagcagcggc tcctcctcga 31500aaccgcctgg gaagccatcg aacacgccgg catcgtcccc
gagtcgctgc gcggcacgtc 31560caccggcgtc ttcgccggta tcaacccgca ggactacacc
atcagtcagt acggacggga 31620ttcggagatc gagggctatc tgctgaccgg ggcagccgcc
agtatcgcct ccggccgtat 31680ctcctacacc ctcggcctcg aaggcccagc cgtcaccatc
gacaccgcct gctcctcctc 31740cctcgtcgcc ctccacctgg cttgccaagc gctgcgcgca
ggggagtgca ccatggccct 31800ggcgggcggc gcctcggtcc tgtccacacc gctgatcttc
gtcgagttcg ctcgccatca 31860cggcctgtcg gtcgacggcc ggtgcaaggc gttctccgct
tcggccgacg gcacgggctg 31920gggcgagggc gccggcctgc tcctcctcga acggctctcc
gacgccaagc gcaacggccg 31980ccgcatcctc gctctcgtac gggggagcgc ggtcaaccag
gacggcgcct cgaacgggct 32040gacggcgccg aacggaccct cccagtgcag ggtcatccgc
cgggccttgg ccaacgccca 32100tctcgccccg gccgacatcg atgccgtgga agctcacggc
accggcacca ccctgggcga 32160ccccatcgaa gcccaggccc tccaggaagc gtacggcgcg
gaccgacccg acgatcggcc 32220gctctgggtc ggcacgctca agtcgaacat cggccactcg
atcgccgcgg cgggtgtggg 32280cggggtcatc aagatggtga tggcgctgcg gcacgagtcg
ttgccgcgga ccttgcatgt 32340ggatgagccg tcgccgcagg tggactggtc gtcgggtgcg
gtgagtctgc tgaccgaagc 32400gcggccctgg ccgcgggacg aggaccggcc ccggcgggcc
ggggtgtcct cgttcggggt 32460gagcgggacc aacgcgcacg tgatcctgga ggaagcgccc
gcgccggcgg aggtgcaggc 32520ggtagaaact gcgccggtgg tgcgggtgga tggtggggag
cgttccgcac cggcggatgt 32580gccgttggtg tgggtcgtgt cgggcaagtc gcaggccgcg
ctacgcgccc aggccgccgc 32640cctgcacgcc cacgtcctgg accaccccga acaggacgcg
gccgacatcg gctacagcct 32700ggccaccacc cgcgccctgt tcgaccaccg cgccaccctc
atcgcccccg accgcgacac 32760cctcctggac gccctcaccg ccctggccga cggccgcacc
cacccccacc tcgtccccgc 32820accccccacc gaacccggcc acgcccacaa aatcgccttc
ctctgctccg gacagggcac 32880ccaacgcccc ggcatggcca ccggcctcta ccacacctac
cccgccttcg ccgccgccct 32940cgacgaaacc tgcgcccact tcgaccccca cctcgaccac
cccctgcgcg acctcctcct 33000caaccacgac cccaccggcc tcctcaccca caccctctac
gcccagcccg ccctcttcac 33060cctccaaaaa gccctccacc acctcatcac cgaaacctac
ggcatcaccc cccactacct 33120cgccggacac tccctcggcg aaatcaccgc cgcccacctc
gccggcatcc tcaccctccc 33180cgacgccacc cacctcatca ccacccgcgc ccgcctcatg
caaaccatgc cccccggcac 33240catgaccacc ctccacacca cccccgaaca catccaaccc
ctcctcgacc aacaccccgg 33300caaagccacc atcgccgccg tcaacagccc ccactccctc
gtcatcagcg gcgaccccga 33360caccatccac cacatcacca ccacctgcca cacccaaggc
atcaccacca aacccctcac 33420caccaaccac gccttccact ccccccacac cgacaccatc
ctcgaacaac tcgacaccac 33480cacccacacc ctcacctacc acccacccca cacccccctc
atcaccagca cccccggcga 33540ccccctcacc ccccactact ggacccacca gacccgccaa
cccgtccact ggaccgacac 33600catccacacc ctccacacca acggcgtcac cacctacatc
gaactcggac ccgaccacac 33660cctcaccacc ctcacccacc acaacctccc ccaccaccaa
cccaccgcca tcaccctcac 33720ccacccccac cacaacccca cccaccacct cctcaccgca
ctcgcccaca cccccaccac 33780ctggcacacc caccaccaca cccacaccaa cccccacccc
cacaccatcc ccgacctccc 33840cacctacccc ttccaacgcc ggcactactg gctccaggcg
cccaccacca gcaccgatca 33900gccggtggcc ccgacgaacg acgacgcccc cgcgcctcga
gcgacatcgc tccgggacac 33960tcttgccgga cgaagccctc aagagcgcga agaagtgctc
ctggatctcg tactgaccca 34020ggtcgccgcc gtgctcggcc acaccgcgcc tgaggtggtg
gatccccaaa gggcgttcaa 34080ggacctcggc ttcgactcac tggccgccat caaactccgc
aacaggctcg ccgcagccac 34140cggactcgag ctgccgacca cccttgtctt cgaccacccc
acgccggtgg cactccgcca 34200gtacttccag tcgcagatcc tcggagcgga ggcggacgcc
cccaaccgtc tgcccctccg 34260ggcggcgacc accgacgaac ccatcgcgat cgtcggcatg
gcgtgccgct tcccgggcgg 34320cgttcggacg gccgacgacc tgtggcagct cctgagcgac
gaacacgatg cggtcggcgg 34380cttccccacc aaccggggtt gggacgtggc gaacctctac
gacccggacc cggatcgcca 34440cggcaccacg tacacccagc agggcggctt cctctacgaa
gcgggggagt tcgacgccga 34500gttcttcggc atcagcccgc gtgaggccct ggcgatggac
ccccagcagc ggctcctcct 34560cgaaaccgcc tgggaagcca tcgaacacgc cggcatcaac
cccgatgccc tgcgcaacac 34620gtccaccggt gttttcgccg gggtcatcta ccacgactac
gcgagccggt tcctcaccgc 34680gccggccggt tacgagggct acctcggcca cgggagtgcc
ggcagcatcg cgtcgggccg 34740tgtcgcgtac gtgctgggtc tcgagggtcc cgcggtcacg
gtcgacaccg cgtgttcgtc 34800gtcgctcgtc gcgctgcatc tggcctgtca ggcactgcgg
tcgggcgagt gcacgatggc 34860tctggcgggc ggcgcgacgg tgatgtcgac cccgcaggcg
ttcgtggagt tctcccggca 34920gcggggtctg gcggcggacg gccggtgcaa ggcgttctcc
gctgcggccg acggcacggg 34980ctggggcgag ggcgccggcc tgcttctcct cgaacggctc
tccgaggccg agcggaacgg 35040acaccgggtt ctggcggtgg tgcggggcag cgcggtcaac
caggacggcg cctcgaacgg 35100gctgacggcg ccgaacggtc cgtcccagca gcgcgtgatc
cgccaagctt tggccaactc 35160gggcctgacc ggcgccgatg tcgacgccgt cgaagcccac
ggcacgggga ccaagctggg 35220cgacccgatc gaagcccagg ccctgctcgc cacctacggc
caggaacacc accccgacca 35280gccgctctgg ctcggctccc tgaagtccaa catcggccac
gcccaagcgg cagcaggcgt 35340gggcagcatc atcaagatga tcatggctat gcgcaacgag
tcgctgccgc ggacgttgca 35400cgtggatgag ccgtcacccc atgtggactg gtcgtcgggg
gcggtgagtc tgctgaccga 35460gccacgcccc tggccacgcc gggaagaccg gccccggcga
gcgggaatct cctccttcgg 35520agtcagcggg acgaacgccc acgtcatcgt ggaggagccg
cctgcgcggg cggaggtgga 35580ggcggtggaa gccgcgccgg cgggggtgga gactgcggcg
gctgccgcgg tggtggtgga 35640gacagacggt gcgggccggg tgtcctccga tgtgccgttg
gtgtgggtgg tgtccggcaa 35700gtcgcaggcc gcgctacgcg cccaggccgc cgccctgcac
gcccacgtcc tggaccaccc 35760cgaacaggac gcggccgaca tcggctacag cctggccacc
acccgcgccc tgttcgacca 35820ccgcgccacc ctcatcgccc ccgaccgcga caccctcctg
gacgccctca ccgccctggc 35880cgacggccgc acccaccccc acctcatccc cacacccccc
accgaacccg gccacaccca 35940caaaatcgcc ttcctctgct ccggacaagg cacccaacgc
cccggcatgg ccaccggcct 36000ctaccacacc taccccgcct tcgccgccgc cctcgacgaa
acctgcgccc acttcgaccc 36060ccacctcgac caccccctgc gcgacctcct cctcaaccac
gaccccaccg acctcctcac 36120ccacaccctc tacgcccagc ccgccctctt caccctccaa
aaagccctcc accacctcat 36180caccgaaacc tacggcatca ccccccacta cctcgccgga
cactccctcg gcgaaatcac 36240cgccgcccac ctcgccggca tcctcaccct ccccgacgcc
acccacctca tcaccacccg 36300cgcccgcctc atgcaaacca tgccccccgg caccatgacc
accctccaca ccacccccga 36360acacatccaa cccctcctcg accaacaccc cggcaaagcc
accatcgccg ccgtcaacag 36420cccccactcc ctcgtcatca gcggcgaccc cgacaccatc
caccacatca ccaccacctg 36480ccacaaccaa ggcatcacca ccaaacccct caccaccaac
cacgccttcc actcccccca 36540caccaacacc atcctcgaac aactcgacac caccacccac
accctcacct accacccacc 36600ccacaccccc ctcatcacca gcacccccgg caaccccctc
accccccact actggaccca 36660ccagacccgc caacccgtcc actgggcgga caccatccac
accctccaca ccaacggcgt 36720caccacctac atcggactcg gacccgacca caccctctcc
accctcaccc accacaacct 36780cccccaacac caacccaccg ccatcaccct cacccacccc
caccacaacc ccacccacca 36840cctcctcacc gcactcgccc acacccccac cacctggcac
acccaccacc acacccacac 36900caacccccac ccccacacca tccccgacct ccccacctac
cccttccaac gccggcacta 36960ctggctggag gtcccgaagc cgactgccga agcatccgcc
tcagccagtg gcccggggcg 37020gaaccgggcc gccaaactct cagcgctcga ggcggagttc
tggcaggccg tcgaggaaac 37080cgacaccgac accctcgccc acaccctcga cctcgacacc
cagaccctcg aacccgtcct 37140ccccgccctc gccacctggc accaacaaca acgcgaccac
gcccgcatca acacctggac 37200ctaccaggaa acctggaaac cactccacct ccccaccacc
cgacccacca cccccaccag 37260ctggctcatc gccatccccg aaacccaccg caaccacccc
cacaccacca acctcctcac 37320caacctcccc caccacaaca tcacccccat ccccctcacc
atcaaccaca ccaccgacct 37380ccaccacgcc taccaccacg cccaccacca caccacccca
cccatcaccg ccgtcctctc 37440cctcctcgcc ctcgacgaaa caccccaccc ccaccacccc
cacaccccca ccggcaccct 37500cctcaacctc accctcaccc aaacccacac ccaaacccac
ccaccaaccc ccctctggta 37560cctcaccacc caagccacca ccacccaccc caacgacccc
ctcacccacc ccacccaagc 37620ccaaaccatc ggactcgccc gcaccaccca cctcgaacac
ccccaccaca ccggcggaca 37680catcgacctc cccaccacac cccaccccaa caccctcacc
caactcatca ccgccctcac 37740ccacccccac caccaacaca acctcaccat ccgcacccac
accacccaca cccgacgact 37800cacccccacc accctccaac ccaccacccc cacaccaccc
accaaccccc acggcaccac 37860cctcatcacc ggcggcaccg gcgccctcgc caccaccctc
gcccaccacc tcgccaccac 37920cggcacccaa cacctcctcc tcaccagccg acgcggcccc
cacacccccg gcgcccgaca 37980actccacacc caactcaccc aactcggcac caacaccacc
atcaccgcct gcgacctctc 38040cgaccccgac caactcaccc acctcctcac ccacatcccc
cccgaacacc ccctcaccac 38100cgtcatccac accgccggca tcctcgacga cgccaccctc
accaacctca cccccaccca 38160actcgacaac gtcctgcgcg ccaaagccca caccgcccac
ctcctccacc acgccaccct 38220ccacaccccc ctcgaccact tcgtcctcta ctcctccgcc
gccgccaccc tcggcgcccc 38280cggccaagcc aactacgcag ccgccaacgc ctacctcgac
gccctcgccc accaccgcca 38340cacccacaac ctccccgcca ccaccatcgc ctggggaacc
tggcaaggaa acggcctcgc 38400gagcggtgac atcggcgagc atctgcgccg ccgcgggatg
atcccgctgg atcccgagtc 38460cgctgtcggt gccttcgacc gggcggtcgc gagcgatcgg
cccagcgtct tcgtcgcgga 38520catcgactgg cccaccttcg gccgcaacac ctccagcggt
cttcgcgccc tcttcgagga 38580cattccggag gccacacagc ctgagccgac cgcccggagc
gcggaccagc cgaacgggca 38640cggtagcctc caggaacttc tcgcccgcca gtccccggcc
gagcaggccg aaacgctcct 38700ggcattggtc cggacgcatt ccgcgaccgt cctcgggcgt
gacggggccg atgccgtcgc 38760cgccgaacgt cccttcaggg acctgggatt cgactcactg
tccgccgtcg agctccgcaa 38820tcatctgacg gccgacacgg agctcgctct gccgacaacg
ctggtcttcg atcacccgac 38880tccggtgaag ctcgcggagt tcctgcgcac cgagctgctc
ggcaccgcac cagccaccac 38940cgccgccgtc ccggccctcc agtcccacac cgacgaaccc
atcgccatca tcggcatggc 39000ctgccgcttc cccggcgccg tcaccacacc cgaacacctg
tggaacctca tcgccaccga 39060acaagacgcc atcggcgagt tccccaccga ccgcggctgg
gacctggaca acctctacca 39120ccccgacccc gaccaccccg gcaccaccta cacccgccac
ggtggtttcc tctacgacgc 39180cggcgacttc gacgccgagt tcttcggcat caacccacgc
gaagccctcg ccatggaccc 39240ccagcaacga ctcctcctgg aaaccgcctg ggaagccatc
gaacacgccg gcatcctccc 39300cgacgccctg cacggcaccc ccaccggcgt cttcaccggc
gtcaacgccc aggactacgc 39360cgcacacacc cacgcctccc cccacaccac cgagggctac
accctcaccg gaaccgccgg 39420cagcatcgcc tccggccgca tcgcctacac cctcggactc
gaaggccccg ccgtcaccat 39480cgacaccgcc tgctcctcct ccctcgtcgc cctccacctc
gcctgccagg ccctgcgagc 39540aggcgaatgc accacagccc tcgccagcgg catcaccgtc
atgaccagcc cggtcacgtt 39600caccgagttc tcccggcagc gagggctcgc ccccgacgga
cactgcaagg cgttctccgc 39660ctcggccgac ggcaccggct ggagcgaggg cgtgggcacc
atcctcgtcg aacggctctc 39720cgacgccgag cggaacgggc accggattct ggcggtggtg
cggggcagcg cggtcaacca 39780ggacggcgcc tccaacggcc tgacggcgcc gaacggcccc
tcccagcaac gcgtcatccg 39840ccaggccctg gccaactccg gcctgaccgg cgccgatgtc
gacgccgtcg aagcccacgg 39900cacgggaacc aaactcggcg accccatcga agcccaggcc
ctgctcgcca cctacggcca 39960gggacgtgcg caggaacagc cactgtggct cggctcggtc
aaatccaacc tcggccacac 40020ccaggcagcg gcaggcatgg ccggcctgat caagatggtg
atggcgctgc ggcacgagtc 40080gttgccgcgg acgttgcatg tggatgagcc gtcgccgcag
gtggactggt cgtcgggtgc 40140ggtcagcctg ctgaccgagg cgcggccctg gccacgccgg
gaggaccggc cccggcgagc 40200gggaatctcg tccttcgggg tgagcgggac gaacgcgcac
gtgatcctgg aggaggcgcc 40260cgcgccggcg gaggcggtgg agacggaaca gggtgtggtg
ccgcagggcg accaggagtg 40320ttccgcgccg gtgggtgtgc cgttggtgtg ggtggtgtcc
ggcaagtcgc aggccgcgct 40380acgcgcccag gccgccgccc tgcacgccca cgtcctggac
caccccgaac aggacgcggc 40440cgacatcggc tacagcctgg ccaccacccg cgccctgttc
gaccaccgcg ccaccctcat 40500cgcccccgac cgcgacaccc tcctggacgc cctcaccgcc
ctggccgacg gccgcaccca 40560cccccacctc atccccacac cccccaccga acccggccac
acccacaaaa tcgccttcct 40620ctgctccgga caaggcaccc aacgccccgg catggccacc
ggcctctacc acacctaccc 40680cgccttcgcc gccgccctcg acgaaacctg cgcccacttc
gacccccacc tcgaccaccc 40740cctgcgcgac ctcctcctca accacgaccc caccgacctc
ctcacccaca ccctctacgc 40800ccaacccgcc ctcttcaccc tccaaaaagc cctccaccac
ctcatcaccg aaacctacgg 40860catcaccccc cactacctcg ccggacactc cctcggcgaa
atcaccgccg cccacctcgc 40920cggcatcctc accctccccg acgccaccca cctcatcacc
acccgcgccc gcctcatgca 40980aaccatgccc cccggcacca tgaccaccct ccacaccacc
cccgaacaca tccaacccct 41040cctcgaccaa caccccggca aagccaccat cgccgccgtc
aacagccccc actccctcgt 41100catcagcggc gaccccgaca ccatccacca catcaccacc
acctgccaca cccaaggcat 41160caccaccaaa cccctcacca ccaaccacgc cttccactcc
ccccacaccg acaccatcct 41220cgaacaactc gacaccacca cccacaccct cacctaccac
caaccccaca cccccctcat 41280caccagcacc cccggcgacc ccctcacccc ccactactgg
acccaccaga cccgccaacc 41340cgtccactgg gcggacacca tccacaccct ccacaccaac
ggcgtcacca cctacatcgg 41400actcggaccc gaccacaccc tctccaccct cacccaccac
aacctccccc aacaccaacc 41460caccgccatc accctcaccc acccccacca caaccccacc
caccacctcc tcaccgcact 41520cgcccacacc cccaccacct ggcacaccca ccaccacacc
cacaccaacc cccaccccca 41580caccatcccc gacctcccca cctacccctt ccaacgccgg
cactactggc tggaggtccc 41640gaagccgact gccgaagcat ccgcctcagc cagtggcccg
gggcggaacc gggccgccaa 41700actctcagcg ctcgaggcgg agttctggca ggccgtcgag
gaaaccgaca ccgacaccct 41760cgcccacacc ctcgacctcg acacccagac cctcgaaccc
gtcctccccg ccctcgccac 41820ctggcaccaa caacaacgcg accacgcccg catcaacacc
tggacctacc aggaaacctg 41880gaaaccactc cacctcccca ccacccgacc caccaccccc
accagctggc tcatcgccat 41940ccccgaaacc caccgcaacc acccccacac caccaacctc
ctcaccaacc tcccccacca 42000caacatcacc cccatccccc tcaccatcaa ccacaccacc
gacctccacc acgcctacca 42060ccacgcccac caccacacca ccccacccat caccgccgtc
ctctccctcc tcgccctcga 42120cgaaacaccc cacccccacc acccccacac ccccaccggc
accctcctca acctcaccct 42180cacccaaacc cacacccaaa cccacccacc aacccccctc
tggtacctca ccacccaagc 42240caccaccacc caccccaacg accccctcac ccaccccacc
caagcccaaa ccatcggact 42300cgcccgcacc acccacctcg aacaccccca ccacaccggc
ggacacatcg acctccccac 42360cacaccccac cccaacaccc tcacccaact catcaccgcc
ctcacccacc cccaccacca 42420acacaacctc accatccgca cccacaccac ccacacccga
cgactcaccc ccaccaccct 42480ccaacccacc acccccacac cacccaccaa cccccacggc
accaccctca tcaccggcgg 42540caccggcgcc ctcgccacca ccctcgccca ccacctcgcc
accaccggca cccaacacct 42600cctcctcacc agccgacgcg gcccccacac ccccggcgcc
cgacaactcc acacccaact 42660cacccaactc ggcaccaaca ccaccatcac cgcctgcgac
ctctccgacc ccgaccaact 42720cacccacatc ctcacccaca tcccccccga acaccccctc
accaccgtca tccacaccgc 42780cggcgtcaac cattacgctc ccgtggcggc gaccgacccg
tccacgttcg cgtccgtcct 42840cgccgcgaag gcggccggcg cggcacacct gcatgaactc
ctgctggagc tggacacggt 42900cgagcagttc atcctcttct cctccggttc gggggcctgg
ggcagcggca accagtgcgc 42960gtacgcggct gccaacgcct acctcgatgc gctggcggcg
caccgccagg cccgcggcct 43020gcctggcatg tcgctcgcct gggggccttg ggacggtgac
gggatgtcgg ccggagagga 43080cgcccagcgg tacctccgtg agcggggcgt actgcccatg
gatccgcggc tcgccgtcgc 43140ggccttcgac gaggcggtcc gggcgcggcc gaactccaac
ctcgtcgtcg cggacatcga 43200ctgggagcgt ttcgtcccga cgttcaccgc gcggggccac
aaccccctga tcgaggacat 43260ccccgaagtc cgccggctgg ccgcggaggc cgaggccgcc
cagaccacga ccgccgccac 43320ggacgccccc gcccttctca accgactctc aggtctgtcg
gccactcagc agaagcagca 43380tcttctccgg ctggtgcggt cacacatggg cgaggtcctc
ggccgcgagg acgtcgacac 43440gctcgacgag cgccacacct tccgggacct gggcttcgac
tcgctcacct cggcccgatt 43500cagccagcgg ctcgccaagg acacggggct gcaccttcct
gccaccctcg tcttcgacca 43560cccgacgccc gccgactgcg tggctcatct gcgggatcaa
cttctgggtg aaacggacga 43620catgactccg aggaagcgag atcacctcgg ggaggaccgg
cgggcggcca ccgcggacga 43680cccgatcgcg atcgtcggga tggcgtgccg gttcccgggc
ggcgtgcggt ccgccgatga 43740tctgtgggac ctgctgtcgt cgggcaccga cgccatcagc
ggcttcccca ccgatcgcgg 43800ctgggacatc gagagcctct acgaccccga ccccgaccgc
tccggcacca cgtacacccg 43860ccacggtggt ttcctctacg acgccgggca gttcgacgcc
gagttcttcg gcatcagccc 43920gcgtgaggcc ctggccatgg atccccagca gcggctcctt
ctcgaaaccg cctgggaggc 43980cgtcgaacac gcaggcatca acccgcagac actccacggc
acccccaccg gcgtcttcac 44040gggcgtcaac gcccaggact acgcagccca cctgcgccag
gcgtcgggca acgtcgaggg 44100gtacgccctg accggaagct cgggcagtgt cgtgtcgggt
cgggtggctt acaccttcgg 44160tttcgagggg ccggccgtct cggtcgacac cgcgtgctcg
tcgtcgctcg tcgcactgca 44220cctcgcaggc caagccctgc ggtccggcga gtgcacgatg
gccctcgccg gcggcgtcat 44280ggtgatgtcc tcccctgaga cgttcgtgga gttctcgcgg
cagcggggtt tgtcggtgga 44340cgggcggtgc aagtccttcg cggccgcggc cgacggtacc
ggctggggcg agggcgtggg 44400catgctgctc gtggagcggt tgtcggacgc cgagcgcaac
gggcaccggg ttctggcggt 44460ggtgcggggc agcgcggtca accaggacgg cgcctccaac
ggcctgaccg caccgaacgg 44520cccctcccag cagcgcgtga tccgccaggc cctggccaac
tccggcctga ccggcgccga 44580tgtcgacgcc gtcgaagccc acggcacagg aaccaaactc
ggcgacccca tcgaagccca 44640ggccctgctc gccacctacg gccaggaaca ccaccccgac
cagccgctct ggctcggctc 44700cctgaagtcc aacatcggcc acgcccaagc agcggcaggt
gtcggcggga tcatcaaaat 44760ggtgatggca ctgcgccacg agacgctgcc gcgcacgctg
cacatcgacg agccgacccc 44820ccaggtcgac tggtcgtccg gcgcggtcag cctgctgacc
gagccccgcc cctggccacg 44880ccagggggac cggccccgac gcgccggcat ctcctccttc
ggagtcagcg gaaccaacgc 44940ccacgtcatc ctggaagagg cacccgccca gccggccggg
gaccccgccc cagaagacgg 45000cgccccggtg ccctgggcga tgtcggcgcg ttcaaacgcc
gcgctgcggg cacaggccgc 45060actcctgcgt gacttcctcc aaggccccgg caccgacacc
gcactacggg cggtcggagc 45120cgaactcgcc catggcaggg ccgtcctgga acaccgcgcc
gtgatcgtgg cacgggaacg 45180gacagagttc gaagacgcgc tggaagcact ggcctcgggt
gaaccgcacc ccgcactcat 45240cgaagacacg accggcagcc agaccaacag ccactccggt
ggcggggtgg tgttcgtctt 45300ccccggccag ggcggtcagt gggccggcat gggactcgac
ctgctgcgcg actcccaggt 45360gttcgccgac catgtcggtg cgtgtgaacg cgcgctggcg
ccgtgggtgg agtggtcgct 45420caccgaaatg ctccaccggg acgcggagga tccggtgtgg
gagcgggcgg atgtggtcca 45480gccggtgctg ttctcggtca tggtgtccct ggcggcgctg
tggcggtcct acggcatcga 45540acccgaagcg gtggtcggcc actcccaggg cgagatcgcc
gccgcccacg tctgcggcgc 45600actcaccctg gaggacgccg cgaagatcgt ggcactgcgc
agccgggccc tggccgcgct 45660gcggggccac ggcggcatgg cctcactcgc cctgaccgga
accgaggccg aggacctcat 45720caccacccac tggccaggac ggctgtggac ggccgcgttc
aacgggccac gggccaccac 45780cgtctccggc gacaccgacg ccctggacga actcctcacc
cactgcaccg aaaccggggt 45840acgggcccgc cgcatccccg tggactacgc atcccactgc
ccccacaccg aaaccatcga 45900acacgacctg ctccacatgc tccacggcat caccccccag
cccggcagca tcccgttcta 45960ctccaccgtc gaggacgcct ggaccgacac caccaccctg
gacgccgcct actggtaccg 46020caacctgcgc cggcccgtcc gcttcaccca cgccgtccgc
accctcaccg cccagggcca 46080ccgcctcttc atcgagacca gcccccaccc caccctgacc
cccgccatcg aagaccacga 46140ccacaccacc gccctgggca ccctgcgccg ccacgacaac
gacacccacc gcttcctcac 46200cgccctcgcc cacgcccaca ccaccggcca caccgtcacc
tggaccaccc actaccccac 46260caccccccac acccccgcca tcgacctgcc cacctacccc
ttccaacacc accactactg 46320gctccacaca cccaccacca gcaccggcga cgtctccgcc
gccggactgc accccaccga 46380gcaccctctc ctcggcgcca ccgtggaact cgccgacgga
gacggaacct tgctcaccgg 46440gcgcctgtcc ctgcacaccc acccctggct cgccgaccac
agcgtcggcg gcatcgtcct 46500cctccccggc accgccctcc tcgaactcgc cctcgaagcc
gggacgcgca ccggttgccc 46560ccacgtccag gaactcaccc tgcacacgcc cctggtgatt
cccgagaccg gacacgtcgt 46620cttccagctg acggtctcgg caccggacga gaccgggcag
cgcccgttca ccgtccattt 46680ccgttccgag gccgtcaccg gcgcggacga tccggcggac
cggacctgga cgcggtgcgc 46740caccggtgcg ctctcgaccg cggccgcccc cgatcactcc
gaagccgcca cctggccgcc 46800gccgtccgct cagccgctgg acctcgacgg tctgtacgac
cgcatggcgg aggcgggtct 46860ggtctacggt ccggtgttcc aggggctccg cgaggcttgg
ctcgatggcg aggacatcgt 46920cgccgaggtg cgcctgccgc aggaggcggc cgccgacacg
cagggcttcg gcctgcatcc 46980cgccctgctc gacgccgctc tgcatgtgac ggcgctgacc
tcacaggccg gtacagcgga 47040cgaagacgcg caggaacggc gtcggttgcc gttcgcgtgg
gccggtgtct ccctgttcgc 47100cagggagtgc gcggcgctgc gtgtgcgggt ggcgccgtgt
gcgccgcacc cgggggacgc 47160cgtggcgatc acagccaccg acgaggacgg ccgtccggtg
ctggcggtgg aatcgctcac 47220cctccggccc gtctcccccg accagttgcg ggcggcggcc
ccggccgccg ggcgggattc 47280gctgttccgc ctggagtggg taccggtcac ggcctccgcc
tccgcctccg cccggccgac 47340cgggccctgg gccgccatcg gcaccggtcc ggcggtggcc
ggcctggccg gccacgcaga 47400cctgacggtg tacgcggagg ccggcgatct gctccgggat
ctggacggag gggcccccgc 47460gcccgctgtg gtcgtgctca gcgtcacgcc cgatgccgac
gaattcgcca ctccccgtgc 47520ggcgaccggc cgggccctct ccgtccttca ggcctggctg
gcggacgagc gcctggccga 47580cagccggctc gtggccgtca cttctggggc ggtcgtcgcc
gcgcccgggg acgacacggt 47640cgacgtcccg ggtgccgccg tgtggggctt ggtgcgttcc
gggcagtccg agcacccgga 47700ccgcatcacg ctgctcgact gtgcgagcgg cgcccggccc
gggccggacc tcgtcgccgc 47760cgccctcgcc tcgggcgagc cgcagctcgc cgcccgcgcc
ggggtcctct acacgccccg 47820gctggccagg ccgcaccgcg acgcctcggc cgtaccgcgg
tcgctgccgt cccacggcac 47880cgtgctcatc accggcggca ccggtctgct gggcgggttg
gtcgcccggc gcctggtgga 47940ggcgcacggt gtccgccgcc ttctcctggc cggccgcagg
ggtccggcgg cggaggggct 48000ggactcgctg acgtccgagt tgcgtgagcg cggggcgacc
gtcgaggtcg ccgcgtgcga 48060cgcggccgac cgcacacagt tggaggcgct gctggccggg
gtgcccgagg agcatcccct 48120gtccgcggtc gtgcacgccg cgggtgtgct cgacgacggg
gttctcacgt ccctgacgaa 48180cgagcggctg ggagctgtcc tgcgggcgaa ggcggattcg
gcgctgcttc tgcacgagct 48240cactcaggac ctcgacctgt ccgccttcgt cctgttctcc
tccgccgccg gcgtcctcgg 48300ctctcccggc cagggcagct acgccgccgc caacgccgtg
ctcgacgcac tcgcccacca 48360gcgcagcgcc gccggtctgc ccgctctctc cctggcctgg
gggctgtggg cggagggcag 48420cgggatgacc gggcacctcg acgccgacga ccgctcccgg
atcaaccggg ccggtatggc 48480gccgctcccg acgcccgatg ccctggatct gttcgacgcc
gcgctgtcgt cggacgaacc 48540cttcctggta ccggctcgct tcgacctttc cgccgtacgc
accaggaccg cgtacggccc 48600gctcccgccg ctgctgcgcg gcctggtccg gacctcgggc
gcgcaccggg tccggggcgc 48660agtcggcgaa gcccgggcgg ccggcgtgga cgaggccgga
cggctgcggg aacggctggc 48720ccgccagagt gacgccgaac gccggaacac cttgctgcgg
ctcgtgcagt cgaacgtcgc 48780ggcggtgctc ggtcaccgcg gcacggggac cgtcgccgag
acacgcgcct tccgtgagct 48840gggcttcgac tcgctcacgg cggtggagct gcggaaccgg
ctgaaggtcg ccacagggct 48900ggcgctgcgg gccacggtcg ccttcgactt cccgactccg
gcggcgctgg ccgagcatct 48960gggtgcccgc ctgcttccgc cggacggcgc cgtgtccgag
gcggtgggcg agaaggagct 49020gcgcgggctc ttgacgtcga tcccgatcgg ccggctgcgg
gaggcggggc tgatcgaccg 49080cctcctggcg ctcgccgctg cggcgccaga ctccgccgat
cagacggcgg agcagccctc 49140ccggtccgtg tcggtcgagg acatcgacgc catggacgtc
gacagcctca tcggcctggc 49200ccacgacacc ggcaccgact ccggtcacgc cccctgcgag
ggctgacctc cacttcacgg 49260atgcgagaga cgacatgacg cagattccgc caaccggtca
cgacgccgtg gcagccgggc 49320ccgcccccgg cgccgcggaa cagaaacgag gacggaaacg
gaaaccagga cgggagcccc 49380ggccagagca tcgacgggaa caggaacgag ggcagggagc
agggctgggg caggggcagg 49440aacgcgcgcg gcccgcggac ggtggtcggc ggctcgtgct
tggctgggcg gcgctcggcg 49500cggtgtgcct ggccctgcag gcgtacgtgc tcgtccgctg
ggcggccgac ggtgggtatc 49560gcctggtgga cgtacccggt gagggcggcg cggagcgtgg
ccaccgaagg gtcctcgaca 49620tcgtgttccc ggcgctgtcg gcggcaggtg tcgtggggct
ggcgctgtgg ctctaccgca 49680ggtgccgcgc ggagcggcgg gtgtcgttcg acgccctgct
gttcgccgga gtgctgttcg 49740cgggctggct gagcccgctg atgaactggt tccatcccgt
cctggtctcc aacacgcacg 49800tgtggggcgc ggtcggctcc tgggggccgt acacgccggg
atggcagggg tccgcccccg 49860ggatggaggc cgagctgccg ctggtgacgt tcagcgtgtg
ctcgacagcg ctcctgggtg 49920tgctggcctg ctgtcacgtg ctgtcccgcg tccgggaccg
gtggcccggg gtccgcccgt 49980ggcaactgat cggggtggcc gtcgccaccg cggtggccct
ggacctgtcg gagccggcga 50040tctccttgat cggtctagtc cgtctggtcg aaggcgctgc
cggaggtgtc gctgtggagc 50100ggtgcctggt accagttccc tctgtaccag ctcctgaccg
cggccctggc cagcgggttg 50160ctgagcgcgc tccggttctt ccgcgacgag cgggacgaga
cgctggtgga gcgcggtgcc 50220tggcgcctgc cgggccgtgt ccgcctctgg gcgcggttcc
tggccgtcgt cggcggcgtc 50280catgtcgtga tgggcggcta tacggccctt catgtgctgc
tctcgttggt cggcggccaa 50340ccgccggacg cgttgccggg gttcttccgt ccgccggccg
tctactgagg gcggggcgga 50400cggcacgcaa cgaggggagg ggccggcgtc tcatgctctg
ctgtccggtc agacctcagc 50460gcgctggcac ggcgcggtca ggacgacgta cccgatgtcc
tccgtgtacc actggctgca 50520cttgcccacg aacgtctcca ggtcgtccgc cgtcatctcc
agggcttcgg cgtaggcgtg 50580ggcgttggcg cgtacgtcgt cacccatcgc gctgtacgac
ggggcgatga cgtgctcacc 50640gatgtcggtg agctcggtca gccggagtcc ggcgtcgctg
atcatcccgg cataggcggt 50700gatggggatc agcgagggga cggcgagttc gctggacgac
cagtccgccc ccgtcggctg 50760tgatgcgcgg agtgtgacgt ccatggccgc cagccgccca
ccggggcgca gcacacgggc 50820catctcctgg aacacccggg ccgggtcggg catgtgcagc
aggcactcga gggcccagac 50880ggcgtcgaag gaggcgtcgg ggaagggcag gtccatggcg
tcggcgcact cgaagcggac 50940ccggttcgcg agtccggacc gctcggcgag cgcggtggcc
agctcgacct gccgggggct 51000gatggtgatg ccgacgatgt ccaccggctc gctgtgcgcc
aggcgcaggg ccggccggcc 51060ggaaccgcag ccgacgtcca gcacacgtct gaccgggcgc
ccggtgtgtt cccgaagctt 51120gccgatcata tggtcggtga ggcggtcgga ggcctggccg
agtgtgctgc cgtcgtccgg 51180gtgcggccag tatccgaggt gcgtgttgcc gcccagggcc
cggttcaaca ggctggtcat 51240gcggtcgtag tagtcaccga cgtccgcggg ggtcggtgat
ccctggtgag gcgccttggt 51300catggttccg gcagctcctt cggtcgtgcg gcggcctcaa
gggaggcgtc cgcgggggcg 51360tggccgcgag ggatggcggg ggtcctgggc tcggctatca
tccgcaggcg gtcggggaag 51420acgtgggtcg ccttggcgac cgggcggacg cggtcgccct
tgaggggacg cagacgccag 51480cgcgaggcga tgaccgcgac ggcgacggcc gtctccatga
gggcgaagtt gtcgccgatg 51540cacttgtagg tgccgagcgc gaagggaacc caggcgccct
tcggaacgtc gcgcgtggtc 51600tctttcgact cccagcggtc ggggtcgagc ttctccggat
cacggtacca gcgggggtca 51660cgctggagcg cgtacgagct gtacatgatt tccacgtcgg
ccggcagctc gtgttccccg 51720agccggacgg ggcgcaccgt gcgccgcgag cccacccagc
cggggtactt gcgcagcgcc 51780tccttgacca ggcgctgggt gtacgggagg cgcgggaggt
ccgcgctggt ggggagccgg 51840cctccgagga cggtgtcgat ttcggcgtgc agcctctgtt
cgatgaggtg gtcgtgagcg 51900agttcgtgga agatccacgc ggtgagagcg gccggcccac
cgattccggc gaccgcgagc 51960cccatgatct cgttgtgcac ctcgtcgtcc gtcatggtgt
tgccctcggc gtcccgcgcg 52020cgcagcatcg tcgagagcag gtcgccgtgg tcgcggccgt
cggcgcggta ggcggtgacc 52080gcctcccgga tggcggcgct ggtgcggccc atgtggcgct
tggcggcagt gggcagggag 52140gtgtagagct gcggggcgag cgcgctcagc ctggccacct
tcaggatgtc gtgccccgtg 52200gtgcgcagtt ccgcctcggc cgccgcaccc aggtcggact
ggaacaacgc cttcgtgatc 52260atggccagtg agaggtcgct cgccatcttc gggacatcca
cgacctggcc cggccgccag 52320gaatcggcgg tctcctcggc ggcggcggac atgctgatga
cgtagtggtc gagcttgccc 52380cggtggaatc cgggttgcat catccgccgc tggcggcggt
gcgagtcccc ggagacggcc 52440acgaggatgg ggccgatgaa ccggctggcg cccgccgcgc
ccttgctgcg ggtgaagtcc 52500gccgcgccgg acaccagcat ggtccgcacg atttcggggt
gggtggcgag gtagacggtg 52560ttgtggccga ggcggatgcg gaagaggtcc ccgcgttccg
tgacggcgga caggaagccc 52620agggggtcgc ggaggagggc cggcaggtgg ccgaggaccg
gccaggcacc gggggcctcg 52680gggatggtcg acggaggtga ggacactgtt gctcctgagg
ggagggccgg gcgagtcggc 52740gtggggtggg gtgaggtgtg cggtcgggca ggtggtcgcg
tcgccggtgg tcggcgacgg 52800gtggtgggtc agggggatcc ggtttcctgg tcgatgagcg
cgaacatctc ctcgtccgtc 52860gcctccccga ggtcgggacg cggcgcctcc tccccgccga
gcacctgggc gagtgagcga 52920agccgcgacg ccagccggga ccgggcttcc tctcccaggc
cctgcgcccc ggggagcggc 52980gacgcgacgg aggagagcac cgcttccagc cggccgatct
ccgcgaagag ggactgctcg 53040ggcggcgccg tggccgcgtc gtcggggagg agccgggtca
gcaggtgccg ggtgagcgcg 53100gtggcggtgg ggtggtcgaa ggcgagggtg gcgggcaggc
gcaggccggt ggcgcgggag 53160agccggttgc ggagttcgac ggcggtgaga gagtcgaagc
cgaggtcccg gaaggccgag 53220tccgcaggga ccgcctccgg cgtctgatgg ccgaggaccg
tggcgatctg ggtgcgcacc 53280acggtgaaca gggtgtcgtg ttgttgctcg ggggtcaggg
tggcgaggcg gtcggcgagg 53340gagacgtcct ggcctgcgcc tgcactggtg ccggtgtgtg
cggtgcgggg gctggtgcgg 53400gcgggcgcga ggtgttccag aaggggcggt gcggggtggg
tggggcgtag gtcggcgggc 53460aggagtgcgg gacgtccggt gaccagggcg gtgtcgagga
gggcgagggc gtcgggggtg 53520gtcagggggt gcagccccga gcgggtgatg cggtgacggt
caccggcgtc cagatgcccg 53580gtcatcccgc tggcctcttc ccacagtccc caggccaggg
agagggcggg caggccggcg 53640gcgcggcgct ggtgggccag ggcgtccagg gcggcgttgg
cggcggcgta gttgccctgc 53700cccggcgagc ccaggacacc ggccgcggag gagaacagca
cgaacgccga caggtccatc 53760cccgcggtca gctcgtgcag atgcagggca ccgtccacct
tcgccccgac caccgcatcg 53820atcttctccc ggttcagaca ggccacggtg gcgtcgtcca
ggacaccggc cgtgtgcacc 53880acagccgtca gcggatgctc cgcgggcacc tgctccagca
gggcggcgac ctgggcgcgg 53940tcggcgacat cgcacgccgc caccgacacc gacacccctg
cctgacccag ttccgcacac 54000agttcttcgg caccggtggc ggccatgccg cgccggctca
ccagcagcag atgccgcacc 54060ccgtgcccgg cggccagatg gcgcgcgacc gccgctccca
gagtgccggt cccacccgtc 54120accagcaccg tcccctccgc atcgaaccgg acggcatccg
acgagtcggc gggcaccggc 54180acatgtgcca agcgcgccgc cagcagtcgc ccgccacgca
ccgccaactg gggttcgtcg 54240caggccagag ccgccgtgac ggtggtgtcg tcggagaggt
cggcatccag caggacgaac 54300cgccccgggt gttccgactg cgccgaacgc agcaggcccc
agaccgccgc tccggcgacg 54360tccgtcacct cctcaccggt ccgggtggcc acggcaccgt
gtgtcaccac cacgagccgg 54420gcctcggcaa ggcgatcgtc ggccagccag tcctgcacca
cgctcagcac ttcgccgagg 54480acgtccgcca ccgcgccctg ggagcacgtc agcagcacgg
cgtcggggac gggggcgtcg 54540tcggtgtcca gcccggacag caggccggag aggtccgccg
ccgcccggtc atgggtgagt 54600acggtcgacc ggacggccgt gtccgggggc ggggtgcccg
gtgtgacgtc cttccaggcc 54660acgtcgaaca gcgccgcgcg gccggccgcc tgggcagagg
ctcgcagttc gccggtgtcc 54720agcggtcgta cggcgagaga gtcgaccgac aacacgcccc
ggccggtttc gtcggccagc 54780gacacggaga cggcggtccg ttcgccgtcc cgcccggccg
gcgccacccg gacccgtacc 54840gctgccgcct tcaccgcgtg cagggtcaca ccactgaacg
agaacggcac cgcccccggc 54900ggcaggcccg tcgccgctcc gagcgccacc gcgtgcaggg
cggcgtccag cagagctggg 54960tgcaggttgt accgggacgc ctcgtcgagc acggactccg
gcaggcggac ctccgcgaag 55020acctcttcgc cccgccgcca agccgcacgc agcccccgga
acgccggtcc gtaggcgaac 55080ccgcgggcct cctgtgccgc gtagaagctt tccagttcgt
ccgccgcgca cggcagggcc 55140ccttccggcg gccagctgcg cagggcgtcg ccgtcggcgg
agggctgggc gtccagcagg 55200ccggtggcgt ggtgctgcca gggatcctcc gggcgggcgt
gctcgctccg ggaggagacg 55260gtgagggtgc gggccccggt gtcgtcgggc gccgacacgc
ggacctggag gtcgacggcc 55320gcgtcgtgcg ggacggcgag gggcgcgtga agggtgagct
ctcgcacgtg cgcggcaccg 55380ccggcttgga gggcgagttc gaggagggcg gtgccgggga
ggaggacgat gccgccgacg 55440ctgtggtcgg cgagccaggg gtgggtgtgc agggacaggc
ggccggtgag caaggttccg 55500tctccgtcgg cgagttccac ggtggcgccg aggagagggt
gctcggtggg gtgcagtccg 55560gcggcggaga cgtcgccggt gctggtggtg ggtgtgtgga
gccagtagtg gtggtgttgg 55620aaggggtagg tgggcaggtc gatggcgggg gtgtgggggg
tggtggggta gtgggtggtc 55680caggtgacgg tgtggccggt ggtgtgggcg tgggcgaggg
cggtgaggaa gcggtgggtg 55740tcgttgtcgt ggcggcgcag ggtgcccagg gcggtggtgt
ggtcgtggtc ttcgatggcg 55800ggggtcaggg tggggtgggg gctggtctcg atgaagaggc
ggtggccctg ggcggtgagg 55860gtgcggacgg cgtgggtgaa gcggacgggc cggcgcaggt
tgcggtacca gtaggcggcg 55920tccagggtgg tggtgtcggt ccaggcgtct tcgacggtgg
agtagaacgg gatgctgccg 55980ggctgggggg tgatgccgtg gagcatgtgg agcaggtcgt
gttcgatggt ttcggtgtgg 56040gggcagtggg atgcgtagtc cacggggatg cggcgggccc
gtaccccggt ttcggtgcag 56100tgggtgagga gttcgtccag ggcgtcggtg tcgccggaga
cggtggtggc ccgtggcccg 56160ttgaacgcgg ccctccacag ccgtcccggc cagtgggtgg
tgatgaggtc ctcggcctcg 56220gttccggtca gggcgagtga ggccatgccg ccgtggcccc
gcagcgcggc cagggcccgg 56280ctgcgcagtg ccacgatctt cgcggcgtcc tccagggtga
gtgcgccgca gacgtgggcg 56340gcggcgatct cgccctggga gtggccgacc accgcgtcgg
gttcgatgcc gtaggaccgc 56400cacagcgccg ccagggacac catgaccgag aacagcaccg
gctgcaccac atccgcccgc 56460tcccacaccg gatcctccgc gtcccggtgg agcatttcgg
tgagcgacca ctccacccac 56520ggcgccagcg cgcgttcaca cgcaccgaca tggtcggcga
acacctggga gtcgcgcagc 56580aggtcgagtc ccatgccggc ccactgaccg ccctggccgg
ggaagacgaa caccaccccg 56640ccaccggagt ggctgttggt ctggctgccg gtcgtgtctt
cgatgagtgc ggggtgcggt 56700tcacccgagg ccagtgcttc cagcgcgtct tcgaactctg
tccgttcccg tgccacgatc 56760acggcgcggt gttccaggac ggccctgcca tgggcgagtt
cggctccgac cgcccgtagt 56820gcggtgtcgg tgccggggcc ttggaggaag tcacgcagga
gtgcggcctg tgcccgcagc 56880gcggcgtttg aacgcgccga catcgcccag ggcaccgggg
cgccgtcttc tggggcgggg 56940tccccggccg gctgggcggg tgcctcttcc aggatgacgt
gggcgttggt tccgctgact 57000ccgaaggagg agatgccggc gcgtcggggc cggtccccct
ggcgtggcca ggggcggggc 57060tcggtcagca ggctgaccgc gccggacgac cagtcgacct
ggggggtcgg ctcgtcgatg 57120tgcagcgtgc gcggcagcgt ctcgtggcgc agtgccatca
ccatcttgat gatcccgccg 57180acacctgccg ctgcttgggc gtggccgatg ttggacttca
gggagccgag ccagagcggc 57240tggtcggggt ggtgttcctg gccgtaggtg gcgagcaggg
cctgggcttc gatggggtcg 57300ccgagtttgg ttcctgtgcc gtgggcttcg acggcgtcga
catcggcgcc ggtcaggccg 57360gagttggcca gggcctggcg gatcacgcgc tgctgggagg
ggccgttcgg tgcggtcagg 57420ccgttggagg cgccgtcctg gttgaccgcg ctgccccgca
ccaccgccag aacccggtgc 57480ccgttgcgct cggcgtccga caaccgctcc acgagcagca
tgcccacgcc ctcgccccag 57540ccggtaccgt cggccgcggc cgcgaaggac ttgcaccgcc
cgtccaccga caaaccccgc 57600tgccgggaga agtcgatgaa ggtgcccggt gaagacatca
ccgtcacgcc gccggcgagg 57660gccatcgagc attcgcccga tcgcagggct tggcctgcga
ggtgcagtgc gacgagcgac 57720gacgagcacg cggtgtcgac cgagacggcc ggcccctcga
aaccgaaggt gtaagccacc 57780cgacccgaca cgacactgcc cgcgtttccg ttgccgatgt
agccctccgc cccttcggga 57840acggcggtca aacgggcggc gtagtcgtgg tacatcacac
ccgcgaacac gcccgttcgg 57900gaaccgcgta cggcagcggg atcgatcccc gcgtgttcga
gggtctccca gacggtttcg 57960aggaggagcc gctgctgggg gtccatggca agggcctcac
gcgggctgat accgaagaac 58020tcggcgtcga actgcccggc gtcgtagagg aaaccaccgt
gccgggtgta cgacgctccg 58080gcccgctccg ggtccgggtc gaacagcccg gccaggtccc
acccgcggtc ggccgggaac 58140tccccgatcg cgtcaccgcc cgaagccacc agcccccaca
actcctccgg cgaccgcaca 58200ccgcccggga agcggcacgc catcccgacg atcgccagcg
gctcgtcact gccgacggct 58260gtggtttcgg cgtacggcga agtgctgtcc gcggcgtcgt
ccccgagcag ttccgtgcgc 58320agcaggcggg ccacggccgc ggggctgggc tggtcgaaga
ccaggctcgc cggcagtcgc 58380agtcccgtct ccgcgctcag gcggtttcgc agatccacgg
ccgtcaggga gtcgaagccg 58440aggtcgcgga aggccgagtc gaccgggatg gcttccggtg
cttggtggcc gaggacggtg 58500gcgacatgcg agcggaccag cccgagcagg gcctggtact
gctgttcggg tgtccgtccc 58560gcaagccgtg cccgcagcga cgcaccgctg tcagtggtgg
ggagggtggt gcggtggctg 58620gtgcgggcgg gcgcgaggtg ttccagaagg ggcggtgcgg
gatgggtggg gcgtaggtcg 58680gcgggcagga gtgcgggacg tccggcggcc agggcggtgt
cgaggagggc gagggcgtcg 58740ggggtggtca ggggatgcag ccccgagcgg gtgatgcggt
gacggtcacc ggcatccaga 58800tgcccggtca tcccgctggt ctcttcccac agtccccagg
ccagggagag ggcgggcaga 58860ccggcggcac ggcgctggtg ggccagggcg tccagggcgg
cgttggcggc ggcgtagttc 58920ccctgccccg gcgagcccag gacacctgcg gcggaggaga
acagcacgaa cgccgacagg 58980tccatccccg cggtcagctc gtgcagatgc agggcaccgt
ccaccttcgc cccgaccacc 59040gcatcgatct tctcccggtc cagacacgtc acggtggcgt
cgtccaggac accggccgta 59100tgcaccacag ccgtcagcgg atgctccgcg ggcacctgct
ccagcagggc ggcgacctgg 59160gcacggtcgg cgacatcgca cgccgccacc gacaccgaca
cccccgcccc acccagttcc 59220gcacacagtt cttcggcacc ggtagcggcc atgccgcgcc
ggctcaccag cagcagatgc 59280cgcaccccgt gcccggcggc cagatggcgc gcgaccaccg
ctcccagagt gccggtccca 59340cccgtcacca gcaccgtccc ctccgcatcg aaccggacgg
catccgacga ctccgacagc 59400ggcggcaccc gcttcaaccg tggtacgcga accaccccgt
cgcgtacggc gagttggctc 59460tcaccgcggg ccagagcgga cgcgacggcg gcatcgtcga
gaccggcgcc ggtgccgacc 59520ttcgcgtcgg cggacacgtc ggtgtccccg tcggtgtcgg
tgtcggtgtc ggtgtcgggg 59580tcggtcttgg tgtcggggtc gaggtccagc aggacgaacc
ggtcgggatg ctcggactgg 59640gccgagcgga ccagccccca cacagcggcc cccgccacat
cccgcaccgg ctcacccgca 59700tccaccgcga ccgcaccacg cgtcaccacg accagccgcg
catccccctc ccgctcatcg 59760gcgagccact cccgcaccac acccaacgcc gcggccgtga
cctcggccac cccaccaccc 59820gggcactccc acgccaccaa ctccccgccc ccaccggcca
ccgaggccgg caccccctcc 59880accggcaccc accccagctc gaacaacgac ccacgccgca
cccgagcccc cagcccctcc 59940aacggcaccg gacgcatcag gagcgactcg agggtgagaa
cgggtgctcc ggtctcgtcg 60000gtggcgtgca ggcgaacggc ggtcgaggtc gcgtcggtgg
gagtcatgcg caggcgcagg 60060acggtggcgc cccgggcgtg gagcgagaca ccgttccatg
tgtgagggac gagcccggcc 60120tgctgctggt ccgccagcag cagggtgacc gattgcacgg
ccgcgtccag gagcgccgga 60180tgcaccccga agcccgcgac atcggccagc ccctcgtccg
gcagacgcac ctcggccacc 60240acgtcgtccc cgtcccgcca ggccgcacac aaaccctgga
acaccggacc gtagacaaaa 60300cccccacccg ccagacggtc gtagagaccg tcgagaacga
ccggtcgggc gccgggtggc 60360ggccactccc cggccgccgc tgcctccgcg ttcggatcgt
cttcggtgga cggggacagc 60420acaccctcgg catgccgtgt ccactctccg tccgtctcct
cgtccccggc cggccgcgcg 60480tacacattca cggcgcgccg ccccgcctcg tccggcaccg
acaccgacac ctgcaccacc 60540acgtgccccg actcgggaat caccagaggg gcgtggagag
tgagctcgtc gacacgagga 60600cagccggtac gcagaccggc ctgaaaagcc agatccagga
gggcggtgcc cgggaggagg 60660acgattccgc cgacactgtg gtcggcgagc caggtgtggg
tgcgcaggga caggctgccg 60720gtgaggacga tcccgtcccc gtccgcgagc tccatcaccg
cacccagcag cggatggtcc 60780ggccgctgga gcccggcagc cgacacatcg cccgcaccgg
caccgggagt cgcctggagc 60840cagtagtgcc ggcgttggaa ggggtaggtg gggaggtcgg
ggatggtgtg ggggtggggg 60900ttggtgtggg tgtggtggtg ggtgtgccag gtggtggggg
tgtgggcgag tgcggtgagg 60960aggtggtggg tggggttgtg gtgggggtgg gtgagggtga
tggcggtggg ttggtggtgg 61020gggaggttgt ggtgggtgag ggtggtgagg gtgtggtcgg
gtccgagttc gatgtaggtg 61080gtgacgccgt tggtgtggag ggtgtggatg gtgtcggtcc
agtggacggg ttggcgggtc 61140tggtgggtcc agtagtgggg ggtgaggggg tcgccggggg
tgctggtgat gaggggggtg 61200tggggtgggt ggtaggtgag ggtgtgggtg gtggtgtcga
gttgttcgag gatggtgtcg 61260gtgtgggggg agtggaaggc gtggttggtg gtgaggggtt
tggtggtgat gccttgggtg 61320tggcaggtgg tggtgatgtg gtggatggtg tcggggtcgc
cgctgatgac gagggagtgg 61380gggctgttga cggcggcgat ggtggctttg ccggggtgtt
ggtcgaggag gggttggatg 61440tgttcggggg tggtgtggag ggtggtcatg gtgccggggg
gcatggtttg catgaggcgg 61500gcgcgggtgg tgatgaggtg ggtggcgtcg gggagggtga
ggatgccggc gaggtgggcg 61560gcggtgattt cgccgaggga gtgtccggcg aggtagtggg
gggtgatgcc gtaggtttcg 61620gtgatgaggt ggtggagggc tttttggagg gtgaagaggg
cgggttgggc gtagagggtg 61680tgggtgagga ggtcggtggg gtcgtggttg aggaggaggt
cgcgcagggg gtggtcgagg 61740tgggggtcga agtgggcgca ggtttcgtcg agggcgtcgg
cgaaggcggg gtaggtgtgg 61800tagaggccgg tggccatgcc ggggcgttgg gtgccttgtc
cggagcagag gaaggcgatt 61860ttgtgggtgt ggccgggttc ggtggggggt gtggggatga
ggtgggggtg ggtgcggccg 61920tcggccaggg cggtgagggc gtccaggagg gtgtcgcggt
cgggggcgat gagggtggcg 61980cggtggtcga acagggcgcg ggtggtggcc aggctgtagc
cgatgtcggc cgcgtcctgt 62040tcggggtggt ccaggacgtg ggcgtgcagg gcggcggcct
gggcgcgtag cgcggcctgc 62100gacttgcccg acacgaccca caccaacggc acatccgccg
acacccggcc cgcaccgtcc 62160gtctccacca ccaccgccgc agccgccgca gtctccaccc
ccgccggcgc ggcttccacc 62220gcctccacct ccgcccgcgc gggcgcctcc tccaggatca
cgtgcgcgtt cgtcccgctc 62280accccgaagg acgagattcc cgctcgccgg ggccggtcct
cccggcgtgg ccagggccgc 62340gcctcggtca gcaggctcac cgctcccgac gaccagtcca
cctgcggcga cggctcatcc 62400acatgcaacg tccgcggcaa cgactcgtgc cgcaacgcca
tcaccatctt gatgatcccg 62460ccgacacctg ctgccgcttg ggcgtggccg atgttggact
tcagggagcc gagccagagc 62520ggctggtcgg ggtggtgttc ctggccgtag gtggcgagca
gggcctgggc ttcgatcggg 62580tcgcccagct tggtccccgt gccatgggct tcgacggcgt
cgacatcaac tgcggagagg 62640ttcgcgttgg ccagggcctg gcggatcaca cgctgctggg
acggaccgtt cggcgccgtc 62700agcccgttcg aggcgccgtc ctggttgacc gcgctgcccc
gcaccaccgc cagaacccgg 62760tgcccgttgc gctcggcgtc cgacaaccgc tccagcagga
gcatcccggc tccctccgac 62820cagccggtac cgtcggccga ggcggagaac gccttgcacc
ggccgtccgc cgccaggccc 62880cgctgccgcg agaactccag gaacgcggta ggcgtggaca
tcaccgtcgc accgcccgcc 62940aaggccatgg tgcactcgcc cgaccgcagt gcctgacagg
ccagatgcag cgcgacgagc 63000gacgacgagc acgcggtgtc cacggacacg gcggggcctt
cgagcccgaa cgtgtaggcg 63060acccggcccg acgccacgct tccggacgtg ccggtgagaa
cgtacccgtc gacgtcggcg 63120gcgacatggt gccgtgagcg ggacgcgtac tcctgaggca
tgacgccggc gaacacgccc 63180gtctggctgc cgcgcacggc accggggtcg atacccgccc
gctcgaacgc ctcccacgtc 63240gtctccagca acagccgctg ctgggggtcc atcgcgagcg
cctcgcgcgg ggagatcccg 63300aagaatcccg cgtcgaactc ccccgcgtcg tagaggaatc
ccccgtgacg ggtgtacgag 63360gtgccccgct gcccgggctc cgggtcgtag agcgcctcca
cgtcccagcc acggtcggcc 63420gggaactccc ccaccgcgtc gccgccggag gcgacgagtt
gccagaggtc ctcggccgag 63480gcgacacctc ccggataccg gcatcccaca ccgatgatcg
cgatgggctc gtgctgcccg 63540gccttcggtt cggcggcggc aggtgccgaa ggcgtcttgg
tgtcgttggg gttgaggagg 63600gtggtgaggt ggtgggtgag tgcggtgggg gtggggtggt
cgaaggcgag ggtggtgggc 63660aggcgcaggc cggtggcgcg ggtgagccgg ttgcggagtt
cgacggcggt gagggagtcg 63720aagccgaggt cgcggaaggt gcgttcgggg tcgatggtgt
cgggggtggg gtggcccagg 63780acggcggcga tgtgggtacg ggccagggcc agcagggtgg
cgtgccgctg ttcggaggtc 63840agggtggcga ggcggtcggc gagggagacg tcctggcctg
cgcctgcact ggtgccggtg 63900tgtgcggtgc gggggctggt gcgggcgggc gcgaggtgtt
ccaggagggg tggtgcgggg 63960tgggtggggc gtaggtcggc gggcaggagt gcgggacgtc
cggtggccag ggcggtgtcg 64020aggagggcga gggcgtcggg ggtggtcagg ggatgcagtc
ccgagcgggt gatgcggtga 64080cggtcaccgg cgtccaggtg gccggtcatc ccgctggcct
cttcccacag tccccaggcc 64140agggagaggg cgggcagacc ggcggcgcgg cgctggtggg
ccagggcgtc cagggcggcg 64200ttggcggcgg cgtagttgcc ctgccccggc gagcccagga
cacctgcggc ggaggagaac 64260agcacgaacg ccgacaggtc catccccgcg gtcagctcgt
gcagatgcag ggcaccgtcc 64320accttcgccc cgaccaccgc atcgatcttc tcccggtcca
gacacgtcac ggtggcgtcg 64380tccaggacac cggccgtatg caccacagcc gtcagcggat
gctccgcggg cacctgctcc 64440agcagagcgg cgacctgggc gcggtcggcg acatcgcacg
ccgccaccga caccgacacc 64500cctgcctgac ccagttccgc acacagttct tcggcaccgg
cggcggccat gccgcgccgg 64560ctcaccagca gcagatgccg caccccgtgc ccggcggcca
gatggcgcgc gaccgccgcc 64620cccagagtgc cggtcccacc cgtcaccagc accgtcccct
ccgcatccag gggcacaggc 64680agggtcagca cgttcttgcc gacatgcagg cccgaccgca
tcgaccgcag cgcctggcgg 64740gcctggcgca cgtcccacgc ggtgaccggc aacggctcca
gcaccccgcg ccggaacaga 64800tccaccaccg tgtgcaggat ctcccccacc cgctgcgcac
ccgcgtccat caggtcatac 64860gcccggtagg acacccccgg gaaccgagcg gcgacctcac
cggcatcacg gatgtcggtc 64920ttgcccagct ccaggaaccg gcccccctgc ggcgaacaca
gccgcaacga ggcatcggtg 64980tactcacccg ccagacagtt cagcaccaca tccacacccc
gcccgccact ggcccggcgg 65040aaacgcgact cgaactccac actccgcgag gaagcgatcc
gctgcggcgc gacacccgcc 65100gcccgcagac gcgcccactt cgcctcactc gccgtcgcat
acacctccgc ccccagatga 65160cgggcgagct gcaccgccgc cgtaccgacc ccgccggccg
ccgcatggac cagcacactc 65220tccccccgcc gcacccccgc cagatcgacc agccccaggt
aagcggtagc gaacaccacc 65280ggcaccgaag ccgcctgcgc gaacgaccag ccctccggga
tacgggccag caacacctcc 65340tgcgccacca ccaccggcgc gaacgcgtcc ccgaacaccc
cgaacacccg gtctcccacc 65400accaggccct ccaccccggg ccccacctcc accaccaccc
ccgcaccctc actgcccacc 65460cccacctgac ccggcaccat ccccaacgcc accagaacat
cacggaagtt caccccggca 65520gcccgcaccg ccacccgcac ctgcccccga cccagcacca
ccccagccgc atccgaagca 65580accacaccca ccccctccaa caaccccgac ccaccaccat
ccagccgcca ccccacccca 65640ccaggcaacg acaacccttc acccgcaccc ccaagccgcc
cccaccgctc caaccgctcc 65700aaccgcggca cccgcacgac cccaccacgc acggcaacct
gtgcctcgcc acacgcgaca 65760aacccggcca catcgacacc agcaccaaca ccggcgccca
tgtcctcatc ggcatcgacg 65820acggtctcca cgccggtgcc ggggtcgagg tccagcagga
cgaaccggtc gggatgctcg 65880gactgggccg agcggaccag cccccacaca gcggcccccg
ccacgtcccg caccggctcc 65940cccgcatcca ccgcgaccgc accacgcgtc accacgacca
gccgcgcgtc cccctcccgc 66000tcatcggcga gccactcccg caccacaccc aacgccgcgg
ccgtgacctc ggccacccca 66060ccacccgggc actcccacac caccaactcc ccgcccccac
cggccaacga ggccggcacc 66120ccctccaccg gcacccaccc caactcgaac aacgacccac
gccgcacccc cgcccccagc 66180ccttccaacg gcaccggacg cagaaccaga gactccagcg
cgagcaccag cgcaccggtc 66240tcatcggcaa cccgaagact cacggtcgtt ccggccgcgt
cgacagacgt cacccggact 66300cgcagtgccc tggcaccccg ggcgtggagg gaagcaccgt
tccatgtgta aggcagcaga 66360ccggcctctt ggtcctcggg cagcaggagg gtgaccgtct
gcacggccgc gtccaggagc 66420gccggatgca ccccgaagcc cgcgacatcg gccagcccct
cgtccggcag acgcacctcg 66480gccaccacgt cgtccccgtc ccgccaggcc gcacacaaac
cctggaacac cggaccgtag 66540acaaaacccc cacccgccag acgaccgtag aactcatcga
gatccaccgg ctgcgcaccg 66600gacggcggcc acaccccgtc cgccaccggc tcaacaacca
ccgactcccc aggaacagac 66660ggacacacca caccctcggc atgccgcgtc cactcaccct
ccagccctcc gtcctccacc 66720agccgcccgt acacactcac accacgacga cccgcctcgt
ccggcaccga aaccgacacc 66780tgcaccacca catgccccga ctccggaacc accagaggag
catggagagt cagctcatcg 66840acaccaggac aacccgcacg cagaccagcc tgaaaagcca
gctccagcag agcggtaccg 66900ggcagcagga cgacgccgcc gacgctgtgg tcggcgagcc
aggggtgggt gtgcagggac 66960aggcgcccgg tgaggacgat tccgtccccg tccgcgagct
ccatcaccgc gccgagcagt 67020gggtggtccg gtcgctggag tccggcggcg gagacgtcgc
cggtgctggt ggtgggtgtg 67080tggagccagt agtggtggtg ttggaagggg taggtgggca
ggtcgatggc gggggtgtgg 67140ggggtggtgg ggtagtgggt ggtccaggtg acggtgtggc
cggtggtgtg ggcgtgggcg 67200agggccgtga ggaagcggtg ggtgtcgttg tcgtggcggc
gcagggtgcc cagggcggtg 67260gtgtggtcgt ggtcttcgat ggcgggggtc agggtggggt
gggggctggt ctcgatgaag 67320aggcggtggc cctgggcggt gagggtgcgg acggcgtggg
tgaagcggac gggccggcgc 67380aggttgcggt accagtaggc ggcgtccagg gtggtggtgt
cggtccaggc gtcctcgacg 67440gtggagtaga acgggatgct gccgggctgg ggggtgatgc
cgtggagcat gtggagcagg 67500tcgtgttcga tggtttcggt gtgggggcag tgggaggcgt
agtccacggg gatgcggcgg 67560gcccgtaccc cggtttcggt gcagtgggtg aggagttcgt
ccagggcgtc ggtgtcgccg 67620gagacggtgg tggcccgtgg cccgttgaac gcggccgtcc
acagccgtcc cggccagtgg 67680gtggtgatga ggtcctcggc ctcggttccg gtcagggcga
gtgaggccat gccgccgtgg 67740ccccgcagcg cggccagggc ccggctgcgc agtgccacga
ccttcgcggc gtcctccagg 67800gtgagggcgc cgcagacgtg ggcggcggcg atctcgccct
gggagtggcc gaccaccgcg 67860tcgggttcga tgccgtagga ccgccacagc gccgccaggg
agaccatgac cgagaacagc 67920accggctgga ccacatcggc ccgctcccac accggatcct
ccgcctcgcg gtggagcatc 67980tcggtgagcg accactccac ccacggcgcc agcgcgcgtt
cacacgcacc gatatggtcg 68040gcgaacaccc ccgaggtcgt cagcagatca agtcccatgc
cggcccactg accaccctgg 68100cccgggaaca cgaacaccac cccgccaccg gaatggctgt
ggctgccggt cgcgtcttcg 68160atgagtgcgg ggtgcggctc acccgaggcc agtgcttcca
gcgcgccttc gaactccgcc 68220cgctcccgtg ccacgatcac cgcgcggtgc tccagcacgg
ccctgccacg agccaactct 68280gccccgatat cccgcacccc ggcatccgta ccggggccgc
gcaggaactc acgcaagacc 68340atggcctgcg cccgcaacgc cgcacccgaa cgcgccgaca
ccacccaggg caccggagcc 68400ccgtcctcta ccgcagcctc cccgggccga cgggcgggtg
cctcctccag gatcacgtgc 68460gcgttggttc cgctcacccc gaacgaggac accccggccc
gccggggccg gtcctcccga 68520cgcggccagg gccgcgcctc ggacagcagg ctcaccgccc
ccgacgacca gtccacctgc 68580ggtgacggct catccacatg caacgtccgc ggcagcgact
cgtgccgcag cgccatcacc 68640atcttgatga tcccgcccac acccgctgcc gcctgggcgt
ggccgatgtt ggacttcacc 68700gagcccagcc acaacggctg ttccccggaa cgctcctggc
catatgtgtc gagcagggcc 68760tgggcttcga tcgggtcacc gagccgtgtg ccggtcccgt
gcccctccac cgcgtcgacg 68820tccgccaccg tcagccccgc gttggccagt gcctcgcgga
tcacgcgctc ctgcgagggg 68880ccgttcggtg cggtcaggcc gttggaggcg ccgtcctggt
tgaccgcgct gccccgcacc 68940accgccagaa cccggtgccc gttgcgctcg gcgtccgaca
accgctcgac cagcagcatg 69000cccacgccct cgcccatgcc ggtaccgtcg gccgcggccg
cgaaggactt gcaccgcccg 69060tccaccgaca gaccccgctg ccgggagaac tccacgaaca
ggagcggggt ggacatcacg 69120gtgaccccac cggcgagggc gagatcgcac tcgcccgtcc
gcagcgactg gcaggccagg 69180tgcagcgcca ccagcgacga cgaacacgcc gtgtcgacgg
tgacggccgg gccttccaga 69240ccgagcgtgt aggcgacgcg cccggaggcg acggcgccac
cgctgccgtt gccgatgtag 69300ccctcgaacc cttcggggat ggtggcgagc cgggaggcgt
agtcgtggta catcatgccg 69360gtgaagacac ccgctcgggc tccccggacc gaggaggggt
cgatcccggc ccgctcgaag 69420acctcccacg aggtctccag cagcaagcgc tgctgggggt
ccatggccag ggcctcacgc 69480ggactgatgc cgaaaagctc ggcgtcgaac tgtccggcgt
cgtagaggaa accaccgtgg 69540cgggtgtacg acgctccggc ccgctccggg tccgggtcga
acagcccggc caggtcccac 69600ccgcggtcgg ccgggaactc cccgatcgcg tcaccgcccg
aagccaccag cccccacaac 69660tcctccggcg accgcacacc gcccgggaac cggcacgcca
tcccgacgat cgccagcggc 69720tcctgtgccg cctcgacggc cgctgtgagc tgctggttgc
gccgccgcag ggcctcattg 69780gccttcaggg atgcccgcag cgcctcgacg agcttctcgc
tgggcgtagc catcggtgtc 69840tccaagtctg cgaatccggc aggtgcggac gcggtggtgt
ggacggggcg ggggtcggcg 69900gggaccgcgg cgggcgactc gggtggtgtc agcgacgccg
ctgctcggtg agcccggcca 69960gccaggtgtg gacgtgccgg gccgtcgact ccgcgtgctc
ttcgagcatc gtgaagtggt 70020tgccgtcggt ttcgaggacg gtgtgcggct cgccccacac
cggcggcggc tgttcgctct 70080cacgggcgcg gaggaagagg gtgggtgtct cgagggcggg
cggccgccag cccgcgaaga 70140tgcggaagta gccgcccatc gccaccaggc gggcgtagtc
caggtcgatg aactcggtga 70200cgcggtcgaa gatttcgctg gtgagggcgg cggcgacggg
ggccatcccc tcgtcgggca 70260ggtaggcgtc catgaccacc acggcctgcg gccggacgcc
caggtgttcc aggcggctcg 70320tgacggtgtg ggtgaaccag ccgccggcgg agtgtccggc
gagggcgaag ggctcgccgt 70380cggtgtggcg gaggatggcg tcggtgaaca gccgggtgat
ggtgtcgacg tcggcgggga 70440ggggctcgcc gtcggcgaag ccgggcgccg gcacgtacca
gacgtcgcgg agcccgtcga 70500gggccgccgc gaagcgggag tactggtaga cgctggacac
ggcggcgacg gtgggcaggc 70560agatcagcgc gggcccggtg tcgccctggg cgacgcggac
gaaggggggt cgggtcatag 70620ccgaggggtc ggtgaagcag ggccggaagg cggaggccgc
cgacagcagg gccatggact 70680cctcgacgcg gccgctgtcg tgaccgatcc agaacagggc
ttccaccgtg tcggcggacg 70740ggccgctccc ggctcgcgac gaggcggtgg catcgcgctc
cccaggggcg ccggccgttt 70800cggcggtcat atcggaggcc ggctcggcgg cgaggagcct
ttcgaggtgg tcggcgagcg 70860ctgccggggt cgggtggtcg aagacgagcg tggtggccag
gcgcagcccg gtcgctgcgt 70920tgaggcggtt gcgcagttcc acggcggtca gggagtcgaa
gccgaactcg cggaactcgc 70980cgtcggcggt gacggtgtcg gtgccgccgt ggccgaggac
ggccgcggcg tgggtgcgga 71040ccacctccgt cagcagggcg gtgcgctcgg cgggcttcgg
ggtcccggcc agtcgcccgc 71100ggagttcggc ggcggcgtcg gccccgacgc cgtggtcggc
ggtccggcgg gccggggtgc 71160ggaccaggcc cctgaggacg ggcggcaggg tgccgacggc
cgcctgctca cggagggtgc 71220ccgggtcgag gggggtggcg aggagcagcg gttcgtcgag
ggcgagggcg gtgtcgaaca 71280gggcgagccc gtgggcgttg gtgagcggga gcaggccgct
gcgggtcatc cgggcgacgt 71340cggcggcggc gaggtgctcg gccatgccgc ccgcctcggc
ccagcgtccc caggcgaggg 71400agcggccggg caggccgagg gcgtgccgtt gctgcatcag
agcgtcgagg aaggcgttcg 71460cggccgtgta gttggcctgt ccggggctgc cgaaggaggc
ggcggcggac gagaaggcga 71520tgaacgcgtc gagcccggcg tcgcgggtga ggtcgtgcag
gtgggcggcg ccgtgggcct 71580tggcgctcag gacggcgtcc aggcggtcgg gggtcaggga
ggtgaggacg ccgtcgtcga 71640cgacgccggc ggtgtggagc acggccttga gcggatgccg
tgccgggatc tcggccagca 71700gcgccgcgac ggcccgccgg tcggcgaggt cgcaggcgac
ggccgtcgtc cgggcgccca 71760gctcggcgag ttcggcgacg agttcggcgg tgccgggagc
ggtggggccg ctgcggctgg 71820tcagcagcag gtgccgtacg ccgtgggtga cgacgaggtg
gcgggcgagg agccggccga 71880ggtagccggt gccgccggtg atgaggacag tggcgtcggg
gtcccagtgt ccgctgtccg 71940ctcgggcgcc gaccgggatg cgggccagcc gcggggtgtg
ggcgcggccc tcgcgcagga 72000cggtctgcgg ctcaccggag agcagggccg cggccagggc
gcgccggctg gcgtcggtgt 72060cgtcgaggtc ggtgaggacg aaccggccgg ggttctcggt
ctgggcggag cggaccatgc 72120cccagacggc ggcgtgcgcg aggtcgggga cggagtcgcc
cggtgcggcg gcgaccgcgc 72180cgtgggtgac gaacgcgagc cgggagtccg cgaaccggtc
gtcggcgagc cagctctgca 72240gcaggtgcag gacgcggacg gtggcccgcc gggtggcgtc
ggccgcgtcg gcggcgccgt 72300cgcggtgcgg gcaggggacg acgaccacgt cgggtacggg
tgtgccggcc gaggccagtt 72360cctccagatc cgcgtatgtg ctccacggca cgccgggggc
gtcggggcac tcggcttcgg 72420agccgatcag cgccaggcgc gtcttcgacg acggtgtcct
gggcagcggt acgggcgccc 72480agtcgagccg gaagagggcg tcgtggtggg cggtgcgggc
cgagtggagc tgtccggccg 72540tgacgggccg gaacgcgagt gactcggccg tgacgacggt
gtgtcccgtg ctgtccgtgg 72600ccagcagggc gatcgtgtcg ggcgaccgcc gactgaggcg
gacgcgcagc gccgatgccc 72660cggaggccgt gacggtgacg ccgctccagg agaagggcag
ccagccgtgg ccctcgtccg 72720gctcgtcctc ggcgaagccg aggaccaccg ggtggagtgc
cgcgtcgagc agcgccgggt 72780ggacggcgta gcggtcggcg tcgcccgacg gtccgtcggg
cagtgcgacc tcggcgtaca 72840ggtcgtcccc gtgccgccag gcggcccgca gtccctggaa
cgcgggtccg tatccgaggc 72900ccgcgtcggc cagtgtcccg taccagtggt cgaggtcgac
ggggaccgcg tccgtgggcg 72960gccacggtgc ggcggtgtcg tgggcggtct ccgcgcgccg
tgtcaggacg cccgtcgcgt 73020ggcaggtcca gccggtcccg tccgtgccgg tcggcgcggc
gggggtgagt ccgtcgtctt 73080cccgcgcgta gagcgtgaag gggcgccgct ccaccccgtc
cggcgccgtc tcggtcgcgc 73140cgacggagag ctgcaggacg accgatccgc gctccggcag
gaccagcggg acctggagtg 73200ccagttcctc gacggtgtcg cagccgactt cgtcgcccgc
gcggacggcc agttcgagga 73260tggccgtgcc gggcagcagg acggtgccga agacggcgtg
gtcggcgagc cagggatggg 73320tgcgcagcga gatcctgccg gtgagcaggc attcctgcga
ctgcggtgat ccggccaggg 73380gtacggcgga gccgagcagg gggtgtccgg ctgcggtgag
tccggcggcc gagacgtccc 73440cggacaggct ggtgtcggcg tccagccagt agcggcggcg
gtcgaagggg taggtcggca 73500ggtcgaggtg gcgggcgcgt tccggtgtcg cgccgatgag
ggccggccag tggacggccg 73560tcccgccctt cggtgtgccc tgcacgtgga ggtgggcgag
cgcggtgagc agcgccaggg 73620gttcggggcg gtcggcgcgc agcagcggga ccagggcggg
accgggctcg gtggtgttgt 73680cgtcggcggg caggcactct ccggcgaggg cgcagagggt
tccgtccggg ccgagttcca 73740ggaaggtgcg taccccgtcg tcgtcgtgga ggcggcgtac
ggcgtcgccg aagcgtacgg 73800tgcggcgtag ctggcggacc cagtactccg ggtcggtgag
cgtgccggcc gtggcgcggt 73860cgccggtgac ggtggagacc accgggatcg tcggttcggc
gtaggcgatg ccggtcgcga 73920cctgccggaa ctcctccagg atcggttcca tcagcgggga
gtggaaggcg cggtcggtgc 73980tgaggcgttt ggtgcgcagg ccctgctcgg cgaaggcggc
ggcggcttcc agtacgtccg 74040gctcggctcc ggagatcacc accgacgtgg ggccgttgac
agcggcgacg gccacccgtg 74100cctcccggcc ggcgagcatc cgggtgacct gttcctcgct
cgcgcggacc gccagcatgg 74160ctccgccggg cggcagttgt gtctgcgcca gccggccccg
ggccgcgacc agccgggccg 74220cgtcggtcag tgagaggacg ccggcgacgt gcgcggcggc
cagttcgccg atcgagtggc 74280cggcgacgtg gtccgggcgg atcccggcgc tctccagcag
gcggaagagg gcgacctgga 74340gggcgaagag cgccggctgc gcgtcgccgg tgcggtccag
cggctgcggc tcgtcgagga 74400ggagcgggcg caggggccgg tcgagatggg gttcgagctc
cgcgagtacg tcgtccaacg 74460cctgggcgaa ggcggggtgg gcggcgtaca gctctcggcc
catgccgggc cgttgggttc 74520cctggccgga gaagaggaag gcgaccttgc cgtgggcggt
gcgggcgggg gattcgatca 74580ggccgggggc cgtggtgccc tcggccagtg cgtccagggt
gcgcaggagt ccctcgtggt 74640cctcggcgac cagcacggcc cggcgttcga atgtcgaccg
gccggtggcc agggcgtgtg 74700cgacgtcacc gatcgggatg tcggggttgg cggcgaggta
gtcgcgcagc cgcctggcct 74760gggcgcgcag ggcggtgtcg gtcctggccg agaggagaca
gggcaccgtc gcgggtccgg 74820cctcgtcctg cgacggtgcc tcctccggcc gtacctcttc
ctcctgcggt gcctcttcca 74880ggatgacgtg tgcgttggtg ccgctcaccc cgaacgacga
cacacccgca cgacgcggac 74940gctcaccccg ctcccacacc acctcctccg tcagcaaccg
caccgcaccc gacgaccaat 75000ccacatgcgg cgacggctca tccacatgca acgtccgcgg
cagacgaccc cgccacaacg 75060ccatcaccat cttgatcaca ccagccaccc ccgccgcagc
ctgcgtatga cccagattcg 75120acttcaccga ccccaaccac aacggcaccc cacgaccccg
cccatacgcc gccagcaccg 75180cctgcgcctc gatcggatca cccaacgacg tccccgtccc
atgcccctcc accacatcca 75240cctcagcagc cgacaacccc gcacacacca acgcctgacc
gatcacccgc tgctgagacg 75300gaccattcgg cgccgtcaac ccattcgacg caccatcctg
attcaccgca ctcccccgca 75360ccaccgccaa cacccgatgc cccagccgcc gcgcatccga
caaccgctcc accaacaaca 75420cacccacacc ctcggaccac cccaccccgt ccgccgccgc
cgcgaacgcc ttgcaccgcc 75480cgtccgccga caaaccccgc tgccgcgaga actccacgaa
cgcccccggc gtcgacatca 75540ccgtcacacc ccccgccaac gccaactccg actcacccga
ccgcaacgac tgacacgcca 75600gatgcaacgc caccaacgac gacgaacacg ccgtgtccac
cgtcaccgcc ggaccctcca 75660acccgaacgt gtacgacaac cgccccgaca acacactccc
cgacacaccc gtcagcgcat 75720acccctccag atcccggcca ccccgacgca ccaactccgc
ataatcctga ttcgccaccc 75780ccgcgaacac acccgtacga ctcccccgca acgacaccgg
atcgatcccc gcccgctcca 75840gcgcctccca ggacacctcc agcaacaacc gctgctgcgg
atccatcgcc aacgcctcac 75900gcggcgaaat cccgaaaaac cccgcatcga actccgccgc
acccgccaaa aacccacccg 75960accgcgtata cgacgacccc ggccgccccg cctccggatc
gtaaagaccc tccacatccc 76020agccccggtc caccgggaag ccgccgatcg cgtccccacc
cgaggcgacc aactcccaca 76080aatcctccgg cgaccacaca ccccccggaa aacgacacgc
catccccacg atcgccaccg 76140gctcatccac cacaccgggt cgggccgcga cgggcggtgt
cgccggggcg gttccgcaca 76200gctggtcccg gaggtgccgg gccaggacgg cggggcgcgg
gtagtcgaag accagtgtcg 76260tgggaaggcg cagtccggtg gccgtgttga ggcggttgcg
cagttcgacg gcggtgagcg 76320agtcgaagcc gagttcgcgg aaggcccggt cggccgggac
ggtgtcggcc tcccggtggc 76380cgagcaccgc ggccgtgtgg gtgctgacca gttccagcag
gacgtccgtc cgggcgtccg 76440gttccagggc ggcgagccga tcgcgcaggt cggtgccgtg
ggcggtgccg gtgccggtgg 76500cgcgggtgag cgcccggacg tcggggaggt cgccgatcag
gggcaggcgg ctgccggcgg 76560tgtgggcggt gaagcgctcc cagtcgatgt cggcgacggt
caagccgctc tcgtcgtggt 76620ccagcacccg ggccagggcc gccagggcga gttcgggcgc
catctccgtc agtccccggc 76680ggtgcagccg ggcggccgcg tccggccggc cggcggcgct
gtgtccgcgc caggggcccc 76740aggccaccgc ggtggaaggc agtccgagac cgcgccggtg
gacggcgaga gcctcgacat 76800aggcgttggc cgccacatag gcaccctggc caccggaacc
gaacgtggcg gcggccgagg 76860agaacaccac gaacgccgaa agatccgcac cccgcgtcag
ctcgtgcagg ttccgcgcac 76920ccaccgccct ggccgccagc accccctcca gccgctccgg
cgtcaacgcg tccagcacac 76980cgtcgtccac cacccccgcc gtatgcacga ccgcccccag
cggacaatcc tcgggaaccg 77040ccgtggccag cagctccgcg agcgcccccc gatcggccac
atcacaggcg gcgatggtca 77100cccgggcgcc ccactggtcc gtcgtgtcgg cgaggccgaa
gccgtcgccg gaggtactga 77160tcagcagcag gtgttcggct ccgcggtcgg ccagccatcg
ggcgaggtgg gcggccggct 77220gctcggggtc ggcgttctcg ccggtgatca ggacggtgcc
gcgcggccgc cacccaccag 77280cctccgctcc ccctcccgga gcacgcacca gacgccgcac
gaacaccccc gacgcccgga 77340ccgcgacctc cccctcaccc ccgccccccg acagcacacc
cagcaaaccc tccaccaccc 77400gctcgtcgac cacctccggc agatcgacca ccccacccca
ccgatccggc aactccaacc 77460ccgccacccg gcccaaaccc cacaccacag ccccacccgg
atcccccagc cgatccccct 77520cccccaccga caccgcaccc cgcgtcacac accacaacgg
cacccccaac ccctccaccg 77580cctggaccaa ccccagcacc aacccggcaa cacccacccc
acccccacac acagccagca 77640cccccaccgg cccctcacca ccccacacct cacgcaaccg
ctcccccaac accacccgat 77700ccgcacaacc cccctccaca gccaccaccc gcacacacac
ccccgcccac tccaaaccct 77760ccaccacagc agcagcaccc accaccccct caggcaccac
caccacccac accccacccg 77820acaccacacc accacccgac cgcgacaccg gacgccacac
cacccgatac cgccagccat 77880cgaccaccgc acgctcccga acaccccgac gccagtcgcc
gagcgcggac accagcgcgt 77940cgagcggcgc gtcctcgtcc acggcgagca gcgcggccac
ggccgccggg tcctctcgtt 78000cgacggcttc ccacagcggg ccgtcctccg tggtggccgg
cgcggccggt gtctcctccg 78060ggtccagcca gtaccgctca cgctcgaagg cgtacgtcgg
cagctccacc cggccggcgg 78120tcccggacgg tttgccgccc agtacggccg cccagtccac
ccgtaccccg cgcacggaca 78180gctccgccgc ggaggccagg aagcgccgca gaccgccctc
gccccggcgc agtgagccga 78240ccaccagggt gtcggcggcg ccgaggtcgt cgagcgtctc
ctggaccgcg acggagaccg 78300cggggtgcgg gccggcctcg acgaagacgg tgtgcccgtc
gcgggcgagg gcccgggtgg 78360cgtcccggaa ccggacgggc tcgcgcaggt tgcggtacca
gtacgcggcg tcgagtgcgg 78420tgccgtcgac gggctcgccg gtgaccgtgg agtagagcgg
gatgtcggcg gggcgggggg 78480tgacgggagc gagaaggccg agcaggtctg cgcggatcgc
ctcgacctgc ggggagtgcg 78540aggcccagtc gaccttcagc aggcgggccg ggacgccgtc
ccgggtcagg tcgtcgacca 78600gggcggtgac cgcgtccggg gagccggaga ccacggccga
gcgggccccg ttgtcggcgg 78660cgaccaccag gctcgggtcc acggcggcga gccgcggttc
caggtcctcg gccggcagac 78720cgaccgaggc catggccccc tgtccggcga gcgcggcgag
ggcctggctg cgcagggcgg 78780tgacgcgcgc cgcgtcctcc agggagaggg caccggcgac
gcaggccgcc gcgatctcgc 78840cctgggagtg tccggcgacg gcgtcggggc ggacgccgta
ggagcgccag agggccgcca 78900gggacaccat gaccgcgaag agcacgggct ggacgacgtc
gacccggtcc agcggcgggg 78960cgtccggttc gccgcgcagg acgtcgagca gttcccagtc
gaggtacgga cgcagggcgt 79020cggcgcattc ggtcatgcgc tgggcgaaga ccggtgaaga
gtccaggagt tcggcggcca 79080tgccgtccca ctgggtgccc tggccgccga agagcagcgc
gattttgccg tccgcctcgg 79140ggccggtgcg tccggccacg actccggccg tcggcaggcc
ggtggcgagg gcgtcgaggc 79200cgtgccggaa accgtcgagg tcctcggcga gcacgaccgc
ccggtgctcc agccacgccc 79260gctccaccgc cagcgcacgc ccgacctcca ccggagccgc
ccccgcccca tcggcgaaca 79320cccgcaaccg ccgcgcctgc ccccgcaacg ccgactccga
acgagccgac accacccacg 79380gcaccaccgc gggtccggcc tcgccctgcg acggtgcctc
ctccggccgt acctcttcct 79440cctgcggtgc ctcctccaga atcacatgcg cgttggtgcc
gctcaccccg aacgacgaca 79500cacccgcacg ccgcgggcgc tcaccccgct cccacaccac
ctcctccgtc agcaaccgca 79560ccgcacccga cgaccaatcc acatgcggcg acggctcatc
cacatgcaac gtccgcggca 79620gacgaccccg ccacaacacc atcaccatct tgatcacacc
agccaccccc gccgcagcct 79680gcgtatgacc cagattcgac ttcaccgacc ccaaccacaa
cggcacccca cgaccccgcc 79740catacgccgc cagcaccgcc tgcgcctcga tcggatcacc
caacgacgtc cccgtcccat 79800gcccctccac cacatccacc tcagcagccg acaaccccgc
acacaccaac gcctgaccga 79860tcacccgctg ctgagacgga ccattcggcg ccgtcaaccc
attcgacgca ccatcctgat 79920tcaccgcact cccccgcacc accggcaaca cccgatgccc
cagccggcgc gcatccgaca 79980accgctccac caacaacaca cccacaccct cggaccaccc
caccccgttc gtcggcgtcg 80040cgaacgcctt gcaccggccg tccggcgaca aaccccgctg
tcgcgagaac tccacgaacg 80100cccccggcgt cgacatcacc gtcacacccc ccggcaacgc
caacttcgac tcacccgaac 80160gcaacgactg acacgccaga tgcaacgcca ccaacgacga
cgaacacgcc gtgttcaccg 80220tcaccggcgg acccttcaac ccgaacgtgt acgacaaccg
ccccgacaac acactccccg 80280acacacccgt cagcgcatac ccctccagat cccggccacc
ccgacgcacc aactccgcat 80340aatcctgatt cgccaccccc gcgaacacac ccgtacgact
cccccgcaac gacaccggat 80400cgatccccgc ccgctccagc gcctcccagg acacctccag
caacaaccgc tgctgcggat 80460ccatcgccaa cgcctcacgc ggcgaaatcc cgaaaaaccc
cgcatcgaac tccgccgcac 80520ccgccaaaaa cccacccgcc cgcgtatacg acgaccccgg
ccgccccgcc tccggatcgt 80580aaagaccctc cacatcccag ccccggtcca ccgggaagcc
gccgatcgcg tccccgcccg 80640aggcgaccaa ctcccacaaa tcctccggcg accacacacc
ccccggaaaa cggcacgcca 80700tccccacgat cgccaccggc tcatccacca caccggaccg
gatgaaggcg ggccggccgg 80760ccggggcttc cccgccggtg ctcagcagtg tgccgaggtg
tgtggccagg gcggacgggt 80820tggggtagtc gaacaccagc gtgctgggca accgcagtcc
ggtggccgtg ttgaggccgt 80880tgcgcagttc gacggcgttg agcgagccga agccgagctc
ccggaaggcc cggtcggccg 80940ggacggcggt ggcggtgcgg tggccgagga cggtggcggt
gtgggtgcgg acgaggtcga 81000gcagggcgcg gtcccgttcg gccggttcca gggcggccag
gcgtgcgcgg agcgagccgg 81060gggcctccgt gccggtggcg ggccgggcga gccgggcctc
ggggatgtcg gagagcagcc 81120gggcgaggcc gtcggcggcg ggcaggcgct cccagtcgat
gtcggcgatg gtgaggcagg 81180tctcgttgcg gtccagtacc tggccgaggg cggacagcgc
gggctcggtg tccatgggcc 81240ggatcccgcg gcggtccatc cgcgtggcgg cctccgcgtc
cgcggccatg cccccgcccg 81300cccaggcgcc ccaggccacc gcggtggagg gcagtccgag
accgcgccgg tggacggcga 81360gagcctcgac ataggcgttg gccgccacat aggcaccctg
gccaccggaa ccgaacgtgg 81420cggcggccga ggagaacacc acgaacgccg acagatccgc
accccgcgtc agttcgtgca 81480ggttccgcgc acccaccgcc ttggccgcca gcaccccctc
cagccgctcc ggcgtcaacg 81540cgtccagcac accgtcgtcc accacccccg ccgtatgcac
gacggcaccc agcggacaat 81600cctcgggaac cgccgtggcc agcagctccg cgagcgcccc
ccggtcggcc acatcacagg 81660cggcgatggt cacccgggcg cccatcgcgg tgagttccgc
gcggagctcc ccggcaccct 81720tggcctcgcg tccgctccgg ctgaccagca gcaggtgctc
ggccccgcgc cggaccatcc 81780agcgggcgac gtgtgctccc agagcgccgg tgccgccggt
gatcaggacg gtgccgcgcg 81840gccgccaccc accagcctcc gctccccctc ccggagcacg
caccagacgc cgcacgaaca 81900cccccgacgc ccggaccgcg acctccccct cacccccgcc
ccccgacagc acacccagca 81960aaccctccac cacccgctcg tcgaccacct ccggcagatc
gaccacccca ccccaccgat 82020ccggcaactc caaccccgcc acccggccca aaccccacac
cacagcccca cccggatccc 82080ccagccgatc cccctccccc accgacaccg caccccgcgt
cacacaccac aacggcaccc 82140ccaacccctc caccgcctgg accagcccca gcaccaaccc
ggcaacaccc accccacccc 82200cacacacagc cagcaccccc accggcccct caccaccaca
cacctcacgc aaccgctccc 82260ccaacaccac ccgatccgca caacccccct ccacagccac
cacccgcaca cacacccccg 82320cccgctccaa accctccacc acagcagcag cacccaccac
cccctcaggc accaccacca 82380cccacacccc acccgacacc acaccaccac ccgaccgcga
caccggacgc cacaccaccc 82440gataccgcca gccatcgacc accgcacgct cccgaacacc
ccgacgccag tcgccgagcg 82500cggacaccac ggatcccaaa ggcgcgtcct cgtcgacttc
cagcagggcc gcgaccgccg 82560gcaggtccgc gcgctcgacc gcctgccaca gcgggccgtc
ctccccggcc ggcagcgcgg 82620ccggtgtctc gcccgcgtcc agccagtacc gctcacgctc
gaacgcgtac gtcggcagct 82680ccacccggcc ggcggtcccg gacggtgcgc cgcccagcac
ggccgcccag tccacccgca 82740ccccgcgcac ggacagcccg gccacggcgg cgagcacgga
cacggcctcc ggccggtccg 82800gtcgcagtgc ggggagcagg ggggcgggtt cggtgagggc
gtcctggccg agggcgcaga 82860gtgtgccgtc cgggccgagt tcgaggtagg cggtgacgcc
ctgggcctgg agccaggcga 82920ggccgtcgcc gaagcggacg gtgtggcggg cgtgctggac
ccagtagtca gcggtgccca 82980tggtgtcggc ggagacgggg gcgccggtga ggttggtgac
cacggggatg cgcggcgggg 83040cgaagacgac ctgctccgcg acgcggcgga agtcgtccag
tacggcgtcc aggtgcgggg 83100agtggaaggc gtggctggtg cgcagccgcc gggtccggcg
gccctgttcc gcccagtggc 83160gggcgagcgt gagtacggcg tcctcgtcgc cggcgaggac
gaccgcgcgc gggccgttga 83220cggcggccag gtccgcgcgc ccctcggcat cctggagcag
cggccggact tcctcctccg 83280tcgcctcgac ggcgaccatg gcgccggtgt ccggcagcgc
ctgcatcagc cggccccggg 83340ccgtcaccag ggcggccgcg tcggggaggg agagcatccc
ggcgacgtgt gcggcggcca 83400gttcaccgac ggagtgcccc aggaggtagt cgggtgtcac
cccccagttc tcgaccagcc 83460ggtacagcgc gacctcgacg gcgaacaggg cgggctgggc
gtattccgtc tgttcgatca 83520gctcggcgcc gggggatccg ggggccgcga acacgatgtc
gcgcagggtg tggcctgctt 83580ccccgatcgg gccgaagtgg gcgcacacct cgtcgaaggc
gtccgcgaag gcggggaagt 83640gcgcgtggag ttcgcggccc atggccgggc gctgtgtgcc
ctggcccgcg aagaggaacg 83700ccagcgggcc ttcgtcggtc gcggtgccgg tgacgacttc
cggggcggga cggccggtgg 83760cgagggcgtc gaggccgtgc cggaaaccgt cgaggtcctc
ggcgagcacg accgcccggt 83820gctccagcca cgcccgctcc accgccagcg cacgcccgac
ctccaccgga gccgcccccg 83880ccccatcggc gaacacccgc aaccgccgcg cctgcccccg
caacgccgac tccgaacgag 83940ccgacaccac ccacggcacc accgcgggtc cggccccgtc
ccccgacgga accaccaccg 84000gcccgacgcc gtcccccgac ggtgcctcct ccggccgtac
ctcttcctcc tgcggtgcct 84060cctccagaat cacatgcgcg ttggtaccgc tcaccccgaa
cgacgacaca cccgcacgcc 84120gcggacgctc accccgctcc cacaccacct cctccgtcag
caaccgcacc gcacccgacg 84180accaatccac atgcggcgac ggctcatcca catgcaacgt
ccgcggcaga cgaccccgcc 84240acaacgccat caccatcttg atcacaccag ccacccccgc
cgcagcctgc gtatgaccca 84300gattcgactt caccgacccc aaccacaacg gcaccccacg
accccgccca tacgccgcca 84360gcaccgcctg cgcctcgatc ggatcaccca acgacgtccc
cgtcccatgc ccctccacca 84420catccacctc agcagccgac aaccccgcac acaccaacgc
ctgaccgatc acccgctgct 84480gagacggacc attcggcgcc gtcaacccat tcgacgcacc
atcctgattc accgcactcc 84540cccgcaccac cgccaacacc cgatgcccca gccgccgcgc
atccgacaac cgctccacca 84600gcagcacacc gacaccctcg gacatcccgg tgccgtcggc
cgcggcggcg tagggcttgc 84660agcggccgtc cgccgacagg ccccgttggc gcgagaactc
cacgaacatg gcgggggtgg 84720acatcacggt gaccccgccg gcgagtgcga gggaggattc
gcccgacctg acggactggc 84780aggccaggtg cagtgccacc agcgatgacg agcacgccgt
gtcgaccgtc accgcggggc 84840cttcgaaccc gaaggtgtag gagagccggc cggacaggac
gctcgccgcg ttgccgttgc 84900cgaggaagcc ctgaaggtga tccggaacgg acagcagacg
ggtggcgtag tcgtgcgaca 84960tcatccccgc gaagacgccg gtgcggctgc cgcgcagggt
ggccgggtcg atcccggccc 85020gctccagcgc ctcccaggac acctccagca tcaaccgctg
ctgcgggtcc atcgccagcg 85080cctcgcgcgg ggagagaccg aagaatcccg cgtcgaactc
cgctgcctcg tgcaggaatc 85140cgcccgatcg cgtgtacgac cgccctgccc ggcccggctc
cgggtcgtag aggtcctcga 85200cgtcccagcc ccggtccacc gggaagtcgc cgatcgcgtc
cccgcccgag gcgaccagct 85260cccacaggtc ctcgggcgat cgcacacctc ccgggaaccg
gcacgccatg cccacgatcg 85320cgaccggctc ctgcctgccc gactcgacct gctccagccg
gcgccggacc cgcagcagat 85380cggcggtcgc gcgcttgagg tactcgcgga gcatttcctc
gttggccatg acggggtctc 85440ctcgccgctg cgctggaggt ggcacggaac cccgccagat
tagggtgggc aagtcaaccc 85500gaataccccc tatacacccc agactggcta cgtgaagcga
atacccgttc aaataggggg 85560aagagccgca ggcatggatc gttacgcgaa gcgtttcgag
gaccggctgg tcctggtcac 85620gggggcgggg agcggcatcg ggcgggcgac ggcctgccgg
ttcggtgccg ccggggcgcg 85680gctggtgtgt gtggaccggg acgggcccgg cgcggaggcg
accgccgaac tggcgcgtgc 85740gcggggggcg cgggcggcgt gcgccgaggt ggccgacgtc
tcggacgagg tggcgatgga 85800gcggctcgcc gcgcgcgtca cggccgcgca cggcgtgctg
gacgtgctcg tgaacaatgc 85860cggtatcggc atgtcggggc ggtttctcga cacgtcggcc
gaggactggc gccgcaccct 85920gggggtgaat ctgtggggcg tcatccacgg gtgccggctc
ctcggccggg gcatggccga 85980gcgccggcag ggcggtcaca tcgtgacggt ggcctcggcg
gccgcgttcc agccgacccg 86040ggtcgttccg gtgtacgcca ccagcaaggc cgcggccctg
atgctgagcg agtgtctgcg 86100cgcggagttg gcggagttcg gcatcggtgt gagcgtggtc
tgccccggcc tggtccgtac 86160gccgttcgcg tccgcgatgt acttcgccgg cgcgtccccc
gacgagcaca cccggctgcg 86220tgagtcctcc gcccgccgct tcgcgggccg cggctgcccg
ccggagaagg tcgcggacgc 86280cgtcctgcgc gcgatcatgc ggacggcctt gccgacggtg
accgggtcga cgccgtagag 86340ctggatcagc gcggtctcct cgcccgtctc cggcttgacc
tcgaagtacg cgagcggctc 86400ggcgtcggcg gctgccgcgt cgtacagcag gatgcgcaga
tccggaagtc ctgctcttcg 86460acgagccgtt cagcgcgctg gacccgctga tccctttagt
gagggttaat tgcggccgcg 86520ttccagccga cccgggtcgt tccggtgtac gccaccagca
aggccgcggc cctgatgctg 86580agcgagtgtc tgcgcgcgga gttggcggag ttcggcatcg
gtgtgagcgt ggtctgcccc 86640ggcctggtcc gtacgccgtt cgcgtccgcg atgtacttcg
ccggcgcgtc ccccgacgag 86700cacacccggc tgcgtgagtc ctccgcccgc cgcttcgcgg
gccgcggctg cccgccggag 86760aaggtcgcgg acgccgtcct gcgcgcgatc gtccgcaata
cggcggtggt cgccgtcacc 86820cccgacgccc gcgccgtccg tctgatgagc cgcttcgcgc
cccgcctccg cgccgtcgtg 86880gcccggctgg acccgtaggc agggcccgta cgggcagcgg
gcgtccggtt cgggccaccg 86940gccgcggtat ccgcgcccct gcccggagct gtgccgctcc
gggcaggggc gcgcggacga 87000ggcggtccgg cccggcggcc cggacctggc ggtccgttac
tcaaaccgcg tgagcgtcag 87060ccggatcccg gtgggagcgg tgtcctggat gtaggaggcg
aagtcggcca cgtcgtcgaa 87120ggcgaagccg taggctcggc cgtcctccgt gatcgcgtgc
atcgccttgg cgtagtggtt 87180ggtcagcgcg gtcctgtaga aggccgcggg gtcggtcgtg
ggctgggcgg cggaggtgag 87240cagggtcgag cggttgaatc cggcgccgag gaccgcggcg
acgggaccgg tggtgccgtc 87300gttcggcgcg gcgagggcac cgtggcagaa gagcacgtcg
cgcgtggtgg gcttggcgaa 87360ggacacctgg gcgggcccgt cgaaggtcag ccgctcgccg
cgcacccggc cggtgaaggt 87420cccggcgttg gtggtgaccg tgaggtccct ggcggtgtag
gtgctccaca cctcgtcgat 87480gtacggagcg aggtagtcct tcgggaacag gccggcgtcc
agcccgtgcc cgggggcgat 87540cacacggagg tcgtccagga ccagtggcgc gaactccgcg
acgcggcgga ccgcttcgaa 87600cgccgccgcc cggccgccgg cccgcacggt gccggtcgtc
tggtccttcg cgcccgtcag 87660ccggatactg agcggcacgc tgaacatgtc caccatggtc
gtgttgcaga acatgccgga 87720ggggttgtag gtgaactcgg cgcagtcgtg cagcaccctg
tagttcggat cggacgcgac 87780ccagccggcc gggtactgca gcgcggcgtt cccggctccg
tccgtgacca ccttgaactt 87840gagtttctgt ccgagcgcga catagatccg gccggacatg
tacggcaggg agagccgggt 87900ctcgccgctg ccggccagtg tgatcgcgta gtccgtgaag
ccgtcggggc cgttgtccga 87960cagggcgacg ggcgcgaggg tgccgtcggg cgtgagccgt
acctgtcggc cgtcctggtt 88020ccccacgacg tagacatgga cgtcgccgtt gccgaagacg
ccggtgtcgt tgacgaccgt 88080cagcggcagg gcgcccgccg tggtcccttc cgcgtcgcgg
tcccggccgg ggccggcgag 88140ggcgtgcggt gccagggctg cgacggcggg agcggccatc
gcggcgccgc cgagggcgac 88200gagcagggtg cggcggccga ggctgcgctg gtgtcgagga
gtcatgtggg gggcctcctg 88260gtgggcttgc cgatgttcta atgacgggaa catgacaggt
gagaagcgtg ggagcgctcc 88320tcagggcccg atggtacgca cggggaggcg tcccgcgtcc
ccgtgccggg accgcttaac 88380cgacgcttaa gggccgttta
88400
SEQ ID NO:2 (length 922, type PRT, organism: bacteria)
Met Arg Gly Val Ser Pro Ser
Val Ser Val Arg Glu Pro Gln Gly Leu1 5 10
15Thr Phe Leu Gly Leu Gly Arg Gln Ser His Ala Val Arg
Thr Ala Leu 20 25 30Glu Ala
Cys Ala Ala Gly Arg Val Arg Val Leu Val Val Glu Gly Gly 35
40 45Leu Gly Cys Gly Lys Ser Ala Phe Leu Gly
Glu Ala Leu Lys His Ala 50 55 60Ala
Ala Ser Gly Phe Leu Val Leu Arg Ser Ala Gly Ser Pro Pro Glu65
70 75 80Gly Arg Arg Pro Phe Asp
Leu Leu Arg Gln Leu Ala Val Asp Pro Asp 85
90 95Ile Pro Asp Ala Gln Arg Ser Leu Leu Gln Asp Ala
Val Gly Thr Glu 100 105 110Thr
Pro Ala Ala Gln Arg Val Arg Ala Ala Leu His Gln Leu Thr Gly 115
120 125Ala Ala Pro Val Val Ile Gly Ile Asp
Asp Leu His His Ala Asp Pro 130 135
140Gln Ser Leu His Cys Leu Leu Gln Ala Val Asp His Pro Arg Ala Thr145
150 155 160Arg Leu Leu Leu
Val Cys Thr Ala Leu Pro Ser Gly Leu Ala Ala Asp 165
170 175Pro Ala Val Glu Ala Glu Leu Leu Cys Gln
Pro Ala Leu Gln Arg Val 180 185
190Met Leu Gly Arg Leu Ser Leu Arg Ala Val Ser Gly Leu Arg Ala Ala
195 200 205Arg Pro Gly Pro Ala Val Glu
Ala Leu Pro Ala Asp Asp Leu Leu Ala 210 215
220Val Thr Gly Gly Asn Pro Leu Leu Val His Ala Leu Leu Glu Glu
Leu225 230 235 240Val Glu
Ser His Thr Gln Gly His Thr Asp Glu Arg Ala Gly Arg Arg
245 250 255Arg Arg Ala Ala Ser Pro Val
Ile Gly Gly Arg Phe Tyr Gln Ala Val 260 265
270Leu Ala Ser Leu Ser Arg Thr Asp Ser Leu Val Arg His Ser
Ala Gly 275 280 285Ala Leu Ala Val
Leu Gly Asp Ser Gly Cys Ala Glu Val Ile Ala Arg 290
295 300Leu Leu Gly Ile Gly Arg Ala Met Ala Ala Arg Gly
Leu Arg Ala Leu305 310 315
320Glu Ala Thr Gly Leu Thr Ala Ser Gly Arg Phe Arg His Pro Val Val
325 330 335Glu Ala Ala Ala Leu
Asp Thr Leu Asp His Asp His Arg Ala His Leu 340
345 350His Arg Arg Ala Ala Ala Leu Leu Tyr Asp Val Gly
Ala Glu Pro Asp 355 360 365Glu Val
Ala Arg His Leu Leu Ala Ala Arg His Ala Ala Gly Pro Trp 370
375 380Ala Met Ser Val Leu Arg Asp Ala Ala Glu Gln
Leu Leu Met Arg Asp385 390 395
400Asp Val Leu Thr Ala Val Ser Cys Leu Glu Leu Ala Arg Arg Ser Cys
405 410 415Ala Gly Gly Pro
Arg Arg Ala Glu Ile Leu Leu Arg Leu Thr Val Ala 420
425 430Thr Arg Arg Thr Asp Pro Ala Ala Ala Glu Asp
His Leu Ala Glu Leu 435 440 445Val
Thr Glu Leu Arg Ala Gly Arg Leu Thr Ser Ala Glu Thr Glu Arg 450
455 460Leu Gly His Leu Leu Leu Gly Cys Gly Arg
Leu Glu Glu Ala Thr Glu465 470 475
480Val Met Gly Arg Pro Gly Pro His Gly Asp Pro Arg Thr Pro Arg
Leu 485 490 495Glu Thr Gly
Phe His Ala Ser Ala Leu Trp Glu Pro Leu Ile Arg Pro 500
505 510Arg Thr Asp Pro Glu Pro Gly Asp Glu Glu
Ser Pro Arg Pro Arg Met 515 520
525Pro Val Thr Gly Ile Trp Asp Leu Pro Gly Asp Gly Thr Asn Ala Ser 530
535 540Ala Ser Asp Ala Ala Glu His Val
Leu Arg Ser Leu Pro Leu Thr Asp545 550
555 560Thr Thr Leu Val Ile Val Val Asn Ala Val Arg Val
Leu Cys Arg Thr 565 570
575Gly Ser Tyr Glu Thr Ala Ala Leu Trp Cys Thr Arg Leu Leu Gly Glu
580 585 590Ala Ala Gly Arg Arg Leu
Pro Gly Trp Lys Ala Gln Phe Leu Ala Leu 595 600
605Gln Ala Glu Ile Ala Leu Cys Arg Gly Leu Leu Ala Asp Thr
Glu Glu 610 615 620Tyr Ala Arg Gln Ala
Leu Ala Cys Val Pro Arg Cys Ser Arg Ser Val625 630
635 640Phe Ile Gly Gly Pro Leu Ala Ser Arg Val
Phe Ala Ala Thr Ala Met 645 650
655Gly Arg Tyr Asp Glu Ala Thr Arg Gln Leu Asp His Pro Val Pro Glu
660 665 670Ala Leu Phe Arg Ser
Val Tyr Gly Pro Ala Tyr Leu Arg Ala Arg Gly 675
680 685His Tyr Tyr Leu Ala Leu Asp Arg Pro Leu Ala Ala
Val Arg Asp Phe 690 695 700Leu Gly Ala
Gly Arg Leu Leu Arg Arg Trp Gly Ile Asp Arg Pro Thr705
710 715 720Leu Met Pro Trp Arg Ser Asp
Ala Ala Glu Ala Phe Leu Arg Leu Cys 725
730 735Glu Pro Arg Arg Ala Asp Arg Leu Leu Arg Glu Gln
Leu Ala Arg Thr 740 745 750Pro
Asp Asp Asp Pro His Val Arg Gly Val Ser Leu Arg Leu Arg Ala 755
760 765Gln Ile Ala Glu Pro Pro Asp Arg Leu
Asn Leu Leu Thr Glu Ala Val 770 775
780Asn His Leu Lys Ser Ser Gly Asp Arg Leu Ala Leu Ala Gly Ala Leu785
790 795 800Ala Asp Leu Gly
Ala Ala Tyr Arg Glu Arg Gly Glu Ser Thr Arg Ala 805
810 815Gly Ala Thr Ile Arg Arg Ala Trp His Leu
Ala Asn Asp Cys Gly Ala 820 825
830Arg Ala Leu Cys Glu Arg Ile Leu Pro Gly Gly Pro Gly Arg Gln Ser
835 840 845Phe Gly Asp Gly Thr Gly Arg
Thr Glu Ala Ala Leu Ser Gly Ser Glu 850 855
860Leu Arg Val Val Glu Leu Ala Ala Asn Gly His Thr Asn Arg Glu
Ile865 870 875 880Ala Ala
Arg Leu Cys Ile Thr Val Ser Thr Val Glu Gln His Leu Thr
885 890 895Arg Ala Tyr Arg Lys Leu Glu
Ile Ser Arg Arg Gln Glu Leu Pro Ala 900 905
910Arg Leu Cys Ala His Ile Glu Ser Pro Val 915
920
SEQ ID NO:3 (length 259, type PRT, organism: bacteria)
Met Pro Asp Leu Cys Glu Thr Glu Ser Leu Trp
Leu Arg Arg Phe Gln1 5 10
15Pro Ala Pro Ala Ala Arg Thr Arg Leu Met Cys Phe Pro His Ala Gly
20 25 30Gly Ser Ala Ser Ala Tyr Leu
Arg Leu Ala Arg Ser Leu Ala Pro Gly 35 40
45Ile Glu Val Leu Ala Val Gln Tyr Pro Gly Arg Gln Asp Arg Arg
Ala 50 55 60Glu Pro Cys Pro Asp Ser
Val Glu Gly Leu Ala Asp Asp Leu Phe Ala65 70
75 80Ala Val Arg His Arg Val Asp Ala Ser Thr Ala
Leu Phe Gly His Ser 85 90
95Met Gly Ala Val Leu Ala Phe Glu Leu Ala Arg Arg Leu Glu Arg Asp
100 105 110Ala Gly Val Arg Cys Ala
Arg Ile Phe Ala Ser Gly Arg Arg Ala Pro 115 120
125Ser Arg Phe Arg Asp Asp Ser Ala Pro Ala Ala Ser Asp Ala
Ser Met 130 135 140Leu Ala Glu Met Arg
Thr Leu Gly Gly Thr Asp Leu Arg Val Leu Gln145 150
155 160Asp Glu Glu Leu Leu Ile Ala Ala Leu Pro
Ala Leu Arg Ala Asp Tyr 165 170
175Arg Ala Ile Gly Thr Tyr Arg Ala Ala Asp Asp Ala Val Val Gly Cys
180 185 190Pro Val Thr Val Leu
Val Gly Asp Ala Asp Pro Arg Thr Ser Leu Asp 195
200 205Asp Ala His Ala Trp Ser Ala His Thr Thr Ala Glu
Ser Glu Val Leu 210 215 220Thr Phe Ser
Gly Gly His Phe Phe Leu Asp Ala His His Asp Ala Val225
230 235 240Val Glu Val Val Thr Ala Arg
Leu Arg Gln Asp Arg Ala Pro Arg Pro 245
250 255Asp Arg Val
SEQ ID NO:4 (length 267, type PRT, organism: bacteria)
Met Pro Glu Leu Asn
Asp Arg Thr Ala Leu Val Thr Gly Ala Ser Arg1 5
10 15Gly Ile Gly Lys Ala Ile Ala Gln Arg Leu Ala
Ala Glu Gly Val Arg 20 25
30Val Ala Val His Tyr Gly Thr Gln Glu Lys Ser Ala Gln Glu Thr Val
35 40 45Glu Thr Ile Glu Arg Ala Gly Gly
Arg Ala Phe Ala Val Arg Ala Asp 50 55
60Leu Leu Arg Asp Asp Ala Val Asp Glu Leu Phe Thr Ala Leu Glu Arg65
70 75 80Glu Leu Glu Gly Arg
Pro Leu His Ile Leu Val Asn Asn Ala Ala Val 85
90 95Ala Pro Ala Pro Gly Asp Pro Ala Leu Ala Ala
Gln Asp Gly Tyr Val 100 105
110Pro Gly Leu Ser Asp Thr Thr Pro Glu Glu Phe Asp Arg Val Tyr Arg
115 120 125Ile Asn Val Arg Ala Pro Phe
Phe Val Thr Gln Arg Ala Leu Ser Leu 130 135
140Met Ala Asp Gly Gly Arg Ile Val Asn Val Ser Ser Ala Val Thr
Arg145 150 155 160Ile Ala
Trp Pro Leu Leu Pro Tyr Ala Met Thr Lys Gly Ala Leu Glu
165 170 175Met Met Ala Pro Arg Leu Ala
Asn Glu Leu Gly Ser Arg Gly Ile Thr 180 185
190Val Asn Thr Val Ala Pro Gly Ile Thr Asp Thr Asp Met Asn
Arg Trp 195 200 205Val Arg Glu Thr
Pro Gly Ala Glu Ala Gly Ile Ser Ala Leu Thr Ala 210
215 220Leu Gly Arg Leu Gly Arg Pro Asn Asp Ile Ala Gly
Ile Val Ala Phe225 230 235
240Leu Val Ser Asp Asp Ala Arg Trp Ile Thr Gly Gln Leu Leu Asp Ala
245 250 255Ser Gly Gly Met Ala
Leu Ala Pro Ala Met Met 260
265
SEQ ID NO:5 (length 2341, type PRT, organism: bacteria)
Val His Glu Thr His Ala His Gly Glu Glu Gly Ser Ser
Asp Gly Ser1 5 10 15Ala
Asp Ala Val Val Phe Val Phe Pro Gly Gln Gly Ser Gln Trp Pro 20
25 30Gly Met Gly Ala Glu Leu Trp Asp
Thr Ser Pro Val Phe Arg Glu Ser 35 40
45Val Arg Ala Cys Ala Asp Ala Leu Ala Pro Tyr Leu Asp Trp Ser Val
50 55 60Glu Gly Val Leu Arg Gly Ala Pro
Asp Ala Pro Ala Gly Pro Ala Leu65 70 75
80Asp Arg Ala Asp Val Ala Gln Pro Ala Leu Phe Thr Leu
Met Val Ser 85 90 95Leu
Ala Glu Leu Trp Arg Ser His Gly Val Glu Pro Cys Ala Val Leu
100 105 110Gly His Ser Leu Gly Glu Ile
Ala Ala Ala His Val Ala Gly Ala Leu 115 120
125Thr Leu Ala Asp Ala Ala Arg Val Ala Ala Leu Trp Ser Arg Ala
Gln 130 135 140Ala Thr Leu Ser Gly Thr
Gly Thr Leu Leu Ala Ala Lys Ala Ala Pro145 150
155 160Glu Glu Leu Ala Pro His Leu Gln Arg Trp Asn
Gly Asp Asp Arg His 165 170
175Gly Thr Arg Leu Ala Ile Ala Gly Val Asn Gly Pro Gly Ser Thr Val
180 185 190Val Ala Gly Asp Leu Asp
Ala Ile Ala Ala Leu Ala Ala Asp Leu Ala 195 200
205Ser Ala Gly Val Arg Thr Arg Arg Val Ala Val Asp Val Pro
Thr His 210 215 220Ser Pro Ala Met Arg
Thr Leu Arg Glu Arg Ile Leu Thr Asp Leu Ala225 230
235 240Ser Val Ala Pro Cys Val Ser Arg Leu Pro
Phe His Ser Ser Leu Thr 245 250
255Gly Gly Leu Val Asp Thr Arg Gly Leu Asp Ala Asp Tyr Trp Tyr Arg
260 265 270Asn Ile Ser Glu Thr
Ala Arg Phe Asp Leu Ala Ala Arg Gly Leu Leu 275
280 285Ala Asp Gly His Arg Thr Phe Val Glu Leu Ser Pro
His Pro Ile Leu 290 295 300Thr Leu Gly
Leu Gln Ala Leu Ala Asp Asp Val Pro Gly Ala Ala Asp305
310 315 320Ala Leu Val Thr Gly Thr Leu
Arg Arg Gly Arg Gly Gly Met Arg Gln 325
330 335Phe Gln Asp Ala Leu Gly Arg Leu Ser Val Pro Ala
Gly Gly Arg Pro 340 345 350Gly
Arg Glu Val Ser Ala Ala Ala Leu Ala Gly Arg Leu Ala Pro Leu 355
360 365Ser Pro Ala Gln Gln Glu His Leu Leu
Val Glu Leu Val Cys Ala His 370 375
380Phe Ala Ala Leu Val Gly Gly Asp Gly Gly Ala Pro Pro Thr Val Arg385
390 395 400Pro Ser Ala Ala
Phe Thr Asp Gln Gly Cys Asp Ser Ala Thr Ala Leu 405
410 415Glu Leu Arg Asp Arg Leu Arg Glu Ala Thr
Gly Leu Arg Leu Pro Ala 420 425
430Thr Leu Val Phe Asp His Pro Thr Pro Ala Ala Val Ala Gly Arg Leu
435 440 445Arg Arg Leu Ala Leu Gly Ile
Glu Glu Thr Ala Asp Thr Ala Pro Val 450 455
460Ala Val Arg Gly His Arg Glu Gly Glu Pro Ile Ala Ile Val Gly
Met465 470 475 480Ala Cys
Arg Phe Pro Gly Gly Val Arg Ser Pro Glu Asp Leu Trp Arg
485 490 495Leu Val Thr Glu Gly Gly Asp
Ala Leu Gly Pro Phe Pro Thr Asp Arg 500 505
510Gly Trp Asp Thr Gly Arg His Ala Glu Asp Pro Ala Thr Pro
Gly Thr 515 520 525Tyr Val Gln Gly
Glu Gly Gly Phe Leu Tyr Asp Ala Gly Glu Phe Asp 530
535 540Ala Glu Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu
Ala Met Asp Pro545 550 555
560Gln Gln Arg Leu Leu Leu Glu Met Ala Trp Glu Thr Phe Glu Arg Ala
565 570 575Gly Ile Asp Pro Thr
Ser Ala Arg Gly Ser Arg Thr Gly Val Phe Ala 580
585 590Gly Val Leu Pro Leu Gly Tyr Gly Pro Arg Met Asp
Glu Thr Asp Gln 595 600 605Gly Thr
Ala Asp Leu Gln Gly His Leu Leu Thr Gly Thr Leu Pro Ser 610
615 620Val Ala Ser Gly Arg Ile Ser Tyr Thr Leu Gly
Leu Glu Gly Pro Ala625 630 635
640Val Ser Val Glu Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu
645 650 655Ala Cys Arg Ser
Leu Arg Ala Gly Glu Cys Asp Leu Ala Leu Thr Gly 660
665 670Gly Val Ser Val Leu Ala Thr Leu Gly Leu Phe
Val Glu Phe Ser Arg 675 680 685Gln
Arg Gly Leu Ser Ala Asp Gly Arg Cys Lys Ala Tyr Ala Ala Ala 690
695 700Ala Asp Gly Thr Gly Trp Ser Glu Gly Ala
Gly Leu Leu Leu Val Glu705 710 715
720Arg Leu Ser Asp Ala Arg Arg Leu Gly His Arg Val Leu Ala Val
Val 725 730 735Arg Gly Ser
Ala Ile Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala 740
745 750Pro Ser Gly Pro Ser Gln Gln Arg Val Ile
Arg Glu Ala Leu Ala Asp 755 760
765Ala Gly Leu Thr Ala Ala Asp Val Asp Ala Val Glu Gly His Gly Thr 770
775 780Gly Thr Arg Leu Gly Asp Pro Ile
Glu Ile Glu Ala Leu Leu Ala Thr785 790
795 800Tyr Gly Gln Gly Arg Ala Arg Glu Arg Pro Leu Trp
Leu Gly Ser Leu 805 810
815Lys Ser Asn Ile Gly His Thr Met Ala Ala Ala Gly Val Gly Gly Val
820 825 830Ile Lys Met Val Met Ala
Leu Arg His Gly Glu Leu Pro Arg Thr Leu 835 840
845His Val Asp Ala Pro Ser Pro Arg Ala Asp Trp Ser Ala Gly
Glu Val 850 855 860Arg Leu Leu Thr Glu
Ala Val Ala Trp Pro Ala Ala Ala Asp Gly Glu865 870
875 880Pro Arg Arg Ala Gly Val Ser Ser Phe Gly
Val Ser Gly Thr Asn Ala 885 890
895His Ala Ile Leu Glu Glu Ala Pro Ala Pro Glu Asp Glu Glu Pro Ala
900 905 910Pro Pro Asp Gly Glu
Ala Leu Leu Pro Trp Ala Val Ser Thr Arg Ser 915
920 925Glu Ala Ala Leu Arg Thr Gln Ala Arg Met Leu Ala
Asp Val Val Arg 930 935 940Asp Asp Pro
Gly Val Gly Leu Ala Asp Val Gly Ala Glu Leu Ala Arg945
950 955 960Gly Arg Ala Ala Leu Glu His
Arg Ala Val Val Ile Ala Ser Gly Arg 965
970 975Ala Glu Phe Ala Arg Ala Leu Glu Ala Val Ala Ser
Gly Glu Pro His 980 985 990Pro
Ala Val Val Arg Gly His Ala Gly Ser Glu Arg Gly Gly Val Val 995
1000 1005Phe Val Phe Pro Gly Gln Gly Gly
Gln Trp Ala Gly Met Gly Leu 1010 1015
1020Asp Leu Leu Arg Ser Ser Pro Val Phe Ala Glu His Ile Ala Ala
1025 1030 1035Cys Gly Lys Ala Leu Ala
Pro Trp Val Lys Trp Ser Leu Thr Glu 1040 1045
1050Val Leu His Arg Asp Ala Glu Asp Pro Val Trp Asp Arg Ala
Asp 1055 1060 1065Val Val Gln Pro Val
Leu Phe Ser Val Met Thr Ser Leu Ala Ala 1070 1075
1080Leu Trp Arg Ser Tyr Gly Val Glu Pro Asp Ala Val Thr
Gly His 1085 1090 1095Ser Gln Gly Glu
Ile Ala Ala Ala Tyr Val Cys Gly Ala Leu Gly 1100
1105 1110Leu Glu Asp Ala Ala Arg Thr Val Ala Leu Arg
Ser Arg Ala Leu 1115 1120 1125Val Ala
Leu Arg Gly Arg Gly Gly Met Ala Ser Val Ala Ser Ala 1130
1135 1140Ala Pro Asp Val Glu Glu Leu Ile Ala Arg
Arg Trp Pro Gly Arg 1145 1150 1155Leu
Trp Val Ala Ala Phe Asn Gly Pro Gly Ala Val Thr Val Ser 1160
1165 1170Gly Asp Gly Asp Ala Leu Glu Glu Phe
Leu Gly His Cys Ala Asp 1175 1180
1185Thr Glu Val Arg Ala Arg Arg Val Pro Val Asp Tyr Ala Ser His
1190 1195 1200Cys Pro His Thr Glu Ala
Ile Glu Arg Glu Leu Leu Asp Ala Leu 1205 1210
1215Glu Asp Ile Thr Pro Arg Pro Ala Ala Val Pro Phe Tyr Ser
Thr 1220 1225 1230Val Asp Asp Ala Trp
Leu Asp Thr Thr Arg Leu Asp Ala Ser Tyr 1235 1240
1245Trp Tyr Arg Asn Leu Arg Arg Pro Val Arg Phe Ser Gln
Ala Val 1250 1255 1260Arg Ala Leu Thr
Asp Gly Gly His Arg Val Phe Ile Glu Ala Ser 1265
1270 1275Pro His Pro Thr Leu Val Pro Ala Ile Glu Asp
His Gly Asp Val 1280 1285 1290Thr Ala
Leu Gly Thr Leu Arg Arg His Gly Asp Asp Thr Glu Arg 1295
1300 1305Phe Leu Thr Ala Leu Ala His Leu His Val
Thr Gly Ala Ala Gly 1310 1315 1320Gln
Asp Leu Trp Arg His His Tyr Ala Arg Leu Arg Pro Ala Pro 1325
1330 1335Arg His Val Asp Leu Pro Thr Tyr Ala
Phe Gln Arg Asp Arg Tyr 1340 1345
1350Trp Trp Ser Gly Gly Ala Gly Arg Gly Asp Val Thr Thr Ala Gly
1355 1360 1365Leu His Pro Gly Gly His
Pro Leu Leu Gly Ala Ala Leu Asp Leu 1370 1375
1380Ala Asp Gly Gly Gly Arg Leu His Thr Gly Arg Val Ser Leu
Arg 1385 1390 1395Thr His Pro Trp Ile
Ala Asp His Gly Val Ala Gly Ile Thr Leu 1400 1405
1410Leu Pro Gly Thr Ala Phe Leu Glu Leu Ala Leu His Thr
Gly Glu 1415 1420 1425Ser Gly Asn Val
Arg Glu Leu Thr Leu His Ala Pro Leu Val Val 1430
1435 1440Pro Asp Glu Glu Gly Val Asp Leu Gln Val His
Leu Ala Arg Pro 1445 1450 1455Asp Glu
Ala Gly Leu Arg Ala Leu Thr Arg Leu Leu Pro Gly Arg 1460
1465 1470Gly Val Pro Thr Pro Arg Ala Pro Trp Gln
Pro His Ala Thr Gly 1475 1480 1485Leu
Leu Gly Pro Ala Asp Arg Ala Pro Gly Ser Ser Gly Leu Glu 1490
1495 1500Pro His Asp Leu Gly Gly Ala Trp Pro
Pro Pro Gly Ala Val Pro 1505 1510
1515Leu Val Pro Gly Glu Leu Gly Asp Val Pro Gly Cys Tyr Ala Arg
1520 1525 1530Leu Ala Asp Glu Gly Phe
Glu Tyr Gly Pro Ala Phe Arg Gly Leu 1535 1540
1545Arg Ala Val Trp Arg Arg Gly Thr Glu Ile Phe Ala Glu Val
Ala 1550 1555 1560Leu Pro Ala Gly Asp
Gly Ser Val Phe Arg Leu His Pro Ala Leu 1565 1570
1575Leu Asp Ala Val Leu His Pro Val Val Leu Gly Leu Val
Asp Gly 1580 1585 1590Val Pro Ala Arg
Pro Leu Pro Phe Ser Trp Asn Gly Val Ala Leu 1595
1600 1605His Ala Pro Ala Ser Gly Ala Leu Arg Val Arg
Leu Ala Pro Ala 1610 1615 1620Asp Asp
Gly Ala Val Gly Ile Thr Ala Ala Thr Ala Ala Gly Glu 1625
1630 1635Pro Val Leu Ser Val Ala Ala Leu Ala Leu
Arg Ser Ala Ser Ala 1640 1645 1650Glu
Gln Leu Arg Ala Ala Ile Arg Ser Ala Ala Gly Ser Arg Asp 1655
1660 1665Ala Leu Tyr Glu Leu Asp Trp Leu Pro
Leu Pro Ala Asp Arg Ala 1670 1675
1680Ala Ser Pro Gly Gly Ala Asp Ile Ala Ala Leu Gly Thr Ser Glu
1685 1690 1695Leu Pro Cys Arg Thr Tyr
Glu Thr Ile Ala Glu Leu Ser Gln Ala 1700 1705
1710Leu Ala Asp Gly Ala Pro Ala Pro Asp Ala Val Val Ser Asp
Val 1715 1720 1725Gly Ala Val Gly Gly
Pro Leu Asp Thr Val Ser Leu His Gly Leu 1730 1735
1740Cys Arg Arg Gly Leu Glu Leu Val Gln Ala Trp Leu Gly
Glu Pro 1745 1750 1755Arg Thr Ala Asp
Thr Arg Leu Val Leu Val Thr Arg Gly Ala Val 1760
1765 1770Gly Cys Ala Pro Ala Glu Pro Val Ala Asp Pro
Ala Ala Ala Ala 1775 1780 1785Leu Trp
Gly Leu Val Arg Ser Ala Gln Ala Glu His Pro Gly Arg 1790
1795 1800Leu Leu Leu Leu Asp Leu Asp Pro Ala Gly
Ser Arg Pro Val Ser 1805 1810 1815Gly
Arg Leu Val Glu Gln Ala Val Ala Cys Gly Glu Pro His Ile 1820
1825 1830Ala Val Arg Gly Asp Gly Leu Arg Val
Pro Arg Leu Ser Arg Ala 1835 1840
1845Thr Ala Ala Pro Ala His Pro Pro Ala Gly Gly Arg Glu Ala Gln
1850 1855 1860Trp Asp Pro Glu Gly Thr
Val Leu Ile Thr Gly Gly Thr Gly Ser 1865 1870
1875Leu Gly Ala Leu Phe Ala Arg His Leu Val Thr Ala His Gly
Val 1880 1885 1890Arg Arg Leu Leu Leu
Ala Ser Arg Ser Gly Pro Gly Ala Pro Gly 1895 1900
1905Ala Ala Gly Leu Arg Asp Glu Leu Thr Ala His Gly Ala
Thr Val 1910 1915 1920Thr Val Ala Ala
Cys Asp Val Ala Asp Arg Glu Ala Val Ala Ala 1925
1930 1935Leu Leu Ala Ser Val Pro Ser Glu His Pro Leu
Thr Ala Val Val 1940 1945 1950His Thr
Ala Gly Val Leu Asp Asp Gly Val Leu Ala Ser Leu Thr 1955
1960 1965Ala Asp Arg Leu Ala Arg Val Leu Arg Ala
Lys Ala Asp Ala Ala 1970 1975 1980Leu
His Leu His Asp Leu Thr Arg Asp Leu Pro Leu Ala Ala Phe 1985
1990 1995Val Leu Phe Ser Ser Val Thr Ala Thr
Leu Gly Thr Pro Gly Gln 2000 2005
2010Ala Asn Tyr Thr Ala Ala Asn Ala Phe Leu Asp Ala Leu Ala Arg
2015 2020 2025His Arg Arg Ala Ala Gly
Leu Pro Ala Val Ser Leu Ala Trp Gly 2030 2035
2040Leu Trp Glu Gln Thr Gly Gly Leu Thr Asp His Leu Gly Ser
Val 2045 2050 2055Asp Leu Arg Arg Met
Ala Arg Asn Gly Leu Val Ala Leu Pro Ala 2060 2065
2070Asp Ala Gly Leu Ala Leu Phe Asp Thr Ala Leu Ala Leu
Asp Arg 2075 2080 2085Ala Asn Leu Val
Pro Ala Arg Leu Asp Leu Pro Ala Leu Arg Arg 2090
2095 2100Ala Thr His Val Pro Pro Val Leu Arg Arg Leu
Val Glu Val Pro 2105 2110 2115Gly Ala
Pro Ser Ala Asp Arg Ser Ala Gly Ser Gly Gly Glu Val 2120
2125 2130Arg Pro Leu Arg Glu Thr Leu Ala Gly Leu
Asp Asp Arg Lys Arg 2135 2140 2145Pro
Ala Ala Val Ser Arg Leu Val Arg Arg His Val Ala Trp Val 2150
2155 2160Leu Gly Ala Asp Gly Pro Glu Ser Val
Asp Glu Asp Arg Ser Phe 2165 2170
2175Arg Asp Leu Gly Phe Asp Ser Leu Met Ala Val Glu Leu Arg Asn
2180 2185 2190Gln Leu Asn Thr Ala Ala
Gly Ile Arg Leu Ala Ala Thr Leu Val 2195 2200
2205Phe Asp His Pro Thr Pro Ser Ala Val Ala Arg His Leu Leu
Asp 2210 2215 2220Arg Cys Ser Pro Asp
Pro Ala Ala Pro Ala Ala Pro Ser Gly Thr 2225 2230
2235Ala Val Ala Ser Ala Leu Ala Thr Leu Ala Glu Leu Glu
Thr Ala 2240 2245 2250Leu Asn Gly Ile
Pro Ala Glu Glu Trp Thr Ala Ala Gly Gly Pro 2255
2260 2265Ala Arg Leu Met Thr Leu Ala Ser Ser Leu Pro
Ala Pro Ala Ser 2270 2275 2280Val Pro
Arg Thr Pro Ala Ala Gly Glu Ala Ala Glu Lys Leu Ala 2285
2290 2295His Ala Ser Arg Asp Glu Ile Phe Ala Phe
Ile Asp Arg Glu Leu 2300 2305 2310Gly
Arg Asp Ser Gly Pro Ala Ser Pro Ser Arg Leu Gly Pro Gln 2315
2320 2325Thr Pro Asp Ser Thr Asp Lys Ala Pro
Phe His Gly Glu 2330 2335
2340
SEQ ID NO:6 (length 3723, type PRT, organism: bacteria)
Met Glu Asn Glu Glu Lys Leu Leu Asp Tyr Leu Lys Trp
Val Thr Ala1 5 10 15Asp
Leu His Arg Ser Arg Glu Arg Val Thr Glu Leu Glu Glu Ala Gly 20
25 30Arg Glu Pro Ile Ala Ile Val Gly
Met Ala Cys Arg Phe Pro Gly Glu 35 40
45Val Arg Ser Pro Glu Glu Leu Trp Gly Leu Val Ala Ser Gly Gly Asp
50 55 60Ala Ile Gly Ala Phe Pro Asp Asp
Arg Gly Trp Asp Leu Asp Gly Leu65 70 75
80Phe Asp Pro Asp Pro Glu Arg Ala Gly Thr Ser Tyr Thr
Arg Arg Gly 85 90 95Gly
Phe Leu Tyr Asp Ala Ala Glu Phe Asp Ala Gly Phe Phe Gly Ile
100 105 110Ser Pro Arg Glu Ala Met Ala
Met Asp Pro Gln Gln Arg Leu Leu Leu 115 120
125Glu Thr Ser Trp Glu Ala Phe Glu Arg Ala Gly Ile Asp Pro Ser
Ser 130 135 140Val Arg Gly Ser Arg Val
Gly Val Phe Ala Gly Leu Met Tyr His Asp145 150
155 160Tyr Ala Ala Ala Gln Gly Ser Thr Gly Asp Gly
Asp Gly Glu Pro Asp 165 170
175Phe Glu Gly Tyr Leu Gly Asp Gly Ser Val Ser Ser Ile Ala Ser Gly
180 185 190Arg Ile Ala Tyr Thr Leu
Gly Leu Ala Gly Ala Ala Ile Thr Val Asp 195 200
205Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Cys
Gln Ala 210 215 220Leu Arg Thr Gly Asp
Ser Glu Leu Ala Leu Ala Gly Gly Val Ser Val225 230
235 240Met Ser Thr Pro Arg Thr Phe Val Gln Phe
Ser Arg Gln Arg Gly Leu 245 250
255Ser Ala Asp Gly Arg Cys Lys Ala Tyr Ala Ala Ala Ala Asp Gly Thr
260 265 270Gly Phe Ser Glu Gly
Val Gly Met Val Leu Val Glu Arg Leu Ser Asp 275
280 285Ala Arg Arg Leu Gly His Pro Val Leu Ala Val Val
Arg Gly Ser Ala 290 295 300Val Asn Gln
Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro305
310 315 320Ser Gln Glu Arg Val Ile Arg
Glu Ala Leu Ala Asn Ala Gly Leu Thr 325
330 335Ala Ala Asp Val Asp Ala Val Glu Gly His Gly Thr
Gly Thr Arg Leu 340 345 350Gly
Asp Pro Ile Glu Leu Gln Ala Leu Leu Ala Thr Tyr Gly Gln Gly 355
360 365Arg Ala Arg Glu Arg Pro Leu Trp Leu
Gly Ser Val Lys Ser Asn Ile 370 375
380Gly His Ala Gln Ala Ala Ala Gly Val Gly Gly Val Ile Lys Met Val385
390 395 400Met Ala Leu Arg
His Gly Glu Leu Pro Arg Thr Leu His Val Asp Ala 405
410 415Pro Ser Pro Arg Val Asp Trp Ser Ala Gly
Glu Val Arg Leu Leu Thr 420 425
430Glu Ala Val Ala Trp Pro Ala Ala Ala Asp Gly Glu Pro Arg Arg Ala
435 440 445Gly Val Ser Ser Phe Gly Val
Ser Gly Thr Asn Ala His Val Ile Leu 450 455
460Glu Glu Ala Pro Ala Ser Glu Gly Glu Glu Ala Pro Pro Pro Glu
Pro465 470 475 480Gly Ser
Pro Leu Pro Trp Val Val Ser Gly His Ser Glu Ala Gly Leu
485 490 495Arg Ala Gln Ala Gln Ala Leu
Ala Glu Phe Ala Arg Thr Ala Pro Gly 500 505
510Ala Glu Leu Val Asp Val Gly Ala Ala Leu Ala Arg Gly Arg
Ala Ala 515 520 525Leu Gly His Arg
Ala Val Val Val Ala Ser Glu Arg Glu Glu Phe Glu 530
535 540Arg Ala Leu Ala Ala Leu Ala Cys Gly Glu Pro His
Pro Cys Val Val545 550 555
560Asp Gly Ser Ala Asp Gly Arg Arg Glu Asp Gly Val Val Phe Val Phe
565 570 575Pro Gly Gln Gly Gly
Gln Trp Ala Gly Met Gly Leu Asp Leu Leu Thr 580
585 590Thr Ser Gly Val Phe Ala Glu His Ile Gly Ala Cys
Glu Arg Ala Leu 595 600 605Ala Pro
Trp Val Glu Trp Ser Leu Thr Glu Met Leu His Arg Glu Ala 610
615 620Glu Asp Pro Val Trp Glu Arg Ala Asp Ile Val
Gln Pro Val Leu Phe625 630 635
640Ser Val Met Val Ser Leu Ala Ala Leu Trp Arg Ser Tyr Gly Ile Glu
645 650 655Pro Asp Ala Val
Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala His 660
665 670Val Cys Gly Ala Leu Thr Leu Glu Asp Ala Ala
Lys Val Val Ala Leu 675 680 685Arg
Ser Arg Ala Leu Ala Ala Leu Arg Gly Arg Gly Gly Met Val Ser 690
695 700Leu Ser Leu Ser Thr Ala Asp Ala Gly Glu
Leu Val Glu Arg Arg Trp705 710 715
720Ala Gly Arg Leu Trp Val Ala Ala Leu Asn Gly Pro Glu Ala Thr
Thr 725 730 735Val Ser Gly
Asp Val Asp Ala Leu Glu Glu Leu Leu Ala His Cys Ala 740
745 750Lys Ser Glu Val Arg Ala Arg Arg Val Pro
Val Asp Tyr Ala Ser His 755 760
765Cys Pro His Thr Glu Ala Ile Ala Glu Glu Ile Val Asp Ser Leu Gly 770
775 780Asp Ile Thr Pro Arg Ala Ala Thr
Val Pro Phe Tyr Ser Thr Val Asp785 790
795 800Asp Met Trp Leu Asp Thr Thr Arg Leu Asp Ala Ser
Tyr Trp Tyr Arg 805 810
815Asn Leu Arg Leu Pro Val Arg Phe Ser Gln Ala Val Arg Ala Leu Thr
820 825 830Glu Glu Gly His Arg Leu
Phe Ile Glu Thr Ser Pro His Pro Thr Leu 835 840
845Val Pro Ala Ile Glu Asp His Gly Asp Val Thr Ala Leu Gly
Thr Leu 850 855 860Arg Arg His Gly Asp
Asp Thr Glu Arg Phe Leu Thr Ala Leu Ala His865 870
875 880Leu His Val Thr Gly Ala Ala Gly Gln Asp
Leu Trp Arg His His Tyr 885 890
895Ala Arg Leu Arg Pro Ala Pro Arg His Val Asp Leu Pro Thr Tyr Pro
900 905 910Phe Gln Arg Arg Arg
Tyr Trp Leu Glu Lys Pro Asp Pro Gln Thr Arg 915
920 925Pro Gln Arg Ser Arg Ser Thr Ala Pro Asp Leu Asp
Arg Leu Glu Ala 930 935 940Glu Phe Trp
Gln Ala Val Glu Glu Thr Asp Thr Asp Thr Leu Ala His945
950 955 960Thr Leu His Leu Asp Thr Gln
Thr Leu Glu Pro Val Leu Pro Ala Leu 965
970 975Ala Thr Trp His Gln Gln Gln Arg Asp His Ala Arg
Ile Asn Thr Trp 980 985 990Thr
Tyr Gln Glu Thr Trp Lys Pro Leu His Leu Pro Thr Thr Arg Pro 995
1000 1005Thr Thr Pro Thr Ser Trp Leu Ile
Ala Ile Pro Glu Thr His Arg 1010 1015
1020Asn His Pro His Thr Thr Asn Leu Leu Thr Asn Leu Pro His His
1025 1030 1035Asn Ile Thr Pro Ile Pro
Leu Thr Ile Asn His Thr Thr Asp Leu 1040 1045
1050His His Ala Tyr His His Ala His His His Thr Thr Pro Pro
Ile 1055 1060 1065Thr Ala Val Leu Ser
Leu Leu Ala Leu Asp Glu Thr Pro His Pro 1070 1075
1080His His Pro His Thr Pro Thr Gly Thr Leu Leu Asn Leu
Thr Leu 1085 1090 1095Thr Gln Thr His
Thr Gln Thr His Pro Pro Thr Pro Leu Trp Tyr 1100
1105 1110Leu Thr Thr Gln Ala Thr Thr Thr His Pro Asn
Asp Pro Leu Thr 1115 1120 1125His Pro
Thr Gln Ala Gln Thr Ile Gly Leu Ala Arg Thr Thr His 1130
1135 1140Leu Glu His Pro His His Thr Gly Gly His
Ile Asp Leu Pro Thr 1145 1150 1155Thr
Pro His Pro Asn Thr Leu Thr Gln Leu Ile Thr Ala Leu Thr 1160
1165 1170His Pro His His Gln His Asn Leu Thr
Ile Arg Thr His Thr Thr 1175 1180
1185His Thr Arg Arg Leu Thr Pro Thr Thr Leu Gln Pro Thr Thr Pro
1190 1195 1200Thr Pro Pro Thr Asn Pro
His Gly Thr Thr Leu Ile Thr Gly Gly 1205 1210
1215Thr Gly Ala Leu Ala Thr Thr Leu Ala His His Leu Ala Thr
Thr 1220 1225 1230Gly Thr Gln His Leu
Leu Leu Thr Ser Arg Arg Gly Pro His Thr 1235 1240
1245Pro Gly Ala Arg Gln Leu His Thr Gln Leu Thr Gln Leu
Gly Thr 1250 1255 1260Asn Thr Thr Ile
Thr Ala Cys Asp Leu Ser Asp Pro Asp Gln Leu 1265
1270 1275Thr His Leu Leu Thr His Ile Pro Pro Glu His
Pro Leu Thr Thr 1280 1285 1290Val Ile
His Thr Ala Gly Ile Leu Asp Asp Ala Thr Leu Thr Asn 1295
1300 1305Leu Thr Pro Thr Gln Leu Asp Asn Val Leu
Arg Ala Lys Ala His 1310 1315 1320Thr
Ala His Leu Leu His His Ala Thr Leu His Thr Pro Leu Asp 1325
1330 1335His Phe Val Leu Tyr Ser Ser Ala Ala
Ala Thr Leu Gly Ala Pro 1340 1345
1350Gly Gln Ala Asn Tyr Ala Ala Ala Asn Ala Tyr Leu Asp Ala Leu
1355 1360 1365Ala His His Arg His Thr
His Asn Leu Pro Ala Thr Thr Ile Ala 1370 1375
1380Trp Gly Thr Trp Gln Gly Asn Gly Leu Ala Asp Ser Asp Lys
Ala 1385 1390 1395Arg Ala Asn Leu Asp
Arg Arg Gly Phe Leu Pro Met Pro Glu Thr 1400 1405
1410Leu Ala Ala Ala Ala Ala Val Arg Ala Ile Glu Ser Arg
Arg Pro 1415 1420 1425Ser Val Val Ile
Ala Ala Ile Asp Trp Ala Arg Ala Glu Arg Thr 1430
1435 1440Pro Asp Val Glu Asp Leu Leu Pro Ala Ala Asp
Glu Gly Ser Ser 1445 1450 1455Ser Gly
Lys Pro Glu Ala Ala Pro Val Asp Leu Arg Gly Thr Leu 1460
1465 1470Ser Arg Gln Ser Ala Ala Asp Gln Gln Ala
Thr Leu Leu Gly Leu 1475 1480 1485Val
Arg Thr Gln Ala Ala Val Val Leu Arg His Thr Glu Pro Glu 1490
1495 1500Ala Leu Ala Pro Gly Gln Ala Phe Arg
Ala Leu Gly Phe Asp Ser 1505 1510
1515Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Ala Lys Ala Thr Asp
1520 1525 1530Leu Ala Leu Pro Ala Ser
Leu Val Phe Asp His Pro Thr Pro Val 1535 1540
1545Lys Leu Ala Glu Phe Leu Arg Thr Glu Leu Leu Gly Thr Ala
Pro 1550 1555 1560Ala Thr Thr Ala Ala
Val Pro Ala Leu Gln Ala His Thr Asp Glu 1565 1570
1575Pro Ile Ala Ile Ile Gly Met Ala Cys Arg Phe Pro Gly
Ala Val 1580 1585 1590Thr Thr Pro Glu
His Leu Trp Asn Leu Ile Ala Thr Glu Gln Asp 1595
1600 1605Ala Ile Gly Glu Phe Pro Thr Asp Arg Gly Trp
Asp Leu Asp Asn 1610 1615 1620Leu Tyr
His Pro Asp Pro Asp His Pro Gly Thr Thr Tyr Thr Arg 1625
1630 1635His Gly Gly Phe Leu His Asp Ala Gly Asp
Phe Asp Ala Asp Phe 1640 1645 1650Phe
Gly Ile Asn Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln 1655
1660 1665Arg Leu Leu Leu Glu Thr Ala Trp Glu
Ala Ile Glu His Ala Gly 1670 1675
1680Ile Leu Pro Asp Ala Leu His Gly Thr Pro Thr Gly Val Phe Thr
1685 1690 1695Gly Val Asn Ala Gln Asp
Tyr Ala Ala His Thr His Thr Ser Pro 1700 1705
1710His Thr Thr Glu Gly Tyr Thr Leu Thr Gly Thr Ala Gly Ser
Ile 1715 1720 1725Ala Ser Gly Arg Ile
Ala Tyr Val Leu Gly Leu Glu Gly Pro Ala 1730 1735
1740Val Thr Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala
Leu His 1745 1750 1755Leu Ala Cys Gln
Ala Leu Arg Ala Gly Glu Cys Thr Thr Ala Leu 1760
1765 1770Ala Ser Gly Ile Ser Ile Met Thr Thr Pro Leu
Ala Phe Thr Glu 1775 1780 1785Phe Ser
Arg Gln Arg Gly Leu Ala Ala Asp Gly Arg Cys Lys Ala 1790
1795 1800Phe Ala Ala Ala Ala Asp Gly Thr Gly Trp
Ser Glu Gly Val Gly 1805 1810 1815Thr
Leu Leu Leu Glu Arg Leu Ser Asp Ala Glu Arg Asn Gly His 1820
1825 1830Arg Val Leu Ala Val Val Arg Gly Ser
Ala Val Asn Gln Asp Gly 1835 1840
1845Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg
1850 1855 1860Val Ile Arg Gln Ala Leu
Val Asn Ala Asn Leu Ser Ala Val Asp 1865 1870
1875Val Asp Ala Val Glu Ala His Gly Thr Gly Thr Lys Leu Gly
Asp 1880 1885 1890Pro Ile Glu Ala Gln
Ala Leu Leu Ala Thr Tyr Gly Gln Gly Arg 1895 1900
1905Ala Gln Glu Gln Pro Leu Trp Leu Gly Ser Val Lys Ser
Asn Leu 1910 1915 1920Gly His Thr Gln
Ala Ala Ala Gly Met Ala Gly Leu Ile Lys Met 1925
1930 1935Val Met Ala Leu Arg His Glu Ser Leu Pro Arg
Thr Leu His Val 1940 1945 1950Asp Glu
Pro Ser Pro Glu Val Asp Trp Ser Ser Gly Ala Val Ser 1955
1960 1965Leu Leu Thr Glu Ala Arg Pro Trp Pro Arg
Val Glu Asp Arg Pro 1970 1975 1980Arg
Arg Ala Gly Val Ser Ser Phe Gly Val Ser Gly Thr Asn Ala 1985
1990 1995His Val Ile Val Glu Glu Ala Pro Ala
Pro Thr Gly Val Glu Ala 2000 2005
2010Val Glu Ala Ala Pro Ala Gly Val Glu Thr Ala Ala Ala Ala Ala
2015 2020 2025Val Val Val Glu Thr Asp
Gly Ala Gly Arg Val Ser Ala Asp Leu 2030 2035
2040Pro Leu Val Trp Val Ala Ser Gly Lys Ser Gln Ala Ala Ile
Arg 2045 2050 2055Ala Gln Ala Ala Ala
Leu His Ala His Val Leu Asp His Pro Glu 2060 2065
2070Gln Asp Ala Asp Asp Ile Gly Tyr Ser Leu Ala Thr Thr
Arg Ala 2075 2080 2085Leu Phe Asp His
Arg Ala Thr Leu Ile Ala Pro Asp Arg His Thr 2090
2095 2100Val Pro Glu Pro Leu Thr Gly Leu Gly Asp Gly
Arg Thr His Pro 2105 2110 2115His Leu
Ile Pro Thr Pro Pro Thr Glu Pro Gly His Thr His Lys 2120
2125 2130Ile Ala Phe Leu Cys Ser Gly Gln Gly Thr
Gln Arg Pro Gly Met 2135 2140 2145Ala
Thr Gly Leu Tyr His Thr Tyr Pro Ala Phe Ala Ala Ala Leu 2150
2155 2160Asp Glu Thr Cys Ala His Phe Asp Pro
His Leu Asp His Pro Leu 2165 2170
2175His Asp Leu Leu Leu Asn His Asp Pro Thr Asp Leu Leu Thr His
2180 2185 2190Thr Leu Tyr Ala Gln Pro
Ala Leu Phe Thr Leu Gln Lys Ala Leu 2195 2200
2205His His Leu Ile Thr Glu Thr Tyr Gly Ile Thr Pro His Tyr
Leu 2210 2215 2220Ala Gly His Ser Leu
Gly Glu Ile Thr Ala Ala His Leu Ala Gly 2225 2230
2235Ile Leu Thr Leu Pro Asp Ala Thr His Leu Ile Thr Thr
Arg Ala 2240 2245 2250Arg Leu Met Gln
Thr Met Pro Pro Gly Thr Met Thr Thr Leu His 2255
2260 2265Thr Thr Pro Glu His Ile Gln Pro Leu Leu Asp
Gln His Pro Gly 2270 2275 2280Lys Ala
Ala Ile Ala Ala Val Asn Ser Pro His Ser Leu Val Ile 2285
2290 2295Ser Gly Asp Pro Asp Thr Ile His His Ile
Thr Thr Thr Cys His 2300 2305 2310Asn
Gln Gly Ile Thr Thr Lys Pro Leu Ala Thr Asn His Ala Phe 2315
2320 2325His Ser Pro His Thr Asp Thr Ile Leu
Glu Gln Leu Asp Thr Thr 2330 2335
2340Thr His Thr Leu Thr Tyr His Gln Pro His Thr Pro Leu Ile Thr
2345 2350 2355Ser Thr Pro Gly Asp Pro
Leu Thr Pro His Tyr Trp Thr His Gln 2360 2365
2370Thr Arg Gln Pro Val His Trp Thr Asp Thr Ile His Thr Leu
His 2375 2380 2385Thr His Gly Val Thr
Thr Tyr Ile Ala Leu Gly Pro Glu His Thr 2390 2395
2400Leu Thr Thr Leu Thr His His Asn Val Pro His His Gln
Pro Thr 2405 2410 2415Ala Ile Thr Leu
Thr His Pro His His Asn Pro Thr His His Leu 2420
2425 2430Leu Thr Ala Leu Ala His Leu His Thr Thr Gln
Pro Thr Gly Pro 2435 2440 2445Asn Ile
Trp His His His Tyr Thr Pro Val Ala Pro Ala Pro Arg 2450
2455 2460His Val Asp Leu Pro Thr Tyr Pro Phe Pro
Arg Arg Arg Tyr Trp 2465 2470 2475Val
Gln Ala Ser Ala Gly Thr Gly Asp Val Ser Ala Ala Gly Leu 2480
2485 2490Gln Arg Pro Asp His Pro Leu Leu Gly
Ala Val Met Glu Leu Ala 2495 2500
2505Asp Gly Asp Gly Ile Val Leu Thr Gly Arg Leu Ser Leu His Thr
2510 2515 2520His Pro Trp Leu Ala Asp
His Ser Val Gly Gly Val Ala Leu Leu 2525 2530
2535Pro Gly Thr Ala Leu Leu Glu Leu Ala Phe Gln Ala Gly Leu
Arg 2540 2545 2550Ala Gly Cys Pro Gly
Val Asp Glu Leu Thr Leu His Ala Pro Leu 2555 2560
2565Val Val Pro Glu Ser Gly His Val Val Val Gln Val Ser
Val Ser 2570 2575 2580Val Pro Gly Glu
Ala Gly Arg Arg Gly Val Ser Val Tyr Gly Arg 2585
2590 2595Leu Val Glu Asp Gly Gly Leu Glu Gly Glu Trp
Thr Arg His Ala 2600 2605 2610Glu Gly
Val Val Cys Pro Ser Val Pro Gly Glu Ser Val Val Val 2615
2620 2625Glu Pro Val Ala Asp Gly Val Trp Pro Pro
Ser Gly Ala Gln Pro 2630 2635 2640Val
Asp Leu Glu Glu Phe Tyr Gly Arg Leu Ala Gly Gly Gly Phe 2645
2650 2655Val Tyr Gly Pro Val Phe Gln Gly Leu
Cys Ala Ala Trp Arg Asp 2660 2665
2670Gly Asp Asp Val Val Ala Glu Val Arg Leu Pro Asp Glu Gly Leu
2675 2680 2685Ala Asp Val Ala Gly Phe
Gly Val His Pro Ala Leu Leu Asp Ala 2690 2695
2700Ala Val Gln Ala Val Thr Leu Leu Phe Pro Asp Gln Gln Gln
Ala 2705 2710 2715Gly Leu Ala Ala His
Thr Trp Asn Gly Val Ser Leu His Ala Arg 2720 2725
2730Gly Ala Thr Val Leu Arg Leu Arg Met Thr Pro Thr Asp
Ala Thr 2735 2740 2745Ser Thr Ala Val
Arg Leu His Ala Thr Asp Glu Thr Gly Ala Pro 2750
2755 2760Val Leu Thr Leu Asp Ser Leu Leu Met Arg Pro
Val Pro Leu Glu 2765 2770 2775Gly Leu
Gly Ala Gly Val Arg Arg Gly Ser Leu Phe Glu Leu Gly 2780
2785 2790Trp Val Pro Val Glu Gly Met Pro Ala Ser
Val Ala Gly Gly Gly 2795 2800 2805Gly
Glu Leu Val Ala Trp Glu Cys Pro Gly Gly Gly Val Ala Glu 2810
2815 2820Val Thr Ala Ala Ala Leu Gly Val Val
Gln Glu Trp Leu Ala Asp 2825 2830
2835Glu Arg Glu Gly Asp Ala Arg Leu Val Val Val Thr Arg Gly Ala
2840 2845 2850Val Ala Val Asp Ala Gly
Glu Pro Val Arg Asp Val Ala Gly Ala 2855 2860
2865Ala Val Trp Gly Leu Val Arg Ser Ala Gln Ser Glu His Pro
Asp 2870 2875 2880Arg Phe Ala Leu Leu
Asp Leu Asp Pro Asp Thr Lys Thr Asp Pro 2885 2890
2895Gly Ile Asp Thr Asp Gly Asp Thr Asp Val Ser Ala Asp
Ala Lys 2900 2905 2910Val Gly Thr Gly
Asp Gly Leu Asp Asp Ala Ala Val Ala Ser Ala 2915
2920 2925Leu Ala Arg Gly Glu Ser Gln Leu Ala Val Arg
Asp Gly Val Val 2930 2935 2940Arg Val
Ala Arg Leu Gly Gly Leu Val Gly Gly Leu Ser Leu Pro 2945
2950 2955Gly Gly Val Gly Trp Arg Leu Asp Gly Gly
Gly Ser Gly Leu Leu 2960 2965 2970Glu
Gly Val Gly Val Val Ala Ser Asp Ala Ala Gly Val Val Leu 2975
2980 2985Gly Arg Gly Gln Val Arg Val Ala Val
Arg Ala Ala Gly Val Asn 2990 2995
3000Phe Arg Asp Val Leu Val Ala Leu Gly Met Val Pro Gly Gln Val
3005 3010 3015Gly Val Gly Ser Glu Gly
Ala Gly Val Val Val Glu Val Gly Pro 3020 3025
3030Gly Val Glu Gly Leu Val Val Gly Asp Arg Val Phe Gly Val
Phe 3035 3040 3045Gly Asp Ala Phe Ala
Pro Val Val Val Ala Gln Glu Val Leu Leu 3050 3055
3060Ala Arg Ile Pro Glu Gly Trp Ser Phe Ala Gln Ala Ala
Ser Val 3065 3070 3075Pro Val Val Phe
Ala Thr Ala Tyr Leu Gly Leu Val Asp Leu Ala 3080
3085 3090Gly Val Arg Arg Gly Glu Ser Val Leu Val His
Ala Ala Ala Gly 3095 3100 3105Gly Val
Gly Thr Ala Ala Val Gln Leu Ala Arg His Leu Gly Ala 3110
3115 3120Glu Val Tyr Ala Thr Ala Ser Glu Ala Lys
Trp Ala Arg Leu Arg 3125 3130 3135Ala
Ala Gly Val Ala Pro Gln Arg Ile Ala Ser Ser Arg Ser Val 3140
3145 3150Glu Phe Glu Ser Arg Phe Arg Arg Ala
Ser Gly Gly Arg Gly Val 3155 3160
3165Asp Val Val Leu Asn Cys Leu Ala Gly Glu Tyr Thr Asp Ala Ser
3170 3175 3180Leu Arg Leu Cys Ser Pro
Gln Gly Gly Arg Phe Leu Glu Leu Gly 3185 3190
3195Lys Thr Asp Ile Arg Asp Ala Gly Glu Val Ala Ala Arg Phe
Pro 3200 3205 3210Gly Val Ser Tyr Arg
Ala Tyr Asp Leu Met Asp Ala Gly Ala Gln 3215 3220
3225Arg Val Gly Glu Ile Leu His Thr Val Val Asp Leu Phe
Arg Arg 3230 3235 3240Gly Val Leu Glu
Pro Leu Pro Val Thr Ala Trp Asp Val Arg Gln 3245
3250 3255Ala His Gln Ala Leu Arg Ser Met Arg Ser Gly
Leu His Val Gly 3260 3265 3270Lys Asn
Val Leu Thr Leu Pro Val Pro Leu Asp Ala Glu Gly Thr 3275
3280 3285Val Leu Val Thr Gly Gly Thr Gly Thr Leu
Gly Ala Ala Val Ala 3290 3295 3300Arg
His Leu Ala Ala Gly His Gly Val Arg His Leu Leu Leu Val 3305
3310 3315Ser Arg Arg Gly Met Ala Ala Ala Gly
Ala Glu Lys Leu Cys Ala 3320 3325
3330Glu Leu Gly Gln Ala Gly Val Ser Val Ser Val Ala Gly Cys Asp
3335 3340 3345Val Ala Asp Arg Ala Gln
Val Ala Ala Leu Leu Glu Gln Val Pro 3350 3355
3360Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val
Leu 3365 3370 3375Asp Asp Ala Thr Val
Thr Cys Leu Asp Arg Asn Lys Ile Asp Ala 3380 3385
3390Val Leu Gly Ala Lys Val Asp Gly Ala Leu His Leu His
Glu Leu 3395 3400 3405Thr Ala Gly Met
Asp Leu Ser Ala Phe Val Leu Phe Ser Ser Ala 3410
3415 3420Ala Gly Val Leu Gly Ser Pro Gly Gln Gly Asn
Tyr Ala Ala Ala 3425 3430 3435Asn Ala
Ala Leu Asp Ala Leu Ala His Gln Arg Arg Ala Ala Gly 3440
3445 3450Leu Pro Ala Leu Ser Leu Ala Trp Gly Leu
Trp Glu Glu Ala Ser 3455 3460 3465Gly
Met Thr Gly His Leu Asp Ala Ala Asp Arg His Arg Ile Thr 3470
3475 3480Arg Ser Gly Leu His Pro Leu Thr Thr
Pro Asp Ala Leu Ala Leu 3485 3490
3495Leu Asp Thr Ala Leu Ala Ala Gly Arg Pro Ala Leu Leu Pro Ala
3500 3505 3510Asp Leu Arg Pro Thr His
Pro Ala Pro Pro Leu Leu Glu His Leu 3515 3520
3525Ala Pro Ala Arg Thr Ser His Arg Thr Ala His Thr Ser Thr
Ala 3530 3535 3540Thr Gly Val Gly Gln
Asp Val Ser Leu Thr Asp Arg Leu Ala Thr 3545 3550
3555Leu Thr Pro Glu Gln Arg His Asp Thr Leu Leu Ala Leu
Ala Arg 3560 3565 3570Thr His Ile Ala
Ala Val Leu Gly His Pro Ser Pro Asp Thr Ile 3575
3580 3585Asp Pro Glu Arg Thr Phe Arg Asp Leu Gly Phe
Asp Ser Leu Thr 3590 3595 3600Ala Val
Glu Leu Arg Asn Arg Leu Thr Arg Ala Thr Gly Leu Arg 3605
3610 3615Leu Pro Ala Thr Leu Ala Phe Asp His Pro
Thr Pro Thr Ala Leu 3620 3625 3630Thr
His His Leu Thr Thr Leu Leu Asn Pro Asn Asp Asn Asp Asn 3635
3640 3645Val Gly Pro Val Leu Met Glu Leu Glu
Arg Leu Glu Ser Ala Leu 3650 3655
3660Ala Ala Leu Asp Arg Asp Asp Ser Ala Cys Glu Arg Val Thr Leu
3665 3670 3675Arg Leu Gln Ser Leu Met
Leu Arg Trp Ser Gly Ser Glu Arg Gln 3680 3685
3690Ser Ala Glu Asn Thr Asp Asp Ser Ser Arg Phe Ala Ser Ala
Thr 3695 3700 3705Ala Glu Glu Leu Leu
Glu Phe Ile Asp Arg Asp Leu Gly Leu Ser 3710 3715
3720
7 6043 PRT bacteria 7
Val Ala Asn Asp Glu Lys Val Leu Glu
Tyr Leu Lys Arg Val Thr Ala1 5 10
15Asp Leu Asp Arg Thr Arg Arg Arg Leu Tyr Glu Val Val Glu Arg
Glu 20 25 30Gln Glu Pro Ile
Ala Ile Val Gly Met Ala Cys Arg Tyr Pro Gly Gly 35
40 45Ala Gly Ser Pro Ala Gly Leu Trp Asp Leu Val Ser
Ser Gly Thr Asp 50 55 60Ala Ile Gly
Glu Phe Pro Thr Asp Arg Gly Trp Asp Leu Glu Arg Leu65 70
75 80Tyr Asp Pro Asp Pro Asp His Pro
Gly Thr Thr Tyr Thr Arg His Gly 85 90
95Gly Phe Leu Asp Gly Val Gly Glu Phe Asp Ala Glu Phe Phe
Gly Val 100 105 110Ser Pro Arg
Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu 115
120 125Glu Thr Ala Trp Glu Ala Ile Glu His Ala Gly
Ile Val Pro Glu Ser 130 135 140Leu Arg
Gly Thr Ser Thr Gly Val Phe Ala Gly Ile Asn Pro Gln Asp145
150 155 160Tyr Thr Ile Ser Gln Tyr Gly
Arg Asp Ser Glu Ile Glu Gly Tyr Leu 165
170 175Leu Thr Gly Ala Ala Ala Ser Ile Ala Ser Gly Arg
Ile Ser Tyr Thr 180 185 190Leu
Gly Leu Glu Gly Pro Ala Val Thr Ile Asp Thr Ala Cys Ser Ser 195
200 205Ser Leu Val Ala Leu His Leu Ala Cys
Gln Ala Leu Arg Ala Gly Glu 210 215
220Cys Thr Met Ala Leu Ala Gly Gly Ala Ser Val Leu Ser Thr Pro Leu225
230 235 240Ile Phe Val Glu
Phe Ala Arg His His Gly Leu Ser Val Asp Gly Arg 245
250 255Cys Lys Ala Phe Ser Ala Ser Ala Asp Gly
Thr Gly Trp Gly Glu Gly 260 265
270Ala Gly Leu Leu Leu Leu Glu Arg Leu Ser Asp Ala Lys Arg Asn Gly
275 280 285Arg Arg Ile Leu Ala Leu Val
Arg Gly Ser Ala Val Asn Gln Asp Gly 290 295
300Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Cys Arg
Val305 310 315 320Ile Arg
Arg Ala Leu Ala Asn Ala His Leu Ala Pro Ala Asp Ile Asp
325 330 335Ala Val Glu Ala His Gly Thr
Gly Thr Thr Leu Gly Asp Pro Ile Glu 340 345
350Ala Gln Ala Leu Gln Glu Ala Tyr Gly Ala Asp Arg Pro Asp
Asp Arg 355 360 365Pro Leu Trp Val
Gly Thr Leu Lys Ser Asn Ile Gly His Ser Ile Ala 370
375 380Ala Ala Gly Val Gly Gly Val Ile Lys Met Val Met
Ala Leu Arg His385 390 395
400Glu Ser Leu Pro Arg Thr Leu His Val Asp Glu Pro Ser Pro Gln Val
405 410 415Asp Trp Ser Ser Gly
Ala Val Ser Leu Leu Thr Glu Ala Arg Pro Trp 420
425 430Pro Arg Asp Glu Asp Arg Pro Arg Arg Ala Gly Val
Ser Ser Phe Gly 435 440 445Val Ser
Gly Thr Asn Ala His Val Ile Leu Glu Glu Ala Pro Ala Pro 450
455 460Ala Glu Val Gln Ala Val Glu Thr Ala Pro Val
Val Arg Val Asp Gly465 470 475
480Gly Glu Arg Ser Ala Pro Ala Asp Val Pro Leu Val Trp Val Val Ser
485 490 495Gly Lys Ser Gln
Ala Ala Leu Arg Ala Gln Ala Ala Ala Leu His Ala 500
505 510His Val Leu Asp His Pro Glu Gln Asp Ala Ala
Asp Ile Gly Tyr Ser 515 520 525Leu
Ala Thr Thr Arg Ala Leu Phe Asp His Arg Ala Thr Leu Ile Ala 530
535 540Pro Asp Arg Asp Thr Leu Leu Asp Ala Leu
Thr Ala Leu Ala Asp Gly545 550 555
560Arg Thr His Pro His Leu Val Pro Ala Pro Pro Thr Glu Pro Gly
His 565 570 575Ala His Lys
Ile Ala Phe Leu Cys Ser Gly Gln Gly Thr Gln Arg Pro 580
585 590Gly Met Ala Thr Gly Leu Tyr His Thr Tyr
Pro Ala Phe Ala Ala Ala 595 600
605Leu Asp Glu Thr Cys Ala His Phe Asp Pro His Leu Asp His Pro Leu 610
615 620Arg Asp Leu Leu Leu Asn His Asp
Pro Thr Gly Leu Leu Thr His Thr625 630
635 640Leu Tyr Ala Gln Pro Ala Leu Phe Thr Leu Gln Lys
Ala Leu His His 645 650
655Leu Ile Thr Glu Thr Tyr Gly Ile Thr Pro His Tyr Leu Ala Gly His
660 665 670Ser Leu Gly Glu Ile Thr
Ala Ala His Leu Ala Gly Ile Leu Thr Leu 675 680
685Pro Asp Ala Thr His Leu Ile Thr Thr Arg Ala Arg Leu Met
Gln Thr 690 695 700Met Pro Pro Gly Thr
Met Thr Thr Leu His Thr Thr Pro Glu His Ile705 710
715 720Gln Pro Leu Leu Asp Gln His Pro Gly Lys
Ala Thr Ile Ala Ala Val 725 730
735Asn Ser Pro His Ser Leu Val Ile Ser Gly Asp Pro Asp Thr Ile His
740 745 750His Ile Thr Thr Thr
Cys His Thr Gln Gly Ile Thr Thr Lys Pro Leu 755
760 765Thr Thr Asn His Ala Phe His Ser Pro His Thr Asp
Thr Ile Leu Glu 770 775 780Gln Leu Asp
Thr Thr Thr His Thr Leu Thr Tyr His Pro Pro His Thr785
790 795 800Pro Leu Ile Thr Ser Thr Pro
Gly Asp Pro Leu Thr Pro His Tyr Trp 805
810 815Thr His Gln Thr Arg Gln Pro Val His Trp Thr Asp
Thr Ile His Thr 820 825 830Leu
His Thr Asn Gly Val Thr Thr Tyr Ile Glu Leu Gly Pro Asp His 835
840 845Thr Leu Thr Thr Leu Thr His His Asn
Leu Pro His His Gln Pro Thr 850 855
860Ala Ile Thr Leu Thr His Pro His His Asn Pro Thr His His Leu Leu865
870 875 880Thr Ala Leu Ala
His Thr Pro Thr Thr Trp His Thr His His His Thr 885
890 895His Thr Asn Pro His Pro His Thr Ile Pro
Asp Leu Pro Thr Tyr Pro 900 905
910Phe Gln Arg Arg His Tyr Trp Leu Gln Ala Pro Thr Thr Ser Thr Asp
915 920 925Gln Pro Val Ala Pro Thr Asn
Asp Asp Ala Pro Ala Pro Arg Ala Thr 930 935
940Ser Leu Arg Asp Thr Leu Ala Gly Arg Ser Pro Gln Glu Arg Glu
Glu945 950 955 960Val Leu
Leu Asp Leu Val Leu Thr Gln Val Ala Ala Val Leu Gly His
965 970 975Thr Ala Pro Glu Val Val Asp
Pro Gln Arg Ala Phe Lys Asp Leu Gly 980 985
990Phe Asp Ser Leu Ala Ala Ile Lys Leu Arg Asn Arg Leu Ala
Ala Ala 995 1000 1005Thr Gly Leu
Glu Leu Pro Thr Thr Leu Val Phe Asp His Pro Thr 1010
1015 1020Pro Val Ala Leu Arg Gln Tyr Phe Gln Ser Gln
Ile Leu Gly Ala 1025 1030 1035Glu Ala
Asp Ala Pro Asn Arg Leu Pro Leu Arg Ala Ala Thr Thr 1040
1045 1050Asp Glu Pro Ile Ala Ile Val Gly Met Ala
Cys Arg Phe Pro Gly 1055 1060 1065Gly
Val Arg Thr Ala Asp Asp Leu Trp Gln Leu Leu Ser Asp Glu 1070
1075 1080His Asp Ala Val Gly Gly Phe Pro Thr
Asn Arg Gly Trp Asp Val 1085 1090
1095Ala Asn Leu Tyr Asp Pro Asp Pro Asp Arg His Gly Thr Thr Tyr
1100 1105 1110Thr Gln Gln Gly Gly Phe
Leu Tyr Glu Ala Gly Glu Phe Asp Ala 1115 1120
1125Glu Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp
Pro 1130 1135 1140Gln Gln Arg Leu Leu
Leu Glu Thr Ala Trp Glu Ala Ile Glu His 1145 1150
1155Ala Gly Ile Asn Pro Asp Ala Leu Arg Asn Thr Ser Thr
Gly Val 1160 1165 1170Phe Ala Gly Val
Ile Tyr His Asp Tyr Ala Ser Arg Phe Leu Thr 1175
1180 1185Ala Pro Ala Gly Tyr Glu Gly Tyr Leu Gly His
Gly Ser Ala Gly 1190 1195 1200Ser Ile
Ala Ser Gly Arg Val Ala Tyr Val Leu Gly Leu Glu Gly 1205
1210 1215Pro Ala Val Thr Val Asp Thr Ala Cys Ser
Ser Ser Leu Val Ala 1220 1225 1230Leu
His Leu Ala Cys Gln Ala Leu Arg Ser Gly Glu Cys Thr Met 1235
1240 1245Ala Leu Ala Gly Gly Ala Thr Val Met
Ser Thr Pro Gln Ala Phe 1250 1255
1260Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Ala Asp Gly Arg Cys
1265 1270 1275Lys Ala Phe Ser Ala Ala
Ala Asp Gly Thr Gly Trp Gly Glu Gly 1280 1285
1290Ala Gly Leu Leu Leu Leu Glu Arg Leu Ser Glu Ala Glu Arg
Asn 1295 1300 1305Gly His Arg Val Leu
Ala Val Val Arg Gly Ser Ala Val Asn Gln 1310 1315
1320Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro
Ser Gln 1325 1330 1335Gln Arg Val Ile
Arg Gln Ala Leu Ala Asn Ser Gly Leu Thr Gly 1340
1345 1350Ala Asp Val Asp Ala Val Glu Ala His Gly Thr
Gly Thr Lys Leu 1355 1360 1365Gly Asp
Pro Ile Glu Ala Gln Ala Leu Leu Ala Thr Tyr Gly Gln 1370
1375 1380Glu His His Pro Asp Gln Pro Leu Trp Leu
Gly Ser Leu Lys Ser 1385 1390 1395Asn
Ile Gly His Ala Gln Ala Ala Ala Gly Val Gly Ser Ile Ile 1400
1405 1410Lys Met Ile Met Ala Met Arg Asn Glu
Ser Leu Pro Arg Thr Leu 1415 1420
1425His Val Asp Glu Pro Ser Pro His Val Asp Trp Ser Ser Gly Ala
1430 1435 1440Val Ser Leu Leu Thr Glu
Pro Arg Pro Trp Pro Arg Arg Glu Asp 1445 1450
1455Arg Pro Arg Arg Ala Gly Ile Ser Ser Phe Gly Val Ser Gly
Thr 1460 1465 1470Asn Ala His Val Ile
Val Glu Glu Pro Pro Ala Arg Ala Glu Val 1475 1480
1485Glu Ala Val Glu Ala Ala Pro Ala Gly Val Glu Thr Ala
Ala Ala 1490 1495 1500Ala Ala Val Val
Val Glu Thr Asp Gly Ala Gly Arg Val Ser Ser 1505
1510 1515Asp Val Pro Leu Val Trp Val Val Ser Gly Lys
Ser Gln Ala Ala 1520 1525 1530Leu Arg
Ala Gln Ala Ala Ala Leu His Ala His Val Leu Asp His 1535
1540 1545Pro Glu Gln Asp Ala Ala Asp Ile Gly Tyr
Ser Leu Ala Thr Thr 1550 1555 1560Arg
Ala Leu Phe Asp His Arg Ala Thr Leu Ile Ala Pro Asp Arg 1565
1570 1575Asp Thr Leu Leu Asp Ala Leu Thr Ala
Leu Ala Asp Gly Arg Thr 1580 1585
1590His Pro His Leu Ile Pro Thr Pro Pro Thr Glu Pro Gly His Thr
1595 1600 1605His Lys Ile Ala Phe Leu
Cys Ser Gly Gln Gly Thr Gln Arg Pro 1610 1615
1620Gly Met Ala Thr Gly Leu Tyr His Thr Tyr Pro Ala Phe Ala
Ala 1625 1630 1635Ala Leu Asp Glu Thr
Cys Ala His Phe Asp Pro His Leu Asp His 1640 1645
1650Pro Leu Arg Asp Leu Leu Leu Asn His Asp Pro Thr Asp
Leu Leu 1655 1660 1665Thr His Thr Leu
Tyr Ala Gln Pro Ala Leu Phe Thr Leu Gln Lys 1670
1675 1680Ala Leu His His Leu Ile Thr Glu Thr Tyr Gly
Ile Thr Pro His 1685 1690 1695Tyr Leu
Ala Gly His Ser Leu Gly Glu Ile Thr Ala Ala His Leu 1700
1705 1710Ala Gly Ile Leu Thr Leu Pro Asp Ala Thr
His Leu Ile Thr Thr 1715 1720 1725Arg
Ala Arg Leu Met Gln Thr Met Pro Pro Gly Thr Met Thr Thr 1730
1735 1740Leu His Thr Thr Pro Glu His Ile Gln
Pro Leu Leu Asp Gln His 1745 1750
1755Pro Gly Lys Ala Thr Ile Ala Ala Val Asn Ser Pro His Ser Leu
1760 1765 1770Val Ile Ser Gly Asp Pro
Asp Thr Ile His His Ile Thr Thr Thr 1775 1780
1785Cys His Asn Gln Gly Ile Thr Thr Lys Pro Leu Thr Thr Asn
His 1790 1795 1800Ala Phe His Ser Pro
His Thr Asn Thr Ile Leu Glu Gln Leu Asp 1805 1810
1815Thr Thr Thr His Thr Leu Thr Tyr His Pro Pro His Thr
Pro Leu 1820 1825 1830Ile Thr Ser Thr
Pro Gly Asn Pro Leu Thr Pro His Tyr Trp Thr 1835
1840 1845His Gln Thr Arg Gln Pro Val His Trp Ala Asp
Thr Ile His Thr 1850 1855 1860Leu His
Thr Asn Gly Val Thr Thr Tyr Ile Gly Leu Gly Pro Asp 1865
1870 1875His Thr Leu Ser Thr Leu Thr His His Asn
Leu Pro Gln His Gln 1880 1885 1890Pro
Thr Ala Ile Thr Leu Thr His Pro His His Asn Pro Thr His 1895
1900 1905His Leu Leu Thr Ala Leu Ala His Thr
Pro Thr Thr Trp His Thr 1910 1915
1920His His His Thr His Thr Asn Pro His Pro His Thr Ile Pro Asp
1925 1930 1935Leu Pro Thr Tyr Pro Phe
Gln Arg Arg His Tyr Trp Leu Glu Val 1940 1945
1950Pro Lys Pro Thr Ala Glu Ala Ser Ala Ser Ala Ser Gly Pro
Gly 1955 1960 1965Arg Asn Arg Ala Ala
Lys Leu Ser Ala Leu Glu Ala Glu Phe Trp 1970 1975
1980Gln Ala Val Glu Glu Thr Asp Thr Asp Thr Leu Ala His
Thr Leu 1985 1990 1995Asp Leu Asp Thr
Gln Thr Leu Glu Pro Val Leu Pro Ala Leu Ala 2000
2005 2010Thr Trp His Gln Gln Gln Arg Asp His Ala Arg
Ile Asn Thr Trp 2015 2020 2025Thr Tyr
Gln Glu Thr Trp Lys Pro Leu His Leu Pro Thr Thr Arg 2030
2035 2040Pro Thr Thr Pro Thr Ser Trp Leu Ile Ala
Ile Pro Glu Thr His 2045 2050 2055Arg
Asn His Pro His Thr Thr Asn Leu Leu Thr Asn Leu Pro His 2060
2065 2070His Asn Ile Thr Pro Ile Pro Leu Thr
Ile Asn His Thr Thr Asp 2075 2080
2085Leu His His Ala Tyr His His Ala His His His Thr Thr Pro Pro
2090 2095 2100Ile Thr Ala Val Leu Ser
Leu Leu Ala Leu Asp Glu Thr Pro His 2105 2110
2115Pro His His Pro His Thr Pro Thr Gly Thr Leu Leu Asn Leu
Thr 2120 2125 2130Leu Thr Gln Thr His
Thr Gln Thr His Pro Pro Thr Pro Leu Trp 2135 2140
2145Tyr Leu Thr Thr Gln Ala Thr Thr Thr His Pro Asn Asp
Pro Leu 2150 2155 2160Thr His Pro Thr
Gln Ala Gln Thr Ile Gly Leu Ala Arg Thr Thr 2165
2170 2175His Leu Glu His Pro His His Thr Gly Gly His
Ile Asp Leu Pro 2180 2185 2190Thr Thr
Pro His Pro Asn Thr Leu Thr Gln Leu Ile Thr Ala Leu 2195
2200 2205Thr His Pro His His Gln His Asn Leu Thr
Ile Arg Thr His Thr 2210 2215 2220Thr
His Thr Arg Arg Leu Thr Pro Thr Thr Leu Gln Pro Thr Thr 2225
2230 2235Pro Thr Pro Pro Thr Asn Pro His Gly
Thr Thr Leu Ile Thr Gly 2240 2245
2250Gly Thr Gly Ala Leu Ala Thr Thr Leu Ala His His Leu Ala Thr
2255 2260 2265Thr Gly Thr Gln His Leu
Leu Leu Thr Ser Arg Arg Gly Pro His 2270 2275
2280Thr Pro Gly Ala Arg Gln Leu His Thr Gln Leu Thr Gln Leu
Gly 2285 2290 2295Thr Asn Thr Thr Ile
Thr Ala Cys Asp Leu Ser Asp Pro Asp Gln 2300 2305
2310Leu Thr His Leu Leu Thr His Ile Pro Pro Glu His Pro
Leu Thr 2315 2320 2325Thr Val Ile His
Thr Ala Gly Ile Leu Asp Asp Ala Thr Leu Thr 2330
2335 2340Asn Leu Thr Pro Thr Gln Leu Asp Asn Val Leu
Arg Ala Lys Ala 2345 2350 2355His Thr
Ala His Leu Leu His His Ala Thr Leu His Thr Pro Leu 2360
2365 2370Asp His Phe Val Leu Tyr Ser Ser Ala Ala
Ala Thr Leu Gly Ala 2375 2380 2385Pro
Gly Gln Ala Asn Tyr Ala Ala Ala Asn Ala Tyr Leu Asp Ala 2390
2395 2400Leu Ala His His Arg His Thr His Asn
Leu Pro Ala Thr Thr Ile 2405 2410
2415Ala Trp Gly Thr Trp Gln Gly Asn Gly Leu Ala Ser Gly Asp Ile
2420 2425 2430Gly Glu His Leu Arg Arg
Arg Gly Met Ile Pro Leu Asp Pro Glu 2435 2440
2445Ser Ala Val Gly Ala Phe Asp Arg Ala Val Ala Ser Asp Arg
Pro 2450 2455 2460Ser Val Phe Val Ala
Asp Ile Asp Trp Pro Thr Phe Gly Arg Asn 2465 2470
2475Thr Ser Ser Gly Leu Arg Ala Leu Phe Glu Asp Ile Pro
Glu Ala 2480 2485 2490Thr Gln Pro Glu
Pro Thr Ala Arg Ser Ala Asp Gln Pro Asn Gly 2495
2500 2505His Gly Ser Leu Gln Glu Leu Leu Ala Arg Gln
Ser Pro Ala Glu 2510 2515 2520Gln Ala
Glu Thr Leu Leu Ala Leu Val Arg Thr His Ser Ala Thr 2525
2530 2535Val Leu Gly Arg Asp Gly Ala Asp Ala Val
Ala Ala Glu Arg Pro 2540 2545 2550Phe
Arg Asp Leu Gly Phe Asp Ser Leu Ser Ala Val Glu Leu Arg 2555
2560 2565Asn His Leu Thr Ala Asp Thr Glu Leu
Ala Leu Pro Thr Thr Leu 2570 2575
2580Val Phe Asp His Pro Thr Pro Val Lys Leu Ala Glu Phe Leu Arg
2585 2590 2595Thr Glu Leu Leu Gly Thr
Ala Pro Ala Thr Thr Ala Ala Val Pro 2600 2605
2610Ala Leu Gln Ser His Thr Asp Glu Pro Ile Ala Ile Ile Gly
Met 2615 2620 2625Ala Cys Arg Phe Pro
Gly Ala Val Thr Thr Pro Glu His Leu Trp 2630 2635
2640Asn Leu Ile Ala Thr Glu Gln Asp Ala Ile Gly Glu Phe
Pro Thr 2645 2650 2655Asp Arg Gly Trp
Asp Leu Asp Asn Leu Tyr His Pro Asp Pro Asp 2660
2665 2670His Pro Gly Thr Thr Tyr Thr Arg His Gly Gly
Phe Leu Tyr Asp 2675 2680 2685Ala Gly
Asp Phe Asp Ala Glu Phe Phe Gly Ile Asn Pro Arg Glu 2690
2695 2700Ala Leu Ala Met Asp Pro Gln Gln Arg Leu
Leu Leu Glu Thr Ala 2705 2710 2715Trp
Glu Ala Ile Glu His Ala Gly Ile Leu Pro Asp Ala Leu His 2720
2725 2730Gly Thr Pro Thr Gly Val Phe Thr Gly
Val Asn Ala Gln Asp Tyr 2735 2740
2745Ala Ala His Thr His Ala Ser Pro His Thr Thr Glu Gly Tyr Thr
2750 2755 2760Leu Thr Gly Thr Ala Gly
Ser Ile Ala Ser Gly Arg Ile Ala Tyr 2765 2770
2775Thr Leu Gly Leu Glu Gly Pro Ala Val Thr Ile Asp Thr Ala
Cys 2780 2785 2790Ser Ser Ser Leu Val
Ala Leu His Leu Ala Cys Gln Ala Leu Arg 2795 2800
2805Ala Gly Glu Cys Thr Thr Ala Leu Ala Ser Gly Ile Thr
Val Met 2810 2815 2820Thr Ser Pro Val
Thr Phe Thr Glu Phe Ser Arg Gln Arg Gly Leu 2825
2830 2835Ala Pro Asp Gly His Cys Lys Ala Phe Ser Ala
Ser Ala Asp Gly 2840 2845 2850Thr Gly
Trp Ser Glu Gly Val Gly Thr Ile Leu Val Glu Arg Leu 2855
2860 2865Ser Asp Ala Glu Arg Asn Gly His Arg Ile
Leu Ala Val Val Arg 2870 2875 2880Gly
Ser Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala 2885
2890 2895Pro Asn Gly Pro Ser Gln Gln Arg Val
Ile Arg Gln Ala Leu Ala 2900 2905
2910Asn Ser Gly Leu Thr Gly Ala Asp Val Asp Ala Val Glu Ala His
2915 2920 2925Gly Thr Gly Thr Lys Leu
Gly Asp Pro Ile Glu Ala Gln Ala Leu 2930 2935
2940Leu Ala Thr Tyr Gly Gln Gly Arg Ala Gln Glu Gln Pro Leu
Trp 2945 2950 2955Leu Gly Ser Val Lys
Ser Asn Leu Gly His Thr Gln Ala Ala Ala 2960 2965
2970Gly Met Ala Gly Leu Ile Lys Met Val Met Ala Leu Arg
His Glu 2975 2980 2985Ser Leu Pro Arg
Thr Leu His Val Asp Glu Pro Ser Pro Gln Val 2990
2995 3000Asp Trp Ser Ser Gly Ala Val Ser Leu Leu Thr
Glu Ala Arg Pro 3005 3010 3015Trp Pro
Arg Arg Glu Asp Arg Pro Arg Arg Ala Gly Ile Ser Ser 3020
3025 3030Phe Gly Val Ser Gly Thr Asn Ala His Val
Ile Leu Glu Glu Ala 3035 3040 3045Pro
Ala Pro Ala Glu Ala Val Glu Thr Glu Gln Gly Val Val Pro 3050
3055 3060Gln Gly Asp Gln Glu Cys Ser Ala Pro
Val Gly Val Pro Leu Val 3065 3070
3075Trp Val Val Ser Gly Lys Ser Gln Ala Ala Leu Arg Ala Gln Ala
3080 3085 3090Ala Ala Leu His Ala His
Val Leu Asp His Pro Glu Gln Asp Ala 3095 3100
3105Ala Asp Ile Gly Tyr Ser Leu Ala Thr Thr Arg Ala Leu Phe
Asp 3110 3115 3120His Arg Ala Thr Leu
Ile Ala Pro Asp Arg Asp Thr Leu Leu Asp 3125 3130
3135Ala Leu Thr Ala Leu Ala Asp Gly Arg Thr His Pro His
Leu Ile 3140 3145 3150Pro Thr Pro Pro
Thr Glu Pro Gly His Thr His Lys Ile Ala Phe 3155
3160 3165Leu Cys Ser Gly Gln Gly Thr Gln Arg Pro Gly
Met Ala Thr Gly 3170 3175 3180Leu Tyr
His Thr Tyr Pro Ala Phe Ala Ala Ala Leu Asp Glu Thr 3185
3190 3195Cys Ala His Phe Asp Pro His Leu Asp His
Pro Leu Arg Asp Leu 3200 3205 3210Leu
Leu Asn His Asp Pro Thr Asp Leu Leu Thr His Thr Leu Tyr 3215
3220 3225Ala Gln Pro Ala Leu Phe Thr Leu Gln
Lys Ala Leu His His Leu 3230 3235
3240Ile Thr Glu Thr Tyr Gly Ile Thr Pro His Tyr Leu Ala Gly His
3245 3250 3255Ser Leu Gly Glu Ile Thr
Ala Ala His Leu Ala Gly Ile Leu Thr 3260 3265
3270Leu Pro Asp Ala Thr His Leu Ile Thr Thr Arg Ala Arg Leu
Met 3275 3280 3285Gln Thr Met Pro Pro
Gly Thr Met Thr Thr Leu His Thr Thr Pro 3290 3295
3300Glu His Ile Gln Pro Leu Leu Asp Gln His Pro Gly Lys
Ala Thr 3305 3310 3315Ile Ala Ala Val
Asn Ser Pro His Ser Leu Val Ile Ser Gly Asp 3320
3325 3330Pro Asp Thr Ile His His Ile Thr Thr Thr Cys
His Thr Gln Gly 3335 3340 3345Ile Thr
Thr Lys Pro Leu Thr Thr Asn His Ala Phe His Ser Pro 3350
3355 3360His Thr Asp Thr Ile Leu Glu Gln Leu Asp
Thr Thr Thr His Thr 3365 3370 3375Leu
Thr Tyr His Gln Pro His Thr Pro Leu Ile Thr Ser Thr Pro 3380
3385 3390Gly Asp Pro Leu Thr Pro His Tyr Trp
Thr His Gln Thr Arg Gln 3395 3400
3405Pro Val His Trp Ala Asp Thr Ile His Thr Leu His Thr Asn Gly
3410 3415 3420Val Thr Thr Tyr Ile Gly
Leu Gly Pro Asp His Thr Leu Ser Thr 3425 3430
3435Leu Thr His His Asn Leu Pro Gln His Gln Pro Thr Ala Ile
Thr 3440 3445 3450Leu Thr His Pro His
His Asn Pro Thr His His Leu Leu Thr Ala 3455 3460
3465Leu Ala His Thr Pro Thr Thr Trp His Thr His His His
Thr His 3470 3475 3480Thr Asn Pro His
Pro His Thr Ile Pro Asp Leu Pro Thr Tyr Pro 3485
3490 3495Phe Gln Arg Arg His Tyr Trp Leu Glu Val Pro
Lys Pro Thr Ala 3500 3505 3510Glu Ala
Ser Ala Ser Ala Ser Gly Pro Gly Arg Asn Arg Ala Ala 3515
3520 3525Lys Leu Ser Ala Leu Glu Ala Glu Phe Trp
Gln Ala Val Glu Glu 3530 3535 3540Thr
Asp Thr Asp Thr Leu Ala His Thr Leu Asp Leu Asp Thr Gln 3545
3550 3555Thr Leu Glu Pro Val Leu Pro Ala Leu
Ala Thr Trp His Gln Gln 3560 3565
3570Gln Arg Asp His Ala Arg Ile Asn Thr Trp Thr Tyr Gln Glu Thr
3575 3580 3585Trp Lys Pro Leu His Leu
Pro Thr Thr Arg Pro Thr Thr Pro Thr 3590 3595
3600Ser Trp Leu Ile Ala Ile Pro Glu Thr His Arg Asn His Pro
His 3605 3610 3615Thr Thr Asn Leu Leu
Thr Asn Leu Pro His His Asn Ile Thr Pro 3620 3625
3630Ile Pro Leu Thr Ile Asn His Thr Thr Asp Leu His His
Ala Tyr 3635 3640 3645His His Ala His
His His Thr Thr Pro Pro Ile Thr Ala Val Leu 3650
3655 3660Ser Leu Leu Ala Leu Asp Glu Thr Pro His Pro
His His Pro His 3665 3670 3675Thr Pro
Thr Gly Thr Leu Leu Asn Leu Thr Leu Thr Gln Thr His 3680
3685 3690Thr Gln Thr His Pro Pro Thr Pro Leu Trp
Tyr Leu Thr Thr Gln 3695 3700 3705Ala
Thr Thr Thr His Pro Asn Asp Pro Leu Thr His Pro Thr Gln 3710
3715 3720Ala Gln Thr Ile Gly Leu Ala Arg Thr
Thr His Leu Glu His Pro 3725 3730
3735His His Thr Gly Gly His Ile Asp Leu Pro Thr Thr Pro His Pro
3740 3745 3750Asn Thr Leu Thr Gln Leu
Ile Thr Ala Leu Thr His Pro His His 3755 3760
3765Gln His Asn Leu Thr Ile Arg Thr His Thr Thr His Thr Arg
Arg 3770 3775 3780Leu Thr Pro Thr Thr
Leu Gln Pro Thr Thr Pro Thr Pro Pro Thr 3785 3790
3795Asn Pro His Gly Thr Thr Leu Ile Thr Gly Gly Thr Gly
Ala Leu 3800 3805 3810Ala Thr Thr Leu
Ala His His Leu Ala Thr Thr Gly Thr Gln His 3815
3820 3825Leu Leu Leu Thr Ser Arg Arg Gly Pro His Thr
Pro Gly Ala Arg 3830 3835 3840Gln Leu
His Thr Gln Leu Thr Gln Leu Gly Thr Asn Thr Thr Ile 3845
3850 3855Thr Ala Cys Asp Leu Ser Asp Pro Asp Gln
Leu Thr His Ile Leu 3860 3865 3870Thr
His Ile Pro Pro Glu His Pro Leu Thr Thr Val Ile His Thr 3875
3880 3885Ala Gly Val Asn His Tyr Ala Pro Val
Ala Ala Thr Asp Pro Ser 3890 3895
3900Thr Phe Ala Ser Val Leu Ala Ala Lys Ala Ala Gly Ala Ala His
3905 3910 3915Leu His Glu Leu Leu Leu
Glu Leu Asp Thr Val Glu Gln Phe Ile 3920 3925
3930Leu Phe Ser Ser Gly Ser Gly Ala Trp Gly Ser Gly Asn Gln
Cys 3935 3940 3945Ala Tyr Ala Ala Ala
Asn Ala Tyr Leu Asp Ala Leu Ala Ala His 3950 3955
3960Arg Gln Ala Arg Gly Leu Pro Gly Met Ser Leu Ala Trp
Gly Pro 3965 3970 3975Trp Asp Gly Asp
Gly Met Ser Ala Gly Glu Asp Ala Gln Arg Tyr 3980
3985 3990Leu Arg Glu Arg Gly Val Leu Pro Met Asp Pro
Arg Leu Ala Val 3995 4000 4005Ala Ala
Phe Asp Glu Ala Val Arg Ala Arg Pro Asn Ser Asn Leu 4010
4015 4020Val Val Ala Asp Ile Asp Trp Glu Arg Phe
Val Pro Thr Phe Thr 4025 4030 4035Ala
Arg Gly His Asn Pro Leu Ile Glu Asp Ile Pro Glu Val Arg 4040
4045 4050Arg Leu Ala Ala Glu Ala Glu Ala Ala
Gln Thr Thr Thr Ala Ala 4055 4060
4065Thr Asp Ala Pro Ala Leu Leu Asn Arg Leu Ser Gly Leu Ser Ala
4070 4075 4080Thr Gln Gln Lys Gln His
Leu Leu Arg Leu Val Arg Ser His Met 4085 4090
4095Gly Glu Val Leu Gly Arg Glu Asp Val Asp Thr Leu Asp Glu
Arg 4100 4105 4110His Thr Phe Arg Asp
Leu Gly Phe Asp Ser Leu Thr Ser Ala Arg 4115 4120
4125Phe Ser Gln Arg Leu Ala Lys Asp Thr Gly Leu His Leu
Pro Ala 4130 4135 4140Thr Leu Val Phe
Asp His Pro Thr Pro Ala Asp Cys Val Ala His 4145
4150 4155Leu Arg Asp Gln Leu Leu Gly Glu Thr Asp Asp
Met Thr Pro Arg 4160 4165 4170Lys Arg
Asp His Leu Gly Glu Asp Arg Arg Ala Ala Thr Ala Asp 4175
4180 4185Asp Pro Ile Ala Ile Val Gly Met Ala Cys
Arg Phe Pro Gly Gly 4190 4195 4200Val
Arg Ser Ala Asp Asp Leu Trp Asp Leu Leu Ser Ser Gly Thr 4205
4210 4215Asp Ala Ile Ser Gly Phe Pro Thr Asp
Arg Gly Trp Asp Ile Glu 4220 4225
4230Ser Leu Tyr Asp Pro Asp Pro Asp Arg Ser Gly Thr Thr Tyr Thr
4235 4240 4245Arg His Gly Gly Phe Leu
Tyr Asp Ala Gly Gln Phe Asp Ala Glu 4250 4255
4260Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro
Gln 4265 4270 4275Gln Arg Leu Leu Leu
Glu Thr Ala Trp Glu Ala Val Glu His Ala 4280 4285
4290Gly Ile Asn Pro Gln Thr Leu His Gly Thr Pro Thr Gly
Val Phe 4295 4300 4305Thr Gly Val Asn
Ala Gln Asp Tyr Ala Ala His Leu Arg Gln Ala 4310
4315 4320Ser Gly Asn Val Glu Gly Tyr Ala Leu Thr Gly
Ser Ser Gly Ser 4325 4330 4335Val Val
Ser Gly Arg Val Ala Tyr Thr Phe Gly Phe Glu Gly Pro 4340
4345 4350Ala Val Ser Val Asp Thr Ala Cys Ser Ser
Ser Leu Val Ala Leu 4355 4360 4365His
Leu Ala Gly Gln Ala Leu Arg Ser Gly Glu Cys Thr Met Ala 4370
4375 4380Leu Ala Gly Gly Val Met Val Met Ser
Ser Pro Glu Thr Phe Val 4385 4390
4395Glu Phe Ser Arg Gln Arg Gly Leu Ser Val Asp Gly Arg Cys Lys
4400 4405 4410Ser Phe Ala Ala Ala Ala
Asp Gly Thr Gly Trp Gly Glu Gly Val 4415 4420
4425Gly Met Leu Leu Val Glu Arg Leu Ser Asp Ala Glu Arg Asn
Gly 4430 4435 4440His Arg Val Leu Ala
Val Val Arg Gly Ser Ala Val Asn Gln Asp 4445 4450
4455Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser
Gln Gln 4460 4465 4470Arg Val Ile Arg
Gln Ala Leu Ala Asn Ser Gly Leu Thr Gly Ala 4475
4480 4485Asp Val Asp Ala Val Glu Ala His Gly Thr Gly
Thr Lys Leu Gly 4490 4495 4500Asp Pro
Ile Glu Ala Gln Ala Leu Leu Ala Thr Tyr Gly Gln Glu 4505
4510 4515His His Pro Asp Gln Pro Leu Trp Leu Gly
Ser Leu Lys Ser Asn 4520 4525 4530Ile
Gly His Ala Gln Ala Ala Ala Gly Val Gly Gly Ile Ile Lys 4535
4540 4545Met Val Met Ala Leu Arg His Glu Thr
Leu Pro Arg Thr Leu His 4550 4555
4560Ile Asp Glu Pro Thr Pro Gln Val Asp Trp Ser Ser Gly Ala Val
4565 4570 4575Ser Leu Leu Thr Glu Pro
Arg Pro Trp Pro Arg Gln Gly Asp Arg 4580 4585
4590Pro Arg Arg Ala Gly Ile Ser Ser Phe Gly Val Ser Gly Thr
Asn 4595 4600 4605Ala His Val Ile Leu
Glu Glu Ala Pro Ala Gln Pro Ala Gly Asp 4610 4615
4620Pro Ala Pro Glu Asp Gly Ala Pro Val Pro Trp Ala Met
Ser Ala 4625 4630 4635Arg Ser Asn Ala
Ala Leu Arg Ala Gln Ala Ala Leu Leu Arg Asp 4640
4645 4650Phe Leu Gln Gly Pro Gly Thr Asp Thr Ala Leu
Arg Ala Val Gly 4655 4660 4665Ala Glu
Leu Ala His Gly Arg Ala Val Leu Glu His Arg Ala Val 4670
4675 4680Ile Val Ala Arg Glu Arg Thr Glu Phe Glu
Asp Ala Leu Glu Ala 4685 4690 4695Leu
Ala Ser Gly Glu Pro His Pro Ala Leu Ile Glu Asp Thr Thr 4700
4705 4710Gly Ser Gln Thr Asn Ser His Ser Gly
Gly Gly Val Val Phe Val 4715 4720
4725Phe Pro Gly Gln Gly Gly Gln Trp Ala Gly Met Gly Leu Asp Leu
4730 4735 4740Leu Arg Asp Ser Gln Val
Phe Ala Asp His Val Gly Ala Cys Glu 4745 4750
4755Arg Ala Leu Ala Pro Trp Val Glu Trp Ser Leu Thr Glu Met
Leu 4760 4765 4770His Arg Asp Ala Glu
Asp Pro Val Trp Glu Arg Ala Asp Val Val 4775 4780
4785Gln Pro Val Leu Phe Ser Val Met Val Ser Leu Ala Ala
Leu Trp 4790 4795 4800Arg Ser Tyr Gly
Ile Glu Pro Glu Ala Val Val Gly His Ser Gln 4805
4810 4815Gly Glu Ile Ala Ala Ala His Val Cys Gly Ala
Leu Thr Leu Glu 4820 4825 4830Asp Ala
Ala Lys Ile Val Ala Leu Arg Ser Arg Ala Leu Ala Ala 4835
4840 4845Leu Arg Gly His Gly Gly Met Ala Ser Leu
Ala Leu Thr Gly Thr 4850 4855 4860Glu
Ala Glu Asp Leu Ile Thr Thr His Trp Pro Gly Arg Leu Trp 4865
4870 4875Thr Ala Ala Phe Asn Gly Pro Arg Ala
Thr Thr Val Ser Gly Asp 4880 4885
4890Thr Asp Ala Leu Asp Glu Leu Leu Thr His Cys Thr Glu Thr Gly
4895 4900 4905Val Arg Ala Arg Arg Ile
Pro Val Asp Tyr Ala Ser His Cys Pro 4910 4915
4920His Thr Glu Thr Ile Glu His Asp Leu Leu His Met Leu His
Gly 4925 4930 4935Ile Thr Pro Gln Pro
Gly Ser Ile Pro Phe Tyr Ser Thr Val Glu 4940 4945
4950Asp Ala Trp Thr Asp Thr Thr Thr Leu Asp Ala Ala Tyr
Trp Tyr 4955 4960 4965Arg Asn Leu Arg
Arg Pro Val Arg Phe Thr His Ala Val Arg Thr 4970
4975 4980Leu Thr Ala Gln Gly His Arg Leu Phe Ile Glu
Thr Ser Pro His 4985 4990 4995Pro Thr
Leu Thr Pro Ala Ile Glu Asp His Asp His Thr Thr Ala 5000
5005 5010Leu Gly Thr Leu Arg Arg His Asp Asn Asp
Thr His Arg Phe Leu 5015 5020 5025Thr
Ala Leu Ala His Ala His Thr Thr Gly His Thr Val Thr Trp 5030
5035 5040Thr Thr His Tyr Pro Thr Thr Pro His
Thr Pro Ala Ile Asp Leu 5045 5050
5055Pro Thr Tyr Pro Phe Gln His His His Tyr Trp Leu His Thr Pro
5060 5065 5070Thr Thr Ser Thr Gly Asp
Val Ser Ala Ala Gly Leu His Pro Thr 5075 5080
5085Glu His Pro Leu Leu Gly Ala Thr Val Glu Leu Ala Asp Gly
Asp 5090 5095 5100Gly Thr Leu Leu Thr
Gly Arg Leu Ser Leu His Thr His Pro Trp 5105 5110
5115Leu Ala Asp His Ser Val Gly Gly Ile Val Leu Leu Pro
Gly Thr 5120 5125 5130Ala Leu Leu Glu
Leu Ala Leu Glu Ala Gly Thr Arg Thr Gly Cys 5135
5140 5145Pro His Val Gln Glu Leu Thr Leu His Thr Pro
Leu Val Ile Pro 5150 5155 5160Glu Thr
Gly His Val Val Phe Gln Leu Thr Val Ser Ala Pro Asp 5165
5170 5175Glu Thr Gly Gln Arg Pro Phe Thr Val His
Phe Arg Ser Glu Ala 5180 5185 5190Val
Thr Gly Ala Asp Asp Pro Ala Asp Arg Thr Trp Thr Arg Cys 5195
5200 5205Ala Thr Gly Ala Leu Ser Thr Ala Ala
Ala Pro Asp His Ser Glu 5210 5215
5220Ala Ala Thr Trp Pro Pro Pro Ser Ala Gln Pro Leu Asp Leu Asp
5225 5230 5235Gly Leu Tyr Asp Arg Met
Ala Glu Ala Gly Leu Val Tyr Gly Pro 5240 5245
5250Val Phe Gln Gly Leu Arg Glu Ala Trp Leu Asp Gly Glu Asp
Ile 5255 5260 5265Val Ala Glu Val Arg
Leu Pro Gln Glu Ala Ala Ala Asp Thr Gln 5270 5275
5280Gly Phe Gly Leu His Pro Ala Leu Leu Asp Ala Ala Leu
His Val 5285 5290 5295Thr Ala Leu Thr
Ser Gln Ala Gly Thr Ala Asp Glu Asp Ala Gln 5300
5305 5310Glu Arg Arg Arg Leu Pro Phe Ala Trp Ala Gly
Val Ser Leu Phe 5315 5320 5325Ala Arg
Glu Cys Ala Ala Leu Arg Val Arg Val Ala Pro Cys Ala 5330
5335 5340Pro His Pro Gly Asp Ala Val Ala Ile Thr
Ala Thr Asp Glu Asp 5345 5350 5355Gly
Arg Pro Val Leu Ala Val Glu Ser Leu Thr Leu Arg Pro Val 5360
5365 5370Ser Pro Asp Gln Leu Arg Ala Ala Ala
Pro Ala Ala Gly Arg Asp 5375 5380
5385Ser Leu Phe Arg Leu Glu Trp Val Pro Val Thr Ala Ser Ala Ser
5390 5395 5400Ala Ser Ala Arg Pro Thr
Gly Pro Trp Ala Ala Ile Gly Thr Gly 5405 5410
5415Pro Ala Val Ala Gly Leu Ala Gly His Ala Asp Leu Thr Val
Tyr 5420 5425 5430Ala Glu Ala Gly Asp
Leu Leu Arg Asp Leu Asp Gly Gly Ala Pro 5435 5440
5445Ala Pro Ala Val Val Val Leu Ser Val Thr Pro Asp Ala
Asp Glu 5450 5455 5460Phe Ala Thr Pro
Arg Ala Ala Thr Gly Arg Ala Leu Ser Val Leu 5465
5470 5475Gln Ala Trp Leu Ala Asp Glu Arg Leu Ala Asp
Ser Arg Leu Val 5480 5485 5490Ala Val
Thr Ser Gly Ala Val Val Ala Ala Pro Gly Asp Asp Thr 5495
5500 5505Val Asp Val Pro Gly Ala Ala Val Trp Gly
Leu Val Arg Ser Gly 5510 5515 5520Gln
Ser Glu His Pro Asp Arg Ile Thr Leu Leu Asp Cys Ala Ser 5525
5530 5535Gly Ala Arg Pro Gly Pro Asp Leu Val
Ala Ala Ala Leu Ala Ser 5540 5545
5550Gly Glu Pro Gln Leu Ala Ala Arg Ala Gly Val Leu Tyr Thr Pro
5555 5560 5565Arg Leu Ala Arg Pro His
Arg Asp Ala Ser Ala Val Pro Arg Ser 5570 5575
5580Leu Pro Ser His Gly Thr Val Leu Ile Thr Gly Gly Thr Gly
Leu 5585 5590 5595Leu Gly Gly Leu Val
Ala Arg Arg Leu Val Glu Ala His Gly Val 5600 5605
5610Arg Arg Leu Leu Leu Ala Gly Arg Arg Gly Pro Ala Ala
Glu Gly 5615 5620 5625Leu Asp Ser Leu
Thr Ser Glu Leu Arg Glu Arg Gly Ala Thr Val 5630
5635 5640Glu Val Ala Ala Cys Asp Ala Ala Asp Arg Thr
Gln Leu Glu Ala 5645 5650 5655Leu Leu
Ala Gly Val Pro Glu Glu His Pro Leu Ser Ala Val Val 5660
5665 5670His Ala Ala Gly Val Leu Asp Asp Gly Val
Leu Thr Ser Leu Thr 5675 5680 5685Asn
Glu Arg Leu Gly Ala Val Leu Arg Ala Lys Ala Asp Ser Ala 5690
5695 5700Leu Leu Leu His Glu Leu Thr Gln Asp
Leu Asp Leu Ser Ala Phe 5705 5710
5715Val Leu Phe Ser Ser Ala Ala Gly Val Leu Gly Ser Pro Gly Gln
5720 5725 5730Gly Ser Tyr Ala Ala Ala
Asn Ala Val Leu Asp Ala Leu Ala His 5735 5740
5745Gln Arg Ser Ala Ala Gly Leu Pro Ala Leu Ser Leu Ala Trp
Gly 5750 5755 5760Leu Trp Ala Glu Gly
Ser Gly Met Thr Gly His Leu Asp Ala Asp 5765 5770
5775Asp Arg Ser Arg Ile Asn Arg Ala Gly Met Ala Pro Leu
Pro Thr 5780 5785 5790Pro Asp Ala Leu
Asp Leu Phe Asp Ala Ala Leu Ser Ser Asp Glu 5795
5800 5805Pro Phe Leu Val Pro Ala Arg Phe Asp Leu Ser
Ala Val Arg Thr 5810 5815 5820Arg Thr
Ala Tyr Gly Pro Leu Pro Pro Leu Leu Arg Gly Leu Val 5825
5830 5835Arg Thr Ser Gly Ala His Arg Val Arg Gly
Ala Val Gly Glu Ala 5840 5845 5850Arg
Ala Ala Gly Val Asp Glu Ala Gly Arg Leu Arg Glu Arg Leu 5855
5860 5865Ala Arg Gln Ser Asp Ala Glu Arg Arg
Asn Thr Leu Leu Arg Leu 5870 5875
5880Val Gln Ser Asn Val Ala Ala Val Leu Gly His Arg Gly Thr Gly
5885 5890 5895Thr Val Ala Glu Thr Arg
Ala Phe Arg Glu Leu Gly Phe Asp Ser 5900 5905
5910Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Lys Val Ala Thr
Gly 5915 5920 5925Leu Ala Leu Arg Ala
Thr Val Ala Phe Asp Phe Pro Thr Pro Ala 5930 5935
5940Ala Leu Ala Glu His Leu Gly Ala Arg Leu Leu Pro Pro
Asp Gly 5945 5950 5955Ala Val Ser Glu
Ala Val Gly Glu Lys Glu Leu Arg Gly Leu Leu 5960
5965 5970Thr Ser Ile Pro Ile Gly Arg Leu Arg Glu Ala
Gly Leu Ile Asp 5975 5980 5985Arg Leu
Leu Ala Leu Ala Ala Ala Ala Pro Asp Ser Ala Asp Gln 5990
5995 6000Thr Ala Glu Gln Pro Ser Arg Ser Val Ser
Val Glu Asp Ile Asp 6005 6010 6015Ala
Met Asp Val Asp Ser Leu Ile Gly Leu Ala His Asp Thr Gly 6020
6025 6030Thr Asp Ser Gly His Ala Pro Cys Glu
Gly 6035 6040
SEQ ID NO: 8, length 284, type PRT, organism bacteria
Met Thr Lys Ala Pro His
Gln Gly Ser Pro Thr Pro Ala Asp Val Gly1 5
10 15Asp Tyr Tyr Asp Arg Met Thr Ser Leu Leu Asn Arg
Ala Leu Gly Gly 20 25 30Asn
Thr His Leu Gly Tyr Trp Pro His Pro Asp Asp Gly Ser Thr Leu 35
40 45Gly Gln Ala Ser Asp Arg Leu Thr Asp
His Met Ile Gly Lys Leu Arg 50 55
60Glu His Thr Gly Arg Pro Val Arg Arg Val Leu Asp Val Gly Cys Gly65
70 75 80Ser Gly Arg Pro Ala
Leu Arg Leu Ala His Ser Glu Pro Val Asp Ile 85
90 95Val Gly Ile Thr Ile Ser Pro Arg Gln Val Glu
Leu Ala Thr Ala Leu 100 105
110Ala Glu Arg Ser Gly Leu Ala Asn Arg Val Arg Phe Glu Cys Ala Asp
115 120 125Ala Met Asp Leu Pro Phe Pro
Asp Ala Ser Phe Asp Ala Val Trp Ala 130 135
140Leu Glu Cys Leu Leu His Met Pro Asp Pro Ala Arg Val Phe Gln
Glu145 150 155 160Met Ala
Arg Val Leu Arg Pro Gly Gly Arg Leu Ala Ala Met Asp Val
165 170 175Thr Leu Arg Ala Ser Gln Pro
Thr Gly Ala Asp Trp Ser Ser Ser Glu 180 185
190Leu Ala Val Pro Ser Leu Ile Pro Ile Thr Ala Tyr Ala Gly
Met Ile 195 200 205Ser Asp Ala Gly
Leu Arg Leu Thr Glu Leu Thr Asp Ile Gly Glu His 210
215 220Val Ile Ala Pro Ser Tyr Ser Ala Met Gly Asp Asp
Val Arg Ala Asn225 230 235
240Ala His Ala Tyr Ala Glu Ala Leu Glu Met Thr Ala Asp Asp Leu Glu
245 250 255Thr Phe Val Gly Lys
Cys Ser Gln Trp Tyr Thr Glu Asp Ile Gly Tyr 260
265 270Val Val Leu Thr Ala Pro Cys Gln Arg Ala Glu Val
275 280
SEQ ID NO: 9, length 468, type PRT, organism bacteria
Val Ser Ser Pro Pro Ser Thr
Ile Pro Glu Ala Pro Gly Ala Trp Pro1 5 10
15Val Leu Gly His Leu Pro Ala Leu Leu Arg Asp Pro Leu
Gly Phe Leu 20 25 30Ser Ala
Val Thr Glu Arg Gly Asp Leu Phe Arg Ile Arg Leu Gly His 35
40 45Asn Thr Val Tyr Leu Ala Thr His Pro Glu
Ile Val Arg Thr Met Leu 50 55 60Val
Ser Gly Ala Ala Asp Phe Thr Arg Ser Lys Gly Ala Ala Gly Ala65
70 75 80Ser Arg Phe Ile Gly Pro
Ile Leu Val Ala Val Ser Gly Asp Ser His 85
90 95Arg Arg Gln Arg Arg Met Met Gln Pro Gly Phe His
Arg Gly Lys Leu 100 105 110Asp
His Tyr Val Ile Ser Met Ser Ala Ala Ala Glu Glu Thr Ala Asp 115
120 125Ser Trp Arg Pro Gly Gln Val Val Asp
Val Pro Lys Met Ala Ser Asp 130 135
140Leu Ser Leu Ala Met Ile Thr Lys Ala Leu Phe Gln Ser Asp Leu Gly145
150 155 160Ala Ala Ala Glu
Ala Glu Leu Arg Thr Thr Gly His Asp Ile Leu Lys 165
170 175Val Ala Arg Leu Ser Ala Leu Ala Pro Gln
Leu Tyr Thr Ser Leu Pro 180 185
190Thr Ala Ala Lys Arg His Met Gly Arg Thr Ser Ala Ala Ile Arg Glu
195 200 205Ala Val Thr Ala Tyr Arg Ala
Asp Gly Arg Asp His Gly Asp Leu Leu 210 215
220Ser Thr Met Leu Arg Ala Arg Asp Ala Glu Gly Asn Thr Met Thr
Asp225 230 235 240Asp Glu
Val His Asn Glu Ile Met Gly Leu Ala Val Ala Gly Ile Gly
245 250 255Gly Pro Ala Ala Leu Thr Ala
Trp Ile Phe His Glu Leu Ala His Asp 260 265
270His Leu Ile Glu Gln Arg Leu His Ala Glu Ile Asp Thr Val
Leu Gly 275 280 285Gly Arg Leu Pro
Thr Ser Ala Asp Leu Pro Arg Leu Pro Tyr Thr Gln 290
295 300Arg Leu Val Lys Glu Ala Leu Arg Lys Tyr Pro Gly
Trp Val Gly Ser305 310 315
320Arg Arg Thr Val Arg Pro Val Arg Leu Gly Glu His Glu Leu Pro Ala
325 330 335Asp Val Glu Ile Met
Tyr Ser Ser Tyr Ala Leu Gln Arg Asp Pro Arg 340
345 350Trp Tyr Arg Asp Pro Glu Lys Leu Asp Pro Asp Arg
Trp Glu Ser Lys 355 360 365Glu Thr
Thr Arg Asp Val Pro Lys Gly Ala Trp Val Pro Phe Ala Leu 370
375 380Gly Thr Tyr Lys Cys Ile Gly Asp Asn Phe Ala
Leu Met Glu Thr Ala385 390 395
400Val Ala Val Ala Val Ile Ala Ser Arg Trp Arg Leu Arg Pro Leu Lys
405 410 415Gly Asp Arg Val
Arg Pro Val Ala Lys Ala Thr His Val Phe Pro Asp 420
425 430Arg Leu Arg Met Ile Ala Glu Pro Arg Thr Pro
Ala Ile Pro Arg Gly 435 440 445His
Ala Pro Ala Asp Ala Ser Leu Glu Ala Ala Ala Arg Pro Lys Glu 450
455 460 Leu Pro Glu Pro 465
SEQ ID NO: 10, length 5674, type PRT, organism bacteria
Met
Ala Thr Pro Ser Glu Lys Leu Val Glu Ala Leu Arg Ala Ser Leu1
5 10 15Lys Ala Asn Glu Ala Leu Arg
Arg Arg Asn Gln Gln Leu Thr Ala Ala 20 25
30Val Glu Ala Ala Gln Glu Pro Leu Ala Ile Val Gly Met Ala
Cys Arg 35 40 45Phe Pro Gly Gly
Val Arg Ser Pro Glu Glu Leu Trp Gly Leu Val Ala 50 55
60Ser Gly Gly Asp Ala Ile Gly Glu Phe Pro Ala Asp Arg
Gly Trp Asp65 70 75
80Leu Ala Gly Leu Phe Asp Pro Asp Pro Glu Arg Ala Gly Ala Ser Tyr
85 90 95Thr Arg His Gly Gly Phe
Leu Tyr Asp Ala Gly Gln Phe Asp Ala Glu 100
105 110Leu Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met
Asp Pro Gln Gln 115 120 125Arg Leu
Leu Leu Glu Thr Ser Trp Glu Val Phe Glu Arg Ala Gly Ile 130
135 140Asp Pro Ser Ser Val Arg Gly Ala Arg Ala Gly
Val Phe Thr Gly Met145 150 155
160Met Tyr His Asp Tyr Ala Ser Arg Leu Ala Thr Ile Pro Glu Gly Phe
165 170 175Glu Gly Tyr Ile
Gly Asn Gly Ser Gly Gly Ala Val Ala Ser Gly Arg 180
185 190Val Ala Tyr Thr Leu Gly Leu Glu Gly Pro Ala
Val Thr Val Asp Thr 195 200 205Ala
Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Cys Gln Ser Leu 210
215 220Arg Thr Gly Glu Cys Asp Leu Ala Leu Ala
Gly Gly Val Thr Val Met225 230 235
240Ser Thr Pro Leu Leu Phe Val Glu Phe Ser Arg Gln Arg Gly Leu
Ser 245 250 255Val Asp Gly
Arg Cys Lys Ser Phe Ala Ala Ala Ala Asp Gly Thr Gly 260
265 270Met Gly Glu Gly Val Gly Met Leu Leu Val
Glu Arg Leu Ser Asp Ala 275 280
285Glu Arg Asn Gly His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val 290
295 300Asn Gln Asp Gly Ala Ser Asn Gly
Leu Thr Ala Pro Asn Gly Pro Ser305 310
315 320Gln Glu Arg Val Ile Arg Glu Ala Leu Ala Asn Ala
Gly Leu Thr Val 325 330
335Ala Asp Val Asp Ala Val Glu Gly His Gly Thr Gly Thr Arg Leu Gly
340 345 350Asp Pro Ile Glu Ala Gln
Ala Leu Leu Asp Thr Tyr Gly Gln Glu Arg 355 360
365Ser Gly Glu Gln Pro Leu Trp Leu Gly Ser Val Lys Ser Asn
Ile Gly 370 375 380His Ala Gln Ala Ala
Ala Gly Val Gly Gly Ile Ile Lys Met Val Met385 390
395 400Ala Leu Arg His Glu Ser Leu Pro Arg Thr
Leu His Val Asp Glu Pro 405 410
415Ser Pro Gln Val Asp Trp Ser Ser Gly Ala Val Ser Leu Leu Ser Glu
420 425 430Ala Arg Pro Trp Pro
Arg Arg Glu Asp Arg Pro Arg Arg Ala Gly Val 435
440 445Ser Ser Phe Gly Val Ser Gly Thr Asn Ala His Val
Ile Leu Glu Glu 450 455 460Ala Pro Ala
Arg Arg Pro Gly Glu Ala Ala Val Glu Asp Gly Ala Pro465
470 475 480Val Pro Trp Val Val Ser Ala
Arg Ser Gly Ala Ala Leu Arg Ala Gln 485
490 495Ala Met Val Leu Arg Glu Phe Leu Arg Gly Pro Gly
Thr Asp Ala Gly 500 505 510Val
Arg Asp Ile Gly Ala Glu Leu Ala Arg Gly Arg Ala Val Leu Glu 515
520 525His Arg Ala Val Ile Val Ala Arg Glu
Arg Ala Glu Phe Glu Gly Ala 530 535
540Leu Glu Ala Leu Ala Ser Gly Glu Pro His Pro Ala Leu Ile Glu Asp545
550 555 560Ala Thr Gly Ser
His Ser His Ser Gly Gly Gly Val Val Phe Val Phe 565
570 575Pro Gly Gln Gly Gly Gln Trp Ala Gly Met
Gly Leu Asp Leu Leu Thr 580 585
590Thr Ser Gly Val Phe Ala Asp His Ile Gly Ala Cys Glu Arg Ala Leu
595 600 605Ala Pro Trp Val Glu Trp Ser
Leu Thr Glu Met Leu His Arg Glu Ala 610 615
620Glu Asp Pro Val Trp Glu Arg Ala Asp Val Val Gln Pro Val Leu
Phe625 630 635 640Ser Val
Met Val Ser Leu Ala Ala Leu Trp Arg Ser Tyr Gly Ile Glu
645 650 655Pro Asp Ala Val Val Gly His
Ser Gln Gly Glu Ile Ala Ala Ala His 660 665
670Val Cys Gly Ala Leu Thr Leu Glu Asp Ala Ala Lys Val Val
Ala Leu 675 680 685Arg Ser Arg Ala
Leu Ala Ala Leu Arg Gly His Gly Gly Met Ala Ser 690
695 700Leu Ala Leu Thr Gly Thr Glu Ala Glu Asp Leu Ile
Thr Thr His Trp705 710 715
720Pro Gly Arg Leu Trp Thr Ala Ala Phe Asn Gly Pro Arg Ala Thr Thr
725 730 735Val Ser Gly Asp Thr
Asp Ala Leu Asp Glu Leu Leu Thr His Cys Thr 740
745 750Glu Thr Gly Val Arg Ala Arg Arg Ile Pro Val Asp
Tyr Ala Ser His 755 760 765Cys Pro
His Thr Glu Thr Ile Glu His Asp Leu Leu His Met Leu His 770
775 780Gly Ile Thr Pro Gln Pro Gly Ser Ile Pro Phe
Tyr Ser Thr Val Glu785 790 795
800Asp Ala Trp Thr Asp Thr Thr Thr Leu Asp Ala Ala Tyr Trp Tyr Arg
805 810 815Asn Leu Arg Arg
Pro Val Arg Phe Thr His Ala Val Arg Thr Leu Thr 820
825 830Ala Gln Gly His Arg Leu Phe Ile Glu Thr Ser
Pro His Pro Thr Leu 835 840 845Thr
Pro Ala Ile Glu Asp His Asp His Thr Thr Ala Leu Gly Thr Leu 850
855 860Arg Arg His Asp Asn Asp Thr His Arg Phe
Leu Thr Ala Leu Ala His865 870 875
880Ala His Thr Thr Gly His Thr Val Thr Trp Thr Thr His Tyr Pro
Thr 885 890 895Thr Pro His
Thr Pro Ala Ile Asp Leu Pro Thr Tyr Pro Phe Gln His 900
905 910His His Tyr Trp Leu His Thr Pro Thr Thr
Ser Thr Gly Asp Val Ser 915 920
925Ala Ala Gly Leu Gln Arg Pro Asp His Pro Leu Leu Gly Ala Val Met 930
935 940Glu Leu Ala Asp Gly Asp Gly Ile
Val Leu Thr Gly Arg Leu Ser Leu945 950
955 960His Thr His Pro Trp Leu Ala Asp His Ser Val Gly
Gly Val Val Leu 965 970
975Leu Pro Gly Thr Ala Leu Leu Glu Leu Ala Phe Gln Ala Gly Leu Arg
980 985 990Ala Gly Cys Pro Gly Val
Asp Glu Leu Thr Leu His Ala Pro Leu Val 995 1000
1005Val Pro Glu Ser Gly His Val Val Val Gln Val Ser
Val Ser Val 1010 1015 1020Pro Asp Glu
Ala Gly Arg Arg Gly Val Ser Val Tyr Gly Arg Leu 1025
1030 1035Val Glu Asp Gly Gly Leu Glu Gly Glu Trp Thr
Arg His Ala Glu 1040 1045 1050Gly Val
Val Cys Pro Ser Val Pro Gly Glu Ser Val Val Val Glu 1055
1060 1065Pro Val Ala Asp Gly Val Trp Pro Pro Ser
Gly Ala Gln Pro Val 1070 1075 1080Asp
Leu Asp Glu Phe Tyr Gly Arg Leu Ala Gly Gly Gly Phe Val 1085
1090 1095Tyr Gly Pro Val Phe Gln Gly Leu Cys
Ala Ala Trp Arg Asp Gly 1100 1105
1110Asp Asp Val Val Ala Glu Val Arg Leu Pro Asp Glu Gly Leu Ala
1115 1120 1125Asp Val Ala Gly Phe Gly
Val His Pro Ala Leu Leu Asp Ala Ala 1130 1135
1140Val Gln Thr Val Thr Leu Leu Leu Pro Glu Asp Gln Glu Ala
Gly 1145 1150 1155Leu Leu Pro Tyr Thr
Trp Asn Gly Ala Ser Leu His Ala Arg Gly 1160 1165
1170Ala Arg Ala Leu Arg Val Arg Val Thr Ser Val Asp Ala
Ala Gly 1175 1180 1185Thr Thr Val Ser
Leu Arg Val Ala Asp Glu Thr Gly Ala Leu Val 1190
1195 1200Leu Ala Leu Glu Ser Leu Val Leu Arg Pro Val
Pro Leu Glu Gly 1205 1210 1215Leu Gly
Ala Gly Val Arg Arg Gly Ser Leu Phe Glu Leu Gly Trp 1220
1225 1230Val Pro Val Glu Gly Val Pro Ala Ser Leu
Ala Gly Gly Gly Gly 1235 1240 1245Glu
Leu Val Val Trp Glu Cys Pro Gly Gly Gly Val Ala Glu Val 1250
1255 1260Thr Ala Ala Ala Leu Gly Val Val Arg
Glu Trp Leu Ala Asp Glu 1265 1270
1275Arg Glu Gly Asp Ala Arg Leu Val Val Val Thr Arg Gly Ala Val
1280 1285 1290Ala Val Asp Ala Gly Glu
Pro Val Arg Asp Val Ala Gly Ala Ala 1295 1300
1305Val Trp Gly Leu Val Arg Ser Ala Gln Ser Glu His Pro Asp
Arg 1310 1315 1320Phe Val Leu Leu Asp
Leu Asp Pro Gly Thr Gly Val Glu Thr Val 1325 1330
1335Val Asp Ala Asp Glu Asp Met Gly Ala Gly Val Gly Ala
Gly Val 1340 1345 1350Asp Val Ala Gly
Phe Val Ala Cys Gly Glu Ala Gln Val Ala Val 1355
1360 1365Arg Gly Gly Val Val Arg Val Pro Arg Leu Glu
Arg Leu Glu Arg 1370 1375 1380Trp Gly
Arg Leu Gly Gly Ala Gly Glu Gly Leu Ser Leu Pro Gly 1385
1390 1395Gly Val Gly Trp Arg Leu Asp Gly Gly Gly
Ser Gly Leu Leu Glu 1400 1405 1410Gly
Val Gly Val Val Ala Ser Asp Ala Ala Gly Val Val Leu Gly 1415
1420 1425Arg Gly Gln Val Arg Val Ala Val Arg
Ala Ala Gly Val Asn Phe 1430 1435
1440Arg Asp Val Leu Val Ala Leu Gly Met Val Pro Gly Gln Val Gly
1445 1450 1455Val Gly Ser Glu Gly Ala
Gly Val Val Val Glu Val Gly Pro Gly 1460 1465
1470Val Glu Gly Leu Val Val Gly Asp Arg Val Phe Gly Val Phe
Gly 1475 1480 1485Asp Ala Phe Ala Pro
Val Val Val Ala Gln Glu Val Leu Leu Ala 1490 1495
1500Arg Ile Pro Glu Gly Trp Ser Phe Ala Gln Ala Ala Ser
Val Pro 1505 1510 1515Val Val Phe Ala
Thr Ala Tyr Leu Gly Leu Val Asp Leu Ala Gly 1520
1525 1530Val Arg Arg Gly Glu Ser Val Leu Val His Ala
Ala Ala Gly Gly 1535 1540 1545Val Gly
Thr Ala Ala Val Gln Leu Ala Arg His Leu Gly Ala Glu 1550
1555 1560Val Tyr Ala Thr Ala Ser Glu Ala Lys Trp
Ala Arg Leu Arg Ala 1565 1570 1575Ala
Gly Val Ala Pro Gln Arg Ile Ala Ser Ser Arg Ser Val Glu 1580
1585 1590Phe Glu Ser Arg Phe Arg Arg Ala Ser
Gly Gly Arg Gly Val Asp 1595 1600
1605Val Val Leu Asn Cys Leu Ala Gly Glu Tyr Thr Asp Ala Ser Leu
1610 1615 1620Arg Leu Cys Ser Pro Gln
Gly Gly Arg Phe Leu Glu Leu Gly Lys 1625 1630
1635Thr Asp Ile Arg Asp Ala Gly Glu Val Ala Ala Arg Phe Pro
Gly 1640 1645 1650Val Ser Tyr Arg Ala
Tyr Asp Leu Met Asp Ala Gly Ala Gln Arg 1655 1660
1665Val Gly Glu Ile Leu His Thr Val Val Asp Leu Phe Arg
Arg Gly 1670 1675 1680Val Leu Glu Pro
Leu Pro Val Thr Ala Trp Asp Val Arg Gln Ala 1685
1690 1695Arg Gln Ala Leu Arg Ser Met Arg Ser Gly Leu
His Val Gly Lys 1700 1705 1710Asn Val
Leu Thr Leu Pro Val Pro Leu Asp Ala Glu Gly Thr Val 1715
1720 1725Leu Val Thr Gly Gly Thr Gly Thr Leu Gly
Ala Ala Val Ala Arg 1730 1735 1740His
Leu Ala Ala Gly His Gly Val Arg His Leu Leu Leu Val Ser 1745
1750 1755Arg Arg Gly Met Ala Ala Ala Gly Ala
Glu Glu Leu Cys Ala Glu 1760 1765
1770Leu Gly Gln Ala Gly Val Ser Val Ser Val Ala Ala Cys Asp Val
1775 1780 1785Ala Asp Arg Ala Gln Val
Ala Ala Leu Leu Glu Gln Val Pro Ala 1790 1795
1800Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu
Asp 1805 1810 1815Asp Ala Thr Val Thr
Cys Leu Asp Arg Glu Lys Ile Asp Ala Val 1820 1825
1830Val Gly Ala Lys Val Asp Gly Ala Leu His Leu His Glu
Leu Thr 1835 1840 1845Ala Gly Met Asp
Leu Ser Ala Phe Val Leu Phe Ser Ser Ala Ala 1850
1855 1860Gly Val Leu Gly Ser Pro Gly Gln Gly Asn Tyr
Ala Ala Ala Asn 1865 1870 1875Ala Ala
Leu Asp Ala Leu Ala His Gln Arg Arg Ala Ala Gly Leu 1880
1885 1890Pro Ala Leu Ser Leu Ala Trp Gly Leu Trp
Glu Glu Ala Ser Gly 1895 1900 1905Met
Thr Gly His Leu Asp Ala Gly Asp Arg His Arg Ile Thr Arg 1910
1915 1920Ser Gly Leu His Pro Leu Thr Thr Pro
Asp Ala Leu Ala Leu Leu 1925 1930
1935Asp Thr Ala Leu Ala Thr Gly Arg Pro Ala Leu Leu Pro Ala Asp
1940 1945 1950Leu Arg Pro Thr His Pro
Ala Pro Pro Leu Leu Glu His Leu Ala 1955 1960
1965Pro Ala Arg Thr Ser Pro Arg Thr Ala His Thr Gly Thr Ser
Ala 1970 1975 1980Gly Ala Gly Gln Asp
Val Ser Leu Ala Asp Arg Leu Ala Thr Leu 1985 1990
1995Thr Ser Glu Gln Arg His Ala Thr Leu Leu Ala Leu Ala
Arg Thr 2000 2005 2010His Ile Ala Ala
Val Leu Gly His Pro Thr Pro Asp Thr Ile Asp 2015
2020 2025Pro Glu Arg Thr Phe Arg Asp Leu Gly Phe Asp
Ser Leu Thr Ala 2030 2035 2040Val Glu
Leu Arg Asn Arg Leu Thr Arg Ala Thr Gly Leu Arg Leu 2045
2050 2055Pro Thr Thr Leu Ala Phe Asp His Pro Thr
Pro Thr Ala Leu Thr 2060 2065 2070His
His Leu Thr Thr Leu Leu Asn Pro Asn Asp Thr Lys Thr Pro 2075
2080 2085Ser Ala Pro Ala Ala Ala Glu Pro Lys
Ala Gly Gln His Glu Pro 2090 2095
2100Ile Ala Ile Ile Gly Val Gly Cys Arg Tyr Pro Gly Gly Val Ala
2105 2110 2115Ser Ala Glu Asp Leu Trp
Gln Leu Val Ala Ser Gly Gly Asp Ala 2120 2125
2130Val Gly Glu Phe Pro Ala Asp Arg Gly Trp Asp Val Glu Ala
Leu 2135 2140 2145Tyr Asp Pro Glu Pro
Gly Gln Arg Gly Thr Ser Tyr Thr Arg His 2150 2155
2160Gly Gly Phe Leu Tyr Asp Ala Gly Glu Phe Asp Ala Gly
Phe Phe 2165 2170 2175Gly Ile Ser Pro
Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg 2180
2185 2190Leu Leu Leu Glu Thr Thr Trp Glu Ala Phe Glu
Arg Ala Gly Ile 2195 2200 2205Asp Pro
Gly Ala Val Arg Gly Ser Gln Thr Gly Val Phe Ala Gly 2210
2215 2220Val Met Pro Gln Glu Tyr Ala Ser Arg Ser
Arg His His Val Ala 2225 2230 2235Ala
Asp Val Asp Gly Tyr Val Leu Thr Gly Thr Ser Gly Ser Val 2240
2245 2250Ala Ser Gly Arg Val Ala Tyr Thr Phe
Gly Leu Glu Gly Pro Ala 2255 2260
2265Val Ser Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His
2270 2275 2280Leu Ala Cys Gln Ala Leu
Arg Ser Gly Glu Cys Thr Met Ala Leu 2285 2290
2295Ala Gly Gly Ala Thr Val Met Ser Thr Pro Thr Ala Phe Leu
Glu 2300 2305 2310Phe Ser Arg Gln Arg
Gly Leu Ala Ala Asp Gly Arg Cys Lys Ala 2315 2320
2325Phe Ser Ala Ser Ala Asp Gly Thr Gly Trp Ser Glu Gly
Ala Gly 2330 2335 2340Met Leu Leu Leu
Glu Arg Leu Ser Asp Ala Glu Arg Asn Gly His 2345
2350 2355Arg Val Leu Ala Val Val Arg Gly Ser Ala Val
Asn Gln Asp Gly 2360 2365 2370Ala Ser
Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg 2375
2380 2385Val Ile Arg Gln Ala Leu Ala Asn Ala Asn
Leu Ser Ala Val Asp 2390 2395 2400Val
Asp Ala Val Glu Ala His Gly Thr Gly Thr Lys Leu Gly Asp 2405
2410 2415Pro Ile Glu Ala Gln Ala Leu Leu Ala
Thr Tyr Gly Gln Glu His 2420 2425
2430His Pro Asp Gln Pro Leu Trp Leu Gly Ser Leu Lys Ser Asn Ile
2435 2440 2445Gly His Ala Gln Ala Ala
Ala Gly Val Gly Gly Ile Ile Lys Met 2450 2455
2460Val Met Ala Leu Arg His Glu Ser Leu Pro Arg Thr Leu His
Val 2465 2470 2475Asp Glu Pro Ser Pro
Gln Val Asp Trp Ser Ser Gly Ala Val Ser 2480 2485
2490Leu Leu Thr Glu Ala Arg Pro Trp Pro Arg Arg Glu Asp
Arg Pro 2495 2500 2505Arg Arg Ala Gly
Ile Ser Ser Phe Gly Val Ser Gly Thr Asn Ala 2510
2515 2520His Val Ile Leu Glu Glu Ala Pro Ala Arg Ala
Glu Val Glu Ala 2525 2530 2535Val Glu
Ala Ala Pro Ala Gly Val Glu Thr Ala Ala Ala Ala Ala 2540
2545 2550Val Val Val Glu Thr Asp Gly Ala Gly Arg
Val Ser Ala Asp Val 2555 2560 2565Pro
Leu Val Trp Val Val Ser Gly Lys Ser Gln Ala Ala Leu Arg 2570
2575 2580Ala Gln Ala Ala Ala Leu His Ala His
Val Leu Asp His Pro Glu 2585 2590
2595Gln Asp Ala Ala Asp Ile Gly Tyr Ser Leu Ala Thr Thr Arg Ala
2600 2605 2610Leu Phe Asp His Arg Ala
Thr Leu Ile Ala Pro Asp Arg Asp Thr 2615 2620
2625Leu Leu Asp Ala Leu Thr Ala Leu Ala Asp Gly Arg Thr His
Pro 2630 2635 2640His Leu Ile Pro Thr
Pro Pro Thr Glu Pro Gly His Thr His Lys 2645 2650
2655Ile Ala Phe Leu Cys Ser Gly Gln Gly Thr Gln Arg Pro
Gly Met 2660 2665 2670Ala Thr Gly Leu
Tyr His Thr Tyr Pro Ala Phe Ala Asp Ala Leu 2675
2680 2685Asp Glu Thr Cys Ala His Phe Asp Pro His Leu
Asp His Pro Leu 2690 2695 2700Arg Asp
Leu Leu Leu Asn His Asp Pro Thr Asp Leu Leu Thr His 2705
2710 2715Thr Leu Tyr Ala Gln Pro Ala Leu Phe Thr
Leu Gln Lys Ala Leu 2720 2725 2730His
His Leu Ile Thr Glu Thr Tyr Gly Ile Thr Pro His Tyr Leu 2735
2740 2745Ala Gly His Ser Leu Gly Glu Ile Thr
Ala Ala His Leu Ala Gly 2750 2755
2760Ile Leu Thr Leu Pro Asp Ala Thr His Leu Ile Thr Thr Arg Ala
2765 2770 2775Arg Leu Met Gln Thr Met
Pro Pro Gly Thr Met Thr Thr Leu His 2780 2785
2790Thr Thr Pro Glu His Ile Gln Pro Leu Leu Asp Gln His Pro
Gly 2795 2800 2805Lys Ala Thr Ile Ala
Ala Val Asn Ser Pro His Ser Leu Val Ile 2810 2815
2820Ser Gly Asp Pro Asp Thr Ile His His Ile Thr Thr Thr
Cys His 2825 2830 2835Thr Gln Gly Ile
Thr Thr Lys Pro Leu Thr Thr Asn His Ala Phe 2840
2845 2850His Ser Pro His Thr Asp Thr Ile Leu Glu Gln
Leu Asp Thr Thr 2855 2860 2865Thr His
Thr Leu Thr Tyr His Pro Pro His Thr Pro Leu Ile Thr 2870
2875 2880Ser Thr Pro Gly Asp Pro Leu Thr Pro His
Tyr Trp Thr His Gln 2885 2890 2895Thr
Arg Gln Pro Val His Trp Thr Asp Thr Ile His Thr Leu His 2900
2905 2910Thr Asn Gly Val Thr Thr Tyr Ile Glu
Leu Gly Pro Asp His Thr 2915 2920
2925Leu Thr Thr Leu Thr His His Asn Leu Pro His His Gln Pro Thr
2930 2935 2940Ala Ile Thr Leu Thr His
Pro His His Asn Pro Thr His His Leu 2945 2950
2955Leu Thr Ala Leu Ala His Thr Pro Thr Thr Trp His Thr His
His 2960 2965 2970His Thr His Thr Asn
Pro His Pro His Thr Ile Pro Asp Leu Pro 2975 2980
2985Thr Tyr Pro Phe Gln Arg Arg His Tyr Trp Leu Gln Ala
Thr Pro 2990 2995 3000Gly Ala Gly Ala
Gly Asp Val Ser Ala Ala Gly Leu Gln Arg Pro 3005
3010 3015Asp His Pro Leu Leu Gly Ala Val Met Glu Leu
Ala Asp Gly Asp 3020 3025 3030Gly Ile
Val Leu Thr Gly Ser Leu Ser Leu Arg Thr His Thr Trp 3035
3040 3045Leu Ala Asp His Ser Val Gly Gly Ile Val
Leu Leu Pro Gly Thr 3050 3055 3060Ala
Leu Leu Asp Leu Ala Phe Gln Ala Gly Leu Arg Thr Gly Cys 3065
3070 3075Pro Arg Val Asp Glu Leu Thr Leu His
Ala Pro Leu Val Ile Pro 3080 3085
3090Glu Ser Gly His Val Val Val Gln Val Ser Val Ser Val Pro Asp
3095 3100 3105Glu Ala Gly Arg Arg Ala
Val Asn Val Tyr Ala Arg Pro Ala Gly 3110 3115
3120Asp Glu Glu Thr Asp Gly Glu Trp Thr Arg His Ala Glu Gly
Val 3125 3130 3135Leu Ser Pro Ser Thr
Glu Asp Asp Pro Asn Ala Glu Ala Ala Ala 3140 3145
3150Ala Gly Glu Trp Pro Pro Pro Gly Ala Arg Pro Val Val
Leu Asp 3155 3160 3165Gly Leu Tyr Asp
Arg Leu Ala Gly Gly Gly Phe Val Tyr Gly Pro 3170
3175 3180Val Phe Gln Gly Leu Cys Ala Ala Trp Arg Asp
Gly Asp Asp Val 3185 3190 3195Val Ala
Glu Val Arg Leu Pro Asp Glu Gly Leu Ala Asp Val Ala 3200
3205 3210Gly Phe Gly Val His Pro Ala Leu Leu Asp
Ala Ala Val Gln Ser 3215 3220 3225Val
Thr Leu Leu Leu Ala Asp Gln Gln Gln Ala Gly Leu Val Pro 3230
3235 3240His Thr Trp Asn Gly Val Ser Leu His
Ala Arg Gly Ala Thr Val 3245 3250
3255Leu Arg Leu Arg Met Thr Pro Thr Asp Ala Thr Ser Thr Ala Val
3260 3265 3270Arg Leu His Ala Thr Asp
Glu Thr Gly Ala Pro Val Leu Thr Leu 3275 3280
3285Glu Ser Leu Leu Met Arg Pro Val Pro Leu Glu Gly Leu Gly
Ala 3290 3295 3300Arg Val Arg Arg Gly
Ser Leu Phe Glu Leu Gly Trp Val Pro Val 3305 3310
3315Glu Gly Val Pro Ala Ser Val Ala Gly Gly Gly Gly Glu
Leu Val 3320 3325 3330Ala Trp Glu Cys
Pro Gly Gly Gly Val Ala Glu Val Thr Ala Ala 3335
3340 3345Ala Leu Gly Val Val Arg Glu Trp Leu Ala Asp
Glu Arg Glu Gly 3350 3355 3360Asp Ala
Arg Leu Val Val Val Thr Arg Gly Ala Val Ala Val Asp 3365
3370 3375Ala Gly Glu Pro Val Arg Asp Val Ala Gly
Ala Ala Val Trp Gly 3380 3385 3390Leu
Val Arg Ser Ala Gln Ser Glu His Pro Asp Arg Phe Val Leu 3395
3400 3405Leu Asp Leu Asp Pro Asp Thr Lys Thr
Asp Pro Asp Thr Asp Thr 3410 3415
3420Asp Thr Asp Thr Asp Gly Asp Thr Asp Val Ser Ala Asp Ala Lys
3425 3430 3435Val Gly Thr Gly Ala Gly
Leu Asp Asp Ala Ala Val Ala Ser Ala 3440 3445
3450Leu Ala Arg Gly Glu Ser Gln Leu Ala Val Arg Asp Gly Val
Val 3455 3460 3465Arg Val Pro Arg Leu
Lys Arg Val Pro Pro Leu Ser Glu Ser Ser 3470 3475
3480Asp Ala Val Arg Phe Asp Ala Glu Gly Thr Val Leu Val
Thr Gly 3485 3490 3495Gly Thr Gly Thr
Leu Gly Ala Val Val Ala Arg His Leu Ala Ala 3500
3505 3510Gly His Gly Val Arg His Leu Leu Leu Val Ser
Arg Arg Gly Met 3515 3520 3525Ala Ala
Thr Gly Ala Glu Glu Leu Cys Ala Glu Leu Gly Gly Ala 3530
3535 3540Gly Val Ser Val Ser Val Ala Ala Cys Asp
Val Ala Asp Arg Ala 3545 3550 3555Gln
Val Ala Ala Leu Leu Glu Gln Val Pro Ala Glu His Pro Leu 3560
3565 3570Thr Ala Val Val His Thr Ala Gly Val
Leu Asp Asp Ala Thr Val 3575 3580
3585Thr Cys Leu Asp Arg Glu Lys Ile Asp Ala Val Val Gly Ala Lys
3590 3595 3600Val Asp Gly Ala Leu His
Leu His Glu Leu Thr Ala Gly Met Asp 3605 3610
3615Leu Ser Ala Phe Val Leu Phe Ser Ser Ala Ala Gly Val Leu
Gly 3620 3625 3630Ser Pro Gly Gln Gly
Asn Tyr Ala Ala Ala Asn Ala Ala Leu Asp 3635 3640
3645Ala Leu Ala His Gln Arg Arg Ala Ala Gly Leu Pro Ala
Leu Ser 3650 3655 3660Leu Ala Trp Gly
Leu Trp Glu Glu Thr Ser Gly Met Thr Gly His 3665
3670 3675Leu Asp Ala Gly Asp Arg His Arg Ile Thr Arg
Ser Gly Leu His 3680 3685 3690Pro Leu
Thr Thr Pro Asp Ala Leu Ala Leu Leu Asp Thr Ala Leu 3695
3700 3705Ala Ala Gly Arg Pro Ala Leu Leu Pro Ala
Asp Leu Arg Pro Thr 3710 3715 3720His
Pro Ala Pro Pro Leu Leu Glu His Leu Ala Pro Ala Arg Thr 3725
3730 3735Ser His Arg Thr Thr Leu Pro Thr Thr
Asp Ser Gly Ala Ser Leu 3740 3745
3750Arg Ala Arg Leu Ala Gly Arg Thr Pro Glu Gln Gln Tyr Gln Ala
3755 3760 3765Leu Leu Gly Leu Val Arg
Ser His Val Ala Thr Val Leu Gly His 3770 3775
3780Gln Ala Pro Glu Ala Ile Pro Val Asp Ser Ala Phe Arg Asp
Leu 3785 3790 3795Gly Phe Asp Ser Leu
Thr Ala Val Asp Leu Arg Asn Arg Leu Ser 3800 3805
3810Ala Glu Thr Gly Leu Arg Leu Pro Ala Ser Leu Val Phe
Asp Gln 3815 3820 3825Pro Ser Pro Ala
Ala Val Ala Arg Leu Leu Arg Thr Glu Leu Leu 3830
3835 3840Gly Asp Asp Ala Ala Asp Ser Thr Ser Pro Tyr
Ala Glu Thr Thr 3845 3850 3855Ala Val
Gly Ser Asp Glu Pro Leu Ala Ile Val Gly Met Ala Cys 3860
3865 3870Arg Phe Pro Gly Gly Val Arg Ser Pro Glu
Glu Leu Trp Gly Leu 3875 3880 3885Val
Ala Ser Gly Gly Asp Ala Ile Gly Glu Phe Pro Ala Asp Arg 3890
3895 3900Gly Trp Asp Leu Ala Gly Leu Phe Asp
Pro Asp Pro Glu Arg Ala 3905 3910
3915Gly Ala Ser Tyr Thr Arg His Gly Gly Phe Leu Tyr Asp Ala Gly
3920 3925 3930Gln Phe Asp Ala Glu Phe
Phe Gly Ile Ser Pro Arg Glu Ala Leu 3935 3940
3945Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Thr Val Trp
Glu 3950 3955 3960Thr Leu Glu His Ala
Gly Ile Asp Pro Ala Ala Val Arg Gly Ser 3965 3970
3975Arg Thr Gly Val Phe Ala Gly Val Met Tyr His Asp Tyr
Ala Ala 3980 3985 3990Arg Leu Thr Ala
Val Pro Glu Gly Ala Glu Gly Tyr Ile Gly Asn 3995
4000 4005Gly Asn Ala Gly Ser Val Val Ser Gly Arg Val
Ala Tyr Thr Phe 4010 4015 4020Gly Phe
Glu Gly Pro Ala Val Ser Val Asp Thr Ala Cys Ser Ser 4025
4030 4035Ser Leu Val Ala Leu His Leu Ala Gly Gln
Ala Leu Arg Ser Gly 4040 4045 4050Glu
Cys Ser Met Ala Leu Ala Gly Gly Val Thr Val Met Ser Ser 4055
4060 4065Pro Gly Thr Phe Ile Asp Phe Ser Arg
Gln Arg Gly Leu Ser Val 4070 4075
4080Asp Gly Arg Cys Lys Ser Phe Ala Ala Ala Ala Asp Gly Thr Gly
4085 4090 4095Trp Gly Glu Gly Val Gly
Met Leu Leu Val Glu Arg Leu Ser Asp 4100 4105
4110Ala Glu Arg Asn Gly His Arg Val Leu Ala Val Val Arg Gly
Ser 4115 4120 4125Ala Val Asn Gln Asp
Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn 4130 4135
4140Gly Pro Ser Gln Gln Arg Val Ile Arg Gln Ala Leu Ala
Asn Ser 4145 4150 4155Gly Leu Thr Gly
Ala Asp Val Asp Ala Val Glu Ala His Gly Thr 4160
4165 4170Gly Thr Lys Leu Gly Asp Pro Ile Glu Ala Gln
Ala Leu Leu Ala 4175 4180 4185Thr Tyr
Gly Gln Glu His His Pro Asp Gln Pro Leu Trp Leu Gly 4190
4195 4200Ser Leu Lys Ser Asn Ile Gly His Ala Gln
Ala Ala Ala Gly Val 4205 4210 4215Gly
Gly Ile Ile Lys Met Val Met Ala Leu Arg His Glu Thr Leu 4220
4225 4230Pro Arg Thr Leu His Ile Asp Glu Pro
Thr Pro Gln Val Asp Trp 4235 4240
4245Ser Ser Gly Ala Val Ser Leu Leu Thr Glu Pro Arg Pro Trp Pro
4250 4255 4260Arg Gln Gly Asp Arg Pro
Arg Arg Ala Gly Ile Ser Ser Phe Gly 4265 4270
4275Val Ser Gly Thr Asn Ala His Val Ile Leu Glu Glu Ala Pro
Ala 4280 4285 4290Gln Pro Ala Gly Asp
Pro Ala Pro Glu Asp Gly Ala Pro Val Pro 4295 4300
4305Trp Ala Met Ser Ala Arg Ser Asn Ala Ala Leu Arg Ala
Gln Ala 4310 4315 4320Ala Leu Leu Arg
Asp Phe Leu Gln Gly Pro Gly Thr Asp Thr Ala 4325
4330 4335Leu Arg Ala Val Gly Ala Glu Leu Ala His Gly
Arg Ala Val Leu 4340 4345 4350Glu His
Arg Ala Val Ile Val Ala Arg Glu Arg Thr Glu Phe Glu 4355
4360 4365Asp Ala Leu Glu Ala Leu Ala Ser Gly Glu
Pro His Pro Ala Leu 4370 4375 4380Ile
Glu Asp Thr Thr Gly Ser Gln Thr Asn Ser His Ser Gly Gly 4385
4390 4395Gly Val Val Phe Val Phe Pro Gly Gln
Gly Gly Gln Trp Ala Gly 4400 4405
4410Met Gly Leu Asp Leu Leu Arg Asp Ser Gln Val Phe Ala Asp His
4415 4420 4425Val Gly Ala Cys Glu Arg
Ala Leu Ala Pro Trp Val Glu Trp Ser 4430 4435
4440Leu Thr Glu Met Leu His Arg Asp Ala Glu Asp Pro Val Trp
Glu 4445 4450 4455Arg Ala Asp Val Val
Gln Pro Val Leu Phe Ser Val Met Val Ser 4460 4465
4470Leu Ala Ala Leu Trp Arg Ser Tyr Gly Ile Glu Pro Asp
Ala Val 4475 4480 4485Val Gly His Ser
Gln Gly Glu Ile Ala Ala Ala His Val Cys Gly 4490
4495 4500Ala Leu Thr Leu Glu Asp Ala Ala Lys Ile Val
Ala Leu Arg Ser 4505 4510 4515Arg Ala
Leu Ala Ala Leu Arg Gly His Gly Gly Met Ala Ser Leu 4520
4525 4530Ala Leu Thr Gly Thr Glu Ala Glu Asp Leu
Ile Thr Thr His Trp 4535 4540 4545Pro
Gly Arg Leu Trp Arg Ala Ala Phe Asn Gly Pro Arg Ala Thr 4550
4555 4560Thr Val Ser Gly Asp Thr Asp Ala Leu
Asp Glu Leu Leu Thr His 4565 4570
4575Cys Thr Glu Thr Gly Val Arg Ala Arg Arg Ile Pro Val Asp Tyr
4580 4585 4590Ala Ser His Cys Pro His
Thr Glu Thr Ile Glu His Asp Leu Leu 4595 4600
4605His Met Leu His Gly Ile Thr Pro Gln Pro Gly Ser Ile Pro
Phe 4610 4615 4620Tyr Ser Thr Val Glu
Asp Ala Trp Thr Asp Thr Thr Thr Leu Asp 4625 4630
4635Ala Ala Tyr Trp Tyr Arg Asn Leu Arg Arg Pro Val Arg
Phe Thr 4640 4645 4650His Ala Val Arg
Thr Leu Thr Ala Gln Gly His Arg Leu Phe Ile 4655
4660 4665Glu Thr Ser Pro His Pro Thr Leu Thr Pro Ala
Ile Glu Asp His 4670 4675 4680Asp His
Thr Thr Ala Leu Gly Thr Leu Arg Arg His Asp Asn Asp 4685
4690 4695Thr His Arg Phe Leu Thr Ala Leu Ala His
Ala His Thr Thr Gly 4700 4705 4710His
Thr Val Thr Trp Thr Thr His Tyr Pro Thr Thr Pro His Thr 4715
4720 4725Pro Ala Ile Asp Leu Pro Thr Tyr Pro
Phe Gln His His His Tyr 4730 4735
4740Trp Leu His Thr Pro Thr Thr Ser Thr Gly Asp Val Ser Ala Ala
4745 4750 4755Gly Leu His Pro Thr Glu
His Pro Leu Leu Gly Ala Thr Val Glu 4760 4765
4770Leu Ala Asp Gly Asp Gly Thr Leu Leu Thr Gly Arg Leu Ser
Leu 4775 4780 4785His Thr His Pro Trp
Leu Ala Asp His Ser Val Gly Gly Ile Val 4790 4795
4800Leu Leu Pro Gly Thr Ala Leu Leu Glu Leu Ala Leu Gln
Ala Gly 4805 4810 4815Gly Ala Ala His
Val Arg Glu Leu Thr Leu His Ala Pro Leu Ala 4820
4825 4830Val Pro His Asp Ala Ala Val Asp Leu Gln Val
Arg Val Ser Ala 4835 4840 4845Pro Asp
Asp Thr Gly Ala Arg Thr Leu Thr Val Ser Ser Arg Ser 4850
4855 4860Glu His Ala Arg Pro Glu Asp Pro Trp Gln
His His Ala Thr Gly 4865 4870 4875Leu
Leu Asp Ala Gln Pro Ser Ala Asp Gly Asp Ala Leu Arg Ser 4880
4885 4890Trp Pro Pro Glu Gly Ala Leu Pro Cys
Ala Ala Asp Glu Leu Glu 4895 4900
4905Ser Phe Tyr Ala Ala Gln Glu Ala Arg Gly Phe Ala Tyr Gly Pro
4910 4915 4920Ala Phe Arg Gly Leu Arg
Ala Ala Trp Arg Arg Gly Glu Glu Val 4925 4930
4935Phe Ala Glu Val Arg Leu Pro Glu Ser Val Leu Asp Glu Ala
Ser 4940 4945 4950Arg Tyr Asn Leu His
Pro Ala Leu Leu Asp Ala Ala Leu His Ala 4955 4960
4965Val Ala Leu Gly Ala Ala Thr Gly Leu Pro Pro Gly Ala
Val Pro 4970 4975 4980Phe Ser Phe Ser
Gly Val Thr Leu His Ala Val Lys Ala Ala Ala 4985
4990 4995Val Arg Val Arg Val Ala Pro Ala Gly Arg Asp
Gly Glu Arg Thr 5000 5005 5010Ala Val
Ser Val Ser Leu Ala Asp Glu Thr Gly Arg Gly Val Leu 5015
5020 5025Ser Val Asp Ser Leu Ala Val Arg Pro Leu
Asp Thr Gly Glu Leu 5030 5035 5040Arg
Ala Ser Ala Gln Ala Ala Gly Arg Ala Ala Leu Phe Asp Val 5045
5050 5055Ala Trp Lys Asp Val Thr Pro Gly Thr
Pro Pro Pro Asp Thr Ala 5060 5065
5070Val Arg Ser Thr Val Leu Thr His Asp Arg Ala Ala Ala Asp Leu
5075 5080 5085Ser Gly Leu Leu Ser Gly
Leu Asp Thr Asp Asp Ala Pro Val Pro 5090 5095
5100Asp Ala Val Leu Leu Thr Cys Ser Gln Gly Ala Val Ala Asp
Val 5105 5110 5115Leu Gly Glu Val Leu
Ser Val Val Gln Asp Trp Leu Ala Asp Asp 5120 5125
5130Arg Leu Ala Glu Ala Arg Leu Val Val Val Thr His Gly
Ala Val 5135 5140 5145Ala Thr Arg Thr
Gly Glu Glu Val Thr Asp Val Ala Gly Ala Ala 5150
5155 5160Val Trp Gly Leu Leu Arg Ser Ala Gln Ser Glu
His Pro Gly Arg 5165 5170 5175Phe Val
Leu Leu Asp Ala Asp Leu Ser Asp Asp Thr Thr Val Thr 5180
5185 5190Ala Ala Leu Ala Cys Asp Glu Pro Gln Leu
Ala Val Arg Gly Gly 5195 5200 5205Arg
Leu Leu Ala Ala Arg Leu Ala His Val Pro Val Pro Ala Asp 5210
5215 5220Ser Ser Asp Ala Val Arg Phe Asp Ala
Glu Gly Thr Val Leu Val 5225 5230
5235Thr Gly Gly Thr Gly Thr Leu Gly Ala Ala Val Ala Arg His Leu
5240 5245 5250Ala Ala Gly His Gly Val
Arg His Leu Leu Leu Val Ser Arg Arg 5255 5260
5265Gly Met Ala Ala Thr Gly Ala Glu Glu Leu Cys Ala Glu Leu
Gly 5270 5275 5280Gln Ala Gly Val Ser
Val Ser Val Ala Ala Cys Asp Val Ala Asp 5285 5290
5295Arg Ala Gln Val Ala Ala Leu Leu Glu Gln Val Pro Ala
Glu His 5300 5305 5310Pro Leu Thr Ala
Val Val His Thr Ala Gly Val Leu Asp Asp Ala 5315
5320 5325Thr Val Ala Cys Leu Asn Arg Glu Lys Ile Asp
Ala Val Val Gly 5330 5335 5340Ala Lys
Val Asp Gly Ala Leu His Leu His Glu Leu Thr Ala Gly 5345
5350 5355Met Asp Leu Ser Ala Phe Val Leu Phe Ser
Ser Ala Ala Gly Val 5360 5365 5370Leu
Gly Ser Pro Gly Gln Gly Asn Tyr Ala Ala Ala Asn Ala Ala 5375
5380 5385Leu Asp Ala Leu Ala His Gln Arg Arg
Ala Ala Gly Leu Pro Ala 5390 5395
5400Leu Ser Leu Ala Trp Gly Leu Trp Glu Glu Ala Ser Gly Met Thr
5405 5410 5415Gly His Leu Asp Ala Gly
Asp Arg His Arg Ile Thr Arg Ser Gly 5420 5425
5430Leu His Pro Leu Thr Thr Pro Asp Ala Leu Ala Leu Leu Asp
Thr 5435 5440 5445Ala Leu Val Thr Gly
Arg Pro Ala Leu Leu Pro Ala Asp Leu Arg 5450 5455
5460Pro Thr His Pro Ala Pro Pro Leu Leu Glu His Leu Ala
Pro Ala 5465 5470 5475Arg Thr Ser Pro
Arg Thr Ala His Thr Gly Thr Ser Ala Gly Ala 5480
5485 5490Gly Gln Asp Val Ser Leu Ala Asp Arg Leu Ala
Thr Leu Thr Pro 5495 5500 5505Glu Gln
Gln His Asp Thr Leu Phe Thr Val Val Arg Thr Gln Ile 5510
5515 5520Ala Thr Val Leu Gly His Gln Thr Pro Glu
Ala Val Pro Ala Asp 5525 5530 5535Ser
Ala Phe Arg Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu 5540
5545 5550Leu Arg Asn Arg Leu Ser Arg Ala Thr
Gly Leu Arg Leu Pro Ala 5555 5560
5565Thr Leu Ala Phe Asp His Pro Thr Ala Thr Ala Leu Thr Arg His
5570 5575 5580Leu Leu Thr Arg Leu Leu
Pro Asp Asp Ala Ala Thr Ala Pro Pro 5585 5590
5595Glu Gln Ser Leu Phe Ala Glu Ile Gly Arg Leu Glu Ala Val
Leu 5600 5605 5610Ser Ser Val Ala Ser
Pro Leu Pro Gly Ala Gln Gly Leu Gly Glu 5615 5620
5625Glu Ala Arg Ser Arg Leu Ala Ser Arg Leu Arg Ser Leu
Ala Gln 5630 5635 5640Val Leu Gly Gly
Glu Glu Ala Pro Arg Pro Asp Leu Gly Glu Ala 5645
5650 5655Thr Asp Glu Glu Met Phe Ala Leu Ile Asp Gln
Glu Thr Gly Ser 5660 5665
5670Pro
<210> 11 <211> 5166 <212> PRT <213> bacteria <400> 11
Met Ala Asn Glu Glu Met Leu Arg Glu Tyr Leu
Lys Arg Ala Thr Ala1 5 10
15Asp Leu Leu Arg Val Arg Arg Arg Leu Glu Gln Val Glu Ser Gly Arg
20 25 30Gln Glu Pro Val Ala Ile Val
Gly Met Ala Cys Arg Phe Pro Gly Gly 35 40
45Val Arg Ser Pro Glu Asp Leu Trp Glu Leu Val Ala Ser Gly Gly
Asp 50 55 60Ala Ile Gly Asp Phe Pro
Val Asp Arg Gly Trp Asp Val Glu Asp Leu65 70
75 80Tyr Asp Pro Glu Pro Gly Arg Ala Gly Arg Ser
Tyr Thr Arg Ser Gly 85 90
95Gly Phe Leu His Glu Ala Ala Glu Phe Asp Ala Gly Phe Phe Gly Leu
100 105 110Ser Pro Arg Glu Ala Leu
Ala Met Asp Pro Gln Gln Arg Leu Met Leu 115 120
125Glu Val Ser Trp Glu Ala Leu Glu Arg Ala Gly Ile Asp Pro
Ala Thr 130 135 140Leu Arg Gly Ser Arg
Thr Gly Val Phe Ala Gly Met Met Ser His Asp145 150
155 160Tyr Ala Thr Arg Leu Leu Ser Val Pro Asp
His Leu Gln Gly Phe Leu 165 170
175Gly Asn Gly Asn Ala Ala Ser Val Leu Ser Gly Arg Leu Ser Tyr Thr
180 185 190Phe Gly Phe Glu Gly
Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser 195
200 205Ser Leu Val Ala Leu His Leu Ala Cys Gln Ser Val
Arg Ser Gly Glu 210 215 220Ser Ser Leu
Ala Leu Ala Gly Gly Val Thr Val Met Ser Thr Pro Ala225
230 235 240Met Phe Val Glu Phe Ser Arg
Gln Arg Gly Leu Ser Ala Asp Gly Arg 245
250 255Cys Lys Pro Tyr Ala Ala Ala Ala Asp Gly Thr Gly
Met Ser Glu Gly 260 265 270Val
Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arg Leu Gly 275
280 285His Arg Val Leu Ala Val Val Arg Gly
Ser Ala Val Asn Gln Asp Gly 290 295
300Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg Val305
310 315 320Ile Gly Gln Ala
Leu Val Cys Ala Gly Leu Ser Ala Ala Glu Val Asp 325
330 335Val Val Glu Gly His Gly Thr Gly Thr Ser
Leu Gly Asp Pro Ile Glu 340 345
350Ala Gln Ala Val Leu Ala Ala Tyr Gly Arg Gly Arg Gly Val Pro Leu
355 360 365Trp Leu Gly Ser Val Lys Ser
Asn Leu Gly His Thr Gln Ala Ala Ala 370 375
380Gly Val Ala Gly Val Ile Lys Met Val Met Ala Leu Trp Arg Gly
Arg385 390 395 400Leu Pro
Arg Thr Leu His Val Asp Glu Pro Ser Pro His Val Asp Trp
405 410 415Ser Ser Gly Ala Val Arg Leu
Leu Thr Glu Glu Val Val Trp Glu Arg 420 425
430Gly Glu Arg Pro Arg Arg Ala Gly Val Ser Ser Phe Gly Val
Ser Gly 435 440 445Thr Asn Ala His
Val Ile Leu Glu Glu Ala Pro Gln Glu Glu Glu Val 450
455 460Arg Pro Glu Glu Ala Pro Ser Gly Asp Gly Val Gly
Pro Val Val Val465 470 475
480Pro Ser Gly Asp Gly Ala Gly Pro Ala Val Val Pro Trp Val Val Ser
485 490 495Ala Arg Ser Glu Ser
Ala Leu Arg Gly Gln Ala Arg Arg Leu Arg Val 500
505 510Phe Ala Asp Gly Ala Gly Ala Ala Pro Val Glu Val
Gly Arg Ala Leu 515 520 525Ala Val
Glu Arg Ala Trp Leu Glu His Arg Ala Val Val Leu Ala Glu 530
535 540Asp Leu Asp Gly Phe Arg His Gly Leu Asp Ala
Leu Ala Thr Gly Arg545 550 555
560Pro Ala Pro Glu Val Val Thr Gly Thr Ala Thr Asp Glu Gly Pro Leu
565 570 575Ala Phe Leu Phe
Ala Gly Gln Gly Thr Gln Arg Pro Ala Met Gly Arg 580
585 590Glu Leu His Ala His Phe Pro Ala Phe Ala Asp
Ala Phe Asp Glu Val 595 600 605Cys
Ala His Phe Gly Pro Ile Gly Glu Ala Gly His Thr Leu Arg Asp 610
615 620Ile Val Phe Ala Ala Pro Gly Ser Pro Gly
Ala Glu Leu Ile Glu Gln625 630 635
640Thr Glu Tyr Ala Gln Pro Ala Leu Phe Ala Val Glu Val Ala Leu
Tyr 645 650 655Arg Leu Val
Glu Asn Trp Gly Val Thr Pro Asp Tyr Leu Leu Gly His 660
665 670Ser Val Gly Glu Leu Ala Ala Ala His Val
Ala Gly Met Leu Ser Leu 675 680
685Pro Asp Ala Ala Ala Leu Val Thr Ala Arg Gly Arg Leu Met Gln Ala 690
695 700Leu Pro Asp Thr Gly Ala Met Val
Ala Val Glu Ala Thr Glu Glu Glu705 710
715 720Val Arg Pro Leu Leu Gln Asp Ala Glu Gly Arg Ala
Asp Leu Ala Ala 725 730
735Val Asn Gly Pro Arg Ala Val Val Leu Ala Gly Asp Glu Asp Ala Val
740 745 750Leu Thr Leu Ala Arg His
Trp Ala Glu Gln Gly Arg Arg Thr Arg Arg 755 760
765Leu Arg Thr Ser His Ala Phe His Ser Pro His Leu Asp Ala
Val Leu 770 775 780Asp Asp Phe Arg Arg
Val Ala Glu Gln Val Val Phe Ala Pro Pro Arg785 790
795 800Ile Pro Val Val Thr Asn Leu Thr Gly Ala
Pro Val Ser Ala Asp Thr 805 810
815Met Gly Thr Ala Asp Tyr Trp Val Gln His Ala Arg His Thr Val Arg
820 825 830Phe Gly Asp Gly Leu
Ala Trp Leu Gln Ala Gln Gly Val Thr Ala Tyr 835
840 845Leu Glu Leu Gly Pro Asp Gly Thr Leu Cys Ala Leu
Gly Gln Asp Ala 850 855 860Leu Thr Glu
Pro Ala Pro Leu Leu Pro Ala Leu Arg Pro Asp Arg Pro865
870 875 880Glu Ala Val Ser Val Leu Ala
Ala Val Ala Gly Leu Ser Val Arg Gly 885
890 895Val Arg Val Asp Trp Ala Ala Val Leu Gly Gly Ala
Pro Ser Gly Thr 900 905 910Ala
Gly Arg Val Glu Leu Pro Thr Tyr Ala Phe Glu Arg Glu Arg Tyr 915
920 925Trp Leu Asp Ala Gly Glu Thr Pro Ala
Ala Leu Pro Ala Gly Glu Asp 930 935
940Gly Pro Leu Trp Gln Ala Val Glu Arg Ala Asp Leu Pro Ala Val Ala945
950 955 960Ala Leu Leu Glu
Val Asp Glu Asp Ala Pro Leu Gly Ser Val Val Ser 965
970 975Ala Leu Gly Asp Trp Arg Arg Gly Val Arg
Glu Arg Ala Val Val Asp 980 985
990Gly Trp Arg Tyr Arg Val Val Trp Arg Pro Val Ser Arg Ser Gly Gly
995 1000 1005Gly Val Val Ser Gly Gly
Val Trp Val Val Val Val Pro Glu Gly 1010 1015
1020Val Val Gly Ala Ala Ala Val Val Glu Gly Leu Glu Arg Ala
Gly 1025 1030 1035Val Cys Val Arg Val
Val Ala Val Glu Gly Gly Cys Ala Asp Arg 1040 1045
1050Val Val Leu Gly Glu Arg Leu Arg Glu Val Cys Gly Gly
Glu Gly 1055 1060 1065Pro Val Gly Val
Leu Ala Val Cys Gly Gly Gly Val Gly Val Ala 1070
1075 1080Gly Leu Val Leu Gly Leu Val Gln Ala Val Glu
Gly Leu Gly Val 1085 1090 1095Pro Leu
Trp Cys Val Thr Arg Gly Ala Val Ser Val Gly Glu Gly 1100
1105 1110Asp Arg Leu Gly Asp Pro Gly Gly Ala Val
Val Trp Gly Leu Gly 1115 1120 1125Arg
Val Ala Gly Leu Glu Leu Pro Asp Arg Trp Gly Gly Val Val 1130
1135 1140Asp Leu Pro Glu Val Val Asp Glu Arg
Val Val Glu Gly Leu Leu 1145 1150
1155Gly Val Leu Ser Gly Gly Gly Gly Glu Gly Glu Val Ala Val Arg
1160 1165 1170Ala Ser Gly Val Phe Val
Arg Arg Leu Val Arg Ala Pro Gly Gly 1175 1180
1185Gly Ala Glu Ala Gly Gly Trp Arg Pro Arg Gly Thr Val Leu
Ile 1190 1195 1200Thr Gly Gly Thr Gly
Ala Leu Gly Ala His Val Ala Arg Trp Met 1205 1210
1215Val Arg Arg Gly Ala Glu His Leu Leu Leu Val Ser Arg
Ser Gly 1220 1225 1230Arg Glu Ala Lys
Gly Ala Gly Glu Leu Arg Ala Glu Leu Thr Ala 1235
1240 1245Met Gly Ala Arg Val Thr Ile Ala Ala Cys Asp
Val Ala Asp Arg 1250 1255 1260Gly Ala
Leu Ala Glu Leu Leu Ala Thr Ala Val Pro Glu Asp Cys 1265
1270 1275Pro Leu Gly Ala Val Val His Thr Ala Gly
Val Val Asp Asp Gly 1280 1285 1290Val
Leu Asp Ala Leu Thr Pro Glu Arg Leu Glu Gly Val Leu Ala 1295
1300 1305Ala Lys Ala Val Gly Ala Arg Asn Leu
His Glu Leu Thr Arg Gly 1310 1315
1320Ala Asp Leu Ser Ala Phe Val Val Phe Ser Ser Ala Ala Ala Thr
1325 1330 1335Phe Gly Ser Gly Gly Gln
Gly Ala Tyr Val Ala Ala Asn Ala Tyr 1340 1345
1350Val Glu Ala Leu Ala Val His Arg Arg Gly Leu Gly Leu Pro
Ser 1355 1360 1365Thr Ala Val Ala Trp
Gly Ala Trp Ala Gly Gly Gly Met Ala Ala 1370 1375
1380Asp Ala Glu Ala Ala Thr Arg Met Asp Arg Arg Gly Ile
Arg Pro 1385 1390 1395Met Asp Thr Glu
Pro Ala Leu Ser Ala Leu Gly Gln Val Leu Asp 1400
1405 1410Arg Asn Glu Thr Cys Leu Thr Ile Ala Asp Ile
Asp Trp Glu Arg 1415 1420 1425Leu Pro
Ala Ala Asp Gly Leu Ala Arg Leu Leu Ser Asp Ile Pro 1430
1435 1440Glu Ala Arg Leu Ala Arg Pro Ala Thr Gly
Thr Glu Ala Pro Gly 1445 1450 1455Ser
Leu Arg Ala Arg Leu Ala Ala Leu Glu Pro Ala Glu Arg Asp 1460
1465 1470Arg Ala Leu Leu Asp Leu Val Arg Thr
His Thr Ala Thr Val Leu 1475 1480
1485Gly His Arg Thr Ala Thr Ala Val Pro Ala Asp Arg Ala Phe Arg
1490 1495 1500Glu Leu Gly Phe Gly Ser
Leu Asn Ala Val Glu Leu Arg Asn Gly 1505 1510
1515Leu Asn Thr Ala Thr Gly Leu Arg Leu Pro Ser Thr Leu Val
Phe 1520 1525 1530Asp Tyr Pro Asn Pro
Ser Ala Leu Ala Thr His Leu Gly Thr Leu 1535 1540
1545Leu Ser Thr Gly Gly Glu Ala Pro Ala Gly Arg Pro Ala
Phe Ile 1550 1555 1560Arg Ser Gly Val
Val Asp Glu Pro Val Ala Ile Val Gly Met Ala 1565
1570 1575Cys Arg Phe Pro Gly Gly Val Trp Ser Pro Glu
Asp Leu Trp Glu 1580 1585 1590Leu Val
Ala Ser Gly Gly Asp Ala Ile Gly Gly Phe Pro Val Asp 1595
1600 1605Arg Gly Trp Asp Val Glu Gly Leu Tyr Asp
Pro Glu Ala Gly Arg 1610 1615 1620Pro
Gly Ser Ser Tyr Thr Arg Ala Gly Gly Phe Leu Ala Gly Ala 1625
1630 1635Ala Glu Phe Asp Ala Gly Phe Phe Gly
Ile Ser Pro Arg Glu Ala 1640 1645
1650Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Val Ser Trp
1655 1660 1665Glu Ala Leu Glu Arg Ala
Gly Ile Asp Pro Val Ser Leu Arg Gly 1670 1675
1680Ser Arg Thr Gly Val Phe Ala Gly Val Ala Asn Gln Asp Tyr
Ala 1685 1690 1695Glu Leu Val Arg Arg
Gly Gly Arg Asp Leu Glu Gly Tyr Ala Leu 1700 1705
1710Thr Gly Val Ser Gly Ser Val Leu Ser Gly Arg Leu Ser
Tyr Thr 1715 1720 1725Phe Gly Leu Lys
Gly Pro Pro Val Thr Val Asn Thr Ala Cys Ser 1730
1735 1740Ser Ser Leu Val Ala Leu His Leu Ala Cys Gln
Ser Leu Arg Ser 1745 1750 1755Gly Glu
Ser Lys Leu Ala Leu Pro Gly Gly Val Thr Val Met Ser 1760
1765 1770Thr Pro Gly Ala Phe Val Glu Phe Ser Arg
Gln Arg Gly Leu Ser 1775 1780 1785Pro
Asp Gly Arg Cys Lys Ala Phe Ala Thr Pro Thr Asn Gly Val 1790
1795 1800Gly Trp Ser Glu Gly Val Gly Val Leu
Leu Val Glu Arg Leu Ser 1805 1810
1815Asp Ala Arg Arg Leu Gly His Arg Val Leu Pro Val Val Arg Gly
1820 1825 1830Ser Ala Val Asn Gln Asp
Gly Ala Ser Asn Gly Leu Thr Ala Pro 1835 1840
1845Asn Gly Pro Ser Gln Gln Arg Val Ile Gly Gln Ala Leu Val
Cys 1850 1855 1860Ala Gly Leu Ser Ala
Ala Glu Val Asp Val Val Glu Gly His Gly 1865 1870
1875Thr Gly Thr Ser Leu Gly Asp Pro Ile Glu Ala Gln Ala
Val Leu 1880 1885 1890Ala Ala Tyr Gly
Arg Gly Arg Gly Val Pro Leu Trp Leu Gly Ser 1895
1900 1905Val Lys Ser Asn Leu Gly His Thr Gln Ala Ala
Ala Gly Val Ala 1910 1915 1920Gly Val
Ile Lys Met Val Met Val Leu Trp Arg Gly Arg Leu Pro 1925
1930 1935Arg Thr Leu His Val Asp Glu Pro Ser Pro
His Val Asp Trp Ser 1940 1945 1950Ser
Gly Ala Val Arg Leu Leu Thr Glu Glu Val Val Trp Glu Arg 1955
1960 1965Gly Glu Arg Pro Arg Arg Ala Gly Val
Ser Ser Phe Gly Val Ser 1970 1975
1980Gly Thr Asn Ala His Val Ile Leu Glu Glu Ala Pro Gln Glu Glu
1985 1990 1995Glu Val Arg Pro Glu Glu
Ala Pro Ser Gln Gly Glu Ala Gly Pro 2000 2005
2010Ala Val Val Pro Trp Val Val Ser Ala Arg Ser Glu Ser Ala
Leu 2015 2020 2025Arg Gly Gln Ala Arg
Arg Leu Arg Val Phe Ala Asp Gly Ala Gly 2030 2035
2040Ala Ala Pro Val Glu Val Gly Arg Ala Leu Ala Val Glu
Arg Ala 2045 2050 2055Trp Leu Glu His
Arg Ala Val Val Leu Ala Glu Asp Leu Asp Gly 2060
2065 2070Phe Arg His Gly Leu Asp Ala Leu Ala Thr Gly
Leu Pro Thr Ala 2075 2080 2085Gly Val
Val Ala Gly Arg Thr Gly Pro Glu Ala Asp Gly Lys Ile 2090
2095 2100Ala Leu Leu Phe Gly Gly Gln Gly Thr Gln
Trp Asp Gly Met Ala 2105 2110 2115Ala
Glu Leu Leu Asp Ser Ser Pro Val Phe Ala Gln Arg Met Thr 2120
2125 2130Glu Cys Ala Asp Ala Leu Arg Pro Tyr
Leu Asp Trp Glu Leu Leu 2135 2140
2145Asp Val Leu Arg Gly Glu Pro Asp Ala Pro Pro Leu Asp Arg Val
2150 2155 2160Asp Val Val Gln Pro Val
Leu Phe Ala Val Met Val Ser Leu Ala 2165 2170
2175Ala Leu Trp Arg Ser Tyr Gly Val Arg Pro Asp Ala Val Ala
Gly 2180 2185 2190His Ser Gln Gly Glu
Ile Ala Ala Ala Cys Val Ala Gly Ala Leu 2195 2200
2205Ser Leu Glu Asp Ala Ala Arg Val Thr Ala Leu Arg Ser
Gln Ala 2210 2215 2220Leu Ala Ala Leu
Ala Gly Gln Gly Ala Met Ala Ser Val Gly Leu 2225
2230 2235Pro Ala Glu Asp Leu Glu Pro Arg Leu Ala Ala
Val Asp Pro Ser 2240 2245 2250Leu Val
Val Ala Ala Asp Asn Gly Ala Arg Ser Ala Val Val Ser 2255
2260 2265Gly Ser Pro Asp Ala Val Thr Ala Leu Val
Asp Asp Leu Thr Arg 2270 2275 2280Asp
Gly Val Pro Ala Arg Leu Leu Lys Val Asp Trp Ala Ser His 2285
2290 2295Ser Pro Gln Val Glu Ala Ile Arg Ala
Asp Leu Leu Gly Leu Leu 2300 2305
2310Ala Pro Val Thr Pro Arg Pro Ala Asp Ile Pro Leu Tyr Ser Thr
2315 2320 2325Val Thr Gly Glu Pro Val
Asp Gly Thr Ala Leu Asp Ala Ala Tyr 2330 2335
2340Trp Tyr Arg Asn Leu Arg Glu Pro Val Arg Phe Arg Asp Ala
Thr 2345 2350 2355Arg Ala Leu Ala Arg
Asp Gly His Thr Val Phe Val Glu Ala Gly 2360 2365
2370Pro His Pro Ala Val Ser Val Ala Val Gln Glu Thr Leu
Asp Asp 2375 2380 2385Leu Gly Ala Ala
Asp Thr Leu Val Val Gly Ser Leu Arg Arg Gly 2390
2395 2400Glu Gly Gly Leu Arg Arg Phe Leu Ala Ser Ala
Ala Glu Leu Ser 2405 2410 2415Val Arg
Gly Val Arg Val Asp Trp Ala Ala Val Leu Gly Gly Lys 2420
2425 2430Pro Ser Gly Thr Ala Gly Arg Val Glu Leu
Pro Thr Tyr Ala Phe 2435 2440 2445Glu
Arg Glu Arg Tyr Trp Leu Asp Pro Glu Glu Thr Pro Ala Ala 2450
2455 2460Pro Ala Thr Thr Glu Asp Gly Pro Leu
Trp Glu Ala Val Glu Arg 2465 2470
2475Glu Asp Pro Ala Ala Val Ala Ala Leu Leu Ala Val Asp Glu Asp
2480 2485 2490Ala Pro Leu Asp Ala Leu
Val Ser Ala Leu Gly Asp Trp Arg Arg 2495 2500
2505Gly Val Arg Glu Arg Ala Val Val Asp Gly Trp Arg Tyr Arg
Val 2510 2515 2520Val Trp Arg Pro Val
Ser Arg Ser Gly Gly Gly Val Val Ser Gly 2525 2530
2535Gly Val Trp Val Val Val Val Pro Glu Gly Val Val Gly
Ala Ala 2540 2545 2550Ala Val Val Glu
Gly Leu Glu Trp Ala Gly Val Cys Val Arg Val 2555
2560 2565Val Ala Val Glu Gly Gly Cys Ala Asp Arg Val
Val Leu Gly Glu 2570 2575 2580Arg Leu
Arg Glu Val Trp Gly Gly Glu Gly Pro Val Gly Val Leu 2585
2590 2595Ala Val Cys Gly Gly Gly Val Gly Val Ala
Gly Leu Val Leu Gly 2600 2605 2610Leu
Val Gln Ala Val Glu Gly Leu Gly Val Pro Leu Trp Cys Val 2615
2620 2625Thr Arg Gly Ala Val Ser Val Gly Glu
Gly Asp Arg Leu Gly Asp 2630 2635
2640Pro Gly Gly Ala Val Val Trp Gly Leu Gly Arg Val Ala Gly Leu
2645 2650 2655Glu Leu Pro Asp Arg Trp
Gly Gly Val Val Asp Leu Pro Glu Val 2660 2665
2670Val Asp Glu Arg Val Val Glu Gly Leu Leu Gly Val Leu Ser
Gly 2675 2680 2685Gly Gly Gly Glu Gly
Glu Val Ala Val Arg Ala Ser Gly Val Phe 2690 2695
2700Val Arg Arg Leu Val Arg Ala Pro Gly Gly Gly Ala Glu
Ala Gly 2705 2710 2715Gly Trp Arg Pro
Arg Gly Thr Val Leu Ile Thr Gly Glu Asn Ala 2720
2725 2730Asp Pro Glu Gln Pro Ala Ala His Leu Ala Arg
Trp Leu Ala Asp 2735 2740 2745Arg Gly
Ala Glu His Leu Leu Leu Ile Ser Thr Ser Gly Asp Gly 2750
2755 2760Phe Gly Leu Ala Asp Thr Thr Asp Gln Trp
Gly Ala Arg Val Thr 2765 2770 2775Ile
Ala Ala Cys Asp Val Ala Asp Arg Gly Ala Leu Ala Glu Leu 2780
2785 2790Leu Ala Thr Ala Val Pro Glu Asp Cys
Pro Leu Gly Ala Val Val 2795 2800
2805His Thr Ala Gly Val Val Asp Asp Gly Val Leu Asp Ala Leu Thr
2810 2815 2820Pro Glu Arg Leu Glu Gly
Val Leu Ala Ala Arg Ala Val Gly Ala 2825 2830
2835Arg Asn Leu His Glu Leu Thr Arg Gly Ala Asp Leu Ser Ala
Phe 2840 2845 2850Val Val Phe Ser Ser
Ala Ala Ala Thr Phe Gly Ser Gly Gly Gln 2855 2860
2865Gly Ala Tyr Val Ala Ala Asn Ala Tyr Val Glu Ala Leu
Ala Val 2870 2875 2880His Arg Arg Gly
Leu Gly Leu Pro Ser Thr Ala Val Ala Trp Gly 2885
2890 2895Pro Trp Arg Gly His Ser Ala Ala Gly Arg Pro
Asp Ala Ala Ala 2900 2905 2910Arg Leu
His Arg Arg Gly Leu Thr Glu Met Ala Pro Glu Leu Ala 2915
2920 2925Leu Ala Ala Leu Ala Arg Val Leu Asp His
Asp Glu Ser Gly Leu 2930 2935 2940Thr
Val Ala Asp Ile Asp Trp Glu Arg Phe Thr Ala His Thr Ala 2945
2950 2955Gly Ser Arg Leu Pro Leu Ile Gly Asp
Leu Pro Asp Val Arg Ala 2960 2965
2970Leu Thr Arg Ala Thr Gly Thr Gly Thr Ala His Gly Thr Asp Leu
2975 2980 2985Arg Asp Arg Leu Ala Ala
Leu Glu Pro Asp Ala Arg Thr Asp Val 2990 2995
3000Leu Leu Glu Leu Val Ser Thr His Thr Ala Ala Val Leu Gly
His 3005 3010 3015Arg Glu Ala Asp Thr
Val Pro Ala Asp Arg Ala Phe Arg Glu Leu 3020 3025
3030Gly Phe Asp Ser Leu Thr Ala Val Glu Leu Arg Asn Arg
Leu Asn 3035 3040 3045Thr Ala Thr Gly
Leu Arg Leu Pro Thr Thr Leu Val Phe Asp Tyr 3050
3055 3060Pro Arg Pro Ala Val Leu Ala Arg His Leu Arg
Asp Gln Leu Cys 3065 3070 3075Gly Thr
Ala Pro Ala Thr Pro Pro Val Ala Ala Arg Pro Gly Val 3080
3085 3090Val Asp Glu Pro Val Ala Ile Val Gly Met
Ala Cys Arg Phe Pro 3095 3100 3105Gly
Gly Val Trp Ser Pro Glu Asp Leu Trp Glu Leu Val Ala Ser 3110
3115 3120Gly Gly Asp Ala Ile Gly Gly Phe Pro
Val Asp Arg Gly Trp Asp 3125 3130
3135Val Glu Gly Leu Tyr Asp Pro Glu Ala Gly Arg Pro Gly Ser Ser
3140 3145 3150Tyr Thr Arg Ser Gly Gly
Phe Leu Ala Gly Ala Ala Glu Phe Asp 3155 3160
3165Ala Gly Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met
Asp 3170 3175 3180Pro Gln Gln Arg Leu
Leu Leu Glu Val Ser Trp Glu Ala Leu Glu 3185 3190
3195Arg Ala Gly Ile Asp Pro Val Ser Leu Arg Gly Ser Arg
Thr Gly 3200 3205 3210Val Phe Ala Gly
Val Ala Asn Gln Asp Tyr Ala Glu Leu Val Arg 3215
3220 3225Arg Gly Gly Arg Asp Leu Glu Gly Tyr Ala Leu
Thr Gly Val Ser 3230 3235 3240Gly Ser
Val Leu Ser Gly Arg Leu Ser Tyr Thr Phe Gly Leu Glu 3245
3250 3255Gly Pro Ala Val Thr Val Asp Thr Ala Cys
Ser Ser Ser Leu Val 3260 3265 3270Ala
Leu His Leu Ala Cys Gln Ser Leu Arg Ser Gly Glu Ser Glu 3275
3280 3285Leu Ala Leu Ala Gly Gly Val Thr Val
Met Ser Thr Pro Gly Ala 3290 3295
3300Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ser Ala Asp Gly Arg
3305 3310 3315Cys Lys Ala Phe Ala Ala
Ala Ala Asp Gly Val Gly Trp Ser Glu 3320 3325
3330Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg
Arg 3335 3340 3345Leu Gly His Arg Val
Leu Ala Val Val Arg Gly Ser Ala Val Asn 3350 3355
3360Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly
Pro Ser 3365 3370 3375Gln Gln Arg Val
Ile Gly Gln Ala Leu Val Cys Ala Gly Leu Ser 3380
3385 3390Ala Ala Glu Val Asp Val Val Glu Gly His Gly
Thr Gly Thr Ser 3395 3400 3405Leu Gly
Asp Pro Ile Glu Ala Gln Ala Val Leu Ala Ala Tyr Gly 3410
3415 3420Arg Gly Arg Gly Val Pro Leu Trp Leu Gly
Ser Val Lys Ser Asn 3425 3430 3435Leu
Gly His Thr Gln Ala Ala Ala Gly Val Ala Gly Val Ile Lys 3440
3445 3450Met Val Met Ala Leu Trp Arg Gly Arg
Leu Pro Arg Thr Leu His 3455 3460
3465Val Asp Glu Pro Ser Pro His Val Asp Trp Ser Ser Gly Ala Val
3470 3475 3480Arg Leu Leu Thr Glu Glu
Val Val Trp Glu Arg Gly Glu Arg Pro 3485 3490
3495Arg Arg Ala Gly Val Ser Ser Phe Gly Val Ser Gly Thr Asn
Ala 3500 3505 3510His Val Ile Leu Glu
Glu Ala Pro Gln Glu Glu Glu Val Arg Pro 3515 3520
3525Glu Glu Ala Pro Ser Gln Asp Glu Ala Gly Pro Ala Thr
                                                    Val Pro
                3530                3535                3540
Cys Leu Leu Ser Ala Arg Thr Asp Thr Ala Leu Arg Ala Gln Ala
                3545                3550                3555
Arg Arg Leu Arg Asp Tyr Leu Ala Ala Asn Pro Asp Ile Pro Ile
                3560                3565                3570
Gly Asp Val Ala His Ala Leu Ala Thr Gly Arg Ser Thr Phe Glu
                3575                3580                3585
Arg Arg Ala Val Leu Val Ala Glu Asp His Glu Gly Leu Leu Arg
                3590                3595                3600
Thr Leu Asp Ala Leu Ala Glu Gly Thr Thr Ala Pro Gly Leu Ile
                3605                3610                3615
Glu Ser Pro Ala Arg Thr Ala His Gly Lys Val Ala Phe Leu Phe
                3620                3625                3630
Ser Gly Gln Gly Thr Gln Arg Pro Gly Met Gly Arg Glu Leu Tyr
                3635                3640                3645
Ala Ala His Pro Ala Phe Ala Gln Ala Leu Asp Asp Val Leu Ala
                3650                3655                3660
Glu Leu Glu Pro His Leu Asp Arg Pro Leu Arg Pro Leu Leu Leu
                3665                3670                3675
Asp Glu Pro Gln Pro Leu Asp Arg Thr Gly Asp Ala Gln Pro Ala
                3680                3685                3690
Leu Phe Ala Leu Gln Val Ala Leu Phe Arg Leu Leu Glu Ser Ala
                3695                3700                3705
Gly Ile Arg Pro Asp His Val Ala Gly His Ser Ile Gly Glu Leu
                3710                3715                3720
Ala Ala Ala His Val Ala Gly Val Leu Ser Leu Thr Asp Ala Ala
                3725                3730                3735
Arg Leu Val Ala Ala Arg Gly Arg Leu Ala Gln Thr Gln Leu Pro
                3740                3745                3750
Pro Gly Gly Ala Met Leu Ala Val Arg Ala Ser Glu Glu Gln Val
                3755                3760                3765
Thr Arg Met Leu Ala Gly Arg Glu Ala Arg Val Ala Val Ala Ala
                3770                3775                3780
Val Asn Gly Pro Thr Ser Val Val Ile Ser Gly Ala Glu Pro Asp
                3785                3790                3795
Val Leu Glu Ala Ala Ala Ala Phe Ala Glu Gln Gly Leu Arg Thr
                3800                3805                3810
Lys Arg Leu Ser Thr Asp Arg Ala Phe His Ser Pro Leu Met Glu
                3815                3820                3825
Pro Ile Leu Glu Glu Phe Arg Gln Val Ala Thr Gly Ile Ala Tyr
                3830                3835                3840
Ala Glu Pro Thr Ile Pro Val Val Ser Thr Val Thr Gly Asp Arg
                3845                3850                3855
Ala Thr Ala Gly Thr Leu Thr Asp Pro Glu Tyr Trp Val Arg Gln
                3860                3865                3870
Leu Arg Arg Thr Val Arg Phe Gly Asp Ala Val Arg Arg Leu His
                3875                3880                3885
Asp Asp Asp Gly Val Arg Thr Phe Leu Glu Leu Gly Pro Asp Gly
                3890                3895                3900
Thr Leu Cys Ala Leu Ala Gly Glu Cys Leu Pro Ala Asp Asp Asn
                3905                3910                3915
Thr Thr Glu Pro Gly Pro Ala Leu Val Pro Leu Leu Arg Ala Asp
                3920                3925                3930
Arg Pro Glu Pro Leu Ala Leu Leu Thr Ala Leu Ala His Leu His
                3935                3940                3945
Val Gln Gly Thr Pro Lys Gly Gly Thr Ala Val His Trp Pro Ala
                3950                3955                3960
Leu Ile Gly Ala Thr Pro Glu Arg Ala Arg His Leu Asp Leu Pro
                3965                3970                3975
Thr Tyr Pro Phe Asp Arg Arg Arg Tyr Trp Leu Asp Ala Asp Thr
                3980                3985                3990
Ser Leu Ser Gly Asp Val Ser Ala Ala Gly Leu Thr Ala Ala Gly
                3995                4000                4005
His Pro Leu Leu Gly Ser Ala Val Pro Leu Ala Gly Ser Pro Gln
                4010                4015                4020
Ser Gln Glu Cys Leu Leu Thr Gly Arg Ile Ser Leu Arg Thr His
                4025                4030                4035
Pro Trp Leu Ala Asp His Ala Val Phe Gly Thr Val Leu Leu Pro
                4040                4045                4050
Gly Thr Ala Ile Leu Glu Leu Ala Val Arg Ala Gly Asp Glu Val
                4055                4060                4065
Gly Cys Asp Thr Val Glu Glu Leu Ala Leu Gln Val Pro Leu Val
                4070                4075                4080
Leu Pro Glu Arg Gly Ser Val Val Leu Gln Leu Ser Val Gly Ala
                4085                4090                4095
Thr Glu Thr Ala Pro Asp Gly Val Glu Arg Arg Pro Phe Thr Leu
                4100                4105                4110
Tyr Ala Arg Glu Asp Asp Gly Leu Thr Pro Ala Ala Pro Thr Gly
                4115                4120                4125
Thr Asp Gly Thr Gly Trp Thr Cys His Ala Thr Gly Val Leu Thr
                4130                4135                4140
Arg Arg Ala Glu Thr Ala His Asp Thr Ala Ala Pro Trp Pro Pro
                4145                4150                4155
Thr Asp Ala Val Pro Val Asp Leu Asp His Trp Tyr Gly Thr Leu
                4160                4165                4170
Ala Asp Ala Gly Leu Gly Tyr Gly Pro Ala Phe Gln Gly Leu Arg
                4175                4180                4185
Ala Ala Trp Arg His Gly Asp Asp Leu Tyr Ala Glu Val Ala Leu
                4190                4195                4200
Pro Asp Gly Pro Ser Gly Asp Ala Asp Arg Tyr Ala Val His Pro
                4205                4210                4215
Ala Leu Leu Asp Ala Ala Leu His Pro Val Val Leu Gly Phe Ala
                4220                4225                4230
Glu Asp Glu Pro Asp Glu Gly His Gly Trp Leu Pro Phe Ser Trp
                4235                4240                4245
Ser Gly Val Thr Val Thr Ala Ser Gly Ala Ser Ala Leu Arg Val
                4250                4255                4260
Arg Leu Ser Arg Arg Ser Pro Asp Thr Ile Ala Leu Leu Ala Thr
                4265                4270                4275
Asp Ser Thr Gly His Thr Val Val Thr Ala Glu Ser Leu Ala Phe
                4280                4285                4290
Arg Pro Val Thr Ala Gly Gln Leu His Ser Ala Arg Thr Ala His
                4295                4300                4305
His Asp Ala Leu Phe Arg Leu Asp Trp Ala Pro Val Pro Leu Pro
                4310                4315                4320
Arg Thr Pro Ser Ser Lys Thr Arg Leu Ala Leu Ile Gly Ser Glu
                4325                4330                4335
Ala Glu Cys Pro Asp Ala Pro Gly Val Pro Trp Ser Thr Tyr Ala
                4340                4345                4350
Asp Leu Glu Glu Leu Ala Ser Ala Gly Thr Pro Val Pro Asp Val
                4355                4360                4365
Val Val Val Pro Cys Pro His Arg Asp Gly Ala Ala Asp Ala Ala
                4370                4375                4380
Asp Ala Thr Arg Arg Ala Thr Val Arg Val Leu His Leu Leu Gln
                4385                4390                4395
Ser Trp Leu Ala Asp Asp Arg Phe Ala Asp Ser Arg Leu Ala Phe
                4400                4405                4410
Val Thr His Gly Ala Val Ala Ala Ala Pro Gly Asp Ser Val Pro
                4415                4420                4425
Asp Leu Ala His Ala Ala Val Trp Gly Met Val Arg Ser Ala Gln
                4430                4435                4440
Thr Glu Asn Pro Gly Arg Phe Val Leu Thr Asp Leu Asp Asp Thr
                4445                4450                4455
Asp Ala Ser Arg Arg Ala Leu Ala Ala Ala Leu Leu Ser Gly Glu
                4460                4465                4470
Pro Gln Thr Val Leu Arg Glu Gly Arg Ala His Thr Pro Arg Leu
                4475                4480                4485
Ala Arg Ile Pro Val Gly Ala Arg Ala Asp Ser Gly His Trp Asp
                4490                4495                4500
Pro Asp Ala Thr Val Leu Ile Thr Gly Gly Thr Gly Tyr Leu Gly
                4505                4510                4515
Arg Leu Leu Ala Arg His Leu Val Val Thr His Gly Val Arg His
                4520                4525                4530
Leu Leu Leu Thr Ser Arg Ser Gly Pro Thr Ala Pro Gly Thr Ala
                4535                4540                4545
Glu Leu Val Ala Glu Leu Ala Glu Leu Gly Ala Arg Thr Thr Ala
                4550                4555                4560
Val Ala Cys Asp Leu Ala Asp Arg Arg Ala Val Ala Ala Leu Leu
                4565                4570                4575
Ala Glu Ile Pro Ala Arg His Pro Leu Lys Ala Val Leu His Thr
                4580                4585                4590
Ala Gly Val Val Asp Asp Gly Val Leu Thr Ser Leu Thr Pro Asp
                4595                4600                4605
Arg Leu Asp Ala Val Leu Ser Ala Lys Ala His Gly Ala Ala His
                4610                4615                4620
Leu His Asp Leu Thr Arg Asp Ala Gly Leu Asp Ala Phe Ile Ala
                4625                4630                4635
Phe Ser Ser Ala Ala Ala Ser Phe Gly Ser Pro Gly Gln Ala Asn
                4640                4645                4650
Tyr Thr Ala Ala Asn Ala Phe Leu Asp Ala Leu Met Gln Gln Arg
                4655                4660                4665
His Ala Leu Gly Leu Pro Gly Arg Ser Leu Ala Trp Gly Arg Trp
                4670                4675                4680
Ala Glu Ala Gly Gly Met Ala Glu His Leu Ala Ala Ala Asp Val
                4685                4690                4695
Ala Arg Met Thr Arg Ser Gly Leu Leu Pro Leu Thr Asn Ala His
                4700                4705                4710
Gly Leu Ala Leu Phe Asp Thr Ala Leu Ala Leu Asp Glu Pro Leu
                4715                4720                4725
Leu Leu Ala Thr Pro Leu Asp Pro Gly Thr Leu Arg Glu Gln Ala
                4730                4735                4740
Ala Val Gly Thr Leu Pro Pro Val Leu Arg Gly Leu Val Arg Thr
                4745                4750                4755
Pro Ala Arg Arg Thr Ala Asp His Gly Val Gly Ala Asp Ala Ala
                4760                4765                4770
Ala Glu Leu Arg Gly Arg Leu Ala Gly Thr Pro Lys Pro Ala Glu
                4775                4780                4785
Arg Thr Ala Leu Leu Thr Glu Val Val Arg Thr His Ala Ala Ala
                4790                4795                4800
Val Leu Gly His Gly Gly Thr Asp Thr Val Thr Ala Asp Gly Glu
                4805                4810                4815
Phe Arg Glu Phe Gly Phe Asp Ser Leu Thr Ala Val Glu Leu Arg
                4820                4825                4830
Asn Arg Leu Asn Ala Ala Thr Gly Leu Arg Leu Ala Thr Thr Leu
                4835                4840                4845
Val Phe Asp His Pro Thr Pro Ala Ala Leu Ala Asp His Leu Glu
                4850                4855                4860
Arg Leu Leu Ala Ala Glu Pro Ala Ser Asp Met Thr Ala Glu Thr
                4865                4870                4875
Ala Gly Ala Pro Gly Glu Arg Asp Ala Thr Ala Ser Ser Arg Ala
                4880                4885                4890
Gly Ser Gly Pro Ser Ala Asp Thr Val Glu Ala Leu Phe Trp Ile
                4895                4900                4905
Gly His Asp Ser Gly Arg Val Glu Glu Ser Met Ala Leu Leu Ser
                4910                4915                4920
Ala Ala Ser Ala Phe Arg Pro Cys Phe Thr Asp Pro Ser Ala Met
                4925                4930                4935
Thr Arg Pro Pro Phe Val Arg Val Ala Gln Gly Asp Thr Gly Pro
                4940                4945                4950
Ala Leu Ile Cys Leu Pro Thr Val Ala Ala Val Ser Ser Val Tyr
                4955                4960                4965
Gln Tyr Ser Arg Phe Ala Ala Ala Leu Asp Gly Leu Arg Asp Val
                4970                4975                4980
Trp Tyr Val Pro Ala Pro Gly Phe Ala Asp Gly Glu Pro Leu Pro
                4985                4990                4995
Ala Asp Val Asp Thr Ile Thr Arg Leu Phe Thr Asp Ala Ile Leu
                5000                5005                5010
Arg His Thr Asp Gly Glu Pro Phe Ala Leu Ala Gly His Ser Ala
                5015                5020                5025
Gly Gly Trp Phe Thr His Thr Val Thr Ser Arg Leu Glu His Leu
                5030                5035                5040
Gly Val Arg Pro Gln Ala Val Val Val Met Asp Ala Tyr Leu Pro
                5045                5050                5055
Asp Glu Gly Met Ala Pro Val Ala Ala Ala Leu Thr Ser Glu Ile
                5060                5065                5070
Phe Asp Arg Val Thr Glu Phe Ile Asp Leu Asp Tyr Ala Arg Leu
                5075                5080                5085
Val Ala Met Gly Gly Tyr Phe Arg Ile Phe Ala Gly Trp Arg Pro
                5090                5095                5100
Pro Ala Leu Glu Thr Pro Thr Leu Phe Leu Arg Ala Arg Glu Ser
                5105                5110                5115
Glu Gln Pro Pro Pro Val Trp Gly Glu Pro His Thr Val Leu Glu
                5120                5125                5130
Thr Asp Gly Asn His Phe Thr Met Leu Glu Glu His Ala Glu Ser
                5135                5140                5145
Thr Ala Arg His Val His Thr Trp Leu Ala Gly Leu Thr Glu Gln
                5150                5155                5160
Arg Arg Arg
                5165

SEQ ID NO: 12 (length: 254; type: PRT; organism: bacteria)
Met Asp Arg Tyr Ala Lys Arg Phe Glu Asp Arg Leu Val Leu Val Thr
1               5                   10                  15
Gly Ala Gly Ser Gly Ile Gly Arg Ala Thr Ala Cys Arg Phe Gly Ala
            20                  25                  30
Ala Gly Ala Arg Leu Val Cys Val Asp Arg Asp Gly Pro Gly Ala Glu
        35                  40                  45
Ala Thr Ala Glu Leu Ala Arg Ala Arg Gly Ala Arg Ala Ala Cys Ala
    50                  55                  60
Glu Val Ala Asp Val Ser Asp Glu Val Ala Met Glu Arg Leu Ala Ala
65                  70                  75                  80
Arg Val Thr Ala Ala His Gly Val Leu Asp Val Leu Val Asn Asn Ala
                85                  90                  95
Gly Ile Gly Met Ser Gly Arg Phe Leu Asp Thr Ser Ala Glu Asp Trp
            100                 105                 110
Arg Arg Thr Leu Gly Val Asn Leu Trp Gly Val Ile His Gly Cys Arg
        115                 120                 125
Leu Leu Gly Arg Gly Met Ala Glu Arg Arg Gln Gly Gly His Ile Val
    130                 135                 140
Thr Val Ala Ser Ala Ala Ala Phe Gln Pro Thr Arg Val Val Pro Val
145                 150                 155                 160
Tyr Ala Thr Ser Lys Ala Ala Ala Leu Met Leu Ser Glu Cys Leu Arg
                165                 170                 175
Ala Glu Leu Ala Glu Phe Gly Ile Gly Val Ser Val Val Cys Pro Gly
            180                 185                 190
Leu Val Arg Thr Pro Phe Ala Ser Ala Met Tyr Phe Ala Gly Ala Ser
        195                 200                 205
Pro Asp Glu His Thr Arg Leu Arg Glu Ser Ser Ala Arg Arg Phe Ala
    210                 215                 220
Gly Arg Gly Cys Pro Pro Glu Lys Val Ala Asp Ala Val Leu Arg Ala
225                 230                 235                 240
Ile Met Arg Thr Ala Leu Pro Thr Val Thr Gly Ser Thr Pro
                245                 250

SEQ ID NO: 13 (length: 7; type: PRT; organism: bacteria)
Gly Gly Thr Gly Thr Leu Gly
1               5

SEQ ID NO: 14 (length: 7; type: PRT; organism: bacteria)
Gly Ala Ala Ser Thr Leu Gly
1               5

SEQ ID NO: 15 (length: 33; type: DNA; organism: bacteria)
ctggtgacgg gcgctgcaag cactctgggg gcg    33

SEQ ID NO: 16 (length: 33; type: DNA; organism: bacteria)
gaccactgcc cgcgacgttc gtgagacccc cgc    33

SEQ ID NO: 17 (length: 7; type: PRT; organism: bacteria)
Leu Val Ser Arg Arg Gly Met
1               5

SEQ ID NO: 18 (length: 7; type: PRT; organism: bacteria)
Leu Val Ala Ala Ala Gly Met
1               5

SEQ ID NO: 19 (length: 47; type: DNA; organism: bacteria)
gcggcatctg ctgctggtgg cagcggcagg catggccgcc gccggtg    47

SEQ ID NO: 20 (length: 47; type: DNA; organism: bacteria)
cgccgtagac gacgaccacc gtcgccgtcc gtaccggcgg cggccac    47

SEQ ID NO: 21 (length: 7; type: PRT; organism: bacteria)
His Thr Ala Gly Val Leu Asp
1               5

SEQ ID NO: 22 (length: 7; type: PRT; organism: bacteria)
His Thr Pro Pro Leu Leu Asp
1               5

SEQ ID NO: 23 (length: 46; type: DNA; organism: bacteria)
gaccgctgtg gtgcacacgc cacctctcct ggacgacgcc accgtg    46

SEQ ID NO: 24 (length: 46; type: DNA; organism: bacteria)
ctggcgacac cacgtgtgcg gtggagagga cctgctgcgg tggcac    46

SEQ ID NO: 25 (length: 5; type: PRT; organism: bacteria)
Gly Ala Lys Val Asp
1               5

SEQ ID NO: 26 (length: 5; type: PRT; organism: bacteria)
Gly Ala Ala Val Asp
1               5

SEQ ID NO: 27 (length: 39; type: DNA; organism: bacteria)
gatgcggtgc tcggggcggc tgtggacggt gccctgcac    39

SEQ ID NO: 28 (length: 39; type: DNA; organism: bacteria)
ctacgccacg agccccgccg acacctgcca cgggacgtg    39

SEQ ID NO: 29 (length: 7; type: PRT; organism: bacteria)
Val Leu Phe Ser Ser Ala Ala
1               5

SEQ ID NO: 30 (length: 7; type: PRT; organism: bacteria)
Val Leu Phe Ala Ala Ala Ala
1               5

SEQ ID NO: 31 (length: 41; type: DNA; organism: bacteria)
gtcggcgttc gtgctgttcg cagcggccgc cggggtcctg g    41

SEQ ID NO: 32 (length: 41; type: DNA; organism: bacteria)
cagccgcaag cacgacaagc gtcgccggcg gccccaggac c    41

SEQ ID NO: 33 (length: 23; type: DNA; organism: bacteria)
tactgcgcca cacggagccc gag    23

SEQ ID NO: 34 (length: 20; type: DNA; organism: bacteria)
tgggtaacgc cagggttttc    20

SEQ ID NO: 35 (length: 24; type: DNA; organism: bacteria)
ggaaacagct atgacatgat tacg    24

SEQ ID NO: 36 (length: 20; type: DNA; organism: bacteria)
tcggagccgc tccacctgag    20

SEQ ID NO: 37 (length: 20; type: DNA; organism: bacteria)
cctgatggac gcgggtgcgc    20

SEQ ID NO: 38 (length: 16; type: DNA; organism: bacteria)
gacaccgaaa cccctg    16

SEQ ID NO: 39 (length: 20; type: DNA; organism: bacteria)
cctgatggac gcgggtgcgc    20

SEQ ID NO: 40 (length: 23; type: DNA; organism: bacteria)
gccgtgtgca ccacagcggt cag    23

SEQ ID NO: 41 (length: 28; type: DNA; organism: bacteria)
gtgtgatgtc gccgaccgcg cccaggtc    28

SEQ ID NO: 42 (length: 22; type: DNA; organism: bacteria)
gcgctggtgg gccagggcgt cc    22