Patent application title: MULTIMERIC ELP FUSION CONSTRUCTS
Inventors:
Suzanne Dagher (Durham, NC, US)
IPC8 Class: AC07K1456FI
USPC Class:
530351
Class name: Chemistry: natural resins or derivatives; peptides or proteins; lignins or reaction products thereof proteins, i.e., more than 100 amino acid residues lymphokines, e.g., interferons, interleukins, etc.
Publication date: 2011-04-07
Patent application number: 20110082283
Abstract:
ELP fusion proteins, multimeric ELP spider complexes formed of ELP fusion
proteins, and methods of using the same. The construct may be in the form
of an ELP spider structure complex including multi-leg moieties
comprising ELP fusion proteins capable of forming covalent disulfide
bonds. The multimeric fusion constructs may be employed in peptide
production and purification and/or to enhance proteolytic resistance of a
protein or peptide moiety in a fusion construct, by provision of the
fusion protein in an ELP spider complex.
Claims:
1. A fusion protein exhibiting a phase transition, comprising: a) at
least one target protein or peptide; b) one or more proteins comprising
oligomeric repeats of a polypeptide sequence selected from SEQ ID NOs:
1-17, wherein the one or more proteins exhibits a phase transition and
are joined to the at least one target protein or peptide of a); c) at
least two residues capable of forming a disulfide bond; and d)
optionally, a spacer sequence separating any of the phase transition
protein(s) of b) from any of the target protein(s) or peptide(s) of a).
2. The fusion protein of claim 1, wherein the target protein is IFNa2b.
3. The fusion protein of claim 1, wherein the target peptide is Orexin-B, MMN, NPY or Gh.
4. The fusion protein of claim 1, wherein the phase transition protein is selected from the group consisting of: SEQ ID NOs: 1, 2, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17.
5. The fusion protein of claim 1, wherein the phase transition protein is selected from the group consisting of: SEQ ID NOs: 1, 2, 9, 10, 11, 12, 13, 14, 15, 16, and 17.
6. The fusion protein of claim 1, wherein the phase transition protein is selected from the group consisting of: ELP1-90, ELP1-180, ELP2-64, ELP2-128, ELP3-72, ELP3-144, ELP4-60, ELP4-120, ELP5-15, ELP5-30, ELP5-60, ELP6-15, ELP6-30, ELP6-60, ELP7-15, ELP7-30 and ELP7-60.
7. The fusion protein of claim 1, wherein the phase transition is mediatable by an agency comprising one or more of: changing temperature; changing pH; addition of solutes and/or solvents; side chain ionization or chemical modification; and changing pressure.
8. The fusion protein of claim 1, wherein the at least two residues capable of forming a disulfide bond comprise cysteine.
9. The fusion protein of claim 1, comprising said spacer sequence.
10. The fusion protein of claim 9, wherein said spacer sequence comprises a proteolytic cleavage site.
11. The fusion protein of claim 1, selected from pET17b-SD33-ELP1-90-IFNA2bSD (SEQ ID NO: 22), pET17b-SD33-ELP4-60-IFNA2bSD (SEQ ID NO: 23), pET17b-SD34-ELP1-90-IFNA2bSD (SEQ ID NO: 24), pET17b-SD34-ELP4-60-IFNA2bSD (SEQ ID NO: 25), pET17b-SD22-ELP1-90-IFNA2bSD (SEQ ID NO: 26), pET17b-SD22-ELP4-60-IFNA2bSD (SEQ ID NO: 27), pET17b-SD35-IFNA2bSD-ELP1-90 (SEQ ID NO: 28), pET17b-SD35-IFNA2bSD-ELP4-60 (SEQ ID NO: 29), pET17b-SD37-IFNA2bSD-ELP1-90 (SEQ ID NO: 30), pET17b-SD37-IFNA2bSD-ELP4-60 (SEQ ID NO: 31), pET17b-SD31-IFNA2bSD-ELP1-90 (SEQ ID NO: 32) and pET17b-SD31-IFNA2bSD-ELP4-60 (SEQ ID NO: 33).
12. The fusion protein of claim 1, selected from ELP4-60-S-S-Orexin B, ELP4-60-S-S-MMN, ELP4-60-S-S-NPY or ELP4-60-S-S-Gh.
13. A polynucleotide comprising a nucleotide sequence encoding the fusion protein of claim 1.
14. The polynucleotide of claim 13, selected from pET17b-SD33-ELP1-90-IFNA2bSD (SEQ ID NO: 34), pET17b-SD33-ELP4-60-IFNA2bSD (SEQ ID NO: 35), pET17b-SD34-ELP1-90-IFNA2bSD (SEQ ID NO: 36), pET17b-SD34-ELP4-60-IFNA2bSD (SEQ ID NO: 37), pET17b-SD22-ELP1-90-IFNA2bSD (SEQ ID NO: 38), pET17b-SD22-ELP4-60-IFNA2bSD (SEQ ID NO: 39), pET17b-SD35-IFNA2bSD-ELP1-90 (SEQ ID NO: 40), pET17b-SD35-IFNA2bSD-ELP4-60 (SEQ ID NO: 41), pET17b-SD37-IFNA2bSD-ELP1-90 (SEQ ID NO: 42), pET17b-SD37-IFNA2bSD-ELP4-60 (SEQ ID NO: 43), pET17b-SD31-IFNA2bSD-ELP1-90 (SEQ ID NO: 44) and pET17b-SD31-IFNA2bSD-ELP4-60 (SEQ ID NO: 45).
15. An expression vector comprising the polynucleotide of claim 13.
16. A host cell transformed by the vector of claim 15, wherein said host cell expresses the fusion protein.
17. An ELP spider complex comprising two or more fusion proteins of claim 1, linked by at least one disulfide bond.
18. An ELP spider complex comprising two or more fusion proteins exhibiting a phase transition, comprising: a) at least one target protein or peptide; b) one or more proteins comprising oligomeric repeats of a polypeptide sequence selected from SEQ ID NOs: 1-17, wherein the one or more proteins exhibits a phase transition and are joined to the at least one target protein or peptide of a); c) at least two residues capable of forming a disulfide bond; and d) optionally, a spacer sequence separating any of the phase transition protein(s) of b) from any of the target protein(s) or peptide(s) of a), wherein the two or more fusion proteins exhibiting a phase transition are linked by at least one disulfide bond.
19. A method of providing a purified protein of interest, comprising: contacting a fusion protein comprising the protein of interest and an ELP tag, wherein the fusion protein contains at least one cleavage site that is cleavable to yield the protein of interest as a cleavage product with ELP-TEV1 that is effective to cleave said cleavage site, thereby yielding said protein of interest as a cleavage product in a cleavage product mixture comprising said ELP tag, any uncleaved fusion protein and said ELP-tagged cleavage agent; and separating the protein of interest from the cleavage product mixture by inverse phase transition.
20. The method of claim 19, wherein the fusion protein comprises a spider construct.
21. A method of enhancing proteolytic resistance of a protein or peptide moiety in an ELP-based fusion peptide, comprising provision of the ELP-based fusion peptide in an ELP spider complex.
22. An ELP fusion protein spider complex.
Description:
CROSS-REFERENCE TO RELATED APPLICATION
[0001] The benefit of priority of U.S. Provisional Patent Application 60/756,269 filed Jan. 4, 2006 in the name of Suzanne Dagher is hereby claimed under the provisions of 35 USC 119(e). The disclosure of such Provisional Patent Application 60/756,269 is hereby incorporated herein by reference, in its entirety.
FIELD OF THE INVENTION
[0002] The present invention relates to fusion constructs comprising multimeric elastin-like peptides (ELPs) having utility, inter alia, in biopharmaceutical applications, and to methods of making and using the same.
DESCRIPTION OF THE RELATED ART
[0003] U.S. Pat. No. 6,852,834 issued Feb. 8, 2005 in the name of Ashutosh Chilkoti for "FUSION PEPTIDES ISOLATABLE BY PHASE TRANSITION" and U.S. patent application Ser. No. 11/053,100 filed Feb. 8, 2005 for "FUSION PEPTIDES ISOLATABLE BY PHASE TRANSITION," in the name of Ashutosh Chilkoti (published Nov. 17, 2005 as U.S. Patent Publication No. 2005/0255554) disclose fusion proteins exhibiting a phase transition. Such fusion proteins comprise a biologically active molecule joined to one or more phase transition protein(s), where the one or more phase transition protein(s) comprise polymeric or oligomeric repeats of a polypeptide sequence. The fusion protein may also, optionally, contain a spacer sequence between the biologically active molecule and the one or more phase transition protein(s).
[0004] The biologically active molecules in these constructs can be of widely varying types, including, for example, peptides, non-peptide proteins, lipids, oligonucleotides and carbohydrates, or alternatively a ligand-binding protein or an active fragment thereof having binding affinity to a biomolecule selected from the group consisting of small organic or inorganic molecules, proteins, peptides, single-stranded or double-stranded oligonucleotides, polynucleotides, lipids, and carbohydrates.
[0005] The phase transition can be mediated by one or more techniques such as changing temperature, changing pH, addition of solutes and/or solvents, side-chain ionization or chemical modification and/or changing pressure.
[0006] The one or more protein(s) exhibiting phase transition behavior can include polymeric or oligomeric repeats of the pentapeptide Ile-Pro-Gly-X-Gly or Leu-Pro-Gly-X-Gly, wherein X is any natural or non-natural amino acid residue, and wherein X optionally varies among polymeric or oligomeric repeats.
[0007] The technology disclosed by the above-identified Chilkoti patent and application includes methods of purification of fusion proteins to yield a protein of interest, by forming a polynucleotide comprising a nucleotide sequence encoding a fusion protein exhibiting a phase transition, expressing the fusion protein in culture, and subjecting a fusion protein-containing material from the culture to processing involving centrifugation and inverse transition cycling to recover the protein of interest.
[0008] The Chilkoti technology reflects an initial discovery that non-chromatographic, thermally-stimulated phase separation and purification of recombinant proteins can be easily achieved by forming fusion proteins that contain the target recombinant proteins with N- or C-terminal elastin-like polypeptide (ELP) tags.
[0009] ELPs are repeating peptide sequences that have been found to exist in the elastin protein. Among these repeating peptide sequences are polytetra-, polypenta-, polyhexa-, polyhepta-, polyocta-, and polynonapeptides.
[0010] ELPs undergo a reversible inverse temperature transition: they are structurally disordered and highly soluble in water below a transition temperature (Tt), but exhibit a sharp (2-3° C. range) disorder-to-order phase transition when the temperature is raised above Tt, leading to desolvation and aggregation of the polypeptides. The ELP aggregates, when reaching sufficient size, can be readily removed and isolated from solution by centrifugation. More importantly, such phase transition is reversible, and the isolated ELP aggregates can be completely resolubilized in buffer solution when the temperature is returned below the Tt of the ELPs.
[0011] It was a surprising and unexpected discovery that fusion proteins ("FP's") containing target recombinant proteins with N- or C-terminal ELP tags also undergo a thermo-dependent phase transition similar to that of free ELPs.
[0012] This discovery has been particularly useful for non-chromatographic, thermally-stimulated separation and purification of recombinant proteins. By fusing a thermally responsive ELP tag to a target protein of interest, environmentally sensitive solubility can be imparted to such target protein. Target proteins are readily expressed as soluble fusion proteins with N- or C-terminal ELP sequences in host organisms such as E. coli, wherein the fusion proteins exhibit a soluble-insoluble phase transition when the temperature is raised from below Tt to above Tt. This inverse phase transition is exploited for purifying the target proteins from other soluble proteins produced by the organism, by nonchromatographic "inverse transition cycling" (ITC) separation.
[0013] The fundamental principle of ITC thus is remarkably simple. It involves forming an ELP fusion protein as described hereinabove, which contains the target protein with an N- or C-terminal ELP tag, and rendering the ELP fusion protein insoluble in aqueous solution by triggering its inverse phase transition. This can be accomplished either by increasing the temperature above the Tt or alternatively by depressing the Tt below the solution temperature by the addition of NaCl or other salt or solute, organic or inorganic, to the solution. This results in aggregation of the ELP fusion protein, allowing it to be collected by centrifugation or other weight- and/or size-dependent mass separation techniques, e.g., membrane separation or filtration.
[0014] The aggregated ELP fusion protein can then be resolubilized in fresh buffer solution at a temperature below the Tt, thereby reversing the inverse phase transition, to yield soluble, functionally active, and purified fusion protein.
[0015] Successive purification steps may also be carried out using ITC to achieve a highly pure, e.g. ultrapure, fusion protein product. Furthermore, ITC may also be used to concentrate and exchange buffers if desired as follows: the purified protein is aggregated by triggering the phase transition, and resolubilized in a smaller volume than before inducing the phase transition to concentrate the protein solution, and buffer exchange is achieved by simply resolubilizing the protein in a buffer of different composition than the starting buffer.
[0016] Free target protein then can be obtained, for example, by carrying out protease digestion or other scission process at an engineered recognition site located between the target protein and the ELP tag, followed by a final round of ITC to remove the cleaved ELP tag and yield the purified free target protein.
[0017] The advantage of the use of inverse phase transition cycling is that purification of the protein of interest is facilitated in a ready and efficient manner. Protein production and purification efficiency are of continuing interest in ongoing efforts to refine and develop this technology, involving the search for new constructs that are well-adapted to the inverse phase transition cycling process, to yield proteins of interest at high yields. The present invention provides such new constructs.
SUMMARY OF THE INVENTION
[0018] The present invention is based on the discovery of new protein constructs that are easier to purify, enable production of peptide or protein products in higher amounts and are less susceptible to proteolysis, as compared to the single ELP-based constructs of the prior art. Such constructs are useful in inverse phase transition processes.
[0019] Thus in one embodiment, the invention provides a fusion protein exhibiting a phase transition, including at least one target protein or peptide, one or more proteins comprising oligomeric repeats of a polypeptide sequence, wherein the one or more proteins exhibits a phase transition and are joined to the at least one target protein or peptide, at least two residues capable of forming a disulfide bond and optionally a spacer sequence separating any of the phase transition protein(s) from any of the target protein(s) or peptide(s). The invention also provides a polynucleotide encoding such a fusion protein, an expression vector comprising such a polynucleotide and a host cell that expresses the fusion protein.
[0020] In another embodiment the invention provides an ELP spider complex comprising two or more fusion proteins exhibiting a phase transition, including at least one target protein or peptide, one or more proteins comprising oligomeric repeats of a polypeptide sequence, wherein the one or more proteins exhibits a phase transition and are joined to the at least one target protein or peptide, at least two residues capable of forming a disulfide bond and optionally a spacer sequence separating any of the phase transition protein(s) from any of the target protein(s) or peptide(s), wherein the two or more fusion proteins exhibiting a phase transition are linked by at least one disulfide bond.
[0021] In still another embodiment the invention provides methods of providing a purified protein of interest and of enhancing proteolytic resistance of a protein or peptide moiety.
[0022] The method of the invention providing a purified protein of interest comprises contacting a fusion protein comprising the protein of interest and an ELP tag, wherein the fusion protein contains at least one cleavage site that is cleavable to yield the protein of interest as a cleavage product with ELP-TEV1 that is effective to cleave the cleavage site, thereby yielding said protein of interest as a cleavage product in a cleavage product mixture comprising said ELP tag, any uncleaved fusion protein and said ELP-tagged cleavage agent; and separating the protein of interest from the cleavage product mixture by inverse phase transition.
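The separation step of the foregoing method can be illustrated by a minimal sketch. It assumes an idealized inverse phase transition in which every ELP-bearing species aggregates and every species lacking an ELP remains soluble. The class `Species`, the function `inverse_transition_split` and the species labels are hypothetical names introduced solely for illustration and form no part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Species:
    """One component of the cleavage product mixture (hypothetical model)."""
    name: str
    has_elp: bool  # True if the species carries an ELP tag


def inverse_transition_split(mixture):
    """Idealized split: ELP-bearing species aggregate, all others stay soluble.

    Assumption: the transition is complete and perfectly selective.
    """
    insoluble = [s for s in mixture if s.has_elp]
    soluble = [s for s in mixture if not s.has_elp]
    return insoluble, soluble


# Components named in the method: cleaved ELP tag, uncleaved fusion protein,
# ELP-tagged cleavage agent, and the released protein of interest.
mixture = [
    Species("cleaved ELP tag", True),
    Species("uncleaved fusion protein", True),
    Species("ELP-tagged cleavage agent", True),
    Species("protein of interest", False),
]

insoluble, soluble = inverse_transition_split(mixture)
print([s.name for s in soluble])
```

Under these assumptions only the protein of interest remains in the soluble fraction, because the cleavage agent itself carries an ELP tag and is removed together with the cleaved tag and any uncleaved fusion protein.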
[0023] In another aspect the invention provides a method of enhancing proteolytic resistance of a protein or peptide moiety in an ELP-based fusion peptide, comprising provision of the ELP-based fusion peptide in an ELP spider complex.
BRIEF DESCRIPTION OF THE DRAWINGS
[0024] FIG. 1A is a schematic representation of an ELP fusion protein including a -Cys-Cys- moiety after the ELP and before the TEV site and the peptide. FIG. 1B is a schematic representation of a corresponding spider construct, for pET17b-SD34-ELP4-60-Orexin B (as set forth below in Example 1), and transformation into a BL21 trxB- strain, allowing disulfide bond formation between the fusion proteins to take place in the cytoplasm.
[0025] FIG. 2 is an SDS-PAGE gel comparison for ELP4-60-Orexin B (Normal) and ELP4-60-S-S-Orexin B (Spider) (Lane 1: ELP4-60-Orexin B or ELP4-60-S-S-Orexin B; Lane 2: +ELP1-90-TEV; Lane 3: insolubles following Tt (I); and Lane 4: solubles following Tt (S)).
[0026] FIG. 3 shows an LC-MS analysis of Orexin B derived from ELP4-60-Orexin B (FIG. 3A), in which the level of Orexin B was too low to detect by LC-MS, so that its theoretical location was estimated from the location of Orexin B in FIG. 3B, and of Orexin B derived from ELP4-60-S-S-Orexin B (FIG. 3B), 85% pure, in which the major peak contains the correct molecular weight of Orexin B and the minor peak is a proteolytic fragment of Orexin B.
[0027] FIG. 4 is an SDS-PAGE gel analysis, showing that the spider influences the amount of purification, the cleavage with ELP4-60-TEV and proteolysis, with all samples being reduced with DTT prior to loading on the SDS-gel.
[0028] FIG. 5 is an SDS-PAGE gel analysis of ELP-spider constructs of three additional peptides (Lane 1: ELP4-60-S-S-peptide (reduced with DTT); Lane 2: +ELP1-90-TEV; Lane 3: insolubles following Tt (I); and Lane 4: solubles following Tt (S)).
[0029] FIG. 6 shows LC-MS analysis for non-spider peptides, MMN from ELP4-60-S-S-MMN: 99.3% pure 13 mg/L (FIG. 6A) and NPY from ELP4-60-S-S-NPY: 98.1% pure 20 mg/L (FIG. 6B).
[0030] FIG. 7 shows LC-MS analysis for pro-CT from ELP1-90-pro-CT: 98.0% pure (FIG. 7A) and Leptin from ELP1-90-Leptin: 85% pure (FIG. 7B).
[0031] FIG. 8 shows the results of testing of spider constructs grown in Example 8, as compared to non-spider constructs containing IFNa2bSD, in order to determine which were most likely to produce soluble protein.
[0032] FIG. 9 shows re-expression of selected constructs from FIG. 8. The results of the re-expression show that spider constructs clearly express more target protein than non-spider constructs.
[0033] FIG. 10 shows expression of purified target protein obtained after cleavage of each of the constructs. It is shown that the spider constructs express more soluble ELP fusion protein and more IFNa2bSD than non-spider constructs.
DETAILED DESCRIPTION OF THE INVENTION AND PREFERRED EMBODIMENTS THEREOF
[0034] The present invention relates to multimeric elastin-like peptide (ELP) constructs having utility for biopharmaceutical applications, and methods of making and using the same.
[0035] The disclosures of the following U.S. patent and U.S. patent application are hereby incorporated herein by reference in their respective entireties: U.S. Pat. No. 6,852,834 issued Feb. 8, 2005 in the name of Ashutosh Chilkoti for "FUSION PEPTIDES ISOLATABLE BY PHASE TRANSITION" and U.S. patent application Ser. No. 11/053,100 filed Feb. 8, 2005 for "FUSION PEPTIDES ISOLATABLE BY PHASE TRANSITION," in the name of Ashutosh Chilkoti (published Nov. 17, 2005 as U.S. Patent Publication No. 2005/0255554).
[0036] The present invention represents the discovery that multimeric ELP-peptide or ELP-protein constructs can be formed that, in relation to single ELP-based peptide or protein constructs of the prior art: (i) can be more easily purified, (ii) enable production, e.g., in a suitable host such as E. coli, of peptide or protein products in higher amounts, reflecting enhanced stability in relation to single ELP-based peptide or protein constructs, and (iii) are less susceptible to proteolysis.
[0037] As used herein, the term "spider construct" is used to refer to fusion proteins capable of forming multimeric or multi-legged spider complexes of the present invention. Spider constructs differ from single ELP-based constructs in that they are capable of forming covalent crosslinks in the form of disulfide bonds, linking them to other spider constructs and forming spider complexes. Such disulfide bonds are often formed between cysteine residues of the spider constructs. Cysteine residues present in a construct of the invention may be added to the construct on either side of the ELP or may be found within the ELP itself. Cysteines adjacent to the ELP in the construct may be on either the C-terminal or N-terminal end of the ELP, regardless of whether the ELP is oriented to the amino or carboxyl end of the protein or peptide.
[0038] A "spider complex" of the invention contains at least two spider constructs linked by at least one disulfide bond, but is not limited by any maximum number of spider constructs or any maximum number of disulfide bonds.
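The relationship between spider constructs and spider complexes defined above can be sketched as a graph in which constructs are nodes and intermolecular disulfide bonds are edges, a complex being any connected group of at least two constructs. The following is an illustrative sketch only; the function `spider_complexes` and the construct labels are hypothetical, and a disulfide bond within a single construct is assumed not to create a complex.

```python
from collections import defaultdict


def spider_complexes(constructs, disulfide_bonds):
    """Group constructs into complexes linked by disulfide bonds.

    constructs: iterable of hashable construct labels.
    disulfide_bonds: iterable of (construct_a, construct_b) pairs.
    Returns a list of sets, each holding at least two linked constructs.
    """
    # Union-find over construct labels.
    parent = {c: c for c in constructs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in disulfide_bonds:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for c in parent:
        groups[find(c)].add(c)

    # A complex needs at least two constructs; there is no upper limit.
    return [members for members in groups.values() if len(members) >= 2]


# Hypothetical example: A-B and B-C are intermolecular bonds, D-D is an
# intramolecular bond, so D remains a free construct.
complexes = spider_complexes(
    ["A", "B", "C", "D"],
    [("A", "B"), ("B", "C"), ("D", "D")],
)
print(complexes)
```

In this example constructs A, B and C form a single three-legged complex, while D, whose only bond is internal, is not part of any complex.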
[0039] Therefore in one embodiment the invention provides a fusion protein or spider construct comprising:
[0040] a) at least one target protein or peptide;
[0041] b) one or more proteins comprising oligomeric repeats of a polypeptide sequence, such as those hereinafter described, wherein the one or more proteins exhibit a phase transition and are joined to the at least one target protein or peptide of a);
[0042] c) at least two residues capable of forming a disulfide bond; and
[0043] d) optionally, a spacer sequence separating any of the phase transition protein(s) of b) from any of the target protein(s) or peptide(s) of a).
[0044] The fusion protein of the invention contains a target protein or peptide. Preferably, the target protein or peptide comprises a polypeptide protein, most preferably a biologically active polypeptide, e.g. a therapeutic peptide, protein or an enzyme useful in industrial biocatalysis. Suitable proteins include those of interest in medicine, agriculture and other scientific and industrial fields, particularly including therapeutic proteins such as erythropoietins, interferons, insulin, monoclonal antibodies, blood factors, colony stimulating factors, growth hormones, interleukins, growth factors, therapeutic vaccines, calcitonins, tumor necrosis factors (TNF), and enzymes. Specific examples of such therapeutic proteins include, without limitation, enzymes utilized in replacement therapy; hormones for promoting growth in animals, or cell growth in cell culture; and active proteinaceous substances used in various applications, e.g., in biotechnology or in medical diagnostics. Specific examples include, but are not limited to: superoxide dismutase, interferon, asparaginase, glutamase, arginase, arginine deaminase, adenosine deaminase, ribonuclease, trypsin, chymotrypsin, papain, insulin, calcitonin, ACTH, glucagon, somatosin, somatropin, somatomedin, parathyroid hormone, erythropoietin, hypothalamic releasing factors, prolactin, thyroid stimulating hormones, endorphins, enkephalins, and vasopressin.
[0045] The target protein or peptide may also comprise a ligand-binding protein or an active fragment thereof, such as an antibody or antibody fragment, which has specific affinity for a protein of interest. Upon binding to the protein of interest, the fusion protein preferably retains some or all of its phase transition character, so that the protein of interest bound to such fusion protein may be isolated by inverse phase transition. In another embodiment the target protein or peptide may be selected from the group consisting of proteins, lipids, carbohydrates, and single- or double-stranded oligonucleotides.
[0046] In various embodiments of the invention, the target protein or peptide may be, but is not limited to, IFNa2b, Orexin-B, MMN, NPY, Gh or active fragments thereof.
[0047] In addition to the target protein or peptide component, a fusion protein of the invention also includes one or more ELPs exhibiting a phase transition. These ELPs may be of any suitable type.
[0048] As used herein, ELPs are repeating peptide sequences that exist in the elastin protein. Among these repeating peptide sequences are polytetra-, polypenta-, polyhexa-, polyhepta-, polyocta-, and polynonapeptides.
[0049] The ELPs may comprise polymeric or oligomeric repeats of various tetra-, penta-, hexa-, hepta-, octa-, and nonapeptides, including but not limited to VPGG (SEQ ID NO: 1), IPGG (SEQ ID NO: 2), XGVPG (SEQ ID NO: 3), VGVPG (SEQ ID NO: 4), VPAVG (SEQ ID NO: 5), GVGIP (SEQ ID NO: 6), VGLPG (SEQ ID NO: 7), VPGXG (SEQ ID NO: 8), AVGVP (SEQ ID NO: 9), IPGVG (SEQ ID NO: 10), IPGXG (SEQ ID NO: 11), LPGVG (SEQ ID NO: 12), LPGXG (SEQ ID NO: 13), VAPGVG (SEQ ID NO: 14), GVGVPGVG (SEQ ID NO: 15), VPGFGVGAG (SEQ ID NO: 16), and VPGVGVPGG (SEQ ID NO: 17). It will be appreciated by those of skill in the art that the ELPs need not consist of only polymeric or oligomeric sequences as listed hereinabove, in order to exhibit the desired phase transition, and that other polymeric or oligomeric sequences of varying size and constitution that exhibit phase transition behavior are also usefully employed in the broad practice of the present invention.
[0050] In one embodiment, the ELPs are polymeric or oligomeric repeats selected from one of the above listed polypentapeptides. Where the above polypentapeptides contain a guest residue X, X may be any amino acid that does not eliminate the phase transition characteristics of the ELP. X may be a naturally occurring or non-naturally occurring amino acid. For example, X may be selected from the group consisting of: alanine, arginine, asparagine, aspartic acid, cysteine, glutamic acid, glutamine, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine and valine. In one aspect of the invention, where the ELP is VPGXG (SEQ ID NO: 8), X is not proline.
[0051] X may be a non-classical amino acid. Examples of non-classical amino acids include: D-isomers of the common amino acids, 2,4-diaminobutyric acid, α-amino isobutyric acid, 4-aminobutyric acid, Abu, 2-amino butyric acid, γ-Abu, ε-Ahx, 6-amino hexanoic acid, Aib, 2-amino isobutyric acid, 3-amino propionic acid, ornithine, norleucine, norvaline, hydroxyproline, sarcosine, citrulline, homocitrulline, cysteic acid, t-butylglycine, t-butylalanine, phenylglycine, cyclohexylalanine, β-alanine, fluoro-amino acids, designer amino acids such as β-methyl amino acids, C α-methyl amino acids, N α-methyl amino acids, and amino acid analogs in general.
[0052] In one embodiment, the ELP is a "Series I" pentapeptide of repeating oligomer XGVPG (SEQ ID NO: 3) where X is independently selected from Val, Gly and Ala, in a ratio of 5V:3G:2A, selected from ELP1-90 and ELP1-180.
[0053] In another embodiment, the ELP is a "Series II" pentapeptide of repeating oligomer XGVPG (SEQ ID NO: 3) where X is independently selected from Lys, Val and Phe, in a ratio of 1K:2V:1F, selected from ELP2-64 and ELP2-128.
[0054] In still another embodiment of the invention, the ELP is a "Series III" pentapeptide of repeating oligomer XGVPG (SEQ ID NO: 3) where X is independently selected from Lys, Val and Phe, in a ratio of 1K:7V:1F, selected from ELP3-72 and ELP3-144.
[0055] In a further embodiment, the ELP is a "Series IV" pentapeptide of repeating oligomer VGVPG (SEQ ID NO: 4), selected from ELP4-60 and ELP4-120.
[0056] In a further embodiment, the ELP is a "Series V" pentapeptide of repeating oligomer VPAVG (SEQ ID NO: 5), selected from ELP5-15, ELP5-30 and ELP5-60.
[0057] In a further embodiment, the ELP is a "Series VI" pentapeptide of repeating oligomer GVGIP (SEQ ID NO: 6), selected from ELP6-15, ELP6-30 and ELP6-60.
[0058] In a further embodiment, the ELP is a "Series VII" pentapeptide of repeating oligomer VGLPG (SEQ ID NO: 7), selected from ELP7-15, ELP7-30 and ELP7-60.
[0059] Alternatively, such ELPs can be polymeric or oligomeric repeats of the pentapeptide IPGXG (SEQ ID NO: 11) or LPGXG (SEQ ID NO: 13), where X is as defined hereinabove.
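By way of illustration, the ELP series described above can be assembled programmatically from a repeat template and a guest-residue ratio. The sketch below assumes that the number following the hyphen in a designation such as ELP1-90 counts pentapeptide repeats, and that guest residues are cycled in a fixed order; only the ratio of guest residues is given above, so the ordering used here is an arbitrary assumption. The helper names `build_elp` and `guest_counts` are hypothetical.

```python
from collections import Counter
from itertools import cycle, islice


def build_elp(n_repeats, guests, template="XGVPG"):
    """Concatenate n_repeats copies of template, filling X from guests in turn.

    guests: non-empty string of guest residues, e.g. "VVVVVGGGAA" for 5V:3G:2A.
    Assumption: the order of guest residues is arbitrary; only the ratio is
    taken from the description. Templates without X are repeated unchanged.
    """
    units = (template.replace("X", g) for g in islice(cycle(guests), n_repeats))
    return "".join(units)


def guest_counts(sequence, unit_length=5):
    """Count the residue at the first position of each repeat unit."""
    return Counter(sequence[i] for i in range(0, len(sequence), unit_length))


# Series I, 90 repeats, guest ratio 5V:3G:2A.
elp1_90 = build_elp(90, "VVVVVGGGAA")
# Series II, 64 repeats, guest ratio 1K:2V:1F.
elp2_64 = build_elp(64, "KVVF")
# Series IV, 60 repeats of VGVPG (no variable guest position).
elp4_60 = build_elp(60, "V", template="VGVPG")

print(len(elp1_90), dict(guest_counts(elp1_90)))
```

With these inputs the Series I sequence is 450 residues long and carries 45 Val, 27 Gly and 18 Ala guest residues, which reproduces the stated 5:3:2 ratio.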
[0060] Therefore in one embodiment of the invention, the fusion protein contains a phase transition protein selected from the group consisting of: SEQ ID NOs: 1-17.
[0061] In another embodiment of the invention, the fusion protein contains a phase transition protein selected from the group consisting of: SEQ ID NOs: 1, 2, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17.
[0062] In still another embodiment, the fusion protein contains a phase transition protein selected from the group consisting of: SEQ ID NOs: 1, 2, 9, 10, 11, 12, 13, 14, 15, 16, and 17.
[0063] A further embodiment of the invention relates to a fusion protein containing a phase transition protein, wherein the phase transition protein is selected from the group consisting of: ELP1-90, ELP1-180, ELP2-64, ELP2-128, ELP3-72, ELP3-144, ELP4-60, ELP4-120, ELP5-15, ELP5-30, ELP5-60, ELP6-15, ELP6-30, ELP6-60, ELP7-15, ELP7-30 and ELP7-60.
[0064] In the fusion protein the phase transition protein is joined to the at least one target protein or peptide. Such joining may be on either end of the target protein or peptide, forming either an N- or C-terminal ELP tag.
[0065] The fusion protein of the invention contains at least two residues capable of forming a disulfide bond. In one aspect this comprises the presence of two cysteine residues within the ELP monomer sequence. In another aspect the cysteines are located elsewhere in the fusion protein, either adjacent to the ELP, or separated from the ELP. The cysteines may be located on either side of the ELP within the fusion protein.
[0066] An example of an ELP containing two cysteines within the monomer sequence is set forth below. In the example, the ELP is of a general form "ELPx-y", where x is an indicator of the ELP series and y is the number of oligomeric repeats. In the following exemplified sequence, the ratio of G:V:C:A is 2:5:2:2.
TABLE-US-00001
G V G V P G V G V P G C C V P (SEQ ID NO: 18)
GGCGTGGGTGTTCCGGGCGTGGGTGTTCCGGGTTGCTGCGTGCC (SEQ ID NO: 19)
G A G V P G V G V P G V G V P G V G V P
GGGCGCAGGTGTTCCTGGTGTAGGTGTGCCGGGTGTTGGTGTGCCGGGTGTTGGTGTACC
G G G V P G A G V P G G G V P G
AGGTGGCGGTGTTCCGGGTGCAGGCGTTCCGGGTGGCGGTGTGCCGGGC . . .
[0067] Another example of an ELP containing two cysteines within the monomer sequence is set forth below, where the ratio of G:V:C:A is 1:5:2:2.
TABLE-US-00002
G V G V P G V G V P G C G V P (SEQ ID NO: 20)
GGCGTGGGTGTTCCGGGCGTGGGTGTTCCGGGTTGCGGGGTGCC (SEQ ID NO: 21)
G A G V P G V G V P G V G V P G V G V P
GGGCGCAGGTGTTCCTGGTGTAGGTGTGCCGGGTGTTGGTGTGCCGGGTGTTGGTGTACC
G C G V P G A G V P G G G V P G
AGGTTGCGGTGTTCCGGGTGCAGGCGTTCCGGGTGGCGGTGTGCCGGGC . . .
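The correspondence between the nucleotide sequences and the amino acid sequences exemplified above, and the stated G:V:C:A ratios, can be checked with a short sketch. It translates the printed portions of SEQ ID NO: 19 and SEQ ID NO: 21 with the standard genetic code and then tallies every residue that departs from an XGVPG scaffold. Reading the peptide in the XGVPG register beginning at its second residue is an assumption made for this illustration, as are the names `translate` and `variable_residues`.

```python
from collections import Counter

# Standard genetic code, restricted to the codons present in these sequences.
CODON_TABLE = {
    "GGC": "G", "GGT": "G", "GGG": "G",
    "GTG": "V", "GTT": "V", "GTA": "V",
    "CCG": "P", "CCT": "P", "CCA": "P",
    "GCA": "A", "TGC": "C",
}

# Printed portions of the sequences; the trailing ". . ." is not reproduced.
SEQ19 = (
    "GGCGTGGGTGTTCCGGGCGTGGGTGTTCCGGGTTGCTGCGTGCC"
    "GGGCGCAGGTGTTCCTGGTGTAGGTGTGCCGGGTGTTGGTGTGCCGGGTGTTGGTGTACC"
    "AGGTGGCGGTGTTCCGGGTGCAGGCGTTCCGGGTGGCGGTGTGCCGGGC"
)
SEQ21 = (
    "GGCGTGGGTGTTCCGGGCGTGGGTGTTCCGGGTTGCGGGGTGCC"
    "GGGCGCAGGTGTTCCTGGTGTAGGTGTGCCGGGTGTTGGTGTGCCGGGTGTTGGTGTACC"
    "AGGTTGCGGTGTTCCGGGTGCAGGCGTTCCGGGTGGCGGTGTGCCGGGC"
)


def translate(dna):
    """Translate an in-frame DNA string codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def variable_residues(protein):
    """Tally residues that are not fixed by the XGVPG scaffold.

    Assumption: the first residue is skipped and the remainder is read as
    consecutive XGVPG units. The guest X is always counted; any other
    position is counted only if it differs from G, V, P, G respectively.
    """
    body = protein[1:]
    counts = Counter()
    for i in range(0, len(body), 5):
        unit = body[i:i + 5]
        counts[unit[0]] += 1
        for residue, expected in zip(unit[1:], "GVPG"):
            if residue != expected:
                counts[residue] += 1
    return counts


for name, dna in (("SEQ ID NO: 19", SEQ19), ("SEQ ID NO: 21", SEQ21)):
    peptide = translate(dna)
    tally = variable_residues(peptide)
    print(name, peptide, {k: tally[k] for k in "GVCA"})
```

Under this reading the first sequence yields G:V:C:A of 2:5:2:2, its second cysteine occupying a position that is otherwise glycine, and the second sequence yields 1:5:2:2, both cysteines occupying guest positions.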
[0068] The phase transition of the fusion protein is preferably mediated by one or more mechanisms selected from, but not limited to, changing temperature, changing pH, addition of (organic or inorganic) solutes and/or solvents, side-chain ionization or chemical modification, irradiation with electromagnetic waves (rf, ultrasound, and light) and changing pressure. The preferred mechanisms for mediating the phase transition are raising temperature and adding solutes and/or solvents.
[0069] Optionally, the fusion protein may contain a spacer sequence separating the phase transition protein from the target protein or peptide. The spacer sequence, when present, preferably comprises a cleavage site, e.g., a proteolytic cleavage site, a chemical cleavage site, a photolytic cleavage site, a thermolytic cleavage site, or a cleavage site susceptible to cleavage in the presence of a shear force, pH change, enzymatic agent, ultrasonic or other predetermined frequency field providing energy effective for cleavage. The cleavage modality may be of any of widely varying types, it being necessary only that the cleaving step yield at least one biological molecule (as a cleavage product) that retains functional utility for its intended purpose.
[0070] The fusion peptides of the present invention may also optionally comprise signal peptides for directing secretion of the fusion peptides from the cell, so that the fusion peptides may readily be isolated from the medium of an active culture of recombinant cells genetically modified to produce the fusion peptides. Such signal peptides are preferably cleavable from the fusion protein by enzymatic cleavage.
[0071] Therefore in one embodiment the invention provides a fusion protein comprising a spacer sequence. In another embodiment the invention provides a fusion protein with a spacer sequence that is a proteolytic cleavage site.
[0072] In one embodiment, the invention provides a fusion protein selected from pET17b-SD33-ELP1-90-IFNA2bSD (SEQ ID NO: 22), pET17b-SD33-ELP4-60-IFNA2bSD (SEQ ID NO: 23), pET17b-SD34-ELP1-90-IFNA2bSD (SEQ ID NO: 24), pET17b-SD34-ELP4-60-IFNA2bSD (SEQ ID NO: 25), pET17b-SD22-ELP1-90-IFNA2bSD (SEQ ID NO: 26), pET17b-SD22-ELP4-60-IFNA2bSD (SEQ ID NO: 27), pET17b-SD35-IFNA2bSD-ELP1-90 (SEQ ID NO: 28), pET17b-SD35-IFNA2bSD-ELP4-60 (SEQ ID NO: 29), pET17b-SD37-IFNA2bSD-ELP1-90 (SEQ ID NO: 30), pET17b-SD37-IFNA2bSD-ELP4-60 (SEQ ID NO: 31), pET17b-SD31-IFNA2bSD-ELP1-90 (SEQ ID NO: 32) and pET17b-SD31-IFNA2bSD-ELP4-60 (SEQ ID NO: 33).
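The construct names listed above follow the "ELPx-y" convention, where x is the ELP series and y is the number of oligomeric repeats, and the position of the ELP relative to the target in the name reflects the orientation of the fusion. A minimal Python sketch for reading these fields out of a name is given below. The helper `parse_elp_tag`, the `ElpTag` record and the default target label are hypothetical, and the sketch assumes that name order mirrors N- to C-terminal order.

```python
import re
from typing import NamedTuple, Optional


class ElpTag(NamedTuple):
    series: int    # x in "ELPx-y"
    repeats: int   # y in "ELPx-y"
    position: str  # ELP position relative to the target protein


ELP_PATTERN = re.compile(r"ELP(\d+)-(\d+)")


def parse_elp_tag(construct: str, target: str = "IFNA2bSD") -> Optional[ElpTag]:
    """Extract ELP series, repeat count and orientation from a construct name.

    Assumption: the order of elements in the name mirrors the N- to
    C-terminal order of the encoded fusion protein.
    """
    match = ELP_PATTERN.search(construct)
    if match is None:
        return None
    target_index = construct.find(target)
    if target_index == -1:
        position = "unknown"
    elif match.start() < target_index:
        position = "N-terminal"
    else:
        position = "C-terminal"
    return ElpTag(int(match.group(1)), int(match.group(2)), position)
```

For example, `parse_elp_tag("pET17b-SD35-IFNA2bSD-ELP4-60")` yields series 4 with 60 repeats and a C-terminal ELP tag.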
[0073] In another embodiment the invention provides a fusion protein selected from ELP4-60-S-S-Orexin B, ELP4-60-S-S-MMN, ELP4-60-S-S-NPY and ELP4-60-S-S-Gh.
[0074] Yet another embodiment of the invention relates to a polynucleotide comprising a nucleotide sequence encoding a fusion protein exhibiting a phase transition, comprising:
[0075] a) at least one target protein or peptide;
[0076] b) one or more proteins comprising oligomeric repeats of a polypeptide sequence selected from SEQ ID NOs: 1-17, wherein the one or more proteins exhibits a phase transition and are joined to the at least one target protein or peptide of a);
[0077] c) at least two residues capable of forming a disulfide bond; and
[0078] d) optionally, a spacer sequence separating any of the phase transition protein(s) of b) from any of the target protein(s) or peptide(s) of a).
[0079] The invention in a further aspect relates to a polynucleotide selected from pET17b-SD33-ELP1-90-IFNA2bSD (SEQ ID NO: 34), pET17b-SD33-ELP4-60-IFNA2bSD (SEQ ID NO: 35), pET17b-SD34-ELP1-90-IFNA2bSD (SEQ ID NO: 36), pET17b-SD34-ELP4-60-IFNA2bSD (SEQ ID NO: 37), pET17b-SD22-ELP1-90-IFNA2bSD (SEQ ID NO: 38), pET17b-SD22-ELP4-60-IFNA2bSD (SEQ ID NO: 39), pET17b-SD35-IFNA2bSD-ELP1-90 (SEQ ID NO: 40), pET17b-SD35-IFNA2bSD-ELP4-60 (SEQ ID NO: 41), pET17b-SD37-IFNA2bSD-ELP1-90 (SEQ ID NO: 42), pET17b-SD37-IFNA2bSD-ELP4-60 (SEQ ID NO: 43), pET17b-SD31-IFNA2bSD-ELP1-90 (SEQ ID NO: 44) and pET17b-SD31-IFNA2bSD-ELP4-60 (SEQ ID NO: 45).
[0080] In still another embodiment the invention provides an expression vector comprising a polynucleotide encoding a fusion protein of the invention. In yet another embodiment the invention provides a host cell transformed by such an expression vector, where the host cell expresses the fusion protein.
[0081] In a still further embodiment, the invention provides an ELP spider complex comprising two or more fusion proteins of the invention, covalently linked by at least one disulfide bond.
[0082] The invention also provides a spider complex which includes two or more fusion proteins exhibiting a phase transition comprising:
[0083] a) at least one target protein or peptide;
[0084] b) one or more proteins comprising oligomeric repeats of a polypeptide sequence selected from SEQ ID NOs: 1-17, wherein the one or more proteins exhibits a phase transition and are joined to the at least one target protein or peptide of a);
[0085] c) at least two residues capable of forming a disulfide bond; and
[0086] d) optionally, a spacer sequence separating any of the phase transition protein(s) of b) from any of the target protein(s) or peptide(s) of a),
[0087] wherein the two or more fusion proteins exhibiting a phase transition are covalently linked by at least one disulfide bond.
[0088] The invention also provides methods of utilizing the fusion proteins and spider complexes as discussed herein.
[0089] In one aspect the invention relates to a method of providing a purified protein of interest, comprising: contacting a fusion protein comprising the protein of interest and an ELP tag, wherein the fusion protein contains at least one cleavage site that is cleavable to yield the protein of interest as a cleavage product, with ELP-TEV1 that is effective to cleave said cleavage site, thereby yielding said protein of interest as a cleavage product in a cleavage product mixture comprising said ELP tag, any uncleaved fusion protein and said ELP-tagged cleavage agent; and separating the protein of interest from the cleavage product mixture by inverse phase transition.
[0090] In another aspect the invention provides a method of enhancing proteolytic resistance of a protein or peptide moiety in an ELP-based fusion peptide, comprising provision of the ELP-based fusion peptide in an ELP spider complex form.
[0091] Examples of spider complexes discussed herein include Orexin B/ELP and IFNa2bSD/ELP spider constructs, but the spider complex approach is broadly applicable to a wide spectrum of other proteins and peptides, and has particular utility for proteins or peptides that are susceptible to proteolytic degradation.
[0092] Methods of protection of target proteins or peptides by the spider construct may involve, but are not limited to: (i) slowing or stopping degradative action of proteases, (ii) decreasing or eliminating non-specific associated proteins that may make the target protein or peptide insoluble or prevent the target protein or peptide from properly folding, (iii) increasing the amount of total spider construct concentration in the cell, and/or (iv) exposure of a region of the target protein or peptide subject to proteolysis by the ELP fused to TEV protease when produced in the absence of disulfide bonds. Disulfides formed between the ELP and the TEV site have been tested. Spacing may be desirably adjusted for such bonding, e.g., with intra-disulfide bonds being separated by at least 2 amino acid residues, such as for example by a -Cys-Cys- moiety. Other techniques for increasing inter- and decreasing intra-disulfide bond formation can be employed.
[0093] FIG. 1A is a schematic representation of an ELP fusion protein including a -Cys-Cys- moiety after the ELP and before the TEV site and the peptide. FIG. 1B is a schematic representation of a corresponding spider construct, for pET17b-SD34-ELP4-60-Orexin B (as set forth below in Example 1), and transformation into a BL21 trxB- strain, allowing disulfide bond formation between the fusion proteins to take place in the cytoplasm.
[0094] The following examples are intended to illustrate, but not limit the invention.
Example 1
Comparison of ELP-Orexin B Normal Fusion Protein and ELP-Orexin B Spider Construct
[0095] Fusion protein constructs of ELP4-60-Orexin B (Normal) and ELP4-60-S-S-Orexin B (Spider) were generated.
[0096] FIG. 2 is an SDS-PAGE gel comparison of the two fusion protein constructs. (Lane 1: ELP4-60-Orexin B or ELP4-60-S-S-Orexin B; Lane 2: +ELP1-90-TEV; Lane 3: insolubles following Tt (I); and Lane 4: solubles following Tt (S)).
[0097] FIG. 3 shows an LC-MS analysis of Orexin B derived from ELP4-60-Orexin B (FIG. 3A) and from ELP4-60-S-S-Orexin B (FIG. 3B). In FIG. 3A the level of Orexin B was too low to detect by LC-MS, so its theoretical location was estimated from the location of Orexin B in FIG. 3B. In FIG. 3B the Orexin B was 85% pure; the major peak contains the correct molecular weight of Orexin B and the minor peak is a proteolytic fragment of Orexin B.
[0098] The original ELP4-60-Orexin B appeared to purify well; however, a substantial amount was difficult to resuspend and purify away from insoluble contaminants. Only a portion of Orexin B was cleaved following 18 hr digestion with ELP1-90-TEV. Once cleaved, 50% was insoluble following the final transition to eliminate uncleaved ELP4-60-Orexin B, ELP1-90-TEV and ELP4-60. The level of Orexin B was too low to analyze by LC-MS. Loading the gel with 10× more cleaved ELP4-60-Orexin B indicated proteolysis had occurred prior to or during cleavage.
[0099] ELP4-60-S-S-Orexin B was much easier to purify away from contaminants. Complete cleavage occurred following 18 hr digestion with ELP1-90-TEV. A minor amount of Orexin B remained insoluble following the final transition to eliminate uncleaved ELP4-60-Orexin B, ELP1-90-TEV, and ELP4-60. LC-MS analysis indicated the largest peak contained the correct molecular weight Orexin B peptide and was 85% of total peaks. The minor peak was a proteolytic fragment of Orexin B.
[0100] The fusion protein constructs of ELP4-60-Orexin B (Normal) and ELP4-60-S-S-Orexin B (Spider) were expressed in E. coli strains BL21 and trxB. The results are summarized in FIG. 4. FIG. 4 is an SDS-PAGE gel analysis, showing that use of a spider construct influences the amount of purification, the cleavage with ELP4-60-TEV and proteolysis. All samples in the comparison were reduced with DTT prior to loading on the SDS-gel.
[0101] It can be seen in FIG. 4 that in both BL21 and trxB, more spider constructs (lanes 2, 4, 6, 8) were produced than normal constructs (lanes 1, 3, 5, 7). Disulfide bond formation within the cytoplasm did not influence the amount of spider construct made. The addition of DTT during purification (lanes 6, 8) did not change the amount of spider construct purified relative to purification in the absence of DTT (lanes 2, 4). Cleavage with ELP1-90-TEV was not complete when spider constructs were produced in BL21* and purified in the absence of DTT (lane 2). Spider constructs were degraded when produced in BL21* and exposed to DTT during purification (lane 6). The presence of disulfide bonds during expression produced more cleaved Orexin B (lanes 4, 8).
Example 2
Spider Constructs of Additional Peptides
[0102] FIG. 5 is an SDS-PAGE gel analysis of ELP-spider constructs of three peptides: MMN, NPY and Gh. For each peptide, the lanes are as follows: Lane 1: ELP4-60-S-S-peptide (reduced with DTT); Lane 2: +ELP1-90-TEV; Lane 3: insolubles following Tt (I); and Lane 4: solubles following Tt (S).
[0103] Comparisons of spider constructs and normal constructs were as follows:
[0104] MMN peptide: ELP4-60-MMN (58 mg/liter) was not difficult to purify and resulted in 6 mg/liter MMN. ELP4-60-S-S-MMN (180 mg/liter) was not difficult to purify and resulted in 13 mg/liter MMN. It can be seen that use of spider constructs increased the amount of MMN purified.
[0105] NPY peptide: ELP4-60-NPY (62 mg/liter) was not difficult to purify and resulted in 7 mg/liter NPY. ELP4-60-S-S-NPY (222 mg/liter) was not difficult to purify and resulted in 20 mg/liter NPY. It can be seen that use of spider constructs increased the amount of NPY purified, it being noted that NPY does not stain well.
[0106] Gh peptide: ELP4-60-GH was difficult to purify and resulted in 0 mg/liter Gh. ELP4-60-S-S-GH was not as difficult to purify and resulted in approx 2 mg/liter Gh. A spider construct only partially eliminated Gh degradation. The amount of non-degraded Gh was too low for LC-MS determination. These factors were exacerbated by ELP4-60-Gh transitioning at room temperature (RT):
[0107] a. ELP4-60-GH had to be diluted 10 fold compared to MMN and NPY to prevent transitioning at RT during cleavage with ELP1-90-TEV.
[0108] b. The 10-fold dilution made the final concentration of GH too low for LC-MS analysis.
[0109] c. Incomplete cleavage with ELP1-90-TEV may be due to a fraction of ELP4-60-GH that continued to transition at RT even after a 10 fold dilution.
[0110] d. The low transition temperature of ELP4-60-GH may also be due to contaminants that could not be eliminated through 3 temperature transitions.
[0111] It can be seen that use of spider constructs increased the amount of Gh purified.
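The yields reported above for MMN and NPY can be compared directly. The following Python sketch simply tabulates the reported mg/liter values and computes the spider-to-normal ratio; the dictionary layout and the function name `spider_fold_increase` are illustrative assumptions, and Gh is omitted because no fusion protein yield is reported for it.

```python
# Yields in mg/liter as reported in the comparison above.
REPORTED_MG_PER_L = {
    ("MMN", "normal"): {"fusion": 58.0, "peptide": 6.0},
    ("MMN", "spider"): {"fusion": 180.0, "peptide": 13.0},
    ("NPY", "normal"): {"fusion": 62.0, "peptide": 7.0},
    ("NPY", "spider"): {"fusion": 222.0, "peptide": 20.0},
}


def spider_fold_increase(peptide: str, quantity: str) -> float:
    """Ratio of spider to normal yield for 'fusion' or 'peptide'."""
    normal = REPORTED_MG_PER_L[(peptide, "normal")][quantity]
    spider = REPORTED_MG_PER_L[(peptide, "spider")][quantity]
    return spider / normal


for name in ("MMN", "NPY"):
    fusion_ratio = spider_fold_increase(name, "fusion")
    peptide_ratio = spider_fold_increase(name, "peptide")
    print(f"{name}: fusion x{fusion_ratio:.1f}, peptide x{peptide_ratio:.1f}")
```

On these figures the spider constructs gave roughly three times more fusion protein and between two and three times more purified peptide than the normal constructs.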
[0112] With all three peptides (MMN, NPY and Gh), use of spider constructs may also act to buffer the possible toxic effect of the peptide and allow more ELP-peptide to be produced per cell, in addition to decreasing proteolysis.
[0113] FIG. 6 shows LC-MS analysis for non-spider peptides. MMN from ELP4-60-S-S-MMN: 99.3% pure 13 mg/L (FIG. 6A) and NPY from ELP4-60-S-S-NPY: 98.1% pure 20 mg/L (FIG. 6B).
[0114] FIG. 7 shows LC-MS analysis for pro-CT from ELP1-90-pro-CT: 98.0% pure (FIG. 7A) and Leptin from ELP1-90-Leptin: 85% pure (FIG. 7B).
Example 3
Construction of pET17b-SD35-ELP
[0115] SD35 forward and reverse oligos were annealed, forming 5' XhoI and 3' StyI overhangs:
TABLE-US-00003
SD35 forward oligo: (SEQ ID NO: 46) TCGAGAACCTGTATTTCCAGGGCGGGTGCTGCGGC
SD35 reverse oligo: (SEQ ID NO: 47) CTTGGCCGCAGCAGCCGCCCTGGAAATACAGGTTC
Annealed oligos: (SEQ ID NO: 48)
L E N L Y F Q G G C C G Q G
cTCGAGAACCTGTATTTCCAGGGCGGGTGCTGCGGCcaagg
(SEQ ID NO: 49) gagctCTTGGACATAAAGGTCCCGCCCACGACGCCGGTTCc
XhoI StyI
[0116] The annealed SD35 oligos were ligated into pUC19-SD31-ELP, digested with XhoI and StyI and 5' dephosphorylated with CIP to create pUC19-SD35-ELP. The pUC19-SD35-ELP XbaI-EcoRI fragment containing SD35-ELP was subcloned into pET17b, digested with XbaI and EcoRI and 5' dephosphorylated with CIP.
[0117] In this spider construct, the Cys-Cys is placed following the TEV cleavage site at the amino terminus of the ELP to create a protein/peptide-ELP orientation.
[0118] Individual spider constructs pET17b-SD35-ELP1-90 (SEQ ID NO: 50) and pET17b-SD35-ELP4-60 (SEQ ID NO: 51) were created.
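Annealed oligo pairs such as those used in this example can be sanity-checked by comparing the forward oligo with the reverse complement of the reverse oligo across the duplex region. The following Python sketch is one way to do that; the function names and the `five_prime_overhang` parameter are illustrative assumptions, and the 4-base overhang used in the usage note corresponds to the XhoI sticky end.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def annealing_mismatches(
    forward: str, reverse: str, five_prime_overhang: int
) -> List[Tuple[int, str, str]]:
    """List positions where the two oligos fail to base-pair.

    five_prime_overhang is the number of unpaired bases at the 5' end of
    the forward oligo. Each entry is (index in forward oligo, forward
    base, base expected from the reverse oligo).
    """
    top = forward.upper()[five_prime_overhang:]
    bottom_as_top = reverse_complement(reverse)
    paired = min(len(top), len(bottom_as_top))
    return [
        (five_prime_overhang + i, top[i], bottom_as_top[i])
        for i in range(paired)
        if top[i] != bottom_as_top[i]
    ]


# SD35 oligos exactly as printed in TABLE-US-00003.
SD35_FORWARD = "TCGAGAACCTGTATTTCCAGGGCGGGTGCTGCGGC"
SD35_REVERSE = "CTTGGCCGCAGCAGCCGCCCTGGAAATACAGGTTC"
sd35_mismatches = annealing_mismatches(SD35_FORWARD, SD35_REVERSE, 4)
```

Run on the SD35 sequences as printed, the check reports a single non-pairing position (index 25 of the forward oligo, 0-based). This may reflect a transcription error in one of the printed listings rather than the intended design, since the annealed duplex shown in the table is fully complementary at that position.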
Example 4
Construction of pET17b-SD37-ELP
[0119] SD37 forward and reverse oligos were annealed, forming BglI and NheI overhangs:
TABLE-US-00004
SD37 forward oligo: TGGCCTTGCTGCTGATAAG (SEQ ID NO: 52)
SD37 reverse oligo: CTAGCTTATCAGCAGCAAGGCCAGCC (SEQ ID NO: 53)
Annealed oligos:
P G W P C C * * A S (SEQ ID NO: 54)
gccgggcTGGCCTTGCTGCTGATAAGctagc
cggcCCGACCGGAACGACGACTATTCGATCg (SEQ ID NO: 55)
BglI NheI
[0120] The annealed SD37 oligos were ligated into pET17b-SD31-ELP, digested with BglI and NheI and 5' dephosphorylated with CIP to create pET17b-SD37-ELP.
[0121] In this spider construct, the Cys-Cys is placed at the carboxyl terminus of the ELP to create a protein/peptide-ELP orientation.
[0122] Individual spider constructs pET17b-SD37-ELP1-90 (SEQ ID NO: 56) and pET17b-SD37-ELP4-60 (SEQ ID NO: 57) were created.
Example 5
Construction of pET17b-SD33-ELP
[0123] SD33 forward and reverse oligos were annealed to form XbaI and NcoI overhangs:
TABLE-US-00005
SD33 forward oligo: CTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACCATGTGCTGCCC (SEQ ID NO: 58)
SD33 reverse oligo: CATGGGGCAGCACATGGTATATCTCCTTCTTAAAGTTAAACAAAATTATTT (SEQ ID NO: 59)
Annealed oligos:
M C C P M G (SEQ ID NO: 60)
tCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACCATGTGCTGCCCcatgg
agatcTTTATTAAAACAAATTGAAATTCTTCCTCTATATGGTACACGACGGGGTACc (SEQ ID NO: 61)
XbaI NcoI
[0124] The annealed SD33 oligos were ligated into pET17b-SD22-ELP digested with XbaI and NcoI and 5' dephosphorylated with CIP to create pET17b-SD33-ELP.
[0125] In this spider construct, the Cys-Cys is placed at the amino terminus of the ELP to create an ELP-protein/peptide orientation.
[0126] Individual spider constructs pET17b-SD33-ELP1-90 (SEQ ID NO: 62) and pET17b-SD33-ELP4-60 (SEQ ID NO: 63) were created.
Example 6
Construction of pET17b-SD34-ELP
[0127] SD34 forward and reverse oligos were annealed to form BglI and EcoRV overhangs:
TABLE-US-00006
SD34 forward oligo: TGCCCGTGCTGCAGCAGCGGTGAT (SEQ ID NO: 64)
SD34 reverse oligo: ATCACCGCTGCTGCAGCACGGCCAGCC (SEQ ID NO: 65)
[0128] The annealed SD34 oligos were ligated into pET17b-SD22-ELP digested with BglI and EcoRV and 3' dephosphorylated to create pET17b-SD34-ELP.
[0129] In this spider construct, the Cys-Cys is placed at the carboxyl terminus of the ELP to create an ELP-protein/peptide orientation.
[0130] Individual spider constructs pET17b-SD34-ELP1-90 (SEQ ID NO: 68) and pET17b-SD34-ELP4-60 (SEQ ID NO: 69) were created.
Example 7
Construction of Spider Constructs Containing IFNa2bSD
[0131] PCR amplification was used to generate the following DNA fragments from a Human cDNA library containing IFNA2b:
[0132] IFNa2bSD NdeI-XhoI containing STOP codon:
TABLE-US-00007 (SEQ ID NO: 70) CATAT 5150 GTGTGATCTGCCTCAAACCCACAGCCTGGGTAGCAGGAGGACCTTGATGC 5200 TCCTGGCACAGATGAGGAGAATCTCTCTTTTCTCCTGCTTGAAGGACAGA 5250 CATGACTTTGGATTTCCCCAGGAGGAGTTTGGCAACCAGTTCCAAAAGGC 5300 TGAAACCATCCCTGTCCTCCATGAGATGATCCAGCAGATCTTCAATCTCT 5350 TCAGCACAAAGGACTCATCTGCTGCTTGGGATGAGACCCTCCTAGACAAA 5400 TTCTACACTGAACTCTACCAGCAGCTGAATGACCTGGAAGCCTGTGTGAT 5450 ACAGGGGGTGGGGGTGACAGAGACTCCCCTGATGAAGGAGGACTCCATTC 5500 TGGCTGTGAGGAAATACTTCCAAAGAATCACTCTCTATCTGAAAGAGAAG 5550 AAATACAGCCCTTGTGCCTGGGAGGTTGTCAGAGCAGAAATCATGAGATC 5600 TTTTTCTTTGTCAACAAACTTGCAAGAAAGTTTAAGAAGTAAGGAATAACTCGAG
[0133] IFNa2bSD NdeI-XhoI, not containing STOP codon:
TABLE-US-00008 (SEQ ID NO: 71) CATAT 5150 GTGTGATCTGCCTCAAACCCACAGCCTGGGTAGCAGGAGGACCTTGATGC 5200 TCCTGGCACAGATGAGGAGAATCTCTCTTTTCTCCTGCTTGAAGGACAGA 5250 CATGACTTTGGATTTCCCCAGGAGGAGTTTGGCAACCAGTTCCAAAAGGC 5300 TGAAACCATCCCTGTCCTCCATGAGATGATCCAGCAGATCTTCAATCTCT 5350 TCAGCACAAAGGACTCATCTGCTGCTTGGGATGAGACCCTCCTAGACAAA 5400 TTCTACACTGAACTCTACCAGCAGCTGAATGACCTGGAAGCCTGTGTGAT 5450 ACAGGGGGTGGGGGTGACAGAGACTCCCCTGATGAAGGAGGACTCCATTC 5500 TGGCTGTGAGGAAATACTTCCAAAGAATCACTCTCTATCTGAAAGAGAAG 5550 AAATACAGCCCTTGTGCCTGGGAGGTTGTCAGAGCAGAAATCATGAGATC 5600 TTTTTCTTTGTCAACAAACTTGCAAGAAAGTTTAAGAAGTAAGGAACTCGAG
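The two PCR fragments above differ only at the 3' end, where the first carries a stop codon immediately before the XhoI site and the second does not. A minimal Python sketch for checking this on the printed sequences follows. The helper names are hypothetical, and the check looks only at the three bases directly upstream of the terminal XhoI site without verifying the full reading frame.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}
NDEI_SITE = "CATATG"
XHOI_SITE = "CTCGAG"


def clean_sequence(raw: str) -> str:
    """Drop position numbers, spaces and other non-base characters."""
    return re.sub(r"[^ACGTacgt]", "", raw).upper()


def has_stop_before_xhoi(raw: str) -> bool:
    """True if the codon directly upstream of a terminal XhoI site is a stop."""
    seq = clean_sequence(raw)
    if not seq.endswith(XHOI_SITE) or len(seq) < len(XHOI_SITE) + 3:
        return False
    codon = seq[-len(XHOI_SITE) - 3:-len(XHOI_SITE)]
    return codon in STOP_CODONS


# 5' start shared by both fragments, with its printed position number.
FRAGMENT_START = "CATAT 5150 GTGTGATCTG"
# 3' ends of SEQ ID NO: 70 and SEQ ID NO: 71 as printed above.
TAIL_WITH_STOP = "GTTTAAGAAGTAAGGAATAACTCGAG"
TAIL_WITHOUT_STOP = "GTTTAAGAAGTAAGGAACTCGAG"

starts_with_ndei = clean_sequence(FRAGMENT_START).startswith(NDEI_SITE)
```

Here `has_stop_before_xhoi(TAIL_WITH_STOP)` is true and `has_stop_before_xhoi(TAIL_WITHOUT_STOP)` is false, consistent with the first fragment being used for ELP-IFNA2bSD constructs and the second for IFNA2bSD-ELP constructs.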
[0134] IFNA2bSD with stop codon was ligated into pET17b-SD33-ELP, pET17b-SD34-ELP and pET17b-SD22-ELP digested with NdeI and partially digested with XhoI and 5' dephosphorylated with CIP to create pET17b-SD33-ELP-IFNA2bSD, pET17b-SD34-ELP-IFNA2bSD and pET17b-SD22-ELP-IFNA2bSD respectively.
[0135] IFNA2bSD without stop codon was ligated into pET17b-SD35-ELP, pET17b-SD37-ELP and pET17b-SD31-ELP digested with NdeI and partially digested with XhoI and 5' dephosphorylated with CIP to create pET17b-SD35-IFNA2bSD-ELP, pET17b-SD37-IFNA2bSD-ELP and pET17b-SD31-IFNA2bSD-ELP respectively.
[0136] Resulting spider constructs created included pET17b-SD33-ELP1-90-IFNA2bSD (SEQ ID NO: 34), pET17b-SD33-ELP4-60-IFNA2bSD (SEQ ID NO: 35), pET17b-SD34-ELP1-90-IFNA2bSD (SEQ ID NO: 36), pET17b-SD34-ELP4-60-IFNA2bSD (SEQ ID NO: 37), pET17b-SD22-ELP1-90-IFNA2bSD (SEQ ID NO: 38), pET17b-SD22-ELP4-60-IFNA2bSD (SEQ ID NO: 39), pET17b-SD35-IFNA2bSD-ELP1-90 (SEQ ID NO: 40), pET17b-SD35-IFNA2bSD-ELP4-60 (SEQ ID NO: 41), pET17b-SD37-IFNA2bSD-ELP1-90 (SEQ ID NO: 42), pET17b-SD37-IFNA2bSD-ELP4-60 (SEQ ID NO: 43), pET17b-SD31-IFNA2bSD-ELP1-90 (SEQ ID NO: 44) and pET17b-SD31-IFNA2bSD-ELP4-60 (SEQ ID NO: 45). Protein translations of these spider constructs are SEQ ID NOs: 22-33.
Example 8
Purification of ELP Spider Proteins
[0137] E. coli strain BL21trxB-(DE3) F-ompT hsdSB(rB-mB-) gal dcm trxB15::kan (DE3) (Novagen) containing the ELP/protein construct was inoculated into 5 ml TB supplemented with 100 mM proline, 4% glycerol, phosphate buffer and ampicillin. Cultures were grown for 5 hrs at 37° C. before being transferred at 1:100 dilutions into the same media and grown for 43 hrs at 25° C. unless otherwise noted. The cultures were harvested and resuspended at 10 ml/gram wet weight in the following buffer: 50 mM Tris pH 7.0, 1 mM EDTA. Cells were lysed by ultrasonic disruption on ice for 3 minutes, consisting of 15 second bursts at 60 W separated by 15 second cooling intervals (Sonicate). Cell debris was removed by centrifugation at 20,000 g at 4° C. for 30 minutes. The insoluble pellet was resuspended in the original buffer and volume (Pellet). Soluble material comprised the lysate. Inverse temperature transition was induced by adding NaCl to a final concentration of 1.0-2.0 M to the lysate at 25° C., followed by centrifugation at 20,000 g for 15 minutes at 25° C. The resulting pellet contained ELP/protein fusions. The pellet was resuspended in the original volume of ice-cold buffer and centrifuged at 20,000 g, 4° C. for 15 minutes to remove non-specific insoluble proteins (Tt1). The temperature transition cycle was repeated one additional time to increase the purity of the ELP/protein fusions and reduce the final volume (Tt2).
[0138] The above purification was performed for spider constructs created in Example 7.
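The volumes and salt additions in the purification protocol above follow from simple proportional arithmetic. The Python sketch below illustrates the calculations under stated assumptions: the function names are hypothetical, the solid-salt calculation ignores the volume change on dissolving NaCl, and the 5 M stock concentration is an assumed value not given in the text.

```python
NACL_MOLAR_MASS_G_PER_MOL = 58.44


def resuspension_volume_ml(wet_weight_g: float, ml_per_gram: float = 10.0) -> float:
    """Buffer volume for the cell pellet at 10 ml per gram wet weight."""
    return wet_weight_g * ml_per_gram


def solid_nacl_grams(lysate_volume_ml: float, target_molar: float) -> float:
    """Mass of solid NaCl for a target molarity (volume change ignored)."""
    return target_molar * (lysate_volume_ml / 1000.0) * NACL_MOLAR_MASS_G_PER_MOL


def stock_volume_ml(
    lysate_volume_ml: float, target_molar: float, stock_molar: float = 5.0
) -> float:
    """Volume of concentrated NaCl stock to add to reach the target molarity.

    Solves stock_molar * v = target_molar * (lysate_volume_ml + v) for v.
    The 5 M default is an assumption, not a value from the protocol.
    """
    if stock_molar <= target_molar:
        raise ValueError("stock must be more concentrated than the target")
    return lysate_volume_ml * target_molar / (stock_molar - target_molar)
```

For instance, a 2.5 g wet pellet would be resuspended in 25 ml of buffer, and bringing 40 ml of lysate to 1.0 M NaCl with a 5 M stock would take 10 ml of stock.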
[0139] Initially the spider constructs were tested after growth for 15 hours at 25° C. and compared to non-spider protein constructs containing IFNa2bSD to determine which were most likely to produce soluble protein. Results are shown in FIG. 8.
[0140] Selected constructs from FIG. 8 were re-expressed for 48 hours at 25° C. and purified using two inverse phase transitions as set forth above. The results of the re-expression are shown in FIG. 9. It can be seen that spider constructs clearly express more initial fusion protein than their non-spider counterparts.
Example 9
ELP1-90-TEV1 Cleavage
[0141] In order to purify cleaved protein from ELP, uncleaved protein and protease, the present invention provides an optimized purification of ELP TEV protease (ELP-TEV1) for inverse phase transition removal once cleavage has been completed. ELP1-90-TEV1 was added at a 1:100 dilution of the protein concentration in Tt2 and supplemented with 1 mM DTT (Tt2+ELP1-90-TEV1). Cleavage was allowed to proceed for 15 hrs at 4° C. Free protein was separated from free ELP and uncleaved ELP/protein fusions by adding 1M NaCl at 25° C., followed by centrifugation at 25° C. Salt transitioned material (ELP and ELP fusions) was resuspended in cold buffer (Insoluble). Salt soluble protein was transferred to a new tube (Soluble). Results can be seen in FIG. 10. Although TEV cleavage was incomplete when performed at 4° C. rather than 25° C., the spider construct not only expressed more soluble ELP fusion protein but also produced 3-4 times more soluble IFNa2bSD following cleavage with ELP1-90-TEV1 and a final transition to separate IFNa2bSD from free ELP, ELP fusions and ELP1-90-TEV1.
[0142] Resulting cleaved proteins were as follows:
TABLE-US-00009 (SEQ ID NO: 72) pET17b-SD33-IFINA2bSD-ELP1-90, cleaved with TEV: GAHMCDLPQTHSLGSRRILMLLAQMRRISLFSCLKDRHDFGFPQEEFG NQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQL NDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAW EVVRAEIMRSFSLSTNLQESLRSKE pET17b-SD33-IFNA2bSD-ELP4-60, cleaved with TEV: (SEQ ID NO: 73) GAHMCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFG NQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQL NDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAW EVVRAEIMRSFSLSTNLQESLRSKE pET17b-SD34-IFNA2bSD-ELP1-90, cleaved with TEV: (SEQ ID NO: 74) GAHMCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFG NQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQL NDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAW EVVRAEIMRSFSLSTNLQESLRSKE pET17b-SD34-IFNA2bSD-ELP4-60, cleaved with TEV: (SEQ ID NO: 75) GAHMCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFG NQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQL NDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAW EVVRAEIMRSFSLSTNLQESLRSKE pET17b-SD22-ELP1-90-IFNA2bSD, cleaved with TEV: (SEQ ID NO: 76) GAHMCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFG NQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQL NDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAW EVVRAEIMRSFSLSTNLQESLRSKE pET17b-SD22-ELP4-60-IFNA2bSD, cleaved with TEV: (SEQ ID NO: 77) GAHMCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFG NQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQL NDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAW EVVRAEIMRSFSLSTNLQESLRSKE pET17b-SD35-IFNA2bSD-ELP1-90, cleaved with TEV: (SEQ ID NO: 78) MCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDPGFPQEEFGNQF QKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDL EACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVV RAEIMRSFSLSTNLQESLRSKELENLYFQ pET17b-SD35-IFNA2bSD-ELP4-60, cleaved with TEV: (SEQ ID NO: 79) MCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFGNQF QKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDL EACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVV RAEIMRSFSLSTNLQESLRSKELENLYFQ pET17b-SD37-IFNA2bSD-ELP1-90, cleaved with TEV: (SEQ ID NO: 80) 
MCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFGNQF QKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDL EACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVV RAEIMRSFSLSTNLQESLRSKELENLYFQ pET17b-SD37-IFNA2bSD-ELP4-60 cleaved with TEV: (SEQ ID NO: 81) MCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFGNQF QKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDL EACVIQCVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVV RAEIMRSFSLSTNLQESLRSKELENLYFQ pET17b-SD31-IFNA2bSD-ELP4-60, cleaved with TEV: (SEQ ID NO: 82) MCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFGNQF QKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDL EACVIQGVCVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVV RAEIMRSFSLSTNLQESLRSKELENLYFQ pET17b-SD31-IFNA2bSD-ELP4-60, cleaved with TEV: (SEQ ID NO: 83) MCDLPQTHSLGSRRTLMLLAQMRRISLFSCLKDRHDFGFPQEEFGNQF QKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDL EACVIQGVCVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVV RAEIMRSFSLSTNLQESLRSKELENLYFQ
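The N- and C-terminal residues of the cleaved proteins listed above are consistent with TEV protease cutting its recognition sequence between the glutamine and the following glycine (ENLYFQ/G). The Python sketch below illustrates this on short linker regions. The function `tev_cleave` is a hypothetical helper, it considers only the ENLYFQG form of the site, and the example strings are short excerpts rather than full fusion proteins.

```python
from typing import List

TEV_SITE = "ENLYFQG"  # cleavage occurs between Q and G


def tev_cleave(protein: str) -> List[str]:
    """Split a one-letter protein sequence at every ENLYFQ/G site."""
    fragments = []
    start = 0
    while True:
        hit = protein.find(TEV_SITE, start)
        if hit == -1:
            break
        cut = hit + len(TEV_SITE) - 1  # index of the G that begins the next fragment
        fragments.append(protein[start:cut])
        start = cut
    fragments.append(protein[start:])
    return fragments


# ELP-target orientation: linker and start of IFNa2bSD as in SEQ ID NO: 22.
N_TERMINAL_ELP_EXCERPT = "VPGWPSSGDYDIPTTENLYFQGAHMCDLPQTHSLG"
# Target-ELP orientation: end of IFNa2bSD followed by the SD35 linker.
C_TERMINAL_ELP_EXCERPT = "SLRSKELENLYFQGGCCGQG"

released_after_elp = tev_cleave(N_TERMINAL_ELP_EXCERPT)[1]
released_before_elp = tev_cleave(C_TERMINAL_ELP_EXCERPT)[0]
```

The first excerpt releases a target beginning GAHMCDL, as in SEQ ID NOs: 72-77, while the second releases a target ending in LENLYFQ, as in SEQ ID NOs: 78-83.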
[0143] While the invention has been described herein in reference to specific aspects, features and illustrative embodiments of the invention, it will be appreciated that the utility of the invention is not thus limited, but rather extends to and encompasses numerous other variations, modifications and alternative embodiments, as will suggest themselves to those of ordinary skill in the field of the present invention, based on the disclosure herein. Correspondingly, the invention as hereinafter claimed is intended to be broadly construed and interpreted, as including all such variations, modifications and alternative embodiments, within its spirit and scope.
Sequence Listing
Number of sequences: 87
SEQ ID NO: 1 (4 residues, PRT, Artificial Sequence, Synthetic Construct): Val Pro Gly Gly
SEQ ID NO: 2 (4 residues, PRT, Artificial Sequence, Synthetic Construct): Ile Pro Gly Gly
SEQ ID NO: 3 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Xaa Gly Val Pro Gly
SEQ ID NO: 4 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Val Gly Val Pro Gly
SEQ ID NO: 5 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Val Pro Ala Val Gly
SEQ ID NO: 6 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Gly Val Gly Ile Pro
SEQ ID NO: 7 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Val Gly Leu Pro Gly
SEQ ID NO: 8 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Val Pro Gly Xaa Gly
SEQ ID NO: 9 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Ala Val Gly Pro Gly
SEQ ID NO: 10 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Ile Pro Gly Val Gly
SEQ ID NO: 11 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Ile Pro Gly Xaa Gly
SEQ ID NO: 12 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Leu Pro Gly Val Gly
SEQ ID NO: 13 (5 residues, PRT, Artificial Sequence, Synthetic Construct): Leu Pro Gly Xaa Gly
SEQ ID NO: 14 (6 residues, PRT, Artificial Sequence, Synthetic Construct): Val Ala Pro Gly Val Gly
SEQ ID NO: 15 (8 residues, PRT, Artificial Sequence, Synthetic Construct): Gly Val Gly Val Pro Gly Val Gly
SEQ ID NO: 16 (9 residues, PRT, Artificial Sequence, Synthetic Construct): Val Pro Gly Phe Gly Val Gly Ala Gly
SEQ ID NO: 17 (9 residues, PRT, Artificial Sequence, Synthetic Construct): Val Pro Gly Val Gly Val Pro Gly Gly
SEQ ID NO: 18 (51 residues, PRT, Artificial Sequence, Synthetic Construct):
18Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Cys Cys Val Pro Gly1
5 10 15Ala Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Val 20 25
30Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro
Gly Gly Gly 35 40 45Val Pro Gly
5019153DNAArtificial SequenceSynthetic Construct 19ggcgtgggtg
ttccgggcgt gggtgttccg ggttgctgcg tgccgggcgc aggtgttcct 60ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 120ggtgcaggcg
ttccgggtgg cggtgtgccg ggc
1532051PRTArtificial SequenceSynthetic Construct 20Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly Cys Gly Val Pro Gly1 5
10 15Ala Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Val 20 25
30Gly Val Pro Gly Cys Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly
35 40 45Val Pro Gly
5021153DNAArtificial SequenceSynthetic construct 21ggcgtgggtg ttccgggcgt
gggtgttccg ggttgcgggg tgccgggcgc aggtgttcct 60ggtgtaggtg tgccgggtgt
tggtgtgccg ggtgttggtg taccaggttg cggtgttccg 120ggtgcaggcg ttccgggtgg
cggtgtgccg ggc 15322646PRTArtificial
SequenceSynthetic construct 22Met Cys Cys Pro Met Gly Gly Pro Gly Val Gly
Val Pro Gly Val Gly1 5 10
15Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly Val
20 25 30Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly Gly Gly Val Pro 35 40
45Gly Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val Gly Val Pro
Gly 50 55 60Val Gly Val Pro Gly Gly
Gly Val Pro Gly Ala Gly Val Pro Gly Val65 70
75 80Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Gly Gly 85 90
95Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val Gly Val
100 105 110Pro Gly Val Gly Val Pro
Gly Gly Gly Val Pro Gly Ala Gly Val Pro 115 120
125Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly 130 135 140Gly Gly Val Pro Gly
Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val145 150
155 160Gly Val Pro Gly Val Gly Val Pro Gly Gly
Gly Val Pro Gly Ala Gly 165 170
175Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
180 185 190Pro Gly Gly Gly Val
Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro 195
200 205Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly
Gly Val Pro Gly 210 215 220Ala Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val225
230 235 240Gly Val Pro Gly Gly Gly Val
Pro Gly Ala Gly Val Pro Gly Gly Gly 245
250 255Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Gly Gly Val 260 265 270Pro
Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 275
280 285Gly Val Gly Val Pro Gly Gly Gly Val
Pro Gly Ala Gly Val Pro Gly 290 295
300Gly Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly305
310 315 320Gly Val Pro Gly
Ala Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 325
330 335Val Pro Gly Val Gly Val Pro Gly Gly Gly
Val Pro Gly Ala Gly Val 340 345
350Pro Gly Gly Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
355 360 365Gly Gly Gly Val Pro Gly Ala
Gly Val Pro Gly Val Gly Val Pro Gly 370 375
380Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly
Ala385 390 395 400Gly Val
Pro Gly Gly Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
405 410 415Val Pro Gly Gly Gly Val Pro
Gly Ala Gly Val Pro Gly Val Gly Val 420 425
430Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly
Val Pro 435 440 445Gly Ala Gly Val
Pro Gly Gly Gly Val Pro Gly Trp Pro Ser Ser Gly 450
455 460Asp Tyr Asp Ile Pro Thr Thr Glu Asn Leu Tyr Phe
Gln Gly Ala His465 470 475
480Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly Ser Arg Arg Thr Leu
485 490 495Met Leu Leu Ala Gln
Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys 500
505 510Asp Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe
Gly Asn Gln Phe 515 520 525Gln Lys
Ala Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln Ile 530
535 540Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala
Ala Trp Asp Glu Thr545 550 555
560Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln Gln Leu Asn Asp Leu
565 570 575Glu Ala Cys Val
Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met 580
585 590Lys Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr
Phe Gln Arg Ile Thr 595 600 605Leu
Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu Val Val 610
615 620Arg Ala Glu Ile Met Arg Ser Phe Ser Leu
Ser Thr Asn Leu Gln Glu625 630 635
640Ser Leu Arg Ser Lys Glu 64523496PRTArtificial
SequenceSynthetic construct 23Met Cys Cys Pro Met Gly Gly Pro Gly Val Gly
Val Pro Gly Val Gly1 5 10
15Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
20 25 30Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro 35 40
45Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly 50 55 60Val Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Val65 70
75 80Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly 85 90
95Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
100 105 110Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro 115 120
125Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly 130 135 140Val Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Val145 150
155 160Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly 165 170
175Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
180 185 190Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 195
200 205Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly 210 215 220Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val225
230 235 240Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly 245
250 255Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val 260 265 270Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 275
280 285Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly 290 295
300Val Gly Val Pro Gly Trp Pro Ser Ser Gly Asp Tyr Asp Ile Pro Thr305
310 315 320Thr Glu Asn Leu
Tyr Phe Gln Gly Ala His Met Cys Asp Leu Pro Gln 325
330 335Thr His Ser Leu Gly Ser Arg Arg Thr Leu
Met Leu Leu Ala Gln Met 340 345
350Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys Asp Arg His Asp Phe Gly
355 360 365Phe Pro Gln Glu Glu Phe Gly
Asn Gln Phe Gln Lys Ala Glu Thr Ile 370 375
380Pro Val Leu His Glu Met Ile Gln Gln Ile Phe Asn Leu Phe Ser
Thr385 390 395 400Lys Asp
Ser Ser Ala Ala Trp Asp Glu Thr Leu Leu Asp Lys Phe Tyr
405 410 415Thr Glu Leu Tyr Gln Gln Leu
Asn Asp Leu Glu Ala Cys Val Ile Gln 420 425
430Gly Val Gly Val Thr Glu Thr Pro Leu Met Lys Glu Asp Ser
Ile Leu 435 440 445Ala Val Arg Lys
Tyr Phe Gln Arg Ile Thr Leu Tyr Leu Lys Glu Lys 450
455 460Lys Tyr Ser Pro Cys Ala Trp Glu Val Val Arg Ala
Glu Ile Met Arg465 470 475
480Ser Phe Ser Leu Ser Thr Asn Leu Gln Glu Ser Leu Arg Ser Lys Glu
485 490 495
24 642 PRT Artificial Sequence Synthetic construct 24
Met Gly Gly Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Gly1 5 10
15Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
20 25 30Val Pro Gly Val Gly Val Pro
Gly Gly Gly Val Pro Gly Ala Gly Val 35 40
45Pro Gly Gly Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro 50 55 60Gly Gly Gly Val Pro Gly
Ala Gly Val Pro Gly Val Gly Val Pro Gly65 70
75 80Val Gly Val Pro Gly Val Gly Val Pro Gly Gly
Gly Val Pro Gly Ala 85 90
95Gly Val Pro Gly Gly Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
100 105 110Val Pro Gly Gly Gly Val
Pro Gly Ala Gly Val Pro Gly Val Gly Val 115 120
125Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly
Val Pro 130 135 140Gly Ala Gly Val Pro
Gly Gly Gly Val Pro Gly Val Gly Val Pro Gly145 150
155 160Val Gly Val Pro Gly Gly Gly Val Pro Gly
Ala Gly Val Pro Gly Val 165 170
175Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly
180 185 190Val Pro Gly Ala Gly
Val Pro Gly Gly Gly Val Pro Gly Val Gly Val 195
200 205Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly
Ala Gly Val Pro 210 215 220Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly225
230 235 240Gly Gly Val Pro Gly Ala Gly
Val Pro Gly Gly Gly Val Pro Gly Val 245
250 255Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val
Pro Gly Ala Gly 260 265 270Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val 275
280 285Pro Gly Gly Gly Val Pro Gly Ala Gly
Val Pro Gly Gly Gly Val Pro 290 295
300Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly305
310 315 320Ala Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val 325
330 335Gly Val Pro Gly Gly Gly Val Pro Gly Ala
Gly Val Pro Gly Gly Gly 340 345
350Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val
355 360 365Pro Gly Ala Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro 370 375
380Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro
Gly385 390 395 400Gly Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly
405 410 415Gly Val Pro Gly Ala Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly 420 425
430Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala
Gly Val 435 440 445Pro Gly Gly Gly
Val Pro Gly Trp Pro Cys Cys Ser Ser Gly Asp Ile 450
455 460Pro Thr Thr Glu Asn Leu Tyr Phe Gln Gly Ala His
Met Cys Asp Leu465 470 475
480Pro Gln Thr His Ser Leu Gly Ser Arg Arg Thr Leu Met Leu Leu Ala
485 490 495Gln Met Arg Arg Ile
Ser Leu Phe Ser Cys Leu Lys Asp Arg His Asp 500
505 510Phe Gly Phe Pro Gln Glu Glu Phe Gly Asn Gln Phe
Gln Lys Ala Glu 515 520 525Thr Ile
Pro Val Leu His Glu Met Ile Gln Gln Ile Phe Asn Leu Phe 530
535 540Ser Thr Lys Asp Ser Ser Ala Ala Trp Asp Glu
Thr Leu Leu Asp Lys545 550 555
560Phe Tyr Thr Glu Leu Tyr Gln Gln Leu Asn Asp Leu Glu Ala Cys Val
565 570 575Ile Gln Gly Val
Gly Val Thr Glu Thr Pro Leu Met Lys Glu Asp Ser 580
585 590Ile Leu Ala Val Arg Lys Tyr Phe Gln Arg Ile
Thr Leu Tyr Leu Lys 595 600 605Glu
Lys Lys Tyr Ser Pro Cys Ala Trp Glu Val Val Arg Ala Glu Ile 610
615 620Met Arg Ser Phe Ser Leu Ser Thr Asn Leu
Gln Glu Ser Leu Arg Ser625 630 635
640Lys Glu
25 492 PRT Artificial Sequence Synthetic construct 25
Met Gly Gly Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val1
5 10 15Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly 20 25
30Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val 35 40 45Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 50 55
60Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly65 70 75
80Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
85 90 95Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly 100
105 110Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val 115 120 125Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 130
135 140Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly145 150 155
160Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
165 170 175Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 180
185 190Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val 195 200 205Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 210
215 220Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly225 230 235
240Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val 245 250 255Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 260
265 270Val Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val 275 280
285Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 290
295 300Gly Trp Pro Cys Cys Ser Ser Gly
Asp Ile Pro Thr Thr Glu Asn Leu305 310
315 320Tyr Phe Gln Gly Ala His Met Cys Asp Leu Pro Gln
Thr His Ser Leu 325 330
335Gly Ser Arg Arg Thr Leu Met Leu Leu Ala Gln Met Arg Arg Ile Ser
340 345 350Leu Phe Ser Cys Leu Lys
Asp Arg His Asp Phe Gly Phe Pro Gln Glu 355 360
365Glu Phe Gly Asn Gln Phe Gln Lys Ala Glu Thr Ile Pro Val
Leu His 370 375 380Glu Met Ile Gln Gln
Ile Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser385 390
395 400Ala Ala Trp Asp Glu Thr Leu Leu Asp Lys
Phe Tyr Thr Glu Leu Tyr 405 410
415Gln Gln Leu Asn Asp Leu Glu Ala Cys Val Ile Gln Gly Val Gly Val
420 425 430Thr Glu Thr Pro Leu
Met Lys Glu Asp Ser Ile Leu Ala Val Arg Lys 435
440 445Tyr Phe Gln Arg Ile Thr Leu Tyr Leu Lys Glu Lys
Lys Tyr Ser Pro 450 455 460Cys Ala Trp
Glu Val Val Arg Ala Glu Ile Met Arg Ser Phe Ser Leu465
470 475 480Ser Thr Asn Leu Gln Glu Ser
Leu Arg Ser Lys Glu 485 490
26 642 PRT Artificial Sequence Synthetic construct 26
Met Gly Gly Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Gly1 5
10 15Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly 20 25
30Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val
35 40 45Pro Gly Gly Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro 50 55
60Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly65
70 75 80Val Gly Val Pro Gly
Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala 85
90 95Gly Val Pro Gly Gly Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly 100 105
110Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly Val
115 120 125Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly Gly Gly Val Pro 130 135
140Gly Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val Gly Val Pro
Gly145 150 155 160Val Gly
Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Val
165 170 175Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Gly Gly 180 185
190Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val
Gly Val 195 200 205Pro Gly Val Gly
Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro 210
215 220Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly225 230 235
240Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val
245 250 255Gly Val Pro Gly Val
Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly 260
265 270Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val 275 280 285Pro Gly
Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro 290
295 300Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Gly Gly Val Pro Gly305 310 315
320Ala Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
325 330 335Gly Val Pro Gly
Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly 340
345 350Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Gly Gly Val 355 360 365Pro
Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 370
375 380Gly Val Gly Val Pro Gly Gly Gly Val Pro
Gly Ala Gly Val Pro Gly385 390 395
400Gly Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Gly 405 410 415Gly Val Pro
Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 420
425 430Val Pro Gly Val Gly Val Pro Gly Gly Gly
Val Pro Gly Ala Gly Val 435 440
445Pro Gly Gly Gly Val Pro Gly Trp Pro Ser Ser Gly Asp Tyr Asp Ile 450
455 460Pro Thr Thr Glu Asn Leu Tyr Phe
Gln Gly Ala His Met Cys Asp Leu465 470
475 480Pro Gln Thr His Ser Leu Gly Ser Arg Arg Thr Leu
Met Leu Leu Ala 485 490
495Gln Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys Asp Arg His Asp
500 505 510Phe Gly Phe Pro Gln Glu
Glu Phe Gly Asn Gln Phe Gln Lys Ala Glu 515 520
525Thr Ile Pro Val Leu His Glu Met Ile Gln Gln Ile Phe Asn
Leu Phe 530 535 540Ser Thr Lys Asp Ser
Ser Ala Ala Trp Asp Glu Thr Leu Leu Asp Lys545 550
555 560Phe Tyr Thr Glu Leu Tyr Gln Gln Leu Asn
Asp Leu Glu Ala Cys Val 565 570
575Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met Lys Glu Asp Ser
580 585 590Ile Leu Ala Val Arg
Lys Tyr Phe Gln Arg Ile Thr Leu Tyr Leu Lys 595
600 605Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu Val Val
Arg Ala Glu Ile 610 615 620Met Arg Ser
Phe Ser Leu Ser Thr Asn Leu Gln Glu Ser Leu Arg Ser625
630 635 640Lys Glu
27 492 PRT Artificial Sequence Synthetic construct 27
Met Gly Gly Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Val1 5 10
15Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
20 25 30Val Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val 35 40
45Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro 50 55 60Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly65 70
75 80Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Val 85 90
95Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
100 105 110Val Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly Val 115 120
125Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro 130 135 140Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly145 150
155 160Val Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly Val 165 170
175Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
180 185 190Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val 195
200 205Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro 210 215 220Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly225
230 235 240Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val 245
250 255Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly 260 265 270Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val 275
280 285Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro 290 295
300Gly Trp Pro Ser Ser Gly Asp Tyr Asp Ile Pro Thr Thr Glu Asn Leu305
310 315 320Tyr Phe Gln Gly
Ala His Met Cys Asp Leu Pro Gln Thr His Ser Leu 325
330 335Gly Ser Arg Arg Thr Leu Met Leu Leu Ala
Gln Met Arg Arg Ile Ser 340 345
350Leu Phe Ser Cys Leu Lys Asp Arg His Asp Phe Gly Phe Pro Gln Glu
355 360 365Glu Phe Gly Asn Gln Phe Gln
Lys Ala Glu Thr Ile Pro Val Leu His 370 375
380Glu Met Ile Gln Gln Ile Phe Asn Leu Phe Ser Thr Lys Asp Ser
Ser385 390 395 400Ala Ala
Trp Asp Glu Thr Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr
405 410 415Gln Gln Leu Asn Asp Leu Glu
Ala Cys Val Ile Gln Gly Val Gly Val 420 425
430Thr Glu Thr Pro Leu Met Lys Glu Asp Ser Ile Leu Ala Val
Arg Lys 435 440 445Tyr Phe Gln Arg
Ile Thr Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro 450
455 460Cys Ala Trp Glu Val Val Arg Ala Glu Ile Met Arg
Ser Phe Ser Leu465 470 475
480Ser Thr Asn Leu Gln Glu Ser Leu Arg Ser Lys Glu 485 490
28 638 PRT Artificial Sequence Synthetic construct 28
Met Cys
Asp Leu Pro Gln Thr His Ser Leu Gly Ser Arg Arg Thr Leu1 5
10 15Met Leu Leu Ala Gln Met Arg Arg
Ile Ser Leu Phe Ser Cys Leu Lys 20 25
30Asp Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Gly Asn Gln
Phe 35 40 45Gln Lys Ala Glu Thr
Ile Pro Val Leu His Glu Met Ile Gln Gln Ile 50 55
60Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp Asp
Glu Thr65 70 75 80Leu
Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln Gln Leu Asn Asp Leu
85 90 95Glu Ala Cys Val Ile Gln Gly
Val Gly Val Thr Glu Thr Pro Leu Met 100 105
110Lys Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln Arg
Ile Thr 115 120 125Leu Tyr Leu Lys
Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu Val Val 130
135 140Arg Ala Glu Ile Met Arg Ser Phe Ser Leu Ser Thr
Asn Leu Gln Glu145 150 155
160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu Tyr Phe Gln Gly Gly Cys
165 170 175Cys Gly Gln Gly Gly
Met Gly Gly Pro Gly Val Gly Val Pro Gly Val 180
185 190Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val
Pro Gly Val Gly 195 200 205Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val 210
215 220Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro
Gly Val Gly Val Pro225 230 235
240Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly
245 250 255Val Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly 260
265 270Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly
Val Pro Gly Val Gly 275 280 285Val
Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val 290
295 300Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro305 310 315
320Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro
Gly 325 330 335Val Gly Val
Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala 340
345 350Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly 355 360
365Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val 370
375 380Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Gly Gly Val Pro385 390
395 400Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly 405 410
415Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly
420 425 430Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Gly Gly 435 440
445Val Pro Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val 450 455 460Pro Gly Val Gly Val
Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro465 470
475 480Gly Gly Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly 485 490
495Gly Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly Val
500 505 510Gly Val Pro Gly Val
Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly 515
520 525Val Pro Gly Gly Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val 530 535 540Pro Gly Gly
Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly Val Pro545
550 555 560Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Gly Gly Val Pro Gly 565
570 575Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val Gly
Val Pro Gly Val 580 585 590Gly
Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly 595
600 605Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Gly Gly Val 610 615
620Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro Gly Trp Pro625
630 635
29 488 PRT Artificial Sequence Synthetic construct 29
Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly Ser Arg Arg Thr Leu1
5 10 15Met Leu Leu Ala Gln Met
Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys 20 25
30Asp Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Gly
Asn Gln Phe 35 40 45Gln Lys Ala
Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln Ile 50
55 60Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala Ala
Trp Asp Glu Thr65 70 75
80Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln Gln Leu Asn Asp Leu
85 90 95Glu Ala Cys Val Ile Gln
Gly Val Gly Val Thr Glu Thr Pro Leu Met 100
105 110Lys Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr Phe
Gln Arg Ile Thr 115 120 125Leu Tyr
Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu Val Val 130
135 140Arg Ala Glu Ile Met Arg Ser Phe Ser Leu Ser
Thr Asn Leu Gln Glu145 150 155
160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu Tyr Phe Gln Gly Gly Cys
165 170 175Cys Gly Gln Gly
Gly Met Gly Gly Pro Gly Val Gly Val Pro Gly Val 180
185 190Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly 195 200 205Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val 210
215 220Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro225 230 235
240Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly 245 250 255Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val 260
265 270Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly 275 280
285Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val 290
295 300Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro305 310
315 320Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly 325 330
335Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
340 345 350Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly 355 360
365Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val 370 375 380Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro385 390
395 400Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly 405 410
415Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
420 425 430Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 435
440 445Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val 450 455 460Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro465
470 475 480Gly Val Gly Val Pro Gly Trp
Pro 485
30 634 PRT Artificial Sequence Synthetic construct 30
Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly Ser Arg Arg Thr Leu1
5 10 15Met Leu Leu Ala Gln Met
Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys 20 25
30Asp Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Gly
Asn Gln Phe 35 40 45Gln Lys Ala
Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln Ile 50
55 60Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala Ala
Trp Asp Glu Thr65 70 75
80Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln Gln Leu Asn Asp Leu
85 90 95Glu Ala Cys Val Ile Gln
Gly Val Gly Val Thr Glu Thr Pro Leu Met 100
105 110Lys Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr Phe
Gln Arg Ile Thr 115 120 125Leu Tyr
Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu Val Val 130
135 140Arg Ala Glu Ile Met Arg Ser Phe Ser Leu Ser
Thr Asn Leu Gln Glu145 150 155
160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu Tyr Phe Gln Gly Gly Met
165 170 175Gly Gly Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly 180
185 190Val Pro Gly Ala Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val 195 200 205Pro
Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro 210
215 220Gly Gly Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly225 230 235
240Gly Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly
Val 245 250 255Gly Val Pro
Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly 260
265 270Val Pro Gly Gly Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val 275 280
285Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly Val Pro 290
295 300Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Gly Gly Val Pro Gly305 310
315 320Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val Gly
Val Pro Gly Val 325 330
335Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Val Gly
340 345 350Val Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Gly Gly Val 355 360
365Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val Gly
Val Pro 370 375 380Gly Val Gly Val Pro
Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly385 390
395 400Val Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly Gly 405 410
415Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro Gly Val Gly
420 425 430Val Pro Gly Val Gly
Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val 435
440 445Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro 450 455 460Gly Gly Gly
Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val Pro Gly465
470 475 480Val Gly Val Pro Gly Val Gly
Val Pro Gly Gly Gly Val Pro Gly Ala 485
490 495Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly 500 505 510Val
Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val 515
520 525Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Gly Gly Val Pro 530 535
540Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly545
550 555 560Val Gly Val Pro
Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly 565
570 575Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Gly Gly 580 585
590Val Pro Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
595 600 605Pro Gly Val Gly Val Pro Gly
Gly Gly Val Pro Gly Ala Gly Val Pro 610 615
620Gly Gly Gly Val Pro Gly Trp Pro Cys Cys625
630
31 484 PRT Artificial Sequence Synthetic construct 31
Met Cys Asp Leu Pro
Gln Thr His Ser Leu Gly Ser Arg Arg Thr Leu1 5
10 15Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu
Phe Ser Cys Leu Lys 20 25
30Asp Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Gly Asn Gln Phe
35 40 45Gln Lys Ala Glu Thr Ile Pro Val
Leu His Glu Met Ile Gln Gln Ile 50 55
60Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr65
70 75 80Leu Leu Asp Lys Phe
Tyr Thr Glu Leu Tyr Gln Gln Leu Asn Asp Leu 85
90 95Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr
Glu Thr Pro Leu Met 100 105
110Lys Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr
115 120 125Leu Tyr Leu Lys Glu Lys Lys
Tyr Ser Pro Cys Ala Trp Glu Val Val 130 135
140Arg Ala Glu Ile Met Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln
Glu145 150 155 160Ser Leu
Arg Ser Lys Glu Leu Glu Asn Leu Tyr Phe Gln Gly Gly Met
165 170 175Gly Gly Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly 180 185
190Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val 195 200 205Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 210
215 220Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly225 230 235
240Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
245 250 255Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 260
265 270Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val 275 280 285Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 290
295 300Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly305 310 315
320Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
325 330 335Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 340
345 350Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val 355 360 365Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 370
375 380Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly385 390 395
400Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val 405 410 415Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 420
425 430Val Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val 435 440
445Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 450
455 460Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly465 470
475 480Trp Pro Cys Cys
32 632 PRT Artificial Sequence Synthetic construct 32
Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly
Ser Arg Arg Thr Leu1 5 10
15Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys
20 25 30Asp Arg His Asp Phe Gly Phe
Pro Gln Glu Glu Phe Gly Asn Gln Phe 35 40
45Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln
Ile 50 55 60Phe Asn Leu Phe Ser Thr
Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr65 70
75 80Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln
Gln Leu Asn Asp Leu 85 90
95Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met
100 105 110Lys Glu Asp Ser Ile Leu
Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr 115 120
125Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu
Val Val 130 135 140Arg Ala Glu Ile Met
Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln Glu145 150
155 160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu
Tyr Phe Gln Gly Gly Met 165 170
175Gly Gly Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly
180 185 190Val Pro Gly Ala Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val 195
200 205Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly
Ala Gly Val Pro 210 215 220Gly Gly Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly225
230 235 240Gly Gly Val Pro Gly Ala Gly
Val Pro Gly Val Gly Val Pro Gly Val 245
250 255Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val
Pro Gly Ala Gly 260 265 270Val
Pro Gly Gly Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val 275
280 285Pro Gly Gly Gly Val Pro Gly Ala Gly
Val Pro Gly Val Gly Val Pro 290 295
300Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly305
310 315 320Ala Gly Val Pro
Gly Gly Gly Val Pro Gly Val Gly Val Pro Gly Val 325
330 335Gly Val Pro Gly Gly Gly Val Pro Gly Ala
Gly Val Pro Gly Val Gly 340 345
350Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val
355 360 365Pro Gly Ala Gly Val Pro Gly
Gly Gly Val Pro Gly Val Gly Val Pro 370 375
380Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro
Gly385 390 395 400Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly
405 410 415Gly Val Pro Gly Ala Gly Val
Pro Gly Gly Gly Val Pro Gly Val Gly 420 425
430Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala
Gly Val 435 440 445Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 450
455 460Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly
Gly Val Pro Gly465 470 475
480Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala
485 490 495Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 500
505 510Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro
Gly Gly Gly Val 515 520 525Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro 530
535 540Gly Ala Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly545 550 555
560Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly
565 570 575Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly 580
585 590Val Pro Gly Ala Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val 595 600 605Pro
Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro 610
615 620Gly Gly Gly Val Pro Gly Trp Pro625
630
33 482 PRT Artificial Sequence Synthetic construct 33
Met Cys Asp
Leu Pro Gln Thr His Ser Leu Gly Ser Arg Arg Thr Leu1 5
10 15Met Leu Leu Ala Gln Met Arg Arg Ile
Ser Leu Phe Ser Cys Leu Lys 20 25
30Asp Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Gly Asn Gln Phe
35 40 45Gln Lys Ala Glu Thr Ile Pro
Val Leu His Glu Met Ile Gln Gln Ile 50 55
60Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr65
70 75 80Leu Leu Asp Lys
Phe Tyr Thr Glu Leu Tyr Gln Gln Leu Asn Asp Leu 85
90 95Glu Ala Cys Val Ile Gln Gly Val Gly Val
Thr Glu Thr Pro Leu Met 100 105
110Lys Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr
115 120 125Leu Tyr Leu Lys Glu Lys Lys
Tyr Ser Pro Cys Ala Trp Glu Val Val 130 135
140Arg Ala Glu Ile Met Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln
Glu145 150 155 160Ser Leu
Arg Ser Lys Glu Leu Glu Asn Leu Tyr Phe Gln Gly Gly Met
165 170 175Gly Gly Pro Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly 180 185
190Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val 195 200 205Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 210
215 220Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly225 230 235
240Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
245 250 255Gly Val Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 260
265 270Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val 275 280 285Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 290
295 300Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly305 310 315
320Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
325 330 335Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 340
345 350Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly Val 355 360 365Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 370
375 380Gly Val Gly Val Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly385 390 395
400Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val 405 410 415Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 420
425 430Val Pro Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val 435 440
445Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro 450
455 460Gly Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly465 470
475 480Trp Pro
34 5123 DNA Artificial Sequence Synthetic construct 34
ttcttgaaga cgaaagggcc tcgtgatacg cctattttta taggttaatg
tcatgataat 60aatggtttct tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa
cccctatttg 120tttatttttc taaatacatt caaatatgta tccgctcatg agacaataac
cctgataaat 180gcttcaataa tattgaaaaa ggaagagtat gagtattcaa catttccgtg
tcgcccttat 240tccctttttt gcggcatttt gccttcctgt ttttgctcac ccagaaacgc
tggtgaaagt 300aaaagatgct gaagatcagt tgggtgcacg agtgggttac atcgaactgg
atctcaacag 360cggtaagatc cttgagagtt ttcgccccga agaacgtttt ccaatgatga
gcacttttaa 420agttctgcta tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc
aactcggtcg 480ccgcatacac tattctcaga atgacttggt tgagtactca ccagtcacag
aaaagcatct 540tacggatggc atgacagtaa gagaattatg cagtgctgcc ataaccatga
gtgataacac 600tgcggccaac ttacttctga caacgatcgg aggaccgaag gagctaaccg
cttttttgca 660caacatgggg gatcatgtaa ctcgccttga tcgttgggaa ccggagctga
atgaagccat 720accaaacgac gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt
tgcgcaaact 780attaactggc gaactactta ctctagcttc ccggcaacaa ttaatagact
ggatggaggc 840ggataaagtt gcaggaccac ttctgcgctc ggcccttccg gctggctggt
ttattgctga 900taaatctgga gccggtgagc gtgggtctcg cggtatcatt gcagcactgg
ggccagatgg 960taagccctcc cgtatcgtag ttatctacac gacggggagt caggcaacta
tggatgaacg 1020aaatagacag atcgctgaga taggtgcctc actgattaag cattggtaac
tgtcagacca 1080agtttactca tatatacttt agattgattt aaaacttcat ttttaattta
aaaggatcta 1140ggtgaagatc ctttttgata atctcatgac caaaatccct taacgtgagt
tttcgttcca 1200ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt
tttttctgcg 1260cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt
gtttgccgga 1320tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc
agataccaaa 1380tactgtcctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg
tagcaccgcc 1440tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg
ataagtcgtg 1500tcttaccggg ttggactcaa gacgatagtt accggataag gcgcagcggt
cgggctgaac 1560ggggggttcg tgcacacagc ccagcttgga gcgaacgacc tacaccgaac
tgagatacct 1620acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg
acaggtatcc 1680ggtaagcggc agggtcggaa caggagagcg cacgagggag cttccagggg
gaaacgcctg 1740gtatctttat agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat
ttttgtgatg 1800ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac gcggcctttt
tacggttcct 1860ggccttttgc tggccttttg ctcacatgtt ctttcctgcg ttatcccctg
attctgtgga 1920taaccgtatt accgcctttg agtgagctga taccgctcgc cgcagccgaa
cgaccgagcg 1980cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg cggtattttc
tccttacgca 2040tctgtgcggt atttcacacc gcatatatgg tgcactctca gtacaatctg
ctctgatgcc 2100gcatagttaa gccagtatac actccgctat cgctacgtga ctgggtcatg
gctgcgcccc 2160gacacccgcc aacacccgct gacgcgccct gacgggcttg tctgctcccg
gcatccgctt 2220acagacaagc tgtgaccgtc tccgggagct gcatgtgtca gaggttttca
ccgtcatcac 2280cgaaacgcgc gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc
gattcacaga 2340tgtctgcctg ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt
aatgtctggc 2400ttctgataaa gcgggccatg ttaagggcgg ttttttcctg tttggtcact
gatgcctccg 2460tgtaaggggg atttctgttc atgggggtaa tgataccgat gaaacgagag
aggatgctca 2520cgatacgggt tactgatgat gaacatgccc ggttactgga acgttgtgag
ggtaaacaac 2580tggcggtatg gatgcggcgg gaccagagaa aaatcactca gggtcaatgc
cagcgcttcg 2640ttaatacaga tgtaggtgtt ccacagggta gccagcagca tcctgcgatg
cagatccgga 2700acataatggt gcagggcgct gacttccgcg tttccagact ttacgaaaca
cggaaaccga 2760agaccattca tgttgttgct caggtcgcag acgttttgca gcagcagtcg
cttcacgttc 2820gctcgcgtat cggtgattca ttctgctaac cagtaaggca accccgccag
cctagccggg 2880tcctcaacga caggagcacg atcatgcgca cccgtggcca ggacccaacg
ctgcccgaga 2940tctcgatccc gcgaaattaa tacgactcac tatagggaga ccacaacggt
ttccctctag 3000aaataatttt gtttaacttt aagaaggaga tataccatgt gctgccccat
gggtgggccg 3060ggcgtgggtg ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc
aggtgttcct 3120ggtgtaggtg tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg
cggtgttccg 3180ggtgcaggcg ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt
gggtgttccg 3240ggtggcggtg tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt
tggtgtgccg 3300ggtgttggtg taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg
cggtgtgccg 3360ggcgtgggtg ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc
aggtgttcct 3420ggtgtaggtg tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg
cggtgttccg 3480ggtgcaggcg ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt
gggtgttccg 3540ggtggcggtg tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt
tggtgtgccg 3600ggtgttggtg taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg
cggtgtgccg 3660ggcgtgggtg ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc
aggtgttcct 3720ggtgtaggtg tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg
cggtgttccg 3780ggtgcaggcg ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt
gggtgttccg 3840ggtggcggtg tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt
tggtgtgccg 3900ggtgttggtg taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg
cggtgtgccg 3960ggcgtgggtg ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc
aggtgttcct 4020ggtgtaggtg tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg
cggtgttccg 4080ggtgcaggcg ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt
gggtgttccg 4140ggtggcggtg tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt
tggtgtgccg 4200ggtgttggtg taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg
cggtgtgccg 4260ggcgtgggtg ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc
aggtgttcct 4320ggtgtaggtg tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg
cggtgttccg 4380ggtgcaggcg ttccgggtgg cggtgtgccg ggctggccga gcagcggtga
ttacgatatc 4440ccaacgaccg aaaacctgta ttttcagggc gcccatatgt gtgatctgcc
tcaaacccac 4500agcctgggta gcaggaggac cttgatgctc ctggcacaga tgaggagaat
ctctcttttc 4560tcctgcttga aggacagaca tgactttgga tttccccagg aggagtttgg
caaccagttc 4620caaaaggctg aaaccatccc tgtcctccat gagatgatcc agcagatctt
caatctcttc 4680agcacaaagg actcatctgc tgcttgggat gagaccctcc tagacaaatt
ctacactgaa 4740ctctaccagc agctgaatga cctggaagcc tgtgtgatac agggggtggg
ggtgacagag 4800actcccctga tgaaggagga ctccattctg gctgtgagga aatacttcca
aagaatcact 4860ctctatctga aagagaagaa atacagccct tgtgcctggg aggttgtcag
agcagaaatc 4920atgagatctt tttctttgtc aacaaacttg caagaaagtt taagaagtaa
ggaataactc 4980gagcagatcc ggctgctaac aaagcccgaa aggaagctga gttggctgct
gccaccgctg 5040agcaataact agcataaccc cttggggcct ctaaacgggt cttgaggggt
tttttgctga 5100aaggaggaac tatatccgga taa
5123
35 4673 DNA Artificial Sequence Synthetic construct
35 ttcttgaaga cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat
60aatggtttct tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg
120tttatttttc taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat
180gcttcaataa tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat
240tccctttttt gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt
300aaaagatgct gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag
360cggtaagatc cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa
420agttctgcta tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg
480ccgcatacac tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct
540tacggatggc atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac
600tgcggccaac ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca
660caacatgggg gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat
720accaaacgac gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact
780attaactggc gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc
840ggataaagtt gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga
900taaatctgga gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg
960taagccctcc cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg
1020aaatagacag atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca
1080agtttactca tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta
1140ggtgaagatc ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca
1200ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg
1260cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga
1320tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa
1380tactgtcctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc
1440tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg
1500tcttaccggg ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac
1560ggggggttcg tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct
1620acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc
1680ggtaagcggc agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg
1740gtatctttat agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg
1800ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct
1860ggccttttgc tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga
1920taaccgtatt accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg
1980cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca
2040tctgtgcggt atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc
2100gcatagttaa gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc
2160gacacccgcc aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt
2220acagacaagc tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac
2280cgaaacgcgc gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga
2340tgtctgcctg ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc
2400ttctgataaa gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg
2460tgtaaggggg atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca
2520cgatacgggt tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac
2580tggcggtatg gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg
2640ttaatacaga tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga
2700acataatggt gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga
2760agaccattca tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc
2820gctcgcgtat cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg
2880tcctcaacga caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga
2940tctcgatccc gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag
3000aaataatttt gtttaacttt aagaaggaga tataccatgt gctgccccat gggtgggccg
3060ggcgtgggtg ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct
3120ggtgtcggcg tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg
3180ggcgttggtg ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca
3240ggtgtgggcg taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg
3300ggcgtaggtg tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg
3360ggcgtgggtg ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct
3420ggtgtcggcg tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg
3480ggcgttggtg ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca
3540ggtgtgggcg taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg
3600ggcgtaggtg tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg
3660ggcgtgggtg ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct
3720ggtgtcggcg tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg
3780ggcgttggtg ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca
3840ggtgtgggcg taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg
3900ggcgtaggtg tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg
3960ggctggccga gcagcggtga ttacgatatc ccaacgaccg aaaacctgta ttttcagggc
4020gcccatatgt gtgatctgcc tcaaacccac agcctgggta gcaggaggac cttgatgctc
4080ctggcacaga tgaggagaat ctctcttttc tcctgcttga aggacagaca tgactttgga
4140tttccccagg aggagtttgg caaccagttc caaaaggctg aaaccatccc tgtcctccat
4200gagatgatcc agcagatctt caatctcttc agcacaaagg actcatctgc tgcttgggat
4260gagaccctcc tagacaaatt ctacactgaa ctctaccagc agctgaatga cctggaagcc
4320tgtgtgatac agggggtggg ggtgacagag actcccctga tgaaggagga ctccattctg
4380gctgtgagga aatacttcca aagaatcact ctctatctga aagagaagaa atacagccct
4440tgtgcctggg aggttgtcag agcagaaatc atgagatctt tttctttgtc aacaaacttg
4500caagaaagtt taagaagtaa ggaataactc gagcagatcc ggctgctaac aaagcccgaa
4560aggaagctga gttggctgct gccaccgctg agcaataact agcataaccc cttggggcct
4620ctaaacgggt cttgaggggt tttttgctga aaggaggaac tatatccgga taa
4673
36 5111 DNA Artificial Sequence Synthetic construct
36 ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tataccatgg gtgggccggg cgtgggtgtt 3060ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 3120ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 3180ccgggtggcg
gtgtgccggg cgtgggtgtt ccgggcgtgg gtgttccggg tggcggtgtg 3240ccgggcgcag
gtgttcctgg tgtaggtgtg ccgggtgttg gtgtgccggg tgttggtgta 3300ccaggtggcg
gtgttccggg tgcaggcgtt ccgggtggcg gtgtgccggg cgtgggtgtt 3360ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 3420ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 3480ccgggtggcg
gtgtgccggg cgtgggtgtt ccgggcgtgg gtgttccggg tggcggtgtg 3540ccgggcgcag
gtgttcctgg tgtaggtgtg ccgggtgttg gtgtgccggg tgttggtgta 3600ccaggtggcg
gtgttccggg tgcaggcgtt ccgggtggcg gtgtgccggg cgtgggtgtt 3660ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 3720ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 3780ccgggtggcg
gtgtgccggg cgtgggtgtt ccgggcgtgg gtgttccggg tggcggtgtg 3840ccgggcgcag
gtgttcctgg tgtaggtgtg ccgggtgttg gtgtgccggg tgttggtgta 3900ccaggtggcg
gtgttccggg tgcaggcgtt ccgggtggcg gtgtgccggg cgtgggtgtt 3960ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 4020ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 4080ccgggtggcg
gtgtgccggg cgtgggtgtt ccgggcgtgg gtgttccggg tggcggtgtg 4140ccgggcgcag
gtgttcctgg tgtaggtgtg ccgggtgttg gtgtgccggg tgttggtgta 4200ccaggtggcg
gtgttccggg tgcaggcgtt ccgggtggcg gtgtgccggg cgtgggtgtt 4260ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 4320ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 4380ccgggtggcg
gtgtgccggg ctggccgtgc tgcagcagcg gtgatatccc aacgaccgaa 4440aacctgtatt
ttcagggcgc ccatatgtgt gatctgcctc aaacccacag cctgggtagc 4500aggaggacct
tgatgctcct ggcacagatg aggagaatct ctcttttctc ctgcttgaag 4560gacagacatg
actttggatt tccccaggag gagtttggca accagttcca aaaggctgaa 4620accatccctg
tcctccatga gatgatccag cagatcttca atctcttcag cacaaaggac 4680tcatctgctg
cttgggatga gaccctccta gacaaattct acactgaact ctaccagcag 4740ctgaatgacc
tggaagcctg tgtgatacag ggggtggggg tgacagagac tcccctgatg 4800aaggaggact
ccattctggc tgtgaggaaa tacttccaaa gaatcactct ctatctgaaa 4860gagaagaaat
acagcccttg tgcctgggag gttgtcagag cagaaatcat gagatctttt 4920tctttgtcaa
caaacttgca agaaagttta agaagtaagg aataactcga gcagatccgg 4980ctgctaacaa
agcccgaaag gaagctgagt tggctgctgc caccgctgag caataactag 5040cataacccct
tggggcctct aaacgggtct tgaggggttt tttgctgaaa ggaggaacta 5100tatccggata a
5111
37 4661 DNA Artificial Sequence Synthetic construct
37 ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tataccatgg gtgggccggg cgtgggtgtt 3060ccgggcgtag
gtgtcccagg tgtgggcgta ccgggcgttg gtgttcctgg tgtcggcgtg 3120ccgggcgtgg
gtgttccggg cgtaggtgtc ccaggtgtgg gcgtaccggg cgttggtgtt 3180cctggtgtcg
gcgtgccggg cgtgggtgtt ccgggcgtag gtgtcccagg tgtgggcgta 3240ccgggcgttg
gtgttcctgg tgtcggcgtg ccgggcgtgg gtgttccggg cgtaggtgtc 3300ccaggtgtgg
gcgtaccggg cgttggtgtt cctggtgtcg gcgtgccggg cgtgggtgtt 3360ccgggcgtag
gtgtcccagg tgtgggcgta ccgggcgttg gtgttcctgg tgtcggcgtg 3420ccgggcgtgg
gtgttccggg cgtaggtgtc ccaggtgtgg gcgtaccggg cgttggtgtt 3480cctggtgtcg
gcgtgccggg cgtgggtgtt ccgggcgtag gtgtcccagg tgtgggcgta 3540ccgggcgttg
gtgttcctgg tgtcggcgtg ccgggcgtgg gtgttccggg cgtaggtgtc 3600ccaggtgtgg
gcgtaccggg cgttggtgtt cctggtgtcg gcgtgccggg cgtgggtgtt 3660ccgggcgtag
gtgtcccagg tgtgggcgta ccgggcgttg gtgttcctgg tgtcggcgtg 3720ccgggcgtgg
gtgttccggg cgtaggtgtc ccaggtgtgg gcgtaccggg cgttggtgtt 3780cctggtgtcg
gcgtgccggg cgtgggtgtt ccgggcgtag gtgtcccagg tgtgggcgta 3840ccgggcgttg
gtgttcctgg tgtcggcgtg ccgggcgtgg gtgttccggg cgtaggtgtc 3900ccaggtgtgg
gcgtaccggg cgttggtgtt cctggtgtcg gcgtgccggg ctggccgtgc 3960tgcagcagcg
gtgatatccc aacgaccgaa aacctgtatt ttcagggcgc ccatatgtgt 4020gatctgcctc
aaacccacag cctgggtagc aggaggacct tgatgctcct ggcacagatg 4080aggagaatct
ctcttttctc ctgcttgaag gacagacatg actttggatt tccccaggag 4140gagtttggca
accagttcca aaaggctgaa accatccctg tcctccatga gatgatccag 4200cagatcttca
atctcttcag cacaaaggac tcatctgctg cttgggatga gaccctccta 4260gacaaattct
acactgaact ctaccagcag ctgaatgacc tggaagcctg tgtgatacag 4320ggggtggggg
tgacagagac tcccctgatg aaggaggact ccattctggc tgtgaggaaa 4380tacttccaaa
gaatcactct ctatctgaaa gagaagaaat acagcccttg tgcctgggag 4440gttgtcagag
cagaaatcat gagatctttt tctttgtcaa caaacttgca agaaagttta 4500agaagtaagg
aataactcga gcagatccgg ctgctaacaa agcccgaaag gaagctgagt 4560tggctgctgc
caccgctgag caataactag cataacccct tggggcctct aaacgggtct 4620tgaggggttt
tttgctgaaa ggaggaacta tatccggata a
4661
38 5111 DNA Artificial Sequence Synthetic construct
38 ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tataccatgg gtgggccggg cgtgggtgtt 3060ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 3120ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 3180ccgggtggcg
gtgtgccggg cgtgggtgtt ccgggcgtgg gtgttccggg tggcggtgtg 3240ccgggcgcag
gtgttcctgg tgtaggtgtg ccgggtgttg gtgtgccggg tgttggtgta 3300ccaggtggcg
gtgttccggg tgcaggcgtt ccgggtggcg gtgtgccggg cgtgggtgtt 3360ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 3420ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 3480ccgggtggcg
gtgtgccggg cgtgggtgtt ccgggcgtgg gtgttccggg tggcggtgtg 3540ccgggcgcag
gtgttcctgg tgtaggtgtg ccgggtgttg gtgtgccggg tgttggtgta 3600ccaggtggcg
gtgttccggg tgcaggcgtt ccgggtggcg gtgtgccggg cgtgggtgtt 3660ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 3720ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 3780ccgggtggcg
gtgtgccggg cgtgggtgtt ccgggcgtgg gtgttccggg tggcggtgtg 3840ccgggcgcag
gtgttcctgg tgtaggtgtg ccgggtgttg gtgtgccggg tgttggtgta 3900ccaggtggcg
gtgttccggg tgcaggcgtt ccgggtggcg gtgtgccggg cgtgggtgtt 3960ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 4020ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 4080ccgggtggcg
gtgtgccggg cgtgggtgtt ccgggcgtgg gtgttccggg tggcggtgtg 4140ccgggcgcag
gtgttcctgg tgtaggtgtg ccgggtgttg gtgtgccggg tgttggtgta 4200ccaggtggcg
gtgttccggg tgcaggcgtt ccgggtggcg gtgtgccggg cgtgggtgtt 4260ccgggcgtgg
gtgttccggg tggcggtgtg ccgggcgcag gtgttcctgg tgtaggtgtg 4320ccgggtgttg
gtgtgccggg tgttggtgta ccaggtggcg gtgttccggg tgcaggcgtt 4380ccgggtggcg
gtgtgccggg ctggccgagc agcggtgatt acgatatccc aacgaccgaa 4440aacctgtatt
ttcagggcgc ccatatgtgt gatctgcctc aaacccacag cctgggtagc 4500aggaggacct
tgatgctcct ggcacagatg aggagaatct ctcttttctc ctgcttgaag 4560gacagacatg
actttggatt tccccaggag gagtttggca accagttcca aaaggctgaa 4620accatccctg
tcctccatga gatgatccag cagatcttca atctcttcag cacaaaggac 4680tcatctgctg
cttgggatga gaccctccta gacaaattct acactgaact ctaccagcag 4740ctgaatgacc
tggaagcctg tgtgatacag ggggtggggg tgacagagac tcccctgatg 4800aaggaggact
ccattctggc tgtgaggaaa tacttccaaa gaatcactct ctatctgaaa 4860gagaagaaat
acagcccttg tgcctgggag gttgtcagag cagaaatcat gagatctttt 4920tctttgtcaa
caaacttgca agaaagttta agaagtaagg aataactcga gcagatccgg 4980ctgctaacaa
agcccgaaag gaagctgagt tggctgctgc caccgctgag caataactag 5040cataacccct
tggggcctct aaacgggtct tgaggggttt tttgctgaaa ggaggaacta 5100tatccggata a
5111
39 4661 DNA Artificial Sequence Synthetic construct
39 ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tataccatgg gtgggccggg cgtgggtgtt 3060ccgggcgtag
gtgtcccagg tgtgggcgta ccgggcgttg gtgttcctgg tgtcggcgtg 3120ccgggcgtgg
gtgttccggg cgtaggtgtc ccaggtgtgg gcgtaccggg cgttggtgtt 3180cctggtgtcg
gcgtgccggg cgtgggtgtt ccgggcgtag gtgtcccagg tgtgggcgta 3240ccgggcgttg
gtgttcctgg tgtcggcgtg ccgggcgtgg gtgttccggg cgtaggtgtc 3300ccaggtgtgg
gcgtaccggg cgttggtgtt cctggtgtcg gcgtgccggg cgtgggtgtt 3360ccgggcgtag
gtgtcccagg tgtgggcgta ccgggcgttg gtgttcctgg tgtcggcgtg 3420ccgggcgtgg
gtgttccggg cgtaggtgtc ccaggtgtgg gcgtaccggg cgttggtgtt 3480cctggtgtcg
gcgtgccggg cgtgggtgtt ccgggcgtag gtgtcccagg tgtgggcgta 3540ccgggcgttg
gtgttcctgg tgtcggcgtg ccgggcgtgg gtgttccggg cgtaggtgtc 3600ccaggtgtgg
gcgtaccggg cgttggtgtt cctggtgtcg gcgtgccggg cgtgggtgtt 3660ccgggcgtag
gtgtcccagg tgtgggcgta ccgggcgttg gtgttcctgg tgtcggcgtg 3720ccgggcgtgg
gtgttccggg cgtaggtgtc ccaggtgtgg gcgtaccggg cgttggtgtt 3780cctggtgtcg
gcgtgccggg cgtgggtgtt ccgggcgtag gtgtcccagg tgtgggcgta 3840ccgggcgttg
gtgttcctgg tgtcggcgtg ccgggcgtgg gtgttccggg cgtaggtgtc 3900ccaggtgtgg
gcgtaccggg cgttggtgtt cctggtgtcg gcgtgccggg ctggccgagc 3960agcggtgatt
acgatatccc aacgaccgaa aacctgtatt ttcagggcgc ccatatgtgt 4020gatctgcctc
aaacccacag cctgggtagc aggaggacct tgatgctcct ggcacagatg 4080aggagaatct
ctcttttctc ctgcttgaag gacagacatg actttggatt tccccaggag 4140gagtttggca
accagttcca aaaggctgaa accatccctg tcctccatga gatgatccag 4200cagatcttca
atctcttcag cacaaaggac tcatctgctg cttgggatga gaccctccta 4260gacaaattct
acactgaact ctaccagcag ctgaatgacc tggaagcctg tgtgatacag 4320ggggtggggg
tgacagagac tcccctgatg aaggaggact ccattctggc tgtgaggaaa 4380tacttccaaa
gaatcactct ctatctgaaa gagaagaaat acagcccttg tgcctgggag 4440gttgtcagag
cagaaatcat gagatctttt tctttgtcaa caaacttgca agaaagttta 4500agaagtaagg
aataactcga gcagatccgg ctgctaacaa agcccgaaag gaagctgagt 4560tggctgctgc
caccgctgag caataactag cataacccct tggggcctct aaacgggtct 4620tgaggggttt
tttgctgaaa ggaggaacta tatccggata a
4661
40 5173 DNA Artificial Sequence Synthetic construct
40 ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tatacatatg tgtgatctgc ctcaaaccca 3060cagcctgggt
agcaggagga ccttgatgct cctggcacag atgaggagaa tctctctttt 3120ctcctgcttg
aaggacagac atgactttgg atttccccag gaggagtttg gcaaccagtt 3180ccaaaaggct
gaaaccatcc ctgtcctcca tgagatgatc cagcagatct tcaatctctt 3240cagcacaaag
gactcatctg ctgcttggga tgagaccctc ctagacaaat tctacactga 3300actctaccag
cagctgaatg acctggaagc ctgtgtgata cagggggtgg gggtgacaga 3360gactcccctg
atgaaggagg actccattct ggctgtgagg aaatacttcc aaagaatcac 3420tctctatctg
aaagagaaga aatacagccc ttgtgcctgg gaggttgtca gagcagaaat 3480catgagatct
ttttctttgt caacaaactt gcaagaaagt ttaagaagta aggaactcga 3540gaacctgtat
ttccagggcg ggtgctgcgg ccaaggcggc atgggtgggc cgggcgtggg 3600tgttccgggc
gtgggtgttc cgggtggcgg tgtgccgggc gcaggtgttc ctggtgtagg 3660tgtgccgggt
gttggtgtgc cgggtgttgg tgtaccaggt ggcggtgttc cgggtgcagg 3720cgttccgggt
ggcggtgtgc cgggcgtggg tgttccgggc gtgggtgttc cgggtggcgg 3780tgtgccgggc
gcaggtgttc ctggtgtagg tgtgccgggt gttggtgtgc cgggtgttgg 3840tgtaccaggt
ggcggtgttc cgggtgcagg cgttccgggt ggcggtgtgc cgggcgtggg 3900tgttccgggc
gtgggtgttc cgggtggcgg tgtgccgggc gcaggtgttc ctggtgtagg 3960tgtgccgggt
gttggtgtgc cgggtgttgg tgtaccaggt ggcggtgttc cgggtgcagg 4020cgttccgggt
ggcggtgtgc cgggcgtggg tgttccgggc gtgggtgttc cgggtggcgg 4080tgtgccgggc
gcaggtgttc ctggtgtagg tgtgccgggt gttggtgtgc cgggtgttgg 4140tgtaccaggt
ggcggtgttc cgggtgcagg cgttccgggt ggcggtgtgc cgggcgtggg 4200tgttccgggc
gtgggtgttc cgggtggcgg tgtgccgggc gcaggtgttc ctggtgtagg 4260tgtgccgggt
gttggtgtgc cgggtgttgg tgtaccaggt ggcggtgttc cgggtgcagg 4320cgttccgggt
ggcggtgtgc cgggcgtggg tgttccgggc gtgggtgttc cgggtggcgg 4380tgtgccgggc
gcaggtgttc ctggtgtagg tgtgccgggt gttggtgtgc cgggtgttgg 4440tgtaccaggt
ggcggtgttc cgggtgcagg cgttccgggt ggcggtgtgc cgggcgtggg 4500tgttccgggc
gtgggtgttc cgggtggcgg tgtgccgggc gcaggtgttc ctggtgtagg 4560tgtgccgggt
gttggtgtgc cgggtgttgg tgtaccaggt ggcggtgttc cgggtgcagg 4620cgttccgggt
ggcggtgtgc cgggcgtggg tgttccgggc gtgggtgttc cgggtggcgg 4680tgtgccgggc
gcaggtgttc ctggtgtagg tgtgccgggt gttggtgtgc cgggtgttgg 4740tgtaccaggt
ggcggtgttc cgggtgcagg cgttccgggt ggcggtgtgc cgggcgtggg 4800tgttccgggc
gtgggtgttc cgggtggcgg tgtgccgggc gcaggtgttc ctggtgtagg 4860tgtgccgggt
gttggtgtgc cgggtgttgg tgtaccaggt ggcggtgttc cgggtgcagg 4920cgttccgggt
ggcggtgtgc cgggctggcc gtgataagct agcatgactg gtggacagca 4980aatgggtcgg
atccgaattc tgcagatatc catcacactg gcggccgctc gagcagatcc 5040ggctgctaac
aaagcccgaa aggaagctga gttggctgct gccaccgctg agcaataact 5100agcataaccc
cttggggcct ctaaacgggt cttgaggggt tttttgctga aaggaggaac 5160tatatccgga
taa 5173
SEQ ID NO: 41. Length: 4723. Type: DNA. Organism: Artificial Sequence. Other information: Synthetic construct.
ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tatacatatg tgtgatctgc ctcaaaccca 3060cagcctgggt
agcaggagga ccttgatgct cctggcacag atgaggagaa tctctctttt 3120ctcctgcttg
aaggacagac atgactttgg atttccccag gaggagtttg gcaaccagtt 3180ccaaaaggct
gaaaccatcc ctgtcctcca tgagatgatc cagcagatct tcaatctctt 3240cagcacaaag
gactcatctg ctgcttggga tgagaccctc ctagacaaat tctacactga 3300actctaccag
cagctgaatg acctggaagc ctgtgtgata cagggggtgg gggtgacaga 3360gactcccctg
atgaaggagg actccattct ggctgtgagg aaatacttcc aaagaatcac 3420tctctatctg
aaagagaaga aatacagccc ttgtgcctgg gaggttgtca gagcagaaat 3480catgagatct
ttttctttgt caacaaactt gcaagaaagt ttaagaagta aggaactcga 3540gaacctgtat
ttccagggcg ggtgctgcgg ccaaggcggc atgggtgggc cgggcgtggg 3600tgttccgggc
gtaggtgtcc caggtgtggg cgtaccgggc gttggtgttc ctggtgtcgg 3660cgtgccgggc
gtgggtgttc cgggcgtagg tgtcccaggt gtgggcgtac cgggcgttgg 3720tgttcctggt
gtcggcgtgc cgggcgtggg tgttccgggc gtaggtgtcc caggtgtggg 3780cgtaccgggc
gttggtgttc ctggtgtcgg cgtgccgggc gtgggtgttc cgggcgtagg 3840tgtcccaggt
gtgggcgtac cgggcgttgg tgttcctggt gtcggcgtgc cgggcgtggg 3900tgttccgggc
gtaggtgtcc caggtgtggg cgtaccgggc gttggtgttc ctggtgtcgg 3960cgtgccgggc
gtgggtgttc cgggcgtagg tgtcccaggt gtgggcgtac cgggcgttgg 4020tgttcctggt
gtcggcgtgc cgggcgtggg tgttccgggc gtaggtgtcc caggtgtggg 4080cgtaccgggc
gttggtgttc ctggtgtcgg cgtgccgggc gtgggtgttc cgggcgtagg 4140tgtcccaggt
gtgggcgtac cgggcgttgg tgttcctggt gtcggcgtgc cgggcgtggg 4200tgttccgggc
gtaggtgtcc caggtgtggg cgtaccgggc gttggtgttc ctggtgtcgg 4260cgtgccgggc
gtgggtgttc cgggcgtagg tgtcccaggt gtgggcgtac cgggcgttgg 4320tgttcctggt
gtcggcgtgc cgggcgtggg tgttccgggc gtaggtgtcc caggtgtggg 4380cgtaccgggc
gttggtgttc ctggtgtcgg cgtgccgggc gtgggtgttc cgggcgtagg 4440tgtcccaggt
gtgggcgtac cgggcgttgg tgttcctggt gtcggcgtgc cgggctggcc 4500gtgataagct
agcatgactg gtggacagca aatgggtcgg atccgaattc tgcagatatc 4560catcacactg
gcggccgctc gagcagatcc ggctgctaac aaagcccgaa aggaagctga 4620gttggctgct
gccaccgctg agcaataact agcataaccc cttggggcct ctaaacgggt 4680cttgaggggt
tttttgctga aaggaggaac tatatccgga taa 4723
SEQ ID NO: 42. Length: 5211. Type: DNA. Organism: Artificial Sequence. Other information: Synthetic construct.
ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tatacatatg tgtgatctgc ctcaaaccca 3060cagcctgggt
agcaggagga ccttgatgct cctggcacag atgaggagaa tctctctttt 3120ctcctgcttg
aaggacagac atgactttgg atttccccag gaggagtttg gcaaccagtt 3180ccaaaaggct
gaaaccatcc ctgtcctcca tgagatgatc cagcagatct tcaatctctt 3240cagcacaaag
gactcatctg ctgcttggga tgagaccctc ctagacaaat tctacactga 3300actctaccag
cagctgaatg acctggaagc ctgtgtgata cagggggtgg gggtgacaga 3360gactcccctg
atgaaggagg actccattct ggctgtgagg aaatacttcc aaagaatcac 3420tctctatctg
aaagagaaga aatacagccc ttgtgcctgg gaggttgtca gagcagaaat 3480catgagatct
ttttctttgt caacaaactt gcaagaaagt ttaagaagta aggaactcga 3540gaacctgtat
ttccaaggcg gcatgggtgg gccgggcgtg ggtgttccgg gcgtgggtgt 3600tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 3660gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 3720gccgggcgtg
ggtgttccgg gcgtgggtgt tccgggtggc ggtgtgccgg gcgcaggtgt 3780tcctggtgta
ggtgtgccgg gtgttggtgt gccgggtgtt ggtgtaccag gtggcggtgt 3840tccgggtgca
ggcgttccgg gtggcggtgt gccgggcgtg ggtgttccgg gcgtgggtgt 3900tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 3960gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 4020gccgggcgtg
ggtgttccgg gcgtgggtgt tccgggtggc ggtgtgccgg gcgcaggtgt 4080tcctggtgta
ggtgtgccgg gtgttggtgt gccgggtgtt ggtgtaccag gtggcggtgt 4140tccgggtgca
ggcgttccgg gtggcggtgt gccgggcgtg ggtgttccgg gcgtgggtgt 4200tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 4260gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 4320gccgggcgtg
ggtgttccgg gcgtgggtgt tccgggtggc ggtgtgccgg gcgcaggtgt 4380tcctggtgta
ggtgtgccgg gtgttggtgt gccgggtgtt ggtgtaccag gtggcggtgt 4440tccgggtgca
ggcgttccgg gtggcggtgt gccgggcgtg ggtgttccgg gcgtgggtgt 4500tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 4560gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 4620gccgggcgtg
ggtgttccgg gcgtgggtgt tccgggtggc ggtgtgccgg gcgcaggtgt 4680tcctggtgta
ggtgtgccgg gtgttggtgt gccgggtgtt ggtgtaccag gtggcggtgt 4740tccgggtgca
ggcgttccgg gtggcggtgt gccgggcgtg ggtgttccgg gcgtgggtgt 4800tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 4860gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 4920gccgggctgg
ccttgctgct gataagctag catgactggt ggacagcaaa tgggtcggga 4980ttcaagcttg
gtaccgagct cggatccact agtaacggcc gccagtgtgc tggaattctg 5040cagatatcca
tcacactggc ggccgctcga gcagatccgg ctgctaacaa agcccgaaag 5100gaagctgagt
tggctgctgc caccgctgag caataactag cataacccct tggggcctct 5160aaacgggtct
tgaggggttt tttgctgaaa ggaggaacta tatccggata a 5211
SEQ ID NO: 43. Length: 4761. Type: DNA. Organism: Artificial Sequence. Other information: Synthetic construct.
ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tatacatatg tgtgatctgc ctcaaaccca 3060cagcctgggt
agcaggagga ccttgatgct cctggcacag atgaggagaa tctctctttt 3120ctcctgcttg
aaggacagac atgactttgg atttccccag gaggagtttg gcaaccagtt 3180ccaaaaggct
gaaaccatcc ctgtcctcca tgagatgatc cagcagatct tcaatctctt 3240cagcacaaag
gactcatctg ctgcttggga tgagaccctc ctagacaaat tctacactga 3300actctaccag
cagctgaatg acctggaagc ctgtgtgata cagggggtgg gggtgacaga 3360gactcccctg
atgaaggagg actccattct ggctgtgagg aaatacttcc aaagaatcac 3420tctctatctg
aaagagaaga aatacagccc ttgtgcctgg gaggttgtca gagcagaaat 3480catgagatct
ttttctttgt caacaaactt gcaagaaagt ttaagaagta aggaactcga 3540gaacctgtat
ttccaaggcg gcatgggtgg gccgggcgtg ggtgttccgg gcgtaggtgt 3600cccaggtgtg
ggcgtaccgg gcgttggtgt tcctggtgtc ggcgtgccgg gcgtgggtgt 3660tccgggcgta
ggtgtcccag gtgtgggcgt accgggcgtt ggtgttcctg gtgtcggcgt 3720gccgggcgtg
ggtgttccgg gcgtaggtgt cccaggtgtg ggcgtaccgg gcgttggtgt 3780tcctggtgtc
ggcgtgccgg gcgtgggtgt tccgggcgta ggtgtcccag gtgtgggcgt 3840accgggcgtt
ggtgttcctg gtgtcggcgt gccgggcgtg ggtgttccgg gcgtaggtgt 3900cccaggtgtg
ggcgtaccgg gcgttggtgt tcctggtgtc ggcgtgccgg gcgtgggtgt 3960tccgggcgta
ggtgtcccag gtgtgggcgt accgggcgtt ggtgttcctg gtgtcggcgt 4020gccgggcgtg
ggtgttccgg gcgtaggtgt cccaggtgtg ggcgtaccgg gcgttggtgt 4080tcctggtgtc
ggcgtgccgg gcgtgggtgt tccgggcgta ggtgtcccag gtgtgggcgt 4140accgggcgtt
ggtgttcctg gtgtcggcgt gccgggcgtg ggtgttccgg gcgtaggtgt 4200cccaggtgtg
ggcgtaccgg gcgttggtgt tcctggtgtc ggcgtgccgg gcgtgggtgt 4260tccgggcgta
ggtgtcccag gtgtgggcgt accgggcgtt ggtgttcctg gtgtcggcgt 4320gccgggcgtg
ggtgttccgg gcgtaggtgt cccaggtgtg ggcgtaccgg gcgttggtgt 4380tcctggtgtc
ggcgtgccgg gcgtgggtgt tccgggcgta ggtgtcccag gtgtgggcgt 4440accgggcgtt
ggtgttcctg gtgtcggcgt gccgggctgg ccttgctgct gataagctag 4500catgactggt
ggacagcaaa tgggtcggga ttcaagcttg gtaccgagct cggatccact 4560agtaacggcc
gccagtgtgc tggaattctg cagatatcca tcacactggc ggccgctcga 4620gcagatccgg
ctgctaacaa agcccgaaag gaagctgagt tggctgctgc caccgctgag 4680caataactag
cataacccct tggggcctct aaacgggtct tgaggggttt tttgctgaaa 4740ggaggaacta
tatccggata a 4761
SEQ ID NO: 44. Length: 5155. Type: DNA. Organism: Artificial Sequence. Other information: Synthetic construct.
ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tatacatatg tgtgatctgc ctcaaaccca 3060cagcctgggt
agcaggagga ccttgatgct cctggcacag atgaggagaa tctctctttt 3120ctcctgcttg
aaggacagac atgactttgg atttccccag gaggagtttg gcaaccagtt 3180ccaaaaggct
gaaaccatcc ctgtcctcca tgagatgatc cagcagatct tcaatctctt 3240cagcacaaag
gactcatctg ctgcttggga tgagaccctc ctagacaaat tctacactga 3300actctaccag
cagctgaatg acctggaagc ctgtgtgata cagggggtgg gggtgacaga 3360gactcccctg
atgaaggagg actccattct ggctgtgagg aaatacttcc aaagaatcac 3420tctctatctg
aaagagaaga aatacagccc ttgtgcctgg gaggttgtca gagcagaaat 3480catgagatct
ttttctttgt caacaaactt gcaagaaagt ttaagaagta aggaactcga 3540gaacctgtat
ttccaaggcg gcatgggtgg gccgggcgtg ggtgttccgg gcgtgggtgt 3600tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 3660gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 3720gccgggcgtg
ggtgttccgg gcgtgggtgt tccgggtggc ggtgtgccgg gcgcaggtgt 3780tcctggtgta
ggtgtgccgg gtgttggtgt gccgggtgtt ggtgtaccag gtggcggtgt 3840tccgggtgca
ggcgttccgg gtggcggtgt gccgggcgtg ggtgttccgg gcgtgggtgt 3900tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 3960gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 4020gccgggcgtg
ggtgttccgg gcgtgggtgt tccgggtggc ggtgtgccgg gcgcaggtgt 4080tcctggtgta
ggtgtgccgg gtgttggtgt gccgggtgtt ggtgtaccag gtggcggtgt 4140tccgggtgca
ggcgttccgg gtggcggtgt gccgggcgtg ggtgttccgg gcgtgggtgt 4200tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 4260gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 4320gccgggcgtg
ggtgttccgg gcgtgggtgt tccgggtggc ggtgtgccgg gcgcaggtgt 4380tcctggtgta
ggtgtgccgg gtgttggtgt gccgggtgtt ggtgtaccag gtggcggtgt 4440tccgggtgca
ggcgttccgg gtggcggtgt gccgggcgtg ggtgttccgg gcgtgggtgt 4500tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 4560gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 4620gccgggcgtg
ggtgttccgg gcgtgggtgt tccgggtggc ggtgtgccgg gcgcaggtgt 4680tcctggtgta
ggtgtgccgg gtgttggtgt gccgggtgtt ggtgtaccag gtggcggtgt 4740tccgggtgca
ggcgttccgg gtggcggtgt gccgggcgtg ggtgttccgg gcgtgggtgt 4800tccgggtggc
ggtgtgccgg gcgcaggtgt tcctggtgta ggtgtgccgg gtgttggtgt 4860gccgggtgtt
ggtgtaccag gtggcggtgt tccgggtgca ggcgttccgg gtggcggtgt 4920gccgggctgg
ccgtgataag ctagcatgac tggtggacag caaatgggtc ggatccgaat 4980tctgcagata
tccatcacac tggcggccgc tcgagcagat ccggctgcta acaaagcccg 5040aaaggaagct
gagttggctg ctgccaccgc tgagcaataa ctagcataac cccttggggc 5100ctctaaacgg
gtcttgaggg gttttttgct gaaaggagga actatatccg gataa 5155
SEQ ID NO: 45. Length: 4705. Type: DNA. Organism: Artificial Sequence. Other information: Synthetic construct.
ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tatacatatg tgtgatctgc ctcaaaccca 3060cagcctgggt
agcaggagga ccttgatgct cctggcacag atgaggagaa tctctctttt 3120ctcctgcttg
aaggacagac atgactttgg atttccccag gaggagtttg gcaaccagtt 3180ccaaaaggct
gaaaccatcc ctgtcctcca tgagatgatc cagcagatct tcaatctctt 3240cagcacaaag
gactcatctg ctgcttggga tgagaccctc ctagacaaat tctacactga 3300actctaccag
cagctgaatg acctggaagc ctgtgtgata cagggggtgg gggtgacaga 3360gactcccctg
atgaaggagg actccattct ggctgtgagg aaatacttcc aaagaatcac 3420tctctatctg
aaagagaaga aatacagccc ttgtgcctgg gaggttgtca gagcagaaat 3480catgagatct
ttttctttgt caacaaactt gcaagaaagt ttaagaagta aggaactcga 3540gaacctgtat
ttccaaggcg gcatgggtgg gccgggcgtg ggtgttccgg gcgtaggtgt 3600cccaggtgtg
ggcgtaccgg gcgttggtgt tcctggtgtc ggcgtgccgg gcgtgggtgt 3660tccgggcgta
ggtgtcccag gtgtgggcgt accgggcgtt ggtgttcctg gtgtcggcgt 3720gccgggcgtg
ggtgttccgg gcgtaggtgt cccaggtgtg ggcgtaccgg gcgttggtgt 3780tcctggtgtc
ggcgtgccgg gcgtgggtgt tccgggcgta ggtgtcccag gtgtgggcgt 3840accgggcgtt
ggtgttcctg gtgtcggcgt gccgggcgtg ggtgttccgg gcgtaggtgt 3900cccaggtgtg
ggcgtaccgg gcgttggtgt tcctggtgtc ggcgtgccgg gcgtgggtgt 3960tccgggcgta
ggtgtcccag gtgtgggcgt accgggcgtt ggtgttcctg gtgtcggcgt 4020gccgggcgtg
ggtgttccgg gcgtaggtgt cccaggtgtg ggcgtaccgg gcgttggtgt 4080tcctggtgtc
ggcgtgccgg gcgtgggtgt tccgggcgta ggtgtcccag gtgtgggcgt 4140accgggcgtt
ggtgttcctg gtgtcggcgt gccgggcgtg ggtgttccgg gcgtaggtgt 4200cccaggtgtg
ggcgtaccgg gcgttggtgt tcctggtgtc ggcgtgccgg gcgtgggtgt 4260tccgggcgta
ggtgtcccag gtgtgggcgt accgggcgtt ggtgttcctg gtgtcggcgt 4320gccgggcgtg
ggtgttccgg gcgtaggtgt cccaggtgtg ggcgtaccgg gcgttggtgt 4380tcctggtgtc
ggcgtgccgg gcgtgggtgt tccgggcgta ggtgtcccag gtgtgggcgt 4440accgggcgtt
ggtgttcctg gtgtcggcgt gccgggctgg ccgtgataag ctagcatgac 4500tggtggacag
caaatgggtc ggatccgaat tctgcagata tccatcacac tggcggccgc 4560tcgagcagat
ccggctgcta acaaagcccg aaaggaagct gagttggctg ctgccaccgc 4620tgagcaataa
ctagcataac cccttggggc ctctaaacgg gtcttgaggg gttttttgct 4680gaaaggagga
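actatatccg gataa 4705

SEQ ID NO: 46; length: 35; type: DNA; organism: Artificial Sequence; description: oligo
tcgagaacct gtatttccag ggcgggtgct gcggc 35

SEQ ID NO: 47; length: 35; type: DNA; organism: Artificial Sequence; description: oligo
cttggccgca gcagccgccc tggaaataca ggttc 35

SEQ ID NO: 48; length: 14; type: PRT; organism: Artificial Sequence; description: synthetic construct
Leu Glu Asn Leu Tyr Phe Gln Gly Gly Cys Cys Gly Gln Gly

SEQ ID NO: 49; length: 82; type: DNA; organism: Artificial Sequence; description: annealed oligo
ctcgagaacc tgtatttcca gggcgggtgc tgcggccaag ggagctcttg gacataaagg 60
tcccgcccac gacgccggtt cc 82

SEQ ID NO: 50; length: 4681; type: DNA; organism: Artificial Sequence; description: Synthetic construct
ttcttgaaga cgaaagggcc tcgtgatacg cctattttta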
taggttaatg tcatgataat 60aatggtttct tagacgtcag gtggcacttt tcggggaaat
gtgcgcggaa cccctatttg 120tttatttttc taaatacatt caaatatgta tccgctcatg
agacaataac cctgataaat 180gcttcaataa tattgaaaaa ggaagagtat gagtattcaa
catttccgtg tcgcccttat 240tccctttttt gcggcatttt gccttcctgt ttttgctcac
ccagaaacgc tggtgaaagt 300aaaagatgct gaagatcagt tgggtgcacg agtgggttac
atcgaactgg atctcaacag 360cggtaagatc cttgagagtt ttcgccccga agaacgtttt
ccaatgatga gcacttttaa 420agttctgcta tgtggcgcgg tattatcccg tgttgacgcc
gggcaagagc aactcggtcg 480ccgcatacac tattctcaga atgacttggt tgagtactca
ccagtcacag aaaagcatct 540tacggatggc atgacagtaa gagaattatg cagtgctgcc
ataaccatga gtgataacac 600tgcggccaac ttacttctga caacgatcgg aggaccgaag
gagctaaccg cttttttgca 660caacatgggg gatcatgtaa ctcgccttga tcgttgggaa
ccggagctga atgaagccat 720accaaacgac gagcgtgaca ccacgatgcc tgcagcaatg
gcaacaacgt tgcgcaaact 780attaactggc gaactactta ctctagcttc ccggcaacaa
ttaatagact ggatggaggc 840ggataaagtt gcaggaccac ttctgcgctc ggcccttccg
gctggctggt ttattgctga 900taaatctgga gccggtgagc gtgggtctcg cggtatcatt
gcagcactgg ggccagatgg 960taagccctcc cgtatcgtag ttatctacac gacggggagt
caggcaacta tggatgaacg 1020aaatagacag atcgctgaga taggtgcctc actgattaag
cattggtaac tgtcagacca 1080agtttactca tatatacttt agattgattt aaaacttcat
ttttaattta aaaggatcta 1140ggtgaagatc ctttttgata atctcatgac caaaatccct
taacgtgagt tttcgttcca 1200ctgagcgtca gaccccgtag aaaagatcaa aggatcttct
tgagatcctt tttttctgcg 1260cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca
gcggtggttt gtttgccgga 1320tcaagagcta ccaactcttt ttccgaaggt aactggcttc
agcagagcgc agataccaaa 1380tactgtcctt ctagtgtagc cgtagttagg ccaccacttc
aagaactctg tagcaccgcc 1440tacatacctc gctctgctaa tcctgttacc agtggctgct
gccagtggcg ataagtcgtg 1500tcttaccggg ttggactcaa gacgatagtt accggataag
gcgcagcggt cgggctgaac 1560ggggggttcg tgcacacagc ccagcttgga gcgaacgacc
tacaccgaac tgagatacct 1620acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg
agaaaggcgg acaggtatcc 1680ggtaagcggc agggtcggaa caggagagcg cacgagggag
cttccagggg gaaacgcctg 1740gtatctttat agtcctgtcg ggtttcgcca cctctgactt
gagcgtcgat ttttgtgatg 1800ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac
gcggcctttt tacggttcct 1860ggccttttgc tggccttttg ctcacatgtt ctttcctgcg
ttatcccctg attctgtgga 1920taaccgtatt accgcctttg agtgagctga taccgctcgc
cgcagccgaa cgaccgagcg 1980cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg
cggtattttc tccttacgca 2040tctgtgcggt atttcacacc gcatatatgg tgcactctca
gtacaatctg ctctgatgcc 2100gcatagttaa gccagtatac actccgctat cgctacgtga
ctgggtcatg gctgcgcccc 2160gacacccgcc aacacccgct gacgcgccct gacgggcttg
tctgctcccg gcatccgctt 2220acagacaagc tgtgaccgtc tccgggagct gcatgtgtca
gaggttttca ccgtcatcac 2280cgaaacgcgc gaggcagctg cggtaaagct catcagcgtg
gtcgtgaagc gattcacaga 2340tgtctgcctg ttcatccgcg tccagctcgt tgagtttctc
cagaagcgtt aatgtctggc 2400ttctgataaa gcgggccatg ttaagggcgg ttttttcctg
tttggtcact gatgcctccg 2460tgtaaggggg atttctgttc atgggggtaa tgataccgat
gaaacgagag aggatgctca 2520cgatacgggt tactgatgat gaacatgccc ggttactgga
acgttgtgag ggtaaacaac 2580tggcggtatg gatgcggcgg gaccagagaa aaatcactca
gggtcaatgc cagcgcttcg 2640ttaatacaga tgtaggtgtt ccacagggta gccagcagca
tcctgcgatg cagatccgga 2700acataatggt gcagggcgct gacttccgcg tttccagact
ttacgaaaca cggaaaccga 2760agaccattca tgttgttgct caggtcgcag acgttttgca
gcagcagtcg cttcacgttc 2820gctcgcgtat cggtgattca ttctgctaac cagtaaggca
accccgccag cctagccggg 2880tcctcaacga caggagcacg atcatgcgca cccgtggcca
ggacccaacg ctgcccgaga 2940tctcgatccc gcgaaattaa tacgactcac tatagggaga
ccacaacggt ttccctctag 3000aaataatttt gtttaacttt aagaaggaga tatacatatg
ccgctcgaga acctgtattt 3060ccagggcggg tgctgcggcc aaggcggcat gggtgggccg
ggcgtgggtg ttccgggcgt 3120gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct
ggtgtaggtg tgccgggtgt 3180tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg
ggtgcaggcg ttccgggtgg 3240cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg
ggtggcggtg tgccgggcgc 3300aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg
ggtgttggtg taccaggtgg 3360cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg
ggcgtgggtg ttccgggcgt 3420gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct
ggtgtaggtg tgccgggtgt 3480tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg
ggtgcaggcg ttccgggtgg 3540cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg
ggtggcggtg tgccgggcgc 3600aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg
ggtgttggtg taccaggtgg 3660cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg
ggcgtgggtg ttccgggcgt 3720gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct
ggtgtaggtg tgccgggtgt 3780tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg
ggtgcaggcg ttccgggtgg 3840cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg
ggtggcggtg tgccgggcgc 3900aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg
ggtgttggtg taccaggtgg 3960cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg
ggcgtgggtg ttccgggcgt 4020gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct
ggtgtaggtg tgccgggtgt 4080tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg
ggtgcaggcg ttccgggtgg 4140cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg
ggtggcggtg tgccgggcgc 4200aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg
ggtgttggtg taccaggtgg 4260cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg
ggcgtgggtg ttccgggcgt 4320gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct
ggtgtaggtg tgccgggtgt 4380tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg
ggtgcaggcg ttccgggtgg 4440cggtgtgccg ggctggccgt gataagctag catgactggt
ggacagcaaa tgggtcggat 4500ccgaattctg cagatatcca tcacactggc ggccgctcga
gcagatccgg ctgctaacaa 4560agcccgaaag gaagctgagt tggctgctgc caccgctgag
caataactag cataacccct 4620tggggcctct aaacgggtct tgaggggttt tttgctgaaa
ggaggaacta tatccggata 4680
a 4681

SEQ ID NO: 51; length: 4231; type: DNA; organism: Artificial Sequence; description: Synthetic construct
ttcttgaaga cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat
60aatggtttct tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg
120tttatttttc taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat
180gcttcaataa tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat
240tccctttttt gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt
300aaaagatgct gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag
360cggtaagatc cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa
420agttctgcta tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg
480ccgcatacac tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct
540tacggatggc atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac
600tgcggccaac ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca
660caacatgggg gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat
720accaaacgac gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact
780attaactggc gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc
840ggataaagtt gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga
900taaatctgga gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg
960taagccctcc cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg
1020aaatagacag atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca
1080agtttactca tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta
1140ggtgaagatc ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca
1200ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg
1260cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga
1320tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa
1380tactgtcctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc
1440tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg
1500tcttaccggg ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac
1560ggggggttcg tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct
1620acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc
1680ggtaagcggc agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg
1740gtatctttat agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg
1800ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct
1860ggccttttgc tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga
1920taaccgtatt accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg
1980cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca
2040tctgtgcggt atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc
2100gcatagttaa gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc
2160gacacccgcc aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt
2220acagacaagc tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac
2280cgaaacgcgc gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga
2340tgtctgcctg ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc
2400ttctgataaa gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg
2460tgtaaggggg atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca
2520cgatacgggt tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac
2580tggcggtatg gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg
2640ttaatacaga tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga
2700acataatggt gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga
2760agaccattca tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc
2820gctcgcgtat cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg
2880tcctcaacga caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga
2940tctcgatccc gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag
3000aaataatttt gtttaacttt aagaaggaga tatacatatg ccgctcgaga acctgtattt
3060ccagggcggg tgctgcggcc aaggcggcat gggtgggccg ggcgtgggtg ttccgggcgt
3120aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt
3180gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt
3240cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt
3300tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt
3360gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt
3420aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt
3480gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt
3540cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt
3600tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt
3660gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt
3720aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt
3780gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt
3840cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt
3900tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt
3960gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg ggctggccgt gataagctag
4020catgactggt ggacagcaaa tgggtcggat ccgaattctg cagatatcca tcacactggc
4080ggccgctcga gcagatccgg ctgctaacaa agcccgaaag gaagctgagt tggctgctgc
4140caccgctgag caataactag cataacccct tggggcctct aaacgggtct tgaggggttt
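4200tttgctgaaa ggaggaacta tatccggata a 4231

SEQ ID NO: 52; length: 19; type: DNA; organism: Artificial Sequence; description: oligo
tggccttgct gctgataag 19

SEQ ID NO: 53; length: 26; type: DNA; organism: Artificial Sequence; description: oligo
ctagcttatc agcagcaagg ccagcc 26

SEQ ID NO: 54; length: 10; type: PRT; organism: Artificial Sequence; description: Synthetic construct
Pro Gly Trp Pro Cys Cys Xaa Xaa Ala Ser

SEQ ID NO: 55; length: 62; type: DNA; organism: Artificial Sequence; description: annealed oligos
gccgggctgg ccttgctgct gataagctag ccggcccgac cggaacgacg actattcgat 60
cg 62

SEQ ID NO: 56; length: 4719; type: DNA; organism: Artificial Sequence; description: Synthetic construct
ttcttgaaga cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat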
60aatggtttct tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg
120tttatttttc taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat
180gcttcaataa tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat
240tccctttttt gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt
300aaaagatgct gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag
360cggtaagatc cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa
420agttctgcta tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg
480ccgcatacac tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct
540tacggatggc atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac
600tgcggccaac ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca
660caacatgggg gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat
720accaaacgac gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact
780attaactggc gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc
840ggataaagtt gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga
900taaatctgga gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg
960taagccctcc cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg
1020aaatagacag atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca
1080agtttactca tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta
1140ggtgaagatc ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca
1200ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg
1260cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga
1320tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa
1380tactgtcctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc
1440tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg
1500tcttaccggg ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac
1560ggggggttcg tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct
1620acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc
1680ggtaagcggc agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg
1740gtatctttat agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg
1800ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct
1860ggccttttgc tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga
1920taaccgtatt accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg
1980cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca
2040tctgtgcggt atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc
2100gcatagttaa gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc
2160gacacccgcc aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt
2220acagacaagc tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac
2280cgaaacgcgc gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga
2340tgtctgcctg ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc
2400ttctgataaa gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg
2460tgtaaggggg atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca
2520cgatacgggt tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac
2580tggcggtatg gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg
2640ttaatacaga tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga
2700acataatggt gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga
2760agaccattca tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc
2820gctcgcgtat cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg
2880tcctcaacga caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga
2940tctcgatccc gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag
3000aaataatttt gtttaacttt aagaaggaga tatacatatg ccgctcgaga acctgtattt
3060ccaaggcggc atgggtgggc cgggcgtggg tgttccgggc gtgggtgttc cgggtggcgg
3120tgtgccgggc gcaggtgttc ctggtgtagg tgtgccgggt gttggtgtgc cgggtgttgg
3180tgtaccaggt ggcggtgttc cgggtgcagg cgttccgggt ggcggtgtgc cgggcgtggg
3240tgttccgggc gtgggtgttc cgggtggcgg tgtgccgggc gcaggtgttc ctggtgtagg
3300tgtgccgggt gttggtgtgc cgggtgttgg tgtaccaggt ggcggtgttc cgggtgcagg
3360cgttccgggt ggcggtgtgc cgggcgtggg tgttccgggc gtgggtgttc cgggtggcgg
3420tgtgccgggc gcaggtgttc ctggtgtagg tgtgccgggt gttggtgtgc cgggtgttgg
3480tgtaccaggt ggcggtgttc cgggtgcagg cgttccgggt ggcggtgtgc cgggcgtggg
3540tgttccgggc gtgggtgttc cgggtggcgg tgtgccgggc gcaggtgttc ctggtgtagg
3600tgtgccgggt gttggtgtgc cgggtgttgg tgtaccaggt ggcggtgttc cgggtgcagg
3660cgttccgggt ggcggtgtgc cgggcgtggg tgttccgggc gtgggtgttc cgggtggcgg
3720tgtgccgggc gcaggtgttc ctggtgtagg tgtgccgggt gttggtgtgc cgggtgttgg
3780tgtaccaggt ggcggtgttc cgggtgcagg cgttccgggt ggcggtgtgc cgggcgtggg
3840tgttccgggc gtgggtgttc cgggtggcgg tgtgccgggc gcaggtgttc ctggtgtagg
3900tgtgccgggt gttggtgtgc cgggtgttgg tgtaccaggt ggcggtgttc cgggtgcagg
3960cgttccgggt ggcggtgtgc cgggcgtggg tgttccgggc gtgggtgttc cgggtggcgg
4020tgtgccgggc gcaggtgttc ctggtgtagg tgtgccgggt gttggtgtgc cgggtgttgg
4080tgtaccaggt ggcggtgttc cgggtgcagg cgttccgggt ggcggtgtgc cgggcgtggg
4140tgttccgggc gtgggtgttc cgggtggcgg tgtgccgggc gcaggtgttc ctggtgtagg
4200tgtgccgggt gttggtgtgc cgggtgttgg tgtaccaggt ggcggtgttc cgggtgcagg
4260cgttccgggt ggcggtgtgc cgggcgtggg tgttccgggc gtgggtgttc cgggtggcgg
4320tgtgccgggc gcaggtgttc ctggtgtagg tgtgccgggt gttggtgtgc cgggtgttgg
4380tgtaccaggt ggcggtgttc cgggtgcagg cgttccgggt ggcggtgtgc cgggctggcc
4440ttgctgctga taagctagca tgactggtgg acagcaaatg ggtcgggatt caagcttggt
4500accgagctcg gatccactag taacggccgc cagtgtgctg gaattctgca gatatccatc
4560acactggcgg ccgctcgagc agatccggct gctaacaaag cccgaaagga agctgagttg
4620gctgctgcca ccgctgagca ataactagca taaccccttg gggcctctaa acgggtcttg
4680aggggttttt tgctgaaagg aggaactata tccggataa 4719

SEQ ID NO: 57; length: 4269; type: DNA; organism: Artificial Sequence; description: Synthetic construct
ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tatacatatg ccgctcgaga acctgtattt 3060ccaaggcggc
atgggtgggc cgggcgtggg tgttccgggc gtaggtgtcc caggtgtggg 3120cgtaccgggc
gttggtgttc ctggtgtcgg cgtgccgggc gtgggtgttc cgggcgtagg 3180tgtcccaggt
gtgggcgtac cgggcgttgg tgttcctggt gtcggcgtgc cgggcgtggg 3240tgttccgggc
gtaggtgtcc caggtgtggg cgtaccgggc gttggtgttc ctggtgtcgg 3300cgtgccgggc
gtgggtgttc cgggcgtagg tgtcccaggt gtgggcgtac cgggcgttgg 3360tgttcctggt
gtcggcgtgc cgggcgtggg tgttccgggc gtaggtgtcc caggtgtggg 3420cgtaccgggc
gttggtgttc ctggtgtcgg cgtgccgggc gtgggtgttc cgggcgtagg 3480tgtcccaggt
gtgggcgtac cgggcgttgg tgttcctggt gtcggcgtgc cgggcgtggg 3540tgttccgggc
gtaggtgtcc caggtgtggg cgtaccgggc gttggtgttc ctggtgtcgg 3600cgtgccgggc
gtgggtgttc cgggcgtagg tgtcccaggt gtgggcgtac cgggcgttgg 3660tgttcctggt
gtcggcgtgc cgggcgtggg tgttccgggc gtaggtgtcc caggtgtggg 3720cgtaccgggc
gttggtgttc ctggtgtcgg cgtgccgggc gtgggtgttc cgggcgtagg 3780tgtcccaggt
gtgggcgtac cgggcgttgg tgttcctggt gtcggcgtgc cgggcgtggg 3840tgttccgggc
gtaggtgtcc caggtgtggg cgtaccgggc gttggtgttc ctggtgtcgg 3900cgtgccgggc
gtgggtgttc cgggcgtagg tgtcccaggt gtgggcgtac cgggcgttgg 3960tgttcctggt
gtcggcgtgc cgggctggcc ttgctgctga taagctagca tgactggtgg 4020acagcaaatg
ggtcgggatt caagcttggt accgagctcg gatccactag taacggccgc 4080cagtgtgctg
gaattctgca gatatccatc acactggcgg ccgctcgagc agatccggct 4140gctaacaaag
cccgaaagga agctgagttg gctgctgcca ccgctgagca ataactagca 4200taaccccttg
gggcctctaa acgggtcttg aggggttttt tgctgaaagg aggaactata 4260
tccggataa 4269

SEQ ID NO: 58; length: 51; type: DNA; organism: Artificial Sequence; description: oligo
ctagaaataa ttttgtttaa ctttaagaag gagatatacc atgtgctgcc c 51

SEQ ID NO: 59; length: 51; type: DNA; organism: Artificial Sequence; description: oligo
catggggcag cacatggtat atctccttct taaagttaaa caaaattatt t 51

SEQ ID NO: 60; length: 6; type: PRT; organism: Artificial Sequence; description: Synthetic construct
Met Cys Cys Pro Met Gly

SEQ ID NO: 61; length: 114; type: DNA; organism: Artificial Sequence; description: annealed oligo
tctagaaata attttgttta actttaagaa ggagatatac catgtgctgc cccatggaga 60
tctttattaa aacaaattga aattcttcct ctatatggta cacgacgggg tacc 114

SEQ ID NO: 62; length: 4664; type: DNA; organism: Artificial Sequence; description: Synthetic construct
ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tataccatgt gctgccccat gggtgggccg 3060ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 3120ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 3180ggtgcaggcg
ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg 3240ggtggcggtg
tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg 3300ggtgttggtg
taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg 3360ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 3420ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 3480ggtgcaggcg
ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg 3540ggtggcggtg
tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg 3600ggtgttggtg
taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg 3660ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 3720ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 3780ggtgcaggcg
ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg 3840ggtggcggtg
tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg 3900ggtgttggtg
taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg 3960ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 4020ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 4080ggtgcaggcg
ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg 4140ggtggcggtg
tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg 4200ggtgttggtg
taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg 4260ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 4320ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 4380ggtgcaggcg
ttccgggtgg cggtgtgccg ggctggccga gcagcggtga ttacgatatc 4440ccaacgaccg
aaaacctgta ttttcagggc gcccatatgg gatccgaatt ctgcagatat 4500ccatcacact
ggcggccgct cgagcagatc cggctgctaa caaagcccga aaggaagctg 4560agttggctgc
tgccaccgct gagcaataac tagcataacc ccttggggcc tctaaacggg 4620tcttgagggg
ttttttgctg aaaggaggaa ctatatccgg ataa 4664

SEQ ID NO: 63; length: 4214; type: DNA; organism: Artificial Sequence; description: Synthetic construct
ttcttgaaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420agttctgcta
tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720accaaacgac
gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 900taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 960taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 1860ggccttttgc
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 1920taaccgtatt
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 1980cagcgagtca
gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2040tctgtgcggt
atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100gcatagttaa
gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc 2160gacacccgcc
aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 2220acagacaagc
tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280cgaaacgcgc
gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340tgtctgcctg
ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc 2400ttctgataaa
gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg 2460tgtaaggggg
atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520cgatacgggt
tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580tggcggtatg
gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640ttaatacaga
tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700acataatggt
gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga 2760agaccattca
tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc 2820gctcgcgtat
cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880tcctcaacga
caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940tctcgatccc
gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag 3000aaataatttt
gtttaacttt aagaaggaga tataccatgt gctgccccat gggtgggccg 3060ggcgtgggtg
ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct 3120ggtgtcggcg
tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg 3180ggcgttggtg
ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca 3240ggtgtgggcg
taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg 3300ggcgtaggtg
tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg 3360ggcgtgggtg
ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct 3420ggtgtcggcg
tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg 3480ggcgttggtg
ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca 3540ggtgtgggcg
taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg 3600ggcgtaggtg
tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg 3660ggcgtgggtg
ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct 3720ggtgtcggcg
tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg 3780ggcgttggtg
ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca 3840ggtgtgggcg
taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg 3900ggcgtaggtg
tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg 3960ggctggccga
gcagcggtga ttacgatatc ccaacgaccg aaaacctgta ttttcagggc 4020gcccatatgg
gatccgaatt ctgcagatat ccatcacact ggcggccgct cgagcagatc 4080cggctgctaa
caaagcccga aaggaagctg agttggctgc tgccaccgct gagcaataac 4140tagcataacc
ccttggggcc tctaaacggg tcttgagggg ttttttgctg aaaggaggaa 4200ctatatccgg
ataa
4214
64 24 DNA Artificial Sequence oligo
64 tggccgtgct gcagcagcgg tgat 24
65 27 DNA Artificial Sequence oligo
65 atcaccgctg ctgcagcacg gccagcc 27
66 11 PRT Artificial Sequence Synthetic construct
66 Pro Gly Trp Pro Cys Cys Ser Ser Gly Asp Ile
1 5 10
67 68 DNA Artificial Sequence annealed oligos
67 gccgggctgg ccgtgctgca gcagcggtga tatccggccc gaccggcacg acgtcgtcgc 60
cactatag 68
68 4652 DNA Artificial Sequence Synthetic construct
68 ttcttgaaga cgaaagggcc tcgtgatacg cctattttta
taggttaatg tcatgataat 60aatggtttct tagacgtcag gtggcacttt tcggggaaat
gtgcgcggaa cccctatttg 120tttatttttc taaatacatt caaatatgta tccgctcatg
agacaataac cctgataaat 180gcttcaataa tattgaaaaa ggaagagtat gagtattcaa
catttccgtg tcgcccttat 240tccctttttt gcggcatttt gccttcctgt ttttgctcac
ccagaaacgc tggtgaaagt 300aaaagatgct gaagatcagt tgggtgcacg agtgggttac
atcgaactgg atctcaacag 360cggtaagatc cttgagagtt ttcgccccga agaacgtttt
ccaatgatga gcacttttaa 420agttctgcta tgtggcgcgg tattatcccg tgttgacgcc
gggcaagagc aactcggtcg 480ccgcatacac tattctcaga atgacttggt tgagtactca
ccagtcacag aaaagcatct 540tacggatggc atgacagtaa gagaattatg cagtgctgcc
ataaccatga gtgataacac 600tgcggccaac ttacttctga caacgatcgg aggaccgaag
gagctaaccg cttttttgca 660caacatgggg gatcatgtaa ctcgccttga tcgttgggaa
ccggagctga atgaagccat 720accaaacgac gagcgtgaca ccacgatgcc tgcagcaatg
gcaacaacgt tgcgcaaact 780attaactggc gaactactta ctctagcttc ccggcaacaa
ttaatagact ggatggaggc 840ggataaagtt gcaggaccac ttctgcgctc ggcccttccg
gctggctggt ttattgctga 900taaatctgga gccggtgagc gtgggtctcg cggtatcatt
gcagcactgg ggccagatgg 960taagccctcc cgtatcgtag ttatctacac gacggggagt
caggcaacta tggatgaacg 1020aaatagacag atcgctgaga taggtgcctc actgattaag
cattggtaac tgtcagacca 1080agtttactca tatatacttt agattgattt aaaacttcat
ttttaattta aaaggatcta 1140ggtgaagatc ctttttgata atctcatgac caaaatccct
taacgtgagt tttcgttcca 1200ctgagcgtca gaccccgtag aaaagatcaa aggatcttct
tgagatcctt tttttctgcg 1260cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca
gcggtggttt gtttgccgga 1320tcaagagcta ccaactcttt ttccgaaggt aactggcttc
agcagagcgc agataccaaa 1380tactgtcctt ctagtgtagc cgtagttagg ccaccacttc
aagaactctg tagcaccgcc 1440tacatacctc gctctgctaa tcctgttacc agtggctgct
gccagtggcg ataagtcgtg 1500tcttaccggg ttggactcaa gacgatagtt accggataag
gcgcagcggt cgggctgaac 1560ggggggttcg tgcacacagc ccagcttgga gcgaacgacc
tacaccgaac tgagatacct 1620acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg
agaaaggcgg acaggtatcc 1680ggtaagcggc agggtcggaa caggagagcg cacgagggag
cttccagggg gaaacgcctg 1740gtatctttat agtcctgtcg ggtttcgcca cctctgactt
gagcgtcgat ttttgtgatg 1800ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac
gcggcctttt tacggttcct 1860ggccttttgc tggccttttg ctcacatgtt ctttcctgcg
ttatcccctg attctgtgga 1920taaccgtatt accgcctttg agtgagctga taccgctcgc
cgcagccgaa cgaccgagcg 1980cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg
cggtattttc tccttacgca 2040tctgtgcggt atttcacacc gcatatatgg tgcactctca
gtacaatctg ctctgatgcc 2100gcatagttaa gccagtatac actccgctat cgctacgtga
ctgggtcatg gctgcgcccc 2160gacacccgcc aacacccgct gacgcgccct gacgggcttg
tctgctcccg gcatccgctt 2220acagacaagc tgtgaccgtc tccgggagct gcatgtgtca
gaggttttca ccgtcatcac 2280cgaaacgcgc gaggcagctg cggtaaagct catcagcgtg
gtcgtgaagc gattcacaga 2340tgtctgcctg ttcatccgcg tccagctcgt tgagtttctc
cagaagcgtt aatgtctggc 2400ttctgataaa gcgggccatg ttaagggcgg ttttttcctg
tttggtcact gatgcctccg 2460tgtaaggggg atttctgttc atgggggtaa tgataccgat
gaaacgagag aggatgctca 2520cgatacgggt tactgatgat gaacatgccc ggttactgga
acgttgtgag ggtaaacaac 2580tggcggtatg gatgcggcgg gaccagagaa aaatcactca
gggtcaatgc cagcgcttcg 2640ttaatacaga tgtaggtgtt ccacagggta gccagcagca
tcctgcgatg cagatccgga 2700acataatggt gcagggcgct gacttccgcg tttccagact
ttacgaaaca cggaaaccga 2760agaccattca tgttgttgct caggtcgcag acgttttgca
gcagcagtcg cttcacgttc 2820gctcgcgtat cggtgattca ttctgctaac cagtaaggca
accccgccag cctagccggg 2880tcctcaacga caggagcacg atcatgcgca cccgtggcca
ggacccaacg ctgcccgaga 2940tctcgatccc gcgaaattaa tacgactcac tatagggaga
ccacaacggt ttccctctag 3000aaataatttt gtttaacttt aagaaggaga tataccatgg
gtgggccggg cgtgggtgtt 3060ccgggcgtgg gtgttccggg tggcggtgtg ccgggcgcag
gtgttcctgg tgtaggtgtg 3120ccgggtgttg gtgtgccggg tgttggtgta ccaggtggcg
gtgttccggg tgcaggcgtt 3180ccgggtggcg gtgtgccggg cgtgggtgtt ccgggcgtgg
gtgttccggg tggcggtgtg 3240ccgggcgcag gtgttcctgg tgtaggtgtg ccgggtgttg
gtgtgccggg tgttggtgta 3300ccaggtggcg gtgttccggg tgcaggcgtt ccgggtggcg
gtgtgccggg cgtgggtgtt 3360ccgggcgtgg gtgttccggg tggcggtgtg ccgggcgcag
gtgttcctgg tgtaggtgtg 3420ccgggtgttg gtgtgccggg tgttggtgta ccaggtggcg
gtgttccggg tgcaggcgtt 3480ccgggtggcg gtgtgccggg cgtgggtgtt ccgggcgtgg
gtgttccggg tggcggtgtg 3540ccgggcgcag gtgttcctgg tgtaggtgtg ccgggtgttg
gtgtgccggg tgttggtgta 3600ccaggtggcg gtgttccggg tgcaggcgtt ccgggtggcg
gtgtgccggg cgtgggtgtt 3660ccgggcgtgg gtgttccggg tggcggtgtg ccgggcgcag
gtgttcctgg tgtaggtgtg 3720ccgggtgttg gtgtgccggg tgttggtgta ccaggtggcg
gtgttccggg tgcaggcgtt 3780ccgggtggcg gtgtgccggg cgtgggtgtt ccgggcgtgg
gtgttccggg tggcggtgtg 3840ccgggcgcag gtgttcctgg tgtaggtgtg ccgggtgttg
gtgtgccggg tgttggtgta 3900ccaggtggcg gtgttccggg tgcaggcgtt ccgggtggcg
gtgtgccggg cgtgggtgtt 3960ccgggcgtgg gtgttccggg tggcggtgtg ccgggcgcag
gtgttcctgg tgtaggtgtg 4020ccgggtgttg gtgtgccggg tgttggtgta ccaggtggcg
gtgttccggg tgcaggcgtt 4080ccgggtggcg gtgtgccggg cgtgggtgtt ccgggcgtgg
gtgttccggg tggcggtgtg 4140ccgggcgcag gtgttcctgg tgtaggtgtg ccgggtgttg
gtgtgccggg tgttggtgta 4200ccaggtggcg gtgttccggg tgcaggcgtt ccgggtggcg
gtgtgccggg cgtgggtgtt 4260ccgggcgtgg gtgttccggg tggcggtgtg ccgggcgcag
gtgttcctgg tgtaggtgtg 4320ccgggtgttg gtgtgccggg tgttggtgta ccaggtggcg
gtgttccggg tgcaggcgtt 4380ccgggtggcg gtgtgccggg ctggccgtgc tgcagcagcg
gtgatatccc aacgaccgaa 4440aacctgtatt ttcagggcgc ccatatggga tccgaattct
gcagatatcc atcacactgg 4500cggccgctcg agcagatccg gctgctaaca aagcccgaaa
ggaagctgag ttggctgctg 4560ccaccgctga gcaataacta gcataacccc ttggggcctc
taaacgggtc ttgaggggtt 4620ttttgctgaa aggaggaact atatccggat aa
4652
69 4202 DNA Artificial Sequence Synthetic construct
69 ttcttgaaga cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat
60aatggtttct tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg
120tttatttttc taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat
180gcttcaataa tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat
240tccctttttt gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt
300aaaagatgct gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag
360cggtaagatc cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa
420agttctgcta tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg
480ccgcatacac tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct
540tacggatggc atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac
600tgcggccaac ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca
660caacatgggg gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat
720accaaacgac gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact
780attaactggc gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc
840ggataaagtt gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga
900taaatctgga gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg
960taagccctcc cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg
1020aaatagacag atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca
1080agtttactca tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta
1140ggtgaagatc ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca
1200ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg
1260cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga
1320tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa
1380tactgtcctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc
1440tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg
1500tcttaccggg ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac
1560ggggggttcg tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct
1620acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc
1680ggtaagcggc agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg
1740gtatctttat agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg
1800ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct
1860ggccttttgc tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga
1920taaccgtatt accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg
1980cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca
2040tctgtgcggt atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc
2100gcatagttaa gccagtatac actccgctat cgctacgtga ctgggtcatg gctgcgcccc
2160gacacccgcc aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt
2220acagacaagc tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac
2280cgaaacgcgc gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga
2340tgtctgcctg ttcatccgcg tccagctcgt tgagtttctc cagaagcgtt aatgtctggc
2400ttctgataaa gcgggccatg ttaagggcgg ttttttcctg tttggtcact gatgcctccg
2460tgtaaggggg atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca
2520cgatacgggt tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac
2580tggcggtatg gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg
2640ttaatacaga tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga
2700acataatggt gcagggcgct gacttccgcg tttccagact ttacgaaaca cggaaaccga
2760agaccattca tgttgttgct caggtcgcag acgttttgca gcagcagtcg cttcacgttc
2820gctcgcgtat cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg
2880tcctcaacga caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga
2940tctcgatccc gcgaaattaa tacgactcac tatagggaga ccacaacggt ttccctctag
3000aaataatttt gtttaacttt aagaaggaga tataccatgg gtgggccggg cgtgggtgtt
3060ccgggcgtag gtgtcccagg tgtgggcgta ccgggcgttg gtgttcctgg tgtcggcgtg
3120ccgggcgtgg gtgttccggg cgtaggtgtc ccaggtgtgg gcgtaccggg cgttggtgtt
3180cctggtgtcg gcgtgccggg cgtgggtgtt ccgggcgtag gtgtcccagg tgtgggcgta
3240ccgggcgttg gtgttcctgg tgtcggcgtg ccgggcgtgg gtgttccggg cgtaggtgtc
3300ccaggtgtgg gcgtaccggg cgttggtgtt cctggtgtcg gcgtgccggg cgtgggtgtt
3360ccgggcgtag gtgtcccagg tgtgggcgta ccgggcgttg gtgttcctgg tgtcggcgtg
3420ccgggcgtgg gtgttccggg cgtaggtgtc ccaggtgtgg gcgtaccggg cgttggtgtt
3480cctggtgtcg gcgtgccggg cgtgggtgtt ccgggcgtag gtgtcccagg tgtgggcgta
3540ccgggcgttg gtgttcctgg tgtcggcgtg ccgggcgtgg gtgttccggg cgtaggtgtc
3600ccaggtgtgg gcgtaccggg cgttggtgtt cctggtgtcg gcgtgccggg cgtgggtgtt
3660ccgggcgtag gtgtcccagg tgtgggcgta ccgggcgttg gtgttcctgg tgtcggcgtg
3720ccgggcgtgg gtgttccggg cgtaggtgtc ccaggtgtgg gcgtaccggg cgttggtgtt
3780cctggtgtcg gcgtgccggg cgtgggtgtt ccgggcgtag gtgtcccagg tgtgggcgta
3840ccgggcgttg gtgttcctgg tgtcggcgtg ccgggcgtgg gtgttccggg cgtaggtgtc
3900ccaggtgtgg gcgtaccggg cgttggtgtt cctggtgtcg gcgtgccggg ctggccgtgc
3960tgcagcagcg gtgatatccc aacgaccgaa aacctgtatt ttcagggcgc ccatatggga
4020tccgaattct gcagatatcc atcacactgg cggccgctcg agcagatccg gctgctaaca
4080aagcccgaaa ggaagctgag ttggctgctg ccaccgctga gcaataacta gcataacccc
4140ttggggcctc taaacgggtc ttgaggggtt ttttgctgaa aggaggaact atatccggat
4200aa
4202
70 510 DNA Artificial Sequence Synthetic construct
70 catatgtgtg
atctgcctca aacccacagc ctgggtagca ggaggacctt gatgctcctg 60gcacagatga
ggagaatctc tcttttctcc tgcttgaagg acagacatga ctttggattt 120ccccaggagg
agtttggcaa ccagttccaa aaggctgaaa ccatccctgt cctccatgag 180atgatccagc
agatcttcaa tctcttcagc acaaaggact catctgctgc ttgggatgag 240accctcctag
acaaattcta cactgaactc taccagcagc tgaatgacct ggaagcctgt 300gtgatacagg
gggtgggggt gacagagact cccctgatga aggaggactc cattctggct 360gtgaggaaat
acttccaaag aatcactctc tatctgaaag agaagaaata cagcccttgt 420gcctgggagg
ttgtcagagc agaaatcatg agatcttttt ctttgtcaac aaacttgcaa 480gaaagtttaa
gaagtaagga ataactcgag
510
71 507 DNA Artificial Sequence Synthetic construct
71 catatgtgtg atctgcctca
aacccacagc ctgggtagca ggaggacctt gatgctcctg 60gcacagatga ggagaatctc
tcttttctcc tgcttgaagg acagacatga ctttggattt 120ccccaggagg agtttggcaa
ccagttccaa aaggctgaaa ccatccctgt cctccatgag 180atgatccagc agatcttcaa
tctcttcagc acaaaggact catctgctgc ttgggatgag 240accctcctag acaaattcta
cactgaactc taccagcagc tgaatgacct ggaagcctgt 300gtgatacagg gggtgggggt
gacagagact cccctgatga aggaggactc cattctggct 360gtgaggaaat acttccaaag
aatcactctc tatctgaaag agaagaaata cagcccttgt 420gcctgggagg ttgtcagagc
agaaatcatg agatcttttt ctttgtcaac aaacttgcaa 480gaaagtttaa gaagtaagga
actcgag 507
72 169 PRT Artificial Sequence Synthetic construct
72 Gly Ala His Met Cys Asp Leu Pro Gln Thr His
Ser Leu Gly Ser Arg1 5 10
15Arg Thr Leu Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser
20 25 30Cys Leu Lys Asp Arg His Asp
Phe Gly Phe Pro Gln Glu Glu Phe Gly 35 40
45Asn Gln Phe Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met
Ile 50 55 60Gln Gln Ile Phe Asn Leu
Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp65 70
75 80Asp Glu Thr Leu Leu Asp Lys Phe Tyr Thr Glu
Leu Tyr Gln Gln Leu 85 90
95Asn Asp Leu Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr
100 105 110Pro Leu Met Lys Glu Asp
Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln 115 120
125Arg Ile Thr Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys
Ala Trp 130 135 140Glu Val Val Arg Ala
Glu Ile Met Arg Ser Phe Ser Leu Ser Thr Asn145 150
155 160Leu Gln Glu Ser Leu Arg Ser Lys Glu
165
73 169 PRT Artificial Sequence Synthetic construct
73 Gly Ala His
Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly Ser Arg1 5
10 15Arg Thr Leu Met Leu Leu Ala Gln Met
Arg Arg Ile Ser Leu Phe Ser 20 25
30Cys Leu Lys Asp Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Gly
35 40 45Asn Gln Phe Gln Lys Ala Glu
Thr Ile Pro Val Leu His Glu Met Ile 50 55
60Gln Gln Ile Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp65
70 75 80Asp Glu Thr Leu
Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln Gln Leu 85
90 95Asn Asp Leu Glu Ala Cys Val Ile Gln Gly
Val Gly Val Thr Glu Thr 100 105
110Pro Leu Met Lys Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln
115 120 125Arg Ile Thr Leu Tyr Leu Lys
Glu Lys Lys Tyr Ser Pro Cys Ala Trp 130 135
140Glu Val Val Arg Ala Glu Ile Met Arg Ser Phe Ser Leu Ser Thr
Asn145 150 155 160Leu Gln
Glu Ser Leu Arg Ser Lys Glu 165
74 169 PRT Artificial Sequence Synthetic construct
74 Gly Ala His Met Cys Asp Leu Pro Gln Thr His
Ser Leu Gly Ser Arg1 5 10
15Arg Thr Leu Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser
20 25 30Cys Leu Lys Asp Arg His Asp
Phe Gly Phe Pro Gln Glu Glu Phe Gly 35 40
45Asn Gln Phe Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met
Ile 50 55 60Gln Gln Ile Phe Asn Leu
Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp65 70
75 80Asp Glu Thr Leu Leu Asp Lys Phe Tyr Thr Glu
Leu Tyr Gln Gln Leu 85 90
95Asn Asp Leu Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr
100 105 110Pro Leu Met Lys Glu Asp
Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln 115 120
125Arg Ile Thr Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys
Ala Trp 130 135 140Glu Val Val Arg Ala
Glu Ile Met Arg Ser Phe Ser Leu Ser Thr Asn145 150
155 160Leu Gln Glu Ser Leu Arg Ser Lys Glu
165
75 169 PRT Artificial Sequence Synthetic construct
75 Gly Ala His
Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly Ser Arg1 5
10 15Arg Thr Leu Met Leu Leu Ala Gln Met
Arg Arg Ile Ser Leu Phe Ser 20 25
30Cys Leu Lys Asp Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Gly
35 40 45Asn Gln Phe Gln Lys Ala Glu
Thr Ile Pro Val Leu His Glu Met Ile 50 55
60Gln Gln Ile Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp65
70 75 80Asp Glu Thr Leu
Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln Gln Leu 85
90 95Asn Asp Leu Glu Ala Cys Val Ile Gln Gly
Val Gly Val Thr Glu Thr 100 105
110Pro Leu Met Lys Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln
115 120 125Arg Ile Thr Leu Tyr Leu Lys
Glu Lys Lys Tyr Ser Pro Cys Ala Trp 130 135
140Glu Val Val Arg Ala Glu Ile Met Arg Ser Phe Ser Leu Ser Thr
Asn145 150 155 160Leu Gln
Glu Ser Leu Arg Ser Lys Glu 165
76 169 PRT Artificial Sequence Synthetic construct
76 Gly Ala His Met Cys Asp Leu Pro Gln Thr His
Ser Leu Gly Ser Arg1 5 10
15Arg Thr Leu Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser
20 25 30Cys Leu Lys Asp Arg His Asp
Phe Gly Phe Pro Gln Glu Glu Phe Gly 35 40
45Asn Gln Phe Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met
Ile 50 55 60Gln Gln Ile Phe Asn Leu
Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp65 70
75 80Asp Glu Thr Leu Leu Asp Lys Phe Tyr Thr Glu
Leu Tyr Gln Gln Leu 85 90
95Asn Asp Leu Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr
100 105 110Pro Leu Met Lys Glu Asp
Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln 115 120
125Arg Ile Thr Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys
Ala Trp 130 135 140Glu Val Val Arg Ala
Glu Ile Met Arg Ser Phe Ser Leu Ser Thr Asn145 150
155 160Leu Gln Glu Ser Leu Arg Ser Lys Glu
165
77 169 PRT Artificial Sequence Synthetic construct
77 Gly Ala His
Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly Ser Arg1 5
10 15Arg Thr Leu Met Leu Leu Ala Gln Met
Arg Arg Ile Ser Leu Phe Ser 20 25
30Cys Leu Lys Asp Arg His Asp Phe Gly Phe Pro Gln Glu Glu Phe Gly
35 40 45Asn Gln Phe Gln Lys Ala Glu
Thr Ile Pro Val Leu His Glu Met Ile 50 55
60Gln Gln Ile Phe Asn Leu Phe Ser Thr Lys Asp Ser Ser Ala Ala Trp65
70 75 80Asp Glu Thr Leu
Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln Gln Leu 85
90 95Asn Asp Leu Glu Ala Cys Val Ile Gln Gly
Val Gly Val Thr Glu Thr 100 105
110Pro Leu Met Lys Glu Asp Ser Ile Leu Ala Val Arg Lys Tyr Phe Gln
115 120 125Arg Ile Thr Leu Tyr Leu Lys
Glu Lys Lys Tyr Ser Pro Cys Ala Trp 130 135
140Glu Val Val Arg Ala Glu Ile Met Arg Ser Phe Ser Leu Ser Thr
Asn145 150 155 160Leu Gln
Glu Ser Leu Arg Ser Lys Glu 165
78 173 PRT Artificial Sequence Synthetic construct
78 Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly
Ser Arg Arg Thr Leu1 5 10
15Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys
20 25 30Asp Arg His Asp Phe Gly Phe
Pro Gln Glu Glu Phe Gly Asn Gln Phe 35 40
45Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln
Ile 50 55 60Phe Asn Leu Phe Ser Thr
Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr65 70
75 80Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln
Gln Leu Asn Asp Leu 85 90
95Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met
100 105 110Lys Glu Asp Ser Ile Leu
Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr 115 120
125Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu
Val Val 130 135 140Arg Ala Glu Ile Met
Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln Glu145 150
155 160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu
Tyr Phe Gln 165 170
79 173 PRT Artificial Sequence Synthetic construct
79 Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly
Ser Arg Arg Thr Leu1 5 10
15Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys
20 25 30Asp Arg His Asp Phe Gly Phe
Pro Gln Glu Glu Phe Gly Asn Gln Phe 35 40
45Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln
Ile 50 55 60Phe Asn Leu Phe Ser Thr
Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr65 70
75 80Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln
Gln Leu Asn Asp Leu 85 90
95Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met
100 105 110Lys Glu Asp Ser Ile Leu
Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr 115 120
125Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu
Val Val 130 135 140Arg Ala Glu Ile Met
Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln Glu145 150
155 160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu
Tyr Phe Gln 165 170
80 173 PRT Artificial Sequence Synthetic construct
80 Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly
Ser Arg Arg Thr Leu1 5 10
15Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys
20 25 30Asp Arg His Asp Phe Gly Phe
Pro Gln Glu Glu Phe Gly Asn Gln Phe 35 40
45Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln
Ile 50 55 60Phe Asn Leu Phe Ser Thr
Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr65 70
75 80Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln
Gln Leu Asn Asp Leu 85 90
95Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met
100 105 110Lys Glu Asp Ser Ile Leu
Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr 115 120
125Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu
Val Val 130 135 140Arg Ala Glu Ile Met
Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln Glu145 150
155 160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu
Tyr Phe Gln 165 170
81 173 PRT Artificial Sequence Synthetic construct
81 Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly
Ser Arg Arg Thr Leu1 5 10
15Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys
20 25 30Asp Arg His Asp Phe Gly Phe
Pro Gln Glu Glu Phe Gly Asn Gln Phe 35 40
45Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln
Ile 50 55 60Phe Asn Leu Phe Ser Thr
Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr65 70
75 80Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln
Gln Leu Asn Asp Leu 85 90
95Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met
100 105 110Lys Glu Asp Ser Ile Leu
Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr 115 120
125Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu
Val Val 130 135 140Arg Ala Glu Ile Met
Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln Glu145 150
155 160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu
Tyr Phe Gln 165 170
82 173 PRT Artificial Sequence Synthetic construct
82 Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly
Ser Arg Arg Thr Leu1 5 10
15Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys
20 25 30Asp Arg His Asp Phe Gly Phe
Pro Gln Glu Glu Phe Gly Asn Gln Phe 35 40
45Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln
Ile 50 55 60Phe Asn Leu Phe Ser Thr
Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr65 70
75 80Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln
Gln Leu Asn Asp Leu 85 90
95Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met
100 105 110Lys Glu Asp Ser Ile Leu
Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr 115 120
125Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu
Val Val 130 135 140Arg Ala Glu Ile Met
Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln Glu145 150
155 160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu
Tyr Phe Gln 165 170
83 173 PRT Artificial Sequence Synthetic construct
83 Met Cys Asp Leu Pro Gln Thr His Ser Leu Gly
Ser Arg Arg Thr Leu1 5 10
15Met Leu Leu Ala Gln Met Arg Arg Ile Ser Leu Phe Ser Cys Leu Lys
20 25 30Asp Arg His Asp Phe Gly Phe
Pro Gln Glu Glu Phe Gly Asn Gln Phe 35 40
45Gln Lys Ala Glu Thr Ile Pro Val Leu His Glu Met Ile Gln Gln
Ile 50 55 60Phe Asn Leu Phe Ser Thr
Lys Asp Ser Ser Ala Ala Trp Asp Glu Thr65 70
75 80Leu Leu Asp Lys Phe Tyr Thr Glu Leu Tyr Gln
Gln Leu Asn Asp Leu 85 90
95Glu Ala Cys Val Ile Gln Gly Val Gly Val Thr Glu Thr Pro Leu Met
100 105 110Lys Glu Asp Ser Ile Leu
Ala Val Arg Lys Tyr Phe Gln Arg Ile Thr 115 120
125Leu Tyr Leu Lys Glu Lys Lys Tyr Ser Pro Cys Ala Trp Glu
Val Val 130 135 140Arg Ala Glu Ile Met
Arg Ser Phe Ser Leu Ser Thr Asn Leu Gln Glu145 150
155 160Ser Leu Arg Ser Lys Glu Leu Glu Asn Leu
Tyr Phe Gln 165 170
84 451 PRT Artificial Sequence Synthetic construct
84 Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Gly Gly Val Pro Gly1 5 10
15Ala Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
20 25 30Gly Val Pro Gly Gly Gly Val
Pro Gly Ala Gly Val Pro Gly Gly Gly 35 40
45Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly
Val 50 55 60Pro Gly Ala Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro65 70
75 80Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly
Ala Gly Val Pro Gly 85 90
95Gly Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly
100 105 110Gly Val Pro Gly Ala Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly 115 120
125Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala
Gly Val 130 135 140Pro Gly Gly Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro145 150
155 160Gly Gly Gly Val Pro Gly Ala Gly Val Pro
Gly Val Gly Val Pro Gly 165 170
175Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala
180 185 190Gly Val Pro Gly Gly
Gly Val Pro Gly Val Gly Val Pro Gly Val Gly 195
200 205Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro
Gly Val Gly Val 210 215 220Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro225
230 235 240Gly Ala Gly Val Pro Gly Gly
Gly Val Pro Gly Val Gly Val Pro Gly 245
250 255Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly
Val Pro Gly Val 260 265 270Gly
Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly 275
280 285Val Pro Gly Ala Gly Val Pro Gly Gly
Gly Val Pro Gly Val Gly Val 290 295
300Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro305
310 315 320Gly Val Gly Val
Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly 325
330 335Gly Gly Val Pro Gly Ala Gly Val Pro Gly
Gly Gly Val Pro Gly Val 340 345
350Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly
355 360 365Val Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val 370 375
380Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly Gly Gly Val
Pro385 390 395 400Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Gly Gly Val Pro Gly
405 410 415Ala Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val 420 425
430Gly Val Pro Gly Gly Gly Val Pro Gly Ala Gly Val Pro Gly
Gly Gly 435 440 445Val Pro Gly
450
85 1353 DNA Artificial Sequence Synthetic construct
85 ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 60ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 120ggtgcaggcg
ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg 180ggtggcggtg
tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg 240ggtgttggtg
taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg 300ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 360ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 420ggtgcaggcg
ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg 480ggtggcggtg
tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg 540ggtgttggtg
taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg 600ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 660ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 720ggtgcaggcg
ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg 780ggtggcggtg
tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg 840ggtgttggtg
taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg 900ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 960ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 1020ggtgcaggcg
ttccgggtgg cggtgtgccg ggcgtgggtg ttccgggcgt gggtgttccg 1080ggtggcggtg
tgccgggcgc aggtgttcct ggtgtaggtg tgccgggtgt tggtgtgccg 1140ggtgttggtg
taccaggtgg cggtgttccg ggtgcaggcg ttccgggtgg cggtgtgccg 1200ggcgtgggtg
ttccgggcgt gggtgttccg ggtggcggtg tgccgggcgc aggtgttcct 1260ggtgtaggtg
tgccgggtgt tggtgtgccg ggtgttggtg taccaggtgg cggtgttccg 1320ggtgcaggcg
ttccgggtgg cggtgtgccg ggc
1353
86 301 PRT Artificial Sequence Synthetic construct
86 Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly1 5
10 15Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Val 20 25
30Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
35 40 45Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val 50 55
60Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro65
70 75 80Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly 85
90 95Val Gly Val Pro Gly Val Gly Val Pro Gly Val
Gly Val Pro Gly Val 100 105
110Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
115 120 125Val Pro Gly Val Gly Val Pro
Gly Val Gly Val Pro Gly Val Gly Val 130 135
140Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro145 150 155 160Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
165 170 175Val Gly Val Pro Gly Val Gly
Val Pro Gly Val Gly Val Pro Gly Val 180 185
190Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly 195 200 205Val Pro Gly Val
Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val 210
215 220Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
Val Gly Val Pro225 230 235
240Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
245 250 255Val Gly Val Pro Gly
Val Gly Val Pro Gly Val Gly Val Pro Gly Val 260
265 270Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
Pro Gly Val Gly 275 280 285Val Pro
Gly Val Gly Val Pro Gly Val Gly Val Pro Gly 290 295
300
87 903 DNA Artificial Sequence Synthetic construct
87 ggcgtgggtg ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct
60ggtgtcggcg tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg
120ggcgttggtg ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca
180ggtgtgggcg taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg
240ggcgtaggtg tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg
300ggcgtgggtg ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct
360ggtgtcggcg tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg
420ggcgttggtg ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca
480ggtgtgggcg taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg
540ggcgtaggtg tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg
600ggcgtgggtg ttccgggcgt aggtgtccca ggtgtgggcg taccgggcgt tggtgttcct
660ggtgtcggcg tgccgggcgt gggtgttccg ggcgtaggtg tcccaggtgt gggcgtaccg
720ggcgttggtg ttcctggtgt cggcgtgccg ggcgtgggtg ttccgggcgt aggtgtccca
780ggtgtgggcg taccgggcgt tggtgttcct ggtgtcggcg tgccgggcgt gggtgttccg
840ggcgtaggtg tcccaggtgt gggcgtaccg ggcgttggtg ttcctggtgt cggcgtgccg
900ggc
903