Patent application title: PERMANENT GENE CORRECTION BY MEANS OF NUCLEOTIDE-MODIFIED MESSENGER RNA
Inventors:
Michael Kormann (Tuebingen, DE)
Lauren Mays (Tuebingen, DE)
Assignees:
Eberhard Karls Universität Tübingen Medizinische Fakultät
IPC8 Class: AC12N922FI
USPC Class:
1 1
Class name:
Publication date: 2016-10-13
Patent application number: 20160298099
Abstract:
The present invention relates to a nucleotide-modified messenger RNA for
the permanent correction of a genetic alteration on a DNA. The invention
further relates to a nucleotide-modified messenger RNA in combination
with a repair template. It also relates to a pharmaceutical composition.
It finally relates to methods for the correction of a genetic alteration
on a DNA.
Claims:
1. Nuclease-encoding nucleotide-modified messenger RNA (nec-mRNA)
configured for the correction of a genetic alteration on a DNA.
2. nec-mRNA of claim 1, wherein in the nec-mRNA up to and including approx. 50% of the uridine nucleotides and up to and including approx. 50% of the cytidine nucleotides are modified by exchanging uridine for 2-thiouridine (s2U) or pseudouridine (ψ), and by exchanging cytidine for 5-methylcytidine (m5C).
3. nec-mRNA of claim 1, wherein the genetic alteration exists in a lung protein.
4. nec-mRNA of claim 3, wherein the genetic alteration exists in a surfactant protein.
5. nec-mRNA of claim 4, wherein the genetic alteration exists in a lung protein selected from the group consisting of: surfactant protein B (SP-B), cystic fibrosis transmembrane conductance regulator (CFTR), and Foxp3.
6. nec-mRNA of claim 1, wherein the nec-mRNA encodes a nuclease which is configured in such a way that it can bind upstream or downstream of the genetic alteration on the DNA.
7. nec-mRNA of claim 1, wherein the nuclease is selected from the group consisting of: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALEN), CRISPR/Cas9, and dimeric CRISPR RNA guided FokI nucleases.
8. nec-mRNA of claim 1, wherein the nec-mRNA is coupled to an aptamer.
9. nec-mRNA of claim 1, wherein the nec-mRNA is packed into a nanoparticle.
10. nec-mRNA of claim 9, wherein the nanoparticle is coated with chitosan.
11. nec-mRNA of claim 1 associated with a repair template.
12. nec-mRNA of claim 11, wherein the repair template comprises a nucleotide section which is exchangeable by homologous recombination (HR) against a section on the DNA comprising the genetic alteration.
13. nec-mRNA of claim 12, wherein the repair template is one of the following: packed into an adeno-associated viral vector (AAV), encoded by a plasmid DNA, packed into a lentiviral vector, or packed into a protein-capped adenoviral vector (AdV).
14. Pharmaceutical composition comprising a nuclease-encoding nucleotide-modified messenger RNA (nec-mRNA).
15. Pharmaceutical composition of claim 14, which further comprises a repair template.
16. Pharmaceutical composition of claim 14 configured for the treatment of a lung disease selected from the group consisting of: surfactant protein B deficiency, cystic fibrosis (CF), asthma, and chronic obstructive pulmonary disease (COPD).
17. Pharmaceutical composition of claim 14, which comprises the nec-mRNA of claims 1 to 8 and/or the combination of any of the claims 9 to 11.
18. Pharmaceutical composition of claim 14, which comprises the nec-mRNA of claim 11.
19. Method for the correction of a genetic alteration on a DNA comprising the following steps: (1) introducing a repair template into a DNA-containing cell, which comprises the genetic alteration to be corrected, (2) introducing a nec-mRNA into the cell.
20. The method of claim 19, wherein the cell is a lung cell and the introduction is realized by means of high pressure application of the repair template and the nec-mRNA into the lung.
21. The method of claim 20, wherein the nec-mRNA is the nec-mRNA of claim 11.
22. Method for the correction of a genetic alteration on a DNA comprising the following steps: (1) introducing a repair template into a living being having a genetically altered DNA to be corrected, (2) introducing a nec-mRNA into the living being.
23. The method of claim 22, wherein the introduction is realized by means of high pressure application of the repair template and the nec-mRNA into the lung of the living being.
24. The method of claim 23, wherein the nec-mRNA is the nec-mRNA of claim 11.
Description:
CROSS-REFERENCES TO RELATED APPLICATIONS
[0001] This application is a continuation of copending International Patent Application PCT/EP2014/071343 filed on Oct. 6, 2014 and designating the United States of America, which was published in English, and claims priority of German Patent Application DE 10 2013 111 099.1 filed on Oct. 8, 2013, which are both incorporated herein by reference in their entireties.
FIELD
[0002] The present invention concerns a nucleotide-modified messenger RNA for the permanent correction of a genetic alteration on a DNA. The invention further concerns a nucleotide-modified messenger RNA in combination with a repair template. It also concerns a pharmaceutical composition. It finally relates to methods for the correction of a genetic alteration on a DNA.
BACKGROUND
[0003] Gene therapy refers to the insertion of nucleic acids such as DNA or RNA into somatic cells of an individual, e.g. in order to treat a disease. Usually an intact gene is to be inserted into the genome of the target cell in order to replace a defective gene which is causally related to the development of the disease. In principle, gene therapy has a chance of success only for diseases which are based on the alteration of one gene or a small number of genes.
[0004] Surfactant protein B deficiency and cystic fibrosis (CF) are severe, congenital, fatal diseases for which no satisfactory therapies currently exist. Surfactant protein B deficiency is rare and occurs in about one out of one million newborns. Surfactant protein B (SP-B) is a pulmonary surfactant-associated protein that plays an essential role in alveolar stability by lowering the surface tension at the air-liquid interface in the lung. Mutations of the SP-B encoding gene (SFTPB) result in rapidly fatal respiratory failure associated with alveolar proteinosis within the first year of life. Cystic fibrosis is the most prevalent life-limiting autosomal-recessive disease in Caucasian populations. It is found in one out of 2,500 newborns and affects more than 70,000 people worldwide. Mutations in the gene coding for the "cystic fibrosis transmembrane conductance regulator" (CFTR), a chloride channel, result in impaired anion secretion and hyper-absorption of sodium across epithelia. Chronic lung disease is the major factor contributing to mortality and morbidity in CF patients. Even with current therapy, the mean survival is only between 30 and 40 years.
RELATED PRIOR ART
[0005] Current gene therapeutic efforts with DNA-based or viral vectors remain largely unsuccessful in treating these fatal illnesses. As the airways evolved in direct contact with the environment, their inherent defense mechanisms present a significant barrier to the delivery of foreign vectors into the lung. In addition, the use of most viral vectors presents a health risk since they can have oncogenic effects.
[0006] Kormann et al. (2011), Expression of therapeutic proteins after delivery of chemically modified mRNA in mice, Letters to Nature Biotechnology, pages 1-6, describe a therapeutic approach for the treatment of SP-B deficiency where a functional nucleotide-modified messenger RNA (mRNA) encoding SP-B is introduced into alveolar cells of the mouse. This is effected by intratracheal high pressure application of the SP-B mRNA. Through the nucleotide modification 25% of the uridine and cytidine were replaced by 2-thiouridine (s2U) and 5-methylcytidine (m5C), respectively. As a result the modified mRNA is less immunogenic and more stable than its unmodified counterpart. However, this approach has the disadvantage that the nucleotide-modified mRNA can compensate the SP-B deficiency only for a limited time period, i.e. until it is degraded by RNases, so that a detectable effect soon disappears. A permanent gene supplementation or gene correction cannot be realized.
[0007] McCaffrey et al. (2013), Targeted genome engineering with zinc-finger nucleases, TALENs and CRISPR. The buzz on the cut: from dream to reality, internet article posted on Jul. 1, 2013, in Therapeutics and tagged gene engineering, genome editing, mRNA [http://zon.trilinkbiotech.com/2013/07/01/the-buzz-on-thecut-from-dream-to-reality/], suggest the use of the ZFN and TALEN nucleases encoded by synthetic mRNAs for transient expression in genome engineering. The article mentions that mRNAs could be made less immunogenic and non-toxic by substitution of cytosine and uridine with 5-methylcytosine and pseudouridine.
[0008] US 2013/0117870A1 discloses the use of mRNA encoding the TALEN nuclease for producing genetically modified or transgenic animals, respectively. It also discloses the transfection of swine fibroblasts with TALEN-encoding nucleotide-modified mRNA.
SUMMARY OF THE INVENTION
[0009] Against this background it is an object of the present invention to provide a new substance which can be used as a tool within the framework of the gene therapy. With this substance the preconditions should be established for a permanent correction of a genetic alteration on a DNA.
[0010] This object is achieved by the provision of a nuclease-encoding nucleotide-modified messenger RNA (nec-mRNA).
[0011] The inventors have surprisingly recognized that the gene therapeutic use of a nec-mRNA establishes the preconditions for correcting a genetic alteration on the DNA in a permanent manner. For that purpose the nec-mRNA is transfected into the cytoplasm of a target cell, where it is translated into a nuclease. The nuclease is then transported into the nucleus. In the nucleus it can bind to the DNA which comprises the genetic alteration and can initiate a double-strand break (DSB). The DSB stimulates homologous recombination as a repair mechanism, thus establishing the precondition for an exchange of the genetic alteration for e.g. the wild type sequence of the corresponding DNA section. Here the merely temporary nuclease activity is advantageous, especially in contrast to any DNA- or virus-encoded nuclease activities where there is a risk of integration into the genome of the host, and gives additional therapeutic safety for the system according to the invention.
[0012] This finding was surprising. In the art so far nucleotide-modified mRNA is mostly used for the direct substitution of the deficient gene or protein, respectively. For example Kormann et al. (loc. cit.) describe a nucleotide-modified mRNA which encodes the red fluorescent protein (RFP), the mouse erythropoietin (mEpo) or the surfactant protein B. WO 2011/012316 also describes nucleotide-modified mRNA which encodes the surfactant protein B.
[0013] Kariko et al. (2012), Increased Erythropoiesis in Mice Injected With Submicrogram Quantities of Pseudouridine-containing mRNA Encoding Erythropoietin, Molecular Therapy, Vol. 16, No. 11, pages 1833-1844, describe the use of nucleotide-modified mRNA for the synthesis of erythropoietin in a mouse model and propose the therapeutic use of nucleotide-modified mRNA.
[0014] The use of nec-mRNA as a molecular tool for establishing a permanent gene correction is not described in the prior art.
[0015] According to the invention "nucleotide-modified messenger RNA" refers to an mRNA in which a part of the nucleotides, nucleosides or nucleobases is modified, i.e. changed. In this respect the terms "nucleotides" and "nucleosides" are used interchangeably. Preferably the modification is a chemical modification. This modification has the result that the mRNA is more stable and less immunogenic. Nucleotide-modified messenger RNA is generally known in the prior art, cf. for example WO 2011/012316. The content of the before-mentioned publication is incorporated herein by reference. Examples of chemically modified nucleotides or nucleosides are pseudouridine (ψ), 5-methylcytidine (m5C), N6-methyladenosine (m6A), 5-methyluridine (m5U) or 2-thiouridine (s2U).
[0016] According to the invention the use of a nec-mRNA also encompasses the use of different nec-mRNAs, such as a pair of nec-mRNAs, where each nec-mRNA could encode different nucleases. For example, one nuclease might bind and cleave upstream and the other nuclease downstream of the genetic alteration and, in doing so, create the optimum preconditions for a homologous recombination.
[0017] A "genetic alteration" refers to any change of the sequence on the DNA in comparison to the wild type, e.g. caused by a mutation or a polymorphism. Preferably the genetic alteration can be found in a gene resulting in a loss of function of the encoded protein or even in a complete knockout.
[0018] A "correction" of a genetic alteration on the DNA or a "gene correction" refers to a permanent exchange of the genetic alteration or the genetically altered gene for a nucleic acid section without such alteration, for example of the wild type.
[0019] According to a preferred embodiment up to and including approx. 100% of the uridine nucleotides and/or up to and including approx. 100% of the cytidine nucleotides, preferably up to and including approx. 70% of the uridine nucleotides and/or up to and including approx. 70% of the cytidine nucleotides, further preferably up to and including approx. 50% of the uridine nucleotides and/or up to and including approx. 50% of the cytidine nucleotides, further preferably up to and including approx. 25% of the uridine nucleotides and/or up to and including approx. 25% of the cytidine nucleotides, and highly preferably approx. 10% of the uridine nucleotides and/or approx. 10% of the cytidine nucleotides of the nec-mRNA are modified, further preferably by exchanging uridine for 2-thiouridine (s2U) and/or pseudouridine (ψ) and/or by exchanging cytidine for 5-methylcytidine (m5C).
[0020] This measure has the advantage that the prescribed content of nucleotide modifications provides an mRNA which is significantly stable and only slightly immunogenic. Moreover, the inventors surprisingly found that it is sufficient if only up to and including about 10% of the cytidines are replaced by 5-methylcytidine (m5C) and/or up to and including approx. 10% of the uridines are replaced by 2-thiouridine (s2U). The inventors could provide evidence that such slightly modified nec-mRNA is also stable and only slightly immunogenic. Since nucleotide modification is complex, this has the advantage that the nec-mRNA according to the invention, because of the low proportion of nucleotide modifications, can be produced in a cost-saving manner. Besides reducing costs, lowering the proportion of modified nucleotides also has the advantage that the efficiency of translation is increased. This is because very high proportions of specific modified nucleotides, such as 2-thiouridine (s2U), significantly interfere with the translation of the modified mRNA. With lower proportions, however, an optimum translation can be observed.
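To illustrate what the percentages above mean for a concrete transcript, the following minimal Python sketch counts the uridines and cytidines in an mRNA sequence and estimates how many of each would carry a modification at a given proportion. The function name, the default proportions of 10% and the example sequence are hypothetical and serve illustration only; in practice the positions of modified nucleotides are not chosen individually, so the numbers are expected values rather than exact counts.

```python
def modification_plan(mrna: str,
                      uridine_fraction: float = 0.10,
                      cytidine_fraction: float = 0.10) -> dict:
    """Estimate how many U and C residues of an mRNA would be modified.

    mrna              -- RNA sequence written with the letters A, C, G, U
    uridine_fraction  -- proportion of uridines exchanged (e.g. for s2U or psi)
    cytidine_fraction -- proportion of cytidines exchanged (e.g. for m5C)
    """
    for fraction in (uridine_fraction, cytidine_fraction):
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")

    seq = mrna.upper()
    u_total = seq.count("U")
    c_total = seq.count("C")

    # Expected counts, rounded to whole nucleotides.
    return {
        "uridines": u_total,
        "modified_uridines": round(u_total * uridine_fraction),
        "cytidines": c_total,
        "modified_cytidines": round(c_total * cytidine_fraction),
    }


if __name__ == "__main__":
    # Hypothetical 100-nucleotide sequence, used only to show the call.
    print(modification_plan("AUGCUUCCUU" * 10))
```

The same function can be called with 0.25, 0.50, 0.70 or 1.0 to compare the other proportions named in the preferred embodiment.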
[0021] According to a preferred embodiment the genetic alteration is located in a lung protein, preferably in a surfactant protein, further preferably in the surfactant protein B (SP-B), further preferably a receptor protein including cystic fibrosis transmembrane conductance regulator (CFTR), further preferably a transcription factor including Foxp3.
[0022] This measure has the advantage that the invention can be used as a therapeutic tool in the therapy of lung diseases for which no satisfactory therapies currently exist.
[0023] According to a preferred embodiment the nec-mRNA encodes a nuclease which is configured to bind to the DNA upstream and/or downstream of the genetic alteration.
[0024] This measure has the advantage that the encoded nuclease binds next to the genetic alteration, catalyzes a double-strand break (DSB) there, and thereby initiates cellular repair mechanisms including homologous recombination (HR). In this way only the genetic alteration can be replaced in a targeted manner, e.g. by the wild type.
[0025] The nuclease is preferably selected from the group consisting of: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALEN), CRISPR/Cas9, and dimeric CRISPR RNA guided FokI nucleases.
[0026] Zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALEN) are artificial endonucleases which bind to specific sequences upstream and/or downstream of the genetic alteration on the DNA via DNA-binding polypeptides. The targeted configuration, the structure and the functionality of these nucleases are known to the person skilled in the art. Reference is made in this connection to the document of Carlson et al. (2012), Targeting DNA with fingers and TALENs, Molecular Therapy-Nucleic Acids 1, e3. "Cas" stands for "CRISPR associated". CRISPR stands for "Clustered Regularly Interspaced Short Palindromic Repeats". Cas9 is a nuclease which was originally discovered in bacteria and which binds in a targeted manner to distinct sections of the DNA via the CRISPR/Cas system by means of a short complementary single-stranded RNA; cf. Mali et al. (2013), RNA-guided human genome engineering via Cas9, Science 339(6121), pages 823-826 and Cong et al. (2013), Multiplex genome engineering using CRISPR/Cas systems, Science 339(6121), pages 819-823; and Tsai et al. (2014), Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing, Nature Biotechnology 32, p. 569-576. These publications are incorporated herein by reference.
[0027] According to a preferred embodiment of the invention the nec-mRNA is coupled to an aptamer.
[0028] This measure has the advantage that the binding site of the nec-mRNA can be adjusted to defined target cells by a sequence-specific design of the aptamer. In this way it can be ensured that the nec-mRNA corrects the genetic alteration in a targeted manner only in specific cells where it is necessary. If, for example, the correction of a lung protein is intended, such aptamers are coupled to the nec-mRNA which selectively bind to cells of the lung tissue. Cells which are not affected by the genetic alteration remain untouched. This measure results in additional therapeutic safety. By the so-called SELEX method such aptamer sequences can be enriched which bind to the desired cell or the desired cell membrane structure, respectively.
[0029] According to a preferred embodiment the nec-mRNA is packed into nanoparticles.
[0030] This measure has the advantage that the absorption of the nec-mRNA into the cell is significantly improved, in particular into cells of the lung tissue. In particular, nanoparticle-associated or packed nec-mRNA can be administered intravenously (i.v.) and can still reach the lung cells. Such a lung-cell targeted i.v. administration would probably not be possible without using nanoparticles. Examples of appropriate nanoparticles are the lipid GL67/DOPE and biocompatible chitosan-coated nanoparticles. In this connection "nanoparticle" refers to a particle between approx. 1 and approx. 300 nanometers in size (hydrodynamic diameter), preferably between approx. 50 nm and approx. 250 nm, further preferably between approx. 75 nm and approx. 200 nm, further preferably between approx. 100 nm and approx. 175 nm, and highly preferably between approx. 150 nm and approx. 160 nm.
[0031] According to a preferred embodiment the nanoparticle is coated with chitosan.
[0032] This measure has the advantage that the respirability is further increased. In addition, chitosan has proven to be particularly biocompatible, resulting in an increased tolerance of the nec-mRNA by a living being.
[0033] Another subject matter of the present invention is the nec-mRNA according to the invention in combination with a repair template.
[0034] A "repair template" refers to a nucleic acid molecule, such as a DNA fragment, which comprises a nucleotide section which is to be exchanged by homologous recombination (HR) for the section on the DNA comprising the genetic alteration. For example, this nucleotide section corresponds to the wild type or the "healthy gene", respectively, which does not comprise the genetic alteration. Upstream and downstream of the genetic alteration the repair template comprises sections which are sufficiently homologous to the DNA so that hybridization and homologous recombination can take place after the nuclease has induced a double-strand break.
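The layout of a repair template described above (homologous section upstream, corrected nucleotide section, homologous section downstream) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the coordinate convention and the arm length are hypothetical, the present description does not prescribe any particular length of the homologous sections, and the example sequence is a toy string rather than a real locus.

```python
def build_repair_template(reference: str, start: int, end: int,
                          corrected: str, arm_length: int) -> str:
    """Assemble upstream arm + corrected section + downstream arm.

    reference  -- DNA sequence of the locus containing the genetic alteration
    start, end -- half-open interval [start, end) of the altered section
    corrected  -- sequence to be exchanged in, e.g. the wild type section
    arm_length -- length of each homologous flank (hypothetical parameter)
    """
    if not 0 <= start <= end <= len(reference):
        raise ValueError("alteration coordinates lie outside the reference")
    if start < arm_length or len(reference) - end < arm_length:
        raise ValueError("not enough flanking sequence for the requested arms")

    upstream_arm = reference[start - arm_length:start]    # homologous to DNA 5' of the alteration
    downstream_arm = reference[end:end + arm_length]      # homologous to DNA 3' of the alteration
    return upstream_arm + corrected + downstream_arm


if __name__ == "__main__":
    # Toy locus in which a single altered base "T" is to be replaced by "A".
    locus = "AAAACCCC" + "T" + "GGGGTTTT"
    print(build_repair_template(locus, start=8, end=9, corrected="A", arm_length=4))
```

Taking the flanks directly from the locus sequence keeps them identical to the target DNA, which is the property the homologous sections need for recombination after the double-strand break.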
[0035] The combination according to the invention of the nec-mRNA and the repair template allows a permanent correction of the genetic alteration by a lifelong expression of the corrected protein. As a consequence, nec-mRNA and repair template are in this way a "gene correction set" according to the invention.
[0036] It goes without saying that the repair template can comprise an inducible promoter by means of which the expression of the repaired gene can be controlled in a targeted manner.
[0037] According to a preferred embodiment of the invention the repair template is packed into an adeno-associated viral vector (AAV), and/or is encoded by a plasmid DNA, and/or is packed into a lentiviral vector, and/or is packed into a protein-capped adenoviral vector (AdV).
[0038] This measure has the advantage that it makes use of an established principle for introducing genetic information into the cell. AAV, lentiviral and AdV vectors have proven successful in the practice of gene transfer because of the absence of genotoxic side effects.
[0039] Another subject matter of the present invention is a pharmaceutical composition comprising a nec-mRNA, preferably the before-mentioned nec-mRNA according to the invention, and further preferably in addition a repair template. Furthermore, the pharmaceutical composition is provided for the treatment of a lung disease which might be surfactant protein B deficiency and/or cystic fibrosis (CF) and/or asthma and/or chronic obstructive pulmonary disease (COPD).
[0040] The features, characteristics and advantages of the nec-mRNA according to the invention also apply to the composition according to the invention.
[0041] Also disclosed is a method for the correction of a genetic alteration on a DNA comprising the following steps: (1) introducing a repair template into a DNA-containing cell, which comprises the genetic alteration to be corrected, (2) introducing a nec-mRNA into the cell. The cell can preferably be a lung cell and the introduction is preferably realized by means of high pressure application of the repair template and the nec-mRNA into the lung.
[0042] Both the repair template, preferably packed into an AAV vector, and the nec-mRNA, preferably packed into nanoparticles, can be administered systemically or intravenously, respectively. This is of particular advantage if application into the respiratory tract is not possible because of mucus obstruction of the lung.
[0043] Another subject matter of the present invention relates to a method for the correction of a genetic alteration on a DNA comprising the following steps: (1) introducing a repair template into a living being having a genetically altered DNA to be corrected, (2) introducing a nec-mRNA into the living being. The living being can preferably be a human being, and the introduction is preferably realized by means of high pressure application of the repair template and the nec-mRNA into the lung of the living being.
[0044] The features, advantages, further developments, embodiments disclosed in relation to the nec-mRNA, combination and pharmaceutical composition according to the invention apply to the methods according to the invention in equal measure.
[0045] It goes without saying that the before-mentioned features and those to be explained in the following embodiments can be used not only in the combination specifically indicated but also in other combinations or in isolation without departing from the scope of the invention.
[0046] The invention will now be explained on the basis of embodiments resulting in further advantages, characteristics and features.
[0047] Reference is made to the enclosed figures which show the following:
BRIEF DESCRIPTION OF THE FIGURES
[0048] FIG. 1 shows the optimizing of the nec-mRNA by varying the portion of modified nucleotides;
[0049] FIG. 2 shows the principle of the permanent gene correction by the use of nec-mRNA;
[0050] FIG. 3 shows the in vivo gene correction at the SP-B locus of a SP-B knockout mouse by means of a nec-mRNA and the resulting increase of the life span of the mouse; and
[0051] FIG. 4 shows the homology directed repair of the SP-B locus in mouse fibroblasts in vitro by TALEN, encoded by modified mRNA.
[0052] FIG. 5 SP-B expression in lung BALF. The difference in SP-B expression levels per µg total protein in BALF between transgenic SP-B mice on doxycycline and BALB/c mice was not significant (n.s.). Boxes represent medians±IQRs (interquartile ranges). n=4 mice per group.
[0053] FIG. 6 nec-mRNA cleaves the SP-B cassette, induces HDR in vitro, and is expressed in lung cells in vivo. (a) TALEN and ZFN candidates relative to the transgenic SP-B cassette. Transgenic SP-B mice-derived fibroblasts were used for (b-d); n.d., not detectable. (b) T7 assays to determine the frequency of TALEN- and ZFN-induced indels in genomic DNA harvested 3 d post-transfection (5 µg/cell). (c) T1- and Z3-induced indels following delivery as either mRNA or pDNA (0.5 or 5 µg). (d) Percent HDR 3 d following co-transfection of 5 µg T1 or Z3 mRNA (or pDNA) with 0-4 µg donor plasmid. Arrows denote NheI-sensitive cleavage products resulting from HDR. (e) Time-course showing kinetics and stability of 3×FLAG-tagged Z3 mRNA versus Z3 AAV in A549 cells (n=3). (f) Anti-3×FLAG flow cytometry shows protein expression in total lung cells and ATII cells. Boxes, medians±IQRs; whiskers, minimum and maximum; *, P<0.05 versus unmodified; ** and ***, P<0.01 and P<0.001 versus no NPs. (g) Immunostaining for 3×FLAG in lung sections from mice described in f. Scale bar, 50 µm. Arrows indicate 3×FLAG expression.
[0054] FIG. 7 Expression of FLAG-tagged TALENs and ZFNs in MLE12 cells. a, MLE12 cells (murine ATII cells) were transfected with 1 µg of each TALEN and ZFN plasmid (left bar=left TALEN/ZFN, right bar=right TALEN/ZFN) or untransfected (control). After 24 h the expression was determined by flow cytometry. Transfection efficiency (% 3×FLAG expression, left y axis) and median fluorescence intensity (right axis, blue lines) from three pooled samples each are shown. b, Representative FACS dot plots of MLE12 cells expressing 3×FLAG. MLE12 cells were transfected with plasmids encoding for TALEN (T) 1, 2, 3, and ZFN (Z) 2, 3, 4 and 5. L, left arm. R, right arm. All assays were performed in biological triplicates.
[0055] FIG. 8 Schematic of the T7 assay to prove cleavage of the transgenic SP-B promoter region. Fibroblasts from transgenic SP-B mice were transfected with TALEN or ZFN plasmid pairs (5 µg) (1.), leading to cellular repair, with NHEJ leading to approximately 1-5% indels (insertions and/or deletions) (2.). Genomic DNA was harvested 4 d after transfection and a locus-specific PCR was performed (3.). PCR products were melted at 95 °C, re-annealed at gradually decreasing temperatures (4.) and treated with T7 endonuclease (5.). T7 endonuclease only cuts heterodimers at sites of mismatch, resulting in smaller fragments which can then be visualized on agarose gels (6.) to determine the frequency of nuclease-induced indels in the samples.
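The last step of the T7 assay, converting gel band intensities into an indel frequency, can be sketched as follows. The description above does not state which conversion was applied, so the formula used here is an assumption: it is the estimate commonly applied to mismatch-cleavage assays, indel % = 100 × (1 − sqrt(1 − fraction cleaved)), where the fraction cleaved is the summed intensity of the cleavage bands divided by the total intensity. The function name and the band intensities are hypothetical.

```python
import math
from typing import Sequence


def indel_percent(uncut: float, cut_fragments: Sequence[float]) -> float:
    """Estimate the indel frequency (%) from agarose gel band intensities.

    uncut         -- intensity of the full-length, uncleaved PCR product band
    cut_fragments -- intensities of the smaller T7 cleavage product bands
    """
    if uncut < 0 or any(band < 0 for band in cut_fragments):
        raise ValueError("band intensities must not be negative")

    cut = sum(cut_fragments)
    total = uncut + cut
    if total <= 0:
        raise ValueError("no signal to evaluate")

    fraction_cleaved = cut / total
    # Assumed conversion: after re-annealing, only heteroduplexes are cleaved,
    # so the cleaved fraction is corrected by the square-root term.
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


if __name__ == "__main__":
    # Hypothetical densitometry readings in arbitrary units.
    print(round(indel_percent(uncut=91.0, cut_fragments=[5.0, 4.0]), 2))
```

The square-root correction reflects that random re-annealing of wild type and mutated strands also produces homoduplexes that the endonuclease does not cut, so the raw cleaved fraction alone would misstate the indel frequency.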
[0056] FIG. 9 Z3 pair amino acid sequence. Amino acid sequence of FLAG-tagged a, left Z3 arm (SEQ ID no. 35) and b, right Z3 arm (SEQ ID no. 36).
[0057] FIG. 10 Z3 nec-mRNA is efficiently deposited in the lung in vivo. 100 µl PBS, 20 µg unmodified Z3 mRNA, or 20 µg Z3 nec-mRNA containing a 5' 3×FLAG tag (naked or complexed with nanoparticles, 100 µl total volume) were i.t. administered to BALB/c mice (n=5 mice per group). After 24 h, total RNA was extracted from lungs, reverse transcribed, and Z3 mRNA was quantified by qPCR. Mean+SD is shown. ***, P<0.001 versus "unmod. Z3 mRNA naked"; §, P<0.05 versus "unmod. Z3 mRNA+NP".
[0058] FIG. 11 3×FLAG+ Clara cells 24 h after i.t. administration of PBS or unmodified or differently modified Z3 nec-mRNA, with or without NPs. Z3 protein expression was quantified via flow cytometry against 3×FLAG (n=5 mice per group). Percentages of 3×FLAG+ Clara cells are shown. *, P<0.05 versus unmodified mRNA; **, P<0.01 versus "without NP". Boxes represent medians±IQRs. Whiskers represent the minimum and maximum observations.
[0059] FIG. 12 In vivo immune reaction to nec-mRNA. 1 µg of the Z3 mRNA panel was injected i.v. or i.p. into mice (n=3 per group). 6 h and 24 h post-injection, IFN-α was measured by ELISA in duplicates. Relative mRNA deposition amounts were determined by RT-qPCR of isolated lung tissue. *, P<0.05 versus "NP only".
[0060] FIG. 13 Rescue of SP-B deficient mice by in vivo gene manipulation. (a) Treatment scheme and Kaplan-Meier survival curves of transgenic SP-B mice treated i.t. with donor (2.5×10^11 v.g. AAV6-donor, AAV6-mock, or none) and nuclease (20 µg Z3 nec-mRNA-NP, mock-mRNA-NP, 5×10^10 v.g. Z3 AAV, or none), then withdrawn from doxycycline. Groups C-F, n=6; groups A and B, n=13, reduced to n=4 20 d post-doxycycline removal. Log-rank tests were performed. (b,c) Representative SP-B expression (brown) in lung tissue (b) and anti-SP-B blots on cell-free BALF supernatant (10 µg total protein/lane) (c) from mice described in a. Scale bar, 50 µm. Lavages and tissue were harvested 20 days after doxycycline removal. (d) Lung compliance normalized to respective body weight (n=4), 20 d after doxycycline removal. Baseline measurement performed for 20 min; values calculated prior to each hyperinflation. ***, P<0.001 versus groups C-F; §, §§, and §§§, P<0.05, P<0.01, and P<0.001 versus group D. (e,f) PCR on lung-isolated DNA from groups A and B or untargeted lungs; each lane represents an individual mouse. Samples were taken 20 d after doxycycline removal. (e) PCR of the targeted locus followed by T7 assays. Arrows show expected bands. (f) PCR using P1/P3 or P1/P2, followed by gel electrophoresis. #, untargeted control; §, DNA pool of groups A and B. Arrow indicates band resulting from HDR. (g) Schematic of the transgenic SP-B cassette, CAG integration and primer (P1, P2 and P3) locations for in-out PCRs. (h) Representative immunohistochemistry for groups A, B, and a doxycycline-control group (+Doxy) using two different anti-3×FLAG antibodies. Scale bar, 50 µm. Tissue was collected 20 d after doxycycline removal.
[0061] FIG. 14 Structures of TUB07-pFB-ZFN3-repair-template (A), TUB09-pFB-CMV-3Flag-NLS-38561-Fok-KKR-bGHpA (B), and TUB08-pFB-CMV-3Flag-NLS-38558-Fok-ELD-bGHpA (C).
[0062] FIG. 15 Ex vivo transgene integration. Fibroblasts, derived from transgenic SP-B mice, were transduced with 2.5.times.10.sup.5 v.g. of AAV6-donor and either a mock control ("untargeted"), 2 .mu.g Z3 nec-mRNA ("mRNA targeted) or 1.times.10.sup.5 v.g. AAV6-Z3 ("AAV targeted"). L, ladder. Lanes marked with "-" are the respective no-template negative controls. Given are the expected amplicon sizes.
[0063] FIG. 16 SP-B expression in mouse experimental groups A-F as measured by Western blot. SP-B expression in ng/.mu.g total protein was determined by Quantity One software (www.bio-rad.de). Boxes represent medians.+-.IQRs (interquartile ranges). Whiskers represent the minimum and maximum observations. n=6 mice per group were used.
[0064] FIG. 17 Semiquantitative analysis of the immunohistochemistry shown in FIG. 2b. Boxes represent medians.+-.IQRs (interquartile range). n=6 mice per group were used.
[0065] FIG. 18 % Resistance of mouse lungs after gene manipulation. Calculated by dividing the mean resistance values at the end of challenge by mean values at end of each washout period of the +Doxy group and main groups A and B, challenged with methacholine in rising concentrations over time to determine airway hyperresponsiveness. n=3 mice per group were used.
[0066] FIG. 19 Representative photographs of the lungs from groups +Doxy, A and B, before and after perfusion and lung function measurements. n=3 mice per group were examined.
[0067] FIG. 20 Representative photographs of the lungs from groups C to F, before and after perfusion and lung function measurements. n=3 mice per group were examined.
[0068] FIG. 21 Comparison of hemorrhagic counts (semiquantitative analysis of data from Supplementary FIGS. 14 and 15). If the left lung showed partial hemorrhage it was counted as 1; when more than half of the left lung area was hemorrhagic it was counted as 2. For each of the four right lung lobes, signs of hemorrhage were counted as 1 (resulting in a maximum count of 6). The straight lines represent the means. a, hemorrhage count before perfusion; b, hemorrhage count after perfusion. *, P<0.05 versus Doxy-control, groups A and B; .sup..sctn., P<0.05 versus Doxy-control and group B. (Mann-Whitney test, two-sided, asymptotic significance).
[0069] FIG. 22 Differential cell counts. Cells from lung lavages were stained with May-Grunwald/Giemsa, counted and related to 1 ml of BALF, 20 d after doxycycline removal.
[0070] FIG. 23 IL-12 ELISA in BALF. Cytokine levels were quantified in mouse BALF by ELISA at the date of sacrifice (mean.+-.s.e.m); n.s., not significant. Serum was tested 20 days after doxycycline removal.
[0071] FIG. 24 3.times.FLAG expression score (combined semiquantitative analysis of the immunohistochemistry shown in FIG. 2h). Boxes represent medians.+-.IQRs (interquartile range). *, P<0.05 versus Doxy-control and group A. The lavage was harvested 20 d after doxycycline removal.
[0072] FIG. 25 Expression of 3.times.FLAG+ in a) total lung cells and b) ATII cells. *, P<0.05 versus Doxy-control; **, P<0.01 versus Doxy-control. The tissue was harvested 20 d after doxycycline removal.
[0073] FIG. 26 Target site sequencing. We pooled sorted ATII cell samples within different experimental groups (A-F), one pool per group, performed single-cell separation, cloned PCR amplicons of the DNA (P1/P2) from those single cells in TOPO vectors, and sequenced the amplicons with primers P1 and P2. Subsequently, we performed an alignment of the sequences with the donor reference sequence, thereby identifying corrected cells (lower lanes depicting part of intron 1, named "rev 3. NBT1P1 . . . " and "fwd 4. NBT1P1 . . . ").
[0074] FIG. 27 In-out PCRs and T7 assays on DNA samples from donor only treated mice (group C). The non-appearance of any secondary band(s) demonstrates that no TI accidentally took place in the donor only group C. L, ladder. Lanes marked with "-" are the respective non-template negative controls.
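The hemorrhage count described for FIG. 21 can be illustrated by the following minimal sketch. It is not the inventors' scoring tool. The function name, the representation of the affected left-lung area as a fraction between 0 and 1, and the use of booleans for the four right lobes are assumptions made only for illustration.

```python
def hemorrhage_count(left_lung_fraction, right_lobes_hemorrhagic):
    """Return the semiquantitative hemorrhage count (0 to 6) for one mouse.

    left_lung_fraction: hypothetical input, fraction (0..1) of the left lung
        area that appears hemorrhagic.
    right_lobes_hemorrhagic: four booleans, one per right lung lobe.
    """
    lobes = list(right_lobes_hemorrhagic)
    if len(lobes) != 4:
        raise ValueError("exactly four right lung lobes are expected")
    if not 0.0 <= left_lung_fraction <= 1.0:
        raise ValueError("left_lung_fraction must lie between 0 and 1")

    # Left lung: 0 = no hemorrhage, 1 = partial, 2 = more than half of the area.
    if left_lung_fraction == 0.0:
        left_score = 0
    elif left_lung_fraction > 0.5:
        left_score = 2
    else:
        left_score = 1

    # Right lung: each lobe with signs of hemorrhage counts as 1.
    right_score = sum(1 for lobe in lobes if lobe)
    return left_score + right_score
```

A lung with a fully hemorrhagic left lung and four affected right lobes thus reaches the stated maximum count of 6.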
DESCRIPTION OF PREFERRED EMBODIMENTS
1. Optimizing the nec-mRNA
[0075] In Kormann et al. (2011; cit. loc.) it is described that the replacement of 25% each of the uridine and cytidine in the mRNA by 2-thiouridine and 5-methylcytidine in the SP-B deficient mouse results in a markedly stable SP-B mRNA of low immunogenicity. The inventors have tested in an experiment whether the portion of modified nucleotides can be further reduced. The result of such an experiment is shown in FIG. 1.
[0076] In a first approach the inventors manufactured an mRNA encoding the red fluorescent protein (RFP) in which different proportions of uridine and cytidine were replaced by 2-thiouridine (s2U) and 5-methylcytidine (m5C), namely 25% of each, 10% of each, and 100% of m5C and 10% of s2U. A-549 cells were transfected with these RNA molecules, and after 24 hours the median fluorescence intensity (MFI), as a measure of the transfection efficiency, and the positive expression were measured. The result is shown in FIG. 1A. It can be seen that with a substitution of 25% or 10% of each an optimum expression is detectable; cf. 4th and 5th column.
[0077] In a further approach the immunogenicity of the modified mRNA molecules was examined. For this purpose, in addition to the s2U and m5C modified mRNAs, pseudouridine (Psi) modified mRNA molecules were also manufactured, all of which encode zinc-finger nucleases (ZFN-5; both directions). PBMCs were transfected with the chemically modified mRNAs via liposome fusion (Lipofectamine 2000). Subsequently, the expression of IFN-alpha was measured by ELISA as a measure of the immunogenicity. The result is shown in FIG. 1B. Here it becomes evident that the use of an unmodified mRNA results in a very strong immunoreaction (column far right). When using 10% of each m5C/s2U (6th approach) and 25% of each m5C/s2U (5th approach), the immunogenicity is significantly reduced.
[0078] The low proportion of modified nucleotides has the advantage that the nec-mRNA can be produced significantly more cheaply than the nucleotide-modified mRNAs of the prior art, in which considerably higher portions of the nucleotides are replaced. It has the further advantage that the efficiency of the translation is optimized.
2. Principle of the Permanent Gene Correction by the Use of nec-mRNA
[0079] The inventors have developed a system to achieve a permanent correction of the gene loci which are present in mutated form in several diseases such as severe congenital lung diseases, in order to allow a stable lifelong expression of the corrected protein. This system is shown in FIG. 2. A lung-efficient AAV vector (1.) shuttles the repair template into the cell and the nucleus (2.). Subsequently, a modified, aptamer-coupled mRNA encoding a specific nuclease pair (3.) is transfected into the cytoplasm of the target cell (4.), where it is translated into the nuclease pair proteins. The duration and strength of the expression can be influenced or controlled by means of different chemical modifications (5.). The nuclease pair is transported into the nucleus, where it binds to the target region and creates a double-strand break (DSB) (6.). The DSB stimulates homologous recombination (HR) as a repair mechanism, exchanging the genetic defect for the corrected repair template (7.).
3. In Vivo Gene Correction at the SP-B Locus of the Mouse by nec-mRNA
[0080] Next, it was examined whether a modified mRNA encoding a nuclease can catalyze an effective gene correction in the lung cell of the mouse in vivo. For this purpose, experiments were performed with a mouse having SP-B deficiency. The employed mouse model is described in detail in Kormann et al. (2011; cit. loc.).
[0081] To this end, the inventors designed an mRNA encoding a TALEN pair which is specific for the SP-B locus. Furthermore, a repair template was designed which is encoded by an adeno-associated viral vector (AAV). It comprises a constitutive promoter upstream of the SP-B cDNA to make the gene expression independent from doxycycline. The repair template is schematically shown in FIG. 3a. The repair template which carries a fully-functional CAG promoter with TetO.sub.7-CMV and a truncated SP-B cDNA as homology arms is integrated into the genome via homologous recombination (HR) to overcome doxycycline dependency.
[0082] The nucleotide-modified mRNA (25% s2U/m5C) and the vector-encoded repair template were administered into the lung of the mice via a single high-pressure application. Subsequently, the delivery of doxycycline was stopped, which causes acute respiratory failure in the mice and, as a consequence, determines their life span. The result is shown in FIG. 3b. Mice which were treated with a combination of 25% s2U/m5C-modified, SP-B locus-specific TALEN mRNA and the repair template (n=6) survived significantly longer than the controls; cf. continuous right curve in comparison to the left dotted curve. The controls were treated with a correspondingly nucleotide-modified mRNA which encodes the red fluorescent protein (RFP) (n=5, Kaplan-Meier survival curves, Wilcoxon-Gehan test).
4. TALEN-Encoding Nucleotide-Modified mRNA Induces Homology-Directed Repair In Vitro
[0083] In a further experiment it was examined whether and to what extent a replacement or a correction of the genetic alteration on the DNA can be obtained. For this purpose, the DNA of SP-B fibroblasts was cleaved by TALEN, encoded by modified mRNA, and a repair template with an NheI restriction site was introduced. Subsequently, the extent of the homologous recombination was measured. The result is shown in FIG. 4. It becomes evident that a homologous recombination of 31% was reached, which demonstrates the suitability of the combination of nec-mRNA and repair template according to the invention.
5. Material and Methods
[0084] TALEN and ZFN reagents. TALENs and ZFNs targeting the transgenic SP-B cassette were screened by Dual Luciferase Single Strand Annealing Assay (DLSSA) and assembled using an archive of zinc-finger proteins, as previously described; Urnov, F. D. et al. Nature 435, 646-651 (2005). The full amino acid sequences of the Z3 pair are shown in FIG. 7. The ZFN expression vector was assembled as previously described; Doyon, Y. et al. Nat Biotechnol 26, 702-708 (2008).
[0085] Dual Luciferase Single Strand Annealing Assay (DLSSA). ZFNs were screened using a luciferase-based reporter system. This reporter-based assay system is composed of four mammalian expression vectors each under the control of the cytomegalovirus (CMV) immediate early promoter. The vectors are (1) ZFN1, (2) ZFN2, (3) pDLSSA-Firefly, and (4) pDLSSA-Renilla. The pDLSSA-Firefly vector contains a Firefly luciferase gene derived from the pGL3-Promoter vector (www.promega.com) with an internal .about.600 bp duplication of the middle part of the Firefly luciferase gene. DNA fragments that contain individual ZFN pair binding sites are inserted between these duplicated regions. pDLSSA-Renilla is derived from pRL-TK (www.promega.com) and expresses the Renilla luciferase gene. One day before transfection, 20,000 mouse Neuro2A cells (www.atcc.org) are seeded in a 96-well plate with Dulbecco's Modified Eagle Medium (DMEM; www.cellgro.com) plus 5 mM L-glutamine and 10% FBS. The four expression vectors described above (6.25 ng each) are co-transfected using Lipofectamine 2000 (Life Technologies). ZFN cleavage of the target plasmid followed by 5' to 3' end resection generates single-stranded DNA from the duplicated portion of the Firefly luciferase gene. Annealing of this complementary DNA and subsequent DNA repair creates an intact Firefly luciferase gene that reports the activity of the test ZFNs. Detection of Renilla luciferase serves as an internal control and allows for normalization of intra-transfection variability. Cells are harvested 24 hours post-transfection and the activities of both Firefly and Renilla luciferase are measured using the Dual-Glo Luciferase System (www.promega.com). ZFN activity is scored as the ratio of Firefly luciferase activity to Renilla luciferase activity.
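The scoring rule stated at the end of the preceding paragraph, namely the ratio of Firefly to Renilla luciferase activity, can be sketched as follows. The function names and the idea of averaging replicate wells are assumptions for illustration only; the source does not state how replicates were combined.

```python
def zfn_activity(firefly_signal, renilla_signal):
    """Score one well as Firefly activity normalized to Renilla activity."""
    if renilla_signal <= 0:
        # Without a Renilla signal the internal control is missing.
        raise ValueError("Renilla signal must be positive for normalization")
    return firefly_signal / renilla_signal


def mean_zfn_activity(wells):
    """Average the normalized score over replicate wells.

    wells: list of (firefly_signal, renilla_signal) tuples.
    Averaging of replicates is an assumption, not taken from the source.
    """
    if not wells:
        raise ValueError("at least one well is required")
    ratios = [zfn_activity(firefly, renilla) for firefly, renilla in wells]
    return sum(ratios) / len(ratios)
```

Normalizing each well before averaging keeps a well with poor transfection from dominating the score of a ZFN pair.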
[0086] Targeting vectors. The targeting vector carrying the CAG promoter was assembled from synthetic oligonucleotides (www.lifetechnologies.com) and PCR products, and was verified by sequencing. The NheI RFLP donor plasmid was constructed by removing the CAG promoter from the targeting vector by NheI digestion, leaving a single NheI restriction site, which was used in the RFLP assays.
[0087] Cell culture and transfection. For the T7 and HDR assays 1.times.10.sup.6 fibroblasts in 6-well plates were transfected as indicated in the respective figure legends using the Neon electroporation system (www.lifetechnologies.com) with 100 .mu.l tips. The electroporation settings were 1,650 Volts, 20 ms, 1 pulse. A549 cells (human ATII cells, the cell type responsible for SP-B expression in the lungs) were maintained at 37.degree. C. under 5% CO.sub.2 and grown in minimal essential medium (www.lifetechnologies.com), supplemented with 10% FCS, 1% penicillin-streptomycin. One day before transfection, 50,000 or 80,000 cells/well/500 .mu.l were plated in 24-well plates. The cells (70-90% confluent) were transfected with 5 .mu.g (T7 assays, fragment analyses and RFLP) or 1 .mu.g Z3 pair nec-mRNA (time-course experiment) using Neon electroporation (www.lifetechnologies.com) with a transfection mix volume of 100 .mu.l according to manufacturer's instructions or transduced with MOI of 1.times.10.sup.5 v.g. of each Z3 AAV6. For transfection experiments demonstrated in FIG. 6d, we equilibrated the DNA amounts by adding inert (empty vector) DNA to a total of 9 .mu.g each. For transduction, the cells were washed once with PBS and cultured in Opti-MEM; 6 h after transduction 10% FCS was supplied. After 24 h the medium was removed, the cells washed once with PBS and fresh culture medium was added. Primary fibroblasts from transgenic SP-B mice were obtained by removing the dorsal skin, followed by separation of epidermis from the dermis using dispase. After further digestion of the dermis using collagenase, the suspension was passaged through a 70 .mu.m strainer. After wash and centrifugation steps, the cell pellet was resuspended in fibroblast culture medium (DMEM/Ham's F-12 medium with L-glutamine, 10% MSC grade Fetal Calf Serum, 1.times.MEM non-essential amino acids, 1.times. sodium pyruvate, 1% penicillin/streptomycin, 0.1 mM 2-mercaptoethanol). 
For the time course experiments, the A549 cells were harvested 1, 2, 3, 4, 5, and 14 days after transfection, permeabilized using BD Cytofix/Cytoperm plus (www.bd.com), stained with APC anti-DYKDDDK clone L5 (www.biolegend.com) antibody, and analysed on an LSR-I flow cytometer (www.bd.com); data were analysed with BD FACSDiva software (www.bd.com).
[0088] Generation of (nec-)mRNA. To generate templates for in vitro transcription the 3.times.FLAG-tagged T1 and Z3 were cut out of their original vectors and subcloned into a PolyA-120 containing pVAX1 (www.lifetechnologies.com). The plasmids were linearized with XbaI and transcribed in vitro using the MEGAscript T7 Transcription kit (www.lifetechnologies.com), incorporating 25% 2-thio-UTP and 25% 5-methyl-CTP or 100% PseudoUTP and 100% 5-methyl-CTP (all from www.trilink.com). The anti reverse CAP analog (ARCA) capped synthesized nec-mRNAs were purified using the MEGAclear kit (www.lifetechnologies.com) and analyzed for size on agarose gels and for purity and concentration on a NanoPhotometer (www.implen.com).
[0089] T7 nuclease assay. Genomic DNA was extracted from fibroblasts using the DNeasy Blood & Tissue Kit (www.qiagen.com). A 50 .mu.l PCR reaction was set up using 100 ng of gDNA derived from fibroblasts previously transfected with 5 .mu.g T1 or Z3 pair, 0.5 .mu.M primers (for T1: fwd, GTAGGCGTGTACGGTGGGAG [SEQ ID No. 1]; rev, CAGCAGAGGGTAGGAAGCAGC [SEQ ID No. 2]; for Z3: fwd, TGTACGGTGGGAGGCCTAT [SEQ ID No. 3]; rev, CCTGGCAGGTGATGTGG [SEQ ID No. 4]), and AmpliTaq Gold 360 Mastermix (www.lifetechnologies.com). Another PCR reaction was performed using the same primer sets, but with gDNA from untransfected cells. The PCR products were run on agarose gels to verify size and sufficient amplification, pooled, purified by ethanol precipitation, dissolved in 20 .mu.l water and the DNA concentration was measured on a NanoPhotometer. 2 .mu.l NEBuffer 2 (www.neb.com), 2 .mu.g purified PCR product and water were brought to a total volume of 19 .mu.l. The DNA was hybridized in a thermocycler according to the following protocol: 95.degree. C. for 5 min, 95-85.degree. C. at -2.degree. C./sec, 85-25.degree. C. at -0.1.degree. C./sec, hold at 4.degree. C. 1 .mu.l (10 U) of T7E1 (www.neb.com, M0302L) was added and incubated at 37.degree. C. for 15 min. The reaction was stopped by adding 2 .mu.l of 0.25 M EDTA. The reaction was again purified by ethanol precipitation and dissolved in 15 .mu.l water. The nuclease specific cleavage products were determined on agarose gels. The band intensities were quantified using ImageJ (http://rsb.info.nih.gov/ij/).
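The paragraph above states that band intensities were quantified with ImageJ but does not give the formula used to convert them into a modification rate. The following sketch uses a commonly applied estimate for mismatch-cleavage assays, indel % = 100 x (1 - sqrt(1 - f)), where f is the cleaved fraction of the total band intensity. Its use here is an assumption and not the inventors' documented calculation; the function names are hypothetical.

```python
import math


def fraction_cleaved(parent_band, cleaved_bands):
    """Fraction of total lane intensity found in the cleavage products."""
    total = parent_band + sum(cleaved_bands)
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return sum(cleaved_bands) / total


def estimated_indel_percent(parent_band, cleaved_bands):
    """Estimate the percentage of modified alleles from band intensities.

    Assumes random re-annealing of wild-type and modified strands, so that
    the uncleaved fraction equals the squared fraction of unmodified strands.
    """
    f_cut = fraction_cleaved(parent_band, cleaved_bands)
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))
```

The square root corrects for the fact that heteroduplexes, not individual mutant strands, are cleaved by the T7 endonuclease.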
[0090] For measuring off-target effects, A549 cells were transfected with 5 .mu.g mRNA or transduced with 1.times.10.sup.5 v.g. AAV6-Z3. PCR and T7 assays were performed as described above (primers: off-target 1: fwd, GCAAGTTTGGCGTCGCTCCA [SEQ ID No. 5]; rev, AGAGGAAGGCGCGGCAGG [SEQ ID No. 6]; off-target 2: fwd, TTCTTGCTCCAGTGACTCTCTTA [SEQ ID No. 7]; rev, AGCCTAGTAAAGACAACACTAGTG [SEQ ID No. 8]; off-target 3: fwd, CAACGTGACCTGCGAGCG [SEQ ID No. 9]; rev, GTGCACGCTCCACTTCTCG [SEQ ID No. 10]; off-target 4: fwd, CTGGAGATGCATCCTTGTCTGT [SEQ ID No. 11]; rev, GAGGGTGAAGACTTTTGGAGCT [SEQ ID No. 12]; off-target 5: fwd, CAGCACCAGATGTTCCCTGTTA [SEQ ID No. 13]; rev, TGGAAAGCAATAGTTCTAGGATGA [SEQ ID No. 14]).
[0091] HDR/RFLP assay. Genomic DNA was extracted from fibroblasts or lung tissue using the DNeasy Blood & Tissue Kit (www.qiagen.com). T1 or Z3 target loci were amplified by PCR (40 cycles, 58.degree. C. annealing and 30 s elongation at 72.degree. C.; 5 min at 72.degree. C. to assure completion of amplicons) using 0.5 .mu.M of primers P1 (CCTGGCAGGTGATGTGG [SEQ ID No. 15]) and P3 (TGTACGGTGGGAGGCCTAT [SEQ ID No. 16]) with AmpliTaq Gold 360 Mastermix. In addition, in-out PCR reactions were performed using primers P1 and P2 (AGGCACTGGGCAGGTAAGTA [SEQ ID No. 17]).
[0092] Flow Cytometry. Harvested lungs were digested at 37.degree. C. for 1 hour on a rotating shaker in 1 mg/ml collagenase type I (www.lifetechnologies.com), 1% (500 U) DNase (www.epibio.com) solution. Digested lung was passed through a 40-.mu.m nylon cell strainer and erythrocytes were lysed using ACK Lysing Buffer (www.lifetechnologies.com). PE anti-CD45 clone 30-F11, PE anti-CD31 clone C13.3, APC anti-mouse Ly-6A (Sca-1) clone D7 (www.biolegend.com), FITC anti-FLAG M2 and anti-Clara cell secretory protein (www.sigmaaldrich.com) were used to stain lung cells. After staining for extracellular markers, cells were fixed and permeabilized using BD Cytofix/Cytoperm plus (www.bd.com), then stained with intracellular antibodies. Flow cytometer analyses were performed on an LSR-I flow cytometer (www.bd.com) and data were analysed with BD FACSDiva software (www.bd.com). Sorting of ATII and Clara cells was performed with a FACSAria (www.bd.com).
[0093] Nanoparticles. Chitosan (83% deacetylated; Protasan UP CL 113, www.novamatrix.biz) coated PLGA (poly-d,l-lactide-co-glycolide 75:25; Resomer RG 752H, www.evonik.de) nanoparticles (short: NPs) were prepared by using emulsion-diffusion-evaporation 15 with minor changes. In brief, 100 mg PLGA was dissolved in ethyl acetate and added dropwise to an aqueous 2.5% PVA solution (polyvinyl alcohol, Mowiol 4-88, www.kuraray.eu) containing 15 mg chitosan. This emulsion was stirred (1.5 h at RT), followed by homogenization at 17,000 rpm for 10 min using a Polytron PT 2500E (www.kinematica.ch). These positively charged NPs were sterile filtered and characterized by Malvern ZetasizerNano ZSP (hydrodynamic diameter: 157.3.+-.0.87 nm, PDI 0.11, zeta potential +30.8.+-.0.115 mV). After particle formation they were loaded with mRNA by mixing (weight ratio: 25:1).
[0094] Transgenic SP-B cassette, mRNA templates and AAVs. AAV serotype 6 vectors from the Z3 pair and the donor sequence were produced and purchased from Virovek (www.virovek.com). The sequence information can be retrieved from the Sequence Listing at SEQ ID nos. 24-34 and from FIG. 14.
[0095] Transgenic SP-B cassette (before gene manipulation): the sequence at nucleotide positions 427-450 of SEQ ID no. 24 is deleted when transgene integration occurs.
[0096] AAV6_CAG_SP-B_donor: 5' AAV ITR: 3933-4051 (119 bp); ZFN3-repair-template: 4087-6074 (1988 bp); 3' AAV ITR: 6112-6241 (130 bp)
[0097] AAV6-ZFN 3-LEFT: 5' AAV ITR: 3933-4051 (119 bp); CMV Promoter: 4060-4638 (579 bp); 3Flag-NLS-38561-Fok-KKR: 4844-5992 (1149 bp); bGHpA: 5999-6223 (225 bp); 3' AAV ITR: 6240-6369 (130 bp).
[0098] AAV6-ZFN 3-RIGHT: 5' AAV ITR: 3933-4051 (119 bp); CMV Promoter: 4060-4638 (579 bp); 3Flag-NLS-38558-Fok-ELD: 4766-6031 (1266 bp); bGHpA: 6038-6262 (225 bp); 3' AAV ITR: 6279-6408 (130 bp).
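The feature annotations of the three vectors listed above give both a coordinate range and a length in base pairs. As an illustrative consistency check, which is not part of the original disclosure, the stated lengths can be recomputed from the inclusive coordinates as end - start + 1. The data structure and function names below are hypothetical; the bGHpA feature of AAV6-ZFN 3-LEFT is taken as the range 5999-6223.

```python
# Feature annotations as (name, start, end, stated length in bp).
FEATURES = {
    "AAV6_CAG_SP-B_donor": [
        ("5' AAV ITR", 3933, 4051, 119),
        ("ZFN3-repair-template", 4087, 6074, 1988),
        ("3' AAV ITR", 6112, 6241, 130),
    ],
    "AAV6-ZFN 3-LEFT": [
        ("5' AAV ITR", 3933, 4051, 119),
        ("CMV Promoter", 4060, 4638, 579),
        ("3Flag-NLS-38561-Fok-KKR", 4844, 5992, 1149),
        ("bGHpA", 5999, 6223, 225),
        ("3' AAV ITR", 6240, 6369, 130),
    ],
    "AAV6-ZFN 3-RIGHT": [
        ("5' AAV ITR", 3933, 4051, 119),
        ("CMV Promoter", 4060, 4638, 579),
        ("3Flag-NLS-38558-Fok-ELD", 4766, 6031, 1266),
        ("bGHpA", 6038, 6262, 225),
        ("3' AAV ITR", 6279, 6408, 130),
    ],
}


def feature_length(start, end):
    """Length of a feature given 1-based, inclusive coordinates."""
    return end - start + 1


def find_mismatches(features):
    """Return (vector, feature, computed, stated) for inconsistent entries."""
    problems = []
    for vector, entries in features.items():
        for name, start, end, stated in entries:
            computed = feature_length(start, end)
            if computed != stated:
                problems.append((vector, name, computed, stated))
    return problems
```

Calling `find_mismatches(FEATURES)` returns an empty list, which indicates that every stated length agrees with its coordinate range.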
[0099] Animal experiments. 6-8 week old BALB/c mice (www.criver.com) and transgenic SP-B mice6 [SP-C rtTA/(teto).sub.7 SP-B/SP-B.sup.-/-] were maintained under specific pathogen-free conditions and were kept with a 12 h/12 h light/dark cycle. All animals were provided with food and water ad libitum, and were acclimatized for at least 7 d before the start of the respective experiment. Transgenic SP-B mice were fed with doxycycline containing food until cessation (day 0 of the control and main groups). All animal procedures were approved and controlled by the local ethics committee and carried out according to the German law of protection of animal life.
[0100] Intratracheal Injection.
[0101] BALB/c or transgenic SP-B mice were anesthetized intraperitoneally with a mixture of medetomidine (0.5 mg/kg), midazolam (5 mg/kg) and fentanyl (50 .mu.g/kg), and suspended on a mouse intubation platform (www.penncentury.com, Model MIP) at a 45.degree. angle by the upper teeth. A small animal laryngoscope (www.penncentury.com) was used to provide optimal illumination of the trachea. A Microsprayer Aerosolizer--Model IA-1C connected to a FMJ-250 High Pressure Syringe (both from www.penncentury.com) was endotracheally inserted and PBS, 20 .mu.g Z3 (nec-)mRNA naked or complexed with nanoparticles, or AAV6 (www.virovek.com) was applied in a volume of 100 .mu.l. The Microsprayer tip was withdrawn after 10 s, antidote was injected subcutaneously (atipamezole (50 .mu.g/kg), flumazenil (10 .mu.g/kg) and naloxone (24 .mu.g/kg)), and the mouse was taken off the support after 2 min.
[0102] Airway compliance. Compliance was determined by using the ex vivo model of the isolated perfused lung as described previously (IPL, Harvard Apparatus). In short, in situ mouse lungs were placed in a thorax chamber and mice were ventilated via a tracheal cannula. Ventilation rate was set to 90 breaths per minute with negative pressure ventilation between -2.8 cm H.sub.2O and 8.5 cm H.sub.2O. To prevent atelectasis a hyperinflation was triggered every 5 minutes (-25 cm H.sub.2O). Perfusion of lungs was done with a 4% hydroxyethyl starch containing perfusion buffer via the pulmonary artery (flow 1 ml/min). Lung function parameters were recorded automatically and compliance calculated by HSE-HA Pulmodyn W Software (Harvard Apparatus). For graphical and statistical analysis, the mean compliance values were calculated from the last 10 timestamps (40 sec) of each 5-minute period (between two hyperinflations).
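The averaging rule described above, in which the mean is taken over the last 10 timestamps of each 5-minute period, can be sketched as follows. This is an illustration only and does not reproduce the Pulmodyn software. The sampling interval of 4 s is inferred from "10 timestamps (40 sec)", which gives 75 samples per 5-minute period; this inference and all names in the code are assumptions.

```python
def period_means(samples, samples_per_period=75, window=10):
    """Mean of the last `window` samples of each complete period.

    samples: compliance readings in recording order, one per timestamp.
    samples_per_period: 75 assumes one reading every 4 s over 5 minutes.
    window: 10 readings, i.e. the final 40 s before the next hyperinflation.
    """
    if window > samples_per_period:
        raise ValueError("window cannot exceed the period length")
    means = []
    # Step through complete periods only; a trailing partial period is ignored.
    last_start = len(samples) - samples_per_period
    for start in range(0, last_start + 1, samples_per_period):
        period = samples[start:start + samples_per_period]
        tail = period[-window:]
        means.append(sum(tail) / window)
    return means
```

Restricting the mean to the end of each period uses readings taken when the lung has settled after the preceding hyperinflation.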
[0103] Airway resistance. Airway resistance in response to methacholine (MCh, acetyl-.beta.-methylcholine chloride; Sigma-Aldrich) was again determined using the ex vivo model of the isolated perfused lung (IPL, Harvard Apparatus). In brief, after a 20-minute baseline measurement, lungs were perfused with increasing concentrations of MCh (0.1 .mu.M, 1 .mu.M, 10 .mu.M, and 100 .mu.M) for 10 minutes each, separated by a 10-minute washout period with perfusion buffer. Lung function parameters were recorded automatically and airway resistance was recorded by HSE-HA Pulmodyn W Software (Harvard Apparatus). For graphical and statistical analysis, the mean resistance values were calculated from the last 10 timestamps (40 sec) of each 10-minute MCh exposure.
[0104] Histopathology. Mouse lungs were fixed in 4.5% Histofix (www.carlroth.com) at 4.degree. C. overnight. Fixed lungs were embedded in paraffin, and slices were stained with either H&E or Surfactant Protein-B DAB (mouse monoclonal anti-SP-B antibody (www.abcam.com, ab3282), Zytochem Plus HRP One-Step Polymer anti-mouse/rabbit/rat (www.zytomed.com, ZUC53-006) and DAB substrate kit for peroxidase (www.vectorlabs.com, SK-4100)). 3.times.FLAG FITC fluorescence staining (monoclonal anti-FLAG M2-FITC antibody (www.sigma-aldrich.com, F4049) and DAPI counterstaining (www.applichem.com, A1001)) was examined using a Zeiss Axio Imager. For 3.times.FLAG Cy3 fluorescence staining, rabbit polyclonal to DDDDK tag antibody (www.abcam.com, ab21536) was used as primary antibody and goat anti-rabbit Cy3 antibody (www.jacksonimmuno.com, 111-165-144) was used as secondary antibody together with DAPI (www.applichem.com, A1001).
[0105] Western Blot. Protein from BALF was separated on NuPAGE 10% Bis-Tris Plus gels and a NuPAGE Mini Gel Tank (all from www.lifetechnologies.com), and immunoblotting was performed by standard procedures according to manufacturer's instructions using the XCell II Mini-Cell and blot modules (www.lifetechnologies.com). After blocking for 2 hours at room temperature, primary antibody against SP-B (kindly provided by Prof. Griese, Munich) or ANTI-FLAG M2 (www.sigmaaldrich.com) was incubated overnight, and HRP-conjugated secondary antibodies (anti-rabbit from www.dianova.com) were incubated for 1 hour. Blots were processed by using ECL Prime Western Blot Detection Reagents (www.gelifesciences.com). Semiquantitative analysis was performed with the Quantity One software (www.bio-rad.de).
[0106] Target-site sequencing. Genomic DNA from primary fibroblasts (in vitro transfected/transduced) or sorted ATII cells (after in vivo transfection/transduction) was isolated using the NucleoSpin Tissue Kit (www.mn-net.com) according to the manufacturer's protocol. Amplicons were derived from PCR with Primers P1 and P2 (sequences see above) using the following conditions: AmpliTaq Gold 360 master mix (www.lifetechnologies.com) at 95.degree. C. for 10 min, 95.degree. C. for 30 sec, 60.degree. C. for 30 sec., 72.degree. C. for 60 sec, with in total 35 cycles and a final extension step at 72.degree. C. for 7 min. The amplicons were cloned into the pCR-TOPO vector (www.lifetechnologies.com) and sequenced using the primers M13forward (GTAAAACGACGGCCAGTG [SEQ ID No. 18]) and M13reverse (CAGGAAACAGCTATGACCATG [SEQ ID No. 19]). The alignments have been performed with Geneious R6 (www.biomatters.com) using the "multiple align" function, choosing a cost matrix of 65% similarity (5.0/-4.0), a gap open penalty of 12 and a gap extension penalty of 3.
[0107] RealTime RT PCR. The lung cell separations were washed vigorously three times with PBS to avoid carrying over RNA not taken up by lung cells (the third supernatant was later tested for RNA contamination using the qPCR procedure described below). RNA was then isolated with the RNeasy purification kit (www.qiagen.com). Reverse transcription of 50 ng RNA was carried out using iScript cDNA synthesis kit (www.bio-rad.com). Detection of Z3 cDNA was performed by SYBR-Green based quantitative Real-Time PCR in 20 .mu.l reactions on a ViiA7 (www.lifetechnologies.com). Reactions were incubated for 10 min at 95.degree. C., followed by 40 cycles of 15 sec at 95.degree. C. and 2 min at 50.degree. C. (annealing and extension), followed by standard melting curve analysis. The following primer pairs were used: Z3 left fwd: TGTACGGCTACAGGGGAA [SEQ ID No. 20], Z3 left rev: GCCGATAGGCAGATTGTA [SEQ ID No. 21]; beta-actin, determined to be the optimal housekeeping gene: fwd TAGGCACCAGGGTGATG [SEQ ID No. 22], rev GCCATGTTCAATGGGGTACT [SEQ ID No. 23].
[0108] Statistics. Differences in mRNA expression between groups were analyzed by pair-wise fixed reallocation randomization tests with REST 2009 software17. All other analyses were performed using the Wilcoxon-Mann-Whitney test with SPSS 21 (www.ibm.com). Data are presented as mean.+-.s.e.m. or as the median.+-.IQR (interquartile ranges) and P<0.05 (two-tailed) was considered statistically significant. For survival studies Log-rank tests were performed. Statistics for lung compliance were performed using two-way ANOVA with Bonferroni post-tests with GraphPad Prism 5.0 software. Lung function data are presented as mean.+-.s.d. and P<0.05 (two-tailed) was considered statistically significant. No randomization was used for animal experiments. In all cases except the i.t. administration of AAV6/mRNA, the investigators were blinded when assessing outcomes.
6. Results
[0109] Nuclease-mediated genome editing holds enormous potential to knock out unwanted genes or repair disease-causing mutations. An ideal nuclease delivery vehicle is (i) short-lived, (ii) non-integrating, and (iii) able to enter target cells efficiently. A variety of vectors have been utilized to deliver nuclease pairs; however, to date, none have achieved direct in vivo gene correction while simultaneously being transient and non-integrating.
[0110] The inventors have used modified mRNA as an alternative to traditional viral vectors, one which naturally avoids genomic integration and provides a transient pulse of protein expression. By using nucleotide-modified mRNA, the inventors reached therapeutic protein expression levels in vivo in mouse models of surfactant protein B (SP-B) deficiency and experimental asthma. Here, the inventors utilize modified mRNA to deliver site-specific nucleases to the lung to demonstrate the value of "nec-mRNA" as a tool for in vivo genome editing.
[0111] To illustrate the effectiveness of nec-mRNA as a nuclease-delivery vehicle, the inventors chose a well-established transgenic mouse model of SP-B deficiency, where SP-B cDNA is under the control of a Tetracycline-inducible promoter. Administration of doxycycline drives SP-B expression levels comparable to those observed in wild-type mice (FIG. 5). Following cessation of doxycycline, this model closely mimics the phenotypic changes seen in the human version of the disease: thickened alveolar walls, heavy cellular infiltration, increased macrophages and neutrophils, interstitial edema, augmented cytokines in the lavage, a significant drop in lung function, and fatal respiratory distress leading to death within days. Here, the inventors insert a constitutive CAG promoter immediately upstream of the SP-B cDNA to allow doxycycline-independent expression and prolonged life in treated mice.
[0112] First, a panel of ZFNs and TALENs was customized to target the transgenic SP-B cassette (FIG. 6a and FIG. 7). Due to their high activity and proximity to the desired site of promoter integration, TALEN #1 (T1) and ZFN #3 (Z3) were selected (FIG. 6a,b). In comparison to plasmid DNA, T1 and Z3 delivered as mRNA showed a significant increase in both DSB-induction (FIG. 6c and FIG. 8; P<0.05) and homology directed repair (HDR) (FIG. 6d, P<0.05). As Z3 mRNA was more efficient than T1 mRNA in both cases, Z3 was chosen for further experimentation (amino acid sequences in FIG. 9; SEQ ID nos. 35 and 36). Comparison with a Z3-encoding AAV vector ("Z3 AAV") highlights the short-lived expression pattern of Z3 mRNA (FIG. 6e), limiting the time during which off-target cleavage activity could occur.
[0113] To optimize Z3 expression in the lung, the inventors administered a panel of 3.times.FLAG-tagged Z3 mRNAs with various modification schemes, with or without mRNA-complexation to nanoparticles (NPs); cf. Nafee, N., Taetz, S., Schneider, M., Schaefer, U. F. & Lehr, C. M. Nanomedicine 3, 173-183 (2007). Following intratracheal (i.t.) delivery, NP-complexing significantly increased mRNA expression levels (FIG. 10). 3.times.FLAG protein expression was most robust for the s2U.sub.0.25/m5C.sub.0.25-modified, NP-complexed group (FIG. 6f,g and FIG. 11), and no immune activation was observed following i.t. delivery (FIG. 12). Hence, subsequent in vivo studies utilized i.t. delivery of this candidate, referred to as "Z3 nec-mRNA-NP".
[0114] Next, a complementary donor template was designed to insert a constitutive CAG promoter at the Z3 nec-mRNA-NP cut site, upstream of the transgenic SP-B cDNA (FIG. 13g and FIG. 14). Successful site-specific HDR would allow mice to survive and produce SP-B in the absence of doxycycline. As it is critical to deliver the donor template in excess to ensure it is favored over the homologous chromosome during HDR, for this proof-of-principle, the inventors utilized a vector known to transduce lung cells with high efficiency, AAV-serotype 6 (integration-deficient lentiviruses will be tested in future studies). Ex vivo delivery of the AAV6-donor with Z3 nec-mRNA-NP resulted in successful HDR in primary fibroblasts (FIG. 15).
[0115] Moving in vivo, AAV6-donor and Z3 nec-mRNA-NP (or a Z3 AAV positive control) were then delivered to the lung of transgenic SP-B mice, followed by cessation of doxycycline (FIG. 13a). Notably, mice in these groups lived significantly longer than matched control groups (FIG. 13a, P<0.001), while maintaining SP-B expression levels comparable to mice receiving doxycycline, up to 20 d after doxycycline removal (FIG. 13b,c and FIGS. 16 and 17).
[0116] Phenotypically, combining gene correction with AAV6-donor and Z3 nec-mRNA-NP (or Z3 AAV) prevented the significant drop in lung function (FIG. 13d and FIG. 18), severe hemorrhagic infiltrations and large-scale edema (FIGS. 19-21), and neutrophilia (FIG. 22) observed in the lungs of negative controls. A non-significant increase of IL-12 was observed in nec-mRNA-NP- versus PBS-treated mice (FIG. 13); however, no IFN-α elevation was detected (data not shown). DSBs and HDR rates (the latter determined by in-out PCR, see FIG. 13g) were concomitant with successful gene manipulation (FIG. 13e,f), which was also determined by target site sequencing (FIG. 26). If achieved in humans, HDR rates of approximately 9% would likely be sufficient to avoid severe disease progression (see Discussion). Results also confirmed that nuclease expression was longer-lived if administered via AAV, rendering nec-mRNA-NP a more favorable delivery vehicle (FIG. 13h and FIGS. 24 and 25).
[0117] Inherent key limitations of our approach are (i) the co-transfection of an AAV-DNA donor template in conjunction with nec-mRNA, (ii) the temporal delimitation of our curative in vivo treatment, probably owing to the natural turnover of the transfected lung cell populations, and (iii) the engineering of an artificial, transgenic cassette compared to humanized models. However, the use of nec-mRNA will have immediate implications for all nuclease platforms, including CRISPR/Cas9 systems, targeted gene knockout, as well as therapeutic gene correction strategies for the treatment of SP-B deficiency and other diseases, such as cancer. The inventors will test this technology in humanized models when available and are confident that nec-mRNA and nec-mRNA-NP (for efficient lung transfection) can ultimately be moved to the clinic.
[0118] Overall, the inventors conclude that co-delivery of Z3 nec-mRNA-NP and AAV6-donor results in successful site-specific genome editing in vivo, documenting the first report of life-prolonging gene correction in the lung.
7. Discussion
[0119] This proof-of-principle in a transgenic model of SP-B deficiency demonstrates that nec-mRNA can achieve therapeutic levels of gene correction in vivo, while possessing the three main criteria of an optimal genome editing reagent: (i) transience, (ii) an inability to integrate, and (iii) sufficient transfection of target cells. As lung cell turnover likely prevented survival beyond 30-35 days in this model, animals in future studies will undergo repeated nec-mRNA administration to target additional differentiated cells of the lung.
[0120] The inventors made sure that the truncated SP-B gene fragment in the right homology arm does not express functional SP-B protein by testing the administration of AAV6 donor without functional nucleases (FIG. 13a, group C): all mice died within three days. Although some residual SP-B was still detectable in the Western blot (about 10%, which the inventors usually see when lavages are tested only three days after doxycycline cessation; data not shown), it is highly unlikely that this signal is derived from the donor construct, as the molecular weight of the band is normal (and therefore inconsistent with any truncated form).
[0121] The ability to target and correct lung progenitor cells will also be the subject of ongoing investigation. The inventors also want to emphasize that the main safety gain by our nec-mRNA technology concerns the reduction of nuclease-derived off-target effects; it does not eliminate the integration of donor template. Since SP-B acts extracellularly in the alveolar space, modification of a small number of cells could functionally correct a larger area of lung tissue. The inventors found in vivo HDR rates of about 9%. In humans, 5-10% of normal SP-B levels in the lung are sufficient and are associated with only mild disease (in humans there are SNPs in the SP-B gene causing about 10% of normal SP-B levels, and many of those people were completely healthy). Also, there is no linear correlation between achieved HDR in lung cells (see FIG. 13f) and SP-B expression in the lungs (see FIGS. 13b,c and FIG. 17). Together, these observations support the notion that an in vivo HDR rate of approximately 9%, if achievable, should have therapeutic effects in humans.
[0122] Though AAV vectors have not been associated with genotoxicity, further development of nec-mRNA-mediated gene correction approaches may also benefit from pairing with a non-viral or integration-deficient lentiviral donor template. The inventors chose not to look for AAV donor integration for several reasons: a) the inventors wanted the vector to be as coherent as possible, and AAV donor integration measurements are at best mere estimations; b) any experiment that uses a transgene donor will require the use of a DNA-based donor and therefore carries some risk of insertional mutagenesis. Consequently, it is not possible to achieve HDR in vivo while avoiding background vector integration. A multitude of papers describing the use of AAV and lentivirus have been published in NBT, all of which have necessarily had some background level of donor integration; however, no pathological effects of AAV utilization could be demonstrated in extensive murine studies; and c) the big advantage of the inventors' work is that mRNA delivery prevents the persistent expression of the nucleases, which is a far larger concern than background AAV integration.
[0123] Off-target cleavage in vitro on the top five in silico predicted sites was a minor issue with an average of only 0.78% indels at day 14 of transfected or transduced A549 cells (data not shown).
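The application does not state how the indel percentages were calculated. Purely as an illustrative sketch, the following Python code applies an estimate that is commonly used for T7 endonuclease assays, indel (%) = 100 × (1 − sqrt(1 − fraction cleaved)), and averages it over several sites. The function names and all band intensities are hypothetical and are not data from the application.

```python
import math


def t7e1_indel_percent(uncut: float, cut1: float, cut2: float) -> float:
    """Estimate the indel frequency (%) from gel band intensities.

    uncut: intensity of the undigested PCR product
    cut1, cut2: intensities of the two cleavage products
    """
    total = uncut + cut1 + cut2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cut1 + cut2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


def mean_indel_percent(bands) -> float:
    """Average the estimate over several sites, given (uncut, cut1, cut2) tuples."""
    values = [t7e1_indel_percent(*b) for b in bands]
    return sum(values) / len(values)


# Hypothetical intensities for five predicted off-target sites
sites = [(990, 6, 4), (985, 9, 6), (1000, 0, 0), (980, 12, 8), (995, 3, 2)]
print(round(mean_indel_percent(sites), 2))
```

The square-root term reflects that cleavage requires a heteroduplex, so the cleaved fraction overstates the share of mutated alleles.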
[0124] We also found it important to perform in/out PCRs (and T7 endonuclease reactions) on lung samples of mice that received only AAV6-donor (group C). This is important because, given the large overlap between the donor and the chromosome in our case, recombination can occur in the PCR itself. Therefore, we performed control PCRs (P1/P3, P1/P2) and T7 reactions on all mice (P1/P3, group C) or pooled samples (P1/P2, groups A, B and C). These controls strengthened the positive results found in groups A and B, as no HDR event was detected in DNA samples from group C (FIG. 27).
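As an illustrative aid for reading such PCR designs against the sequence listing, the following Python sketch locates a primer, or its reverse complement, in a template by exact matching. The helper names are hypothetical; the fragment is residues 301-360 of SEQ ID NO: 24 and the primers are SEQ ID NOs: 1 and 3, copied from the sequence listing. Whether these correspond to the primers designated P1 to P3 above is not stated in this passage and is not assumed here.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (lower case, 'n' allowed)."""
    comp = {"a": "t", "c": "g", "g": "c", "t": "a", "n": "n"}
    return "".join(comp[b] for b in reversed(seq.lower()))


def find_primer(template: str, primer: str):
    """Return (strand, 0-based start) of the first exact match, or None."""
    t = template.lower().replace(" ", "")
    p = primer.lower().replace(" ", "")
    i = t.find(p)
    if i != -1:
        return ("+", i)  # primer matches the listed strand
    i = t.find(reverse_complement(p))
    if i != -1:
        return ("-", i)  # primer anneals to the listed strand (reverse primer)
    return None


# Residues 301-360 of SEQ ID NO: 24, as given in the sequence listing
fragment = "gctcggtacc cgggtcgagg taggcgtgta cggtgggagg cctatataag cagagctcgt"
primer_seq1 = "gtaggcgtgt acggtgggag"  # SEQ ID NO: 1
primer_seq3 = "tgtacggtgg gaggcctat"   # SEQ ID NO: 3

print(find_primer(fragment, primer_seq1))  # ('+', 19)
print(find_primer(fragment, primer_seq3))  # ('+', 26)
```

Exact matching is a deliberate simplification; it does not model mismatches or annealing conditions.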
[0125] With respect to human SP-B deficiency, it is important to note that the site-specific nucleases designed to target the transgenic locus in this mouse model will not be directly applicable to the human condition. Also, the CAG promoter is very strong, so manipulated cells likely produce significantly higher amounts of SP-B than normal, endogenous cells. Further, the PCR assay used for quantification likely underestimates the true amount of CAG promoter integration. Despite this, the inventors feel that transgenic SP-B deficient mice serve as an excellent proof-of-principle model for several reasons: first, cessation of doxycycline results in phenotypic changes closely modeling those observed in the human condition; second, administration of doxycycline drives SP-B expression levels comparable to wild-type mice; and finally, the outcome of survival in this model is a definitive measure of efficacy. Together with chitosan-coated NPs, nec-mRNA presents a strong tool to approach lung diseases that are currently still uncorrectable in the human system. Combining nec-mRNA with other structured NPs (cf. Young C. et al. Nat Prot 9, 1900-1915 (2014); incorporated herein by reference) may expand the capabilities of gene manipulation to other large disease fields such as cancer therapeutics.
8. Conclusion
[0126] The inventors were able to demonstrate in an impressive manner by means of a mouse model that by using a nuclease-encoding nucleotide-modified messenger RNA (nec-mRNA) a genetic alteration on a DNA can be permanently corrected. The nec-mRNA is administered together with a repair template which comprises the genetic information to be inserted or to be replaced, respectively.
Sequence Listing
Number of SEQ ID NOs: 36
SEQ ID NO: 1 (length 20; DNA; Artificial Sequence; PCR primer): gtaggcgtgt acggtgggag
SEQ ID NO: 2 (length 21; DNA; Artificial Sequence; PCR primer): cagcagaggg taggaagcag c
SEQ ID NO: 3 (length 19; DNA; Artificial Sequence; PCR primer): tgtacggtgg gaggcctat
SEQ ID NO: 4 (length 17; DNA; Artificial Sequence; PCR primer): cctggcaggt gatgtgg
SEQ ID NO: 5 (length 20; DNA; Artificial Sequence; PCR primer): gcaagtttgg cgtcgctcca
SEQ ID NO: 6 (length 18; DNA; Artificial Sequence; PCR primer): agaggaaggc gcggcagg
SEQ ID NO: 7 (length 23; DNA; Artificial Sequence; PCR primer): ttcttgctcc agtgactctc tta
SEQ ID NO: 8 (length 24; DNA; Artificial Sequence; PCR primer): agcctagtaa agacaacact agtg
SEQ ID NO: 9 (length 18; DNA; Artificial Sequence; PCR primer): caacgtgacc tgcgagcg
SEQ ID NO: 10 (length 19; DNA; Artificial Sequence; PCR primer): gtgcacgctc cacttctcg
SEQ ID NO: 11 (length 22; DNA; Artificial Sequence; PCR primer): ctggagatgc atccttgtct gt
SEQ ID NO: 12 (length 22; DNA; Artificial Sequence; PCR primer): gagggtgaag acttttggag ct
SEQ ID NO: 13 (length 22; DNA; Artificial Sequence; PCR primer): cagcaccaga tgttccctgt ta
SEQ ID NO: 14 (length 24; DNA; Artificial Sequence; PCR primer): tggaaagcaa tagttctagg atga
SEQ ID NO: 15 (length 17; DNA; Artificial Sequence; PCR primer): cctggcaggt gatgtgg
SEQ ID NO: 16 (length 19; DNA; Artificial Sequence; PCR primer): tgtacggtgg gaggcctat
SEQ ID NO: 17 (length 20; DNA; Artificial Sequence; PCR primer): aggcactggg caggtaagta
SEQ ID NO: 18 (length 18; DNA; Artificial Sequence; PCR primer): gtaaaacgac ggccagtg
SEQ ID NO: 19 (length 21; DNA; Artificial Sequence; PCR primer): caggaaacag ctatgaccat g
SEQ ID NO: 20 (length 18; DNA; Artificial Sequence; PCR primer): tgtacggcta caggggaa
SEQ ID NO: 21 (length 18; DNA; Artificial Sequence; PCR primer): gccgataggc agattgta
SEQ ID NO: 22 (length 17; DNA; Artificial Sequence; PCR primer): taggcaccag ggtgatg
SEQ ID NO: 23 (length 20; DNA; Artificial Sequence; PCR primer): gccatgttca atggggtact
SEQ ID NO: 24 (length 1665; DNA; Artificial Sequence; Synthetic):
cctcgagttt accactccct atcagtgata gagaaaagtg
aaagtcgagt ttaccactcc 60ctatcagtga tagagaaaag tgaaagtcga gtttaccact
ccctatcagt gatagagaaa 120agtgaaagtc gagtttacca ctccctatca gtgatagaga
aaagtgaaag tcgagtttac 180cactccctat cagtgataga gaaaagtgaa agtcgagttt
accactccct atcagtgata 240gagaaaagtg aaagtcgagt ttaccactcc ctatcagtga
tagagaaaag tgaaagtcga 300gctcggtacc cgggtcgagg taggcgtgta cggtgggagg
cctatataag cagagctcgt 360ttagtgaacc gtcagatcgc ctggagacgc catccacgct
gttttgacct ccatagaaga 420caccgggacc gatccagcct ccgcggcccc gaattctgca
gatatccagc acagtggcgg 480ccgctaggca gccatggcca agtcgcacct actgcagtgg
ctactgctgc ttcctaccct 540ctgctgccca ggtgcagcta tcacgtcggc ctcatccctg
gagtgtgcac aaggccctca 600attctggtgc caaagcctgg agcatgcagt gcagtgcaga
gccctggggc actgcctgca 660ggaagtctgg gggcatgcag gagctaatga cctgtgccaa
gagtgtgagg atattgtcca 720cctcctcaca aagatgacca aggaagatgc tttccaggaa
gcaatccgga agttcctgga 780acaagaatgt gatatccttc ccttgaagct gcttgtgccc
cggtgtcgcc aagtgcttga 840tgtctacctg cccctggtta ttgactactt ccagagccag
attaacccca aagccatctg 900caatcatgtg ggcctgtgcc cacgtgggca ggctaagcca
gaacagaatc cagggatgcc 960ggatgccgtt ccaaaccctc tgctggacaa gctggtcctc
cctgtgctgc caggagccct 1020cttggcaagg cctgggcctc acactcagga cttctctgag
caacagctcc ccattcccct 1080gcccttctgc tggctttgca gaactctgat caagcgggtt
caagccgtga tccccaaggg 1140tgtgctggct gtggctgtgt cccaggtgtg ccacgtggta
cccctggtgg tgggtggcat 1200ctgccagtgc ctggctgagc gctacacagt tctcctgcta
gacgcactgc tgggccgtgt 1260ggtgccccag ctagtctgtg gccttgtcct ccgatgttcc
actgaggatg ccatgggccc 1320tgccctccct gctgtggagc ctctgataga agaatggcca
ctacaggaca ctgagtgcca 1380tttctgcaag tctgtgatca accaggcctg gaacaccagt
gaacaggcta tgccacaggc 1440aatgcaccag gcctgccttc gcttctggct agacaggcaa
aagtgtgaac agtttgtgga 1500acagcacatg ccccagctgc tggccctggt gcctaggagc
caggatgccc acatcacctg 1560ccaggccctt ggcgtatgtg aggccccggc tagccctctg
cagtcgttcc aaaccccaca 1620cctctgannn nntctagagg gcccgtttaa acccgctgat
cagcc 1665252640DNAArtificial SequenceSynthetic
25cctcgagttt accactccct atcagtgata gagaaaagtg aaagtcgagt ttaccactcc
60ctatcagtga tagagaaaag tgaaagtcga gtttaccact ccctatcagt gatagagaaa
120agtgaaagtc gagtttacca ctccctatca gtgatagaga aaagtgaaag tcgagtttac
180cactccctat cagtgataga gaaaagtgaa agtcgagttt accactccct atcagtgata
240gagaaaagtg aaagtcgagt ttaccactcc ctatcagtga tagagaaaag tgaaagtcga
300gctcggtacc cgggtcgagg taggcgtgta cggtgggagg cctatataag cagagctcgt
360ttagtgaacc gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga
420caccgggcta gcggatcctc tagaactata gctagtcgac attgattatt gactagttat
480taatagtaat caattacggg gtcattagtt catagcccat atatggagtt ccgcgttaca
540taacttacgg taaatggccc gcctggctga ccgcccaacg acccccgccc attgacgtca
600ataatgacgt atgttcccat agtaacgcca atagggactt tccattgacg tcaatgggtg
660gactatttac ggtaaactgc ccacttggca gtacatcaag tgtatcatat gccaagtacg
720ccccctattg acgtcaatga cggtaaatgg cccgcctggc attatgccca gtacatgacc
780ttatgggact ttcctacttg gcagtacatc tacgtattag tcatcgctat taccatgtcg
840aggccacgtt ctgcttcact ctccccatct cccccccctc cccaccccca attttgtatt
900tatttatttt ttaattattt tgtgcagcga tgggggcggg gggggggggc gcgcgccagg
960cggggcgggg cggggcgagg ggcggggcgg ggcgaggcgg agaggtgcgg cggcagccaa
1020tcagagcggc gcgctccgaa agtttccttt tatggcgagg cggcggcggc ggcggcccta
1080taaaaagcga agcgcgcggc gggcgggagc aagctttatt gcggtagttt atcacagtta
1140aattgctaac gcagtcagtg cttctgacac aacagtctcg aacttaagct gcagaagttg
1200gtcgtgaggc actgggcagg taagtatcaa ggttacaaga caggtttaag gagaccaata
1260gaaactgggc ttgtcgagac agagaagact cttgcgtttc tgataggcac ctattggtct
1320tactgacatc cactttgcct ttctctccac aggtgtccac tcccagttca attacagctc
1380ttaaggctag agtacttaat acgactcact ataggctagc ctcgagaatt ctgcagatat
1440ccagcacagt ggcggccgct aggcagccat ggccaagtcg cacctactgc agtggctact
1500gctgcttcct accctctgct gcccaggtgc agctatcacg tcggcctcat ccctggagtg
1560tgcacaaggc cctcaattct ggtgccaaag cctggagcat gcagtgcagt gcagagccct
1620ggggcactgc ctgcaggaag tctgggggca tgcaggagct aatgacctgt gccaagagtg
1680tgaggatatt gtccacctcc tcacaaagat gaccaaggaa gatgctttcc aggaagcaat
1740ccggaagttc ctggaacaag aatgtgatat ccttcccttg aagctgcttg tgccccggtg
1800tcgccaagtg cttgatgtct acctgcccct ggttattgac tacttccaga gccagattaa
1860ccccaaagcc atctgcaatc atgtgggcct gtgcccacgt gggcaggcta agccagaaca
1920gaatccaggg atgccggatg ccgttccaaa ccctctgctg gacaagctgg tcctccctgt
1980gctgccagga gccctcttgg caaggcctgg gcctcacact caggacttct ctgagcaaca
2040gctccccatt cccctgccct tctgctggct ttgcagaact ctgatcaagc gggttcaagc
2100cgtgatcccc aagggtgtgc tggctgtggc tgtgtcccag gtgtgccacg tggtacccct
2160ggtggtgggt ggcatctgcc agtgcctggc tgagcgctac acagttctcc tgctagacgc
2220actgctgggc cgtgtggtgc cccagctagt ctgtggcctt gtcctccgat gttccactga
2280ggatgccatg ggccctgccc tccctgctgt ggagcctctg atagaagaat ggccactaca
2340ggacactgag tgccatttct gcaagtctgt gatcaaccag gcctggaaca ccagtgaaca
2400ggctatgcca caggcaatgc accaggcctg ccttcgcttc tggctagaca ggcaaaagtg
2460tgaacagttt gtggaacagc acatgcccca gctgctggcc ctggtgccta ggagccagga
2520tgcccacatc acctgccagg cccttggcgt atgtgaggcc ccggctagcc ctctgcagtc
2580gttccaaacc ccacacctct gannnnntct agagggcccg tttaaacccg ctgatcagcc
2640266122DNAArtificial SequenceSynthetic 26gactcttcgc gatgtacggg
ccagatatac gcgttgacat tgattattga ctagttatta 60atagtaatca attacggggt
cattagttca tagcccatat atggagttcc gcgttacata 120acttacggta aatggcccgc
ctggctgacc gcccaacgac ccccgcccat tgacgtcaat 180aatgacgtat gttcccatag
taacgccaat agggactttc cattgacgtc aatgggtgga 240ctatttacgg taaactgccc
acttggcagt acatcaagtg tatcatatgc caagtacgcc 300ccctattgac gtcaatgacg
gtaaatggcc cgcctggcat tatgcccagt acatgacctt 360atgggacttt cctacttggc
agtacatcta cgtattagtc atcgctatta ccatggtgat 420gcggttttgg cagtacatca
atgggcgtgg atagcggttt gactcacggg gatttccaag 480tctccacccc attgacgtca
atgggagttt gttttggcac caaaatcaac gggactttcc 540aaaatgtcgt aacaactccg
ccccattgac gcaaatgggc ggtaggcgtg tacggtggga 600ggtctatata agcagagctc
tctggctaac tagagaaccc actgcttact ggcttatcga 660aattaatacg actcactata
gggagaccca agctggctag cgtttaaact taagcttggt 720accgagctcg gatccactag
tccagtgtgg tggaattagc tctctggcta actagagaac 780ccactgctta ctggcttatc
gaaattaata cgactcacta tagggagagc caagctgact 840agcgtttaaa cttaagctga
tccactagtc cagtgtggtg gaattcgcca tggactacaa 900agaccatgac ggtgattata
aagatcatga catcgattac aaggatgacg atgacaagat 960ggcccccaag aagaagagga
aggtcggcat ccacggggta cctatggtgg acttgaggac 1020actcggttat tcgcaacagc
aacaggagaa aatcaagcct aaggtcagga gcaccgtcgc 1080gcaacaccac gaggcgcttg
tggggcatgg cttcactcat gcgcatattg tcgcgctttc 1140acagcaccct gcggcgcttg
ggacggtggc tgtcaaatac caagatatga ttgcggccct 1200gcccgaagcc acgcacgagg
caattgtagg ggtcggtaaa cagtggtcgg gagcgcgagc 1260acttgaggcg ctgctgactg
tggcgggtga gcttaggggg cctccgctcc agctcgacac 1320cgggcagctg ctgaagatcg
cgaagagagg gggagtaaca gcggtagagg cagtgcacgc 1380ctggcgcaat gcgctcaccg
gggccccctt gaacctgacc ccagaccagg tagtcgcaat 1440cgcgtcgaat ggcgggggaa
agcaagccct ggaaaccgtg caaaggttgt tgccggtcct 1500ttgtcaagac cacggcctta
caccggatca agtcgtggcc attgcaaata ataacggtgg 1560caaacaggct cttgagacgg
ttcagagact tctcccagtt ctctgtcaag cccacgggct 1620gactcccgat caagttgtag
cgattgcgag caacatcgga gggaaacaag cattggagac 1680tgtccaacgg ctccttcccg
tgttgtgtca agcccacggt ttgacgcctg cacaagtggt 1740cgccatcgcc tcccacgacg
gcggtaagca ggcgctggaa acagtacagc gcctgctgcc 1800tgtactgtgc caggatcatg
gactcacccc agaccaggta gtcgcaatcg cgtcgcatga 1860cgggggaaag caagccctgg
aaaccgtgca aaggttgttg ccggtccttt gtcaagacca 1920cggccttaca ccggagcaag
tcgtggccat tgcatcaaac ggaggtggca aacaggctct 1980tgagacggtt cagagacttc
tcccagttct ctgtcaagcc cacgggctga ctcccgatca 2040agttgtagcg attgcgagcc
atgatggagg gaaacaagca ttggagactg tccaacggct 2100ccttcccgtg ttgtgtcaag
cccacggttt gacgcctgca caagtggtcg ccatcgactc 2160ccacgacggc ggtaagcagg
cgctggaaac agtacagcgc ctgctgcctg tactgtgcca 2220ggatcatggg ctgaccccag
accaggtagt cgcaatcgcg tcgaacattg ggggaaagca 2280agccctggaa accgtgcaaa
ggttgttgcc ggtcctttgt caagaccacg gccttacacc 2340ggagcaagtc gtggccattg
catcaaacgg aggtggcaaa caggctcttg agacggttca 2400gagacttctc ccagttctct
gtcaagccca cgggctgact cccgatcaag ttgtagcgat 2460tgcgagcaac atcggaggga
aacaagcatt ggagactgtc caacggctcc ttcccgtgtt 2520gtgtcaagcc cacggtttga
cgcctgcaca agtggtcgcc atcgccaaca acaacggcgg 2580taagcaggcg ctggaaacag
tacagcgcct gctgcctgta ctgtgccagg atcatggttt 2640gaccccagac caggtagtcg
caatcgcgtc gaacattggg ggaaagcaag ccctggaaac 2700cgtgcaaagg ttgttgccgg
tcctttgtca agaccacggc cttacaccgg agcaagtcgt 2760ggccattgca tcaaatatcg
gtggcaaaca ggctcttgag acggttcaga gacttctccc 2820agttctctgt caagcccacg
ggctgactcc cgatcaagtt gtagcgattg cgaataacaa 2880tggagggaaa caagcattgg
agactgtcca acggctcctt cccgtgttgt gtcaagccca 2940cggtttgacg cctgcacaag
tggtcgccat cgcctccaat attggcggta agcaggcgct 3000ggaaacagta cagcgcctgc
tgcctgtact gtgccaggat catggcctga cacccgaaca 3060ggtggtcgcc attgctagcc
acgatggagg acggccagcc ttggagtcca tcgtagccca 3120attgtccagg cccgatcccg
cgttggctgc gttaacggga tcccagctgg tgaagagcga 3180gctggaggag aagaagtccg
agctgcggca caagctgaag tacgtgcccc acgagtacat 3240cgagctgatc gagatcgcca
ggaacagcac ccaggaccgc atcctggaga tgaaggtgat 3300ggagttcttc atgaaggtgt
acggctacag gggaaagcac ctgggcggaa gcagaaagcc 3360tgacggcgcc atctatacag
tgggcagccc catcgattac ggcgtgatcg tggacacaaa 3420ggcctacagc ggcggctaca
atctgcctat cggccaggcc gacgagatgg agagatacgt 3480ggaggagaac cagacccggg
ataagcacct caaccccaac gagtggtgga aggtgtaccc 3540tagcagcgtg accgagttca
agttcctgtt cgtgagcggc cacttcaagg gcaactacaa 3600ggcccagctg accaggctga
accacatcac caactgcaat ggcgccgtgc tgagcgtgga 3660ggagctgctg atcggcggcg
agatgatcaa agccggcacc ctgacactgg aggaggtgcg 3720gcgcaagttc aacaacggcg
agatcaactt cagatcttga taactcgagc taattctgca 3780gaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3840aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3900agcggccgct cgagtctaga
gggcccgttt aaacccgctg atcagcctcg actgtgcctt 3960ctagttgcca gccatctgtt
gtttgcccct cccccgtgcc ttccttgacc ctggaaggtg 4020ccactcccac tgtcctttcc
taataaaatg aggaaattgc atcgcattgt ctgagtaggt 4080gtcattctat tctggggggt
ggggtggggc aggacagcaa gggggaggat tgggaagaca 4140atagcaggca tgctggggat
gcggtgggct ctatggcttc tactgggcgg ttttatggac 4200agcaagcgaa ccggaattgc
cagctggggc gccctctggt aaggttggga agccctgcaa 4260agtaaactgg atggctttct
cgccgccaag gatctgatgg cgcaggggat caagctctga 4320tcaagagaca ggatgaggat
cgtttcgcat gattgaacaa gatggattgc acgcaggttc 4380tccggccgct tgggtggaga
ggctattcgg ctatgactgg gcacaacaga caatcggctg 4440ctctgatgcc gccgtgttcc
ggctgtcagc gcaggggcgc ccggttcttt ttgtcaagac 4500cgacctgtcc ggtgccctga
atgaactgca agacgaggca gcgcggctat cgtggctggc 4560cacgacgggc gttccttgcg
cagctgtgct cgacgttgtc actgaagcgg gaagggactg 4620gctgctattg ggcgaagtgc
cggggcagga tctcctgtca tctcaccttg ctcctgccga 4680gaaagtatcc atcatggctg
atgcaatgcg gcggctgcat acgcttgatc cggctacctg 4740cccattcgac caccaagcga
aacatcgcat cgagcgagca cgtactcgga tggaagccgg 4800tcttgtcgat caggatgatc
tggacgaaga gcatcagggg ctcgcgccag ccgaactgtt 4860cgccaggctc aaggcgagca
tgcccgacgg cgaggatctc gtcgtgaccc atggcgatgc 4920ctgcttgccg aatatcatgg
tggaaaatgg ccgcttttct ggattcatcg actgtggccg 4980gctgggtgtg gcggaccgct
atcaggacat agcgttggct acccgtgata ttgctgaaga 5040gcttggcggc gaatgggctg
accgcttcct cgtgctttac ggtatcgccg ctcccgattc 5100gcagcgcatc gccttctatc
gccttcttga cgagttcttc tgaattatta acgcttacaa 5160tttcctgatg cggtattttc
tccttacgca tctgtgcggt atttcacacc gcatacaggt 5220ggcacttttc ggggaaatgt
gcgcggaacc cctatttgtt tatttttcta aatacattca 5280aatatgtatc cgctcatgag
acaataaccc tgataaatgc ttcaataata gcacgtgcta 5340aaacttcatt tttaatttaa
aaggatctag gtgaagatcc tttttgataa tctcatgacc 5400aaaatccctt aacgtgagtt
ttcgttccac tgagcgtcag accccgtaga aaagatcaaa 5460ggatcttctt gagatccttt
ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca 5520ccgctaccag cggtggtttg
tttgccggat caagagctac caactctttt tccgaaggta 5580actggcttca gcagagcgca
gataccaaat actgtccttc tagtgtagcc gtagttaggc 5640caccacttca agaactctgt
agcaccgcct acatacctcg ctctgctaat cctgttacca 5700gtggctgctg ccagtggcga
taagtcgtgt cttaccgggt tggactcaag acgatagtta 5760ccggataagg cgcagcggtc
gggctgaacg gggggttcgt gcacacagcc cagcttggag 5820cgaacgacct acaccgaact
gagataccta cagcgtgagc tatgagaaag cgccacgctt 5880cccgaaggga gaaaggcgga
caggtatccg gtaagcggca gggtcggaac aggagagcgc 5940acgagggagc ttccaggggg
aaacgcctgg tatctttata gtcctgtcgg gtttcgccac 6000ctctgacttg agcgtcgatt
tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac 6060gccagcaacg cggccttttt
acggttcctg ggcttttgct ggccttttgc tcacatgttc 6120tt
6122276194DNAArtificial
SequenceSynthetic 27gactcttcgc gatgtacggg ccagatatac gcgttgacat
tgattattga ctagttatta 60atagtaatca attacggggt cattagttca tagcccatat
atggagttcc gcgttacata 120acttacggta aatggcccgc ctggctgacc gcccaacgac
ccccgcccat tgacgtcaat 180aatgacgtat gttcccatag taacgccaat agggactttc
cattgacgtc aatgggtgga 240ctatttacgg taaactgccc acttggcagt acatcaagtg
tatcatatgc caagtacgcc 300ccctattgac gtcaatgacg gtaaatggcc cgcctggcat
tatgcccagt acatgacctt 360atgggacttt cctacttggc agtacatcta cgtattagtc
atcgctatta ccatggtgat 420gcggttttgg cagtacatca atgggcgtgg atagcggttt
gactcacggg gatttccaag 480tctccacccc attgacgtca atgggagttt gttttggcac
caaaatcaac gggactttcc 540aaaatgtcgt aacaactccg ccccattgac gcaaatgggc
ggtaggcgtg tacggtggga 600ggtctatata agcagagctc tctggctaac tagagaaccc
actgcttact ggcttatcga 660aattaatacg actcactata gggagaccca agctggctag
cgtttaaact taagcttggt 720accgagctcg gatccactag tccagtgtgg tggaattagc
tctctggcta actagagaac 780ccactgctta ctggcttatc gaaattaata cgactcacta
tagggagagc caagctgact 840agcgtttaaa cttaagctga tccactagtc cagtgtggtg
gaattcgcct agagatctgg 900cggcggagag ggcagaggaa gtcttctaac ctgcggtgac
gtggaggaga atcccggccc 960taggaccatg gactacaaag accatgacgg tgattataaa
gatcatgaca tcgattacaa 1020ggatgacgat gacaagatgg cccccaagaa gaagaggaag
gtcggcattc atggggtacc 1080tatggtggac ttgaggacac tcggttattc gcaacagcaa
caggagaaaa tcaagcctaa 1140ggtcaggagc accgtcgcgc aacaccacga ggcgcttgtg
gggcatggct tcactcatgc 1200gcatattgtc gcgctttcac agcaccctgc ggcgcttggg
acggtggctg tcaaatacca 1260agatatgatt gcggccctgc ccgaagccac gcacgaggca
attgtagggg tcggtaaaca 1320gtggtcggga gcgcgagcac ttgaggcgct gctgactgtg
gcgggtgagc ttagggggcc 1380tccgctccag ctcgacaccg ggcagctgct gaagatcgcg
aagagagggg gagtaacagc 1440ggtagaggca gtgcacgcct ggcgcaatgc gctcaccggg
gcccccttga acctgacccc 1500agaccaggta gtcgcaatcg cgtcgcatga cgggggaaag
caagccctgg aaaccgtgca 1560aaggttgttg ccggtccttt gtcaagacca cggccttaca
ccggatcaag tcgtggccat 1620tgcaaataat aacggtggca aacaggctct tgagacggtt
cagagacttc tcccagttct 1680ctgtcaagcc cacgggctga ctcccgatca agttgtagcg
attgcgaata acaatggagg 1740gaaacaagca ttggagactg tccaacggct ccttcccgtg
ttgtgtcaag cccacggttt 1800gacgcctgca caagtggtcg ccatcgccaa caacaacggc
ggtaagcagg cgctggaaac 1860agtacagcgc ctgctgcctg tactgtgcca ggatcatgga
ctcaccccag accaggtagt 1920cgcaatcgcc aacaataacg ggggaaagca agccctggaa
accgtgcaaa ggttgttgcc 1980ggtcctttgt caagaccacg gccttacacc ggagcaagtc
gtggccattg catcacatga 2040cggtggcaaa caggctcttg agacggttca gagacttctc
ccagttctct gtcaagccca 2100cgggctgact cccgatcaag ttgtagcgat tgcgagccat
gatggaggga aacaagcatt 2160ggagactgtc caacggctcc ttcccgtgtt gtgtcaagcc
cacggtttga cgcctgcaca 2220agtggtcgcc atcgccaaca acaacggcgg taagcaggcg
ctggaaacag tacagcgcct 2280gctgcctgta ctgtgccagg atcatgggct gaccccagac
caggtagtcg caatcgcgtc 2340gcatgacggg ggaaagcaag ccctggaaac cgtgcaaagg
ttgttgccgg tcctttgtca 2400agaccacggc cttacaccgg atcaagtcgt ggccattgca
aataataacg gtggcaaaca 2460ggctcttgag acggttcaga gacttctccc agttctctgt
caagcccacg ggctgactcc 2520cgatcaagtt gtagcgattg cgaataacaa tggagggaaa
caagcattgg agactgtcca 2580acggctcctt cccgtgttgt gtcaagccca cggtttgacg
cctgcacaag tggtcgccat 2640cgcctccaat attggcggta agcaggcgct ggaaacagta
cagcgcctgc tgcctgtact 2700gtgccaggat catggtttga ccccagacca ggtagtcgca
atcgccaaca ataacggggg 2760aaagcaagcc ctggaaaccg tgcaaaggtt gttgcaggtc
ctttgtcaag accacggcct 2820tacaccggat caagtcgtgg ccattgcaaa taataacggt
ggcaaacagg ctcttgagac 2880ggttcagaga cttctcccag ttctctgtca agcccacggg
ctgactcccg atcaagttgt 2940agcgattgcg agccatgatg gagggaaaca agcattggag
actgtccaac ggctccttcc 3000cgtgttgtgt caagcccacg gtttgacgcc tgcacaagtg
gtcgccatcg cctccaacgg 3060tggcggtaag caggcgctgg aaacagtaca gcgcctgctg
cctgtactgt gccaggatca 3120tggcctgaca cccgaacagg tggtcgccat tgctagcaat
aaaggaggac ggccagcctt 3180ggagtccatc gtagcccaat tgtccaggcc cgatcccgcg
ttggctgcgt taacgggatc 3240ccagctggtg aagagcgagc tggaggagaa gaagtccgag
ctgcggcaca agctgaagta 3300cgtgccccac gagtacatcg agctgatcga gatcgccagg
aacagcaccc aggaccgcat 3360cctggagatg aaggtgatgg agttcttcat gaaggtgtac
ggctacaggg gaaagcacct 3420gggcggaagc agaaagcctg acggcgccat ctatacagtg
ggcagcccca tcgattacgg 3480cgtgatcgtg gacacaaagg cctacagcgg cggctacaat
ctgcctatcg gccaggccga 3540cgagatgcag agatacgtga aggagaacca gacccggaat
aagcacatca accccaacga 3600gtggtggaag gtgtacccta gcagcgtgac cgagttcaag
ttcctgttcg tgagcggcca 3660cttcaagggc aactacaagg cccagctgac caggctgaac
cgcaaaacca actgcaatgg 3720cgccgtgctg agcgtggagg agctgctgat cggcggcgag
atgatcaaag ccggcaccct 3780gacactggag gaggtgcggc gcaagttcaa caacggcgag
atcaacttct gataactcga 3840gctaattctg cagaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa 3900aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa 3960aaaaaaaaaa aaagcggccg ctcgagtcta gagggcccgt
ttaaacccgc tgatcagcct 4020cgactgtgcc ttctagttgc cagccatctg ttgtttgccc
ctcccccgtg ccttccttga 4080ccctggaagg tgccactccc actgtccttt cctaataaaa
tgaggaaatt gcatcgcatt 4140gtctgagtag gtgtcattct attctggggg gtggggtggg
gcaggacagc aagggggagg 4200attgggaaga caatagcagg catgctgggg atgcggtggg
ctctatggct tctactgggc 4260ggttttatgg acagcaagcg aaccggaatt gccagctggg
gcgccctctg gtaaggttgg 4320gaagccctgc aaagtaaact ggatggcttt ctcgccgcca
aggatctgat ggcgcagggg 4380atcaagctct gatcaagaga caggatgagg atcgtttcgc
atgattgaac aagatggatt 4440gcacgcaggt tctccggccg cttgggtgga gaggctattc
ggctatgact gggcacaaca 4500gacaatcggc tgctctgatg ccgccgtgtt ccggctgtca
gcgcaggggc gcccggttct 4560ttttgtcaag accgacctgt ccggtgccct gaatgaactg
caagacgagg cagcgcggct 4620atcgtggctg gccacgacgg gcgttccttg cgcagctgtg
ctcgacgttg tcactgaagc 4680gggaagggac tggctgctat tgggcgaagt gccggggcag
gatctcctgt catctcacct 4740tgctcctgcc gagaaagtat ccatcatggc tgatgcaatg
cggcggctgc atacgcttga 4800tccggctacc tgcccattcg accaccaagc gaaacatcgc
atcgagcgag cacgtactcg 4860gatggaagcc ggtcttgtcg atcaggatga tctggacgaa
gagcatcagg ggctcgcgcc 4920agccgaactg ttcgccaggc tcaaggcgag catgcccgac
ggcgaggatc tcgtcgtgac 4980ccatggcgat gcctgcttgc cgaatatcat ggtggaaaat
ggccgctttt ctggattcat 5040cgactgtggc cggctgggtg tggcggaccg ctatcaggac
atagcgttgg ctacccgtga 5100tattgctgaa gagcttggcg gcgaatgggc tgaccgcttc
ctcgtgcttt acggtatcgc 5160cgctcccgat tcgcagcgca tcgccttcta tcgccttctt
gacgagttct tctgaattat 5220taacgcttac aatttcctga tgcggtattt tctccttacg
catctgtgcg gtatttcaca 5280ccgcatacag gtggcacttt tcggggaaat gtgcgcggaa
cccctatttg tttatttttc 5340taaatacatt caaatatgta tccgctcatg agacaataac
cctgataaat gcttcaataa 5400tagcacgtgc taaaacttca tttttaattt aaaaggatct
aggtgaagat cctttttgat 5460aatctcatga ccaaaatccc ttaacgtgag ttttcgttcc
actgagcgtc agaccccgta 5520gaaaagatca aaggatcttc ttgagatcct ttttttctgc
gcgtaatctg ctgcttgcaa 5580acaaaaaaac caccgctacc agcggtggtt tgtttgccgg
atcaagagct accaactctt 5640tttccgaagg taactggctt cagcagagcg cagataccaa
atactgtcct tctagtgtag 5700ccgtagttag gccaccactt caagaactct gtagcaccgc
ctacatacct cgctctgcta 5760atcctgttac cagtggctgc tgccagtggc gataagtcgt
gtcttaccgg gttggactca 5820agacgatagt taccggataa ggcgcagcgg tcgggctgaa
cggggggttc gtgcacacag 5880cccagcttgg agcgaacgac ctacaccgaa ctgagatacc
tacagcgtga gctatgagaa 5940agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc
cggtaagcgg cagggtcgga 6000acaggagagc gcacgaggga gcttccaggg ggaaacgcct
ggtatcttta tagtcctgtc 6060gggtttcgcc acctctgact tgagcgtcga tttttgtgat
gctcgtcagg ggggcggagc 6120ctatggaaaa acgccagcaa cgcggccttt ttacggttcc
tgggcttttg ctggcctttt 6180gctcacatgt tctt
6194284354DNAArtificial SequenceSynthetic
28gactcttcgc gatgtacggg ccagatatac gcgttgacat tgattattga ctagttatta
60atagtaatca attacggggt cattagttca tagcccatat atggagttcc gcgttacata
120acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat tgacgtcaat
180aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc aatgggtgga
240ctatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc caagtacgcc
300ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt acatgacctt
360atgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta ccatggtgat
420gcggttttgg cagtacatca atgggcgtgg atagcggttt gactcacggg gatttccaag
480tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac gggactttcc
540aaaatgtcgt aacaactccg ccccattgac gcaaatgggc ggtaggcgtg tacggtggga
600ggtctatata agcagagctc tctggctaac tagagaaccc actgcttact ggcttatcga
660aattaatacg actcactata gggagaccca agctggctag cgtttaaact taagcttggt
720accgagctcg gatccactag tccagtgtgg tggaattaat tcgcctagag atctggcggc
780ggagagggca gaggaagtct tctaacctgc ggtgacgtgg aggagaatcc cggccctagg
840accatggact acaaagacca tgacggtgat tataaagatc atgacatcga ttacaaggat
900gacgatgaca agatggcccc caagaagaag aggaaggtcg gcattcatgg ggtacccgcc
960gctatggctg agaggccctt ccagtgtcga atctgcatgc gtaacttcag tgaccagtcc
1020aacctgcgcg cccacatccg cacccacacc ggcgagaagc cttttgcctg tgacatttgt
1080gggaggaaat ttgcccgcaa gtccgaccgc atcaagcata ccaagataca cacgggcagc
1140caaaagccct tccagtgtcg aatctgcatg cgtaagtttg cccgctccga caacctgtcc
1200gtgcatacca agatacacac gggcgagaag cccttccagt gtcgaatctg catgcgtaac
1260ttcagtgagc gcggcaccct ggcccgccac atccgcaccc acaccggcga gaagcctttt
1320gcctgtgaca tttgtgggag gaaatttgcc cgctccgacg ccctgaccca gcataccaag
1380atacacctgc ggggatccca gctggtgaag agcgagctgg aggagaagaa gtccgagctg
1440cggcacaagc tgaagtacgt gccccacgag tacatcgagc tgatcgagat cgccaggaac
1500agcacccagg accgcatcct ggagatgaag gtgatggagt tcttcatgaa ggtgtacggc
1560tacaggggaa agcacctggg cggaagcaga aagcctgacg gcgccatcta tacagtgggc
1620agccccatcg attacggcgt gatcgtggac acaaaggcct acagcggcgg ctacaatctg
1680cctatcggcc aggccgacga gatgcagaga tacgtgaagg agaaccagac ccggaataag
1740cacatcaacc ccaacgagtg gtggaaggtg taccctagca gcgtgaccga gttcaagttc
1800ctgttcgtga gcggccactt caagggcaac tacaaggccc agctgaccag gctgaaccgc
1860aaaaccaact gcaatggcgc cgtgctgagc gtggaggagc tgctgatcgg cggcgagatg
1920atcaaagccg gcaccctgac actggaggag gtgcggcgca agttcaacaa cggcgagatc
1980aacttctgat aactcgagtc tagaattctg cagaaaaaaa aaaaaaaaaa aaaaaaaaaa
2040aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
2100aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaagcggccg ctcgagtcta gagggcccgt
2160ttaaacccgc tgatcagcct cgactgtgcc ttctagttgc cagccatctg ttgtttgccc
2220ctcccccgtg ccttccttga ccctggaagg tgccactccc actgtccttt cctaataaaa
2280tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct attctggggg gtggggtggg
2340gcaggacagc aagggggagg attgggaaga caatagcagg catgctgggg atgcggtggg
2400ctctatggct tctactgggc ggttttatgg acagcaagcg aaccggaatt gccagctggg
2460gcgccctctg gtaaggttgg gaagccctgc aaagtaaact ggatggcttt ctcgccgcca
2520aggatctgat ggcgcagggg atcaagctct gatcaagaga caggatgagg atcgtttcgc
2580atgattgaac aagatggatt gcacgcaggt tctccggccg cttgggtgga gaggctattc
2640ggctatgact gggcacaaca gacaatcggc tgctctgatg ccgccgtgtt ccggctgtca
2700gcgcaggggc gcccggttct ttttgtcaag accgacctgt ccggtgccct gaatgaactg
2760caagacgagg cagcgcggct atcgtggctg gccacgacgg gcgttccttg cgcagctgtg
2820ctcgacgttg tcactgaagc gggaagggac tggctgctat tgggcgaagt gccggggcag
2880gatctcctgt catctcacct tgctcctgcc gagaaagtat ccatcatggc tgatgcaatg
2940cggcggctgc atacgcttga tccggctacc tgcccattcg accaccaagc gaaacatcgc
3000atcgagcgag cacgtactcg gatggaagcc ggtcttgtcg atcaggatga tctggacgaa
3060gagcatcagg ggctcgcgcc agccgaactg ttcgccaggc tcaaggcgag catgcccgac
3120ggcgaggatc tcgtcgtgac ccatggcgat gcctgcttgc cgaatatcat ggtggaaaat
3180ggccgctttt ctggattcat cgactgtggc cggctgggtg tggcggaccg ctatcaggac
3240atagcgttgg ctacccgtga tattgctgaa gagcttggcg gcgaatgggc tgaccgcttc
3300ctcgtgcttt acggtatcgc cgctcccgat tcgcagcgca tcgccttcta tcgccttctt
3360gacgagttct tctgaattat taacgcttac aatttcctga tgcggtattt tctccttacg
3420catctgtgcg gtatttcaca ccgcatacag gtggcacttt tcggggaaat gtgcgcggaa
3480cccctatttg tttatttttc taaatacatt caaatatgta tccgctcatg agacaataac
3540cctgataaat gcttcaataa tagcacgtgc taaaacttca tttttaattt aaaaggatct
3600aggtgaagat cctttttgat aatctcatga ccaaaatccc ttaacgtgag ttttcgttcc
3660actgagcgtc agaccccgta gaaaagatca aaggatcttc ttgagatcct ttttttctgc
3720gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc agcggtggtt tgtttgccgg
3780atcaagagct accaactctt tttccgaagg taactggctt cagcagagcg cagataccaa
3840atactgtcct tctagtgtag ccgtagttag gccaccactt caagaactct gtagcaccgc
3900ctacatacct cgctctgcta atcctgttac cagtggctgc tgccagtggc gataagtcgt
3960gtcttaccgg gttggactca agacgatagt taccggataa ggcgcagcgg tcgggctgaa
4020cggggggttc gtgcacacag cccagcttgg agcgaacgac ctacaccgaa ctgagatacc
4080tacagcgtga gctatgagaa agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc
4140cggtaagcgg cagggtcgga acaggagagc gcacgaggga gcttccaggg ggaaacgcct
4200ggtatcttta tagtcctgtc gggtttcgcc acctctgact tgagcgtcga tttttgtgat
4260gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa cgcggccttt ttacggttcc
4320tgggcttttg ctggcctttt gctcacatgt tctt
4354
<210> 29
<211> 4393
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 29
gactcttcgc gatgtacggg
ccagatatac gcgttgacat tgattattga ctagttatta 60atagtaatca attacggggt
cattagttca tagcccatat atggagttcc gcgttacata 120acttacggta aatggcccgc
ctggctgacc gcccaacgac ccccgcccat tgacgtcaat 180aatgacgtat gttcccatag
taacgccaat agggactttc cattgacgtc aatgggtgga 240ctatttacgg taaactgccc
acttggcagt acatcaagtg tatcatatgc caagtacgcc 300ccctattgac gtcaatgacg
gtaaatggcc cgcctggcat tatgcccagt acatgacctt 360atgggacttt cctacttggc
agtacatcta cgtattagtc atcgctatta ccatggtgat 420gcggttttgg cagtacatca
atgggcgtgg atagcggttt gactcacggg gatttccaag 480tctccacccc attgacgtca
atgggagttt gttttggcac caaaatcaac gggactttcc 540aaaatgtcgt aacaactccg
ccccattgac gcaaatgggc ggtaggcgtg tacggtggga 600ggtctatata agcagagctc
tctggctaac tagagaaccc actgcttact ggcttatcga 660aattaatacg actcactata
gggagaccca agctggctag cgtttaaact taagcttggt 720accgagctcg gatccactag
tccagtgtgg tggaattaat tcgccatgga ctacaaagac 780catgacggtg attataaaga
tcatgacatc gattacaagg atgacgatga caagatggcc 840cccaagaaga agaggaaggt
cggcatccac ggggtacccg ccgctatggc tgagaggccc 900ttccagtgtc gaatctgcat
gcgtaacttc agtcagtcct ccgacctgtc ccgccacatc 960cgcacccaca ccggcgagaa
gccttttgcc tgtgacattt gtgggaggaa atttgcctgg 1020cgctcctccc tgcgccagca
taccaagata cacacgcatc ccagggcacc tattcccaag 1080cccttccagt gtcgaatctg
catgcgtaac ttcagtcagt ccggcgacct gacccgccac 1140atccgcaccc acaccggcga
gaagcctttt gcctgtgaca tttgtgggag gaaatttgcc 1200cgccgcgccg accgcgccaa
gcataccaag atacacacgc acccgcgcgc cccgatcccg 1260aagcccttcc agtgtcgaat
ctgcatgcgt aacttcagtc gctccgacga cctgacccgc 1320cacatccgca cccacaccgg
cgagaagcct tttgcctgtg acatttgtgg gaggaaattt 1380gcccagcgct ccaccctgtc
ctcccatacc aagatacacc tgcggggatc ccagctggtg 1440aagagcgagc tggaggagaa
gaagtccgag ctgcggcaca agctgaagta cgtgccccac 1500gagtacatcg agctgatcga
gatcgccagg aacagcaccc aggaccgcat cctggagatg 1560aaggtgatgg agttcttcat
gaaggtgtac ggctacaggg gaaagcacct gggcggaagc 1620agaaagcctg acggcgccat
ctatacagtg ggcagcccca tcgattacgg cgtgatcgtg 1680gacacaaagg cctacagcgg
cggctacaat ctgcctatcg gccaggccga cgagatggag 1740agatacgtgg aggagaacca
gacccgggat aagcacctca accccaacga gtggtggaag 1800gtgtacccta gcagcgtgac
cgagttcaag ttcctgttcg tgagcggcca cttcaagggc 1860aactacaagg cccagctgac
caggctgaac cacatcacca actgcaatgg cgccgtgctg 1920agcgtggagg agctgctgat
cggcggcgag atgatcaaag ccggcaccct gacactggag 1980gaggtgcggc gcaagttcaa
caacggcgag atcaacttca gatcttgata actcgagtct 2040agaattctgc agaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2100aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2160aaaaaaaaaa aagcggccgc
tcgagtctag agggcccgtt taaacccgct gatcagcctc 2220gactgtgcct tctagttgcc
agccatctgt tgtttgcccc tcccccgtgc cttccttgac 2280cctggaaggt gccactccca
ctgtcctttc ctaataaaat gaggaaattg catcgcattg 2340tctgagtagg tgtcattcta
ttctgggggg tggggtgggg caggacagca agggggagga 2400ttgggaagac aatagcaggc
atgctgggga tgcggtgggc tctatggctt ctactgggcg 2460gttttatgga cagcaagcga
accggaattg ccagctgggg cgccctctgg taaggttggg 2520aagccctgca aagtaaactg
gatggctttc tcgccgccaa ggatctgatg gcgcagggga 2580tcaagctctg atcaagagac
aggatgagga tcgtttcgca tgattgaaca agatggattg 2640cacgcaggtt ctccggccgc
ttgggtggag aggctattcg gctatgactg ggcacaacag 2700acaatcggct gctctgatgc
cgccgtgttc cggctgtcag cgcaggggcg cccggttctt 2760tttgtcaaga ccgacctgtc
cggtgccctg aatgaactgc aagacgaggc agcgcggcta 2820tcgtggctgg ccacgacggg
cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg 2880ggaagggact ggctgctatt
gggcgaagtg ccggggcagg atctcctgtc atctcacctt 2940gctcctgccg agaaagtatc
catcatggct gatgcaatgc ggcggctgca tacgcttgat 3000ccggctacct gcccattcga
ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg 3060atggaagccg gtcttgtcga
tcaggatgat ctggacgaag agcatcaggg gctcgcgcca 3120gccgaactgt tcgccaggct
caaggcgagc atgcccgacg gcgaggatct cgtcgtgacc 3180catggcgatg cctgcttgcc
gaatatcatg gtggaaaatg gccgcttttc tggattcatc 3240gactgtggcc ggctgggtgt
ggcggaccgc tatcaggaca tagcgttggc tacccgtgat 3300attgctgaag agcttggcgg
cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc 3360gctcccgatt cgcagcgcat
cgccttctat cgccttcttg acgagttctt ctgaattatt 3420aacgcttaca atttcctgat
gcggtatttt ctccttacgc atctgtgcgg tatttcacac 3480cgcatacagg tggcactttt
cggggaaatg tgcgcggaac ccctatttgt ttatttttct 3540aaatacattc aaatatgtat
ccgctcatga gacaataacc ctgataaatg cttcaataat 3600agcacgtgct aaaacttcat
ttttaattta aaaggatcta ggtgaagatc ctttttgata 3660atctcatgac caaaatccct
taacgtgagt tttcgttcca ctgagcgtca gaccccgtag 3720aaaagatcaa aggatcttct
tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa 3780caaaaaaacc accgctacca
gcggtggttt gtttgccgga tcaagagcta ccaactcttt 3840ttccgaaggt aactggcttc
agcagagcgc agataccaaa tactgtcctt ctagtgtagc 3900cgtagttagg ccaccacttc
aagaactctg tagcaccgcc tacatacctc gctctgctaa 3960tcctgttacc agtggctgct
gccagtggcg ataagtcgtg tcttaccggg ttggactcaa 4020gacgatagtt accggataag
gcgcagcggt cgggctgaac ggggggttcg tgcacacagc 4080ccagcttgga gcgaacgacc
tacaccgaac tgagatacct acagcgtgag ctatgagaaa 4140gcgccacgct tcccgaaggg
agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa 4200caggagagcg cacgagggag
cttccagggg gaaacgcctg gtatctttat agtcctgtcg 4260ggtttcgcca cctctgactt
gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc 4320tatggaaaaa cgccagcaac
gcggcctttt tacggttcct gggcttttgc tggccttttg 4380ctcacatgtt ctt
4393
<210> 30
<211> 4541
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 30
gactcttcgc gatgtacggg ccagatatac gcgttgacat
tgattattga ctagttatta 60atagtaatca attacggggt cattagttca tagcccatat
atggagttcc gcgttacata 120acttacggta aatggcccgc ctggctgacc gcccaacgac
ccccgcccat tgacgtcaat 180aatgacgtat gttcccatag taacgccaat agggactttc
cattgacgtc aatgggtgga 240ctatttacgg taaactgccc acttggcagt acatcaagtg
tatcatatgc caagtacgcc 300ccctattgac gtcaatgacg gtaaatggcc cgcctggcat
tatgcccagt acatgacctt 360atgggacttt cctacttggc agtacatcta cgtattagtc
atcgctatta ccatggtgat 420gcggttttgg cagtacatca atgggcgtgg atagcggttt
gactcacggg gatttccaag 480tctccacccc attgacgtca atgggagttt gttttggcac
caaaatcaac gggactttcc 540aaaatgtcgt aacaactccg ccccattgac gcaaatgggc
ggtaggcgtg tacggtggga 600ggtctatata agcagagctc tctggctaac tagagaaccc
actgcttact ggcttatcga 660aattaatacg actcactata gggagaccca agctggctag
cgtttaaact taagcttggt 720acctcagacg agacttggaa gacagtcaca tctcagcagc
tcctctgccg ttatccagcc 780tgcctctgac aagaacccaa tgcccaaccc taggccagcc
aagcctatgg ctccttcctt 840ggcccttggc ccatccccag gagtcttgcc aagctggaag
actgcaccca agggctcaga 900acttctaggg accaggggct ctgggggacc cttccaaggt
cgggacctgc gaagtggggc 960ccacacctct tcttccttga accccctgcc accatcccag
ctgcagctgc ctacagtgcc 1020cctagtcatg gtggcaccgt ctggggcccg actaggtccc
tcaccccacc tacaggccct 1080tctccaggac agaccacact tcatgcatca gctctccact
gtggatgccc atgcccagac 1140ccctgtgctc caagtgcgtc cactggacaa cccagccatg
atcagcctcc caccaccttc 1200tgctgccact ggggtcttct ccctcaaggc ccggcctggc
ctgccacctg ggatcaatgt 1260ggccagtctg gaatgggtgt ccagggagcc agctctactc
tgcaccttcc cacgctcggg 1320tacacccagg aaagacagca accttttggc tgcaccccaa
ggatcctacc cactgctggc 1380aaatggagtc tgcaagtggc ctggttgtga gaaggtcttc
gaggagccag aagagtttct 1440caagcactgc caagcagatc atctcctgga tgagaaaggc
aaggcccagt gcctcctcca 1500gagagaagtg gtgcagtctc tggagcagca gctggagctg
gaaaaggaga agctgggagc 1560tatgcaggcc cacctggctg ggaagatggc gctggccaag
gctccatctg tggcctcaat 1620ggacaagagc tcttgctgca tcgtagccac cagtactcag
ggcagtgtgc tcccggcctg 1680gtctgctcct cgggaggctc cagacggcgg cctgtttgca
gtgcggaggc acctctgggg 1740aagccatggc aatagttcct tcccagagtt cttccacaac
atggactact tcaagtacca 1800caatatgcga ccccctttca cctatgccac ccttatccga
tgggccatcc tggaagcccc 1860ggagaggcag aggacactca atgaaatcta ccattggttt
actcgcatgt tcgcctactt 1920cagaaaccac cccgccacct ggaagaatgc catccgccac
aacctgagcc tgcacaagtg 1980ctttgtgcga gtggagagcg agaagggagc agtgtggacc
gtagatgaat ttgagtttcg 2040caagaagagg agccaacgcc ccaacaagtg ctccaatccc
tgcccttgac ctcaaaacca 2100agaaaaggtg ggcgggggag ggggccaaaa ccatgagact
gaggctgtgg gggcaaggag 2160gcaagtccta cgtgtaccta tggaaaccgg aattctgcag
aaaaaaaaaa aaaaaaaaaa 2220aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa 2280aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
gcggccgctc gagtctagag 2340ggcccgttta aacccgctga tcagcctcga ctgtgccttc
tagttgccag ccatctgttg 2400tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc
cactcccact gtcctttcct 2460aataaaatga ggaaattgca tcgcattgtc tgagtaggtg
tcattctatt ctggggggtg 2520gggtggggca ggacagcaag ggggaggatt gggaagacaa
tagcaggcat gctggggatg 2580cggtgggctc tatggcttct actgggcggt tttatggaca
gcaagcgaac cggaattgcc 2640agctggggcg ccctctggta aggttgggaa gccctgcaaa
gtaaactgga tggctttctc 2700gccgccaagg atctgatggc gcaggggatc aagctctgat
caagagacag gatgaggatc 2760gtttcgcatg attgaacaag atggattgca cgcaggttct
ccggccgctt gggtggagag 2820gctattcggc tatgactggg cacaacagac aatcggctgc
tctgatgccg ccgtgttccg 2880gctgtcagcg caggggcgcc cggttctttt tgtcaagacc
gacctgtccg gtgccctgaa 2940tgaactgcaa gacgaggcag cgcggctatc gtggctggcc
acgacgggcg ttccttgcgc 3000agctgtgctc gacgttgtca ctgaagcggg aagggactgg
ctgctattgg gcgaagtgcc 3060ggggcaggat ctcctgtcat ctcaccttgc tcctgccgag
aaagtatcca tcatggctga 3120tgcaatgcgg cggctgcata cgcttgatcc ggctacctgc
ccattcgacc accaagcgaa 3180acatcgcatc gagcgagcac gtactcggat ggaagccggt
cttgtcgatc aggatgatct 3240ggacgaagag catcaggggc tcgcgccagc cgaactgttc
gccaggctca aggcgagcat 3300gcccgacggc gaggatctcg tcgtgaccca tggcgatgcc
tgcttgccga atatcatggt 3360ggaaaatggc cgcttttctg gattcatcga ctgtggccgg
ctgggtgtgg cggaccgcta 3420tcaggacata gcgttggcta cccgtgatat tgctgaagag
cttggcggcg aatgggctga 3480ccgcttcctc gtgctttacg gtatcgccgc tcccgattcg
cagcgcatcg ccttctatcg 3540ccttcttgac gagttcttct gaattattaa cgcttacaat
ttcctgatgc ggtattttct 3600ccttacgcat ctgtgcggta tttcacaccg catacaggtg
gcacttttcg gggaaatgtg 3660cgcggaaccc ctatttgttt atttttctaa atacattcaa
atatgtatcc gctcatgaga 3720caataaccct gataaatgct tcaataatag cacgtgctaa
aacttcattt ttaatttaaa 3780aggatctagg tgaagatcct ttttgataat ctcatgacca
aaatccctta acgtgagttt 3840tcgttccact gagcgtcaga ccccgtagaa aagatcaaag
gatcttcttg agatcctttt 3900tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac
cgctaccagc ggtggtttgt 3960ttgccggatc aagagctacc aactcttttt ccgaaggtaa
ctggcttcag cagagcgcag 4020ataccaaata ctgtccttct agtgtagccg tagttaggcc
accacttcaa gaactctgta 4080gcaccgccta catacctcgc tctgctaatc ctgttaccag
tggctgctgc cagtggcgat 4140aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac
cggataaggc gcagcggtcg 4200ggctgaacgg ggggttcgtg cacacagccc agcttggagc
gaacgaccta caccgaactg 4260agatacctac agcgtgagct atgagaaagc gccacgcttc
ccgaagggag aaaggcggac 4320aggtatccgg taagcggcag ggtcggaaca ggagagcgca
cgagggagct tccaggggga 4380aacgcctggt atctttatag tcctgtcggg tttcgccacc
tctgacttga gcgtcgattt 4440ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg
ccagcaacgc ggccttttta 4500cggttcctgg gcttttgctg gccttttgct cacatgttct t
4541
<210> 31
<211> 6612
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 31
gctgcgcgct cgctcgctca ctgaggccgc ccgggcaaag cccgggcgtc gggcgacctt
60tggtcgcccg gcctcagtga gcgagcgagc gcgcagagag ggagtggcca actccatcac
120taggggttcc ttgtagttaa tgattaaccc gccatgctac ttatctacca gggtaatggg
180gatcctctag aactatagct agtcgacatt gattattgac tagttattaa tagtaatcaa
240ttacggggtc attagttcat agcccatata tggagttccg cgttacataa cttacggtaa
300atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata atgacgtatg
360ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac tatttacggt
420aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc cctattgacg
480tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta tgggactttc
540ctacttggca gtacatctac gtattagtca tcgctattac catggtcgag gtgagcccca
600cgttctgctt cactctcccc atctcccccc cctccccacc cccaattttg tatttattta
660ttttttaatt attttgtgca gcgatggggg cggggggggg gggggggcgc gcgccaggcg
720gggcggggcg gggcgagggg cggggcgggg cgaggcggag aggtgcggcg gcagccaatc
780agagcggcgc gctccgaaag tttcctttta tggcgaggcg gcggcggcgg cggccctata
840aaaagcgaag cgcgcggcgg gcggggagtc gctgcgacgc tgccttcgcc ccgtgccccg
900ctccgccgcc gcctcgcgcc gcccgccccg gctctgactg accgcgttac tcccacaggt
960gagcgggcgg gacggccctt ctcctccggg ctgtaattag cgcttggttt aatgacggct
1020tgtttctttt ctgtggctgc gtgaaagcct tgaggggctc cgggagggcc ctttgtgcgg
1080ggggagcggc tcggggggtg cgtgcgtgtg tgtgtgcgtg gggagcgccg cgtgcggctc
1140cgcgctgccc ggcggctgtg agcgctgcgg gcgcggcgcg gggctttgtg cgctccgcag
1200tgtgcgcgag gggagcgcgg ccgggggcgg tgccccgcgg tgcggggggg gctgcgaggg
1260gaacaaaggc tgcgtgcggg gtgtgtgcgt gggggggtga gcagggggtg tgggcgcgtc
1320ggtcgggctg caaccccccc tgcacccccc tccccgagtt gctgagcacg gcccggcttc
1380gggtgcgggg ctccgtacgg ggcgtggcgc ggggctcgcc gtgccgggcg gggggtggcg
1440gcaggtgggg gtgccgggcg gggcggggcc gcctcgggcc ggggagggct cgggggaggg
1500gcgcggcggc ccccggagcg ccggcggctg tcgaggcgcg gcgagccgca gccattgcct
1560tttatggtaa tcgtgcgaga gggcgcaggg acttcctttg tcccaaatct gtgcggagcc
1620gaaatctggg aggcgccgcc gcaccccctc tagcgggcgc ggggcgaagc ggtgcggcgc
1680cggcaggaag gaaatgggcg gggagggcct tcgtgcgtcg ccgcgccgcc gtccccttct
1740ccctctccag cctcggggct gtccgcgggg ggacggctgc cttcgggggg gacggggcag
1800ggcggggttc ggcttctggc gtgtgaccgg cggctctaga gcctctgcta accatgttca
1860tgccttcttc tttttcctac agctcctggg caacgtgctg gttattgtgc tgtctcatca
1920ttttggcaaa gaattcacgc gtggtacctc agacgagact tggaagacag tcacatctca
1980gcagctcctc tgccgttatc cagcctgcct ctgacaagaa cccaatgccc aaccctaggc
2040cagccaagcc tatggctcct tccttggccc ttggcccatc cccaggagtc ttgccaagct
2100ggaagactgc acccaagggc tcagaacttc tagggaccag gggctctggg ggacccttcc
2160aaggtcggga cctgcgaagt ggggcccaca cctcttcttc cttgaacccc ctgccaccat
2220cccagctgca gctgcctaca gtgcccctag tcatggtggc accgtctggg gcccgactag
2280gtccctcacc ccacctacag gcccttctcc aggacagacc acacttcatg catcagctct
2340ccactgtgga tgcccatgcc cagacccctg tgctccaagt gcgtccactg gacaacccag
2400ccatgatcag cctcccacca ccttctgctg ccactggggt cttctccctc aaggcccggc
2460ctggcctgcc acctgggatc aatgtggcca gtctggaatg ggtgtccagg gagccagctc
2520tactctgcac cttcccacgc tcgggtacac ccaggaaaga cagcaacctt ttggctgcac
2580cccaaggatc ctacccactg ctggcaaatg gagtctgcaa gtggcctggt tgtgagaagg
2640tcttcgagga gccagaagag tttctcaagc actgccaagc agatcatctc ctggatgaga
2700aaggcaaggc ccagtgcctc ctccagagag aagtggtgca gtctctggag cagcagctgg
2760agctggaaaa ggagaagctg ggagctatgc aggcccacct ggctgggaag atggcgctgg
2820ccaaggctcc atctgtggcc tcaatggaca agagctcttg ctgcatcgta gccaccagta
2880ctcagggcag tgtgctcccg gcctggtctg ctcctcggga ggctccagac ggcggcctgt
2940ttgcagtgcg gaggcacctc tggggaagcc atggcaatag ttccttccca gagttcttcc
3000acaacatgga ctacttcaag taccacaata tgcgaccccc tttcacctat gccaccctta
3060tccgatgggc catcctggaa gccccggaga ggcagaggac actcaatgaa atctaccatt
3120ggtttactcg catgttcgcc tacttcagaa accaccccgc cacctggaag aatgccatcc
3180gccacaacct gagcctgcac aagtgctttg tgcgagtgga gagcgagaag ggagcagtgt
3240ggaccgtaga tgaatttgag tttcgcaaga agaggagcca acgccccaac aagtgctcca
3300atccctgccc ttgacctcaa aaccaagaaa aggtgggcgg gggagggggc caaaaccatg
3360agactgaggc tgtgggggca aggaggcaag tcctacgtgt acctatggaa accgctcgag
3420gacggggtga actacgcctg aggatccgat ctttttccct ctgccaaaaa ttatggggac
3480atcatgaagc cccttgagca tctgacttct ggctaataaa ggaaatttat tttcattgca
3540atagtgtgtt ggaatttttt gtgtctctca ctcggaagca attcgttgat ctgaatttcg
3600accacccata atacccatta ccctggtaga taagtagcat ggcgggttaa tcattaacta
3660caaggaaccc ctagtgatgg agttggccac tccctctctg cgcgctcgct cgctcactga
3720ggccgggcga ccaaaggtcg cccgacgccc gggctttgcc cgggcggcct cagtgagcga
3780gcgagcgcgc agccttaatt aacctaattc actggccgtc gttttacaac gtcgtgactg
3840ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca catccccctt tcgccagctg
3900gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa cagttgcgca gcctgaatgg
3960cgaatgggac gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag
4020cgtgaccgct acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt
4080tctcgccacg ttcgccggct ttccccgtca agctctaaat cgggggctcc ctttagggtt
4140ccgatttagt gctttacggc acctcgaccc caaaaaactt gattagggtg atggttcacg
4200tagtgggcca tcgccctgat agacggtttt tcgccctttg acgttggagt ccacgttctt
4260taatagtgga ctcttgttcc aaactggaac aacactcaac cctatctcgg tctattcttt
4320tgatttataa gggattttgc cgatttcggc ctattggtta aaaaatgagc tgatttaaca
4380aaaatttaac gcgaatttta acaaaatatt aacgcttaca atttaggtgg cacttttcgg
4440ggaaatgtgc gcggaacccc tatttgttta tttttctaaa tacattcaaa tatgtatccg
4500ctcatgagac aataaccctg ataaatgctt caataatatt gaaaaaggaa gagtatgagt
4560attcaacatt tccgtgtcgc ccttattccc ttttttgcgg cattttgcct tcctgttttt
4620gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg tgcacgagtg
4680ggttacatcg aactggatct caacagcggt aagatccttg agagttttcg ccccgaagaa
4740cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt atcccgtatt
4800gacgccgggc aagagcaact cggtcgccgc atacactatt ctcagaatga cttggttgag
4860tactcaccag tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt
4920gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac gatcggagga
4980ccgaaggagc taaccgcttt tttgcacaac atgggggatc atgtaactcg ccttgatcgt
5040tgggaaccgg agctgaatga agccatacca aacgacgagc gtgacaccac gatgcctgta
5100gcaatggcaa caacgttgcg caaactatta actggcgaac tacttactct agcttcccgg
5160caacaattaa tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc
5220cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg gtctcgcggt
5280atcattgcag cactggggcc agatggtaag ccctcccgta tcgtagttat ctacacgacg
5340gggagtcagg caactatgga tgaacgaaat agacagatcg ctgagatagg tgcctcactg
5400attaagcatt ggtaactgtc agaccaagtt tactcatata tactttagat tgatttaaaa
5460cttcattttt aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa
5520atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga
5580tcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg
5640ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact
5700ggcttcagca gagcgcagat accaaatact gttcttctag tgtagccgta gttaggccac
5760cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg
5820gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg
5880gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga
5940acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc
6000gaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg
6060agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc
6120tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc
6180agcaacgcgg cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt
6240cctgcgttat cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc
6300gctcgccgca gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc
6360ccaatacgca aaccgcctct ccccgcgcgt tggccgattc attaatgcag ctggcacgac
6420aggtttcccg actggaaagc gggcagtgag cgcaacgcaa ttaatgtgag ttagctcact
6480cattaggcac cccaggcttt acactttatg cttccggctc gtatgttgtg tggaattgtg
6540agcggataac aatttcacac aggaaacagc tatgaccatg attacgccag atttaattaa
6600ggccttaatt ag
6612
<210> 32
<211> 6898
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 32
gacgcgccct gtagcggcgc
attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc 60gctacacttg ccagcgccct
agcgcccgct cctttcgctt tcttcccttc ctttctcgcc 120acgttcgccg gctttccccg
tcaagctcta aatcgggggc tccctttagg gttccgattt 180agtgctttac ggcacctcga
ccccaaaaaa cttgattagg gtgatggttc acgtagtggg 240ccatcgccct gatagacggt
ttttcgccct ttgacgttgg agtccacgtt ctttaatagt 300ggactcttgt tccaaactgg
aacaacactc aaccctatct cggtctattc ttttgattta 360taagggattt tgccgatttc
ggcctattgg ttaaaaaatg agctgattta acaaaaattt 420aacgcgaatt ttaacaaaat
attaacgttt acaatttcag gtggcacttt tcggggaaat 480gtgcgcggaa cccctatttg
tttatttttc taaatacatt caaatatgta tccgctcatg 540agacaataac cctgataaat
gcttcaataa tattgaaaaa ggaagagtat gagtattcaa 600catttccgtg tcgcccttat
tccctttttt gcggcatttt gccttcctgt ttttgctcac 660ccagaaacgc tggtgaaagt
aaaagatgct gaagatcagt tgggtgcacg agtgggttac 720atcgaactgg atctcaacag
cggtaagatc cttgagagtt ttcgccccga agaacgtttt 780ccaatgatga gcacttttaa
agttctgcta tgtggcgcgg tattatcccg tattgacgcc 840gggcaagagc aactcggtcg
ccgcatacac tattctcaga atgacttggt tgagtactca 900ccagtcacag aaaagcatct
tacggatggc atgacagtaa gagaattatg cagtgctgcc 960ataaccatga gtgataacac
tgcggccaac ttacttctga caacgatcgg aggaccgaag 1020gagctaaccg cttttttgca
caacatgggg gatcatgtaa ctcgccttga tcgttgggaa 1080ccggagctga atgaagccat
accaaacgac gagcgtgaca ccacgatgcc tgtagcaatg 1140gcaacaacgt tgcgcaaact
attaactggc gaactactta ctctagcttc ccggcaacaa 1200ttaatagact ggatggaggc
ggataaagtt gcaggaccac ttctgcgctc ggcccttccg 1260gctggctggt ttattgctga
taaatctgga gccggtgagc gtgggtctcg cggtatcatt 1320gcagcactgg ggccagatgg
taagccctcc cgtatcgtag ttatctacac gacggggagt 1380caggcaacta tggatgaacg
aaatagacag atcgctgaga taggtgcctc actgattaag 1440cattggtaac tgtcagacca
agtttactca tatatacttt agattgattt aaaacttcat 1500ttttaattta aaaggatcta
ggtgaagatc ctttttgata atctcatgac caaaatccct 1560taacgtgagt tttcgttcca
ctgagcgtca gaccccgtag aaaagatcaa aggatcttct 1620tgagatcctt tttttctgcg
cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca 1680gcggtggttt gtttgccgga
tcaagagcta ccaactcttt ttccgaaggt aactggcttc 1740agcagagcgc agataccaaa
tactgtcctt ctagtgtagc cgtagttagg ccaccacttc 1800aagaactctg tagcaccgcc
tacatacctc gctctgctaa tcctgttacc agtggctgct 1860gccagtggcg ataagtcgtg
tcttaccggg ttggactcaa gacgatagtt accggataag 1920gcgcagcggt cgggctgaac
ggggggttcg tgcacacagc ccagcttgga gcgaacgacc 1980tacaccgaac tgagatacct
acagcgtgag cattgagaaa gcgccacgct tcccgaaggg 2040agaaaggcgg acaggtatcc
ggtaagcggc agggtcggaa caggagagcg cacgagggag 2100cttccagggg gaaacgcctg
gtatctttat agtcctgtcg ggtttcgcca cctctgactt 2160gagcgtcgat ttttgtgatg
ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac 2220gcggcctttt tacggttcct
ggccttttgc tggccttttg ctcacatgtt ctttcctgcg 2280ttatcccctg attctgtgga
taaccgtatt accgcctttg agtgagctga taccgctcgc 2340cgcagccgaa cgaccgagcg
cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg 2400cggtattttc tccttacgca
tctgtgcggt atttcacacc gcagaccagc cgcgtaacct 2460ggcaaaatcg gttacggttg
agtaataaat ggatgccctg cgtaagcggg tgtgggcgga 2520caataaagtc ttaaactgaa
caaaatagat ctaaactatg acaataaagt cttaaactag 2580acagaatagt tgtaaactga
aatcagtcca gttatgctgt gaaaaagcat actggacttt 2640tgttatggct aaagcaaact
cttcattttc tgaagtgcaa attgcccgtc gtattaaaga 2700ggggcgtggc caagggcatg
gtaaagacta tattcgcggc gttgtgacaa tttaccgaac 2760aactccgcgg ccgggaagcc
gatctcggct tgaacgaatt gttaggtggc ggtacttggg 2820tcgatatcaa agtgcatcac
ttcttcccgt atgcccaact ttgtatagag agccactgcg 2880ggatcgtcac cgtaatctgc
ttgcacgtag atcacataag caccaagcgc gttggcctca 2940tgcttgagga gattgatgag
cgcggtggca atgccctgcc tccggtgctc gccggagact 3000gcgagatcat agatatagat
ctcactacgc ggctgctcaa acctgggcag aacgtaagcc 3060gcgagagcgc caacaaccgc
ttcttggtcg aaggcagcaa gcgcgatgaa tgtcttacta 3120cggagcaagt tcccgaggta
atcggagtcc ggctgatgtt gggagtaggt ggctacgtct 3180ccgaactcac gaccgaaaag
atcaagagca gcccgcatgg atttgacttg gtcagggccg 3240agcctacatg tgcgaatgat
gcccatactt gagccaccta actttgtttt agggcgactg 3300ccctgctgcg taacatcgtt
gctgctgcgt aacatcgttg ctgctccata acatcaaaca 3360tcgacccacg gcgtaacgcg
cttgctgctt ggatgcccga ggcatagact gtacaaaaaa 3420acagtcataa caagccatga
aaaccgccac tgcgccgtta ccaccgctgc gttcggtcaa 3480ggttctggac cagttgcgtg
agcgcatacg ctacttgcat tacagtttac gaaccgaaca 3540ggcttatgtc aactgggttc
gtgccttcat ccgtttccac ggtgtgcgtc acccggcaac 3600cttgggcagc agcgaagtcg
aggcatttct gtcctggctg gcgaacgagc gcaaggtttc 3660ggtctccacg catcgtcagg
cattggcggc cttgctgttc ttctacggca aggtgctgtg 3720cacggatctg ccctggcttc
aggagatcgg aagacctcgg ccgtcgcggc gcttgccggt 3780ggtgctgacc ccggatgaag
tggttcgcat cctcggtttt ctggaaggcg agcatcgttt 3840gttcgcccag gactctagct
atagttctag tggttggcta cattattgaa gcatttatca 3900gggttattgt ctcagagcat
gcctgcaggc agctgcgcgc tcgctcgctc actgaggccg 3960cccgggcgtc gggcgacctt
tggtcgcccg gcctcagtga gcgagcgagc gcgcagagag 4020ggagtggcca actccatcac
taggggttcc tgcggccgca cgcgtcgtct cacatgtggc 4080gcgccaacat gtctcgagct
gcgcgctcgc tcgctcactg aggccgcccg ggcaaagccc 4140gggcgtcggg cgacctttgg
tcgcccggcc tcagtgagcg agcgagcgcg cagagaggga 4200gtggccaact ccatcactag
gggttcctgt gatagagaaa agtgaaagtc gagctcggta 4260cccgggtcga ggtaggcgtg
tacggtggga ggcctatata agcagagctc gtttagtgaa 4320ccgtcagatc gcctggagac
gccatccacg ctgttttgac ctccatagaa gacaccggga 4380ccgatccagc ctccgcggcc
ccgaattctg cagatatcca gcacagtggc ggccgctagg 4440gctagcggat cctctagaac
tatagctagt cgacattgat tattgactag ttattaatag 4500taatcaatta cggggtcatt
agttcatagc ccatatatgg agttccgcgt tacataactt 4560acggtaaatg gcccgcctgg
ctgaccgccc aacgaccccc gcccattgac gtcaataatg 4620acgtatgttc ccatagtaac
gccaataggg actttccatt gacgtcaatg ggtggactat 4680ttacggtaaa ctgcccactt
ggcagtacat caagtgtatc atatgccaag tacgccccct 4740attgacgtca atgacggtaa
atggcccgcc tggcattatg cccagtacat gaccttatgg 4800gactttccta cttggcagta
catctacgta ttagtcatcg ctattaccat gtcgaggcca 4860cgttctgctt cactctcccc
atctcccccc cctccccacc cccaattttg tatttattta 4920ttttttaatt attttgtgca
gcgatggggg cggggggggg gggcgcgcgc caggcggggc 4980ggggcggggc gaggggcggg
gcggggcgag gcggagaggt gcggcggcag ccaatcagag 5040cggcgcgctc cgaaagtttc
cttttatggc gaggcggcgg cggcggcggc cctataaaaa 5100gcgaagcgcg cggcgggcgg
gagcaagctt tattgcggta gtttatcaca gttaaattgc 5160taacgcagtc agtgcttctg
acacaacagt ctcgaactta agctgcagaa gttggtcgtg 5220aggcactggg caggtaagta
tcaaggttac aagacaggtt taaggagacc aatagaaact 5280gggcttgtcg agacagagaa
gactcttgcg tttctgatag gcacctattg gtcttactga 5340catccacttt gcctttctct
ccacaggtgt ccactcccag ttcaattaca gctcttaagg 5400ctagagtact taatacgact
cactataggc tagcctcgag gccgccatgg ccaagtcgca 5460cctactgcag tggctactgc
tgcttcctac cctctgctgc ccaggtgcag ctatcacgtc 5520ggcctcatcc ctggagtgtg
cacaaggccc tcaattctgg tgccaaagcc tggagcatgc 5580agtgcagtgc agagccctgg
ggcactgcct gcaggaagtc tgggggcatg caggagctaa 5640tgacctgtgc caagagtgtg
aggatattgt ccacctcctc acaaagatga ccaaggaaga 5700tgctttccag gaagcaatcc
ggaagttcct ggaacaagaa tgtgatatcc ttcccttgaa 5760gctgcttgtg ccccggtgtc
gccaagtgct tgatgtctac ctgcccctgg ttattgacta 5820cttccagagc cagattaacc
ccaaagccat ctggaggttg aggattagag tccactagat 5880ggggatacgc ggaacggtcg
ggcgagtcat tagagtcctg gaaccggagg gtttaacgac 5940ccaggaaccc ctagtgatgg
agttggccac tccctctctg cgcgctcgct cgctcactga 6000ggccgggcga ccaaaggtcg
cccgacgccc gggctttgcc cgggcggcct cagtgagcga 6060gcgagcgcgc agccttaatt
aatccggacc acgtgcggac cgagcggccg caggaacccc 6120tagtgatgga gttggccact
ccctctctgc gcgctcgctc gctcactgag gccgggcgac 6180caaaggtcgc ccgacgcccg
ggctttgccc gggcggcctc agtgagcgag cgagcgcgca 6240gctgcctgca ggaagctgta
agcttgtcga gaagtactag aggatcataa tcagccatac 6300cacatttgta gaggttttac
ttgctttaaa aaacctccca cacctccccc tgaacctgaa 6360acataaaatg aatgcaattg
ttgttgttaa cttgtttatt gcagcttata atggttacaa 6420ataaagcaat agcatcacaa
atttcacaaa taaagcattt ttttcactgc attctagttg 6480tggtttgtcc aaactcatca
atgtatctta tcatgtctgg atctgatcac tgatatcgcc 6540taggagatcc gaaccagata
agtgaaatct agttccaaac tattttgtca tttttaattt 6600tcgtattagc ttacgacgct
acacccagtt cccatctatt ttgtcactct tccctaaata 6660atccttaaaa actccatttc
cacccctccc agttcccaac tattttgtcc gcccacagcg 6720gggcattttt cttcctgtta
tgtttttaat caaacatcct gccaactcca tgtgacaaac 6780cgtcatcttc ggctactttt
tctctgtcac agaatgaaaa tttttctgtc atctcttcgt 6840tattaatgtt tgtaattgac
tgaatatcaa cgcttatttg cagcctgaat ggcgaatg 6898
<210> 33
<211> 7026
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic
<400> 33
gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg
tggttacgcg cagcgtgacc 60gctacacttg ccagcgccct agcgcccgct cctttcgctt
tcttcccttc ctttctcgcc 120acgttcgccg gctttccccg tcaagctcta aatcgggggc
tccctttagg gttccgattt 180agtgctttac ggcacctcga ccccaaaaaa cttgattagg
gtgatggttc acgtagtggg 240ccatcgccct gatagacggt ttttcgccct ttgacgttgg
agtccacgtt ctttaatagt 300ggactcttgt tccaaactgg aacaacactc aaccctatct
cggtctattc ttttgattta 360taagggattt tgccgatttc ggcctattgg ttaaaaaatg
agctgattta acaaaaattt 420aacgcgaatt ttaacaaaat attaacgttt acaatttcag
gtggcacttt tcggggaaat 480gtgcgcggaa cccctatttg tttatttttc taaatacatt
caaatatgta tccgctcatg 540agacaataac cctgataaat gcttcaataa tattgaaaaa
ggaagagtat gagtattcaa 600catttccgtg tcgcccttat tccctttttt gcggcatttt
gccttcctgt ttttgctcac 660ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt
tgggtgcacg agtgggttac 720atcgaactgg atctcaacag cggtaagatc cttgagagtt
ttcgccccga agaacgtttt 780ccaatgatga gcacttttaa agttctgcta tgtggcgcgg
tattatcccg tattgacgcc 840gggcaagagc aactcggtcg ccgcatacac tattctcaga
atgacttggt tgagtactca 900ccagtcacag aaaagcatct tacggatggc atgacagtaa
gagaattatg cagtgctgcc 960ataaccatga gtgataacac tgcggccaac ttacttctga
caacgatcgg aggaccgaag 1020gagctaaccg cttttttgca caacatgggg gatcatgtaa
ctcgccttga tcgttgggaa 1080ccggagctga atgaagccat accaaacgac gagcgtgaca
ccacgatgcc tgtagcaatg 1140gcaacaacgt tgcgcaaact attaactggc gaactactta
ctctagcttc ccggcaacaa 1200ttaatagact ggatggaggc ggataaagtt gcaggaccac
ttctgcgctc ggcccttccg 1260gctggctggt ttattgctga taaatctgga gccggtgagc
gtgggtctcg cggtatcatt 1320gcagcactgg ggccagatgg taagccctcc cgtatcgtag
ttatctacac gacggggagt 1380caggcaacta tggatgaacg aaatagacag atcgctgaga
taggtgcctc actgattaag 1440cattggtaac tgtcagacca agtttactca tatatacttt
agattgattt aaaacttcat 1500ttttaattta aaaggatcta ggtgaagatc ctttttgata
atctcatgac caaaatccct 1560taacgtgagt tttcgttcca ctgagcgtca gaccccgtag
aaaagatcaa aggatcttct 1620tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa
caaaaaaacc accgctacca 1680gcggtggttt gtttgccgga tcaagagcta ccaactcttt
ttccgaaggt aactggcttc 1740agcagagcgc agataccaaa tactgtcctt ctagtgtagc
cgtagttagg ccaccacttc 1800aagaactctg tagcaccgcc tacatacctc gctctgctaa
tcctgttacc agtggctgct 1860gccagtggcg ataagtcgtg tcttaccggg ttggactcaa
gacgatagtt accggataag 1920gcgcagcggt cgggctgaac ggggggttcg tgcacacagc
ccagcttgga gcgaacgacc 1980tacaccgaac tgagatacct acagcgtgag cattgagaaa
gcgccacgct tcccgaaggg 2040agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa
caggagagcg cacgagggag 2100cttccagggg gaaacgcctg gtatctttat agtcctgtcg
ggtttcgcca cctctgactt 2160gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc
tatggaaaaa cgccagcaac 2220gcggcctttt tacggttcct ggccttttgc tggccttttg
ctcacatgtt ctttcctgcg 2280ttatcccctg attctgtgga taaccgtatt accgcctttg
agtgagctga taccgctcgc 2340cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg
aagcggaaga gcgcctgatg 2400cggtattttc tccttacgca tctgtgcggt atttcacacc
gcagaccagc cgcgtaacct 2460ggcaaaatcg gttacggttg agtaataaat ggatgccctg
cgtaagcggg tgtgggcgga 2520caataaagtc ttaaactgaa caaaatagat ctaaactatg
acaataaagt cttaaactag 2580acagaatagt tgtaaactga aatcagtcca gttatgctgt
gaaaaagcat actggacttt 2640tgttatggct aaagcaaact cttcattttc tgaagtgcaa
attgcccgtc gtattaaaga 2700ggggcgtggc caagggcatg gtaaagacta tattcgcggc
gttgtgacaa tttaccgaac 2760aactccgcgg ccgggaagcc gatctcggct tgaacgaatt
gttaggtggc ggtacttggg 2820tcgatatcaa agtgcatcac ttcttcccgt atgcccaact
ttgtatagag agccactgcg 2880ggatcgtcac cgtaatctgc ttgcacgtag atcacataag
caccaagcgc gttggcctca 2940tgcttgagga gattgatgag cgcggtggca atgccctgcc
tccggtgctc gccggagact 3000gcgagatcat agatatagat ctcactacgc ggctgctcaa
acctgggcag aacgtaagcc 3060gcgagagcgc caacaaccgc ttcttggtcg aaggcagcaa
gcgcgatgaa tgtcttacta 3120cggagcaagt tcccgaggta atcggagtcc ggctgatgtt
gggagtaggt ggctacgtct 3180ccgaactcac gaccgaaaag atcaagagca gcccgcatgg
atttgacttg gtcagggccg 3240agcctacatg tgcgaatgat gcccatactt gagccaccta
actttgtttt agggcgactg 3300ccctgctgcg taacatcgtt gctgctgcgt aacatcgttg
ctgctccata acatcaaaca 3360tcgacccacg gcgtaacgcg cttgctgctt ggatgcccga
ggcatagact gtacaaaaaa 3420acagtcataa caagccatga aaaccgccac tgcgccgtta
ccaccgctgc gttcggtcaa 3480ggttctggac cagttgcgtg agcgcatacg ctacttgcat
tacagtttac gaaccgaaca 3540ggcttatgtc aactgggttc gtgccttcat ccgtttccac
ggtgtgcgtc acccggcaac 3600cttgggcagc agcgaagtcg aggcatttct gtcctggctg
gcgaacgagc gcaaggtttc 3660ggtctccacg catcgtcagg cattggcggc cttgctgttc
ttctacggca aggtgctgtg 3720cacggatctg ccctggcttc aggagatcgg aagacctcgg
ccgtcgcggc gcttgccggt 3780ggtgctgacc ccggatgaag tggttcgcat cctcggtttt
ctggaaggcg agcatcgttt 3840gttcgcccag gactctagct atagttctag tggttggcta
cattattgaa gcatttatca 3900gggttattgt ctcagagcat gcctgcaggc agctgcgcgc
tcgctcgctc actgaggccg 3960cccgggcgtc gggcgacctt tggtcgcccg gcctcagtga
gcgagcgagc gcgcagagag 4020ggagtggcca actccatcac taggggttcc tgcggccgca
cgcgtggagc tagttattaa 4080tagtaatcaa ttacggggtc attagttcat agcccatata
tggagttccg cgttacataa 4140cttacggtaa atggcccgcc tggctgaccg cccaacgacc
cccgcccatt gacgtcaata 4200atgacgtatg ttcccatagt aacgtcaata gggactttcc
attgacgtca atgggtggag 4260tatttacggt aaactgccca cttggcagta catcaagtgt
atcatatgcc aagtacgccc 4320cctattgacg tcaatgacgg taaatggccc gcctggcatt
atgcccagta catgacctta 4380tgggactttc ctacttggca gtacatctac gtattagtca
tcgctattac catggtgatg 4440cggttttggc agtacatcaa tgggcgtgga tagcggtttg
actcacgggg atttccaagt 4500ctccacccca ttgacgtcaa tgggagtttg ttttgcacca
aaatcaacgg gactttccaa 4560aatgtcgtaa caactccgcc ccattgacgc aaatgggcgg
taggcgtgta cggtgggagg 4620tctatataag cagagctctc tggctaacta gagaacccac
tgcttactgg cttatcgaaa 4680ttaatacgac tcactatagg gagacccaag ctggctagcg
tttaaactta agctgatcca 4740ctagtccagt gtggtggaat tcgcctagag atctggcggc
ggagagggca gaggaagtct 4800tctaacctgc ggtgacgtgg aggagaatcc cggccctagg
accatggact acaaagacca 4860tgacggtgat tataaagatc atgacatcga ttacaaggat
gacgatgaca agatggcccc 4920caagaagaag aggaaggtcg gcattcatgg ggtacccgcc
gctatggctg agaggccctt 4980ccagtgtcga atctgcatgc gtaacttcag tgaccagtcc
aacctgcgcg cccacatccg 5040cacccacacc ggcgagaagc cttttgcctg tgacatttgt
gggaggaaat ttgcccgcaa 5100gtccgaccgc atcaagcata ccaagataca cacgggcagc
caaaagccct tccagtgtcg 5160aatctgcatg cgtaagtttg cccgctccga caacctgtcc
gtgcatacca agatacacac 5220gggcgagaag cccttccagt gtcgaatctg catgcgtaac
ttcagtgagc gcggcaccct 5280ggcccgccac atccgcaccc acaccggcga gaagcctttt
gcctgtgaca tttgtgggag 5340gaaatttgcc cgctccgacg ccctgaccca gcataccaag
atacacctgc ggggatccca 5400gctggtgaag agcgagctgg aggagaagaa gtccgagctg
cggcacaagc tgaagtacgt 5460gccccacgag tacatcgagc tgatcgagat cgccaggaac
agcacccagg accgcatcct 5520ggagatgaag gtgatggagt tcttcatgaa ggtgtacggc
tacaggggaa agcacctggg 5580cggaagcaga aagcctgacg gcgccatcta tacagtgggc
agccccatcg attacggcgt 5640gatcgtggac acaaaggcct acagcggcgg ctacaatctg
cctatcggcc aggccgacga 5700gatgcagaga tacgtgaagg agaaccagac ccggaataag
cacatcaacc ccaacgagtg 5760gtggaaggtg taccctagca gcgtgaccga gttcaagttc
ctgttcgtga gcggccactt 5820caagggcaac tacaaggccc agctgaccag gctgaaccgc
aaaaccaact gcaatggcgc 5880cgtgctgagc gtggaggagc tgctgatcgg cggcgagatg
atcaaagccg gcaccctgac 5940actggaggag gtgcggcgca agttcaacaa cggcgagatc
aacttctgat aactcgagct 6000gtgccttcta gttgccagcc atctgttgtt tgcccctccc
ccgtgccttc cttgaccctg 6060gaaggtgcca ctcccactgt cctttcctaa taaaatgagg
aaattgcatc gcattgtctg 6120agtaggtgtc attctattct ggggggtggg gtggggcagg
acagcaaggg ggaggattgg 6180gaagacaata gcaggcatgc tggggatgcg gtgggctcta
tggcggaccg agcggccgca 6240ggaaccccta gtgatggagt tggccactcc ctctctgcgc
gctcgctcgc tcactgaggc 6300cgggcgacca aaggtcgccc gacgcccggg ctttgcccgg
gcggcctcag tgagcgagcg 6360agcgcgcagc tgcctgcagg aagctgtaag cttgtcgaga
agtactagag gatcataatc 6420agccatacca catttgtaga ggttttactt gctttaaaaa
acctcccaca cctccccctg 6480aacctgaaac ataaaatgaa tgcaattgtt gttgttaact
tgtttattgc agcttataat 6540ggttacaaat aaagcaatag catcacaaat ttcacaaata
aagcattttt ttcactgcat 6600tctagttgtg gtttgtccaa actcatcaat gtatcttatc
atgtctggat ctgatcactg 6660atatcgccta ggagatccga accagataag tgaaatctag
ttccaaacta ttttgtcatt 6720tttaattttc gtattagctt acgacgctac acccagttcc
catctatttt gtcactcttc 6780cctaaataat ccttaaaaac tccatttcca cccctcccag
ttcccaacta ttttgtccgc 6840ccacagcggg gcatttttct tcctgttatg tttttaatca
aacatcctgc caactccatg 6900tgacaaaccg tcatcttcgg ctactttttc tctgtcacag
aatgaaaatt tttctgtcat 6960ctcttcgtta ttaatgtttg taattgactg aatatcaacg
cttatttgca gcctgaatgg 7020cgaatg
7026
SEQ ID NO 34: 7065 nt, DNA, Artificial Sequence, Synthetic
gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc
60gctacacttg ccagcgccct agcgcccgct cctttcgctt tcttcccttc ctttctcgcc
120acgttcgccg gctttccccg tcaagctcta aatcgggggc tccctttagg gttccgattt
180agtgctttac ggcacctcga ccccaaaaaa cttgattagg gtgatggttc acgtagtggg
240ccatcgccct gatagacggt ttttcgccct ttgacgttgg agtccacgtt ctttaatagt
300ggactcttgt tccaaactgg aacaacactc aaccctatct cggtctattc ttttgattta
360taagggattt tgccgatttc ggcctattgg ttaaaaaatg agctgattta acaaaaattt
420aacgcgaatt ttaacaaaat attaacgttt acaatttcag gtggcacttt tcggggaaat
480gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta tccgctcatg
540agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat gagtattcaa
600catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt ttttgctcac
660ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg agtgggttac
720atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga agaacgtttt
780ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg tattgacgcc
840gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt tgagtactca
900ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg cagtgctgcc
960ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg aggaccgaag
1020gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga tcgttgggaa
1080ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc tgtagcaatg
1140gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc ccggcaacaa
1200ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc ggcccttccg
1260gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg cggtatcatt
1320gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac gacggggagt
1380caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc actgattaag
1440cattggtaac tgtcagacca agtttactca tatatacttt agattgattt aaaacttcat
1500ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac caaaatccct
1560taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa aggatcttct
1620tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca
1680gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt aactggcttc
1740agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg ccaccacttc
1800aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc agtggctgct
1860gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt accggataag
1920gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga gcgaacgacc
1980tacaccgaac tgagatacct acagcgtgag cattgagaaa gcgccacgct tcccgaaggg
2040agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg cacgagggag
2100cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca cctctgactt
2160gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac
2220gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt ctttcctgcg
2280ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga taccgctcgc
2340cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg
2400cggtattttc tccttacgca tctgtgcggt atttcacacc gcagaccagc cgcgtaacct
2460ggcaaaatcg gttacggttg agtaataaat ggatgccctg cgtaagcggg tgtgggcgga
2520caataaagtc ttaaactgaa caaaatagat ctaaactatg acaataaagt cttaaactag
2580acagaatagt tgtaaactga aatcagtcca gttatgctgt gaaaaagcat actggacttt
2640tgttatggct aaagcaaact cttcattttc tgaagtgcaa attgcccgtc gtattaaaga
2700ggggcgtggc caagggcatg gtaaagacta tattcgcggc gttgtgacaa tttaccgaac
2760aactccgcgg ccgggaagcc gatctcggct tgaacgaatt gttaggtggc ggtacttggg
2820tcgatatcaa agtgcatcac ttcttcccgt atgcccaact ttgtatagag agccactgcg
2880ggatcgtcac cgtaatctgc ttgcacgtag atcacataag caccaagcgc gttggcctca
2940tgcttgagga gattgatgag cgcggtggca atgccctgcc tccggtgctc gccggagact
3000gcgagatcat agatatagat ctcactacgc ggctgctcaa acctgggcag aacgtaagcc
3060gcgagagcgc caacaaccgc ttcttggtcg aaggcagcaa gcgcgatgaa tgtcttacta
3120cggagcaagt tcccgaggta atcggagtcc ggctgatgtt gggagtaggt ggctacgtct
3180ccgaactcac gaccgaaaag atcaagagca gcccgcatgg atttgacttg gtcagggccg
3240agcctacatg tgcgaatgat gcccatactt gagccaccta actttgtttt agggcgactg
3300ccctgctgcg taacatcgtt gctgctgcgt aacatcgttg ctgctccata acatcaaaca
3360tcgacccacg gcgtaacgcg cttgctgctt ggatgcccga ggcatagact gtacaaaaaa
3420acagtcataa caagccatga aaaccgccac tgcgccgtta ccaccgctgc gttcggtcaa
3480ggttctggac cagttgcgtg agcgcatacg ctacttgcat tacagtttac gaaccgaaca
3540ggcttatgtc aactgggttc gtgccttcat ccgtttccac ggtgtgcgtc acccggcaac
3600cttgggcagc agcgaagtcg aggcatttct gtcctggctg gcgaacgagc gcaaggtttc
3660ggtctccacg catcgtcagg cattggcggc cttgctgttc ttctacggca aggtgctgtg
3720cacggatctg ccctggcttc aggagatcgg aagacctcgg ccgtcgcggc gcttgccggt
3780ggtgctgacc ccggatgaag tggttcgcat cctcggtttt ctggaaggcg agcatcgttt
3840gttcgcccag gactctagct atagttctag tggttggcta cattattgaa gcatttatca
3900gggttattgt ctcagagcat gcctgcaggc agctgcgcgc tcgctcgctc actgaggccg
3960cccgggcgtc gggcgacctt tggtcgcccg gcctcagtga gcgagcgagc gcgcagagag
4020ggagtggcca actccatcac taggggttcc tgcggccgca cgcgtggagc tagttattaa
4080tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa
4140cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata
4200atgacgtatg ttcccatagt aacgtcaata gggactttcc attgacgtca atgggtggag
4260tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc
4320cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta
4380tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg
4440cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt
4500ctccacccca ttgacgtcaa tgggagtttg ttttgcacca aaatcaacgg gactttccaa
4560aatgtcgtaa caactccgcc ccattgacgc aaatgggcgg taggcgtgta cggtgggagg
4620tctatataag cagagctctc tggctaacta gagaacccac tgcttactgg cttatcgaaa
4680ttaatacgac tcactatagg gagacccaag ctggctagcg tttaaactta agctgatcca
4740ctagtccagt gtggtggaat tcgccatgga ctacaaagac catgacggtg attataaaga
4800tcatgacatc gattacaagg atgacgatga caagatggcc cccaagaaga agaggaaggt
4860cggcatccac ggggtacccg ccgctatggc tgagaggccc ttccagtgtc gaatctgcat
4920gcgtaacttc agtcagtcct ccgacctgtc ccgccacatc cgcacccaca ccggcgagaa
4980gccttttgcc tgtgacattt gtgggaggaa atttgcctgg cgctcctccc tgcgccagca
5040taccaagata cacacgcatc ccagggcacc tattcccaag cccttccagt gtcgaatctg
5100catgcgtaac ttcagtcagt ccggcgacct gacccgccac atccgcaccc acaccggcga
5160gaagcctttt gcctgtgaca tttgtgggag gaaatttgcc cgccgcgccg accgcgccaa
5220gcataccaag atacacacgc acccgcgcgc cccgatcccg aagcccttcc agtgtcgaat
5280ctgcatgcgt aacttcagtc gctccgacga cctgacccgc cacatccgca cccacaccgg
5340cgagaagcct tttgcctgtg acatttgtgg gaggaaattt gcccagcgct ccaccctgtc
5400ctcccatacc aagatacacc tgcggggatc ccagctggtg aagagcgagc tggaggagaa
5460gaagtccgag ctgcggcaca agctgaagta cgtgccccac gagtacatcg agctgatcga
5520gatcgccagg aacagcaccc aggaccgcat cctggagatg aaggtgatgg agttcttcat
5580gaaggtgtac ggctacaggg gaaagcacct gggcggaagc agaaagcctg acggcgccat
5640ctatacagtg ggcagcccca tcgattacgg cgtgatcgtg gacacaaagg cctacagcgg
5700cggctacaat ctgcctatcg gccaggccga cgagatggag agatacgtgg aggagaacca
5760gacccgggat aagcacctca accccaacga gtggtggaag gtgtacccta gcagcgtgac
5820cgagttcaag ttcctgttcg tgagcggcca cttcaagggc aactacaagg cccagctgac
5880caggctgaac cacatcacca actgcaatgg cgccgtgctg agcgtggagg agctgctgat
5940cggcggcgag atgatcaaag ccggcaccct gacactggag gaggtgcggc gcaagttcaa
6000caacggcgag atcaacttca gatcttgata actcgagctg tgccttctag ttgccagcca
6060tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac tcccactgtc
6120ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca ttctattctg
6180gggggtgggg tggggcagga cagcaagggg gaggattggg aagacaatag caggcatgct
6240ggggatgcgg tgggctctat ggcggaccga gcggccgcag gaacccctag tgatggagtt
6300ggccactccc tctctgcgcg ctcgctcgct cactgaggcc gggcgaccaa aggtcgcccg
6360acgcccgggc tttgcccggg cggcctcagt gagcgagcga gcgcgcagct gcctgcagga
6420agctgtaagc ttgtcgagaa gtactagagg atcataatca gccataccac atttgtagag
6480gttttacttg ctttaaaaaa cctcccacac ctccccctga acctgaaaca taaaatgaat
6540gcaattgttg ttgttaactt gtttattgca gcttataatg gttacaaata aagcaatagc
6600atcacaaatt tcacaaataa agcatttttt tcactgcatt ctagttgtgg tttgtccaaa
6660ctcatcaatg tatcttatca tgtctggatc tgatcactga tatcgcctag gagatccgaa
6720ccagataagt gaaatctagt tccaaactat tttgtcattt ttaattttcg tattagctta
6780cgacgctaca cccagttccc atctattttg tcactcttcc ctaaataatc cttaaaaact
6840ccatttccac ccctcccagt tcccaactat tttgtccgcc cacagcgggg catttttctt
6900cctgttatgt ttttaatcaa acatcctgcc aactccatgt gacaaaccgt catcttcggc
6960tactttttct ctgtcacaga atgaaaattt ttctgtcatc tcttcgttat taatgtttgt
7020aattgactga atatcaacgc ttatttgcag cctgaatggc gaatg
7065
SEQ ID NO 35: 420 aa, PRT, Artificial Sequence, Synthetic
Met Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp 16
Tyr Lys Asp Asp Asp Asp Lys Met Ala Pro Lys Lys Lys Arg Lys Val 32
Gly Ile His Gly Val Pro Ala Ala Met Ala Glu Arg Pro Phe Gln Cys 48
Arg Ile Cys Met Arg Asn Phe Ser Gln Ser Ser Asp Leu Ser Arg His 64
Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly 80
Arg Lys Phe Ala Trp Arg Ser Ser Leu Arg Gln His Thr Lys Ile His 96
Thr His Pro Arg Ala Pro Ile Pro Lys Pro Phe Gln Cys Arg Ile Cys 112
Met Arg Asn Phe Ser Gln Ser Gly Asp Leu Thr Arg His Ile Arg Thr 128
His Thr Gly Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe 144
Ala Arg Arg Ala Asp Arg Ala Lys His Thr Lys Ile His Thr His Pro 160
Arg Ala Pro Ile Pro Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn 176
Phe Ser Arg Ser Asp Asp Leu Thr Arg His Ile Arg Thr His Thr Gly 192
Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly Arg Lys Phe Ala Gln Arg 208
Ser Thr Leu Ser Ser His Thr Lys Ile His Leu Arg Gly Ser Gln Leu 224
Val Lys Ser Glu Leu Glu Glu Lys Lys Ser Glu Leu Arg His Lys Leu 240
Lys Tyr Val Pro His Glu Tyr Ile Glu Leu Ile Glu Ile Ala Arg Asn 256
Ser Thr Gln Asp Arg Ile Leu Glu Met Lys Val Met Glu Phe Phe Met 272
Lys Val Tyr Gly Tyr Arg Gly Lys His Leu Gly Gly Ser Arg Lys Pro 288
Asp Gly Ala Ile Tyr Thr Val Gly Ser Pro Ile Asp Tyr Gly Val Ile 304
Val Asp Thr Lys Ala Tyr Ser Gly Gly Tyr Asn Leu Pro Ile Gly Gln 320
Ala Asp Glu Met Glu Arg Tyr Val Glu Glu Asn Gln Thr Arg Asp Lys 336
His Leu Asn Pro Asn Glu Trp Trp Lys Val Tyr Pro Ser Ser Val Thr 352
Glu Phe Lys Phe Leu Phe Val Ser Gly His Phe Lys Gly Asn Tyr Lys 368
Ala Gln Leu Thr Arg Leu Asn His Ile Thr Asn Cys Asn Gly Ala Val 384
Leu Ser Val Glu Glu Leu Leu Ile Gly Gly Glu Met Ile Lys Ala Gly 400
Thr Leu Thr Leu Glu Glu Val Arg Arg Lys Phe Asn Asn Gly Glu Ile 416
Asn Phe Arg Ser 420
SEQ ID NO 36: 381 aa, PRT, Artificial Sequence, Synthetic
Met Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp 16
Tyr Lys Asp Asp Asp Asp Lys Met Ala Pro Lys Lys Lys Arg Lys Val 32
Gly Ile His Gly Val Pro Ala Ala Met Ala Glu Arg Pro Phe Gln Cys 48
Arg Ile Cys Met Arg Asn Phe Ser Asp Gln Ser Asn Leu Arg Ala His 64
Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala Cys Asp Ile Cys Gly 80
Arg Lys Phe Ala Arg Lys Ser Asp Arg Ile Lys His Thr Lys Ile His 96
Thr Gly Ser Gln Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Lys Phe 112
Ala Arg Ser Asp Asn Leu Ser Val His Thr Lys Ile His Thr Gly Glu 128
Lys Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Glu Arg Gly 144
Thr Leu Ala Arg His Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala 160
Cys Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Ala Leu Thr Gln 176
His Thr Lys Ile His Leu Arg Gly Ser Gln Leu Val Lys Ser Glu Leu 192
Glu Glu Lys Lys Ser Glu Leu Arg His Lys Leu Lys Tyr Val Pro His 208
Glu Tyr Ile Glu Leu Ile Glu Ile Ala Arg Asn Ser Thr Gln Asp Arg 224
Ile Leu Glu Met Lys Val Met Glu Phe Phe Met Lys Val Tyr Gly Tyr 240
Arg Gly Lys His Leu Gly Gly Ser Arg Lys Pro Asp Gly Ala Ile Tyr 256
Thr Val Gly Ser Pro Ile Asp Tyr Gly Val Ile Val Asp Thr Lys Ala 272
Tyr Ser Gly Gly Tyr Asn Leu Pro Ile Gly Gln Ala Asp Glu Met Gln 288
Arg Tyr Val Lys Glu Asn Gln Thr Arg Asn Lys His Ile Asn Pro Asn 304
Glu Trp Trp Lys Val Tyr Pro Ser Ser Val Thr Glu Phe Lys Phe Leu 320
Phe Val Ser Gly His Phe Lys Gly Asn Tyr Lys Ala Gln Leu Thr Arg 336
Leu Asn Arg Lys Thr Asn Cys Asn Gly Ala Val Leu Ser Val Glu Glu 352
Leu Leu Ile Gly Gly Glu Met Ile Lys Ala Gly Thr Leu Thr Leu Glu 368
Glu Val Arg Arg Lys Phe Asn Asn Gly Glu Ile Asn Phe 381