Patent application title: CONDITIONALLY REPLICATION-COMPETENT ADENOVIRUS
Inventors:
Hiroyuki Mizuguchi (Ibaraki-Shi, JP)
Fuminori Sakurai (Ibaraki-Shi, JP)
Assignees:
National Institute of Biomedical Innovation
IPC8 Class: AC12Q168FI
USPC Class:
435 5
Class name: Chemistry: molecular biology and microbiology measuring or testing process involving enzymes or micro-organisms; composition or test strip therefore; processes of forming such composition or test strip involving virus or bacteriophage
Publication date: 2014-07-17
Patent application number: 20140199688
Abstract:
The object of the present invention is to provide a novel conditionally
replicating adenovirus and a reagent comprising the same for cancer cell
detection or for cancer diagnosis.
The present invention provides a polynucleotide, which comprises human
telomerase reverse transcriptase (hTERT) promoter, E1A gene, IRES
sequence and E1B gene in this order and which comprises a target sequence
of a first miRNA. The present invention also provides a recombinant
adenovirus, which comprises a replication cassette comprising the above
polynucleotide, wherein the replication cassette is integrated into the
E1 region of the adenovirus genome.
Claims:
1. A polynucleotide, which comprises human telomerase reverse
transcriptase promoter, E1A gene, IRES sequence and E1B gene in this
order and which comprises a target sequence of a first microRNA.
2. The polynucleotide according to claim 1, wherein the first microRNA is expressed in non-cancer cells.
3. The polynucleotide according to claim 1 or 2, wherein the first microRNA is at least one selected from the group consisting of miR-142, miR-15, miR-16, miR-21, miR-126, miR-181, miR-223, miR-296, miR-125, miR-143, miR-145, miR-199 and let-7.
4. A recombinant adenovirus, which comprises a replication cassette comprising the polynucleotide according to claim 1, wherein the replication cassette is integrated into the E1 region of the adenovirus genome.
5. The recombinant adenovirus according to claim 4, which further comprises a labeling cassette comprising a reporter gene and a promoter capable of regulating the expression of the gene, wherein the labeling cassette is integrated into the E3 region of the adenovirus genome.
6. The recombinant adenovirus according to claim 5, wherein the labeling cassette further comprises a target sequence of a second microRNA.
7. The recombinant adenovirus according to claim 4, wherein a cell death-inducing cassette comprising a gene encoding a cell death induction-related protein and a promoter capable of regulating the expression of the gene is further integrated into the E3 region of the adenovirus genome.
8. The recombinant adenovirus according to claim 7, wherein the cell death-inducing cassette further comprises a target sequence of a second microRNA.
9. The recombinant adenovirus according to claim 6 or 8, wherein the second microRNA is expressed in non-cancer cells.
10. The recombinant adenovirus according to claim 9, wherein the second microRNA is at least one selected from the group consisting of miR-142, miR-15, miR-16, miR-21, miR-126, miR-181, miR-223, miR-296, miR-125, miR-143, miR-145, miR-199 and let-7.
11. The recombinant adenovirus according to claim 5 or 6, wherein the reporter gene is a gene encoding a protein which emits fluorescence or a gene encoding an enzyme protein which generates a luminophore or a chromophore upon enzymatic reaction.
12. The recombinant adenovirus according to claim 5, wherein the promoter is human telomerase reverse transcriptase promoter or cytomegalovirus promoter.
13. The recombinant adenovirus according to claim 4, which further comprises a gene encoding a CD46-binding fiber protein.
14. The recombinant adenovirus according to claim 13, wherein the CD46-binding fiber protein comprises at least the fiber knob region in the fiber protein of adenovirus type 34 or 35.
15. A reagent for cancer cell detection, which comprises the recombinant adenovirus according to claim 4.
16. A reagent for cancer diagnosis, which comprises the recombinant adenovirus according to claim 4.
17. The reagent according to claim 15, wherein the cancer cells are derived from a biological sample taken from a subject.
18. The reagent according to claim 17, wherein the biological sample is blood.
19. The reagent according to claim 15 or 18, wherein the cancer cells are circulating tumor cells.
20. The reagent according to claim 15, wherein the cancer cells are drug-resistant cancer cells.
21. The reagent according to claim 15, wherein the cancer cells are cancer stem cells.
22. The reagent according to claim 15, wherein the cancer cells are cancer cells having undergone epithelial-mesenchymal transition or mesenchymal-epithelial transition.
23. A method for cancer cell detection, which comprises contacting cancer cells with the recombinant adenovirus according to claim 11 and detecting the fluorescence or color produced by the cancer cells.
24. The method according to claim 23, wherein the cancer cells are derived from a biological sample taken from a subject.
25. The method according to claim 24, wherein the biological sample is blood.
26. The method according to claim 25, wherein the cancer cells are circulating tumor cells.
Description:
TECHNICAL FIELD
[0001] The present invention relates to a novel conditionally replicating adenovirus and a reagent comprising the same for cancer cell detection or for cancer diagnosis.
BACKGROUND ART
[0002] Techniques currently used for cancer diagnosis mainly include (i) those using large-sized testing instruments (e.g., MRI) and (ii) those for measuring tumor markers or the like in blood, and expectations are now focused on (ii) which are simple techniques with less burden on patients. In particular, cancer cells circulating in the peripheral blood of cancer patients (i.e., circulating tumor cells (CTCs)) show a close relationship with clinical symptoms because these cells increase the risk of systemic metastasis and because the prognosis of patients with CTCs is significantly poor. Thus, it has been expected to develop a technique for simple and highly sensitive detection of CTCs as a predictive factor or surrogate marker for prognosis.
[0003] Techniques used for CTC detection include detection with a cancer-related antigen such as EpCAM (epithelial cell adhesion molecule) or cytokeratin-8 (e.g., CellSearch system) and detection by means of RT-PCR, etc. However, these cancer-related antigens are also expressed on normal epithelial cells and hence are highly likely to cause false positive detection, while cell morphology characteristic of cancer cells cannot be observed at the same time in the case of PCR detection. For these reasons, there has been a demand for a new technique in terms of sensitivity, simplicity, accuracy and costs.
[0004] On the other hand, the inventors of the present invention have already developed a conditionally replicating adenovirus which grows specifically in cancer cells and expresses GFP (GFP-expressing conditionally replicating adenovirus: GFP-CRAd) (which is referred to as TelomeScan®, OBP-401 or Telomelysin-GFP) (Patent Document 1: WO2006/036004). Moreover, the inventors of the present invention have also developed a simple technique for CTC detection using this TelomeScan (Non-patent Document 1: Kojima T., et al, J. Clin. Invest., 119: 3172, 2009).
[0005] However, since TelomeScan has the fiber protein of adenovirus type 5 and infects via coxsackievirus and adenovirus receptor (CAR) in target cells, TelomeScan may not infect cells which do not express CAR. In particular, it is known that CAR expression is reduced in highly malignant cancer cells which are highly invasive, metastatic and proliferative (Non-patent Document 2: Okegawa T., et al, Cancer Res., 61: 6592-6600, 2001); and hence TelomeScan may not detect these highly malignant cancer cells. Moreover, although less likely, TelomeScan may give false positive results by infecting and growing in normal blood cells (e.g., leukocytes) to cause GFP expression.
[0006] For these reasons, there has been a demand for a reagent for cancer cell detection and a reagent for cancer diagnosis, each of which detects almost all cancer cells including CAR-negative ones and does not give any false positive results in normal blood cells.
PRIOR ART DOCUMENTS
Patent Documents
[0007] Patent Document 1: WO2006/036004
[0008] Non-patent Document 1: Kojima T., et al, J. Clin. Invest., 119: 3172, 2009
[0009] Non-patent Document 2: Okegawa T., et al, Cancer Res., 61: 6592-6600, 2001
SUMMARY OF THE INVENTION
Problem to be Solved by the Invention
[0010] The present invention has been made under these circumstances, and the problem to be solved by the present invention is to provide a reagent for cancer cell detection and a reagent for cancer diagnosis, each of which detects almost all cancer cells including CAR-negative ones and does not give any false positive results in blood cells, as well as to provide a conditionally replicating recombinant adenovirus which is useful as such a reagent.
Means to Solve the Problem
[0011] As a result of extensive and intensive efforts made to solve the above problem, the inventors of the present invention have found that not only CAR-positive cells, but also CAR-negative cells can be detected when the fiber of adenovirus type 5 in TelomeScan is replaced with another adenovirus fiber binding to CD46, which is highly expressed on almost all human cells, particularly cancer cells in general. Moreover, the inventors of the present invention have succeeded in avoiding any false positive results in blood cells by integration of a microRNA (miRNA)-mediated gene regulatory system into TelomeScan, which led to the completion of the present invention.
[0012] Namely, the present invention is as follows.
[0013] (1) A polynucleotide, which comprises human telomerase reverse transcriptase promoter, E1A gene, IRES sequence and E1B gene in this order and which comprises a target sequence of a first microRNA.
[0014] (2) The polynucleotide according to (1) above, wherein the first microRNA is expressed in non-cancer cells.
[0015] (3) The polynucleotide according to (1) or (2) above, wherein the first microRNA is at least one selected from the group consisting of miR-142, miR-15, miR-16, miR-21, miR-126, miR-181, miR-223, miR-296, miR-125, miR-143, miR-145, miR-199 and let-7.
[0016] (4) A recombinant adenovirus, which comprises a replication cassette comprising the polynucleotide according to any one of (1) to (3) above, wherein the replication cassette is integrated into the E1 region of the adenovirus genome.
[0017] (5) The recombinant adenovirus according to (4) above, which further comprises a labeling cassette comprising a reporter gene and a promoter capable of regulating the expression of the gene, wherein the labeling cassette is integrated into the E3 region of the adenovirus genome.
[0018] (6) The recombinant adenovirus according to (5) above, wherein the labeling cassette further comprises a target sequence of a second microRNA.
[0019] (7) The recombinant adenovirus according to (4) above, wherein a cell death-inducing cassette comprising a gene encoding a cell death induction-related protein and a promoter capable of regulating the expression of the gene is further integrated into the E3 region of the adenovirus genome.
[0020] (8) The recombinant adenovirus according to (7) above, wherein the cell death-inducing cassette further comprises a target sequence of a second microRNA.
[0021] (9) The recombinant adenovirus according to (6) or (8) above, wherein the second microRNA is expressed in non-cancer cells.
[0022] (10) The recombinant adenovirus according to (9) above, wherein the second microRNA is at least one selected from the group consisting of miR-142, miR-15, miR-16, miR-21, miR-126, miR-181, miR-223, miR-296, miR-125, miR-143, miR-145, miR-199 and let-7.
[0023] (11) The recombinant adenovirus according to (5) or (6) above, wherein the reporter gene is a gene encoding a protein which emits fluorescence or a gene encoding an enzyme protein which generates a luminophore or a chromophore upon enzymatic reaction.
[0024] (12) The recombinant adenovirus according to any one of (5) to (10) above, wherein the promoter is human telomerase reverse transcriptase promoter or cytomegalovirus promoter.
[0025] (13) The recombinant adenovirus according to any one of (4) to (12) above, which further comprises a gene encoding a CD46-binding fiber protein.
[0026] (14) The recombinant adenovirus according to (13) above, wherein the CD46-binding fiber protein comprises at least the fiber knob region in the fiber protein of adenovirus type 34 or 35.
[0027] (15) A reagent for cancer cell detection, which comprises the recombinant adenovirus according to any one of (4) to (14) above.
[0028] (16) A reagent for cancer diagnosis, which comprises the recombinant adenovirus according to any one of (4) to (14) above.
[0029] (17) The reagent according to (15) above, wherein the cancer cells are derived from a biological sample taken from a subject.
[0030] (18) The reagent according to (17) above, wherein the biological sample is blood.
[0031] (19) The reagent according to (15) or (18) above, wherein the cancer cells are circulating tumor cells.
[0032] (20) The reagent according to any one of (15) and (17) to (19) above, wherein the cancer cells are drug-resistant cancer cells.
[0033] (21) The reagent according to any one of (15) and (17) to (20) above, wherein the cancer cells are cancer stem cells.
[0034] (22) The reagent according to any one of (15) and (17) to (21) above, wherein the cancer cells are cancer cells having undergone epithelial-mesenchymal transition or mesenchymal-epithelial transition.
[0035] (23) A method for cancer cell detection, which comprises contacting cancer cells with the recombinant adenovirus according to (11) above and detecting the fluorescence or color produced by the cancer cells.
[0036] (24) The method according to (23) above, wherein the cancer cells are derived from a biological sample taken from a subject.
[0037] (25) The method according to (24) above, wherein the biological sample is blood.
[0038] (26) The method according to (25) above, wherein the cancer cells are circulating tumor cells.
Effects of the Invention
[0039] The present invention enables simple and highly sensitive detection of CAR-negative cancer cells without detection of normal blood cells (e.g., leukocytes).
BRIEF DESCRIPTION OF THE DRAWINGS
[0040] FIG. 1 is a schematic view showing an example of the structure of the recombinant adenovirus of the present invention.
[0041] FIG. 2 shows the results measured for activity of recombinant adenoviruses by flow cytometry.
[0042] FIG. 3 shows the results detected for H1299 cells contained in blood samples.
[0043] FIG. 4 shows the results detected for A549 cells contained in blood samples.
[0044] FIG. 5 shows the results measured for activity of the recombinant adenovirus of the present invention in various types of cancer cells.
[0045] FIG. 6 shows the results detected for cancer cells having undergone epithelial-mesenchymal transition (EMT).
[0046] FIG. 7 shows the results detected for cancer stem cells.
[0047] FIG. 8 shows the results detected for H1299 and T24 cells contained in blood samples by using a red fluorescent protein.
DESCRIPTION OF EMBODIMENTS
[0048] The present invention will be described in more detail below. The following embodiments are illustrated to describe the present invention, and it is not intended to limit the present invention only to these embodiments. The present invention can be implemented in various modes, without departing from the spirit of the present invention. Moreover, this specification incorporates the contents disclosed in the specification and drawings of Japanese Patent Application No. 2011-181414 (filed on Aug. 23, 2011), based on which the present application claims priority.
[0049] 1. Summary
[0050] TelomeScan (i.e., a conditionally replicating adenovirus comprising hTERT promoter, E1A gene, IRES sequence and E1B gene integrated in this order into the E1-deficient region of adenovirus type 5 and comprising cytomegalovirus (CMV) promoter and GFP integrated in this order into the E3-deficient region of adenovirus type 5), which has been previously developed by the inventors of the present invention, has problems in that: (i) TelomeScan may not detect highly malignant cancer cells where CAR expression is reduced; and (ii) TelomeScan may detect normal blood cells as false positive. As a result of extensive and intensive efforts made to solve these problems, the inventors of the present invention have found that highly malignant CAR-negative cancer cells can be detected when the fiber of adenovirus type 5 in TelomeScan is replaced with another adenovirus fiber binding to CD46, which is highly expressed on almost all human cells, particularly cancer cells in general. Moreover, the inventors of the present invention have also found that when a target sequence of miR-142-3p, which is miRNA, is integrated into each of the replication and labeling cassettes in TelomeScan, virus growth and labeling protein expression can be prevented in normal blood cells to thereby prevent the occurrence of false positive results in normal blood cells.
[0051] Namely, in a preferred embodiment of the present invention, the recombinant adenovirus of the present invention is a recombinant adenovirus, in which a replication cassette comprising hTERT promoter, E1A gene, IRES sequence, E1B gene and a target sequence of microRNA is integrated into the E1 region of the adenovirus genome and a labeling cassette comprising a reporter gene, a promoter capable of regulating the expression of the gene and a target sequence of microRNA is integrated into the E3 region of the adenovirus genome, and which comprises a gene encoding a CD46-binding adenovirus fiber protein (FIG. 1). This recombinant adenovirus has the following features.
(i) Because of comprising a gene encoding a CD46-binding adenovirus fiber protein, this recombinant adenovirus is able to infect almost all cells including CAR-negative cells.
(ii) Because of comprising hTERT promoter, this recombinant adenovirus grows specifically in hTERT-expressing cancer cells and also increases reporter gene expression upon growth, whereby the production of a labeling protein, a chromophore or the like can be increased to detectable levels.
(iii) Because of comprising a target sequence of miRNA, this recombinant adenovirus can prevent the occurrence of false positive results even when the virus infects normal cells having hTERT promoter activity, because expression of this miRNA prevents not only growth of the virus but also expression of the reporter gene. In particular, because of comprising a target sequence of miRNA which is expressed specifically in blood cells, this recombinant adenovirus can prevent the occurrence of false positive results even when the virus infects normal blood cells having hTERT promoter activity, because expression of this miRNA prevents not only growth of the virus in blood cells but also expression of the reporter gene.
[0052] The present invention has been completed on the basis of these findings.
[0053] 2. Recombinant Adenovirus
(1) Replication Cassette
[0054] The present invention relates to a polynucleotide, which comprises human telomerase reverse transcriptase (hTERT) promoter, E1A gene, IRES sequence and E1B gene in this order and which comprises a target sequence of microRNA. In addition, the present invention relates to a recombinant adenovirus, which comprises a replication cassette comprising the above polynucleotide, wherein the replication cassette is integrated into the E1 region of the adenovirus genome.
[0055] By the action of the above polynucleotide (or a replication cassette comprising the same), the recombinant adenovirus of the present invention can grow specifically in cancer cells and can also be prevented from growing in cells which express the desired miRNA. For example, if the target sequence of miRNA contained in the replication cassette of the present invention is a target sequence of miRNA which is expressed specifically in blood cells, the recombinant adenovirus of the present invention grows specifically in hTERT-expressing cancer cells and is prevented from growing in blood cells.
[0056] Human telomerase reverse transcriptase (hTERT) promoter is a promoter for reverse transcriptase which is an element of human telomerase. Although human telomerase activity will be increased by splicing of hTERT mRNA, post-translational modification of hTERT protein and other events, enhanced hTERT gene expression, i.e., increased hTERT promoter activity is thought to be the most important molecular mechanism. Human telomerase has been confirmed to show increased activity in 85% or more of human cancers, whereas it shows no activity in most normal cells. Thus, the use of hTERT promoter allows a gene downstream thereof to be expressed specifically in cancer cells. In the present invention, the hTERT promoter is located upstream of E1A gene, IRES sequence and E1B gene, whereby the virus can grow specifically in hTERT-expressing cancer cells.
[0057] hTERT has been confirmed to have many transcription factor binding sequences in a 1.4 kbp region upstream of its 5'-terminal end, and this region is regarded as hTERT promoter. In particular, a 181 bp sequence upstream of the translation initiation site is a core region important for expression of its downstream genes. In the present invention, although any sequence may be used as long as it includes this core region, an upstream sequence of approximately 378 bp which covers this core region in its entirety is preferred for use as the hTERT promoter. This sequence of approximately 378 bp has been confirmed to have the same efficiency of gene expression as the 181 bp core region alone. The nucleotide sequence of a 455 bp long hTERT promoter is shown in SEQ ID NO: 1.
[0058] In addition to the sequence shown in SEQ ID NO: 1, the nucleotide sequence of hTERT promoter includes the nucleotide sequences of polynucleotides which are hybridizable under stringent conditions with DNA consisting of a nucleotide sequence complementary to DNA consisting of SEQ ID NO: 1 and which have hTERT promoter activity. Such polynucleotides may be obtained from cDNA and genomic libraries by known hybridization techniques (e.g., colony hybridization, plaque hybridization, Southern blotting) using a polynucleotide which consists of the nucleotide sequence shown in SEQ ID NO: 1 or a fragment thereof as a probe.
[0059] For preparation of cDNA libraries, reference may be made to "Molecular Cloning, A Laboratory Manual 2nd ed." (Cold Spring Harbor Press (1989)). Alternatively, commercially available cDNA and genomic libraries may also be used for this purpose.
[0060] Stringent conditions in the above hybridization include, for example, conditions of 1×SSC to 2×SSC, 0.1% to 0.5% SDS and 42° C. to 68° C., more specifically prehybridization at 60° C. to 68° C. for 30 minutes or longer and the subsequent 4 to 6 washings in 2×SSC, 0.1% SDS at room temperature for 5 to 15 minutes.
[0061] As to detailed procedures for hybridization, reference may be made to "Molecular Cloning, A Laboratory Manual 2nd ed." (Cold Spring Harbor Press (1989); particularly Section 9.47-9.58), etc.
[0062] E1A and E1B genes are both included in the E1 gene of adenovirus. This E1 gene refers to one of the early genes among the virus early (E) and late (L) genes related to DNA replication, and it encodes a protein related to the regulation of viral genome transcription. E1A protein encoded by the E1A gene of adenovirus activates the transcription of a group of genes (e.g., E1B, E2, E4) required for infectious virus production. E1B protein encoded by the E1B gene of adenovirus assists late gene (L gene) mRNAs to accumulate into the cytoplasm of infected host cells and inhibits protein synthesis in the host cells, thereby facilitating virus replication. The nucleotide sequences of the E1A and E1B genes are shown in SEQ ID NO: 2 and SEQ ID NO: 3, respectively. In addition to the sequences shown in SEQ ID NO: 2 and SEQ ID NO: 3, the nucleotide sequences of the E1A and E1B genes include nucleotide sequences which are hybridizable under stringent conditions with DNA consisting of a nucleotide sequence complementary to DNA consisting of SEQ ID NO: 2 or SEQ ID NO: 3 and which encode a protein having E1A or E1B activity. Procedures and stringent conditions for hybridization are the same as those described above for the hTERT promoter.
[0063] IRES (internal ribosome entry site) sequence is a protein synthesis initiation signal specific to the picornavirus family and is considered to serve as a ribosomal binding site because of having a sequence complementary to the 3'-terminal end of 18S ribosomal RNA. It is known that translation of mRNAs derived from viruses of the picornavirus family is mediated by this sequence. The efficiency of translation from the IRES sequence is high and protein synthesis occurs even from the middle of mRNA in a manner not dependent on the cap structure. Thus, in the virus of the present invention, the E1A gene and the E1B gene, which is located downstream of the IRES sequence, are both translated independently by the action of hTERT promoter. With the use of the IRES sequence, hTERT promoter-mediated expression regulation occurs independently in both the E1A gene and the E1B gene, and hence virus growth can be more strictly limited to cells having telomerase activity when compared to the case where any one of the E1A gene or the E1B gene is regulated by the hTERT promoter. Moreover, the IRES sequence inserted between the E1A gene and the E1B gene can increase the growth capacity of the virus in host cells. The nucleotide sequence of the IRES sequence is shown in SEQ ID NO: 4. In addition to the sequence shown in SEQ ID NO: 4, the nucleotide sequence of the IRES sequence includes nucleotide sequences which are hybridizable under stringent conditions with DNA consisting of a nucleotide sequence complementary to DNA consisting of SEQ ID NO: 4 and which encode a protein having IRES activity. Procedures and stringent conditions for hybridization are the same as those described above for the hTERT promoter.
[0064] miRNA generally refers to short single-stranded RNA of approximately 15 to 25 nucleotides and is considered to regulate the translation of various genes upon binding to its target sequence present in mRNA. Thus, for example, when miRNA-expressing cells are infected with a recombinant adenovirus comprising a desired gene and a target sequence of the miRNA, the desired gene is prevented from being expressed in these cells. Such a target sequence of miRNA may be inserted into any site as long as a desired gene is prevented from being expressed, but it is preferably inserted into an untranslated region of the desired gene, more preferably downstream of the desired gene.
[0065] The target sequence of miRNA to be used in the present invention includes target sequences of miRNAs which are expressed in non-cancer cells. Non-cancer cells are intended to mean cells that are not malignant tumor cells, and examples include normal cells, benign tumor cells and so on. Normal cells include, for example, normal blood cells, normal endothelial cells, normal fibroblasts, normal stem cells and so on. On the other hand, circulating tumor cells are regarded as cells originating from malignant tumors, and hence they fall within malignant tumor cells in the present invention.
[0066] The target sequence of miRNA to be used in the present invention also includes target sequences of miRNAs which are expressed specifically in blood cells. In the present invention, "blood cells" may include not only normal blood cells, but also cancerous blood cells. Namely, in the present invention, "miRNA which is expressed specifically in blood cells" may be expressed specifically in normal blood cells or may be expressed specifically in both normal blood cells and cancerous blood cells. Even when expressed specifically in both normal blood cells and cancerous blood cells, miRNA can also reduce false positive cases of normal blood cells during detection of circulating tumor cells and thereby ensures accurate detection of circulating tumor cells released from solid cancers. In the present invention, "miRNA which is expressed specifically in blood cells" is more preferably miRNA which is expressed in normal blood cells but is not expressed in cancerous blood cells.
[0067] In the present invention, blood cells include, but are not limited to, leukocytes (i.e., neutrophils, eosinophils, basophils, lymphocytes (T cells and B cells), monocytes, dendritic cells), CD34-positive cells, hematopoietic cells, hematopoietic stem cells, hematopoietic progenitor cells, peripheral blood mononuclear cells (PBMCs) and so on. Likewise, cancerous blood cells include leukemia cells, lymphoma cells and so on. In the present invention, being "expressed specifically" in certain cells is intended to mean not only that expression is limited only to the intended cells, but also that expression levels are higher in the intended cells than in other cells. For example, being "expressed specifically in blood cells" is intended to mean not only that expression is limited only to blood cells, but also that expression levels are higher in blood cells than in any cells other than blood cells.
[0068] miRNA which is expressed specifically in blood cells includes, for example, miR-142, miR-15, miR-16, miR-21, miR-126, miR-181, miR-223, miR-296 and so on, with miR-142, miR-15 and miR-16 being preferred.
[0069] Although miRNA is single-stranded RNA, it is possible to use a target sequence of either strand of premature double-stranded RNA as long as a desired gene can be prevented from being expressed. For example, there are miR-142-3p and miR-142-5p for miR-142, and a target sequence of either miRNA may be used in the present invention. Namely, in the present invention, "miR-142" includes both miR-142-3p and miR-142-5p, with miR-142-3p being preferred. Likewise, in the present invention, "miR-15" includes the sense strand (referred to as "miR-15S") and antisense strand (referred to as "miR-15AS") of premature double-stranded RNA. The same applies to other miRNAs.
[0070] miR-142-3p gene is located at a site where translocation occurs in B cell leukemia (aggressive B cell leukemia), and is known to be expressed in hematopoietic tissues (e.g., bone marrow, spleen, thymus), but not expressed in other tissues. Moreover, miR-142-3p has been observed to be expressed in mouse fetal liver (fetal hematopoietic tissue) and hence is considered to be involved in differentiation of the hematopoietic system (Chang-Zheng Chen, et al., Science, 2004).
[0071] In this embodiment, gene expression is regulated in two stages in a selective manner, because specific gene expression is caused in cancer cells by the action of hTERT promoter and gene expression in blood cells is regulated by the action of miRNA.
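The two-stage regulation described in this embodiment can be sketched as a simple boolean model. This is an illustrative simplification only: actual expression is quantitative rather than on/off, and the `Cell` class, its field names and the example cell types are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical, simplified description of an infected cell."""
    name: str
    htert_promoter_active: bool   # stage 1: does the hTERT promoter drive transcription?
    expresses_target_mirna: bool  # stage 2: is the miRNA matching the target sequence present?


def cassette_expressed(cell: Cell) -> bool:
    """Return True only if the promoter is active AND the miRNA does not block expression."""
    return cell.htert_promoter_active and not cell.expresses_target_mirna


# Illustrative cell types (assumed values, not measured data).
cells = [
    Cell("hTERT-expressing solid cancer cell", True, False),
    Cell("normal blood cell with hTERT promoter activity", True, True),
    Cell("normal cell without hTERT promoter activity", False, False),
]

for cell in cells:
    print(f"{cell.name}: expressed={cassette_expressed(cell)}")
```

In this simplified model, only a cell that passes both stages supports expression from the cassette, which corresponds to the intended suppression of false positive results in blood cells.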
[0072] In another embodiment, the target sequence of miRNA to be used in the present invention includes a target sequence of miRNA whose expression is suppressed in cancer cells. miRNA whose expression is suppressed in cancer cells includes, for example, miR-125, miR-143, miR-145, miR-199, let-7 and so on. In this embodiment, specific gene expression in cancer cells is doubly regulated by the action of hTERT promoter and miRNA.
[0073] Although miRNA molecules were initially found in nematodes, yeast and other organisms, several hundred miRNAs have now been identified in humans and mice. The sequences of these miRNAs are known, and sequence information and so on can be obtained from public databases (e.g., the miRBase sequence database (http://microrna.sanger.ac.uk/sequences/index.shtml, http://www.mirbase.org/)).
[0074] The sequences of miR-142, miRNA-15, miRNA-16, miR-21, miR-126, miR-181, miR-223, miR-296, miR-125, miR-143, miR-145, miR-199 and let-7 are shown below.
TABLE-US-00001
(SEQ ID NO: 5) miR-142-3p: 5'-UGUAGUGUUUCCUACUUUAUGGA
(SEQ ID NO: 6) miR-142-5p: 5'-CAUAAAGUAGAAAGCACUACU
(SEQ ID NO: 7) miR-15S: 5'-UAGCAGCACAUAAUGGUUUGUG
(SEQ ID NO: 8) miR-15AS: 5'-CAGGCCAUAUUGUGCUGCCUCA
(SEQ ID NO: 9) miR-16S: 5'-UAGCAGCACGUAAAUAUUGGCG
(SEQ ID NO: 10) miR-16AS: 5'-CCAGUAUUAACUGUGCUGCUGA
(SEQ ID NO: 11) miR-21S: 5'-UAGCUUAUCAGACUGAUGUUGA
(SEQ ID NO: 12) miR-21AS: 5'-CAACACCAGUCGAUGGGCUGU
(SEQ ID NO: 13) miR-126S: 5'-UCGUACCGUGAGUAAUAAUGCG
(SEQ ID NO: 14) miR-126AS: 5'-CAUUAUUACUUUUGGUACGCG
(SEQ ID NO: 15) miR-181: 5'-AACAUUCAACGCUGUCGGUGAGU
(SEQ ID NO: 16) miR-223S: 5'-UGUCAGUUUGUCAAAUACCCCA
(SEQ ID NO: 17) miR-223AS: 5'-CGUGUAUUUGACAAGCUGAGUU
(SEQ ID NO: 18) miR-296-3p: 5'-GAGGGUUGGGUGGAGGCUCUCC
(SEQ ID NO: 19) miR-296-5p: 5'-AGGGCCCCCCCUCAAUCCUGU
(SEQ ID NO: 20) miR-125: 5'-UCCCUGAGACCCUUUAACCUGUGA
(SEQ ID NO: 21) miR-143S: 5'-UGAGAUGAAGCACUGUAGCUC
(SEQ ID NO: 22) miR-143AS: 5'-GGUGCAGUGCUGCAUCUCUGGU
(SEQ ID NO: 23) miR-145S: 5'-GUCCAGUUUUCCCAGGAAUCCCU
(SEQ ID NO: 24) miR-145AS: 5'-GGAUUCCUGGAAAUACUGUUCU
(SEQ ID NO: 25) miR-199: 5'-CCCAGUGUUCAGACUACCUGUUC
(SEQ ID NO: 26) let-7: 5'-UGAGGUAGUAGGUUGUAUAGUU
[0075] In the present invention, a single unit of a target sequence of miRNA is composed of a sequence complementary to the whole or part of the miRNA, and has a nucleotide length of 7 to 30 nucleotides, preferably 19 to 25 nucleotides, more preferably 21 to 23 nucleotides. In the present invention, a single unit of a target sequence of miRNA is intended to mean a nucleotide sequence having the minimum length required for serving as a target of certain miRNA. More specifically, it is intended to mean an oligonucleotide of at least 7 nucleotides in length selected from complementary sequences of the nucleotide sequences shown in SEQ ID NOs: 5 to 26, and such an oligonucleotide may comprise substitution, deletion, addition or removal of one or several nucleotides at any site(s).
[0076] The target sequence as a whole to be integrated into the polynucleotide or recombinant adenovirus of the present invention may comprise several copies of a single unit of target sequence in order to ensure effective interaction between miRNA and the target sequence. The target sequence as a whole to be integrated into the recombinant adenovirus may be of any length as long as it can be integrated into the viral genome. For example, it may comprise 1 to 10 copies, preferably 2 to 6 copies, and more preferably 2 or 4 copies of a single unit of target sequence (John G. Doench, et al., Genes Dev. 2003 17:438-442). An oligonucleotide of appropriate length may be inserted between single units of target sequence contained in the target sequence as a whole. The length of such an oligonucleotide of appropriate length is not limited in any way as long as the target sequence as a whole can be integrated into the recombinant adenovirus genome. For example, such an oligonucleotide may be of 0 to 8 nucleotides in length. Moreover, in the case of comprising several units of a target sequence of miRNA, the target sequences in the respective units may be those toward the same miRNA or those toward different miRNAs. Furthermore, in the case of comprising target sequences toward the same miRNA, the target sequences in the respective units may have different lengths and/or different nucleotide sequences.
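The arrangement described above, in which single units of target sequence are joined with optional short oligonucleotides between them, can be sketched as follows. This is only an illustrative sketch: the function name build_tandem_target is hypothetical, and the numeric limits it enforces (1 to 10 units, units of 7 to 30 nucleotides, spacers of 0 to 8 nucleotides) are the example ranges given in the text, not strict requirements of the invention.

```python
from typing import Sequence


def build_tandem_target(units: Sequence[str], spacers: Sequence[str]) -> str:
    """Join single units of target sequence, placing one spacer between adjacent units.

    Hypothetical helper. The units may target the same miRNA or different
    miRNAs, and may differ in length and sequence, as the text allows.
    """
    if not 1 <= len(units) <= 10:
        raise ValueError("expected 1 to 10 single units (example range from the text)")
    if len(spacers) != len(units) - 1:
        raise ValueError("need exactly one spacer between each pair of adjacent units")
    for unit in units:
        if not 7 <= len(unit) <= 30:
            raise ValueError("each single unit should be 7 to 30 nucleotides long")
    for spacer in spacers:
        if len(spacer) > 8:
            raise ValueError("each spacer should be 0 to 8 nucleotides long")

    # Interleave: unit, spacer, unit, spacer, ..., unit
    parts = [units[0]]
    for spacer, unit in zip(spacers, units[1:]):
        parts.append(spacer)
        parts.append(unit)
    return "".join(parts)
```

For example, build_tandem_target([unit, unit], ["cagc"]) with unit = "tccataaagtaggaaacactaca" (the DNA sequence complementary to miR-142-3p) yields a two-unit target separated by a four-nucleotide spacer. An empty string may be passed as a spacer to join two units directly.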
[0077] The target sequence of miRNA to be contained in the polynucleotide of the present invention (or a replication cassette comprising the same) can also be referred to as a "target sequence of a first microRNA" in order that the polynucleotide, when integrated into the recombinant adenovirus, should be distinguished from other miRNA target sequences present in the recombinant adenovirus.
[0078] When miR-142-3p is used as miRNA in the present invention, a target sequence thereof may be exemplified by sequences comprising the following sequences, by way of example.
TABLE-US-00002
(i) Sequence comprising two units of a target sequence of miR-142-3p (SEQ ID NO: 27, each underline represents a single unit of a target sequence of miR-142-3p):
5'-gcggcctccataaagtaggaaacactacacagctccataaagtaggaaacactacattataagcggtac
(ii) Sequence comprising four units of a target sequence of miR-142-3p (SEQ ID NO: 28, each underline represents a single unit of a target sequence of miR-142-3p):
5'-ggcctccataaagtaggaaacactacacagctccataaagtaggaaacactacattaattccataaagtaggaaacactacaccactccataaagtaggaaacactacagtac
[0079] In the present invention, a target sequence of miRNA is placed downstream of the construct of hTERT promoter-E1A gene-IRES sequence-E1B gene, and the resulting polynucleotide comprising the hTERT promoter, the E1A gene, the IRES sequence, the E1B gene and the target sequence of miRNA in this order (which polynucleotide is referred to as a replication cassette) is integrated into the adenovirus genome, whereby E1 gene expression and virus growth can be prevented in cells expressing the miRNA.
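The element order of the replication cassette described above can be expressed as a simple ordering check. This is a minimal sketch for illustration only: the element labels and the function name is_replication_cassette are hypothetical, and the check models only the stated 5' to 3' order, not any sequence-level detail.

```python
from typing import Iterable

# Required elements in the 5' -> 3' order stated in the text (labels are illustrative).
REPLICATION_CASSETTE_ORDER = ("hTERT promoter", "E1A", "IRES", "E1B", "miRNA target")


def is_replication_cassette(elements: Iterable[str]) -> bool:
    """Return True if the required elements occur in the stated order.

    Other elements may lie between them; only the relative order of the
    five required elements is checked.
    """
    remaining = iter(elements)
    # "x in iterator" consumes items up to and including the first match,
    # so successive lookups enforce the left-to-right order.
    return all(required in remaining for required in REPLICATION_CASSETTE_ORDER)
```

A construct listed as hTERT promoter, E1A, IRES, E1B, miRNA target passes the check, whereas one in which E1A and E1B are exchanged does not.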
[0080] In the present invention, a target sequence of miRNA is integrated downstream of the E1B gene or the reporter gene described later, whereby a gene located upstream thereof is prevented from being expressed. Although the details of this mechanism are not clear, a possible mechanism is as follows. First, miRNA-RISC (RNA-induced silencing complex) cleaves a target sequence on mRNA to thereby remove polyA from the mRNA. This would reduce the stability of the mRNA to cause degradation of the mRNA and hence prevention of gene expression. Alternatively, miRNA-RISC would recruit polyA ribonuclease, as in the case of normal miRNA, to cause polyA degradation, as a result of which the stability of mRNA would be reduced and gene expression would be prevented.
[0081] It should be noted that there are previous reports showing that the miRNA-induced inhibitory effect against gene expression was not obtained for the expression (translation) of a gene inserted downstream of the IRES sequence (Ramesh S. Pillai et al., Science 309, 1573(2005); Geraldine Mathonnet, et al., Science 317, 1764 (2007)). However, when the inventors of the present invention confirmed gene expression for the recombinant adenovirus of the present invention comprising hTERT promoter, E1A gene, IRES sequence, E1B gene and a target sequence of miRNA in this order, the miRNA was found to sufficiently prevent the expression of the E1B gene inserted downstream of the IRES sequence. This is a new finding in the present invention.
[0082] The genes to be contained in the replication cassette of the present invention can be obtained by standard genetic engineering techniques. For example, it is possible to use nucleic acid synthesis with a DNA synthesizer, which is commonly used as a genetic engineering technique. Alternatively, it is also possible to use PCR techniques in which gene sequences serving as templates are isolated or synthesized, and primers specific to each gene are then designed to amplify the gene sequence with a PCR system (Current Protocols in Molecular Biology, John Wiley & Sons (1987) Section 6.1-6.4) or gene amplification techniques using a cloning vector. The above techniques can be easily accomplished by those skilled in the art in accordance with Molecular cloning 2nd Edt. Cold Spring Harbor Laboratory Press (1989), etc. For purification of the resulting PCR product, known techniques can be used. If necessary, conventionally used sequencing techniques may be used to confirm whether the intended gene has been obtained, as expected. For example, dideoxynucleotide chain termination sequencing (Sanger et al. (1977) Proc. Natl. Acad. Sci. USA 74: 5463) or the like may be used for this purpose. Alternatively, an appropriate DNA sequencer (e.g., ABI PRISM (Applied Biosystems)) may also be used for sequence analysis.
[0083] In the present invention, the target sequence of miRNA can be obtained by being designed and synthesized such that each single unit of target sequence is complementary to the whole or part of the nucleotide sequence of the miRNA. For example, a target sequence of miR-142-3p can be obtained by synthesizing DNA such that it is complementary to the nucleotide sequence of miR-142-3p.
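The design step described above, in which a single unit of target sequence is made complementary to the miRNA, can be sketched as a reverse-complement calculation. This is a minimal sketch under stated assumptions: the function name target_unit_for is hypothetical, the unit is assumed to be complementary to the whole miRNA rather than to part of it, and the output is written as lowercase DNA to match the notation of SEQ ID NOs: 27 and 28.

```python
# RNA base -> complementary DNA base (lowercase, matching SEQ ID NOs: 27 and 28)
_COMPLEMENT = {"A": "t", "C": "g", "G": "c", "U": "a"}


def target_unit_for(mirna_5to3: str) -> str:
    """Return the DNA reverse complement (written 5'->3') of an miRNA given 5'->3'."""
    rna = mirna_5to3.upper()
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(rna))
    except KeyError as exc:
        raise ValueError(f"unexpected base in miRNA sequence: {exc.args[0]!r}") from None


MIR_142_3P = "UGUAGUGUUUCCUACUUUAUGGA"  # SEQ ID NO: 5
SEQ_ID_NO_27 = "gcggcctccataaagtaggaaacactacacagctccataaagtaggaaacactacattataagcggtac"

unit = target_unit_for(MIR_142_3P)
# Number of non-overlapping copies of the unit within SEQ ID NO: 27
copies_in_seq_27 = SEQ_ID_NO_27.count(unit)
```

Applied to miR-142-3p (SEQ ID NO: 5), the function gives tccataaagtaggaaacactaca, which is the 23-nucleotide sequence that occurs twice in SEQ ID NO: 27.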
[0084] Then, the respective genes obtained as above are ligated in a given order. First, the above genes are each cleaved with known restriction enzymes or the like, and the cleaved DNA fragment of each gene is inserted into and ligated to a known vector in accordance with known procedures. As a known vector, pIRES vector may be used, by way of example. The pIRES vector comprises the IRES (internal ribosome entry site) sequence of encephalomyocarditis virus (EMCV) and is capable of translating two open reading frames (ORFs) from one mRNA. With the use of the pIRES vector, it is possible to prepare a "polynucleotide which comprises hTERT promoter, E1A gene, IRES sequence and E1B gene in this order and which comprises a target sequence of microRNA" by sequentially inserting the required genes into a multicloning site. Such a target sequence of miRNA may be inserted into any site, but it is preferably inserted downstream of the hTERT promoter-E1A-IRES-E1B construct. For DNA ligation, DNA ligase may be used. Alternatively, CMV promoter contained in a known vector (e.g., pShuttle) may be removed with known restriction enzymes and a sequence cleaved from the hTERT promoter-E1A-IRES-E1B-miRNA target sequence with appropriate restriction enzymes may then be inserted into this site, if necessary. Once the E1 gene required for adenovirus growth is allowed to be expressed under the control of the hTERT promoter, the virus can be grown specifically in cancer cells.
[0085] (2) Labeling Cassette
[0086] In yet another embodiment, the present invention relates to a recombinant adenovirus in which the above replication cassette is integrated into the E1 region of the adenovirus genome and a labeling cassette is further integrated into the E3 region of the adenovirus genome. Such a labeling cassette comprises a reporter gene and a promoter capable of regulating the expression of the gene, and may further comprise a target sequence of miRNA.
[0087] The adenovirus E3 region contains 11.6 kDa ADP (adenovirus death protein), and ADP has the function of promoting cell damage and virus diffusion. The recombinant adenovirus of the present invention is designed to eliminate any viral genome region like the E3 region containing ADP, which encodes a protein having the function of promoting cell damage and virus diffusion, so that the timing of cell death is delayed to facilitate identification of cancer tissues by production (emission, expression) of fluorescence (e.g., GFP). This is also effective in that circulating tumor cells (CTCs) described later can be detected alive over a long period of time.
[0088] The reporter gene to be contained in the labeling cassette in the recombinant adenovirus of the present invention is not limited in any way, and examples include a gene encoding a protein which emits fluorescence, a gene encoding an enzyme protein which generates a luminophore or a chromophore upon enzymatic reaction, a gene encoding an antibiotic, a gene encoding a tag-fused protein, a gene encoding a protein which is expressed on the cell surface and binds to a specific antibody, a gene encoding a membrane transport protein, and so on. Examples of a protein which emits fluorescence (i.e., a labeling protein) include a green fluorescent protein (GFP) derived from luminous jellyfish such as Aequorea victorea, its variants EGFP (enhanced-humanized GFP) and rsGFP (red-shift GFP), a yellow fluorescent protein (YFP), a cyan fluorescent protein (CFP), a blue fluorescent protein (BFP), GFP derived from Renilla reniformis and so on, and genes encoding these proteins can be used in the present invention. The above protein which emits fluorescence is preferably GFP or EGFP.
[0089] Likewise, examples of an enzyme protein which generates a luminophore or a chromophore upon enzymatic reaction include β-galactosidase, luciferase and so on. β-Galactosidase generates a blue chromophore from 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal) upon enzymatic reaction. On the other hand, luciferase generates a luminophore upon enzymatic reaction with luciferin. Firefly luciferase, bacterial luciferase, Renilla luciferase and so on are known as members of luciferase, and those skilled in the art would be able to select an appropriate enzyme from known luciferase members.
[0090] Moreover, the promoter capable of regulating the expression of the above gene is not limited in any way as long as it is a suitable promoter compatible with the virus used for the expression of the above desired gene. Examples include, but are not limited to, CMV promoter, hTERT promoter, SV40 late promoter, MMTV LTR promoter, RSV LTR promoter, SRα promoter, β-actin promoter, PGK promoter, EF-1α promoter and so on. Preferably, CMV promoter or hTERT promoter can be used for this purpose.
[0091] The target sequence of miRNA to be integrated into the labeling cassette may be the same as or different from the target sequence of miRNA to be integrated into the replication cassette.
[0092] In the present invention, the target sequence of miRNA is placed within the untranslated region of the reporter gene, preferably downstream of this gene, whereby the reporter gene can be prevented from being expressed. Namely, in the present invention, the labeling cassette preferably comprises a promoter capable of regulating the reporter gene, the reporter gene and the target sequence of microRNA in this order. The target sequence of miRNA to be integrated into the labeling cassette is referred to as a "target sequence of a second microRNA" in order that it should be distinguished from the target sequence of miRNA to be contained in the replication cassette. Other explanations on miRNA are the same as described above.
[0093] Details on how to obtain, purify and sequence the recombinant genes to be contained in the labeling cassette of the present invention are the same as described above for the replication cassette.
[0094] (3) Cell Death-Inducing Cassette
[0095] In yet another embodiment, the present invention relates to a recombinant adenovirus in which the above replication cassette is integrated into the E1 region of the adenovirus genome and a cell death-inducing cassette is integrated into the E3 region of the adenovirus genome. Such a cell death-inducing cassette comprises a gene encoding a cell death induction-related protein and a promoter capable of regulating the expression of the gene, and may further comprise a target sequence of microRNA.
[0096] The cell death-inducing cassette used in the recombinant adenovirus of the present invention comprises a gene encoding a cell death induction-related protein and a promoter capable of regulating the expression of the gene. Thus, for example, when the recombinant adenovirus of the present invention is infected into cancer cells, the virus grows specifically in the cancer cells to thereby increase the intracellular expression level of the cell death induction-related protein and induce cell death only in the cancer cells without damaging other normal cells.
[0097] Such a gene encoding a cell death induction-related protein is intended to mean a gene encoding a protein related to the induction of cell death in specific cells. Examples of a cell death induction-related protein include immunological proteins such as PA28. PA28 is a protein which activates intracellular proteasomes and which elicits immune reactions and also induces cell death when overexpressed. Moreover, TRAIL can also be exemplified as an apoptosis-inducing protein. TRAIL refers to a molecule which induces apoptotic cell death upon binding to its receptor on the cell surface.
[0098] Moreover, another example of the gene encoding a cell death induction-related protein is a tumor suppressor gene, which has the function of suppressing the growth of cancer cells. Examples of such a tumor suppressor gene include the following genes used in conventional gene therapy. SEQ ID NO (nucleotide sequence) and GenBank Accession No. are shown below for each gene.
[0099] p53 (SEQ ID NO: 29; Accession No. M14694): multiple types of cancer
[0100] p15 (SEQ ID NO: 30; Accession No. L36844): multiple types of cancer
[0101] p16 (SEQ ID NO: 31; Accession No. L27211): multiple types of cancer
[0102] APC (SEQ ID NO: 32; Accession No. M74088): colorectal cancer, gastric cancer, pancreatic cancer
[0103] BRCA-1 (SEQ ID NO: 33; Accession No. U14680): ovarian cancer, breast cancer
[0104] DPC-4 (SEQ ID NO: 34; Accession No. U44378): colorectal cancer, pancreatic cancer
[0105] FHIT (SEQ ID NO: 35; Accession No. NM_112012): gastric cancer, lung cancer, uterine cancer
[0106] p73 (SEQ ID NO: 36; Accession No. Y11416): neuroblastoma
[0107] PATCHED (SEQ ID NO: 37; Accession No. U59464): basal cell carcinoma
[0108] Rbp110 (SEQ ID NO: 38; Accession No. M15400): lung cancer, osteosarcoma
[0109] DCC (SEQ ID NO: 39; Accession No. X76132): colorectal cancer
[0110] NF1 (SEQ ID NO: 40; Accession No. NM_000267): neurofibroma type 1
[0111] NF2 (SEQ ID NO: 41; Accession No. L11353): neurofibroma type 2
[0112] WT-1 (SEQ ID NO: 42; Accession No. NM_000378): Wilms tumor
[0113] The target sequence of miRNA to be contained in the cell death-inducing cassette may be the same as or different from the target sequence of miRNA to be integrated into the replication cassette. In the present invention, the target sequence of miRNA is placed within the untranslated region of the gene encoding a cell death induction-related protein, preferably downstream of this gene, whereby the cell death induction-related protein can be prevented from being expressed. Namely, in the present invention, the cell death-inducing cassette preferably comprises a promoter capable of regulating the gene encoding a cell death induction-related protein, the gene encoding a cell death induction-related protein and the target sequence of microRNA in this order. Other explanations on miRNA are the same as described above.
[0114] Details on how to obtain, purify and sequence the recombinant genes to be contained in the cell death-inducing cassette of the present invention are the same as described above for the replication cassette.
[0115] To determine whether or not cell death has been induced, morphological observation described below may be conducted for this purpose. Namely, once cells adhered onto the bottom surface of a culture vessel have been infected with the recombinant virus of the present invention and incubated for a given period, the cells will be rounded and detached from the bottom surface and then will float as shiny cells in the culture solution, as observed under an inverted microscope. At this stage, the cells have lost their vital mechanism and hence a determination can be made that cell death has been induced. Alternatively, cell death can also be confirmed with a commercially available kit for living cell assay which uses a tetrazolium salt (e.g., MTT, XTT).
[0116] (4) CD46-Binding Fiber Protein
[0117] In yet another embodiment, the recombinant adenovirus of the present invention may comprise a gene encoding a CD46-binding adenovirus fiber protein.
[0118] Adenovirus vectors which are now commonly used are prepared structurally based on adenovirus type 5 (or type 2) belonging to Subgroup C among 51 serotypes of human adenovirus. Although adenovirus type 5 is widely used because of its excellent gene transfer properties, adenovirus of this type has difficulty infecting cells with low expression of coxsackievirus and adenovirus receptor (CAR), because its infection is mediated by binding to CAR on target cells. In particular, CAR expression is reduced in highly malignant cancer cells which are highly invasive, metastatic and proliferative, and hence an adenovirus having the fiber protein of adenovirus type 5 may not infect such highly malignant cancer cells.
[0119] In contrast, CD46 is expressed on almost all cells except for erythrocytes in humans and is also expressed on highly malignant cancer cells. Thus, a recombinant adenovirus comprising a gene encoding a CD46-binding adenovirus fiber protein can also infect CAR-negative and highly malignant cancer cells. For example, adenovirus types 34 and 35 bind to CD46 as their receptor and thereby infect cells (Marko Marttila, et al., J. Virol. 2005, 79(22):14429-36). As described above, CD46 is expressed on almost all cells except for erythrocytes in humans, and hence adenovirus types 34 and 35 are able to infect a wide range of cells including CAR-negative cells. Moreover, the fiber of adenovirus consists of a knob region, a shaft region and a tail region, and adenovirus infects cells through binding of its fiber knob region to the receptor. Thus, when at least the fiber knob region in the fiber protein is changed from that of adenovirus type 5 to that of adenovirus type 34 or 35, the virus will be able to infect CAR-negative cells via CD46.
[0120] Because of comprising a gene encoding a CD46-binding adenovirus fiber protein, the recombinant adenovirus of the present invention is able to infect almost all cells except for erythrocytes and thus able to infect highly malignant CAR-negative cancer cells which are highly invasive, metastatic and proliferative. In the present invention, "CAR-negative" cells are intended to mean cells where CAR expression is low or cells where CAR is not expressed at all.
[0121] 57 serotypes have now been identified for human adenovirus, and these serotypes are classified into six groups, i.e., Groups A to F. Among them, adenovirus types belonging to Group B have been reported to bind to CD46. Adenovirus types belonging to Group B include adenovirus types 34 and 35, as well as adenovirus types 3, 7, 11, 16, 21 and 50, by way of example.
[0122] For use as a CD46-binding adenovirus fiber protein in the present invention, preferred is the fiber protein of adenovirus belonging to Group B, more preferred is the fiber protein of adenovirus type 3, 7, 34, 35, 11, 16, 21 or 50, and even more preferred is the fiber protein of adenovirus type 34 or 35.
[0123] The nucleotide sequence of a gene encoding the fiber protein of adenovirus type 34, 35, 3, 7, 11, 16, 21 or 50 is available from a known gene information database, e.g., the GenBank of NCBI (The National Center for Biotechnology Information). Moreover, in the present invention, the nucleotide sequence of a gene encoding the fiber protein of adenovirus type 34, 35, 3, 7, 11, 16, 21 or 50 includes not only the nucleotide sequence of each gene available from a database as described above, but also nucleotide sequences which are hybridizable under stringent conditions with DNA consisting of a nucleotide sequence complementary to DNA consisting of each nucleotide sequence available from a database and which encode a protein with binding activity to CD46.
[0124] The binding activity to CD46 can be evaluated when a recombinant adenovirus having DNA comprising the nucleotide sequence is measured for its infectivity to CD46-expressing cells. The infectivity of such a recombinant adenovirus may be measured in a known manner, for example, by detecting GFP expressed by the virus infected into CD46-expressing cells under a fluorescence microscope or by flow cytometry, etc. Procedures and stringent conditions for hybridization are the same as described above.
[0125] The recombinant adenovirus of the present invention may comprise the entire or partial region of a CD46-binding adenovirus fiber protein, such that at least the fiber knob region in the fiber protein binds to CD46. Namely, in the present invention, the CD46-binding adenovirus fiber protein may comprise at least the fiber knob region in the fiber protein of adenovirus belonging to Group B, more preferably at least the fiber knob region in the fiber protein of adenovirus of any type selected from the group consisting of type 34, type 35, type 3, type 7, type 11, type 16, type 21 and type 50, and even more preferably at least the fiber knob region in the fiber protein of adenovirus type 34 or 35. Moreover, the technical idea of the present invention is not limited to these fiber proteins as long as the intended protein binds to CD46, and it also covers various proteins capable of binding to CD46 as well as proteins having a motif capable of binding to CD46.
[0126] Alternatively, in the present invention, the CD46-binding fiber protein may comprise a region consisting of the fiber knob region and the fiber shaft region in the fiber protein of adenovirus belonging to Group B, more preferably a region consisting of the fiber knob region and the fiber shaft region in the fiber protein of adenovirus of any type selected from the group consisting of type 34, type 35, type 3, type 7, type 11, type 16, type 21 and type 50, and even more preferably a region consisting of the fiber knob region and the fiber shaft region in the fiber protein of adenovirus type 34 or 35.
[0127] In the present invention, the CD46-binding fiber protein may comprise the fiber shaft region or the fiber tail region in the fiber protein of adenovirus of any type (e.g., type 2, type 5) other than the above types, as long as it comprises at least the fiber knob region in the fiber protein of adenovirus belonging to Group B.
[0128] Examples of such a fiber protein include, but are not limited to, fiber proteins which comprise a region consisting of not only the fiber knob region and the fiber shaft region in the fiber protein of adenovirus of any type selected from the group consisting of type 34, type 35, type 3, type 7, type 11, type 16, type 21 and type 50, but also the fiber tail region in the fiber protein of adenovirus type 5.
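The chimeric fiber proteins described above can be represented by recording the adenovirus type from which each region originates. The following is only an illustrative sketch: the class name ChimericFiber and its method are hypothetical, the Group B set contains only the types named in the text, and membership of the knob in that set is used as a simplified stand-in for CD46 binding, which in practice would be confirmed by the infectivity measurement described above.

```python
from dataclasses import dataclass

# Group B adenovirus types named in the text as binding to CD46
GROUP_B_TYPES = frozenset({3, 7, 11, 16, 21, 34, 35, 50})


@dataclass(frozen=True)
class ChimericFiber:
    """Adenovirus type of origin for each fiber region (hypothetical model)."""

    knob: int
    shaft: int
    tail: int

    def has_group_b_knob(self) -> bool:
        # The text requires at least the knob region to come from a Group B type;
        # the shaft and tail may come from other types such as type 2 or type 5.
        return self.knob in GROUP_B_TYPES
```

For example, ChimericFiber(knob=34, shaft=34, tail=5) corresponds to a fiber having the knob and shaft regions of adenovirus type 34 and the tail region of adenovirus type 5, and satisfies the check, whereas a fiber derived entirely from adenovirus type 5 does not.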
[0129] The nucleotide sequences of a gene encoding the fiber knob region in the fiber protein of adenovirus type 34, a gene encoding the fiber shaft region in the fiber protein of adenovirus type 34 and a gene encoding a region consisting of the fiber knob region and the fiber shaft region in the fiber protein of adenovirus type 34 are shown in SEQ ID NOs: 47, 48 and 49, respectively.
[0130] Likewise, the nucleotide sequence of a gene encoding a region consisting of not only the fiber knob region and the fiber shaft region in the fiber protein of adenovirus type 34, but also the fiber tail region in the fiber protein of adenovirus type 5 is shown in SEQ ID NO: 50. In the present invention, the nucleotide sequence of such a gene includes not only the nucleotide sequence shown in SEQ ID NO: 50, but also nucleotide sequences which are hybridizable under stringent conditions with DNA consisting of a nucleotide sequence complementary to DNA consisting of the nucleotide sequence shown in SEQ ID NO: 50 and which encode a protein with binding activity to CD46. Procedures for evaluation of the binding activity to CD46, procedures and stringent conditions for hybridization are the same as described above.
[0131] To prepare the recombinant adenovirus of the present invention, a polynucleotide comprising the replication cassette, the labeling cassette and/or the cell death-inducing cassette may be excised with appropriate restriction enzymes and inserted into an appropriate virus expression vector. A preferred virus expression vector is an adenovirus vector, more preferably an adenovirus type 5 vector, and particularly preferably an adenovirus type 5 vector which comprises a gene encoding a CD46-binding adenovirus fiber protein (e.g., the fiber protein of adenovirus type 34 or 35).
[0132] As shown in Example 2 described later, GFP expression in blood cells was sufficiently suppressed in both cases where a miRNA target sequence was inserted downstream of the replication cassette and where a miRNA target sequence was inserted downstream of the labeling cassette, whereas GFP expression in blood cells was unexpectedly significantly suppressed in a case where miRNA target sequences were simultaneously inserted downstream of the replication cassette and downstream of the labeling cassette, respectively. This is a new finding in the present invention.
[0133] In the present invention, the recombinant adenovirus may be obtained in the following manner, by way of example.
[0134] First, pHMCMV5 (Mizuguchi H. et al., Human Gene Therapy, 10; 2013-2017, 1999) is treated with restriction enzymes and a target sequence of miRNA is inserted to prepare a vector having the target sequence of miRNA. Next, pSh-hAIB comprising a construct of hTERT promoter-E1A-IRES-E1B (WO2006/036004) is treated with restriction enzymes and the resulting fragment comprising the hTERT promoter-E1A-IRES-E1B construct is inserted into the above vector having the target sequence of miRNA to obtain a vector comprising hTERT promoter-E1A-IRES-E1B-miRNA target sequence. On the other hand, pHMCMVGFP-1 (pHMCMV5 comprising EGFP gene) is treated with restriction enzymes to obtain a fragment comprising CMV promoter and EGFP gene, and this fragment is inserted into the above vector having the target sequence of miRNA to obtain a vector comprising a construct of CMV-EGFP-miRNA target sequence. Then, the vector comprising hTERT promoter-E1A-IRES-E1B-miRNA target sequence and the vector comprising CMV-EGFP-miRNA target sequence are each treated with restriction enzymes and ligated together to obtain a vector in which hTERT promoter-E1A-IRES-E1B-miRNA target sequence is integrated into the E1-deficient region of the adenovirus genome and CMV-EGFP-miRNA target sequence is integrated into the E3-deficient region of the adenovirus genome. Alternatively, when a vector comprising a gene encoding a CD46-binding adenovirus fiber protein is used as a vector to be inserted with the DNA fragments comprising the respective constructs, it is possible to obtain a vector in which hTERT promoter-E1A-IRES-E1B-miRNA target sequence is integrated into the E1-deficient region of the adenovirus genome and CMV-EGFP-miRNA target sequence is integrated into the E3-deficient region of the adenovirus genome and which comprises a gene encoding a CD46-binding adenovirus fiber protein. Moreover, this vector may be linearized with a known restriction enzyme and then transfected into cultured cells (e.g., 293 cells) to thereby prepare an infectious recombinant adenovirus. It should be noted that those skilled in the art would be able to easily prepare all viruses falling within the present invention by making minor modifications to the above preparation procedures.
[0135] 3. Reagent for Cancer Cell Detection or Reagent for Cancer Diagnosis
[0136] As described above, the recombinant adenovirus of the present invention has the following features.
(i) This recombinant adenovirus infects almost all cells except for erythrocytes, and is also able to infect highly malignant CAR-negative cancer cells.
(ii) This recombinant adenovirus grows specifically in hTERT-expressing cancer cells and also increases the expression level of a reporter gene upon growth, whereby the production of a labeling protein, a chromophore or the like can be increased to detectable levels.
(iii) This recombinant adenovirus can prevent the occurrence of false positive results even when the virus infects normal cells having hTERT promoter activity, because miRNA expression prevents not only growth of the virus, but also expression of a reporter gene. In particular, because of comprising a target sequence of miRNA which is expressed specifically in blood cells, this recombinant adenovirus can prevent the occurrence of false positive results even when the virus infects normal blood cells having hTERT promoter activity, because expression of this miRNA prevents not only growth of the virus in blood cells but also expression of a reporter gene.
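The two-stage selectivity summarized in features (ii) and (iii) can be illustrated with a simplified Boolean model. This is an assumption-laden sketch, not a description of measured behavior: the function name and the three example cell profiles are hypothetical, infection is assumed to have occurred, and both hTERT promoter activity and miRNA expression are treated as all-or-nothing.

```python
def reporter_signal_expected(htert_promoter_active: bool, target_mirna_expressed: bool) -> bool:
    """Simplified model: a detectable reporter signal is expected only when the
    hTERT promoter drives virus growth and the targeting miRNA is absent."""
    return htert_promoter_active and not target_mirna_expressed


# Hypothetical cell profiles: (hTERT promoter active, targeted miRNA expressed)
CELL_PROFILES = {
    "cancer cell": (True, False),
    "normal blood cell with hTERT promoter activity": (True, True),
    "normal cell without hTERT promoter activity": (False, False),
}

expected_signal = {
    name: reporter_signal_expected(htert, mirna)
    for name, (htert, mirna) in CELL_PROFILES.items()
}
```

In this model only the cancer cell profile gives a signal; the normal blood cell profile, which would otherwise give a false positive result through its hTERT promoter activity, is excluded by the miRNA target sequence.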
[0137] Thus, the recombinant adenovirus of the present invention can be used as a reagent for cancer cell detection or as a reagent for cancer diagnosis. In particular, because of having the above features, the recombinant virus of the present invention is extremely effective for detection of circulating tumor cells (CTCs) present in blood.
[0138] On the other hand, since 2004 when CTCs, which are cancer cells present in blood, were reported to serve as a prognostic factor for post-operative breast cancer patients in the New England Journal of Medicine (Cristofanilli M. et al., The New England Journal of Medicine, 2004, 781-791), CTCs have been measured as a biomarker in many clinical trials conducted in Europe and North America. Particularly in breast cancer, prostate cancer and skin cancer, CTCs have been proven to be an independent factor which determines the prognosis of these cancers. Moreover, in Europe, in the clinical trial in adjuvant setting of prostate cancer (SUCCESS), the number of CTCs counted is added to the inclusion criteria and only patients in whom one or more cells have been detected are included. This trial is a large-scale clinical trial including 2000 cases or more, and attention is being given to the results. Moreover, there is also a clinical trial in which an increase or decrease per se in CTCs is one of the clinical endpoints (MDV3100).
[0139] In recent years, the FDA in the United States has issued guidelines for approval and authorization of molecular-targeted anticancer agents, and hence the CTC test has become more important in cancer diagnosis. The guidelines issued by the FDA define that genetic changes in molecular targets in tumors should be tested before selection of molecular-targeted anticancer agents. When attempting to achieve the guidelines by conventional techniques, there arises a need for surgical biopsy from tumor tissues in patients to conduct genetic testing, which will impose a very strong burden on the patients. To solve this problem, efforts are now made to conduct genetic testing on CTCs collected from blood, and this strategy is referred to as "liquid biopsy" in contrast to the conventional "biopsy." Once this strategy has been achieved, genetic testing of tumor tissues can be conducted simply by blood collection and the burden on patients can be reduced greatly. For these reasons, the CTC test is receiving great attention as a highly useful testing technique in the clinical setting.
[0140] The CellSearch System of Veridex LLC is the only CTC detection device currently approved by the FDA, and most of the CTC detection methods used in clinical trials are accomplished by this CellSearch System. The CellSearch System is based on techniques to detect cancer cells with EpCAM antibody and cytokeratin antibody.
[0141] However, CTC detection techniques are designed to detect several to several tens of cells from among a billion blood cells, and it is therefore very difficult to improve their sensitivity and accuracy. Thus, some problems have also been pointed out in CTC detection methods based on the CellSearch System. For example, it has been pointed out that cancer cells which are negative in the CTC test based on the CellSearch System are detected as being positive in another test, and that there are great differences in sensitivity and accuracy, depending on the cancer type (Allard W. J. et al., Clinical Cancer Research, 2004, 6897-6904). Moreover, the CellSearch System has also been pointed out to have a problem of low CTC detection rate for lung cancer in the clinical setting (ibid).
[0142] Likewise, the CellSearch System has also been pointed out to have a problem of reduced CTC detection rate because the expression of cell surface antigens including EpCAM is reduced in cancer cells having undergone epithelial-mesenchymal transition (EMT) (Anieta M. et al., J Natl Cancer Inst, 101, 2009, 61-66; Janice Lu et al., Int J Cancer, 126(3), 2010, 669-683).
[0143] Further, to conduct the above "liquid biopsy," additional steps are required for concentration and phenotyping or genotyping of CTCs, which require more sensitive and more accurate CTC detection techniques than simply counting the number of CTCs.
[0144] In contrast to this, because of having the above features (i) to (iii), the recombinant adenovirus of the present invention allows simple, highly sensitive and highly accurate detection of CTCs in blood without detection of leukocytes and other normal blood cells. Further, the reagent of the present invention allows detection of CTCs alive, so that the source organ of the detected CTCs can be identified upon analyzing surface antigens or the like present on the cell surface of the CTCs. Thus, the recombinant adenovirus of the present invention is useful for CTC detection and cancer diagnosis.
[0145] Moreover, the recombinant adenovirus or reagent for cancer cell detection of the present invention can be used to detect cancer cells having undergone EMT or mesenchymal-epithelial transition (MET). EMT is a phenomenon in which cancer cells lose their properties as epithelium and acquire features as mesenchymal lineage cells tending to migrate into surrounding tissues, and EMT is also involved in invasion and/or metastasis of cancer cells. On the other hand, MET is a phenomenon in which mesenchymally derived cells acquire features as epithelium. As described above, it is difficult to detect cancer cells having undergone EMT by known techniques including the CellSearch System. In contrast, the present invention allows detection of cancer cells having undergone EMT or MET. The recombinant adenovirus of the present invention is therefore useful for cancer cell detection and for cancer diagnosis.
[0146] Further, the recombinant adenovirus of the present invention can also be used to detect drug-resistant cancer cells. Drugs intended in the present invention are those used for cancer chemotherapy. Examples of such drugs include, but are not limited to, adriamycin, carboplatin, cisplatin, 5-fluorouracil, mitomycin, bleomycin, doxorubicin, daunorubicin, methotrexate, paclitaxel, docetaxel and actinomycin D, etc. Moreover, the recombinant virus of the present invention can also be used to detect cancer stem cells. In the present invention, cancer stem cells refer to cells (stem cells) serving as the origin of cancer cells. Cancer stem cells also include those having drug resistance.
[0147] In the present invention, the type of cancer or tumor to be detected or diagnosed is not limited in any way, and cells of all cancer types can be used. Examples include solid cancers or blood tumors, more specifically brain tumor, cervical cancer, esophageal cancer, tongue cancer, lung cancer, breast cancer, pancreatic cancer, gastric cancer, small intestinal cancer, duodenal cancer, colorectal cancer, bladder cancer, kidney cancer, liver cancer, prostate cancer, uterine cancer, uterine cervical cancer, ovarian cancer, thyroid cancer, gallbladder cancer, pharyngeal cancer, sarcoma, melanoma, leukemia, lymphoma and multiple myeloma (MM). Most (85% or more) of the cancer cells derived from human tissues show increased telomerase activity, and the present invention allows detection of such telomerase-expressing cancer cells in general.
[0148] Moreover, in the present invention, CTCs are not limited in any way as long as they are cancer cells present in blood, and they include not only cancer cells released from solid cancers, but also blood tumor cells such as leukemia cells and lymphoma cells as mentioned above. However, in cases where CTCs are blood tumor cells, the miRNA target sequence contained in the adenovirus of the present invention is preferably a target sequence of miRNA which is expressed specifically in normal blood cells.
[0149] To prepare the reagent of the present invention, the recombinant adenovirus may be treated, e.g., by freezing for easy handling and then used directly or mixed with known pharmaceutically acceptable carriers (e.g., excipients, extenders, binders, lubricants) and/or known additives (including buffering agents, isotonizing agents, chelating agents, coloring agents, preservatives, aromatics, flavorings, sweeteners).
[0150] 4. Method for Cancer Cell Detection or Method for Cancer Diagnosis
[0151] Furthermore, the recombinant adenovirus of the present invention can be used for cancer cell detection or cancer diagnosis by contacting the same with cancer cells and detecting the fluorescence or color produced by the cancer cells.
[0152] In the present invention, the term "contact(ing)" is intended to mean that cancer cells and the recombinant adenovirus of the present invention are allowed to exist in the same reaction system, for example, by adding the recombinant adenovirus of the present invention to a sample containing cancer cells, by mixing cancer cells with the recombinant adenovirus, by culturing cancer cells in the presence of the recombinant adenovirus, or by infecting the recombinant adenovirus into cancer cells. Moreover, in the present invention, "fluorescence or color" is not limited in any way as long as it is light or color produced from a protein expressed from a reporter gene, and examples include fluorescence emitted from a labeling protein (e.g., GFP), light emitted from a luminophore generated by luciferase-mediated enzymatic reaction, blue color produced from a chromophore generated by enzymatic reaction between β-galactosidase and X-gal, etc.
[0153] Cancer cells for use in the method for cancer cell detection or in the method for cancer diagnosis may be derived from a biological sample taken from a subject. Such a biological sample taken from a subject is not limited in any way as long as it is a tissue suspected to contain cancer cells, and examples include blood, tumor tissue, lymphoid tissue and so on. Alternatively, cancer cells may be circulating tumor cells (CTCs) in blood, and explanations on CTCs are the same as described above.
[0154] Cancer cell detection and cancer diagnosis using the reagent of the present invention may be accomplished as follows, by way of example.
[0155] In cases where the biological sample taken from a subject is blood, the blood sample is treated by addition of an erythrocyte lysis reagent to remove erythrocytes and the remaining cell suspension is mixed in a test tube with the reagent of the present invention at a given ratio (0.01 to 1000 MOI (multiplicity of infection), preferably 0.1 to 100 MOI, more preferably 1 to 10 MOI). The test tube is allowed to stand or rotated for culture at room temperature or 37° C. for a given period of time (e.g., 4 to 96 hours, preferably 12 to 72 hours, more preferably 18 to 36 hours) to facilitate virus infection into cancer cells and virus growth. GFP fluorescence production in the cell fraction is quantitatively analyzed by flow cytometry. Alternatively, GFP-expressing cells are morphologically analyzed by being observed under a fluorescence microscope. This system allows highly sensitive detection of CTCs present in peripheral blood. This method can be used for detection of CTCs which are present in trace amounts in peripheral blood.
[0156] In cases where flow cytometry is used for CTC detection, CTCs may be detected by determining whether each cell is GFP-positive or GFP-negative, e.g., in accordance with the following criteria.
[0157] First, groups of cells in a sample which is not infected with any virus are analyzed to obtain background fluorescence values. A threshold is set at the maximum background fluorescence value. Subsequently, groups of cells in samples which have been infected with the virus of the present invention are analyzed, and cells showing a fluorescence value equal to or greater than the threshold are determined to be GFP-positive. In the case of using a blood sample taken from a subject, GFP-positive cells can be detected as CTCs. Further, these GFP-positive cells (CTCs) may be concentrated for phenotyping or genotyping.
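By way of illustration only, the criteria of paragraph [0157] may be sketched in Python as follows. The function names and the fluorescence values are hypothetical and are not part of the disclosure; the sketch assumes that one fluorescence intensity per cell (event) is available from the flow cytometer.

```python
from typing import Sequence


def gfp_threshold(uninfected: Sequence[float]) -> float:
    """Threshold = maximum fluorescence observed in the uninfected control."""
    if not uninfected:
        raise ValueError("uninfected control must contain at least one event")
    return max(uninfected)


def count_gfp_positive(infected: Sequence[float], threshold: float) -> int:
    """Count events with fluorescence equal to or greater than the threshold."""
    return sum(1 for value in infected if value >= threshold)


# Hypothetical intensities (arbitrary units), for illustration only.
control = [12.0, 15.5, 9.8, 20.1]
sample = [11.0, 20.1, 350.0, 18.7, 1200.0]

threshold = gfp_threshold(control)
positives = count_gfp_positive(sample, threshold)
print(f"threshold={threshold}, GFP-positive events={positives}")
```

Because the comparison is "equal to or greater than", an event lying exactly at the control maximum is scored as positive in this sketch; a stricter gate could be chosen if desired.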
[0158] In the present invention, examples of a subject include mammals such as humans, rabbits, guinea pigs, rats, mice, hamsters, cats, dogs, goats, pigs, sheep, cows, horses, monkeys and so on.
[0159] The amount of the reagent of the present invention to be used is selected as appropriate, depending on the state and amount of a biological sample to be used for detection and the type of detection method to be used, etc. For example, in the case of a blood sample, the reagent of the present invention can be used in an amount ranging from about 0.01 to 1000 MOI, preferably 0.1 to 100 MOI, and more preferably 1 to 10 MOI per 1 to 50 ml, preferably 3 to 25 ml, and more preferably 5 to 15 ml of the blood sample. MOI refers to the ratio between the amount of virus (infectious unit) and the number of cells when a given amount of cultured cells are infected with a given amount of virus particles, and is used as an index when viruses are infected into cells.
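As a minimal sketch of the MOI relationship described in paragraph [0159] (infectious units divided by the number of cells), the following Python functions may be considered. The function names and the cell number are hypothetical and serve only to illustrate the arithmetic.

```python
def required_infectious_units(moi: float, cell_count: int) -> float:
    """Infectious units needed to infect cell_count cells at a given MOI."""
    if moi < 0 or cell_count < 0:
        raise ValueError("moi and cell_count must be non-negative")
    return moi * cell_count


def multiplicity_of_infection(infectious_units: float, cell_count: int) -> float:
    """MOI = infectious units per cell."""
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    return infectious_units / cell_count


# Hypothetical cell number after erythrocyte lysis, for illustration only.
cells = 1_000_000
for moi in (1, 10):
    print(f"MOI {moi}: {required_infectious_units(moi, cells):.0f} infectious units")
```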
[0160] To infect the recombinant virus into cells, the following procedures may be used for this purpose. First, cells are seeded in a culture plate containing an appropriate culture medium and cultured at 37° C. in the presence of carbon dioxide gas. The culture medium is selected from DMEM, MEM, RPMI-1640 and others commonly used for animal cell culture, and may be supplemented with serum, antibiotics, vitamins and so on, if necessary. The cultured cells are inoculated with a given amount of the virus, for example, at 0.1 to 10 MOI.
[0161] For confirmation of virus growth, the virus-infected cells are collected and treated to extract their DNA, followed by real-time PCR with primers targeting an appropriate gene possessed by the virus of the present invention, whereby virus growth can be quantitatively analyzed.
[0162] In cases where GFP gene is used as a reporter gene, labeled cells may be detected as follows: cells showing virus growth will emit a given fluorescence (e.g., a green fluorescence for GFP) upon irradiation with an excitation light, so that cancer cells can be visualized by the fluorescence. For example, when the virus-infected cells are observed under a fluorescence microscope, GFP fluorescence production can be seen in the cells. Moreover, to observe the virus-infected cells over time, GFP fluorescence production can be monitored over time with a CCD camera.
[0163] Moreover, the reagent of the present invention also allows real-time detection of cancer cells present in vivo. To label and detect cells in vivo in a real-time manner, the recombinant adenovirus of the present invention may be administered in vivo.
[0164] The reagent of the present invention may be applied directly to the affected area or may be introduced in vivo (into target cells or organs) in any known manner, e.g., by injection into vein, muscle, peritoneal cavity or subcutaneous tissue, inhalation from nasal cavity, oral cavity or lungs, oral administration, catheter-mediated intravascular administration and so on, as preferably exemplified by local injection into muscle, peritoneal cavity or elsewhere, injection into vein, etc.
[0165] When the reagent of the present invention is administered to a subject, the dose may be selected as appropriate, depending on the type of active ingredient, the route of administration, a target to be administered, the age, body weight, sex and/or symptoms of a patient, and other conditions. As a daily dose, the amount of the virus of the present invention serving as an active ingredient may usually be set to around 10^6 to 10^11 PFU (plaque forming units), preferably around 10^9 to 10^11 PFU, given once a day or in divided doses.
[0166] Real-time in vivo monitoring of fluorescence from cancer cells has the advantage of being used for in vivo diagnostic agents. This is useful for so-called navigation surgery and so on. Details on navigation surgery can be found in WO2006/036004.
[0167] Further, the reagent of the present invention is useful for detection of CTCs as a biomarker, and hence the reagent of the present invention can be used to determine prognosis.
[0168] For example, in cases where GFP is used as a labeling protein in the virus of the present invention, a biological sample taken from a cancer patient before being treated by any cancer therapy (e.g., chemotherapy, radiation therapy, surgical operation) and a biological sample taken at a time point after a certain period (e.g., 1 to 90 days) has passed from the treatment are each infected with the virus of the present invention. Next, GFP-positive cells contained in the sample taken before the treatment and GFP-positive cells contained in the sample taken at a certain time point after the treatment are compared for their number under the same conditions. As a result, if the number of GFP-positive cells after the treatment becomes smaller than the number of GFP-positive cells before the treatment, a determination can be made that prognosis has been improved.
[0169] The present invention will be further described in more detail by way of the following illustrative examples, which are not intended to limit the scope of the invention.
Example 1
Preparation of Ad34 Fiber 142-3pT
[0170] (1) Preparation of pHMCMV5-miR-142-3pT
[0171] pHMCMV5 (Mizuguchi H. et al., Human Gene Therapy, 10; 2013-2017, 1999) was treated with NotI/KpnI and the resulting fragment was ligated to a double-stranded oligo, which had been prepared by annealing the following synthetic oligo DNAs, to thereby prepare pHMCMV5-miR-142-3pT(pre).
TABLE-US-00003
miR-142-3pT-S1 (SEQ ID NO: 43, each underline represents a miR-142-3p target sequence):
5'-GGCCTCCATAAAGTAGGAAACACTACACAGCTCCATAAAGTAGGAAACACTACATTAATTAAGCGGTAC-3'
miR-142-3pT-AS1 (SEQ ID NO: 44, each underline represents a miR-142-3p target sequence):
5'-CGCTTAATTAATGTAGTGTTTCCTACTTTATGGAGCTGTGTAGTGTTTCCTACTTTATGGA-3'
[0172] Then, pHMCMV5-miR-142-3pT(pre) was treated with PacI/KpnI and the resulting fragment was ligated to a double-stranded oligo, which had been prepared by annealing the following synthetic oligo DNAs, to thereby obtain pHMCMV5-miR-142-3pT having 4 repeats of a miR-142-3p target sequence.
TABLE-US-00004
miR-142-3pT-S2 (SEQ ID NO: 45, each underline represents a miR-142-3p target sequence):
5'-TCCATAAAGTAGGAAACACTACAGGACTCCATAAAGTAGGAAACACTACAGTAC-3'
miR-142-3pT-AS2 (SEQ ID NO: 46, each underline represents a miR-142-3p target sequence):
5'-TGTAGTGTTTCCTACTTTATGGAGTCCTGTAGTGTTTCCTACTTTATGGAAT-3'
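By way of illustration only, the statement that the two rounds of oligo insertion yield 4 repeats of the target sequence can be checked with a short Python sketch. The 23-nucleotide motif below is an assumption inferred from the unit repeated in SEQ ID NOs: 43 to 46 (the underlines marking it are not reproduced in this text), and the function name is hypothetical. The oligo sequences are taken from the tables above with the line-wrap spaces removed.

```python
# Assumed 23-nt miR-142-3p target motif (inferred from the repeated unit).
TARGET = "TCCATAAAGTAGGAAACACTACA"

# Sense and antisense oligos, joined across the line wraps in the tables.
S1 = "GGCCTCCATAAAGTAGGAAACACTACACAGCTCCATAAAGTAGGA" + "AACACTACATTAATTAAGCGGTAC"
AS1 = "CGCTTAATTAATGTAGTGTTTCCTACTTTATGGAGCTGTGTAGTGTT" + "TCCTACTTTATGGA"
S2 = "TCCATAAAGTAGGAAACACTACAGGACTCCATAAAGTAGGAAACA" + "CTACAGTAC"
AS2 = "TGTAGTGTTTCCTACTTTATGGAGTCCTGTAGTGTTTCCTACTTTAT" + "GGAAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# Copies of the motif on the sense strand of each double-stranded oligo.
copies_first = S1.count(TARGET)
copies_second = S2.count(TARGET)
total_copies = copies_first + copies_second

# The antisense oligos should carry the same motif on the opposite strand.
antisense_ok = (
    reverse_complement(AS1).count(TARGET) == copies_first
    and reverse_complement(AS2).count(TARGET) == copies_second
)

print(copies_first, copies_second, total_copies, antisense_ok)
```

Under these assumptions, each double-stranded oligo contributes two copies of the motif, which is consistent with the four repeats stated for pHMCMV5-miR-142-3pT.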
[0173] (2) Preparation of E1 Shuttle Plasmid pHM5-hAIB-miR-142-3pT
[0174] pSh-hAIB (WO2006/036004) was digested with I-CeuI/PmeI and the digested product was electrophoresed on an agarose gel. A band of approximately 4.5 kbp (hAIB cassette) was excised from the gel and treated with GENECLEAN II (Q-Biogene) to purify and collect a DNA fragment. The purified DNA fragment (hAIB cassette) was ligated to a fragment which had been obtained from pHMCMV5-miR-142-3pT by being digested with NheI, treated with Klenow Fragment and further digested with I-CeuI, thereby obtaining pHM5-hAIB-miR-142-3pT having hTERT promoter, E1A gene, IRES (internal ribosomal entry site) sequence, E1B gene and a miR-142-3pT target sequence.
[0175] (3) Preparation of E3 Shuttle Plasmid pHM13CMV-EGFP-miR-142-3pT
[0176] pEGFP-N1 (Clontech) was digested with ApaI and NotI, and the resulting digested product was inserted into the ApaI/NotI site of pHMCMV5 to obtain pHMCMVGFP-1. pHMCMVGFP-1 was digested with PmeI/HindIII, and the digested product was electrophoresed on an agarose gel. A band of approximately 750 bp (EGFP) was excised from the gel and treated with GENECLEAN II to purify and collect a DNA fragment. The purified DNA fragment (EGFP) was ligated to a fragment which had been obtained from pBluescriptII KS+ by being digested with HincII/HindIII, thereby preparing pBSKS-EGFP. pBSKS-EGFP was digested with ApaI/XbaI, and the digested product was electrophoresed on an agarose gel. A band of approximately 750 bp (EGFP) was excised from the gel and treated with GENECLEAN II to purify and collect a DNA fragment. The purified DNA fragment (EGFP) was ligated to a fragment which had been obtained from pHMCMV5-miR-142-3pT by being digested with ApaI/XbaI, thereby obtaining pHMCMV5-EGFP-miR-142-3pT. pHMCMV5-EGFP-miR-142-3pT was digested with BglII, and the digested product was electrophoresed on an agarose gel. A band of approximately 2 kbp (CMV-EGFP-miR-142-3pT) was excised from the gel and treated with GENECLEAN II to purify and collect a DNA fragment. The purified DNA fragment (CMV-EGFP-miR-142-3pT) was ligated to a fragment which had been obtained from pHM13 (Mizuguchi et al., Biotechniques, 30; 1112-1116, 2001) by being digested with BamHI and treated with CIP (Alkaline Phosphatase, Calf Intest), thereby obtaining pHM13CMV-EGFP-miR-142-3pT.
[0177] (4) Preparation of pAdHM49-hAIB142-3pT-CG142-3pT
[0178] pAdHM49 (Mizuguchi et al., J. Controlled Release 110; 202-211, 2005) was treated with I-CeuI/PI-SceI and the resulting fragment was ligated to pHM5-hAIB-miR-142-3pT which had also been treated with I-CeuI/PI-SceI, thereby preparing pAdHM49-hAIB142-3pT in which hTERT promoter, E1A gene, IRES sequence, E1B gene and a miR-142-3pT target sequence were integrated into the E1-deficient region of the Ad vector. pAdHM49 is a recombinant adenovirus in which a region covering genes encoding the fiber knob and fiber shaft of the adenovirus type 5 fiber is replaced with a region covering genes encoding the fiber knob and fiber shaft of the adenovirus type 34 fiber, and hence pAdHM49 comprises the nucleotide sequence (SEQ ID NO: 49) of a gene encoding a region consisting of the fiber knob region and the fiber shaft region in the fiber protein of adenovirus type 34. The nucleotide sequence of a gene encoding the pAdHM49 fiber protein (i.e., the fiber knob region and fiber shaft region of the adenovirus type 34 fiber and the fiber tail region of the adenovirus type 5 fiber) is shown in SEQ ID NO: 50. In the nucleotide sequence shown in SEQ ID NO: 50, the nucleotide sequence of a gene encoding the fiber tail region of the adenovirus type 5 fiber is located at nucleotides 1 to 132, the nucleotide sequence of a gene encoding the fiber shaft region of the adenovirus type 34 fiber is located at nucleotides 133 to 402, and the nucleotide sequence of a gene encoding the fiber knob region of the adenovirus type 34 fiber is located at nucleotides 403 to 975. Namely, in the nucleotide sequence shown in SEQ ID NO: 50, the nucleotide sequence of a region derived from the adenovirus type 5 fiber is located at nucleotides 1 to 132, while the nucleotide sequence of a region derived from the adenovirus type 34 fiber is located at nucleotides 133 to 975.
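The region coordinates given for SEQ ID NO: 50 may be tabulated and checked for consistency as in the following Python sketch. Only the coordinates are taken from the text; the data structure and the function names are hypothetical.

```python
# (label, first nucleotide, last nucleotide), 1-based inclusive coordinates
# as stated for SEQ ID NO: 50.
REGIONS = [
    ("Ad5 fiber tail", 1, 132),
    ("Ad34 fiber shaft", 133, 402),
    ("Ad34 fiber knob", 403, 975),
]


def region_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def is_contiguous(regions) -> bool:
    """True if each region starts right after the previous one ends."""
    return all(
        nxt[1] == prev[2] + 1 for prev, nxt in zip(regions, regions[1:])
    )


lengths = {label: region_length(start, end) for label, start, end in REGIONS}
ad34_length = lengths["Ad34 fiber shaft"] + lengths["Ad34 fiber knob"]

print(lengths, ad34_length, is_contiguous(REGIONS))
```

The Ad34-derived portion (nucleotides 133 to 975) thus spans 843 nucleotides, and the three regions tile the 975 nucleotides without gaps or overlaps.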
[0179] Then, pAdHM49-hAIB142-3pT was digested with Csp45I and the resulting fragment was ligated to a fragment which had been obtained from pHM13CMV-EGFP-miR-142-3pT by being digested with ClaI, thereby obtaining pAdHM49-hAIB142-3pT-CG142-3pT in which hTERT promoter, E1A gene, IRES sequence, E1B gene and a miR-142-3pT target sequence were integrated into the E1-deficient region of the adenovirus vector and CMV promoter, EGFP and a miR-142-3pT target sequence were integrated into the E3-deficient region of the adenovirus vector, and which further comprised a gene encoding the fiber protein of adenovirus type 34.
[0180] (5) Preparation of Ad34 Fiber 142-3pT(E1,E3)
[0181] pAdHM49-hAIB142-3pT-CG142-3pT was linearized by being cleaved with a restriction enzyme PacI whose recognition site was present at each end of the adenovirus genome therein, and the linearized product was transfected into 293 cells seeded in a 60 mm culture dish by using Lipofectamine 2000 (Invitrogen). After about 2 weeks, a recombinant adenovirus Ad34 fiber 142-3pT(E1,E3) was obtained (FIG. 1).
Example 2
Activity Measurement of Ad34 Fiber 142-3pT(E1,E3)
(1) Cells
[0182] HeLa (derived from human uterine cancer cells) and LN319 (derived from human glioma cells) were used as CAR-positive cells, while LNZ308 (derived from human glioma cells), LN444 (derived from human glioma cells) and K562 (derived from human myelogenous leukemia cells) were used as CAR-negative cells. K562 cells express miR-142-3p. DMEM (10% FCS, supplemented with antibiotics) was used for HeLa, LN319, LNZ308 and LN444 cells, while RPMI-1640 medium (10% FCS, supplemented with antibiotics) was used for K562 cells. These cells were cultured at 37° C. under saturated vapor pressure in the presence of 5% CO2.
[0183] (2) Activity Measurement of Ad34 Fiber 142-3pT(E1,E3) by Flow Cytometry
[0184] Cells of each line were seeded in a 24-well plate at 5×10^4 cells/500 μl/well and treated with Ad34 fiber 142-3pT(E1,E3) at an MOI of 10. As a control, TelomeScan (i.e., a conditionally replicating adenovirus comprising hTERT promoter, E1A gene, IRES sequence and E1B gene integrated in this order into the E1-deficient site of adenovirus type 5 and comprising CMV promoter and GFP integrated in this order into the E3-deficient site of adenovirus type 5) was used. After culture for 24 hours, the cells were collected and the number of GFP-positive cells was measured using a flow cytometer MACSQuant (Miltenyi Biotec).
[0185] The results obtained are shown in FIG. 2. In the specification and FIG. 2, "TelomeScan (Ad5 fiber)" represents TelomeScan, while "Ad34 fiber" represents a recombinant adenovirus which comprises hTERT promoter, E1A gene, IRES sequence and E1B gene integrated in this order into the E1-deficient site of the adenovirus genome and also comprises CMV promoter and GFP integrated in this order into the E3-deficient site of the adenovirus genome and which comprises a gene encoding a fiber protein derived from adenovirus type 34. Likewise, "Ad34 fiber 142-3pT(E1)" represents a recombinant adenovirus which further comprises a target sequence of miR-142-3p integrated into the E1-deficient region (downstream of the E1B gene) in the above Ad34 fiber, while "Ad34 fiber 142-3pT(E3)" represents a recombinant adenovirus which further comprises a target sequence of miR-142-3p integrated into the E3-deficient region (downstream of the GFP gene) in the above Ad34 fiber. Likewise, "Ad34 fiber 142-3pT(E1,E3)" represents a recombinant adenovirus which further comprises a target sequence of miR-142-3p integrated into each of the E1- and E3-deficient regions (downstream of the E1B gene and downstream of the GFP gene, respectively) in the above Ad34 fiber. Moreover, in FIG. 2 and the subsequent figures, "(containing GFP)" is intended to mean that the GFP gene is inserted into each viral genome.
[0186] As a result of activity measurement, when LNZ308, LN444 and K562, which are CAR-negative cells, were infected with TelomeScan (Ad5 fiber), no GFP-positive cell was detected (FIG. 2, panels k, p and u). In contrast, when these cells were infected with Ad34 fiber, GFP-positive cells were detected (85.5% positive in LNZ308, 58.4% positive in LN444, and 63.7% positive in K562) (panels l, q and v).
[0187] This result indicated that the recombinant adenovirus of the present invention having a gene encoding the fiber protein of adenovirus type 34 allowed significant detection of CAR-negative cells.
[0188] Further, in the case of K562 cells which are CAR-negative and are expressing miR-142-3p, GFP-positive cells were 63.7% upon infection with Ad34 fiber (panel v), whereas GFP-positive cells were 12.2% upon infection with Ad34 fiber 142-3pT(E1) and 34.8% upon infection with Ad34 fiber 142-3pT(E3), and no GFP-positive cell was detected upon infection with Ad34 fiber 142-3pT(E1,E3) (panels w, x and y). Namely, the detection rate of K562 cells was significantly reduced when using an adenovirus comprising a target sequence of miR-142-3p integrated into either the E1- or E3-deficient region of the adenovirus genome, and K562 cells were no longer detected when using an adenovirus comprising a target sequence of miR-142-3p integrated into each of the E1- and E3-deficient regions.
[0189] This result indicated that the recombinant virus of the present invention comprising a target sequence of miR-142-3p did not detect highly miR-142-3p-expressing cells, such as normal blood cells.
Example 3
Detection of Cancer Cells in Blood Samples Using Ad34 Fiber 142-3pT(E1,E3)
[0190] 5×10^4 H1299 cells (CAR-positive) were suspended in 5 mL blood and erythrocytes were lysed to collect PBMCs. To these PBMCs, a virus was added in an amount of 1×10^9, 1×10^10 or 1×10^11 VPs (virus particles) and infected at 37° C. for 24 hours while rotating with a rotator. The cells were collected and immunostained with anti-CD45 antibody, and GFP-positive cells were observed under a fluorescence microscope. CD45 is known to be a surface antigen of blood cell lineage cells except for erythrocytes and platelets. "GFP Positive Cancer cells (%)" found in the vertical axis of FIGS. 3 and 4 represents the "number of GFP-positive and CD45-negative cells (%) among GFP-positive cells."
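The quantity plotted as "GFP Positive Cancer cells (%)" can be expressed as a simple ratio, as in the following Python sketch. The function name and the cell counts are hypothetical and are given for illustration only; they are not data from FIGS. 3 and 4.

```python
def detection_specificity(gfp_pos_cd45_neg: int, gfp_pos_cd45_pos: int) -> float:
    """Percentage of GFP-positive cells that are CD45-negative.

    gfp_pos_cd45_neg: GFP-positive, CD45-negative cells (scored as cancer cells)
    gfp_pos_cd45_pos: GFP-positive, CD45-positive cells (false positives)
    """
    total_gfp_positive = gfp_pos_cd45_neg + gfp_pos_cd45_pos
    if total_gfp_positive == 0:
        raise ValueError("no GFP-positive cells were counted")
    return 100.0 * gfp_pos_cd45_neg / total_gfp_positive


# Hypothetical counts from one microscope field, for illustration only.
specificity = detection_specificity(gfp_pos_cd45_neg=45, gfp_pos_cd45_pos=5)
print(f"{specificity:.1f}%")
```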
[0191] As a result, many false positive cells (GFP-positive and CD45-positive cells) were observed upon infection with TelomeScan (Ad5 fiber), whereas false positive cells were very few upon infection with Ad34 fiber 142-3pT(E1,E3), so that cancer cells were able to be specifically detected.
[0192] Moreover, as a result of quantitative analysis on the detection specificity of H1299 cells, many false positive cells were detected in the case of TelomeScan (Ad5 fiber) upon virus infection at 1×10^9 VPs, whereas the detection specificity was 90% or higher and some samples showed 100% detection specificity in the case of Ad34 fiber 142-3pT(E1,E3) even when the amount of virus infection was increased (FIG. 3). Likewise, quantitative analysis was also performed on A549 cells (CAR-positive cells) in the same manner, indicating that the detection specificity was 100% upon virus infection at 1×10^9 VPs (FIG. 4). These results indicated that the recombinant virus of the present invention allowed specific detection of cancer cells contained in the PBMC fraction.
[0193] In view of the foregoing, the detection reagent and diagnostic reagent of the present invention were demonstrated to allow detection of highly malignant CAR-negative cancer cells and, on the other hand, to ensure no false positive detection of highly miR-142-3p-expressing normal blood cells (e.g., leukocytes), etc.; and hence they were shown to be very effective for detection of circulating tumor cells (CTCs) in blood.
Example 4
Activity Measurement of Ad34 Fiber 142-3pT(E1,E3) in Various Human Cancer Cell Lines
(1) Cells
[0194] The cancer cells used in this example were human non-small cell lung cancer-derived H1299 cells, human lung cancer-derived A549 cells, human breast cancer-derived MCF7 cells, human breast cancer-derived MDA-MB-231 cells, human bladder cancer-derived KK47 cells, human gastric cancer-derived MKN45 cells, human colorectal cancer-derived SW620 cells, human liver cancer-derived Huh7 cells, human pancreatic cancer-derived Panc1 cells, human glioma-derived LN319 cells, human bladder cancer-derived T24 cells, human glioma-derived LNZ308 cells, and human glioma-derived LN444 cells.
(2) Activity Measurement of Ad34 Fiber 142-3pT(E1,E3) by Flow Cytometry
[0195] 5×10^4 cancer cells of each line were suspended in 500 μl medium, to which 100 μl of a conditionally replicating Ad suspension prepared at 5×10^5 or 5×10^6 pfu/ml was then added. The resulting mixture of the cells and the conditionally replicating Ad was seeded in a 24-well plate and cultured at 37° C. for 24 hours. The cells were collected and centrifuged at 1500 rpm for 5 minutes. After removal of the medium, the cells were suspended in 300 μl of 2% FCS-containing PBS and measured for GFP-positive rate using a flow cytometer (MACS Quant Analyzer; Miltenyi Biotec). The data obtained were analyzed by FCS multi-color data analysis software (FlowJo).
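As a worked illustration of the virus dose in this example, the following Python sketch takes the cell number to be 5×10^4 per well, the added volume to be 100 μl and the titers to be 5×10^5 and 5×10^6 pfu/ml, and assumes that pfu may be treated as infectious units when computing an MOI. The function names are hypothetical.

```python
def pfu_added(titer_pfu_per_ml: int, volume_ul: int) -> float:
    """Plaque forming units contained in volume_ul microliters of suspension."""
    return titer_pfu_per_ml * volume_ul / 1000  # 1000 microliters per ml


def pfu_per_cell(pfu: float, cell_count: int) -> float:
    """Approximate MOI, treating pfu as infectious units (assumption)."""
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    return pfu / cell_count


CELLS_PER_WELL = 50_000  # 5 x 10^4 cells
VOLUME_UL = 100          # 100 microliters of virus suspension

for titer in (500_000, 5_000_000):  # 5 x 10^5 and 5 x 10^6 pfu/ml
    dose = pfu_added(titer, VOLUME_UL)
    print(f"{titer} pfu/ml -> {dose:.0f} pfu, about {pfu_per_cell(dose, CELLS_PER_WELL):g} pfu per cell")
```

Under these assumptions the two titers correspond to roughly 1 and 10 pfu per cell, respectively.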
[0196] As a result, Ad34 fiber 142-3pT(E1,E3) was found to efficiently infect almost all cancer cells, and 60% or more of the cancer cells were GFP-positive. Particularly in the case of CAR-negative cells (T24, LNZ308, LN444), their GFP-positive rate was significantly improved when compared to conventionally used TelomeScan (FIG. 5).
[0197] This result indicated that the recombinant virus of the present invention allowed efficient detection of not only CAR-positive cells but also CAR-negative cells.
Example 5
Detection of Cancer Cells Having Undergone Epithelial-Mesenchymal Transition (EMT)
[0198] Human pancreatic cancer Panc I cells were cultured for 6 days in the presence of 10 ng/mL recombinant TGF-β1 to thereby induce epithelial-mesenchymal transition (EMT). After induction of EMT, relative expression of mRNAs encoding E-cadherin, EpCAM, hTERT, N-cadherin, Slug and Snail was measured by real-time RT-PCR. In addition, CAR and CD46 expression in the Panc I cells was analyzed by flow cytometry. The cells were infected with the virus of the present invention in the same manner as shown in Example 4.
[0199] As a result, upon culture in a TGF-β1-containing medium, the expression of the EMT marker genes Slug, Snail and N-cadherin was increased, while the expression of the epithelial markers E-cadherin and EpCAM was reduced, thus indicating that EMT had been induced (FIG. 6A). Moreover, upon EMT induction, CAR expression was reduced whereas CD46 expression was not reduced at all (FIG. 6B). Further, when conventionally used TelomeScan was used for Panc I cells having undergone EMT, only about 35% of these cells were GFP-positive, whereas approximately 90% or more of the cells were GFP-positive in the case of Ad34 fiber 142-3pT(E1,E3) (FIG. 6C).
[0200] These results indicated that the recombinant virus of the present invention allowed highly sensitive detection of cancer cells having undergone epithelial-mesenchymal transition (EMT).
Example 6
Detection of Cancer Stem Cells
[0201] MCF7 cells and MCF7-ADR cells (cancer cells resistant to the anticancer agent adriamycin) were each seeded in a 96-well plate at 1×10^3 cells/well, and on the following day, adriamycin was added thereto at 0.2, 1, 5, 25 or 125 μg/mL. At 24 hours after the addition of adriamycin, an AlamarBlue® cell viability reagent was used to measure cell viability (value: mean±S.D. (n=6)).
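The adriamycin concentrations listed above form a five-fold dilution series, and viability is reported as mean±S.D. over six wells. The following is a minimal sketch of both calculations. It assumes the sample standard deviation (n-1 denominator), which the text does not specify, and the function names are illustrative only.

```python
import statistics

# Illustrative helpers; names and the choice of sample S.D. are assumptions.

def fivefold_series(start, n_points):
    """Concentrations start, 5*start, 25*start, ... (n_points values)."""
    return [start * 5 ** i for i in range(n_points)]


def mean_sd(values):
    """Mean and sample standard deviation (n-1 denominator) of replicates."""
    return statistics.mean(values), statistics.stdev(values)


ADRIAMYCIN_UG_PER_ML = fivefold_series(0.2, 5)

if __name__ == "__main__":
    print([round(c, 3) for c in ADRIAMYCIN_UG_PER_ML])
```

For each concentration, `mean_sd` would be applied to the six replicate viability readings of that condition.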
[0202] MCF7 cells and MCF7-ADR cells were also analyzed by flow cytometry for expression of CAR, CD46, P-glycoprotein (MDR), CD24 and CD44. 5×10^5 MCF7-ADR cells were suspended in 100 μl of 2% FCS-containing PBS, and FITC-labeled mouse anti-human CD24 antibody and PE-labeled mouse anti-human CD44 antibody were each added thereto in a volume of 1 μl, followed by reaction for 1 hour on ice under light-shielded conditions. After washing with 4 ml of 2% FCS-containing PBS, the suspension was centrifuged at 1500 rpm for 5 minutes to remove the supernatant by aspiration. The cells were suspended again in 100 μl of 2% FCS-containing PBS and subjected to a cell sorter (FACS Aria II cell sorter; BD Biosciences) to sort a CD24-negative and CD44-positive cell fraction. The data obtained were analyzed by FCS multi-color data analysis software (Flowjo). In human breast cancer cells, a fraction having the characteristics of CD24-negative and CD44-positive cells is known to be cancer stem cells (Al-Hajj M., et al., Proc Natl Acad Sci USA, 100; 3983-3988, (2003)). The cells were infected with the virus of the present invention in the same manner as shown in Example 4.
[0203] As a result, MCF7-ADR cells showed significantly high viability even in the presence of adriamycin when compared to MCF7 cells and hence were found to have drug resistance ability (FIG. 7A). MCF7-ADR cells were also found to highly express CAR and CD46 as in the case of MCF7 cells. Moreover, MCF7-ADR cells were also found to highly express MDR, which is a membrane protein responsible for drug elimination ability (FIG. 7B). Further, when CD24-negative and CD44-positive cells among MCF7-ADR cells were infected with Ad34 fiber 142-3pT(E1,E3), 80% or more of the cells were GFP-positive. In contrast, about 70% of the cells were GFP-positive in the case of conventionally used TelomeScan (FIG. 7C).
[0204] These results indicated that the recombinant virus of the present invention allowed detection of drug-resistant cancer cells. Moreover, it was also indicated that the recombinant virus of the present invention allowed detection of cancer stem cells.
Example 7
Detection of Cancer Cells in Blood Samples Using Ad34 Fiber 142-3pT(E1,E3)
[0205] H1299 cells or T24 cells were infected with a lentivirus vector expressing a red fluorescent protein (monomeric red fluorescent protein; RFP) at an MOI of 100 and cultured. To obtain cell clones, the cells were then seeded in a 96-well plate at 0.1 cells/well and cultured until colonies were formed. RFP-expressing cells were selected under a fluorescence microscope and subjected to extended culture, followed by flow cytometry to measure the intensity of RFP expression. Then, cells showing high intensity of RFP expression were identified as RFP-expressing cells.
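Seeding at an average of 0.1 cells/well is a limiting-dilution approach to obtaining clones. As a rough illustration, and assuming that the number of cells per well follows a Poisson distribution (an assumption not stated in the text), the following sketch estimates how many wells of a 96-well plate are expected to contain cells and what fraction of those wells start from a single cell. All names are illustrative.

```python
import math

# Illustrative limiting-dilution arithmetic under a Poisson assumption.

def poisson_pmf(k, mean):
    """Probability of exactly k cells in a well for the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def occupied_probability(mean):
    """Probability that a well receives at least one cell."""
    return 1.0 - poisson_pmf(0, mean)


def single_cell_fraction(mean):
    """Among occupied wells, the fraction seeded with exactly one cell."""
    return poisson_pmf(1, mean) / occupied_probability(mean)


MEAN_CELLS_PER_WELL = 0.1
WELLS_PER_PLATE = 96

if __name__ == "__main__":
    occupied = WELLS_PER_PLATE * occupied_probability(MEAN_CELLS_PER_WELL)
    print(f"expected occupied wells per plate: {occupied:.1f}")
    print(f"single-cell fraction: {single_cell_fraction(MEAN_CELLS_PER_WELL):.3f}")
```

Under this model, most wells remain empty, and the large majority of colonies that do form are expected to derive from a single cell.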
[0206] Human peripheral blood mononuclear cells (hPBMCs) obtained from 1.0 mL of human peripheral blood were suspended in 800 μL of RPMI-1640 medium (10% FCS, supplemented with antibiotics). To the hPBMC suspension, cancer cells prepared at 1.0×10^5 or 5.0×10^5 cells/mL were added in a volume of 100 μL (in FIG. 8, "spiked cancer cells" represents the number of cancer cells added to the hPBMC suspension). Further, a conditionally replicating Ad suspension prepared at 2×10^8 pfu/mL was added in a volume of 100 μL to give a total volume of 1 mL, followed by culture at 37° C. for 24 hours while slowly rotating with a rotator.
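The composition of the spiked sample can be checked with simple arithmetic. The following sketch takes the cancer cell volume as 100 μL, which is the value consistent with the stated total volume of 1 mL, and computes the number of cancer cells and the amount of virus added. The names are illustrative and are not part of the original disclosure.

```python
# Illustrative composition check for the spiked hPBMC sample of Example 7.

def amount_in_volume(conc_per_ml, volume_ul):
    """Number of cells (or pfu) contained in volume_ul at conc_per_ml."""
    return conc_per_ml * volume_ul / 1000.0


HPBMC_SUSPENSION_UL = 800.0
CANCER_CELL_UL = 100.0   # assumed; consistent with the 1 mL total
VIRUS_UL = 100.0

TOTAL_VOLUME_UL = HPBMC_SUSPENSION_UL + CANCER_CELL_UL + VIRUS_UL

# Cancer cell stocks at 1.0e5 and 5.0e5 cells/mL
SPIKED_CELLS = [amount_in_volume(c, CANCER_CELL_UL) for c in (1.0e5, 5.0e5)]

# Virus stock at 2e8 pfu/mL
VIRUS_PFU = amount_in_volume(2e8, VIRUS_UL)

if __name__ == "__main__":
    print(f"total volume: {TOTAL_VOLUME_UL:g} uL")
    print(f"spiked cancer cells: {[int(n) for n in SPIKED_CELLS]}")
    print(f"virus added: {VIRUS_PFU:.0e} pfu")
```

With these inputs, the sample contains 1×10^4 or 5×10^4 spiked cancer cells and 2×10^7 pfu of virus in a total volume of 1 mL.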
[0207] The cell suspension cultured for 24 hours after virus infection was centrifuged at 300×g for 5 minutes to remove the supernatant. A cell fixative was added in a volume of 200 μL and reacted at 4° C. under light-shielded conditions for 15 minutes. After addition of 1 mL PBS, the suspension was centrifuged at 300×g for 5 minutes to remove the supernatant. The cells were suspended in 2% FCS-containing PBS and measured for GFP-positive rate using a flow cytometer (MACS Quant Analyzer; Miltenyi Biotec). The data obtained were analyzed by FCS multi-color data analysis software (Flowjo).
[0208] In this study, cancer cells labeled with RFP (red fluorescent protein) were mixed into hPBMCs to examine whether the cancer cells in hPBMCs could be detected. As a result, in the case of CAR-positive cancer cells (H1299), TelomeScan (Ad5 fiber) and Ad34 fiber 142-3pT(E1,E3) were both able to detect 80% or more of the cancer cells. On the other hand, in the case of CAR-negative cancer cells (T24), TelomeScan (Ad5 fiber) achieved very low detection efficiency (about 10% of the cells were detected as being GFP-positive), whereas Ad34 fiber 142-3pT(E1,E3) was able to detect 80% or more of the cancer cells (FIG. 8).
[0209] This result indicated that the recombinant adenovirus of the present invention allowed efficient detection of not only CAR-positive cancer cells but also CAR-negative cancer cells.
INDUSTRIAL APPLICABILITY
[0210] Reagents comprising the recombinant adenovirus of the present invention enable simple and highly sensitive detection of CAR-negative cancer cells without detection of normal blood cells (e.g., leukocytes).
SEQUENCE LISTING FREE TEXT
[0211] SEQ ID NO: 4: synthetic DNA
[0212] SEQ ID NOs: 5 to 26: synthetic RNA
[0213] SEQ ID NOs: 27 to 28: synthetic DNA
[0214] SEQ ID NOs: 43 to 46: synthetic DNA
[0215] SEQ ID NO: 50: synthetic DNA
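As an illustrative consistency check on the sequence listing, the synthetic DNAs of SEQ ID NOs: 27 and 28 can be searched for sites complementary to the RNA of SEQ ID NO: 5, which matches the published mature miR-142-3p sequence. The following is a minimal sketch; the sequences are transcribed from the listing with spacing and position numbers removed, and the function name is hypothetical.

```python
# Illustrative check: count perfectly complementary miRNA target sites.
# Sequences are transcribed from the sequence listing (spaces removed).

SEQ_5_RNA = "uguaguguuuccuacuuuaugga"  # SEQ ID NO: 5, 23 nt

SEQ_27_DNA = (
    "gcggcctccataaagtaggaaacactacac"
    "agctccataaagtaggaaacactacattat"
    "aagcggtac"
)  # SEQ ID NO: 27, 69 nt

SEQ_28_DNA = (
    "ggcctccataaagtaggaaacactacacag"
    "ctccataaagtaggaaacactacattaatt"
    "ccataaagtaggaaacactacaccactcca"
    "taaagtaggaaacactacagtac"
)  # SEQ ID NO: 28, 113 nt


def reverse_complement_as_dna(rna):
    """Reverse complement of a lowercase RNA string, written as DNA."""
    pairs = {"a": "t", "u": "a", "g": "c", "c": "g"}
    return "".join(pairs[base] for base in reversed(rna))


TARGET_SITE = reverse_complement_as_dna(SEQ_5_RNA)

if __name__ == "__main__":
    print("target site:", TARGET_SITE)
    print("copies in SEQ ID NO: 27:", SEQ_27_DNA.count(TARGET_SITE))
    print("copies in SEQ ID NO: 28:", SEQ_28_DNA.count(TARGET_SITE))
```

In this check, SEQ ID NO: 27 contains two copies and SEQ ID NO: 28 contains four copies of the perfectly complementary site, separated by short spacer sequences.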
SEQUENCE LISTING
501455DNAHomo sapiens 1tggcccctcc ctcgggttac cccacagcct aggccgattc
gacctctctc cgctggggcc 60ctcgctggcg tccctgcacc ctgggagcgc gagcggcgcg
cgggcgggga agcgcggccc 120agacccccgg gtccgcccgg agcagctgcg ctgtcggggc
caggccgggc tcccagtgga 180ttcgcgggca cagacgccca ggaccgcgct ccccacgtgg
cggagggact ggggacccgg 240gcacccgtcc tgccccttca ccttccagct ccgcctcctc
cgcgcggacc ccgccccgtc 300ccgacccctc ccgggtcccc ggcccagccc cctccgggcc
ctcccagccc ctccccttcc 360tttccgcggc cccgccctct cctcgcggcg cgagtttcag
gcagcgctgc gtcctgctgc 420gcacgtggga agccctggcc ccggccaccc ccgcg
4552899DNAAdenovirus 2acaccgggac tgaaaatgag
acatattatc tgccacggag gtgttattac cgaagaaatg 60gccgccagtc ttttggacca
gctgatcgaa gaggtactgg ctgataatct tccacctcct 120agccattttg aaccacctac
ccttcacgaa ctgtatgatt tagacgtgac ggcccccgaa 180gatcccaacg aggaggcggt
ttcgcagatt tttcccgact ctgtaatgtt ggcggtgcag 240gaagggattg acttactcac
ttttccgccg gcgcccggtt ctccggagcc gcctcacctt 300tcccggcagc ccgagcagcc
ggagcagaga gccttgggtc cggtttctat gccaaacctt 360gtaccggagg tgatcgatct
tacctgccac gaggctggct ttccacccag tgacgacgag 420gatgaagagg gtgaggagtt
tgtgttagat tatgtggagc accccgggca cggttgcagg 480tcttgtcatt atcaccggag
gaatacgggg gacccagata ttatgtgttc gctttgctat 540atgaggacct gtggcatgtt
tgtctacagt cctgtgtctg aacctgagcc tgagcccgag 600ccagaaccgg agcctgcaag
acctacccgc cgtcctaaaa tggcgcctgc tatcctgaga 660cgcccgacat cacctgtgtc
tagagaatgc aatagtagta cggatagctg tgactccggt 720ccttctaaca cacctcctga
gatacacccg gtggtcccgc tgtgccccat taaaccagtt 780gccgtgagag ttggtgggcg
tcgccaggct gtggaatgta tcgaggactt gcttaacgag 840cctgggcaac ctttggactt
gagctgtaaa cgccccaggc cataaggtgt aaacctgtg 89931823DNAAdenovirus
3ctgacctcat ggaggcttgg gagtgtttgg aagatttttc tgctgtgcgt aacttgctgg
60aacagagctc taacagtacc tcttggtttt ggaggtttct gtggggctca tcccaggcaa
120agttagtctg cagaattaag gaggattaca agtgggaatt tgaagagctt ttgaaatcct
180gtggtgagct gtttgattct ttgaatctgg gtcaccaggc gcttttccaa gagaaggtca
240tcaagacttt ggatttttcc acaccggggc gcgctgcggc tgctgttgct tttttgagtt
300ttataaagga taaatggagc gaagaaaccc atctgagcgg ggggtacctg ctggattttc
360tggccatgca tctgtggaga gcggttgtga gacacaagaa tcgcctgcta ctgttgtctt
420ccgtccgccc ggcgataata ccgacggagg agcagcagca gcagcaggag gaagccaggc
480ggcggcggca ggagcagagc ccatggaacc cgagagccgg cctggaccct cgggaatgaa
540tgttgtacag gtggctgaac tgtatccaga actgagacgc attttgacaa ttacagagga
600tgggcagggg ctaaaggggg taaagaggga gcggggggct tgtgaggcta cagaggaggc
660taggaatcta gcttttagct taatgaccag acaccgtcct gagtgtatta cttttcaaca
720gatcaaggat aattgcgcta atgagcttga tctgctggcg cagaagtatt ccatagagca
780gctgaccact tactggctgc agccagggga tgattttgag gaggctatta gggtatatgc
840aaaggtggca cttaggccag attgcaagta caagatcagc aaacttgtaa atatcaggaa
900ttgttgctac atttctggga acggggccga ggtggagata gatacggagg atagggtggc
960ctttagatgt agcatgataa atatgtggcc gggggtgctt ggcatggacg gggtggttat
1020tatgaatgta aggtttactg gccccaattt tagcggtacg gttttcctgg ccaataccaa
1080ccttatccta cacggtgtaa gcttctatgg gtttaacaat acctgtgtgg aagcctggac
1140cgatgtaagg gttcggggct gtgcctttta ctgctgctgg aagggggtgg tgtgtcgccc
1200caaaagcagg gcttcaatta agaaatgcct ctttgaaagg tgtaccttgg gtatcctgtc
1260tgagggtaac tccagggtgc gccacaatgt ggcctccgac tgtggttgct tcatgctagt
1320gaaaagcgtg gctgtgatta agcataacat ggtatgtggc aactgcgagg acagggcctc
1380tcagatgctg acctgctcgg acggcaactg tcacctgctg aagaccattc acgtagccag
1440ccactctcgc aaggcctggc cagtgtttga gcataacata ctgacccgct gttccttgca
1500tttgggtaac aggagggggg tgttcctacc ttaccaatgc aatttgagtc acactaagat
1560attgcttgag cccgagagca tgtccaaggt gaacctgaac ggggtgtttg acatgaccat
1620gaagatctgg aaggtgctga ggtacgatga gacccgcacc aggtgcagac cctgcgagtg
1680tggcggtaaa catattagga accagcctgt gatgctggat gtgaccgagg agctgaggcc
1740cgatcacttg gtgctggcct gcacccgcgc tgagtttggc tctagcgatg aagatacaga
1800ttgaggtact gaaatgtgtg ggc
18234605DNAArtificialsynthetic DNA 4tgcatctagg gcggccaatt ccgcccctct
ccctcccccc cccctaacgt tactggccga 60agccgcttgg aataaggccg gtgtgcgttt
gtctatatgt gattttccac catattgccg 120tcttttggca atgtgagggc ccggaaacct
ggccctgtct tcttgacgag cattcctagg 180ggtctttccc ctctcgccaa aggaatgcaa
ggtctgttga atgtcgtgaa ggaagcagtt 240cctctggaag cttcttgaag acaaacaacg
tctgtagcga ccctttgcag gcagcggaac 300cccccacctg gcgacaggtg cctctgcggc
caaaagccac gtgtataaga tacacctgca 360aaggcggcac aaccccagtg ccacgttgtg
agttggatag ttgtggaaag agtcaaatgg 420ctctcctcaa gcgtattcaa caaggggctg
aaggatgccc agaaggtacc ccattgtatg 480ggatctgatc tggggcctcg gtgcacatgc
tttacatgtg tttagtcgag gttaaaaaaa 540cgtctaggcc ccccgaacca cggggacgtg
gttttccttt gaaaaacacg atgataagct 600tgcca
605523RNAArtificialsynthetic RNA
5uguaguguuu ccuacuuuau gga
23621RNAArtificialsynthetic RNA 6cauaaaguag aaagcacuac u
21722RNAArtificialsynthetic RNA 7uagcagcaca
uaaugguuug ug
22822RNAArtificialsynthetic RNA 8caggccauau ugugcugccu ca
22922RNAArtificialsynthetic RNA 9uagcagcacg
uaaauauugg cg
221022RNAArtificialsynthetic RNA 10ccaguauuaa cugugcugcu ga
221122RNAArtificialsynthetic RNA
11uagcuuauca gacugauguu ga
221221RNAArtificialsynthetic RNA 12caacaccagu cgaugggcug u
211322RNAArtificialsynthetic RNA
13ucguaccgug aguaauaaug cg
221421RNAArtificialsynthetic RNA 14cauuauuacu uuugguacgc g
211523RNAArtificialsynthetic RNA
15aacauucaac gcugucggug agu
231622RNAArtificialsynthetic RNA 16ugucaguuug ucaaauaccc ca
221722RNAArtificialsynthetic RNA
17cguguauuug acaagcugag uu
221822RNAArtificialsynthetic RNA 18gaggguuggg uggaggcucu cc
221921RNAArtificialsynthetic RNA
19agggcccccc cucaauccug u
212024RNAArtificialsynthetic RNA 20ucccugagac ccuuuaaccu guga
242121RNAArtificialsynthetic RNA
21ugagaugaag cacuguagcu c
212222RNAArtificialsynthetic RNA 22ggugcagugc ugcaucucug gu
222323RNAArtificialsynthetic RNA
23guccaguuuu cccaggaauc ccu
232422RNAArtificialsynthetic RNA 24ggauuccugg aaauacuguu cu
222523RNAArtificialsynthetic RNA
25cccaguguuc agacuaccug uuc
232622RNAArtificialsynthetic RNA 26ugagguagua gguuguauag uu
222769DNAArtificialsynthetic DNA
27gcggcctcca taaagtagga aacactacac agctccataa agtaggaaac actacattat
60aagcggtac
6928113DNAArtificialsynthetic DNA 28ggcctccata aagtaggaaa cactacacag
ctccataaag taggaaacac tacattaatt 60ccataaagta ggaaacacta caccactcca
taaagtagga aacactacag tac 113291307DNAHomo sapiens 29accgtccagg
gagcaggtag ctgctgggct ccggggacac tttgcgttcg ggctgggagc 60gtgctttcca
cgacggtgac acgcttccct ggattggcag ccagactgcc ttccgggtca 120ctgccatgga
ggagccgcag tcagatccta gcgtcgagcc ccctctgagt caggaaacat 180tttcagacct
atggaaacta cttcctgaaa acaacgttct gtcccccttg ccgtcccaag 240caatggatga
tttgatgctg tccccggacg atattgaaca atggttcact gaagacccag 300gtccagatga
agctcccaga atgccagagg ctgctccccg cgtggcccct gcaccagcga 360ctcctacacc
ggcggcccct gcaccagccc cctcctggcc cctgtcatct tctgtccctt 420cccagaaaac
ctaccagggc agctacggtt tccgtctggg cttcttgcat tctgggacag 480ccaagtctgt
gacttgcacg tactcccctg ccctcaacaa gatgttttgc caactggcca 540agacctgccc
tgtgcagctg tgggttgatt ccacaccccc gcccggcacc cgcgtccgcg 600ccatggccat
ctacaagcag tcacagcaca tgacggaggt tgtgaggcgc tgcccccacc 660atgagcgctg
ctcagatagc gatggtctgg cccctcctca gcatcttatc cgagtggaag 720gaaatttgcg
tgtggagtat ttggatgaca gaaacacttt tcgacatagt gtggtggtgc 780cctatgagcc
gcctgaggtt ggctctgact gtaccaccat ccactacaac tacatgtgta 840acagttcctg
catgggcggc atgaaccgga ggcccatcct caccatcatc acactggaag 900actccagtgg
taatctactg ggacggaaca gctttgaggt gcgtgtttgt gcctgtcctg 960ggagagaccg
gcgcacagag gaagagaatc tccgcaagaa aggggagcct caccacgagc 1020tgcccccagg
gagcactaag cgagcactgc ccaacaacac cagctcctct ccccagccaa 1080agaagaaacc
actggatgga gaatatttca cccttcagat ccgtgggcgt gagcgcttcg 1140agatgttccg
agagctgaat gaggccttgg aactcaagga tgcccaggct gggaaggagc 1200caggggggag
cagggctcac tccagccacc tgaagtccaa aaagggtcag tctacctccc 1260gccataaaaa
actcatgttc aagacagaag ggcctgactc agactga 130730837DNAHomo
sapiens 30gaggactccg cgacggtccg caccctgcgg ccagagcggc tttgagctcg
gctgcttccg 60cgctaggcgc tttttcccag aagcaatcca ggcgcgcccg ctggttcttg
agcgccagga 120aaagcccgga gctaacgacc ggccgctcgg cactgcacgg ggccccaagc
cgcagaagaa 180ggacgacggg agggtaatga agctgagccc aggtctccta ggaaggagag
agtgcgccgg 240agcagcgtgg gaaagaaggg aagagtgtcg ttaagtttac ggccaacggt
ggattatccg 300ggccgctgcg cgtctggggg ctgcggaatg cgcgaggaga acaagggcat
gcccagtggg 360ggcggcagcg atgagggtct ggccacgccg gcgcggggac tagtggagaa
ggtgcgacac 420tcctgggaag ccggcgcgga tcccaacgga gtcaaccgtt tcgggaggcg
cgcgatccag 480gtcatgatga tgggcagcgc ccgcgtggcg gagctgctgc tgctccacgg
cgcggagccc 540aactgcgcag accctgccac tctcacccga ccggtgcatg atgctgcccg
ggagggcttc 600ctggacacgc tggtggtgct gcaccgggcc ggggcgcggc tggacgtgcg
cgatgcctgg 660ggtcgtctgc ccgtggactt ggccgaggag cggggccacc gcgacgttgc
agggtacctg 720cgcacagcca cgggggactg acgccaggtt ccccagccgc ccacaacgac
tttattttct 780tacccaattt cccaccccca cccacctaat tcgatgaagg ctgccaacgg
ggagcgg 83731987DNAHomo sapiens 31cggagagggg gagaacagac aacgggcggc
ggggagcagc atggagccgg cggcggggag 60cagcatggag ccttcggctg actggctggc
cacggccgcg gcccggggtc gggtagagga 120ggtgcgggcg ctgctggagg cgggggcgct
gcccaacgca ccgaatagtt acggtcggag 180gccgatccag gtcatgatga tgggcagcgc
ccgagtggcg gagctgctgc tgctccacgg 240cgcggagccc aactgcgccg accccgccac
tctcacccga cccgtgcacg acgctgcccg 300ggagggcttc ctggacacgc tggtggtgct
gcaccgggcc ggggcgcggc tggacgtgcg 360cgatgcctgg ggccgtctgc ccgtggacct
ggctgaggag ctgggccatc gcgatgtcgc 420acggtacctg cgcgcggctg cggggggcac
cagaggcagt aaccatgccc gcatagatgc 480cgcggaaggt ccctcagaca tccccgattg
aaagaaccag agaggctctg agaaacctcg 540ggaaacttag atcatcagtc accgaaggtc
ctacagggcc acaactgccc ccgccacaac 600ccaccccgct ttcgtagttt tcatttagaa
aatagagctt ttaaaaatgt cctgcctttt 660aacgtagata taagccttcc cccactaccg
taaatgtcca tttatatcat tttttatata 720ttcttataaa aatgtaaaaa agaaaaacac
cgcttctgcc ttttcactgt gttggagttt 780tctggagtga gcactcacgc cctaagcgca
cattcatgtg ggcatttctt gcgagcctcg 840cagcctccgg aagctgtcga cttcatgaca
agcattttgt gaactaggga agctcagggg 900ggttactggc ttctcttgag tcacactgct
agcaaatggc agaaccaaag ctcaaataaa 960aataaaataa ttttcattca ttcactc
987328972DNAHomo sapiens 32gtccaagggt
agccaaggat ggctgcagct tcatatgatc agttgttaaa gcaagttgag 60gcactgaaga
tggagaactc aaatcttcga caagagctag aagataattc caatcatctt 120acaaaactgg
aaactgaggc atctaatatg aaggaagtac ttaaacaact acaaggaagt 180attgaagatg
aagctatggc ttcttctgga cagattgatt tattagagcg tcttaaagag 240cttaacttag
atagcagtaa tttccctgga gtaaaactgc ggtcaaaaat gtccctccgt 300tcttatggaa
gccgggaagg atctgtatca agccgttctg gagagtgcag tcctgttcct 360atgggttcat
ttccaagaag agggtttgta aatggaagca gagaaagtac tggatattta 420gaagaacttg
agaaagagag gtcattgctt cttgctgatc ttgacaaaga agaaaaggaa 480aaagactggt
attacgctca acttcagaat ctcactaaaa gaatagatag tcttccttta 540actgaaaatt
tttccttaca aacagatatg accagaaggc aattggaata tgaagcaagg 600caaatcagag
ttgcgatgga agaacaacta ggtacctgcc aggatatgga aaaacgagca 660cagcgaagaa
tagccagaat tcagcaaatc gaaaaggaca tacttcgtat acgacagctt 720ttacagtccc
aagcaacaga agcagagagg tcatctcaga acaagcatga aaccggctca 780catgatgctg
agcggcagaa tgaaggtcaa ggagtgggag aaatcaacat ggcaacttct 840ggtaatggtc
agggttcaac tacacgaatg gaccatgaaa cagccagtgt tttgagttct 900agtagcacac
actctgcacc tcgaaggctg acaagtcatc tgggaaccaa ggtggaaatg 960gtgtattcat
tgttgtcaat gcttggtact catgataagg atgatatgtc gcgaactttg 1020ctagctatgt
ctagctccca agacagctgt atatccatgc gacagtctgg atgtcttcct 1080ctcctcatcc
agcttttaca tggcaatgac aaagactctg tattgttggg aaattcccgg 1140ggcagtaaag
aggctcgggc cagggccagt gcagcactcc acaacatcat tcactcacag 1200cctgatgaca
agagaggcag gcgtgaaatc cgagtccttc atcttttgga acagatacgc 1260gcttactgtg
aaacctgttg ggagtggcag gaagctcatg aaccaggcat ggaccaggac 1320aaaaatccaa
tgccagctcc tgttgaacat cagatctgtc ctgctgtgtg tgttctaatg 1380aaactttcat
ttgatgaaga gcatagacat gcaatgaatg aactaggggg actacaggcc 1440attgcagaat
tattgcaagt ggactgtgaa atgtacgggc ttactaatga ccactacagt 1500attacactaa
gacgatatgc tggaatggct ttgacaaact tgacttttgg agatgtagcc 1560aacaaggcta
cgctatgctc tatgaaaggc tgcatgagag cacttgtggc ccaactaaaa 1620tctgaaagtg
aagacttaca gcaggttatt gcaagtgttt tgaggaattt gtcttggcga 1680gcagatgtaa
atagtaaaaa gacgttgcga gaagttggaa gtgtgaaagc attgatggaa 1740tgtgctttag
aagttaaaaa ggaatcaacc ctcaaaagcg tattgagtgc cttatggaat 1800ttgtcagcac
attgcactga gaataaagct gatatatgtg ctgtagatgg tgcacttgca 1860tttttggttg
gcactcttac ttaccggagc cagacaaaca ctttagccat tattgaaagt 1920ggaggtggga
tattacggaa tgtgtccagc ttgatagcta caaatgagga ccacaggcaa 1980atcctaagag
agaacaactg tctacaaact ttattacaac acttaaaatc tcatagtttg 2040acaatagtca
gtaatgcatg tggaactttg tggaatctct cagcaagaaa tcctaaagac 2100caggaagcat
tatgggacat gggggcagtt agcatgctca agaacctcat tcattcaaag 2160cacaaaatga
ttgctatggg aagtgctgca gctttaagga atctcatggc aaataggcct 2220gcgaagtaca
aggatgccaa tattatgtct cctggctcaa gcttgccatc tcttcatgtt 2280aggaaacaaa
aagccctaga agcagaatta gatgctcagc acttatcaga aacttttgac 2340aatatagaca
atttaagtcc caaggcatct catcgtagta agcagagaca caagcaaagt 2400ctctatggtg
attatgtttt tgacaccaat cgacatgatg ataataggtc agacaatttt 2460aatactggca
acatgactgt cctttcacca tatttgaata ctacagtgtt acccagctcc 2520tcttcatcaa
gaggaagctt agatagttct cgttctgaaa aagatagaag tttggagaga 2580gaacgcggaa
ttggtctagg caactaccat ccagcaacag aaaatccagg aacttcttca 2640aagcgaggtt
tgcagatctc caccactgca gcccagattg ccaaagtcat ggaagaagtg 2700tcagccattc
atacctctca ggaagacaga agttctgggt ctaccactga attacattgt 2760gtgacagatg
agagaaatgc acttagaaga agctctgctg cccatacaca ttcaaacact 2820tacaatttca
ctaagtcgga aaattcaaat aggacatgtt ctatgcctta tgccaaatta 2880gaatacaaga
gatcttcaaa tgatagttta aatagtgtca gtagtagtga tggttatggt 2940aaaagaggtc
aaatgaaacc ctcgattgaa tcctattctg aagatgatga aagtaagttt 3000tgcagttatg
gtcaataccc agccgaccta gcccataaaa tacatagtgc aaatcatatg 3060gatgataatg
atggagaact agatacacca ataaattata gtcttaaata ttcagatgag 3120cagttgaact
ctggaaggca aagtccttca cagaatgaaa gatgggcaag acccaaacac 3180ataatagaag
atgaaataaa acaaagtgag caaagacaat caaggaatca aagtacaact 3240tatcctgttt
atactgagag cactgatgat aaacacctca agttccaacc acattttgga 3300cagcaggaat
gtgtttctcc atacaggtca cggggagcca atggttcaga aacaaatcga 3360gtgggttcta
atcatggaat taatcaaaat gtaagccagt ctttgtgtca agaagatgac 3420tatgaagatg
ataagcctac caattatagt gaacgttact ctgaagaaga acagcatgaa 3480gaagaagaga
gaccaacaaa ttatagcata aaatataatg aagagaaacg tcatgtggat 3540cagcctattg
attatagttt aaaatatgcc acagatattc cttcatcaca gaaacagtca 3600ttttcattct
caaagagttc atctggacaa agcagtaaaa ccgaacatat gtcttcaagc 3660agtgagaata
cgtccacacc ttcatctaat gccaagaggc agaatcagct ccatccaagt 3720tctgcacaga
gtagaagtgg tcagcctcaa aaggctgcca cttgcaaagt ttcttctatt 3780aaccaagaaa
caatacagac ttattgtgta gaagatactc caatatgttt ttcaagatgt 3840agttcattat
catctttgtc atcagctgaa gatgaaatag gatgtaatca gacgacacag 3900gaagcagatt
ctgctaatac cctgcaaata gcagaaataa aagaaaagat tggaactagg 3960tcagctgaag
atcctgtgag cgaagttcca gcagtgtcac agcaccctag aaccaaatcc 4020agcagactgc
agggttctag tttatcttca gaatcagcca ggcacaaagc tgttgaattt 4080tcttcaggag
cgaaatctcc ctccaaaagt ggtgctcaga cacccaaaag tccacctgaa 4140cactatgttc
aggagacccc actcatgttt agcagatgta cttctgtcag ttcacttgat 4200agttttgaga
gtcgttcgat tgccagctcc gttcagagtg aaccatgcag tggaatggta 4260agtggcatta
taagccccag tgatcttcca gatagccctg gacaaaccat gccaccaagc 4320agaagtaaaa
cacctccacc acctcctcaa acagctcaaa ccaagcgaga agtacctaaa 4380aataaagcac
ctactgctga aaagagagag agtggaccta agcaagctgc agtaaatgct 4440gcagttcaga
gggtccaggt tcttccagat gctgatactt tattacattt tgccacggaa 4500agtactccag
atggattttc ttgttcatcc agcctgagtg ctctgagcct cgatgagcca 4560tttatacaga
aagatgtgga attaagaata atgcctccag ttcaggaaaa tgacaatggg 4620aatgaaacag
aatcagagca gcctaaagaa tcaaatgaaa accaagagaa agaggcagaa 4680aaaactattg
attctgaaaa ggacctatta gatgattcag atgatgatga tattgaaata 4740ctagaagaat
gtattatttc tgccatgcca acaaagtcat cacgtaaagc aaaaaagcca 4800gcccagactg
cttcaaaatt acctccacct gtggcaagga aaccaagtca gctgcctgtg 4860tacaaacttc
taccatcaca aaacaggttg caaccccaaa agcatgttag ttttacaccg 4920ggggatgata
tgccacgggt gtattgtgtt gaagggacac ctataaactt ttccacagct 4980acatctctaa
gtgatctaac aatcgaatcc cctccaaatg agttagctgc tggagaagga 5040gttagaggag
gagcacagtc aggtgaattt gaaaaacgag ataccattcc tacagaaggc 5100agaagtacag
atgaggctca aggaggaaaa acctcatctg taaccatacc tgaattggat 5160gacaataaag
cagaggaagg tgatattctt gcagaatgca ttaattctgc tatgcccaaa 5220gggaaaagtc
acaagccttt ccgtgtgaaa aagataatgg accaggtcca gcaagcatct 5280gcgtcgtctt
ctgcacccaa caaaaatcag ttagatggta agaaaaagaa accaacttca 5340ccagtaaaac
ctataccaca aaatactgaa tataggacac gtgtaagaaa aaatgcagac 5400tcaaaaaata
atttaaatgc tgagagagtt ttctcagaca acaaagattc aaagaaacag 5460aatttgaaaa
ataattccaa ggacttcaat gataagctcc caaataatga agatagagtc 5520agaggaagtt
ttgcttttga ttcacctcat cattacacgc ctattgaagg aactccttac 5580tgtttttcac
gaaatgattc tttgagttct ctagattttg atgatgatga tgttgacctt 5640tccagggaaa
aggctgaatt aagaaaggca aaagaaaata aggaatcaga ggctaaagtt 5700accagccaca
cagaactaac ctccaaccaa caatcagcta ataagacaca agctattgca 5760aagcagccaa
taaatcgagg tcagcctaaa cccatacttc agaaacaatc cacttttccc 5820cagtcatcca
aagacatacc agacagaggg gcagcaactg atgaaaagtt acagaatttt 5880gctattgaaa
atactccagt ttgcttttct cataattcct ctctgagttc tctcagtgac 5940attgaccaag
aaaacaacaa taaagaaaat gaacctatca aagagactga gccccctgac 6000tcacagggag
aaccaagtaa acctcaagca tcaggctatg ctcctaaatc atttcatgtt 6060gaagataccc
cagtttgttt ctcaagaaac agttctctca gttctcttag tattgactct 6120gaagatgacc
tgttgcagga atgtataagc tccgcaatgc caaaaaagaa aaagccttca 6180agactcaagg
gtgataatga aaaacatagt cccagaaata tgggtggcat attaggtgaa 6240gatctgacac
ttgatttgaa agatatacag agaccagatt cagaacatgg tctatcccct 6300gattcagaaa
attttgattg gaaagctatt caggaaggtg caaattccat agtaagtagt 6360ttacatcaag
ctgctgctgc tgcatgttta tctagacaag cttcgtctga ttcagattcc 6420atcctttccc
tgaaatcagg aatctctctg ggatcaccat ttcatcttac acctgatcaa 6480gaagaaaaac
cctttacaag taataaaggc ccacgaattc taaaaccagg ggagaaaagt 6540acattggaaa
ctaaaaagat agaatctgaa agtaaaggaa tcaaaggagg aaaaaaagtt 6600tataaaagtt
tgattactgg aaaagttcga tctaattcag aaatttcagg ccaaatgaaa 6660cagccccttc
aagcaaacat gccttcaatc tctcgaggca ggacaatgat tcatattcca 6720ggagttcgaa
atagctcctc aagtacaagt cctgtttcta aaaaaggccc accccttaag 6780actccagcct
ccaaaagccc tagtgaaggt caaacagcca ccacttctcc tagaggagcc 6840aagccatctg
tgaaatcaga attaagccct gttgccaggc agacatccca aataggtggg 6900tcaagtaaag
caccttctag atcaggatct agagattcga ccccttcaag acctgcccag 6960caaccattaa
gtagacctat acagtctcct ggccgaaact caatttcccc tggtagaaat 7020ggaataagtc
ctcctaacaa attatctcaa cttccaagga catcatcccc tagtactgct 7080tcaactaagt
cctcaggttc tggaaaaatg tcatatacat ctccaggtag acagatgagc 7140caacagaacc
ttaccaaaca aacaggttta tccaagaatg ccagtagtat tccaagaagt 7200gagtctgcct
ccaaaggact aaatcagatg aataatggta atggagccaa taaaaaggta 7260gaactttcta
gaatgtcttc aactaaatca agtggaagtg aatctgatag atcagaaaga 7320cctgtattag
tacgccagtc aactttcatc aaagaagctc caagcccaac cttaagaaga 7380aaattggagg
aatctgcttc atttgaatct ctttctccat catctagacc agcttctccc 7440actaggtccc
aggcacaaac tccagtttta agtccttccc ttcctgatat gtctctatcc 7500acacattcgt
ctgttcaggc tggtggatgg cgaaaactcc cacctaatct cagtcccact 7560atagagtata
atgatggaag accagcaaag cgccatgata ttgcacggtc tcattctgaa 7620agtccttcta
gacttccaat caataggtca ggaacctgga aacgtgagca cagcaaacat 7680tcatcatccc
ttcctcgagt aagcacttgg agaagaactg gaagttcatc ttcaattctt 7740tctgcttcat
cagaatccag tgaaaaagca aaaagtgagg atgaaaaaca tgtgaactct 7800atttcaggaa
ccaaacaaag taaagaaaac caagtatccg caaaaggaac atggagaaaa 7860ataaaagaaa
atgaattttc tcccacaaat agtacttctc agaccgtttc ctcaggtgct 7920acaaatggtg
ctgaatcaaa gactctaatt tatcaaatgg cacctgctgt ttctaaaaca 7980gaggatgttt
gggtgagaat tgaggactgt cccattaaca atcctagatc tggaagatct 8040cccacaggta
atactccccc ggtgattgac agtgtttcag aaaaggcaaa tccaaacatt 8100aaagattcaa
aagataatca ggcaaaacaa aatgtgggta atggcagtgt tcccatgcgt 8160accgtgggtt
tggaaaatcg cctgaactcc tttattcagg tggatgcccc tgaccaaaaa 8220ggaactgaga
taaaaccagg acaaaataat cctgtccctg tatcagagac taatgaaagt 8280tctatagtgg
aacgtacccc attcagttct agcagctcaa gcaaacacag ttcacctagt 8340gggactgttg
ctgccagagt gactcctttt aattacaacc caagccctag gaaaagcagc 8400gcagatagca
cttcagctcg gccatctcag atcccaactc cagtgaataa caacacaaag 8460aagcgagatt
ccaaaactga cagcacagaa tccagtggaa cccaaagtcc taagcgccat 8520tctgggtctt
accttgtgac atctgtttaa aagagaggaa gaatgaaact aagaaaattc 8580tatgttaatt
acaactgcta tatagacatt ttgtttcaaa tgaaacttta aaagactgaa 8640aaattttgta
aataggtttg attcttgtta gagggttttt gttctggaag ccatatttga 8700tagtatactt
tgtcttcact ggtcttattt tgggaggcac tcttgatggt taggaaaaaa 8760atagtaaagc
caagtatgtt tgtacagtat gttttacatg tatttaaagt agcacccatc 8820ccaacttcct
ttaattattg cttgtcttaa aataatgaac actacagata gaaaatatga 8880tatattgctg
ttatcaatca tttctagatt ataaactgac taaacttaca tcagggaaaa 8940attggtattt
atgcaaaaaa aaatgttttt gt
8972335711DNAHomo sapiens 33agctcgctga gacttcctgg accccgcacc aggctgtggg
gtttctcaga taactgggcc 60cctgcgctca ggaggccttc accctctgct ctgggtaaag
ttcattggaa cagaaagaaa 120tggatttatc tgctcttcgc gttgaagaag tacaaaatgt
cattaatgct atgcagaaaa 180tcttagagtg tcccatctgt ctggagttga tcaaggaacc
tgtctccaca aagtgtgacc 240acatattttg caaattttgc atgctgaaac ttctcaacca
gaagaaaggg ccttcacagt 300gtcctttatg taagaatgat ataaccaaaa ggagcctaca
agaaagtacg agatttagtc 360aacttgttga agagctattg aaaatcattt gtgcttttca
gcttgacaca ggtttggagt 420atgcaaacag ctataatttt gcaaaaaagg aaaataactc
tcctgaacat ctaaaagatg 480aagtttctat catccaaagt atgggctaca gaaaccgtgc
caaaagactt ctacagagtg 540aacccgaaaa tccttccttg caggaaacca gtctcagtgt
ccaactctct aaccttggaa 600ctgtgagaac tctgaggaca aagcagcgga tacaacctca
aaagacgtct gtctacattg 660aattgggatc tgattcttct gaagataccg ttaataaggc
aacttattgc agtgtgggag 720atcaagaatt gttacaaatc acccctcaag gaaccaggga
tgaaatcagt ttggattctg 780caaaaaaggc tgcttgtgaa ttttctgaga cggatgtaac
aaatactgaa catcatcaac 840ccagtaataa tgatttgaac accactgaga agcgtgcagc
tgagaggcat ccagaaaagt 900atcagggtag ttctgtttca aacttgcatg tggagccatg
tggcacaaat actcatgcca 960gctcattaca gcatgagaac agcagtttat tactcactaa
agacagaatg aatgtagaaa 1020aggctgaatt ctgtaataaa agcaaacagc ctggcttagc
aaggagccaa cataacagat 1080gggctggaag taaggaaaca tgtaatgata ggcggactcc
cagcacagaa aaaaaggtag 1140atctgaatgc tgatcccctg tgtgagagaa aagaatggaa
taagcagaaa ctgccatgct 1200cagagaatcc tagagatact gaagatgttc cttggataac
actaaatagc agcattcaga 1260aagttaatga gtggttttcc agaagtgatg aactgttagg
ttctgatgac tcacatgatg 1320gggagtctga atcaaatgcc aaagtagctg atgtattgga
cgttctaaat gaggtagatg 1380aatattctgg ttcttcagag aaaatagact tactggccag
tgatcctcat gaggctttaa 1440tatgtaaaag tgaaagagtt cactccaaat cagtagagag
taatattgaa gacaaaatat 1500ttgggaaaac ctatcggaag aaggcaagcc tccccaactt
aagccatgta actgaaaatc 1560taattatagg agcatttgtt actgagccac agataataca
agagcgtccc ctcacaaata 1620aattaaagcg taaaaggaga cctacatcag gccttcatcc
tgaggatttt atcaagaaag 1680cagatttggc agttcaaaag actcctgaaa tgataaatca
gggaactaac caaacggagc 1740agaatggtca agtgatgaat attactaata gtggtcatga
gaataaaaca aaaggtgatt 1800ctattcagaa tgagaaaaat cctaacccaa tagaatcact
cgaaaaagaa tctgctttca 1860aaacgaaagc tgaacctata agcagcagta taagcaatat
ggaactcgaa ttaaatatcc 1920acaattcaaa agcacctaaa aagaataggc tgaggaggaa
gtcttctacc aggcatattc 1980atgcgcttga actagtagtc agtagaaatc taagcccacc
taattgtact gaattgcaaa 2040ttgatagttg ttctagcagt gaagagataa agaaaaaaaa
gtacaaccaa atgccagtca 2100ggcacagcag aaacctacaa ctcatggaag gtaaagaacc
tgcaactgga gccaagaaga 2160gtaacaagcc aaatgaacag acaagtaaaa gacatgacag
cgatactttc ccagagctga 2220agttaacaaa tgcacctggt tcttttacta agtgttcaaa
taccagtgaa cttaaagaat 2280ttgtcaatcc tagccttcca agagaagaaa aagaagagaa
actagaaaca gttaaagtgt 2340ctaataatgc tgaagacccc aaagatctca tgttaagtgg
agaaagggtt ttgcaaactg 2400aaagatctgt agagagtagc agtatttcat tggtacctgg
tactgattat ggcactcagg 2460aaagtatctc gttactggaa gttagcactc tagggaaggc
aaaaacagaa ccaaataaat 2520gtgtgagtca gtgtgcagca tttgaaaacc ccaagggact
aattcatggt tgttccaaag 2580ataatagaaa tgacacagaa ggctttaagt atccattggg
acatgaagtt aaccacagtc 2640gggaaacaag catagaaatg gaagaaagtg aacttgatgc
tcagtatttg cagaatacat 2700tcaaggtttc aaagcgccag tcatttgctc cgttttcaaa
tccaggaaat gcagaagagg 2760aatgtgcaac attctctgcc cactctgggt ccttaaagaa
acaaagtcca aaagtcactt 2820ttgaatgtga acaaaaggaa gaaaatcaag gaaagaatga
gtctaatatc aagcctgtac 2880agacagttaa tatcactgca ggctttcctg tggttggtca
gaaagataag ccagttgata 2940atgccaaatg tagtatcaaa ggaggctcta ggttttgtct
atcatctcag ttcagaggca 3000acgaaactgg actcattact ccaaataaac atggactttt
acaaaaccca tatcgtatac 3060caccactttt tcccatcaag tcatttgtta aaactaaatg
taagaaaaat ctgctagagg 3120aaaactttga ggaacattca atgtcacctg aaagagaaat
gggaaatgag aacattccaa 3180gtacagtgag cacaattagc cgtaataaca ttagagaaaa
tgtttttaaa gaagccagct 3240caagcaatat taatgaagta ggttccagta ctaatgaagt
gggctccagt attaatgaaa 3300taggttccag tgatgaaaac attcaagcag aactaggtag
aaacagaggg ccaaaattga 3360atgctatgct tagattaggg gttttgcaac ctgaggtcta
taaacaaagt cttcctggaa 3420gtaattgtaa gcatcctgaa ataaaaaagc aagaatatga
agaagtagtt cagactgtta 3480atacagattt ctctccatat ctgatttcag ataacttaga
acagcctatg ggaagtagtc 3540atgcatctca ggtttgttct gagacacctg atgacctgtt
agatgatggt gaaataaagg 3600aagatactag ttttgctgaa aatgacatta aggaaagttc
tgctgttttt agcaaaagcg 3660tccagaaagg agagcttagc aggagtccta gccctttcac
ccatacacat ttggctcagg 3720gttaccgaag aggggccaag aaattagagt cctcagaaga
gaacttatct agtgaggatg 3780aagagcttcc ctgcttccaa cacttgttat ttggtaaagt
aaacaatata ccttctcagt 3840ctactaggca tagcaccgtt gctaccgagt gtctgtctaa
gaacacagag gagaatttat 3900tatcattgaa gaatagctta aatgactgca gtaaccaggt
aatattggca aaggcatctc 3960aggaacatca ccttagtgag gaaacaaaat gttctgctag
cttgttttct tcacagtgca 4020gtgaattgga agacttgact gcaaatacaa acacccagga
tcctttcttg attggttctt 4080ccaaacaaat gaggcatcag tctgaaagcc agggagttgg
tctgagtgac aaggaattgg 4140tttcagatga tgaagaaaga ggaacgggct tggaagaaaa
taatcaagaa gagcaaagca 4200tggattcaaa cttaggtgaa gcagcatctg ggtgtgagag
tgaaacaagc gtctctgaag 4260actgctcagg gctatcctct cagagtgaca ttttaaccac
tcagcagagg gataccatgc 4320aacataacct gataaagctc cagcaggaaa tggctgaact
agaagctgtg ttagaacagc 4380atgggagcca gccttctaac agctaccctt ccatcataag
tgactcttct gcccttgagg 4440acctgcgaaa tccagaacaa agcacatcag aaaaagcagt
attaacttca cagaaaagta 4500gtgaataccc tataagccag aatccagaag gcctttctgc
tgacaagttt gaggtgtctg 4560cagatagttc taccagtaaa aataaagaac caggagtgga
aaggtcatcc ccttctaaat 4620gcccatcatt agatgatagg tggtacatgc acagttgctc
tgggagtctt cagaatagaa 4680actacccatc tcaagaggag ctcattaagg ttgttgatgt
ggaggagcaa cagctggaag 4740agtctgggcc acacgatttg acggaaacat cttacttgcc
aaggcaagat ctagagggaa 4800ccccttacct ggaatctgga atcagcctct tctctgatga
ccctgaatct gatccttctg 4860aagacagagc cccagagtca gctcgtgttg gcaacatacc
atcttcaacc tctgcattga 4920aagttcccca attgaaagtt gcagaatctg cccagagtcc
agctgctgct catactactg 4980atactgctgg gtataatgca atggaagaaa gtgtgagcag
ggagaagcca gaattgacag 5040cttcaacaga aagggtcaac aaaagaatgt ccatggtggt
gtctggcctg accccagaag 5100aatttatgct cgtgtacaag tttgccagaa aacaccacat
cactttaact aatctaatta 5160ctgaagagac tactcatgtt gttatgaaaa cagatgctga
gtttgtgtgt gaacggacac 5220tgaaatattt tctaggaatt gcgggaggaa aatgggtagt
tagctatttc tgggtgaccc 5280agtctattaa agaaagaaaa atgctgaatg agcatgattt
tgaagtcaga ggagatgtgg 5340tcaatggaag aaaccaccaa ggtccaaagc gagcaagaga
atcccaggac agaaagatct 5400tcagggggct agaaatctgt tgctatgggc ccttcaccaa
catgcccaca gatcaactgg 5460aatggatggt acagctgtgt ggtgcttctg tggtgaagga
gctttcatca ttcacccttg 5520gcacaggtgt ccacccaatt gtggttgtgc agccagatgc
ctggacagag gacaatggct 5580tccatgcaat tgggcagatg tgtgaggcac ctgtggtgac
ccgagagtgg gtgttggaca 5640gtgtagcact ctaccagtgc caggagctgg acacctacct
gataccccag atcccccaca 5700gccactactg a
5711
34 2680 DNA Homo sapiens
34 ggttatcctg aatacatgtc
taacaatttt ccttgcaacg ttagctgttg tttttcactg 60tttccaaagg atcaaaattg
cttcagaaat tggagacata tttgatttaa aaggaaaaac 120ttgaacaaat ggacaatatg
tctattacga atacaccaac aagtaatgat gcctgtctga 180gcattgtgca tagtttgatg
tgccatagac aaggtggaga gagtgaaaca tttgcaaaaa 240gagcaattga aagtttggta
aagaagctga aggagaaaaa agatgaattg gattctttaa 300taacagctat aactacaaat
ggagctcatc ctagtaaatg tgttaccata cagagaacat 360tggatgggag gcttcaggtg
gctggtcgga aaggatttcc tcatgtgatc tatgcccgtc 420tctggaggtg gcctgatctt
cacaaaaatg aactaaaaca tgttaaatat tgtcagtatg 480cgtttgactt aaaatgtgat
agtgtctgtg tgaatccata tcactacgaa cgagttgtat 540cacctggaat tgatctctca
ggattaacac tgcagagtaa tgctccatca agtatgatgg 600tgaaggatga atatgtgcat
gactttgagg gacagccatc gttgtccact gaaggacatt 660caattcaaac catccagcat
ccaccaagta atcgtgcatc gacagagaca tacagcaccc 720cagctctgtt agccccatct
gagtctaatg ctaccagcac tgccaacttt cccaacattc 780ctgtggcttc cacaagtcag
cctgccagta tactgggggg cagccatagt gaaggactgt 840tgcagatagc atcagggcct
cagccaggac agcagcagaa tggatttact ggtcagccag 900ctacttacca tcataacagc
actaccacct ggactggaag taggactgca ccatacacac 960ctaatttgcc tcaccaccaa
aacggccatc ttcagcacca cccgcctatg ccgccccatc 1020ccggacatta ctggcctgtt
cacaatgagc ttgcattcca gcctcccatt tccaatcatc 1080ctgctcctga gtattggtgt
tccattgctt actttgaaat ggatgttcag gtaggagaga 1140catttaaggt tccttcaagc
tgccctattg ttactgttga tggatacgtg gacccttctg 1200gaggagatcg cttttgtttg
ggtcaactct ccaatgtcca caggacagaa gccattgaga 1260gagcaaggtt gcacataggc
aaaggtgtgc agttggaatg taaaggtgaa ggtgatgttt 1320gggtcaggtg ccttagtgac
cacgcggtct ttgtacagag ttactactta gacagagaag 1380ctgggcgtgc acctggagat
gctgttcata agatctaccc aagtgcatat ataaaggtct 1440ttgatttgcg tcagtgtcat
cgacagatgc agcagcaggc ggctactgca caagctgcag 1500cagctgccca ggcagcagcc
gtggcaggaa acatccctgg cccaggatca gtaggtggaa 1560tagctccagc tatcagtctg
tcagctgctg ctggaattgg tgttgatgac cttcgtcgct 1620tatgcatact caggatgagt
tttgtgaaag gctggggacc ggattaccca agacagagca 1680tcaaagaaac accttgctgg
attgaaattc acttacaccg ggccctccag ctcctagacg 1740aagtacttca taccatgccg
attgcagacc cacaaccttt agactgaggt cttttaccgt 1800tggggccctt aaccttatca
ggatggtgga ctacaaaata caatcctgtt tataatctga 1860agatatattt cacttttctt
ctgctttatc ttttcataaa gggttgaaaa tgtgtttgct 1920gccttgctcc tagcagacag
aaactggatt aaaacaattt ttttttcctc ttcagaactt 1980gtcaggcatg gctcagagct
tgaagattag gagaaacaca ttcttattaa ttcttcacct 2040gttatgtatg aaggaatcat
tccagtgcta gaaaatttag ccctttaaaa cgtcttagag 2100ccttttatct gcagaacatc
gatatgtata tcattctaca gaataatcca gtattgctga 2160ttttaaaggc agagaagttc
tcaaagttaa ttcacctatg ttattttgtg tacaagttgt 2220tattgttgaa catacttcaa
aaataatgtg ccatgtgggt gagttaattt taccaagagt 2280aactttactc tgtgtttaaa
aatgaagtta ataatgtatt gtaatctttc atccaaaata 2340ttttttgcaa gttatattag
tgaagatggt ttcaattcag attgtcttgc aacttcagtt 2400ttatttttgc caaggcaaaa
aactcttaat ctgtgtgtat attgagaatc ccttaaaatt 2460accagacaaa aaaatttaaa
attacgtttg ttattcctag tggatgactg ttgatgaagt 2520atacttttcc cctgttaaac
agtagttgta ttcttctgta tttctaggca caaggttggt 2580tgctaagaag cctataagag
gaatttcttt tccttcattc atagggaaag gttttgtatt 2640ttttaaaaca ctaaaagcag
cgtcactcta cctaatgtct 2680
35 1095 DNA Homo sapiens
35 tccccgctct gctctgtccg gtcacaggac tttttgccct ctgttcccgg gtccctcagg
60cggccaccca gtgggcacac tcccaggcgg cgctccggcc ccgcgctccc tccctctgcc
120tttcattccc agctgtcaac atcctggaag ctttgaagct caggaaagaa gagaaatcca
180ctgagaacag tctgtaaagg tccgtagtgc tatctacatc cagacggtgg aagggagaga
240aagagaaaga aggtatccta ggaatacctg cctgcttaga ccctctataa aagctctgtg
300catcctgcca ctgaggactc cgaagaggta gcagtcttct gaaagacttc aactgtgagg
360acatgtcgtt cagatttggc caacatctca tcaagccctc tgtagtgttt ctcaaaacag
420aactgtcctt cgctcttgtg aataggaaac ctgtggtacc aggacatgtc cttgtgtgcc
480cgctgcggcc agtggagcgc ttccatgacc tgcgtcctga tgaagtggcc gatttgtttc
540agacgaccca gagagtcggg acagtggtgg aaaaacattt ccatgggacc tctctcacct
600tttccatgca ggatggcccc gaagccggac agactgtgaa gcacgttcac gtccatgttc
660ttcccaggaa ggctggagac tttcacagga atgacagcat ctatgaggag ctccagaaac
720atgacaagga ggactttcct gcctcttgga gatcagagga ggaaatggca gcagaagccg
780cagctctgcg ggtctacttt cagtgacaca gatgtttttc agatcctgaa ttccagcaaa
840agagctattg ccaaccagtt tgaagaccgc ccccccgcct ctccccaaga ggaactgaat
900cagcatgaaa atgcagtttc ttcatctcac catcctgtat tcttcaacca gtgatccccc
960acctcggtca ctccaactcc cttaaaatac ctagacctaa acggctcaga caggcagatt
1020tgaggtttcc ccctgtctcc ttattcggca gccttatgat taaacttcct tctctgctgc
1080aaaaaaaaaa aaaaa
1095
36 2234 DNA Homo sapiens
36 aggggacgca gcgaaaccgg ggcccgcgcc aggccagccg
ggacggacgc cgatgcccgg 60ggctgcgacg gctgcagagc gagctgccct cggaggccgg
cgtggggaag atggcccagt 120ccaccgccac ctcccctgat gggggcacca cgtttgagca
cctctggagc tctctggaac 180cagacagcac ctacttcgac cttccccagt caagccgggg
gaataatgag gtggtgggcg 240gaacggattc cagcatggac gtcttccacc tggagggcat
gactacatct gtcatggccc 300agttcaatct gctgagcagc accatggacc agatgagcag
ccgcgcggcc tcggccagcc 360cctacacccc agagcacgcc gccagcgtgc ccacccactc
gccctacgca caacccagct 420ccaccttcga caccatgtcg ccggcgcctg tcatcccctc
caacaccgac taccccggac 480cccaccactt tgaggtcact ttccagcagt ccagcacggc
caagtcagcc acctggacgt 540actccccgct cttgaagaaa ctctactgcc agatcgccaa
gacatgcccc atccagatca 600aggtgtccac cccgccaccc ccaggcactg ccatccgggc
catgcctgtt tacaagaaag 660cggagcacgt gaccgacgtc gtgaaacgct gccccaacca
cgagctcggg agggacttca 720acgaaggaca gtctgctcca gccagccacc tcatccgcgt
ggaaggcaat aatctctcgc 780agtatgtgga tgaccctgtc accggcaggc agagcgtcgt
ggtgccctat gagccaccac 840aggtggggac ggaattcacc accatcctgt acaacttcat
gtgtaacagc agctgtgtag 900ggggcatgaa ccggcggccc atcctcatca tcatcaccct
ggagatgcgg gatgggcagg 960tgctgggccg ccggtccttt gagggccgca tctgcgcctg
tcctggccgc gaccgaaaag 1020ctgatgagga ccactaccgg gagcagcagg ccctgaacga
gagctccgcc aagaacgggg 1080ccgccagcaa gcgtgccttc aagcagagcc cccctgccgt
ccccgccctt ggtgccggtg 1140tgaagaagcg gcggcatgga gacgaggaca cgtactacct
tcaggtgcga ggccgggaga 1200actttgagat cctgatgaag ctgaaagaga gcctggagct
gatggagttg gtgccgcagc 1260cactggtgga ctcctatcgg cagcagcagc agctcctaca
gaggccgagt cacctacagc 1320ccccgtccta cgggccggtc ctctcgccca tgaacaaggt
gcacgggggc atgaacaagc 1380tgccctccgt caaccagctg gtgggccagc ctcccccgca
cagttcggca gctacaccca 1440acctggggcc cgtgggcccc gggatgctca acaaccatgg
ccacgcagtg ccagccaacg 1500gcgagatgag cagcagccac agcgcccagt ccatggtctc
ggggtcccac tgcactccgc 1560caccccccta ccacgccgac cccagcctcg tcagtttttt
aacaggattg gggtgtccaa 1620actgcatcga gtatttcacc tcccaagggt tacagagcat
ttaccacctg cagaacctga 1680ccattgagga cctgggggcc ctgaagatcc ccgagcagta
ccgcatgacc atctggcggg 1740gcctgcagga cctgaagcag ggccacgact acagcaccgc
gcagcagctg ctccgctcta 1800gcaacgcggc caccatctcc atcggcggct caggggaact
gcagcgccag cgggtcatgg 1860aggccgtgca cttccgcgtg cgccacacca tcaccatccc
caaccgcggc ggcccaggcg 1920gcggccctga cgagtgggcg gacttcggct tcgacctgcc
cgactgcaag gcccgcaagc 1980agcccatcaa ggaggagttc acggaggccg agatccactg
agggcctcgc ctggctgcag 2040cctgcgccac cgcccagaga cccaagctgc ctcccctctc
cttcctgtgt gtccaaaact 2100gcctcaggag gcaggacctt cgggctgtgc ccggggaaag
gcaaggtccg gcccatcccc 2160aggcacctca caggccccag gaaaggccca gccaccgaag
ccgcctgtgg acagcctgag 2220tcacctgcag aacc
2234
37 4344 DNA Homo sapiens
37 atggcctcgg ctggtaacgc
cgccgagccc caggaccgcg gcggcggcgg cagcggctgt 60atcggtgccc cgggacggcc
ggctggaggc gggaggcgca gacggacggg ggggctgcgc 120cgtgctgccg cgccggaccg
ggactatctg caccggccca gctactgcga cgccgccttc 180gctctggagc agatttccaa
ggggaaggct actggccgga aagcgccact gtggctgaga 240gcgaagtttc agagactctt
atttaaactg ggttgttaca ttcaaaaaaa ctgcggcaag 300ttcttggttg tgggcctcct
catatttggg gccttcgcgg tgggattaaa agcagcgaac 360ctcgagacca acgtggagga
gctgtgggtg gaagttggag gacgagtaag tcgtgaatta 420aattatactc gccagaagat
tggagaagag gctatgttta atcctcaact catgatacag 480acccctaaag aagaaggtgc
taatgtcctg accacagaag cgctcctaca acacctggac 540tcggcactcc aggccagccg
tgtccatgta tacatgtaca acaggcagtg gaaattggaa 600catttgtgtt acaaatcagg
agagcttatc acagaaacag gttacatgga tcagataata 660gaatatcttt acccttgttt
gattattaca cctttggact gcttctggga aggggcgaaa 720ttacagtctg ggacagcata
cctcctaggt aaacctcctt tgcggtggac aaacttcgac 780cctttggaat tcctggaaga
gttaaagaaa ataaactatc aagtggacag ctgggaggaa 840atgctgaata aggctgaggt
tggtcatggt tacatggacc gcccctgcct caatccggcc 900gatccagact gccccgccac
agcccccaac aaaaattcaa ccaaacctct tgatatggcc 960cttgttttga atggtggatg
tcatggctta tccagaaagt atatgcactg gcaggaggag 1020ttgattgtgg gtggcacagt
caagaacagc actggaaaac tcgtcagcgc ccatgccctg 1080cagaccatgt tccagttaat
gactcccaag caaatgtacg agcacttcaa ggggtacgag 1140tatgtctcac acatcaactg
gaacgaggac aaagcggcag ccatcctgga ggcctggcag 1200aggacatatg tggaggtggt
tcatcagagt gtcgcacaga actccactca aaaggtgctt 1260tccttcacca ccacgaccct
ggacgacatc ctgaaatcct tctctgacgt cagtgtcatc 1320cgcgtggcca gcggctactt
actcatgctc gcctatgcct gtctaaccat gctgcgctgg 1380gactgctcca agtcccaggg
tgccgtgggg ctggctggcg tcctgctggt tgcactgtca 1440gtggctgcag gactgggcct
gtgctcattg atcggaattt cctttaacgc tgcaacaact 1500caggttttgc catttctcgc
tcttggtgtt ggtgtggatg atgtttttct tctggcccac 1560gccttcagtg aaacaggaca
gaataaaaga atcccttttg aggacaggac cggggagtgc 1620ctgaagcgca caggagccag
cgtggccctc acgtccatca gcaatgtcac agccttcttc 1680atggccgcgt taatcccaat
tcccgctctg cgggcgttct ccctccaggc agcggtagta 1740gtggtgttca attttgccat
ggttctgctc atttttcctg caattctcag catggattta 1800tatcgacgcg aggacaggag
actggatatt ttctgctgtt ttacaagccc ctgcgtcagc 1860agagtgattc aggttgaacc
tcaggcctac accgacacac acgacaatac ccgctacagc 1920cccccacctc cctacagcag
ccacagcttt gcccatgaaa cgcagattac catgcagtcc 1980actgtccagc tccgcacgga
gtacgacccc cacacgcacg tgtactacac caccgctgag 2040ccgcgctccg agatctctgt
gcagcccgtc accgtgacac aggacaccct cagctgccag 2100agcccagaga gcaccagctc
cacaagggac ctgctctccc agttctccga ctccagcctc 2160cactgcctcg agcccccctg
tacgaagtgg acactctcat cttttgctga gaagcactat 2220gctcctttcc tcttgaaacc
aaaagccaag gtagtggtga tcttcctttt tctgggcttg 2280ctgggggtca gcctttatgg
caccacccga gtgagagacg ggctggacct tacggacatt 2340gtacctcggg aaaccagaga
atatgacttt attgctgcac aattcaaata cttttctttc 2400tacaacatgt atatagtcac
ccagaaagca gactacccga atatccagca cttactttac 2460gacctacaca ggagtttcag
taacgtgaag tatgtcatgt tggaagaaaa caaacagctt 2520cccaaaatgt ggctgcacta
cttcagagac tggcttcagg gacttcagga tgcatttgac 2580agtgactggg aaaccgggaa
aatcatgcca aacaattaca agaatggatc agacgatgga 2640gtccttgcct acaaactcct
ggtgcaaacc ggcagccgcg ataagcccat cgacatcagc 2700cagttgacta aacagcgtct
ggtggatgca gatggcatca ttaatcccag cgctttctac 2760atctacctga cggcttgggt
cagcaacgac cccgtcgcgt atgctgcctc ccaggccaac 2820atccggccac accgaccaga
atgggtccac gacaaagccg actacatgcc tgaaacaagg 2880ctgagaatcc cggcagcaga
gcccatcgag tatgcccagt tccctttcta cctcaacggg 2940ttgcgggaca cctcagactt
tgtggaggca attgaaaaag taaggaccat ctgcagcaac 3000tatacgagcc tggggctgtc
cagttacccc aacggctacc ccttcctctt ctgggagcag 3060tacatcggcc tccgccactg
gctgctgctg ttcatcagcg tggtgttggc ctgcacattc 3120ctcgtgtgcg ctgtcttcct
tctgaacccc tggacggccg ggatcattgt gatggtcctg 3180gcgctgatga cggtcgagct
gttcggcatg atgggcctca tcggaatcaa gctcagtgcc 3240gtgcccgtgg tcatcctgat
cgcttctgtt ggcataggag tggagttcac cgttcacgtt 3300gctttggcct ttctgacggc
catcggcgac aagaaccgca gggctgtgct tgccctggag 3360cacatgtttg cacccgtcct
ggatggcgcc gtgtccactc tgctgggagt gctgatgctg 3420gcgggatctg agttcgactt
cattgtcagg tatttctttg ctgtgctggc gatcctcacc 3480atcctcggcg ttctcaatgg
gctggttttg cttcccgtgc ttttgtcttt ctttggacca 3540tatcctgagg tgtctccagc
caacggcttg aaccgcctgc ccacaccctc ccctgagcca 3600ccccccagcg tggtccgctt
cgccatgccg cccggccaca cgcacagcgg gtctgattcc 3660tccgactcgg agtatagttc
ccagacgaca gtgtcaggcc tcagcgagga gcttcggcac 3720tacgaggccc agcagggcgc
gggaggccct gcccaccaag tgatcgtgga agccacagaa 3780aaccccgtct tcgcccactc
cactgtggtc catcccgaat ccaggcatca cccaccctcg 3840aacccgagac agcagcccca
cctggactca gggtccctgc ctcccggacg gcaaggccag 3900cagccccgca gggacccccc
cagagaaggc ttgtggccac ccctctacag accgcgcaga 3960gacgcttttg aaatttctac
tgaagggcat tctggcccta gcaatagggc ccgctggggc 4020cctcgcgggg cccgttctca
caaccctcgg aacccagcgt ccactgccat gggcagctcc 4080gtgcccggct actgccagcc
catcaccact gtgacggctt ctgcctccgt gactgtcgcc 4140gtgcacccgc cgcctgtccc
tgggcctggg cggaaccccc gagggggact ctgcccaggc 4200taccctgaga ctgaccacgg
cctgtttgag gacccccacg tgcctttcca cgtccggtgt 4260gagaggaggg attcgaaggt
ggaagtcatt gagctgcagg acgtggaatg cgaggagagg 4320ccccggggaa gcagctccaa
ctga 4344
38 4740 DNA Homo sapiens
38 ttccggtttt tctcagggga cgttgaaatt atttttgtaa cgggagtcgg gagaggacgg
60ggcgtgcccc gcgtgcgcgc gcgtcgtcct ccccggcgct cctccacagc tcgctggctc
120ccgccgcgga aaggcgtcat gccgcccaaa accccccgaa aaacggccgc caccgccgcc
180gctgccgccg cggaaccccc ggcaccgccg ccgccgcccc ctcctgagga ggacccagag
240caggacagcg gcccggagga cctgcctctc gtcaggcttg agtttgaaga aacagaagaa
300cctgatttta ctgcattatg tcagaaatta aagataccag atcatgtcag agagagagct
360tggttaactt gggagaaagt ttcatctgtg gatggagtat tgggaggtta tattcaaaag
420aaaaaggaac tgtggggaat ctgtatcttt attgcacgag ttgacctaga tgagatgtcg
480ttcactttac tgagctacag aaaaacatac gaaatcagtg tccataaatt ctttaactta
540ctaaaagaaa ttgataccag taccaaagtt gataatgcta tgtcaagact gttgaagaag
600tatgatgtat tgtttgcact cttcagcaaa ttggaaagga catgtgaact tatatatttg
660acacaaccca gcagttcgat atctactgaa ataaattctg cattggtgct aaaagtttct
720tggatcacat ttttattagc taaaggggaa gtattacaaa tggaagatga tctggtgatt
780tcatttcagt taatgctatg tgtccttgac tattttatta aactctcacc tcccatgttg
840ctcaaagaac catataaaac agctgttata cccattaatg gttcacctcg aacacccagg
900cgaggtcaga acaggagtgc acggatagca aaacaactag aaaatgatac aagaattatt
960gaagttctct gtaaagaaca tgaatgtaat atagatgagg tgaaaaatgt ttatttcaaa
1020aattttatac cttttatgaa ttctcttgga cttgtaacat ctaatggact tccagaggtt
1080gaaaatcttt ctaaacgata cgaagaaatt tatcttaaaa ataaagatct agatcgaaga
1140ttatttttgg atcatgataa aactcttcag actgattcta tagacagttt tgaaacacag
1200agaacaccac gaaaaagtaa ccttgatgaa gaggtgaata taattcctcc acacactcca
1260gttaggactg ttatgaacac tatccaacaa ttaatgatga ttttaaattc tgcaagtgat
1320caaccttcag aaaatctgat ttcctatttt aacaactgca cagtgaatcc aaaagaaagt
1380atactgaaaa gagtgaagga tataggatac atctttaaag agaaatttgc taaagctgtg
1440ggacagggtt gtgtcgaaat tggatcacag cgatacaaac ttggagttcg cttgtattac
1500cgagtaatgg aatccatgct taaatcagaa gaagaacgat tatccattca aaattttagc
1560aaacttctga atgacaacat ttttcatatg tctttattgg cgtgcgctct tgaggttgta
1620atggccacat atagcagaag tacatctcag aatcttgatt ctggaacaga tttgtctttc
1680ccatggattc tgaatgtgct taatttaaaa gcctttgatt tttacaaagt gatcgaaagt
1740tttatcaaag cagaaggcaa cttgacaaga gaaatgataa aacatttaga acgatgtgaa
1800catcgaatca tggaatccct tgcatggctc tcagattcac ctttatttga tcttattaaa
1860caatcaaagg accgagaagg accaactgat caccttgaat ctgcttgtcc tcttaatctt
1920cctctccaga ataatcacac tgcagcagat atgtatcttt ctcctgtaag atctccaaag
1980aaaaaaggtt caactacgcg tgtaaattct actgcaaatg cagagacaca agcaacctca
2040gccttccaga cccagaagcc attgaaatct acctctcttt cactgtttta taaaaaagtg
2100tatcggctag cctatctccg gctaaataca ctttgtgaac gccttctgtc tgagcaccca
2160gaattagaac atatcatctg gacccttttc cagcacaccc tgcagaatga gtatgaactc
2220atgagagaca ggcatttgga ccaaattatg atgtgttcca tgtatggcat atgcaaagtg
2280aagaatatag accttaaatt caaaatcatt gtaacagcat acaaggatct tcctcatgct
2340gttcaggaga cattcaaacg tgttttgatc aaagaagagg agtatgattc tattatagta
2400ttctataact cggtcttcat gcagagactg aaaacaaata ttttgcagta tgcttccacc
2460aggcccccta ccttgtcacc aatacctcac attcctcgaa gcccttacaa gtttcctagt
2520tcacccttac ggattcctgg agggaacatc tatatttcac ccctgaagag tccatataaa
2580atttcagaag gtctgccaac accaacaaaa atgactccaa gatcaagaat cttagtatca
2640attggtgaat cattcgggac ttctgagaag ttccagaaaa taaatcagat ggtatgtaac
2700agcgaccgtg tgctcaaaag aagtgctgaa ggaagcaacc ctcctaaacc actgaaaaaa
2760ctacgctttg atattgaagg atcagatgaa gcagatggaa gtaaacatct cccaggagag
2820tccaaatttc agcagaaact ggcagaaatg acttctactc gaacacgaat gcaaaagcag
2880aaaatgaatg atagcatgga tacctcaaac aaggaagaga aatgaggatc tcaggacctt
2940ggtggacact gtgtacacct ctggattcat tgtctctcac agatgtgact gtataacttt
3000cccaggttct gtttatggcc acatttaata tcttcagctc tttttgtgga tataaaatgt
3060gcagatgcaa ttgtttgggt gagtcctaag ccacttgaaa tgttagtcat tgttatttat
3120acaagattga aaatcttgtg taaatcctgc catttaaaaa gttgtagcag attgtttcct
3180cttccaaagt aaaattgctg tgctttatgg atagtaagaa tggccctaga gtgggagtcc
3240tgataaccca ggcctgtctg actactttgc cttcttttgt agcatatagg tgatgtttgc
3300tcttgttttt attaatttat atgtatattt ttttaattta acatgaacac ccttagaaaa
3360tgtgtcctat ctatcttcca aatgcaattt gattgactgc ccattcacca aaattatcct
3420gaactcttct gcaaaaatgg atattattag aaattagaaa aaaattacta attttacaca
3480ttagatttta ttttactatt ggaatctgat atactgtgtg cttgttttat aaaattttgc
3540ttttaattaa ataaaagctg gaagcaaagt ataaccatat gatactatca tactactgaa
3600acagatttca tacctcagaa tgtaaaagaa cttactgatt attttcttca tccaacttat
3660gtttttaaat gaggattatt gatagtactc ttggttttta taccattcag atcactgaat
3720ttataaagta cccatctagt acttgaaaaa gtaaagtgtt ctgccagatc ttaggtatag
3780aggaccctaa cacagtatat cccaagtgca ctttctaatg tttctgggtc ctgaagaatt
3840aagatacaaa ttaattttac tccataaaca gactgttaat tataggagcc ttaatttttt
3900tttcatagag atttgtctaa ttgcatctca aaattattct gccctcctta atttgggaag
3960gtttgtgttt tctctggaat ggtacatgtc ttccatgtat cttttgaact ggcaattgtc
4020tatttatctt ttattttttt aagtcagtat ggtctaacac tggcatgttc aaagccacat
4080tatttctagt ccaaaattac aagtaatcaa gggtcattat gggttaggca ttaatgtttc
4140tatctgattt tgtgcaaaag cttcaaatta aaacagctgc attagaaaaa gaggcgcttc
4200tcccctcccc tacacctaaa ggtgtattta aactatcttg tgtgattaac ttatttagag
4260atgctgtaac ttaaaatagg ggatatttaa ggtagcttca gctagctttt aggaaaatca
4320ctttgtctaa ctcagaatta tttttaaaaa gaaatctggt cttgttagaa aacaaaattt
4380tattttgtgc tcatttaagt ttcaaactta ctattttgac agttattttg ataacaatga
4440cactagaaaa cttgactcca tttcatcatt gtttctgcat gaatatcata caaatcagtt
4500agtttttagg tcaagggctt actatttctg ggtcttttgc tactaagttc acattagaat
4560tagtgccaga attttaggaa cttcagagat cgtgtattga gatttcttaa ataatgcttc
4620agatattatt gctttattgc ttttttgtat tggttaaaac tgtacattta aaattgctat
4680gttactattt tctacaatta atagtttgtc tattttaaaa taaattagtt gttaagagtc
4740
39 4608 DNA Homo sapiens
39 atggagaata gtcttagatg tgtttgggta cccaagctgg
cttttgtact cttcggagct 60tccttgctca gcgcgcatct tcaagtaacc ggttttcaaa
ttaaagcttt cacagcactg 120cgcttcctct cagaaccttc tgatgccgtc acaatgcggg
gaggaaatgt cctcctcgac 180tgctccgcgg agtccgaccg aggagttcca gtgatcaagt
ggaagaaaga tggcattcat 240ctggccttgg gaatggatga aaggaagcag caactttcaa
atgggtctct gctgatacaa 300aacatacttc attccagaca ccacaagcca gatgagggac
tttaccaatg tgaggcatct 360ttaggagatt ctggctcaat tattagtcgg acagcaaaag
ttgcagtagc aggaccactg 420aggttccttt cacagacaga atctgtcaca gccttcatgg
gagacacagt gctactcaag 480tgtgaagtca ttggggagcc catgccaaca atccactggc
agaagaacca acaagacctg 540actccaatcc caggtgactc ccgagtggtg gtcttgccct
ctggagcatt gcagatcagc 600cgactccaac cgggggacat tggaatttac cgatgctcag
ctcgaaatcc agccagctca 660agaacaggaa atgaagcaga agtcagaatt ttatcagatc
caggactgca tagacagctg 720tattttctgc aaagaccatc caatgtagta gccattgaag
gaaaagatgc tgtcctggaa 780tgttgtgttt ctggctatcc tccaccaagt tttacctggt
tacgaggcga ggaagtcatc 840caactcaggt ctaaaaagta ttctttattg ggtggaagca
acttgcttat ctccaatgtg 900acagatgatg acagtggaat gtatacctgt gttgtcacat
ataaaaatga gaatattagt 960gcctctgcag agctcacagt cttggttccg ccatggtttt
taaatcatcc ttccaacctg 1020tatgcctatg aaagcatgga tattgagttt gaatgtacag
tctctggaaa gcctgtgccc 1080actgtgaatt ggatgaagaa tggagatgtg gtcattccta
gtgattattt tcagatagtg 1140ggaggaagca acttacggat acttggggtg gtgaagtcag
atgaaggctt ttatcaatgt 1200gtggctgaaa atgaggctgg aaatgcccag accagtgcac
agctcattgt ccctaagcct 1260gcaatcccaa gctccagtgt cctcccttcg gctcccagag
atgtggtccc tgtcttggtt 1320tccagccgat ttgtccgtct cagctggcgc ccacctgcag
aagcgaaagg gaacattcaa 1380actttcacgg tctttttctc cagagaaggt gacaacaggg
aacgagcatt gaatacaaca 1440cagcctgggt cccttcagct cactgtggga aacctgaagc
cagaagccat gtacaccttt 1500cgagttgtgg cttacaatga atggggaccg ggagagagtt
ctcaacccat caaggtggcc 1560acacagcctg agttgcaagt tccagggcca gtagaaaacc
tgcaagctgt atctacctca 1620cctacctcaa ttcttattac ctgggaaccc cctgcctatg
caaacggtcc agtccaaggt 1680tacagattgt tctgcactga ggtgtccaca ggaaaagaac
agaatataga ggttgatgga 1740ctatcttata aactggaagg cctgaaaaaa ttcaccgaat
atagtcttcg attcttagct 1800tataatcgct atggtccggg cgtctctact gatgatataa
cagtggttac actttctgac 1860gtgccaagtg ccccgcctca gaacgtctcc ctggaagtgg
tcaattcaag aagtatcaaa 1920gttagctggc tgcctcctcc atcaggaaca caaaatggat
ttattaccgg ctataaaatt 1980cgacacagaa agacgacccg caggggtgag atggaaacac
tggagccaaa caacctctgg 2040tacctattca caggactgga gaaaggaagt cagtacagtt
tccaggtgtc agccatgaca 2100gtcaatggta ctggaccacc ttccaactgg tatactgcag
agactccaga gaatgatcta 2160gatgaatctc aagttcctga tcaaccaagc tctcttcatg
tgaggcccca gactaactgc 2220atcatcatga gttggactcc tcccttgaac ccaaacatcg
tggtgcgagg ttatattatc 2280ggttatggcg ttgggagccc ttacgctgag acagtgcgtg
tggacagcaa gcagcgatat 2340tattccattg agaggttaga gtcaagttcc cattatgtaa
tctccctaaa agcttttaac 2400aatgccggag aaggagttcc tctttatgaa agtgccacca
ccaggtctat aaccgatccc 2460actgacccag ttgattatta tcctttgctt gatgatttcc
ccacctcggt cccagatctc 2520tccaccccca tgctcccacc agtaggtgta caggctgtgg
ctcttaccca tgatgctgtg 2580agggtcagct gggcagacaa ctctgtccct aagaaccaaa
agacgtctga ggtgcgactt 2640tacaccgtcc ggtggagaac cagcttttct gcaagtgcaa
aatacaagtc agaagacaca 2700acatctctaa gttacacagc aacaggcctc aaaccaaaca
caatgtatga attctcggtc 2760atggtaacaa aaaacagaag gtccagtact tggagcatga
ctgcacatgc caccacgtat 2820gaagcagccc ccacctctgc tcccaaggac tttacagtca
ttactaggga agggaagcct 2880cgtgccgtca ttgtgagttg gcagcctccc ttggaagcca
atgggaaaat tactgcttac 2940atcttatttt ataccttgga caagaacatc ccaattgatg
actggattat ggaaacaatc 3000agtggtgata ggcttactca tcaaatcatg gatctcaacc
ttgatactat gtattacttt 3060cgaattcaag cacgaaattc aaaaggagtg gggccactct
ctgatcccat cctcttcagg 3120actctgaaag tggaacaccc tgacaaaatg gctaatgacc
aaggtcgtca tggagatgga 3180ggttattggc cagttgatac taatttgatt gatagaagca
ccctaaatga gccgccaatt 3240ggacaaatgc accccccgca tggcagtgtc actcctcaga
agaacagcaa cctgcttgtg 3300atcattgtgg tcaccgttgg tgtcatcaca gtgctggtag
tggtcatcgt ggctgtgatt 3360tgcacccgac gctcttcagc ccagcagaga aagaaacggg
ccacccacag tgctggcaaa 3420aggaagggca gccagaagga cctccgaccc cctgatcttt
ggatccatca tgaagaaatg 3480gagatgaaaa atattgaaaa gccatctggc actgaccctg
caggaaggga ctctcccatc 3540caaagttgcc aagacctcac accagtcagc cacagccagt
cagaaaccca actgggaagc 3600aaaagcacct ctcattcagg tcaagacact gaggaagcag
ggagctctat gtccactctg 3660gagaggtcgc tggctgcacg ccgagccccc cgggccaagc
tcatgattcc catggatgcc 3720cagtccaaca atcctgctgt cgtgagcgcc atcccggtgc
caacgctaga aagtgcccag 3780tacccaggaa tcctcccgtc tcccacctgt ggatatcccc
acccgcagtt cactctccgg 3840cctgtgccat tcccaacact ctcagtggac cgaggtttcg
gagcaggaag aagtcagtca 3900gtgagtgaag gaccaactac ccaacaacca cctatgctgc
ccccatctca gcctgagcat 3960tctagcagcg aggaggcacc aagcagaacc atccccacag
cttgtgttcg accaactcac 4020ccactccgca gctttgctaa tcctttgcta cctccaccaa
tgagtgcaat agaaccgaaa 4080gtcccttaca caccactttt gtctcagcca gggcccactc
ttcctaagac ccatgtgaaa 4140acagcctccc ttgggttggc tggaaaagca agatcccctt
tgcttcctgt gtctgtgcca 4200acagcccctg aagtgtctga ggagagccac aaaccaacag
aggattcagc caatgtgtat 4260gaacaggatg atctgagtga acaaatggca agtttggaag
gactcatgaa gcagcttaat 4320gccatcacag gctcagcctt ttaacatgta tttctgaatg
gatgaggtga attttccggg 4380aactttgcag cataccaatt acccataaac agcacacctg
tgtccaagaa ctctaaccag 4440tgtacaggtc acccatcagg accactcagt taaggaagat
cctgaagcag ttcagaagga 4500ataagcattc cttctttcac aggcatcagg aattgtcaaa
tgatgattat gagttcccta 4560aacaaaagca aagatgcatt ttcactgcaa tgtcaaagtt
tagctgct 4608
40 8959 DNA Homo sapiens
40 ccccagcctc cttgccaacg
ccccctttcc ctctccccct cccgctcggc gctgaccccc 60catccccacc cccgtgggaa
cactgggagc ctgcactcca cagaccctct ccttgcctct 120tccctcacct cagcctccgc
tccccgccct cttcccggcc cagggcgccg gcccaccctt 180ccctccgccg ccccccggcc
gcggggagga catggccgcg cacaggccgg tggaatgggt 240ccaggccgtg gtcagccgct
tcgacgagca gcttccaata aaaacaggac agcagaacac 300acataccaaa gtcagtactg
agcacaacaa ggaatgtcta atcaatattt ccaaatacaa 360gttttctttg gttataagcg
gcctcactac tattttaaag aatgttaaca atatgagaat 420atttggagaa gctgctgaaa
aaaatttata tctctctcag ttgattatat tggatacact 480ggaaaaatgt cttgctgggc
aaccaaagga cacaatgaga ttagatgaaa cgatgctggt 540caaacagttg ctgccagaaa
tctgccattt tcttcacacc tgtcgtgaag gaaaccagca 600tgcagctgaa cttcggaatt
ctgcctctgg ggttttattt tctctcagct gcaacaactt 660caatgcagtc tttagtcgca
tttctaccag gttacaggaa ttaactgttt gttcagaaga 720caatgttgat gttcatgata
tagaattgtt acagtatatc aatgtggatt gtgcaaaatt 780aaaacgactc ctgaaggaaa
cagcatttaa atttaaagcc ctaaagaagg ttgcgcagtt 840agcagttata aatagcctgg
aaaaggcatt ttggaactgg gtagaaaatt atccagatga 900atttacaaaa ctgtaccaga
tcccacagac tgatatggct gaatgtgcag aaaagctatt 960tgacttggtg gatggttttg
ctgaaagcac caaacgtaaa gcagcagttt ggccactaca 1020aatcattctc cttatcttgt
gtccagaaat aatccaggat atatccaaag acgtggttga 1080tgaaaacaac atgaataaga
agttatttct ggacagtcta cgaaaagctc ttgctggcca 1140tggaggaagt aggcagctga
cagaaagtgc tgcaattgcc tgtgtcaaac tgtgtaaagc 1200aagtacttac atcaattggg
aagataactc tgtcattttc ctacttgttc agtccatggt 1260ggttgatctt aagaacctgc
tttttaatcc aagtaagcca ttctcaagag gcagtcagcc 1320tgcagatgtg gatctaatga
ttgactgcct tgtttcttgc tttcgtataa gccctcacaa 1380caaccaacac tttaagatct
gcctggctca gaattcacct tctacatttc actatgtgct 1440ggtaaattca ctccatcgaa
tcatcaccaa ttccgcattg gattggtggc ctaagattga 1500tgctgtgtat tgtcactcgg
ttgaacttcg aaatatgttt ggtgaaacac ttcataaagc 1560agtgcaaggt tgtggagcac
acccagcaat acgaatggca ccgagtctta catttaaaga 1620aaaagtaaca agccttaaat
ttaaagaaaa acctacagac ctggagacaa gaagctataa 1680gtatcttctc ttgtccatgg
tgaaactaat tcatgcagat ccaaagctct tgctttgtaa 1740tccaagaaaa caggggcccg
aaacccaagg cagtacagca gaattaatta cagggctcgt 1800ccaactggtc cctcagtcac
acatgccaga gattgctcag gaagcaatgg aggctctgct 1860ggttcttcat cagttagata
gcattgattt gtggaatcct gatgctcctg tagaaacatt 1920ttgggagatt agctcacaaa
tgctttttta catctgcaag aaattaacta gtcatcaaat 1980gcttagtagc acagaaattc
tcaagtggtt gcgggaaata ttgatctgca ggaataaatt 2040tcttcttaaa aataagcagg
cagatagaag ttcctgtcac tttctccttt tttacggggt 2100aggatgtgat attccttcta
gtggaaatac cagtcaaatg tccatggatc atgaagaatt 2160actacgtact cctggagcct
ctctccggaa gggaaaaggg aactcctcta tggatagtgc 2220agcaggatgc agcggaaccc
ccccaatttg ccgacaagcc cagaccaaac tagaagtggc 2280cctgtacatg tttctgtgga
accctgacac tgaagctgtt ctggttgcca tgtcctgttt 2340ccgccacctc tgtgaggaag
cagatatccg gtgtggggtg gatgaagtgt cagtgcataa 2400cctcttgccc aactataaca
cattcatgga gtttgcctct gtcagcaata tgatgtcaac 2460aggaagagca gcacttcaga
aaagagtgat ggcactgctg aggcgcattg agcatcccac 2520tgcaggaaac actgaggctt
gggaagatac acatgcaaaa tgggaacaag caacaaagct 2580aatccttaac tatccaaaag
ccaaaatgga agatggccag gctgctgaaa gccttcacaa 2640gaccattgtt aagaggcgaa
tgtcccatgt gagtggagga ggatccatag atttgtctga 2700cacagactcc ctacaggaat
ggatcaacat gactggcttc ctttgtgccc ttggaggagt 2760gtgcctccag cagagaagca
attctggcct ggcaacctat agcccaccca tgggtccagt 2820cagtgaacgt aagggttcta
tgatttcagt gatgtcttca gagggaaacg cagatacacc 2880tgtcagcaaa tttatggatc
ggctgttgtc cttaatggtg tgtaaccatg agaaagtggg 2940acttcaaata cggaccaatg
ttaaggatct ggtgggtcta gaattgagtc ctgctctgta 3000tccaatgcta tttaacaaat
tgaagaatac catcagcaag ttttttgact cccaaggaca 3060ggttttattg actgatacca
atactcaatt tgtagaacaa accatagcta taatgaagaa 3120cttgctagat aatcatactg
aaggcagctc tgaacatcta gggcaagcta gcattgaaac 3180aatgatgtta aatctggtca
ggtatgttcg tgtgcttggg aatatggtcc atgcaattca 3240aataaaaacg aaactgtgtc
aattagttga agtaatgatg gcaaggagag atgacctctc 3300attttgccaa gagatgaaat
ttaggaataa gatggtagaa tacctgacag actgggttat 3360gggaacatca aaccaagcag
cagatgatga tgtaaaatgt cttacaagag atttggacca 3420ggcaagcatg gaagcagtag
tttcacttct agctggtctc cctctgcagc ctgaagaagg 3480agatggtgtg gaattgatgg
aagccaaatc acagttattt cttaaatact tcacattatt 3540tatgaacctt ttgaatgact
gcagtgaagt tgaagatgaa agtgcgcaaa caggtggcag 3600gaaacgtggc atgtctcgga
ggctggcatc actgaggcac tgtacggtcc ttgcaatgtc 3660aaacttactc aatgccaacg
tagacagtgg tctcatgcac tccataggct taggttacca 3720caaggatctc cagacaagag
ctacatttat ggaagttctg acaaaaatcc ttcaacaagg 3780cacagaattt gacacacttg
cagaaacagt attggctgat cggtttgaga gattggtgga 3840actggtcaca atgatgggtg
atcaaggaga actccctata gcgatggctc tggccaatgt 3900ggttccttgt tctcagtggg
atgaactagc tcgagttctg gttactctgt ttgattctcg 3960gcatttactc taccaactgc
tctggaacat gttttctaaa gaagtagaat tggcagactc 4020catgcagact ctcttccgag
gcaacagctt ggccagtaaa ataatgacat tctgtttcaa 4080ggtatatggt gctacctatc
tacaaaaact cctggatcct ttattacgaa ttgtgatcac 4140atcctctgat tggcaacatg
ttagctttga agtggatcct accaggttag aaccatcaga 4200gagccttgag gaaaaccagc
ggaacctcct tcagatgact gaaaagttct tccatgccat 4260catcagttcc tcctcagaat
tcccccctca acttcgaagt gtgtgccact gtttatacca 4320ggtggttagc cagcgtttcc
ctcagaacag catcggtgca gtaggaagtg ccatgttcct 4380cagatttatc aatcctgcca
ttgtctcacc gtatgaagca gggattttag ataaaaagcc 4440accacctaga atcgaaaggg
gcttgaagtt aatgtcaaag atacttcaga gtattgccaa 4500tcatgttctc ttcacaaaag
aagaacatat gcggcctttc aatgattttg tgaaaagcaa 4560ctttgatgca gcacgcaggt
ttttccttga tatagcatct gattgtccta caagtgatgc 4620agtaaatcat agtctttcct
tcataagtga cggcaatgtg cttgctttac atcgtctact 4680ctggaacaat caggagaaaa
ttgggcagta tctttccagc aacagggatc ataaagctgt 4740tggaagacga ccttttgata
agatggcaac acttcttgca tacctgggtc ctccagagca 4800caaacctgtg gcagatacac
actggtccag ccttaacctt accagttcaa agtttgagga 4860atttatgact aggcatcagg
tacatgaaaa agaagaattc aaggctttga aaacgttaag 4920tattttctac caagctggga
cttccaaagc tgggaatcct attttttatt atgttgcacg 4980gaggttcaaa actggtcaaa
tcaatggtga tttgctgata taccatgtct tactgacttt 5040aaagccatat tatgcaaagc
catatgaaat tgtagtggac cttacccata ccgggcctag 5100caatcgcttt aaaacagact
ttctctctaa gtggtttgtt gtttttcctg gctttgctta 5160cgacaacgtc tccgcagtct
atatctataa ctgtaactcc tgggtcaggg agtacaccaa 5220gtatcatgag cggctgctga
ctggcctcaa aggtagcaaa aggcttgttt tcatagactg 5280tcctgggaaa ctggctgagc
acatagagca tgaacaacag aaactacctg ctgccacctt 5340ggctttagaa gaggacctga
aggtattcca caatgctctc aagctagctc acaaagacac 5400caaagtttct attaaagttg
gttctactgc tgtccaagta acttcagcag agcgaacaaa 5460agtcctaggg caatcagtct
ttctaaatga catttattat gcttcggaaa ttgaagaaat 5520ctgcctagta gatgagaacc
agttcacctt aaccattgca aaccagggca cgccgctcac 5580cttcatgcac caggagtgtg
aagccattgt ccagtctatc attcatatcc ggacccgctg 5640ggaactgtca cagcccgact
ctatccccca acacaccaag attcggccaa aagatgtccc 5700tgggacactg ctcaatatcg
cattacttaa tttaggcagt tctgacccga gtttacggtc 5760agctgcctat aatcttctgt
gtgccttaac ttgtaccttt aatttaaaaa tcgagggcca 5820gttactagag acatcaggtt
tatgtatccc tgccaacaac accctcttta ttgtctctat 5880tagtaagaca ctggcagcca
atgagccaca cctcacgtta gaatttttgg aagagtgtat 5940ttctggattt agcaaatcta
gtattgaatt gaaacacctt tgtttggaat acatgactcc 6000atggctgtca aatctagttc
gtttttgcaa gcataatgat gatgccaaac gacaaagagt 6060tactgctatt cttgacaagc
tgataacaat gaccatcaat gaaaaacaga tgtacccatc 6120tattcaagca aaaatatggg
gaagccttgg gcagattaca gatctgcttg atgttgtact 6180agacagtttc atcaaaacca
gtgcaacagg tggcttggga tcaataaaag ctgaggtgat 6240ggcagatact gctgtagctt
tggcttctgg aaatgtgaaa ttggtttcaa gcaaggttat 6300tggaaggatg tgcaaaataa
ttgacaagac atgcttatct ccaactccta ctttagaaca 6360acatcttatg tgggatgata
ttgctatttt agcacgctac atgctgatgc tgtccttcaa 6420caattccctt gatgtggcag
ctcatcttcc ctacctcttc cacgttgtta ctttcttagt 6480agccacaggt ccgctctccc
ttagagcttc cacacatgga ctggtcatta atatcattca 6540ctctctgtgt acttgttcac
agcttcattt tagtgaagag accaagcaag ttttgagact 6600cagtctgaca gagttctcat
tacccaaatt ttacttgctg tttggcatta gcaaagtcaa 6660gtcagctgct gtcattgcct
tccgttccag ttaccgggac aggtcattct ctcctggctc 6720ctatgagaga gagacttttg
ctttgacatc cttggaaaca gtcacagaag ctttgttgga 6780gatcatggag gcatgcatga
gagatattcc aacgtgcaag tggctggacc agtggacaga 6840actagctcaa agatttgcat
tccaatataa tccatccctg caaccaagag ctcttgttgt 6900ctttgggtgt attagcaaac
gagtgtctca tgggcagata aagcagataa tccgtattct 6960tagcaaggca cttgagagtt
gcttaaaagg acctgacact tacaacagtc aagttctgat 7020agaagctaca gtaatagcac
taaccaaatt acagccactt cttaataagg actcgcctct 7080gcacaaagcc ctcttttggg
tagctgtggc tgtgctgcag cttgatgagg tcaacttgta 7140ttcagcaggt accgcacttc
ttgaacaaaa cctgcatact ttagatagtc tccgtatatt 7200caatgacaag agtccagagg
aagtatttat ggcaatccgg aatcctctgg agtggcactg 7260caagcaaatg gatcattttg
ttggactcaa tttcaactct aactttaact ttgcattggt 7320tggacacctt ttaaaagggt
acaggcatcc ttcacctgct attgttgcaa gaacagtcag 7380aattttacat acactactaa
ctctggttaa caaacacaga aattgtgaca aatttgaagt 7440gaatacacag agcgtggcct
acttagcagc tttacttaca gtgtctgaag aagttcgaag 7500tcgctgcagc ctaaaacata
gaaagtcact tcttcttact gatatttcaa tggaaaatgt 7560tcctatggat acatatccca
ttcatcatgg tgacccttcc tataggacac taaaggagac 7620tcagccatgg tcctctccca
aaggttctga aggatacctt gcagccacct atccaactgt 7680cggccagacc agtccccgag
ccaggaaatc catgagcctg gacatggggc aaccttctca 7740ggccaacact aagaagttgc
ttggaacaag gaaaagtttt gatcacttga tatcagacac 7800aaaggctcct aaaaggcaag
aaatggaatc agggatcaca acacccccca aaatgaggag 7860agtagcagaa actgattatg
aaatggaaac tcagaggatt tcctcatcac aacagcaccc 7920acatttacgt aaagtttcag
tgtctgaatc aaatgttctc ttggatgaag aagtacttac 7980tgatccgaag atccaggcgc
tgcttcttac tgttctagct acactggtaa aatataccac 8040agatgagttt gatcaacgaa
ttctttatga atacttagca gaggccagtg ttgtgtttcc 8100caaagtcttt cctgttgtgc
ataatttgtt ggactctaag atcaacaccc tgttatcatt 8160gtgccaagat ccaaatttgt
taaatccaat ccatggaatt gtgcagagtg tggtgtacca 8220tgaagaatcc ccaccacaat
accaaacatc ttacctgcaa agttttggtt ttaatggctt 8280gtggcggttt gcaggaccgt
tttcaaagca aacacaaatt ccagactatg ctgagcttat 8340tgttaagttt cttgatgcct
tgattgacac gtacctgcct ggaattgatg aagaaaccag 8400tgaagaatcc ctcctgactc
ccacatctcc ttaccctcct gcactgcaga gccagcttag 8460tatcactgcc aaccttaacc
tttctaattc catgacctca cttgcaactt cccagcattc 8520cccaggaatc gacaaggaga
acgttgaact ctcccctacc actggccact gtaacagtgg 8580acgaactcgc cacggatccg
caagccaagt gcagaagcaa agaagcgctg gcagtttcaa 8640acgtaatagc attaagaaga
tcgtgtgaag cttgcttgct ttctttttta aaatcaactt 8700aacatgggct cttcactagt
gaccccttcc ctgtccttgc cctttccccc catgttgtaa 8760tgctgcactt cctgttttat
aatgaaccca tccggtttgc catgttgcca gatgatcaac 8820tcttcgaagc cttgcctaaa
tttaatgctg ccttttcttt aacttttttt cttctacttt 8880tggcgtgtat ctggtatatg
taagtgttca gaacaactgc aaagaaagtg ggaggtcagg 8940aaacttttaa ctgagaaat
8959
SEQ ID NO: 41 (length 2257, DNA, Homo sapiens)
41 acggcagccg tcagggaccg tcccccaact cccctttccg ctcaggcagg gtcctcgcgg
60cccatgctgg ccgctgggga cccgcgcagc ccagaccgtt cccgggccgg ccagccggca
120ccatggtggc cctgaggcct gtgcagcaac tccagggggg ctaaagggct cagagtgcag
180gccgtggggc gcgagggtcc cgggcctgag ccccgcgcca tggccggggc catcgcttcc
240cgcatgagct tcagctctct caagaggaag caacccaaga cgttcaccgt gaggatcgtc
300accatggacg ccgagatgga gttcaattgc gagatgaagt ggaaagggaa ggacctcttt
360gatttggtgt gccggactct ggggctccga gaaacctggt tctttggact gcagtacaca
420atcaaggaca cagtggcctg gctcaaaatg gacaagaagg tactggatca tgatgtttca
480aaggaagaac cagtcacctt tcacttcttg gccaaatttt atcctgagaa tgctgaagag
540gagctggttc aggagatcac acaacattta ttcttcttac aggtaaagaa gcagatttta
600gatgaaaaga tctactgccc tcctgaggct tctgtgctcc tggcttctta cgccgtccag
660gccaagtatg gtgactacga ccccagtgtt cacaagcggg gatttttggc ccaagaggaa
720ttgcttccaa aaagggtaat aaatctgtat cagatgactc cggaaatgtg ggaggagaga
780attactgctt ggtacgcaga gcaccgaggc cgagccaggg atgaagctga aatggaatat
840ctgaagatag ctcaggacct ggagatgtac ggtgtgaact actttgcaat ccggaataaa
900aagggcacag agctgctgct tggagtggat gccctggggc ttcacattta tgaccctgag
960aacagactga cccccaagat ctccttcccg tggaatgaaa tccgaaacat ctcgtacagt
1020gacaaggagt ttactattaa accactggat aagaaaattg atgtcttcaa gtttaactcc
1080tcaaagcttc gtgttaataa gctgattctc cagctatgta tcgggaacca tgatctattt
1140atgaggagaa ggaaagccga ttctttggaa gttcagcaga tgaaagccca ggccagggag
1200gagaaggcta gaaagcagat ggagcggcag cgcctcgctc gagagaagca gatgagggag
1260gaggctgaac gcacgaggga tgagttggag aggaggctgc tgcagatgaa agaagaagca
1320acaatggcca acgaagcact gatgcggtct gaggagacag ctgacctgtt ggctgaaaag
1380gcccagatca ccgaggagga ggcaaaactt ctggcccaga aggccgcaga ggctgagcag
1440gaaatgcagc gcatcaaggc cacagcgatt cgcacggagg aggagaagcg cctgatggag
1500cagaaggtgc tggaagccga ggtgctggca ctgaagatgg ctgaggagtc agagaggagg
1560gccaaagagg cagatcagct gaagcaggac ctgcaggaag cacgcgaggc ggagcgaaga
1620gccaagcaga agctcctgga gattgccacc aagcccacgt acccgcccat gaacccaatt
1680ccagcaccgt tgcctcctga cataccaagc ttcaacctca ttggtgacag cctgtctttc
1740gacttcaaag atactgacat gaagcggctt tccatggaga tagagaaaga aaaagtggaa
1800tacatggaaa agagcaagca tctgcaggag cagctcaatg aactcaagac agaaatcgag
1860gccttgaaac tgaaagagag ggagacagct ctggatattc tgcacaatga gaactccgac
1920aggggtggca gcagcaagca caataccatt aaaaagctca ccttgcagag cgccaagtcc
1980cgagtggcct tctttgaaga gctctagcag gtgacccagc caccccagga cctgccactt
2040ctcctgctac cgggaccgcg ggatggacca gatatcaaga gagccatcca tagggagctg
2100gctgggggtt tccgtgggag ctccagaact ttccccagct gagtgaagag cccagcccct
2160cttatgtgca attgccttga actacgaccc tgtagagatt tctctcatgg cgttctagtt
2220ctctgacctg agtctttgtt ttaagaagta tttgtct
2257
SEQ ID NO: 42 (length 2969, DNA, Homo sapiens)
42 ccaggcagct ggggtaagga gttcaaggca gcgcccacac
ccgggggctc tccgcaaccc 60gaccgcctgt ccgctccccc acttcccgcc ctccctccca
cctactcatt cacccaccca 120cccacccaga gccgggacgg cagcccaggc gcccgggccc
cgccgtctcc tcgccgcgat 180cctggacttc ctcttgctgc aggacccggc ttccacgtgt
gtcccggagc cggcgtctca 240gcacacgctc cgctccgggc ctgggtgcct acagcagcca
gagcagcagg gagtccggga 300cccgggcggc atctgggcca agttaggcgc cgccgaggcc
agcgctgaac gtctccaggg 360ccggaggagc cgcggggcgt ccgggtctga gccgcagcaa
atgggctccg acgtgcggga 420cctgaacgcg ctgctgcccg ccgtcccctc cctgggtggc
ggcggcggct gtgccctgcc 480tgtgagcggc gcggcgcagt gggcgccggt gctggacttt
gcgcccccgg gcgcttcggc 540ttacgggtcg ttgggcggcc ccgcgccgcc accggctccg
ccgccacccc cgccgccgcc 600gcctcactcc ttcatcaaac aggagccgag ctggggcggc
gcggagccgc acgaggagca 660gtgcctgagc gccttcactg tccacttttc cggccagttc
actggcacag ccggagcctg 720tcgctacggg cccttcggtc ctcctccgcc cagccaggcg
tcatccggcc aggccaggat 780gtttcctaac gcgccctacc tgcccagctg cctcgagagc
cagcccgcta ttcgcaatca 840gggttacagc acggtcacct tcgacgggac gcccagctac
ggtcacacgc cctcgcacca 900tgcggcgcag ttccccaacc actcattcaa gcatgaggat
cccatgggcc agcagggctc 960gctgggtgag cagcagtact cggtgccgcc cccggtctat
ggctgccaca cccccaccga 1020cagctgcacc ggcagccagg ctttgctgct gaggacgccc
tacagcagtg acaatttata 1080ccaaatgaca tcccagcttg aatgcatgac ctggaatcag
atgaacttag gagccacctt 1140aaagggccac agcacagggt acgagagcga taaccacaca
acgcccatcc tctgcggagc 1200ccaatacaga atacacacgc acggtgtctt cagaggcatt
caggatgtgc gacgtgtgcc 1260tggagtagcc ccgactcttg tacggtcggc atctgagacc
agtgagaaac gccccttcat 1320gtgtgcttac ccaggctgca ataagagata ttttaagctg
tcccacttac agatgcacag 1380caggaagcac actggtgaga aaccatacca gtgtgacttc
aaggactgtg aacgaaggtt 1440ttctcgttca gaccagctca aaagacacca aaggagacat
acaggtgtga aaccattcca 1500gtgtaaaact tgtcagcgaa agttctcccg gtccgaccac
ctgaagaccc acaccaggac 1560tcatacaggt gaaaagccct tcagctgtcg gtggccaagt
tgtcagaaaa agtttgcccg 1620gtcagatgaa ttagtccgcc atcacaacat gcatcagaga
aacatgacca aactccagct 1680ggcgctttga ggggtctccc tcggggaccg ttcagtgtcc
caggcagcac agtgtgtgaa 1740ctgctttcaa gtctgactct ccactcctcc tcactaaaaa
ggaaacttca gttgatcttc 1800ttcatccaac ttccaagaca agataccggt gcttctggaa
actaccaggt gtgcctggaa 1860gagttggtct ctgccctgcc tacttttagt tgactcacag
gccctggaga agcagctaac 1920aatgtctggt tagttaaaag cccattgcca tttggtgtgg
attttctact gtaagaagag 1980ccatagctga tcatgtcccc ctgacccttc ccttcttttt
ttatgctcgt tttcgctggg 2040gatggaatta ttgtaccatt ttctatcatg gaatatttat
aggccagggc atgtgtatgt 2100gtctgctaat gtaaactttg tcatggtttc catttactaa
cagcaacagc aagaaataaa 2160tcagagagca aggcatcggg ggtgaatctt gtctaacatt
cccgaggtca gccaggctgc 2220taacctggaa agcaggatgt agttctgcca ggcaactttt
aaagctcatg catttcaagc 2280agctgaagaa aaaatcagaa ctaaccagta cctctgtata
gaaatctaaa agaattttac 2340cattcagtta attcaatgtg aacactggca cactgctctt
aagaaactat gaagatctga 2400gatttttttg tgtatgtttt tgactctttt gagtggtaat
catatgtgtc tttatagatg 2460tacatacctc cttgcacaaa tggaggggaa ttcattttca
tcactgggag tgtccttagt 2520gtataaaaac catgctggta tatggcttca agttgtaaaa
atgaaagtga ctttaaaaga 2580aaatagggga tggtccagga tctccactga taagactgtt
tttaagtaac ttaaggacct 2640ttgggtctac aagtatatgt gaaaaaaatg agacttactg
ggtgaggaaa tccattgttt 2700aaagatggtc gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt
gtgttgtgtt gtgttttgtt 2760ttttaaggga gggaatttat tatttaccgt tgcttgaaat
tactgtgtaa atatatgtct 2820gataatgatt tgctctttga caactaaaat taggactgta
taagtactag atgcatcact 2880gggtgttgat cttacaagat attgatgata acacttaaaa
ttgtaacctg catttttcac 2940tttgctctca attaaagtct attcaaaag
2969
SEQ ID NO: 43 (length 69, DNA, Artificial, synthetic DNA)
43 ggcctccata aagtaggaaa cactacacag ctccataaag taggaaacac tacattaatt 60
aagcggtac 69
SEQ ID NO: 44 (length 61, DNA, Artificial, synthetic DNA)
44 cgcttaatta atgtagtgtt tcctacttta tggagctgtg tagtgtttcc tactttatgg 60
a 61
SEQ ID NO: 45 (length 54, DNA, Artificial, synthetic DNA)
45 tccataaagt aggaaacact acaggactcc ataaagtagg aaacactaca gtac 54
SEQ ID NO: 46 (length 52, DNA, Artificial, synthetic DNA)
46 tgtagtgttt cctactttat ggagtcctgt agtgtttcct actttatgga at 52
SEQ ID NO: 47 (length 573, DNA, Adenovirus)
47 ttatggactg
gaataaaccc tccacctaac tgtcaaattg tggaaaacac taatacaaat 60gatggcaaac
ttactttagt attagtaaaa aacggagggc ttgttaatgg ctacgtgtct 120ctagttggtg
tatcagacac tgtgaaccaa atgttcacac aaaagacagc aaacatccaa 180ttaagattat
attttgactc ttctggaaat ctattaactg atgaatcaga cttaaaaatt 240ccacttaaaa
ataaatcttc tacagcgacc agtgaaactg tagccagcag caaagccttt 300atgccaagta
ctacagctta tcccttcaac accactacta gggatagtga aaactacatt 360catggaatat
gttactacat gactagttat gatagaagtc tatttccctt gaacatttct 420ataatgctaa
acagccgtat gatttcttcc aatgttgcct atgccataca atttgaatgg 480aatctaaatg
caagtgaatc tccagaaagc aacatagcta cgctgaccac atcccccttt 540ttcttttctt
acattacaga agacgacaac taa
573
SEQ ID NO: 48 (length 270, DNA, Adenovirus)
48 ggagttctta ctcttaagtg tttaacccca ctaacaacca
caggcggatc tctacagcta 60aaagtgggag ggggacttac agtggatgac actgatggta
ccttacaaga aaacatacgt 120gctacagcac ccattactaa aaataatcac tctgtagaac
tatccattgg aaatggatta 180gaaactcaaa acaataaact atgtgccaaa ttgggaaatg
ggttaaaatt taacaacggt 240gacatttgta taaaggatag tattaacacc
270
SEQ ID NO: 49 (length 843, DNA, Adenovirus)
49 ggagttctta ctcttaagtg
tttaacccca ctaacaacca caggcggatc tctacagcta 60aaagtgggag ggggacttac
agtggatgac actgatggta ccttacaaga aaacatacgt 120gctacagcac ccattactaa
aaataatcac tctgtagaac tatccattgg aaatggatta 180gaaactcaaa acaataaact
atgtgccaaa ttgggaaatg ggttaaaatt taacaacggt 240gacatttgta taaaggatag
tattaacacc ttatggactg gaataaaccc tccacctaac 300tgtcaaattg tggaaaacac
taatacaaat gatggcaaac ttactttagt attagtaaaa 360aacggagggc ttgttaatgg
ctacgtgtct ctagttggtg tatcagacac tgtgaaccaa 420atgttcacac aaaagacagc
aaacatccaa ttaagattat attttgactc ttctggaaat 480ctattaactg atgaatcaga
cttaaaaatt ccacttaaaa ataaatcttc tacagcgacc 540agtgaaactg tagccagcag
caaagccttt atgccaagta ctacagctta tcccttcaac 600accactacta gggatagtga
aaactacatt catggaatat gttactacat gactagttat 660gatagaagtc tatttccctt
gaacatttct ataatgctaa acagccgtat gatttcttcc 720aatgttgcct atgccataca
atttgaatgg aatctaaatg caagtgaatc tccagaaagc 780aacatagcta cgctgaccac
atcccccttt ttcttttctt acattacaga agacgacaac 840taa
843
SEQ ID NO: 50 (length 975, DNA, Artificial, synthetic DNA)
50 atgaagcgcg caagaccgtc tgaagatacc
ttcaaccccg tgtatccata tgacacggaa 60accggtcctc caactgtgcc ttttcttact
cctccctttg tatcccccaa tgggtttcaa 120gagagtcccc ctggagttct tactcttaag
tgtttaaccc cactaacaac cacaggcgga 180tctctacagc taaaagtggg agggggactt
acagtggatg acactgatgg taccttacaa 240gaaaacatac gtgctacagc acccattact
aaaaataatc actctgtaga actatccatt 300ggaaatggat tagaaactca aaacaataaa
ctatgtgcca aattgggaaa tgggttaaaa 360tttaacaacg gtgacatttg tataaaggat
agtattaaca ccttatggac tggaataaac 420cctccaccta actgtcaaat tgtggaaaac
actaatacaa atgatggcaa acttacttta 480gtattagtaa aaaacggagg gcttgttaat
ggctacgtgt ctctagttgg tgtatcagac 540actgtgaacc aaatgttcac acaaaagaca
gcaaacatcc aattaagatt atattttgac 600tcttctggaa atctattaac tgatgaatca
gacttaaaaa ttccacttaa aaataaatct 660tctacagcga ccagtgaaac tgtagccagc
agcaaagcct ttatgccaag tactacagct 720tatcccttca acaccactac tagggatagt
gaaaactaca ttcatggaat atgttactac 780atgactagtt atgatagaag tctatttccc
ttgaacattt ctataatgct aaacagccgt 840atgatttctt ccaatgttgc ctatgccata
caatttgaat ggaatctaaa tgcaagtgaa 900tctccagaaa gcaacatagc tacgctgacc
acatccccct ttttcttttc ttacattaca 960gaagacgaca actaa
975