Patent application title: METHOD FOR IN VITRO DIAGNOSIS OR PROGNOSIS OF TESTICULAR CANCER
Inventors:
Maud Arsac (Saint-Chamond, FR)
Bertrand Bonnaud (Verin, FR)
Francois Mallet (Villeurbanne, FR)
Jean-Philippe Pichon (Clermont-Ferrand, FR)
Assignees:
BIOMERIEUX
IPC8 Class: AC12Q168FI
USPC Class:
435 6
Class name: Chemistry: molecular biology and microbiology measuring or testing process involving enzymes or micro-organisms; composition or test strip therefore; processes of forming such composition or test strip involving nucleic acid
Publication date: 2010-12-30
Patent application number: 20100330581
Agents:
OLIFF & BERRIDGE, PLC
Origin: ALEXANDRIA, VA US
Abstract:
The invention relates to a method for in vitro diagnosis or prognosis of
testicular cancer which comprises a step of detecting the presence or
absence of at least one expression product from at least one nucleic acid
sequence selected from the sequences identified in SEQ ID NOS: 1 to 6 or
from the sequences which exhibit at least 99% identity with one of the
sequences identified in SEQ ID NOS: 1 to 6, to isolated nucleic acid
sequences and to the use thereof as a testicular cancer marker.
Claims:
1. A method for in vitro diagnosis or prognosis of testicular cancer, in a
biological sample from a patient suspected of suffering from testicular
cancer, comprising a step of detecting presence or absence of at least
one expression product from at least one nucleic acid sequence selected
from the full-length sequences identified in SEQ ID NOS: 1 to 6 or from
sequences which exhibit at least 99% identity with one of the full-length
sequences identified in SEQ ID NOS: 1 to 6.
2. The method as claimed in claim 1, wherein the expression product detected is at least one mRNA transcript or at least one polypeptide.
3. The method as claimed in claim 2, wherein the mRNA transcript is detected by hybridization, by amplification, or by sequencing.
4. The method as claimed in claim 1, wherein the expression product is at least one mRNA transcript, having a sequence selected from one of the full-length sequences identified in SEQ ID NOS: 7 to 12.
5. The method as claimed in claim 1, wherein the mRNA is brought into contact with at least one probe and/or at least one primer under predetermined conditions which enable hybridization, and in that the presence or absence of hybridization to the mRNA is detected.
6. The method as claimed in claim 1, wherein DNA copies of the mRNA are prepared and the DNA copies are brought into contact with at least one probe and/or at least one primer under predetermined conditions which enable hybridization, and in that the presence or absence of hybridization to said DNA copies is detected.
7. The method as claimed in claim 2, wherein the polypeptide expressed is detected by bringing it into contact with at least one binding partner specific for said polypeptide.
8. A molecular marker for in vitro diagnosis or prognosis of testicular cancer, the molecular marker comprising at least one isolated nucleic acid sequence selected from the group consisting of:
(i) the full-length DNA sequences set forth in SEQ ID NOS: 1-6,
(ii) the full-length complementary DNA sequences of the sequences defined in (i),
(iii) DNA sequences that exhibit at least 99% identity with the sequences defined in (i) and/or (ii),
(iv) RNA sequences that are the products of transcription of the sequences defined in (i),
(v) RNA sequences that are the products of transcription of sequences that exhibit at least 99% identity with the sequences defined in (i), and
(vi) the full-length RNA sequences set forth in SEQ ID NOS: 7-12.
Description:
[0001]Testicular cancer represents 1 to 2% of cancers in men, and 3.5% of
urological tumors. It is the most common tumor in young men, and rare
before 15 years of age and after 50 years of age. The risk is highest in
patients who are seropositive for HIV. Seminoma is the most common form
of testicular cancer (40%), but many other types of cancer exist, among
which are embryonic carcinoma (20%), teratocarcinoma (30%) and
choriocarcinoma (1%).
[0002]The diagnosis of testicular cancer is first clinical: it often presents in the form of a hard and irregular swelling of the testicle. An ultrasound confirms the intratesticular tumor and Doppler ultrasound demonstrates the increase in vascularization in the tumor. In some cases, a magnetic resonance examination (testicular MRI) can be useful. A thoracic, abdominal and pelvic scan makes it possible to investigate whether there is any lymph node involvement of the cancer. A blood sample for assaying tumor markers is virtually systematic. It makes it possible to orient the diagnosis of the type of tumor. Two main tumor markers are used and assayed in the blood: β-HCG and α-foetoprotein. However, these markers are not very specific and, furthermore, if the concentration of these markers is at physiological levels, this does not mean that there is an absence of tumor. At the current time, the final diagnosis and final prognosis are given after ablation of the affected testicle (orchidectomy), which constitutes the first stage of treatment. Next, depending on the type of cancer and on its stage, a complementary treatment by radiotherapy or chemotherapy is applied. There is therefore a real need for having markers which are specific for testicular cancer and which, in addition, make it possible to establish as early a diagnosis and prognosis as possible.
[0003]The rare event represented by the infection of a germline cell by an exogenous provirus results in the integration, into the host's genome, of a proviral DNA or provirus, which becomes an integral part of the genetic inheritance of the host. This endogenous provirus (HERV) is therefore transmissible to the next generation in Mendelian fashion. It is estimated that there are approximately a hundred HERV families representing approximately 8% of the human genome. Each of the families has from several tens to thousands of loci, which are the result of intracellular retrotranspositions of transcriptionally active copies. The loci of the contemporary HERV families are all replication-defective, which signifies loss of the infectious properties and therefore implies an exclusively vertical (Mendelian) transmission mode.
[0004]HERV expression has been particularly studied in three specific contexts, placentation, autoimmunity and cancer, which are associated with cell differentiation or with the modulation of immunity. It has thus been shown that the envelope glycoprotein of the ERVWE1 locus of the HERV-W family is involved in the fusion process resulting in syncytiotrophoblast formation. It has, moreover, been suggested that the Rec protein, which is a splice variant of the env gene of HERV-K, could be involved in the testicular tumorigenesis process. However, the following question has not yet been answered: are HERVs players or markers in pathological contexts?
[0005]The present inventors have now discovered and demonstrated that nucleic acid sequences belonging to loci of the HERV-W family are associated with testicular cancer and that these sequences are molecular markers for the pathological condition. The sequences identified are either proviruses, i.e. sequences containing all or part of the gag, pol and env genes flanked on the 5' and on the 3' by long terminal repeats (LTRs), or isolated LTRs. The DNA sequences identified are respectively referenced as SEQ ID Nos. 1 to 6 in the sequence listing.
[0006]The subject of the present invention is therefore a method for in vitro diagnosis or prognosis of testicular cancer, in a biological sample from a patient suspected of suffering from testicular cancer, which comprises a step of detecting at least one expression product from at least one nucleic acid sequence of the endogenous retroviral family called HERV-W, said sequence being selected from the sequences identified in SEQ ID Nos. 1 to 6 or from the sequences which exhibit at least 99% identity, preferably at least 99.5% identity, and advantageously at least 99.6% identity, with one of the sequences identified in SEQ ID Nos. 1 to 6.
[0007]The percentage identity described above has been determined while taking into consideration the nucleotide diversity in the genome. It is known that nucleotide diversity is higher in the regions of the genome that are rich in repeat sequences than in the regions which do not contain repeat sequences. By way of example, D. A. Nickerson et al. [1] have shown a diversity of approximately 0.3% (0.32%) in regions containing repeat sequences.
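To make the identity thresholds concrete, the following minimal sketch computes a percentage identity between two pre-aligned sequences and compares it with the 99% and 99.6% cut-offs. The function name, the gap convention and the toy sequences are assumptions made for illustration; the application does not specify an alignment tool or an identity formula.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned columns carrying the same nucleotide.

    Assumes both strings come from the same pairwise alignment (equal
    length, '-' for gaps). Columns containing a gap count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 1000-nt reference and a variant differing at 4 positions.
reference = "ACGT" * 250
variant_bases = list(reference)
for pos in (10, 200, 555, 900):
    variant_bases[pos] = "A" if variant_bases[pos] != "A" else "C"
variant = "".join(variant_bases)

identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity")
print("meets 99% threshold:", identity >= 99.0)
```

With four differences over 1000 aligned positions the variant sits at the 99.6% level, which is of the same order as the 0.32% diversity cited above for repeat-rich regions.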
[0008]The expression product which is detected is preferably at least one mRNA transcript of at least one of the sequences SEQ ID Nos. 1 to 6, but this can also be a polypeptide which is the product of translation of at least one of said transcripts.
[0009]When the expression product is an mRNA transcript, it is detected by any suitable method, such as hybridization, sequencing or amplification. The mRNA can be detected directly by bringing it into contact with at least one probe and/or at least one primer which are designed so as to hybridize, under predetermined stringency conditions, to the mRNA transcripts, demonstrating the presence or absence of hybridization to the mRNA and, optionally, quantifying the mRNA. Among the preferred methods, mention may be made of amplification (for example, RT-PCR, NASBA, etc.) or else Northern blotting. The mRNA can also be detected indirectly on the basis of nucleic acids derived from said transcripts, such as cDNA copies, etc.
[0010]Generally, the method of the invention comprises an initial step of extracting the mRNA from the sample to be analyzed.
[0011]First, the method can comprise:
[0012](i) a step of extracting the mRNA from the sample to be analyzed,
[0013](ii) a step of detecting and quantifying the mRNA of the sample to be analyzed,
[0014](iii) a step of extracting the mRNA from a healthy sample,
[0015](iv) a step of detecting and quantifying the mRNA of the healthy sample,
[0016](v) a step of comparing the amount of mRNA expressed in the sample to be analyzed and in the healthy sample; if the amount of mRNA expressed in the sample to be analyzed is determined as being greater than the amount of mRNA expressed in the healthy sample, this can be correlated with the diagnosis or prognosis of a testicular cancer;
[0017]and in particular:
[0018](i) extraction of the RNA to be analyzed from the sample,
[0019](ii) determination, in the RNA to be analyzed, of a level of expression of at least one RNA sequence in the sample, said RNA sequence being the product of transcription of at least one nucleic acid sequence selected from the sequences identified in SEQ ID Nos. 1 to 6 or from the sequences which exhibit at least 99% identity, preferably at least 99.5% identity, and advantageously at least 99.6% identity, with one of the sequences identified in SEQ ID Nos. 1 to 6, and
[0020](iii) comparison of the level of expression of said RNA sequence(s) defined in (ii) with the level of expression of said RNA sequence(s) in a noncancerous biological sample; if the level of expression of the RNA to be analyzed is determined as being greater than the level of expression of the RNA extracted from the noncancerous biological sample, this can be correlated with the diagnosis or prognosis of a testicular cancer.
[0021]The transcripts are overexpressed in testicular tumors. In order to detect such an overexpression, a reference point may be necessary, i.e. a control. The amount of mRNA in the healthy sample serves as a reference standard to which the amount of mRNA in the sample to be analyzed can be compared, it being possible for an overexpression of mRNA in the sample to be analyzed, compared with the expression of mRNA in the healthy sample, to be correlated with a diagnosis or prognosis of a testicular cancer. However, since transcription is generally negligible or even nonexistent in the healthy sample, whereas it is significantly higher in the cancer sample, a reference point is not essential, the significant expression of transcripts being an indicator of the disease.
[0022]The term "overexpressed sequence" is intended to mean an mRNA sequence which is found in greater amounts or at higher levels than those found for the same mRNA sequence derived from the same type of sample, but which is noncancerous, constituting the reference threshold value.
[0023]The sequences of said transcripts are respectively identified in SEQ ID Nos. 7 to 12 (given with reference to the genomic DNA):
[0024]SEQ ID No. 7=transcript of the HW4TT locus,
[0025]SEQ ID No. 8=transcript of the HW2TT locus,
[0026]SEQ ID No. 9=transcript of the HW13TT locus,
[0027]SEQ ID No. 10=transcript of the HWXTT locus,
[0028]SEQ ID No. 11=transcript of the HW21TT locus,
[0029]SEQ ID No. 12=transcript of the ERVWE1 locus.
[0030]When the expression product is a polypeptide derived from the translation of at least one of the transcripts, it can be detected, in the method of the invention, using at least one binding partner specific for said polypeptide, in particular an antibody, for example a monoclonal antibody. The method for producing monoclonal antibodies and the selection process are well known to those skilled in the art.
[0031]By way of illustration, polypeptide sequences are described and identified in SEQ ID Nos. 14, 16, 18, 20, 22, 24 and 26:
[0032]SEQ ID No. 14=Gag protein of HW4TT,
[0033]SEQ ID No. 16=protease of HW4TT,
[0034]SEQ ID No. 18=Gag protein of HW2TT,
[0035]SEQ ID No. 20=protein of HW2TT,
[0036]SEQ ID No. 22=Gag protein of HW13TT,
[0037]SEQ ID No. 24=Gag protein of HW21TT,
[0038]SEQ ID No. 26=Env protein of ERVWE1 (Syncytin-1).
[0039]The sample from the patient will generally comprise cells (such as the testicular cells). They may be present in a tissue sample (such as the testicular tissue) or be found in the circulation. In general, the sample is a testicular tissue extract or a biological fluid, such as blood, serum, plasma, urine or else seminal fluid.
[0040]The subject of the invention is also an isolated nucleic acid sequence which consists of:
[0041](i) at least one DNA sequence selected from the sequences SEQ ID Nos. 1 to 6, or
[0042](ii) at least one DNA sequence complementary to a sequence selected from the sequences SEQ ID Nos. 1 to 6, or
[0043](iii) at least one DNA sequence which exhibits at least 99% identity, preferably at least 99.5% identity, and advantageously at least 99.6% identity, with a sequence as defined in (i) and (ii), or
[0044](iv) at least one RNA sequence which is the product of transcription of a sequence selected from the sequences as defined in (i), or
[0045](v) at least one RNA sequence which is the product of transcription of a sequence selected from the sequences which exhibit at least 99% identity, preferably at least 99.5% identity, and advantageously at least 99.6% identity, with a sequence as defined in (i), or
[0046](vi) at least one RNA sequence selected from the sequences SEQ ID Nos. 7 to 12; and
[0047]the use of at least one isolated nucleic acid sequence, as a molecular marker for in vitro diagnosis or prognosis of testicular cancer, in which the nucleic acid sequence consists of:
[0048](i) at least one DNA sequence selected from the sequences SEQ ID Nos. 1 to 6, or
[0049](ii) at least one DNA sequence complementary to a sequence selected from the sequences SEQ ID Nos. 1 to 6, or
[0050](iii) at least one DNA sequence which exhibits at least 99% identity with a sequence as defined in (i) and (ii), or
[0051](iv) at least one RNA sequence which is the product of transcription of a sequence selected from the sequences as defined in (i), or
[0052](v) at least one RNA sequence which is the product of transcription of a sequence selected from the sequences which exhibit at least 99% identity with a sequence as defined in (i), or
[0053](vi) at least one RNA sequence selected from the sequences SEQ ID Nos. 7 to 12.
FIGURES
[0054]FIG. 1 represents the principle of the WTA method for amplifying RNAs.
[0055]FIG. 2 represents a synoptic scheme of the nature and the sequence of the various steps for preprocessing DNA-chip data according to the RMA method.
[0056]FIG. 3 illustrates the nomenclature, the position and the structure of the HERV-W loci overexpressed and exhibiting a loss of methylation in the tumoral testicle.
[0057]FIG. 4 is a histogram representing the increase in expression of the five loci (HW4TT, HW2TT, HW13TT, HWXTT and HW21TT), respectively, in three pairs of testicular samples (testicle 1, testicle 2 and testicle 3), based on a comparative tumor sample/healthy sample quantification. The loci are represented along the x-axis and the factors of increase of expression between tumor tissue and healthy tissue are represented along the y-axis.
[0058]FIGS. 5 to 10 represent the methylation status of the U3 region of unique LTR or of the 5' LTR of the various loci, respectively HW4TT, HW2TT, HW13TT, HWXTT, HW21TT and ERVWE1 in the healthy testicle (normal) and in the tumoral testicle derived from the same patient, after amplification and analysis of the sequences obtained.
EXAMPLES
Example 1
Identification of HERV-W Loci Expressed in Cancerous Tissues
[0059]Method:
[0060]The identification of expressed HERV-W loci is based on the design of a high-density DNA chip in the GeneChip format proposed by the company Affymetrix. It is a specially developed, custom-made chip, the probes of which correspond to HERV-W loci. The sequences of the HERV-W family were identified from the GenBank nucleic databank using the Blast algorithm (Altschul et al., 1990) with the sequence of the ERVWE1 locus, located on chromosome 7 at 7q21.2 and encoding the protein called syncytin. The sequences homologous to HERV-W were compared to a library containing reference sequences of the HERV-W family (ERVWE1) cut up into functional regions (LTR, gag, pol and env), using the RepeatMasker software (A. F. A. Smit and P. Green). These elements constitute the HERVgDB bank.
[0061]The probes making up the high-density chip were defined on a criterion of uniqueness of their sequences in the HERVgDB bank. The HERV-W proviral and solitary LTRs contained in the HERVgDB bank were extracted. Each of these sequences was broken down into a set of sequences of 25 nucleotides (25-mers) constituting it, i.e. as many potential probes. The evaluation of the uniqueness of each probe was carried out by means of a similarity search with all the 25-mers generated for all the LTRs of the family under consideration. This made it possible to identify all the 25-mers of unique occurrence for each family of HERV. Next, some of these 25-mers were retained as probes. For each U3 or U5 target region, a set of probes was formed on the basis of the probes identified as unique.
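The probe-selection principle described above (break each LTR into its 25-mers, then keep only those that occur once across the whole set) can be sketched as follows. This is a simplified illustration that uses exact string matching, whereas the source describes a similarity search; the function names and the toy sequences are hypothetical.

```python
from collections import Counter


def kmers(sequence: str, k: int = 25):
    """Yield every k-nucleotide window of a sequence, left to right."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def unique_probes(ltr_sequences: dict, k: int = 25) -> dict:
    """Map each sequence name to its k-mers that occur exactly once
    across all supplied sequences (exact matching only)."""
    counts = Counter(
        kmer for seq in ltr_sequences.values() for kmer in kmers(seq, k)
    )
    return {
        name: [kmer for kmer in kmers(seq, k) if counts[kmer] == 1]
        for name, seq in ltr_sequences.items()
    }


# Toy example with k=4 so that the result is easy to inspect by eye.
toy_ltrs = {"locus_A": "ACGTACGG", "locus_B": "ACGTTTGA"}
candidates = unique_probes(toy_ltrs, k=4)
print(candidates["locus_A"])
print(candidates["locus_B"])
```

In the toy data the shared window `ACGT` is discarded for both loci, and the remaining windows are candidate locus-specific probes. A real design would also have to tolerate near-matches, which exact counting does not capture.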
[0062]The samples analyzed using the HERV high-density chip correspond to RNAs extracted from tumors and to RNAs extracted from the healthy tissues adjacent to these tumors. The tissues analyzed are: uterus, colon, lung, breast, testicle, prostate and ovary. Placental RNAs (healthy tissue only) were also analyzed. For each sample, 400 ng of total RNA were amplified by means of an unbiased transcriptional method known as WTA. The principle of WTA amplification is the following: primers (RP-T7) comprising a random sequence and a T7 promoter sequence are hybridized to the transcripts; double-stranded cDNAs are synthesized and serve as a template for transcriptional amplification by the T7 RNA polymerase; the antisense RNAs generated are converted to double-stranded cDNAs which are then fragmented and labeled by introducing biotinylated nucleotide analogs at the 3'OH ends using terminal transferase (TdT) (cf. FIG. 1).
[0063]For each sample, 16 μg of biotin-labeled amplification products were hybridized to a DNA chip according to the protocol recommended by the company Affymetrix. The chips were then washed and labeled, according to the recommended protocol. Finally, the chips were read by a scanner in order to acquire the image of their fluorescence. The image analysis carried out using the GCOS software makes it possible to obtain numerical values of fluorescence intensity which are preprocessed according to the RMA method (cf.: FIG. 2) before being able to carry out a statistical analysis in order to identify the HERV loci specifically expressed in certain samples.
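RMA preprocessing is generally described as background correction, quantile normalization across chips, and summarization of each probe set. As a hedged illustration of the normalization step only, the sketch below forces every chip to share the same intensity distribution. It is written in plain Python, ignores ties, and uses invented intensities; it is not a reproduction of the pipeline shown in FIG. 2.

```python
def quantile_normalize(chips):
    """Quantile-normalize a list of chips (each a list of intensities).

    Every chip must hold the same number of probes. The k-th smallest
    value of each chip is replaced by the mean of the k-th smallest
    values over all chips. Ties are broken by probe order (simplified).
    """
    n_probes = len(chips[0])
    if any(len(chip) != n_probes for chip in chips):
        raise ValueError("all chips must have the same number of probes")
    sorted_chips = [sorted(chip) for chip in chips]
    mean_quantiles = [
        sum(chip[rank] for chip in sorted_chips) / len(chips)
        for rank in range(n_probes)
    ]
    normalized = []
    for chip in chips:
        order = sorted(range(n_probes), key=lambda i: chip[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = mean_quantiles[rank]
        normalized.append(out)
    return normalized


# Two hypothetical chips with three probes each.
raw = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(raw))
```

After normalization both chips contain the same set of values, so differences between samples reflect changes in probe rank and are less affected by chip-wide intensity shifts.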
[0064]Comparison of the means of more than two classes of samples was carried out by the SAM procedure applied to a Fisher test.
[0065]Results:
[0066]The processing of the data generated by the analysis on DNA chip using this method made it possible to identify six sets of probes corresponding to an overexpression in just one sample: the tumoral testicle. These six sets of probes are specific for six precise loci referenced HW4TT, HW2TT, HW13TT, HWXTT, HW21TT and ERVWE1 (cf.: FIG. 3). These six loci therefore represent markers for testicular cancer. Their nucleotide sequences are respectively identified in SEQ ID Nos. 1 to 6 in the sequence listing and the nucleotide sequences of their respective transcripts are identified in SEQ ID Nos. 7 to 12 in the sequence listing.
[0067]The information relating to the abovementioned six loci is summarized in Table 1 below.
TABLE 1
Locus    SEQ ID No:  Chromosome  Position*
HW4TT    1           4           41982184:41989670
HW2TT    2           2           17383689:17391462
HW13TT   3           13          68693759:68699228
HWXTT    4           X           113026618:113027400
HW21TT   5           21          27148627:27156168
ERVWE1   6           7           91935221:91945670
*Position relative to Ensembl version No. 39 (June 2006) (NCBI No. 36) http://www.ensembl.org/Homo_sapiens/index.html
[0068]The HW13TT locus is a chimeric provirus of HERV-W/L type resulting from the recombination of an HERV-W provirus and an HERV-L provirus. This chimera is such that the 5' region made up of the sequence starting from the beginning of the 5' LTR to the end of the determined gag fragment is of W type and the 3' region made up of the sequence starting from the subsequent pol fragment to the end of the 3' LTR (U3-R only) is of L type. This results in a fusion of the 3' gag W-5' pol L regions.
[0069]A search of open reading frames (ORFs) of at least 150 bases, using the Mac Vector 9.5.2 software, based on the identification of a start codon and of a stop codon, was carried out and the corresponding polypeptides identified.
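The ORF criterion stated above (a start codon, an in-frame stop codon, and a length of at least 150 bases) can be illustrated with a short scanner. This sketch is not the MacVector algorithm: it assumes ATG as the only start codon, scans the three forward frames only, and counts the stop codon in the ORF length. The function name and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_length: int = 150):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF runs from the first ATG in a frame to the next in-frame stop
    codon, stop included; `start` is 0-based and `end` is exclusive.
    ORFs shorter than `min_length` bases are discarded.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start >= min_length:
                    orfs.append((start, end, frame))
                start = None
    return orfs


# Hypothetical sequence: ATG, 50 alanine codons, then a stop (156 bases).
toy = "CC" + "ATG" + "GCT" * 50 + "TAA" + "GG"
print(find_orfs(toy))
```

The 150-base floor filters out the many short spurious frames that occur by chance in degenerate proviral sequence, which is presumably why a minimum length was imposed.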
[0070]The ORF 1 of HW4TT identified in SEQ ID No. 13 encodes a Gag protein identified in SEQ ID No. 14 and the ORF 2 of HW4TT (SEQ ID No. 15) encodes a protease (SEQ ID No. 16),
[0071]the ORF1 of HW2TT identified in SEQ ID No. 17 encodes a Gag protein identified in SEQ ID No. 18 and the ORF 2 of HW2TT (SEQ ID No. 19) encodes a protein identified in SEQ ID No. 20,
[0072]the ORF of HW13TT identified in SEQ ID No. 21 encodes a Gag protein identified in SEQ ID No. 22,
[0073]the ORF of HW21TT identified in SEQ ID No. 23 encodes a Gag protein identified in SEQ ID No. 24,
[0074]the ORF of ERVWE1 identified in SEQ ID No. 25 encodes an Env protein identified in SEQ ID No. 26.
Example 2
Validation of the Loci Overexpressed in the Tumoral Testicle and Determination of the Associated Induction Factor
[0075]Principle:
[0076]Five of the six loci identified as overexpressed in the tumoral testicle by means of the high-density HERV chip were validated by real-time RT-PCR on three pairs of testicular samples. The specificity of this overexpression is evaluated by analyzing samples originating from other tissues. To this end, specific amplification systems were developed and used for the loci identified, as described in Table 2 below.
TABLE 2
Locus      Sense primer (SEQ ID No:)     Antisense primer (SEQ ID No:)
G6PD gene  TGCAGATGCTGTGTCTGG (27)       CGTACTGGCCCAGGACC (28)
HW4TT      GGTTCGTGCTAATTGAGCTG (29)     ATGGTGGCAAGCTTCTTGTT (30)
HW2TT      TGAGCTTTCCCTCACTGTCC (31)     TGTTCGGCTTGATTAGGATG (32)
HW13TT     CATGGCCCAATATTCCATTC (33)     GGTCCTTGTTCACAGAACTCC (34)
HWXTT      CCGCTCCTGATTGGACTAAA (35)     CGTGGGTCAAGGAAGAGAAC (36)
HW21TT     ATGACCCGCAGCTTCTAACAG (37)    CTCCGCTCACAGAGCTCCTA (38)
[0077]The expression of these loci is standardized with respect to that of a suitable housekeeping gene: G6PD. This quantification of expression was carried out using an Mx3005P real-time RT-PCR machine, marketed by the company Stratagene.
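The application does not state which calculation was used to standardize expression against G6PD. One widely used option for real-time RT-PCR data is the comparative 2^-ΔΔCt calculation, sketched below under the assumption that target and reference amplify with near-100% efficiency. The Ct values are invented for illustration.

```python
def relative_expression(ct_target_tumor: float, ct_ref_tumor: float,
                        ct_target_healthy: float, ct_ref_healthy: float) -> float:
    """Tumor/healthy fold change by the comparative 2^-ddCt calculation.

    Each target Ct is first normalized to the housekeeping gene measured
    in the same sample; the healthy sample then serves as calibrator.
    """
    delta_ct_tumor = ct_target_tumor - ct_ref_tumor
    delta_ct_healthy = ct_target_healthy - ct_ref_healthy
    return 2.0 ** -(delta_ct_tumor - delta_ct_healthy)


# Hypothetical Ct values for one locus and for G6PD in a sample pair.
fold = relative_expression(
    ct_target_tumor=24.0, ct_ref_tumor=22.0,
    ct_target_healthy=30.0, ct_ref_healthy=22.0,
)
print(f"factor of increase: {fold:.0f}x")
```

When the healthy sample gives no detectable signal for a locus, no finite Ct is available and the ratio cannot be computed, which is the kind of situation in which a fold change cannot be reported for a sample pair.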
[0078]Results:
[0079]The study of the three pairs of testicular samples indicates that the five loci identified, with the exception of HWXTT, the expression of which could not be quantified in the second testicular RNA pair, are overexpressed in the tumoral testicle compared with the healthy tissue (cf.: FIG. 4). The very marked nature of the overexpression, i.e. a low or even absent transcriptional expression in the healthy testicle and a high expression in the tumoral testicle, reveals the possibility of an epigenetic mechanism of regulation of transcription of these loci.
[0080]The analysis of pairs of samples originating from other tissues (colon, uterus, breast, ovary, lung and prostate) shows that the overexpression phenomenon is restricted to the tumoral testicle. Consequently, the expression of the identified loci assumes the nature of a marker specific for testicular cancer.
Example 3
Epigenetic Control of Transcription
[0081]Principle:
[0082]DNA methylation is an epigenetic modification which takes place, in eukaryotes, by the addition of a methyl group to the cytosines of 5'-CpG dinucleotides, and results in transcriptional repression when this modification occurs within the nucleotide sequence of a promoter. Apart from a few exceptions, human endogenous sequences of retroviral origin are restricted, owing to this methylation process, to a silent transcriptional state in the cells of the organism under physiological conditions.
[0083]In order to analyze the methylation status of the unique LTR or of the 5' LTR of the five loci, the "bisulfite sequencing PCR" method was used. This method makes it possible, on the basis of sequencing a representative sample of the population, to identify the methylation state of each CG dinucleotide on each of the sequences within the tissue studied.
[0084]Since the methylation information is lost during the amplification steps, it is advisable to translate the methylation information actually within the nucleotide sequence by means of the method of treating the genomic DNA with sodium bisulfite. The action of the bisulfite (sulfonation), followed by hydrolytic deamination and then alkaline desulfonation, in fact makes it possible to modify all the cytosines contained in the genomic DNA, into uracil. The speed of deamination of sulfonated cytosines (C) is, however, much higher than that of the sulfonated 5-methyl-Cs. It is therefore possible, by limiting the reaction time to 16 hours, to convert strictly the non-methylated cytosines to uracil (U), while at the same time preserving the cytosines which have a methyl group. After the sodium bisulfite treatment, the sequence of interest is amplified from the genomic DNA derived from the tumoral testicular section and from that derived from the adjacent healthy testicular section, by polymerase chain reaction (PCR) in two stages. The first PCR enables a specific selection of the sequence of interest, the second, "nested", PCR makes it possible to amplify this sequence.
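The logic of the bisulfite read-out described above can be simulated in a few lines: unmethylated cytosines become uracil and are read as T after PCR, while methylated cytosines remain C. The sketch assumes complete conversion and a single strand, and uses an invented sequence and invented methylation positions.

```python
def bisulfite_convert(dna: str, methylated_positions: set) -> str:
    """Simulate the sequence read after bisulfite treatment and PCR.

    An unmethylated C is deaminated to U and amplified as T; a C whose
    0-based index is in `methylated_positions` is preserved as C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(dna.upper())
    )


def cpg_methylation_calls(reference: str, converted: str) -> list:
    """For each CpG of the reference, True if its C survived conversion."""
    reference = reference.upper()
    return [
        converted[i] == "C"
        for i in range(len(reference) - 1)
        if reference[i:i + 2] == "CG"
    ]


# Hypothetical fragment with CpGs at indices 1, 5 and 9; the first and
# third are methylated, the second is not.
reference = "ACGTTCGACCGT"
read = bisulfite_convert(reference, methylated_positions={1, 9})
print(read)
print(cpg_methylation_calls(reference, read))
```

Comparing each sequenced clone with the unconverted reference in this way gives one methylated/nonmethylated call per CpG, which is the raw material for per-locus methylation percentages.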
[0085]Since the DNA sequence had been modified by the bisulfite, the design of the primers took into account the code change (C to U), and the primers were selected so as to hybridize to a region containing no CpG (their methylation state, and therefore their conversion state, being a priori unknown).
[0086]The sequences of the primers used are described in Tables 3 to 8 below.
TABLE 3 HW4TT locus
Amplification  Sense primer 5'→3' (SEQ ID No.:)   Antisense primer 5'→3' (SEQ ID No.:)
First PCR      CCAACATCACTAACACAACCT (39)          GGGAGTTAGTAAGGGGTTTG (40)
Nested PCR     CAACCTATTAAACAAAACTAAATT (41)       AGATTTAATAGAGTGAAAATAGAGTTT (42)

TABLE 4 HW2TT locus
Amplification  Sense primer 5'→3' (SEQ ID No.:)   Antisense primer 5'→3' (SEQ ID No.:)
First PCR      TTATTAGTTTAGGGGATAGTTG (43)         ACACAATAAACAACCTACTAAAT (44)
Nested PCR     GAGGGTAAGTGGTGATAAA (45)            AACCTACTAAATCCAAAAAAA (46)

TABLE 5 HW13TT locus
Amplification  Sense primer 5'→3' (SEQ ID No.:)   Antisense primer 5'→3' (SEQ ID No.:)
First PCR      TAGGATTTTAGGTTTATTGTTA (47)         AAAAATAAAATATTAAACC (48)
Nested PCR     ATATGTGGGAGTGAGAGATA (49)           CAACAACAAACAATAATAATAA (50)

TABLE 6 HWXTT locus
Amplification  Sense primer 5'→3' (SEQ ID No.:)   Antisense primer 5'→3' (SEQ ID No.:)
First PCR      TTGAGTTTTTTTATTGATAGTG (51)         TCTAAATCCTATTTTCCTACT (52)
Nested PCR     GTTTTTTTATTGATAGTGAGAGAT (53)       TAACAAACCTTTAATCCAAT (54)

TABLE 7 HW21TT locus
Amplification  Sense primer 5'→3' (SEQ ID No.:)   Antisense primer 5'→3' (SEQ ID No.:)
First PCR      TTTAGTGAGGATGATGTAATAT (55)         CAACTTAATAAAAATAAACCCA (56)
Nested PCR     ATAATGTTTTAGTAAGTGTTGGAT (57)       ACAATTACAAACCTTTAACC (58)

TABLE 8 ERVWE1 locus
Amplification  Sense primer 5'→3' (SEQ ID No.:)   Antisense primer 5'→3' (SEQ ID No.:)
First PCR      AATTCATTCAACATCCATTC (59)           GGTTTAATATTATTTATTATTTTGGA (60)
Nested PCR     CTCTTACCTTCCTATACTCTCTAAA (61)      AGAGTGTAGTTGTAAGATTTAATAGAGT (62)
[0087]After extraction on a gel and purification, the amplicons are cloned into plasmids, and the latter are used to transform competent bacteria. About twelve plasmid DNA mini preparations are carried out using the transformed bacteria and the amplicons contained in the plasmids are sequenced. The sequences obtained are then analyzed (cf.: FIGS. 5 to 10).
[0088]Results:
[0089]The analysis of the 5' region of the transcripts of the loci identified was carried out by means of the 5' RACE technique. In particular, it showed that transcription starts at the beginning of the R region of the proviral 5' LTR. This reflects the existence of a promoter role for the U3 region of the proviral 5' LTR.
[0090]1. Methylation state of the U3 sequences of the 5' LTR of the HW4TT locus:
[0091]The U3 sequence of the 5' LTR of the HW4TT locus of reference contains 5 CpG sites:
[0092]a) in the sample of healthy testicular tissue: out of 12 sequences analyzed, 9 are completely methylated. The other 3 each time exhibit 1 CpG nonmethylated out of the 5 contained in the U3 region. This therefore represents an overall methylation of the U3 region of the 5' LTR of the HW4TT locus amounting to 95% in the healthy testicular sample;
[0093]b) in the sample of tumoral testicular tissue: out of 12 sequences analyzed, 5 (i.e. 41.66% of the sequences) are completely demethylated, 3 sequences have 4 CpGs out of 5 nonmethylated, 2 sequences have 2 CpGs out of 5 nonmethylated, 1 sequence has 1 CpG out of 5 nonmethylated, and 1 sequence remains completely methylated. This therefore represents an overall methylation of the U3 region of the 5' LTR of the HW4TT locus amounting to 30% in the tumoral testicular sample.
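The overall methylation percentages reported for HW4TT can be checked from the clone counts: the number of methylated CpGs summed over all clones, divided by the number of CpG sites multiplied by the number of clones. The sketch below encodes the counts stated above; the function name is hypothetical.

```python
def overall_methylation(methylated_per_clone: list, cpg_sites: int) -> float:
    """Percentage of methylated CpGs over all sequenced clones."""
    total_sites = cpg_sites * len(methylated_per_clone)
    return 100.0 * sum(methylated_per_clone) / total_sites


# HW4TT, 5 CpG sites; number of methylated CpGs in each of 12 clones.
healthy = [5] * 9 + [4] * 3
tumoral = [0] * 5 + [1] * 3 + [3] * 2 + [4] * 1 + [5] * 1

print(overall_methylation(healthy, cpg_sites=5))
print(overall_methylation(tumoral, cpg_sites=5))
```

The healthy sample gives 57 methylated CpGs out of 60 (95%) and the tumoral sample 18 out of 60 (30%), in agreement with the figures stated for this locus.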
[0094]2. Methylation state of the U3 sequences of the 5' LTR of the HW2TT locus:
[0095]The U3 sequence of the 5' LTR of the HW2TT locus of reference contains 5 CpG sites:
[0096]a) in the sample of healthy testicular tissue: out of 12 sequences analyzed, 9 are completely methylated, 1 has its 2nd CpG nonmethylated, 1 has the CpG at position 4 nonmethylated, 1 has the CpGs at positions 4 and 5 nonmethylated, and 3 sequences have point mutations on one or two CpGs (one in position 3, one in position 5 and one in positions 4 and 5), very probably reflecting PCR artifacts. This therefore represents an overall methylation of the U3 region of the 5' LTR of the HW2TT locus amounting to 92.9% in the healthy testicular sample;
[0097]b) in the sample of tumoral testicular tissue: out of 12 sequences analyzed, 6 are completely demethylated, 5 sequences have one or two methylated CpG(s) (1 at position 1, 1 other at position 5, 1 on positions 1 and 5, 2 at positions 4 and 5 and 1 at position 3). Finally, one sequence has 4 CpGs methylated out of 5 (positions 1, 2, 4 and 5). This corresponds to an overall methylation of the U3 region of the 5' LTR of the HW2TT locus amounting to 20% in the tumoral testicular sample.
[0098] 3. Methylation state of the U3 sequences of the 5' LTR of the HW13TT locus:
[0099] The U3 sequence of the 5' LTR of the HW13TT locus of reference contains 3 CpG sites:
[0100] a) in the sample of healthy testicular tissue: an additional CpG, compared with the reference sequence, is found in 4 of the 10 clones studied for this locus. It is located between CpGs 2 and 3 and is methylated. In the other 6 clones, this site is mutated compared with the reference sequence. The other 3 CpGs of the U3 region are methylated in the 10 sequences analyzed. This therefore represents an overall methylation of the U3 region of the 5' LTR of the HW13TT locus amounting to 100% in the healthy testicular sample;
[0101] b) in the sample of tumoral testicular tissue: the additional CpG indicated above is also found. It is demethylated in 4 of the 10 sequences analyzed, mutated in 3 other sequences, and its methylation state is indeterminate in the last 3 sequences. 7 sequences out of 10 are completely demethylated and the other 3 are methylated on the 2nd and on the 3rd CpG. This corresponds to an overall methylation of the U3 region of the 5' LTR of the HW13TT locus amounting to 20% in the tumoral testicular sample.
[0102] 4. Methylation state of the U3 sequences of the solitary LTR of the HWXTT locus:
[0103] The U3 sequence of the solitary LTR of the HWXTT locus of reference contains 6 CpG sites:
[0104] a) in the sample of healthy testicular tissue: the 8 sequences analyzed are completely methylated, which corresponds to a methylation percentage of 100% in the healthy testicular sample;
[0105] b) in the sample of tumoral testicular tissue: the 9 sequences analyzed are completely demethylated, which corresponds to a methylation percentage of 0%.
[0106] 5. Methylation state of the U3 sequences of the 5' LTR of the HW21TT locus:
[0107] The U3 sequence of the 5' LTR of the HW21TT locus of reference contains 7 CpG sites:
[0108] a) in the sample of healthy testicular tissue: the 10 sequences analyzed all have 6 CpGs methylated out of 7; for 6 of the sequences, the 1st CpG is nonmethylated and for the other 4 sequences, the 4th CpG is nonmethylated. This therefore represents an overall methylation of the U3 region of the 5' LTR of the HW21TT locus amounting to 85.7% in the healthy testicular sample;
[0109] b) in the sample of tumoral testicular tissue: out of 8 sequences analyzed, 6 are completely demethylated, 2 others exhibit a profile identical to one of those found in the healthy testicular tissue, namely 6 CpGs methylated and the 1st CpG nonmethylated. This corresponds to an overall methylation of the U3 region of the 5' LTR of the HW21TT locus amounting to 21.4% in the tumoral testicular sample.
[0110] 6. Methylation state of the sequences of the activator and of the U3 of the 5' LTR of the ERVWE1 locus:
[0111] The ERVWE1 locus comprises, in addition to its U3 promoter region, a known activator located directly upstream of the 5' LTR, and which contains two CpG sites (CpGs 1 and 2). The U3 sequence of the 5' LTR of the ERVWE1 locus of reference contains, for its part, 5 CpG sites (CpGs 3 to 7):
[0112] a) in the sample of healthy testicular tissue: out of 10 sequences analyzed, 5 sequences have CpGs 1 and 2 (activator) and 5 (U3) nonmethylated, 1 sequence has CpGs 2 and 5 nonmethylated, 2 sequences have CpGs 1 (activator) and 7 (U3) nonmethylated, 1 sequence has only CpG 7 nonmethylated and, finally, 1 is completely methylated for the 7 CpGs. In total, this corresponds to a methylation percentage of 68.57% in the healthy testicular sample;
[0113] b) in the sample of tumoral testicular tissue: out of the 10 sequences analyzed, only 3 sequences each exhibit a single methylated CpG (CpG 4, CpG 5 or CpG 6); the other 7 sequences are completely demethylated, which corresponds to a methylation percentage of 4.29%.
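Because the ERVWE1 description gives the positions of the nonmethylated CpGs, the healthy-sample figure can also be broken down site by site. The following is a sketch only, assuming the clone profiles exactly as listed in a) above; the helper name `per_site_methylated` is hypothetical.

```python
from collections import Counter

# Healthy sample, ERVWE1: (number of clones, positions of nonmethylated CpGs).
# CpGs 1-2 belong to the activator, CpGs 3-7 to the U3 region.
healthy_clones = [
    (5, {1, 2, 5}),
    (1, {2, 5}),
    (2, {1, 7}),
    (1, {7}),
    (1, set()),  # completely methylated clone
]

def per_site_methylated(clones, n_sites=7):
    """Return {CpG position: number of clones methylated at that position}."""
    n_clones = sum(count for count, _ in clones)
    nonmethylated = Counter()
    for count, positions in clones:
        for pos in positions:
            nonmethylated[pos] += count
    return {pos: n_clones - nonmethylated[pos] for pos in range(1, n_sites + 1)}

sites = per_site_methylated(healthy_clones)
total = sum(sites.values())
print(sites)
print(round(100.0 * total / (7 * 10), 2))  # 68.57
```

Under this tally, 48 of the 70 CpG positions are methylated, which matches the stated 68.57%; the activator CpGs (1 and 2) account for 7 of 20 methylated positions and the U3 CpGs (3 to 7) for 41 of 50.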
[0114] The very high level of methylation of the U3 retroviral promoters of the loci considered, in the healthy tissue, is correlated with the low, or even absent, transcriptional expression of the U5 regions which correspond to the loci considered, indicating a repression of the transcriptional expression by an epigenetic mechanism. On the other hand, the low level of methylation of these same promoters in the tumoral tissue reflects a lifting of transcriptional inhibition, the result of which is the significantly higher expression demonstrated by means of the high-density HERV DNA chip and by means of the real-time RT-PCR.
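The contrast between the two tissues can be summarized by tabulating the percentages reported for the six loci. The following is a minimal sketch that uses only the figures stated above; the names `reported` and `methylation_drop` are illustrative and not part of the application.

```python
# Overall methylation (%) per locus as reported above: (healthy, tumoral).
reported = {
    "HW4TT": (95.0, 30.0),
    "HW2TT": (92.9, 20.0),
    "HW13TT": (100.0, 20.0),
    "HWXTT": (100.0, 0.0),
    "HW21TT": (85.7, 21.4),
    "ERVWE1": (68.57, 4.29),
}

def methylation_drop(healthy, tumoral):
    """Loss of methylation, in percentage points, between the two samples."""
    return healthy - tumoral

drops = {locus: methylation_drop(h, t) for locus, (h, t) in reported.items()}
for locus, drop in drops.items():
    healthy, tumoral = reported[locus]
    print(f"{locus:<7} healthy {healthy:6.2f}%  tumoral {tumoral:6.2f}%  drop {drop:6.2f} points")
```

On these figures, every locus loses more than 60 percentage points of methylation in the tumoral sample.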
Literature references
[0115] [1] Nickerson D. A. et al., DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene, Nature Genetics, Vol. 19, pp. 233-240 (1998).
[0116] [2] Cottrell S. E., Molecular diagnostic applications of DNA methylation technology, Clinical Biochemistry, Vol. 37, pp. 595-604 (2004).
Sequence CWU
1
621 7487 DNA Homo sapiens 1
tgagaaacag gactagttag atttcctagg ccaactaaga
atccctaagc ctagctggga 60aggtgatcgc atccaccttt aaacacgggc ttgcaactta
gctcacacct gaccaatcag 120gtagtaaaga gagctcacta aaatgctaat taggcaaaaa
caggaggtaa agaaatagcc 180aatcatctat tgcctgacac cacacgggga gggacaatga
ttgggatata aacccaggaa 240ttcgagctgg caacggcaac tccctttggg tctcctctca
ttgtatggga gctctgtttt 300cactctatta aatcttgcaa ctgcacactc ttctggtctg
tgtttgttat ggcttgagct 360gagcttttgc tggctgtcca ccactgctgt ttgctgccgt
cgcagacccc ttgctgactc 420ccacccctgc ggatctggca gggtgtctgc tgcgctcctg
atccagccag gcacccactg 480ctgctcccaa tcaggctaaa ggcttgccat tgttcctgca
tggctaagtg cccgggttcg 540tgctaattga gctgaacact agtcgctggg ttccacagtt
ctcttccgtg acccacagct 600tctaatagag ctataacact cactgcatgg cccaacattc
cattccttgg aatctgtgag 660gccaagaacc cccggtcaga gaacaagaag cttgccacca
tcttggaagc agcccgccac 720cattttggga gctctaagaa caaggacccc ccagtaacat
tttggtgacc acgaagggac 780ctccaaagca gtgagtaata ttgaaccact tccgcttgct
attctgtcct aaccttcctt 840agaattggag gaaaataccg ggcacctgtc ggccagttaa
gaacgattag cgtggccgcc 900agacttaaga ctctggtgtg aggctgtctg ggaaagggct
ttctaacaac ccccaaccct 960tccgggttgg gagctttggt ctgcctggaa ccagcttcca
ctttcaattt tcctggggaa 1020tccaagggct gactagaggc agaaagctgt catcccgaac
tcctggcatt agacagttga 1080gatcgtggcg cagccagaag tctctactca acagtcaccc
atgcgtgcac ccctaccttt 1140ccttctaacc catacctccc gggtcccaac catgactttc
ttgaaagtgt agcccctaaa 1200ttctctttac ctctaaatct acttcttctc atccctgctt
cctaggtact aatggttcag 1260actttcattt cctctagcaa gttctatctc cagagggatc
taaggaaggg atctatgctg 1320tgtccttagg cccctaggct atgaacccag agagtcttct
ccctgttatc tctccccatt 1380taggcataca gctctcaaca tggacagtta tgtgggaccc
attccctacc acccttgcca 1440gggccccaag ttttcaaagg gctagaagaa aaaagagaga
aagagagaga gaggcagagg 1500ggagagaaag agagagagac aaagagggag tcaaagagag
atagaaagag aaagatagaa 1560ctagtaaaga aaaaaagtat gccccattcc tttaaaagcc
agggtaaatt taaaacctat 1620aattgataat tgaaggtctt ctccatgacc ctataacact
ccaataccac cttgttttca 1680gtgtaaacaa gggtgtagcc cgaaaacact gagaccactg
acaacccata gccttcctat 1740caaaaatcct taacccagga acccatggat ggcccaaatg
cattcaatct gtagcagcaa 1800ctgctttgct aacagaagaa agtagaaaag taacttttag
agaaaacctc attgtgagca 1860cacctcacca gttcagaatt attctaagtc aaaaaagcaa
aaaggtagct tactaactca 1920aaaatcttaa agtatggggt tattctgtta gaaaaaggtg
atttaacatt aaccactgaa 1980aattccctta acccagcagg tttcctaatg ggatttaaat
cttcattacc atacaaaggt 2040ccgaccagac ccagcaggaa ctccctttag gacaggatga
tagatggttc ctcctgggtg 2100attgaggggg tgaaaaacca caatgggtgt tcagtaattg
atagggagac tcttgtggaa 2160ggagagttag gaaaattgcc taataattgg tctgctcaaa
tgtgcgagct gtttgcactc 2220agccaagcct taaagtactt acagaatcaa aaagactcta
tctcaatcct gactcaaaat 2280gttacctaca ccatctctga catgaatttg cataagaact
gttgtttatg ggaatgcatc 2340ttgatggggc agctgggttg ttatgaaata ctcaggaacc
cagcccaggt ctagaattca 2400cctctgagcg caaaggcaat gttggccatg ctggtaaagg
accactagaa tccaggagcc 2460tggacccctt tctttgtggt caagaaaggc gggaaaacag
gtgcaggact gctacatcag 2520agagcataac aaatccgata agcagagttc catgagtggt
taagcaccct ggaaaggaac 2580tcacctctga gtgcaaaggc aatgttaggc acaccagtaa
aggaccacta gaatccagca 2640gcccagaccc ctttctttgt gatcaagaaa ggcgggaaaa
ggggtgcagg actgctacat 2700cagtgagcgt aactaatctg ataagcagaa gtccatgggt
ggttacgcac cctggaaagg 2760aataagcatt aggaccacag aggacactct aagactaatg
ctcattggaa aatgactagg 2820ggtgctggca tccctatgtt tttttttcag atgggaaaca
ttccccccaa ggcaaaaacg 2880cccataagat atattctgga gaattcggcc cagagtgtat
gtatcttttt tccctgtcag 2940acttgaagca aacctaggta aattatcaga tagccctgat
ggctatattg atgctttaca 3000agggttagga caatcctttg atctaacatg gagagatata
ctgttactgc tagatcagac 3060actaatccca aatgaaagaa gtgccaccat aactgcagcc
agagagtttg atgatctctg 3120gtatctcagt caggtcaatg ataggatgac aacagaagaa
agaaaacaat tccccacagg 3180ccagcaggca gttcccagcg tagaccttca ttgggacaca
gaatcagaac atggagattg 3240gtgccgcaga catttactaa cttgcgcgct agaagcacta
aggaaaacta ggaagaagcc 3300tatgaattat tcaatgatgt ccactataac acagggaaag
gaagaaaatc ctactgcctt 3360tctggagaga ctaagggagg cattgagaaa gcatacctct
ctgtcacctg actctattga 3420aggccaacta atcttaaagg ataagttttc cactcagtca
gctgcagaca ttagaaaaaa 3480acttcaaaag tctgcgttag gccgggagca aaacttagaa
accctattga acttggcaac 3540ctcagttttt tatgatagag atcaggagga tcaggtggaa
tggacaaatg agattttaaa 3600aaaaggccac cactttagtc atggccctca ggcaagcaga
ctttggacac tctggaaaag 3660ggaaaagctg ggcaaatcga atgcctaata agacttgctt
ccagtgtggt ctacaaggac 3720actttaaaaa agattgtcca aatagaaata agccaccccc
tcgtccatgc tccttatgtc 3780aagggaatca ctggaaggcc tactgcccca ggggatgaag
gtcctctgag tcagaagcca 3840ctaaccagat gattcagccc caggactcag ggtgcccagg
gcaagcgcca gcctatgcca 3900tcaccctcac agagccctgg gtatgcttga ccattgaggg
tcaggaggtt aactatctcc 3960tggacactgg cgtggccttc tcagtcttac tctcctgtcc
cggacaactg tcctccagat 4020ctgtcactat ccgagggttt ctacgacagc cagccactag
atacttctcc cagccactaa 4080gttgtgactg gggaactcta ctcttttcac atgtttttct
aattatgcct gaaagcccca 4140ctcctttgtt agggaaagac attctagcaa aagcaggggc
cattatacac ctgaacatag 4200gagaaggaac acctgtttgt tgtcccctgc ttgaagaagg
aattaatcct gaagtctgga 4260caacagaagg acaatacaga tgagcaacaa atgcctgtcc
tgttcaagtt aaactaaagg 4320attatgcctc ctttccctac caaaggcagt acccccttag
acccgaggcc caacaaggac 4380tccaaaagat tgttaaggac ctaaaagctc aaagcctagc
aaaaccatgc agtagcccct 4440gcaatactcc aattttagga gtacagaaaa ccaacagaca
gtggaggtta gtgcaagatc 4500tcaggattat caatgaggct gttgttccta acccttatac
tctgctttcc caaataccag 4560aagaagcaga gtggtttaca gtcctggacc ttaaggatgg
ctttttctgc atccctgtac 4620atcctgactc tcaattcttg tttgcctttg gagatccttc
gaacccaatg tctcaactca 4680gcttgactgt tttaccccaa gggttcaggg atagccccca
tctagttggc caagcattag 4740ccgagccagt tctcctacct ggacactctt gtcctctggt
acatggatga tttattttta 4800gctgcccgtt cagaaacctt gtgccatcaa gccacccaag
tgctcttaaa tttcctcgcc 4860acctgtggct acaaggtttc caaaccaaag gctcagctct
gctcacagca ggttaaatac 4920ttagggctaa aattatccaa aggcaccagg gccctcagtg
aggaatgtat ccagcctgta 4980ttggcttatc ctcatcccaa aaccctaaag caactaagag
ggttccttgg cataacaggt 5040ttctgccaaa tgtggattcc caggtacggt gaaatagcca
ggccattata taccctaatt 5100aaggaaactc agaaagccaa cacccattta ttaagatgga
cacctgaagc agaagcagct 5160ttccaggccc taaagaaggc cctaacccaa gccccagtgt
taagcttgcc aacggggaag 5220acttttcttt atatgtcaca gaaaaaacag gaatagctct
aggagtcctt agacaggtcc 5280aagggatgag cttgcaacct gtggcatacc tgagtaagga
aattgatgta gttgcaaagg 5340gttgacctca ttgtttacag gtagtggcgg cagtagcagt
cttagtatct gaagcagtta 5400aaataataca gggaagagat cttactgtgt ggacatctca
tgatgtaaac ggcgtactca 5460cttctaaagg agacttgtgg ctgtcagaca accgtttact
taaatatcag gctctattac 5520ttgaagggcc agtgctgcga ctgcccactt gttcaactct
taacccagcc acatttcttt 5580cagacaatga agaaaagata gaacataact gtcaacaggt
gattgctcaa acctacggcg 5640ctcgagggga ccttctagag gttcccttga ctgatcccaa
cctcaacttg tatactgatg 5700gaagctcctt tgtagaaaaa ggactttgaa aggtggggta
tgcagtggtc agtgataatg 5760gaatacttga aagtaattcc ttcactccag gaactagtgc
tcagctggca gaactaatag 5820ccctcactca ggcactagaa ttaggagaag gaaaaagggt
aaatatatat gcagactcta 5880agtatgctta cccagtcctc cacgcccaca cagcaatatg
gagagatagg aaattcctaa 5940cttctgaggg aacaccgatc aaacatcagg aagccattag
gagattatta ttggctgtac 6000agaaacctaa agaggtggca gtcttacact gctggggtca
tcagaaagga aaggaaaagg 6060aaatagaaag gaaccaccaa gtggatattg aagccaaaag
agccacaagg caggccctcc 6120attagaaatg cttatagaag gatccctagt atggggtaat
cccctccggg aaaccaagcc 6180ccagtactca gcaggagaaa tagacacgag gacatagttt
cctcccctca ggatggctag 6240ccaccgaaaa agggaaaata cttttgcctg cagctaatca
atggaaatta cttaaaaccc 6300ttcaccaaac ctttcacttg ggcatggata gcatctatca
gatggccaat ttattattta 6360ctggaccagg ccttttcaaa actatcaagc agatagtcag
ggcctgtgaa atgtgccaaa 6420gaaataatcc cctgcacttc aagccataca tttcaatccc
tgtatcttta acctcctgtt 6480gtttgtctct tccagactca aagctgtaaa actgcaaatg
gttcctcata tggagcccca 6540gatgcagtcc atgactaaga tctaccacag agccctagac
cggcctgtta gcccatgctc 6600cgatgttgat gacatcaaag gcacaccttc cgaggaaatc
tcaactgcac gacccctact 6660aagccccaat tcagcaggaa gcagttaaga gcagtcgttg
gctaacatcc ccaatagtat 6720gtgggttttc ctgttgagag gggggactga gagacaggac
tagctggatt tcctaggcca 6780actaagaatc cctaagccta gttgggaagg tgaccgcatc
cacctttaaa cacggggctt 6840gcaacttagc tcacacccga ccaatcaggt agtaaagaga
gctcactaaa atgctaatta 6900ggcaaaaaca agaggtaaag aaatagccaa tcatctatcg
cctgagagca cagtggggag 6960ggacaatgat cgggatataa acccaggcat tcgggccggc
aacggcaacc cccattgcgt 7020cccctcccat tgtatgggag ctctgttttc attctattaa
atcttgcaac tgcacactct 7080tctggtctat gtttgttatg gctcgagctg agctttcgct
cgctgtccac cactgctgtt 7140tgccgccatc gcagacccac cactgacttc cacctctgca
gatctggcag ggtgtccgct 7200gtgctcctga cccagcgagc cacccattgc tgctcccaat
caggctaaag gcttgccatt 7260gttcctgcat ggctaagagc ccagggttcg tcctaatcga
gctgaacgct agtagctggg 7320ttccacagtt ctcttccgtg acccacggct cctaatagag
ctataacact caccacatgg 7380cccaaggttc cattcattgg aatccgtgag gccaagaacc
cccggtcaga gaacaagaag 7440cttgccacca tcttggaagc tctaaaaaca gagacacccc
agtaaca 7487
2 7774 DNA Homo sapiens 2
tgagacacag gactagctgg
atttcctagg ccgactaaga atccctaagc ctagctggga 60aggtgaccac atccaccttt
aaacacgggg tttacaactt agctcacacc cagccaatca 120gagagctcac taaaatgcta
attaggcaaa aacaggaggt aaagaaatag ccaatcatct 180attgcctgag agcacagcgg
gagggacaag gattgggata taaacccagg cattcgagct 240ggcaacggca accccctttg
ggtctcctcc ctttgcatag gagctctgtt ttcactctat 300taagtcttgc aactgcactc
ttctggtccg tgtttcttac cgcttgagct gagctttccc 360tcactgtcca ccactgctgt
tttgccaccg tcacaggccc accgctgact tccattcttc 420tggatctagc aggctgtcca
ctgtgctcct gatccagcga ggcgcccatt gccgctcccg 480attgggctaa aggcttgcca
ttgttcctgc atggctacgt gcctgggttc atcctaatca 540agccgaacac tagtcactgg
gttccacggt tctcttccat gacccacgac ttctaataga 600actataacac tcacctcatg
gcccaagatt ccattccttg gaatccatga ggccaagaac 660cccaggtcag agaacacgag
gcttgccacc atcttggaag tggccccacc accatcttgg 720gagctctggg agcaaggacc
cccggtaaca ttttggcgac cacaaaggga catccaaagt 780ggtgagtaat attggaccac
tttcacttgc tattctgttc tatccttcct tagaactgga 840ggaaaatacc aggcacaggc
acctgtcagc cagttaaaaa caattagcgt cgccgccaca 900cttaagactc aggtgtgagg
ctatctgggg aaagactttc taacaacccc caacccatct 960agtggggatg ttggtctgcc
tggagacagc ttccactttc aattttcttg gggaagccga 1020gggctcacta gaggcagaca
gctgttgtcc caaactccgg gcagtagccg gttgagatca 1080tggtgcagcc aggagtctct
actcagcagt cgccgatgca tgtgccccta ccttcccttc 1140tgacccatac atcctgagtc
ccgactgtga ctttcttgaa agtgtagccc caaaattctc 1200cttacctctg aatctacttc
ctctgatccc tgcctcctgg gtactaatga ttcagacttt 1260catttcctct agcaagttgt
gtctccaaag ggatctaagg aggctctacg ctgcatcctt 1320aggcacctag gctataaccc
aaggagtctt atccctggtg tccctcccga tttgggtata 1380caactctcaa catgggcagt
tatgtaggac ccattcccca ccacacttgc cagggcccca 1440agtttgtaat ggctaagaga
gagacacaga gagagagaga gagatggaga gagagacaag 1500gagggagtca aagagaaaaa
gaaagaaaaa gaaatagtag aaaaaaaagt gtgccctatt 1560cctttaaaag ccagggtaaa
tttaaaacct gtaattgata attgaaggtc ttctccgtga 1620ccctgtaaca ctccaatgcc
attttgttgt cagtgtaaat aagggcatag cccaaaagca 1680ctgaggtcac tgacaacccg
tagctttccc atcaaaaatc cttaacccag taatccgcgg 1740atgggccaaa tgcattcagt
cggtagcagc aaccgctttg ctaaaagtag aaaagtaact 1800tttagaggaa acctcattgt
gagcgcacac ctcaccagtt cagaattatt ctaagtcaaa 1860aaaaaaaaaa gcaaaaaggt
aacttactaa ctcaaaaatc ttaaagtata ggtctatcat 1920attagaaaag ggtaatgtaa
ctccaaccac tgataattcc cttaacccag cagatttcct 1980aacaggggat ttaaaactta
attaccatac aaaggtccca ccagacctag gaggaactcc 2040cttcaggaca ggacgataaa
cggttcctcc caggtgattg aggaaaaaaa ccacaatggg 2100tattcagtaa ttgatacaga
gactcatgtg gaagcagtta gaaaaattgc ctaataattg 2160gtctcctcaa acgtgtaagc
tgtttgcact cagccaagcc ttaaagtact tacagaatca 2220aaaagactct gaatcctgac
tcaaaaggtt tgctacaccc tctgtgaaac aaatttgcat 2280aagaactgtt gtttatggga
aggcatcttg atggggcagc tgggttgtta tgaaatactc 2340aggaccccag cccggctcta
ggactcaccc ctgagcgcaa aaggcaatgt tgggcacgct 2400ggtaaaggac cactagaatc
cagcagcccg gacccctttc tttgtggtca agagaggcgg 2460gaaaacaggt gcaggactgc
tacatcagtg agcataacta atccagtaag cagaggtcca 2520tgggtggtta tgcaccctgg
aaaagaatac gcattaggcc cttagaggat gctctaggac 2580taatgctcat cggaaaatga
ctaggggtgc tgacatccct atgttctttt ttcagatggg 2640aaacgttcct cccaccccaa
ggcaaaaaac acccctaaga tgtattttgg agaattagga 2700ccaatttgac cctcagacac
taagaaagaa atgacttaca ttcttctgca gtaccatgat 2760atcctcttca agggggagaa
acctggcctc ctgagagaag tataaattat aacaccatct 2820tacagtgaga cctcttctgt
agaaaggagg gcaaatggag tgaagtgcaa actttccttt 2880cattaagaga caactcgcaa
ttatgtaaaa agtgtgattt atgccctaca gaaagccctc 2940agtctacctc cctatcccag
ggtccccccg attcctttcc caactaataa ggacccccct 3000tttacccaaa tggtccaaag
gagatagatg aagggataaa caatgaacca aacagtgcca 3060atattccctg attatgcccc
ctccaggcag tgggaggagg agaattcggc ccagccagag 3120tgcatgtacc tttttttttc
tctcagactt aaagcaaatt aaaatagacc taggtaaatt 3180ctcagataac cctgatggct
atattgatgt tttacaaggg ttaggacaat cctttgctct 3240gacatggaga gatataatgt
tactgctaaa tcagacacta accccaaatg agagaagtgt 3300caccatagct gcagcccaag
agtttggcaa tctctggtat ctcagtcagg tcaatgatag 3360gatgacaaca gaggaaaggg
aatgattccc cacaggccag caggcagttc tcagtgtaga 3420ccctcactgg gacacagaat
aagaacatgg agatcggtgc cgcagatatt tgctaacttg 3480cgtgctagga ctaaggaaaa
ctaggaagaa gcctatgaat tattcagtga tgtccactat 3540aacacaggga aaggaagaaa
atcatactgc ctttccggaa atactaaggg aggcattgag 3600gaagcatacc tctctgtcac
ctgactgtat tgaagtccaa ctaatcttaa aggatatgtt 3660tatcactcag tcagctgcag
acattagaaa aaacttcaaa agtccacctt aggcccagag 3720caaaacttag aaaccctatt
gaacttgtta acctcagttt tttataatag agatcaggag 3780gagcaggcgg aacaggacaa
acaggattaa aaaaagacca ccgctttagt catggccctc 3840aggcaagtgg actttggaag
ctctggaaaa gggaaaagct gggcaaattg aatgcctaat 3900agggcttgct tccagtgtgg
tctacaagga cacttaaaaa aagattgtcc aagtagaaat 3960aagctgcccc ttcgtccatg
cctcttatgt caagggaatc actggaaggc ccattgcccc 4020aggggaggaa ggtcctctga
gtcagaagcc actaaccaga tgatccagca gcaggactaa 4080gggtgcccag ggcaagcccc
agcccatgcc atcaccctca cagagccccg ggtatgcttg 4140accattgagg gccaggaggt
taactgtctc ctgaacactg gcacagcctt ctcagtctta 4200ctttcctgtc ccggacaact
gtcctccaga tctgtcacta tctgagcggt cctaggacag 4260ccagtcacta gatatttctc
ccagccacta agttgtgact ggggaacttt actcttttca 4320catgcttttc taattatgcc
tgaaagcccc actcctttgt tagggagaga cattctagca 4380aaagcagggg ccattataca
tctgaacata ggagaaggaa cacccgtttg ttgtcacctg 4440cttgaggaag gaattaatgc
tgaagtctgg gcaacagaag gacaatatgg atgagcaaag 4500aatgcccatc ctgttcaagt
taaattaaag gattccgcct cctttcccta ccaaaggcaa 4560taccccctta gacccgaggc
ccaacaagga ctccaaaaga ttgttaagga cctaaaagcc 4620caaggcctag taaaaccatg
caatagcccc tgccatactc caattttagg agtaaggaaa 4680cccaacggac agtggaggtt
agtgcaagaa ctcaggatta tcaatgaggc tgttgttcct 4740ctatacccag ctgtacctaa
cccttataca gtgctttccc aaataccaga ggaagcagag 4800tggtttacag tcctggacct
taaggatgcc tttttctgca tccctgtacg tcctgactct 4860caattcttgt ttgcctttga
agatcctttg aacccaacgt ctcaactcac ctggactgtt 4920ttaccccaag ggttcaagga
tagcccccat ctatttggcc aggcattagc ccaagacttg 4980agccaattct catacctgga
cactcttatc cttcggtatg gggatgattt aattttagct 5040acccattcag aaacgttgtg
ccatcaagcc acccaagtgc tcttaaattt cctcgctacc 5100tgtggctaca ggtttccaaa
cgaaaggctc agctctgctc acagcaggtt aaatacttag 5160ggctaaaatt atccaaaggc
accagggccc tcagtgagga acgtatccag cctatactgg 5220cttattctca tcccaaaacc
ctaaagcaac taagagcatt ccttggcata acaggctgct 5280gctgaatatg gattcccagg
tacagtgaaa tagccaggcc attatacaca ctaattaagg 5340aaactcagaa agccaatacc
catttagtaa gatggacacc ttaagcagaa gcggctttcc 5400aggccttaaa gaaggcccta
acccaagccc cagtggtaag cttgccaaca gggcaagact 5460tttctttata tgtcacagaa
gaaacaggaa tagctctagg agtccttaca caggtctgag 5520ggatgagctt gcaacccatg
gcatacctga gtaaggaaac tgatgtagtg gcaaagggtt 5580ggcctcattg tttacgggta
gtggcagcag tagcagtctt agtatctgaa gtagttaaaa 5640taatacaggg aagagatctt
actgtgtgaa catctcatga tgtgaatggc atagtcactg 5700ctaaaggaga cttgtggctg
tcagacaact gtttacttaa ataccaggct ctattacttg 5760aagggccagt gctgcgactg
tgcacttgtg caactcttaa cccagacaca tttcttccag 5820acaatgaaga aaagatagaa
cataactgcc aacaagtaat tgctcaaacc tatgccactc 5880gaggggacct tttagaggtt
cccttgactg atcccaacct caacttgtat actgatggaa 5940gttcctctgt agaaaaagga
ctttgaaaag tggggtatgc agtggtcagt gataatggaa 6000tacttgaaag taatcccctc
actccaggaa ctagtgctca gctggcagaa ctaatagccc 6060tcactcgggc actagaatta
ggagaagaga aaagggtaaa tatatacaga ctctaagtat 6120gcttacctag tcctccatgc
ccatgcagca atatggagag aaagggaatt cctaatttcc 6180aagggaacac ctatccaaca
tcaggaagcc attaggagat tactattggc tgtacagaaa 6240cataaagagg tggcaatctt
acactgccgg tgtcaccaga aaggaaagga aagggaaata 6300gaaaggaacc accaagcgga
tattgaagcc aaaagagccg caaggcagga ccctccatta 6360gaaatgctta tagaaggacc
cctagtatgg ggtaatcccc tccaggaaac caagccccag 6420tactcagaag aagaaataga
atgaggaacc tcacaagcac atagtttcct cccctcagga 6480tggctagcca ctgaagaagg
aaaaatactt ttgcctgcag ctaaccaatg gaaattactt 6540aaaacccttc accaaacatt
tcccttaggc attgatagca cccatcagat ggccaaatta 6600ttatttactg gaccaggcct
tttcaaaact atcaagcaga tagtcagggc ctgtaaagtg 6660tgccaaacaa gtaatcccct
gcactgcagg ccatacattt caatccctgt atctttaacc 6720tccttgttaa gtttgtctct
tccagaatca aagctgtaaa actacaaata gttcttcaaa 6780tggagcccca gatgtagtcc
atgactaaga tctaccgcgg acccctggac aagcctgcta 6840gcccatgctc tgatgttaat
gacatggaag gcacccctcc cgaggaaatc gcaactgcac 6900aacccctatt acaccccaat
tcagcaggaa gcagttagag cattcatcag ccaacctccc 6960caacagcact tgggttttcc
tattgagagg gggtactgag agacaggact agctggatgt 7020cctaggctga ctaagaatcc
ctaagcctag ctgggaaggt gaccacatcc acctttaaat 7080acggggcttg caacctagct
cacacccaac agatcagaga gctcgttaaa atgctaatta 7140ggcaaaaaca ggaggtaaag
aaatagccaa tcatctattg cctgagagca cagcaggagg 7200gacaaggatt gggatataat
cccaggcatt cgagctggca acagcaaccc cctttgggtc 7260ccctcccttt gtatgggagc
tgttttcact ctatttcact ctattaaatc ttgcaactgc 7320actcttctgg tgcatgtttg
ttactgcttg agctgaactt tcactcgcca tctaccactg 7380ctgttttgcc gccgtcgcag
acccactgct gacttccatt cttctggatc cagcagggtg 7440tccactgtgc tcctgatcca
gtgaggcacc cattgccgct cccgatctgg ctaaaggctt 7500gccattgttc ctgcatcgct
aagtgcctgg gttcgtccta atcaagctga acactagtca 7560ctgggttcca cagttctctt
ccatgaccca cgacttctaa tagagctata acactcacct 7620tatggcccaa gattccattc
cttggaatcc atgaggccaa aaaccccagg tcagagaaca 7680tgagacttgc caccatgttg
aagtggcctg ctgccatttt ggaagtggcc caccaccatc 7740ttgggagctc tgggagcaag
gacccctggt aaca 7774
3 5470 DNA Homo sapiens 3
tgagagacag ctggatttcc taggccgact aagaatccct aagcctagct gggaaggtga
60ccgcatccac ctttaaacac agggcttgca acttagctca cacccaacca atcagagagc
120tcactaaaat gctaattagg caaaaacagg aggtaaagaa atagcaagtc atctattgcc
180tgagagcaca gtgggaggga caaggaccag gatataaacc caggcatttg agccagcaac
240ggcaacctcc tttgagtccc ctccctttgt ataggagctc tgttttcact gtgtttcact
300ctattaaatc ttgcaattgc actcttctgg tccatatttg tcacggcttg agctgagctt
360tcacttgccg tccaccacta ctgtttgctg ctgtcacaga cccgccgctg actcccatcc
420cgctgctgac tcccatccct ccggatccgg cagggtgtcc gctgtgctcc tgatccagca
480agactcccat tgccactccc gatagtgcta aaggcttgcc attgttcctg catggctaag
540tgcctgggtt cgtcctaatc cagctgaaca ctagtcactg ggttccacgg ttctcttcca
600tgacccgcgg cttctaatag agctataaca ctcaccacat ggcccaatat tccattcctt
660ggaatccgtg aggccaagaa ccccaggtca gagaacacga ggcttgccac catcttggaa
720gcagcctgcc accatcttgg aagtggctca ccaccgtctt gggagttctg tgaacaagga
780cccctggtaa cattttggcg accacgaagg gacatccaaa gctgtgagta atattggacc
840actttcgctt gctattctgt tctatcctta gaactggagg aaaatactgg gcacctgtcg
900ccagttaaaa atgattagca tggccgccgg acttaagact caggtgtgag gctatctggg
960aaagggcttt ctaacaaccc ccaagccttc tgttgggaac tttggtctgc ctggagccag
1020cttccacttt caattttctt ggggaagcca agggctgact ggaggcagaa agctgttgtc
1080ccgaactccc ggcagtagcc ggttgagatc atggcgcagc cagaagtctc tactcggcag
1140tcgcccatgc gtgcgccctt acctttcctt ctgaattata cctccggggt cccgactccg
1200actttcttga gagtttagcc ccaaaattct ccttacctct gaatctactt cctttgatcc
1260ctgcctcctg cctcctaggt actaatagtt cagactttca tttcctctag caagttgtgt
1320ctccaaaggg atctaaggag gctctatgct gtgtccttag gcacctaggc tataacccag
1380ggagtcttat ccctggtatc cctcccgatt taggtataca gctcttgaca tgggcagtta
1440tgtgggacct gttccccacc acccttgtga gggccccaag tttgtaatgg ctaagaaaga
1500gagacggaga gagagagaga cggagaaaga gacaaagagg gagtcaaaga gaaaaagaaa
1560gaaaaagata gaaatagtta aaaaaaaaaa aaagtgtgcc ctattccttt aaaagccagg
1620gtaaatttaa aacctgtaat tgataattgc cactttgttg tcagtgtaaa taagggcgta
1680gcaaatcctt aacccagtaa cccgcggata ggccaaatgc attcagtcgg tagcggcaac
1740agctttgcta aaagtagaaa agtaactttt agaggaaacc tcattgtgag cacacctcac
1800cagttcagag ttattctaag taaaaaaaaa aaaaaaaaaa aaagcaaaaa ggtagcttac
1860taactcaata atcttaaagt atggggctac tatgctagaa aagggtaatg taactccaac
1920cactgataac tcccttaacc cagcagattt cctaacaggg gatttaaatc ttaattacca
1980cacgaaggtc cgaccagacc taggaggaac tcccttcagc acaggacgat agatggttcc
2040tcccaggtga ctgaggaaaa aactacaatg ggtattcagt aattggtatg gagactcttg
2100tggaagcaga gttaaaaatt tgcctaataa ttggtctcct caaatgtgcg agctgtttgc
2160actcagccaa gccttaaagt acttacagaa tcaaaagact atctcaatcc tgactcaaaa
2220ggttagctac acagtctctg aaatgaattt gcagaagaac tgttgtttat gggaatgcat
2280cttgatgggg cagctgggtt gttatgaaat actcaggaac ccagcccagc tctaggactc
2340accgctgagc gcaaaggcaa tgttgggcac gctggtaaag gaccactaga atccagcagc
2400ccaggcccct ttctttgtgg tcaagaaagg caggaaaagg agtgcagaac tgctacattg
2460gtgagcgtaa ctaatccaat aagcagaggt ccatgagtgg ttatgcacgc tggaaaagaa
2520taagcattag gcccttagag gatgctctag gactaatgct catcggaaaa tgactagggg
2580tgctggcatc cttatgttct ttcttcagat gggaaacgtt ccccccaagg caaaagcgcc
2640cctaagatgt attctggaga attagaacca atttgaccct cagatgtcaa gaaagaaacg
2700acttatattc ttctgcagta ctgcctggcc acgatatcct cttcaagggg gagaaacctg
2760gcctcctgag ggaagtacaa attataacac catcttacag ctagacctct tttgtagaaa
2820agaaggcaaa tggagtgaag tgccatatgt gcaaactttc ttttcattaa gagacaactc
2880acaattatgt aaaaagtgtg gtttatgtct tacaggaagc cctcagagtc tacctcccta
2940tcccagcatt cccccgactc cttccccaac taataagcac cacccttgaa cccaaacagt
3000ccaaaaggag atagacaaac aggtaaacaa tgaaccaaag agtgtcagta ttccccgatt
3060atgccccttc caagcagtgg gaggaggaga attcggccca gccagagtgc atgtaccttt
3120ttctctctca gacttaacgc aaattaaaat agacttaggt aaattctcag ataaccctga
3180tggctacatt gatgttttac aagggttagg gcaatccttt gatctgacat ggagagatat
3240aatgttactg ctaaatcaga cactaacccc aaatgagaga agtgccgccg taactgcagc
3300ccgagagttt ggtgatctct ggtatctcag tcaggtcaat gataggatga caacagagaa
3360aagagaacga ttccccacag gccagcaggc agtttccagt gtagaccctc attaggacac
3420agaatcagaa catggagatt ggtgccacag atatttgcta acttgagtgc tagaaggact
3480aaggaaaact aggaagaagc ctatgaatta ttcagtgatg tccactataa cacaaggaaa
3540ggaagaaaat cctactgcct ttctggagag agtaagggag gcattaagga agcatacctc
3600cctgtcacct gactctattg aaggccaact aatcttaaag gataagtttg tcactcagtt
3660agctgcagac attagaaaaa aacttcaaaa gtccgactta ggcctggagt acggctgagt
3720gcccaatttg gcagcaggca agaccaacac tgagcccttc atatggcacc atgctttgtg
3780gtgatcagcc aactacttga tggcaggttg attatattgg acatctttca tcagagaaat
3840ggcagtggtt tgtccttcct ggaatagaca cttattctcg atatgggttt gtctatcctg
3900caggcaatgc ttctgccagg agtaccatct gtggactcat ggaaagcctt atccaccatc
3960atggcattcc acacagcatt gcctctaaac aaggcactta ttttatagct aaggaagtgt
4020ggcagtgggc tcatgctcat ggaattcact gattgtatct tgttgcccat tatcttaaag
4080cagctggatt gatagaacag tggaaaggcc atttgaaatc acaattacac caccaactag
4140gtgacaatac tttgcagggc tcggcaaagt tctcttgaag gctgagtatg tcctgaatca
4200gcatccaata tatggtactg tttccctcat agccagcatt cacaggccta agaatcaagg
4260ggtagaagta gaagtggcac cactcaccat cactcctagt gacccactag caaaaatttt
4320acttccagtt cccccaacat tatgttctgc tggccttagt tccagaggga agaattctgc
4380caccagtcga cacaagaatg ataccattaa actgaaagtt aaaattgcca cctggccact
4440ttgggctcct cccacctcta agtcaacagg tcaagaaagg agttacagtg ttgacttggg
4500tgattgacct ggactatcaa gatgaaatca ggttactact ccacagtgga ggtaaggaag
4560aatatgtgtg gaatacagga gatcccttag gccgtctttt agtactacca tgccctgtga
4620ttaaggtcag tggaaaacta caacaatcca atctaggcag gactacaaat ggcccagact
4680cttcaggaat gaagggttgg gtgacttcac caggtaaaaa aataacagcc tgctgaggtg
4740ctagctgaag gcaaagggaa tacagaatgg ttagtagaaa aaggtagtca tcaataccag
4800ctatgaccac aagaccagtt gcagaaatga gacctgtaat tgtcatgtgg atttcctcct
4860tacatgtttg tgcatgtata cacttctact aagaaaatac ctttatttat ttcctttgct
4920tttcccttat caagtgacat tattaacttc atatcagcag ttaagtgtta ttaactttat
4980gtaatagcat ttcggttaat aattcacttc tggttgtatg aaggatagcc gtattaagtt
5040aggtgtaatt atgacatcat tattgtcttt atttgaagat tatgtgtaat ttcaggagat
5100gtgtatgggt tcaagttgac aagggatgga cttgtgatgg ctaatgttga gtgtcaactt
5160gactgaggat gcaaagtatt gttcctgggt gtgtctgtga gggtgttgcc aaaggagatt
5220aacatttgtg tcagtgaact gggagatgca gacccacccg caatctgggt gagcaccatg
5280taatcagctg ccagagcagc tagaataaag caagcagaag aaggtggaag gagctgactt
5340gctgagtctt ctagtattct tcgttcttct atgctggttg cttcctgccc ccaaacatca
5400gtctgcaagt tcttctgctt ttggactctt ggacttacac cagtggtttg ccagggactc
tcgggccttc 5470
4 783 DNA Homo sapiens 4
tgagagacag gactaactgg atttcctagg ccgactaaga
atccctaagc ctagctggga 60aggtgaccgc atccatcttt aaacacgggg cttgaaactt
agctcacacc taaccagtca 120gagagctcac taaaatgcta attaggcaaa aaacaggagg
taaagaaata gccaatcatc 180tattgcctga gagcacagcg ggagggacaa ggatcgggat
ataaacccag gcattcgagc 240cagcaatggc aacccccttt gggtcccctt cccttgtatg
ggagctctgt tttcactcta 300tttcactcta ttaaatcttg caactgcact cttctggtcc
atgtttgtta cggctcgagc 360tgagctttgg ctcgccatcc accactgctg tttgccgccg
tcgcacacct gctgctgact 420cccatccctc cggatccagc agggtgtgtc cgctgtgctc
ctgatccagc gaggtgccca 480ttgccgctcc tgattggact aaaggcttgc cattgttcct
gcacggctaa gtgcccgggt 540tcgtcctaat ccagctgaac actagtcact gggttccacg
gttctcttcc ttgacccacg 600gcttctaata gagctataac actcaccgca tggcccaaga
ttccattcct tggaatctgt 660gaggccaaga accccaggtc agagaacacg aggcttgcca
ccatcttgga agcggcctgc 720caacatcttg gaagtggctc gccaccatct tgggagctct
gtgagcaagg acccctggta 780aca 783
5 7542 DNA Homo sapiens 5
tgagagacag gactagctgg
atttcctagg ccgactaaga atccctaagc ctagctggga 60aggtgaccgc ttccaccttt
aaacacgggg cttgcaactt agctcacacc cgaccaatca 120gatagtaaag agagcacact
aaaatgctaa ttaggcaaaa acaggaggta aagaaatagc 180caatcatcta ttgcctgaga
gcaaagcggg agggacaatg atcgggatag aaacccaggc 240attcaagccg gaatggctac
cctctttggg tcccctccct ttgtatggga gctctgtttt 300cactctattc aatcttgcaa
ctgcactctt ctggtccgtg tttgttacag cttgagctga 360gctttcgctc gccttccacc
actgctgttt gccgccatcg cagacctgcc gtgctgactt 420ccatccctct agatctggca
gggtgtccgc tgtgctcttg atccagcgag gcgcccattg 480ccgctcccga ttgggctaaa
ggcttgcaat tgttcctgca cgctaagtgc ctgggttcat 540cctcatcaag ctgggttcca
cggttctctt catgacccgc agcttctaac agagctataa 600aactctgtgc atggcccaag
attccattcc ttggaatctg tgaggccaag aaccccaggt 660cagagaacag gaggcttgcc
accatcttgg aagtggctcg ccaccatctt aggagctctg 720tgagcggaga cccccacccc
ccggtaacat tttggcgacc acgaagggac ctccaaagcg 780gtgagtaata ttggatcact
ttcgcttgct attctgtcct atccttcttt agaattggag 840gaaaatactg ggcacctgtc
ggccagttaa aaacaattag cgtggctgcc cgacttaaga 900ctcaggtgtg aggctatctg
gggaagggct ttctaacaac ccccaaccct tctgggttgg 960ggacgttggt ctgccccttc
cactttcaat tttcttgggg aagccaaggg tcgactagag 1020gcagaaagct gtcgtccgga
actcctggca gtagccggtt gagatcatgg cgcagccaga 1080agtctctact caacagtcgc
ccatgcgtgc gctcctacct ttcctcctga cccatacctc 1140ctgggtcccg acgatgactt
tcttgaaagt gtagccccaa aattctgctt acctctgaat 1200ctacttcccc tgatccctgg
ctcctaggta ctaatggttc agtttcattt cctctagcaa 1260gttgtatctc caaagggatc
taaggaagct ctacgctgcg tccttaggca tctaggctat 1320aaacccagga agtcttgtcc
ctggtgtccc tcccgattta ggcatacagc tctcgacatg 1380ggcagttatg tgggacccgt
tccccatcac ccttgtcaag gccccaagtt tgtaatggct 1440aagaggagag agagagaaag
agagagagac ggaggggaga gagagagaga gagatggagg 1500ggagagagag agagagagac
ggaggggaga gagagagaga gagagagacg gaggggagag 1560agagagagac ggaggggaga
gagagagaga tggaggagag aaagacaaag ggagtcaaag 1620agaaaaagaa agagaaagac
agaaatggta aaacaaacaa aaaacagcgt gccctattcc 1680tttaaaagcc ggggtaaatt
taaaacctat aattgataat tgaaggtctt ctccatgacc 1740ctataatact ccaatactac
cttgttgtca gtgtaaacaa gggcgtagcc tgaaaacact 1800gagaccactg acaacctgca
gctttcctat caaaaaatcc ttaacccagt aaccggcaga 1860tgcattcaat ctgtagcagc
aactgttttg ctaacagaag aaagtagaaa agtaactttt 1920agaggaaacc tcattgtgag
cacaccttac cagttcagaa ttattctaag tcaaaaaagc 1980aaaaaggtag cttactaact
caaaaatctt aaagtatggg gctattgtgt ttaaaaaaaa 2040aaaaaggtaa tttaacacca
accactgata attctcttaa cccagcaggt ttcctaacag 2100gggatttaaa tcttaattac
catacaaagg tctgaccaca cctaggagga actcccttca 2160ggacaggact atagagggtt
cctcccaggt gattgaggaa aaaaccacag tgggtattca 2220gtaattgata gggagactct
tgtggaagca gagttagaaa aattgcctaa taaatggtgt 2280cctcaaaagt gtgagctgtt
tgcactcagc caagccttaa agtacttaca gaatcgtaaa 2340aactatctca atcctgactc
aaaagtttac ttacaccctc tctgaaatga atttacataa 2400gaactgcttt tttgggaatg
catcttgatg gggcagctgg gtggttatga aatactcagg 2460aaaccagccc agctctagga
cacatccctg agcacaaagg caatgttggg cacgctggta 2520aaggaccact agaatccagc
agcctggact cctttctttg tggtcaagaa aggcaggaaa 2580acaggtgcag gactgctaca
tcagtgagca taactaatct gataagcaga gggccttggg 2640tggttacaca ccctggaaag
gaattcaact ctgagcgcaa aggcaatgtt gggcacattg 2700gtaaaggacc actagaatcc
agcagcccag gcccctttct ttatggtcaa gaaaggcggg 2760aaaaggggtg caggactgtt
acctcggtga gcgtaactaa tccgataagc agaggtccat 2820gggtgattac gcaccctgaa
aagaataagc attaggccct taaaggatgc tctaggacta 2880atgctcattg gaaaatgact
aggggtgctg gcatccctat gttcttttct cagacgggaa 2940atgttctcca ccctccccaa
ggcaaaaaca cccctaagat gtattctgga gaattgggac 3000caatttgacc cccagacgct
aagaaagaga tgacttatgt tcttctgcag taccacctgg 3060ccacgatatc ctcttcaagg
gggagaaacc tggcctcctg agggaagtat aaattataac 3120accatcttac agctagacct
cttctgtaga aaggagggca aatggagtga agtgccatat 3180gtgcaaactt tcttttcatt
aagagacaac ttgcaattat gtaagaagtg tgatttatgc 3240cctacaggaa gccctcagag
tctacctccc taccccagca tccccctgac tccttctcca 3300actaataagg aacccccttc
aacccaaacg gtccaaaagg agatagacaa aggggtaaac 3360aatgaaccaa agcgtgccaa
tgttccctga ttatgccccc tctaagcagt gggaggagga 3420gaatttggcc cagccagtgt
gcatgtgcct ttttctctct cagacttaaa gcaaattaaa 3480atagacctag gtaaattctc
agataaccct gatggctata ttgatgtttt ataagggtta 3540ggataatcct ttgatctgac
atggagagat ataatgttac tgctagatca gacactaacc 3600ccaaatgaga caagtgccgc
cataactgca gcctgagagt ttggcgatct ctggtatctc 3660actcgggtca atgataggag
gacaacagag gaaagagaat gattccccac agaccagcag 3720gcagttccca gtgtagaccc
tcactgggac acagaatcag aacatggaca ttggtgctgc 3780agacatttgc taacttacat
gctagaagga ctaaggaaaa ctaggaagaa gcctacgaat 3840tattcaatga tgtccactat
aacacaggga aaggaagaaa atcctactgc ctttctggag 3900cgactaaggg aggcattgag
gaagcatact tccctgtcac ctgactctat tgaaggccaa 3960ctaatcttaa aggataagtt
tatcactcag tcagctgaag acattaggaa aaaacttcaa 4020aagtctgcct taggcccaga
gcaaaactta gaaaccccat tgaacttggc aacctcggtt 4080ttttataata gagatcagga
ggagcaggcg gaacaggaca aacggggtaa aaaaaaggcc 4140accgctttag ttatggccct
caggcaagtg gactttggag gctctggaaa agggaaaagc 4200tgggcaaatc gaatgcctac
tagggcttgc ttccagagtg gtctacaagg acactttgaa 4260aaagattgtc caagtagaaa
taagtcgccc cttcgtccat gccccttata tcaagggaat 4320cactggaagg cccactatcc
caggggacaa atgtcctctg agtcagaagc cactaaccag 4380atgatccagc agcaggactg
agggtgccca gggcaagcac tagcccatgc cgtcaccctc 4440acagagcccc aggtatgctt
gaccattgag ggccaggagg ttaactgtct cctggacact 4500agcacggcct tctcagtctt
actctccttt cccggacaac tgtcctccag atctgtcact 4560atccgagggt tcctaggaca
gtcagtcact agatacttat cccagtcact aagttgtgac 4620tggtgaactt tactcttttc
acatgctttt ctaattatcc ctgaaagcac cactcccttg 4680ttagggcgag acattctagc
aaaagcaggg gccattatac acctgaacat aggagaagga 4740acacctgttt gttgtcccct
gcttgaggaa ggaattaatc ccgaagtctg ggcaacagaa 4800ggacaatacg gacgagcaaa
gaatgcctgt gctgttcaag ttaaactaaa ggattccgcc 4860tcctttccct accaaaggca
gtaccccctt agacctgagg cccaacaagg actccaaaag 4920attgttaagg acctaaaagc
ccatggccta gtaaaaccat gcaatagccc ctgcaatact 4980ccaattttag gagtacagaa
acccaacaga cagtggaggt tagtgcaaga tctcaggatt 5040atcattgagg ctgttgttcc
tgtatagcca gctgtaccta acccttatac tctgctttcc 5100caaataccac aggaagcaga
ggggtttaca gtccggggcc ttaaggacac ctttttctgc 5160atccctgtat atcctgactc
tcaattcttg tttgcctttg aagatccttc aaactcaacg 5220tctcaactca cctggaatgt
tttaccccaa gggttcaggg atagccccca ttagcccaag 5280acttgagcca gttcttatac
ctggacactc ttgtcctttg gtacgtggat gatttacttt 5340tagccacctg ttcagaaacc
ttgtgccatc aagccaccca agcactcttt aatttcctcg 5400ccacctgtgg ctacaggttt
ccaaaccaaa ggctcagctc tgctcacagc aatttaaatg 5460cttagggcta aaattatcca
aaggcaccag ggccctcagt gaggaaagta tccggcctat 5520actggcttat cctcatccca
aaaccctaaa gcaactaaga gtgttccttg gcataacggg 5580tttctgccga atatggattc
ccaggtacag cgaaatagcc agaccattat atacactaat 5640taaggaaact cagaaagcca
atacccattt ggtaagatgg acacctgaag cagaagcaga 5700tttccaggcc ctaaagaagg
ccctgaccca agccccagtg ttaagcttgc caatggggca 5760agacttttct ttatatgtca
cagaaaaaac aggaatagct ccaggagtcc ttacgcagat 5820ccaagggacg agcctgcaac
ccatggcata cctgagtaag gaaattagtg gcaaagggtt 5880ggcctcattg tttatgggta
gtggcagcag tcacagtctt agtaactgaa gcagttaaaa 5940tgatacaagg aagagatctt
actgtgtgga catctcatga tgtgaatggc atactcactg 6000ctaaaggaga cttgtgactg
tcagacaact gtttacttaa atatcaggct ctattacttg 6060aagggccagt gttgcgactg
tgcacttgtg caactcttaa cccagccaca ttgcttccag 6120acaatgaaga aaagatagaa
cataactgtc aacaaataat tgctcaaacc tacactgctc 6180gaggggacct tttagaagtt
cccttgactg atcccgatct caacttgtat actgatggaa 6240gttcctttgc agaaaaagga
cttcaaaagg cggtgtatgc agtagtcctt caaaatcgaa 6300gagctttaga attgctaatc
actgagagag ggggaacgtt tttattttta ggggaagaat 6360gctgttatta tgttaatcaa
ttcggaatca tcaccaagaa agttaaagaa attcaagatc 6420gaatacaacg tagaacagag
gagcttaaaa aacactggac cctggggcct cctcagccaa 6480tggatgccct ggattctccc
cttcttagga cctctagcag ctatatttct actcctcttt 6540ggaccctgta tctttaacct
ccgtgttaag tttgtctctt ccagaatcga agatgtaaaa 6600ctacaaatcg ttcttcaaat
ggacccccag atgcagtcca tgactaagat ctactgagga 6660cccctggacc agcctgctag
cccatgctcc aatgttaatg acattgaagg cacccctccc 6720aaggaaatct caactgcaca
acccctacta tgctccaatt cagcaggaag cagttacagt 6780ggtcctcggc caacctcccc
aacagcattt gtattttcct gttgggaggg ggcactgaga 6840gacaggacta gctggatttc
ctaggctgac tgagaatccc taagcctagc tgggaaggtg 6900accacttcca cctttaaaca
cagggcttgc aacttagctc acaccctacc aattggatag 6960taaagagagg tcactaaaat
gctaattagg caaaaacagg aggtaaagaa atagccaatc 7020atccattgcc tgagagcaca
gcgggaggga caatgaccag gatataaacc caggcattcc 7080agcctgcaac ggcaaccccc
tttgggtccc ctctctttgt atgggagctc tgttttcact 7140ctattcaatc ttgcaactgc
actcttctgg tccgtgtttg ttacggctca agctgagctt 7200ttgctcacca tccaccactg
ctgtttgccg ccgttgcaga cccgtcgctg acttccatcc 7260ctccagatct ggcagggtgt
ccactgtgct cctgatccag cgaggcaccc attgccactc 7320ccgatcaggc taaaggcttg
ccattgttcc tgcacagcta agtgcctggg ttcgtcctaa 7380tcaagctgaa cactagtcac
tgggttccat ggttctcttc catgacccat ggcttctaat 7440agagctataa cactcaccgc
atggcccaag attccattcc ttggaatccg tgaggccaag 7500aaccccaggt cagagaacac
gaggctgccg ccatcttgga ag 7542
SEQ ID NO: 6 (10288 nucleotides, DNA, Homo sapiens)
cctggggcgg gcttcctttc tgggatgagg gcaaaacgcc tggagataca gcaattatct
60tgcaactgag agacaggact agctggattt cctaggccga ctaagaatcc ctaagcctag
120ctgggaaggt gaccacgtcc acctttaaac acggggcttg caacttagct cacacctgac
180caatcagaga gctcactaaa atgctaatta ggcaaagaca ggaggtaaag aaatagccaa
240tcatctattg cctgagagca cagcaggagg gacaacaatc gggatataaa cccaggcatt
300cgagctggca acagcagccc ccctttgggt cccttccctt tgtatgggag ctgttttcat
360gctatttcac tctattaaat cttgcaactg cactcttctg gtccatgttt cttacggctc
420gagctgagct tttgctcacc gtccaccact gctgtttgcc accaccgcag acctgccgct
480gactcccatc cctctggatc ctgcagggtg tccgctgtgc tcctgatcca gcgaggcgcc
540cattgccgct cccaattggg ctaaaggctt gccattgttc ctgcacggct aagtgcctgg
600gtttgttcta attgagctga acactagtca ctgggttcca tggttctctt ctgtgaccca
660cggcttctaa tagaactata acacttacca catggcccaa gattccattc cttggaatcc
720gtgaggccaa gaactccagg tcagagaata cgaggcttgc caccatcttg gaagcggcct
780gctaccatct tggaagtggt tcaccaccat cttgggagct ctgtgagcaa ggaccccccg
840gtaacatttt ggcaaccacg aacggacatc caaagtggtg agtaatattg gaccactttc
900acttgctatt ctgtcctatc cttccttaga attggaggaa aataccgggc acttgtcggc
960cagttaaaaa cgattagtgt ggccaccgga cttaagactc aggtgtgagg ctatctgggg
1020aagggctttc taacaacccc caacccttct gggttgggga cttggtttgc ctcaagccag
1080cttccacttt cagttttctt ggggaagccg agggccgact agaggcagaa agctgtcgtc
1140ctgaactccc ggcagtagcc ggttgagatc atggtgtagc cagaagtctc aacagtcgcc
1200catgcatgca cccctatctt tccttctgac ccatacctcc tgggtcccaa ccacaacttt
1260cttcaaagtg tagccccaaa attctcctta cctctgaata tacttcctct gatccctgcc
1320tcctaggtac tattggttca gacttccatt tcctctagca agttgtatct ccaaagggat
1380ctaaggaagc tctgcgctgc gtccttaggc acctaggcta taacccaggg agtcttatcc
1440ctggtgtccc tcccaattta ggcatacagc tcttgacatg ggcagttatg taggacccac
1500tccccaccac ccttgccagg gccccaagtt tgtaaatggc tgagggaaaa gagagacaga
1560ggagagagag agaaatggag gagaaagaga gagagacaga gaggagagag agacagtgag
1620agagacagaa gagagagaga gacaaagagg agagagagag agtcaaagag agaaagaaag
1680agaaagaaat agtaaaaaac agtgtgccct attcctttaa aagccagggt aaatttaaaa
1740cctgtacttg ataattgaag gtcttctctg tgaccctata gcactccaat ccactttgtg
1800gtcagtgtaa ataagagcat aggccgaaag cactgaggcc attgacaacc cgtagcttcc
1860ctatcaaaaa tccttaaccc agtaacccgc agatggacca aatgcattca gtcggtagcg
1920caactgcttt gctaaaagta gaaaagtaac ttttagagga aacctcattg tgagcacacc
1980tcacctgttc agaattattc taataaaaaa agcaaaaagg tagcttacta actcaaaaat
2040cttaaagtat ggggctattc tgttagaaaa aggtaatgta actccaacca ctgataattc
2100ccttaaccca gcagatttcc taacgggatt taaatcttaa ttaccataca aaggtccgac
2160cagacctagg cggaactccc ttcaggacag gacgatagat ggttcctccc aggtgattga
2220ggaaaaaaac cacaatgggt attcagtaat tgatacgggg actcttgtgg aagcagagtt
2280agaaaaattg cctaataact ggtctcctca aacgtgtgag ctgtttgcac tcagccaagc
2340cttaaagtac ttacagaatc aaaagactat ctcaatcctg attcaaaagg ttagctacac
2400cctctctgta atgcatttgc ataagaactt gtttatggga atgcatcttg atggggcagc
2460tgggttgtta taaaatagga acccagccca gctctaggac tcacccctga gcgcaaaggc
2520aatgttgggc atgctggtaa aggaccacta gaatccagca gcccagaccc ctttctttgt
2580ggtcaagaaa ggcgggaaaa ggggtgcagg actgctacat cggtaagcat aactaatccg
2640ataaacagag gtccatgggt ggttacgcac cctggaaagg aactcacccc tgagcacaaa
2700ggcaatgttg ggcacgctgg taaaggacca ctagaatcca gcagcctgga cccctttctt
2760tgtggtcaag agaggcagga aaacaggtgc aggactgcaa catcagtgag cataactaat
2820tcgataagca gaggtccatg ggtggtgatg caccctggaa agaataagca ttaggaccat
2880agaggacact ccaggactaa agctcatcgg aaaatgacta gggttgctgg catccctatg
2940ttcttttttc agatgggaaa cgttccccgc aagacaaaaa cgcccctaag acgtattctg
3000gagaattggg accaatttga ccctcagaca ctaagaaaga aacgacttat attcttctgc
3060agtgccgcct ggcactcctg agggaagtat aaattataac accatcttac agctagacct
3120cttttgtaga aaaggcaaat ggagtgaagt gccataagta caaactttct tttcattaag
3180agacaactca caattatgta aaaagtgtga tttatgccct acaggaagcc ttcagagtct
3240acctccctat cccagcatcc ccgactcctt ccccaactaa taaggacccc ccttcaaccc
3300aaatggtcca aaaggagata gacaaaaggg taaacagtga accaaagagt gccaatattc
3360cccaattatg acccctccaa gcagtgggag gaagagaatt cggcccagcc agagtgcatg
3420tgcctttttc tctcccagac ttaaagcaaa taaaaacaga cttaggtaaa ttctcagata
3480accctgatgg ctatattgat gttttacaag ggttaggaca attctttgat ctgacatgga
3540gagatataat gtcactgcta aatcagacac taaccccaaa tgagagaagt gccaccataa
3600ctgcagcctg agagtttggc gatctctggt atctcagtca ggtcaatgat aggatgacaa
3660cagaggaaag agaatgattc cccacaggcc agcaggcagt tcccagtcta gaccctcatt
3720gggacacaga atcagaacat ggagattggt gctgcagaca tttgctaact tgtgtgctag
3780aaggactaag gaaaactagg aagaagtcta tgaattactc aatgatgtcc accataacac
3840agggaaggga agaaaatcct actgcctttc tggagagact aagggaggca ttgaggaagc
3900gtgcctctct gtcacctgac tcttctgaag gccaactaat cttaaagcgt aagtttatca
3960ctcagtcagc tgcagacatt agaaaaaaac ttcaaaagtc tgccgtaggc ccggagcaaa
4020acttagaaac cctattgaac ttggcaacct cggtttttta taatagagat caggaggagc
4080aggcggaaca ggacaaacgg gattaaaaaa aaggccaccg ctttagtcat gaccctcagg
4140caagtggact ttggaggctc tggaaaaggg aaaagctggg caaattgaat gcctaatagg
4200gcttgcttcc agtgcggtct acaaggacac tttaaaaaag attgtccaag tagaagtaag
4260ccgccccctc gtccatgccc cttatttcaa gggaatcact ggaaggccca ctgccccagg
4320ggacaaaggt cctctgagtc agaagccact aaccagatga tccagcagca ggactgaggg
4380tgcctggggc aagcgccatc ccatgccatc accctcacag agccctgggt atgcttgacc
4440attgagggcc aggaggttgt ctcctggaca ctggtgcggt cttcttagtc ttactcttct
4500gtcccggaca actgtcctcc agatctgtca ctatctgagg gggtcctaag acgggcagtc
4560actagatact tctcccagcc actaagttat gactggggag ctttattctt ttcacatgct
4620tttctaatta tgcttgaaag ccccactacc ttgttaggga gagacattct agcaaaagca
4680ggggccatta tacacctgaa cataggagaa ggaacacccg tttgttgtcc cctgcttgag
4740gaaggaatta atcctgaagt ctgggcaaca gaaggacaat atggacgagc aaagaatgcc
4800cgtcctgttc aagttaaact aaaggattcc acctcctttc cctaccaaag gcagtacccc
4860ctcagaccca aggcccaaca aggactccaa aagattgtta aggacctaaa agcccaaggc
4920ctagtaaaac catgcagtaa cccctgcagt actccaattt taggagtaca gaaacccaac
4980agacagtgga ggttagtgca agatctcagg attatcaatg aggctgttgt tcctctatag
5040ccagctgtac ctagccctta tactctgctt tcccaaatac cagaggaagc agagtggttt
5100acagtcctgg accttcagga tgccttcttc tgcatccctg tacatcctga ctctcaattc
5160ttgtttgcct ttgaagatac ttcaaaccca acatctcaac tcacctggac tattttaccc
5220caagggttca gggatagtcc ccatctattt ggccaggcat tagcccaaga cttgagccaa
5280tcctcatacc tggacacttg tccttcggta ggtggatgat ttacttttgg ccgcccattc
5340agaaaccttg tgccatcaag ccacccaagc gctcttcaat ttcctcgcta cctgtggcta
5400catggtttcc aaaccaaagg ctcaactctg ctcacagcag gttacttagg gctaaaatta
5460tccaaaggca ccagggccct cagtgaggaa cacatccagc ctatactggc ttatcctcat
5520cccaaaaccc taaagcaact aaggggattc cttggcgtaa taggtttctg ccgaaaatgg
5580attcccaggt atggcgaaat agccaggtca ttaaatacac taattaagga aactcagaaa
5640gccaataccc atttagtaag atggacaact gaagtagaag tggctttcca ggccctaacc
5700caagccccag tgttaagttt gccaacaggg caagactttt cttcatatgt cacagaaaaa
5760acaggaatag ctctaggagt ccttacacag atccgaggga tgagcttgca acctgtggca
5820tacctgacta aggaaattga tgtagtggca aagggttgac ctcattgttt acgggtagtg
5880gtggcagtag cagtcttagt atctgaagca gttaaaataa tacagggaag agatcttact
5940gtgtggacat ctcatgatgt gaatggcata ctcactgcta aaggagactt gtggctgtca
6000gacaactgtt tacttaaatg tcaggctcta ttacttgaag ggccagtgct gcgactgtgc
6060acttgtgcaa ctcttaaccc agccacattt cttccagaca atgaagaaaa gataaaacat
6120aactgtcaac aagtaatttc tcaaacctat gccactcgag gggacctttt agaggttcct
6180ttgactgatc ccgacctcaa cttgtatact gatggaagtt cctttgtaga aaaaggactt
6240cgaaaagtgg ggtatgcagt ggtcagtgat aatggaatac ttgaaagtaa tcccctcact
6300ccaggaacta gtgctcagct agcagaacta atagccctca cttgggcact agaattagga
6360gaagaaaaaa gggcaaatat atatacagac tctaaatatg cttacctagt cctccatgcc
6420catgcagcaa tatggaaaga aagggaattc ctaacttctg agagaacacc tatcaaacat
6480caggaagcca ttaggaaatt attattggct gtacagaaac ctaaagaggt ggcagtctta
6540cactgccggg gtcatcagaa aggaaaggaa agggaaatag aagagaactg ccaagcagat
6600attgaagcca aaagagctgc aaggcaggac cctccattag aaatgcttat aaaacaaccc
6660ctagtatagg gtaatcccct ccgggaaacc aagccccagt actcagcagg agaaacagaa
6720tggggaacct cacgaggaca gttttctccc ctcgggacgg ctagccactg aagaagggaa
6780aatacttttg cctgcaacta tccaatggaa attacttaaa acccttcatc aaacctttca
6840cttaggcatc gatagcaccc atcagatggc caaatcatta tttactggac caggcctttt
6900caaaactatc aagcagatag tcagggcctg tgaagtgtgc cagagaaata atcccctgcc
6960ttatcgccaa gctccttcag gagaacaaag aacaggccat taccctggag aagactggca
7020actgatttta cccacaagcc caaacctcag ggatttcagt atctactagt ctgggtagat
7080actttcacgg gttgggcaga ggccttcccc tgtaggacag aaaaggccca agaggtaata
7140aaggcactag ttcatgaaat aattcccaga ttcggacttc cccgaggctt acagagtgac
7200aatagccctg ctttccaggc cacagtaacc cagggagtat cccaggcgtt aggtatacga
7260tatcacttac actgcgcctg aaggccacag tcctcaggga aggtcgagaa aatgaatgaa
7320acactcaaag gacatctaaa aaagcaaacc caggaaaccc acctcacatg gcctgctctg
7380ttgcctatag ccttaaaaag aatctgcaac tttccccaaa aagcaggact tagcccatac
7440gaaatgctgt atggaaggcc cttcataacc aatgaccttg tgcttgaccc aagacagcca
7500acttagttgc agacatcacc tccttagcca aatatcaaca agttcttaaa acattacaag
7560gaacctatcc ctgagaagag ggaaaagaac tattccaccc ttgtgacatg gtattagtca
7620agtcccttcc ctctaattcc ccatccctag atacatcctg ggaaggaccc tacccagtca
7680ttttatctac cccaactgcg gttaaagtgg ctggagtgga gtcttggata catcacactt
7740gagtcaaatc ctggatactg ccaaaggaac ctgaaaatcc aggagacaac gctagctatt
7800cctgtgaacc tctagaggat ttgcgcctgc tcttcaaaca acaaccagga ggaaagtaac
7860taaaatcata aatccccatg gccctccctt atcatatttt tctctttact gttcttttac
7920cctctttcac tctcactgca ccccctccat gccgctgtat gaccagtagc tccccttacc
7980aagagtttct atggagaatg cagcgtcccg gaaatattga tgccccatcg tataggagtc
8040tttctaaggg aacccccacc ttcactgccc acacccatat gccccgcaac tgctatcact
8100ctgccactct ttgcatgcat gcaaatactc attattggac aggaaaaatg attaatccta
8160gttgtcctgg aggacttgga gtcactgtct gttggactta cttcacccaa actggtatgt
8220ctgatggggg tggagttcaa gatcaggcaa gagaaaaaca tgtaaaagaa gtaatctccc
8280aactcacccg ggtacatggc acctctagcc cctacaaagg actagatctc tcaaaactac
8340atgaaaccct ccgtacccat actcgcctgg taagcctatt taataccacc ctcactgggc
8400tccatgaggt ctcggcccaa aaccctacta actgttggat atgcctcccc ctgaacttca
8460ggccatatgt ttcaatccct gtacctgaac aatggaacaa cttcagcaca gaaataaaca
8520ccacttccgt tttagtagga cctcttgttt ccaatctgga aataacccat acctcaaacc
8580tcacctgtgt aaaatttagc aatactacat acacaaccaa ctcccaatgc atcaggtggg
8640taactcctcc cacacaaata gtctgcctac cctcaggaat attttttgtc tgtggtacct
8700cagcctatcg ttgtttgaat ggctcttcag aatctatgtg cttcctctca ttcttagtgc
8760cccctatgac catctacact gaacaagatt tatacagtta tgtcatatct aagccccgca
8820acaaaagagt acccattctt ccttttgtta taggagcagg agtgctaggt gcactaggta
8880ctggcattgg cggtatcaca acctctactc agttctacta caaactatct caagaactaa
8940atggggacat ggaacgggtc gccgactccc tggtcacctt gcaagatcaa cttaactccc
9000tagcagcagt agtccttcaa aatcgaagag ctttagactt gctaaccgct gaaagagggg
9060gaacctgttt atttttaggg gaagaatgct gttattatgt taatcaatcc ggaatcgtca
9120ctgagaaagt taaagaaatt cgagatcgaa tacaacgtag agcagaggag cttcgaaaca
9180ctggaccctg gggcctcctc agccaatgga tgccctggat tctccccttc ttaggacctc
9240tagcagctat aatattgcta ctcctctttg gaccctgtat ctttaacctc cttgttaact
9300ttgtctcttc cagaatcgaa gctgtaaaac tacaaatgga gcccaagatg cagtccaaga
9360ctaagatcta ccgcagaccc ctggaccggc ctgctagccc acgatctgat gttaatgaca
9420tcaaaggcac ccctcctgag gaaatctcag ctgcacaacc tctactacgc cccaattcag
9480caggaagcag ttagagcggt cgtcggccaa cctccccaac agcacttagg ttttcctgtt
9540gagatggggg actgagagac aggactagct ggatttccta ggctgactaa gaatccctaa
9600gcctagctgg gaaggtgacc acatccacct ttaaacacgg ggcttgcaac ttagctcaca
9660cctgaccaat cagagagctc actaaaatgc taattaggca aagacaggag gtaaagaaat
9720agccaatcat ctattgcctg agagcacagc aggagggaca atgatcggga tataaaccca
9780agtcttcgag ccggcaacgg caaccccctt tgggtcccct ccctttgtat gggagctctg
9840ttttcatgct atttcactct attaaatctt gcaactgcac tcttctggtc catgtttctt
9900acggcttgag ctgagctttc gctcgccatc caccactgct gtttgccgcc accgcagacc
9960cgccgctgac tcccatccct ctggatcatg cagggtgtcc gctgtgctcc tgatccagcg
10020aggcacccat tgccgctccc aatcgggcta aaggcttgcc attgttcctg catggctaag
10080tgcctgggtt catcctaatt gagctgaaca ctagtcactg ggttccatgg ttctcttctg
10140tgacccacag cttctaatag agctataaca ctcaccgcat ggcccaaggt tccattcctt
10200gaatccataa ggccaagaac cccaggtcag agaacacgag gcttgccacc atcttgggag
10260ctctgtgagc aaggaccccc aagtaaca
10288
SEQ ID NO: 7 (6818 nucleotides, DNA, Homo sapiens)
gcaactccct ttgggtctcc tctcattgta tgggagctct
gttttcactc tattaaatct 60tgcaactgca cactcttctg gtctgtgttt gttatggctt
gagctgagct tttgctggct 120gtccaccact gctgtttgct gccgtcgcag accccttgct
gactcccacc cctgcggatc 180tggcagggtg tctgctgcgc tcctgatcca gccaggcacc
cactgctgct cccaatcagg 240ctaaaggctt gccattgttc ctgcatggct aagtgcccgg
gttcgtgcta attgagctga 300acactagtcg ctgggttcca cagttctctt ccgtgaccca
cagcttctaa tagagctata 360acactcactg catggcccaa cattccattc cttggaatct
gtgaggccaa gaacccccgg 420tcagagaaca agaagcttgc caccatcttg gaagcagccc
gccaccattt tgggagctct 480aagaacaagg accccccagt aacattttgg tgaccacgaa
gggacctcca aagcagtgag 540taatattgaa ccacttccgc ttgctattct gtcctaacct
tccttagaat tggaggaaaa 600taccgggcac ctgtcggcca gttaagaacg attagcgtgg
ccgccagact taagactctg 660gtgtgaggct gtctgggaaa gggctttcta acaaccccca
acccttccgg gttgggagct 720ttggtctgcc tggaaccagc ttccactttc aattttcctg
gggaatccaa gggctgacta 780gaggcagaaa gctgtcatcc cgaactcctg gcattagaca
gttgagatcg tggcgcagcc 840agaagtctct actcaacagt cacccatgcg tgcaccccta
cctttccttc taacccatac 900ctcccgggtc ccaaccatga ctttcttgaa agtgtagccc
ctaaattctc tttacctcta 960aatctacttc ttctcatccc tgcttcctag gtactaatgg
ttcagacttt catttcctct 1020agcaagttct atctccagag ggatctaagg aagggatcta
tgctgtgtcc ttaggcccct 1080aggctatgaa cccagagagt cttctccctg ttatctctcc
ccatttaggc atacagctct 1140caacatggac agttatgtgg gacccattcc ctaccaccct
tgccagggcc ccaagttttc 1200aaagggctag aagaaaaaag agagaaagag agagagaggc
agaggggaga gaaagagaga 1260gagacaaaga gggagtcaaa gagagataga aagagaaaga
tagaactagt aaagaaaaaa 1320agtatgcccc attcctttaa aagccagggt aaatttaaaa
cctataattg ataattgaag 1380gtcttctcca tgaccctata acactccaat accaccttgt
tttcagtgta aacaagggtg 1440tagcccgaaa acactgagac cactgacaac ccatagcctt
cctatcaaaa atccttaacc 1500caggaaccca tggatggccc aaatgcattc aatctgtagc
agcaactgct ttgctaacag 1560aagaaagtag aaaagtaact tttagagaaa acctcattgt
gagcacacct caccagttca 1620gaattattct aagtcaaaaa agcaaaaagg tagcttacta
actcaaaaat cttaaagtat 1680ggggttattc tgttagaaaa aggtgattta acattaacca
ctgaaaattc ccttaaccca 1740gcaggtttcc taatgggatt taaatcttca ttaccataca
aaggtccgac cagacccagc 1800aggaactccc tttaggacag gatgatagat ggttcctcct
gggtgattga gggggtgaaa 1860aaccacaatg ggtgttcagt aattgatagg gagactcttg
tggaaggaga gttaggaaaa 1920ttgcctaata attggtctgc tcaaatgtgc gagctgtttg
cactcagcca agccttaaag 1980tacttacaga atcaaaaaga ctctatctca atcctgactc
aaaatgttac ctacaccatc 2040tctgacatga atttgcataa gaactgttgt ttatgggaat
gcatcttgat ggggcagctg 2100ggttgttatg aaatactcag gaacccagcc caggtctaga
attcacctct gagcgcaaag 2160gcaatgttgg ccatgctggt aaaggaccac tagaatccag
gagcctggac ccctttcttt 2220gtggtcaaga aaggcgggaa aacaggtgca ggactgctac
atcagagagc ataacaaatc 2280cgataagcag agttccatga gtggttaagc accctggaaa
ggaactcacc tctgagtgca 2340aaggcaatgt taggcacacc agtaaaggac cactagaatc
cagcagccca gacccctttc 2400tttgtgatca agaaaggcgg gaaaaggggt gcaggactgc
tacatcagtg agcgtaacta 2460atctgataag cagaagtcca tgggtggtta cgcaccctgg
aaaggaataa gcattaggac 2520cacagaggac actctaagac taatgctcat tggaaaatga
ctaggggtgc tggcatccct 2580atgttttttt ttcagatggg aaacattccc cccaaggcaa
aaacgcccat aagatatatt 2640ctggagaatt cggcccagag tgtatgtatc ttttttccct
gtcagacttg aagcaaacct 2700aggtaaatta tcagatagcc ctgatggcta tattgatgct
ttacaagggt taggacaatc 2760ctttgatcta acatggagag atatactgtt actgctagat
cagacactaa tcccaaatga 2820aagaagtgcc accataactg cagccagaga gtttgatgat
ctctggtatc tcagtcaggt 2880caatgatagg atgacaacag aagaaagaaa acaattcccc
acaggccagc aggcagttcc 2940cagcgtagac cttcattggg acacagaatc agaacatgga
gattggtgcc gcagacattt 3000actaacttgc gcgctagaag cactaaggaa aactaggaag
aagcctatga attattcaat 3060gatgtccact ataacacagg gaaaggaaga aaatcctact
gcctttctgg agagactaag 3120ggaggcattg agaaagcata cctctctgtc acctgactct
attgaaggcc aactaatctt 3180aaaggataag ttttccactc agtcagctgc agacattaga
aaaaaacttc aaaagtctgc 3240gttaggccgg gagcaaaact tagaaaccct attgaacttg
gcaacctcag ttttttatga 3300tagagatcag gaggatcagg tggaatggac aaatgagatt
ttaaaaaaag gccaccactt 3360tagtcatggc cctcaggcaa gcagactttg gacactctgg
aaaagggaaa agctgggcaa 3420atcgaatgcc taataagact tgcttccagt gtggtctaca
aggacacttt aaaaaagatt 3480gtccaaatag aaataagcca ccccctcgtc catgctcctt
atgtcaaggg aatcactgga 3540aggcctactg ccccagggga tgaaggtcct ctgagtcaga
agccactaac cagatgattc 3600agccccagga ctcagggtgc ccagggcaag cgccagccta
tgccatcacc ctcacagagc 3660cctgggtatg cttgaccatt gagggtcagg aggttaacta
tctcctggac actggcgtgg 3720ccttctcagt cttactctcc tgtcccggac aactgtcctc
cagatctgtc actatccgag 3780ggtttctacg acagccagcc actagatact tctcccagcc
actaagttgt gactggggaa 3840ctctactctt ttcacatgtt tttctaatta tgcctgaaag
ccccactcct ttgttaggga 3900aagacattct agcaaaagca ggggccatta tacacctgaa
cataggagaa ggaacacctg 3960tttgttgtcc cctgcttgaa gaaggaatta atcctgaagt
ctggacaaca gaaggacaat 4020acagatgagc aacaaatgcc tgtcctgttc aagttaaact
aaaggattat gcctcctttc 4080cctaccaaag gcagtacccc cttagacccg aggcccaaca
aggactccaa aagattgtta 4140aggacctaaa agctcaaagc ctagcaaaac catgcagtag
cccctgcaat actccaattt 4200taggagtaca gaaaaccaac agacagtgga ggttagtgca
agatctcagg attatcaatg 4260aggctgttgt tcctaaccct tatactctgc tttcccaaat
accagaagaa gcagagtggt 4320ttacagtcct ggaccttaag gatggctttt tctgcatccc
tgtacatcct gactctcaat 4380tcttgtttgc ctttggagat ccttcgaacc caatgtctca
actcagcttg actgttttac 4440cccaagggtt cagggatagc ccccatctag ttggccaagc
attagccgag ccagttctcc 4500tacctggaca ctcttgtcct ctggtacatg gatgatttat
ttttagctgc ccgttcagaa 4560accttgtgcc atcaagccac ccaagtgctc ttaaatttcc
tcgccacctg tggctacaag 4620gtttccaaac caaaggctca gctctgctca cagcaggtta
aatacttagg gctaaaatta 4680tccaaaggca ccagggccct cagtgaggaa tgtatccagc
ctgtattggc ttatcctcat 4740cccaaaaccc taaagcaact aagagggttc cttggcataa
caggtttctg ccaaatgtgg 4800attcccaggt acggtgaaat agccaggcca ttatataccc
taattaagga aactcagaaa 4860gccaacaccc atttattaag atggacacct gaagcagaag
cagctttcca ggccctaaag 4920aaggccctaa cccaagcccc agtgttaagc ttgccaacgg
ggaagacttt tctttatatg 4980tcacagaaaa aacaggaata gctctaggag tccttagaca
ggtccaaggg atgagcttgc 5040aacctgtggc atacctgagt aaggaaattg atgtagttgc
aaagggttga cctcattgtt 5100tacaggtagt ggcggcagta gcagtcttag tatctgaagc
agttaaaata atacagggaa 5160gagatcttac tgtgtggaca tctcatgatg taaacggcgt
actcacttct aaaggagact 5220tgtggctgtc agacaaccgt ttacttaaat atcaggctct
attacttgaa gggccagtgc 5280tgcgactgcc cacttgttca actcttaacc cagccacatt
tctttcagac aatgaagaaa 5340agatagaaca taactgtcaa caggtgattg ctcaaaccta
cggcgctcga ggggaccttc 5400tagaggttcc cttgactgat cccaacctca acttgtatac
tgatggaagc tcctttgtag 5460aaaaaggact ttgaaaggtg gggtatgcag tggtcagtga
taatggaata cttgaaagta 5520attccttcac tccaggaact agtgctcagc tggcagaact
aatagccctc actcaggcac 5580tagaattagg agaaggaaaa agggtaaata tatatgcaga
ctctaagtat gcttacccag 5640tcctccacgc ccacacagca atatggagag ataggaaatt
cctaacttct gagggaacac 5700cgatcaaaca tcaggaagcc attaggagat tattattggc
tgtacagaaa cctaaagagg 5760tggcagtctt acactgctgg ggtcatcaga aaggaaagga
aaaggaaata gaaaggaacc 5820accaagtgga tattgaagcc aaaagagcca caaggcaggc
cctccattag aaatgcttat 5880agaaggatcc ctagtatggg gtaatcccct ccgggaaacc
aagccccagt actcagcagg 5940agaaatagac acgaggacat agtttcctcc cctcaggatg
gctagccacc gaaaaaggga 6000aaatactttt gcctgcagct aatcaatgga aattacttaa
aacccttcac caaacctttc 6060acttgggcat ggatagcatc tatcagatgg ccaatttatt
atttactgga ccaggccttt 6120tcaaaactat caagcagata gtcagggcct gtgaaatgtg
ccaaagaaat aatcccctgc 6180acttcaagcc atacatttca atccctgtat ctttaacctc
ctgttgtttg tctcttccag 6240actcaaagct gtaaaactgc aaatggttcc tcatatggag
ccccagatgc agtccatgac 6300taagatctac cacagagccc tagaccggcc tgttagccca
tgctccgatg ttgatgacat 6360caaaggcaca ccttccgagg aaatctcaac tgcacgaccc
ctactaagcc ccaattcagc 6420aggaagcagt taagagcagt cgttggctaa catccccaat
agtatgtggg ttttcctgtt 6480gagagggggg actgagagac aggactagct ggatttccta
ggccaactaa gaatccctaa 6540gcctagttgg gaaggtgacc gcatccacct ttaaacacgg
ggcttgcaac ttagctcaca 6600cccgaccaat caggtagtaa agagagctca ctaaaatgct
aattaggcaa aaacaagagg 6660taaagaaata gccaatcatc tatcgcctga gagcacagtg
gggagggaca atgatcggga 6720tataaaccca ggcattcggg ccggcaacgg caacccccat
tgcgtcccct cccattgtat 6780gggagctctg ttttcattct attaaatctt gcaactgc
6818
SEQ ID NO: 8 (7073 nucleotides, DNA, Homo sapiens)
gcaaccccct ttgggtctcc
tccctttgca taggagctct gttttcactc tattaagtct 60tgcaactgca ctcttctggt
ccgtgtttct taccgcttga gctgagcttt ccctcactgt 120ccaccactgc tgttttgcca
ccgtcacagg cccaccgctg acttccattc ttctggatct 180agcaggctgt ccactgtgct
cctgatccag cgaggcgccc attgccgctc ccgattgggc 240taaaggcttg ccattgttcc
tgcatggcta cgtgcctggg ttcatcctaa tcaagccgaa 300cactagtcac tgggttccac
ggttctcttc catgacccac gacttctaat agaactataa 360cactcacctc atggcccaag
attccattcc ttggaatcca tgaggccaag aaccccaggt 420cagagaacac gaggcttgcc
accatcttgg aagtggcccc accaccatct tgggagctct 480gggagcaagg acccccggta
acattttggc gaccacaaag ggacatccaa agtggtgagt 540aatattggac cactttcact
tgctattctg ttctatcctt ccttagaact ggaggaaaat 600accaggcaca ggcacctgtc
agccagttaa aaacaattag cgtcgccgcc acacttaaga 660ctcaggtgtg aggctatctg
gggaaagact ttctaacaac ccccaaccca tctagtgggg 720atgttggtct gcctggagac
agcttccact ttcaattttc ttggggaagc cgagggctca 780ctagaggcag acagctgttg
tcccaaactc cgggcagtag ccggttgaga tcatggtgca 840gccaggagtc tctactcagc
agtcgccgat gcatgtgccc ctaccttccc ttctgaccca 900tacatcctga gtcccgactg
tgactttctt gaaagtgtag ccccaaaatt ctccttacct 960ctgaatctac ttcctctgat
ccctgcctcc tgggtactaa tgattcagac tttcatttcc 1020tctagcaagt tgtgtctcca
aagggatcta aggaggctct acgctgcatc cttaggcacc 1080taggctataa cccaaggagt
cttatccctg gtgtccctcc cgatttgggt atacaactct 1140caacatgggc agttatgtag
gacccattcc ccaccacact tgccagggcc ccaagtttgt 1200aatggctaag agagagacac
agagagagag agagagatgg agagagagac aaggagggag 1260tcaaagagaa aaagaaagaa
aaagaaatag tagaaaaaaa agtgtgccct attcctttaa 1320aagccagggt aaatttaaaa
cctgtaattg ataattgaag gtcttctccg tgaccctgta 1380acactccaat gccattttgt
tgtcagtgta aataagggca tagcccaaaa gcactgaggt 1440cactgacaac ccgtagcttt
cccatcaaaa atccttaacc cagtaatccg cggatgggcc 1500aaatgcattc agtcggtagc
agcaaccgct ttgctaaaag tagaaaagta acttttagag 1560gaaacctcat tgtgagcgca
cacctcacca gttcagaatt attctaagtc aaaaaaaaaa 1620aaagcaaaaa ggtaacttac
taactcaaaa atcttaaagt ataggtctat catattagaa 1680aagggtaatg taactccaac
cactgataat tcccttaacc cagcagattt cctaacaggg 1740gatttaaaac ttaattacca
tacaaaggtc ccaccagacc taggaggaac tcccttcagg 1800acaggacgat aaacggttcc
tcccaggtga ttgaggaaaa aaaccacaat gggtattcag 1860taattgatac agagactcat
gtggaagcag ttagaaaaat tgcctaataa ttggtctcct 1920caaacgtgta agctgtttgc
actcagccaa gccttaaagt acttacagaa tcaaaaagac 1980tctgaatcct gactcaaaag
gtttgctaca ccctctgtga aacaaatttg cataagaact 2040gttgtttatg ggaaggcatc
ttgatggggc agctgggttg ttatgaaata ctcaggaccc 2100cagcccggct ctaggactca
cccctgagcg caaaaggcaa tgttgggcac gctggtaaag 2160gaccactaga atccagcagc
ccggacccct ttctttgtgg tcaagagagg cgggaaaaca 2220ggtgcaggac tgctacatca
gtgagcataa ctaatccagt aagcagaggt ccatgggtgg 2280ttatgcaccc tggaaaagaa
tacgcattag gcccttagag gatgctctag gactaatgct 2340catcggaaaa tgactagggg
tgctgacatc cctatgttct tttttcagat gggaaacgtt 2400cctcccaccc caaggcaaaa
aacaccccta agatgtattt tggagaatta ggaccaattt 2460gaccctcaga cactaagaaa
gaaatgactt acattcttct gcagtaccat gatatcctct 2520tcaaggggga gaaacctggc
ctcctgagag aagtataaat tataacacca tcttacagtg 2580agacctcttc tgtagaaagg
agggcaaatg gagtgaagtg caaactttcc tttcattaag 2640agacaactcg caattatgta
aaaagtgtga tttatgccct acagaaagcc ctcagtctac 2700ctccctatcc cagggtcccc
ccgattcctt tcccaactaa taaggacccc ccttttaccc 2760aaatggtcca aaggagatag
atgaagggat aaacaatgaa ccaaacagtg ccaatattcc 2820ctgattatgc cccctccagg
cagtgggagg aggagaattc ggcccagcca gagtgcatgt 2880accttttttt ttctctcaga
cttaaagcaa attaaaatag acctaggtaa attctcagat 2940aaccctgatg gctatattga
tgttttacaa gggttaggac aatcctttgc tctgacatgg 3000agagatataa tgttactgct
aaatcagaca ctaaccccaa atgagagaag tgtcaccata 3060gctgcagccc aagagtttgg
caatctctgg tatctcagtc aggtcaatga taggatgaca 3120acagaggaaa gggaatgatt
ccccacaggc cagcaggcag ttctcagtgt agaccctcac 3180tgggacacag aataagaaca
tggagatcgg tgccgcagat atttgctaac ttgcgtgcta 3240ggactaagga aaactaggaa
gaagcctatg aattattcag tgatgtccac tataacacag 3300ggaaaggaag aaaatcatac
tgcctttccg gaaatactaa gggaggcatt gaggaagcat 3360acctctctgt cacctgactg
tattgaagtc caactaatct taaaggatat gtttatcact 3420cagtcagctg cagacattag
aaaaaacttc aaaagtccac cttaggccca gagcaaaact 3480tagaaaccct attgaacttg
ttaacctcag ttttttataa tagagatcag gaggagcagg 3540cggaacagga caaacaggat
taaaaaaaga ccaccgcttt agtcatggcc ctcaggcaag 3600tggactttgg aagctctgga
aaagggaaaa gctgggcaaa ttgaatgcct aatagggctt 3660gcttccagtg tggtctacaa
ggacacttaa aaaaagattg tccaagtaga aataagctgc 3720cccttcgtcc atgcctctta
tgtcaaggga atcactggaa ggcccattgc cccaggggag 3780gaaggtcctc tgagtcagaa
gccactaacc agatgatcca gcagcaggac taagggtgcc 3840cagggcaagc cccagcccat
gccatcaccc tcacagagcc ccgggtatgc ttgaccattg 3900agggccagga ggttaactgt
ctcctgaaca ctggcacagc cttctcagtc ttactttcct 3960gtcccggaca actgtcctcc
agatctgtca ctatctgagc ggtcctagga cagccagtca 4020ctagatattt ctcccagcca
ctaagttgtg actggggaac tttactcttt tcacatgctt 4080ttctaattat gcctgaaagc
cccactcctt tgttagggag agacattcta gcaaaagcag 4140gggccattat acatctgaac
ataggagaag gaacacccgt ttgttgtcac ctgcttgagg 4200aaggaattaa tgctgaagtc
tgggcaacag aaggacaata tggatgagca aagaatgccc 4260atcctgttca agttaaatta
aaggattccg cctcctttcc ctaccaaagg caataccccc 4320ttagacccga ggcccaacaa
ggactccaaa agattgttaa ggacctaaaa gcccaaggcc 4380tagtaaaacc atgcaatagc
ccctgccata ctccaatttt aggagtaagg aaacccaacg 4440gacagtggag gttagtgcaa
gaactcagga ttatcaatga ggctgttgtt cctctatacc 4500cagctgtacc taacccttat
acagtgcttt cccaaatacc agaggaagca gagtggttta 4560cagtcctgga ccttaaggat
gcctttttct gcatccctgt acgtcctgac tctcaattct 4620tgtttgcctt tgaagatcct
ttgaacccaa cgtctcaact cacctggact gttttacccc 4680aagggttcaa ggatagcccc
catctatttg gccaggcatt agcccaagac ttgagccaat 4740tctcatacct ggacactctt
atccttcggt atggggatga tttaatttta gctacccatt 4800cagaaacgtt gtgccatcaa
gccacccaag tgctcttaaa tttcctcgct acctgtggct 4860acaggtttcc aaacgaaagg
ctcagctctg ctcacagcag gttaaatact tagggctaaa 4920attatccaaa ggcaccaggg
ccctcagtga ggaacgtatc cagcctatac tggcttattc 4980tcatcccaaa accctaaagc
aactaagagc attccttggc ataacaggct gctgctgaat 5040atggattccc aggtacagtg
aaatagccag gccattatac acactaatta aggaaactca 5100gaaagccaat acccatttag
taagatggac accttaagca gaagcggctt tccaggcctt 5160aaagaaggcc ctaacccaag
ccccagtggt aagcttgcca acagggcaag acttttcttt 5220atatgtcaca gaagaaacag
gaatagctct aggagtcctt acacaggtct gagggatgag 5280cttgcaaccc atggcatacc
tgagtaagga aactgatgta gtggcaaagg gttggcctca 5340ttgtttacgg gtagtggcag
cagtagcagt cttagtatct gaagtagtta aaataataca 5400gggaagagat cttactgtgt
gaacatctca tgatgtgaat ggcatagtca ctgctaaagg 5460agacttgtgg ctgtcagaca
actgtttact taaataccag gctctattac ttgaagggcc 5520agtgctgcga ctgtgcactt
gtgcaactct taacccagac acatttcttc cagacaatga 5580agaaaagata gaacataact
gccaacaagt aattgctcaa acctatgcca ctcgagggga 5640ccttttagag gttcccttga
ctgatcccaa cctcaacttg tatactgatg gaagttcctc 5700tgtagaaaaa ggactttgaa
aagtggggta tgcagtggtc agtgataatg gaatacttga 5760aagtaatccc ctcactccag
gaactagtgc tcagctggca gaactaatag ccctcactcg 5820ggcactagaa ttaggagaag
agaaaagggt aaatatatac agactctaag tatgcttacc 5880tagtcctcca tgcccatgca
gcaatatgga gagaaaggga attcctaatt tccaagggaa 5940cacctatcca acatcaggaa
gccattagga gattactatt ggctgtacag aaacataaag 6000aggtggcaat cttacactgc
cggtgtcacc agaaaggaaa ggaaagggaa atagaaagga 6060accaccaagc ggatattgaa
gccaaaagag ccgcaaggca ggaccctcca ttagaaatgc 6120ttatagaagg acccctagta
tggggtaatc ccctccagga aaccaagccc cagtactcag 6180aagaagaaat agaatgagga
acctcacaag cacatagttt cctcccctca ggatggctag 6240ccactgaaga aggaaaaata
cttttgcctg cagctaacca atggaaatta cttaaaaccc 6300ttcaccaaac atttccctta
ggcattgata gcacccatca gatggccaaa ttattattta 6360ctggaccagg ccttttcaaa
actatcaagc agatagtcag ggcctgtaaa gtgtgccaaa 6420caagtaatcc cctgcactgc
aggccataca tttcaatccc tgtatcttta acctccttgt 6480taagtttgtc tcttccagaa
tcaaagctgt aaaactacaa atagttcttc aaatggagcc 6540ccagatgtag tccatgacta
agatctaccg cggacccctg gacaagcctg ctagcccatg 6600ctctgatgtt aatgacatgg
aaggcacccc tcccgaggaa atcgcaactg cacaacccct 6660attacacccc aattcagcag
gaagcagtta gagcattcat cagccaacct ccccaacagc 6720acttgggttt tcctattgag
agggggtact gagagacagg actagctgga tgtcctaggc 6780tgactaagaa tccctaagcc
tagctgggaa ggtgaccaca tccaccttta aatacggggc 6840ttgcaaccta gctcacaccc
aacagatcag agagctcgtt aaaatgctaa ttaggcaaaa 6900acaggaggta aagaaatagc
caatcatcta ttgcctgaga gcacagcagg agggacaagg 6960attgggatat aatcccaggc
attcgagctg gcaacagcaa ccccctttgg gtcccctccc 7020tttgtatggg agctgttttc
actctatttc actctattaa atcttgcaac tgc 7073

SEQ ID NO: 9, length 5068, DNA, Homo sapiens
gcaacctcct ttgagtcccc tccctttgta taggagctct gttttcactg tgtttcactc
60tattaaatct tgcaattgca ctcttctggt ccatatttgt cacggcttga gctgagcttt
120cacttgccgt ccaccactac tgtttgctgc tgtcacagac ccgccgctga ctcccatccc
180gctgctgact cccatccctc cggatccggc agggtgtccg ctgtgctcct gatccagcaa
240gactcccatt gccactcccg atagtgctaa aggcttgcca ttgttcctgc atggctaagt
300gcctgggttc gtcctaatcc agctgaacac tagtcactgg gttccacggt tctcttccat
360gacccgcggc ttctaataga gctataacac tcaccacatg gcccaatatt ccattccttg
420gaatccgtga ggccaagaac cccaggtcag agaacacgag gcttgccacc atcttggaag
480cagcctgcca ccatcttgga agtggctcac caccgtcttg ggagttctgt gaacaaggac
540ccctggtaac attttggcga ccacgaaggg acatccaaag ctgtgagtaa tattggacca
600ctttcgcttg ctattctgtt ctatccttag aactggagga aaatactggg cacctgtcgc
660cagttaaaaa tgattagcat ggccgccgga cttaagactc aggtgtgagg ctatctggga
720aagggctttc taacaacccc caagccttct gttgggaact ttggtctgcc tggagccagc
780ttccactttc aattttcttg gggaagccaa gggctgactg gaggcagaaa gctgttgtcc
840cgaactcccg gcagtagccg gttgagatca tggcgcagcc agaagtctct actcggcagt
900cgcccatgcg tgcgccctta cctttccttc tgaattatac ctccggggtc ccgactccga
960ctttcttgag agtttagccc caaaattctc cttacctctg aatctacttc ctttgatccc
1020tgcctcctgc ctcctaggta ctaatagttc agactttcat ttcctctagc aagttgtgtc
1080tccaaaggga tctaaggagg ctctatgctg tgtccttagg cacctaggct ataacccagg
1140gagtcttatc cctggtatcc ctcccgattt aggtatacag ctcttgacat gggcagttat
1200gtgggacctg ttccccacca cccttgtgag ggccccaagt ttgtaatggc taagaaagag
1260agacggagag agagagagac ggagaaagag acaaagaggg agtcaaagag aaaaagaaag
1320aaaaagatag aaatagttaa aaaaaaaaaa aagtgtgccc tattccttta aaagccaggg
1380taaatttaaa acctgtaatt gataattgcc actttgttgt cagtgtaaat aagggcgtag
1440caaatcctta acccagtaac ccgcggatag gccaaatgca ttcagtcggt agcggcaaca
1500gctttgctaa aagtagaaaa gtaactttta gaggaaacct cattgtgagc acacctcacc
1560agttcagagt tattctaagt aaaaaaaaaa aaaaaaaaaa aagcaaaaag gtagcttact
1620aactcaataa tcttaaagta tggggctact atgctagaaa agggtaatgt aactccaacc
1680actgataact cccttaaccc agcagatttc ctaacagggg atttaaatct taattaccac
1740acgaaggtcc gaccagacct aggaggaact cccttcagca caggacgata gatggttcct
1800cccaggtgac tgaggaaaaa actacaatgg gtattcagta attggtatgg agactcttgt
1860ggaagcagag ttaaaaattt gcctaataat tggtctcctc aaatgtgcga gctgtttgca
1920ctcagccaag ccttaaagta cttacagaat caaaagacta tctcaatcct gactcaaaag
1980gttagctaca cagtctctga aatgaatttg cagaagaact gttgtttatg ggaatgcatc
2040ttgatggggc agctgggttg ttatgaaata ctcaggaacc cagcccagct ctaggactca
2100ccgctgagcg caaaggcaat gttgggcacg ctggtaaagg accactagaa tccagcagcc
2160caggcccctt tctttgtggt caagaaaggc aggaaaagga gtgcagaact gctacattgg
2220tgagcgtaac taatccaata agcagaggtc catgagtggt tatgcacgct ggaaaagaat
2280aagcattagg cccttagagg atgctctagg actaatgctc atcggaaaat gactaggggt
2340gctggcatcc ttatgttctt tcttcagatg ggaaacgttc cccccaaggc aaaagcgccc
2400ctaagatgta ttctggagaa ttagaaccaa tttgaccctc agatgtcaag aaagaaacga
2460cttatattct tctgcagtac tgcctggcca cgatatcctc ttcaaggggg agaaacctgg
2520cctcctgagg gaagtacaaa ttataacacc atcttacagc tagacctctt ttgtagaaaa
2580gaaggcaaat ggagtgaagt gccatatgtg caaactttct tttcattaag agacaactca
2640caattatgta aaaagtgtgg tttatgtctt acaggaagcc ctcagagtct acctccctat
2700cccagcattc ccccgactcc ttccccaact aataagcacc acccttgaac ccaaacagtc
2760caaaaggaga tagacaaaca ggtaaacaat gaaccaaaga gtgtcagtat tccccgatta
2820tgccccttcc aagcagtggg aggaggagaa ttcggcccag ccagagtgca tgtacctttt
2880tctctctcag acttaacgca aattaaaata gacttaggta aattctcaga taaccctgat
2940ggctacattg atgttttaca agggttaggg caatcctttg atctgacatg gagagatata
3000atgttactgc taaatcagac actaacccca aatgagagaa gtgccgccgt aactgcagcc
3060cgagagtttg gtgatctctg gtatctcagt caggtcaatg ataggatgac aacagagaaa
3120agagaacgat tccccacagg ccagcaggca gtttccagtg tagaccctca ttaggacaca
3180gaatcagaac atggagattg gtgccacaga tatttgctaa cttgagtgct agaaggacta
3240aggaaaacta ggaagaagcc tatgaattat tcagtgatgt ccactataac acaaggaaag
3300gaagaaaatc ctactgcctt tctggagaga gtaagggagg cattaaggaa gcatacctcc
3360ctgtcacctg actctattga aggccaacta atcttaaagg ataagtttgt cactcagtta
3420gctgcagaca ttagaaaaaa acttcaaaag tccgacttag gcctggagta cggctgagtg
3480cccaatttgg cagcaggcaa gaccaacact gagcccttca tatggcacca tgctttgtgg
3540tgatcagcca actacttgat ggcaggttga ttatattgga catctttcat cagagaaatg
3600gcagtggttt gtccttcctg gaatagacac ttattctcga tatgggtttg tctatcctgc
3660aggcaatgct tctgccagga gtaccatctg tggactcatg gaaagcctta tccaccatca
3720tggcattcca cacagcattg cctctaaaca aggcacttat tttatagcta aggaagtgtg
3780gcagtgggct catgctcatg gaattcactg attgtatctt gttgcccatt atcttaaagc
3840agctggattg atagaacagt ggaaaggcca tttgaaatca caattacacc accaactagg
3900tgacaatact ttgcagggct cggcaaagtt ctcttgaagg ctgagtatgt cctgaatcag
3960catccaatat atggtactgt ttccctcata gccagcattc acaggcctaa gaatcaaggg
4020gtagaagtag aagtggcacc actcaccatc actcctagtg acccactagc aaaaatttta
4080cttccagttc ccccaacatt atgttctgct ggccttagtt ccagagggaa gaattctgcc
4140accagtcgac acaagaatga taccattaaa ctgaaagtta aaattgccac ctggccactt
4200tgggctcctc ccacctctaa gtcaacaggt caagaaagga gttacagtgt tgacttgggt
4260gattgacctg gactatcaag atgaaatcag gttactactc cacagtggag gtaaggaaga
4320atatgtgtgg aatacaggag atcccttagg ccgtctttta gtactaccat gccctgtgat
4380taaggtcagt ggaaaactac aacaatccaa tctaggcagg actacaaatg gcccagactc
4440ttcaggaatg aagggttggg tgacttcacc aggtaaaaaa ataacagcct gctgaggtgc
4500tagctgaagg caaagggaat acagaatggt tagtagaaaa aggtagtcat caataccagc
4560tatgaccaca agaccagttg cagaaatgag acctgtaatt gtcatgtgga tttcctcctt
4620acatgtttgt gcatgtatac acttctacta agaaaatacc tttatttatt tcctttgctt
4680ttcccttatc aagtgacatt attaacttca tatcagcagt taagtgttat taactttatg
4740taatagcatt tcggttaata attcacttct ggttgtatga aggatagccg tattaagtta
4800ggtgtaatta tgacatcatt attgtcttta tttgaagatt atgtgtaatt tcaggagatg
4860tgtatgggtt caagttgaca agggatggac ttgtgatggc taatgttgag tgtcaacttg
4920actgaggatg caaagtattg ttcctgggtg tgtctgtgag ggtgttgcca aaggagatta
4980acatttgtgt cagtgaactg ggagatgcag acccacccgc aatctgggtg agcaccatgt
5040aatcagctgc cagagcagct agaataaa
5068

SEQ ID NO: 10, length 535, DNA, Homo sapiens
gcaaccccct ttgggtcccc ttcccttgta tgggagctct gttttcactc tatttcactc 60
tattaaatct tgcaactgca ctcttctggt ccatgtttgt tacggctcga gctgagcttt 120
ggctcgccat ccaccactgc tgtttgccgc cgtcgcacac ctgctgctga ctcccatccc 180
tccggatcca gcagggtgtg tccgctgtgc tcctgatcca gcgaggtgcc cattgccgct 240
cctgattgga ctaaaggctt gccattgttc ctgcacggct aagtgcccgg gttcgtccta 300
atccagctga acactagtca ctgggttcca cggttctctt ccttgaccca cggcttctaa 360
tagagctata acactcaccg catggcccaa gattccattc cttggaatct gtgaggccaa 420
gaaccccagg tcagagaaca cgaggcttgc caccatcttg gaagcggcct gccaacatct 480
tggaagtggc tcgccaccat cttgggagct ctgtgagcaa ggacccctgg taaca 535

SEQ ID NO: 11, length 6905, DNA, Homo sapiens
gctaccctct ttgggtcccc
tccctttgta tgggagctct gttttcactc tattcaatct 60tgcaactgca ctcttctggt
ccgtgtttgt tacagcttga gctgagcttt cgctcgcctt 120ccaccactgc tgtttgccgc
catcgcagac ctgccgtgct gacttccatc cctctagatc 180tggcagggtg tccgctgtgc
tcttgatcca gcgaggcgcc cattgccgct cccgattggg 240ctaaaggctt gcaattgttc
ctgcacgcta agtgcctggg ttcatcctca tcaagctggg 300ttccacggtt ctcttcatga
cccgcagctt ctaacagagc tataaaactc tgtgcatggc 360ccaagattcc attccttgga
atctgtgagg ccaagaaccc caggtcagag aacaggaggc 420ttgccaccat cttggaagtg
gctcgccacc atcttaggag ctctgtgagc ggagaccccc 480accccccggt aacattttgg
cgaccacgaa gggacctcca aagcggtgag taatattgga 540tcactttcgc ttgctattct
gtcctatcct tctttagaat tggaggaaaa tactgggcac 600ctgtcggcca gttaaaaaca
attagcgtgg ctgcccgact taagactcag gtgtgaggct 660atctggggaa gggctttcta
acaaccccca acccttctgg gttggggacg ttggtctgcc 720ccttccactt tcaattttct
tggggaagcc aagggtcgac tagaggcaga aagctgtcgt 780ccggaactcc tggcagtagc
cggttgagat catggcgcag ccagaagtct ctactcaaca 840gtcgcccatg cgtgcgctcc
tacctttcct cctgacccat acctcctggg tcccgacgat 900gactttcttg aaagtgtagc
cccaaaattc tgcttacctc tgaatctact tcccctgatc 960cctggctcct aggtactaat
ggttcagttt catttcctct agcaagttgt atctccaaag 1020ggatctaagg aagctctacg
ctgcgtcctt aggcatctag gctataaacc caggaagtct 1080tgtccctggt gtccctcccg
atttaggcat acagctctcg acatgggcag ttatgtggga 1140cccgttcccc atcacccttg
tcaaggcccc aagtttgtaa tggctaagag gagagagaga 1200gaaagagaga gagacggagg
ggagagagag agagagagat ggaggggaga gagagagaga 1260gagacggagg ggagagagag
agagagagag agacggaggg gagagagaga gagacggagg 1320ggagagagag agagatggag
gagagaaaga caaagggagt caaagagaaa aagaaagaga 1380aagacagaaa tggtaaaaca
aacaaaaaac agcgtgccct attcctttaa aagccggggt 1440aaatttaaaa cctataattg
ataattgaag gtcttctcca tgaccctata atactccaat 1500actaccttgt tgtcagtgta
aacaagggcg tagcctgaaa acactgagac cactgacaac 1560ctgcagcttt cctatcaaaa
aatccttaac ccagtaaccg gcagatgcat tcaatctgta 1620gcagcaactg ttttgctaac
agaagaaagt agaaaagtaa cttttagagg aaacctcatt 1680gtgagcacac cttaccagtt
cagaattatt ctaagtcaaa aaagcaaaaa ggtagcttac 1740taactcaaaa atcttaaagt
atggggctat tgtgtttaaa aaaaaaaaaa ggtaatttaa 1800caccaaccac tgataattct
cttaacccag caggtttcct aacaggggat ttaaatctta 1860attaccatac aaaggtctga
ccacacctag gaggaactcc cttcaggaca ggactataga 1920gggttcctcc caggtgattg
aggaaaaaac cacagtgggt attcagtaat tgatagggag 1980actcttgtgg aagcagagtt
agaaaaattg cctaataaat ggtgtcctca aaagtgtgag 2040ctgtttgcac tcagccaagc
cttaaagtac ttacagaatc gtaaaaacta tctcaatcct 2100gactcaaaag tttacttaca
ccctctctga aatgaattta cataagaact gcttttttgg 2160gaatgcatct tgatggggca
gctgggtggt tatgaaatac tcaggaaacc agcccagctc 2220taggacacat ccctgagcac
aaaggcaatg ttgggcacgc tggtaaagga ccactagaat 2280ccagcagcct ggactccttt
ctttgtggtc aagaaaggca ggaaaacagg tgcaggactg 2340ctacatcagt gagcataact
aatctgataa gcagagggcc ttgggtggtt acacaccctg 2400gaaaggaatt caactctgag
cgcaaaggca atgttgggca cattggtaaa ggaccactag 2460aatccagcag cccaggcccc
tttctttatg gtcaagaaag gcgggaaaag gggtgcagga 2520ctgttacctc ggtgagcgta
actaatccga taagcagagg tccatgggtg attacgcacc 2580ctgaaaagaa taagcattag
gcccttaaag gatgctctag gactaatgct cattggaaaa 2640tgactagggg tgctggcatc
cctatgttct tttctcagac gggaaatgtt ctccaccctc 2700cccaaggcaa aaacacccct
aagatgtatt ctggagaatt gggaccaatt tgacccccag 2760acgctaagaa agagatgact
tatgttcttc tgcagtacca cctggccacg atatcctctt 2820caagggggag aaacctggcc
tcctgaggga agtataaatt ataacaccat cttacagcta 2880gacctcttct gtagaaagga
gggcaaatgg agtgaagtgc catatgtgca aactttcttt 2940tcattaagag acaacttgca
attatgtaag aagtgtgatt tatgccctac aggaagccct 3000cagagtctac ctccctaccc
cagcatcccc ctgactcctt ctccaactaa taaggaaccc 3060ccttcaaccc aaacggtcca
aaaggagata gacaaagggg taaacaatga accaaagcgt 3120gccaatgttc cctgattatg
ccccctctaa gcagtgggag gaggagaatt tggcccagcc 3180agtgtgcatg tgcctttttc
tctctcagac ttaaagcaaa ttaaaataga cctaggtaaa 3240ttctcagata accctgatgg
ctatattgat gttttataag ggttaggata atcctttgat 3300ctgacatgga gagatataat
gttactgcta gatcagacac taaccccaaa tgagacaagt 3360gccgccataa ctgcagcctg
agagtttggc gatctctggt atctcactcg ggtcaatgat 3420aggaggacaa cagaggaaag
agaatgattc cccacagacc agcaggcagt tcccagtgta 3480gaccctcact gggacacaga
atcagaacat ggacattggt gctgcagaca tttgctaact 3540tacatgctag aaggactaag
gaaaactagg aagaagccta cgaattattc aatgatgtcc 3600actataacac agggaaagga
agaaaatcct actgcctttc tggagcgact aagggaggca 3660ttgaggaagc atacttccct
gtcacctgac tctattgaag gccaactaat cttaaaggat 3720aagtttatca ctcagtcagc
tgaagacatt aggaaaaaac ttcaaaagtc tgccttaggc 3780ccagagcaaa acttagaaac
cccattgaac ttggcaacct cggtttttta taatagagat 3840caggaggagc aggcggaaca
ggacaaacgg ggtaaaaaaa aggccaccgc tttagttatg 3900gccctcaggc aagtggactt
tggaggctct ggaaaaggga aaagctgggc aaatcgaatg 3960cctactaggg cttgcttcca
gagtggtcta caaggacact ttgaaaaaga ttgtccaagt 4020agaaataagt cgccccttcg
tccatgcccc ttatatcaag ggaatcactg gaaggcccac 4080tatcccaggg gacaaatgtc
ctctgagtca gaagccacta accagatgat ccagcagcag 4140gactgagggt gcccagggca
agcactagcc catgccgtca ccctcacaga gccccaggta 4200tgcttgacca ttgagggcca
ggaggttaac tgtctcctgg acactagcac ggccttctca 4260gtcttactct cctttcccgg
acaactgtcc tccagatctg tcactatccg agggttccta 4320ggacagtcag tcactagata
cttatcccag tcactaagtt gtgactggtg aactttactc 4380ttttcacatg cttttctaat
tatccctgaa agcaccactc ccttgttagg gcgagacatt 4440ctagcaaaag caggggccat
tatacacctg aacataggag aaggaacacc tgtttgttgt 4500cccctgcttg aggaaggaat
taatcccgaa gtctgggcaa cagaaggaca atacggacga 4560gcaaagaatg cctgtgctgt
tcaagttaaa ctaaaggatt ccgcctcctt tccctaccaa 4620aggcagtacc cccttagacc
tgaggcccaa caaggactcc aaaagattgt taaggaccta 4680aaagcccatg gcctagtaaa
accatgcaat agcccctgca atactccaat tttaggagta 4740cagaaaccca acagacagtg
gaggttagtg caagatctca ggattatcat tgaggctgtt 4800gttcctgtat agccagctgt
acctaaccct tatactctgc tttcccaaat accacaggaa 4860gcagaggggt ttacagtccg
gggccttaag gacacctttt tctgcatccc tgtatatcct 4920gactctcaat tcttgtttgc
ctttgaagat ccttcaaact caacgtctca actcacctgg 4980aatgttttac cccaagggtt
cagggatagc ccccattagc ccaagacttg agccagttct 5040tatacctgga cactcttgtc
ctttggtacg tggatgattt acttttagcc acctgttcag 5100aaaccttgtg ccatcaagcc
acccaagcac tctttaattt cctcgccacc tgtggctaca 5160ggtttccaaa ccaaaggctc
agctctgctc acagcaattt aaatgcttag ggctaaaatt 5220atccaaaggc accagggccc
tcagtgagga aagtatccgg cctatactgg cttatcctca 5280tcccaaaacc ctaaagcaac
taagagtgtt ccttggcata acgggtttct gccgaatatg 5340gattcccagg tacagcgaaa
tagccagacc attatataca ctaattaagg aaactcagaa 5400agccaatacc catttggtaa
gatggacacc tgaagcagaa gcagatttcc aggccctaaa 5460gaaggccctg acccaagccc
cagtgttaag cttgccaatg gggcaagact tttctttata 5520tgtcacagaa aaaacaggaa
tagctccagg agtccttacg cagatccaag ggacgagcct 5580gcaacccatg gcatacctga
gtaaggaaat tagtggcaaa gggttggcct cattgtttat 5640gggtagtggc agcagtcaca
gtcttagtaa ctgaagcagt taaaatgata caaggaagag 5700atcttactgt gtggacatct
catgatgtga atggcatact cactgctaaa ggagacttgt 5760gactgtcaga caactgttta
cttaaatatc aggctctatt acttgaaggg ccagtgttgc 5820gactgtgcac ttgtgcaact
cttaacccag ccacattgct tccagacaat gaagaaaaga 5880tagaacataa ctgtcaacaa
ataattgctc aaacctacac tgctcgaggg gaccttttag 5940aagttccctt gactgatccc
gatctcaact tgtatactga tggaagttcc tttgcagaaa 6000aaggacttca aaaggcggtg
tatgcagtag tccttcaaaa tcgaagagct ttagaattgc 6060taatcactga gagaggggga
acgtttttat ttttagggga agaatgctgt tattatgtta 6120atcaattcgg aatcatcacc
aagaaagtta aagaaattca agatcgaata caacgtagaa 6180cagaggagct taaaaaacac
tggaccctgg ggcctcctca gccaatggat gccctggatt 6240ctccccttct taggacctct
agcagctata tttctactcc tctttggacc ctgtatcttt 6300aacctccgtg ttaagtttgt
ctcttccaga atcgaagatg taaaactaca aatcgttctt 6360caaatggacc cccagatgca
gtccatgact aagatctact gaggacccct ggaccagcct 6420gctagcccat gctccaatgt
taatgacatt gaaggcaccc ctcccaagga aatctcaact 6480gcacaacccc tactatgctc
caattcagca ggaagcagtt acagtggtcc tcggccaacc 6540tccccaacag catttgtatt
ttcctgttgg gagggggcac tgagagacag gactagctgg 6600atttcctagg ctgactgaga
atccctaagc ctagctggga aggtgaccac ttccaccttt 6660aaacacaggg cttgcaactt
agctcacacc ctaccaattg gatagtaaag agaggtcact 6720aaaatgctaa ttaggcaaaa
acaggaggta aagaaatagc caatcatcca ttgcctgaga 6780gcacagcggg agggacaatg
accaggatat aaacccaggc attccagcct gcaacggcaa 6840ccccctttgg gtcccctctc
tttgtatggg agctctgttt tcactctatt caatcttgca 6900
actgc 6905

SEQ ID NO: 12, length 9565, DNA, Homo sapiens
gcagcccccc tttgggtccc ttccctttgt atgggagctg ttttcatgct atttcactct
60attaaatctt gcaactgcac tcttctggtc catgtttctt acggctcgag ctgagctttt
120gctcaccgtc caccactgct gtttgccacc accgcagacc tgccgctgac tcccatccct
180ctggatcctg cagggtgtcc gctgtgctcc tgatccagcg aggcgcccat tgccgctccc
240aattgggcta aaggcttgcc attgttcctg cacggctaag tgcctgggtt tgttctaatt
300gagctgaaca ctagtcactg ggttccatgg ttctcttctg tgacccacgg cttctaatag
360aactataaca cttaccacat ggcccaagat tccattcctt ggaatccgtg aggccaagaa
420ctccaggtca gagaatacga ggcttgccac catcttggaa gcggcctgct accatcttgg
480aagtggttca ccaccatctt gggagctctg tgagcaagga ccccccggta acattttggc
540aaccacgaac ggacatccaa agtggtgagt aatattggac cactttcact tgctattctg
600tcctatcctt ccttagaatt ggaggaaaat accgggcact tgtcggccag ttaaaaacga
660ttagtgtggc caccggactt aagactcagg tgtgaggcta tctggggaag ggctttctaa
720caacccccaa cccttctggg ttggggactt ggtttgcctc aagccagctt ccactttcag
780ttttcttggg gaagccgagg gccgactaga ggcagaaagc tgtcgtcctg aactcccggc
840agtagccggt tgagatcatg gtgtagccag aagtctcaac agtcgcccat gcatgcaccc
900ctatctttcc ttctgaccca tacctcctgg gtcccaacca caactttctt caaagtgtag
960ccccaaaatt ctccttacct ctgaatatac ttcctctgat ccctgcctcc taggtactat
1020tggttcagac ttccatttcc tctagcaagt tgtatctcca aagggatcta aggaagctct
1080gcgctgcgtc cttaggcacc taggctataa cccagggagt cttatccctg gtgtccctcc
1140caatttaggc atacagctct tgacatgggc agttatgtag gacccactcc ccaccaccct
1200tgccagggcc ccaagtttgt aaatggctga gggaaaagag agacagagga gagagagaga
1260aatggaggag aaagagagag agacagagag gagagagaga cagtgagaga gacagaagag
1320agagagagac aaagaggaga gagagagagt caaagagaga aagaaagaga aagaaatagt
1380aaaaaacagt gtgccctatt cctttaaaag ccagggtaaa tttaaaacct gtacttgata
1440attgaaggtc ttctctgtga ccctatagca ctccaatcca ctttgtggtc agtgtaaata
1500agagcatagg ccgaaagcac tgaggccatt gacaacccgt agcttcccta tcaaaaatcc
1560ttaacccagt aacccgcaga tggaccaaat gcattcagtc ggtagcgcaa ctgctttgct
1620aaaagtagaa aagtaacttt tagaggaaac ctcattgtga gcacacctca cctgttcaga
1680attattctaa taaaaaaagc aaaaaggtag cttactaact caaaaatctt aaagtatggg
1740gctattctgt tagaaaaagg taatgtaact ccaaccactg ataattccct taacccagca
1800gatttcctaa cgggatttaa atcttaatta ccatacaaag gtccgaccag acctaggcgg
1860aactcccttc aggacaggac gatagatggt tcctcccagg tgattgagga aaaaaaccac
1920aatgggtatt cagtaattga tacggggact cttgtggaag cagagttaga aaaattgcct
1980aataactggt ctcctcaaac gtgtgagctg tttgcactca gccaagcctt aaagtactta
2040cagaatcaaa agactatctc aatcctgatt caaaaggtta gctacaccct ctctgtaatg
2100catttgcata agaacttgtt tatgggaatg catcttgatg gggcagctgg gttgttataa
2160aataggaacc cagcccagct ctaggactca cccctgagcg caaaggcaat gttgggcatg
2220ctggtaaagg accactagaa tccagcagcc cagacccctt tctttgtggt caagaaaggc
2280gggaaaaggg gtgcaggact gctacatcgg taagcataac taatccgata aacagaggtc
2340catgggtggt tacgcaccct ggaaaggaac tcacccctga gcacaaaggc aatgttgggc
2400acgctggtaa aggaccacta gaatccagca gcctggaccc ctttctttgt ggtcaagaga
2460ggcaggaaaa caggtgcagg actgcaacat cagtgagcat aactaattcg ataagcagag
2520gtccatgggt ggtgatgcac cctggaaaga ataagcatta ggaccataga ggacactcca
2580ggactaaagc tcatcggaaa atgactaggg ttgctggcat ccctatgttc ttttttcaga
2640tgggaaacgt tccccgcaag acaaaaacgc ccctaagacg tattctggag aattgggacc
2700aatttgaccc tcagacacta agaaagaaac gacttatatt cttctgcagt gccgcctggc
2760actcctgagg gaagtataaa ttataacacc atcttacagc tagacctctt ttgtagaaaa
2820ggcaaatgga gtgaagtgcc ataagtacaa actttctttt cattaagaga caactcacaa
2880ttatgtaaaa agtgtgattt atgccctaca ggaagccttc agagtctacc tccctatccc
2940agcatccccg actccttccc caactaataa ggacccccct tcaacccaaa tggtccaaaa
3000ggagatagac aaaagggtaa acagtgaacc aaagagtgcc aatattcccc aattatgacc
3060cctccaagca gtgggaggaa gagaattcgg cccagccaga gtgcatgtgc ctttttctct
3120cccagactta aagcaaataa aaacagactt aggtaaattc tcagataacc ctgatggcta
3180tattgatgtt ttacaagggt taggacaatt ctttgatctg acatggagag atataatgtc
3240actgctaaat cagacactaa ccccaaatga gagaagtgcc accataactg cagcctgaga
3300gtttggcgat ctctggtatc tcagtcaggt caatgatagg atgacaacag aggaaagaga
3360atgattcccc acaggccagc aggcagttcc cagtctagac cctcattggg acacagaatc
3420agaacatgga gattggtgct gcagacattt gctaacttgt gtgctagaag gactaaggaa
3480aactaggaag aagtctatga attactcaat gatgtccacc ataacacagg gaagggaaga
3540aaatcctact gcctttctgg agagactaag ggaggcattg aggaagcgtg cctctctgtc
3600acctgactct tctgaaggcc aactaatctt aaagcgtaag tttatcactc agtcagctgc
3660agacattaga aaaaaacttc aaaagtctgc cgtaggcccg gagcaaaact tagaaaccct
3720attgaacttg gcaacctcgg ttttttataa tagagatcag gaggagcagg cggaacagga
3780caaacgggat taaaaaaaag gccaccgctt tagtcatgac cctcaggcaa gtggactttg
3840gaggctctgg aaaagggaaa agctgggcaa attgaatgcc taatagggct tgcttccagt
3900gcggtctaca aggacacttt aaaaaagatt gtccaagtag aagtaagccg ccccctcgtc
3960catgcccctt atttcaaggg aatcactgga aggcccactg ccccagggga caaaggtcct
4020ctgagtcaga agccactaac cagatgatcc agcagcagga ctgagggtgc ctggggcaag
4080cgccatccca tgccatcacc ctcacagagc cctgggtatg cttgaccatt gagggccagg
4140aggttgtctc ctggacactg gtgcggtctt cttagtctta ctcttctgtc ccggacaact
4200gtcctccaga tctgtcacta tctgaggggg tcctaagacg ggcagtcact agatacttct
4260cccagccact aagttatgac tggggagctt tattcttttc acatgctttt ctaattatgc
4320ttgaaagccc cactaccttg ttagggagag acattctagc aaaagcaggg gccattatac
4380acctgaacat aggagaagga acacccgttt gttgtcccct gcttgaggaa ggaattaatc
4440ctgaagtctg ggcaacagaa ggacaatatg gacgagcaaa gaatgcccgt cctgttcaag
4500ttaaactaaa ggattccacc tcctttccct accaaaggca gtaccccctc agacccaagg
4560cccaacaagg actccaaaag attgttaagg acctaaaagc ccaaggccta gtaaaaccat
4620gcagtaaccc ctgcagtact ccaattttag gagtacagaa acccaacaga cagtggaggt
4680tagtgcaaga tctcaggatt atcaatgagg ctgttgttcc tctatagcca gctgtaccta
4740gcccttatac tctgctttcc caaataccag aggaagcaga gtggtttaca gtcctggacc
4800ttcaggatgc cttcttctgc atccctgtac atcctgactc tcaattcttg tttgcctttg
4860aagatacttc aaacccaaca tctcaactca cctggactat tttaccccaa gggttcaggg
4920atagtcccca tctatttggc caggcattag cccaagactt gagccaatcc tcatacctgg
4980acacttgtcc ttcggtaggt ggatgattta cttttggccg cccattcaga aaccttgtgc
5040catcaagcca cccaagcgct cttcaatttc ctcgctacct gtggctacat ggtttccaaa
5100ccaaaggctc aactctgctc acagcaggtt acttagggct aaaattatcc aaaggcacca
5160gggccctcag tgaggaacac atccagccta tactggctta tcctcatccc aaaaccctaa
5220agcaactaag gggattcctt ggcgtaatag gtttctgccg aaaatggatt cccaggtatg
5280gcgaaatagc caggtcatta aatacactaa ttaaggaaac tcagaaagcc aatacccatt
5340tagtaagatg gacaactgaa gtagaagtgg ctttccaggc cctaacccaa gccccagtgt
5400taagtttgcc aacagggcaa gacttttctt catatgtcac agaaaaaaca ggaatagctc
5460taggagtcct tacacagatc cgagggatga gcttgcaacc tgtggcatac ctgactaagg
5520aaattgatgt agtggcaaag ggttgacctc attgtttacg ggtagtggtg gcagtagcag
5580tcttagtatc tgaagcagtt aaaataatac agggaagaga tcttactgtg tggacatctc
5640atgatgtgaa tggcatactc actgctaaag gagacttgtg gctgtcagac aactgtttac
5700ttaaatgtca ggctctatta cttgaagggc cagtgctgcg actgtgcact tgtgcaactc
5760ttaacccagc cacatttctt ccagacaatg aagaaaagat aaaacataac tgtcaacaag
5820taatttctca aacctatgcc actcgagggg accttttaga ggttcctttg actgatcccg
5880acctcaactt gtatactgat ggaagttcct ttgtagaaaa aggacttcga aaagtggggt
5940atgcagtggt cagtgataat ggaatacttg aaagtaatcc cctcactcca ggaactagtg
6000ctcagctagc agaactaata gccctcactt gggcactaga attaggagaa gaaaaaaggg
6060caaatatata tacagactct aaatatgctt acctagtcct ccatgcccat gcagcaatat
6120ggaaagaaag ggaattccta acttctgaga gaacacctat caaacatcag gaagccatta
6180ggaaattatt attggctgta cagaaaccta aagaggtggc agtcttacac tgccggggtc
6240atcagaaagg aaaggaaagg gaaatagaag agaactgcca agcagatatt gaagccaaaa
6300gagctgcaag gcaggaccct ccattagaaa tgcttataaa acaaccccta gtatagggta
6360atcccctccg ggaaaccaag ccccagtact cagcaggaga aacagaatgg ggaacctcac
6420gaggacagtt ttctcccctc gggacggcta gccactgaag aagggaaaat acttttgcct
6480gcaactatcc aatggaaatt acttaaaacc cttcatcaaa cctttcactt aggcatcgat
6540agcacccatc agatggccaa atcattattt actggaccag gccttttcaa aactatcaag
6600cagatagtca gggcctgtga agtgtgccag agaaataatc ccctgcctta tcgccaagct
6660ccttcaggag aacaaagaac aggccattac cctggagaag actggcaact gattttaccc
6720acaagcccaa acctcaggga tttcagtatc tactagtctg ggtagatact ttcacgggtt
6780gggcagaggc cttcccctgt aggacagaaa aggcccaaga ggtaataaag gcactagttc
6840atgaaataat tcccagattc ggacttcccc gaggcttaca gagtgacaat agccctgctt
6900tccaggccac agtaacccag ggagtatccc aggcgttagg tatacgatat cacttacact
6960gcgcctgaag gccacagtcc tcagggaagg tcgagaaaat gaatgaaaca ctcaaaggac
7020atctaaaaaa gcaaacccag gaaacccacc tcacatggcc tgctctgttg cctatagcct
7080taaaaagaat ctgcaacttt ccccaaaaag caggacttag cccatacgaa atgctgtatg
7140gaaggccctt cataaccaat gaccttgtgc ttgacccaag acagccaact tagttgcaga
7200catcacctcc ttagccaaat atcaacaagt tcttaaaaca ttacaaggaa cctatccctg
7260agaagaggga aaagaactat tccacccttg tgacatggta ttagtcaagt cccttccctc
7320taattcccca tccctagata catcctggga aggaccctac ccagtcattt tatctacccc
7380aactgcggtt aaagtggctg gagtggagtc ttggatacat cacacttgag tcaaatcctg
7440gatactgcca aaggaacctg aaaatccagg agacaacgct agctattcct gtgaacctct
7500agaggatttg cgcctgctct tcaaacaaca accaggagga aagtaactaa aatcataaat
7560ccccatggcc ctcccttatc atatttttct ctttactgtt cttttaccct ctttcactct
7620cactgcaccc cctccatgcc gctgtatgac cagtagctcc ccttaccaag agtttctatg
7680gagaatgcag cgtcccggaa atattgatgc cccatcgtat aggagtcttt ctaagggaac
7740ccccaccttc actgcccaca cccatatgcc ccgcaactgc tatcactctg ccactctttg
7800catgcatgca aatactcatt attggacagg aaaaatgatt aatcctagtt gtcctggagg
7860acttggagtc actgtctgtt ggacttactt cacccaaact ggtatgtctg atgggggtgg
7920agttcaagat caggcaagag aaaaacatgt aaaagaagta atctcccaac tcacccgggt
7980acatggcacc tctagcccct acaaaggact agatctctca aaactacatg aaaccctccg
8040tacccatact cgcctggtaa gcctatttaa taccaccctc actgggctcc atgaggtctc
8100ggcccaaaac cctactaact gttggatatg cctccccctg aacttcaggc catatgtttc
8160aatccctgta cctgaacaat ggaacaactt cagcacagaa ataaacacca cttccgtttt
8220agtaggacct cttgtttcca atctggaaat aacccatacc tcaaacctca cctgtgtaaa
8280atttagcaat actacataca caaccaactc ccaatgcatc aggtgggtaa ctcctcccac
8340acaaatagtc tgcctaccct caggaatatt ttttgtctgt ggtacctcag cctatcgttg
8400tttgaatggc tcttcagaat ctatgtgctt cctctcattc ttagtgcccc ctatgaccat
8460ctacactgaa caagatttat acagttatgt catatctaag ccccgcaaca aaagagtacc
8520cattcttcct tttgttatag gagcaggagt gctaggtgca ctaggtactg gcattggcgg
8580tatcacaacc tctactcagt tctactacaa actatctcaa gaactaaatg gggacatgga
8640acgggtcgcc gactccctgg tcaccttgca agatcaactt aactccctag cagcagtagt
8700ccttcaaaat cgaagagctt tagacttgct aaccgctgaa agagggggaa cctgtttatt
8760tttaggggaa gaatgctgtt attatgttaa tcaatccgga atcgtcactg agaaagttaa
8820agaaattcga gatcgaatac aacgtagagc agaggagctt cgaaacactg gaccctgggg
8880cctcctcagc caatggatgc cctggattct ccccttctta ggacctctag cagctataat
8940attgctactc ctctttggac cctgtatctt taacctcctt gttaactttg tctcttccag
9000aatcgaagct gtaaaactac aaatggagcc caagatgcag tccaagacta agatctaccg
9060cagacccctg gaccggcctg ctagcccacg atctgatgtt aatgacatca aaggcacccc
9120tcctgaggaa atctcagctg cacaacctct actacgcccc aattcagcag gaagcagtta
9180gagcggtcgt cggccaacct ccccaacagc acttaggttt tcctgttgag atgggggact
9240gagagacagg actagctgga tttcctaggc tgactaagaa tccctaagcc tagctgggaa
9300ggtgaccaca tccaccttta aacacggggc ttgcaactta gctcacacct gaccaatcag
9360agagctcact aaaatgctaa ttaggcaaag acaggaggta aagaaatagc caatcatcta
9420ttgcctgaga gcacagcagg agggacaatg atcgggatat aaacccaagt cttcgagccg
9480gcaacggcaa ccccctttgg gtcccctccc tttgtatggg agctctgttt tcatgctatt
9540tcactctatt aaatcttgca actgc
9565

SEQ ID NO: 13, length 543, DNA, Homo sapiens
atgacaacag aagaaagaaa acaattcccc acaggccagc aggcagttcc cagcgtagac 60
cttcattggg acacagaatc agaacatgga gattggtgcc gcagacattt actaacttgc 120
gcgctagaag cactaaggaa aactaggaag aagcctatga attattcaat gatgtccact 180
ataacacagg gaaaggaaga aaatcctact gcctttctgg agagactaag ggaggcattg 240
agaaagcata cctctctgtc acctgactct attgaaggcc aactaatctt aaaggataag 300
ttttccactc agtcagctgc agacattaga aaaaaacttc aaaagtctgc gttaggccgg 360
gagcaaaact tagaaaccct attgaacttg gcaacctcag ttttttatga tagagatcag 420
gaggatcagg tggaatggac aaatgagatt ttaaaaaaag gccaccactt tagtcatggc 480
cctcaggcaa gcagactttg gacactctgg aaaagggaaa agctgggcaa atcgaatgcc 540
taa 543

SEQ ID NO: 14, length 180, PRT, Homo sapiens
Met Thr Thr Glu Glu Arg Lys Gln Phe Pro Thr Gly Gln Gln Ala Val
1               5                   10                  15
Pro Ser Val Asp Leu His Trp Asp Thr Glu Ser Glu His Gly Asp Trp
            20                  25                  30
Cys Arg Arg His Leu Leu Thr Cys Ala Leu Glu Ala Leu Arg Lys Thr
        35                  40                  45
Arg Lys Lys Pro Met Asn Tyr Ser Met Met Ser Thr Ile Thr Gln Gly
    50                  55                  60
Lys Glu Glu Asn Pro Thr Ala Phe Leu Glu Arg Leu Arg Glu Ala Leu
65                  70                  75                  80
Arg Lys His Thr Ser Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile
                85                  90                  95
Leu Lys Asp Lys Phe Ser Thr Gln Ser Ala Ala Asp Ile Arg Lys Lys
            100                 105                 110
Leu Gln Lys Ser Ala Leu Gly Arg Glu Gln Asn Leu Glu Thr Leu Leu
        115                 120                 125
Asn Leu Ala Thr Ser Val Phe Tyr Asp Arg Asp Gln Glu Asp Gln Val
    130                 135                 140
Glu Trp Thr Asn Glu Ile Leu Lys Lys Gly His His Phe Ser His Gly
145                 150                 155                 160
Pro Gln Ala Ser Arg Leu Trp Thr Leu Trp Lys Arg Glu Lys Leu Gly
                165                 170                 175
Lys Ser Asn Ala
            180

SEQ ID NO: 15, length 435, DNA, Homo sapiens
atgattcagc cccaggactc agggtgccca gggcaagcgc cagcctatgc catcaccctc 60
acagagccct gggtatgctt gaccattgag ggtcaggagg ttaactatct cctggacact 120
ggcgtggcct tctcagtctt actctcctgt cccggacaac tgtcctccag atctgtcact 180
atccgagggt ttctacgaca gccagccact agatacttct cccagccact aagttgtgac 240
tggggaactc tactcttttc acatgttttt ctaattatgc ctgaaagccc cactcctttg 300
ttagggaaag acattctagc aaaagcaggg gccattatac acctgaacat aggagaagga 360
acacctgttt gttgtcccct gcttgaagaa ggaattaatc ctgaagtctg gacaacagaa 420
ggacaataca gatga 435

SEQ ID NO: 16, length 144, PRT, Homo sapiens
Met Ile Gln Pro Gln Asp Ser Gly Cys Pro Gly Gln Ala Pro Ala Tyr
1               5                   10                  15
Ala Ile Thr Leu Thr Glu Pro Trp Val Cys Leu Thr Ile Glu Gly Gln
            20                  25                  30
Glu Val Asn Tyr Leu Leu Asp Thr Gly Val Ala Phe Ser Val Leu Leu
        35                  40                  45
Ser Cys Pro Gly Gln Leu Ser Ser Arg Ser Val Thr Ile Arg Gly Phe
    50                  55                  60
Leu Arg Gln Pro Ala Thr Arg Tyr Phe Ser Gln Pro Leu Ser Cys Asp
65                  70                  75                  80
Trp Gly Thr Leu Leu Phe Ser His Val Phe Leu Ile Met Pro Glu Ser
                85                  90                  95
Pro Thr Pro Leu Leu Gly Lys Asp Ile Leu Ala Lys Ala Gly Ala Ile
            100                 105                 110
Ile His Leu Asn Ile Gly Glu Gly Thr Pro Val Cys Cys Pro Leu Leu
        115                 120                 125
Glu Glu Gly Ile Asn Pro Glu Val Trp Thr Thr Glu Gly Gln Tyr Arg
    130                 135                 140

SEQ ID NO: 17, length 312, DNA, Homo sapiens
atgccccctc caggcagtgg gaggaggaga attcggccca gccagagtgc atgtaccttt 60
ttttttctct cagacttaaa gcaaattaaa atagacctag gtaaattctc agataaccct 120
gatggctata ttgatgtttt acaagggtta ggacaatcct ttgctctgac atggagagat 180
ataatgttac tgctaaatca gacactaacc ccaaatgaga gaagtgtcac catagctgca 240
gcccaagagt ttggcaatct ctggtatctc agtcaggtca atgataggat gacaacagag 300
gaaagggaat ga 312

SEQ ID NO: 18, length 103, PRT, Homo sapiens
Met Pro Pro Pro Gly Ser Gly Arg Arg Arg Ile Arg Pro Ser Gln Ser
1               5                   10                  15
Ala Cys Thr Phe Phe Phe Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp
            20                  25                  30
Leu Gly Lys Phe Ser Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln
        35                  40                  45
Gly Leu Gly Gln Ser Phe Ala Leu Thr Trp Arg Asp Ile Met Leu Leu
    50                  55                  60
Leu Asn Gln Thr Leu Thr Pro Asn Glu Arg Ser Val Thr Ile Ala Ala
65                  70                  75                  80
Ala Gln Glu Phe Gly Asn Leu Trp Tyr Leu Ser Gln Val Asn Asp Arg
                85                  90                  95
Met Thr Thr Glu Glu Arg Glu
            100

SEQ ID NO: 19, length 207, DNA, Homo sapiens
atgggcagtt atgtaggacc cattccccac cacacttgcc agggccccaa gtttgtaatg 60
gctaagagag agacacagag agagagagag agatggagag agagacaagg agggagtcaa 120
agagaaaaag aaagaaaaag aaatagtaga aaaaaaagtg tgccctattc ctttaaaagc 180
cagggtaaat ttaaaacctg taattga 207

SEQ ID NO: 20, length 68, PRT, Homo sapiens
Met Gly Ser Tyr Val Gly Pro Ile Pro His His Thr Cys Gln Gly Pro
1               5                   10                  15
Lys Phe Val Met Ala Lys Arg Glu Thr Gln Arg Glu Arg Glu Arg Trp
            20                  25                  30
Arg Glu Arg Gln Gly Gly Ser Gln Arg Glu Lys Glu Arg Lys Arg Asn
        35                  40                  45
Ser Arg Lys Lys Ser Val Pro Tyr Ser Phe Lys Ser Gln Gly Lys Phe
    50                  55                  60
Lys Thr Cys Asn
65

SEQ ID NO: 21, length 306, DNA, Homo sapiens
atgtcaagaa agaaacgact tatattcttc tgcagtactg cctggccacg atatcctctt 60
caagggggag aaacctggcc tcctgaggga agtacaaatt ataacaccat cttacagcta 120
gacctctttt gtagaaaaga aggcaaatgg agtgaagtgc catatgtgca aactttcttt 180
tcattaagag acaactcaca attatgtaaa aagtgtggtt tatgtcttac aggaagccct 240
cagagtctac ctccctatcc cagcattccc ccgactcctt ccccaactaa taagcaccac 300
ccttga 306

SEQ ID NO: 22, length 101, PRT, Homo sapiens
Met Ser Arg Lys Lys Arg Leu Ile Phe Phe Cys Ser Thr Ala Trp Pro
1               5                   10                  15
Arg Tyr Pro Leu Gln Gly Gly Glu Thr Trp Pro Pro Glu Gly Ser Thr
            20                  25                  30
Asn Tyr Asn Thr Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly
        35                  40                  45
Lys Trp Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp
    50                  55                  60
Asn Ser Gln Leu Cys Lys Lys Cys Gly Leu Cys Leu Thr Gly Ser Pro
65                  70                  75                  80
Gln Ser Leu Pro Pro Tyr Pro Ser Ile Pro Pro Thr Pro Ser Pro Thr
                85                  90                  95
Asn Lys His His Pro
            100

SEQ ID NO: 23, length 603, DNA, Homo sapiens
atgctagaag gactaaggaa aactaggaag
aagcctacga attattcaat gatgtccact 60ataacacagg gaaaggaaga aaatcctact
gcctttctgg agcgactaag ggaggcattg 120aggaagcata cttccctgtc acctgactct
attgaaggcc aactaatctt aaaggataag 180tttatcactc agtcagctga agacattagg
aaaaaacttc aaaagtctgc cttaggccca 240gagcaaaact tagaaacccc attgaacttg
gcaacctcgg ttttttataa tagagatcag 300gaggagcagg cggaacagga caaacggggt
aaaaaaaagg ccaccgcttt agttatggcc 360ctcaggcaag tggactttgg aggctctgga
aaagggaaaa gctgggcaaa tcgaatgcct 420actagggctt gcttccagag tggtctacaa
ggacactttg aaaaagattg tccaagtaga 480aataagtcgc cccttcgtcc atgcccctta
tatcaaggga atcactggaa ggcccactat 540cccaggggac aaatgtcctc tgagtcagaa
gccactaacc agatgatcca gcagcaggac 600tga
60324200PRTHomo sapiens 24Met Leu Glu
Gly Leu Arg Lys Thr Arg Lys Lys Pro Thr Asn Tyr Ser1 5
10 15Met Met Ser Thr Ile Thr Gln Gly Lys
Glu Glu Asn Pro Thr Ala Phe 20 25
30Leu Glu Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser Leu Ser Pro
35 40 45Asp Ser Ile Glu Gly Gln Leu
Ile Leu Lys Asp Lys Phe Ile Thr Gln 50 55
60Ser Ala Glu Asp Ile Arg Lys Lys Leu Gln Lys Ser Ala Leu Gly Pro65
70 75 80Glu Gln Asn Leu
Glu Thr Pro Leu Asn Leu Ala Thr Ser Val Phe Tyr 85
90 95Asn Arg Asp Gln Glu Glu Gln Ala Glu Gln
Asp Lys Arg Gly Lys Lys 100 105
110Lys Ala Thr Ala Leu Val Met Ala Leu Arg Gln Val Asp Phe Gly Gly
115 120 125Ser Gly Lys Gly Lys Ser Trp
Ala Asn Arg Met Pro Thr Arg Ala Cys 130 135
140Phe Gln Ser Gly Leu Gln Gly His Phe Glu Lys Asp Cys Pro Ser
Arg145 150 155 160Asn Lys
Ser Pro Leu Arg Pro Cys Pro Leu Tyr Gln Gly Asn His Trp
165 170 175Lys Ala His Tyr Pro Arg Gly
Gln Met Ser Ser Glu Ser Glu Ala Thr 180 185
190Asn Gln Met Ile Gln Gln Gln Asp 195
200251617DNAHomo sapiens 25atggccctcc cttatcatat ttttctcttt actgttcttt
taccctcttt cactctcact 60gcaccccctc catgccgctg tatgaccagt agctcccctt
accaagagtt tctatggaga 120atgcagcgtc ccggaaatat tgatgcccca tcgtatagga
gtctttctaa gggaaccccc 180accttcactg cccacaccca tatgccccgc aactgctatc
actctgccac tctttgcatg 240catgcaaata ctcattattg gacaggaaaa atgattaatc
ctagttgtcc tggaggactt 300ggagtcactg tctgttggac ttacttcacc caaactggta
tgtctgatgg gggtggagtt 360caagatcagg caagagaaaa acatgtaaaa gaagtaatct
cccaactcac ccgggtacat 420ggcacctcta gcccctacaa aggactagat ctctcaaaac
tacatgaaac cctccgtacc 480catactcgcc tggtaagcct atttaatacc accctcactg
ggctccatga ggtctcggcc 540caaaacccta ctaactgttg gatatgcctc cccctgaact
tcaggccata tgtttcaatc 600cctgtacctg aacaatggaa caacttcagc acagaaataa
acaccacttc cgttttagta 660ggacctcttg tttccaatct ggaaataacc catacctcaa
acctcacctg tgtaaaattt 720agcaatacta catacacaac caactcccaa tgcatcaggt
gggtaactcc tcccacacaa 780atagtctgcc taccctcagg aatatttttt gtctgtggta
cctcagccta tcgttgtttg 840aatggctctt cagaatctat gtgcttcctc tcattcttag
tgccccctat gaccatctac 900actgaacaag atttatacag ttatgtcata tctaagcccc
gcaacaaaag agtacccatt 960cttccttttg ttataggagc aggagtgcta ggtgcactag
gtactggcat tggcggtatc 1020acaacctcta ctcagttcta ctacaaacta tctcaagaac
taaatgggga catggaacgg 1080gtcgccgact ccctggtcac cttgcaagat caacttaact
ccctagcagc agtagtcctt 1140caaaatcgaa gagctttaga cttgctaacc gctgaaagag
ggggaacctg tttattttta 1200ggggaagaat gctgttatta tgttaatcaa tccggaatcg
tcactgagaa agttaaagaa 1260attcgagatc gaatacaacg tagagcagag gagcttcgaa
acactggacc ctggggcctc 1320ctcagccaat ggatgccctg gattctcccc ttcttaggac
ctctagcagc tataatattg 1380ctactcctct ttggaccctg tatctttaac ctccttgtta
actttgtctc ttccagaatc 1440gaagctgtaa aactacaaat ggagcccaag atgcagtcca
agactaagat ctaccgcaga 1500cccctggacc ggcctgctag cccacgatct gatgttaatg
acatcaaagg cacccctcct 1560gaggaaatct cagctgcaca acctctacta cgccccaatt
cagcaggaag cagttag 161726538PRTHomo sapiens 26Met Ala Leu Pro Tyr
His Ile Phe Leu Phe Thr Val Leu Leu Pro Ser1 5
10 15Phe Thr Leu Thr Ala Pro Pro Pro Cys Arg Cys
Met Thr Ser Ser Ser 20 25
30Pro Tyr Gln Glu Phe Leu Trp Arg Met Gln Arg Pro Gly Asn Ile Asp
35 40 45Ala Pro Ser Tyr Arg Ser Leu Ser
Lys Gly Thr Pro Thr Phe Thr Ala 50 55
60His Thr His Met Pro Arg Asn Cys Tyr His Ser Ala Thr Leu Cys Met65
70 75 80His Ala Asn Thr His
Tyr Trp Thr Gly Lys Met Ile Asn Pro Ser Cys 85
90 95Pro Gly Gly Leu Gly Val Thr Val Cys Trp Thr
Tyr Phe Thr Gln Thr 100 105
110Gly Met Ser Asp Gly Gly Gly Val Gln Asp Gln Ala Arg Glu Lys His
115 120 125Val Lys Glu Val Ile Ser Gln
Leu Thr Arg Val His Gly Thr Ser Ser 130 135
140Pro Tyr Lys Gly Leu Asp Leu Ser Lys Leu His Glu Thr Leu Arg
Thr145 150 155 160His Thr
Arg Leu Val Ser Leu Phe Asn Thr Thr Leu Thr Gly Leu His
165 170 175Glu Val Ser Ala Gln Asn Pro
Thr Asn Cys Trp Ile Cys Leu Pro Leu 180 185
190Asn Phe Arg Pro Tyr Val Ser Ile Pro Val Pro Glu Gln Trp
Asn Asn 195 200 205Phe Ser Thr Glu
Ile Asn Thr Thr Ser Val Leu Val Gly Pro Leu Val 210
215 220Ser Asn Leu Glu Ile Thr His Thr Ser Asn Leu Thr
Cys Val Lys Phe225 230 235
240Ser Asn Thr Thr Tyr Thr Thr Asn Ser Gln Cys Ile Arg Trp Val Thr
245 250 255Pro Pro Thr Gln Ile
Val Cys Leu Pro Ser Gly Ile Phe Phe Val Cys 260
265 270Gly Thr Ser Ala Tyr Arg Cys Leu Asn Gly Ser Ser
Glu Ser Met Cys 275 280 285Phe Leu
Ser Phe Leu Val Pro Pro Met Thr Ile Tyr Thr Glu Gln Asp 290
295 300Leu Tyr Ser Tyr Val Ile Ser Lys Pro Arg Asn
Lys Arg Val Pro Ile305 310 315
320Leu Pro Phe Val Ile Gly Ala Gly Val Leu Gly Ala Leu Gly Thr Gly
325 330 335Ile Gly Gly Ile
Thr Thr Ser Thr Gln Phe Tyr Tyr Lys Leu Ser Gln 340
345 350Glu Leu Asn Gly Asp Met Glu Arg Val Ala Asp
Ser Leu Val Thr Leu 355 360 365Gln
Asp Gln Leu Asn Ser Leu Ala Ala Val Val Leu Gln Asn Arg Arg 370
375 380Ala Leu Asp Leu Leu Thr Ala Glu Arg Gly
Gly Thr Cys Leu Phe Leu385 390 395
400Gly Glu Glu Cys Cys Tyr Tyr Val Asn Gln Ser Gly Ile Val Thr
Glu 405 410 415Lys Val Lys
Glu Ile Arg Asp Arg Ile Gln Arg Arg Ala Glu Glu Leu 420
425 430Arg Asn Thr Gly Pro Trp Gly Leu Leu Ser
Gln Trp Met Pro Trp Ile 435 440
445Leu Pro Phe Leu Gly Pro Leu Ala Ala Ile Ile Leu Leu Leu Leu Phe 450
455 460Gly Pro Cys Ile Phe Asn Leu Leu
Val Asn Phe Val Ser Ser Arg Ile465 470
475 480Glu Ala Val Lys Leu Gln Met Glu Pro Lys Met Gln
Ser Lys Thr Lys 485 490
495Ile Tyr Arg Arg Pro Leu Asp Arg Pro Ala Ser Pro Arg Ser Asp Val
500 505 510Asn Asp Ile Lys Gly Thr
Pro Pro Glu Glu Ile Ser Ala Ala Gln Pro 515 520
525Leu Leu Arg Pro Asn Ser Ala Gly Ser Ser 530
535
SEQ ID NO: 27 (DNA, 18 nt; Artificial sequence; Synthetic Construct - amplification primer): tgcagatgct gtgtctgg
SEQ ID NO: 28 (DNA, 17 nt; Artificial sequence; Synthetic Construct - amplification primer): cgtactggcc caggacc
SEQ ID NO: 29 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): ggttcgtgct aattgagctg
SEQ ID NO: 30 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): atggtggcaa gcttcttgtt
SEQ ID NO: 31 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): tgagctttcc ctcactgtcc
SEQ ID NO: 32 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): tgttcggctt gattaggatg
SEQ ID NO: 33 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): catggcccaa tattccattc
SEQ ID NO: 34 (DNA, 21 nt; Artificial sequence; Synthetic Construct - amplification primer): ggtccttgtt cacagaactc c
SEQ ID NO: 35 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): ccgctcctga ttggactaaa
SEQ ID NO: 36 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): cgtgggtcaa ggaagagaac
SEQ ID NO: 37 (DNA, 21 nt; Artificial sequence; Synthetic Construct - amplification primer): atgacccgca gcttctaaca g
SEQ ID NO: 38 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): ctccgctcac agagctccta
SEQ ID NO: 39 (DNA, 21 nt; Artificial sequence; Synthetic Construct - amplification primer): ccaacatcac taacacaacc t
SEQ ID NO: 40 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): gggagttagt aaggggtttg
SEQ ID NO: 41 (DNA, 24 nt; Artificial sequence; Synthetic Construct - amplification primer): caacctatta aacaaaacta aatt
SEQ ID NO: 42 (DNA, 27 nt; Artificial sequence; Synthetic Construct - amplification primer): agatttaata gagtgaaaat agagttt
SEQ ID NO: 43 (DNA, 22 nt; Artificial sequence; Synthetic Construct - amplification primer): ttattagttt aggggatagt tg
SEQ ID NO: 44 (DNA, 23 nt; Artificial sequence; Synthetic Construct - amplification primer): acacaataaa caacctacta aat
SEQ ID NO: 45 (DNA, 19 nt; Artificial sequence; Synthetic Construct - amplification primer): gagggtaagt ggtgataaa
SEQ ID NO: 46 (DNA, 21 nt; Artificial sequence; Synthetic Construct - amplification primer): aacctactaa atccaaaaaa a
SEQ ID NO: 47 (DNA, 22 nt; Artificial sequence; Synthetic Construct - amplification primer): taggatttta ggtttattgt ta
SEQ ID NO: 48 (DNA, 19 nt; Artificial sequence; Synthetic Construct - amplification primer): aaaaataaaa tattaaacc
SEQ ID NO: 49 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): atatgtggga gtgagagata
SEQ ID NO: 50 (DNA, 22 nt; Artificial sequence; Synthetic Construct - amplification primer): caacaacaaa caataataat aa
SEQ ID NO: 51 (DNA, 22 nt; Artificial sequence; Synthetic Construct - amplification primer): ttgagttttt ttattgatag tg
SEQ ID NO: 52 (DNA, 21 nt; Artificial sequence; Synthetic Construct - amplification primer): tctaaatcct attttcctac t
SEQ ID NO: 53 (DNA, 24 nt; Artificial sequence; Synthetic Construct - amplification primer): gtttttttat tgatagtgag agat
SEQ ID NO: 54 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): taacaaacct ttaatccaat
SEQ ID NO: 55 (DNA, 22 nt; Artificial sequence; Synthetic Construct - amplification primer): tttagtgagg atgatgtaat at
SEQ ID NO: 56 (DNA, 22 nt; Artificial sequence; Synthetic Construct - amplification primer): caacttaata aaaataaacc ca
SEQ ID NO: 57 (DNA, 24 nt; Artificial sequence; Synthetic Construct - amplification primer): ataatgtttt agtaagtgtt ggat
SEQ ID NO: 58 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): acaattacaa acctttaacc
SEQ ID NO: 59 (DNA, 20 nt; Artificial sequence; Synthetic Construct - amplification primer): aattcattca acatccattc
SEQ ID NO: 60 (DNA, 26 nt; Artificial sequence; Synthetic Construct - amplification primer): ggtttaatat tatttattat tttgga
SEQ ID NO: 61 (DNA, 25 nt; Artificial sequence; Synthetic Construct - amplification primer): ctcttacctt cctatactct ctaaa
SEQ ID NO: 62 (DNA, 28 nt; Artificial sequence; Synthetic Construct - amplification primer): agagtgtagt tgtaagattt aatagagt