Patent application title: DEP2 AND ITS USES IN MAJOR DEPRESSIVE DISORDER AND OTHER RELATED DISORDERS
Inventors:
David A. Katz (Chicago, IL, US)
Jeremy C. Packer (Wadsworth, IL, US)
Anahita Bhathena (Evanston, IL, US)
Christopher Neff (Salt Lake City, UT, US)
Victor Abkevich (Salt Lake City, UT, US)
Donna Shattuck (Salt Lake City, UT, US)
Srikanth Jammulapati (Salt Lake City, UT, US)
IPC8 Class: AC12Q168FI
Publication date: 2015-06-11
Patent application number: 20150159214
Abstract:
The present invention relates to DEP2, as well as other proteins, and
their uses in connection with the treatment of major depression or
related disorders.
Claims:
1-45. (canceled)
46. A method of screening a composition for the ability to modulate the activity of a protein translated from SEQ ID NO:1, the method comprising the steps of: a) providing a composition; b) simultaneously exposing the protein to the composition and to a substrate, wherein the protein is exposed to the substrate for sufficient time and conditions to allow the substrate to react with the protein in order to produce a reaction product or complex; and c) measuring presence or absence of said reaction product or complex, wherein a lack of said reaction product or complex indicates a composition having the ability to modulate the activity of said protein.
47. The method of claim 45, wherein the protein modifies the phosphorylation of the substrate.
48. The method of claim 47, wherein the protein translated from SEQ ID NO:1 has an amino acid sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25 and SEQ ID NO:27.
49. The method of claim 48, wherein the substrate is selected from the group consisting of: phosphohistidine, phospholysine, phosphodiimide, pyrophosphate and a peptide or protein phosphorylated on histidine or lysine.
50. A method of screening a composition for the ability to modulate activity of a protein translated from SEQ ID NO:1, in a cell, the method comprising the steps of: a) exposing said cell to said composition; and b) measuring the amount of activity of said protein in said cell, wherein a decreased or increased amount of activity of said protein, as compared to a cell which has not been exposed to said composition, indicates a composition having the ability to modulate the activity of said protein.
51. The method of claim 45, wherein a composition having an ability to modulate the activity of a protein can be used to treat major depression or a related disorder in a subject.
52. A method of screening a composition for the ability to modulate expression of a protein translated from SEQ ID NO:1, in a cell, the method comprising the steps of: a) exposing said cell to said composition; and b) measuring the amount of said protein in said cell, wherein a decreased or increased amount of said protein, as compared to a cell which has not been exposed to said composition, indicates a composition having the ability to modulate the expression of said protein.
53. A method of screening a composition for the ability to modulate the level of expression of a protein translated from SEQ ID NO:1, the method comprising the steps of: a) exposing an in vitro transcription and translation system comprising a regulatory sequence from SEQ ID NO:1 functionally connected to the open reading frame for a detectable protein, to a composition for a time and under conditions sufficient to test whether said composition modulates the level of expression of the detectable protein; and b) detecting the level of expression of the detectable protein, wherein a reduction or an increase in the level of expression of the detectable protein indicates that said composition has the ability to modulate the level of expression of a protein translated from SEQ ID NO:1.
54. The method of claim 52, wherein the composition having an ability to modulate the expression of the protein or the level of expression of the protein can be used to treat major depression or a related disorder in a subject.
55. The method of claim 50, wherein the protein translated from SEQ ID NO:1 has an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
56. A method of treating major depression or a related disorder in a subject in need of said treatment comprising the step of administering a composition identified as modulating the activity of a protein translated from SEQ ID NO:1, to said subject, in an amount sufficient to effect said treatment.
57. The method of claim 56, wherein said composition inhibits or reduces the activity of the protein.
58. The method of claim 56, wherein said composition increases the activity of the protein.
59. A method of treating major depression or a related disorder in a subject in need of said treatment comprising reducing the amount of a protein translated from SEQ ID NO:1 in said subject, to a level sufficient to effect said treatment.
60. The method of claim 59, wherein said reduction results from complete binding or partial binding of a composition to said protein.
61. A method of treating major depression or a related disorder in a subject in need of said treatment comprising increasing the amount of a protein translated from SEQ ID NO:1, to a level sufficient to effect said treatment.
62. The method of claim 56, wherein said method involves administering to said subject a therapeutically effective amount of a protein translated from SEQ ID NO:1.
63. The method of claim 56, wherein the protein translated from SEQ ID NO:1 has an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
64-74. (canceled)
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This is a divisional of U.S. patent application Ser. No. 11/509,296, filed on Aug. 24, 2006, which is a continuation-in-part of U.S. patent application Ser. No. 11/412,184, filed on Apr. 26, 2006, the entire contents of all of which are fully incorporated herein by reference.
SEQUENCE LISTING
[0002] The instant application contains a Sequence Listing which has been submitted in ASCII format via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Oct. 7, 2014, is named 2014--10--08--8108USD1-SEQ-LIST.txt and is 322,598 bytes in size.
BACKGROUND INFORMATION
[0003] Mood disorders, of which major depressive disorder is the most common, affect one person in five during their lifetime. The World Health Organization estimates that depression is currently the fourth most important worldwide cause of disability-adjusted life year loss, and that it will become the second most important cause by 2020 (See, Murray C J L and Lopez A D, The Global Burden of Disease: A Comprehensive Assessment of Mortality and Disability From Disease, Injuries, and Risk Factors in 1990 and Projected to 2020, volume 1. World Health Organization. Cambridge Mass.: Harvard University Press. 1996.). Pharmaceutical treatment of depression is frequently inadequate. In randomized clinical trials of the current best treatments, one-third of patients or more do not achieve remission, even after several months of treatment (See, Journal of the American Medical Association, 286:2947-55 (2001); Biological Psychiatry, 48:894-901 (2000)). Even when today's drugs do help patients achieve remission from their depression, the onset of action is over a period of weeks and there appears to be an increased risk of suicide during initial antidepressant therapy, although this risk may be less than that just prior to therapy initiation (See, Neuropsychopharmacology, 31:473-492 (2006)). Further, there are high recurrence rates: approximately 85% of patients who achieve remission will suffer another episode of major depression (See, American Journal of Psychiatry, 156:1000-6 (1999)). Finally, currently available antidepressants are associated with side effects that lead some patients to stop taking their medications at risk of sinking back (further) into depression, and to morbidity in others (See, New England Journal of Medicine, 353: 1819-34 (2005)).
[0004] The currently available antidepressants work primarily by increasing the activity of certain neurotransmitters, serotonin and norepinephrine, in synapses. Some medications (such as monoamine oxidase inhibitors) inhibit the degradation of these molecules, others (such as selective serotonin reuptake inhibitors and dual serotonin/norepinephrine reuptake inhibitors) decrease removal of neurotransmitters from the synaptic space, and some medications (such as receptor antagonists) stimulate norepinephrine release or inhibit negative feedback of serotonin signaling. Because these medications are all based on a single principle, the strength and range of their efficacy is similar. The improvements of the last half century have involved the development of safer and more tolerable drugs. However, despite this, today's drugs are neither completely safe nor completely tolerable for many patients.
[0005] Thus, there is considerable need for new drugs that are effective in a broader range of patients (particularly for patients whose depression is resistant to available pharmaceuticals), that have a faster onset of action, that are safer and more tolerable, or that complement the efficacy of existing drugs. It is possible, but unlikely, that further improvement in any of these dimensions will be achieved through development of additional serotonergic or noradrenergic agents. Therefore, alternative pharmacological approaches must be developed and pursued.
[0006] Part of the challenge in developing new drugs lies in the complexity of demonstrating efficacy of a major depression treatment. For example, the development of novel antidepressants is constrained by the limited understanding of depression's etiology. Because of this, there are relatively few pharmacological targets that can be considered for antidepressant development. Thus, there is a need for the identification of drug targets for depression. Genetic linkage can open new windows for the development of novel depression drug targets. Specifically, if a genetic variant is identified as being linked to depression in families, the gene in which that variant occurs is likely to be involved in the etiology of disease. Such a gene can be a target for the development of novel antidepressants. Additionally, such a gene can lead to the identification of previously unknown physiological pathways that may be modulated for effective therapy of depression.
[0007] Several genes have been identified or proposed as factors for depression or related phenotypes. Among these, most have been associated with disease in population studies of candidate genes selected on the basis of existing hypotheses about the etiology of depression. Many of these genes relate to serotonin or norepinephrine. Examples include: (1) associations of a HTR1A (serotonin receptor 1A) promoter variant with depression, suicide, bipolar disorder, panic disorder with agoraphobia, neuroticism and anti-depressant response; (2) associations of the HTT (serotonin transporter) promoter short allele with depression, suicide, depressive behavior response to tryptophan depletion, bipolar disorder antidepressant-induced mania and lesser anti-depressant response; and (3) association of a variant in HTR2C (serotonin receptor 2C) with both recurrent major depression and bipolar disorder and with major depression.
[0008] Thus, as evidenced by the above, there is a need in the art to identify proteins and genes that are associated with the pathophysiology of depression and that relate to pathways other than serotonin or norepinephrine. Such proteins and genes would be useful in the diagnosis of depression or a related disorder, and in the development of new drugs that could be used to treat patients suffering from depression or a related disorder.
SUMMARY OF THE INVENTION
[0009] In one embodiment, the present invention relates to an isolated nucleic acid molecule or fragment thereof comprising a nucleotide sequence having at least 90% identity to: (i) SEQ ID NO:2, (ii) nucleotides 352 to 771 of SEQ ID NO:2; or (iii) nucleotides 812 to 1162 of SEQ ID NO:2 or (iv) a complement comprising a nucleotide sequence having at least 90% identity to: (i) SEQ ID NO:2, (ii) nucleotides 352 to 771 of SEQ ID NO:2; or (iii) nucleotides 812 to 1162 of SEQ ID NO:2. The present invention also encompasses a purified or isolated protein encoded by the above nucleic acid molecule or fragment thereof.
[0010] In another embodiment, the present invention relates to a purified polypeptide or fragment thereof comprising an amino acid sequence having at least 90% identity to: SEQ ID NO:3 or SEQ ID NO:4.
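The "at least 90% identity" threshold recited in the embodiments above can be illustrated with a short Python sketch. This section does not define how identity is computed, so the convention used here (matching positions divided by compared positions in two already-aligned sequences, with "-" marking a gap) is an assumption, as are the function names and the toy sequences, which are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumed convention: a column counts as compared unless both
    sequences have a gap ("-") there; a column counts as a match only
    when both sequences carry the same residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # gap in both sequences: column is not informative
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the identity is at least the threshold (90% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example: 9 of 10 aligned positions match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

In practice the alignment itself would come from a dedicated alignment tool, and the denominator convention (alignment length, shorter sequence, or reference length) changes the reported value, so the convention should be stated alongside any threshold.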
[0011] In yet another embodiment, the present invention relates to a vector comprising:
[0012] a) an isolated nucleic acid sequence comprising a nucleotide sequence having at least 90% identity to: (i) SEQ ID NO:2, (ii) nucleotides 352 to 771 of SEQ ID NO:2; or (iii) nucleotides 812 to 1162 of SEQ ID NO:2 or a complement comprising a nucleotide sequence having at least 90% identity to: (i) SEQ ID NO:2, (ii) nucleotides 352 to 771 of SEQ ID NO:2; or (iii) nucleotides 812 to 1162 of SEQ ID NO:2; operably linked to
[0013] b) a regulatory sequence.
[0014] Additionally, in yet another embodiment, the present invention relates to a host cell comprising the above-described vector.
[0015] In still yet another embodiment, the present invention relates to a non-human transgenic animal. In one aspect, said non-human transgenic animal comprises:
[0016] a) an exogenous and stably transmitted nucleic acid comprising a nucleotide sequence of SEQ ID NO:2 (or any one or more of the sequences described above); or
[0017] b) a knock-out of a nucleic acid comprising a nucleotide sequence of SEQ ID NO:2 (or any one or more of the sequences described above).
[0018] In another aspect, said non-human transgenic animal comprises:
[0019] a) an exogenous and stably transmitted nucleic acid having a nucleotide sequence selected from the group consisting of: SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 and SEQ ID NO:33, with the proviso that said animal does not comprise an exogenous and stably transmitted nucleic acid having a nucleotide sequence of SEQ ID NO:2; or
[0020] b) a knock-out of a nucleic acid comprising a nucleotide sequence selected from the group consisting of: SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 and SEQ ID NO:33, with the proviso that a nucleic acid having a nucleotide sequence of SEQ ID NO:2 is not knocked out.
[0021] In yet another embodiment, the present invention relates to a method of modifying or altering the expression of SEQ ID NO:2 in a cell or animal. The method involves the steps of:
[0022] a) exposing said cell or administering to said subject a nucleic acid molecule, wherein said nucleic acid molecule modifies or alters the expression of SEQ ID NO: 2; and
[0023] b) modifying or altering the expression of SEQ ID NO:2.
[0024] In the above-described method, the nucleic acid molecule can be an antisense molecule, a small interfering RNA, a co-suppression RNA, an aptamer, a ribozyme or a triplexing agent.
[0025] In yet another embodiment, the present invention relates to a method of modifying or altering the expression of a nucleic acid sequence having a nucleotide sequence selected from the group consisting of: SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 and SEQ ID NO:33 in a cell or animal, with the proviso that the expression of a nucleic acid having the sequence of SEQ ID NO: 2 is not modified or altered. The method involves the steps of:
[0026] a) exposing said cell or administering to said subject a nucleic acid molecule, wherein said nucleic acid molecule modifies or alters the expression of a nucleotide sequence selected from the group consisting of: SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 and SEQ ID NO:33; and
[0027] b) modifying or altering the expression of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33.
[0028] In the above-described method, the nucleic acid molecule can be an antisense molecule, a small interfering RNA, a co-suppression RNA, an aptamer, a ribozyme or a triplexing agent.
[0029] In yet another embodiment, the present invention relates to a method of determining a genotype of a subject at a polymorphic site in nucleotides 1 to 316 of SEQ ID NO: 2 in a test sample. The method involves the steps of:
[0030] a) obtaining a test sample comprising DNA of a subject;
[0031] b) analyzing the test sample for at least one polymorphic site in nucleotides 1 to 316 of SEQ ID NO:2;
[0032] c) identifying the allele(s) present at said polymorphic site in said test sample; and
[0033] d) determining the genotype of a subject based on the identification of the allele(s) at said polymorphic site in said test sample.
[0034] The above-described method can further involve the step of analyzing the test sample(s) for a C-G polymorphism at position -1019 in a human serotonin receptor 1A (HTR1A) gene.
[0035] The analyzing performed in the above-described method can be accomplished using direct sequencing, polymerase chain reaction, ligase chain reaction, a fragment length polymorphism assay, a single strand conformation polymorphism analysis, a heteroduplex assay, hybridization, Taqman®, Molecular Beacon, Pyrosequencing, a microarray, Southern blotting, an Invader assay, a single base extension assay, or mass spectrometry.
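Steps c) and d) of the genotyping method above, identifying the allele(s) at a polymorphic site and deriving the genotype from them, can be sketched in Python as follows. This is a minimal illustration that assumes a diploid subject and allele calls already produced by one of the listed assay techniques; the function name and the example calls are hypothetical and do not come from the source.

```python
def call_genotype(observed_alleles):
    """Derive a genotype string from allele calls at one polymorphic site.

    Assumption: the subject is diploid, so at most two distinct alleles
    are expected. One distinct allele gives a homozygous genotype
    (e.g. "C/C"); two give a heterozygous genotype (e.g. "C/G").
    """
    distinct = sorted({allele.upper() for allele in observed_alleles})
    if not distinct:
        raise ValueError("no allele calls supplied")
    if len(distinct) > 2:
        raise ValueError("more than two alleles observed at a diploid site")
    if len(distinct) == 1:
        return f"{distinct[0]}/{distinct[0]}"
    return "/".join(distinct)


# Hypothetical allele calls from repeated assays of a single site.
print(call_genotype(["C", "G", "C"]))
print(call_genotype(["G", "G"]))
```

Raising an error on a third allele, rather than picking the two most frequent calls, is a deliberate choice in this sketch: it surfaces sample contamination or assay error instead of hiding it.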
[0036] In still yet another embodiment, the present invention relates to a method of determining a genotype of a subject at nucleotides 77402 or 79906 of SEQ ID NO: 1 in a test sample. The method involves the steps of:
[0037] a) obtaining a test sample comprising DNA of a subject;
[0038] b) analyzing the test sample for at least one polymorphic site selected from the group consisting of nucleotides 77402 and 79906 of SEQ ID NO: 1;
[0039] c) determining the allele(s) present at said polymorphic site in said test sample; and
[0040] d) determining the genotype of a subject based on the identification of the allele(s) at said polymorphic site in said test sample.
[0041] The above-described method can further involve the step of analyzing the test sample(s) for a C-G polymorphism at position -1019 in a human serotonin receptor 1A (HTR1A) gene.
[0042] The analyzing performed in the above-described method can be accomplished using direct sequencing, polymerase chain reaction, ligase chain reaction, a fragment length polymorphism assay, a single strand conformation polymorphism analysis, a heteroduplex assay, hybridization, Taqman®, Molecular Beacon, Pyrosequencing, a microarray, Southern blotting, an Invader assay, a single base extension assay, or mass spectrometry.
[0043] In still yet another embodiment, the present invention relates to a method of identifying a subject having major depression or a related disorder, or at risk of developing major depression or a related disorder. The method involves the steps of:
[0044] a) obtaining a test sample comprising DNA of a subject;
[0045] b) analyzing the test sample for at least one polymorphic site in SEQ ID NO: 1;
[0046] c) identifying at least one allele at said polymorphic site; and
[0047] d) identifying whether said subject has major depression or a related disorder or is at risk of developing major depression or a related disorder based on the allele(s) identified at said polymorphic site(s) in said test sample.
[0048] The above-described method can further involve the step of analyzing the test sample(s) for a C-G polymorphism at position -1019 in a human serotonin receptor 1A (HTR1A) gene.
[0049] The analyzing performed in the above-described method can be accomplished using direct sequencing, polymerase chain reaction, ligase chain reaction, a fragment length polymorphism assay, a single strand conformation polymorphism analysis, a heteroduplex assay, hybridization, Taqman®, Molecular Beacon, Pyrosequencing, a microarray, Southern blotting, an Invader assay, a single base extension assay, or mass spectrometry.
[0050] In still yet another embodiment, the present invention relates to a method of providing a prognosis for or predicting a response to treatment for a subject having major depression or a related disorder. The method involves the steps of:
[0051] a) obtaining a test sample comprising DNA of a subject;
[0052] b) analyzing the test sample for at least one polymorphic site in SEQ ID NO: 1;
[0053] c) identifying at least one allele at said polymorphic site; and
[0054] d) providing a prognosis for and predicting the response to treatment for a subject having major depression or a related disorder based on the allele(s) identified at said polymorphic site(s) in said test sample.
[0055] The above-described method can further involve the step of analyzing the test sample(s) for a C-G polymorphism at position -1019 in a human serotonin receptor 1A (HTR1A) gene.
[0056] The analyzing performed in the above-described method can be accomplished using direct sequencing, polymerase chain reaction, ligase chain reaction, a fragment length polymorphism assay, a single strand conformation polymorphism analysis, a heteroduplex assay, hybridization, Taqman®, Molecular Beacon, Pyrosequencing, a microarray, Southern blotting, an Invader assay, a single base extension assay, or mass spectrometry.
[0057] In still yet another embodiment, the present invention relates to a method of detecting or quantifying an mRNA which comprises nucleotides 1 to 316 of SEQ ID NO:2 in a test sample. The method involves the steps of:
[0058] a) obtaining a test sample comprising mRNA of a subject;
[0059] b) analyzing the test sample for a mRNA comprising at least 15 contiguous nucleotides of nucleotides 1 to 316 of SEQ ID NO:2; and
[0060] c) detecting or quantifying said mRNA in said test sample.
[0061] The above-described method can further involve the step of analyzing the test sample(s) for an mRNA transcribed from a human serotonin receptor 1A (HTR1A) gene.
[0062] The analyzing performed in the above-described method can be accomplished using reverse transcription, quantitative polymerase chain reaction, cDNA microarrays, or Northern blotting.
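The criterion in the method above, an mRNA comprising at least 15 contiguous nucleotides of nucleotides 1 to 316 of SEQ ID NO:2, can be expressed as a simple sequence check. The Python sketch below is illustrative only: the function name is hypothetical, the sequences are made-up placeholders rather than SEQ ID NO:2, and sequencing errors and reverse complements are ignored. With a real reference, the 1-based region "nucleotides 1 to 316" would correspond to the Python slice reference[0:316].

```python
def shares_contiguous_run(query: str, region: str, min_len: int = 15) -> bool:
    """True if query contains at least min_len contiguous nucleotides
    that also occur contiguously in region.

    RNA input is normalised to the DNA alphabet (U -> T) so that an
    mRNA read can be compared against a cDNA-style reference.
    """
    query = query.upper().replace("U", "T")
    region = region.upper().replace("U", "T")
    if len(query) < min_len or len(region) < min_len:
        return False
    # Index every window of length min_len in the reference region.
    windows = {region[i:i + min_len]
               for i in range(len(region) - min_len + 1)}
    return any(query[i:i + min_len] in windows
               for i in range(len(query) - min_len + 1))


# Placeholder reference region (NOT the real SEQ ID NO:2).
region = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
read = "UU" + region[5:22].replace("T", "U") + "CC"  # 17 shared nucleotides
print(shares_contiguous_run(read, region))
```

Any shared run longer than 15 nucleotides necessarily contains a shared run of exactly 15, which is why indexing fixed-length windows is sufficient.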
[0063] In still yet another embodiment, the present invention relates to a method of identifying a subject having major depression or a related disorder, or at risk of developing major depression or a related disorder. The method involves the steps of:
[0064] a) obtaining a test sample comprising mRNA of a subject;
[0065] b) analyzing the test sample for at least one mRNA transcribed from SEQ ID NO: 1; and
[0066] c) identifying whether said subject has major depression or a related disorder or is at risk of developing major depression or a related disorder based on the presence, absence or amount of at least one of the mRNAs recited in step b) in said test sample.
[0067] The above-described method can further involve the step of analyzing the test sample(s) for an mRNA transcribed from a human serotonin receptor 1A (HTR1A) gene.
[0068] The analyzing performed in the above-described method can be accomplished using reverse transcription, quantitative polymerase chain reaction, cDNA microarrays, or Northern blotting.
[0069] In the above-described method, the mRNA transcribed from SEQ ID NO: 1 can have the nucleotide sequence of: SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33.
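Where the "amount" of an mRNA in the method above is measured by quantitative polymerase chain reaction, it is commonly reported relative to a reference transcript and a control sample. The Python sketch below illustrates one widely used convention, the 2^(-ΔΔCt) calculation. The source does not specify this or any other quantification scheme, so the choice of method, the function name, the use of a reference transcript and the Ct values shown are all assumptions made for illustration; the calculation also assumes roughly 100% amplification efficiency for both transcripts.

```python
def relative_expression(ct_target_sample: float, ct_reference_sample: float,
                        ct_target_control: float, ct_reference_control: float) -> float:
    """Fold change of a target mRNA in a test sample versus a control
    sample, normalised to a reference transcript (2^(-ddCt) convention).

    Ct is the PCR cycle at which signal crosses the detection threshold;
    a lower Ct means more starting template.
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the test sample than in the control, with equal reference Ct values.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)
```

A fold change above 1 indicates more target mRNA in the test sample than in the control and a value below 1 indicates less; what threshold, if any, is meaningful for a given subject is not something this sketch addresses.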
[0070] In still yet another embodiment, the present invention relates to a method of providing a prognosis for or predicting a response to treatment for a subject having major depression or a related disorder. The method involves the steps of:
[0071] a) obtaining a test sample comprising mRNA of a subject;
[0072] b) analyzing the test sample for at least one mRNA transcribed from SEQ ID NO: 1; and
[0073] c) providing a prognosis for and predicting the response to treatment for a subject having major depression or a related disorder based on the presence, absence or amount of at least one of the mRNAs recited in step b) in said test sample.
[0074] The above-described method can further involve the step of analyzing the test sample(s) for an mRNA transcribed from a human serotonin receptor 1A (HTR1A) gene.
[0075] The analyzing performed in the above-described method can be accomplished using reverse transcription, quantitative polymerase chain reaction, cDNA microarrays, or Northern blotting.
[0076] In the above-described method, the mRNA transcribed from SEQ ID NO: 1 can have the nucleotide sequence of: SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33.
[0077] In still yet a further embodiment, the present invention relates to a method of detecting or quantifying the amount of a protein having an amino acid sequence selected from the group consisting of: SEQ ID NO:3 and SEQ ID NO:4 in a test sample. The method involves the steps of:
[0078] a) obtaining a test sample comprising at least one polypeptide of a subject; and
[0079] b) detecting or quantifying the amount of a protein having an amino acid sequence selected from the group consisting of SEQ ID NO: 3 and SEQ ID NO:4 in said test sample.
[0080] The above-described method can further involve the step of detecting or quantifying the amount of a polypeptide encoded by a human serotonin receptor 1A (HTR1A) gene.
[0081] The analyzing performed in the above-described method can be accomplished using ELISA, RIA, Western blotting, fluorescence activated cell sorting or immunohistochemical analysis.
[0082] In still yet a further embodiment, the present invention relates to a method of identifying a subject having major depression or a related disorder, or at risk of developing major depression or a related disorder. The method involves the steps of:
[0083] a) obtaining a test sample comprising at least one polypeptide of a subject;
[0084] b) analyzing the test sample for at least one polypeptide translated from SEQ ID NO:1; and
[0085] c) identifying whether said subject has major depression or a related disorder or is at risk of developing major depression or a related disorder based on the presence, absence or amount of at least one of the polypeptides recited in step b) in said test sample.
[0086] The above-described method can further involve the step of analyzing the test sample(s) for a polypeptide encoded by a human serotonin receptor 1A (HTR1A) gene.
[0087] The analyzing performed in the above-described method can be accomplished using ELISA, RIA, Western blotting, fluorescence activated cell sorting or immunohistochemical analysis.
[0088] In the above-described method, the polypeptide translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0089] In still yet a further embodiment, the present invention relates to a method of providing a prognosis for or predicting a response to treatment for a subject having major depression or a related disorder. The method involves the steps of:
[0090] a) obtaining a test sample comprising at least one polypeptide of a subject;
[0091] b) analyzing the test sample for at least one polypeptide translated from SEQ ID NO:1; and
[0092] c) providing a prognosis for and predicting the response to treatment for a subject having major depression or a related disorder based on the presence, absence or amount of at least one of the polypeptides recited in step b) in said test sample.
[0093] The above-described method can further involve the step of analyzing the test sample(s) for a polypeptide encoded by a human serotonin receptor 1A (HTR1A) gene.
[0094] The analyzing performed in the above-described method can be accomplished using ELISA, RIA, Western blotting, fluorescence activated cell sorting or immunohistochemical analysis.
[0095] In the above-described method, the polypeptide translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0096] In still yet another embodiment, the present invention relates to a kit. In one aspect, the kit can comprise:
[0097] a) at least one reagent for determining a genotype of a subject at a polymorphic site in SEQ ID NO: 1 in a test sample; and
[0098] b) instructions for determining the genotype of the subject.
[0099] In another aspect, the kit can comprise:
[0100] a) at least one reagent for determining a genotype of a subject for a C-G polymorphism at position -1019 in a human serotonin receptor 1A (HTR1A) gene;
[0101] b) at least one reagent for determining a genotype of a subject for a polymorphic site in SEQ ID NO: 1, in a test sample; and
[0102] c) instructions for determining the genotype of the subject.
[0103] In another aspect, the kit can comprise:
[0104] a) at least one reagent for determining a genotype of a subject at nucleotide 77402 of SEQ ID NO:1 in a test sample; and
[0105] b) instructions for determining the genotype of the subject.
[0106] In another aspect, the kit can comprise:
[0107] a) at least one reagent for determining a genotype of a subject at nucleotide 79906 of SEQ ID NO:1 in a test sample; and
[0108] b) instructions for determining the genotype of the subject.
[0109] In another aspect, the kit can comprise:
[0110] a) at least one reagent for determining a genotype of a subject for a C-G polymorphism at position -1019 in a human serotonin receptor 1A (HTR1A) gene;
[0111] b) at least one reagent for determining a genotype of a subject for a polymorphic site at at least one of nucleotides 77402 or 79906 in SEQ ID NO:1, in a test sample; and
[0112] c) instructions for determining the genotype of the subject.
[0113] In still yet another aspect, the kit comprises:
[0114] a) at least one reagent for detecting or quantifying an mRNA transcribed from SEQ ID NO:1 in a test sample; and
[0115] b) instructions for detecting or quantifying the mRNA transcribed from SEQ ID NO:1 in the test sample.
[0116] In still yet another aspect, the kit comprises:
[0117] a) at least one reagent for detecting or quantifying an mRNA from a serotonin receptor 1A (HTR1A) gene in a test sample;
[0118] b) at least one reagent for detecting or quantifying an mRNA transcribed from SEQ ID NO:1, in a test sample; and
[0119] c) instructions for detecting or quantifying the mRNA from a serotonin receptor 1A (HTR1A) gene and the mRNA transcribed from SEQ ID NO:1 in the test sample.
[0120] The mRNA transcribed from SEQ ID NO:1 in the above kits can have the nucleotide sequence of: SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33.
[0121] In still yet another aspect, the kit comprises:
[0122] a) at least one reagent for detecting or quantifying a polypeptide translated from SEQ ID NO:1 in a test sample; and
[0123] b) instructions for detecting or quantifying the polypeptide translated from SEQ ID NO:1 in the test sample.
[0124] In still yet another aspect, the kit comprises:
[0125] a) at least one reagent for detecting or quantifying a polypeptide encoded by a serotonin receptor 1A (HTR1A) gene in a test sample;
[0126] b) at least one reagent for detecting or quantifying a polypeptide translated from SEQ ID NO:1, in a test sample; and
[0127] c) instructions for detecting or quantifying the polypeptide encoded by a serotonin receptor 1A (HTR1A) gene and the polypeptide translated from SEQ ID NO:1 in the test sample.
[0128] The polypeptide translated from SEQ ID NO:1 in the above-described kits can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0129] In still yet another embodiment, the present invention relates to a method of screening a composition for the ability to bind to a protein translated from SEQ ID NO:1. The method involves the steps of:
[0130] a) exposing said protein to a composition for a time and under conditions sufficient for said test composition to bind to said protein to form protein/composition complexes; and
[0131] b) detecting presence or absence of said complexes, wherein the presence of said complexes indicates a composition having the ability to bind to said protein.
[0132] The presence or absence of the complexes in the above-described method can be detected using mass spectrometry. Additionally, a composition identified pursuant to the above-described method as having the ability to bind to a protein translated from SEQ ID NO:1 can be used to treat major depression or a related disorder in a subject.
[0133] In the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0134] In still yet another embodiment, the present invention relates to a method of detecting binding of a composition to a protein translated from SEQ ID NO:1. The method involves the steps of:
[0135] a) subjecting said protein to nuclear magnetic resonance and recording the resulting spectrum;
[0136] b) subjecting said protein to nuclear magnetic resonance in the presence of said composition and recording the resulting spectrum; and
[0137] c) detecting the difference between said spectrum of step a) and said spectrum of step b) and comparing said difference to a control, said comparison indicating whether said composition binds to said protein.
[0138] A composition identified pursuant to the above-described method as having the ability to bind to a protein translated from SEQ ID NO:1 can be used to treat major depression or a related disorder in a subject.
[0139] In the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0140] In still yet another embodiment, the present invention relates to a method of identifying the structure of a composition bound to a protein translated from SEQ ID NO:1. The method involves the steps of:
[0141] a) exposing said protein to a composition known to bind to said protein; and
[0142] b) observing the resulting X-ray diffraction pattern of said resulting bound composition of step a), said diffraction pattern indicating the structure of said composition.
[0143] Additionally, a composition identified pursuant to the above-described method as having the ability to bind to a protein translated from SEQ ID NO:1 can be used to treat major depression or a related disorder in a subject.
[0144] In the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0145] In still yet another embodiment, the present invention relates to a method of screening a composition for the ability to modulate the activity of a protein translated from SEQ ID NO:1. The method involves the steps of:
[0146] a) providing a composition;
[0147] b) exposing the protein to a substrate for sufficient time and conditions to allow the substrate to react with the protein in order to produce a reaction product or complex;
[0148] c) exposing the protein to the composition; and
[0149] d) measuring said reaction product or complex, wherein a decreased or increased amount of said reaction product or complex, as compared to the amount of reaction product or complex produced in the absence of said composition, indicates a composition having the ability to modulate the activity of said protein.
[0150] In the above-described method, a protein translated from SEQ ID NO:1 having an amino acid sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25 and SEQ ID NO:27 may be capable of modifying the phosphorylation of the substrate. Additionally, the substrate can be selected from the group consisting of: phosphohistidine, phospholysine, phosphodiimide, pyrophosphate and a peptide or protein phosphorylated on histidine or lysine.
[0151] Furthermore, a composition identified in the above-identified method as having an ability to modulate the activity of a protein can be used to treat major depression or a related disorder in a subject.
[0152] Moreover, in the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
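The comparison in step d) of the above-described method can be sketched as follows. This is only an illustrative sketch: the function name `classify_modulation` and the `tolerance` parameter are hypothetical, and the 10% tolerance is an assumed value, since the text does not specify how large a change in reaction product must be before it counts as a decrease or an increase.

```python
def classify_modulation(product_with: float,
                        product_without: float,
                        tolerance: float = 0.10) -> str:
    """Compare reaction product measured with and without the composition.

    product_with    -- amount of reaction product or complex with the composition
    product_without -- amount produced in the absence of the composition
    tolerance       -- hypothetical relative change treated as "no change"
    """
    if product_without <= 0:
        raise ValueError("control amount must be positive")
    if product_with < 0:
        raise ValueError("measured amount cannot be negative")

    ratio = product_with / product_without
    if ratio < 1 - tolerance:
        return "decreased"   # consistent with a composition that reduces activity
    if ratio > 1 + tolerance:
        return "increased"   # consistent with a composition that raises activity
    return "unchanged"       # no modulation detected at this tolerance


if __name__ == "__main__":
    print(classify_modulation(40.0, 100.0))
    print(classify_modulation(150.0, 100.0))
    print(classify_modulation(105.0, 100.0))
```

In practice the tolerance would presumably be set from the measured variability of the assay rather than fixed in advance.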
[0153] In still yet another embodiment, the present invention relates to a method of screening a composition for the ability to modulate the activity of a protein translated from SEQ ID NO:1. The method involves the steps of:
[0154] a) providing a composition;
[0155] b) simultaneously exposing the protein to the composition and to a substrate, wherein the protein is exposed to the substrate for sufficient time and conditions to allow the substrate to react with the protein in order to produce a reaction product or complex; and
[0156] c) measuring presence or absence of said reaction product or complex, wherein a lack of said reaction product or complex indicates a composition having the ability to modulate the activity of said protein.
[0157] In the above-described method, a protein translated from SEQ ID NO:1 having an amino acid sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25 and SEQ ID NO:27 may be capable of modifying the phosphorylation of the substrate. Additionally, the substrate can be selected from the group consisting of: phosphohistidine, phospholysine, phosphodiimide, pyrophosphate and a peptide or protein phosphorylated on histidine or lysine.
[0158] Furthermore, a composition identified in the above-identified method as having an ability to modulate the activity of a protein can be used to treat major depression or a related disorder in a subject.
[0159] Moreover, in the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0160] In still yet another embodiment, the present invention relates to a method of screening a composition for the ability to modulate activity of a protein translated from SEQ ID NO:1, in a cell. The method involves the steps of:
[0161] a) exposing said cell to said composition; and
[0162] b) measuring the amount of activity of said protein in said cell, wherein a decreased or increased amount of activity of said protein, as compared to a cell which has not been exposed to said composition, indicates a composition having the ability to modulate the activity of said protein.
[0163] A composition identified in the above-identified method as having an ability to modulate the activity of a protein can be used to treat major depression or a related disorder in a subject.
[0164] In the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0165] In still yet another embodiment, the present invention relates to a method of screening a composition for the ability to modulate expression of a protein translated from SEQ ID NO:1, in a cell. The method involves the steps of:
[0166] a) exposing said cell to said composition; and
[0167] b) measuring the amount of said protein in said cell, wherein a decreased or increased amount of said protein, as compared to a cell which has not been exposed to said composition, indicates a composition having the ability to modulate the expression of said protein.
[0168] A composition identified pursuant to the above-described method as having an ability to modulate the expression of the protein can be used to treat major depression or a related disorder in a subject.
[0169] In the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0170] In still yet another embodiment, the present invention relates to a method of screening a composition for the ability to modulate the level of expression of a protein translated from SEQ ID NO:1. The method comprises the steps of:
[0171] a) exposing an in vitro transcription and translation system comprising a regulatory sequence from SEQ ID NO:1 functionally connected to the open reading frame for a detectable protein, to a composition for a time and under conditions sufficient to test whether said composition modulates the level of expression of the detectable protein; and
[0172] b) detecting the level of expression of the detectable protein, wherein a reduction or an increase in the level of expression of the detectable protein indicates that said composition has the ability to modulate the level of expression of a protein translated from SEQ ID NO:1.
[0173] A composition identified pursuant to the above-described method as having an ability to modulate the level of expression of the protein can be used to treat major depression or a related disorder in a subject.
[0174] In the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0175] In still yet another embodiment, the present invention relates to a method of treating major depression or a related disorder in a subject in need of said treatment comprising the step of administering a composition identified as modulating the activity of a protein translated from SEQ ID NO:1, to said subject, in an amount sufficient to effect said treatment.
[0176] The composition administered to a subject pursuant to the above-described method can: (a) inhibit or reduce the activity of the protein; (b) increase the activity of the protein; or (c) decrease the activity of the protein.
[0177] A protein translated from SEQ ID NO:1 and that can be used in the above-described method can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0178] In still yet another embodiment, the present invention relates to a method of treating major depression or a related disorder in a subject in need of said treatment comprising reducing the amount of a protein translated from SEQ ID NO:1 in said subject, to a level sufficient to effect said treatment.
[0179] In the above-described method, the reduction can result from complete binding or partial binding of a composition to said protein. A protein translated from SEQ ID NO:1 and that can be used in the above-described method can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0180] In still yet another embodiment, the present invention relates to a method of treating major depression or a related disorder in a subject in need of said treatment comprising increasing the amount of a protein translated from SEQ ID NO:1, to a level sufficient to effect said treatment.
[0181] In addition, the above-described method can involve administering to said subject a therapeutically effective amount of a protein translated from SEQ ID NO:1. A protein translated from SEQ ID NO:1 and that can be used in the above-described method can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0182] In still yet another embodiment, the present invention relates to a method of treating major depression or a related disorder in a subject in need of said treatment comprising the step of administering a composition identified as modulating the level of expression of an mRNA molecule transcribed from SEQ ID NO: 1 to said subject, in an amount sufficient to effect said treatment.
[0183] In the above-described method, an mRNA molecule transcribed from SEQ ID NO:1 can have the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33.
[0184] In still yet another embodiment, the present invention relates to a method of determining the therapeutic activity of a composition used to treat major depression or a related disorder. The method involves the steps of:
[0185] a) determining the amount of a protein translated from SEQ ID NO:1, in a test sample from a subject treated with said composition; and
[0186] b) comparing the amount of said protein in said test sample with the amount of protein present in a test sample from said subject prior to treatment, an equal amount of said protein in said test sample of said treated subject indicating lack of therapeutic activity of said composition and a changed amount of said protein in said test sample of said treated subject indicating therapeutic activity of said composition.
[0187] In the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0188] In still yet another embodiment, the present invention relates to a method of determining the level of therapeutic activity of a composition used to treat major depression or a related disorder. The method involves the steps of:
[0189] a) determining the activity of a protein translated from SEQ ID NO:1, in a test sample from a subject treated with said composition; and
[0190] b) comparing the amount of activity of said protein in said test sample with the amount of activity of protein present in a test sample from said subject prior to treatment, an equal amount of activity of said protein in said test sample of said treated subject indicating lack of therapeutic activity of said composition and a changed amount of activity of said protein in said test sample of said treated subject indicating therapeutic activity of said composition.
[0191] In the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0192] In still yet another embodiment, the present invention relates to a method of determining the level of in vivo activity of a composition used to treat major depression or a related disorder comprising the steps of:
[0193] a) determining the amount of an mRNA molecule transcribed from SEQ ID NO:1, in a test sample from a subject treated with said composition; and
[0194] b) comparing the amount of said mRNA molecule in said test sample with the amount of mRNA molecule present in a test sample from said subject prior to treatment, an equal amount of said mRNA molecule in said test sample of said treated subject indicating lack of therapeutic activity of said composition and a changed amount of said mRNA molecule in said test sample of said treated subject indicating therapeutic activity of said composition.
[0195] In the above-described method, an mRNA molecule transcribed from SEQ ID NO:1 can have the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33.
[0196] In still yet another embodiment, the present invention relates to a method of determining presence or absence of activity of a composition used to treat major depression or a related disorder. The method involves the steps of:
[0197] a) observing phenotype of a subject according to a method validated as a measure of major depression or a related disorder;
[0198] b) administering said composition to said subject for a time and under conditions sufficient for said composition to bind to, inhibit, increase or reduce the activity of, or increase or reduce the amount of a protein translated from SEQ ID NO:1;
[0199] c) repeating step a) with said subject of step b); and
[0200] d) comparing said phenotype of step a) and said phenotype of step c), a difference in step c) as compared to step a) indicating presence of activity of said composition and the lack of a difference indicating absence of activity of said composition.
[0201] In the above-described method, a protein translated from SEQ ID NO:1 can have an amino acid sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0202] In still yet another embodiment, the present invention relates to a method of determining presence or absence of activity of a composition used to treat major depression or a related disorder comprising the steps of:
[0203] a) observing phenotype of a subject according to a method validated as a measure of major depression or a related disorder;
[0204] b) administering said composition to said subject for a time and under conditions sufficient for said composition to increase or reduce the amount of an mRNA molecule transcribed from SEQ ID NO:1;
[0205] c) repeating step a) with said subject of step b); and
[0206] d) comparing said phenotype of step a) and said phenotype of step c), a difference in step c) as compared to step a) indicating presence of activity of said composition and the lack of a difference indicating absence of activity of said composition.
[0207] In the above-described method, an mRNA molecule transcribed from SEQ ID NO:1 can have the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33.
BRIEF DESCRIPTION OF THE FIGURES
[0208] FIG. 1 shows the genomic DNA sequence comprising DEP2 (SEQ ID NO:1). The capital letters show the open reading frame for human phospholysine phosphohistidine inorganic pyrophosphate phosphatase (Lhpp) protein. Red text indicates single nucleotide polymorphisms. The bold underlined portions are forward primers. The bold italic underlined portions are reverse primers. The italic portions are repeat regions. The beginning portion of the sequence before the gap is disclosed as SEQ ID NO: 70.
[0209] FIG. 2 shows the nucleic acid sequence of DEP2-1 (SEQ ID NO:2). The nucleic acid sequence contains two (2) coding regions, each of which is shown in capital letters.
[0210] FIG. 3A shows the amino acid sequence of the Dep2-1a protein encoded by one of the coding regions of the nucleic acid shown in FIG. 2 (SEQ ID NO:3).
[0211] FIG. 3B shows the amino acid sequence of the Dep2-1b protein encoded by the second coding region of the nucleic acid shown in FIG. 2 (SEQ ID NO:4).
[0212] FIG. 4 shows the nucleic acid sequence of a naturally occurring splice variant of DEP2-1 (SEQ ID NO:5). The coding regions are shown in capital letters. The amino acid sequences of the proteins encoded by each of the coding regions are the same as shown in SEQ ID NOS: 3 and 4.
[0213] FIG. 5 shows the nucleic acid sequence of a naturally occurring splice variant of DEP2-1 (SEQ ID NO:6). The coding regions are shown in capital letters. The amino acid sequences of the proteins encoded by each of the coding regions are the same as shown in SEQ ID NOS: 3 and 4.
[0214] FIG. 6 shows the nucleic acid sequence of a naturally occurring splice variant of DEP2-1 (SEQ ID NO:7). The coding region is shown in capital letters. The amino acid sequence of the protein encoded by the coding region is the same as shown in SEQ ID NO:4.
[0215] FIG. 7 shows the nucleic acid sequence of a naturally occurring splice variant of DEP2-1 (SEQ ID NO:8). The coding region is shown in capital letters. The amino acid sequence of the protein encoded by the coding region is the same as shown in SEQ ID NO:4.
[0216] FIG. 8 shows the reference nucleic acid sequence of LHPP (SEQ ID NO:9). The coding region is shown in capital letters.
[0217] FIG. 9 shows the amino acid sequence of the Lhpp protein encoded by the nucleic acid shown in FIG. 8 (SEQ ID NO:10). A polymorphic amino acid has been found to exist at amino acid 94 where arginine is replaced by glutamine (see the underline).
[0218] FIG. 10 shows a naturally occurring splice variant of LHPP (SEQ ID NO:11). The coding region is shown in capital letters. The coding region encodes a protein that is identical to SEQ ID NO:10.
[0219] FIG. 11 shows a naturally occurring splice variant of LHPP (SEQ ID NO:12). The coding region is shown in capital letters.
[0220] FIG. 12 shows the amino acid sequence of the variant Lhpp protein encoded by the nucleic acid shown in FIG. 11 (SEQ ID NO:13).
[0221] FIG. 13 shows a naturally occurring splice variant of LHPP (SEQ ID NO:14). The coding region is shown in capital letters. The amino acid sequence of the variant Lhpp protein encoded by the nucleic acid is shown in SEQ ID NO:15.
[0222] FIG. 14 shows a nucleic acid sequence of a naturally occurring splice variant of LHPP (SEQ ID NO:16). The coding region is shown in capital letters. The amino acid sequence of the variant Lhpp protein encoded by the nucleic acid is shown in SEQ ID NO: 17.
[0223] FIG. 15A and FIG. 15B show a nucleic acid sequence of a naturally occurring splice variant of LHPP (SEQ ID NO:18). The coding region is shown in capital letters. The amino acid sequence of the variant Lhpp protein encoded by the nucleic acid is shown in SEQ ID NO:19.
[0224] FIG. 16 shows a nucleic acid of a naturally occurring splice variant of LHPP (SEQ ID NO:20). The coding region is shown in capital letters. The amino acid sequence of the variant Lhpp protein encoded by the nucleic acid is shown in SEQ ID NO:21.
[0225] FIG. 17 shows a nucleic acid of a naturally occurring splice variant of LHPP (SEQ ID NO:22). The coding region is shown in capital letters. The amino acid sequence of the variant Lhpp protein encoded by the nucleic acid is shown in SEQ ID NO:23.
[0226] FIG. 18 shows a nucleic acid of a naturally occurring splice variant of LHPP (SEQ ID NO:24). The coding region is shown in capital letters. The amino acid sequence of the variant Lhpp protein encoded by the nucleic acid is shown in SEQ ID NO:25.
[0227] FIG. 19 shows a nucleic acid of a naturally occurring splice variant of LHPP (SEQ ID NO:28). The coding region is shown in capital letters. The amino acid sequence of the protein encoded by the nucleic acid is shown in SEQ ID NO:29.
[0228] FIG. 20 shows a nucleic acid sequence of DEP2-2 (SEQ ID NO:26). The coding region is shown in capital letters.
[0229] FIG. 21 shows the amino acid sequence of the Dep2-2 protein encoded by the nucleic acid shown in FIG. 20 (SEQ ID NO:27).
[0230] FIG. 22 shows a nucleic acid sequence of DEP2-3 (SEQ ID NO:30).
[0231] FIG. 23A and FIG. 23B show a nucleic acid of AK127935 (GenBank: AK127935) (SEQ ID NO:31). The coding region is shown in capital letters.
[0232] FIG. 24 shows the amino acid sequence of the Dep2-4 protein encoded by the nucleic acid shown in FIG. 23 (SEQ ID NO:32).
[0233] FIG. 25A and FIG. 25B show a nucleic acid of AW867792 (GenBank: AW867792) (SEQ ID NO:33). The coding region is shown in capital letters.
[0234] FIG. 26 shows the amino acid sequence of the Dep2-5 protein encoded by the nucleic acid shown in FIG. 25 (SEQ ID NO:34).
[0235] FIG. 27 shows the genetic evidence on chromosome 10 for linkage of a gene to major depressive disorder. The lower, curved line and lower, two horizontal lines show evidence from standard linkage analysis. The green lines show evidence from serotonin receptor 1A-conditional linkage analysis. The dotted horizontal lines indicate the extent of the linkage region as defined by a drop of one unit in the heterogeneity LOD score. The dashed horizontal lines indicate the extent of the linkage region as defined by a drop of two units in the heterogeneity LOD score. The right side of the figure represents the telomere of chromosome 10. The approximate location of DEP2 is represented by an arrowhead.
[0236] FIG. 28 shows a Northern blot probed with a polynucleotide complementary to SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11 and SEQ ID NO:12.
[0237] FIG. 29 is a schematic representation of certain naturally occurring DEP2 transcripts and proteins. This figure was created using Genecarta software (Compugen, Tel Aviv, Israel). Dark lines represent transcripts and light lines represent coding regions. Transcripts shown include LHPP and naturally occurring variants thereof, DEP2-1 and naturally occurring variants thereof, and DEP2-2. DEP2-3, for which all supportive evidence has been generated, is not shown. AK127935 and AW867792, which share no exons in common with LHPP or DEP2-1, or naturally occurring variants thereof, are not shown. Although SEQ ID NO:2, SEQ ID NO:5 and SEQ ID NO:6 each contain the coding regions for both Dep2-1a and Dep2-1b, only the 5'-most (Dep2-1a) is associated with those transcripts in this figure. The location of Dep2-1b is associated with SEQ ID NO:7 and SEQ ID NO:8.
[0238] FIG. 30 shows a sequence alignment of DEP2-1 sequences either predicted by a bioinformatic algorithm (Genecarta (SEQ ID NO:71)) or determined experimentally by direct sequencing of cloned cDNAs h5173309 (SEQ ID NO:72), h5194531 (SEQ ID NO:73), h3197955 (SEQ ID NO:74) and h4565014 (SEQ ID NO:75). Arrowheads indicate major transcription start sites determined by RLM-RACE. A single nucleotide polymorphism is indicated by an underlined base in the h4565014 sequence. The last line of sequence was found, downstream of a polyadenylate tract, in h4565014 and does not match DEP2.
[0239] FIG. 31 shows RLM-RACE results. MWM=molecular weight markers.
[0240] FIG. 32 shows the results of exon bridging PCR experiment #1 in Example 7. The lower bands between the 50 and 100 nucleotide markers are primer dimers.
[0241] FIG. 33 shows the results of exon bridging PCR experiment #2 in Example 7. Negative controls are reactions in which no reverse transcriptase was added.
DETAILED DESCRIPTION OF THE INVENTION
Definitions
[0242] As used herein, the singular forms "a," "an" and "the" include plural reference unless the context clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of skill in the art to which this invention belongs.
[0243] As used herein, the term "allele" refers to a particular form of a nucleic acid, either DNA or RNA, wherein different alleles of a nucleic acid differ in sequence, by either change or insertion/deletion, at one or more nucleotides at a polymorphic site.
[0244] "cDNA" refers to a DNA that is complementary to and synthesized from an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into the double-stranded form.
[0245] As used herein, the term "coding sequence" or "coding region" refers to a nucleic acid sequence that codes for a specific amino acid sequence.
[0246] As used herein, the term "complementarity" refers to the degree of relatedness between two nucleic acid segments. It is determined by measuring the ability of the sense strand of one nucleic acid segment to hybridize with the antisense strand of the other nucleic acid segment, under appropriate conditions, to form a double helix. A "complement" is defined as a sequence which pairs to a given sequence based upon the canonical base-pairing rules. For example, a sequence A-G-T in one nucleotide strand is "complementary" to T-C-A in the other strand.
[0247] In the DNA double helix, wherever adenine appears in one strand, thymine (uracil in RNA) appears in the other strand. Similarly, wherever guanine is found in one strand, cytosine is found in the other. The greater the relatedness between the nucleotide sequences of two nucleic acid segments, the greater the ability to form hybrid duplexes between the strands of the two nucleic acid segments.
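By way of illustration only, the canonical base-pairing rules described above can be sketched in a few lines of Python. The function names complement and reverse_complement are hypothetical and chosen for this sketch; they are not part of the disclosure.

```python
# Canonical Watson-Crick pairing rules for DNA and RNA.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(seq, pairs=DNA_PAIRS):
    """Return the base-by-base complement, written in the same left-to-right order."""
    return "".join(pairs[base] for base in seq.upper())


def reverse_complement(seq, pairs=DNA_PAIRS):
    """Return the complement read in the opposite direction (the antiparallel strand)."""
    return complement(seq, pairs)[::-1]


print(complement("AGT"))              # TCA, matching the A-G-T / T-C-A example
print(reverse_complement("AGT"))      # ACT
print(complement("AGU", RNA_PAIRS))   # UCA
```

The first call reproduces the A-G-T and T-C-A pairing given in the definition; the reverse complement is shown because the two strands of a duplex run in opposite directions.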
[0248] "Similarity" between two amino acid sequences is defined as the presence of a series of identical as well as conserved amino acid residues in both sequences. The higher the degree of similarity between two amino acid sequences, the higher the correspondence, sameness or equivalence of the two sequences. ("Identity" between two amino acid sequences is defined as the presence of a series of exactly alike or invariant amino acid residues in both sequences.) The definitions of "complementarity", "identity" and "similarity" are well known to those of ordinary skill in the art.
[0249] As used herein, the term "DEP2" refers to a gene on human chromosome 10q26.2 that has been statistically linked and associated with major depression, and that is believed to be within the 159 kb sequence comprising SEQ ID NO:1. Transcripts that arise from DEP2 include: (a) LHPP (SEQ ID NO:9) (See, Yokoi et al., J Biochem 133:607-14 (2003)); (b) naturally occurring splice variants of LHPP (SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24 and SEQ ID NO:26); (c) DEP2-1 (SEQ ID NO:2); (d) naturally occurring splice variants of DEP2-1 (SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 or SEQ ID NO:8); (e) DEP2-2 (SEQ ID NO:28); (f) DEP2-3 (SEQ ID NO:30); (g) GenBank sequence AK127935 (SEQ ID NO:31); and (h) GenBank sequence AW867792 (SEQ ID NO:33). Proteins that are encoded within DEP2 include: (a) Lhpp (SEQ ID NO:10) (See, Yokoi et al., J Biochem 133:607-14 (2003)); (b) naturally occurring protein variants of Lhpp (SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25 and SEQ ID NO:27); (c) Dep2-1a and Dep2-1b (SEQ ID NO:3 and SEQ ID NO:4, respectively); (d) Dep2-2 (SEQ ID NO:29); (e) Dep2-4 (SEQ ID NO:32); and (f) Dep2-5 (SEQ ID NO:34).
[0250] As used herein, the term "DEP2 transcripts" refers to the group of transcripts arising in whole or in part from SEQ ID NO:1, including but not limited to: LHPP and naturally occurring splice variants thereof; DEP2-1 and naturally occurring splice variants thereof; DEP2-2; DEP2-3; GenBank sequence AK127935 and GenBank sequence AW867792. As used herein, the term "DEP2 proteins" refers to the group of proteins encoded in whole or in part from one or more DEP2 transcripts, including but not limited to: Lhpp and naturally occurring protein variants thereof; Dep2-1a; Dep2-1b; Dep2-2; Dep2-4 and Dep2-5. As used herein, the terms "DEP2 polymorphic sites" or "DEP2 polymorphisms", used interchangeably, refer to polymorphic sites found within SEQ ID NO:1, or if outside of SEQ ID NO:1, within a DEP2 transcript.
[0251] As used herein, the term "DEP2-1" refers to a messenger RNA shown in SEQ ID NO:2 and in FIG. 2, and DNA sequences that functionally regulate expression thereof. Experimental evidence that DEP2-1 messenger RNA is a naturally occurring transcript is disclosed herein. As shown in FIG. 2, DEP2-1 messenger RNA has 2 exons. Of these, an exon comprising nucleotides 1-315 of SEQ ID NO:2 was not previously known to be in any naturally occurring transcript. In addition, three naturally occurring polymorphic sites in nucleotides 1-315 of DEP2-1 messenger RNA are disclosed herein: (a) 135T>C, (b) 209A>G and (c) 241G>A.
[0252] As used herein, the terms "Dep2-1a" and "Dep2-1b" refer to proteins shown in SEQ ID NO:3 and in FIG. 3A, and in SEQ ID NO:4 and in FIG. 3B, respectively. These proteins may be encoded from DEP2-1 as well as naturally occurring splice variants thereof.
[0253] As used herein, the term "DEP2-2" refers to a messenger RNA shown in SEQ ID NO:28, and DNA sequences that functionally regulate expression thereof.
[0254] As used herein, the term "Dep2-2" refers to a protein shown in SEQ ID NO:29. This protein may be encoded from DEP2-2.
[0255] As used herein, the term "DEP2-3" refers to a messenger RNA shown in SEQ ID NO:30, and DNA sequences that functionally regulate expression thereof.
[0256] As used herein, the term "Dep2-4" refers to a protein shown in SEQ ID NO:32. This protein may be encoded from SEQ ID NO:31.
[0257] As used herein, the term "Dep2-5" refers to a protein shown in SEQ ID NO:34. This protein may be encoded from SEQ ID NO:33.
[0258] As used herein, the phrases "effective amount" and "therapeutically effective amount", which are used interchangeably herein, when used in connection with an active agent (such as a drug), mean a nontoxic but sufficient amount of the active agent to provide the desired effect. The amount of active agent (such as a drug) that is "effective" will vary from subject to subject, depending on the age and general condition of the individual, the particular active agent or agents, and the like. Thus, it is not always possible to specify an exact "effective amount." However, an appropriate "effective amount" in any individual case may be determined by one of ordinary skill in the art using routine experimentation.
[0259] As used herein, the phrase "encoded by" refers to a nucleic acid sequence which codes for a polypeptide sequence, wherein the polypeptide sequence or a portion thereof contains an amino acid sequence of at least 3 amino acids, more preferably at least 8 amino acids, and even more preferably at least 15 amino acids from a polypeptide encoded by the nucleic acid sequence.
[0260] As used herein, the term "exon" refers to a portion of the gene sequence that is transcribed and is found in the mature messenger RNA derived from the gene, but is not necessarily a part of the sequence that encodes the final gene product.
[0261] The term "expression", as used herein, refers to the production of a functional end-product. Expression of a gene involves transcription of the gene and translation of the mRNA into a precursor or mature protein. "Antisense inhibition" refers to the production of antisense RNA transcripts capable of suppressing the expression of the target transcript. "Co-suppression" refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020).
[0262] As used herein, the term "fragment" of a nucleic acid sequence refers to a contiguous sequence of at least about 6 nucleotides, preferably at least about 8 nucleotides, more preferably at least about 10 nucleotides, even more preferably at least about 15 nucleotides, and most preferably at least about 20 nucleotides identical or complementary to a region of the specified nucleotide sequence. Nucleotides (usually found in their 5'-monophosphate form) are referred to by their single letter designation as follows: "A" for adenylate or deoxyadenylate (for RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate, "G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for deoxythymidylate, and "N" for any nucleotide.
[0263] As used herein, the term "gene" refers to a nucleic acid sequence that undergoes transcription as a result of the activity of at least one promoter. A gene may encode for a particular polypeptide, or alternatively, code for a RNA molecule. A gene includes one or more exons and one or more regulatory or control sequences and may include one or more introns. The phrase "target gene" as used herein, refers to a nucleic acid sequence, such as, but not limited to, a nucleic acid sequence of interest that encodes a polypeptide of interest or alternatively, a RNA molecule of interest. The term "target gene" can also refer to a gene to be identified or knocked-out according to the methods described herein.
[0264] As used herein, the term "genotype" refers to the identity of alleles present in a subject or in a test sample.
[0265] As used herein, the term "genotyping" refers to the process of determining the genotype of a subject.
[0266] As used herein, the terms "homologous", "substantially similar" and "corresponding substantially" are used interchangeably. They refer to nucleic acid or protein fragments wherein changes in one or more nucleotide bases or amino acids do not affect the ability of the nucleic acid or protein fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of the nucleic acid or protein fragments of the instant invention such as deletion or insertion of one or more nucleotides or amino acids that do not substantially alter the functional properties of the resulting nucleic acid or protein fragment relative to the initial, unmodified fragment. It is therefore understood, as those skilled in the art will appreciate, that the invention encompasses more than the sequences exemplified herein.
[0267] As used herein, the term "identity" refers to the relatedness of two sequences on a nucleotide-by-nucleotide basis over a particular comparison window or segment. Thus, identity is defined as the degree of sameness, correspondence or equivalence between the same strands (either sense or antisense) of two DNA segments (or two amino acid sequences).
[0268] "Percentage of sequence identity" is calculated by comparing two optimally aligned sequences over a particular region, determining the number of positions at which the identical base or amino acid occurs in both sequences in order to yield the number of matched positions, dividing the number of such positions by the total number of positions in the segment being compared and multiplying the result by 100. Optimal alignment of sequences may be conducted by the algorithm of Smith & Waterman, Adv. Appl. Math., 2:482 (1981), by the algorithm of Needleman & Wunsch, J. Mol. Biol., 48:443 (1970), by the method of Pearson & Lipman, Proc. Natl. Acad. Sci., (USA) 85:2444 (1988) and by computer programs which implement the relevant algorithms (for example, Clustal Macaw Pileup (which is publicly available on the Internet; Higgins et al., CABIOS. 5:151-153 (1989)), FASTDB (Intelligenetics), BLAST (National Center for Biotechnology Information; Altschul et al., Nucleic Acids Research, 25:3389-3402 (1997)), PILEUP (Genetics Computer Group, Madison, Wis.) or GAP, BESTFIT, FASTA and TFASTA (Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, Madison, Wis.)). (See U.S. Pat. No. 5,912,120.)
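For illustration only, the calculation described above may be sketched in Python as follows. The sketch assumes that the two sequences have already been optimally aligned by one of the cited methods and padded to equal length with a gap character ("-"); the function name percent_identity and the treatment of gap columns as mismatches are assumptions of this sketch, not requirements of the definition.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned, equal-length sequences.

    A column counts as matched only when both sequences carry the same
    non-gap residue. The denominator is the total number of aligned
    positions in the segment being compared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


print(percent_identity("ACGTACGT", "ACGTTCGT"))          # 87.5
print(round(percent_identity("ACG-TA", "ACGTTA"), 1))    # 83.3
```

In the first call, seven of eight positions match, giving 87.5 percent. Other conventions for the denominator (for example, excluding gap columns) exist and would give different values for gapped alignments.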
[0269] As used herein, the term "isoform" refers to a particular form of a protein, wherein different isoforms of a protein differ in sequence, by either change or insertion/deletion, or covalent modification at one or more amino acids.
[0270] As used herein, the terms "isolated" or "purified", used interchangeably, when used in connection with biological molecules such as nucleic acids or proteins, mean that the molecule is substantially free of other biological molecules such as nucleic acids, proteins, lipids, carbohydrates or other material such as cellular debris and growth media. Generally, the terms "isolated" and "purified" are not intended to refer to a complete absence of such material or to absence of water, buffers, or salts, unless they are present in amounts that substantially interfere with the methods of the present invention.
[0271] As used herein, an "isolated nucleic acid fragment or sequence" is a polymer of nucleic acid (RNA or DNA) that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.
[0272] As used herein, the term "Lhpp" refers to an enzyme known as phospholysine phosphohistidine inorganic pyrophosphate phosphatase, and the term "LHPP" refers to the corresponding messenger RNA, and DNA sequences that functionally regulate expression thereof. Lhpp was originally purified from swine brain in 1957 (See, Seal et al., J Biol Chem 228:193-9 (1957)), and subsequently has been purified from several additional mammalian sources (See, Felix et al., J Biochem 147:111-8 (1975); Yoshida et al., Cancer Research 42:3256-31 (1982); Hachimori et al., J Biochem 93:257-64 (1983); Smirnova et al., Arch Biochem Biophys 287:135-40 (1991); Hiraishi et al., Arch Biochem Biophys 341:153-9 (1997)). The enzyme has been characterized in vitro as efficiently catalyzing the hydrolysis of P--N bonds in phosphohistidine and phospholysine, and less efficiently catalyzing the hydrolysis of P--N or P--O bonds in imidodiphosphate and pyrophosphate, respectively. Lhpp may be a protein histidine or lysine phosphoamidase, i.e., an enzyme that modifies the N-linked phosphorylation state of other proteins. The human LHPP has been cloned. Functional human Lhpp enzyme has been purified following heterologous expression in E. coli (See, Yokoi et al., J Biochem 133:607-14 (2003)). The nucleic acid sequence of LHPP messenger RNA is shown in SEQ ID NO:9 and in FIG. 8. The amino acid sequence of Lhpp is shown in SEQ ID NO:10 and in FIG. 9. As shown in FIG. 1, LHPP messenger RNA has 7 exons. The locations of these exons are provided below in Table A.
TABLE A
Exon    Start in SEQ ID NO: 1    End in SEQ ID NO: 1
1       3001                     3163
2       25305                    25492
3       29588                    29741
4       38127                    38190
5       39202                    39294
6       58346                    58437
7       154430                   155313
In addition, there is a naturally occurring polymorphic site in Lhpp (R94Q) in which amino acid 94 is either arginine or glutamine in the two naturally occurring isoforms. In the corresponding naturally occurring polymorphic site in LHPP messenger RNA (281G>A), base 281 of the open reading frame is either guanine or adenine in the two naturally occurring alleles. Further, Lhpp is encoded from a naturally occurring splice variant of LHPP that is shown in SEQ ID NO:11 (See FIG. 10).
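As a consistency check offered for illustration only, the relationship between the nucleotide polymorphism (281G>A) and the amino acid polymorphism (R94Q) can be worked through in Python. The sketch assumes the standard genetic code and 1-based numbering of the open reading frame; the actual arginine codon used in LHPP is not stated here, so the sketch enumerates the possibilities rather than asserting one. The function names are hypothetical.

```python
# Standard genetic code entries (RNA alphabet) for the two residues in R94Q.
ARG_CODONS = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}
GLN_CODONS = {"CAA", "CAG"}


def locate_orf_base(orf_position):
    """Map a 1-based ORF base to (1-based codon number, 1-based position in codon)."""
    codon_number = (orf_position - 1) // 3 + 1
    position_in_codon = (orf_position - 1) % 3 + 1
    return codon_number, position_in_codon


def arg_to_gln_by_g_to_a(position_in_codon):
    """List (Arg codon, Gln codon) pairs that differ only by G>A at the given position."""
    idx = position_in_codon - 1
    pairs = []
    for codon in sorted(ARG_CODONS):
        if codon[idx] == "G":
            mutated = codon[:idx] + "A" + codon[idx + 1:]
            if mutated in GLN_CODONS:
                pairs.append((codon, mutated))
    return pairs


print(locate_orf_base(281))      # (94, 2)
print(arg_to_gln_by_g_to_a(2))   # [('CGA', 'CAA'), ('CGG', 'CAG')]
```

Base 281 falls at the second position of codon 94, and a G>A change at that position converts either of two arginine codons into a glutamine codon, which is consistent with the R94Q designation.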
[0273] As used herein, the term "locus" refers to a location on a chromosome of a nucleic acid molecule corresponding to a gene or a physical or phenotypic feature, where physical features include polymorphic sites.
[0274] As used herein, the term "major depression or a related disorder" refers to any Mood Disorder or Anxiety Disorder described in the Diagnostic and Statistical Manual (DSM-IV-TR, American Psychiatric Association, 2000). Mood Disorders include, but are not limited to, Depressive Disorders (DSM-IV-TR 296.2x, 296.3x, 300.4, 311), Bipolar Disorders (DSM-IV-TR 296.0x, 296.40, 296.4x, 296.5x, 296.6x, 296.7, 296.89, 301.13, 296.80) and Mood Disorder Not Otherwise Specified (DSM-IV-TR 296.90). Anxiety Disorders include, but are not limited to, Panic Disorders (DSM-IV-TR 300.01, 300.21), Phobic Disorders (DSM-IV-TR 300.29, 300.22, 300.23), Obsessive-Compulsive Disorder (DSM-IV-TR 300.3), Post-Traumatic Stress Disorder (DSM-IV-TR 309.81), Acute Stress Disorder (DSM-IV-TR 308.3), Generalized Anxiety Disorder (DSM-IV-TR 300.02) and Anxiety Disorder Not Otherwise Specified (DSM-IV-TR 300.00). Extensive lists of symptoms and diagnostic criteria for each of these disorders are found in the DSM-IV-TR sections cited above.
[0275] As used herein, the terms "modulates", "modulation" and "modulating", used interchangeably herein, refer to both upregulation (for example, activation or stimulation (for example, by agonizing or potentiating)) and downregulation (for example, inhibition or suppression (for example, by antagonizing, reducing, decreasing or inhibiting)).
[0276] As used herein, the term "naturally occurring" refers to a DNA molecule, a messenger RNA, a protein, an allele, an isoform, a polymorphic site, a splice variant or a protein variant, wherein the existence in nature of said DNA molecule, messenger RNA, protein, allele, isoform, polymorphic site, splice variant or protein variant is supported by either (a) direct experimental evidence or (b) algorithmic assembly from a database of nucleic acid or protein sequences. Alleles, isoforms, polymorphic sites, splice variants and protein variants might also be created by experimental manipulation.
[0277] As used herein, the term "naturally occurring splice variant of DEP2-1" includes but is not limited to the sequences shown in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 and SEQ ID NO:8.
[0278] As used herein, the term "naturally occurring splice variant of LHPP" includes but is not limited to the sequences shown in SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24 and SEQ ID NO:26. As used herein, the term "naturally occurring protein variant of Lhpp" includes but is not limited to the sequences shown in SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25 and SEQ ID NO:27.
[0279] As used herein, the phrase "3' non-coding sequences" refers to mRNA sequences located downstream of a coding sequence.
[0280] As used herein, the term "non-human animal" includes all vertebrate animals, except humans. It also includes an individual animal in all stages of development, including embryonic and fetal stages. A "transgenic animal" is any animal containing one or more cells bearing genetic information altered or received, directly or indirectly, by deliberate genetic manipulation at a subcellular level, such as by targeted recombination or microinjection or infection with recombinant virus.
[0281] Mice are often used for transgenic animal models because they are easy to house, relatively inexpensive, and easy to breed. However, other non-human transgenic animals may also be made in accordance with the present invention such as, but not limited to, primates, mice, goats, sheep, rabbits, dogs, cows, cats, guinea pigs, rats, zebrafish and nematodes. Transgenic animals are those which carry a transgene, that is, a cloned gene introduced and stably incorporated which is passed on to successive generations.
[0282] As used herein, the term "nucleic acid" refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term includes double- and single-stranded DNA, as well as double- and single-stranded RNA. It also includes modifications, such as methylation or capping and unmodified forms of the polynucleotide. The terms "polynucleotide," "oligomer," "oligonucleotide," and "oligo" are used interchangeably herein.
[0283] As used herein, the phrase "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is regulated by the other. For example, a promoter is operably linked with a coding sequence when it is capable of regulating the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in a sense or antisense orientation. In another example, the complementary RNA regions of the invention can be operably linked, either directly or indirectly, 5' to the target mRNA, or 3' to the target mRNA, or within the target mRNA, or a first complementary region is 5' and its complement is 3' to the target mRNA.
[0284] Polymerase chain reaction ("PCR") is a technique used to amplify DNA millions of fold, by repeated replication of a template, in a short period of time. (Mullis et al., Cold Spring Harbor Symp. Quant. Biol. 51:263-273 (1986); Erlich et al., European Patent Application No. 50,424; European Patent Application No. 84,796; European Patent Application No. 258,017; European Patent Application No. 237,362; Mullis, European Patent Application No. 201,184; Mullis et al., U.S. Pat. No. 4,683,202; Erlich, U.S. Pat. No. 4,582,788; and Saiki et al., U.S. Pat. No. 4,683,194). The process utilizes sets of specific in vitro synthesized oligonucleotides to prime DNA synthesis. The design of the primers is dependent upon the sequences of DNA that are desired to be analyzed. The technique is carried out through many cycles (usually 20-50) of melting the template at high temperature, allowing the primers to anneal to complementary sequences within the template and then replicating the template with DNA polymerase.
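The statement that PCR amplifies DNA "millions of fold" follows from the repeated replication of the template. As a rough illustration only, and assuming an idealized reaction in which a fixed fraction of templates is copied in every cycle (real reactions plateau, which is not modeled), the theoretical fold amplification can be sketched in Python. The function name and the efficiency parameter are assumptions of this sketch.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Theoretical fold amplification after a given number of PCR cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 corresponds to perfect doubling in every cycle).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


print(int(fold_amplification(20)))   # 1048576
print(int(fold_amplification(30)))   # 1073741824
```

Under perfect doubling, 20 cycles already give roughly a million-fold amplification, which is why the lower end of the usual 20-50 cycle range suffices for many applications.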
[0285] As used herein, the term "polymorphic site" refers to a nucleic acid sequence comprising one or more consecutive nucleotides that differ between alleles, or to a protein sequence comprising one or more consecutive amino acids that differ between isoforms.
[0286] As used herein, the term "polymorphism" refers to a sequence variation observed in a subject at a polymorphic site. Polymorphisms include nucleotide or amino acid substitutions, insertions and deletions and may, but need not, result in detectable differences in gene expression or protein function.
[0287] The terms "polypeptide" and "protein" are used interchangeably herein and indicate at least one molecular chain of amino acids linked through covalent and/or non-covalent bonds. The terms do not refer to a specific length of the product. Thus peptides, oligopeptides and proteins are included within the definition of polypeptide. In addition, protein fragments, analogs, mutated or variant proteins, fusion proteins and the like are included within the meaning of polypeptide.
[0288] As used herein, the term "primer" refers to an oligonucleotide, whether naturally occurring, such as in a purified restriction digest, or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced (such as in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). Primers can be single or double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. The exact length of the primers will depend on many factors, including temperature, source of primer and the use of the method. Primers preferably have a length of at least 10 contiguous nucleotides. For example, primers can have a length of 10 contiguous nucleotides, 15 contiguous nucleotides, 20 contiguous nucleotides, 25 contiguous nucleotides, etc.
[0289] As used herein, the term "probe" refers to an oligonucleotide, whether naturally occurring, such as in a purified restriction digest, or produced synthetically, recombinantly or by polymerase chain reaction amplification, which is capable of hybridizing to another oligonucleotide or nucleic acid of interest. A probe may be single-stranded or double-stranded. Probes can be labeled with a detectable label so as to make said probe detectable in a detection system. The detectable label used is not critical.
[0290] As used herein, the term "promoter" refers to a DNA sequence capable of controlling the transcription of a RNA. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments.
[0291] As used herein, the term "protein variant" refers to a polypeptide that is encoded from a splice variant, wherein two protein variants differ in the inclusion/exclusion of one or more blocks of consecutive amino acids.
[0292] The terms "recombinant construct", "construct", "expression construct" and "recombinant expression construct" are used interchangeably herein. These terms refer to a functional unit of genetic material that can be inserted into the genome of a cell or expressed in vitro using standard methodology well known to one skilled in the art. Such a construct may be used by itself or may be used in conjunction with a vector. If a vector is used then the choice of vector is dependent upon the method that will be used to transform host plants as is well known to those skilled in the art. For example, a plasmid vector can be used. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells comprising any of the isolated nucleic acid fragments of the invention. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al., EMBO J. 4:2411-2418 (1985); De Almeida et al., Mol. Gen. Genetics 218:78-86 (1989)), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by Southern analysis of DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.
[0293] As used herein, the term "regulatory sequences" refers to a DNA or RNA sequence capable of controlling the expression of a RNA or protein. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
[0294] As used herein, the phrases "RNA transcript" and "RNA molecule", used interchangeably herein, refer to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. "Messenger RNA (mRNA)" refers to the RNA that is without introns and that can be translated into protein by the cell.
[0295] As used herein, the phrase "sense RNA" refers to an RNA molecule that includes the mRNA and can be translated into protein within a cell or in vitro. As used herein, the phrase "antisense RNA" refers to an RNA molecule that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. As used herein, the phrase "functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes. The terms "complement" and "reverse complement" are used interchangeably herein with respect to mRNA transcripts, and are meant to define the antisense RNA of the message.
[0296] As used herein, the term "single nucleotide polymorphism" or "SNP" typically refers to a specific pair of nucleotides observed at a single polymorphic site. In rare cases, three or four nucleotides may be found.
[0297] As used herein, the term "splice variant" refers to a particular form of a messenger RNA, wherein two splice variants share either (a) a transcriptional start site or (b) an open reading frame, but differ in the inclusion/exclusion of one or more exons.
[0298] As used herein, the term "subject" refers to an animal, preferably a mammal, including a human or non-human. The animal can be a domesticated or non-domesticated animal.
[0299] As used herein, the term "treating" refers to reversing, alleviating, inhibiting the progress of, or preventing at least one overt symptomatic manifestation of the disorder or condition to which such term applies, or one or more symptoms of such disorder or condition. The term "treatment" as used herein, refers to the act of treating, as "treating" is defined herein. For the present invention, the term "treat" means to alleviate or eliminate one or more symptoms, behavior or events associated with a depressive disorder.
DEP2-1 Transcripts and Proteins Encoded Thereby
[0300] In one embodiment, the present invention relates to the discovery of a novel transcript of DEP2, named DEP2-1. FIG. 2 illustrates the isolated nucleic acid sequence for DEP2-1 (SEQ ID NO:2), which is 1198 base pairs in length. Nucleotides 1 to 316 of SEQ ID NO:2 comprise an exon that was not previously known to be in any naturally occurring transcript of DEP2. The remaining portion of SEQ ID NO:2 (nucleotides 317 to 1198) is shared with LHPP and naturally occurring splice variants thereof, and corresponds to nucleotides 882 to 1760 of SEQ ID NO:9. DEP2-1 was initially assembled using Genecarta software (Compugen, Tel Aviv, Israel) from publicly available expressed sequence tags ("ESTs"). Specifically, the proprietary algorithms identified that certain ESTs (GenBank accession numbers BI669229, BI489679, BI756098, Z44231, R15274, R11923, BX952014, BI754006, H51555, H51378 and BG397886) each comprised sequences from within both nucleotides 1-316 and nucleotides 317-1198 of SEQ ID NO:2. These two sequence blocks are not contiguous in the human genome, implying that SEQ ID NO:2 is a spliced transcript. To confirm that DEP2-1 is a naturally occurring transcript, four IMAGE clones (h5173309, h5194531, h5197955 and h4565014, corresponding to BI489679, BI756098, BI754006 and BG397886, respectively) were completely sequenced in both directions (see Example 5). The contiguous sequence of nucleotides 19-1198 of SEQ ID NO:2 was thereby confirmed. IMAGE clones h5173309, h5194531 and h5197955 are from brain, and include only sequences shown in SEQ ID NO:2. Among the ESTs that assembled to form DEP2-1, all were from brain except for BG397886, which was from a tonsillar primary B cell line. IMAGE clone h4565014 corresponds to BG397886. The sequence of this clone included nucleotides 299-1198 followed by a polyadenine tail and a further 77 nucleotide sequence that did not match DEP2.
Further, a single nucleotide polymorphism (1142G>A) was discovered at nucleotide 1142 of SEQ ID NO:2. Rapid amplification of cDNA ends was performed to determine the 5' end(s) of DEP2-1 (see Example 6). These experiments identified two 5' ends in human spinal cord RNA, at nucleotides 1 and 75 of SEQ ID NO:2. Two series of PCR experiments were also performed to determine whether the first exon (nucleotides 1-315) of DEP2-1 is included in additional transcripts with upstream exons (see Example 7). The first experiments used forward primers in an upstream exon of LHPP and reverse primers in the first exon of DEP2-1. The second experiments used forward primers in an upstream exon of LHPP and reverse primers in a downstream exon of LHPP. No sequence from the first exon of DEP2-1 was amplified in either set of experiments. In total, these experimental results demonstrate that DEP2-1 is a naturally occurring transcript, that it is not an alternative splice variant of LHPP, and thus does not encode a naturally occurring protein variant of Lhpp.
[0301] The isolated nucleic acid sequence of DEP2-1 has two coding regions, each of which is illustrated in capital letters in FIG. 2. The first coding region (nucleotides 352 to 771 in SEQ ID NO:2) may encode the protein referred to herein as Dep2-1a, shown in FIG. 3A (SEQ ID NO:3). Dep2-1a is 140 amino acids in length. The second coding region (nucleotides 812 to 1162 in SEQ ID NO:2) may encode the protein referred to herein as Dep2-1b, shown in FIG. 3B (SEQ ID NO:4). Dep2-1b is 117 amino acids in length.
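The stated protein lengths can be checked against the stated nucleotide coordinates with simple arithmetic. The following Python sketch is illustrative only; it assumes that the coordinates are 1-based and inclusive and that each coding region, as listed, contains a whole number of codons. The function name is hypothetical.

```python
def codons_in_region(start, end):
    """Number of whole codons in an inclusive, 1-based nucleotide interval."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("interval does not contain a whole number of codons")
    return length // 3


print(codons_in_region(352, 771))    # 140
print(codons_in_region(812, 1162))   # 117
```

The two intervals span 420 and 351 nucleotides, respectively, which correspond to 140 and 117 codons, in agreement with the stated lengths of Dep2-1a and Dep2-1b.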
[0302] In addition, naturally occurring splice variants of DEP2-1 have been identified by the inventors of the present invention. These transcripts were assembled using Genecarta software (Compugen, Tel Aviv, Israel) from publicly available expressed sequence tags ("ESTs"). These splice variants of DEP2-1 are shown in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 and SEQ ID NO:8. SEQ ID NO:5 is also shown in FIG. 4 and is 1092 base pairs in length. The coding regions are shown in capital letters in this figure. The first and second coding regions encode Dep2-1a (See, FIG. 3A (SEQ ID NO:3)) and Dep2-1b (See, FIG. 3B (SEQ ID NO:4)), respectively. SEQ ID NO:5 was assembled on the basis of three ESTs (BG759116, BF976531 and BQ706325), all of B cell origin. SEQ ID NO:6 is shown in FIG. 5 and is 1022 base pairs in length. The coding regions are shown in capital letters in this figure. The first and second coding regions encode Dep2-1a (See, FIG. 3A (SEQ ID NO:3)) and Dep2-1b (See, FIG. 3B (SEQ ID NO:4)), respectively. SEQ ID NO:6 was assembled on the basis of two ESTs (BI226948 and BE396637), both from a Burkitt's lymphoma cell line. SEQ ID NO:7 is shown in FIG. 6 and is 1186 base pairs in length. The coding region is shown in capital letters in this figure. The coding region encodes Dep2-1b (See, FIG. 3B (SEQ ID NO:4)). SEQ ID NO:7 was assembled on the basis of a single EST (CF454636), from the peripheral nervous system. SEQ ID NO:8 is shown in FIG. 7 and is 1005 base pairs in length. The coding region is shown in capital letters in this figure. The coding region encodes Dep2-1b (See, FIG. 3B (SEQ ID NO:4)). SEQ ID NO:8 was assembled on the basis of a single EST (BE560698), from a Burkitt's lymphoma cell line.
[0303] The ESTs described in the preceding paragraph were used to assemble the 5' ends of the variant DEP2-1 transcripts. None of these ESTs contain the entire transcript sequence. In all cases, the 3' end of each of these transcripts is common to all these sequences as well as to LHPP as well as to some of the splice variants thereof and can be found in multiple ESTs. These ESTs are listed below in Table B.
TABLE-US-00002 TABLE B
GenBank Accession    Tissue Source (If Specified)
AA292585             Ovarian tumor
AA308083             Colon L KM12C (HCC) metastasis into mouse (liver)
AA379353             Skin
AA379626             Skin
AA635531             Normal prostate
AA669836             Bone Marrow stroma
AA677990             Liver and spleen
AA725685
AA912377             Dorsal root ganglion
AI086359             Pooled human melanocyte, fetal heart, and pregnant uterus
AI139589             Placenta
AI97875              Anaplastic oligodendroglioma
AI221142             Pooled
AI272203             Oligodendroglioma
AI361781             B-cell, chronic lymphotic leukemia
AI378120             B-cell, chronic lymphotic leukemia
AI420571             Prostate
AI439876             Lymphoma, follicular mixed small and large cell
AI475774             B-cell, chronic lymphotic leukemia
AI492345             Kidney
AI565021             Well-differentiated endometrial adenocarcinoma, 7 pooled tumors
AI582637             Kidney
AI598057             Adenocarcinoma
AI805094             Prostate
AI972592             Prostate
AW139347
AW301045             Moderately differentiated adenocarcinoma
AW511836             Moderately-differentiated endometrial adenocarcinoma, 3 pooled tumors
AW512618             Lymphoma, follicular mixed small and large cell
AW954516
BE675320             B-cell, chronic lymphotic leukemia
BF740030             Kidney
BF949274             Nervous_normal
BF986056             Placenta_normal
BF986061             Placenta_normal
BG027502             Osteosarcoma, cell line
BG150734             Normal epithelium
BI819728             Pooled brain, lung, testis
BQ644034             Hepatocellular carcinoma, cell line
BU539737             Adenocarcinoma, cell line
BU683755             Primary lung cystic fibrosis epithelial cells
BU753421             Placenta
BX096697             Well-differentiated endometrial adenocarcinoma, 7 pooled tumors
BX330376             Placenta
BX361464*            Placenta
BX423161             Adult brain
CD631211
CD631212
CK823134*^           Islets of Langerhans
CN265311             Embryonic stem cells, embryoid bodies derived from H1, H7 and H9 cells
N38886               Multiple sclerosis lesions
* Reverse direction EST sequence also present in the public domain.
^ Two additional sequences from the same clone are also present in the public domain.
[0304] It should be noted that the present invention also encompasses isolated nucleotide sequences (and the corresponding encoded proteins) having sequences comprising, corresponding to, identical to, or complementary to at least about 90% identity to SEQ ID NO:2. (All integers (and portions thereof) between 90% and 100% are also considered to be within the scope of the present invention with respect to percent identity.) For example, the present invention encompasses an isolated nucleic acid or fragment thereof comprising (a) a nucleotide sequence having at least 90% identity to SEQ ID NO:2; or (b) a complement comprising a nucleotide sequence having at least 90% identity to SEQ ID NO:2. Such sequences may be derived from any source, either isolated from a natural source, or produced via a semi-synthetic route, or synthesized de novo.
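By way of illustration only, percent identity between two sequences that have already been aligned can be computed as sketched below. This sketch makes assumptions that are not part of the description above: the sequences are supplied pre-aligned and of equal length, gaps are written as "-", and identity is expressed relative to the number of alignment columns in which at least one sequence has a residue. Other conventions for the denominator exist and give different values. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns that are gaps ('-') in both sequences are ignored; a gap
    opposite a residue counts as a non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical 10-nucleotide example with one substitution
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```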
[0305] The invention also includes a purified polypeptide that has at least about 90% amino acid similarity or identity to the amino acid sequences of SEQ ID NO:3 or SEQ ID NO:4 of the above-noted proteins which are, in turn, encoded by the above-described nucleic acid sequences.
[0306] The present invention also encompasses an isolated nucleic acid sequence which encodes a polypeptide having the amino acid sequence of SEQ ID NO: 3 or SEQ ID NO:4.
[0307] Once DEP2-1 or any naturally occurring variants thereof have been isolated, they may then be introduced into either a prokaryotic or eukaryotic host cell through the use of a vector or construct. The vector, for example, a bacteriophage, cosmid or plasmid, may comprise a nucleic acid sequence having a nucleotide sequence of SEQ ID NO:2, or nucleotides 352-771 or 812-1162 thereof, as well as any regulatory sequence (such as, but not limited to a promoter) which is functional in the host cell and is able to elicit expression of the protein encoded by the nucleotide sequence.
[0308] Alternatively, the vector may comprise a complement comprising a nucleotide sequence of SEQ ID NO:2 or nucleotides 352-771 or 812-1162 thereof, as well as any regulatory sequence. The regulatory sequence (for example, a promoter) is in operable association with, or operably linked to, the sequence of SEQ ID NO:2, or nucleotides 352-771 or 812-1162 thereof. Examples of promoters that can be used include the LTR or SV40 promoter, the E. coli lac or trp promoter, the phage lambda P_L promoter and other promoters known to those of skill in the art. Additionally, nucleic acid sequences which encode other proteins, oligosaccharides, lipids, etc. may also be included within the vector, as well as other regulatory sequences such as a polyadenylation signal (for example, the poly-A signal of SV40 T-antigen, ovalbumin or bovine growth hormone). The choice of sequences present in the construct is dependent upon the desired expression products as well as the nature of the host cell.
[0309] Once the vector has been constructed, it can be introduced (namely, transformed or transfected) into host cells, such as mammalian (such as, but not limited to, simian, canine, feline, bovine, equine, rodent, murine, etc.) or non-mammalian (such as, but not limited to, insect, reptile, fish, avian, etc.) cells, using any method known to those of skill in the art including, but not limited to, electroporation, calcium phosphate precipitation, DEAE dextran, lipofection, receptor mediated endocytosis, polybrene, particle bombardment, and microinjection. Alternatively, the vector can be delivered to the cell as a viral particle (either replication competent or deficient). Examples of viruses useful for the delivery of nucleic acid include, but are not limited to, lentiviruses, adenoviruses, adeno-associated viruses, retroviruses, herpesviruses, and vaccinia viruses. Other viruses suitable for delivery of nucleic acid sequences into cells that are known to those of skill in the art may be equivalently used in the present invention.
[0310] The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating the promoter sequences, selecting transfected cells, etc. The culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to those of skill in the art.
[0311] The engineered host cells containing the incorporated vector(s) can be identified using hybridization techniques that are well known to those of skill in the art or by using the polymerase chain reaction to amplify specific polynucleotide sequences. If the nucleic acid sequence transferred to the cells produces a protein that can be detected, for example, by means of an immunological or enzymatic assay, then the presence of recombinant protein can be confirmed by performing the assays either on the medium surrounding the cells or on cellular lysates.
Non-Human Transgenic Animals
[0312] In another embodiment, the present invention relates to non-human transgenic animals that contain the transcripts that arise from DEP2 as well as methods of making said animals. Specifically, the nucleic acid sequences that can be used in said non-human transgenic animals include: (a) LHPP (SEQ ID NO:9); (b) naturally occurring splice variants of LHPP (SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24 and SEQ ID NO:26); (c) DEP2-1 (SEQ ID NO:2); (d) naturally occurring splice variants of DEP2-1 (SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 or SEQ ID NO:8); (e) DEP2-2 (SEQ ID NO:28); (f) DEP2-3 (SEQ ID NO:30); (g) GenBank sequence AK127935 (SEQ ID NO:31); and (h) GenBank sequence AW867792 (SEQ ID NO:33).
[0313] A variety of methods can be used to create the non-human transgenic animals. For example, the generation of a specific alteration of a nucleic acid sequence of a target gene is one approach that can be used. Alterations can be accomplished by a variety of enzymatic and chemical methods used in vitro. One of the most common methods uses a specific oligonucleotide as a mutagen to generate precisely designed deletions, insertions and point mutations in a target gene. Secondly, a wildtype human gene or humanized non-human animal gene could be inserted by homologous recombination. It is also possible to insert an altered or mutated (singly or multiply) human gene as genomic or minigene constructs.
[0314] Additionally, non-human transgenic animals can also be made wherein at least one endogenous target gene is "knocked-out". The creation of knock-out animals allows those of skill in the art to assess in vivo function of the gene that has been "knocked-out". The knock-out of at least one target gene may be accomplished in a variety of ways. One strategy that can be used to "knock-out" a target gene is by the insertion of artificially modified fragments of the endogenous gene by homologous recombination. In this technique, mutant alleles are introduced by homologous recombination into embryonic stem ("ES") cells. The embryonic stem cells containing a knock out mutation in one allele of the gene being studied are introduced into a blastocyst. The resultant animals are chimeras containing tissues derived from both the transplanted ES cells and host cells. The chimeric animals are mated to assess whether the mutation is incorporated into the germ line. Those chimeric animals each heterozygous for the knock-out mutation are mated to produce homozygous knock-out mice. A second strategy that can be used to "knock-out" at least one gene involves using siRNA and shRNA and oocyte microinjection or transfection or microinjection into embryonic stem cells as described further herein.
[0315] The present invention contemplates that the somatic and germ cells of said non-human transgenic animal comprise an exogenous and stably transmitted nucleic acid sequence of SEQ ID NO:2 (DEP2-1). Additionally, the present invention further contemplates that the somatic and germ cells of the transgenic animals comprise an exogenous and stably transmitted nucleic acid sequence having a nucleotide sequence selected from the group consisting of: SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 and SEQ ID NO:33, with the proviso that its somatic and germ cells do not comprise an exogenous and stably transmitted nucleic acid having a nucleotide sequence of SEQ ID NO:2. The methods for creating such transgenic animals will be discussed in more detail below.
[0316] The present invention further contemplates non-human transgenic animals wherein a nucleic acid comprising a nucleotide sequence of SEQ ID NO:2 (DEP2-1) is knocked out in said animal. Additionally, the present invention contemplates a non-human transgenic animal wherein a nucleic acid having a nucleotide sequence selected from the group consisting of: SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 and SEQ ID NO:33 is knocked out, with the proviso that a nucleic acid sequence of SEQ ID NO:2 is not modified or altered. The methods for creating such "knock-out" animals will be described in more detail below.
[0317] To create a non-human transgenic animal containing an exogenous and stably transmitted nucleic acid of a target gene or other nucleic acid sequence, a nucleic sequence of interest can be inserted into a non-human animal germ line using standard techniques of oocyte microinjection or transfection or microinjection into embryonic stem cells. Alternatively, if it is desired to knock-out or replace an endogenous gene, homologous recombination using embryonic stem cells or siRNA or shRNA using oocyte microinjection or transfection or microinjection of embryonic stem cells can be used.
[0318] For oocyte injection, at least one nucleic acid sequence of interest that is operably linked to the promoter can be inserted into the pronucleus of a just-fertilized non-human animal oocyte. This oocyte is then reimplanted into a pseudopregnant foster mother. The liveborn non-human animal can then be screened for integrants by analyzing the animal's DNA (using polymerase chain reaction for example) such as from the tail, for the presence of the polynucleotide sequence of interest. Chimeric non-human animals are then identified. The nucleic acid can be a complete genomic sequence injected as a YAC or chromosome fragment, a cDNA, or a minigene containing the entire coding region and other elements found to be necessary for optimum expression.
[0319] Retroviral or lentiviral infection (See, Lois C, et al., Science, 295:868-872 (2002) (which teaches methods for transgenics using lentiviral transgenesis)) of early embryos can also be done to insert an altered gene. In this method, the altered gene is inserted into a retroviral vector which is used to directly infect mouse embryos during the early stages of development to generate a chimera, some of which will lead to germline transmission (Jaenisch, R., Proc. Natl. Acad. Sci. USA, 73: 1260-1264 (1976)).
[0320] Homologous recombination using embryonic stem cells allows for the screening of gene transfer cells to identify the rare homologous recombination events. Once identified, these can be used to generate chimeras by injection of at least one non-human animal blastocyst and a proportion of the resulting animals will show germline transmission from the recombinant line. This gene targeting methodology is especially useful if inactivation of the gene is desired. For example, inactivation of the gene can be done by designing a polynucleotide fragment which contains sequences from an exon flanking a selectable marker. Homologous recombination leads to the insertion of the marker sequences in the middle of an exon, inactivating the gene. DNA analysis of individual clones can then be used to recognize the homologous recombination events.
[0321] Alternatively, "knock-out" of a target gene can be accomplished using siRNA or shRNA. In one strategy, oocyte microinjection can be used as described herein. Specifically, at least one nucleic acid sequence of interest that expresses at least one RNA molecule that is siRNA or shRNA, and that is operably linked to at least one promoter (such as a RNA pol III dependent promoter), is prepared using the methods described herein. This nucleic acid is introduced into a non-human animal fertilized oocyte, preferably by injection. The fertilized oocyte is then allowed to develop into an embryo. The resulting embryo is then transferred into a pseudopregnant female non-human animal and then allowed to give birth. Liveborn non-human animals are then screened for chimeric animals that contain the nucleic acid by obtaining a sample and analyzing the animal's DNA (using techniques such as polymerase chain reaction) and such chimeric non-human animals are identified. When these non-human animals are treated with an inducing agent, transcription is induced, the siRNA or shRNA expressed, and the target gene is repressed or "knocked-out". In the absence of the inducing agent, the gene is not repressed or "knocked-out".
[0322] In a second strategy, microinjection of embryonic stem cells can be used as described herein. Specifically, at least one nucleic acid sequence of interest that expresses at least one RNA molecule that is siRNA or shRNA, and is operably linked to at least one RNA pol III dependent promoter sequence of the present invention, is prepared using the methods described herein. This nucleic acid is introduced into non-human animal embryonic stem cells which can be used to generate chimeras by introducing these embryonic stem cells, preferably by injection, into at least one non-human animal blastocyst. The resulting blastocyst is then implanted into a pseudopregnant female non-human animal, which is then allowed to give birth to a chimeric non-human animal. PCR can be used to identify the animals of interest. Liveborn non-human animals are then screened for chimeric animals that contain the nucleic acid by obtaining and analyzing a sample of said animal's DNA (using techniques such as polymerase chain reaction) and such chimeric non-human animals are identified. This chimeric non-human animal can then be used in breeding to produce a transgenic non-human animal that stably contains this nucleic acid within its genome. As with the previous method, when these non-human animals are treated with an inducing agent, transcription is induced, the siRNA or shRNA is expressed, and the target gene is repressed or "knocked-out". In the absence of the inducing agent, the gene is not repressed or "knocked-out".
[0323] Methods of making transgenic animals are described, for example, in Wall et al., J. Cell Biochem., 49(2):113-20 (1992); Hogan, et al., "Manipulating the mouse embryo", A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1992); in WO 91/08216 or U.S. Pat. No. 4,736,866 the disclosures of which are hereby incorporated by reference in their entirety.
Method of Modifying or Altering Expression of Nucleic Acid Molecules
[0324] In another embodiment, the present invention relates to methods of modifying or altering the expression of nucleic acid sequences. The present invention contemplates that the nucleic acid sequence whose expression is modified or altered is SEQ ID NO:2. The present invention further contemplates that the nucleic acid sequence whose expression is modified or altered is a nucleic acid having a nucleotide sequence of at least one of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33.
[0325] Methods for modifying or altering the expression of a nucleic acid sequence are well known to those skilled in the art. Specifically, said methods involve exposing a cell to, or administering to a subject (such as a transgenic non-human animal (for example, a transgenic non-human animal having at least one nucleic acid molecule knocked-out)), at least one nucleic acid molecule, wherein the cell or subject contains a nucleic acid whose expression is to be modified or altered. The methods described herein could be useful, for example, in transgenic non-human animals (such as transgenic non-human animals having at least one nucleic acid molecule knocked-out) that serve as animal models for major depression or a related disorder. Nucleic acid molecules such as antisense molecules, aptamers, triplexing agents, ribozymes, siRNA, or co-suppression (co-suppressor) RNA can be used in said methods.
[0326] An antisense molecule, aptamer, triplexing agent, ribozyme or siRNA is a DNA, RNA or chemically modified or hybrid sequence thereof of varying length that is single or double stranded. These nucleic acid molecules are complementary to a target nucleic acid sequence, such as a mRNA of a nucleic acid (a) having a nucleotide sequence of SEQ ID NO:2; or (b) of at least one of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33, and the target can be a coding sequence, a polynucleotide sequence comprising an intron-exon junction, a regulatory sequence, such as a promoter sequence, or the like. The degree of complementarity is such that the nucleic acid molecule can interact specifically with the target nucleic acid sequence in a cell. Depending on the total length of the nucleic acid molecule, one or a few mismatches with respect to the target nucleic acid sequence can be tolerated without losing the specificity of the nucleic acid molecule for the target sequence. Thus, few mismatches, if any, would be tolerated in an antisense molecule containing, for example, 20 consecutive nucleotides, whereas several mismatches will not affect the hybridization efficiency of an antisense molecule that is complementary to the full length of a target mRNA encoding a protein (such as Dep2-1a, Dep2-1b, Dep2-2, Dep2-4 or Dep2-5). The number of mismatches that can be tolerated can be estimated using well known formulas for determining hybridization kinetics (See, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition (1989)) or can be determined empirically using methods known in the art, particularly by determining that the presence of the antisense molecule, aptamer, triplexing agent, ribozyme or siRNA in a cell modifies or alters (such as by decreasing) the level of expression of the target sequence in a cell.
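As a simple illustration of counting mismatches between an antisense molecule and its target site, the following sketch compares an antisense RNA with the reverse complement of a target RNA segment of the same length. It is not a model of hybridization kinetics and does not say how many mismatches are tolerable. The function names and the 10-nucleotide sequences are hypothetical and are not taken from any sequence of the present invention.

```python
# Watson-Crick complement table for RNA (A<->U, C<->G)
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(antisense: str, target_site: str) -> int:
    """Count positions where the antisense strand fails to pair with the target.

    Both sequences are written 5'->3' and must be the same length.
    """
    perfect = reverse_complement(target_site)
    if len(antisense) != len(perfect):
        raise ValueError("antisense and target site must be the same length")
    return sum(1 for a, p in zip(antisense.upper(), perfect) if a != p)


target = "AUGGCCAUUG"                           # hypothetical target mRNA segment
print(reverse_complement(target))               # CAAUGGCCAU
print(count_mismatches("CAAUGGCCAU", target))   # 0
print(count_mismatches("CAAUGGCGAU", target))   # 1
```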
[0327] A nucleic acid molecule useful as an antisense molecule, aptamer, triplexing agent, ribozyme or siRNA can reduce or inhibit translation or cleave a target nucleic acid, thereby reducing or inhibiting the amount of the protein encoded by said target nucleic acid in a cell. For example, an antisense molecule can bind to an mRNA to form a double stranded molecule that cannot be translated in a cell. Antisense oligonucleotides of about 15 to 50 consecutive nucleotides are preferred since they are easily synthesized and can hybridize specifically with a target nucleic acid, although longer antisense molecules can be used. When the antisense molecule is contacted directly with a target cell, it can be operatively associated with a chemically reactive group such as, but not limited to, iron-linked EDTA, which cleaves a target RNA at the site of hybridization. A triplexing agent, in comparison, can stall transcription (Maher et al., Antisense Res. Devel., 1:227 (1991); Helene, Anticancer Drug Design, 6:569 (1991)). Aptamers adopt highly specific three-dimensional conformations that enable them to bind to a specific location on a molecule whose activity is being affected. Methods for making antisense molecules, aptamers and triplexing agents are well known in the art.
[0328] A ribozyme is a catalytic RNA molecule that cleaves RNA in a sequence-specific manner. Ribozymes that cleave themselves are called cis-acting ribozymes, while ribozymes that cleave other RNA molecules are called trans-acting ribozymes. Nucleic acid molecules can encode ribozymes designed to cleave particular mRNA transcripts, thus preventing expression of a polypeptide. A ribozyme sequence can have a sequence from a hammerhead, axehead, or hairpin ribozyme, and may be modified to have either slow cleavage activity or enhanced cleavage activity. For example, nucleotide substitutions can be made to modify cleavage activity (see, e.g., Doudna and Cech, Nature, 418:222-228 (2002)). Hammerhead ribozymes are useful for destroying particular mRNAs, although various ribozymes that cleave mRNA at site-specific recognition sequences can be used. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. The sole requirement is that the target RNA contain a 5'-UG-3' nucleotide sequence. The construction and production of hammerhead ribozymes are known in the art. See, for example, U.S. Pat. No. 5,254,678. Hammerhead ribozyme sequences can be embedded in a stable RNA such as a transfer RNA (tRNA) to increase cleavage efficiency in vivo (Perriman, R. et al., Proc. Natl. Acad. Sci. USA, 92(13):6175-6179 (1995); de Feyter, R. and Gaudron, J., Methods in Molecular Biology, Vol. 74, Chapter 43, "Expressing Ribozymes in Plants", Edited by Turner, P. C., Humana Press Inc., Totowa, N.J.).
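The stated requirement that the target RNA contain a 5'-UG-3' sequence can be illustrated by a simple scan of a target sequence for that dinucleotide. The sketch below only locates candidate positions under that stated requirement; it does not design flanking regions or assess whether a site is accessible. The function name and the example sequence are hypothetical.

```python
def find_ug_sites(mrna: str) -> list[int]:
    """Return 1-based positions of the U in every 5'-UG-3' dinucleotide.

    DNA-style input is accepted; T is read as U.
    """
    seq = mrna.upper().replace("T", "U")
    return [i + 1 for i in range(len(seq) - 1) if seq[i:i + 2] == "UG"]


# Hypothetical 9-nucleotide target
print(find_ug_sites("AUGGCUUGA"))  # [2, 7]
```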
[0329] siRNA useful in the present invention can be obtained, for example, using an in vitro transcription system or can be synthesized chemically, and can be contacted with cells (or administered to a subject) as RNA molecules. siRNA also can be expressed from an encoding nucleic acid, which can be contacted with cells (or administered to a subject). siRNAs can be designed using techniques well known to those skilled in the art.
[0330] Another nucleic acid molecule that is useful in the present methods is a co-suppression RNA that reduces or inhibits transcription of a target nucleic acid, such as a nucleic acid (a) having a nucleotide sequence of SEQ ID NO:2; or (b) of at least one of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO: 22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33. A co-suppressor RNA, like an siRNA, comprises (or encodes) an RNA comprising an inverted repeat, which includes a first oligonucleotide that selectively hybridizes to the target nucleic acid or gene and, in operative linkage, a second oligonucleotide that is complementary and in a reverse orientation to the first oligonucleotide. In comparison to an siRNA, which comprises a functional portion of a transcribed region of the target nucleic acid or target gene and reduces or inhibits translation of RNA transcribed from the nucleic acid or gene, a co-suppressor RNA comprises a functional portion of a transcriptional regulatory region of the target nucleic acid or gene (namely, a promoter region) and reduces or inhibits transcription of the nucleic acid or gene. Methods for making co-suppression RNA are well known in the art.
Polymorphism Detection/Genotyping
[0331] In another embodiment, the present invention relates to methods of genotyping one or more subjects. The information obtained from the genotyping of subjects can be used in a variety of different ways. For example, the genotyping of subjects can be used to diagnose those subjects suffering from major depression or a related disorder or at risk of developing major depression or a related disorder, provide a prognosis for or predict or diagnose a response to treatment for a subject suffering from major depression or a related disorder, or identify subjects for selection or inclusion in a clinical trial for treating major depression or a related disorder. Additionally, genotypes can be used to analyze the results of a clinical trial for subjects being treated for major depression or a related disorder. Specifically, the relationship between the genotypes of subjects and the clinical outcome of said subjects can be determined.
[0332] Genotyping involves obtaining a test sample from said subject(s). The subject may or may not be experiencing any symptoms of major depression or a related disorder at the time the test sample is obtained. In this embodiment, a test sample is any biological sample which contains the DNA of the subject. Test samples can be prepared using techniques well known to those skilled in the art such as by obtaining a specimen from an individual and, if necessary, disrupting any cells contained therein to release DNA. Examples of test samples include, but are not limited to, whole blood, serum, plasma, cerebrospinal fluid, sputum, bronchial washing, bronchial aspirates, urine, lymph fluids, and various external secretions of the respiratory, intestinal and genitourinary tracts, tears, saliva, milk, white blood cells, myelomas and the like.
[0333] Once the test sample(s) is obtained, it is analyzed, using routine techniques known in the art, in order to determine the presence or absence of specific sequences (alleles) for: (a) at least one polymorphic site in nucleotides 1 to 316 of SEQ ID NO:2; (b) a T-C polymorphism at position (nucleotide) 136 of SEQ ID NO:2; (c) an A-G polymorphism at position 210 of SEQ ID NO:2; (d) a G-A polymorphism at position 242 of SEQ ID NO:2; (e) at least one polymorphic site selected from nucleotides 77402 and 79906 of SEQ ID NO:1; (f) at least one polymorphic site in SEQ ID NO:1; or (g) any combinations of (a)-(f). Additionally, the test sample may optionally be further analyzed for a C-G polymorphism at position -1019 of a human serotonin receptor 1A gene ("HTR1A"). For example, the identification of at least one polymorphic site at nucleotide 38048, 77402 or 79906 in SEQ ID NO:1 in combination with the identification of a C-G polymorphism at position -1019 of a human serotonin receptor 1A gene in a test sample obtained from a subject may indicate that the subject is at risk of developing major depression or a related disorder.
[0334] The genotype of the subject can be determined based on the combination of sequences present at one or more polymorphic sites. Once the genotype of the subject has been determined, then further determinations can be made, such as diagnosing whether the subject has major depression or a related disorder or is at risk of developing major depression or a related disorder, providing a prognosis for or predicting the response to treatment for a subject having major depression or a related disorder, determining whether the subject should be selected for inclusion in a clinical trial for treatment of major depression or a related disorder, or analyzing the relationship between genotypes of subjects and their clinical outcome. Additionally, if the test sample is also analyzed for the presence of sequences at a C-G polymorphism at position -1019 in the HTR1A gene, then the genotype(s) at one or more polymorphic sites in DEP2 may be used in combination with the genotype at HTR1A to make further determinations, as elaborated above.
[0335] As briefly discussed above, techniques for identifying the presence or absence of at least one sequence (allele) at a polymorphic site in a test sample are well known in the art and include, but are not limited to, direct sequencing, amplification, fragment length polymorphism assays, mobility based assays, hybridization assays and mass spectroscopy. These techniques will be discussed briefly below.
Direct Sequencing
[0336] The presence or absence of a sequence at a polymorphic site may be determined by direct nucleotide sequencing. Methods for direct sequencing are known in the art. For example, following amplification of the DNA from the test sample, the DNA can be sequenced using manual sequencing techniques, such as those that employ radioactive marker nucleotides, or by automated sequencing. The results of the sequencing can be displayed using any suitable method known in the art. The sequence is examined and the presence or absence of a given sequence at a polymorphic site is determined.
Amplification
[0337] The presence or absence of a sequence at a polymorphic site may be determined using amplification techniques, such as PCR. PCR involves the use of primers to amplify a region of a DNA sequence from the test sample containing the polymorphic site of interest. The design of primers is well known to those skilled in the art. For example, primers can be designed that hybridize only to a portion of SEQ ID NO:1 or a portion of SEQ ID NO:2 (hereinafter "the wildtype"). If these wildtype primers result in a PCR product, then the subject has the wildtype allele (namely, SEQ ID NO:1 or SEQ ID NO:2). Similarly, primers can be designed that hybridize only to a portion of SEQ ID NO:1 or a portion of SEQ ID NO:2 containing a variant sequence at one or more polymorphic sites (hereinafter "the variant"). If these variant primers result in a PCR product, then the subject has the variant allele (namely, SEQ ID NO:1 or SEQ ID NO:2). The presence of an amplification product only when wildtype primers are used, or only when variant primers are used, indicates a homozygous wildtype or homozygous variant genotype, respectively. The presence of an amplification product with both the wildtype primers and the variant primers indicates a heterozygous genotype. Amplification methods other than PCR can be used. Such methods include strand displacement, the Qβ replicase system, the repair chain reaction, ligase chain reaction, rolling circle amplification and ligation activated transcription.
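The interpretation of the two allele-specific amplification reactions can be sketched as a simple decision rule. In the following illustration, the two inputs indicate whether a PCR product was observed with the wildtype primers and with the variant primers, respectively. The function name and the "no call" outcome for a sample that yields no product in either reaction are assumptions added for completeness.

```python
def call_genotype(wildtype_product: bool, variant_product: bool) -> str:
    """Interpret allele-specific PCR results at one polymorphic site."""
    if wildtype_product and variant_product:
        return "heterozygous"
    if wildtype_product:
        return "homozygous wildtype"
    if variant_product:
        return "homozygous variant"
    # Neither reaction amplified (assumed handling, e.g. a failed reaction)
    return "no call"


print(call_genotype(True, False))   # homozygous wildtype
print(call_genotype(False, True))   # homozygous variant
print(call_genotype(True, True))    # heterozygous
```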
Fragment Length Polymorphism Assays
[0338] The presence or absence of sequences at a polymorphic site may be determined using a fragment length polymorphism assay. In a fragment length polymorphism assay, a unique DNA banding pattern based on cleaving the DNA at a series of positions is generated using an enzyme (such as, but not limited to, a restriction endonuclease). DNA fragments from the test sample containing a variant sequence will have a different banding pattern than DNA fragments generated from the wildtype.
[0339] For example, sequences at a polymorphic site can be detected using a restriction fragment length polymorphism assay ("RFLP"). The region of interest in the DNA is first isolated using PCR. The PCR products are then cleaved with restriction enzymes known to give a unique length fragment for a given variant sequence. The restriction-enzyme digested PCR products are separated and detected (such as by gel electrophoresis) and visualized (such as, but not limited to, by ethidium bromide staining). The length of the fragments is compared to molecular weight markers or fragments generated from wildtype and variant controls (for example, vectors containing the wildtype and variant sequences, respectively).
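The principle of the RFLP assay can be sketched in silico as follows. This is a minimal illustration, not the described laboratory procedure: the two amplicon sequences are invented toy sequences (not taken from SEQ ID NO:1 or SEQ ID NO:2), only one strand of a linear fragment is considered, and the EcoRI recognition site (GAATTC, cut after the first G) is used purely as a familiar example.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position inside the recognition site at which
    the enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 46-base amplicons differing by one base inside the site.
wildtype = "ACGT" * 5 + "GAATTC" + "TTGCA" * 4
variant = "ACGT" * 5 + "GAGTTC" + "TTGCA" * 4

print(digest(wildtype, "GAATTC", 1))  # [21, 25]
print(digest(variant, "GAATTC", 1))   # [46]
```

Here the variant destroys the recognition site, so the variant amplicon remains uncut while the wildtype amplicon yields two shorter fragments, giving distinguishable banding patterns.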
[0340] Sequences (alleles) at a polymorphic site can also be detected using a CLEAVASE fragment length polymorphism assay ("CFLP"; Third Wave Technologies, Madison, Wis.; See, U.S. Pat. No. 5,888,780). This assay is based on the observation that when single strands of DNA fold on themselves, they assume higher order structures that are highly individual to the precise sequence of the DNA molecule. These secondary structures involve partially duplexed regions of DNA such that single stranded regions are juxtaposed with double stranded DNA hairpins. The CLEAVASE I enzyme is a structure-specific, thermostable nuclease that recognizes and cleaves the junctions between these single-stranded and double-stranded regions.
[0341] The region of interest is first isolated using routine techniques known in the art, such as by PCR. Next, DNA strands are separated by heating. The reactions are cooled to allow intrastrand secondary structure to form. The PCR products are then treated with the CLEAVASE I enzyme to generate a series of fragments that are unique to a given wildtype or variant sequence. The CLEAVASE enzyme treated PCR products are separated and detected (such as by gel electrophoresis) and visualized (such as, but not limited to, by ethidium bromide staining). The length of the fragments is compared to molecular weight markers or fragments generated from wildtype and variant controls.
Mobility Based Assays
[0342] The presence or absence of a sequence (allele) at a polymorphic site may be determined by a single strand conformation polymorphism assay ("SSCP"). In this technique, PCR products from the region to be tested are heat denatured and rapidly cooled to avoid the reassociation of complementary strands. The single strands then form sequence dependent conformations that influence electrophoretic mobility. The different mobilities can then be analyzed by electrophoresis.
[0343] Alternatively, the assessment of a polymorphism may be by a heteroduplex assay. In this analysis, the DNA sequence to be tested is amplified, denatured and renatured to itself or to known wildtype DNA (namely, from SEQ ID NO:1 or SEQ ID NO:2). Heteroduplexes between different alleles contain DNA "bubbles" at mismatched basepairs that can affect electrophoretic mobility. Therefore, electrophoresis can be used to indicate the presence or absence of wildtype and variant sequences.
Hybridization Assays
[0344] The presence or absence of a sequence (allele) at a polymorphic site can be detected in a hybridization assay. In a hybridization assay, the presence or absence of a given sequence (allele) is determined based on the ability of the DNA from the test sample to hybridize to a complementary DNA molecule (such as, but not limited to, a probe). The hybridization of a probe to DNA from the test sample is subsequently detected. Detection of hybridization only to a wildtype probe, or only to a variant probe, indicates a homozygous wildtype or variant genotype, respectively. Detection of hybridization to both wildtype and variant probes indicates a heterozygous genotype. A number of hybridization assays using a variety of technologies for hybridization and detection are available. Examples of some of these assays are provided below.
Solution Based Detection
[0345] The presence or absence of polymorphisms can be determined using any solution based detection techniques known in the art. An example of such a technique that can be used is TaqMan® (Applied Biosystems, Foster City, Calif.; see, Holland et al., Proc. Natl. Acad. Sci. USA 88:7276-7280 (1991); and Gelmini et al., Clin. Chem. 43:752-758 (1997)). TaqMan® allows for the real-time quantification of PCR. TaqMan® probes are widely commercially available, and the TaqMan® system (Applied Biosystems) is well known in the art. TaqMan® probes anneal between the upstream and downstream primer in a PCR reaction. They contain a 5'-fluorophore and a 3'-quencher. During amplification the 5'-3' exonuclease activity of the Taq polymerase cleaves the fluorophore off the probe. Since the fluorophore is no longer in close proximity to the quencher, the fluorophore is able to fluoresce. The resulting fluorescence may be measured, and is in direct proportion to the amount of target sequence that is being amplified.
[0346] Another technique that can be used is a Molecular Beacon (See, Tyagi et al., Nat. Biotechnol. 14:303-308 (1996); and Tyagi et al., Nat. Biotechnol. 16:49-53 (1998)). The beacons are hairpin-shaped probes with an internally quenched fluorophore whose fluorescence is restored when the probe binds to its target. The loop portion acts as the probe while the stem is formed by complementary "arm" sequences at the ends of the beacon. A fluorophore and quenching moiety are attached at opposite ends, the stem keeping each of the moieties in close proximity, causing the fluorophore to be quenched by energy transfer. When the beacon detects its target, it undergoes a conformational change forcing the stem apart, thus separating the fluorophore and quencher. This disrupts the energy transfer and restores fluorescence. Any suitable fluorophore known in the art can be used. For example, fluorophores that can be used include, but are not limited to, FAM, HEX®, NED®, ROX® and Texas Red®. Quenchers that can be used include, but are not limited to, Dabcyl and TAMRA.
[0347] Another technique that can be used is Pyrosequencing® (Pyrosequencing, Inc. Westborough, Mass.). In the first step of this technique, a primer is hybridized to a single stranded, PCR-amplified DNA template in the presence of DNA polymerase, ATP sulfurylase, luciferase and apyrase enzymes and the adenosine 5' phosphosulfate ("APS") and luciferin substrates. In the second step, the first of four deoxynucleotide triphosphates ("dNTP") is added to the reaction and the DNA polymerase catalyzes the incorporation of the deoxynucleotide triphosphate into the DNA strand, if it is complementary to the base in the template strand. Each incorporation event is accompanied by release of pyrophosphate ("PPi") in a quantity equimolar to the amount of incorporated nucleotide. In the last step, the ATP sulfurylase quantitatively converts PPi to ATP in the presence of adenosine 5'-phosphosulfate. This ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin that generates visible light in amounts that are proportional to the amount of ATP. The light produced in the luciferase-catalyzed reaction is detected by a charge coupled device ("CCD") camera and seen as a peak in a Pyrogram®. Each light signal is proportional to the number of nucleotides incorporated.
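The relationship between nucleotide incorporation and peak height can be illustrated with a simplified simulation. This sketch is an assumption-laden model, not the commercial software: it treats each peak height as exactly equal to the number of nucleotides incorporated, ignores enzyme kinetics and signal noise, and uses an invented template and dispensation order.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrogram(template: str, dispensation: str) -> list[int]:
    """Return idealized peak heights for each dispensed nucleotide.

    `template` lists the template bases in the order the polymerase
    reads them, starting immediately downstream of the primer.
    """
    position = 0
    peaks = []
    for nucleotide in dispensation:
        incorporated = 0
        # The polymerase extends through every consecutive template base
        # that is complementary to the dispensed nucleotide.
        while (position < len(template)
               and COMPLEMENT[template[position]] == nucleotide):
            incorporated += 1
            position += 1
        peaks.append(incorporated)
    return peaks


# Hypothetical template read as T, T, G, C, A; dispensation order T, A, C, G, T.
print(pyrogram("TTGCA", "TACGT"))  # [0, 2, 1, 1, 1]
```

The first dispensation gives no signal because T is not complementary to the first template base, and the second gives a double-height peak because two adjacent template bases accept the same nucleotide.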
Detection of Hybridization Using Reverse Solid Phase Detection
[0348] The presence or absence of polymorphisms can also be determined using reverse solid phase detection, such as, but not limited to, a microarray, such as a DNA chip assay. In a DNA chip assay, a series of probes are affixed to a solid support. The probes are designed to be unique to a given polymorphism. The DNA obtained from the test sample is contacted with the DNA "chip" and hybridization is detected. Any DNA "chip" assay known in the art can be used in the methods of the present invention. For example, the DNA chip assay can be a GeneChip assay (Affymetrix, Santa Clara, Calif.; See, U.S. Pat. No. 6,045,996). The GeneChip technology uses miniaturized, high-density arrays of probes affixed to a "chip." Alternatively, a DNA microchip containing electronically captured probes (Nanogen, San Diego, Calif.; See, U.S. Pat. No. 6,068,818) can be used. A "bead array" can also be used (Illumina, San Diego, Calif.; See WO 99/67641 and WO 00/39587). Illumina uses a BEAD ARRAY technology that combines fiber optic bundles and beads that self-assemble into an array.
Solid Phase Detection
[0349] In solid phase detection, hybridization of a probe to the sequence of interest, such as a polymorphism, is detected directly by visualizing a bound probe by using Southern blotting. In this technique, genomic DNA is isolated from a subject. The DNA is then cleaved with a series of restriction enzymes that cleave infrequently in the genome and not near any of the markers being assayed. The DNA is then separated (such as, but not limited to, by agarose gel electrophoresis) and transferred to a membrane. At least one probe which has been labeled with, for example, a radioactive, fluorescent or enzymatic label, specific for the polymorphism being detected is allowed to contact the membrane under conditions of low, medium, or high stringency. Unbound probe is removed and the presence of binding is detected by visualizing the labeled probe.
Enzymatic Detection of Hybridization
[0350] The presence or absence of polymorphisms can be detected using an assay that detects hybridization by enzymatic cleavage of specific structures ("INVADER assay", Third Wave Molecular Diagnostics, Madison, Wis.; See, U.S. Pat. No. 6,001,567). The INVADER assay detects specific DNA and RNA sequences by using structure-specific enzymes to cleave a complex formed by the hybridization of overlapping probes. Elevated temperature and an excess of one of the probes enable multiple probes to be cleaved for each target sequence present without temperature cycling. These cleaved probes then direct cleavage of a second labeled probe. The secondary probe can be 5'-end labeled with a fluorophore (such as, but not limited to, fluorescein) that is quenched by an internal dye. Upon cleavage, the dequenched fluorescein labeled product may be detected using a standard fluorescence plate reader.
[0351] The INVADER assay detects specific mutations and SNPs in unamplified genomic DNA. The isolated DNA sample is contacted with the first probe specific either for a SNP or wild type sequence and allowed to hybridize. Then a secondary probe, specific to the first probe, and containing the fluorescein label, is hybridized and the enzyme is added. Binding is detected using a fluorescent plate reader and comparing the signal of the test sample to known positive and negative controls.
[0352] Hybridization of a bound probe can be detected using a TaqMan® assay using the techniques described previously herein.
[0353] In still further embodiments, polymorphisms are detected using any single base extension ("SBE") methods known in the art (See U.S. Pat. Nos. 5,888,819 and 6,004,744). For example, a shifted termination assay ("STA") can be performed. The STA method involves designing a detection primer that is complementary to a target DNA. The detection primer is labeled with any detectable label known in the art. The 3'-terminus of the detection primer ends at the base just before the target base. The detection primer hybridizes to the target nucleic acid sequence. When performing a primer extension reaction, if the first base is the target base, the primer extension reaction will be terminated at the target base position without incorporating any of the labeled nucleotides. No color reaction will be detected. If the target base is changed by any type of mutation, including point mutation (SNP), deletion, insertion, and translocation, the primer extension reaction will continue through the target base position, and multiple labeled nucleotides will be incorporated into the extended detection primer. A strong color reaction will be observed. An STA can be performed on a DNA sequence or using fluorescence polarization.
[0354] Another SBE that can be performed is a SNP-IT primer extension assay (Orchid Biosciences, Princeton, N.J.; See, U.S. Pat. No. 5,952,174). In this assay, SNPs are identified using a specially synthesized DNA primer and a DNA polymerase to selectively extend the DNA chain by one base at the suspected SNP location. DNA in the region of interest is amplified and denatured. PCR is then performed using miniaturized systems called microfluidics. Detection is accomplished by adding a label to the nucleotide suspected of being at the polymorphic site. Incorporation of the label into the DNA can be detected by any method known in the art.
Mass Spectroscopy
[0355] The presence or absence of polymorphisms can be detected using a MassARRAY system (Sequenom, San Diego, Calif.; See, U.S. Pat. No. 6,043,031). DNA is isolated from test samples using routine procedures known to those skilled in the art. Next, specific DNA regions containing the polymorphism of interest, about 200 base pairs in length, are amplified by PCR. The amplified fragments are then attached by one strand to a solid surface and the non-immobilized strands are removed by standard denaturation and washing. The remaining immobilized single strand then serves as a template for automated enzymatic reactions that produce genotype specific diagnostic products.
[0356] Very small quantities of the enzymatic products, typically five to ten nanoliters, are then transferred to a SpectroCHIP array for subsequent automated analysis with the SpectroREADER mass spectrometer. Each spot is preloaded with light absorbing crystals that form a matrix with the dispensed diagnostic product. The MassARRAY system uses MALDI-TOF (Matrix Assisted Laser Desorption Ionization-Time of Flight) mass spectrometry. In a process known as desorption, the matrix is hit with a pulse from a laser beam. Energy from the laser beam is transferred to the matrix and it is vaporized, resulting in a small amount of the diagnostic product being expelled into a flight tube. Because the diagnostic product is charged, it is launched down the flight tube towards a detector when an electrical field pulse is subsequently applied to the tube. The time between application of the electrical field pulse and collision of the diagnostic product with the detector is referred to as the time of flight. This is a very precise measure of the product's molecular weight, as time of flight increases with a molecule's mass, with smaller molecules flying faster than larger molecules. The entire assay is completed in less than 0.0001 second, enabling samples to be analyzed in a total of 3-5 seconds including repetitive data collection. The SpectroTYPER software then calculates, records, compares and reports the genotypes at the rate of three seconds per sample.
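The dependence of time of flight on mass can be illustrated with an idealized calculation. This sketch assumes a simple linear time-of-flight model in which an ion of mass m and charge ze is accelerated through a potential V and then drifts a distance L, so that zeV = (1/2)mv^2 and t = L / v. The accelerating voltage and tube length used as defaults are hypothetical values chosen for illustration and are not specifications of the MassARRAY system.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs
DALTON = 1.66053906660e-27            # kilograms


def time_of_flight(mass_da: float, charge: int = 1,
                   voltage: float = 20_000.0, length_m: float = 1.0) -> float:
    """Idealized drift time in seconds for an ion in a linear flight tube.

    `voltage` and `length_m` are hypothetical instrument settings.
    """
    mass_kg = mass_da * DALTON
    # Kinetic energy gained in the field: z * e * V = 0.5 * m * v**2
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE * voltage / mass_kg)
    return length_m / velocity


light = time_of_flight(5_000.0)
heavy = time_of_flight(20_000.0)
print(light < heavy)   # True: the lighter ion arrives first
print(heavy / light)   # approximately 2, since t scales with sqrt(m)
```

Under this model, time of flight scales with the square root of the mass-to-charge ratio, so a fourfold increase in mass doubles the flight time.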
Kits
[0357] The present invention also provides kits that enable or allow for the detection of a genotype of one or more subjects. These kits are useful for diagnosing those subjects suffering from major depression or a related disorder or at risk of developing major depression or a related disorder, providing a prognosis for or predicting a response to treatment for a subject suffering from major depression or a related disorder, identifying subjects for selection or inclusion in a clinical trial for treating major depression or a related disorder, or for analyzing the relationship between genotypes of subjects being treated for major depression or a related disorder and their clinical outcome.
[0358] The kits can be produced in a variety of ways. For example, the kits contain at least one reagent useful for detecting (a) at least one polymorphic site in SEQ ID NO:1; (b) at least one polymorphic site in nucleotides 1 to 316 of SEQ ID NO:2; (c) a T-C polymorphism at position 136 of SEQ ID NO:2; (d) an A-G polymorphism at position 210 of SEQ ID NO:2; (e) a G-A polymorphism at position 242 of SEQ ID NO:2; (f) at least one polymorphic site in SEQ ID NO:1; (g) a polymorphic site in nucleotide 77402 of SEQ ID NO:1; (h) a polymorphic site in nucleotide 79906 in SEQ ID NO:1; or (i) any combinations of (a)-(h). Additionally, any of the kits described above in (a)-(i) can further contain at least one reagent useful for detecting a C-G polymorphism at position -1019 in a human serotonin receptor 1A gene. Examples of the at least one reagent that can be included in the kits described herein are one or more primers for amplifying the region of DNA containing the polymorphic site or one or more probes that bind to or near the polymorphic site. In addition, the kits can further contain (a) instructions for determining the genotype of a subject; (b) ancillary reagents such as buffering agents, nucleic acid stabilizing reagents, protein stabilizing reagents, and signal producing systems (such as, but not limited to, fluorescence generating systems); or (c) positive and/or negative control(s). The kit may be packaged in any suitable manner, typically with the elements in a single container or various containers as necessary.
RNA and Protein Detection and Quantification Assays
[0359] In another embodiment, the present invention relates to methods for detecting or quantifying mRNA or protein in a test sample obtained from one or more subjects. The information obtained by detecting or quantifying mRNA or protein in a test sample obtained from a subject can be used in a variety of different ways. For example, the presence, absence or amount of mRNA or protein detected or quantified in subjects can be used to diagnose those subjects suffering from major depression or a related disorder or at risk of developing major depression or a related disorder, to provide a prognosis for or predict or diagnose a response to treatment for a subject suffering from major depression or a related disorder, or to identify subjects for selection or inclusion in a clinical trial for treating major depression or a related disorder. Additionally, the presence, absence or amount of mRNA or protein can be used to analyze the results of a clinical trial for subjects being treated for major depression or a related disorder. Specifically, the relationship between the presence, absence or amount of the mRNA or protein detected or quantified in the test samples and the clinical outcome of said subjects can be determined.
[0360] The methods described herein involve obtaining a test sample from said subject(s). The subject may or may not be experiencing any symptoms of major depression or a related disorder at the time the test sample is obtained. In this embodiment, a test sample is any biological sample which contains the RNA or protein of the subject. Test samples can be prepared using techniques well known to those skilled in the art such as by obtaining a specimen from an individual and, if necessary, disrupting any cells contained therein to release RNA or protein. Examples of test samples include, but are not limited to, whole blood, serum, plasma, cerebrospinal fluid, sputum, bronchial washing, bronchial aspirates, urine, lymph fluids, and various external secretions of the respiratory, intestinal and genitourinary tracts, tears, saliva, milk, white blood cells, myelomas and the like.
[0361] Once the test sample(s) is obtained, it is analyzed, using routine techniques known in the art, in order to determine or quantify the presence, absence or amount of: (a) at least one mRNA which comprises nucleotides 1 to 316 of SEQ ID NO:2; (b) at least one mRNA transcribed from SEQ ID NO:1; (c) at least one protein having an amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4; (d) at least one polypeptide translated from SEQ ID NO:1; or (e) any combinations of (a)-(d). Additionally, the test sample may optionally be further analyzed for the presence, absence or amount of mRNA transcribed from the HTR1A gene or a polypeptide translated from the HTR1A gene.
[0362] As discussed above, once a test sample is obtained, it can be analyzed, using routine techniques known in the art for the presence, absence or amount of at least one mRNA transcribed from SEQ ID NO:1. Examples of mRNAs transcribed from SEQ ID NO:1 include, but are not limited to, SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33. Alternatively, the test sample can be analyzed for the presence, absence or amount of at least one polypeptide translated from SEQ ID NO:1. Examples of polypeptides translated from SEQ ID NO:1 include, but are not limited to, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:32 and SEQ ID NO:34.
[0363] Once the presence, absence or amount of a mRNA or a protein as specified in (a)-(e) above has been determined or quantified in a test sample, then further determinations can be made, such as, diagnosing whether the subject has major depression or a related disorder or is at risk of developing major depression or a related disorder, providing a prognosis for or predicting the response to treatment for a subject having major depression or a related disorder, determining whether the subject should be selected for inclusion in a clinical trial for treatment of major depression or a related disorder, or analyzing the relationship between the frequency of presence or relative amounts of at least one mRNA or polypeptide in subjects, and their clinical outcome. Additionally, if the test sample is further analyzed for the presence, absence or amount of mRNA transcribed from the HTR1A gene or a polypeptide translated from the HTR1A gene, the information pertaining to mRNA(s) or polypeptide(s) transcribed or translated from DEP2 may be used in combination with the information pertaining to mRNA(s) or polypeptide(s) transcribed or translated from HTR1A to make further determinations, as elaborated above.
[0364] For example, a test sample can be obtained from a subject. The test sample can then be analyzed using routine techniques known in the art, in order to determine or quantify the presence, absence or amount of (a) at least one mRNA which comprises nucleotides 1 to 316 of SEQ ID NO:2; (b) at least one mRNA transcribed from SEQ ID NO:1; (c) at least one protein having an amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4; (d) at least one polypeptide translated from SEQ ID NO:1; or (e) any combinations of (a)-(d). If, for example, the presence of (a) at least one mRNA which comprises nucleotides 1 to 316 of SEQ ID NO:2; (b) at least one mRNA transcribed from SEQ ID NO:1; (c) at least one protein having an amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4; (d) at least one polypeptide translated from SEQ ID NO:1; or (e) any combinations of (a)-(d) is detected, then a diagnosis can be made for said subject related to major depression or a related disorder or related to risk of developing major depression or a related disorder. This information can also be useful for providing a prognosis for or predicting the response to treatment for a subject already diagnosed as suffering from major depression or a related disorder. Moreover, this information can be used to determine whether or not the subject should or could be selected for inclusion in a clinical trial for treatment of major depression or a related disorder. Further, the frequency of presence of: (a) at least one mRNA which comprises nucleotides 1 to 316 of SEQ ID NO:2; (b) at least one mRNA transcribed from SEQ ID NO:1; (c) at least one protein having an amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4; (d) at least one polypeptide translated from SEQ ID NO:1; or (e) any combinations of (a)-(d) can be used to analyze the results of a clinical trial for subjects being treated for major depression or a related disorder. 
Specifically, the relationship between the presence of said mRNA, protein, polypeptide or combinations thereof in the test samples and the clinical outcome of said subjects can be determined. Similarly, any of the above further determinations might be made on the basis of absence of: (a) at least one mRNA which comprises nucleotides 1 to 316 of SEQ ID NO:2; (b) at least one mRNA transcribed from SEQ ID NO:1; (c) at least one protein having an amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4; (d) at least one polypeptide translated from SEQ ID NO:1; or (e) any combinations of (a)-(d), or on the basis of detection or quantification that an amount of: (a) at least one mRNA which comprises nucleotides 1 to 316 of SEQ ID NO:2; (b) at least one mRNA transcribed from SEQ ID NO:1; (c) at least one protein having an amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4; (d) at least one polypeptide translated from SEQ ID NO:1; or (e) any combinations of (a)-(d), is within a certain range.
[0365] Techniques for identifying the presence, absence or amount of mRNAs or proteins in a test sample are well known in the art. For example, techniques for identifying the presence, absence or amount of mRNAs include, but are not limited to, reverse transcriptase, cDNA microarrays, quantitative PCR and Northern blotting. Techniques for identifying the presence, absence or amount of proteins include, but are not limited to, ELISA, RIA, Western blotting, fluorescence activated cell sorting and immunohistochemical analysis. These techniques will be discussed briefly below.
RNA Techniques
[0366] Reverse transcriptase can be used to prepare a cDNA by use of an oligo dT primer which is annealed to the poly A sequence of the RNA. Examples of reverse transcriptases that can be used include, but are not limited to, ImProm-II Reverse Transcriptase (Promega, Madison, Wis.) and BD Powerscript Reverse Transcriptase (BD Biosciences, Franklin Lakes, N.J.). Methods for using reverse transcriptases to prepare and obtain cDNA molecules are well known in the art and are described in Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989).
[0367] A cDNA microarray is an array of multiple cDNA molecules, fixed in addressable locations (such as on a chip assay), to which complementary nucleic acids in applied samples may hybridize (see Hegde et al., Biotechniques 29(3):548-562 (2000)). cDNA microarrays provide for qualitative and quantitative analysis of mRNA expression of the molecules contained in the array.
[0368] Quantitative PCR allows for the direct monitoring of the progress of a PCR amplification as it is occurring, without the need for repeated sampling of the reaction products. In quantitative PCR, the reaction products may be monitored as they are generated and are tracked after they rise above background but before the reaction reaches a plateau. The number of cycles required to achieve a chosen level of fluorescence decreases linearly with the logarithm of the concentration of amplifiable targets at the beginning of the PCR process, enabling a measure of signal intensity to provide a measure of the amount of target DNA in a sample in real time. Quantitative PCR according to the present invention may be performed on any suitable instrument, including, but not limited to, Mx4000 or Mx3000P (Stratagene, La Jolla, Calif.), ABI7700 or ABI7000 (Applied Biosystems Inc., Foster City, Calif.), MJ Opticon (MJ Research, Watertown, Mass.), iCycler (Bio-Rad, Hercules, Calif.), RotorGene 3000 (Corbett Life Sciences, Mortlake, NSW, Australia), and the SmartCycler (Cepheid, Sunnyvale, Calif.).
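One common way to convert a threshold cycle into a starting quantity is a standard curve, sketched below. The sketch assumes the usual model in which the threshold cycle (Ct) falls linearly with the base-10 logarithm of the starting quantity; the function names and the dilution-series values are invented for illustration and are not measured data or part of any instrument's software.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate starting copies."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series (copies per reaction) and Ct values.
standards = [1e2, 1e3, 1e4, 1e5]
standard_cts = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(standards, standard_cts)
print(slope, intercept)
print(quantity_from_ct(25.0, slope, intercept))
print(amplification_efficiency(slope))
```

A slope near -3.32 corresponds to a doubling of product in each cycle; samples with more starting template cross the fluorescence threshold at an earlier cycle.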
[0369] In solid phase detection, hybridization of a probe to the sequence of interest, such as an RNA, is detected directly by visualizing a bound probe by using Northern blotting. In this technique, RNA is isolated from a subject. The RNA is then separated (such as, but not limited to, by agarose gel electrophoresis) and transferred to a membrane. At least one probe which has been labeled with, for example, a radioactive, fluorescent or enzymatic label, specific for the RNA being detected is allowed to contact the membrane under conditions of low, medium, or high stringency. Unbound probe is removed and the presence of binding is detected by visualizing the labeled probe.
Protein Techniques
[0370] ELISA involves the fixation of a test sample containing a protein substrate of interest to a surface such as a well of a microtiter plate. A substrate specific antibody coupled to an enzyme is applied and allowed to bind to the substrate. Presence of the antibody is then detected and quantitated by a colorimetric reaction employing the enzyme coupled to the antibody. Enzymes commonly used in ELISAs include, but are not limited to, horseradish peroxidase and alkaline phosphatase. If the assay is well calibrated and within the linear range of response, the amount of substrate present in the sample is proportional to the amount of color produced. A substrate standard is generally employed to improve quantitative accuracy.
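The use of a substrate standard for quantitation can be sketched as follows. This is a minimal illustration that interpolates linearly between bracketing standards and refuses to extrapolate outside the calibrated range; the function name, the concentration units and the absorbance readings are all hypothetical, and other curve-fitting models are often used in practice.

```python
def concentration_from_absorbance(absorbance, standards):
    """Estimate concentration by linear interpolation between standards.

    `standards` is a list of (concentration, absorbance) pairs whose
    absorbance rises strictly with concentration.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    lowest, highest = points[0][1], points[-1][1]
    if not lowest <= absorbance <= highest:
        # Readings outside the calibrated range are not extrapolated.
        raise ValueError("absorbance outside the calibrated range")
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical standard series: (concentration in ng/mL, absorbance).
standard_series = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]

print(concentration_from_absorbance(0.35, standard_series))  # about 15 ng/mL
```

A sample reading above the highest standard would typically be diluted and re-assayed rather than estimated by extrapolation.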
[0371] Another technique that can be used is a radioimmunoassay ("RIA"). One version of RIA involves the precipitation of a desired substrate (such as a protein of interest) with a specific antibody and detectably labeled antibody binding protein (the antibody binding protein can be labeled with any detectable isotope known in the art) immobilized on a precipitable carrier, such as, but not limited to, agarose beads. The number of counts in the precipitated pellet is proportional to the amount of substrate present in the test sample. In an alternate version of RIA, a labeled substrate (such as a protein of interest) and an unlabelled antibody binding protein are employed. A test sample containing an unknown amount of substrate is added in varying amounts. The decrease in precipitated counts from the labeled substrate is proportional to the amount of substrate in the added sample.
[0372] Western blot involves separation of a substrate (such as a protein of interest) from another protein by means of an acrylamide gel followed by transfer of the substrate to a membrane (such as, but not limited to, nylon or PVDF). The presence of the substrate is then detected by antibodies specific to the substrate. The antibodies are then detected by antibody binding reagents. Antibody binding reagents may include, but are not limited to, protein A or other antibodies. The antibody binding reagents may be labeled with a detectable label as described previously herein. Detection may be by autoradiography, colorimetric reaction or chemiluminescence. Western blotting allows for both the quantitation of an amount of substrate and a determination of the substrate's identity by a relative position on the membrane which is indicative of a migration distance in the acrylamide gel during electrophoresis.
[0373] Fluorescence activated cell sorting ("FACS") involves detection of a substrate (such as a protein of interest) in situ in cells by substrate specific antibodies. The substrate specific antibodies are linked to fluorophores. Detection is by means of a cell sorting machine which reads the wavelength of light emitted from each cell as it passes through a light beam. This method may employ two or more antibodies simultaneously.
[0374] Immunohistochemical analysis involves detection of a substrate (such as a protein of interest) in situ in fixed cells by substrate specific antibodies. The substrate specific antibodies may be enzyme linked or linked to fluorophores. Detection is by microscopy and subjective evaluation. If enzyme linked antibodies are employed, a colorimetric reaction may be required.
Kits
[0375] The present invention also provides kits that enable or allow for the detection or quantification of (a) at least one mRNA which comprises nucleotides 1 to 316 of SEQ ID NO:2; (b) at least one mRNA transcribed from SEQ ID NO:1; (c) at least one protein having an amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4; (d) at least one polypeptide translated from SEQ ID NO:1; or (e) any combinations of (a)-(d) in one or more subjects. These kits are useful for diagnosing those subjects suffering from major depression or a related disorder or at risk of developing major depression or a related disorder, providing a prognosis for or predicting a response to treatment for a subject suffering from major depression or a related disorder, identifying subjects for selection or inclusion in a clinical trial for treating major depression or a related disorder, or for analyzing the results of a clinical trial for treating major depression or a related disorder.
[0376] The kits can be produced in a variety of ways. For example, the kits contain at least one reagent useful for detecting or quantifying the presence, absence or amount of: (a) at least one mRNA which comprises nucleotides 1 to 316 of SEQ ID NO:2; (b) at least one mRNA transcribed from SEQ ID NO:1; (c) at least one protein having an amino acid sequence of SEQ ID NO:3 or SEQ ID NO:4; (d) at least one polypeptide translated from SEQ ID NO:1; or (e) any combinations of (a)-(d). Additionally, any of the kits described above in (a)-(e) can further contain at least one reagent useful for detecting or quantifying the presence, absence or amount of mRNA transcribed from the HTR1A gene or a polypeptide translated from the HTR1A gene. Examples of the at least one reagent that can be included in the kits described herein are a reverse transcriptase, one or more primers for amplifying cDNA or at least one antibody. In addition, the kits can further contain (a) instructions describing how to detect or quantify the presence, absence or amount of at least one mRNA or at least one protein in a test sample; (b) ancillary reagents such as buffering agents, nucleic acid stabilizing reagents, protein stabilizing reagents, and signal producing systems (such as, but not limited to, fluorescence generating systems); or (c) positive and/or negative control(s). The kit may be packaged in any suitable manner, typically with the elements in a single container or various containers as necessary.
Screening Assays and Methods of Treatment
[0377] In another embodiment, the present invention relates to methods (also referred to herein as "screening assays" or "screening methods") for identifying compositions, namely candidate or test compounds or agents (such as, but not limited to, small molecules, antibodies, nucleic acids, peptides, peptidomimetics, or other drugs), which: (a) bind to a protein translated from SEQ ID NO:1; (b) modulate the activity or expression of a protein translated from SEQ ID NO:1 (such as by inhibiting, reducing or decreasing the activity, reducing or decreasing the expression, or by stimulating or increasing the activity, or stimulating or increasing the expression, of the protein); or (c) modulate the expression of an mRNA molecule transcribed from SEQ ID NO:1 (such as by reducing or decreasing the expression or by stimulating or increasing the expression).
Since genetic linkage between DEP2 and major depressive disorder has been established, it is thought that compositions identified pursuant to the screening methods described herein may be useful in treating major depression or a related disorder.
[0378] Examples of proteins translated from SEQ ID NO:1 include, but are not limited to, (i) Lhpp (SEQ ID NO:10); (ii) naturally occurring protein variants of Lhpp (SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25 and SEQ ID NO:29); (iii) Dep2-1a (SEQ ID NO:3); (iv) Dep2-1b (SEQ ID NO:4); (v) Dep2-2 (SEQ ID NO:27); (vi) Dep2-4 (SEQ ID NO:32); and (vii) Dep2-5 (SEQ ID NO:34). Examples of RNA molecules transcribed from SEQ ID NO:1 include, but are not limited to, SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33.
[0379] As will be discussed in more detail herein, the present invention includes a method of determining whether a composition, identified in accordance with the methods described herein, is a potential therapy for major depression or a related disorder by initially administering the composition to a mammal (for example, an animal model). One may then monitor for major depression-related symptoms of the animal or the level or activity of a protein translated from SEQ ID NO:1 in the test subject. A decrease in the appearance of such symptoms indicates the potential suitability of the composition of interest in the treatment of major depression or related disorders. Such a finding in an animal model would then lead to use of the composition in human clinical trials.
[0380] Suitable animal models for such experiments include, but are not limited to, behavioral despair or mouse forced swim test (Arch. Int. Pharmacodyn. 229:327-336 (1977), Psychopharmacology 94:147-160 (1988)); tail suspension test (Psychopharmacology 85:367-370 (1985)); elevated plus maze test (Psychopharmacology 92:180-185 (1987)); open field test (Behav. Brain Res. 134:49-57 (2002)); dark-light transitions test (Pharmacol. Biochem. Behav. 15:695-699 (1981)); Irwin test (Brain Res. Vol. 835:18-26 (1999); Psychopharmacology 147:2-4 (1999)); inescapable stress test (learned helplessness) (Seligman and Maier, J. Exp. Psychol. 74:1-9 (1967)); chronic mild stress (Ducottet et al., Prog. Neuro-Psychopharmacol. Biol. Psychiatry 27:625-631 (2003); Kopp et al., Behav. Pharmacol. 10:73-83 (1989)); and novelty-suppressed feeding model (Bodnoff et al., Psychopharmacology 97:277-279 (1989)).
[0381] The present invention additionally relates to the compositions identified by use of the above screening methods as well as to methods of using these compositions in the treatment of major depression or a related disorder. More specifically, once a composition of interest has been identified, the composition may be used in clinical trials to determine whether it actually alleviates the symptoms of major depression or a related disorder or at least decreases the severity thereof.
[0382] Also, it is submitted that the proteins described herein may be used to characterize the physical properties of compositions which may be used to ultimately treat major depression or a related disorder and thus in the "design" of such compositions. Thus, based upon such properties, one may design a composition or compound that has the ability to have a significant degree of binding affinity to a protein translated from SEQ ID NO:1, thereby modulating the activity of the protein. Such a composition or compound could then be used in the treatment of major depression or a related disorder.
[0383] Furthermore, one may detect binding of a test composition to a protein translated from SEQ ID NO:1 by subjecting the protein to, for example, nuclear magnetic resonance ("NMR") alone and in the presence of the composition.
[0384] Characteristic changes in the NMR spectrum of the protein may then allow one to determine whether and how the composition has bound to the protein. This procedure may be repeated for a series of compounds, enabling discovery of relationships between compound structure and binding to the target protein. This iterative process is known as "structure-activity relationships by NMR" or "SAR by NMR" (Shuker et al., Science 274:1531-1534 (1996); SAR by NMR is described in U.S. Pat. Nos. 5,891,643, 5,989,827, 5,804,390, 6,043,024 and 6,897,337).
[0385] Similarly, one may identify the structure of a composition bound to the protein by x-ray diffraction techniques. By iterative operation of this technique, one may optimize lead compositions or compounds so as to develop the most efficacious therapeutic compositions or compounds for the treatment of major depression or a related disorder.
[0386] One method of identifying compositions that modulate the amount or activity of a protein translated from SEQ ID NO:1 or that modulate the expression of an mRNA molecule that is transcribed from SEQ ID NO:1 is a reporter gene assay. It is well known to those skilled in the art that a reporter gene assay may be carried out in an intact cell transfected with the reporter gene construct, in extracts from a cell transfected with the reporter gene construct, or in a cellular extract (for example, reticulocyte lysate) to which the reporter gene construct is added. It is further recognized that reporter gene assays may be carried out using cells or extracts that naturally contain a protein translated from SEQ ID NO:1 or an mRNA molecule transcribed from SEQ ID NO:1, cells into which a vector for the expression of a protein translated from SEQ ID NO:1 or an mRNA molecule transcribed from SEQ ID NO:1 has been transfected (transiently or stably), or extracts to which a purified or partially purified protein translated from SEQ ID NO:1 or an mRNA molecule transcribed from SEQ ID NO:1 is added. In the present invention, it is preferred that a protein translated from SEQ ID NO:1 or an mRNA molecule transcribed from SEQ ID NO:1 be purified from a human cell or tissue, or from expression in a heterologous system. Further, it is also well known in the field that reporter gene assays may be conducted in cells or extracts that are of human origin, or that come from a different mammal or organism. It is additionally recognized that there are many regulatory sequences (such as promoters) that can be used to initiate transcription in a reporter gene construct, and that the choice of a regulatory sequence may be determined by the particular cell or extract in which the assay will be conducted. It is still further well known that there are a variety of reporter genes that are amenable to screening assays, including high throughput screening assays. 
Examples of reporter genes include those which are themselves fluorescent, luminescent or have easily detected spectral characteristics (for example, a green fluorescent protein), as well as those having well-characterized fluorescent, luminescent or colorimetric substrates (for example, beta-galactosidase, luciferases). It is finally recognized that certain cofactors may be added as purified or partially purified components to a reporter gene assay. A discussion of reporter systems can be found in Current Protocols in Pharmacology (2003), Units 6.2.1-6.2.11, Wiley & Sons, Inc.
[0387] An additional embodiment of reporter gene assays involves the use of at least one substrate for a protein translated from SEQ ID NO:1. The substrate can be added before the protein is exposed to the test composition or simultaneously with the test composition, provided that the protein is exposed to the substrate for a time and under conditions sufficient to allow the protein to react with the substrate in order to produce a reaction product. Any substrate wherein the phosphorylation of said substrate is capable of being modified by a protein translated from SEQ ID NO:1 can be used in the reporter gene assays described herein. Proteins translated from SEQ ID NO:1 that can be used to modify the phosphorylation of said substrate include, for example, proteins having an amino acid sequence selected from the group consisting of: SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25 and SEQ ID NO:27. Examples of substrates that can be used include, but are not limited to, phosphohistidine, phospholysine, phosphodiimide, pyrophosphate or any peptide or protein that is phosphorylated on a histidine or a lysine. For example, a reporter gene assay for screening a composition for the ability to inhibit activity of SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25 or SEQ ID NO:27 can be performed. The method involves exposing a protein to a test composition and then measuring the presence or absence of a reaction product or complex. The lack of a reaction product or complex indicates that the composition has the ability to inhibit the activity of the protein. Prior to exposing the protein to the test composition, a substrate can be added to the protein. Alternatively, the substrate can be added when the protein is exposed to the test composition.
[0388] An additional substrate that can be used is a radioactive enzyme substrate. In such an embodiment, the reporter gene encodes an enzyme (for example, chloramphenicol acetyltransferase) having a substrate that is readily separated from the corresponding reaction product. This type of radioactive detection assay may be utilized in order to identify a compound that binds to or modulates a protein translated from SEQ ID NO:1 or that modulates the expression of an mRNA molecule transcribed from SEQ ID NO:1. It is well known to those skilled in the art that the separation and detection of radioactive compounds may be accomplished by a variety of chromatographic and other methods. A discussion of radioactive reporter gene assays can be found in Current Protocols in Pharmacology (2003), Units 6.4.1-6.4.11, Wiley & Sons.
[0389] An additional assay that may be used to detect a composition or compound having the ability to modulate the activity or expression of a protein translated from SEQ ID NO:1 or modulate the expression of an mRNA molecule transcribed from SEQ ID NO:1 is the scintillation proximity assay. This assay is based upon the binding of a radiolabeled tracer to a protein translated from SEQ ID NO:1 or an mRNA molecule transcribed from SEQ ID NO:1 that has been exposed to the composition or compound of interest. The scintillant is incorporated into small fluoromicrospheres to which target macromolecules (for example, proteins or mRNAs) attach. If a radioactive molecule (for example, 3H) binds to the target, it is brought close enough to the bead to stimulate the scintillant to produce light. On the other hand, unbound radioactivity is not detected if the bead is outside the distance subatomic particles produced by the decay are likely to travel. Thus compositions or compounds that bind to a protein translated from SEQ ID NO:1 or an mRNA molecule transcribed from SEQ ID NO:1 may be detected by changes in the amount of scintillant-emitted light. A discussion of scintillation proximity assays can be found in Current Protocols in Pharmacology (2003), Unit 9.4.9-9.4.10, Wiley & Sons, Inc.
[0390] Another assay which may be utilized in the identification of compositions that affect or modulate the binding of a protein translated from SEQ ID NO:1 to mRNA is a filter binding assay. An example of the filter binding assay that may be utilized for a protein translated from SEQ ID NO:1 involves immobilization of an RNA molecule (for example, all or part of an mRNA transcribed from SEQ ID NO:1) on a solid support, exposure of the immobilized RNA to a protein translated from SEQ ID NO:1 in the absence or presence of compositions thought to bind to or inhibit a protein translated from SEQ ID NO:1, and quantitation of a protein translated from SEQ ID NO:1 on the solid support. It is well known to those skilled in the art that the solid support may be a nitrocellulose or other filter, or any of a variety of beads or microparticles. It is further recognized that a protein translated from SEQ ID NO:1 used in the assay may be purified from a heterologous expression system, and will advantageously be tagged such that it can be detected using commonly available reagents. For example, a protein translated from SEQ ID NO:1 may be a fusion to a "tag" sequence expressed in E. coli (Tateiwa et al., Journal of Neuroimmunology 120:161-69 (2001)). Compositions that bind to or inhibit a protein translated from SEQ ID NO:1 may be identified by an increase or reduction in the amount of a protein translated from SEQ ID NO:1 on the solid support, relative to a reaction in which no test compound was added.
[0391] Another type of assay that may be useful to screen for compositions that bind to a protein translated from SEQ ID NO:1 is a fluorescence polarization assay. This method detects molecular interactions and is based on the concept that fluorescent molecules excited by light polarized in one plane will emit a fluorescent signal again in a polarized manner. The rotational relaxation time is proportional to the molecular volume if other physical variables are unchanged. Thus, when a fluorescent molecule binds to a larger molecule that restricts its rotation and tumbling, the emission remains polarized. Such polarization can be calculated and is directly proportional to the fraction of bound ligand. Change in fluorescence polarization thus accounts for the ratio of bound versus total ligand. For a protein translated from SEQ ID NO:1, one embodiment of a fluorescent polarization assay would involve a fluorescently labeled polynucleotide comprising all or part of a nucleic acid of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 or SEQ ID NO:33. Compositions that bind to a protein translated from SEQ ID NO:1 may be detected by a reduction of fluorescent polarization attributable to the labeled polynucleotide. A discussion of fluorescent polarization assays can be found in Current Protocols in Pharmacology (2003), Units 9.4.12-9.4.13, Wiley & Sons, Inc.
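By way of illustration only, the polarization calculation referred to above can be sketched as follows. The sketch uses the conventional definition of polarization from the emission intensities measured parallel and perpendicular to the excitation plane, and it adopts the simplification stated above that polarization varies linearly with the fraction of bound ligand (which assumes that the brightness of the label does not change upon binding). The function names and the reference values for fully free and fully bound ligand are hypothetical and are not taken from the present disclosure.

```python
def polarization(i_parallel: float, i_perpendicular: float) -> float:
    """Return polarization P = (I_par - I_perp) / (I_par + I_perp)."""
    total = i_parallel + i_perpendicular
    if total <= 0:
        raise ValueError("total emission intensity must be positive")
    return (i_parallel - i_perpendicular) / total


def fraction_bound(p_observed: float, p_free: float, p_bound: float) -> float:
    """Estimate the bound fraction by linear interpolation (an assumption).

    p_free and p_bound are hypothetical reference polarizations measured for
    fully unbound and fully bound labeled ligand. The result is clamped to
    the interval [0, 1].
    """
    if p_bound == p_free:
        raise ValueError("reference polarizations must differ")
    fraction = (p_observed - p_free) / (p_bound - p_free)
    return min(1.0, max(0.0, fraction))
```

Under these assumptions, a test composition that displaces the labeled polynucleotide from the protein would lower the estimated bound fraction relative to a reaction containing no test composition.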
[0392] Another type of assay that may be used to screen for compositions that bind to a protein translated from SEQ ID NO:1 is the spin-screening assay. This method detects molecular interactions, and is based on the concept that the sedimentation rate of molecules or molecular complexes in solution depends on mass and shape. In particular, the sedimentation rate of a small molecule alone is expected to be substantially different from that of the same small molecule bound to a macromolecule. Thus, when a directional force is applied (for example, by spinning a solution at high speed in a centrifuge), small molecules that bind to a protein translated from SEQ ID NO:1 can be readily separated from other small molecules in a mixture that do not. Separation of bound from unbound small molecules can also be accomplished by including a size exclusion filter within the centrifuge tube, such that unbound small molecules pass through the filter but bound small molecules do not. It is recognized by those skilled in the art that molecules separated in this fashion can be identified by a variety of spectroscopic and other methods. In one embodiment, a spin-screening assay includes detection by mass spectrometry.
[0393] In another embodiment, the present invention relates to methods of determining the in vivo activity of a composition identified as a potential therapy for the treatment of major depression or a related disorder. These methods involve obtaining at least two (2) test samples from a subject, preferably a human, being treated for major depression or a related disorder. A first test sample can be considered to be a test sample that is obtained at a period in time before the subject has begun a course of treatment with the test composition. Alternatively, a first test sample can also be considered to be a test sample obtained at a period in time during which a subject has been receiving a course of treatment with the test composition. A second test sample can be considered to be a test sample that is obtained at a period in time that is subsequent to the obtaining of the first test sample. For example, the second test sample can be obtained after a period of time has elapsed after the subject has begun an initial course of treatment with the test composition (meaning that the subject had not previously received the test composition prior to obtaining the first test sample). Alternatively, if a subject has been receiving treatment with a test composition, a first test sample can be obtained from said subject. After a period of time has elapsed (for example, three (3) months) during which said subject is still being treated with the test composition, a second test sample can be obtained from said subject. A discussion of what constitutes a test sample and examples of test samples has already been provided previously herein and is incorporated herein by reference.
[0394] Once the test samples are obtained, they are analyzed, using routine techniques known in the art (which have been discussed previously herein), to determine or quantify: (a) the amount or activity of a protein translated from SEQ ID NO:1; or (b) the amount of mRNA transcribed from SEQ ID NO:1, in each of the test samples. The amount or activity of a protein translated from SEQ ID NO:1 or the amount of mRNA transcribed from SEQ ID NO:1 that was detected or quantified in each of the test samples is compared. If, for example, the amount or activity of protein or the amount of mRNA determined or quantified in the second test sample (which was the test sample obtained after the subject began a course of treatment with the composition) is the same as (namely, equal to) the amount or activity of protein or the amount of mRNA determined or quantified in the first test sample (which was the test sample obtained from the subject prior to undergoing said course of treatment with the composition), this indicates that the composition lacks therapeutic activity. In contrast, if the amount or activity of the protein or the amount of mRNA determined or quantified in the second test sample has changed, namely, has increased or decreased, when compared to the amount or activity of the protein or the amount of mRNA determined or quantified in the first test sample, this indicates that the composition possesses some type or degree of therapeutic activity.
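By way of illustration only, the comparison of the first and second test samples can be sketched as follows. The function name and the tolerance parameter are hypothetical. The tolerance is an added assumption intended to reflect measurement variability, since two independent measurements are rarely exactly equal; with the default tolerance of zero, the sketch applies the strict equality described above.

```python
def classify_change(first: float, second: float, tolerance: float = 0.0) -> str:
    """Compare the second-sample measurement with the first-sample measurement.

    Returns "unchanged" when the two values differ by no more than the
    (hypothetical) tolerance, and otherwise "increased" or "decreased".
    """
    if first < 0 or second < 0 or tolerance < 0:
        raise ValueError("measurements and tolerance must be non-negative")
    difference = second - first
    if abs(difference) <= tolerance:
        return "unchanged"
    return "increased" if difference > 0 else "decreased"
```

A result of "unchanged" corresponds to the case in which the composition lacks therapeutic activity, and a result of "increased" or "decreased" corresponds to the case in which the composition possesses some type or degree of therapeutic activity.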
[0395] In addition, in yet another embodiment, the present invention relates to methods for determining the presence or absence of activity of a composition identified pursuant to the screening methods described herein that is being used to treat a subject suffering from major depression or a related disorder. The method involves observing the phenotype of said subject prior to the subject being administered the test composition ("first visit"). For example, observation of the subject's phenotype should be based on a method that has been validated as a measure of major depression or a related disorder. Such validated methods include, but are not limited to: the Hamilton depression rating scale (Hamilton, J. Neurol. Neurosurg. Psychiatry 23:56-62 (1960)), the Schedule for affective disorders and schizophrenia (Spitzer and Endicott, Schedule for affective disorders and schizophrenia, lifetime version. New York, N.Y.: New York State Psychiatric Institute, Biometrics Research. 1975), the Montgomery-Asberg depression rating scale (Montgomery, Br. J. Psychiatry 134:382-389 (1979)) and the Structured clinical interview for DSM-IV (First et al., Structured Clinical Interview for DSM-IV. Washington, D.C.: American Psychiatric Press 1997). After observation, the subject is administered the test composition for a time and under conditions that are sufficient for the composition to either: (a) bind to, inhibit, increase, decrease or reduce the amount of a protein translated from SEQ ID NO:1; or (b) increase or reduce the amount of an mRNA molecule transcribed from SEQ ID NO:1. After the subject has been administered the test composition for a time and under the conditions described above, the phenotype of the subject is again observed, preferably, using the same validated method as was used to establish the initial phenotype. 
Observable improvement in the phenotype of the subject at the second observation compared to the first observation indicates that the composition has some type or degree of therapeutic activity. A lack of observable differences in the phenotype of the subject at the first observation compared to the second observation indicates that the composition does not possess therapeutic activity. The steps of observing the phenotype of the subject and administering the composition to said subject can be repeated for as long as the treating physician deems necessary. The physician may then compare the phenotype of the subject between any pair of observations to judge whether the composition has some type or degree of therapeutic activity. In one aspect of this embodiment, commonly used in clinical research, the phenotype observed at the time of the last administration of the test composition is referred to as the "last visit". Eventually, the phenotype of the subject prior to initiation of treatment is compared with the phenotype of the subject at the last visit. Observable differences in the phenotype of the subject prior to initiation of treatment compared to the last visit indicate that the composition possesses some type or degree of therapeutic activity. A lack of observable differences in phenotype of the subject prior to initiation of treatment compared to the last visit indicates that the composition does not possess any type of therapeutic activity.
[0396] In another embodiment, the present invention relates to the compositions identified by methods described herein in the prevention of major depression or a related disorder or the treatment of major depression or a related disorder. More specifically, the present invention contemplates a method for at least substantially preventing in a subject major depression or a related disorder by administering to a subject in need of treatment thereof, a therapeutically effective amount of at least one composition that has been identified by the hereinbefore described methods that: (a) modulates the activity of a protein translated from SEQ ID NO:1; (b) reduces the amount of a protein translated from SEQ ID NO:1; (c) increases the amount of a protein translated from SEQ ID NO:1; or (d) modulates the level of expression of an mRNA molecule transcribed from SEQ ID NO:1. Administration of a prophylactic composition can occur prior to the manifestation of symptoms characteristic of major depression or a related disorder.
[0397] Additionally, the present invention further contemplates a method of treating a subject suffering from major depression or a related disorder by administering to a subject in need of treatment thereof, a therapeutically effective amount of at least one composition that has been identified by the hereinbefore described methods that: (a) modulates the activity of a protein translated from SEQ ID NO:1; (b) reduces the amount of a protein translated from SEQ ID NO:1; (c) increases the amount of a protein translated from SEQ ID NO:1; or (d) modulates the level of expression of an mRNA molecule transcribed from SEQ ID NO:1.
[0398] By way of example, and not of limitation, examples of the present invention shall now be given.
Example 1
Identification of Genetic Linkage and Association Between DEP2 and Major Depressive Disorder
[0399] Genetic linkage between DEP2 and major depressive disorder was established in a pedigree-based study in the Mormon population of Utah. The ascertainment and characteristics of a majority of these pedigrees have been described (Abkevich et al., Am. J. Hum. Genet. 73:1271-1281 (2003)). In the study described herein, a total of 93 pedigrees that contain a minimum of four females affected with major depressive disorder (DSM-IV-TR sections 296.2x or 296.3x) were selected for genetic analysis. These pedigrees comprised 744 affected females.
[0400] Affected individuals were genotyped and genome-wide linkage analysis was performed as described (Abkevich et al., op. cit.). Two meaningful differences between the present study and our previously published work are: first, that additional pedigrees were ascertained; second, that the definition of affected status was different as it did not include bipolar disorder in this study.
[0401] Using a dominant genetic model and considering only females with major depressive disorder as affected, evidence of linkage on chromosome 10 at marker D10S1676 was observed (heterogeneity LOD score (HLOD) 2.4). Upon genotyping of additional markers in the 26 centimorgan ("cM") interval between D10S2322 and D10S1700, the linkage evidence increased to a peak HLOD of 3.4 at D10S214 (FIG. 27 and Table 1).
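By way of illustration only, a heterogeneity LOD score of the kind reported above is conventionally computed under an admixture model, in which only a proportion (alpha) of pedigrees is assumed to be linked at the tested position. The following is a minimal sketch of that conventional calculation. The present disclosure does not state the software or the exact formulation that was used, so the formula, the grid search and the function name are assumptions, and the per-pedigree LOD scores shown in the usage note are hypothetical.

```python
import math
from typing import Sequence, Tuple


def hlod(family_lods: Sequence[float], steps: int = 1000) -> Tuple[float, float]:
    """Return (HLOD, alpha) maximized over a grid of alpha in [0, 1].

    family_lods holds one LOD score per pedigree at a single map position.
    For each alpha, the admixture score is
        sum_i log10(alpha * 10**LOD_i + (1 - alpha)).
    """
    best_score, best_alpha = 0.0, 0.0  # alpha = 0 always gives a score of 0
    for k in range(steps + 1):
        alpha = k / steps
        score = sum(
            math.log10(alpha * 10.0 ** lod + (1.0 - alpha)) for lod in family_lods
        )
        if score > best_score:
            best_score, best_alpha = score, alpha
    return best_score, best_alpha
```

For example, hlod([2.0, -3.0]) yields a positive score at an intermediate alpha even though the summed LOD score of the two hypothetical pedigrees is negative, which is why the heterogeneity score can reveal linkage that is present in only a subset of pedigrees.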
[0402] The serotonin receptor 1A (Htr1a) is a therapeutic target in the management of depressive and anxiety disorders (Barnes and Sharp, Neuropharmacology 38:1083-1152 (1999)). A common polymorphic site in the corresponding gene (HTR1A) has been described, such that the 1019th nucleotide upstream of the transcriptional start site naturally occurs as either cytosine or guanosine (Wu and Comings, Psych. Genet. 9:105-106 (1999)). Results of in vitro experiments suggest that the variant allele (-1019G) prevents binding of a transcriptional repressor, resulting in enhanced Htr1a expression (Lemonde et al., J. Neurosci. 23:8788-8799 (2003)). Either the -1019G allele or homozygous -1019GG genotype has been associated with depression, suicide, bipolar disorder, panic disorder with agoraphobia, neuroticism and decreased anti-depressant response (Arias et al., Mol. Psych. 7:930-932 (2002); Strobel et al., J. Neural Transm. 110:1445-1453 (2003); Lemonde et al., J. Neurosci. 23:8788-8799 (2003); Rothe et al., Int. J. Neuropsychopharmacol. 7:189-192 (2004); Huang et al., Int. J. Neuropsychopharmacol. 7:441-451 (2004); Serretti et al., Int. J. Neuropsychopharmacol. 7:453-460 (2004); Lemonde et al., Int. J. Neuropsychopharmacol. 7:501-506 (2004); Arias et al., J. Psychopharmacol. 19:166-172 (2005)).
[0403] In the Utah population, HTR1A allele -1019G and genotype -1019GG were 1.1- and 1.3-fold over-represented among individuals affected with major depressive disorder compared to unaffected individuals (one-tailed p=0.05 and 0.02, respectively). Hence, linkage analysis was stratified according to HTR1A -1019 alleles. That is, only individuals with major depressive disorder, and also carrying one or two copies of the HTR1A -1019G risk allele, were considered affected. In a genome-wide HTR1A-conditional linkage analysis using a dominant genetic model and also restricted to female sex, the observed evidence of linkage on chromosome 10 strengthened to a peak HLOD of 3.1 at D10S1222. Upon inclusion of additional marker data in the 26 cM interval between D10S2322 and D10S1700, the linkage evidence increased to a peak HLOD of 4.4 at D10S575 (FIG. 27 and Table 1).
[0404] The conditional linkage method improved upon the previously performed traditional linkage analysis in three ways. First, as noted above, it revealed stronger evidence supporting linkage of a dominant gene to major depressive disorder in females on chromosome 10 in the vicinity of D10S575. Second, it narrowed the linkage region (as defined by a drop of HLOD of either 1 or 2 from the peak value), such that the location of the linked gene was better defined. Third, and most importantly, it revealed linkage evidence in a distinct subset of pedigrees. Further investigation of those pedigrees was crucial to the discovery of DEP2 as a gene linked to major depressive disorder.
[0405] As a next step to identify a gene linked to major depressive disorder, each gene in the conditional linkage region was resequenced in representative affected females from each of sixteen pedigrees. These pedigrees were selected on the basis of having a familial HLOD of at least 0.4. Among these pedigrees, six had not shown linkage evidence without stratification on the basis of HTR1A alleles. The frequencies of variant alleles among the 22 chromosomes that segregated with major depressive disorder within these pedigrees were compared to the frequencies among 60 control chromosomes. For seven single nucleotide polymorphisms ("SNPs") within SEQ ID NO:1, statistically significant frequency differences were observed. Additionally, a statistical trend was observed for an eighth SNP in SEQ ID NO:1 (Tables 2 and 3). Two pairs of these SNPs (DEP2.0001 and DEP2.0002, DEP2.0004 and DEP2.0005) were in complete linkage disequilibrium with each other. Of these paired markers, only DEP2.0002 and DEP2.0004 are described further. One SNP in each of six other genes in the linkage region showed statistically significant frequency differences between the 22 chromosomes that segregated with major depressive disorder within these pedigrees and the set of 60 control chromosomes (Table 3).
[0406] For three of six tested SNPs in SEQ ID NO:1, statistically significant frequency differences were also observed between the 22 chromosomes that segregated with major depressive disorder and an independent set of 180 control chromosomes (Table 4). None of the six tested SNPs from other genes showed statistical significance in this test. For five of the six tested SNPs in SEQ ID NO:1, statistically significant frequency differences were also observed between the 22 chromosomes that segregated with major depressive disorder and a third independent set of 708 control chromosomes (Table 5).
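The source does not state which statistical test produced the P values in Tables 3 to 5. As an illustration only, a one-sided Fisher's exact test on a 2x2 table of allele counts is one conventional way to compare such frequencies. The sketch below, in Python, is offered under that assumption and is not expected to reproduce the reported P values exactly; the function name `fisher_exact_greater` is hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the count
    in the top-left cell with all margins held fixed.
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    denominator = comb(total, row1)
    upper = min(row1, col1)
    p_value = 0.0
    for x in range(a, upper + 1):
        # Ways to place x variant alleles in row 1 and the rest in row 2.
        p_value += comb(col1, x) * comb(total - col1, row1 - x) / denominator
    return p_value


if __name__ == "__main__":
    # DEP2.0002 counts from Table 3: 10 of 22 linked chromosomes versus
    # 5 of 60 control chromosomes carried the variant allele.
    p = fisher_exact_greater(10, 12, 5, 55)
    print(f"one-sided P = {p:.2g}")
```

The same call can be repeated with the control counts from Table 4 or Table 5 to compare the linked chromosomes against the larger control sets.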
[0407] To confirm the relationship between DEP2 genotypes and major depressive disorder, genetic association studies comparing genotype frequencies between individuals affected with major depressive disorder (not ascertained on the basis of familial history of disease) and healthy controls were performed in two populations. Consistent with the dominant linkage model, DEP2 genotypes were grouped into dichotomous variables such that carriers of a DEP2 risk allele (heterozygous or homozygous) were compared to non-carriers. Following the conditionality of DEP2 linkage on carriage of the HTR1A -1019G allele, this genotype was similarly included in statistical models as a dichotomous variable. Sex and all first-order interaction terms between genotypes or between genotype and sex were also included in statistical models. Non-significant terms (p>0.05) were sequentially dropped from statistical models using a backward elimination process.
[0408] In the Mormon population, DEP2.0004 (odds ratio for the T allele 1.40, 95% confidence interval 1.00-1.94) and DEP2.0007 (odds ratio for the A allele 2.03, 95% confidence interval 0.99-4.48) were associated with major depressive disorder (Tables 6 and 7). For each marker, the frequency of DEP2 allele carriage was highest among -1019G-positive cases, and approximately equal among all other groups. Additionally, the same DEP2 alleles were both linked to and associated with major depression in the Mormon population. There was also a significant DEP2.0004 genotype-by-sex interaction. In an Ashkenazi Jewish population, DEP2.0004 (odds ratio for the T allele 0.59, 95% confidence interval 0.35-0.99) and DEP2.0006 (odds ratio for the A allele 0.43, 95% confidence interval 0.24-0.75) were associated with major depressive disorder (Tables 8 and 9). For each marker, the frequency of DEP2 allele carriage was lowest among -1019G-positive cases, and approximately equal among all other groups. There was no association of DEP2.0006 in the Mormon population (Table 10), or of DEP2.0007 in the Jewish population (Table 11), with major depressive disorder.
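The odds ratios quoted above come from statistical models that also include sex and interaction terms, so they cannot be recomputed from the tables alone. As a simpler illustration, the sketch below computes a crude (unadjusted) odds ratio with a Woolf log-scale 95% confidence interval from carrier counts in a single stratum of Table 6. The choice of stratum and the function name `crude_odds_ratio` are assumptions made for this example.

```python
from math import exp, log, sqrt


def crude_odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Crude odds ratio with a Woolf (log-scale) 95% confidence interval."""
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    odds_ratio = (a * d) / (b * c)
    standard_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    half_width = 1.96 * standard_error
    lower = exp(log(odds_ratio) - half_width)
    upper = exp(log(odds_ratio) + half_width)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Table 6, HTR1A G+ stratum, both sexes: DEP2.0004 T carriers were
    # 197 of 342 cases and 85 of 182 controls.
    estimate, low, high = crude_odds_ratio(197, 342, 85, 182)
    print(f"OR = {estimate:.2f} (95% CI {low:.2f}-{high:.2f})")
```

For this stratum the crude odds ratio is approximately 1.55, which differs from the model-based estimate of 1.40 because no adjustment for the other model terms is made.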
[0409] The DEP2 polymorphisms associated with major depressive disorder differ between Mormon and Jewish populations, and opposite alleles at DEP2.0004 were associated with major depressive disorder between the two populations. This sort of situation is not unusual in psychiatric genetics; in fact, it has been observed for most of the genes that have been linked to schizophrenia (Harrison and Weinberger, Mol. Psych. 10:40-68 (2005)). The most parsimonious explanation for these results is that functional alleles of DEP2 arose on different haplotypes in the Mormon and Jewish populations.
TABLE-US-00003 TABLE 1
Microsatellite Marker   HLOD   Conditional HLOD
D10S1656                0.8    1.9
D10S2322                0.7    1.9
D10S575                 3.1    4.4
D10S214                 3.4    4.2
D10S1703                2.9    3.8
D10S1782                2.7    3.6
D10S1222                2.7    3.5
D10S1727                2.7    3.5
D10S1676                3.3    3.6
D10S1439                2.7    3.1
D10S1134                2.5    3.0
D10S1248                2.8    2.3
D10S505                 2.4    1.6
D10S1770                1.9    1.6
D10S1651                1.8    1.7
D10S590                 1.7    1.7
D10S212                 2.3    1.9
D10S1711                2.0    1.8
D10S1700                2.0    1.8
TABLE-US-00004 TABLE 2
Marker Name   Alleles   Location in SEQ ID NO: 1
DEP2.0001     C, T      2955
DEP2.0002     A, C      3005
DEP2.0003     C, A      33241
DEP2.0004     C, T      38048
DEP2.0005     G, A      38215
DEP2.0006     G, A      77402
DEP2.0007     G, A      77333
DEP2.0008     C, T      79906
TABLE-US-00005 TABLE 3
Marker Name   Linked Chromosomes   Control Chromosomes   P
DEP2.0001     10/22                5/60                  0.0004
DEP2.0002     10/22                5/60                  0.0004
DEP2.0003     10/22                12/58                 0.02
DEP2.0004     13/22                17/60                 0.02
DEP2.0005     13/22                17/60                 0.02
DEP2.0006     14/22                25/60                 0.09
DEP2.0007     4/22                 1/60                  0.02
DEP2.0008     3/22                 0/60                  0.02
rs1466361     17/22                34/58                 0.05
rs3740013     14/22                20/60                 0.003
rs4462251     12/22                18/58                 0.05
rs1063536     6/22                 2/60                  0.004
rs3781412     15/22                25/60                 0.007
rs1436803     7/22                 6/60                  0.009
TABLE-US-00006 TABLE 4
Marker Name   Linked Chromosomes   Control Chromosomes   P
DEP2.0002     10/22                45/176                0.07
DEP2.0003     10/22                32/178                0.01
DEP2.0004     13/22                61/178                0.03
DEP2.0006     14/22                73/178                0.07
DEP2.0007     4/22                 4/178                 0.006
DEP2.0008     3/22                 6/174                 0.07
rs1466361     17/22                103/176               0.11
rs3740013     14/22                75/176                0.07
rs4462251     12/22                86/178                0.65
rs1063536     6/22                 21/174                0.09
rs3781412     15/22                106/178               0.50
rs1436803     7/22                 25/176                0.06
TABLE-US-00007 TABLE 5
Marker Name   Linked Chromosomes   Control Chromosomes   P
DEP2.0002     10/22                105/356               0.15
DEP2.0003     10/22                121/684               0.003
DEP2.0004     13/22                184/686               0.002
DEP2.0006     14/22                251/696               0.01
DEP2.0007     4/22                 17/696                0.003
DEP2.0008     3/22                 20/708                0.03
TABLE-US-00008 TABLE 6
HTR1A                 DEP2.0004 T+     Males            Females
Genotype   Status     Number    %      Number    %      Number    %
G+         Case       197/342   58     69/113    61     128/229   56
G+         Control    85/182    47     42/101    42     43/81     53
CC         Case       33/77     43     20/37     54     13/40     32
CC         Control    28/62     45     11/31     35     17/31     55
TABLE-US-00009 TABLE 7
HTR1A                 DEP2.0007 A+
Genotype   Status     Number    %
G+         Case       29/347    8
G+         Control    7/183     4
CC         Case       3/77      4
CC         Control    3/63      5
TABLE-US-00010 TABLE 8
HTR1A                 DEP2.0004 T+
Genotype   Status     Number    %
G+         Case       51/132    39
G+         Control    35/64     55
CC         Case       29/50     58
CC         Control    13/21     62
TABLE-US-00011 TABLE 9
HTR1A                 DEP2.0006 A+
Genotype   Status     Number    %
G+         Case       71/132    54
G+         Control    49/65     75
CC         Case       34/50     68
CC         Control    17/22     77
TABLE-US-00012 TABLE 10
HTR1A                 DEP2.0006 A+
Genotype   Status     Number    %
G+         Case       122/347   35
G+         Control    75/183    41
CC         Case       27/77     35
CC         Control    27/63     43
TABLE-US-00013 TABLE 11
HTR1A                 DEP2.0007 A+
Genotype   Status     Number    %
G+         Case       10/132    8
G+         Control    4/65      6
CC         Case       1/50      2
CC         Control    3/22      14
Example 2
Detection of a Brain-Specific DEP2 Transcript by Northern Blotting
[0410] Northern blotting was performed using a probe from the 3' UTR (nt. 1295 to 1713) of LHPP (SEQ ID NO:9) on a multi-tissue blot containing poly(A) RNA from the following human tissues: brain, placenta, skeletal muscle, heart, kidney, pancreas, liver, lung, spleen, and colon. The probe used is within sequences that are common to SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, and SEQ ID NO:12.
Methods
[0411] The pre-made poly(A) RNA Northern blot was product #3140 from Ambion (Austin Tex.). PCR was conducted to amplify a product that lies entirely in the 3' UTR of SEQ ID NO:9 (nucleotides 1295 to 1713).
TABLE-US-00014
Forward primer: (SEQ ID NO: 68) GAATCTCCCAAATCCCAGAACTCA
Reverse primer: (SEQ ID NO: 69) ACACCGGGCATGACACCTTCAAGT
[0412] The DNA product was labeled using an Ambion Strip-EZ DNA kit (Ambion, Austin Tex.) and [α-32P] dATP. The blot was hybridized overnight at 42 degrees Celsius in ULTRAhyb Ultrasensitive Hybridization Buffer (Ambion, Austin Tex.). The blot was washed 2×15 minutes at low stringency (2×SSPE, 0.1% SDS) and 2×15 minutes at high stringency (0.1×SSPE, 0.1% SDS). All procedures were carried out per the manufacturer's instructions.
Results
[0413] FIG. 28 shows the existence of at least two DEP2 transcripts. An approximately 1.7 kb transcript was present in approximately equal abundance across all tissues tested. An approximately 1.1 kb transcript was very abundant in brain and observed at low levels in skeletal muscle and lung.
Conclusion
[0414] The 1.7 kb transcript is consistent with LHPP (SEQ ID NO:9) in the literature (Yokoi et al, J Biochem 133:607-613 (2003)). The existence of a novel DEP2 transcript of approximately 1.1 kb was established. Furthermore, this transcript appears to be most abundant in brain.
Example 3
Observation of Enhanced DEP2 Transcript Expression on Microarrays
[0415] Probe sets within DEP2 are present on Affymetrix U133 Plus and U133Av2 microarrays. These probe sets are annotated as recognizing LHPP; however, they are within sequences that are common to SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:11, and SEQ ID NO:12.
Methods
[0416] Datasets from microarray experiments that had been conducted for unrelated purposes were mined to learn additional information regarding the expression level of DEP2 transcripts in normal human tissues. An alpha value of 1E-12 was used for statistical significance.
Results
[0417] Data from Affymetrix U133 Plus and U133Av2 microarrays are shown in Table 11 and Table 12, respectively. Results considered statistically significant are in boldface type.
TABLE-US-00015 TABLE 11
Tissue type             Intensity   p-value
spinal cord (BD)        1192.5      0
univ ref (BD)           658.0       4.10E-30
brain (AM)              616.6       1.00E-27
brain (AM)              604.8       5.88E-39
Caudate nucleus (AM)    592.9       5.61E-45
basal ganglia (AM)      508.4       1.74E-39
hippocampus (AM)        388.4       8.36E-25
brain (AM)              323.4       1.61E-16
brain (BD)              230.5       7.06E-20
hypothalamus (AM)       228.9       8.77E-17
cerebellum (BD)         198.0       4.54E-16
Adrenal gland (BD)      172.3       6.39E-09
univ ref (S)            153.5       1.94E-07
salivary gland (BD)     148.9       0.00013
salivary gland (BD)     128.9       7.24E-08
prostate (BD)           107.1       4.30E-05
testis (AM)             105.1       0.000276
retina (BD)             103.0       0.00229
ileum (AM)              100.8       0.000296
pericardium (AM)        93.5        0.000186
lymph node (AM)         90.6        2.71E-05
Thyroid gland (BD)      87.8        2.28E-06
kidney (AM)             87.4        0.00573
Trachea (BD)            86.2        4.03E-06
aorta (AM)              84.3        0.000571
colon proximal (AM)     81.8        1.23E-05
liver (BD)              77.5        2.72E-06
prostate (BD)           73.7        0.02
Thyroid (AM)            71.0        0.00019
fetal liver (BD)        68.6        0.000435
kidney (BD)             67.5        0.000257
right atrium (AM)       66.2        0.03
colon distal (AM)       65.3        0.000397
testis (BD)             59.5        0.000539
Thymus (BD)             55.1        0.00131
spleen (AM)             54.5        0.00491
Vena cava (AM)          53.2        0.05
jejunum (AM)            47.5        0.02
uterus (BD)             46.9        0.00417
pancreas (BD)           46.2        0.03
bone marrow (BD)        45.9        0.05
Bladder (AM)            41.2        0.00923
Left ventricle (AM)     39.2        0.15
left atrium (AM)        37.8        0.13
duodenum (AM)           35.0        0.04
Thymus (BD)             35.0        0.12
placenta (BD)           34.3        0.01
prostate (BD)           34.1        0.03
right ventricle (AM)    31.3        0.14
heart (BD)              30.8        0.08
lung (BD)               29.4        0.04
bone marrow (BD)        25.6        0.04
breast (AM)             24.8        0.09
Skeletal muscle (BD)    15.5        0.16
stomach (AM)            14.8        0.25
ovary (AM)              14.4        0.21
pancreas (AM)           8.7         0.37
fetal brain (BD)        7.4         0.32
(AM) = purchased from Ambion
(BD) = purchased from BD Biosciences
(S) = purchased from Sigma
TABLE-US-00016 TABLE 12
Tissue type        Intensity   p-value
Frontal cortex     897.9       0
Thalamus           806.1       0
Basal ganglia      634.7       0
Temporal cortex    348.3       0
Occipital cortex   294.6       0
Parietal cortex    274.8       0
Medulla            257.9       0
Cerebellum         253.1       0
Universal 1        83.2        0.02
Heart              47.5        0.22
Stomach            42.2        0.05
Prostate           34.2        0.08
Universal 2        0.7         0.49
Pancreas           -20.7       0.7
[0418] All RNAs were purchased from Ambion.
Conclusion
[0419] Based on observation of statistically significant intensity data for every central nervous system sample examined (except a single fetal brain sample), and lack of statistically significant intensity data for any other sample examined, it appears that DEP2 transcripts are preferentially expressed in the central nervous system. Because the microarray probe sets are complementary to sequences common to several naturally occurring DEP2 transcripts, attributing intensity data from these probe sets specifically to LHPP may be misleading.
Example 4
Tissue Distribution of DEP2 Transcripts
[0420] The tissue distributions of human DEP2 transcripts were determined using quantitative reverse transcription polymerase chain reaction (QPCR). Assays were conducted for each DEP2 transcript for which there was more supportive evidence (bioinformatic or experimental) than a single expressed sequence tag. Because of the linkage and association of DEP2 to major depressive disorder, there was particular focus on the distributions of these transcripts in the brain.
Methods
[0421] Human total RNAs were purchased from either Ambion, Inc. (Austin, Tex.) or BD Biosciences (Franklin Lakes, N.J.).
[0422] Reverse transcription and PCR were conducted using the Invitrogen Platinum Thermoscript One Step System qRTPCR kit following the manufacturer's instructions. 50 ng DNAse-treated total RNA was used as a template for each reaction. All Ct readings were normalized to 28S rRNA. A dilution series of Universal Human Reference (BD Biosciences) was used to generate a standard curve for these analyses. Relative expression levels were determined by the Relative Standard Curve Method described in the ABI Prism User Bulletin Number 2 with 28S rRNA assayed as an endogenous control for each sample. Equivalent reverse-transcription efficiency was assumed for gene-to-gene comparison in the absence of quantitative standards such as purified RNA transcripts.
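The relative standard curve calculation can be sketched as follows. This is a minimal illustration in Python, assuming a log-linear standard curve fitted by ordinary least squares; the dilution amounts and Ct values are hypothetical placeholders and are not data from these experiments, and the function names are likewise hypothetical.

```python
from math import log10


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to get a relative input quantity."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, control_ct, target_curve, control_curve):
    """Target quantity divided by endogenous-control quantity."""
    target_quantity = quantity_from_ct(target_ct, *target_curve)
    control_quantity = quantity_from_ct(control_ct, *control_curve)
    return target_quantity / control_quantity


if __name__ == "__main__":
    # Hypothetical dilution series of reference RNA (ng per reaction).
    dilutions = [100.0, 10.0, 1.0, 0.1]
    target_curve = fit_standard_curve(dilutions, [22.0, 25.3, 28.6, 31.9])
    control_curve = fit_standard_curve(dilutions, [8.0, 11.3, 14.6, 17.9])
    print(normalized_expression(24.0, 9.0, target_curve, control_curve))
```

Fitting a separate curve for the target and for the 28S rRNA control, as done here, avoids assuming that the two assays amplify with the same efficiency.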
[0423] A schematic of DEP2 transcripts is shown in FIG. 29.
[0424] The following primers and probe were used for amplification and detection of DEP2-1 mRNA (SEQ ID NO:2). These primers and probe do not discriminate against a naturally occurring splice variant of DEP2-1 (SEQ ID NO:7).
TABLE-US-00017
Set 1 (inter-exon)
(SEQ ID NO: 35) 5'-CACGTACCCATCAGCCTTCAC-3'
(SEQ ID NO: 36) 5'-CCTGTGGAAGGAGCATACAGT-3'
(SEQ ID NO: 37) 5'-\56-FAM\CCCAGTGACGAGCACCATCCGG\36-TAMSp\-3' (probe)
Set 2 (intra-exon)
(SEQ ID NO: 38) 5'-CAACACTGGCACCTGCAGAT-3'
(SEQ ID NO: 39) 5'-CCACCCCATGCCATCAA-3'
(SEQ ID NO: 40) 5'-\56-FAM\AAGTGGCAGAGCAGCCCCCAGC\36-TAMSp\-3' (probe)
[0425] The following primers and probe were used for amplification and detection of a splice variant of DEP2-1 (SEQ ID NO:5).
TABLE-US-00018
(SEQ ID NO: 35) 5'-CACGTACCCATCAGCCTTCAC-3'
(SEQ ID NO: 41) 5'-CCCGCCTCTCCAAGACCAT-3'
(SEQ ID NO: 37) 5'-\56-FAM\CCCAGTGACGAGCACCATCCGG\36-TAMSp\-3' (probe)
[0426] The following primers and probe were used for amplification and detection of a splice variant of DEP2-1 (SEQ ID NO:6). These primers and probe do not discriminate against another naturally occurring splice variant of DEP2-1 (SEQ ID NO:8).
TABLE-US-00019
(SEQ ID NO: 35) 5'-CACGTACCCATCAGCCTTCAC-3'
(SEQ ID NO: 42) 5'-GGTACACTCATGTCCCCACCAT-3'
(SEQ ID NO: 37) 5'-\56-FAM\CCCAGTGACGAGCACCATCCGG\36-TAMSp\-3' (probe)
[0427] The following primers and probe were used for amplification and detection of LHPP mRNA (SEQ ID NO:9).
TABLE-US-00020
(SEQ ID NO: 35) 5'-CACGTACCCATCAGCCTTCAC-3'
(SEQ ID NO: 43) 5'-GCGCACCGGGAAGTTCAG-3'
(SEQ ID NO: 37) 5'-\56-FAM\CCCAGTGACGAGCACCATCCGG\36-TAMSp\-3' (probe)
[0428] The following primers and probe were used for amplification and detection of a splice variant of LHPP (SEQ ID NO:12).
TABLE-US-00021
(SEQ ID NO: 44) 5'-TGCAAGCGATAGGAGTGGAA-3'
(SEQ ID NO: 45) 5'-GGTTGTCCACGTACCCATCAG-3'
(SEQ ID NO: 46) 5'-\56-FAM\CCCACCAGGCCCAGTGACGAGC\36-TAMSp\-3' (probe)
[0429] The following primers and probe were used for amplification and detection of a splice variant of LHPP (SEQ ID NO:20).
TABLE-US-00022
(SEQ ID NO: 47) 5'-GCGCACCGGGAAGTTCAG-3'
(SEQ ID NO: 48) 5'-TGAAGAACAAAACAGAATGAGAATGTG-3'
(SEQ ID NO: 49) 5'-\56-FAM\CCAGCTGGAGTCATTTATTCACCTTCCTTCC\36-TAMSp\-3' (probe)
[0430] The following primers and probe were used for amplification and detection of a splice variant of LHPP (SEQ ID NO:24).
TABLE-US-00023
(SEQ ID NO: 50) 5'-CCACCAGTTACTTTCAGTATGAAAGCA-3'
(SEQ ID NO: 51) 5'-TATCCTTTCAGAGAAGCAGCAAAAAC-3'
(SEQ ID NO: 52) 5'-\56-FAM\CAGAAATGCCTGCGGCTTTTCCTG\36-TAMSp\-3' (probe)
[0431] The following primers and probe were used for amplification and detection of DEP2-2 (SEQ ID NO:28).
TABLE-US-00024
(SEQ ID NO: 53) 5'-CGCCCCAGACCCAAGAATC-3'
(SEQ ID NO: 54) 5'-CAGGAAGTGCCCATCAGCCT-3'
(SEQ ID NO: 55) 5'-\56-FAM\CCCGCCTCTCCAAGACCATCCCT\36-TAMSp\-3' (probe)
[0432] The following primers and probe were used for amplification and detection of DEP2-3 (SEQ ID NO:30).
TABLE-US-00025
(SEQ ID NO: 56) 5'-AGCACCATCCGGAAGTGAAG-3'
(SEQ ID NO: 57) 5'-GCTGCAAGATCTGTGCATAGGA-3'
(SEQ ID NO: 58) 5'-\56-FAM\CTGATGGTGAAGAGCCTGGAAGAAACCCA\36-TAMSp\-3' (probe)
[0433] The following primers and probe were used for amplification and detection of AK127935 (SEQ ID NO:31).
TABLE-US-00026
(SEQ ID NO: 59) 5'-CAATTTAGGTCGCTGCTATGGA-3'
(SEQ ID NO: 60) 5'-TGGTGACTCAAAGGCCTAATGG-3'
(SEQ ID NO: 61) 5'-\56-FAM\CCTGGCCTCTTAACTCATTTACCCGGG\36-TAMSp\-3'
[0434] The following primers and probe were used for amplification and detection of AW867792 (SEQ ID NO:33).
TABLE-US-00027
(SEQ ID NO: 62) 5'-TGGAGGCAGCCTCGCTTTA-3'
(SEQ ID NO: 63) 5'-TTGGAGGAAGAGTTCTCATGCA-3'
(SEQ ID NO: 64) 5'-\56-FAM\CCCGCAGAACCTCCACGCTGTT\36-TAMSp\-3'
[0435] Cycle threshold (Ct) values were qualitatively interpreted as follows:
TABLE-US-00028
Ct > 35      probably noise
Ct = 30-35   possibly low abundance transcript, reliability assessed by shape of Ct curve
Ct = 25-30   moderate abundance transcript
Ct = 20-25   high abundance transcript
Ct < 20      very high abundance transcript
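The qualitative scheme above can be expressed as a small lookup function. The sketch below is illustrative only. Because the stated ranges share their end points, the handling of boundary values is an assumption made here, and the function name `classify_ct` is hypothetical.

```python
def classify_ct(ct):
    """Return the qualitative abundance label for a cycle threshold value.

    The source ranges share their end points (for example, 30 appears in
    both "25-30" and "30-35"). As an assumption, a shared boundary value
    is assigned here to the higher-Ct (lower-abundance) bin. A value of
    exactly 35 stays in the "possibly low" bin because noise is "> 35".
    """
    if ct > 35:
        return "probably noise"
    if ct >= 30:
        return "possibly low abundance"
    if ct >= 25:
        return "moderate abundance"
    if ct >= 20:
        return "high abundance"
    return "very high abundance"


if __name__ == "__main__":
    for value in (17.54, 22.91, 27.41, 32.0, 40):
        print(value, classify_ct(value))
```

The "possibly low abundance" label still requires inspection of the amplification curve, which this function does not attempt.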
Results
[0436] Observed Ct values are shown in Table 13. For SEQ ID NO:2, only inter-exon results are shown.
TABLE-US-00029 TABLE 13
SEQ ID NO:                    2      5      9      12     20     24     28     30     31     33
Adrenal gland (MP)            27.41  38.66  23.27  22.54  25.68  30.16  38.01  27.7   26.68  26.64
Amygdala (AM)                 20.48  32     22.91  21.83  26.78  30.47  38.93  27.12  27.93  27.15
Basal ganglia (AM)            19.8   36.52  22.84  21.02  26.54  30.86  37.54  26.82  27.87  26.67
Brain (AM)                    19.82  31.78  22.86  21.34  26.37  30.34  36.91  26.36  27.31  26.75
Brain (MP)                    24.5   37.02  22.91  21.99  26.01  29.9   40     26.22  26.21  25.7
Caudate nucleus (AM)          20.36  33.16  22.58  21.05  25.52  29.79  36.34  26.69  25.92  25.63
Fetal brain (AM)              26.24  31.91  23.97  22.83  ND     ND     ND     ND     ND     ND
Fetal liver (MP)              31.03  40     40     23.53  33.18  31.48  39.34  29.88  30.27  29.93
Globus pallidus (AM)          17.54  32.81  22.59  21     26.23  30.74  40     25.71  26.99  26.52
Heart (AM)                    30.96  40     26.49  25.12  40     31.59  39.7   32.78  32.72  29.93
Heart (MP)                    29.59  34.83  23.87  24.38  31.43  30.17  38.33  30.14  28.71  28.33
Hippocampus (AM)              21.09  32.7   22.58  21.22  26.2   30.17  40     26.47  27.03  26.21
Hypothalamus (AM)             19.83  28.99  22.43  20.23  26.66  30.43  33.54  25.85  27.42  26.86
Kidney (AM)                   31.94  32.1   23.38  21.81  31.6   30.06  35.26  29.37  30.24  29.38
Kidney (MP)                   27.93  38.31  22.51  21.31  26.64  30.96  39.4   26.67  27.54  27.62
Liver (AM)                    30.01  40     23.8   22.48  32.56  32.36  40     29.74  30.65  29.51
Liver (MP)                    30.12  40     23.67  22.41  29.47  31.65  40     27.96  28.59  28.12
Lung (AM)                     40     32.29  25.44  25.11  33.74  32.6   39.22  32.82  31.34  29.92
Lung (MP)                     28.01  36.39  24.55  23.66  28.05  31.7   37.84  28.46  28.39  28.21
Medulla (AM)                  18.77  40     22.5   21.2   26.17  30.87  40     25.52  27.09  26.5
Orbital frontal cortex (AM)   18.84  30.1   22.47  20.92  25.86  30.12  35.83  25.65  26.96  25.94
Peripheral leukocytes (BD)    25.15  29.52  24.18  23.35  27.4   31.12  31.81  27.7   27.09  27.52
Placenta (MP)                 29.52  40     25.02  24.75  33.54  31.93  38.76  35.4   29.48  28.93
Pons (AM)                     19.05  38.91  22.25  20.7   26.34  30.59  40     25.05  27.54  26.94
Prefrontal cortex (AM)        20.25  37.69  22.59  21.34  26.06  30.06  40     26.66  27.15  26.04
Prostate (MP)                 28.26  33.94  24.84  23.54  28.4   31.48  34.97  29.31  28.3   28.09
Salivary gland (MP)           29.25  31.94  25.1   24.14  31.38  31.47  34.7   29.37  29.39  29.02
Skeletal muscle (MP)          30.18  40     26.01  25.27  40     33.16  40     30.85  30.33  29.51
Small intestine (AM)          28.42  36.4   24.03  23.19  30.37  31.34  39.58  29.81  29.4   29.17
Spinal cord (AM)              18.12  31.91  21.89  20.59  27.21  31.32  37.64  24.45  27.67  26.99
Spinal cord (MP)              22.88  37.62  22.34  21.29  27.4   31.32  39.49  24.74  27.56  27.32
Spleen (MP)                   28.97  27.06  25.32  24.34  30.36  32.28  29.81  29.17  28.68  28.91
Testis (MP)                   25.85  40     22.65  21.96  24.48  27.75  40     26.53  24     25.04
Thalamus (AM)                 19.81  40     21.81  20.5   26.29  30.09  40     26.28  27.33  26.48
Thymus (MP)                   26.57  28.63  24.44  23.52  27.08  30.48  31.79  29.56  27.04  27.26
Thyroid gland (MP)            27.56  32.25  22.87  21.72  26.23  30.05  35.03  26.86  26.96  27.28
Trachea (MP)                  27.85  31.73  24.44  23.53  28.36  31.18  34.36  28.83  28.2   28.12
Uterus (MP)                   28.04  40     24.54  23.79  28.84  32.16  40     29.17  28.3   28.31
(AM) = purchased from Ambion
(BD) or (MP) = purchased from BD Biosciences
ND = not done
Data for SEQ ID NO: 2 are from primer/probe set 1.
[0437] DEP2-1 (SEQ ID NO:2) was detected as a very high or high abundance transcript in all central nervous system samples tested (Ct range 17.5-24.5) except for fetal brain (Ct 26.2), and as a moderate or low abundance transcript in other tissues (Ct range 25.2-31.9) except for lung (Ct 40). Similar results were obtained using a second set of primers and probe within the first exon of DEP2-1 (nucleotides 1-316 of SEQ ID NO:2).
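The Ct ranges quoted for DEP2-1 can be rechecked directly from the SEQ ID NO: 2 column of Table 13. The sketch below is a minimal illustration in Python; the grouping of samples into central nervous system and other tissues is an assumption made for this example, and the helper name `ct_range` is hypothetical.

```python
# SEQ ID NO: 2 Ct values copied from Table 13 (primer/probe set 1).
# Assumption: these samples are treated as central nervous system tissue.
CNS_CT = {
    "Amygdala (AM)": 20.48, "Basal ganglia (AM)": 19.8, "Brain (AM)": 19.82,
    "Brain (MP)": 24.5, "Caudate nucleus (AM)": 20.36,
    "Fetal brain (AM)": 26.24, "Globus pallidus (AM)": 17.54,
    "Hippocampus (AM)": 21.09, "Hypothalamus (AM)": 19.83,
    "Medulla (AM)": 18.77, "Orbital frontal cortex (AM)": 18.84,
    "Pons (AM)": 19.05, "Prefrontal cortex (AM)": 20.25,
    "Spinal cord (AM)": 18.12, "Spinal cord (MP)": 22.88,
    "Thalamus (AM)": 19.81,
}

# All remaining samples in Table 13.
OTHER_CT = {
    "Adrenal gland (MP)": 27.41, "Fetal liver (MP)": 31.03,
    "Heart (AM)": 30.96, "Heart (MP)": 29.59, "Kidney (AM)": 31.94,
    "Kidney (MP)": 27.93, "Liver (AM)": 30.01, "Liver (MP)": 30.12,
    "Lung (AM)": 40, "Lung (MP)": 28.01,
    "Peripheral leukocytes (BD)": 25.15, "Placenta (MP)": 29.52,
    "Prostate (MP)": 28.26, "Salivary gland (MP)": 29.25,
    "Skeletal muscle (MP)": 30.18, "Small intestine (AM)": 28.42,
    "Spleen (MP)": 28.97, "Testis (MP)": 25.85, "Thymus (MP)": 26.57,
    "Thyroid gland (MP)": 27.56, "Trachea (MP)": 27.85,
    "Uterus (MP)": 28.04,
}


def ct_range(values, exclude=()):
    """Return (min, max) Ct over the samples not named in `exclude`."""
    kept = [ct for name, ct in values.items() if name not in exclude]
    return min(kept), max(kept)


if __name__ == "__main__":
    print("CNS:", ct_range(CNS_CT, exclude=("Fetal brain (AM)",)))
    print("Other:", ct_range(OTHER_CT, exclude=("Lung (AM)",)))
```

With fetal brain and the Ambion lung sample excluded, the two ranges are 17.54 to 24.5 and 25.15 to 31.94, which round to the ranges stated in the text.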
[0438] A splice variant of DEP2-1 (SEQ ID NO:5) was reliably detected in spleen, thymus, hypothalamus and peripheral leukocytes (Ct range 27.1-29.5).
[0439] A splice variant of DEP2-1 (SEQ ID NO:6) was not reliably detected (not shown).
[0440] LHPP (SEQ ID NO:9) was detected as a high or moderate abundance transcript in all samples tested (Ct range 21.8-26.5) except fetal liver (Ct 40). Expression in central nervous system was in general slightly higher than in other tissues.
[0441] A splice variant of LHPP (SEQ ID NO:12) was detected as a high abundance transcript in all samples tested (Ct range 20.2-25.2). Expression in central nervous system was in general slightly higher than in other tissues.
[0442] A splice variant of LHPP (SEQ ID NO:20) was detected as a moderate abundance transcript in all central nervous system samples tested (Ct range 25.5-27.4), as well as in several other tissues.
[0443] A splice variant of LHPP (SEQ ID NO:24) was detected as a moderate abundance transcript in a few samples, and as a low abundance transcript in all others.
[0444] DEP2-2 (SEQ ID NO:28) was detected as a low abundance transcript (smallest Ct 29.8).
[0445] DEP2-3 (SEQ ID NO:30) was detected as a moderate abundance transcript in all central nervous system samples tested (Ct range 24.4-27.1), as well as in several other tissues.
[0446] AK127935 (SEQ ID NO:31) was detected as a moderate abundance transcript in all central nervous system samples tested (Ct range 25.9-27.9), as well as in several other tissues.
[0447] AW867792 (SEQ ID NO:33) was detected as a moderate abundance transcript in all samples tested (Ct range 25.0-30.0). Expression in central nervous system was in general slightly higher than in other tissues.
Conclusions
[0448] SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:9, SEQ ID NO:12, SEQ ID NO:20, SEQ ID NO:24, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:31 and SEQ ID NO:33 are naturally occurring transcripts arising from DEP2 (SEQ ID NO:1). Some of the signal observed for SEQ ID NO:2 may be attributable to SEQ ID NO:7; see Examples 5-7 below for independent experimental evidence that SEQ ID NO:2 is a naturally occurring transcript arising from DEP2. Failure to detect SEQ ID NO:6 (or SEQ ID NO:8, which would be amplified and detected by the same primer/probe set) cannot be taken as evidence that it is not a naturally occurring transcript arising from DEP2, without use of a positive control to ensure that the assay worked. Expression of transcripts arising from DEP2, relative to 28S rRNA, was generally higher in the central nervous system than in other tissues. The difference between central nervous system and other tissues was strongest for DEP2-1 (SEQ ID NO:2).
Example 5
Establishing the Sequence of DEP2-1
[0449] DEP2-1 (SEQ ID NO:2), a novel sequence that has not been described previously, comprises distinct protein coding capacity for Dep2-1a (SEQ ID NO:3) and Dep2-1b (SEQ ID NO:4), and is highly and preferentially expressed in the central nervous system. These characteristics make it of particular interest as a candidate to explain the linkage and association of DEP2 to major depressive disorder. DEP2-1 clones were sequenced to establish whether the sequence predicted by mining EST sequence databases was correct.
Methods
[0450] IMAGE clones h3175509, h5194531, h5197955 and h4565014 were obtained from the American Type Culture Collection ("ATCC"), P.O. Box 1549, Manassas, Va. 20108. DNA sequencing was performed using standard methods well known to those practiced in the art.
Results
[0451] Aligned sequences of the four IMAGE clones and the sequence predicted by Genecarta software from EST sequences are shown in FIG. 30. Clone h4565014 contains sequence, downstream of a polyadenylate tract, that does not match DEP2.
Conclusion
[0452] The full-length sequence of DEP2-1 has been established. A novel single nucleotide polymorphism has been identified.
Example 6
Characterization of the 5' Ends of DEP2-1 by RLM-RACE
[0453] RNA ligase-mediated rapid amplification of cDNA ends (RLM-RACE) was performed on a pool of human spinal cord RNA to determine the 5' ends of DEP2-1 (SEQ ID NO:2).
Methods
[0454] Human spinal cord total RNA (#636554 (also called 64113-1)) was obtained from BD Biosciences Clontech (Palo Alto, Calif.). The FirstChoice RLM-RACE kit was purchased from Ambion (Austin, Tex.). The following gene-specific RACE primers were used.
TABLE-US-00030
5'RACE gene specific outer primer: (SEQ ID NO: 65) TCTCCCACTGTATGCTCCTTCCA
5'RACE gene specific inner primer: (SEQ ID NO: 66) CTCTGCCACTTCATCTGCAGGT
[0455] RLM-RACE was performed using 10 μg human spinal cord total RNA according to manufacturer's instructions, except as noted below. Total RNA was treated with calf intestine alkaline phosphatase to remove free 5'phosphates from molecules such as rRNA, fragmented mRNA, tRNA and contaminating DNA (this step entailed treatment with 3 μL calf intestine alkaline phosphatase (CIP) for 1.5 h at 37° C.). Full length mRNA molecules that contain a 5' methylated guanosine (CAP) should not have been affected by this treatment. The RNA was then treated with tobacco acid pyrophosphatase to remove the CAP structure from full length mRNA, leaving a 5'phosphate. An RNA adapter oligonucleotide was ligated to full-length mRNA using T4 RNA ligase. The adapter could not ligate to dephosphorylated RNA molecules since they lacked a 5'phosphate. A random-primed reverse transcription reaction and nested PCR (using gene-specific inner primers) were then used to amplify the 5'ends of a specific transcript. As a negative control, RNA was treated with calf intestinal alkaline phosphatase but not with tobacco acid pyrophosphatase, such that T4 RNA ligase should have had no substrate to ligate to the RNA adapter oligonucleotide. RLM-RACE products were separated by electrophoresis, extracted from the gel, purified, and sequenced using standard methods well known to those practiced in the art.
Results
[0456] RLM-RACE was performed on 2 different lots of human spinal cord mRNA, with identical results (FIG. 31). Two bands (approximately 168 and 243 nucleotides) were observed using tobacco acid pyrophosphatase-treated RNA. These were not observed in negative control reactions.
[0457] Sequencing revealed 5'cDNA ends at nucleotides 1 and 76 of SEQ ID NO:2 (FIG. 30). Clean splices at the junction between the 5'RACE Adapter and the mRNA were not observed. This suggests that there was a mixed population of DNA within a single band on the gel, and that transcription initiation may also occur within a few bases upstream or downstream of the major transcription start sites.
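The band sizes and the sequenced 5' ends can be cross-checked with simple arithmetic. If both RACE products carry the same adapter sequence and end at the same gene-specific inner primer, their lengths should differ by the distance between the two start sites. The sketch below illustrates that check in Python under exactly those assumptions; the function name `expected_band_shift` is hypothetical.

```python
def expected_band_shift(start_a, start_b):
    """Length difference expected between two RACE products that share
    the same adapter and the same gene-specific inner primer."""
    return abs(start_b - start_a)


if __name__ == "__main__":
    # Approximate band sizes (nucleotides) reported for the two products.
    observed_shift = 243 - 168
    # 5' cDNA ends reported at nucleotides 1 and 76 of SEQ ID NO: 2.
    predicted_shift = expected_band_shift(1, 76)
    print(observed_shift, predicted_shift)
```

Both values are 75 nucleotides, so the approximate band sizes and the sequenced ends are mutually consistent under the stated assumptions.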
Conclusion
[0458] Two major transcription start sites of DEP2-1 have been identified.
Example 7
Characterization of the 5' Ends of DEP2-1 by Exon-Bridging PCR
[0459] Two exon-bridging RT-PCR experiments were conducted to learn whether DEP2-1 is a naturally occurring splice variant of LHPP, or originates only from one or more distinct transcriptional start sites. In a first experiment, PCR was conducted using a reverse primer within the first exon (nucleotides 1-316) of DEP2-1 (SEQ ID NO:2) and a forward primer within an upstream exon of LHPP (SEQ ID NO:9). This PCR was designed to amplify any transcript originating from the LHPP transcription start and comprising exon 1 of DEP2-1. In a second experiment, PCR was conducted using a forward primer within an upstream exon of LHPP and a reverse primer in the exon common to LHPP and DEP2-1. This PCR was designed to amplify any transcript containing both exons of DEP2-1 as well as upstream sequences from LHPP.
Methods
[0460] cDNA was prepared from 3 different lots of human spinal cord mRNA (BD Biosciences Clontech) using the SuperScript III First Strand Synthesis System (Invitrogen, Carlsbad, Calif.). The following primers were used.
TABLE-US-00031
Experiment 1
Forward: (SEQ ID NO: 44) TGCAAGCGATAGGAGTGGAA
Reverse: (SEQ ID NO: 39) CCACCCCATGCCATCAA
Experiment 2
Forward: (SEQ ID NO: 44) TGCAAGCGATAGGAGTGGAA
Reverse: (SEQ ID NO: 67) CACGTACCCATCAGCCTTCAC
[0461] PCR, electrophoresis and sequencing were performed using standard methods well known to those practiced in the art.
Results
[0462] In experiment 1, products were observed following PCR and agarose gel electrophoresis (FIG. 32). These were weak and not consistently observed. Sequencing revealed that these were PCR artifacts not containing any sequence from SEQ ID NO:2. In experiment 2, products were observed following PCR and agarose gel electrophoresis (FIG. 33). This was expected, as the primer pair used amplifies LHPP (SEQ ID NO:9). No sequence from exon 1 (nucleotides 1-316) of SEQ ID NO:2 was detected in these products.
Conclusion
[0463] DEP2-1 and LHPP do not share a transcriptional start site.
Example 8
Expression of LHPP, a Naturally Occurring Splice Variant Thereof, and DEP2-1, in Human Neuronal and Glial Cell Lines
[0464] To establish feasibility of cell-based assays to screen for compositions that modulate the activity or expression of DEP2 products, quantitative PCR ("QPCR") experiments were conducted to detect expression of DEP2-1 (SEQ ID NO:2), LHPP (SEQ ID NO:9) and a naturally occurring splice variant thereof (SEQ ID NO:12).
Methods
[0465] Cells of six ATCC cell lines (SH-SY5Y, SK-N-SH, LN18, H4, Ntera2 and U87MG) were suspended in RNALater (Ambion, Austin Tex.). Total RNA was isolated from each cell line using TRIZOL reagent (Invitrogen, Carlsbad Calif.) and purified using RNeasy columns (Qiagen, Valencia Calif.). Reverse transcription and PCR were performed as described in Example 4.
Results
[0466] Observed Ct values are shown in Table 14.
TABLE-US-00032 TABLE 14
SEQ ID NO:   2       9       12
SH-SY5Y      40      40      40
SK-N-SH      31.71   24.82   23.65
LN18         31.02   25.16   24.01
H4           30.04   25.79   24.72
Ntera2       29.35   24.91   23.76
U87MG        31.57   27.19   26.17
Conclusions
[0467] Cell-based assays to screen for compositions that modulate expression of SEQ ID NO:2, SEQ ID NO:9 or SEQ ID NO:12, or that modulate expression or activity of the corresponding proteins Dep2-1a (SEQ ID NO:3), Dep2-1b (SEQ ID NO:4) or Lhpp (SEQ ID NO:10), may be feasible using five of the six tested cell lines.
[0468] One skilled in the art would readily appreciate that the present invention is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. The molecular complexes and the methods, procedures, treatments, molecules, specific compounds described herein are presently representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the invention. It will be readily apparent to one skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention.
[0469] All patents and publications mentioned in the specification are indicative of the levels of those skilled in the art to which the invention pertains. All patents and publications are herein incorporated by reference to the same extent as if each individual publication was specifically and individually indicated to be incorporated by reference.
[0470] The invention illustratively described herein suitably may be practiced in the absence of any element or elements, limitation or limitations that is not specifically disclosed herein. Thus, for example, in each instance herein any of the terms "comprising," "consisting essentially of" and "consisting of" may be replaced with either of the other two terms. The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Thus, it should be understood that although the present invention has been specifically disclosed by preferred embodiments and optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention as defined by the appended claims.
Sequence Listing
Number of SEQ ID NOs: 82
SEQ ID NO: 1
Length: 159047
Type: DNA
Organism: Homo sapiens
Sequence: 1
gtagatttgc tataacaggc tgaaagctca gtcagtggtg
tatgcttcct ggcacactat 60ggtgttcatg tgccacctgg ctagctctgg cccaaaagat
gcaaatgaga gtccattgga 120tcagggttct gggaaagctt ttgctttctg ggtataaagg
gtcagactca gctgtcacat 180gccctttttt cttttctttc ccctccttcc tgcctgaaat
gcagtcatga tgcctcgggt 240ggagcaggaa tcttgctagc gtgaggagca ggccacatgc
tgagagagtg gtggagcaga 300aagcaggagg cagcctgatg gtcctgggag ctgttgtccc
atccaggatt tcctccctct 360tggcttcttg atatatgagg aaaagtaaaa tctgatttgg
ttgggcaact atggctgggt 420ttctgttaca tgtagccaaa tgcaacccta acatagacaa
tgccctacac tcattgcatc 480ccaagaaaag cctggaattt tcagatcaag ttttagccta
tcagtggccc acacttagag 540acttgccaca tttttcaggt gagggacctg gatccctgta
tttgtttctt gtcactactg 600caagaaatta ccatatactt agtggcttaa aacaacagat
ttattattat agttctggag 660gtaagaagtc ctaaacttga agtaccagca gggctgcatt
ccttctggag gctctagggg 720agaatccaat tctttgcctt ttttagcttc cagaggccac
ctgcatttct tggctcatgg 780cccctttgta ttcaaggcca gtagcacggt atcttcacac
ctctctctct gacctctgct 840tctgctgtca cacctccttc tctgactctg actctctcct
gcttccctct ttcccttatg 900aggactcatg tcattatact gggcccacct ggaccatcca
ggataatctc cctacaggag 960cataggctct gcaagatcta agtccctttt gccatgtcag
gtaatatatt tagaggttgt 1020ggggattggg aggtggacat ctcaaggggg cactattctg
tctactgtag tcgtagaaat 1080gaaaagggac ttgcccaaga gtacccacag agctgactga
aaagaaacaa ctatgggtag 1140gctcacacca ctggcagaag cacccctttt cctcctcctt
ggcatgaagc taaagcccag 1200ccattctcca aggtccattt caggctgccc ctgcccagaa
gccacagctg ccctcctagg 1260cctgcagagg tcatcctctt gactgtgtct gtgtcactca
ttgccacttg gcacacattt 1320ggtcagcagt cgtgaaggtc ttccattgtt ctctgcccag
tcttgatcct ggaaggcaag 1380gatcacaccc tagaactttt ttgtgcccac tgcacagagt
agggtctcaa tacgtgtacg 1440gttcatgatg ttcctagcag attctccaaa ctgactcact
cagcccctgg aacactaaaa 1500actgcatttc ccagactcct ttgcagcgtg ggctaggttc
caccaagcca gacacagtga 1560aagtgagtgg gtggccacag gtggagggca cagtggtgga
tatgttttgt tcttttgggg 1620gcagctatgc tgggggctgc actaagtgat ggggcagtgg
tcccaatgga gtagtaaaat 1680taagtgtttc tcgtggctgc atggctcctg gctgtgtagg
acttagaccc gtcatctgta 1740agctccttca cctcatgtaa acaaaacccc tctagctgga
gtggtgtcta ttgtctgcga 1800ctaagaaccc ttatataagg tatactactc cccaacccca
ttagtggaaa tcccaaaggg 1860taggaactgt attttatttc acttgtaaac agctccccta
gtaagcatgt caacaaaata 1920tacacaattc attgaacccc ataacatttc aacgaattcc
tcatcctttc tgtgaatcaa 1980gagcctgaaa agaaatggtg aaataatatg atcctctctt
ctttgaaagc tcaaagctat 2040gttggaccag aagtaaagtg ttctcgtttc tatttaataa
cttgaaaggt tccgaggggc 2100cattgaggaa actcctccct tttaatatca atgtgtattt
attgcaaaaa taatgtagca 2160tcgagtggta ttttatagct tatccaaaaa cctcctgggt
ttaacgcatt gtgatagtcc 2220cgttttcttc tcagcccagg tcctatgcat cctcatctat
gcagggctgt tatctgcata 2280taattttttt tttttttaag acaaagtctt gctctgtcgc
cccggctgga gtgcagtggt 2340gcaatctcgg ctcactgcaa cctccgcctc ccaggttcaa
gcggttcttc cgcctcagcc 2400taccgagtag ctgggactac aggcatgcgc caccacacct
aggtgatttt tgtattttta 2460gtagagacag gggtttcacc atgttgacca ggctggtctc
gaactcctga tctcaagcga 2520tccacccgcc tcagcctccc aaagtgctgg gattacaggc
ataagccact acgcccggcc 2580tcaattttgt attgtacttt ttctttcttt ctttaataga
gacagggtct cactatgttg 2640actaggttgg tctagaactc ctgggcacaa gctgtccgcc
cgcttctgcc tcccaaagtg 2700ctgggattgc aggcgtgaac caccgcccct ggctacaggt
gccttcttgt ctcaatttgc 2760ctttgacctt tcttagggac ttgttttctg cttttcctgc
tctttgtccg ctgatctcct 2820gggaagaaag cttccgaaaa ggacaccgtt tcaggggcga
gtgacgccgg ggtgcccagg 2880ccgcgcccca gttccgggtt tgcacccggt cttcttgccc
tgccccgccc gcgggactac 2940agttcccagg cgcccctgcg cggccgcggc gccggcgccg
gcgtcggttg ggacgcggag 3000gcggagctga ggagcagggc cgggcgccat ggcaccgtgg
ggcaagcggc tggctggcgt 3060gcgcggggtg ctgcttgaca tctcgggcgt gctgtacgac
agcggcgcgg gcggcggcac 3120ggccatcgcc ggctcggtgg aggcggtggc caggtgagtg
ggccccggga cgccgctggg 3180gccgccgagc tctaagctca gcccgctccc tggctgccgg
cagggggcgg ggcggcaggg 3240ggcggggccg cggcgcaggc cccgcctcgg tctccccctt
cccaccccgg tgcgcgcaca 3300gtgctgacca cggacgaccc cactgttgcc cccggcgagc
accaggactc tgctggttag 3360ggctgcgcgg tcagacaggg cggccacctg gtacgtgcgc
tgctgagcgc ttgacctgcg 3420gccagtctga attgagatgt gctgtaagcg taaaatgcat
accgattaac aagacttagc 3480cgggcgcggt ggctcgcatc tgtgattcta gcacttgcag
aggcagagga gggaggatcg 3540cttgaggcta gggggttgga gaccaccctg ggcaacatag
ggagaccccg tctctgccaa 3600aaatttaaaa attagccggg tgtggtggtg cacgcctgta
ggcccagcca cttgggaggc 3660tgaggcagga ggatcgcttg agcccaggag gtcaaggcta
cagtgagctg tgatcacccc 3720actgtactct agcctggact gcagagcgag acggtctcat
aaacaaaacc cccaaaaaca 3780ccagacttcg tgagtaaaat gtaaaataaa aacatgaaaa
tgggttatgt tgtttgcctg 3840ttgaaatgat atactttttg tttgttttgt tttgttcttt
tcagatggag tcttgctctg 3900tcgcccaggc tggagtgcaa tggcacgatc tcggctcact
gcaacctccg cctcccgggt 3960gcaagtgatt ctcccccttc aggctcccga gtagctggga
ttacaggcac ccgccatcat 4020gcccagttac tttttttatt tgttttgttt gagacggagt
tttgctcttg ttgcccaggc 4080tggaatgcag tggtgcaatc tctgctgact gcaaccgccg
cctcccaggt tcaagcgatt 4140ctcctgcctc agcctcctga atagctggaa ttacaggcat
gtgccaccac acccggctaa 4200ttttttgtat ttttagtaga gacggggttt caccatgttg
ggcaggctgg tctctaactc 4260ctgacctcag gtgatccacc tgcctcggcc tcccaaagtg
ctgggattat aggcgtgagc 4320cacatgccca gttgataata tttttaatat actgttttag
gttaaaatat ttgaaattaa 4380gtataccttt tttttttaac ttttgttaat gtgcttatta
gaaaacttta aattaccttt 4440gtggctctca ttataattct gttggactgc attggtttag
gtgagaggag gaggaagaga 4500acatcgtgcc agccaagact tgaccatgag gacacttttt
tttttttgag acagttttgc 4560tcttgttgcc caggccggag tgcagtggcg ccatctgggc
tcactgcaac ctctgcctcc 4620caggttcaag tgattctcct gcctcagcct cccaagtatc
tgggattaca ggcacccgcc 4680accacgccca gctaattttt gtatttttgg tagacatggg
gtttcaccat actggctagg 4740ctggtctcga actcctgacc tcaactgatc cacctgcctg
ggccttctaa agtgctggga 4800ttacaggcgt gagccaccgc acctggctga cacttttctt
tgtataattt ttttttttga 4860gacaaggtct cattatgttg ccagggtggt cttgaactcc
ttggctcagg cagtcctcct 4920gccttggtct cccaaagtgc taggattaca ggcatgagcc
accacacctg gcctaatttt 4980tttttctttt tttttttttt tttttgagac aaggtcttac
tttgccaccc aggctggagt 5040gcagtggttc aatctcactg caacctctac ctcaagggct
caagagatcc tcccacctca 5100gcctcccaag tagctggaac tacaggcacg tgccaccatg
cccagctaat ttttttgtgt 5160tttttgtaga gatggggttt tgccatgttg cccaggccgg
tcttgaactc ctgggctgaa 5220gtgatctgcc tgcctcagct cccagagttc tgggattaga
ggtgtgagcc actgtggcca 5280gccagaagga cactttcagt agcatggttg gcaccactgg
acctgggctt tgtaaggcct 5340cctgttggca tcccaggtca gcactgccat gtgttgatag
gcagccggga agggctaact 5400gccaggactt ggcctcccgg agtcactagc tgaggcaccc
tggcttggta gcctgatgtc 5460actgggactc aatataccta tctgtaggtg gcaattaact
gagtctcttg gtccttctta 5520gagaccccat tctaccaatc tgcatgtctg atgtctggga
gctgctgaga taatagggac 5580ccaaacggtc ccgcatagtg aagccggagg tgtttgggag
gctgggaact gggaacaggg 5640agacttggct tccaggagca agcacaaagg gcagcccaga
cctaggacac tccgctgggt 5700cctgggtctg ctctcagtgt tggctgaagt tcagggttaa
gtgcattcct gaccttattc 5760tccattactc caaagcttta gactagctca gtcacagatg
agagcattct gtgagtggca 5820ctggtgggaa tccagcagag ggctctgggc agtgttttgg
ggtgaagtgg gcctcatctg 5880gggagcagga agccagagct caatgctgtg actggggcag
gcaaggcctc atgggagacc 5940ctgccactca ttgcttgcgg aagatacatt tgacattatc
taccagtggt tccacttctg 6000agaatttgtc cttgtctcat gcgtgagaaa tgacatgagt
acaaggtcat tcctggcagt 6060accttttaaa tagcaaaaga ctggaaacag ctcagatacc
catcagtagg ggagtattag 6120ggtctgatgt gtggagggga gagggacagt aaagctgcag
gcacacaggt aaatcatcct 6180cacaggtggg cacacgggta aagcgcatca tcacaggtag
gcctaagggg tgtgaagacg 6240ggagggtact gagagtgtga tggacgaggg tggcctgtgc
tagagagggg gtcgggacag 6300ctctgaggag aaggcactgg acagacccgg agggataaga
gggaactgtc tgcaggaaga 6360ggcggaggaa gcgcattcca ggcagaagga atggtagctg
caaagacctc aggaagagag 6420cacttggcga gttcgagggc gtggggtggg ggagtgggta
caccaggcag gaacttgagg 6480ggcacggaga ggtgtccgga ctttatcctg agtggcagga
agtcagtgaa gactctagat 6540aggcatatca tgtggtctga tttatatgtt ttaaagaaat
acaaatgtaa aaagtgagac 6600atacaagtca aatgaaagca agggagattt cctttgaaac
cttggagaga aggcctttac 6660atccaggcct tgaaatctag aaagcaaaaa caacgaacaa
gcagcccaat agcaatatga 6720gcaaaagatg caaacaggtc atggaagagg aagtgagaga
ggctccgcct tactcaaata 6780agaggtgtgt gatctcggct cactgcagcc tctgcctcct
gggttcaagc cactcttgtg 6840cctcagcctc ctgagcagct gggactacag gcatatgcca
ccgtgcccag ctaatttttg 6900gtatttttag tagaggcagg attttaccat gatggccagg
ctggtcccga actcctggct 6960tcaagtgatt cacccacctt ggcctcccaa ggtgctggga
tcacaggcgt gagccactgc 7020gcccagcctg agaaatgcaa attaaaacta tatcaagctg
ccctgtttca tcaatttgac 7080aaaattggcc aaaatccaaa agtttgagaa cactttctct
gtgagagttt gtggaaacag 7140accctgccac acattgcttg cagaagatac atttgacatt
atctaccagt ggttgcgctt 7200ctgagaattt gtcctatctc gtgtgagaaa ggacacaagt
acaaggtcat tcctggcaat 7260gccttttaaa tagcaaaaga ctggaagcag ctcagatatc
catcagtagg ggagttttag 7320ggtctgtctg tttatacagc agaataacat gcagtgtaag
aaggaatgag taccttgtct 7380atagacatat tactagtgga gacagcaagc acacagcagg
atgcacggtg cgctgtttct 7440ggtatagaaa atgcagaaat gcaagaatgt atttctcctg
ctgtgtttgc ataaagacac 7500tggaaggacg caggctgctt cagagtctgg gaagagctgg
aggataagga aagtgggggc 7560cactgtgtgg ttggagacag ggtaggagaa agactcttca
ctgtataact ttgtatacca 7620acgtttttgt ttttgttttt gaaacagagt gttgctctgt
cacccagatt ggagggcagt 7680ggtgcaattt tggcttactg taacctctgc ctcctgggtt
caagtgatta tcctgcctca 7740gcctcccgag tagctgagat tacaggcacc tgccaccatg
cctggctaat ttttgtattt 7800ttagtagaga tggggttttg ccacattggc caggctggtt
tcgaactcct gacctcatgt 7860gatctgccca ccttggactc ccaaagtgct gggattacag
gtgtgagcca ccgcgcccag 7920cctgcatatc atttttaaga tttttaaact atgtgaatgt
gttttctatt caaaatatta 7980caattcaaat attaaattga atattaaaat tcaaaatatt
aacgtctggg cacagtggct 8040catacctgta atcccagcac tttgggaggc tggaggtaga
aggattgaga ccagcctggg 8100caacacagcg agacccccat ctctacaaaa aatacaaaaa
ttagccgggc atggtggtgc 8160acgcctgtag tcccaactac ttgggaggct gaggcaggag
aactgcctga gcccaggagt 8220ttgaggctgc cgtgagctat gaatgtgcta ctgcactcca
gcctgggtga cagagtgaga 8280ctctaaaaaa aaaaaccatt gttttttaaa taaaatattt
caaggagatc tctctggctg 8340cagtgtggag aacaggctgg cagtagggca ggaggtgggg
gggcttggac tgggtagaag 8400gcaatggaga tggggaggaa ggcatgaatt aaagaggtgt
tttggaagca ggacctatgg 8460gacctgcgga catacggggt attgtggggg tcagggagaa
cttgaaataa agggtggttt 8520ccggtgtctg actggaacag ctggcagggt ggagaggggg
cagggagggg gtcatggggc 8580gggaaccatg atggctcatg atgtatgtca agttcctggc
ctggtgtttg gcacacagta 8640gacccacaac agtgtaacat taaagaccac tttcttttct
ttttcttttt tttttttttt 8700tgagacagga tcttgatctg tctcccaggc tggagtgcag
tgacacaatc acggctcact 8760gcagccttga cctcctccta ggctcaaaaa aatctttcta
cccaagtttc ccagatagct 8820gggactacag gtatacctcc acatgcaggt ttttcttttg
tgtgtgtgtg tgttttttgt 8880tgttgttgtt gttgttgttt tgagacagtc tcactctgtc
tcccaggctg gagtgcagtg 8940gtatgatctc ggctcactgc agcctccgcc tcctgggttc
aagggattct cctgcctcag 9000agtagctggg attacaggtg cctgccacca tgcctggtta
atcatttgta tttttagtag 9060agacggggtt tcaccatgtt gaccaggctg gtctcaaact
cctgacctca agtgatccac 9120tcgcctgggc ctcccaaagt gctgggatta caggcatgag
ccaccatgcc cagcccccaa 9180gggacagttt gagccctggt ctgcccaacc cagaccccta
cttggaatcc ttcagggaga 9240agggggtgtt gggaaatgtc acaggcttct ctaacagctt
attttgagca gatgaccccc 9300acgtgatata gaactgtcca acaaacgtgc agattccagg
ttgtgacatt ggaaagggtt 9360ctgttaactt ctctgggttc tgggttggtg tcttctgact
gattgattgc caggagggtt 9420ttgtttggat ttgctttggc tcctgcagat ttatctagct
gggggtgtct ctggtagcag 9480ctatactgta tacatctaac agcaatgtaa taacaatccc
gaaatacacc acagcctgtt 9540atgtagctgg gagaggctaa aatctccctt cccaaaagca
tatgggatta tttttggtga 9600ggggacatca cacccgtgat atgggaggtt ttgttattgt
gctgtctgtg ggcctgcaga 9660accgatggcc tggtctctca accccatggg cttcccatgc
cataggacca agtccccatg 9720tcctgccttg tccacgagga ccttctcaac tgacctttgt
ggcctcttcc actggccccc 9780agttacacag acttcttggt ggggttttca ctgaaggggc
cgctgagctg tccagcaccc 9840acaaacctgt ttccgcccac accagcgtgc ccttcggctc
tcggcagacc ttggcagggc 9900ccccttcagc ctttctgttg gcttttaagg gcaaggcccc
tccactctgg cccagtgcct 9960ctggggccag atcatcaccc tcagggcccg ggaagctccg
tacaccctcc ctcgcatgcc 10020tgtggggctc gtggcagatg cccagagtgg ctgcgaaggt
ggtgcaggaa acgccccctc 10080caggaaagtg tgcctgcttt tggaattttc cctggggatt
ttccagagat gcaggacacc 10140tgttttcctc gcctggtgat tgagccggga agccttcatg
gagcaggccc tacttgccca 10200ggtgaagtct gtgtggcctg cagccacggg ggcgagtggc
cctgcgcctt tcatgctgtg 10260gcctccttga tatgtacagc ctgctttgaa gtcctgcagg
agctcaaagg aagttcctgg 10320ggtgagacca gccgccgcag agggaaaagt gtgattggag
tggggtgggg ggtcttaggg 10380ttggcagggg gaacacatgc ctccgtggca gctcccggca
ccatgggctc cctggcagca 10440tcgtccgttc tctgagcctg cgaggggcga cacactgtca
ccattcttct tcttggattt 10500tctgtgatga tgggtcctcc gattttattg agggccacgg
agcacggggg aagtagaagc 10560tcaacttttt aaaataatat ccacagcaac agcagtcacc
ccgcatcgag ggaggcctgt 10620cgtgcgccag gcaggtgctt aaccacccac cttcccttca
caccagcccc tgaggtgggg 10680ggggcttcct cttttcaaag tgaggaccaa aaatcggaga
gattcaggaa ctggcagaag 10740tagaaaatgt tgccgagccc tcattaaaga tgcatgcaca
cacacacatt cgttcattta 10800ttcctttttt attttttttt cattcattca ttcctttact
catacactca acaagcaaac 10860attgagtgcc tctgctgtac taaagacacc atcctaggcc
ctggggatgt gcagcagtga 10920acaggtggag ccgatgctgc agtcatgggg gaccagcagg
cgacagcgtg cactcgggaa 10980ggctagagca tctgagacac gcagtgaggt ggggagccct
gggctgggta cctggcctcg 11040cagatcccat ggcgagggag ctgggattgg aggatgctga
gcgagccagc caggcaggca 11100tctgggctaa gagcatccca gatgaaggag caggaggccc
aaacctctga ggccacactg 11160gtgctgctgg ccaggcaggg cccacttgct gctctaggac
ttgtgggtgg gcgtggtcac 11220aggccggagg acatctgtaa ccggccgtgt ccccccaaca
ggccctttat tgaagccctt 11280tcatgggcgg attgggttcc ccgaggggag ccattcagag
gcagggccca gcacagcgag 11340gggtgcagag gccggcctcg gggccctggg cctatggctg
ggaggcatct gcatgtgaaa 11400gggggtgggc atttactgca tcctgtcttg ggccactaaa
gcacatctgc tgggcccagc 11460ccaggcccgg ctcggcgttt ttggagccgg cggaaggaag
aagacttggc ccagggcagc 11520ctcataaagc aattcccacc caggaccgcc ccgatcgatc
ggcctggtgc tgtggagcca 11580ggcccacagc ccttccgtcc agtagcacct ctcaatcccc
tgtgtgtctg ggtgccttcg 11640ggatcctgat aaaacgtagg tgcctcctag taggtctgga
acagcgaagg ccagggccag 11700gggatctttg aagtccagct ggtcctggca ccccagtatc
ccctccttca tgcccggctt 11760ccctggtacc atccagccct agggacacag gtccctggga
gtagcagcaa tagtaacaac 11820aacaacaaca acaacaacaa caacaacaat aataataatg
taggctgtgg gcctgggctg 11880gaacttcatc tctattgcct catccaaccc ccacacgcgg
tgtggcacct gccggagggg 11940tctggcagga cattgtaggg agagggccat ttgtttgcat
ttgtttaagc caacaggagt 12000tttcctgatg caacctggac cttggacaag gggtgtttgt
cgagaacgtg ctgtcgtcac 12060ctgtggtccc atttgctgtc aggaggtgaa gcactttgtc
ttcagagatg ggagccggtg 12120ctctttcacc ctgggttctg tgatttcagc tgtcatttca
acattttaag tttcataaga 12180aataggtggc ctccttatgt aaacattctc accagcctgt
gaatatacag agtgaaaagt 12240ccaagtcctc catcctcgac caagtcccag tgacttcccc
agggtgacac aatacatcct 12300tttccttcgt attcctcaaa agcacagagg attttgagat
gcagccttcc ctgactcttg 12360tccagttctg cgggcacctg ccccatcact cttgaatgcc
accgccccac cattgcccaa 12420ggcatgtgtt tggttagtca caggtcactg gaggctggag
accccttgca ggtccccggg 12480ccaggagttt tgaattcttt gtagcctcag agcacttatc
caaccaaatg aaattttgag 12540tagaactcca aaattatata tatataatat atatatttat
atattatata tatatttata 12600tattatatat atttatatat ttatatatat ttatatatta
tatatattta tatatttata 12660tatatttata tattatatat atttatatat ttatatatat
ttatatatat tttatatata 12720tttatatata tataaatttt tttatattta tatatttttt
atatatttat atatattaca 12780tatattttta tattttatat atatttatat atgtatttat
atatatgtat atttatatat 12840atttgtatat atatttatgt atatttatat atatttgtat
atatatttat gtatatatat 12900atttgtatat atatttgtat atatatatat ttatatataa
atatttatga gaacagtgtt 12960tgaatagtat attattttta atagggatgg gatcttaata
tattgcccag gctggtctca 13020aactcctggc ctcaaggagt cctcctgcct cagcctaccc
agtagctggg attataggtg 13080tgtgccaccg tgtctagctc agaatggaat atgttttctg
tgttcaaatt tataatgttt 13140gttctgaggt caactgatga gaatgatgga agtgttttta
tgaataacaa tgttgggctg 13200ggtgcagtgg ctcacgcctg taatccctgc actttgggag
gccaaagcgg gcagatcaca 13260taaggtcagg agtttgagac cagcctggcc cacatagtga
aacccccgtc tttactaaat 13320atacaaaaat cagctgggtg tggtgttgtg cgcctgtaat
cctagctact tgggaggctg 13380aggcgtgaga attgcttgaa cccgagaggc agaggctggg
gtgagccaag attgtgctac 13440tgcaccccag cctgggagac acagcaagac tctgtctcaa
aaaaaatgtt gaggtctgca 13500gcagtgaacc acacaacaca gtttatatgc ttagaagttt
atgtgagtgc tggaagcgag 13560cttattagga gggatgagtg aatgttttct ggtttatcca
gacttcagac tgttagacag 13620ctatgaaaga atgacttcgc tctgtttggt gctgacatta
gtttatatga tatgagtgct 13680gtatatacca agtgaaaaag ctgtagaata atgtgtataa
acttactttt tttttttttt 13740ttgagacgca gtctcgctct gtcacccagg ctggagtgca
gtggtgcgat ctcagctcac 13800tgcaaccttg gcctcccggg ttcaagcgat tctcctcctc
agcctcctga gtagctggga 13860ctacaggcgc ctgccaccac gtctggctaa cttttgtatt
tttagtagag acggagtttc 13920actatgttga ccaggctggt tttgaactcc tgacctcagg
tgatccgccc gcctcagctt 13980cccaaagtgc tgggattaca ggcgtgagcc actgcgccca
gcctaaactt ccatttttgt 14040taagagaaaa caaaggaaac acaaccaccc ccgtatatct
ggggtagagt cataatagaa 14100tcacaaactt gaactgtatg gaaagagact gaggaaaatc
tgatggctgc ttgaaagctt 14160tcccgtggct ctgctggcag gagtaaggtc tgcttctcag
tttctgatct cagccttggt 14220gggcgggggt caccctgtga aaatcaaggg cagccagggt
gggtctggca caggattatt 14280ggtttaccct gcaggtgctg cctctggttg gaagcccttg
gccaattaca aggggagtgt 14340ggcacaagtt acaaagctca tgaatggagt cctgccccta
cccccagtgc cccacaacgt 14400gcaagaagcc atattggcaa agcagtgcca cacctgccca
ctgtggtgtg cagggtgatt 14460tcccaggccc ccggcaacag gcacgttgct gtccagcttg
tagaggagga acttaaggtt 14520cacagaagtg ataggactta ttcagacacc ttttgagggg
gactgagggc tgctgcttct 14580gccagggcag gcaccacctc ttacctgttg caaagcttaa
tcctgtcaca gaaagtgcca 14640ggtgtcagct ctagtgtgag agcgtgggtt tccaatccca
gctctgcccc atagttgcca 14700tgtggcctca aggaagtccc ttagcgttcc gtcaccccat
tatggctgca gctcaagggc 14760cccagtttcc tatgtgtaat tcctgtaccc aggctgcaag
aattaagtga aatattttct 14820cttaaaccaa tgtttgggtg gggcgaggcg actcatgcct
gtaatcccag cactttggga 14880ggccgaggca agtggatcac ctgaggtcag gagttcaaga
ccagcctggc caacatggtg 14940aaaccccgtc tcactaaaaa tataaaaaaa ttagctgggt
gcagtggttt gcacctctaa 15000taccaactac tcaggagact gaagaatcac ttgaacctgg
gaggcggagg ttgcagtgag 15060ctgagattgc accactgcac tccagcctgg gcaataccag
cgagactcgg tctcaaaaaa 15120gcaaaaacaa acaaaagcca acgttttcct ttttttttaa
gtttttgttt tcttttcctt 15180tttaccttaa actatcagga gataaaagcc aacgttttac
ttattccata tctaggagct 15240aaaggagcat tttgttgttg aagggtcaaa taattttgaa
gagtgggttg ctttgccctt 15300tttttttttt tttttttttt tttgagacag ggtctcactt
tgtcacctag gctggagtgc 15360agtggcacaa tcttggctca ctgcagcctc gacctcctgg
gctcaagcga tcctttcacc 15420tcagccccca aagtagctga gactacaagt gcacaccact
gcacctggcc aatttttgta 15480tttttagtac agacaaggtt ttgccacgtt gcccaggctg
gtcttgaact ccagagctca 15540acggatctgc ccacctcagt ctcccaaagt gttgggatta
caagtatgaa ccaccgtgcc 15600cggccatatg aagtcttgta gttgttgaac tcggtgtaca
tgctaaacca aagcacacag 15660ttctgctcaa ggagctgtat tttccacatc tgtttgtatt
ttcatgaagc caacataaaa 15720ggattctcat cccagttctg tccctagaca agtcactgaa
cccctccaaa ccacaggtcc 15780catctccaaa cagcaggtcc catctctaaa ccacaggtcc
catctccaaa ccacaggtcc 15840catctgtaaa tggaaacatg aacccacctc tcagactgtc
aagaggattc gatgtgaaca 15900tgccgttggg gctcggtgga ggaggagcca gtcaccgggc
gctgctgtag ccactgctgg 15960ggctcagttc accctgggct ccctgctggt gtaaactttc
ccctctttct aaaggtgccc 16020ctgggcgata cactctgggc tcctcctgct ggcctccttc
ccagggcacc tggcttcctc 16080tcagaatgta cctgccccaa cagtgggttt gtcagagacg
tggttgtcat tgtcaccacc 16140acagcaatta ctgccattta tttagtattt gaaatgtgcc
aggtgcttct catgtctttt 16200ttttttgaaa tggagtctcg ctctgtcgcc caggctggag
tgcagtggtg ccatctcagc 16260gcactgcaac ctctgccacc caggttctag tgattctcct
gcctcagcct cctgaatagc 16320tgggattaca ggtgtgcacc accacaccca gctaactttt
gtattttaaa tagagatggg 16380gtttcactgt gttgttcgcc atgttggcca ggctggtctc
aaactcctga cctcaagtga 16440cccacccgcc ttggcctccc aaagtgctgg ggattacagg
catgagccgg tggtcaggag 16500ttcgagacca gcctggccaa catggtgaaa ccccatctct
actaaaaata caaaaaataa 16560gctgggcata gtggcgggtg cctgtaatcc cagctacttg
ggaggctgag gcaggaggat 16620cacttgaacc tgggaggcag aggttgcagt gagccaagat
catgccactg cactccagcc 16680tgggtgactg agtgagactc cgtctcaaaa ataaaataaa
ataaaaaata aataatatga 16740tcttcatgac agccctgtcc agtagctact gttggtgcct
cattttacag gtgaggaaac 16800tgaggttgca gagggttagg tgacaagcca cagtcataca
gggaggaggt gtgtctgctt 16860cccgagctga gctcccacct tacttacagg ggcctggcct
ttgtcccctg cagccaccat 16920ggccctcagg gacgtcgggc tgctgctgag gctcctaggc
tgaggagtgg aagaaaggag 16980ccaggatccc cacccctcac ccccacatgc cacgaggcat
ttcctgtccc tctgggcctt 17040agtatccccc tctgtgaaat gagacactcg ggctgtgtgg
tctccaaagt ctctggctgc 17100cgattcccat gtacttcatt ctaaattgcg tttattgatt
cctgggacat cgatgcttgt 17160gggatcaaca tgtccctgtg atctgggctc cgtggcacag
gcacttcaga gaacactcta 17220gatgggagtt ctgccggcag gtgaggaggc gcaggtacgg
gagggtgtgc tgcgtgtttt 17280ataagtgccc tagctggggg ccggccccag cttgacctca
ttgccgtccc gctaaggtga 17340ctgcccagct gacccagacc aggaggccca gctaggtccc
agctcctgcg tcactgtgtc 17400tcttctctcc attcaaactt tggactctga tgactcaagg
ctctgtgcag gtgctcaaca 17460gcgtctagtg gacctgttgc cctggccggt cagctggaag
gaggagcaag tgcaggaggc 17520tcagctggag ccagtgtgag gccgtgccct gtgccagggg
cccgtggcat ctctgctgtg 17580cccagccttt acctcgtctc cttcatggca gccctggaga
gctggaaggc tgggcaccac 17640catcacccat cactgtttat ggctcgggtt ccaagactcc
accagcctgg tgcgcgcgcc 17700agggtgggaa gccagacccc acgcccaggg gacagatgcc
gatctgttgc tgctcctacc 17760tgcaggtggt cccccagccc aagctggtgt gcttcatggt
gcagggcacg gggtgggctc 17820atgcgcacat cagtgtgggc gtccccgggg catatgtgaa
taactgtatt tttctattga 17880aaaaacaact ctgctctgct tacggggctc ctgcttgacc
ttctgttttg gaaaattgga 17940tttttggagg aaatcagcac cttgtgggtt tcaattctcc
aaacatgcgt gttcaatctc 18000cattgctttc caaaagggga gcgactgaag atctgaaaat
tagacagtgg ctagcactgc 18060ccagaggccc cgttattaaa aacatttcaa ggccgggtgc
cggtggctta cgcctgtaat 18120cccagcactt tgggaggctg agactggcgg atcacgaggt
caggagttca agaccagcct 18180agccaacata gcgaaacccg tctactaaaa atacaaaaat
tagctgtgca tggtggcacg 18240cgcctgtagt cccaactact cgggaggttg aggcaggaga
atcgcttgaa cctgggaggc 18300agaggttgtg gtaagttgag attgtgccac tgcactccag
cctgggcaac ggagtgagac 18360tctgtctcaa aacaaaaaca tttcatgcca gtggcattgc
tgagggcctg gggctggtgc 18420tgcacctgca tctcttttca ctacaacccc cggggaaggg
ggtggtcaat gtcctttccc 18480ctctctgagg aaactgaggc attgagcttg tatttctctg
cactggttcc catttcagta 18540gtaataggga ccctccagca caccaggtga tgtctgcaat
gtggccccag aggcaggagc 18600aggctgggag ttaggaggct gaagcctgat gtgggggagc
gctgcccccc aaccccactc 18660cctgggaacc aaccctccac ccctttgccc tgctgcctcc
acaccccttt cctctggctg 18720tgaatttctc cattaccaag tgggcccgtc agatctggaa
cttgggtcag ttaggtgctt 18780ctcccagtaa agcacctact gtgctgggta ctgggtactg
tgcccggtag attttgggca 18840ggattctaga gagcatctgg caaagttgct gcaggaaagg
gccagtcact gcctgccagt 18900ctcggcccct gtgaatgcac gagttgacat ttctcaaatt
ccagcctcac tttgtgggcc 18960aggcagccct tctgacctat gagctgccaa actgagcctt
ttgatgacac tgcccccgag 19020gattccctgc agccctatgt cctcctgagc acctggtggc
ccctctgtgg cctccgagac 19080cctggcctgg gtgctcactg cccctcatcc ctacacagga
gatgctgggg tgactcctct 19140gttggtaaac caaggctctg gttgctgagg atggggctgg
tcctgcgcca gggtgccccc 19200aggtgctgat cttggctcca ctgtggttgg gcaccaggca
tggcagaggt gctgagcaga 19260caaaaccctg ccctcccaag ccccgggctt gcagaggagg
caggaggagg gaggtggtga 19320tgtcaggacc agagcagcac caagggctgt gtgggcgcag
ctcggaggat gggatctggc 19380tgggtgcatg gctgtggggc ctccaggagg aagaggtgta
tgaaccctgc tttgtaggat 19440gatcagggtc aggccccagg ggagcatgaa agcaaagact
cagggtgcta tgagggcctc 19500tgggtcacac gaaggtgacc agtaccgggg ctagagggca
gggggcccag ctgggagggg 19560ctcatgctgc ccttttcctg aggccccctg ctgcctgccc
agtctcccct ccgagggatc 19620ccattctccc tccagcctca ggaccctcgg cggtccccag
agggctagct attgagctgt 19680cccaaggtca ctcaggatgt ccaataattg tcctaacagt
ttacctgctg tggaacagta 19740atgagagggt tttcattatt tgcgggtgag cgctcctggc
agatgccaac agccagcaga 19800tgtggagagt ccgaggtgat ttgtaagggc cggttcatct
gatgcacggt aattgccccg 19860ggcgatgttg tcactcagag gctcgtcctg gcctgctgag
atgaggtaaa ggtcagttca 19920aagatccagt ttgggccagg cgcggtggct caggccgggt
gccatggctc acgcctgtaa 19980ttccagcact ttgggaggcc gaggcaggtg gatcactgga
gattgggagt tcaagaccag 20040cctgggcaac atggtgagac cccgtctcta ctaaaaatat
aaaaattagc cgggcatggt 20100ggtgggcacc tgtaatccca gctactcggg agactgaggc
aggagaattg cttgaaccca 20160ggaggcggag gttgcagtga gctgagattg tgccattgca
ctccagcctg gctgacagag 20220tgagactctg tctcaaaaaa aaaaaaaaaa agatccagtt
tggctacagg aagtgggagc 20280aaatccccac tcccatggga ctcctgggga ggaggcagcg
tgctgaggtg ggggccatgg 20340tctcagaggg gctatctggc agccaggacc ctgcagaagg
cccctttccc caaggttctt 20400gggtggtggg gtgggaacag cccatcccag agctgggctt
gtccttcttt gcaggggctc 20460tttgtacagc tctctgcaag ccttgctggg gtcctggagc
cgcacctgac tagggctttc 20520ctgatatttt ggaattcttg tacttcatgc ctcccaaggc
ttagtctgtg acatctggtg 20580gcaggcccca cggagggatg acaagggttt gcttccttag
ctgtcctagt cggcctgggc 20640tgccataaca agtgccattg actggggact tcaacagcag
acgtttattt tttcacagtt 20700ctggaggctg gaaggctgat atcagggggt cggcatggcc
agtttctggt gaggcccctt 20760ttctgggctt gtaagcaggt gccggcttgc tgtgtgctca
catggcctct ttgtgcattc 20820atgagggtgc tctctggtgt ctcttcctcc taggacactg
atcctgtcat gtgggagtcc 20880cacccttatg ccttcatttc acctgaatta ctccacaaag
gccctgtctc cagatgcagt 20940cacattgggg gttaggactc cccatctggg ctcacaggcc
ttgctgctta caagctgtgt 21000gaccttggtc aggtctctgt ctcctggggc ctccatcttc
tcctttctag agtggagata 21060atggttcttg cctcataggc tcgtttaccc agtgggtacg
acttactgag ggccacctga 21120cagccaggca cctgggcctt ggggatgtcc attagggaca
gcaaatacct tgggcacgag 21180gacaaaatca gatcatggat cctgagtgac accaccgtcc
ttacccagat gcaaggctca 21240gtataatgtt gtattaatat tttactcttt aaacaatagt
gcctactttc tttattttca 21300gtaatttaaa ataaggaaga ggcaacctgg ggctggtccc
tggtggtcag aatgggtctc 21360tggtgacggc tcaaaggggg gtgtggtccg gggcgcagaa
ggacagaacc caatgggagc 21420ggacttaccc cagcatttca catccgattt ttcagggata
gatttcgtct ctaaaactag 21480cacttgatat tataagagca tataggctga atgaacttat
attgctattt cagaggaggc 21540cagtggctta agagtcccaa agtgagttct gtgacttact
catgtttcat ccacatctga 21600agttgtgttt ggaataacag atgtgaggca ctgtcatctc
actgtcacat cgattgctca 21660ctgttctgag aaggtcacat catcattcat tagcaataat
tagtcccagg ggaccccctg 21720cacttcttca ctcaactgct ccattaactt gggtgcaaat
cacacgttcc aggtgccaat 21780caatggagat ctatccataa tgtacaaggt gacgctatta
gttacaatat attaactgcc 21840taatttaaaa ataaaactat ctttatgaag ggcaattaac
cactaagtgt aattgataat 21900tcataaacct ctgattagga aagacaaata ataagaaagg
aatgaatcac ccatcctatt 21960gaagaagacg ttgccagcct ctggtctgca ttcactatgt
ggtcagcaga tctacagatt 22020tcttcctaaa ttgcacctgt cagctccagg gctgggggtc
tggaggctca tttcagtgca 22080gaaacatgtg ttttcaccac tgaagcctgt ctgtttctgt
tgggggttat aaggaggctg 22140caggctcagg acaccctcag ggaagtagtg ggtgaggtca
tttagacttc tcagtcagtc 22200agtcattcag caaatgttaa tgggggattg tggagtgcca
agccctgtgg gaagccctgg 22260ggatgccgtg gagacccgaa tagattcatc cctgaactcc
tggagctcgt aggccagtga 22320gaaagacaat tgcgtaaatc agatgattgg agacactggt
tagtgttaca aacaaaaact 22380gacagagaca gggcggaggc aatggcacac tggatggtgg
gaggaaggga cattggcctg 22440agtcctgcca cctttttgta gtagctgtca ctaccactca
tggctgtatt tggggccctc 22500tctcactggt ggggaatcgt gcctggctta tctgccccct
tttttttttt tttttttttt 22560gcgacagagt ctcactgtgt gacccaggct ggagggcagt
ggcttgatct tggctcactg 22620tgacctccac ctcctgggtt caagagattc ttctacctca
gcctcccgag tagctaggac 22680tacaggtgcc cgccaccaca cctggctaat tttttttggt
atttttagta cagacaggat 22740ttctccatgt tgaccaggct gatctcaaac tcctgccctc
aagtgatcca cccacctcga 22800cctcccaaag tgctggggtt acaggactga gccaccatgc
tctgccttat ctgccctctc 22860agtggtcctc cactccagcc tcatttcctg gctccacctt
ccacataccc cacgctgctc 22920gagttctttc tatccgagag ggtggggcca gtgcacacct
gcctgtgcgc aaggtgggtg 22980gcccttccca ccacctggcc tgccctttcc tgccatctcc
ttgtccgtcc ctcaggtctc 23040cagttgggcc tgaatcccca gtcttttaag accccacact
gggctgtgtc ctttccccac 23100tatgtgggag agctccagga gtgctgattt cctgttcctg
actccactgt gtgtagatat 23160cagctccatc tccaaagtct caggctcaag agcggaacag
aaagcccttg acatttctgg 23220gattcagtcc ctcacccata acatggataa attacattca
cactgggttc agtttgagtt 23280gagtttttag ttgggtctca cttgctcaga tggaagttga
tttcaagccg tagttctagt 23340aactgggttg aatctgcgtg tattcatttc attgggccac
atctgtaaac agaacttgag 23400cacatgcctc ctctctatgt gtgttttcta taaattttaa
attattcatt taacactatt 23460gcttaggtta taaaaaatac acaacttgaa agaataggtg
aagagccaag atgtctgcat 23520ctttggagat cagtggaatg tgtgtgtttc gggacctgtc
tgagaacagt gccagatgca 23580attagcatgg ctgattaatt ttactagggt ttcttaaaag
tgcctcccgg cccttgggct 23640tgcagcttga tgtcttttcc acgagtccac actgcagctc
catgcagatg tcaaatatga 23700ccgctgcctt gttggggccg gtttgtccag gaagggatct
tgtgatcttg gcgagggagc 23760caggtcctgc ctgtcaccct ccccccaggt ctgagtgacc
caccccgcct gcctctgtag 23820gttcctctag cacgctcggg tcccttcctg gtgggcgcca
ggtcctccag cctcctctcc 23880ctgcacagaa gcaccacacc ctggccctct ctcttgcttc
cccagggagg cctcccctga 23940ccttcctacc ccaccttgtc accctctgtc tgctcaccct
gtttatattc tgcatagcgt 24000ctctctcttc tcagtcacaa attgcccttc ctttgcaggg
tctcttgttc ctattccaac 24060ctcctcctat tccaacctgg atggtggcct ggagatcagg
gaacttttcc atgctcacga 24120ctacagctct gtgcaccagg ccagtgccag gcacgtagta
ggcgctcagt gacagtttgc 24180agaagggtgg atgttacttc atttggtgcg ttctgagaac
ttgcccaggt gcagcctgga 24240ttcgcgggtc tcctatgaag tcaggaggga agaagctcgg
tgccgccggc cagtggcagc 24300atcctgaagt tcatgagacc tccaggctgg gtttgggcag
ggtccacgtc ctgtctgtcc 24360atcccggaat gttggtgtcc ctgatgactt tggcaaacct
ttctccattg acagtgatga 24420gaagacaaac ctgtctcggg ccatctgcct ttgtactttt
tgggaagtat ggatccagag 24480ttagtaaggc gtggacgtgt gcagctcact tgcaagtgga
aatgccaaat gtggtggaaa 24540gggacagaga aatagctgtg gacattcaga ggtggcccct
agccaggctg agcaggattg 24600tggagaggag gtattgcatg gatctcaggg gctgggtaga
agtggccagg caggtgggcc 24660aggcacggtg gctcacacct gtaatcccag cactttggga
cacctgaggt caggagttcg 24720agaccagcct tgctaacata gtgaaacccc atctgtacta
aaaatacaaa aaattagcca 24780ggtatgatgg caggcgcctg taatcccagc tgctctggag
gctgaggcag gagaatcgct 24840tgaatctggg aggcggaggt tgcagtgagc tgagatcatg
ccactgcact ccagcctggg 24900cgacaagagc gagactccat ctcaaaaata aagaaataaa
aaaaataaag aagtggcaca 24960ggtagagagg cggggggaca gagttgcatt ccaggcaagg
ggagaggcaa tgaaaagggg 25020ggtcaggtgg ggaacctcgg ggagcctcag gtgtatgaag
tccagtgagg atggaaaaga 25080gttgtcgggg cgtggtatgg gggtcctctg agaagggatc
tgcttcagct cagaaatccc 25140agcctgaaca catcctggat cagccaggac cccctcaggg
gctcgcagtg gcctagtctg 25200ctctgggctt tgctggatgc ccttcgagaa tcaccgcgca
acctccttca gaggcccaac 25260actccacgtg ctgacttccc gggcctgtgt ctcccctgcc
gcagactgaa gcgttcccgg 25320ctgaaggtga ggttctgcac caacgagtcg cagaagtccc
gggcagagct ggtggggcag 25380cttcagaggc tgggatttga catctctgag caggaggtga
ccgccccggc accagctgcc 25440tgccagatcc tgaaggagcg aggcctgcga ccatacctgc
tcatccatga cggtaggcct 25500gtcggacacc aggacctcac gggggtgaaa gctccccttt
cccagggtgg gggctgtgca 25560gagagcctct ttcactgggc caaaccactg actgagctag
gccaccaaca ctcatgggtt 25620gggggtaaaa acctcatggg acttcctgct gggggctggg
ggcaggttag gacccagctc 25680tgtccattcc ttggcctcac accagagggt ccttagagca
ggttctggat ggctcctggg 25740aggaaaatcc taggctctct ttcctgatct aggatggacc
aagcggccag ggaattgtca 25800aggtggcgga gttctagctc ctgttgagaa agagacagag
agagagagag agagagagag 25860agaaaatgag aatgagaatg aatgaatata aatgagaagc
tacattgagg tccatagctt 25920ctaacgtctg gcaccatagc cttcaactat gtgaaaacca
gccctgctct ttggaacagt 25980ttgcaaatgg cgtaggaaca aggtttgcat cctgatttta
gtctgaatgc aagaacagac 26040agcccctttg accctcagtg gggaggtttt cttggaatga
agcagaagga acaatattac 26100tggtgcctcc tgaatattta aaaagaagag acattagcaa
ttccagacac ttctccattg 26160gtccactgca ccttcttcct tcctctgcac ttgatatttt
tagttttaca aactgctagg 26220cagaggcaaa catggttttt tcctttccag ggcaaggtgg
gggctctcaa gaagtggcgg 26280gtgctccctg gaaagcctag atgtcgactt catagccctg
gtgccagtta agggtgacag 26340gcctggcctt tagaaatcat gtttctcaaa attccttctc
tgtaatctag gatccctgct 26400agtgccttta gaacatggtc ttcaaagaaa gaagatttaa
gaaaaatatc ttgccccacc 26460ctccagaaaa gggttgctgg acccgggagt ggagctggaa
gaagccagca ggagagggaa 26520gtgggcatcc gcttggaggg tggccttggc cagtgaatca
gacaagcaga cggcaaagct 26580gggaggctga agggaggatg aatgcccttc ctcagctgta
atcctgcatc cgtgttttca 26640ataactagac ccttgttttc agacttggaa tttttttttt
cagtcattgg tggattttct 26700gtctaacatt ttatgaaaat tttcaagcat ccaacaaagt
tcaaataatt tttacttgga 26760atagttttgg tttttttttt ggagacagag tcttgctctg
tcacccaggc tggagtacaa 26820cagtggtgtg atcttggctt cctgtgactt ccgcttccct
ggttcaagcg attcttctgc 26880ctcagccccc caagtagctg gggttacagg tgtgtgccac
cacgcccgcc taatttttgt 26940atttttagta gagatggggt ttcaccatgt tggccaggct
cgtcttaaac tcctgacctc 27000aggtgatcca cccaccttgg cctcccaaag tgctgggatt
acaggcgtga gccaccgcct 27060ccggtcttac ttggcatagt tttataccct tcacctagat
tgtcaatgac attttttgat 27120attttcttca ctttatatcc atccatccct ccgtctgtct
atccatgtta ttatttagat 27180gcatttccga gtaaaatgca gacacgggta tactccccta
tacacttcag catatcattg 27240gagtttggta tttgtttacg gtttttcctg tttgatgtaa
aatttacata caatgaaatg 27300tacacacgtc aagtgtactt ttgctgagtt ttgacatatg
catgcaccca tggaacccaa 27360ccccgtgaag agctagacca tcaccatcac tgacagctct
gacagctacc tcgcaccttg 27420cccaggtggt cttcacccct tgtcccctta gaggcaacca
ctgtttggat tttttccgtt 27480ataaataagt tttcctattc tagaatttta tgtaaataga
atcatacagt atataattgt 27540gctggggcct ccagtgctac ccccaggctg aatgattcat
tagcaggact cacagaactt 27600agaaaaggta ttacactccc agttagactt attactgtga
aagaataaag attaaattca 27660gcgaagagag gagacacatg gggcaaagtc taggagaaga
caggcatgaa cttccagttg 27720tcccttcgaa atggagtcct actgggcaca cttgattctc
ccagcgatga tgtgtgacaa 27780cacatgggaa gcattggcca ccagggaagc tcactggagc
ctcagtgtcc agggtttttg 27840tgtagacatt agcacctgtg tgactgactt tagcgactca
gtctccagcc ccacagagtc 27900caaactgata cagctgggcc agggccgcag ggatacagaa
acaggtgttc actgtaagcc 27960aagctgttag cgtaaatgat ctggtcagac tgagagagca
gggcccgagg ccttgagcat 28020acaaaacccc ttatcaggta gaacattgca gtggctcaga
ggttatctcc caggagccag 28080tcctgaagac aggccttttt tggaatgtac agggtttgag
caacctagtc ctgctgagtt 28140aaccctttcc gtcacagtag tcttttgtgt aaggcttctt
ttacccaacg taatgttttt 28200gagactcatc catattattg catgtatcag cagtgttgtt
catttttatt gcagaatagt 28260attccattga gtgaacagac cacagtttat ccattcacat
gttgttgata cctggattgt 28320gaccagtttc agtctattat gagcaaatct gctatgaaca
ttcttacaca gttcttttgg 28380gtaaatacct agcagcgaaa ttactgggtc atagggtaca
tgtatgttca cctttgtaaa 28440attctgccag accttttttt cagagggatt gcaatatttt
acacgtcctc catctgctgc 28500agctgttcag tatcctcacc agaccttggt gttatcagtc
tttttaattc tcaccattct 28560ggtgggtatg aaatcgtatc tcattgtgta tttcagtttc
tttctttctt tttttttttt 28620ttttttgaga cagagtctcg ctctgttgcc caggctggaa
tacagtggtg caatctcagc 28680ccactgcaac ctctgcctct cgggttcaag tgattctact
gcctcagact cctgagtagc 28740tgagattaca ggcggccgcc accgtgccgg gctgattgtt
gtatttttag tagagatggg 28800attttgacca tgttggccag gctggtctcg aactcctgaa
ctcaggcgat ccgcccacct 28860tgacctccca aagtgctggg attacaggca tgagccaccg
cacctggcct cagtttcttt 28920tcaatttaga aaggtcaaca tctagtccta agagagatga
cagcatgttt cttctaggat 28980ctgaaaacac agatccaatg atttgatgta tccttgcaaa
ttttcctatt ttaatttaaa 29040atgggggaag acaaagaagg agaaggagga agagagtttc
ctaagggagc tggtgaactg 29100gacctgtggg gatggaattc tgctttttct tcaggacata
tgtgtggggt tgcctgaaat 29160gtgtcaacgg ttcttacagt ttgggaccag tagtgccaac
cctccaaggc ttgtgcagcc 29220ccatttcctc ctgctgcagg ctctcagtgt cttctgagca
gatttccctt ctcagtagca 29280aagctgtggg tttcctgcct gcgatagtga cctcagggag
ctcctggggc tgatgtggta 29340ttgggatgga gggtcctggg tgtgtccacc tttctcctca
taagccctct ctgaggagcc 29400gaattccctc ctctgtcaaa taggaataac ccttctgtgt
ctacctccca tggcagctgt 29460gatgatcaga aagaaatgca tctgagaaag ggggaggaca
tgtgaaaggt gtccctggta 29520ggagatgggg aggtaggcca tgtcctccca gggctccgtg
gcactctgtc tctctctctc 29580tttccaggag tccgctcaga atttgatcag atcgacacat
ccaacccaaa ctgtgtggta 29640attgcagacg caggagaaag cttttcttat caaaacatga
ataacgcctt ccaggtgctc 29700atggagctgg aaaaacctgt gctcatatca ctgggaaaag
ggtaagttgg ctccagggag 29760agtcatttct cggtgcttta gatgatgctg tgccagtctc
caacttgcgg aatcaggcag 29820agaaggatgc agaaggggag gtgggaggag caccggtgat
cttcttttcc ctcccttttt 29880catgctttct ctcttcattt ctttattcct ctcctttaat
catttcttta ggaaatagga 29940acactgtctt caggagctgg ttgatcaagc ttccatgtgg
agtctagact agaagagggc 30000gtcattgcag agggagaccc tctggagatc tcagaggcca
agtgttgatt gtcccaagtt 30060tagagcaaag aaggagaagg tgcatttgca taggatgtag
cagggtgttg gagaatgctc 30120tcaagggttg tctaattagt aaaacatagg cttgagggcc
tacccagtct gggcaccgac 30180agtctctggc cacagatcac tgtctctgtg aagttgtgat
tttgtgatca gtgttttgtc 30240tgagccttta gcacatgcct tgccatctct ttcttaaaca
aaacagtgtt ggttattttt 30300aacatttttc ttacactgag atagcacatg ttcattgaag
aaaaattaga caatacagat 30360gagcaaaaga agaaaccaga aagtattttt agtcccactg
cacgaggaaa acacctgtcc 30420ctattttggt gtatgtcctt gcaagtggct tcctgtgcga
atagactatt atggatggag 30480atgttcacag catcgatgtc acacttgacc ttgttttgtc
atctgttttt tattgttcac 30540caaacaatat ttgtgactga cttagcatgc agcaaatgga
tatgtcttgg ggcatggtta 30600tctatggtgt ggctatcttg tatatatata tattttttaa
gactgagtct cgctctgtca 30660cccaggctgg agtgcagcgg cgtgatctcg gctcgctgca
acctctccct cccgggttca 30720agcgattctc ctgcctcagc ctcctgagta gctgggacta
caggcgcccg ccaccatccc 30780agctaatttt tatatttttg gtagaggcgg ggtttcacca
tgttggccag gatggtctca 30840atctcctgac cttgtgatcc gcccgccttg gcctcccaaa
gtgctgggat tacaggcatg 30900agccaccgcc cccaacctct tgtaattttt ttcttttttt
ttttttctgc ttataagttt 30960attcaatgca aaataaccct caccagtttt actgaggtgg
ctgaccatgt ccacgaccaa 31020atacgcctgt aaactgaaat tcggttgctg acccattccc
agcctcagct ttctcactgg 31080caccaggggg acagcactcc atctgtgggt gtctctttct
ctctatggct gtctgtctgt 31140gggtgtctct ctctgtctgt gggtgtcttt cgccatctgt
gggtatctct ctctgtctgt 31200gggtatctct cccatctgtg ggtgtccatc tctgtctttg
ggtgtctctc tttgtgagtg 31260tctctgtctg tggttgtctc tgtctgtggg tgtctctctg
tgagtgtccc tgtgagtgtc 31320tctgtctgtg ggtgtctctc ctcgtctgtg ggtatctctc
cctgtctgtg ggtgtctctg 31380ttggcttccc cacttgtggg tcttgcaggt cggtcacgct
ccagaccttt aggccgcagc 31440ctgccagtct ccagaccgct gtggcatggg gtagcagaca
cgctctccag gggcagatgg 31500tggtaatcgc agagattctg gatccccatg tgggtgaggt
accagtagaa atgtctccag 31560gcaaactcct tcctgcaacc tcaggacctg agagactgcc
tggccttcat gacgtgaagg 31620ttgggcacat tctcatctgc cagctccggg tcttaggcag
gtggacattc ttcttggcta 31680ccgtgactcc ctccttaaaa aggagttcat aaatagcaat
ctggttcttc ttaggcatca 31740acatctctgc agctgtaggg tccaggtccg gggctggaaa
gcatgatttt tttctaactg 31800atctctgctg atggcatcta gattgttcct ggtttttcac
cataccaggg ctgtgatgag 31860catcttggtg catttcggat gacgtctcca gatacagtta
cagaacgagt atttttgagg 31920ttcttgaggc atgttgccaa gttgtttcca gaaagctgca
cagacttatt ctgcacagcc 31980tagaattcta gaatcacagg gttctgcaca acctagagtt
ctggaatcac agggttctgc 32040acagctagaa ttctagaatc acagggttct gcacagctag
aattctagaa tcacagggtt 32100ctgcacaacc tagagttctg gaatcacagg gttctgcaca
gcctagagtt ctggaatcac 32160agggttctgc acagctagaa ttctagaatc acagggttct
gcacagccta gagttctgga 32220atcacagggt tctgcacagc ctagagtttt ggaatcacag
ggttctgcac agctagaatt 32280ctagaatcac agggttctgc acagctagaa ttctagaatc
acagggttct acacagctag 32340aattctagaa tcacaggctc ccagggttgc aaggacactt
tggagtgtct acctcagcat 32400ctcatgaagt gtgggaattc cgaggcggtg gcggaggaag
tgttttccat cttcggtgct 32460ttcgttgctt ctggtgacag cgctcactgc ctctgcttgc
tgtacgggac cagctgatgg 32520aaccgacagg gagggacttt ttatctggcc attggccact
gccacacact ttgtgtaccc 32580cgttttgtgt aattctgact acaaccttgt gggatctagg
caggtcattg ctgttttgca 32640agtggggttg ttgaagccac aggagatgaa ataagctgct
gtcccccagc cattgagtgc 32700tgataggatc aggagtgcca gttggtgtgg ctgaccccag
accctgtgcg tgttacctct 32760aagctacatt ctagagcaga ctttttgccc acacaagcct
taaatgtggg ctggggacag 32820tggctcacgc cggtaatccc agcactttgg gaggacaagg
tgggcagatc acctgaggcc 32880aggggttcaa gaccaggctg gccaacatgg tgaaaccctg
tctctactaa aaatacaaaa 32940attagccagg tgtggtggtg cgtgcctata gtcccagcta
ctcgggaggc tgacgcatga 33000gaattgcttg aacctgggag gcagaggttg cagtaagtca
agactgcgcc attgcactct 33060agcctgggcg acagagcaag actccatctc gaaaaaaaca
acaaaacctt aaatgtattt 33120ttgaggctgt gtttaaaaat ggggatattt tacacaaaat
atccagattt ctggattctt 33180ttgaagaatc agaagatctg acaatacgga gcctcacatt
cctgcacaca cagcagccat 33240agctggagcc actgcctcca ttagtttgaa tttactgcag
accccactcc tccctgtcgt 33300ccctgtctcc agaccacaga gttagttgtc attgatcgtg
tgccatttgt tgtttttttc 33360aaagtagaga agtacttctt cacgctgtgt ctctatcaaa
aatggacaag tgaaagatgt 33420ttcaagaaat gaaaagattt ttttagtgac aaaaaatttc
tagtatgttt ctcatataaa 33480taaaatgtgt cctgtatgta gtcagggttc ctcagagaag
cccgaagcta caggatatag 33540atatgtagag agattgtgga ggcttggcga gtccaaaatc
tgcagggcag ggctggcagg 33600ctagggactc aggaatgcgc gcagcagagt cgtaaggctg
tgtgctggtg ggattcttgc 33660tcggggaagg tcagtctttg ttcttgtaaa gcctgcaact
ggttggatgt ggtccaccca 33720cattgcggaa gggaatgtac tctcctccta gttcaccgat
ttaaatgtta atctcatcca 33780aaaacacctt cacagaaaca tccagaataa tgtttgacca
catatctggg caccgtggcc 33840cagccaagtt gacatattaa attaaccctt gtagtccctt
tttaaactta cacccattgc 33900aatttaggtc gctgctatgg agcaagccac agaacctggc
ctcttaactc atttacccgg 33960gctgacccat taggcctttg agtcaccaac acctcactag
agaacaagca taatgaagaa 34020gctctgctgt aattcgttaa tgttaacact tttttcttta
aagatgtctc atgctgagct 34080tcgtggcaca cgcctataat cccagcactt tgggaggctg
agatgagagg atggcttgag 34140ctcagaggtt cgagaccagc ctgggcagca tagtaagatt
ccgtctctac aaaaaagaaa 34200agaaaaaaag ttgtctcata attattaaaa accactattc
cagatcatgg ataataatag 34260tcagaacagg tatattgttg aaaaagaaaa aaaaaaaccg
gaagcagtcg ctttcagctt 34320tgtaataccg cagcaaatgt catagtagac aagggttttt
ttaaagtagg agattagtgg 34380cgtctctgct gatgggggtt ctgggtgttt aaaaaccaga
gaaaggaacc ggcggtttta 34440gggactaggt cccagcttct caggacagcc tgaggacccc
ccgagagctg ccatcaccat 34500gccgtcctcg gctgtgggtc actgctggtg aggtccatcc
tgcactgcgg gacacagggg 34560cacaggctga gcggtgccca ccaaggaggc aggagcagcg
ggtggaggcg ccggggcaga 34620ctgtccctct cctggaaatg gagtgattta catttggcat
ctgtttggac cccagagtgg 34680gaattggctt tgtaatttct tctgtcatgc cgctctcaga
aaattaagct gtttacccga 34740cagactatgg aaggttaacg gctcttctgg gaaaggtcaa
gtggtcattt gcagggaggc 34800tgtgctgaat aatctgaaac ggcagaagaa taaagcataa
aatcgcatct gactcctcct 34860cccgggcttc cccctttctt tccagcatcc acgggttcct
ctttgtggct agaacattca 34920cgtcctaagt ggggctgccg ttggctgggt taaaccagta
accagtaaca ggagaaatac 34980gtcgcatctc tagtgtgtgg agattaccgg gcctttatat
taaaaaaaaa ttttaagtgc 35040ttatatttgc tctgtctcta aaaaatgttt ccaggtaaca
gttggcctca agatgccttt 35100ttgtttttct gaatgaataa ggaataaaga aaaattctgg
gcactataaa atcaagccca 35160gattcagttt tttaaaaaat aaacgttaaa gtttctcatt
ttgatttctg gaagttcgtg 35220ttctccttgg actctaaaga ttagcatgca aagacaaggt
tttgattgtg tattttcagt 35280tgcacgttca gtttttttaa aatcccatcc aggactaaat
ttaaacataa gaaaatccag 35340gtttctttcg ctttcgaaga ggctgcaggg aacttgaggc
cagaggaagt ggcgccgcgt 35400gggtcccctt aggaggtttt gccagatgat gggaggtgct
tgctcccgtc cacccaacct 35460ctagctcctc cagccggccc cagggtagct gagggacagg
ggtctgggcc cagaagcccc 35520cagaatcctc agctccaaac agagctgtac ctgtcgcaat
ggagtagttt tgttgatttc 35580aggttttctt ttgttgctgg gtaagcattt cctgtgtggg
gtatattctt tcctccgtcc 35640tggcttactt ccacacatca cttcattagg gaggccctcc
ctacccacct cgatgccctc 35700cttctcctgg cacaggtggg caggagggga cgagggtcat
ggcctttgag ggcctgcaga 35760gggccttgcc tgtggtggcg tggccaccct ggtcctcagc
agggtgctca gtggattcca 35820aggtgctccc tggacccgga cctccccttc acttcctttt
agctcaggtc tcagctttcc 35880atgcctctgc tgcctccctg aacccagcct gtcccccgcc
ctgttcaagc agactgcccc 35940tcccctgggt cctgcccttt aaagtttgtc tgtgggcttc
tttgcttccc caccagactg 36000tgagtccctg tggagcaggc tggggctggg agtatccctg
agtcctgcaa agggctgtgg 36060agcatctggg atcgttaatg ccaaggggga ccaaggtaac
ctgggccatt tgcagcacca 36120gccagtgcag gccccaggac ccggtgtccc ctcactgttt
cctgagaact ctggacccca 36180ctaaggaaac catcagacca aagtagctga ttgtccaact
gggaaaaaga aaaagcagta 36240cttttttagg ccatgtaaat gcaaaatata catttgaagt
ctgcgtttaa gacatgaact 36300tagagatttc taagcctgtt tttgggggtc actctaaaag
ggagggcatt ttcaggtccc 36360ccctctgtga gcattcacag catcccagcc agcctgtatc
cctgtgccgt gaccccagag 36420cacagagcct cagaggctgt ggaacaggcc tggggttcac
gtggagagtc ccaagcccac 36480gtcatgacag gagatgccca caccttcaac aaacagactc
caagctccaa agctgcctaa 36540caggagtggc agggggacca gccctggcaa ggcccaggtt
tggctgaagc ccgtacctgc 36600agcccagctt gtaagtttag ccgggaagca tctgaggcag
gcagtggagg acgcacaggg 36660gtgagcggtt catctcacag gtgtggctgg gtgcctgcta
cagcttcaag gcatcaggtg 36720gggcagaaag ggacaggcag ggtgcccagg agtggcggag
ctggaggaag ggcacagccc 36780cagggcagct cccgaggagc tcagttctca caccccaggt
ggtcacagaa cacatttcag 36840cccccaaatt gaacagggct gtgttcattt tttccttccc
ccagctcatc ttattgcttg 36900tgtcagcctg ggtgaaaggt tatttataat aatattctcc
tgtctcatct gggcagcttc 36960tgcatggaag gtaccgagtt aataactttg catgcgttat
tttgttcaag cttcacagca 37020atccgtgagg taggtattgt ttgttgttcg ttttgcccat
tttacagatt aggaaactga 37080ggcccacagg cgtaagcaac ccaaagtcac accgttagaa
ggcagtgaga tctgaatgct 37140caagtcagca tgcgctcttc ttccctctcg ttttctgcat
acgtgctttc tccacgtctc 37200ccatatcgta agggcatggc aggaggagag ggaaactgat
gtgatggtgg atggcaaaac 37260attttagtgg ttccaagaat caagccccct tgtgtattct
tggctttcaa tgaactgcgt 37320cttcagagca gcgcaggtta aaccactttc tcgcctcatt
gtagtcaatt ggcttagttc 37380agccagagcc aggtctggcc caccatgtgt ctggactgtg
agtgccttga ggctcagtgt 37440gacgcgatcg tagggtgagc ctggccgggc tgcagccagc
caccaagcca gccctgggct 37500caaagagcag tgaggccatt ttgagagagg aatgctccct
cgtgcagagc ctgtctccac 37560ctccatgaga cgggaaggaa cagttgtctt ctcataaaag
catctggcgc ccatgaaacc 37620tggcacctcg cttgacctcc ggctaacggg atgcggcagg
gtgctgtgga actaatggag 37680ttctcttctt cagccagcaa ttaggggtta attattagca
attaggggtt cgccacatac 37740cccgtggacc tgctgttcct tcccaggaca caggtcactt
gtcattctct gtgttagcaa 37800ctgaaaccgt ctgccttctg ggtcccctgt taatcattgc
agcagggtcc ccacgagtct 37860gacagcaggg ttgcgttctg tgcgacctgt ggcatccctg
gagctgctgg ctgctgcgct 37920cgccccactg ggtgtggcgg cctgccgccc ggctctgtgg
tgctggtccc ctgcggtcat 37980tctcagcgta gcgcctggac tgcccctcag ccgcggcttg
gctctgcagc cagcgggaca 38040ggcccggtgc tcagctcccg atacttagca tcctgcggtc
agctccccgt gctcagcatc 38100ccgtgctccg ttctgctctc tcctaggcgt tactacaagg
agacctctgg cctgatgctg 38160gacgttggtc cctacatgaa ggcgcttgag gtaccagccc
tggttctgtc ccaaactctc 38220ttcagacctc agggcacctg ggactttttg ggtctttgtt
gagaagaggc tgtgcttaat 38280taacgtacaa atggcctttt gggcttccac aaaggctgcc
ttcctattct gggctcccct 38340ggccgggccc tcgggccctc aggccctcag ctgtcctcct
ccctccccat ctctcccttc 38400ccatccttcc catcctcccc atcctcccca tcctccccat
cctccccatc ctccccatcc 38460tccagtgggc gcagcccccc ctgcccgccg tggaccctgc
cgctctccca gctgctctct 38520gccacactcc tctcgggctt gtggcttctg tatcagtgct
gaccaccgct ctcattcctg 38580accctgggca tctgcagcac ctcaagcttg acgggtcctc
actggcctcc ttcctgactt 38640ccccttcagt ttccaaggcc atctgtggcc aggcctccgg
gtggggctcc cctgggtcca 38700ctctggccca ggtcccatta gctgccccag gtcagcctca
catcccgtcc catcctcaca 38760gctgcagccc ctcctgcctg cctgccgtca ggtgggaggg
agaggcagac gagcagatgg 38820agccctgtga tgagctgcga gtgggcaggg gagaagaggg
ggagaaggga tgagaggacg 38880gaattgcagg cgagtgagca gagcactgct ctgctggccc
ttgggagccc tgtagttttg 38940ccagctcttg gatgccttct gctgtgggta ggcctttgag
tggatggcag ggtgcctggt 39000gctaggggac ctgagtgacc cgggcagccc tttctggagt
tgtgaggctc ctgcctgtga 39060gatgagttct caggcccaca gcatcctgct tcttcaaggc
ccttgactct tgggggcaca 39120tttgggacag tgtttccgtg gctgggcata cacctgttcc
ttgatcccaa acttatgcca 39180tgtccctctc tcttttccca gtatgcctgt ggcatcaaag
ccgaggtggt ggggaagcct 39240tctcctgagt ttttcaagtc tgccctgcaa gcgataggag
tggaagccca ccaggtaggt 39300tggcgccttg tgaagtgggt caggggaggc agccccgtca
gggaggccct ggagcttgga 39360atggattaca ggactcaggc agcctgtggg gttggccagg
cagccaagcg tgggctccct 39420gtgaaagctg acctggctgg gaaggagggg gaagacagca
aacgaaatcc actgaatagt 39480ttcaaccgtg aagttacttt cagtatgaaa gcaagaagca
gaaatgcctg cggcttttcc 39540tgagtttttg ctgcttctct gaaaggataa gaattgacaa
gtcctatcag tgtgttaata 39600tatctcactg gcaagacagt gtaacagcaa gattacaaca
atatggagga aataataaag 39660tcactcattt tgcgaccttt atattttgac tattttggga
tagattgcct ttcaaacttc 39720caaattttag aaaggaaaaa gaatcggtta tgattttatt
gtctacacct accccaaccc 39780taagtgagtc tggcttcgtc ctccagtggg ttttcttttc
tttttctttt tttttttttt 39840ttttttttga gccaatgtct ccatttgtca ccctggctag
agtgcagtgt agtggcacag 39900tcatggctca ctgcagcctg gacctcccag gctcaagcga
tcctgccatc tcagtgcccc 39960cttaaccccc ttactaacca ggcccccccc ccgccaaccc
caactgagta gctgggacta 40020caggtatatg ccaccacgcc tggcaaactt tttttttttt
tttttgagac agagtctcac 40080tctgtcaccc aggctggagt gcagtggcgc tgtctcggct
cactgcaacc tccacctcct 40140gggttcaagc aattcttctg cctcagcctc ctgatgagct
gggactacag gtgtgcacca 40200ccacaccagc taatttttta tttttagtag agacagggtt
tcaccatgtt ggccaggctg 40260gtctcaaatt cctgacctca ggttatctgc ccgccttggc
ctctcaaagt gctgggatta 40320caggtgtgag ccaccatgcc cagccccggc taacttttgt
attttccata gagatggggt 40380tttgccatgt tgcccaggct ggttttaaac tcctagctca
agccgtcctc ctgcctcggc 40440ctcccaaagt gctgggatta caggcatgag ccaccaacac
acagccccct ccagcaggtt 40500ttctgaaaga ttgtaaggaa tagtgggcag ccggcatgga
ggctgacgcc tgtagtccca 40560acactttggg aggctgaggc aggtggatca cttgaggtca
ggagtttgag accagcctgg 40620ccaatgtgat gaaaccccat ctctactaaa aatacagaaa
aacaaaaaaa attagctggg 40680cgaggtggtg catgcctggt aatcccagct acttgggaag
ctgagggatg agaattgctt 40740gaacctggga ggcagaagtt gcagcgagcc gagatcatgc
cactgcactc cacctgggtg 40800atagagtgag actccttctt taaaaaaaag aaaaaagaaa
cagagtccac agctgcccca 40860ggctgtcctg ctgcctcctc agaagtgggc agcagaagca
tgggttgggg gtccctcagg 40920tgtcaagcct ggtgcatgtt tcccctgtgt gcatttcggt
ccagaacact ggtgcctcat 40980ggtgctttga gaaagatggc ccaaagctgc caaaatctag
acactgcacc tgcacctgcc 41040ctgagatcca tagagatgtc agaagatgca ggctgtctgg
agtccccaga aaacacctgg 41100gaaatgtcac attagtctgt tgagaaggaa aggtttgtgt
tttgaatgag tgagaactgc 41160agcaataaat aacctggctt tttactacct tgcttcatag
ggcagaattt ctttttttat 41220tagagatttc aaacatacag aaaagtagag agaatgatac
agacacccca ttatatccat 41280catccagctt cagtggttaa gattttgcca cattcctttc
ttctatcctc ttttttttcc 41340ttttaaaaaa tttactggct gggtgtggtg gcccacgcct
gtaatcccag cactttggga 41400ggctgaggca ggtggatcac ctgaggtcag gagttcaaga
ccagcctggc caacatggtg 41460aaaccccgtc tctactaaaa tataaaaatt agccgggtgt
ggtggtatgc acctgtaatc 41520ccagctactc aggaggctga ggcaggagaa ttgcttgagc
ccaggaggtg gaggttgcag 41580tgagccgaaa ttgtgccact gcattccagc ctgggcgaca
gagtgagact ctatttccaa 41640gtaatagtaa tagtaatgat aataataaat ttacttattt
ttttgcttca tattttaaag 41700caaatctcaa ttgtatcctt tcatgttgta gcatgacatt
taaaaatcac gatttttgga 41760tgtaaataca atacattaaa actcctaaca tggccaacag
gcatacaaaa agatactcaa 41820tatcatcatt catcaggaaa tgcagatcaa aactccagtg
agataccact tcacactaag 41880atgactgtat agttaaaaaa aaaaatagaa aataacaagt
gttggcaagg atgtagagaa 41940attggaaccc ttctctattg ctgctgggaa ggtgaaatga
caatgctgcc acggaaagcc 42000acttggtggt tcctcaaagt taaacataga atttgccgta
tgactcagca acccatttcc 42060aggtctatac ccagaagaac tgaaacacgt gttcaaacaa
aaacttgtac acagatattc 42120atagcagctc catacacaat agccaaaagg taaaaaaagc
caagatacct gcccatcaac 42180tgatgaaagg ataaccaaat gtggtatgtc catacagtgg
aatattattc agccataaag 42240aggaatgaag ttctgacaca tgctacagtg tagatgaacc
ttggtagcat tatgctaagt 42300gaaggaagcc aaacacaaaa ggacacatat tgtattcttc
catgtatagc aaatgtctag 42360aatgggcaaa tccatagaga tggaaaggag attagtggtt
ggcaggggtc gggtgcaggg 42420gaatggggag tgcagcttca gggtacgagg tctccttttg
ggatgataca aatattctgg 42480aactgggccg ggcacagcag ctcacacctg taaccacagc
actttgggag gctgaggagg 42540gtgaatcact tgaggtcagg agttcgagat cagcctggcc
aacatggtga aaccccgtct 42600ccactaaaaa tacaaacatt agccagacgt ggtggtgtgc
acctataatc ctagccactc 42660gggaggctga ggtgggagaa ttgcttgaac ccaggaggtg
gagattgcag tgagccgaga 42720tcgtgcaact gtgctccagc ctgggcgaca gagccagact
ctgtcttaaa aaaaaaaaaa 42780aaaaaaagaa aaatattctg gaactggaca gtgatgatgg
tggcacgacg ttctgaatgt 42840gctcgatgcc agtgaattga atactctggt cacggtcata
aattttatgt aatgtgtatt 42900ttaccacaat aaacatttta aaaccaccta aaaaattaac
agtgattttt gttagtatta 42960tttaatagca acccatattt aaatttcctt gactgtctca
gaaatgctct tttaccgttg 43020atctgttgga atccaatctg aagaagctcc gtgacttgtg
tgtcgttttt atgttttctg 43080agcctctctt tttagccctc catcatagaa catttgaaat
atctatgtca gtagagaggg 43140cagctcaggg aactcctagg tacccatcac tgtgctttaa
taattatcaa ctaataacca 43200ctttaaaatt gttttcaact tttcattttg aagtaacttg
acactcacaa aaagttgcaa 43260acatagtaca aaatgttcac gtgtaccgtt cagcagcttc
cccatcatca tcgagcagtg 43320ataaaaccgg gaaatcaaca ctggcacaag gcggttagct
aacctagaga gcttattcag 43380acaccaccag tcatcccact catggctttc tggggacacg
ttgcatgtag cagtcatgac 43440cctagtgtcc tttatctgtg aactctgatc acctgttgag
gaggtgtctt ccagagttct 43500ctactgtcat tgctatttgt ccctttcagt aattatcttt
ttttttttga gacggagtct 43560cgctgtgtca cccaggctgg agtgcagtgg cgctgtctca
gctcactgca agctccgcct 43620cctgggttca cgccattctc ctgcctcagc ctcccaagta
gctgggactg caggcgcccg 43680ccaccatgcc tgagtaattt ttgtttttgt atttttagta
gagatggggc ttcaccatgt 43740tagccaggat ggtctcgatc tcctgacctc gtgatctgcc
cgcttcggct cccaaagtgc 43800cgggattaca gacgtgagcc accacgccca gccttttttt
tttttttttt tttttttgag 43860caggaatctc atcctgtcac ccaggctgga gtacaatagc
atgatctcag ctcactgcaa 43920cctcctccac ctcctgggtt caagtgattc tcctgcctca
gcctcctgag tagctgggat 43980tacaggcacc tgccaccaag cccggctaat ttttgtattt
ttagtagaga cagggtttct 44040ccatgttggc caggcttgtc ttgaactcct gacctcaggt
gatctgcctg ccttggcctc 44100ccaaagtgct gggattacaa gcgtgagcca ctgcatccaa
ccctaataag tatcttgtgg 44160gaaagttctt tgacactatg taaatattct ctttctgatc
attagttcta attttggcat 44220ctcttgataa ttcttttatt tttttatttt ttgagacaga
gtttcactcc tgttgcccag 44280gctggagtgc agtggtgcct gggctcacca caaccccctc
ctcctgggtt caagcaattc 44340tcctgcctca gcctcttgag tagctgggat tacaggcatg
tgccaccacg cccatctaat 44400tttgtgtttt tagtagagac ggggtttctc catgttggtc
aggctggtct caaactctca 44460atctcaggtg atccgcccgc ctcggcctcc caaagtgctg
ggattacagg catgagccac 44520cacatccggc cttcattggt gtattttatg tgtggcccaa
gacatttctt cttccagtat 44580ggtgcaggga agccaaaaga tcggacaccc ctgctctaga
ggtttatgac caaaatactg 44640tatttaaagc tgctgggaat aattcttttt tgtgggtggt
attcccccaa gtggacgctc 44700tttgaattca ttttcatttt gatgtttagg gaattggctt
aacatttaca tttgttttgt 44760agttatgtaa aacatttata tggttctaaa atcaaatcca
caaagcaaga catattcaaa 44820gaagtccaac ttcttttcct tctattttgt ttttacctgt
ctctggtagg tagccatgtt 44880tataaatatt aaagctataa atattttata aatattatag
ctttttttat ttaaaaaata 44940agcaactatg tatatatttg tattcccctc ataggatgaa
tttaagagac aaattctcaa 45000aataatgtgt tttatatata gagagaaagg ctaatgcttt
aagtttaaaa atcagtattt 45060aaaagtttat aagtaggctg ggcatggtgg ctcacacctg
taatctcagc actttaggag 45120tccaaggagg gcggatcacc tgaggtcaga agttccagac
cagcctggcc aacatggcaa 45180aaccccatct ctactacaaa tacaaaaatt agccgggcct
ggcagcaggt gcctgtaatc 45240ccagctactc aggaggctga ggtgggagaa tcacttgaac
ccaggaggcg gaggttacag 45300tgagccgaga tcctgccact gcactccagc ctgggtgaca
gagtgagact ctgtctcaaa 45360aaataaaata aaataatata agtggtatga tcacaaccat
ctcaaacagc ccccttcccc 45420cgacccaaaa cccaaacaaa aacaaaaccc tgccactgcc
tggaaaataa gacctggcca 45480ggtgccatgg ctcacgcctg aaatcccagc actttgggag
gccgaggtgg gcaggtcatt 45540agagcccaaa agtttgagac cagcctgggc aacacggaaa
aaccccgtct ctacagaaaa 45600aatgcaaaaa aaaaaaaaag gcagatgtgg ttgcatgtgc
ctatggttcc agctacttgg 45660ggggctgagg tgggcagatt gcttgggccc aggaggttga
ggttgcagca agctgagatt 45720gcaccactgc actccagcct gggcaactga gcgagaccct
gtctcaaaaa aaaaaaaata 45780ggaaagcaag aaggaaggaa ggaaagaagg aaggaaggag
ggaaggggaa gggaagagag 45840gggaggagac gagactggct gtccagatgg cattgctggt
gatatttgga tggtctgacc 45900ttgggtggtc tccctgcctt tttcttgtgg cattttcagt
tattttcaca gtcagcttcc 45960ttcacatttc gaatgagata cagcccacct caccgccgaa
agaagacccc tcatgggggt 46020ctggtgaatc tgggtttgtc gttttcggat gctggggcag
tgcgtctggg agggagcggg 46080gctgtggatt cctggtgggt caccgtggct gcttgaggct
ttggtccctg ccgctgccat 46140ccaccctgag catctgtggc cctggagagc gaaagacttg
tttcctctgg tggctccgag 46200catatgtggc aaagctgtgt cttatttacc tcccacatca
tcatcctctc ctaccctcat 46260atgacatttc tgcctgtccc cttgacagtg gagtagcagc
tgcatgttcc gaacgtcccc 46320accctgccat tttcctaaaa tgctcactaa cgttatttga
ctctcaggat gtgtagacat 46380tttaaagctg tatttcaaga taaggccatt tcaaatgaac
tgtgtcttat gcacagggta 46440ataattctct gttcctggct tttttatttt taatgaaaaa
tgcctgcaag gacctctctt 46500gctgctgaat gccaggctgt ggcaaacctt aaatttttcc
agctgcttta atcaataaag 46560ttatcaccag tcatttccct tgatcactgc tgcttcccga
gaccctgacc tgcttcttgc 46620tgctgagttt tcagtctgca ctcagaatag atataactcc
ccaaatccaa ttttcatttt 46680aacctttcct cagtggccta gaaatcacag attgctggta
ttatgctaat agattggaaa 46740ttcaaacctc ttagaggcaa cctggccctt tctaggattc
agtaataagg caaagaataa 46800cataccttaa aggttacttc tagaaatatc taattctcac
ggtttaagct tctgaatcag 46860tggatggttt tatagcttca tatgagcacc caagtgatgt
gaaggttttg gcagttctcc 46920atcatcataa gaagttctgt gggatgcaaa tcctgcaggt
ttagtggaga ttgagactgt 46980tcaaaccaga gcacttagtt atatgtgaca gccaagaacc
ctcgcctgga ggggcccgtt 47040tcacaggtga ggagcctgaa gcttagagat gaaacacctt
gtcataggcg ctttgcattc 47100ccaccatgag gttaagacag gatttctcag gctgggcgca
gttgctcaca cctgtaatcc 47160cagcactttg ggaggccgag gtgggtggat cacgagccca
ggagttcgag accagcctgg 47220gtaacatggt gaaaccctgt ctctacaaaa aaatacaaaa
attaactggg catggtggct 47280ggcacatgcc tgtggtccca gctgctcagg aagctgacgc
acgagattcg cttgaatctg 47340ggaggcaggg gttgcagtga gccaagatgg taccactgta
ctctaggctg gacaagagag 47400taagaccttg tctcaaaaaa aacaacaaac ccccccccca
ccccgcggtt tctcaacctc 47460gacacttggt gtttttggac tgggtcattt tgtggtgagg
ggatgtccta tgcattgtag 47520ggtgctaaac agcatcccac tgtccgccca ctacatggga
ccaccccctt cccaagctgt 47580gaaaccaaaa ttgtctgtag acattgccag atgtgccctg
gggagaaccg cagcattaag 47640cggtattaag agaacggtta cggagaactc tagtactgaa
cccaagatta ctttgccaat 47700gaggagtgaa gccaggcacc aaaccatcgt cttccagccc
cgtccagtgc tctttgccat 47760agcccaactc tcagggagga ccagcggtgc agcagaattt
cttagggagg tggggagggt 47820agggaggatt tcaggttggg gggtagggaa gatttcaggt
tggcgggtag ggaagatttc 47880aagtggggtg ggtaaggagg atttcaggtt ggggggtagg
gaagatttca agtggggtgg 47940gtaaggagga tttcaggttg gggggggtag ggaggatttc
aggtgggggg gtagggagga 48000tttcaggtag ggggtaggga ggatttcagg tggagaggta
gacaggattt caggttgggg 48060ggtagggagg atttcaggtt ggggggtaga caggatttca
ggtggggggg tagggaggat 48120ttcaggtggg cgggtagaca ggatttcagg tgaggggggt
agggaggatt tcaggtcggg 48180gggtagacag gatttcaggt gaggggggta gggaggattt
caggtgtagg ggtagacagg 48240atttcaggtg ggggggtagg gaggatttca ggtgcagggg
tagacaggat ttcaggtcgg 48300ggggttagag aggatttcag gtgggagggt aggcaggatt
tcaggtcagg gggatagaca 48360ggatttcagg tggggggata gggaagattt caggtgtgga
gggtaggcag gatttcaggt 48420cgggggggta gacaggattt caggtggggg ggtagggagg
atttcaggtg tggagggtag 48480acaggatttc aggtgggggg gtagggagga tttcaggtgt
ggagggtaga caggatttca 48540ggtgggggga tagggagtat ttcaggtggg gagggtaggg
aggatttcag gtgtggaggg 48600tagacaggat ttcaggtggg gggataggga ggatttcagg
tgtggagggt agacaggatt 48660tcaggtgggg ggatagggag tatttcaggt ggggagggta
gggaggattt caggtgtgga 48720gggtagacag gatttcaggt ggggggatag ggagtatttc
aggtgtggag ggtagacagg 48780atttcaggtg tggagggtag acaggatttc aggtgggggg
atagggagta tttcaggtgg 48840ggagggtaga caggatttca ggtgtggagg gtagacagga
tttcaggtgg gggggtaggg 48900aggatttcag gtgtggaggg tagacaggat ttcaggtggg
gggataggga gtatttcagg 48960tggggggggt agggaggatt tcaggtgtgg agggtagaca
ggatttcagg tgggggggta 49020gggaggattt caggtggggg ggtaggcagg atttcaggtc
gggggggtag acaggatttc 49080aggtggggag ggtaggcatg agggctgcag tgatcttcat
ggctcgcacc ctcagaactg 49140cctgtgaaaa atgggggtga ggtttgcagg cttatccaga
ggcctggcca tataaactca 49200aaggtgccct accatttcgt tccctgattc tcagggcatc
agtcctgact ataggggtct 49260ggggacagtc tgtaccccta gggggaggct ggtgagtggg
gagtacatgt ctgtccacat 49320gggggccact ccggcccagt tctttctgct gtggcctccc
tcccccatgg gtgtttccac 49380tgaggcctcc ccacgaactc tgttggctga gccccacctc
ccagccgtgg gcattttctt 49440tggattgtca gtgtggggat gaccctgctc aactcctgac
tgggagcagg ctggcctgct 49500gggccgtgcc cacttccagc actgaaagaa tgggcccagg
acgggggtca gccctcctgc 49560agccctggtt tgaattggaa actctgtcca ccgatgcctg
gcacctggca cctagccctg 49620cagctgtggg tctgaaactg acgggagttg caggagcttc
cggccctggc tttgtctttg 49680tgatgtgtcc tagtcataag gtgctctgag acctcatggc
ttagttttcc ttcagagttt 49740tttctccctt ggaggctttt tttttttttt aagtagattt
tattttcttc tcaaaacaca 49800agtatttaaa atatctgatt atgaagcaac tcatgctcat
taaaggagct aagaaaaatt 49860caggaaagta caagaagaaa agaaaccact catttcacca
gcccagaata acacaaatca 49920tgcttgatct tgccaatttt ttgtgtctat atatacattt
ttaacacaaa attgggataa 49980tcttgtatct gctgtttgaa atctgagttt tcccccttaa
tatcctattg gggacaattt 50040ttgttaactt atattttagg ttcagggata cgtgtgcagg
tttgtgatac aggtacattg 50100tgtgtcttgg gggtttggtg tgcagattat tttgtcctcc
aggtcataag cctagtacct 50160gttggggtag tttttcaatc ctcaccctcc tcccaccctc
caccctcagg taggccctgg 50220tatctgttgt tgccctgtgt ccatgtgtat tcagtattta
gcccccactt ataagtgcaa 50280acacacagta tttggttttc tgttcctgca ttagtttgct
taggataatg gcctccagtt 50340ccatccctgt agctgcaaac gacatgacct tgttcttttt
taggggacag ctgtctttgt 50400gcttattgaa gcactttaac gaagcagttt tcacagctgc
atactattcc actgttcaga 50460ctcccgtcat attcttgacc atttcctatt gatggagatg
aggctgcttc cagtttcgct 50520cttgtgcgta accctgatga acacatcttg tatgtacatc
ctcttgtaca tttctaatta 50580tttgctccaa atcctagtgg tgtggctggg aggctgtgat
gcatgttgct tatttctgtt 50640tcagaaaggt gtgtccgttc accttcctga cctgctgcgt
atacagatgc ttgtttcccc 50700acatccttac caacaccatg atttctgact tttttttttt
aaagtggtct ttgagttttg 50760ctgttagtct tgttccttat gaaggtgtcc gaccacagtg
gcaattactc agcattgcac 50820tgtctccagg tctctgtggg agcctgggtt cctttggggc
ctggattcag ggctcatcct 50880agagtgagct gtcatctcag ggcatctccc tgggggatat
gtgcacccca gagctgcagt 50940tacagacact gagctcactc caccttcacc acccagccct
gctgcttccc gccctaatcc 51000tcctggcaga gcccctggat aaggaaaata tttgtgaacc
acggaagctc tttggtgctg 51060gaattttgca ctccaaagca tcgccgcctg ttagatgatg
tttccctcca gcacctagga 51120cactgaagag gagaccaggg caggcaggca gatgcgtaag
agggaggatt gactgctgta 51180gcacccgacc caggacacct gcctctggga ctgtggcgcc
tccatccggc tctctggatc 51240ctgggttctg gggggtccga cctttcattt ttttctccgc
ctcacacccc accatgtcgt 51300gtctgaaagg aagcgaagcc cagccctggc aggtggagta
tttttatttc ccctgacagc 51360gcacctcctg acctctatct tcttgccagt gattttttaa
atcctaaatg atttggcaat 51420aattgctcct ccttgcgtga ttattgccct tcaccgatag
taaatctggg aatcccctgg 51480ggtcctgagc gcactttggt atgcggagcc cacaagctgg
cctggacagg catcactgct 51540ccaggcagct gtgcacccag ccggccctgg cgtcccggcc
cccagcctcc gcctccctcg 51600ccgacagctt cccggggctc tgcctggagg cagcctcgct
ttatcccgca gaacctccac 51660gctgttcctg catgagaact cttcctccaa gagaccacgt
tcctctggtc ctgtctccgt 51720gggccacatc cacaaatgtt ccagcccagg aagaaccaag
tctgtattct gggagggaaa 51780cctgtgtgct gtctggtttt gtatgtggga ggagcatgca
cacagtttca ttttccccgg 51840aagcatggag gtgtaggatg acatgtaata tttggaagac
ccattgcatt tccaaggaac 51900cttctagcag gatggccctg gagtctctgg cagcagtgga
catggtcctg aattggagtg 51960ctgtgaatgt tcactgctga cctcacggtt tgaagacaat
tccttgagaa gggcagaggg 52020gacccgaata cgggggctct ggccggccct gctgccagat
gccagcagcc gtgggggctt 52080gcccggccag ccctggcttt tcccaaccct cagagtgccc
cttgtgcatg cagcagacac 52140tggtgctggt cccattcccg ggagcacgcg gtgccactgg
ctgtgtcttg cctggccagg 52200tgttggtttg atagactcga aaacctcatt tggatttggc
aggattttcc ctaacatcag 52260tcattctgga aaaaacactt tgaattttta ccatatctgg
gtatttttac catatctggg 52320tacttgccac tagctctatt tacttcacgt ttcttaaaat
tgattcattt tattacttaa 52380ataaatgtat tttaaaagga acttttgatg actactataa
atggaaaacc agaatctctc 52440acttaccata atagattaat ctaaaaataa agaccacatg
attacattaa aatgtatgtc 52500cttgctctgc ctggagtcaa atagtatgtc acccgtgtgc
atgacacact ctgggaagca 52560caggggtggg tggaggaaag atcagagaaa atccgtggaa
agtatgaagt agagagcacc 52620acaccaggaa gacgtggttt tatttcagta cctcatgtct
cgcggtggct tgcggctacc 52680agtaactggc agaattctca acattgtggc cctcagccac
ctcctccgag ctctctctca 52740gcaccaccat ccgccgtggc ttctctgagg gaggcccctc
tggcctctgc ttgactatga 52800tacatgtcga gtccatgccg agtggcagaa aagctccgtt
ttcagctgaa actacctccc 52860tggcctcagg gggcctcatt ggctctgtct ttgtactttc
tcaatggctc ctgatgattc 52920cttggttatc tgaaggcctg tttgggagac ggttcccagg
acagttgctt caggcttata 52980tgtgtgtgct ttctggaatt cacagcatgg cgtgccctgg
tgagtggtgg ggagatgggt 53040agtgaacacg tgtggtagcc cctgctgggc gtgtcccacc
tgtcctccat gtcctacaac 53100ctggttggcc ccttttgctt gtcagagcaa agctcagaac
ctcacctcct ctgggaggcc 53160tttcctgacc acccacaagc ccagtaggtc aggaggtcag
gcctcacagc ctcctgtgtg 53220ttatcccagg ccctgatgat gagactgtgg aattacctga
ttaatgtctg tctttagcac 53280ttgtcgtgag ctccgtaagt gctgggactg catgtgtctg
attcacactg tgtgcccact 53340gcctggcctg gcactggcct gcgaaggcct agggcaaagg
tatcaagtaa ccagggcaag 53400ggtgcctgtt gcaggtcggg ggcctctctg atgtggtgag
atagggaatt tcctcaacga 53460aacatttcct catgtttctg tttcctttgt aacatgttta
ggccgggcgc ggtggctcac 53520acctgtaatt ccagcacttt gggaggctga ggcaggtgga
tgacttggtc aggagatcaa 53580gaccatcttg gttaacacgg tgaaaccccg tctctactaa
aaatacaaaa caaacagcgg 53640ggcatggtgg cagccgcctg taatcccagc tactcaggag
gctgaggcag gagaatggcg 53700tgaacctggg aggcggagct tgcagtgagc cgagatcgcg
ccactgcact ccagcctggg 53760tgacagagcg ggactccgtc tcaaaaaaaa aaaaaaaaaa
aaaaagttta ggtaatgtat 53820tatgtcctgg ccctgtgctg ggtgcttcac taagatgatt
tcctttaaac gccacacctg 53880ttatacccat tttacagatg gaggtggagg ctggcccaag
gtcacacagc cagcaaacag 53940gagagctggg atctccaccc cactgctcta gctcagcgtg
agacaggagt gagtctgtgc 54000atctgctggc cacatcgggc ccgccccctc tttctgtggt
ccccttcccc cagatggaga 54060gcaaattcca cctggacccg tccttgcagg cagagtccaa
atctatggga cacacactca 54120gtaagactcc cctggcttgg gcatgcttca tagtcatcgt
ttcattttgt cttgattttg 54180gtttgtgtat ttatcggtta gcccctctta gcatctgaat
tccagtgaga agacatgtag 54240tttgtttgtt tgtttgtttg tttgtttgtt ttaagacaga
gttttgctct tctcgcccag 54300gctggagtgc aatggttgca atcttggctc attctcctgc
ctcagcctcc caagtagctg 54360ggattacagg catccgccac catgcctggc taattttttg
tattttaata gagacgggtt 54420tcaccatgtt gcccaggctg gtctggaact cctgggctca
agtgatctgc ctccctcggc 54480ctcccacagt gctggaatta caggcatgag tcactgcgcc
cagcttttaa tcagattttg 54540gagtagggga gtgagaggag ctggttggtg gtggtttttt
tttctttaaa attattatta 54600cttttttttt ctttaaaatt attattactt tttttttttt
ttttttttga gaccgagtct 54660cactccatcg cccatgctgg agttcagtgg cacaatctca
gctcactgca acctccacct 54720cccaggttca agcaattcgt gtgcctcagc ctcccgagta
tctgggatta caggtgcatg 54780ccgggctaat ttgtatattt ttagtgaaga ctggtttcac
tatgttggcc aggctggtct 54840cgaactcctg acctcaagtg atccacccgc ctcagcctcc
caaagtgctg ggattatagg 54900tgtgtgctac cactcccagc tgttttttct ttaaattaca
tatatttatt tatattaaaa 54960aaaaaaaaag agcatgagtc catttaagca ataacacatg
ttaatgtggc aaaaggagac 55020ccacaaatga ctgacgtttc agccacactg gcgaggactg
tatgactccc tccactctat 55080gtaggtgacc aatgggaagt cttgtttgcg cctcataagt
gtgtccagac acagccagcc 55140gtgccgtaat gcagagcact cagctgctca agacacattt
gctgagcgtc aaccaagtgc 55200cagttgtcaa cccagattcc ttgtcctgaa ggagctccta
cttctggaag ggtgggatga 55260gcagagtcca ttatgctgca ctatgacacg ggtggcacca
gggtgtgatc ccagagtgct 55320ttgggagagc cacacaaccc agcctggggg cagaggaaag
gtggtggtca ccagagctga 55380gtcctgaagg cctgggggga attagccagc agaggaggga
ggtgagcagg cagaaggtca 55440gcacgtgccc acaaggtaat tttctggtgt tgatgtgtat
ttccagctat tgtcattttc 55500ccttcgccca aaggacttct tctacatttc ttgtagggta
gatctcttgg tgatgaatta 55560tttccgcttc ggatgtctga aatatcttta tttcatcttc
attttggaaa gatatttttg 55620ctgggtacag aattctagat gggcagtttt ttcttttact
gcttaaaaga cggcactcca 55680cagtcttctc acttgcattg tttccaacaa gaaatctaac
gtcttaatct tcgtttctct 55740gtacataaca tgtctctttt gtctgactgt tttataagat
tttctccttg tcactggctt 55800tgagcaattt gattatgatg tgccttaata cagttttctt
tgtgtttctt gtgtttggtg 55860ttcattggga tcttggatct gtggatttat tatttttatc
aagtttggaa aagtttcagt 55920cactatttct tcaaatatat ttatgtcttc tctcatctca
cttttttttt ttttaatcta 55980atttagagac agggtcttgc tctgtcgctc aggctggagt
gcagtggcac aaacacagct 56040cactgcagtc tcaaactccc gggctcaagt gatcctccca
cctcagtctc ccaagtagct 56100gggactacag gcgtgcgcca ccacatctgg ctaatttttt
aaatttttat agagacaggg 56160tcttgcggtg ttgcctaggc tggccttgga ctcctgggct
caagtgatcc tcccacctca 56220gccttccaca gtgctaggat gataggtgtg agccactgtg
ccccagctca tctcactttg 56280aggaactcta attgcgtgta catcaggccg cttgaagttg
tgccatagct cactgatgct 56340ctgttcattt ttgaaaataa tttttctgcc tctgtttcat
tttgggcagt ttctattgcc 56400atgtctccag gttcattcat cttttcctct gcggtatctc
ttctggcatt aatcccaccc 56460agtgtatttt gatccctgac attgtagttt tcatctctag
aagcttggtt tggatttttt 56520cttttatctt ccatttctct acttttggaa cacatgaaat
accattacaa gggctgactg 56580aatgctctcg tctgctcatt ctcacctctc tgtcagttct
ggctcagttt ggataaactc 56640atctttctcc tccttctgca tgatactttc cttctttgca
cgtctgatca ttttttattg 56700ggtgccagac gtgatcaatt ttaccttctt tttgggtgct
ggatattttt gtgtgcccat 56760aaagaatttc gaactttgtt gtgggatgca gttaaattac
ttgaaaacag tttgatcctt 56820ccagatcttg ctgttaagat ttgggggcag caccagagca
gggttgccct gggtgaatca 56880ttcctcatag tgaagccagc ggcttctgag atcgcactca
atgccctgtg catcgtgagt 56940gagctttccc agcctggctg gtggggacat aacgcagtcc
tgcgtgagac acaggcaccg 57000ttttccctag ttatttggtg tgattctttc ccaggctcag
ggagtttcct ccctcacgtg 57060tgctggcagt ggtctgctga gtacttgagg ggaccctcca
cgatctccac agctctttct 57120ccatgcagcc ttctcctctc tgattccctg tcctgtggac
tctgctcctg ggtctcccca 57180gaatcctagc tttgtctccc cagttcaggg atcctactgg
gctctacctg ggtttccctt 57240ggtattagtc agcccagtct gtcatacaaa acaccacaga
ctgtgtggat tcaacagcag 57300aaactttttt tctcactgtt ctggaggctt gaagcctgag
atcagggtgc cagcaagtcc 57360ggtttctggt gagggctctc ttcccagctt gtagatggcc
gccttcttgc agtgtcctcc 57420catggcctct cctctgtgct tgcatggaga gggcactcca
gtgcctcttc ctctggtaag 57480gacagcagtc ccatcagatc agggccccct ctgacctcat
tcgcccttaa agaattacct 57540ccttacaggc cctgtctcca aatacagtca cgctgcgggc
tagtgaattt gtgggggaca 57600cacttgagtt cacagcaccc ctcctgtgtc cacagtctgg
aagcatcgaa ggcagtaggc 57660tgggggcagt cacaggagcc acctcatttg tttcctgcgt
ctcagaagtc attgtccttc 57720gttgcttgat gtccggtgtc ctgaaagctg ttggttcata
atttggtttt ctggtttttt 57780caggtgagag agtttatccc agaggctagt ttttgaaatg
atacccaggc tgtgagcatt 57840tcctctgtgt gtgtgtgaat tttggtttgt atatattatt
ttgtttttaa ggcaatacct 57900tcaccattga aagcaaaacc tatataaagc aaaacttaca
aagatttgcc agagatttgc 57960ttcactttca tggaatcaag ggcaaataac caggttacag
agtctttagg gtcaggtgaa 58020atttagcctg tgttccagca gaggccagac agcgttcagc
atccccgcac gtccttgggg 58080cgccaaatca ctttgacgtc tgctgattga tttcaaattg
cacagatata tactgttcaa 58140agctgtggcc tacgggggca gtgtgagata tattcaaggt
gggttgactt ttaatcaata 58200cgatgacaat attatcttgt ttaatttgta tgatggtggt
tgtaaacatc aaatcaagcc 58260atttattatg tcaaaaaaag atgaaggagc ccgggaataa
aactctcctg acatcacttc 58320tgagcgtatt tcactgccgt gacaggccgt catgattggg
gacgatatcg tgggcgacgt 58380cggcggtgcc cagcggtgtg gaatgagagc gctgcaggtg
cgcaccggga agttcaggtc 58440agtgccagct ggagtcattt attcaccttc cttccagggg
atgaccacat tctcattctg 58500ttttgttctt caaaataaag gggatattct ttccaaatca
aagagcagta tgtgggcatt 58560cactatcttg tatgtaatgg acctctaaga acagggctga
gtggtctttt atagcagaca 58620gtgataaagt agagtggctt ttgatacagg gtctgggttt
tggaatggcg tgcagtgcag 58680agagaaataa gcaaaggcac tatccatcct ggcggcggcc
caccagggag ttgggccgaa 58740tttgtgctgc tgtccctttg gcaatttgat cagaccctgc
actacggcag ttacagctgt 58800gggcactcag ccccttggac agcctccctc tccttttgtt
cactttctcc actctgtaag 58860acacagagag cctggagacg acagagggtt gctcaggcat
gccctggagc agaacctcca 58920tctcaagaag aaggttgtgt tttatttgct ggcaaaagtc
tgcctggtct ggagcagctg 58980tcaggggttg ggagctccca ggcatgtgga gggccctgtg
gtacccaagc tccagctcgc 59040ctggtcatcg agaggatctt ggacatcaga gtgttgcctt
gccttatgct agatggtttc 59100gattgtggcc attcagagaa agtacatctt tgcagatgct
ggggagactt ggttgaaaca 59160tggttacctg gaggtacttg gatttggggt gggggcagtt
acatttgtcc tgagagatct 59220gcttaggcag gaccagagaa gacagatcag agggcccgac
cgtcgctctg aagtctctcc 59280tgcgggttgg gagctgtggc tgagtgcgag gtgagtgagg
aaggctcagc gcacctgggt 59340ccagggtcag gaagttggtc ccttccagct acctcccaga
ccctcccagg cttcccccga 59400cgtgtatgac atatgggcct tctttctttc ttgaagttgg
gttttgtggc tccccttggt 59460taggcctggc ctaagggagt gagggagacc ccactcacaa
gccacaggaa cgtttttttc 59520caagagacca gccaggggtc ccccagtccc caatcctgcc
tttcctgcaa aggtcgcaga 59580gcctggtctt gctgctgctc ccaggcgagg ggctctccgc
tgacagcgga cctcggggcc 59640ctatagttga tgccgggtgc tgcctggcgt cagccgtcag
gagcccagcc cggcgcccca 59700tgcgtctcag ggcacacctg tgacccgtga gcacaagtga
cacctccctg cttgggagca 59760cagcggtgtg gctcggcagg aggtggaaag ggagaaccgt
actgtgttgt gtactgggct 59820tcttccagtt ccttggagga gaaacatgtc tgtcctgaat
gccacctcag ttggggtctt 59880tctggccagc ccaggccagg tgcccctata aactgagaat
gggcctctga agaaggcccc 59940ccacagccca cgaatgcagg tgggcacgag ggaggctgtg
cccaccctac gccagcaagg 60000gttggtgctt ttcagattcc ggttacctgc atgctattta
aagaagagta cgaggtgggc 60060atcccccatg acactgtctc ccggagagag agaaactctg
tttttttgtt tttgtttttg 60120tttttttgag acagagtctc cctctgtcgc ccaggctgga
gtacagaggt gcgatcgatc 60180tcagctcact gcaacctccg cctcccaggc tcaagtgatt
ctcctgcctc agcctcccga 60240gtagctggga ctacaggtgt gcaccaccac gcctggctag
tttttgtatt tttagtggag 60300acggggtttc accatgttgg ccaggctggt ctcgaactcc
tgacctcagg cgaaccaccc 60360gcctcttcct ccggaagtgc tgggattata cgcgtgagcc
accgcgcccg gccccagaga 60420gagaaactcc atcaacagtt tgccgaatcc ttccacactg
ccttctctgc atagataaac 60480agactggtaa tattttggtt ttttcttttt ataaaaatgg
gatgatattg ttttgcagcc 60540tcctttttcc ttgaacactg ttttatggcc tctttctaca
agtgtcttca taatgatttt 60600tttggtaatt tttaaaattt cagtagtttt tggagaacag
gtggtgtttg gttacatgca 60660taagttcttt agtggtgatt tctgagattt tggtgcaccc
atcacccgag cagtatacac 60720tgtacccagt gtgtagtctt ttatttctca cctacctccc
acccttcccg gcaagccccc 60780aaagtccatt gtattatttt tatgcctttg cgtcctcata
acttagctcc cacttataag 60840tgagaacata cagtgtttgg ttttccagtc ctgagttacc
tccattcctt gaggtctcca 60900actccatcta ggtttgctgt taatgccatt atttcgttcc
tttttatggc taagtagtat 60960tccatggtgt atatgtacca catttttttt ctttgtttct
ttcttttttt ttctttgaga 61020tagagtcttg ctctgtcacc caggctggag tgcagtggcg
cagtctcggc tcactgcaag 61080ctccgcctcc cgggttcacg ccattctcct gcctcagcct
cccaagtagc tgggactaca 61140ggcgcccacc accacgcccg gctaattttt tggtattttt
agtagagacg gcgtttcacc 61200atgttagcca ggatggtctc gatctcctga ccccgtgatc
tgcccacctc ggcctcccaa 61260agtgctagga ttacaggtgt gagccaccgc gcccggcctc
tttctttttt ttttaacttt 61320taagttcagg ggtacatgtg caggttcatt acataggtga
acatgtatca tgggggttta 61380ttgtacagat tatttcatca ctcagatatt aaacctagta
cccattagtt atttttcctg 61440atcctctccc tcctcccctc ttcaccctcc aacaggcccc
agtgtgtgtg ttatcctgtg 61500tatccatgtg ttcagatgct gtaggaaagg aatcagtgta
gtgccgggca cacagcaagc 61560cctcagcaaa tggccagggt ggggagcacg tgcggccact
actgtccata tccatccgta 61620tccatgcata tccatccata tccaccaagc tccagacatg
gagtcacctg cctccattga 61680agaggatggg acttagccca tttgttccta acgagattta
aaatgtcaga ccttttgatt 61740tgtttgtaag gatgcaaatc tgcctttgtt aacatgcagg
acaaagggag aagagtgtta 61800gcttggctaa ggtagggacc cagtcgctgg gggaggccac
atgggcatca aaggaggtgg 61860ggctctgcgt gtgtaggtgt gccccagtta aagggatggg
aggtctccca gccattggtg 61920actctctgga gctgcactgt cagacacagt agcggcgagc
cgcagggggt gacttgtagc 61980tcgcccacag tgagacgtgc aggaagtgta aaatacacac
tggatttcag agactgggtg 62040gggagaaagg attgtgaaag agctcatgag tattttttta
tatcggctgc atgttgaaat 62100gctgatattt cgtgttaatt tcacccattt atttaaattt
ttgttaacgt ggctactaga 62160aaaggttcgc tggcttctgc aggagtgttg gtggccccag
gtggcagctg ccctcctcct 62220gcactgtccc cttctccctc cagatggtac tcttggtaca
cagatacact gaaatgctgg 62280aggggcgtgg ggtcctggat gagtggctga tggggcctcg
ctctctggag ggctgttttg 62340ggactggaat tgcctgcgtg gtgctcccta tggatgtgtg
gccagcctgg ggccagcaca 62400cagcctctgg agtcagccgg gcacctgccg ctgggccggc
atcccttcct tccaggaaat 62460acccccagcc aggctctatg gagcagcacc tcgcccttcc
agggccaccc ccagccctct 62520ggcacctcag cacggcccca ttctgcacag cggtgagccc
cccaggggca gcagctgtgt 62580ctcctctgca gctgaggcat gtcgtcgcaa ccctgtgctg
agcctctcct ctcctgcaag 62640tggcccaaag aggcctcaac aggggccaat gaccactgga
tgggtcagct gggcatcaga 62700aggccttcca ggctctgtcc tctgttggtg gccttggcgg
aaatgggaga atgaaggggg 62760aggaggggtg gccttgtcct gggcccaggt ggcagagagc
cctgcagtga gtaactgcgc 62820tcacacttgg tttgaggcca gggcagcatg gggaggccgt
ccaggctggg gcagctggcg 62880ggtgggggca gcggtcagtc ctgccagtca ctggcaccag
gaagggggag aagccagagg 62940gatgactgag gcctcccact ggtcctgctg tgggacgtgc
cctcagccaa ggtcttctag 63000tacggagcac gcacacgtgc acgcacttac acgtacacac
acacacacac acaattgttt 63060ctgaggcagc tgtcaggagt ctgagaggcc gagaggtggc
tgaggagtgg gattcggccg 63120ctgactggca gtgagatctc cttgtcacct tagatgtatc
catttctctt gttttcatta 63180taaaatggcc caaagagagt gaagcctcgg ggttgggaga
gattgcaggt ggcttagctc 63240tggagggaaa gttttgcagg gcctgctccc caggaaagca
tcatccccgg gcaggggtgt 63300ctggagcttg gcccaacagg gctgcccaag agccctggac
agggaagcac aaagctttct 63360tttcatacct gctccctgtg gaaatcctgc ccagccaggg
ggcggccctc acctggctgg 63420caggtaaagg ccttctgggt gttccctgcc tgccctgttc
ccaggactcc tggcgcattg 63480gacatggctg ctcccccaga ctcctcgagg tcccatccag
ccacagcgtt cctgctgccc 63540aggccctgga caacagtgtc tcctccgtgg ggggtcctgt
gccatgggtg gcaaatcctc 63600atgatgccat gagtcctgca gccccagagg gtgctgctgc
cctgccgtga gaggcttttt 63660cagatgggac agctaggacc ttggccaagg ccacctggca
aggttttaca gaagctggac 63720tggaacctgg ccgcccccag tcccctgggc gggtgctttt
cctgcacttt tccctggtag 63780aatgtcctgc ccttggggcc acattttgca gcagtcacag
ggcctggcct cactccaggg 63840ctccatgttt ggatgcgagg cccagtgggt aagtgcactg
ctgcccacgc ccccccccaa 63900gcactgtctg ctcctccctg cttgcctgcc tgtcaaggca
gccctcatct tccttggagg 63960atggggctgg gctgggagcc tgcgccctcc cattctgaca
aatgagtaga gggaagggac 64020gttttgccca gttccaaggt gccctctctg tcctcagtgg
agaaggaaga gccgctggca 64080catctgagcc acagaccctt cctgcctctt atatccccac
caacctgtgt tcctgtggac 64140atgccagaaa ccaggccgtg tggcctctca ccaccccatc
actctgttcg ctgttccacg 64200ccacaccctg cagccaccac ccagtcaaag gcgccactgc
ctgggaagcc cctgcgctag 64260tcctggtgct gctgtgtaaa taaagccgcg atgcggagct
cgcctctggt cttctcttat 64320tgtttaccca aggctggcag atgttttata agccccgaac
gttcttgaca aacagtttgc 64380ccttcggaga gggaaagttt cctggaccgt gattcccgac
atcctgcaag gtcccttcta 64440gctgtggcct gggtgctgaa ggggtttccc cccagtgggg
tggccagccg atctaggatt 64500ccagccgcct tgcacaaagc tggaagacga ggggctgcag
ggtgcatggc tgtggagctg 64560gctgcgtcgg gagggagagt gcccccacgc ctggcctttg
accttggaag cggctgccgt 64620caaagcagca gcttgtcctg ggtcttccca gacttcgcag
tgacggattt cagagtgttt 64680cctgttacga gaaggctgtg gattcgcttt ggaacatgac
agatttttct tctgtgctac 64740gtggagtgtg agggaaggaa gtgaaactgg tgttaatttt
tcagcagtca catgtcaaag 64800tgagtggcta gcaagcaggg gggtccaggc tgtggtggcc
ctggtcagcc ttccaggagt 64860caggaccctc ctgctgccct cctgtgtaaa cagaccagcc
agggcaatcc gagccagccc 64920cggtggagag aaagggccag tgctgggctt gggagctcac
cccaagttcc tcgggtactc 64980atcctggcag tgcagggaag ggcaccggga acaggcctgt
ggaaaaatcc actgtgcaat 65040gagtccgtga atcccagaca tcctctcagc agccctctgg
tcctgcccgg gatagaggct 65100gtggcttctg gtgtaggtgg gcacttctgg gctcagaggg
gtcatttagc tcacctgagg 65160tcacacggag tgggacccac cccaatttaa gcctgtacac
ttttttcttt tgtctctttt 65220ggtaacagct ttattgagaa atcactccca cgctgtataa
ctcagtggtt ttcatacatt 65280cacagagctg tgcagtcgtc accacaattt tataacattt
catcacccca aaaagaagcc 65340tcaaatcgtt tagctattat tagcaaagac cccccacatt
ccacccagcc ccagacaatc 65400actaatctac tttatgtcat actttctttc ttttgttttc
attttttttt tttttttttt 65460gagacagagt ctcactctga cagccaggct ggagtgcagt
agcatgatct tggctcacta 65520caacctccgc ctcccaggtt caagcgattc tcctgcctca
gcctcccaag tagctgggat 65580tatagccacg caccaccatg cccaactaat ttatgtattt
ttagtagaga tggggtttca 65640ccacgttggc caaactggtc tcaaactcct gacctcaaat
gatccccccg ccttgacctc 65700ccaaagtgct gggattacag gcatgagcca ccgtgtctgg
ccttatgtca tactttcatt 65760tgaaaaattt agaaaaatta aattggccca ggagtttgag
accagcctgg gcaacatagc 65820gagacctcat ttctattaaa aatcaaaatt agccaggtgt
ggtgacaccc acctgtggtc 65880ccagctactc gggaggctga ggcaggagga tctcttaaac
ccaggaggtc cacgaggctg 65940cagtgagctg tgatcatacc actgcacttc agcctgggtg
acagagcaag accttgtctc 66000aaaaagaaat ttttttaaat tgtggtaaaa tatatataac
ataaaatatg ccattaaaca 66060attgtgatgt gaagtgaata aaatttgaat acacagcata
atgaataaca aagcaaacac 66120tctaaggtca agaattagga ggctatcagc tccccagaag
acctagtccc tgattgacat 66180gcccagtcct gcagaggcgg ccccgccttg ccgtgggcac
agtggccgcg gtgactgggc 66240ctgctttctt ctcccacatg gctggcttcc ctccaacatg
gcgctggctc ttccttctct 66300gacctttatc caagtggaat cgcacactgt accgttgagc
tgtggcttgt tcatcgtgtc 66360ctctgtggct ccaacacccc tctctgcccg tccctcagag
ggaacatctg ggtgggtccc 66420agtttggccc ttctgcactg gctgccagga atgttctggt
ttttgtctcc tggacacatg 66480ggcgtgagtc cttctggggc ggatctgctg atagatggca
gtgctgggtc atggaggagc 66540atatgttcaa cttgaccaga tgaggccaga cgggtttatg
cccccaccac ggggggaccc 66600tgttgctccc cgccccgacc atcacgagct gttctgggag
ggaggcccat gttcttccca 66660gtttacggtc ctgggaggtt tctggacatt gtcagcatgt
ggcagtcccg gacgctgggc 66720ctcctgagag agtggggagg agacccgagg cggaggcgag
tgtgtgaaag gaggctggtg 66780gggtccacat ggctcagcag gtgaacatgg ggtttccagg
ggcagacacc ccatcgtggc 66840ccccatggct ttggttgggg cagcatcagc ttcctcctct
gctgggtttc ctgccttctg 66900ttggctgctg gggcagggga gacaggattg ggaacagtca
ccctctctgg tactgaccct 66960cattagcatt cttggttcct gagcaaggct gcttcatgtc
cagggcaaga ggggaaagat 67020gttaaaacct gaacccccct gctgggcttg gtgccgcctg
tgggaaaacc ggtgggctga 67080gggctggggg gctctgagct gaggaagggg gtcccaggtg
gaggccggga ggtggctgca 67140gagtgaggcc acaaggcctg ggggttggcc cagaaggtag
gacccctgta ggcagcctcc 67200caaccccagg aatttgcctc ttgggaacgt gcaggtccct
cgagggtcct ggaaggcgct 67260ggagtggtgg gacagggatt gagggagcgt cagcagggcc
ctctcttctc cctagagccg 67320ctcaacacgt gttgcaaggc acgtgctctg gacctggtgc
tcacacgtgc tcttctccca 67380agacactccc tgaagcgcgt tccccaggag ctggaagggg
gataacgcag gtatgcctca 67440tggcggggcc agtgctcatg cttacacaca ctcagcctca
gctcgtctac accctgcaac 67500agcccagaga agccacccgc cttcttaccc ctaccattta
gatttgcaaa tggaagctca 67560gagaggtgaa gtaacttgcc caaggccaca cagcctacca
gtgtcagaat cgggattcga 67620acccacctct gcctgactcc aaggcctgtg cccctgaaca
ttctcaagcc accccccaat 67680tagccaagtg ttgctggagc tctgagcagg ggccgggcct
cacaaaacaa tagtagccag 67740tgtgggggct cacaggactg gtcagctccc gccctccccc
gctttggccc cctccacaca 67800cacacgacct gctgctgagc tggacataag caggacctgg
ctctgggggc ttctgaccta 67860gggcaggccc tgaagttggt cattagaccc cacccaccca
cctgagagcc tggcggagat 67920gtgacaggtg acagccccag aagtcccatc gttgctgcct
ttgtggttaa agtccgctga 67980acccaagcgc tctcttcccc tccagtgtgt ggtggggagg
tggggagggg acccaggcag 68040atctgggaag ccgtggccag gccgcagcaa gcaggggaag
aaggggcccg cagtgaactt 68100gcagcaggag ctgtggccac cgtcctagcc aggccggcca
cctgcccaaa atgttctcta 68160accttaacat gaacctgggc cacttgggct ctgctggatg
cccccagggc tgccagagcc 68220tgcccgccac agagcccctc tgagacctgg agtctcacct
gagccctctc gctgcccaca 68280gggctggcct ctgcctttcc gctgctgctc tgcggctcgg
ggacacggcc agggctggtg 68340gcgcttacag cccaggggca ggggagggca catcccccac
agcattcacc attgctgctg 68400tgttctgatg catcagggaa tctgcatttt tgaccctttg
attgctacat cttttccagc 68460ctggaaaagt gaagaaggaa ggaaaggggg ctcagaggcc
cctgactcct gtccttgcag 68520gcacagctgg acgcagttct tattttcaac catcttgcag
cagtggtttt atttgtgcac 68580aggttcctct gtgtcatggg tgggtcccag gatgtgaacc
cggtgtgctc ctctagctcc 68640actgtcctca ggttggggac agccagggac aaggccccct
actccagctc caaacgtagt 68700tccaggctgg ggtgcccttt gcccaggccc agcctggtag
ccccgggtgc tctccctgac 68760ttttcatgag cccgggggct gaaaccagag cagcagggag
acagaaaagc ccagccccac 68820cctccagagg gggagtccag ctgcccggag accccaagag
cactgccaga atctgggtat 68880gagacccagt cccctgccgc gtgctaacca tgtaaccttg
gacgtgatct ctttgtgctt 68940tttttttttt gtttttttct ttttgagatg cggtcttgtt
ctgtttccta ggtgggagtg 69000cagtggcacc atcatggccc actgcagcct ggaactcctg
ggcacaagtg attctcccgc 69060ctcagcctcc caaatagctg ggaaacaggt ttatgccacc
acgcccaact aatttttaaa 69120ttttttataa agatggagtc tccctgtgtt gcccagcctc
ctgggctcaa gtgatctccc 69180cgccttggcc tcccaaagtg ctgggattac ccggcttgag
catgatcttt ttgccttgct 69240tggctctggg tttcctcatg tgtcaagtgg ggctgagccc
cagctgtgtg ggctgaacag 69300gggagggtta agctcggccc ctagagtgag ggcccgccat
ggtggtggct cccgtatggc 69360atatttccct aacggttccc atttggtctg tcctgctaga
ccaggctggt gggacgtaca 69420ctgactctgg ggttctgtcc accccgtggg tgtggtccac
cacatggtca tggtggcttt 69480ctagggtaga cttgaccttc cccactgcct ttatttattt
atttattatt ttattttttt 69540ttgaaacgga gtcttgctct gttgcccagg ctggagtgca
atggtgcgat cttggctcat 69600tgcaacctcc gcctcccggg ttcaagggat tctcctgcct
cagccttcca agtagctggg 69660aatacagatg tctgccacca cacccggcta atttttgtat
ttccactaga gacggggttt 69720cgccatgttg gccaggctgg tcttgaactc ctgacctcag
gtgatctgcc cacctcagcc 69780tcctaaagtg ctgggattag aggcgtgagc cacagtgcct
ggccttcccc actgccttta 69840aaatctctat ggtccccact gcccacggga ggaggcccaa
ccctgtcctc tgttagtccc 69900ctggtccccg tagatattta ccaagcacct actatggcca
ggcgctgttg aggatgtggg 69960ggatcgagtg aatgtggcag gtgtcccctg ccctctaggg
ctgttcttcc ttctagggag 70020ggaggttcaa gcccctctcc tgatgcccag ctggcctccc
agcgtcaggg atctctgctc 70080tgttggtccc ctggagtggc cctggggcca ccacataaca
cttttgtaag aataagagtt 70140attttttatg gcacaatcgc ccctgccctg ggtttgtgaa
tttgggggat tctttttttt 70200tttttttttt tttttgagac agagtcttgc tctgtcgccc
aggctggagt gcagtggtgt 70260gatctcggct cactgtaagc tccgcctccc gggttcatgc
cattctcctg cctcagcctc 70320ctgaatagct gggactacag gcacccacca ctaccatgcc
cagctaattt ttttgtattt 70380ttagtagaga cggggtttca ctgtgttagc caggatggtc
tcaatctcct gacctcgtga 70440tctgcccacc tcggcctccc aaagtgctgg gattacaggt
gttgggggat tcttaaggaa 70500aaccacctga aagtgcgggg ttattaaaaa cagaaaccaa
cccatgcacc caccactcac 70560gagcagccca cacagctccc acctgaccag ggctccccct
tctcatgggt gagtgaaagt 70620tgctgaaaat aggcctcctg gtgaggaacc tcccagccag
ccctgtcagt ggcactgagg 70680ctgctctccg actgtggcat cccccatgtc tgtcctcatt
actgagcaaa aatttgtgca 70740gaggagtgct ctacactttc caagggttta atccaggagc
tgagggccag gggctccatg 70800actgtagact caacatcctg ggccacagcc atccgtgagg
gggaggacga ggggccacgc 70860cctgcactct gtcccagcat cccctctccc tggagccatg
cagagatggc ctcatttcat 70920cccacactca catgcgggtc ctagacacca gctcatgcac
atacacacac acacacacgc 70980acacacacac acacacacat gcgcacatgc acccacgcac
tctctttgcc tgtgggaggg 71040caggatccca gctgccccag cggggaaggg gtgtggggaa
tcagccctgc agatgcggtg 71100tgaggggcag acatggcatg ctggtctttg gcagtgtggt
cccaagtggg ggcggctgga 71160gaggtgcctt gccacagggc ccaggccacc cgcatgcatc
ccatgaggcc gtccaagccc 71220gtctgccctg catcagagag gacaccccaa tgcggggagg
gaggaacctg gtgaagccag 71280gagcggtggg caggggccag caaggatccc caggccccca
gggcagtgtg catccggcct 71340tcttgcatgc ccacctggca tcctgcctgc cgcctcctgc
ctcccgcccc acccctcctg 71400cgtcccagcc tggtggacct ctcagggttc actctcattt
cctcctcccc acgccacctg 71460ctcgggccac aagagggcgc cttgctgcca gggatgggac
aggccgggcc tgctggagcc 71520agggccgatg cacgcacagc tacatacact cacccgggcc
cacgtgtgct cacgaggggc 71580ccccataccc acacgatccc gcatccacat gtgcgcacgc
caggccccac aggttctcag 71640gcccaggcca cacatccaca ggtgcccact ctcatgctgg
cctcatagtg catgcacaca 71700cacacacaca cacacatata catacactca cgcccagacc
cccccacgcc ccaggccaca 71760catgtgggcc tgctcacacc ccactcattc cctcccagat
gccgcaccac ccccatgccc 71820tgtgtgccct ctgcctctcg gggccctggg gccccagctc
caagaaccac gggcttgctt 71880tagccaggct tccacaagcc tgcccattcc tcaggacact
gaccctctcc tctgccaccc 71940ttcacgggtg cagacgatga gcatggagaa caccacagcg
cacagggccg agggccacca 72000tcagagtctc ctgcagccct ggattccctg ggcctgaagc
aggcagagcg gccaaggtag 72060cctcatcagc ctcaccagcc cgtccttctt gtctccctgg
caccttgtca gagagaggcc 72120taagaacacg ctctgggcat tgggggcctc ccagtgttcc
ctgcgagttt ggagaggcca 72180ggtccctggg atgaggacac gcccacccgc caagctcccc
tcctgccagt cccacctgga 72240gcctttatgc atgctgtctg cgcccttcac ctgtcccacc
acctcttctt tggcctgcct 72300ggatcctgcc atatcctcaa ggccttccct ggtgatggaa
ttccagagtc ctcccaattg 72360caagggcctg agggatccgc tgccctccaa ccccacaccc
cgtctcaagg tcatgtgcta 72420atcccgtgca agctggggtg tgggctggct gccatgtcaa
gctcgtttct gttgtaccca 72480aaacacagtc tcacctccag aagaattacc tggcactccc
tggagttccc gcagccttct 72540cattttcaaa taatttttca gatttccaga atagaaagaa
attccataca cccttctccc 72600agagacagac cccattcaag ctttgccaat cattccatga
actcctctag agaggggaag 72660acccagtccg gggccaggca gctctctctg gtctctctgg
actccgtcct ggagcacttc 72720cctgtcctcg cctcagttct cacagattgc cagttttgaa
gattacgaga cagtcatttt 72780gtgtggtgtt ttctcatgat acaacccaag ttacgcgttt
ggggctagca ttgcttagaa 72840gtggtgctgt gcttgttctc ataggcatgt aatgttggtt
tgcccaaata ttgctgatgt 72900taaccttgac ctcttgttta atggcaacca cataatgtgt
ttttcccctt tgtgattgat 72960acgtatttgt ggggagatgc tttgaggtga tgcgatgtcc
cactttcatc tgccagtttt 73020agtgtccctt agtatttctt gtttgaatga atgatttcta
tgatggtcgc caaataactc 73080catcattctt ccttgtattt actcctggtg ttttacttta
aggaaagatg ttctcttctt 73140tccactgatt gatgtatgcc agtgtgggct cgtggcttcc
tatgtaattc agtgggttat 73200cgtctgttaa ccgtcatcac tgggatattc agatggtccc
cgacgcaggc actggcagcc 73260ccttcaggct ggccctgtga cctttcggca catcctcacc
gttctttgag cccttccttg 73320cattttggcc caaaaagatg tttcaggttc atttgttcag
tctttgcttc agccctggaa 73380tcggccattt ctccaaggag ccccagttcc ttccagtgga
gggcactgtt tagaatctca 73440gcccagggcc ccaggtgtgc ttgtgttgtc tctgggcatt
gctgctcagg ccctggcggt 73500ggacagaggt aggaaagatg cgcatggtat gcaggtgtga
cacacacatg cgtccatgct 73560taccctaccc acctggacct gcccaaaagc ccgagctcac
agcagtgcct ccgattccag 73620tccagcagca cagggttcct tctcacctca tgcctttaca
cgttcatgcc ttccttccac 73680agcagggaga agcctggctt cctcattctc tctccatctg
cttatcgctc tatcccatgt 73740agagccaagt ccctgacctt gcctgtggtg gacactctgc
gtttgttaaa ctgaataaca 73800acaatcctcc ctattgggag aaaccttttc agcttcccag
gccatttctg catttgtggt 73860tctcttccaa gcctcttggg taggcataga aagtcttctc
tgttttccag ggaagctgca 73920gactcaccca gcatcaccca gctggtaagg gtgtagcagg
acagtgccag catgtctgtc 73980ccaactgtcc ctatggcccc tatggaagat tagctatgtg
gtggcaaagt gtcccgagga 74040aggggcattc tgggatgttc aggtgctcga ctgggctggg
acttgtactt tggaaatgag 74100aagaagccca ggctcaaatc tttgccccac tgcttgctgg
caggaagacc actgagcctc 74160agttttctca tctgtaaaat ggacttattt tggacttact
tcccaggtat cctcagcacc 74220tttattacag gagctccatt aatacatggc tgcaggaggt
agttcaggga agcaatggga 74280gtggaatgct gggtgcatca cggcctgggg ttgtcagcct
ggcctctgca ggcatctgag 74340atattcctga gggcagattc agccaggcca acacttctgg
atagaaccac agggccataa 74400actgagacca ttgagggaga ctgcttggct tggaaaactc
gcccagggac catgtgctct 74460gatttgcaaa atgcaccccc cagagcggct tcctgccttc
actttgtgca gagttggggg 74520cggcaggggc gtggggggtt ggatggacca tgtcacccaa
aagtttgtgt cctcggatgg 74580ggaggggaga gcgtctgccc aagcacagcc tggaggcttc
ccttacttcc gatgaccttg 74640tggcagtcag tgtcacccgg acagagtgtg gcacgccttg
gggaccagcc cctcccagaa 74700ggataggcgc ttgattggaa acgttgactt ttcatttaaa
taagtaaagg tcaaggccta 74760ctgcatggcg aggaggagat ggctcttccc tgcctctgtc
ctcccttacc cccgtgtgag 74820tggacaaacc ccccgtcctc acccaggtgg cccgcttctg
ccacccgcac tatatccaag 74880acacagccca gctttgcagg cacttgccgt gtgctccgca
gtgtgcaggg tgatgagatg 74940aacaagacca tgtggcctga agggtcccac gctgcagagg
gccccagccc agggccagac 75000cccaagtgtg tgcctggagt gactcagggg tgcgcgagga
ggactcagtg ggatatggaa 75060tgcaggggct gaggcatagg gagtgacaca ggggctcagg
gctttcccat gagtcgaggc 75120tcattgggtt ggcagggacc acatctgccc tggtcgatgc
cctgcccatg caggctctaa 75180cggcagagtt tggggattgg ggcgggggat gcagggggcc
tggggtgggg agcttggggg 75240cagtgggcca ggcaaagcag agccctttgc tgtgttggaa
acccctcaga ggccctaggc 75300gtgataagtg agtgtgacgc aggagagtct ctctgcatgg
gccacccaca attgcagaga 75360agacgccctg catgtcatag gcttggcggc ccatgccggg
aatttggcat ggccgtcttt 75420cctccggtgc tggggagctc agcagctgga ggcagatgct
ggattgtctt tctctaccgc 75480agcagggcag gtagcttgtc ttttagtttg ttttgttagt
ctcctgtatc tgtgactctg 75540cagagagcca acggcaggaa gagggcagca ccggcacgtc
ggaggccgag gacggcttag 75600acgggaggtc cggaggctct cagcctgacc agccctggcc
tagggccggg gaaaggcctg 75660ggctttgggg tcaagggcag gggatgagat ctcagaccca
cgacctaacc attgacctgc 75720ctgggccact tcacccctgc atgcttcctt ccctgtctgt
aaaagcggat cacgtgaggg 75780gccttgcagg aagctctgag gcccagtgag gcccccggga
gacaggcgag accgctgtga 75840gcgcccgtcc ttcccgcccg gtgcctggca ggttcctatg
cgctctcggg gcttcgggct 75900ggggccctgc tgcggcgagg ggggtgctcc ccaggctggc
gctgtcggtg gtgtggcttt 75960ggctgacacg aggtggagag gatcttgttg acctgaccgt
ataaagtagc agtgaggcct 76020tacaagtagt tctaggcctg gaaaggaaag ggaaagaaag
agcaatggaa tctggtattt 76080tctcaggtgt cctggactcc atctgaaaga tttcgccatg
ggaaacattc cttcctgggt 76140cctcggcgcc tattgttaca gcacgcaggc acccgggctg
tggcctaggg gtgaaacctc 76200tctcgattgt gtggccatag gtggttactt aacttctctg
tgcctcagcg ttgccactgt 76260gtgtggcgat agcaacatgt cttcttccta aggttgttgg
gaaaacactg atcgcaacag 76320tgacagatgc aggacccact ccaccagaaa ttgttccatg
agcttgttta cccaccagac 76380gcctgtatga gagagaactg acatcgtcct cactttgcag
agaggttaag taacttgcct 76440aaggtcacac agcctttgaa ccctgggagt cgaactccgt
catctatgct ctgaaccact 76500ctgccacatg tggtggttac tgtcagttcc atttgtttcc
acacagcttt aagttagtcc 76560tgtggtttgc ttttatttta tttttaagac atggtcttgc
tctgtcaccc aggctggagt 76620gcagtggtgc aatcatagct cactgcagcc tcgacttctg
atgagattct cgcatctcag 76680cctcccaagt agccgggact ataggcacat gccacaatgc
ctggctaatt aaaaaaacat 76740tttttttttt tttgtagaga caaagctgtt gctatgttgc
ccaagctggt cttaaactcc 76800tggcctcaag tgattctccc acatcgggat cccaaagtgc
tggaattagc ttggcgtgag 76860ccgccatgcc tggcctagtc tggtgggtgt ttaaaccttg
ctggccttta ggggtgcatg 76920ctgtgaagac ctgtgggcca gaggccaggt atcaggagga
gacactcagg cctgttgaat 76980cagctcagtg ccacagggct gcaagagcag ctgcttcagg
ctcctcaggc cgtgggcccc 77040tgagcaccag ttctgctctg ctccctcagg ggaactgggg
agatcaaggt ggggcccctg 77100gacctcccct aattcagcca gagaagctct catctcttcc
ggggtcccag gcaggattcc 77160cctgggctgg ggtgggaggt tctgctgctt taaaaccact
ggaaaggctt gagaaccagt 77220gatcttctgc cctgttacag acgcggaagc cggggtggag
caggcctgta aggtgggggc 77280tggggccggc actgggtccc cgggcttttc cacccctgcc
gcgtgtgcgt gcagactggc 77340cctgcggtcc atgggcttgg gcctgccttt tggcttgctc
ccgagtattt tgcccagagg 77400ccttgtggct ttcagggatg ctgtctgggg ctggaagtca
gtccttgttt tgtcctggca 77460gtagtcagca ggagggagaa gaaggggtta acctgtctgg
aaccaggaaa gaggaagcgg 77520tagtgttccc gcatgcagtc ccaaatagac tcctgctcct
gctccgaggc caccccagcc 77580aggctggatg gaagctcccc ggctgggcca ttgttccggc
ccctctccct ccaccctgag 77640aaagccccca gcccggtggg tcagggcagg acatggacag
gccaaggctt tgggctccag 77700ggaggccaag ctgctggggc ccagagaccc tgctgtgtat
cccagagcca gagaagaacc 77760tccctgaggg ctctgtgagc cagagggctg ggggatgtca
agagacaggc agggcagggc 77820tgcaaggtta gagtgagggg aggggtacct ggtgccattt
tggaagggac ccctgaaggc 77880aggaagcagg acagaccccc cgccattggc agaagacctc
tggaatcctg agatacttaa 77940attccctgcg gcagctccaa gcctagggaa ccagacagat
cggggaggat tggctagaac 78000agtgcatcct ggtgggtcgt gttggaagct ctctgaggtc
cctctgctgc tgagagatga 78060aggaatggag ccctggctct gctgtgggat gccaggctgc
ccaggctccc cgacctctcc 78120agcaacaaaa atcataaaca cagcatatgt aatatataaa
taataaaagt aacagccatc 78180ttttcagaat aatctagcac ttagaatgtg ctagggaaca
tacaagcact tttctcctta 78240tttacagtaa gccttaaggt aaatttcatc atctccagat
tttaagtaaa gaatttgaga 78300atcagagaca ttcacagcaa cctggtgcac ccggagcctg
gctgacccca gcaccgctgc 78360tggcggcccc acctcactcg cccagtctcc ctgcttcccc
agccaccgtc cagctacctc 78420gagagatcct gggagcatgg tgattgtggg gtgcatgggg
gcccgtgagc tgttgggggt 78480gtcagggttc tgcccgtgag atcctccttg ttgccagaca
tgaggacctg gactggcagc 78540tgtgggtggc ctcatgaatc agggatcaga gatagcgggc
agcaggcggg cccagcccgg 78600agcaagctgt gccattggcg atgcggggag gctggcccca
tcgaaggctg gtgggactgg 78660tggagactcc tgtccactgc tcagcactag gcctgcagca
gacaccatga gccccaaact 78720tcccaaagcc cttccccagt cccacaagat ggtgtctgcg
gaccgtgctc gtgagagatg 78780gcagccaggc agtccccaca gggcacccat tttcagctgc
ccccgcttct cagacaagga 78840aactgaggcc agaaagccag gtggcccagg agctggcttc
cccatttcct gctcctgtgg 78900gccccactgc agtgcccatg ggccgggctg atattacccg
agacttcgga gctctcacgg 78960gtgcgagtaa tttaggctgc atggacacaa gctgctggct
tgagtcgccc cgttatgaat 79020gtgtgtgggt ctgtgcccct ttcatgtgct gccacagggc
ccacgagtgt gctgaaaggg 79080aaggacacgg ccaaggggcc atggtggaca ggagaccttc
ttgggggttc ggtggtgtcc 79140ttgaccccac tctgactgag cactgcccca aggcactgcc
attccaggcc cccttccctg 79200agcctcccac cccaggccca cccacctgct gggtcctccc
acctgcgggg cccgccatgc 79260ggggtcacca tgcgagtctc accatgcagg gtcaccacac
gagtctcacc atgcagggtc 79320accacgcgag tctcaccatg cagggtcacc atgcggggtc
accatgtggg gtcaccatgc 79380ggggctcacc atgtggggct tcaggagctt gctgagcacc
ctccccaccc acggtcactc 79440tccctggggt ctgtaagcct ccctgggcct gagcagctcc
cagccttgct gctgcctttc 79500cacttcctgg cagtgaggtc tcctgggtgc cttctctcag
ccctttggga tgttttttgt 79560gaggaaggga ggctttgatg ctgtggagca tctgtagtgc
ccactccagt ggcttcacag 79620gagcagcagg ctgtttgttc tgagctgttc caccttgtgc
ctgccagagg ggagatagtg 79680gacaggcctc cctcccccca agtggtgggg tggaccccct
gcccgctgtg gccccatacc 79740tgggggccac acaccactgc cctgggccgt gcagctgcta
tgaagagtgt gctgctgaga 79800ccctggaaga gacggaggat gaaattgtgt tgccagatag
tccatttgtt gttctgagac 79860tcgcatgcct gggagaatcc tgggaattaa ctagctcctt
ctctcccatc ccattttaca 79920gaaaagtgag acccaaggtg gtttctgact tgcccagagg
tcataactgc ttggacagtc 79980atggtcctca gagcccacgt ttgctgacca gtgcaggctc
tcacagccac tcagctcctg 80040cagccgtggc gtggcagagg agggaagcac ttcctgggat
ttatgctgcc tccctgacat 80100ttcaaggccc ttcatttctc taaatattgg aggagttgaa
ttatttttag ttgagcctca 80160agggatcaga gaataagctt gcagcaacgt tggcagatgg
gcttcttcta gcagagagtg 80220gttattcggg gcctcttatt gagagaatcg ggtgatttga
ggaaatctgg ggtgtcctga 80280ggcataccag aggaccccca agtttttcct gtggctcgtc
tgccatcagg aaaccaaaat 80340gactcccctc gtcctgagct ctccagggtg tggacctgga
atgcttaagg ggaggcaatg 80400gcatatcttt aagatgagca cagctccgga gccactcgag
cacccaaggc cacgtcctgc 80460tcagggcact tcgggcctca gtttccttat ctttaaaatg
gacagagttg gccgggtgag 80520gtggccctgc ctgtaatccc agcactttgg gaggccaagg
ctggcagatt gcttgagccc 80580aggagtttga agccagcctg ggcaacatgg cgaaacccca
tctctactac aagtacaaaa 80640atttggccgg gcatggtggc tcatgcttgt aatcccagca
ctttgggagg ccaaggagag 80700cggatcactt gaggccagaa gctcgagacc gcctctacta
aaaatacaaa aattagccag 80760gcgtggtggc tcacgcctgt aatctcagct actcgggtgg
ctgaggcagg agaatcactt 80820gaacctggga agtagaggtt gcagtgagct gagatcgtgc
cactgcactc tagcctgggc 80880gacagagcaa aaccctgtct caaaaaaaaa aaaaaaaatt
agtcaggcat ggtggcacac 80940acctgtagtc ccagctactc aggaggctga ggtgggagga
ttgtttgagc ccagaaggtt 81000gaggctgcat tgagctgaga ttgcaccact gcactccagc
ctgggcgaca gagcgagacc 81060ctgtctcaaa aaaataaaat agacataata agagtaccta
ccacctacgg ctggggagac 81120cagaatgaga tatcctgcca aaagcactca ccgcacttcc
tggcacacag caagggttca 81180gcagttaccc gcttctgctc attttcttgg tgttctcatc
agtatgatta tttagggaaa 81240cttgggtccc cagataaccg tggggagggg agggtttacc
tgcaggtgcc ctggcccagc 81300cgttcatgca gcgccgtgct gtttctgtgt ctgtgcgctt
cggtggagat gctgtgggtg 81360ggtgggcagg tgtgccttgt gctgtggcct cccgagacaa
ggagggctcc cacctaagca 81420gggtcctgca gcccaggcac atagccctgc cctggccctc
caggtccaca cggctgtggg 81480ttcccaccag ggggcggcag ggacttgcgg ccggggaccc
agcctggttt ctcccgcttt 81540gcttcgtggc caggctaggg ccaggggggc tgcgcaggat
ggggcctttt caccactgcc 81600tggagccgct cgcccaccaa ccccactgac ggctctcagt
ttccctttcc ccgagcgccc 81660ttcaccctga ggcacagcac agcctcgtgc tcctggcccc
actggcctgc tcccttaagc 81720tgagttttgc ctttgcagtt tgagtgacat ttttgctggg
gatccggggc tgctcggggt 81780ggcctcgcac acccctgcac tttggctgcc tggaacgtgg
cttggggttc acggtgctcc 81840atggtttgac tcacaggcag ggttccagca gcagcagaga
gaaagagcat cttctccagc 81900cggccctaca gaggccccgg gaccccaaag gctgcactga
ggtggccacc aaaaccaggg 81960ggcagagtcc tcctaagttc tggccctaga tctaggctcc
agatctcccg ggagttggat 82020ccagaggaac ctgggctcac ttctttagca ggttctggac
acacgatgcc tttaggggct 82080caagaaaatg tttaatttat tttaaaatta gaaggaaaat
atctctggaa catttagggc 82140caaagaaaat gttttaatgt tttttctcac tttagaaata
ataattcaaa aatccattaa 82200attatttaac agctttttta atggaggaag gggcccccca
aatcacaaag tggccctggg 82260ccctgggatg gccggaggag agcctgccat acccacgcag
tgtgcctgag tcctggagtg 82320cgatgtattc tgagccaccc agaatggtga gtcaccaaga
ggcagcttgg aggtgggccg 82380ctgcccctgc acactgtgct tgggggacac agccgacacg
accagcctgc agtaccagga 82440tagacaattt gggctggagc caaacttgat cagagggagg
ctgactttat cgagggctgc 82500tttgtggcgg gcgttcattg tggaaagagc ccaggaagga
gctctctgcc ttgcaagaac 82560attgcctgtg gctcaaaatg gccacatggg atgcagttcc
tgaggtggca gaagagccag 82620cctctgctag ccccagacct aacggagttg cggctcatgg
ccaccctgag gaattccacc 82680tgggcgtttg gtgtccctag catatctcga atggcaggct
gaggctcgaa agtctcatgc 82740ctgtctgcag gacccccgcg tgagcctggg accacccgtg
gggacagtgt gtgtctaggc 82800aaatgccttg ggcactcgag ggggcactga ggccctggag
ggcttctgtg ggtgggagtt 82860cccatgacgt catccctcag tacaagctac gagggcttct
tagaacacgc tggacactgg 82920ggagggctct aatggtattt gcacaaattt gcaatgttgg
agaccagagt ggagtgggct 82980gggcgggcca ggccccggcc cagatggggt gacctttcct
tcacccctac ttccccactg 83040caagggtgct gcctccccag ctcctacctg gggagatgct
gacatctggt cattcgcaga 83100gtgccctgtg cccacatgct gtccctgtcc agacatgaaa
cggacagcag gagggggctt 83160tgaggggtct ctcctgagac acctctccag ctgaggctcc
tggagttgga agcccctgca 83220gactcggtgg ctgcctcacc aaaggctcca tctgcatttg
cccccatgac cttcccactg 83280tgcctgggac cgcagttaga gctgaccagc acctggggca
gtctggggtc tggggctggc 83340gttcgggcca gcatggaaca cagatgcctt cagtggctct
ggtgggcacg gggcttgcac 83400agatcggact gtgaccccag gctgtgggac tcagccacca
agcgccttca cttatttact 83460gaccttgaag ttatccagca cttatgtgtt tacagatacc
tgcacaacag ccagagaggt 83520acgaatacat ctccacctac aggtgagggc acagagaggt
taggacccac ctgaggggat 83580gcagccagca ggtgggactt gggccctggc agtctgtttc
cagcctgtgt gctcaggtcc 83640agggtctccc tgggactgga gagggcgctg ggggaatgga
gccgcctcct gcatcagagc 83700ttctcctgct ccagttcagc ctgcagctct cctgcatccc
caggtgggga ccagagagag 83760gggcaggctg gtaccagcct gggggagggg gcagttacca
gaagaccttg gtccccatcc 83820cagccctgtg caccctgggg cccagccagg ctggaggccc
atttttccca gaagcccacc 83880tctctgcctc cttccagggg ctcaccccgc cctgctgcct
cttcttgatc gcctccttgc 83940ccctactctc ccctctgccc tctcgcctcc cctcacgttc
atggggtttt gtggacgcgc 84000ctgcctgcct ttcatcagtg gtgtccagcc gtctccctga
ccgaggctga gggtcctgca 84060catctaggcc ccgtcacggt cgttctcccc ggtccctgcc
tgcccatggc tcttgccaca 84120gtggggatgc agcagctctg ggttgagtgt ctgtcgttgc
ccctgcactg cgtgtgccct 84180gtgagtgggg gctgctgtgc cacacctcac tgccacccat
gcccatctca gggtgtgggc 84240ccagcgggtc ctcaggctgt acctgtcagg gggctggata
cctggacagg gccttgtccg 84300agggttgact gggagagaaa atgcccccgt tccagctcgg
gcagtacaag ccttcatcct 84360cccagactcc acctccctgg ctgctccttc ctgtatggcc
cccacagggg ttggcagggc 84420gcactcccca gagggacagg agtttgtggc caacctgagg
gtagacactg cctgtcctcc 84480agacccctag acacccccgg tcggcccagg tgcggggtct
gcccgcatcc cacagccatc 84540tttccagcga ggacagtgca ggcgcaggtg gctgtaccgt
ggctgcggtg atgtgctgac 84600tgggaacggc agaggtgccc tgggaaggcg gaggtctggg
agccccacaa gggctgagcc 84660ttggggcctc aaaacaggga gtcttctctg ggctcagggc
ctttctggag atgggacttc 84720taggttgcca gggccactag ccgtgaggag ttcagctgcc
catggtggcg gcggctgggc 84780tgtatcttgc acttctcagc tgtgtggagg ggtccgtggg
gcctgtctgg gcccagcaca 84840gcagcgcaca gaaaagccac tttatatatt tgggattcta
ttaatccttc tagtttgtac 84900attttttatg gcaccttcaa tttttttctt ttaaaaatta
atatattttt ttttctgata 84960atataagaaa tgcatgcctt ctttggaaaa tacagaaaat
cacaaagaaa ataatccaca 85020atcccacctt ccaaaaggaa ccacagttag cattttgttc
agcctgtttt ttaaaaaaca 85080aacaaacaaa aaatgcatac taaaagcaac gatttataaa
ctgtggttgt agagttaagc 85140tcatacccag agttgccggc atctcaccag ttcctctcct
gggacagtga ttgtgccttc 85200ccagagctgc cgcaccctgg cctcgcagcc aagcctggtg
acagcagtgg ggtgaccctg 85260cgaggcccct cggatggctg atgctgagag cccagcagag
ctggaggggg taggggcagc 85320ctccccagga ctgatctgac agacgggtat gtgcctcacc
cggcatcacc caccggggag 85380gcctccgtcc gcccccatgc agtcatgctg ggtcccccca
tctcgtgtct tctggggaat 85440ggagcggaag acctcccctc tctggaacca ggggtgggag
ggactgtgct ccccagaatg 85500aatattcctt ggctgtgggt gagcctgcca gcccagagtc
cccaggggtc aggagcagac 85560tggcaggggc catggcaacg actgggacac ctggggagcc
aggtgtccag gcaggctcca 85620gtgtccactc ccaacaagta tggggcattg agatcctggg
tggatcacag gggagcaggc 85680gtgctcaagc tgacacagcc caagctgtct ttattcacca
ccagacccga atccacctct 85740gggccaacag tctcaccctg caggaggcct ggcacctgtg
tttgtatttt atcaccatga 85800ggttgtgtgg gctgctgaga gcgtgtgcag agctctgggc
acaagcctcg gctctgtggc 85860cggctgagtg gccgggcctc ggtgttctcg tctctgaagt
ggggagaagg ctgcccccgt 85920cacgggacct cacagtgtta gatgagattg cagatgcgat
gaggtggggg cccagcaggc 85980ctgctcaggg cgtggctgtc cctcctggtg cagacaccat
cacgtggaag gcgatcacat 86040cgagtgttct ggccactgtg gcgcgagtgc atatgtgcgc
agggctgtgc atgggcggcc 86100tgggttgggc ttcctttctc ggcttgtctg ggaggacggt
ggctcccagt gggtgtctcc 86160ctgtctctgc cccactctga ggatgtgggt gctttgagct
gagtgaagga ggaacttctc 86220taggagagag ctctgcctgc cacctccctg gtgcccccgc
ccccagagca gagctgtcct 86280tacagggcca cagcctctgc tggccccgaa gcccctttgc
agtgaagggg ggctcctgtg 86340tgcggctccc tccagatgga caaaaaccag gtttggatgc
cggggcctcg gctgagtccc 86400tcgcttctct aagcattcct ttcctcgtct ggaaagtggg
ggtgtagtat tgtgcgaacc 86460tcccaggcca tcccaggatg aactgagatc atgcccataa
catagacttc cgcctggtcc 86520ctgctcagag ctccccgaat ggtcattgcc acatctaaac
aaatttttct ttcagctggg 86580ggtgctggcg gcccctgcag cggtgcccac ggcagcccat
gcacccaggc aggaagctgc 86640tttgtgcagg agggtctttg tgcaggggca gtgtgggcga
gcctggcagc ctcctgagcg 86700gcaggactat gcaggcgcct tattgagctg tcagcggctg
gcagctgctc ccaggtagca 86760ggaagaaggg ccgagctcca cgtgggcatg aaaggcccag
cgggtgtgtc agggggcggg 86820gtggggtgga agtgttagtg cgaatgtgag tgtgtgtgca
cgcttgcagg ataaagatga 86880gtttttaaaa tgtgtctcct gtggtttttg atttttttaa
tctttcccaa aatcaataca 86940aacaaaccac aggcataatt tgggtctgta aaggccgacg
ctgtcattct gctgcaagag 87000ggagggaagt tcagttgggg ggaaaatgca gcgcttttgc
tgtcattaag tccaatggtg 87060gctggcgaca gtcacggcgc tggtggagct cagagcagag
gtgggagggg tggtattgac 87120caggcgtgtg tgtttccggt gtgtgatgtg agaacccagg
gctcgctggg atgctgtcct 87180ccctccctcc ctccctctgc caagctggtg ggaaggccct
tgccaaagcg cacaagaccc 87240cttcacacag caggggcaca agctcttcga ggcagagtcg
cctgcaggtg gggtggaagg 87300ggtgcgcccc agacccaaga atcgcccgcc tctccaagac
catccctggc tgctggtgag 87360cagggtagga gggttgttgg cacctgctgt ggccggggcc
gtgccaggtc tgtacacatt 87420gtgctctttg ctgggtggca gccaccacgt attctcaagg
gcctgtgcta ggcctctggc 87480gggccctggg ttgttccctt ccttatgagc tcccattatt
aactaaagaa gacactgaat 87540gatctcatgt ggttgtatgt ttttttgttt gtttgtttgt
ttttttgaga cggagtctca 87600ctctgtcgcc caggctggag tgcagtggcg cgatctcggc
tcactgcaag ctctgcctcc 87660tgggttcaca ccattctcct gcctcaccct cccgagtagc
tgggactaca ggtgcctgcc 87720accacgcccg gctaattttt tgtatttttg tatttttagt
agagatgggg tttcaccatg 87780ttagccagga tggtcttgat ctcctgacct catgatccgc
ccgcctcagc ctcccaaagt 87840gctggaatta caggcgtgag ccaccgcgcc cgcctcatgc
ggttgtatct tggtgggtaa 87900ggcaggtctt tctatgaccc acctgccagc ctcatcatcc
gtgctcccca cctcaccatg 87960catgcttcgg ttgctgtttg tccacaggcg tacttacttt
gttttatatc agtttttaga 88020tacaaaaata gtacaggtag aattcggtat tctacatttt
cagggaatat cacatcacac 88080acacgctgct ctttcactat ggttttcatg ttttctgcac
ctctcccgct cttaacccgc 88140ctcccaccca gatttagctc tgttaaccct tcaggcttct
ttttctgcac acacacacgc 88200acacgcgcgc acacacacag gaggttgtcc tgtttggttt
ttagaaatgg aattgtcctg 88260ttcctctttg caccttgctc atttcacttc atggtgcgtc
ccagatacca ctttctaagt 88320caccggcaga agcaggacct tgtgcttcct gatggctgtg
agctgtcgtg gcggcacacc 88380tgtcctcggt gtggagttcc agttcattcc tgcagtcccc
tgttaacaga tgtttagact 88440gtttcccttt tttccccttt tcattacaaa caaaaaaaat
tgctttctta aacttatttt 88500catgttttta aactttctaa tttctcataa atatttcaac
tttagattct gtgagtgagt 88560ctattagtgt caggctttgc agaggaaagg aaagttctag
aagagtctgc agctcgttct 88620cccaccccct gcccccagtg tccacccaca gaggtgtgag
catgcaaata gggtgagctt 88680ctgtctcccg tctcttgggt gtcttttcca cctggggcct
ccctggggcc atgctggggg 88740agtgaaaggc ccctccctgg aagcccgagg cccctggcct
ggatgctgta gcttgcctga 88800aagcggagga attctggctg cctggtggag gtcggggctg
gccctgcacc gccaggagaa 88860gcagtctgag ggagggctgc tcaggctgga cccagccctg
accagcgggg ccatcgggcg 88920tcctgggccc tggcaggatg tcgttccagt gatacagcac
ttccttgctt ctaaactgct 88980gagtggtcag tgtcggttat cagctgtcca catatattgc
ataatctcag atgctggctt 89040gtaacagtga gtggaggtgg gtggccacac tgtcccttta
gaataatcca catggccctg 89100tccaggcagg gcagcctctg ggctcccgcc caccgcctga
tgcccctgtg cctggtgagc 89160ctgggccggg gtgtgtttgc tgtggatggg tgtgttccga
gttctccggt cagcaagctc 89220cctttgtctc taggggagtt cctgggaggt aaaggggagc
cttttcaagc cacaagatca 89280cccagaaaac ctacgtaagc actaaagttc agaaaaatcc
agccagcctt ggagctgggg 89340agactgcttc ccttggggcc cggccactcc agagccattt
ccgtgatggt aaaatgagtg 89400ggggccccat gctgcctctg cctcctgtga acgccatcct
ctggttcttc accttcaggc 89460tcccgcatga gctgggtggt tggtgttaac tcctctcttc
tctctttcta cccggaaccc 89520cctgcccaga gctccaggga gctccaccat agccagtaga
gggagggctc tgtttacctc 89580cttccaagtt cccaattttg ctctactgat ttttgtctgc
ggtgcataat caatcagcct 89640ggaccccaga gaggccttct gtcgaggtgg gtgggtctca
gaacggcagg ctgctccccc 89700tcctacctcg gggcccccac tgtgcacctg cagctgaggc
atgttttctg gggttcctgg 89760ctttggcctg cacggctggg tgggaggagg gcatgggagg
tgatgctgct cactgagcag 89820ggcctgcttt gaccacctcg ggatcaagag tcctgttaga
tgcccaggta gagacccttg 89880ctgggcgggc aggtggatct tcgagaatgg agctccgagg
gacaggttgg ctgatctcag 89940tgaaccctga ggcttcccca aggccataca gcttgtctgg
cagagcctca gcaggagccc 90000gggtccccgg acaccgaggc tagctctttc tgctgtggcc
ttctcacaga ccaaggtggg 90060atcgcagcct tggtagcctt gaagtctccc taacctggca
agacagtgga gaagggggac 90120agccagcggg acattccact gtagaataaa tacatgcatg
gcaaaacact tgacatctat 90180taacttctaa gtggcaagag tgattcttgt ggttagacct
gaatgggaac ttgtaagtct 90240aaagtgaaaa tctaattaga tttttcccac aaaacaattt
tccgcaacgc attgtggaca 90300taaaggtttg atgaatgtat agccaggcac agtggctcac
atctgtattc ccaacacttt 90360gggaggccaa ggcaggagga ttgcttgagc acaggatttg
agaccagcct gggcaacata 90420gtgagactct gtctctataa aaaaataaat tagcagggca
tgacggtgta cacctgtagc 90480cccagctact cgggaggctg aggcaggagg atcacttgag
ctcaggagtt cgaggttgca 90540ccatgattac acctgtgagt agccatgcac tccagcctgg
acaacacagc aagaccctgt 90600ctcaaaaaaa aaaaaaagat ttggccaatg aattttatca
gctaaactgg gatatttaag 90660agaaaggaag gaactcctaa tttctgagcc ctggttctga
aaaagctttc ttttttcttt 90720cttcacttta gtcttgttag caaaacctta acagttcata
gaagaaatca ttttttactt 90780ttttcccctt aatctgagca caacatctgg ggttgccatg
gtggtccatt ggtggtgatt 90840tgcaggacgg gaccttacat ctggggaagg ggctgccagg
ctgggcctgc ctctggccct 90900gtgggcatgg ccggggtggg catggccggg aaaggcatgg
cctccagaat cagaggcctg 90960ggttgaatcc tggctgagtg cctagtgggg cactttggga
caaatgacca cctttctgag 91020ccttgctttt ctcatctgta aatgggggtg gcaatgccta
cttagttgaa tgttaaataa 91080aatgaagtcg tgtctgtgaa gcacctggca tggtaatcgg
tggccgtgag gatgacgctg 91140gcctcgcagg gactgggtgg cacaggtgtc ctgaccttgg
catgcagggt ggatgggcca 91200ggcttccagc cagtgctgga tctcagaggc ccaggggtta
gagaactaag ggcctccgag 91260ccaggctcct cgttgcaaat ggacttctct cctttgcgtg
ctgtgaactg ggcgtgaaca 91320cctcctgtga tggggagctc actccttgcc gagggagcct
ggtaagctgc cggcacagtc 91380atcctcatac cctctctcct gacctgtgtc ttatggcctc
tgcccccagg cagctgcggg 91440gcagggggct ggggtgaccc ccgtcttggt tgttgagccc
ttccactagc tggagggtga 91500ctgtcccccc ttctcccccg acacaaggca cagcctttct
tgtggctgtc aggagagagg 91560gcctgtcctc ctttgcgccc acgctgcccc tgcgccttgt
ggcactgcat tcaaagtgag 91620tgcgcctgct tttgtttctt cctaggactt ctgaaaacgt
gtgggtgggg gagggtgtgg 91680agggttcgtg caccgggtgg tgtggctgca gtggcgtggg
aggcggcccc agaggggcag 91740aggcgcagtg tgtttatgtg cacagggccg ttcccttcct
gtccctcatg tgaacaccag 91800tgtacaaatg tgtgcacaca ggcgtgtccc ccgtgccttt
tccctacctt ggctgatcgt 91860ggggaagata gactcacttc ccctctccag gcggcgattc
cggaaaggag gctctggaat 91920gccagctggg ggcagggaag ccagggcagc caggagggac
ctagcaggcc cacagaggtc 91980tgggacagct gagccctcct gcccacggcc ctgccatccc
tgtgtgtccc caaggtaacg 92040tccaagctct gcagcaggga ccagccctgg gtcccacaga
ggaggctggt ccaggtggtg 92100gactagagct ccgcttgtgc cagcctaggt gcagaggccc
ggcccagccc ttcctaacct 92160tgccgtcctg ggccagctgt tcggtctctg cgtgcctgtt
tctacacaac ccatctgtaa 92220aatggggcca tgggagccac cacctccccc aggcctgtcg
ccatcactgg cacctccacc 92280agcattagca ccagtgtctc catgaagttg gggttcctgg
ggagatggct ccccccagcc 92340tcgcctcctc cctgcctcag agactgcgca gaaggagggc
agctgggctg atgagcaact 92400cgcctggggc cccagagagg cccgatccca tcccgctggg
gcacgcccca ggtccaagcc 92460tgtctgggca cccccagggc tctggactca cctccacccc
tccctctgca tccttggtgg 92520tgacaccatc agcaagtttg gcacccccgg ccctcacact
gaccccagca ccccttcccc 92580aggctctggt ggactccttg ctcacgctgc ctgctctgca
gggtgggggc gggggggcag 92640agcctgccag cccctcccac acttccctca ccagatcaca
ggaccctcgg ggtggaggct 92700ggtgcgtggg gaggtaaggg tcaggtggtg ggaggccatc
ctcagcaggg gtcccagagt 92760gggagaccca gagctttagg accccaccat gcggaggggc
cagggtccct gggatgggga 92820gaggcctcag ccagggtgag cagaagagct gcagcccctg
agacccgcgg gcttggccgg 92880ccccccggga agaagaccca cagctgggag ggcagttcat
gcagggcacc ctcccctcca 92940cctgccggcc ttggggagca ccaggcaagg ctccctgggg
agcagggctg ccacctctga 93000cctcgctcca gcctgacctc accacggagg cgcacgcagg
atgtccgcag gcccccgggt 93060ggcagcttcc tacagtgctt ttccaccagg gaccgtgctg
aggcccaggg ccctgagacc 93120ctcaggccca tcacagcctc cccactgagc acaggcacgt
ccatcccctc cccttctcca 93180cctgtcacct gtctgtcttc caggcttcgt gggacttccc
agtgccccat tcctgcttgt 93240cagaagaggg gctgggagaa tggcagggcc tggcactcca
ggctgcccct gccctcctgc 93300gctcttgccc tccagccttt ccccaccctg gaagtgagtt
catgatgtta atggcagcag 93360cagccatggt gcaccaggca ctctgctaag cactttatgt
acattcttaa attaccggat 93420ccccattacg gaaacggagt cttaggttga aaaactactt
gaccaaggtc atcccagctg 93480taagtggcag agcaggccct gaatccaggc aacccaggct
gatcaccacc ctgccaccac 93540ccacccctct ctcacataca ggctctaggc ctcctgtcta
tactggaatc ccccacagca 93600gatccacatg gagctgtcta aagaagcacc tgcccctccc
ccagacaagg caagaaagcc 93660atctctacct ggagggagag cagtgagggg catcattaat
gggctgttgg gtgtcctgat 93720ttctttgaga gaagacgagc tgagcaggca cagctcttaa
tcctccagcc actgcccccc 93780aggctggggc ctgccagccc ccgcccctgt agtcatttct
aaatcgggtc tagatggact 93840ggacgcgtgg ctgccagaga ccatttgaac cccctacctc
caaattcccc aaggtccccg 93900cctgcaaagg accagcccca gcatgaggcc ctgaaagtgg
caacatttca atagcctatc 93960ctggacaatc aggacgatcc ctttactaat ttatatgcta
tacgtttggg gtttaattgg 94020agcgtaattg taatccatat gctgttaagg taattacact
tataaaccta atataatcca 94080tagatcaatg ggtttggagc taattaaggc aaccgtaatt
aacatttacc aaaaaccttg 94140caattactgg aatatccttt aagctggcct tccgagctga
taatgagcgc cgctcagcgc 94200tccgcggcct cctagcgtgt ctttattctg cacaaggggg
ccttctgggg acctcgatgt 94260tgggagggca gtcagtgcct gcttttagca cccagggact
ctgatcaggg acccagctga 94320tgagtggcag caccatggga agccaggtcc ctgtgacttg
gccatcccca gaacacggct 94380gattttctgg ctcaatagac tcagcctgac aaaggggaac
ttggagaaag agaacaagaa 94440atccatatgc tgagctggga cagaacccaa gtggggttct
catgtgggga gcatatgggg 94500ctttcctcct gcccaccccc tgcacctgca gccaggatgc
acctgagctc ggactgccct 94560gaaaccccag gaaaatccca acagcggcac catagggcca
ggattacagc cattcttttt 94620tttttttttt tttttttttt ttgagacgga gtctcgctct
gtcgcccagg ctggagtgcc 94680gtggcacaat ctcagctcac tgcaagctcc gcctcccggg
ttcacgccat tctcctgcct 94740cagcctcccg agtagctggg actacaggtg cccgccacca
cgcctggcta attttttgta 94800ttttttagta gagacagggt ttcaccgtgt tagccaggat
ggtctcgatc tcctgacctt 94860gtgatccgcc cgcctcggcc tcccaaagtg ccgggattac
aggcgtgagc caccacgccc 94920ggcctacagc cattcttgaa tgttgtatga ctcccggtga
tattgtcacc ttgggagctg 94980cacagagacc tcccccgggc caggcccctc ctctgtctgc
agaaggaggt aaggggccca 95040ccacatctgt cttaccacag gctaagctgt tggcatctgt
tcttttcctg ctccaggagg 95100gacagagccc ccctgttgtc cttgcggcac catgctctgg
gaactgggcg ctcctgtgga 95160cagctccatg tgagtcttct tcgagagact ccactgcaga
tagaagaagg cctcgagcct 95220gtagttgcgg ggcggtggtc agcctgggct gcctggattc
agctcctgct ctgacattta 95280cagctgggac gacctgggcc aggtagaggt ttttcaacct
aagactcagt ttattatgta 95340tttattcatt tattcatttt atttatttat tgacagggtc
ttgctctgtc acctaggctg 95400gagtgcagtg gtgcaatcat ggctcactgc agccttgacc
ttctggactc aagtgatcct 95460cccacctccg cctcctgggt aggatgcatg cctgaggtgg
gactactggc atgcaccacc 95520acaccggcta gtttttaaaa ttcttgcaga gacagggtct
ccctgtgtgg cccaggctgg 95580tcttgaactg ggctcaagtg atcctcctgc cttggccttc
caaaatactg ggattacagg 95640catgagccac tacacccggc ctggactcag tttctataat
gggatatggc agtttaataa 95700tgtgcatgca gtgcttagcc tagtgcccgg tgcagagcca
ctgcttcatt aatgctgctg 95760ctgctgtcct tgtcatctct atcattgaca gccctagacg
agctgaaggg gtgggcccag 95820gtgggagggg ccaggtgggc ggggccaggt gggctgtctg
cacctcgctt actcctgggt 95880gggctgttag cttggcgggt cccagggctc acttttgccc
catcatcatt tctgtacact 95940ttttccatct ctgggcctta gggcctttcc aaagactttg
caactctggc agatatttct 96000agacagaatt aggttcttta acgaaggaat gaggcaaatg
tggacaggag attgttcata 96060attgcggctg tcatttattg agcacctgct gtgtgctctt
agcctctggg ctaaacactg 96120cacagacata attcactgat tgccaacagc agctttatgc
ggctggcatt ttagtctcac 96180tttagggtga ggaaactgag gctcagagag gctgaatcac
tggtttgagt acgtccagcg 96240agagggcagc tgggccagga tttgaagcca gccattctca
acaccactgt gctctctgcc 96300tcccgagtgg aggccctgac ccctttgctc tcttcctggg
tgggccagtg gggatgggtg 96360aggggtgagt gtgtcaatgt ttccctcagc ccagtagtgg
agaacaggca gggtgggcga 96420actcctggct ggaccacagc cacaggcctg caggaggcac
ccttgccagg gcgagtgagc 96480actgtagtct gggagaccct agagtgagat gggggctcta
caaggcacag ctcctccctg 96540gaggaaggga ggaagggagg gtccctgcct tacagggcct
ggccctcccc acttgtccaa 96600agcccggctg ggtcctggtg tgtcagcgct ttgcagtaca
ccctgtgcta cccacccaac 96660ctgtgtcccc gtctgatctc cctggaccct cccacagcct
gggaagggtc ccccagctgt 96720gaggtcagca tgggggacct tgctctgcag agggcacctg
tagggctggc cttattggag 96780tccctccttt ctaggcccac agtcctgcgc tgcctggtgt
gcggggcctg aatgtcattg 96840tttcatatat tttatctgat tttctagttg ctttaaggta
ctttgtctat cttgactgtt 96900aggctcttta tttttcaaaa aaaaattcaa tataatagtt
tgcatttttt cttattgata 96960ataaaaccac tgacattatt gagctcttgt tctgggctca
gttctgcccc gagctgtttg 97020tacatgttac cttgctgaac ctgcacccaa gtcctcaggc
catattcccc cttacagatg 97080aatgaagtca ctggcccgag gtcacacagc tgggacgtag
tgcaaccagg ctgggcccta 97140gctggttgac cccagtgccc actatttatt tggtgcttac
atgcactggc tcattttgtc 97200tctacagcct tgcaaaggcc acagttatga gatggcagag
gtggtattca aattcaggcc 97260cctaaacttg tgtctgtgtt ccctcttgaa aacttcagta
aactgaagaa gtgaaaggca 97320gagggtaaac caccgaccag cccaccatcc agaggtgacg
gtgacctctg atgccgctgg 97380gtgtccacct tccaaactgt gtgcacaaat gtgtcctctg
tgttctagtc cacaggtgag 97440gtcaggagca caggctgttt tgggacatct cttcccagtg
cccagttcca gaacaccctt 97500ccaaaccacc gtgcattctc cggggccatc gttttaatgg
ctgcaccctg ctcccgcgtg 97560tggacgcatc ctaaacagtc ccttagtatt atggttagat
gctccatgtg tttccaattc 97620ttcattattg taaacccaac tgcattgctc atccttgtag
catagctttg cacttatttt 97680ggatttttgc tttaagataa cttctcacac atggaattgc
tgggtcaaag ggtatgcaaa 97740attctaaggt gtttggtact tctgcagaat tctccctcca
gacatcaagt tgctctccca 97800tctgcactgt ggataattct gcaactctcg acaccctcac
caacaaataa caccgaggac 97860tagcagtgtc atttttaaaa cgcgcctcca tcaatgggaa
aagaatggga tcttctttga 97920ttgataatga gactaaaggc tttttctgtc ttcctcagtc
gctgatattt gtttcttcta 97980taaattgcat attcatatcc tttctcagtc ttttttgtat
gttttttaaa cgaaggctct 98040tctttaatgg gaaatctgca tttcatcaaa gatggagcgc
agagtctgtg gttcatttat 98100tctctggaga agatgcttct tttttgttct ctaaccaatc
agagaccagc ctgaagatgc 98160tagcaaatta atttcaggcg cacgagacag gcggtcggtg
ctcatgagga tcctggctga 98220ggaaatcaca ttttagtccc aactcctcga agtgttgagg
ccgcctggag tccacgcgga 98280ccagggttca aatccccact cggccacttc ctagcagcat
gactttggga tgtccctcag 98340ccatcagaga cttcactttc tgggctgtac catgaggtgt
tggtgtgcag tttaaaggaa 98400gtttggacgg ctgtggtgtg gatagttctg agtgctcagt
aggcagacct ggttttgagc 98460tgatagagaa ggcccacgcg aacgggcaga gctgggcctc
gccagggcct ggcactcagt 98520agtgctcacc aaatgcctgt tgcacaaggg atggtgccca
ggtaacctgg gtgagctaca 98580aaccagagcc gctggacgag ttccaaaggg aactcgggcg
cactagcccc ctttgtttgt 98640gtttggagtg ggagtcaccg tatagcttta tttgattact
ttttacccaa atccccatat 98700cctcaggaat tcaaggacaa ggatacataa caaaatgatt
tttgtctcag agtaaatcac 98760agtccacttg gaaaatctcc atcctagagg agggtgaggg
tgcaggtctt ggggtgaggc 98820tggggatgaa gcctcccatg actttgtggt taccctgctg
gtgggctcac tccctgcagg 98880gccattggtg gagtcacatg atgccaggca aggctggcca
agcagggtac agagcacact 98940gggggaggga tggctgcaca ggcgtggacc gtgggggatc
cagggtaggg atcccacctg 99000ggggtcttgg gctgagtggt ggagcgtgtt gcctgaacac
atgagcccgt ggacatctag 99060tgagcttagg gcatggcagg ggtggggtgc gttgcctgca
gcctggggcg gcctcggctc 99120cttcatgtca gcgcgtcctg acaccgtgac actgcgcatg
gtcacatgga gcctgtcctc 99180ttgtatctgc ttcatctctg gctttcctgt ccatctgcct
ggtgctttgg agggtgtacc 99240aggctgtgag ggagaagcag ccacgtggct ccgttttctt
gtgctgcagc tgggccagca 99300gatctagggg aaaggccact tcctagtggc ctttctggtg
ggagggagtc atagtgatga 99360gtctggaagc tgagccctgg cttgggagca gctggaaagg
ctgcctgtgg cctccctgag 99420tggacgtcct ctagcagttg ttcagcctcc ctgtccccag
ccagccacca cctagttggc 99480actcccagtg ggccccacga gccgcgtgtt gctatgcaca
gcccttacga gaatcccact 99540ttccagatga gggcgacagg aggaggaagc atcactcact
tgcccaaggt cccaccggca 99600aggggctctg agtctggacg tggccctctg agctgctctg
tccccgcttg gcgttggagc 99660gtctcagcct ctcccaagag tgtgtgccct tgtgtcgctg
gttcagaagc cagaaggctt 99720ccaggggcag acctaggcag aagccctgag tgggcatcct
ggtcagccct gtgggggttg 99780ggggccagtg ggcaggggca ctcaccatgc tacccaaccc
cccatggcac ccccaatgtg 99840gaacgcacgg gctcaggact cacttttatt ttggtagagc
ttggagtccc catttccatg 99900ttttgttttt ctttacagtc ccagtgtcaa gcttagtagg
acacagatgt ctcagcaaac 99960aaaactagga gccatccgag caggcccctg cggccatatg
cattcccatg cgtagcgaaa 100020ccaggccctc ttgccaagca ctctttggga ggatatttgg
aaaaaatgtc aaggcattca 100080tttttctctc gaccttttct gttggaatcg tggggcctta
acactggaag taaaaggagt 100140tataagacac agctggcttc ctgcacccgg gacagcccca
gtggccaggc ctggccctat 100200tctcccatct ggccctcggt ggtttctgca gaggttacca
tgcacaaggg ccaggccgcg 100260gcccccacct atgtgcaaag aagttggcaa tgatgtccag
gtagcaggtg cagcgggcaa 100320ggtttctacc cccaggctag ttaaggacac aggagactcg
aggagggaga ggaccagcag 100380aggacactct atgggggtgg cttctggcaa aacagtgcta
tctgaagatg agtttgaagt 100440taggaatgtt tgctgagttg gcctggatat tggaagtgcc
ctttcttttg agccctttgg 100500ctctcttggc gttgaccatc tgctggtgga acatgggtgg
gctcccagag cacctgcccc 100560tgtggtgtat accagctgga tctcagctac acctacggac
ctgcactggt atagatttgg 100620gtcccaggtt ccccaccagg ggcctggctg gcttccccag
gtccacaccc ttccagcgaa 100680gtcacatcac aagttgtgtg gaaatggagg tcaccaccct
ggccacagag gtgaggctgg 100740tgaggcccag agtgtacctg ttggggatgc tgcaacactt
ggaaatccgg ctgcccgagg 100800ctggaagccg tcttgctgcg tatgtcacgt aggataatga
gtggactctt ttctcagtcc 100860tggaggactt cctcggccag tctggagtaa gccatggctg
gggagatcag ttcttggaag 100920tgctgggttt tcctgcatct ttggtccttt ggtggcacag
gccttccctt gcaggccttt 100980ggcacctgtc tggggtgtgt gccattggct cagtagccca
tgggcagtaa gttccgtgga 101040atcaggcaca ttttcagcca gcgctgccaa gtgctccctg
caatggtgtt gccacctgag 101100gtgccaccca aagcagaaaa ccctggggct gggcaagaca
gcagctttca ggagctggga 101160ggaggcatca tttccacttt ccaggccctg cgatctggtt
tgtagcctgt accagggcac 101220tggagtcctg ccttggcaca gctcatggtc tagaggaata
acctcaggtc caggtgagac 101280actcagcctg tgatggcttc atatggtcac tacgttgttg
tatatggaga gaggactctt 101340tcttggtttt aaattttgtt tttaatttta taaatgtaat
atattcacat ggaaaatttg 101400gaaattataa acaatcaaaa agaagcagct aaaaaaagtc
ctttcataat ccaccggcaa 101460catcagttaa catgtggcat atttccttcc cgtctcctcc
actcccattt tttttggact 101520tcttaaacat aattgagatt gtgttttatg tacaatttta
tagcatgtac ttgttagctt 101580tgaaacagac ccatctgcgt ttgcaatttg gcctgcaggg
aaaagaccag attgctgagt 101640cttagctaaa gctgtgggag aaggaaatct aaaaataccc
tcctcgttgg cctgctcagg 101700gagcgccgtg gcctcttgtt ctcctccctg tcagtacagc
agaattagcc atgactcaga 101760gtgggtctcc gtggtgattc tgcctctaga gaacatgccg
tgctcccgac gtcaggctca 101820caggtgagcg ggggcactag ccggggacga ggggctggag
cctggccggc ccaactacag 101880aggggcctgg ggtttggctc ctgccggagg agcccctcta
gggctctccg gggtcttctg 101940cagagcgaac actttgaggg gagcatctgt cagagacaca
tgatgcttct tcacccaaag 102000ccccaccctg gctgcaagag gagaaaatca aaaccgtgaa
gagcatttta gagacttcca 102060cttccagcaa gatggagcag aagtgccttt ccctattcct
cctgccaagt gcaatgaaaa 102120accctggacg tcagttatga aacaaacaga agaggctgaa
aggaagagag gagaaggcag 102180actgaccagg gacctcagga cctggggatc agtggagggg
tgaattccct gggtttcctt 102240tgtgccttgc acatctcaga cgtggagctg aagaagccag
caagagcctc tgcagaagcc 102300tgctctccct ggctgaagga ccaagccagg ggcacctggc
gagacagaaa acatttaggc 102360aataactgtg gtactccagc caaacaccac agaaaaccct
gtggccccct cagctactcg 102420tgccagcaag ggctcagcag ggatcctgga ccaaaacact
gaccaggctg aaatgagccc 102480ctaccactgg gggtgtcaga gaaggccaag gagggagcca
ggagcagggt gggaatgagg 102540tcccccagct ggagcatcag tggagaccac atggtaagcc
tgtgcttccc ctagtcagtg 102600gtgacaagac atcccttccc ctcccactgg acgggcagca
tcatactgaa gagtcaggac 102660ttcatcgcca ctcagaggtt acaaggctat ccgccccccc
cacaacagtg acaatggagc 102720ccgcatggga cacagtaatg agggattcct gcacctccca
accaggggaa aacttcctcc 102780ttgaagagca gtaatgagag cacctccccc acctcagtgt
cagcagaggc caagtgggga 102840acctggacgt ccaccccacc caagcagaat aaggcagcac
taccctcttc ccctgcctga 102900gttgtgtcca gagaaggcca ctgaaaggga aggtttaaat
aagactcaga gtctcacacc 102960acacacccaa aatgtccagc ttgtggtaca aaaccacctg
tcatgctaag aaccaaggaa 103020accccaacat gaataagaag agaagaccaa cagacaccag
cactgagatg acacagatgt 103080cagggcaaag actaaaacag ctactggaaa aatgcttcaa
ccagcaattg caaatatgct 103140caaaacaaat gacaaagtag aaagtctcag caaagaaata
gaagatacag ggccaggtac 103200ggtggctcat gcctgtgatc ccagcacttt gggaggctga
ggtcggtgga tcacttgagc 103260ccaggagttt gagaccagac ctgggcaaca cagggagacc
ttgtctctac aaaaattaaa 103320aaattagctg agcatggtgg cacacacctg tagtctcagc
tacttaggag gctgaggtgg 103380gaggatcacc tgagcctgaa aggtcaaggt tgtagtgagc
tgagatcacg ccactgcact 103440ccagtctggg caacagggtg agaccctgtc tcaaacaaac
aaacaaacaa acaaataaat 103500agaagataca atgaagagcc aaatgaaaat tttagaactg
aaaaatacag tagccaaaat 103560agaaaactcc atggatgggt ttgacagcag aatcaagggg
acagaggaaa gagtctgtga 103620acttgaaggc aggaaaatag agatgaccca atcagaacag
agaaaaaata gactgaaaag 103680aatgaacaga catttcacca aagatgatat ccaaatagca
aataagcaca taaaaagatg 103740tcatgagtca ctagggtaac gcaatttaaa gccacaatga
gacatcacta cacacctatc 103800agaatggcta aaataggcca ggcgtggtgg ctcacagctg
taatcccagc attttggaag 103860gctaaggtgg gtggatcact tgaggccagg agctcgagaa
cagcctggcc aacatggcaa 103920aaccctgtct ctattaaaaa tataaaaatt acctaggcat
gatggtgctc gcctgtaatt 103980ccagctactt gggaggttga ggtgggaaga tcacttgaac
tggggaggca gaggttgtag 104040tgagccgaga tcatgccact gcacctcagc ctgggcgaca
aaacaagaaa gataccatct 104100aaaaaaaaaa agaaaaaaaa aaagaatgac taaaataaaa
aatgccgatg agaacagtga 104160gaatctgaat cactcagaca ttgctggtga gaaggtaaaa
tgatacaaat gctctggaaa 104220atagcttggc agtgtcttaa aactctaaat gtaaaactgt
cacatgactc agccatggta 104280ctcctgggca gttatcagag aaatgaaaaa ttataattta
cacaaaaacc tatacacaaa 104340tgttcgttgc agttttattg ataatagccc caaactagaa
gcaatccagg tgtccttcag 104400tgggtgacga gaggaactac atccatgtca tgcaatagta
ctcagcaata aaaagaaaca 104460atctctctct ctctttttct tttttttttt ttttgcataa
aatccagatc ctgcagaaaa 104520aagaaacaaa tttgatactc acaatgactt ggatggactc
ccagtgaaaa aagcctatcc 104580caaaaggtta tgtactttct gattccatgt atagaacatc
ctcaaaatga caaaattcta 104640tcatgcaaaa cagatactgt tggatggggt ggggcggggc
aggagggcag tgggcgtggc 104700tataaagagc agcctgaggg gtcctggtgg tgatggaaag
cttctttatt tcactgtatc 104760aatgtcaatt gttgactttg atattgttct atagttttgc
aatatattac tgttaagaga 104820aactgagtca gtggcgttct tacaactgca tgtgaatctc
caattagccc caaattaaaa 104880gtttactaaa gcaacaattg aacaagaaaa agaaaagcct
ttgtacctag caacccagaa 104940tgcctatgat aagtggcagg agctgatgac aggcgttctt
ggttcagaac aatgacaact 105000ctgggacaag catgccgctg ggagaaagag accaggagct
gcacagccag acagatgctt 105060ttgcagtggg atggcagtgg ggagttggaa tggttcatca
tttacttttt atatttaaca 105120cttttttttt taagagatgg aatgctctgt cacccaggct
ggagtgcagt ggtatgatcg 105180tggctcactg caacttcgaa ctcctgggct caagcgatcc
tcccacctca gcctcctgag 105240tggctggacc tacaggtgca cattgaagca cccagctgat
ttaaaaaaaa attttttttt 105300tttttttgag acggagtctc gctctgtcac ccaggctgga
gtgcagtggc gcgatcttgg 105360ctcattgcaa cctccacctc ccaggttcaa gtgattctcc
tgcttcagcc tcccgagtag 105420ctgggactac aggcacgcag caccacaccc agctaatttt
ttgtattttt agtagagaca 105480gggtttcacc atgttaccca agatggtctc aatctcctga
tctcgtgatc tgcccacctt 105540ggcctcccaa agtgctggga ttgcaggcgt gagccaccgc
gcccggccaa tttttttttt 105600tatagagacg agtctcactg tggtgcccag gctggtcttg
aactcctggg ctcaagtgat 105660cctcccactt cagcctctca aagtgctggg attacaggtg
tgccccaccg tgcccagcct 105720cggattgtca tttagcggtc tgtctgaaca tcagtctcct
gacgtccttc ctacagatat 105780gaacagagcc cctgctaggc atgagtattc ggagcataag
ctcctggttc cccagtgaga 105840gacggatgtg taaggaggca gtctcagggt aggatgttga
gtctgggctg gggacagtgc 105900actgtgccgt ggaattatgg agagtttttc ctcctatctc
taagtccaat gtgtggcatg 105960gttatattac atttttagga tggaaaagtt ataccatata
tttaaatgaa ttaacagcat 106020ttgcaatgat ctggatgaga ttggagacta ttattctaag
tgaagtaact caggaatgga 106080aaaccaaaca tcttatgttc tcactgatat gtgggagcta
agctatgaag acgcaaaggc 106140atgagaatga tacagtggac tttggggact tggaggaaga
gtgggaggga acgaggtgtc 106200cttgtccagc aagcgctctg tgtgaggctc catgcatctt
tatgcattct gctctgatgt 106260ttgcaaacac gacgcccggc aataagtgag gcccccctga
ggtactcaag gggcttcact 106320cgccgaagct cccctcccca cagcaacctg agcttctgca
aaagctgaga agcacgcatg 106380agaagggcac ctcccctgct tggaaataat tttttggttg
cagtaggggg ttttgtttgt 106440tgtttgttgc ttgctgctcg ctagcttgtt cttgcagaag
gaagagctgg tctcctttgc 106500catcccttac tgagggtcga gaaatacaga caggtgtctt
caggctctcc gtgcaggcca 106560tagtccccaa gaattcactt gtgagcctct gtggcagttt
atgttgagga ttaggtttag 106620tttggggaga ggaggatcca tggagctggg gttccagatt
tgatgtctaa ttttagaaag 106680gacactattt ttaaatcaat agctgtgcag cctagcatat
cactctgacc ctgtgccctt 106740aagggacgca ctggggagga cattgttcag gacggagcct
gacaagaaag gaatgctacc 106800ccacgccgtg cgccccaggc ggccactgtc gtccaagaat
ttccaggtct tttctggtat 106860cacttcttgg cacaggcgga agaagccatg ttcccatcgt
cccaacatgc agctggggac 106920ctggagaccg ctttggcctt tctcaaggtc acaggaaggc
agggggaagg ctggcgaggg 106980ccctgtggat ccgccctgct ccagcccggc gtcgggttac
agcagtgggg acaggagggg 107040gaaggggtgt cactgcagga gctaaggaga aatccagggt
tgacaccctc agaaactgga 107100gaagaccaag aaagcagagc ggagagtgcc ccctgcagtc
cgcagagggc ctttgccgag 107160gtgggcctgg agccgtcgcc gtacccctgg gtggcccgca
aggccaggct tggctgcctg 107220gcaatgtttg ttgtgggcag ttgccatgcc actccggcgg
cagcaagagc agggacaggt 107280gtggcagagc ccggtggttc cagcccagcc tggcctccca
gcatcacctt ctccagttac 107340ttgaagtggg ccaggtcctg gagaccgggc gtgtggttat
gggctctcca gattcttggc 107400gccttcctga ttgcaggagt ctggttggca tcatgggacc
cctgggagac tcactttcag 107460gaggggttgt cctttccccc ctgggggaca cactctgctt
acttgtcccc tgggagcgct 107520gggcagggag actgcaggtg ccagccactc ctgcagcttc
tgcccagcct cctctggtgt 107580cctctgcact gtcaccaagc catggcctca agcaacttgt
gggtgcctgt gatttaggcc 107640ttggggtgtt ggagcctgct tggacccact ggcaggagtc
agttgtgatg tcttttgcca 107700gctctcttct gtagcttgaa attggccacg gtggaagtat
ttacaccatg gaagttggca 107760aatgcacgga cccagcaccc ctaccccatc ccggagagcc
agttaccagc acacgtctgc 107820cttttggggt cttttgtgca ggactgatgg atggagtcgc
agcagacact tggcctggcc 107880actgtgaggc tgtgtggtgg gaggtcgggg gcacctgaga
cagagggagg aggctgagag 107940tctgaggggg cagccagggc tggagggcca tttgggattc
tgttggattt ccctgcactt 108000tgtccagggg cagaagcctc cctcccaagg tcaggcccca
tttctagccg tgctggtcgc 108060acatccctgt gagcaggggt gcagaagaga aatggtccct
caggagcccc acacccgctc 108120ccatccagct ctcctccaga tgacatctcc aggcactggg
accccacaga ggcaggccct 108180gctcagcctg cagagctccc ggtggagagg gagggcgggg
gtccaggtgg cctgaggaga 108240ggggtgcagg ccagctgagg gggctcccag aactccacga
gctcagagac gggctcttgc 108300ccttgcagcc ccatccttgg gatgccccag cccttcatcc
agacctctgc tgtgattgtg 108360tcaggacctg tccctgcagt ggactggaag ctccaggtgg
gcaggccttg ccgtggtcac 108420ctgtgtccac aggcctgggc agagcaaact gatgagtgtc
tactaaatca gagagaagac 108480acaaccaggg agcgagtgca cgcccgggcc gaccagggct
gcgggcatca tatcccgctc 108540agccgctgcc tgccctccac gcgcccctgg ctgcctctct
cagcctcggc cccctccact 108600catgagtggg aataaataag agagtagaaa acgtaagtga
ggttcttgtc tggccgcccg 108660gctcatagca atgtttaata gatagtaact ggtgttatga
ctgaagacgt cctttgtggg 108720ctattggagt tcttaatatc cccgagtaac cacagggcaa
cgctctggct tgacaagggg 108780cggcactgct ggtcagcagc ccagccagcc gcagcggcca
cttgtgttcc cagctgtggc 108840ttgtgatcta tggtcgactg agctgttact ctcctcggtt
agacacgcat atgcacacac 108900atgcacactc atgtacttac acatgtgcac gcgcatgcaa
cgcacaaaca catggacaca 108960catgcgcatg catgcacaca ctgtacccac gtgcacacac
gagtgcacag acacgagggc 109020tcactgtgat ctgctctaat gcatggtctt aatagctata
taatattcca cggttctttt 109080ttctttcttt cctttttaag atggagtctc gctctgtcgc
caggctggag tgcagtggcg 109140cgatctcggc tcactgcaac ctccgcctcc cgggttcaag
cgattctcct gcctcagcct 109200ccttagtagg gggggactac aggcgcgtgc caccacgccc
ggctaatttt ttgtattttt 109260cagtagagac ggggtttcac cgtgttggct aggatggtct
ccatctctcg acctcgtgat 109320ccgcccgcct cagcctccca aagtgctggg ataacaggtg
tgagccacca tgcccggctg 109380gttcttattt ctttaatttt attgaagaat aacataccta
cggagccgtg tgcatatcat 109440aagggtacaa tttgatatac tttcatggag tgaaatacat
ccctgaaacc agctccggat 109500cgagaagcag gacatttccc accctcttct tgccctttgc
ctgcaacccc catccccggg 109560gtgaccattc cccggcctct gcccccggga cagttttgtt
cgctcctttg ctttatggag 109620gtgaaaccgc gcgatgaggc ccctttcgag tctggctgct
tttgcttagc cctgcgcttg 109680tgagatccct ccgcggtggc gtggagaggt gtggccactt
cccgcttcgt ggcaagtcgt 109740gttctgccat ttctccgaat accgatggag gcttaccgtg
ctggaccatg gaatggggtc 109800tggaatgttc tcatacaggt cctgtggtgc aatgggcgtg
tgctggcact gggattctgt 109860gtctaatgct cttcatccta gacctgtgag aggagactgt
tgtccccagg gcagtgagga 109920ggaggccagc tttggagtgg tcaccctcac ctcgccgtta
cacacgtcac ttcatttaat 109980ttttaaacaa cagtcctgct gccggtggcc ttttctgtgt
gctatagaag tgagagcagg 110040tgcctgggga cacttggagg cccccctagg tttgtctgta
atgagcttgt gctggcgact 110100ggcatgttct gtccccccca cggctcccag cgtgctccca
gcgccctccc acagctttag 110160gcaattaaga cacgggcttg gagtggccat tgcaggacct
gctggtaatg ggggtgaggg 110220gtaagctttg tttccaaggc aggtctgtgc cctggggtct
tgtccccctc cacatgcaca 110280ctgcctgggt ggggctcatt cctgcgggta aggcccggga
tcaggaatgg ctgtctttct 110340tctgcccaga gggagctgct tgtggcttga ggtggccctg
agactgtgag acaggccact 110400gggaggccca ggctctagca aaagtgggga tttttttctc
tccctagttt tggatcaggt 110460ttttgaggtt gtgcaaacgt aggaatgagg cctttcaccc
cgtcagttgt ctgtcgggct 110520gtgtcgcaca gtggcccctc gggacagcct acagagctgc
tgtctcttcc aggcattgct 110580gcagggcctg ctgtctctgc tgtcctgggc caggcgctga
gttctgtcgc ggaaagtctg 110640agggcagagc tgggagggca gcgtttccaa agcaaatgga
aaaggcaaac gggaacgtgg 110700cgggcagggt gggcagggca gcctggaaac accctttgtc
ccatgcagct cattcactcg 110760gccaggattt gccgagtggc ttactgcact ctgtggccct
gaaccagggc tgggggatct 110820cagggacatg tgtgtccaca gagatggtca cctgccacct
ccaccatcac agatgccatg 110880aaggctgtga agggaaacag acctgtaagg cagcggagcc
gagatggggt tgatgagtgt 110940caccagtcac aagcaaggtg tctctgttct ctcttctaca
tgttgatcga ggtataacac 111000ccaggaaagg acatcaagtg tacagtttga atttttgcac
acacgtacac ccgtgtaacc 111060cccacacaga tcaagatgta gaacattcca gcacccagaa
gtttccccgt gccccttccc 111120agtcagaccc cccgacccca ggtcacccct gccctgactt
gcagtgtaga gctcacccgg 111180ccacttcctc ttttgcccac agcaggtgag gagggccttc
tactgggaaa aactgtgact 111240attaatttag gtttaaaatg tattggtaac aggcagatct
ggccagtccc tccctggctt 111300caatctcctt ggtagcttcc aggattactc ttaggaggga
ctggaggaaa agcccctctc 111360tgtccccatg gcacccacag tgaccgtcac agccctcctc
tgggccctca ctgccaccag 111420tggactcact gtctccctct gccctgtgca cagcttctta
atttgggtcc ctctggttta 111480gctcagtgca gccagaattc acccagtgcc tatgtgagtc
gggcactggg ctggcccagg 111540ggcctttgga atgaagggga cagtcggtga gctgaggttg
tccccaaatc agaggtgtga 111600ccctatgagc agccaggttt ttaattttgt aattgagata
ttcacatgcc atacaattcc 111660tcttaaaata cacaattcag tggtctttag gtttttaata
tattcacaat gctgtgcaac 111720tatcacctct aattccagaa tattctgtca ctccaaagag
aaaccctgta cccattagca 111780gtgacttccc actccccaca cccccagccc ctagcagcct
ccagcctctt tcaatctctg 111840tggattcgcc cgctctgggc atttcatgta aatggaatca
gacgatacat ggtcttttgt 111900gtctgactcc attcactttg cttaatgtct tgaccgttca
tctgtgttac ggtgtgtatc 111960agagtttcac ttctttttgt ggctgagtaa tattccacta
tgtggatagg ccacattttg 112020ttgatctgtt atcagctgat ggacatttgg attgcttcct
gatatggttt ggctgcatcc 112080caacccaaat ctcgtctcga attcccatgt gttgtgggag
gtaattgaat cacgggggca 112140ggtctttccc atgttgttct gttctggtga tagtgaatta
gtctcacaag atcagatggt 112200tttaaaaagc ggagttgccc tgcacacgct ctcttctctt
gtctgccgcc atgtgagaca 112260tgcctttcac cttctgccgt gacggtgagg cctccccagt
cacgtggaac tgtaagtgca 112320ttaaacctct ttcttttgta aactgcccag tctctctggt
atgtctttat cagcagtgtg 112380aaaacggact gatacacttc cactttttgg ctattgggaa
tagtggtgct gtgaacacta 112440atgtataggt ttttgtgtgg acacgtttcc agttgtcttg
gatatatacc tagttgtgga 112500attcctggct catgtggtaa ttcatgttta actttttgag
taacgagcgg ccatttttaa 112560agactggata ggaatatgta aggaagaaag taaatttatt
cacttccgtt caggaaattt 112620acctaccaaa caacatgctg aatgggaacc tttaggggct
tggctgaatg gccctgcaca 112680aggtgtccag gatgcactgc ctaaaagctg tggaccctgt
gcctgggagg aggaccagca 112740aagggcagag ccccccgtgt ttcagatgga aaggttaagt
aatgtgactc tggtaatgca 112800gtccagttca tttatgagat catggacctc ttatactaca
gactttttgt gtttttttgc 112860tgagtctttt gctccctaac cgtgttccct ggctggccac
cctgccagca ccacgctggt 112920cctcaggaag catcatcagt cagagacaca ggcctttgct
gtggcctcac atgctagcag 112980aggaagatag acaacatggc gatggatgag gaagttgggt
tgtctgtaag aggatgagaa 113040gggctgtcat taaacaggtg ggttcaggtg ggtctcattg
aaccatgaga tttgaaccaa 113100gacctagggg ggtgagtgag gggttgggga agggagatta
atgcgagttc tggggaagga 113160acattccagg tgaggaaaca gccactgcag agccccttgg
gtgttggagg aagacactca 113220ggtggctgga ggggtagggg aagcattggc agggagaggc
cagggctggc aggcctgccg 113280aaggctttgc ttctcttctg ccaggctggc catcacgggc
cttgcacaga agagagcaat 113340gacctgcctt caggtttcaa aaggtccctc tggccgctgt
ttgagagaag gtcacagtgg 113400gatgagggca gagcaggggg actgctcagg aggcttttgc
agtaaccgac ggagtggtgg 113460cggtcactgg caccgagaag tgtacatgct gcaaagaaaa
tactctgttc aggagaccag 113520gtgcggtggc tcacgcctgt aatcccagca ctttgggagg
ccgaggtggg tggaccactt 113580gaggtcagga attcgagacc agcctggcca atatgctgaa
accctgtcaa tactaaagat 113640acaaacatta gctgggcgtg gtggtgggtg cctgtaatcc
cagctgctcg ggaggctgag 113700gcaggagaat cgcttgaacc caggaggtgg aggttgcagt
aaaccgagat cacgccattg 113760cactccagcc tgggcgacag agcgagactc catctcaaaa
aaagaaagaa agtaagaaag 113820gaaaggaaag aaaagaaaaa aaagaaggaa gggagggagg
aaggaaggga gggagggaga 113880aaggaggaag ggagggaggg aggaagaggg gaggtgggga
gggctctgtt cacataataa 113940agaaatccag aggtttgacc atgggcgtca tttcatcact
gcacttggca gaggtcatcg 114000ctgctgctcc catccagctc agtcctgtgg tccctgtcgt
gtgtagaaac acatactctg 114060tacgttgctg ggatcatagc agaatatcag ctccaccaat
cctggttcct caatggcgac 114120acccacgcac tcagggcatg tatcagtgtg catggagttg
cctcgcgttc aactgcgtgg 114180tgtgggtaac gcagttgtgc cattgtgtgt ctcacaattc
ccagactggt aaataatgtt 114240gcttctagtt ttttgctact taagcagtgt cacagtgaac
attcttttat aaacgtgttt 114300gtgttttaaa aataagctta ttttggaata atttcatatg
tacagaaaag ttatggagtt 114360aggatgtttt catccactct tcatccagtt tcctctgtga
acacctcaca catctgtggt 114420acatttgtta caaccaagaa actgacattg gtacccacac
tattcatgag acaccagact 114480ttatttggat ttttttcttt ttgagatggg gtcttactct
gtcattcagg ctggagtgca 114540gtggcacaga tcatagctca ctatagtctc cacctcaagt
gatcctcccg cctcagcctc 114600ccaagtaggt gggagtacag gtgcacgccc ccacacccag
ctaatttttt attttttgta 114660gagacagggt ctcactgtgt tgcccaggct ggtcttgaac
tcctgacctc aaaggatcct 114720cctgccttgg cctcccaaag tgctgggatt acaggcatga
accactatat ccagtcttta 114780tttggatttt actacaaaca tatgcttctg tatacttgtg
tgactatttg tttaaggtac 114840agtccctgaa gtgggattgc cgtgcctgat ccttgctgcc
acatcatcat cccatagggg 114900acagggcccc actgtccccc atcactgaac agtgtgccca
ggtgagatca tggacttggg 114960ccccctaggc cagcccagtc tctttgcagc caaggaaagt
gaggcttagc tgtcgggggc 115020tgtgggggga tgcagcttgc cacacttgac acccaaacgc
tgattttgtt tacatgagac 115080tcaggtccga gtttatgtct gagacctggg aatcaaaaag
ccacccccac agcctccagg 115140gacctggctc ccctggagcc accgtgctgc cccacaggtg
tcacccatgg aggctgcagg 115200agcacctggc gtggtccagg ggacacgaag ccctggtgct
ggctcctccc aacgcccctg 115260tgagttctgc ctcccataga ccacatgtgt catctgcgtg
cacatgtgtg aaacccgaaa 115320gacaagcaga tgagacccgc ctgccaaaat atttgctgcc
gcgcagaaca gtaagtcgtt 115380tttgacatat ggatctcagc gcgcagccac agcaaacagg
cagcgtggcg ctgaggggca 115440caggtttgct ggggggcgtg agccgggagc atcgccagca
gcaggctcgc ccccctgcga 115500gggcgcgcac ggattcgctc gccaccaaga gcagggcccc
aagaggggcc ctcggcatag 115560tcccctcgag tccccagaac ttcacaggca gagagcctgg
ctcgaggatg ggcgggagct 115620cagtctggct ttgatctgcg ggaggtgttg tgtgcagaag
ctgtgggcac cgggggctgc 115680agagcttcag acctggagcc tgctcccagg gccgatctga
acccggcacc tgtgctgacc 115740caagggtcaa ccaaagtcct gagtttggtg gcgccgaggg
aggggagtgg tcgcatccca 115800ccgggtgcct ggaatggtgc tggaggaggc agcatctgat
ttgggccccg tagggccaga 115860ggggggattt gtagggacag agcaggaggg agtggagctg
aggctggggg agggcatggt 115920cggaggtcca gaggtgggag aggggccgcc atcctgaagc
acacccctcc tcgtcttcct 115980gaggcctcca tcccttcctg atcttccagg tcatgggctc
caggcttggg cccactgctg 116040tccttctcca ctctctcccc agggaaccca cctgggttgt
ggccttaaac ccagccagat 116100gatgcttctc cagcctctgc ctcctgccgg gccctcccct
gagctcagaa gctgccatca 116160gtagccatgc aacctccccc ttgccatcat gaagtatctc
aaacttggca gcctcgaaac 116220agagcccttg gggagtttaa gcagaaacag aatatttttg
catagtctta aagtaattcc 116280ccccaccccc aaatattgag caattacaaa gggcaaaata
ggaagtttac ggtggagaat 116340cctggtggac acagccttgg ccaggtgatc aaggttcatg
tcaccagtca tgaagcggcg 116400tgcacccctg gtgtgctgag aagggcagct tacctcctgc
aggtgccctt gccccaacgc 116460accactcagc ctaaccatgg ggagcatccg acaaaaccaa
gctgtccgca tctccaagga 116520caggcacact ggggcttagg gcttcagtgg gggcggaggt
acagttcact ccctaatacg 116580ccatttactt ggagaagccc tctccagctg cctgaggcag
aagcacccgg ttctctctct 116640gcaggacccc tgctttgttc ttcctaggat gcgtcagccc
ctgattttcc atgacagtcc 116700cctgtgccca tgtcctgcag cctctcctgg gggcaggagc
tacagggttt cttgactcct 116760acctcagcac tttggtccaa gctctgagca gagacctctg
taggcgtctg caggactggc 116820tggcccttgt gtctgtcttt cccctcgtcc tggccccctt
tgtccactcc tgcaccagct 116880tccctggacc cactgccact gcgtcacgtg gtcggatctg
tgtcccctgc actctcacca 116940cctccaactt gcatcccggt cccccacacc ctgaccccca
aaggcattgt gcatgtggat 117000gagtgagtga aggaaaggct ggcgtgttgg gtgtgcgagg
tcgcaggggc atgaggtgcc 117060tggacggcag agagccaggg ctgggtgagg acctgggtgc
caggtcgggg gtctgggcag 117120cctggtggcc ccctgaaaac gttcacatcc cttggggcct
gctgtgatac atactaatca 117180gtgcttcatg agaaaggtat gagccgaaag aaatgggggt
tcccactgga ttggtcgttt 117240ctgcattgtc cggagctctc agacctggtt tcttaaccac
tgtgttcact gaacaccccg 117300ggcttgatca gctgctgagg gtgacttcac atgagattat
aattgtactg cctaatagag 117360tgctcacaca tgttaaaact acctggcagg caggagtgag
aatacagctt tcccagaaac 117420gtgttcaaag aaaaacccat taccaacccg gcagaacata
aagtactagt gtggggtatg 117480aggcagcatg tgggggcgct ggggtggcag cagggccaga
ccccacttca ggggttgccc 117540ccaggacccc ccttgcggta cccgtatatc ccacccccag
actcctttcc atgagctgct 117600cccagacccc ctcatatcca tcctgcctcc agaaccccta
tatcttgctc ccactcccta 117660ccatatgctg ttcccagacc cccctccatg gcctgccccc
agatccccct ttgctgccac 117720ccctatatct taccccagac tcccccatat ctcgccccca
gaccctcctc catggtctgc 117780ccccaggcct gctggtaata tgggatacag aggtctcctc
actgcacaca caaaactcct 117840tccctgagcg agggataggc agagctgagg gcgggggtac
tggggggctg aggggggttc 117900tttagggcag gagttacctg ccctgtagac ggggagacca
cattagtgtt tgtgcaaagc 117960cctgccctcg tgcacctgtc cttccgtagc ccacgatacc
tcatccagcc cgggcccgtg 118020ggtgggctgc tgagcacccg accacagcca ggcctccatg
tgccactgtc aaggggactt 118080ggcccgccat gatcctgagt gcacaggtac attgctccgc
cctctcaaca gccacgaccc 118140agaaataaag taggcactgt aatttgcatt ttgaggataa
ggaaaccgaa taatagagtc 118200tactgagcgc atactcagtc tttccagacc ctagaaagta
ggtaccattg tcatcctcat 118260ttcagagcta ggagagcaga ggccagacag gttccatagc
ctgctcaggt gggtccgtgc 118320tggaccagga gaagggccag gtaggctgat gggtcctgag
cttgtgctcc cgaccgccag 118380gctgtgtgtc tgatggccga ggccagggtg gctcagccaa
gcccagccaa cccaagacta 118440gaccatcccc agcgtccacc ccagcctcct gggcacaggg
gtcacctccc acccctggcc 118500cacccataca gccttttccc atctaacctt ttctttaaag
gaaggcatcc cttggaatga 118560gtccaccctg agattttatg gatccaaaga aaaaacttaa
ttagctgcat aaactgtcag 118620ttgagactct ccagaggaca cacctccggc tttcaggttc
cccagggaat ctgaggctct 118680gaggcaggtg gacagccagg tccctcttta cccagcacca
ggtaacctta gaggcctggc 118740cctgccttcc tcattttgaa atgtcaccct gtggtcatct
ctccttccct gatgtgaagc 118800agtgaccagt tctgccccat cctacacatg tgtgcacaca
tacatgcaca catgtgtaaa 118860tacacatgca tgcacatgca tatgtgtgca cacatatgca
cataccttca ccttcctatc 118920cctccccagg ggaaggcctt ccttggggct gcccctcctc
cactctggca gcattgggac 118980ttgaatcttg tggccccgtc acaactgtac tcagctactc
cagctccttg tgtgacctct 119040cagcccccag gctctgagct acctgaggtg aggtggggcc
acactggtcc ccacctcggt 119100gcagcacaca cagccccatc ctgggagtgc cacccaggtt
tcctctaact gtgctttcca 119160agggctggaa ggacctcttc ctctatgcca ggtgtgctag
agtggcccca gccgagaacc 119220tgactgggac caagccccag gacagtggtc aggcacacac
ctgagtcgct cttcctccca 119280tagacacatt tggagttccc actgccttcc taggccccac
cacatgctgt cacctccagg 119340aagccttcac aggttgctcc ctgctgatag ccttgacctg
cttcccaggg catttgcaag 119400ggagtagaac gaagaccaag tccgtgcagg ggctgctggc
tctgtgctca gcacacgccc 119460cgtgaccacc tcaggggcgg ctctgcgagt ccccacttga
caggtgagcc acagggccac 119520aggcatggct tggggtcctc tgcggagccc gatgtgtcac
tggccctcac gtgtccttct 119580ccgatctgct cactcattca tccattcatt tattcagcag
gggttcttga gtgtctctga 119640ggtggagcgt ctgctgtgaa ggactggcgg gacgcaaacg
gttaccactg gtgtgacggg 119700ctctgatcag gaggcacagg gtgttgggga gctcagaggg
tgacatctca ccctgccttg 119760gccagtgtgg gagggagagg gttcagggag gatgtaactg
tgtgtgtggc aggtgctgtg 119820gtggcacaga gctgagggga cataggcaca gttgctggat
gcagggctta caccgcgcca 119880cgcgtcagaa gctagcaggg gctgagcagc atctctgccc
tccccgcttg cacccccacc 119940atccagtgca gcccagcacc tagggggcat ggggccagtg
cctggcacat tttgtcctga 120000ctgtggaaaa aagctcttca cagccctgag tgctgcttta
gctttggctc cacccatctt 120060aggctgctca gaggcctcag gcctccctgg agggccagaa
gggattccca ggggcaggat 120120agcacggacg ggggggtccg ctgtgtaagc ctgggaaatt
ctggctgctt gaagctggca 120180cactcccctg aggctgttcc aggcacagtc actgggtctc
cccgtcgaag tggctctaag 120240gcaggtcttc ccggttacca gcaagcagat acatgttgta
acgggctgag agtgggtgtg 120300tatctcatat tccccagagg agcttacgga cagcccctct
ttaaggaaca tctcccgggc 120360cgagcgttct gagcgccctc tgagcaggca gcccaccacc
ggggcgctgg ccagtctgag 120420tcagtcctgc ctgggaggca gtcaactctg gggaaggcag
aagaatcaca gcactgagtg 120480ccgtccagtt tcccctcaac tatattcgcg gacacggcgc
caacagggga aaacttacat 120540agcacctatt cattctgagg cctggccggc ctctggaaat
cagcatttca gtgttggcca 120600ggaaggaggc cagaactgtg ctgtagcatt tcaatgtatg
gctgtcccac aatttatatt 120660ccagtcccct gatgatggac atgtaggcgg tttccagttt
ttttcatgat cacagatcac 120720acttctgtga acagctctgg atgtgtctcc ttgtgcaaat
atggtgctag gtgttcttca 120780gagcattttt ctccaaactt gccgggtaat atggtcacct
tggggagttt gttaaatata 120840catatttcca ggtctgctct ggtgggtctg atatttggag
ccaggaatct tttctttctt 120900ttaacagata cttaagtgat tcctgtggtc caggaagttt
tgcaagctct gatgcagaag 120960agctcgctgg agcaaacatg tatgttctgt tttactaatt
agatatacgc tgttgatgac 121020tccaaagtgg acgtgccatt tctgtcccct ccagcagtag
gaaagaatcc cattatccca 121080catcctggcc agttcttggt attgtgagat ttttttaaac
ttcttctctt aagactatga 121140tggtattttg ttgtaatata atccacattt tctatattat
tactgtgctt gagcacttct 121200tcaaatgttt gatgaccttg accattgaaa cttccttttt
gtgaattacc tattcatatc 121260ctttgtagct tcactgtgga atgcttgtta ctgatttgta
gccattcttt atacatttca 121320ctcccactcg tttgtcagtt atatgtggtg cacacatctg
ctcccagtct gtggtttgtc 121380tttccacttt gtttatggcc tcttattggg acacagtttt
aaattaaaat ataatcagat 121440gtattaatca tgatgttgta gtttgtgcat tttctatctt
gtctaagaaa ttcatccctc 121500tccttgttat acttagaaag aaggatattc tacattctcc
ttcaggagtt ctaaagattt 121560gctttgcaca ttcaggttcc aaatctactt gaatttgact
tgtgtatatg gtgtgaggta 121620gggatttgat tttattttct atgtggataa cagttgttcc
attatcactt ttgagtagtc 121680caccttttct ctcactgact tataatgcca cctctgtcat
gtcagctttt catatttatc 121740agttcctgtt aaatgtatct ttaaagcttt attgagacat
aattcacatg atacaattta 121800cccatctaaa gtgaataatt aatgttttta ttatattcat
acaccatcgc tatcagtcaa 121860ttttagaaca ttttcttttt ttttttcttt tttttttttt
tgagatggag tcttgctgtt 121920gtcacccagg ctagagtgca atggcacaat cttggctcac
tgcaacctct gcctcctggg 121980tttaagcact tctcctgcct cagcctcctg agtagctggg
attacaggca cccgccacca 122040cacccagcta atttttgtat ttttagtaga gacggggttt
cgccatgttg gccaggctgg 122100ttttgaactc ctgatcttgg gtcttgggat ccacccgcct
tggcctccca aagtgctggg 122160attacagacg taagccacca tgcccggcta ttttcattac
ccccaaataa accttgtacc 122220ctttagttac ctcctcatct ctgtaaccct aagcaaatac
tcatctgctt tctgcctttg 122280tgggtttccc tgttctaaat attttttatg aatggagtca
tatagtatgt ggccttttct 122340gtcttctttt actctgcgtg ttttcaaggt tcatccatgt
tgtagcatgt atcaggactt 122400catttctttt tatggctgaa taatgctcca ctgtatggat
agaacacagt tatttattca 122460ttcctgtgtt agtgtatatt tgggttgttt ctgcctttgg
ctattgtgag taatgctgct 122520ataaggattc atgggcaagt ttttgtgtgg acataggtct
tcatttctct tgaatatatg 122580cctaaggagt ggaattgctg ggtcacgtgg tatctatttt
taatcacttg aggaattgcc 122640agactattca aaagcagctg caccatttta tgccatttta
cattcccacc agtagtgtat 122700gagggctgat ttatctatgt ccttaccaac gcttcttatc
tgactgttta attctagcca 122760ttctagtggc tggtgaagta ctatctcatt cagattttgg
tttgcaattc cctacaatga 122820tgactagtga tgtcaggcat cttttcttgt acttattggc
cacctgtatg tcttccttgg 122880agaaatgtct attcaaattt tttgcacaat tttaatacat
acatatatgt atatacacac 122940acacacacac acacattttt tctttttaaa cagattctgc
tttgtcggcc aggctggagt 123000gcagtggcac aatcttggct tactgcaagc tccgcctcct
gggttcatgc cattctcctg 123060cctcagcctc ccaagtagct gggactacag gtgcccacca
ccacacccag ctaatttttt 123120gtatttttta gtagagatgg ggtttcacca tgttagccag
gatggtctcg atctcctgac 123180ctcgtgatcc acccaccttg gcctcccaaa gtgctgggat
tacaggtgtg agccaccgca 123240cctggcctat ttatatatat attttttaaa gtcaggatct
ctgtcaccta ggctggagtg 123300cagtggcaca atcacagctc actgcaacct tgaactcctg
ggctcaaacc atccttctgc 123360ctcagtctct caagtagcta ggactacagg tatgcaccac
cattcttggt gaatttttat 123420ttttattttt atagagatgg gatcttgcta tgttgctcaa
gctggtctca aactcttagc 123480ctcaagcaat cctcctggct aggcccctca aagtgttggg
attataggtg tgagccacca 123540catctagcct cctttgcaca ttttaaaagt ggattagttg
cctttttatc attgagttgt 123600aagagttgtt tatatattct ggatataagg cctttatcaa
atatatgatt tacaggtaca 123660gttgaccctt gaataacatg ggagttaagg gtgcttgctg
cctatgcagt agaaaatctg 123720tatatacctt tgacttcccc aaaatttaac tacagatacc
ctgctgttga ctggaagact 123780tactgaaaac ataaacaatt ggttaacaca tattttgtgt
atgtattaca ccctatattc 123840ttacaataaa ggaagccaga gaaaagaaaa tttaagaaaa
gcataagggg ctgggtgtgg 123900tggctcacac ctgtaatccc agcactttag gaggccgagg
caggtggatt gcttgagctc 123960aggagttcaa gaccagcctg ggcaacatgg tgaaaccttg
tctctattca aaaaaaaaaa 124020agaagaagaa aatatattta ctgttcatga agcagaagtg
aatcatcatg aaggtcttta 124080tcttcattgt cttcacattg agtaggatgt gagtaggctg
aggaggtgga agaagaggag 124140ggttggtctt gctgtctcaa ggtggcagag gtagaagaaa
aactgcatat aagtgggccc 124200atgcagttca aacccagttg ttaagggtca actgtatttt
ctcatatttt gtgggttgtc 124260ttttcatttt cttgatgttg ttttttaaag cataaaagtt
tttaaagttt tgatgaagtc 124320catttgatct tttattttct cttgttgctt atagttttgg
tggcatatct aagaatcctt 124380tgataaatct gaagccatgt agatttaccc ctgtgttttc
tcctaagaat tatatattct 124440tagctgtaac agataggttt ttgatacatt gttaattcac
ttttgcatat ggtgtgaggt 124500aagggtacaa cttctgttct tttgcatggt gctatccagt
tgtcccagca gcatttgttg 124560aaggctattc tttccccttt gaatggtcat ggcacacttg
tcaaaaatca gttgaccatt 124620gtagtaggct gttcttgcac tgctataaag aaatacctga
gcctgggtaa tttataagaa 124680aagaaactta aatggcttac agttctgtag gttgcacaga
aagtaaagca gcatctgctt 124740ctgggaagcc tcaggaagct tccaatcaca gtggaaggca
aagcaggagt aggcatctca 124800catggcgaga atgggagcaa gagtggtagg ggtgccgtat
acttctaaat gaccagatct 124860catgagaact tactcactat cgcaaatatg gcaccaagcc
ctgagagatc tgccccatga 124920cccaaacacc tcccaccatg ccccacctct agcatggagg
attacaattc aacgtgagat 124980ttgggtgggg acaaatattc aaactaaagc aaccatacac
acaaagtttt atttccagat 125040ttacagttct attgcattga tcactgacat gattactgtt
cctttgtagt aagttttgaa 125100attaggaagt atgaatcttc ctactttgta cttctttttc
tagatcattt tggctattct 125160gggtccttta taattccata tgaattttag attcagcttg
tcaatttcaa caaataagtc 125220agctgggatt ctgaaggggg ttgtgtcaca tccgtagccc
aatttgggga gtgtcttagt 125280ccatttgtgc tgctataaca aaatacctga aagtgggtaa
tctataaata agagaaattt 125340agcttgggca acatagggag accccatctc tgaaaaaaaa
aaaaattagg catggtggcg 125400tgcacctgta gtctcagcta ctcaggaggc tgaggtggga
gaatcacttg agcccaggag 125460gtcaaggcag cagtgaacca tgatccacta ttgcattcca
gcctgggcaa cagagcgaga 125520ctctgtctca aaaaaaaaaa aaaaaagaaa aagaacagca
atttatttcc tcacggttct 125580ggagctggga attccaatat caaggcactg gtaggttggt
gtctggtgag tgctactctg 125640tgctttcaag atggtgcctc ttggtttgtc ctcacgtggc
gaaggaggga aaggagaaac 125700agggacgaac agtgttcata gcagctcttt tatttataaa
ttgctaatcc tgttcatgag 125760agcagagccc tcatgactga atcacctcct aaaggcccca
cctcgaaata ctatcacgtt 125820gctggtttag tttcaataca tgaattctgg ggggacattc
agaccgtagc agggagcgtt 125880gccatcttaa caatattaag tattttgctc catgaacatg
ggatgctttt ctgtttattt 125940agatcttctt taatttcttt taacaagtat cattaatcca
aaaatctgaa atctggtatg 126000ctccaaaacc tgaaactttc tgagcactaa catgacagca
acagtggaaa attccacccc 126060tgtcctcacg cgatgggtcg cagtcaaaac ccaatcaaaa
ccttgtttca tgcacaaaat 126120tatttaaaat tttgtataat taccctcagg ctatgtgtct
aagcatatat gaaacacaaa 126180ttttatgttt agatttgggt cccatcccca agatatctca
tgtatatgca aatattccag 126240aatcaaaaaa tatatataga gagaaatttg aaacactttt
ggtcccatgc attttggata 126300agggattctt tacttgtatt ttgtagtgtt cacaatataa
actataatat aatacaaaca 126360atataagcta aaatataaaa tttgcacttc cgttgtcaaa
tttactactg agtattttat 126420tctttttgag ctaccataaa tgggactgca ttcctgttag
attttttctt tctttctttt 126480ttaagaaatg gagacttagg ccaggagcgg tggctcgtgc
ctataatccc agcagtttgg 126540gaggccgagg cgggcagatt gcctgagttc aggagtttga
gaccagcctg ggcaacacag 126600tgaaaccccg actctactaa aatacaaaaa attagccagg
catggtggct ggcgcctgta 126660gtcccagcta ctcaggaggc tgaggcagga gaattgcttg
aacccaggag atggaggttg 126720cagtgagctg agatcgcgcc atcacactcc agtctgggcg
acagaacgag actctatctc 126780taaaaaaaaa aaagaaaaag aaaaaaaaga aaagataagg
agccttgctg tgttgctcag 126840gctgacctca aactcctggg ctcaagtaat ccttctgcct
cagcctcccg agcagcaagc 126900agggaccgag cagcagggac cataggagta tgtcactgtg
cctggctaga tgtgtgtgtt 126960ttcttttttt ggagatagag tcttgctctg tccaggctgg
agtgcagtgg cacaatcttg 127020gctcactgca acctctgccg cctgggttta agtgattctc
ctgcctcagc ctccggagta 127080gctgggacta tagctgtgtg ccaccacacc cagctaattt
ttatattttt agtagaggcg 127140gggttttcca tgttggccag gctggtctca aactcctgaa
ctcaagtggt ctgcctgcct 127200tgacctccca aagtgctggg attacaggcg tgagccacac
tacacccggc ctagatatgt 127260ttctaaaaca acattgtgta ggaaattttt cttaggctgg
ttcaaggcta acatgctgaa 127320ggtcccaggt agtgaagcag gtgaaccctg aaggaccgtg
cccctcccac acccacccta 127380gcagctgttc cacattgtag atggatccct gcgtccctac
attgtcacct ttctgtctaa 127440atgagaggtg gggagaagga ggggtgaaga ccaatgaggt
tagtcacctc aaggggtggg 127500gccggtgcag agaaagatgg gaactgttcc ctagtcctct
tccttctgga gttgggaggg 127560ggaaatagcc ctgccccaac atgtgatcca gcatggacta
tggtcagtta caaatatgag 127620tgagccaggc ccctgccctg aaggacctcc cagatttgtg
gggaagacac aaactcagca 127680cagaatggaa aggggcctgg ggacacagtg ctgtgccgag
cttggaattt gtttcatgaa 127740gggacagcca gcagtgtctg tgcaggtcca gggaggctcc
acaggggaag tgagatttga 127800gctgaggctt gaaagaaagg tggaccgaga ggagacgact
ttccagggca ggaagtggcc 127860tgcacagagg ccccgaggcg agaaggacca caactctcct
gtggacctcc tctttgtacg 127920atcagtttta attacgcaaa aaaattatgc agttaaagct
ttgctcattt aaacagacct 127980tccctggatt aagagtaatt aggttttatt gtcattgtct
tgaaaattgc cttgccatgg 128040acctgtgaac acagaattat ttatatatgt gtgcatagaa
atcatctcat ggattttttt 128100ttaactgaga tgaacttggg ggattttaaa ggtaacaaac
aaccccttct ggctggtgtg 128160ggggaagccc taatttgtag tgcttgctac tttttttttt
tttttttttt ttgagacagt 128220gttttgctct cgtcacccag gctagagtgc aacggcacaa
tcttcactca ttgtaaaacc 128280tccgtctccc gggttccagc gattctcctg cctcagcctc
ccgagtagct gggattacag 128340gcacctgcca ccacacccgg ctaattattt ttgtattttc
agtagagatg gggtttggcc 128400atgttggcca ggctggtctc gaactcctgg cctcaagtga
tccgcccgcc tcagcctccc 128460aaagtgctgg gattgcaagc ataagccacc gcgcccggcc
tccactttct gtggtgtaaa 128520tattcctacc atggctgatt tcctctgggg aagccatgtc
ccctccactt ctaccttcat 128580ggtctttgct tgtttatttt caaatgtggc ccaaatccac
cctgaggcag gcgcagtggc 128640tcacgtctgt aatcccagca ctttgggagg ccgagacggg
cggatcacct gaggtcgaga 128700gtttgagcca gcctgactaa cacggggaaa ccccgtctct
actaaaaaca caaaaattag 128760ccaggcatgg tggcacactc ctgtaatccc agccactcgg
gaggctgagg caggagaatc 128820gctggaacct aggaggtgga gattgcagtg agccaagatc
atgccactgc actccagcct 128880gggagacaca gccagactcc atctaaaaaa aaaaaaaaaa
aaaaaaaaaa accaaaaaaa 128940cccaccctga gtttaaaaga aaaaggctaa gaaatgcttg
ctgggctgtc tcaggggtgg 129000cagtggagct cccgtgcctc ccacagatga ggacgtggag
gccctggccg agtgctgaat 129060gcaggttggg ggacagggct gccggcctct gtaaggatgg
ggaccccaga cccacccggt 129120ggagctgggg ctgaaggccg tgataaacac agcctcaccc
tggcattgcc cacctccact 129180ggccaccagc tttgacggac agccccactg ccactgcccc
tctccaggca ggccacattc 129240acaggtgttc tgtacagaga gcagaagttt gagaaaacgt
gacccactga acagacactt 129300gctgaggccc gggggcttca tgtgtcaggc ccagaggtgg
ccacaggcac aacacgcgtc 129360cctcagcctt gccacctctg ctgccctcca aggtgccctg
ggtccagctg gccagaagca 129420ctgctttctg gagcagccac ttatgctttt acttttttct
cttctttttt tttttaagac 129480tgagtcttgc tctgttgccc aggctggagt gcagtggcac
aatcttggct cactgcattc 129540tccagttcaa gtgattctcc tgcctcagcc tcctgagtag
ctgagattac aggcatctgc 129600caccatgccc ggctaatttt tgtattttta gtagagatgg
aatttcgcca tgttgaccag 129660actagtcttg aactcctgac cccaagtgat ccgcccgccc
aaagtgctgg gattacaggc 129720gtgagccaca gtgcccggcc cagctttttc attttgaaat
caatacaagt ttttacagat 129780gtctgcaaaa aatatccata ttgatggatc ctggctcctc
agaatctgag gtttttacaa 129840atgtctgcaa aaaatatcca tattgatgga tcctggctcc
tcagaatctg aggtgacagc 129900agcttgcaga ggacactgct cccccaacac tgccttccaa
gggagacatg ccctacccca 129960aggcctccct tgtcccctcc ctcctctgac agtcaggtcc
agggggtcct ggagagggca 130020ctggcccacc ctccacacat gttgcacagt gccctgggaa
tgaggaagga agggcaaagg 130080ttagggaaaa ccagacgtca gtttcttgac cagccacagc
cggtgcagcg cggttattta 130140tagccctggt ggagcgccca taaatcaagg ctttccccag
cagcgcggct gtgagggctc 130200cgagatgatc tcatggtgcc cctcccttga accatcccag
agaaaggagg cctgtcacct 130260tcaaggacca atgcttctgc cgggactcaa gcagaacctg
ttcctgctgc tgcaaggttt 130320acaggcagcc ttggggacag tccgtgcaga aatgtcagga
gccttttcct gcataaggca 130380tctcagagct tatagcacct ccttggcggt ggccgtgctc
gcagcctgat acatgggacc 130440tctgtctcac ttgcagtgtg caccgcaggt cctaattcct
tatctgttcc aaccacaaaa 130500gcttccactg ctggatggag ccctgttatc attcaggctt
ctgagccacc tggaaggcaa 130560aggcctctct gtgaaaatga aaacttggct ttaaaaagct
ggcatgggct cagccatgga 130620gttgcttaga tgtagttcac aatctcgcag acccacggca
ggattccacc caggagaagg 130680tcctggaggt ggccatgggg ctcagcagaa tgagccggga
ggagcaggac tgttgacacg 130740agagcagggg tcaggctcag tgcacaggtc ccagtgtgca
gagctcaccc ggccccagct 130800ttcatcaccc atctgagatc ggcctccttc actcctgttc
tctgggactt ccttgtgtgt 130860tagagactcg agcaagaccc tttcttgtct atccccggac
ggacccgcct ttcccgcatc 130920tctgggggtg aggtcggctc tcgttcaggc ctgcagggtc
cctggtgccg cagcccctct 130980ggttggaagg ccccctgcgt gcagccccgt cctccagccc
cgctcggctc tgaatgagtt 131040cattatgtct cagcgcgccc tggccgccag gctggcagtt
cttcgggctg gcggggctcg 131100ggagctgtca gcctgccaaa tccagttgtt tgacttcatg
tttcaaggcc aaatcctttc 131160aggaaagttc tctccatccc accactccac cgccccccgc
ccttagaaaa ataaatgcgt 131220gattgactgg ctttgcaggt tttttatcca tcgttctttc
caaagaatag tgtgaacagc 131280tcattcgatt cgttcatgtg acgtcctccc tctcccaatc
cctgtctctc tctctccttc 131340cacagccaca tgaaattgaa gtgggagaac atggtgtgtg
agcattattg ggggtggggt 131400gcggagcagg cactggcttt ggttgaggtc tgcaaaggaa
agcaccccca ctgccaggct 131460gctcaggagt ggctcccacc ctcgactgcg gggaagtgct
ggaaccctcc gcacgagggc 131520aacctttctt gggctctgaa ggcgcctctc atcctctgag
ccaagaagac ttctgaccca 131580gaattctgag ttgagttccg acgcaggcgt gggcgatggt
gagcaactcc aggctacccc 131640gagaaagccg ctgtgtgacc ccattagggg acttggctct
cctgccaagc ccccacccag 131700cacctgtccc ttgtgtcatg actggacttg ttacttctag
cccaggagct ccaggtgaag 131760gggactgtgt tggtcctgtc tgcccaggac tctccccacc
cgcacagtgt acctgtccca 131820cagtggcacc aacaaggctc agctgagccc ttgccagagg
agacacggtg ggcaccagcc 131880ccagcgctgc cacctgcctg cctcctctgc caccgaggga
agccatgtcg ccttcctgag 131940cctgcttctc tgtccacaaa acaaagacta gataattaga
gtgcacactt ggagggacag 132000gatggcttcg tatccatctg gctgtgccct gtaactcaag
ggatgggact gtgtcattgg 132060taaacactgc aacaggcctc ctttccagga aacaggatcc
ctgtctcaga cagaagcagt 132120ctcgcccatc ctggggcctg agagcatcat tcctttgtga
gactcaaatg tgaggaattc 132180ctgtttcctc aagtctgtgt ggggagtgca gagcagccct
ctgagggggt agcgtgtgca 132240ggaaataagc cggggggaag ggggcaccgc gggtgggtgt
taactgctgc ttcggggtcc 132300agtcctgtta atgaaagcag gggtgaggca agcggggaag
actctctgta cctgatcagt 132360ggcatgtgtg tggctccagc cggcatggcc tgagctgccg
gtgagggaga ggcctcaggg 132420agaaaagaca aaagctgggg tgcccctgaa ggtggggcag
agctggggca gtcaggggct 132480cttccacagg tgcaggcctg gtgctggagg acacagtcca
taccacctcc aacccccacc 132540acctcccctg ccatggacag tgtggggcag gctcctgagg
atgaggcagc ctgaccccta 132600cccctgggct gtccccccga ctcctggttt ctctggaagc
ctggggttcc aggccatgca 132660ggtagagaac aacaggttta ggacaaagtc attgtggaca
ctgaggccca gaaaagggac 132720ttgcccaagg ccacacagca ggtggccagc actgtgtccg
tgtctctgtt attccttcct 132780ctgcttgctt tgggtttatt ttgtgctcct tttctacttc
cttaagatgg gagcttcagt 132840gatcaatttc agatctttcc tctcttctaa caagcatttg
gtactatgaa tttccctctg 132900aatcctgctt cagctgcatc ccaacttctg ataccaggta
ggagtaattg ttctaagctc 132960cgcagctgac tacagagagc ggccacattt gagaaacatt
ctcatctctc tgatcttggg 133020tctctgacaa tggagctagc aacacttcct ggttttaagg
gctctgagga gcaaattaga 133080ctgtctggtg gtgaagaact ttgtttagga acaatttaag
atgccaacag gttcatctgc 133140tctcctgcct ccttctcccc accttgtaac acaccctaga
caagtcagca ctcaagttta 133200tatggacaaa taagcaagta aaattagtta ggaaatcaaa
tattggggtt ggggaacaat 133260aagagaaggc aagccctatg ggatggtaaa ctgcagtata
aagccccatt tgctggcaca 133320gtgtggtctg gcctgtgact aggtgaggga cagtgcgggg
tggcctgcct gcatatgcca 133380ggttctgtgt atcacatctc tcaccaggaa ggcaaaacct
ggcagatggc acccgggcct 133440gatggctcct caaccctctc agcaccccaa agagggaagt
cccatcaccc tcactcaccc 133500agccccgggc tggccctgtc agctctcgaa ctgcagcata
cccgagctct taaagcacac 133560tggacctagg gctcaggttt gagctttcat gatgagaccc
tgaggtcact ggcggggaaa 133620atgagtcact ggggcattcc ccgaacttgg gaaaagccct
gaccccagaa tcctgagcct 133680aaattgctgc caagttcccg atttcccttc cagtcctcag
tttccccttc tctccggcac 133740cctcctgagg accactgagc cccaaccacc accatgccca
tggctggtgc ccaggaggtg 133800ggagctgtgg agctgcctcc aggccttccc ggaggcccca
tggctgtctt acttggtttg 133860tggcttccca gtaaacggtg ggacaggacc agggtctgag
gaaaagcaaa gcagtgttga 133920cagagtgtcc tggagggcag cttgtctccc tggcctggag
atgaaaactg aagaaaacaa 133980gattccgtct aggaatcgtc cagggcgcgg cagcccgcca
ggagcccggg cagcttggat 134040ggggtccctg tgggcacaat cggcccagtg tggggcgcac
cgccccaggc aggctggcgt 134100cccgggggcc agaatgaatg gaccctcact gggccgcttt
gacagtttat gaggtgatga 134160cgttgcagct atgattgatg aggtggcctc agtcggcgtt
tcggggaacc gggtacaagt 134220tgctagggaa atacaggggg agggtgctgc cctccccact
ggcccgctct gggtccctgt 134280gctatgcctc agtctcccca tctgtgtgga ggaactgatg
aagacagcac atgccccctg 134340ccctggtccc tggattgcaa gtccagacag aagagagcgt
cctgggaagc cagctgagag 134400cacctctttc tctttgggtc ctgcagcccg tgggtggcag
ggcggtaacc ccatctgccc 134460agaaccaggc tgcgctctcg ggagggaaac agactcttat
cctcttaggt acagcctccc 134520tggccgcctc ggatgagcat ctgcccctcc cgggcctcag
tttccacacc tgagatggga 134580gaggttgcat cctaaaggcc ctccatctct caggttgggg
gatcactctt gtttccctgc 134640agttgtttcc agcattagag tcacactggg gtccttgttc
tagcccccgt gtgttggctg 134700catgacccga actttactcc catgtcctcc tcttgggggt
gtgacatgag gtcttcccca 134760agtggccccg agtgaggcac caccatggac ctgatgggca
gatcccccag ggcactctat 134820ccagagggtg gtggacccaa acgcgacgtc cccagtggcg
gtggctgagg agaggggttg 134880gggcgaggac cagctctggg cagtctccag gtcatgcgcg
tttgacgcag gaggttttgc 134940cggcgccagg ggctccctcg gctgaccgag ggtgtccctt
actccataag gtactgatat 135000ggtctgcgga gcaggtggca tttgccatgc ctgccctgct
tgggtatttg gccccagggc 135060cagggctgtc cctcccacag ctattgaaca ccaggtcctt
gctgcgtctg gctgggctgc 135120agaataccct tgacctttca gcatcctcag tacctctgtg
gagtgctgcc atgaccaaac 135180tagagactgt tattattacg cacattttaa gacagcatcg
gccggacgtg gggggctccc 135240accagcactt tgggaggctg aggcaggcgg atcacttgag
gtcaggagtt caacaccagc 135300ctggccaaca tggggaaacc ctgtctctac taaaaataca
aaaaaactag ctgggcatga 135360tggtgcatgc ctataatccc agctactcag gaggcggagg
caagagaatc acttgaacct 135420aggaggcaga ggttgtagtg agccaagatc atgccattgc
actccagcct gggcggcaga 135480gtgagactcc atctcaaaaa aaaaaaaaaa aaaaaaagac
agcatcactg tagccaaggg 135540aacttaaaag aaaaaagtgc atgtttagac agttgttcat
ttcctccagt ttcctgccag 135600cccccttctg tagctgtggg cacgtgcaca tataagtctt
ctctttctca ccctgctcat 135660tgttcatagc tgcataatat tttacagagc agataaactc
tgccggcctt aactactccc 135720agggtcgaac gtctcctttt ctatcgtaaa tatctctgca
attcctatag cttttttcct 135780tcaattcagt aaactttttt ttttttttta atttgagatg
gagtctcact ctgtcaccca 135840ggctggaatg cagtggtgtg atctcagctc actgcaacct
ctgcctcctg ggtccaagca 135900attctcctgc ctcagcctcc tgagtagctt ggaaaacagg
catgcaccac cacacccggc 135960taatttttgt attttttagt agagaggggt ttcaccatgt
tggccaggct ggtcttgaat 136020gcctgatctc gagtgatcca cctgcctcag cttcccaaag
tgctgggatt acaggcatga 136080gccaccgcac ccagcccctt tgctcttcct ttgtcttctg
ccatgattgt gaggcctccc 136140cagccatgtg gagttgtgag tccattaaac ttctttcctt
taaacattac ccagtctcag 136200gtatgtcttc attagcagca tgaaaacaga ctaatacacc
atgtcattta atgaggctgg 136260gcctgcctct ttctgatgga atgaactctc cccaggtgac
agcctctggt cccctctgcc 136320tttggagtga cttctcctgc atgggggttc tgggcagggt
tggggggtga acactggaga 136380gggatcagtc ccagctctgt tcttccccac ccgctgcaac
ctggttgggg acgcgaggag 136440ggcacccagc aggctgccag ggggaggagg agggactgca
gtctgggccc agagcccggt 136500tgtcccccat ggagccacac agcaggcacc tcagagtttg
catccattgg agatggccag 136560gagtctgctc gtgtgggaag aatggcttct gagcgccaag
gtagcctgtg ctggagcctt 136620ggctgccata gaaggccatt cacgtccctc acaccctggc
ctgtgctgct gggaaggctg 136680tcactcctaa acagttatct gcttctgtgt taacctacag
ggattccctc tttgctttgg 136740ctgccaggga actcaaagct aagttttttc gctcacattc
ccaggcttct ggtgtagata 136800tctcctttct tccttactgt ggtttctgag ccatctggtg
tagatatctc ccttcttcct 136860tactgtggtt tctgagccac cctgcaaact tcctctgcat
ctgtcccagt acaaatatca 136920ctgtgtgccc ttctaggcta tctgggacac ttaaaaataa
ataaataaag acatctgtct 136980gaggcttaac ttgcatatag caaagtgcat gaatctgaag
tgtgtggctg atgaatctgt 137040acacaccgct ctccctgcac agccacccag atgaagacga
ggagcacctc caatacccaa 137100gcctccctcc acccagtcaa cactcttctc cagagggagc
tagcccgtgc cttcaccccc 137160gtcttctgcc tgacggtaaa cttgaggtga atgggctcct
acccatgaac acccaccgtt 137220catttcatcc gtgttgttct gtgcagccag actccatcct
acggcatccg tgggtgatgt 137280ttcatcacga ggctggacca cgatgtgttt atcatgtcac
tgttgctgtc acatgggctg 137340tttctggctt ggggtcctta tgaacagagc tgatcagaac
atatctgtcc atggctcttg 137400gtggacatag cctcatttca tgtgtgtgac tcaggagtga
ggttgctggg tcccgcggca 137460ttcatgtgtt tagtgtagga gactcttgcc aaacatttct
ccagagtgtc tgatgtacca 137520tgttgcgctc ccgccagcct cgctgagttc ccatagcacc
atgtcttctg cagcacacga 137580tgtggccgga cctctccacg ctggccattc aggtggacgt
ggagcggtgt catttccttc 137640ctgaggggcc tgcatgaatc ccttttaggt ggaacacaca
cagagtgggc cggcgctcat 137700aagcagtccc gtgtgaatcc ttggagagct gtggaggtag
cacagctgca tctctaggtt 137760ggggccactc ttttcttgga acaagagccc taaacgtggg
accctaaaca acacccaggg 137820aggagcagcc tggcctatga gcagggggac accatgaagc
tgggtggacg ggtggggagg 137880cccgacttta agggacctgc ttcctaagca aacaggccag
gacagaggct gcaggtataa 137940gccctgccct gccaccatga tggtgaatat gaggtgtcaa
cctgattgga ttgaaggatg 138000cctagatggc tggtaaagtt ttgtttctgg ctggttaagt
gtgtctgtga gggtgttgcc 138060agaggagact gacttttgag tcggtggacc gggtgacgga
gacccaccct ccgtgtgggt 138120gggcaccatc tcatcggctg tccttgtggc tggaacaaag
cagatggagg aaggtgggaa 138180gactttgttt gctgggtctt ctggctttca tccgtctccc
gtgctggatg cttcctgctc 138240tgggacatca gactccaggt tctttgacct ttggactctg
gtttagctgg agctctcggg 138300aggtgtcggc ttccctactt ttgaggattt tggactcgga
cttggccatt accagcttct 138360ctcttcccca gcttgcgtac ggtctgtcgt gggacccgcc
tagtgatcgt gtgagccagt 138420tctccctaat aaactccctt tcatgtatag acatggatcc
tatgagtctt ttccctctgg 138480ggagccctgc ctaacacagt ggcatagagc tccaggcagg
cccctcacct ggccttcagc 138540ttcagtttct ttgcctactg ggctgaatat tggtgtcacc
acccaggata gcaagtcccc 138600atcagtgtca ctgccacctg tcgaagagat cccccagccc
agcactcggg tctgtgagtg 138660gcagtggaga tgacactagg ctctaagtag tggggctgcg
tgactccagc gcccctagga 138720ccagcaaggg cacccagagc actgtgggag gactggtgcc
gcctccctgc ccccctgctt 138780tctcctgggc cactgggcta gacatggggc ctccctgccc
ccatgtggtt tggcctgaaa 138840aagagaggag gtggggtggc acttctgggg cagcccagag
tgagtgccct gccctgagcc 138900cagcgtgtgc cctgcctgtg cccctgggtg gggggtgcat
acagacctcg gggggatctg 138960gccgttggag tgcgcactgg gagggacgga gtcagctctt
tttaatctaa acacacacca 139020ggtgcttaac gtgtttttat caaaatatca agcacagctg
caagaggctg ctctgctcca 139080aggccctacg cccaggcccg tccctctgcc aaagcgagcc
ttttaggtga gccctgccgg 139140ccacctgcta gcccaggccc cggctcctgg tgggccctgg
accaggctgc aagcatatcc 139200aagtgcatgg cagtgagacg tccaggcctc ccagccccaa
ctctgagcct ccgttgggtc 139260agtggtaaga tggggcagtg gcacctacct ccccggctgc
tgtgaaaacc aagtgagcgg 139320aagcatggca gcccctcacc aggtccccag tgggaggact
ctcaatgatt ggcactgcca 139380ggagtaatgt tatgagagga gcgtcccatc agcctggagg
ggacacaggc tgggtcagca 139440cccacagggg cttccgcttc tctgtctcag ggggtccagg
gtggcccctt gccactggtc 139500cccaggcaca gccgctacac aggtgcctct gcagggtctg
ggagtcagct gtccagggcc 139560tgcttggcca ccgcagggca ggagtgtggc actgtgagcc
tggtggcctg gctgggcgtg 139620gccttgaggt tctattgata accccatggg gaaagccccc
ctgaggtgct tgggcatcgg 139680acctgcccac catctcctga ggtggggcct gctctccaga
cagtcaggtc agcatgagtt 139740cccatgaggc cagggccagc acgaggaatt gctgggagcg
gtccatccac ccatccgtcc 139800gtccatccat ccatccatcc atccatccct atccacctgt
ccaccctgat gcagttcctg 139860ccatccacgt gtccatctgt ctacccgtac acctgttcat
ccccatccat ccatccccct 139920gtccatccct gcccattcat ccaccctgat gcagcgcctg
ctacccaccc ttctctgtcc 139980atccccatcc atccatctct gtccatccat ctgtccacct
gtccatcccc ctatccatcc 140040ctgtccattc gtccacccat ccaccctgat gcagcacctg
ccatccaccc ttttctgtcc 140100atccccgtcc atccatctcc atctatccat ctctgtcctt
ccatctccat ctatccatct 140160ctgcccatcc atccatccat ccatccatcc acctgtccat
ccccatccat ccatccctgt 140220ccatccatcc actcatccac cctgatgcag tgcctgccgt
ccacccttct ctgtccattc 140280ccatccatcc atccatgtcc atccatccat tcattgccca
tctgtccact cacccatcca 140340ctcatcccac ccacccatcc atccacccat ccgtccccag
ccatccatcc tcctgtccat 140400ccctgtgcat gcatcatcct tccaactgtc tgtttgtcca
ttcccatcca tctatccacc 140460catccaccca ggcgagcacc tgccatgccc acgacctgca
ccatctcggt gaatctccat 140520cgtagcccca ggaggggctc ttttcagctg cactttcagg
atgagagaag ggaggctctg 140580agtggccatt tgcccaaggt cacacagagc tcagggccag
cgcttcccat ggccaaagcc 140640caggccccct ctgccagtcc cgcctccagg cttgcaggtg
gagcagccag gggttccgta 140700ctccctgcca ggcagctgcc agccctcccc gctcccctct
cagttcctca gcctggctat 140760gggaggccca agtcccacgt gtcactggaa ttcgggaggc
ctggctctca aagctttctg 140820tacaaaacaa ggagaaaaag agaagccact tttgctgtct
ggacatgctg tgggtgtcct 140880ggggaatgtg ctggcccact gcttggggcc tgacttggag
tgcccaggag ctgcgtctgc 140940actgtggctt gaaccgagtg tcgtgctctg tggacagagc
catccactct gggggtgctc 141000gtcccatggc actgtcagcc cccattctgg aaggtcccaa
ctgggtcccc caggcctgtg 141060tcccagagac cttccaggtc ttcattcact gatcatgcat
ttattttggg ggtaggatag 141120gtcctgaaga ccacagggca gggatctctg agttggggct
tttgcggagt gagggagtga 141180gagggtgttc caggcagagg aggtggcaga gaaaggcacc
aagtgaggca ggggcagttg 141240gggacatgag tgctcgggag cggggagagg gggcagctgg
tgggtaagca gaggccagct 141300ccaaggcccc cctgtgggcc acattaggga gacagggtct
ggccttgttg gtagcgcggc 141360ccctcgcagt gatgggtccc ctgtcagacc tccaggatcc
tgggttgctg gtttggcgca 141420gccccggctt ctgtgagctt ttgtttctgt tactggcacc
tgcagctctt cctaggcccc 141480tgccctgggg cactggtgcc tgcttcgcca aaggcacaga
gggctgcagg cgactggggc 141540agggtggggg gctcgcctct tagggctcag cttcactggc
cagaaacagg gctggctcga 141600tctgggtgtt ggcctgggtg gggtggccag catgcccaca
cctctcagtc tggatgtggg 141660gagaggttgt tccatccagg gcagttggtc tccagaagga
aattctctgt atctgggtgc 141720tgcaacctag gtgcccgggg tctcacgcag tggggcctcc
ccggaggagg tgaagtgcat 141780gtggcccctc gtgctggatg gagctggact ctagggagga
ccaggagggt gagctcaggc 141840cagggcgcca aggccctgag ccccaggggc ctgcctgtct
gcccatcatt cctccactca 141900tggggaggga ctgcccccca gcccagattt ctaggagggc
cgaggctcac aggaagccaa 141960tgccaggcga taacccacag tccccagtgg tcagtaggaa
ggaggctgcc tgcttgtgtg 142020ctcattagga gccaaggcca tgaggccact aacggaatgt
ccctgatctc agggatctca 142080gtgtggaggt ggcagcacac aggtaaccaa acaattgtgg
cagattgatg ggtgctttga 142140tggggtccac acagttaccg tggagtgcag agtgggcacc
taactgagga ggcaagggtt 142200ggagggagga aggtttcagg gaagacttcc tggaggaggt
gacttgttct tagggaatga 142260ggcccaggga gctaaagctt tagggcaggg cagcagattc
agagacaccc ccaggggtag 142320ggggaagtgc actagagtga cgtgtgcctg gcgtgggaac
agggagacat tagagaatag 142380tgggagtgtg gcaggctggg cccgccttgc ccctttcaag
gaagcagctg ccgcgtcaca 142440cgctgcagcg cagtgctgcc catggtttgt cggcccatat
tgtgggatct ttcagatttt 142500ctcaagaagc caggtttcca gctcctaggt ttgaaaagtt
ctatgtgcgc ttgaccgggg 142560ggccttacgt acgtgaattg ggtgagggcc tgagacaggc
gggtgagcac ttgccccaca 142620ctccagccag gattgggggt tcaaggtacc gcatttgcca
cgtgagggag tctctggaga 142680ccaccaaccc gctggtgggc ggccaagcgt caccaccagt
ccctctcagt gcgtcccacc 142740tctgttctca gcccctgcgg tgcccgcagt cagtggggca
gtgagaaccg cgcaggaaat 142800agctggcgcc atggtgtcgg ttctggttga cagagtgcca
cctggctaaa cccagccgta 142860aataagcccc gtttcctccc tgtggacaga gactggtggg
tgccaaggtt cacggcaaaa 142920acccttcccc gccaggcccc actgctctct gcctggctgg
cacacgccct tgcctcaggc 142980ctggctcccc gggtcctgtt gaccaccagg ctgcagatcc
caaggaatgc aggcatctgg 143040gtggccccac gtggagcacc agccatgggg aggccgtggg
ctaccaggcc ctgcgtgcac 143100tcgcagcatg caccactcct tctccaggtc ccgttgacct
ccatgtgtgc tgtgccccgt 143160cctccgctgc actgcctctt cgtagtgccc cagctccatc
cgggggtggg tgggagtgga 143220gacccctgcc tacaggcctc cgggtctagc cagagtcaac
cagcccacca gagacaccca 143280gatgcaggct atgccccacg tgcccaggga ggatctgctg
acgtgagccc agatttcgat 143340tcttgtgcga gggagacgaa ccaggtttaa accccagctc
tgccacttct ggctgtgact 143400ttgagcatgg ctgttcatct ctgggggtct ctctcctcat
ctcagagaag cagatcatgg 143460tgcgcacccc atggcgcgct gtgcagagca catagagacg
ccacccacag ctggcttact 143520gcagaggtgt gggcccattg tggtatcgct gatgtcctgg
aggccacaag tctgctagct 143580agaggagcct cacctgcatc ccggggcctg cccactggct
ggtggagagc tctgggttgt 143640gggcagcagg caggcctatt tccagtgagt ctgtggctcc
tgggggtgtg tgagaccttg 143700ggcagatccc ttcctgtgcc tcggtttacc cactagggct
cagagccctg ggggactggc 143760tcagtgagtg cctccttctg cgaggagaga cttttgtcct
tactagccag aggacaggtc 143820agccggggca ggctctgggc cctgggcacc acccactgga
aggaaagggg gttcagttag 143880cccagccctg ccccccaatg gaagaaaatt ggggttcagt
tagcccagcc ctgccccacc 143940ccagctcttc acacatccag atccagatgg gggccagacc
accaggagac tgtaaggatc 144000caaggccgag gggcaggaga cgcagtgcgc ccctctccta
tttctgggtt gcaaccgccc 144060acacatgtgc acacagggaa gatgtctctt ttggttgggg
ggcacaggac cccacctcac 144120tcttagctct gagacaccct tgtcaggcct cctggacagg
tgggagggac agggcagggt 144180ggggtgtggc cagctgcagt ccctgggagg ggtactgggg
ctgggccagg ccagggcagg 144240acagagttgg ggaatccagg cccctcacca ttgccctggc
tctttgtatg tagtgctgcc 144300tgcgggcggc agcctagctg gacacctgca gtgcgctcat
gcccagcctc tgtgggtgtc 144360tggctcaccc cacgtgggag gaaggcgggc acgggtggtt
tgggctgggc aggcccagcc 144420cgggtgcaca gctttctcat ccacacacac attccatgtg
taggggctca gggggagggg 144480tcacctcctc cctccccacc cctccttgtt acagaggagc
agatggccaa acggatgggg 144540gacaagcacc tgacagggcc actgctcctg ctctcagctg
ggccagtggg aaagggagct 144600tggcacctcc aagggaggtc ccagccaggg gaagtgtgtc
cccatccctg ctctggtccc 144660caggccagag aggtgaggga cccagagagg cccaggatga
ggactccttc agtcctgggt 144720ttttgtttgg attttttttt tctgccaagt gacttcaaag
ccgcccgaga aggtcctgta 144780tgtgtgagca ggttgggcaa ggggagggcg ggtctccaag
ctgggagcct cctcgcggcc 144840ctcgcctcgg cactgatgtg gccctggacc ctccggtccc
tccaggccac ccttcgtgcc 144900aggacccagc tggcttcaat ggcaggaggt gccctgggct
ggagggtccc cacgctgggc 144960tccggggctc caggttgact cagcagtccc caggcctgtg
ggcagagctg gcctgggagg 145020acacgggcct gagggctgag caatccctga gtgagaacag
gtagctgagg ctggagcctg 145080tgtcccccag gcagctgcta aggcccgggc ctggggctgg
agtgctgggc tcaggtgcca 145140tgtgagctct ttgcaggcag ggaccaggtc agagtccttt
gtgtgcctct tgacccccag 145200gtcctgggag aagaggggct tgggaggcct tggctggatg
cacaggctcc atgggaagga 145260gctcctgctc gggtttcttt gtaagcagca cgacatggca
ccttcccaga ggtacatggg 145320tctgggcatg tcattccccg acctgcttcc agaacccagg
ggacggaggt cggagcagcc 145380tgcctgagat ctgtggccat ctccactccg ccacccagct
accccgtaag gaggccaccg 145440accctcgggc acctgccagg gacccagctc aggccaggtc
acgggctgga tgtgggatgt 145500tctgggcctt ctgcccacct cactgggccc cagcaagctg
ctgtcctgag ggtggccgtg 145560ctgggccagc acgtggacag cagggaaccc catgggcaca
ggcctagtgt gtgtgatgta 145620ccggtgcaga ccatctgata agcagccttc agagaagctg
cgggtgtgca ggtggaggca 145680gacctcccag ctgcatcctc caggcacttg ccagggtcag
gctctggtga acacccttct 145740agaaataacc ccccaggccc ccgggggtct tgggctcctg
gggttctgat ggaaggatgc 145800cccttgcccg gcctcactac ctccacctca tgaatcaccc
tctgtccttg tcctgcaagt 145860ctgtcggcac agccgagctg gcgtcaccct ctgatgccag
tgcccagatc cctgaattca 145920ctgggcagcc gccccctgca cctggctccc agccactgcc
tttccaaagc aaatggggac 145980agggatacct gcccctccct gggtaggcag ctgttgcggc
ctccagttgt gagagtctct 146040gccccaccct gccccttccc ctgcccctgc agggtagaga
ggaggcgggt gggccaccga 146100ctgggagtcc cgggcagctc aggtccctca agagcccctc
ggggagaggc cgcacccgtc 146160tcccttggtc agtgttcccc aggaggcggt gctggcctgc
gtgtgccctg gaggaggcgg 146220cagggagact ggcctcatgg ggctcttttg ctggggcctg
gccaccattt tccttcgatt 146280ggtacaaaga ctgtccctgt gaagctggcc aggccccagg
cacagaaccg ccttgttttc 146340tgcctctcac tcacccgttc cctgcattcg gctgacagag
catgaggggg aggaatcact 146400gatgacaggc actggcctgc ccagctgggg gcctttgttt
attcatttgg tgggcacttc 146460ctgggtgcct gctctgggtc aggcctgtgg gggggaccac
tgagggcagg aaacctggcc 146520tgtccctcca ggaagcgaag tcaacactgg cacctgcaga
tgaagtggca gagcagcccc 146580cagctttgat ggcatggggt ggttgggggg cacattctgc
atgctcagaa gagagagcaa 146640ctcgccctgt ggaaggagca tacagtggga gatggggaca
ggtgagtggg gtcttgaaac 146700gtgaacagga gtttgcaggt ggagcaggtt tcaagagcat
tccagggcac agcatgggca 146760atggcaggga gagctgggat gtctgaagag cagcagccca
gacactgagt ggctggcgag 146820gcgagggagg tccgcaggtg gggcccagtc aggatggact
catctgccat gctgaggagc 146880caggctatga tgtgggcaat agggagccac agcagtttac
acaggggagg cgggagcagc 146940tgagcattta gggaagccac tctggctgca aaagaggaga
gggggccagg caggagggag 147000agccagtgga gggggagggg gaaggggaag ggaacaggct
tcagcagaag acctcctgcc 147060ttcaaggggc ccgggtcacc ctgtgggttc cggtggttaa
ggccagcctc ctggtcctgc 147120agccacatgc ctcccagctg tgtgcctccg cttctacctc
cgaaacccgg gcagcaggac 147180ccaccctgca ggctcctgcg caaggatggg ggaggtacct
ggagggccca ccccacatca 147240gaagcaggtc ctcagccaga gacgcccaaa tgcagcccga
gggaggctgc ccccaggaga 147300ggagaggaga acccgcagcc gccccgcagg cgcctgggga
aatcactcac tgctcccacc 147360tcccagccag gccccgaggg ccaggcactc tggcttagtg
aagaagccac ccagcaaagg 147420cctggtgtct ccacaccatt ctcgggggga cagcagcaag
gacgatggac gggagcagtg 147480ctcagcctta acgtccctgt ttccggaagc gtccctgccc
cgctcagtcc tgccccactc 147540cctctcccat cagtgccctc agactccagg cccttttctt
tctttcccag cagtgctgac 147600aactgtgagt gtggaactgg actgtgaacc ctggagggca
gatggtggag gtgggagggg 147660gcagggccag cgccagggtg agctgcaggg tccccaggtg
caggggaaac gccaacttcg 147720tgtgcagtct tgcctctttt gtttcgtttc agagtggccc
ccagagccaa gccctatgct 147780accttacaga tgaggaaact gaggctcggg caggtgaaga
gaccggccct gggccatgca 147840cttttttggg gacagagcag ggtgctcagc tgtctggggg
cctgggctgg gcctggggca 147900gggggaggag cagcgctgtg acccccacgg ggagccgtca
gccaggcccg tagccaggga 147960agggcggagg ggcttaacgc aggcccacat ggccaggaca
cagaaccgtg ggggcagccc 148020cagcccttcc ctgcctgtta caggcccctg cccctgttcc
tgtgtcccat gcctgtcccc 148080tccctccatg tgtctgtctc tactcgcctg ccctctctct
gccctctttt ctctgcttca 148140tcctcttttt ccgtttctgc ctgtctcatg gttcttcctg
tccctgccct ccctccctcc 148200gtgtgtggcc gtgtctctct gtctgtctct tcctcctcct
ccagctctgt cccagtaggg 148260ccactccggg caggatggtc tgcggcccct cctgcttcac
cctctccgct ctgtacccac 148320tgccgaggac acacctttgc tgggtttctt ggttcagctt
ttgtaccgtg acatctctga 148380ggtccacagg tcacagggca ctggccagtg tggggcaggg
actcctcctg cccaggccat 148440tggacaaagc agagctgctt caggggcttt ggggccacca
ggccaggaac aggtcccctc 148500ctgcctaccc cagctccctg ggagacacag cccacagcag
cgtcacgtaa ttaagacata 148560aaagtccacg atttccgctg cgtgtgggca aacaccggcc
gggtgctaat gcaatccttt 148620cttttagtgg attcttgaat ttaaatgata tgtaattaaa
cataattagc cacggacgcc 148680attcggctag gtacacgtgc ggagtggggc tggccggatt
ctcattacaa agcgtcatct 148740tctatgggga gacctcttgt tccctctcgg ggctggttgc
ttttgcctgg acaataagcc 148800tggccccagg gcacctgggg aaggtgtcca aggagggggt
cctgggggcc agtcccccgg 148860gggctcggat gatgaatcat gggcggtgtt tgctggaaga
atttttgttc cgcttcgggc 148920aacaaggaac ccaaacctct cactgttgtg gggagaaaag
ttggaaatca gatgggaact 148980ggctgcggta ccagagagga cgtcagaccc gggccgcggc
ctcagccccc gactccttgt 149040aagcgtgggc tgggtgcaca cgtggcggct gcgagcagtg
ctttggagct catctcccca 149100gaccctagtg ggagctgagg gcagcctggc agggctgatg
caggcccatg cacacaggcc 149160aggcagaggt gagaccaggg aatagaagcg ttgcaggggc
agagggcagg agggggcaga 149220gcctcggcgt aggagtgagc ctactggtcc ttcccgcagg
caccctccct agtgccaggt 149280ctgctgggac tccaggctgt tggtatgccc cctagggtcc
tgcttggagc ctttctgacc 149340ctcctgatcc agtgcctgtg ccgggcttcc tccccactca
ccctgtcctc cctgtgcccc 149400atgtagccca cacatcagtg tgtccgtccg tctgcctgtc
tgccccccac cagcctgtgt 149460gcccagggga aggctcaagg ccggtcatgc tctacctccc
tggccagccc agccccagcc 149520cacggtgcat tccctggaca ggattgctga gttttccatt
ttagccacaa cagcgcccac 149580acccgcgcca accaacctgt ccttgccacg aatattggct
cctgccctcc tcatgggagc 149640ccatggtgca tgctattatg acctcggtct agaggcaggg
acaccaaggc acaggggctt 149700agggaagttg cctgaggtca cgtggctggt agggggcgga
gccagcttta tcctcatggc 149760acagatgagg aaactgaggc agagaggtgt tttgccaggg
ggctgattta ctgagctagc 149820cagccgttct ggttttgtct ctgtctcgct cagagggacc
cccaccccac ccacctgcct 149880acccagcttg ccgtgtcctc ggccagggcc agcagtctca
gaggtagagc tgctgagaga 149940agcctgtcat ctgtcctcca agagggatct gagtaagccg
gagagcatca gcagcctcaa 150000ggggaccttg gacccctgcc cacctttcat tctgtaccag
ggcagggtgt gtctgcacag 150060cctggagcca ggtgtggctg gcagccccag gaagacaggg
ccaaccctgc ccagcccaga 150120gcagccgcat ctctgcagca gcggtgcact tgagtggaag
cactggcttc ccaagagtaa 150180gagcttgaga aggggcctgt ccagagccct gggggacggt
ggacaggcca gcccaggggt 150240ccaggccttt cctgtagggg ccgaatcgtt accacttcag
gctctgccag cttccatttg 150300gtctctcttg catattactc tcttcttttt acaccccttt
aaaaatataa agacaggccg 150360caggtaggat ttggccctct ggcctacaca atagcaactg
tgagcattat atgtggcttc 150420tctttctttt tgagactctc atctgtcacc caggttggaa
tgcagtggtg tgatcacggc 150480tcaccacagc ctcaacctcc caggctcagg tgatcctccc
acctcagcct cccaagtagc 150540tgggactaca ggcgcctgcc actacacctg gctgattttt
gtactttttt gtagagatag 150600ggtttcacca tgttgcccag gctggtcttg acctcctggg
ctcaagcaat cctcctgcct 150660tagcctccca aagtgctagg atgacaggcg tgagctacca
cgcccggccc atatgtggct 150720ttttaattaa atttttgagg tataacatta gtaaagtgct
ctaaaagtga agaataggat 150780gttagcccca cccaggtcaa gatctagaac attcctggca
cccctattca cctttccgtt 150840aggaagctgc cgtcccaccc tctgtcacta gtctgccttt
gccagctctg gagctctctg 150900taagtcagtc ctgctgtgtg tattcctcga gtctggcctc
agttgctctt caggtctggg 150960gctcaccccc attgccgcgg caggaactca ccccagattg
agccgttgtg gagtattcca 151020gtgtgcaatt ctgccgtggt ctgttttccc atcctccccc
tggcgggcat ttgtccagtg 151080tttgactttt gctattgcaa ctaaagctgt gtggagcagt
gtgcctttgg tgaacgtgcc 151140tttggtgaac gtgaggatgc atttctgctg tgtacatact
taggagtggg actgcagggt 151200tgcaagtgga tatcaacacg cagttttcca atgtggcccg
taccatagaa ggcctcgtgg 151260tgatgccgag tccatattcc agcactgacg caggattgcg
tggtgccaag tccctatttc 151320agcactgacg caggatcacg catgtcttca atgcccctct
cctagatcgt agacaccctg 151380tgtggcatca cagaatgaca cacacaagac aagaggcctc
taaagccctg acactgacac 151440gggttgttct tgctggggtt gctaatggag cgggtgcggg
tgtgggtgcg ggtgagggtg 151500ctgatggagc gggtgagggt gcgggtgagg gtgctggtgg
agcgggtgag ggtgcgggtg 151560agggtgctgg tggagcgggt gagggtgagg gtgagggtgc
tgatgcagcg ggtgagggtg 151620agggtgaggg tgagggtgct gatggagcgg gtgagggtga
gggtgttgac ggagcgggtg 151680agggtgcggg tgagggtgct gatggagcgg gtgagggtgc
gggtgagggt gctggtggag 151740cgggtgcggg tgcgggtgca ggtgagggtg ctgatggagc
gggcgagggt gagggtgagg 151800gtgctgatgg agcgggtgag ggtgcgggtg agggtgctgg
tggagcgggt gcgggtgcgg 151860gtgcaggtga gggtgctgat ggagcgggcg agggtgaggg
tgagggtgct gatggagcgg 151920gtgagggtgc gggtgagggt gctgatggag cgggtgaggg
tgcgggtgag ggtgctgatg 151980gagcgggtga gggtgagggt gagggtgagg gtgttgacgg
agcgggtgag ggtgagggtg 152040agggtgctga tggagcgggt gagggtgagg gtgagggtgc
tgatggagcg ggcgagggtg 152100agggtgaggg tgctgatgga gcgggtgagg gtgcgggtga
gggtgctggt ggagcgggtg 152160agggtgttaa tgaagcgggt gcgggtgagg gtgctgatgg
agcgggtgcg ggtgaggggg 152220ccgcatactc atccccttgc ctggtccccc aggctcctac
tttggtgtgt gctcattcag 152280gagcagcaga gtggccgtgc ttgggacggc cagccaagcg
atccccactg gtactggcat 152340ctgtaggtgc ccctgcttgg agcgggtgct ggcaactgcc
ctttgctgga aggggcagca 152400atgtccagca gtggtggaag ctctcgaaag aacacagggc
tgggagtaga actcagtcag 152460atcccagctc tgctctccct gctgtgtgac ctatagcaag
tctctgaacc tttctgagct 152520gtttcctcac ctgttcaaga tgggggtgac cacttcttcc
ttggaggatg gaagggagag 152580ggatgtgagg ctcgtgagtg ctgtgctcgg agactgtgca
ggaggggaag cggccgtagc 152640cccatacaag caggtgggca aggaagtggg ggcgctggag
gggcttaggg gacaaggggt 152700tctgtccttg ggacctggac aagagccaaa aggtataccc
tcacagtcac aggggagccc 152760acactgccct cccctcccag ggactgcccc tgtggccccc
gtccaggtcc acagcctcag 152820cagtccctga gggccaggag gtagcctcca ggccactggg
ctgagaaccc ttaaaagaaa 152880aagcccccct gcccatgaca gtcattttag ggtcaagaaa
gccctttgca gaaaccaaac 152940cctttgaaag ggccaacacg gcttttagat gacgtgagaa
ccaacttcaa gcacttcctc 153000tcagaccagg gtggcagctt ccaaagtggg ggcgctgctg
ccgggagact gtggtttccc 153060tgcttctcag ggagcttggg ccaggccgtg gggagatcca
agcaacatgt ccctcggtag 153120tggtaggttt gcagacgggc cgctccccac ttcctgccaa
gctccagcgg cacctctgtg 153180ggtgcaaggc tgtggggagg ggccttctgg ctcacagccc
tccaggctgc ccacagcccc 153240taggatgagg tgccacttcc tcaaatgact gtcaagactg
ggatgggggc cgggcacggt 153300ggctcactcc tgtaatccca gcactttggg aggctaaggc
gggcaaatca cttgaggtca 153360ggagtcggag accggcctgg ccaacatggc gaaaccctat
ctctacacta aaaaatacaa 153420aaaaagccgg gcgtggtggc gggtgcctgt agtcccagct
actcaagagc cgaggcagga 153480gaatcgcttg aaactgggag gcagaggttg cagtgagcca
agatcgcacc attgcactcc 153540agcctgggca acagagtgac acttggtctc aaaaaaataa
aaaataaaaa aggctctgga 153600tgtctccacg aacccacctg accccttcct gctcctgggg
ctctcctctc ccacccacgt 153660gtctacccgt agctcctgcc tgccttcagg actcagccca
cagccccttc cacaaagtcc 153720cctggcaggg cccggaggcc tctgcactgt ctcatggctt
cattcctgtg tcctcagcat 153780ccagtgagac ctcaggagat gcctggggaa tggctgaagg
ctttggagag gttgctgccc 153840ccagaatccc agccagaggg cagtttatcc aggagcccgg
tgcctcctct gttgggtggg 153900cgtggcctca gagggcacca accagaagac acatgctgtc
actgacatgc gggcccccag 153960cgttagggca ggcaggagct ccgagtctct accccagctg
tccccacagt gcacatggac 154020taggctcctc ccacggggca ctaggccagg ccaggggtgt
ggggtgagcc cctggggagc 154080ccagagcagg gtacactcat gtccccacca tccaaggtga
gatgtctgtg ctggaggccc 154140agagggagct gggccggctg tcactccacc acaaggctga
gcaggctccc cagtgggaga 154200caggtttggg gaggtgtcca ggttgagggc acagtcagga
ggggccatgt gtgcctgagt 154260agggcagggg gaggaggggg atggccagtg gccacgggcc
aggctgcctg gcaggacctg 154320gaggtttcct caagctaagg ccaggcccat gttagatgct
gggagggtgg gtgtggctgc 154380agaggggtca gcgtgggggc actgactaac ctccggccat
cctctccagg cccagtgacg 154440agcaccatcc ggaagtgaag gctgatgggt acgtggacaa
cctcgcagag gcagtggacc 154500tgctgctgca gcacgccgac aagtgatggc ctcctgggag
agccccgcct cctccacccc 154560tgcctctcct ccacccctgc ctcccctcca cccctgcctc
tcctccaccc gcccaggaga 154620gccccacctc ctccacccct gcctctcctc cacccctgcc
tcccctccac ctgccccagt 154680gcccagacca accaaggccc tgacagccct gccttctgcc
ctctgccctg catgggcagg 154740catttgttcc ctacctgggt ggcctgctcc cctgcctggg
ccctgacttc agctccctgt 154800agtgaagtcc aggagggtgg gacaggcctg tcaggcctct
gggaatctcc caaatcccag 154860aactcaccac tcaccatggg cctttaaatg cagtaaactc
cacctaacca gattcagggg 154920cactatgccc actgcctcct cttcagactc tttgcatttc
agtgaagagc ctggaagaaa 154980cccaggggcc tcctatgcac agatcttgca gcccagaacc
aagtcagcct ccctgcgact 155040gcccaggcac actgcccacc accccacccc cgaaacaatg
ccagcccgct gctttttcta 155100tcctcccagt cacctttgca gacaaagacc aggggcagct
cccgagggca ctgtgaaggc 155160tcccatgcca cacagtgaga actgtagcct ctgcgtccaa
ggcacacagg gtactttctg 155220gacccactgc tggacagact tgaaggtgtc atgcccggtg
tgtgcaggag gaaactaaca 155280gttcagtaaa ctctgccttg accagcagcc tttgactcag
gctttgactg ctaggaagaa 155340ccgtatgggg gaggccacca gcagggtctg ggccactgtc
tctagtcctg ccttatgctt 155400gagccactga atatcagagg tgcgagggac aagggccctg
aaacacctca cctgctccag 155460ccccttcact tagcagatgg ggaaactgag gcccagaggg
gccagtgagc tgctgttggc 155520cctatctgga acgaggcagt ccagggcaaa ctttggcact
gccttcccta acggaacagc 155580ctgtggcctg ggggtggtga gctttgcttt cccgaccaag
ggcgcggcgg ccttcccaca 155640ggggccctgg gaacaaatca ctcataactg aagtttcagg
tttcagaatc agtcagctga 155700acagataact tgattgctcc atttctcccc aaatcagttc
agaagctact gataaccctt 155760gagagactgc ttctttattt tattttatta ttattatttt
tgaaatggag ttttgctttc 155820gttgcccagg ctggagggca ctggcgtgat ctcagctcac
tgcaacctcc gcctcctggg 155880ttcaagctat tctcccgcct cagcctcccg agtagctggg
attataggtg cccgccacca 155940cgcctggcta attttttgta tttttagtag agacagggtt
tcaccatgtt ggtcaggctg 156000gtctcgaact cccaacctca ggtgatccgc ccacctcggc
cttccaaagt gctggtatta 156060caggcatgag ccactgcgcc cggcctggag agcgcttctt
atgagagaag ccatgggtct 156120ccggagcagg atttccacca catgaaggtg gacgataaga
aagtgtggga cccacgccgg 156180ggaggggcaa gtacagcccc cgcaaggccg cccattaaat
caggattggg ggaaacggtg 156240acagtctttt cagcaagatg agaggccacc agccgggcca
gtgtcactgg atttgaggga 156300gggccatttg gccccttagt tggaaaatat gcagatggct
ggctccttga tggagtcccc 156360tgatggcagc tccttaggag acaatacagg acgttcaggc
ggcggctgca gttcagtggc 156420agcctggagg ggctgtgcgt gatgccgggc tgtgcgtgat
gccgggctgt gggtgatgcc 156480atgctgtgcg tgatgccggg tttgtgggtg atgccgagtt
tgtgggtgat gccgggctgc 156540gggtgatgcc gggctgtgcg tgatgccggg tttgtgggtg
atgccgggct gtgcgtgatg 156600ccgggtttgt gggtgatgcc gggctgcggg tgatgccggg
ctgtgcgtga tgccgggctg 156660cgggtgatgc cgggctgcgg gtgatgccgg ggctgtgtgt
gatgccgggt ttgtgtgtga 156720tgccgggctg tgcgtgatgc cgggctgggg gtgatgccgg
gctgtgcgtg atgccgggct 156780gtgcgtgatg ccaggctgtg cgtgatgccg gtttgtttgt
aggtctttag ttttttcttt 156840cttttgtgtt taatggggaa gcttcccgcc aagacagcaa
cccctctggg ggaaggagtt 156900caagctgaga cctctttttt tttttttttg agacggagtc
tcgctctgtc gcccaggctg 156960gggtgcagtg gccaatctcg gctcactgca aactccgcct
cccgggttca caccatttcc 157020ctgcctcagc ctcccgagtg gctgggacta caggtacctg
ccaccacgcc cagctaattt 157080tttgtatttt ttagtagaga cagggtttca ccgtgctagc
caagatggtc tcgatctcct 157140gaccttgtga tccgcctgcc tcggcctccc aaagtgctga
gattacaggc gtgagccact 157200gcgcccagcc gagacctcct ctttcaagtt gctacccagg
agtggggtgc ccagagggac 157260agcccctgga ggtgcagccc cctgctttgt tggggcggga
gcatgccacc ctctgagccg 157320cggtgggctc agtgaagtgg gtgttgtcat ccttcccact
gcacgggtct agagtctggg 157380gtgcggtgag ctcgcggctt gctgagggcc aacagttagt
gaggggtgga gctgcgactt 157440gcaggaaggt ccagccccgc ctcccaggcc ctttctagcc
ccttctgttg ccgtgctgct 157500tcccaccctt gcctggaatc ctcctgtggg ggacgcatgt
ggtggcccct gtggctgact 157560ttccggctga gccagggcac cagtgtcctc actctagaga
tgagctcttc tgaggggtga 157620ggtccagcca cacctgtggt ggggtggtgt agcccgagcc
tgttaccccg tctataaaac 157680agagggattc acttctttta caggcacttc ctggccctgc
ggggtgccct gtctctggct 157740ggagctgatg agccgtggtg tgatgcttgc ctgggggtcg
gtgttgccta gaagtgagga 157800atgacctcct tcacccctgg ctccttccgg cagcctggcc
acttccagca tgtggagggg 157860gcccgggccc tggcagagac ggtgtggaac cgccggccgc
atgcggggct ggagttcggg 157920gctgccaatg ggctaaggcc ctgtgttcaa cacatgtcgc
ggaactgtcg tgaaaaaaag 157980actcaggacc acaggtccgg ggtgctccca ggggtctttc
ccaccgaggc tgggagtgag 158040cagcagccta ggcccagcca cacctcccag gctcgggagc
tgccaggcca ggcaaccgtg 158100aagggtagca gggtgggggg tggaggggag gacagggaga
gcaggcctct cgggcctcac 158160ccacccttgc accttccagc tctagggccc acacagtacc
aagccctggg ctcaatgtac 158220cacttatgaa aactcaattc aaattggctt agtgcaacaa
ggtatttggt ggtctcacat 158280gacttgaaca tccaggcctg gctgtcacgg gaaccgcatc
tcttcccatt gcagctactt 158340ggcaggtggc gggatgtccc ccagccaccg acgtccccct
gcctgctccg caaccccagg 158400gcctgcagaa aaggcccacg agactcagac tggcagagac
ttaggcggac caggaacagg 158460ggcgcagtct ccgtcccacc caaaccctaa ccagagagaa
cacggcacgt tgtgccagac 158520ggaggacgga tgccagcgag ggtccatgtc ctcactgccg
acaaggctgg gaactgggcc 158580aagtgaagca gaggcctcca cgtcagatgt gagcgccacc
ggcccaggtg actgcagttc 158640ttccctcctt ccgttcggct tgagccctcc agaggatcgg
aaaggctgag gcctgacctg 158700gtgccgctgt cctgggtggg tctgtcctgc tggtcggttc
ctgcccctct cggggaggtt 158760ggctggcagc tggcaggtgg aaagcctcct gtgttcacct
cagggcagag gtggggacac 158820agggcgggac gggcgagtgt ggtgcccctc tggggtgggt
gctcttgggt ccgcctcccg 158880tgccagagtg cgtgtcaaca gttccagctg cccctcagaa
ctgtcctggt ttaggaggtg 158940aacacacggg gcagcctaca ttctacgtgg tttttttaac
attataaaag cagcatgtgt 159000tattacagga aatttaacag aagtatataa aaagaaacca
gaagtca 159047
2 1194 DNA Homo sapiens
CDS (351)..(410) CDS (414)..(587) CDS (591)..(686) CDS (693)..(770) CDS (811)..(849) CDS (853)..(1062) CDS (1066)..(1071) CDS (1075)..(1161)
2
attcggctga
cagagcatga gggggaggaa tcactgatga caggcactgg cctgcccagc 60tgggggcctt
tgtttattca tttggtgggc acttcctggg tgcctgctct gggtcaggcc 120tgtggggggg
accactgagg gcaggaaacc tggcctgtcc ctccaggaag cgaagtcaac 180actggcacct
gcagatgaag tggcagagca gcccccagct ttgatggcat ggggtggttg 240gggggcacat
tctgcatgct cagaagagag agcaactcgc cctgtggaag gagcatacag 300tgggagatgg
ggacaggccc agtgacgagc accatccgga agtgaaggct gat ggg 356
Asp Gly
1 tac gtg gac aac ctc gca gag
gca gtg gac ctg ctg ctg cag cac gcc 404Tyr Val Asp Asn Leu Ala Glu
Ala Val Asp Leu Leu Leu Gln His Ala 5 10
15 gac aag tga tgg cct cct ggg aga gcc ccg
cct cct cca ccc ctg cct 452Asp Lys Trp Pro Pro Gly Arg Ala Pro
Pro Pro Pro Pro Leu Pro 20 25
30 ctc ctc cac ccc tgc ctc ccc tcc acc cct gcc tct cct
cca ccc gcc 500Leu Leu His Pro Cys Leu Pro Ser Thr Pro Ala Ser Pro
Pro Pro Ala 35 40 45
cag gag agc ccc acc tcc tcc acc cct gcc tct cct cca ccc ctg cct
548Gln Glu Ser Pro Thr Ser Ser Thr Pro Ala Ser Pro Pro Pro Leu Pro
50 55 60 65 ccc ctc
cac ctg ccc cag tgc cca gac caa cca agg ccc tga cag ccc 596Pro Leu
His Leu Pro Gln Cys Pro Asp Gln Pro Arg Pro Gln Pro
70 75 80 tgc ctt ctg ccc tct
gcc ctg cat ggg cag gca ttt gtt ccc tac ctg 644Cys Leu Leu Pro Ser
Ala Leu His Gly Gln Ala Phe Val Pro Tyr Leu 85
90 95 ggt ggc ctg ctc ccc tgc ctg ggc
cct gac ttc agc tcc ctg tagtga 692Gly Gly Leu Leu Pro Cys Leu Gly
Pro Asp Phe Ser Ser Leu 100 105
110 agt cca gga ggg tgg gac agg cct gtc agg cct
ctg gga atc tcc caa 740Ser Pro Gly Gly Trp Asp Arg Pro Val Arg Pro
Leu Gly Ile Ser Gln 115 120
125 atc cca gaa ctc acc act cac cat ggg cct ttaaatgcag
taaactccac 790Ile Pro Glu Leu Thr Thr His His Gly Pro
130 135
ctaaccagat tcaggggcac tat gcc cac tgc ctc ctc ttc aga ctc ttt gca
843 Tyr Ala His Cys Leu Leu Phe Arg Leu Phe Ala
140 145 ttt cag
tga aga gcc tgg aag aaa ccc agg ggc ctc cta tgc aca gat 891Phe Gln
Arg Ala Trp Lys Lys Pro Arg Gly Leu Leu Cys Thr Asp 150
155 160 ctt gca gcc cag aac caa
gtc agc ctc cct gcg act gcc cag gca cac 939Leu Ala Ala Gln Asn Gln
Val Ser Leu Pro Ala Thr Ala Gln Ala His 165
170 175 tgc cca cca ccc cac ccc cga aac aat
gcc agc ccg ctg ctt ttt cta 987Cys Pro Pro Pro His Pro Arg Asn Asn
Ala Ser Pro Leu Leu Phe Leu 180 185
190 tcc tcc cag tca cct ttg cag aca aag acc agg ggc
agc tcc cga ggg 1035Ser Ser Gln Ser Pro Leu Gln Thr Lys Thr Arg Gly
Ser Ser Arg Gly 195 200 205
210 cac tgt gaa ggc tcc cat gcc aca cag tga gaa ctg tag cct ctg
cgt 1083His Cys Glu Gly Ser His Ala Thr Gln Glu Leu Pro Leu
Arg 215 220
cca agg cac aca ggg tac ttt ctg gac cca ctg ctg gac aga ctt gaa
1131Pro Arg His Thr Gly Tyr Phe Leu Asp Pro Leu Leu Asp Arg Leu Glu
225 230 235 240 ggt gtc
atg ccc ggt gtg tgc agg agg aaa ctaacagttc agtaaactct 1181Gly Val
Met Pro Gly Val Cys Arg Arg Lys
245 250 gccttgacca gca
1194
3 140 PRT Homo sapiens
3 Met
Gly Thr Trp Thr Thr Ser Gln Arg Gln Trp Thr Cys Cys Cys Ser 1
5 10 15 Thr Pro Thr Ser Asp Gly
Leu Leu Gly Glu Pro Arg Leu Leu His Pro 20
25 30 Cys Leu Ser Ser Thr Pro Ala Ser Pro Pro
Pro Leu Pro Leu Leu His 35 40
45 Pro Pro Arg Arg Ala Pro Pro Pro Pro Pro Leu Pro Leu Leu
His Pro 50 55 60
Cys Leu Pro Ser Thr Cys Pro Ser Ala Gln Thr Asn Gln Gly Pro Asp 65
70 75 80 Ser Pro Ala Phe Cys
Pro Leu Pro Cys Met Gly Arg His Leu Phe Pro 85
90 95 Thr Trp Val Ala Cys Ser Pro Ala Trp Ala
Leu Thr Ser Ala Pro Cys 100 105
110 Ser Glu Val Gln Glu Gly Gly Thr Gly Leu Ser Gly Leu Trp
Glu Ser 115 120 125
Pro Lys Ser Gln Asn Ser Pro Leu Thr Met Gly Leu 130
135 140
4 117 PRT Homo sapiens
4 Met Pro Thr Ala Ser Ser Ser
Asp Ser Leu His Phe Ser Glu Glu Pro 1 5
10 15 Gly Arg Asn Pro Gly Ala Ser Tyr Ala Gln Ile
Leu Gln Pro Arg Thr 20 25
30 Lys Ser Ala Ser Leu Arg Leu Pro Arg His Thr Ala His His Pro
Thr 35 40 45 Pro
Glu Thr Met Pro Ala Arg Cys Phe Phe Tyr Pro Pro Ser His Leu 50
55 60 Cys Arg Gln Arg Pro Gly
Ala Ala Pro Glu Gly Thr Val Lys Ala Pro 65 70
75 80 Met Pro His Ser Glu Asn Cys Ser Leu Cys Val
Gln Gly Thr Gln Gly 85 90
95 Thr Phe Trp Thr His Cys Trp Thr Asp Leu Lys Val Ser Cys Pro Val
100 105 110 Cys Ala
Gly Gly Asn 115
5 1092 DNA Homo sapiens
5 tgagaaccca
gggctcgctg ggatgctgtc ctccctccct ccctccctct gccaagctgg 60tgggaaggcc
cttgccaaag cgcacaagac cccttcacac agcaggggca caagctcttc 120gaggcagagt
cgcctgcagg tggggtggaa ggggtgcgcc ccagacccaa gaatcgcccg 180cctctccaag
accatccctg gctgctggcc cagtgacgag caccatccgg aagtgaaggc 240tgatgggtac
gtggacaacc tcgcagaggc agtggacctg ctgctgcagc acgccgacaa 300gtgatggcct
cctgggagag ccccgcctcc tccacccctg cctctcctcc acccctgcct 360cccctccacc
cctgcctctc ctccacccgc ccaggagagc cccacctcct ccacccctgc 420ctctcctcca
cccctgcctc ccctccacct gccccagtgc ccagaccaac caaggccctg 480acagccctgc
cttctgccct ctgccctgca tgggcaggca tttgttccct acctgggtgg 540cctgctcccc
tgcctgggcc ctgacttcag ctccctgtag tgaagtccag gagggtggga 600caggcctgtc
aggcctctgg gaatctccca aatcccagaa ctcaccactc accatgggcc 660tttaaatgca
gtaaactcca cctaaccaga ttcaggggca ctaatgccca ctgcctcctc 720ttcagactct
ttgcatttca gtgaagagcc tggaagaaac ccaggggcct cctatgcaca 780gatcttgcag
cccagaacca agtcagcctc cctgcgactg cccaggcaca ctgcccacca 840ccccaccccc
gaaacaatgc cagcccgctg ctttttctat cctcccagtc acctttgcag 900acaaagacca
ggggcagctc ccgagggcac tgtgaaggct cccatgccac acagtgagaa 960ctgtagcctc
tgcgtccaag gcacacaggg tactttctgg acccactgct ggacagactt 1020gaaggtgtca
tgcccggtgt gtgcaggagg aaactaacag ttcagtaaac tctgccttga 1080ccagcagcct
tt 1092
6 1022 DNA Homo sapiens
6 ctccgagtct ctaccccagc tgtccccaca gtgcacatgg actaggctcc
tcccacgggg 60cactaggcca ggccaggggt gtggggtgag cccctgggga gcccagagca
gggtacactc 120atgtccccac catccaaggc ccagtgacga gcaccatccg gaagtgaagg
ctgatgggta 180cgtggacaac ctcgcagagg cagtggacct gctgctgcag cacgccgaca
agtgatggcc 240tcctgggaga gccccgcctc ctccacccct gcctctcctc cacccctgcc
tcccctccac 300ccctgcctct cctccacccg cccaggagag ccccacctcc tccacccctg
cctctcctcc 360acccctgcct cccctccacc tgccccagtg cccagaccaa ccaaggccct
gacagccctg 420ccttctgccc tctgccctgc atgggcaggc atttgttccc tacctgggtg
gcctgctccc 480ctgcctgggc cctgacttca gctccctgta gtgaagtcca ggagggtggg
acaggcctgt 540caggcctctg ggaatctccc aaatcccaga actcaccact caccatgggc
ctttaaatgc 600agtaaactcc acctaaccag attcaggggc actatgccca ctgcctcctc
ttcagactct 660ttgcatttca gtgaagagcc tggaagaaac ccaggggcct cctatgcaca
gatcttgcag 720cccagaacca agtcagcctc cctgcgactg cccaggcaca ctgcccacca
ccccaccccc 780gaaacaatgc cagcccgctg ctttttctat cctcccagtc acctttgcag
acaaagacca 840ggggcagctc ccgagggcac tgtgaaggct cccatgccac acagtgagaa
ctgtagcctc 900tgcgtccaag gcacacaggg tactttctgg acccactgct ggacagactt
gaaggtgtca 960tgcccggtgt gtgcaggagg aaactaacag ttcagtaaac tctgccttga
ccagcagcct 1020tt
1022
7 1186 DNA Homo sapiens
7 tgcattcggc tgacagagca tgagggggag
gaatcactga tgacaggcac tggcctgccc 60agctgggggc ctttgtttat tcatttggtg
ggcacttcct gggtgcctgc tctgggtcag 120gcctgtgggg gggaccactg agggcaggaa
acctggcctg tccctccagg aagcgaagtc 180aacactggca cctgcagatg aagtggcaga
gcagccccca gctttgatgg catggggtgg 240ttggggggca cattctgcat gctcagaaga
gagagcaact cgccctgtgg aaggagcata 300cagtgggaga tggggacagg cccagtgacg
agcaccatcc ggaagtgaag gctgatgggt 360acgtggacaa cctcgcagag gcagtggacc
tgctgctgca gcacgccgac aagtgatggc 420ctcctgggag agccccgcct cctccacccc
tgcctcccct ccacccctgc ctctcctcca 480cccgcccagg agagccccac ctcctccacc
cctgcctctc ctccacccct gcctcccctc 540cacctgcccc agtgcccaga ccaaccaagg
ccctgacagc cctgccttct gccctctgcc 600ctgcatgggc aggcatttgt tccctacctg
ggtggcctgc tcccctgcct gggccctgac 660ttcagctccc tgtagtgaag tccaggaggg
tgggacaggc ctgtcaggcc tctgggaatc 720tcccaaatcc cagaactcac cactcaccat
gggcctttaa atgcagtaaa ctccacctaa 780ccagattcag gggcactatg cccactgcct
cctcttcaga ctctttgcat ttcagtgaag 840agcctggaag aaacccaggg gcctcctatg
cacagatctt gcagcccaga accaagtcag 900cctccctgcg actgcccagg cacactgccc
accaccccac ccccgaaaca atgccagccc 960gctgcttttt ctatcctccc agtcaccttt
gcagacaaag accaggggca gctcccgagg 1020gcactgtgaa ggctcccatg ccacacagtg
agaactgtag cctctgcgtc caaggcacac 1080agggtacttt ctggacccac tgctggacag
acttgaaggt gtcatgcccg gtgtgtgcag 1140gaggaaacta acagttcagt aaactctgcc
ttgaccagca gccttt 1186
8 1005 DNA Homo sapiens
8 ctccgagtct
ctaccccagc tgtccccaca gtgcacatgg actaggctcc tcccacgggg 60cactaggcca
ggccaggggt gtggggtgag cccctgggga gcccagagca gggtacactc 120atgtccccac
catccaaggc ccagtgacga gcaccatccg gaagtgaagg ctgatgggta 180cgtggacaac
ctcgcagagg cagtggacct gctgctgcag cacgccgaca agtgatggcc 240tcctgggaga
gccccgcctc ctccacccct gcctcccctc cacccctgcc tctcctccac 300ccgcccagga
gagccccacc tcctccaccc ctgcctctcc tccacccctg cctcccctcc 360acctgcccca
gtgcccagac caaccaaggc cctgacagcc ctgccttctg ccctctgccc 420tgcatgggca
ggcatttgtt ccctacctgg gtggcctgct cccctgcctg ggccctgact 480tcagctccct
gtagtgaagt ccaggagggt gggacaggcc tgtcaggcct ctgggaatct 540cccaaatccc
agaactcacc actcaccatg ggcctttaaa tgcagtaaac tccacctaac 600cagattcagg
ggcactatgc ccactgcctc ctcttcagac tctttgcatt tcagtgaaga 660gcctggaaga
aacccagggg cctcctatgc acagatcttg cagcccagaa ccaagtcagc 720ctccctgcga
ctgcccaggc acactgccca ccaccccacc cccgaaacaa tgccagcccg 780ctgctttttc
tatcctccca gtcacctttg cagacaaaga ccaggggcag ctcccgaggg 840cactgtgaag
gctcccatgc cacacagtga gaactgtagc ctctgcgtcc aaggcacaca 900gggtactttc
tggacccact gctggacaga cttgaaggtg tcatgcccgg tgtgtgcagg 960aggaaactaa
cagttcagta aactctgcct tgaccagcag ccttt 1005
9 1765 DNA Homo sapiens CDS(166)..(975)
9 aagaacttta aaaatcacct aggtgtgggc cgggcacggt
ggctaacgcc tgtaatccca 60gcactttgag atgctgaggc aggtggatca cgaggtcagg
agatcgagac catcctggat 120aacacggaga aaccccggcg gagctgagga gcagggccgg
gcgcc atg gca ccg tgg 177
Met Ala Pro Trp
1 ggc aag cgg ctg gct ggc gtg cgc ggg gtg ctg ctt gac atc
tcg ggc 225Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu Asp Ile
Ser Gly 5 10 15
20 gtg ctg tac gac agc ggc gcg ggc ggc ggc acg gcc atc gcc ggc tcg
273Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala Ile Ala Gly Ser
25 30 35 gtg gag gcg
gtg gcc aga ctg aag cgt tcc cgg ctg aag gtg agg ttc 321Val Glu Ala
Val Ala Arg Leu Lys Arg Ser Arg Leu Lys Val Arg Phe 40
45 50 tgc acc aac gag tcg cag
aag tcc cgg gca gag ctg gtg ggg cag ctt 369Cys Thr Asn Glu Ser Gln
Lys Ser Arg Ala Glu Leu Val Gly Gln Leu 55
60 65 cag agg ctg gga ttt gac atc tct gag
cag gag gtg acc gcc ccg gca 417Gln Arg Leu Gly Phe Asp Ile Ser Glu
Gln Glu Val Thr Ala Pro Ala 70 75
80 cca gct gcc tgc cag atc ctg aag gag cga ggc ctg
cga cca tac ctg 465Pro Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu
Arg Pro Tyr Leu 85 90 95
100 ctc atc cat gac gga gtc cgc tca gaa ttt gat cag atc gac aca
tcc 513Leu Ile His Asp Gly Val Arg Ser Glu Phe Asp Gln Ile Asp Thr
Ser 105 110 115
aac cca aac tgt gtg gta att gca gac gca gga gaa agc ttt tct tat
561Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu Ser Phe Ser Tyr
120 125 130 caa aac atg
aat aac gcc ttc cag gtg ctc atg gag ctg gaa aaa cct 609Gln Asn Met
Asn Asn Ala Phe Gln Val Leu Met Glu Leu Glu Lys Pro 135
140 145 gtg ctc ata tca ctg gga
aaa ggg cgt tac tac aag gag acc tct ggc 657Val Leu Ile Ser Leu Gly
Lys Gly Arg Tyr Tyr Lys Glu Thr Ser Gly 150 155
160 ctg atg ctg gac gtt ggt ccc tac atg
aag gcg ctt gag tat gcc tgt 705Leu Met Leu Asp Val Gly Pro Tyr Met
Lys Ala Leu Glu Tyr Ala Cys 165 170
175 180 ggc atc aaa gcc gag gtg gtg ggg aag cct tct cct
gag ttt ttc aag 753Gly Ile Lys Ala Glu Val Val Gly Lys Pro Ser Pro
Glu Phe Phe Lys 185 190
195 tct gcc ctg caa gcg ata gga gtg gaa gcc cac cag gcc gtc atg
att 801Ser Ala Leu Gln Ala Ile Gly Val Glu Ala His Gln Ala Val Met
Ile 200 205 210
ggg gac gat atc gtg ggc gac gtc ggc ggt gcc cag cgg tgt gga atg
849Gly Asp Asp Ile Val Gly Asp Val Gly Gly Ala Gln Arg Cys Gly Met
215 220 225 aga gcg ctg
cag gtg cgc acc ggg aag ttc agg ccc agt gac gag cac 897Arg Ala Leu
Gln Val Arg Thr Gly Lys Phe Arg Pro Ser Asp Glu His 230
235 240 cat ccg gaa gtg aag gct
gat ggg tac gtg gac aac ctc gca gag gca 945His Pro Glu Val Lys Ala
Asp Gly Tyr Val Asp Asn Leu Ala Glu Ala 245 250
255 260 gtg gac ctg ctg ctg cag cac gcc gac
aag tgatggcctc ctgggagagc 995Val Asp Leu Leu Leu Gln His Ala Asp
Lys 265 270
cccgcctcct ccacccctgc ctctcctcca cccctgcctc
ccctccaccc ctgcctctcc 1055tccacccgcc caggagagcc ccacctcctc cacccctgcc
tctcctccac ccctgcctcc 1115cctccacctg ccccagtgcc cagaccaacc aaggccctga
cagccctgcc ttctgccctc 1175tgccctgcat gggcaggcat ttgttcccta cctgggtggc
ctgctcccct gcctgggccc 1235tgacttcagc tccctgtagt gaagtccagg agggtgggac
aggcctgtca ggcctctggg 1295aatctcccaa atcccagaac tcaccactca ccatgggcct
ttaaatgcag taaactccac 1355ctaaccagat tcaggggcac tatgcccact gcctcctctt
cagactcttt gcatttcagt 1415gaagagcctg gaagaaaccc aggggcctcc tatgcacaga
tcttgcagcc cagaaccaag 1475tcagcctccc tgcgactgcc caggcacact gcccaccacc
ccacccccga aacaatgcca 1535gcccgctgct ttttctatcc tcccagtcac ctttgcagac
aaagaccagg ggcagctccc 1595gagggcactg tgaaggctcc catgccacac agtgagaact
gtagcctctg cgtccaaggc 1655acacagggta ctttctggac ccactgctgg acagacttga
aggtgtcatg cccggtgtgt 1715gcaggaggaa actaacagtt cagtaaactc tgccttgacc
agcagccttt 1765
10 270 PRT Homo sapiens
10 Met Ala Pro Trp Gly
Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu 1 5
10 15 Asp Ile Ser Gly Val Leu Tyr Asp Ser Gly
Ala Gly Gly Gly Thr Ala 20 25
30 Ile Ala Gly Ser Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg
Leu 35 40 45 Lys
Val Arg Phe Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu 50
55 60 Val Gly Gln Leu Gln Arg
Leu Gly Phe Asp Ile Ser Glu Gln Glu Val 65 70
75 80 Thr Ala Pro Ala Pro Ala Ala Cys Gln Ile Leu
Lys Glu Arg Gly Leu 85 90
95 Arg Pro Tyr Leu Leu Ile His Asp Gly Val Arg Ser Glu Phe Asp Gln
100 105 110 Ile Asp
Thr Ser Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu 115
120 125 Ser Phe Ser Tyr Gln Asn Met
Asn Asn Ala Phe Gln Val Leu Met Glu 130 135
140 Leu Glu Lys Pro Val Leu Ile Ser Leu Gly Lys Gly
Arg Tyr Tyr Lys 145 150 155
160 Glu Thr Ser Gly Leu Met Leu Asp Val Gly Pro Tyr Met Lys Ala Leu
165 170 175 Glu Tyr Ala
Cys Gly Ile Lys Ala Glu Val Val Gly Lys Pro Ser Pro 180
185 190 Glu Phe Phe Lys Ser Ala Leu Gln
Ala Ile Gly Val Glu Ala His Gln 195 200
205 Ala Val Met Ile Gly Asp Asp Ile Val Gly Asp Val Gly
Gly Ala Gln 210 215 220
Arg Cys Gly Met Arg Ala Leu Gln Val Arg Thr Gly Lys Phe Arg Pro 225
230 235 240 Ser Asp Glu His
His Pro Glu Val Lys Ala Asp Gly Tyr Val Asp Asn 245
250 255 Leu Ala Glu Ala Val Asp Leu Leu Leu
Gln His Ala Asp Lys 260 265
270
11 1669 DNA Homo sapiens
11 aagaacttta aaaatcacct aggtgtgggc cgggcacggt
ggctaacgcc tgtaatccca 60gcactttgag atgctgaggc aggtggatca cgaggtcagg
agatcgagac catcctggat 120aacacggaga aaccccggcg gagctgagga gcagggccgg
gcgccatggc accgtggggc 180aagcggctgg ctggcgtgcg cggggtgctg cttgacatct
cgggcgtgct gtacgacagc 240ggcgcgggcg gcggcacggc catcgccggc tcggtggagg
cggtggccag actgaagcgt 300tcccggctga aggtgaggtt ctgcaccaac gagtcgcaga
agtcccgggc agagctggtg 360gggcagcttc agaggctggg atttgacatc tctgagcagg
aggtgaccgc cccggcacca 420gctgcctgcc agatcctgaa ggagcgaggc ctgcgaccat
acctgctcat ccatgacgga 480gtccgctcag aatttgatca gatcgacaca tccaacccaa
actgtgtggt aattgcagac 540gcaggagaaa gcttttctta tcaaaacatg aataacgcct
tccaggtgct catggagctg 600gaaaaacctg tgctcatatc actgggaaaa gggcgttact
acaaggagac ctctggcctg 660atgctggacg ttggtcccta catgaaggcg cttgagtatg
cctgtggcat caaagccgag 720gtggtgggga agccttctcc tgagtttttc aagtctgccc
tgcaagcgat aggagtggaa 780gcccaccagg ccgtcatgat tggggacgat atcgtgggcg
acgtcggcgg tgcccagcgg 840tgtggaatga gagcgctgca ggtgcgcacc gggaagttca
ggcccagtga cgagcaccat 900ccggaagtga aggctgatgg gtacgtggac aacctcgcag
aggcagtgga cctgctgctg 960cagcacgccg acaagtgatg gcctcctggg agagccccgc
ctcctccacc cctgcctccc 1020ctccacctgc cccagtgccc agaccaacca aggccctgac
agccctgcct tctgccctct 1080gccctgcatg ggcaggcatt tgttccctac ctgggtggcc
tgctcccctg cctgggccct 1140gacttcagct ccctgtagtg aagtccagga gggtgggaca
ggcctgtcag gcctctggga 1200atctcccaaa tcccagaact caccactcac catgggcctt
taaatgcagt aaactccacc 1260taaccagatt caggggcact atgcccactg cctcctcttc
agactctttg catttcagtg 1320aagagcctgg aagaaaccca ggggcctcct atgcacagat
cttgcagccc agaaccaagt 1380cagcctccct gcgactgccc aggcacactg cccaccaccc
cacccccgaa acaatgccag 1440cccgctgctt tttctatcct cccagtcacc tttgcagaca
aagaccaggg gcagctcccg 1500agggcactgt gaaggctccc atgccacaca gtgagaactg
tagcctctgc gtccaaggca 1560cacagggtac tttctggacc cactgctgga cagacttgaa
ggtgtcatgc ccggtgtgtg 1620caggaggaaa ctaacagttc agtaaactct gccttgacca
gcagccttt 1669
12 1673 DNA Homo sapiens CDS(166)..(795)
12 aagaacttta aaaatcacct aggtgtgggc cgggcacggt ggctaacgcc tgtaatccca
60gcactttgag atgctgaggc aggtggatca cgaggtcagg agatcgagac catcctggat
120aacacggaga aaccccggcg gagctgagga gcagggccgg gcgcc atg gca ccg tgg
177 Met Ala Pro Trp
1 ggc aag
cgg ctg gct ggc gtg cgc ggg gtg ctg ctt gac atc tcg ggc 225Gly Lys
Arg Leu Ala Gly Val Arg Gly Val Leu Leu Asp Ile Ser Gly 5
10 15 20 gtg ctg tac gac agc
ggc gcg ggc ggc ggc acg gcc atc gcc ggc tcg 273Val Leu Tyr Asp Ser
Gly Ala Gly Gly Gly Thr Ala Ile Ala Gly Ser 25
30 35 gtg gag gcg gtg gcc aga ctg aag
cgt tcc cgg ctg aag gtg agg ttc 321Val Glu Ala Val Ala Arg Leu Lys
Arg Ser Arg Leu Lys Val Arg Phe 40 45
50 tgc acc aac gag tcg cag aag tcc cgg gca gag
ctg gtg ggg cag ctt 369Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu
Leu Val Gly Gln Leu 55 60
65 cag agg ctg gga ttt gac atc tct gag cag gag gtg acc gcc
ccg gca 417Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val Thr Ala
Pro Ala 70 75 80
cca gct gcc tgc cag atc ctg aag gag cga ggc ctg cga cca tac ctg
465Pro Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu Arg Pro Tyr Leu 85
90 95 100 ctc atc cat
gac gga gtc cgc tca gaa ttt gat cag atc gac aca tcc 513Leu Ile His
Asp Gly Val Arg Ser Glu Phe Asp Gln Ile Asp Thr Ser
105 110 115 aac cca aac tgt gtg gta
att gca gac gca gga gaa agc ttt tct tat 561Asn Pro Asn Cys Val Val
Ile Ala Asp Ala Gly Glu Ser Phe Ser Tyr 120
125 130 caa aac atg aat aac gcc ttc cag gtg
ctc atg gag ctg gaa aaa cct 609Gln Asn Met Asn Asn Ala Phe Gln Val
Leu Met Glu Leu Glu Lys Pro 135 140
145 gtg ctc ata tca ctg gga aaa ggg cgt tac tac aag
gag acc tct ggc 657Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys
Glu Thr Ser Gly 150 155 160
ctg atg ctg gac gtt ggt ccc tac atg aag gcg ctt gag tat gcc
tgt 705Leu Met Leu Asp Val Gly Pro Tyr Met Lys Ala Leu Glu Tyr Ala
Cys 165 170 175 180
ggc atc aaa gcc gag gtg gtg ggg aag cct tct cct gag ttt ttc aag
753Gly Ile Lys Ala Glu Val Val Gly Lys Pro Ser Pro Glu Phe Phe Lys
185 190 195 tct gcc ctg
caa gcg ata gga gtg gaa gcc cac cag gcc cag 795Ser Ala Leu
Gln Ala Ile Gly Val Glu Ala His Gln Ala Gln 200
205 210 tgacgagcac catccggaag
tgaaggctga tgggtacgtg gacaacctcg cagaggcagt 855ggacctgctg ctgcagcacg
ccgacaagtg atggcctcct gggagagccc cgcctcctcc 915acccctgcct ctcctccacc
cctgcctccc ctccacccct gcctctcctc cacccgccca 975ggagagcccc acctcctcca
cccctgcctc tcctccaccc ctgcctcccc tccacctgcc 1035ccagtgccca gaccaaccaa
ggccctgaca gccctgcctt ctgccctctg ccctgcatgg 1095gcaggcattt gttccctacc
tgggtggcct gctcccctgc ctgggccctg acttcagctc 1155cctgtagtga agtccaggag
ggtgggacag gcctgtcagg cctctgggaa tctcccaaat 1215cccagaactc accactcacc
atgggccttt aaatgcagta aactccacct aaccagattc 1275aggggcacta tgcccactgc
ctcctcttca gactctttgc atttcagtga agagcctgga 1335agaaacccag gggcctccta
tgcacagatc ttgcagccca gaaccaagtc agcctccctg 1395cgactgccca ggcacactgc
ccaccacccc acccccgaaa caatgccagc ccgctgcttt 1455ttctatcctc ccagtcacct
ttgcagacaa agaccagggg cagctcccga gggcactgtg 1515aaggctccca tgccacacag
tgagaactgt agcctctgcg tccaaggcac acagggtact 1575ttctggaccc actgctggac
agacttgaag gtgtcatgcc cggtgtgtgc aggaggaaac 1635taacagttca gtaaactctg
ccttgaccag cagccttt 1673
13 210 PRT Homo sapiens
13 Met Ala Pro Trp Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu 1
5 10 15 Asp Ile Ser Gly
Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala 20
25 30 Ile Ala Gly Ser Val Glu Ala Val Ala
Arg Leu Lys Arg Ser Arg Leu 35 40
45 Lys Val Arg Phe Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala
Glu Leu 50 55 60
Val Gly Gln Leu Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val 65
70 75 80 Thr Ala Pro Ala Pro
Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu 85
90 95 Arg Pro Tyr Leu Leu Ile His Asp Gly Val
Arg Ser Glu Phe Asp Gln 100 105
110 Ile Asp Thr Ser Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly
Glu 115 120 125 Ser
Phe Ser Tyr Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu 130
135 140 Leu Glu Lys Pro Val Leu
Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys 145 150
155 160 Glu Thr Ser Gly Leu Met Leu Asp Val Gly Pro
Tyr Met Lys Ala Leu 165 170
175 Glu Tyr Ala Cys Gly Ile Lys Ala Glu Val Val Gly Lys Pro Ser Pro
180 185 190 Glu Phe
Phe Lys Ser Ala Leu Gln Ala Ile Gly Val Glu Ala His Gln 195
200 205 Ala Gln 210
14 1516 DNA Homo sapiens CDS(166)..(726)
14 aagaacttta aaaatcacct aggtgtgggc
cgggcacggt ggctaacgcc tgtaatccca 60gcactttgag atgctgaggc aggtggatca
cgaggtcagg agatcgagac catcctggat 120aacacggaga aaccccggcg gagctgagga
gcagggccgg gcgcc atg gca ccg tgg 177
Met Ala Pro Trp
1 ggc aag cgg ctg gct ggc gtg cgc ggg gtg ctg
ctt gac atc tcg ggc 225Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu
Leu Asp Ile Ser Gly 5 10 15
20 gtg ctg tac gac agc ggc gcg ggc ggc ggc acg gcc atc gcc
ggc tcg 273Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala Ile Ala
Gly Ser 25 30 35
gtg gag gcg gtg gcc aga ctg aag cgt tcc cgg ctg aag gtg agg ttc
321Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu Lys Val Arg Phe
40 45 50 tgc acc aac
gag tcg cag aag tcc cgg gca gag ctg gtg ggg cag ctt 369Cys Thr Asn
Glu Ser Gln Lys Ser Arg Ala Glu Leu Val Gly Gln Leu 55
60 65 cag agg ctg gga ttt gac
atc tct gag cag gag gtg acc gcc ccg gca 417Gln Arg Leu Gly Phe Asp
Ile Ser Glu Gln Glu Val Thr Ala Pro Ala 70 75
80 cca gct gcc tgc cag atc ctg aag gag
cga ggc ctg cga cca tac ctg 465Pro Ala Ala Cys Gln Ile Leu Lys Glu
Arg Gly Leu Arg Pro Tyr Leu 85 90
95 100 ctc atc cat gac gga gtc cgc tca gaa ttt gat cag
atc gac aca tcc 513Leu Ile His Asp Gly Val Arg Ser Glu Phe Asp Gln
Ile Asp Thr Ser 105 110
115 aac cca aac tgt gtg gta att gca gac gca gga gaa agc ttt tct
tat 561Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu Ser Phe Ser
Tyr 120 125 130
caa aac atg aat aac gcc ttc cag gtg ctc atg gag ctg gaa aaa cct
609Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu Leu Glu Lys Pro
135 140 145 gtg ctc ata
tca ctg gga aaa ggg ccc agt gac gag cac cat ccg gaa 657Val Leu Ile
Ser Leu Gly Lys Gly Pro Ser Asp Glu His His Pro Glu 150
155 160 gtg aag gct gat ggg tac
gtg gac aac ctc gca gag gca gtg gac ctg 705Val Lys Ala Asp Gly Tyr
Val Asp Asn Leu Ala Glu Ala Val Asp Leu 165 170
175 180 ctg ctg cag cac gcc gac aag tgatggcctc
ctgggagagc cccgcctcct 756Leu Leu Gln His Ala Asp Lys
185
ccacccctgc ctctcctcca cccctgcctc ccctccaccc ctgcctctcc
tccacccgcc 816caggagagcc ccacctcctc cacccctgcc tctcctccac ccctgcctcc
cctccacctg 876ccccagtgcc cagaccaacc aaggccctga cagccctgcc ttctgccctc
tgccctgcat 936gggcaggcat ttgttcccta cctgggtggc ctgctcccct gcctgggccc
tgacttcagc 996tccctgtagt gaagtccagg agggtgggac aggcctgtca ggcctctggg
aatctcccaa 1056atcccagaac tcaccactca ccatgggcct ttaaatgcag taaactccac
ctaaccagat 1116tcaggggcac tatgcccact gcctcctctt cagactcttt gcatttcagt
gaagagcctg 1176gaagaaaccc aggggcctcc tatgcacaga tcttgcagcc cagaaccaag
tcagcctccc 1236tgcgactgcc caggcacact gcccaccacc ccacccccga aacaatgcca
gcccgctgct 1296ttttctatcc tcccagtcac ctttgcagac aaagaccagg ggcagctccc
gagggcactg 1356tgaaggctcc catgccacac agtgagaact gtagcctctg cgtccaaggc
acacagggta 1416ctttctggac ccactgctgg acagacttga aggtgtcatg cccggtgtgt
gcaggaggaa 1476actaacagtt cagtaaactc tgccttgacc agcagccttt
1516
15 187 PRT Homo sapiens
15 Met Ala Pro Trp Gly Lys Arg Leu Ala
Gly Val Arg Gly Val Leu Leu 1 5 10
15 Asp Ile Ser Gly Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly
Thr Ala 20 25 30
Ile Ala Gly Ser Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu
35 40 45 Lys Val Arg Phe
Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu 50
55 60 Val Gly Gln Leu Gln Arg Leu Gly
Phe Asp Ile Ser Glu Gln Glu Val 65 70
75 80 Thr Ala Pro Ala Pro Ala Ala Cys Gln Ile Leu Lys
Glu Arg Gly Leu 85 90
95 Arg Pro Tyr Leu Leu Ile His Asp Gly Val Arg Ser Glu Phe Asp Gln
100 105 110 Ile Asp Thr
Ser Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu 115
120 125 Ser Phe Ser Tyr Gln Asn Met Asn
Asn Ala Phe Gln Val Leu Met Glu 130 135
140 Leu Glu Lys Pro Val Leu Ile Ser Leu Gly Lys Gly Pro
Ser Asp Glu 145 150 155
160 His His Pro Glu Val Lys Ala Asp Gly Tyr Val Asp Asn Leu Ala Glu
165 170 175 Ala Val Asp Leu
Leu Leu Gln His Ala Asp Lys 180 185
16 1287 DNA Homo sapiens CDS(166)..(936) CDS(940)..(948)
16 aagaacttta
aaaatcacct aggtgtgggc cgggcacggt ggctaacgcc tgtaatccca 60gcactttgag
atgctgaggc aggtggatca cgaggtcagg agatcgagac catcctggat 120aacacggaga
aaccccggcg gagctgagga gcagggccgg gcgcc atg gca ccg tgg 177
Met Ala Pro Trp
1 ggc aag cgg ctg gct ggc gtg
cgc ggg gtg ctg ctt gac atc tcg ggc 225Gly Lys Arg Leu Ala Gly Val
Arg Gly Val Leu Leu Asp Ile Ser Gly 5 10
15 20 gtg ctg tac gac agc ggc gcg ggc ggc ggc
acg gcc atc gcc ggc tcg 273Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly
Thr Ala Ile Ala Gly Ser 25 30
35 gtg gag gcg gtg gcc aga ctg aag cgt tcc cgg ctg aag
gtg agg ttc 321Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu Lys
Val Arg Phe 40 45 50
tgc acc aac gag tcg cag aag tcc cgg gca gag ctg gtg ggg cag ctt
369Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu Val Gly Gln Leu
55 60 65 cag agg
ctg gga ttt gac atc tct gag cag gag gtg acc gcc ccg gca 417Gln Arg
Leu Gly Phe Asp Ile Ser Glu Gln Glu Val Thr Ala Pro Ala 70
75 80 cca gct gcc tgc cag
atc ctg aag gag cga ggc ctg cga cca tac ctg 465Pro Ala Ala Cys Gln
Ile Leu Lys Glu Arg Gly Leu Arg Pro Tyr Leu 85 90
95 100 ctc atc cat gac gga gtc cgc tca
gaa ttt gat cag atc gac aca tcc 513Leu Ile His Asp Gly Val Arg Ser
Glu Phe Asp Gln Ile Asp Thr Ser 105
110 115 aac cca aac tgt gtg gta att gca gac gca gga
gaa agc ttt tct tat 561Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly
Glu Ser Phe Ser Tyr 120 125
130 caa aac atg aat aac gcc ttc cag gtg ctc atg gag ctg gaa
aaa cct 609Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu Leu Glu
Lys Pro 135 140 145
gtg ctc ata tca ctg gga aaa ggg cgt tac tac aag gag acc tct ggc
657Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys Glu Thr Ser Gly
150 155 160 ctg atg ctg
gac gtt ggt ccc tac atg aag gcg ctt gag tat gcc tgt 705Leu Met Leu
Asp Val Gly Pro Tyr Met Lys Ala Leu Glu Tyr Ala Cys 165
170 175 180 ggc atc aaa gcc gag gtg
gtg ggg aag cct tct cct gag ttt ttc aag 753Gly Ile Lys Ala Glu Val
Val Gly Lys Pro Ser Pro Glu Phe Phe Lys 185
190 195 tct gcc ctg caa gcg ata gga gtg gaa
gcc cac cag gcc gtc atg att 801Ser Ala Leu Gln Ala Ile Gly Val Glu
Ala His Gln Ala Val Met Ile 200 205
210 ggg gac gat atc gtg ggc gac gtc ggc ggt gcc cag
cgg tgt gga atg 849Gly Asp Asp Ile Val Gly Asp Val Gly Gly Ala Gln
Arg Cys Gly Met 215 220 225
aga gcg ctg cag gtg cgc acc ggg aag ttc agg ccc agt gac gag
cac 897Arg Ala Leu Gln Val Arg Thr Gly Lys Phe Arg Pro Ser Asp Glu
His 230 235 240
cat ccg gaa gtg aag gct gat gga ctc ttt gca ttt cag tga aga gcc
945His Pro Glu Val Lys Ala Asp Gly Leu Phe Ala Phe Gln Arg Ala
245 250 255 tgg
aagaaaccca ggggcctcct atgcacagat cttgcagccc agaaccaagt 998Trp
260
cagcctccct
gcgactgccc aggcacactg cccaccaccc cacccccgaa acaatgccag 1058cccgctgctt
tttctatcct cccagtcacc tttgcagaca aagaccaggg gcagctcccg 1118agggcactgt
gaaggctccc atgccacaca gtgagaactg tagcctctgc gtccaaggca 1178cacagggtac
tttctggacc cactgctgga cagacttgaa ggtgtcatgc ccggtgtgtg 1238caggaggaaa
ctaacagttc agtaaactct gccttgacca gcagccttt 1287
17 257 PRT Homo sapiens
17 Met Ala Pro Trp Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu
1 5 10 15 Asp Ile
Ser Gly Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala 20
25 30 Ile Ala Gly Ser Val Glu Ala
Val Ala Arg Leu Lys Arg Ser Arg Leu 35 40
45 Lys Val Arg Phe Cys Thr Asn Glu Ser Gln Lys Ser
Arg Ala Glu Leu 50 55 60
Val Gly Gln Leu Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val 65
70 75 80 Thr Ala Pro
Ala Pro Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu 85
90 95 Arg Pro Tyr Leu Leu Ile His Asp
Gly Val Arg Ser Glu Phe Asp Gln 100 105
110 Ile Asp Thr Ser Asn Pro Asn Cys Val Val Ile Ala Asp
Ala Gly Glu 115 120 125
Ser Phe Ser Tyr Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu 130
135 140 Leu Glu Lys Pro
Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys 145 150
155 160 Glu Thr Ser Gly Leu Met Leu Asp Val
Gly Pro Tyr Met Lys Ala Leu 165 170
175 Glu Tyr Ala Cys Gly Ile Lys Ala Glu Val Val Gly Lys Pro
Ser Pro 180 185 190
Glu Phe Phe Lys Ser Ala Leu Gln Ala Ile Gly Val Glu Ala His Gln
195 200 205 Ala Val Met Ile
Gly Asp Asp Ile Val Gly Asp Val Gly Gly Ala Gln 210
215 220 Arg Cys Gly Met Arg Ala Leu Gln
Val Arg Thr Gly Lys Phe Arg Pro 225 230
235 240 Ser Asp Glu His His Pro Glu Val Lys Ala Asp Gly
Leu Phe Ala Phe 245 250
255 Gln
18 2264 DNA Homo sapiens CDS(166)..(1005)
18 aagaacttta
aaaatcacct aggtgtgggc cgggcacggt ggctaacgcc tgtaatccca 60gcactttgag
atgctgaggc aggtggatca cgaggtcagg agatcgagac catcctggat 120aacacggaga
aaccccggcg gagctgagga gcagggccgg gcgcc atg gca ccg tgg 177
Met Ala Pro Trp
1 ggc aag cgg ctg gct ggc
gtg cgc ggg gtg ctg ctt gac atc tcg ggc 225Gly Lys Arg Leu Ala Gly
Val Arg Gly Val Leu Leu Asp Ile Ser Gly 5 10
15 20 gtg ctg tac gac agc ggc gcg ggc ggc
ggc acg gcc atc gcc ggc tcg 273Val Leu Tyr Asp Ser Gly Ala Gly Gly
Gly Thr Ala Ile Ala Gly Ser 25 30
35 gtg gag gcg gtg gcc aga ctg aag cgt tcc cgg ctg
aag gtg agg ttc 321Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu
Lys Val Arg Phe 40 45
50 tgc acc aac gag tcg cag aag tcc cgg gca gag ctg gtg ggg cag
ctt 369Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu Val Gly Gln
Leu 55 60 65
cag agg ctg gga ttt gac atc tct gag cag gag gtg acc gcc ccg gca
417Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val Thr Ala Pro Ala
70 75 80 cca gct gcc
tgc cag atc ctg aag gag cga ggc ctg cga cca tac ctg 465Pro Ala Ala
Cys Gln Ile Leu Lys Glu Arg Gly Leu Arg Pro Tyr Leu 85
90 95 100 ctc atc cat gac gga gtc
cgc tca gaa ttt gat cag atc gac aca tcc 513Leu Ile His Asp Gly Val
Arg Ser Glu Phe Asp Gln Ile Asp Thr Ser 105
110 115 aac cca aac tgt gtg gta att gca gac
gca gga gaa agc ttt tct tat 561Asn Pro Asn Cys Val Val Ile Ala Asp
Ala Gly Glu Ser Phe Ser Tyr 120 125
130 caa aac atg aat aac gcc ttc cag gtg ctc atg gag
ctg gaa aaa cct 609Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu
Leu Glu Lys Pro 135 140 145
gtg ctc ata tca ctg gga aaa ggg cgt tac tac aag gag acc tct
ggc 657Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys Glu Thr Ser
Gly 150 155 160
ctg atg ctg gac gtt ggt ccc tac atg aag gcg ctt gag tat gcc tgt
705Leu Met Leu Asp Val Gly Pro Tyr Met Lys Ala Leu Glu Tyr Ala Cys
165 170 175 180 ggc atc
aaa gcc gag gtg gtg ggg aag cct tct cct gag ttt ttc aag 753Gly Ile
Lys Ala Glu Val Val Gly Lys Pro Ser Pro Glu Phe Phe Lys
185 190 195 tct gcc ctg caa gcg
ata gga gtg gaa gcc cac cag gcc gtc atg att 801Ser Ala Leu Gln Ala
Ile Gly Val Glu Ala His Gln Ala Val Met Ile 200
205 210 ggg gac gat atc gtg ggc gac gtc
ggc ggt gcc cag cgg tgt gga atg 849Gly Asp Asp Ile Val Gly Asp Val
Gly Gly Ala Gln Arg Cys Gly Met 215 220
225 aga gcg ctg cag gtg cgc acc ggg aag ttc agg
ccc agt gac gag cac 897Arg Ala Leu Gln Val Arg Thr Gly Lys Phe Arg
Pro Ser Asp Glu His 230 235 240
cat ccg gaa gtg aag gct gat ggg cac ttc ctg gcc ctg cgg
ggt gcc 945His Pro Glu Val Lys Ala Asp Gly His Phe Leu Ala Leu Arg
Gly Ala 245 250 255
260 ctg tct ctg gct gga gct gat gag ccg tgg tgt gat gct tgc ctg ggg
993Leu Ser Leu Ala Gly Ala Asp Glu Pro Trp Cys Asp Ala Cys Leu Gly
265 270 275 gtc ggt gtt
gcc tagaagtgag gaatgacctc cttcacccct ggctccttcc 1045Val Gly Val
Ala 280
ggcagcctgg ccacttccag
catgtggagg gggcccgggc cctggcagag acggtgtgga 1105accgccggcc gcatgcgggg
ctggagttcg gggctgccaa tgggctaagg ccctgtgttc 1165aacacatgtc gcggaactgt
cgtgaaaaaa agactcagga ccacaggtcc ggggtgctcc 1225caggggtctt tcccaccgag
gctgggagtg agcagcagcc taggcccagc cacacctccc 1285aggctcggga gctgccaggc
caggcaaccg tgaagggtag cagggtgggg ggtggagggg 1345aggacaggga gagcaggcct
ctcgggcctc acccaccctt gcaccttcca gctctagggc 1405ccacacagta ccaagccctg
ggctcaatgt accacttatg aaaactcaat tcaaattggc 1465ttagtgcaac aaggtatttg
gtggtctcac atgacttgaa catccaggcc tggctgtcac 1525gggaaccgca tctcttccca
ttgcagctac ttggcaggtg gcgggatgtc ccccagccac 1585cgacgtcccc ctgcctgctc
cgcaacccca gggcctgcag aaaaggccca cgagactcag 1645actggcagag acttaggcgg
accaggaaca ggggcgcagt ctccgtccca cccaaaccct 1705aaccagagag aacacggcac
gttgtgccag acggaggacg gatgccagcg agggtccatg 1765tcctcactgc cgacaaggct
gggaactggg ccaagtgaag cagaggcctc cacgtcagat 1825gtgagcgcca ccggcccagg
tgactgcagt tcttccctcc ttccgttcgg cttgagccct 1885ccagaggatc ggaaaggctg
aggcctgacc tggtgccgct gtcctgggtg ggtctgtcct 1945gctggtcggt tcctgcccct
ctcggggagg ttggctggca gctggcaggt ggaaagcctc 2005ctgtgttcac ctcagggcag
aggtggggac acagggcggg acgggcgagt gtggtgcccc 2065tctggggtgg gtgctcttgg
gtccgcctcc cgtgccagag tgcgtgtcaa cagttccagc 2125tgcccctcag aactgtcctg
gtttaggagg tgaacacacg gggcagccta cattctacgt 2185ggttttttta acattataaa
agcagcatgt gttattacag gaaatttaac agaagtatat 2245aaaaagaaac cagaagtca
2264
19 280 PRT Homo sapiens
19 Met Ala Pro Trp Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu 1
5 10 15 Asp Ile Ser Gly
Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala 20
25 30 Ile Ala Gly Ser Val Glu Ala Val Ala
Arg Leu Lys Arg Ser Arg Leu 35 40
45 Lys Val Arg Phe Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala
Glu Leu 50 55 60
Val Gly Gln Leu Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val 65
70 75 80 Thr Ala Pro Ala Pro
Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu 85
90 95 Arg Pro Tyr Leu Leu Ile His Asp Gly Val
Arg Ser Glu Phe Asp Gln 100 105
110 Ile Asp Thr Ser Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly
Glu 115 120 125 Ser
Phe Ser Tyr Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu 130
135 140 Leu Glu Lys Pro Val Leu
Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys 145 150
155 160 Glu Thr Ser Gly Leu Met Leu Asp Val Gly Pro
Tyr Met Lys Ala Leu 165 170
175 Glu Tyr Ala Cys Gly Ile Lys Ala Glu Val Val Gly Lys Pro Ser Pro
180 185 190 Glu Phe
Phe Lys Ser Ala Leu Gln Ala Ile Gly Val Glu Ala His Gln 195
200 205 Ala Val Met Ile Gly Asp Asp
Ile Val Gly Asp Val Gly Gly Ala Gln 210 215
220 Arg Cys Gly Met Arg Ala Leu Gln Val Arg Thr Gly
Lys Phe Arg Pro 225 230 235
240 Ser Asp Glu His His Pro Glu Val Lys Ala Asp Gly His Phe Leu Ala
245 250 255 Leu Arg Gly
Ala Leu Ser Leu Ala Gly Ala Asp Glu Pro Trp Cys Asp 260
265 270 Ala Cys Leu Gly Val Gly Val Ala
275 280
20 992 DNA Homo sapiens CDS(166)..(960)
20 aagaacttta aaaatcacct aggtgtgggc cgggcacggt ggctaacgcc tgtaatccca
60gcactttgag atgctgaggc aggtggatca cgaggtcagg agatcgagac catcctggat
120aacacggaga aaccccggcg gagctgagga gcagggccgg gcgcc atg gca ccg tgg
177 Met Ala Pro Trp
1 ggc aag
cgg ctg gct ggc gtg cgc ggg gtg ctg ctt gac atc tcg ggc 225Gly Lys
Arg Leu Ala Gly Val Arg Gly Val Leu Leu Asp Ile Ser Gly 5
10 15 20 gtg ctg tac gac agc
ggc gcg ggc ggc ggc acg gcc atc gcc ggc tcg 273Val Leu Tyr Asp Ser
Gly Ala Gly Gly Gly Thr Ala Ile Ala Gly Ser 25
30 35 gtg gag gcg gtg gcc aga ctg aag
cgt tcc cgg ctg aag gtg agg ttc 321Val Glu Ala Val Ala Arg Leu Lys
Arg Ser Arg Leu Lys Val Arg Phe 40 45
50 tgc acc aac gag tcg cag aag tcc cgg gca gag
ctg gtg ggg cag ctt 369Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu
Leu Val Gly Gln Leu 55 60
65 cag agg ctg gga ttt gac atc tct gag cag gag gtg acc gcc
ccg gca 417Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val Thr Ala
Pro Ala 70 75 80
cca gct gcc tgc cag atc ctg aag gag cga ggc ctg cga cca tac ctg
465Pro Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu Arg Pro Tyr Leu 85
90 95 100 ctc atc cat
gac gga gtc cgc tca gaa ttt gat cag atc gac aca tcc 513Leu Ile His
Asp Gly Val Arg Ser Glu Phe Asp Gln Ile Asp Thr Ser
105 110 115 aac cca aac tgt gtg gta
att gca gac gca gga gaa agc ttt tct tat 561Asn Pro Asn Cys Val Val
Ile Ala Asp Ala Gly Glu Ser Phe Ser Tyr 120
125 130 caa aac atg aat aac gcc ttc cag gtg
ctc atg gag ctg gaa aaa cct 609Gln Asn Met Asn Asn Ala Phe Gln Val
Leu Met Glu Leu Glu Lys Pro 135 140
145 gtg ctc ata tca ctg gga aaa ggg cgt tac tac aag
gag acc tct ggc 657Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys
Glu Thr Ser Gly 150 155 160
ctg atg ctg gac gtt ggt ccc tac atg aag gcg ctt gag tat gcc
tgt 705Leu Met Leu Asp Val Gly Pro Tyr Met Lys Ala Leu Glu Tyr Ala
Cys 165 170 175 180
ggc atc aaa gcc gag gtg gtg ggg aag cct tct cct gag ttt ttc aag
753Gly Ile Lys Ala Glu Val Val Gly Lys Pro Ser Pro Glu Phe Phe Lys
185 190 195 tct gcc ctg
caa gcg ata gga gtg gaa gcc cac cag gcc gtc atg att 801Ser Ala Leu
Gln Ala Ile Gly Val Glu Ala His Gln Ala Val Met Ile 200
205 210 ggg gac gat atc gtg ggc
gac gtc ggc ggt gcc cag cgg tgt gga atg 849Gly Asp Asp Ile Val Gly
Asp Val Gly Gly Ala Gln Arg Cys Gly Met 215
220 225 aga gcg ctg cag gtg cgc acc ggg aag
ttc agg tca gtg cca gct gga 897Arg Ala Leu Gln Val Arg Thr Gly Lys
Phe Arg Ser Val Pro Ala Gly 230 235
240 gtc att tat tca cct tcc ttc cag ggg atg acc aca
ttc tca ttc tgt 945Val Ile Tyr Ser Pro Ser Phe Gln Gly Met Thr Thr
Phe Ser Phe Cys 245 250 255
260 ttt gtt ctt caa aat aaaggggata ttctttccaa atcaaagagc ag
992Phe Val Leu Gln Asn
265
SEQ ID NO: 21; length: 265; type: PRT; organism: Homo sapiens
Met Ala Pro Trp Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu   16
Asp Ile Ser Gly Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala   32
Ile Ala Gly Ser Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu   48
Lys Val Arg Phe Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu   64
Val Gly Gln Leu Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val   80
Thr Ala Pro Ala Pro Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu   96
Arg Pro Tyr Leu Leu Ile His Asp Gly Val Arg Ser Glu Phe Asp Gln   112
Ile Asp Thr Ser Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu   128
Ser Phe Ser Tyr Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu   144
Leu Glu Lys Pro Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys   160
Glu Thr Ser Gly Leu Met Leu Asp Val Gly Pro Tyr Met Lys Ala Leu   176
Glu Tyr Ala Cys Gly Ile Lys Ala Glu Val Val Gly Lys Pro Ser Pro   192
Glu Phe Phe Lys Ser Ala Leu Gln Ala Ile Gly Val Glu Ala His Gln   208
Ala Val Met Ile Gly Asp Asp Ile Val Gly Asp Val Gly Gly Ala Gln   224
Arg Cys Gly Met Arg Ala Leu Gln Val Arg Thr Gly Lys Phe Arg Ser   240
Val Pro Ala Gly Val Ile Tyr Ser Pro Ser Phe Gln Gly Met Thr Thr   256
Phe Ser Phe Cys Phe Val Leu Gln Asn   265
SEQ ID NO: 22; length: 952; type: DNA; organism: Homo sapiens; feature: CDS (166)..(936)
aagaacttta aaaatcacct
aggtgtgggc cgggcacggt ggctaacgcc tgtaatccca 60gcactttgag atgctgaggc
aggtggatca cgaggtcagg agatcgagac catcctggat 120aacacggaga aaccccggcg
gagctgagga gcagggccgg gcgcc atg gca ccg tgg 177
Met Ala Pro Trp
1 ggc aag cgg ctg gct ggc gtg cgc ggg
gtg ctg ctt gac atc tcg ggc 225Gly Lys Arg Leu Ala Gly Val Arg Gly
Val Leu Leu Asp Ile Ser Gly 5 10
15 20 gtg ctg tac gac agc ggc gcg ggc ggc ggc acg gcc
atc gcc ggc tcg 273Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala
Ile Ala Gly Ser 25 30
35 gtg gag gcg gtg gcc aga ctg aag cgt tcc cgg ctg aag gtg agg
ttc 321Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu Lys Val Arg
Phe 40 45 50
tgc acc aac gag tcg cag aag tcc cgg gca gag ctg gtg ggg cag ctt
369Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu Val Gly Gln Leu
55 60 65 cag agg ctg
gga ttt gac atc tct gag cag gag gtg acc gcc ccg gca 417Gln Arg Leu
Gly Phe Asp Ile Ser Glu Gln Glu Val Thr Ala Pro Ala 70
75 80 cca gct gcc tgc cag atc
ctg aag gag cga ggc ctg cga cca tac ctg 465Pro Ala Ala Cys Gln Ile
Leu Lys Glu Arg Gly Leu Arg Pro Tyr Leu 85 90
95 100 ctc atc cat gac gga gtc cgc tca gaa
ttt gat cag atc gac aca tcc 513Leu Ile His Asp Gly Val Arg Ser Glu
Phe Asp Gln Ile Asp Thr Ser 105 110
115 aac cca aac tgt gtg gta att gca gac gca gga gaa
agc ttt tct tat 561Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu
Ser Phe Ser Tyr 120 125
130 caa aac atg aat aac gcc ttc cag gtg ctc atg gag ctg gaa aaa
cct 609Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu Leu Glu Lys
Pro 135 140 145
gtg ctc ata tca ctg gga aaa ggg cgt tac tac aag gag acc tct ggc
657Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys Glu Thr Ser Gly
150 155 160 ctg atg ctg
gac gtt ggt ccc tac atg aag gcg ctt gag tat gcc tgt 705Leu Met Leu
Asp Val Gly Pro Tyr Met Lys Ala Leu Glu Tyr Ala Cys 165
170 175 180 ggc atc aaa gcc gag gtg
gtg ggg aag cct tct cct gag ttt ttc aag 753Gly Ile Lys Ala Glu Val
Val Gly Lys Pro Ser Pro Glu Phe Phe Lys 185
190 195 tct gcc ctg caa gcg ata gga gtg gaa
gcc cac cag gcc gtc atg att 801Ser Ala Leu Gln Ala Ile Gly Val Glu
Ala His Gln Ala Val Met Ile 200 205
210 ggg gac gat atc gtg ggc gac gtc ggc ggt gcc cag
cgg tgt gga atg 849Gly Asp Asp Ile Val Gly Asp Val Gly Gly Ala Gln
Arg Cys Gly Met 215 220 225
aga gcg ctg cag agc cgc tca aca cgt gtt gca agg cac gtg ctc
tgg 897Arg Ala Leu Gln Ser Arg Ser Thr Arg Val Ala Arg His Val Leu
Trp 230 235 240
acc tgg tgc tca cac gtg ctc ttc tcc caa gac act ccc tgaagcgcgt
946Thr Trp Cys Ser His Val Leu Phe Ser Gln Asp Thr Pro
245 250 255 tcccca
952
SEQ ID NO: 23; length: 257; type: PRT; organism: Homo sapiens
Met Ala Pro Trp Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu   16
Asp Ile Ser Gly Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala   32
Ile Ala Gly Ser Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu   48
Lys Val Arg Phe Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu   64
Val Gly Gln Leu Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val   80
Thr Ala Pro Ala Pro Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu   96
Arg Pro Tyr Leu Leu Ile His Asp Gly Val Arg Ser Glu Phe Asp Gln   112
Ile Asp Thr Ser Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu   128
Ser Phe Ser Tyr Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu   144
Leu Glu Lys Pro Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys   160
Glu Thr Ser Gly Leu Met Leu Asp Val Gly Pro Tyr Met Lys Ala Leu   176
Glu Tyr Ala Cys Gly Ile Lys Ala Glu Val Val Gly Lys Pro Ser Pro   192
Glu Phe Phe Lys Ser Ala Leu Gln Ala Ile Gly Val Glu Ala His Gln   208
Ala Val Met Ile Gly Asp Asp Ile Val Gly Asp Val Gly Gly Ala Gln   224
Arg Cys Gly Met Arg Ala Leu Gln Ser Arg Ser Thr Arg Val Ala Arg   240
His Val Leu Trp Thr Trp Cys Ser His Val Leu Phe Ser Gln Asp Thr   256
Pro   257
SEQ ID NO: 24; length: 979; type: DNA; organism: Homo sapiens; feature: CDS (166)..(801)
aagaacttta
aaaatcacct aggtgtgggc cgggcacggt ggctaacgcc tgtaatccca 60gcactttgag
atgctgaggc aggtggatca cgaggtcagg agatcgagac catcctggat 120aacacggaga
aaccccggcg gagctgagga gcagggccgg gcgcc atg gca ccg tgg 177
Met Ala Pro Trp
1 ggc aag cgg ctg gct ggc
gtg cgc ggg gtg ctg ctt gac atc tcg ggc 225Gly Lys Arg Leu Ala Gly
Val Arg Gly Val Leu Leu Asp Ile Ser Gly 5 10
15 20 gtg ctg tac gac agc ggc gcg ggc ggc
ggc acg gcc atc gcc ggc tcg 273Val Leu Tyr Asp Ser Gly Ala Gly Gly
Gly Thr Ala Ile Ala Gly Ser 25 30
35 gtg gag gcg gtg gcc aga ctg aag cgt tcc cgg ctg
aag gtg agg ttc 321Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu
Lys Val Arg Phe 40 45
50 tgc acc aac gag tcg cag aag tcc cgg gca gag ctg gtg ggg cag
ctt 369Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu Val Gly Gln
Leu 55 60 65
cag agg ctg gga ttt gac atc tct gag cag gag gtg acc gcc ccg gca
417Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val Thr Ala Pro Ala
70 75 80 cca gct gcc
tgc cag atc ctg aag gag cga ggc ctg cga cca tac ctg 465Pro Ala Ala
Cys Gln Ile Leu Lys Glu Arg Gly Leu Arg Pro Tyr Leu 85
90 95 100 ctc atc cat gac gga gtc
cgc tca gaa ttt gat cag atc gac aca tcc 513Leu Ile His Asp Gly Val
Arg Ser Glu Phe Asp Gln Ile Asp Thr Ser 105
110 115 aac cca aac tgt gtg gta att gca gac
gca gga gaa agc ttt tct tat 561Asn Pro Asn Cys Val Val Ile Ala Asp
Ala Gly Glu Ser Phe Ser Tyr 120 125
130 caa aac atg aat aac gcc ttc cag gtg ctc atg gag
ctg gaa aaa cct 609Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu
Leu Glu Lys Pro 135 140 145
gtg ctc ata tca ctg gga aaa ggg cgt tac tac aag gag acc tct
ggc 657Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys Glu Thr Ser
Gly 150 155 160
ctg atg ctg gac gtt ggt ccc tac atg aag gcg ctt gag tat gcc tgt
705Leu Met Leu Asp Val Gly Pro Tyr Met Lys Ala Leu Glu Tyr Ala Cys
165 170 175 180 ggc atc
aaa gcc gag gtg gtg ggg aag cct tct cct gag ttt ttc aag 753Gly Ile
Lys Ala Glu Val Val Gly Lys Pro Ser Pro Glu Phe Phe Lys
185 190 195 tct gcc ctg caa gcg
ata gga gtg gaa gcc cac cag tta ctt tca gta 801Ser Ala Leu Gln Ala
Ile Gly Val Glu Ala His Gln Leu Leu Ser Val 200
205 210 tgaaagcaag aagcagaaat gcctgcggct
tttcctgagt ttttgctgct tctctgaaag 861gataagaatt gacaagtcct atcagtgtgt
taatatatct cactggcaag acagtgtaac 921agcaagatta caacaatatg gaggaaataa
taaagtcact cattttgcga cctttata   979
SEQ ID NO: 25; length: 212; type: PRT; organism: Homo sapiens
Met Ala Pro Trp Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu   16
Asp Ile Ser Gly Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala   32
Ile Ala Gly Ser Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu   48
Lys Val Arg Phe Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu   64
Val Gly Gln Leu Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val   80
Thr Ala Pro Ala Pro Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu   96
Arg Pro Tyr Leu Leu Ile His Asp Gly Val Arg Ser Glu Phe Asp Gln   112
Ile Asp Thr Ser Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu   128
Ser Phe Ser Tyr Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu   144
Leu Glu Lys Pro Val Leu Ile Ser Leu Gly Lys Gly Arg Tyr Tyr Lys   160
Glu Thr Ser Gly Leu Met Leu Asp Val Gly Pro Tyr Met Lys Ala Leu   176
Glu Tyr Ala Cys Gly Ile Lys Ala Glu Val Val Gly Lys Pro Ser Pro   192
Glu Phe Phe Lys Ser Ala Leu Gln Ala Ile Gly Val Glu Ala His Gln   208
Leu Leu Ser Val   212
SEQ ID NO: 26; length: 1590; type: DNA; organism: Homo sapiens; feature: CDS (344)..(769)
tgagaaccca gggctcgctg ggatgctgtc
ctccctccct ccctccctct gccaagctgg 60tgggaaggcc cttgccaaag cgcacaagac
cccttcacac agcaggggca caagctcttc 120gaggcagagt cgcctgcagg tggggtggaa
ggggtgcgcc ccagacccaa gaatcgcccg 180cctctccaag accatccctg gctgctggcc
cagtgacgag caccatccgg aagtgaaggc 240tgatgggcac ttcctggccc tgcggggtgc
cctgtctctg gctggagctg atgagccgtg 300gtgtgatgct tgcctggggg tcggtgttgc
ctagaagtga gga atg acc tcc ttc 355
Met Thr Ser Phe
1 acc cct ggc tcc ttc cgg cag cct ggc cac ttc
cag cat gtg gag ggg 403Thr Pro Gly Ser Phe Arg Gln Pro Gly His Phe
Gln His Val Glu Gly 5 10 15
20 gcc cgg gcc ctg gca gag acg gtg tgg aac cgc cgg ccg cat
gcg ggg 451Ala Arg Ala Leu Ala Glu Thr Val Trp Asn Arg Arg Pro His
Ala Gly 25 30 35
ctg gag ttc ggg gct gcc aat ggg cta agg ccc tgt gtt caa cac atg
499Leu Glu Phe Gly Ala Ala Asn Gly Leu Arg Pro Cys Val Gln His Met
40 45 50 tcg cgg aac
tgt cgt gaa aaa aag act cag gac cac agg tcc ggg gtg 547Ser Arg Asn
Cys Arg Glu Lys Lys Thr Gln Asp His Arg Ser Gly Val 55
60 65 ctc cca ggg gtc ttt ccc
acc gag gct ggg agt gag cag cag cct agg 595Leu Pro Gly Val Phe Pro
Thr Glu Ala Gly Ser Glu Gln Gln Pro Arg 70 75
80 ccc agc cac acc tcc cag gct cgg gag
ctg cca ggc cag gca acc gtg 643Pro Ser His Thr Ser Gln Ala Arg Glu
Leu Pro Gly Gln Ala Thr Val 85 90
95 100 aag ggt agc agg gtg ggg ggt gga ggg gag gac agg
gag agc agg cct 691Lys Gly Ser Arg Val Gly Gly Gly Gly Glu Asp Arg
Glu Ser Arg Pro 105 110
115 ctc ggg cct cac cca ccc ttg cac ctt cca gct cta ggg ccc aca
cag 739Leu Gly Pro His Pro Pro Leu His Leu Pro Ala Leu Gly Pro Thr
Gln 120 125 130
tac caa gcc ctg ggc tca atg tac cac tta tgaaaactca attcaaattg
789Tyr Gln Ala Leu Gly Ser Met Tyr His Leu
135 140 gcttagtgca
acaaggtatt tggtggtctc acatgacttg aacatccagg cctggctgtc 849acgggaaccg
catctcttcc cattgcagct acttggcagg tggcgggatg tcccccagcc 909accgacgtcc
ccctgcctgc tccgcaaccc cagggcctgc agaaaaggcc cacgagactc 969agactggcag
agacttaggc ggaccaggaa caggggcgca gtctccgtcc cacccaaacc 1029ctaaccagag
agaacacggc acgttgtgcc agacggagga cggatgccag cgagggtcca 1089tgtcctcact
gccgacaagg ctgggaactg ggccaagtga agcagaggcc tccacgtcag 1149atgtgagcgc
caccggccca ggtgactgca gttcttccct ccttccgttc ggcttgagcc 1209ctccagagga
tcggaaaggc tgaggcctga cctggtgccg ctgtcctggg tgggtctgtc 1269ctgctggtcg
gttcctgccc ctctcgggga ggttggctgg cagctggcag gtggaaagcc 1329tcctgtgttc
acctcagggc agaggtgggg acacagggcg ggacgggcga gtgtggtgcc 1389cctctggggt
gggtgctctt gggtccgcct cccgtgccag agtgcgtgtc aacagttcca 1449gctgcccctc
agaactgtcc tggtttagga ggtgaacaca cggggcagcc tacattctac 1509gtggtttttt
taacattata aaagcagcat gtgttattac aggaaattta acagaagtat 1569ataaaaagaa
accagaagtc a   1590
SEQ ID NO: 27; length: 142; type: PRT; organism: Homo sapiens
Met Thr Ser Phe Thr Pro Gly Ser Phe Arg Gln Pro Gly His Phe Gln   16
His Val Glu Gly Ala Arg Ala Leu Ala Glu Thr Val Trp Asn Arg Arg   32
Pro His Ala Gly Leu Glu Phe Gly Ala Ala Asn Gly Leu Arg Pro Cys   48
Val Gln His Met Ser Arg Asn Cys Arg Glu Lys Lys Thr Gln Asp His   64
Arg Ser Gly Val Leu Pro Gly Val Phe Pro Thr Glu Ala Gly Ser Glu   80
Gln Gln Pro Arg Pro Ser His Thr Ser Gln Ala Arg Glu Leu Pro Gly   96
Gln Ala Thr Val Lys Gly Ser Arg Val Gly Gly Gly Gly Glu Asp Arg   112
Glu Ser Arg Pro Leu Gly Pro His Pro Pro Leu His Leu Pro Ala Leu   128
Gly Pro Thr Gln Tyr Gln Ala Leu Gly Ser Met Tyr His Leu   142
SEQ ID NO: 28; length: 791; type: DNA; organism: Homo sapiens; feature: CDS (166)..(672)
aagaacttta aaaatcacct aggtgtgggc cgggcacggt
ggctaacgcc tgtaatccca 60gcactttgag atgctgaggc aggtggatca cgaggtcagg
agatcgagac catcctggat 120aacacggaga aaccccggcg gagctgagga gcagggccgg
gcgcc atg gca ccg tgg 177
Met Ala Pro Trp 1
ggc aag cgg ctg gct ggc gtg cgc ggg gtg ctg ctt gac atc
tcg ggc 225Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu Asp Ile
Ser Gly 5 10 15
20 gtg ctg tac gac agc ggc gcg ggc ggc ggc acg gcc atc gcc ggc tcg
273Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala Ile Ala Gly Ser
25 30 35 gtg gag gcg
gtg gcc aga ctg aag cgt tcc cgg ctg aag gtg agg ttc 321Val Glu Ala
Val Ala Arg Leu Lys Arg Ser Arg Leu Lys Val Arg Phe 40
45 50 tgc acc aac gag tcg cag
aag tcc cgg gca gag ctg gtg ggg cag ctt 369Cys Thr Asn Glu Ser Gln
Lys Ser Arg Ala Glu Leu Val Gly Gln Leu 55
60 65 cag agg ctg gga ttt gac atc tct gag
cag gag gtg acc gcc ccg gca 417Gln Arg Leu Gly Phe Asp Ile Ser Glu
Gln Glu Val Thr Ala Pro Ala 70 75
80 cca gct gcc tgc cag atc ctg aag gag cga ggc ctg
cga cca tac ctg 465Pro Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu
Arg Pro Tyr Leu 85 90 95
100 ctc atc cat gac gga gtc cgc tca gaa ttt gat cag atc gac aca
tcc 513Leu Ile His Asp Gly Val Arg Ser Glu Phe Asp Gln Ile Asp Thr
Ser 105 110 115
aac cca aac tgt gtg gta att gca gac gca gga gaa agc ttt tct tat
561Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu Ser Phe Ser Tyr
120 125 130 caa aac atg
aat aac gcc ttc cag gtg ctc atg gag ctg gaa aaa cct 609Gln Asn Met
Asn Asn Ala Phe Gln Val Leu Met Glu Leu Glu Lys Pro 135
140 145 gtg ctc ata tca ctg gga
aaa ggc atc cac ggg ttc ctc ttt gtg gct 657Val Leu Ile Ser Leu Gly
Lys Gly Ile His Gly Phe Leu Phe Val Ala 150 155
160 aga aca ttc acg tcc taagtggggc
tgccgttggc tgggttaaac cagtaaccag 712Arg Thr Phe Thr Ser
165
taacaggaga aatacgtcgc atctctagtg tgtggagatt
accgggcctt tatattaaaa 772aaaaatttta agtgcttat
791
SEQ ID NO: 29; length: 169; type: PRT; organism: Homo sapiens
Met Ala Pro Trp Gly Lys Arg Leu Ala Gly Val Arg Gly Val Leu Leu   16
Asp Ile Ser Gly Val Leu Tyr Asp Ser Gly Ala Gly Gly Gly Thr Ala   32
Ile Ala Gly Ser Val Glu Ala Val Ala Arg Leu Lys Arg Ser Arg Leu   48
Lys Val Arg Phe Cys Thr Asn Glu Ser Gln Lys Ser Arg Ala Glu Leu   64
Val Gly Gln Leu Gln Arg Leu Gly Phe Asp Ile Ser Glu Gln Glu Val   80
Thr Ala Pro Ala Pro Ala Ala Cys Gln Ile Leu Lys Glu Arg Gly Leu   96
Arg Pro Tyr Leu Leu Ile His Asp Gly Val Arg Ser Glu Phe Asp Gln   112
Ile Asp Thr Ser Asn Pro Asn Cys Val Val Ile Ala Asp Ala Gly Glu   128
Ser Phe Ser Tyr Gln Asn Met Asn Asn Ala Phe Gln Val Leu Met Glu   144
Leu Glu Lys Pro Val Leu Ile Ser Leu Gly Lys Gly Ile His Gly Phe   160
Leu Phe Val Ala Arg Thr Phe Thr Ser   169
SEQ ID NO: 30; length: 659; type: DNA; organism: Homo sapiens
catgacggag tccgctcaga atttgatcag atcgacacat ccaacccaaa ctgtgtggta   60
attgcagacg caggagaaag cttttcttat caaaacatga ataacgcctt ccaggtgctc   120
atggagctgg aaaaacctgt gctcatatca ctgggaaaag gagccgctca acacgtgttg   180
caaggcacgt gctctggacc tggtgctcac acgtgctctt ctcccaagac actccctgaa   240
gcgcgttccc caggagctgg aagggggata acgcaggccc agtgacgagc accatccgga   300
agtgaaggct gatggtgaag agcctggaag aaacccaggg gcctcctatg cacagatctt   360
gcagcccaga accaagtcag cttccctgcg actgcccagg cacactgccc atcaccccac   420
ccccgaaaca atgccagccc gctgcttttt ctatcctccc agtcaccttt gcagacaaag   480
accaggggca gctcccgagg gcactgtgaa ggctcccatg ccacacagtg agaactgtag   540
cctctgcgtc caaggcacac agggtacttt ctggacccac tgctggacag acttgaaggt   600
gtcatgcccg gtgtgtgcag gaggaaacta acagttcagt aaactctgcc ttgaccagc   659
SEQ ID NO: 31; length: 2294; type: DNA; organism: Homo sapiens; feature: CDS (311)..(805)
31tgtgccattg gcgatgcggg gaggctggcc ccatcgaagg ctggtgggac tggtggagac
60tcctgtccac tgctcagcac taggcctgca gcagacacca tgagccccaa acttcccaaa
120gcccttcccc agtcccacaa gatggtgtct gcggaccgtg ctcgtgagag atggcagcca
180ggcagtcccc acagggcacc cattttcagc tgcccccgct tctcagacaa ggaaactgag
240gccagaaagc caggtggccc aggagctggc ttccccattt cctgctcctg tgggccccac
300tgcagtgccc atg ggc cgg gct gat att acc cga gac ttc gga gct ctc
349 Met Gly Arg Ala Asp Ile Thr Arg Asp Phe Gly Ala Leu
1 5 10 acg ggt
gcg agt aat tta ggc tgc atg gac aca agc tgc tgg ctt gag 397Thr Gly
Ala Ser Asn Leu Gly Cys Met Asp Thr Ser Cys Trp Leu Glu 15
20 25 tcg ccc cgt tat gaa
tgt gtg tgg gtc tgt gcc cct ttc atg tgc tgc 445Ser Pro Arg Tyr Glu
Cys Val Trp Val Cys Ala Pro Phe Met Cys Cys 30 35
40 45 cac agg gcc cac gag tgt gct gaa
agg gaa gga cac ggc caa ggg gcc 493His Arg Ala His Glu Cys Ala Glu
Arg Glu Gly His Gly Gln Gly Ala 50
55 60 atg gtg gac agg aga cct tct tgg ggg ttc ggt
ggt gtc ctt gac ccc 541Met Val Asp Arg Arg Pro Ser Trp Gly Phe Gly
Gly Val Leu Asp Pro 65 70
75 act ctg act gag cac tgc ccc aag gca ctg cca ttc cag gcc
ccc ttc 589Thr Leu Thr Glu His Cys Pro Lys Ala Leu Pro Phe Gln Ala
Pro Phe 80 85 90
cct gag cct ccc acc cca ggc cca ccc acc tgc tgg gtc ctc cca cct
637Pro Glu Pro Pro Thr Pro Gly Pro Pro Thr Cys Trp Val Leu Pro Pro
95 100 105 gcg ggg ccc
gcc atg cgg ggt cac cat gcg agt ctc acc atg cag ggt 685Ala Gly Pro
Ala Met Arg Gly His His Ala Ser Leu Thr Met Gln Gly 110
115 120 125 cac cac acg agt ctc acc
atg cag ggt cac cac gcg agt ctc acc atg 733His His Thr Ser Leu Thr
Met Gln Gly His His Ala Ser Leu Thr Met 130
135 140 cag ggt cac cat gcg ggg tca cca tgt
ggg gtc acc atg cgg ggc tca 781Gln Gly His His Ala Gly Ser Pro Cys
Gly Val Thr Met Arg Gly Ser 145 150
155 cca tgt ggg gct tca gga gct tgc tgagcaccct
ccccacccac ggtcactctc 835Pro Cys Gly Ala Ser Gly Ala Cys
160 165
cctggggtct gtaagcctcc ctgggcctga gcagctccca gccttgctgc
tgcctttcca 895cttcctggca gtgaggtctc ctgggtgcct tctctcagcc ctttgggatg
ttttttgtga 955ggaagggagg ctttgatgct gtggagcatc tgtagtgccc actccagtgg
cttcacagga 1015gcagcaggct gtttgttctg agctgttcca ccttgtgcct gccagagggg
agatagtgga 1075caggcctccc tccccccaag tggtggggtg gaccccctgc ccgctgtggc
cccatacctg 1135ggggccacac accactgccc tgggccgtgc agctgctatg aagagtgtgc
tgctgagacc 1195ctggaagaga cggaggatga aattgtgttg ccagatagtc catttgttgt
tctgagactc 1255gcatgcctgg gagaatcctg ggaattaact agctccttct ctcccatccc
attttacaga 1315aaagtgagac ccaaggtggt ttctgacttg cccagaggtc ataactgctt
ggacagtcat 1375ggtcctcaga gcccacgttt gctgaccagt gcaggctctc acagccactc
agctcctgca 1435gccgtggcgt ggcagaggag ggaagcactt cctgggattt atgctgcctc
cctgacattt 1495caaggccctt catttctcta aatattggag gagttgaatt atttttagtt
gagcctcaag 1555ggatcagaga ataagcttgc agcaacgttg gcagatgggc ttcttctagc
agagagtggt 1615tattcggggc ctcttattga gagaatcggg tgatttgagg aaatctgggg
tgtcctgagg 1675cataccagag gacccccaag tttttcctgt ggctcgtctg ccatcaggaa
accaaaatga 1735ctcccctcgt cctgagctct ccagggtgtg gacctggaat gcttaagggg
aggcaatggc 1795atatctttaa gatgagcaca gctccggagc cactcgagca cccaaggcca
cgtcctgctc 1855agggcacttc gggcctcagt ttccttatct ttaaaatgga cagagttggc
cgggtgaggt 1915ggccctgcct gtaatcccag cactttggga ggccaaggct ggcagattgc
ttgagcccag 1975gagtttgaag ccagcctggg caacatggcg aaaccccatc tctactacaa
gtacaaaaat 2035ttggccgggc atggtggctc atgcttgtaa tcccagcact ttgggaggcc
aaggagagcg 2095gatcacttga ggccagaagc tcgagaccgc ctctactaaa aatacaaaaa
ttagccaggc 2155gtggtggctc acgcctgtaa tctcagctac tcgggtggct gaggcaggag
aatcacttga 2215acctgggaag tagaggttgc agtgagctga gatcgtgcca ctgcactcta
gcctgggcga 2275cagagcaaaa ccctgtctc
2294
SEQ ID NO: 32; length: 165; type: PRT; organism: Homo sapiens
Met Gly Arg Ala Asp Ile Thr Arg Asp Phe Gly Ala Leu Thr Gly Ala   16
Ser Asn Leu Gly Cys Met Asp Thr Ser Cys Trp Leu Glu Ser Pro Arg   32
Tyr Glu Cys Val Trp Val Cys Ala Pro Phe Met Cys Cys His Arg Ala   48
His Glu Cys Ala Glu Arg Glu Gly His Gly Gln Gly Ala Met Val Asp   64
Arg Arg Pro Ser Trp Gly Phe Gly Gly Val Leu Asp Pro Thr Leu Thr   80
Glu His Cys Pro Lys Ala Leu Pro Phe Gln Ala Pro Phe Pro Glu Pro   96
Pro Thr Pro Gly Pro Pro Thr Cys Trp Val Leu Pro Pro Ala Gly Pro   112
Ala Met Arg Gly His His Ala Ser Leu Thr Met Gln Gly His His Thr   128
Ser Leu Thr Met Gln Gly His His Ala Ser Leu Thr Met Gln Gly His   144
His Ala Gly Ser Pro Cys Gly Val Thr Met Arg Gly Ser Pro Cys Gly   160
Ala Ser Gly Ala Cys   165
SEQ ID NO: 33; length: 3350; type: DNA; organism: Homo sapiens; feature: CDS (192)..(776)
tttttttttt tttctgctta taagtttatt caatgcaaaa
taaccctcac cagttttact 60gaggtggctg accatgtcca cgaccaaata cgcctgtaaa
ctgaaattcg gttgctgacc 120cattcccagc ctcagctttc tcactggcac cagggggaca
gcactccatc tgtgggtgtc 180tctttctctc t atg gct gtc tgt ctg tgg gtg tct
ctc tct gtc tgt ggg 230 Met Ala Val Cys Leu Trp Val Ser
Leu Ser Val Cys Gly 1 5
10 tgt ctt tcg cca tct gtg ggt atc tct ctc tgt ctg tgg
gta tct ctc 278Cys Leu Ser Pro Ser Val Gly Ile Ser Leu Cys Leu Trp
Val Ser Leu 15 20 25
cca tct gtg ggt gtc cat ctc tgt ctt tgg gtg tct ctc ttt gtg agt
326Pro Ser Val Gly Val His Leu Cys Leu Trp Val Ser Leu Phe Val Ser
30 35 40 45 gtc tct
gtc tgt ggt tgt ctc tgt ctg tgg gtg tct ctc tgt gag tgt 374Val Ser
Val Cys Gly Cys Leu Cys Leu Trp Val Ser Leu Cys Glu Cys
50 55 60 ccc tgt gag tgt ctc
tgt ctg tgg gtg tct ctc ctc gtc tgt ggg tat 422Pro Cys Glu Cys Leu
Cys Leu Trp Val Ser Leu Leu Val Cys Gly Tyr 65
70 75 ctc tcc ctg tct gtg ggt gtc tct
gtt ggc ttc ccc act tgt ggg tct 470Leu Ser Leu Ser Val Gly Val Ser
Val Gly Phe Pro Thr Cys Gly Ser 80 85
90 tgc agg tcg gtc acg ctc cag acc ttt agg ccg
cag cct gcc agt ctc 518Cys Arg Ser Val Thr Leu Gln Thr Phe Arg Pro
Gln Pro Ala Ser Leu 95 100 105
cag acc gct gtg gca tgg ggt agc aga cac gct ctc cag ggg
cag atg 566Gln Thr Ala Val Ala Trp Gly Ser Arg His Ala Leu Gln Gly
Gln Met 110 115 120
125 gtg gta atc gca gag att ctg gat ccc cat gtg ggt gag gta cca gta
614Val Val Ile Ala Glu Ile Leu Asp Pro His Val Gly Glu Val Pro Val
130 135 140 gaa atg tct
cca ggc aaa ctc ctt cct gca acc tca gga cct gag aga 662Glu Met Ser
Pro Gly Lys Leu Leu Pro Ala Thr Ser Gly Pro Glu Arg 145
150 155 ctg cct ggc ctt cat gac
gtg aag gtt ggg cac att ctc atc tgc cag 710Leu Pro Gly Leu His Asp
Val Lys Val Gly His Ile Leu Ile Cys Gln 160
165 170 ctc cgg gtc tta ggc agg tgg aca ttc
ttc ttg gct acc gtg act ccc 758Leu Arg Val Leu Gly Arg Trp Thr Phe
Phe Leu Ala Thr Val Thr Pro 175 180
185 tcc tta aaa agg agt tca taaatagcaa tctggttctt
cttaggcatc 806Ser Leu Lys Arg Ser Ser
190 195
aacatctctg cagctgtagg gtccaggtcc ggggctggaa agcatgattt
ttttctaact 866gatctctgct gatggcatct agattgttcc tggtttttca ccataccagg
gctgtgatga 926gcatcttggt gcatttcgga tgacgtctcc agatacagtt acagaacgag
tatttttgag 986gttcttgagg catgttgcca agttgtttcc agaaagctgc acagacttat
tctgcacagc 1046ctagaattct agaatcacag ggttctgcac aacctagagt tctggaatca
cagggttctg 1106cacagctaga attctagaat cacagggttc tgcacagcta gaattctaga
atcacagggt 1166tctgcacaac ctagagttct ggaatcacag ggttctgcac agcctagagt
tctggaatca 1226cagggttctg cacagctaga attctagaat cacagggttc tgcacagcct
agagttctgg 1286aatcacaggg ttctgcacag cctagagttt tggaatcaca gggttctgca
cagctagaat 1346tctagaatca cagggttctg cacagctaga attctagaat cacagggttc
tacacagcta 1406gaattctaga atcacaggct cccagggttg caaggacact ttggagtgtc
tacctcagca 1466tctcatgaag tgtgggaatt ccgaggcggt ggcggaggaa gtgttttcca
tcttcggtgc 1526tttcgttgct tctggtgaca gcgctcactg cctctgcttg ctgtacggga
ccagctgatg 1586gaaccgacag ggagggactt tttatctggc cattggccac tgccacacac
tttgtgtacc 1646ccgttttgtg taattctgac tacaaccttg tgggatctag gcaggtcatt
gctgttttgc 1706aagtggggtt gttgaagcca caggagatga aataagctgc tgtcccccag
ccattgagtg 1766ctgataggat caggagtgcc agttggtgtg gctgacccca gaccctgtgc
gtgttacctc 1826taagctacat tctagagcag actttttgcc cacacaagcc ttaaatgtgg
gctggggaca 1886gtggctcacg ccggtaatcc cagcactttg ggaggacaag gtgggcagat
cacctgaggc 1946caggggttca agaccaggct ggccaacatg gtgaaaccct gtctctacta
aaaatacaaa 2006aattagccag gtgtggtggt gcgtgcctat agtcccagct actcgggagg
ctgacgcatg 2066agaattgctt gaacctggga ggcagaggtt gcagtaagtc aagactgcgc
cattgcactc 2126tagcctgggc gacagagcaa gactccatct cgaaaaaaac aacaaaacct
taaatgtatt 2186tttgaggctg tgtttaaaaa tggggatatt ttacacaaaa tatccagatt
tctggattct 2246tttgaagaat cagaagatct gacaatacgg agcctcacat tcctgcacac
acagcagcca 2306tcgctggagc cactgcctcc attagtttga atttactgca gaccccactc
ctccctgtcg 2366tccctgtctc cagaccacag agttagttgt cattgatcgt gtgccatttg
ttgttttttt 2426caaagtagag aagtacttct tcacgctgtg tctctatcaa aaatggacaa
gtgaaagatg 2486tttcaagaaa tgaaaagatt ttctttttag tgacaaaaaa tttctagtat
gtttctcata 2546taaataaaat gtgtcctgta tgtagtcagg gttcctcaga gaagcccgaa
gctacaggat 2606atagatatgt agagagattg tggaggcttg gcgagtccaa aatctgcagg
gcagggctgg 2666caggctgggg actcaggaat gcgcgcagca gagtcgtaag gctgtgtgct
ggtgggattc 2726ttgctcgggg aaggtcagtc tttgttcttg taaagcctgc aactggttgg
atgtggtcca 2786cccacattgc ggaagggaat gtactctcct cctagttcac cgatttaaat
gttaatctca 2846tccaaaaaca ccttcacaga aacatccaga ataatgtttg accacatatc
tgggcaccgt 2906ggcccagcca agttgacata ttaaattaac ccttgtagtc cctttttaaa
cttacaccca 2966ttgcaattta ggtcgctgct atggagcaag ccacagaacc tggcctctta
actcatttac 3026ccgggctgac ccattaggcc tttgagtcac caacacctca ctagagaaca
agcataatga 3086agaagctctg ctgtaattcg ttaatgttaa cacttttttc tttaaagatg
tctcatgctg 3146agcttcgtgg cacacgccta taatcccagc actttgggag gctgagatga
gaggatggct 3206tgagctcaga ggttcgagac cagcctgggc agcatagtaa gattccgtct
ctacaaaaaa 3266gaaaagaaaa aaagttgtct cataattatt aaaaaccact attccagatc
atggataata 3326atagtcagaa caggtatatt gttg
3350
SEQ ID NO: 34; length: 195; type: PRT; organism: Homo sapiens
Met Ala Val Cys Leu Trp Val Ser Leu Ser Val Cys Gly Cys Leu Ser   16
Pro Ser Val Gly Ile Ser Leu Cys Leu Trp Val Ser Leu Pro Ser Val   32
Gly Val His Leu Cys Leu Trp Val Ser Leu Phe Val Ser Val Ser Val   48
Cys Gly Cys Leu Cys Leu Trp Val Ser Leu Cys Glu Cys Pro Cys Glu   64
Cys Leu Cys Leu Trp Val Ser Leu Leu Val Cys Gly Tyr Leu Ser Leu   80
Ser Val Gly Val Ser Val Gly Phe Pro Thr Cys Gly Ser Cys Arg Ser   96
Val Thr Leu Gln Thr Phe Arg Pro Gln Pro Ala Ser Leu Gln Thr Ala   112
Val Ala Trp Gly Ser Arg His Ala Leu Gln Gly Gln Met Val Val Ile   128
Ala Glu Ile Leu Asp Pro His Val Gly Glu Val Pro Val Glu Met Ser   144
Pro Gly Lys Leu Leu Pro Ala Thr Ser Gly Pro Glu Arg Leu Pro Gly   160
Leu His Asp Val Lys Val Gly His Ile Leu Ile Cys Gln Leu Arg Val   176
Leu Gly Arg Trp Thr Phe Phe Leu Ala Thr Val Thr Pro Ser Leu Lys   192
Arg Ser Ser   195
3521DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 35cacgtaccca tcagccttca c
213621DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 36cctgtggaag gagcatacag t
213722DNAArtificial SequenceDescription of Artificial
Sequence Synthetic probe 37cccagtgacg agcaccatcc gg
223820DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 38caacactggc acctgcagat
203917DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
39ccaccccatg ccatcaa
174022DNAArtificial SequenceDescription of Artificial Sequence Synthetic
probe 40aagtggcaga gcagccccca gc
224119DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 41cccgcctctc caagaccat
194222DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 42ggtacactca tgtccccacc at
224318DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 43gcgcaccggg aagttcag
184420DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
44tgcaagcgat aggagtggaa
204521DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 45ggttgtccac gtacccatca g
214622DNAArtificial SequenceDescription of Artificial Sequence
Synthetic probe 46cccaccaggc ccagtgacga gc
224718DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 47gcgcaccggg aagttcag
184827DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 48tgaagaacaa aacagaatga gaatgtg
274931DNAArtificial
SequenceDescription of Artificial Sequence Synthetic probe
49ccagctggag tcatttattc accttccttc c
315027DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 50ccaccagtta ctttcagtat gaaagca
275126DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 51tatcctttca gagaagcagc aaaaac
265224DNAArtificial SequenceDescription of Artificial
Sequence Synthetic probe 52cagaaatgcc tgcggctttt cctg
245319DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 53cgccccagac ccaagaatc
195420DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
54caggaagtgc ccatcagcct
205523DNAArtificial SequenceDescription of Artificial Sequence Synthetic
probe 55cccgcctctc caagaccatc cct
235620DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 56agcaccatcc ggaagtgaag
205722DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 57gctgcaagat ctgtgcatag ga
225829DNAArtificial SequenceDescription of
Artificial Sequence Synthetic probe 58ctgatggtga agagcctgga
agaaaccca 295922DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
59caatttaggt cgctgctatg ga
226022DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 60tggtgactca aaggcctaat gg
226127DNAArtificial SequenceDescription of Artificial Sequence
Synthetic probe 61cctggcctct taactcattt acccggg
276219DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 62tggaggcagc ctcgcttta
196322DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 63ttggaggaag agttctcatg ca
226422DNAArtificial
SequenceDescription of Artificial Sequence Synthetic probe
64cccgcagaac ctccacgctg tt
226523DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 65tctcccactg tatgctcctt cca
236622DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 66ctctgccact tcatctgcag gt
226721DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 67cacgtaccca tcagccttca c
216824DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 68gaatctccca aatcccagaa ctca
246924DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
69acaccgggca tgacaccttc aagt
2470137DNAArtificial SequenceDescription of Artificial Sequence Synthetic
polynucleotide 70aagaacttta aaaatcacct aggtgtgggc cgggcacggt
ggctaacgcc tgtaatccca 60gcactttgag atgctgaggc aggtggatca cgaggtcagg
agatcgagac catcctggat 120aacacggaga aaccccg
137

<210> 71
<211> 1203
<212> DNA
<213> Artificial Sequence
<223> Description of Artificial Sequence Synthetic polynucleotide
<400> 71
tgcattcggc tgacagagca tgagggggag gaatcactga tgacaggcac tggcctgccc 60
agctgggggc ctttgtttat tcatttggtg ggcacttcct gggtgcctgc tctgggtcag 120
gcctgtgggg gggaccactg agggcaggaa acctggcctg tccctccagg aagcgaagtc 180
aacactggca cctgcagatg aagtggcaga gcagccccca gctttgatgg catggggtgg 240
ttggggggca cattctgcat gctcagaaga gagagcaact cgccctgtgg aaggagcata 300
cagtgggaga tggggacagg cccagtgacg agcaccatcc ggaagtgaag gctgatgggt 360
acgtggacaa cctcgcagag gcagtggacc tgctgctgca gcacgccgac aagtgatggc 420
ctcctgggag agccccgcct cctccacccc tgcctctcct ccacccctgc ctcccctcca 480
cccctgcctc tcctccaccc gcccaggaga gccccacctc ctccacccct gcctctcctc 540
cacccctgcc tcccctccac ctgccccagt gcccagacca accaaggccc tgacagccct 600
gccttctgcc ctctgccctg catgggcagg catttgttcc ctacctgggt ggcctgctcc 660
cctgcctggg ccctgacttc agctccctgt agtgaagtcc aggagggtgg gacaggcctg 720
tcaggcctct gggaatctcc caaatcccag aactcaccac tcaccatggg cctttaaatg 780
cagtaaactc cacctaacca gattcagggg cactatgccc actgcctcct cttcagactc 840
tttgcatttc agtgaagagc ctggaagaaa cccaggggcc tcctatgcac agatcttgca 900
gcccagaacc aagtcagcct ccctgcgact gcccaggcac actgcccacc accccacccc 960
cgaaacaatg ccagcccgct gctttttcta tcctcccagt cacctttgca gacaaagacc 1020
aggggcagct cccgagggca ctgtgaaggc tcccatgcca cacagtgaga actgtagcct 1080
ctgcgtccaa ggcacacagg gtactttctg gacccactgc tggacagact tgaaggtgtc 1140
atgcccggtg tgtgcaggag gaaactaaca gttcagtaaa ctctgccttg accagcagcc 1200
ttt
1203

<210> 72
<211> 1143
<212> DNA
<213> Artificial Sequence
<223> Description of Artificial Sequence Synthetic polynucleotide
<400> 72
gcatgagggg gaggaatcac tgatgacagg cactggcctg cccagctggg ggcctttgtt 60
tattcatttg gtgggcactt cctgggtgcc tgctctgggt caggcctgtg ggggggacca 120
ctgagggcag gaaacctggc ctgtccctcc aggaagcgaa gtcaacactg gcacctgcag 180
atgaagtggc agagcagccc ccagctttga tggcatgggg tggttggggg gcacattctg 240
catgctcaga agagagagca actcgccctg tggaaggagc atacagtggg agatggggac 300
aggcccagtg acgagcacca tccggaagtg aaggctgatg ggtacgtgga caacctcgca 360
gaggcagtgg acctgctgct gcagcacgcc gacaagtgat ggcctcctgg gagagccccg 420
cctcctccac ccctgcctct cctccacccc tgcctcccct ccacccctgc ctctcctcca 480
cccgcccagg agagccccac ctcctccacc cctgcctctc ctccacccct gcctcccctc 540
cacctgcccc agtgcccaga ccaaccaagg ccctgacagc cctgccttct gccctctgcc 600
ctgcatgggc aggcatttgt tccctacctg ggtggcctgc tcccctgcct gggccctgac 660
ttcagctccc tgtagtgaag tccaggaggg tgggacaggc ctgtcaggcc tctgggaatc 720
tcccaaatcc cagaactcac cactcaccat gggcctttaa atgcagtaaa ctccacctaa 780
ccagattcag gggcactatg cccactgcct cctcttcaga ctctttgcat ttcagtgaag 840
agcctggaag aaacccaggg gcctcctatg cacagatctt gcagcccaga accaagtcag 900
cctccctgcg actgcccagg cacactgccc accaccccac ccccgaaaca atgccagccc 960
gctgcttttt ctatcctccc agtcaccttt gcagacaaag accaggggca gctcccgagg 1020
gcactgtgaa ggctcccatg ccacacagtg agaactgtag cctctgcgtc caaggcacac 1080
agggtacttt ctggacccac tgctggacag acttgaaggt gtcatgcccg gtgtgtgcag 1140
gag
1143

<210> 73
<211> 1129
<212> DNA
<213> Artificial Sequence
<223> Description of Artificial Sequence Synthetic polynucleotide
<400> 73
gcacttcctg ggtgcctgct ctgggtcagg cctgtggggg ggaccactga gggcaggaaa 60
cctggcctgt ccctccagga agcgaagtca acactggcac ctgcagatga agtggcagag 120
cggcccccag ctttgatggc atggggtggt tggggggcac attctgcatg ctcagaagag 180
agagcaactc gccctgtgga aggagcatac agtgggagat ggggacaggc ccagtgacga 240
gcaccatccg gaagtgaagg ctgatgggta cgtggacaac ctcgcagagg cagtggacct 300
gctgctgcag cacgccgaca agtgatggcc tcctgggaga gccccgcctc ctccacccct 360
gcctctcctc cacccctgcc tcccctccac ccctgcctct cctccacccg cccaggagag 420
ccccacctcc tccacccctg cctctcctcc acccctgcct cccctccacc tgccccagtg 480
cccagaccaa ccaaggccct gacagccctg ccttctgccc tctgccctgc atgggcaggc 540
atttgttccc tacctgggtg gcctgctccc ctgcctgggc cctgacttca gctccctgta 600
gtgaagtcca ggagggtggg acaggcctgt caggcctctg ggaatctccc aaatcccaga 660
actcaccact caccatgggc ctttaaatgc agtaaactcc acctaaccag attcaggggc 720
actatgccca ctgcctcctc ttcagactct ttgcatttca gtgaagagcc tggaagaaac 780
ccaggggcct cctatgcaca gatcttgcag cccagaacca agtcagcctc cctgcgactg 840
cccaggcaca ctgcccacca ccccaccccc gaaacaatgc cagcccgctg ctttttctat 900
cctcccagtc acctttgcag acaaagacca ggggcagctc ccgagggcac tgtgaaggct 960
cccatgccac acagtgagaa ctgtagcctc tgcgtccaag gcacacaggg tactttctgg 1020
acccactgct ggacagactt gaaggtgtca tgcccggtgt gtgcaggagg aaactaacag 1080
ttcagtaaac tctgccttga ccagcaaaaa
aaaaaaaaaa aaaaaaaaa 1129

<210> 74
<211> 1083
<212> DNA
<213> Artificial Sequence
<223> Description of Artificial Sequence Synthetic polynucleotide
<400> 74
ctgagggcag gaaacctggc ctgtccctcc aggaagcgaa gtcaacactg gcacctgcag 60
atgaagtggc agagcagccc ccagctttga tggcatgggg tggttggggg gcacattctg 120
catgctcaga agagagagca actcgccctg tggaaggagc atacagtggg agatggggac 180
aggcccagtg acgagcacca tccggaagtg aaggctgatg ggtacgtgga caacctcgca 240
gaggcagtgg acctgctgct gcagcacgcc gacaagtgat ggcctcctgg gagagccccg 300
cctcctccac ccctgcctct cctccacccc tgcctcccct ccacccctgc ctctcctcca 360
cccgcccagg agagccccac ctcctccacc cctgcctctc ctccacccct gcctcccctc 420
cacctgcccc agtgcccaga ccaaccaagg ccctgacagc cctgccttct gccctctgcc 480
ctgcatgggc aggcatttgt tccctacctg ggtggcctgc tcccctgcct gggccctgac 540
ttcagctccc tgtagtgaag tccaggaggg tgggacaggc ctgtcaggcc tctgggaatc 600
tcccaaatcc cagaactcac cactcaccat gggcctttaa atgcagtaaa ctccacctaa 660
ccagattcag gggcactatg cccactgcct cctcttcaga ctctttgcat ttcagtgaag 720
agcctggaag aaacccaggg gcctcctatg cacagatctt gcagcccaga accaagtcag 780
cctccctgcg actgcccagg cacactgccc accaccccac ccccgaaaca atgccagccc 840
gctgcttttt ctatcctccc agtcaccttt gcagacaaag accaggggca gctcccgagg 900
gcactgtgaa ggctcccatg ccacacagtg agaactgtag cctctgcgtc caaggcacac 960
agggtacttt ctggacccac tgctggacag acttgaaggt gtcatgcccg gtgtgtgcag 1020
gaggaaacta acagttcagt aaactctgcc ttgaccagca aaaaaaaaaa aaaaaaaaaa 1080
aaa
1083

<210> 75
<211> 999
<212> DNA
<213> Artificial Sequence
<223> Description of Artificial Sequence Synthetic polynucleotide
<400> 75
tacagtggga gatggggaca ggcccagtga cgagcaccat ccggaagtga aggctgatgg 60
gtacgtggac aacctcgcag aggcagtgga cctgctgctg cagcacgccg acaagtgatg 120
gcctcctggg agagccccgc ctcctccacc cctgcctctc ctccacccct gcctcccctc 180
cacccctgcc tctcctccac ccgcccagga gagccccacc tcctccaccc ctgcctctcc 240
tccacccctg cctcccctcc acctgcccca gtgcccagac caaccaaggc cctgacagcc 300
ctgccttctg ccctctgccc tgcatgggca ggcatttgtt ccctacctgg gtggcctgct 360
cccctgcctg ggccctgact tcagctccct gtagtgaagt ccaggagggt gggacaggcc 420
tgtcaggcct ctgggaatct cccaaatccc agaactcacc actcaccatg ggcctttaaa 480
tgcagtaaac tccacctaac cagattcagg ggcactatgc ccactgcctc ctcttcagac 540
tctttgcatt tcagtgaaga gcctggaaga aacccagggg cctcctatgc acagatcttg 600
cagcccagaa ccaagtcagc ctccctgcga ctgcccaggc acactgccca ccaccccacc 660
cccgaaacaa tgccagcccg ctgctttttc tatcctccca gtcacctttg cagacaaaga 720
ccaggggcag ctcccgaggg cactgtgaag gctcccatgc cacacagtga gaactgtagc 780
ctctgcgtcc aaggcacaca gggtactttc tggacccact gctggacaga cttgaaggtg 840
tcatgcccag tgtgtgcagg aggaaactaa cagttcagta aactctgcct tgaccagcaa 900
aaaaaaaaaa aaaaaaaaaa aactcgaggc atctatgtcg ggtgcggaga aagaggtaat 960
gaaatggcac atggtcatag ctgtttcctg
acccagctt 999

<210> 76
<211> 20
<212> PRT
<213> Homo sapiens
<400> 76
Asp Gly Tyr Val Asp Asn Leu Ala Glu Ala Val Asp Leu Leu Leu Gln
1               5                   10                  15
His Ala Asp Lys
            20

<210> 77
<211> 58
<212> PRT
<213> Homo sapiens
<400> 77
Trp Pro Pro Gly Arg Ala Pro Pro Pro Pro Pro Leu Pro Leu Leu His
1               5                   10                  15
Pro Cys Leu Pro Ser Thr Pro Ala Ser Pro Pro Pro Ala Gln Glu Ser
            20                  25                  30
Pro Thr Ser Ser Thr Pro Ala Ser Pro Pro Pro Leu Pro Pro Leu His
        35                  40                  45
Leu Pro Gln Cys Pro Asp Gln Pro Arg Pro
    50                  55

<210> 78
<211> 32
<212> PRT
<213> Homo sapiens
<400> 78
Gln Pro Cys Leu Leu Pro Ser Ala Leu His Gly Gln Ala Phe Val Pro
1               5                   10                  15
Tyr Leu Gly Gly Leu Leu Pro Cys Leu Gly Pro Asp Phe Ser Ser Leu
            20                  25                  30

<210> 79
<211> 26
<212> PRT
<213> Homo sapiens
<400> 79
Ser Pro Gly Gly Trp Asp Arg Pro Val Arg Pro Leu Gly Ile Ser Gln
1               5                   10                  15
Ile Pro Glu Leu Thr Thr His His Gly Pro
            20                  25

<210> 80
<211> 13
<212> PRT
<213> Homo sapiens
<400> 80
Tyr Ala His Cys Leu Leu Phe Arg Leu Phe Ala Phe Gln
1               5                   10

<210> 81
<211> 70
<212> PRT
<213> Homo sapiens
<400> 81
Arg Ala Trp Lys Lys Pro Arg Gly Leu Leu Cys Thr Asp Leu Ala Ala
1               5                   10                  15
Gln Asn Gln Val Ser Leu Pro Ala Thr Ala Gln Ala His Cys Pro Pro
            20                  25                  30
Pro His Pro Arg Asn Asn Ala Ser Pro Leu Leu Phe Leu Ser Ser Gln
        35                  40                  45
Ser Pro Leu Gln Thr Lys Thr Arg Gly Ser Ser Arg Gly His Cys Glu
    50                  55                  60
Gly Ser His Ala Thr Gln
65                  70

<210> 82
<211> 29
<212> PRT
<213> Homo sapiens
<400> 82
Pro Leu Arg Pro Arg His Thr Gly Tyr Phe Leu Asp Pro Leu Leu Asp
1               5                   10                  15
Arg Leu Glu Gly Val Met Pro Gly Val Cys Arg Arg Lys
            20                  25