Patent application title: Novel centrosome-associated protein and applications thereof
Inventors:
Dominique Giorgi (Saint Gely Du Fesc, FR)
Sylvie Rouquier (Saint Gely Du Fesc, FR)
Jean-Michel Saffin (Montpellier, FR)
Assignees:
CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE
IPC8 Class: AC12Q168FI
USPC Class:
435 6
Class name: Chemistry: molecular biology and microbiology measuring or testing process involving enzymes or micro-organisms; composition or test strip therefore; processes of forming such composition or test strip involving nucleic acid
Publication date: 2009-11-05
Patent application number: 20090275024
Agents:
DICKINSON WRIGHT PLLC
Origin: WASHINGTON, DC US
Abstract:
A method for diagnosing a genetic disease associated with disturbances in
mitotic spindle organization or with cell division anomalies or both,
which comprises demonstrating a functional alteration of the gene
encoding an ASAP protein comprising at least the following steps of:
obtaining DNA containing the gene encoding the ASAP protein from a
biological sample; bringing said DNA into contact with a probe, and under
conditions for hybridization between the DNA and the probe; and detecting
the hybrid formed; and wherein the ASAP protein is selected from the
group consisting of a human protein having sequence SEQ ID NO:1 and
proteins having a sequence exhibiting at least 80% identity or at least
90% similarity with entire SEQ ID NO: 1.
Claims:
1-39. (canceled)
40. A method for diagnosing a genetic disease associated with disturbances in mitotic spindle organization or with cell division anomalies or both, which comprises demonstrating a functional alteration of the gene encoding an ASAP protein comprising at least the following steps of:
a) obtaining DNA containing the gene encoding the ASAP protein from a biological sample;
b) bringing said DNA into contact with a probe, and under conditions for hybridization between the DNA and the probe; and
c) detecting the hybrid formed;
and wherein the ASAP protein is selected from the group consisting of a human protein having sequence SEQ ID NO: 1 and proteins having a sequence exhibiting at least 80% identity or at least 90% similarity with entire SEQ ID NO: 1.
41. The method of claim 28, wherein step b) comprises an amplification step carried out using a pair of primers, and step c) is a step of detecting the amplified nucleic acids formed.
42. The method of claim 28, further comprising a step of isolating and sequencing the gene encoding the ASAP protein from the sample.
43. The method of claim 29, wherein the pair of primers is selected from the group consisting of the sequences SEQ ID NOS: 31 to 43.
44. The method of claim 28, wherein the protein has a sequence exhibiting at least 90% identity or at least 95% similarity with entire SEQ ID NO: 1.
Description:
[0001]The present invention relates to a novel centrosome-associated
protein, to the polynucleotide encoding said protein and also to the
applications of said protein and of said polynucleotide.
[0002]The cell division process consists of a nuclear division (mitosis) followed by a cytoplasmic division (cytokinesis). The mitosis is dominated by the formation of a very organized polar spindle (the mitotic spindle) consisting of two families of microtubules: polar microtubules and kinetochore microtubules. Microtubules are polymers made up of α- and β-tubulin subunits. Their growth is initiated in the peripheral region of the centrosome by a complex containing mainly a related protein, γ-tubulin. Polar microtubules are made up of rows of microtubules and of associated proteins which are put in place by the two mitotic centers, associated with centrioles, located at opposite poles of the spindle (asters). Each replicated chromosome consists of two sister chromatids connected to one another via the centromere. Kinetochore microtubules are attached to the replicated chromosomes by means of specialized structures called kinetochores, which form during prophase on each of the two faces of the centromere. The chromosomes condense during prophase and form the kinetochore microtubules, which begin to interact with the polar microtubules of the spindle after rupture of the nuclear envelope during prometaphase. Under the effect of the tension due to the opposite forces, directed toward the poles, which pull the kinetochore microtubules, the chromosomes align in the equatorial zone of the spindle during metaphase. In anaphase, under the effect of forces that are continually developed within the mitotic spindle, the sister chromatids detach and are drawn toward the opposite poles. At the same time, the two cellular poles move apart. During telophase, the nuclear envelope re-forms at the surface of each group of chromosomes.
[0003]Cell division comes to an end when the cytoplasmic content is divided according to the process of cytokinesis. The mitotic spindle plays an important role in the process of cytokinesis, by determining where cell cleavage takes place. The cleavage furrow invariably appears in the plane of the equatorial plate, perpendicular to the axis of the mitotic spindle.
[0004]The processes described above are finely regulated by an equilibrium between phosphorylation reactions and dephosphorylation reactions. When the cell enters into mitosis, important changes in the phosphorylation of the proteins occur. The centrosome and the mitotic spindle are particularly enriched in phosphorylated sites. Many protein kinases, particularly serine-threonine kinases, have been described as being involved in these phosphorylation processes (in this respect, see R. Giet and C. Prigent, J. Cell Science, 112, 3591-3601, 1999). Among these are the kinases located at the centrosomes, including aurora-type kinases, which are required for centrosome separation and mitotic spindle assembly; polo-type kinases, which are involved in the maturation and formation of the bipolar spindle; and NIMA-type kinases, which regulate centrosome separation.
[0005]Mammals have at least three aurora-type protein kinases. In humans, these three protein kinases are overexpressed in cancer-related pathologies due to chromosomal anomalies. Thus, these proteins appear to play an important role in the control of ploidy. For example, inactivation or overexpression of two of these kinases results in polyploidy. Inhibition of the activity of the aurora A kinase results in the formation of monopolar spindles. Inhibition of the activity of the aurora B kinase results in the formation of multinuclear cells through lack of cytokinesis. These chromosomal anomalies appear to be associated with disturbances in mitotic spindle formation.
[0006]The partners and the substrates of these protein kinases are still relatively unknown. For example, in Xenopus, aurora A interacts with a kinesin involved in microtubule dynamics. In humans, it phosphorylates the HsTACC-3 protein, also overexpressed in many cancer cell lines. In Drosophila, aurora A phosphorylates the D-TACC protein and is necessary for the localization thereof at the centrosomes in order to regulate astral microtubules. D-TACC interacts with the microtubule-associated protein (MAP) Msp, which is part of the family of XMAP215/ch-TOG/Msps proteins, which stimulate microtubule growth in vitro and are concentrated in the centrosomes in vivo. D-TACC and Msp cooperate in order to stabilize centrosomes. The term "MAP" includes a collection of varied proteins defined on the basis of their ability to interact with microtubules. MAPs appear to be partners/substrates of the kinases of the centrosome, such as aurora or polo.
[0007]Correct cell division requires coordination between chromosomal segregation by the mitotic spindle and cell cleavage by the cytokinetic apparatus. The microtubules of the mitotic spindle play an essential role in both processes.
[0008]However, despite all the studies carried out on cell division, the factors that are involved in correctly setting up the mitotic spindle and/or, on the contrary, that disturb the setting up and/or the structure thereof, thus leading to the consequences described above, are still not known.
[0009]Such knowledge would make it possible, firstly, to understand more thoroughly the mechanisms of mitosis and, secondly, to be able to develop means for combating cell division anomalies and their resulting consequences.
[0010]The present invention lies within this field.
[0011]Specifically, surprisingly and unexpectedly, the inventors have identified a novel centrosome-associated human protein. By immunofluorescence, it colocalizes with the α-tubulin of the microtubules of the mitotic spindle, in particular with the aster. This protein was named ASAP, for Aster Associated Protein, by the inventors.
[0012]Overexpression of the protein according to the invention disturbs the organization of the mitotic spindle and induces aberrant and abortive mitoses (plurinuclear cells, monopolar or multipolar spindles). Its overexpression blocks cell division and, consequently, cell proliferation.
[0013]A subject of the invention is thus an isolated protein, called ASAP, characterized in that it is selected from the group consisting of:
[0014]a) a protein corresponding to the sequence represented in the attached sequence listing under the number SEQ ID NO: 1;
[0015]b) a protein exhibiting, over its entire sequence, at least 80% identity or at least 90% similarity, preferably at least 90% identity or at least 95% similarity, with the protein of SEQ ID NO: 1.
[0016]A protein in accordance with the invention is characterized by the following properties:
[0017]it has a molecular weight of between 60 and 100 kDa, preferably of between 65 and 80 kDa;
[0018]it is associated with the centrosomes;
[0019]it is colocalized, by immunofluorescence, with the α-tubulin of the microtubules of the mitotic spindle;
[0020]it exhibits weak identity (23%) with the MAP1A protein (Microtubule Associated Protein 1A);
[0021]it has coiled-coil domains essentially included in its C-terminal portion between, firstly, amino acids 297 and 327 and, secondly, amino acids 477 and 628, indicating either that the protein oligomerizes, or that it interacts with other proteins;
[0022]it exhibits weak identity (20%), between amino acids 300 and 600, with a caldesmon-type domain (N. B. Gusev, Biochemistry, 10: 1112-1121, 2000), referenced pfam00769 (NCBI, domains), and, between amino acids 480 and 630, with a domain of ERM type (ezrin/radixin/moesin; S. Louvet-Vallet, Biol. Cell, 274: 305-316, 2000), referenced pfam02029 (NCBI, domains). The caldesmon and ERM proteins are also considered to be MAPs;
[0023]it also has, between positions 65 and 303, a BRCT domain (Breast Cancer Carboxy-Terminal domain; P. Bork et al., FASEB J., 11, 68-76 (1997)), indicating that the protein is involved in cell cycle control;
[0024]it is very rich in α-helices in its C-terminal portion, in particular in the region between amino acids 420-620, which is almost exclusively made up of α-helices.
[0025]These elements make it possible to consider that the ASAP protein is a novel MAP.
[0026]The proteins according to the invention include any protein (natural, synthetic, semi-synthetic or recombinant) of any prokaryotic or eukaryotic organism, in particular of a mammal, comprising or consisting of an ASAP protein. Preferably, said protein is a functional ASAP protein.
[0027]The term "functional" is intended to mean a protein that has normal biological activity, i.e. that is capable of being involved in mitotic spindle organization and in cell division. This protein can comprise silent mutations that do not induce any substantial change in its activity and do not produce any phenotypic modification.
[0028]Proteins in accordance with the invention are in particular represented by the human ASAP (SEQ ID NO: 1) and murine ASAP (SEQ ID NO: 46) proteins.
[0029]Included in the proteins according to the invention defined in b) are the proteins that are variants of the sequences SEQ ID NOS: 1 and 46, in particular the proteins for which the amino acid sequence has at least one mutation corresponding in particular to a truncation, a deletion, a substitution and/or an addition of at least one amino acid residue compared with the sequences SEQ ID NOS: 1 and 46.
[0030]Preferably, the variant proteins have a mutation that results in a dysfunction (activation or inhibition) of the protein, of other genes or proteins, or else of the cell in general.
[0031]According to another advantageous embodiment of the invention, said protein is a mammalian protein, preferably a protein of human origin.
[0032]For the purpose of the present invention, the following definitions apply.
[0033]The identity of a sequence relative to the sequence of SEQ ID NO: 1 as reference sequence is assessed according to the percentage of amino acid residues that are identical, when the two sequences are aligned, so as to obtain the maximum correspondence between them.
[0034]The percentage identity can be calculated by those skilled in the art using a computer program for sequence comparison such as, for example, that of the BLAST series (Altschul et al., NAR, 1997, 25, 3389-3402).
[0035]The BLAST programs are implemented over the window of comparison consisting of the entire SEQ ID NO: 1, indicated as reference sequence.
[0036]A protein having an amino acid sequence that has at least X % identity with a reference sequence is defined, in the present invention, as a protein whose sequence can include up to 100-X alterations per 100 amino acids of the reference sequence, while at the same time conserving the functional properties of said reference protein. For the purpose of the present invention, the term "alteration" includes consecutive or dispersed deletions, substitutions or insertions of amino acids in the reference sequence.
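The definition of percentage identity given above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sequences are supplied already aligned (the application relies on BLAST for the alignment itself), and the function name, the gap character and the toy sequences are hypothetical and are not taken from SEQ ID NO: 1.

```python
def percent_identity(aligned_ref: str, aligned_query: str, gap: str = "-") -> float:
    """Percent of reference residues paired with an identical query residue.

    Both arguments are the two rows of one pairwise alignment (equal length).
    The denominator is the ungapped length of the reference, mirroring the
    wording "per 100 amino acids of the reference sequence".
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    ref_len = sum(1 for r in aligned_ref if r != gap)
    if ref_len == 0:
        raise ValueError("reference sequence is empty")
    identical = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r != gap and r == q
    )
    return 100.0 * identical / ref_len


# Toy alignment (hypothetical sequences, for illustration only).
ref = "MKTAYIAKQR"
qry = "MKTAYLAK-R"   # one substitution (I->L) and one deletion

identity = percent_identity(ref, qry)
meets_80_percent = identity >= 80.0   # threshold of item b) of paragraph [0015]
```

With X = 80, the toy query carries 2 alterations over a 10-residue reference, i.e. 20 per 100 residues, which is exactly the limit of 100-X alterations allowed by the definition.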
[0037]The similarity of a sequence relative to a reference sequence is assessed according to the percentage of amino acid residues that are identical or that differ by means of conservative substitutions, when the two sequences are aligned so as to obtain the maximum correspondence between them. For the purpose of the present invention, the term "conservative substitution" is intended to mean the substitution of an amino acid with another that has similar chemical or physical properties (size, charge or polarity), which generally does not modify the functional properties of the protein.
[0038]A protein having an amino acid sequence that has at least X % similarity with a reference sequence is defined, in the present invention, as a protein whose sequence can include up to 100-X non-conservative alterations per 100 amino acids of the reference sequence. For the purpose of the present invention, the term "non-conservative alterations" includes consecutive or dispersed non-conservative substitutions or insertions of amino acids in the reference sequence.
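By way of illustration, percentage similarity can be sketched in the same manner as percentage identity, counting conservative substitutions as matches. The grouping of amino acids used below is an assumption chosen for the example; the application only requires "similar chemical or physical properties (size, charge or polarity)" and does not prescribe a particular table. The function names and the toy sequences are likewise hypothetical.

```python
# Hypothetical partition of the amino acids into broad physico-chemical
# classes. This specific grouping is an assumption for illustration only.
CONSERVATIVE_GROUPS = [
    set("AVLIM"),  # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("KRH"),    # basic
    set("DE"),     # acidic
    set("STNQ"),   # polar, uncharged
    set("G"),
    set("P"),
    set("C"),
]


def is_conservative(a: str, b: str) -> bool:
    """True if a and b are identical or belong to the same class."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_similarity(aligned_ref: str, aligned_query: str, gap: str = "-") -> float:
    """Percent of reference residues paired with an identical or
    conservatively substituted query residue (gaps never count)."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    ref_len = sum(1 for r in aligned_ref if r != gap)
    if ref_len == 0:
        raise ValueError("reference sequence is empty")
    similar = sum(
        1
        for r, q in zip(aligned_ref, aligned_query)
        if r != gap and q != gap and is_conservative(r, q)
    )
    return 100.0 * similar / ref_len


# Toy alignment (hypothetical sequences, for illustration only).
ref = "MKTAYIAKQR"
qry = "MKTAYLAK-R"   # I->L is conservative under the grouping above

similarity = percent_similarity(ref, qry)
```

In this toy case the I to L substitution counts toward similarity but not toward identity, so the similarity figure is higher than the identity figure for the same alignment.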
[0039]The expression "techniques or methods well known to those skilled in the art" is here intended to refer to the techniques or methods conventionally used by those skilled in the art and disclosed in many works, such as in particular that entitled Molecular Cloning. A Laboratory Manual (J. Sambrook, D. W. Russell (2000) Cold Spring Harbor Laboratory Press).
[0040]The protein according to the invention is obtained either from a cell, or by chemical synthesis, or by genetic recombination.
[0041]By chemical synthesis, the protein can be obtained using one of the many known peptide synthesis pathways, for example techniques using solid phases or techniques using partial solid phases, by fragment condensation or by conventional synthesis in solution. In this case, the sequence of the protein can be modified in order to improve its solubility, in particular in aqueous solvents. Such modifications are known to those skilled in the art, for instance the deletion of hydrophobic domains or the substitution of hydrophobic amino acids with hydrophilic amino acids.
[0042]The protein according to the invention consists of the series of 13 peptides corresponding to the products of translation of 13 of the 14 exons that the corresponding gene contains, the first exon not being translated (see hereinafter).
[0043]More precisely, said peptides correspond to the following sequences (positions given relative to the numbering of the sequence SEQ ID NO: 1):
[0044]peptide 1: it comprises 25 amino acids corresponding to positions 1 to 25 (SEQ ID NO: 2);
[0045]peptide 2: it comprises 28 amino acids corresponding to positions 26 to 53 (SEQ ID NO: 3);
[0046]peptide 3: it comprises 107 amino acids corresponding to positions 54 to 160 (SEQ ID NO: 4);
[0047]peptide 4: it comprises 76 amino acids corresponding to positions 161 to 236 (SEQ ID NO: 5);
[0048]peptide 5: it comprises 31 amino acids corresponding to positions 237 to 267 (SEQ ID NO: 6);
[0049]peptide 6: it comprises 83 amino acids corresponding to positions 268 to 350 (SEQ ID NO: 7);
[0050]peptide 7: it comprises 24 amino acids corresponding to positions 351 to 374 (SEQ ID NO: 8);
[0051]peptide 8: it comprises 54 amino acids corresponding to positions 375 to 428 (SEQ ID NO: 9);
[0052]peptide 9: it comprises 32 amino acids corresponding to positions 429 to 460 (SEQ ID NO: 10);
[0053]peptide 10: it comprises 54 amino acids corresponding to positions 461 to 514 (SEQ ID NO: 11);
[0054]peptide 11: it comprises 49 amino acids corresponding to positions 515 to 563 (SEQ ID NO: 12);
[0055]peptide 12: it comprises 43 amino acids corresponding to positions 564 to 606 (SEQ ID NO: 13);
[0056]peptide 13: it comprises 41 amino acids corresponding to positions 607 to 647 (SEQ ID NO: 14).
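The coordinates listed for peptides 1 to 13 can be checked for internal consistency with a short script. The following Python sketch uses only the positions and lengths stated above; the table layout and the function name are assumptions made for the example.

```python
# (peptide number, first position, last position, stated length, SEQ ID NO)
# Values copied from paragraphs [0044] to [0056].
PEPTIDES = [
    (1, 1, 25, 25, 2),
    (2, 26, 53, 28, 3),
    (3, 54, 160, 107, 4),
    (4, 161, 236, 76, 5),
    (5, 237, 267, 31, 6),
    (6, 268, 350, 83, 7),
    (7, 351, 374, 24, 8),
    (8, 375, 428, 54, 9),
    (9, 429, 460, 32, 10),
    (10, 461, 514, 54, 11),
    (11, 515, 563, 49, 12),
    (12, 564, 606, 43, 13),
    (13, 607, 647, 41, 14),
]


def check_tiling(peptides):
    """Verify that the segments are contiguous from position 1 and that each
    stated length equals last - first + 1. Returns the total length."""
    expected_start = 1
    total = 0
    for number, first, last, stated_len, _seq_id in peptides:
        if first != expected_start:
            raise ValueError(f"peptide {number} starts at {first}, expected {expected_start}")
        if last - first + 1 != stated_len:
            raise ValueError(f"peptide {number}: stated length {stated_len} does not match positions")
        total += stated_len
        expected_start = last + 1
    return total


total_residues = check_tiling(PEPTIDES)
```

The check passes without raising, and the 13 peptides together cover positions 1 to 647 of SEQ ID NO: 1 with no gap or overlap.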
[0057]A subject of the present invention is also a peptide consisting of a fragment of at least 10 consecutive amino acids of a protein defined above in a) or b), particularly a peptide selected from:
[0058]the sequences corresponding to peptides 1 to 13 described above, i.e., selected from the sequences SEQ ID NO: 2 to SEQ ID NO: 14, and
[0059]the sequences SEQ ID NOS: 47 to 53 corresponding to mutants of the hASAP protein in which there is a deletion of the N-terminal portion containing the BRCT domain (Ndel1: residues 304-647 (SEQ ID NO: 48); Ndel2: residues 411-647 (SEQ ID NO: 49); Ndel3: residues 478-647 (SEQ ID NO: 50)) or of the C-terminal portion containing the MAP domain (Cdel1: residues 1 to 477 (SEQ ID NO: 51); Cdel2: residues 1 to 418 (SEQ ID NO: 52); Cdel3: residues 1 to 303 (SEQ ID NO: 53); residues 1 to 421 (SEQ ID NO: 47)).
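As an illustration of how the deletion mutants relate to the BRCT domain described in paragraph [0023] (positions 65 to 303), the following Python sketch records the residue ranges given above and tests which fragments still span that domain. The helper names and the placeholder sequence are assumptions; the actual amino acid sequence of SEQ ID NO: 1 is not reproduced here, and the unnamed fragment of residues 1 to 421 (SEQ ID NO: 47) is left out because the text gives it no name.

```python
PROTEIN_LENGTH = 647       # last position of peptide 13 in SEQ ID NO: 1
BRCT_DOMAIN = (65, 303)    # positions stated in paragraph [0023]

# name -> (first residue, last residue, SEQ ID NO), from paragraph [0059]
DELETION_MUTANTS = {
    "Ndel1": (304, 647, 48),
    "Ndel2": (411, 647, 49),
    "Ndel3": (478, 647, 50),
    "Cdel1": (1, 477, 51),
    "Cdel2": (1, 418, 52),
    "Cdel3": (1, 303, 53),
}


def retains_region(fragment, region):
    """True if the fragment spans the whole region (1-based, inclusive)."""
    return fragment[0] <= region[0] and fragment[1] >= region[1]


def fragment_of(sequence: str, first: int, last: int) -> str:
    """Extract residues first..last (1-based, inclusive) from a sequence."""
    return sequence[first - 1:last]


# Which mutants keep the complete BRCT domain?
brct_status = {
    name: retains_region((first, last), BRCT_DOMAIN)
    for name, (first, last, _seq_id) in DELETION_MUTANTS.items()
}

# Placeholder sequence of the right length (NOT the real ASAP sequence).
placeholder = "X" * PROTEIN_LENGTH
ndel1_length = len(fragment_of(placeholder, 304, 647))
```

Under these coordinates, the three Ndel fragments begin after position 303 and therefore lack the BRCT domain, whereas the three Cdel fragments all extend at least to position 303 and retain it.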
[0060]According to an advantageous embodiment of the invention, said peptide is useful for producing antibodies that specifically recognize a protein as defined above, preferably that recognize the ASAP protein of sequence SEQ ID NO: 1 or SEQ ID NO: 46.
[0061]The subject of the invention is thus also monoclonal or polyclonal antibodies, characterized in that they are capable of specifically recognizing a protein according to the invention.
[0062]Preferably according to the invention, the antibodies recognize, among MAPs, only and specifically the ASAP protein of sequence SEQ ID NO: 1 or SEQ ID NO: 46.
[0063]The antibodies according to the invention are, for example, chimeric antibodies, humanized antibodies, or Fab or F(ab')2 fragments. They may also be in the form of immunoconjugates or of antibodies that have been labeled in order to obtain a detectable and/or quantifiable signal.
[0064]Said antibodies can be obtained directly from human serum or from serum of animals immunized with the proteins or the peptides according to the invention. The specific polyclonal or monoclonal antibodies can be obtained according to techniques well known to those skilled in the art.
[0065]A subject of the invention is also the use of the antibodies according to the invention, for detecting and/or purifying a protein according to the invention.
[0066]In general, the antibodies according to the invention can be advantageously used for detecting the presence of a normal or mutated protein according to the invention.
[0067]In particular, the monoclonal antibodies can be used for detecting these proteins in a biological sample. They thus constitute a means of immunocytochemical or immunohistochemical analysis of the expression of the proteins according to the invention, in particular the protein of sequence SEQ ID NO: 1, on tissue sections. In general for such analyses, the antibodies used are labeled in order to be detectable, for example by immunofluorescent compounds, by means of gold labeling, or in the form of enzymatic immunoconjugates.
[0068]They can make it possible in particular to demonstrate abnormal expression of these proteins in the biological tissues or samples, and thus allow the detection of cells exhibiting disturbances in mitotic spindle organization and/or an induction of aberrant and abortive mitoses (plurinuclear cells, monopolar or multipolar spindles) associated with overexpression of the protein according to the invention.
[0069]A subject of the invention is also a method for detecting the protein according to the invention, particularly the ASAP protein, in a biological sample, comprising a first step consisting in suitably treating the cells by any appropriate means for making the intra-cellular medium accessible, a second step consisting in bringing said intracellular medium thus obtained into contact with an antibody according to the invention, and a third step consisting in demonstrating, by any appropriate means, the ASAP protein-antibody complex formed.
[0070]This method can also make it possible to measure the level of expression of the protein according to the invention in cells, particularly in cancer cells. The study of the expression of the ASAP protein (overexpression or underexpression) is an element for evaluating the proliferative capacity or the aggressiveness (ability to progress toward cancers with a poor prognosis) of cancer cells.
[0071]A subject of the invention is therefore also a method for evaluating, in vitro, the proliferative capacity or aggressiveness of the cancer cells contained in a biological sample, characterized in that it comprises a first step consisting in suitably treating the cells by any appropriate means for making the intracellular medium accessible, a second step consisting in bringing said intracellular medium thus obtained into contact with an antibody according to the invention, a third step consisting in demonstrating and/or measuring, by any appropriate means, the ASAP protein-antibody complex formed, and a fourth step consisting in evaluating the level of transcription of the gene, by comparing the level of ASAP protein-antibody complexes formed with that of a control biological sample selected beforehand. Said control can consist, for example, of a biological sample containing cells having a normal or altered level of proteins, to which said method is applied under the same conditions.
[0072]A subject of the invention is also a kit for carrying out any one of the methods described above, comprising:
[0073]a) at least one monoclonal or polyclonal antibody according to the invention;
[0074]b) the reagents for detecting the ASAP protein-antibody complex produced during the immunoreaction.
[0075]According to a particular embodiment of the invention, the kit can optionally comprise reagents required for making the intracellular medium accessible.
[0076]The expression "means for making the intracellular medium accessible" is intended to mean any means known to those skilled in the art, for instance cell lysis by enzymatic or chemical processes, sonication, membrane permeabilization or thermal shock.
[0077]A subject of the present invention is also an isolated polynucleotide (cDNA or genomic DNA fragment), characterized in that its sequence is selected from the group consisting of: [0078]the sequences encoding a protein or a peptide as defined above, and the sequences complementary to the preceding sequences, that may be sense or antisense.
[0079]The invention encompasses the alleles of the asap gene derived from any mammal, and also the polynucleotides of the natural or artificial mutants of the asap gene encoding an ASAP protein, particularly a functional ASAP protein as defined above.
[0080]According to an advantageous embodiment of the invention, said polynucleotide encoding an ASAP protein corresponds to a sequence selected from the group consisting of:
[0081]the sequence SEQ ID NO: 15, corresponding to the complementary DNA of 2575 nucleotides of the mRNA encoding the human ASAP protein (hASAP);
[0082]the sequence SEQ ID NO: 45, corresponding to the complementary DNA of 2767 nucleotides of the mRNA encoding the murine ASAP protein (mASAP);
[0083]the genomic DNA fragment of 29750 nucleotides corresponding to the sequence represented in the attached sequence listing under the number SEQ ID NO: 16, corresponding to the human asap gene comprising 14 exons, only 13 of which are translated, the first exon not being translated, contained in the contig AC097467 (length 178204 base pairs) between bases 115117 and 143828 (version v.7.29a3 NCBI/Ensembl of Jul. 12, 2002), moreover located on chromosome 4q32.1 between the anonymous markers D4S1053 and D4S571 (region 161.25 megabases (Mb) to 161.28 Mb).
[0084]The sequence SEQ ID NO: 16 is contained in the BAC clone RP11-27G13 (K. Osoegawa et al., (2001) A Bacterial Artificial Chromosome Library for Sequencing the Complete Human Genome, Genome Research, Vol. 11, No. 3, 483-496, March 2001). The sequences contained in the contig AC097467 and in the BAC clone RP11-27G13 were obtained in the context of the human genome sequencing program, and have not up until now been the subject of any precise recognition or characterization making it possible to assign any function to them. Two nucleic acids corresponding to fragments of the polynucleotide isolated by the inventors are listed in the GenBank database under the accession numbers AK024730 and AK024812, along with the ESTs listed under the accession numbers BU198882, BM693711, AW372449, BM021380, BU928828, AL707573, AI885274, AI671785, AA805679, BU619959, BM021126, AL598336, AW976973, BU629726, AI433877, AV751613, BQ372751, AI827535, AI866257, AA843565, R96130, BU684090, BF958121, BQ351941, AW194906, BG203580, BF078132, AW486134, AL600279, AA025538, AL600264, BF170676, BU759494, BB025236, BF214179, AI283076, BE694273, AI266380, BM670854, AA968415, BU503982, BB700612, BE988355, BU058357, BB312934, AW061311, BM537962, BE988356, BB318982, BB311217, BB557152, BB185248, BB557128, BB698742, BB186736, AV345769, BB274293, BB632007, BB617958, AI391312, W18534, BB186581, BB311289, BB312835, AW347411, AA972439, BB263570, AU035125, BB277226, BB274224, BB268445, AW024037, AA025609, BB274174, R96089, BB272238, BB269037, BB385718, BE007324, BB325992, AJ275277, AI414381, BB125476, BB430961, BE232162, BQ121419, BG591509, BF457670, AL897593, AL897592, BM926692, BM538559, BI759567, AL601021, AL598780, AU222540, BG567619, AU166296, BF889835, AU164011, AV656025, BF343454, AW262441, AW237952. These sequences, obtained in the context of a program of mass sequencing of human complementary DNA libraries, are incomplete and have never been either recognized or characterized. 
In fact, the polynucleotide isolated by the inventors exhibits long deoxyadenosine chains (poly-dA), which explains the difficulties encountered by the inventors in obtaining the complete cDNA using conventional oligodeoxythymidine (oligo-dT) primers, said primers hybridizing randomly with the poly-dA chains. The inventors succeeded in isolating the polynucleotide corresponding to the complete mRNA by repeatedly using the 3' rapid amplification of cDNA ends (3' RACE) technique.
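The difficulty described above can be illustrated by locating adenosine runs in a transcript: any internal run long enough to anneal an oligo-dT primer is a potential internal priming site. The following Python sketch is a minimal illustration only; the minimum run length of 10 and the toy transcript are assumptions and do not come from SEQ ID NO: 15.

```python
import re


def poly_a_runs(seq: str, min_len: int = 10):
    """Return (first, last) 1-based inclusive coordinates of every run of at
    least min_len consecutive A residues in seq."""
    pattern = re.compile("A{%d,}" % min_len)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(seq.upper())]


def internal_priming_sites(seq: str, min_len: int = 10):
    """Poly-A runs that do not reach the 3' end of the sequence, i.e. places
    where an oligo-dT primer could anneal upstream of the true poly-A tail."""
    return [run for run in poly_a_runs(seq, min_len) if run[1] != len(seq)]


# Hypothetical transcript: two internal A-rich stretches and a 3' tail.
toy_mrna = "GCGT" + "A" * 12 + "CGTACGGT" + "A" * 15 + "GCC" + "A" * 20

all_runs = poly_a_runs(toy_mrna)
internal = internal_priming_sites(toy_mrna)
```

In the toy transcript, two of the three runs are internal, so cDNA synthesis primed with oligo-dT could start at either of them and yield a truncated product, which is consistent with the inventors' recourse to repeated 3' RACE.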
[0085]The mRNA, corresponding to the polynucleotide of sequence SEQ ID NO: 15, is specifically expressed in the testes in the form of a polynucleotide approximately 2.9 kilobases long, and in the brain in the form of a polynucleotide approximately 9 kilobases long, that may correspond either to a premessenger or to a high molecular weight isoform.
[0086]More precisely, said exons are distributed as follows on said genomic sequence (relative to the numbering of the sequence SEQ ID NO: 16):
[0087]exon 1: it comprises 200 base pairs corresponding to positions 101 to 300 (SEQ ID NO: 17);
[0088]exon 2: it comprises 139 base pairs corresponding to positions 1157 to 1295 (SEQ ID NO: 18);
[0089]exon 3: it comprises 85 base pairs corresponding to positions 2050 to 2134 (SEQ ID NO: 19);
[0090]exon 4: it comprises 321 base pairs corresponding to positions 3615 to 3935 (SEQ ID NO: 20);
[0091]exon 5: it comprises 227 base pairs corresponding to positions 8259 to 8485 (SEQ ID NO: 21);
[0092]exon 6: it comprises 94 base pairs corresponding to positions 14930 to 15023 (SEQ ID NO: 22);
[0093]exon 7: it comprises 248 base pairs corresponding to positions 16715 to 16962 (SEQ ID NO: 23);
[0094]exon 8: it comprises 71 base pairs corresponding to positions 19552 to 19622 (SEQ ID NO: 24);
[0095]exon 9: it comprises 169 base pairs corresponding to positions 21187 to 21355 (SEQ ID NO: 25);
[0096]exon 10: it comprises 90 base pairs corresponding to positions 21911 to 22000 (SEQ ID NO: 26);
[0097]exon 11: it comprises 162 base pairs corresponding to positions 23731 to 23892 (SEQ ID NO: 27);
[0098]exon 12: it comprises 146 base pairs corresponding to positions 24014 to 24159 (SEQ ID NO: 28);
[0099]exon 13: it comprises 133 base pairs corresponding to positions 24343 to 24475 (SEQ ID NO: 29);
[0100]exon 14: it comprises 485 base pairs corresponding to positions 29166 to 29650 (SEQ ID NO: 30).
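The exon coordinates listed above lend themselves to a simple consistency check and to the derivation of intron lengths. The following Python sketch uses only the positions and sizes stated in paragraphs [0087] to [0100]; the table layout and the function names are assumptions made for the example.

```python
# (exon number, first position, last position, stated size in bp, SEQ ID NO)
# Positions are relative to the numbering of SEQ ID NO: 16.
EXONS = [
    (1, 101, 300, 200, 17),
    (2, 1157, 1295, 139, 18),
    (3, 2050, 2134, 85, 19),
    (4, 3615, 3935, 321, 20),
    (5, 8259, 8485, 227, 21),
    (6, 14930, 15023, 94, 22),
    (7, 16715, 16962, 248, 23),
    (8, 19552, 19622, 71, 24),
    (9, 21187, 21355, 169, 25),
    (10, 21911, 22000, 90, 26),
    (11, 23731, 23892, 162, 27),
    (12, 24014, 24159, 146, 28),
    (13, 24343, 24475, 133, 29),
    (14, 29166, 29650, 485, 30),
]


def check_exon_sizes(exons):
    """Raise if any stated size differs from last - first + 1 or if exons
    are not in ascending, non-overlapping order. Returns total exonic bp."""
    previous_end = 0
    total = 0
    for number, first, last, stated_size, _seq_id in exons:
        if first <= previous_end:
            raise ValueError(f"exon {number} overlaps or precedes the previous exon")
        if last - first + 1 != stated_size:
            raise ValueError(f"exon {number}: stated size {stated_size} does not match positions")
        total += stated_size
        previous_end = last
    return total


def intron_lengths(exons):
    """Length of each intron, i.e. the gap between consecutive exons."""
    return [
        nxt[1] - cur[2] - 1
        for cur, nxt in zip(exons, exons[1:])
    ]


total_exonic_bp = check_exon_sizes(EXONS)
introns = intron_lengths(EXONS)
```

All 14 stated sizes agree with their positions. The 13 introns derived from the gaps range from 121 base pairs (between exons 11 and 12) to 6444 base pairs (between exons 5 and 6).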
[0101]A subject of the invention is also: [0102]a fragment of any one of the polynucleotides according to the invention, of at least 15 to 1500 consecutive nucleotides, with the exclusion of the sequences listed under the accession numbers AK024730 and AK024812 and of the ESTs listed under the accession numbers BU198882, BM693711, AW372449, BM021380, BU928828, AL707573, AI885274, AI671785, AA805679, BU619959, BM021126, AL598336, AW976973, BU629726, AI433877, AV751613, BQ372751, AI827535, AI866257, AA843565, R96130, BU684090, BF958121, BQ351941, AW194906, BG203580, BF078132, AW486134, AL600279, AA025538, AL600264, BF170676, BU759494, BB025236, BF214179, AI283076, BE694273, AI266380, BM670854, AA968415, BU503982, BB700612, BE988355, BU058357, BB312934, AW061311, BM537962, BE988356, BB318982, BB311217, BB557152, BB185248, BB557128, BB698742, BB186736, AV345769, BB274293, BB632007, BB617958, AI391312, W18534, BB186581, BB311289, BB312835, AW347411, AA972439, BB263570, AU035125, BB277226, BB274224, BB268445, AW024037, AA025609, BB274174, R96089, BB272238, BB269037, BB385718, BE007324, BB325992, AJ275277, AI414381, BB125476, BB430961, BE232162, BQ121419, BG591509, BF457670, AL897593, AL897592, BM926692, BM538559, BI759567, AL601021, AL598780, AU222540, BG567619, AU166296, BF889835, AU164011, AV656025, BF343454, AW262441, AW237952 in the GenBank database, particularly a fragment selected from the sequences corresponding to the exons, i.e. selected from the sequences SEQ ID NO: 16 to SEQ ID NO: 30; [0103]a nucleic acid exhibiting a percentage identity of at least 80%, preferably of at least 90%, with one of the polynucleotides according to the invention.
[0104]The definition of the identity of a sequence given above for the proteins applies by analogy to the nucleic acid molecules.
[0105]Included in a polynucleotide exhibiting a percentage identity of at least 80%, preferably of at least 90%, according to the invention, are the polynucleotides that are variants of the sequences SEQ ID NOS: 15 and 45, i.e. all the polynucleotides corresponding to allelic variants, i.e. to individual variations of the sequences SEQ ID NOS: 15 and 45.
[0106]These natural variant sequences correspond to polymorphisms present in mammals, in particular in human beings, and especially to polymorphisms that may result in the occurrence of a pathology.
[0107]The term "variant polynucleotide" is also intended to denote any RNA or cDNA resulting from a mutation and/or from a variation of a splice site of the genomic sequence which has an mRNA whose complementary DNA is the polynucleotide of sequence SEQ ID NO: 15 or SEQ ID NO: 45.
[0108]Preferably, the present invention relates to the polynucleotides or the fragments that are variants of the sequences SEQ ID NOS: 15 and 45, particularly those in which the mutations result in a modification of the amino acid sequence of the proteins of sequence SEQ ID NO: 1 and SEQ ID NO: 46.
[0109]The polynucleotides according to the invention can be isolated from cells, particularly from the cells of the testes or the brain, or from cellular DNA libraries. They can also be obtained by means of a polymerase chain reaction (PCR) carried out on the total DNA of the cells or else by RT-PCR carried out on the total RNA of the cells, or by chemical synthesis.
[0110]The polynucleotides according to the invention, particularly the fragments of any one of the polynucleotides according to the invention, and the sequences listed under the accession numbers AK024730 and AK024812 and the ESTs listed under the accession numbers BU198882, BM693711, AW372449, BM021380, BU928828, AL707573, AI885274, AI671785, AA805679, BU619959, BM021126, AL598336, AW976973, BU629726, AI433877, AV751613, BQ372751, AI827535, AI866257, AA843565, R96130, BU684090, BF958121, BQ351941, AW194906, BG203580, BF078132, AW486134, AL600279, AA025538, AL600264, BF170676, BU759494, BB025236, BF214179, AI283076, BE694273, AI266380, BM670854, AA968415, BU503982, BB700612, BE988355, BU058357, BB312934, AW061311, BM537962, BE988356, BB318982, BB311217, BB557152, BB185248, BB557128, BB698742, BB186736, AV345769, BB274293, BB632007, BB617958, AI391312, W18534, BB186581, BB311289, BB312835, AW347411, AA972439, BB263570, AU035125, BB277226, BB274224, BB268445, AW024037, AA025609, BB274174, R96089, BB272238, BB269037, BB385718, BE007324, BB325992, AJ275277, AI414381, BB125476, BB430961, BE232162, BQ121419, BG591509, BF457670, AL897593, AL897592, BM926692, BM538559, BI759567, AL601021, AL598780, AU222540, BG567619, AU166296, BF889835, AU164011, AV656025, BF343454, AW262441, AW237952 in the GenBank database, or their fragments, can in particular be used as probes or as primers for detecting/amplifying polynucleotides (RNA or genomic DNA) corresponding to the polynucleotide according to the invention, particularly in other organisms.
[0111]The transcripts of the asap gene are, for example, preferably demonstrated using probes selected from the group consisting of the sequences SEQ ID NO: 15, SEQ ID NO: 45, and SEQ ID NO: 17 to SEQ ID NO: 44, or using an EST as defined above, or amplified by RT-PCR using primers selected from the group consisting of the sequences SEQ ID NOS: 31 to 43.
[0112]The polynucleotide according to the invention can make it possible to diagnose a pathological state or a genetic disease involving a dysfunction of the asap gene, and to screen for substances capable of modulating (activating or inhibiting) the transcription of said gene.
[0113]A subject of the invention is also the polynucleotides that can be obtained by amplification using the primers according to the invention.
[0114]The probes and primers according to the invention can be directly or indirectly labeled with a radioactive or non-radioactive compound by methods well known to those skilled in the art, in order to obtain a detectable and/or quantifiable signal.
[0115]The labeling of the probes according to the invention is carried out with radioactive elements or with non-radioactive molecules. Among the radioactive isotopes used, mention may be made of 32P, 33P, 35S, 3H or 125I. The non-radioactive entities are selected from ligands such as biotin, avidin, streptavidin or digoxigenin, haptens, dyes, and luminescent agents such as radioluminescent, chemiluminescent, bioluminescent, fluorescent or phosphorescent agents.
[0116]The polynucleotides according to the invention can thus be used as a primer and/or a probe in methods using in particular the PCR (polymerase chain reaction) technique (U.S. Pat. No. 4,683,202). Other techniques for amplifying the target nucleic acid can advantageously be used as an alternative to PCR. A large number of methods currently exist that allow this amplification, for instance the SDA (Strand Displacement Amplification) technique, the TAS (Transcription-based Amplification System) technique, the 3SR (Self-sustained Sequence Replication) technique, the NASBA (Nucleic Acid Sequence Based Amplification) technique, the TMA (Transcription Mediated Amplification) technique, the LCR (Ligase Chain Reaction) technique, the RCR (Repair Chain Reaction) technique, the CPR (Cycling Probe Reaction) technique, or the Q-beta-replicase amplification technique. Mention may also be made of PCR-SSCP, which makes it possible to detect point mutations.
[0117]These techniques are of course entirely known to those skilled in the art.
[0118]As probes or as primers, the various polynucleotides according to the invention can make it possible either to determine the transcription profile of the corresponding asap gene or any possible alteration of this profile in a biological sample, or to demonstrate the corresponding gene in other species, allelic variants of this gene or any possible functional alteration of this gene (substantial change in the activity of the protein encoded by said gene) resulting from a mutation (insertion, deletion or substitution) of one or more nucleotides in at least one exon of said gene. Such mutations include in particular deletions, insertions or non-conservative substitutions in codons corresponding to amino acid residues located in a domain that is essential for the biological activity of the protein.
[0119]Thus, a subject of the invention is a method for determining the transcription profile of the gene corresponding to the polynucleotide according to the invention, or an alteration in said profile, in a biological sample, comprising a first step consisting in obtaining, by any appropriate means, the total RNA from the biological sample, a second step consisting in bringing said RNA into contact with a probe according to the invention, labeled beforehand, under conventional conditions for hybridization between the RNAs and the probe, and a third step consisting in revealing, by any appropriate means, the hybrids formed.
[0120]The expression "conventional conditions for hybridization" is intended to mean those described in J. Sambrook, D. W. Russell (2000) Cold Spring Harbor Laboratory Press.
[0121]According to one embodiment of said method, the second step can be a step consisting of reverse transcription and amplification of the transcripts, carried out using a pair of primers as described above, and the third step can be a step consisting in revealing, by any appropriate means, the amplified nucleic acids formed.
[0122]Said method for determining the transcription profile of the gene can also comprise a step consisting in evaluating the level of transcription of the gene by comparison with a control sample selected beforehand. Said control may, for example, consist of a biological sample exhibiting normal or altered transcription of the gene corresponding to the polynucleotide according to the invention, to which said method for determining the transcription profile of the gene is applied under the same conditions.
[0123]A subject of the invention is also a method for demonstrating, in other species, the gene corresponding to the polynucleotide according to the invention or the allelic variants of said gene, or a functional alteration of this gene, in a biological sample, comprising a first step consisting in obtaining, by any appropriate means, the DNA from the cells of a biological sample, a second step consisting in bringing said DNA into contact with a probe according to the invention, labeled beforehand, under conventional conditions for hybridization between the DNAs and the probe, and a third step consisting in revealing, by any appropriate means, the hybrids formed.
[0124]According to one embodiment of said method, the second step can be an amplification step carried out using a pair of primers as described above, and the third step can be a step consisting in revealing, by any appropriate means, the amplified nucleic acids formed. The method can optionally comprise a fourth step consisting in isolating and sequencing the nucleic acids demonstrated.
[0125]A subject of the invention is also a kit of reagents for carrying out the methods described above, comprising: [0126]a) at least one probe or one pair of primers according to the invention; [0127]b) the reagents required for carrying out a conventional hybridization reaction between said probe or said primers and the nucleic acid of the biological sample; [0128]c) the reagents required for carrying out an amplification reaction; [0129]d) the reagents required for detecting and/or assaying the hybrid formed between said probe and the nucleic acid of the biological sample or the amplified nucleic acids formed.
[0130]Such a kit can also contain positive or negative controls in order to ensure the quality of the results obtained. It can also contain the reagents required for purifying the nucleic acids from the biological sample.
[0131]The polynucleotide of the invention or one of its fragments, and also the ESTs described above or their fragments, can be used to develop cell or animal models that do not express the ASAP protein, by knocking out the ASAP gene by means of the siRNA method (small interfering RNA; M. McManus and P. Sharp, Nature Reviews Genetics, 3, 737-747, 2002; V. Brondani, F. Kolb, E. Billy, M/S, 6-7, 665-667, 2002) using oligonucleotides derived from their sequences.
[0132]A subject of the invention is also a cloning and/or expression vector into which the polynucleotide according to the invention is inserted.
[0133]Such a vector can contain the elements required for the expression and, optionally, the secretion of the protein in a host cell.
[0134]Said vectors preferably comprise: a promoter, for instance a strong ubiquitous promoter or a promoter that is selective for a particular cell and/or tissue type, translation initiation and termination signals, and also regions suitable for regulating the transcription. It should be possible for them to be maintained stably in the cell, and they can optionally comprise sequences encoding specific signals specifying the secretion of the translated protein. These various control sequences are chosen according to the cellular host used.
[0135]The polynucleotide according to the invention can be inserted into vectors that replicate autonomously in the chosen host or vectors that are integrative with the chosen host.
[0136]Among the autonomously replicating systems, use is preferably made, according to the host cell, of systems of the plasmid or viral type. The viral vectors can in particular be adenoviruses, retroviruses, lentiviruses, poxviruses or herpesviruses. Those skilled in the art are aware of the technology that can be used for each of these systems.
[0137]When integration of the sequence into the chromosomes of the host cell is desired, use may be made, for example, of systems of the plasmid or viral type; such viruses are, for example, retroviruses or adeno-associated viruses (AAVs).
[0138]Among the non-viral vectors, preference is given to naked polynucleotides such as naked DNA or naked RNA, bacterial artificial chromosomes (BACs), yeast artificial chromosomes (YACs) for expression in yeast, mouse artificial chromosomes (MACs) for expression in murine cells, and preferably human artificial chromosomes (HACs) for expression in human cells.
[0139]Such vectors are prepared according to the methods commonly used by those skilled in the art, and the recombinant vectors resulting therefrom can be introduced into the appropriate host by standard methods, for instance lipofection, electroporation, thermal shock, transformation after chemical membrane permeabilization, cell fusion.
[0140]A subject of the invention is also the transformed host cells, in particular the eukaryotic and prokaryotic cells, into which at least one polynucleotide or one fragment according to the invention or at least one vector according to the invention has been introduced.
[0141]Among the cells that can be used for the purposes of the present invention, mention may be made of bacterial cells, yeast cells, animal cells, in particular mammalian cells, or else plant cells. Mention may also be made of insect cells in which methods employing for example baculoviruses can be used.
[0142]A subject of the invention is also the nonhuman transgenic organisms, such as the transgenic animals or plants, in which all or some of the cells contain the polynucleotide according to the invention or the vector according to the invention, in free or integrated form.
[0143]Preferably according to the invention, the nonhuman transgenic organisms are those carrying cells containing a polynucleotide according to the invention, that is nonfunctional or carrying a mutation.
[0144]According to the invention, the transgenic animals are preferably mammals, except for humans, more preferably rodents, in particular mice or rats.
[0145]The transgenic animals can be obtained by any conventional method known to those skilled in the art, for instance by homologous recombination on embryonic stem cells, transfer of these stem cells to embryos, selection of the chimeras affected in the reproductive lines, and growth of such chimeras.
[0146]The transformed host cells, the transgenic animals or the transgenic plants according to the invention can thus express or overexpress the gene encoding the protein according to the invention, or their homologous gene, or express said gene into which a mutation is introduced.
[0147]The testicular or brain cells, the transformed host cells or the transgenic organisms according to the invention can be used for preparing the protein according to the invention.
[0148]The protein according to the invention, particularly the native ASAP protein, can be purified according to techniques known to those skilled in the art. Thus, the protein can be purified from cell lysates and extracts, from the culture medium supernatant, by methods used individually or in combination, such as fractionation, chromatography methods, particularly affinity chromatography methods, immunoaffinity techniques using specific monoclonal or polyclonal antibodies, etc.
[0149]The subject of the invention is also a method for preparing the ASAP protein, characterized in that cells expressing the protein or transformed cells according to the present invention, in particular mammalian cells or the cells of transgenic organisms according to the invention, are cultured under conditions that allow the expression of said protein, and in that said protein is purified.
[0150]As a purification technique, mention may be made, for example, of affinity chromatography on glutathione-sepharose (or agarose) as described in J Sambrook & D W Russell (2000, Cold Spring Harbor Laboratory Press).
[0151]A subject of the invention is also a protein, characterized in that it can be obtained by means of any one of the methods of preparation described above.
[0152]A subject of the invention is also a method for screening for a substance capable of interacting in vitro, directly or indirectly, with the polynucleotide or the protein according to the invention, characterized in that: [0153]in a first step, the substance to be tested and the polynucleotide or the protein according to the invention are brought into contact, and [0154]in a second step, the complex formed between said substance and the polynucleotide or the protein according to the invention is detected by any appropriate means.
[0155]A subject of the present invention is also a method for screening for a substance capable of modulating (activating or inhibiting) the activity of the ASAP protein, characterized in that: [0156]in a first step, cells of a biological sample expressing the ASAP protein are brought into contact with a substance to be tested, [0157]in a second step, the effect of said substance on the activity of said ASAP protein is measured by any appropriate means, and [0158]in a third step, substances capable of modulating said activity are selected.
[0159]For the purpose of the present invention, the expression "activity of the ASAP protein" is intended to mean both the expression of the ASAP protein or of the corresponding transcripts (mRNA), and the biological activity of said ASAP protein, for instance its effect on the organization of the mitotic spindle or the induction of aberrant or abortive mitoses.
[0160]The detection of the complex formed between said substance and the polynucleotide or the protein, or the measurement of the effect of said substance on the activity of said ASAP protein, can be carried out by conventional techniques of mRNA or protein analysis that are known in themselves; by way of nonlimiting example, mention may be made of the following techniques: RT-PCR, Northern blotting, Western blotting, RIA, ELISA, immunoprecipitation, immunocytochemical or immunohistochemical analysis techniques.
[0161]Advantageously, said measurement is carried out using the probes, the primers or the antibodies as defined above.
[0162]Such substances can be biological macromolecules such as, for example, a nucleic acid, a lipid, a sugar, a protein, a peptide, a protein-lipid, protein-sugar, peptide-lipid or peptide-sugar hybrid compound, a protein or a peptide to which have been added chemical branches or else chemical molecules.
[0163]The subject of the invention is also the polynucleotide, the protein, the antibodies, the vectors or the transformed cells according to the invention, used as medicinal products.
[0164]As indicated above, the overexpression of the protein according to the invention blocks cell division and, consequently, cell proliferation. This makes it an excellent candidate for use as an anti-mitotic agent, that can be used for example in the treatment of cancer-related pathologies.
[0165]Thus, a subject of the invention is also the use of the polynucleotide, of a vector or of the protein according to the invention, in the preparation of an anti-mitotic medicinal product.
[0166]Similarly, as is also indicated above, the overexpression of the protein according to the invention disturbs the organization of the mitotic spindle and induces aberrant and abortive mitoses (plurinuclear cells, monopolar or multipolar spindles).
[0167]Thus, a subject of the invention is also the use of an antisense polynucleotide or of an antisense fragment, of an antibody, or of a vector containing an antisense oligonucleotide, according to the invention, capable of inhibiting the expression of the polynucleotide or of the protein according to the invention, in the preparation of a medicinal product intended for the treatment of pathologies associated with disturbances in mitotic spindle organization and/or induction of aberrant and abortive mitoses (plurinuclear cells, monopolar or multipolar spindles) associated with overexpression of the protein according to the invention.
[0168]Besides the above provisions, the invention also comprises other provisions that will emerge from the following description, which refers to examples of implementation of the invention and also to the attached drawings, in which:
[0169]FIG. 1 represents the chromosomal localization and the structure of the human asap gene.
[0170]FIG. 2 represents the signals obtained by Northern blotting on various human tissues after hybridization with an hASAP probe.
[0171]FIG. 3 represents the results obtained: [0172](A) by agarose gel electrophoresis of the RT-PCR products obtained with primers corresponding to the mouse polynucleotide, which is the ortholog of the polynucleotide SEQ ID NO: 15, using various mouse tissues; [0173](B) after transfer of the gel, after electrophoresis, onto a membrane and hybridization with an internal mASAP probe.
[0174]FIG. 4 represents the cellular localization of the hASAP protein coupled to the green fluorescent protein (GFP) in the 3' position or the yellow fluorescent protein (YFP) in the 5' position, or to an MYC tag on the N-terminal side (fusion column).
[0175]The nuclei are stained with propidium iodide or with Hoechst 33286 (4A: 63× objective; 4B, 4C and 4D: 100× objective).
[0176]FIG. 5 shows the colocalization of the human ASAP protein with alpha-tubulin. FIG. 5A: cellular localization of alpha-tubulin, FIG. 5B: localization of the ASAP protein, FIG. 5C: superimposition of the 2 images showing the colocalization of the 2 proteins.
[0177]The following examples illustrate the invention but in no way limit it.
EXAMPLE 1
Construction of the Complete ASAP Coding Sequence
[0178]The complete sequence of the cDNA of the ASAP protein is amplified from 2 overlapping fragments: [0179]a fragment A amplified by PCR from the clone AI885274 with the primers:
(SEQ ID NO: 31) constFIS-1F (5'-ATGTCTGATGAAGTTTTTAGCACC-3') and (SEQ ID NO: 32) constFIS-2R (5'-AGGCCTCAAATGATGCTAATGC-3');
[0180]a fragment B amplified from the clone AI671785 with the primers:
(SEQ ID NO: 33) constFIS-2F (5'-ATCATTTGAGGCCTGGAAGGC-3') and (SEQ ID NO: 34) constFIS-1R (5'-AAACACTTTTGCGAACACAGTTC-3').
[0181]Next, in order to obtain a single PCR product corresponding to the complete sequence of the cDNA of the ASAP protein that can be used for the functional experiments, 0.5 μl of the products of each of the two PCR reactions (fragments A and B) are hybridized together at 25° C. and then amplified with the outer primers constFIS-1F and constFIS-1R. This PCR product is subcloned into the vector pCR4 according to the manufacturer's (Invitrogen) recommendations, and verified by sequencing.
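Fragments A and B can only hybridize to each other if the 3' end of fragment A and the 5' end of fragment B share sequence. The following sketch, given purely as an illustration, derives that shared region from the two internal primers listed above. The helper names `reverse_complement` and `longest_suffix_prefix_overlap` are hypothetical and form no part of the method described.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def longest_suffix_prefix_overlap(left, right):
    """Longest string that is both a suffix of `left` and a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return right[:k]
    return ""


# Internal primers, copied from the text above (5' to 3').
CONSTFIS_2R = "AGGCCTCAAATGATGCTAATGC"  # SEQ ID NO: 32, reverse primer of fragment A
CONSTFIS_2F = "ATCATTTGAGGCCTGGAAGGC"   # SEQ ID NO: 33, forward primer of fragment B

# The sense strand of fragment A ends with the reverse complement of constFIS-2R;
# the sense strand of fragment B starts with constFIS-2F.
fragment_a_sense_end = reverse_complement(CONSTFIS_2R)
overlap = longest_suffix_prefix_overlap(fragment_a_sense_end, CONSTFIS_2F)
```

With the primer sequences as listed, the two fragments share the 14 nucleotides ATCATTTGAGGCCT at their junction.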
[0182]The major difficulties encountered lay in the determination, in silico, of the complete ASAP coding sequence and in its reconstruction in vitro. In particular, the choice of the primers and the various PCRs of the 3' region were tricky because the sequence is rich in poly(A) stretches.
EXAMPLE 2
Bioinformatic Analysis
[0183]FIG. 1 represents the chromosomal localization and the structure of the human asap gene.
[0184]The complete organization of the asap gene and its chromosomal localization were obtained by comparing the sequence of the cDNA obtained in example 1, with the sequence of the human genome, using the Wellcome Trust Sanger Institute programs and more particularly the BLAST search program.
[0185]The human asap gene consists of 29750 nucleotides comprising 14 exons, only 13 of which are translated, the first exon not being translated. The size of the exons ranges from 71 to 321 base pairs. The sequence of the gene is contained in the contig AC097467 (length 178204 base pairs) between bases 115117 and 143828 (version v.7.29a3 NCBI/Ensembl of Jul. 12, 2002) and is, moreover, located on chromosome 4q32.1 between the anonymous markers D4S1053 and D4S571 (region 161.25 megabases (Mb) to 161.28 Mb). The sequence of the gene is physically contained in the BAC clone RP11-27G13.
[0186]Two nucleic acids corresponding to fragments of the polynucleotide isolated by the inventors are listed in the GenBank database under the accession numbers AK024730 and AK024812, along with the ESTs listed under the accession numbers BU198882, BM693711, AW372449, BM021380, BU928828, AL707573, AI885274, AI671785, AA805679, BU619959, BM021126, AL598336, AW976973, BU629726, AI433877, AV751613, BQ372751, AI827535, AI866257, AA843565, R96130, BU684090, BF958121, BQ351941, AW194906, BG203580, BF078132, AW486134, AL600279, AA025538, AL600264, BF170676, BU759494, BB025236, BF214179, AI283076, BE694273, AI266380, BM670854, AA968415, BU503982, BB700612, BE988355, BU058357, BB312934, AW061311, BM537962, BE988356, BB318982, BB311217, BB557152, BB185248, BB557128, BB698742, BB186736, AV345769, BB274293, BB632007, BB617958, AI391312, W18534, BB186581, BB311289, BB312835, AW347411, AA972439, BB263570, AU035125, BB277226, BB274224, BB268445, AW024037, AA025609, BB274174, R96089, BB272238, BB269037, BB385718, BE007324, BB325992, AJ275277, AI414381, BB125476, BB430961, BE232162, BQ121419, BG591509, BF457670, AL897593, AL897592, BM926692, BM538559, BI759567, AL601021, AL598780, AU222540, BG567619, AU166296, BF889835, AU164011, AV656025, BF343454, AW262441, AW237952. These sequences, obtained in the context of a program of mass sequencing of human complementary DNA libraries, are incomplete and have never been either recognized or characterized.
[0187]The protein sequence was compared to the databank sequences using the PSI-BLAST and PHI-BLAST programs of the NCBI. Consensus protein motifs were sought using the DART program of the NCBI and the SMART program of ExPASy-Tools, the parameters of which make it possible to detect motifs with weak homology. The ASAP protein exhibits a sequence identity of 23% over the C-terminal third with a microtubule-associated protein (MAP 1A, for microtubule-associated protein 1A). Moreover, the search for conserved motifs (DART or SMART) reveals domains of caldesmon type (N. B. Gusev, Biochemistry, 10 1112-1121, 2000) and ERM type (ezrin/radixin/moesin) (Louvet-Vallet, S., Biol. Cell. 274: 305-316, 2000), which are proteins that are also considered to be MAPs, with identities of approximately 20%. It also has a BRCT domain (breast cancer carboxy-terminal domain; P. Bork et al., FASEB J., 11, 68-76 (1997)) between positions 65 and 303.
[0188]The ASAP protein has coiled-coil domains essentially included in its C-terminal portion between, firstly, amino acids 297 and 327 and, secondly, amino acids 477 and 628, indicating either that the protein oligomerizes, or that it interacts with other proteins.
[0189]Computer analysis of the protein using the programs accessible on the Internet site reveals that it lacks β-sheets and is very rich in α-helices, in particular in the region between amino acids 420-620, which is almost exclusively made up of α-helices.
[0190]These elements make it possible to consider that the ASAP protein is a novel MAP.
EXAMPLE 3
Tissue Expression
a) Analysis by Northern Blotting
Preparation of Radioactive Probes:
[0191]The DNAs to be radiolabeled are isolated on a low melting point (LMP) gel according to the technique described by S. Rouquier et al. (Genomics, 17, 330-340, (1993)). Approximately 100 ng of DNA thus isolated are labeled by random priming (Klenow fragment, Promega) in the presence of [α-32P]dCTP (Amersham) according to the technique described in A. P. Feinberg & B. Vogelstein (Anal. Biochem., 132, 6-13, (1983)). These probes are purified on Sephadex G-50 columns according to the technique described in J Sambrook & D W Russell (2000, Cold Spring Harbor Laboratory Press). The hybridizations are carried out overnight in the presence of 2×10^6 cpm/ml of denatured radioactive probe.
a.1) Hybridization
[0192]Two Northern blotting membranes from the company Clontech (Human MTN Blot and Human MTN Blot II, Ref. 7760-1 and 7759-1) containing human mRNAs from various tissues were hybridized with the complete hASAP cDNA labeled as described above. The membranes were hybridized in the presence of formamide at 42° C., according to the Clontech protocol. A membrane hybridization control was carried out with an actin probe. The membranes were rinsed twice at high stringency in 0.1×SSC/0.1% SDS at a temperature of 42° C., for 15 minutes. The membranes were then analyzed by autoradiography or on a PhosphorImager.
[0193]The tissues tested were: spleen, thymus, prostate, testes, ovary, small intestine, colon, blood leukocytes, heart, brain, placenta, lung, liver, skeletal muscle, kidney and pancreas.
a.2) Results
[0194]FIG. 2 illustrates these results.
[0195]Two signals were detected: [0196]a signal in the testes at approximately 2.6 kb, which corresponds to the size of the mRNA; [0197]a signal in the brain, but at a high molecular weight (9 kb), which corresponds either to a premessenger, or to a high molecular weight isoform.
b) Analysis by RT-PCR
[0198]This analysis was carried out on total RNA from various mouse tissues, namely brain, heart, colon, liver, small intestine, skeletal muscle, pancreas, lung, kidney, spleen and testes.
b.1) Obtaining the Mouse Orthologous cDNA
[0199]The total RNA from cells of various mouse tissues was extracted with the "mammalian total RNA kit" from the company Sigma. The RNAs were reverse-transcribed with the Superscript II kit from the company Invitrogen according to the conditions recommended by the supplier, and using oligodT primers. The products obtained were verified by 1% agarose gel electrophoresis. 1 μl of each sample thus obtained was, in turn, amplified by PCR (25 μl of reaction medium, 30 cycles (94° C. for 15 seconds, 55° C. for 30 seconds, 72° C. for 30 seconds)) with primers specific for the mouse asap gene (mFIS-1F, 5'-ACA ACG AAT AAC AGA GTG TCC-3' (SEQ ID NO: 35) and mFIS-2R, 5'-ACT CCT GAT AAA CAG CTG CC-3' (SEQ ID NO: 36)).
[0200]The amplified products obtained were analyzed by electrophoresis on a 1% agarose gel, stained with ethidium bromide, and their size was compared with a size marker loaded onto the gel in parallel.
[0201]After electrophoresis, the amplified products obtained were transferred by capillarity onto a charged nylon membrane, in a 1.5 M NaCl/0.5 M NaOH buffer, according to the Southern technique (alkaline transfer). The membrane was then hybridized with a radiolabeled mASAP probe (SEQ ID NO: 44) generated by amplification of the sequence contained in the mouse clone AW061311, selected after comparison of the human ASAP sequence with the databanks (GenBank).
[0202]The amplification was carried out by PCR (conditions as described above, in which the reaction volume was 50 μl and the cold dCTP was at a concentration of 10 μM supplemented with 50 μCi of [α-32P]dCTP at 3000 Ci/mmole), using the primers mFIS-1F (SEQ ID NO: 35) and mFIS-2R (SEQ ID NO: 36). The hybridizations were carried out at 65° C. (in 6×SSC buffer/0.5% SDS/5×Denhardt's solution). The membrane was rinsed at high stringency (0.1×SSC/0.1% SDS), and then analyzed by autoradiography or on a PhosphorImager.
b.2) Results
[0203]FIG. 3 illustrates these results.
[0204]It is noted that a major signal is obtained in the testes and the brain, which is clearly visible on the gel (FIG. 3A).
[0205]After transfer of the gel and hybridization with an internal probe, it is noted that a very weak signal is detected in the other tissues (FIG. 3B).
[0206]Consequently, the mRNA encoding the mASAP protein is mainly expressed in the testes and the brain. The complete mouse cDNA, amplified by RT-PCR from the mouse testicular RNA, corresponds to the sequence SEQ ID NO: 45 and the corresponding protein (mASAP) corresponds to the sequence SEQ ID NO: 46.
EXAMPLE 4
Cellular Localization
[0207]a) Subcloning of the hASAP cDNA in a Eukaryotic Expression Vector
[0208]The hASAP cDNA obtained in example 1 was inserted into three expression vectors: [0209]1--into pEAK10-EGFP in phase with the green fluorescent protein (GFP) fused in the C-terminal position (vector 1) (pEAK10, vector from Edge Biosystems (distributed by Q.BIOgene, Illkirch in France) into which the EGFP protein (enhanced green fluorescent protein) has been introduced according to the reference I. Gaillard et al., Eur. J. Neurosci., 15, 409-418, 2002); [0210]2--into pEYFP-C1 in phase with the yellow fluorescent protein (YFP) fused on the N-terminal side (vector 2) (distributed by BD Biosciences Clontech); [0211]3--into GLOMYC3-1 comprising an MYC tag on the N-terminal side (vector 3), a vector derived from the vector pcDNA3.1 (Invitrogen), into which a 5' untranslated region (5'UTR) and an MYC tag have been inserted at the HindIII-BamHI sites, and the 3'UTR region of globin (SpeI-XbaI fragment) has been inserted in the XbaI site.
[0212]The hASAP cDNA was amplified from its initial cloning vector (pCR4-TOPO) by PCR using the Pfu Turbo high-fidelity polymerase, with primers that amplify the cDNA between the initiator methionine and the last amino acid. The amplified products obtained were subcloned into the three vectors.
[0213]Cloning in PEAK-GFP. Preparation of the DNA insert by PCR [94° C. 2 min; (94° C. 15 sec; 58° C. 30 sec; 72° C. 1 min 30 sec) 30 cycles; 72° C. 3 min], using the primers
hFIS-Exp1F (5'-GCCACCATGTCTGATGAAGTTTTTAGCAC-3') (SEQ ID NO: 37) and hFIS-Exp1R (5'-GAAACACTTTTGCGAACACAGTTC-3') (SEQ ID NO: 38).
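The forward primer hFIS-Exp1F can be checked against the start of SEQ ID NO: 1 (Met Ser Asp Glu Val Phe Ser ...) by translating it from its first ATG. The following is a minimal sketch using the standard genetic code; the helper names are illustrative. The six bases upstream of the ATG (GCCACC) resemble a Kozak-type translation initiation context, which is an observation made here and not a statement from the application.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons only; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


HFIS_EXP1F = "GCCACCATGTCTGATGAAGTTTTTAGCAC"  # SEQ ID NO: 37

start = HFIS_EXP1F.find("ATG")          # position of the first ATG
leader = HFIS_EXP1F[:start]             # bases added upstream of the ATG
encoded = translate(HFIS_EXP1F[start:])

print(leader, encoded)
```

The translated stretch can then be compared by eye with the first residues of SEQ ID NO: 1.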
[0214]The vector was cleaved with EcoRV and dephosphorylated: 10 ng of vector were used for the ligation with the DNA insert. The PCR product was phosphorylated and then purified with a High Pure PCR kit (Roche): 100 ng of insert were used for the ligation [12 h at 16° C. in a final volume of 10 μl (Biolabs ligase), according to standard conditions (Sambrook and Russell)].
[0215]Cloning in Glomyc: Preparation of the DNA insert by PCR [94° C. 2 min; (94° C. sec; 60° C. 30 sec; 72° C. 1 min 30 sec) 30 cycles; 72° C. 3 min], using the primers:
Glomyc-FIS1F (5'-TAATGTCTGATGAAGTTTTTAGCACC-3') (SEQ ID NO: 39) and Glomyc-FIS1R (5'-TCAAAACACTTTTGCGAACACAGTTC-3') (SEQ ID NO: 40).
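The two reverse primers differ at their 5' ends, and the difference becomes visible once each is converted to the sense strand. The sketch below reverse-complements hFIS-Exp1R (SEQ ID NO: 38) and Glomyc-FIS1R (SEQ ID NO: 40) and simply inspects the last three sense-strand bases for a stop codon. This is a simplified check with illustrative helper names, and the interpretation (no stop codon before a C-terminal GFP fusion, stop codon kept for N-terminally tagged constructs) is an inference from the sequences, not a statement from the application.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ends_with_stop(reverse_primer):
    """True if the sense strand covered by the primer ends in a stop codon.

    Only the final three bases are examined; reading frame is not tracked.
    """
    sense = reverse_complement(reverse_primer)
    return sense[-3:] in STOP_CODONS


REVERSE_PRIMERS = {
    "hFIS-Exp1R": "GAAACACTTTTGCGAACACAGTTC",      # SEQ ID NO: 38
    "Glomyc-FIS1R": "TCAAAACACTTTTGCGAACACAGTTC",  # SEQ ID NO: 40
}

for name, primer in REVERSE_PRIMERS.items():
    print(name, reverse_complement(primer), ends_with_stop(primer))
```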
[0216]Cloning conditions were identical to those described for the cloning in PEAK-GFP.
[0217]Cloning in YFP: Preparation of the DNA insert: same conditions as for Glomyc, using the primers:
YFP-FIS1F (5'-AATGTCTGATGAAGTTTTTAGCACC-3') (SEQ ID NO: 41) and Glomyc-FIS1R (SEQ ID NO: 40) (cf. above).
[0218]Cloning conditions were identical to those described for the cloning in PEAK-GFP, the vector having been cleaved beforehand with SmaI.
[0219]The recombinants were analyzed by PCR using a primer for the vector and an internal primer.
PEAK-GFP: annealing at 58° C., extension 45 sec at 72° C., and standard conditions for the rest. Primers: constFIS-2F (SEQ ID NO: 33) and GFP-1R (5'-TCAGCTTGCCGTAGGTGGC-3') (SEQ ID NO: 42).
YFP: annealing at 55° C. for 1 min. Primers: YFP-2F (5'-ATGGTCCTGCTGGAGTTCG-3') (SEQ ID NO: 43) and hFIS-Exp1R (SEQ ID NO: 38).
Glomyc: annealing at 44° C., extension 45 sec at 72° C. Primers: constFIS-2F (SEQ ID NO: 33) and SP6.
The recombinants were sequenced by custom automated sequencing using the PCR products (Genome Express, Meylan).
b) Subcloning of the hASAP cDNA in a Prokaryotic Expression Vector
[0220]Using a strategy similar to that used in paragraph a) above, the hASAP cDNA was cloned into the vector pGEX-4T2 (AMERSHAM), so as to produce a fusion protein with GST, purifiable according to standard protocols.
c) Subcloning of the mASAP cDNA in a Prokaryotic or Eukaryotic Expression Vector
[0221]Using a strategy similar to that used in paragraph a) above, the mASAP cDNA was cloned into the following vectors:
[0222]pGEX-4T2 (AMERSHAM), so as to produce a fusion protein with GST, purifiable according to standard protocols.
[0223]pEYFP-C1 so as to produce a fusion protein (N-terminal fusion) with the yellow fluorescent protein (YFP) detectable by direct immunofluorescence.
d) Transfection, Immunofluorescence and Microscopy
d.1) Materials and Methods
[0224]The vectors obtained were transfected according to the calcium phosphate technique or, more routinely, using the jetPEI method (GDSP10101, Qbiogene) according to the manufacturer's recommendations, into the following cell lines:
[0225]PEAK (ref. 37937, Edge Biosystems (distributed by Q.BIOgene, Illkirch, France)), only for the human ASAP constructs,
[0226]HEK-293 (ATCC (American Type Culture Collection) reference CRL-1573; p53-/-, non-synchronizable), for the human and murine ASAP constructs,
[0227]nontransformed NIH3T3 (murine ASAP constructs), and
[0228]U-2 OS (ATCC HTB-96; p53+/-, synchronizable).
[0229]For vectors 1) and 2) (human and murine ASAP constructs), the localizations were determined directly by detection of the GFP or YFP fluorescence at 24 h, 48 h and 72 h, after fixing of the cells with paraformaldehyde and staining of the nuclei either with propidium iodide or with Hoechst 33286.
[0230]For vector 3) the MYC tag was detected using an anti-MYC primary antibody distributed by TEBU (9E10, cat. #SC-40, Santa Cruz Biotechnology, CA) and an anti-mouse IgG goat secondary antibody labeled with the fluorochrome Alexa-594 (Molecular Probes, ref. A-11032, distributed in France by Interchim, Montluçon), after fixing of the cells and permeabilization thereof with 0.1% Triton X-100. The slides were analyzed, and the images were collected on a Zeiss Axiophot microscope.
d.2) Results: Cellular Localization and Colocalization of the ASAP Protein with Alpha-Tubulin
[0231]Cellular Localization
[0232]FIG. 4 illustrates the cellular localization of the hASAP protein overexpressed in the HEK-293 line (PI=propidium iodide).
[0233]Observation under the fluorescence microscope of the slides corresponding to the various transfections with vectors 1), 2) and 3) shows the same types of profile: the localization of the hASAP and mASAP proteins is cytoplasmic, and their fibrous profile resembles that of tubulin filaments.
[0234]Moreover, it appears that the transfected cells exhibit division defects since the nuclei are always larger than in the nontransfected cells (FIGS. 4A and 4B). In addition, some of the transfected cells appear to be multinucleated (FIG. 4B). This suggests abnormal division of the transfected cells.
[0235]Finally, the mitosis of the transfected cells appears to be abnormal, in terms of both the chromosomal organization and the localization profile of the hASAP and mASAP proteins at the level of the mitotic spindle. The star-shaped localization profile of the hASAP and mASAP proteins is characteristic of the nucleation of the aster microtubules around the centrosome (FIGS. 4C and 4D).
[0236]A similar ASAP protein localization profile is detected in the U-2 OS line (p53+/-) overexpressing hASAP and in the nontransformed NIH 3T3 line overexpressing mASAP; an accumulation of monopolar cells in mitosis is observed.
[0237]In addition, by synchronizing the U-2 OS cells and recovering the cell extracts at various times in the cycle, it was verified that the ASAP protein was indeed present in all the phases of the cell cycle (interphase, S, G2/M).
[0238]Colocalization of the ASAP Protein with Alpha-Tubulin
FIG. 5 illustrates the colocalization of the human ASAP protein with alpha-tubulin; similarly, the murine ASAP protein colocalizes with alpha-tubulin.
[0239]FIG. 5A illustrates the cellular localization of alpha-tubulin detected by immunofluorescence using an anti-tubulin antibody (Alexa-594, Molecular Probes).
[0240]FIG. 5B illustrates the localization of the ASAP protein labeled with YFP (yellow fluorescent protein).
[0241]FIG. 5C represents the superimposition of the 2 images, demonstrating the colocalization of the 2 proteins.
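The superimposition described for FIG. 5C can be illustrated with a toy example. The sketch below merges two single-channel images into one RGB image and flags pixels where both signals exceed a threshold. It is a generic illustration and not the image processing used by the applicants: the pseudo-colors (red for the Alexa-594 tubulin signal, green for the YFP signal), the threshold value and the function names are all assumptions.

```python
def merge_channels(red_channel, green_channel):
    """Overlay two grayscale images (lists of rows, values 0-255) as RGB."""
    if len(red_channel) != len(green_channel) or any(
        len(r) != len(g) for r, g in zip(red_channel, green_channel)
    ):
        raise ValueError("both channels must have the same dimensions")
    return [
        [(r, g, 0) for r, g in zip(red_row, green_row)]
        for red_row, green_row in zip(red_channel, green_channel)
    ]


def colocalized_mask(red_channel, green_channel, threshold=128):
    """Mark pixels where both channels reach the (arbitrary) threshold."""
    return [
        [r >= threshold and g >= threshold for r, g in zip(red_row, green_row)]
        for red_row, green_row in zip(red_channel, green_channel)
    ]


# Toy 2x2 images: only the top-left pixel is bright in both channels.
tubulin = [[200, 0], [0, 10]]
asap_yfp = [[220, 0], [180, 5]]

merged = merge_channels(tubulin, asap_yfp)
mask = colocalized_mask(tubulin, asap_yfp)
print(merged)
print(mask)
```

In such an overlay, pixels that are bright in both channels appear yellow, which is the usual visual cue for colocalization.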
EXAMPLE 5
Production of Anti-hASAP and -mASAP Polyclonal Antibodies
a) Antibody Production
[0242]The following ASAP protein constructs were cloned into the prokaryotic expression vector pGEX 4T-2 (AMERSHAM) as described in example 4:
[0243]whole human ASAP protein (SEQ ID NO: 1),
[0244]human protein from which the C-terminal portion containing the potential MAP domain has been deleted (residues 1 to 421 retained, SEQ ID NO: 47),
[0245]whole murine protein (SEQ ID NO: 46).
[0246]The proteins were expressed in E. coli and purified according to standard protocols. Rabbits were then immunized with the purified ASAP proteins according to a standard protocol, and the immune sera were harvested.
b) Analysis of the Reactivity of the Polyclonal Sera with Respect to the Endogenous ASAP Protein
[0247]The monospecific polyclonal sera directed against the whole hASAP protein or the hASAP protein from which the C-terminal portion containing the potential MAP domain has been deleted were tested by Western blotting and by immunofluorescence, on HEK-293 and U-2 OS cells, according to standard protocols.
[0248]By Western blotting, the monospecific polyclonal serum directed against the whole hASAP protein detected a protein having an apparent molecular weight of approximately 110 kDa corresponding to the endogenous ASAP protein, in both the HEK-293 cells and the U-2 OS cells. Under these conditions, an anti-FLAG antibody detected a protein having an equivalent molecular weight, in control HEK-293 or U-2 OS cells, transfected with a vector for expression of the hASAP protein fused with a FLAG tag.
[0249]By immunofluorescence, the monospecific polyclonal serum directed against the whole hASAP protein labeled the microtubules of the HEK-293 cells in interphase, the asters of the cells in mitosis and the microtubules of the residual body at the end of telophase.
[0250]The monospecific polyclonal serum directed against the hASAP protein from which the C-terminal portion containing the potential MAP domain had been deleted exhibited the same profile by immunofluorescence and detected a protein of approximately 110 kDa, by Western blotting.
[0251]The monospecific polyclonal serum directed against the mASAP protein was used to detect which cell types expressed ASAP and at what stage(s) of the cell cycle it was expressed, by immunofluorescence on mouse testicular sections.
EXAMPLE 6
Functional Analysis of the hASAP Protein Using Mutants from Which the N-Terminal Portion Containing the BRCT Domain or the C-Terminal Region Containing the Potential MAP Domain has been Deleted
[0252]Fragments of cDNA encoding an hASAP protein from which the N-terminal portion containing the BRCT domain has been deleted (Ndel1: residues 304-647 (SEQ ID NO: 48); Ndel2: residues 411-647 (SEQ ID NO: 49); Ndel3: residues 478-647 (SEQ ID NO: 50)) or from which the C-terminal portion containing the MAP domain has been deleted (Cdel1: residues 1 to 477 (SEQ ID NO: 51); Cdel2: residues 1 to 418 (SEQ ID NO: 52); Cdel3: residues 1 to 303 (SEQ ID NO: 53)) were amplified by PCR using suitable primers, and then cloned into the expression vectors pEAK10-EGFP (C-terminal fusion with GFP) and pEYFP-C1 (N-terminal fusion with YFP) according to a protocol similar to that described in example 4.
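The residue ranges of the six deletion constructs can be tabulated to see how much of the 647-residue hASAP protein each one retains. The sketch below uses the coordinates given in paragraph [0252] (1-based, inclusive); the dictionary and function names are illustrative only.

```python
FULL_LENGTH = 647  # residues in hASAP (SEQ ID NO: 1)

# Retained residues for each construct, 1-based and inclusive.
CONSTRUCTS = {
    "Ndel1": (304, 647),  # SEQ ID NO: 48
    "Ndel2": (411, 647),  # SEQ ID NO: 49
    "Ndel3": (478, 647),  # SEQ ID NO: 50
    "Cdel1": (1, 477),    # SEQ ID NO: 51
    "Cdel2": (1, 418),    # SEQ ID NO: 52
    "Cdel3": (1, 303),    # SEQ ID NO: 53
}


def fragment_length(start, end):
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def extract(protein, start, end):
    """Slice a protein string using 1-based inclusive coordinates."""
    return protein[start - 1:end]


LENGTHS = {name: fragment_length(*span) for name, span in CONSTRUCTS.items()}

for name, length in LENGTHS.items():
    print(f"{name}: {length} residues ({100 * length / FULL_LENGTH:.0f}% of hASAP)")
```

From the coordinates alone, Ndel1 and Cdel3 are exactly complementary, as are Ndel3 and Cdel1, whereas Ndel2 (411-647) and Cdel2 (1-418) share residues 411 to 418.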
[0253]The various constructs were transfected into the HEK-293 and U-2 OS lines, and the cellular localization of the various mutants of the hASAP protein was then analyzed as described in example 4.
[0254]It is noted that, for the same deletions, a similar profile is obtained with the construct comprising YFP in the N-terminal position or GFP in the C-terminal position.
[0255]By comparison with the whole hASAP protein, the 3 constructs from which the C-terminal portion has been deleted no longer colocalize in interphase with tubulin and no longer have a fibrous appearance; these results indicate that the deletion involves a MAP domain. In addition, no monopolar cell blocked in mitosis is observed in the cells overexpressing the mutants from which the C-terminal portion containing the MAP domain has been deleted.
[0256]By comparison with the whole hASAP protein, the three constructs from which the N-terminal portion containing the BRCT domain has been deleted exhibit a nuclear localization in the form of foci, but some fibers colocalizing with tubulin remain in the cytoplasm.
[0257]The functional analysis of the hASAP protein is completed by experiments consisting of inactivation of the expression of the gene with interfering RNAs (iRNAs).
Sequence CWU
1
531647PRTHomo sapiens 1Met Ser Asp Glu Val Phe Ser Thr Thr Leu Ala Tyr Thr
Lys Ser Pro 1 5 10 15Lys
Val Thr Lys Arg Thr Thr Phe Gln Asp Glu Leu Ile Arg Ala Ile
20 25 30Thr Ala Arg Ser Ala Arg Gln Arg
Ser Ser Glu Tyr Ser Asp Asp Phe 35 40
45Asp Ser Asp Glu Ile Val Ser Leu Gly Asp Phe Ser Asp Thr Ser Ala
50 55 60Asp Glu Asn Ser Val Asn Lys
Lys Met Asn Asp Phe His Ile Ser Asp 65 70
75 80Asp Glu Glu Lys Asn Pro Ser Lys Leu Leu Phe Leu
Lys Thr Asn Lys 85 90
95Ser Asn Gly Asn Ile Thr Lys Asp Glu Pro Val Cys Ala Ile Lys Asn
100 105 110Glu Glu Glu Met Ala Pro Asp
Gly Cys Glu Asp Ile Val Val Lys Ser 115 120
125Phe Ser Glu Ser Gln Asn Lys Asp Glu Glu Phe Glu Lys Asp Lys
Ile 130 135 140Lys Met Lys Pro Lys Pro
Arg Ile Leu Ser Ile Lys Ser Thr Ser Ser145 150
155 160Ala Glu Asn Asn Ser Leu Asp Thr Asp Asp His
Phe Lys Pro Ser Pro 165 170
175Trp Pro Arg Ser Met Leu Lys Lys Lys Ser His Met Glu Glu Lys Asp
180 185 190Gly Leu Glu Asp Lys Glu
Thr Ala Leu Ser Glu Glu Leu Glu Leu His 195 200
205Ser Ala Pro Ser Ser Leu Pro Thr Pro Asn Gly Ile Gln Leu
Glu Ala 210 215 220Glu Lys Lys Ala Phe
Ser Glu Asn Leu Asp Pro Glu Asp Ser Cys Leu225 230
235 240Thr Ser Leu Ala Ser Ser Ser Leu Lys Gln
Ile Leu Gly Asp Ser Phe 245 250
255Ser Pro Gly Ser Glu Gly Asn Ala Ser Gly Lys Asp Pro Asn Glu Glu
260 265 270Ile Thr Glu Asn His
Asn Ser Leu Lys Ser Asp Glu Asn Lys Glu Asn 275
280 285Ser Phe Ser Ala Asp His Val Thr Thr Ala Val Glu
Lys Ser Lys Glu 290 295 300Ser Gln Val
Thr Ala Asp Asp Leu Glu Glu Glu Lys Ala Lys Ala Glu305
310 315 320Leu Ile Met Asp Asp Asp Arg
Thr Val Asp Pro Leu Leu Ser Lys Ser 325
330 335Gln Ser Ile Leu Ile Ser Thr Ser Ala Thr Ala Ser
Ser Lys Lys Thr 340 345 350Ile
Glu Asp Arg Asn Ile Lys Asn Lys Lys Ser Thr Asn Asn Arg Ala 355
360 365Ser Ser Ala Ser Ala Arg Leu Met Thr
Ser Glu Phe Leu Lys Lys Ser 370 375
380Ser Ser Lys Arg Arg Thr Pro Ser Thr Thr Thr Ser Ser His Tyr Leu385
390 395 400Gly Thr Leu Lys
Val Leu Asp Gln Lys Pro Ser Gln Lys Gln Ser Ile 405
410 415Glu Pro Asp Arg Ala Asp Asn Ile Arg Ala
Ala Val Tyr Gln Glu Trp 420 425
430Leu Glu Lys Lys Asn Val Tyr Leu His Glu Met His Arg Ile Lys Arg
435 440 445Ile Glu Ser Glu Asn Leu Arg
Ile Gln Asn Glu Gln Lys Lys Ala Ala 450 455
460Lys Arg Glu Glu Ala Leu Ala Ser Phe Glu Ala Trp Lys Ala Met
Lys465 470 475 480Glu Lys
Glu Ala Lys Lys Ile Ala Ala Lys Lys Arg Leu Glu Glu Lys
485 490 495Asn Lys Lys Lys Thr Glu Glu
Glu Asn Ala Ala Arg Lys Gly Glu Ala 500 505
510Leu Gln Ala Phe Glu Lys Trp Lys Glu Lys Lys Met Glu Tyr
Leu Lys 515 520 525Glu Lys Asn Arg
Lys Glu Arg Glu Tyr Glu Arg Ala Lys Lys Gln Lys 530
535 540Glu Glu Glu Thr Val Ala Glu Lys Lys Lys Asp Asn
Leu Thr Ala Val545 550 555
560Glu Lys Trp Asn Glu Lys Lys Glu Ala Phe Phe Lys Gln Lys Lys Lys
565 570 575Glu Lys Ile Asn Glu
Lys Arg Lys Glu Glu Leu Lys Arg Ala Glu Lys 580
585 590Lys Asp Lys Asp Lys Gln Ala Ile Asn Glu Tyr Glu
Lys Trp Leu Glu 595 600 605Asn Lys
Glu Lys Gln Glu Arg Ile Glu Arg Lys Gln Lys Lys Arg His 610
615 620Ser Phe Leu Glu Ser Glu Ala Leu Pro Pro Trp
Ser Pro Pro Ser Arg625 630 635
640Thr Val Phe Ala Lys Val Phe 645225PRTHomo sapiens
2Met Ser Asp Glu Val Phe Ser Thr Thr Leu Ala Tyr Thr Lys Ser Pro 1
5 10 15Lys Val Thr Lys Arg Thr
Thr Phe Gln 20 25328PRTHomo sapiens 3Asp Glu
Leu Ile Arg Ala Ile Thr Ala Arg Ser Ala Arg Gln Arg Ser 1
5 10 15Ser Glu Tyr Ser Asp Asp Phe Asp
Ser Asp Glu Ile 20 254107PRTHomo sapiens
4Val Ser Leu Gly Asp Phe Ser Asp Thr Ser Ala Asp Glu Asn Ser Val 1
5 10 15Asn Lys Lys Met Asn Asp
Phe His Ile Ser Asp Asp Glu Glu Lys Asn 20
25 30Pro Ser Lys Leu Leu Phe Leu Lys Thr Asn Lys Ser Asn
Gly Asn Ile 35 40 45Thr Lys Asp
Glu Pro Val Cys Ala Ile Lys Asn Glu Glu Glu Met Ala 50
55 60Pro Asp Gly Cys Glu Asp Ile Val Val Lys Ser Phe
Ser Glu Ser Gln 65 70 75
80Asn Lys Asp Glu Glu Phe Glu Lys Asp Lys Ile Lys Met Lys Pro Lys
85 90 95Pro Arg Ile Leu Ser
Ile Lys Ser Thr Ser Ser 100 105576PRTHomo
sapiens 5Ala Glu Asn Asn Ser Leu Asp Thr Asp Asp His Phe Lys Pro Ser Pro
1 5 10 15Trp Pro Arg Ser
Met Leu Lys Lys Lys Ser His Met Glu Glu Lys Asp 20
25 30Gly Leu Glu Asp Lys Glu Thr Ala Leu Ser Glu
Glu Leu Glu Leu His 35 40 45Ser
Ala Pro Ser Ser Leu Pro Thr Pro Asn Gly Ile Gln Leu Glu Ala 50
55 60Glu Lys Lys Ala Phe Ser Glu Asn Leu Asp
Pro Glu 65 70 75631PRTHomo sapiens 6Asp
Ser Cys Leu Thr Ser Leu Ala Ser Ser Ser Leu Lys Gln Ile Leu 1
5 10 15Gly Asp Ser Phe Ser Pro Gly
Ser Glu Gly Asn Ala Ser Gly Lys 20 25
30783PRTHomo sapiens 7Asp Pro Asn Glu Glu Ile Thr Glu Asn His
Asn Ser Leu Lys Ser Asp 1 5 10
15Glu Asn Lys Glu Asn Ser Phe Ser Ala Asp His Val Thr Thr Ala Val
20 25 30Glu Lys Ser Lys Glu
Ser Gln Val Thr Ala Asp Asp Leu Glu Glu Glu 35
40 45Lys Ala Lys Ala Glu Leu Ile Met Asp Asp Asp Arg Thr
Val Asp Pro 50 55 60Leu Leu Ser Lys
Ser Gln Ser Ile Leu Ile Ser Thr Ser Ala Thr Ala 65 70
75 80Ser Ser Lys824PRTHomo sapiens 8Lys
Thr Ile Glu Asp Arg Asn Ile Lys Asn Lys Lys Ser Thr Asn Asn 1
5 10 15Arg Ala Ser Ser Ala Ser Ala
Arg 20954PRTHomo sapiens 9Leu Met Thr Ser Glu Phe Leu Lys Lys
Ser Ser Ser Lys Arg Arg Thr 1 5 10
15Pro Ser Thr Thr Thr Ser Ser His Tyr Leu Gly Thr Leu Lys Val
Leu 20 25 30Asp Gln Lys Pro
Ser Gln Lys Gln Ser Ile Glu Pro Asp Arg Ala Asp 35
40 45Asn Ile Arg Ala Ala Val 501032PRTHomo sapiens
10Tyr Gln Glu Trp Leu Glu Lys Lys Asn Val Tyr Leu His Glu Met His 1
5 10 15Arg Ile Lys Arg Ile Glu
Ser Glu Asn Leu Arg Ile Gln Asn Glu Gln 20
25 301154PRTHomo sapiens 11Lys Lys Ala Ala Lys Arg Glu
Glu Ala Leu Ala Ser Phe Glu Ala Trp 1 5
10 15Lys Ala Met Lys Glu Lys Glu Ala Lys Lys Ile Ala Ala
Lys Lys Arg 20 25 30Leu Glu
Glu Lys Asn Lys Lys Lys Thr Glu Glu Glu Asn Ala Ala Arg 35
40 45Lys Gly Glu Ala Leu Gln 501249PRTHomo
sapiens 12Ala Phe Glu Lys Trp Lys Glu Lys Lys Met Glu Tyr Leu Lys Glu Lys
1 5 10 15Asn Arg Lys Glu
Arg Glu Tyr Glu Arg Ala Lys Lys Gln Lys Glu Glu 20
25 30Glu Thr Val Ala Glu Lys Lys Lys Asp Asn Leu
Thr Ala Val Glu Lys 35 40
45Trp1343PRTHomo sapiens 13Asn Glu Lys Lys Glu Ala Phe Phe Lys Gln Lys
Lys Lys Glu Lys Ile 1 5 10
15Asn Glu Lys Arg Lys Glu Glu Leu Lys Arg Ala Glu Lys Lys Asp Lys
20 25 30Asp Lys Gln Ala Ile Asn
Glu Tyr Glu Lys Trp 35 401441PRTHomo sapiens
14Leu Glu Asn Lys Glu Lys Gln Glu Arg Ile Glu Arg Lys Gln Lys Lys 1
5 10 15Arg His Ser Phe Leu Glu
Ser Glu Ala Leu Pro Pro Trp Ser Pro Pro 20
25 30Ser Arg Thr Val Phe Ala Lys Val Phe 35
40152575DNAHomo sapiens 15acttccttcg tctgggtggt tgccccagcg
acacgttggg ccgaagagcg gtgttgggta 60cccgagagac ccggcggtgg ggaagtcact
tcctcccgaa gacgctgttt cctagcaacc 120gccctccgcc tctgttatta gcccctcctc
ctcgctcggt ccaggaccgg ctctgcgggc 180gccgccaggc ccagaccaag ctactatcag
aagttgaatt ctaataatta gctattttat 240aaaggtaacg agaaaaaata cactatgtct
gatgaagttt ttagcaccac tttggcatat 300acaaagagtc caaaagttac caaaagaact
actttccagg atgagctaat aagagcaatt 360acagctcgct cagccagaca aaggagttct
gaatactcag atgactttga cagtgatgag 420attgtttctt taggtgattt ttctgacact
tcagcagatg aaaattcagt taataaaaaa 480atgaatgact ttcatatatc agatgatgaa
gaaaagaatc cttcaaaact attgtttttg 540aaaaccaata aatcaaacgg taacataacc
aaagatgagc cagtgtgtgc catcaaaaat 600gaagaggaaa tggcacctga tgggtgtgaa
gacattgttg taaaatcttt ctctgaatct 660caaaataagg atgaggaatt tgaaaaagac
aaaataaaaa tgaaacctaa acccagaatt 720ctttcaatta aaagcacatc ttcagcagaa
aacaacagcc ttgacacaga tgatcacttt 780aaaccatcac cttggccaag gagtatgtta
aaaaagaaaa gtcacatgga ggagaaggat 840ggactagaag ataaagaaac tgccctcagt
gaagaattgg agttacattc tgcaccttct 900tcccttccaa cgccgaatgg catacaatta
gaagctgaga aaaaagcatt ctctgaaaac 960cttgatcctg aggattcatg cttaacaagt
ctagcatcat catcacttaa acaaattctt 1020ggagattctt tttcaccagg atctgaggga
aacgcatctg gaaaagatcc aaatgaagaa 1080atcactgaaa accataattc cttgaaatca
gatgaaaata aagagaattc attttcagca 1140gaccatgtga ctactgcagt tgagaaatcc
aaggaaagtc aagtgactgc tgatgacctt 1200gaagaagaaa aggcaaaagc ggaactgatt
atggatgatg acagaacagt tgatccacta 1260ctatctaaat ctcagagtat cttaatatct
accagtgcaa cagcatcttc aaagaaaaca 1320attgaagata gaaatataaa gaataaaaag
tcaacaaata atagagcatc cagtgcatct 1380gccagattaa tgacctctga gtttttgaag
aaatctagtt ctaaaaggag aactccatcg 1440acaactacct cttctcacta tttagggact
ttaaaagtct tggaccaaaa accttcacag 1500aaacagagca tagaacctga tagagcagat
aacataaggg cagctgttta tcaggagtgg 1560ttagaaaaga aaaatgtata tttacatgaa
atgcacagaa taaaaagaat tgaaagtgaa 1620aacttaagga tccaaaatga acagaaaaaa
gctgctaaaa gagaagaagc attagcatca 1680tttgaggcct ggaaggctat gaaagaaaag
gaagcaaaga aaatagctgc caaaaagagg 1740cttgaagaaa aaaacaagaa gaaaactgaa
gaagaaaatg ctgcaagaaa aggagaagca 1800ctacaagctt ttgaaaaatg gaaagagaaa
aagatggaat atcttaaaga gaaaaataga 1860aaggagagag aatatgaaag agcaaagaaa
cagaaagagg aggaaactgt tgccgagaaa 1920aagaaagata atttaactgc tgttgagaaa
tggaatgaaa aaaaggaagc ttttttcaag 1980caaaagaaaa aagaaaaaat aaatgagaaa
agaaaggaag aactgaaaag agctgagaaa 2040aaagataaag ataaacaagc tattaatgaa
tatgaaaaat ggctggaaaa taaggaaaaa 2100caagaaagaa ttgaacgaaa acagaagaaa
cgtcattcct ttcttgaaag tgaggcactt 2160cctccgtgga gccctccaag cagaactgtg
ttcgcaaaag tgttttgata attctagttc 2220ttacattatt tggttattta tcggtttgcc
aatattagcc atagatttaa accattcaat 2280tatttatagt tagaggaata tattttaatt
aaatgccaga cactcctgct gacaatgaaa 2340gaaatacttt ggaatgtaat cagtgaaagc
atttttttga actgtagata aactgcctca 2400aacaaagacc taataatcag attgttttta
ccattaagat acataagatt ttatcatgtc 2460ctgataattc ttatggtgga gtgattcatg
atctttttca ttaagctctg tatgttattt 2520aagtatattt aattccagta ataaaaagga
aatcatctag gtaccataaa aaaaa 25751629750DNAHomo sapiens
16tctgggtggg agttgggcgg gtcctgtctc ctaggcaaca gcacatgcac acaagcgacc
60aataatgagc ccctctccaa agacccagga aggtgatgtc acttccttcg tctgggtggt
120tgccccagcg acacgttggg ccgaagagcg gtgttgggta cccgagagac ccggcggtgg
180ggaagtcact tcctcccgaa gacgctgttt cctagcaacc gccctccgcc tctgttatta
240gcccctcctc ctcgctcggt ccaggaccgg ctctgcgggc gccgccaggc ccagaccaag
300gtgagcagct cctacccgat gcttggctct tgattctcag ggtcgcggag aactggccgc
360gggcgtccgg ggccgggaac agaaagcggg acctgggggc catgggggat ccggacagag
420accgcgcttg gacgtgcacg ggcctggcgt tcgctggtgc tcagcatacg gcgcggtgag
480gagcggcgag cacccggacg tcacctggcc tggtagggaa cggaacccgg ggcgcacaac
540gctatgggcg gccctgccag gcctctgctc cgagtacggg aaaccgcgat tttaatgcgg
600ctcatcgcga aagcttcgtc gttttgtctg gctctcttta acacttttgt gagaggaaaa
660attggcttgc aatacatctc gctggctgtt tgcgggttag cattacgatc tttttctttg
720aatagcgctg tatgcaaata tatagataca tttttttttt ggtggtggtg ctcataattt
780ttacgccgac gatccttttg atggcctttt aaataagacg tgacttattt tgaaggcaat
840gttatacttt agaagagagg tgaaaaataa ggtgttctat tttaattggc agcattttgt
900cgtattaact tgtaatcatt tatttgcaga ctttttaagt agttgcaaaa ctattttagg
960ataacttcca tttgaatttt tttaaacaag cttgttatga gaatttgcta tttctttaca
1020agaacctttt taagtgaaga tgtagcccaa tgttcatatc agatgctttt ctttgacctt
1080tgtggggaga gtagaatcaa atgtaataaa ataaattctg aagcatgcga agtctgattt
1140gttttgtata tttcagctac tatcagaagt tgaattctaa taattagcta ttttataaag
1200gtaacgagaa aaaatacact atgtctgatg aagtttttag caccactttg gcatatacaa
1260agagtccaaa agttaccaaa agaactactt tccaggtaaa gtatttttat ttggaatcat
1320ttcacagtgt aaacactgta ttagatgggt tgaaattggt gattctagaa cagtcctata
1380taaagcaggg gtaaatctta tattactttt gaggttttgc acatgatcat gtttgggctc
1440catccagtat tacaaactcc cctatatggt tttaagacta ccaaagtagc ctcaatacta
1500gtttcctact aagttaaaag ttgaatcgca accttaaatt gccattttta tataaaaact
1560tttttttctg ttgtaacata atgtttaagt ttttttttct gttgagtcac tgcaattttg
1620aactcagcct ctaagtttgc aatattgatt gcatccattt ctgaaatatg ccgagacaaa
1680agctcttaaa aataccaatt tctttcaaaa taccagtttt taataaatta taatctaaat
1740tgagcccctt cttatttgtt accctccagc tctaattata acctgcaatt aatttgttcc
1800ataatgtgtg tctcctctag ttaaactgcg agctccatga ggaagggctc ttgtctgtga
1860tgctctgcat tgagtatgag gcgtaaagtg ggtacatggc ataaagtgag cttgcaggaa
1920atatttgtta gatgaatgaa acctaagttt gaaagcagtc gttaatcaag cattgtttgt
1980ttaaagaatt acttgtgaat atgatacctc catgtttgga tggaaattga tttcagtatc
2040tcatttcagg atgagctaat aagagcaatt acagctcgct cagccagaca aaggagttct
2100gaatactcag atgactttga cagtgatgag attggtatgt gacagtatgg aaacgtgaac
2160cacttttctt ctttttgctt ccttagtttt gtatttagcc agccccccaa ccacccatcc
2220cctcaatcac gtatgttaaa ataataccta agcattcact aattttagat tttcaacttt
2280ttaattagta gaaagccact cttaattttc aggaagttgt atgattttct ttttttattg
2340ttgttttgtt ttctgaatgt gtatacgaaa atataaatta attgatggca ggtttgcagt
2400aaaaggatgg ctgccagtgg taaaccacat tgaagaagac aggttcatct ttaagatcaa
2460ccctaggagg tgctacagct agttagtaac tagtcccaca gaactaaact tcggtgcaca
2520ttagaagtgc ttttataaag cttgctataa atcagatttt ttttggctgt gataaggggt
2580aaatttaaaa accacagact cttcgtgttt catatatcag tactattata atttggtttc
2640tcttagctat gtaaacatat taacatttta gtttcaggta taagcataca gaattctaaa
2700cttggtgttt ttgtttgttt gtttttgttt ttgagatgga gtctcgctca gttgctcaag
2760ctggagtgca gtggtgcaat ctcggctcac tgcaacctcc acctcccagg ttcaagtgat
2820tctcctcctt cagcctcctg agtagctggg actacaggtg cccgccacca tgcccggcta
2880atttttgtat ttttagtaga gatggggttt caccacatcg gccaggctgg tctcgaactc
2940ctgaccttgt gatccgcccg cctcagcctc ccaaagtgct gggattatag gtgtgagcca
3000ccgcacccgg cctggtgttt tattctttaa aatttggtga ataattgtaa ttgatttctg
3060taaaaccagt aataaccaca gttaaatcac tgctgtatag ttaacttagc atttcttatg
3120attcttagta aatctaatat tctggtgtgg atggaattgt agttccaaaa tttttatgga
3180aaaaatataa ttagtaatta ctaattaaat tcttccattt acaaatgttc ttgattttac
3240atgaagaagt aatttgcaaa taaaagtttt acagtccata atctaattta aatgctacat
3300gactgattgt tagggacctt tggatggctt tttccagagc aaacagtgtt tggttgtttg
3360gtaccctaca gacaacacaa taaatacatt ttgaataaat taatgaaatt ggaattttta
3420tttcataaat gttaatgaga cgtgcctgag ttagctgtgt ttttagagct gcaagtctat
3480ttataaaata catttgtgcc tattcattgt tagaattttg tttgtagctt ttaaggtaaa
3540ctttgattaa gttaacgtaa ccttgacaat ttttaaaaat actgttgaaa acatttttct
3600tttccatttt tcagtttctt taggtgattt ttctgacact tcagcagatg aaaattcagt
3660taataaaaaa atgaatgact ttcatatatc agatgatgaa gaaaagaatc cttcaaaact
3720attgtttttg aaaaccaata aatcaaacgg taacataacc aaagatgagc cagtgtgtgc
3780catcaaaaat gaagaggaaa tggcacctga tgggtgtgaa gacattgttg taaaatcttt
3840ctctgaatct caaaataagg atgaggaatt tgaaaaagac aaaataaaaa tgaaacctaa
3900acccagaatt ctttcaatta aaagcacatc ttcaggtaat ttgttaggat tactgtaatt
3960gcatttcttg gaagtttatt ttaagataat cagtcccaaa atttttatat ggtagctagt
4020atatatttaa gaaaaaaaga cagacttaac ttccatttta cagacctgtt gtattttgtc
4080taacttcaat tttacagacc tgttgtattt tgtctaactt caattttaca gacctgttgt
4140attttgtctt gcatctaggc tgttgcctga tagaaagcca aagcacaaag ccaaagcacc
4200tttagtcatc catagcatcc atagctgtgg atctccagac acctagacct gtgagcttca
4260gttttgtttg taggtgtgga actggaatgg aatgctgtct aatccctctc acactccaaa
4320gattagagtt acagcaatat tgagactaat ccttctaaca gtctttgcca taccaacatt
4380gtgccagaaa attttcttga catttgtata tttgaaggat gagttatgtt attgctgctg
4440ttgtttgttg aagcatccag gcactcctta agagaatctc catttgatct ctgtattgcc
4500tatgaaaatc tactaagatt cagttttcca aaggaaagtt cctggtgtga tctgggatta
4560cagttagttc tgcccacaat tttactgaat tttaagcata aaggaacaaa gatagaatga
4620aacggagacc aagtcctgtc acataccctg ggccaccatt catgaacttg tatatgcaag
4680gttaaggatt ttttgttttt cattctttgt attttataaa ggaattatta gttgatgtta
4740accttcataa aaatctcctt gcatatcatc agtaaataca gtgctggtaa atatttcata
4800ctttgcatat tagataccag tggtaacgtc agacaaaact ttatttcagg catgtattgg
4860ggaactgctc ctttcttcct gaccccacaa tctcattaac tttgaaatga gcaaaggatg
4920taagcagagc aaagaacact agaataatat ccaggacact gggggaaagg cctctgtata
4980ttatatatga cttcagcaaa taagttaagc ttcagtatcc tcatgatgag gaagctaaaa
5040ataaccctct ttctattcct gcaaaattgt gagagtttat tgaagtgcat ctcataaact
5100ataaaaaact acaaaaatgc aaacagatgc ataatgaaac aattaacttg ttaaaatgta
5160ccttctaagt atagtgagtg aaatcaatgc tggagagaag aggaacataa ttgaacttcg
5220ttattaagaa aatgcgagca tatatagcaa ctaaaaattt gtctgagaca ggtggatgta
5280tataattaga agtttatggt agataatcag gaaagcaata atccacctat ttcatacctt
5340aaaaaaaaaa aaaacctgtg gtgggttaca atgaataaga aaatactgta ttttaaccac
5400aaggtggcat caggatccta aatgctctac ttatatatgc aatgttatat tcagtacgtg
5460taatataaaa ataattacct aaataggtaa ttgtatacat tgattaccaa aaaaagcgct
5520tttcttaaag tataggcatt tttttttctt tttgggaact tgacagtact tctggaagtg
5580gaatttttgt agaaaatata ttaaagttgt cattctcagg ttcttcaggt tgaaaagtaa
5640aaattgaggc tagtgttcct aagataatat ctggcatata taataagtat ttaaatgaat
5700aaattaatat atgaatgatt tatctttgaa agagggaata tggttcatga gtttatcctc
5760taaattcttt gacttttttt ttttctgtac aggtttggaa ctcaatgttt ttaatgtggt
5820gagatattgc tgagtagcaa gtaatgcttt atgaaactat tagagcttga aggttttctc
5880tgtccttgct tgtcttttgt aaaaagtata ataaccagac tttatagtca ctactgaagt
5940gacagttgct ctataaagtg aaagtatttt tcacaggata tgtttttatt ttaatactaa
6000catgactgaa atcatgaact ttggagtcag gatgcttctc ctttaatctg agatctgcag
6060cctgctagag tttgtgactt tgggcatgag acctctttgt tctcatttta ttcatcttta
6120aaaacgggat aatagttgcc tgcctctagg agtttgaggc aattaaatga gttcacatat
6180ttgaagtgct tagaatagta ctggcataaa tttagcactc tataaatgtt ctgattattc
6240attttattat ttagcgtttg tttataaaca tgctcagcag gtataaagta tcagtcatgc
6300gggatgcgta agttctagag atctgctgta cattgtgcct atagttaaca gtactgtctt
6360ttgcactgaa tgtattaaga aggtagatct catgtttgtt cttaccacaa taataaaaaa
6420aattgactca acaccttctt tcaggcatta tataatattc tgcttaaact gaggctcaaa
6480agacatgcaa gcatttgtca ggaggagaag caggaagtgg atattctagg cagggggatc
6540agcttaggta aaggtatggt agcaggaggg attggaggga ttgtggtatg tgtgcatgac
6600aactgttagc ccagcatttc agaaacacag atgacaaaat ggctgtagat aaggcagtga
6660aggacaaaac cataaaatcc gttttatgtt gtttaaaggc agttaagctt ttattctgta
6720ggattggatc atggggagcc attgaataat tttgtagaaa ggagtgatgt gatctgattt
6780ggattttgta aatatcatgg aagcagtgat ctaggaaaga gtggataagg acccgacagc
6840agggatgtag aaagtggaat aaatgagata tttggcaatt agaattgata ggatatattg
6900atactctgga tttaggggat aatagaggga ggaatctaga gcccttggat ttggggttga
6960acatttggct ggagtttagg atgtagctaa aattgtcagc tacttataat aataccaatt
7020tggtatggtt gtggaatctt ctggcagaat ccataagccc atttttaggt aaatgggagg
7080aagatgttaa ttagaccaat tttgaagttg agaaaaatgc atttgtagaa caatagaaac
7140ataaatatgt atagcaggta aaatgcaggc aaaaaatata tacatggaaa gtcttcccat
7200tgtttcgaat actggatgca aatcagcatt tgattcttga tttaaactta gaagtaatgg
7260aaagagtgaa attttaataa atgctaaaga agttttatgg actcagaaca attaactcat
7320aaaagattcc ttcctctaat gagagttagc actcctatcc cttgagtgcc aacatcatca
7380tctttgtcct tataatagca cttataatct tagtaatcta gtcttgtaat tttgtttaga
7440aaaatcaacc tgtaaagtac ctggacaggt ccattgccgc tttgttgatt atgaggttta
7500gtaacgtgta cagggcttgg tactcaaagg cttgatggat gagcctcctc attttatagt
7560ggtagaaact ggggcaagat tttgttttgt ttttttattt ttaacatttt ttttttaata
7620ttataagagt tcacaatgtt gaagagttaa cttcttgtga ctggttactt tcaggatgac
7680aactgtttct ttactttgtt ttttttttgt tgttgttgtt gtttggtttt tttttttttt
7740ttagatggat ttttgctctt attacccagg ctggagtgca gtggtgtgat ctcgatctcg
7800gctcactgca acctcagact cctgggttca agcaatcctc ctgcctcagt ctcctgagta
7860gctgggatta caggcacgcg ctactaagcc cggctaattt ttttgtattt ttagtagaga
7920cagggtttca ccgtgttagc caggctggtc tcgaactcct gacctcatga tctgcccacc
7980tcggcctccc aacgtgctgg gattacaggc gtgagtcacc gctcccaaca tgtcgggatc
8040acaggcgtga gccaccgcgt ccggcctgat tattaaccat catttatttg tgccttacta
8100gagctctgta tagagaagag ttgtgggctt catctggact cttcaggaca gagaacaaag
8160gggcataggc acaggaggga agtatggtag cacccagaga gatagataaa gccatggtca
8220tttttttata cacacacttt aagcatttta tttttcagca gaaaacaaca gccttgacac
8280agatgatcac tttaaaccat cacctcggcc aaggagtatg ttgaaaaaga aaagtcacat
8340ggaggagaag gatggactag aagataaaga aactgccctc agtgaagaat tggagttaca
8400ttctgcacct tcttcccttc caacgccgaa tggcatacaa ttagaagctg agaaaaaagc
8460attctctgaa aaccttgatc ctgaggttag cactaccact aaactgttga attgtgttct
8520tgaatttatg cttttttatc tgattatgaa aaagagaagg agagaatgaa tttgtgtgcg
8580tgtgtgtgtg ttttacatac tttcttctgc aactgataag gaaataattt ttaaaaatac
8640actgtattcc accgagtcta aaactgcatc aattgtaaga cgtagcatta ttttacatac
8700cactaaggaa gaaggaaatg catccaatta aactataaca caccagtgat tgtagagttt
8760atccagtttt agagaaagta aaatgtcaaa aagtgttgct tttctgaatc tatataatag
8820tgtttatctt taataatttt ttaaatttat gtatctttga attatgtaat ttatggctaa
8880gaacaatata gtcagtgtca ttttatttat ttgattttat tcactcaaca aatgtgtgtt
8940gaatgttcat ggcactcttc tgtgttcttt gggttatgtt ccaatagcat taaatgtggc
9000ctttcaggtt tccatcaggg aatttactat gcattgttat taagggagaa cacttcgttt
9060ttctctttgt atttcactat gagaagcaaa ctgtcccttc tgaacatttc agaagggaaa
9120agtacaggaa gaacatttct tccccataat ctgcttgggc agattaggga actgcatgcc
9180acctggccaa gcttctttct ttttctcatc gcttgtctgc agtgttggtg cttaaggatc
9240tgctctctgg gaggtgaggc agaaggtgct gagaggagct cttttgtgca atgactaaat
9300gggggaatcc ccctaattca gactggaagt attaggaagc acaataggct accaattcaa
9360atcttgttct gcagttgagc tttaccagta aagctgacaa tttgatatac gcctaactga
9420caccaccatg ctgtttctta atttgttctg aaaaccagaa gaagaaaccc aagcaaatac
9480tttatattta agaaaattat ctgatccatt gaatattgtg ctagtttctt gtagctgctg
9540taacaaattg ccacaaactg gttaacttaa aacaacagaa atgtattctc ttagttctgg
9600aggtcagaag tccaagatca aggtgtttgc agggccattt tcctctgaag gcatcacgga
9660agaatccttc cttgcctctt ccagcttctt tctagtggtt gccagcagtc catggcattc
9720cttggcttgt agctggcttg tagctgcatc attcccttct ctgccttcat cccatgtggc
9780cttcttccct gtgttttctc tgcatgtctg tgtctcttct ttctcttaaa aaaagacacc
9840aggcattgga tttagggccc accctaattg agtgtgtcct catcttatct atttaaagct
9900gtaaacacct tatttcctaa gaaagtcgta ttttgaggtt ctggatgaac atgaattttg
9960gggcattaat gttcgtatgt taaacctagc attcccggga taaactctgg ttagtcatgg
10020tgtgatattt tattgtggga tgtgatttgt taaaattgtg ttaaggtttg catctatatt
10080tatgaagtct attggtctgt aatttttttc ttataatgtt accatcaggc ttgggtatca
10140aatgagttgg ggagtgtctt ttcttcattt tataaaagtt tggtatcatt attttcttaa
10200atgagaggat tcaccagtac aattatctgg gcctggaatt ttctgtgtgg agacatcttt
10260ggcattacat ttgatttttt aaataggtat ttcagtactc acattttctg ttttgccagt
10320ttggtaattg tgtctatcaa gaagtttgtc catttcatct gatatgttga gtttataaac
10380agagttgttc acgatagtcc ctcattcttt tgatgactag gattatcatg acatttcatt
10440tttatttcta acatatataa tttgtgtttt gtgtctttcg tgctaaatct tgataggcat
10500tgcttagttt tattaaacgt ttttaagaac cacttcggct ttgtcatatg ttggtgcaaa
10560agtaattgca gttttggcca ttactttcaa tgacaaaaac cgcaatcatt ttgcaccaac
10620ctaataattt tctctattgt ttgtttaatt gattttcagt attatttcag tattattcag
10680tattatttct tttactttct tttttttttt ttgagacaga gtctcgttct atcgcccagg
10740ctggagtgca gtggtgcaat cccagctcac tgcaagctct gcctcccagg ttcactccat
10800tctcctgctt cagcctcccg agtagctggg actacaggca cccaccacca tgcctggcta
10860atttttgtat ttttagtaga gacggggttt caccgcgtta gccaggatgg tctcgatctc
10920ctgacatcgt gatccaccca cctcggcctc ccaaggtgtt gggattacag gcgtgagcca
10980cggcgcctgg cctcttttac tttcttttgg tttaatttgc ttatctttag atttgaaaat
11040tttctcattc atttttaaga ttttcgtgat ttctgctaaa cctgttgaaa ggtgtaaact
11100ttcttctttg tactgcttta gtggccccga ttttttgatg ccttttattt ttattatcat
11160ttctttaaat atatatttta acttcccttg tgatctcctg ttttaaaaat ttattttttt
11220agttgaaaaa taataattgt acatggggta catagtgatt tttcgataca tataatatat
11280agtgatcatt gtgatctctt ttttgaccag ttggttattt tatggtgatt tattttattt
11340tcaaatactt gttttttctc tagatatact tttgatgtta attataagtt aattttgttg
11400tagtctagag aatgtatctt acatgatttc aaatttttaa aaattattat tattatttct
11460aaatggccca gctttagtgt atcttgtgaa agtctcattt gcatctgcaa agtagatgtg
11520ttctccaggt gttgaatata atgttgtata atttaagttt ggtcaacatg gttggtaata
11580tcattcagat cttctttatc cttactgatt tttcatccaa tttgtttacc cgttaccaac
11640ttaggggtat taaaatatcc agttatgttt gtgggtttgt ttatacttct ctttagttct
11700gtcagtattt tataactttg ttatcaggca catacacatt tattattatt atgttttgag
11760cattatgaaa cgtctctacc tctggtaata ttcctttcct tatcttatag attgttttgt
11820gtaatacttc agctttctta tgacaagtgt ttccatggta tatgctttct atcttttttc
11880tttcaaacta attctgtctt ttcatgtaag tgaatctctt acaataagag tttggtgtca
11940cttttttatt aagtctgaca atctatgcct tttaatgtag tgtttagtcc atttatgaat
12000gttttgtcca tttaatgtaa atactgctat gattggattt aggagcaatt tgttgctctt
12060tattttctat ttatctgttt tttaaaatta ttgtttttat tgttgtttct ctgttactcc
12120tttcttgcct ttttttgagg agataatcat gaatctttta gttttttatt attattgacc
12180ttttatctat atttgtttgc attgtatttc tcagagttga tcagtggatt acagaatata
12240tctgaaaatt atcacaatct atttagaatt gatattgtat tgtttcacat ttgatctaga
12300aaccttggaa taatatagtt ccatatactc cctcatccat tgtgctattg tcatatatta
12360tatctacata tcctataatc cccacaatag agttataact ttttcttaaa gagccctttc
12420agttttttgt attagacttt taaaaaatta aagaaggcta gaataaatat atattatata
12480tctactgtat tatatattgt atatattata gataacattc tattgctaaa tatagataat
12540atatatttgt agacaatatc tatatatagg taatatatat tctattctta tatattatat
12600agatatataa catctatata atctatttat agatattaca tatctataaa tacatataca
12660atttctaggg atcttcattt cttcctgtag attcagatta ccattttgtg tcctgtcagt
12720cttacaaact tattttacat ttcttgtaat acaggtttac tagtgatgga tttttctcag
12780tctttgcttt tctaaaagta tttgtctcat ctttgttttc aaatggtggt tgatgtgatt
12840gtattcttct tgtctaacag ttgccttctt ctacctccag ctctttatag gtttccattt
12900ttattggcct ctcttgtaat cattcatttc attgtcctct ctatataatg tgttgatttt
12960gtctgaatgc tgtcaggaat tttactcaag attgtggttt ttatcttttg attacagcaa
13020tttgactgca tggtgcctgg gtctagcttt ctttatgttt attctgcttg acgtttgttg
13080agctttccaa acctataagc tgatactgtc tgtgaaatgg gaagattgtt atttcccacc
13140ctatttttca tcctctcctt ttggtactgt agttacacat gcattgaaat ttgtgctata
13200tctcactgat ctctgagatt ctgtttatat ttcttaaatc ttttttcctc tttgttttta
13260agattgaata acttgtatta cttagtcttc acgtttacag attgtggtcc ggagaatgta
13320tcttttatga tttcaaattg tattaaatta ttttgttttg ttttaatggc ccagcaaaag
13380ggtatgtcgt gagagttcca tttgcagttg caaagtatgt gtgttttcca ggtgaatttt
13440ttatttcact tattgtggtg ttcaacttca gattttctat tggtattttt tctgtttttt
13500aatataaaat cccccatctt ttcagccatc atgcatatat tttccccaaa gtgcttgaac
13560atatttatat tagctatttt aaagtccttg tctgctaact ctaaaacgtg agtcatctct
13620gggttggttc ctattgacca ttctctgttt ttttattttg ttttttaaat aagtgtcacc
13680attttctgtt tctttagtga cttttgattg aataccgggt gttctgaatg atattttgta
13740gagattctgt attcttttat gtcccttcaa acatattttc tagcaagtgg atatcatggc
13800tggacacaaa ttcccaatcc tgtttctcct gcagtggata tcagctgaaa tttctgctta
13860attcttttca gtttctagct tctatgcttt tacaggatcc tctgaggtct cccttatgcc
13920acaaatagag gtggtaaagg tttttggtga atttcatatg cagattttgt ggtcactgtc
13980ctctgctatt ttccacatac ttattggctg atctgatggt cctagactca gtcccctgtt
14040ccctcaagtc attccaccaa ggctgtagcc ttctattact tgagctgcat agactggaga
14100atgccttctg gcaaaaagct actaatttgc agatctcctc aggtgaagct ttatctttca
14160gggtagactc cagtgtctca gcacttcttc cattttctca aatgttttct ctccattgct
14220tttgacatat aatttccttt gcacccataa aatactgcgg agaaagaaaa ttaaagtatt
14280tgtacaacaa agttgaactt cctacattgt aatatcatta cctttaggct agatgattct
14340atgaagaaat gtttacctta gatagacaaa tataattatt tcatatcaga tagaattttc
14400agaattttga ggaaaactca agtgcatgca atctatgtgc ttttcctatc taaaatattt
14460ggaagtagcg gcttacttga ttttattaaa tgctttcatt tggataacta gtaatatttg
14520cttggaacta aagtatttta cctgtcttct ttatgctttc cttcaaagga taattgtagg
14580aagagctatc aaaatcaaat cttggcctta aatatttata agaaatgtga ttattaagta
14640ataggagttt tgaaaattgg taaaaaataa atagagaggt ggtggtagtt aaagaacttg
14700aataactctt tcagtgaccc cttttaatga ccaagacatc aaggcttgaa agtaaagcat
14760gcttacctcc attggcttgt cacactttgc gtttcagcaa caaatgccta aataatgcag
14820atttcagagt tatgcactat ttcaatttgt agttttaata atgctattgt tcccataaat
14880gttaattatt aaacttatgt ggcaaatgta tttttttttg cgaaaacagg attcatgctt
14940aacaagtcta gcatcatcat cacttaaaca aattcttgga gattcttttt caccaggatc
15000tgagggaaac gcatctggaa aaggtggtta tatctaataa ttatatctta tatgtgaact
15060ctgtactact tagactcctg tttgtaagag aaataatact ttgtatagtt ataagagaaa
15120tatatgtttt tatgtgtttg agttttaatc ctgactatgt agttaactaa ctgtgatttt
15180ggatgcagaa cttaatctct cagtgcctca atttccctaa gttatattat ttgtctcata
15240aggttattgt gaaaattaag tgatatagtg cattttagcc attagcctag ttaatagccc
15300aagtggagtg agcacttaag gtaaactact gttatgtatg tgttgctgtg atattctgca
15360ggacaacata atagctaggt ggaattttaa agtgagacta agctagattc caatacaggc
15420acaattacat aagcaaagta actaaccttt ctgaccctgt atgttgatct ttaaaatggg
15480taaaataaga gtaatttgcc ttatagggtg ttgtaagaat taaacatgta aagcatttac
15540agcaatacca tagtaagcac ttggtgtgat atgtgaattg ttaacataat ttcttttctt
15600agtgatacgt agcttaatga aacctaaaag acatagctat ttctaggtct gagatgtgta
15660atgaacattt tagtgcttac tatgtagtat catttttgtc attttacaga tgagaaaagc
15720tgaagtgcag tgacttaggg aaacataccc aaggtcagtg atggaaccat agttaaatct
15780tgagttccaa agttcttgtt cttttcactg aacagattaa cagctccaaa gaatccaata
15840gtgaattgag tgattttaag cccatgttac ctcaaaacaa attccaaaaa aatggtcata
15900atgaaaccaa cagaattaag acttttcaca gtaaagattc aggtttagct gcaaggtgga
15960cgttggtaga actgaaagtt ggtgatccca ttccaaaatg tggtaaaatc agaatagtag
16020aagcaattct ataaatgcaa aactgaatct tcttatgcca gagcttgagc ctgtttcttg
16080gagcactgag aggataagca ataggcttgt ctttattgcc ccttatggta tcagaggaag
16140tactacatct tggtgagatg aaactcacta gagactgtgt aaaattgcat taattcttgg
16200ttctttctgc agctatacaa ttcaacaatt gtactactag taactgtagt agcctagaga
16260ggtgtgacac cttcttatgc agcgtgttgt tccagctaag aaactcaggc tttagagtta
16320aacaaatatt gtcatctcac ttacttggtt tgtatatcaa caagctcttt tgacatgtcg
16380ttgttttagg gtagttattc cattctgttt attaatatgc tatttttcta agtactagat
16440ttgttaagtg cttcattagt taagcctaga ctattttttt ttgtaaatca ctttcgaaaa
16500gagtttatgc aagtttaata tgataacttt tcttcatatt ttgcaagaaa aaagagttta
16560tagatagtcc tcatttaaaa gaaagcaaat gaatcaagta tttaccttat taattcagaa
16620gggggtttta atgctattac tctgtctcaa aatagatcca aatgaagaaa tcactgaaaa
16680ccataattcc ttgaaatcag atgaaaataa agagaattca ttttcagcag accatgtgac
16740tactgcagtt gagaaatcca aggaaagtca agtgactgct gatgaccttg aagaagaaaa
16800ggcaaaagcg gaactgatta tggatgatga cagaacagtt gatccactac tatctaaatc
16860tcagagtatc ttaatatcta ccagtgcaac agcatcttca aaggtatttg taaaaattca
16920tacttttcat actacagctt aaaacttgaa atagaacttt aagaaatttt atcttctgtg
16980ttatatactt ctgaattacc agtggaaaat ttatcttttg atagtgatat tgtattgtca
17040catggttctt acttaatcca ataaaattta actttaagga aagtttgtag tgaatataat
17100gaaacccagt gtttaaaaat tatcagaggt gtgtgatcat aatatacttt taaatgtctc
17160agaaatgcat actcatagtg tatatatttc cataggtctt catattttaa aaatataact
17220gtctggaata atttctgaga ttttaaatta gagttatgtt tttggatatt gttttaaaac
17280gtgttaacaa ttttaacaaa aatcttaaag aaatgtttat caacagttta tcaacatctg
17340tgcttcttta aaatagatgg ttatcatcag gaacattagt attattattc gtatttgatc
17400ctttgccttt atttcctaat tttcaaaata atgaactggt gccctggcaa cctccagagg
17460tgatgaagtt gctttgtttt ttcttttttc aattcatgta aatttaatgg ttacaagtgc
17520ttttttgtta catggatata ttgtgtagtg gtaaagtcag acttttagta taaactaaaa
17580tgtacattgt acccattaag taatttctca tcccgcacct ccctctcacc tttcctagtc
17640tccattatct attattccat accctatata catgtgtaca cattatttag ctctgacttg
17700taagtgagaa catgtaccat ttgactttct gtttctgatt tatttcactt aaggtaatag
17760cctccagttc catccatgtt gtaaaagata ttatttcttt tctgtgtggc tgaatagtat
17820tcctgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt atacacattt tctttataca
17880atcatatgtt gatgtacact taggttgatt ccatatcttt gctattgtga ctagtggtgt
17940gataaacatg agtgcaggta tcttttttat ataatgattt attttccttt tggcagatac
18000tcacagtggg gttgctggat tgagtggtag ttctatattt agttccttaa gaaatcccca
18060aactattttc cataaagatt gtactaattt acattcttac caagagtata caagcattcc
18120cttttctctg tgttctcacc aacatctgtt acttttttaa ctttttaata atagctaaat
18180attctgacta gtataatata tctcactgtg gttttaattt gtgtttctct gatgattagt
18240gatggtgaac attttttttc atgtttcttg gccacttgta tgtcttcttt tcaaaaagtc
18300tattcatgtt ttttgccctc tttttagtgg ggttatttgt tttttgttgt tgttgttgag
18360gggaacatta ttattataac cttaagaaac agatatgtaa tatgtaggat tacttgtccc
18420tacattaaat tgtgcctgag tgctatactt taaaaattta tggtgtagca ttttcagtct
18480ttgtttctcc tgaatttgtc attatctctt gtagctgcaa ttagctagca gctctgtgtg
18540tttattatca gcggaagaaa acagggctag ctgaaaattt gtgtttgagc aatactttta
18600taacataaaa tacaagcttt tcttaaaatt gatgaaggag gttcattaag ccatgttcca
18660ggtatatcat ccttagctaa tttctttagg aaaaaaacac tactgctaag ttagggatgt
18720gtttattatg tctgtgctct cactttacca ctagcaccca tcagtctgtg taaagtagaa
18780aagttgttcc ttaaaagaag aaaggatatt ccggagttta tagacaggat tgtagaatgt
18840ctaatagagg caattctaaa ttagaacagg catttcatat gtaacaagta aggttgtaac
18900ttgtttcttt tgactggacc cttggcctca ttcttactct ctactgaatg accttttcta
18960aacagaaata taatcattct ccattaaagt ctttttgttg gtttctcatc acaagaattc
19020catccagact cctcatcgct gcctagtgat ctcacctggt tcttccctga ccacgtcttc
19080ctccgctttc cctgccattc actatgcttc agctccattc acctctttct gtttttcaga
19140gataacaggt tccgtccctt ctcaggcttt tacccacttg ctgtttcttt ctttcataga
19200cctttcggtg ggccctttgc actcttagct ctgatgtcag cccctcagga cagccttccc
19260tgaccaactt ctttaaagca gctcctcagc cccactctag tcattctctg tcactgcaca
19320ctattttatg tccttcatga gccatgtttg cttatatatt tatttttggt catccgtctc
19380tagaatttaa tattcttaag ggcattttat tcactgattt gctcccaatt tctactgtgt
19440ttgacacata gtagatgctt aaagaatagt gatttactgg cagtttggct tctaagccta
19500aaaaggatag ttgtcatgaa taaatcatct ttggcatttt ctgtttaata gaaaacaatt
19560gaagatagaa atataaagaa taaaaagtca acaaataata gagcatccag tgcatctgcc
19620aggtaataaa gttaccaata tttgtcattt atgggcttgc attctagcaa agctagtttt
19680aatttaactt tcataaagta aatttcattt ggtgttactg tattttcttt ttatttccat
19740ttcataaaat gaaagtagtt aacttcatga taaaacccct tggttgatga tattatttga
19800aataaagtaa tttataaaaa gtaagtctat tactgattgt tttagtgcct ggaatgttta
19860tgcaatacct ttgctctcca ggatcgtcct aggaatattt ttcttctttc ttaatgtcag
19920tgattaggga ttctttgtgc tccagactgc ttctggaata gagcttcttt ctcctacttt
19980tcctgagaca agcaatataa aatggtaata aagctgaagt ctagcaatga tacttattca
20040ttatcaagta tcattgtcta acatgagaaa ttgtactgaa agccttcaga atctatgaac
20100taagtaggtt tattaaaatg attatctgta tagcttcatt cacaccaatg ataatgaatg
20160cctaactcat aagtgctaat caaaaacctt ctgaatcttt aaaattatcg ttagtcaaat
20220tatcattaat caaataaaac agagctagca agctttttct gtaaatggcc agttagtgca
20280tattttaggc tttgtaggcg atacagtctg tattggaact actcatttct gctattttaa
20340caggaaagca gccacaggca aaacttaaca tgaatgatta cagctatggt gcaataaact
20400ttgtatatca aaaccaatgg ctggccaaat tttcccacca atccctgata tagatagtac
20460tattctttct aattttatat ttggaatgct tcatgtaaca aaatgatgaa agaaaatatt
20520aaaagagtga ttataaccta ctgtattgtt ttttccatgt aacttgagaa gtggtccata
20580tttcttaagt ttctaattac aaatatttaa aaagagcaat cattttaaag ctatataact
20640taaagttata aaatttaaat tatgttgaag gggacatatt taagttatgt ccccttctac
20700ataatttaat attctttgta tactaagact gtacatttta cctacatcat tttcaaagta
20760attataattt gttaaattat aatgtagttt ccaatttttt ttttgagatg gagtctcact
20820ctgttgctca ggctggagtt cagtggcatg atctctgctc actgcaacct ctgcctcctg
20880ggctcaagct atcctcccac ctcagcctcc agggtagcta tgactacagg catgtgccac
20940cacgccagct aattttttgt atttttggta gagacagggt ttcaccatgt tgcccaggct
21000ggtcaacagc ccaacaggat gagctcaagt catccaccca ctttggcctt ccaaagtgct
21060gggattacag gtgtgagcca tcatgcctgg ccagttttca aatattatac gtgcatattc
21120taacagatct ctcttctacc aaatgcaatt gtaatatttt gtcttgattc atttggatct
21180tttcagatta atgacctctg agtttttgaa gaaatctagt tctaaaagga gaactccatc
21240gacaactacc tcttctcact atttagggac tttaaaagtc ttggaccaaa aaccttcaca
21300gaaacagagc atagaacctg atagagcaga taacataagg gcagctgttt atcaggtaaa
21360aaaggaaaat atttttaaga gaagaagaat gatcactttc ataagcctac actgtttata
21420aagaataaag taatcctgat agaaaatgat ggtttaatac ttaaatttat tgagaaagag
21480tttcctttta atacatgagt aatcatattt tactaaatta tttgcttcca cactttgcat
21540aactgaccat agttgttttt aaagaaagaa tatgccattg caatttatag aaatacagca
21600caagccaaaa cattgtaaag tctatatatg ttttcatttt tttcttcttg aagtttatat
21660gaacaaaagg agttattatg aacaaaaagt tattaaattt tttctttcct gagatgttgt
21720taggcgtaca taggaaaaag attgtattaa tttattcaca attctaaaag tctttttttg
21780tcttttttag agtagaatag tatactttag aaaattgtac atgtgaattt cagagaaaat
21840gttaatataa agaattctaa ttcacttaag aaattttaaa tattatatga cctttttctt
21900gttcttatag gagtggttag aaaagaaaaa tgtgtattta catgaaatgc acagaataaa
21960aagaattgaa agtgaaaact taaggatcca aaatgaacag gtattctgac atatagaagt
22020aaaaatgttt tggattttta tttcagtaaa atatccctga atatataact tttctaaatc
22080agctttttaa atggcaaaat aacttgtata ttaaagaaat gatttccggt tttacttctg
22140ttttacttta tacattttag tttgatataa ctgttttaca tgaaaacaga ttttaatttt
22200gtatatgtat aggatagctt tgttcctgct gattatgaag ttattattgt ttatgagcac
22260ctaattcact tttaaaagtt gatttcattt agaacttaac caagaaggcc aggtactgtg
22320gctcatgcct gtaatcccag cactttggga ggccaaggca gatgggattc cttgaggtct
22380ggagttcgac accagcctgg gcaatgtggt gaaaccccat ctctactaaa aatacaaaaa
22440ttagccaggg atggtggtgg gcacctgtaa tcccagctac tcaggaggct gaggtggcag
22500gatcacttga acccgggagg cggaggttgc agttagctga gatcgtgcca ctgtactcca
22560gcctaggtga cagagactct gtctcaaaaa aaaaaaaaaa ggcacgacaa gataaaggat
22620cattagacac tagttagcct tcaattttcc tcttttctct cttgaatttt ataagtatct
22680tcaagtccaa cccctacctg aactcttgat ctgtatcctt tcccattgaa tggaggtgaa
22740cttttgttcc tgtctcttct gtactgagtc tcttcctcta actcctgctt gtaatacgct
22800cagttatttc ttatcttcta aagtcaaact tctggacaaa aactccagtg tgctgttcaa
22860tactaaaaat agatttagaa gaaaaatatt ttccaaggtg aactgcacga taatgcgtca
22920gtagtgaagg gagcagccct ccagggggcg tgcctgtcta tctgttaacc acgttcatag
22980cagtatgctg ctgtggtcag tgccataccc cttctcattt gattttcgta gctctgtgag
23040gtagatagta ctttgacctc taaattatgt taccccaata ttaaggtttt atgtcattta
23100atattgaaca ataaagcaaa catagaatat tatgggatta gattgaagga agtaaaataa
23160taacataact tgctatacag tctccaacct atttttcagt cgagcacata ctttcaacat
23220ttggaataca tttgtgcagt aagaacttta tgttttgata ctattcaaaa ttaagattta
23280aaccaaaaat ctgcatctta ctgcatggct tggccaattt gccttactct aacttacttt
23340ataagcccat aactttactg attttttttt caaatatttt attatgaaaa ttttactata
23400ccacttagcc tattacagtt tattttgata taatttgttt agtacacttt caaaaataat
23460agttgacatc tttctcatta ataggtcaat atgtgataaa tgtttttaga aaaggacgtt
23520ttaaaaccaa tgaataattc agataacatt ctttgtaaat tatctaagcc attctaaata
23580aattacctac tttgaaagtt aatttctaag tataatgaat atcagaggac taaagataaa
23640tgtatatgtg tatatttata tctagccata tttgtgtcta tgtatatata catatatatg
23700tatatcactc tattattttt tccactgtag aaaaaagctg ctaaaagaga agaagcatta
23760gcatcatttg aggcctggaa ggctatgaaa gaaaaggaag caaagaaaat agctgccaaa
23820aagaggcttg aagaaaaaaa caagaagaaa actgaagaag aaaatgctgc aagaaaagga
23880gaagcactac aagtattcag aactttgcac atcttaatta ttttaaaaca tttgaaatcc
23940aaattaatga ttaaccatat ttttatttat tttcaaatat tcacagtaag aaaattattc
24000tgaacttttt caggcttttg aaaaatggaa agagaaaaag atggaatatc ttaaagagaa
24060aaatagaaag gagagagaat atgaaagagc aaagaaacag aaagaggagg aaactgttgc
24120cgagaaaaag aaagataatt taactgctgt tgagaaatgg taatccaaaa tcataaatat
24180tttgatatat tttaaattat agtaacactt caggatttta taaaatttat ttacttgaaa
24240tttagtaatg catttcaatt tcattactgt caaagatgta ctagggaatc tttattatgt
24300attttccttt aactctccag tgttttatac tatgctctat aggaatgaaa aaaaggaagc
24360ttttttcaag caaaaggaaa aagaaaaaat aaatgagaaa agaaaggaag aactgaaaag
24420agctgagaaa aaagataaag ataaacaagc tattaatgaa tatgaaaaat ggctggtagg
24480tattatttgt caatgcactt tcgtcttttt catgtacctt ttgtgtcttt tctgtcccta
24540attctaattc tatttgctcc agacctactg atcatttcta cctggaatct gctttgttga
24600attcaagctc tcctcctgca tatagcatat tttctttgac ttagtcattt ctattaatgt
24660ttctactatt ccctcaaaca cccaggctga aaacttgtta taatcttctt ccttacctgc
24720atccccacat ttaccattta ctattcatgc ccattcttcc tttgctgtga ttctcacatc
24780taacatagaa agaagacaag tttactattg agggtactac gtggtggaac ttggtcatga
24840caaaaagtaa cactgaactt aatagtgaga aaattattcc atcttttatt ctcttttgat
24900gtttctgatg acctcaagga gaatctctta tttaggaatt tttaatgaaa gagagcaggt
24960ttgaggttta ggaggagcaa tagctagctg aaccagatat gtgtatatat ttgatttcac
25020tttacttatc tttataaaag ttactttttg ttgatgtcaa gcaaaatatt attttccatt
25080ttagaatatc aatataaata tgcattttgt ccatgtttat ataagtaata cattactatg
25140aataaatact ttacataagt aggtaacaca ttcatatgaa tagttaacat attcatatga
25200ttcagcaacc aaaattatag tatttttgca ctagaagtct atccagtcag gtttcctatc
25260aaactttaaa acaactcata ccaatcaact aaatcatcca ggttgttttt gatttgcatt
25320tctctggtta gaattgagct tgaatatctt ttcatttgta tacaggccat ttatctatta
25380ttttctctgt aaattgtcat ttcatagact ttgcacactt ttctattaga ttgttggttt
25440tttttcctta ctggtttcta gaatcttttg ttttgtactg gggaaattag cctatcattt
25500tttatatggg ttgcaaatat ttacccccac tatattgttg gtttcccggc tttccttata
25560gtatctcatg ccatgaagaa tttaaatttt aggtgtcaga tttctgtttt ttttttttgg
25620cttttgattt tcaagcatag ttgaaaagac ctacacaatt tgagattaaa cagaattatc
25680ttatttttct tctaacaact ttgtgacttt aatatcttaa tgttttaaca tttgttctgc
25740ttggaatttg ccctgataca tggtgggaaa tatgatttca actttagttt ttccaaatgt
25800atcctttata aagtagccca tttttaccca ttgatttgag gtgctacttc tgttatatga
25860taccttctca tgttttcggg tctgtttctt aactttctgt tccattggtc agtctcgtga
25920ttccagtgcc acacttccat tattaggctt gatatgtcta aatatctgct tggattcatc
25980tccctttata gttcttcttt cacagtcttt ctgaccagtc ttgtttattt attttttcca
26040taaacttaag aatcagcagt agttagaaag gtacatggga ccaaaatgag cgatttaaag
26100ataggataaa aagataaaac aataataaac ttaagaaaca tgccagacca acataaagaa
26160aattgtagaa ctctcctgaa caacacaaat gaagacttga gaaaatggat cagaattgcc
26220catgcacaga aacacactta accttataat gatgttataa ggatgtcagc tctccctgaa
26280gtcatttaat gcaatcttaa caaaagccaa caggatttac tctgtgtgtt gagtttagta
26340ctgctatatg ctaattcgat gcagagaaat agtaataaaa taaggtaatc aaaattggtt
26400caattttgaa tgaaaaaggt agtgtttcat gatgatttcc ttaagttaat ctgttaaata
26460atgctatgtt ctaaaaaaaa atttaaagtc cacttatatt aagaagatgt acactgactg
26520ctagtatcaa ttagggaaat taaatgtaaa catttgagtt ttccatttta attccatatc
26580ttcatgaaaa tggaatagaa tttctttaat aagtcacatt taggtatact gtttttaatt
26640atagcactta attacattgt cattcttatc agtcctctga agaacaagaa ttcctcaaag
26700accaaagaca aaataacatg tttgatatct agtaaaatgt ctgcaaatat agtacaccta
26760taaacacata aacatacatg ttacagatcg gttctccttc ttaccaaatt cttattgaaa
26820tttgtttgca gatagaatag aaaaattgcc cctgtatagg agtctaatga cttcagtttt
26880catggaaaac aacatctcaa gctttttata tacaaactag tttgaacagt aagcatttgg
26940tgggtaattg ctttagggga aagttaatag ccaaagatca ggtaagacta aaatattttt
27000cttgccaatt accagattaa ttcatcatta cctttagtaa gaaaataagc aaaaagctca
27060gttttccaca aataaatgtc tgaaggactt tttaacaagg ttcttttaat tactatcaag
27120gtgactattg attcttttga actgatatta cagttaatat aattgtctat ttgctaccct
27180ggctttacag ctccctgcta gtaagatgaa gcatatttca agttactgcc ccctcatgtt
27240aagtgaaatt acaaaaagag atttattcag tcaatttctg tggacacagt ctggtcactg
27300cttttcttcc gcctagctag atggtctgtc tctaaaatat taaaatgatt gaagatgatc
27360taattacagc tttgcttttc tcaattaaaa ttctgaaagg aagtttcctc tttgccttat
27420tagaaatagc aagcaaacaa acatgcaagc attcttatga catggaatga ggatatgggt
27480gttaacattg acaaaaaaca aacaaacctc ccacttcact ttgtttgtta catgtgaatg
27540gaaagcttgt cctgtattgc catattattc ttgtggcatt tatatatata ctgatgaaaa
27600gatgcataca tacctaatca ttttccataa tgcctttcct cccaagccat caacctgcag
27660aggcaggttt cactaagggt tttcctgctc cttgaggaat atgagaaaaa taccaagatg
27720aagaaaccac caaaccttat agtgttagca gagacataaa gggacacctg gtgcccctct
27780tccatttctt gtctcctgcc ttctgccaag ccttagtcac aatggatatt tttgtttcct
27840cccacagcac acattttttt tcccactctc agagccctca ccactactgt ttgcaagcaa
27900agctcttccc cgatatttat cacgagtggc ttctcttatc catcatgtca cacttcaaag
27960ggactttccc tgagtccatt ttttgttgaa agtaaatact cttttttatt ccttctcata
28020gttttaaaac atgtttcaga gaaattcaca caatttggaa ttatctgttg tttattttct
28080ttgtttctgt ccattttgaa agttccctgg gggacaggga ccatatctgt gtgttgggat
28140tttaaaaaat tatttttatt tgcaaatgac acataaaaag tgcacatatt tatggaatac
28200agtgtgatgt ttccatctac attgtataca ttgtgtaaca atcagaaatg actcacaaag
28260gtaggcaaaa tgtttgatgc aaagatatca ttaatattta ttataggaaa gtacacaaat
28320tactaaaaat taaaggcaaa taccatacat ttaaatgggc caaataattg agcagaaaat
28380ttacaaaagg ctaaagaaat gtttgaaaat gtgctcaagt tcaataataa agaaacatga
28440ggcagaattt ttaactattt gtaaaaaatt tgaagtatct catactgtca tgacatattg
28500aaactttgca cccagtaaac ttacttctga gaatttgttc tcacgaagtc accaccaact
28560tataacagtt actatatttg agttataatt ataggtcttt ttttctattt tatacaattc
28620ttttttaatg ttttcacttt taaagtttaa aaaattaagt gatattagta cttgcaaatt
28680gacaatgttt actaattttt ttcttgtttc cattttttgt ttgtttgttt ttttgagaca
28740gggtctcact ctgttgccca ggctggagtg cagtggtgca atctcggctc actgcaacct
28800ccacctccca ggctcaagca atcctcccat ctcagcctcc taagtaggtg ggactatagg
28860catgcaccgc cacacctggc taatttttgt gttgttttgt agagatgatg tttcaccatg
28920tttcccaggc tggtctcgaa ctcccaggct caaacaatcc acccacctta gtctcctaaa
28980gttctgggat tactggcatg agccaccatg cctggcccta cctgttattt ctttatgatc
29040tgttaaacta ggaagtgata tataaatatc ctataatgga ttattttgtt cttcagcaag
29100caacctgatt tgaaaataat aatcatatat gtacataaat ttatagtgtt ctattttctc
29160tttaggaaaa taaggaaaaa caagaaagaa ttgaacgaaa acagaagaaa cgtcattcct
29220ttcttgaaag tgaggcactt cctccgtgga gccctccaag cagaactgtg ttcgcaaaag
29280tgttttgata attctagttc ttacattatt tggttattta tcggtttgcc aatattagcc
29340atagatttaa aaccattcaa ttatttatag ttagaggaat atattttaat taaatgccag
29400acactcctgc tgacaatgaa agaaatactt tggaatgtaa tcagtgaaag catttttttg
29460aactgtagat aaactgcctc aaacaaagac ctaataatca gattgttttt accattaaga
29520tacataagat tttatcatgt cctgataatt cttatggtgg agtgattcat gatctttttc
29580attaagctct gtatgttatt taagtatatt taattccagt aataaaaagg aaatcatcta
29640ggtaccataa tgatagaaat tattcctttt gtggatgatt gtgaatctag attcaggttt
29700ttaaatgaag ggtcgctggg aagtgcgcat atattattcc ttctgaaact
29750

SEQ ID NO: 17 (200 nt, DNA, Homo sapiens)
acttccttcg tctgggtggt tgccccagcg acacgttggg ccgaagagcg gtgttgggta  60
cccgagagac ccggcggtgg ggaagtcact tcctcccgaa gacgctgttt cctagcaacc  120
gccctccgcc tctgttatta gcccctcctc ctcgctcggt ccaggaccgg ctctgcgggc  180
gccgccaggc ccagaccaag  200

SEQ ID NO: 18 (139 nt, DNA, Homo sapiens)
ctactatcag aagttgaatt ctaataatta gctattttat aaaggtaacg agaaaaaata  60
cactatgtct gatgaagttt ttagcaccac tttggcatat acaaagagtc caaaagttac  120
caaaagaact actttccag  139

SEQ ID NO: 19 (85 nt, DNA, Homo sapiens)
gatgagctaa taagagcaat tacagctcgc tcagccagac aaaggagttc tgaatactca  60
gatgactttg acagtgatga gattg  85

SEQ ID NO: 20 (321 nt, DNA, Homo sapiens)
tttctttagg tgatttttct gacacttcag cagatgaaaa ttcagttaat aaaaaaatga  60
atgactttca tatatcagat gatgaagaaa agaatccttc aaaactattg tttttgaaaa  120
ccaataaatc aaacggtaac ataaccaaag atgagccagt gtgtgccatc aaaaatgaag  180
aggaaatggc acctgatggg tgtgaagaca ttgttgtaaa atctttctct gaatctcaaa  240
ataaggatga ggaatttgaa aaagacaaaa taaaaatgaa acctaaaccc agaattcttt  300
caattaaaag cacatcttca g  321

SEQ ID NO: 21 (227 nt, DNA, Homo sapiens)
cagaaaacaa cagccttgac acagatgatc actttaaacc atcacctcgg ccaaggagta  60
tgttgaaaaa gaaaagtcac atggaggaga aggatggact agaagataaa gaaactgccc  120
tcagtgaaga attggagtta cattctgcac cttcttccct tccaacgccg aatggcatac  180
aattagaagc tgagaaaaaa gcattctctg aaaaccttga tcctgag  227

SEQ ID NO: 22 (94 nt, DNA, Homo sapiens)
gattcatgct taacaagtct agcatcatca tcacttaaac aaattcttgg agattctttt  60
tcaccaggat ctgagggaaa cgcatctgga aaag  94

SEQ ID NO: 23 (248 nt, DNA, Homo sapiens)
atccaaatga agaaatcact gaaaaccata attccttgaa atcagatgaa aataaagaga  60
attcattttc agcagaccat gtgactactg cagttgagaa atccaaggaa agtcaagtga  120
ctgctgatga ccttgaagaa gaaaaggcaa aagcggaact gattatggat gatgacagaa  180
cagttgatcc actactatct aaatctcaga gtatcttaat atctaccagt gcaacagcat  240
cttcaaag  248

SEQ ID NO: 24 (71 nt, DNA, Homo sapiens)
aaaacaattg aagatagaaa tataaagaat aaaaagtcaa caaataatag agcatccagt  60
gcatctgcca g  71

SEQ ID NO: 25 (169 nt, DNA, Homo sapiens)
attaatgacc tctgagtttt tgaagaaatc tagttctaaa aggagaactc catcgacaac  60
tacctcttct cactatttag ggactttaaa agtcttggac caaaaacctt cacagaaaca  120
gagcatagaa cctgatagag cagataacat aagggcagct gtttatcag  169

SEQ ID NO: 26 (90 nt, DNA, Homo sapiens)
gagtggttag aaaagaaaaa tgtgtattta catgaaatgc acagaataaa aagaattgaa  60
agtgaaaact taaggatcca aaatgaacag  90

SEQ ID NO: 27 (160 nt, DNA, Homo sapiens)
aaaaaagctg ctaaaagaga agaagcatta gcatcatttg aggcctggaa ggctatgaaa  60
gaaaaggaag caaagaaaat agctgccaaa aagaggcttg aagaaaaaaa caagaagaaa  120
actgaagaag aaaatgctgc aagaaaagga gaagcactac  160

SEQ ID NO: 28 (146 nt, DNA, Homo sapiens)
gcttttgaaa aatggaaaga gaaaaagatg gaatatctta aagagaaaaa tagaaaggag  60
agagaatatg aaagagcaaa gaaacagaaa gaggaggaaa ctgttgccga gaaaaagaaa  120
gataatttaa ctgctgttga gaaatg  146

SEQ ID NO: 29 (133 nt, DNA, Homo sapiens)
gaatgaaaaa aaggaagctt ttttcaagca aaaggaaaaa gaaaaaataa atgagaaaag  60
aaaggaagaa ctgaaaagag ctgagaaaaa agataaagat aaacaagcta ttaatgaata  120
tgaaaaatgg ctg  133

SEQ ID NO: 30 (485 nt, DNA, Homo sapiens)
gaaaataagg aaaaacaaga aagaattgaa cgaaaacaga agaaacgtca ttcctttctt  60
gaaagtgagg cacttcctcc gtggagccct ccaagcagaa ctgtgttcgc aaaagtgttt  120
tgataattct agttcttaca ttatttggtt atttatcggt ttgccaatat tagccataga  180
tttaaaacca ttcaattatt tatagttaga ggaatatatt ttaattaaat gccagacact  240
cctgctgaca atgaaagaaa tactttggaa tgtaatcagt gaaagcattt ttttgaactg  300
tagataaact gcctcaaaca aagacctaat aatcagattg tttttaccat taagatacat  360
aagattttat catgtcctga taattcttat ggtggagtga ttcatgatct ttttcattaa  420
gctctgtatg ttatttaagt atatttaatt ccagtaataa aaaggaaatc atctaggtac  480
cataa  485

SEQ ID NO: 31 (24 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
atgtctgatg aagtttttag cacc  24

SEQ ID NO: 32 (22 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
aggcctcaaa tgatgctaat gc  22

SEQ ID NO: 33 (21 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
atcatttgag gcctggaagg c  21

SEQ ID NO: 34 (23 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
aaacactttt gcgaacacag ttc  23

SEQ ID NO: 35 (21 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
acaacgaata acagagtgtc c  21

SEQ ID NO: 36 (20 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
actcctgata aacagctgcc  20

SEQ ID NO: 37 (29 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
gccaccatgt ctgatgaagt ttttagcac  29

SEQ ID NO: 38 (24 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
gaaacacttt tgcgaacaca gttc  24

SEQ ID NO: 39 (26 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
taatgtctga tgaagttttt agcacc  26

SEQ ID NO: 40 (26 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
tcaaaacact tttgcgaaca cagttc  26

SEQ ID NO: 41 (25 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
aatgtctgat gaagttttta gcacc  25

SEQ ID NO: 42 (19 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
tcagcttgcc gtaggtggc  19

SEQ ID NO: 43 (19 nt, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic primer)
atggtcctgc tggagttcg  19

SEQ ID NO: 44 (391 nt, DNA, Mus musculus)
aaagaagtga agacagaaac acgaagaata aaaagacaac gaataacaga gtgtccagtg  60
cctctggcag gctgatgacc tctgagtttt taaagagatc cggtcccaca aaaagaagtc  120
catctgcagc tacctcctca cactatttag ggagtttgaa agtcttggac cagaagcaac  180
cacggaagca gagcctagag ccagacaagg ctgatcacat aagggcagct gtttatcagg  240
agtggttaga aaagaaaaat gtgtatttac atgaaatgca cagaataaaa agaattgaaa  300
gcgaaaactt gaggatccaa aatgaacaga aaaaagctgc taagagagag gaagccctgg  360
catcatttga ggcctggaag gcaatgaaag a  391

SEQ ID NO: 45 (2767 nt, DNA, Mus musculus; CDS (204)..(2147))
gttgggtacc caagagacca ggcggttgga agtcacttcc
tcccggggac gctgttgcct  60
agcaaccgcc ttctgcctcc atcttttgcc ccgcctccag gttattccaa tacctggttt  120
cccagaccgc gaggcccggg ccgggggcga cacctgtgct agagcatagc cgctgggttc  180

tcagcagaga aaaaggacac acc atg tcc gat gaa atc ttc agc aca act ttg  233
                          Met Ser Asp Glu Ile Phe Ser Thr Thr Leu
                          1               5                   10

gcg tac acc aag agt cca aag gct acc aag aga act tcc ttt cag gat  281
Ala Tyr Thr Lys Ser Pro Lys Ala Thr Lys Arg Thr Ser Phe Gln Asp
                15                  20                  25

gag ctg atc aga gcc att aca gcc cgg tca gcc agg cag aga agt tcc  329
Glu Leu Ile Arg Ala Ile Thr Ala Arg Ser Ala Arg Gln Arg Ser Ser
            30                  35                  40

gaa tac tcc gat gac ttt gac agt gac gag att gtt tct tta ggt gaa  377
Glu Tyr Ser Asp Asp Phe Asp Ser Asp Glu Ile Val Ser Leu Gly Glu
        45                  50                  55

ttt tca gat acc tcg aca gat gaa agt cta gtt aga aaa aag atg aat  425
Phe Ser Asp Thr Ser Thr Asp Glu Ser Leu Val Arg Lys Lys Met Asn
    60                  65                  70

gat ttt cat ata tcc gac gat gag gaa aaa aat tct cca aga ctg tct  473
Asp Phe His Ile Ser Asp Asp Glu Glu Lys Asn Ser Pro Arg Leu Ser
75                  80                  85                  90

ttt ttg aaa acc aag aaa gta aac agg gca ata tcc aac gat gct ctg  521
Phe Leu Lys Thr Lys Lys Val Asn Arg Ala Ile Ser Asn Asp Ala Leu
                95                  100                 105

gac tcc agc act ccg ggc agc gaa ggc tcg tca ccg gat gct caa gaa  569
Asp Ser Ser Thr Pro Gly Ser Glu Gly Ser Ser Pro Asp Ala Gln Glu
            110                 115                 120

gat gtg act gga gat tcc ctc ccc aaa tct caa aat gat gat cga gaa  617
Asp Val Thr Gly Asp Ser Leu Pro Lys Ser Gln Asn Asp Asp Arg Glu
        125                 130                 135

gtc ggc aga gag atc atc aca gtg aag cct aca ccc agg atg cac ccc  665
Val Gly Arg Glu Ile Ile Thr Val Lys Pro Thr Pro Arg Met His Pro
    140                 145                 150

gtc aaa aga agc acg tcc tcg ggg gaa acc agc agc ggt ctt gat gca  713
Val Lys Arg Ser Thr Ser Ser Gly Glu Thr Ser Ser Gly Leu Asp Ala
155                 160                 165                 170

gat ggc cac ttt aag cct tca ccc cag cca agg agc atg tta aaa aag  761
Asp Gly His Phe Lys Pro Ser Pro Gln Pro Arg Ser Met Leu Lys Lys
                175                 180                 185

agc agc cac act gag gag gga gtc aga cca gga gtt gat aaa gaa cat  809
Ser Ser His Thr Glu Glu Gly Val Arg Pro Gly Val Asp Lys Glu His
            190                 195                 200

tcc ata agc gaa gcc tct gct ccc aca cct tcc ctt cca agg cag aat  857
Ser Ile Ser Glu Ala Ser Ala Pro Thr Pro Ser Leu Pro Arg Gln Asn
        205                 210                 215

ggc aca gag ttg caa act gag gaa aaa ata tac tcg gaa aac ctc gat  905
Gly Thr Glu Leu Gln Thr Glu Glu Lys Ile Tyr Ser Glu Asn Leu Asp
    220                 225                 230

ctt gag gac tca ctc tta caa agt ctg acc tca tct tcc ttc aaa gaa  953
Leu Glu Asp Ser Leu Leu Gln Ser Leu Thr Ser Ser Ser Phe Lys Glu
235                 240                 245                 250

agc ccc gga ggt tgc aca tca cca gga tct cag gaa aag gtg ccc ata  1001
Ser Pro Gly Gly Cys Thr Ser Pro Gly Ser Gln Glu Lys Val Pro Ile
                255                 260                 265

aaa gat cat gat gga gaa cct act gaa atc tgg gat tcc ttg cta tca  1049
Lys Asp His Asp Gly Glu Pro Thr Glu Ile Trp Asp Ser Leu Leu Ser
            270                 275                 280

aat gaa aat gaa gga agt tct gtt ttg gtg aac tgt gtt act cct gaa  1097
Asn Glu Asn Glu Gly Ser Ser Val Leu Val Asn Cys Val Thr Pro Glu
        285                 290                 295

ctc gag cag ccc aag gac ggt cag gtg gca gct gac gac ctt gag gaa  1145
Leu Glu Gln Pro Lys Asp Gly Gln Val Ala Ala Asp Asp Leu Glu Glu
    300                 305                 310

gaa aga gag aag ggt gga ttt aca gaa gat gac ctc acc act gac ccg  1193
Glu Arg Glu Lys Gly Gly Phe Thr Glu Asp Asp Leu Thr Thr Asp Pro
315                 320                 325                 330

ctg ctc tcc acg tcc ccg agt gtc ata aca ccc act gag cca gca gag  1241
Leu Leu Ser Thr Ser Pro Ser Val Ile Thr Pro Thr Glu Pro Ala Glu
                335                 340                 345

ccg gcc aag aaa gca aat gaa gac aga aac acg aag aat aaa aag aca  1289
Pro Ala Lys Lys Ala Asn Glu Asp Arg Asn Thr Lys Asn Lys Lys Thr
            350                 355                 360

acg aat aac aga gtg tcc agt gcc tct ggc agc agg ctg atg acc tct  1337
Thr Asn Asn Arg Val Ser Ser Ala Ser Gly Ser Arg Leu Met Thr Ser
        365                 370                 375

gag ttt tta aag aga tcc ggt ccc aca aaa aga agt cca tct gca gct  1385
Glu Phe Leu Lys Arg Ser Gly Pro Thr Lys Arg Ser Pro Ser Ala Ala
    380                 385                 390

acc tcc tca cac tat tta ggg agt ttg aaa gtc ttg gac cag aag caa  1433
Thr Ser Ser His Tyr Leu Gly Ser Leu Lys Val Leu Asp Gln Lys Gln
395                 400                 405                 410

cca cgg aag cag agc cta gag cca gac aag gct gat cac ata agg gca  1481
Pro Arg Lys Gln Ser Leu Glu Pro Asp Lys Ala Asp His Ile Arg Ala
                415                 420                 425

gct gtt tat cag gag tgg tta gaa aag aaa aat gtg tat tta cat gaa  1529
Ala Val Tyr Gln Glu Trp Leu Glu Lys Lys Asn Val Tyr Leu His Glu
            430                 435                 440

atg cac aga ata aaa aga att gaa agc gaa aac ttg agg atc caa aat  1577
Met His Arg Ile Lys Arg Ile Glu Ser Glu Asn Leu Arg Ile Gln Asn
        445                 450                 455

gaa cag aaa aaa gct gct aag aga gag gaa gcc ctg gca tca ttt gag  1625
Glu Gln Lys Lys Ala Ala Lys Arg Glu Glu Ala Leu Ala Ser Phe Glu
    460                 465                 470

gcc tgg aag gca atg aaa gag aag gaa gca aag aga ata gct gca aaa  1673
Ala Trp Lys Ala Met Lys Glu Lys Glu Ala Lys Arg Ile Ala Ala Lys
475                 480                 485                 490

aag agg ctg gag gaa aag aac aag aag aaa aca gaa gaa gaa aat gcc  1721
Lys Arg Leu Glu Glu Lys Asn Lys Lys Lys Thr Glu Glu Glu Asn Ala
                495                 500                 505

atg agg aaa ggc gag gcc ctg caa gca ttt gaa aaa tgg aaa gag aaa  1769
Met Arg Lys Gly Glu Ala Leu Gln Ala Phe Glu Lys Trp Lys Glu Lys
            510                 515                 520

aag cta gaa tac ctc aaa gag aag acc agg agg gag aaa gaa tat gaa  1817
Lys Leu Glu Tyr Leu Lys Glu Lys Thr Arg Arg Glu Lys Glu Tyr Glu
        525                 530                 535

aga gca aag aaa cag aaa gaa gag gaa gcg gtt gct gag aaa
aag aaa 1865Arg Ala Lys Lys Gln Lys Glu Glu Glu Ala Val Ala Glu Lys
Lys Lys 540 545 550gac agt tta act gct
ttt gaa aaa tgg agt gag aga aag gaa gct ctc 1913Asp Ser Leu Thr Ala
Phe Glu Lys Trp Ser Glu Arg Lys Glu Ala Leu555 560
565 570ctc aag caa aag gag aag gag aaa ata aat
gag aga aga aag gaa gag 1961Leu Lys Gln Lys Glu Lys Glu Lys Ile Asn
Glu Arg Arg Lys Glu Glu 575 580
585ctg aag aga gcc gag aag aaa gac aaa gac aag caa gcc atc agt gaa
2009Leu Lys Arg Ala Glu Lys Lys Asp Lys Asp Lys Gln Ala Ile Ser Glu
590 595 600tac gaa aag tgg ctg gaa
aag aaa gaa agg caa gaa aga att gaa cgg 2057Tyr Glu Lys Trp Leu Glu
Lys Lys Glu Arg Gln Glu Arg Ile Glu Arg 605 610
615aaa cag aag aag cgc cac tcc ttc ctt gag agc gag aca cac
cca cca 2105Lys Gln Lys Lys Arg His Ser Phe Leu Glu Ser Glu Thr His
Pro Pro 620 625 630tgg agt cct ccg agc
aga act gcg ccc tca aaa gta ttt tga 2147Trp Ser Pro Pro Ser
Arg Thr Ala Pro Ser Lys Val Phe635 640
645tgtttctggt tcttgatttt tttttcagtt caccaactgt actcatggat ttaaaacgag
2207tcatctcatt atttgtggtt agaagactct atgtcacttc cctgcaggag cttctgtgga
2267gcatgaaaga gatactttgc agtttaatca gtggaaacat tttctgaagt gtcctcatca
2327gtttgctggg acaatccaga cgcatgaagc tttattatga cctgaacagt ctggtgtggg
2387gtgattcgtg gtcactgtcg ctgagttcgg agtcttttta aagaatgttt gatcccacta
2447atgaaagaat gccagctaga taccacaatc gtagagatga ctcggtctgt ggaagtctgt
2507gcttctagag tgtagtttgg gcattgaagg tccctggaga ccatgggcat gttatctctt
2567ctaactccag ttcttcaggt cacagaagta tctttgctgt gcaagttatc gactcagtca
2627gttgaggcca cagaactcta gtcagtcact ttagtaaaga actttgccat agggtttaat
2687ctcggtgtgg tttgccttct tgaggcttac ctgacaatcg tagccacctc tataatgggc
2747tcacttctgg aatgttcttt
2767
46 647 PRT Mus musculus
46 Met Ser Asp Glu Ile Phe Ser Thr Thr Leu Ala
Tyr Thr Lys Ser Pro 1 5 10
15Lys Ala Thr Lys Arg Thr Ser Phe Gln Asp Glu Leu Ile Arg Ala Ile
20 25 30Thr Ala Arg Ser Ala Arg
Gln Arg Ser Ser Glu Tyr Ser Asp Asp Phe 35 40
45Asp Ser Asp Glu Ile Val Ser Leu Gly Glu Phe Ser Asp Thr
Ser Thr 50 55 60Asp Glu Ser Leu Val
Arg Lys Lys Met Asn Asp Phe His Ile Ser Asp 65 70
75 80Asp Glu Glu Lys Asn Ser Pro Arg Leu Ser
Phe Leu Lys Thr Lys Lys 85 90
95Val Asn Arg Ala Ile Ser Asn Asp Ala Leu Asp Ser Ser Thr Pro Gly
100 105 110Ser Glu Gly Ser Ser
Pro Asp Ala Gln Glu Asp Val Thr Gly Asp Ser 115
120 125Leu Pro Lys Ser Gln Asn Asp Asp Arg Glu Val Gly
Arg Glu Ile Ile 130 135 140Thr Val Lys
Pro Thr Pro Arg Met His Pro Val Lys Arg Ser Thr Ser145
150 155 160Ser Gly Glu Thr Ser Ser Gly
Leu Asp Ala Asp Gly His Phe Lys Pro 165
170 175Ser Pro Gln Pro Arg Ser Met Leu Lys Lys Ser Ser
His Thr Glu Glu 180 185 190Gly
Val Arg Pro Gly Val Asp Lys Glu His Ser Ile Ser Glu Ala Ser 195
200 205Ala Pro Thr Pro Ser Leu Pro Arg Gln
Asn Gly Thr Glu Leu Gln Thr 210 215
220Glu Glu Lys Ile Tyr Ser Glu Asn Leu Asp Leu Glu Asp Ser Leu Leu225
230 235 240Gln Ser Leu Thr
Ser Ser Ser Phe Lys Glu Ser Pro Gly Gly Cys Thr 245
250 255Ser Pro Gly Ser Gln Glu Lys Val Pro Ile
Lys Asp His Asp Gly Glu 260 265
270Pro Thr Glu Ile Trp Asp Ser Leu Leu Ser Asn Glu Asn Glu Gly Ser
275 280 285Ser Val Leu Val Asn Cys Val
Thr Pro Glu Leu Glu Gln Pro Lys Asp 290 295
300Gly Gln Val Ala Ala Asp Asp Leu Glu Glu Glu Arg Glu Lys Gly
Gly305 310 315 320Phe Thr
Glu Asp Asp Leu Thr Thr Asp Pro Leu Leu Ser Thr Ser Pro
325 330 335Ser Val Ile Thr Pro Thr Glu
Pro Ala Glu Pro Ala Lys Lys Ala Asn 340 345
350Glu Asp Arg Asn Thr Lys Asn Lys Lys Thr Thr Asn Asn Arg
Val Ser 355 360 365Ser Ala Ser Gly
Ser Arg Leu Met Thr Ser Glu Phe Leu Lys Arg Ser 370
375 380Gly Pro Thr Lys Arg Ser Pro Ser Ala Ala Thr Ser
Ser His Tyr Leu385 390 395
400Gly Ser Leu Lys Val Leu Asp Gln Lys Gln Pro Arg Lys Gln Ser Leu
405 410 415Glu Pro Asp Lys Ala
Asp His Ile Arg Ala Ala Val Tyr Gln Glu Trp 420
425 430Leu Glu Lys Lys Asn Val Tyr Leu His Glu Met His
Arg Ile Lys Arg 435 440 445Ile Glu
Ser Glu Asn Leu Arg Ile Gln Asn Glu Gln Lys Lys Ala Ala 450
455 460Lys Arg Glu Glu Ala Leu Ala Ser Phe Glu Ala
Trp Lys Ala Met Lys465 470 475
480Glu Lys Glu Ala Lys Arg Ile Ala Ala Lys Lys Arg Leu Glu Glu Lys
485 490 495Asn Lys Lys Lys
Thr Glu Glu Glu Asn Ala Met Arg Lys Gly Glu Ala 500
505 510Leu Gln Ala Phe Glu Lys Trp Lys Glu Lys Lys
Leu Glu Tyr Leu Lys 515 520 525Glu
Lys Thr Arg Arg Glu Lys Glu Tyr Glu Arg Ala Lys Lys Gln Lys 530
535 540Glu Glu Glu Ala Val Ala Glu Lys Lys Lys
Asp Ser Leu Thr Ala Phe545 550 555
560Glu Lys Trp Ser Glu Arg Lys Glu Ala Leu Leu Lys Gln Lys Glu
Lys 565 570 575Glu Lys Ile
Asn Glu Arg Arg Lys Glu Glu Leu Lys Arg Ala Glu Lys 580
585 590Lys Asp Lys Asp Lys Gln Ala Ile Ser Glu
Tyr Glu Lys Trp Leu Glu 595 600
605Lys Lys Glu Arg Gln Glu Arg Ile Glu Arg Lys Gln Lys Lys Arg His 610
615 620Ser Phe Leu Glu Ser Glu Thr His
Pro Pro Trp Ser Pro Pro Ser Arg625 630
635 640Thr Ala Pro Ser Lys Val Phe
645
47 647 PRT Mus musculus
47 Met Ser Asp Glu Ile Phe Ser Thr Thr Leu Ala Tyr
Thr Lys Ser Pro 1 5 10
15Lys Ala Thr Lys Arg Thr Ser Phe Gln Asp Glu Leu Ile Arg Ala Ile
20 25 30Thr Ala Arg Ser Ala Arg Gln
Arg Ser Ser Glu Tyr Ser Asp Asp Phe 35 40
45Asp Ser Asp Glu Ile Val Ser Leu Gly Glu Phe Ser Asp Thr Ser
Thr 50 55 60Asp Glu Ser Leu Val Arg
Lys Lys Met Asn Asp Phe His Ile Ser Asp 65 70
75 80Asp Glu Glu Lys Asn Ser Pro Arg Leu Ser Phe
Leu Lys Thr Lys Lys 85 90
95Val Asn Arg Ala Ile Ser Asn Asp Ala Leu Asp Ser Ser Thr Pro Gly
100 105 110Ser Glu Gly Ser Ser Pro
Asp Ala Gln Glu Asp Val Thr Gly Asp Ser 115 120
125Leu Pro Lys Ser Gln Asn Asp Asp Arg Glu Val Gly Arg Glu
Ile Ile 130 135 140Thr Val Lys Pro Thr
Pro Arg Met His Pro Val Lys Arg Ser Thr Ser145 150
155 160Ser Gly Glu Thr Ser Ser Gly Leu Asp Ala
Asp Gly His Phe Lys Pro 165 170
175Ser Pro Gln Pro Arg Ser Met Leu Lys Lys Ser Ser His Thr Glu Glu
180 185 190Gly Val Arg Pro Gly
Val Asp Lys Glu His Ser Ile Ser Glu Ala Ser 195
200 205Ala Pro Thr Pro Ser Leu Pro Arg Gln Asn Gly Thr
Glu Leu Gln Thr 210 215 220Glu Glu Lys
Ile Tyr Ser Glu Asn Leu Asp Leu Glu Asp Ser Leu Leu225
230 235 240Gln Ser Leu Thr Ser Ser Ser
Phe Lys Glu Ser Pro Gly Gly Cys Thr 245
250 255Ser Pro Gly Ser Gln Glu Lys Val Pro Ile Lys Asp
His Asp Gly Glu 260 265 270Pro
Thr Glu Ile Trp Asp Ser Leu Leu Ser Asn Glu Asn Glu Gly Ser 275
280 285Ser Val Leu Val Asn Cys Val Thr Pro
Glu Leu Glu Gln Pro Lys Asp 290 295
300Gly Gln Val Ala Ala Asp Asp Leu Glu Glu Glu Arg Glu Lys Gly Gly305
310 315 320Phe Thr Glu Asp
Asp Leu Thr Thr Asp Pro Leu Leu Ser Thr Ser Pro 325
330 335Ser Val Ile Thr Pro Thr Glu Pro Ala Glu
Pro Ala Lys Lys Ala Asn 340 345
350Glu Asp Arg Asn Thr Lys Asn Lys Lys Thr Thr Asn Asn Arg Val Ser
355 360 365Ser Ala Ser Gly Ser Arg Leu
Met Thr Ser Glu Phe Leu Lys Arg Ser 370 375
380Gly Pro Thr Lys Arg Ser Pro Ser Ala Ala Thr Ser Ser His Tyr
Leu385 390 395 400Gly Ser
Leu Lys Val Leu Asp Gln Lys Gln Pro Arg Lys Gln Ser Leu
405 410 415Glu Pro Asp Lys Ala Asp His
Ile Arg Ala Ala Val Tyr Gln Glu Trp 420 425
430Leu Glu Lys Lys Asn Val Tyr Leu His Glu Met His Arg Ile
Lys Arg 435 440 445Ile Glu Ser Glu
Asn Leu Arg Ile Gln Asn Glu Gln Lys Lys Ala Ala 450
455 460Lys Arg Glu Glu Ala Leu Ala Ser Phe Glu Ala Trp
Lys Ala Met Lys465 470 475
480Glu Lys Glu Ala Lys Arg Ile Ala Ala Lys Lys Arg Leu Glu Glu Lys
485 490 495Asn Lys Lys Lys Thr
Glu Glu Glu Asn Ala Met Arg Lys Gly Glu Ala 500
505 510Leu Gln Ala Phe Glu Lys Trp Lys Glu Lys Lys Leu
Glu Tyr Leu Lys 515 520 525Glu Lys
Thr Arg Arg Glu Lys Glu Tyr Glu Arg Ala Lys Lys Gln Lys 530
535 540Glu Glu Glu Ala Val Ala Glu Lys Lys Lys Asp
Ser Leu Thr Ala Phe545 550 555
560Glu Lys Trp Ser Glu Arg Lys Glu Ala Leu Leu Lys Gln Lys Glu Lys
565 570 575Glu Lys Ile Asn
Glu Arg Arg Lys Glu Glu Leu Lys Arg Ala Glu Lys 580
585 590Lys Asp Lys Asp Lys Gln Ala Ile Ser Glu Tyr
Glu Lys Trp Leu Glu 595 600 605Lys
Lys Glu Arg Gln Glu Arg Ile Glu Arg Lys Gln Lys Lys Arg His 610
615 620Ser Phe Leu Glu Ser Glu Thr His Pro Pro
Trp Ser Pro Pro Ser Arg625 630 635
640Thr Ala Pro Ser Lys Val Phe
645
48 344 PRT Artificial Sequence
Description of Artificial Sequence Synthetic protein sequence
48 Glu Ser Gln Val Thr Ala Asp Asp Leu Glu
Glu Glu Lys Ala Lys Ala 1 5 10
15Glu Leu Ile Met Asp Asp Asp Arg Thr Val Asp Pro Leu Leu Ser Lys
20 25 30Ser Gln Ser Ile Leu
Ile Ser Thr Ser Ala Thr Ala Ser Ser Lys Lys 35
40 45Thr Ile Glu Asp Arg Asn Ile Lys Asn Lys Lys Ser Thr
Asn Asn Arg 50 55 60Ala Ser Ser Ala
Ser Ala Arg Leu Met Thr Ser Glu Phe Leu Lys Lys 65 70
75 80Ser Ser Ser Lys Arg Arg Thr Pro Ser
Thr Thr Thr Ser Ser His Tyr 85 90
95Leu Gly Thr Leu Lys Val Leu Asp Gln Lys Pro Ser Gln Lys Gln
Ser 100 105 110Ile Glu Pro Asp
Arg Ala Asp Asn Ile Arg Ala Ala Val Tyr Gln Glu 115
120 125Trp Leu Glu Lys Lys Asn Val Tyr Leu His Glu Met
His Arg Ile Lys 130 135 140Arg Ile Glu
Ser Glu Asn Leu Arg Ile Gln Asn Glu Gln Lys Lys Ala145
150 155 160Ala Lys Arg Glu Glu Ala Leu
Ala Ser Phe Glu Ala Trp Lys Ala Met 165
170 175Lys Glu Lys Glu Ala Lys Lys Ile Ala Ala Lys Lys
Arg Leu Glu Glu 180 185 190Lys
Asn Lys Lys Lys Thr Glu Glu Glu Asn Ala Ala Arg Lys Gly Glu 195
200 205Ala Leu Gln Ala Phe Glu Lys Trp Lys
Glu Lys Lys Met Glu Tyr Leu 210 215
220Lys Glu Lys Asn Arg Lys Glu Arg Glu Tyr Glu Arg Ala Lys Lys Gln225
230 235 240Lys Glu Glu Glu
Thr Val Ala Glu Lys Lys Lys Asp Asn Leu Thr Ala 245
250 255Val Glu Lys Trp Asn Glu Lys Lys Glu Ala
Phe Phe Lys Gln Lys Lys 260 265
270Lys Glu Lys Ile Asn Glu Lys Arg Lys Glu Glu Leu Lys Arg Ala Glu
275 280 285Lys Lys Asp Lys Asp Lys Gln
Ala Ile Asn Glu Tyr Glu Lys Trp Leu 290 295
300Glu Asn Lys Glu Lys Gln Glu Arg Ile Glu Arg Lys Gln Lys Lys
Arg305 310 315 320His Ser
Phe Leu Glu Ser Glu Ala Leu Pro Pro Trp Ser Pro Pro Ser
325 330 335Arg Thr Val Phe Ala Lys Val
Phe 340
49 237 PRT Artificial Sequence
Description of Artificial Sequence Synthetic protein sequence
49 Ser Gln Lys Gln Ser Ile Glu
Pro Asp Arg Ala Asp Asn Ile Arg Ala 1 5
10 15Ala Val Tyr Gln Glu Trp Leu Glu Lys Lys Asn Val Tyr
Leu His Glu 20 25 30Met His
Arg Ile Lys Arg Ile Glu Ser Glu Asn Leu Arg Ile Gln Asn 35
40 45Glu Gln Lys Lys Ala Ala Lys Arg Glu Glu
Ala Leu Ala Ser Phe Glu 50 55 60Ala
Trp Lys Ala Met Lys Glu Lys Glu Ala Lys Lys Ile Ala Ala Lys 65
70 75 80Lys Arg Leu Glu Glu Lys
Asn Lys Lys Lys Thr Glu Glu Glu Asn Ala 85
90 95Ala Arg Lys Gly Glu Ala Leu Gln Ala Phe Glu Lys
Trp Lys Glu Lys 100 105 110Lys
Met Glu Tyr Leu Lys Glu Lys Asn Arg Lys Glu Arg Glu Tyr Glu 115
120 125Arg Ala Lys Lys Gln Lys Glu Glu Glu
Thr Val Ala Glu Lys Lys Lys 130 135
140Asp Asn Leu Thr Ala Val Glu Lys Trp Asn Glu Lys Lys Glu Ala Phe145
150 155 160Phe Lys Gln Lys
Lys Lys Glu Lys Ile Asn Glu Lys Arg Lys Glu Glu 165
170 175Leu Lys Arg Ala Glu Lys Lys Asp Lys Asp
Lys Gln Ala Ile Asn Glu 180 185
190Tyr Glu Lys Trp Leu Glu Asn Lys Glu Lys Gln Glu Arg Ile Glu Arg
195 200 205Lys Gln Lys Lys Arg His Ser
Phe Leu Glu Ser Glu Ala Leu Pro Pro 210 215
220Trp Ser Pro Pro Ser Arg Thr Val Phe Ala Lys Val Phe225
230 235
50 170 PRT Artificial Sequence
Description of Artificial Sequence Synthetic protein sequence
50 Ala Met Lys Glu Lys
Glu Ala Lys Lys Ile Ala Ala Lys Lys Arg Leu 1 5
10 15Glu Glu Lys Asn Lys Lys Lys Thr Glu Glu Glu
Asn Ala Ala Arg Lys 20 25
30Gly Glu Ala Leu Gln Ala Phe Glu Lys Trp Lys Glu Lys Lys Met Glu
35 40 45Tyr Leu Lys Glu Lys Asn Arg Lys
Glu Arg Glu Tyr Glu Arg Ala Lys 50 55
60Lys Gln Lys Glu Glu Glu Thr Val Ala Glu Lys Lys Lys Asp Asn Leu 65
70 75 80Thr Ala Val Glu
Lys Trp Asn Glu Lys Lys Glu Ala Phe Phe Lys Gln 85
90 95Lys Lys Lys Glu Lys Ile Asn Glu Lys Arg
Lys Glu Glu Leu Lys Arg 100 105
110Ala Glu Lys Lys Asp Lys Asp Lys Gln Ala Ile Asn Glu Tyr Glu Lys
115 120 125Trp Leu Glu Asn Lys Glu Lys
Gln Glu Arg Ile Glu Arg Lys Gln Lys 130 135
140Lys Arg His Ser Phe Leu Glu Ser Glu Ala Leu Pro Pro Trp Ser
Pro145 150 155 160Pro Ser
Arg Thr Val Phe Ala Lys Val Phe 165
170
51 477 PRT Artificial Sequence
Description of Artificial Sequence Synthetic protein sequence
51 Met Ser Asp Glu Val Phe Ser Thr Thr Leu
Ala Tyr Thr Lys Ser Pro 1 5 10
15Lys Val Thr Lys Arg Thr Thr Phe Gln Asp Glu Leu Ile Arg Ala Ile
20 25 30Thr Ala Arg Ser Ala
Arg Gln Arg Ser Ser Glu Tyr Ser Asp Asp Phe 35
40 45Asp Ser Asp Glu Ile Val Ser Leu Gly Asp Phe Ser Asp
Thr Ser Ala 50 55 60Asp Glu Asn Ser
Val Asn Lys Lys Met Asn Asp Phe His Ile Ser Asp 65 70
75 80Asp Glu Glu Lys Asn Pro Ser Lys Leu
Leu Phe Leu Lys Thr Asn Lys 85 90
95Ser Asn Gly Asn Ile Thr Lys Asp Glu Pro Val Cys Ala Ile Lys
Asn 100 105 110Glu Glu Glu Met
Ala Pro Asp Gly Cys Glu Asp Ile Val Val Lys Ser 115
120 125Phe Ser Glu Ser Gln Asn Lys Asp Glu Glu Phe Glu
Lys Asp Lys Ile 130 135 140Lys Met Lys
Pro Lys Pro Arg Ile Leu Ser Ile Lys Ser Thr Ser Ser145
150 155 160Ala Glu Asn Asn Ser Leu Asp
Thr Asp Asp His Phe Lys Pro Ser Pro 165
170 175Trp Pro Arg Ser Met Leu Lys Lys Lys Ser His Met
Glu Glu Lys Asp 180 185 190Gly
Leu Glu Asp Lys Glu Thr Ala Leu Ser Glu Glu Leu Glu Leu His 195
200 205Ser Ala Pro Ser Ser Leu Pro Thr Pro
Asn Gly Ile Gln Leu Glu Ala 210 215
220Glu Lys Lys Ala Phe Ser Glu Asn Leu Asp Pro Glu Asp Ser Cys Leu225
230 235 240Thr Ser Leu Ala
Ser Ser Ser Leu Lys Gln Ile Leu Gly Asp Ser Phe 245
250 255Ser Pro Gly Ser Glu Gly Asn Ala Ser Gly
Lys Asp Pro Asn Glu Glu 260 265
270Ile Thr Glu Asn His Asn Ser Leu Lys Ser Asp Glu Asn Lys Glu Asn
275 280 285Ser Phe Ser Ala Asp His Val
Thr Thr Ala Val Glu Lys Ser Lys Glu 290 295
300Ser Gln Val Thr Ala Asp Asp Leu Glu Glu Glu Lys Ala Lys Ala
Glu305 310 315 320Leu Ile
Met Asp Asp Asp Arg Thr Val Asp Pro Leu Leu Ser Lys Ser
325 330 335Gln Ser Ile Leu Ile Ser Thr
Ser Ala Thr Ala Ser Ser Lys Lys Thr 340 345
350Ile Glu Asp Arg Asn Ile Lys Asn Lys Lys Ser Thr Asn Asn
Arg Ala 355 360 365Ser Ser Ala Ser
Ala Arg Leu Met Thr Ser Glu Phe Leu Lys Lys Ser 370
375 380Ser Ser Lys Arg Arg Thr Pro Ser Thr Thr Thr Ser
Ser His Tyr Leu385 390 395
400Gly Thr Leu Lys Val Leu Asp Gln Lys Pro Ser Gln Lys Gln Ser Ile
405 410 415Glu Pro Asp Arg Ala
Asp Asn Ile Arg Ala Ala Val Tyr Gln Glu Trp 420
425 430Leu Glu Lys Lys Asn Val Tyr Leu His Glu Met His
Arg Ile Lys Arg 435 440 445Ile Glu
Ser Glu Asn Leu Arg Ile Gln Asn Glu Gln Lys Lys Ala Ala 450
455 460Lys Arg Glu Glu Ala Leu Ala Ser Phe Glu Ala
Trp Lys465 470 475
52 418 PRT Artificial Sequence
Description of Artificial Sequence Synthetic protein sequence
52 Met Ser Asp Glu Val Phe Ser Thr Thr Leu Ala Tyr Thr Lys Ser
Pro 1 5 10 15Lys Val Thr
Lys Arg Thr Thr Phe Gln Asp Glu Leu Ile Arg Ala Ile 20
25 30Thr Ala Arg Ser Ala Arg Gln Arg Ser Ser
Glu Tyr Ser Asp Asp Phe 35 40
45Asp Ser Asp Glu Ile Val Ser Leu Gly Asp Phe Ser Asp Thr Ser Ala 50
55 60Asp Glu Asn Ser Val Asn Lys Lys Met
Asn Asp Phe His Ile Ser Asp 65 70 75
80Asp Glu Glu Lys Asn Pro Ser Lys Leu Leu Phe Leu Lys Thr
Asn Lys 85 90 95Ser Asn
Gly Asn Ile Thr Lys Asp Glu Pro Val Cys Ala Ile Lys Asn 100
105 110Glu Glu Glu Met Ala Pro Asp Gly Cys
Glu Asp Ile Val Val Lys Ser 115 120
125Phe Ser Glu Ser Gln Asn Lys Asp Glu Glu Phe Glu Lys Asp Lys Ile
130 135 140Lys Met Lys Pro Lys Pro Arg
Ile Leu Ser Ile Lys Ser Thr Ser Ser145 150
155 160Ala Glu Asn Asn Ser Leu Asp Thr Asp Asp His Phe
Lys Pro Ser Pro 165 170
175Trp Pro Arg Ser Met Leu Lys Lys Lys Ser His Met Glu Glu Lys Asp
180 185 190Gly Leu Glu Asp Lys Glu
Thr Ala Leu Ser Glu Glu Leu Glu Leu His 195 200
205Ser Ala Pro Ser Ser Leu Pro Thr Pro Asn Gly Ile Gln Leu
Glu Ala 210 215 220Glu Lys Lys Ala Phe
Ser Glu Asn Leu Asp Pro Glu Asp Ser Cys Leu225 230
235 240Thr Ser Leu Ala Ser Ser Ser Leu Lys Gln
Ile Leu Gly Asp Ser Phe 245 250
255Ser Pro Gly Ser Glu Gly Asn Ala Ser Gly Lys Asp Pro Asn Glu Glu
260 265 270Ile Thr Glu Asn His
Asn Ser Leu Lys Ser Asp Glu Asn Lys Glu Asn 275
280 285Ser Phe Ser Ala Asp His Val Thr Thr Ala Val Glu
Lys Ser Lys Glu 290 295 300Ser Gln Val
Thr Ala Asp Asp Leu Glu Glu Glu Lys Ala Lys Ala Glu305
310 315 320Leu Ile Met Asp Asp Asp Arg
Thr Val Asp Pro Leu Leu Ser Lys Ser 325
330 335Gln Ser Ile Leu Ile Ser Thr Ser Ala Thr Ala Ser
Ser Lys Lys Thr 340 345 350Ile
Glu Asp Arg Asn Ile Lys Asn Lys Lys Ser Thr Asn Asn Arg Ala 355
360 365Ser Ser Ala Ser Ala Arg Leu Met Thr
Ser Glu Phe Leu Lys Lys Ser 370 375
380Ser Ser Lys Arg Arg Thr Pro Ser Thr Thr Thr Ser Ser His Tyr Leu385
390 395 400Gly Thr Leu Lys
Val Leu Asp Gln Lys Pro Ser Gln Lys Gln Ser Ile 405
410 415Glu Pro
53 303 PRT Artificial Sequence
Description of Artificial Sequence Synthetic protein sequence
53 Met Ser Asp Glu Val Phe Ser Thr Thr Leu Ala Tyr Thr Lys Ser
Pro 1 5 10 15Lys Val Thr
Lys Arg Thr Thr Phe Gln Asp Glu Leu Ile Arg Ala Ile 20
25 30Thr Ala Arg Ser Ala Arg Gln Arg Ser Ser
Glu Tyr Ser Asp Asp Phe 35 40
45Asp Ser Asp Glu Ile Val Ser Leu Gly Asp Phe Ser Asp Thr Ser Ala 50
55 60Asp Glu Asn Ser Val Asn Lys Lys Met
Asn Asp Phe His Ile Ser Asp 65 70 75
80Asp Glu Glu Lys Asn Pro Ser Lys Leu Leu Phe Leu Lys Thr
Asn Lys 85 90 95Ser Asn
Gly Asn Ile Thr Lys Asp Glu Pro Val Cys Ala Ile Lys Asn 100
105 110Glu Glu Glu Met Ala Pro Asp Gly Cys
Glu Asp Ile Val Val Lys Ser 115 120
125Phe Ser Glu Ser Gln Asn Lys Asp Glu Glu Phe Glu Lys Asp Lys Ile
130 135 140Lys Met Lys Pro Lys Pro Arg
Ile Leu Ser Ile Lys Ser Thr Ser Ser145 150
155 160Ala Glu Asn Asn Ser Leu Asp Thr Asp Asp His Phe
Lys Pro Ser Pro 165 170
175Trp Pro Arg Ser Met Leu Lys Lys Lys Ser His Met Glu Glu Lys Asp
180 185 190Gly Leu Glu Asp Lys Glu
Thr Ala Leu Ser Glu Glu Leu Glu Leu His 195 200
205Ser Ala Pro Ser Ser Leu Pro Thr Pro Asn Gly Ile Gln Leu
Glu Ala 210 215 220Glu Lys Lys Ala Phe
Ser Glu Asn Leu Asp Pro Glu Asp Ser Cys Leu225 230
235 240Thr Ser Leu Ala Ser Ser Ser Leu Lys Gln
Ile Leu Gly Asp Ser Phe 245 250
255Ser Pro Gly Ser Glu Gly Asn Ala Ser Gly Lys Asp Pro Asn Glu Glu
260 265 270Ile Thr Glu Asn His
Asn Ser Leu Lys Ser Asp Glu Asn Lys Glu Asn 275
280 285Ser Phe Ser Ala Asp His Val Thr Thr Ala Val Glu
Lys Ser Lys 290 295 300