Patent application title: REPLICATION PROTEIN
Inventors:
Dawn Coverley (York, GB)
Assignees:
CIZZLE BIOTECHNOLOGY LIMITED
IPC8 Class: AG01N33566FI
USPC Class:
435 6
Class name: Chemistry: molecular biology and microbiology measuring or testing process involving enzymes or micro-organisms; composition or test strip therefore; processes of forming such composition or test strip involving nucleic acid
Publication date: 2011-03-24
Patent application number: 20110070591
Abstract:
This invention relates to a screening method for the identification of
agents which modulate the activity of a DNA replication protein as a
target for intervention in cancer therapy and includes agents which
modulate said activity. The invention also relates to the use of the DNA
replication protein, and its RNA transcripts in the prognosis and
diagnosis of proliferative disease, e.g., cancer.
Claims:
1-33. (canceled)
34. A diagnostic method for the identification of a cancer comprising detecting the presence of a Ciz1 polypeptide, wherein the presence or expression of said Ciz1 polypeptide is indicative of the presence of cancer.
35. The diagnostic method according to claim 34, wherein said method comprises one or more of the following steps: (i) contacting a sample isolated from a subject to be tested with an agent which specifically binds a Ciz1 polypeptide; and (ii) detecting or measuring the binding of said agent to said Ciz1 polypeptide.
36. The method of claim 34, wherein said Ciz1 polypeptide comprises an amino-acid sequence selected from the group consisting of SEQ ID NO: 29-44, 47, 48, 58-64 and 65.
37. The method of claim 36, wherein said Ciz1 polypeptide comprises an amino-acid sequence selected from the group consisting of SEQ ID NO: 58-64 and 65.
38. The method of claim 37, wherein said Ciz1 polypeptide comprises an amino-acid sequence of SEQ ID NO: 64.
39. The method of claim 35, wherein said agent is an antibody.
40. The method of claim 39, wherein said antibody is a monoclonal antibody.
41. The method of claim 35, wherein said agent is an aptamer.
42. The method of claim 39, wherein said Ciz1 polypeptide comprises an amino-acid sequence of SEQ ID NO: 64.
43. The method according to claim 34, wherein the cancer is a pediatric cancer selected from the group consisting of retinoblastoma, neuroblastoma, Burkitt lymphoma, medulloblastoma, and Ewing's sarcoma family tumors.
44. The method according to claim 34, wherein the cancer is carcinoma, adenocarcinoma, lymphoma or leukemia.
45. The method according to claim 34, wherein the cancer is liver, lung or skin cancer.
46. The method of claim 35, wherein said agent specifically binds to a Ciz1 polypeptide at a junction sequence translated from an alternatively spliced Ciz1 transcript.
47. The method of claim 46, wherein said agent specifically binds a junction sequence of a Ciz1 polypeptide created when the amino-acid sequence VEEELCKQ is missing.
48. The method of claim 35, wherein said agent specifically binds to a Ciz1 polypeptide at a junction sequence created from an alternatively spliced Ciz1 transcript that is missing the nucleotide sequence of SEQ ID NO: 56.
49. The method of claim 46, wherein said agent is an antibody.
50. The method of claim 47, wherein said agent is an antibody.
51. The method of claim 48, wherein said agent is an antibody.
52. The method of claim 49, wherein said cancer is lung cancer.
53. The method of claim 50, wherein said cancer is lung cancer.
54. The method of claim 51, wherein said cancer is lung cancer.
Description:
REFERENCE TO RELATED APPLICATIONS
[0001]This application is a continuation of U.S. patent application Ser. No. 10/537,228, filed Jan. 13, 2006, which claims the benefit under 35 U.S.C. §371 of PCT Application Serial No. PCT/GB2003/005334, filed Dec. 5, 2003, which claims the benefit of Great Britain Application Serial No. 0228337.2, filed Dec. 5, 2002 and U.S. Provisional Application Ser. No. 60/433,925, filed Dec. 17, 2002, the disclosures of which are incorporated by reference herein in their entireties.
FIELD OF THE INVENTION
[0002]This invention relates to a screening method for the identification of agents which modulate the activity of a DNA replication protein as a target for intervention in cancer therapy and includes agents which modulate said activity. The invention also relates to the use of the DNA replication protein, and its RNA transcripts in the prognosis and diagnosis of proliferative disease e.g., cancer.
BACKGROUND
[0003]Initiation of DNA replication is a major control point in the mammalian cell cycle, and the point of action of many gene products that are mis-regulated in cancer (Hanahan and Weinberg, 2000). The initiation process involves assembly of pre-replication complex proteins, which include the origin recognition complex (ORC), Cdc6, Cdt1 and Mcm proteins, at replication origins during G1 phase of the cell cycle. This is followed by the action of a second group of proteins, which facilitate loading of DNA polymerases and their accessory factors including PCNA, and the transition to S phase. The initiation process is regulated by cyclin-dependent protein kinase 2 (Cdk2), Cdc7-dbf4 and the Cdt1 inhibitor geminin (for review see Bell and Dutta, 2002). In the nucleus of S phase cells, replication forks cluster together to form hundreds of replication 'foci' or factories (Cook, 1999). Replication factories appear to be linked to a structural framework within the nucleus; however, the nature of the molecules that form the link and their role in replication fork activity remain unclear.
[0004]Identification of proteins involved in eukaryotic DNA replication and analysis of the basic pathways that regulate their activity during the cell cycle has been driven largely by yeast genetics. These proteins and pathways are generally conserved from yeast to man. However, in multi-cellular organisms that differentiate down diverse developmental pathways, additional layers of complexity are being uncovered. For example, in vertebrates several proteins involved in neuronal differentiation also regulate the G1-S phase transition (Ohnuma et al., 2001). These include the cdk inhibitor p21(CIP1/WAF1/SDI1), which has been implicated in oligodendrocyte differentiation following growth arrest (Zezula et al., 2001), and in the terminal differentiation of other cell types (Parker et al., 1995).
[0005]Initiation of DNA replication can be reconstituted in vitro with isolated nuclei and cytosolic extracts from mammalian cells (Krude, 2000; Krude et al., 1997; Laman et al., 2001; Stoeber et al., 1998). Furthermore, using recombinant Cdk2 complexed with either cyclins E or A, replication complex assembly and activation of DNA synthesis can be reconstituted independently (Coverley et al., 2002). We have studied the activation step, catalyzed in vitro by cyclin A-cdk2, and shown that a relatively unstudied protein, p21-Cip1 interacting zinc-finger protein (Ciz1) functions during this stage of the initiation process. Human Ciz1 was previously identified using a modified yeast two-hybrid screen with cyclin E-p21, and biochemical analysis supported an interaction with p21 (Mitsui et al., 1999). A potential role in transcription was proposed but not demonstrated, and no other function was assigned to Ciz1. More recently the Ciz1 gene was isolated from a human medulloblastoma derived cDNA library using an in vivo tumorigenesis model (Warder and Keherly, 2003). Our analysis shows for the first time that Ciz1 plays a positive role in initiation of DNA replication.
[0006]A number of changes to chromatin bound proteins occur when DNA synthesis is activated in vitro by recombinant cyclin A-cdk2. The present invention relates to the finding that a cdc6-related antigen, p85, correlates with the initiation of DNA replication and is regulated by cyclin A-cdk2. The protein was cloned from a mouse embryo library and identified as mouse Ciz1.
[0007]In vitro analysis has shown that Ciz1 protein positively regulates initiation of DNA replication and that its activity is modulated by cdk phosphorylation at threonine 191/2, linking it to the cdk-dependent pathways that control initiation. The embryonic form of mouse Ciz1 is alternatively spliced compared to the predicted and somatic forms. Human Ciz1 is also alternatively spliced, with variability in the same exons as mouse Ciz1. It has been found that recombinant embryonic form Ciz1 promotes initiation of mammalian DNA replication and that pediatric cancers express 'embryonic-like' forms of Ciz1. Without wishing to be held to one theory, the inventors propose that Ciz1 mis-splicing produces embryonic-like forms of Ciz1 at inappropriate times in development. This promotes inappropriately regulated DNA replication and contributes to formation or progression of cancer cell lineages.
[0008]A number of techniques have been developed in recent years which purport to specifically ablate genes and/or gene products. For example, the use of anti-sense nucleic acid molecules to bind to and thereby block or inactivate target mRNA molecules is an effective means to inhibit the production of gene products.
[0009]A much more recent technique to specifically ablate gene function is through the introduction of double stranded RNA, also referred to as inhibitory RNA (RNAi), into a cell which results in the destruction of mRNA complementary to the sequence included in the RNAi molecule. The RNAi molecule comprises two complementary strands of RNA (a sense strand and an antisense strand) annealed to each other to form a double stranded RNA molecule. The RNAi molecule is typically derived from the exonic or coding sequence of the gene which is to be ablated.
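The relationship between the two annealed strands described above can be illustrated with a minimal Python sketch that derives an antisense RNA strand from a sense strand by reverse complementation. The function name `antisense_rna` and the example sequence are hypothetical and are not taken from any Ciz1 sequence disclosed herein; the sketch covers only the base-pairing rule and not the design of an effective RNAi molecule.

```python
# Hypothetical helper: derive the antisense strand of a double-stranded RNA
# from its sense strand by reverse complementation (A-U and G-C pairing).
_RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(sense: str) -> str:
    """Return the antisense strand, written 5'->3', for an RNA sense strand."""
    sense = sense.upper()
    unknown = set(sense) - set(_RNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"non-RNA characters in sequence: {sorted(unknown)}")
    # Read the sense strand backwards and pair each base with its complement.
    return "".join(_RNA_COMPLEMENT[base] for base in reversed(sense))


# Illustrative 8-nucleotide sequence only; not a Ciz1 sequence.
example_sense = "AUGGCUAC"
example_antisense = antisense_rna(example_sense)  # "GUAGCCAU"
```

Applying the function twice returns the original sense strand, which is a quick check that the two strands are mutually complementary.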
[0010]Nucleic acids and proteins have both a linear sequence structure, as defined by their base or amino acid sequence, and also a three dimensional structure which in part is determined by the linear sequence and also the environment in which these molecules are located. Conventional therapeutic molecules are small molecules, for example, peptides, polypeptides, or antibodies, which bind target molecules to produce an agonistic or antagonistic effect. It has become apparent that nucleic acid molecules also have potential with respect to providing agents with the requisite binding properties which may have therapeutic utility. These nucleic acid molecules are typically referred to as aptamers. Aptamers are small, usually stabilized, nucleic acid molecules which comprise a binding domain for a target molecule.
[0011]Aptamers may comprise at least one modified nucleotide base. The term "modified nucleotide base" encompasses nucleotides with a covalently modified base and/or sugar. For example, modified nucleotides include nucleotides having sugars which are covalently attached to low molecular weight organic groups other than a hydroxyl group at the 3' position and other than a phosphate group at the 5' position. Thus modified nucleotides may also include 2' substituted sugars such as 2'-O-methyl-; 2'-O-alkyl; 2'-O-allyl; 2'-S-alkyl; 2'-S-allyl; 2'-fluoro-; 2'-halo or 2'-azido-ribose; carbocyclic sugar analogues; α-anomeric sugars; epimeric sugars such as arabinose, xyloses or lyxoses; pyranose sugars; furanose sugars; and sedoheptulose.
[0012]Modified nucleotides are known in the art and include, by example and not by way of limitation, alkylated purines and/or pyrimidines; acylated purines and/or pyrimidines; or other heterocycles. These classes of pyrimidines and purines are known in the art and include pseudoisocytosine; N4,N4-ethanocytosine; 8-hydroxy-N6-methyladenine; 4-acetylcytosine; 5-(carboxyhydroxylmethyl)uracil; 5-fluorouracil; 5-bromouracil; 5-carboxymethylaminomethyl-2-thiouracil; 5-carboxymethylaminomethyl uracil; dihydrouracil; inosine; N6-isopentyl-adenine; 1-methyladenine; 1-methylpseudouracil; 1-methylguanine; 2,2-dimethylguanine; 2-methyladenine; 2-methylguanine; 3-methylcytosine; 5-methylcytosine; N6-methyladenine; 7-methylguanine; 5-methylaminomethyl uracil; 5-methoxy amino methyl-2-thiouracil; β-D-mannosylqueosine; 5-methoxycarbonylmethyluracil; 5-methoxyuracil; 2-methylthio-N6-isopentenyladenine; uracil-5-oxyacetic acid methyl ester; pseudouracil; 2-thiocytosine; 5-methyl-2-thiouracil; 2-thiouracil; 4-thiouracil; 5-methyluracil; N-uracil-5-oxyacetic acid methylester; uracil 5-oxyacetic acid; queosine; 2-thiocytosine; 5-propyluracil; 5-propylcytosine; 5-ethyluracil; 5-ethylcytosine; 5-butyluracil; 5-pentyluracil; 5-pentylcytosine; 2,6-diaminopurine; methylpseudouracil; 1-methylguanine; and 1-methylcytosine.
[0013]Aptamers may be synthesized using conventional phosphodiester linked nucleotides using standard solid or solution phase synthesis techniques which are known in the art. Linkages between nucleotides may use alternative linking molecules. For example, linking groups of the formula P(O)S, (thioate); P(S)S, (dithioate); P(O)NR'2; P(O)R'; P(O)OR6; CO; or CONR'2 wherein R is H (or a salt) or alkyl (1-12C) and R6 is alkyl (1-9C) is joined to adjacent nucleotides through --O-- or --S--.
[0014]Other techniques which purport to specifically ablate genes and/or gene products focus on modulating the function or interfering with the activity of protein molecules. Proteins can be targeted by chemical inhibitors drawn, for example, from existing small molecule libraries.
[0015]Antibodies, preferably monoclonal, can be raised for example in mice or rats against different protein isoforms. Antibodies, also known as immunoglobulins, are protein molecules which have specificity for foreign molecules (antigens). Immunoglobulins (Ig) are a class of structurally related proteins consisting of two pairs of polypeptide chains, one pair of light (L) (low molecular weight) chain (κ or λ), and one pair of heavy (H) chains (γ, α, μ, δ and ε), all four linked together by disulphide bonds. Both H and L chains have regions that contribute to the binding of antigen and that are highly variable from one Ig molecule to another. In addition, H and L chains contain regions that are non-variable or constant.
[0016]The L chains consist of two domains. The carboxy-terminal domain is essentially identical among L chains of a given type and is referred to as the "constant" (C) region. The amino terminal domain varies from one L chain to another and contributes to the binding site of the antibody. Because of its variability, it is referred to as the "variable" (V) region.
[0017]The H chains of Ig molecules are of several classes, α, μ, δ, ε and γ (of which there are several sub-classes). An assembled Ig molecule, consisting of one or more units of two identical H and L chains, derives its name from the H chain that it possesses. Thus, there are five Ig isotypes: IgA, IgM, IgD, IgE and IgG (with four sub-classes based on the differences in the H chains, i.e., IgG1, IgG2, IgG3 and IgG4). Further detail regarding antibody structure and their various functions can be found in Using Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press.
[0018]Chimeric antibodies are recombinant antibodies in which all of the V-regions of a mouse or rat antibody are combined with human antibody C-regions. Humanized antibodies are recombinant hybrid antibodies which fuse the complementarity determining regions from a rodent antibody V-region with the framework regions from the human antibody V-regions. The C-regions from the human antibody are also used. The complementarity determining regions (CDRs) are the regions within the N-terminal domain of both the heavy and light chain of the antibody to which the majority of the variation of the V-region is restricted. These regions form loops at the surface of the antibody molecule. These loops provide the binding surface between the antibody and antigen.
[0019]Antibodies from non-human animals provoke an immune response to the foreign antibody and its removal from the circulation. Both chimeric and humanized antibodies have reduced antigenicity when injected into a human subject because there is a reduced amount of rodent (i.e. foreign) antibody within the recombinant hybrid antibody, while the human antibody regions do not elicit an immune response. This results in a weaker immune response and a decrease in the clearance of the antibody. This is clearly desirable when using therapeutic antibodies in the treatment of human diseases. Humanized antibodies are designed to have fewer "foreign" antibody regions and are therefore thought to be less immunogenic than chimeric antibodies.
[0020]Other techniques for targeting at the protein level include the use of randomly generated peptides that specifically bind to proteins, and any other molecules which bind to proteins or protein variants and modify the function thereof.
[0021]Understanding the DNA replication process is of prime concern in the field of cancer therapy. It is known that cancer cells can become resistant to chemotherapeutic agents and can evade detection by the immune system. There is an ongoing need to identify targets for cancer therapy so that new agents can be identified. The DNA replication process represents a prime target for drug intervention in cancer therapy. There is a need to identify gene products which modulate DNA replication and which contribute to formation or progression of cancer cell lineages, and to develop agents that affect their function.
SUMMARY OF THE INVENTION
[0022]According to one aspect of the present invention there is provided the use of a Ciz1 nucleotide or polypeptide sequence, or any fragment or variant thereof, as a target for the identification of agents which modulate DNA replication.
[0023]As used herein the term 'fragment' or 'variant' is used to refer to any nucleic or amino acid sequence which is derived from the full length nucleotide or amino acid sequence of Ciz1 or derived from a splice variant thereof. In one embodiment of the invention the fragment is of sufficient length and/or of sufficient homology to full length Ciz1 to retain the DNA replication activity of Ciz1. In an alternative embodiment inactive Ciz1 fragments are used. The term 'fragment' or 'variant' also relates to the Ciz1 RNA transcripts described herein and protein isoforms (or parts thereof).
[0024]As used herein the term 'modulate' is used to refer to either increasing or decreasing DNA replication, above and below the levels which would normally be observed in the absence of the specific agent (i.e., any alterations in DNA replication activity which are either directly or indirectly linked to the use of the agent). The term 'modulate' also includes reference to a change of spatial or temporal organization of DNA replication.
[0025]According to an alternative aspect of the invention there is provided a screening method for the identification of agents which modulate DNA replication wherein the screening method comprises the use of Ciz1 nucleotide or polypeptide sequence or fragments or variants thereof.
[0026]Preferably the screening method comprises detecting or measuring the effect of an agent on a nucleic acid molecule selected from the group consisting of:
[0027]a) a nucleic acid molecule comprising a nucleic acid sequence represented in any of FIG. 14, 15, or 21 (SEQ ID NO: 45, 46, 66, 67, 68, 69, 70, 71, 72 or 73);
[0028]b) a nucleic acid molecule which hybridizes to the nucleic acid sequence in (a) and which has Ciz1 activity or activity of a variant thereof;
[0029]c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate because of the genetic code to the sequences in a) and b); and
[0030]d) a nucleic acid molecule derived from the genomic sequence at the Ciz1 locus or a nucleic acid molecule that hybridizes to the genomic sequence.
[0031]In one embodiment of the invention, the nucleic acid molecule is modified by deletion, substitution or addition of at least one nucleic acid residue of the nucleic acid sequence.
[0032]Alternatively the screening method comprises the steps of:
[0033](i) forming a preparation comprising a polypeptide molecule, or an active fragment thereof, encoded by a nucleic acid molecule selected from the group consisting of:
[0034]a) a nucleic acid molecule comprising a nucleic acid sequence represented in FIG. 14, 15 or 21 (SEQ ID NO: 45, 46, 66, 67, 68, 69, 70, 71, 72 or 73);
[0035]b) a nucleic acid molecule which hybridizes to the nucleic acid sequence in (a) and which has Ciz1 activity or activity of a variant thereof;
[0036]c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate because of the genetic code to the sequences in a) and b) and a candidate agent to be tested;
[0037]d) a nucleic acid molecule derived from the genomic sequence at the Ciz1 locus or a nucleic acid molecule that hybridizes to the genomic sequence; and
[0038](ii) detecting or measuring the effect of the agent on the activity of said polypeptide.
[0039]Assays for the detection of DNA replication are known in the art. Activity residing in Ciz1, or derived peptide fragments, and the effect of potential therapeutic agents on that activity would be assayed in vitro or in vivo.
[0040]In vitro assays for Ciz1 protein activity would comprise synchronized isolated G1 phase nuclei and either S phase extract or G1 phase extract supplemented with cyclin-dependent kinases. Inclusion of Ciz1 or derived peptide fragments stimulates initiation of DNA replication in these circumstances and can be monitored visually (by scoring nuclei that have incorporated fluorescent nucleotides during in vitro reactions) or by measuring incorporation of radioactive nucleotides. The assay for therapeutic reagents that interfere with Ciz1 protein function would involve looking for inhibition of DNA replication in these assays. The effect of agents on Ciz1 nuclear localization, chromatin binding, stability, modification and protein-protein interactions could also be monitored in these assays.
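By way of illustration only, the visual scoring described above can be reduced to simple arithmetic. The Python sketch below computes the percentage of nuclei scored as having incorporated label and, as an assumed summary statistic not defined in this specification, the percentage inhibition of a treated reaction relative to an untreated control. The function names and all counts are hypothetical.

```python
def percent_replicating(labelled_nuclei: int, total_nuclei: int) -> float:
    """Percentage of scored nuclei that incorporated labelled nucleotides."""
    if total_nuclei <= 0 or not 0 <= labelled_nuclei <= total_nuclei:
        raise ValueError("counts must satisfy 0 <= labelled <= total and total > 0")
    return 100.0 * labelled_nuclei / total_nuclei


def percent_inhibition(control_pct: float, treated_pct: float) -> float:
    """Assumed metric: reduction in replicating nuclei relative to the control."""
    if control_pct <= 0:
        raise ValueError("control must show some replication")
    return 100.0 * (control_pct - treated_pct) / control_pct


# Hypothetical counts: 48 of 120 nuclei labelled without agent, 12 of 120 with agent.
control = percent_replicating(48, 120)   # 40.0
treated = percent_replicating(12, 120)   # 10.0
inhibition = percent_inhibition(control, treated)  # 75.0
```

The same two functions apply unchanged if incorporation is measured as radioactive counts normalised to a percentage of the control reaction.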
[0041]In vivo assays will include creation of cell and mouse models that over-express or under-express Ciz1, or derived fragments, resulting in altered cell proliferation. The preparation of transgenic animals is generally known in the art and within the ambit of the skilled person. The assay for therapeutic reagents would involve analysis of cell-cycle time, initiation of DNA replication and cancer incidence in the presence and absence of drugs that either impinge on Ciz1 protein activity, or interfere with Ciz1 production by targeting Ciz1 and its variants at the RNA level.
[0042]In a preferred method of the invention said hybridization conditions are stringent.
[0043]Stringent hybridization/washing conditions are well known in the art. An example is conditions under which nucleic acid hybrids are stable after washing in 0.1×SSC, 0.1% SDS at 60° C. It is well known in the art that optimal hybridization conditions can be calculated if the sequence of the nucleic acid is known. Typically, hybridization conditions use 4-6×SSPE (20×SSPE contains 175.3 g NaCl, 88.2 g NaH2PO4·H2O and 7.4 g EDTA dissolved to 1 litre and the pH adjusted to 7.4); 5-10× Denhardt's solution (50× Denhardt's solution contains 5 g Ficoll (Type 400, Pharmacia), 5 g polyvinylpyrrolidone and 5 g bovine serum albumin); 100 μg-1.0 mg/ml sonicated salmon/herring DNA; 0.1-1.0% sodium dodecyl sulphate; and optionally 40-60% deionised formamide. The hybridization temperature will vary depending on the GC content of the nucleic acid target sequence but will typically be between 42° C. and 65° C.
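The working concentrations above are dilutions of the stated 20×SSPE and 50× Denhardt's stocks, and the choice of temperature depends on GC content. A minimal Python sketch of both calculations follows. It assumes the standard dilution relation C1·V1 = C2·V2; the function names, the 100 ml final volume and the example sequence are hypothetical, and the sketch does not attempt to convert GC content into a hybridization temperature.

```python
def stock_volume_ml(stock_x: float, final_x: float, final_volume_ml: float) -> float:
    """Volume of concentrated stock needed, from C1*V1 = C2*V2."""
    if stock_x <= 0 or final_x < 0 or final_volume_ml < 0:
        raise ValueError("concentrations and volumes must be non-negative")
    if final_x > stock_x:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_x * final_volume_ml / stock_x


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleic acid sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical 100 ml hybridization mix at 5x SSPE and 5x Denhardt's solution.
sspe_ml = stock_volume_ml(20, 5, 100)      # 25.0 ml of 20x SSPE
denhardts_ml = stock_volume_ml(50, 5, 100) # 10.0 ml of 50x Denhardt's
example_gc = gc_fraction("GGCCAATT")       # 0.5; illustrative sequence only
```

A target with a higher GC fraction would ordinarily be hybridized towards the upper end of the stated temperature range.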
[0044]In a preferred method of the invention said polypeptide is modified by deletion, substitution or addition of at least one amino acid residue of the polypeptide sequence.
[0045]A modified polypeptide or variant (i.e. a fragment polypeptide) and a reference polypeptide may differ in amino acid sequence by one or more substitutions, additions, deletions or truncations, which may be present in any combination. Among preferred variants are those that vary from a reference polypeptide by conservative amino acid substitutions. Such substitutions are those that substitute a given amino acid by another amino acid of like characteristics. The following non-limiting list of amino acids are considered conservative replacements (similar): a) alanine, serine, and threonine; b) glutamic acid and aspartic acid; c) asparagine and glutamine; d) arginine and lysine; e) isoleucine, leucine, methionine and valine; and f) phenylalanine, tyrosine and tryptophan. Preferred are variants which retain the same biological function and activity as the reference polypeptide from which they vary. Alternatively, variants include those with an altered biological function, for example variants which act as antagonists, so called "dominant negative" variants.
[0046]Alternatively or in addition, non-conservative substitutions may give the desired biological activity; see Cain S A, Williams D M, Harris V, Monk P N. Selection of novel ligands from a whole-molecule randomly mutated C5a library. Protein Eng. 2001 March; 14(3):189-93, which is incorporated by reference.
[0047]A functionally equivalent polypeptide sequence according to the invention is a variant wherein one or more amino acid residues are substituted with conserved or non-conserved amino acid residues, or one in which one or more amino acid residues includes a substituent group. Conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among aromatic residues Phe and Tyr.
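The conservative replacement groups listed in the preceding paragraph lend themselves to a simple lookup. The following Python sketch, offered as an illustration under stated assumptions, encodes those six groups using one-letter amino acid codes and reports whether a given substitution falls within a single group. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and treating an unchanged residue as "not a substitution" is a choice made for this sketch.

```python
# One-letter codes for the six groups named in the preceding paragraph.
CONSERVATIVE_GROUPS = (
    frozenset("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    frozenset("ST"),    # hydroxyl: Ser, Thr
    frozenset("DE"),    # acidic: Asp, Glu
    frozenset("NQ"),    # amide: Asn, Gln
    frozenset("KR"),    # basic: Lys, Arg
    frozenset("FY"),    # aromatic: Phe, Tyr
)


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and belong to the same listed group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # no substitution has taken place
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


leu_to_ile = is_conservative("L", "I")  # True: both aliphatic
lys_to_glu = is_conservative("K", "E")  # False: basic to acidic
```

Residues outside the listed groups (for example glycine or proline) are reported as non-conservative by this sketch simply because the paragraph does not assign them to a group.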
[0048]In addition, the invention features nucleotide or polypeptide sequences having at least 50% identity with the nucleotide or polypeptide sequences as herein disclosed, or fragments and functionally equivalent polypeptides thereof. In one embodiment, the nucleotide or polypeptide sequences have at least 75% to 85% identity, more preferably at least 90% identity, even more preferably at least 95% identity, still more preferably at least 97% identity, and most preferably at least 99% identity with the nucleotide and amino acid sequences illustrated herein.
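As a hedged illustration of the identity thresholds recited above, the Python sketch below computes percent identity between two sequences that have already been aligned to equal length and reports the highest recited threshold that is met. It assumes identity is counted over the full alignment length with gap positions ('-') scored as mismatches, which is only one of several conventions in use; the function names and example sequences are hypothetical.

```python
# Thresholds recited in the preceding paragraph, in ascending order.
IDENTITY_THRESHOLDS = (50, 75, 90, 95, 97, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the alignment length; gaps count as mismatches."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be pre-aligned, equal length and non-empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(identity_pct: float):
    """Largest recited threshold not exceeding the identity, or None."""
    met = [t for t in IDENTITY_THRESHOLDS if identity_pct >= t]
    return max(met) if met else None


# Illustrative peptides differing at one of eight positions; not Ciz1 sequences.
example_identity = percent_identity("MKTAYIAK", "MKTAYLAK")  # 87.5
example_threshold = highest_threshold_met(example_identity)  # 75
```

Producing the alignment itself is outside the scope of this sketch and would ordinarily be done with a dedicated alignment tool.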
[0049]In a preferred method of the invention said nucleic acid molecule comprises the nucleic acid sequence encoding the amino acid sequence Ciz1 in FIG. 16 (SEQ ID NO: 26) or FIG. 17 (SEQ ID NO: 47) or any variants thereof, including those described in FIG. 20A (SEQ ID NO: 58-61) and 20B (SEQ ID NO: 62-65). In a further preferred method of the invention said nucleic acid molecule consists of the nucleic acid sequence which encodes the amino acid sequence Ciz1 in FIG. 16 (SEQ ID NO: 26) or FIG. 17 (SEQ ID NO: 47) or variants thereof, including those described in FIG. 20A (SEQ ID NO: 58-61) and 20B (SEQ ID NO: 62-65).
[0050]In a further preferred method of the invention said polypeptide molecule comprises the amino acid sequence Ciz1 in FIG. 16 (SEQ ID NO: 26) or 17 (SEQ ID NO: 47) or variants thereof, including those described in FIG. 20A (SEQ ID NO: 58-61) and 20B (SEQ ID NO: 62-65). In a further preferred method of the invention said polypeptide molecule consists of the amino acid sequence Ciz1 in FIG. 16 (SEQ ID NO: 26) or 17 (SEQ ID NO: 47) or variants thereof, including those described in FIG. 20A (SEQ ID NO: 58-61) and 20B (SEQ ID NO: 62-65).
[0051]In a further preferred method of the invention said polypeptide is expressed by a cell, preferably a mammalian cell, or animal and said screening method is a cell-based screening method.
[0052]Preferably said cell naturally expresses the Ciz1 polypeptide. Alternatively said cell is transfected with a nucleic acid molecule encoding a Ciz1 polypeptide (or a variant molecule thereof, found, for example in cancer cell lineages).
[0053]According to a further aspect of the invention there is provided an agent obtainable by the method according to the invention. Preferably said agent is an antagonist of Ciz1 mediated DNA replication. Alternatively said agent is an agonist of Ciz1 mediated DNA replication.
[0054]In a further preferred method of the invention said agent is selected from the group consisting of: polypeptide; peptide; aptamer; chemical; antibody; nucleic acid; or polypeptide or nucleotide probe.
[0055]Preferably the agent comprises a sequence that is complementary or of sufficient homology to give specific binding to the target and can be used to detect the level of nucleic acid or protein for diagnostic purposes.
[0056]Alternatively the agent identified by the method of the invention is a therapeutic agent and can be used for the treatment of disease.
[0057]In one embodiment of the invention the agent is an antibody molecule and binds to any of the sequences represented by FIG. 16 (SEQ ID NO: 26), 17 (SEQ ID NO: 47) or 20 (SEQ ID NO: 58-65).
[0058]Preferably said antibody is a monoclonal antibody.
[0059]Alternatively said agent is an anti-sense nucleic acid molecule which binds to and thereby blocks or inactivates the mRNA encoded by any of the nucleic acid sequences described above.
[0060]In an alternative embodiment, said agent is an RNAi molecule and comprises two complementary strands of RNA (a sense strand and an antisense strand) annealed to each other to form a double stranded RNA molecule. Preferably the RNAi molecule is derived from the exonic sequence of the Ciz1 gene or from another over-lapping gene.
[0061]In one embodiment unspliced mRNA is targeted with RNAi to inhibit production of the spliced variant. In another the spliced variant mRNA is ablated without affecting the non-variant mRNA.
[0062]In a preferred method of the invention said peptide is an oligopeptide. Preferably, said oligopeptide is at least 10 amino acids long. Preferably said oligopeptide is at least 20, 30, 40, 50 amino acids in length.
[0063]In a further preferred method of the invention said peptide is a modified peptide.
[0064]It will be apparent to one skilled in the art that modified amino acids include, by way of example and not by way of limitation, 4-hydroxyproline, 5-hydroxylysine, N6-acetyllysine, N6-methyllysine, N6,N6-dimethyllysine, N6,N6,N6-trimethyllysine, cyclohexylalanine, D-amino acids and ornithine. Other modifications include amino acids with a C2, C3 or C4 alkyl R group optionally substituted by 1, 2 or 3 substituents selected from halo (e.g. F, Br, I), hydroxy or C1-C4 alkoxy.
[0065]Alternatively said peptide is modified by acetylation and/or amidation.
[0066]In a preferred method of the invention the polypeptides or peptides are modified by cyclisation. Cyclisation is known in the art, (see Scott et al Chem Biol (2001), 8:801-815; Gellerman et al J. Peptide Res (2001), 57: 277-291; Dutta et al J. Peptide Res (2000), 8: 398-412; Ngoka and Gross J. Amer Soc Mass Spec (1999), 10:360-363).
[0067]According to a further aspect of the invention there is provided a vector as a delivery means for, for example, an antisense or an RNAi molecule which inhibits Ciz1 or variants thereof and thereby allows cells expressing the protein to be targeted.
[0068]In one embodiment of the invention a viral vector is used as delivery means.
[0069]Preferably the vector includes an expression cassette comprising the nucleotide sequence selected from the group consisting of; [0070]a) the nucleic acid sequence which encodes Ciz1 amino acid sequence as shown in FIGS. 14, 15 and 21 (SEQ ID NO: 45, 46, 66, 67, 68, 69, 70, 71, 72 or 73); [0071]b) a nucleic acid molecule which hybridizes to the nucleic acid sequence of (a); [0072]c) a nucleic acid molecule which has a nucleic acid sequence which is degenerate because of the genetic code to the sequences in a) and b) and any sequence which is complimentary to any of the above sequences; [0073]d) a nucleic acid sequence that encodes Ciz1 pre-mRNA (i.e., the genomic sequence), wherein the expression cassette is transcriptionally linked to a promoter sequence.
[0074]Preferably the vector including the expression cassette is adapted for eukaryotic gene expression. Typically said adaptation includes, by example and not by way of limitation, the provision of transcription control sequences (promoter sequences) which mediate cell/tissue specific expression. These promoter sequences may be cell/tissue specific, inducible or constitutive.
[0075]Promoter elements typically also include so called TATA box and RNA polymerase initiation selection sequences which function to select a site of transcription initiation. These sequences also bind polypeptides which function, inter alia, to facilitate transcription initiation selection by RNA polymerase.
[0076]Adaptations also include the provision of selectable markers and autonomous replication sequences which both facilitate the maintenance of said vector in either the eukaryotic cell or prokaryotic host. Vectors which are maintained autonomously are referred to as episomal vectors. Further adaptations which facilitate the expression of vector encoded genes include the provision of transcription termination sequences.
[0077]These adaptations are well known in the art. There is a significant amount of published literature with respect to expression vector construction and recombinant DNA techniques in general. Please see, Sambrook et al (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. and references therein; Marston, F (1987) DNA Cloning Techniques: A Practical Approach Vol III IRL Press, Oxford UK; DNA Cloning: F M Ausubel et al, Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (1994).
[0078]According to the present invention there is provided a diagnostic method for the identification of proliferative disorders comprising detecting the presence or expression of the Ciz1 gene, Ciz1 splice variants and mutations in the genomic or protein sequence thereof.
[0079]Preferably said diagnostic method comprises one or more of the following steps:
[0080](i) contacting a sample isolated from a subject to be tested with an agent which specifically binds a polypeptide with Ciz1 activity or a nucleic acid molecule encoding a polypeptide with Ciz1 activity;
[0081](ii) detecting or measuring the binding of the agent on said polypeptide or nucleic acid in said sample;
[0082](iii) use of reverse-transcribed PCR or real-time PCR to monitor Ciz1 isoform expression and to measure expression levels; and
[0083](iv) measuring the presence of nucleic acid or amino-acid mutations based on altered conformational properties of the molecule.
[0084]In one embodiment, the diagnostic method of the present invention is carried out in-vivo. In an alternative embodiment, the diagnostic method of the present invention is carried out ex-vivo or in-vitro.
[0085]Preferably the diagnostic method provides for a quantitative measure of Ciz1 RNA or protein variants in a sample.
[0086]In one embodiment of the invention there is provided the use of an agent which modulates Ciz1 RNA or protein, or variants thereof, as a pharmaceutical.
[0087]Preferably said pharmaceutical comprises an agent identified by the screening method of the present invention in combination or association with a pharmaceutically acceptable carrier, excipient or diluent.
[0088]Preferably said pharmaceutical is for oral or topical administration or for administration by injection. In an alternative embodiment of the invention the pharmaceutical is administered as an aerosol.
[0089]In a further preferred embodiment of the invention there is provided the use of an agent according to the invention for the manufacture of a medicament for use in the treatment of proliferative disease. Preferably said proliferative disease is cancer.
[0090]Preferably said cancer is a pediatric cancer and is selected from the group consisting of; retinoblastoma, neuroblastoma, Burkitt lymphoma, medulloblastoma, and Ewings Sarcoma family tumors (ESFTs).
[0091]In an alternative embodiment the cancer is a carcinoma, adenocarcinoma, lymphoma or leukemia.
[0092]In an alternate embodiment the disease is liver, lung or skin cancer or metastasis.
[0093]According to a further aspect of the invention there is provided a method to treat a proliferative disease comprising administering to an animal, preferably a human, an agent obtainable by the method according to the invention.
[0094]According to an alternate aspect of the invention, there is provided the use of an agent according to the invention for the manufacture of a medicament to slow cell division or growth.
[0095]The invention also includes the use of the Ciz1 amino acid sequence and protein structure in rational drug design and the use of Ciz1 nucleotide and amino acid sequences thereof or variants thereof for screening chemical libraries for agents that specifically bind to Ciz1.
[0096]The invention also includes a kit comprising a diagnostic, prognostic or therapeutic agent identified by the method of the invention.
[0097]In an alternative embodiment of the invention, an array based sequencing chip is used for the detection of altered Ciz1.
BRIEF DESCRIPTION OF THE FIGURES
[0098]An embodiment of the invention is described below by example only and with reference to the following figures:
[0099]FIG. 1 illustrates the effect of cyclin A-cdk2 on late G1 nuclei. A) Anti-Cdc6 antibody V1 detects mouse Cdc6 and a second antigen in western blots of 3T3 whole cell extract, which migrates with approximate Mr of 100 kDa (based on the mobility of the Mcm3 protein this was previously estimated at nearer 85 kDa so the antigen was named p85--we have kept the same name here for clarity). p85 is present in both the soluble fraction and insoluble nuclear fraction (prepared under in vitro replication conditions). B) Initiation of DNA synthesis in `replication competent` late G1 phase nuclei by G1 phase extract supplemented with recombinant cyclin A-cdk2. Control bar shows the proportion of nuclei already in S phase (unshaded), and those that initiated replication in extract from S phase cells (shaded). C) After 15 minutes under cell-free replication conditions nuclei were washed and the chromatin fraction was re-isolated, separated by SDS-PAGE and blotted for Mcm2 and Mcm3. D) The same nuclei blotted with antibody V1. p85 antigen is more abundant in nuclei exposed to initiation-inducing concentrations of cyclin A-cdk2. Antibody V1 was used to clone the gene for p85 from a mouse embryo expression library, and the gene was identified as Ciz1.
[0100]FIG. 2 shows an alignment of mouse Ciz1 variants. The predicted full-length Ciz1 amino-acid sequence (`Full`; SEQ ID NO: 26) is identical to a mouse mammary tumor cDNA clone (BC018483), while embryonic Ciz1 (`ECiz1`, AJ575057; SEQ ID NO: 27), and a melanoma-derived clone (AK089986; SEQ ID NO: 28) lack two discrete internal sequences. In addition, the first available methionine in ECiz1 is in the middle of exon 3 (Met84), which excludes a polyglutamine rich region from the N-terminus. Melanoma derived AK089986 may be incomplete as it ends 77 codons before the C-terminus of all other mouse and human clones. Stars indicate amino-acids changed by site-directed mutagenesis in the constructs shown in D. Amino-acids that correspond to codons targeted by siRNAs are underlined. B) Mouse Ciz1 is encoded by at least 17 exons. Coding exons are shown in grey, alternatively spliced regions are black, untranslated regions are white. Two alternative exon 1 sequences are included in some Ciz1 transcripts (not shown) but an alternative translational start site upstream of the two depicted here has not yet been found. C) Sequence features and putative domains in ECiz1. Predicted nuclear localization sequence (NLS), putative cyclin-dependent kinase phosphorylation sites, C2H2 type zinc-fingers and a C terminal domain with homology to the nuclear matrix protein matrin 3 (Nakayasu and Berezney, 1991) are shown. The positions of sequences absent from ECiz1 are indicated by triangles. D) ECiz1 and derived truncations and point mutants used in cell-free DNA replication experiments. Numbers in parentheses relate to amino-acid positions in the full-length form of mouse Ciz1, shown in A. Stars indicate putative phosphorylation sites ablated by site-directed mutagenesis.
[0101]FIG. 3 shows the effect of Ciz1 protein and derived fragments in cell-free DNA replication experiments and illustrates that ECiz1 promotes initiation of mammalian DNA replication A) Recombinant ECiz1 stimulates initiation of DNA replication in `replication competent` late G1 phase nuclei, during incubation in S phase extract. Histogram shows the average number of nuclei that incorporated biotinylated nucleotides in vitro (black), in the presence or absence of ectopic ECiz1, with standard deviations calculated from four independent experiments. The 17% of nuclei that were already in S phase when the nuclear preparation was made are shown in white. Images show nuclei replicating in vitro, with or without 1 nM ECiz1. Total nuclei are counterstained with propidium iodide (red). B) The response to recombinant ECiz1 is concentration dependent with a sharp optimum in the nM range. In this experiment, and all those shown in B-I, results are expressed as % initiation rather than % replication. This is calculated from the number of nuclei that initiate in vitro and the number of nuclei that are `competent` to initiate in vitro (see methods). C) Threonines 191/2 are involved in regulating Ciz1 DNA replication activity as ECiz1 cdk site mutant T(191/2)A escapes suppression at high concentrations. D) Cdk site mutant T(293)A stimulates initiation with a similar profile to ECiz1 but at lower concentrations. E) Truncated ECiz1 (Nterm 442) lacks C-terminal sequences, but stimulates in vitro initiation to a similar extent as ECiz1. F) Cterm 274 retains no DNA replication activity in this assay. G, H, I) Further deletion analysis in the N-terminal two thirds of the ECiz1 protein show that a short region 3' of exon 8 is required for Ciz1 function when assayed in vitro.
[0102]FIG. 4 Characterization of anti-Ciz1 polyclonal antibodies and identification of 125 kDa Ciz1-related bands A) Coomassie stained SDS-polyacrylamide gel showing purified recombinant ECiz1 fragment Nterm442, and western blots of recombinant Nterm442 using anti-Cdc6 antibody V1, and anti-Ciz1 antibodies 1793 and 1794. B) Western blot of 3T3 whole cell extract. Of the two bands detected by anti-Ciz1 antibody 1793 one has the same mobility as p85-Ciz1 (100 kDa) recognized by antibody V1 and the other has an apparent Mr of 125 kDa. Anti-Ciz1 antibody 1794 recognizes only the 125 kDa form of Ciz1 (and a second antigen of around 80 kDa). C) Immuno-precipitation from 3T3 nuclear extract, using antibody V1 or anti-Ciz1 1793. Both antibodies precipitate p85, which is recognized by the reciprocal antibody in western blots. P125 is precipitated by antibody 1793, and to a lesser extent by antibody V1 and these are recognized by 1793 in western blots. Mcm3 is shown as a control.
[0103]FIG. 5 Immunofluorescence analysis of endogenous Ciz1. Ciz1 resides in sub-nuclear foci that overlap with sites of DNA replication A) Endogenous Ciz1 (red) in 3T3 cells fixed before (untreated) or after (detergent treated) exposure to TritonX100, detected with anti-Ciz1 antibody 1793. Nuclei are counterstained with Hoechst 33258 (blue). Cdc6 (green), detected with a Cdc6-specific monoclonal antibody is shown for comparison. B) Inclusion of recombinant Ciz1 blocks reactivity of antibody 1793 with detergent treated nuclei. C) Detergent-resistant Ciz1 (red) is present in all nuclei in cycling populations, while detergent resistant PCNA (green) persists only in S phase nuclei. D) High magnification confocal sections of detergent resistant Ciz1 and PCNA, and merged image showing co-localizing foci (yellow). E) Line plots of red and green fluorescence across the merged image in D, at the positions indicated (i and ii). F) Cross-correlation plot (Rubbi and Milner, 2000; van Steensel et al., 1996) for green foci compared to red over the whole merged image in D, and (inset) for the marked section after thresholding fluorescence at the levels shown in Eii. The red line in the inset to F shows loss of correlation when the Ciz1 image is rotated 90° with respect to PCNA. Bar is 10 μm.
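By way of illustration only, the kind of cross-correlation analysis referred to in F may be sketched as follows. The sketch assumes two one-dimensional fluorescence intensity profiles (the values are hypothetical and are not data from the figures) and computes Pearson's correlation coefficient while one profile is shifted relative to the other; co-localizing foci are expected to give a peak at zero shift. The function names and the circular (wrap-around) shifting are assumptions made for this example and are not taken from the cited references.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length intensity profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("profiles must not be constant")
    return cov / sqrt(var_a * var_b)


def cross_correlation(red, green, max_shift):
    """Map each pixel shift to the correlation of red with circularly shifted green."""
    n = len(green)
    result = {}
    for shift in range(-max_shift, max_shift + 1):
        shifted = [green[(i - shift) % n] for i in range(n)]
        result[shift] = pearson(red, shifted)
    return result


# Hypothetical line profiles with two foci at matching positions.
red = [0, 1, 8, 1, 0, 0, 1, 9, 1, 0, 0, 0]
green = [0, 2, 7, 2, 0, 0, 1, 8, 2, 0, 0, 0]

ccf = cross_correlation(red, green, max_shift=3)
best_shift = max(ccf, key=ccf.get)  # shift with the highest correlation
```

In this sketch a maximum at zero shift indicates overlapping foci, whereas a flat profile would indicate unrelated distributions.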
[0104]FIG. 6 RNA interference. Ciz1 depletion inhibits S phase A) siRNAs that target Ciz1 transcripts at four sites (see FIG. 2A) were individually applied to cycling 3T3 cells as a single 3 nM dose and cell number was monitored at the indicated times. Images of cell populations at 16 and 40 hours after transfection with siRNA 8 (red outline) or mock treated cells (blue outline) are shown. B) Ciz1 protein detected with anti-Ciz1 1793 (green) 48 hours after exposure to Ciz1 siRNAs (4 and 8), or control GAPDH siRNA. C) Ciz1, GAPDH and β-actin transcript levels in cells exposed to Ciz1 siRNAs (4 and 8), or control GAPDH siRNA for 24 hours. Numbers in parentheses reflect band intensity in arbitrary units, and the overall reduction in Ciz1 and GAPDH transcripts (normalized against β-actin) is expressed as a percentage. D) The proportion of cells that incorporated BrdU into DNA (green) is significantly decreased in Ciz1 depleted cells, 48 hours after treatment with Ciz1 siRNA. Histogram shows average results from four independent experiments. E) The number of nuclei with detergent resistant Mcm3 (green) increases in populations treated with Ciz1 siRNA. F) The proportion of nuclei with detergent resistant PCNA (green) also increases under these conditions. All nuclei are counterstained and shown in pseudo-color (red).
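By way of illustration only, the normalization referred to in C may be sketched as follows. The sketch assumes that each transcript band intensity is divided by the β-actin band intensity from the same sample, and that the reduction is expressed as the percentage fall in this normalized value relative to the control; the exact calculation used for the figure is not stated here. All intensities are hypothetical arbitrary units, and the function names are illustrative.

```python
def normalized(target_intensity, reference_intensity):
    """Target band intensity relative to the reference (e.g. beta-actin) band."""
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return target_intensity / reference_intensity


def percent_reduction(control_target, control_ref, treated_target, treated_ref):
    """Percentage fall in the normalized target signal after treatment."""
    control = normalized(control_target, control_ref)
    treated = normalized(treated_target, treated_ref)
    if control == 0:
        raise ValueError("control signal must be non-zero")
    return 100.0 * (1.0 - treated / control)


# Hypothetical band intensities (arbitrary units), not values from the figure.
reduction = percent_reduction(
    control_target=200.0, control_ref=100.0,
    treated_target=45.0, treated_ref=90.0,
)
```

Normalizing against the reference band before comparing samples compensates for differences in loading between lanes.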
[0105]FIG. 7 RT-PCR analysis of Ciz1 exons 3/4 splice variant expression in mouse primordial germ cells and embryonic stem cells. Exons 3 and/or 4 are alternatively spliced in these cell types, but not in neonatal heart. These data are consistent with the hypothesis that full-length Ciz1 is the predominant form in neonatal somatic tissue, and that variants occur with more frequency earlier in development, and in germ line tissues.
[0106]FIG. 8 Transient transfection of mouse 3T3 cells. A. GFP-tagged Ciz1 constructs were transfected into NIH3T3 cells or B. microinjected into the male pro-nucleus of fertilized mouse eggs at the one cell stage. By 24 hours Ciz1 and ECiz1 became localized to the nucleus forming a subnuclear spotty pattern, while GFP alone was present in both the nucleus and the cytoplasm. C. High magnification images of live 3T3 cell nuclei 24 hours after transfection showing the subnuclear organization of EGFP tagged Ciz1 and ECiz1 and derived fragments with the C-terminal fragment (equivalent to Cterm274) removed. In the absence of C-terminal domains GFP-ECiz1 is diffusely localized in the nucleus 24 hours after transfection, while GFP-Ciz1 aggregates to form one or two large blobs within the nucleus. D. The C terminal 274 domain alone is cytoplasmic until after cells have passed through mitosis (most likely due to lack of nuclear localization sequences and passive entry to the nucleus), but once inside binds to nuclear structures and condenses with chromosomes. E. Representative images of GFP-Ciz1 (green), BrdU (red) and total nuclei (blue) in a population labelled with BrdU for the first 12 hours after transfection are shown. Histograms show the proportion of transfected (green) cells that incorporated BrdU compared to the number of untransfected (grey) cells for three separate labelling windows. During 0-22 hours after transfection rapidly cycling cells registered a consistent increase in the BrdU labelled fraction when transfected with either Ciz1 or ECiz1. Similar results were obtained with dense cultures in which most cells had exited the cell cycle and entered quiescence. However, when rapidly cycling cells were exposed to BrdU for a short (20 minute) pulse 22 hours after transfection the number of cells engaged in DNA synthesis was reduced in the Ciz1 and ECiz1 transfected populations, compared to untransfected controls and cells transfected with GFP alone. This indicates that by 22 hours DNA synthesis had ceased in Ciz1 expressing cells.
[0107]FIG. 9 Altered proliferation potential and cell morphology in transfected populations. Cell clusters arising in transfected 3T3 cell populations. A. Cells were transfected with the N-terminal two thirds of Ciz1 or ECiz1 (N-term442) tagged with GFP, and maintained under selection with 50 μg/ml G418. After three weeks under selection, cell aggregates were visible with GFP positive cells within.
[0108]FIG. 10 Human Ciz1 splice variants (SEQ ID NO: 29-36, respectively) in pediatric cancers. There are seven human Ciz1 cDNAs in public databases, but only one is derived from normal adult tissue (B cells) and it contains all predicted exons. The other six are derived from embryonic cells or pediatric cancers. Five of these are alternatively spliced with variability in exons 2, 3, 6, and 8 (like mouse ECiz1), and also in exon 4 (like mouse ES cells, primordial germ cells and testis). The sixth (AF159025) lacks the first methionine and contains single-nucleotide polymorphisms that give rise to amino-acid substitutions. All differences from the predicted sequence (AB030835) are marked.
[0109]FIG. 11 EST sequence analysis. On each map a schematic representation of the Ciz1 protein is included for reference, showing the positions of alternatively spliced exons (black), putative chromatin interaction domains (grey) and predicted zinc fingers (black vertical lines). All EST sequences are accompanied by their Genbank accession number with the library from which they were derived indicated in parentheses. Sequences absent from Ciz1 ESTs due to alternative splicing are shown in yellow, frame-shifts in red and putative deletions in grey. Single nucleotide polymorphisms that give rise to amino-acid substitutions are indicated by black dots and some of these occur in a consensus cdk phosphorylation site which we have shown to be important for the regulation of Ciz1 activity (blue dots). Position of the inserted sequence in the carcinoma cell line MGC102 is indicated by a triangle:
[0110]A) Translated ESTs from pediatric cancers and adult neural cancers.
[0111]B) Translated ESTs from various non-cancer cells and tissues.
[0112]C) Translated ESTs from leukemias, lymphomas, and from normal haematopoietic and lymphocytic cells.
[0113]D) Translated ESTs from carcinomas.
[0114]E) Translated ESTs from a range of other cancers.
[0115]F) Summary of alternatively spliced regions (SEQ ID NO: 37-44) in human Ciz1 showing conditionally included sequences.
[0116]FIG. 12 Ciz1 splice variant expression in Ewings sarcoma family tumor cell lines (ESFT) and neuroblastoma cell lines. A. Whole RNA samples from six independent ESFT cell lines, two neuroblastomas and a control cell line (HEK293 cells) were subjected to RT-PCR analysis using 4 different primer sets. ESFT cell lines are 1) A673, 2) RDES, 3) SKES1, 4) SKNMC, 5) TC3, 6) TTC466. Neuroblastoma cell lines are 1) IMR32, 2) SKNSH.
[0117]B. Analysis of Ciz1 Exons 3/4/5 PCR products in ESFTs and neuroblastoma. The products of primers h3 and h4 (spanning potentially variable exons 4 and 6) were analyzed in more detail. PCR fragments were purified from agarose gels by standard procedures, subcloned and sequenced to identify the source of fragment size variations. Between one and eleven individual clones for each of the seven cell lines were sequenced and the results are summarized in tabular form. Ciz1 from ESFT cell lines lacks exon 4 in 31% of transcripts overall, and for some ESFT lines this is nearer 50%. DSSSQ (SEQ ID NO:1) is more commonly absent in the two neuroblastoma cell lines tested here.
[0118]FIG. 13 Ciz1 isoforms in normal human fibroblasts (Wi38) and metastatic prostate cancer cell lines (PC3 and LNCAP). A. Both prostate cancer cell lines contain an excess of the largest p125 Ciz1 protein variant in the nuclear fraction, compared to the non-cancer cell line.
[0119]B. Models for the production of p85 (100) from p125 variants by protein processing during initiation of DNA replication.
[0120]FIG. 14 illustrates the full length mouse mRNA sequence (SEQ ID NO: 45).
[0121]FIG. 15 illustrates the full length human mRNA sequence (SEQ ID NO: 46).
[0122]FIG. 16 illustrates the full length mouse protein sequence (SEQ ID NO: 26).
[0123]FIG. 17 illustrates the full length human protein sequence (SEQ ID NO: 47).
[0124]FIG. 18 illustrates human alternatively spliced protein sequences (SEQ ID NO: 48, 74, 41, 1, 43, 42, 44, 3 and 40, respectively). Sequences shown are absent in the spliced protein sequences.
[0125]FIG. 19 illustrates human alternatively spliced mRNA sequences (SEQ ID NO: 49-57, respectively). Sequences shown are absent in the spliced protein sequences.
[0126]FIGS. 20A and B illustrate unique junction sequences created in human Ciz1 proteins by missing exons (SEQ ID NO: 58-61 and 62-65, respectively). Junction sequences represent prime sites of target for therapeutic agents identified by the method of the invention.
[0127]FIGS. 21A to H illustrate junction sequences created in human Ciz1 mRNA (SEQ ID NO: 66-73, respectively).
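By way of illustration only, the manner in which a missing exon creates a junction sequence that is absent from the full-length product may be sketched as follows. The exon names and sequences are hypothetical placeholders and are not Ciz1 sequences; the flank length of four residues is likewise an assumption chosen for the example.

```python
def splice(exons, skipped=()):
    """Join exon sequences in order, omitting exons whose names are in `skipped`."""
    return "".join(seq for name, seq in exons if name not in skipped)


def junction(exons, skipped_name, flank=4):
    """Sequence spanning the new junction formed when one internal exon is skipped."""
    names = [name for name, _ in exons]
    idx = names.index(skipped_name)
    if idx == 0 or idx == len(exons) - 1:
        raise ValueError("only an internal exon creates a new junction")
    upstream = exons[idx - 1][1]
    downstream = exons[idx + 1][1]
    # End of the upstream exon joined directly to the start of the downstream exon.
    return upstream[-flank:] + downstream[:flank]


# Hypothetical exon-encoded peptide segments (placeholders, not Ciz1 sequence).
exons = [("exon_a", "MKTLLQAV"), ("exon_b", "GHTDE"), ("exon_c", "PEGWRNLK")]

full_length = splice(exons)
variant = splice(exons, skipped={"exon_b"})
junction_seq = junction(exons, "exon_b")
```

In this sketch the junction sequence occurs in the variant but not in the full-length product, which is the property that makes such a sequence a variant-specific target.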
DETAILED DESCRIPTION
[0128]Identification of Ciz1
[0129]We have exploited a polyclonal antibody (antibody V1) that was raised against recombinant human Cdc6 (Coverley et al., 2000; Stoeber et al., 1998; Williams et al., 1998) to identify and study an unknown antigen whose behavior correlates with initiation of DNA replication in vitro. The antigen has an apparent Mr of 100 kDa (called p85) and is readily detectable in extracts from 3T3 cells (FIG. 1A).
[0130]DNA synthesis can be activated in cell-free replication experiments using `replication competent` late G1 phase nuclei, G1 extracts, and recombinant cyclin A-cdk2. Under these conditions nuclei will incorporate labelled nucleotides into nascent DNA, in a manner strictly dependent on the concentration of active protein kinase (FIG. 1B). Above and below the optimum concentration no initiation of DNA replication takes place. However, other events occur which inversely correlate with initiation (Coverley et al., 2002). Here we use activation of DNA synthesis (FIG. 1B), and Mcm2 phosphorylation (which results in increased mobility, FIG. 1C), to calibrate the effects of recombinant cyclin A-cdk2 in cell-free replication experiments, and correlate the behavior of p85 with activation of DNA synthesis.
[0131]In G1 nuclei that are re-isolated from reactions containing initiation-inducing concentrations of cyclin A-cdk2, p85 antigen is more prevalent compared to nuclei exposed to lower or higher concentrations of kinase (FIG. 1D). This suggests that p85 is regulated at some level by cyclin A-cdk2, in a manner that is coincident with activation of DNA synthesis. No other antigens correlate so closely with this stage in the cell-free initiation process; we therefore used antibody V1 to clone the gene for mouse p85.
[0132]When applied to a cDNA expression library derived from 11-day mouse embryos antibody V1 picked out two clones that survived multiple rounds of screening (see methods). One encoded mouse Cdc6, while the other encoded 716 amino acids of the murine homologue of human Ciz1 (Mitsui et al., 1999). Full-length human and mouse Ciz1 have approximately 70% overall homology at the amino-acid level, with greatest (>80%) homology in the N and C terminal regions. Ciz1 is conserved among vertebrates as homologues exist in rat and fugu, but no proteins with a high degree of homology or similar domain structure could be identified in lower eukaryotes, raising the possibility that Ciz1 evolved to perform a specialized role in vertebrate development.
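By way of illustration only, a percentage figure of the kind quoted above may be obtained from a pairwise alignment as sketched below. The sketch assumes that the two sequences have already been aligned, that gaps are written as "-", and that identity is counted over aligned columns in which neither sequence has a gap; the method actually used to derive the figures in this paragraph is not stated here. The sequences are hypothetical placeholders.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over aligned, non-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * identical / compared


# Hypothetical pre-aligned fragments (placeholders, not Ciz1 sequence).
seq_a = "MFSQQ-LQQLQ"
seq_b = "MFNQQALQ-LQ"
identity = percent_identity(seq_a, seq_b)
```

Applying the same function to sub-regions of an alignment would allow regional values, such as those for the N and C terminal regions, to be compared with the overall value.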
[0133]A previous publication on human Ciz1 (Mitsui et al 1999) demonstrated interaction with the cell-cycle protein p21-CIP1, leading to investigation of a proposed role as a transcription factor, not a DNA replication factor. A second paper (Warder and Keherly 2003) published after the priority date of this patent application suggests a role for Ciz1 in tumorigenesis, but does not demonstrate a role in DNA replication or recognize the importance of Ciz1 splice variant expression.
[0134]Multiple Ciz1 Isoforms
[0135]The predicted mouse Ciz1 open reading frame and a cDNA derived from a mouse mammary tumor library (BC018483) contain three regions that are not present in our embryonic clone (AJ575057), hereafter referred to as ECiz1 (FIG. 2A; SEQ ID NO: 27). The three variable regions in ECiz1 appear to be the result of alternative splicing of exons 2/3, 6 and 8 (FIG. 2B). Mouse melanoma clone AK089986 lacks two of the same three regions as ECiz1 (FIG. 2A), while the third encodes an N-terminal polyglutamine stretch that is also absent from human medulloblastoma derived clones. A fourth sequence block derived from exons 3/4 is absent from Ciz1 transcripts derived from mouse ES cells, and from exon 4 in mouse primordial germ cells (FIG. 7). Human Ciz1 is also alternatively spliced at the RNA level to yield transcripts that exclude combinations of the same four sequence blocks as mouse Ciz1 (see below). In fact, all known variations in mouse Ciz1 cDNAs have close human parallels, some of which are identical at the amino-acid level. This suggests that the different Ciz1 isoforms have functional significance. A fifth variable region (not yet observed in the mouse) is alternatively spliced in human Ciz1 transcripts derived mainly from carcinomas.
[0136]The data suggest that shorter forms of Ciz1 (lacking the alternatively spliced exons) are most prevalent early in development and in cell lineages that give rise to the germ line. In the analysis shown in FIG. 7, only Ciz1 from fully developed neonatal heart shows no alternative splicing, while all embryonic cell types contain alternatively spliced forms. Furthermore, the only complete Ciz1 cDNAs in public databases (human or mouse) are derived from non-embryonic cell types, and the only ones derived from embryonic sources are alternatively spliced. Therefore, Ciz1 splice variant expression appears to occur preferentially in cell types that are not yet fully differentiated.
[0137]Notably, Ciz1 cDNAs from pediatric cancers are also alternatively spliced (see below). This led us to the hypothesis that failure to express the appropriate Ciz1 isoform at the right point in development leads to inappropriately regulated Ciz1 activity. This could contribute to unscheduled proliferation and cellular transformation.
[0138]ECiz1 Stimulates DNA Replication in vitro
[0139]Upon exposure to cytosolic extract from S phase cells, late G1 phase nuclei initiate DNA replication and begin synthesizing nascent DNA (Krude et al., 1997). We used this cell-free assay to test the effect of ECiz1, and derived recombinant fragments, on DNA synthesis (FIG. 3). Full-length ECiz1 protein consistently increased the number of nuclei that replicated in vitro, from 30% (+/-0.9%) to 46% (+/-5.5%), which suggests that Ciz1 is limiting for initiation in S phase extracts (FIG. 3A). Only two other classes of protein (cyclin-dependent kinases, Coverley et al., 2002; Krude et al., 1997; Laman et al., 2001, and the Cdc6 protein, Coverley et al., 2002; Stoeber et al., 1998) have been previously found to stimulate cell-free initiation. Thus, ECiz1 is the first protein to have this property that was not already known to be involved in the replication process. The positive effect of recombinant ECiz1 on cell-free initiation argues that endogenous Ciz1 plays a positive role in DNA replication in mammalian cells.
[0140]Stimulation of cell-free initiation is concentration-dependent with peak activity in S phase extract at around 1 nM ECiz1 (FIG. 3B). This echoes previous cell-free analyses with other recombinant proteins (Coverley et al., 2002; Krude et al., 1997), where stimulation of initiation typically peaks and then falls back to the un-stimulated level at high concentrations. For ECiz1, the reason for the drop in activity at high concentrations is not yet clear. However, mutagenesis studies (see below) suggest that the restraining mechanism is likely to be active and specific rather than due to a general imbalance in the composition of higher order protein complexes.
[0141]Down regulation of ECiz1 involves threonines 191/192
Ciz1 is likely to be a phospho-protein in vivo since it contains numerous putative phosphorylation sites, and it displays altered mobility when 3T3 cell extracts are treated with lambda phosphatase (not shown). Murine Ciz1 contains two RXL cyclin binding motifs and five putative cdk-phosphorylation sites, which are present in all known variants. Four of these are located in the N-terminal fragment of ECiz1 that contains in vitro replication activity (see below), and one is adjacent to the site at which exon 6 is alternatively spliced to exclude a short DSSSQ (SEQ ID NO: 1) sequence motif (FIG. 2A, C). As this motif is 100% identical and alternatively spliced in both mouse and man we reasoned that conditional inclusion might serve to regulate Ciz1 activity, identifying this region of the protein as potentially important. We therefore chose to focus on the cdk site that is four residues upstream and which is also conserved in mouse and man, by combining a genetic approach with cell-free replication assays. Starting with ECiz1, two threonines at 191 and 192 were changed to two alanines, generating ECiz1T(191/2)A (FIG. 2D). When tested in vitro for DNA replication activity, ECiz1T(191/2)A stimulated initiation in late G1 nuclei to a similar extent as ECiz1 (FIG. 3C). However unlike ECiz1, stimulation of initiation was maintained over a broad range of concentrations that extended over at least three orders of magnitude. Therefore, a mechanism to restrict the activity of excess ECiz1 exists and operates in a cell-free environment. In a separate construct, the threonine at position 293 was also changed to alanine generating ECiz1T(293)A (FIG. 2D), but this alteration had little effect on ECiz1 activity assayed in vitro (FIG. 3D).
[0142]These results demonstrate that down-regulation of ECiz1 activity involves threonine 191/2, and is probably caused by cyclin-dependent kinase mediated phosphorylation at this site. This links Ciz1 activity to the cdk-dependent pathways that control all major cell-cycle events, including initiation of DNA replication.
[0143]Most pre-replication complex proteins and many replication fork proteins are phosphorylated in vivo, often by cyclin-dependent kinases (Bell and Dutta, 2002; Fujita, 1999). Our data suggest that nuclear accumulation of p85-Ciz1 antigen is regulated (directly or indirectly) by cyclin A-cdk2, and they show that a specific consensus cdk phosphorylation site at threonine 191/192 is involved in controlling Ciz1 activity. When this site is made unphosphorylatable Ciz1 activity is maintained over a broader range of concentrations in cell-free assays. Therefore, Ciz1 activity is normally down regulated by modification at this site. The functions of the other conserved cdk phosphorylation sites, and the effect of conditional inclusion of an RXL cyclin-binding motif in the alternatively spliced N-terminal portion of Ciz1, remain to be determined. Thus, the simple negative relationship between Ciz1 activity and cdk-dependent phosphorylation that has been uncovered here is unlikely to be the whole story. However, our analysis so far links Ciz1 with the cdk-dependent pathways that control all major cell-cycle transitions, and is therefore consistent with our main conclusion that Ciz1 is involved in initiation of DNA replication.
[0144]In vitro Replication Activity Resides in the N-Terminus
[0145]Ciz1 possesses several C-terminal features that may anchor the protein within the nucleus. The matrin 3 domain suggests interaction with the nuclear matrix and the three zinc-fingers imply interaction with nucleic acids. Indeed, recent evidence suggests that human Ciz1 binds DNA in a weakly sequence specific manner (Warder and Keherly, 2003). To determine whether C-terminal domains are important for ECiz1 replication activity we divided the protein into two fragments (FIG. 2D). Nterm442 (which contains the NLS, two conserved cdk sites, one zinc finger and all known sites where variable splicing has been observed) stimulates initiation to a similar extent and at the same concentration as ECiz1 (FIG. 3E). In contrast, the C-terminal portion (Cterm274) contains no residual replication activity (FIG. 3F). Therefore, the matrin 3 domain, one of the cyclin-dependent kinase phosphorylation sites and two of the zinc-fingers are not required for the DNA replication activity of ECiz1, when assayed in vitro. It should be noted however that this analysis measures ECiz1 activity in trans under conditions where the consequences of mis-localisation are unlikely to be detected. Therefore, it remains possible that the matrin 3 domain and zinc fingers act in vivo to direct Ciz1 activity to specific sites in the nucleus and thus limit the scope of Ciz1 activity.
[0146]Endogenous Ciz1 antibody V1 recognizes Cdc6 as well as p85-Ciz1 (FIG. 1A), so it is not suitable for immuno-fluorescence experiments aimed at visualizing the sub-cellular localization of endogenous Ciz1. We therefore generated two new rabbit polyclonal anti-sera against recombinant ECiz1 fragment Nterm442, designated anti-Ciz1 1793 and 1794. As expected, purified Nterm442 is recognized by anti-Ciz1 antibodies 1793 and 1794 in western blots, but it is also recognized by antibody V1 (FIG. 4A), supporting the conclusion that p85 (p100) is indeed Ciz1.
[0147]When applied to protein extracts derived from growing 3T3 cells, anti-Ciz1 1793 recognized two antigens, with Mr of 125 and 100 kDa (FIG. 4B), whose relative proportions vary from preparation to preparation. The 100 kDa band co-migrates with the cyclin-A responsive antigen that is recognized by antibody V1 (FIGS. 1 and 4B), which suggests that both antibodies recognize the same protein in vivo. We confirmed that the p100-Ciz1 bands recognized by antibody V1 and 1793 are the same protein by immuno-precipitation (FIG. 4C). Antibody V1 precipitated a 100 kDa band that was recognized in western blots by 1793, and vice versa. Furthermore, in the same experiment 1793, and to a lesser extent antibody V1, precipitated a 125 kDa antigen, that was recognized in western blots by 1793. Taken together our observations show that the 100 kDa band is indeed Ciz1 (previously known as p85), and they suggest that Ciz1 protein exists in at least two forms in cycling cells.
[0148]In addition to the immuno-precipitation evidence described above, several other observations lead to the conclusion that p125 is also a form of Ciz1. First, both of our anti-Ciz1 antibodies (1793 and 1794) have this band in common. Both antibodies produce the same pattern of nuclear staining in immuno-fluorescence experiments, and this is disrupted in cells treated with Ciz1 siRNA (see below). Second, the relative proportions of p100 and p125 vary from preparation to preparation, and could therefore be the result of proteolytic cleavage. Third, our results are strikingly similar to those of Mitsui et al. (1999), whose anti-human Ciz1 monoclonal antibody detected two antigens with apparent Mr of 120 and 95 kDa in HEK293 cells. They proposed that the 120 kDa form of human Ciz1 protein is processed to produce the 95 kDa form, and our results are consistent with this proposal.
[0149]The 125 kDa band recognized by antibody 1793 in mouse and human cells resolves into three Ciz1-related bands during high-resolution electrophoresis of material derived from non-transformed human cells (Wi38; see later) and mouse cells (NIH3T3; not shown). This may be the result of post-translational modification of the Ciz1 protein or of alternative splicing of the Ciz1 transcript.
[0150]Sub-cellular distribution of Ciz1. Anti-Ciz1 1793 was used to visualize the sub-cellular distribution of Ciz1 protein (p85 and p125) in 3T3 cells (FIG. 5A), and in HeLa cells (not shown). In both cell types 1793 reacted with a nuclear-specific antigen, and this was blocked by inclusion of recombinant Nterm442 fragment (FIG. 5B). Unlike Cdc6, which is shown for comparison (FIG. 5A), Ciz1 is clearly detectable in all 3T3 cells in this cycling population. Therefore Ciz1 is present in the nucleus throughout interphase, although minor variations in quantity or isoform would not be detected by this method. After detergent treatment overall nuclear Ciz1 staining was reduced in all nuclei, which suggests that Ciz1 is present in the nucleus both as a soluble fraction and bound to insoluble nuclear structures.
[0151]When soluble protein is washed away, the insoluble, immobilized antigen resolves into a punctate sub-nuclear speckled pattern at high magnification (FIG. 5C, D). Ciz1 speckles show a similar size range and distribution as replication 'foci' or 'factories', the sites at which DNA synthesis takes place in S phase. To ask whether Ciz1 is coincident with sites of replication factories, we compared the position of Ciz1 speckles to the position of PCNA, a component of replication complexes in S phase cells (FIG. 5C). In confocal section, PCNA foci are less abundant than Ciz1 foci, but they are almost all co-incident with Ciz1 (FIG. 5D, E, F). This is particularly striking for foci in the medium size range. In merged images, overlap between the positions of PCNA and Ciz1 foci results in yellow spots, while the remaining Ciz1 foci that are not co-incident with PCNA are red. Green (PCNA alone) foci are virtually absent, which suggests that Ciz1 is present at all sites where DNA replication factories have formed.
[0152]Ciz1 is also present at sites that do not contain PCNA (FIG. 5D), and unlike PCNA, Ciz1 foci persist throughout interphase (FIG. 5A). One interpretation of these observations is that Ciz1 marks the positions in the nucleus at which PCNA-containing replication factories are able to form in S phase, but that not all of these sites are used at the same time. It remains to be determined whether different Ciz1 foci become active sites of DNA replication at different times in S phase, or whether other nuclear activities also occur at sites where Ciz1 is bound. Indeed, at this stage it also remains possible that the 100 kDa form and the 125 kDa variants of Ciz1 have different activities, and that they reside at nuclear sites with different functions.
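The colour logic used to read the merged images can be illustrated with a minimal Python sketch. It assumes that focus centres have already been extracted from each channel as (x, y) coordinates and that two foci count as co-incident when they lie within a chosen distance of each other; the function name `classify_foci`, the `max_distance` threshold and the coordinates are all hypothetical and are not taken from the experiments described here.

```python
from math import dist


def classify_foci(ciz1_foci, pcna_foci, max_distance=1.0):
    """Count merged-image colours for two lists of (x, y) focus centres.

    A Ciz1 focus is 'yellow' if any PCNA focus lies within max_distance,
    otherwise 'red'. A PCNA focus with no Ciz1 partner is 'green'.
    """
    def has_partner(point, others):
        return any(dist(point, other) <= max_distance for other in others)

    yellow = sum(has_partner(c, pcna_foci) for c in ciz1_foci)
    red = len(ciz1_foci) - yellow
    green = sum(not has_partner(p, ciz1_foci) for p in pcna_foci)
    return {"yellow": yellow, "red": red, "green": green}


# Hypothetical focus centres in arbitrary units, not measured data.
ciz1 = [(1, 1), (4, 4), (7, 2), (9, 9), (2, 8)]
pcna = [(1.2, 1.1), (7.1, 2.3)]
counts = classify_foci(ciz1, pcna)
print(counts)
```

With these toy inputs every PCNA focus has a Ciz1 partner, so the green count is zero while some Ciz1 foci remain red, which is the pattern described for the merged images.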
[0153]Ciz1 is Essential for Cell Proliferation
[0154]So far we have shown that the behavior of p85 (p100)-Ciz1 correlates with initiation of DNA replication in cell-free assays, that recombinant Ciz1 stimulates the frequency of initiation, and that Ciz1 resides at the same nuclear sites as the DNA replication machinery. However, these data do not show that Ciz1 has an essential function in proliferating cells. In order to test this we used RNA interference (RNAi) to selectively reduce Ciz1 transcript levels in NIH3T3 cells. Four target sequences within Ciz1 were chosen (see FIG. 2A) and short interfering (si) RNA molecules were produced in vitro. When applied to cells, all four Ciz1 siRNAs restricted growth (FIG. 6A) and caused a visible reduction in the level of Ciz1 protein after 48 hours (FIG. 6B). The effect of Ciz1 depletion on proliferation becomes apparent between 23 and 40 hours post-transfection, which suggests that the first cell cycle without Ciz1 RNA is relatively unaffected. By 40 hours, controls and Ciz1 siRNA treated cells diverged significantly, with no further proliferation in the Ciz1 depleted population. To verify the specificity of Ciz1 depletion, transcript levels were monitored at 24 hours, before proliferation is significantly inhibited (FIG. 6C). At this point Ciz1 transcripts were reduced to 42% of the level in control cells treated with GAPDH siRNA. These experiments show that Ciz1 is required for cell proliferation and are consistent with a primary function in DNA replication.
[0155]To test this further, cells were pulse-labelled with BrdU 48 hours after siRNA treatment to determine the fraction of cells engaged in DNA synthesis (FIG. 6D). When Ciz1 levels were reduced the BrdU labelled fraction was also reduced, suggesting that DNA synthesis is inhibited under these conditions. Furthermore, cells in the Ciz1 depleted population that did incorporate BrdU (approximately 15% of the population) were less intensely labelled. Therefore, in some Ciz1 siRNA treated cells S phase is slowed down rather than inhibited completely, possibly due to incomplete depletion.
[0156]Inhibition of DNA synthesis by Ciz1 siRNAs could be a secondary consequence of a general disruption of nuclear function. Therefore, we looked in more detail at a range of other replication proteins whose levels are regulated in a cell cycle-dependent manner, to ask whether depleted cells arrest randomly, or accumulate at a particular point.
[0157]During initiation of eukaryotic DNA replication Mcm complex proteins assemble at replication origins in late G1, in a Cdc6-dependent manner. Sometime later, DNA polymerases and their accessory factors (including PCNA) become bound to chromatin and origins are activated. This is associated with nuclear export and proteolysis of the majority of Cdc6 and, as DNA synthesis proceeds, gradual displacement of the Mcm complex from chromatin (Bell and Dutta, 2002). In order to identify the point of action of Ciz1 we used immuno-fluorescence to monitor Mcm3 and PCNA. In Ciz1 depleted cells (FIG. 6E, F) both proteins were detectable within the nucleus bound to detergent resistant nuclear structures. Therefore, these factors are unlikely to bind directly to Ciz1, or to be dependent upon Ciz1 for their assembly. In fact, in four independent experiments the average number of cells with detergent-resistant chromatin-bound Mcm3 actually increased from 31% (+/-6%) to 51% (+/-5%) (FIG. 6E). Increased Mcm3 indicates that the Ciz1 dependent step occurs after pre-replication complex assembly (but before completion of S phase). In the same cell populations the PCNA positive fraction also increased, from 32% (+/-5%) to 49% (+/-6%) (FIG. 6F), narrowing the point of Ciz1 action to after PCNA assembly. Thus, Ciz1 most likely acts to facilitate DNA replication during a late stage in the initiation process, while failure to act inhibits progression through S phase, leaving Mcm3 and PCNA in place.
[0158]Taken together, our cell-free and cell-based investigations paint a consistent picture about the primary function of Ciz1. They suggest that Ciz1 is a novel component of DNA replication factories, and they show that Ciz1 plays a positive role in the mammalian cell-cycle, acting to promote initiation of DNA replication.
[0159]Three of our lines of investigation suggest that Ciz1 is required during a late stage in the initiation process after pre-replication complex formation. First, p85 (p100)-Ciz1 antigen accumulates in nuclei exposed to cyclin A-cdk2 concentrations that activate DNA synthesis, implying that Ciz1 functions during this step rather than during earlier replication complex assembly steps (Coverley et al., 2002). Second, functional studies with late G1 nuclei show that recombinant ECiz1 increases the number of nuclei that incorporate labeled nucleotides in vitro. Therefore, Ciz1 must be active in a step that converts nuclei that are poised to begin DNA synthesis into ones that are actively synthesizing DNA. Third, RNA interference studies point to a Ciz1-dependent step after Mcm complex formation and after PCNA has become assembled onto DNA, but before these proteins are displaced. These distinct lines of investigation lead to strikingly similar conclusions about the point of action of Ciz1 placing it in the later stages of initiation.
[0160]Anti-Ciz1 siRNA as a Therapeutic Strategy
[0161]Our analysis shows that Ciz1 is essential for cell proliferation, and that targeting Ciz1 is a viable strategy to restrain proliferation. The alternatively spliced forms of Ciz1 that we observe in various cancers (see below) mean that Ciz1 could be targeted in a selective way to restrain proliferation in a subset of cells within a population.
[0162]By way of example, this could be done by targeting siRNAs to the junction sequence created in Ciz1 transcripts when the C-terminal sequence GTTGAGGAGGAACTCTGCAAGCAG (SEQ ID NO:2) is missing in small cell lung carcinoma cells, or by using Ciz1 protein lacking the corresponding VEEELCKQ (SEQ ID NO: 3) sequence to select specific chemical inhibitors.
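As an illustrative check of the two sequences quoted above, the following Python sketch translates SEQ ID NO: 2 with the standard genetic code and confirms that it encodes SEQ ID NO: 3. It also shows one way a junction-spanning target window could be assembled once the exon sequences flanking the missing block are known. The helper names, the 19-nucleotide window length and the flanking sequences used in the usage line are assumptions made for illustration only; the flanks are placeholders and are not Ciz1 sequence.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SEQ_ID_NO_2 = "GTTGAGGAGGAACTCTGCAAGCAG"  # nucleotide block quoted in the text
SEQ_ID_NO_3 = "VEEELCKQ"                  # peptide block quoted in the text


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def junction_window(upstream, downstream, length=19):
    """Return a window of `length` nucleotides spanning the new junction.

    `upstream` and `downstream` are the sequences that become adjacent
    when the variable block is absent (hypothetical inputs here).
    """
    left = length // 2
    right = length - left
    if len(upstream) < left or len(downstream) < right:
        raise ValueError("flanking sequences are too short for this window")
    return upstream[-left:] + downstream[:right]


print(translate(SEQ_ID_NO_2) == SEQ_ID_NO_3)
# Placeholder flanks, NOT real Ciz1 sequence.
print(junction_window("ACGTACGTACGTACGT", "TTGGCCAATTGGCCAA"))
```

A window built this way exists only in transcripts that lack the variable block, which is the property that makes the junction a selective target.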
[0163]Accordingly the present invention also provides for the use of junction sequences created in Ciz1 transcripts and proteins when alternatively spliced sequences are not present, as a diagnostic marker, prognostic indicator or therapeutic target.
[0164]Embryonic form Ciz1 is localized to the nucleus. RT-PCR analysis across potentially variable exons suggests that 3T3 cells predominantly express full-length Ciz1, so our immuno-localization work on endogenous Ciz1 (FIG. 5) does not necessarily reflect the behavior of ECiz1, which lacks several sequence blocks and possibly therefore information that is used to localize the protein. To directly compare the localization of ECiz1 and full-length Ciz1, enhanced GFP tagged constructs were transfected into 3T3 cells (FIG. 8A), and microinjected into mouse pro-nuclei (FIG. 8B). In all cases tagged Ciz1 and ECiz1 were exclusively nuclear, while a control construct expressing GFP alone was present in the nucleus and the cytoplasm. GFP-Ciz1 and GFP-ECiz1 were both visible in live cells as sub-nuclear foci, similar to replication foci seen in fixed cells by immuno-fluorescence. Thus, the three sequence blocks that are absent from ECiz1 do not appear to contribute to the nuclear localization of Ciz1.
[0165]Over the three day period following transfection no cell division was observed in the GFP-Ciz1 and GFP-ECiz1 transfected cells. These data suggest that overexpression of functional Ciz1 has an inhibitory effect on the cell cycle (in cells that have their regulatory pathways intact).
[0166]Coalescence
[0167]When GFP-tagged constructs in which the C-terminal one third of Ciz1 had been removed were transfected into 3T3 cells, differences between ECiz1 and full length Ciz1 were observed (FIG. 8C). By 48 hours FL Ciz1 N-term (442 equivalent) had coalesced into large intra-nuclear blobs which only became apparent in the ECiz1 N-term442 transfected population by day 3 or later. Before this time ECiz1 N-term442 was localized as a nuclear specific but diffuse pattern. Thus ability to coalesce is quantifiably different between Ciz1 and ECiz1, and is therefore affected by one of the three alternatively spliced exons (2/3, 6 or 8).
[0168]Like cells transfected with full length Ciz1 and ECiz1, cells transfected with constructs in which the C terminal one third was removed were not seen to multiply during the three day monitoring period.
[0169]C-Terminal Domains Anchor Ciz1 to Nuclear Structures
[0170]As described above, the difference between Ciz1 and ECiz1 N-term is masked when C-terminal domains are also present (FIG. 8A). Furthermore the C-terminal fragment alone directs GFP tag to chromatin, forming an irregular pattern that is not as spotty (focal) as Ciz1 or ECiz1, but which remains attached to chromosomes during mitosis (FIG. 8D). This suggests that C-terminal domains are involved in immobilizing Ciz1 on a structural framework in the nucleus. Notably, cells transiently transfected with C-terminal fragment continued to divide resulting in gradual dilution of green fluorescence.
[0171]Ectopic Ciz1 Promotes Premature Entry to S Phase
[0172]We looked at events occurring during the first day after transfection. The S phase fraction in transfected cells (green) was compared to the S phase fraction in untransfected cells, by labelling with BrdU at various intervals. During long labelling windows including 0-22 hours (FIG. 8E), 0-12 hours and 0-7 hours (not shown), consistently more of the Ciz1 and ECiz1 transfected cells were engaged in DNA synthesis, compared to untransfected cells. This suggests that Ciz1 and ECiz1 have a positive effect on the G1-S transition, promoting unscheduled entry to S phase. Similar results were obtained with 3T3 cell populations that were densely plated before transfection. This was done in order to minimize the fraction in the untransfected population that was engaged in S phase as part of the normal cell cycle. Under these conditions the difference between the transfected and untransfected population was maximized, clearly demonstrating the effect of ectopic Ciz1 on initiation of DNA replication.
[0173]Conversely, when cells were labelled with BrdU during a short pulse administered at 22 hours (FIG. 8E), or at 10 hours or 12 hours post-transfection (not shown), the labelled fraction was consistently reduced in the Ciz1 and ECiz1 transfected populations. This suggests that the S phase that is induced by ectopic Ciz1 or ECiz1 is abnormal, with slow or aborted DNA synthesis that is not sufficient to label cells during short windows of exposure to BrdU.
[0174]Therefore, ectopic Ciz1 and ECiz1 have two effects on S phase in cultured cells. They promote DNA replication, but this results in slow or aborted DNA synthesis.
[0175]Clones with Altered Proliferation Potential
[0176]We also monitored transfected populations of 3T3 cells over a three week time period. In cells transfected with the GFP-Nterm442 or the non-alternatively spliced equivalent and maintained under selection with G418, large foci containing hundreds of cells were observed (FIG. 9A). These clusters contained large numbers of GFP expressing cells, demonstrating that over-expression of the N-terminal portion of ECiz1 (in which replication activity resides) is not lethal, and suggesting that over-expression leads to altered proliferation phenotype, compared to untransfected cells, including loss of contact inhibition and failure to form a monolayer. This Ciz1-dependent altered behavior could contribute to tumor formation. A similar truncated version of mouse Ciz1, lacking putative chromatin interaction domains was previously isolated from a mouse melanoma (FIG. 2).
[0177]Human Ciz1 and Cancer
[0178]Ciz1 cDNAs in Public Databases
[0179]As mentioned above, human Ciz1 is alternatively spliced at the RNA level to yield transcripts that lack three of the same exons as mouse embryonic Ciz1. Seven human Ciz1 cDNAs have been recorded in public databases (FIG. 10), submitted by Mitsui et al. (1999), Warder and Keherly (2003) and large-scale genome analysis projects (NIH-MGC project, NEDO human cDNA sequencing project). Only one is derived from normal adult tissue, and this contains all predicted exons (AB030835). The rest are derived from embryonic cells (AK027287), or notably from four different types of pediatric cancer (medulloblastoma, AF159025, AF0234161, retinoblastoma, AK023978, neuroblastoma, BC004119 and Burkitt lymphoma, BC021163). The embryonic form and the cancer derived forms lack sequence blocks from the same three regions as our embryonic mouse clone, and from a fourth region which corresponds to exon 4. Therefore, the limited data suggest that alternatively spliced forms are more prevalent early in development. This correlation has not previously been noted in the scientific literature. The presence of alternatively spliced Ciz1 in pediatric cancers raises the possibility that Ciz1 mis-splicing might be linked to inappropriate cell proliferation.
[0180]For example, one of the variable exons encodes a short conserved DSSSQ (SEQ ID NO:1) sequence motif that is absent in mouse ECiz1 and in a human medulloblastoma. This is directly adjacent to the consensus cdk phosphorylation site that we have shown to be involved in regulation of ECiz1 function. Conditional inclusion of the DSSSQ (SEQ ID NO:1) sequence might make Ciz1 the subject of regulation by the ATM/ATR family of protein kinases, which phosphorylate proteins at SQ sequences, thereby restraining Ciz1 initiation function in response to DNA damage.
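The motif reasoning in this paragraph can be sketched as a simple pattern scan in Python. The patterns below are assumptions chosen for illustration: `SQ` for the ATM/ATR target motif named in the text, `[ST]P` and `[ST]P.[KR]` as minimal and full forms of the cdk consensus site, and `R.L` for the RXL cyclin-binding motif. The peptide is a hypothetical toy string, not Ciz1 sequence, built so that a DSSSQ block sits directly next to a cdk site.

```python
import re

# Illustrative motif patterns (assumptions, see lead-in).
MOTIFS = {
    "ATM/ATR (SQ)": r"SQ",
    "cdk minimal ([ST]P)": r"[ST]P",
    "cdk full ([ST]P.[KR])": r"[ST]P.[KR]",
    "cyclin-binding (R.L)": r"R.L",
}


def find_motifs(protein):
    """Return {motif name: [1-based start positions]}, overlaps included."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead reports every start position, even for overlapping hits.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", protein)]
    return hits


# Hypothetical peptide, NOT real Ciz1 sequence.
with_block = "MKRSLDSSSQTPNRKA"
without_block = with_block.replace("DSSSQ", "")

for label, peptide in (("with DSSSQ", with_block), ("without DSSSQ", without_block)):
    print(label, find_motifs(peptide))
```

In this toy case removing the DSSSQ block deletes the only SQ site while leaving the adjacent cdk site intact, which mirrors the conditional regulation proposed above.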
[0181]Analysis of Expressed Sequence Tags
[0182]The presence of alternatively spliced Ciz1 in pediatric cancers prompted a detailed analysis of Ciz1 ESTs. There are 567 expressed sequence tags (ESTs) included in NCBI unigene cluster Hs.23476 (human Ciz1). These are derived from a wide range of normal and diseased tissues and cell lines. Sequences have been translated and mapped against the predicted full-length amino-acid sequence of human Ciz1. Sequence alterations that give rise to amino-acid substitutions, deletions, frame-shifts and premature termination of translation have been recorded.
[0183]Alternatively spliced Ciz1 variants were also seen in this EST data set and are recorded here. The four sequence blocks that we previously reported to be alternatively spliced in human and mouse Ciz1 (FIG. 2) were observed in the EST sequences, as well as a previously undetected variant that lacks the exon 14 derived sequence VEEELCKQ (SEQ ID NO: 3). All of these recurrently variant sequence blocks are bounded by appropriate splice sites. A sixth variable sequence block was identified in one carcinoma derived library, caused by inclusion of GCCACCCACACCACGAAGAGATGTGTTTGCCCACGTTCCAGTGCAGGGGTGGAGCACAGCCCGGCTTGTTACAGATAT (SEQ ID NO: 4).
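The mapping of translated ESTs against the full-length protein can be illustrated with a small Python sketch that reports which blocks of a reference sequence are absent from a variant. It uses the standard-library `difflib.SequenceMatcher` as a stand-in for a proper alignment tool, which is an assumption made for brevity and not the procedure used in this analysis. The reference string is a hypothetical toy peptide; only the VEEELCKQ block within it is taken from the text.

```python
from difflib import SequenceMatcher


def missing_blocks(reference, variant):
    """Return (1-based start, block) for reference segments absent from variant."""
    matcher = SequenceMatcher(None, reference, variant, autojunk=False)
    blocks = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "delete":
            blocks.append((i1 + 1, reference[i1:i2]))
    return blocks


# Hypothetical toy reference; only VEEELCKQ comes from the text.
reference = "MAPLTRGSWVEEELCKQHNDYF"
variant = reference.replace("VEEELCKQ", "")
print(missing_blocks(reference, variant))
```

Substitutions would appear under the `replace` tag and insertions under `insert`, so the same loop could be extended to record the other classes of alteration listed above.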
[0184]ESTs are grouped according to the cell type from which they were derived with the primary divisions occurring between neoplastic cells of adult, childhood or embryonic origin. ESTs from normal tissue of embryonic or adult origin are included for comparison. EST-derived Ciz1 protein maps are shown in FIG. 11A-E and the alternatively spliced exons summarized in FIG. 11F.
[0185]Three sequence blocks in the N-terminal end of human Ciz1 are absent in transcripts from medulloblastomas and neuroblastoma (FIG. 11A), and occasionally absent from Ciz1 transcripts from other cancers. We also found similar alternative splicing in a third pediatric cancer, Ewing's sarcoma (see below). Pediatric cancer-associated alternatively spliced sequences are from exons 2/3 (at least two versions), exon 4 and exon 6.
[0186]Exon 8 variants in which one or more copies of a Q-rich degenerate repeat are absent have been noted in transcripts derived from normal cells (of embryonic or adult neural origin) and from various cancers. Alternative splicing in this region could produce Ciz1 with inappropriate activity, therefore exon 8 variant expression, or occurrence of point mutations which influence splicing in this region, might be useful as diagnostic or prognostic markers in cancer. The alternatively spliced degenerate repeats in exon 8 are detailed below and summarized in FIG. 11F.
[0187]In the C-terminal half of the human Ciz1 protein two sequence blocks are variably spliced. One of these is missing from transcripts derived from three out of five lung carcinoma and lung carcinoid libraries, and from three other carcinoma libraries (but very rarely from transcripts from other cell types).
[0188]The second variant sequence block is due to inappropriate inclusion of extra sequence in transcripts from the epidermoid carcinoma library (MGC102).
[0189]These sequences and the junction sequences formed in Ciz1 proteins, and Ciz1 transcripts when these segments are excluded or included, are potential targets for selective inhibition of cell proliferation in a wide range of different cancers. The remaining non-variant sequences are potential targets for non-selective inhibition of cell proliferation.
[0190]In addition to splicing variations, other non-typical Ciz1 transcripts were found to preferentially occur in some cancers. In Rhabdomyosarcomas Ciz1 is prematurely terminated leading to a predicted protein that lacks C-terminal nuclear binding domains. This could lead to inappropriate DNA replication and might therefore be a therapeutic target or marker in this type of cancer.
[0191]Several transcripts contain point mutations that lead to amino-acid substitutions in putative cyclin-dependent kinase (cdk) phosphorylation sites. In the cervical carcinoma library MGC12, this occurs twice. We have shown that two cdk phosphorylation sites are involved in restraining Ciz1 activity (FIGS. 3C and D), implicating these mutations in the deregulation of proliferation in cancer cells. One of these is the same as the carcinoma-derived mutant mentioned above (FIG. 11E). Cancer-derived transcripts with point mutations in Ciz1 could also be targeted by RNA interference, or have value as diagnostic or prognostic indicators.
[0192]Investigation of Ciz1 Variant Expression in Pediatric Cancers
[0193]Ciz1 variant expression was investigated in six Ewing's sarcoma family tumor cell lines (ESFTs) and two neuroblastoma cell lines, using RT-PCR with primer sets that span three regions of known Ciz1 variability (FIG. 12A). This analysis showed that the pattern of Ciz1 variant expression differs among ESFT cells, neuroblastoma cells and non-transformed cells, but is apparently very similar within sets of cell lines from the same tumor. Therefore, Ciz1 variant expression could have prognostic or diagnostic potential for these cancers. Minor variations within a set of lines from the same tumor type could have prognostic value.
[0194]By subcloning and sequencing amplified transcripts we found that all six ESFT lines tested express an exon 4 minus form of Ciz1. As Ciz1 is essential for cell proliferation (see above), this offers a possible route for selective restraint of ESFT cells. Transcripts from the two neuroblastoma cell lines tested rarely lack exon 4 but frequently lack the DSSSQ (SEQ ID NO:1) motif encoded by exon 6 (FIG. 12B).
[0195]This experimental analysis confirms that pediatric cancers express forms of Ciz1 with variable inclusion of exons 4, 6 and probably exons 2/3.
[0196]Two versions of the sequence encompassing exon 8 and one form of the sequence encompassing the VEEELCKQ-coding sequence were detected in ESFTs, neuroblastomas and controls, suggesting that these regions do not contribute to deregulation of Ciz1 in these pediatric cancers.
[0197]In all cases, Ciz1 RT-PCR products were most abundant in reactions carried out with RNA samples from cancer cell lines, compared to controls (Wi38, HEK293, NIH3T3 cells, and primary human osteoblasts). This is consistent with increased expression of Ciz1 variants in tumors.
[0198]Analysis of Ciz1 Protein Expression in Prostate Cancer Cell Lines
[0199]Normal, non-transformed human lung fibroblasts (and mouse NIH3T3 cells) express two major forms of Ciz1 that are detected by anti-Ciz1 polyclonal antibody 1793 in western blots (FIG. 13A). The larger (approximately 125 kDa) band resolves into three distinct bands that are present in equal proportions in Wi38 cells, but grossly uneven proportions in prostate cancer cell lines PC3 and LNCAP (and ESFT cell lines--not shown). We postulate that these protein isoforms are generated by expression of variably spliced exons. Both tumor cell lines also contain more Ciz1 antigen than Wi38 cells, consistent with over-expression of Ciz1 in these cancer cell lines.
[0200]Taken together, our results (experimental and bioinformatics analysis of genome data) support the conclusion that Ciz1 is mis-regulated in a wide range of human cancers. We have shown that the Ciz1 protein plays a positive role in the DNA replication process, therefore mutant Ciz1 could contribute to cellular transformation, rather than be a consequence of it. If deregulation of Ciz1 is a common step in this process it represents a very attractive target for development of therapeutic agents.
[0201]We have also associated particular changes with specific cancers, making it a real possibility that Ciz1 could be useful as a diagnostic or prognostic marker.
[0202]These include:
[0203]Alternative splicing in the N-terminal part of the protein (that contains replication activity in vitro) in pediatric cancers.
[0204]Point mutations in cyclin-dependent kinase phosphorylation sites known to be involved in restraining Ciz1 replication activity.
[0205]Non-typical expression and nuclear binding properties of Ciz1-p125 forms in prostate carcinoma cell lines, possibly due to mis-regulated splicing of the degenerate repeats in exon 8, or other exons.
[0206]Conditional exclusion of a discrete motif (VEEELCKQ) in the C-terminal end of Ciz1 (probably involved in localization of Ciz1 protein within the nucleus) in small cell carcinoma of the lung and other carcinomas.
[0207]Increased levels of Ciz1 protein and RNA (detected by Western blot and by RT-PCR) in all cancer-derived cell lines tested so far, compared to Wi38 normal embryonic lung fibroblast, human osteoblast RNA and mouse NIH3T3 fibroblasts.
[0208]The sequences shown in FIGS. 14 to 21 are of use for the development of therapeutic, diagnostic, or prognostic reagents.
[0209]Materials and Methods
[0210]Cloning.
[0211]A lambda TriplEx 5'-stretch, full length enriched cDNA expression library derived from 11-day-old mouse embryos (Clontech ML5015t) was used to infect E. coli XL1-Blue according to the recommended protocol (Clontech). Plaques were lifted onto 0.45 micron nitrocellulose filters pre-soaked in 10 mM IPTG (Sigma). Affinity purified antibody V1 was applied to approximately 3×10^6 plaques at 1/1000 dilution in PBS, 10% non-fat milk powder, 0.4% Tween20, after blocking for 30 minutes in the absence of antibody. After two hours filters were washed three times with the same buffer and reactive plaques were visualized with anti-rabbit secondary antibody conjugated to horse-radish peroxidase (Sigma), and enhanced chemi-luminescence (ECL, Amersham) according to standard procedures. 43 independent plaques were picked but only two strains of phage survived a further three rounds of screening. These were converted to pTriplEx by transforming into BM25.8 and sequenced. One codes for mouse Cdc6 (clone P) and the other (clone L) for an unknown mouse protein that is homologous to human Ciz1. We refer to this as embryonic Ciz1 (ECiz1) and it was submitted to EMBL under the accession number AJ575057.
[0212]Bacterial expression. pGEX-based bacterial expression constructs (Amersham) were used to produce ECiz1 proteins for in vitro analysis. pGEX-ECiz1 was generated by inserting a 2.3 kb SmaI-XbaI (blunt ended) fragment from clone L into the SmaI site of pGEX-6P-3. pGEX-Nterm442 was generated by inserting the 1.35 kb XmaI-XhoI fragment into XmaI-XhoI digested pGEX-6P-3, and pGEX-Cterm274 by inserting the 0.95 kb XhoI fragment into XhoI digested pGEX-6P-3. pGEX-T(191/2)A was generated from pGEX-ECiz1 by site directed mutagenesis (Stratagene Quikchange) using primers AACCCCCTCTTCCGCCGCCCCCAATCGCAAGA (SEQ ID NO: 5) and TCTTGCGATTGGGGGCGGCGGAAGAGGGGGTT (SEQ ID NO: 6). pGEX-T(293)A was generated from pGEX-ECiz1 using primers AAGCAGACACAGGCCCCGGATCGGCTGCCT (SEQ ID NO: 7) and AGGCAGCCGATCCGGGGCCTGTGTCTGCTT (SEQ ID NO: 8). Integrity and reading frame of all clones were sequence verified.
[0213]Recombinant Ciz1, Ciz1 fragments and point mutants were produced in BL21-pLysS (Stratagene) as glutathione S-transferase-tagged protein. This was purified from sonicated and cleared bacterial lysates by binding to glutathione sepharose 4B (Amersham). Recombinant protein was eluted by cleavage from the GST tag using PreScission protease (as recommended by the manufacturer, Amersham), into buffer (50 mM Tris-HCl pH 7.0, 150 mM NaCl, 1 mM DTT). This yielded protein preparations between 0.2 and 2.0 mg/ml. For replication assays serial dilutions were made in 100 mM Hepes pH 7.8, 1 mM DTT, 50% glycerol so that not more than 1 ml of protein solution was added to 10 ml replication assays, yielding the concentrations shown. Consistent with previous observations (Mitsui et al., 1999; Warder and Keherly, 2003) recombinant Ciz1, and derived fragment N-term442 migrated through SDS-PAGE with anomalously high molecular weight. Cyclin A-cdk2 was produced in bacteria as previously described (Coverley et al., 2002).
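Site-directed mutagenesis of this kind uses two primers that are exact reverse complements of one another. As a simple consistency check on the primer sequences listed above (SEQ ID NOs: 5 to 8), the following Python sketch computes the reverse complement of each forward primer and compares it with its partner. The function and variable names are illustrative only.

```python
# Primer pairs copied from the text (SEQ ID NOs: 5-8).
PRIMER_PAIRS = {
    "T(191/2)A": ("AACCCCCTCTTCCGCCGCCCCCAATCGCAAGA",
                  "TCTTGCGATTGGGGGCGGCGGAAGAGGGGGTT"),
    "T(293)A": ("AAGCAGACACAGGCCCCGGATCGGCTGCCT",
                "AGGCAGCCGATCCGGGGCCTGTGTCTGCTT"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward, reverse):
    """True if `reverse` is the exact reverse complement of `forward`."""
    return reverse_complement(forward) == reverse


for name, (fwd, rev) in PRIMER_PAIRS.items():
    print(name, len(fwd), is_complementary_pair(fwd, rev))
```

Both pairs pass this check, so the listed sequences are internally consistent with a complementary-primer mutagenesis design.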
[0214]Anti-Ciz1 Antibodies
[0215]Rabbit polyclonal antibody V1 (Coverley et al., 2000; Stoeber et al., 1998; Williams et al., 1998) was raised against an internal fragment of bacterially expressed human Cdc6 corresponding to amino-acids 145-360, and affinity purified by standard procedures (Harlow and Lane, 1988). This antibody reacts strongly with endogenous p100-Ciz1 and also with ECiz1 Nterm442 fragment. Alignment of Nterm442 with Cdc6 amino-acids 145-360 suggests that the shared epitope could be at 294-298 or 304-312 in mouse Ciz1. Recombinant Nterm442 was used to generate two Ciz1-specific polyclonal anti-sera designated 1793 and 1794 (Abcam). 1793 has been used routinely in the experiments described here. Its specificity was verified by reciprocal immuno-precipitation and western blot analysis with antibody V1, by inclusion of Nterm442 (25 μg/ml in antibody buffer, 10 mg/ml BSA, 0.02% SDS, 0.1% Triton X100 in PBS), which blocked reactivity with endogenous epitopes, and by siRNA-mediated depletion of Ciz1 that specifically reduced 1793 nuclear staining.
[0216]Immunoprecipitation
[0217]Asynchronously growing 3T3 cells were washed in PBS, rinsed in extraction buffer (20 mM Hepes pH 7.8, 5 mM potassium acetate, 0.5 mM magnesium chloride) supplemented with EDTA-free protease inhibitor cocktail (Roche) and scrape harvested as for replication extracts. Cells were lysed with 0.1% Triton X100 and the detergent-resistant pellet fraction was extracted with 0.3 M NaCl in extraction buffer. 5 μl of 1793 or 2 μl of antibody V were used per 100 μl of extract and incubated for 1 hour at 4° C. Antigen-antibody complexes were extracted with 100 μl of protein G-sepharose (Sigma) and beads were washed five times with 50 mM Tris pH 7.8, 1 mM EDTA, 0.1% NP40, 150 mM NaCl. Complexes were boiled in loading buffer (100 mM DTT, 2% SDS, 60 mM Tris pH 6.8, 0.001% bromophenol blue) and resolved by 6.5% SDS-polyacrylamide gel electrophoresis.
[0218]Immuno-Fluorescence
[0219]Cells were grown on coverslips and fixed in 4% paraformaldehyde, with or without brief pre-exposure to 0.05% Triton X100 in PBS. Endogenous Ciz1 was detected with 1793 serum diluted 1/2000 in antibody buffer following standard procedures. Mcm3 was detected with monoclonal antibody sc9850 (1/1000), Cdc6 with monoclonal sc9964 (1/100) and PCNA with monoclonal antibody PC10 (1/100, all Santa Cruz Biotechnology). Co-localization analysis of dual stained fluorescent confocal images was carried out as described (Rubbi and Milner, 2000; van Steensel et al., 1996).
[0220]Cell Synchrony
[0221]Mouse 3T3 cells were synchronized by release from quiescence as previously described (Coverley et al., 2002). Nuclei prepared from cells harvested 17 hours after release (referred to as `late-G1`) were used in all cell-free replication experiments described here. This yielded populations containing S phase nuclei, replication competent late G1 nuclei and unresponsive early G1/G0 nuclei, in varying proportions. Recipient, mid-G1 3T3 extracts were prepared at 15 hours (these typically contain approximately 5% S phase cells). The series of cell-free replication experiments described here required large amounts of standardized extract, therefore HeLa cells were used because they are easily synchronized in bulk. S phase HeLa extracts were prepared from cells released for two hours from two sequential thymidine-induced S phase blocks, as described (Krude et al., 1997).
[0222]Cell-Free DNA Replication
[0223]DNA replication assays were performed as described (Coverley et al., 2002; Krude et al., 1997). Briefly, 10 μl of mid G1 or S phase extract (supplemented with energy regenerating system, nucleotides and biotinylated dUTP), and 5×10^4 late G1 phase nuclei were incubated for 60 mins at 37° C. Reactions were supplemented with baculovirus lysate containing cyclin A-cdk2 (FIGS. 1B and C), where 0.1 μl of lysate has the same specific activity as 1 nM purified kinase (Coverley et al., 2002). All recombinant proteins were serially diluted in 100 mM Hepes pH 7.8, 1 mM DTT, 50% glycerol, so that not more than 1 μl was added to 10 μl replication assays, generating the concentrations indicated. Reactions were stopped with 50 μl of 0.5% Triton X100 and fixed by the addition of 50 μl of 8% paraformaldehyde, for 5 minutes. After transfer to coverslips, nuclei were stained with streptavidin-FITC (Amersham) and counterstained with Toto-3-iodide (Molecular Probes). The proportion of labelled nuclei was quantified by inspection at 1000× magnification, and all nuclei with fluorescent foci or intense uniform labelling were scored positive. Images of in vitro replicating nuclei were generated by confocal microscopy at 600× magnification, of samples counterstained with propidium iodide. For analysis of nuclear proteins, nuclei were re-isolated after 15 minutes exposure to initiating conditions, by diluting reactions two fold with cold PBS and gentle centrifugation.
[0224]Data Analysis and Presentation
[0225]Prior to use in initiation assays each preparation of synchronized G1 phase nuclei is tested so that the proportion of nuclei that are already in S phase is established (`% S`). To do this nuclei are incubated in an extract that is incapable of inducing initiation of DNA synthesis (from mid-G1 phase cells harvested 15 hours after release from quiescence), but that will efficiently support elongation DNA synthesis from origins that were initiated in vivo. The elongating fraction of nuclei incorporates labeled nucleotides efficiently during in vitro initiation assays but is uninformative. Routinely this fraction is pre-established and subtracted from the raw data. Synchronized populations in which 20% or less are in S phase are used for initiation assays.
[0226]When 3T3 cells are released from quiescence by the protocol used here no more than 70% of the total population enters S phase (Coverley et al., 2002). However, the highest observed replication frequency in vitro is nearer 50%; usually obtained by incubation with ECiz1. For the G1 population of 3T3 nuclei used here 17% were in S phase (% S) and the maximum number that replicated in any assay in vitro was 51% (% replication). Therefore, 34% of this population is competent to initiate replication in vitro (% C). Thus, for each data point in FIGS. 3B-F, % initiation=(% replication-% S)/% C×100.
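The normalization described above can be sketched in a few lines of Python. This is only an illustrative sketch of the stated arithmetic; the function names `percent_competent` and `percent_initiation` are hypothetical and do not come from the source. The numbers used are the ones quoted in the text (17% S, 51% maximum replication).

```python
def percent_competent(max_percent_replication: float, percent_s: float) -> float:
    """% C: nuclei competent to initiate in vitro (maximum replication minus % S)."""
    return max_percent_replication - percent_s


def percent_initiation(percent_replication: float, percent_s: float,
                       percent_c: float) -> float:
    """% initiation = (% replication - % S) / % C x 100."""
    if percent_c <= 0:
        raise ValueError("percent_c must be positive")
    return (percent_replication - percent_s) / percent_c * 100.0


# Values quoted in the text for the G1 population of 3T3 nuclei.
pct_s = 17.0
pct_c = percent_competent(51.0, pct_s)  # 34.0

# A reaction at the maximum observed replication frequency scores 100%,
# and one that shows only the pre-existing S phase fraction scores 0%.
print(percent_initiation(51.0, pct_s, pct_c))
print(percent_initiation(17.0, pct_s, pct_c))
```

Because % S is subtracted before scaling, nuclei that were already elongating when isolated do not contribute to the reported initiation value.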
[0227]RNA Interference
[0228]Endogenous Ciz1 was targeted in proliferating NIH3T3 cells using in vitro transcribed siRNAs (Ambion Silencer kit), directed against four regions of mouse Ciz1. Oligonucleotide sequences that were used to generate siRNAs are AAGCACAGTCACAGGAGCAGACCTGTCTC (SEQ ID NO: 9) and AATCTGCTCCTGTGACTGTGCCCTGTCTC (SEQ ID NO: 10) for siRNA 4, AATCTGTCACAAGTTCTACGACCTGTCTC (SEQ ID NO: 11) and AATCGTAGAACTTGTGACAGACCTGTCTC (SEQ ID NO: 12) for siRNA 8, AATCGCAAGGATTCTTCTTCTCCTGTCTC (SEQ ID NO: 13) and AAAGAAGAAGAATCCTTGCGACCTGTCTC (SEQ ID NO: 14) for siRNA 9, and AATCTGCAGCAGTTCTTTCCCCCTGTCTC (SEQ ID NO: 15) and AAGGGAAAGAACTGCTGCAGACCTGTCTC (SEQ ID NO: 16) for siRNA 11. Target sequences that are distributed throughout the Ciz1 transcript were chosen based on low secondary structure predictions and on location within exons that are consistently expressed in all known forms of Ciz1 (sequences 4, 8, 11), with the exception of one (siRNA 9) that is known to be alternatively spliced. Negative controls were untreated cells, mock-treated cells (transfection reagents but no siRNA) and cells treated with GAPDH siRNA (Ambion). Cy3 labelled siRNAs (Ambion) were used to estimate transfection efficiency, which was found to be greater than 95%. RNA interference experiments were performed in 24 well format starting with 2×10^4 cells per well in 500 μl of medium (DMEM with glutamax supplemented with 4% FCS). siRNAs were added 12 hours after plating using oligofectamine reagent for delivery (Invitrogen). Unless stated otherwise, siRNAs were used in pairs (at 2 nM total concentration in medium), as two doses with the second dose delivered in fresh medium 24 hours after the first. Results were assessed at 48 hours after first exposure, by counting cell number, S phase labelling, and immuno-staining. Northern blots were performed on RNAs isolated from cells treated for 24 hours with a single dose of siRNA, in reactions that were scaled up 5 fold.
RNA was prepared using Trizol Reagent (Invitrogen) and samples were electrophoresed through 1% agarose, transferred onto Hybond N+ nylon membrane (Amersham), and sequentially hybridized at 50° C. with cDNA probes using NorthernMax kit reagents (Ambion), following the manufacturer's instructions. The membrane was stripped between each hybridization using 0.5% SDS solution at 90° C., which was allowed to cool slowly to room temperature. Probes were [32P]-dCTP labelled using Random Primers DNA labelling system (Gibco BRL), and used in the following order: i. A 1.35 kb XmaI-XhoI fragment derived from ECiz1. ii. Human β-actin cDNA (Clontech) and iii. Mouse GAPDH cDNA (RNWAY laboratories). The membrane was washed twice in 2×SSC 0.2% SDS for 30-60 mins each, followed by one wash in 0.2×SSC 0.2% SDS for 30 mins, at 55-65° C., depending on probe used. Hybridization signals were quantified using an Amersham Biosciences Typhoon 9410 variable mode imager, and Image Quant TL software (v2002). Band intensities are expressed in arbitrary units (in parentheses), and results for Ciz1 and GAPDH were normalized against those for β-actin, and expressed as a %.
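The siRNA template oligonucleotides given in this section (SEQ ID NOs: 9-16) can be sanity-checked with a short script. The sketch below assumes, from inspection of the sequences, that each 29-mer ends in a shared 8-nucleotide 3' suffix (CCTGTCTC) and that the 19-nucleotide cores at positions 3-21 of the two oligonucleotides in a pair are reverse complements of each other. That reading of the template layout is an assumption, not something stated in the text, and the helper names are hypothetical.

```python
# Shared 3' suffix observed on all eight oligonucleotides (treated here as a
# kit-specific leader; this interpretation is an assumption).
SUFFIX = "CCTGTCTC"

# Template pairs as given in the sequence listing (SEQ ID NOs: 9-16).
TEMPLATES = {
    "siRNA 4": ("AAGCACAGTCACAGGAGCAGACCTGTCTC", "AATCTGCTCCTGTGACTGTGCCCTGTCTC"),
    "siRNA 8": ("AATCTGTCACAAGTTCTACGACCTGTCTC", "AATCGTAGAACTTGTGACAGACCTGTCTC"),
    "siRNA 9": ("AATCGCAAGGATTCTTCTTCTCCTGTCTC", "AAAGAAGAAGAATCCTTGCGACCTGTCTC"),
    "siRNA 11": ("AATCTGCAGCAGTTCTTTCCCCCTGTCTC", "AAGGGAAAGAACTGCTGCAGACCTGTCTC"),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def pair_is_consistent(first: str, second: str) -> bool:
    """Check length, shared 3' suffix, and complementarity of the 19-nt cores."""
    if len(first) != 29 or len(second) != 29:
        return False
    if not (first.endswith(SUFFIX) and second.endswith(SUFFIX)):
        return False
    # Positions 3-21 (0-based slice 2:21) form the assumed 19-nt duplex core.
    return second[2:21] == reverse_complement(first[2:21])


results = {name: pair_is_consistent(*pair) for name, pair in TEMPLATES.items()}
for name, ok in results.items():
    print(name, "consistent" if ok else "INCONSISTENT")
```

A check of this kind is a quick way to catch transcription errors in a template oligonucleotide before it is ordered or used.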
[0229]S Phase Labelling
[0230]The fraction of nuclei undergoing DNA synthesis in vivo was monitored by supplementing culture medium with 20 μM bromodeoxyuridine (BrdU, Sigma) for 20 minutes. Incorporated BrdU was visualized after acid treatment with FITC-conjugated anti-BrdU monoclonal antibody (Alexis Biochemicals) according to the manufacturer's instructions. Nuclei were counterstained with Hoechst 33258 and scored under high (1000×) magnification.
[0231]Green Fluorescent Protein Tagged Ciz1
[0232]Full-length mouse Ciz1 cDNA was obtained from UK HGMP Resource Centre (MGC clone 27988) and the sequence fully verified. A 2.8 kb SmaI-XbaI (blunt ended) full length Ciz1 fragment from this clone, and a 2.3 kb SmaI-XbaI (blunt ended) ECiz1 fragment from pTriplEx-clone L were ligated in frame with enhanced green fluorescent protein (EGFP) into the SmaI site of pEGFP-C3 (Clontech). pEGFP-C3 with no insert was used as a control. Constructs were transfected into NIH3T3 cells using TransIT-293 (Mirus), following the manufacturer's instructions, or microinjected into the male pro-nucleus of fertilized mouse eggs at the one cell stage. Growing 3T3 cells transfected with full length EGFP-Ciz1 or EGFP-ECiz1 were analysed by live cell fluorescent microscopy up to three days after transfection. DNA synthesis was monitored during the first 24 hours after transfection, by including the nucleotide analogue BrdU in cell culture medium for various time periods as indicated in figure legends. As described above, any cells undergoing DNA synthesis while exposed to BrdU stain with anti-BrdU monoclonal antibody, generating red nuclei.
[0233]Ciz1 transfected cells were also maintained under selection with 50 μg/ml G418, in standard culture medium (DMEM Glutamax plus 10% fetal calf serum) for up to a month, yielding cell populations with altered morphology.
[0234]EST Sequence Analysis
[0235]Individual expressed sequence tags (ESTs) mapping to NCBI unigene cluster Hs.23476 (human Ciz1) were translated using Genejockey and the predicted amino-acid sequence compared to the predicted sequence for full length Ciz1, with the aim of identifying recurrent changes in cancer cells. In order to exclude errors that reflect poor quality DNA sequence such as that which occurs at the end of long sequencing runs, only those changes positioned more than 8 amino-acids from the end of uninterrupted sequence are included in this analysis. Frame-shifts that are restored by a second alteration later in the read, and frame-shifts that are followed by a stop codon are only included if followed by uninterrupted sequence. Thus the majority of sequencing errors are excluded from this analysis. However, it is expected that many of the point mutations that remain (including frame-shifts and stops) reflect errors introduced during sequencing. Therefore, this analysis is aimed at uncovering trends, with weight being given to point mutations only if they appear more than once.
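The filtering rules described above can be illustrated with a small Python sketch. Everything here is hypothetical scaffolding: the `Change` record, its field names and the example data are invented for illustration, and the distance rule is read as "more than 8 residues from the nearest end of uninterrupted sequence", which is one plausible interpretation of the text rather than a confirmed implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    est_id: str          # identifier of the EST read (hypothetical)
    position: int        # residue position in the predicted full-length Ciz1
    description: str     # e.g. "X100Y" (hypothetical notation)
    dist_to_end: int     # residues to the nearest end of uninterrupted sequence


def recurrent_changes(changes, min_distance=8, min_count=2):
    """Keep changes far enough from read ends, then keep those seen in >= min_count ESTs."""
    kept = [c for c in changes if c.dist_to_end > min_distance]
    # Count each change at most once per EST.
    per_est = {(c.est_id, c.position, c.description) for c in kept}
    counts = Counter((pos, desc) for _, pos, desc in per_est)
    return {key: n for key, n in counts.items() if n >= min_count}


# Invented example data: one change seen in two reliable reads, once near a
# read end (ignored), and one singleton change.
example = [
    Change("EST_A", 100, "X100Y", 20),
    Change("EST_B", 100, "X100Y", 15),
    Change("EST_C", 100, "X100Y", 5),
    Change("EST_D", 200, "Q200*", 30),
]
print(recurrent_changes(example))
```

Requiring a change to recur in independent reads is what shifts the analysis from individual sequencing errors toward trends.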
[0236]Of 567 sequences that map to Ciz1 unigene cluster, we have analyzed most (all paediatric cancers, prostate and lung carcinomas, leukemias and lymphomas and a wide range of non-diseased tissues). Some were not mapped because they are extremely short reads or yielded very short amino-acid sequences upon translation, and for a small number we detected no homology to the Ciz1 coding sequence. A small number of ESTs were excluded from the analysis because of multiple frameshifts that produced stretches of homology in all three frames, with no indication of the reading frame used in vivo. These were all from cancer derived material, usually adenocarcinomas.
[0237]RT-PCR Analysis of Ciz1 Isoform Expression
[0238]RNA was isolated using Trizol reagent following recommended procedures, DNase treated and reverse transcribed using random hexamers and SuperScript II, then amplified with Ciz1 specific primers:
TABLE-US-00001
h/m5   CAGTCCCCACCACAGGCC          (SEQ ID NO: 17)
h/m2   GGCTTCCTCAGACCCCTCTG        (SEQ ID NO: 18)
H/m3   ACACAGACCTCTCCAGAGCACTTAG   (SEQ ID NO: 19)
H/m4   ATGGTGACCTTCAGGGAGC         (SEQ ID NO: 20)
H4     TCCTTGGCGATGTCCTCTGGGCAGG   (SEQ ID NO: 21)
H3     TCCCTCCTCAACGGCTCCATGCTGC   (SEQ ID NO: 22)
H6     CGTGGGGGCGACTTGAGCGTTGAGG   (SEQ ID NO: 23)
H1     GATGCCAGGGGTATGGGGCGCCGGG   (SEQ ID NO: 24)
H2     TCCGAGCCCTTCCACTCCTCTCTGG   (SEQ ID NO: 25)
[0239]Analysis of Ciz1 Protein Isoforms in Cancer Cell Lines
[0240]Cells were grown in DMEM with 10% FCS until sub-confluent, rinsed in cold Hepes-buffered saline supplemented with EDTA-free protease inhibitor cocktail (Roche), then scrape harvested and supplemented with 0.1% Triton X100. Detergent-insoluble material (including nuclei) was pelleted by gentle centrifugation to yield supernatant (SN) and pellet (P) fractions. These were boiled in reducing SDS-PAGE sample buffer and proteins resolved by electrophoresis through 8% SDS-PAGE. After transfer to nitrocellulose, Ciz1 isoforms were detected with anti-Ciz1 antibody 1793. All methods used in this analysis are well documented elsewhere.
REFERENCES
[0241]Bell, S. P. and Dutta, A. (2002). DNA replication in eukaryotic cells. Annu Rev Biochem 71, 333-74.
[0242]Cook, P. R. (1999). The organization of replication and transcription. Science 284, 1790-5.
[0243]Corpet, F. (1988). Multiple sequence alignment with hierarchical clustering. Nucl. Acids Res. 16, 10881-10890.
[0244]Coverley, D., Laman, H. and Laskey, R. A. (2002). Distinct roles for cyclins E and A during DNA replication complex assembly and activation. Nat Cell Biol 4, 523-8.
[0245]Coverley, D., Pelizon, C., Trewick, S. and Laskey, R. A. (2000). Chromatin bound Cdc6 persists in S and G2 phases in human cells, while soluble Cdc6 is destroyed in a cyclin A-cdk2 dependent process. J. Cell Sci. 113, 1929-1938.
[0246]Fujita, M. (1999). Cell cycle regulation of DNA replication initiation proteins in mammalian cells. Front Biosci 4, D816-23.
[0247]Hanahan, D. and Weinberg, R. A. (2000). The Hallmarks of Cancer. Cell 100, 57-70.
[0248]Harlow, E. and Lane, D. (1988). Antibodies: A laboratory manual. New York: Cold Spring Harbor Laboratory Press.
[0249]Jones, D. L., Alani, R. M. and Munger, K. (1997). The human papillomavirus E7 oncoprotein can uncouple cellular differentiation and proliferation in human keratinocytes by abrogating p21Cip1-mediated inhibition of cdk2. Genes Dev. 11, 2101-2111.
[0250]Krude, T. (2000). Initiation of human DNA replication in vitro using nuclei from cells arrested at an initiation-competent state. J. Biol. Chem. 275, 13699-13707.
[0251]Krude, T., Jackman, M., Pines, J. and Laskey, R. A. (1997). Cyclin/Cdk-dependent initiation of DNA replication in a human cell-free system. Cell 88, 109-119.
[0252]Laman, H., Coverley, D., Krude, T. K., Laskey, R. A. and Jones, N. (2001). Viral cyclin/cdk6 complexes initiate nuclear DNA replication. Mol. Cell. Biol. 21, 624-635.
[0253]Mercatante, D. R. and Kole, R. (2002). Control of alternative splicing by antisense oligonucleotides as a potential chemotherapy: effects on gene expression. Biochim Biophys Acta 1587, 126-32.
[0254]Mitsui, K., Matsumoto, A., Ohtsuka, S., Ohtsubo, M. and Yoshimura, A. (1999). Cloning and characterization of a novel p21cip1/waf1-interacting zinc finger protein, Ciz1. Biochem. Biophys. Res. Commun. 264, 457-464.
[0255]Nakayasu, H. and Berezney, R. (1991). Nuclear matrins: identification of the major nuclear matrix proteins. Proc Natl Acad Sci USA 88, 10312-6.
[0256]Ohnuma, S., Philpott, A. and Harris, W. A. (2001). Cell cycle and cell fate in the nervous system. Curr Opin Neurobiol 11, 66-73.
[0257]Parker, S. B., Eichele, G., Zhang, P., Rawls, A., Sands, A. T., Bradley, A., Olson, E. N., Harper, J. W. and Elledge, S. J. (1995). p53-independent expression of p21Cip1 in muscle and other terminally differentiating cells. Science 267, 1024-7.
[0258]Rubbi, C. P. and Milner, J. (2000). Non-activated p53 co-localizes with sites of transcription within both the nucleoplasm and the nucleolus. Oncogene 19, 85-96.
[0259]Sherr, C. J. and Roberts, J. M. (1999). CDK inhibitors: positive and negative regulators of G1-phase progression. Genes Dev. 13, 1501-1512.
[0260]Stoeber, K., Mills, A. D., Kubota, Y., Krude, T., Romanowski, P., Marheineke, K., Laskey, R. A. and Williams, G. H. (1998). Cdc6 protein causes premature entry into S phase in a mammalian cell-free system. EMBO J. 17, 7219-7229.
[0261]van Steensel, B., van Binnendijk, E. P., Hornsby, C. D., van der Voort, H. T., Krozowski, Z. S., de Kloet, E. R. and van Driel, R. (1996). Partial colocalization of glucocorticoid and mineralocorticoid receptors in discrete compartments in nuclei of rat hippocampus neurons. J Cell Sci 109 (Pt 4), 787-92.
[0262]Warder, D. E. and Keherly, M. J. (2003). Ciz1, Cip1 interacting zinc finger protein 1 binds the consensus DNA sequence ARYSR(0-2)YYAC. J Biomed Sci 10, 406-17.
[0263]Williams, G. H., Romanowski, P., Morris, L., Madine, M., Mills, A. D., Stoeber, K., Marr, J., Laskey, R. A. and Coleman, N. (1998). Improved cervical smear assessment using antibodies against proteins that regulate DNA replication. Proc. Natl. Acad. Sci. USA 95, 14932-14937.
[0264]Zezula, J., Casaccia-Bonnefil, P., Ezhevsky, S. A., Osterhout, D. J., Levine, J. M., Dowdy, S. F., Chao, M. V. and Koff, A. (2001). p21cip1 is required for the differentiation of oligodendrocytes independently of cell cycle withdrawal. EMBO Rep 2, 27-34.
Sequence CWU
1
74
SEQ ID NO: 1 (length 5, PRT, Homo sapiens)
Asp Ser Ser Ser Gln
SEQ ID NO: 2 (length 24, DNA, Homo sapiens)
gttgaggagg aactctgcaa gcag
SEQ ID NO: 3 (length 8, PRT, Homo sapiens)
Val Glu Glu Glu Leu Cys Lys Gln
SEQ ID NO: 4 (length 78, DNA, Homo sapiens)
gccacccaca ccacgaagag atgtgtttgc ccacgttcca gtgcaggggt ggagcacagc ccggcttgtt acagatat
SEQ ID NO: 5 (length 32, DNA, Artificial, Oligonucleotide primer)
aaccccctct tccgccgccc ccaatcgcaa ga
SEQ ID NO: 6 (length 32, DNA, Artificial, Oligonucleotide primer)
tcttgcgatt gggggcggcg gaagaggggg tt
SEQ ID NO: 7 (length 30, DNA, Artificial, Oligonucleotide primer)
aagcagacac aggccccgga tcggctgcct
SEQ ID NO: 8 (length 30, DNA, Artificial, Oligonucleotide primer)
aggcagccga tccggggcct gtgtctgctt
SEQ ID NO: 9 (length 29, DNA, Artificial, Oligonucleotide primer)
aagcacagtc acaggagcag acctgtctc
SEQ ID NO: 10 (length 29, DNA, Artificial, Oligonucleotide primer)
aatctgctcc tgtgactgtg ccctgtctc
SEQ ID NO: 11 (length 29, DNA, Artificial, Oligonucleotide primer)
aatctgtcac aagttctacg acctgtctc
SEQ ID NO: 12 (length 29, DNA, Artificial, Oligonucleotide primer)
aatcgtagaa cttgtgacag acctgtctc
SEQ ID NO: 13 (length 29, DNA, Artificial, Oligonucleotide primer)
aatcgcaagg attcttcttc tcctgtctc
SEQ ID NO: 14 (length 29, DNA, Artificial, Oligonucleotide primer)
aaagaagaag aatccttgcg acctgtctc
SEQ ID NO: 15 (length 29, DNA, Artificial, Oligonucleotide primer)
aatctgcagc agttctttcc ccctgtctc
SEQ ID NO: 16 (length 29, DNA, Artificial, Oligonucleotide primer)
aagggaaaga actgctgcag acctgtctc
SEQ ID NO: 17 (length 18, DNA, Artificial, Oligonucleotide primer)
cagtccccac cacaggcc
SEQ ID NO: 18 (length 20, DNA, Artificial, Oligonucleotide primer)
ggcttcctca gacccctctg
SEQ ID NO: 19 (length 25, DNA, Artificial, Oligonucleotide primer)
acacagacct ctccagagca cttag
SEQ ID NO: 20 (length 19, DNA, Artificial, Oligonucleotide primer)
atggtgacct tcagggagc
SEQ ID NO: 21 (length 25, DNA, Artificial, Oligonucleotide primer)
tccttggcga tgtcctctgg gcagg
SEQ ID NO: 22 (length 25, DNA, Artificial, Oligonucleotide primer)
tccctcctca acggctccat gctgc
SEQ ID NO: 23 (length 25, DNA, Artificial, Oligonucleotide primer)
cgtgggggcg acttgagcgt tgagg
SEQ ID NO: 24 (length 25, DNA, Artificial, Oligonucleotide primer)
gatgccaggg gtatggggcg ccggg
SEQ ID NO: 25 (length 25, DNA, Artificial, Oligonucleotide primer)
tccgagccct tccactcctc tctgg
SEQ ID NO: 26 (length 845, PRT, Mus musculus)
Met Phe
Asn Pro Gln Leu Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln1 5
10 15Gln Gln Leu Gln Gln Gln Leu Gln
Gln Gln Gln Leu Gln Gln Gln Gln 20 25
30Gln Gln Ile Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro Pro
Gln 35 40 45Ala Ser Leu Ser Ile
Pro Val Ser Arg Gly Leu Pro Gln Gln Ser Ser 50 55
60Pro Gln Gln Leu Leu Ser Leu Gln Gly Leu His Ser Thr Ser
Leu Leu65 70 75 80Asn
Gly Pro Met Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly
85 90 95Leu Asp Gln Phe Ala Met Pro
Pro Ala Thr Tyr Asp Gly Ala Ser Leu 100 105
110Thr Met Pro Thr Ala Thr Leu Gly Asn Leu Arg Ala Phe Asn
Val Thr 115 120 125Ala Pro Ser Leu
Ala Ala Pro Ser Leu Thr Pro Pro Gln Met Val Thr 130
135 140Pro Asn Leu Gln Gln Phe Phe Pro Gln Ala Thr Arg
Gln Ser Leu Leu145 150 155
160Gly Pro Pro Pro Val Gly Val Pro Ile Asn Pro Ser Gln Leu Asn His
165 170 175Ser Gly Arg Asn Thr
Gln Lys Gln Ala Arg Thr Pro Ser Ser Thr Thr 180
185 190Pro Asn Arg Lys Asp Ser Ser Ser Gln Thr Val Pro
Leu Glu Asp Arg 195 200 205Glu Asp
Pro Thr Glu Gly Ser Glu Glu Ala Thr Glu Leu Gln Met Asp 210
215 220Thr Cys Glu Asp Gln Asp Ser Leu Val Gly Pro
Asp Ser Met Leu Ser225 230 235
240Glu Pro Gln Val Pro Glu Pro Glu Pro Phe Glu Thr Leu Glu Pro Pro
245 250 255Ala Lys Arg Cys
Arg Ser Ser Glu Glu Ser Thr Glu Lys Gly Pro Thr 260
265 270Gly Gln Pro Gln Ala Arg Val Gln Pro Gln Thr
Gln Met Thr Ala Pro 275 280 285Lys
Gln Thr Gln Thr Pro Asp Arg Leu Pro Glu Pro Pro Glu Val Gln 290
295 300Met Leu Pro Arg Ile Gln Pro Gln Ala Leu
Gln Ile Gln Thr Gln Pro305 310 315
320Lys Leu Leu Arg Gln Ala Gln Thr Gln Thr Ser Pro Glu His Leu
Ala 325 330 335Pro Gln Gln
Asp Gln Val Glu Pro Gln Val Pro Ser Gln Pro Pro Trp 340
345 350Gln Leu Gln Pro Arg Glu Thr Asp Pro Pro
Asn Gln Ala Gln Ala Gln 355 360
365Thr Gln Pro Gln Pro Leu Trp Gln Ala Gln Ser Gln Lys Gln Ala Gln 370
375 380Thr Gln Ala His Pro Gln Val Pro
Thr Gln Ala Gln Ser Gln Glu Gln385 390
395 400Thr Ser Glu Lys Thr Gln Asp Gln Pro Gln Thr Trp
Pro Gln Gly Ser 405 410
415Val Pro Pro Pro Glu Gln Ala Ser Gly Pro Ala Cys Ala Thr Glu Pro
420 425 430Gln Leu Ser Ser His Ala
Ala Glu Ala Gly Ser Asp Pro Asp Lys Ala 435 440
445Leu Pro Glu Pro Val Ser Ala Gln Ser Ser Glu Asp Arg Ser
Arg Glu 450 455 460Ala Ser Ala Gly Gly
Leu Asp Leu Gly Glu Cys Glu Lys Arg Ala Gly465 470
475 480Glu Met Leu Gly Met Trp Gly Ala Gly Ser
Ser Leu Lys Val Thr Ile 485 490
495Leu Gln Ser Ser Asn Ser Arg Ala Phe Asn Thr Thr Pro Leu Thr Ser
500 505 510Gly Pro Arg Pro Gly
Asp Ser Thr Ser Ala Thr Pro Ala Ile Ala Ser 515
520 525Thr Pro Ser Lys Gln Ser Leu Gln Phe Phe Cys Tyr
Ile Cys Lys Ala 530 535 540Ser Ser Ser
Ser Gln Gln Glu Phe Gln Asp His Met Ser Glu Ala Gln545
550 555 560His Gln Gln Arg Leu Gly Glu
Ile Gln His Ser Ser Gln Thr Cys Leu 565
570 575Leu Ser Leu Leu Pro Met Pro Arg Asp Ile Leu Glu
Lys Glu Ala Glu 580 585 590Asp
Pro Pro Pro Lys Arg Trp Cys Asn Thr Cys Gln Val Tyr Tyr Val 595
600 605Gly Asp Leu Ile Gln His Arg Arg Thr
Gln Glu His Lys Val Ala Lys 610 615
620Gln Ser Leu Arg Pro Phe Cys Thr Ile Cys Asn Arg Tyr Phe Lys Thr625
630 635 640Pro Arg Lys Phe
Val Glu His Val Lys Ser Gln Gly His Lys Asp Lys 645
650 655Ala Gln Glu Leu Lys Thr Leu Glu Lys Glu
Thr Gly Ser Pro Asp Glu 660 665
670Asp His Phe Ile Thr Val Asp Ala Val Gly Cys Phe Glu Ser Gly Gln
675 680 685Glu Glu Asp Glu Asp Asp Asp
Glu Glu Glu Glu Glu Glu Gly Glu Ile 690 695
700Glu Ala Glu Glu Glu Phe Cys Lys Gln Val Lys Pro Arg Glu Thr
Ser705 710 715 720Ser Glu
Gln Gly Lys Gly Ser Glu Thr Tyr Asn Pro Asn Thr Ala Tyr
725 730 735Gly Glu Asp Phe Leu Val Pro
Val Met Gly Tyr Val Cys Gln Ile Cys 740 745
750His Lys Phe Tyr Asp Ser Asn Ser Glu Leu Arg Leu Ser His
Cys Lys 755 760 765Ser Leu Ala His
Phe Glu Asn Leu Gln Lys Tyr Lys Ala Lys Asn Pro 770
775 780Ser Pro Pro Pro Thr Arg Pro Val Ser Arg Lys Cys
Ala Ile Asn Ala785 790 795
800Arg Asn Ala Leu Thr Ala Leu Phe Thr Ser Ser His Gln Pro Ser Pro
805 810 815Gln Asp Thr Val Lys
Met Pro Ser Lys Val Lys Pro Gly Ser Pro Gly 820
825 830Leu Pro Pro Pro Leu Arg Arg Ser Thr Arg Leu Lys
Thr 835 840 84527716PRTMus
musculus 27Ser Thr Ser Leu Leu Asn Gly Pro Met Leu Gln Arg Ala Leu Leu
Leu1 5 10 15Gln Gln Leu
Gln Gly Leu Asp Gln Phe Ala Met Pro Pro Ala Thr Tyr 20
25 30Asp Gly Ala Ser Leu Thr Met Pro Thr Ala
Thr Leu Gly Asn Leu Arg 35 40
45Ala Phe Asn Val Thr Ala Pro Ser Leu Ala Ala Pro Ser Leu Thr Pro 50
55 60Pro Gln Met Val Thr Pro Asn Leu Gln
Gln Phe Phe Pro Gln Ala Thr65 70 75
80Arg Gln Ser Leu Leu Gly Pro Pro Pro Val Gly Val Pro Ile
Asn Pro 85 90 95Ser Gln
Leu Asn His Ser Gly Arg Asn Thr Gln Lys Gln Ala Arg Thr 100
105 110Pro Ser Ser Thr Thr Pro Asn Arg Lys
Thr Val Pro Leu Glu Asp Arg 115 120
125Glu Asp Pro Thr Glu Gly Ser Glu Glu Ala Thr Glu Leu Gln Met Asp
130 135 140Thr Cys Glu Asp Gln Asp Ser
Leu Val Gly Pro Asp Ser Met Leu Ser145 150
155 160Glu Pro Gln Val Pro Glu Pro Glu Pro Phe Glu Thr
Leu Glu Pro Pro 165 170
175Ala Lys Arg Cys Arg Ser Ser Glu Glu Ser Thr Glu Lys Gly Pro Thr
180 185 190Gly Gln Pro Gln Ala Arg
Val Gln Pro Gln Thr Gln Met Thr Ala Pro 195 200
205Lys Gln Thr Gln Thr Pro Asp Arg Leu Pro Glu Pro Pro Glu
Val Gln 210 215 220Met Leu Pro Arg Ile
Gln Pro Gln Ala Leu Gln Ile Gln Thr Gln Pro225 230
235 240Lys Leu Leu Arg Gln Ala Gln Thr Gln Thr
Ser Pro Glu His Leu Ala 245 250
255Pro Gln Gln Asp Gln Val Pro Thr Gln Ala Gln Ser Gln Glu Gln Thr
260 265 270Ser Glu Lys Thr Gln
Asp Gln Pro Gln Thr Trp Pro Gln Gly Ser Val 275
280 285Pro Pro Pro Glu Gln Ala Ser Gly Pro Ala Cys Ala
Thr Glu Pro Gln 290 295 300Leu Ser Ser
His Ala Ala Glu Ala Gly Ser Asp Pro Asp Lys Ala Leu305
310 315 320Pro Glu Pro Val Ser Ala Gln
Ser Ser Glu Asp Arg Ser Arg Glu Ala 325
330 335Ser Ala Gly Gly Leu Asp Leu Gly Glu Cys Glu Lys
Arg Ala Gly Glu 340 345 350Met
Leu Gly Met Trp Gly Ala Gly Ser Ser Leu Lys Val Thr Ile Leu 355
360 365Gln Ser Ser Asn Ser Arg Ala Phe Asn
Thr Thr Pro Leu Thr Ser Gly 370 375
380Pro Arg Pro Gly Asp Ser Thr Ser Ala Thr Pro Ala Ile Ala Ser Thr385
390 395 400Pro Ser Lys Gln
Ser Leu Gln Phe Phe Cys Tyr Ile Cys Lys Ala Ser 405
410 415Ser Ser Ser Gln Gln Glu Phe Gln Asp His
Met Ser Glu Ala Gln His 420 425
430Gln Gln Arg Leu Gly Glu Ile Gln His Ser Ser Gln Thr Cys Leu Leu
435 440 445Ser Leu Leu Pro Met Pro Arg
Asp Ile Leu Glu Lys Glu Ala Glu Asp 450 455
460Pro Pro Pro Lys Arg Trp Cys Asn Thr Cys Gln Val Tyr Tyr Val
Gly465 470 475 480Asp Leu
Ile Gln His Arg Arg Thr Gln Glu His Lys Val Ala Lys Gln
485 490 495Ser Leu Arg Pro Phe Cys Thr
Ile Cys Asn Arg Tyr Phe Lys Thr Pro 500 505
510Arg Lys Phe Val Glu His Val Lys Ser Gln Gly His Lys Asp
Lys Ala 515 520 525Gln Glu Leu Lys
Thr Leu Glu Lys Glu Thr Gly Ser Pro Asp Glu Asp 530
535 540His Phe Ile Thr Val Asp Ala Val Gly Cys Phe Glu
Ser Gly Gln Glu545 550 555
560Glu Asp Glu Asp Asp Asp Glu Glu Glu Glu Glu Glu Gly Glu Ile Glu
565 570 575Ala Glu Glu Glu Phe
Cys Lys Gln Val Lys Pro Arg Glu Thr Ser Ser 580
585 590Glu Gln Gly Lys Gly Ser Glu Thr Tyr Asn Pro Asn
Thr Ala Tyr Gly 595 600 605Glu Asp
Phe Leu Val Pro Val Met Gly Tyr Val Cys Gln Ile Cys His 610
615 620Lys Phe Tyr Asp Ser Asn Ser Glu Leu Arg Leu
Ser His Cys Lys Ser625 630 635
640Leu Ala His Phe Glu Asn Leu Gln Lys Tyr Lys Ala Lys Asn Pro Ser
645 650 655Pro Pro Pro Thr
Arg Pro Val Ser Arg Lys Cys Ala Ile Asn Ala Arg 660
665 670Asn Ala Leu Thr Ala Leu Phe Thr Ser Ser His
Gln Pro Ser Pro Gln 675 680 685Asp
Thr Val Lys Met Pro Ser Lys Val Lys Pro Gly Ser Pro Gly Leu 690
695 700Pro Pro Pro Leu Arg Arg Ser Thr Arg Leu
Lys Thr705 710 71528714PRTMus musculus
28Met Phe Asn Pro Gln Leu Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln1
5 10 15Gln Gln Leu Gln Gln Gln
Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln 20 25
30Gln Gln Ile Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser
Pro Pro Gln 35 40 45Ala Ser Leu
Ser Ile Pro Val Ser Arg Gly Leu Pro Gln Gln Ser Ser 50
55 60Pro Gln Gln Leu Leu Ser Leu Gln Gly Leu His Ser
Thr Ser Leu Leu65 70 75
80Asn Gly Pro Met Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly
85 90 95Leu Asp Gln Phe Ala Met
Pro Pro Ala Thr Tyr Asp Gly Ala Ser Leu 100
105 110Thr Met Pro Thr Ala Thr Leu Gly Asn Leu Arg Ala
Phe Asn Val Thr 115 120 125Ala Pro
Ser Leu Ala Ala Pro Ser Leu Thr Pro Pro Gln Met Val Thr 130
135 140Pro Asn Leu Gln Gln Phe Phe Pro Gln Ala Thr
Arg Gln Ser Leu Leu145 150 155
160Gly Pro Pro Pro Val Gly Val Pro Ile Asn Pro Ser Gln Leu Asn His
165 170 175Ser Gly Arg Asn
Thr Gln Lys Gln Ala Arg Thr Pro Ser Ser Thr Thr 180
185 190Pro Asn Arg Lys Thr Val Pro Leu Glu Asp Arg
Glu Asp Pro Thr Glu 195 200 205Gly
Ser Glu Glu Ala Thr Glu Leu Gln Met Asp Thr Cys Glu Asp Gln 210
215 220Asp Ser Leu Val Gly Pro Asp Ser Met Leu
Ser Glu Pro Gln Val Pro225 230 235
240Glu Pro Glu Pro Phe Glu Thr Leu Glu Pro Pro Ala Lys Arg Cys
Arg 245 250 255Ser Ser Glu
Glu Ser Thr Glu Lys Gly Pro Thr Gly Gln Pro Gln Ala 260
265 270Arg Val Gln Pro Gln Thr Gln Met Thr Ala
Pro Lys Gln Thr Gln Thr 275 280
285Pro Asp Arg Leu Pro Glu Pro Pro Glu Val Gln Met Leu Pro Arg Ile 290
295 300Gln Pro Gln Ala Leu Gln Ile Gln
Thr Gln Pro Lys Leu Leu Arg Gln305 310
315 320Ala Gln Thr Gln Thr Ser Pro Glu His Leu Ala Pro
Gln Gln Asp Gln 325 330
335Val Pro Thr Gln Ala Gln Ser Gln Glu Gln Thr Ser Glu Lys Thr Gln
340 345 350Asp Gln Pro Gln Thr Trp
Pro Gln Gly Ser Val Pro Pro Pro Glu Gln 355 360
365Ala Ser Gly Pro Ala Cys Ala Thr Glu Pro Gln Leu Ser Ser
His Ala 370 375 380Ala Glu Ala Gly Ser
Asp Pro Asp Lys Ala Leu Pro Glu Pro Val Ser385 390
395 400Ala Gln Ser Ser Glu Asp Arg Ser Arg Glu
Ala Ser Ala Gly Gly Leu 405 410
415Asp Leu Gly Glu Cys Glu Lys Arg Ala Gly Glu Met Leu Gly Met Trp
420 425 430Gly Ala Gly Ser Ser
Leu Lys Val Thr Ile Leu Gln Ser Ser Asn Ser 435
440 445Arg Ala Phe Asn Thr Thr Pro Leu Thr Ser Gly Pro
Ser Pro Gly Asp 450 455 460Ser Thr Ser
Ala Thr Pro Ala Ile Ala Ser Thr Pro Ser Lys Gln Ser465
470 475 480Leu Gln Phe Phe Cys Tyr Ile
Cys Lys Ala Ser Ser Ser Ser Gln Gln 485
490 495Glu Phe Gln Asp His Met Ser Glu Ala Gln His Gln
Gln Arg Leu Gly 500 505 510Glu
Ile Gln His Ser Ser Gln Thr Cys Leu Leu Ser Leu Leu Pro Met 515
520 525Pro Arg Asp Ile Leu Glu Lys Glu Ala
Glu Asp Pro Pro Pro Lys Arg 530 535
540Trp Cys Asn Thr Cys Gln Val Tyr Tyr Val Gly Asp Leu Ile Gln His545
550 555 560Arg Arg Thr Gln
Glu His Lys Val Ala Lys Gln Ser Leu Arg Pro Phe 565
570 575Cys Thr Ile Cys Asn Arg Tyr Phe Lys Thr
Pro Arg Lys Phe Val Glu 580 585
590His Val Lys Ser Gln Gly His Lys Asp Lys Ala Gln Glu Leu Lys Thr
595 600 605Leu Glu Lys Glu Thr Gly Ser
Pro Asp Glu Asp His Phe Ile Thr Val 610 615
620Glu Ala Val Gly Cys Phe Glu Ser Gly Gln Glu Glu Asp Glu Asp
Asp625 630 635 640Asp Glu
Glu Glu Glu Glu Glu Gly Glu Ile Glu Ala Glu Glu Glu Phe
645 650 655Cys Lys Gln Val Lys Pro Arg
Glu Thr Ser Ser Glu Gln Gly Lys Gly 660 665
670Ser Glu Thr Tyr Asn Pro Asn Thr Ala Tyr Gly Glu Asp Phe
Leu Val 675 680 685Pro Val Met Gly
Tyr Val Cys Gln Ile Cys His Lys Phe Tyr Asp Ser 690
695 700Asn Ser Glu Leu Arg Leu Ser His Cys Lys705
71029898PRTHomo sapiens 29Met Phe Ser Gln Gln Gln Gln Gln Gln Leu
Gln Gln Gln Gln Gln Gln1 5 10
15Leu Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln
20 25 30Gln Gln Leu Leu Gln Leu
Gln Gln Leu Leu Gln Gln Ser Pro Pro Gln 35 40
45Ala Pro Leu Pro Met Ala Val Ser Arg Gly Leu Pro Pro Gln
Gln Pro 50 55 60Gln Gln Pro Leu Leu
Asn Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu65 70
75 80Asn Gly Ser Met Leu Gln Arg Ala Leu Leu
Leu Gln Gln Leu Gln Gly 85 90
95Leu Asp Gln Phe Ala Met Pro Pro Ala Thr Tyr Asp Thr Ala Gly Leu
100 105 110Thr Met Pro Thr Ala
Thr Leu Gly Asn Leu Arg Gly Tyr Gly Met Ala 115
120 125Ser Pro Gly Leu Ala Ala Pro Ser Leu Thr Pro Pro
Gln Leu Ala Thr 130 135 140Pro Asn Leu
Gln Gln Phe Phe Pro Gln Ala Thr Arg Gln Ser Leu Leu145
150 155 160Gly Pro Pro Pro Val Gly Val
Pro Met Asn Pro Ser Gln Phe Asn Leu 165
170 175Ser Gly Arg Asn Pro Gln Lys Gln Ala Arg Thr Ser
Ser Ser Thr Thr 180 185 190Pro
Asn Arg Lys Asp Ser Ser Ser Gln Thr Met Pro Val Glu Asp Lys 195
200 205Ser Asp Pro Pro Glu Gly Ser Glu Glu
Ala Ala Glu Pro Arg Met Asp 210 215
220Thr Pro Glu Asp Gln Asp Leu Pro Pro Cys Pro Glu Asp Ile Ala Lys225
230 235 240Glu Lys Arg Thr
Pro Ala Pro Glu Pro Glu Pro Cys Glu Ala Ser Glu 245
250 255Leu Pro Ala Lys Arg Leu Arg Ser Ser Glu
Glu Pro Thr Glu Lys Glu 260 265
270Pro Pro Gly Gln Leu Gln Val Lys Ala Gln Pro Gln Ala Arg Met Thr
275 280 285Val Pro Lys Gln Thr Gln Thr
Pro Asp Leu Leu Pro Glu Ala Leu Glu 290 295
300Ala Gln Val Leu Pro Arg Phe Gln Pro Arg Val Leu Gln Val Gln
Ala305 310 315 320Gln Val
Gln Ser Gln Thr Gln Pro Arg Ile Pro Ser Thr Asp Thr Gln
325 330 335Val Gln Pro Lys Leu Gln Lys
Gln Ala Gln Thr Gln Thr Ser Pro Glu 340 345
350His Leu Val Leu Gln Gln Lys Gln Val Gln Pro Gln Leu Gln
Gln Glu 355 360 365Ala Glu Pro Gln
Lys Gln Val Gln Pro Gln Val Gln Pro Gln Ala His 370
375 380Ser Gln Gly Pro Arg Gln Val Gln Leu Gln Gln Glu
Ala Glu Pro Leu385 390 395
400Lys Gln Val Gln Pro Gln Val Gln Pro Gln Ala His Ser Gln Pro Pro
405 410 415Arg Gln Val Gln Leu
Gln Leu Gln Lys Gln Val Gln Thr Gln Thr Tyr 420
425 430Pro Gln Val His Thr Gln Ala Gln Pro Ser Val Gln
Pro Gln Glu His 435 440 445Pro Pro
Ala Gln Val Ser Val Gln Pro Pro Glu Gln Thr His Glu Gln 450
455 460Pro His Thr Gln Pro Gln Val Ser Leu Leu Ala
Pro Glu Gln Thr Pro465 470 475
480Val Val Val His Val Cys Gly Leu Glu Met Pro Pro Asp Ala Val Glu
485 490 495Ala Gly Gly Gly
Met Glu Lys Thr Leu Pro Glu Pro Val Gly Thr Gln 500
505 510Val Ser Met Glu Glu Ile Gln Asn Glu Ser Ala
Cys Gly Leu Asp Val 515 520 525Gly
Glu Cys Glu Asn Arg Ala Arg Glu Met Pro Gly Val Trp Gly Ala 530
535 540Gly Gly Ser Leu Lys Val Thr Ile Leu Gln
Ser Ser Asp Ser Arg Ala545 550 555
560Phe Ser Thr Val Pro Leu Thr Pro Val Pro Arg Pro Ser Asp Ser
Val 565 570 575Ser Ser Thr
Pro Ala Ala Thr Ser Thr Pro Ser Lys Gln Ala Leu Gln 580
585 590Phe Phe Cys Tyr Ile Cys Lys Ala Ser Cys
Ser Ser Gln Gln Glu Phe 595 600
605Gln Asp His Met Ser Glu Pro Gln His Gln Gln Arg Leu Gly Glu Ile 610
615 620Gln His Met Ser Gln Ala Cys Leu
Leu Ser Leu Leu Pro Val Pro Arg625 630
635 640Asp Val Leu Glu Thr Glu Asp Glu Glu Pro Pro Pro
Arg Arg Trp Cys 645 650
655Asn Thr Cys Gln Leu Tyr Tyr Met Gly Asp Leu Ile Gln His Arg Arg
660 665 670Thr Gln Asp His Lys Ile
Ala Lys Gln Ser Leu Arg Pro Phe Cys Thr 675 680
685Val Cys Asn Arg Tyr Phe Lys Thr Pro Arg Lys Phe Val Glu
His Val 690 695 700Lys Ser Gln Gly His
Lys Asp Lys Ala Lys Glu Leu Lys Ser Leu Glu705 710
715 720Lys Glu Ile Ala Gly Gln Asp Glu Asp His
Phe Ile Thr Val Asp Ala 725 730
735Val Gly Cys Phe Glu Gly Asp Glu Glu Glu Glu Glu Asp Asp Glu Asp
740 745 750Glu Glu Glu Ile Glu
Val Glu Glu Glu Leu Cys Lys Gln Val Arg Ser 755
760 765Arg Asp Ile Ser Arg Glu Glu Trp Lys Gly Ser Glu
Thr Tyr Ser Pro 770 775 780Asn Thr Ala
Tyr Gly Val Asp Phe Leu Val Pro Val Met Gly Tyr Ile785
790 795 800Cys Arg Ile Cys His Lys Phe
Tyr His Ser Asn Ser Gly Ala Gln Leu 805
810 815Ser His Cys Lys Ser Leu Gly His Phe Glu Asn Leu
Gln Lys Tyr Lys 820 825 830Ala
Ala Lys Asn Pro Ser Pro Thr Thr Arg Pro Val Ser Arg Arg Cys 835
840 845Ala Ile Asn Ala Arg Asn Ala Leu Thr
Ala Leu Phe Thr Ser Ser Gly 850 855
860Arg Pro Pro Ser Gln Pro Asn Thr Gln Asp Lys Thr Pro Ser Lys Val865
870 875 880Thr Ala Arg Pro
Ser Gln Pro Pro Leu Pro Arg Arg Ser Thr Arg Leu 885
890 895
Lys Thr
SEQ ID NO: 30 (length 898, PRT, Homo sapiens)
Met Phe Ser
Gln Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln Gln Gln1 5
10 15Leu Gln Gln Leu Gln Gln Gln Gln Leu
Gln Gln Gln Gln Leu Gln Gln 20 25
30Gln Gln Leu Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro Pro Gln
35 40 45Ala Pro Leu Pro Met Ala Val
Ser Arg Gly Leu Pro Pro Gln Gln Pro 50 55
60Gln Gln Pro Leu Leu Asn Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu65
70 75 80Asn Gly Ser Met
Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly 85
90 95Leu Asp Gln Phe Ala Met Pro Pro Ala Thr
Tyr Asp Thr Ala Gly Leu 100 105
110Thr Met Pro Thr Ala Thr Leu Gly Asn Leu Arg Gly Tyr Gly Met Ala
115 120 125Ser Pro Gly Leu Ala Ala Pro
Ser Leu Thr Pro Pro Gln Leu Ala Thr 130 135
140Pro Asn Leu Gln Gln Phe Phe Pro Gln Ala Thr Arg Gln Ser Leu
Leu145 150 155 160Gly Pro
Pro Pro Val Gly Val Pro Met Asn Pro Ser Gln Phe Asn Leu
165 170 175Ser Gly Arg Asn Pro Gln Lys
Gln Ala Arg Thr Ser Ser Ser Thr Thr 180 185
190Pro Asn Arg Lys Asp Ser Ser Ser Gln Thr Met Pro Val Glu
Asp Lys 195 200 205Ser Asp Pro Pro
Glu Gly Ser Glu Glu Ala Ala Glu Pro Arg Met Asp 210
215 220Thr Pro Glu Asp Gln Asp Leu Leu Pro Cys Pro Glu
Asp Ile Ala Lys225 230 235
240Glu Lys Arg Thr Pro Ala Pro Glu Pro Glu Pro Cys Glu Ala Ser Glu
245 250 255Leu Pro Ala Lys Arg
Leu Arg Ser Ser Glu Glu Pro Thr Glu Lys Glu 260
265 270Pro Pro Gly Gln Leu Gln Val Lys Ala Gln Pro Gln
Ala Arg Met Thr 275 280 285Val Pro
Lys Gln Thr Gln Thr Pro Asp Leu Leu Pro Glu Ala Leu Glu 290
295 300Ala Gln Val Leu Pro Arg Phe Gln Pro Arg Val
Leu Gln Val Gln Ala305 310 315
320Gln Val Gln Ser Gln Thr Gln Pro Arg Ile Pro Ser Thr Asp Thr Gln
325 330 335Val Gln Pro Lys
Leu Gln Lys Gln Ala Gln Thr Gln Thr Ser Pro Glu 340
345 350His Leu Val Leu Gln Gln Lys Gln Val Gln Pro
Gln Leu Gln Gln Glu 355 360 365Ala
Glu Pro Gln Lys Gln Val Gln Pro Gln Val Gln Pro Gln Ala His 370
375 380Ser Gln Gly Pro Arg Gln Val Gln Leu Gln
Gln Glu Ala Glu Pro Leu385 390 395
400Lys Gln Val Gln Pro Gln Val Gln Pro Gln Ala His Ser Gln Pro
Pro 405 410 415Arg Gln Val
Gln Leu Gln Leu Gln Lys Gln Val Gln Thr Gln Thr Tyr 420
425 430Pro Gln Val His Thr Gln Ala Gln Pro Ser
Val Gln Pro Gln Glu His 435 440
445Pro Pro Ala Gln Val Ser Val Gln Pro Pro Glu Gln Thr His Glu Gln 450
455 460Pro His Thr Gln Pro Gln Val Ser
Leu Leu Ala Pro Glu Gln Thr Pro465 470
475 480Val Val Val His Val Cys Gly Leu Glu Met Pro Pro
Asp Ala Val Glu 485 490
495Ala Gly Gly Gly Met Glu Lys Thr Leu Pro Glu Pro Val Gly Thr Gln
500 505 510Val Ser Met Glu Glu Ile
Gln Asn Glu Ser Ala Cys Gly Leu Asp Val 515 520
525Gly Glu Cys Glu Asn Arg Ala Arg Glu Met Pro Gly Val Trp
Gly Ala 530 535 540Gly Gly Ser Leu Lys
Val Thr Ile Leu Gln Gly Ser Asp Ser Arg Ala545 550
555 560Phe Ser Thr Val Pro Leu Thr Pro Val Pro
Arg Pro Ser Asp Ser Val 565 570
575Ser Ser Thr Pro Ala Ala Thr Ser Thr Pro Ser Lys Gln Ala Leu Gln
580 585 590Phe Phe Cys Tyr Ile
Cys Lys Ala Ser Cys Ser Ser Gln Gln Glu Phe 595
600 605Gln Asp His Met Ser Glu Pro Gln His Gln Gln Arg
Leu Gly Glu Ile 610 615 620Gln His Met
Ser Gln Ala Cys Leu Leu Ser Leu Leu Pro Val Pro Arg625
630 635 640Asp Val Leu Glu Thr Glu Asp
Glu Glu Pro Pro Pro Arg Arg Trp Cys 645
650 655Asn Thr Cys Gln Leu Tyr Tyr Met Gly Asp Leu Ile
Gln His Arg Arg 660 665 670Thr
Gln Asp His Lys Ile Ala Lys Gln Ser Leu Arg Pro Phe Cys Thr 675
680 685Val Cys Asn Arg Tyr Phe Lys Thr Pro
Arg Lys Phe Val Glu His Val 690 695
700Lys Ser Gln Gly His Lys Asp Lys Ala Lys Glu Leu Lys Ser Leu Glu705
710 715 720Lys Glu Ile Ala
Gly Gln Asp Glu Asp His Phe Ile Thr Val Asp Ala 725
730 735Val Gly Cys Phe Glu Gly Asp Glu Glu Glu
Glu Glu Asp Asp Glu Asp 740 745
750Glu Glu Glu Ile Glu Val Glu Glu Glu Leu Cys Lys Gln Val Arg Ser
755 760 765Arg Asp Ile Ser Arg Glu Glu
Trp Lys Gly Ser Glu Thr Tyr Ser Pro 770 775
780Asn Thr Ala Tyr Gly Val Asp Phe Leu Val Pro Val Met Gly Tyr
Ile785 790 795 800Cys Arg
Ile Cys His Lys Phe Tyr His Ser Asn Ser Gly Ala Gln Leu
805 810 815Ser His Cys Lys Ser Leu Gly
His Phe Glu Asn Leu Gln Lys Tyr Lys 820 825
830Ala Ala Lys Asn Pro Ser Pro Thr Thr Arg Pro Val Ser Arg
Arg Cys 835 840 845Ala Ile Asn Ala
Arg Asn Ala Leu Thr Ala Leu Phe Thr Ser Ser Gly 850
855 860Arg Pro Pro Ser Gln Pro Asn Thr Gln Asp Lys Thr
Pro Ser Lys Val865 870 875
880Thr Ala Arg Pro Ser Gln Pro Pro Leu Pro Arg Arg Ser Thr Arg Leu
885 890 895
Lys Thr
SEQ ID NO: 31 (length 896, PRT, Homo sapiens)
Phe Ser Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln Gln Gln Leu
Gln1 5 10 15Gln Leu Gln
Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln 20
25 30Ser Leu Gln Leu Gln Gln Leu Leu Gln Gln
Ser Pro Pro Gln Ala Pro 35 40
45Leu Pro Met Ala Val Ser Arg Gly Leu Pro Pro Gln Gln Pro Gln Gln 50
55 60Pro Leu Leu Asn Leu Gln Gly Thr Asn
Ser Ala Ser Leu Leu Asn Gly65 70 75
80Ser Met Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly
Leu Asp 85 90 95Gln Phe
Ala Met Pro Pro Ala Thr Tyr Asp Thr Ala Gly Leu Thr Met 100
105 110Pro Thr Ala Thr Leu Gly Asn Leu Arg
Gly Tyr Gly Met Ala Ser Pro 115 120
125Gly Leu Ala Ala Pro Ser Leu Thr Pro Pro Gln Leu Ala Thr Pro Asn
130 135 140Leu Gln Gln Phe Phe Pro Gln
Ala Thr Arg Gln Ser Leu Leu Gly Pro145 150
155 160Pro Pro Val Gly Val Pro Met Asn Pro Ser Gln Phe
Asn Leu Ser Gly 165 170
175Arg Asn Pro Gln Lys Gln Ala Arg Thr Ser Ser Ser Thr Thr Pro Asn
180 185 190Arg Lys Asp Ser Ser Ser
Gln Thr Met Pro Val Glu Asp Lys Ser Asp 195 200
205Pro Pro Glu Gly Ser Glu Glu Ala Ala Glu Pro Arg Met Asp
Thr Pro 210 215 220Glu Asp Gln Asp Leu
Pro Pro Cys Pro Glu Asp Ile Ala Lys Glu Lys225 230
235 240Arg Thr Pro Ala Pro Glu Pro Glu Pro Cys
Glu Ala Ser Glu Leu Pro 245 250
255Ala Lys Arg Leu Arg Ser Ser Glu Glu Pro Thr Glu Lys Glu Pro Pro
260 265 270Gly Gln Leu Gln Val
Lys Ala Gln Pro Gln Ala Arg Met Thr Val Pro 275
280 285Lys Gln Thr Gln Thr Pro Asp Leu Leu Pro Glu Ala
Leu Glu Ala Gln 290 295 300Val Leu Pro
Arg Phe Gln Pro Arg Val Leu Gln Val Gln Ala Gln Val305
310 315 320Gln Ser Gln Thr Gln Pro Arg
Ile Pro Ser Thr Asp Thr Gln Val Gln 325
330 335Pro Lys Leu Gln Lys Gln Ala Gln Thr Gln Thr Ser
Pro Glu His Leu 340 345 350Val
Leu Gln Gln Lys Gln Val Gln Pro Gln Leu Gln Gln Glu Ala Glu 355
360 365Pro Gln Lys Gln Val Gln Pro Gln Val
Gln Pro Gln Ala His Ser Gln 370 375
380Gly Pro Arg Gln Val Gln Leu Gln Gln Glu Ala Glu Pro Leu Lys Gln385
390 395 400Val Gln Pro Gln
Val Gln Pro Gln Ala His Ser Gln Pro Pro Arg Gln 405
410 415Val Gln Leu Gln Leu Gln Lys Gln Val Gln
Thr Gln Thr Tyr Pro Gln 420 425
430Val His Thr Gln Ala Gln Pro Ser Val Gln Pro Gln Glu His Pro Pro
435 440 445Ala Gln Val Ser Val Gln Pro
Pro Glu Gln Thr His Glu Gln Pro His 450 455
460Thr Gln Pro Gln Val Ser Leu Leu Ala Pro Glu Gln Thr Pro Val
Val465 470 475 480Val His
Val Cys Gly Leu Glu Met Pro Pro Asp Ala Val Glu Ala Gly
485 490 495Gly Gly Met Glu Lys Thr Leu
Pro Glu Pro Val Gly Thr Gln Val Ser 500 505
510Met Glu Glu Ile Gln Asn Glu Ser Ala Cys Gly Leu Asp Val
Gly Glu 515 520 525Cys Glu Asn Arg
Ala Arg Glu Met Pro Gly Val Trp Gly Ala Gly Gly 530
535 540Ser Leu Lys Val Thr Ile Leu Gln Ser Ser Asp Ser
Arg Ala Phe Ser545 550 555
560Thr Val Pro Leu Thr Leu Val Pro Arg Pro Ser Asp Ser Val Ser Ser
565 570 575Thr Pro Ala Ala Thr
Ser Thr Pro Ser Lys Gln Ala Leu Gln Phe Phe 580
585 590Cys Tyr Ile Cys Lys Ala Ser Cys Ser Ser Gln Gln
Glu Phe Gln Asp 595 600 605His Met
Ser Glu Pro Gln His Gln Gln Arg Leu Gly Glu Ile Gln His 610
615 620Met Ser Gln Ala Cys Leu Leu Pro Leu Leu Pro
Val Pro Arg Asp Val625 630 635
640Leu Glu Thr Glu Asp Glu Glu Pro Pro Pro Arg Arg Trp Cys Asn Thr
645 650 655Cys Gln Leu Tyr
Tyr Met Gly Asp Leu Ile Gln His Arg Arg Thr Gln 660
665 670Asp His Lys Ile Ala Lys Gln Ser Leu Arg Pro
Phe Cys Thr Val Cys 675 680 685Asn
Arg Tyr Phe Lys Thr Pro Arg Lys Phe Val Glu His Val Lys Ser 690
695 700Gln Gly His Lys Asp Lys Ala Lys Glu Leu
Lys Ser Leu Glu Lys Glu705 710 715
720Ile Ala Gly Gln Asp Glu Asp His Phe Ile Thr Val Gly Ala Val
Gly 725 730 735Cys Phe Glu
Gly Asp Glu Glu Glu Glu Glu Asp Asp Glu Asp Glu Glu 740
745 750Glu Ile Glu Val Glu Glu Glu Leu Cys Lys
Gln Val Arg Ser Arg Asp 755 760
765Ile Ser Arg Glu Glu Trp Lys Gly Ser Glu Thr Tyr Ser Pro Asn Thr 770
775 780Ala Tyr Gly Val Asp Phe Leu Val
Pro Val Met Gly Tyr Ile Cys Arg785 790
795 800Ile Cys His Lys Phe Tyr His Ser Asn Ser Gly Ala
Gln Leu Ser His 805 810
815Cys Lys Ser Leu Gly His Phe Glu Asn Leu Gln Lys Tyr Lys Ala Ala
820 825 830Lys Asn Pro Ser Pro Thr
Thr Arg Pro Val Ser Arg Arg Cys Ala Ile 835 840
845Asn Ala Arg Asn Ala Leu Thr Ala Leu Phe Thr Ser Ser Gly
Arg Pro 850 855 860Pro Ser Gln Pro Asn
Thr Gln Asp Lys Thr Pro Ser Lys Val Thr Ala865 870
875 880Arg Pro Ser Gln Pro Pro Leu Pro Arg Arg
Ser Thr Arg Leu Lys Thr 885 890
895
SEQ ID NO: 32 (length 842, PRT, Homo sapiens)
Met Phe Ser Gln Gln Gln Gln Gln Gln Leu
Gln Gln Gln Gln Gln Gln1 5 10
15Leu Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln
20 25 30Gln Gln Leu Leu Gln Leu
Gln Gln Leu Leu Gln Gln Ser Pro Pro Gln 35 40
45Ala Pro Leu Pro Met Ala Val Ser Arg Gly Leu Pro Pro Gln
Gln Pro 50 55 60Gln Gln Pro Leu Leu
Asn Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu65 70
75 80Asn Gly Ser Met Leu Gln Arg Ala Leu Leu
Leu Gln Gln Leu Gln Gly 85 90
95Leu Asp Gln Phe Val Met Pro Pro Ala Thr Tyr Asp Thr Ala Gly Leu
100 105 110Thr Met Pro Thr Ala
Thr Leu Gly Asn Leu Arg Gly Tyr Gly Met Ala 115
120 125Ser Pro Gly Leu Ala Ala Pro Ser Leu Thr Pro Pro
Gln Leu Ala Thr 130 135 140Pro Asn Leu
Gln Gln Phe Phe Pro Gln Ala Thr Arg Gln Ser Leu Leu145
150 155 160Gly Pro Pro Pro Val Gly Val
Pro Met Asn Pro Ser Gln Phe Asn Leu 165
170 175Ser Gly Arg Asn Pro Gln Lys Gln Ala Arg Thr Ser
Ser Ser Thr Thr 180 185 190Pro
Asn Arg Lys Asp Ser Ser Ser Gln Thr Met Pro Val Glu Asp Lys 195
200 205Ser Asp Pro Pro Glu Gly Ser Glu Glu
Ala Ala Glu Pro Arg Met Asp 210 215
220Thr Pro Glu Asp Gln Asp Leu Pro Pro Cys Pro Glu Asp Ile Ala Lys225
230 235 240Glu Lys Arg Thr
Pro Ala Pro Glu Pro Glu Pro Cys Glu Ala Ser Glu 245
250 255Leu Pro Ala Lys Arg Leu Arg Ser Ser Glu
Glu Pro Thr Glu Lys Glu 260 265
270Pro Pro Gly Gln Leu Gln Val Lys Ala Gln Pro Gln Ala Arg Met Thr
275 280 285Val Pro Lys Gln Thr Gln Thr
Pro Asp Leu Leu Pro Glu Ala Leu Glu 290 295
300Ala Gln Val Leu Pro Arg Phe Gln Pro Arg Val Leu Gln Val Gln
Ala305 310 315 320Gln Val
Gln Ser Gln Thr Gln Pro Arg Ile Pro Ser Thr Asp Thr Gln
325 330 335Val Gln Pro Lys Leu Gln Lys
Gln Ala Gln Thr Gln Thr Ser Pro Glu 340 345
350His Leu Val Leu Gln Gln Lys Gln Val Gln Pro Gln Leu Gln
Gln Glu 355 360 365Ala Glu Pro Gln
Lys Gln Val Gln Pro Gln Val His Thr Gln Ala Gln 370
375 380Pro Ser Val Gln Pro Gln Glu His Pro Pro Ala Gln
Val Ser Val Gln385 390 395
400Pro Pro Glu Gln Thr His Glu Gln Pro His Thr Gln Pro Gln Val Ser
405 410 415Leu Leu Ala Pro Glu
Gln Thr Pro Val Val Val His Val Cys Gly Leu 420
425 430Glu Met Pro Pro Asp Ala Val Glu Ala Gly Gly Gly
Met Glu Lys Thr 435 440 445Leu Pro
Glu Pro Val Gly Thr Gln Val Ser Met Glu Glu Ile Gln Asn 450
455 460Glu Ser Ala Cys Gly Leu Asp Val Gly Glu Cys
Glu Asn Arg Ala Arg465 470 475
480Glu Met Pro Gly Val Trp Gly Ala Gly Gly Ser Leu Lys Val Thr Ile
485 490 495Leu Gln Ser Ser
Asp Ser Arg Ala Phe Ser Thr Val Pro Leu Thr Pro 500
505 510Val Pro Arg Pro Ser Asp Ser Val Ser Ser Thr
Pro Ala Ala Thr Ser 515 520 525Thr
Pro Ser Lys Gln Ala Leu Gln Phe Phe Cys Tyr Ile Cys Lys Ala 530
535 540Ser Cys Ser Ser Gln Gln Glu Phe Gln Asp
His Met Ser Glu Pro Gln545 550 555
560His Gln Gln Arg Leu Gly Glu Ile Gln His Met Ser Gln Ala Cys
Leu 565 570 575Leu Ser Leu
Leu Pro Met Pro Arg Asp Val Leu Glu Thr Glu Asp Glu 580
585 590Glu Pro Pro Pro Arg Arg Trp Cys Asn Thr
Cys Gln Leu Tyr Tyr Met 595 600
605Gly Asp Leu Ile Gln His Arg Arg Thr Gln Asp His Lys Val Ala Lys 610
615 620Gln Pro Leu Arg Pro Phe Cys Thr
Val Cys Asn Arg Tyr Phe Lys Thr625 630
635 640Pro Arg Lys Phe Val Glu His Val Lys Ser Gln Gly
His Lys Asp Lys 645 650
655Ala Lys Glu Leu Lys Ser Leu Glu Lys Glu Ile Ala Gly Gln Asp Glu
660 665 670Asp His Phe Ile Thr Val
Asp Ala Val Gly Cys Phe Glu Gly Asp Glu 675 680
685Glu Glu Glu Glu Asp Asp Glu Asp Glu Glu Glu Ile Lys Val
Glu Glu 690 695 700Glu Leu Cys Lys Gln
Val Arg Ser Arg Asp Ile Ser Arg Glu Glu Trp705 710
715 720Lys Gly Ser Glu Thr Tyr Ser Pro Asn Thr
Ala Tyr Gly Val Asp Phe 725 730
735Leu Val Pro Val Met Gly Tyr Ile Cys Arg Ile Cys His Lys Phe Tyr
740 745 750His Ser Asn Ser Gly
Ala Gln Leu Ser His Cys Lys Ser Leu Gly His 755
760 765Phe Glu Asn Leu Gln Lys Tyr Lys Ala Ala Lys Asn
Pro Ser Pro Thr 770 775 780Thr Arg Pro
Val Ser Arg Arg Cys Ala Ile Asn Ala Arg Asn Ala Leu785
790 795 800Thr Ala Leu Phe Thr Ser Ser
Gly Arg Pro Pro Ser Gln Pro Asn Thr 805
810 815Gln Asp Lys Thr Pro Ser Lys Val Thr Ala Arg Pro
Ser Gln Pro Pro 820 825 830Leu
Pro Arg Arg Ser Thr Arg Leu Lys Thr 835
840
SEQ ID NO: 33 (length 837, PRT, Homo sapiens)
Met Phe Ser Gln Gln Gln Gln Gln Gln Leu Gln Gln
Gln Gln Gln Ala1 5 10
15Pro Leu Pro Met Ala Val Ser Arg Gly Leu Pro Pro Gln Gln Pro Gln
20 25 30Gln Pro Leu Leu Asn Leu Gln
Gly Thr Asn Ser Ala Ser Leu Leu Asn 35 40
45Gly Ser Met Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly
Leu 50 55 60Asp Gln Phe Ala Met Pro
Pro Ala Thr Tyr Asp Thr Ala Gly Leu Thr65 70
75 80Met Pro Thr Ala Thr Leu Gly Asn Leu Arg Gly
Tyr Gly Met Ala Ser 85 90
95Pro Gly Leu Ala Ala Pro Ser Leu Thr Pro Pro Gln Leu Ala Thr Pro
100 105 110Asn Leu Gln Gln Phe Phe
Pro Gln Ala Thr Arg Gln Ser Leu Leu Gly 115 120
125Pro Pro Pro Val Gly Val Pro Met Asn Pro Ser Gln Phe Asn
Leu Ser 130 135 140Gly Arg Asn Pro Gln
Lys Gln Ala Arg Thr Ser Ser Ser Thr Thr Pro145 150
155 160Asn Arg Lys Asp Ser Ser Ser Gln Thr Met
Pro Val Glu Asp Lys Ser 165 170
175Asp Pro Pro Glu Gly Ser Glu Glu Ala Ala Glu Pro Arg Met Asp Thr
180 185 190Pro Glu Asp Gln Asp
Leu Pro Pro Cys Pro Glu Asp Ile Ala Lys Glu 195
200 205Lys Arg Thr Pro Ala Pro Glu Pro Glu Pro Cys Glu
Ala Ser Glu Leu 210 215 220Pro Ala Lys
Arg Leu Arg Ser Ser Glu Glu Pro Thr Glu Lys Glu Pro225
230 235 240Pro Gly Gln Leu Gln Val Lys
Ala Gln Pro Gln Ala Arg Met Thr Val 245
250 255Pro Lys Gln Thr Gln Thr Pro Asp Leu Leu Pro Glu
Ala Leu Glu Ala 260 265 270Gln
Val Leu Pro Arg Phe Gln Pro Arg Val Leu Gln Val Gln Ala Gln 275
280 285Val Gln Ser Gln Thr Gln Pro Arg Ile
Pro Ser Thr Asp Thr Gln Val 290 295
300Gln Pro Lys Leu Gln Lys Gln Ala Gln Thr Gln Thr Ser Pro Glu His305
310 315 320Leu Val Leu Gln
Gln Lys Gln Val Gln Pro Gln Leu Gln Gln Glu Ala 325
330 335Glu Pro Gln Lys Gln Val Gln Pro Gln Val
Gln Pro Gln Ala His Ser 340 345
350Gln Gly Pro Arg Gln Val Gln Leu Gln Gln Glu Ala Glu Pro Leu Lys
355 360 365Gln Val Gln Pro Gln Val His
Thr Gln Ala Gln Pro Ser Val Gln Pro 370 375
380Gln Glu His Pro Pro Ala Gln Val Ser Val Gln Pro Pro Glu Gln
Thr385 390 395 400His Glu
Gln Pro His Thr Gln Pro Gln Val Ser Leu Leu Ala Pro Glu
405 410 415Gln Thr Pro Val Val Val His
Val Cys Gly Leu Glu Met Pro Pro Asp 420 425
430Ala Val Glu Ala Gly Gly Gly Met Glu Lys Thr Leu Pro Glu
Pro Val 435 440 445Gly Thr Gln Val
Ser Met Glu Glu Ile Gln Asn Glu Ser Ala Cys Gly 450
455 460Leu Asp Val Gly Glu Cys Glu Asn Arg Ala Arg Glu
Met Pro Gly Val465 470 475
480Trp Gly Ala Gly Gly Ser Leu Lys Val Thr Ile Leu Gln Ser Ser Asp
485 490 495Ser Arg Ala Phe Ser
Thr Val Pro Leu Thr Pro Val Pro Arg Pro Ser 500
505 510Asp Ser Val Ser Ser Thr Pro Ala Ala Thr Ser Thr
Pro Ser Lys Gln 515 520 525Ala Leu
Gln Phe Phe Cys Tyr Ile Cys Lys Ala Ser Cys Ser Ser Gln 530
535 540Gln Glu Phe Gln Asp His Met Ser Glu Pro Gln
His Gln Gln Arg Leu545 550 555
560Gly Glu Ile Gln His Met Ser Gln Ala Cys Leu Leu Ser Leu Leu Pro
565 570 575Val Pro Arg Asp
Val Leu Glu Thr Glu Asp Glu Glu Pro Pro Pro Arg 580
585 590Arg Trp Cys Asn Thr Cys Gln Leu Tyr Tyr Met
Gly Asp Leu Ile Gln 595 600 605His
Arg Arg Thr Gln Asp His Lys Ile Ala Lys Gln Ser Leu Arg Pro 610
615 620Phe Cys Thr Val Cys Asn Arg Tyr Phe Lys
Thr Pro Arg Lys Phe Val625 630 635
640Glu His Val Lys Ser Gln Gly His Lys Asp Lys Ala Lys Glu Leu
Lys 645 650 655Ser Leu Glu
Lys Glu Ile Ala Gly Gln Asp Glu Asp His Phe Ile Thr 660
665 670Val Asp Ala Val Gly Cys Phe Glu Gly Asp
Glu Glu Glu Glu Glu Asp 675 680
685Asp Glu Asp Glu Glu Glu Ile Glu Val Glu Glu Glu Leu Cys Lys Gln 690
695 700Val Arg Ser Arg Asp Ile Ser Arg
Glu Glu Trp Lys Gly Ser Glu Thr705 710
715 720Tyr Ser Pro Asn Thr Ala Tyr Gly Val Asp Phe Leu
Val Pro Val Met 725 730
735Gly Tyr Ile Cys Arg Ile Cys His Lys Phe Tyr His Ser Asn Ser Gly
740 745 750Ala Gln Leu Ser His Cys
Lys Ser Leu Gly His Phe Glu Asn Leu Gln 755 760
765Lys Tyr Lys Ala Ala Lys Asn Pro Ser Pro Thr Thr Arg Pro
Val Ser 770 775 780Arg Arg Cys Ala Ile
Asn Ala Arg Asn Ala Leu Thr Ala Leu Phe Thr785 790
795 800Ser Ser Gly Arg Pro Pro Ser Gln Pro Asn
Thr Gln Asp Lys Thr Pro 805 810
815Ser Lys Val Thr Ala Arg Pro Ser Gln Pro Pro Leu Pro Arg Arg Ser
820 825 830Thr Arg Leu Lys Thr
835
SEQ ID NO: 34 (length 818, PRT, Homo sapiens)
Met Phe Ser Gln Gln Gln Gln Gln Gln Leu
Gln Gln Gln Gln Gln Gln1 5 10
15Leu Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln
20 25 30Gln Gln Leu Leu Gln Leu
Gln Gln Leu Leu Gln Gln Ser Pro Pro Gln 35 40
45Ala Pro Leu Pro Met Ala Val Ser Arg Gly Leu Pro Pro Gln
Gln Pro 50 55 60Gln Gln Pro Leu Leu
Asn Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu65 70
75 80Asn Gly Ser Met Leu Gln Arg Ala Leu Leu
Leu Gln Gln Leu Gln Gly 85 90
95Asn Leu Arg Gly Tyr Gly Met Ala Ser Pro Gly Leu Ala Ala Pro Ser
100 105 110Leu Thr Pro Pro Gln
Leu Ala Thr Pro Asn Leu Gln Gln Phe Phe Pro 115
120 125Gln Ala Thr Arg Gln Ser Leu Leu Gly Pro Pro Pro
Val Gly Val Pro 130 135 140Met Asn Pro
Ser Gln Phe Asn Leu Ser Gly Arg Asn Pro Gln Lys Gln145
150 155 160Ala Arg Thr Ser Ser Ser Thr
Thr Pro Asn Arg Lys Asp Ser Ser Ser 165
170 175Gln Thr Met Pro Val Glu Asp Lys Ser Asp Pro Pro
Glu Gly Ser Glu 180 185 190Glu
Ala Ala Glu Pro Arg Met Asp Thr Pro Glu Asp Gln Asp Leu Pro 195
200 205Pro Cys Pro Glu Asp Ile Ala Lys Glu
Lys Arg Thr Pro Ala Pro Glu 210 215
220Pro Glu Pro Cys Glu Ala Ser Glu Leu Pro Ala Lys Arg Leu Arg Ser225
230 235 240Ser Glu Glu Pro
Thr Glu Lys Glu Pro Pro Gly Gln Leu Gln Val Lys 245
250 255Ala Gln Pro Gln Ala Arg Met Thr Val Pro
Lys Gln Thr Gln Thr Pro 260 265
270Asp Leu Leu Pro Glu Ala Leu Glu Ala Gln Val Leu Pro Arg Phe Gln
275 280 285Pro Arg Val Leu Gln Val Gln
Ala Gln Val Gln Ser Gln Thr Gln Pro 290 295
300Arg Ile Pro Ser Thr Asp Thr Gln Val Gln Pro Lys Leu Gln Lys
Gln305 310 315 320Ala Gln
Thr Gln Thr Ser Pro Glu His Leu Val Leu Gln Gln Lys Gln
325 330 335Val Gln Pro Gln Leu Gln Gln
Glu Ala Glu Pro Gln Lys Gln Val Gln 340 345
350Pro Gln Val His Thr Gln Ala Gln Pro Ser Val Gln Pro Gln
Glu His 355 360 365Pro Pro Ala Gln
Val Ser Val Gln Pro Pro Glu Gln Thr His Glu Gln 370
375 380Pro His Thr Gln Pro Gln Val Ser Leu Leu Ala Pro
Glu Gln Thr Pro385 390 395
400Val Val Val His Val Cys Gly Leu Glu Met Pro Pro Asp Ala Val Glu
405 410 415Ala Gly Gly Gly Met
Glu Lys Thr Leu Pro Glu Pro Val Gly Thr Gln 420
425 430Val Ser Met Glu Glu Ile Gln Asn Glu Ser Ala Cys
Gly Leu Asp Val 435 440 445Gly Glu
Cys Glu Asn Arg Ala Arg Glu Met Pro Gly Val Trp Gly Ala 450
455 460Gly Gly Ser Leu Lys Val Thr Ile Leu Gln Ser
Ser Asp Ser Arg Ala465 470 475
480Phe Ser Thr Val Pro Leu Thr Pro Val Pro Arg Pro Ser Asp Ser Val
485 490 495Ser Ser Thr Pro
Ala Ala Thr Ser Thr Pro Ser Lys Gln Ala Leu Gln 500
505 510Phe Phe Cys Tyr Ile Cys Lys Ala Ser Cys Ser
Ser Gln Gln Glu Phe 515 520 525Gln
Asp His Met Ser Glu Pro Gln His Gln Gln Arg Leu Gly Glu Ile 530
535 540Gln His Met Ser Gln Ala Cys Leu Leu Ser
Leu Leu Pro Val Pro Arg545 550 555
560Asp Val Leu Glu Thr Glu Asp Glu Glu Pro Pro Pro Arg Arg Trp
Cys 565 570 575Asn Thr Cys
Gln Leu Tyr Tyr Met Gly Asp Leu Ile Gln His Arg Arg 580
585 590Thr Gln Asp His Lys Ile Ala Lys Gln Ser
Leu Arg Pro Phe Cys Thr 595 600
605Val Cys Asn Arg Tyr Phe Lys Thr Pro Arg Lys Phe Val Glu His Val 610
615 620Lys Ser Gln Gly His Lys Asp Lys
Ala Lys Glu Leu Lys Ser Leu Glu625 630
635 640Lys Glu Ile Ala Gly Gln Asp Glu Asp His Phe Ile
Thr Val Asp Ala 645 650
655Val Gly Cys Phe Glu Gly Asp Glu Glu Glu Glu Glu Asp Asp Glu Asp
660 665 670Glu Glu Glu Ile Glu Val
Glu Glu Glu Leu Cys Lys Gln Val Arg Ser 675 680
685Arg Asp Ile Ser Arg Glu Glu Trp Lys Gly Ser Glu Thr Tyr
Ser Pro 690 695 700Asn Thr Ala Tyr Gly
Val Asp Phe Leu Val Pro Val Met Gly Tyr Ile705 710
715 720Cys Arg Ile Cys His Lys Phe Tyr His Ser
Asn Ser Gly Ala Gln Leu 725 730
735Ser His Cys Lys Ser Leu Gly His Phe Glu Asn Leu Gln Lys Tyr Lys
740 745 750Ala Ala Lys Asn Pro
Ser Pro Thr Thr Arg Pro Val Ser Arg Arg Cys 755
760 765Ala Ile Asn Ala Arg Asn Ala Leu Thr Ala Leu Phe
Thr Ser Ser Gly 770 775 780Arg Pro Pro
Ser Gln Pro Asn Thr Gln Asp Lys Thr Pro Ser Lys Val785
790 795 800Thr Ala Arg Pro Ser Gln Pro
Pro Leu Pro Arg Arg Ser Thr Arg Leu 805
810 815
Lys Thr
SEQ ID NO: 35 (length 820, PRT, Homo sapiens)
Pro Leu Pro Met Ala
Val Ser Arg Gly Leu Pro Pro Gln Gln Pro Gln1 5
10 15Gln Pro Leu Leu Asn Leu Gln Gly Thr Asn Ser
Ala Ser Leu Leu Asn 20 25
30Gly Ser Met Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly Asn
35 40 45Leu Arg Gly Tyr Gly Met Ala Ser
Pro Gly Leu Ala Ala Pro Ser Leu 50 55
60Thr Pro Pro Gln Leu Ala Thr Pro Asn Leu Gln Gln Phe Phe Pro Gln65
70 75 80Ala Thr Arg Gln Ser
Leu Leu Gly Pro Pro Pro Val Gly Val Pro Met 85
90 95Asn Pro Ser Gln Phe Asn Leu Ser Gly Arg Asn
Pro Gln Lys Gln Ala 100 105
110Arg Thr Ser Ser Ser Thr Thr Pro Asn Arg Lys Thr Met Pro Val Glu
115 120 125Asp Lys Ser Asp Pro Pro Glu
Gly Ser Glu Glu Ala Ala Glu Pro Arg 130 135
140Met Asp Thr Pro Glu Asp Gln Asp Leu Pro Pro Cys Pro Glu Asp
Ile145 150 155 160Ala Lys
Glu Lys Arg Thr Pro Ala Pro Glu Pro Glu Pro Cys Glu Ala
165 170 175Ser Glu Leu Pro Ala Lys Arg
Leu Arg Ser Ser Glu Glu Pro Thr Glu 180 185
190Lys Glu Pro Pro Gly Gln Leu Gln Val Lys Ala Gln Pro Gln
Ala Arg 195 200 205Met Thr Val Pro
Lys Gln Thr Gln Thr Pro Asp Leu Leu Pro Glu Ala 210
215 220Leu Glu Ala Gln Val Leu Pro Arg Phe Gln Pro Arg
Val Leu Gln Val225 230 235
240Gln Ala Gln Val Gln Ser Gln Thr Gln Pro Arg Ile Pro Ser Thr Asp
245 250 255Thr Gln Val Gln Pro
Lys Leu Gln Lys Gln Ala Gln Thr Gln Thr Ser 260
265 270Pro Glu His Leu Val Leu Gln Gln Lys Gln Val Gln
Pro Gln Leu Gln 275 280 285Gln Glu
Ala Glu Pro Gln Lys Gln Val Gln Pro Gln Val Gln Pro Gln 290
295 300Ala His Ser Gln Gly Pro Arg Gln Val Gln Leu
Gln Gln Glu Ala Glu305 310 315
320Pro Leu Lys Gln Val Gln Pro Gln Val Gln Pro Gln Ala His Ser Gln
325 330 335Pro Pro Arg Gln
Val Gln Leu Gln Leu Gln Lys Gln Val Gln Thr Gln 340
345 350Thr Tyr Pro Gln Val His Thr Gln Ala Gln Pro
Ser Val Gln Pro Gln 355 360 365Glu
His Pro Pro Ala Gln Val Ser Val Gln Pro Pro Glu Gln Thr His 370
375 380Glu Gln Pro His Thr Gln Pro Gln Val Ser
Leu Leu Ala Pro Glu Gln385 390 395
400Thr Pro Val Val Val His Val Cys Gly Leu Glu Met Pro Pro Asp
Ala 405 410 415Val Glu Ala
Gly Gly Ser Met Glu Lys Thr Leu Pro Glu Pro Val Gly 420
425 430Thr Gln Val Ser Met Glu Glu Ile Gln Asn
Glu Ser Ala Cys Gly Leu 435 440
445Asp Val Gly Glu Cys Glu Asn Arg Ala Arg Glu Met Pro Gly Val Trp 450
455 460Gly Ala Gly Gly Ser Leu Lys Val
Thr Ile Leu Gln Ser Ser Asp Ser465 470
475 480Arg Ala Phe Ser Thr Val Pro Leu Thr Pro Val Pro
Arg Pro Ser Asp 485 490
495Ser Val Ser Ser Thr Pro Ala Ala Thr Ser Thr Pro Ser Lys Gln Ala
500 505 510Leu Gln Phe Phe Cys Tyr
Ile Cys Lys Ala Ser Cys Ser Ser Gln Gln 515 520
525Glu Phe Gln Asp His Met Ser Glu Pro Gln His Gln Gln Arg
Leu Gly 530 535 540Glu Ile Gln His Met
Ser Gln Ala Cys Leu Leu Ser Leu Leu Pro Val545 550
555 560Pro Arg Asp Val Leu Glu Thr Glu Asp Glu
Glu Pro Pro Pro Arg Arg 565 570
575Trp Cys Asn Thr Cys Gln Leu Tyr Tyr Met Gly Asp Leu Ile Gln His
580 585 590Arg Arg Thr Gln Asp
His Arg Ile Ala Lys Gln Ser Leu Arg Pro Phe 595
600 605Cys Thr Val Cys Asn Arg Tyr Phe Lys Thr Pro Arg
Lys Phe Val Glu 610 615 620His Val Lys
Ser Gln Gly His Lys Asp Lys Ala Lys Glu Leu Lys Ser625
630 635 640Leu Glu Lys Glu Ile Ala Gly
Gln Asp Glu Asp His Phe Ile Thr Val 645
650 655Asp Ala Val Gly Cys Phe Glu Gly Asp Glu Glu Glu
Glu Glu Asp Asp 660 665 670Glu
Asp Glu Glu Glu Ile Glu Val Glu Glu Glu Leu Cys Lys Gln Val 675
680 685Arg Ser Arg Asp Ile Ser Arg Glu Glu
Trp Lys Gly Ser Glu Thr Tyr 690 695
700Ser Pro Asn Thr Ala Tyr Gly Val Asp Phe Leu Val Pro Val Met Gly705
710 715 720Tyr Ile Cys Arg
Ile Cys His Lys Phe Tyr His Asn Asn Ser Gly Ala 725
730 735Gln Leu Ser His Cys Lys Ser Leu Gly His
Phe Glu Asn Leu Gln Lys 740 745
750Tyr Lys Ala Ala Lys Asn Pro Ser Pro Thr Thr Arg Pro Val Ser Arg
755 760 765Arg Cys Ala Ile Asn Ala Arg
Asn Ala Leu Thr Ala Leu Phe Thr Ser 770 775
780Ser Gly Arg Pro Pro Ser Gln Pro Asn Thr Gln Asp Lys Thr Pro
Ser785 790 795 800Lys Val
Thr Ala Arg Pro Ser Gln Pro Pro Leu Pro Arg Arg Ser Thr
805 810 815Arg Leu Lys Thr
820
SEQ ID NO: 36 (length 391, PRT, Homo sapiens)
Met Phe Ser Gln Gln Gln Gln Gln Gln Leu Gln Gln
Gln Gln Gln Gln1 5 10
15Leu Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln
20 25 30Gln Gln Leu Leu Gln Leu Gln
Gln Leu Leu Gln Gln Ser Pro Pro Gln 35 40
45Ala Pro Leu Pro Met Ala Val Ser Arg Gly Leu Pro Pro Gln Gln
Pro 50 55 60Gln Gln Pro Leu Leu Asn
Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu65 70
75 80Asn Gly Ser Met Leu Gln Arg Ala Leu Leu Leu
Gln Gln Leu Gln Gly 85 90
95Asn Leu Arg Gly Tyr Gly Met Ala Ser Pro Gly Leu Ala Ala Pro Ser
100 105 110Leu Thr Pro Pro Gln Leu
Ala Thr Pro Asn Leu Gln Gln Phe Phe Pro 115 120
125Gln Ala Thr Arg Gln Ser Leu Leu Gly Pro Pro Pro Val Gly
Val Pro 130 135 140Met Asn Pro Ser Gln
Phe Asn Leu Ser Gly Arg Asn Pro Gln Lys Gln145 150
155 160Ala Arg Thr Ser Ser Ser Thr Thr Pro Asn
Arg Lys Asp Ser Ser Ser 165 170
175Gln Thr Met Pro Val Glu Asp Lys Ser Asp Pro Pro Glu Gly Ser Glu
180 185 190Glu Ala Ala Glu Pro
Arg Met Asp Thr Pro Glu Asp Gln Asp Leu Pro 195
200 205Pro Cys Pro Glu Asp Ile Ala Lys Glu Lys Arg Thr
Pro Ala Pro Glu 210 215 220Pro Glu Pro
Cys Glu Ala Ser Glu Leu Pro Ala Lys Arg Leu Arg Ser225
230 235 240Ser Glu Glu Pro Thr Glu Lys
Glu Pro Pro Gly Gln Leu Gln Val Lys 245
250 255Ala Gln Pro Gln Ala Arg Met Thr Val Pro Lys Gln
Thr Gln Thr Pro 260 265 270Asp
Leu Leu Pro Glu Ala Leu Glu Ala Gln Val Leu Pro Arg Phe Gln 275
280 285Pro Arg Val Leu Gln Val Gln Ala Gln
Val Gln Ser Gln Thr Gln Pro 290 295
300Arg Ile Pro Ser Thr Asp Thr Gln Val Gln Pro Lys Leu Gln Lys Gln305
310 315 320Ala Gln Thr Gln
Thr Ser Pro Glu His Leu Val Leu Gln Gln Lys Gln 325
330 335Val Gln Pro Gln Leu Gln Gln Glu Ala Glu
Pro Gln Lys Gln Val Gln 340 345
350Pro Gln Val Gln Pro Gln Ala His Ser Gln Gly Pro Arg Gln Val Gln
355 360 365Leu Gln Gln Glu Ala Glu Pro
Leu Lys Gln Val Gln Pro Gln Val Gln 370 375
380
Pro Gln Ala His Ser Gln Pro
385                 390
SEQ ID NO: 37 (length 75, PRT, Homo sapiens)
Leu Gln Gln Gln Gln Gln Gln Leu Gln Gln Leu Gln Gln Gln Gln Leu
1               5                   10                  15
Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Leu Gln Leu Gln Gln Leu
            20                  25                  30
Leu Gln Gln Ser Pro Pro Gln Ala Pro Leu Pro Met Ala Val Ser Arg
        35                  40                  45
Gly Leu Pro Pro Gln Gln Pro Gln Gln Pro Leu Leu Asn Leu Gln Gly
    50                  55                  60
Thr Asn Ser Ala Ser Leu Leu Asn Gly Ser Met
65                  70                  75
SEQ ID NO: 38 (length 33, PRT, Homo sapiens)
Gln Gln Leu Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu
1               5                   10                  15
Gln Gln Gln Gln Leu Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro
            20                  25                  30
Pro
SEQ ID NO: 39 (length 52, PRT, Homo sapiens)
Met Phe Ser Gln Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln Gln Gln
1               5                   10                  15
Leu Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln
            20                  25                  30
Gln Gln Leu Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro Pro Gln
        35                  40                  45
Ala Pro Leu Pro
    50
SEQ ID NO: 40 (length 26, PRT, Homo sapiens)
Pro Pro Thr Pro Arg Arg Asp Val Phe Ala His Val Pro Val Gln Gly
1               5                   10                  15
Trp Ser Thr Ala Arg Leu Val Thr Asp Met
            20                  25
SEQ ID NO: 41 (length 24, PRT, Homo sapiens)
Gly Leu Asp Gln Phe Ala Met Pro Pro Ala Thr Tyr Asp Thr Ala Gly
1               5                   10                  15
Leu Thr Met Pro Thr Ala Thr Leu
            20
SEQ ID NO: 42 (length 56, PRT, Homo sapiens)
Pro Gln Val Gln Pro Gln Ala His Ser Gln Gly Pro Arg Gln Val Gln
1               5                   10                  15
Leu Gln Gln Glu Ala Glu Pro Leu Lys Gln Val Gln Pro Gln Val Gln
            20                  25                  30
Pro Gln Ala His Ser Gln Pro Pro Arg Gln Val Gln Leu Gln Leu Gln
        35                  40                  45
Lys Gln Val Gln Thr Gln Thr Tyr
    50                  55
SEQ ID NO: 43 (length 28, PRT, Homo sapiens)
Pro Gln Val Gln Pro Gln Ala His Ser Gln Pro Pro Arg Gln Val Gln
1               5                   10                  15
Leu Gln Leu Gln Lys Gln Val Gln Thr Gln Thr Tyr
            20                  25
SEQ ID NO: 44 (length 112, PRT, Homo sapiens)
Gln Val Gln Ser Gln Thr Gln Pro Arg Ile Pro Ser Thr Asp Thr Gln
1               5                   10                  15
Val Gln Pro Lys Leu Gln Lys Gln Ala Gln Thr Gln Thr Ser Pro Glu
            20                  25                  30
His Leu Val Leu Gln Gln Lys Gln Val Gln Pro Gln Leu Gln Gln Glu
        35                  40                  45
Ala Glu Pro Gln Lys Gln Val Gln Pro Gln Val Gln Pro Gln Ala His
    50                  55                  60
Ser Gln Gly Pro Arg Gln Val Gln Leu Gln Gln Glu Ala Glu Pro Leu
65                  70                  75                  80
Lys Gln Val Gln Pro Gln Val Gln Pro Gln Ala His Ser Gln Pro Pro
                85                  90                  95
Arg Gln Val Gln Leu Gln Leu Gln Lys Gln Val Gln Thr Gln Thr Tyr
            100                 105                 110
SEQ ID NO: 45 (length 2687, DNA, Mus musculus)
catgttcaac ccgcaactcc agcagcagca acagttgcag
cagcagcagc aacagttgca 60gcagcagctc cagcagcagc agctccagca gcagcaacag
cagatactgc agctccaaca 120gctgctgcaa cagtccccac cacaggcctc cttgtccatt
cctgtcagcc ggggcctccc 180ccagcagtca tccccgcaac agcttctgag tctccagggc
ctccactcga cctccctgct 240caatggcccc atgctgcaaa gagctttgct cctacagcag
ttgcaaggac tggaccagtt 300tgcaatgcca ccagccacgt atgacggtgc cagcctcacc
atgcctacgg caacactggg 360taacctccgt gctttcaatg tgacagcccc aagcctagca
gctcccagcc ttacaccacc 420ccagatggtc accccaaatc tgcagcagtt ctttccccag
gctactcgac agtctctgct 480ggggcctcct cctgttgggg tcccaataaa cccttctcag
ctcaaccact cagggaggaa 540cacccagaaa caggccagaa ccccctcttc caccaccccc
aatcgcaagg attcttcttc 600tcagacggtg cctctggaag acagggaaga ccccacagag
gggtctgagg aagccacgga 660gctccagatg gacacatgtg aagaccaaga ttcactagtc
ggtccagata gcatgctgag 720tgagccccaa gtgcctgagc ctgagccctt tgagacattg
gaaccaccag ccaagaggtg 780caggagctca gaggagtcca ccgagaaagg ccctacaggg
cagccacaag caagggtcca 840gcctcagacc cagatgacag caccaaagca gacacagacc
ccggatcggc tgcctgagcc 900accagaagtc caaatgctgc cgcgtatcca gccacaggca
ctgcagatcc agacccagcc 960aaagctgctg aggcaggcac agacacagac ctctccagag
cacttagcgc cccagcagga 1020tcaggtagag ccacaggtac catcacagcc cccatggcag
ttgcagccac gggagacaga 1080cccaccgaac caagctcagg cacagaccca gcctcagccc
ctctggcagg cgcagtcaca 1140gaagcaggcc cagacacagg cacatccaca ggtacccacc
caagcacagt cacaggagca 1200gacatcagag aagacccagg accagcctca gacctggcca
caggggtcag tacccccacc 1260agaacaagcg tcaggtccag cctgtgccac ggaaccacag
ctatcctctc acgctgcaga 1320agctgggagt gacccagaca aggccttgcc agaaccagta
agtgcccaga gcagtgaaga 1380caggagccgg gaggcgtccg ctggtggcct ggatttggga
gaatgtgaaa agagagcggg 1440agagatgctg gggatgtggg gggctgggag ctccctgaag
gtcaccatcc tgcagagtag 1500caacagccgg gcctttaaca ccacacccct cacatctgga
cctcgccctg gggactctac 1560ctctgccacc cctgccattg ccagcacacc ctccaagcaa
agcctccagt tcttctgcta 1620catctgcaag gccagcagca gcagccagca ggagttccag
gatcacatgt cagaggctca 1680gcaccaacag cggcttgggg aaatacaaca ctcgagccag
acctgcctgc tgtccctgct 1740gcccatgcct cgggacatcc tggagaaaga agcggaagat
cctccgccca aacgctggtg 1800caacacctgc caggtgtact acgtgggaga cttgatccag
caccgtagga cacaggagca 1860caaggttgcc aaacaatccc tgaggccctt ctgcaccata
tgcaaccgtt acttcaagac 1920ccctcgaaag tttgtggagc acgtgaagtc ccagggacac
aaggacaagg cccaagagct 1980gaagacactt gaaaaggaga caggcagccc agatgaggac
cacttcatca ctgtggacgc 2040cgtcggttgc tttgagagtg gtcaagaaga ggacgaggat
gacgacgagg aagaagaaga 2100agaaggagag attgaggctg aggaggaatt ctgcaagcag
gtgaagccga gagaaacatc 2160ctcagagcaa gggaagggct ctgagacgta caaccccaac
acagcctatg gtgaggattt 2220cctggtgcca gtgatgggct atgtctgtca aatctgtcac
aagttctacg acagcaactc 2280agaattgcgg ctttctcact gcaagtccct ggcccacttt
gagaacctgc agaaatacaa 2340agccaagaac ccaagccctc ctcctacccg gcctgtgagc
cgcaagtgtg ccatcaacgc 2400ccgcaacgcc ctgactgcac tgttcacctc tagccaccag
cccagccccc aggacacagt 2460gaaaatgccc agcaaggtga agcctggatc ccccggactc
cctcctcccc ttcggcgctc 2520aacacgcctc aaaacctgat agagggagct ctggccactc
agcctgacta aggctcagtc 2580tgctaatgct tcctaggtat ctgtgtagaa atgttcaagt
ggttggtgtt tttactcaaa 2640atccaataaa gagtcagtag tttggcaaaa aaaaaaaaaa
aaaaaaa 2687462922DNAHomo sapiens 46tgggggctgc ggggccggcc
catccgtggg ggcgacttga gcgttgaggg cgcgcgggga 60ggcgagccac catgttcagc
cagcagcagc agcagctcca gcaacagcag cagcagctcc 120agcagttaca gcagcagcag
ctccagcagc agcaattgca gcagcagcag ttactgcagc 180tccagcagct gctccagcag
tccccaccac aggccccgtt gcccatggct gtcagccggg 240ggctcccccc gcagcagcca
cagcagccgc ttctgaatct ccagggcacc aactcagcct 300ccctcctcaa cggctccatg
ctgcagagag ctttgctttt acagcagttg caaggactgg 360accagtttgc aatgccacca
gccacgtatg acactgccgg tctcaccatg cccacagcaa 420cactgggtaa cctccgaggc
tatggcatgg catccccagg cctcgcagcc cccagcctca 480cacccccaca actggccact
ccaaatttgc aacagttctt tccccaggcc actcgccagt 540ccttgctggg acctcctcct
gttggggtcc ccatgaaccc ttcccagttc aacctttcag 600gacggaaccc ccagaaacag
gcccggacct cctcctctac cacccccaat cgaaaggatt 660cttcttctca gacaatgcct
gtggaagaca agtcagaccc cccagagggg tctgaggaag 720ccgcagagcc ccggatggac
acaccagaag accaagattt accgccctgc ccagaggaca 780tcgccaagga aaaacgcact
ccagcacctg agcctgagcc ttgtgaggcg tccgagctgc 840cagcaaagag attgaggagc
tcagaagagc ccacagagaa ggaacctcca gggcagttac 900aggtgaaggc ccagccgcag
gcccggatga cagtaccgaa acagacacag acaccagacc 960tgctgcctga ggccctggaa
gcccaagtgc tgccacgatt ccagccacgg gtcctgcagg 1020tccaggccca ggtgcagtca
cagactcagc cgcggatacc atccacagac acccaggtgc 1080agccaaagct gcagaagcag
gcgcaaacac agacctctcc agagcactta gtgctgcaac 1140agaagcaggt gcagccacag
ctgcagcagg aggcagagcc acagaagcag gtgcagccac 1200aggtacagcc acaggcacat
tcacagggcc caaggcaggt gcagctgcag caggaggcag 1260agccgctgaa gcaggtgcag
ccacaggtgc agccccaggc acattcacag cccccaaggc 1320aggtgcagct gcagctgcag
aagcaggtcc agacacagac atatccacag gtccacacac 1380aggcacagcc aagcgtccag
ccacaggagc atcctccagc gcaggtgtca gtacagccac 1440cagagcagac ccatgagcag
cctcacaccc agccgcaggt gtcgttgctg gctccagagc 1500aaacaccagt tgtggttcat
gtctgcgggc tggagatgcc acctgatgca gtagaagctg 1560gtggaggcat ggaaaagacc
ttgccagagc ctgtgggcac ccaagtcagc atggaagaga 1620ttcagaatga gtcggcctgt
ggcctagatg tgggagaatg tgaaaacaga gcgagagaga 1680tgccaggggt atggggcgcc
gggggctccc tgaaggtcac cattctgcag agcagtgaca 1740gccgggcctt tagcactgta
cccctgacac ctgtcccccg ccccagtgac tccgtctcct 1800ccacccctgc ggctaccagc
actccctcta agcaggccct ccagttcttc tgctacatct 1860gcaaggccag ctgctccagc
cagcaggagt tccaggacca catgtcggag cctcagcacc 1920agcagcggct aggggagatc
cagcacatga gccaagcctg cctcctgtcc ctgctgcccg 1980tgccccggga cgtcctggag
acagaggatg aggagcctcc accaaggcgc tggtgcaaca 2040cctgccagct ctactacatg
ggggacctga tccaacaccg caggacacag gaccacaaga 2100ttgccaaaca atccttgcga
cccttctgca ccgtttgcaa ccgctacttc aaaacccctc 2160gcaagtttgt ggagcacgtg
aagtcccagg ggcataagga caaagccaag gagctgaagt 2220cgcttgagaa agaaattgct
ggccaagatg aggaccactt cattacagtg gacgctgtgg 2280gttgcttcga gggtgatgaa
gaagaggaag aggatgatga ggatgaagaa gagatcgagg 2340ttgaggagga actctgcaag
caggtgaggt ccagagatat atccagagag gagtggaagg 2400gctcggagac ctacagcccc
aatactgcat atggtgtgga cttcctggtg cccgtgatgg 2460gctatatctg ccgcatctgc
cacaagttct atcacagcaa ctcaggggca cagctctccc 2520actgcaagtc cctgggccac
tttgagaacc tgcagaaata caaggcggcc aagaacccca 2580gccccaccac ccgacctgtg
agccgccggt gcgcaatcaa cgcccggaac gctttgacag 2640ccctgttcac ctccagcggc
cgcccaccct cccagcccaa cacccaggac aaaacaccca 2700gcaaggtgac ggctcgaccc
tcccagcccc cactacctcg gcgctcaacc cgcctcaaaa 2760cctgatagag ggacctccct
gtccctggcc tgcctgggtc cagatctgct aatgcttttt 2820aggagtctgc ctggaaactt
tgacatggtt catgttttta ctcaaaatcc aataaaacaa 2880ggtagtttgg ctgtgcaaaa
aaaaaaaaaa aaaaaaaaaa aa 292247897PRTHomo sapiens
47Met Phe Ser Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln Gln Gln Leu1
5 10 15Gln Gln Leu Gln Gln Gln
Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln 20 25
30Gln Leu Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro
Pro Gln Ala 35 40 45Pro Leu Pro
Met Ala Val Ser Arg Gly Leu Pro Pro Gln Gln Pro Gln 50
55 60Gln Pro Leu Leu Asn Leu Gln Gly Thr Asn Ser Ala
Ser Leu Leu Asn65 70 75
80Gly Ser Met Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly Leu
85 90 95Asp Gln Phe Ala Met Pro
Pro Ala Thr Tyr Asp Thr Ala Gly Leu Thr 100
105 110Met Pro Thr Ala Thr Leu Gly Asn Leu Arg Gly Tyr
Gly Met Ala Ser 115 120 125Pro Gly
Leu Ala Ala Pro Ser Leu Thr Pro Pro Gln Leu Ala Thr Pro 130
135 140Asn Leu Gln Gln Phe Phe Pro Gln Ala Thr Arg
Gln Ser Leu Leu Gly145 150 155
160Pro Pro Pro Val Gly Val Pro Met Asn Pro Ser Gln Phe Asn Leu Ser
165 170 175Gly Arg Asn Pro
Gln Lys Gln Ala Arg Thr Ser Ser Ser Thr Thr Pro 180
185 190Asn Arg Lys Asp Ser Ser Ser Gln Thr Met Pro
Val Glu Asp Lys Ser 195 200 205Asp
Pro Pro Glu Gly Ser Glu Glu Ala Ala Glu Pro Arg Met Asp Thr 210
215 220Pro Glu Asp Gln Asp Leu Pro Pro Cys Pro
Glu Asp Ile Ala Lys Glu225 230 235
240Lys Arg Thr Pro Ala Pro Glu Pro Glu Pro Cys Glu Ala Ser Glu
Leu 245 250 255Pro Ala Lys
Arg Leu Arg Ser Ser Glu Glu Pro Thr Glu Lys Glu Pro 260
265 270Pro Gly Gln Leu Gln Val Lys Ala Gln Pro
Gln Ala Arg Met Thr Val 275 280
285Pro Lys Gln Thr Gln Thr Pro Asp Leu Leu Pro Glu Ala Leu Glu Ala 290
295 300Gln Val Leu Pro Arg Phe Gln Pro
Arg Val Leu Gln Val Gln Ala Gln305 310
315 320Val Gln Ser Gln Thr Gln Pro Arg Ile Pro Ser Thr
Asp Thr Gln Val 325 330
335Gln Pro Lys Leu Gln Lys Gln Ala Gln Thr Gln Thr Ser Pro Glu His
340 345 350Leu Val Leu Gln Gln Lys
Gln Val Gln Pro Gln Leu Gln Gln Glu Ala 355 360
365Glu Pro Gln Lys Gln Val Gln Pro Gln Val Gln Pro Gln Ala
His Ser 370 375 380Gln Gly Pro Arg Gln
Val Gln Leu Gln Gln Glu Ala Glu Pro Leu Lys385 390
395 400Gln Val Gln Pro Gln Val Gln Pro Gln Ala
His Ser Gln Pro Pro Arg 405 410
415Gln Val Gln Leu Gln Leu Gln Lys Gln Val Gln Thr Gln Thr Tyr Pro
420 425 430Gln Val His Thr Gln
Ala Gln Pro Ser Val Gln Pro Gln Glu His Pro 435
440 445Pro Ala Gln Val Ser Val Gln Pro Pro Glu Gln Thr
His Glu Gln Pro 450 455 460His Thr Gln
Pro Gln Val Ser Leu Leu Ala Pro Glu Gln Thr Pro Val465
470 475 480Val Val His Val Cys Gly Leu
Glu Met Pro Pro Asp Ala Val Glu Ala 485
490 495Gly Gly Gly Met Glu Lys Thr Leu Pro Glu Pro Val
Gly Thr Gln Val 500 505 510Ser
Met Glu Glu Ile Gln Asn Glu Ser Ala Cys Gly Leu Asp Val Gly 515
520 525Glu Cys Glu Asn Arg Ala Arg Glu Met
Pro Gly Val Trp Gly Ala Gly 530 535
540Gly Ser Leu Lys Val Thr Ile Leu Gln Ser Ser Asp Ser Arg Ala Phe545
550 555 560Ser Thr Val Pro
Leu Thr Pro Val Pro Arg Pro Ser Asp Ser Val Ser 565
570 575Ser Thr Pro Ala Ala Thr Ser Thr Pro Ser
Lys Gln Ala Leu Gln Phe 580 585
590Phe Cys Tyr Ile Cys Lys Ala Ser Cys Ser Ser Gln Gln Glu Phe Gln
595 600 605Asp His Met Ser Glu Pro Gln
His Gln Gln Arg Leu Gly Glu Ile Gln 610 615
620His Met Ser Gln Ala Cys Leu Leu Ser Leu Leu Pro Val Pro Arg
Asp625 630 635 640Val Leu
Glu Thr Glu Asp Glu Glu Pro Pro Pro Arg Arg Trp Cys Asn
645 650 655Thr Cys Gln Leu Tyr Tyr Met
Gly Asp Leu Ile Gln His Arg Arg Thr 660 665
670Gln Asp His Lys Ile Ala Lys Gln Ser Leu Arg Pro Phe Cys
Thr Val 675 680 685Cys Asn Arg Tyr
Phe Lys Thr Pro Arg Lys Phe Val Glu His Val Lys 690
695 700Ser Gln Gly His Lys Asp Lys Ala Lys Glu Leu Lys
Ser Leu Glu Lys705 710 715
720Glu Ile Ala Gly Gln Asp Glu Asp His Phe Ile Thr Val Asp Ala Val
725 730 735Gly Cys Phe Glu Gly
Asp Glu Glu Glu Glu Glu Asp Asp Glu Asp Glu 740
745 750Glu Glu Ile Glu Val Glu Glu Glu Leu Cys Lys Gln
Val Arg Ser Arg 755 760 765Asp Ile
Ser Arg Glu Glu Trp Lys Gly Ser Glu Thr Tyr Ser Pro Asn 770
775 780Thr Ala Tyr Gly Val Asp Phe Leu Val Pro Val
Met Gly Tyr Ile Cys785 790 795
800Arg Ile Cys His Lys Phe Tyr His Ser Asn Ser Gly Ala Gln Leu Ser
805 810 815His Cys Lys Ser
Leu Gly His Phe Glu Asn Leu Gln Lys Tyr Lys Ala 820
825 830Ala Lys Asn Pro Ser Pro Thr Thr Arg Pro Val
Ser Arg Arg Cys Ala 835 840 845Ile
Asn Ala Arg Asn Ala Leu Thr Ala Leu Phe Thr Ser Ser Gly Arg 850
855 860Pro Pro Ser Gln Pro Asn Thr Gln Asp Lys
Thr Pro Ser Lys Val Thr865 870 875
880Ala Arg Pro Ser Gln Pro Pro Leu Pro Arg Arg Ser Thr Arg Leu
Lys 885 890
895Thr4849PRTHomo sapiens 48Met Phe Ser Gln Gln Gln Gln Gln Gln Leu Gln
Gln Gln Gln Gln Gln1 5 10
15Leu Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln
20 25 30Gln Gln Leu Leu Gln Leu Gln
Gln Leu Leu Gln Gln Ser Pro Pro Gln 35 40
45
Ala

SEQ ID NO: 49; length: 215; type: DNA; organism: Homo sapiens
tgggggctgc ggggccggcc catccgtggg ggcgacttga gcgttgaggg cgcgcgggga 60
ggcgagccac catgttcagc cagcagcagc agcagctcca gcaacagcag cagcagctcc 120
agcagttaca gcagcagcag ctccagcagc agcaattgca gcagcagcag ttactgcagc 180
tccagcagct gctccagcag tccccaccac aggcc 215

SEQ ID NO: 50; length: 101; type: DNA; organism: Homo sapiens
cagcagctcc agcagttaca gcagcagcag ctccagcagc agcaattgca gcagcagcag 60
ttactgcagc tccagcagct gctccagcag tccccaccac a 101

SEQ ID NO: 51; length: 72; type: DNA; organism: Homo sapiens
ggactggacc agtttgcaat gccaccagcc acgtatgaca ctgccggtct caccatgccc 60
acagcaacac tg 72

SEQ ID NO: 52; length: 15; type: DNA; organism: Homo sapiens
aggattcttc ttctc 15

SEQ ID NO: 53; length: 86; type: DNA; organism: Homo sapiens
ccacaggtgc agccccaggc acattcacag cccccaaggc aggtgcagct gcagctgcag 60
aagcaggtcc agacacagac atatcc 86

SEQ ID NO: 54; length: 168; type: DNA; organism: Homo sapiens
ccacaggtac agccacaggc acattcacag ggcccaaggc aggtgcagct gcagcaggag 60
gcagagccgc tgaagcaggt gcagccacag gtgcagcccc aggcacattc acagccccca 120
aggcaggtgc agctgcagct gcagaagcag gtccagacac agacatat 168

SEQ ID NO: 55; length: 336; type: DNA; organism: Homo sapiens
caggtgcagt cacagactca gccgcggata ccatccacag acacccaggt gcagccaaag 60
ctgcagaagc aggcgcaaac acagacctct ccagagcact tagtgctgca acagaagcag 120
gtgcagccac agctgcagca ggaggcagag ccacagaagc aggtgcagcc acaggtacag 180
ccacaggcac attcacaggg cccaaggcag gtgcagctgc agcaggaggc agagccgctg 240
aagcaggtgc agccacaggt gcagccccag gcacattcac agcccccaag gcaggtgcag 300
ctgcagctgc agaagcaggt ccagacacag acatat 336

SEQ ID NO: 56; length: 24; type: DNA; organism: Homo sapiens
gttgaggagg aactctgcaa gcag 24

SEQ ID NO: 57; length: 78; type: DNA; organism: Homo sapiens
gccacccaca ccacgaagag atgtgtttgc ccacgttcca gtgcaggggt ggagcacagc 60
ccggcttgtt acagatat 78

SEQ ID NO: 58; length: 863; type: PRT; organism: Homo sapiens
Met Phe Ser Gln Gln Gln Gln Gln Leu
Gln Gln Gln Gln Gln Ala Pro1 5 10
15Leu Pro Met Ala Val Ser Arg Gly Leu Pro Pro Gln Gln Pro Gln
Gln 20 25 30Pro Leu Leu Asn
Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu Asn Gly 35
40 45Ser Met Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu
Gln Gly Leu Asp 50 55 60Gln Phe Ala
Met Pro Pro Ala Thr Tyr Asp Thr Ala Gly Leu Thr Met65 70
75 80Pro Thr Ala Thr Leu Gly Asn Leu
Arg Gly Tyr Gly Met Ala Ser Pro 85 90
95Gly Leu Ala Ala Pro Ser Leu Thr Pro Pro Gln Leu Ala Thr
Pro Asn 100 105 110Leu Gln Gln
Phe Phe Pro Gln Ala Thr Arg Gln Ser Leu Leu Gly Pro 115
120 125Pro Pro Val Gly Val Pro Met Asn Pro Ser Gln
Phe Asn Leu Ser Gly 130 135 140Arg Asn
Pro Gln Lys Gln Ala Arg Thr Ser Ser Ser Thr Thr Pro Asn145
150 155 160Arg Lys Asp Ser Ser Ser Gln
Thr Met Pro Val Glu Asp Lys Ser Asp 165
170 175Pro Pro Glu Gly Ser Glu Glu Ala Ala Glu Pro Arg
Met Asp Thr Pro 180 185 190Glu
Asp Gln Asp Leu Pro Pro Cys Pro Glu Asp Ile Ala Lys Glu Lys 195
200 205Arg Thr Pro Ala Pro Glu Pro Glu Pro
Cys Glu Ala Ser Glu Leu Pro 210 215
220Ala Lys Arg Leu Arg Ser Ser Glu Glu Pro Thr Glu Lys Glu Pro Pro225
230 235 240Gly Gln Leu Gln
Val Lys Ala Gln Pro Gln Ala Arg Met Thr Val Pro 245
250 255Lys Gln Thr Gln Thr Pro Asp Leu Leu Pro
Glu Ala Leu Glu Ala Gln 260 265
270Val Leu Pro Arg Phe Gln Pro Arg Val Leu Gln Val Gln Ala Gln Val
275 280 285Gln Ser Gln Thr Gln Pro Arg
Ile Pro Ser Thr Asp Thr Gln Val Gln 290 295
300Pro Lys Leu Gln Lys Gln Ala Gln Thr Gln Thr Ser Pro Glu His
Leu305 310 315 320Val Leu
Gln Gln Lys Gln Val Gln Pro Gln Leu Gln Gln Glu Ala Glu
325 330 335Pro Gln Lys Gln Val Gln Pro
Gln Val Gln Pro Gln Ala His Ser Gln 340 345
350Gly Pro Arg Gln Val Gln Leu Gln Gln Glu Ala Glu Pro Leu
Lys Gln 355 360 365Val Gln Pro Gln
Val Gln Pro Gln Ala His Ser Gln Pro Pro Arg Gln 370
375 380Val Gln Leu Gln Leu Gln Lys Gln Val Gln Thr Gln
Thr Tyr Pro Gln385 390 395
400Val His Thr Gln Ala Gln Pro Ser Val Gln Pro Gln Glu His Pro Pro
405 410 415Ala Gln Val Ser Val
Gln Pro Pro Glu Gln Thr His Glu Gln Pro His 420
425 430Thr Gln Pro Gln Val Ser Leu Leu Ala Pro Glu Gln
Thr Pro Val Val 435 440 445Val His
Val Cys Gly Leu Glu Met Pro Pro Asp Ala Val Glu Ala Gly 450
455 460Gly Gly Met Glu Lys Thr Leu Pro Glu Pro Val
Gly Thr Gln Val Ser465 470 475
480Met Glu Glu Ile Gln Asn Glu Ser Ala Cys Gly Leu Asp Val Gly Glu
485 490 495Cys Glu Asn Arg
Ala Arg Glu Met Pro Gly Val Trp Gly Ala Gly Gly 500
505 510Ser Leu Lys Val Thr Ile Leu Gln Ser Ser Asp
Ser Arg Ala Phe Ser 515 520 525Thr
Val Pro Leu Thr Pro Val Pro Arg Pro Ser Asp Ser Val Ser Ser 530
535 540Thr Pro Ala Ala Thr Ser Thr Pro Ser Lys
Gln Ala Leu Gln Phe Phe545 550 555
560Cys Tyr Ile Cys Lys Ala Ser Cys Ser Ser Gln Gln Glu Phe Gln
Asp 565 570 575His Met Ser
Glu Pro Gln His Gln Gln Arg Leu Gly Glu Ile Gln His 580
585 590Met Ser Gln Ala Leu Leu Ser Leu Leu Pro
Val Pro Arg Asp Val Leu 595 600
605Glu Thr Glu Asp Glu Glu Pro Pro Pro Arg Arg Trp Cys Asn Thr Cys 610
615 620Gln Leu Tyr Tyr Met Gly Asp Leu
Ile Gln His Arg Arg Thr Gln Asp625 630
635 640His Lys Ile Ala Lys Gln Ser Leu Arg Pro Phe Cys
Thr Val Cys Asn 645 650
655Arg Tyr Phe Lys Thr Pro Arg Lys Phe Val Glu His Val Lys Ser Gln
660 665 670Gly His Lys Asp Lys Ala
Lys Glu Leu Lys Ser Leu Glu Lys Glu Ile 675 680
685Ala Gly Gln Asp Glu Asp His Phe Ile Thr Val Asp Ala Val
Gly Cys 690 695 700Phe Glu Gly Asp Glu
Glu Glu Glu Glu Asp Asp Glu Asp Glu Glu Glu705 710
715 720Ile Glu Val Glu Glu Glu Leu Cys Lys Gln
Val Arg Ser Arg Asp Ile 725 730
735Ser Arg Glu Glu Trp Lys Gly Ser Glu Thr Tyr Ser Pro Asn Thr Ala
740 745 750Tyr Gly Val Asp Phe
Leu Val Pro Val Met Gly Tyr Ile Cys Arg Ile 755
760 765Cys His Lys Phe Tyr His Ser Asn Ser Gly Ala Gln
Leu Ser His Cys 770 775 780Lys Ser Leu
Gly His Phe Glu Asn Leu Gln Lys Tyr Lys Ala Ala Lys785
790 795 800Asn Pro Ser Pro Thr Thr Arg
Pro Val Ser Arg Arg Cys Ala Ile Asn 805
810 815Ala Arg Asn Ala Leu Thr Ala Leu Phe Thr Ser Ser
Gly Arg Pro Pro 820 825 830Ser
Gln Pro Asn Thr Gln Asp Lys Thr Pro Ser Lys Val Thr Ala Arg 835
840 845Pro Ser Gln Pro Pro Leu Pro Arg Arg
Ser Thr Arg Leu Lys Thr 850 855
86059873PRTHomo sapiens 59Met Phe Ser Gln Gln Gln Gln Gln Leu Gln Gln Gln
Gln Gln Gln Leu1 5 10
15Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln
20 25 30Gln Leu Leu Gln Leu Gln Gln
Leu Leu Gln Gln Ser Pro Pro Gln Ala 35 40
45Pro Leu Pro Met Ala Val Ser Arg Gly Leu Pro Pro Gln Gln Pro
Gln 50 55 60Gln Pro Leu Leu Asn Leu
Gln Gly Thr Asn Ser Ala Ser Leu Leu Asn65 70
75 80Gly Ser Met Leu Gln Arg Ala Leu Leu Leu Gln
Gln Leu Gln Gly Asn 85 90
95Leu Arg Gly Tyr Gly Met Ala Ser Pro Gly Leu Ala Ala Pro Ser Leu
100 105 110Thr Pro Pro Gln Leu Ala
Thr Pro Asn Leu Gln Gln Phe Phe Pro Gln 115 120
125Ala Thr Arg Gln Ser Leu Leu Gly Pro Pro Pro Val Gly Val
Pro Met 130 135 140Asn Pro Ser Gln Phe
Asn Leu Ser Gly Arg Asn Pro Gln Lys Gln Ala145 150
155 160Arg Thr Ser Ser Ser Thr Thr Pro Asn Arg
Lys Asp Ser Ser Ser Gln 165 170
175Thr Met Pro Val Glu Asp Lys Ser Asp Pro Pro Glu Gly Ser Glu Glu
180 185 190Ala Ala Glu Pro Arg
Met Asp Thr Pro Glu Asp Gln Asp Leu Pro Pro 195
200 205Cys Pro Glu Asp Ile Ala Lys Glu Lys Arg Thr Pro
Ala Pro Glu Pro 210 215 220Glu Pro Cys
Glu Ala Ser Glu Leu Pro Ala Lys Arg Leu Arg Ser Ser225
230 235 240Glu Glu Pro Thr Glu Lys Glu
Pro Pro Gly Gln Leu Gln Val Lys Ala 245
250 255Gln Pro Gln Ala Arg Met Thr Val Pro Lys Gln Thr
Gln Thr Pro Asp 260 265 270Leu
Leu Pro Glu Ala Leu Glu Ala Gln Val Leu Pro Arg Phe Gln Pro 275
280 285Arg Val Leu Gln Val Gln Ala Gln Val
Gln Ser Gln Thr Gln Pro Arg 290 295
300Ile Pro Ser Thr Asp Thr Gln Val Gln Pro Lys Leu Gln Lys Gln Ala305
310 315 320Gln Thr Gln Thr
Ser Pro Glu His Leu Val Leu Gln Gln Lys Gln Val 325
330 335Gln Pro Gln Leu Gln Gln Glu Ala Glu Pro
Gln Lys Gln Val Gln Pro 340 345
350Gln Val Gln Pro Gln Ala His Ser Gln Gly Pro Arg Gln Val Gln Leu
355 360 365Gln Gln Glu Ala Glu Pro Leu
Lys Gln Val Gln Pro Gln Val Gln Pro 370 375
380Gln Ala His Ser Gln Pro Pro Arg Gln Val Gln Leu Gln Leu Gln
Lys385 390 395 400Gln Val
Gln Thr Gln Thr Tyr Pro Gln Val His Thr Gln Ala Gln Pro
405 410 415Ser Val Gln Pro Gln Glu His
Pro Pro Ala Gln Val Ser Val Gln Pro 420 425
430Pro Glu Gln Thr His Glu Gln Pro His Thr Gln Pro Gln Val
Ser Leu 435 440 445Leu Ala Pro Glu
Gln Thr Pro Val Val Val His Val Cys Gly Leu Glu 450
455 460Met Pro Pro Asp Ala Val Glu Ala Gly Gly Gly Met
Glu Lys Thr Leu465 470 475
480Pro Glu Pro Val Gly Thr Gln Val Ser Met Glu Glu Ile Gln Asn Glu
485 490 495Ser Ala Cys Gly Leu
Asp Val Gly Glu Cys Glu Asn Arg Ala Arg Glu 500
505 510Met Pro Gly Val Trp Gly Ala Gly Gly Ser Leu Lys
Val Thr Ile Leu 515 520 525Gln Ser
Ser Asp Ser Arg Ala Phe Ser Thr Val Pro Leu Thr Pro Val 530
535 540Pro Arg Pro Ser Asp Ser Val Ser Ser Thr Pro
Ala Ala Thr Ser Thr545 550 555
560Pro Ser Lys Gln Ala Leu Gln Phe Phe Cys Tyr Ile Cys Lys Ala Ser
565 570 575Cys Ser Ser Gln
Gln Glu Phe Gln Asp His Met Ser Glu Pro Gln His 580
585 590Gln Gln Arg Leu Gly Glu Ile Gln His Met Ser
Gln Ala Cys Leu Leu 595 600 605Ser
Leu Leu Pro Val Pro Arg Asp Val Leu Glu Thr Glu Asp Glu Glu 610
615 620Pro Pro Pro Arg Arg Trp Cys Asn Thr Cys
Gln Leu Tyr Tyr Met Gly625 630 635
640Asp Leu Ile Gln His Arg Arg Thr Gln Asp His Lys Ile Ala Lys
Gln 645 650 655Ser Leu Arg
Pro Phe Cys Thr Val Cys Asn Arg Tyr Phe Lys Thr Pro 660
665 670Arg Lys Phe Val Glu His Val Lys Ser Gln
Gly His Lys Asp Lys Ala 675 680
685Lys Glu Leu Lys Ser Leu Glu Lys Glu Ile Ala Gly Gln Asp Glu Asp 690
695 700His Phe Ile Thr Val Asp Ala Val
Gly Cys Phe Glu Gly Asp Glu Glu705 710
715 720Glu Glu Glu Asp Asp Glu Asp Glu Glu Glu Ile Glu
Val Glu Glu Glu 725 730
735Leu Cys Lys Gln Val Arg Ser Arg Asp Ile Ser Arg Glu Glu Trp Lys
740 745 750Gly Ser Glu Thr Tyr Ser
Pro Asn Thr Ala Tyr Gly Val Asp Phe Leu 755 760
765Val Pro Val Met Gly Tyr Ile Cys Arg Ile Cys His Lys Phe
Tyr His 770 775 780Ser Asn Ser Gly Ala
Gln Leu Ser His Cys Lys Ser Leu Gly His Phe785 790
795 800Glu Asn Leu Gln Lys Tyr Lys Ala Ala Lys
Asn Pro Ser Pro Thr Thr 805 810
815Arg Pro Val Ser Arg Arg Cys Ala Ile Asn Ala Arg Asn Ala Leu Thr
820 825 830Ala Leu Phe Thr Ser
Ser Gly Arg Pro Pro Ser Gln Pro Asn Thr Gln 835
840 845Asp Lys Thr Pro Ser Lys Val Thr Ala Arg Pro Ser
Gln Pro Pro Leu 850 855 860Pro Arg Arg
Ser Thr Arg Leu Lys Thr865 87060892PRTHomo sapiens 60Met
Phe Ser Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln Gln Gln Leu1
5 10 15Gln Gln Leu Gln Gln Gln Gln
Leu Gln Gln Gln Gln Leu Gln Gln Gln 20 25
30Gln Leu Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro Pro
Gln Ala 35 40 45Pro Leu Pro Met
Ala Val Ser Arg Gly Leu Pro Pro Gln Gln Pro Gln 50 55
60Gln Pro Leu Leu Asn Leu Gln Gly Thr Asn Ser Ala Ser
Leu Leu Asn65 70 75
80Gly Ser Met Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly Leu
85 90 95Asp Gln Phe Ala Met Pro
Pro Ala Thr Tyr Asp Thr Ala Gly Leu Thr 100
105 110Met Pro Thr Ala Thr Leu Gly Asn Leu Arg Gly Tyr
Gly Met Ala Ser 115 120 125Pro Gly
Leu Ala Ala Pro Ser Leu Thr Pro Pro Gln Leu Ala Thr Pro 130
135 140Asn Leu Gln Gln Phe Phe Pro Gln Ala Thr Arg
Gln Ser Leu Leu Gly145 150 155
160Pro Pro Pro Val Gly Val Pro Met Asn Pro Ser Gln Phe Asn Leu Ser
165 170 175Gly Arg Asn Pro
Gln Lys Gln Ala Arg Thr Ser Ser Ser Thr Thr Pro 180
185 190Asn Arg Lys Thr Met Pro Val Glu Asp Lys Ser
Asp Pro Pro Glu Gly 195 200 205Ser
Glu Glu Ala Ala Glu Pro Arg Met Asp Thr Pro Glu Asp Gln Asp 210
215 220Leu Pro Pro Cys Pro Glu Asp Ile Ala Lys
Glu Lys Arg Thr Pro Ala225 230 235
240Pro Glu Pro Glu Pro Cys Glu Ala Ser Glu Leu Pro Ala Lys Arg
Leu 245 250 255Arg Ser Ser
Glu Glu Pro Thr Glu Lys Glu Pro Pro Gly Gln Leu Gln 260
265 270Val Lys Ala Gln Pro Gln Ala Arg Met Thr
Val Pro Lys Gln Thr Gln 275 280
285Thr Pro Asp Leu Leu Pro Glu Ala Leu Glu Ala Gln Val Leu Pro Arg 290
295 300Phe Gln Pro Arg Val Leu Gln Val
Gln Ala Gln Val Gln Ser Gln Thr305 310
315 320Gln Pro Arg Ile Pro Ser Thr Asp Thr Gln Val Gln
Pro Lys Leu Gln 325 330
335Lys Gln Ala Gln Thr Gln Thr Ser Pro Glu His Leu Val Leu Gln Gln
340 345 350Lys Gln Val Gln Pro Gln
Leu Gln Gln Glu Ala Glu Pro Gln Lys Gln 355 360
365Val Gln Pro Gln Val Gln Pro Gln Ala His Ser Gln Gly Pro
Arg Gln 370 375 380Val Gln Leu Gln Gln
Glu Ala Glu Pro Leu Lys Gln Val Gln Pro Gln385 390
395 400Val Gln Pro Gln Ala His Ser Gln Pro Pro
Arg Gln Val Gln Leu Gln 405 410
415Leu Gln Lys Gln Val Gln Thr Gln Thr Tyr Pro Gln Val His Thr Gln
420 425 430Ala Gln Pro Ser Val
Gln Pro Gln Glu His Pro Pro Ala Gln Val Ser 435
440 445Val Gln Pro Pro Glu Gln Thr His Glu Gln Pro His
Thr Gln Pro Gln 450 455 460Val Ser Leu
Leu Ala Pro Glu Gln Thr Pro Val Val Val His Val Cys465
470 475 480Gly Leu Glu Met Pro Pro Asp
Ala Val Glu Ala Gly Gly Gly Met Glu 485
490 495Lys Thr Leu Pro Glu Pro Val Gly Thr Gln Val Ser
Met Glu Glu Ile 500 505 510Gln
Asn Glu Ser Ala Cys Gly Leu Asp Val Gly Glu Cys Glu Asn Arg 515
520 525Ala Arg Glu Met Pro Gly Val Trp Gly
Ala Gly Gly Ser Leu Lys Val 530 535
540Thr Ile Leu Gln Ser Ser Asp Ser Arg Ala Phe Ser Thr Val Pro Leu545
550 555 560Thr Pro Val Pro
Arg Pro Ser Asp Ser Val Ser Ser Thr Pro Ala Ala 565
570 575Thr Ser Thr Pro Ser Lys Gln Ala Leu Gln
Phe Phe Cys Tyr Ile Cys 580 585
590Lys Ala Ser Cys Ser Ser Gln Gln Glu Phe Gln Asp His Met Ser Glu
595 600 605Pro Gln His Gln Gln Arg Leu
Gly Glu Ile Gln His Met Ser Gln Ala 610 615
620Cys Leu Leu Ser Leu Leu Pro Val Pro Arg Asp Val Leu Glu Thr
Glu625 630 635 640Asp Glu
Glu Pro Pro Pro Arg Arg Trp Cys Asn Thr Cys Gln Leu Tyr
645 650 655Tyr Met Gly Asp Leu Ile Gln
His Arg Arg Thr Gln Asp His Lys Ile 660 665
670Ala Lys Gln Ser Leu Arg Pro Phe Cys Thr Val Cys Asn Arg
Tyr Phe 675 680 685Lys Thr Pro Arg
Lys Phe Val Glu His Val Lys Ser Gln Gly His Lys 690
695 700Asp Lys Ala Lys Glu Leu Lys Ser Leu Glu Lys Glu
Ile Ala Gly Gln705 710 715
720Asp Glu Asp His Phe Ile Thr Val Asp Ala Val Gly Cys Phe Glu Gly
725 730 735Asp Glu Glu Glu Glu
Glu Asp Asp Glu Asp Glu Glu Glu Ile Glu Val 740
745 750Glu Glu Glu Leu Cys Lys Gln Val Arg Ser Arg Asp
Ile Ser Arg Glu 755 760 765Glu Trp
Lys Gly Ser Glu Thr Tyr Ser Pro Asn Thr Ala Tyr Gly Val 770
775 780Asp Phe Leu Val Pro Val Met Gly Tyr Ile Cys
Arg Ile Cys His Lys785 790 795
800Phe Tyr His Ser Asn Ser Gly Ala Gln Leu Ser His Cys Lys Ser Leu
805 810 815Gly His Phe Glu
Asn Leu Gln Lys Tyr Lys Ala Ala Lys Asn Pro Ser 820
825 830Pro Thr Thr Arg Pro Val Ser Arg Arg Cys Ala
Ile Asn Ala Arg Asn 835 840 845Ala
Leu Thr Ala Leu Phe Thr Ser Ser Gly Arg Pro Pro Ser Gln Pro 850
855 860Asn Thr Gln Asp Lys Thr Pro Ser Lys Val
Thr Ala Arg Pro Ser Gln865 870 875
880Pro Pro Leu Pro Arg Arg Ser Thr Arg Leu Lys Thr
885 89061868PRTHomo sapiens 61Met Phe Ser Gln Gln Gln
Gln Gln Leu Gln Gln Gln Gln Gln Gln Leu1 5
10 15Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln
Leu Gln Gln Gln 20 25 30Gln
Leu Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro Pro Gln Ala 35
40 45Pro Leu Pro Met Ala Val Ser Arg Gly
Leu Pro Pro Gln Gln Pro Gln 50 55
60Gln Pro Leu Leu Asn Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu Asn65
70 75 80Gly Ser Met Leu Gln
Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly Leu 85
90 95Asp Gln Phe Ala Met Pro Pro Ala Thr Tyr Asp
Thr Ala Gly Leu Thr 100 105
110Met Pro Thr Ala Thr Leu Gly Asn Leu Arg Gly Tyr Gly Met Ala Ser
115 120 125Pro Gly Leu Ala Ala Pro Ser
Leu Thr Pro Pro Gln Leu Ala Thr Pro 130 135
140Asn Leu Gln Gln Phe Phe Pro Gln Ala Thr Arg Gln Ser Leu Leu
Gly145 150 155 160Pro Pro
Pro Val Gly Val Pro Met Asn Pro Ser Gln Phe Asn Leu Ser
165 170 175Gly Arg Asn Pro Gln Lys Gln
Ala Arg Thr Ser Ser Ser Thr Thr Pro 180 185
190Asn Arg Lys Asp Ser Ser Ser Gln Thr Met Pro Val Glu Asp
Lys Ser 195 200 205Asp Pro Pro Glu
Gly Ser Glu Glu Ala Ala Glu Pro Arg Met Asp Thr 210
215 220Pro Glu Asp Gln Asp Leu Pro Pro Cys Pro Glu Asp
Ile Ala Lys Glu225 230 235
240Lys Arg Thr Pro Ala Pro Glu Pro Glu Pro Cys Glu Ala Ser Glu Leu
245 250 255Pro Ala Lys Arg Leu
Arg Ser Ser Glu Glu Pro Thr Glu Lys Glu Pro 260
265 270Pro Gly Gln Leu Gln Val Lys Ala Gln Pro Gln Ala
Arg Met Thr Val 275 280 285Pro Lys
Gln Thr Gln Thr Pro Asp Leu Leu Pro Glu Ala Leu Glu Ala 290
295 300Gln Val Leu Pro Arg Phe Gln Pro Arg Val Leu
Gln Val Gln Ala Gln305 310 315
320Val Gln Ser Gln Thr Gln Pro Arg Ile Pro Ser Thr Asp Thr Gln Val
325 330 335Gln Pro Lys Leu
Gln Lys Gln Ala Gln Thr Gln Thr Ser Pro Glu His 340
345 350Leu Val Leu Gln Gln Lys Gln Val Gln Pro Gln
Leu Gln Gln Glu Ala 355 360 365Glu
Pro Gln Lys Gln Val Gln Pro Gln Val Gln Pro Gln Ala His Ser 370
375 380Gln Gly Pro Arg Gln Val Gln Leu Gln Gln
Glu Ala Glu Pro Leu Lys385 390 395
400Gln Val Gln Gln Val His Thr Gln Ala Gln Pro Ser Val Gln Pro
Gln 405 410 415Glu His Pro
Pro Ala Gln Val Ser Val Gln Pro Pro Glu Gln Thr His 420
425 430Glu Gln Pro His Thr Gln Pro Gln Val Ser
Leu Leu Ala Pro Glu Gln 435 440
445Thr Pro Val Val Val His Val Cys Gly Leu Glu Met Pro Pro Asp Ala 450
455 460Val Glu Ala Gly Gly Gly Met Glu
Lys Thr Leu Pro Glu Pro Val Gly465 470
475 480Thr Gln Val Ser Met Glu Glu Ile Gln Asn Glu Ser
Ala Cys Gly Leu 485 490
495Asp Val Gly Glu Cys Glu Asn Arg Ala Arg Glu Met Pro Gly Val Trp
500 505 510Gly Ala Gly Gly Ser Leu
Lys Val Thr Ile Leu Gln Ser Ser Asp Ser 515 520
525Arg Ala Phe Ser Thr Val Pro Leu Thr Pro Val Pro Arg Pro
Ser Asp 530 535 540Ser Val Ser Ser Thr
Pro Ala Ala Thr Ser Thr Pro Ser Lys Gln Ala545 550
555 560Leu Gln Phe Phe Cys Tyr Ile Cys Lys Ala
Ser Cys Ser Ser Gln Gln 565 570
575Glu Phe Gln Asp His Met Ser Glu Pro Gln His Gln Gln Arg Leu Gly
580 585 590Glu Ile Gln His Met
Ser Gln Ala Cys Leu Leu Ser Leu Leu Pro Val 595
600 605Pro Arg Asp Val Leu Glu Thr Glu Asp Glu Glu Pro
Pro Pro Arg Arg 610 615 620Trp Cys Asn
Thr Cys Gln Leu Tyr Tyr Met Gly Asp Leu Ile Gln His625
630 635 640Arg Arg Thr Gln Asp His Lys
Ile Ala Lys Gln Ser Leu Arg Pro Phe 645
650 655Cys Thr Val Cys Asn Arg Tyr Phe Lys Thr Pro Arg
Lys Phe Val Glu 660 665 670His
Val Lys Ser Gln Gly His Lys Asp Lys Ala Lys Glu Leu Lys Ser 675
680 685Leu Glu Lys Glu Ile Ala Gly Gln Asp
Glu Asp His Phe Ile Thr Val 690 695
700Asp Ala Val Gly Cys Phe Glu Gly Asp Glu Glu Glu Glu Glu Asp Asp705
710 715 720Glu Asp Glu Glu
Glu Ile Glu Val Glu Glu Glu Leu Cys Lys Gln Val 725
730 735Arg Ser Arg Asp Ile Ser Arg Glu Glu Trp
Lys Gly Ser Glu Thr Tyr 740 745
750Ser Pro Asn Thr Ala Tyr Gly Val Asp Phe Leu Val Pro Val Met Gly
755 760 765Tyr Ile Cys Arg Ile Cys His
Lys Phe Tyr His Ser Asn Ser Gly Ala 770 775
780Gln Leu Ser His Cys Lys Ser Leu Gly His Phe Glu Asn Leu Gln
Lys785 790 795 800Tyr Lys
Ala Ala Lys Asn Pro Ser Pro Thr Thr Arg Pro Val Ser Arg
805 810 815Arg Cys Ala Ile Asn Ala Arg
Asn Ala Leu Thr Ala Leu Phe Thr Ser 820 825
830Ser Gly Arg Pro Pro Ser Gln Pro Asn Thr Gln Asp Lys Thr
Pro Ser 835 840 845Lys Val Thr Ala
Arg Pro Ser Gln Pro Pro Leu Pro Arg Arg Ser Thr 850
855 860Arg Leu Lys Thr86562841PRTHomo sapiens 62Met Phe
Ser Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln Gln Gln Leu1 5
10 15Gln Gln Leu Gln Gln Gln Gln Leu
Gln Gln Gln Gln Leu Gln Gln Gln 20 25
30Gln Leu Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro Pro Gln
Ala 35 40 45Pro Leu Pro Met Ala
Val Ser Arg Gly Leu Pro Pro Gln Gln Pro Gln 50 55
60Gln Pro Leu Leu Asn Leu Gln Gly Thr Asn Ser Ala Ser Leu
Leu Asn65 70 75 80Gly
Ser Met Leu Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly Leu
85 90 95Asp Gln Phe Ala Met Pro Pro
Ala Thr Tyr Asp Thr Ala Gly Leu Thr 100 105
110Met Pro Thr Ala Thr Leu Gly Asn Leu Arg Gly Tyr Gly Met
Ala Ser 115 120 125Pro Gly Leu Ala
Ala Pro Ser Leu Thr Pro Pro Gln Leu Ala Thr Pro 130
135 140Asn Leu Gln Gln Phe Phe Pro Gln Ala Thr Arg Gln
Ser Leu Leu Gly145 150 155
160Pro Pro Pro Val Gly Val Pro Met Asn Pro Ser Gln Phe Asn Leu Ser
165 170 175Gly Arg Asn Pro Gln
Lys Gln Ala Arg Thr Ser Ser Ser Thr Thr Pro 180
185 190Asn Arg Lys Asp Ser Ser Ser Gln Thr Met Pro Val
Glu Asp Lys Ser 195 200 205Asp Pro
Pro Glu Gly Ser Glu Glu Ala Ala Glu Pro Arg Met Asp Thr 210
215 220Pro Glu Asp Gln Asp Leu Pro Pro Cys Pro Glu
Asp Ile Ala Lys Glu225 230 235
240Lys Arg Thr Pro Ala Pro Glu Pro Glu Pro Cys Glu Ala Ser Glu Leu
245 250 255Pro Ala Lys Arg
Leu Arg Ser Ser Glu Glu Pro Thr Glu Lys Glu Pro 260
265 270Pro Gly Gln Leu Gln Val Lys Ala Gln Pro Gln
Ala Arg Met Thr Val 275 280 285Pro
Lys Gln Thr Gln Thr Pro Asp Leu Leu Pro Glu Ala Leu Glu Ala 290
295 300Gln Val Leu Pro Arg Phe Gln Pro Arg Val
Leu Gln Val Gln Ala Gln305 310 315
320Val Gln Ser Gln Thr Gln Pro Arg Ile Pro Ser Thr Asp Thr Gln
Val 325 330 335Gln Pro Lys
Leu Gln Lys Gln Ala Gln Thr Gln Thr Ser Pro Glu His 340
345 350Leu Val Leu Gln Gln Lys Gln Val Gln Pro
Gln Leu Gln Gln Glu Ala 355 360
365Glu Pro Gln Lys Gln Val Gln Pro Gln Val His Thr Gln Ala Gln Pro 370
375 380Ser Val Gln Pro Gln Glu His Pro
Pro Ala Gln Val Ser Val Gln Pro385 390
395 400Pro Glu Gln Thr His Glu Gln Pro His Thr Gln Pro
Gln Val Ser Leu 405 410
415Leu Ala Pro Glu Gln Thr Pro Val Val Val His Val Cys Gly Leu Glu
420 425 430Met Pro Pro Asp Ala Val
Glu Ala Gly Gly Gly Met Glu Lys Thr Leu 435 440
445Pro Glu Pro Val Gly Thr Gln Val Ser Met Glu Glu Ile Gln
Asn Glu 450 455 460Ser Ala Cys Gly Leu
Asp Val Gly Glu Cys Glu Asn Arg Ala Arg Glu465 470
475 480Met Pro Gly Val Trp Gly Ala Gly Gly Ser
Leu Lys Val Thr Ile Leu 485 490
495Gln Ser Ser Asp Ser Arg Ala Phe Ser Thr Val Pro Leu Thr Pro Val
500 505 510Pro Arg Pro Ser Asp
Ser Val Ser Ser Thr Pro Ala Ala Thr Ser Thr 515
520 525Pro Ser Lys Gln Ala Leu Gln Phe Phe Cys Tyr Ile
Cys Lys Ala Ser 530 535 540Cys Ser Ser
Gln Gln Glu Phe Gln Asp His Met Ser Glu Pro Gln His545
550 555 560Gln Gln Arg Leu Gly Glu Ile
Gln His Met Ser Gln Ala Cys Leu Leu 565
570 575Ser Leu Leu Pro Val Pro Arg Asp Val Leu Glu Thr
Glu Asp Glu Glu 580 585 590Pro
Pro Pro Arg Arg Trp Cys Asn Thr Cys Gln Leu Tyr Tyr Met Gly 595
600 605Asp Leu Ile Gln His Arg Arg Thr Gln
Asp His Lys Ile Ala Lys Gln 610 615
620Ser Leu Arg Pro Phe Cys Thr Val Cys Asn Arg Tyr Phe Lys Thr Pro625
630 635 640Arg Lys Phe Val
Glu His Val Lys Ser Gln Gly His Lys Asp Lys Ala 645
650 655Lys Glu Leu Lys Ser Leu Glu Lys Glu Ile
Ala Gly Gln Asp Glu Asp 660 665
670His Phe Ile Thr Val Asp Ala Val Gly Cys Phe Glu Gly Asp Glu Glu
675 680 685Glu Glu Glu Asp Asp Glu Asp
Glu Glu Glu Ile Glu Val Glu Glu Glu 690 695
700Leu Cys Lys Gln Val Arg Ser Arg Asp Ile Ser Arg Glu Glu Trp
Lys705 710 715 720Gly Ser
Glu Thr Tyr Ser Pro Asn Thr Ala Tyr Gly Val Asp Phe Leu
725 730 735Val Pro Val Met Gly Tyr Ile
Cys Arg Ile Cys His Lys Phe Tyr His 740 745
750Ser Asn Ser Gly Ala Gln Leu Ser His Cys Lys Ser Leu Gly
His Phe 755 760 765Glu Asn Leu Gln
Lys Tyr Lys Ala Ala Lys Asn Pro Ser Pro Thr Thr 770
775 780Arg Pro Val Ser Arg Arg Cys Ala Ile Asn Ala Arg
Asn Ala Leu Thr785 790 795
800Ala Leu Phe Thr Ser Ser Gly Arg Pro Pro Ser Gln Pro Asn Thr Gln
805 810 815Asp Lys Thr Pro Ser
Lys Val Thr Ala Arg Pro Ser Gln Pro Pro Leu 820
825 830Pro Arg Arg Ser Thr Arg Leu Lys Thr 835
84063785PRTHomo sapiens 63Met Phe Ser Gln Gln Gln Gln Gln
Leu Gln Gln Gln Gln Gln Gln Leu1 5 10
15Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln
Gln Gln 20 25 30Gln Leu Leu
Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro Pro Gln Ala 35
40 45Pro Leu Pro Met Ala Val Ser Arg Gly Leu Pro
Pro Gln Gln Pro Gln 50 55 60Gln Pro
Leu Leu Asn Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu Asn65
70 75 80Gly Ser Met Leu Gln Arg Ala
Leu Leu Leu Gln Gln Leu Gln Gly Leu 85 90
95Asp Gln Phe Ala Met Pro Pro Ala Thr Tyr Asp Thr Ala
Gly Leu Thr 100 105 110Met Pro
Thr Ala Thr Leu Gly Asn Leu Arg Gly Tyr Gly Met Ala Ser 115
120 125Pro Gly Leu Ala Ala Pro Ser Leu Thr Pro
Pro Gln Leu Ala Thr Pro 130 135 140Asn
Leu Gln Gln Phe Phe Pro Gln Ala Thr Arg Gln Ser Leu Leu Gly145
150 155 160Pro Pro Pro Val Gly Val
Pro Met Asn Pro Ser Gln Phe Asn Leu Ser 165
170 175Gly Arg Asn Pro Gln Lys Gln Ala Arg Thr Ser Ser
Ser Thr Thr Pro 180 185 190Asn
Arg Lys Asp Ser Ser Ser Gln Thr Met Pro Val Glu Asp Lys Ser 195
200 205Asp Pro Pro Glu Gly Ser Glu Glu Ala
Ala Glu Pro Arg Met Asp Thr 210 215
220Pro Glu Asp Gln Asp Leu Pro Pro Cys Pro Glu Asp Ile Ala Lys Glu225
230 235 240Lys Arg Thr Pro
Ala Pro Glu Pro Glu Pro Cys Glu Ala Ser Glu Leu 245
250 255Pro Ala Lys Arg Leu Arg Ser Ser Glu Glu
Pro Thr Glu Lys Glu Pro 260 265
270Pro Gly Gln Leu Gln Val Lys Ala Gln Pro Gln Ala Arg Met Thr Val
275 280 285Pro Lys Gln Thr Gln Thr Pro
Asp Leu Leu Pro Glu Ala Leu Glu Ala 290 295
300Gln Val Leu Pro Arg Phe Gln Pro Arg Val Leu Gln Val Gln Ala
Pro305 310 315 320Gln Val
His Thr Gln Ala Gln Pro Ser Val Gln Pro Gln Glu His Pro
325 330 335Pro Ala Gln Val Ser Val Gln
Pro Pro Glu Gln Thr His Glu Gln Pro 340 345
350His Thr Gln Pro Gln Val Ser Leu Leu Ala Pro Glu Gln Thr
Pro Val 355 360 365Val Val His Val
Cys Gly Leu Glu Met Pro Pro Asp Ala Val Glu Ala 370
375 380Gly Gly Gly Met Glu Lys Thr Leu Pro Glu Pro Val
Gly Thr Gln Val385 390 395
400Ser Met Glu Glu Ile Gln Asn Glu Ser Ala Cys Gly Leu Asp Val Gly
405 410 415Glu Cys Glu Asn Arg
Ala Arg Glu Met Pro Gly Val Trp Gly Ala Gly 420
425 430Gly Ser Leu Lys Val Thr Ile Leu Gln Ser Ser Asp
Ser Arg Ala Phe 435 440 445Ser Thr
Val Pro Leu Thr Pro Val Pro Arg Pro Ser Asp Ser Val Ser 450
455 460Ser Thr Pro Ala Ala Thr Ser Thr Pro Ser Lys
Gln Ala Leu Gln Phe465 470 475
480Phe Cys Tyr Ile Cys Lys Ala Ser Cys Ser Ser Gln Gln Glu Phe Gln
485 490 495Asp His Met Ser
Glu Pro Gln His Gln Gln Arg Leu Gly Glu Ile Gln 500
505 510His Met Ser Gln Ala Cys Leu Leu Ser Leu Leu
Pro Val Pro Arg Asp 515 520 525Val
Leu Glu Thr Glu Asp Glu Glu Pro Pro Pro Arg Arg Trp Cys Asn 530
535 540Thr Cys Gln Leu Tyr Tyr Met Gly Asp Leu
Ile Gln His Arg Arg Thr545 550 555
560Gln Asp His Lys Ile Ala Lys Gln Ser Leu Arg Pro Phe Cys Thr
Val 565 570 575Cys Asn Arg
Tyr Phe Lys Thr Pro Arg Lys Phe Val Glu His Val Lys 580
585 590Ser Gln Gly His Lys Asp Lys Ala Lys Glu
Leu Lys Ser Leu Glu Lys 595 600
605Glu Ile Ala Gly Gln Asp Glu Asp His Phe Ile Thr Val Asp Ala Val 610
615 620Gly Cys Phe Glu Gly Asp Glu Glu
Glu Glu Glu Asp Asp Glu Asp Glu625 630
635 640Glu Glu Ile Glu Val Glu Glu Glu Leu Cys Lys Gln
Val Arg Ser Arg 645 650
655Asp Ile Ser Arg Glu Glu Trp Lys Gly Ser Glu Thr Tyr Ser Pro Asn
660 665 670Thr Ala Tyr Gly Val Asp
Phe Leu Val Pro Val Met Gly Tyr Ile Cys 675 680
685Arg Ile Cys His Lys Phe Tyr His Ser Asn Ser Gly Ala Gln
Leu Ser 690 695 700His Cys Lys Ser Leu
Gly His Phe Glu Asn Leu Gln Lys Tyr Lys Ala705 710
715 720Ala Lys Asn Pro Ser Pro Thr Thr Arg Pro
Val Ser Arg Arg Cys Ala 725 730
735Ile Asn Ala Arg Asn Ala Leu Thr Ala Leu Phe Thr Ser Ser Gly Arg
740 745 750Pro Pro Ser Gln Pro
Asn Thr Gln Asp Lys Thr Pro Ser Lys Val Thr 755
760 765Ala Arg Pro Ser Gln Pro Pro Leu Pro Arg Arg Ser
Thr Arg Leu Lys 770 775
780 Thr 785
64 889 PRT Homo sapiens
64 Met Phe Ser Gln Gln Gln Gln Gln Leu Gln
Gln Gln Gln Gln Gln Leu1 5 10
15Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln
20 25 30Gln Leu Leu Gln Leu Gln
Gln Leu Leu Gln Gln Ser Pro Pro Gln Ala 35 40
45Pro Leu Pro Met Ala Val Ser Arg Gly Leu Pro Pro Gln Gln
Pro Gln 50 55 60Gln Pro Leu Leu Asn
Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu Asn65 70
75 80Gly Ser Met Leu Gln Arg Ala Leu Leu Leu
Gln Gln Leu Gln Gly Leu 85 90
95Asp Gln Phe Ala Met Pro Pro Ala Thr Tyr Asp Thr Ala Gly Leu Thr
100 105 110Met Pro Thr Ala Thr
Leu Gly Asn Leu Arg Gly Tyr Gly Met Ala Ser 115
120 125Pro Gly Leu Ala Ala Pro Ser Leu Thr Pro Pro Gln
Leu Ala Thr Pro 130 135 140Asn Leu Gln
Gln Phe Phe Pro Gln Ala Thr Arg Gln Ser Leu Leu Gly145
150 155 160Pro Pro Pro Val Gly Val Pro
Met Asn Pro Ser Gln Phe Asn Leu Ser 165
170 175Gly Arg Asn Pro Gln Lys Gln Ala Arg Thr Ser Ser
Ser Thr Thr Pro 180 185 190Asn
Arg Lys Asp Ser Ser Ser Gln Thr Met Pro Val Glu Asp Lys Ser 195
200 205Asp Pro Pro Glu Gly Ser Glu Glu Ala
Ala Glu Pro Arg Met Asp Thr 210 215
220Pro Glu Asp Gln Asp Leu Pro Pro Cys Pro Glu Asp Ile Ala Lys Glu225
230 235 240Lys Arg Thr Pro
Ala Pro Glu Pro Glu Pro Cys Glu Ala Ser Glu Leu 245
250 255Pro Ala Lys Arg Leu Arg Ser Ser Glu Glu
Pro Thr Glu Lys Glu Pro 260 265
270Pro Gly Gln Leu Gln Val Lys Ala Gln Pro Gln Ala Arg Met Thr Val
275 280 285Pro Lys Gln Thr Gln Thr Pro
Asp Leu Leu Pro Glu Ala Leu Glu Ala 290 295
300Gln Val Leu Pro Arg Phe Gln Pro Arg Val Leu Gln Val Gln Ala
Gln305 310 315 320Val Gln
Ser Gln Thr Gln Pro Arg Ile Pro Ser Thr Asp Thr Gln Val
325 330 335Gln Pro Lys Leu Gln Lys Gln
Ala Gln Thr Gln Thr Ser Pro Glu His 340 345
350Leu Val Leu Gln Gln Lys Gln Val Gln Pro Gln Leu Gln Gln
Glu Ala 355 360 365Glu Pro Gln Lys
Gln Val Gln Pro Gln Val Gln Pro Gln Ala His Ser 370
375 380Gln Gly Pro Arg Gln Val Gln Leu Gln Gln Glu Ala
Glu Pro Leu Lys385 390 395
400Gln Val Gln Pro Gln Val Gln Pro Gln Ala His Ser Gln Pro Pro Arg
405 410 415Gln Val Gln Leu Gln
Leu Gln Lys Gln Val Gln Thr Gln Thr Tyr Pro 420
425 430Gln Val His Thr Gln Ala Gln Pro Ser Val Gln Pro
Gln Glu His Pro 435 440 445Pro Ala
Gln Val Ser Val Gln Pro Pro Glu Gln Thr His Glu Gln Pro 450
455 460His Thr Gln Pro Gln Val Ser Leu Leu Ala Pro
Glu Gln Thr Pro Val465 470 475
480Val Val His Val Cys Gly Leu Glu Met Pro Pro Asp Ala Val Glu Ala
485 490 495Gly Gly Gly Met
Glu Lys Thr Leu Pro Glu Pro Val Gly Thr Gln Val 500
505 510Ser Met Glu Glu Ile Gln Asn Glu Ser Ala Cys
Gly Leu Asp Val Gly 515 520 525Glu
Cys Glu Asn Arg Ala Arg Glu Met Pro Gly Val Trp Gly Ala Gly 530
535 540Gly Ser Leu Lys Val Thr Ile Leu Gln Ser
Ser Asp Ser Arg Ala Phe545 550 555
560Ser Thr Val Pro Leu Thr Pro Val Pro Arg Pro Ser Asp Ser Val
Ser 565 570 575Ser Thr Pro
Ala Ala Thr Ser Thr Pro Ser Lys Gln Ala Leu Gln Phe 580
585 590Phe Cys Tyr Ile Cys Lys Ala Ser Cys Ser
Ser Gln Gln Glu Phe Gln 595 600
605Asp His Met Ser Glu Pro Gln His Gln Gln Arg Leu Gly Glu Ile Gln 610
615 620His Met Ser Gln Ala Cys Leu Leu
Ser Leu Leu Pro Val Pro Arg Asp625 630
635 640Val Leu Glu Thr Glu Asp Glu Glu Pro Pro Pro Arg
Arg Trp Cys Asn 645 650
655Thr Cys Gln Leu Tyr Tyr Met Gly Asp Leu Ile Gln His Arg Arg Thr
660 665 670Gln Asp His Lys Ile Ala
Lys Gln Ser Leu Arg Pro Phe Cys Thr Val 675 680
685Cys Asn Arg Tyr Phe Lys Thr Pro Arg Lys Phe Val Glu His
Val Lys 690 695 700Ser Gln Gly His Lys
Asp Lys Ala Lys Glu Leu Lys Ser Leu Glu Lys705 710
715 720Glu Ile Ala Gly Gln Asp Glu Asp His Phe
Ile Thr Val Asp Ala Val 725 730
735Gly Cys Phe Glu Gly Asp Glu Glu Glu Glu Glu Asp Asp Glu Asp Glu
740 745 750Glu Glu Ile Glu Val
Arg Ser Arg Asp Ile Ser Arg Glu Glu Trp Lys 755
760 765Gly Ser Glu Thr Tyr Ser Pro Asn Thr Ala Tyr Gly
Val Asp Phe Leu 770 775 780Val Pro Val
Met Gly Tyr Ile Cys Arg Ile Cys His Lys Phe Tyr His785
790 795 800Ser Asn Ser Gly Ala Gln Leu
Ser His Cys Lys Ser Leu Gly His Phe 805
810 815Glu Asn Leu Gln Lys Tyr Lys Ala Ala Lys Asn Pro
Ser Pro Thr Thr 820 825 830Arg
Pro Val Ser Arg Arg Cys Ala Ile Asn Ala Arg Asn Ala Leu Thr 835
840 845Ala Leu Phe Thr Ser Ser Gly Arg Pro
Pro Ser Gln Pro Asn Thr Gln 850 855
860Asp Lys Thr Pro Ser Lys Val Thr Ala Arg Pro Ser Gln Pro Pro Leu865
870 875 880Pro Arg Arg Ser
Thr Arg Leu Lys Thr 885
65 873 PRT Homo sapiens
65 Met Phe Ser
Gln Gln Gln Gln Gln Leu Gln Gln Gln Gln Gln Gln Leu1 5
10 15Gln Gln Leu Gln Gln Gln Gln Leu Gln
Gln Gln Gln Leu Gln Gln Gln 20 25
30Gln Leu Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro Pro Gln Ala
35 40 45Pro Leu Pro Met Ala Val Ser
Arg Gly Leu Pro Pro Gln Gln Pro Gln 50 55
60Gln Pro Leu Leu Asn Leu Gln Gly Thr Asn Ser Ala Ser Leu Leu Asn65
70 75 80Gly Ser Met Leu
Gln Arg Ala Leu Leu Leu Gln Gln Leu Gln Gly Asn 85
90 95Leu Arg Gly Tyr Gly Met Ala Ser Pro Gly
Leu Ala Ala Pro Ser Leu 100 105
110Thr Pro Pro Gln Leu Ala Thr Pro Asn Leu Gln Gln Phe Phe Pro Gln
115 120 125Ala Thr Arg Gln Ser Leu Leu
Gly Pro Pro Pro Val Gly Val Pro Met 130 135
140Asn Pro Ser Gln Phe Asn Leu Ser Gly Arg Asn Pro Gln Lys Gln
Ala145 150 155 160Arg Thr
Ser Ser Ser Thr Thr Pro Asn Arg Lys Asp Ser Ser Ser Gln
165 170 175Thr Met Pro Val Glu Asp Lys
Ser Asp Pro Pro Glu Gly Ser Glu Glu 180 185
190Ala Ala Glu Pro Arg Met Asp Thr Pro Glu Asp Gln Asp Leu
Pro Pro 195 200 205Cys Pro Glu Asp
Ile Ala Lys Glu Lys Arg Thr Pro Ala Pro Glu Pro 210
215 220Glu Pro Cys Glu Ala Ser Glu Leu Pro Ala Lys Arg
Leu Arg Ser Ser225 230 235
240Glu Glu Pro Thr Glu Lys Glu Pro Pro Gly Gln Leu Gln Val Lys Ala
245 250 255Gln Pro Gln Ala Arg
Met Thr Val Pro Lys Gln Thr Gln Thr Pro Asp 260
265 270Leu Leu Pro Glu Ala Leu Glu Ala Gln Val Leu Pro
Arg Phe Gln Pro 275 280 285Arg Val
Leu Gln Val Gln Ala Gln Val Gln Ser Gln Thr Gln Pro Arg 290
295 300Ile Pro Ser Thr Asp Thr Gln Val Gln Pro Lys
Leu Gln Lys Gln Ala305 310 315
320Gln Thr Gln Thr Ser Pro Glu His Leu Val Leu Gln Gln Lys Gln Val
325 330 335Gln Pro Gln Leu
Gln Gln Glu Ala Glu Pro Gln Lys Gln Val Gln Pro 340
345 350Gln Val Gln Pro Gln Ala His Ser Gln Gly Pro
Arg Gln Val Gln Leu 355 360 365Gln
Gln Glu Ala Glu Pro Leu Lys Gln Val Gln Pro Gln Val Gln Pro 370
375 380Gln Ala His Ser Gln Pro Pro Arg Gln Val
Gln Leu Gln Leu Gln Lys385 390 395
400Gln Val Gln Thr Gln Thr Tyr Pro Gln Val His Thr Gln Ala Gln
Pro 405 410 415Ser Val Gln
Pro Gln Glu His Pro Pro Ala Gln Val Ser Val Gln Pro 420
425 430Pro Glu Gln Thr His Glu Gln Pro His Thr
Gln Pro Gln Val Ser Leu 435 440
445Leu Ala Pro Glu Gln Thr Pro Val Val Val His Val Cys Gly Leu Glu 450
455 460Met Pro Pro Asp Ala Val Glu Ala
Gly Gly Gly Met Glu Lys Thr Leu465 470
475 480Pro Glu Pro Val Gly Thr Gln Val Ser Met Glu Glu
Ile Gln Asn Glu 485 490
495Ser Ala Cys Gly Leu Asp Val Gly Glu Cys Glu Asn Arg Ala Arg Glu
500 505 510Met Pro Gly Val Trp Gly
Ala Gly Gly Ser Leu Lys Val Thr Ile Leu 515 520
525Gln Ser Ser Asp Ser Arg Ala Phe Ser Thr Val Pro Leu Thr
Pro Val 530 535 540Pro Arg Pro Ser Asp
Ser Val Ser Ser Thr Pro Ala Ala Thr Ser Thr545 550
555 560Pro Ser Lys Gln Ala Leu Gln Phe Phe Cys
Tyr Ile Cys Lys Ala Ser 565 570
575Cys Ser Ser Gln Gln Glu Phe Gln Asp His Met Ser Glu Pro Gln His
580 585 590Gln Gln Arg Leu Gly
Glu Ile Gln His Met Ser Gln Ala Cys Leu Leu 595
600 605Ser Leu Leu Pro Val Pro Arg Asp Val Leu Glu Thr
Glu Asp Glu Glu 610 615 620Pro Pro Pro
Arg Arg Trp Cys Asn Thr Cys Gln Leu Tyr Tyr Met Gly625
630 635 640Asp Leu Ile Gln His Arg Arg
Thr Gln Asp His Lys Ile Ala Lys Gln 645
650 655Ser Leu Arg Pro Phe Cys Thr Val Cys Asn Arg Tyr
Phe Lys Thr Pro 660 665 670Arg
Lys Phe Val Glu His Val Lys Ser Gln Gly His Lys Asp Lys Ala 675
680 685Lys Glu Leu Lys Ser Leu Glu Lys Glu
Ile Ala Gly Gln Asp Glu Asp 690 695
700His Phe Ile Thr Val Asp Ala Val Gly Cys Phe Glu Gly Asp Glu Glu705
710 715 720Glu Glu Glu Asp
Asp Glu Asp Glu Glu Glu Ile Glu Val Glu Glu Glu 725
730 735Leu Cys Lys Gln Val Arg Ser Arg Asp Ile
Ser Arg Glu Glu Trp Lys 740 745
750Gly Ser Glu Thr Tyr Ser Pro Asn Thr Ala Tyr Gly Val Asp Phe Leu
755 760 765Val Pro Val Met Gly Tyr Ile
Cys Arg Ile Cys His Lys Phe Tyr His 770 775
780Ser Asn Ser Gly Ala Gln Leu Ser His Cys Lys Ser Leu Gly His
Phe785 790 795 800Glu Asn
Leu Gln Lys Tyr Lys Ala Ala Lys Asn Pro Ser Pro Thr Thr
805 810 815Arg Pro Val Ser Arg Arg Cys
Ala Ile Asn Ala Arg Asn Ala Leu Thr 820 825
830Ala Leu Phe Thr Ser Ser Gly Arg Pro Pro Ser Gln Pro Asn
Thr Gln 835 840 845Asp Lys Thr Pro
Ser Lys Val Thr Ala Arg Pro Ser Gln Pro Pro Leu 850
855 860Pro Arg Arg Ser Thr Arg Leu Lys Thr865
870
66 2821 DNA Homo sapiens
66 tgggggctgc ggggccggcc catccgtggg
ggcgacttga gcgttgaggg cgcgcgggga 60ggcgagccac catgttcagc cagcagcagc
agcagctcca gcaacagcag ggccccgttg 120cccatggctg tcagccgggg gctccccccg
cagcagccac agcagccgct tctgaatctc 180cagggcacca actcagcctc cctcctcaac
ggctccatgc tgcagagagc tttgctttta 240cagcagttgc aaggactgga ccagtttgca
atgccaccag ccacgtatga cactgccggt 300ctcaccatgc ccacagcaac actgggtaac
ctccgaggct atggcatggc atccccaggc 360ctcgcagccc ccagcctcac acccccacaa
ctggccactc caaatttgca acagttcttt 420ccccaggcca ctcgccagtc cttgctggga
cctcctcctg ttggggtccc catgaaccct 480tcccagttca acctttcagg acggaacccc
cagaaacagg cccggacctc ctcctctacc 540acccccaatc gaaaggattc ttcttctcag
acaatgcctg tggaagacaa gtcagacccc 600ccagaggggt ctgaggaagc cgcagagccc
cggatggaca caccagaaga ccaagattta 660ccgccctgcc cagaggacat cgccaaggaa
aaacgcactc cagcacctga gcctgagcct 720tgtgaggcgt ccgagctgcc agcaaagaga
ttgaggagct cagaagagcc cacagagaag 780gaacctccag ggcagttaca ggtgaaggcc
cagccgcagg cccggatgac agtaccgaaa 840cagacacaga caccagacct gctgcctgag
gccctggaag cccaagtgct gccacgattc 900cagccacggg tcctgcaggt ccaggcccag
gtgcagtcac agactcagcc gcggatacca 960tccacagaca cccaggtgca gccaaagctg
cagaagcagg cgcaaacaca gacctctcca 1020gagcacttag tgctgcaaca gaagcaggtg
cagccacagc tgcagcagga ggcagagcca 1080cagaagcagg tgcagccaca ggtacagcca
caggcacatt cacagggccc aaggcaggtg 1140cagctgcagc aggaggcaga gccgctgaag
caggtgcagc cacaggtgca gccccaggca 1200cattcacagc ccccaaggca ggtgcagctg
cagctgcaga agcaggtcca gacacagaca 1260tatccacagg tccacacaca ggcacagcca
agcgtccagc cacaggagca tcctccagcg 1320caggtgtcag tacagccacc agagcagacc
catgagcagc ctcacaccca gccgcaggtg 1380tcgttgctgg ctccagagca aacaccagtt
gtggttcatg tctgcgggct ggagatgcca 1440cctgatgcag tagaagctgg tggaggcatg
gaaaagacct tgccagagcc tgtgggcacc 1500caagtcagca tggaagagat tcagaatgag
tcggcctgtg gcctagatgt gggagaatgt 1560gaaaacagag cgagagagat gccaggggta
tggggcgccg ggggctccct gaaggtcacc 1620attctgcaga gcagtgacag ccgggccttt
agcactgtac ccctgacacc tgtcccccgc 1680cccagtgact ccgtctcctc cacccctgcg
gctaccagca ctccctctaa gcaggccctc 1740cagttcttct gctacatctg caaggccagc
tgctccagcc agcaggagtt ccaggaccac 1800atgtcggagc ctcagcacca gcagcggcta
ggggagatcc agcacatgag ccaagcctgc 1860ctcctgtccc tgctgcccgt gccccgggac
gtcctggaga cagaggatga ggagcctcca 1920ccaaggcgct ggtgcaacac ctgccagctc
tactacatgg gggacctgat ccaacaccgc 1980aggacacagg accacaagat tgccaaacaa
tccttgcgac ccttctgcac cgtttgcaac 2040cgctacttca aaacccctcg caagtttgtg
gagcacgtga agtcccaggg gcataaggac 2100aaagccaagg agctgaagtc gcttgagaaa
gaaattgctg gccaagatga ggaccacttc 2160attacagtgg acgctgtggg ttgcttcgag
ggtgatgaag aagaggaaga ggatgatgag 2220gatgaagaag agatcgaggt tgaggaggaa
ctctgcaagc aggtgaggtc cagagatata 2280tccagagagg agtggaaggg ctcggagacc
tacagcccca atactgcata tggtgtggac 2340ttcctggtgc ccgtgatggg ctatatctgc
cgcatctgcc acaagttcta tcacagcaac 2400tcaggggcac agctctccca ctgcaagtcc
ctgggccact ttgagaacct gcagaaatac 2460aaggcggcca agaaccccag ccccaccacc
cgacctgtga gccgccggtg cgcaatcaac 2520gcccggaacg ctttgacagc cctgttcacc
tccagcggcc gcccaccctc ccagcccaac 2580acccaggaca aaacacccag caaggtgacg
gctcgaccct cccagccccc actacctcgg 2640cgctcaaccc gcctcaaaac ctgatagagg
gacctccctg tccctggcct gcctgggtcc 2700agatctgcta atgcttttta ggagtctgcc
tggaaacttt gacatggttc atgtttttac 2760tcaaaatcca ataaaacaag gtagtttggc
tgtgcaaaaa aaaaaaaaaa aaaaaaaaaa 2820a
2821
67 2850 DNA Homo sapiens
67 tgggggctgc
ggggccggcc catccgtggg ggcgacttga gcgttgaggg cgcgcgggga 60ggcgagccac
catgttcagc cagcagcagc agcagctcca gcaacagcag cagcagctcc 120agcagttaca
gcagcagcag ctccagcagc agcaattgca gcagcagcag ttactgcagc 180tccagcagct
gctccagcag tccccaccac aggccccgtt gcccatggct gtcagccggg 240ggctcccccc
gcagcagcca cagcagccgc ttctgaatct ccagggcacc aactcagcct 300ccctcctcaa
cggctccatg ctgcagagag ctttgctttt acagcagttg caaggtaacc 360tccgaggcta
tggcatggca tccccaggcc tcgcagcccc cagcctcaca cccccacaac 420tggccactcc
aaatttgcaa cagttctttc cccaggccac tcgccagtcc ttgctgggac 480ctcctcctgt
tggggtcccc atgaaccctt cccagttcaa cctttcagga cggaaccccc 540agaaacaggc
ccggacctcc tcctctacca cccccaatcg aaaggattct tcttctcaga 600caatgcctgt
ggaagacaag tcagaccccc cagaggggtc tgaggaagcc gcagagcccc 660ggatggacac
accagaagac caagatttac cgccctgccc agaggacatc gccaaggaaa 720aacgcactcc
agcacctgag cctgagcctt gtgaggcgtc cgagctgcca gcaaagagat 780tgaggagctc
agaagagccc acagagaagg aacctccagg gcagttacag gtgaaggccc 840agccgcaggc
ccggatgaca gtaccgaaac agacacagac accagacctg ctgcctgagg 900ccctggaagc
ccaagtgctg ccacgattcc agccacgggt cctgcaggtc caggcccagg 960tgcagtcaca
gactcagccg cggataccat ccacagacac ccaggtgcag ccaaagctgc 1020agaagcaggc
gcaaacacag acctctccag agcacttagt gctgcaacag aagcaggtgc 1080agccacagct
gcagcaggag gcagagccac agaagcaggt gcagccacag gtacagccac 1140aggcacattc
acagggccca aggcaggtgc agctgcagca ggaggcagag ccgctgaagc 1200aggtgcagcc
acaggtgcag ccccaggcac attcacagcc cccaaggcag gtgcagctgc 1260agctgcagaa
gcaggtccag acacagacat atccacaggt ccacacacag gcacagccaa 1320gcgtccagcc
acaggagcat cctccagcgc aggtgtcagt acagccacca gagcagaccc 1380atgagcagcc
tcacacccag ccgcaggtgt cgttgctggc tccagagcaa acaccagttg 1440tggttcatgt
ctgcgggctg gagatgccac ctgatgcagt agaagctggt ggaggcatgg 1500aaaagacctt
gccagagcct gtgggcaccc aagtcagcat ggaagagatt cagaatgagt 1560cggcctgtgg
cctagatgtg ggagaatgtg aaaacagagc gagagagatg ccaggggtat 1620ggggcgccgg
gggctccctg aaggtcacca ttctgcagag cagtgacagc cgggccttta 1680gcactgtacc
cctgacacct gtcccccgcc ccagtgactc cgtctcctcc acccctgcgg 1740ctaccagcac
tccctctaag caggccctcc agttcttctg ctacatctgc aaggccagct 1800gctccagcca
gcaggagttc caggaccaca tgtcggagcc tcagcaccag cagcggctag 1860gggagatcca
gcacatgagc caagcctgcc tcctgtccct gctgcccgtg ccccgggacg 1920tcctggagac
agaggatgag gagcctccac caaggcgctg gtgcaacacc tgccagctct 1980actacatggg
ggacctgatc caacaccgca ggacacagga ccacaagatt gccaaacaat 2040ccttgcgacc
cttctgcacc gtttgcaacc gctacttcaa aacccctcgc aagtttgtgg 2100agcacgtgaa
gtcccagggg cataaggaca aagccaagga gctgaagtcg cttgagaaag 2160aaattgctgg
ccaagatgag gaccacttca ttacagtgga cgctgtgggt tgcttcgagg 2220gtgatgaaga
agaggaagag gatgatgagg atgaagaaga gatcgaggtt gaggaggaac 2280tctgcaagca
ggtgaggtcc agagatatat ccagagagga gtggaagggc tcggagacct 2340acagccccaa
tactgcatat ggtgtggact tcctggtgcc cgtgatgggc tatatctgcc 2400gcatctgcca
caagttctat cacagcaact caggggcaca gctctcccac tgcaagtccc 2460tgggccactt
tgagaacctg cagaaataca aggcggccaa gaaccccagc cccaccaccc 2520gacctgtgag
ccgccggtgc gcaatcaacg cccggaacgc tttgacagcc ctgttcacct 2580ccagcggccg
cccaccctcc cagcccaaca cccaggacaa aacacccagc aaggtgacgg 2640ctcgaccctc
ccagccccca ctacctcggc gctcaacccg cctcaaaacc tgatagaggg 2700acctccctgt
ccctggcctg cctgggtcca gatctgctaa tgctttttag gagtctgcct 2760ggaaactttg
acatggttca tgtttttact caaaatccaa taaaacaagg tagtttggct 2820gtgcaaaaaa
aaaaaaaaaa aaaaaaaaaa
2850
68 2907 DNA Homo sapiens
68 tgggggctgc ggggccggcc catccgtggg ggcgacttga
gcgttgaggg cgcgcgggga 60ggcgagccac catgttcagc cagcagcagc agcagctcca
gcaacagcag cagcagctcc 120agcagttaca gcagcagcag ctccagcagc agcaattgca
gcagcagcag ttactgcagc 180tccagcagct gctccagcag tccccaccac aggccccgtt
gcccatggct gtcagccggg 240ggctcccccc gcagcagcca cagcagccgc ttctgaatct
ccagggcacc aactcagcct 300ccctcctcaa cggctccatg ctgcagagag ctttgctttt
acagcagttg caaggactgg 360accagtttgc aatgccacca gccacgtatg acactgccgg
tctcaccatg cccacagcaa 420cactgggtaa cctccgaggc tatggcatgg catccccagg
cctcgcagcc cccagcctca 480cacccccaca actggccact ccaaatttgc aacagttctt
tccccaggcc actcgccagt 540ccttgctggg acctcctcct gttggggtcc ccatgaaccc
ttcccagttc aacctttcag 600gacggaaccc ccagaaacag gcccggacct cctcctctac
cacccccaat cgaaagacaa 660tgcctgtgga agacaagtca gaccccccag aggggtctga
ggaagccgca gagccccgga 720tggacacacc agaagaccaa gatttaccgc cctgcccaga
ggacatcgcc aaggaaaaac 780gcactccagc acctgagcct gagccttgtg aggcgtccga
gctgccagca aagagattga 840ggagctcaga agagcccaca gagaaggaac ctccagggca
gttacaggtg aaggcccagc 900cgcaggcccg gatgacagta ccgaaacaga cacagacacc
agacctgctg cctgaggccc 960tggaagccca agtgctgcca cgattccagc cacgggtcct
gcaggtccag gcccaggtgc 1020agtcacagac tcagccgcgg ataccatcca cagacaccca
ggtgcagcca aagctgcaga 1080agcaggcgca aacacagacc tctccagagc acttagtgct
gcaacagaag caggtgcagc 1140cacagctgca gcaggaggca gagccacaga agcaggtgca
gccacaggta cagccacagg 1200cacattcaca gggcccaagg caggtgcagc tgcagcagga
ggcagagccg ctgaagcagg 1260tgcagccaca ggtgcagccc caggcacatt cacagccccc
aaggcaggtg cagctgcagc 1320tgcagaagca ggtccagaca cagacatatc cacaggtcca
cacacaggca cagccaagcg 1380tccagccaca ggagcatcct ccagcgcagg tgtcagtaca
gccaccagag cagacccatg 1440agcagcctca cacccagccg caggtgtcgt tgctggctcc
agagcaaaca ccagttgtgg 1500ttcatgtctg cgggctggag atgccacctg atgcagtaga
agctggtgga ggcatggaaa 1560agaccttgcc agagcctgtg ggcacccaag tcagcatgga
agagattcag aatgagtcgg 1620cctgtggcct agatgtggga gaatgtgaaa acagagcgag
agagatgcca ggggtatggg 1680gcgccggggg ctccctgaag gtcaccattc tgcagagcag
tgacagccgg gcctttagca 1740ctgtacccct gacacctgtc ccccgcccca gtgactccgt
ctcctccacc cctgcggcta 1800ccagcactcc ctctaagcag gccctccagt tcttctgcta
catctgcaag gccagctgct 1860ccagccagca ggagttccag gaccacatgt cggagcctca
gcaccagcag cggctagggg 1920agatccagca catgagccaa gcctgcctcc tgtccctgct
gcccgtgccc cgggacgtcc 1980tggagacaga ggatgaggag cctccaccaa ggcgctggtg
caacacctgc cagctctact 2040acatggggga cctgatccaa caccgcagga cacaggacca
caagattgcc aaacaatcct 2100tgcgaccctt ctgcaccgtt tgcaaccgct acttcaaaac
ccctcgcaag tttgtggagc 2160acgtgaagtc ccaggggcat aaggacaaag ccaaggagct
gaagtcgctt gagaaagaaa 2220ttgctggcca agatgaggac cacttcatta cagtggacgc
tgtgggttgc ttcgagggtg 2280atgaagaaga ggaagaggat gatgaggatg aagaagagat
cgaggttgag gaggaactct 2340gcaagcaggt gaggtccaga gatatatcca gagaggagtg
gaagggctcg gagacctaca 2400gccccaatac tgcatatggt gtggacttcc tggtgcccgt
gatgggctat atctgccgca 2460tctgccacaa gttctatcac agcaactcag gggcacagct
ctcccactgc aagtccctgg 2520gccactttga gaacctgcag aaatacaagg cggccaagaa
ccccagcccc accacccgac 2580ctgtgagccg ccggtgcgca atcaacgccc ggaacgcttt
gacagccctg ttcacctcca 2640gcggccgccc accctcccag cccaacaccc aggacaaaac
acccagcaag gtgacggctc 2700gaccctccca gcccccacta cctcggcgct caacccgcct
caaaacctga tagagggacc 2760tccctgtccc tggcctgcct gggtccagat ctgctaatgc
tttttaggag tctgcctgga 2820aactttgaca tggttcatgt ttttactcaa aatccaataa
aacaaggtag tttggctgtg 2880caaaaaaaaa aaaaaaaaaa aaaaaaa
2907
69 2836 DNA Homo sapiens
69 tgggggctgc ggggccggcc
catccgtggg ggcgacttga gcgttgaggg cgcgcgggga 60ggcgagccac catgttcagc
cagcagcagc agcagctcca gcaacagcag cagcagctcc 120agcagttaca gcagcagcag
ctccagcagc agcaattgca gcagcagcag ttactgcagc 180tccagcagct gctccagcag
tccccaccac aggccccgtt gcccatggct gtcagccggg 240ggctcccccc gcagcagcca
cagcagccgc ttctgaatct ccagggcacc aactcagcct 300ccctcctcaa cggctccatg
ctgcagagag ctttgctttt acagcagttg caaggactgg 360accagtttgc aatgccacca
gccacgtatg acactgccgg tctcaccatg cccacagcaa 420cactgggtaa cctccgaggc
tatggcatgg catccccagg cctcgcagcc cccagcctca 480cacccccaca actggccact
ccaaatttgc aacagttctt tccccaggcc actcgccagt 540ccttgctggg acctcctcct
gttggggtcc ccatgaaccc ttcccagttc aacctttcag 600gacggaaccc ccagaaacag
gcccggacct cctcctctac cacccccaat cgaaaggatt 660cttcttctca gacaatgcct
gtggaagaca agtcagaccc cccagagggg tctgaggaag 720ccgcagagcc ccggatggac
acaccagaag accaagattt accgccctgc ccagaggaca 780tcgccaagga aaaacgcact
ccagcacctg agcctgagcc ttgtgaggcg tccgagctgc 840cagcaaagag attgaggagc
tcagaagagc ccacagagaa ggaacctcca gggcagttac 900aggtgaaggc ccagccgcag
gcccggatga cagtaccgaa acagacacag acaccagacc 960tgctgcctga ggccctggaa
gcccaagtgc tgccacgatt ccagccacgg gtcctgcagg 1020tccaggccca ggtgcagtca
cagactcagc cgcggatacc atccacagac acccaggtgc 1080agccaaagct gcagaagcag
gcgcaaacac agacctctcc agagcactta gtgctgcaac 1140agaagcaggt gcagccacag
ctgcagcagg aggcagagcc acagaagcag gtgcagccac 1200aggtacagcc acaggcacat
tcacagggcc caaggcaggt gcagctgcag caggaggcag 1260agccgctgaa gcaggtgcag
acaggtccac acacaggcac agccaagcgt ccagccacag 1320gagcatcctc cagcgcaggt
gtcagtacag ccaccagagc agacccatga gcagcctcac 1380acccagccgc aggtgtcgtt
gctggctcca gagcaaacac cagttgtggt tcatgtctgc 1440gggctggaga tgccacctga
tgcagtagaa gctggtggag gcatggaaaa gaccttgcca 1500gagcctgtgg gcacccaagt
cagcatggaa gagattcaga atgagtcggc ctgtggccta 1560gatgtgggag aatgtgaaaa
cagagcgaga gagatgccag gggtatgggg cgccgggggc 1620tccctgaagg tcaccattct
gcagagcagt gacagccggg cctttagcac tgtacccctg 1680acacctgtcc cccgccccag
tgactccgtc tcctccaccc ctgcggctac cagcactccc 1740tctaagcagg ccctccagtt
cttctgctac atctgcaagg ccagctgctc cagccagcag 1800gagttccagg accacatgtc
ggagcctcag caccagcagc ggctagggga gatccagcac 1860atgagccaag cctgcctcct
gtccctgctg cccgtgcccc gggacgtcct ggagacagag 1920gatgaggagc ctccaccaag
gcgctggtgc aacacctgcc agctctacta catgggggac 1980ctgatccaac accgcaggac
acaggaccac aagattgcca aacaatcctt gcgacccttc 2040tgcaccgttt gcaaccgcta
cttcaaaacc cctcgcaagt ttgtggagca cgtgaagtcc 2100caggggcata aggacaaagc
caaggagctg aagtcgcttg agaaagaaat tgctggccaa 2160gatgaggacc acttcattac
agtggacgct gtgggttgct tcgagggtga tgaagaagag 2220gaagaggatg atgaggatga
agaagagatc gaggttgagg aggaactctg caagcaggtg 2280aggtccagag atatatccag
agaggagtgg aagggctcgg agacctacag ccccaatact 2340gcatatggtg tggacttcct
ggtgcccgtg atgggctata tctgccgcat ctgccacaag 2400ttctatcaca gcaactcagg
ggcacagctc tcccactgca agtccctggg ccactttgag 2460aacctgcaga aatacaaggc
ggccaagaac cccagcccca ccacccgacc tgtgagccgc 2520cggtgcgcaa tcaacgcccg
gaacgctttg acagccctgt tcacctccag cggccgccca 2580ccctcccagc ccaacaccca
ggacaaaaca cccagcaagg tgacggctcg accctcccag 2640cccccactac ctcggcgctc
aacccgcctc aaaacctgat agagggacct ccctgtccct 2700ggcctgcctg ggtccagatc
tgctaatgct ttttaggagt ctgcctggaa actttgacat 2760ggttcatgtt tttactcaaa
atccaataaa acaaggtagt ttggctgtgc aaaaaaaaaa 2820aaaaaaaaaa aaaaaa
2836
70 2754 DNA Homo sapiens
70 tgggggctgc ggggccggcc catccgtggg ggcgacttga gcgttgaggg cgcgcgggga
60ggcgagccac catgttcagc cagcagcagc agcagctcca gcaacagcag cagcagctcc
120agcagttaca gcagcagcag ctccagcagc agcaattgca gcagcagcag ttactgcagc
180tccagcagct gctccagcag tccccaccac aggccccgtt gcccatggct gtcagccggg
240ggctcccccc gcagcagcca cagcagccgc ttctgaatct ccagggcacc aactcagcct
300ccctcctcaa cggctccatg ctgcagagag ctttgctttt acagcagttg caaggactgg
360accagtttgc aatgccacca gccacgtatg acactgccgg tctcaccatg cccacagcaa
420cactgggtaa cctccgaggc tatggcatgg catccccagg cctcgcagcc cccagcctca
480cacccccaca actggccact ccaaatttgc aacagttctt tccccaggcc actcgccagt
540ccttgctggg acctcctcct gttggggtcc ccatgaaccc ttcccagttc aacctttcag
600gacggaaccc ccagaaacag gcccggacct cctcctctac cacccccaat cgaaaggatt
660cttcttctca gacaatgcct gtggaagaca agtcagaccc cccagagggg tctgaggaag
720ccgcagagcc ccggatggac acaccagaag accaagattt accgccctgc ccagaggaca
780tcgccaagga aaaacgcact ccagcacctg agcctgagcc ttgtgaggcg tccgagctgc
840cagcaaagag attgaggagc tcagaagagc ccacagagaa ggaacctcca gggcagttac
900aggtgaaggc ccagccgcag gcccggatga cagtaccgaa acagacacag acaccagacc
960tgctgcctga ggccctggaa gcccaagtgc tgccacgatt ccagccacgg gtcctgcagg
1020tccaggccca ggtgcagtca cagactcagc cgcggatacc atccacagac acccaggtgc
1080agccaaagct gcagaagcag gcgcaaacac agacctctcc agagcactta gtgctgcaac
1140agaagcaggt gcagccacag ctgcagcagg aggcagagcc acagaagcag gtgcagccac
1200aggtccacac acaggcacag ccaagcgtcc agccacagga gcatcctcca gcgcaggtgt
1260cagtacagcc accagagcag acccatgagc agcctcacac ccagccgcag gtgtcgttgc
1320tggctccaga gcaaacacca gttgtggttc atgtctgcgg gctggagatg ccacctgatg
1380cagtagaagc tggtggaggc atggaaaaga ccttgccaga gcctgtgggc acccaagtca
1440gcatggaaga gattcagaat gagtcggcct gtggcctaga tgtgggagaa tgtgaaaaca
1500gagcgagaga gatgccaggg gtatggggcg ccgggggctc cctgaaggtc accattctgc
1560agagcagtga cagccgggcc tttagcactg tacccctgac acctgtcccc cgccccagtg
1620actccgtctc ctccacccct gcggctacca gcactccctc taagcaggcc ctccagttct
1680tctgctacat ctgcaaggcc agctgctcca gccagcagga gttccaggac cacatgtcgg
1740agcctcagca ccagcagcgg ctaggggaga tccagcacat gagccaagcc tgcctcctgt
1800ccctgctgcc cgtgccccgg gacgtcctgg agacagagga tgaggagcct ccaccaaggc
1860gctggtgcaa cacctgccag ctctactaca tgggggacct gatccaacac cgcaggacac
1920aggaccacaa gattgccaaa caatccttgc gacccttctg caccgtttgc aaccgctact
1980tcaaaacccc tcgcaagttt gtggagcacg tgaagtccca ggggcataag gacaaagcca
2040aggagctgaa gtcgcttgag aaagaaattg ctggccaaga tgaggaccac ttcattacag
2100tggacgctgt gggttgcttc gagggtgatg aagaagagga agaggatgat gaggatgaag
2160aagagatcga ggttgaggag gaactctgca agcaggtgag gtccagagat atatccagag
2220aggagtggaa gggctcggag acctacagcc ccaatactgc atatggtgtg gacttcctgg
2280tgcccgtgat gggctatatc tgccgcatct gccacaagtt ctatcacagc aactcagggg
2340cacagctctc ccactgcaag tccctgggcc actttgagaa cctgcagaaa tacaaggcgg
2400ccaagaaccc cagccccacc acccgacctg tgagccgccg gtgcgcaatc aacgcccgga
2460acgctttgac agccctgttc acctccagcg gccgcccacc ctcccagccc aacacccagg
2520acaaaacacc cagcaaggtg acggctcgac cctcccagcc cccactacct cggcgctcaa
2580cccgcctcaa aacctgatag agggacctcc ctgtccctgg cctgcctggg tccagatctg
2640ctaatgcttt ttaggagtct gcctggaaac tttgacatgg ttcatgtttt tactcaaaat
2700ccaataaaac aaggtagttt ggctgtgcaa aaaaaaaaaa aaaaaaaaaa aaaa
2754
71 2587 DNA Homo sapiens
71 tgggggctgc ggggccggcc catccgtggg ggcgacttga
gcgttgaggg cgcgcgggga 60ggcgagccac catgttcagc cagcagcagc agcagctcca
gcaacagcag cagcagctcc 120agcagttaca gcagcagcag ctccagcagc agcaattgca
gcagcagcag ttactgcagc 180tccagcagct gctccagcag tccccaccac aggccccgtt
gcccatggct gtcagccggg 240ggctcccccc gcagcagcca cagcagccgc ttctgaatct
ccagggcacc aactcagcct 300ccctcctcaa cggctccatg ctgcagagag ctttgctttt
acagcagttg caaggactgg 360accagtttgc aatgccacca gccacgtatg acactgccgg
tctcaccatg cccacagcaa 420cactgggtaa cctccgaggc tatggcatgg catccccagg
cctcgcagcc cccagcctca 480cacccccaca actggccact ccaaatttgc aacagttctt
tccccaggcc actcgccagt 540ccttgctggg acctcctcct gttggggtcc ccatgaaccc
ttcccagttc aacctttcag 600gacggaaccc ccagaaacag gcccggacct cctcctctac
cacccccaat cgaaaggatt 660cttcttctca gacaatgcct gtggaagaca agtcagaccc
cccagagggg tctgaggaag 720ccgcagagcc ccggatggac acaccagaag accaagattt
accgccctgc ccagaggaca 780tcgccaagga aaaacgcact ccagcacctg agcctgagcc
ttgtgaggcg tccgagctgc 840cagcaaagag attgaggagc tcagaagagc ccacagagaa
ggaacctcca gggcagttac 900aggtgaaggc ccagccgcag gcccggatga cagtaccgaa
acagacacag acaccagacc 960tgctgcctga ggccctggaa gcccaagtgc tgccacgatt
ccagccacgg gtcctgcagg 1020tccaggcctc cacaggtcca cacacaggca cagccaagcg
tccagccaca ggagcatcct 1080ccagcgcagg tgtcagtaca gccaccagag cagacccatg
agcagcctca cacccagccg 1140caggtgtcgt tgctggctcc agagcaaaca ccagttgtgg
ttcatgtctg cgggctggag 1200atgccacctg atgcagtaga agctggtgga ggcatggaaa
agaccttgcc agagcctgtg 1260ggcacccaag tcagcatgga agagattcag aatgagtcgg
cctgtggcct agatgtggga 1320gaatgtgaaa acagagcgag agagatgcca ggggtatggg
gcgccggggg ctccctgaag 1380gtcaccattc tgcagagcag tgacagccgg gcctttagca
ctgtacccct gacacctgtc 1440ccccgcccca gtgactccgt ctcctccacc cctgcggcta
ccagcactcc ctctaagcag 1500gccctccagt tcttctgcta catctgcaag gccagctgct
ccagccagca ggagttccag 1560gaccacatgt cggagcctca gcaccagcag cggctagggg
agatccagca catgagccaa 1620gcctgcctcc tgtccctgct gcccgtgccc cgggacgtcc
tggagacaga ggatgaggag 1680cctccaccaa ggcgctggtg caacacctgc cagctctact
acatggggga cctgatccaa 1740caccgcagga cacaggacca caagattgcc aaacaatcct
tgcgaccctt ctgcaccgtt 1800tgcaaccgct acttcaaaac ccctcgcaag tttgtggagc
acgtgaagtc ccaggggcat 1860aaggacaaag ccaaggagct gaagtcgctt gagaaagaaa
ttgctggcca agatgaggac 1920cacttcatta cagtggacgc tgtgggttgc ttcgagggtg
atgaagaaga ggaagaggat 1980gatgaggatg aagaagagat cgaggttgag gaggaactct
gcaagcaggt gaggtccaga 2040gatatatcca gagaggagtg gaagggctcg gagacctaca
gccccaatac tgcatatggt 2100gtggacttcc tggtgcccgt gatgggctat atctgccgca
tctgccacaa gttctatcac 2160agcaactcag gggcacagct ctcccactgc aagtccctgg
gccactttga gaacctgcag 2220aaatacaagg cggccaagaa ccccagcccc accacccgac
ctgtgagccg ccggtgcgca 2280atcaacgccc ggaacgcttt gacagccctg ttcacctcca
gcggccgccc accctcccag 2340cccaacaccc aggacaaaac acccagcaag gtgacggctc
gaccctccca gcccccacta 2400cctcggcgct caacccgcct caaaacctga tagagggacc
tccctgtccc tggcctgcct 2460gggtccagat ctgctaatgc tttttaggag tctgcctgga
aactttgaca tggttcatgt 2520ttttactcaa aatccaataa aacaaggtag tttggctgtg
caaaaaaaaa aaaaaaaaaa 2580aaaaaaa
2587
72 2898 DNA Homo sapiens
72 tgggggctgc ggggccggcc
catccgtggg ggcgacttga gcgttgaggg cgcgcgggga 60ggcgagccac catgttcagc
cagcagcagc agcagctcca gcaacagcag cagcagctcc 120agcagttaca gcagcagcag
ctccagcagc agcaattgca gcagcagcag ttactgcagc 180tccagcagct gctccagcag
tccccaccac aggccccgtt gcccatggct gtcagccggg 240ggctcccccc gcagcagcca
cagcagccgc ttctgaatct ccagggcacc aactcagcct 300ccctcctcaa cggctccatg
ctgcagagag ctttgctttt acagcagttg caaggactgg 360accagtttgc aatgccacca
gccacgtatg acactgccgg tctcaccatg cccacagcaa 420cactgggtaa cctccgaggc
tatggcatgg catccccagg cctcgcagcc cccagcctca 480cacccccaca actggccact
ccaaatttgc aacagttctt tccccaggcc actcgccagt 540ccttgctggg acctcctcct
gttggggtcc ccatgaaccc ttcccagttc aacctttcag 600gacggaaccc ccagaaacag
gcccggacct cctcctctac cacccccaat cgaaaggatt 660cttcttctca gacaatgcct
gtggaagaca agtcagaccc cccagagggg tctgaggaag 720ccgcagagcc ccggatggac
acaccagaag accaagattt accgccctgc ccagaggaca 780tcgccaagga aaaacgcact
ccagcacctg agcctgagcc ttgtgaggcg tccgagctgc 840cagcaaagag attgaggagc
tcagaagagc ccacagagaa ggaacctcca gggcagttac 900aggtgaaggc ccagccgcag
gcccggatga cagtaccgaa acagacacag acaccagacc 960tgctgcctga ggccctggaa
gcccaagtgc tgccacgatt ccagccacgg gtcctgcagg 1020tccaggccca ggtgcagtca
cagactcagc cgcggatacc atccacagac acccaggtgc 1080agccaaagct gcagaagcag
gcgcaaacac agacctctcc agagcactta gtgctgcaac 1140agaagcaggt gcagccacag
ctgcagcagg aggcagagcc acagaagcag gtgcagccac 1200aggtacagcc acaggcacat
tcacagggcc caaggcaggt gcagctgcag caggaggcag 1260agccgctgaa gcaggtgcag
ccacaggtgc agccccaggc acattcacag cccccaaggc 1320aggtgcagct gcagctgcag
aagcaggtcc agacacagac atatccacag gtccacacac 1380aggcacagcc aagcgtccag
ccacaggagc atcctccagc gcaggtgtca gtacagccac 1440cagagcagac ccatgagcag
cctcacaccc agccgcaggt gtcgttgctg gctccagagc 1500aaacaccagt tgtggttcat
gtctgcgggc tggagatgcc acctgatgca gtagaagctg 1560gtggaggcat ggaaaagacc
ttgccagagc ctgtgggcac ccaagtcagc atggaagaga 1620ttcagaatga gtcggcctgt
ggcctagatg tgggagaatg tgaaaacaga gcgagagaga 1680tgccaggggt atggggcgcc
gggggctccc tgaaggtcac cattctgcag agcagtgaca 1740gccgggcctt tagcactgta
cccctgacac ctgtcccccg ccccagtgac tccgtctcct 1800ccacccctgc ggctaccagc
actccctcta agcaggccct ccagttcttc tgctacatct 1860gcaaggccag ctgctccagc
cagcaggagt tccaggacca catgtcggag cctcagcacc 1920agcagcggct aggggagatc
cagcacatga gccaagcctg cctcctgtcc ctgctgcccg 1980tgccccggga cgtcctggag
acagaggatg aggagcctcc accaaggcgc tggtgcaaca 2040cctgccagct ctactacatg
ggggacctga tccaacaccg caggacacag gaccacaaga 2100ttgccaaaca atccttgcga
cccttctgca ccgtttgcaa ccgctacttc aaaacccctc 2160gcaagtttgt ggagcacgtg
aagtcccagg ggcataagga caaagccaag gagctgaagt 2220cgcttgagaa agaaattgct
ggccaagatg aggaccactt cattacagtg gacgctgtgg 2280gttgcttcga gggtgatgaa
gaagaggaag aggatgatga ggatgaagaa gagatcgagg 2340tgaggtccag agatatatcc
agagaggagt ggaagggctc ggagacctac agccccaata 2400ctgcatatgg tgtggacttc
ctggtgcccg tgatgggcta tatctgccgc atctgccaca 2460agttctatca cagcaactca
ggggcacagc tctcccactg caagtccctg ggccactttg 2520agaacctgca gaaatacaag
gcggccaaga accccagccc caccacccga cctgtgagcc 2580gccggtgcgc aatcaacgcc
cggaacgctt tgacagccct gttcacctcc agcggccgcc 2640caccctccca gcccaacacc
caggacaaaa cacccagcaa ggtgacggct cgaccctccc 2700agcccccact acctcggcgc
tcaacccgcc tcaaaacctg atagagggac ctccctgtcc 2760ctggcctgcc tgggtccaga
tctgctaatg ctttttagga gtctgcctgg aaactttgac 2820atggttcatg tttttactca
aaatccaata aaacaaggta gtttggctgt gcaaaaaaaa 2880aaaaaaaaaa aaaaaaaa
2898
SEQ ID NO: 73, length 2883, DNA, Homo sapiens
tgggggctgc ggggccggcc catccgtggg ggcgacttga gcgttgaggg cgcgcgggga
60ggcgagccac catgttcagc cagcagcagc agcagctcca gcaacagcag cagcagctcc
120agcagttaca gcagcagcag ctccagcagc agcaattgca gcagcagcag ttactgcagc
180tccagcagct gctccagcag tccccaccac aggccccgtt gcccatggct gtcagccggg
240ggctcccccc gcagcagcca cagcagccgc ttctgaatct ccagggcacc aactcagcct
300ccctcctcaa cggctccatg ctgcagagag ctttgctttt acagcagttg caaggactgg
360accagtttgc aatgccacca gccacgtatg acactgccgg tctcaccatg cccacagcaa
420cactgggtaa cctccgaggc tatggcatgg catccccagg cctcgcagcc cccagcctca
480cacccccaca actggccact ccaaatttgc aacagttctt tccccaggcc actcgccagt
540ccttgctggg acctcctcct gttggggtcc ccatgaaccc ttcccagttc aacctttcag
600gacggaaccc ccagaaacag gcccggacct cctcctctac cacccccaat cgaaagacaa
660tgcctgtgga agacaagtca gaccccccag aggggtctga ggaagccgca gagccccgga
720tggacacacc agaagaccaa gatttaccgc cctgcccaga ggacatcgcc aaggaaaaac
780gcactccagc acctgagcct gagccttgtg aggcgtccga gctgccagca aagagattga
840ggagctcaga agagcccaca gagaaggaac ctccagggca gttacaggtg aaggcccagc
900cgcaggcccg gatgacagta ccgaaacaga cacagacacc agacctgctg cctgaggccc
960tggaagccca agtgctgcca cgattccagc cacgggtcct gcaggtccag gcccaggtgc
1020agtcacagac tcagccgcgg ataccatcca cagacaccca ggtgcagcca aagctgcaga
1080agcaggcgca aacacagacc tctccagagc acttagtgct gcaacagaag caggtgcagc
1140cacagctgca gcaggaggca gagccacaga agcaggtgca gccacaggta cagccacagg
1200cacattcaca gggcccaagg caggtgcagc tgcagcagga ggcagagccg ctgaagcagg
1260tgcagccaca ggtgcagccc caggcacatt cacagccccc aaggcaggtg cagctgcagc
1320tgcagaagca ggtccagaca cagacatatc cacaggtcca cacacaggca cagccaagcg
1380tccagccaca ggagcatcct ccagcgcagg tgtcagtaca gccaccagag cagacccatg
1440agcagcctca cacccagccg caggtgtcgt tgctggctcc agagcaaaca ccagttgtgg
1500ttcatgtctg cgggctggag atgccacctg atgcagtaga agctggtgga ggcatggaaa
1560agaccttgcc agagcctgtg ggcacccaag tcagcatgga agagattcag aatgagtcgg
1620cctgtggcct agatgtggga gaatgtgaaa acagagcgag agagatgcca ggggtatggg
1680gcgccggggg ctccctgaag gtcaccattc tgcagagcag tgacagccgg gcctttagca
1740ctgtacccct gacacctgtc ccccgcccca gtgactccgt ctcctccacc cctgcggcta
1800ccagcactcc ctctaagcag gccctccagt tcttctgcta catctgcaag gccagctgct
1860ccagccagca ggagttccag gaccacatgt cggagcctca gcaccagcag cggctagggg
1920agatccagca catgagccaa gcctgcctcc tgtccctgct gcccgtgccc cgggacgtcc
1980tggagacaga ggatgaggag cctccaccaa ggcgctggtg caacacctgc cagctctact
2040acatggggga cctgatccaa caccgcagga cacaggacca caagattgcc aaacaatcct
2100tgcgaccctt ctgcaccgtt tgcaaccgct acttcaaaac ccctcgcaag tttgtggagc
2160acgtgaagtc ccaggggcat aaggacaaag ccaaggagct gaagtcgctt gagaaagaaa
2220ttgctggcca agatgaggac cacttcatta cagtggacgc tgtgggttgc ttcgagggtg
2280atgaagaaga ggaagaggat gatgaggatg aagaagagat cgaggtgagg tccagagata
2340tatccagaga ggagtggaag ggctcggaga cctacagccc caatactgca tatggtgtgg
2400acttcctggt gcccgtgatg ggctatatct gccgcatctg ccacaagttc tatcacagca
2460actcaggggc acagctctcc cactgcaagt ccctgggcca ctttgagaac ctgcagaaat
2520acaaggcggc caagaacccc agccccacca cccgacctgt gagccgccgg tgcgcaatca
2580acgcccggaa cgctttgaca gccctgttca cctccagcgg ccgcccaccc tcccagccca
2640acacccagga caaaacaccc agcaaggtga cggctcgacc ctcccagccc ccactacctc
2700ggcgctcaac ccgcctcaaa acctgataga gggacctccc tgtccctggc ctgcctgggt
2760ccagatctgc taatgctttt taggagtctg cctggaaact ttgacatggt tcatgttttt
2820actcaaaatc caataaaaca aggtagtttg gctgtgcaaa aaaaaaaaaa aaaaaaaaaa
2880aaa
2883
SEQ ID NO: 74, length 33, PRT, Homo sapiens
Gln Gln Leu Gln Gln Leu Gln Gln Gln Gln Leu Gln Gln Gln Gln Leu
1               5                   10                  15
Gln Gln Gln Gln Leu Leu Gln Leu Gln Gln Leu Leu Gln Gln Ser Pro
            20                  25                  30
Pro