Patent application title: BIOLOGICAL ENTITIES AND THE USE THEREOF
Inventors:
Ulrich Haupts (Koln, DE)
Andre Koltermann (Icking, DE)
Andreas Scheidig (Koln, DE)
Christian Votsmeier (Koln, DE)
Ulrich Kettling (Koln, DE)
Oliver Kensch (Koln, DE)
Birgitta Leuthner (Koln, DE)
IPC8 Class: AA61K864FI
USPC Class:
424 7014
Class name: Live hair or scalp treating compositions (nontherapeutic) polymer containing (nonsurfactant, natural or synthetic) protein or derivative
Publication date: 2009-08-20
Patent application number: 20090208440
Agents:
Ballard Spahr Andrews & Ingersoll, LLP
Assignees:
Origin: ATLANTA, GA US
Abstract:
The present invention provides engineered enzymes generated from protein
scaffolds combined with Specificity Determining Regions, the production
thereof and the use of said engineered enzymes for research, nutritional
care, personal care and industrial purposes.
Claims:
1. A recombinant engineered enzyme with catalytic activity of defined
specificity, characterized by a combination of the following
components: (a) a protein scaffold capable of catalyzing at least one
protein cleavage reaction on at least one target substrate and being a
serine protease of the structural class S8, and (b) one or more
specificity determining regions (SDRs), wherein the SDRs are peptide
sequences inserted into the protein scaffold at one or more positions
that correspond structurally or by amino acid sequence homology to the
regions 6-17, 25-29, 47-55, 59-69, 101-111, 117-125, 129-137, 139-154,
158-169, 185-195 and 204-225 in subtilisin E from Bacillus subtilis
having the amino acid shown in SEQ ID NO:7, wherein the inserted SDRs
enable the resulting engineered protein to discriminate between at least
one target substrate and one or more different substrates.
2. The recombinant engineered enzyme of claim 1, wherein the SDRs (b) have a length of less than 50 amino acid residues.
3. The recombinant engineered enzyme of claim 1, wherein the SDRs (b) have a length between two and 20 amino acid residues.
4. The above recombinant engineered enzyme, wherein the SDRs (b) have a length between two and ten amino acid residues.
5. The recombinant engineered enzyme of claim 4, wherein the SDRs (b) have a length between three and eight amino acid residues.
6. The recombinant engineered enzyme of claim 2, wherein the number of SDRs is at least one.
7. The engineered enzyme of claim 6, wherein the number of SDRs is more than one.
8. The recombinant engineered enzyme of claim 6, wherein the number of SDRs is between two and eleven.
9. The recombinant engineered enzyme of claim 6, wherein the number of SDRs is between two and six.
10. The recombinant engineered enzyme of claim 1, wherein the protein scaffold (a) is encoded by a gene of viral origin.
11. The recombinant engineered enzyme of claim 1, wherein the protein scaffold (a) is encoded by a gene of prokaryotic origin.
12. The recombinant engineered enzyme of claim 1, wherein the protein scaffold (a) is encoded by a gene of eukaryotic origin.
13. The recombinant engineered enzyme of claim 1, wherein the protein scaffold (a) is comprised of one or more polypeptides derived from the same or different native enzymes.
14. The recombinant engineered enzyme of claim 1, wherein the protein scaffold (a) is comprised of one or more polypeptides derived from the same or different native mammalian enzymes.
15. The recombinant engineered enzyme of claim 14, wherein the mammalian enzymes are human enzymes.
16. The engineered enzyme of claim 1, wherein the SDRs are located at one or more positions selected from the group of positions that correspond structurally or by amino acid sequence homology to the regions 59-69, 101-111, 129-137, 158-169 and 204-225 in subtilisin E from Bacillus subtilis having the amino acid shown in SEQ ID NO:7.
17. A fusion protein which is comprised of at least one engineered enzyme of claim 1 and at least one further proteinaceous component.
18. The fusion protein of claim 17, wherein the further proteinaceous component is selected from the group consisting of binding domains, receptors, antibodies, regulation domains, pro-sequences, and fragments thereof.
19. A fusion protein which is comprised of at least one engineered enzyme of claim 1 and at least one further functional component.
20. The fusion protein of claim 19, wherein the functional component is selected from the group consisting of polyethyleneglycols, carbohydrates, lipids, fatty acids, nucleic acids, metals, metal chelates, and fragments or derivatives thereof.
21. A composition comprising one or more engineered enzymes of claim 1.
22. A composition comprising a fusion protein of claim 17.
23. A composition comprising a fusion protein of claim 19.
24. The composition of claim 21, which is a composition selected from the group consisting of research composition, nutritional composition, cleaning composition, food additive composition, disinfection composition, cosmetic composition or composition for personal care.
25. The composition of claim 22, which is a composition selected from the group consisting of research composition, nutritional composition, cleaning composition, food additive composition, disinfection composition, cosmetic composition or composition for personal care.
26. The composition of claim 23, which is a composition selected from the group consisting of research composition, nutritional composition, cleaning composition, food additive composition, disinfection composition, cosmetic composition or composition for personal care.
27. The composition of claim 21, which further comprises optional components selected from the group consisting of acceptable carrier(s) and auxiliary agent(s).
28. The composition of claim 22, which further comprises optional components selected from the group consisting of acceptable carrier(s) and auxiliary agent(s).
29. The composition of claim 23, which further comprises optional components selected from the group consisting of acceptable carrier(s) and auxiliary agent(s).
30. A recombinant engineered enzyme with catalytic activity of defined specificity, characterized by a combination of the following components: (a) a protein scaffold capable of catalyzing at least one protein cleavage reaction on at least one target substrate and being an aspartic protease of the structural class A1, and (b) one or more specificity determining regions (SDRs), wherein the SDRs are peptide sequences inserted into the protein scaffold at one or more positions that correspond structurally or by amino acid sequence homology to the regions 6-18, 49-55, 74-83, 91-97, 112-120, 126-137, 159-164, 184-194, 242-247, 262-267 and 277-300 in human pepsin having the amino acid sequence shown in SEQ ID NO:11, wherein the inserted SDRs enable the resulting engineered protein to discriminate between at least one target substrate and one or more different substrates.
31. The engineered enzyme of claim 30, wherein the SDRs are located at one or more positions selected from the group of positions that correspond structurally or by amino acid sequence homology to the regions 10-15, 75-80, 114-118, 130-134, 186-191 and 280-296 in human pepsin having the amino acid sequence shown in SEQ ID NO:11.
32. A recombinant engineered enzyme with catalytic activity of defined specificity, characterized by a combination of the following components: (a) a protein scaffold capable of catalyzing at least one protein cleavage reaction on at least one target substrate and being a cysteine protease of the structural class C14, and (b) one or more specificity determining regions (SDRs), wherein the SDRs are peptide sequences inserted into the protein scaffold at one or more positions that correspond structurally or by amino acid sequence homology to the regions 78-91, 144-160, 186-198, 226-243 and 271-291 in human caspase 7 having the amino acid sequence of SEQ ID NO:14, wherein the inserted SDRs enable the resulting engineered protein to discriminate between at least one target substrate and one or more different substrates.
33. The engineered enzyme of claim 32, wherein the SDRs are located at one or more positions selected from the group of positions that correspond structurally or by amino acid sequence homology to the regions 80-86, 149-157, 190-194 and 233-238 of human caspase 7 having the amino acid sequence of SEQ ID NO:14.
34. A recombinant engineered enzyme with catalytic activity of defined specificity, characterized by a combination of the following components: (a) a protein scaffold capable of catalyzing at least one protein cleavage reaction on at least one target substrate and being derived from a protease selected from the group consisting of a serine protease of the structural class S1, S8, S11, S21, S26, S33 and S51, a cysteine protease of the structure class C1, C2, C4, C10, C14, C19, C47, C48 and C56, an aspartic protease of the structural class A1, A2 and A26, and a metalloprotease of the structural class M4 and M10, and (b) one or more specificity determining regions (SDRs) located at sites in the protein scaffold that enable the resulting engineered protein to discriminate between at least one target substrate and one or more different substrates and wherein the SDRs are essentially synthetic peptide sequences.
Description:
[0001]This application is a continuation-in-part of copending application
Ser. No. 10/872,197, filed Jun. 18, 2004. This application claims the
priority benefit of European Application No. 03013819, filed Jun. 18,
2003; European Application No. 03025851, filed Nov. 10, 2003; European
Application No. 03025871, filed Nov. 11, 2003; and U.S. Provisional
Application No. 60/524,960, filed Nov. 25, 2003, which applications are
incorporated herein fully by this reference.
[0002]The present invention provides engineered enzymes comprised of a protein scaffold and Specificity Determining Regions, the production of such enzymes and the use thereof for therapeutic, research, diagnostic, nutritional care, personal care and industrial purposes.
BACKGROUND
[0003]Academic and industrial research continuously searches for functional proteins to be used as therapeutic, research, diagnostic, nutritional, personal care or industrial agents. Today, such functional proteins can be classified mainly into two categories: natural proteins and engineered proteins. Natural proteins, on the one hand, are discovered from nature, e.g. by screening natural isolates or by sequencing genomes from diverse species. Engineered proteins, on the other hand, are typically based on known proteins and are altered in order to acquire modified functionalities. The present invention discloses engineered proteins with novel functions as compared to the starting components. Such proteins are called NBEs (New Biologic Entities). The NBEs disclosed in the present invention are engineered enzymes with novel substrate specificities or fusion proteins of such engineered enzymes with other functional components.
[0004]Specificity is an essential element of enzyme function. A cell contains thousands of different, highly reactive catalysts, yet it is able to maintain a coordinated metabolism and a highly organized three-dimensional structure. This is due in part to the specificity of enzymes, i.e. the selective conversion of their respective substrates. Specificity is both a qualitative and a quantitative property: the specificity of a particular enzyme can vary widely, ranging from just one particular type of target molecule to all molecular types with certain chemical substructures. In nature, the specificity of an organism's enzymes has evolved to meet the particular needs of the organism. Arbitrary specificities with high value for therapeutic, research, diagnostic, nutritional or industrial applications are unlikely to be found in any organism's enzymatic repertoire due to the large space of possible specificities. The only realistic way of obtaining such specificities is their generation de novo.
[0005]When comparing enzymes with binders, a paradigm of specificity is given by antibodies recognizing individual epitopes as small distinct structures within large molecules. The naturally occurring vast range of antibody specificities is attributed to the diversity generated by the immune system combined with natural selection. Several mechanisms contribute to the vast repertoire of antibody specificity and occur at different stages of immune response generation and antibody maturation (Janeway, C et al. (1999) Immunobiology, Elsevier Science Ltd., Garland Publishing, New York). Specifically, antibodies contain complementarity determining regions (CDRs) which interact with the antigen in a highly specific manner and allow discrimination even between very similar epitopes. The light as well as the heavy chain of the antibody each contribute three CDRs to the binding domain. Nature uses recombination of various gene segments combined with further mutagenesis in the generation of CDRs. As a result, the sequences of the six CDR loops are highly variable in composition and length and this forms the basis for the diversity of binding specificities in antibodies. A similar principle for the generation of a diversity of catalytic specificities is not known from nature.
[0006]Catalysis, i.e. the increase of the rate of a specific chemical reaction, is, besides binding, the most important protein function. Catalytic proteins, i.e. enzymes, are classified according to the chemical reaction they catalyze.
[0007]Transferases are enzymes transferring a group, for example, the methyl group or a glycosyl group, from one compound (generally regarded as donor) to another compound (generally regarded as acceptor). For example, glycosyltransferases (EC 2.4) transfer glycosyl residues from a donor to an acceptor molecule. Some of the glycosyltransferases also catalyze hydrolysis, which can be regarded as transfer of a glycosyl group from the donor to water. The subclass is further subdivided into hexosyltransferases (EC 2.4.1), pentosyltransferases (EC 2.4.2) and those transferring other glycosyl groups (EC 2.4.99, Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB)).
[0008]Oxidoreductases catalyze oxido-reductions. The substrate that is oxidized is regarded as the hydrogen or electron donor. Oxidoreductases are classified as dehydrogenases, oxidases, mono- and dioxygenases. Dehydrogenases transfer hydrogen from a hydrogen donor to a hydrogen acceptor molecule. Oxidases react with molecular oxygen as hydrogen acceptor and produce oxidized products as well as either hydrogen peroxide or water. Monooxygenases transfer one oxygen atom from molecular oxygen to the substrate, while the other is reduced to water. In contrast, dioxygenases catalyze the insertion of both oxygen atoms from molecular oxygen into the substrate.
[0009]Lyases catalyze elimination reactions and thereby generate double bonds or, in the reverse direction, catalyze the additions at double bonds. Isomerases catalyze intramolecular rearrangements. Ligases catalyze the formation of chemical bonds at the expense of ATP consumption.
[0010]Finally, hydrolases are enzymes that catalyze the hydrolysis of chemical bonds like C-O or C-N. The E.C. classification for these enzymes generally classifies them by the nature of the bond hydrolysed and by the nature of the substrate. Hydrolases such as lipases and proteases play an important role in nature as well as in technical applications of biocatalysts. Proteases hydrolyse a peptide bond within the context of an oligo- or polypeptide. Depending on the catalytic mechanism, proteases are grouped into aspartic, serine, cysteine, metallo- and threonine proteases (Handbook of proteolytic enzymes. (1998) Eds: Barrett, A.; Rawlings, N.; Woessner, J.; Academic Press, London). This classification is based on the amino acid side chains that are responsible for catalysis and which are typically presented in the active site in very similar orientation to each other. The scissile bond of the substrate is brought into register with the catalytic residues due to specific interactions between the amino acid side chains of the substrate and complementary regions of the protease (Perona, J. & Craik, C. (1995) Protein Science, 4, 337-360). The residues on the N- and C-terminal side of the scissile bond are usually called P1, P2, P3 etc. and P1', P2', P3' etc., and the binding pockets complementary to the substrate S1, S2, S3 and S1', S2', S3', respectively (nomenclature according to Schechter & Berger, Biochem. Biophys. Res. Commun. 27 (1967) 157-162). The selectivity of proteases varies widely. At one extreme are virtually nonselective proteases (e.g. the subtilisins). Intermediate are proteases with a strict preference at the P1 position (e.g. trypsin, which selectively cuts on the C-terminal side of arginine or lysine residues). At the other extreme are highly specific proteases (e.g. human tissue-type plasminogen activator (t-PA), which cleaves at the C-terminal side of the arginine in the sequence CPGRWG) (Ding, L et al. (1995) Proc. Natl. Acad. Sci. USA 92, 7627-7631; Coombs, G et al. (1996) J. Biol. Chem. 271, 4461-4467).
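The subsite nomenclature described above can be illustrated with a minimal sketch. It is not part of the disclosed invention; the function names `label_subsites` and `trypsin_like_bonds` are hypothetical, and the cleavage rule is deliberately reduced to the P1 preference for arginine or lysine that the text attributes to trypsin, ignoring every other influence on cleavage.

```python
from __future__ import annotations


def label_subsites(peptide: str, bond_index: int, width: int = 3) -> dict[str, str]:
    """Label residues around a scissile bond with Schechter & Berger names.

    bond_index is the number of residues on the N-terminal side of the bond,
    so the bond lies between peptide[bond_index - 1] (P1) and
    peptide[bond_index] (P1').
    """
    labels: dict[str, str] = {}
    for k in range(1, width + 1):
        n_pos = bond_index - k       # P1, P2, ... run towards the N-terminus
        c_pos = bond_index + k - 1   # P1', P2', ... run towards the C-terminus
        if n_pos >= 0:
            labels[f"P{k}"] = peptide[n_pos]
        if c_pos < len(peptide):
            labels[f"P{k}'"] = peptide[c_pos]
    return labels


def trypsin_like_bonds(peptide: str) -> list[int]:
    """Return bond indices that follow Arg (R) or Lys (K).

    Simplifying assumption: only the P1 residue is considered. A residue at
    the C-terminus has no following bond and is therefore skipped.
    """
    return [i + 1 for i, aa in enumerate(peptide[:-1]) if aa in "RK"]


if __name__ == "__main__":
    peptide = "CPGRWG"  # sequence quoted in the text for t-PA
    for bond in trypsin_like_bonds(peptide):
        print(bond, label_subsites(peptide, bond))
```

For the sequence CPGRWG the sketch finds a single candidate bond, after the arginine, with P1 = R, P2 = G, P3 = P on the N-terminal side and P1' = W, P2' = G on the C-terminal side.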
[0011]The specificity of proteases, i.e. their ability to recognize and hydrolyze preferentially certain peptide substrates, can be expressed qualitatively and quantitatively. Qualitative specificity refers to the kind of amino acid residues that are accepted by a protease at certain positions of the peptide substrate. For example, trypsin and t-PA are related with respect to their qualitative specificity, since both of them require at the P1 position an arginine or a similar residue. On the other hand, quantitative specificity refers to the relative number of peptide substrates that are accepted as substrates by the protease, or more precisely, to the relative kcat/kM ratios of the protease for the different peptides that are accepted by the protease. Proteases that accept only a small portion of all possible peptides have a high specificity, whereas the specificity of proteases that, as an extreme, cleave any peptide substrate would theoretically be zero.
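The quantitative notion of specificity described above can be sketched numerically. The following is an illustration only: the kinetic values are invented, the substrate names are placeholders, and the acceptance cutoff of 1% of the best kcat/KM ratio is an arbitrary assumption, not a value taken from this disclosure.

```python
# Hypothetical kinetic data: substrate -> (kcat in 1/s, KM in micromolar).
# These numbers are invented for illustration and are not measurements.
HYPOTHETICAL_KINETICS = {
    "pepA": (10.0, 10.0),
    "pepB": (1.0, 1000.0),
    "pepC": (0.1, 10000.0),
    "pepD": (5.0, 10.0),
}


def specificity_constants(kinetics):
    """Return substrate -> kcat/KM for each substrate."""
    return {s: kcat / km for s, (kcat, km) in kinetics.items()}


def relative_ratios(kinetics):
    """Return kcat/KM for each substrate relative to the best substrate."""
    consts = specificity_constants(kinetics)
    best = max(consts.values())
    return {s: value / best for s, value in consts.items()}


def accepted_fraction(kinetics, cutoff=0.01):
    """Fraction of substrates whose relative kcat/KM reaches the cutoff.

    The cutoff is an assumption of this sketch. A fraction close to 1 would
    correspond to a nonselective protease; a small fraction to a specific one.
    """
    rel = relative_ratios(kinetics)
    accepted = [s for s, ratio in rel.items() if ratio >= cutoff]
    return len(accepted) / len(rel)


if __name__ == "__main__":
    for substrate, ratio in relative_ratios(HYPOTHETICAL_KINETICS).items():
        print(substrate, ratio)
    print("accepted fraction:", accepted_fraction(HYPOTHETICAL_KINETICS))
```

With the invented data, two of the four peptides (pepA and pepD) reach the cutoff, so the accepted fraction is 0.5. Other summary measures could be used equally well; the point is only that specificity is compared through relative kcat/KM ratios rather than through absolute rates.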
[0012]Comparison of the primary, secondary as well as the tertiary structure of proteases (Fersht, A., Enzyme Structure and Mechanism, W. H. Freeman and Company, New York, 1995) allows identification of classes showing a high degree of conservation (Rawlings, N. D. & Barrett, A. J. (1997) In: Proteolysis in Cell Functions Eds. Hopsu-Havu, V. K.; Jarvinen, M.; Kirschke, H, pp. 13-21, IOS Press, Amsterdam). A widely accepted scheme for protease classification has been proposed by Rawlings & Barrett (Handbook of proteolytic enzymes. (1998) Eds: Barrett, A.; Rawlings, N.; Woessner, J.; Academic Press, London). For example, the serine protease family can be subdivided into structural classes with chymotrypsin (class S1), subtilisin (class S8) and carboxypeptidase (class SC) folds, each of which includes nonspecific as well as specific proteases (Rawlings, N. D. & Barrett, A. J. (1994) Methods Enzymol. 244, 19-61). This applies to other protease families analogously. An additional distinction can be made according to the relative location of the cleaved bond in the substrate. Carboxy- and aminopeptidases cleave amino acids from the C- and N-terminus, respectively, while endopeptidases cut anywhere along the oligopeptide.
[0013]Many applications would be conceivable if enzymes with a basically unlimited spectrum of specificities were available. However, the use of such enzymes with high, low or any defined specificity is currently limited to those which can be isolated from natural sources. The fields of application for these enzymes range from therapeutic, research, diagnostic and nutritional to personal care and industrial purposes.
[0014]Enzyme additives in detergents have come to constitute nearly a third of the whole industrial enzyme market. Detergent enzymes include proteinases for removing organic stains, lipases for removing greasy stains, amylases for removing residues of starchy foods and cellulases for restoring the smooth surface of the fiber. The best known detergent enzyme is probably the nonspecific proteinase subtilisin, isolated from various Bacillus species.
[0015]Starch enzymes, such as amylases, make up the majority of the enzymes used in food processing. While starch enzymes include products that are important for textile desizing, alcohol fermentation, paper and pulp processing, and laundry detergent additives, the largest application is the production of high fructose corn syrup. The production of corn syrup from starch by means of industrial enzymes was a successful alternative to acid hydrolysis.
[0016]Apart from starch processing, enzymes are used for an increasing range of applications in food. Enzymes in food can improve texture, appearance and nutritional value or may generate desirable flavours and aromas. Food enzymes currently used in baking are amylase, amyloglycosidases, pentosanases for breakdown of pentosan and reduced gluten production, and glucose oxidases to increase the stability of dough. Common enzymes in dairy processing are rennet (protease) as coagulant in cheese production, lactase for hydrolysis of lactose, protease for hydrolysis of whey proteins and catalase for the removal of hydrogen peroxide.
[0017]Enzymes used in the brewing process are the above-named amylases, but also cellulases or proteases to clear the beer of suspended proteins. In wines and fruit juices, cloudiness is more commonly caused by starch and pectins, so that amylases and pectinases increase yield and clarification. Papain and other proteinases are used for meat tenderizing.
[0018]Enzymes have also been developed to aid animals in the digestion of feed. In the western hemisphere, corn is a major source of food for cattle, swine, and poultry. In order to improve the bioavailability of phosphate from corn, phytase is commonly added (Wyss, M. et al. Biochemical characterization of fungal phytases (myo-inositol hexakisphosphate phosphohydrolases): Catalytic properties. Applied & Environmental Microbiology 65, 367-373 (1999)). Moreover, phytate hydrolysis has been shown to bring about improvements in digestibility of protein and absorption of minerals such as calcium (Bedford, M. R. & Schulze, H. Exogenous enzymes for pigs and poultry [Review]. Nutrition Research Reviews 11, 91-114 (1998)). Another major feed enzyme is xylanase. This enzyme is particularly useful as a supplement for feeding stuff comprising more than about 10% of wheat, barley or rye, because of their relatively high soluble fiber content. Xylanases have two important effects: reduction of the viscosity of the intestinal contents by hydrolyzing the gel-like high molecular weight arabinoxylans in feed (Murphy, T. C., Bedford, M. R. & McCracken, K. J. Effect of a range of new xylanases on in vitro viscosity and on performance of broiler diets. British Poultry Science 44, S16-S18 (2003)) and breakdown of polymers in cell walls, which improves the bioavailability of protein and starch.
[0019]Biotech research and development laboratories routinely use special enzymes in small quantities along with many other reagents, and these enzymes constitute a significant market. Alkaline phosphatase, horseradish peroxidase and luciferase are only some examples. Thermostable DNA polymerases like Taq polymerase or restriction endonucleases revolutionized laboratory work. Therapeutic enzymes are a particular class of drugs, categorized by the FDA as biologicals, with many advantages compared to other, especially non-biological, pharmaceuticals. Examples of successful therapeutic enzymes are human clotting factors like factor VIII and factor IX for human treatment. In addition, digestive enzymes are used for various deficiencies in human digestive processes. Other examples are t-PA and streptokinase for the treatment of cardiovascular disease, beta-glucocerebrosidase for the treatment of Type I Gaucher disease, L-asparaginase for the treatment of acute lymphoblastic leukemia and DNAse for the treatment of cystic fibrosis. An important issue in the application of proteins as therapeutics is their potential immunogenicity. To reduce this risk, one would prefer enzymes of human origin, which narrows down the set of available enzymes. The provision of designed enzymes, preferably of human origin, with novel, tailor-made specificities would allow the specific modification of target substrates at will, while minimizing the risk of immunogenicity. A further advantage of highly specific enzymes as therapeutics would be their lower risk of side effects. Due to the limited possibility of specific interactions between a small molecule and a protein, binding to non-target proteins and therefore side effects are quite common and often cause termination of an otherwise promising lead compound. Specific enzymes, on the other hand, provide many more contact sites and mechanisms for substrate discrimination and therefore enable a higher specificity and thereby fewer side activities.
[0020]Proteases represent an important class of therapeutic agents (Drugs of today, 33, 641-648 (1997)). However, currently the therapeutic protease is usually a substitute for insufficient activity of the body's own proteases. For example, factor VII can be administered in certain cases of coagulation deficiencies of bleeders or during surgery (Heuer L.; Blumenberg D. (2002) Anesthetist 51:388). Tissue-type plasminogen activator (t-PA) is applied in acute cardiac infarction, initializing the dissolution of fibrin clots through specific cleavage and activation of plasminogen (Verstraete, M. et al. (1995) Drugs, 50, 29-41). So far a protease with tailor-made specificity is generated to provide a therapeutic agent that specifically activates or inactivates a disease related target protein.
[0021]Monoclonal antibodies represent another important biological class of substances with therapeutic capabilities. Among the main antibody targets are tumor necrosis factors (TNFs), which belong to the family of cytokines. TNFs play a major role in the inflammation process. As homotrimers they can bind to receptors of nearly every cell. They activate a multiplicity of cellular genes, multiple signal transduction mechanisms, kinases and transcription factors. The most important TNFs are TNF-alpha and TNF-beta. TNF-alpha is produced by macrophages, monocytes and other cells and is an inflammation mediator. Therefore, research of the last decade has been focused on TNF-alpha inhibitors like monoclonal antibodies as possible therapeutics for different therapeutic indications like Rheumatoid Arthritis, Crohn's disease or Psoriasis (Hamilton et al. (2000) Expert Opin Pharmacother, 1 (5): 1041-1052). One of the major disadvantages of monoclonal antibodies is their high cost, so that new biological alternatives are of great importance.
[0022]There are many examples of engineered enzymes in the literature. Fulani et al. (Fulani F. et al. (2003) Protein Engineering 16, 515-519) describe a rhodanese (thiosulfate:cyanide sulfurtransferase) from Azotobacter vinelandii which has a catalytic domain structurally related to the catalytic subunit of Cdc25 phosphatase enzymes. The difference in catalytic mechanism depends on the different size of the active site. Both rhodanese and phosphatase are highly specific for different substrates (sulfate vs. phosphate). The catalytic mechanism of the rhodanese could be shifted towards serine/threonine phosphatase by single-residue insertion. Fulani et al. therefore give a single example of a change of catalytic mechanism by structural comparison and sequence alignment of naturally known enzymes from different enzyme classes, but give no indication of how to generate a user-definable substrate specificity while keeping the same catalytic mechanism.
[0023]The thioredoxin reductase described by Briggs et al. (WO 02/090300 A2) has an altered cofactor specificity, preferentially binding NADPH over NADH. Thus, both enzymes, the starting point as well as the resulting engineered enzyme, are highly specific towards different substrates. The methods to achieve such an altered substrate specificity are either computational processing methods or sequence alignments of related proteins to define variable and conserved residues. They all have in common that they are based on the comparison of structures and sequences of proteins with known specificities followed by the transfer of the same to another backbone.
[0024]There are other examples of specificity-engineered enzymes and, in particular, of proteases which have been published in the literature. None of these examples, however, provides a means for generating novel specificities compared to the specificity of the starting material used within the described methods. The methods range from structure-directed single point mutations (Kurth, T. et al. (1998) Biochemistry 37, 11434-11440; Ballinger, M et al. (1996) Biochemistry, 35:13579-13585), exchange of surface loops between two specific proteases (Horrevoets et al. (1993) J. Biol. Chem. 268, 779-782), to random mutagenesis either regio-selectively or across the whole gene combined with in-vitro or in-vivo selection (Sices, H. & Kristie, T. (1998) Proc. Natl. Acad. Sci. USA, 95, 2828-2833).
[0025]The rational design of protease specificity is limited to very few examples. This approach is severely limited by the insufficient understanding of the complexities that govern folding and dynamics as well as structure-function relationships in proteins (Corey, M. J. & Corey, E. (1996) Proc. Natl. Acad. Sci. USA, 93:11428-11434). It is therefore difficult to alter the primary amino acid sequence of a protease in order to change its activity or specificity in a predictive way. In a successful example, Kurth et al. engineered trypsin to show a preference for a dibasic motif (Kurth, T. et al. (1998) Biochemistry, 37:11434-11440). In another example, Hedstrom et al. converted the S1 substrate specificity of trypsin to that of chymotrypsin (Hedstrom, L. et al. (1992) Science, 255:1249-1253). This is an example where a known property was transferred from one backbone to another.
[0026]Ballinger et al. (WO 96/27671) describe subtilisin variants with combination mutations (N62D/G166D, and optionally Y104D) having a shift of substrate specificity towards peptide or polypeptide substrates with basic amino acids at the P1, P2 and P4 positions of the substrate. Suitable substrates of the variant subtilisin were revealed by sorting a library of phage particles (substrate phage) containing five contiguous randomized residues. These subtilisin variants are useful for cleaving fusion proteins with basic substrate linkers and processing hormones or other proteins (in vitro or in vivo) that contain basic cleavage sites. The problems associated with rational redesign of enzymes can partially be overcome by directed evolution (as disclosed in PCT/EP03/04864). These studies can be classified by their expression and selection systems. Genetic selection means producing, inside an organism, an enzyme, e.g. a protease, that is able to cleave a precursor protein, which in turn results in an alteration of the growth behavior of the producing organism. From a population of organisms with different proteases, those with an altered growth behavior can be selected. This principle was for example reported by Davis et al. (U.S. Pat. No. 5,258,289, WO 96/21009). The production of a phage system is dependent on the cleavage of a phage protein which can only be activated in the presence of a proteolytic enzyme which is able to cleave the phage protein. Other approaches use a reporter system which allows a selection by screening instead of a genetic selection, but they also cannot overcome the intrinsic insufficiency of the intracellular characterization of enzymes.
[0027]Systems to generate enzymes with altered sequence specificities with self-secreting enzymes are also reported. Duff et al. (WO 98/11237) describe an expression system for a self-secreting protease. An essential element of the experimental design is that the catalytic reaction acts on the protease itself by an autoproteolytic processing of the membrane-bound precursor molecule to release the matured protease from the cellular membrane into the extracellular environment. Therefore, a fusion protein must be constructed where the target peptide sequence replaces the natural cleavage site for autoproteolysis. Limitations of such a system are that positively identified proteases will have the ability to cleave a certain amino acid sequence but they also may cleave many other peptide sequences. Therefore, high substrate specificity cannot be achieved. Additionally, such a system cannot ensure that selected proteases cleave at a specific position in a defined amino acid sequence, and it does not allow a precise characterization of the kinetic constants of the selected proteases (kcat, KM).
[0028]A method has been described that aims at the generation of new catalytic activities and specificities within the α/β-barrel proteins (WO 01/42432; Fersht et al., Methods of producing novel enzymes; Altamirano et al. (2000) Nature 403, 617-622). The α/β-barrel proteins comprise a large superfamily of proteins accounting for a large fraction of all known enzymes. The structure of the proteins consists of an α/β-barrel surrounded by α-helices. The loops connecting β-strands and helices comprise the so-called lid-structure including the active site residues. The method is based on the classification of α/β-barrel proteins into two classes based on the catalytic lid structure. An extensive comparison of α/β-barrel protein structures led the authors to the conclusion that the substrate binding and specificity is primarily defined by the barrel structure while the specificity of the chemical reaction resides within the loops. It is suggested that barrels and lid structures from different enzymes can be combined to generate new enzymatic activities and to provide a starting point to fine tune the properties by targeted or randomized mutagenesis and selection. The method does not provide for the generation of user-defined specificity.
[0029]In summary, it is clear that there are many possible applications in the fields of therapeutics, research and diagnostics, industrial enzymes, food and feed processing, cosmetics and other areas that would become possible by the availability of enzymes with a novel substrate specificity. However, only a limited number of specific enzymes has been identified from natural sources so far. Methods of rational design to modify, alter, convert or transfer sequence specificity as well as random approaches described above did not enable the generation of a novel and user-definable specificity that was not present in the employed starting material.
[0030]Therefore, none of the currently available methods can provide enzymes with a novel and user-defined sequence specificity. In contrast, the current invention provides such enzymes as well as methods for generating them.
SUMMARY OF THE INVENTION
[0031]The objective of the present invention is to provide engineered proteins with novel functions that do not exist in the components used for the engineering of such proteins. In particular, the invention provides enzymes with user-definable specificities. User-definable specificity means that enzymes are provided with specificities that do not exist in the components used for the engineering of such enzymes. The specificities can be chosen by the user so that one or more intended target substrates are preferentially recognised and converted by the enzymes. Furthermore, the invention provides enzymes that possess essentially identical sequences to human proteins but have different specificities. In a particular embodiment, the invention provides proteases with user-definable specificities.
[0032]Furthermore, the present invention is directed to engineered enzymes which are fused to one or more further functional components. These further components can be proteinaceous components which preferably have binding properties and are of the group consisting of substrate binding domains, antibodies, receptors or fragments thereof. Furthermore, these further components can be further functional components, preferably being selected from the group consisting of polyethyleneglycols, carbohydrates, lipids, fatty acids, nucleic acids, metals, metal chelates, and fragments or derivatives thereof. The resulting fusion proteins are understood as enzymes with user-definable specificities within the present invention.
[0033]Besides, the invention is directed to the application of such enzymes with novel, user-definable specificities for therapeutic, research, diagnostic, nutritional, personal care or industrial purposes. Moreover, the invention is directed to a method for generating engineered enzymes with user-definable specificities. In particular, the invention is directed to generating enzymes that possess essentially identical sequences to human enzymes but have different specificities.
[0034]This problem has been solved by the embodiments of the invention specified in the description below and in the claims. The present invention is thus directed to
[0035](1) an engineered enzyme with defined specificity characterized by the combination of the following components:
[0036](a) a protein scaffold which catalyzes at least one chemical reaction on at least one substrate, and
[0037](b) one or more specificity determining regions (SDRs) located at sites in the protein scaffold that enable the resulting engineered protein to discriminate between at least one target substrate and one or more different substrates, and wherein the SDRs are essentially synthetic peptide sequences;
[0038](2) the use of an engineered enzyme as defined in (1) above for therapeutic, research, diagnostic, nutritional, personal care or industrial purposes;
[0039](3) a method for generating engineered enzymes as defined in (1) above having specificities towards target substrates, such specificities not being present in the individual starting components, comprising at least the following steps:
[0040](a) providing a protein scaffold which catalyzes at least one chemical reaction on at least one substrate,
[0041](b) generating a library of engineered enzymes by combining the protein scaffold from step (a) with fully or partially random peptide sequences at sites in the protein scaffold that enable the resulting engineered enzyme to discriminate between at least one target substrate and one or more different substrates, and
[0042](c) selecting out of the library of engineered enzymes generated in step (b) one or more enzymes that have specificities towards at least one target substrate;
[0043](4) a fusion protein which is comprised of at least one engineered enzyme as defined in (1) above and at least one further component, preferably the at least one further component having binding properties and more preferably being selected from the group consisting of antibodies, binding domains, receptors, and fragments thereof;
[0044](5) a composition or pharmaceutical composition comprising one or more engineered enzymes as defined in (1) above or a fusion protein as defined in (4) above, said pharmaceutical composition may optionally comprise an acceptable carrier, excipient and/or auxiliary agent;
[0045](6) a DNA encoding the engineered enzyme as defined in (1) above;
[0046](7) a vector comprising the DNA as defined in (6) above;
[0047](8) a host cell or transgenic organism being transformed/transfected with a vector as defined in (7) above and/or containing the DNA as defined in (6) above; and
[0048](9) a method for producing the engineered enzyme comprising culturing a cell or organism as defined in (8) above and isolating the enzyme from the culture broth.
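Step (b) of the method in (3) above, combining a protein scaffold with fully or partially random peptide sequences at a chosen site, can be illustrated in silico. The following is a minimal sketch only: the toy scaffold string, the insertion site, and the pattern convention (`X` for a fully randomized position, any other letter for a fixed residue) are hypothetical illustrations and are not taken from the application. The selection of step (c) is an experimental procedure and is not modelled here.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter code


def random_sdr(pattern, rng):
    """Return one peptide for `pattern`.

    Hypothetical convention: 'X' marks a fully randomized position,
    any other letter is kept fixed (a partially random peptide).
    """
    return "".join(rng.choice(AMINO_ACIDS) if c == "X" else c for c in pattern)


def library_size(pattern):
    """Number of distinct peptides that the pattern can encode."""
    return len(AMINO_ACIDS) ** pattern.count("X")


def build_library(scaffold, site, pattern, n_variants, seed=0):
    """Insert random peptides after 1-based residue `site` of `scaffold`."""
    rng = random.Random(seed)
    variants = set()
    # Never ask for more distinct variants than the pattern can provide.
    target = min(n_variants, library_size(pattern))
    while len(variants) < target:
        sdr = random_sdr(pattern, rng)
        variants.add(scaffold[:site] + sdr + scaffold[site:])
    return sorted(variants)


# Toy scaffold, NOT a real protease sequence.
toy_scaffold = "MKTAYIAKQR"
library = build_library(toy_scaffold, site=4, pattern="XGXX", n_variants=5)
print(library_size("XGXX"))
print(library)
```

The pattern `XGXX` holds one position fixed and randomizes three, so the theoretical diversity grows as a power of 20 with the number of randomized positions; this is the quantity that bounds how much of a library can be sampled in practice.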
BRIEF DESCRIPTION OF THE FIGURES
[0049]The following figures are provided in order to explain further the present invention in supplement to the detailed description:
[0050]FIG. 1 illustrates the three-dimensional structure of human trypsin I with the active site residues shown in "ball-and-stick" representation and with the marked regions indicating potential SDR insertion sites.
[0051]FIG. 2 shows the alignment of the primary amino acid sequence of three members of the serine protease class S1 family: human trypsin I, human alpha-thrombin and human enteropeptidase (see also SEQ ID NOs: 1, 5 and 6).
[0052]FIG. 3 illustrates the three-dimensional structure of subtilisin with the active site residues being shown in "ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.
[0053]FIG. 4 shows the alignment of the primary amino acid sequences of four members of the serine protease class S8 family: subtilisin E, furin, PC1 and PC5 (see also SEQ ID NOs: 7-10).
[0054]FIG. 5 illustrates the three-dimensional structure of pepsin with the active site residues being shown in "ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.
[0055]FIG. 6 shows the alignment of the primary amino acid sequences of three members of the A1 aspartic acid protease family: pepsin, β-secretase and cathepsin D (see also SEQ ID NOs: 11-13).
[0056]FIG. 7 illustrates the three-dimensional structure of caspase 7 with the active site residues being shown in "ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.
[0057]FIG. 8 shows the primary amino acid sequence of caspase 7 as a member of the cysteine protease class C14 family (see also SEQ ID NO: 14).
[0058]FIG. 9 depicts schematically the third aspect of the invention.
[0059]FIG. 10 shows a Western blot analysis of a culture supernatant of cells expressing variants of human trypsin I with SDR1 and SDR2, compared to negative controls.
[0060]FIG. 11 shows the time course of the proteolytic cleavage of a target substrate by human trypsin I.
[0061]FIG. 12 shows the relative activities of three variants of inventive engineered proteolytic enzymes in comparison with human trypsin I on two different peptide substrates.
[0062]FIG. 13 shows the relative specificities of human trypsin I and variants of inventive engineered proteolytic enzymes with one or two SDRs, respectively.
[0063]FIG. 14 shows the relative specificities of human trypsin I and of variants of inventive engineered proteolytic enzymes being specific for human TNF-alpha with this scaffold on peptides with a target sequence of human TNF-alpha.
[0064]FIG. 15 shows the reduction of cytotoxicity induced by TNF-alpha when incubating the TNF-alpha with concentrated supernatant from cultures expressing the inventive engineered proteolytic enzymes being specific for human TNF-alpha.
[0065]FIG. 16 shows the reduction of cytotoxicity induced by TNF-alpha when incubating the TNF-alpha with purified inventive engineered proteolytic enzyme being specific for human TNF-alpha.
[0066]FIG. 17 compares the activity of inventive engineered proteolytic enzymes being specific for human TNF-alpha with the activity of human trypsin I on two protein substrates: (a) human TNF-alpha; (b) mixture of human serum proteins.
[0067]FIG. 18 shows the specific activity of an inventive engineered proteolytic enzyme with specificity for human VEGF.
[0068]FIG. 19: Plasmid map of the shuttle vector pBVP43-Sub.
[0069]FIG. 20: Schematic drawing of the insertion of SDRs via PCR.
[0070]FIG. 21: Graphical description of the SDR insertion sites in subtilisin E.
[0071]FIG. 22: The figure compares the properties of ACE inhibition and degree of hydrolysis (DH) of GMP hydrolysates generated with either wt subtilisin E or the subtilisin E variants X and Y.
DEFINITIONS
[0072]In the framework of the present invention the following terms and definitions are used.
[0073]The term "protease" means any protein molecule that is capable of hydrolysing peptide bonds. This includes naturally-occurring or artificial proteolytic enzymes, as well as variants thereof obtained by site-directed or random mutagenesis or any other protein engineering method, any active fragment of a proteolytic enzyme, or any molecular complex or fusion protein comprising one of the aforementioned proteins. A "chimera of proteases" means a fusion protein of two or more fragments derived from different parent proteases.
[0074]The term "substrate" means any molecule that can be converted catalytically by an enzyme. The term "peptide substrate" means any peptide, oligopeptide, or protein molecule of any amino acid composition, sequence or length, that contains a peptide bond that can be hydrolyzed catalytically by a protease. The peptide bond that is hydrolyzed is referred to as the "cleavage site". Numbering of positions in the substrate is done according to the system introduced by Schechter & Berger (Biochem. Biophys. Res. Commun. 27 (1967) 157-162). Amino acid residues adjacent N-terminal to the cleavage site are numbered P1, P2, P3, etc., whereas residues adjacent C-terminal to the cleavage site are numbered P1', P2', P3', etc.
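The numbering convention described above can be made concrete with a short example. The sketch below labels each residue of a peptide relative to a given cleavage site; the function name, the argument convention (the 1-based index of the residue on the N-terminal side of the cleaved bond) and the example peptide are illustrative assumptions.

```python
def label_sites(peptide, cleavage_after):
    """Label residues around the cleaved peptide bond.

    `cleavage_after` is the 1-based index of the residue directly
    N-terminal to the cleavage site; that residue is P1.
    """
    if not 1 <= cleavage_after < len(peptide):
        raise ValueError("cleavage site must lie between two residues")
    labels = {}
    for i, residue in enumerate(peptide, start=1):
        if i <= cleavage_after:
            # N-terminal side: P1 is adjacent to the bond, then P2, P3, ...
            labels[f"P{cleavage_after - i + 1}"] = residue
        else:
            # C-terminal side: P1' is adjacent to the bond, then P2', P3', ...
            labels[f"P{i - cleavage_after}'"] = residue
    return labels


# Hypothetical hexapeptide cleaved between its third and fourth residue.
sites = label_sites("AGRKSL", cleavage_after=3)
for name, residue in sites.items():
    print(name, residue)
```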
[0075]The term "target substrate" describes a user-defined substrate which is specifically recognized and converted by an enzyme according to the invention. The term "target peptide substrate" describes a user-defined peptide substrate. The term "target specificity" describes the qualitative and quantitative specificity of an enzyme that is capable of recognizing and converting a target substrate. Catalytic properties of enzymes are expressed using the kinetic parameters "KM" or "Michaelis Menten constant", "kcat" or "catalytic rate constant", and "kcat/KM" or "catalytic efficiency", according to the definitions of Michaelis and Menten (Fersht, A., Enzyme Structure and Mechanism, W. H. Freeman and Company, New York, 1995). The term "catalytic activity" describes quantitatively the conversion of a given substrate under defined reaction conditions.
[0076]The term "specificity" means the ability of an enzyme to recognize and convert preferentially certain substrates. Specificity can be expressed qualitatively and quantitatively. "Qualitative specificity" refers to the chemical nature of the substrate residues that are recognized by an enzyme. "Quantitative specificity" refers to the number of substrates that are accepted as substrates. Quantitative specificity can be expressed by the term s, which is defined as the negative logarithm of the number of all accepted substrates divided by the number of all possible substrates. Proteases, for example, that accept preferentially a small portion of all possible peptide substrates have a "high specificity". Proteases that accept almost any peptide substrate have a "low specificity". Definitions are made in accordance to WO 03/095670 which is therefore incorporated by reference. Proteases with very low specificity are also referred to as "unspecific proteases". The term "defined specificity" refers to a certain type of specificity, i.e. to a certain target substrate or a set of certain target substrates that are preferentially converted versus other substrates.
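As a worked illustration of the term s defined above, the following sketch computes the negative logarithm of the fraction of accepted substrates. The base of the logarithm is not stated in this section; base 10 is assumed here, and the substrate counts in the example are hypothetical.

```python
import math


def quantitative_specificity(n_accepted, n_possible):
    """s = -log10(accepted substrates / all possible substrates).

    Base 10 is an assumption; the definition above does not fix the base.
    """
    if not 0 < n_accepted <= n_possible:
        raise ValueError("need 0 < n_accepted <= n_possible")
    return -math.log10(n_accepted / n_possible)


# Hypothetical example: of all 20**6 hexapeptides, a protease accepts only
# those with fixed residues at two positions, i.e. 20**4 sequences.
s_specific = quantitative_specificity(20**4, 20**6)
# A protease accepting every hexapeptide has the lowest possible value.
s_unspecific = quantitative_specificity(20**6, 20**6)
print(round(s_specific, 3))
```

Under this convention a larger s corresponds to a higher specificity, and an enzyme that accepts all possible substrates has s equal to zero.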
[0077]The term "engineered" in combination with the term "enzyme" describes an enzyme that is comprised of different components and that has features not being conferred by the individual components alone.
[0078]The term "protein scaffold" or "scaffold protein" refers to a variety of primary, secondary and tertiary polypeptide structures.
[0079]The term "peptide sequence" indicates any peptide sequence used for insertion or substitution into or combination with a protein scaffold. Peptide sequences are usually obtained by expression from DNA sequences which can be synthesized according to well-established techniques or can be obtained from natural sources. Insertion, substitution or combination of peptide sequences with the protein scaffold are generated by insertion, substitution or combination of oligonucleotides into or with a polynucleotide encoding the protein scaffold. The term "synthetic" in combination with the term "peptide sequence" refers to peptide sequences that are not present in the protein scaffold in which the peptide sequences are inserted or substituted or with which they are combined.
[0080]The term "components" in combination with the term "engineered enzyme" refers to peptide or polypeptide sequences that are combined in the engineering of such enzymes. Such components may among others comprise one or more protein scaffolds and one or more synthetic peptide sequences. The term "library of engineered enzymes" describes a mixture of engineered enzymes, whereby every single engineered enzyme is encoded by a different polynucleotide sequence. The term "gene library" indicates a library of polynucleotides that encodes the library of engineered enzymes. The term "SDR" or "Specificity determining region" refers to a synthetic peptide sequence that provides the defined specificity when combined with the protein scaffold at sites that enable the resulting enzymes to discriminate between the target substrate and one or more other substrates. Such sites are termed "SDR sites".
[0081]The terms "tertiary structure similar to the structure of" and "similar tertiary structure" in combination with the terms "enzyme" or "protein" refer to proteins in which the type, sequence, connectivity and relative orientation of the typical secondary structural elements of a protein, e.g. alpha-helices, beta-sheets, beta-turns and loops, are similar and the proteins are therefore grouped into the same structural or topological class or fold. This includes proteins that have altered, additional or deleted structural elements of any type but otherwise unchanged topology. Examples of such structural classes are the TNF superfamily, the S1 fold or the S8 fold within the serine proteases, the GPCRs, or the α/β-barrel fold.
[0082]The term "positions that correspond structurally" indicates amino acids in proteins of similar tertiary structure that correspond structurally to each other, i.e. they are usually located within the same structural or topological element of the structure. Within the structural element they possess the same relative positions with respect to beginning and end of the structural element. If, e.g. the topological comparison of two proteins reveals two structurally corresponding sequences of different length, then amino acids within, e.g. 20% and 40% of the respective region lengths, correspond to each other structurally.
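One way to read the relative-position rule above is as a linear mapping between two regions of different length. The sketch below is an interpretation under stated assumptions: regions are given as inclusive (start, end) residue numbers, the fraction is measured from the first to the last residue of the region, and the result is rounded to the nearest residue. None of these conventions is prescribed by the definition, and the example regions are hypothetical.

```python
def corresponding_position(pos, region_a, region_b):
    """Map residue `pos` of region_a onto region_b by relative position.

    Regions are inclusive (start, end) tuples. Rounding to the nearest
    residue is an assumption of this sketch.
    """
    start_a, end_a = region_a
    start_b, end_b = region_b
    if not start_a <= pos <= end_a:
        raise ValueError("position lies outside region_a")
    if end_a == start_a:
        fraction = 0.0  # single-residue region: map to the start of region_b
    else:
        fraction = (pos - start_a) / (end_a - start_a)
    return start_b + round(fraction * (end_b - start_b))


# Hypothetical loop of 11 residues in one protein and 6 residues in another.
loop_a = (10, 20)
loop_b = (100, 105)
print(corresponding_position(12, loop_a, loop_b))  # 20% into the loop
print(corresponding_position(14, loop_a, loop_b))  # 40% into the loop
```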
[0083]The term "library of engineered enzymes" of the present invention refers to a multiplicity of enzymes or enzyme variants, which may exist as a mixture or in isolated form.
[0084]Amino acid residues are abbreviated according to the following Table 1 either in one- or in three-letter code.
TABLE 1 Amino acid abbreviations

Abbreviations                Amino acid
One-letter   Three-letter
A            Ala             Alanine
C            Cys             Cysteine
D            Asp             Aspartic acid
E            Glu             Glutamic acid
F            Phe             Phenylalanine
G            Gly             Glycine
H            His             Histidine
I            Ile             Isoleucine
K            Lys             Lysine
L            Leu             Leucine
M            Met             Methionine
N            Asn             Asparagine
P            Pro             Proline
Q            Gln             Glutamine
R            Arg             Arginine
S            Ser             Serine
T            Thr             Threonine
V            Val             Valine
W            Trp             Tryptophan
Y            Tyr             Tyrosine
DETAILED DESCRIPTION OF THE INVENTION
[0085]The present invention provides engineered proteins with novel functions. In particular, the invention provides enzymes with user-definable specificities. In a particular embodiment, the invention provides proteases with user-definable specificities. Besides, the invention provides applications of such enzymes with novel, user-definable specificities for therapeutic, research, diagnostic, nutritional, personal care or industrial purposes. Moreover, the invention provides a method for generating enzymes with specificities that are not present in the components used for the engineering of such enzymes. In particular, the invention is directed to the generation of enzymes that have sequences that are essentially identical to mammalian, especially human enzymes but have different specificities. Moreover, the invention provides libraries of specific engineered enzymes with corresponding specificities encoded genetically, a method for the generation of libraries of specific engineered enzymes with corresponding specificities encoded genetically, and the application of such libraries for technical, diagnostic, nutritional, personal care or research purposes.
[0086]A first aspect of the invention discloses engineered enzymes with defined specificities. These engineered enzymes are characterized by the following components:
[0087](a) a protein scaffold capable of catalyzing at least one chemical reaction on a substrate, and
[0088](b) one or more specificity determining regions (SDRs) located at sites in the protein scaffold that enable the resulting engineered protein to discriminate between at least one target substrate and one or more different substrates, wherein the SDRs are essentially synthetic peptide sequences.
[0089]Preferably, such defined specificity of the engineered enzymes is not conferred by the protein scaffold.
[0090]In principle, the protein scaffold can have a variety of primary, secondary and tertiary structures. The primary structure, i.e. the amino acid sequence, can be an engineered sequence or can be derived from any viral, prokaryotic or eukaryotic origin. For human therapeutic use, however, the protein scaffold is preferably of mammalian origin, and more preferably, of human origin. Furthermore, the protein scaffold is capable of catalyzing one or more chemical reactions and preferably has only a low specificity.
[0091]Preferably, derivatives of the protein scaffold are used that have modified amino acid sequences that confer improved characteristics for the applicability as protein scaffolds. Such improved characteristics comprise, but are not limited to, stability; expression or secretion yield; folding, in particular after combination of the protein scaffold with SDRs; increased or decreased sensitivity to regulators such as activators or inhibitors; immunogenicity; catalytic rate; KM or substrate affinity.
[0092]The engineered enzymes derive their quantitative specificity from the synthetic peptide sequences that are combined with the protein scaffold. Therefore, the engineered peptide sequences are acting as Specificity Determining Regions or SDRs. The number, the length and the positions of such SDRs can vary over a wide range. The number of SDRs within the scaffold is at least one, preferably more than one, more preferably between two and eleven, most preferably between two and six. The SDRs have a length between one and 50 amino acid residues, preferably a length between one and 15 amino acid residues, more preferably a length between one and six amino acid residues. Alternatively, the SDRs have a length between two and 20 amino acid residues, preferably a length between two and ten amino acid residues, more preferably a length between three and eight amino acid residues.
[0093]The inventive engineered enzymes can further be described as antibody-like protein molecules comprising constant and variable regions, but having a non-immunoglobulin backbone and having an active site (catalytic activity) in the constant region, whereby the substrate specificity of the active site is modulated by the variable region. Preferably, as in the immunoglobulin structure, the variable regions are loops of variable length and composition that interact with a target molecule.
[0094]In a particular variant of the invention, the engineered enzymes have hydrolase activity. In a preferred variant, the engineered enzymes have proteolytic activity. Particularly preferred protein scaffolds for this variant are unspecific proteases or are parts from unspecific proteases or are otherwise derived from unspecific proteases. The expressions "derived from" or "a derivative thereof" in this respect and in the following variants and embodiments refer to derivatives of proteins that are mutated at one or more amino acid positions and/or have a homology of at least 70%, preferably 90%, more preferably 95% and most preferably 99% to the original protein, and/or that are proteolytically processed, and/or that have an altered glycosylation pattern, and/or that are covalently linked to non-protein substances, and/or that are fused with further protein domains, and/or that have C-terminal and/or N-terminal truncations, and/or that have specific insertions, substitutions and/or deletions. Alternatively, "derived from" may refer to derivatives that are combinations or chimeras of two or more fragments from two or more proteins, each of which optionally comprises any or all of the aforementioned modifications. The tertiary structure of the protein scaffold can be of any type. Preferably, however, the tertiary structure belongs to one of the following structural classes: class S1 (chymotrypsin fold of the serine proteases family), class S8 (subtilisin fold of the serine proteases family), class SC (carboxypeptidase fold of the serine proteases family), class A1 (pepsin A fold of the aspartic proteases), or class C14 (caspase-1 fold of the cysteine proteases). Examples of proteases that can serve as the protein scaffold of engineered proteolytic enzymes for the use as human therapeutics are or are derived from human trypsin, human thrombin, human chymotrypsin, human pepsin, human endothiapepsin, human caspases 1 to 14, and/or human furin.
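The homology thresholds named above (70%, 90%, 95% and 99%) can be illustrated with a simple percent-identity calculation over a pair of pre-aligned sequences. This section does not specify how homology is to be computed; the sketch below assumes identity counted over aligned columns, with `-` as the gap character, and the tier labels and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned sequences ('-' marks a gap).

    Assumption: columns where both sequences have a gap are ignored and
    all remaining columns count towards the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Thresholds taken from the paragraph above, highest first.
THRESHOLDS = [(99.0, "most preferred"), (95.0, "more preferred"),
              (90.0, "preferred"), (70.0, "minimum")]


def homology_tier(identity):
    """Return the strictest tier met, or None if below 70%."""
    for threshold, label in THRESHOLDS:
        if identity >= threshold:
            return label
    return None


# Hypothetical ten-residue alignment with a single mismatch.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(identity, homology_tier(identity))
```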
[0095]The defined specificity of the engineered proteolytic enzymes is a measure of their ability to discriminate between at least one target peptide or protein substrate and one or more further peptide or protein substrates. Preferably, the defined specificity refers to the ability to discriminate peptide or protein substrates that differ in other positions than the P1 site, more preferably, the defined specificity refers to the ability to discriminate peptide or protein substrates that differ in other positions than the P1 site and the P1' site. Most preferably, the engineered proteolytic enzymes distinguish target peptide or protein substrates at as many sites as is necessary to preferentially hydrolyse the target substrate versus other proteins. As an example, a therapeutically useful engineered proteolytic enzyme applied intravenously in the human body should be sufficiently specific to discriminate between the target substrate and any other protein in the human serum. Preferably, such an engineered proteolytic enzyme recognizes and discriminates peptide substrates at three or more amino acid positions, more preferably at four or more positions, and even more preferably at five or more amino acid positions. These positions may either be adjacent or non-adjacent.
[0096]In a first embodiment, the protein scaffold has a tertiary structure or fold equal or similar to the tertiary structure or fold of the S1 structural subclass of serine proteases, i.e. the chymotrypsin fold, and/or has at least 70% identity on the amino acid level to a protein of the S1 structural subclass of serine proteases. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 18-25, 38-48, 54-63, 73-86, 122-130, 148-156, 165-171 and 194-204 in human trypsin I, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 20-23, 41-45, 57-60, 76-83, 125-128, 150-153, 167-169 and 197-201 (numbering of amino acids according to SEQ ID NO:1). The number of SDRs to be combined with this type of protein scaffold is preferably between 1 and 10, and more preferably between 2 and 4. Preferably, the protein scaffold is equal to or is a derivative or homologue of one or more of the following proteins: chymotrypsin, granzyme, kallikrein, trypsin, mesotrypsin, neutrophil elastase, pancreatic elastase, enteropeptidase, cathepsin, thrombin, ancrod, coagulation factor IXa, coagulation factor VIIa, coagulation factor Xa, activated protein C, urokinase, tissue-type plasminogen activator, plasmin, Desmodus-type plasminogen activator. More preferably, the protein scaffold is trypsin or thrombin or is a derivative or homologue from trypsin or thrombin. For the use as a human therapeutic, the trypsin or thrombin scaffold is most preferably of human origin in order to minimize the risk of an immune response or an allergenic reaction.
[0097]Preferably, derivatives with improved characteristics derived from human trypsin I or from proteins with similar tertiary structure are used. Preferred examples of such derivatives are derived from human trypsin I (SEQ ID NO:1) and comprise one or more of the following amino acid substitutions E56G; R78W; Y131F; A146T; C183R.
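The substitution notation used above (for example E56G, meaning glutamic acid at position 56 replaced by glycine) can be applied programmatically. The sketch below parses such codes and applies them to a sequence, checking that the stated original residue is actually present. The toy sequence and the substitutions applied to it are hypothetical; SEQ ID NO:1 is not reproduced in this section.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions such as 'E56G' (1-based positions)."""
    residues = list(sequence)
    for code in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
        if match is None:
            raise ValueError(f"cannot parse substitution {code!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            # Guards against applying a code to the wrong numbering scheme.
            raise ValueError(f"expected {old} at {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)


# Toy sequence, NOT human trypsin I.
toy_sequence = "MEKRYAC"
mutant = apply_substitutions(toy_sequence, ["E2G", "R4W"])
print(mutant)
```

Checking the original residue before substituting is a deliberate design choice: it catches numbering mismatches, which are easy to introduce when a derivative is numbered differently from the reference sequence.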
[0098]It is preferred that at least one of two SDRs is inserted into human trypsin I, or a derivative thereof, between residues 42 and 43 (SDR 1) and between 123 and 124 (SDR 2), respectively (numbering of amino acids according to SEQ ID NO:1). In addition the SDR 1 has a preferred length of 6 and the SDR 2 has a preferred length of 5 amino acids, respectively. In a preferred variant of this embodiment, the SDR 1 and SDR 2 sequences comprise one of the amino acid sequences listed in Table 2. Such engineered proteolytic enzymes have specificity for the target substrate B as exemplified in Example IV.
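The insertion of two SDRs at positions given in the numbering of the unmodified scaffold can be sketched as follows. The scaffold here is a placeholder string of 130 glycines and the two SDR sequences are arbitrary peptides of length 6 and 5; neither is taken from SEQ ID NO:1 or Table 2. Only the insertion positions (after residue 42 and after residue 123) follow the paragraph above.

```python
def insert_sdrs(scaffold, insertions):
    """Insert peptides into `scaffold`.

    `insertions` maps the 1-based residue number after which a peptide is
    inserted (original scaffold numbering) to the peptide sequence.
    """
    result = scaffold
    # Work from the C-terminal end so that earlier insertions do not
    # shift the positions of the ones still to be made.
    for pos in sorted(insertions, reverse=True):
        if not 0 <= pos <= len(scaffold):
            raise ValueError(f"position {pos} is outside the scaffold")
        result = result[:pos] + insertions[pos] + result[pos:]
    return result


# Placeholder scaffold and placeholder SDRs (hypothetical sequences).
placeholder_scaffold = "G" * 130
engineered = insert_sdrs(placeholder_scaffold, {42: "DEFHIK", 123: "LMNPQ"})
print(len(engineered))
print(engineered[42:48], engineered[129:134])
```

In the engineered sequence the second SDR starts six residues later than in the original numbering, because the first SDR has already lengthened the chain; keeping all positions in the original numbering and inserting from the C-terminal end avoids having to track this shift by hand.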
[0099]In a further embodiment the protein scaffold belongs to the S8 structural subclass of serine proteases and/or has a tertiary structure similar to subtilisin E from Bacillus subtilis and/or has at least 70% identity on the amino acid level to a protein of the S8 structural subclass of serine proteases. Preferably, the scaffold belongs to the subtilisin family or the human pro-protein convertases. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 6-17, 25-29, 47-55, 59-69, 101-111, 117-125, 129-137, 139-154, 158-169, 185-195 and 204-225 in subtilisin E from Bacillus subtilis, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 59-69, 101-111, 129-137, 158-169 and 204-225 (numbering of amino acids according to SEQ ID NO:7). It is preferred that the protein scaffold is equal to or is a derivative or homologue of one or more of the following proteins: subtilisin Carlsberg; B. subtilis subtilisin E; subtilisin BPN'; B. licheniformis subtilisin; B. lentus subtilisin; Bacillus alcalophilus alkaline protease; proteinase K; kexin; human pro-protein convertase; human furin. In a preferred variant, subtilisin BPN' or one of the proteins SPC 1 to 7 is used as the protein scaffold.
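The region lists given above for subtilisin E lend themselves to a simple lookup. The sketch below stores the listed regions and the more preferred subset as inclusive residue ranges (inclusive bounds are an assumption of this sketch) and reports into which category a given residue number falls. The function and variable names are illustrative.

```python
# Regions in subtilisin E as listed above (numbering per SEQ ID NO:7),
# treated here as inclusive residue ranges.
SDR_REGIONS = [(6, 17), (25, 29), (47, 55), (59, 69), (101, 111), (117, 125),
               (129, 137), (139, 154), (158, 169), (185, 195), (204, 225)]
MORE_PREFERRED_REGIONS = [(59, 69), (101, 111), (129, 137), (158, 169),
                          (204, 225)]


def find_region(position, regions):
    """Return the (start, end) region containing `position`, or None."""
    for start, end in regions:
        if start <= position <= end:
            return (start, end)
    return None


def classify_position(position):
    """Classify a residue number against the two region lists."""
    if find_region(position, MORE_PREFERRED_REGIONS) is not None:
        return "more preferred"
    if find_region(position, SDR_REGIONS) is not None:
        return "preferred"
    return "outside listed regions"


for residue_number in (10, 62, 20):
    print(residue_number, classify_position(residue_number))
```

For a homologous scaffold, the residue number would first have to be translated into subtilisin E numbering through a structural or sequence alignment before such a lookup is meaningful.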
[0100]In a further embodiment the protein scaffold belongs to the family of aspartic proteases and/or has a tertiary structure similar to human pepsin. Preferably, the scaffold belongs to the A1 class of proteases and/or has at least 70% identity on the amino acid level to a protein of the A1 class of proteases. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 6-18, 49-55, 74-83, 91-97, 112-120, 126-137, 159-164, 184-194, 242-247, 262-267 and 277-300 in human pepsin, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 10-15, 75-80, 114-118, 130-134, 186-191 and 280-296 (numbering of amino acids according to SEQ ID NO:11). It is preferred that the protein scaffold is equal to or is a derivative or homologue of one or more of the following proteins: pepsin, chymosin, renin, cathepsin, yapsin. Preferably, pepsin or endothiapepsin or a derivative or homologue thereof is used as the protein scaffold.
[0101]In a further embodiment the protein scaffold belongs to the cysteine protease family and/or has a tertiary structure similar to human caspase 7. Preferably the scaffold belongs to the C14 class of cysteine proteases or has at least 70% identity on the amino acid level to a protein of the C14 class of cysteine proteases. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 78-91, 144-160, 186-198, 226-243 and 271-291 in human caspase 7, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 80-86, 149-157, 190-194 and 233-238 (numbering of amino acids according to SEQ ID NO:14). It is preferred that the protein scaffold is equal to or is a derivative or homologue of one of the caspases 1 to 9.
[0102]In a further embodiment the protein scaffold belongs to the S11 class of serine proteases or has at least 70% identity on the amino acid level to a protein of the S11 class of serine proteases and/or has a tertiary structure similar to D-alanyl-D-alanine transpeptidase from Streptomyces species K15. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 67-79, 137-150, 191-206, 212-222 and 241-251 in D-alanyl-D-alanine transpeptidase from Streptomyces species K15, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 70-75, 141-147, 195-202 and 216-220 (numbering of amino acids according to SEQ ID NO:15). It is preferred that the D-alanyl-D-alanine transpeptidase from Streptomyces species K15 or a derivative or homologue thereof is used as the scaffold.
[0103]In a further embodiment the protein scaffold belongs to the S21 class of serine proteases or has at least 70% identity on the amino acid level to a protein of the S21 class of serine proteases and/or has a tertiary structure similar to assemblin from human cytomegalovirus. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 25-33, 64-69, 134-155, 162-169 and 217-244 in assemblin from human cytomegalovirus, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 27-31, 164-168 and 222-239 (numbering of amino acids according to SEQ ID NO:16). It is preferred that the assemblin from human cytomegalovirus or a derivative or homologue thereof is used as the scaffold.
[0104]In a further embodiment the protein scaffold belongs to the S26 class of serine proteases or has at least 70% identity on the amino acid level to a protein of the S26 class of serine proteases and/or has a tertiary structure similar to the signal peptidase from Escherichia coli. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 8-14, 57-68, 125-134, 239-254, 200-211 and 228-239 in signal peptidase from Escherichia coli, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 9-13, 60-67, 127-132 and 203-209 (numbering of amino acids according to SEQ ID NO:17). It is preferred that the signal peptidase from Escherichia coli or a derivative or homologue thereof is used as the scaffold.
[0105]In a further embodiment the protein scaffold belongs to the S33 class of serine proteases or has at least 70% identity on the amino acid level to a protein of the S33 class of serine proteases and/or has a tertiary structure similar to the prolyl aminopeptidase from Serratia marcescens. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 47-54, 152-160, 203-212 and 297-302 in prolyl aminopeptidase from Serratia marcescens, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 50-53, 154-158 and 206-210 (numbering of amino acids according to SEQ ID NO:18). It is preferred that the prolyl aminopeptidase from Serratia marcescens or a derivative or homologue thereof is used as the scaffold.
[0106]In a further embodiment the protein scaffold belongs to the S51 class of serine proteases or has at least 70% identity on the amino acid level to a protein of the S51 class of serine proteases and/or has a tertiary structure similar to aspartyl dipeptidase from Escherichia coli. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 8-16, 38-46, 85-92, 132-140, 159-170 and 205-211 in aspartyl dipeptidase from Escherichia coli, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 10-14, 87-90, 134-138 and 160-165 (numbering of amino acids according to SEQ ID NO:19). It is preferred that the aspartyl dipeptidase from Escherichia coli or a derivative or homologue thereof is used as the scaffold.
[0107]In a further embodiment the protein scaffold belongs to the A2 class of aspartic proteases or has at least 70% identity on the amino acid level to a protein of the A2 class of aspartic proteases and/or has a tertiary structure similar to the protease from human immunodeficiency virus. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 5-12, 17-23, 27-30, 33-38 and 77-83 in protease from human immunodeficiency virus, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 7-10, 18-21, 34-37 and 79-82 (numbering of amino acids according to SEQ ID NO:20). It is preferred that the protease from human immunodeficiency virus, preferably HIV-1 protease, or a derivative or homologue thereof is used as the scaffold.
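In the paragraph above, each "more preferably" region lies inside one of the broader regions listed before it. A minimal sketch of that relationship, using the region numbers given above for the protease from human immunodeficiency virus, is shown below. The helper names and the comma-separated input format are assumptions for illustration only.

```python
from typing import List, Tuple

Region = Tuple[int, int]


def parse_regions(text: str) -> List[Region]:
    """Parse a string such as '5-12, 17-23' into (start, end) tuples."""
    regions = []
    for item in text.split(","):
        start, end = item.strip().split("-")
        regions.append((int(start), int(end)))
    return regions


def is_contained(inner: Region, outer_regions: List[Region]) -> bool:
    """True if the inner region lies fully within at least one outer region."""
    return any(o_start <= inner[0] and inner[1] <= o_end
               for o_start, o_end in outer_regions)


# Region numbers as listed in the text for the HIV protease scaffold.
preferred = parse_regions("5-12, 17-23, 27-30, 33-38, 77-83")
more_preferred = parse_regions("7-10, 18-21, 34-37, 79-82")

print(all(is_contained(r, preferred) for r in more_preferred))
```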
[0108]In a further embodiment the protein scaffold belongs to the A26 class of aspartic proteases or has at least 70% identity on the amino acid level to a protein of the A26 class of aspartic proteases and/or has a tertiary structure similar to the omptin from Escherichia coli. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 28-40, 86-98, 150-168, 213-219 and 267-278 in omptin from Escherichia coli, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 33-38, 161-168 and 273-277 (numbering of amino acids according to SEQ ID NO:21). It is preferred that the omptin from Escherichia coli or a derivative or homologue thereof is used as the scaffold.
[0109]In a further embodiment the protein scaffold belongs to the C1 class of cysteine proteases or has at least 70% identity on the amino acid level to a protein of the C1 class of cysteine proteases and/or has a tertiary structure similar to the papain from Carica papaya. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 17-24, 61-68, 88-95, 135-142, 153-158 and 176-184 in papain from Carica papaya, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 63-66, 136-139 and 177-181 (numbering of amino acids according to SEQ ID NO:22). It is preferred that the papain from Carica papaya or a derivative or homologue thereof is used as the scaffold.
[0110]In a further embodiment the protein scaffold belongs to the C2 class of cysteine proteases or has at least 70% identity on the amino acid level to a protein of the C2 class of cysteine proteases and/or has a tertiary structure similar to human calpain-2. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 90-103, 160-172, 193-199, 243-260, 286-294 and 316-322 in human calpain-2, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 92-101, 245-250 and 287-291 (numbering of amino acids according to SEQ ID NO:23). It is preferred that the human calpain-2 or a derivative or homologue thereof is used as the scaffold.
[0111]In a further embodiment the protein scaffold belongs to the C4 class of cysteine proteases or has at least 70% identity on the amino acid level to a protein of the C4 class of cysteine proteases and/or has a tertiary structure similar to NIa protease from tobacco etch virus. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 23-31, 112-120, 144-150, 168-176 and 205-218 in NIa protease from tobacco etch virus, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 145-149, 169-174 and 212-218 (numbering of amino acids according to SEQ ID NO:24). It is preferred that the NIa protease from tobacco etch virus (TEV protease) or a derivative or homologue thereof is used as the scaffold.
[0112]In a further embodiment the protein scaffold belongs to the C10 class of cysteine proteases or has at least 70% identity on the amino acid level to a protein of the C10 class of cysteine proteases and/or has a tertiary structure similar to the streptopain from Streptococcus pyogenes. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 81-90, 133-140, 150-164, 191-199, 219-229, 246-256, 306-312 and 330-337 in streptopain from Streptococcus pyogenes, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 82-87, 134-138, 250-254 and 331-335 (numbering of amino acids according to SEQ ID NO:25). It is preferred that the streptopain from Streptococcus pyogenes or a derivative or homologue thereof is used as the scaffold.
[0113]In a further embodiment the protein scaffold belongs to the C19 class of cysteine proteases or has at least 70% identity on the amino acid level to a protein of the C19 class of cysteine proteases and/or has a tertiary structure similar to human ubiquitin specific protease 7. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 3-15, 63-70, 80-86, 248-256, 272-283 and 292-304 in human ubiquitin specific protease 7, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 10-15, 251-255, 277-281 and 298-304 (numbering of amino acids according to SEQ ID NO:26). It is preferred that the human ubiquitin specific protease 7 or a derivative or homologue thereof is used as the scaffold.
[0114]In a further embodiment the protein scaffold belongs to the C47 class of cysteine proteases or has at least 70% identity on the amino acid level to a protein of the C47 class of cysteine proteases and/or has a tertiary structure similar to the staphopain from Staphylococcus aureus. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 15-23, 57-66, 108-119, 142-149 and 157-164 in staphopain from Staphylococcus aureus, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 17-22, 111-117, 143-147 and 159-163 (numbering of amino acids according to SEQ ID NO:27). It is preferred that the staphopain from Staphylococcus aureus or a derivative or homologue thereof is used as the scaffold.
[0115]In a further embodiment the protein scaffold belongs to the C48 class of cysteine proteases or has at least 70% identity on the amino acid level to a protein of the C48 class of cysteine proteases and/or has a tertiary structure similar to the Ulp1 endopeptidase from Saccharomyces cerevisiae. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 40-51, 108-115, 132-141, 173-179 and 597-605 in Ulp1 endopeptidase from Saccharomyces cerevisiae, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 43-49, 110-113, 133-137 and 175-178 (numbering of amino acids according to SEQ ID NO:28). It is preferred that the Ulp1 endopeptidase from Saccharomyces cerevisiae or a derivative or homologue thereof is used as the scaffold.
[0116]In a further embodiment the protein scaffold belongs to the C56 class of cysteine proteases or has at least 70% identity on the amino acid level to a protein of the C56 class of cysteine proteases and/or has a tertiary structure similar to the Pfpl endopeptidase from Pyrococcus horikoshii. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 8-16, 40-47, 66-73, 118-125 and 147-153 in Pfpl endopeptidase from Pyrococcus horikoshii, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 9-14, 68-71, 120-123 and 148-151 (numbering of amino acids according to SEQ ID NO:29). It is preferred that the Pfpl endopeptidase from Pyrococcus horikoshii or a derivative or homologue thereof is used as the scaffold.
[0117]In a further embodiment the protein scaffold belongs to the M4 class of metallo proteases or has at least 70% identity on the amino acid level to a protein of the M4 class of metallo proteases and/or has a tertiary structure similar to thermolysin from Bacillus thermoproteolyticus. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 106-118, 125-130, 152-160, 197-204, 210-213 and 221-229 in thermolysin from Bacillus thermoproteolyticus, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 108-115, 126-129, 199-203 and 223-227 (numbering of amino acids according to SEQ ID NO:30). It is preferred that the thermolysin from Bacillus thermoproteolyticus or a derivative or homologue thereof is used as the scaffold.
[0118]In a further embodiment the protein scaffold belongs to the M10 class of metallo proteases or has at least 70% identity on the amino acid level to a protein of the M10 class of metallo proteases and/or has a tertiary structure similar to human collagenase. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 2-7, 68-79, 85-90, 107-111 and 135-141 in human collagenase, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 3-6, 71-78 and 136-140 (numbering of amino acids according to SEQ ID NO:31). It is preferred that human collagenase or a derivative or homologue thereof is used as the scaffold.
[0119]It is further preferred that the engineered enzymes have glycosidase activity. A particularly suited protein scaffold for this variant is a glycosylase or is derived from a glycosylase. Preferably, the tertiary structure belongs to one of the following structural classes: class GH13, GH7, GH12, GH11, GH10, GH28, GH26, and GH18 (beta/alpha)8 barrel.
[0120]In a first embodiment the protein scaffold belongs to the GH13 class of glycosylases or has at least 70% identity on the amino acid level to a protein of the GH13 class of glycosylases and/or has a tertiary structure similar to human pancreatic alpha-amylase. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 50-60, 100-110, 148-167, 235-244, 302-310 and 346-359 in human pancreatic alpha-amylase, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 51-58, 148-155 and 303-309 (numbering of amino acids according to SEQ ID NO:32). It is preferred that human pancreatic alpha-amylase or a derivative or homologue thereof is used as the scaffold.
[0121]In a further embodiment the protein scaffold belongs to the GH7 class of glycosylases or has at least 70% identity on the amino acid level to a protein of the GH7 class of glycosylases and/or has a tertiary structure similar to cellulase from Trichoderma reesei. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 47-56, 93-104, 173-182, 215-223, 229-236 and 322-334 in cellulase from Trichoderma reesei, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 175-180, 218-222 and 324-332 (numbering of amino acids according to SEQ ID NO:33). It is preferred that cellulase from Trichoderma reesei or a derivative or homologue thereof is used as the scaffold.
[0122]In a further embodiment the protein scaffold belongs to the GH12 class of glycosylases or has at least 70% identity on the amino acid level to a protein of the GH12 class of glycosylases and/or has a tertiary structure similar to cellulase from Aspergillus niger. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 18-28, 55-60, 106-113, 126-132 and 149-159 in cellulase from Aspergillus niger, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 20-26, 56-59, 108-112 and 151-156 (numbering of amino acids according to SEQ ID NO:34). It is preferred that cellulase from Aspergillus niger or a derivative or homologue thereof is used as the scaffold.
[0123]In a further embodiment the protein scaffold belongs to the GH11 class of glycosylases or has at least 70% identity on the amino acid level to a protein of the GH11 class of glycosylases and/or has a tertiary structure similar to xylanase from Aspergillus niger. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 7-14, 33-39, 88-97, 114-126 and 158-167 in xylanase from Aspergillus niger, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 20-26, 56-59, 108-112 and 151-156 (numbering of amino acids according to SEQ ID NO:35). It is preferred that xylanase from Aspergillus niger or a derivative or homologue thereof is used as the scaffold.
[0124]In a further embodiment the protein scaffold belongs to the GH10 class of glycosylases or has at least 70% identity on the amino acid level to a protein of the GH10 class of glycosylases and/or has a tertiary structure similar to xylanase from Streptomyces lividans. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 21-29, 42-50, 84-92, 130-136, 206-217 and 269-278 in xylanase from Streptomyces lividans, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 43-49, 86-90, 208-213 and 271-276 (numbering of amino acids according to SEQ ID NO:36). It is preferred that xylanase from Streptomyces lividans or a derivative or homologue thereof is used as the scaffold.
[0125]In a further embodiment the protein scaffold belongs to the GH28 class of glycosylases or has at least 70% identity on the amino acid level to a protein of the GH28 class of glycosylases and/or has a tertiary structure similar to pectinase from Aspergillus niger. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 82-88, 118-126, 171-178, 228-236, 256-264 and 289-299 in pectinase from Aspergillus niger, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 116-124, 174-178 and 291-296 (numbering of amino acids according to SEQ ID NO:37). It is preferred that pectinase from Aspergillus niger or a derivative or homologue thereof is used as the scaffold.
[0126]In a further embodiment the protein scaffold belongs to the GH26 class of glycosylases or has at least 70% identity on the amino acid level to a protein of the GH26 class of glycosylases and/or has a tertiary structure similar to mannanase from Pseudomonas cellulosa. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 75-83, 113-125, 174-182, 217-224, 247-254, 324-332 and 325-340 in mannanase from Pseudomonas cellulosa, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 115-123, 176-180, 286-291 and 328-337 (numbering of amino acids according to SEQ ID NO:38). It is preferred that mannanase from Pseudomonas cellulosa or a derivative or homologue thereof is used as the scaffold.
[0127]In a further embodiment the protein scaffold belongs to the GH18 (beta/alpha)8 barrel class of glycosylases or has at least 70% identity on the amino acid level to a protein of the GH18 class of glycosylases and/or has a tertiary structure similar to chitinase from Bacillus circulans. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 21-29, 57-65, 130-136, 176-183, 221-229, 249-257 and 327-337 in chitinase from Bacillus circulans, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 59-63, 178-181, 250-254 and 330-336 (numbering of amino acids according to SEQ ID NO:39). It is preferred that chitinase from Bacillus circulans or a derivative or homologue thereof is used as the scaffold.
[0128]It is further preferred that the engineered enzymes have esterhydrolase activity. Preferably, the protein scaffold for this variant has lipase, phosphatase, phytase, or phosphodiesterase activity.
[0129]In a first embodiment the protein scaffold belongs to the GX class of esterases or has at least 70% identity on the amino acid level to a protein of the GX class of esterases and/or has a tertiary structure similar to the structure of the lipase B from Candida antarctica. Preferably, the scaffold has lipase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 139-148, 188-195, 216-224, 256-266, 272-287 in lipase B from Candida antarctica, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 141-146, 218-222, 259-263 and 275-283 (numbering of amino acids according to SEQ ID NO:40). It is preferred that lipase B from Candida antarctica or a derivative or homologue thereof is used as the scaffold.
[0130]In a further embodiment the protein scaffold belongs to the GX class of esterases or has at least 70% identity on the amino acid level to a protein of the GX class of esterases and/or has a tertiary structure similar to the pancreatic lipase from guinea pig. Preferably, the scaffold has lipase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 78-90, 91-100, 112-120, 179-186, 207-218, 238-247 and 248-260 in pancreatic lipase from guinea pig, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 80-87, 114-118, 209-215 and 239-246 (numbering of amino acids according to SEQ ID NO:41). It is preferred that pancreatic lipase from guinea pig or a derivative or homologue thereof is used as the scaffold.
[0131]In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the alkaline phosphatase from Escherichia coli or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the alkaline phosphatase from Escherichia coli. Preferably, the scaffold has phosphatase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 110-122, 187-142, 170-175, 186-193, 280-287 and 425-435 in alkaline phosphatase from Escherichia coli, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 171-174, 187-191, 282-286 and 426-433 (numbering of amino acids according to SEQ ID NO:42). It is preferred that alkaline phosphatase from Escherichia coli or a derivative or homologue thereof is used as the scaffold.
[0132]In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the bovine pancreatic deoxyribonuclease I or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the bovine pancreatic deoxyribonuclease I. Preferably, the scaffold has phosphodiesterase activity. More preferably, a nuclease, and most preferably, an unspecific endonuclease or a derivative thereof is used as the scaffold. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 14-21, 41-47, 72-77, 97-111, 135-143, 171-178, 202-209 and 242-251 in bovine pancreatic deoxyribonuclease I, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 16-19, 42-46, 136-141 and 172-176 (numbering of amino acids according to SEQ ID NO:43). It is preferred that bovine pancreatic deoxyribonuclease I or human deoxyribonuclease I or a derivative or homologue thereof is used as the scaffold.
[0133]It is further preferred that the engineered enzyme has transferase activity. A particularly suited protein scaffold for this variant is a glycosyl-, a phospho- or a methyltransferase, or is a derivative thereof. Particularly preferred protein scaffolds for this variant are glycosyltransferases or are derived from glycosyltransferases. The tertiary structure of the protein scaffold can be of any type. Preferably, however, the tertiary structure belongs to one of the following structural classes: GH13 and GT1.
[0134]In a first embodiment the protein scaffold belongs to the GH13 class of transferases or has at least 70% identity on the amino acid level to a protein of the GH13 class of transferases and/or has a tertiary structure similar to the structure of the cyclomaltodextrin glucanotransferase from Bacillus circulans. Preferably, the scaffold has transferase activity, and more preferably a glycosyltransferase is used as the scaffold. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 38-48, 85-94, 142-154, 178-186, 259-266, 331-340 and 367-377 in cyclomaltodextrin glucanotransferase from Bacillus circulans, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 87-92, 180-185, 261-264 and 269-275 (numbering of amino acids according to SEQ ID NO:44). It is preferred that cyclomaltodextrin glucanotransferase from Bacillus circulans or a derivative or homologue thereof is used as the scaffold.
[0135]In a further embodiment the protein scaffold belongs to the GT1 class of transferases or has at least 70% identity on the amino acid level to a protein of the GT1 class of transferases and/or has a tertiary structure similar to the structure of the glycosyltransferase from Amycolatopsis orientalis A82846. Preferably the scaffold has transferase activity, and more preferably glycosyltransferase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 58-74, 130-138, 185-193, 228-236 and 314-323 in glycosyltransferase from Amycolatopsis orientalis A82846, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 61-71, 230-234 and 316-321 (numbering of amino acids according to SEQ ID NO:45). It is preferred that the glycosyltransferase from Amycolatopsis orientalis A82846 or a derivative or homologue thereof is used as the scaffold.
[0136]It is further preferred that the engineered enzymes have oxidoreductase activity. A particularly suited protein scaffold for this variant is a monooxygenase, a dioxygenase or an alcohol dehydrogenase, or a derivative thereof. The tertiary structure of the protein scaffold can be of any type.
[0137]In a first embodiment the protein scaffold has a tertiary structure similar to the structure of the 2,3-dihydroxybiphenyl dioxygenase from Pseudomonas sp. or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the 2,3-dihydroxybiphenyl dioxygenase from Pseudomonas sp. Preferably, the scaffold has dioxygenase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 172-185, 198-206, 231-237, 250-259 and 282-287 in 2,3-dihydroxybiphenyl dioxygenase from Pseudomonas sp., and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 175-182, 200-204, 252-257 and 284-287 (numbering of amino acids according to SEQ ID NO:46). It is preferred that the 2,3-dihydroxybiphenyl dioxygenase from Pseudomonas sp. or a derivative or homologue thereof is used as the scaffold.
[0138]In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the catechol dioxygenase from Acinetobacter sp. or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the catechol dioxygenase from Acinetobacter sp. Preferably, the scaffold has dioxygenase activity, and more preferably catechol dioxygenase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 66-72, 105-112, 156-171 and 198-207 in catechol dioxygenase from Acinetobacter sp., and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 107-110, 161-171 and 201-205 (numbering of amino acids according to SEQ ID NO:47). It is preferred that the catechol dioxygenase from Acinetobacter sp. or a derivative or homologue thereof is used as the scaffold.
[0139]In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the camphor-5-monooxygenase from Pseudomonas putida or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the camphor-5-monooxygenase from Pseudomonas putida. Preferably, the scaffold has monooxygenase activity, and more preferably camphor monooxygenase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 26-31, 57-63, 84-98, 182-191, 242-256, 292-299 and 392-399 in camphor-5-monooxygenase from Pseudomonas putida, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 85-96, 183-188, 244-253, 293-298 and 393-398 (numbering of amino acids according to SEQ ID NO:48). It is preferred that the camphor-5-monooxygenase from Pseudomonas putida or a derivative or homologue thereof is used as the scaffold.
[0140]In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the alcohol dehydrogenase from Equus caballus or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the alcohol dehydrogenase from Equus caballus. Preferably, the scaffold has alcohol dehydrogenase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 49-63, 111-112, 294-301 and 361-369 in alcohol dehydrogenase from Equus caballus, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 51-61 and 295-299 (numbering of amino acids according to SEQ ID NO:49). It is preferred that the alcohol dehydrogenase from Equus caballus or a derivative or homologue thereof is used as the scaffold.
[0141]It is further preferred that the engineered enzymes have lyase activity. A particularly suited protein scaffold for this variant is an oxoacid lyase or a derivative thereof. Particularly preferred protein scaffolds for this variant are aldolases or synthases, or are derived therefrom. The tertiary structure of the protein scaffold can be of any type, but a (beta/alpha)8 barrel structure is preferred.
[0142]In a first embodiment the protein scaffold has a tertiary structure similar to the structure of the N-acetyl-D-neuraminic acid aldolase from Escherichia coli or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the N-acetyl-D-neuraminic acid aldolase from Escherichia coli. Preferably, the scaffold has aldolase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 45-55, 78-87, 105-113, 137-146, 164-171, 187-193, 205-210, 244-255 and 269-276 in N-acetyl-D-neuraminic acid aldolase from Escherichia coli, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 45-52, 138-144, 189-192, 247-253 and 271-275 (numbering of amino acids according to SEQ ID NO:50). It is preferred that the N-acetyl-D-neuraminic acid aldolase from Escherichia coli or a derivative or homologue thereof is used as the scaffold.
[0143]In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the tryptophan synthase from Salmonella typhimurium or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the tryptophan synthase from Salmonella typhimurium. Preferably, the scaffold has synthase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 56-63, 127-134, 154-161, 175-193, 209-216 and 230-240 in tryptophan synthase from Salmonella typhimurium, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 57-62, 155-160, 178-190 and 210-215 (numbering of amino acids according to SEQ ID NO:51). It is preferred that the tryptophan synthase from Salmonella typhimurium or a derivative or homologue thereof is used as the scaffold.
[0144]It is further preferred that the engineered enzymes have isomerase activity. A particularly suited protein scaffold for this variant is an isomerase converting aldoses or ketoses, or is a derivative thereof.
[0145]In a first embodiment, the protein scaffold has a tertiary structure similar to the structure of the xylose isomerase from Actinoplanes missouriensis or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the xylose isomerase from Actinoplanes missouriensis. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 18-31, 92-103, 136-147, 178-188 and 250-257 in xylose isomerase from Actinoplanes missouriensis, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 20-27, 92-99 and 180-186 (numbering of amino acids according to SEQ ID NO:52). It is preferred that the xylose isomerase from Actinoplanes missouriensis or a derivative or homologue thereof is used as the scaffold.
[0146]It is further preferred that the engineered enzymes have ligase activity. A particularly suited protein scaffold for this variant is a DNA ligase, or is a derivative thereof.
[0147]In a first embodiment, the protein scaffold has a tertiary structure similar to the structure of the DNA ligase from Bacteriophage T7 or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to the structure of the DNA ligase from Bacteriophage T7. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 52-60, 94-108, 119-131, 241-248, 255-263 and 302-318 in DNA ligase from Bacteriophage T7, and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions 96-106, 121-129, 256-262 and 304-316 (numbering of amino acids according to SEQ ID NO:53). It is preferred that the DNA ligase from Bacteriophage T7 or a derivative or homologue thereof is used as the scaffold.
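The preceding embodiments repeatedly refer to positions in a scaffold that correspond "by amino acid sequence homology" to numbered regions of a reference enzyme. The following is a minimal sketch of one way such a correspondence could be computed from a pairwise alignment. The function name map_region and the toy sequences are hypothetical and are not taken from any SEQ ID of this application; the gapped alignment itself is assumed to have been produced beforehand by an alignment tool.

```python
def map_region(aligned_ref, aligned_homolog, start, end):
    """Map a 1-based inclusive region of the reference onto homolog numbering.

    Both arguments are rows of the same pairwise alignment, with '-' as the
    gap character. Returns (first, last) homolog positions aligned to the
    region, or None if the region aligns only to gaps.
    """
    if len(aligned_ref) != len(aligned_homolog):
        raise ValueError("alignment rows must have equal length")
    ref_pos = hom_pos = 0
    mapped = []
    for r, h in zip(aligned_ref, aligned_homolog):
        if r != "-":
            ref_pos += 1
        if h != "-":
            hom_pos += 1
        # Only columns where both sequences have a residue define a correspondence.
        if r != "-" and h != "-" and start <= ref_pos <= end:
            mapped.append(hom_pos)
    return (mapped[0], mapped[-1]) if mapped else None


# Toy alignment (made-up sequences, for illustration only).
reference = "MK-TAYIAK"
homolog = "MKQTA-IAK"
region_in_homolog = map_region(reference, homolog, 3, 6)
```

In this toy case reference positions 3 to 6 map to homolog positions 4 to 6, because the homolog carries one extra residue before the region and lacks one residue inside it.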
[0148]A second aspect of the invention is directed to the application of engineered enzymes with specificities for therapeutic, research, diagnostic, nutritional, personal care or industrial purposes. The application comprises at least the following steps: [0149](a) identification of a target peptide substrate whose hydrolysis has a positive effect in connection with the intended purpose, such as curing a disease, diagnosing a disease, processing of ingredients for human or animal nutrition, or other technical processes; [0150](b) provision of an engineered enzyme, the enzyme being specific for the target peptide identified in step (a); and [0151](c) use of the enzyme as provided in step (b) for the intended purpose.
[0152]In a first variant of this aspect of the invention, the engineered enzyme is used as a therapeutic means to inactivate a disease-related target substrate. This application comprises at least the following steps: [0153](a) identification of a target substrate whose function is connected to a disease and whose inactivation has a positive effect in connection with the disease, and determination of a target site within the target substrate characterized by the fact that modification at the target site leads to the inactivation of the target substrate; [0154](b) provision of an engineered enzyme, the enzyme being specific for the target site identified in step (a); and [0155](c) use of the enzyme for the inactivation of the target substrate inside or outside the human body.
[0156]In a preferred embodiment the scaffold of the engineered enzyme provided in step (b) is of human origin in order to avoid or reduce immunogenicity or allergenic effects associated with the application of the enzyme in the human body. In a more preferred embodiment of this variant, the scaffold is that of a human protease and the modification is hydrolysis of a target site in a protein target. Preferably, the hydrolysis leads to the activation or inactivation of the peptide or protein target. Potential peptide or protein targets include: cytokines, growth factors, peptide hormones, interleukins, interferons, enzymes from the coagulation cascade, serpins, immunoglobulins, soluble or membrane-bound receptors, cellular or viral surface proteins, peptide drugs, protein drugs.
[0157]A particularly preferred embodiment is based on the finding that the engineered enzyme is capable of cleaving human tumor necrosis factor-alpha (TNF-α). The engineered enzymes or the fusion protein can thus be used for preparing medicaments for the treatment of inflammatory diseases (as well as other diseases connected with TNF-α). Preferably, said engineered enzyme or said fusion protein is capable of specifically inactivating human tumor necrosis factor-alpha (hTNF-α), more preferably said engineered enzyme or said fusion protein is capable of hydrolysing the peptide bond between positions 31/32, 32/33, 44/45, 87/88, 128/129 and/or 141/142 (most preferred between positions 31/32 and 32/33) in hTNF-α (SEQ ID NO:96).
[0158]In a further embodiment, the target substrate is a pro-drug which is activated by the engineered enzyme. In a particular embodiment of this variant, the engineered enzyme has proteolytic activity and the target substrate is a protein target which is proteolytically activated. Examples of such pro-drugs are pro-proteins such as the inactivated forms of coagulation factors. In another particular variant, the engineered enzyme is an oxidoreductase and the target substrate is a chemical that can be activated by oxidation.
[0159]In a second variant of this aspect of the invention, the engineered enzyme is used as a technical means in order to catalyze an industrially or nutritionally relevant reaction with defined specificity. In a particular embodiment of this variant the engineered enzyme has proteolytic activity, the catalyzed reaction is a proteolytic processing, and the engineered enzyme specifically hydrolyses one or more industrially or nutritionally relevant protein substrates. In a preferred embodiment of this variant the engineered enzyme hydrolyses one or more industrially or nutritionally relevant protein substrates at specific sites, thereby leading to industrially or nutritionally desired product properties such as texture, taste or precipitation characteristics. In a further particular embodiment of this variant, the engineered enzyme catalyzes the hydrolysis of glycosidic bonds (glycosidase or glycosylase activity). Then, preferably, the catalyzed reaction is a polysaccharide processing, and the engineered enzyme specifically hydrolyses one or more industrially, technically or nutritionally relevant polysaccharide substrates. In a further particular embodiment of this variant, the engineered enzyme catalyzes the hydrolysis of triglyceride esters or lipids (lipase activity). Then, preferably, the catalyzed reaction is a lipid processing step, and the engineered enzyme specifically hydrolyses one or more industrially, technically or nutritionally relevant lipid substrates. In a further particular variant of this embodiment, the engineered enzyme catalyzes the oxidation or reduction of substrates (oxidoreductase activity). Then, preferably, the engineered enzyme specifically oxidizes or reduces one or more industrially, technically or nutritionally relevant chemical substrates.
[0160]A third aspect of the invention is directed to a method for generating engineered enzymes with specificities that are qualitatively and/or quantitatively novel in combination with the protein scaffold. The inventive method comprises at least the following steps: [0161](a) providing a protein scaffold capable of catalyzing at least one chemical reaction on at least one target substrate, [0162](b) generating a library of engineered enzymes or isolated engineered enzymes by combining the protein scaffold from step (a) with one or more fully or partially random peptide sequences at sites in the protein scaffold that enable the resulting engineered enzyme to discriminate between at least one target substrate and one or more different substrates, and [0163](c) selecting out of the library of engineered enzymes generated in step (b) one or more enzymes that have defined specificities towards at least one target substrate.
[0164]In a first variant of this aspect of the invention, the inventive method comprises at least the following steps: [0165](a) providing a protein scaffold capable of catalyzing at least one chemical reaction on at least one target substrate, [0166](b) generating a library of engineered enzymes or isolated engineered enzymes by inserting into the protein scaffold from step (a) one or more fully or partially random peptide sequences at sites in the protein scaffold that enable the resulting engineered enzyme to discriminate between at least one target substrate and one or more different substrates, and [0167](c) selecting out of the library of engineered enzymes generated in step (b) one or more enzymes that have defined specificities towards at least one target substrate.
[0168]Preferably, the positions at which the one or more fully or partially random peptide sequences are combined with or inserted into the protein scaffold are identified prior to the combination or insertion.
[0169]The number of insertions or other combinations of fully or partially random peptide sequences as well as their length may vary over a wide range. The number is at least one, preferably more than one, more preferably between two and eleven, most preferably between two and six. The length of such fully or partially random peptide sequences is usually less than 50 amino acid residues. Preferably, the length is between one and 15 amino acid residues, more preferably between one and six amino acid residues. Alternatively, the length is between two and 20 amino acid residues, preferably between two and ten amino acid residues, more preferably between three and eight amino acid residues.
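As a rough illustration of how the number and length of random peptide sequences drive library size, the following sketch counts the theoretical number of distinct peptide-level variants. It assumes fully random positions drawn from the 20 standard amino acids; this is an assumption made for illustration, and partially random sequences would give smaller numbers. The function name theoretical_diversity is hypothetical.

```python
def theoretical_diversity(sdr_lengths, alphabet_size=20):
    """Count distinct combinations of fully random inserts.

    sdr_lengths: iterable with the length (in residues) of each inserted
    random peptide sequence.
    """
    total = 1
    for length in sdr_lengths:
        if length < 1:
            raise ValueError("each insert must be at least one residue long")
        total *= alphabet_size ** length
    return total


# Two inserts of three residues span the same sequence space as one of six.
one_insert_of_six = theoretical_diversity([6])
two_inserts_of_three = theoretical_diversity([3, 3])
```

Under these assumptions the count grows by a factor of 20 per additional random residue, which is one practical reason to keep individual inserts short when the library must be screened exhaustively.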
[0170]Preferably such insertions or other combinations are performed on the DNA level, using polynucleotides encoding such protein scaffolds and polynucleotides or oligonucleotides encoding such fully or partially random peptide sequences.
[0171]Optionally, steps (a) to (c) are repeated cyclically, whereby enzymes selected in step (c) serve as the protein scaffold in step (a) of a further cycle, and randomized peptide sequences are either inserted or, alternatively, substituted for peptide sequences that have been inserted in former cycles. Thereby, the number of inserted peptide sequences is either constant or increases over the cycles. The cycles are repeated until one or more enzymes with the intended specificities are generated.
[0172]Moreover, during or after one or more rounds of steps (a) to (c), the scaffold may be mutated at one or more positions in order to make the scaffold more acceptable for the combination with SDR sequences, and/or to increase catalytic activity at a specific pH and temperature, and/or to change the glycosylation pattern, and/or to decrease sensitivity towards enzyme inhibitors, and/or to change enzyme stability.
[0173]In a second variant of this aspect of the invention, the inventive method comprises at least the following steps:
[0174](a) providing a first protein scaffold fragment,
[0175](b) connecting said protein scaffold fragment via a peptide linkage with a first SDR, and optionally
[0176](c) connecting the product of step (b) via a peptide linkage with a further SDR peptide or with a further protein scaffold fragment, and optionally
[0177](d) repeating step (c) for as many cycles as necessary in order to generate a sufficiently specific enzyme, and
[0178](e) selecting out of the population generated in steps (a)-(d) one or more enzymes that have the desired specificities toward the one or more target substrates.
[0179]Protein scaffold fragment means a part of the sequence of a protein scaffold. A protein scaffold is comprised of at least two protein scaffold fragments.
[0180]In a third variant of this aspect of the invention, the protein scaffold, the SDRs and the engineered enzyme are encoded by a DNA sequence and an expression system is used in order to produce the protein. In an alternative variant, the protein scaffold, the SDRs and/or the engineered enzyme are chemically synthesized from peptide building blocks.
[0181]In a fourth variant of this aspect of the invention, the inventive method comprises at least the following steps:
[0182](a) providing a polynucleotide encoding a protein scaffold capable of catalyzing one or more chemical reactions on one or more target substrates;
[0183](b) combining one or more fully or partially random oligonucleotide sequences with the polynucleotide encoding the protein scaffold, the fully or partially random oligonucleotide sequences being located at sites in the polynucleotide that enable the encoded engineered enzyme to discriminate between the one or more target substrates and one or more other substrates; and
[0184](c) selecting out of the population generated in step (b) one or more polynucleotides that encode enzymes that have the defined specificities toward the one or more target substrates.
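Step (b) of this variant can be sketched in silico as follows. The sketch inserts a run of random codons into a scaffold coding sequence at a chosen codon boundary. The NNK degeneracy scheme (N = A, C, G or T; K = G or T) is an assumption introduced here for illustration and is not specified in the text; it is a commonly used scheme that encodes all 20 amino acids while permitting only one stop codon (TAG). The function names and the toy scaffold sequence are hypothetical.

```python
import random


def insert_nnk_codons(scaffold_dna, codon_index, n_codons, rng):
    """Insert n_codons random NNK codons before the codon at codon_index (0-based)."""
    if len(scaffold_dna) % 3 != 0:
        raise ValueError("scaffold_dna must be a whole number of codons")
    if not 0 <= codon_index <= len(scaffold_dna) // 3:
        raise ValueError("codon_index outside the scaffold")
    insert = "".join(
        rng.choice("ACGT") + rng.choice("ACGT") + rng.choice("GT")
        for _ in range(n_codons)
    )
    cut = 3 * codon_index
    return scaffold_dna[:cut] + insert + scaffold_dna[cut:]


def make_library(scaffold_dna, codon_index, n_codons, size, seed=0):
    """Generate `size` variants, each carrying an independent random insert."""
    rng = random.Random(seed)
    return [
        insert_nnk_codons(scaffold_dna, codon_index, n_codons, rng)
        for _ in range(size)
    ]


# Toy scaffold of four codons (made up); insert three random codons after codon 2.
toy_scaffold = "ATGGCTAGCAAA"
library = make_library(toy_scaffold, codon_index=2, n_codons=3, size=5)
```

Selection in step (c) would then operate on the enzymes expressed from such polynucleotides; that step is experimental and is not modeled here.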
[0185]Any enzyme can serve as the protein scaffold in step (a). It can be a naturally occurring enzyme, a variant or a truncated derivative thereof, or an engineered enzyme. For human therapeutic use, the protein scaffold is preferably a mammalian enzyme, and more preferably a human enzyme. In that aspect, the invention is directed to a method for the generation of essentially mammalian, especially of essentially human enzymes with specificities that are different from specificities of any enzyme encoded in mammalian genomes or in the human genome, respectively.
[0186]According to the invention, the protein scaffold provided in step (a) of this aspect must be capable of catalyzing one or more chemical reactions on a target substrate. Therefore, a protein scaffold is selected from the group of potential protein scaffolds by its activity on the target substrate.
[0187]In a preferred variant of this aspect of the invention, a protein scaffold with hydrolase activity is used. Preferably, a protein scaffold with proteolytic activity is used, and more preferably, a protease with very low specificity having basic activity on the target substrate is used as the protein scaffold. Examples of proteases from different structural classes with low substrate specificity are Papain, Trypsin, Chymotrypsin, Subtilisin, SET (trypsin-like serine protease from Streptomyces erythraeus), Elastase, Cathepsin G or Chymase. Before being employed as the protein scaffold, the amino acid sequence of the protease may be modified in order to change protein properties other than specificity, e.g. catalytic activity, stability, inhibitor sensitivity, or expression yield, essentially as described in WO 92/18645, or in order to change specificity, essentially as described in EP 02020576.3 and PCT/EP03/04864.
[0188]Lipases are another option for a feasible protein scaffold. Hepatic lipase, lipoprotein lipase and pancreatic lipase belong to the "lipoprotein lipase superfamily", which in turn is an example of the GX-class of lipases (M. Fischer, J. Pleiss (2003), Nucl. Acid. Res., 31, 319-321). The substrate specificity of lipases can be characterized by their relative activity towards triglycerides and towards phospholipids, which bear a charged head group. Alternatively, other hydrolases such as esterases, glycosylases, amidases, or nitrilases may be used as scaffolds.
[0189]Transferases are also feasible protein scaffolds. Glycosyltransferases are involved in many biological syntheses involving a variety of donors and acceptors. Alternatively, the protein scaffold may have ligase, lyase, oxidoreductase, or isomerase activity.
[0190]In a first embodiment, the one or more fully or partially random peptide sequences are inserted at specific sites in the protein scaffold. These insertion sites are characterized by the fact that the inserted peptide sequences can act as discriminators between different substrates, i.e. as Specificity Determining Regions or SDRs. Such insertion sites can be identified by several approaches. Preferably, insertion sites are identified by analysis of the three-dimensional structure of the protein scaffolds, by comparative analysis of the primary sequences of the protein scaffold with other enzymes having different quantitative specificities, or experimentally by techniques such as alanine scanning, random mutagenesis, or random deletion, or by any combination thereof.
[0191]A first approach to identify insertion sites for SDRs is based on the three-dimensional structure of the protein scaffold as it can be obtained by x-ray crystallography or by nuclear magnetic resonance studies. Structural alignment of the protein scaffold in comparison with other enzymes of the same structural class but having different quantitative specificities reveals regions of high structural similarity and regions with low structural similarity. Such an analysis can for example be done using public software such as Swiss PDB viewer (Guex, N. and Peitsch, M. C. (1997) Electrophoresis 18, 2714-2723). Regions of low structural similarity are preferred SDR insertion sites.
[0192]In a second approach to identify insertion sites for SDRs, three-dimensional structures of the scaffold protein in complex with competitive inhibitors or substrate analogs are analysed. It is assumed that the binding site of a competitive inhibitor significantly overlaps with the binding site of the substrate. In that case, atoms of the protein that are within a certain distance of atoms of the inhibitor are likely to be at a similar distance from the substrate as well. Choosing a short distance, e.g. <5 Å, will result in an ensemble of protein atoms that are in close contact with the substrate. The residues containing these atoms would constitute the first shell contacts and are therefore preferred insertion sites for SDRs. Once first shell contacts have been identified, second shell contacts can be found by repeating the distance analysis starting from first shell atoms. In yet another alternative of the invention the distance analysis described above is performed starting from the active site residues.
[0193]In a third approach to identify insertion sites for SDRs, the primary sequence of the scaffold protein is aligned with other enzymes of the same structural class but having different quantitative specificities using an alignment algorithm. Examples of such alignment algorithms are published (Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990) J. Mol. Biol. 215:403-410; "Statistical methods in Bioinformatics: an introduction" by Ewens, W. & Grant, G. R. 2001, Springer, New York). Such an alignment may reveal conserved and non-conserved regions with varying sequence homology, and, in particular, additional sequence elements in one or more enzymes compared to the scaffold protein. Conserved regions are more likely to contribute to phenotypes shared among the different proteins, e.g. stabilizing the three-dimensional fold. Non-conserved regions and, in particular, additional sequences in enzymes with quantitatively higher specificity (Turner, R. et al. (2002) J. Biol. Chem., 277, 33068-33074) are preferred insertion sites for SDRs.
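The distance analysis of this second approach can be sketched as follows. The sketch assumes that atom coordinates (in Å) have already been extracted from a structure of the scaffold in complex with an inhibitor; the data layout, the function names and the toy coordinates are hypothetical. The 5 Å cutoff follows the example value given above.

```python
import math


def residues_within(protein_atoms, probe_coords, cutoff=5.0):
    """Return ids of residues with at least one atom closer than cutoff to any probe atom.

    protein_atoms: list of (residue_id, (x, y, z)) tuples.
    probe_coords: list of (x, y, z) tuples.
    """
    hits = set()
    for residue_id, coord in protein_atoms:
        if any(math.dist(coord, probe) < cutoff for probe in probe_coords):
            hits.add(residue_id)
    return hits


def contact_shells(protein_atoms, inhibitor_coords, cutoff=5.0):
    """First shell: residues near the inhibitor. Second shell: residues near the first shell."""
    first = residues_within(protein_atoms, inhibitor_coords, cutoff)
    first_coords = [coord for residue_id, coord in protein_atoms if residue_id in first]
    second = residues_within(protein_atoms, first_coords, cutoff) - first
    return first, second


# Toy coordinates on a line (made up, one atom per residue).
toy_protein = [
    (57, (0.0, 0.0, 0.0)),
    (102, (4.0, 0.0, 0.0)),
    (195, (8.0, 0.0, 0.0)),
    (214, (20.0, 0.0, 0.0)),
]
toy_inhibitor = [(1.0, 0.0, 0.0)]
first_shell, second_shell = contact_shells(toy_protein, toy_inhibitor)
```

Passing the coordinates of the active site residues instead of the inhibitor atoms gives the alternative analysis mentioned at the end of the paragraph.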
[0194]For proteases currently five families are known, namely aspartic-, cysteine-, serine-, metallo- and threonine proteases. Each family includes groups of proteases that share a similar fold. Crystallographic structures of members of these groups have been solved and are accessible through public databases, e.g. the Brookhaven protein database (H. M. Berman et al. Nucleic Acids Research, 28 pp. 235-242 (2000)). Such databases also include structural homologs in other enzyme classes and nonenzymatically active proteins of each class. Several tools are available to search public databases for structural homologues: SCOP--a structural classification of proteins database for the investigation of sequences and structures. (Murzin A. G. et al. (1995) J. Mol. Biol. 247, 536-540); CATH--Class, Architecture, Topology and Homologous superfamily: a hierarchical classification of protein domain structures (Orengo et al. (1997) Structure 5(8) 1093-1108); FSSP--Fold classification based on structure-structure alignment of proteins (Holm and Sander (1998) Nucl. Acids Res. 26 316-319); or VAST--Vector alignment search tool (Gibrat, Madej and Bryant (1996) Current Opinion in Structural Biology 6, 377-385).
[0195]In the above described approaches, members of structural classes are compared in order to identify insertion sites for SDRs.
[0196]In a preferred variant of these approaches serine proteases of the structural class S1 are compared with each other. Trypsin represents a member with low substrate specificity, as it requires only an arginine or lysine residue at the P1 position. On the other hand, thrombin, tissue-type plasminogen activator or enterokinase all have a high specificity towards their substrate sequences, i.e. (L/I/V/F)XPR↓NA, CPGR↓VVGG and DDDK↓, respectively (Perona, J. & Craik, C. (1997) J. Biol. Chem., 272, 29987-29990; Perona, J. & Craik, C (1995) Protein Science, 4, 337-360). An alignment of the amino acid sequences of these proteases is described in example 1 (FIG. 2) along with the identification of SDRs.
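The difference between low and high substrate specificity can be illustrated by counting candidate cleavage sites in a substrate sequence. The following sketch encodes the recognition sequences named above as simplified regular expressions, under the labeled assumption that cleavage follows the final arginine or lysine of each motif; residues on the C-terminal side of the cleaved bond are ignored. The motif table, the function name and the toy substrate are hypothetical, and real proteases depend on far more than a linear motif.

```python
import re

# Simplified motifs ending at the P1 residue (assumption: cleavage occurs after it).
MOTIFS = {
    "trypsin": "[RK]",
    "thrombin": "[LIVF].PR",
    "enterokinase": "DDDK",
}


def cleavage_sites(substrate, motif):
    """Return 1-based positions of the P1 residue for every motif match.

    A lookahead is used so that overlapping matches are all reported.
    """
    sites = []
    for match in re.finditer("(?=(" + motif + "))", substrate):
        sites.append(match.start() + len(match.group(1)))
    return sites


# Toy substrate (made up) containing one enterokinase-like and one thrombin-like site.
toy_substrate = "MADDDKLVPRGS"
sites_by_enzyme = {name: cleavage_sites(toy_substrate, m) for name, m in MOTIFS.items()}
```

In this toy substrate the trypsin-like motif matches after both the lysine and the arginine, whereas each of the longer motifs matches only once, which is the sense in which the longer recognition sequences are more specific.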
[0197]A further example within the family of serine proteases is given by members of the structural class S8 (subtilisin fold). Subtilisin is the type protease for this class and represents an unspecific protease (Ottesen, M. & Svendsen, A. (1998) Methods Enzymol. 19, 199-215). Furin, PC1 and PC5 are proteases of the same structural class involved in the processing of propeptides and have a high substrate specificity (Seidah, N. & Chretien, M. (1997) Curr. Opin. Biotech., 8: 602-607; Bergeron, F. et al. (2000) J. Mol. Endocrin., 24:1-22). In a preferred variant of the approach alignments of the primary amino acid sequences (FIG. 4) are used to identify eleven sequence stretches longer than three amino acids which the specific proteases have in addition compared to subtilisin and which are therefore potential specificity determining regions. In a further variant of the approach information from the three-dimensional structure of subtilisin can be used in order to further narrow down the selection (FIG. 3). Out of the eleven inserted sequence stretches, three are especially close to the active site residues, namely stretches number 7, 8 and 11, which are insertions in PC5, PC1 and all three specific proteases, respectively (FIG. 3). In a preferred variant, one or several amino acid stretches of variable length and composition can be inserted into the subtilisin sequence at one or several of the eleven positions. In a more preferred variant of the approach the insertion is performed at regions 7, 8 or 11 or any combination thereof. In another preferred variant of the approach protease scaffolds other than subtilisin from the structural class S8 are used.
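The identification of sequence stretches longer than three amino acids that a specific protease has in addition to the scaffold can be sketched as a scan over a gapped pairwise alignment. The sketch below assumes the alignment has already been computed; the function name and the toy sequences are hypothetical and do not reproduce FIG. 4.

```python
def insertion_stretches(aligned_scaffold, aligned_specific, min_len=4):
    """Find runs where the scaffold row has gaps and the specific enzyme has residues.

    Returns a list of (scaffold_position, inserted_sequence), where
    scaffold_position is the number of scaffold residues preceding the run.
    min_len=4 corresponds to "longer than three amino acids".
    """
    if len(aligned_scaffold) != len(aligned_specific):
        raise ValueError("alignment rows must have equal length")
    stretches = []
    scaffold_pos = 0
    run = []
    for s, t in zip(aligned_scaffold, aligned_specific):
        if s == "-" and t != "-":
            run.append(t)
            continue
        # Any other column ends the current run.
        if len(run) >= min_len:
            stretches.append((scaffold_pos, "".join(run)))
        run = []
        if s != "-":
            scaffold_pos += 1
    if len(run) >= min_len:
        stretches.append((scaffold_pos, "".join(run)))
    return stretches


# Toy alignment (made up): one four-residue insertion and one two-residue insertion.
toy_scaffold_row = "AQ----SVPY--GI"
toy_specific_row = "AQGKDLSVPYTNGI"
found = insertion_stretches(toy_scaffold_row, toy_specific_row)
```

Only the four-residue run is reported; the two-residue run falls below the length threshold. Narrowing the candidates further by proximity to the active site would require structural information, as described above.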
[0198]In a further preferred variant of this approach, aspartic acid proteases of the structural class A1 are analyzed (Rawlings, N. D. & Barrett, A. J. (1995). Methods Enzymol. 248, 105-120; Chitpinityol, S. & Crabbe, M J. (1998), Food Chemistry, 61, 395-418). Examples for the A1 structural class of aspartic proteases are pepsin, with a low substrate specificity, as well as beta-secretase (Gruninger-Leitch, F., et al. (2002) J. Biol. Chem. 277, 4687-4693) and renin (Wang, W. & Liang, T C. (1994) Biochemistry, 33, 14636-14641), with relatively high substrate specificities. Retroviral proteases also belong to this class, although the active enzyme is a dimer of two identical subunits. The viral proteases are essential for the correct processing of the polyprotein precursor to generate functional proteins, which requires a high substrate specificity in each case (Wu, J. et al. (1998) Biochemistry, 37, 4518-4526; Pettit, S. et al. (1991) J. Biol. Chem., 266, 14539-14547). Pepsin is the type protease for this class and represents an unspecific protease (Kageyama, T. (2002) Cell. Mol. Life Sci. 59, 288-306). Beta-secretase and Cathepsin D (Aguilar, C. F. et al. (1995) Adv. Exp. Med. Biol. 362, 155-166) are proteases of the same structural class and have a high substrate specificity. In a preferred variant of the approach alignments of the primary amino acid sequences (FIG. 6) are used to identify six sequence stretches longer than three amino acids which are inserted in the specific proteases compared to pepsin and are therefore potential specificity determining regions. In a further variant of the approach information from the three-dimensional structure of beta-secretase can be used in order to further narrow down the selection. Out of the six inserted sequence stretches, three are especially close to the active site residues, namely stretches number 1, 3 and 4, which are insertions in cathepsin D and beta-secretase, respectively (FIG. 5). In a preferred variant of the approach, one or several amino acid stretches of variable length and composition can be inserted into the pepsin sequence at one or several of the six positions. In a more preferred embodiment of the invention the insertion is performed at the positions 1, 3 or 4 or any combination thereof. In another preferred embodiment of the invention protease scaffolds other than pepsin are used.
[0199]There are cases where a certain structural class does not include known members of low and high specificity. This is exemplified by the C14 class of caspases which belong to the cysteine protease family (Rawlings, N. D. & Barrett, A. J. (1994) Methods Enzymol. 244, 461-486) and which all show high specificity for P4 to P1 positions. For example, caspase-1, caspase-3 and caspase-9 recognize the sequences YVAD , DEVD or LEHD , respectively. Identification of the regions that differ between the caspases will include the regions responsible for the differences in substrate specificity (FIGS. 7 and 8).
[0200]Finally, non-enzymatic proteins of the same fold as the enzyme scaffold may also contribute to the identification of insertion sites for SDRs. For example, haptoglobin (Arcoleo, J. & Greer, J.; (1982) J. Biol. Chem. 257, 10063-10068) and azurocidin (Almeida, R. et al. (1991) Biochem. Biophys. Res. Commun. 177, 688-695) share the same chymotrypsin-like fold with all S1 proteases. Due to substitutions in the active site residues these proteins do not possess any proteolytic function, yet they show high homology with active proteases. Differences between these proteins and specific proteases include regions that can serve as insertion sites for SDRs.
[0201]In a fourth approach, insertion sites for SDRs are identified experimentally by techniques such as alanine scanning, random mutagenesis, random insertion or random deletion. In contrast to the approach disclosed above, this approach does not require detailed knowledge about the three-dimensional structure of the scaffold protein. In one preferred variant of this approach, random mutagenesis of enzymes with relatively high specificity from the same structural class as the protein scaffold and screening for loss or change of specificity can be used to identify insertion sites for SDRs in the protein scaffold.
[0202]Random mutagenesis, alanine scanning, random insertion or random deletion are all done on the level of the polynucleotides encoding the enzymes. There are a variety of protocols known in the literature (e.g. Sambrook, J. F; Fritsch, E. F.; Maniatis, T.; Cold Spring Harbor Laboratory Press, Second Edition, 1989, New York). For example, random mutagenesis can be achieved by the use of a polymerase as described in patent WO 9218645. According to this patent, the one or more genes encoding the one or more proteases are amplified by use of a DNA polymerase with a high error rate or under conditions that increase the rate of misincorporations. For example, the method of Cadwell and Joyce can be employed (Cadwell, R. C. and Joyce, G. F., PCR Methods Appl. 2 (1992) 28-33). Other methods of random mutagenesis such as, but not limited to, the use of mutator strains, chemical mutagens or UV-radiation can be employed as well.
[0203]Alternatively, oligonucleotides can be used for mutagenesis that substitute randomly distributed amino acid residues with an alanine. This method is generally referred to as alanine scanning mutagenesis (Fersht, A. R. Biochemistry (1989) 8031-8036). As a further alternative, modifications of the alanine scanning mutagenesis such as binomial mutagenesis (Gregoret, L. M. and Sauer, R. T. PNAS (1993) 4246-4250) or combinatorial alanine scanning (Weiss et al., PNAS (2000) 8950-8954) can be employed.
[0204]In order to express engineered enzymes, the DNA encoding such engineered proteins is ligated into a suitable expression vector by standard molecular cloning techniques (e.g. Sambrook, J. F; Fritsch, E. F.; Maniatis, T.; Cold Spring Harbor Laboratory Press, Second Edition, 1989, New York). The vector is introduced into a suitable expression host cell, which expresses the corresponding engineered enzyme variant. Particularly suitable expression hosts are bacterial expression hosts such as Escherichia coli or Bacillus subtilis, or yeast expression hosts such as Saccharomyces cerevisiae or Pichia pastoris, or mammalian expression hosts such as Chinese Hamster Ovary (CHO) or Baby Hamster Kidney (BHK) cell lines, or viral expression systems such as bacteriophages like M13 or Lambda, or viruses such as the Baculovirus expression system. As a further alternative, systems for in vitro protein expression can be used. Typically, the DNA is ligated into an expression vector behind a suitable signal sequence that leads to secretion of the enzyme variants into the extracellular space, thereby allowing direct detection of protease activity in the cell supernatant. Particularly suitable signal sequences for Escherichia coli are HlyA, for Bacillus subtilis AprE, NprB, Mpr, AmyA, AmyE, Blac, SacB, and for S. cerevisiae Bar1, Suc2, Matα, Inu1A, Ggplp. Alternatively, the enzyme variants are expressed intracellularly and the substrates are also expressed intracellularly. Preferably, this is done essentially as described in patent application WO 0212543, using a fusion peptide substrate comprising two auto-fluorescent proteins linked by the substrate amino-acid sequence. As a further alternative, after intracellular expression of the enzyme variants, or secretion into the periplasmic space using signal sequences such as DsbA, PhoA, PelB, OmpA, OmpT or gIII for Escherichia coli, a permeabilisation or lysis step releases the enzyme variants into the supernatant. 
The destruction of the membrane barrier can be forced by the use of mechanical means such as ultrasonication or a French press, or by the use of membrane-digesting enzymes such as lysozyme. As yet another alternative, the genes encoding the enzyme variants are expressed cell-free by the use of a suitable cell-free expression system. For example, the S30 extract from Escherichia coli cells is used for this purpose as described by Lesley et al. (Methods in Molecular Biology 37 (1995) 265-278).
[0205]The ensemble of gene variants generated and expressed by any of the above methods is analyzed with respect to affinity, substrate specificity or activity by appropriate assay and screening methods as described in detail for example in patent application PCT/EP03/04864. Genes from catalytically active variants having reduced specificity in comparison to the original enzyme are analyzed by sequencing. Sites at which mutations and/or insertions and/or deletions occurred are preferred insertion sites at which SDRs can be inserted site-specifically.
[0206]In a second embodiment, the one or more fully or partially random peptide sequences are inserted at random sites in the protein scaffold. This modification is usually done on the polynucleotide level, i.e. by inserting nucleotide sequences into the gene that encodes the protein scaffold. Several methods are available that enable the random insertion of nucleotide sequences. Systems that can be used for random insertion are for example ligation based systems (Murakami et al. Nature Biotechnology 20 (2002) 76-81), systems based on DNA polymerisation and transposon based systems (e.g. GPS-M® mutagenesis system, NEB Biolabs; MGS® mutation generation system, Finnzymes). The transposon-based methods employ a transposase-mediated insertion of a selectable marker gene that contains at its termini recognition sequences for the transposase as well as two sites for a rare-cutting restriction endonuclease. Using the latter endonuclease one usually releases the selection marker and after religation obtains an insertion. Instead of performing the religation one can alternatively insert a fragment that has terminal recognition sequences for one or two outside-cutting restriction endonucleases as well as a selectable marker. After ligation, one releases this fragment using the one or two outside-cutting endonucleases. After creating blunt ends by standard methods one inserts blunt-ended random fragments at random positions into the gene.
[0207]In a further preferred embodiment, methods for homologous in-vitro recombination are used to combine the mutations introduced by the above mentioned methods to generate enzyme populations. Examples of methods that can be applied are the Recombination Chain Reaction (RCR) according to patent application WO 0134835, the DNA-Shuffling method according to the patent application WO 9522625, the Staggered Extension method according to patent WO 9842728, or the Random Priming recombination according to patent application WO9842728. Furthermore, also methods for non-homologous recombination such as the Itchy method can be applied (Ostermeier, M. et al. Nature Biotechnology 17 (1999) 1205-1209).
[0208]Upon random insertion of a nucleotide sequence into the protein scaffold one obtains a library of different genes encoding enzyme variants. The polynucleotide library is subsequently transferred to an appropriate expression vector. Upon expression in a suitable host or by use of an in vitro expression system, a library of enzymes containing randomly inserted stretches of amino acids is obtained.
[0209]According to step (b) of this third aspect of the invention, one or more fully or partially random peptide sequences are inserted into the protein scaffold. The actual number of such inserted SDRs is determined by the intended quantitative specificity following the relation: the higher the intended specificity is, the more SDRs are inserted. Whereas a single SDR enables the generation of moderately specific enzymes, two SDRs already enable the generation of significantly specific enzymes. However, up to six or more SDRs can be inserted into a protein scaffold. A similar relation is valid for the length of the SDRs: the higher the intended specificity is, the longer are the SDRs that are to be inserted. SDRs can be as short as one to four amino acid residues. They can, however, also be as long as 50 amino acid residues. Significant specificity can already be generated by the use of SDRs of a length of four to six amino acid residues.
[0210]The peptide sequences that are inserted can be fully or partially random. In this context, fully random means that a set of sequences is inserted in parallel that includes sequences that differ from each other in each and every position. Partially random means that a set of sequences is inserted in parallel that includes sequences that differ from each other in at least one position. This difference can be either pair-wise or with respect to a single sequence. For example, when regarding an insertion of the length of four amino acids, partially random could be a set (i) that includes AGGG, GVGG, GGLG, GGGI, or (ii) that includes AGGG, VGGG, LGGG and IGGG. Alternatively, random sequences also comprise sequences that differ from each other in length. Randomization of the peptide sequences is achieved by randomization of the nucleotide sequences that are inserted into the gene at the respective sites. Thereby, randomization can be achieved by employing mixtures of nucleobases as monomers during chemical synthesis of the oligonucleotides. A particularly preferred mixture of monomers for a fully random codon that in addition minimizes the probability of stop codons is NN(GTC), i.e. any nucleotide at the first two positions and G, T or C at the third position. Alternatively, random oligonucleotides can be obtained by fragmentation of DNA into short fragments that are inserted into the gene at the respective sites. The source of the DNA to be fragmented may be a synthetic oligonucleotide but alternatively may originate from cloned genes, cDNAs, or genomic DNA. Preferably, the DNA is a gene encoding an enzyme. The fragmentation can, for example, be achieved by random endonucleolytic digestion of DNA. Preferably, an unspecific endonuclease such as DNAse I (e.g. from bovine pancreas) is employed for the endonucleolytic digestion.
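The effect of the NN(GTC) codon scheme on stop codon frequency can be checked with a short enumeration. The following is a minimal sketch assuming the standard genetic code; the helper name `codon_stats` is hypothetical. It compares a fully degenerate NNN codon with NN(GTC).

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codon_stats(third_position_bases):
    """Enumerate codons with any base at positions 1 and 2 and the given
    bases at position 3. Returns (codon count, stop codons, amino acids)."""
    codons = ["".join(c) for c in product(BASES, BASES, third_position_bases)]
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    return len(codons), stops, amino_acids

nnn = codon_stats("TCAG")  # fully degenerate codon
nnb = codon_stats("GTC")   # NN(GTC) as described in the text

for label, (count, stops, amino_acids) in (("NNN", nnn), ("NN(GTC)", nnb)):
    print(f"{label}: {count} codons, stops {stops}, {len(amino_acids)} amino acids")
```

Under the standard code, NN(GTC) covers 48 codons with TAG as the only stop codon (1 in 48), compared with three stop codons in 64 for NNN, while still encoding all 20 amino acids.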
[0211]If steps (a)-(c) of the inventive method are repeated cyclically, there are different alternatives for obtaining random peptide sequences that are inserted in consecutive rounds. Preferably, SDRs that were identified in one round as leading to increased specificity of enzyme are used as templates for the random peptide sequences that are inserted in the following round.
[0212]In a preferred alternative, the sequences selected in one round are analysed and randomized oligonucleotides are generated based on these sequences. This can, for example, be achieved by using, at each position in the oligonucleotide synthesis, the original nucleotide together with a certain percentage of a mixture of the other three nucleotide monomers. If, for example, in a first round an SDR is identified that has the amino acid sequence ARLT, e.g. encoded by the nucleotide sequence GCG CGC CTT ACC, a random peptide sequence inserted in this SDR site could be encoded by an oligonucleotide with 70% G, 10% A, 10% T and 10% C at the first position, 70% C, 10% G, 10% T and 10% A at the second position, etc. This leads at each amino acid position approximately in 1 of 3 cases to the template amino acid and in 2 of 3 cases to another amino acid.
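The "1 of 3" estimate can be reproduced numerically. The sketch below assumes the standard genetic code and independent doping at every nucleotide position with 70% template base and 10% of each other base; the function names are hypothetical. It reports both the probability that a codon is left unchanged and the probability that the encoded amino acid is retained once synonymous codons are counted.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def base_probs(template_base, p_template=0.7):
    """Nucleotide mixture at one position: template base plus equal shares of the rest."""
    other = (1.0 - p_template) / 3.0
    return {b: (p_template if b == template_base else other) for b in BASES}

def retention(template_codon, p_template=0.7):
    """Return (P(codon unchanged), P(encoded amino acid unchanged))."""
    probs = [base_probs(b, p_template) for b in template_codon]
    target = CODON_TABLE[template_codon]
    p_codon = probs[0][template_codon[0]] * probs[1][template_codon[1]] * probs[2][template_codon[2]]
    p_amino = sum(
        probs[0][x] * probs[1][y] * probs[2][z]
        for x, y, z in product(BASES, repeat=3)
        if CODON_TABLE[x + y + z] == target
    )
    return p_codon, p_amino

# Template ARLT encoded by GCG CGC CTT ACC, as in the example above.
for codon in ("GCG", "CGC", "CTT", "ACC"):
    p_codon, p_amino = retention(codon)
    print(f"{codon} ({CODON_TABLE[codon]}): codon {p_codon:.3f}, amino acid {p_amino:.3f}")
```

Under these assumptions each codon stays unchanged with probability 0.7 × 0.7 × 0.7 ≈ 0.34, which matches the stated ratio. Because of codon degeneracy the template amino acid itself is retained somewhat more often in this simple model, for example with probability 0.49 for the alanine codon GCG.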
[0213]In another preferred alternative, the sequences selected in one round are analyzed and a consensus library is generated based on these sequences. This can, for example, be achieved by using defined mixtures of nucleotides at each position in the oligonucleotide synthesis in a way that leads to mixtures of the amino acid residues that were identified at each position of the SDR selected in the previous round. If, for example, in a first round two SDRs are identified that have the amino acid sequences ARLT and VPGS, a consensus library inserted in this SDR site in the following round could be encoded by an oligonucleotide with the sequence G(C/T)G C(G/C)C (G/T)(G/T)G (A/T)CC. This would correspond to the random peptide sequence (A/V)(R/P)(L/G/V/W)(T/S), thereby allowing all combinations of the amino acid residues identified in the first round, and, due to the degeneracy of the genetic code, allowing in addition to a lower degree alternative amino acid residues at some positions.
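The consensus oligonucleotide given above can be expanded to verify which peptides it encodes. This is a minimal sketch assuming the standard genetic code; the parsing helper and function names are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def parse_degenerate(oligo):
    """Turn 'G(C/T)G ...' into a list with the allowed bases at each position."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", oligo.replace(" ", ""))
    return [alts.replace("/", "") if alts else single for alts, single in tokens]

def encoded_peptides(oligo):
    """Expand every nucleotide combination and translate it codon by codon."""
    peptides = set()
    for bases in product(*parse_degenerate(oligo)):
        seq = "".join(bases)
        peptides.add("".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3)))
    return peptides

library = encoded_peptides("G(C/T)G C(G/C)C (G/T)(G/T)G (A/T)CC")
per_position = [sorted({p[i] for p in library}) for i in range(4)]
print(len(library), per_position)
```

The expansion yields 32 peptides that include both ARLT and VPGS. The third position carries L, G, V and W, which shows the additional residues (V and W) introduced by mixing two nucleotide positions within one codon.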
[0214]In another preferred alternative, the sequences selected in one round are, without previous analysis, recombined using methods for the in vitro recombination of polynucleotides, such as the methods described in WO 01/34835 (the following also provides details of the eighth and ninth aspect of the invention).
[0215]After insertion of the partially or fully random sequences into the gene encoding the scaffold protein, and optionally ligation of the resulting gene into a suitable expression vector using standard molecular cloning techniques (Sambrook, J. F; Fritsch, E. F.; Maniatis, T.; Cold Spring Harbor Laboratory Press, Second Edition, 1989, New York), the vector is introduced into a suitable expression host cell which expresses the corresponding enzyme variant. Particularly suitable expression hosts are bacterial expression hosts such as Escherichia coli or Bacillus subtilis, or yeast expression hosts such as Saccharomyces cerevisiae or Pichia pastoris, or mammalian expression hosts such as Chinese Hamster Ovary (CHO) or Baby Hamster Kidney (BHK) cell lines, or viral expression systems such as bacteriophages like M13, T7 or Lambda, or viruses such as the Baculovirus expression system. As a further alternative, systems for in vitro protein expression can be used. Typically, the DNA is ligated into an expression vector behind a suitable signal sequence that leads to secretion of the enzyme variants into the extracellular space, thereby allowing direct detection of enzyme activity in the cell supernatant. Particularly suitable signal sequences for Escherichia coli are ompA, pelB, HlyA, for Bacillus subtilis AprE, NprB, Mpr, AmyA, AmyE, Blac, SacB, and for S. cerevisiae Bar1, Suc2, Matα, Inu1A, Ggplp. Alternatively, the enzyme variants are expressed intracellularly and the substrates are also expressed intracellularly. For protease variants, this is done essentially as described in patent application WO 0212543, using a fusion peptide substrate comprising two auto-fluorescent proteins linked by the substrate amino-acid sequence. 
As a further alternative, after intracellular expression of the enzyme variants, or secretion into the periplasmic space using signal sequences such as DsbA, PhoA, PelB, OmpA, OmpT or gIII for Escherichia coli, a permeabilisation or lysis step releases the enzyme variants into the supernatant. The destruction of the membrane barrier can be forced by the use of mechanical means such as ultrasonication or a French press, or by the use of membrane-digesting enzymes such as lysozyme. As yet another alternative, the genes encoding the enzyme variants are expressed cell-free by the use of a suitable cell-free expression system. For example, the S30 extract from Escherichia coli cells is used for this purpose as described by Lesley et al. (Methods in Molecular Biology 37 (1995) 265-278).
[0216]After introduction of the vector into host cells, these cells are screened for the expression of enzymes with specificity for the intended target substrate. Such screening is typically done by separating the cells from each other, in order to enable the correlation of genotype and phenotype, and assaying the activity of each cell clone after a growth and expression period. Such separation can for example be done by distribution of the cells into the compartments of sample carriers, e.g. as described in WO 01/24933. Alternatively, the cells are separated by streaking on agar plates, by enclosing in a polymer such as agarose, by filling into capillaries, or by similar methods.
[0217]Identification of variants with the intended specificity can be done by different approaches. In the case of proteases, preferably assays using peptide substrates essentially as described in PCT/EP03/04864 are employed.
[0218]Regardless of the expression format, selection of enzyme variants is done under conditions that allow identification of enzymes that recognize and convert the target sequence preferably. As a first alternative, enzymes that recognize and convert the target sequence preferably are identified by screening for enzymes with a high affinity for the target substrate sequence. High affinity corresponds to a low KM which is selected by screening at target substrate concentrations substantially below the KM of the first enzyme. Preferably, the substrates that are used are linked to one or more fluorophores that enable the detection of the modification of the substrate at concentrations below 10 μM, preferably below 1 μM, more preferably below 100 nM, and most preferably below 10 nM.
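The rationale for screening at substrate concentrations well below the KM can be illustrated numerically. The sketch below assumes simple Michaelis-Menten kinetics, which the text does not state explicitly, and uses hypothetical kinetic constants. At low substrate concentration the rate is approximately (kcat/KM)·[E]·[S], so variants with a lower KM stand out; at saturating concentration the difference largely disappears.

```python
def mm_rate(kcat, km, substrate, enzyme=1.0):
    """Michaelis-Menten rate v = kcat * [E] * [S] / (KM + [S])."""
    return kcat * enzyme * substrate / (km + substrate)

# Hypothetical constants: both enzymes share kcat, the variant has a 100-fold lower KM.
KCAT = 10.0          # per second
KM_PARENT = 100e-6   # 100 micromolar
KM_VARIANT = 1e-6    # 1 micromolar

def variant_advantage(substrate):
    """Ratio of variant rate to parent rate at a given substrate concentration."""
    return mm_rate(KCAT, KM_VARIANT, substrate) / mm_rate(KCAT, KM_PARENT, substrate)

low = variant_advantage(10e-9)   # 10 nM, far below both KM values
high = variant_advantage(1e-3)   # 1 mM, above both KM values
print(f"rate ratio at 10 nM: {low:.1f}, at 1 mM: {high:.2f}")
```

With these assumed values the variant is roughly 100-fold faster than the parent at 10 nM substrate but only about 10% faster at 1 mM, which is why fluorophore-labeled substrates detectable at nanomolar concentrations are useful for this kind of screen.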
[0219]As a second alternative, enzymes that recognize and convert the target substrate preferably are identified by employing two or more substrates in the assay and screening for activity on these two or more substrates in comparison. Preferably, the two or more substrates employed are linked to different marker molecules, thereby enabling the detection of the modification of the two or more substrates consecutively or in parallel. In the case of proteases, particularly preferably two peptide substrates are employed, one peptide substrate having an arbitrarily chosen or even partially or fully random amino-acid sequence thereby enabling to monitor the activity on an arbitrary substrate, and the other peptide substrate having an amino-acid sequence identical to or resembling the intended target substrate sequence thereby enabling to monitor the activity on the target substrate. Especially preferably, these two peptide substrates are linked to fluorescent marker molecules, and the fluorescent properties of the two peptide substrates are sufficiently different in order to distinguish both activities when measured consecutively or in parallel. For example, a fusion protein comprising a first autofluorescent protein, a peptide, and a second autofluorescent protein according to patent application WO 0212543 can be used for this purpose. Alternatively, fluorophores such as rhodamines are linked chemically to the peptide substrates.
[0220]As a third alternative, enzymes that recognize and convert the target substrate preferably are identified by employing one or more substrates resembling the target substrate together with competing substrates in high excess. Screening with respect to activity on the substrates resembling the target substrate is then done in the presence of the competing substrates. Enzymes having a specificity which corresponds qualitatively to the target specificity, but having only a low quantitative specificity, are identified as negative samples in such a screen, whereas enzymes having a specificity which corresponds qualitatively and quantitatively to the target specificity are identified positively. Preferably, the one or more substrates resembling the target substrate are linked to marker molecules, thereby enabling the detection of their modifications, whereas the competing substrates do not carry marker molecules. The competing substrates have arbitrarily chosen or random amino-acid sequences, thereby acting as competitive inhibitors for the hydrolysis of the marker-carrying substrates. For example, protein hydrolysates such as tryptone can serve as competing substrates for engineered proteolytic enzymes according to the invention.
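A quantitative readout for such a competition screen can be sketched as follows. The measure used here follows the legends of FIGS. 13 and 14, where specificity is expressed as the ratio of cleavage rates in the presence and absence of competitor and normalized to the unmodified scaffold. The function names and all rate values are hypothetical.

```python
def competition_ratio(rate_with_competitor, rate_without_competitor):
    """Fraction of activity on the labeled substrate that survives the competitor."""
    if rate_without_competitor <= 0:
        raise ValueError("rate without competitor must be positive")
    return rate_with_competitor / rate_without_competitor

def relative_specificity(variant_rates, scaffold_rates):
    """Competition ratio of a variant normalized to that of the scaffold.

    Each argument is a (rate_with_competitor, rate_without_competitor) pair.
    """
    return competition_ratio(*variant_rates) / competition_ratio(*scaffold_rates)

# Hypothetical cleavage rates in arbitrary units.
scaffold = (0.1, 1.0)   # unspecific scaffold: activity drops 10-fold with competitor
variant = (0.25, 0.5)   # variant: lower absolute activity, but only a 2-fold drop

print(relative_specificity(variant, scaffold))
```

A value above 1 indicates that the variant is less affected by the excess of competing substrate than the scaffold, independent of its absolute activity.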
[0221]As a fourth alternative, enzymes that recognize and convert the target substrate preferably are identified and selected by an amplification-coupled or growth-coupled selection step. Furthermore, the activity can be measured intracellularly and the selection can be done by a cell sorter, such as a fluorescence-activated cell sorter.
[0222]As a further alternative, enzymes that recognize and convert the target substrate are identified by first selecting enzymes that preferentially bind to the target substrate, and secondly selecting out of this subgroup of enzyme variants those enzymes that convert the target substrate. Selection for enzymes that preferentially bind the target substrate can be done either by selection of binders to the target substrate or by counter-selection of enzymes that bind to other substrates. Methods for the selection of binders or for the counter-selection of non-binders are known in the art. Such methods typically require phenotype-genotype coupling, which can be achieved by using surface display expression methods. Such methods include, for example, phage or viral display, cell surface display and in vitro display. Phage or viral display typically involves fusion of the protein of interest to a viral/phage protein. Cell surface display, i.e. either bacterial or eukaryotic cell display, typically involves fusion of the protein of interest to a peptide or protein that is located at the cell surface. In in-vitro display, the protein is typically made in vitro and linked directly or indirectly to the mRNA encoding the protein (DE 19646372).
[0223]The invention also provides for a composition or pharmaceutical composition comprising one or more engineered enzymes according to the first aspect of the invention as defined herein before. The composition may optionally comprise an acceptable carrier, excipient and/or auxiliary agent. Non-pharmaceutical compositions as defined herein are research composition, nutritional composition, cleaning composition, disinfection composition, cosmetic composition or composition for personal care. Moreover, DNA sequences coding for the engineered enzyme as defined herein before and vectors containing said DNA sequences are also provided. Finally, transformed host cells (prokaryotic or eukaryotic) or transgenic organisms containing such DNA sequences and/or vectors, as well as a method utilizing such host cells or transgenic animals for producing the engineered enzyme of the first aspect of the invention are also contemplated.
DETAILED DESCRIPTION OF THE FIGURES
[0224]FIG. 1: Three-dimensional structure of human trypsin I with the active site residues shown in "ball-and-stick" representation and with the marked regions indicating potential SDR insertion sites.
[0225]FIG. 2: Alignment of the primary amino acid sequences of the human proteases trypsin I, alpha-thrombin and enteropeptidase, all of which belong to the structural class S1 of the serine protease family. Trypsin represents an unspecific protease of this structural class, while alpha-thrombin and enteropeptidase are proteases with high substrate specificity. Compared to trypsin, several regions of insertions of three or more amino acids into the primary sequence of alpha-thrombin and enteropeptidase are seen. The region marked with (-1-) and the region marked with (-3-) are preferred SDR insertion sites. In the tertiary structure of alpha-thrombin both regions are in the vicinity of the substrate binding site. These regions therefore fulfill two criteria to be selected as candidates for SDRs: firstly, they represent insertions in the specific proteases compared to the unspecific one and, secondly, they are close to the substrate binding site. A representation of the three-dimensional structure is given in FIG. 3.
[0226]FIG. 3: Three-dimensional structure of subtilisin with the active site residues being shown in "ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.
[0227]FIG. 4: Alignment of the primary amino acid sequences of subtilisin E, furin, PC1 and PC5, all of which belong to the structural class S8 of the serine protease family. Subtilisin E represents an unspecific protease of this structural class, while furin, PC1 and PC5 are proteases with high substrate specificity. Compared to subtilisin, several regions of insertions of three or more amino acids into the primary sequence of furin, PC1 and PC5 are seen. The regions marked with (-4-), (-5-), (-7-), (-9-) and (-11-) are preferred SDR insertion sites. These regions fulfill two criteria to be selected as candidates for SDRs: firstly, they represent insertions in the specific proteases compared to the unspecific one and, secondly, they are close to the active site residues.
[0228]FIG. 5: Three-dimensional structure of beta-secretase with the active site residues being shown in "ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.
[0229]FIG. 6: Alignment of the primary amino acid sequences of pepsin, beta-secretase and cathepsin D, all of which belong to the structural class A1 of the aspartic protease family. Pepsin represents an unspecific protease of this structural class, while beta-secretase and cathepsin D are proteases with high substrate specificity. Compared to pepsin, several regions of insertions of three or more amino acids into the primary sequence of beta-secretase and cathepsin D are seen. The regions marked with -1- to -11- correspond to possible SDR combining sites and are also marked in FIG. 5.
[0230]FIG. 7: illustrates the three-dimensional structure of caspase 7 with the active site residues being shown in "ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.
[0231]FIG. 8: shows the primary amino acid sequence of caspase 7 as a member of the cysteine protease class C14 family (see also SEQ ID NO: 14).
[0232]FIG. 9: Schematic representation of method according to the third aspect of the invention.
[0233]FIG. 10: Western blot analysis of trypsin expression. Supernatant of cell cultures expressing variants of trypsin are compared to negative controls. Lane 1: molecular weight standard; lane 2: negative control; lane 3: supernatant of variant a; lane 4: negative control; lane 5: supernatant of variant b. A primary antibody specific to the expressed protein and a secondary antibody for generation of the signal were used.
[0234]FIG. 11: Time course of the proteolytic cleavage of a target substrate. Supernatant of cells containing the vector with the gene for human trypsin and that of cells containing the vector without the gene was incubated with the peptide substrate described in the text. Cleavage of the peptide results in a decreased read out value. Proteolytic activity is confirmed for the positive clone.
[0235]FIG. 12: Relative activity of three engineered proteolytic enzymes in comparison with human trypsin I on two different peptide substrates. A time course of the proteolytic digestion of the two substrates was performed and evaluated. Substrate B was used for screening and substrate A is a closely related sequence. Relative activity of the three variants was normalized to the activity of human trypsin I. Variant 1 and 2 clearly show increased specificity towards the target substrate. Variant 3, on the other hand, serves as a negative control with similar activities as the human trypsin I.
[0236]FIG. 13: Relative specificities of trypsin and variants of engineered proteolytic enzymes with one or two SDRs, respectively. Activity of the proteases was determined in the presence and absence of competitor substrate, i.e. peptone at a concentration of 10 mg/ml. Time courses for the proteolytic cleavage were recorded and the time constants k determined. The ratios between the time constants with and without competitor were formed and represent a quantitative measure for the specificity of the protease. The ratios were normalized to trypsin. The specificity of the variant containing two SDRs is 2.5 fold higher than that of the variant with SDR2 alone.
[0237]FIG. 14: Shows the relative specificities of protease variants in absence and presence of competitor substrate. The protease variants containing two inserts with different sequences and the non-modified scaffold human trypsin I were expressed in a suitable host. Activity of the protease variants was determined as the cleavage rate of a peptide with the desired target sequence of TNF-alpha in the absence and presence of competitor substrate. Specificity is expressed as the ratio of cleavage rates in the presence and absence of competitor.
[0238]FIG. 15: The figure shows the reduction of cytotoxicity induced by human TNF-alpha when incubating the human TNF-alpha with concentrated supernatant from cultures expressing the inventive engineered proteolytic enzymes being specific for human TNF-alpha. This indicates the efficacy of the inventive engineered proteolytic enzymes.
[0239]FIG. 16: The figure shows the reduction of cytotoxicity induced by human TNF-alpha when incubating the human TNF-alpha with different concentrations of purified inventive engineered proteolytic enzyme being specific for human TNF-alpha. Variant g comprises Seq ID No:72 as SDR1 and Seq ID No:73 as SDR2. This indicates the efficacy of the inventive engineered proteolytic enzymes.
[0240]FIG. 17: The figure compares the activity of inventive engineered proteolytic enzymes being specific for human TNF-alpha with the activity of human trypsin I on two protein substrates: (a) human TNF-alpha; (b) mixture of human serum proteins. This indicates the safety of the inventive engineered proteolytic enzymes. Variant x corresponds to Seq ID No: 75 comprising the SDRs according to Seq ID No. 89 (SDR1) and 95 (SDR2). Variants xi and xii correspond to derivatives thereof comprising the same SDR sequences.
[0241]FIG. 18: Specific hydrolysis of human VEGF by an engineered proteolytic enzyme derived from human trypsin.
[0242]FIG. 19: Plasmid map of the shuttle vector pBVP43-Sub.
[0243]FIG. 20: Schematic drawing of the insertion of SDRs via PCR.
[0244]FIG. 21: Graphical description of the SDR insertion sites in subtilisin E.
[0245]FIG. 22: The figure compares the properties of ACE inhibition and degree of hydrolysis (DH) of GMP hydrolysates generated with either wt subtilisin E or the subtilisin E variants X and Y (SEQ ID NOs:129 and 130, respectively).
EXAMPLES
[0246]In the following examples, materials and methods of the present invention are provided including the determination of catalytic properties of enzymes obtained by the method. It should be understood that these examples are for illustrative purpose only and are not to be construed as limiting this invention in any manner. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
[0247]In the experimental examples described below, standard techniques of recombinant DNA technology were used that were described in various publications, e.g. Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, or Ausubel et al. (1987), Current Protocols in Molecular Biology 1987-1988, Wiley Interscience. Unless otherwise indicated, restriction enzymes, polymerases and other enzymes as well as DNA purification kits were used according to the manufacturers' specifications.
Example I
Identification of SDR Sites in Human Trypsin
[0248]Insertion sites for SDRs have been identified in the serine protease human trypsin I (structural class S1) by comparison with members of the same structural class having a higher sequence specificity. Trypsin represents a member with low substrate specificity, as it requires only an arginine or lysine residue at the P1 position. On the other hand, thrombin, tissue-type plasminogen activator and enterokinase all have a high specificity towards their substrate sequences, i.e. (L/I/V/F)XPR↓NA, CPGR↓VVGG and DDDK↓, respectively. The primary sequences and tertiary structures of these and further S1 serine proteases have been aligned in order to determine regions of low and high sequence and structure homology, and especially regions that correspond to insertions in the sequences of the more specific proteases (FIG. 2). Several regions of insertions equal to or longer than 3 amino acids, representing potential SDR sites, have been identified as indicated in FIG. 1. These regions were chosen as target sites for the insertion of SDRs in the examples below, e.g. SDR1 (region one in FIG. 2, after amino acid 42 according to SEQ ID NO:1) with a length of six amino acids and SDR2 (region three in FIG. 2, after amino acid 123 according to SEQ ID NO:1) with a length of five amino acids.
Example II
Molecular Cloning of the Human Trypsin I Gene to be Used as Scaffold Protein and Expression of the Mature Protease in B. Subtilis
[0249]The gene encoding the unspecific protease human trypsinogen I was cloned into the vector pUC18. Cloning was done as follows: the coding sequence of the protein was amplified by PCR using primers that introduced a KpnI site at the 5' end and a BamHI site at the 3' end. This PCR fragment was cloned into the appropriate sites of the vector pUC18. Identity was confirmed by sequencing. After sequencing, the coding sequence of the mature protein was amplified by PCR using primers that introduced different BglI sites at the 5' end and the 3' end.
[0250]This PCR fragment was cloned into the appropriate sites of an E. coli-B. subtilis shuttle vector. The vector contains a pMB1 origin for amplification in E. coli, a neomycin resistance marker for selection in E. coli, as well as a P43 promoter for the constitutive expression in B. subtilis. An 87 bp fragment that contains the leader sequence encoding the signal peptide from the sacB gene of B. subtilis was introduced behind the P43 promoter. Different BglI restriction sites serve as insertion sites for heterologous genes to be expressed.
[0251]Expression of human trypsin I was confirmed by measurement of the proteolytic activity in supernatant of cells containing the vector with the gene in comparison to a negative control. A peptide including an arginine cleavage site was chosen as a substrate. The peptide was N-terminally biotinylated and labeled with a fluorophore at the C-terminus. After incubation of the peptide with culture supernatant, streptavidin was added. Uncleaved peptide associates with streptavidin and leads to a high read out value, while cleavage results in low read out values. FIG. 11 shows the time course of a proteolytic digestion for B. subtilis cells containing the vector with the trypsin I gene in comparison to B. subtilis cells containing the vector without the trypsin I gene (negative control). As a further confirmation of expression of the protease, supernatants of cells containing the vector with the gene and of control cells were analyzed by polyacrylamide gel electrophoresis and subsequent western blot using an antibody specific to the target protease. The procedure was performed according to standard methods (Sambrook, J. F; Fritsch, E. F.; Maniatis, T.; Cold Spring Harbor Laboratory Press, Second Edition, 1989, New York). FIG. 8 confirms expression of the protein only in the cells harbouring the vector with the gene for trypsin.
Example III
Providing a Scaffold Protein
[0252]In this example, human trypsin I was used as the scaffold protein. The gene was either used in its natural form, or, alternatively, was modified to result in a scaffold protein with increased catalytic activity or further improved characteristics.
[0253]The modification was done by random modification of the gene, followed by expression of the enzyme and subsequent selection for increased activity. First, the gene was PCR amplified under error-prone conditions, essentially as described by Cadwell, R. C and Joyce, G. F. (PCR Methods Appl. 2 (1992) 28-33). Error-prone PCR was done using 30 pmol of each primer, 20 nmol dGTP and dATP, 100 nmol dCTP and dTTP, 20 fmol template, and 5 U Taq DNA polymerase in 10 mM Tris HCl pH 7.6, 50 mM KCl, 7 mM MgCl2, 0.5 mM MnCl2, 0.01% gelatin for 20 cycles of 1 min at 94° C., 1 min at 65° C. and 1 min at 72° C. The resulting DNA library was purified using the Qiaquick PCR Purification Kit following the suppliers' instructions. The PCR product was digested with the restriction enzyme BglI and purified. Afterwards, the PCR product was ligated into the E. coli-B. subtilis shuttle vector described above which was digested with BglI and dephosphorylated. The ligation products were transformed into E. coli, amplified in LB, and the plasmids were purified using the Qiagen Plasmid Purification Kit following the suppliers' instructions. Resulting plasmids were transformed into B. subtilis cells.
[0254]Alternatively, or in addition to random mutagenesis, variants of the gene were statistically recombined at homologous positions by use of the Recombination Chain Reaction, essentially as described in WO 0134835. PCR products of the genes encoding the protease variants were purified using the QIAquick PCR Purification Kit following the suppliers' instructions, checked for correct size by agarose gel electrophoresis and mixed together in equimolar amounts. 80 μg of this PCR mix in 150 mM TrisHCl pH 7.6, 6.6 mM MgCl2 were heated for 5 min at 94° C. and subsequently cooled down to 37° C. at 0.05° C./s in order to re-anneal strands and thereby produce heteroduplexes in a stochastic manner. Then, 2.5 U Exonuclease III per μg DNA were added and incubated for 20, 40 or 60 min at 37° C. in order to digest different lengths from both 3' ends of the heteroduplexes. The partly digested PCR products were refilled with 0.6 U Pfu polymerase per μg DNA by incubating for 15 min at 72° C. in 0.17 mM dNTPs and Pfu polymerase buffer according to the suppliers' instructions. After performing a single PCR cycle, the resulting DNA was purified using the QIAquick PCR Purification Kit following the suppliers' instructions, digested with BglI and ligated into the linearized vector. The ligation products were transformed into E. coli, amplified in LB containing ampicillin as marker, and the plasmids were purified using the Qiagen Plasmid Purification Kit following the suppliers' instructions. Resulting plasmids were transformed into B. subtilis cells.
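The quantities implied by the recombination protocol above can be worked out with simple arithmetic. The following is only an illustrative sketch: the variable names are not from the source, and the only inputs are the figures stated in paragraph [0254].

```python
# Figures stated in the recombination protocol (paragraph [0254]).
dna_ug = 80.0             # pooled PCR product, in micrograms
exo3_u_per_ug = 2.5       # Exonuclease III units per microgram of DNA
pfu_u_per_ug = 0.6        # Pfu polymerase units per microgram of DNA
t_start_c = 94.0          # denaturation temperature, degrees C
t_end_c = 37.0            # final annealing temperature, degrees C
ramp_c_per_s = 0.05       # cooling rate, degrees C per second

# Total enzyme units needed for the whole DNA pool.
exo3_units = dna_ug * exo3_u_per_ug
pfu_units = dna_ug * pfu_u_per_ug

# Duration of the slow cooling ramp used for heteroduplex formation.
ramp_seconds = (t_start_c - t_end_c) / ramp_c_per_s

print(f"Exonuclease III: {exo3_units:.0f} U")
print(f"Pfu polymerase: {pfu_units:.0f} U")
print(f"Cooling ramp: {ramp_seconds / 60:.0f} min")
```

The slow ramp takes roughly 19 minutes, which is considerably longer than the 5 min denaturation step that precedes it.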
Example IV
Insertion of SDRs into the Protein Scaffold of Human Trypsin I and Generation of an Engineered Proteolytic Enzyme with Specificity for a Peptide Substrate Having the Sequence KKWLGRVPGGPV
[0255]In order to create insertion sites for SDRs in human trypsin I, two pairs of different restriction sites were introduced into the gene at sites that were identified as potential SDR sites (see Example I above) without changing the amino acid sequence. The insertion of the restriction sites was done by overlap extension PCR. Primers restr1 and restr2 were used for the introduction of SacII and BamHI restriction sites, restr3 and restr4 were used for the introduction of KpnI and NheI restriction sites. The sequences of the primers were as follows:
[0256]Binding site for restr1 and restr2 and the corresponding amino acid sequence
TABLE-US-00002
(SEQ ID NO: 54):
5'-GGTGGTATCAGCAGGCCACTGCTACAAGTCCCGCATCCAGGT-3'
V V S A G H C Y K S R I Q
Forward primer restr1 (SEQ ID NO: 56):
5'-GGTGGTATCCGCGGGCCACTGCTACAAGTCCCGGATCCAGGT-3'
Reverse primer restr2 (SEQ ID NO: 57):
5'-ACCTGGATCCGGGACTTGTAGCAGTGGCCCGCGGATACCACC-3'
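The statement that the SacII and BamHI sites are introduced without changing the amino acid sequence can be checked directly from the three sequences listed above. The sketch below is illustrative only: the helper functions are hypothetical, the standard genetic code is assumed, and the reading frame (offset of one nucleotide) is inferred from the amino acid line given with SEQ ID NO: 54.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate complete codons of dna, starting at the given offset."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna[::-1].translate(str.maketrans("ACGT", "TGCA"))


# Sequences copied from the table above.
wild_type = "GGTGGTATCAGCAGGCCACTGCTACAAGTCCCGCATCCAGGT"
restr1 = "GGTGGTATCCGCGGGCCACTGCTACAAGTCCCGGATCCAGGT"
restr2 = "ACCTGGATCCGGGACTTGTAGCAGTGGCCCGCGGATACCACC"

SACII = "CCGCGG"   # SacII recognition sequence
BAMHI = "GGATCC"   # BamHI recognition sequence

# Frame 1 reproduces the amino acid line printed with SEQ ID NO: 54.
protein_wt = translate(wild_type, frame=1)
protein_mut = translate(restr1, frame=1)

print("wild type :", protein_wt)
print("restr1    :", protein_mut)
print("silent    :", protein_wt == protein_mut)
print("sites new :", SACII in restr1 and BAMHI in restr1
      and SACII not in wild_type and BAMHI not in wild_type)
print("primers complementary:", reverse_complement(restr1) == restr2)
```

The same check could be applied to any other primer pair, provided the reading frame of the binding site is known.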
[0257]Binding site for restr3 and restr4 and the corresponding amino acid sequence
TABLE-US-00003
(SEQ ID NO: 58):
5'-CCACTGGCACGAAGTGCCTCATCTCTGGCTGGGGCAACACTGCGAGCTCT-3'
T G T K C L I S G W G N T A S S
Forward primer restr3 (SEQ ID NO: 60):
5'-CCACTGGCACGAAGTGCCTCATCTCTGGCTGGGGCAACACTGCGAGCTCT-3'
Reverse primer restr4 (SEQ ID NO: 61):
5'-AGAGCTAGCAGTGTTGCCCCAGCCAGAGATGAGGCACTTGGTACCAGTGG-3'
[0258]In a first overlap extension PCR, the SacII/BamHI sites were introduced, enabling the insertion of SDR1, and in a second overlap extension PCR the KpnI/NheI sites were introduced, enabling the insertion of SDR2. The product of the overlap extension PCR was amplified using primers pUC-forward and pUC-reverse. The sequences of pUC-forward and pUC-reverse are as follows:
TABLE-US-00004
pUC-forward (SEQ ID NO: 62):
5'-GGGGTACCCCACCACCATGAATCCACTCCT-3'
pUC-reverse (SEQ ID NO: 63):
5'-CGGGATCCGGTATAGAGACTGAAGAGATAC-3'
[0259]The restriction sites generated thereby were subsequently used to insert defined or random oligonucleotides into the SDR1 and SDR2 insertion sites by standard restriction and ligation methods. Typically, two complementary synthetic 5'-phosphorylated oligonucleotides were annealed and ligated into a vector carrying the modified human trypsin I gene that was cleaved with the respective restriction enzymes. Oligonucleotides encoding SDR1 were inserted via the SacII/BamHI sites whereas oligonucleotides encoding SDR2 were inserted via the KpnI/NheI sites. For each insertion an oligonucleotide pair according to the following general sequences was used ([P] indicating 5'-phosphorylation, N and X indicating any nucleotide or amino acid residue, respectively):
TABLE-US-00005
oligox-SDR1f (SEQ ID NO: 64):
5'-[P]-GGGCCACTGCTACNNNNNNNNNNNNNNNNNNAAGTCCCG-3'
oligox-SDR1r (SEQ ID NO: 66):
3'-CGCCCGGTGACGATGNNNNNNNNNNNNNNNNNNTTCAGGGCCTAG-[P]-5'
G H C Y X X X X X X K S
oligox-SDR2f (SEQ ID NO: 67):
5'-[P]-CAAGTGCCTCATCTCTGGCTGGGGCAACNNNNNNNNNNNNNNNACTG-3'
oligox-SDR2r (SEQ ID NO: 69):
3'-CATGGTTCACGGAGTAGAGACCGACCCCGTTGNNNNNNNNNNNNNNNTGACGATC-[P]-5'
K C L I S G W G N X X X X X T
[0260]As an alternative to the above method, a PCR based method was used for the integration of random-sequences into the SDR1 and SDR2 insertion sites in the modified human trypsin I. For each SDR, one primer was used where the SDR region is fully randomized. Sequences of the primers were as follows (N=A/C/G/T, B=C/G/T, V=A/C/G):
TABLE-US-00006
Primer SDR1-mutnnb-forward (SEQ ID NO: 70):
5'-TGGTATCCGCGGGCCACTGCTACNNBNNBNNBNNBNNBNNBAAGTCCCGGATCCAGGTG-3'
Primer SDR2-mutnnb-reverse (SEQ ID NO: 71):
5'-GGCGCCAGAGCTAGCAGTVNNVNNVNNVNNVNNGTTGCCCCAGCCAGAGATG-3'
[0261]The codon NNB, or VNN in the reverse strand, allows all 20 amino acids to be encoded, but reduces the probability of encoding a stop codon from 0.047 to 0.021.
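The two stop codon probabilities quoted above can be reproduced by enumerating all codons of each scheme. The following sketch assumes the standard genetic code and equal representation of every permitted base at each position; the function name is illustrative.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_stats(first, second, third):
    """Return (stop codon fraction, number of distinct amino acids)."""
    codons = ["".join(c) for c in product(first, second, third)]
    stops = sum(CODON_TABLE[c] == "*" for c in codons)
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    return stops / len(codons), len(amino_acids)


# N = A/C/G/T, B = C/G/T (as defined for the primers above).
p_stop_nnn, n_aa_nnn = codon_stats("ACGT", "ACGT", "ACGT")
p_stop_nnb, n_aa_nnb = codon_stats("ACGT", "ACGT", "CGT")

print(f"NNN: P(stop) = {p_stop_nnn:.3f}, amino acids = {n_aa_nnn}")
print(f"NNB: P(stop) = {p_stop_nnb:.3f}, amino acids = {n_aa_nnb}")
```

Excluding A from the third position removes TAA and TGA, leaving TAG as the only stop codon among the 48 NNB codons, while every amino acid retains at least one codon.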
[0262]As a further alternative, after identification of SDRs that lead to increased specificity, these SDRs were used as templates for further randomization. Thereby, random peptide sequences were inserted that were partially randomized at each position and partially identical at each position to the original sequence.
[0263]As an example, random peptide sequences that have in approximately 1 of 3 cases the template amino acid residue and in approximately 2 of 3 cases any other amino acid residue at each position were inserted into the two SDR insertion sites of the modified human trypsin I. For this purpose, primers that contain at each nucleotide position of the SDR approximately 70% of the template bases and 30% of a mixture of the three other bases were used.
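The relation between 70% template bases per nucleotide and roughly one in three retained residues can be made explicit. The sketch below rests on two assumptions not stated in the source: the remaining 30% is split equally among the three other bases, and the standard genetic code applies. The function name is illustrative.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def p_residue_retained(template_codon, p_template=0.7):
    """Probability that a doped codon still encodes the template residue."""
    # Assumption: each of the three non-template bases is equally likely.
    p_other = (1.0 - p_template) / 3.0
    target = CODON_TABLE[template_codon]
    total = 0.0
    for bases in product("ACGT", repeat=3):
        if CODON_TABLE["".join(bases)] != target:
            continue
        p = 1.0
        for got, want in zip(bases, template_codon):
            p *= p_template if got == want else p_other
        total += p
    return total


# Probability that all three nucleotides of a codon are unchanged.
p_codon_unchanged = 0.7 ** 3

print(f"codon unchanged: {p_codon_unchanged:.3f}")
print(f"Met (ATG) retained: {p_residue_retained('ATG'):.3f}")
print(f"Gly (GGT) retained: {p_residue_retained('GGT'):.3f}")
```

For residues with a single codon the retention probability equals 0.7 cubed, about one third; for residues with several synonymous codons it is somewhat higher, which is why the figures in the paragraph above are approximate.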
[0264]With each primer pair a PCR was performed under standard conditions using the human trypsin I gene as template. The resulting DNA was purified using the QIAquick PCR Purification Kit following the suppliers' instructions and digested with SacII and NheI. After digestion the DNA was purified and ligated into the SacII and NheI digested and dephosphorylated vector. The ligation products were transformed into E. coli, amplified in LB containing the respective marker, and the plasmids were purified using the Qiagen Plasmid Purification Kit following the suppliers' instructions. Resulting plasmids were transformed into B. subtilis cells. These cells were then separated to single cells, grown to clones, and after expression of the protease gene screened for proteolytic activity.
[0265]The following substrates were employed for screening for proteolytic activity (SEQ ID NOs:76 and 77):
TABLE-US-00007
substrate A   L L W L G R V V G G P V
substrate B   K K W L G R V P G G P V
[0266]Protease variants were screened on substrate B at complexities of 10^6 variants by confocal fluorescence spectroscopy. The substrate was a peptide biotinylated at the N-terminus and fluorescently labeled at the C-terminus. After incubation of the peptide with supernatant of cells expressing different variants of the protease, streptavidin was added and the samples were analysed by confocal fluorimetry. The low concentration of the peptide (20 nM) leads to a preferential cleavage by proteases with a high kcat/KM value, i.e. proteases with high specificity towards the target sequence.
[0267]Variants selected in the screening procedure were further evaluated for their specificity towards substrate B and the closely related substrate A by measuring time courses of the proteolytic digestion and determining the rate constants, which are proportional to the kcat/KM values. Clearly, compared to the human trypsin that was used as scaffold protein, the specific activity of variants 1 and 2 (SEQ ID NOs: 2 and 3, respectively) is shifted towards substrate B. Variant 3 (SEQ ID NO:4), on the other hand, serves as a negative control with similar activities as the human trypsin I. Sequencing of the genes of the three variants revealed the following amino acid sequences in the SDRs.
TABLE-US-00008
TABLE 2 Sequences of the two SDRs in three different variants selected for specific hydrolysis of substrate B (SEQ ID NOs: 78-83).
            SDR 1               SDR 2
Trypsin     -- -- -- -- -- --   -- -- -- -- --
Variant 1   D A V G R D         T I T N S
Variant 2   N G R D L E         V R G T W
Variant 3   G F V M F N         R S P L T
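To illustrate how the SDR sequences of Table 2 relate to the scaffold, the insertion positions named in Example I (after amino acids 42 and 123 of SEQ ID NO:1) can be applied to the scaffold sequence. The sketch below is illustrative only: the function is hypothetical, and only the first 128 residues of SEQ ID NO:1, transcribed from the sequence listing into one-letter code, are used.

```python
# First 128 residues of human trypsin I (SEQ ID NO:1), one-letter code.
TRYPSIN_1_128 = (
    "IVGGYNCEENSVPYQV"
    "SLNSGYHFCGGSLINE"
    "QWVVSAGHCYKSRIQV"
    "RLGEHNIEVLEGNEQF"
    "INAAKIIRHPQYDRKT"
    "LNNDIMLIKLSSRAVI"
    "NARVSTISLPTAPPAT"
    "GTKCLISGWGNTASSG"
)


def insert_sdrs(scaffold, insertions):
    """Insert peptides after the given 1-based scaffold positions."""
    result = scaffold
    # Start with the highest position so that the remaining positions
    # still refer to the numbering of the unmodified scaffold.
    for position, peptide in sorted(insertions.items(), reverse=True):
        result = result[:position] + peptide + result[position:]
    return result


# SDR1 and SDR2 of variant 1 as listed in Table 2.
variant_1 = insert_sdrs(TRYPSIN_1_128, {42: "DAVGRD", 123: "TITNS"})

print(variant_1)
print("length:", len(variant_1))
```

The resulting fragment contains the stretches Tyr-Asp-Ala-Val-Gly-Arg-Asp-Lys and Asn-Thr-Ile-Thr-Asn-Ser-Thr, which agree with the corresponding regions of SEQ ID NO:2 in the sequence listing.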
[0268]In a further experiment a pool of variants containing different numbers of SDRs per gene was screened for increased specificity using a mixture of the defined substrate and peptone as a competing substrate. Variants containing one or two SDRs per gene have been analyzed further. As a measure for the specificity, the activity in the peptide cleavage assay was compared with and without the presence of the competing substrate. The concentration of the competing substrate was 10 mg/ml. Under these conditions, unspecific proteases show, compared to specific proteases, a stronger decrease in activity with increasing competitor concentrations (range between 0 and 100 mg/ml). The ratio of proteolytic activity with and without competing substrate is a quantitative measure for the specificity of the proteases. FIG. 9 shows the relative activities with and without competing substrate. Human trypsin I that was used as the scaffold protein and two variants, one containing only SDR2 and one containing both SDRs, were compared. The specificity of the variant with both SDRs is higher by a factor of 2.5 than that of the variant with SDR2 only, confirming that there is a direct relation between the number of SDRs and the quantitative specificity of resulting engineered proteolytic enzymes.
Example V
Generation of an Engineered Proteolytic Enzyme that Specifically Inactivates Human TNF-Alpha
[0269]Human trypsin alpha I or a derivative comprising one or more of the following amino acid substitutions E56G; R78W; Y131F; A146T; C183R was used as protein scaffold for the generation of an engineered proteolytic enzyme with high specificity towards human TNF-alpha. The identification of SDR sites in human trypsin I or derivatives thereof was done as described above. Two insertion sites within the scaffold were chosen for SDRs. The protease variants containing two inserts with different sequences and also the human trypsin I itself with no inserts were expressed in Bacillus subtilis cells. The variant protease cells were separated to single cell clones and the protease expressing variants were screened for proteolytic activity on peptides with the desired target sequence of TNF-alpha. The activity of the protease variants was determined as the cleavage rate of a peptide with the desired target sequence of TNF-alpha in the absence and presence of competitor substrate. The specificity is expressed as the ratio of cleavage rates in the presence and absence of competitor (FIG. 14).
TABLE-US-00009
TABLE 3 Relative specificity of variants of engineered proteolytic enzymes with different SDR sequences in absence and presence of competitor substrate (SEQ ID NOs: 84-95).
                     k with comp./      Seq. of   Seq. of
                     k without comp.    SDR 1     SDR 2
Scaffold (no SDRs)   0.092              --        --
variant a            0.130              RPWDPS    VHPTS
variant b            0.187              GFVMFN    RSPLT
variant c            0.235              EIANRE    RGART
variant d            0.310              KAVVGT    RTPIS
variant e            0.374              VNIMAA    TTARK
variant f            0.487              AAFNGD    RKDFW
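The ratios in Table 3 can be put in relation to the scaffold to see how much each SDR pair improves specificity. The following sketch uses only the values listed in Table 3; the dictionary and variable names are illustrative.

```python
# k with competitor / k without competitor, copied from Table 3.
SPECIFICITY = {
    "scaffold": 0.092,
    "variant a": 0.130,
    "variant b": 0.187,
    "variant c": 0.235,
    "variant d": 0.310,
    "variant e": 0.374,
    "variant f": 0.487,
}

scaffold_ratio = SPECIFICITY["scaffold"]

# Fold change of each ratio relative to the scaffold without SDRs.
fold_over_scaffold = {
    name: ratio / scaffold_ratio for name, ratio in SPECIFICITY.items()
}
best_variant = max(fold_over_scaffold, key=fold_over_scaffold.get)

for name, fold in fold_over_scaffold.items():
    print(f"{name}: {fold:.1f}-fold")
print("highest ratio:", best_variant)
```

By this measure variant f retains roughly five times more of its activity in the presence of competitor than the scaffold does.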
[0270]The antagonistic effect of three inventive protease variants on human TNF-alpha is shown in FIG. 15. By the use of the variants, the induction of apoptosis is almost completely eliminated indicating the anti-inflammatory efficacy of the inventive proteases to initiate TNF-alpha break down. TNF-alpha has been incubated with concentrated supernatant from cultures expressing the variants i to iii for 2 hours. The resulting TNF-alpha has been incubated with non-modified cells for 4 hours. The effect of the remaining TNF-alpha activity was determined as the extent of apoptosis induction by detection of activated caspase-3 as marker for apoptotic cells. For the controls either no protease was added with the human TNF-alpha (dead cells) or buffer instead of human TNF-alpha (live cells) was used, respectively. An analogous experiment is shown in FIG. 16 using purified variant xiii. TNF-alpha was incubated with different concentrations of the purified inventive protease variant.
[0271]To demonstrate the specificity of the inventive protease variants, proteins from human blood serum or purified human TNF-alpha have been incubated with human trypsin I or the inventive engineered proteolytic enzyme variants, respectively. Here, variant x corresponds to Seq ID No: 75 comprising the same SDRs as variant f, i.e. SDRs according to Seq ID No. 89 (SDR1) and 95 (SDR2). Variants xi and xii correspond to derivatives thereof comprising the same SDR sequences. Remaining intact protein was determined as a function of time. While the variants as well as human trypsin I digest human TNF-alpha, only trypsin shows activity on serum protein (FIG. 17 a and b). This demonstrates the high TNF-alpha specificity of the inventive proteolytic enzymes and indicates their safety and accordingly their low side effects for therapeutic use.
Example VI
Generation of an Engineered Proteolytic Enzyme that Specifically Hydrolyzes Human VEGF
[0272]Human trypsin I was used as protein scaffold for the generation of an engineered proteolytic enzyme with high specificity towards human VEGF. The identification of SDR sites in human trypsin I was done as described above. Two insertion sites within the scaffold were chosen for SDRs. The protease variants containing two inserts with different sequences were expressed in Bacillus subtilis cells. The variant protease cells were separated to single cell clones and the protease expressing variants were screened as described above. The activity of the protease variants was determined as the rate of VEGF cleavage. 4 μg of recombinant human VEGF165 was incubated with 0.18 μg of purified protease in PBS/pH 7.4 at room temperature. Aliquots were taken at the indicated time points and analysed on a polyacrylamide gel. The extent of cleavage was quantified by densitometric analysis of the bands. The activity is plotted over incubation time in FIG. 18. Specific cleavage was controlled by further SDS polyacrylamide gel analyses.
Example VII
Identification of SDR Sites in Subtilisin E
[0273]Insertion sites for SDRs have been identified in subtilisin E (structural class S8) by comparison with members of the same structural class having a higher substrate specificity. Subtilisin E represents a member with low substrate specificity, while furin, PC1 and PC5 are proteases with high substrate specificity. By comparing the primary sequences of these proteases, several regions for insertions have been identified (see FIG. 4). The regions marked with (-4-), (-5-), (-7-), (-9-) and (-11-) are preferred SDR insertion sites. These regions fulfill two criteria to be selected as candidates for SDRs: firstly, they represent insertions in the specific proteases compared to the unspecific one and, secondly, they are close to the active site residues.
Example VIII
Molecular Cloning of the Subtilisin E Gene to be Used as Scaffold Protein and Expression of the Mature Protease in B. subtilis
[0274]The aprE gene coding for subtilisin E was amplified by PCR from the genome of Bacillus subtilis strain 168 (DSM #402). Amplification was performed for 20 cycles using a 5:1 mixture of Taq and Pfu polymerases, with B. subtilis cells serving as templates. Primers simultaneously introducing terminal sites for unique cutting restriction endonucleases were chosen such that cloning of the full-length CDS into pBV1 was possible. pBV1 is an E. coli/B. subtilis shuttle vector that was constructed by introducing a pMB1 origin of replication from pUC19 into pUB110 before deleting its mob gene. Subsequent to cloning of the subtilisin gene into this vector, a synthetic oligomer encoding the P43 promoter was introduced upstream of the gene, resulting in pBVP43-Sub. To facilitate mutating the DNA sequence encoding the mature subtilisin without affecting the N-terminal signal sequence, two unique restriction sites (DraIII) were introduced at sites corresponding to the N terminus and the C terminus of the mature subtilisin. The vector construct is shown in FIG. 19. Identity of the intermediates as well as the final product of the cloning procedure was confirmed by means of DNA cycle sequencing using the dideoxynucleotide method. Functionality of the construct was confirmed by plating transformants of pBVP43-Sub in an aprE deficient B. subtilis strain (Harwood and Cutting (1990) Molecular Biological Methods for Bacillus, J. Wiley and Sons, New York) on LB agar containing 1% skim milk and the appropriate antibiotics. Subtilisin activity resulted in cleared halos around the Bacillus subtilis colonies. Expression of subtilisin E or a subtilisin variant was done by inoculating complex media containing NaCl, tryptone and yeast extract with an aprE deficient B. subtilis clone transformed with a pBVP43-Sub plasmid encoding the particular variant, and incubating in Erlenmeyer flasks for 24 to 48 h at 30° C. with continuous shaking.
Example IX
Insertion of SDRs into the Protein Scaffold of Subtilisin E and Generation of Proteolytic Enzymes
[0275]SDRs were introduced into the subtilisin E scaffold via PCR. The subtilisin E gene served as a template for the PCR. All PCRs were done with the KOD polymerase under conditions recommended by the manufacturer. PCR products were purified using the QIAquick PCR Purification Kit following the suppliers' instructions. The method is illustrated in FIG. 20.
[0276]Insertion of SDR 1 and 4 in the subtilisin E gene: Primer G (SEQ ID NO:120) and primer H (SEQ ID NO:121) bind on the vector sequence in FIG. 19. The coding sequence for the mature subtilisin E protein is flanked by DraIII restriction sites. For the insertion of SDRs 1 and 4, in a first step three PCR reactions were carried out in order to generate PCR product 1 (PCR with primer G (SEQ ID NO:120) and primer D (SEQ ID NO:117)), PCR product 2 (PCR with primer H (SEQ ID NO:121) and primer J (SEQ ID NO:123)) and PCR product 3 (PCR with primer A (SEQ ID NO:114) and primer I (SEQ ID NO:122)), respectively. In a next step PCR product 3 was used as a template in a PCR reaction with primer E (SEQ ID NO:118) and primer K (SEQ ID NO:124) that results in the PCR product 4 harbouring the SDRs.
[0277]In a further step the whole subtilisin E gene with SDRs was generated in an overlap extension PCR with PCR products 1, 2 and 4 and the primers G and H. Finally the resulting PCR product was cut with DraIII and ligated into pBVP43-Sub.
[0278]Insertion of SDR 2 and 3 in the subtilisin E gene: Primer G (SEQ ID NO:120) and primer H (SEQ ID NO:121) bind on the vector sequence in FIG. 19. The coding sequence for the mature subtilisin E protein is flanked by DraIII restriction sites. For the insertion of SDRs 2 and 3, in a first step three PCR reactions were carried out in order to generate PCR product 1 (PCR with primer G (SEQ ID NO:120) and primer M (SEQ ID NO:126)), PCR product 2 (PCR with primer H (SEQ ID NO:121) and primer C (SEQ ID NO:116)) and PCR product 3 (PCR with primer N (SEQ ID NO:127) and primer B (SEQ ID NO:115)), respectively. In a next step PCR product 3 was used as a template in a PCR reaction with primer L (SEQ ID NO:125) and primer F (SEQ ID NO:119) that results in the PCR product 4 harboring the SDRs.
[0279]In a further step the whole subtilisin E gene with SDRs was generated in an overlap extension PCR with PCR products 1, 2 and 4 and the primers G and H. Finally the resulting PCR product was cut with DraIII and ligated into pBVP43-Sub.
[0280]The insertion sites for the SDRs 1, 2, 3 and 4 are provided below. It is to be noted that the numbering below refers to SEQ ID NO:7. This sequence contains five amino acids of the propeptide prior to the sequence of the mature subtilisin E protein.
[0281]The sequence of the mature protein starts with AQSVPY.
[0282]SDR 1 comprising the motif GXXXX (X stands for any amino acid; SEQ ID NO:128) was inserted between amino acids D65 and G66 of SEQ ID NO:7.
[0283]SDR 2 comprising the motif GXXXX (X stands for any amino acid; SEQ ID NO:128) was inserted between amino acids T104 and G105 of SEQ ID NO:7.
[0284]SDR 3 comprising the motif GXXXXG (X stands for any amino acid; SEQ ID NO:129) was inserted between amino acids G133 and T135 of SEQ ID NO:7, while P134 of the same sequence was replaced by a G.
[0285]SDR 4 comprising the motif GXXXX (X stands for any amino acid; SEQ ID NO:128) was inserted between amino acids E161 and G162 of SEQ ID NO:7.
[0286]A graphical description of the insertion sites is shown in FIG. 21.
Example X
Screening of Proteolytic Enzymes which Generate Protein Hydrolysates that are Capable of Specifically Inhibiting Angiotensin Converting Enzyme (ACE, Human or from Rabbit Lung)
[0287]The generated subtilisin E variants (harboring SDRs at different sites in the protein) were screened for their ability to generate hydrolysates of glycomacropeptide (GMP) with ACE inhibitory activity. The hydrolysates were analyzed for their degree of hydrolysis (DH) and their ACE inhibiting potential. As a measure for the specificity of subtilisin E variants the ratio between ACE inhibitory potential and DH was calculated. This value was used to discriminate specific from less specific variants.
[0288]Determination of ACE activity: For the determination of ACE activity a fluorescence based assay was developed. The assay comprises a peptide substrate that contains an ACE specific recognition sequence, a fluorophore coupled to the N-terminus and biotin coupled to the C-terminus of the peptide. This peptide is subjected to hydrolysis by ACE and subsequently contacted with streptavidin, which binds to the C-terminal biotin. The final readout is the fluorescence anisotropy of the coupled fluorophore. In the case of no hydrolysis of the peptide the fluorescence anisotropy value is high. In the case of complete hydrolysis the fluorescence anisotropy is low.
[0289]For a typical analysis the B. subtilis cells capable of expressing the subtilisin E variants were grown for 16 h at 37° C. One volume of cell suspension was mixed with 1.3 volumes of an 11% (w/v) GMP solution in water and incubated for 12 h at 37° C. From this mixture one volume was diluted 20-fold with PBS buffer, incubated for 30 min at 90° C. and subsequently incubated for 30 min at room temperature. This solution was then mixed with an ACE solution (rabbit lung ACE, Sigma) and a fluorescent peptide solution to yield final concentrations of 13.6 mU/ml ACE and 57.7 nM fluorescent peptide. After 5 h incubation at 37° C. the reaction was stopped by adding at least a 5-fold molar excess of streptavidin over the fluorescent peptide and adjusting the solution to 625 mM EDTA. Finally the fluorescence anisotropy was measured.
[0290]Reference measurements for 100% ACE activity were carried out as described above using B. subtilis culture medium instead of cell suspension. Buffer was used instead of ACE solution in the above protocol in order to provide a reference value for complete inhibition of ACE (0% ACE activity).
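One simple way to convert a measured anisotropy into a relative ACE activity is to interpolate between the two reference values described above. The sketch below is an assumption, not the evaluation used in the source: it presumes that anisotropy changes linearly with the fraction of cleaved peptide, and all names and numerical values are hypothetical.

```python
def ace_activity_percent(r_sample, r_no_ace, r_full_ace):
    """Relative ACE activity by linear interpolation between references.

    r_no_ace   : anisotropy of the 0% activity reference (buffer, no ACE)
    r_full_ace : anisotropy of the 100% activity reference (culture medium)
    """
    if r_no_ace == r_full_ace:
        raise ValueError("reference anisotropies must differ")
    return 100.0 * (r_no_ace - r_sample) / (r_no_ace - r_full_ace)


def ace_inhibition_percent(r_sample, r_no_ace, r_full_ace):
    """Inhibition expressed as the complement of the relative activity."""
    return 100.0 - ace_activity_percent(r_sample, r_no_ace, r_full_ace)


# Hypothetical anisotropy readings, chosen only for illustration.
r_no_ace = 0.20      # uncleaved peptide bound to streptavidin: high value
r_full_ace = 0.05    # fully cleaved peptide: low value
r_sample = 0.125     # hydrolysate sample

print(f"ACE activity: {ace_activity_percent(r_sample, r_no_ace, r_full_ace):.0f}%")
print(f"ACE inhibition: {ace_inhibition_percent(r_sample, r_no_ace, r_full_ace):.0f}%")
```

A sample whose anisotropy stays close to the no-ACE reference therefore corresponds to strong inhibition by the hydrolysate.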
[0291]Determination of the degree of hydrolysis: GMP hydrolysates were diluted 1:15 in 1% (w/v) SDS. One volume of the diluted hydrolysates was mixed with 3 volumes of 175 mM sodium phosphate buffer, pH 8.2, and 3 volumes of a 0.05% TNBS solution in 30 mM Na-phosphate pH 8.2. The mixture was incubated for 60 min at 50° C. and subsequently diluted 2-fold with 0.1 M HCl. Finally the absorption was recorded at 340 nm. Calibration measurements were carried out using glycine as a reference molecule.
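The dilution steps of the TNBS assay and the glycine calibration can be combined into a short calculation. The sketch below makes several assumptions that are not stated in the source: "1:15" is read as a 15-fold dilution, the glycine standards are expressed as concentrations in the measured solution, the calibration is linear, and all standard and sample readings are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical glycine standards (mM in the measured solution) and A340.
glycine_mm = [0.0, 0.1, 0.2, 0.4]
a340_standards = [0.02, 0.12, 0.22, 0.42]
slope, intercept = fit_line(glycine_mm, a340_standards)

# Dilutions from the protocol: 15-fold in SDS (assumed reading of 1:15),
# 7-fold in the assay mix (1 + 3 + 3 volumes), 2-fold with HCl.
TOTAL_DILUTION = 15 * 7 * 2


def amino_groups_mm(a340_sample):
    """Free amino groups in the undiluted hydrolysate, as glycine equivalents."""
    return (a340_sample - intercept) / slope * TOTAL_DILUTION


print("total dilution:", TOTAL_DILUTION)
print(f"amino groups: {amino_groups_mm(0.22):.1f} mM glycine equivalents")
```

Converting this concentration into a degree of hydrolysis additionally requires the total number of peptide bonds in the substrate, which is not given in the text above.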
Example XI
Subtilisin E Variants with Improved Specificity
[0292]To demonstrate improved specificity of the subtilisin E variants, these were compared with wt subtilisin E for their properties regarding ACE inhibition vs. degree of hydrolysis as described in Example X. As a reference (negative control, also in Table 4) for the measurements the host strain transformed with an empty expression plasmid was used. Inhibition and hydrolysis data were normalized and calculated relative to the negative control. As the ultimate measure for the specificity of different variants, the ratio between ACE inhibition and DH was calculated (see Table 4). The higher this ratio, the more specific is the enzyme in the generation of ACE inhibitory hydrolysates/peptides.
TABLE-US-00010
TABLE 4 Subtilisin E variants and their respective SDR sequences and proteolytic properties. The two variants X and Y were selected for specific hydrolysis of GMP. Here the properties of the GMP hydrolysates generated with subtilisin E variants are quantified by the ratio of ACE inhibition and degree of hydrolysis (DH).
Ratio = ACE inhibition (sample - negative control) / DH (sample - negative control)
                             Ratio   Seq. of   Seq. of   Seq. of   Seq. of
                                     SDR 1     SDR 2     SDR 3     SDR 4
scaffold no SDRs             0.6     --        --        --        --
variant X (SEQ ID NO: 130)   1.4     GGRF      --        --        PNQL
variant Y (SEQ ID NO: 131)   5.2     --        RDVL      QNAP      --
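The specificity measure of Table 4 is a ratio of two background-corrected quantities. The sketch below shows the calculation with a hypothetical function and hypothetical raw readings, and then compares the ratios actually listed in Table 4 with that of the scaffold.

```python
def specificity_ratio(ace_inh_sample, ace_inh_negative,
                      dh_sample, dh_negative):
    """ACE inhibition over DH, each corrected by the negative control."""
    dh_delta = dh_sample - dh_negative
    if dh_delta == 0:
        raise ValueError("DH of sample equals DH of negative control")
    return (ace_inh_sample - ace_inh_negative) / dh_delta


# Hypothetical raw readings, for illustration of the formula only.
example_ratio = specificity_ratio(ace_inh_sample=0.8, ace_inh_negative=0.1,
                                  dh_sample=0.6, dh_negative=0.1)

# Ratios as listed in Table 4.
TABLE_4 = {"scaffold": 0.6, "variant X": 1.4, "variant Y": 5.2}
fold_over_scaffold = {
    name: ratio / TABLE_4["scaffold"] for name, ratio in TABLE_4.items()
}

print(f"example ratio: {example_ratio:.2f}")
for name, fold in fold_over_scaffold.items():
    print(f"{name}: {fold:.1f}-fold")
```

Relative to the scaffold, the listed ratios correspond to an increase of roughly two-fold for variant X and nearly nine-fold for variant Y.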
[0293]FIG. 22 illustrates as well the differences in the properties of wt and variants and clearly indicates the improved specificity of the variants compared to wt.
[0294]Since the only modifications of the scaffold molecule subtilisin E were the insertion of the SDRs, it is proven that the SDRs are responsible for the improved specificity of the variants X and Y. This proves the concept of generating specific enzymes via insertion of SDRs for the proteases from the structural class S8.
Sequence CWU
1
1491224PRTHomo sapiens 1Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val
Pro Tyr Gln Val1 5 10
15Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn Glu20
25 30Gln Trp Val Val Ser Ala Gly His Cys Tyr
Lys Ser Arg Ile Gln Val35 40 45Arg Leu
Gly Glu His Asn Ile Glu Val Leu Glu Gly Asn Glu Gln Phe50
55 60Ile Asn Ala Ala Lys Ile Ile Arg His Pro Gln Tyr
Asp Arg Lys Thr65 70 75
80Leu Asn Asn Asp Ile Met Leu Ile Lys Leu Ser Ser Arg Ala Val Ile85
90 95Asn Ala Arg Val Ser Thr Ile Ser Leu Pro
Thr Ala Pro Pro Ala Thr100 105 110Gly Thr
Lys Cys Leu Ile Ser Gly Trp Gly Asn Thr Ala Ser Ser Gly115
120 125Ala Asp Tyr Pro Asp Glu Leu Gln Cys Leu Asp Ala
Pro Val Leu Ser130 135 140Gln Ala Lys Cys
Glu Ala Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met145 150
155 160Phe Cys Val Gly Phe Leu Glu Gly Gly
Lys Asp Ser Cys Gln Gly Asp165 170 175Ser
Gly Gly Pro Val Val Cys Asn Gly Gln Leu Gln Gly Val Val Ser180
185 190Trp Gly Asp Gly Cys Ala Gln Lys Asn Lys Pro
Gly Val Tyr Thr Lys195 200 205Val Tyr Asn
Tyr Val Lys Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser210
215 2202235PRTArtificial SequenceDescription of Artificial
Sequence = Synthetic Construct 2Ile Val Gly Gly Tyr Asn Cys Glu Glu
Asn Ser Val Pro Tyr Gln Val1 5 10
15Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn
Glu20 25 30Gln Trp Val Val Ser Ala Gly
His Cys Tyr Asp Ala Val Gly Arg Asp35 40
45Lys Ser Arg Ile Gln Val Arg Leu Gly Glu His Asn Ile Glu Val Leu50
55 60Glu Gly Asn Glu Gln Phe Ile Asn Ala Ala
Lys Ile Ile Arg His Pro65 70 75
80Gln Tyr Asp Arg Lys Thr Leu Asn Asn Asp Ile Met Leu Ile Lys
Leu85 90 95Ser Ser Arg Ala Val Ile Asn
Ala Arg Val Ser Thr Ile Ser Leu Pro100 105
110Thr Ala Pro Pro Ala Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly115
120 125Asn Thr Ile Thr Asn Ser Thr Ala Ser
Ser Gly Ala Asp Tyr Pro Asp130 135 140Glu
Leu Gln Cys Leu Asp Ala Pro Val Leu Ser Gln Ala Lys Cys Glu145
150 155 160Ala Ser Tyr Pro Gly Lys
Ile Thr Ser Asn Met Phe Cys Val Gly Phe165 170
175Leu Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro
Val180 185 190Val Cys Asn Gly Gln Leu Gln
Gly Val Val Ser Trp Gly Asp Gly Cys195 200
205Ala Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys Val Tyr Asn Tyr Val210
215 220Lys Trp Ile Lys Asn Thr Ile Ala Ala
Asn Ser225 230 2353235PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
3Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val Pro Tyr Gln Val1
5 10 15Ser Leu Asn Ser Gly Tyr
His Phe Cys Gly Gly Ser Leu Ile Asn Glu20 25
30Gln Trp Val Val Ser Ala Gly His Cys Tyr Asn Gly Arg Asp Leu Glu35
40 45Lys Ser Arg Ile Gln Val Arg Leu Gly
Glu His Asn Ile Glu Val Leu50 55 60Glu
Gly Asn Glu Gln Phe Ile Asn Ala Ala Lys Ile Ile Arg His Pro65
70 75 80Gln Tyr Asp Arg Lys Thr
Leu Asn Asn Asp Ile Met Leu Ile Lys Leu85 90
95Ser Ser Arg Ala Val Ile Asn Ala Arg Val Ser Thr Ile Ser Leu Pro100
105 110Thr Ala Pro Pro Ala Thr Gly Thr
Lys Cys Leu Ile Ser Gly Trp Gly115 120
125Asn Val Arg Gly Thr Trp Thr Ala Ser Ser Gly Ala Asp Tyr Pro Asp130
135 140Glu Leu Gln Cys Leu Asp Ala Pro Val
Leu Ser Gln Ala Lys Cys Glu145 150 155
160Ala Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met Phe Cys Val
Gly Phe165 170 175Leu Glu Gly Gly Lys Asp
Ser Cys Gln Gly Asp Ser Gly Gly Pro Val180 185
190Val Cys Asn Gly Gln Leu Gln Gly Val Val Ser Trp Gly Asp Gly
Cys195 200 205Ala Gln Lys Asn Lys Pro Gly
Val Tyr Thr Lys Val Tyr Asn Tyr Val210 215
220Lys Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser225 230
2354235PRTArtificial SequenceDescription of Artificial
Sequence = Synthetic Construct 4Ile Val Gly Gly Tyr Asn Cys Glu Glu
Asn Ser Val Pro Tyr Gln Val1 5 10
15Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn
Glu20 25 30Gln Trp Val Val Ser Ala Gly
His Cys Tyr Ala Ala Thr Asn Gly Asp35 40
45Lys Ser Arg Ile Gln Val Arg Leu Gly Glu His Asn Ile Glu Val Leu50
55 60Glu Gly Asn Glu Gln Phe Ile Asn Ala Ala
Lys Ile Ile Arg His Pro65 70 75
80Gln Tyr Asp Arg Lys Thr Leu Asn Asn Asp Ile Met Leu Ile Lys
Leu85 90 95Ser Ser Arg Ala Val Ile Asn
Ala Arg Val Ser Thr Ile Ser Leu Pro100 105
110Thr Ala Pro Pro Ala Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly115
120 125Asn Arg Lys Asp Phe Trp Thr Ala Ser
Ser Gly Ala Asp Tyr Pro Asp130 135 140Glu
Leu Gln Cys Leu Asp Ala Pro Val Leu Ser Gln Ala Lys Cys Glu145
150 155 160Ala Ser Tyr Pro Gly Lys
Ile Thr Ser Asn Met Phe Cys Val Gly Phe165 170
175Leu Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro
Val180 185 190Val Cys Asn Gly Gln Leu Gln
Gly Val Val Ser Trp Gly Asp Gly Cys195 200
205Ala Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys Val Tyr Asn Tyr Val210
215 220Lys Trp Ile Lys Asn Thr Ile Ala Ala
Asn Ser225 230 2355259PRTHomo sapiens
5Ile Val Glu Gly Ser Asp Ala Glu Ile Gly Met Ser Pro Trp Gln Val1
5 10 15Met Leu Phe Arg Lys Ser
Pro Gln Glu Leu Leu Cys Gly Ala Ser Leu20 25
30Ile Ser Asp Arg Trp Val Leu Thr Ala Ala His Cys Leu Leu Tyr Pro35
40 45Pro Trp Asp Lys Asn Phe Thr Glu Asn
Asp Leu Leu Val Arg Ile Gly50 55 60Lys
His Ser Arg Thr Arg Tyr Glu Arg Asn Ile Glu Lys Ile Ser Met65
70 75 80Leu Glu Lys Ile Tyr Ile
His Pro Arg Tyr Asn Trp Arg Glu Asn Leu85 90
95Asp Arg Asp Ile Ala Leu Met Lys Leu Lys Lys Pro Val Ala Phe Ser100
105 110Asp Tyr Ile His Pro Val Cys Leu
Pro Asp Arg Glu Thr Ala Ala Ser115 120
125Leu Leu Gln Ala Gly Tyr Lys Gly Arg Val Thr Gly Trp Gly Asn Leu130
135 140Lys Glu Thr Trp Thr Ala Asn Val Gly
Lys Gly Gln Pro Ser Val Leu145 150 155
160Gln Val Val Asn Leu Pro Ile Val Glu Arg Pro Val Cys Lys
Asp Ser165 170 175Thr Arg Ile Arg Ile Thr
Asp Asn Met Phe Cys Ala Gly Tyr Lys Pro180 185
190Asp Glu Gly Lys Arg Gly Asp Ala Cys Glu Gly Asp Ser Gly Gly
Pro195 200 205Phe Val Met Lys Ser Pro Phe
Asn Asn Arg Trp Tyr Gln Met Gly Ile210 215
220Val Ser Trp Gly Glu Gly Cys Asp Arg Asp Gly Lys Tyr Gly Phe Tyr225
230 235 240Thr His Val Phe
Arg Leu Lys Lys Trp Ile Gln Lys Val Ile Asp Gln245 250
255Phe Gly Glu6235PRTHomo sapiens 6Ile Val Gly Gly Ser Asn
Ala Lys Glu Gly Ala Trp Pro Trp Val Val1 5
10 15Gly Leu Tyr Tyr Gly Gly Arg Leu Leu Cys Gly Ala
Ser Leu Val Ser20 25 30Ser Asp Trp Leu
Val Ser Ala Ala His Cys Val Tyr Gly Arg Asn Leu35 40
45Glu Pro Ser Lys Trp Thr Ala Ile Leu Gly Leu His Met Lys
Ser Asn50 55 60Leu Thr Ser Pro Gln Thr
Val Pro Arg Leu Ile Asp Glu Ile Val Ile65 70
75 80Asn Pro His Tyr Asn Arg Arg Arg Lys Asp Asn
Asp Ile Ala Met Met85 90 95His Leu Glu
Phe Lys Val Asn Tyr Thr Asp Tyr Ile Gln Pro Ile Cys100
105 110Leu Pro Glu Glu Asn Gln Val Phe Pro Pro Gly Arg
Asn Cys Ser Ile115 120 125Ala Gly Trp Gly
Thr Val Val Tyr Gln Gly Thr Thr Ala Asn Ile Leu130 135
140Gln Glu Ala Asp Val Pro Leu Leu Ser Asn Glu Arg Cys Gln
Gln Gln145 150 155 160Met
Pro Glu Tyr Asn Ile Thr Glu Asn Met Ile Cys Ala Gly Tyr Glu165
170 175Glu Gly Gly Ile Asp Ser Cys Gln Gly Asp Ser
Gly Gly Pro Leu Met180 185 190Cys Gln Glu
Asn Asn Arg Trp Phe Leu Ala Gly Val Thr Ser Phe Gly195
200 205Tyr Lys Cys Ala Leu Pro Asn Arg Pro Gly Val Tyr
Ala Arg Val Ser210 215 220Arg Phe Thr Glu
Trp Ile Gln Ser Phe Leu His225 230
2357275PRTBacillus subtilis 7Ile Ala His Glu Tyr Ala Gln Ser Val Pro Tyr
Gly Ile Ser Gln Ile1 5 10
15Lys Ala Pro Ala Leu His Ser Gln Gly Tyr Thr Gly Ser Asn Val Lys20
25 30Val Ala Val Ile Asp Ser Gly Ile Asp Ser
Ser His Pro Asp Leu Asn35 40 45Val Arg
Gly Gly Ala Ser Phe Val Pro Ser Glu Thr Asn Pro Tyr Gln50
55 60Asp Gly Ser Ser His Gly Thr His Val Ala Gly Thr
Ile Ala Ala Leu65 70 75
80Asn Asn Ser Ile Gly Val Leu Gly Val Ser Pro Ser Ala Ser Leu Tyr85
90 95Ala Val Lys Val Leu Asp Ser Thr Gly Ser
Gly Gln Tyr Ser Trp Ile100 105 110Ile Asn
Gly Ile Glu Trp Ala Ile Ser Asn Asn Met Asp Val Ile Asn115
120 125Met Ser Leu Gly Gly Pro Thr Gly Ser Thr Ala Leu
Lys Thr Val Val130 135 140Asp Lys Ala Val
Ser Ser Gly Ile Val Val Ala Ala Ala Ala Gly Asn145 150
155 160Glu Gly Ser Ser Gly Ser Thr Ser Thr
Val Gly Tyr Pro Ala Lys Tyr165 170 175Pro
Ser Thr Ile Ala Val Gly Ala Val Asn Ser Ser Asn Gln Arg Ala180
185 190Ser Phe Ser Ser Ala Gly Ser Glu Leu Asp Val
Met Ala Pro Gly Val195 200 205Ser Ile Gln
Ser Thr Leu Pro Gly Gly Thr Tyr Gly Ala Tyr Asn Gly210
215 220Thr Ser Met Ala Thr Pro His Val Ala Gly Ala Ala
Ala Leu Ile Leu225 230 235
240Ser Lys His Pro Thr Trp Thr Asn Ala Gln Val Arg Asp Arg Leu Glu245
250 255Ser Thr Ala Thr Tyr Leu Gly Asn Ser
Phe Tyr Tyr Gly Lys Gly Leu260 265 270Ile
Asn Val2758320PRTArtificial SequenceDescription of Artificial Sequence =
Synthetic Construct 8Val Ala Lys Arg Arg Ala Lys Arg Asp Val Tyr
Gln Glu Pro Thr Asp1 5 10
15Pro Lys Phe Pro Gln Gln Trp Tyr Leu Ser Gly Val Thr Gln Arg Asp20
25 30Leu Asn Val Lys Glu Ala Trp Ala Gln Gly
Phe Thr Gly His Gly Ile35 40 45Val Val
Ser Ile Leu Asp Asp Gly Ile Glu Lys Asn His Pro Asp Leu50
55 60Ala Gly Asn Tyr Asp Pro Gly Ala Ser Phe Asp Val
Asn Asp Gln Asp65 70 75
80Pro Asp Pro Gln Pro Arg Tyr Thr Gln Met Asn Asp Asn Arg His Gly85
90 95Thr Arg Cys Ala Gly Glu Val Ala Ala Val
Ala Asn Asn Gly Val Cys100 105 110Gly Val
Gly Val Ala Tyr Asn Ala Arg Ile Gly Gly Val Arg Met Leu115
120 125Asp Gly Glu Val Thr Asp Ala Val Glu Ala Arg Ser
Leu Gly Leu Asn130 135 140Pro Asn His Ile
His Ile Tyr Ser Ala Ser Trp Gly Pro Glu Asp Asp145 150
155 160Gly Lys Thr Val Asp Gly Pro Ala Arg
Leu Ala Glu Glu Ala Phe Phe165 170 175Arg
Gly Val Ser Gln Gly Arg Gly Gly Leu Gly Ser Ile Phe Val Trp180
185 190Ala Ser Gly Asn Gly Gly Arg Glu His Asp Ser
Cys Asn Cys Asp Gly195 200 205Tyr Thr Asn
Ser Ile Tyr Thr Leu Ser Ile Ser Ser Ala Thr Gln Phe210
215 220Gly Asn Val Pro Trp Tyr Ser Glu Ala Cys Ser Ser
Thr Leu Ala Thr225 230 235
240Thr Tyr Ser Ser Gly Asn Gln Asn Glu Lys Gln Ile Val Thr Thr Asp245
250 255Leu Arg Gln Lys Cys Thr Glu Ser His
Thr Gly Thr Ser Ala Ser Ala260 265 270Pro
Leu Ala Ala Gly Ile Ile Ala Leu Thr Leu Glu Ala Asn Lys Asn275
280 285Leu Thr Trp Arg Asp Met Gln His Leu Val Val
Gln Thr Ser Lys Pro290 295 300Ala His Leu
Asn Ala Asp Asp Trp Ala Thr Asn Gly Val Gly Arg Lys305
310 315 3209330PRTHomo sapiens 9Glu Lys
Glu Arg Ser Lys Arg Ser Ala Leu Arg Asp Ser Ala Leu Asn1 5
10 15Leu Phe Asn Asp Pro Met Trp Asn
Gln Gln Trp Tyr Leu Gln Asp Thr20 25
30Arg Met Thr Ala Ala Leu Pro Lys Leu Asp Leu His Val Ile Pro Val35
40 45Trp Gln Lys Gly Ile Thr Gly Lys Gly Val
Val Ile Thr Val Leu Asp50 55 60Asp Gly
Leu Glu Trp Asn His Thr Asp Ile Tyr Ala Asn Tyr Asp Pro65
70 75 80Glu Ala Ser Tyr Asp Phe Asn
Asp Asn Asp His Asp Pro Phe Pro Arg85 90
95Tyr Asp Pro Thr Asn Glu Asn Lys His Gly Thr Arg Cys Ala Gly Glu100
105 110Ile Ala Met Gln Ala Asn Asn His Lys
Cys Gly Val Gly Val Ala Tyr115 120 125Asn
Ser Lys Val Gly Gly Ile Arg Met Leu Asp Gly Ile Val Thr Asp130
135 140Ala Ile Glu Ala Ser Ser Ile Gly Phe Asn Pro
Gly His Val Asp Ile145 150 155
160Tyr Ser Ala Ser Trp Gly Pro Asn Asp Asp Gly Lys Thr Val Glu
Gly165 170 175Pro Gly Arg Leu Ala Gln Lys
Ala Phe Glu Tyr Gly Val Lys Gln Gly180 185
190Arg Gln Gly Lys Gly Ser Ile Phe Val Trp Ala Ser Gly Asn Gly Gly195
200 205Arg Gln Gly Asp Asn Cys Asp Cys Asp
Gly Tyr Thr Asp Ser Ile Tyr210 215 220Thr
Ile Ser Ile Ser Ser Ala Ser Gln Gln Gly Leu Ser Pro Trp Tyr225
230 235 240Ala Glu Lys Cys Ser Ser
Thr Leu Ala Thr Ser Tyr Ser Ser Gly Asp245 250
255Tyr Thr Asp Gln Arg Ile Thr Ser Ala Asp Leu His Asn Asp Cys
Thr260 265 270Glu Thr His Thr Gly Thr Ser
Ala Ser Ala Pro Leu Ala Ala Gly Ile275 280
285Phe Ala Leu Ala Leu Glu Ala Asn Pro Asn Leu Thr Trp Arg Asp Met290
295 300Gln His Leu Val Val Trp Thr Ser Glu
Tyr Asp Pro Leu Ala Asn Asn305 310 315
320Pro Gly Trp Lys Lys Asn Gly Ala Gly Leu325
33010297PRTHomo sapiens 10Asn Thr His Pro Cys Gln Ser Asp Met Asn Ile
Glu Gly Ala Trp Lys1 5 10
15Arg Gly Tyr Thr Gly Lys Asn Ile Val Val Thr Ile Leu Asp Asp Gly20
25 30Ile Glu Arg Thr His Pro Asp Leu Met Gln
Asn Tyr Asp Ala Leu Ala35 40 45Ser Cys
Asp Val Asn Gly Asn Asp Leu Asp Pro Met Pro Arg Tyr Asp50
55 60Ala Ser Asn Glu Asn Lys His Gly Thr Arg Cys Ala
Gly Glu Val Ala65 70 75
80Ala Ala Ala Asn Asn Ser His Cys Thr Val Gly Ile Ala Phe Asn Ala85
90 95Lys Ile Gly Gly Val Arg Met Leu Asp Gly
Asp Val Thr Asp Met Val100 105 110Glu Ala
Lys Ser Val Ser Phe Asn Pro Gln His Val His Ile Tyr Ser115
120 125Ala Ser Trp Gly Pro Asp Asp Asp Gly Lys Thr Val
Asp Gly Pro Ala130 135 140Pro Leu Thr Arg
Gln Ala Phe Glu Asn Gly Val Arg Met Gly Arg Arg145 150
155 160Gly Leu Gly Ser Val Phe Val Trp Ala
Ser Gly Asn Gly Gly Arg Ser165 170 175Lys
Asp His Cys Ser Cys Asp Gly Tyr Thr Asn Ser Ile Tyr Thr Ile180
185 190Ser Ile Ser Ser Thr Ala Glu Ser Gly Lys Lys
Pro Trp Tyr Leu Glu195 200 205Glu Cys Ser
Ser Thr Leu Ala Thr Thr Tyr Ser Ser Gly Glu Ser Tyr210
215 220Asp Lys Lys Ile Ile Thr Thr Asp Leu Arg Gln Arg
Cys Thr Asp Asn225 230 235
240His Thr Gly Thr Ser Ala Ser Ala Pro Met Ala Ala Gly Ile Ile Ala245
250 255Leu Ala Leu Glu Ala Asn Pro Phe Leu
Thr Trp Arg Asp Val Gln His260 265 270Val
Ile Val Arg Thr Ser Arg Ala Gly His Leu Asn Ala Asn Asp Trp275
280 285Lys Thr Asn Ala Ala Gly Phe Lys Val290
29511328PRTHomo sapiens 11Thr Leu Val Asp Glu Gln Pro Leu Glu
Asn Tyr Leu Asp Met Glu Tyr1 5 10
15Phe Gly Thr Ile Gly Ile Gly Thr Pro Ala Gln Asp Phe Thr Val
Val20 25 30Phe Asp Thr Gly Ser Ser Asn
Leu Trp Val Pro Ser Val Tyr Cys Ser35 40
45Ser Leu Ala Cys Thr Asn His Asn Arg Phe Asn Pro Glu Asp Ser Ser50
55 60Thr Tyr Gln Ser Thr Ser Glu Thr Val Ser
Ile Thr Tyr Gly Thr Gly65 70 75
80Ser Met Thr Gly Ile Leu Gly Tyr Asp Thr Val Gln Val Gly Gly
Ile85 90 95Ser Asp Thr Asn Gln Ile Phe
Gly Leu Ser Glu Thr Glu Pro Gly Ser100 105
110Phe Leu Tyr Tyr Ala Pro Phe Asp Gly Ile Leu Gly Leu Ala Tyr Pro115
120 125Ser Ile Ser Ser Ser Gly Ala Thr Pro
Val Phe Asp Asn Ile Trp Asn130 135 140Gln
Gly Leu Val Ser Gln Asp Leu Phe Ser Val Tyr Leu Ser Ala Asp145
150 155 160Asp Lys Ser Gly Ser Val
Val Ile Phe Gly Gly Ile Asp Ser Ser Tyr165 170
175Tyr Thr Gly Ser Leu Asn Trp Val Pro Val Thr Val Glu Gly Tyr
Trp180 185 190Gln Ile Thr Val Asp Ser Ile
Thr Met Asn Gly Glu Thr Ile Ala Cys195 200
205Ala Glu Gly Cys Gln Ala Ile Val Asp Thr Gly Thr Ser Leu Leu Thr210
215 220Gly Pro Thr Ser Pro Ile Ala Asn Ile
Gln Ser Asp Ile Gly Ala Ser225 230 235
240Glu Asn Ser Asp Gly Asp Met Val Val Ser Cys Ser Ala Ile
Ser Ser245 250 255Leu Pro Asp Ile Val Phe
Thr Ile Asn Gly Val Gln Tyr Pro Val Pro260 265
270Pro Ser Ala Tyr Ile Leu Gln Ser Glu Gly Ser Cys Ile Ser Gly
Phe275 280 285Gln Gly Met Asn Val Pro Thr
Glu Ser Gly Glu Leu Trp Ile Leu Gly290 295
300Asp Val Phe Ile Arg Gln Tyr Phe Thr Val Phe Asp Arg Ala Asn Asn305
310 315 320Gln Val Gly Leu
Ala Pro Val Ala32512358PRTHomo sapiens 12Glu Met Val Asp Asn Leu Arg Gly
Lys Ser Gly Gln Gly Tyr Tyr Val1 5 10
15Glu Met Thr Val Gly Ser Pro Pro Gln Thr Leu Asn Ile Leu
Val Asp20 25 30Thr Gly Ser Ser Asn Phe
Ala Val Gly Ala Ala Pro His Pro Phe Leu35 40
45His Arg Tyr Tyr Gln Arg Gln Leu Ser Ser Thr Tyr Arg Asp Leu Arg50
55 60Lys Gly Val Tyr Val Pro Tyr Thr Gln
Gly Lys Trp Glu Gly Glu Leu65 70 75
80Gly Thr Asp Leu Val Ser Ile Pro His Gly Pro Asn Val Thr
Val Arg85 90 95Ala Asn Ile Ala Ala Ile
Thr Glu Ser Asp Lys Phe Phe Ile Asn Gly100 105
110Ser Asn Trp Glu Gly Ile Leu Gly Leu Ala Tyr Ala Glu Ile Ala
Arg115 120 125Pro Asp Asp Ser Leu Glu Pro
Phe Phe Asp Ser Leu Val Lys Gln Thr130 135
140His Val Pro Asn Leu Phe Ser Leu Gln Leu Cys Gly Ala Gly Phe Pro145
150 155 160Leu Asn Gln Ser
Glu Val Leu Ala Ser Val Gly Gly Ser Met Ile Ile165 170
175Gly Gly Ile Asp His Ser Leu Tyr Thr Gly Ser Leu Trp Tyr
Thr Pro180 185 190Ile Arg Arg Glu Trp Tyr
Tyr Glu Val Ile Ile Val Arg Val Glu Ile195 200
205Asn Gly Gln Asp Leu Lys Met Asp Cys Lys Glu Tyr Asn Tyr Asp
Lys210 215 220Ser Ile Val Asp Ser Gly Thr
Thr Asn Leu Arg Leu Pro Lys Lys Val225 230
235 240Phe Glu Ala Ala Val Lys Ser Ile Lys Ala Ala Ser
Ser Thr Glu Lys245 250 255Phe Pro Asp Gly
Phe Trp Leu Gly Glu Gln Leu Val Cys Trp Gln Ala260 265
270Gly Thr Thr Pro Trp Asn Ile Phe Pro Val Ile Ser Leu Tyr
Leu Met275 280 285Gly Glu Val Thr Asn Gln
Ser Phe Arg Ile Thr Ile Leu Pro Gln Gln290 295
300Tyr Leu Arg Pro Val Glu Asp Val Ala Thr Ser Gln Asp Asp Cys
Tyr305 310 315 320Lys Phe
Ala Ile Ser Gln Ser Ser Thr Gly Thr Val Met Gly Ala Val325
330 335Ile Met Glu Gly Phe Tyr Val Val Phe Asp Arg Ala
Arg Lys Arg Ile340 345 350Gly Phe Ala Val
Ser Ala35513351PRTHomo sapiens 13Pro Ala Val Thr Glu Gly Pro Ile Pro Glu
Val Leu Lys Asn Tyr Met1 5 10
15Asp Ala Gln Tyr Tyr Gly Glu Ile Gly Ile Gly Thr Pro Pro Gln Cys20
25 30Phe Thr Val Val Phe Asp Thr Gly Ser
Ser Asn Leu Trp Val Pro Ser35 40 45Ile
His Cys Lys Leu Leu Asp Ile Ala Cys Trp Ile His His Lys Tyr50
55 60Asn Ser Asp Lys Ser Ser Thr Tyr Val Lys Asn
Gly Thr Ser Phe Asp65 70 75
80Ile His Tyr Gly Ser Gly Ser Leu Ser Gly Tyr Leu Ser Gln Asp Thr85
90 95Val Ser Val Pro Cys Gln Ser Ala Ser
Ser Ala Ser Ala Leu Gly Gly100 105 110Val
Lys Val Glu Arg Gln Val Phe Gly Glu Ala Thr Lys Gln Pro Gly115
120 125Ile Thr Phe Ile Ala Ala Lys Phe Asp Gly Ile
Leu Gly Met Ala Tyr130 135 140Pro Arg Ile
Ser Val Asn Asn Val Leu Pro Val Phe Asp Asn Leu Met145
150 155 160Gln Gln Lys Leu Val Asp Gln
Asn Ile Phe Ser Phe Tyr Leu Ser Arg165 170
175Asp Pro Asp Ala Gln Pro Gly Gly Glu Leu Met Leu Gly Gly Thr Asp180
185 190Ser Lys Tyr Tyr Lys Gly Ser Leu Ser
Tyr Leu Asn Val Thr Arg Lys195 200 205Ala
Tyr Trp Gln Val His Leu Asp Gln Val Glu Val Ala Ser Gly Leu210
215 220Thr Leu Cys Lys Glu Gly Cys Glu Ala Ile Val
Asp Thr Gly Thr Ser225 230 235
240Leu Met Val Gly Pro Val Asp Glu Val Arg Glu Leu Gln Lys Ala
Ile245 250 255Gly Ala Val Pro Leu Ile Gln
Gly Glu Tyr Met Ile Pro Cys Glu Lys260 265
270Val Ser Thr Leu Pro Ala Ile Thr Leu Lys Leu Gly Gly Lys Gly Tyr275
280 285Lys Leu Ser Pro Glu Asp Tyr Thr Leu
Lys Val Ser Gln Ala Gly Lys290 295 300Thr
Leu Cys Leu Ser Gly Phe Met Gly Met Asp Ile Pro Pro Pro Ser305
310 315 320Gly Pro Leu Trp Ile Leu
Gly Asp Val Phe Ile Gly Arg Tyr Tyr Thr325 330
335Val Phe Asp Arg Asp Asn Asn Arg Val Gly Phe Ala Glu Ala Ala340
345 35014305PRTHomo sapiens 14Met Leu Glu
Ala Asp Asp Gln Gly Cys Ile Glu Glu Gln Gly Val Glu1 5
10 15Asp Ser Ala Asn Glu Asp Ser Val Asp
Ala Lys Pro Asp Arg Ser Ser20 25 30Phe
Val Pro Ser Leu Phe Ser Lys Lys Lys Lys Asn Val Thr Met Arg35
40 45Ser Ile Lys Thr Thr Arg Asp Arg Val Pro Thr
Tyr Gln Tyr Asn Met50 55 60Asn Phe Glu
Lys Leu Gly Lys Cys Ile Ile Ile Asn Asn Lys Asn Phe65 70
75 80Asp Lys Val Thr Gly Met Gly Val
Arg Asn Gly Thr Asp Lys Asp Ala85 90
95Glu Ala Leu Phe Lys Cys Phe Arg Ser Leu Gly Phe Asp Val Ile Val100
105 110Tyr Asn Asp Cys Ser Cys Ala Lys Met Gln
Asp Leu Leu Lys Lys Ala115 120 125Ser Glu
Glu Asp His Thr Asn Ala Ala Cys Phe Ala Cys Ile Leu Leu130
135 140Ser His Gly Glu Glu Asn Val Ile Tyr Gly Lys Asp
Gly Val Thr Pro145 150 155
160Ile Lys Asp Leu Thr Ala His Phe Arg Gly Asp Arg Ser Lys Thr Leu165
170 175Leu Glu Lys Pro Lys Leu Phe Phe Ile
Gln Ala Cys Arg Gly Thr Glu180 185 190Leu
Asp Asp Gly Ile Gln Ala Asp Ser Gly Pro Ile Asn Asp Thr Asp195
200 205Ala Asn Pro Arg Tyr Lys Ile Pro Val Glu Ala
Asp Phe Leu Phe Ala210 215 220Tyr Ser Thr
Val Pro Gly Tyr Tyr Ser Trp Arg Ser Pro Gly Arg Gly225
230 235 240Ser Trp Phe Val Gln Ala Leu
Cys Ser Ile Leu Glu Glu His Gly Lys245 250
255Asp Leu Glu Ile Met Gln Ile Leu Thr Arg Val Asn Asp Arg Val Ala260
265 270Arg His Phe Glu Ser Gln Ser Asp Asp
Pro His Phe His Glu Lys Lys275 280 285Gln
Ile Pro Cys Val Val Ser Met Leu Thr Lys Glu Leu Tyr Phe Ser290
295 300Gln30515262PRTStreptomyces sp. K15 15Val Thr
Lys Pro Thr Ile Ala Ala Val Gly Gly Tyr Ala Met Asn Asn1 5
10 15Gly Thr Gly Thr Thr Leu Tyr Thr
Lys Ala Ala Asp Thr Arg Arg Ser20 25
30Thr Gly Ser Thr Thr Lys Ile Met Thr Ala Lys Val Val Leu Ala Gln35
40 45Ser Asn Leu Asn Leu Asp Ala Lys Val Thr
Ile Gln Lys Ala Tyr Ser50 55 60Asp Tyr
Val Val Ala Asn Asn Ala Ser Gln Ala His Leu Ile Val Gly65
70 75 80Asp Lys Val Thr Val Arg Gln
Leu Leu Tyr Gly Leu Met Leu Pro Ser85 90
95Gly Cys Asp Ala Ala Tyr Ala Leu Ala Asp Lys Tyr Gly Ser Gly Ser100
105 110Thr Arg Ala Ala Arg Val Lys Ser Phe
Ile Gly Lys Met Asn Thr Ala115 120 125Ala
Thr Asn Leu Gly Leu His Asn Thr His Phe Asp Ser Phe Asp Gly130
135 140Ile Gly Asn Gly Ala Asn Tyr Ser Thr Pro Arg
Asp Leu Thr Lys Ile145 150 155
160Ala Ser Ser Ala Met Lys Asn Ser Thr Phe Arg Thr Val Val Lys
Thr165 170 175Lys Ala Tyr Thr Ala Lys Thr
Val Thr Lys Thr Gly Ser Ile Arg Thr180 185
190Met Asp Thr Trp Lys Asn Thr Asn Gly Leu Leu Ser Ser Tyr Ser Gly195
200 205Ala Ile Gly Val Lys Thr Gly Ser Gly
Pro Glu Ala Lys Tyr Cys Leu210 215 220Val
Phe Ala Ala Thr Arg Gly Gly Lys Thr Val Ile Gly Thr Val Leu225
230 235 240Ala Ser Thr Ser Ile Pro
Ala Arg Glu Ser Asp Ala Thr Lys Ile Met245 250
255Asn Tyr Gly Phe Ala Leu26016256PRTHuman cytomegalovirus 16Met Thr
Met Asp Glu Gln Gln Ser Gln Ala Val Ala Pro Val Tyr Val1 5
10 15Gly Gly Phe Leu Ala Arg Tyr Asp
Gln Ser Pro Asp Glu Ala Glu Leu20 25
30Leu Leu Pro Arg Asp Val Val Glu His Trp Leu His Ala Gln Gly Gln35
40 45Gly Gln Pro Ser Leu Ser Val Ala Leu Pro
Leu Asn Ile Asn His Asp50 55 60Asp Thr
Ala Val Val Gly His Val Ala Ala Met Gln Ser Val Arg Asp65
70 75 80Gly Leu Phe Cys Leu Gly Cys
Val Thr Ser Pro Arg Phe Leu Glu Ile85 90
95Val Arg Arg Ala Ser Glu Lys Ser Glu Leu Val Ser Arg Gly Pro Val100
105 110Ser Pro Leu Gln Pro Asp Lys Val Val
Glu Phe Leu Ser Gly Ser Tyr115 120 125Ala
Gly Leu Ser Leu Ser Ser Arg Arg Cys Asp Asp Val Glu Gln Ala130
135 140Thr Ser Leu Ser Gly Ser Glu Thr Thr Pro Phe
Lys His Val Ala Leu145 150 155
160Cys Ser Val Gly Arg Arg Arg Gly Thr Leu Ala Val Tyr Gly Arg
Asp165 170 175Pro Glu Trp Val Thr Gln Arg
Phe Pro Asp Leu Thr Ala Ala Asp Arg180 185
190Asp Gly Leu Arg Ala Gln Trp Gln Arg Cys Gly Ser Thr Ala Val Asp195
200 205Ala Ser Gly Asp Pro Phe Arg Ser Asp
Ser Tyr Gly Leu Leu Gly Asn210 215 220Ser
Val Asp Ala Leu Tyr Ile Arg Glu Arg Leu Pro Lys Leu Arg Tyr225
230 235 240Asp Lys Gln Leu Val Gly
Val Thr Glu Arg Glu Ser Tyr Val Lys Ala245 250
25517248PRTEscherichia coli 17Val Arg Ser Phe Ile Tyr Glu Pro Phe
Gln Ile Pro Ser Gly Ser Met1 5 10
15Met Pro Thr Leu Leu Ile Gly Asp Phe Ile Leu Val Glu Lys Phe
Ala20 25 30Tyr Gly Ile Lys Asp Pro Ile
Tyr Gln Lys Thr Leu Ile Glu Thr Gly35 40
45His Pro Lys Arg Gly Asp Ile Val Val Phe Lys Tyr Pro Glu Asp Pro50
55 60Lys Leu Asp Tyr Ile Lys Arg Ala Val Gly
Leu Pro Gly Asp Lys Val65 70 75
80Thr Tyr Asp Pro Val Ser Lys Glu Leu Thr Ile Gln Pro Gly Cys
Ser85 90 95Ser Gly Gln Ala Cys Glu Asn
Ala Leu Pro Val Thr Tyr Ser Asn Val100 105
110Glu Pro Ser Asp Phe Val Gln Thr Phe Ser Arg Arg Asn Gly Gly Glu115
120 125Ala Thr Ser Gly Phe Phe Glu Val Pro
Lys Asn Glu Thr Lys Glu Asn130 135 140Gly
Ile Arg Leu Ser Glu Arg Lys Glu Thr Leu Gly Asp Val Thr His145
150 155 160Arg Ile Leu Thr Val Pro
Ile Ala Gln Asp Gln Val Gly Met Tyr Tyr165 170
175Gln Gln Pro Gly Gln Gln Leu Ala Thr Trp Ile Val Pro Pro Gly
Gln180 185 190Tyr Phe Met Met Gly Asp Asn
Arg Asp Asn Ser Ala Asp Ser Arg Tyr195 200
205Trp Gly Phe Val Pro Glu Ala Asn Leu Val Gly Arg Ala Thr Ala Ile210
215 220Trp Met Ser Phe Asp Lys Gln Glu Gly
Glu Trp Pro Thr Gly Leu Arg225 230 235
240Leu Ser Arg Ile Gly Gly Ile His24518317PRTSerratia
marcescens 18Met Glu Gln Leu Arg Gly Leu Tyr Pro Pro Leu Ala Ala Tyr Asp
Ser1 5 10 15Gly Trp Leu
Asp Thr Gly Asp Gly His Arg Ile Tyr Trp Glu Leu Ser20 25
30Gly Asn Pro Asn Gly Lys Pro Ala Val Phe Ile His Gly
Gly Pro Gly35 40 45Gly Gly Ile Ser Pro
His His Arg Gln Leu Phe Asp Pro Glu Arg Tyr50 55
60Lys Val Leu Leu Phe Asp Gln Arg Gly Cys Gly Arg Ser Arg Pro
His65 70 75 80Ala Ser
Leu Asp Asn Asn Thr Thr Trp His Leu Val Ala Asp Ile Glu85
90 95Arg Leu Arg Glu Met Ala Gly Val Glu Gln Trp Leu
Val Phe Gly Gly100 105 110Ser Trp Gly Ser
Thr Leu Ala Leu Ala Tyr Ala Gln Thr His Pro Glu115 120
125Arg Val Ser Glu Met Val Leu Arg Gly Ile Phe Thr Leu Arg
Lys Gln130 135 140Arg Leu His Trp Tyr Tyr
Gln Asp Gly Ala Ser Arg Phe Phe Pro Glu145 150
155 160Lys Trp Glu Arg Val Leu Ser Ile Leu Ser Asp
Asp Glu Arg Lys Asp165 170 175Val Ile Ala
Ala Tyr Arg Gln Arg Leu Thr Ser Ala Asp Pro Gln Val180
185 190Gln Leu Glu Ala Ala Lys Leu Trp Ser Val Trp Glu
Gly Glu Thr Val195 200 205Thr Leu Leu Pro
Ser Arg Glu Ser Ala Ser Phe Gly Glu Asp Asp Phe210 215
220Ala Leu Ala Phe Ala Arg Ile Glu Asn His Tyr Phe Thr His
Leu Gly225 230 235 240Phe
Leu Glu Ser Asp Asp Gln Leu Leu Arg Asn Val Pro Leu Ile Arg245
250 255His Ile Pro Ala Val Ile Val His Gly Arg Tyr
Asp Met Ala Cys Gln260 265 270Val Gln Asn
Ala Trp Asp Leu Ala Lys Ala Trp Pro Glu Ala Glu Leu275
280 285His Ile Val Glu Gly Ala Gly His Ser Tyr Asp Glu
Pro Gly Ile Leu290 295 300His Gln Leu Met
Ile Ala Thr Asp Arg Phe Ala Gly Lys305 310
31519229PRTEscherichia coli 19Met Glu Leu Leu Leu Leu Ser Asn Ser Thr
Leu Pro Gly Lys Ala Trp1 5 10
15Leu Glu His Ala Leu Pro Leu Ile Ala Asn Gln Leu Asn Gly Arg Arg20
25 30Ser Ala Val Phe Ile Pro Phe Ala Gly
Val Thr Gln Thr Trp Asp Glu35 40 45Tyr
Thr Asp Lys Thr Ala Glu Val Leu Ala Pro Leu Gly Val Asn Val50
55 60Thr Gly Ile His Arg Val Ala Asp Pro Leu Ala
Ala Ile Glu Lys Ala65 70 75
80Glu Ile Ile Ile Val Gly Gly Gly Asn Thr Phe Gln Leu Leu Lys Glu85
90 95Ser Arg Glu Arg Gly Leu Leu Ala Pro
Met Ala Asp Arg Val Lys Arg100 105 110Gly
Ala Leu Tyr Ile Gly Trp Ser Ala Gly Ala Asn Leu Ala Cys Pro115
120 125Thr Ile Arg Thr Thr Asn Asp Met Pro Ile Val
Asp Pro Asn Gly Phe130 135 140Asp Ala Leu
Asp Leu Phe Pro Leu Gln Ile Asn Pro His Phe Thr Asn145
150 155 160Ala Leu Pro Glu Gly His Lys
Gly Glu Thr Arg Glu Gln Arg Ile Arg165 170
175Glu Leu Leu Val Val Ala Pro Glu Leu Thr Val Ile Gly Leu Pro Glu180
185 190Gly Asn Trp Ile Gln Val Ser Asn Gly
Gln Ala Val Leu Gly Gly Pro195 200 205Asn
Thr Thr Trp Val Phe Lys Ala Gly Glu Glu Ala Val Ala Leu Glu210
215 220Ala Gly His Arg Phe2252099PRTHuman
immunodeficiency virus 20Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr
Val Lys Ile Gly1 5 10
15Gly Gln Leu Arg Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val20
25 30Leu Glu Asp Ile Asn Leu Pro Gly Lys Trp
Lys Pro Lys Met Ile Gly35 40 45Gly Ile
Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Ile50
55 60Glu Ile Cys Gly Lys Lys Ala Ile Gly Thr Val Leu
Val Gly Pro Thr65 70 75
80Pro Val Asn Ile Ile Gly Arg Asn Met Leu Thr Gln Ile Gly Cys Thr85
90 95Leu Asn Phe21297PRTEscherichia coli
21Ser Thr Glu Thr Leu Ser Phe Thr Pro Asp Asn Ile Asn Ala Asp Ile1
5 10 15Ser Leu Gly Thr Leu Ser
Gly Lys Thr Lys Glu Arg Val Tyr Leu Ala20 25
30Glu Glu Gly Gly Arg Lys Val Ser Gln Leu Asp Trp Lys Phe Asn Asn35
40 45Ala Ala Ile Ile Lys Gly Ala Ile Asn
Trp Asp Leu Met Pro Gln Ile50 55 60Ser
Ile Gly Ala Ala Gly Trp Thr Thr Leu Gly Ser Arg Gly Gly Asn65
70 75 80Met Val Asp Gln Asp Trp
Met Asp Ser Ser Asn Pro Gly Thr Trp Thr85 90
95Asp Glu Ala Arg His Pro Asp Thr Gln Leu Asn Tyr Ala Asn Glu Phe100
105 110Asp Leu Asn Ile Lys Gly Trp Leu
Leu Asn Glu Pro Asn Tyr Arg Leu115 120
125Gly Leu Met Ala Gly Tyr Gln Glu Ser Arg Tyr Ser Phe Thr Ala Arg130
135 140Gly Gly Ser Tyr Ile Tyr Ser Ser Glu
Glu Gly Phe Arg Asp Asp Ile145 150 155
160Gly Ser Phe Pro Asn Gly Glu Arg Ala Ile Gly Tyr Lys Gln
Arg Phe165 170 175Lys Met Pro Tyr Ile Gly
Leu Thr Gly Ser Tyr Arg Tyr Glu Asp Phe180 185
190Glu Leu Gly Gly Thr Phe Lys Tyr Ser Gly Trp Val Glu Ser Ser
Asp195 200 205Asn Asp Glu His Tyr Asp Pro
Lys Gly Arg Ile Thr Tyr Arg Ser Lys210 215
220Val Lys Asp Gln Asn Tyr Tyr Ser Val Ala Val Asn Ala Gly Tyr Tyr225
230 235 240Val Thr Pro Asn
Ala Lys Val Tyr Val Glu Gly Ala Trp Asn Arg Val245 250
255Thr Asn Lys Lys Gly Asn Thr Ser Leu Tyr Asp His Asn Asn
Asn Thr260 265 270Ser Asp Tyr Ser Lys Asn
Gly Ala Gly Ile Glu Asn Tyr Asn Phe Ile275 280
285Thr Thr Ala Gly Leu Lys Tyr Thr Phe290
29522212PRTCarica papaya 22Ile Pro Glu Tyr Val Asp Trp Arg Gln Lys Gly
Ala Val Thr Pro Val1 5 10
15Lys Asn Gln Gly Ser Cys Gly Ser Cys Trp Ala Phe Ser Ala Val Val20
25 30Thr Ile Glu Gly Ile Ile Lys Ile Arg Thr
Gly Asn Leu Asn Gln Tyr35 40 45Ser Glu
Gln Glu Leu Leu Asp Cys Asp Arg Arg Ser Tyr Gly Cys Asn50
55 60Gly Gly Tyr Pro Trp Ser Ala Leu Gln Leu Val Ala
Gln Tyr Gly Ile65 70 75
80His Tyr Arg Asn Thr Tyr Pro Tyr Glu Gly Val Gln Arg Tyr Cys Arg85
90 95Ser Arg Glu Lys Gly Pro Tyr Ala Ala Lys
Thr Asp Gly Val Arg Gln100 105 110Val Gln
Pro Tyr Asn Gln Gly Ala Leu Leu Tyr Ser Ile Ala Asn Gln115
120 125Pro Val Ser Val Val Leu Gln Ala Ala Gly Lys Asp
Phe Gln Leu Tyr130 135 140Arg Gly Gly Ile
Phe Val Gly Pro Cys Gly Asn Lys Val Asp His Ala145 150
155 160Val Ala Ala Val Gly Tyr Gly Pro Asn
Tyr Ile Leu Ile Lys Asn Ser165 170 175Trp
Gly Thr Gly Trp Gly Glu Asn Gly Tyr Ile Arg Ile Lys Arg Gly180
185 190Thr Gly Asn Ser Tyr Gly Val Cys Gly Leu Tyr
Thr Ser Ser Phe Tyr195 200 205Pro Val Lys
Asn21023699PRTHomo sapiens 23Ala Gly Ile Ala Ala Lys Leu Ala Lys Asp Arg
Glu Ala Ala Glu Gly1 5 10
15Leu Gly Ser His Glu Arg Ala Ile Lys Tyr Leu Asn Gln Asp Tyr Glu20
25 30Ala Leu Arg Asn Glu Cys Leu Glu Ala Gly
Thr Leu Phe Gln Asp Pro35 40 45Ser Phe
Pro Ala Ile Pro Ser Ala Leu Gly Phe Lys Glu Leu Gly Pro50
55 60Tyr Ser Ser Lys Thr Arg Gly Met Arg Trp Lys Arg
Pro Thr Glu Ile65 70 75
80Cys Ala Asp Pro Gln Phe Ile Ile Gly Gly Ala Thr Arg Thr Asp Ile85
90 95Cys Gln Gly Ala Leu Gly Asp Cys Trp Leu
Leu Ala Ala Ile Ala Ser100 105 110Leu Thr
Leu Asn Glu Glu Ile Leu Ala Arg Val Val Pro Leu Asn Gln115
120 125Ser Phe Gln Glu Asn Tyr Ala Gly Ile Phe His Phe
Gln Phe Trp Gln130 135 140Tyr Gly Glu Trp
Val Glu Val Val Val Asp Asp Arg Leu Pro Thr Lys145 150
155 160Asp Gly Glu Leu Leu Phe Val His Ser
Ala Glu Gly Ser Glu Phe Trp165 170 175Ser
Ala Leu Leu Glu Lys Ala Tyr Ala Lys Ile Asn Gly Cys Tyr Glu180
185 190Ala Leu Ser Gly Gly Ala Thr Thr Glu Gly Phe
Glu Asp Phe Thr Gly195 200 205Gly Ile Ala
Glu Trp Tyr Glu Leu Lys Lys Pro Pro Pro Asn Leu Phe210
215 220Lys Ile Ile Gln Lys Ala Leu Gln Lys Gly Ser Leu
Leu Gly Cys Ser225 230 235
240Ile Asp Ile Thr Ser Ala Ala Asp Ser Glu Ala Ile Thr Phe Gln Lys245
250 255Leu Val Lys Gly His Ala Tyr Ser Val
Thr Gly Ala Glu Glu Val Glu260 265 270Ser
Asn Gly Ser Leu Gln Lys Leu Ile Arg Ile Arg Asn Pro Trp Gly275
280 285Glu Val Glu Trp Thr Gly Arg Trp Asn Asp Asn
Cys Pro Ser Trp Asn290 295 300Thr Ile Asp
Pro Glu Glu Arg Glu Arg Leu Thr Arg Arg His Glu Asp305
310 315 320Gly Glu Phe Trp Met Ser Phe
Ser Asp Phe Leu Arg His Tyr Ser Arg325 330
335Leu Glu Ile Cys Asn Leu Thr Pro Asp Thr Leu Thr Ser Asp Thr Tyr340
345 350Lys Lys Trp Lys Leu Thr Lys Met Asp
Gly Asn Trp Arg Arg Gly Ser355 360 365Thr
Ala Gly Gly Cys Arg Asn Tyr Pro Asn Thr Phe Trp Met Asn Pro370
375 380Gln Tyr Leu Ile Lys Leu Glu Glu Glu Asp Glu
Asp Glu Glu Asp Gly385 390 395
400Glu Ser Gly Cys Thr Phe Leu Val Gly Leu Ile Gln Lys His Arg
Arg405 410 415Arg Gln Arg Lys Met Gly Glu
Asp Met His Thr Ile Gly Phe Gly Ile420 425
430Tyr Glu Val Pro Glu Glu Leu Ser Gly Gln Thr Asn Ile His Leu Ser435
440 445Lys Asn Phe Phe Leu Thr Asn Arg Ala
Arg Glu Arg Ser Asp Thr Phe450 455 460Ile
Asn Leu Arg Glu Val Leu Asn Arg Phe Lys Leu Pro Pro Gly Glu465
470 475 480Tyr Ile Leu Val Pro Ser
Thr Phe Glu Pro Asn Lys Asp Gly Asp Phe485 490
495Cys Ile Arg Val Phe Ser Glu Lys Lys Ala Asp Tyr Gln Ala Val
Asp500 505 510Asp Glu Ile Glu Ala Asn Leu
Glu Glu Phe Asp Ile Ser Glu Asp Asp515 520
525Ile Asp Asp Gly Val Arg Arg Leu Phe Ala Gln Leu Ala Gly Glu Asp530
535 540Ala Glu Ile Ser Ala Phe Glu Leu Gln
Thr Ile Leu Arg Arg Val Leu545 550 555
560Ala Lys Arg Gln Asp Ile Lys Ser Asp Gly Phe Ser Ile Glu
Thr Cys565 570 575Lys Ile Met Val Asp Met
Leu Asp Ser Asp Gly Ser Gly Lys Leu Gly580 585
590Leu Lys Glu Phe Tyr Ile Leu Trp Thr Lys Ile Gln Lys Tyr Gln
Lys595 600 605Ile Tyr Arg Glu Ile Asp Val
Asp Arg Ser Gly Thr Met Asn Ser Tyr610 615
620Glu Met Arg Lys Ala Leu Glu Glu Ala Gly Phe Lys Met Pro Cys Gln625
630 635 640Leu His Gln Val
Ile Val Ala Arg Phe Ala Asp Asp Gln Leu Ile Ile645 650
655Asp Phe Asp Asn Phe Val Arg Cys Leu Val Arg Leu Glu Thr
Leu Phe660 665 670Lys Ile Phe Lys Gln Leu
Asp Pro Glu Asn Thr Gly Thr Ile Glu Leu675 680
685Asp Leu Ile Ser Trp Leu Cys Phe Ser Val Leu690
69524221PRTTobacco etch virus 24Gly Glu Ser Leu Phe Lys Gly Pro Arg Asp
Tyr Asn Pro Ile Ser Ser1 5 10
15Thr Ile Cys His Leu Thr Asn Glu Ser Asp Gly His Thr Thr Ser Leu20
25 30Tyr Gly Ile Gly Phe Gly Pro Phe Ile
Ile Thr Asn Lys His Leu Phe35 40 45Arg
Arg Asn Asn Gly Thr Leu Leu Val Gln Ser Leu His Gly Val Phe50
55 60Lys Val Lys Asn Thr Thr Thr Leu Gln Gln His
Leu Ile Asp Gly Arg65 70 75
80Asp Met Ile Ile Ile Arg Met Pro Lys Asp Phe Pro Pro Phe Pro Gln85
90 95Lys Leu Lys Phe Arg Glu Pro Gln Arg
Glu Glu Arg Ile Cys Leu Val100 105 110Thr
Thr Asn Phe Gln Thr Lys Ser Met Ser Ser Met Val Ser Asp Thr115
120 125Ser Cys Thr Phe Pro Ser Ser Asp Gly Ile Phe
Trp Lys His Trp Ile130 135 140Gln Thr Lys
Asp Gly Gln Cys Gly Ser Pro Leu Val Ser Thr Arg Asp145
150 155 160Gly Phe Ile Val Gly Ile His
Ser Ala Ser Asn Phe Thr Asn Thr Asn165 170
175Asn Tyr Phe Thr Ser Val Pro Lys Asn Phe Met Glu Leu Leu Thr Asn180
185 190Gln Glu Ala Gln Gln Trp Val Ser Gly
Trp Arg Leu Asn Ala Asp Ser195 200 205Val
Leu Trp Gly Gly His Lys Val Phe Met Asp Lys Pro210 215
22025371PRTStreptococcus pyogenes 25Asp Gln Asn Phe Ala Arg
Asn Glu Lys Glu Ala Lys Asp Ser Ala Ile1 5
10 15Thr Phe Ile Gln Lys Ser Ala Ala Ile Lys Ala Gly
Ala Arg Ser Ala20 25 30Glu Asp Ile Lys
Leu Asp Lys Val Asn Leu Gly Gly Glu Leu Ser Gly35 40
45Ser Asn Met Tyr Val Tyr Asn Ile Ser Thr Gly Gly Phe Val
Ile Val50 55 60Ser Gly Asp Lys Arg Ser
Pro Glu Ile Leu Gly Tyr Ser Thr Ser Gly65 70
75 80Ser Phe Asp Val Asn Gly Lys Glu Asn Ile Ala
Ser Phe Met Glu Ser85 90 95Tyr Val Glu
Gln Ile Lys Glu Asn Lys Lys Leu Asp Ser Thr Tyr Ala100
105 110Gly Thr Ala Glu Ile Lys Gln Pro Val Val Lys Ser
Leu Leu Asp Ser115 120 125Lys Gly Ile His
Tyr Asn Gln Gly Asn Pro Tyr Asn Leu Leu Thr Pro130 135
140Val Ile Glu Lys Val Lys Pro Gly Glu Gln Ser Phe Val Gly
Gln His145 150 155 160Ala
Ala Thr Gly Ser Val Ala Thr Ala Thr Ala Gln Ile Met Lys Tyr165
170 175His Asn Tyr Pro Asn Lys Gly Leu Lys Asp Tyr
Thr Tyr Thr Leu Ser180 185 190Ser Asn Asn
Pro Tyr Phe Asn His Pro Lys Asn Leu Phe Ala Ala Ile195
200 205Ser Thr Arg Gln Tyr Asn Trp Asn Asn Ile Leu Pro
Thr Tyr Ser Gly210 215 220Arg Glu Ser Asn
Val Gln Lys Met Ala Ile Ser Glu Leu Met Ala Asp225 230
235 240Val Gly Ile Ser Val Asp Met Asp Tyr
Gly Pro Ser Ser Gly Ser Ala245 250 255Gly
Ser Ser Arg Val Gln Arg Ala Leu Lys Glu Asn Phe Gly Tyr Asn260
265 270Gln Ser Val His Gln Ile Asn Arg Gly Asp Phe
Ser Lys Gln Asp Trp275 280 285Glu Ala Gln
Ile Asp Lys Glu Leu Ser Gln Asn Gln Pro Val Tyr Tyr290
295 300Gln Gly Val Gly Lys Val Gly Gly His Ala Phe Val
Ile Asp Gly Ala305 310 315
320Asp Gly Arg Asn Phe Tyr His Val Asn Trp Gly Trp Gly Gly Val Ser325
330 335Asp Gly Phe Phe Arg Leu Asp Ala Leu
Asn Pro Ser Ala Leu Gly Thr340 345 350Gly
Gly Gly Ala Gly Gly Phe Asn Gly Tyr Gln Ser Ala Val Val Gly355
360 365Ile Lys Pro37026353PRTHomo sapiens 26Lys Lys
His Thr Gly Tyr Val Gly Leu Lys Asn Gln Gly Ala Thr Cys1 5
10 15Tyr Met Asn Ser Leu Leu Gln Thr
Leu Phe Phe Thr Asn Gln Leu Arg20 25
30Lys Ala Val Tyr Met Met Pro Thr Glu Gly Asp Asp Ser Ser Lys Ser35
40 45Val Pro Leu Ala Leu Gln Arg Val Phe Tyr
Glu Leu Gln His Ser Asp50 55 60Lys Pro
Val Gly Thr Lys Lys Leu Thr Lys Ser Phe Gly Trp Glu Thr65
70 75 80Leu Asp Ser Phe Met Gln His
Asp Val Gln Glu Leu Cys Arg Val Leu85 90
95Leu Asp Asn Val Glu Asn Lys Met Lys Gly Thr Cys Val Glu Gly Thr100
105 110Ile Pro Lys Leu Phe Arg Gly Lys Met
Val Ser Tyr Ile Gln Cys Lys115 120 125Glu
Val Asp Tyr Arg Ser Asp Arg Arg Glu Asp Tyr Tyr Asp Ile Gln130
135 140Leu Ser Ile Lys Gly Lys Lys Asn Ile Phe Glu
Ser Phe Val Asp Tyr145 150 155
160Val Ala Val Glu Gln Leu Asp Gly Asp Asn Lys Tyr Asp Ala Gly
Glu165 170 175His Gly Leu Gln Glu Ala Glu
Lys Gly Val Lys Phe Leu Thr Leu Pro180 185
190Pro Val Leu His Leu Gln Leu Met Arg Phe Met Tyr Asp Pro Gln Thr195
200 205Asp Gln Asn Ile Lys Ile Asn Asp Arg
Phe Glu Phe Pro Glu Gln Leu210 215 220Pro
Leu Asp Glu Phe Leu Gln Lys Thr Asp Pro Lys Asp Pro Ala Asn225
230 235 240Tyr Ile Leu His Ala Val
Leu Val His Ser Gly Asp Asn His Gly Gly245 250
255His Tyr Val Val Tyr Leu Asn Pro Lys Gly Asp Gly Lys Trp Cys
Lys260 265 270Phe Asp Asp Asp Val Val Ser
Arg Cys Thr Lys Glu Glu Ala Ile Glu275 280
285His Asn Tyr Gly Gly His Asp Asp Asp Leu Ser Val Arg His Cys Thr290
295 300Asn Ala Tyr Met Leu Val Tyr Ile Arg
Glu Ser Lys Leu Ser Glu Val305 310 315
320Leu Gln Ala Val Thr Asp His Asp Ile Pro Gln Gln Leu Val
Glu Arg325 330 335Leu Gln Glu Glu Lys Arg
Ile Glu Ala Gln Lys Arg Lys Glu Arg Gln340 345
350Glu27174PRTStaphylococcus aureus 27Tyr Asn Glu Gln Tyr Val Asn
Lys Leu Glu Asn Phe Lys Ile Arg Glu1 5 10
15Thr Gln Gly Asn Asn Gly Trp Cys Ala Gly Tyr Thr Met
Ser Ala Leu20 25 30Leu Asn Ala Thr Tyr
Asn Thr Asn Lys Tyr His Ala Glu Ala Val Met35 40
45Arg Phe Leu His Pro Asn Leu Gln Gly Gln Gln Phe Gln Phe Thr
Gly50 55 60Leu Thr Pro Arg Glu Met Ile
Tyr Phe Gly Gln Thr Gln Gly Arg Ser65 70
75 80Pro Gln Leu Leu Asn Arg Met Thr Thr Tyr Asn Glu
Val Asp Asn Leu85 90 95Thr Lys Asn Asn
Lys Gly Ile Ala Ile Leu Gly Ser Arg Val Glu Ser100 105
110Arg Asn Gly Met His Ala Gly His Ala Met Ala Val Val Gly
Asn Ala115 120 125Lys Leu Asn Asn Gly Gln
Glu Val Ile Ile Ile Trp Asn Pro Trp Asp130 135
140Asn Gly Phe Met Thr Gln Asp Ala Lys Asn Asn Val Ile Pro Val
Ser145 150 155 160Asn Gly
Asp His Tyr Gln Trp Tyr Ser Ser Ile Tyr Gly Tyr165
17028221PRTSaccharomyces cerevisiae 28Gly Ser Leu Val Pro Glu Leu Asn Glu
Lys Asp Asp Asp Gln Val Gln1 5 10
15Lys Ala Leu Ala Ser Arg Glu Asn Thr Gln Leu Met Asn Arg Asp
Asn20 25 30Ile Glu Ile Thr Val Arg Asp
Phe Lys Thr Leu Ala Pro Arg Arg Trp35 40
45Leu Asn Asp Thr Ile Ile Glu Phe Phe Met Lys Tyr Ile Glu Lys Ser50
55 60Thr Pro Asn Thr Val Ala Phe Asn Ser Phe
Phe Tyr Thr Asn Leu Ser65 70 75
80Glu Arg Gly Tyr Gln Gly Val Arg Arg Trp Met Lys Arg Lys Lys
Thr85 90 95Gln Ile Asp Lys Leu Asp Lys
Ile Phe Thr Pro Ile Asn Leu Asn Gln100 105
110Ser His Trp Ala Leu Gly Ile Ile Asp Leu Lys Lys Lys Thr Ile Gly115
120 125Tyr Val Asp Ser Leu Ser Asn Gly Pro
Asn Ala Met Ser Phe Ala Ile130 135 140Leu
Thr Asp Leu Gln Lys Tyr Val Met Glu Glu Ser Lys His Thr Ile145
150 155 160Gly Glu Asp Phe Asp Leu
Ile His Leu Asp Cys Pro Gln Gln Pro Asn165 170
175Gly Tyr Asp Cys Gly Ile Tyr Val Cys Met Asn Thr Leu Tyr Gly
Ser180 185 190Ala Asp Ala Pro Leu Asp Phe
Asp Tyr Lys Asp Ala Ile Arg Met Arg195 200
205Arg Phe Ile Ala His Leu Ile Leu Thr Asp Ala Leu Lys210
215 22029166PRTPyrococcus horikoshii 29Met Lys Val Leu
Phe Leu Thr Ala Asn Glu Phe Glu Asp Val Glu Leu1 5
10 15Ile Tyr Pro Tyr His Arg Leu Lys Glu Glu
Gly His Glu Val Tyr Ile20 25 30Ala Ser
Phe Glu Arg Gly Thr Ile Thr Gly Lys His Gly Tyr Ser Val35
40 45Lys Val Asp Leu Thr Phe Asp Lys Val Asn Pro Glu
Glu Phe Asp Ala50 55 60Leu Val Leu Pro
Gly Gly Arg Ala Pro Glu Arg Val Arg Leu Asn Glu65 70
75 80Lys Ala Val Ser Ile Ala Arg Lys Met
Phe Ser Glu Gly Lys Pro Val85 90 95Ala
Ser Ile Cys His Gly Pro Gln Ile Leu Ile Ser Ala Gly Val Leu100
105 110Arg Gly Arg Lys Gly Thr Ser Tyr Pro Gly Ile
Lys Asp Asp Met Ile115 120 125Asn Ala Gly
Val Glu Trp Val Asp Ala Glu Val Val Val Asp Gly Asn130
135 140Trp Val Ser Ser Arg Val Pro Ala Asp Leu Tyr Ala
Trp Met Arg Glu145 150 155
160Phe Val Lys Leu Leu Lys16530316PRTBacillus thermoproteolyticus 30Ile
Thr Gly Thr Ser Thr Val Gly Val Gly Arg Gly Val Leu Gly Asp1
5 10 15Gln Lys Asn Ile Asn Thr Thr
Tyr Ser Thr Tyr Tyr Tyr Leu Gln Asp20 25
30Asn Thr Arg Gly Asp Gly Ile Phe Thr Tyr Asp Ala Lys Tyr Arg Thr35
40 45Thr Leu Pro Gly Ser Leu Trp Ala Asp Ala
Asp Asn Gln Phe Phe Ala50 55 60Ser Tyr
Asp Ala Pro Ala Val Asp Ala His Tyr Tyr Ala Gly Val Thr65
70 75 80Tyr Asp Tyr Tyr Lys Asn Val
His Asn Arg Leu Ser Tyr Asp Gly Asn85 90
95Asn Ala Ala Ile Arg Ser Ser Val His Tyr Ser Gln Gly Tyr Asn Asn100
105 110Ala Phe Trp Asn Gly Ser Glu Met Val
Tyr Gly Asp Gly Asp Gly Gln115 120 125Thr
Phe Ile Pro Leu Ser Gly Gly Ile Asp Val Val Ala His Glu Leu130
135 140Thr His Ala Val Thr Asp Tyr Thr Ala Gly Leu
Ile Tyr Gln Asn Glu145 150 155
160Ser Gly Ala Ile Asn Glu Ala Ile Ser Asp Ile Phe Gly Thr Leu
Val165 170 175Glu Phe Tyr Ala Asn Lys Asn
Pro Asp Trp Glu Ile Gly Glu Asp Val180 185
190Tyr Thr Pro Gly Ile Ser Gly Asp Ser Leu Arg Ser Met Ser Asp Pro195
200 205Ala Lys Tyr Gly Asp Pro Asp His Tyr
Ser Lys Arg Tyr Thr Gly Thr210 215 220Gln
Asp Asn Gly Gly Val His Ile Asn Ser Gly Ile Ile Asn Lys Ala225
230 235 240Ala Tyr Leu Ile Ser Gln
Gly Gly Thr His Tyr Gly Val Ser Val Val245 250
255Gly Ile Gly Arg Asp Lys Leu Gly Lys Ile Phe Tyr Arg Ala Leu
Thr260 265 270Gln Tyr Leu Thr Pro Thr Ser
Asn Phe Ser Gln Leu Arg Ala Ala Ala275 280
285Val Gln Ser Ala Thr Asp Leu Tyr Gly Ser Thr Ser Gln Glu Val Ala290
295 300Ser Val Lys Gln Ala Phe Asp Ala Val
Gly Val Lys305 310 31531169PRTHomo
sapiens 31Val Leu Thr Glu Gly Asn Pro Arg Trp Glu Gln Thr His Leu Thr
Tyr1 5 10 15Arg Ile Glu
Asn Tyr Thr Pro Asp Leu Pro Arg Ala Asp Val Asp His20 25
30Ala Ile Glu Lys Ala Phe Gln Leu Trp Ser Asn Val Thr
Pro Leu Thr35 40 45Phe Thr Lys Val Ser
Glu Gly Gln Ala Asp Ile Met Ile Ser Phe Val50 55
60Arg Gly Asp His Arg Asp Asn Ser Pro Phe Asp Gly Pro Gly Gly
Asn65 70 75 80Leu Ala
His Ala Phe Gln Pro Gly Pro Gly Ile Gly Gly Asp Ala His85
90 95Phe Asp Glu Asp Glu Arg Trp Thr Asn Asn Phe Arg
Glu Tyr Asn Leu100 105 110His Arg Val Ala
Ala His Glu Leu Gly His Ser Leu Gly Leu Ser His115 120
125Ser Thr Asp Ile Gly Ala Leu Met Tyr Pro Ser Tyr Thr Phe
Ser Gly130 135 140Asp Val Gln Leu Ala Gln
Asp Asp Ile Asp Gly Ile Gln Ala Ile Tyr145 150
155 160Gly Arg Ser Gln Asn Pro Val Gln
Pro16532496PRTHomo sapiens 32Gln Tyr Ser Pro Asn Thr Gln Gln Gly Arg Thr
Ser Ile Val His Leu1 5 10
15Phe Glu Trp Arg Trp Val Asp Ile Ala Leu Glu Cys Glu Arg Tyr Leu20
25 30Ala Pro Lys Gly Phe Gly Gly Val Gln Val
Ser Pro Pro Asn Glu Asn35 40 45Val Ala
Ile Tyr Asn Pro Phe Arg Pro Trp Trp Glu Arg Tyr Gln Pro50
55 60Val Ser Tyr Lys Leu Cys Thr Arg Ser Gly Asn Glu
Asp Glu Phe Arg65 70 75
80Asn Met Val Thr Arg Cys Asn Asn Val Gly Val Arg Ile Tyr Val Asp85
90 95Ala Val Ile Asn His Met Cys Gly Asn Ala
Val Ser Ala Gly Thr Ser100 105 110Ser Thr
Cys Gly Ser Tyr Phe Asn Pro Gly Ser Arg Asp Phe Pro Ala115
120 125Val Pro Tyr Ser Gly Trp Asp Phe Asn Asp Gly Lys
Cys Lys Thr Gly130 135 140Ser Gly Asp Ile
Glu Asn Tyr Asn Asp Ala Thr Gln Val Arg Asp Cys145 150
155 160Arg Leu Thr Gly Leu Leu Asp Leu Ala
Leu Glu Lys Asp Tyr Val Arg165 170 175Ser
Lys Ile Ala Glu Tyr Met Asn His Leu Ile Asp Ile Gly Val Ala180
185 190Gly Phe Arg Leu Asp Ala Ser Lys His Met Trp
Pro Gly Asp Ile Lys195 200 205Ala Ile Leu
Asp Lys Leu His Asn Leu Asn Ser Asn Trp Phe Pro Ala210
215 220Gly Ser Lys Pro Phe Ile Tyr Gln Glu Val Ile Asp
Leu Gly Gly Glu225 230 235
240Pro Ile Lys Ser Ser Asp Tyr Phe Gly Asn Gly Arg Val Thr Glu Phe245
250 255Lys Tyr Gly Ala Lys Leu Gly Thr Val
Ile Arg Lys Trp Asn Gly Glu260 265 270Lys
Met Ser Tyr Leu Lys Asn Trp Gly Glu Gly Trp Gly Phe Val Pro275
280 285Ser Asp Arg Ala Leu Val Phe Val Asp Asn His
Asp Asn Gln Arg Gly290 295 300His Gly Ala
Gly Gly Ala Ser Ile Leu Thr Phe Trp Asp Ala Arg Leu305
310 315 320Tyr Lys Met Ala Val Gly Phe
Met Leu Ala His Pro Tyr Gly Phe Thr325 330
335Arg Val Met Ser Ser Tyr Arg Trp Pro Arg Gln Phe Gln Asn Gly Asn340
345 350Asp Val Asn Asp Trp Val Gly Pro Pro
Asn Asn Asn Gly Val Ile Lys355 360 365Glu
Val Thr Ile Asn Pro Asp Thr Thr Cys Gly Asn Asp Trp Val Cys370
375 380Glu His Arg Trp Arg Gln Ile Arg Asn Met Val
Ile Phe Arg Asn Val385 390 395
400Val Asp Gly Gln Pro Phe Thr Asn Trp Tyr Asp Asn Gly Ser Asn
Gln405 410 415Val Ala Phe Gly Arg Gly Asn
Arg Gly Phe Ile Val Phe Asn Asn Asp420 425
430Asp Trp Ser Phe Ser Leu Thr Leu Gln Thr Gly Leu Pro Ala Gly Thr435
440 445Tyr Cys Asp Val Ile Ser Gly Asp Lys
Ile Asn Gly Asn Cys Thr Gly450 455 460Ile
Lys Ile Tyr Val Ser Asp Asp Gly Lys Ala His Phe Ser Ile Ser465
470 475 480Asn Ser Ala Glu Asp Pro
Phe Ile Ala Ile His Ala Glu Ser Lys Leu485 490
49533370PRTArtificial SequenceDescription of Artificial Sequence =
Trichoderma reesei 33Gln Pro Gly Thr Ser Thr Pro Glu Val His Pro Lys
Leu Thr Thr Tyr1 5 10
15Lys Cys Thr Lys Ser Gly Gly Cys Val Ala Gln Asp Thr Ser Val Val20
25 30Leu Asp Trp Asn Tyr Arg Trp Met His Asp
Ala Asn Tyr Asn Ser Cys35 40 45Thr Val
Asn Gly Gly Val Asn Thr Thr Leu Cys Pro Asp Glu Ala Thr50
55 60Cys Gly Lys Asn Cys Phe Ile Glu Gly Val Asp Tyr
Ala Ala Ser Gly65 70 75
80Val Thr Thr Ser Gly Ser Ser Leu Thr Met Asn Gln Tyr Met Pro Ser85
90 95Ser Ser Gly Gly Tyr Ser Ser Val Ser Pro
Arg Leu Tyr Leu Leu Asp100 105 110Ser Asp
Gly Glu Tyr Val Met Leu Lys Leu Asn Gly Gln Glu Leu Ser115
120 125Phe Asp Val Asp Leu Ser Ala Leu Pro Cys Gly Glu
Asn Gly Ser Leu130 135 140Tyr Leu Ser Gln
Met Asp Glu Asn Gly Gly Ala Asn Gln Tyr Asn Thr145 150
155 160Ala Gly Ala Asn Tyr Gly Ser Gly Tyr
Cys Asp Ala Gln Cys Pro Val165 170 175Gln
Thr Trp Arg Asn Gly Thr Leu Asn Thr Ser His Gln Gly Phe Cys180
185 190Cys Asn Glu Met Asp Ile Leu Glu Gly Asn Ser
Arg Ala Asn Ala Leu195 200 205Thr Pro His
Ser Cys Thr Ala Thr Ala Cys Asp Ser Ala Gly Cys Gly210
215 220Phe Asn Pro Tyr Gly Ser Gly Tyr Lys Ser Tyr Tyr
Gly Pro Gly Asp225 230 235
240Thr Val Asp Thr Ser Lys Thr Phe Thr Ile Ile Thr Gln Phe Asn Thr245
250 255Asp Asn Gly Ser Pro Ser Gly Asn Leu
Val Ser Ile Thr Arg Lys Tyr260 265 270Gln
Gln Asn Gly Val Asp Ile Pro Ser Ala Gln Pro Gly Gly Asp Thr275
280 285Ile Ser Ser Cys Pro Ser Ala Ser Ala Tyr Gly
Gly Leu Ala Thr Met290 295 300Gly Lys Ala
Leu Ser Ser Gly Met Val Leu Val Phe Ser Ile Trp Asn305
310 315 320Asp Asn Ser Gln Tyr Met Asn
Trp Leu Asp Ser Gly Asn Ala Gly Pro325 330
335Cys Ser Ser Thr Glu Gly Asn Pro Ser Asn Ile Leu Ala Asn Asn Pro340
345 350Asn Thr His Val Val Phe Ser Asn Ile
Arg Trp Gly Asp Ile Gly Ser355 360 365Thr
Thr37034223PRTAspergillus niger 34Gln Thr Met Cys Ser Gln Tyr Asp Ser Ala
Ser Ser Pro Pro Tyr Ser1 5 10
15Val Asn Gln Asn Leu Trp Gly Glu Tyr Gln Gly Thr Gly Ser Gln Cys20
25 30Val Tyr Val Asp Lys Leu Ser Ser Ser
Gly Ala Ser Trp His Thr Glu35 40 45Trp
Thr Trp Ser Gly Gly Glu Gly Thr Val Lys Ser Tyr Ser Asn Ser50
55 60Gly Val Thr Phe Asn Lys Lys Leu Val Ser Asp
Val Ser Ser Ile Pro65 70 75
80Thr Ser Val Glu Trp Lys Gln Asp Asn Thr Asn Val Asn Ala Asp Val85
90 95Ala Tyr Asp Leu Phe Thr Ala Ala Asn
Val Asp His Ala Thr Ser Ser100 105 110Gly
Asp Tyr Glu Leu Met Ile Trp Leu Ala Arg Tyr Gly Asn Ile Gln115
120 125Pro Ile Gly Lys Gln Ile Ala Thr Ala Thr Val
Gly Gly Lys Ser Trp130 135 140Glu Val Trp
Tyr Gly Ser Thr Thr Gln Ala Gly Ala Glu Gln Arg Thr145
150 155 160Tyr Ser Phe Val Ser Glu Ser
Pro Ile Asn Ser Tyr Ser Gly Asp Ile165 170
175Asn Ala Phe Phe Ser Tyr Leu Thr Gln Asn Gln Gly Phe Pro Ala Ser180
185 190Ser Gln Tyr Leu Ile Asn Leu Gln Phe
Gly Thr Glu Ala Phe Thr Gly195 200 205Gly
Pro Ala Thr Phe Thr Val Asp Asn Trp Thr Ala Ser Val Asn210
215 22035184PRTAspergillus niger 35Ser Ala Gly Ile Asn
Tyr Val Gln Asn Tyr Asn Gly Asn Leu Gly Asp1 5
10 15Phe Thr Tyr Asp Glu Ser Ala Gly Thr Phe Ser
Met Tyr Trp Glu Asp20 25 30Gly Val Ser
Ser Asp Phe Val Val Gly Leu Gly Trp Thr Thr Gly Ser35 40
45Ser Asn Ala Ile Thr Tyr Ser Ala Glu Tyr Ser Ala Ser
Gly Ser Ala50 55 60Ser Tyr Leu Ala Val
Tyr Gly Trp Val Asn Tyr Pro Gln Ala Glu Tyr65 70
75 80Tyr Ile Val Glu Asp Tyr Gly Asp Tyr Asn
Pro Cys Ser Ser Ala Thr85 90 95Ser Leu
Gly Thr Val Tyr Ser Asp Gly Ser Thr Tyr Gln Val Cys Thr100
105 110Asp Thr Arg Thr Asn Glu Pro Ser Ile Thr Gly Thr
Ser Thr Phe Thr115 120 125Gln Tyr Phe Ser
Val Arg Glu Ser Thr Arg Thr Ser Gly Thr Val Thr130 135
140Val Ala Asn His Phe Asn Phe Trp Ala His His Gly Phe Gly
Asn Ser145 150 155 160Asp
Phe Asn Tyr Gln Val Val Ala Val Glu Ala Trp Ser Gly Ala Gly165
170 175Ser Ala Ser Val Thr Ile Ser
Ser18036313PRTStreptomyces lividans 36Ala Glu Ser Thr Leu Gly Ala Ala Ala
Ala Gln Ser Gly Arg Tyr Phe1 5 10
15Gly Thr Ala Ile Ala Ser Gly Arg Leu Ser Asp Ser Thr Tyr Thr
Ser20 25 30Ile Ala Gly Arg Glu Phe Asn
Met Val Thr Ala Glu Asn Glu Met Lys35 40
45Ile Asp Ala Thr Glu Pro Gln Arg Gly Gln Phe Asn Phe Ser Ser Ala50
55 60Asp Arg Val Tyr Asn Trp Ala Val Gln Asn
Gly Lys Gln Val Arg Gly65 70 75
80His Thr Leu Ala Trp His Ser Gln Gln Pro Gly Trp Met Gln Ser
Leu85 90 95Ser Gly Ser Ala Leu Arg Gln
Ala Met Ile Asp His Ile Asn Gly Val100 105
110Met Ala His Tyr Lys Gly Lys Ile Val Gln Trp Asp Val Val Asn Glu115
120 125Ala Phe Ala Asp Gly Ser Ser Gly Ala
Arg Arg Asp Ser Asn Leu Gln130 135 140Arg
Ser Gly Asn Asp Trp Ile Glu Val Ala Phe Arg Thr Ala Arg Ala145
150 155 160Ala Asp Pro Ser Ala Lys
Leu Cys Tyr Asn Asp Tyr Asn Val Glu Asn165 170
175Trp Thr Trp Ala Lys Thr Gln Ala Met Tyr Asn Met Val Arg Asp
Phe180 185 190Lys Gln Arg Gly Val Pro Ile
Asp Cys Val Gly Phe Gln Ser His Phe195 200
205Asn Ser Gly Ser Pro Tyr Asn Ser Asn Phe Arg Thr Thr Leu Gln Asn210
215 220Phe Ala Ala Leu Gly Val Asp Val Ala
Ile Thr Glu Leu Asp Ile Gln225 230 235
240Gly Ala Pro Ala Ser Thr Tyr Ala Asn Val Thr Asn Asp Cys
Leu Ala245 250 255Val Ser Arg Cys Leu Gly
Ile Thr Val Trp Gly Val Arg Asp Ser Asp260 265
270Ser Trp Arg Ser Glu Gln Thr Pro Leu Leu Phe Asn Asn Asp Gly
Ser275 280 285Lys Lys Ala Ala Tyr Thr Ala
Val Leu Asp Ala Leu Asn Gly Gly Ala290 295
300Ser Ser Glu Pro Pro Ala Asp Gly Gly305
31037362PRTAspergillus niger 37Met His Ser Phe Ala Ser Leu Leu Ala Tyr
Gly Leu Val Ala Gly Ala1 5 10
15Thr Phe Ala Ser Ala Ser Pro Ile Glu Ala Arg Asp Ser Cys Thr Phe20
25 30Thr Thr Ala Ala Ala Ala Lys Ala Gly
Lys Ala Lys Cys Ser Thr Ile35 40 45Thr
Leu Asn Asn Ile Glu Val Pro Ala Gly Thr Thr Leu Asp Leu Thr50
55 60Gly Leu Thr Ser Gly Thr Lys Val Ile Phe Glu
Gly Thr Thr Thr Phe65 70 75
80Gln Tyr Glu Glu Trp Ala Gly Pro Leu Ile Ser Met Ser Gly Glu His85
90 95Ile Thr Val Thr Gly Ala Ser Gly His
Leu Ile Asn Cys Asp Gly Ala100 105 110Arg
Trp Trp Asp Gly Lys Gly Thr Ser Gly Lys Lys Lys Pro Lys Phe115
120 125Phe Tyr Ala His Gly Leu Asp Ser Ser Ser Ile
Thr Gly Leu Asn Ile130 135 140Lys Asn Thr
Pro Leu Met Ala Phe Ser Val Gln Ala Asn Asp Ile Thr145
150 155 160Phe Thr Asp Val Thr Ile Asn
Asn Ala Asp Gly Asp Thr Gln Gly Gly165 170
175His Asn Thr Asp Ala Phe Asp Val Gly Asn Ser Val Gly Val Asn Ile180
185 190Ile Lys Pro Trp Val His Asn Gln Asp
Asp Cys Leu Ala Val Asn Ser195 200 205Gly
Glu Asn Ile Trp Phe Thr Gly Gly Thr Cys Ile Gly Gly His Gly210
215 220Leu Ser Ile Gly Ser Val Gly Asp Arg Ser Asn
Asn Val Val Lys Asn225 230 235
240Val Thr Ile Glu His Ser Thr Val Ser Asn Ser Glu Asn Ala Val
Arg245 250 255Ile Lys Thr Ile Ser Gly Ala
Thr Gly Ser Val Ser Glu Ile Thr Tyr260 265
270Ser Asn Ile Val Met Ser Gly Ile Ser Asp Tyr Gly Val Val Ile Gln275
280 285Gln Asp Tyr Glu Asp Gly Lys Pro Thr
Gly Lys Pro Thr Asn Gly Val290 295 300Thr
Ile Gln Asp Val Lys Leu Glu Ser Val Thr Gly Ser Val Asp Ser305
310 315 320Gly Ala Thr Glu Ile Tyr
Leu Leu Cys Gly Ser Gly Ser Cys Ser Asp325 330
335Trp Thr Trp Asp Asp Val Lys Val Thr Gly Gly Lys Lys Ser Thr
Ala340 345 350Cys Lys Asn Phe Pro Ser Val
Ala Ser Cys355 36038383PRTPseudomonas cellulosa 38Arg Ala
Asp Val Lys Pro Val Thr Val Lys Leu Val Asp Ser Gln Ala1 5
10 15Thr Met Glu Thr Arg Ser Leu Phe
Ala Phe Met Gln Glu Gln Arg Arg20 25
30His Ser Ile Met Phe Gly His Gln His Glu Thr Thr Gln Gly Leu Thr35
40 45Ile Thr Arg Thr Asp Gly Thr Gln Ser Asp
Thr Phe Asn Ala Val Gly50 55 60Asp Phe
Ala Ala Val Tyr Gly Trp Asp Thr Leu Ser Ile Val Ala Pro65
70 75 80Lys Ala Glu Gly Asp Ile Val
Ala Gln Val Lys Lys Ala Tyr Ala Arg85 90
95Gly Gly Ile Ile Thr Val Ser Ser His Phe Asp Asn Pro Lys Thr Asp100
105 110Thr Gln Lys Gly Val Trp Pro Val Gly
Thr Ser Trp Asp Gln Thr Pro115 120 125Ala
Val Val Asp Ser Leu Pro Gly Gly Ala Tyr Asn Pro Val Leu Asn130
135 140Gly Tyr Leu Asp Gln Val Ala Glu Trp Ala Asn
Asn Leu Lys Asp Glu145 150 155
160Gln Gly Arg Leu Ile Pro Val Ile Phe Arg Leu Tyr His Ala Asn
Thr165 170 175Gly Ser Trp Phe Trp Trp Gly
Asp Lys Gln Ser Thr Pro Glu Gln Tyr180 185
190Lys Gln Leu Phe Arg Tyr Ser Val Glu Tyr Leu Arg Asp Val Lys Gly195
200 205Val Arg Asn Phe Leu Tyr Ala Tyr Ser
Pro Asn Asn Phe Trp Asp Val210 215 220Thr
Glu Ala Asn Tyr Leu Glu Arg Tyr Pro Gly Asp Glu Trp Val Asp225
230 235 240Val Leu Gly Phe Asp Thr
Tyr Gly Pro Val Ala Asp Asn Ala Asp Trp245 250
255Phe Arg Asn Val Val Ala Asn Ala Ala Leu Val Ala Arg Met Ala
Glu260 265 270Ala Arg Gly Lys Ile Pro Val
Ile Ser Glu Ile Gly Ile Arg Ala Pro275 280
285Asp Ile Glu Ala Gly Leu Tyr Asp Asn Gln Trp Tyr Arg Lys Leu Ile290
295 300Ser Gly Leu Lys Ala Asp Pro Asp Ala
Arg Glu Ile Ala Phe Leu Leu305 310 315
320Val Trp Arg Asn Ala Pro Gln Gly Val Pro Gly Pro Asn Gly
Thr Gln325 330 335Val Pro His Tyr Trp Val
Pro Ala Asn Arg Pro Glu Asn Ile Asn Asn340 345
350Gly Thr Leu Glu Asp Phe Gln Ala Phe Tyr Ala Asp Glu Phe Thr
Ala355 360 365Phe Asn Arg Asp Ile Glu Gln
Val Tyr Gln Arg Pro Thr Leu Ile370 375
38039419PRTArtificial SequenceDescription of Artificial Sequence =
Bacillus circulans 39Leu Gln Pro Ala Thr Ala Glu Ala Ala Asp Ser
Tyr Lys Ile Val Gly1 5 10
15Tyr Tyr Pro Ser Trp Ala Ala Tyr Gly Arg Asn Tyr Asn Val Ala Asp20
25 30Ile Asp Pro Thr Lys Val Thr His Ile Asn
Tyr Ala Phe Ala Asp Ile35 40 45Cys Trp
Asn Gly Ile His Gly Asn Pro Asp Pro Ser Gly Pro Asn Pro50
55 60Val Thr Trp Thr Cys Gln Asn Glu Lys Ser Gln Thr
Ile Asn Val Pro65 70 75
80Asn Gly Thr Ile Val Leu Gly Asp Pro Trp Ile Asp Thr Gly Lys Thr85
90 95Phe Ala Gly Asp Thr Trp Asp Gln Pro Ile
Ala Gly Asn Ile Asn Gln100 105 110Leu Asn
Lys Leu Lys Gln Thr Asn Pro Asn Leu Lys Thr Ile Ile Ser115
120 125Val Gly Gly Trp Thr Trp Ser Asn Arg Phe Ser Asp
Val Ala Ala Thr130 135 140Ala Ala Thr Arg
Glu Val Phe Ala Asn Ser Ala Val Asp Phe Leu Arg145 150
155 160Lys Tyr Asn Phe Asp Gly Val Asp Leu
Asp Trp Glu Tyr Pro Val Ser165 170 175Gly
Gly Leu Asp Gly Asn Ser Lys Arg Pro Glu Asp Lys Gln Asn Tyr180
185 190Thr Leu Leu Leu Ser Lys Ile Arg Glu Lys Leu
Asp Ala Ala Gly Ala195 200 205Val Asp Gly
Lys Lys Tyr Leu Leu Thr Ile Ala Ser Gly Ala Ser Ala210
215 220Thr Tyr Ala Ala Asn Thr Glu Leu Ala Lys Ile Ala
Ala Ile Val Asp225 230 235
240Trp Ile Asn Ile Met Thr Tyr Asp Phe Asn Gly Ala Trp Gln Lys Ile245
250 255Ser Ala His Asn Ala Pro Leu Asn Tyr
Asp Pro Ala Ala Ser Ala Ala260 265 270Gly
Val Pro Asp Ala Asn Thr Phe Asn Val Ala Ala Gly Ala Gln Gly275
280 285His Leu Asp Ala Gly Val Pro Ala Ala Lys Leu
Val Leu Gly Val Pro290 295 300Phe Tyr Gly
Arg Gly Trp Asp Gly Cys Ala Gln Ala Gly Asn Gly Gln305
310 315 320Tyr Gln Thr Cys Thr Gly Gly
Ser Ser Val Gly Thr Trp Glu Ala Gly325 330
335Ser Phe Asp Phe Tyr Asp Leu Glu Ala Asn Tyr Ile Asn Lys Asn Gly340
345 350Tyr Thr Arg Tyr Trp Asn Asp Thr Ala
Lys Val Pro Tyr Leu Tyr Asn355 360 365Ala
Ser Asn Lys Arg Phe Ile Ser Tyr Asp Asp Ala Glu Ser Val Gly370
375 380Tyr Lys Thr Ala Tyr Ile Lys Ser Lys Gly Leu
Gly Gly Ala Met Phe385 390 395
400Trp Glu Leu Ser Gly Asp Arg Asn Lys Thr Leu Gln Asn Lys Leu
Lys405 410 415Ala Asp Leu40317PRTCandida
antarctica 40Leu Pro Ser Gly Ser Asp Pro Ala Phe Ser Gln Pro Lys Ser Val
Leu1 5 10 15Asp Ala Gly
Leu Thr Cys Gln Gly Ala Ser Pro Ser Ser Val Ser Lys20 25
30Pro Ile Leu Leu Val Pro Gly Thr Gly Thr Thr Gly Pro
Gln Ser Phe35 40 45Asp Ser Asn Trp Ile
Pro Leu Ser Thr Gln Leu Gly Tyr Thr Pro Cys50 55
60Trp Ile Ser Pro Pro Pro Phe Met Leu Asn Asp Thr Gln Val Asn
Thr65 70 75 80Glu Tyr
Met Val Asn Ala Ile Thr Ala Leu Tyr Ala Gly Ser Gly Asn85
90 95Asn Lys Leu Pro Val Leu Thr Trp Ser Gln Gly Gly
Leu Val Ala Gln100 105 110Trp Gly Leu Thr
Phe Phe Pro Ser Ile Arg Ser Lys Val Asp Arg Leu115 120
125Met Ala Phe Ala Pro Asp Tyr Lys Gly Thr Val Leu Ala Gly
Pro Leu130 135 140Asp Ala Leu Ala Val Ser
Ala Pro Ser Val Trp Gln Gln Thr Thr Gly145 150
155 160Ser Ala Leu Thr Thr Ala Leu Arg Asn Ala Gly
Gly Leu Thr Gln Ile165 170 175Val Pro Thr
Thr Asn Leu Tyr Ser Ala Thr Asp Glu Ile Val Gln Pro180
185 190Gln Val Ser Asn Ser Pro Leu Asp Ser Ser Tyr Leu
Phe Asn Gly Lys195 200 205Asn Val Gln Ala
Gln Ala Val Cys Gly Pro Leu Phe Val Ile Asp His210 215
220Ala Gly Ser Leu Thr Ser Gln Phe Ser Tyr Val Val Gly Arg
Ser Ala225 230 235 240Leu
Arg Ser Thr Thr Gly Gln Ala Arg Ser Ala Asp Tyr Gly Ile Thr245
250 255Asp Cys Asn Pro Leu Pro Ala Asn Asp Leu Thr
Pro Glu Gln Lys Val260 265 270Ala Ala Ala
Ala Leu Leu Ala Pro Ala Ala Ala Ala Ile Val Ala Gly275
280 285Pro Lys Gln Asn Cys Glu Pro Asp Leu Met Pro Tyr
Ala Arg Pro Phe290 295 300Ala Val Gly Lys
Arg Thr Cys Ser Gly Ile Val Thr Pro305 310
31541434PRTArtificial SequenceDescription of Artificial Sequence =
Chimera of guinea pig and Homo sapiens 41Ala Glu Val Cys Tyr Ser His Leu
Gly Cys Phe Ser Asp Glu Lys Pro1 5 10
15Trp Ala Gly Thr Ser Gln Arg Pro Ile Lys Ser Leu Pro Ser
Asp Pro20 25 30Lys Lys Ile Asn Thr Arg
Phe Leu Leu Tyr Thr Asn Glu Asn Gln Asn35 40
45Ser Tyr Gln Leu Ile Thr Ala Thr Asp Ile Ala Thr Ile Lys Ala Ser50
55 60Asn Phe Asn Leu Asn Arg Lys Thr Arg
Phe Ile Ile His Gly Phe Thr65 70 75
80Asp Ser Gly Glu Asn Ser Trp Leu Ser Asp Met Cys Lys Asn
Met Phe85 90 95Gln Val Glu Lys Val Asn
Cys Ile Cys Val Asp Trp Lys Gly Gly Ser100 105
110Lys Ala Gln Tyr Ser Gln Ala Ser Gln Asn Ile Arg Val Val Gly
Ala115 120 125Glu Val Ala Tyr Leu Val Gln
Val Leu Ser Thr Ser Leu Asn Tyr Ala130 135
140Pro Glu Asn Val His Ile Ile Gly His Ser Leu Gly Ala His Thr Ala145
150 155 160Gly Glu Ala Gly
Lys Arg Leu Asn Gly Leu Val Gly Arg Ile Thr Gly165 170
175Leu Asp Pro Ala Glu Pro Tyr Phe Gln Asp Thr Pro Glu Glu
Val Arg180 185 190Leu Asp Pro Ser Asp Ala
Lys Phe Val Asp Val Ile His Thr Asp Ile195 200
205Ser Pro Ile Leu Pro Ser Leu Gly Phe Gly Met Ser Gln Lys Val
Gly210 215 220His Met Asp Phe Phe Pro Asn
Gly Gly Lys Asp Met Pro Gly Cys Lys225 230
235 240Thr Gly Ile Ser Cys Asn His His Arg Ser Ile Glu
Tyr Tyr His Ser245 250 255Ser Ile Leu Asn
Pro Glu Gly Phe Leu Gly Tyr Pro Cys Ala Ser Tyr260 265
270Asp Glu Phe Gln Glu Ser Gly Cys Phe Pro Cys Pro Ala Lys
Gly Cys275 280 285Pro Lys Met Gly His Phe
Ala Asp Gln Tyr Pro Gly Lys Thr Asn Ala290 295
300Val Glu Gln Thr Phe Phe Leu Asn Thr Gly Ala Ser Asp Asn Phe
Thr305 310 315 320Arg Trp
Arg Tyr Lys Val Thr Val Thr Leu Ser Gly Glu Lys Asp Pro325
330 335Ser Gly Asn Ile Asn Val Ala Leu Leu Gly Lys Asn
Gly Asn Ser Ala340 345 350Gln Tyr Gln Val
Phe Lys Gly Thr Leu Lys Pro Asp Ala Ser Tyr Thr355 360
365Asn Ser Ile Asp Val Glu Leu Asn Val Gly Thr Ile Gln Lys
Val Thr370 375 380Phe Leu Trp Lys Arg Ser
Gly Ile Ser Val Ser Lys Pro Lys Met Gly385 390
395 400Ala Ser Arg Ile Thr Val Gln Ser Gly Lys Asp
Gly Thr Lys Tyr Asn405 410 415Phe Cys Ser
Ser Asp Ile Val Gln Glu Asn Val Glu Gln Thr Leu Ser420
425 430Pro Cys42471PRTEscherichia coli 42Met Lys Gln Ser
Thr Ile Ala Leu Ala Leu Leu Pro Leu Leu Phe Thr1 5
10 15Pro Val Thr Lys Ala Arg Thr Pro Glu Met
Pro Val Leu Glu Asn Arg20 25 30Ala Ala
Gln Gly Asp Ile Thr Ala Pro Gly Gly Ala Arg Arg Leu Thr35
40 45Gly Asp Gln Thr Ala Ala Leu Arg Asp Ser Leu Ser
Asp Lys Pro Ala50 55 60Lys Asn Ile Ile
Leu Leu Ile Gly Asp Gly Met Gly Asp Ser Glu Ile65 70
75 80Thr Ala Ala Arg Asn Tyr Ala Glu Gly
Ala Gly Gly Phe Phe Lys Gly85 90 95Ile
Asp Ala Leu Pro Leu Thr Gly Gln Tyr Thr His Tyr Ala Leu Asn100
105 110Lys Lys Thr Gly Lys Pro Asp Tyr Val Thr Asp
Ser Ala Ala Ser Ala115 120 125Thr Ala Trp
Ser Thr Gly Val Lys Thr Tyr Asn Gly Ala Leu Gly Val130
135 140Asp Ile His Glu Lys Asp His Pro Thr Ile Leu Glu
Met Ala Lys Ala145 150 155
160Ala Gly Leu Ala Thr Gly Asn Val Ser Thr Ala Glu Leu Gln Asp Ala165
170 175Thr Pro Ala Ala Leu Val Ala His Val
Thr Ser Arg Lys Cys Tyr Gly180 185 190Pro
Ser Ala Thr Ser Glu Lys Cys Pro Gly Asn Ala Leu Glu Lys Gly195
200 205Gly Lys Gly Ser Ile Thr Glu Gln Leu Leu Asn
Ala Arg Ala Asp Val210 215 220Thr Leu Gly
Gly Gly Ala Lys Thr Phe Ala Glu Thr Ala Thr Ala Gly225
230 235 240Glu Trp Gln Gly Lys Thr Leu
Arg Glu Gln Ala Gln Ala Arg Gly Tyr245 250
255Gln Leu Val Ser Asp Ala Ala Ser Leu Asn Ser Val Thr Glu Ala Asn260
265 270Gln Gln Lys Pro Leu Leu Gly Leu Phe
Ala Asp Gly Asn Met Pro Val275 280 285Arg
Trp Leu Gly Pro Lys Ala Thr Tyr His Gly Asn Ile Asp Lys Pro290
295 300Ala Val Thr Cys Thr Pro Asn Pro Gln Arg Asn
Asp Ser Val Pro Thr305 310 315
320Leu Ala Gln Met Thr Asp Lys Ala Ile Glu Leu Leu Ser Lys Asn
Glu325 330 335Lys Gly Phe Phe Leu Gln Val
Glu Gly Ala Ser Ile Asp Lys Gln Asp340 345
350His Ala Ala Asn Pro Cys Gly Gln Ile Gly Glu Thr Val Asp Leu Asp355
360 365Glu Ala Val Gln Arg Ala Leu Glu Phe
Ala Lys Lys Glu Gly Asn Thr370 375 380Leu
Val Ile Val Thr Ala Asp His Ala His Ala Ser Gln Ile Val Ala385
390 395 400Pro Asp Thr Lys Ala Pro
Gly Leu Thr Gln Ala Leu Asn Thr Lys Asp405 410
415Gly Ala Val Met Val Met Ser Tyr Gly Asn Ser Glu Glu Asp Ser
Gln420 425 430Glu His Thr Gly Ser Gln Leu
Arg Ile Ala Ala Tyr Gly Pro His Ala435 440
445Ala Asn Val Val Gly Leu Thr Asp Gln Thr Asp Leu Phe Tyr Thr Met450
455 460Lys Ala Ala Leu Gly Leu Lys465
47043260PRTArtificial SequenceDescription of Artificial Sequence
= Synthetic Construct 43Leu Lys Ile Ala Ala Phe Asn Ile Arg Thr Phe
Gly Glu Thr Lys Met1 5 10
15Ser Asn Ala Thr Leu Ala Ser Tyr Ile Val Arg Ile Val Arg Arg Tyr20
25 30Asp Ile Val Leu Ile Gln Glu Val Arg Asp
Ser His Leu Val Ala Val35 40 45Gly Lys
Leu Leu Asp Tyr Leu Asn Gln Asp Asp Pro Asn Thr Tyr His50
55 60Tyr Val Val Ser Glu Pro Leu Gly Arg Asn Ser Tyr
Lys Glu Arg Tyr65 70 75
80Leu Phe Leu Phe Arg Pro Asn Lys Val Ser Val Leu Asp Thr Tyr Gln85
90 95Tyr Asp Asp Gly Cys Glu Ser Cys Gly Asn
Asp Ser Phe Ser Arg Glu100 105 110Pro Ala
Val Val Lys Phe Ser Ser His Ser Thr Lys Val Lys Glu Phe115
120 125Ala Ile Val Ala Leu His Ser Ala Pro Ser Asp Ala
Val Ala Glu Ile130 135 140Asn Ser Leu Tyr
Asp Val Tyr Leu Asp Val Gln Gln Lys Trp His Leu145 150
155 160Asn Asp Val Met Leu Met Gly Asp Phe
Asn Ala Asp Cys Ser Tyr Val165 170 175Thr
Ser Ser Gln Trp Ser Ser Ile Arg Leu Arg Thr Ser Ser Thr Phe180
185 190Gln Trp Leu Ile Pro Asp Ser Ala Asp Thr Thr
Ala Thr Ser Thr Asn195 200 205Cys Ala Tyr
Asp Arg Ile Val Val Ala Gly Ser Leu Leu Gln Ser Ser210
215 220Val Val Pro Gly Ser Ala Ala Pro Phe Asp Phe Gln
Ala Ala Tyr Gly225 230 235
240Leu Ser Asn Glu Met Ala Leu Ala Ile Ser Asp His Tyr Pro Val Glu245
250 255Val Thr Leu Thr26044686PRTBacillus
circulans 44Ala Pro Asp Thr Ser Val Ser Asn Lys Gln Asn Phe Ser Thr Asp
Val1 5 10 15Ile Tyr Gln
Ile Phe Thr Asp Arg Phe Ser Asp Gly Asn Pro Ala Asn20 25
30Asn Pro Thr Gly Ala Ala Phe Asp Gly Thr Cys Thr Asn
Leu Arg Leu35 40 45Tyr Cys Gly Gly Asp
Trp Gln Gly Ile Ile Asn Lys Ile Asn Asp Gly50 55
60Tyr Leu Thr Gly Met Gly Val Thr Ala Ile Trp Ile Ser Gln Pro
Val65 70 75 80Glu Asn
Ile Tyr Ser Ile Ile Asn Tyr Ser Gly Val Asn Asn Thr Ala85
90 95Tyr His Gly Tyr Trp Ala Arg Asp Phe Lys Lys Thr
Asn Pro Ala Tyr100 105 110Gly Thr Ile Ala
Asp Phe Gln Asn Leu Ile Ala Ala Ala His Ala Lys115 120
125Asn Ile Lys Val Ile Ile Asp Phe Ala Pro Asn His Thr Ser
Pro Ala130 135 140Ser Ser Asp Gln Pro Ser
Phe Ala Glu Asn Gly Arg Leu Tyr Asp Asn145 150
155 160Gly Thr Leu Leu Gly Gly Tyr Thr Asn Asp Thr
Gln Asn Leu Phe His165 170 175His Asn Gly
Gly Thr Asp Phe Ser Thr Thr Glu Asn Gly Ile Tyr Lys180
185 190Asn Leu Tyr Asp Leu Ala Asp Leu Asn His Asn Asn
Ser Thr Val Asp195 200 205Val Tyr Leu Lys
Asp Ala Ile Lys Met Trp Leu Asp Leu Gly Ile Asp210 215
220Gly Ile Arg Met Asp Ala Val Lys His Met Pro Phe Gly Trp
Gln Lys225 230 235 240Ser
Phe Met Ala Ala Val Asn Asn Tyr Lys Pro Val Phe Thr Phe Gly245
250 255Glu Trp Phe Leu Gly Val Asn Glu Val Ser Pro
Glu Asn His Lys Phe260 265 270Ala Asn Glu
Ser Gly Met Ser Leu Leu Asp Phe Arg Phe Ala Gln Lys275
280 285Val Arg Gln Val Phe Arg Asp Asn Thr Asp Asn Met
Tyr Gly Leu Lys290 295 300Ala Met Leu Glu
Gly Ser Ala Ala Asp Tyr Ala Gln Val Asp Asp Gln305 310
315 320Val Thr Phe Ile Asp Asn His Asp Met
Glu Arg Phe His Ala Ser Asn325 330 335Ala
Asn Arg Arg Lys Leu Glu Gln Ala Leu Ala Phe Thr Leu Thr Ser340
345 350Arg Gly Val Pro Ala Ile Tyr Tyr Gly Thr Glu
Gln Tyr Met Ser Gly355 360 365Gly Thr Asp
Pro Asp Asn Arg Ala Arg Ile Pro Ser Phe Ser Thr Ser370
375 380Thr Thr Ala Tyr Gln Val Ile Gln Lys Leu Ala Pro
Leu Arg Lys Cys385 390 395
400Asn Pro Ala Ile Ala Tyr Gly Ser Thr Gln Glu Arg Trp Ile Asn Asn405
410 415Asp Val Leu Ile Tyr Glu Arg Lys Phe
Gly Ser Asn Val Ala Val Val420 425 430Ala
Val Asn Arg Asn Leu Asn Ala Pro Ala Ser Ile Ser Gly Leu Val435
440 445Thr Ser Leu Pro Gln Gly Ser Tyr Asn Asp Val
Leu Gly Gly Leu Leu450 455 460Asn Gly Asn
Thr Leu Ser Val Gly Ser Gly Gly Ala Ala Ser Asn Phe465
470 475 480Thr Leu Ala Ala Gly Gly Thr
Ala Val Trp Gln Tyr Thr Ala Ala Thr485 490
495Ala Thr Pro Thr Ile Gly His Val Gly Pro Met Met Ala Lys Pro Gly500
505 510Val Thr Ile Thr Ile Asp Gly Arg Gly
Phe Gly Ser Ser Lys Gly Thr515 520 525Val
Tyr Phe Gly Thr Thr Ala Val Ser Gly Ala Asp Ile Thr Ser Trp530
535 540Glu Asp Thr Gln Ile Lys Val Lys Ile Pro Ala
Val Ala Gly Gly Asn545 550 555
560Tyr Asn Ile Lys Val Ala Asn Ala Ala Gly Thr Ala Ser Asn Val
Tyr565 570 575Asp Asn Phe Glu Val Leu Ser
Gly Asp Gln Val Ser Val Arg Phe Val580 585
590Val Asn Asn Ala Thr Thr Ala Leu Gly Gln Asn Val Tyr Leu Thr Gly595
600 605Ser Val Ser Glu Leu Gly Asn Trp Asp
Pro Ala Lys Ala Ile Gly Pro610 615 620Met
Tyr Asn Gln Val Val Tyr Gln Tyr Pro Asn Trp Tyr Tyr Asp Val625
630 635 640Ser Val Pro Ala Gly Lys
Thr Ile Glu Phe Lys Phe Leu Lys Lys Gln645 650
655Gly Ser Thr Val Thr Trp Glu Gly Gly Ser Asn His Thr Phe Thr
Ala660 665 670Pro Ser Ser Gly Thr Ala Thr
Ile Asn Val Asn Trp Gln Pro675 680
68545404PRTArtificial SequenceDescription of Artificial Sequence =
Synthetic Construct 45Met Arg Val Leu Ile Thr Gly Cys Gly Ser Arg
Gly Asp Thr Glu Pro1 5 10
15Leu Val Ala Leu Ala Ala Arg Leu Arg Glu Leu Gly Ala Asp Ala Arg20
25 30Met Cys Leu Pro Pro Asp Tyr Val Glu Arg
Cys Ala Glu Val Gly Val35 40 45Pro Met
Val Pro Val Gly Arg Ala Val Arg Ala Gly Ala Arg Glu Pro50
55 60Gly Glu Leu Pro Pro Gly Ala Ala Glu Val Val Thr
Glu Val Val Ala65 70 75
80Glu Trp Phe Asp Lys Val Pro Ala Ala Ile Glu Gly Cys Asp Ala Val85
90 95Val Thr Thr Gly Leu Leu Pro Ala Ala Val
Ala Val Arg Ser Met Ala100 105 110Glu Lys
Leu Gly Ile Pro Tyr Arg Tyr Thr Val Leu Ser Pro Asp His115
120 125Leu Pro Ser Glu Gln Ser Gln Ala Glu Arg Asp Met
Tyr Asn Gln Gly130 135 140Ala Asp Arg Leu
Phe Gly Asp Ala Val Asn Ser His Arg Ala Ser Ile145 150
155 160Gly Leu Pro Pro Val Glu His Leu Tyr
Asp Tyr Gly Tyr Thr Asp Gln165 170 175Pro
Trp Leu Ala Ala Asp Pro Val Leu Ser Pro Leu Arg Pro Thr Asp180
185 190Leu Gly Thr Val Gln Thr Gly Ala Trp Ile Leu
Pro Asp Glu Arg Pro195 200 205Leu Ser Ala
Glu Leu Glu Ala Phe Leu Ala Ala Gly Ser Thr Pro Val210
215 220Tyr Val Gly Phe Gly Ser Ser Ser Arg Pro Ala Thr
Ala Asp Ala Ala225 230 235
240Lys Met Ala Ile Lys Ala Val Arg Ala Ser Gly Arg Arg Ile Val Leu245
250 255Ser Arg Gly Trp Ala Asp Leu Val Leu
Pro Asp Asp Gly Ala Asp Cys260 265 270Phe
Val Val Gly Glu Val Asn Leu Gln Glu Leu Phe Gly Arg Val Ala275
280 285Ala Ala Ile His His Asp Ser Ala Gly Thr Thr
Leu Leu Ala Met Arg290 295 300Ala Gly Ile
Pro Gln Ile Val Val Arg Arg Val Val Asp Asn Val Val305
310 315 320Glu Gln Ala Tyr His Ala Asp
Arg Val Ala Glu Leu Gly Val Gly Val325 330
335Ala Val Asp Gly Pro Val Pro Thr Ile Asp Ser Leu Ser Ala Ala Leu340
345 350Asp Thr Ala Leu Ala Pro Glu Ile Arg
Ala Arg Ala Thr Thr Val Ala355 360 365Asp
Thr Ile Arg Ala Asp Gly Thr Thr Val Ala Ala Gln Leu Leu Phe370
375 380Asp Ala Val Ser Leu Glu Lys Pro Thr Val Pro
Ala Leu Glu His His385 390 395
400His His His His46292PRTArtificial SequenceDescription of
Artificial Sequence = Synthetic Construct 46Ser Ile Glu Arg Leu Gly
Tyr Leu Gly Phe Ala Val Lys Asp Val Pro1 5
10 15Ala Trp Asp His Phe Leu Thr Lys Ser Val Gly Leu
Met Ala Ala Gly20 25 30Ser Ala Gly Asp
Ala Ala Leu Tyr Arg Ala Asp Gln Arg Ala Trp Arg35 40
45Ile Ala Val Gln Pro Gly Glu Leu Asp Asp Leu Ala Tyr Ala
Gly Leu50 55 60Glu Val Asp Asp Ala Ala
Ala Leu Glu Arg Met Ala Asp Lys Leu Arg65 70
75 80Gln Ala Gly Val Ala Phe Thr Arg Gly Asp Glu
Ala Leu Met Gln Gln85 90 95Arg Lys Val
Met Gly Leu Leu Cys Leu Gln Asp Pro Phe Gly Leu Pro100
105 110Leu Glu Ile Tyr Tyr Gly Pro Ala Glu Ile Phe His
Glu Pro Phe Leu115 120 125Pro Ser Ala Pro
Val Ser Gly Phe Val Thr Gly Asp Gln Gly Ile Gly130 135
140His Phe Val Arg Cys Val Pro Asp Thr Ala Lys Ala Met Ala
Phe Tyr145 150 155 160Thr
Glu Val Leu Gly Phe Val Leu Ser Asp Ile Ile Asp Ile Gln Met165
170 175Gly Pro Glu Thr Ser Val Pro Ala His Phe Leu
His Cys Asn Gly Arg180 185 190His His Thr
Ile Ala Leu Ala Ala Phe Pro Ile Pro Lys Arg Ile His195
200 205His Phe Met Leu Gln Ala Asn Thr Ile Asp Asp Val
Gly Tyr Ala Phe210 215 220Asp Arg Leu Asp
Ala Ala Gly Arg Ile Thr Ser Leu Leu Gly Arg His225 230
235 240Thr Asn Asp Gln Thr Leu Ser Phe Tyr
Ala Asp Thr Pro Ser Pro Met245 250 255Ile
Glu Val Glu Phe Gly Trp Gly Pro Arg Thr Val Asp Ser Ser Trp260
265 270Thr Val Ala Arg His Ser Arg Thr Ala Met Trp
Gly His Lys Ser Val275 280 285Arg Gly Gln
Arg29047311PRTArtificial SequenceDescription of Artificial Sequence =
Synthetic Construct 47Met Glu Val Lys Ile Phe Asn Thr Gln Asp Val
Gln Asp Phe Leu Arg1 5 10
15Val Ala Ser Gly Leu Glu Gln Glu Gly Gly Asn Pro Arg Val Lys Gln20
25 30Ile Ile His Arg Val Leu Ser Asp Leu Tyr
Lys Ala Ile Glu Asp Leu35 40 45Asn Ile
Thr Ser Asp Glu Tyr Trp Ala Gly Val Ala Tyr Leu Asn Gln50
55 60Leu Gly Ala Asn Gln Glu Ala Gly Leu Leu Ser Pro
Gly Leu Gly Phe65 70 75
80Asp His Tyr Leu Asp Met Arg Met Asp Ala Glu Asp Ala Ala Leu Gly85
90 95Ile Glu Asn Ala Thr Pro Arg Thr Ile Glu
Gly Pro Leu Tyr Val Ala100 105 110Gly Ala
Pro Glu Ser Val Gly Tyr Ala Arg Met Asp Asp Gly Ser Asp115
120 125Pro Asn Gly His Thr Leu Ile Leu His Gly Thr Ile
Phe Asp Ala Asp130 135 140Gly Lys Pro Leu
Pro Asn Ala Lys Val Glu Ile Trp His Ala Asn Thr145 150
155 160Lys Gly Phe Tyr Ser His Phe Asp Pro
Thr Gly Glu Gln Gln Ala Phe165 170 175Asn
Met Arg Arg Ser Ile Ile Thr Asp Glu Asn Gly Gln Tyr Arg Val180
185 190Arg Thr Ile Leu Pro Ala Gly Tyr Gly Cys Pro
Pro Glu Gly Pro Thr195 200 205Gln Gln Leu
Leu Asn Gln Leu Gly Arg His Gly Asn Arg Pro Ala His210
215 220Ile His Tyr Phe Val Ser Ala Asp Gly His Arg Lys
Leu Thr Thr Gln225 230 235
240Ile Asn Val Ala Gly Asp Pro Tyr Thr Tyr Asp Asp Phe Ala Tyr Ala245
250 255Thr Arg Glu Gly Leu Val Val Asp Ala
Val Glu His Thr Asp Pro Glu260 265 270Ala
Ile Lys Ala Asn Asp Val Glu Gly Pro Phe Ala Glu Met Val Phe275
280 285Asp Leu Lys Leu Thr Arg Leu Val Asp Gly Val
Asp Asn Gln Val Val290 295 300Asp Arg Pro
Arg Leu Ala Val305 31048414PRTPseudomonas putida 48Thr
Thr Glu Thr Ile Gln Ser Asn Ala Asn Leu Ala Pro Leu Pro Pro1
5 10 15His Val Pro Glu His Leu Val
Phe Asp Phe Asp Met Tyr Asn Pro Ser20 25
30Asn Leu Ser Ala Gly Val Gln Glu Ala Trp Ala Val Leu Gln Glu Ser35
40 45Asn Val Pro Asp Leu Val Trp Thr Arg Cys
Asn Gly Gly His Trp Ile50 55 60Ala Thr
Arg Gly Gln Leu Ile Arg Glu Ala Tyr Glu Asp Tyr Arg His65
70 75 80Phe Ser Ser Glu Cys Pro Phe
Ile Pro Arg Glu Ala Gly Glu Ala Tyr85 90
95Asp Phe Ile Pro Thr Ser Met Asp Pro Pro Glu Gln Arg Gln Phe Arg100
105 110Ala Leu Ala Asn Gln Val Val Gly Met
Pro Val Val Asp Lys Leu Glu115 120 125Asn
Arg Ile Gln Glu Leu Ala Cys Ser Leu Ile Glu Ser Leu Arg Pro130
135 140Gln Gly Gln Cys Asn Phe Thr Glu Asp Tyr Ala
Glu Pro Phe Pro Ile145 150 155
160Arg Ile Phe Met Leu Leu Ala Gly Leu Pro Glu Glu Asp Ile Pro
His165 170 175Leu Lys Tyr Leu Thr Asp Gln
Met Thr Arg Pro Asp Gly Ser Met Thr180 185
190Phe Ala Glu Ala Lys Glu Ala Leu Tyr Asp Tyr Leu Ile Pro Ile Ile195
200 205Glu Gln Arg Arg Gln Lys Pro Gly Thr
Asp Ala Ile Ser Ile Val Ala210 215 220Asn
Gly Gln Val Asn Gly Arg Pro Ile Thr Ser Asp Glu Ala Lys Arg225
230 235 240Met Cys Gly Leu Leu Leu
Val Gly Gly Leu Asp Thr Val Val Asn Phe245 250
255Leu Ser Phe Ser Met Glu Phe Leu Ala Lys Ser Pro Glu His Arg
Gln260 265 270Glu Leu Ile Gln Arg Pro Glu
Arg Ile Pro Ala Ala Cys Glu Glu Leu275 280
285Leu Arg Arg Phe Ser Leu Val Ala Asp Gly Arg Ile Leu Thr Ser Asp290
295 300Tyr Glu Phe His Gly Val Gln Leu Lys
Lys Gly Asp Gln Ile Leu Leu305 310 315
320Pro Gln Met Leu Ser Gly Leu Asp Glu Arg Glu Asn Ala Cys
Pro Met325 330 335His Val Asp Phe Ser Arg
Gln Lys Val Ser His Thr Thr Phe Gly His340 345
350Gly Ser His Leu Cys Leu Gly Gln His Leu Ala Arg Arg Glu Ile
Ile355 360 365Val Thr Leu Lys Glu Trp Leu
Thr Arg Ile Pro Asp Phe Ser Ile Ala370 375
380Pro Gly Ala Gln Ile Gln His Lys Ser Gly Ile Val Ser Gly Val Gln385
390 395 400Ala Leu Pro Leu
Val Trp Asp Pro Ala Thr Thr Lys Ala Val405
41049374PRTEquus caballus 49Ser Thr Ala Gly Lys Val Ile Lys Cys Lys Ala
Ala Val Leu Trp Glu1 5 10
15Glu Lys Lys Pro Phe Ser Ile Glu Glu Val Glu Val Ala Pro Pro Lys20
25 30Ala His Glu Val Arg Ile Lys Met Val Ala
Thr Gly Ile Cys Arg Ser35 40 45Asp Asp
His Val Val Ser Gly Thr Leu Val Thr Pro Leu Pro Val Ile50
55 60Ala Gly His Glu Ala Ala Gly Ile Val Glu Ser Ile
Gly Glu Gly Val65 70 75
80Thr Thr Val Arg Pro Gly Asp Lys Val Ile Pro Leu Phe Thr Pro Gln85
90 95Cys Gly Lys Cys Arg Val Cys Lys His Pro
Glu Gly Asn Phe Cys Leu100 105 110Lys Asn
Asp Leu Ser Met Pro Arg Gly Thr Met Gln Asp Gly Thr Ser115
120 125Arg Phe Thr Cys Arg Gly Lys Pro Ile His His Phe
Leu Gly Thr Ser130 135 140Thr Phe Ser Gln
Tyr Thr Val Val Asp Glu Ile Ser Val Ala Lys Ile145 150
155 160Asp Ala Ala Ser Pro Leu Glu Lys Val
Cys Leu Ile Gly Cys Gly Phe165 170 175Ser
Thr Gly Tyr Gly Ser Ala Val Lys Val Ala Lys Val Thr Gln Gly180
185 190Ser Thr Cys Ala Val Phe Gly Leu Gly Gly Val
Gly Leu Ser Val Ile195 200 205Met Gly Cys
Lys Ala Ala Gly Ala Ala Arg Ile Ile Gly Val Asp Ile210
215 220Asn Lys Asp Lys Phe Ala Lys Ala Lys Glu Val Gly
Ala Thr Glu Cys225 230 235
240Val Asn Pro Gln Asp Tyr Lys Lys Pro Ile Gln Glu Val Leu Thr Glu245
250 255Met Ser Asn Gly Gly Val Asp Phe Ser
Phe Glu Val Ile Gly Arg Leu260 265 270Asp
Thr Met Val Thr Ala Leu Ser Cys Cys Gln Glu Ala Tyr Gly Val275
280 285Ser Val Ile Val Gly Val Pro Pro Asp Ser Gln
Asn Leu Ser Met Asn290 295 300Pro Met Leu
Leu Leu Ser Gly Arg Thr Trp Lys Gly Ala Ile Phe Gly305
310 315 320Gly Phe Lys Ser Lys Asp Ser
Val Pro Lys Leu Val Ala Asp Phe Met325 330
335Ala Lys Lys Phe Ala Leu Asp Pro Leu Ile Thr His Val Leu Pro Phe340
345 350Glu Lys Ile Asn Glu Gly Phe Asp Leu
Leu Arg Ser Gly Glu Ser Ile355 360 365Arg
Thr Ile Leu Thr Phe37050297PRTEscherichia coli 50Met Ala Thr Asn Leu Arg
Gly Val Met Ala Ala Leu Leu Thr Pro Phe1 5
10 15Asp Gln Gln Gln Ala Leu Asp Lys Ala Ser Leu Arg
Arg Leu Val Gln20 25 30Phe Asn Ile Gln
Gln Gly Ile Asp Gly Leu Tyr Val Gly Gly Ser Thr35 40
45Gly Glu Ala Phe Val Gln Ser Leu Ser Glu Arg Glu Gln Val
Leu Glu50 55 60Ile Val Ala Glu Glu Gly
Lys Gly Lys Ile Lys Leu Ile Ala His Val65 70
75 80Gly Cys Val Thr Thr Ala Glu Ser Gln Gln Leu
Ala Ala Ser Ala Lys85 90 95Arg Tyr Gly
Phe Asp Ala Val Ser Ala Val Thr Pro Phe Tyr Tyr Pro100
105 110Phe Ser Phe Glu Glu His Cys Asp His Tyr Arg Ala
Ile Ile Asp Ser115 120 125Ala Asp Gly Leu
Pro Met Val Val Tyr Asn Ile Pro Ala Leu Ser Gly130 135
140Val Lys Leu Thr Leu Asp Gln Ile Asn Thr Leu Val Thr Leu
Pro Gly145 150 155 160Val
Gly Ala Leu Lys Gln Thr Ser Gly Asp Leu Tyr Gln Met Glu Gln165
170 175Ile Arg Arg Glu His Pro Asp Leu Val Leu Tyr
Asn Gly Tyr Asp Glu180 185 190Ile Phe Ala
Ser Gly Leu Leu Ala Gly Ala Asp Gly Gly Ile Gly Ser195
200 205Thr Tyr Asn Ile Met Gly Trp Arg Tyr Gln Gly Ile
Val Lys Ala Leu210 215 220Lys Glu Gly Asp
Ile Gln Thr Ala Gln Lys Leu Gln Thr Glu Cys Asn225 230
235 240Lys Val Ile Asp Leu Leu Ile Lys Thr
Gly Val Phe Arg Gly Leu Lys245 250 255Thr
Val Leu His Tyr Met Asp Val Val Ser Val Pro Leu Cys Arg Lys260
265 270Pro Phe Gly Pro Val Asp Glu Lys Tyr Leu Pro
Glu Leu Lys Ala Leu275 280 285Ala Gln Gln
Leu Met Gln Glu Arg Gly290 29551268PRTSalmonella
typhimurium 51Met Glu Arg Tyr Glu Asn Leu Phe Ala Gln Leu Asn Asp Arg Arg
Glu1 5 10 15Gly Ala Phe
Val Pro Phe Val Thr Leu Gly Asp Pro Gly Ile Glu Gln20 25
30Ser Leu Lys Ile Ile Asp Thr Leu Ile Asp Ala Gly Ala
Asp Ala Leu35 40 45Glu Leu Gly Val Pro
Phe Ser Asp Pro Leu Ala Asp Gly Pro Thr Ile50 55
60Gln Asn Ala Asn Leu Arg Ala Phe Ala Ala Gly Val Thr Pro Ala
Gln65 70 75 80Cys Phe
Glu Met Leu Ala Leu Ile Arg Glu Lys His Pro Thr Ile Pro85
90 95Ile Gly Leu Leu Met Tyr Ala Asn Leu Val Phe Asn
Asn Gly Ile Asp100 105 110Ala Phe Tyr Ala
Arg Cys Glu Gln Val Gly Val Asp Ser Val Leu Val115 120
125Ala Asp Val Pro Val Glu Glu Ser Ala Pro Phe Arg Gln Ala
Ala Leu130 135 140Arg His Asn Ile Ala Pro
Ile Phe Ile Cys Pro Pro Asn Ala Asp Asp145 150
155 160Asp Leu Leu Arg Gln Val Ala Ser Tyr Gly Arg
Gly Tyr Thr Tyr Leu165 170 175Leu Ser Arg
Ser Gly Val Thr Gly Ala Glu Asn Arg Gly Ala Leu Pro180
185 190Leu His His Leu Ile Glu Lys Leu Lys Glu Tyr His
Ala Ala Pro Ala195 200 205Leu Gln Gly Phe
Gly Ile Ser Ser Pro Glu Gln Val Ser Ala Ala Val210 215
220Arg Ala Gly Ala Ala Gly Ala Ile Ser Gly Ser Ala Ile Val
Lys Ile225 230 235 240Ile
Glu Lys Asn Leu Ala Ser Pro Lys Gln Met Leu Ala Glu Leu Arg245
250 255Ser Phe Val Ser Ala Met Lys Ala Ala Ser Arg
Ala260 26552393PRTActinoplanes missouriensis 52Ser Val
Gln Ala Thr Arg Glu Asp Lys Phe Ser Phe Gly Leu Trp Thr1 5
10 15Val Gly Trp Gln Ala Arg Asp Ala
Phe Gly Asp Ala Thr Arg Thr Ala20 25
30Leu Asp Pro Val Glu Ala Val His Lys Leu Ala Glu Ile Gly Ala Tyr35
40 45Gly Ile Thr Phe His Asp Asp Asp Leu Val
Pro Phe Gly Ser Asp Ala50 55 60Gln Thr
Arg Asp Gly Ile Ile Ala Gly Phe Lys Lys Ala Leu Asp Glu65
70 75 80Thr Gly Leu Ile Val Pro Met
Val Thr Thr Asn Leu Phe Thr His Pro85 90
95Val Phe Lys Asp Gly Gly Phe Thr Ser Asn Asp Arg Ser Val Arg Arg100
105 110Tyr Ala Ile Arg Lys Val Leu Arg Gln
Met Asp Leu Gly Ala Glu Leu115 120 125Gly
Ala Lys Thr Leu Val Leu Trp Gly Gly Arg Glu Gly Ala Glu Tyr130
135 140Asp Ser Ala Lys Asp Val Ser Ala Ala Leu Asp
Arg Tyr Arg Glu Ala145 150 155
160Leu Asn Leu Leu Ala Gln Tyr Ser Glu Asp Arg Gly Tyr Gly Leu
Arg165 170 175Phe Ala Ile Glu Pro Lys Pro
Asn Glu Pro Arg Gly Asp Ile Leu Leu180 185
190Pro Thr Ala Gly His Ala Ile Ala Phe Val Gln Glu Leu Glu Arg Pro195
200 205Glu Leu Phe Gly Ile Asn Pro Glu Thr
Gly Asn Glu Gln Met Ser Asn210 215 220Leu
Asn Phe Thr Gln Gly Ile Ala Gln Ala Leu Trp His Lys Lys Leu225
230 235 240Phe His Ile Asp Leu Asn
Gly Gln His Gly Pro Lys Phe Asp Gln Asp245 250
255Leu Val Phe Gly His Gly Asp Leu Leu Asn Ala Phe Ser Leu Val
Asp260 265 270Leu Leu Glu Asn Gly Pro Asp
Gly Ala Pro Ala Tyr Asp Gly Pro Arg275 280
285His Phe Asp Tyr Lys Pro Ser Arg Thr Glu Asp Tyr Asp Gly Val Trp290
295 300Glu Ser Ala Lys Ala Asn Ile Arg Met
Tyr Leu Leu Leu Lys Glu Arg305 310 315
320Ala Lys Ala Phe Arg Ala Asp Pro Glu Val Gln Glu Ala Leu
Ala Ala325 330 335Ser Lys Val Ala Glu Leu
Lys Thr Pro Thr Leu Asn Pro Gly Glu Gly340 345
350Tyr Ala Glu Leu Leu Ala Asp Arg Ser Ala Phe Glu Asp Tyr Asp
Ala355 360 365Asp Ala Val Gly Ala Lys Gly
Phe Gly Phe Val Lys Leu Asn Gln Leu370 375
380Ala Ile Glu His Leu Leu Gly Ala Arg385
39053348PRTBacteriophage T7 53Val Asn Ile Lys Thr Asn Pro Phe Lys Ala Val
Ser Phe Val Glu Ser1 5 10
15Ala Ile Lys Lys Ala Leu Asp Asn Ala Gly Tyr Leu Ile Ala Glu Ile20
25 30Lys Tyr Asp Gly Val Arg Gly Asn Ile Cys
Val Asp Asn Thr Ala Asn35 40 45Ser Tyr
Trp Leu Ser Arg Val Ser Lys Thr Ile Pro Ala Leu Glu His50
55 60Leu Asn Gly Phe Asp Val Arg Trp Lys Arg Leu Leu
Asn Asp Asp Arg65 70 75
80Cys Phe Tyr Lys Asp Gly Phe Met Leu Asp Gly Glu Leu Met Val Lys85
90 95Gly Val Asp Phe Asn Thr Gly Ser Gly Leu
Leu Arg Thr Lys Trp Thr100 105 110Asp Thr
Lys Asn Gln Glu Phe His Glu Glu Leu Phe Val Glu Pro Ile115
120 125Arg Lys Lys Asp Lys Val Pro Phe Lys Leu His Thr
Gly His Leu His130 135 140Ile Lys Leu Tyr
Ala Ile Leu Pro Leu His Ile Val Glu Ser Gly Glu145 150
155 160Asp Cys Asp Val Met Thr Leu Leu Met
Gln Glu His Val Lys Asn Met165 170 175Leu
Pro Leu Leu Gln Glu Tyr Phe Pro Glu Ile Glu Trp Gln Ala Ala180
185 190Glu Ser Tyr Glu Val Tyr Asp Met Val Glu Leu
Gln Gln Leu Tyr Glu195 200 205Gln Lys Arg
Ala Glu Gly His Glu Gly Leu Ile Val Lys Asp Pro Met210
215 220Cys Ile Tyr Lys Arg Gly Lys Lys Ser Gly Trp Trp
Lys Met Lys Pro225 230 235
240Glu Asn Glu Ala Asp Gly Ile Ile Gln Gly Leu Val Trp Gly Thr Lys245
250 255Gly Leu Ala Asn Glu Gly Lys Val Ile
Gly Phe Glu Val Leu Leu Glu260 265 270Ser
Gly Arg Leu Val Asn Ala Thr Asn Ile Ser Arg Ala Leu Met Asp275
280 285Glu Phe Thr Glu Thr Val Lys Glu Ala Thr Leu
Ser Gln Trp Gly Phe290 295 300Phe Ser Pro
Tyr Gly Ile Gly Asp Asn Asp Ala Cys Thr Ile Asn Pro305
310 315 320Tyr Asp Gly Trp Ala Cys Gln
Ile Ser Tyr Met Glu Glu Thr Pro Asp325 330
335Gly Ser Leu Arg His Pro Ser Phe Val Met Phe Arg340
3455442DNAArtificial SequenceDescription of Artificial Sequence =
Synthetic Construct 54g gtg gta tca gca ggc cac tgc tac aag tcc cgc
atc cag gt 42Val Val Ser Ala Gly His Cys Tyr Lys Ser Arg Ile
Gln1 5 105513PRTArtificial
SequenceDescription of Artificial Sequence = binding site for restr1
and restr2 55Val Val Ser Ala Gly His Cys Tyr Lys Ser Arg Ile Gln1
5 105642DNAArtificial SequenceDescription of
Artificial Sequence = Synthetic Construct 56ggtggtatcc gcgggccact
gctacaagtc ccggatccag gt 425742DNAArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
57acctggatcc gggacttgta gcagtggccc gcggatacca cc
425850DNAArtificial SequenceDescription of Artificial Sequence = Synthetic
Construct 58cc act ggc acg aag tgc ctc atc tct ggc tgg ggc aac act
gcg agc 47Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly Asn Thr Ala
Ser1 5 10 15tct
50Ser5916PRTArtificial SequenceDescription of Artificial Sequence =
binding site for restr3 and restr4 59Thr Gly Thr Lys Cys Leu Ile Ser Gly
Trp Gly Asn Thr Ala Ser Ser1 5 10
156050DNAArtificial SequenceDescription of Artificial Sequence =
Synthetic Construct 60ccactggcac gaagtgcctc atctctggct ggggcaacac
tgcgagctct 506150DNAArtificial SequenceDescription of
Artificial Sequence = Synthetic Construct 61agagctagca gtgttgcccc
agccagagat gaggcacttg gtaccagtgg 506230DNAArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
62ggggtacccc accaccatga atccactcct
306330DNAArtificial SequenceDescription of Artificial Sequence = Synthetic
Construct 63cgggatccgg tatagagact gaagagatac
306439DNAArtificial SequenceDescription of Artificial
Sequence = Synthetic Construct 64g ggc cac tgc tac nnn nnn nnn nnn
nnn nnn aag tcc cg 39Gly His Cys Tyr Xaa Xaa Xaa Xaa Xaa
Xaa Lys Ser1 5 106512PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
65Gly His Cys Tyr Xaa Xaa Xaa Xaa Xaa Xaa Lys Ser1 5
106645DNAArtificial SequenceDescription of Artificial Sequence
= Synthetic Construct 66cgcccggtga cgatgnnnnn nnnnnnnnnn nnnttcaggg
cctag 456747DNAArtificial SequenceDescription of
Artificial Sequence = Synthetic Construct 67c aag tgc ctc atc tct
ggc tgg ggc aac nnn nnn nnn nnn nnn act g 47Lys Cys Leu Ile Ser Gly
Trp Gly Asn Xaa Xaa Xaa Xaa Xaa Thr1 5 10
156815PRTArtificial SequenceDescription of Artificial
Sequence = Synthetic Construct 68Lys Cys Leu Ile Ser Gly Trp Gly
Asn Xaa Xaa Xaa Xaa Xaa Thr1 5 10
156955DNAArtificial SequenceDescription of Artificial Sequence =
Synthetic Construct 69catggttcac ggagtagaga ccgaccccgt tgnnnnnnnn
nnnnnnntga cgatc 557059DNAArtificial SequenceDescription of
Artificial Sequence = Synthetic Construct 70tggtatccgc gggccactgc
tacnnbnnbn nbnnbnnbnn baagtcccgg atccaggtg 597152DNAArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
71ggcgccagag ctagcagtvn nvnnvnnvnn vnngttgccc cagccagaga tg
52726PRTArtificial SequenceDescription of Artificial Sequence = Synthetic
Construct 72Ala Phe Phe Asn Gly Asp1 5735PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
73Arg Lys Asp Pro Trp1 574234PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
74Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val Pro Tyr Gln Val1
5 10 15Ser Leu Asn Ser Gly Tyr
His Phe Cys Gly Gly Ser Leu Ile Asn Glu20 25
30Gln Trp Val Val Ser Ala Gly His Cys Tyr Ala Ala Phe Asn Gly Lys35
40 45Ser Arg Ile Gln Val Arg Leu Gly Glu
His Asn Ile Glu Val Leu Glu50 55 60Gly
Asn Glu Gln Phe Ile Asn Ala Ala Lys Ile Ile Arg His Pro Gln65
70 75 80Tyr Asp Arg Lys Thr Leu
Asn Asn Asp Ile Met Leu Ile Lys Leu Ser85 90
95Ser Arg Ala Val Ile Asn Ala Arg Val Ser Thr Ile Ser Leu Pro Thr100
105 110Ala Pro Pro Ala Thr Gly Thr Lys
Cys Leu Ile Ser Gly Trp Gly Asn115 120
125Arg Lys Asp Phe Trp Thr Ala Ser Ser Gly Ala Asp Tyr Pro Asp Glu130
135 140Leu Gln Cys Leu Asp Ala Pro Val Leu
Ser Gln Ala Lys Cys Glu Ala145 150 155
160Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met Phe Cys Val Gly
Phe Leu165 170 175Glu Gly Gly Lys Asp Ser
Cys Gln Gly Asp Ser Gly Gly Pro Val Val180 185
190Cys Asn Gly Gln Leu Gln Gly Val Val Ser Trp Gly Asp Gly Cys
Ala195 200 205Gln Lys Asn Lys Pro Gly Val
Tyr Thr Lys Val Tyr Asn Tyr Val Lys210 215
220Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser225
23075234PRTArtificial SequenceDescription of Artificial Sequence =
Synthetic Construct 75Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser
Val Pro Tyr Gln Val1 5 10
15Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn Glu20
25 30Gln Trp Val Val Ser Ala Gly His Cys Tyr
Ala Ala Phe Asn Gly Lys35 40 45Ser Arg
Ile Gln Val Arg Leu Gly Glu His Asn Ile Gly Val Leu Glu50
55 60Gly Asn Glu Gln Phe Ile Asn Ala Ala Lys Ile Ile
Arg His Pro Gln65 70 75
80Tyr Asp Trp Lys Thr Leu Asn Asn Asp Ile Met Leu Ile Lys Leu Ser85
90 95Ser Arg Ala Val Ile Asn Ala Arg Val Ser
Thr Ile Ser Leu Pro Thr100 105 110Ala Pro
Pro Ala Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly Asn115
120 125Arg Lys Asp Phe Trp Thr Ala Ser Ser Gly Ala Asp
Phe Pro Asp Glu130 135 140Leu Gln Cys Leu
Asp Ala Pro Val Leu Ser Gln Thr Lys Cys Glu Ala145 150
155 160Ser Tyr Pro Gly Lys Ile Thr Ser Asn
Met Phe Cys Val Gly Phe Leu165 170 175Glu
Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Val Val180
185 190Arg Asn Gly Gln Leu Gln Gly Val Val Ser Trp
Gly Asp Gly Cys Ala195 200 205Gln Lys Asn
Lys Pro Gly Val Tyr Thr Lys Val Tyr Asn Tyr Val Lys210
215 220Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser225
2307612PRTArtificial SequenceDescription of Artificial Sequence =
Synthetic Construct 76Leu Leu Trp Leu Gly Arg Val Val Gly Gly Pro
Val1 5 107712PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
77Lys Lys Trp Leu Gly Arg Val Pro Gly Gly Pro Val1 5
10786PRTArtificial SequenceDescription of Artificial Sequence =
Synthetic Construct 78Asp Ala Val Gly Arg Asp1
5796PRTArtificial SequenceDescription of Artificial Sequence = Synthetic
Construct 79Asn Gly Arg Asp Leu Glu1 5806PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
80Gly Phe Val Met Phe Asn1 5815PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
81Arg Val His Pro Ser1 5825PRTArtificial SequenceDescription
of Artificial Sequence = Synthetic Construct 82Val Arg Gly Thr Trp1
5835PRTArtificial SequenceDescription of Artificial Sequence
= Synthetic Construct 83Arg Ser Pro Leu Thr1
5846PRTArtificial SequenceDescription of Artificial Sequence = Synthetic
Construct 84Arg Pro Trp Asp Pro Ser1 5856PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
85Gly Phe Val Met Phe Asn1 5866PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
86Glu Ile Ala Asn Arg Glu1 5876PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
87Lys Ala Val Val Gly Thr1 5886PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
88Val Asn Ile Met Ala Ala1 5896PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
89Ala Ala Phe Asn Gly Asp1 5905PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
90Val His Pro Thr Ser1 5915PRTArtificial SequenceDescription
of Artificial Sequence = Synthetic Construct 91Arg Ser Pro Leu Thr1
5925PRTArtificial SequenceDescription of Artificial Sequence
= Synthetic Construct 92Arg Gly Ala Arg Thr1
5935PRTArtificial SequenceDescription of Artificial Sequence = Synthetic
Construct 93Arg Thr Pro Ile Ser1 5945PRTArtificial
SequenceDescription of Artificial Sequence = Synthetic Construct
94Thr Thr Ala Arg Lys1 5955PRTArtificial SequenceDescription
of Artificial Sequence = Synthetic Construct 95Arg Lys Asp Phe Trp1
596157PRTHomo sapiens 96Val Arg Ser Ser Ser Arg Thr Pro Ser
Asp Lys Pro Val Ala His Val1 5 10
15Val Ala Asn Pro Gln Ala Glu Gly Gln Leu Gln Trp Leu Asn Arg
Arg20 25 30Ala Asn Ala Leu Leu Ala Asn
Gly Val Glu Leu Arg Asp Asn Gln Leu35 40
45Val Val Pro Ser Glu Gly Leu Tyr Leu Ile Tyr Ser Gln Val Leu Phe50
55 60Lys Gly Gln Gly Cys Pro Ser Thr His Val
Leu Leu Thr His Thr Ile65 70 75
80Ser Arg Ile Ala Val Ser Tyr Gln Thr Lys Val Asn Leu Leu Ser
Ala85 90 95Ile Lys Ser Pro Cys Gln Arg
Glu Thr Pro Glu Gly Ala Glu Ala Lys100 105
110Pro Trp Tyr Glu Pro Ile Tyr Leu Gly Gly Val Phe Gln Leu Glu Lys115
120 125Gly Asp Arg Leu Ser Ala Glu Ile Asn
Arg Pro Asp Tyr Leu Leu Phe130 135 140Ala
Glu Ser Gly Gln Val Tyr Phe Gly Ile Ile Ala Leu145 150
15597306PRTHomo sapiens 97Ile Trp Glu Leu Lys Lys Asp Val
Tyr Val Val Glu Leu Asp Trp Tyr1 5 10
15Pro Asp Ala Pro Gly Glu Met Val Val Leu Thr Cys Asp Thr
Pro Glu20 25 30Glu Asp Gly Ile Thr Trp
Thr Leu Asp Gln Ser Ser Glu Val Leu Gly35 40
45Ser Gly Lys Thr Leu Thr Ile Gln Val Lys Glu Phe Gly Asp Ala Gly50
55 60Gln Tyr Thr Cys His Lys Gly Gly Glu
Val Leu Ser His Ser Leu Leu65 70 75
80Leu Leu His Lys Lys Glu Asp Gly Ile Trp Ser Thr Asp Ile
Leu Lys85 90 95Asp Gln Lys Glu Pro Lys
Asn Lys Thr Phe Leu Arg Cys Glu Ala Lys100 105
110Asn Tyr Ser Gly Arg Phe Thr Cys Trp Trp Leu Thr Thr Ile Ser
Thr115 120 125Asp Leu Thr Phe Ser Val Lys
Ser Ser Arg Gly Ser Ser Asp Pro Gln130 135
140Gly Val Thr Cys Gly Ala Ala Thr Leu Ser Ala Glu Arg Val Arg Gly145
150 155 160Asp Asn Lys Glu
Tyr Glu Tyr Ser Val Glu Cys Gln Glu Asp Ser Ala165 170
175Cys Pro Ala Ala Glu Glu Ser Leu Pro Ile Glu Val Met Val
Asp Ala180 185 190Val His Lys Leu Lys Tyr
Glu Asn Tyr Thr Ser Ser Phe Phe Ile Arg195 200
205Asp Ile Ile Lys Pro Asp Pro Pro Lys Asn Leu Gln Leu Lys Pro
Leu210 215 220Lys Asn Ser Arg Gln Val Glu
Val Ser Trp Glu Tyr Pro Asp Thr Trp225 230
235 240Ser Thr Pro His Ser Tyr Phe Ser Leu Thr Phe Cys
Val Gln Val Gln245 250 255Gly Lys Ser Lys
Arg Glu Lys Lys Asp Arg Val Phe Thr Asp Lys Thr260 265
270Ser Ala Thr Val Ile Cys Arg Lys Asn Ala Ser Ile Ser Val
Arg Ala275 280 285Gln Asp Arg Tyr Tyr Ser
Ser Ser Trp Ser Glu Trp Ala Ser Val Pro290 295
300Cys Ser30598157PRTHomo sapiens 98Tyr Phe Gly Lys Leu Glu Ser Lys
Leu Ser Val Ile Arg Asn Leu Asn1 5 10
15Asp Gln Val Leu Phe Ile Asp Gln Gly Asn Arg Pro Leu Phe
Glu Asp20 25 30Met Thr Asp Ser Asp Cys
Arg Asp Asn Ala Pro Arg Thr Ile Phe Ile35 40
45Ile Ser Met Tyr Lys Asp Ser Gln Pro Arg Gly Met Ala Val Thr Ile50
55 60Ser Val Lys Cys Glu Lys Ile Ser Thr
Leu Ser Cys Glu Asn Lys Ile65 70 75
80Ile Ser Phe Lys Glu Met Asn Pro Pro Asp Asn Ile Lys Asp
Thr Lys85 90 95Ser Asp Ile Ile Phe Phe
Gln Arg Ser Val Pro Gly His Asp Asn Lys100 105
110Met Gln Phe Glu Ser Ser Ser Tyr Glu Gly Tyr Phe Leu Ala Cys
Glu115 120 125Lys Glu Arg Asp Leu Phe Lys
Leu Ile Leu Lys Lys Glu Asp Glu Leu130 135
140Gly Asp Arg Ser Ile Met Phe Thr Val Gln Asn Glu Asp145
150 15599133PRTHomo sapiens 99Ala Pro Thr Ser Ser Ser
Thr Lys Lys Thr Gln Leu Gln Leu Glu His1 5
10 15Leu Leu Leu Asp Leu Gln Met Ile Leu Asn Gly Ile
Asn Asn Tyr Lys20 25 30Asn Pro Lys Leu
Thr Arg Met Leu Thr Phe Lys Phe Tyr Met Pro Lys35 40
45Lys Ala Thr Glu Leu Lys His Leu Gln Cys Leu Glu Glu Glu
Leu Lys50 55 60Pro Leu Glu Glu Val Leu
Asn Leu Ala Gln Ser Lys Asn Phe His Leu65 70
75 80Arg Pro Arg Asp Leu Ile Ser Asn Ile Asn Val
Ile Val Leu Glu Leu85 90 95Lys Gly Ser
Glu Thr Thr Phe Met Cys Glu Tyr Ala Asp Glu Thr Ala100
105 110Thr Ile Val Glu Phe Leu Asn Arg Trp Ile Thr Phe
Cys Gln Ser Ile115 120 125Ile Ser Thr Leu
Thr13010072PRTHomo sapiens 100Ser Ala Lys Glu Leu Arg Cys Gln Cys Ile Lys
Thr Tyr Ser Lys Pro1 5 10
15Phe His Pro Lys Phe Ile Lys Glu Leu Arg Val Ile Glu Ser Gly Pro20
25 30His Cys Ala Asn Thr Glu Ile Ile Val Lys
Leu Ser Asp Gly Arg Glu35 40 45Leu Cys
Leu Asp Pro Lys Glu Asn Trp Val Gln Arg Val Val Glu Lys50
55 60Phe Leu Lys Arg Ala Glu Asn Ser65
7010174PRTHomo sapiens 101Gly Pro Ala Ser Val Pro Thr Thr Cys Cys Phe Asn
Leu Ala Asn Arg1 5 10
15Lys Ile Pro Leu Gln Arg Leu Glu Ser Tyr Arg Arg Ile Thr Ser Gly20
25 30Lys Cys Pro Gln Lys Ala Val Ile Phe Lys
Thr Lys Leu Ala Lys Asp35 40 45Ile Cys
Ala Asp Pro Lys Lys Lys Trp Val Gln Asp Ser Met Lys Tyr50
55 60Leu Asp Gln Lys Ser Pro Thr Pro Lys Pro65
7010276PRTHomo sapiens 102Gln Pro Asp Ala Ile Asn Ala Pro Val Thr
Cys Cys Tyr Asn Phe Thr1 5 10
15Asn Arg Lys Ile Ser Val Gln Arg Leu Ala Ser Tyr Arg Arg Ile Thr20
25 30Ser Ser Lys Cys Pro Lys Glu Ala Val
Ile Phe Lys Thr Ile Val Ala35 40 45Lys
Glu Ile Cys Ala Asp Pro Lys Gln Lys Trp Val Gln Asp Ser Met50
55 60Asp His Leu Asp Lys Gln Thr Gln Thr Pro Lys
Thr65 70 75103206PRTHomo sapiens 103Ala
Pro Met Ala Glu Gly Gly Gly Gln Asn His His Glu Val Val Lys1
5 10 15Phe Met Asp Val Tyr Gln Arg
Ser Tyr Cys His Pro Ile Glu Thr Leu20 25
30Val Asp Ile Phe Gln Glu Tyr Pro Asp Glu Ile Glu Tyr Ile Phe Lys35
40 45Pro Ser Cys Val Pro Leu Met Arg Cys Gly
Gly Cys Cys Asn Asp Glu50 55 60Gly Leu
Glu Cys Val Pro Thr Glu Glu Ser Asn Ile Thr Met Gln Ile65
70 75 80Met Arg Ile Lys Pro His Gln
Gly Gln His Ile Gly Glu Met Ser Phe85 90
95Leu Gln His Asn Lys Cys Glu Cys Arg Pro Lys Lys Asp Arg Ala Arg100
105 110Gln Glu Lys Lys Ser Val Arg Gly Lys
Gly Lys Gly Gln Lys Arg Lys115 120 125Arg
Lys Lys Ser Arg Tyr Lys Ser Trp Ser Val Tyr Val Gly Ala Arg130
135 140Cys Cys Leu Met Pro Trp Ser Leu Pro Gly Pro
His Pro Cys Gly Pro145 150 155
160Cys Ser Glu Arg Arg Lys His Leu Phe Val Gln Asp Pro Gln Thr
Cys165 170 175Lys Cys Ser Cys Lys Asn Thr
Asp Ser Arg Cys Lys Ala Arg Gln Leu180 185
190Glu Leu Asn Glu Arg Thr Cys Arg Cys Asp Lys Pro Arg Arg195
200 205104112PRTHomo sapiens 104Ala Leu Asp Thr Asn
Tyr Cys Phe Ser Ser Thr Glu Lys Asn Cys Cys1 5
10 15Val Arg Gln Leu Tyr Ile Asp Phe Arg Lys Asp
Leu Gly Trp Lys Trp20 25 30Ile His Glu
Pro Lys Gly Tyr His Ala Asn Phe Cys Leu Gly Pro Cys35 40
45Pro Tyr Ile Trp Ser Leu Asp Thr Gln Tyr Ser Lys Val
Leu Ala Leu50 55 60Tyr Asn Gln His Asn
Pro Gly Ala Ser Ala Ala Pro Cys Cys Val Pro65 70
75 80Gln Ala Leu Glu Pro Leu Pro Ile Val Tyr
Tyr Val Gly Arg Lys Pro85 90 95Lys Val
Glu Gln Leu Ser Asn Met Ile Val Arg Ser Cys Lys Cys Ser100
105 11010530PRTHomo sapiens 105Phe Val Asn Gln His Leu
Cys Gly Ser His Leu Val Glu Ala Leu Tyr1 5
10 15Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro
Lys Thr20 25 3010621PRTHomo sapiens
106Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu1
5 10 15Glu Asn Tyr Cys
Asn2010728PRTHomo sapiens 107Gly Ser Ser Phe Leu Ser Pro Glu His Gln Arg
Val Gln Gln Arg Lys1 5 10
15Glu Ser Lys Lys Pro Pro Ala Lys Leu Gln Pro Arg20
251089PRTHomo sapiens 108Arg Val Tyr Ile His Pro Phe His Leu1
5109114PRTHomo sapiens 109Pro Met Phe Ile Val Asn Thr Asn Val Pro Arg
Ala Ser Val Pro Asp1 5 10
15Gly Phe Leu Ser Glu Leu Thr Gln Gln Leu Ala Gln Ala Thr Gly Lys20
25 30Pro Pro Gln Tyr Ile Ala Val His Val Val
Pro Asp Gln Leu Met Ala35 40 45Phe Gly
Gly Ser Ser Glu Pro Cys Ala Leu Cys Ser Leu His Ser Ile50
55 60Gly Lys Ile Gly Gly Ala Gln Asn Arg Ser Tyr Ser
Lys Leu Leu Cys65 70 75
80Gly Leu Leu Ala Glu Arg Leu Arg Ile Ser Pro Asp Arg Val Tyr Ile85
90 95Asn Tyr Tyr Asp Met Asn Ala Ala Asn Val
Gly Trp Asn Asn Ser Thr100 105 110Phe
Ala110425PRTHomo sapiens 110Met Gly Pro Arg Arg Leu Leu Leu Val Ala Ala
Cys Phe Ser Leu Cys1 5 10
15Gly Pro Leu Leu Ser Ala Arg Thr Arg Ala Arg Arg Pro Glu Ser Lys20
25 30Ala Thr Asn Ala Thr Leu Asp Pro Arg Ser
Phe Leu Leu Arg Asn Pro35 40 45Asn Asp
Lys Tyr Glu Pro Phe Trp Glu Asp Glu Glu Lys Asn Glu Ser50
55 60Gly Leu Thr Glu Tyr Arg Leu Val Ser Ile Asn Lys
Ser Ser Pro Leu65 70 75
80Gln Lys Gln Leu Pro Ala Phe Ile Ser Glu Asp Ala Ser Gly Tyr Leu85
90 95Thr Ser Ser Trp Leu Thr Leu Phe Val Pro
Ser Val Tyr Thr Gly Val100 105 110Phe Val
Val Ser Leu Pro Leu Asn Ile Met Ala Ile Val Val Phe Ile115
120 125Leu Lys Met Lys Val Lys Lys Pro Ala Val Val Tyr
Met Leu His Leu130 135 140Ala Thr Ala Asp
Val Leu Phe Val Ser Val Leu Pro Phe Lys Ile Ser145 150
155 160Tyr Tyr Phe Ser Gly Ser Asp Trp Gln
Phe Gly Ser Glu Leu Cys Arg165 170 175Phe
Val Thr Ala Ala Phe Tyr Cys Asn Met Tyr Ala Ser Ile Leu Leu180
185 190Met Thr Val Ile Ser Ile Asp Arg Phe Leu Ala
Val Val Tyr Pro Met195 200 205Gln Ser Leu
Ser Trp Arg Thr Leu Gly Arg Ala Ser Phe Thr Cys Leu210
215 220Ala Ile Trp Ala Leu Ala Ile Ala Gly Val Val Pro
Leu Leu Leu Lys225 230 235
240Glu Gln Thr Ile Gln Val Pro Gly Leu Asn Ile Thr Thr Cys His Asp245
250 255Val Leu Asn Glu Thr Leu Leu Glu Gly
Tyr Tyr Ala Tyr Tyr Phe Ser260 265 270Ala
Phe Ser Ala Val Phe Phe Phe Val Pro Leu Ile Ile Ser Thr Val275
280 285Cys Tyr Val Ser Ile Ile Arg Cys Leu Ser Ser
Ser Ala Val Ala Asn290 295 300Arg Ser Lys
Lys Ser Arg Ala Leu Phe Leu Ser Ala Ala Val Phe Cys305
310 315 320Ile Phe Ile Ile Cys Phe Gly
Pro Thr Asn Val Leu Leu Ile Ala His325 330
335Tyr Ser Phe Leu Ser His Thr Ser Thr Thr Glu Ala Ala Tyr Phe Ala340
345 350Tyr Leu Leu Cys Val Cys Val Ser Ser
Ile Ser Cys Cys Ile Asp Pro355 360 365Leu
Ile Tyr Tyr Tyr Ala Ser Ser Glu Cys Gln Arg Tyr Val Tyr Ser370
375 380Ile Leu Cys Cys Lys Glu Ser Ser Asp Pro Ser
Ser Tyr Asn Ser Ser385 390 395
400Gly Gln Leu Met Ala Ser Lys Met Asp Thr Cys Ser Ser Asn Leu
Asn405 410 415Asn Ser Ile Tyr Lys Lys Leu
Leu Thr420 425111397PRTHomo sapiens 111Met Arg Ser Pro
Ser Ala Ala Trp Leu Leu Gly Ala Ala Ile Leu Leu1 5
10 15Ala Ala Ser Leu Ser Cys Ser Gly Thr Ile
Gln Gly Thr Asn Arg Ser20 25 30Ser Lys
Gly Arg Ser Leu Ile Gly Lys Val Asp Gly Thr Ser His Val35
40 45Thr Gly Lys Gly Val Thr Val Glu Thr Val Phe Ser
Val Asp Glu Phe50 55 60Ser Ala Ser Val
Leu Thr Gly Lys Leu Thr Thr Val Phe Leu Pro Ile65 70
75 80Val Tyr Thr Ile Val Phe Val Val Gly
Leu Pro Ser Asn Gly Met Ala85 90 95Leu
Trp Val Phe Leu Phe Arg Thr Lys Lys Lys His Pro Ala Val Ile100
105 110Tyr Met Ala Asn Leu Ala Leu Ala Asp Leu Leu
Ser Val Ile Trp Phe115 120 125Pro Leu Lys
Ile Ala Tyr His Ile His Gly Asn Asn Trp Ile Tyr Gly130
135 140Glu Ala Leu Cys Asn Val Leu Ile Gly Phe Phe Tyr
Gly Asn Met Tyr145 150 155
160Cys Ser Ile Leu Phe Met Thr Cys Leu Ser Val Gln Arg Tyr Trp Val165
170 175Ile Val Asn Pro Met Gly His Ser Arg
Lys Lys Ala Asn Ile Ala Ile180 185 190Gly
Ile Ser Leu Ala Ile Trp Leu Leu Ile Leu Leu Val Thr Ile Pro195
200 205Leu Tyr Val Val Lys Gln Thr Ile Phe Ile Pro
Ala Leu Asn Ile Thr210 215 220Thr Cys His
Asp Val Leu Pro Glu Gln Leu Leu Val Gly Asp Met Phe225
230 235 240Asn Tyr Phe Leu Ser Leu Ala
Ile Gly Val Phe Leu Phe Pro Ala Phe245 250
255Leu Thr Ala Ser Ala Tyr Val Leu Met Ile Arg Met Leu Arg Ser Ser260
265 270Ala Met Asp Glu Asn Ser Glu Lys Lys
Arg Lys Arg Ala Ile Lys Leu275 280 285Ile
Val Thr Val Leu Ala Met Tyr Leu Ile Cys Phe Thr Pro Ser Asn290
295 300Leu Leu Leu Val Val His Tyr Phe Leu Ile Lys
Ser Gln Gly Gln Ser305 310 315
320His Val Tyr Ala Leu Tyr Ile Val Ala Leu Cys Leu Ser Thr Leu
Asn325 330 335Ser Cys Ile Asp Pro Phe Val
Tyr Tyr Phe Val Ser His Asp Phe Arg340 345
350Asp His Ala Lys Asn Ala Leu Leu Cys Arg Ser Val Arg Thr Val Lys355
360 365Gln Met Gln Val Ser Leu Thr Ser Lys
Lys His Ser Arg Lys Ser Ser370 375 380Ser
Tyr Ser Ser Ser Ser Thr Thr Val Lys Thr Ser Tyr385 390
395
SEQ ID NO: 112 (length 153, PRT, Homo sapiens)
Ala Pro Val Arg Ser Leu Asn Cys
Thr Leu Arg Asp Ser Gln Gln Lys1 5 10
15Ser Leu Val Met Ser Gly Pro Tyr Glu Leu Lys Ala Leu His
Leu Gln20 25 30Gly Gln Asp Met Glu Gln
Gln Val Val Phe Ser Met Ser Phe Val Gln35 40
45Gly Glu Glu Ser Asn Asp Lys Ile Pro Val Ala Leu Gly Leu Lys Glu50
55 60Lys Asn Leu Tyr Leu Ser Cys Val Leu
Lys Asp Asp Lys Pro Thr Leu65 70 75
80Gln Leu Glu Ser Val Asp Pro Lys Asn Tyr Pro Lys Lys Lys
Met Glu85 90 95Lys Arg Phe Val Phe Asn
Lys Ile Glu Ile Asn Asn Lys Leu Glu Phe100 105
110Glu Ser Ala Gln Phe Pro Asn Trp Tyr Ile Ser Thr Ser Gln Ala
Glu115 120 125Asn Met Pro Val Phe Leu Gly
Gly Thr Lys Gly Gly Gln Asp Ile Thr130 135
140Asp Phe Thr Met Gln Phe Val Ser Ser145
150
SEQ ID NO: 113 (length 385, PRT, Homo sapiens)
Met Trp Gly Arg Leu Leu Leu Trp Pro Leu Val
Leu Gly Phe Ser Leu1 5 10
15Ser Gly Gly Thr Gln Thr Pro Ser Val Tyr Asp Glu Ser Gly Ser Thr20
25 30Gly Gly Gly Asp Asp Ser Thr Pro Ser Ile
Leu Pro Ala Pro Arg Gly35 40 45Tyr Pro
Gly Gln Val Cys Ala Asn Asp Ser Asp Thr Leu Glu Leu Pro50
55 60Asp Ser Ser Arg Ala Leu Leu Leu Gly Trp Val Pro
Thr Arg Leu Val65 70 75
80Pro Ala Leu Tyr Gly Leu Val Leu Val Val Gly Leu Pro Ala Asn Gly85
90 95Leu Ala Leu Trp Val Leu Ala Thr Gln Ala
Pro Arg Leu Pro Ser Thr100 105 110Met Leu
Leu Met Asn Leu Ala Thr Ala Asp Leu Leu Leu Ala Leu Ala115
120 125Leu Pro Pro Arg Ile Ala Tyr His Leu Arg Gly Gln
Arg Trp Pro Phe130 135 140Gly Glu Ala Ala
Cys Arg Leu Ala Thr Ala Ala Leu Tyr Gly His Met145 150
155 160Tyr Gly Ser Val Leu Leu Leu Ala Ala
Val Ser Leu Asp Arg Tyr Leu165 170 175Ala
Leu Val His Pro Leu Arg Ala Arg Ala Leu Arg Gly Arg Arg Leu180
185 190Ala Leu Gly Leu Cys Met Ala Ala Trp Leu Met
Ala Ala Ala Leu Ala195 200 205Leu Pro Leu
Thr Leu Gln Arg Gln Thr Phe Arg Leu Ala Arg Ser Asp210
215 220Arg Val Leu Cys His Asp Ala Leu Pro Leu Asp Ala
Gln Ala Ser His225 230 235
240Trp Gln Pro Ala Phe Thr Cys Leu Ala Leu Leu Gly Cys Phe Leu Pro245
250 255Leu Leu Ala Met Leu Leu Cys Tyr Gly
Ala Thr Leu His Thr Leu Ala260 265 270Ala
Ser Gly Arg Arg Tyr Gly His Ala Leu Arg Leu Thr Ala Val Val275
280 285Leu Ala Ser Ala Val Ala Phe Phe Val Pro Ser
Asn Leu Leu Leu Leu290 295 300Leu His Tyr
Ser Asp Pro Ser Pro Ser Ala Trp Gly Asn Leu Tyr Gly305
310 315 320Ala Tyr Val Pro Ser Leu Ala
Leu Ser Thr Leu Asn Ser Cys Val Asp325 330
335Pro Phe Ile Tyr Tyr Tyr Val Ser Ala Glu Phe Arg Asp Lys Val Arg340
345 350Ala Gly Leu Phe Gln Arg Ser Pro Gly
Asp Thr Val Ala Ser Lys Ala355 360 365Ser
Ala Glu Gly Gly Ser Arg Gly Met Gly Thr His Ser Ser Leu Leu370
375 380Gln385
SEQ ID NO: 114 (length 1338, PRT, Homo sapiens)
Met Val Ser
Tyr Trp Asp Thr Gly Val Leu Leu Cys Ala Leu Leu Ser1 5
10 15Cys Leu Leu Leu Thr Gly Ser Ser Ser
Gly Ser Lys Leu Lys Asp Pro20 25 30Glu
Leu Ser Leu Lys Gly Thr Gln His Ile Met Gln Ala Gly Gln Thr35
40 45Leu His Leu Gln Cys Arg Gly Glu Ala Ala His
Lys Trp Ser Leu Pro50 55 60Glu Met Val
Ser Lys Glu Ser Glu Arg Leu Ser Ile Thr Lys Ser Ala65 70
75 80Cys Gly Arg Asn Gly Lys Gln Phe
Cys Ser Thr Leu Thr Leu Asn Thr85 90
95Ala Gln Ala Asn His Thr Gly Phe Tyr Ser Cys Lys Tyr Leu Ala Val100
105 110Pro Thr Ser Lys Lys Lys Glu Thr Glu Ser
Ala Ile Tyr Ile Phe Ile115 120 125Ser Asp
Thr Gly Arg Pro Phe Val Glu Met Tyr Ser Glu Ile Pro Glu130
135 140Ile Ile His Met Thr Glu Gly Arg Glu Leu Val Ile
Pro Cys Arg Val145 150 155
160Thr Ser Pro Asn Ile Thr Val Thr Leu Lys Lys Phe Pro Leu Asp Thr165
170 175Leu Ile Pro Asp Gly Lys Arg Ile Ile
Trp Asp Ser Arg Lys Gly Phe180 185 190Ile
Ile Ser Asn Ala Thr Tyr Lys Glu Ile Gly Leu Leu Thr Cys Glu195
200 205Ala Thr Val Asn Gly His Leu Tyr Lys Thr Asn
Tyr Leu Thr His Arg210 215 220Gln Thr Asn
Thr Ile Ile Asp Val Gln Ile Ser Thr Pro Arg Pro Val225
230 235 240Lys Leu Leu Arg Gly His Thr
Leu Val Leu Asn Cys Thr Ala Thr Thr245 250
255Pro Leu Asn Thr Arg Val Gln Met Thr Trp Ser Tyr Pro Asp Glu Lys260
265 270Asn Lys Arg Ala Ser Val Arg Arg Arg
Ile Asp Gln Ser Asn Ser His275 280 285Ala
Asn Ile Phe Tyr Ser Val Leu Thr Ile Asp Lys Met Gln Asn Lys290
295 300Asp Lys Gly Leu Tyr Thr Cys Arg Val Arg Ser
Gly Pro Ser Phe Lys305 310 315
320Ser Val Asn Thr Ser Val His Ile Tyr Asp Lys Ala Phe Ile Thr
Val325 330 335Lys His Arg Lys Gln Gln Val
Leu Glu Thr Val Ala Gly Lys Arg Ser340 345
350Tyr Arg Leu Ser Met Lys Val Lys Ala Phe Pro Ser Pro Glu Val Val355
360 365Trp Leu Lys Asp Gly Leu Pro Ala Thr
Glu Lys Ser Ala Arg Tyr Leu370 375 380Thr
Arg Gly Tyr Ser Leu Ile Ile Lys Asp Val Thr Glu Glu Asp Ala385
390 395 400Gly Asn Tyr Thr Ile Leu
Leu Ser Ile Lys Gln Ser Asn Val Phe Lys405 410
415Asn Leu Thr Ala Thr Leu Ile Val Asn Val Lys Pro Gln Ile Tyr
Glu420 425 430Lys Ala Val Ser Ser Phe Pro
Asp Pro Ala Leu Tyr Pro Leu Gly Ser435 440
445Arg Gln Ile Leu Thr Cys Thr Ala Tyr Gly Ile Pro Gln Pro Thr Ile450
455 460Lys Trp Phe Trp His Pro Cys Asn His
Asn His Ser Glu Ala Arg Cys465 470 475
480Asp Phe Cys Ser Asn Asn Glu Glu Ser Phe Ile Leu Asp Ala
Asp Ser485 490 495Asn Met Gly Asn Arg Ile
Glu Ser Ile Thr Gln Arg Met Ala Ile Ile500 505
510Glu Gly Lys Asn Lys Met Ala Ser Thr Leu Val Val Ala Asp Ser
Arg515 520 525Ile Ser Gly Ile Tyr Ile Cys
Ile Ala Ser Asn Lys Val Gly Thr Val530 535
540Gly Arg Asn Ile Ser Phe Tyr Ile Thr Asp Val Pro Asn Gly Phe His545
550 555 560Val Asn Leu Glu
Lys Met Pro Thr Glu Gly Glu Asp Leu Lys Leu Ser565 570
575Cys Thr Val Asn Lys Phe Leu Tyr Arg Asp Val Thr Trp Ile
Leu Leu580 585 590Arg Thr Val Asn Asn Arg
Thr Met His Tyr Ser Ile Ser Lys Gln Lys595 600
605Met Ala Ile Thr Lys Glu His Ser Ile Thr Leu Asn Leu Thr Ile
Met610 615 620Asn Val Ser Leu Gln Asp Ser
Gly Thr Tyr Ala Cys Arg Ala Arg Asn625 630
635 640Val Tyr Thr Gly Glu Glu Ile Leu Gln Lys Lys Glu
Ile Thr Ile Arg645 650 655Asp Gln Glu Ala
Pro Tyr Leu Leu Arg Asn Leu Ser Asp His Thr Val660 665
670Ala Ile Ser Ser Ser Thr Thr Leu Asp Cys His Ala Asn Gly
Val Pro675 680 685Glu Pro Gln Ile Thr Trp
Phe Lys Asn Asn His Lys Ile Gln Gln Glu690 695
700Pro Gly Ile Ile Leu Gly Pro Gly Ser Ser Thr Leu Phe Ile Glu
Arg705 710 715 720Val Thr
Glu Glu Asp Glu Gly Val Tyr His Cys Lys Ala Thr Asn Gln725
730 735Lys Gly Ser Val Glu Ser Ser Ala Tyr Leu Thr Val
Gln Gly Thr Ser740 745 750Asp Lys Ser Asn
Leu Glu Leu Ile Thr Leu Thr Cys Thr Cys Val Ala755 760
765Ala Thr Leu Phe Trp Leu Leu Leu Thr Leu Leu Ile Arg Lys
Met Lys770 775 780Arg Ser Ser Ser Glu Ile
Lys Thr Asp Tyr Leu Ser Ile Ile Met Asp785 790
795 800Pro Asp Glu Val Pro Leu Asp Glu Gln Cys Glu
Arg Leu Pro Tyr Asp805 810 815Ala Ser Lys
Trp Glu Phe Ala Arg Glu Arg Leu Lys Leu Gly Lys Ser820
825 830Leu Gly Arg Gly Ala Phe Gly Lys Val Val Gln Ala
Ser Ala Phe Gly835 840 845Ile Lys Lys Ser
Pro Thr Cys Arg Thr Val Ala Val Lys Met Leu Lys850 855
860Glu Gly Ala Thr Ala Ser Glu Tyr Lys Ala Leu Met Thr Glu
Leu Lys865 870 875 880Ile
Leu Thr His Ile Gly His His Leu Asn Val Val Asn Leu Leu Gly885
890 895Ala Cys Thr Lys Gln Gly Gly Pro Leu Met Val
Ile Val Glu Tyr Cys900 905 910Lys Tyr Gly
Asn Leu Ser Asn Tyr Leu Lys Ser Lys Arg Asp Leu Phe915
920 925Phe Leu Asn Lys Asp Ala Ala Leu His Met Glu Pro
Lys Lys Glu Lys930 935 940Met Glu Pro Gly
Leu Glu Gln Gly Lys Lys Pro Arg Leu Asp Ser Val945 950
955 960Thr Ser Ser Glu Ser Phe Ala Ser Ser
Gly Phe Gln Glu Asp Lys Ser965 970 975Leu
Ser Asp Val Glu Glu Glu Glu Asp Ser Asp Gly Phe Tyr Lys Glu980
985 990Pro Ile Thr Met Glu Asp Leu Ile Ser Tyr Ser
Phe Gln Val Ala Arg995 1000 1005Gly Met
Glu Phe Leu Ser Ser Arg Lys Cys Ile His Arg Asp Leu1010
1015 1020Ala Ala Arg Asn Ile Leu Leu Ser Glu Asn Asn
Val Val Lys Ile1025 1030 1035Cys Asp
Phe Gly Leu Ala Arg Asp Ile Tyr Lys Asn Pro Asp Tyr1040
1045 1050Val Arg Lys Gly Asp Thr Arg Leu Pro Leu Lys
Trp Met Ala Pro1055 1060 1065Glu Ser
Ile Phe Asp Lys Ile Tyr Ser Thr Lys Ser Asp Val Trp1070
1075 1080Ser Tyr Gly Val Leu Leu Trp Glu Ile Phe Ser
Leu Gly Gly Ser1085 1090 1095Pro Tyr
Pro Gly Val Gln Met Asp Glu Asp Phe Cys Ser Arg Leu1100
1105 1110Arg Glu Gly Met Arg Met Arg Ala Pro Glu Tyr
Ser Thr Pro Glu1115 1120 1125Ile Tyr
Gln Ile Met Leu Asp Cys Trp His Arg Asp Pro Lys Glu1130
1135 1140Arg Pro Arg Phe Ala Glu Leu Val Glu Lys Leu
Gly Asp Leu Leu1145 1150 1155Gln Ala
Asn Val Gln Gln Asp Gly Lys Asp Tyr Ile Pro Ile Asn1160
1165 1170Ala Ile Leu Thr Gly Asn Ser Gly Phe Thr Tyr
Ser Thr Pro Ala1175 1180 1185Phe Ser
Glu Asp Phe Phe Lys Glu Ser Ile Ser Ala Pro Lys Phe1190
1195 1200Asn Ser Gly Ser Ser Asp Asp Val Arg Tyr Val
Asn Ala Phe Lys1205 1210 1215Phe Met
Ser Leu Glu Arg Ile Lys Thr Phe Glu Glu Leu Leu Pro1220
1225 1230Asn Ala Thr Ser Met Phe Asp Asp Tyr Gln Gly
Asp Ser Ser Thr1235 1240 1245Leu Leu
Ala Ser Pro Met Leu Lys Arg Phe Thr Trp Thr Asp Ser1250
1255 1260Lys Pro Lys Ala Ser Leu Lys Ile Asp Leu Arg
Val Thr Ser Lys1265 1270 1275Ser Lys
Glu Ser Gly Leu Ser Asp Val Ser Arg Pro Ser Phe Cys1280
1285 1290His Ser Ser Cys Gly His Val Ser Glu Gly Lys
Arg Arg Phe Thr1295 1300 1305Tyr Asp
His Ala Glu Leu Glu Arg Lys Ile Ala Cys Cys Ser Pro1310
1315 1320Pro Pro Asp Tyr Asn Ser Val Val Leu Tyr Ser
Thr Pro Pro Ile1325 1330
1335
SEQ ID NO: 115 (length 1356, PRT, Homo sapiens)
Met Gln Ser Lys Val Leu Leu Ala Val Ala Leu
Trp Leu Cys Val Glu1 5 10
15Thr Arg Ala Ala Ser Val Gly Leu Pro Ser Val Ser Leu Asp Leu Pro20
25 30Arg Leu Ser Ile Gln Lys Asp Ile Leu Thr
Ile Lys Ala Asn Thr Thr35 40 45Leu Gln
Ile Thr Cys Arg Gly Gln Arg Asp Leu Asp Trp Leu Trp Pro50
55 60Asn Asn Gln Ser Gly Ser Glu Gln Arg Val Glu Val
Thr Glu Cys Ser65 70 75
80Asp Gly Leu Phe Cys Lys Thr Leu Thr Ile Pro Lys Val Ile Gly Asn85
90 95Asp Thr Gly Ala Tyr Lys Cys Phe Tyr Arg
Glu Thr Asp Leu Ala Ser100 105 110Val Ile
Tyr Val Tyr Val Gln Asp Tyr Arg Ser Pro Phe Ile Ala Ser115
120 125Val Ser Asp Gln His Gly Val Val Tyr Ile Thr Glu
Asn Lys Asn Lys130 135 140Thr Val Val Ile
Pro Cys Leu Gly Ser Ile Ser Asn Leu Asn Val Ser145 150
155 160Leu Cys Ala Arg Tyr Pro Glu Lys Arg
Phe Val Pro Asp Gly Asn Arg165 170 175Ile
Ser Trp Asp Ser Lys Lys Gly Phe Thr Ile Pro Ser Tyr Met Ile180
185 190Ser Tyr Ala Gly Met Val Phe Cys Glu Ala Lys
Ile Asn Asp Glu Ser195 200 205Tyr Gln Ser
Ile Met Tyr Ile Val Val Val Val Gly Tyr Arg Ile Tyr210
215 220Asp Val Val Leu Ser Pro Ser His Gly Ile Glu Leu
Ser Val Gly Glu225 230 235
240Lys Leu Val Leu Asn Cys Thr Ala Arg Thr Glu Leu Asn Val Gly Ile245
250 255Asp Phe Asn Trp Glu Tyr Pro Ser Ser
Lys His Gln His Lys Lys Leu260 265 270Val
Asn Arg Asp Leu Lys Thr Gln Ser Gly Ser Glu Met Lys Lys Phe275
280 285Leu Ser Thr Leu Thr Ile Asp Gly Val Thr Arg
Ser Asp Gln Gly Leu290 295 300Tyr Thr Cys
Ala Ala Ser Ser Gly Leu Met Thr Lys Lys Asn Ser Thr305
310 315 320Phe Val Arg Val His Glu Lys
Pro Phe Val Ala Phe Gly Ser Gly Met325 330
335Glu Ser Leu Val Glu Ala Thr Val Gly Glu Arg Val Arg Ile Pro Ala340
345 350Lys Tyr Leu Gly Tyr Pro Pro Pro Glu
Ile Lys Trp Tyr Lys Asn Gly355 360 365Ile
Pro Leu Glu Ser Asn His Thr Ile Lys Ala Gly His Val Leu Thr370
375 380Ile Met Glu Val Ser Glu Arg Asp Thr Gly Asn
Tyr Thr Val Ile Leu385 390 395
400Thr Asn Pro Ile Ser Lys Glu Lys Gln Ser His Val Val Ser Leu
Val405 410 415Val Tyr Val Pro Pro Gln Ile
Gly Glu Lys Ser Leu Ile Ser Pro Val420 425
430Asp Ser Tyr Gln Tyr Gly Thr Thr Gln Thr Leu Thr Cys Thr Val Tyr435
440 445Ala Ile Pro Pro Pro His His Ile His
Trp Tyr Trp Gln Leu Glu Glu450 455 460Glu
Cys Ala Asn Glu Pro Ser Gln Ala Val Ser Val Thr Asn Pro Tyr465
470 475 480Pro Cys Glu Glu Trp Arg
Ser Val Glu Asp Phe Gln Gly Gly Asn Lys485 490
495Ile Glu Val Asn Lys Asn Gln Phe Ala Leu Ile Glu Gly Lys Asn
Lys500 505 510Thr Val Ser Thr Leu Val Ile
Gln Ala Ala Asn Val Ser Ala Leu Tyr515 520
525Lys Cys Glu Ala Val Asn Lys Val Gly Arg Gly Glu Arg Val Ile Ser530
535 540Phe His Val Thr Arg Gly Pro Glu Ile
Thr Leu Gln Pro Asp Met Gln545 550 555
560Pro Thr Glu Gln Glu Ser Val Ser Leu Trp Cys Thr Ala Asp
Arg Ser565 570 575Thr Phe Glu Asn Leu Thr
Trp Tyr Lys Leu Gly Pro Gln Pro Leu Pro580 585
590Ile His Val Gly Glu Leu Pro Thr Pro Val Cys Lys Asn Leu Asp
Thr595 600 605Leu Trp Lys Leu Asn Ala Thr
Met Phe Ser Asn Ser Thr Asn Asp Ile610 615
620Leu Ile Met Glu Leu Lys Asn Ala Ser Leu Gln Asp Gln Gly Asp Tyr625
630 635 640Val Cys Leu Ala
Gln Asp Arg Lys Thr Lys Lys Arg His Cys Val Val645 650
655Arg Gln Leu Thr Val Leu Glu Arg Val Ala Pro Thr Ile Thr
Gly Asn660 665 670Leu Glu Asn Gln Thr Thr
Ser Ile Gly Glu Ser Ile Glu Val Ser Cys675 680
685Thr Ala Ser Gly Asn Pro Pro Pro Gln Ile Met Trp Phe Lys Asp
Asn690 695 700Glu Thr Leu Val Glu Asp Ser
Gly Ile Val Leu Lys Asp Gly Asn Arg705 710
715 720Asn Leu Thr Ile Arg Arg Val Arg Lys Glu Asp Glu
Gly Leu Tyr Thr725 730 735Cys Gln Ala Cys
Ser Val Leu Gly Cys Ala Lys Val Glu Ala Phe Phe740 745
750Ile Ile Glu Gly Ala Gln Glu Lys Thr Asn Leu Glu Ile Ile
Ile Leu755 760 765Val Gly Thr Ala Val Ile
Ala Met Phe Phe Trp Leu Leu Leu Val Ile770 775
780Ile Leu Arg Thr Val Lys Arg Ala Asn Gly Gly Glu Leu Lys Thr
Gly785 790 795 800Tyr Leu
Ser Ile Val Met Asp Pro Asp Glu Leu Pro Leu Asp Glu His805
810 815Cys Glu Arg Leu Pro Tyr Asp Ala Ser Lys Trp Glu
Phe Pro Arg Asp820 825 830Arg Leu Lys Leu
Gly Lys Pro Leu Gly Arg Gly Ala Phe Gly Gln Val835 840
845Ile Glu Ala Asp Ala Phe Gly Ile Asp Lys Thr Ala Thr Cys
Arg Thr850 855 860Val Ala Val Lys Met Leu
Lys Glu Gly Ala Thr His Ser Glu His Arg865 870
875 880Ala Leu Met Ser Glu Leu Lys Ile Leu Ile His
Ile Gly His His Leu885 890 895Asn Val Val
Asn Leu Leu Gly Ala Cys Thr Lys Pro Gly Gly Pro Leu900
905 910Met Val Ile Val Glu Phe Cys Lys Phe Gly Asn Leu
Ser Thr Tyr Leu915 920 925Arg Ser Lys Arg
Asn Glu Phe Val Pro Tyr Lys Thr Lys Gly Ala Arg930 935
940Phe Arg Gln Gly Lys Asp Tyr Val Gly Ala Ile Pro Val Asp
Leu Lys945 950 955 960Arg
Arg Leu Asp Ser Ile Thr Ser Ser Gln Ser Ser Ala Ser Ser Gly965
970 975Phe Val Glu Glu Lys Ser Leu Ser Asp Val Glu
Glu Glu Glu Ala Pro980 985 990Glu Asp Leu
Tyr Lys Asp Phe Leu Thr Leu Glu His Leu Ile Cys Tyr995
1000 1005Ser Phe Gln Val Ala Lys Gly Met Glu Phe Leu
Ala Ser Arg Lys1010 1015 1020Cys Ile
His Arg Asp Leu Ala Ala Arg Asn Ile Leu Leu Ser Glu1025
1030 1035Lys Asn Val Val Lys Ile Cys Asp Phe Gly Leu
Ala Arg Asp Ile1040 1045 1050Tyr Lys
Asp Pro Asp Tyr Val Arg Lys Gly Asp Ala Arg Leu Pro1055
1060 1065Leu Lys Trp Met Ala Pro Glu Thr Ile Phe Asp
Arg Val Tyr Thr1070 1075 1080Ile Gln
Ser Asp Val Trp Ser Phe Gly Val Leu Leu Trp Glu Ile1085
1090 1095Phe Ser Leu Gly Ala Ser Pro Tyr Pro Gly Val
Lys Ile Asp Glu1100 1105 1110Glu Phe
Cys Arg Arg Leu Lys Glu Gly Thr Arg Met Arg Ala Pro1115
1120 1125Asp Tyr Thr Thr Pro Glu Met Tyr Gln Thr Met
Leu Asp Cys Trp1130 1135 1140His Gly
Glu Pro Ser Gln Arg Pro Thr Phe Ser Glu Leu Val Glu1145
1150 1155His Leu Gly Asn Leu Leu Gln Ala Asn Ala Gln
Gln Asp Gly Lys1160 1165 1170Asp Tyr
Ile Val Leu Pro Ile Ser Glu Thr Leu Ser Met Glu Glu1175
1180 1185Asp Ser Gly Leu Ser Leu Pro Thr Ser Pro Val
Ser Cys Met Glu1190 1195 1200Glu Glu
Glu Val Cys Asp Pro Lys Phe His Tyr Asp Asn Thr Ala1205
1210 1215Gly Ile Ser Gln Tyr Leu Gln Asn Ser Lys Arg
Lys Ser Arg Pro1220 1225 1230Val Ser
Val Lys Thr Phe Glu Asp Ile Pro Leu Glu Glu Pro Glu1235
1240 1245Val Lys Val Ile Pro Asp Asp Asn Gln Thr Asp
Ser Gly Met Val1250 1255 1260Leu Ala
Ser Glu Glu Leu Lys Thr Leu Glu Asp Arg Thr Lys Leu1265
1270 1275Ser Pro Ser Phe Gly Gly Met Val Pro Ser Lys
Ser Arg Glu Ser1280 1285 1290Val Ala
Ser Glu Gly Ser Asn Gln Thr Ser Gly Tyr Gln Ser Gly1295
1300 1305Tyr His Ser Asp Asp Thr Asp Thr Thr Val Tyr
Ser Ser Glu Glu1310 1315 1320Ala Glu
Leu Leu Lys Leu Ile Glu Ile Gly Val Gln Thr Gly Ser1325
1330 1335Thr Ala Gln Ile Leu Gln Pro Asp Ser Gly Thr
Thr Leu Ser Ser1340 1345 1350Pro Pro
Val1355
SEQ ID NO: 116 (length 1186, PRT, Homo sapiens)
Leu Glu Glu Lys Lys Val Cys Gln Gly Thr
Ser Asn Lys Leu Thr Gln1 5 10
15Leu Gly Thr Phe Glu Asp His Phe Leu Ser Leu Gln Arg Met Phe Asn20
25 30Asn Cys Glu Val Val Leu Gly Asn Leu
Glu Ile Thr Tyr Val Gln Arg35 40 45Asn
Tyr Asp Leu Ser Phe Leu Lys Thr Ile Gln Glu Val Ala Gly Tyr50
55 60Val Leu Ile Ala Leu Asn Thr Val Glu Arg Ile
Pro Leu Glu Asn Leu65 70 75
80Gln Ile Ile Arg Gly Asn Met Tyr Tyr Glu Asn Ser Tyr Ala Leu Ala85
90 95Val Leu Ser Asn Tyr Asp Ala Asn Lys
Thr Gly Leu Lys Glu Leu Pro100 105 110Met
Arg Asn Leu Gln Glu Ile Leu His Gly Ala Val Arg Phe Ser Asn115
120 125Asn Pro Ala Leu Cys Asn Val Glu Ser Ile Gln
Trp Arg Asp Ile Val130 135 140Ser Ser Asp
Phe Leu Ser Asn Met Ser Met Asp Phe Gln Asn His Leu145
150 155 160Gly Ser Cys Gln Lys Cys Asp
Pro Ser Cys Pro Asn Gly Ser Cys Trp165 170
175Gly Ala Gly Glu Glu Asn Cys Gln Lys Leu Thr Lys Ile Ile Cys Ala180
185 190Gln Gln Cys Ser Gly Arg Cys Arg Gly
Lys Ser Pro Ser Asp Cys Cys195 200 205His
Asn Gln Cys Ala Ala Gly Cys Thr Gly Pro Arg Glu Ser Asp Cys210
215 220Leu Val Cys Arg Lys Phe Arg Asp Glu Ala Thr
Cys Lys Asp Thr Cys225 230 235
240Pro Pro Leu Met Leu Tyr Asn Pro Thr Thr Tyr Gln Met Asp Val
Asn245 250 255Pro Glu Gly Lys Tyr Ser Phe
Gly Ala Thr Cys Val Lys Lys Cys Pro260 265
270Arg Asn Tyr Val Val Thr Asp His Gly Ser Cys Val Arg Ala Cys Gly275
280 285Ala Asp Ser Tyr Glu Met Glu Glu Asp
Gly Val Arg Lys Cys Lys Lys290 295 300Cys
Glu Gly Pro Cys Arg Lys Val Cys Asn Gly Ile Gly Ile Gly Glu305
310 315 320Phe Lys Asp Ser Leu Ser
Ile Asn Ala Thr Asn Ile Lys His Phe Lys325 330
335Asn Cys Thr Ser Ile Ser Gly Asp Leu His Ile Leu Pro Val Ala
Phe340 345 350Arg Gly Asp Ser Phe Thr His
Thr Pro Pro Leu Asp Pro Gln Glu Leu355 360
365Asp Ile Leu Lys Thr Val Lys Glu Ile Thr Gly Phe Leu Leu Ile Gln370
375 380Ala Trp Pro Glu Asn Arg Thr Asp Leu
His Ala Phe Glu Asn Leu Glu385 390 395
400Ile Ile Arg Gly Arg Thr Lys Gln His Gly Gln Phe Ser Leu
Ala Val405 410 415Val Ser Leu Asn Ile Thr
Ser Leu Gly Leu Arg Ser Leu Lys Glu Ile420 425
430Ser Asp Gly Asp Val Ile Ile Ser Gly Asn Lys Asn Leu Cys Tyr
Ala435 440 445Asn Thr Ile Asn Trp Lys Lys
Leu Phe Gly Thr Ser Gly Gln Lys Thr450 455
460Lys Ile Ile Ser Asn Arg Gly Glu Asn Ser Cys Lys Ala Thr Gly Gln465
470 475 480Val Cys His Ala
Leu Cys Ser Pro Glu Gly Cys Trp Gly Pro Glu Pro485 490
495Arg Asp Cys Val Ser Cys Arg Asn Val Ser Arg Gly Arg Glu
Cys Val500 505 510Asp Lys Cys Asn Leu Leu
Glu Gly Glu Pro Arg Glu Phe Val Glu Asn515 520
525Ser Glu Cys Ile Gln Cys His Pro Glu Cys Leu Pro Gln Ala Met
Asn530 535 540Ile Thr Cys Thr Gly Arg Gly
Pro Asp Asn Cys Ile Gln Cys Ala His545 550
555 560Tyr Ile Asp Gly Pro His Cys Val Lys Thr Cys Pro
Ala Gly Val Met565 570 575Gly Glu Asn Asn
Thr Leu Val Trp Lys Tyr Ala Asp Ala Gly His Val580 585
590Cys His Leu Cys His Pro Asn Cys Thr Tyr Gly Cys Thr Gly
Pro Gly595 600 605Leu Glu Gly Cys Pro Thr
Asn Gly Pro Lys Ile Pro Ser Ile Ala Thr610 615
620Gly Met Val Gly Ala Leu Leu Leu Leu Leu Val Val Ala Leu Gly
Ile625 630 635 640Gly Leu
Phe Met Arg Arg Arg His Ile Val Arg Lys Arg Thr Leu Arg645
650 655Arg Leu Leu Gln Glu Arg Glu Leu Val Glu Pro Leu
Thr Pro Ser Gly660 665 670Glu Ala Pro Asn
Gln Ala Leu Leu Arg Ile Leu Lys Glu Thr Glu Phe675 680
685Lys Lys Ile Lys Val Leu Gly Ser Gly Ala Phe Gly Thr Val
Tyr Lys690 695 700Gly Leu Trp Ile Pro Glu
Gly Glu Lys Val Lys Ile Pro Val Ala Ile705 710
715 720Lys Glu Leu Arg Glu Ala Thr Ser Pro Lys Ala
Asn Lys Glu Ile Leu725 730 735Asp Glu Ala
Tyr Val Met Ala Ser Val Asp Asn Pro His Val Cys Arg740
745 750Leu Leu Gly Ile Cys Leu Thr Ser Thr Val Gln Leu
Ile Thr Gln Leu755 760 765Met Pro Phe Gly
Cys Leu Leu Asp Tyr Val Arg Glu His Lys Asp Asn770 775
780Ile Gly Ser Gln Tyr Leu Leu Asn Trp Cys Val Gln Ile Ala
Lys Gly785 790 795 800Met
Asn Tyr Leu Glu Asp Arg Arg Leu Val His Arg Asp Leu Ala Ala805
810 815Arg Asn Val Leu Val Lys Thr Pro Gln His Val
Lys Ile Thr Asp Phe820 825 830Gly Leu Ala
Lys Leu Leu Gly Ala Glu Glu Lys Glu Tyr His Ala Glu835
840 845Gly Gly Lys Val Pro Ile Lys Trp Met Ala Leu Glu
Ser Ile Leu His850 855 860Arg Ile Tyr Thr
His Gln Ser Asp Val Trp Ser Tyr Gly Val Thr Val865 870
875 880Trp Glu Leu Met Thr Phe Gly Ser Lys
Pro Tyr Asp Gly Ile Pro Ala885 890 895Ser
Glu Ile Ser Ser Ile Leu Glu Lys Gly Glu Arg Leu Pro Gln Pro900
905 910Pro Ile Cys Thr Ile Asp Val Tyr Met Ile Met
Val Lys Cys Trp Met915 920 925Ile Asp Ala
Asp Ser Arg Pro Lys Phe Arg Glu Leu Ile Ile Glu Phe930
935 940Ser Lys Met Ala Arg Asp Pro Gln Arg Tyr Leu Val
Ile Gln Gly Asp945 950 955
960Glu Arg Met His Leu Pro Ser Pro Thr Asp Ser Asn Phe Tyr Arg Ala965
970 975Leu Met Asp Glu Glu Asp Met Asp Asp
Val Val Asp Ala Asp Glu Tyr980 985 990Leu
Ile Pro Gln Gln Gly Phe Phe Ser Ser Pro Ser Thr Ser Arg Thr995
1000 1005Pro Leu Leu Ser Ser Leu Ser Ala Thr Ser
Asn Asn Ser Thr Val1010 1015 1020Ala Cys
Ile Asp Arg Asn Gly Leu Gln Ser Cys Pro Ile Lys Glu1025
1030 1035Asp Ser Phe Leu Gln Arg Tyr Ser Ser Asp Pro
Thr Gly Ala Leu1040 1045 1050Thr Glu
Asp Ser Ile Asp Asp Thr Phe Leu Pro Val Pro Glu Tyr1055
1060 1065Ile Asn Gln Ser Val Pro Lys Arg Pro Ala Gly
Ser Val Gln Asn1070 1075 1080Pro Val
Tyr His Asn Gln Pro Leu Asn Pro Ala Pro Ser Arg Asp1085
1090 1095Pro His Tyr Gln Asp Pro His Ser Thr Ala Val
Gly Asn Pro Glu1100 1105 1110Tyr Leu
Asn Thr Val Gln Pro Thr Cys Val Asn Ser Thr Phe Asp1115
1120 1125Ser Pro Ala His Trp Ala Gln Lys Gly Ser His
Gln Ile Ser Leu1130 1135 1140Asp Asn
Pro Asp Tyr Gln Gln Asp Phe Phe Pro Lys Glu Ala Lys1145
1150 1155Pro Asn Gly Ile Phe Lys Gly Ser Thr Ala Glu
Asn Ala Glu Tyr1160 1165 1170Leu Arg
Val Ala Pro Gln Ser Ser Glu Phe Ile Gly Ala1175 1180
1185
SEQ ID NO: 117 (length 422, PRT, Homo sapiens)
Met Asp Val Leu Ser Pro Gly Gln
Gly Asn Asn Thr Thr Ser Pro Pro1 5 10
15Ala Pro Phe Glu Thr Gly Gly Asn Thr Thr Gly Ile Ser Asp
Val Thr20 25 30Val Ser Tyr Gln Val Ile
Thr Ser Leu Leu Leu Gly Thr Leu Ile Phe35 40
45Cys Ala Val Leu Gly Asn Ala Cys Val Val Ala Ala Ile Ala Leu Glu50
55 60Arg Ser Leu Gln Asn Val Ala Asn Tyr
Leu Ile Gly Ser Leu Ala Val65 70 75
80Thr Asp Leu Met Val Ser Val Leu Val Leu Pro Met Ala Ala
Leu Tyr85 90 95Gln Val Leu Asn Lys Trp
Thr Leu Gly Gln Val Thr Cys Asp Leu Phe100 105
110Ile Ala Leu Asp Val Leu Cys Cys Thr Ser Ser Ile Leu His Leu
Cys115 120 125Ala Ile Ala Leu Asp Arg Tyr
Trp Ala Ile Thr Asp Pro Ile Asp Tyr130 135
140Val Asn Lys Arg Thr Pro Arg Arg Ala Ala Ala Leu Ile Ser Leu Thr145
150 155 160Trp Leu Ile Gly
Phe Leu Ile Ser Ile Pro Pro Met Leu Gly Trp Arg165 170
175Thr Pro Glu Asp Arg Ser Asp Pro Asp Ala Cys Thr Ile Ser
Lys Asp180 185 190His Gly Tyr Thr Ile Tyr
Ser Thr Phe Gly Ala Phe Tyr Ile Pro Leu195 200
205Leu Leu Met Leu Val Leu Tyr Gly Arg Ile Phe Arg Ala Ala Arg
Phe210 215 220Arg Ile Arg Lys Thr Val Lys
Lys Val Glu Lys Thr Gly Ala Asp Thr225 230
235 240Arg His Gly Ala Ser Pro Ala Pro Gln Pro Lys Lys
Ser Val Asn Gly245 250 255Glu Ser Gly Ser
Arg Asn Trp Arg Leu Gly Val Glu Ser Lys Ala Gly260 265
270Gly Ala Leu Cys Ala Asn Gly Ala Val Arg Gln Gly Asp Asp
Gly Ala275 280 285Ala Leu Glu Val Ile Glu
Val His Arg Val Gly Asn Ser Lys Glu His290 295
300Leu Pro Leu Pro Ser Glu Ala Gly Pro Thr Pro Cys Ala Pro Ala
Ser305 310 315 320Phe Glu
Arg Lys Asn Glu Arg Asn Ala Glu Ala Lys Arg Lys Met Ala325
330 335Leu Ala Arg Glu Arg Lys Thr Val Lys Thr Leu Gly
Ile Ile Met Gly340 345 350Thr Phe Ile Leu
Cys Trp Leu Pro Phe Phe Ile Val Ala Leu Val Leu355 360
365Pro Phe Cys Glu Ser Ser Cys His Met Pro Thr Leu Leu Gly
Ala Ile370 375 380Ile Asn Trp Leu Gly Tyr
Ser Asn Ser Leu Leu Asn Pro Val Ile Tyr385 390
395 400Ala Tyr Phe Asn Lys Asp Phe Gln Asn Ala Phe
Lys Lys Ile Ile Lys405 410 415Cys Lys Phe
Cys Arg Gln420
SEQ ID NO: 118 (length 129, PRT, Homo sapiens)
His Lys Cys Asp Ile Thr Leu Gln
Glu Ile Ile Lys Thr Leu Asn Ser1 5 10
15Leu Thr Glu Gln Lys Thr Leu Cys Thr Glu Leu Thr Val Thr
Asp Ile20 25 30Phe Ala Ala Ser Lys Asn
Thr Thr Glu Lys Glu Thr Phe Cys Arg Ala35 40
45Ala Thr Val Leu Arg Gln Phe Tyr Ser His His Glu Lys Asp Thr Arg50
55 60Cys Leu Gly Ala Thr Ala Gln Gln Phe
His Arg His Lys Gln Leu Ile65 70 75
80Arg Phe Leu Lys Arg Leu Asp Arg Asn Leu Trp Gly Leu Ala
Gly Leu85 90 95Asn Ser Cys Pro Val Lys
Glu Ala Asn Gln Ser Thr Leu Glu Asn Phe100 105
110Leu Glu Arg Leu Lys Thr Ile Met Arg Glu Lys Tyr Ser Lys Cys
Ser115 120 125Ser
SEQ ID NO: 119 (length 113, PRT, Homo sapiens)
Met Gly Pro Val Pro Pro Ser Thr Ala Leu Arg Glu Leu Ile Glu Glu1
5 10 15Leu Val Asn Ile Thr Gln
Asn Gln Lys Ala Pro Leu Cys Asn Gly Ser20 25
30Met Val Trp Ser Ile Asn Leu Thr Ala Gly Met Tyr Cys Ala Ala Leu35
40 45Glu Ser Leu Ile Asn Val Ser Gly Cys
Ser Ala Ile Glu Lys Thr Gln50 55 60Arg
Met Leu Ser Gly Phe Cys Pro His Lys Val Ser Ala Gly Gln Phe65
70 75 80Ser Ser Leu His Val Arg
Asp Thr Lys Ile Glu Val Ala Gln Phe Val85 90
95Lys Asp Leu Leu Leu His Leu Lys Lys Leu Phe Arg Glu Gly Arg Phe100
105 110Asn
SEQ ID NO: 120 (length 726, PRT, Homo sapiens)
Val
Thr Lys Leu Leu Pro Ala Leu Leu Leu Gln His Val Leu Leu His1
5 10 15Leu Leu Leu Leu Pro Ile Ala
Ile Pro Tyr Ala Glu Gly Gln Arg Lys20 25
30Arg Arg Asn Thr Ile His Glu Phe Lys Lys Ser Ala Lys Thr Thr Leu35
40 45Ile Lys Ile Asp Pro Ala Leu Lys Ile Lys
Thr Lys Lys Val Asn Thr50 55 60Ala Asp
Gln Cys Ala Asn Arg Cys Thr Arg Asn Lys Gly Leu Pro Phe65
70 75 80Thr Cys Lys Ala Phe Val Phe
Asp Lys Ala Arg Lys Gln Cys Leu Trp85 90
95Phe Pro Phe Asn Ser Met Ser Ser Gly Val Lys Lys Glu Phe Gly His100
105 110Glu Phe Asp Leu Tyr Glu Asn Lys Asp
Tyr Ile Arg Asn Cys Ile Ile115 120 125Gly
Lys Gly Arg Ser Tyr Lys Gly Thr Val Ser Ile Thr Lys Ser Gly130
135 140Ile Lys Cys Gln Pro Trp Ser Ser Met Ile Pro
His Glu His Ser Phe145 150 155
160Leu Pro Ser Ser Tyr Arg Gly Lys Asp Leu Gln Glu Asn Tyr Cys
Arg165 170 175Asn Pro Arg Gly Glu Glu Gly
Gly Pro Trp Cys Phe Thr Ser Asn Pro180 185
190Glu Val Arg Tyr Glu Val Cys Asp Ile Pro Gln Cys Ser Glu Val Glu195
200 205Cys Met Thr Cys Asn Gly Glu Ser Tyr
Arg Gly Leu Met Asp His Thr210 215 220Glu
Ser Gly Lys Ile Cys Gln Arg Trp Asp His Gln Thr Pro His Arg225
230 235 240His Lys Phe Leu Pro Glu
Arg Tyr Pro Asp Lys Gly Phe Asp Asp Asn245 250
255Tyr Cys Arg Asn Pro Asp Gly Gln Pro Arg Pro Trp Cys Tyr Thr
Leu260 265 270Asp Pro His Thr Arg Trp Glu
Tyr Cys Ala Ile Lys Thr Cys Ala Asp275 280
285Asn Thr Met Asn Asp Thr Asp Val Pro Leu Glu Thr Thr Glu Cys Ile290
295 300Gln Gly Gln Gly Glu Gly Tyr Arg Gly
Thr Val Asn Thr Ile Trp Asn305 310 315
320Gly Ile Pro Cys Gln Arg Trp Asp Ser Gln Tyr Pro His Glu
His Asp325 330 335Met Thr Pro Glu Asn Phe
Lys Cys Lys Asp Leu Arg Glu Asn Tyr Cys340 345
350Arg Asn Pro Asp Gly Ser Glu Ser Pro Trp Cys Phe Thr Thr Asp
Pro355 360 365Asn Ile Arg Val Gly Tyr Cys
Ser Gln Ile Pro Asn Cys Asp Met Ser370 375
380His Gly Gln Asp Cys Tyr Arg Gly Asn Gly Lys Asn Tyr Met Gly Asn385
390 395 400Leu Ser Gln Thr
Arg Ser Gly Leu Thr Cys Ser Met Trp Asp Lys Asn405 410
415Met Glu Asp Leu His Arg His Ile Phe Trp Glu Pro Asp Ala
Ser Lys420 425 430Leu Asn Glu Asn Tyr Cys
Arg Asn Pro Asp Asp Asp Ala His Gly Pro435 440
445Trp Cys Tyr Thr Gly Asn Pro Leu Ile Pro Trp Asp Tyr Cys Pro
Ile450 455 460Ser Arg Cys Glu Gly Asp Thr
Thr Pro Thr Ile Val Asn Leu Asp His465 470
475 480Pro Val Ile Ser Cys Ala Lys Thr Lys Gln Leu Arg
Val Val Asn Gly485 490 495Ile Pro Thr Arg
Thr Asn Ile Gly Trp Met Val Ser Leu Arg Tyr Arg500 505
510Asn Lys His Ile Cys Gly Gly Ser Leu Ile Lys Glu Ser Trp
Val Leu515 520 525Thr Ala Arg Gln Cys Phe
Pro Ser Arg Asp Leu Lys Asp Tyr Glu Ala530 535
540Trp Leu Gly Ile His Asp Val His Gly Arg Gly Asp Glu Lys Cys
Lys545 550 555 560Gln Val
Leu Asn Val Ser Gln Leu Val Tyr Gly Pro Glu Gly Ser Asp565
570 575Leu Val Leu Met Lys Leu Ala Arg Pro Ala Val Leu
Asp Asp Phe Val580 585 590Ser Thr Ile Asp
Leu Pro Asn Tyr Gly Cys Thr Ile Pro Glu Lys Thr595 600
605Ser Cys Ser Val Tyr Gly Trp Gly Tyr Thr Gly Leu Ile Asn
Tyr Asp610 615 620Gly Leu Leu Arg Val Ala
His Leu Tyr Ile Met Gly Asn Glu Lys Cys625 630
635 640Ser Gln His His Arg Gly Lys Val Thr Leu Asn
Glu Ser Glu Ile Cys645 650 655Ala Gly Ala
Glu Lys Ile Gly Ser Gly Pro Cys Glu Gly Asp Tyr Gly660
665 670Gly Pro Leu Val Cys Glu Gln His Lys Met Arg Met
Val Leu Gly Val675 680 685Ile Val Pro Gly
Arg Gly Cys Ala Ile Pro Asn Arg Pro Gly Ile Phe690 695
700Val Arg Val Ala Tyr Tyr Ala Lys Trp Ile His Lys Ile Ile
Leu Thr705 710 715 720Tyr
Lys Val Pro Gln Ser725
SEQ ID NO: 121 (length 191, PRT, Homo sapiens)
Phe Pro Thr Ile Pro Leu
Ser Arg Leu Phe Asp Asn Ala Met Leu Arg1 5
10 15Ala His Arg Leu His Gln Leu Ala Phe Asp Thr Tyr
Gln Glu Phe Glu20 25 30Glu Ala Tyr Ile
Pro Lys Glu Gln Lys Tyr Ser Phe Leu Gln Asn Pro35 40
45Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro Thr Pro Ser
Asn Arg50 55 60Glu Glu Thr Gln Gln Lys
Ser Asn Leu Glu Leu Leu Arg Ile Ser Leu65 70
75 80Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln
Phe Leu Arg Ser Val85 90 95Phe Ala Asn
Ser Leu Val Tyr Gly Ala Ser Asp Ser Asn Val Tyr Asp100
105 110Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln Thr Leu
Met Gly Arg Leu115 120 125Glu Asp Gly Ser
Pro Arg Thr Gly Gln Ile Phe Lys Gln Thr Tyr Ser130 135
140Lys Phe Asp Thr Asn Ser His Asn Asp Asp Ala Leu Leu Lys
Asn Tyr145 150 155 160Gly
Leu Leu Tyr Cys Phe Arg Lys Asp Met Asp Lys Val Glu Thr Phe165
170 175Leu Arg Ile Val Gln Cys Arg Ser Val Glu Gly
Ser Cys Gly Phe180 185 190
SEQ ID NO: 122 (length 156, PRT, Homo sapiens)
Ala Tyr Arg Pro Ser Glu Thr Leu Cys Gly Gly Glu Leu Val Asp
Thr1 5 10 15Leu Gln Phe
Val Cys Gly Asp Arg Gly Phe Tyr Phe Ser Arg Pro Ala20 25
30Ser Arg Val Ser Arg Arg Ser Arg Gly Ile Val Glu Glu
Cys Cys Phe35 40 45Arg Ser Cys Asp Leu
Ala Leu Leu Glu Thr Tyr Cys Ala Thr Pro Ala50 55
60Lys Ser Glu Arg Asp Val Ser Thr Pro Pro Thr Val Leu Pro Asp
Asn65 70 75 80Phe Pro
Arg Tyr Pro Val Gly Lys Phe Phe Gln Tyr Asp Thr Trp Lys85
90 95Gln Ser Thr Gln Arg Leu Arg Arg Gly Leu Pro Ala
Leu Leu Arg Ala100 105 110Arg Arg Gly His
Val Leu Ala Lys Glu Leu Glu Ala Phe Arg Glu Ala115 120
125Lys Arg His Arg Pro Leu Ile Ala Leu Pro Thr Gln Asp Pro
Ala His130 135 140Gly Gly Ala Pro Pro Glu
Met Ala Ser Asn Arg Lys145 150
155
SEQ ID NO: 123 (length 735, PRT, Homo sapiens)
Glu Val Lys Gln Glu Asn Arg Leu Leu Asn Glu
Ser Glu Ser Ser Ser1 5 10
15Gln Gly Leu Leu Gly Tyr Tyr Phe Ser Asp Leu Asn Phe Gln Ala Pro20
25 30Met Val Val Thr Ser Ser Thr Thr Gly Asp
Leu Ser Ile Pro Ser Ser35 40 45Glu Leu
Glu Asn Ile Pro Ser Glu Asn Gln Tyr Phe Gln Ser Ala Ile50
55 60Trp Ser Gly Phe Ile Lys Val Lys Lys Ser Asp Glu
Tyr Thr Phe Ala65 70 75
80Thr Ser Ala Asp Asn His Val Thr Met Trp Val Asp Asp Gln Glu Val85
90 95Ile Asn Lys Ala Ser Asn Ser Asn Lys Ile
Arg Leu Glu Lys Gly Arg100 105 110Leu Tyr
Gln Ile Lys Ile Gln Tyr Gln Arg Glu Asn Pro Thr Glu Lys115
120 125Gly Leu Asp Phe Lys Leu Tyr Trp Thr Asp Ser Gln
Asn Lys Lys Glu130 135 140Val Ile Ser Ser
Asp Asn Leu Gln Leu Pro Glu Leu Lys Gln Lys Ser145 150
155 160Ser Asn Ser Arg Lys Lys Arg Ser Thr
Ser Ala Gly Pro Thr Val Pro165 170 175Asp
Arg Asp Asn Asp Gly Ile Pro Asp Ser Leu Glu Val Glu Gly Tyr180
185 190Thr Val Asp Val Lys Asn Lys Arg Thr Phe Leu
Ser Pro Trp Ile Ser195 200 205Asn Ile His
Glu Lys Lys Gly Leu Thr Lys Tyr Lys Ser Ser Pro Glu210
215 220Lys Trp Ser Thr Ala Ser Asp Pro Tyr Ser Asp Phe
Glu Lys Val Thr225 230 235
240Gly Arg Ile Asp Lys Asn Val Ser Pro Glu Ala Arg His Pro Leu Val245
250 255Ala Ala Tyr Pro Ile Val His Val Asp
Met Glu Asn Ile Ile Leu Ser260 265 270Lys
Asn Glu Asp Gln Ser Thr Gln Asn Thr Asp Ser Gln Thr Arg Thr275
280 285Ile Ser Lys Asn Thr Ser Thr Ser Arg Thr His
Thr Ser Glu Val His290 295 300Gly Asn Ala
Glu Val His Ala Ser Phe Phe Asp Ile Gly Gly Ser Val305
310 315 320Ser Ala Gly Phe Ser Asn Ser
Asn Ser Ser Thr Val Ala Ile Asp His325 330
335Ser Leu Ser Leu Ala Gly Glu Arg Thr Trp Ala Glu Thr Met Gly Leu340
345 350Asn Thr Ala Asp Thr Ala Arg Leu Asn
Ala Asn Ile Arg Tyr Val Asn355 360 365Thr
Gly Thr Ala Pro Ile Tyr Asn Val Leu Pro Thr Thr Ser Leu Val370
375 380Leu Gly Lys Asn Gln Thr Leu Ala Thr Ile Lys
Ala Lys Glu Asn Gln385 390 395
400Leu Ser Gln Ile Leu Ala Pro Asn Asn Tyr Tyr Pro Ser Lys Asn
Leu405 410 415Ala Pro Ile Ala Leu Asn Ala
Gln Asp Asp Phe Ser Ser Thr Pro Ile420 425
430Thr Met Asn Tyr Asn Gln Phe Leu Glu Leu Glu Lys Thr Lys Gln Leu435
440 445Arg Leu Asp Thr Asp Gln Val Tyr Gly
Asn Ile Ala Thr Tyr Asn Phe450 455 460Glu
Asn Gly Arg Val Arg Val Asp Thr Gly Ser Asn Trp Ser Glu Val465
470 475 480Leu Pro Gln Ile Gln Glu
Thr Thr Ala Arg Ile Ile Phe Asn Gly Lys485 490
495Asp Leu Asn Leu Val Glu Arg Arg Ile Ala Ala Val Asn Pro Ser
Asp500 505 510Pro Leu Glu Thr Thr Lys Pro
Asp Met Thr Leu Lys Glu Ala Leu Lys515 520
525Ile Ala Phe Gly Phe Asn Glu Pro Asn Gly Asn Leu Gln Tyr Gln Gly530
535 540Lys Asp Ile Thr Glu Phe Asp Phe Asn
Phe Asp Gln Gln Thr Ser Gln545 550 555
560Asn Ile Lys Asn Gln Leu Ala Glu Leu Asn Ala Thr Asn Ile
Tyr Thr565 570 575Val Leu Asp Lys Ile Lys
Leu Asn Ala Lys Met Asn Ile Leu Ile Arg580 585
590Asp Lys Arg Phe His Tyr Asp Arg Asn Asn Ile Ala Val Gly Ala
Asp595 600 605Glu Ser Val Val Lys Glu Ala
His Arg Glu Val Ile Asn Ser Ser Thr610 615
620Glu Gly Leu Leu Leu Asn Ile Asp Lys Asp Ile Arg Lys Ile Leu Ser625
630 635 640Gly Tyr Ile Val
Glu Ile Glu Asp Thr Glu Gly Leu Lys Glu Val Ile645 650
655Asn Asp Arg Tyr Asp Met Leu Asn Ile Ser Ser Leu Arg Gln
Asp Gly660 665 670Lys Thr Phe Ile Asp Phe
Lys Lys Tyr Asn Asp Lys Leu Pro Leu Tyr675 680
685Ile Ser Asn Pro Asn Tyr Lys Val Asn Val Tyr Ala Val Thr Lys
Glu690 695 700Asn Thr Ile Ile Asn Pro Ser
Glu Asn Gly Asp Thr Ser Thr Asn Gly705 710
715 720Ile Lys Lys Ile Leu Ile Phe Ser Lys Lys Gly Tyr
Glu Ile Gly725 730 735
124 509 PRT Homo sapiens 124
Met Lys Val Lys Gly Thr Arg Arg Asn Tyr Gln His Leu Trp Arg
Trp1 5 10 15Gly Thr Leu
Leu Leu Gly Met Leu Met Ile Cys Ser Ala Thr Glu Lys20 25
30Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys
Glu Ala Thr35 40 45Thr Thr Leu Phe Cys
Ala Ser Asp Ala Arg Ala Tyr Asp Thr Glu Val50 55
60His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn
Pro65 70 75 80Gln Glu
Val Val Leu Gly Asn Val Thr Glu Asn Phe Asn Met Trp Lys85
90 95Asn Asn Met Val Glu Gln Met Gln Glu Asp Ile Ile
Ser Leu Trp Asp100 105 110Gln Ser Leu Lys
Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu115 120
125Asn Cys Thr Asp Leu Gly Lys Ala Thr Asn Thr Asn Ser Ser
Asn Trp130 135 140Lys Glu Glu Ile Lys Gly
Glu Ile Lys Asn Cys Ser Phe Asn Ile Thr145 150
155 160Thr Ser Ile Arg Asp Lys Ile Gln Lys Glu Asn
Ala Leu Phe Arg Asn165 170 175Leu Asp Val
Val Pro Ile Asp Asn Ala Ser Thr Thr Thr Asn Tyr Thr180
185 190Asn Tyr Arg Leu Ile His Cys Asn Arg Ser Val Ile
Thr Gln Ala Cys195 200 205Pro Lys Val Ser
Phe Glu Pro Ile Pro Ile His Tyr Cys Thr Pro Ala210 215
220Gly Phe Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly
Lys Gly225 230 235 240Pro
Cys Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro245
250 255Ile Val Ser Thr Gln Leu Leu Leu Asn Gly Ser
Leu Ala Glu Glu Glu260 265 270Val Val Ile
Arg Ser Asp Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile275
280 285Val Gln Leu Asn Glu Ser Val Ala Ile Asn Cys Thr
Arg Pro Asn Asn290 295 300Asn Thr Arg Lys
Ser Ile Tyr Ile Gly Pro Gly Arg Ala Phe His Thr305 310
315 320Thr Gly Arg Ile Ile Gly Asp Ile Arg
Lys Ala His Cys Asn Ile Ser325 330 335Arg
Ala Gln Trp Asn Asn Thr Leu Glu Gln Ile Val Lys Lys Leu Arg340
345 350Glu Gln Phe Gly Asn Asn Lys Thr Ile Val Phe
Asn Gln Ser Ser Gly355 360 365Gly Asp Pro
Glu Ile Val Met His Ser Phe Asn Cys Arg Gly Glu Phe370
375 380Phe Tyr Cys Asn Thr Thr Gln Leu Phe Asn Asn Thr
Trp Arg Leu Asn385 390 395
400His Thr Glu Gly Thr Lys Gly Asn Asp Thr Ile Ile Leu Pro Cys Arg405
410 415Ile Lys Gln Ile Ile Asn Met Trp Gln
Glu Val Gly Lys Ala Met Tyr420 425 430Ala
Pro Pro Ile Gly Gly Gln Ile Ser Cys Ser Ser Asn Ile Thr Gly435
440 445Leu Leu Leu Thr Arg Asp Gly Gly Thr Asn Val
Thr Asn Asp Thr Glu450 455 460Val Phe Arg
Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu465
470 475 480Leu Tyr Lys Tyr Lys Val Ile
Lys Ile Glu Pro Leu Gly Ile Ala Pro485 490
495Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg500
505
125 101 PRT Homo sapiens 125
Ser Trp Val Ile Pro Pro Ile Ser Cys Pro Glu
Asn Glu Lys Gly Pro1 5 10
15Phe Pro Lys Asn Leu Val Gln Ile Lys Ser Asn Lys Asp Lys Glu Gly20
25 30Lys Val Phe Tyr Ser Ile Thr Gly Gln Gly
Ala Asp Thr Pro Pro Val35 40 45Gly Val
Phe Ile Ile Glu Arg Glu Thr Gly Trp Leu Lys Val Thr Glu50
55 60Pro Leu Asp Arg Glu Arg Ile Ala Thr Tyr Thr Leu
Phe Ser His Ala65 70 75
80Val Ser Ser Asn Gly Asn Ala Val Glu Asp Pro Met Glu Ile Leu Ile85
90 95Thr Val Thr Asp Gln100
126 459 PRT Homo sapiens 126
Glu Ile Cys Gly Pro Gly Ile Asp Ile Arg Asn Asp Tyr Gln Gln
Leu1 5 10 15Lys Arg Leu
Glu Asn Cys Thr Val Ile Glu Gly Tyr Leu His Ile Leu20 25
30Leu Ile Ser Lys Ala Glu Asp Tyr Arg Ser Tyr Arg Phe
Pro Lys Leu35 40 45Thr Val Ile Thr Glu
Tyr Leu Leu Leu Phe Arg Val Ala Gly Leu Glu50 55
60Ser Leu Gly Asp Leu Phe Pro Asn Leu Thr Val Ile Arg Gly Trp
Lys65 70 75 80Leu Phe
Tyr Asn Tyr Ala Leu Val Ile Phe Glu Met Thr Asn Leu Lys85
90 95Asp Ile Gly Leu Tyr Asn Leu Arg Asn Ile Thr Arg
Gly Ala Ile Arg100 105 110Ile Glu Lys Asn
Ala Asp Leu Cys Tyr Leu Ser Thr Val Asp Trp Ser115 120
125Leu Ile Leu Asp Ala Val Ser Asn Asn Tyr Ile Val Gly Asn
Lys Pro130 135 140Pro Lys Glu Cys Gly Asp
Leu Cys Pro Gly Thr Met Glu Glu Lys Pro145 150
155 160Met Cys Glu Lys Thr Thr Ile Asn Asn Glu Tyr
Asn Tyr Arg Cys Trp165 170 175Thr Thr Asn
Arg Cys Gln Lys Met Cys Pro Ser Thr Cys Gly Lys Arg180
185 190Ala Cys Thr Glu Asn Asn Glu Cys Cys His Pro Glu
Cys Leu Gly Ser195 200 205Cys Ser Ala Pro
Asp Asn Asp Thr Ala Cys Val Ala Cys Arg His Tyr210 215
220Tyr Tyr Ala Gly Val Cys Val Pro Ala Cys Pro Pro Asn Thr
Tyr Arg225 230 235 240Phe
Glu Gly Trp Arg Cys Val Asp Arg Asp Phe Cys Ala Asn Ile Leu245
250 255Ser Ala Glu Ser Ser Asp Ser Glu Gly Phe Val
Ile His Asp Gly Glu260 265 270Cys Met Gln
Glu Cys Pro Ser Gly Phe Ile Arg Asn Gly Ser Gln Ser275
280 285Met Tyr Cys Ile Pro Cys Glu Gly Pro Cys Pro Lys
Val Cys Glu Glu290 295 300Glu Lys Lys Thr
Lys Thr Ile Asp Ser Val Thr Ser Ala Gln Met Leu305 310
315 320Gln Gly Cys Thr Ile Phe Lys Gly Asn
Leu Leu Ile Asn Ile Arg Arg325 330 335Gly
Asn Asn Ile Ala Ser Glu Leu Glu Asn Phe Met Gly Leu Ile Glu340
345 350Val Val Thr Gly Tyr Val Lys Ile Arg His Ser
His Ala Leu Val Ser355 360 365Leu Ser Phe
Leu Lys Asn Leu Arg Leu Ile Leu Gly Glu Glu Gln Leu370
375 380Glu Gly Asn Tyr Ser Phe Tyr Val Leu Asp Asn Gln
Asn Leu Gln Gln385 390 395
400Leu Trp Asp Trp Asp His Arg Asn Leu Thr Ile Lys Ala Gly Lys Met405
410 415Tyr Phe Ala Phe Asn Pro Lys Leu Cys
Val Ser Glu Ile Tyr Arg Met420 425 430Glu
Glu Val Thr Gly Thr Lys Gly Arg Gln Ser Lys Gly Asp Ile Asn435
440 445Thr Arg Asn Asn Gly Glu Arg Ala Ser Cys
Glu450 455
127 146 PRT Homo sapiens 127
Val Pro Ile Gln Lys
Val Gln Asp Asp Thr Lys Thr Leu Ile Lys Thr1 5
10 15Ile Val Thr Arg Ile Asn Asp Ile Ser His Thr
Gln Ser Val Ser Ser20 25 30Lys Gln Lys
Val Thr Gly Leu Asp Phe Ile Pro Gly Leu His Pro Ile35 40
45Leu Thr Leu Ser Lys Met Asp Gln Thr Leu Ala Val Tyr
Gln Gln Ile50 55 60Leu Thr Ser Met Pro
Ser Arg Asn Val Ile Gln Ile Ser Asn Asp Leu65 70
75 80Glu Asn Leu Arg Asp Leu Leu His Val Leu
Ala Phe Ser Lys Ser Cys85 90 95His Leu
Pro Trp Ala Ser Gly Leu Glu Thr Leu Asp Ser Leu Gly Gly100
105 110Val Leu Glu Ala Ser Gly Tyr Ser Thr Glu Val Val
Ala Leu Ser Arg115 120 125Leu Gln Gly Ser
Leu Gln Asp Met Leu Trp Gln Leu Asp Leu Ser Pro130 135
140Gly Cys145
128 327 PRT Homo sapiens 128
Lys Glu Ile Thr Asn
Ala Leu Glu Thr Trp Gly Ala Leu Gly Gln Asp1 5
10 15Ile Asn Leu Asp Ile Pro Ser Phe Gln Met Ser
Asp Asp Ile Asp Asp20 25 30Ile Lys Trp
Glu Lys Thr Ser Asp Lys Lys Lys Ile Ala Gln Phe Arg35 40
45Lys Glu Lys Glu Thr Phe Lys Glu Lys Asp Thr Tyr Lys
Leu Phe Lys50 55 60Asn Gly Thr Leu Lys
Ile Lys His Leu Lys Thr Asp Asp Gln Asp Ile65 70
75 80Tyr Lys Val Ser Ile Tyr Asp Thr Lys Gly
Lys Asn Val Leu Glu Lys85 90 95Ile Phe
Asp Leu Lys Ile Gln Glu Arg Val Ser Lys Pro Lys Ile Ser100
105 110Trp Thr Cys Ile Asn Thr Thr Leu Thr Cys Glu Val
Met Asn Gly Thr115 120 125Asp Pro Glu Leu
Asn Leu Tyr Gln Asp Gly Lys His Leu Lys Leu Ser130 135
140Gln Arg Val Ile Thr His Lys Trp Thr Thr Ser Leu Ser Ala
Lys Phe145 150 155 160Lys
Cys Thr Ala Gly Asn Lys Val Ser Lys Glu Ser Ser Val Glu Pro165
170 175Val Ser Cys Pro Glu Lys Gly Leu Asp Ile Tyr
Leu Ile Ile Gly Ile180 185 190Cys Gly Gly
Gly Ser Leu Leu Met Val Phe Val Ala Leu Leu Val Phe195
200 205Tyr Ile Thr Lys Arg Lys Lys Gln Arg Ser Arg Arg
Asn Asp Glu Glu210 215 220Leu Glu Thr Arg
Ala His Arg Val Ala Thr Glu Glu Arg Gly Arg Lys225 230
235 240Pro Gln Gln Ile Pro Ala Ser Thr Pro
Gln Asn Pro Ala Thr Ser Gln245 250 255His
Pro Pro Pro Pro Pro Gly His Arg Ser Gln Ala Pro Ser His Arg260
265 270Pro Pro Pro Pro Gly His Arg Val Gln His Gln
Pro Gln Lys Arg Pro275 280 285Pro Ala Pro
Ser Gly Thr Gln Val His Gln Gln Lys Gly Pro Pro Leu290
295 300Pro Arg Pro Arg Val Gln Pro Lys Pro Pro His Gly
Ala Ala Glu Asn305 310 315
320Ser Leu Ser Pro Ser Ser Asn325
129 433 PRT Homo sapiens 129
Lys Lys Val
Val Leu Gly Lys Lys Gly Asp Thr Val Glu Leu Thr Cys1 5
10 15Thr Ala Ser Gln Lys Lys Ser Ile Gln
Phe His Trp Lys Asn Ser Asn20 25 30Gln
Ile Lys Ile Leu Gly Asn Gln Gly Ser Phe Leu Thr Lys Gly Pro35
40 45Ser Lys Leu Asn Asp Arg Ala Asp Ser Arg Arg
Ser Leu Trp Asp Gln50 55 60Gly Asn Phe
Pro Leu Ile Ile Lys Asn Leu Lys Ile Glu Asp Ser Asp65 70
75 80Thr Tyr Ile Cys Glu Val Glu Asp
Gln Lys Glu Glu Val Gln Leu Leu85 90
95Val Phe Gly Leu Thr Ala Asn Ser Asp Thr His Leu Leu Gln Gly Gln100
105 110Ser Leu Thr Leu Thr Leu Glu Ser Pro Pro
Gly Ser Ser Pro Ser Val115 120 125Gln Cys
Arg Ser Pro Arg Gly Lys Asn Ile Gln Gly Gly Lys Thr Leu130
135 140Ser Val Ser Gln Leu Glu Leu Gln Asp Ser Gly Thr
Trp Thr Cys Thr145 150 155
160Val Leu Gln Asn Gln Lys Lys Val Glu Phe Lys Ile Asp Ile Val Val165
170 175Leu Ala Phe Gln Lys Ala Ser Ser Ile
Val Tyr Lys Lys Glu Gly Glu180 185 190Gln
Val Glu Phe Ser Phe Pro Leu Ala Phe Thr Val Glu Lys Leu Thr195
200 205Gly Ser Gly Glu Leu Trp Trp Gln Ala Glu Arg
Ala Ser Ser Ser Lys210 215 220Ser Trp Ile
Thr Phe Asp Leu Lys Asn Lys Glu Val Ser Val Lys Arg225
230 235 240Val Thr Gln Asp Pro Lys Leu
Gln Met Gly Lys Lys Leu Pro Leu His245 250
255Leu Thr Leu Pro Gln Ala Leu Pro Gln Tyr Ala Gly Ser Gly Asn Leu260
265 270Thr Leu Ala Leu Glu Ala Lys Thr Gly
Lys Leu His Gln Glu Val Asn275 280 285Leu
Val Val Met Arg Ala Thr Gln Leu Gln Lys Asn Leu Thr Cys Glu290
295 300Val Trp Gly Pro Thr Ser Pro Lys Leu Met Leu
Ser Leu Lys Leu Glu305 310 315
320Asn Lys Glu Ala Lys Val Ser Lys Arg Glu Lys Ala Val Trp Val
Leu325 330 335Asn Pro Glu Ala Gly Met Trp
Gln Cys Leu Leu Ser Asp Ser Gly Gln340 345
350Val Leu Leu Glu Ser Asn Ile Lys Val Leu Pro Thr Trp Ser Thr Pro355
360 365Val Gln Pro Met Ala Leu Ile Val Leu
Gly Gly Val Ala Gly Leu Leu370 375 380Leu
Phe Ile Gly Leu Gly Ile Phe Phe Cys Val Arg Cys Arg His Arg385
390 395 400Arg Arg Gln Ala Glu Arg
Met Ser Gln Ile Lys Arg Leu Leu Ser Glu405 410
415Lys Lys Thr Cys Gln Cys Pro His Arg Phe Gln Lys Thr Cys Ser
Pro420 425 430Ile
130 1145 PRT Homo sapiens 130
Tyr Asn Leu Asp Val Arg Gly Ala Arg Ser Phe Ser Pro Pro Arg Ala1
5 10 15Gly Arg His Phe Gly Tyr
Arg Val Leu Gln Val Gly Asn Gly Val Ile20 25
30Val Gly Ala Pro Gly Glu Gly Asn Ser Thr Gly Ser Leu Tyr Gln Cys35
40 45Gln Ser Gly Thr Gly His Cys Leu Pro
Val Thr Leu Arg Gly Ser Asn50 55 60Tyr
Thr Ser Lys Tyr Leu Gly Met Thr Leu Ala Thr Asp Pro Thr Asp65
70 75 80Gly Ser Ile Leu Ala Cys
Asp Pro Gly Leu Ser Arg Thr Cys Asp Gln85 90
95Asn Thr Tyr Leu Ser Gly Leu Cys Tyr Leu Phe Arg Gln Asn Leu Gln100
105 110Gly Pro Met Leu Gln Gly Arg Pro
Gly Phe Gln Glu Cys Ile Lys Gly115 120
125Asn Val Asp Leu Val Phe Leu Phe Asp Gly Ser Met Ser Leu Gln Pro130
135 140Asp Glu Phe Gln Lys Ile Leu Asp Phe
Met Lys Asp Val Met Lys Lys145 150 155
160Leu Ser Asn Thr Ser Tyr Gln Phe Ala Ala Val Gln Phe Ser
Thr Ser165 170 175Tyr Lys Thr Glu Phe Asp
Phe Ser Asp Tyr Val Lys Arg Lys Asp Pro180 185
190Asp Ala Leu Leu Lys His Val Lys His Met Leu Leu Leu Thr Asn
Thr195 200 205Phe Gly Ala Ile Asn Tyr Val
Ala Thr Glu Val Phe Arg Glu Glu Leu210 215
220Gly Ala Arg Pro Asp Ala Thr Lys Val Leu Ile Ile Ile Thr Asp Gly225
230 235 240Glu Ala Thr Asp
Ser Gly Asn Ile Asp Ala Ala Lys Asp Ile Ile Arg245 250
255Tyr Ile Ile Gly Ile Gly Lys His Phe Gln Thr Lys Glu Ser
Gln Glu260 265 270Thr Leu His Lys Phe Ala
Ser Lys Pro Ala Ser Glu Phe Val Lys Ile275 280
285Leu Asp Thr Phe Glu Lys Leu Lys Asp Leu Phe Thr Glu Leu Gln
Lys290 295 300Lys Ile Tyr Val Ile Glu Gly
Thr Ser Lys Gln Asp Leu Thr Ser Phe305 310
315 320Asn Met Glu Leu Ser Ser Ser Gly Ile Ser Ala Asp
Leu Ser Arg Gly325 330 335His Ala Val Val
Gly Ala Val Gly Ala Lys Asp Trp Ala Gly Gly Phe340 345
350Leu Asp Leu Lys Ala Asp Leu Gln Asp Asp Thr Phe Ile Gly
Asn Glu355 360 365Pro Leu Thr Pro Glu Val
Arg Ala Gly Tyr Leu Gly Tyr Thr Val Thr370 375
380Trp Leu Pro Ser Arg Gln Lys Thr Ser Leu Leu Ala Ser Gly Ala
Pro385 390 395 400Arg Tyr
Gln His Met Gly Arg Val Leu Leu Phe Gln Glu Pro Gln Gly405
410 415Gly Gly His Trp Ser Gln Val Gln Thr Ile His Gly
Thr Gln Ile Gly420 425 430Ser Tyr Phe Gly
Gly Glu Leu Cys Gly Val Asp Val Asp Gln Asp Gly435 440
445Glu Thr Glu Leu Leu Leu Ile Gly Ala Pro Leu Phe Tyr Gly
Glu Gln450 455 460Arg Gly Gly Arg Val Phe
Ile Tyr Gln Arg Arg Gln Leu Gly Phe Glu465 470
475 480Glu Val Ser Glu Leu Gln Gly Asp Pro Gly Tyr
Pro Leu Gly Arg Phe485 490 495Gly Glu Ala
Ile Thr Ala Leu Thr Asp Ile Asn Gly Asp Gly Leu Val500
505 510Asp Val Ala Val Gly Ala Pro Leu Glu Glu Gln Gly
Ala Val Tyr Ile515 520 525Phe Asn Gly Arg
His Gly Gly Leu Ser Pro Gln Pro Ser Gln Arg Ile530 535
540Glu Gly Thr Gln Val Leu Ser Gly Ile Gln Trp Phe Gly Arg
Ser Ile545 550 555 560His
Gly Val Lys Asp Leu Glu Gly Asp Gly Leu Ala Asp Val Ala Val565
570 575Gly Ala Glu Ser Gln Met Ile Val Leu Ser Ser
Arg Pro Val Val Asp580 585 590Met Val Thr
Leu Met Ser Phe Ser Pro Ala Glu Ile Pro Val His Glu595
600 605Val Glu Cys Ser Tyr Ser Thr Ser Asn Lys Met Lys
Glu Gly Val Asn610 615 620Ile Thr Ile Cys
Phe Gln Ile Lys Ser Leu Tyr Pro Gln Phe Gln Gly625 630
635 640Arg Leu Val Ala Asn Leu Thr Tyr Thr
Leu Gln Leu Asp Gly His Arg645 650 655Thr
Arg Arg Arg Gly Leu Phe Pro Gly Gly Arg His Glu Leu Arg Arg660
665 670Asn Ile Ala Val Thr Thr Ser Met Ser Cys Thr
Asp Phe Ser Phe His675 680 685Phe Pro Val
Cys Val Gln Asp Leu Ile Ser Pro Ile Asn Val Ser Leu690
695 700Asn Phe Ser Leu Trp Glu Glu Glu Gly Thr Pro Arg
Asp Gln Arg Ala705 710 715
720Gln Gly Lys Asp Ile Pro Pro Ile Leu Arg Pro Ser Leu His Ser Glu725
730 735Thr Trp Glu Ile Pro Phe Glu Lys Asn
Cys Gly Glu Asp Lys Lys Cys740 745 750Glu
Ala Asn Leu Arg Val Ser Phe Ser Pro Ala Arg Ser Arg Ala Leu755
760 765Arg Leu Thr Ala Phe Ala Ser Leu Ser Val Glu
Leu Ser Leu Ser Asn770 775 780Leu Glu Glu
Asp Ala Tyr Trp Val Gln Leu Asp Leu His Phe Pro Pro785
790 795 800Gly Leu Ser Phe Arg Lys Val
Glu Met Leu Lys Pro His Ser Gln Ile805 810
815Pro Val Ser Cys Glu Glu Leu Pro Glu Glu Ser Arg Leu Leu Ser Arg820
825 830Ala Leu Ser Cys Asn Val Ser Ser Pro
Ile Phe Lys Ala Gly His Ser835 840 845Val
Ala Leu Gln Met Met Phe Asn Thr Leu Val Asn Ser Ser Trp Gly850
855 860Asp Ser Val Glu Leu His Ala Asn Val Thr Cys
Asn Asn Glu Asp Ser865 870 875
880Asp Leu Leu Glu Asp Asn Ser Ala Thr Thr Ile Ile Pro Ile Leu
Tyr885 890 895Pro Ile Asn Ile Leu Ile Gln
Asp Gln Glu Asp Ser Thr Leu Tyr Val900 905
910Ser Phe Thr Pro Lys Gly Pro Lys Ile His Gln Val Lys His Met Tyr915
920 925Gln Val Arg Ile Gln Pro Ser Ile His
Asp His Asn Ile Pro Thr Leu930 935 940Glu
Ala Val Val Gly Val Pro Gln Pro Pro Ser Glu Gly Pro Ile Thr945
950 955 960His Gln Trp Ser Val Gln
Met Glu Pro Pro Val Pro Cys His Tyr Glu965 970
975Asp Leu Glu Arg Leu Pro Asp Ala Ala Glu Pro Cys Leu Pro Gly
Ala980 985 990Leu Phe Arg Cys Pro Val Val
Phe Arg Gln Glu Ile Leu Val Gln Val995 1000
1005Ile Gly Thr Leu Glu Leu Val Gly Glu Ile Glu Ala Ser Ser
Met1010 1015 1020Phe Ser Leu Cys Ser Ser
Leu Ser Ile Ser Phe Asn Ser Ser Lys1025 1030
1035His Phe His Leu Tyr Gly Ser Asn Ala Ser Leu Ala Gln Val
Val1040 1045 1050Met Lys Val Asp Val Val
Tyr Glu Lys Gln Met Leu Tyr Leu Tyr1055 1060
1065Val Leu Ser Gly Ile Gly Gly Leu Leu Leu Leu Leu Leu Ile
Phe1070 1075 1080Ile Val Leu Tyr Lys Val
Gly Phe Phe Lys Arg Asn Leu Lys Glu1085 1090
1095Lys Met Glu Ala Gly Arg Gly Val Pro Asn Gly Ile Pro Ala
Glu1100 1105 1110Asp Ser Glu Gln Leu Ala
Ser Gly Gln Glu Ala Gly Asp Pro Gly1115 1120
1125Cys Leu Lys Pro Leu His Glu Lys Asp Ser Glu Ser Gly Gly
Gly1130 1135 1140Lys Asp1145
131 660 PRT Homo sapiens 131
Met Glu Ala Leu Met Ala Arg Gly Ala Leu Thr Gly Pro Leu Arg
Ala1 5 10 15Leu Cys Leu
Leu Gly Cys Leu Leu Ser His Ala Ala Ala Ala Pro Ser20 25
30Pro Ile Ile Lys Phe Pro Gly Asp Val Ala Pro Lys Thr
Asp Lys Glu35 40 45Leu Ala Val Gln Tyr
Leu Asn Thr Phe Tyr Gly Cys Pro Lys Glu Ser50 55
60Cys Asn Leu Phe Val Leu Lys Asp Thr Leu Lys Lys Met Gln Lys
Phe65 70 75 80Phe Gly
Leu Pro Gln Thr Gly Asp Leu Asp Gln Asn Thr Ile Glu Thr85
90 95Met Arg Lys Pro Arg Cys Gly Asn Pro Asp Val Ala
Asn Tyr Asn Phe100 105 110Phe Pro Arg Lys
Pro Lys Trp Asp Lys Asn Gln Ile Thr Tyr Arg Ile115 120
125Ile Gly Tyr Thr Pro Asp Leu Asp Pro Glu Thr Val Asp Asp
Ala Phe130 135 140Ala Arg Ala Phe Gln Val
Trp Ser Asp Val Thr Pro Leu Arg Phe Ser145 150
155 160Arg Ile His Asp Gly Glu Ala Asp Ile Met Ile
Asn Phe Gly Arg Trp165 170 175Glu His Gly
Asp Gly Tyr Pro Phe Asp Gly Lys Asp Gly Leu Leu Ala180
185 190His Ala Phe Ala Pro Gly Thr Gly Val Gly Gly Asp
Ser His Phe Asp195 200 205Asp Asp Glu Leu
Trp Thr Leu Gly Glu Gly Gln Val Val Arg Val Lys210 215
220Tyr Gly Asn Ala Asp Gly Glu Tyr Cys Lys Phe Pro Phe Leu
Phe Asn225 230 235 240Gly
Lys Glu Tyr Asn Ser Cys Thr Asp Thr Gly Arg Ser Asp Gly Phe245
250 255Leu Trp Cys Ser Thr Thr Tyr Asn Phe Glu Lys
Asp Gly Lys Tyr Gly260 265 270Phe Cys Pro
His Glu Ala Leu Phe Thr Met Gly Gly Asn Ala Glu Gly275
280 285Gln Pro Cys Lys Phe Pro Phe Arg Phe Gln Gly Thr
Ser Tyr Asp Ser290 295 300Cys Thr Thr Glu
Gly Arg Thr Asp Gly Tyr Arg Trp Cys Gly Thr Thr305 310
315 320Glu Asp Tyr Asp Arg Asp Lys Lys Tyr
Gly Phe Cys Pro Glu Thr Ala325 330 335Met
Ser Thr Val Gly Gly Asn Ser Glu Gly Ala Pro Cys Val Phe Pro340
345 350Phe Thr Phe Leu Gly Asn Lys Tyr Glu Ser Cys
Thr Ser Ala Gly Arg355 360 365Ser Asp Gly
Lys Met Trp Cys Ala Thr Thr Ala Asn Tyr Asp Asp Asp370
375 380Arg Lys Trp Gly Phe Cys Pro Asp Gln Gly Tyr Ser
Leu Phe Leu Val385 390 395
400Ala Ala His Glu Phe Gly His Ala Met Gly Leu Glu His Ser Gln Asp405
410 415Pro Gly Ala Leu Met Ala Pro Ile Tyr
Thr Tyr Thr Lys Asn Phe Arg420 425 430Leu
Ser Gln Asp Asp Ile Lys Gly Ile Gln Glu Leu Tyr Gly Ala Ser435
440 445Pro Asp Ile Asp Leu Gly Thr Gly Pro Thr Pro
Thr Leu Gly Pro Val450 455 460Thr Pro Glu
Ile Cys Lys Gln Asp Ile Val Phe Asp Gly Ile Ala Gln465
470 475 480Ile Arg Gly Glu Ile Phe Phe
Phe Lys Asp Arg Phe Ile Trp Arg Thr485 490
495Val Thr Pro Arg Asp Lys Pro Met Gly Pro Leu Leu Val Ala Thr Phe500
505 510Trp Pro Glu Leu Pro Glu Lys Ile Asp
Ala Val Tyr Glu Ala Pro Gln515 520 525Glu
Glu Lys Ala Val Phe Phe Ala Gly Asn Glu Tyr Trp Ile Tyr Ser530
535 540Ala Ser Thr Leu Glu Arg Gly Tyr Pro Lys Pro
Leu Thr Ser Leu Gly545 550 555
560Leu Pro Pro Asp Val Gln Arg Val Asp Ala Ala Phe Asn Trp Ser
Lys565 570 575Asn Lys Lys Thr Tyr Ile Phe
Ala Gly Asp Lys Phe Trp Arg Tyr Asn580 585
590Glu Val Lys Lys Lys Met Asp Pro Gly Phe Pro Lys Leu Ile Ala Asp595
600 605Ala Trp Asn Ala Ile Pro Asp Asn Leu
Asp Ala Val Val Asp Leu Gln610 615 620Gly
Gly Gly His Ser Tyr Phe Phe Lys Gly Ala Tyr Tyr Leu Lys Leu625
630 635 640Glu Asn Gln Ser Leu Lys
Ser Val Lys Phe Gly Ser Ile Lys Ser Asp645 650
655Trp Leu Gly Cys660
132 707 PRT Homo sapiens 132
Met Ser Leu Trp Gln
Pro Leu Val Leu Val Leu Leu Val Leu Gly Cys1 5
10 15Cys Phe Ala Ala Pro Arg Gln Arg Gln Ser Thr
Leu Val Leu Phe Pro20 25 30Gly Asp Leu
Arg Thr Asn Leu Thr Asp Arg Gln Leu Ala Glu Glu Tyr35 40
45Leu Tyr Arg Tyr Gly Tyr Thr Arg Val Ala Glu Met Arg
Gly Glu Ser50 55 60Lys Ser Leu Gly Pro
Ala Leu Leu Leu Leu Gln Lys Gln Leu Ser Leu65 70
75 80Pro Glu Thr Gly Glu Leu Asp Ser Ala Thr
Leu Lys Ala Met Arg Thr85 90 95Pro Arg
Cys Gly Val Pro Asp Leu Gly Arg Phe Gln Thr Phe Glu Gly100
105 110Asp Leu Lys Trp His His His Asn Ile Thr Tyr Trp
Ile Gln Asn Tyr115 120 125Ser Glu Asp Leu
Pro Arg Ala Val Ile Asp Asp Ala Phe Ala Arg Ala130 135
140Phe Ala Leu Trp Ser Ala Val Thr Pro Leu Thr Phe Thr Arg
Val Tyr145 150 155 160Ser
Arg Asp Ala Asp Ile Val Ile Gln Phe Gly Val Ala Glu His Gly165
170 175Asp Gly Tyr Pro Phe Asp Gly Lys Asp Gly Leu
Leu Ala His Ala Phe180 185 190Pro Pro Gly
Pro Gly Ile Gln Gly Asp Ala His Phe Asp Asp Asp Glu195
200 205Leu Trp Ser Leu Gly Lys Gly Val Val Val Pro Thr
Arg Phe Gly Asn210 215 220Ala Asp Gly Ala
Ala Cys His Phe Pro Phe Ile Phe Glu Gly Arg Ser225 230
235 240Tyr Ser Ala Cys Thr Thr Asp Gly Arg
Ser Asp Gly Leu Pro Trp Cys245 250 255Ser
Thr Thr Ala Asn Tyr Asp Thr Asp Asp Arg Phe Gly Phe Cys Pro260
265 270Ser Glu Arg Leu Tyr Thr Arg Asp Gly Asn Ala
Asp Gly Lys Pro Cys275 280 285Gln Phe Pro
Phe Ile Phe Gln Gly Gln Ser Tyr Ser Ala Cys Thr Thr290
295 300Asp Gly Arg Ser Asp Gly Tyr Arg Trp Cys Ala Thr
Thr Ala Asn Tyr305 310 315
320Asp Arg Asp Lys Leu Phe Gly Phe Cys Pro Thr Arg Ala Asp Ser Thr325
330 335Val Met Gly Gly Asn Ser Ala Gly Glu
Leu Cys Val Phe Pro Phe Thr340 345 350Phe
Leu Gly Lys Glu Tyr Ser Thr Cys Thr Ser Glu Gly Arg Gly Asp355
360 365Gly Arg Leu Trp Cys Ala Thr Thr Ser Asn Phe
Asp Ser Asp Lys Lys370 375 380Trp Gly Phe
Cys Pro Asp Gln Gly Tyr Ser Leu Phe Leu Val Ala Ala385
390 395 400His Glu Phe Gly His Ala Leu
Gly Leu Asp His Ser Ser Val Pro Glu405 410
415Ala Leu Met Tyr Pro Met Tyr Arg Phe Thr Glu Gly Pro Pro Leu His420
425 430Lys Asp Asp Val Asn Gly Ile Arg His
Leu Tyr Gly Pro Arg Pro Glu435 440 445Pro
Glu Pro Arg Pro Pro Thr Thr Thr Thr Pro Gln Pro Thr Ala Pro450
455 460Pro Thr Val Cys Pro Thr Gly Pro Pro Thr Val
His Pro Ser Glu Arg465 470 475
480Pro Thr Ala Gly Pro Thr Gly Pro Pro Ser Ala Gly Pro Thr Gly
Pro485 490 495Pro Thr Ala Gly Pro Ser Thr
Ala Thr Thr Val Pro Leu Ser Pro Val500 505
510Asp Asp Ala Cys Asn Val Asn Ile Phe Asp Ala Ile Ala Glu Ile Gly515
520 525Asn Gln Leu Tyr Leu Phe Lys Asp Gly
Lys Tyr Trp Arg Phe Ser Glu530 535 540Gly
Arg Gly Ser Arg Pro Gln Gly Pro Phe Leu Ile Ala Asp Lys Trp545
550 555 560Pro Ala Leu Pro Arg Lys
Leu Asp Ser Val Phe Glu Glu Pro Leu Ser565 570
575Lys Lys Leu Phe Phe Phe Ser Gly Arg Gln Val Trp Val Tyr Thr
Gly580 585 590Ala Ser Val Leu Gly Pro Arg
Arg Leu Asp Lys Leu Gly Leu Gly Ala595 600
605Asp Val Ala Gln Val Thr Gly Ala Leu Arg Ser Gly Arg Gly Lys Met610
615 620Leu Leu Phe Ser Gly Arg Arg Leu Trp
Arg Phe Asp Val Lys Ala Gln625 630 635
640Met Val Asp Pro Arg Ser Ala Ser Glu Val Asp Arg Met Phe
Pro Gly645 650 655Val Pro Leu Asp Thr His
Asp Val Phe Gln Tyr Arg Glu Lys Ala Tyr660 665
670Phe Cys Gln Asp Arg Phe Tyr Trp Arg Val Ser Ser Arg Ser Glu
Leu675 680 685Asn Gln Val Asp Gln Val Gly
Tyr Val Thr Tyr Asp Ile Leu Gln Cys690 695
700Pro Glu Asp705
133 115 PRT Homo sapiens 133
Ile Pro Thr Glu Ile Pro Thr
Ser Ala Leu Val Lys Glu Thr Leu Ala1 5 10
15Leu Leu Ser Thr His Arg Thr Leu Leu Ile Ala Asn Glu
Thr Leu Arg20 25 30Ile Pro Val Pro Val
His Lys Asn His Gln Leu Cys Thr Glu Glu Ile35 40
45Phe Gln Gly Ile Gly Thr Leu Glu Ser Gln Thr Val Gln Gly Gly
Thr50 55 60Val Glu Arg Leu Phe Lys Asn
Leu Ser Leu Ile Lys Lys Tyr Ile Asp65 70
75 80Gly Gln Lys Lys Lys Cys Gly Glu Glu Arg Arg Arg
Val Asn Gln Phe85 90 95Leu Asp Tyr Leu
Gln Glu Phe Leu Gly Val Met Asn Thr Glu Trp Ile100 105
110Ile Glu Ser115
134 185 PRT Homo sapiens 134
Ala Pro Val Pro
Pro Gly Glu Asp Ser Lys Asp Val Ala Ala Pro His1 5
10 15Arg Gln Pro Leu Thr Ser Ser Glu Arg Ile
Asp Lys Gln Ile Arg Tyr20 25 30Ile Leu
Asp Gly Ile Ser Ala Leu Arg Lys Glu Thr Cys Asn Lys Ser35
40 45Asn Met Cys Glu Ser Ser Lys Glu Ala Leu Ala Glu
Asn Asn Leu Asn50 55 60Leu Pro Lys Met
Ala Glu Lys Asp Gly Cys Phe Gln Ser Gly Phe Asn65 70
75 80Glu Glu Thr Cys Leu Val Lys Ile Ile
Thr Gly Leu Leu Glu Phe Glu85 90 95Val
Tyr Leu Glu Tyr Leu Gln Asn Arg Phe Glu Ser Ser Glu Glu Gln100
105 110Ala Arg Ala Val Gln Met Ser Thr Lys Val Leu
Ile Gln Phe Leu Gln115 120 125Lys Lys Ala
Lys Asn Leu Asp Ala Ile Thr Thr Pro Asp Pro Thr Thr130
135 140Asn Ala Ser Leu Leu Thr Lys Leu Gln Ala Gln Asn
Gln Trp Leu Gln145 150 155
160Asp Met Thr Thr His Leu Ile Leu Arg Ser Phe Lys Glu Phe Leu Gln165
170 175Ser Ser Leu Arg Ala Leu Arg Gln
Met180 185
135 160 PRT Homo sapiens 135
Ser Pro Gly Gln Gly
Thr Gln Ser Glu Asn Ser Cys Thr His Phe Pro1 5
10 15Gly Asn Leu Pro Asn Met Leu Arg Asp Leu Arg
Asp Ala Phe Ser Arg20 25 30Val Lys Thr
Phe Phe Gln Met Lys Asp Gln Leu Asp Asn Leu Leu Leu35 40
45Lys Glu Ser Leu Leu Glu Asp Phe Lys Gly Tyr Leu Gly
Cys Gln Ala50 55 60Leu Ser Glu Met Ile
Gln Phe Tyr Leu Glu Glu Val Met Pro Gln Ala65 70
75 80Glu Asn Gln Asp Pro Asp Ile Lys Ala His
Val Asn Ser Leu Gly Glu85 90 95Asn Leu
Lys Thr Leu Arg Leu Arg Leu Arg Arg Cys His Arg Phe Leu100
105 110Pro Cys Glu Asn Lys Ser Lys Ala Val Glu Gln Val
Lys Asn Ala Phe115 120 125Asn Lys Leu Gln
Glu Lys Gly Ile Tyr Lys Ala Met Ser Glu Phe Asp130 135
140Ile Phe Ile Asn Tyr Ile Glu Ala Tyr Met Thr Met Lys Ile
Arg Asn145 150 155
160
136 472 PRT Homo sapiens 136
Glu Met Gly Thr Ala Asp Leu Gly Pro Ser Ser
Val Pro Thr Pro Thr1 5 10
15Asn Val Thr Ile Glu Ser Tyr Asn Met Asn Pro Ile Val Tyr Trp Glu20
25 30Tyr Gln Ile Met Pro Gln Val Pro Val Phe
Thr Val Glu Val Lys Asn35 40 45Tyr Gly
Val Lys Asn Ser Glu Trp Ile Asp Ala Cys Ile Asn Ile Ser50
55 60His His Tyr Cys Asn Ile Ser Asp His Val Gly Asp
Pro Ser Asn Ser65 70 75
80Leu Trp Val Arg Val Lys Ala Arg Val Gly Gln Lys Glu Ser Ala Tyr85
90 95Ala Lys Ser Glu Glu Phe Ala Val Cys Arg
Asp Gly Lys Ile Gly Pro100 105 110Pro Lys
Leu Asp Ile Arg Lys Glu Glu Lys Gln Ile Met Ile Asp Ile115
120 125Phe His Pro Ser Val Phe Val Asn Gly Asp Glu Gln
Glu Val Asp Tyr130 135 140Asp Pro Glu Thr
Thr Cys Tyr Ile Arg Val Tyr Asn Val Tyr Val Arg145 150
155 160Met Asn Gly Ser Glu Ile Gln Tyr Lys
Ile Leu Thr Gln Lys Glu Asp165 170 175Asp
Cys Asp Glu Ile Gln Cys Gln Leu Ala Ile Pro Val Ser Ser Leu180
185 190Asn Ser Gln Tyr Cys Val Ser Ala Glu Gly Val
Leu His Val Trp Gly195 200 205Val Thr Thr
Glu Lys Ser Lys Glu Val Cys Ile Thr Ile Phe Asn Ser210
215 220Ser Ile Lys Gly Ser Leu Trp Ile Pro Val Val Ala
Ala Leu Leu Leu225 230 235
240Phe Leu Val Leu Ser Leu Val Phe Ile Cys Phe Tyr Ile Lys Lys Ile245
250 255Asn Pro Leu Lys Glu Lys Ser Ile Ile
Leu Pro Lys Ser Leu Ile Ser260 265 270Val
Val Arg Ser Ala Thr Leu Glu Thr Lys Pro Glu Ser Lys Tyr Val275
280 285Ser Leu Ile Thr Ser Tyr Gln Pro Phe Ser Leu
Glu Lys Glu Val Val290 295 300Cys Glu Glu
Pro Leu Ser Pro Ala Thr Val Pro Gly Met His Thr Glu305
310 315 320Asp Asn Pro Gly Lys Val Glu
His Thr Glu Glu Leu Ser Ser Ile Thr325 330
335Glu Val Val Thr Thr Glu Glu Asn Ile Pro Asp Val Val Pro Gly Ser340
345 350His Leu Thr Pro Ile Glu Arg Glu Ser
Ser Ser Pro Leu Ser Ser Asn355 360 365Gln
Ser Glu Pro Gly Ser Ile Ala Leu Asn Ser Tyr His Ser Arg Asn370
375 380Cys Ser Glu Ser Asp His Ser Arg Asn Gly Phe
Asp Thr Asp Ser Ser385 390 395
400Cys Leu Glu Ser His Ser Ser Leu Ser Asp Ser Glu Phe Pro Pro
Asn405 410 415Asn Lys Gly Glu Ile Lys Thr
Glu Gly Gln Glu Leu Ile Thr Val Ile420 425
430Lys Ala Pro Thr Ser Phe Gly Tyr Asp Lys Pro His Val Leu Val Asp435
440 445Leu Leu Val Asp Asp Ser Gly Lys Glu
Ser Leu Ile Gly Tyr Arg Pro450 455 460Thr
Glu Asp Ser Lys Glu Phe Ser465 470
137 143 PRT Homo sapiens 137
Gln Asp Pro Tyr Val Lys Glu Ala Glu Asn Leu Lys Lys Tyr Phe Asn1
5 10 15Ala Gly His Ser Asp Val
Ala Asp Asn Gly Thr Leu Phe Leu Gly Ile20 25
30Leu Lys Asn Trp Lys Glu Glu Ser Asp Arg Lys Ile Met Gln Ser Gln35
40 45Ile Val Ser Phe Tyr Phe Lys Leu Phe
Lys Asn Phe Lys Asp Asp Gln50 55 60Ser
Ile Gln Lys Ser Val Glu Thr Ile Lys Glu Asp Met Asn Val Lys65
70 75 80Phe Phe Asn Ser Asn Lys
Lys Lys Arg Asp Asp Phe Glu Lys Leu Thr85 90
95Asn Tyr Ser Val Thr Asp Leu Asn Val Gln Arg Lys Ala Ile His Glu100
105 110Leu Ile Gln Val Met Ala Glu Leu
Ser Pro Ala Ala Lys Thr Gly Lys115 120
125Arg Lys Arg Ser Gln Met Leu Phe Arg Gly Arg Arg Ala Ser Gln130
135 140
138 143 PRT Homo sapiens 138
Met Glu Ser Pro
Ser Ala Pro Pro His Arg Trp Cys Ile Pro Trp Gln1 5
10 15Arg Leu Leu Leu Thr Ala Ser Leu Leu Thr
Phe Trp Asn Pro Pro Thr20 25 30Thr Ala
Lys Leu Thr Ile Glu Ser Thr Pro Phe Asn Val Ala Glu Gly35
40 45Lys Glu Val Leu Leu Leu Val His Asn Leu Pro Gln
His Leu Phe Gly50 55 60Tyr Ser Trp Tyr
Lys Gly Glu Arg Val Asp Gly Asn Arg Gln Ile Ile65 70
75 80Gly Tyr Val Ile Gly Thr Gln Gln Ala
Thr Pro Gly Pro Ala Tyr Ser85 90 95Gly
Arg Glu Ile Ile Tyr Pro Asn Ala Ser Leu Leu Ile Gln Asn Ile100
105 110Ile Gln Asn Asp Thr Gly Phe Tyr Thr Leu His
Val Ile Lys Ser Asp115 120 125Leu Val Asn
Glu Glu Ala Thr Gly Gln Phe Arg Val Tyr Arg Glu130 135
140
139 440 PRT Homo sapiens 139
Glu Met Val Asp Asn Leu Arg Gly
Lys Ser Gly Gln Gly Tyr Tyr Val1 5 10
15Glu Met Thr Val Gly Ser Pro Pro Gln Thr Leu Asn Ile Leu
Val Asp20 25 30Thr Gly Ser Ser Asn Phe
Ala Val Gly Ala Ala Pro His Pro Phe Leu35 40
45His Arg Tyr Tyr Gln Arg Gln Leu Ser Ser Thr Tyr Arg Asp Leu Arg50
55 60Lys Gly Val Tyr Val Pro Tyr Thr Gln
Gly Lys Trp Glu Gly Glu Leu65 70 75
80Gly Thr Asp Leu Val Ser Ile Pro His Gly Pro Asn Val Thr
Val Arg85 90 95Ala Asn Ile Ala Ala Ile
Thr Glu Ser Asp Lys Phe Phe Ile Asn Gly100 105
110Ser Asn Trp Glu Gly Ile Leu Gly Leu Ala Tyr Ala Glu Ile Ala
Arg115 120 125Pro Asp Asp Ser Leu Glu Pro
Phe Phe Asp Ser Leu Val Lys Gln Thr130 135
140His Val Pro Asn Leu Phe Ser Leu Gln Leu Cys Gly Ala Gly Phe Pro145
150 155 160Leu Asn Gln Ser
Glu Val Leu Ala Ser Val Gly Gly Ser Met Ile Ile165 170
175Gly Gly Ile Asp His Ser Leu Tyr Thr Gly Ser Leu Trp Tyr
Thr Pro180 185 190Ile Arg Arg Glu Trp Tyr
Tyr Glu Val Ile Ile Val Arg Val Glu Ile195 200
205Asn Gly Gln Asp Leu Lys Met Asp Cys Lys Glu Tyr Asn Tyr Asp
Lys210 215 220Ser Ile Val Asp Ser Gly Thr
Thr Asn Leu Arg Leu Pro Lys Lys Val225 230
235 240Phe Glu Ala Ala Val Lys Ser Ile Lys Ala Ala Ser
Ser Thr Glu Lys245 250 255Phe Pro Asp Gly
Phe Trp Leu Gly Glu Gln Leu Val Cys Trp Gln Ala260 265
270Gly Thr Thr Pro Trp Asn Ile Phe Pro Val Ile Ser Leu Tyr
Leu Met275 280 285Gly Glu Val Thr Asn Gln
Ser Phe Arg Ile Thr Ile Leu Pro Gln Gln290 295
300Tyr Leu Arg Pro Val Glu Asp Val Ala Thr Ser Gln Asp Asp Cys
Tyr305 310 315 320Lys Phe
Ala Ile Ser Gln Ser Ser Thr Gly Thr Val Met Gly Ala Val325
330 335Ile Met Glu Gly Phe Tyr Val Val Phe Asp Arg Ala
Arg Lys Arg Ile340 345 350Gly Phe Ala Val
Ser Ala Cys His Val His Asp Glu Phe Arg Thr Ala355 360
365Ala Val Glu Gly Pro Phe Val Thr Leu Asp Met Glu Asp Cys
Gly Tyr370 375 380Asn Ile Pro Gln Thr Asp
Glu Ser Thr Leu Met Thr Ile Ala Tyr Val385 390
395 400Met Ala Ala Ile Cys Ala Leu Phe Met Leu Pro
Leu Cys Leu Met Val405 410 415Cys Gln Trp
Arg Cys Leu Arg Cys Leu Arg Gln Gln His Asp Asp Phe420
425 430Ala Asp Asp Ile Ser Leu Leu Lys435
440
140 810 PRT Homo sapiens 140
Met Glu His Lys Glu Val Val Leu Leu Leu Leu
Leu Phe Leu Lys Ser1 5 10
15Gly Gln Gly Glu Pro Leu Asp Asp Tyr Val Asn Thr Gln Gly Ala Ser20
25 30Leu Phe Ser Val Thr Lys Lys Gln Leu Gly
Ala Gly Ser Ile Glu Glu35 40 45Cys Ala
Ala Lys Cys Glu Glu Asp Glu Glu Phe Thr Cys Arg Ala Phe50
55 60Gln Tyr His Ser Lys Glu Gln Gln Cys Val Ile Met
Ala Glu Asn Arg65 70 75
80Lys Ser Ser Ile Ile Ile Arg Met Arg Asp Val Val Leu Phe Glu Lys85
90 95Lys Val Tyr Leu Ser Glu Cys Lys Thr Gly
Asn Gly Lys Asn Tyr Arg100 105 110Gly Thr
Met Ser Lys Thr Lys Asn Gly Ile Thr Cys Gln Lys Trp Ser115
120 125Ser Thr Ser Pro His Arg Pro Arg Phe Ser Pro Ala
Thr His Pro Ser130 135 140Glu Gly Leu Glu
Glu Asn Tyr Cys Arg Asn Pro Asp Asn Asp Pro Gln145 150
155 160Gly Pro Trp Cys Tyr Thr Thr Asp Pro
Glu Lys Arg Tyr Asp Tyr Cys165 170 175Asp
Ile Leu Glu Cys Glu Glu Glu Cys Met His Cys Ser Gly Glu Asn180
185 190Tyr Asp Gly Lys Ile Ser Lys Thr Met Ser Gly
Leu Glu Cys Gln Ala195 200 205Trp Asp Ser
Gln Ser Pro His Ala His Gly Tyr Ile Pro Ser Lys Phe210
215 220Pro Asn Lys Asn Leu Lys Lys Asn Tyr Cys Arg Asn
Pro Asp Arg Glu225 230 235
240Leu Arg Pro Trp Cys Phe Thr Thr Asp Pro Asn Lys Arg Trp Glu Leu245
250 255Cys Asp Ile Pro Arg Cys Thr Thr Pro
Pro Pro Ser Ser Gly Pro Thr260 265 270Tyr
Gln Cys Leu Lys Gly Thr Gly Glu Asn Tyr Arg Gly Asn Val Ala275
280 285Val Thr Val Ser Gly His Thr Cys Gln His Trp
Ser Ala Gln Thr Pro290 295 300His Thr His
Asn Arg Thr Pro Glu Asn Phe Pro Cys Lys Asn Leu Asp305
310 315 320Glu Asn Tyr Cys Arg Asn Pro
Asp Gly Lys Arg Ala Pro Trp Cys His325 330
335Thr Thr Asn Ser Gln Val Arg Trp Glu Tyr Cys Lys Ile Pro Ser Cys340
345 350Asp Ser Ser Pro Val Ser Thr Glu Gln
Leu Ala Pro Thr Ala Pro Pro355 360 365Glu
Leu Thr Pro Val Val Gln Asp Cys Tyr His Gly Asp Gly Gln Ser370
375 380Tyr Arg Gly Thr Ser Ser Thr Thr Thr Thr Gly
Lys Lys Cys Gln Ser385 390 395
400Trp Ser Ser Met Thr Pro His Arg His Gln Lys Thr Pro Glu Asn
Tyr405 410 415Pro Asn Ala Gly Leu Thr Met
Asn Tyr Cys Arg Asn Pro Asp Ala Asp420 425
430Lys Gly Pro Trp Cys Phe Thr Thr Asp Pro Ser Val Arg Trp Glu Tyr435
440 445Cys Asn Leu Lys Lys Cys Ser Gly Thr
Glu Ala Ser Val Val Ala Pro450 455 460Pro
Pro Val Val Leu Leu Pro Asp Val Glu Thr Pro Ser Glu Glu Asp465
470 475 480Cys Met Phe Gly Asn Gly
Lys Gly Tyr Arg Gly Lys Arg Ala Thr Thr485 490
495Val Thr Gly Thr Pro Cys Gln Asp Trp Ala Ala Gln Glu Pro His
Arg500 505 510His Ser Ile Phe Thr Pro Glu
Thr Asn Pro Arg Ala Gly Leu Glu Lys515 520
525Asn Tyr Cys Arg Asn Pro Asp Gly Asp Val Gly Gly Pro Trp Cys Tyr530
535 540Thr Thr Asn Pro Arg Lys Leu Tyr Asp
Tyr Cys Asp Val Pro Gln Cys545 550 555
560Ala Ala Pro Ser Phe Asp Cys Gly Lys Pro Gln Val Glu Pro
Lys Lys565 570 575Cys Pro Gly Arg Val Val
Gly Gly Cys Val Ala His Pro His Ser Trp580 585
590Pro Trp Gln Val Ser Leu Arg Thr Arg Phe Gly Met His Phe Cys
Gly595 600 605Gly Thr Leu Ile Ser Pro Glu
Trp Val Leu Thr Ala Ala His Cys Leu610 615
620Glu Lys Ser Pro Arg Pro Ser Ser Tyr Lys Val Ile Leu Gly Ala His625
630 635 640Gln Glu Val Asn
Leu Glu Pro His Val Gln Glu Ile Glu Val Ser Arg645 650
655Leu Phe Leu Glu Pro Thr Arg Lys Asp Ile Ala Leu Leu Lys
Leu Ser660 665 670Ser Pro Ala Val Ile Thr
Asp Lys Val Ile Pro Ala Cys Leu Pro Ser675 680
685Pro Asn Tyr Val Val Ala Asp Arg Thr Glu Cys Phe Ile Thr Gly
Trp690 695 700Gly Glu Thr Gln Gly Thr Phe
Gly Ala Gly Leu Leu Lys Glu Ala Gln705 710
715 720Leu Pro Val Ile Glu Asn Lys Val Cys Asn Arg Tyr
Glu Phe Leu Asn725 730 735Gly Arg Val Gln
Ser Thr Glu Leu Cys Ala Gly His Leu Ala Gly Gly740 745
750Thr Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Cys
Phe Glu755 760 765Lys Asp Lys Tyr Ile Leu
Gln Gly Val Thr Ser Trp Gly Leu Gly Cys770 775
780Ala Arg Pro Asn Lys Pro Gly Val Tyr Val Arg Val Ser Arg Phe
Val785 790 795 800Thr Trp
Ile Glu Gly Val Met Arg Asn Asn805 810
<210> 141 <211> 762 <212> PRT <213> Homo sapiens
<400> 141
Gly Pro Asn Ile Cys Thr Thr Arg Gly Val Ser Ser Cys Gln Gln
Cys1 5 10 15Leu Ala Val
Ser Pro Met Cys Ala Trp Cys Ser Asp Glu Ala Leu Pro20 25
30Leu Gly Ser Pro Arg Cys Asp Leu Lys Glu Asn Leu Leu
Lys Asp Asn35 40 45Cys Ala Pro Glu Ser
Ile Glu Phe Pro Val Ser Glu Ala Arg Val Leu50 55
60Glu Asp Arg Pro Leu Ser Asp Lys Gly Ser Gly Asp Ser Ser Gln
Val65 70 75 80Thr Gln
Val Ser Pro Gln Arg Ile Ala Leu Arg Leu Arg Pro Asp Asp85
90 95Ser Lys Asn Phe Ser Ile Gln Val Arg Gln Val Glu
Asp Tyr Pro Val100 105 110Asp Ile Tyr Tyr
Leu Met Asp Leu Ser Tyr Ser Met Lys Asp Asp Leu115 120
125Trp Ser Ile Gln Asn Leu Gly Thr Lys Leu Ala Thr Gln Met
Arg Lys130 135 140Leu Thr Ser Asn Leu Arg
Ile Gly Phe Gly Ala Phe Val Asp Lys Pro145 150
155 160Val Ser Pro Tyr Met Tyr Ile Ser Pro Pro Glu
Ala Leu Glu Asn Pro165 170 175Cys Tyr Asp
Met Lys Thr Thr Cys Leu Pro Met Phe Gly Tyr Lys His180
185 190Val Leu Thr Leu Thr Asp Gln Val Thr Arg Phe Asn
Glu Glu Val Lys195 200 205Lys Gln Ser Val
Ser Arg Asn Arg Asp Ala Pro Glu Gly Gly Phe Asp210 215
220Ala Ile Met Gln Ala Thr Val Cys Asp Glu Lys Ile Gly Trp
Arg Asn225 230 235 240Asp
Ala Ser His Leu Leu Val Phe Thr Thr Asp Ala Lys Thr His Ile245
250 255Ala Leu Asp Gly Arg Leu Ala Gly Ile Val Gln
Pro Asn Asp Gly Gln260 265 270Cys His Val
Gly Ser Asp Asn His Tyr Ser Ala Ser Thr Thr Met Asp275
280 285Tyr Pro Ser Leu Gly Leu Met Thr Glu Lys Leu Ser
Gln Lys Asn Ile290 295 300Asn Leu Ile Phe
Ala Val Thr Glu Asn Val Val Asn Leu Tyr Gln Asn305 310
315 320Tyr Ser Glu Leu Ile Pro Gly Thr Thr
Val Gly Val Leu Ser Met Asp325 330 335Ser
Ser Asn Val Leu Gln Leu Ile Val Asp Ala Tyr Gly Lys Ile Arg340
345 350Ser Lys Val Glu Leu Glu Val Arg Asp Leu Pro
Glu Glu Leu Ser Leu355 360 365Ser Phe Asn
Ala Thr Cys Leu Asn Asn Glu Val Ile Pro Gly Leu Lys370
375 380Ser Cys Met Gly Leu Lys Ile Gly Asp Thr Val Ser
Phe Ser Ile Glu385 390 395
400Ala Lys Val Arg Gly Cys Pro Gln Glu Lys Glu Lys Ser Phe Thr Ile405
410 415Lys Pro Val Gly Phe Lys Asp Ser Leu
Ile Val Gln Val Thr Phe Asp420 425 430Cys
Asp Cys Ala Cys Gln Ala Gln Ala Glu Pro Asn Ser His Arg Cys435
440 445Asn Asn Gly Asn Gly Thr Phe Glu Cys Gly Val
Cys Arg Cys Gly Pro450 455 460Gly Trp Leu
Gly Ser Gln Cys Glu Cys Ser Glu Glu Asp Tyr Arg Pro465
470 475 480Ser Gln Gln Asp Glu Cys Ser
Pro Arg Glu Gly Gln Pro Val Cys Ser485 490
495Gln Arg Gly Glu Cys Leu Cys Gly Gln Cys Val Cys His Ser Ser Asp500
505 510Phe Gly Lys Ile Thr Gly Lys Tyr Cys
Glu Cys Asp Asp Phe Ser Cys515 520 525Val
Arg Tyr Lys Gly Glu Met Cys Ser Gly His Gly Gln Cys Ser Cys530
535 540Gly Asp Cys Leu Cys Asp Ser Asp Trp Thr Gly
Tyr Tyr Cys Asn Cys545 550 555
560Thr Thr Arg Thr Asp Thr Cys Met Ser Ser Asn Gly Leu Leu Cys
Ser565 570 575Gly Arg Gly Lys Cys Glu Cys
Gly Ser Cys Val Cys Ile Gln Pro Gly580 585
590Ser Tyr Gly Asp Thr Cys Glu Lys Cys Pro Thr Cys Pro Asp Ala Cys595
600 605Thr Phe Lys Lys Glu Cys Val Glu Cys
Lys Lys Phe Asp Arg Glu Pro610 615 620Tyr
Met Thr Glu Asn Thr Cys Asn Arg Tyr Cys Arg Asp Glu Ile Glu625
630 635 640Ser Val Lys Glu Leu Lys
Asp Thr Gly Lys Asp Ala Val Asn Cys Thr645 650
655Tyr Lys Asn Glu Asp Asp Cys Val Val Arg Phe Gln Tyr Tyr Glu
Asp660 665 670Ser Ser Gly Lys Ser Ile Leu
Tyr Val Val Glu Glu Pro Glu Cys Pro675 680
685Lys Gly Pro Asp Ile Leu Val Val Leu Leu Ser Val Met Gly Ala Ile690
695 700Leu Leu Ile Gly Leu Ala Ala Leu Leu
Ile Trp Lys Leu Leu Ile Thr705 710 715
720Ile His Asp Arg Lys Glu Phe Ala Lys Phe Glu Glu Glu Arg
Ala Arg725 730 735Ala Lys Trp Asp Thr Ala
Asn Asn Pro Leu Tyr Lys Glu Ala Thr Ser740 745
750Thr Phe Thr Asn Ile Thr Tyr Arg Gly Thr755
760
<210> 142 <211> 505 <212> PRT <213> Homo sapiens
<400> 142
Gln Thr Ser Val Ser Pro Ser Lys Val Ile Leu
Pro Arg Gly Gly Ser1 5 10
15Val Leu Val Thr Cys Ser Thr Ser Cys Asp Gln Pro Lys Leu Leu Gly20
25 30Ile Glu Thr Pro Leu Pro Lys Lys Glu Leu
Leu Leu Pro Gly Asn Asn35 40 45Arg Lys
Val Tyr Glu Leu Ser Asn Val Gln Glu Asp Ser Gln Pro Met50
55 60Cys Tyr Ser Asn Cys Pro Asp Gly Gln Ser Thr Ala
Lys Thr Phe Leu65 70 75
80Thr Val Tyr Trp Thr Pro Glu Arg Val Glu Leu Ala Pro Leu Pro Ser85
90 95Trp Gln Pro Val Gly Lys Asn Leu Thr Leu
Arg Cys Gln Val Glu Gly100 105 110Gly Ala
Pro Arg Ala Asn Leu Thr Val Val Leu Leu Arg Gly Glu Lys115
120 125Glu Leu Lys Arg Glu Pro Ala Val Gly Glu Pro Ala
Glu Val Thr Thr130 135 140Thr Val Leu Val
Arg Arg Asp His His Gly Ala Asn Phe Ser Cys Arg145 150
155 160Thr Glu Leu Asp Leu Arg Pro Gln Gly
Leu Glu Leu Phe Glu Asn Thr165 170 175Ser
Ala Pro Tyr Gln Leu Gln Thr Phe Val Leu Pro Ala Thr Pro Pro180
185 190Gln Leu Val Ser Pro Arg Val Leu Glu Val Asp
Thr Gln Gly Thr Val195 200 205Val Cys Ser
Leu Asp Gly Leu Phe Pro Val Ser Glu Ala Gln Val His210
215 220Leu Ala Leu Gly Asp Gln Arg Leu Asn Pro Thr Val
Thr Tyr Gly Asn225 230 235
240Asp Ser Phe Ser Ala Lys Ala Ser Val Ser Val Thr Ala Glu Asp Glu245
250 255Gly Thr Gln Arg Leu Thr Cys Ala Val
Ile Leu Gly Asn Gln Ser Gln260 265 270Glu
Thr Leu Gln Thr Val Thr Ile Tyr Ser Phe Pro Ala Pro Asn Val275
280 285Ile Leu Thr Lys Pro Glu Val Ser Glu Gly Thr
Glu Val Thr Val Lys290 295 300Cys Glu Ala
His Pro Arg Ala Lys Val Thr Leu Asn Gly Val Pro Ala305
310 315 320Gln Pro Leu Gly Pro Arg Ala
Gln Leu Leu Leu Lys Ala Thr Pro Glu325 330
335Asp Asn Gly Arg Ser Phe Ser Cys Ser Ala Thr Leu Glu Val Ala Gly340
345 350Gln Leu Ile His Lys Asn Gln Thr Arg
Glu Leu Arg Val Leu Tyr Gly355 360 365Pro
Arg Leu Asp Glu Arg Asp Cys Pro Gly Asn Trp Thr Trp Pro Glu370
375 380Asn Ser Gln Gln Thr Pro Met Cys Gln Ala Trp
Gly Asn Pro Leu Pro385 390 395
400Glu Leu Lys Cys Leu Lys Asp Gly Thr Phe Pro Leu Pro Ile Gly
Glu405 410 415Ser Val Thr Val Thr Arg Asp
Leu Glu Gly Thr Tyr Leu Cys Arg Ala420 425
430Arg Ser Thr Gln Gly Glu Val Thr Arg Glu Val Thr Val Asn Val Leu435
440 445Ser Pro Arg Tyr Glu Ile Val Ile Ile
Thr Val Val Ala Ala Ala Val450 455 460Ile
Met Gly Thr Ala Gly Leu Ser Thr Tyr Leu Tyr Asn Arg Gln Arg465
470 475 480Lys Ile Lys Lys Tyr Arg
Leu Gln Gln Ala Gln Lys Gly Thr Pro Met485 490
495Lys Pro Asn Thr Gln Ala Thr Pro Pro500
505
<210> 143 <211> 261 <212> PRT <213> Homo sapiens
<400> 143
Met Ile Glu Thr Tyr Asn Gln Thr Ser Pro Arg
Ser Ala Ala Thr Gly1 5 10
15Leu Pro Ile Ser Met Lys Ile Phe Met Tyr Leu Leu Thr Val Phe Leu20
25 30Ile Thr Gln Met Ile Gly Ser Ala Leu Phe
Ala Val Tyr Leu His Arg35 40 45Arg Leu
Asp Lys Ile Glu Asp Glu Arg Asn Leu His Glu Asp Phe Val50
55 60Phe Met Lys Thr Ile Gln Arg Cys Asn Thr Gly Glu
Arg Ser Leu Ser65 70 75
80Leu Leu Asn Cys Glu Glu Ile Lys Ser Gln Phe Glu Gly Phe Val Lys85
90 95Asp Ile Met Leu Asn Lys Glu Glu Thr Lys
Lys Glu Asn Ser Phe Glu100 105 110Met Gln
Lys Gly Asp Gln Asn Pro Gln Ile Ala Ala His Val Ile Ser115
120 125Glu Ala Ser Ser Lys Thr Thr Ser Val Leu Gln Trp
Ala Glu Lys Gly130 135 140Tyr Tyr Thr Met
Ser Asn Asn Leu Val Thr Leu Glu Asn Gly Lys Gln145 150
155 160Leu Thr Val Lys Arg Gln Gly Leu Tyr
Tyr Ile Tyr Ala Gln Val Thr165 170 175Phe
Cys Ser Asn Arg Glu Ala Ser Ser Gln Ala Pro Phe Ile Ala Ser180
185 190Leu Cys Leu Lys Ser Pro Gly Arg Phe Glu Arg
Ile Leu Leu Arg Ala195 200 205Ala Asn Thr
His Ser Ser Ala Lys Pro Cys Gly Gln Gln Ser Ile His210
215 220Leu Gly Gly Val Phe Glu Leu Gln Pro Gly Ala Ser
Val Phe Val Asn225 230 235
240Val Thr Asp Pro Ser Gln Val Ser His Gly Thr Gly Phe Thr Ser Phe245
250 255Gly Leu Leu Lys Leu260
<210> 144 <211> 187 <212> PRT <213> Homo sapiens
<400> 144
Ala Met His Val Ala Gln Pro Ala Val Val Leu Ala Ser Ser Arg
Gly1 5 10 15Ile Ala Ser
Phe Val Cys Glu Tyr Ala Ser Pro Gly Lys Ala Thr Glu20 25
30Val Arg Val Thr Val Leu Arg Gln Ala Asp Ser Gln Val
Thr Glu Val35 40 45Cys Ala Ala Thr Tyr
Met Met Gly Asn Glu Leu Thr Phe Leu Asp Asp50 55
60Ser Ile Cys Thr Gly Thr Ser Ser Gly Asn Gln Val Asn Leu Thr
Ile65 70 75 80Gln Gly
Leu Arg Ala Met Asp Thr Gly Leu Tyr Ile Cys Lys Val Glu85
90 95Leu Met Tyr Pro Pro Pro Tyr Tyr Leu Gly Ile Gly
Asn Gly Thr Gln100 105 110Ile Tyr Val Ile
Asp Pro Glu Pro Cys Pro Asp Ser Asp Phe Leu Leu115 120
125Trp Ile Leu Ala Ala Val Ser Ser Gly Leu Phe Phe Tyr Ser
Phe Leu130 135 140Leu Thr Ala Val Ser Leu
Ser Lys Met Leu Lys Lys Arg Ser Pro Leu145 150
155 160Thr Thr Gly Val Tyr Val Lys Met Pro Pro Thr
Glu Pro Glu Cys Glu165 170 175Lys Gln Phe
Gln Pro Tyr Phe Ile Pro Ile Asn180 185
<210> 145 <211> 544 <212> PRT <213> Homo sapiens
<400> 145
Ile Pro Pro His Val Gln Lys Ser Val Asn Asn Asp Met Ile Val
Thr1 5 10 15Asp Asn Asn
Gly Ala Val Lys Phe Pro Gln Leu Cys Lys Phe Cys Asp20 25
30Val Arg Phe Ser Thr Cys Asp Asn Gln Lys Ser Cys Met
Ser Asn Cys35 40 45Ser Ile Thr Ser Ile
Cys Glu Lys Pro Gln Glu Val Cys Val Ala Val50 55
60Trp Arg Lys Asn Asp Glu Asn Ile Thr Leu Glu Thr Val Cys His
Asp65 70 75 80Pro Lys
Leu Pro Tyr His Asp Phe Ile Leu Glu Asp Ala Ala Ser Pro85
90 95Lys Cys Ile Met Lys Glu Lys Lys Lys Pro Gly Glu
Thr Phe Phe Met100 105 110Cys Ser Cys Ser
Ser Asp Glu Cys Asn Asp Asn Ile Ile Phe Ser Glu115 120
125Glu Tyr Asn Thr Ser Asn Pro Asp Leu Leu Leu Val Ile Phe
Gln Val130 135 140Thr Gly Ile Ser Leu Leu
Pro Pro Leu Gly Val Ala Ile Ser Val Ile145 150
155 160Ile Ile Phe Tyr Cys Tyr Arg Val Asn Arg Gln
Gln Lys Leu Ser Ser165 170 175Thr Trp Glu
Thr Gly Lys Thr Arg Lys Leu Met Glu Phe Ser Glu His180
185 190Cys Ala Ile Ile Leu Glu Asp Asp Arg Ser Asp Ile
Ser Ser Thr Cys195 200 205Ala Asn Asn Ile
Asn His Asn Thr Glu Leu Leu Pro Ile Glu Leu Asp210 215
220Thr Leu Val Gly Lys Gly Arg Phe Ala Glu Val Tyr Lys Ala
Lys Leu225 230 235 240Lys
Gln Asn Thr Ser Glu Gln Phe Glu Thr Val Ala Val Lys Ile Phe245
250 255Pro Tyr Glu Glu Tyr Ala Ser Trp Lys Thr Glu
Lys Asp Ile Phe Ser260 265 270Asp Ile Asn
Leu Lys His Glu Asn Ile Leu Gln Phe Leu Thr Ala Glu275
280 285Glu Arg Lys Thr Glu Leu Gly Lys Gln Tyr Trp Leu
Ile Thr Ala Phe290 295 300His Ala Lys Gly
Asn Leu Gln Glu Tyr Leu Thr Arg His Val Ile Ser305 310
315 320Trp Glu Asp Leu Arg Lys Leu Gly Ser
Ser Leu Ala Arg Gly Ile Ala325 330 335His
Leu His Ser Asp His Thr Pro Cys Gly Arg Pro Lys Met Pro Ile340
345 350Val His Arg Asp Leu Lys Ser Ser Asn Ile Leu
Val Lys Asn Asp Leu355 360 365Thr Cys Cys
Leu Cys Asp Phe Gly Leu Ser Leu Arg Leu Asp Pro Thr370
375 380Leu Ser Val Asp Asp Leu Ala Asn Ser Gly Gln Val
Gly Thr Ala Arg385 390 395
400Tyr Met Ala Pro Glu Val Leu Glu Ser Arg Met Asn Leu Glu Asn Ala405
410 415Glu Ser Phe Lys Gln Thr Asp Val Tyr
Ser Met Ala Leu Val Leu Trp420 425 430Glu
Met Thr Ser Arg Cys Asn Ala Val Gly Glu Val Lys Asp Tyr Glu435
440 445Pro Pro Phe Gly Ser Lys Val Arg Glu His Pro
Cys Val Glu Ser Met450 455 460Lys Asp Asn
Val Leu Arg Asp Arg Gly Arg Pro Glu Ile Pro Ser Phe465
470 475 480Trp Leu Asn His Gln Gly Ile
Gln Met Val Cys Glu Thr Leu Thr Glu485 490
495Cys Trp Asp His Asp Pro Glu Ala Arg Leu Thr Ala Gln Cys Val Ala500
505 510Glu Arg Phe Ser Glu Leu Glu His Leu
Asp Arg Leu Ser Gly Arg Ser515 520 525Cys
Ser Glu Glu Lys Ile Pro Glu Asp Gly Ser Leu Asn Thr Thr Lys530
535 540
<210> 146 <211> 358 <212> PRT <213> Homo sapiens
<400> 146
Cys Glu Glu Pro Pro
Thr Phe Glu Ala Met Glu Leu Ile Gly Lys Pro1 5
10 15Lys Pro Tyr Tyr Glu Ile Gly Glu Arg Val Asp
Tyr Lys Cys Lys Lys20 25 30Gly Tyr Phe
Tyr Ile Pro Pro Leu Ala Thr His Thr Ile Cys Asp Arg35 40
45Asn His Thr Trp Leu Pro Val Ser Asp Asp Ala Cys Tyr
Arg Glu Thr50 55 60Cys Pro Tyr Ile Arg
Asp Pro Leu Asn Gly Gln Ala Val Pro Ala Asn65 70
75 80Gly Thr Tyr Glu Phe Gly Tyr Gln Met His
Phe Ile Cys Asn Glu Gly85 90 95Tyr Tyr
Leu Ile Gly Glu Glu Ile Leu Tyr Cys Glu Leu Lys Gly Ser100
105 110Val Ala Ile Trp Ser Gly Lys Pro Pro Ile Cys Glu
Lys Val Leu Cys115 120 125Thr Pro Pro Pro
Lys Ile Lys Asn Gly Lys His Thr Phe Ser Glu Val130 135
140Glu Val Phe Glu Tyr Leu Asp Ala Val Thr Tyr Ser Cys Asp
Pro Ala145 150 155 160Pro
Gly Pro Asp Pro Phe Ser Leu Ile Gly Glu Ser Thr Ile Tyr Cys165
170 175Gly Asp Asn Ser Val Trp Ser Arg Ala Ala Pro
Glu Cys Lys Val Val180 185 190Lys Cys Arg
Phe Pro Val Val Glu Asn Gly Lys Gln Ile Ser Gly Phe195
200 205Gly Lys Lys Phe Tyr Tyr Lys Ala Thr Val Met Phe
Glu Cys Asp Lys210 215 220Gly Phe Tyr Leu
Asp Gly Ser Asp Thr Ile Val Cys Asp Ser Asn Ser225 230
235 240Thr Trp Asp Pro Pro Val Pro Lys Cys
Leu Lys Val Leu Pro Pro Ser245 250 255Ser
Thr Lys Pro Pro Ala Leu Ser His Ser Val Ser Thr Ser Ser Thr260
265 270Thr Lys Ser Pro Ala Ser Ser Ala Ser Gly Pro
Arg Pro Thr Tyr Lys275 280 285Pro Pro Val
Ser Asn Tyr Pro Gly Tyr Pro Lys Pro Glu Glu Gly Ile290
295 300Leu Asp Ser Leu Asp Val Trp Val Ile Ala Val Ile
Val Ile Ala Ile305 310 315
320Val Val Gly Val Ala Val Ile Cys Val Val Pro Tyr Arg Tyr Leu Gln325
330 335Arg Arg Lys Lys Lys Gly Thr Tyr Leu
Thr Asp Glu Thr His Arg Glu340 345 350Val
Lys Phe Thr Ser Leu355
<210> 147 <211> 1148 <212> PRT <213> Homo sapiens
<400> 147
Leu Pro Glu Ala Lys Ile
Phe Ser Gly Pro Ser Ser Glu Gln Phe Gly1 5
10 15Tyr Ala Val Gln Gln Phe Ile Asn Pro Lys Gly Asn
Trp Leu Leu Val20 25 30Gly Ser Pro Trp
Ser Gly Phe Pro Glu Asn Arg Met Gly Asp Val Tyr35 40
45Lys Cys Pro Val Asp Leu Ser Thr Ala Thr Cys Glu Lys Leu
Asn Leu50 55 60Gln Thr Ser Thr Ser Ile
Pro Asn Val Thr Glu Met Lys Thr Asn Met65 70
75 80Ser Leu Gly Leu Ile Leu Thr Arg Asn Met Gly
Thr Gly Gly Phe Leu85 90 95Thr Cys Gly
Pro Leu Trp Ala Gln Gln Cys Gly Asn Gln Tyr Tyr Thr100
105 110Thr Gly Val Cys Ser Asp Ile Ser Pro Asp Phe Gln
Leu Ser Ala Ser115 120 125Phe Ser Pro Ala
Thr Gln Pro Cys Pro Ser Leu Ile Asp Val Val Val130 135
140Val Cys Asp Glu Ser Asn Ser Ile Tyr Pro Trp Asp Ala Val
Lys Asn145 150 155 160Phe
Leu Glu Lys Phe Val Gln Gly Leu Asp Ile Gly Pro Thr Lys Thr165
170 175Gln Val Gly Leu Ile Gln Tyr Ala Asn Asn Pro
Arg Val Val Phe Asn180 185 190Leu Asn Thr
Tyr Lys Thr Lys Glu Glu Met Ile Val Ala Thr Ser Gln195
200 205Thr Ser Gln Tyr Gly Gly Asp Leu Thr Asn Thr Phe
Gly Ala Ile Gln210 215 220Tyr Ala Arg Lys
Tyr Ala Tyr Ser Ala Ala Ser Gly Gly Arg Arg Ser225 230
235 240Ala Thr Lys Val Met Val Val Val Thr
Asp Gly Glu Ser His Asp Gly245 250 255Ser
Met Leu Lys Ala Val Ile Asp Gln Cys Asn His Asp Asn Ile Leu260
265 270Arg Phe Gly Ile Ala Val Leu Gly Tyr Leu Asn
Arg Asn Ala Leu Asp275 280 285Thr Lys Asn
Leu Ile Lys Glu Ile Lys Ala Ile Ala Ser Ile Pro Thr290
295 300Glu Arg Tyr Phe Phe Asn Val Ser Asp Glu Ala Ala
Leu Leu Glu Lys305 310 315
320Ala Gly Thr Leu Gly Glu Gln Ile Phe Ser Ile Glu Gly Thr Val Gln325
330 335Gly Gly Asp Asn Phe Gln Met Glu Met
Ser Gln Val Gly Phe Ser Ala340 345 350Asp
Tyr Ser Ser Gln Asn Asp Ile Leu Met Leu Gly Ala Val Gly Ala355
360 365Phe Gly Trp Ser Gly Thr Ile Val Gln Lys Thr
Ser His Gly His Leu370 375 380Ile Phe Pro
Lys Gln Ala Phe Asp Gln Ile Leu Gln Asp Arg Asn His385
390 395 400Ser Ser Tyr Leu Gly Tyr Ser
Val Ala Ala Ile Ser Thr Gly Glu Ser405 410
415Thr His Phe Val Ala Gly Ala Pro Arg Ala Asn Tyr Thr Gly Gln Ile420
425 430Val Leu Tyr Ser Val Asn Glu Asn Gly
Asn Ile Thr Val Ile Gln Ala435 440 445His
Arg Gly Asp Gln Ile Gly Ser Tyr Phe Gly Ser Val Leu Cys Ser450
455 460Val Asp Val Asp Lys Asp Thr Ile Thr Asp Val
Leu Leu Val Gly Ala465 470 475
480Pro Met Tyr Met Ser Asp Leu Lys Lys Glu Glu Gly Arg Val Tyr
Leu485 490 495Phe Thr Ile Lys Lys Gly Ile
Leu Gly Gln His Gln Phe Leu Glu Gly500 505
510Pro Glu Gly Ile Glu Asn Thr Arg Phe Gly Ser Ala Ile Ala Ala Leu515
520 525Ser Asp Ile Asn Met Asp Gly Phe Asn
Asp Val Ile Val Gly Ser Pro530 535 540Leu
Glu Asn Gln Asn Ser Gly Ala Val Tyr Ile Tyr Asn Gly His Gln545
550 555 560Gly Thr Ile Arg Thr Lys
Tyr Ser Gln Lys Ile Leu Gly Ser Asp Gly565 570
575Ala Phe Arg Ser His Leu Gln Tyr Phe Gly Arg Ser Leu Asp Gly
Tyr580 585 590Gly Asp Leu Asn Gly Asp Ser
Ile Thr Asp Val Ser Ile Gly Ala Phe595 600
605Gly Gln Val Val Gln Leu Trp Ser Gln Ser Ile Ala Asp Val Ala Ile610
615 620Glu Ala Ser Phe Thr Pro Glu Lys Ile
Thr Leu Val Asn Lys Asn Ala625 630 635
640Gln Ile Ile Leu Lys Leu Cys Phe Ser Ala Lys Phe Arg Pro
Thr Lys645 650 655Gln Asn Asn Gln Val Ala
Ile Val Tyr Asn Ile Thr Leu Asp Ala Asp660 665
670Gly Phe Ser Ser Arg Val Thr Ser Arg Gly Leu Phe Lys Glu Asn
Asn675 680 685Glu Arg Cys Leu Gln Lys Asn
Met Val Val Asn Gln Ala Gln Ser Cys690 695
700Pro Glu His Ile Ile Tyr Ile Gln Glu Pro Ser Asp Val Val Asn Ser705
710 715 720Leu Asp Leu Arg
Val Asp Ile Ser Leu Glu Asn Pro Gly Thr Ser Pro725 730
735Ala Leu Glu Ala Tyr Ser Glu Thr Ala Lys Val Phe Ser Ile
Pro Phe740 745 750His Lys Asp Cys Gly Glu
Asp Gly Leu Cys Ile Ser Asp Leu Val Leu755 760
765Asp Val Arg Gln Ile Pro Ala Ala Gln Glu Gln Pro Phe Ile Val
Ser770 775 780Asn Gln Asn Lys Arg Leu Thr
Phe Ser Val Thr Leu Lys Asn Lys Arg785 790
795 800Glu Ser Ala Tyr Asn Thr Gly Ile Val Val Asp Phe
Ser Glu Asn Leu805 810 815Phe Phe Ala Ser
Phe Ser Leu Pro Val Asp Gly Thr Glu Val Thr Cys820 825
830Gln Val Ala Ala Ser Gln Lys Ser Val Ala Cys Asp Val Gly
Tyr Pro835 840 845Ala Leu Lys Arg Glu Gln
Gln Val Thr Phe Thr Ile Asn Phe Asp Phe850 855
860Asn Leu Gln Asn Leu Gln Asn Gln Ala Ser Leu Ser Phe Gln Ala
Leu865 870 875 880Ser Glu
Ser Gln Glu Glu Asn Lys Ala Asp Asn Leu Val Asn Leu Lys885
890 895Ile Pro Leu Leu Tyr Asp Ala Glu Ile His Leu Thr
Arg Ser Thr Asn900 905 910Ile Asn Phe Tyr
Glu Ile Ser Ser Asp Gly Asn Val Pro Ser Ile Val915 920
925His Ser Phe Glu Asp Val Gly Pro Lys Phe Ile Phe Ser Leu
Lys Val930 935 940Thr Thr Gly Ser Val Pro
Val Ser Met Ala Thr Val Ile Ile His Ile945 950
955 960Pro Gln Tyr Thr Lys Glu Lys Asn Pro Leu Met
Tyr Leu Thr Gly Val965 970 975Gln Thr Asp
Lys Ala Gly Asp Ile Ser Cys Asn Ala Asp Ile Asn Pro980
985 990Leu Lys Ile Gly Gln Thr Ser Ser Ser Val Ser Phe
Lys Ser Glu Asn995 1000 1005Phe Arg His
Thr Lys Glu Leu Asn Cys Arg Thr Ala Ser Cys Ser1010
1015 1020Asn Val Thr Cys Trp Leu Lys Asp Val His Met
Lys Gly Glu Tyr1025 1030 1035Phe Val
Asn Val Thr Thr Arg Ile Trp Asn Gly Thr Phe Ala Ser1040
1045 1050Ser Thr Phe Gln Thr Val Gln Leu Thr Ala Ala
Ala Glu Ile Asn1055 1060 1065Thr Tyr
Asn Pro Glu Ile Tyr Val Ile Glu Asp Asn Thr Val Thr1070
1075 1080Ile Pro Leu Met Ile Met Lys Pro Asp Glu Lys
Ala Glu Val Pro1085 1090 1095Thr Gly
Val Ile Ile Gly Ser Ile Ile Ala Gly Ile Leu Leu Leu1100
1105 1110Leu Ala Leu Val Ala Ile Leu Trp Lys Leu Gly
Phe Phe Lys Arg1115 1120 1125Lys Tyr
Glu Lys Met Thr Lys Asn Pro Asp Glu Ile Asp Glu Thr1130
1135 1140Thr Glu Leu Ser Ser1145
<210> 148 <211> 133 <212> PRT <213> Homo sapiens
<400> 148
Ala Pro Met Thr Gln Thr Thr Pro Leu Lys Thr Ser Trp Val Asn Cys1
5 10 15Ser Asn Met Ile Asp Glu
Ile Ile Thr His Leu Lys Gln Pro Pro Leu20 25
30Pro Leu Leu Asp Phe Asn Asn Leu Asn Gly Glu Asp Gln Asp Ile Leu35
40 45Met Glu Asn Asn Leu Arg Arg Pro Asn
Leu Glu Ala Phe Asn Arg Ala50 55 60Val
Lys Ser Leu Gln Asn Ala Ser Ala Ile Glu Ser Ile Leu Lys Asn65
70 75 80Leu Leu Pro Cys Leu Pro
Leu Ala Thr Ala Ala Pro Thr Arg His Pro85 90
95Ile His Ile Lys Asp Gly Asp Trp Asn Glu Phe Arg Arg Lys Leu Thr100
105 110Phe Tyr Leu Lys Thr Leu Glu Asn
Ala Gln Ala Gln Gln Thr Thr Leu115 120
125Ser Leu Ala Ile Phe130
<210> 149 <211> 622 <212> PRT <213> Homo sapiens
<400> 149
Met Ala His Val Arg Gly
Leu Gln Leu Pro Gly Cys Leu Ala Leu Ala1 5
10 15Ala Leu Cys Ser Leu Val His Ser Gln His Val Phe
Leu Ala Pro Gln20 25 30Gln Ala Arg Ser
Leu Leu Gln Arg Val Arg Arg Ala Asn Thr Phe Leu35 40
45Glu Glu Val Arg Lys Gly Asn Leu Glu Arg Glu Cys Val Glu
Glu Thr50 55 60Cys Ser Tyr Glu Glu Ala
Phe Glu Ala Leu Glu Ser Ser Thr Ala Thr65 70
75 80Asp Val Phe Trp Ala Lys Tyr Thr Ala Cys Glu
Thr Ala Arg Thr Pro85 90 95Arg Asp Lys
Leu Ala Ala Cys Leu Glu Gly Asn Cys Ala Glu Gly Leu100
105 110Gly Thr Asn Tyr Arg Gly His Val Asn Ile Thr Arg
Ser Gly Ile Glu115 120 125Cys Gln Leu Trp
Arg Ser Arg Tyr Pro His Lys Pro Glu Ile Asn Ser130 135
140Thr Thr His Pro Gly Ala Asp Leu Gln Glu Asn Phe Cys Arg
Asn Pro145 150 155 160Asp
Ser Ser Thr Thr Gly Pro Trp Cys Tyr Thr Thr Asp Pro Thr Val165
170 175Arg Arg Gln Glu Cys Ser Ile Pro Val Cys Gly
Gln Asp Gln Val Thr180 185 190Val Ala Met
Thr Pro Arg Ser Glu Gly Ser Ser Val Asn Leu Ser Pro195
200 205Pro Leu Glu Gln Cys Val Pro Asp Arg Gly Gln Gln
Tyr Gln Gly Arg210 215 220Leu Ala Val Thr
Thr His Gly Leu Pro Cys Leu Ala Trp Ala Ser Ala225 230
235 240Gln Ala Lys Ala Leu Ser Lys His Gln
Asp Phe Asn Ser Ala Val Gln245 250 255Leu
Val Glu Asn Phe Cys Arg Asn Pro Asp Gly Asp Glu Glu Gly Val260
265 270Trp Cys Tyr Val Ala Gly Lys Pro Gly Asp Phe
Gly Tyr Cys Asp Leu275 280 285Asn Tyr Cys
Glu Glu Ala Val Glu Glu Glu Thr Gly Asp Gly Leu Asp290
295 300Glu Asp Ser Asp Arg Ala Ile Glu Gly Arg Thr Ala
Thr Ser Glu Tyr305 310 315
320Gln Thr Phe Phe Asn Pro Arg Thr Phe Gly Ser Gly Glu Ala Asp Cys325
330 335Gly Leu Arg Pro Leu Phe Glu Lys Lys
Ser Leu Glu Asp Lys Thr Glu340 345 350Arg
Glu Leu Leu Glu Ser Tyr Ile Asp Gly Arg Ile Val Glu Gly Ser355
360 365Asp Ala Glu Ile Gly Met Ser Pro Trp Gln Val
Met Leu Phe Arg Lys370 375 380Ser Pro Gln
Glu Leu Leu Cys Gly Ala Ser Leu Ile Ser Asp Arg Trp385
390 395 400Val Leu Thr Ala Ala His Cys
Leu Leu Tyr Pro Pro Trp Asp Lys Asn405 410
415Phe Thr Glu Asn Asp Leu Leu Val Arg Ile Gly Lys His Ser Arg Thr420
425 430Arg Tyr Glu Arg Asn Ile Glu Lys Ile
Ser Met Leu Glu Lys Ile Tyr435 440 445Ile
His Pro Arg Tyr Asn Trp Arg Glu Asn Leu Asp Arg Asp Ile Ala450
455 460Leu Met Lys Leu Lys Lys Pro Val Ala Phe Ser
Asp Tyr Ile His Pro465 470 475
480Val Cys Leu Pro Asp Arg Glu Thr Ala Ala Ser Leu Leu Gln Ala
Gly485 490 495Tyr Lys Gly Arg Val Thr Gly
Trp Gly Asn Leu Lys Glu Thr Trp Thr500 505
510Ala Asn Val Gly Lys Gly Gln Pro Ser Val Leu Gln Val Val Asn Leu515
520 525Pro Ile Val Glu Arg Pro Val Cys Lys
Asp Ser Thr Arg Ile Arg Ile530 535 540Thr
Asp Asn Met Phe Cys Ala Gly Tyr Lys Pro Asp Glu Gly Lys Arg545
550 555 560Gly Asp Ala Cys Glu Gly
Asp Ser Gly Gly Pro Phe Val Met Lys Ser565 570
575Pro Phe Asn Asn Arg Trp Tyr Gln Met Gly Ile Val Ser Trp Gly
Glu580 585 590Gly Cys Asp Arg Asp Gly Lys
Tyr Gly Phe Tyr Thr His Val Phe Arg595 600
605Leu Lys Lys Trp Ile Gln Lys Val Ile Asp Gln Phe Gly Glu610
615 620