Patent application title: NOVEL C. ELEGANS P21-ACTIVATED KINASE (PAK) GENE AND ASSOCIATED LOSS-OF-FUNCTION PHENOTYPES THAT FACILITATE SCREENING FOR SMALL MOLECULE MODULATORS OF PAK ACTIVITY IN THE NEMATODE, CAENORHABDITIS ELEGANS

Inventors:  Kaj Grandien (Kelkheim, DE)  Jonathan Rothblatt (Somerville, MA, US)  Paola Concari (Munich, DE)  Isabelle Quelo (Schwalbach, DE)  Bert Klebl (Gunzlhofen, DE)
Assignees:  SANOFI-AVENTIS DEUTSCHLAND GMBH
IPC8 Class: AG01N33567FI
USPC Class: 435 72
Class name: Involving a micro-organism or cell membrane bound antigen or cell membrane bound receptor or cell membrane bound antibody or microbial lysate
Publication date: 09/04/2008
Patent application number: 20080213798







Abstract:

The invention refers to a novel C. elegans p21-activated kinase gene, the pak-3 gene, and associated loss-of-function phenotypes. These phenotypes can be used to elucidate PAK signaling pathways in C. elegans and to screen compounds that modulate PAK signaling.

Claims:

1. An isolated protein which is encoded by a polynucleotide sequence comprising SEQ ID NO. 1.

2. The isolated protein as claimed in claim 1 which comprises an amino acid sequence of SEQ ID NO. 7.

3. The isolated protein as claimed in claim 1 which exhibits the activity of a pak-3a protein.

4. An assay for identifying a compound which binds to the pak-3a protein wherein a] a pak-3a protein is provided, b] a compound is provided, c] the pak-3a protein and the compound are brought in contact, and d] the binding of the chemical compound to the pak-3a protein is determined and/or the activity of the pak-3a protein is determined.

5. The assay as claimed in claim 4, wherein step a] consists of providing a host cell which is expressing a pak-3a protein and step c] consists of bringing in contact the host cell with the compound.

6. The assay as claimed in claim 4, wherein the compound is inactivating, or activating, or maintaining the activity of a pak-3a protein.

Description:

[0001]A novel C. elegans p21-activated kinase (PAK) gene and associated loss-of-function phenotypes that facilitate screening for small molecule modulators of PAK activity in the nematode, Caenorhabditis elegans.

[0002]The invention refers to a novel C. elegans p21-activated kinase gene, the pak-3 gene, and associated loss-of-function phenotypes. These phenotypes can be used to elucidate PAK signaling pathways in C. elegans and to screen compounds that modulate PAK signaling.

[0003]The p21-activated kinases comprise a group of serine/threonine protein kinases with distinct structural features that have emerged as important regulators of several different cellular and biological processes (reviewed in Bokoch, Annu Rev Biochem 2003. 72:743-81). The PAK family can be subdivided in the PAK 1-3 subclass and the PAK 4-6 subclass (based on the numbering of the human/mammalian PAKs), the former being the focus of interest here. Members of the PAK 1-3 subclass are highly related to the STE20 kinase in yeast, the founding member of this protein class, and homologues have also been identified in other model organisms such as Drosophila and C. elegans.

[0004]The two most important structural features of PAKs of the 1-3 class are the highly conserved C-terminal catalytic domain and the N-terminal regulatory domain. A distinct motif in the regulatory domain of PAK proteins is the CRIB domain (cdc42 and Rac interactive domain), which overlaps with an autoinhibitory domain, keeping the catalytic domain inactive in the absence of stimulatory signals. Other motifs found in PAK proteins are SH3 binding domains and an acidic residue-rich domain between the regulatory domain and the catalytic domain. Additionally, a binding site for the Gβγ subunit of heterotrimeric G proteins has been reported to be present in the very C-terminus of PAK.

[0005]The best described activators of PAKs are the Rho class GTPases cdc42 and Rac, which upon binding to the CRIB domain block the autoinhibitory domain, leading to activation of the kinase domain. Activation of PAKs can also take place through GTPase independent mechanisms after recruitment of PAKs to the plasma membrane, where tyrosine kinase receptor mediated activation occurs. PAKs are known to be activated by phosphorylation, in part through autophosphorylation at Thr423 and Ser144 (numbering according to human PAK1). One kinase that has been shown to phosphorylate Thr423 is PDK1, a 3-phosphoinositide dependent kinase.

[0006]Many proteins have been reported to be phosphorylated by PAKs; several of these are proteins involved in cell structure and cell motility. It has for example been shown that LIM kinases-1 and -2, serine kinases implicated in actin cytoskeletal dynamics, are phosphorylated by PAKs. Other targets involved in cell motility are myosin light chain kinase and regulatory myosin light chain. In addition, PAKs are involved in microtubule dynamics, possibly by phosphorylation of stathmin.

[0007]Through their regulatory actions on the actin cytoskeleton, myosin and microtubules, PAKs are highly involved in cellular processes such as cell motility and cell migration, which on the organism level is manifested as important role(s) for PAKs during e.g. neurogenesis and angiogenesis. It has also been suggested that PAKs are part of a signaling cascade leading to platelet activation through their regulatory action on actin cytoskeleton dynamics. PAKs are known to have both pro- and antiapoptotic effects, depending on the isoform in question. PAK2 is activated by caspase 3 and is thus part of the apoptotic signaling cascade, whereas it has been reported that PAK1 is activated by certain signaling pathways that promote cell survival, for example by IL-3 signaling.

[0008]The important role for PAKs in neurogenesis is exemplified by the hereditary disease nonsyndromic X-linked mental retardation, which is caused by point mutations in PAK3, the brain-specific PAK isoform in humans.

[0009]Several studies have suggested that PAKs may play important roles in cancer metastasis. For example, it has been reported that many breast cancer cell lines express elevated PAK1 and PAK2 activities. It has also been shown that heregulin, a stimulator of cancer cell growth, stimulates PAK1 activity. In addition, dominant negative forms of Pak1 can inhibit motility and invasiveness in cancer cell models.

[0010]It has been demonstrated that PAKs can associate with the HIV encoded Nef protein, a protein of central importance in HIV pathogenesis. Together with Nef, PAK appears to promote viral replication and pathogenesis of HIV, and PAK is required for survival of infected cells.

[0011]Previous studies have described the existence of one PAK encoding gene in C. elegans, denoted PAK1 (Chen et al 1996 JBC 271, 26362-68, Iino & Yamamoto 1998 BBRC 245, 177-84). It was shown by in vitro biochemical assays that PAK1 encodes a bona fide PAK protein demonstrating kinase activity and interaction with CeRac1 (today known as CED-10) and CDC42Ce (CDC-42). Immunofluorescence indicated PAK-1 localization to hypodermal cell boundaries during embryonic body elongation, suggesting a role for pak-1 in embryogenesis. Analysis of transgenic worms expressing pak-1 promoter-reporter gene fusions demonstrated pak-1 expression throughout development, primarily in embryonic tissues, pharyngeal muscles, CAN neurons, motor neurons in the ventral nerve cord, the spermatheca and the distal tip cell (DTC) of the developing gonad. However, no in vivo functional characterization of pak-1 has been reported, even though a knock-out pak-1 strain, RB689, is publicly available. This might suggest that loss-of-function phenotypes of pak-1 are very subtle and hard to detect or that pak-1 is functionally redundant with other protein(s).

C45b11.1 (pak-4)

[0012]In addition to the pak-1 gene, one other predicted gene in the C. elegans genome, c45b11.1, appears to encode a PAK protein, which, based on sequence homology, belongs to the PAK 4-6 subclass of PAK proteins. We propose to call this gene pak-4.

Indications of a Hitherto Unidentified Pak Gene

[0013]Sequence homology searches for genes encoding PAK-like kinase domains identified one open reading frame, y38f1a.10 (SEQ. ID NO. 33), predicted to encode a kinase domain-only protein, without the characteristic regulatory regions of a PAK protein. In the kinase database "kinase.com" located on the world wide web, this ORF is denoted with the name PAK3 with the associated comment that a putative CRIB domain is encoded in a genomic region further upstream. However, no references or experimental data are provided that support this notion.

[0014]The invention pertains to an isolated polynucleotide comprising a DNA sequence which is selected from one of the following groups:

[0015]a] a DNA sequence of SEQ ID NO. 1; or

[0016]b] a DNA sequence which is complementary to SEQ ID NO. 1; or

[0017]c] a DNA sequence which hybridizes to a DNA sequence of SEQ ID NO. 1 or to a DNA sequence which is complementary to SEQ ID NO. 1; or

[0018]d] a DNA sequence which is degenerate as a result of the genetic code to the DNA sequence of SEQ ID NO. 1 or to a DNA sequence which is complementary to SEQ ID NO. 1; or

[0019]e] a DNA sequence which is encoding a pak-3a polypeptide.

[0020]In one embodiment of the invention the isolated polynucleotide consists of a polynucleotide sequence of SEQ ID NO. 1.

[0021]In a further embodiment of the invention the pak-3a polypeptide that is encoded by the DNA-sequence is the pak-3a polypeptide of C. elegans consisting of an amino acid sequence of SEQ ID NO. 7.

[0022]The hybridization can occur under conditions of medium or high stringency. Conditions for medium or high stringency hybridization can be found in textbooks such as "Molecular Cloning; edited by Sambrook J., Fritsch E. F., Maniatis T.; Cold Spring Harbor Laboratory Press (ISBN: 0-87969-309-6)" or

[0023]"Current Protocols in Molecular Biology; edited by Ausubel F. M., Brent R., Kingston R. E., Moore D. D., Seidmann J. G., Smith A., Struhl K.; John Wiley & Sons, Inc. (ISBN: 0-471-50338-X, looseleaf)".

Example of Hybridization Under Medium Stringency Conditions:

[0024]The DNA or RNA is transferred onto a membrane filter (e.g. nylon, nitrocellulose) via Southern Blot or Northern Blot.

[0025]The membrane filter containing the target DNA or RNA (e.g. polynucleotide comprising a sequence of SEQ ID NO. 1) is thoroughly wetted in 6×SSC which is prepared from 20×SSC by dilution with water.

[0026](20×SSC: 3 M NaCl, 0.3 M Na3-citrate·2H2O).

[0027]The membrane filter is then prehybridized by adding 0.2 ml prehybridization solution and incubated at 68° C. for 1-2 hours. The prehybridization solution consists of 6×SSC, 5×Denhardt's reagent, 0.5% SDS and 100 μg/ml denatured, fragmented salmon sperm DNA. 5×Denhardt's reagent is prepared from 100×Denhardt's solution by dilution with water.

[0028](100×Denhardt's solution: 10 g Ficoll 400 and 10 g polyvinylpyrrolidone and 10 g bovine serum albumin in 500 ml water).
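The dilutions described above follow the usual relation C1·V1 = C2·V2. The following is a minimal sketch of that arithmetic for the prehybridization solution, not part of the described protocol. The SDS stock (10%) and the salmon sperm DNA stock (10 mg/ml) are assumptions chosen for illustration; the text specifies only the final concentrations. The function names are hypothetical.

```python
def stock_volume_ml(stock_conc, final_conc, final_volume_ml):
    """Solve C1*V1 = C2*V2 for V1 (the volume of stock to add)."""
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("cannot dilute to a higher concentration")
    return final_conc * final_volume_ml / stock_conc


def prehybridization_recipe(final_volume_ml,
                            sds_stock_percent=10.0,      # assumed stock, not from the text
                            dna_stock_mg_per_ml=10.0):   # assumed stock, not from the text
    """Volumes (ml) of each stock for 6x SSC, 5x Denhardt's, 0.5% SDS, 100 ug/ml DNA."""
    volumes = {
        "20x SSC": stock_volume_ml(20.0, 6.0, final_volume_ml),
        "100x Denhardt's": stock_volume_ml(100.0, 5.0, final_volume_ml),
        "SDS stock": stock_volume_ml(sds_stock_percent, 0.5, final_volume_ml),
        # Convert the DNA stock from mg/ml to ug/ml before applying the formula.
        "salmon sperm DNA stock": stock_volume_ml(
            dna_stock_mg_per_ml * 1000.0, 100.0, final_volume_ml),
    }
    # Water makes up the remainder of the final volume.
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


recipe = prehybridization_recipe(100.0)
```

With the assumed stocks, 100 ml of prehybridization solution would take 30 ml of 20×SSC, 5 ml of 100×Denhardt's solution, 5 ml of SDS stock, 1 ml of DNA stock and 59 ml of water.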

[0029]To the prehybridization mix 10-20 μg/ml of radiolabeled probe (specific activity for example = 10^9 cpm/μg) is added.

[0030]If the radiolabeled probe is double stranded, it has to be denatured by heating for 5 min. at 100° C. followed by rapid chilling to between 0° C. and 10° C.

[0031]The hybridization mix is incubated for 2 to 14 hours at 60° C. After hybridization the membrane filter is first washed in 2×SSC containing 0.5% SDS for 5 minutes at room temperature.

[0032]The filter is then washed in 2×SSC containing 0.1% SDS for 15 minutes at room temperature.

[0033]The filter is then washed in 0.1×SSC containing 0.5% SDS for 30 minutes at 37° C.

[0034]The filter is then washed in 0.1×SSC containing 0.5% SDS for 30 minutes at 42° C.

[0035]After these washing steps the filter is exposed e.g. to X-ray film or is analyzed by a phosphorimager (Applied Biosystems).

Example of Hybridization Under High Stringency Conditions:

[0036]The medium and high stringency conditions differ in particular with respect to the temperature and composition of the washing steps. The prehybridization and the incubation with the radiolabeled probe are performed under the same conditions as for medium stringency hybridization, whereas the washing steps under high stringency hybridization are as follows:

[0037]The membrane filter is first washed in 2×SSC and 0.5% SDS for 5 minutes at room temperature.

[0038]The filter is then washed in 2×SSC containing 0.1% SDS for 30 min at 50° C.

[0039]The filter is then washed in 0.1×SSC containing 0.1% SDS for 30 min at 60° C.

[0040]This last washing step is repeated one more time before the filter is exposed to an X-ray film or analyzed by a phosphorimager.
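The two washing schedules described above can be laid side by side as data, which makes the difference in temperature and salt concentration easy to see. This is an illustrative sketch only: the `Wash` type and the helper names are hypothetical, and the third wash of the medium stringency schedule is read here as 0.1×SSC.

```python
from typing import List, NamedTuple, Optional


class Wash(NamedTuple):
    ssc_x: float             # SSC concentration as a multiple of 1x
    sds_percent: float       # SDS concentration in percent
    minutes: int             # duration of the wash
    temp_c: Optional[float]  # None means room temperature


MEDIUM_STRINGENCY: List[Wash] = [
    Wash(2.0, 0.5, 5, None),
    Wash(2.0, 0.1, 15, None),
    Wash(0.1, 0.5, 30, 37.0),  # read as 0.1x SSC (assumption, see lead-in)
    Wash(0.1, 0.5, 30, 42.0),
]

HIGH_STRINGENCY: List[Wash] = [
    Wash(2.0, 0.5, 5, None),
    Wash(2.0, 0.1, 30, 50.0),
    Wash(0.1, 0.1, 30, 60.0),
    Wash(0.1, 0.1, 30, 60.0),  # the last wash is repeated once
]


def total_minutes(washes: List[Wash]) -> int:
    """Total washing time of a schedule."""
    return sum(w.minutes for w in washes)


def warmest_wash_c(washes: List[Wash]) -> Optional[float]:
    """Highest heated wash temperature, or None if all washes are at room temperature."""
    temps = [w.temp_c for w in washes if w.temp_c is not None]
    return max(temps) if temps else None
```

Under this reading, the medium stringency washes take 80 minutes and reach 42° C., while the high stringency washes take 95 minutes and reach 60° C.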

[0041]In another embodiment the invention concerns an isolated polynucleotide comprising a DNA sequence that is selected from one of the following groups:

[0042]a] a DNA sequence of SEQ ID No. 2, 3, 4, 5 or 6; or

[0043]b] a DNA sequence which is complementary to one of the DNA sequences of SEQ ID NO. 2, 3, 4, 5 or 6; or

[0044]c] a DNA sequence which hybridizes to at least one DNA sequence of SEQ ID NO. 2, 3, 4, 5 or 6 or to at least one DNA sequence which is complementary to a DNA sequence of SEQ ID NO. 2, 3, 4, 5 or 6; or

[0045]d] a DNA sequence which is degenerate as a result of the genetic code to at least one DNA sequence of SEQ ID NO. 2, 3, 4, 5 or 6; or

[0046]e] a DNA sequence which is encoding a pak-3b polypeptide.

[0047]In one embodiment of the invention the isolated polynucleotide consists of a polynucleotide sequence of SEQ ID NO. 2, 3, 4, 5 or 6.

[0048]In a further embodiment of the invention the pak-3b polypeptide that is encoded by the DNA sequence is the pak-3b polypeptide of C. elegans consisting of an amino acid sequence of SEQ ID NO. 8, 9, 10, 11 or 12.

[0049]With respect to the hybridization of a polynucleotide to a sequence specifying pak-3b, reference is made to the conditions described above in the context of pak-3a. The conditions specified for pak-3a are equally applicable to pak-3b.

[0050]The invention refers in a further embodiment to a recombinant vector sequence comprising a DNA sequence selected from one of the following groups:

[0051]a] a DNA sequence of one of the SEQ ID NO. 13, 14, 15, 16, 17 or 18; or

[0052]b] a DNA sequence which hybridizes to one of the SEQ ID NO. 13, 14, 15, 16, 17 or 18.

[0053]The conditions for hybridization as specified for pak-3a are applicable for a DNA sequence that hybridizes to one of the SEQ ID NO. 13, 14, 15, 16, 17 or 18 as well.

[0054]The invention refers in a further preferred embodiment to a vector sequence that consists of a DNA sequence of one of the SEQ ID NO. 13, 14, 15, 16, 17 or 18.

[0055]The invention refers also to a host cell containing a recombinant vector system as specified in SEQ ID NO. 13, 14, 15, 16, 17 or 18.

[0056]A host cell may be any cell that is transformable by a vector sequence. Examples of host cells are: Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae, insect cells, mammalian cell lines (NIH 3T3; COS; Hela etc.) and others.

[0057]A further embodiment of the invention refers to an isolated protein that is encoded by a polynucleotide sequence of SEQ ID NO. 1. This isolated protein can consist of an amino acid sequence of SEQ ID NO. 7. This isolated protein can exhibit the activity of a pak-3a protein.

[0058]A further embodiment of the invention refers to an isolated protein that is encoded by a polynucleotide sequence of SEQ ID NO. 2, 3, 4, 5 or 6. Such an isolated protein can consist of an amino acid sequence of SEQ ID NO. 8, 9, 10, 11 or 12. This isolated protein can exhibit the activity of a pak-3b protein.

[0059]The invention refers also to the use of a host cell containing a recombinant vector system of SEQ ID NO. 13, 14, 15, 16, 17 or 18 for manufacturing a protein having an amino acid sequence of SEQ ID NO. 7 and/or exhibiting the activity of a pak-3a protein, or for manufacturing a protein having an amino acid sequence of SEQ ID NO. 8, 9, 10, 11 or 12 and/or exhibiting the activity of a pak-3b protein. In a further embodiment the invention refers to the use of a host cell containing a recombinant vector system of SEQ ID NO. 13, 14, 15, 16, 17 or 18 in a screening assay for identifying a compound which interacts with a pak-3a protein or a pak-3b protein. When a compound interacts with a protein in the context of the present invention, it shall mean that the compound binds to the protein, or that it stimulates the activity of the protein (activation), or that it diminishes the activity of the protein (inhibition), or that it maintains the activity of the protein, or that it stabilizes the action of the protein.

[0060]The invention further concerns the manufacturing of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12 and/or exhibiting the activity of a pak-3a or pak-3b protein by cultivating a host cell which harbors a recombinant vector sequence of SEQ ID NO. 13, 14, 15, 16, 17 or 18, separating the cells from the cultivation medium after cultivation, lysing the cells, and purifying the protein by means of protein purification techniques. A person skilled in the art will find all protocols required for performing such a method, from cultivation of the cells up to purification of the protein, in a textbook such as "Current Protocols in Protein Science; edited by Coligan J. E., Dunn B. M., Ploegh H. L., Speicher D. W., Wingfield P. T.; Wiley, John & Sons, Inc. (ISBN: 0471140988)".

[0061]In a further embodiment the invention refers to the use of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12 and/or exhibiting the activity of a pak-3a or pak-3b protein for the preparation of an antibody which exhibits binding specificity for such a protein.

[0062]In a further embodiment the invention pertains to the use of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12 and/or exhibiting the activity of a pak-3a or pak-3b protein for the preparation of a medicament for therapy of a disease which is caused by a deficiency, hyperactivation, or malfunction of a mammalian analogous protein of a pak-3a and/or pak-3b protein. Such a mammalian analogous protein may be derived from the human species. It can consist of a kinase protein. The disease involved may be related to a malfunction of the central nervous system, of metabolism, of the cardiovascular system, of the cell division process, or of other cellular or systemic processes.

[0063]The invention refers further to the use of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12 in a screening process for identifying a compound that interacts with a pak-3a or a pak-3b protein. Such a screening process can be organized in the form of a High-Throughput-Screening (HTS). The HTS is based upon automated screening formats by means of laboratory robot systems.

[0064]An embodiment of the invention refers to an assay for identifying a compound that is interacting with a pak-3a and/or a pak-3b protein wherein

[0065]a] a pak-3a and/or a pak-3b protein is provided,

[0066]b] a chemical compound is provided,

[0067]c] the pak-3a and/or the pak-3b protein and the chemical compound are brought in contact,

[0068]d] the binding of the chemical compound to the pak-3a and/or pak-3b protein is determined and/or the activity of the pak-3a and/or pak-3b protein is determined.

[0069]The pak-3a or pak-3b protein can consist of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12. A chemical compound can be provided by means of a chemical synthesis performed in a chemist's laboratory or by an industrial process. A chemical compound can be further provided by isolation from a biological organism (e.g. bacterium, fungus, plant, mammal etc.).

[0070]The pak-3a or pak-3b protein can be provided in form of a host cell which harbors a recombinant vector of SEQ ID NO. 13, 14, 15, 16, 17 or 18 and expresses a protein having the activity of a pak-3a or pak-3b protein. In one embodiment of the invention such a host cell is brought in contact with the chemical compound. In a further embodiment of the invention the assay is used for identifying a compound that is inactivating, or activating, or binding, or maintaining the activity of a pak-3a and/or pak-3b protein.
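Step d] of the assay leaves open how a measured activity is turned into a call of activating or inactivating. The following is a minimal sketch of one possible readout, assuming a kinase activity measured with and without the compound. The function name, the two-fold thresholds (0.5 and 1.5) and the category labels are hypothetical and are not taken from the application.

```python
def classify_compound(control_activity, treated_activity, binds=False,
                      lower=0.5, upper=1.5):
    """Classify a compound from activity with and without the compound.

    control_activity: activity of the protein without compound (must be > 0)
    treated_activity: activity of the protein in contact with the compound
    binds:            result of a separate binding measurement, if one was done
    lower, upper:     hypothetical cut-offs on the treated/control ratio
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    ratio = treated_activity / control_activity
    if ratio <= lower:
        return "inhibitor"
    if ratio >= upper:
        return "activator"
    # Activity is essentially unchanged; binding alone still counts as interaction.
    if binds:
        return "binder without activity change"
    return "no detected interaction"
```

In a screen, such a function would be applied once per compound, with the cut-offs calibrated against replicate controls rather than fixed in advance.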

[0071]The invention concerns further a compound that can be identified by such an assay as well as the use of such a compound as a pharmaceutically active ingredient or the use of such a compound for manufacturing a medicament. Such a compound may have a molecular weight of 100 to 50 000 kDa.

[0072]The invention pertains in a further embodiment to a strain of C. elegans that is exhibiting a loss-of-function phenotype with respect to the pak-3a and/or pak-3b protein. Such a loss-of-function phenotype is detectable by means of Southern or Northern blots in case the gene and/or the mRNA is not expressed. The loss-of-function phenotype is also detectable by western blots in case the protein is not expressed. The determination of the activity of the pak-3a or pak-3b protein proves the loss-of-function phenotype with respect to the pak-3a and/or pak-3b protein in case the organism is not able to produce functional versions of the proteins, or is degrading the proteins rapidly, or contains inhibitors of the proteins. The loss-of-function phenotype of a C. elegans strain with respect to the pak-3a and/or pak-3b protein can be linked to gonad migration, embryonic lethality or sterility.

[0073]In one embodiment of the invention the loss-of-function phenotype of the strain of C. elegans is caused by a mutation or by a partial or complete deletion of the gene coding sequence of pak-3a and/or pak-3b.

[0074]In a further embodiment of the invention the loss-of-function phenotype of the strain of C. elegans is caused by an insertion of a polynucleotide sequence into the gene coding sequence of pak-3a and/or pak-3b.

[0075]In a further embodiment of the invention the loss-of-function phenotype of the strain of C. elegans is caused by a polynucleotide that is selected from the following group:

RNAi (interference RNA), ribozyme, antisense RNA, antisense DNA.

[0076]The inactivation of specific mRNAs upon exposure to double-stranded RNA (dsRNA) can in C. elegans be achieved by several different approaches. Below is a short summary of the main approaches.

By Feeding

[0077]In RNAi by feeding, a cDNA or genomic DNA fragment from the gene of interest is cloned in a plasmid between two opposing T7 RNA polymerase promoter sites. The plasmid is subsequently transformed into an E. coli host strain that contains an inducible T7 RNA polymerase gene, and the E. coli strain obtained is used as a C. elegans food source (Timmons and Fire, 1998, Nature, 395, 854; Timmons et al, 2001, Gene 263, 103-112; Kamath et al, 2000, Genome Biology 2, research0002.1-0002.10).

[0078]The advantage of this approach is that relatively large numbers of worms can be treated for RNAi, and over several generations. One disadvantage is that some RNAs might be toxic to the E. coli. Normally phenotypes are scored initially in the F1 generation, although some phenotypes can occasionally be observed already in the P0 animals.

By Microinjection

[0079]The first described approach for RNAi in C. elegans was the microinjection of dsRNA into the animal body cavity (Fire et al, 1998, Nature 391, 801-811). For this approach dsRNA is obtained by in vitro transcription of a cDNA or genomic DNA fragment cloned into a vector with T3, T7 or SP6 RNA polymerase promoter sites, or from a hybrid PCR product containing both suitable RNA polymerase promoter site(s) and sequence from the gene of interest. Normally the two RNA strands are transcribed separately and subsequently annealed together. Alternatively both strands can be transcribed in one reaction (if the insert has been cloned in both orientations downstream of the same promoter), meaning that a separate annealing step can be left out. The in vitro produced dsRNA is subsequently microinjected into the body of C. elegans animals.

[0080]With this approach the number of animals available for analysis is much lower than with the feeding method. However, it has occasionally been claimed that microinjection has a higher success rate.

By Soaking

[0081]Instead of microinjecting the in vitro transcribed dsRNA the worms can be incubated ("soaked") in high concentrations of dsRNA (Maeda et al 2001, Current Biology, 11, 171-176).

[0082]This approach is less labour intensive than microinjection but is not so commonly used.

By Transgenics

[0083]DsRNA can also be produced in situ in the worms by generating transgenic animals expressing either a hairpin RNA molecule that folds back on itself to form a dsRNA, or two transgenic constructs expressing the two different RNA strands. Although labour intensive, this approach opens the possibility for stably knocked-down RNAs as well as tissue-specific and inducible RNAi, depending on the promoter chosen for driving the RNA expression (Tavernarakis et al, 2000, Nature Genetics, 24, 180-183).

[0084]In a further embodiment of the invention the loss-of-function phenotype of the strain of C. elegans is caused by an inhibitor of the pak-3a and/or pak-3b protein.

[0085]The invention refers also to the use of a strain of C. elegans that is exhibiting a loss-of-function phenotype for identifying a protein of the PAK signaling pathway. Such a protein can be a kinase, a phosphatase, a transcription enhancer, a transcription repressor or any other protein which is able to interact with an intracellular signaling cascade.

[0086]The invention refers further to the use of a strain that is exhibiting a loss-of-function phenotype for identifying a compound that interacts with a protein of the PAK signaling pathway.

[0087]The invention further pertains to a method for generating a C. elegans having a phenotype that is characterized by sterility and/or embryonic lethality and/or a defective gonad migration pattern by

[0088]a] inactivating the pak-1 gene and/or pak-1 protein in a C. elegans and/or inactivating the pak-3 gene and/or pak-3 protein of the same C. elegans and

[0089]b] identifying a C. elegans exhibiting the phenotype of sterility and/or embryonic lethality and/or a defective gonad migration pattern.

[0090]In context of this application the term pak-3 shall include pak-3a and pak-3b. When referring to pak-3 the reference shall pertain to pak-3a and/or pak-3b.

[0091]The inactivation of the pak-1 gene and/or pak-1 protein and of the pak-3 gene and/or pak-3 protein has to be performed in the same C. elegans organism. The inactivation of both genes can occur simultaneously or consecutively. In all cases of inactivation of pak-1 and/or pak-3 to obtain the loss-of-function phenotype, it makes no difference whether the chemical inhibition and/or genetic inactivation of pak-3 is performed before or after the chemical inhibition and/or genetic inactivation of pak-1.

[0092]The identification of a C. elegans exhibiting the phenotype of sterility and/or embryonic lethality and/or a defective gonad migration pattern can occur in offspring of the F1 and/or the F2 and/or a later generation.

[0093]The inactivation of the pak-1 gene and/or the pak-1 protein can be achieved by means of RNA molecules that are suitable for RNA interference with a pak-1 coding polynucleotide. Such RNA molecules can be derived from at least one of the vectors of the following group: pKG61 (SEQ ID NO. 26), pKG71 (SEQ ID NO. 28). For that purpose the corresponding vector is introduced into a bacterial strain such as E. coli, the RNA is transcribed from the plasmid promoter and thereafter isolated from the bacteria and purified. The purified RNA is then brought in contact with a cell of C. elegans, or a part of an organism of C. elegans, or a complete organism of C. elegans.

[0094]A further possibility of inactivating the pak-1 gene and/or the pak-1 protein consists in feeding C. elegans with bacteria that contain RNA molecules suitable for RNA interference with a pak-1 coding polynucleotide. Such bacteria for the feeding of the C. elegans can harbor at least one plasmid of the following group: pKG61 (SEQ ID NO. 26), pKG71 (SEQ ID NO. 28). The RNA is transcribed from these vectors within the bacteria.

[0095]The inactivating of the pak-1 gene and/or the pak-1 protein can be performed by use of a pak-1 knock out strain of C. elegans. Such a knock out strain is e.g. C. elegans RB 689.

[0096]The pak-1 gene and/or pak-1 protein can be inactivated by means of a corresponding antisense RNA, antisense DNA, a ribozyme, an inhibitor of the pak-1 gene transcription or an inhibitor of the pak-1 protein.

[0097]The inactivation of the pak-3 gene and/or pak-3 protein can be achieved by means of RNA molecules that are suitable for RNA interference with a pak-3 coding polynucleotide. Such RNA molecules can be derived from at least one of the vectors of the following group: pKG65 (SEQ ID NO. 27), pKG71 (SEQ ID NO. 28), pKG63 (SEQ ID NO. 29), pKG64 (SEQ ID NO. 30). For that purpose the corresponding vector is introduced into a bacterial strain such as E. coli, the RNA is transcribed from the plasmid promoter and thereafter isolated from the bacteria and purified. The purified RNA is then brought in contact with a cell of C. elegans, or a part of an organism of C. elegans, or a complete organism of C. elegans. A further possibility of inactivating the pak-3 gene and/or the pak-3 protein consists in feeding C. elegans with bacteria that contain RNA molecules suitable for RNA interference with a pak-3 coding polynucleotide. Such bacteria for the feeding of the C. elegans can harbor at least one plasmid of the following group: pKG65 (SEQ ID NO. 27), pKG71 (SEQ ID NO. 28), pKG63 (SEQ ID NO. 29), pKG64 (SEQ ID NO. 30). The RNA is transcribed from these vectors within the bacteria. The inactivation of the pak-3 gene and/or pak-3 protein can also be performed by use of a pak-3 knock out strain of C. elegans. The pak-3 gene and/or pak-3 protein can be inactivated by means of a corresponding antisense RNA, antisense DNA, a ribozyme, an inhibitor of the pak-3 gene transcription or an inhibitor of the pak-3 protein.

[0098]The invention pertains further to a strain of C. elegans which is characterized by a phenotype of sterility and/or embryonic lethality and/or a defective gonad migration and which harbors an impaired or missing pak-1 function and an impaired or missing pak-3 function. In the context of this application the term function shall refer to the gene and/or the protein. The strain of C. elegans of the invention which is characterized by a phenotype of sterility and/or embryonic lethality and/or a defective gonad migration pattern could be obtainable or could be obtained by one or several of the methods for generating a C. elegans having said phenotypes. Such a strain can be used, amongst other things, for characterizing the intracellular signaling cascade linked to pak-1 and/or pak-3. Such a strain can also be used for identifying a compound that interferes with one or several proteins which are part of the signaling cascade linked to pak-1 and/or pak-3. Such a strain can further be used for identification of a compound that interferes with transcription of one or several proteins that are part of the signaling cascade linked to pak-1 and/or pak-3.

[0099]The invention relates also to manufacturing of an RNA molecule wherein at least one of the polynucleotides of pKG61 (SEQ ID NO. 26), pKG65 (SEQ ID NO. 27), pKG71 (SEQ ID NO. 28), pKG63 (SEQ ID NO. 29), pKG64 (SEQ ID NO. 30), pKG167 (SEQ ID NO. 31) or pKG168 (SEQ ID NO. 32) is transformed into a bacterial strain, the RNA is transcribed from the vector and the transcribed RNA is isolated and/or purified. The invention pertains also to RNA molecules that are obtainable or obtained by such a method. These RNA molecules can be used individually or in combination for RNA interference with a pak-1 and/or pak-3 protein coding polynucleotide.

Description of SEQ IDs

[0100]SEQ ID NO. 1 is disclosing the polynucleotide sequence of the pak-3a cDNA. The pak-3a gene is consisting of the coding information of a kinase domain.

[0101]SEQ ID NO. 2 is disclosing the polynucleotide sequence of the pak-3b cDNA. The pak-3b gene is consisting of the coding information of a kinase domain of the same sequence composition as pak-3a and an additional CRIB domain (cdc42/Rac interactive binding domain) which is 5'-linked to the kinase domain.

[0102]SEQ ID NO. 3 is disclosing the polynucleotide sequence of the pak-3b cDNA harboring a silent polymorphism (change from gct to gcc) that would leave the concerned Ala of the corresponding protein unchanged.

[0103]SEQ ID NO. 4 is disclosing the polynucleotide sequence of the pak-3b cDNA harboring the silent polymorphism as described for SEQ ID NO. 3 and further harboring an in-frame 6 bp insertion within the kinase domain.

[0104]SEQ ID NO. 5 is disclosing the polynucleotide sequence of the pak-3b cDNA harboring an in frame 9 bp insert within the CRIB domain and having the corresponding sequence of Exon 7 deleted.

[0105]SEQ ID NO. 6 is disclosing the polynucleotide sequence of the pak-3b cDNA harboring a polymorphism (change from atc to gtc) within the CRIB domain which changes an Ile into a Val in the corresponding protein and harboring the in-frame 6 bp insertion within the kinase domain (the same insertion as in SEQ ID NO. 4).
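The variant types named in the sequence descriptions above (a silent gct to gcc change, an atc to gtc change that converts Ile to Val, and in-frame insertions of 6 bp and 9 bp) can be illustrated with a minimal sketch. It is not part of the disclosure: the helper names are hypothetical, and the codon table is deliberately limited to the four codons mentioned in the text, using the standard genetic code.

```python
# Illustrative sketch only; helper names are hypothetical.
# Codon table restricted to the codons named in the text (standard genetic code).
CODON_TO_AA = {"gct": "Ala", "gcc": "Ala", "atc": "Ile", "gtc": "Val"}


def classify_codon_change(ref: str, alt: str) -> str:
    """Return 'silent' if both codons encode the same amino acid, else 'missense'."""
    same = CODON_TO_AA[ref.lower()] == CODON_TO_AA[alt.lower()]
    return "silent" if same else "missense"


def is_in_frame(insert_length_bp: int) -> bool:
    """An insertion preserves the reading frame when its length is a multiple of 3."""
    return insert_length_bp % 3 == 0


if __name__ == "__main__":
    print("gct -> gcc:", classify_codon_change("gct", "gcc"))
    print("atc -> gtc:", classify_codon_change("atc", "gtc"))
    print("6 bp insertion in frame:", is_in_frame(6))
    print("9 bp insertion in frame:", is_in_frame(9))
```

A 6 bp insertion adds two codons and a 9 bp insertion adds three, so neither shifts the downstream reading frame.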

[0106]SEQ ID NO. 7 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 1 (kinase domain).

[0107]SEQ ID NO. 8 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 2 (kinase domain plus CRIB domain).

[0108]SEQ ID NO. 9 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 3 (kinase domain plus CRIB domain).

[0109]SEQ ID NO. 10 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 4 (kinase domain having a 6 bp insert plus CRIB domain).

[0110]SEQ ID NO. 11 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 5 (kinase domain plus CRIB domain having a 9 bp insert and Exon 7 deleted).

[0111]SEQ ID NO. 12 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 6 (kinase domain having a 6 bp insert plus CRIB domain in which an Ile is changed into a Val).

[0112]SEQ ID NO. 13 is disclosing the polynucleotide sequence of vector pKG40, which is encompassing the polynucleotide sequence of SEQ ID NO. 1. A description of the vector is given within the header of FIG. 13.

[0113]SEQ ID NO. 14 is disclosing the polynucleotide sequence of vector pKG123, which is encompassing the polynucleotide sequence of SEQ ID NO. 2. A description of the vector is given within the header of FIG. 14.

[0114]SEQ ID NO. 15 is disclosing the polynucleotide sequence of vector pKG43, which is encompassing the polynucleotide sequence of SEQ ID NO. 3. A description of the vector is given within the header of FIG. 15.

[0115]SEQ ID NO. 16 is disclosing the polynucleotide sequence of vector pKG44, which is encompassing the polynucleotide sequence of SEQ ID NO. 4. A description of the vector is given within the header of FIG. 16.

[0116]SEQ ID NO. 17 is disclosing the polynucleotide sequence of vector pKG58, which is encompassing the polynucleotide sequence of SEQ ID NO. 5. A description of the vector is given within the header of FIG. 17.

[0117]SEQ ID NO. 18 is disclosing the polynucleotide sequence of vector pKG59, which is encompassing the polynucleotide sequence of SEQ ID NO. 6. A description of the vector is given within the header of FIG. 18.

[0118]SEQ ID NO. 19 is disclosing the primer sequence kg 1.

[0119]SEQ ID NO. 20 is disclosing the primer sequence kg 2.

[0120]SEQ ID NO. 21 is disclosing the primer sequence kg 25.

[0121]SEQ ID NO. 22 is disclosing the primer sequence kg 26.

[0122]SEQ ID NO. 23 is disclosing the primer sequence kg 37.

[0123]SEQ ID NO. 24 is disclosing the primer sequence kg 27.

[0124]SEQ ID NO. 25 is disclosing the primer sequence kg 50.

[0125]SEQ ID NO. 26 is disclosing the polynucleotide sequence of vector pKG61/dT7-pak-1. A description of the vector is given within the header of FIG. 26.

[0126]SEQ ID NO. 27 is disclosing the polynucleotide sequence of vector pKG65/dT7-pak-3. A description of the vector is given within the header of FIG. 27.

[0127]SEQ ID NO. 28 is disclosing the polynucleotide sequence of vector pKG71/dT7-pak-3/pak-1. A description of the vector is given within the header of FIG. 28.

[0128]SEQ ID NO. 29 is disclosing the polynucleotide sequence of vector pKG63/dT7-pak-3a. A description of the vector is given within the header of FIG. 29.

[0129]SEQ ID NO. 30 is disclosing the polynucleotide sequence of vector pKG64/dT7-pak-3b. A description of the vector is given within the header of FIG. 30.

[0130]SEQ ID NO. 31 is disclosing the polynucleotide sequence of vector pKG167/dT7-ced-10. A description of the vector is given within the header of FIG. 31.

[0131]SEQ ID NO. 32 is disclosing the polynucleotide sequence of vector pKG168/dT7-mig-2. A description of the vector is given within the header of FIG. 32.

[0132]SEQ ID NO. 33 is disclosing the polynucleotide sequence of expressed sequence tag (EST) y38f1a.10 of C. elegans.

[0133]SEQ ID NO. 34 is disclosing the polynucleotide sequence of EST yk651h1 5' of C. elegans.

[0134]SEQ ID NO. 35 is disclosing the polynucleotide sequence of EST yk651h1 3' of C. elegans.

[0135]SEQ ID NO. 36 is disclosing the polynucleotide sequence of EST f18a11.4 of C. elegans.

DESCRIPTION OF THE FIGURES

[0136]FIG. 1 exhibits SEQ ID NO. 1.

[0137]FIG. 2 exhibits SEQ ID NO. 2.

[0138]FIG. 3 exhibits SEQ ID NO. 3.

[0139]FIG. 4 exhibits SEQ ID NO. 4.

[0140]FIG. 5 exhibits SEQ ID NO. 5.

[0141]FIG. 6 exhibits SEQ ID NO. 6.

[0142]FIG. 7 exhibits SEQ ID NO. 7.

[0143]FIG. 8 exhibits SEQ ID NO. 8.

[0144]FIG. 9 exhibits SEQ ID NO. 9.

[0145]FIG. 10 exhibits SEQ ID NO. 10.

[0146]FIG. 11 exhibits SEQ ID NO. 11.

[0147]FIG. 12 exhibits SEQ ID NO. 12.

[0148]FIG. 13 exhibits SEQ ID NO. 13.

[0149]FIG. 14 exhibits SEQ ID NO. 14.

[0150]FIG. 15 exhibits SEQ ID NO. 15.

[0151]FIG. 16 exhibits SEQ ID NO. 16.

[0152]FIG. 17 exhibits SEQ ID NO. 17.

[0153]FIG. 18 exhibits SEQ ID NO. 18.

[0154]FIG. 19 exhibits SEQ ID NO. 19.

[0155]FIG. 20 exhibits SEQ ID NO. 20.

[0156]FIG. 21 exhibits SEQ ID NO. 21.

[0157]FIG. 22 exhibits SEQ ID NO. 22.

[0158]FIG. 23 exhibits SEQ ID NO. 23.

[0159]FIG. 24 exhibits SEQ ID NO. 24.

[0160]FIG. 25 exhibits SEQ ID NO. 25.

[0161]FIG. 26 exhibits SEQ ID NO. 26.

[0162]FIG. 27 exhibits SEQ ID NO. 27.

[0163]FIG. 28 exhibits SEQ ID NO. 28.

[0164]FIG. 29 exhibits SEQ ID NO. 29.

[0165]FIG. 30 exhibits SEQ ID NO. 30.

[0166]FIG. 31 exhibits SEQ ID NO. 31.

[0167]FIG. 32 exhibits SEQ ID NO. 32.

[0168]FIG. 33 exhibits SEQ ID NO. 33.

[0169]FIG. 34 exhibits SEQ ID NO. 34.

[0170]FIG. 35 exhibits SEQ ID NO. 35.

[0171]FIG. 36 exhibits SEQ ID NO. 36.

[0172]FIG. 37 Gonad path finding phenotypes (A) and intensing of gonad defects.

ABBREVIATIONS

[0173]NGM--Nematode growth medium

[0174]E. coli OP50--Uracil-requiring strain of E. coli; used as a food source for nematodes

[0175]DMSO--Dimethylsulfoxide

Deposit of Plasmid DNA

[0176]The plasmids of the present invention have been deposited with the DSMZ--Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (German Collection of Microorganisms and Cell Cultures GmbH)

Mascheroder Weg 1b

D-38124 Braunschweig

[0177]according to the following numbers:

plasmid pKG40=DSM 16147 (see also SEQ ID NO. 13 as well as FIG. 13)
plasmid pKG43=DSM 16148 (see also SEQ ID NO. 15 as well as FIG. 15)
plasmid pKG44=DSM 16149 (see also SEQ ID NO. 16 as well as FIG. 16)
plasmid pKG58=DSM 16150 (see also SEQ ID NO. 17 as well as FIG. 17)
plasmid pKG59=DSM 16151 (see also SEQ ID NO. 18 as well as FIG. 18)
plasmid pKG123=DSM 16152 (see also SEQ ID NO. 14 as well as FIG. 14)

EXAMPLES

Strains, General Strain Culture, Molecular & Genetic Methods

[0178]Nematode culture was done according to Brenner 1974.

[0179]Strains were obtained from the Caenorhabditis Genetics Center (CGC) and the C. elegans Gene Knockout Consortium.

Cloning of pak-3

[0180]The different isoforms of pak-3 cDNA were cloned by RT-PCR from N2 (C. elegans wild-type) total RNA. All primers contain 5' adaptor sequences to allow Gateway cloning. RT-PCR products with 5' Gateway adaptor sequences were re-amplified using the attB sequence primers

TABLE-US-00001
kg1: GGGGACAAGTTTGTACAAAAAAGCAGGCT (SEQ ID NO. 19)
kg2: GGGGACCACTTTGTACAAGAAAGCTGGGT (SEQ ID NO. 20)

and cloned via the BP reaction into the vector pDONR201 as described by the manufacturer (Invitrogen).

[0181]Sequence alignments were done using the Lasergene software package. Blast database searches were conducted using the NCBI Blast tool (internal installation). Sequence motifs were identified using the Workbench Pfam HMM database search tool (GeneData AG).

[0182]pKG40 (pak-3a wild-type; DSM 16147) was cloned using gene-specific primers based on the gene prediction for y38f1a.10 (SEQ ID NO. 33): the 5' primer kg25 (aaaaagcaggctcaaaaATGTTTCAAAATAGTCCGATGAT, SEQ ID NO. 21) and the 3' primer kg26 (agaaagctgggtCTACTTTTCTCGGACGGCTCT, SEQ ID NO. 22). (One pak-3a clone was isolated using the 5' SL1 primer kg37 [see below] in combination with kg26. This clone was found to have an ORF identical to that of pKG40 and was not kept.)
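The two-step primer design described above can be checked with a short sketch. It assumes, as the lower-case/upper-case notation of the primers suggests, that lower-case letters mark the adaptor and upper-case letters the gene-specific part. The sketch tests (i) how many nucleotides at the 3' end of each attB primer (kg1, kg2) overlap the 5' adaptor of the gene-specific primers (kg25, kg26), and (ii) whether the gene-specific parts match the ends of the pak-3a cDNA as given in SEQ ID NO. 1. Function names are hypothetical, and only the first 30 and last 21 nucleotides of SEQ ID NO. 1 are reproduced.

```python
# Illustrative sketch only; function names are hypothetical.
ATTB_PRIMERS = {
    "kg1": "GGGGACAAGTTTGTACAAAAAAGCAGGCT",  # SEQ ID NO. 19
    "kg2": "GGGGACCACTTTGTACAAGAAAGCTGGGT",  # SEQ ID NO. 20
}
GENE_PRIMERS = {
    "kg25": "aaaaagcaggctcaaaaATGTTTCAAAATAGTCCGATGAT",  # SEQ ID NO. 21
    "kg26": "agaaagctgggtCTACTTTTCTCGGACGGCTCT",         # SEQ ID NO. 22
}
# First 30 and last 21 nucleotides of SEQ ID NO. 1 (pak-3a cDNA).
SEQ1_5P = "atgtttcaaaatagtccgatgatgtacgac"
SEQ1_3P = "agagccgtccgagaaaagtag"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (result in upper case)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def overlap_3p_5p(outer: str, inner: str) -> int:
    """Longest suffix of `outer` that equals a prefix of `inner` (case-insensitive)."""
    outer, inner = outer.upper(), inner.upper()
    for k in range(min(len(outer), len(inner)), 0, -1):
        if outer[-k:] == inner[:k]:
            return k
    return 0


def gene_specific_part(primer: str) -> str:
    """Upper-case portion of a primer, assumed to be the gene-specific part."""
    return "".join(ch for ch in primer if ch.isupper())


if __name__ == "__main__":
    print("kg1/kg25 overlap:", overlap_3p_5p(ATTB_PRIMERS["kg1"], GENE_PRIMERS["kg25"]))
    print("kg2/kg26 overlap:", overlap_3p_5p(ATTB_PRIMERS["kg2"], GENE_PRIMERS["kg26"]))
    fwd = gene_specific_part(GENE_PRIMERS["kg25"])
    rev = reverse_complement(gene_specific_part(GENE_PRIMERS["kg26"]))
    print("kg25 matches 5' end of SEQ ID NO. 1:", SEQ1_5P.upper().startswith(fwd))
    print("kg26 matches 3' end of SEQ ID NO. 1:", SEQ1_3P.upper().endswith(rev))
```

The overlap is what allows the attB primers to anneal to first-round products and extend the partial adaptors into full-length attB sites.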

[0183]pKG43 (pak-3b SL 1 t1680c; Ala-Ala; DSM 16148) was cloned by a 5' primer corresponding to the SL1 trans-spliced leader sequence (kg37; aaaaagcaggctGGTTTAATTACCCAAGTTTGAG, SEQ ID NO. 23) in conjunction with the 3' primer kg26 (agaaagctgggtcTACTTTTCTCGGACGGCTCT, SEQ ID NO. 24).

[0184]pKG44 (pak-3b Ins1581 t1686c; DSM 16149), pKG58 (pak-3b Ins228 Delta Exon 7) and pKG59 (pak-3b a43g Ins1581) were all cloned by combining a gene-specific 5' primer, kg50 (aaaaagcaggctcaaaaATGTCAACTTCAAAAAGTTCCAAG, SEQ ID NO. 25), derived from the sequence information from pKG43, with the 3' primer kg26 (SEQ ID NO. 22).

[0185]pKG123 (pak-3b wild-type; DSM 16152) was constructed by replacing an EcoRV-SacI restriction fragment (the C-terminal part of the kinase domain) in pKG44, containing deviations from the wild-type sequence, with the corresponding wild-type fragment from pKG58, thus creating a full-length, wild-type cDNA clone.

RNAi

[0186]RNA interference was done using the feeding method as described previously in Fraser et al. 2000 and Kamath et al. 2000. Double RNAi was done either by mixing bacterial cultures before seeding plates or by generation of vector constructs containing two cDNA fragments.

[0187]Vectors for RNAi by feeding were generated by cloning full-length or partial cDNAs into derivatives of the double T7 vector pPD129.36 (Timmons & Fire, Nature 395, 854). Either a Gateway-adapted version (pKG14) was used for cloning according to standard Gateway protocols (Invitrogen), or a version with an SrfI site added to the polylinker (pKG90) was used for direct cloning of PCR products as described (Schlötterer, C. and Wolff, C., Trends Genet. 1996, 12, 286-287).

RNAi Vectors:

[0188]pKG61 (dT7-pak-1); vector: pKG14; insert: bp 1-1710 of pak-1 (entire ORF); the RNA is encoded from 89 to 1753 of SEQ ID NO. 26

[0189]pKG65 (dT7-pak-3); vector: pKG14; insert: bp 1141-1941 of pak-3b (kinase domain); the RNA is encoded from 84 to 883 of SEQ ID NO. 27; specific for both pak-3a and pak-3b

[0190]pKG71 (dT7-pak-3/-1); vector: pKG14; inserts: bp 1141-1941 of pak-3b (kinase domain) and bp 921-1710 of pak-1 (kinase domain); the RNA is encoded from 84 to 884 (pak-1) and 885 to 1674 (pak-3) of SEQ ID NO. 28; for double RNAi against pak-1/pak-3

[0191]pKG63 (dT7-pak-3a); vector: pKG14; insert: bp 1-128 of pak-3a (N-terminus); the RNA is encoded from 89 to 216 of SEQ ID NO. 29; specific for pak-3a

[0192]pKG64 (dT7-pak-3b); vector: pKG14; insert: bp 1-788 of pak-3b (N-terminus); the RNA is encoded from 89 to 876 of SEQ ID NO. 30; specific for pak-3b

[0193]pKG167 (dT7-ced-10); vector: pKG90; insert: bp 40-562 of c09g12.8b (ced-10); the RNA is encoded from 136 to 658 of SEQ ID NO. 31

pKG168 (dT7-mig-2); vector: pKG90; insert: bp 13-566 of c35c5.4 (mig-2); the RNA is encoded from 137 to 689 of SEQ ID NO. 32.
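When working with coordinate lists such as the one above, it can help to compare the length of each cDNA insert with the length of the span said to encode the RNA in the vector sequence. The following is a minimal sketch of such a check. It assumes that all coordinates are 1-based and inclusive, which the text does not state, and the function names are hypothetical.

```python
# Illustrative sketch only; assumes 1-based, inclusive coordinates (an assumption).
# Each entry: vector name, insert spans in the cDNA, RNA-encoding spans in the vector,
# with spans listed in the order in which they appear in the text.
VECTORS = [
    ("pKG61", [(1, 1710)], [(89, 1753)]),
    ("pKG65", [(1141, 1941)], [(84, 883)]),
    ("pKG71", [(1141, 1941), (921, 1710)], [(84, 884), (885, 1674)]),
    ("pKG63", [(1, 128)], [(89, 216)]),
    ("pKG64", [(1, 788)], [(89, 876)]),
    ("pKG167", [(40, 562)], [(136, 658)]),
    ("pKG168", [(13, 566)], [(137, 689)]),
]


def span_length(first: int, last: int) -> int:
    """Length of an inclusive coordinate span."""
    return last - first + 1


def length_report(vectors=VECTORS) -> dict:
    """Map each vector to (insert lengths, encoded-span lengths)."""
    report = {}
    for name, inserts, encoded in vectors:
        report[name] = (
            [span_length(a, b) for a, b in inserts],
            [span_length(a, b) for a, b in encoded],
        )
    return report


if __name__ == "__main__":
    for name, (insert_len, encoded_len) in length_report().items():
        print(f"{name}: insert {insert_len} bp, encoded span {encoded_len} bp")
```

Where the two lengths differ, the coordinates may merit a second look. Such differences can also arise from flanking vector or adaptor sequence, or from a numbering convention other than the one assumed here.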

Assay for Phenotypic Analysis

[0194]Egg lay was scored by placing 5 adult F1 generation worms (10 in the case of 1. pak-1(RNAi); 2. pak-1(ok448); pak-3(RNAi); and 3. pak-1(ok448); pak-3b(RNAi)) on plates (5 plates per RNAi treatment) for 5 h at 20° C., followed by manual counting of the eggs after removal of the adult worms. F1 generation worms are the first-generation progeny of the P0 parents initially exposed to RNAi treatment. Embryonic lethality was defined as the number of eggs remaining (unhatched) 24 h after removal of the adult worms relative to the total number of eggs laid.
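The two readouts defined above reduce to simple ratios, sketched below. The function names and all counts are hypothetical and serve only to show the arithmetic. The sketch assumes that the stated number of adults applies to each plate and that egg lay is normalized per worm before being expressed relative to the control.

```python
# Illustrative sketch only; function names and all counts are hypothetical.

def eggs_per_worm(eggs_per_plate, worms_per_plate):
    """Mean number of eggs laid per adult, pooled over all plates of one treatment."""
    total_worms = worms_per_plate * len(eggs_per_plate)
    if total_worms == 0:
        raise ValueError("at least one plate with at least one worm is required")
    return sum(eggs_per_plate) / total_worms


def relative_egg_lay(treated_eggs_per_worm, control_eggs_per_worm):
    """Egg lay of a treatment as a percentage of the control."""
    return 100.0 * treated_eggs_per_worm / control_eggs_per_worm


def embryonic_lethality(unhatched_after_24h, total_eggs_laid):
    """Percentage of eggs still unhatched 24 h after removal of the adults."""
    return 100.0 * unhatched_after_24h / total_eggs_laid


if __name__ == "__main__":
    # Hypothetical counts from 5 plates with 5 adults each.
    control = eggs_per_worm([50, 40, 60, 50, 50], worms_per_plate=5)
    treated = eggs_per_worm([6, 4, 5, 5, 5], worms_per_plate=5)
    print("relative egg lay (%):", relative_egg_lay(treated, control))
    print("embryonic lethality (%):", embryonic_lethality(5, 25))
```

Normalizing per worm keeps treatments scored with 5 and with 10 adults comparable.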

[0195]Gonad morphology and distal tip cell (DTC) migration were scored essentially as described previously (Nishiwaki 1999, Genetics 152, 985-997; Su et al. 2000, Development 127, 585-594). Briefly, F1 generation late L4 larvae or young adults were observed under Nomarski (DIC) optics using an Axioplan 2 microscope (Sulston and Horvitz 1977, Dev Biol 56, 110-156) and the trajectories of the DTCs were deduced from the shapes of the gonad arms. As a negative control, worms were exposed to bacteria expressing the empty T7 vector. The DTC migration phenotypes were grouped into five different classes (FIG. 37A): I) Wild-type, showing the typical C-shaped gonad with normal 1st and 2nd turns; II) Rac-type, typically observed in ced-10 and mig-2 mutants, with normal 1st and 2nd turns but with an additional 3rd turn, so that the gonadal tip points away from the midbody region (Reddien and Horvitz 2000, Nature Cell Biol 2, 131-136); III) Pak-type, with a normal 1st turn but a 2nd turn in the wrong direction, away from the midbody region (a similar phenotype has been described previously for the mutant mig-14 [Nishiwaki 1999]); IV) Straight, where the gonad progresses without any turns along the ventral side away from the midbody region; V) Other, mainly a complete lack of gonad outgrowth or ruptured gonads with free-floating germ cells in the body.
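The five-class scoring scheme above lends itself to a simple tally. The following minimal sketch converts a list of per-gonad class calls into class percentages and a "% affected" value, taken here as every class other than wild-type. The enum, the function name and the example scores are hypothetical.

```python
# Illustrative sketch only; names and example scores are hypothetical.
from collections import Counter
from enum import Enum


class DtcClass(Enum):
    WILD_TYPE = "wt"        # class I
    RAC = "rac"             # class II: extra 3rd turn
    PAK = "pak"             # class III: 2nd turn in the wrong direction
    STRAIGHT = "straight"   # class IV: no turns
    OTHER = "other"         # class V: no outgrowth or ruptured gonad


def summarize(scores):
    """Return percentages per class plus the percentage of affected (non-wild-type) gonads."""
    n = len(scores)
    if n == 0:
        raise ValueError("no gonads scored")
    counts = Counter(scores)
    summary = {cls.value: 100.0 * counts[cls] / n for cls in DtcClass}
    summary["affected"] = 100.0 * (n - counts[DtcClass.WILD_TYPE]) / n
    summary["n"] = n
    return summary


if __name__ == "__main__":
    # Hypothetical scoring of 100 gonad arms.
    scores = (
        [DtcClass.WILD_TYPE] * 26
        + [DtcClass.PAK] * 4
        + [DtcClass.STRAIGHT] * 51
        + [DtcClass.OTHER] * 19
    )
    print(summarize(scores))
```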

Compound Testing

[0196]For screening of candidate pak-3 inhibitory compounds, synchronized RB689 (pak-1, ok448) L1 larvae were obtained by NaOH/Na-hypochlorite treatment of gravid adults and subsequent overnight incubation of the resulting eggs in M9 buffer at 20° C. with agitation, essentially as described (C. elegans, a practical approach, Ed. I. Hope, 1999). About 30 L1 larvae in NGM medium were mixed with 2 OD600 E. coli OP-50, 100 μM test compound and 1% DMSO (from the compound stock solution) in a final volume of 50 μl per well in flat-bottomed 96-well plates and incubated at 20° C. for 3 to 4 days. Preliminary in-well scoring of gonad phenotypes was done using an Axiovert 200 microscope. For final scoring, worms were pipetted out of the wells, mounted and analyzed under Nomarski (DIC) optics. As a negative control, worms were incubated with 1% DMSO.
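The well composition given above implies a particular compound stock, which can be worked out as follows. This is a sketch under the assumption that the stock is dissolved in 100% DMSO and that all DMSO in the well derives from the stock, as the phrase "from compound stock solution" suggests. The function names are hypothetical.

```python
# Illustrative sketch only; assumes the compound stock is in 100% DMSO.

def stock_volume_ul(final_volume_ul: float, dmso_fraction: float) -> float:
    """Volume of DMSO stock per well that yields the given final DMSO fraction."""
    return final_volume_ul * dmso_fraction


def stock_concentration_mM(final_conc_uM: float, dmso_fraction: float) -> float:
    """Stock concentration needed to reach the final concentration at that dilution."""
    return final_conc_uM / dmso_fraction / 1000.0


if __name__ == "__main__":
    volume = stock_volume_ul(final_volume_ul=50.0, dmso_fraction=0.01)
    conc = stock_concentration_mM(final_conc_uM=100.0, dmso_fraction=0.01)
    print(f"stock per well: {volume:.2f} ul")
    print(f"stock concentration: {conc:.1f} mM")
```

Under these assumptions, a 1:100 dilution of the stock into a 50 μl well corresponds to 0.5 μl of a 10 mM stock per well.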

Cloning of Pak-3 cDNAs and Identification of the Pak-3 Gene

[0197]The initial indication of a hitherto unknown pak gene in C. elegans came from the identification of a predicted open reading frame, y38f1a.10 (SEQ ID NO. 33), encoding a kinase domain with high homology to a pak-type kinase domain. (The kinase domain is also classified as a pak-type kinase domain in the kinase database "kinase.com" located on the world wide web.) However, the predicted ORF y38f1a.10 does not encode a CRIB domain, the regulatory domain conserved within the PAK gene family. We noticed by sequence comparison that the EST yk651h1 (SEQ ID NO. 34, 35), covering the y38f1a.10 ORF, also contains parts of the predicted ORF f18a11.4 (SEQ ID NO. 36), located upstream of y38f1a.10. This suggested to us that these two predicted ORFs might in fact be one single gene. To test this we performed RT-PCR using a 3' gene-specific primer corresponding to the 3' end of y38f1a.10 and a 5' primer corresponding to the SL1 leader sequence spliced in trans to many C. elegans mRNAs. As it has been reported that the C. elegans pak-1 mRNA is SL1 trans-spliced, we suspected that this might also be the case for mRNAs from other C. elegans pak genes.

[0198]Sequence analysis of the RT-PCR products obtained revealed two different classes of mRNAs. The first class (isoforms a) corresponds roughly to the predicted ORF y38f1a.10 but with an additional exon upstream of the kinase domain. The second class (isoforms b) spans both ORFs y38f1a.10 and f18a11.4, thus demonstrating that these two ORFs belong to one single gene. However, the mRNAs have a 5' region longer than predicted in f18a11.4 and also longer than the EST yk651h1. Sequence analysis revealed a splicing pattern different from the ORF f18a11.4 and most importantly a domain with homology to a CRIB domain. Blast sequence database searches with cDNA isoforms b yielded a highest similarity score against human and rodent PAK3 and secondly against human and rodent PAK1.

[0199]Taken together this demonstrates that the two predicted ORFs y38f1a.10 (SEQ ID NO. 33) and f18a11.4 (SEQ ID NO. 36) are in fact one gene that codes for two different mRNA splice variants, a short form encoding a protein mainly consisting of a pak-type kinase domain and a 5' longer form encoding a typical PAK protein. Based on sequence similarity and biological function (see below) we propose to call this novel pak gene pak-3 with the short splice variant denoted pak-3a and the long form pak-3b.

RNAi in C. elegans Strains N2 (Wild Type) and RB689 (pak-1)

[0200]To assess the biological function of pak-3, RNAi by feeding experiments were performed. However, no obvious phenotypes could be detected when N2 wild-type worms were used for RNAi. Similarly, no phenotypes were observed when pak-1 function was assayed by RNAi, which was corroborated by observation of the pak-1 knock-out strain RB689, appearing completely wild-type in morphology and behavior.

[0201]Based on the similarity between pak-1 and pak-3 it was concluded that the lack of phenotypes in the RNAi experiments could be explained by redundant functions of pak-1 and pak-3. To confirm this, double RNAi experiments were conducted in the N2 background, as well as pak-3 RNAi in the pak-1 knock-out strain RB689 (ok448). Similar results were obtained in both approaches, showing several drastic phenotypes: sterility, embryonic lethality and defects in the gonad migration pattern. Sterility was not completely penetrant but was reproducibly shown to be very strong and readily visible. The result of a representative quantitative experiment is shown in Table I. Compared to the control worms exposed to mock RNAi treatment, the relative number of eggs laid by animals exposed to pak-1; pak-3 double RNAi was only 17% when double RNAi was performed by mixing pak-1 and pak-3 RNAi bacterial cultures. When double RNAi was done using bacteria expressing a hybrid pak-1/pak-3 double RNA molecule the effect was somewhat stronger, 11%, suggesting that double RNAi by mixing of cultures is only moderately less efficient than the use of a dedicated double RNAi vector.

[0202]In pak-1(ok448); pak-3(RNAi) animals sterility was even more penetrant: egg lay was only 3% of that of pak-1(ok448); mock(RNAi) animals. Compared with the results obtained in the N2 background, this indicates that pak-1 RNAi is not as penetrant as the complete knock-out, as can be expected.

[0203]Embryonic lethality was initially observed from the presence of small, round eggs that did not hatch upon prolonged incubation. Closer examination of these eggs suggested a high degree of cellular differentiation, for example muscle and pharynx tissue was clearly present. However, the overall morphology of the embryos was distorted, ranging from moderate to very severe with no morphological features conserved. A quantitative analysis (Table I) demonstrated more than 20% embryonic lethality in N2 animals and almost 40% in the RB689 background.

[0204]Interestingly, the phenotypes could not be observed in the first generation (P0) of worms exposed to RNAi; sterility and gonad defects were first observed in the F1 generation. Embryonic lethality was first seen in F2 generation embryos, suggesting maternal rescue in the F1 generation.

[0205]The cloning of pak-3 cDNAs had revealed the existence of two different splice variants, pak-3a and pak-3b. The functional importance of the two forms was assessed by conducting isoform-specific RNAi in the RB689 background. As judged from both the sterility phenotype and embryonic lethality, it appears that the longer isoform pak-3b may play the major role with respect to the phenotypes observed (Table I).

[0206]A third pak gene is encoded by the predicted gene c45b11.1, which is most similar to human PAK-4. It is known that mammalian PAK-4 differs significantly in regulation and function from PAK1 and PAK3. In agreement with this, no additional phenotypes were observed in double RNAi experiments between c45b11.1 and pak-1 or pak-3.

pak-3 and pak-1 are Required for DTC Pathfinding

[0207]In the C. elegans hermaphrodite the shape of the bi-lobed gonad is determined by the migration paths of the gonadal distal tip cells (DTCs). In wild-type animals the two gonadal arms develop from the ventrally located gonadal primordium in the midbody. One DTC migrates anteriorly and the other posteriorly, close to the ventral midline. The migration of the DTCs then undergoes two turns, the first turn towards the dorsal side and the second turn towards the midbody. The result of these migrations is the formation of the two symmetrical C-shaped adult gonad arms. As mentioned above, deviations from the wild-type gonad shape, indicative of defects in DTC migration, were noted in pak-3(RNAi); pak-1(RNAi) and pak-3(RNAi); RB689(ok448) animals, but not in single pak-3(RNAi) or pak-1(RNAi) animals or in the RB689 strain itself. Thus, for this phenotype too, pak-1 and pak-3 appear to act redundantly. In more than half of the gonads observed the first turn appeared normal whereas the second turn was in the wrong direction, i.e. instead of turning towards the midbody, the posterior gonad continued posteriorly and the anterior continued anteriorly (FIG. 37). Occasionally gonads without any turns were observed, the gonads continuing along the ventral midline towards the posterior and anterior end of the animal, respectively.

[0208]The pak-3 isoform-specific effects on DTC migration were also analyzed by pak-3b and pak-3a RNAi in the RB689 background. The results demonstrate that only the pak-3b isoform is important for DTC migration, as is the case for the sterility and embryonic lethality phenotypes (Table II).

Pak-Rac Interaction

[0209]It has previously been described that two of the three Rac GTPases in C. elegans, ced-10 and mig-2, are involved in DTC pathfinding (Reddien & Horvitz, 2000; Lundquist et al. 2001). In ced-10 and mig-2 mutants the gonads undergo a third, extra turn after the second turn, leading to gonad tips pointing away from the midbody. This phenotype is different from what was observed in pak-1; pak-3 mutant animals, in which the second turn was already defective. It is furthermore known from in vitro studies and mammalian cell systems (e.g. Bishop & Hall, Biochem J, 2000) that Rho GTPases, to which the Rac proteins belong, are upstream regulators of PAKs. Given that the C. elegans paks are also important for DTC pathfinding, it was deduced that there is an interaction between pak-1 and pak-3 and the two Racs ced-10 and mig-2 in C. elegans gonad development. To demonstrate this, a set of RNAi experiments was performed in different genetic backgrounds (summarized in Table II). The different experimental combinations consistently showed that mig-2 or ced-10 loss of function in combination with pak-3 did not lead to a stronger phenotype than the separate single loss-of-function mutants. However, in combination with pak-1 mutants the penetrance and severity of the gonad migration defects increased dramatically. As pak-1 and pak-3 act redundantly, these results suggest that ced-10 and mig-2 act as upstream regulators of pak-3 but not, or only to a minor extent, of pak-1. Interestingly, the ced-10; mig-2; pak-1 triple mutant animals were much more strongly affected than pak-1; pak-3 double mutants, suggesting that the two Racs also act in parallel through pathways other than pak-3. Furthermore, not only was the penetrance of DTC pathfinding defects higher in ced-10; mig-2; pak-1 animals, but the phenotypic spectrum also shifted towards more severe pathfinding and migration defects. High frequencies of gonads without turns, and also of gonad movement defects, were observed. This demonstrates that both Paks and Racs are involved in several stages and aspects of DTC pathfinding but that these functions are not evident in the single or double mutants, probably as an effect of the redundant functions of these genes.
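The reasoning above compares the penetrance of a double loss of function with that of the corresponding single treatments. A minimal sketch of this comparison follows, using the "% affected gonads" values reported in Table II for RNAi in the N2 background. The enhancement measure (double minus the stronger single) and the function name are illustrative choices rather than the analysis used in the application, and the sketch applies no statistical test.

```python
# Illustrative sketch only. Values are "% affected gonads" from Table II (N2 background);
# the enhancement measure and the function name are hypothetical choices.
AFFECTED = {
    frozenset({"pak-1"}): 0.0,
    frozenset({"pak-3"}): 0.7,
    frozenset({"mig-2"}): 8.8,
    frozenset({"ced-10"}): 9.5,
    frozenset({"mig-2", "pak-1"}): 25.0,
    frozenset({"mig-2", "pak-3"}): 11.2,
    frozenset({"ced-10", "pak-1"}): 50.8,
    frozenset({"ced-10", "pak-3"}): 3.4,
}


def enhancement(gene_a: str, gene_b: str) -> float:
    """Percentage points by which the double RNAi exceeds the stronger single RNAi."""
    double = AFFECTED[frozenset({gene_a, gene_b})]
    strongest_single = max(AFFECTED[frozenset({gene_a})], AFFECTED[frozenset({gene_b})])
    return double - strongest_single


if __name__ == "__main__":
    for rac in ("mig-2", "ced-10"):
        for pak in ("pak-1", "pak-3"):
            print(f"{rac} + {pak}: {enhancement(rac, pak):+.1f} percentage points")
```

On these values, the combinations with pak-1 show a marked increase over the single treatments while those with pak-3 do not, which is the pattern the paragraph above interprets as the Racs acting upstream of pak-3.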

Compound Mimicking RNAi Phenotype

[0210]To investigate whether the gonad migration defect phenotype can be used as a reporter for PAK-3 inhibitory small molecules, worms were exposed to a set of potential PAK inhibitors, derived from a chemical compound collection, in a 96-well assay format. Synchronized RB689 (pak-1, ok448) L1 larvae were incubated with test compounds in NGM (media) with E. coli OP-50 as food source. As the compounds were added as DMSO solutions, worms exposed to DMSO were used as a control. At the late L4 or early young adult stage, gonad phenotypes were scored.

[0211]Several of the 14 substances tested showed a partial effect on gonad migration, causing phenotypes similar to those seen in pak-1; pak-3 (RNAi) animals. In particular, one compound tested, A000025706, was shown to reproducibly cause gonad migration defects (Table III). Out of 100 gonads analyzed, 74 were found to have gonad migration defects.

[0212]The observation that the types of defects differ somewhat from those observed with RNAi is possibly due to pharmacological properties of the compound, e.g. uptake and stability. It is also possible that other kinases involved in gonad development and other developmental processes are also inhibited by A000025706. In fact, we observed general growth retardation in worms treated with this compound, suggesting a certain degree of non-specific effects of A000025706. However, as A000025706 has been confirmed as a PAK inhibitor in other assays (data not shown), we believe that most or all of the gonad migration defects observed can be attributed to a specific inhibition of PAK-3.

[0213]The observation that PAK inhibitors can be identified in a C. elegans-based assay demonstrates the usefulness of this model organism as a tool for pharmacological research. The fact that growth retardation was observed also exemplifies that potential side effects can be identified in parallel to the specific assay readout, i.e. that C. elegans-based assays can be valuable as high-throughput screening systems.

TABLE-US-00002 TABLE I

Strain | RNAi treatment | Genotype | % Eggs laid | % Emb Lethality
N2 | ctrl | Wild-type | 100 | 0.2
N2 | pak-1 | pak-1(RNAi) | 113 | 2.2
N2 | pak-3 | pak-3(RNAi) | 69 | 5.0
N2 | pak-1/-3 (one vector) | pak-1(RNAi); pak-3(RNAi) | 11 | 20.8
N2 | pak-1/pak-3 (mixed vectors) | pak-1(RNAi); pak-3(RNAi) | 17 | 15.4
RB689 | ctrl | pak-1(ok448) | 100 | 4.4
RB689 | pak-3 | pak-1(ok448); pak-3(RNAi) | 3 | 38.7
RB689 | pak-3a | pak-1(ok448); pak-3a(RNAi) | 134 | 4.0
RB689 | pak-3b | pak-1(ok448); pak-3b(RNAi) | 2 | 22.7

TABLE-US-00003 TABLE II

Header groups as given in the original: movement defects; pathfinding defects.

Strain | RNAi construct | Genotype | % wt | % rac | % pak | % straight | % other | % affected gonads | n
N2 | ctrl | Wild type | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 168
N2 | pak-1 | pak-1(RNAi) | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 154
N2 | pak-3 | pak-3(RNAi) | 99.3 | 0.0 | 0.0 | 0.7 | 0.0 | 0.7 | 150
N2 | pak-1/-3 (one vector) | pak-1(RNAi); pak-3(RNAi) | 41.7 | 0.0 | 54.3 | 3.5 | 0.4 | 58.3 | 230
N2 | pak-1/pak-3 (mixed vectors) | pak-1(RNAi); pak-3(RNAi) | 44.6 | 0.0 | 52.2 | 3.3 | 0.0 | 55.4 | 92
N2 | mig-2 | mig-2(RNAi) | 91.2 | 8.1 | 0.7 | 0.0 | 0.0 | 8.8 | 136
N2 | mig-2/pak-1 | mig-2(RNAi); pak-1(RNAi) | 75.0 | 1.6 | 22.6 | 0.8 | 0.0 | 25.0 | 124
N2 | mig-2/pak-3 | mig-2(RNAi); pak-3(RNAi) | 88.8 | 7.2 | 2.0 | 2.0 | 0.0 | 11.2 | 152
N2 | mig-2/ced-10 | mig-2(RNAi); ced-10(RNAi) | 92.7 | 5.6 | 1.6 | 0.0 | 0.0 | 7.3 | 124
N2 | ced-10 | ced-10(RNAi) | 90.5 | 8.8 | 0.7 | 0.0 | 0.0 | 9.5 | 148
N2 | ced-10/pak-1 | ced-10(RNAi); pak-1(RNAi) | 49.2 | 1.6 | 36.7 | 12.5 | 0.0 | 50.8 | 128
N2 | ced-10/pak-3 | ced-10(RNAi); pak-3(RNAi) | 96.6 | 3.4 | 0.0 | 0.0 | 0.0 | 3.4 | 146
RB689 pak-1(ok448) | ctrl | pak-1(ok448) | 99.5 | 0.0 | 0.5 | 0.0 | 0.0 | 0.5 | 198
RB689 pak-1(ok448) | pak-3 | pak-1(ok448); pak-3(RNAi) | 44.1 | 0.0 | 49.6 | 4.7 | 1.7 | 55.9 | 236
RB689 pak-1(ok448) | pak-3a | pak-1(ok448); pak-3a(RNAi) | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 80
RB689 pak-1(ok448) | pak-3b | pak-1(ok448); pak-3b(RNAi) | 31.7 | 0.0 | 54.9 | 2.4 | 11.0 | 68.3 | 82
RB689 pak-1(ok448) | mig-2 | pak-1(ok448); mig-2(RNAi) | 55.6 | 0.0 | 33.8 | 10.6 | 0.0 | 44.4 | 160
RB689 pak-1(ok448) | mig-2/pak-3 | pak-1(ok448); pak-3(RNAi); mig-2(RNAi) | 57.8 | 0.0 | 24.7 | 16.9 | 0.6 | 42.2 | 154
RB689 pak-1(ok448) | mig-2/ced-10 | pak-1(ok448); mig-2(RNAi); ced-10(RNAi) | 29.8 | 0.0 | 13.7 | 55.6 | 0.8 | 70.2 | 124
RB689 pak-1(ok448) | ced-10 | pak-1(ok448); ced-10(RNAi) | 37.8 | 1.2 | 32.3 | 28.0 | 0.6 | 62.2 | 164
RB689 pak-1(ok448) | ced-10/pak-3 | pak-1(ok448); pak-3(RNAi); ced-10(RNAi) | 43.8 | 3.1 | 34.6 | 18.5 | 0.0 | 56.2 | 162
CF162 mig-2(mu28) | ctrl | mig-2(mu28) | 71.4 | 28.6 | 0.0 | 0.0 | 0.0 | 28.6 | 126
CF162 mig-2(mu28) | pak-1 | pak-1(RNAi); mig-2(mu28) | 40.5 | 3.2 | 43.7 | 9.5 | 3.2 | 59.5 | 126
CF162 mig-2(mu28) | pak-3 | pak-3(RNAi); mig-2(mu28) | 74.6 | 23.0 | 0.0 | 2.4 | 0.0 | 25.4 | 126
CF162 mig-2(mu28) | pak-1/-3 | pak-1(RNAi); pak-3(RNAi); mig-2(mu28) | 5.5 | 0.8 | 43.0 | 48.4 | 2.3 | 94.5 | 128
CF162 mig-2(mu28) | mig-2 | mig-2(mu28); mig-2(RNAi) | 65.9 | 34.1 | 0.0 | 0.0 | 0.0 | 34.1 | 88
CF162 mig-2(mu28) | ced-10 | mig-2(mu28); ced-10(RNAi) | 59.8 | 14.8 | 13.9 | 7.4 | 4.1 | 40.2 | 122
CF162 mig-2(mu28) | ced-10/pak-1 | pak-1(RNAi); mig-2(mu28); ced-10(RNAi) | 14.1 | 4.7 | 20.3 | 50.0 | 10.9 | 85.9 | 128
CF162 mig-2(mu28) | ced-10/pak-3 | pak-3(RNAi); mig-2(mu28); ced-10(RNAi) | 74.6 | 14.8 | 4.1 | 2.5 | 4.1 | 25.4 | 122
MT5013 ced-10(n1993) | ctrl | ced-10(n1993) | 77.3 | 21.4 | 0.0 | 0.6 | 0.6 | 22.7 | 154
MT5013 ced-10(n1993) | pak-1 | pak-1(RNAi); ced-10(n1993) | 47.4 | 1.3 | 36.4 | 14.3 | 0.6 | 52.6 | 154
MT5013 ced-10(n1993) | pak-3 | pak-3(RNAi); ced-10(n1993) | 82.5 | 14.3 | 1.3 | 1.3 | 0.6 | 17.5 | 154
MT5013 ced-10(n1993) | pak-1/-3 | pak-1(RNAi); pak-3(RNAi); ced-10(n1993) | 49.7 | 0.7 | 43.0 | 2.6 | 4.0 | 50.3 | 151
MT5013 ced-10(n1993) | mig-2 | mig-2(RNAi); ced-10(n1993) | 81.7 | 6.3 | 4.8 | 1.6 | 5.6 | 18.3 | 126
MT5013 ced-10(n1993) | mig-2/pak-1 | pak-1(RNAi); mig-2(RNAi); ced-10(n1993) | 11.3 | 2.4 | 16.1 | 50.8 | 19.4 | 88.7 | 124
MT5013 ced-10(n1993) | mig-2/pak-3 | pak-3(RNAi); mig-2(RNAi); ced-10(n1993) | 87.1 | 8.9 | 0.0 | 0.0 | 4.0 | 12.9 | 124
MT5013 ced-10(n1993) | ced-10 | ced-10(n1993); ced-10(RNAi) | 75.0 | 9.2 | 5.3 | 0.0 | 10.5 | 25.0 | 76

TABLE-US-00004 TABLE III

Strain | Cpd | % wt | % pak-like | % Straight | % Movement Def | % affected | n
RB689 pak-1(ok448) | ctrl | 99 | 0 | 0 | 1 | 1 | 80
RB689 pak-1(ok448) | 25706 | 26 | 4 | 51 | 19 | 74 | 100

Sequence CWU 1

3611281DNACaenorhabditis elegans 1atgtttcaaa atagtccgat gatgtacgac tggtggaatg acaccaccaa accgaaacac 60cagcagccga cacttaacgt gttgtcacca tggggagcat atttcaatca cattggaaat 120gaactgctgc atctgaaaat cgcatcgtcg acagtatcct cgggatgctc gtctccacaa 180cagtattcgt ctgctcgatc cgttggtaac tcgctctcca acggcagtgt tgtctccaca 240acatcgtcag atggtgatgt gcaattgtcg aataaggaaa attcgaatga caaatcagtt 300ggagacaaga atgggaacac caccacaaac aaaacgaccg tcgaaccacc tccaccagaa 360gagccacctg ttcgtgttcg agcatctcat cgtgaaaagc tttctgattc cgaagtgctc 420aatcaactcc gcgagattgt taatccaagt aatccacttg gaaagtacga gatgaagaag 480caaatcggtg ttggagcatc cggaactgta ttcgttgcta atgtggccgg cagcactgat 540gtggtggctg tgaagagaat ggctttcaag actcagccga agaaggagat gttgctcacc 600gagattaagg ttatgaagca gtatcgacac ccgaacctcg tcaactacat tgaatcgtat 660ctggttgatg ctgatgatct ttgggtagtg atggattatc tggaaggtgg aaacttgaca 720gatgtcgttg tgaagactga gttggacgaa ggacaaattg cagcagtttt gcaagaatgt 780cttaaagcgc ttcacttcct tcatagacac tccatagtgc accgagatat caagagtgac 840aacgtgctgc tcggcatgaa cggagaggtt aagctcaccg atatgggatt ctgtgctcag 900attcagccgg gatcgaaaag agatactgtc gtcggaactc catattggat gtcgccggag 960atattgaaca agaagcagta caactataag gttgacattt ggtcgctggg aattatggct 1020ctagagatga ttgatggaga gccaccatat ttgagagaaa cacctttgaa ggctatctac 1080ttgattgctc aaaacgggaa gccagagatc aagcaacgcg acagactgtc ttcagagttc 1140aacaatttcc ttgacaagtg tcttgttgtt gatccggatc agagagccga tacaacggag 1200ctcttggcac atccattcct gaaaaaggcg aagccactct caagcctgat tccatacatc 1260agagccgtcc gagaaaagta g 128121941DNACaenorhabditis elegans 2atgtcaactt caaaaagttc caaggtgcga atacggaatt tcatcgggcg aatcttctct 60cccagcgata aagacaagga tcgagacgat gagatgaagc catcctcgtc cgcaatggat 120attagtcagc catataacac agtgcatcga gtccacgttg gatacgacgg ccagaagttc 180agcggactgc cgcaaccatg gatggatatt cttctccgag acattagtct tgccgatcag 240aagaaggatc cgaacgcggt ggtgactgcg ttgaagttct acgcacaatc aatgaaggag 300aacgagaaga cgaaattcat gacgacgaat agtgttttca cgaatagcga tgacgatgat 360gtggacgttc agttgaccgg acaagtcacg 
gaacatttga ggaatttgca gtgtagtaat 420ggttccgcaa cttccccatc tacatcagtg tcagcttcat cttcttctgc tcgtccactg 480acaaatggaa ataatcatct ttccacggcg tcgtctaccg acacatctct ctcattatcg 540gaaaggaata acgttccgtc tccagctcca gttccatata gtgaaagtgc tccacaactg 600aaaacattca ccggagagac tccaaaactg catccacgat ctccgttccc gcctcaaccg 660ccagttcttc cgcaacgaag caaaaccgca tcggcagtgg cgacgacgac gacgaatccg 720acgacttcga atggagcacc accaccagtt cctggatcga aaggaccccc ggtgccaccg 780aaaccatcgc atctgaaaat cgcatcgtcg acagtatcct cgggatgctc gtctccacaa 840cagtattcgt ctgctcgatc cgttggtaac tcgctctcca acggcagtgt tgtctccaca 900acatcgtcag atggtgatgt gcaattgtcg aataaggaaa attcgaatga caaatcagtt 960ggagacaaga atgggaacac caccacaaac aaaacgaccg tcgaaccacc tccaccagaa 1020gagccacctg ttcgtgttcg agcatctcat cgtgaaaagc tttctgattc cgaagtgctc 1080aatcaactcc gcgagattgt taatccaagt aatccacttg gaaagtacga gatgaagaag 1140caaatcggtg ttggagcatc cggaactgta ttcgttgcta atgtggccgg cagcactgat 1200gtggtggctg tgaagagaat ggctttcaag actcagccga agaaggagat gttgctcacc 1260gagattaagg ttatgaagca gtatcgacac ccgaacctcg tcaactacat tgaatcgtat 1320ctggttgatg ctgatgatct ttgggtagtg atggattatc tggaaggtgg aaacttgaca 1380gatgtcgttg tgaagactga gttggacgaa ggacaaattg cagcagtttt gcaagaatgt 1440cttaaagcgc ttcacttcct tcatagacac tccatagtgc accgagatat caagagtgac 1500aacgtgctgc tcggcatgaa cggagaggtt aagctcaccg atatgggatt ctgtgctcag 1560attcagccgg gatcgaaaag agatactgtc gtcggaactc catattggat gtcgccggag 1620atattgaaca agaagcagta caactataag gttgacattt ggtcgctggg aattatggct 1680ctagagatga ttgatggaga gccaccatat ttgagagaaa cacctttgaa ggctatctac 1740ttgattgctc aaaacgggaa gccagagatc aagcaacgcg acagactgtc ttcagagttc 1800aacaatttcc ttgacaagtg tcttgttgtt gatccggatc agagagccga tacaacggag 1860ctcttggcac atccattcct gaaaaaggcg aagccactct caagcctgat tccatacatc 1920agagccgtcc gagaaaagta g 194131941DNACaenorhabditis elegans 3atgtcaactt caaaaagttc caaggtgcga atacggaatt tcatcgggcg aatcttctct 60cccagcgata aagacaagga tcgagacgat gagatgaagc catcctcgtc cgcaatggat 120attagtcagc catataacac 
agtgcatcga gtccacgttg gatacgacgg ccagaagttc 180agcggactgc cgcaaccatg gatggatatt cttctccgag acattagtct tgccgatcag 240aagaaggatc cgaacgcggt ggtgactgcg ttgaagttct acgcacaatc aatgaaggag 300aacgagaaga cgaaattcat gacgacgaat agtgttttca cgaatagcga tgacgatgat 360gtggacgttc agttgaccgg acaagtcacg gaacatttga ggaatttgca gtgtagtaat 420ggttccgcaa cttccccatc tacatcagtg tcagcttcat cttcttctgc tcgtccactg 480acaaatggaa ataatcatct ttccacggcg tcgtctaccg acacatctct ctcattatcg 540gaaaggaata acgttccgtc tccagctcca gttccatata gtgaaagtgc tccacaactg 600aaaacattca ccggagagac tccaaaactg catccacgat ctccgttccc gcctcaaccg 660ccagttcttc cgcaacgaag caaaaccgca tcggcagtgg cgacgacgac gacgaatccg 720acgacttcga atggagcacc accaccagtt cctggatcga aaggaccccc ggtgccaccg 780aaaccatcgc atctgaaaat cgcatcgtcg acagtatcct cgggatgctc gtctccacaa 840cagtattcgt ctgctcgatc cgttggtaac tcgctctcca acggcagtgt tgtctccaca 900acatcgtcag atggtgatgt gcaattgtcg aataaggaaa attcgaatga caaatcagtt 960ggagacaaga atgggaacac caccacaaac aaaacgaccg tcgaaccacc tccaccagaa 1020gagccacctg ttcgtgttcg agcatctcat cgtgaaaagc tttctgattc cgaagtgctc 1080aatcaactcc gcgagattgt taatccaagt aatccacttg gaaagtacga gatgaagaag 1140caaatcggtg ttggagcatc cggaactgta ttcgttgcta atgtggccgg cagcactgat 1200gtggtggctg tgaagagaat ggctttcaag actcagccga agaaggagat gttgctcacc 1260gagattaagg ttatgaagca gtatcgacac ccgaacctcg tcaactacat tgaatcgtat 1320ctggttgatg ctgatgatct ttgggtagtg atggattatc tggaaggtgg aaacttgaca 1380gatgtcgttg tgaagactga gttggacgaa ggacaaattg cagcagtttt gcaagaatgt 1440cttaaagcgc ttcacttcct tcatagacac tccatagtgc accgagatat caagagtgac 1500aacgtgctgc tcggcatgaa cggagaggtt aagctcaccg atatgggatt ctgtgctcag 1560attcagccgg gatcgaaaag agatactgtc gtcggaactc catattggat gtcgccggag 1620atattgaaca agaagcagta caactataag gttgacattt ggtcgctggg aattatggcc 1680ctagagatga ttgatggaga gccaccatat ttgagagaaa cacctttgaa ggctatctac 1740ttgattgctc aaaacgggaa gccagagatc aagcaacgcg acagactgtc ttcagagttc 1800aacaatttcc ttgacaagtg tcttgttgtt gatccggatc agagagccga tacaacggag 
1860ctcttggcac atccattcct gaaaaaggcg aagccactct caagcctgat tccatacatc 1920agagccgtcc gagaaaagta g 194141947DNACaenorhabditis elegans 4atgtcaactt caaaaagttc caaggtgcga atacggaatt tcatcgggcg aatcttctct 60cccagcgata aagacaagga tcgagacgat gagatgaagc catcctcgtc cgcaatggat 120attagtcagc catataacac agtgcatcga gtccacgttg gatacgacgg ccagaagttc 180agcggactgc cgcaaccatg gatggatatt cttctccgag acattagtct tgccgatcag 240aagaaggatc cgaacgcggt ggtgactgcg ttgaagttct acgcacaatc aatgaaggag 300aacgagaaga cgaaattcat gacgacgaat agtgttttca cgaatagcga tgacgatgat 360gtggacgttc agttgaccgg acaagtcacg gaacatttga ggaatttgca gtgtagtaat 420ggttccgcaa cttccccatc tacatcagtg tcagcttcat cttcttctgc tcgtccactg 480acaaatggaa ataatcatct ttccacggcg tcgtctaccg acacatctct ctcattatcg 540gaaaggaata acgttccgtc tccagctcca gttccatata gtgaaagtgc tccacaactg 600aaaacattca ccggagagac tccaaaactg catccacgat ctccgttccc gcctcaaccg 660ccagttcttc cgcaacgaag caaaaccgca tcggcagtgg cgacgacgac gacgaatccg 720acgacttcga atggagcacc accaccagtt cctggatcga aaggaccccc ggtgccaccg 780aaaccatcgc atctgaaaat cgcatcgtcg acagtatcct cgggatgctc gtctccacaa 840cagtattcgt ctgctcgatc cgttggtaac tcgctctcca acggcagtgt tgtctccaca 900acatcgtcag atggtgatgt gcaattgtcg aataaggaaa attcgaatga caaatcagtt 960ggagacaaga atgggaacac caccacaaac aaaacgaccg tcgaaccacc tccaccagaa 1020gagccacctg ttcgtgttcg agcatctcat cgtgaaaagc tttctgattc cgaagtgctc 1080aatcaactcc gcgagattgt taatccaagt aatccacttg gaaagtacga gatgaagaag 1140caaatcggtg ttggagcatc cggaactgta ttcgttgcta atgtggccgg cagcactgat 1200gtggtggctg tgaagagaat ggctttcaag actcagccga agaaggagat gttgctcacc 1260gagattaagg ttatgaagca gtatcgacac ccgaacctcg tcaactacat tgaatcgtat 1320ctggttgatg ctgatgatct ttgggtagtg atggattatc tggaaggtgg aaacttgaca 1380gatgtcgttg tgaagactga gttggacgaa ggacaaattg cagcagtttt gcaagaatgt 1440cttaaagcgc ttcacttcct tcatagacac tccatagtgc accgagatat caagagtgac 1500aacgtgctgc tcggcatgaa cggagaggtt aagctcaccg atatgggatt ctgtgctcag 1560attcagccgg gatcgaaaag ttgtagagat actgtcgtcg gaactccata 
ttggatgtcg 1620ccggagatat tgaacaagaa gcagtacaac tataaggttg acatttggtc gctgggaatt 1680atggccctag agatgattga tggagagcca ccatatttga gagaaacacc tttgaaggct 1740atctacttga ttgctcaaaa cgggaagcca gagatcaagc aacgcgacag actgtcttca 1800gagttcaaca atttccttga caagtgtctt gttgttgatc cggatcagag agccgataca 1860acggagctct tggcacatcc attcctgaaa aaggcgaagc cactctcaag cctgattcca 1920tacatcagag ccgtccgaga aaagtag 194751806DNACaenorhabditis elegans 5atgtcaactt caaaaagttc caaggtgcga atacggaatt tcatcgggcg aatcttctct 60cccagcgata aagacaagga tcgagacgat gagatgaagc catcctcgtc cgcaatggat 120attagtcagc catataacac agtgcatcga gtccacgttg gatacgacgg ccagaagttc 180agcggactgc cgcaaccatg gatggatatt cttctccgag acattagcta tttcagtctt 240gccgatcaga agaaggatcc gaacgcggtg gtgactgcgt tgaagttcta cgcacaatca 300atgaaggaga acgagaagac gaaattcatg acgacgaata gtgttttcac gaatagcgat 360gacgatgatg tggacgttca gttgaccgga caagtcacgg aacatttgag gaatttgcag 420tgtagtaatg gttccgcaac ttccccatct acatcagtgt cagcttcatc ttcttctgct 480cgtccactga caaatggaaa taatcatctt tccacggcgt cgtctaccga cacatctctc 540tcattatcgg aaaggaataa cgttccgtct ccagctccag ttccatatag tgaaagtgct 600ccacaactga aaacattcac cggagagact ccaaaactgc atccacgatc tccgttcccg 660cctcaaccgc cagttcttcc gcaacgaagc aaaaccgcat cggcagtggc gacgacgacg 720acgaatccga cgacttcgaa tggagcacca ccaccagttc ctggatcgaa aggacccccg 780gtgccaccga aaccatcgaa ggaaaattcg aatgacaaat cagttggaga caagaatggg 840aacaccacca caaacaaaac gaccgtcgaa ccacctccac cagaagagcc acctgttcgt 900gttcgagcat ctcatcgtga aaagctttct gattccgaag tgctcaatca actccgcgag 960attgttaatc caagtaatcc acttggaaag tacgagatga agaagcaaat cggtgttgga 1020gcatccggaa ctgtattcgt tgctaatgtg gccggcagca ctgatgtggt ggctgtgaag 1080agaatggctt tcaagactca gccgaagaag gagatgttgc tcaccgagat taaggttatg 1140aagcagtatc gacacccgaa cctcgtcaac tacattgaat cgtatctggt tgatgctgat 1200gatctttggg tagtgatgga ttatctggaa ggtggaaact tgacagatgt cgttgtgaag 1260actgagttgg acgaaggaca aattgcagca gttttgcaag aatgtcttaa agcgcttcac 1320ttccttcata gacactccat agtgcaccga gatatcaaga 
gtgacaacgt gctgctcggc 1380atgaacggag aggttaagct caccgatatg ggattctgtg ctcagattca gccgggatcg 1440aaaagagata ctgtcgtcgg aactccatat tggatgtcgc cggagatatt gaacaagaag 1500cagtacaact ataaggttga catttggtcg ctgggaatta tggctctaga gatgattgat 1560ggagagccac catatttgag agaaacacct ttgaaggcta tctacttgat tgctcaaaac 1620gggaagccag agatcaagca acgcgacaga ctgtcttcag agttcaacaa tttccttgac 1680aagtgtcttg ttgttgatcc ggatcagaga gccgatacaa cggagctctt ggcacatcca 1740ttcctgaaaa aggcgaagcc actctcaagc ctgattccat acatcagagc cgtccgagaa 1800aagtag 180661947DNACaenorhabditis elegans 6atgtcaactt caaaaagttc caaggtgcga atacggaatt tcgtcgggcg aatcttctct 60cccagcgata aagacaagga tcgagacgat gagatgaagc catcctcgtc cgcaatggat 120attagtcagc catataacac agtgcatcga gtccacgttg gatacgacgg ccagaagttc 180agcggactgc cgcaaccatg gatggatatt cttctccgag acattagtct tgccgatcag 240aagaaggatc cgaacgcggt ggtgactgcg ttgaagttct acgcacaatc aatgaaggag 300aacgagaaga cgaaattcat gacgacgaat agtgttttca cgaatagcga tgacgatgat 360gtggacgttc agttgaccgg acaagtcacg gaacatttga ggaatttgca gtgtagtaat 420ggttccgcaa cttccccatc tacatcagtg tcagcttcat cttcttctgc tcgtccactg 480acaaatggaa ataatcatct ttccacggcg tcgtctaccg acacatctct ctcattatcg 540gaaaggaata acgttccgtc tccagctcca gttccatata gtgaaagtgc tccacaactg 600aaaacattca ccggagagac tccaaaactg catccacgat ctccgttccc gcctcaaccg 660ccagttcttc cgcaacgaag caaaaccgca tcggcagtgg cgacgacgac gacgaatccg 720acgacttcga atggagcacc accaccagtt cctggatcga aaggaccccc ggtgccaccg 780aaaccatcgc atctgaaaat cgcatcgtcg acagtatcct cgggatgctc gtctccacaa 840cagtattcgt ctgctcgatc cgttggtaac tcgctctcca acggcagtgt tgtctccaca 900acatcgtcag atggtgatgt gcaattgtcg aataaggaaa attcgaatga caaatcagtt 960ggagacaaga atgggaacac caccacaaac aaaacgaccg tcgaaccacc tccaccagaa 1020gagccacctg ttcgtgttcg agcatctcat cgtgaaaagc tttctgattc cgaagtgctc 1080aatcaactcc gcgagattgt taatccaagt aatccacttg gaaagtacga gatgaagaag 1140caaatcggtg ttggagcatc cggaactgta ttcgttgcta atgtggccgg cagcactgat 1200gtggtggctg tgaagagaat ggctttcaag actcagccga agaaggagat 
gttgctcacc 1260gagattaagg ttatgaagca gtatcgacac ccgaacctcg tcaactacat tgaatcgtat 1320ctggttgatg ctgatgatct ttgggtagtg atggattatc tggaaggtgg aaacttgaca 1380gatgtcgttg tgaagactga gttggacgaa ggacaaattg cagcagtttt gcaagaatgt 1440cttaaagcgc ttcacttcct tcatagacac tccatagtgc accgagatat caagagtgac 1500aacgtgctgc tcggcatgaa cggagaggtt aagctcaccg atatgggatt ctgtgctcag 1560attcagccgg gatcgaaaag ttgtagagat actgtcgtcg gaactccata ttggatgtcg 1620ccggagatat tgaacaagaa gcagtacaac tataaggttg acatttggtc gctgggaatt 1680atggctctag agatgattga tggagagcca ccatatttga gagaaacacc tttgaaggct 1740atctacttga ttgctcaaaa cgggaagcca gagatcaagc aacgcgacag actgtcttca 1800gagttcaaca atttccttga caagtgtctt gttgttgatc cggatcagag agccgataca 1860acggagctct tggcacatcc attcctgaaa aaggcgaagc cactctcaag cctgattcca 1920tacatcagag ccgtccgaga aaagtag 19477426PRTCaenorhabditis elegans 7Met Phe Gln Asn Ser Pro Met Met Tyr Asp Trp Trp Asn Asp Thr Thr1 5 10 15Lys Pro Lys His Gln Gln Pro Thr Leu Asn Val Leu Ser Pro Trp Gly20 25 30Ala Tyr Phe Asn His Ile Gly Asn Glu Leu Leu His Leu Lys Ile Ala35 40 45Ser Ser Thr Val Ser Ser Gly Cys Ser Ser Pro Gln Gln Tyr Ser Ser50 55 60Ala Arg Ser Val Gly Asn Ser Leu Ser Asn Gly Ser Val Val Ser Thr65 70 75 80Thr Ser Ser Asp Gly Asp Val Gln Leu Ser Asn Lys Glu Asn Ser Asn85 90 95Asp Lys Ser Val Gly Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys Thr100 105 110Thr Val Glu Pro Pro Pro Pro Glu Glu Pro Pro Val Arg Val Arg Ala115 120 125Ser His Arg Glu Lys Leu Ser Asp Ser Glu Val Leu Asn Gln Leu Arg130 135 140Glu Ile Val Asn Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys145 150 155 160Gln Ile Gly Val Gly Ala Ser Gly Thr Val Phe Val Ala Asn Val Ala165 170 175Gly Ser Thr Asp Val Val Ala Val Lys Arg Met Ala Phe Lys Thr Gln180 185 190Pro Lys Lys Glu Met Leu Leu Thr Glu Ile Lys Val Met Lys Gln Tyr195 200 205Arg His Pro Asn Leu Val Asn Tyr Ile Glu Ser Tyr Leu Val Asp Ala210 215 220Asp Asp Leu Trp Val Val Met Asp Tyr Leu Glu Gly Gly Asn Leu Thr225 230 235 240Asp Val Val Val Lys Thr Glu Leu Asp Glu Gly 
Gln Ile Ala Ala Val245 250 255Leu Gln Glu Cys Leu Lys Ala Leu His Phe Leu His Arg His Ser Ile260 265 270Val His Arg Asp Ile Lys Ser Asp Asn Val Leu Leu Gly Met Asn Gly275 280 285Glu Val Lys Leu Thr Asp Met Gly Phe Cys Ala Gln Ile Gln Pro Gly290 295 300Ser Lys Arg Asp Thr Val Val Gly Thr Pro Tyr Trp Met Ser Pro Glu305 310 315 320Ile Leu Asn Lys Lys Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu325 330 335Gly Ile Met Ala Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr Leu Arg340 345 350Glu Thr Pro Leu Lys Ala Ile Tyr Leu Ile Ala Gln Asn Gly Lys Pro355 360 365Glu Ile Lys Gln Arg Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe Leu370 375 380Asp Lys Cys Leu Val Val Asp Pro Asp Gln Arg Ala Asp Thr Thr Glu385 390 395 400Leu Leu Ala His Pro Phe Leu Lys Lys Ala Lys Pro Leu Ser Ser Leu405 410 415Ile Pro Tyr Ile Arg Ala Val Arg Glu Lys420 4258646PRTCaenorhabditis elegans 8Met Ser Thr Ser Lys Ser Ser Lys Val Arg Ile Arg Asn Phe Ile Gly1 5 10 15Arg Ile Phe Ser Pro Ser Asp Lys Asp Lys Asp Arg Asp Asp Glu Met20 25 30Lys Pro Ser Ser Ser Ala Met Asp Ile Ser Gln Pro Tyr Asn Thr Val35 40 45His Arg Val His Val Gly Tyr Asp Gly Gln Lys Phe Ser Gly Leu Pro50 55 60Gln Pro Trp Met Asp Ile Leu Leu Arg Asp Ile Ser Leu Ala Asp Gln65 70 75 80Lys Lys Asp Pro Asn Ala Val Val Thr Ala Leu Lys Phe Tyr Ala Gln85 90 95Ser Met Lys Glu Asn Glu Lys Thr Lys Phe Met Thr Thr Asn Ser Val100 105 110Phe Thr Asn Ser Asp Asp Asp Asp Val Asp Val Gln Leu Thr Gly Gln115 120 125Val Thr Glu His Leu Arg Asn Leu Gln Cys Ser Asn Gly Ser Ala Thr130 135 140Ser Pro Ser Thr Ser Val Ser Ala Ser Ser Ser Ser Ala Arg Pro Leu145 150 155 160Thr Asn Gly Asn Asn His Leu Ser Thr Ala Ser Ser Thr Asp Thr Ser165 170 175Leu Ser Leu Ser Glu Arg Asn Asn Val Pro Ser Pro Ala Pro Val Pro180 185 190Tyr Ser Glu Ser Ala Pro Gln Leu Lys Thr Phe Thr Gly Glu Thr Pro195 200 205Lys Leu His Pro Arg Ser Pro Phe Pro Pro Gln Pro Pro Val Leu Pro210 215 220Gln Arg Ser Lys Thr Ala Ser Ala Val Ala Thr Thr Thr Thr Asn
Pro225 230 235 240Thr Thr Ser Asn Gly Ala Pro Pro Pro Val Pro Gly Ser Lys Gly Pro245 250 255Pro Val Pro Pro Lys Pro Ser His Leu Lys Ile Ala Ser Ser Thr Val260 265 270Ser Ser Gly Cys Ser Ser Pro Gln Gln Tyr Ser Ser Ala Arg Ser Val275 280 285Gly Asn Ser Leu Ser Asn Gly Ser Val Val Ser Thr Thr Ser Ser Asp290 295 300Gly Asp Val Gln Leu Ser Asn Lys Glu Asn Ser Asn Asp Lys Ser Val305 310 315 320Gly Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys Thr Thr Val Glu Pro325 330 335Pro Pro Pro Glu Glu Pro Pro Val Arg Val Arg Ala Ser His Arg Glu340 345 350Lys Leu Ser Asp Ser Glu Val Leu Asn Gln Leu Arg Glu Ile Val Asn355 360 365Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys Gln Ile Gly Val370 375 380Gly Ala Ser Gly Thr Val Phe Val Ala Asn Val Ala Gly Ser Thr Asp385 390 395 400Val Val Ala Val Lys Arg Met Ala Phe Lys Thr Gln Pro Lys Lys Glu405 410 415Met Leu Leu Thr Glu Ile Lys Val Met Lys Gln Tyr Arg His Pro Asn420 425 430Leu Val Asn Tyr Ile Glu Ser Tyr Leu Val Asp Ala Asp Asp Leu Trp435 440 445Val Val Met Asp Tyr Leu Glu Gly Gly Asn Leu Thr Asp Val Val Val450 455 460Lys Thr Glu Leu Asp Glu Gly Gln Ile Ala Ala Val Leu Gln Glu Cys465 470 475 480Leu Lys Ala Leu His Phe Leu His Arg His Ser Ile Val His Arg Asp485 490 495Ile Lys Ser Asp Asn Val Leu Leu Gly Met Asn Gly Glu Val Lys Leu500 505 510Thr Asp Met Gly Phe Cys Ala Gln Ile Gln Pro Gly Ser Lys Arg Asp515 520 525Thr Val Val Gly Thr Pro Tyr Trp Met Ser Pro Glu Ile Leu Asn Lys530 535 540Lys Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu Gly Ile Met Ala545 550 555 560Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr Leu Arg Glu Thr Pro Leu565 570 575Lys Ala Ile Tyr Leu Ile Ala Gln Asn Gly Lys Pro Glu Ile Lys Gln580 585 590Arg Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe Leu Asp Lys Cys Leu595 600 605Val Val Asp Pro Asp Gln Arg Ala Asp Thr Thr Glu Leu Leu Ala His610 615 620Pro Phe Leu Lys Lys Ala Lys Pro Leu Ser Ser Leu Ile Pro Tyr Ile625 630 635 640Arg Ala Val Arg Glu Lys6459646PRTCaenorhabditis elegans 9Met Ser Thr Ser Lys Ser Ser Lys Val Arg Ile Arg Asn 
Phe Ile Gly1 5 10 15Arg Ile Phe Ser Pro Ser Asp Lys Asp Lys Asp Arg Asp Asp Glu Met20 25 30Lys Pro Ser Ser Ser Ala Met Asp Ile Ser Gln Pro Tyr Asn Thr Val35 40 45His Arg Val His Val Gly Tyr Asp Gly Gln Lys Phe Ser Gly Leu Pro50 55 60Gln Pro Trp Met Asp Ile Leu Leu Arg Asp Ile Ser Leu Ala Asp Gln65 70 75 80Lys Lys Asp Pro Asn Ala Val Val Thr Ala Leu Lys Phe Tyr Ala Gln85 90 95Ser Met Lys Glu Asn Glu Lys Thr Lys Phe Met Thr Thr Asn Ser Val100 105 110Phe Thr Asn Ser Asp Asp Asp Asp Val Asp Val Gln Leu Thr Gly Gln115 120 125Val Thr Glu His Leu Arg Asn Leu Gln Cys Ser Asn Gly Ser Ala Thr130 135 140Ser Pro Ser Thr Ser Val Ser Ala Ser Ser Ser Ser Ala Arg Pro Leu145 150 155 160Thr Asn Gly Asn Asn His Leu Ser Thr Ala Ser Ser Thr Asp Thr Ser165 170 175Leu Ser Leu Ser Glu Arg Asn Asn Val Pro Ser Pro Ala Pro Val Pro180 185 190Tyr Ser Glu Ser Ala Pro Gln Leu Lys Thr Phe Thr Gly Glu Thr Pro195 200 205Lys Leu His Pro Arg Ser Pro Phe Pro Pro Gln Pro Pro Val Leu Pro210 215 220Gln Arg Ser Lys Thr Ala Ser Ala Val Ala Thr Thr Thr Thr Asn Pro225 230 235 240Thr Thr Ser Asn Gly Ala Pro Pro Pro Val Pro Gly Ser Lys Gly Pro245 250 255Pro Val Pro Pro Lys Pro Ser His Leu Lys Ile Ala Ser Ser Thr Val260 265 270Ser Ser Gly Cys Ser Ser Pro Gln Gln Tyr Ser Ser Ala Arg Ser Val275 280 285Gly Asn Ser Leu Ser Asn Gly Ser Val Val Ser Thr Thr Ser Ser Asp290 295 300Gly Asp Val Gln Leu Ser Asn Lys Glu Asn Ser Asn Asp Lys Ser Val305 310 315 320Gly Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys Thr Thr Val Glu Pro325 330 335Pro Pro Pro Glu Glu Pro Pro Val Arg Val Arg Ala Ser His Arg Glu340 345 350Lys Leu Ser Asp Ser Glu Val Leu Asn Gln Leu Arg Glu Ile Val Asn355 360 365Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys Gln Ile Gly Val370 375 380Gly Ala Ser Gly Thr Val Phe Val Ala Asn Val Ala Gly Ser Thr Asp385 390 395 400Val Val Ala Val Lys Arg Met Ala Phe Lys Thr Gln Pro Lys Lys Glu405 410 415Met Leu Leu Thr Glu Ile Lys Val Met Lys Gln Tyr Arg His Pro Asn420 425 430Leu Val Asn Tyr Ile Glu Ser Tyr Leu Val Asp Ala Asp 
Asp Leu Trp435 440 445Val Val Met Asp Tyr Leu Glu Gly Gly Asn Leu Thr Asp Val Val Val450 455 460Lys Thr Glu Leu Asp Glu Gly Gln Ile Ala Ala Val Leu Gln Glu Cys465 470 475 480Leu Lys Ala Leu His Phe Leu His Arg His Ser Ile Val His Arg Asp485 490 495Ile Lys Ser Asp Asn Val Leu Leu Gly Met Asn Gly Glu Val Lys Leu500 505 510Thr Asp Met Gly Phe Cys Ala Gln Ile Gln Pro Gly Ser Lys Arg Asp515 520 525Thr Val Val Gly Thr Pro Tyr Trp Met Ser Pro Glu Ile Leu Asn Lys530 535 540Lys Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu Gly Ile Met Ala545 550 555 560Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr Leu Arg Glu Thr Pro Leu565 570 575Lys Ala Ile Tyr Leu Ile Ala Gln Asn Gly Lys Pro Glu Ile Lys Gln580 585 590Arg Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe Leu Asp Lys Cys Leu595 600 605Val Val Asp Pro Asp Gln Arg Ala Asp Thr Thr Glu Leu Leu Ala His610 615 620Pro Phe Leu Lys Lys Ala Lys Pro Leu Ser Ser Leu Ile Pro Tyr Ile625 630 635 640Arg Ala Val Arg Glu Lys64510648PRTCaenorhabditis elegans 10Met Ser Thr Ser Lys Ser Ser Lys Val Arg Ile Arg Asn Phe Ile Gly1 5 10 15Arg Ile Phe Ser Pro Ser Asp Lys Asp Lys Asp Arg Asp Asp Glu Met20 25 30Lys Pro Ser Ser Ser Ala Met Asp Ile Ser Gln Pro Tyr Asn Thr Val35 40 45His Arg Val His Val Gly Tyr Asp Gly Gln Lys Phe Ser Gly Leu Pro50 55 60Gln Pro Trp Met Asp Ile Leu Leu Arg Asp Ile Ser Leu Ala Asp Gln65 70 75 80Lys Lys Asp Pro Asn Ala Val Val Thr Ala Leu Lys Phe Tyr Ala Gln85 90 95Ser Met Lys Glu Asn Glu Lys Thr Lys Phe Met Thr Thr Asn Ser Val100 105 110Phe Thr Asn Ser Asp Asp Asp Asp Val Asp Val Gln Leu Thr Gly Gln115 120 125Val Thr Glu His Leu Arg Asn Leu Gln Cys Ser Asn Gly Ser Ala Thr130 135 140Ser Pro Ser Thr Ser Val Ser Ala Ser Ser Ser Ser Ala Arg Pro Leu145 150 155 160Thr Asn Gly Asn Asn His Leu Ser Thr Ala Ser Ser Thr Asp Thr Ser165 170 175Leu Ser Leu Ser Glu Arg Asn Asn Val Pro Ser Pro Ala Pro Val Pro180 185 190Tyr Ser Glu Ser Ala Pro Gln Leu Lys Thr Phe Thr Gly Glu Thr Pro195 200 205Lys Leu His Pro Arg Ser Pro Phe Pro Pro Gln Pro Pro Val Leu Pro210 
215 220Gln Arg Ser Lys Thr Ala Ser Ala Val Ala Thr Thr Thr Thr Asn Pro225 230 235 240Thr Thr Ser Asn Gly Ala Pro Pro Pro Val Pro Gly Ser Lys Gly Pro245 250 255Pro Val Pro Pro Lys Pro Ser His Leu Lys Ile Ala Ser Ser Thr Val260 265 270Ser Ser Gly Cys Ser Ser Pro Gln Gln Tyr Ser Ser Ala Arg Ser Val275 280 285Gly Asn Ser Leu Ser Asn Gly Ser Val Val Ser Thr Thr Ser Ser Asp290 295 300Gly Asp Val Gln Leu Ser Asn Lys Glu Asn Ser Asn Asp Lys Ser Val305 310 315 320Gly Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys Thr Thr Val Glu Pro325 330 335Pro Pro Pro Glu Glu Pro Pro Val Arg Val Arg Ala Ser His Arg Glu340 345 350Lys Leu Ser Asp Ser Glu Val Leu Asn Gln Leu Arg Glu Ile Val Asn355 360 365Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys Gln Ile Gly Val370 375 380Gly Ala Ser Gly Thr Val Phe Val Ala Asn Val Ala Gly Ser Thr Asp385 390 395 400Val Val Ala Val Lys Arg Met Ala Phe Lys Thr Gln Pro Lys Lys Glu405 410 415Met Leu Leu Thr Glu Ile Lys Val Met Lys Gln Tyr Arg His Pro Asn420 425 430Leu Val Asn Tyr Ile Glu Ser Tyr Leu Val Asp Ala Asp Asp Leu Trp435 440 445Val Val Met Asp Tyr Leu Glu Gly Gly Asn Leu Thr Asp Val Val Val450 455 460Lys Thr Glu Leu Asp Glu Gly Gln Ile Ala Ala Val Leu Gln Glu Cys465 470 475 480Leu Lys Ala Leu His Phe Leu His Arg His Ser Ile Val His Arg Asp485 490 495Ile Lys Ser Asp Asn Val Leu Leu Gly Met Asn Gly Glu Val Lys Leu500 505 510Thr Asp Met Gly Phe Cys Ala Gln Ile Gln Pro Gly Ser Lys Ser Cys515 520 525Arg Asp Thr Val Val Gly Thr Pro Tyr Trp Met Ser Pro Glu Ile Leu530 535 540Asn Lys Lys Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu Gly Ile545 550 555 560Met Ala Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr Leu Arg Glu Thr565 570 575Pro Leu Lys Ala Ile Tyr Leu Ile Ala Gln Asn Gly Lys Pro Glu Ile580 585 590Lys Gln Arg Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe Leu Asp Lys595 600 605Cys Leu Val Val Asp Pro Asp Gln Arg Ala Asp Thr Thr Glu Leu Leu610 615 620Ala His Pro Phe Leu Lys Lys Ala Lys Pro Leu Ser Ser Leu Ile Pro625 630 635 640Tyr Ile Arg Ala Val Arg Glu 
Lys64511601PRTCaenorhabditis elegans 11Met Ser Thr Ser Lys Ser Ser Lys Val Arg Ile Arg Asn Phe Ile Gly1 5 10 15Arg Ile Phe Ser Pro Ser Asp Lys Asp Lys Asp Arg Asp Asp Glu Met20 25 30Lys Pro Ser Ser Ser Ala Met Asp Ile Ser Gln Pro Tyr Asn Thr Val35 40 45His Arg Val His Val Gly Tyr Asp Gly Gln Lys Phe Ser Gly Leu Pro50 55 60Gln Pro Trp Met Asp Ile Leu Leu Arg Asp Ile Ser Tyr Phe Ser Leu65 70 75 80Ala Asp Gln Lys Lys Asp Pro Asn Ala Val Val Thr Ala Leu Lys Phe85 90 95Tyr Ala Gln Ser Met Lys Glu Asn Glu Lys Thr Lys Phe Met Thr Thr100 105 110Asn Ser Val Phe Thr Asn Ser Asp Asp Asp Asp Val Asp Val Gln Leu115 120 125Thr Gly Gln Val Thr Glu His Leu Arg Asn Leu Gln Cys Ser Asn Gly130 135 140Ser Ala Thr Ser Pro Ser Thr Ser Val Ser Ala Ser Ser Ser Ser Ala145 150 155 160Arg Pro Leu Thr Asn Gly Asn Asn His Leu Ser Thr Ala Ser Ser Thr165 170 175Asp Thr Ser Leu Ser Leu Ser Glu Arg Asn Asn Val Pro Ser Pro Ala180 185 190Pro Val Pro Tyr Ser Glu Ser Ala Pro Gln Leu Lys Thr Phe Thr Gly195 200 205Glu Thr Pro Lys Leu His Pro Arg Ser Pro Phe Pro Pro Gln Pro Pro210 215 220Val Leu Pro Gln Arg Ser Lys Thr Ala Ser Ala Val Ala Thr Thr Thr225 230 235 240Thr Asn Pro Thr Thr Ser Asn Gly Ala Pro Pro Pro Val Pro Gly Ser245 250 255Lys Gly Pro Pro Val Pro Pro Lys Pro Ser Lys Glu Asn Ser Asn Asp260 265 270Lys Ser Val Gly Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys Thr Thr275 280 285Val Glu Pro Pro Pro Pro Glu Glu Pro Pro Val Arg Val Arg Ala Ser290 295 300His Arg Glu Lys Leu Ser Asp Ser Glu Val Leu Asn Gln Leu Arg Glu305 310 315 320Ile Val Asn Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys Gln325 330 335Ile Gly Val Gly Ala Ser Gly Thr Val Phe Val Ala Asn Val Ala Gly340 345 350Ser Thr Asp Val Val Ala Val Lys Arg Met Ala Phe Lys Thr Gln Pro355 360 365Lys Lys Glu Met Leu Leu Thr Glu Ile Lys Val Met Lys Gln Tyr Arg370 375 380His Pro Asn Leu Val Asn Tyr Ile Glu Ser Tyr Leu Val Asp Ala Asp385 390 395 400Asp Leu Trp Val Val Met Asp Tyr Leu Glu Gly Gly Asn Leu Thr Asp405 410 415Val Val Val Lys Thr Glu Leu Asp 
Glu Gly Gln Ile Ala Ala Val Leu420 425 430Gln Glu Cys Leu Lys Ala Leu His Phe Leu His Arg His Ser Ile Val435 440 445His Arg Asp Ile Lys Ser Asp Asn Val Leu Leu Gly Met Asn Gly Glu450 455 460Val Lys Leu Thr Asp Met Gly Phe Cys Ala Gln Ile Gln Pro Gly Ser465 470 475 480Lys Arg Asp Thr Val Val Gly Thr Pro Tyr Trp Met Ser Pro Glu Ile485 490 495Leu Asn Lys Lys Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu Gly500 505 510Ile Met Ala Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr Leu Arg Glu515 520 525Thr Pro Leu Lys Ala Ile Tyr Leu Ile Ala Gln Asn Gly Lys Pro Glu530 535 540Ile Lys Gln Arg Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe Leu Asp545 550 555 560Lys Cys Leu Val Val Asp Pro Asp Gln Arg Ala Asp Thr Thr Glu Leu565 570 575Leu Ala His Pro Phe Leu Lys Lys Ala Lys Pro Leu Ser Ser Leu Ile580 585 590Pro Tyr Ile Arg Ala Val Arg Glu Lys595 60012648PRTCaenorhabditis elegans 12Met Ser Thr Ser Lys Ser Ser Lys Val Arg Ile Arg Asn Phe Val Gly1 5 10 15Arg Ile Phe Ser Pro Ser Asp Lys Asp Lys Asp Arg Asp Asp Glu Met20 25 30Lys Pro Ser Ser Ser Ala Met Asp Ile Ser Gln Pro Tyr Asn Thr Val35 40 45His Arg Val His Val Gly Tyr Asp Gly Gln Lys Phe Ser Gly Leu Pro50 55 60Gln Pro Trp Met Asp Ile Leu Leu Arg Asp Ile Ser Leu Ala Asp Gln65 70 75 80Lys Lys Asp Pro Asn Ala Val Val Thr Ala Leu Lys Phe Tyr Ala Gln85 90 95Ser Met Lys Glu Asn Glu Lys Thr Lys Phe Met Thr Thr Asn Ser Val100 105 110Phe Thr Asn Ser Asp Asp Asp Asp Val Asp Val Gln Leu Thr Gly Gln115 120 125Val Thr Glu His Leu Arg Asn Leu Gln Cys Ser Asn Gly Ser Ala Thr130 135 140Ser Pro Ser Thr Ser Val Ser Ala Ser Ser Ser Ser Ala Arg Pro Leu145 150 155 160Thr Asn Gly Asn Asn His Leu Ser Thr Ala Ser Ser Thr Asp Thr Ser165 170 175Leu Ser Leu Ser Glu Arg Asn Asn Val Pro Ser Pro Ala Pro Val Pro180 185 190Tyr Ser Glu Ser Ala Pro Gln Leu Lys Thr Phe Thr Gly Glu Thr Pro195 200 205Lys Leu His Pro Arg Ser Pro Phe Pro Pro Gln Pro Pro Val Leu Pro210 215 220Gln Arg Ser Lys Thr Ala Ser Ala Val Ala Thr Thr Thr Thr Asn Pro225 230 235 240Thr Thr Ser Asn Gly Ala Pro 
Pro Pro Val Pro Gly Ser Lys Gly Pro245 250 255Pro Val Pro Pro Lys Pro Ser His Leu Lys Ile Ala Ser Ser Thr Val260 265 270Ser Ser Gly Cys Ser Ser Pro Gln Gln Tyr Ser Ser Ala Arg Ser Val275 280 285Gly Asn Ser Leu Ser Asn Gly Ser Val Val Ser Thr Thr Ser Ser Asp290 295 300Gly Asp Val Gln Leu Ser Asn Lys Glu Asn Ser Asn Asp Lys Ser Val305 310 315 320Gly Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys Thr Thr Val Glu Pro325
330 335Pro Pro Pro Glu Glu Pro Pro Val Arg Val Arg Ala Ser His Arg Glu340 345 350Lys Leu Ser Asp Ser Glu Val Leu Asn Gln Leu Arg Glu Ile Val Asn355 360 365Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys Gln Ile Gly Val370 375 380Gly Ala Ser Gly Thr Val Phe Val Ala Asn Val Ala Gly Ser Thr Asp385 390 395 400Val Val Ala Val Lys Arg Met Ala Phe Lys Thr Gln Pro Lys Lys Glu405 410 415Met Leu Leu Thr Glu Ile Lys Val Met Lys Gln Tyr Arg His Pro Asn420 425 430Leu Val Asn Tyr Ile Glu Ser Tyr Leu Val Asp Ala Asp Asp Leu Trp435 440 445Val Val Met Asp Tyr Leu Glu Gly Gly Asn Leu Thr Asp Val Val Val450 455 460Lys Thr Glu Leu Asp Glu Gly Gln Ile Ala Ala Val Leu Gln Glu Cys465 470 475 480Leu Lys Ala Leu His Phe Leu His Arg His Ser Ile Val His Arg Asp485 490 495Ile Lys Ser Asp Asn Val Leu Leu Gly Met Asn Gly Glu Val Lys Leu500 505 510Thr Asp Met Gly Phe Cys Ala Gln Ile Gln Pro Gly Ser Lys Ser Cys515 520 525Arg Asp Thr Val Val Gly Thr Pro Tyr Trp Met Ser Pro Glu Ile Leu530 535 540Asn Lys Lys Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu Gly Ile545 550 555 560Met Ala Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr Leu Arg Glu Thr565 570 575Pro Leu Lys Ala Ile Tyr Leu Ile Ala Gln Asn Gly Lys Pro Glu Ile580 585 590Lys Gln Arg Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe Leu Asp Lys595 600 605Cys Leu Val Val Asp Pro Asp Gln Arg Ala Asp Thr Thr Glu Leu Leu610 615 620Ala His Pro Phe Leu Lys Lys Ala Lys Pro Leu Ser Ser Leu Ile Pro625 630 635 640Tyr Ile Arg Ala Val Arg Glu Lys645133541DNAArtificial SequenceDescription of Artificial SequenceVector 13gttaacgcta gcatggatct cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtttca aaatagtccg atgatgtacg actggtggaa tgacaccacc 180aaaccgaaac accagcagcc gacacttaac gtgttgtcac catggggagc atatttcaat 240cacattggaa atgaactgct gcatctgaaa atcgcatcgt cgacagtatc ctcgggatgc 300tcgtctccac aacagtattc gtctgctcga tccgttggta actcgctctc caacggcagt 360gttgtctcca caacatcgtc agatggtgat gtgcaattgt 
cgaataagga aaattcgaat 420gacaaatcag ttggagacaa gaatgggaac accaccacaa acaaaacgac cgtcgaacca 480cctccaccag aagagccacc tgttcgtgtt cgagcatctc atcgtgaaaa gctttctgat 540tccgaagtgc tcaatcaact ccgcgagatt gttaatccaa gtaatccact tggaaagtac 600gagatgaaga agcaaatcgg tgttggagca tccggaactg tattcgttgc taatgtggcc 660ggcagcactg atgtggtggc tgtgaagaga atggctttca agactcagcc gaagaaggag 720atgttgctca ccgagattaa ggttatgaag cagtatcgac acccgaacct cgtcaactac 780attgaatcgt atctggttga tgctgatgat ctttgggtag tgatggatta tctggaaggt 840ggaaacttga cagatgtcgt tgtgaagact gagttggacg aaggacaaat tgcagcagtt 900ttgcaagaat gtcttaaagc gcttcacttc cttcatagac actccatagt gcaccgagat 960atcaagagtg acaacgtgct gctcggcatg aacggagagg ttaagctcac cgatatggga 1020ttctgtgctc agattcagcc gggatcgaaa agagatactg tcgtcggaac tccatattgg 1080atgtcgccgg agatattgaa caagaagcag tacaactata aggttgacat ttggtcgctg 1140ggaattatgg ctctagagat gattgatgga gagccaccat atttgagaga aacacctttg 1200aaggctatct acttgattgc tcaaaacggg aagccagaga tcaagcaacg cgacagactg 1260tcttcagagt tcaacaattt ccttgacaag tgtcttgttg ttgatccgga tcagagagcc 1320gatacaacgg agctcttggc acatccattc ctgaaaaagg cgaagccact ctcaagcctg 1380attccataca tcagagccgt ccgagaaaag tagacccagc tttcttgtac aaagttggca 1440ttataagaaa gcattgctta tcaatttgtt gcaacgaaca ggtcactatc agtcaaaata 1500aaatcattat ttgccatcca gctgcagctc tggcccgtgt ctcaaaatct ctgatgttac 1560attgcacaag ataaaaatat atcatcatga acaataaaac tgtctgctta cataaacagt 1620aatacaaggg gtgttatgag ccatattcaa cgggaaacgt cgaggccgcg attaaattcc 1680aacatggatg ctgatttata tgggtataaa tgggctcgcg ataatgtcgg gcaatcaggt 1740gcgacaatct atcgcttgta tgggaagccc gatgcgccag agttgtttct gaaacatggc 1800aaaggtagcg ttgccaatga tgttacagat gagatggtca gactaaactg gctgacggaa 1860tttatgcctc ttccgaccat caagcatttt atccgtactc ctgatgatgc atggttactc 1920accactgcga tccccggaaa aacagcattc caggtattag aagaatatcc tgattcaggt 1980gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt tgcattcgat tcctgtttgt 2040aattgtcctt ttaacagcga tcgcgtattt cgtctcgctc aggcgcaatc acgaatgaat 2100aacggtttgg ttgatgcgag 
tgattttgat gacgagcgta atggctggcc tgttgaacaa 2160gtctggaaag aaatgcataa acttttgcca ttctcaccgg attcagtcgt cactcatggt 2220gatttctcac ttgataacct tatttttgac gaggggaaat taataggttg tattgatgtt 2280ggacgagtcg gaatcgcaga ccgataccag gatcttgcca tcctatggaa ctgcctcggt 2340gagttttctc cttcattaca gaaacggctt tttcaaaaat atggtattga taatcctgat 2400atgaataaat tgcagtttca tttgatgctc gatgagtttt tctaatcaga attggttaat 2460tggttgtaac actggcagag cattacgctg acttgacggg acggcgcaag ctcatgacca 2520aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 2580gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 2640cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 2700ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 2760accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 2820tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 2880cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 2940gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 3000ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 3060cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 3120tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 3180ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 3240ttcctgcgtt atcccctgat tctgtggata accgtattac cgctagccag gaagagtttg 3300tagaaacgca aaaaggccat ccgtcaggat ggccttctgc ttagtttgat gcctggcagt 3360ttatggcggg cgtcctgccc gccaccctcc gggccgttgc ttcacaacgt tcaaatccgc 3420tcccggcgga tttgtcctac tcaggagagc gttcaccgac aaacaacaga taaaacgaaa 3480ggcccagtct tccgactgag cctttcgttt tatttgatgc ctggcagttc cctactctcg 3540c 3541144201DNAArtificial SequenceDescription of Artificial SequenceVector 14gttaacgcta gcatggatct cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtcaac ttcaaaaagt tccaaggtgc gaatacggaa tttcatcggg 180cgaatcttct ctcccagcga taaagacaag gatcgagacg atgagatgaa 
gccatcctcg 240tccgcaatgg atattagtca gccatataac acagtgcatc gagtccacgt tggatacgac 300ggccagaagt tcagcggact gccgcaacca tggatggata ttcttctccg agacattagt 360cttgccgatc agaagaagga tccgaacgcg gtggtgactg cgttgaagtt ctacgcacaa 420tcaatgaagg agaacgagaa gacgaaattc atgacgacga atagtgtttt cacgaatagc 480gatgacgatg atgtggacgt tcagttgacc ggacaagtca cggaacattt gaggaatttg 540cagtgtagta atggttccgc aacttcccca tctacatcag tgtcagcttc atcttcttct 600gctcgtccac tgacaaatgg aaataatcat ctttccacgg cgtcgtctac cgacacatct 660ctctcattat cggaaaggaa taacgttccg tctccagctc cagttccata tagtgaaagt 720gctccacaac tgaaaacatt caccggagag actccaaaac tgcatccacg atctccgttc 780ccgcctcaac cgccagttct tccgcaacga agcaaaaccg catcggcagt ggcgacgacg 840acgacgaatc cgacgacttc gaatggagca ccaccaccag ttcctggatc gaaaggaccc 900ccggtgccac cgaaaccatc gcatctgaaa atcgcatcgt cgacagtatc ctcgggatgc 960tcgtctccac aacagtattc gtctgctcga tccgttggta actcgctctc caacggcagt 1020gttgtctcca caacatcgtc agatggtgat gtgcaattgt cgaataagga aaattcgaat 1080gacaaatcag ttggagacaa gaatgggaac accaccacaa acaaaacgac cgtcgaacca 1140cctccaccag aagagccacc tgttcgtgtt cgagcatctc atcgtgaaaa gctttctgat 1200tccgaagtgc tcaatcaact ccgcgagatt gttaatccaa gtaatccact tggaaagtac 1260gagatgaaga agcaaatcgg tgttggagca tccggaactg tattcgttgc taatgtggcc 1320ggcagcactg atgtggtggc tgtgaagaga atggctttca agactcagcc gaagaaggag 1380atgttgctca ccgagattaa ggttatgaag cagtatcgac acccgaacct cgtcaactac 1440attgaatcgt atctggttga tgctgatgat ctttgggtag tgatggatta tctggaaggt 1500ggaaacttga cagatgtcgt tgtgaagact gagttggacg aaggacaaat tgcagcagtt 1560ttgcaagaat gtcttaaagc gcttcacttc cttcatagac actccatagt gcaccgagat 1620atcaagagtg acaacgtgct gctcggcatg aacggagagg ttaagctcac cgatatggga 1680ttctgtgctc agattcagcc gggatcgaaa agagatactg tcgtcggaac tccatattgg 1740atgtcgccgg agatattgaa caagaagcag tacaactata aggttgacat ttggtcgctg 1800ggaattatgg ctctagagat gattgatgga gagccaccat atttgagaga aacacctttg 1860aaggctatct acttgattgc tcaaaacggg aagccagaga tcaagcaacg cgacagactg 1920tcttcagagt tcaacaattt ccttgacaag 
tgtcttgttg ttgatccgga tcagagagcc 1980gatacaacgg agctcttggc acatccattc ctgaaaaagg cgaagccact ctcaagcctg 2040attccataca tcagagccgt ccgagaaaag tagacccagc tttcttgtac aaagttggca 2100ttataagaaa gcattgctta tcaatttgtt gcaacgaaca ggtcactatc agtcaaaata 2160aaatcattat ttgccatcca gctgcagctc tggcccgtgt ctcaaaatct ctgatgttac 2220attgcacaag ataaaaatat atcatcatga acaataaaac tgtctgctta cataaacagt 2280aatacaaggg gtgttatgag ccatattcaa cgggaaacgt cgaggccgcg attaaattcc 2340aacatggatg ctgatttata tgggtataaa tgggctcgcg ataatgtcgg gcaatcaggt 2400gcgacaatct atcgcttgta tgggaagccc gatgcgccag agttgtttct gaaacatggc 2460aaaggtagcg ttgccaatga tgttacagat gagatggtca gactaaactg gctgacggaa 2520tttatgcctc ttccgaccat caagcatttt atccgtactc ctgatgatgc atggttactc 2580accactgcga tccccggaaa aacagcattc caggtattag aagaatatcc tgattcaggt 2640gaaaatattg ttgatgcgct ggcagtgttc ctgcgccggt tgcattcgat tcctgtttgt 2700aattgtcctt ttaacagcga tcgcgtattt cgtctcgctc aggcgcaatc acgaatgaat 2760aacggtttgg ttgatgcgag tgattttgat gacgagcgta atggctggcc tgttgaacaa 2820gtctggaaag aaatgcataa acttttgcca ttctcaccgg attcagtcgt cactcatggt 2880gatttctcac ttgataacct tatttttgac gaggggaaat taataggttg tattgatgtt 2940ggacgagtcg gaatcgcaga ccgataccag gatcttgcca tcctatggaa ctgcctcggt 3000gagttttctc cttcattaca gaaacggctt tttcaaaaat atggtattga taatcctgat 3060atgaataaat tgcagtttca tttgatgctc gatgagtttt tctaatcaga attggttaat 3120tggttgtaac actggcagag cattacgctg acttgacggg acggcgcaag ctcatgacca 3180aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 3240gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 3300cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 3360ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 3420accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 3480tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 3540cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 3600gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 
3660ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 3720cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 3780tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 3840ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 3900ttcctgcgtt atcccctgat tctgtggata accgtattac cgctagccag gaagagtttg 3960tagaaacgca aaaaggccat ccgtcaggat ggccttctgc ttagtttgat gcctggcagt 4020ttatggcggg cgtcctgccc gccaccctcc gggccgttgc ttcacaacgt tcaaatccgc 4080tcccggcgga tttgtcctac tcaggagagc gttcaccgac aaacaacaga taaaacgaaa 4140ggcccagtct tccgactgag cctttcgttt tatttgatgc ctggcagttc cctactctcg 4200c 4201154278DNAArtificial SequenceDescription of Artificial SequenceVector 15gttaacgcta gcatggatct cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctggt ttaattaccc aagtttgaga tttaccttaa catcgggtct gacaaccgtg 180tcgcttacga cgcattctaa tcattaacca tgtcaacttc aaaaagttcc aaggtgcgaa 240tacggaattt catcgggcga atcttctctc ccagcgataa agacaaggat cgagacgatg 300agatgaagcc atcctcgtcc gcaatggata ttagtcagcc atataacaca gtgcatcgag 360tccacgttgg atacgacggc cagaagttca gcggactgcc gcaaccatgg atggatattc 420ttctccgaga cattagtctt gccgatcaga agaaggatcc gaacgcggtg gtgactgcgt 480tgaagttcta cgcacaatca atgaaggaga acgagaagac gaaattcatg acgacgaata 540gtgttttcac gaatagcgat gacgatgatg tggacgttca gttgaccgga caagtcacgg 600aacatttgag gaatttgcag tgtagtaatg gttccgcaac ttccccatct acatcagtgt 660cagcttcatc ttcttctgct cgtccactga caaatggaaa taatcatctt tccacggcgt 720cgtctaccga cacatctctc tcattatcgg aaaggaataa cgttccgtct ccagctccag 780ttccatatag tgaaagtgct ccacaactga aaacattcac cggagagact ccaaaactgc 840atccacgatc tccgttcccg cctcaaccgc cagttcttcc gcaacgaagc aaaaccgcat 900cggcagtggc gacgacgacg acgaatccga cgacttcgaa tggagcacca ccaccagttc 960ctggatcgaa aggacccccg gtgccaccga aaccatcgca tctgaaaatc gcatcgtcga 1020cagtatcctc gggatgctcg tctccacaac agtattcgtc tgctcgatcc gttggtaact 1080cgctctccaa cggcagtgtt gtctccacaa catcgtcaga 
tggtgatgtg caattgtcga 1140ataaggaaaa ttcgaatgac aaatcagttg gagacaagaa tgggaacacc accacaaaca 1200aaacgaccgt cgaaccacct ccaccagaag agccacctgt tcgtgttcga gcatctcatc 1260gtgaaaagct ttctgattcc gaagtgctca atcaactccg cgagattgtt aatccaagta 1320atccacttgg aaagtacgag atgaagaagc aaatcggtgt tggagcatcc ggaactgtat 1380tcgttgctaa tgtggccggc agcactgatg tggtggctgt gaagagaatg gctttcaaga 1440ctcagccgaa gaaggagatg ttgctcaccg agattaaggt tatgaagcag tatcgacacc 1500cgaacctcgt caactacatt gaatcgtatc tggttgatgc tgatgatctt tgggtagtga 1560tggattatct ggaaggtgga aacttgacag atgtcgttgt gaagactgag ttggacgaag 1620gacaaattgc agcagttttg caagaatgtc ttaaagcgct tcacttcctt catagacact 1680ccatagtgca ccgagatatc aagagtgaca acgtgctgct cggcatgaac ggagaggtta 1740agctcaccga tatgggattc tgtgctcaga ttcagccggg atcgaaaaga gatactgtcg 1800tcggaactcc atattggatg tcgccggaga tattgaacaa gaagcagtac aactataagg 1860ttgacatttg gtcgctggga attatggccc tagagatgat tgatggagag ccaccatatt 1920tgagagaaac acctttgaag gctatctact tgattgctca aaacgggaag ccagagatca 1980agcaacgcga cagactgtct tcagagttca acaatttcct tgacaagtgt cttgttgttg 2040atccggatca gagagccgat acaacggagc tcttggcaca tccattcctg aaaaaggcga 2100agccactctc aagcctgatt ccatacatca gagccgtccg agaaaagtag acccagcttt 2160cttgtacaaa gttggcatta taagaaagca ttgcttatca atttgttgca acgaacaggt 2220cactatcagt caaaataaaa tcattatttg ccatccagct gcagctctgg cccgtgtctc 2280aaaatctctg atgttacatt gcacaagata aaaatatatc atcatgaaca ataaaactgt 2340ctgcttacat aaacagtaat acaaggggtg ttatgagcca tattcaacgg gaaacgtcga 2400ggccgcgatt aaattccaac atggatgctg atttatatgg gtataaatgg gctcgcgata 2460atgtcgggca atcaggtgcg acaatctatc gcttgtatgg gaagcccgat gcgccagagt 2520tgtttctgaa acatggcaaa ggtagcgttg ccaatgatgt tacagatgag atggtcagac 2580taaactggct gacggaattt atgcctcttc cgaccatcaa gcattttatc cgtactcctg 2640atgatgcatg gttactcacc actgcgatcc ccggaaaaac agcattccag gtattagaag 2700aatatcctga ttcaggtgaa aatattgttg atgcgctggc agtgttcctg cgccggttgc 2760attcgattcc tgtttgtaat tgtcctttta acagcgatcg cgtatttcgt ctcgctcagg 2820cgcaatcacg 
aatgaataac ggtttggttg atgcgagtga ttttgatgac gagcgtaatg 2880gctggcctgt tgaacaagtc tggaaagaaa tgcataaact tttgccattc tcaccggatt 2940cagtcgtcac tcatggtgat ttctcacttg ataaccttat ttttgacgag gggaaattaa 3000taggttgtat tgatgttgga cgagtcggaa tcgcagaccg ataccaggat cttgccatcc 3060tatggaactg cctcggtgag ttttctcctt cattacagaa acggcttttt caaaaatatg 3120gtattgataa tcctgatatg aataaattgc agtttcattt gatgctcgat gagtttttct 3180aatcagaatt ggttaattgg ttgtaacact ggcagagcat tacgctgact tgacgggacg 3240gcgcaagctc atgaccaaaa tcccttaacg tgagttttcg ttccactgag cgtcagaccc 3300cgtagaaaag atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt 3360gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag agctaccaac 3420tctttttccg aaggtaactg gcttcagcag agcgcagata ccaaatactg tccttctagt 3480gtagccgtag ttaggccacc acttcaagaa ctctgtagca ccgcctacat acctcgctct 3540gctaatcctg ttaccagtgg ctgctgccag tggcgataag tcgtgtctta ccgggttgga 3600ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac 3660acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc gtgagctatg 3720agaaagcgcc acgcttcccg aagggagaaa ggcggacagg tatccggtaa gcggcagggt 3780cggaacagga gagcgcacga gggagcttcc agggggaaac gcctggtatc tttatagtcc 3840tgtcgggttt cgccacctct gacttgagcg tcgatttttg tgatgctcgt caggggggcg 3900gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct tttgctggcc 3960ttttgctcac atgttctttc ctgcgttatc ccctgattct gtggataacc gtattaccgc 4020tagccaggaa gagtttgtag aaacgcaaaa aggccatccg tcaggatggc cttctgctta 4080gtttgatgcc tggcagttta tggcgggcgt cctgcccgcc accctccggg ccgttgcttc 4140acaacgttca aatccgctcc cggcggattt gtcctactca ggagagcgtt caccgacaaa 4200caacagataa aacgaaaggc ccagtcttcc gactgagcct ttcgttttat ttgatgcctg 4260gcagttccct actctcgc 4278164207DNAArtificial SequenceDescription of Artificial SequenceVector 16gttaacgcta gcatggatct cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtcaac ttcaaaaagt tccaaggtgc gaatacggaa tttcatcggg 180cgaatcttct ctcccagcga 
taaagacaag gatcgagacg atgagatgaa gccatcctcg 240tccgcaatgg atattagtca gccatataac acagtgcatc gagtccacgt tggatacgac 300ggccagaagt tcagcggact gccgcaacca tggatggata ttcttctccg agacattagt 360cttgccgatc agaagaagga tccgaacgcg gtggtgactg cgttgaagtt ctacgcacaa 420tcaatgaagg agaacgagaa gacgaaattc atgacgacga atagtgtttt cacgaatagc 480gatgacgatg atgtggacgt tcagttgacc ggacaagtca cggaacattt gaggaatttg 540cagtgtagta atggttccgc aacttcccca tctacatcag tgtcagcttc atcttcttct 600gctcgtccac tgacaaatgg aaataatcat ctttccacgg cgtcgtctac cgacacatct 660ctctcattat cggaaaggaa taacgttccg tctccagctc cagttccata tagtgaaagt 720gctccacaac tgaaaacatt caccggagag actccaaaac tgcatccacg atctccgttc 780ccgcctcaac cgccagttct tccgcaacga agcaaaaccg catcggcagt ggcgacgacg 840acgacgaatc cgacgacttc
gaatggagca ccaccaccag ttcctggatc gaaaggaccc 900ccggtgccac cgaaaccatc gcatctgaaa atcgcatcgt cgacagtatc ctcgggatgc 960tcgtctccac aacagtattc gtctgctcga tccgttggta actcgctctc caacggcagt 1020gttgtctcca caacatcgtc agatggtgat gtgcaattgt cgaataagga aaattcgaat 1080gacaaatcag ttggagacaa gaatgggaac accaccacaa acaaaacgac cgtcgaacca 1140cctccaccag aagagccacc tgttcgtgtt cgagcatctc atcgtgaaaa gctttctgat 1200tccgaagtgc tcaatcaact ccgcgagatt gttaatccaa gtaatccact tggaaagtac 1260gagatgaaga agcaaatcgg tgttggagca tccggaactg tattcgttgc taatgtggcc 1320ggcagcactg atgtggtggc tgtgaagaga atggctttca agactcagcc gaagaaggag 1380atgttgctca ccgagattaa ggttatgaag cagtatcgac acccgaacct cgtcaactac 1440attgaatcgt atctggttga tgctgatgat ctttgggtag tgatggatta tctggaaggt 1500ggaaacttga cagatgtcgt tgtgaagact gagttggacg aaggacaaat tgcagcagtt 1560ttgcaagaat gtcttaaagc gcttcacttc cttcatagac actccatagt gcaccgagat 1620atcaagagtg acaacgtgct gctcggcatg aacggagagg ttaagctcac cgatatggga 1680ttctgtgctc agattcagcc gggatcgaaa agttgtagag atactgtcgt cggaactcca 1740tattggatgt cgccggagat attgaacaag aagcagtaca actataaggt tgacatttgg 1800tcgctgggaa ttatggccct agagatgatt gatggagagc caccatattt gagagaaaca 1860cctttgaagg ctatctactt gattgctcaa aacgggaagc cagagatcaa gcaacgcgac 1920agactgtctt cagagttcaa caatttcctt gacaagtgtc ttgttgttga tccggatcag 1980agagccgata caacggagct cttggcacat ccattcctga aaaaggcgaa gccactctca 2040agcctgattc catacatcag agccgtccga gaaaagtaga cccagctttc ttgtacaaag 2100ttggcattat aagaaagcat tgcttatcaa tttgttgcaa cgaacaggtc actatcagtc 2160aaaataaaat cattatttgc catccagctg cagctctggc ccgtgtctca aaatctctga 2220tgttacattg cacaagataa aaatatatca tcatgaacaa taaaactgtc tgcttacata 2280aacagtaata caaggggtgt tatgagccat attcaacggg aaacgtcgag gccgcgatta 2340aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 2400tcaggtgcga caatctatcg cttgtatggg aagcccgatg cgccagagtt gtttctgaaa 2460catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcagact aaactggctg 2520acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga 
tgatgcatgg 2580ttactcacca ctgcgatccc cggaaaaaca gcattccagg tattagaaga atatcctgat 2640tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 2700gtttgtaatt gtccttttaa cagcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 2760atgaataacg gtttggttga tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 2820gaacaagtct ggaaagaaat gcataaactt ttgccattct caccggattc agtcgtcact 2880catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 2940gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 3000ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 3060cctgatatga ataaattgca gtttcatttg atgctcgatg agtttttcta atcagaattg 3120gttaattggt tgtaacactg gcagagcatt acgctgactt gacgggacgg cgcaagctca 3180tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga 3240tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 3300aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga 3360aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt 3420taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt 3480taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat 3540agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct 3600tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga gaaagcgcca 3660cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag 3720agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc 3780gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga 3840aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca 3900tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgct agccaggaag 3960agtttgtaga aacgcaaaaa ggccatccgt caggatggcc ttctgcttag tttgatgcct 4020ggcagtttat ggcgggcgtc ctgcccgcca ccctccgggc cgttgcttca caacgttcaa 4080atccgctccc ggcggatttg tcctactcag gagagcgttc accgacaaac aacagataaa 4140acgaaaggcc cagtcttccg actgagcctt tcgttttatt tgatgcctgg cagttcccta 4200ctctcgc 4207174066DNAArtificial SequenceDescription of Artificial SequenceVector 17gttaacgcta 
gcatggatct cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtcaac ttcaaaaagt tccaaggtgc gaatacggaa tttcatcggg 180cgaatcttct ctcccagcga taaagacaag gatcgagacg atgagatgaa gccatcctcg 240tccgcaatgg atattagtca gccatataac acagtgcatc gagtccacgt tggatacgac 300ggccagaagt tcagcggact gccgcaacca tggatggata ttcttctccg agacattagc 360tatttcagtc ttgccgatca gaagaaggat ccgaacgcgg tggtgactgc gttgaagttc 420tacgcacaat caatgaagga gaacgagaag acgaaattca tgacgacgaa tagtgttttc 480acgaatagcg atgacgatga tgtggacgtt cagttgaccg gacaagtcac ggaacatttg 540aggaatttgc agtgtagtaa tggttccgca acttccccat ctacatcagt gtcagcttca 600tcttcttctg ctcgtccact gacaaatgga aataatcatc tttccacggc gtcgtctacc 660gacacatctc tctcattatc ggaaaggaat aacgttccgt ctccagctcc agttccatat 720agtgaaagtg ctccacaact gaaaacattc accggagaga ctccaaaact gcatccacga 780tctccgttcc cgcctcaacc gccagttctt ccgcaacgaa gcaaaaccgc atcggcagtg 840gcgacgacga cgacgaatcc gacgacttcg aatggagcac caccaccagt tcctggatcg 900aaaggacccc cggtgccacc gaaaccatcg aaggaaaatt cgaatgacaa atcagttgga 960gacaagaatg ggaacaccac cacaaacaaa acgaccgtcg aaccacctcc accagaagag 1020ccacctgttc gtgttcgagc atctcatcgt gaaaagcttt ctgattccga agtgctcaat 1080caactccgcg agattgttaa tccaagtaat ccacttggaa agtacgagat gaagaagcaa 1140atcggtgttg gagcatccgg aactgtattc gttgctaatg tggccggcag cactgatgtg 1200gtggctgtga agagaatggc tttcaagact cagccgaaga aggagatgtt gctcaccgag 1260attaaggtta tgaagcagta tcgacacccg aacctcgtca actacattga atcgtatctg 1320gttgatgctg atgatctttg ggtagtgatg gattatctgg aaggtggaaa cttgacagat 1380gtcgttgtga agactgagtt ggacgaagga caaattgcag cagttttgca agaatgtctt 1440aaagcgcttc acttccttca tagacactcc atagtgcacc gagatatcaa gagtgacaac 1500gtgctgctcg gcatgaacgg agaggttaag ctcaccgata tgggattctg tgctcagatt 1560cagccgggat cgaaaagaga tactgtcgtc ggaactccat attggatgtc gccggagata 1620ttgaacaaga agcagtacaa ctataaggtt gacatttggt cgctgggaat tatggctcta 1680gagatgattg atggagagcc accatatttg agagaaacac ctttgaaggc tatctacttg 
1740attgctcaaa acgggaagcc agagatcaag caacgcgaca gactgtcttc agagttcaac 1800aatttccttg acaagtgtct tgttgttgat ccggatcaga gagccgatac aacggagctc 1860ttggcacatc cattcctgaa aaaggcgaag ccactctcaa gcctgattcc atacatcaga 1920gccgtccgag aaaagtagac ccagctttct tgtacaaagt tggcattata agaaagcatt 1980gcttatcaat ttgttgcaac gaacaggtca ctatcagtca aaataaaatc attatttgcc 2040atccagctgc agctctggcc cgtgtctcaa aatctctgat gttacattgc acaagataaa 2100aatatatcat catgaacaat aaaactgtct gcttacataa acagtaatac aaggggtgtt 2160atgagccata ttcaacggga aacgtcgagg ccgcgattaa attccaacat ggatgctgat 2220ttatatgggt ataaatgggc tcgcgataat gtcgggcaat caggtgcgac aatctatcgc 2280ttgtatggga agcccgatgc gccagagttg tttctgaaac atggcaaagg tagcgttgcc 2340aatgatgtta cagatgagat ggtcagacta aactggctga cggaatttat gcctcttccg 2400accatcaagc attttatccg tactcctgat gatgcatggt tactcaccac tgcgatcccc 2460ggaaaaacag cattccaggt attagaagaa tatcctgatt caggtgaaaa tattgttgat 2520gcgctggcag tgttcctgcg ccggttgcat tcgattcctg tttgtaattg tccttttaac 2580agcgatcgcg tatttcgtct cgctcaggcg caatcacgaa tgaataacgg tttggttgat 2640gcgagtgatt ttgatgacga gcgtaatggc tggcctgttg aacaagtctg gaaagaaatg 2700cataaacttt tgccattctc accggattca gtcgtcactc atggtgattt ctcacttgat 2760aaccttattt ttgacgaggg gaaattaata ggttgtattg atgttggacg agtcggaatc 2820gcagaccgat accaggatct tgccatccta tggaactgcc tcggtgagtt ttctccttca 2880ttacagaaac ggctttttca aaaatatggt attgataatc ctgatatgaa taaattgcag 2940tttcatttga tgctcgatga gtttttctaa tcagaattgg ttaattggtt gtaacactgg 3000cagagcatta cgctgacttg acgggacggc gcaagctcat gaccaaaatc ccttaacgtg 3060agttttcgtt ccactgagcg tcagaccccg tagaaaagat caaaggatct tcttgagatc 3120ctttttttct gcgcgtaatc tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg 3180tttgtttgcc ggatcaagag ctaccaactc tttttccgaa ggtaactggc ttcagcagag 3240cgcagatacc aaatactgtc cttctagtgt agccgtagtt aggccaccac ttcaagaact 3300ctgtagcacc gcctacatac ctcgctctgc taatcctgtt accagtggct gctgccagtg 3360gcgataagtc gtgtcttacc gggttggact caagacgata gttaccggat aaggcgcagc 3420ggtcgggctg aacggggggt tcgtgcacac 
agcccagctt ggagcgaacg acctacaccg 3480aactgagata cctacagcgt gagctatgag aaagcgccac gcttcccgaa gggagaaagg 3540cggacaggta tccggtaagc ggcagggtcg gaacaggaga gcgcacgagg gagcttccag 3600ggggaaacgc ctggtatctt tatagtcctg tcgggtttcg ccacctctga cttgagcgtc 3660gatttttgtg atgctcgtca ggggggcgga gcctatggaa aaacgccagc aacgcggcct 3720ttttacggtt cctggccttt tgctggcctt ttgctcacat gttctttcct gcgttatccc 3780ctgattctgt ggataaccgt attaccgcta gccaggaaga gtttgtagaa acgcaaaaag 3840gccatccgtc aggatggcct tctgcttagt ttgatgcctg gcagtttatg gcgggcgtcc 3900tgcccgccac cctccgggcc gttgcttcac aacgttcaaa tccgctcccg gcggatttgt 3960cctactcagg agagcgttca ccgacaaaca acagataaaa cgaaaggccc agtcttccga 4020ctgagccttt cgttttattt gatgcctggc agttccctac tctcgc 4066184207DNAArtificial SequenceDescription of Artificial SequenceVector 18gttaacgcta gcatggatct cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtcaac ttcaaaaagt tccaaggtgc gaatacggaa tttcgtcggg 180cgaatcttct ctcccagcga taaagacaag gatcgagacg atgagatgaa gccatcctcg 240tccgcaatgg atattagtca gccatataac acagtgcatc gagtccacgt tggatacgac 300ggccagaagt tcagcggact gccgcaacca tggatggata ttcttctccg agacattagt 360cttgccgatc agaagaagga tccgaacgcg gtggtgactg cgttgaagtt ctacgcacaa 420tcaatgaagg agaacgagaa gacgaaattc atgacgacga atagtgtttt cacgaatagc 480gatgacgatg atgtggacgt tcagttgacc ggacaagtca cggaacattt gaggaatttg 540cagtgtagta atggttccgc aacttcccca tctacatcag tgtcagcttc atcttcttct 600gctcgtccac tgacaaatgg aaataatcat ctttccacgg cgtcgtctac cgacacatct 660ctctcattat cggaaaggaa taacgttccg tctccagctc cagttccata tagtgaaagt 720gctccacaac tgaaaacatt caccggagag actccaaaac tgcatccacg atctccgttc 780ccgcctcaac cgccagttct tccgcaacga agcaaaaccg catcggcagt ggcgacgacg 840acgacgaatc cgacgacttc gaatggagca ccaccaccag ttcctggatc gaaaggaccc 900ccggtgccac cgaaaccatc gcatctgaaa atcgcatcgt cgacagtatc ctcgggatgc 960tcgtctccac aacagtattc gtctgctcga tccgttggta actcgctctc caacggcagt 1020gttgtctcca caacatcgtc agatggtgat 
gtgcaattgt cgaataagga aaattcgaat 1080gacaaatcag ttggagacaa gaatgggaac accaccacaa acaaaacgac cgtcgaacca 1140cctccaccag aagagccacc tgttcgtgtt cgagcatctc atcgtgaaaa gctttctgat 1200tccgaagtgc tcaatcaact ccgcgagatt gttaatccaa gtaatccact tggaaagtac 1260gagatgaaga agcaaatcgg tgttggagca tccggaactg tattcgttgc taatgtggcc 1320ggcagcactg atgtggtggc tgtgaagaga atggctttca agactcagcc gaagaaggag 1380atgttgctca ccgagattaa ggttatgaag cagtatcgac acccgaacct cgtcaactac 1440attgaatcgt atctggttga tgctgatgat ctttgggtag tgatggatta tctggaaggt 1500ggaaacttga cagatgtcgt tgtgaagact gagttggacg aaggacaaat tgcagcagtt 1560ttgcaagaat gtcttaaagc gcttcacttc cttcatagac actccatagt gcaccgagat 1620atcaagagtg acaacgtgct gctcggcatg aacggagagg ttaagctcac cgatatggga 1680ttctgtgctc agattcagcc gggatcgaaa agttgtagag atactgtcgt cggaactcca 1740tattggatgt cgccggagat attgaacaag aagcagtaca actataaggt tgacatttgg 1800tcgctgggaa ttatggctct agagatgatt gatggagagc caccatattt gagagaaaca 1860cctttgaagg ctatctactt gattgctcaa aacgggaagc cagagatcaa gcaacgcgac 1920agactgtctt cagagttcaa caatttcctt gacaagtgtc ttgttgttga tccggatcag 1980agagccgata caacggagct cttggcacat ccattcctga aaaaggcgaa gccactctca 2040agcctgattc catacatcag agccgtccga gaaaagtaga cccagctttc ttgtacaaag 2100ttggcattat aagaaagcat tgcttatcaa tttgttgcaa cgaacaggtc actatcagtc 2160aaaataaaat cattatttgc catccagctg cagctctggc ccgtgtctca aaatctctga 2220tgttacattg cacaagataa aaatatatca tcatgaacaa taaaactgtc tgcttacata 2280aacagtaata caaggggtgt tatgagccat attcaacggg aaacgtcgag gccgcgatta 2340aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 2400tcaggtgcga caatctatcg cttgtatggg aagcccgatg cgccagagtt gtttctgaaa 2460catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcagact aaactggctg 2520acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga tgatgcatgg 2580ttactcacca ctgcgatccc cggaaaaaca gcattccagg tattagaaga atatcctgat 2640tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca ttcgattcct 2700gtttgtaatt gtccttttaa cagcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 
2760atgaataacg gtttggttga tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 2820gaacaagtct ggaaagaaat gcataaactt ttgccattct caccggattc agtcgtcact 2880catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat aggttgtatt 2940gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct atggaactgc 3000ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg tattgataat 3060cctgatatga ataaattgca gtttcatttg atgctcgatg agtttttcta atcagaattg 3120gttaattggt tgtaacactg gcagagcatt acgctgactt gacgggacgg cgcaagctca 3180tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga 3240tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 3300aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga 3360aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt 3420taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt 3480taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat 3540agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct 3600tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga gaaagcgcca 3660cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag 3720agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc 3780gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga 3840aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca 3900tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgct agccaggaag 3960agtttgtaga aacgcaaaaa ggccatccgt caggatggcc ttctgcttag tttgatgcct 4020ggcagtttat ggcgggcgtc ctgcccgcca ccctccgggc cgttgcttca caacgttcaa 4080atccgctccc ggcggatttg tcctactcag gagagcgttc accgacaaac aacagataaa 4140acgaaaggcc cagtcttccg actgagcctt tcgttttatt tgatgcctgg cagttcccta 4200ctctcgc 42071929DNAArtificial Sequenceprimer 19ggggacaagt ttgtacaaaa aagcaggct 292029DNAArtificial Sequenceprimer 20ggggaccact ttgtacaaga aagctgggt 292140DNAArtificial Sequenceprimer 21aaaaagcagg ctcaaaaatg tttcaaaata gtccgatgat 402233DNAArtificial Sequenceprimer 22agaaagctgg gtctactttt ctcggacggc tct 
332334DNAArtificial Sequenceprimer 23aaaaagcagg ctggtttaat tacccaagtt tgag 342433DNAArtificial Sequenceprimer 24agaaagctgg gtctactttt ctcggacggc tct 332541DNAArtificial Sequenceprimer 25aaaaagcagg ctcaaaaatg tcaacttcaa aaagttccaa g 41264447DNAArtificial SequenceDescription of Artificial SequenceVector 26aacctggctt atcgaaatta atacgactca ctatagggag accggcagat ctgatatcac 60aagtttgtac aaaaaagcag gctcaaaaat gaaagctttc tcatcgtatg atgagaaacc 120accagcacca ccaattcgtt tcagcagctc ggcaacgagg gagaatcagg tcgtcggatt 180gaagccattg cccaaagagc cagaagcaac caagaaaaag aagacgatgc ctaacccgtt 240catgaaaaag aacaaagaca aaaaggaagc gtcagaaaaa ccagtgatct ctcgaccgag 300caatttcgaa cacacaattc atgtcggata tgacccaaaa accggcgaat ttacgggaat 360gcctgaagca tgggcacgtc ttctcacaga ctcacagatc tcaaaacaag agcagcaaca 420gaatcctcag gcagtgttgg acgcgctcaa atactacaca caaggcgaaa gcagcggcca 480gaagtggttg cagtacgata tgaatgacgc accttctcgg acgccatcat acggactgaa 540accgcaacca tatagcacat catccctgcc gtatcatggc aataaaattc aggatccaag 600aaagatgaat ccaatgacaa ccagtacaag tagtgcgggg tataacagca agcaaggagt 660tcctccgacg acgtttagtg taaatgagaa tagatcgagt atgccaccga gttatgcacc 720gccaccggtc ccccatggtg aaactcctgc tgatattgtt cctcccgcta tccctgatag 780gccggcaagg acgttgagta tttacacaaa accgaaagag gaggaagaaa aaattccaga 840cctttcaaaa ggacaatttg gtgtacaggc cagaggtcaa aaagctaaga aaaagatgac 900tgacgctgaa gtgctgacta agctccgtac cattgtgtct atcggaaatc cagatcgaaa 960atatagaaaa gttgataaaa tcggctcagg tgcatctggt tctgtgtaca ccgctattga 1020aattagtacc gaagcggagg tggctatcaa gcagatgaac ctgaaggatc aaccaaagaa 1080ggaattgatc attaatgaga ttttggtgat gcgtgagaat aagcatgcaa atattgtaaa 1140ttatttggat tcgtatttgg tgtgcgatga attatgggta gtgatggagt atcttgccgg 1200tggatcattg actgatgttg tcacggagtg ccagatggag gatggaatta ttgcagctgt 1260ttgcagagaa gttcttcaag cgcttgaatt cctccacagc cgccacgtca ttcacagaga 1320tattaaatct gacaatattc ttttgggaat ggatggttcg gtgaaattga ccgactttgg 1380attctgtgct cagctctcgc cggagcaaag aaaacgcacg acaatggtcg gaactccata 1440ctggatggcg ccggaagtgg 
tgacccgcaa acaatacgga cccaaggttg atgtgtggtc 1500cttgggaatc atggcgattg agatggtcga aggagaaccg ccatatttga atgaaaatcc 1560actcagggct atctatctca ttgctacaaa tggcaaaccc gacttccctg gaagagattc 1620catgactttg ttgttcaagg actttgtcga ctctgcgttg gaagtacaag ttgaaaatcg 1680atggtcggca agccaactcc ttacgcatcc attcctccga tgcgccaaac cgcttgcttc 1740actgtactac ttaatcgttg cggcgaagaa gagcatcgcc gaagctagca actcataaac 1800ccagctttct tgtacaaagt ggtgatatca agcttatcga taccgtcgac ctcgaggggg 1860ggcccggtac ccaattcgcc ctatagtgag tcgtattacg cgcgctcact ggccgtcgtt 1920ttacaacgtc gtgactggga aaaccctggc gttacccaac ttaatcgcct tgcagcacat 1980ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc ttcccaacag 2040ttgcgcagcc tgaatggcga atgggacgcg ccctgtagcg gcgcattaag cgcggcgggt 2100gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg ccctagcgcc cgctcctttc 2160gctttcttcc cttcctttct cgccacgttc gccggctttc cccgtcaagc tctaaatcgg 2220gggctccctt tagggttccg atttagtgct ttacggcacc tcgaccccaa aaaacttgat 2280tagggtgatg gttcacgtag tgggccatcg ccctgataga cggtttttcg ccctttgacg 2340ttggagtcca cgttctttaa tagtggactc ttgttccaaa ctggaacaac actcaaccct 2400atctcggtct attcttttga tttataaggg attttgccga tttcggccta ttggttaaaa 2460aatgagctga tttaacaaaa atttaacgcg aattttaaca aaatattaac
gcttacaatt 2520taggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac 2580attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa taatattgaa 2640aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat 2700tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc 2760agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag atccttgaga 2820gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg ctatgtggcg 2880cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcata cactattctc 2940agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat ggcatgacag 3000taagagaatt atgcagtgct gccataacca tgagtgataa cactgcggcc aacttacttc 3060tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg ggggatcatg 3120taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac gacgagcgtg 3180acaccacgat gcctgtagca atggcaacaa cgttgcgcaa actattaact ggcgaactac 3240ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa gttgcaggac 3300cacttctgcg ctcggccctt ccggctggct ggtttattgc tgataaatct ggagccggtg 3360agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc tcccgtatcg 3420tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga cagatcgctg 3480agataggtgc ctcactgatt aagcattggt aactgtcaga ccaagtttac tcatatatac 3540tttagattga tttaaaactt catttttaat ttaaaaggat ctaggtgaag atcctttttg 3600ataatctcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg tcagaccccg 3660tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc tgctgcttgc 3720aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag ctaccaactc 3780tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtc cttctagtgt 3840agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac ctcgctctgc 3900taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc gggttggact 3960caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt tcgtgcacac 4020agcccagctt ggagcgaacg acctacaccg aactgagata cctacagcgt gagctatgag 4080aaagcgccac gcttcccgaa gggagaaagg cggacaggta tccggtaagc ggcagggtcg 4140gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt tatagtcctg 4200tcgggtttcg ccacctctga 
cttgagcgtc gatttttgtg atgctcgtca ggggggcgga 4260gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt tgctggcctt 4320ttgctcacat gttctttcct gcgttatccc ctgattctgt ggataaccgt attaccgcct 4380ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag tcagtgagcg 4440aggaagc 4447273533DNAArtificial SequenceDescription of Artificial SequenceVector 27aacctggctt atcgaaatta atacgactca ctatagggag accggcagat ctgatatcac 60aagtttgtac aaaaaagcag gctcaaatcg gtgttggagc atccggaact gtattcgttg 120ctaatgtggc cggcagcact gatgtggtgg ctgtgaagag aatggctttc aagactcagc 180cgaagaagga gatgttgctc accgagatta aggttatgaa gcagtatcga cacccgaacc 240tcgtcaacta cattgaatcg tatctggttg atgctgatga tctttgggta gtgatggatt 300atctggaagg tggaaacttg acagatgtcg ttgtgaagac tgagttggac gaaggacaaa 360ttgcagcagt tttgcaagaa tgtcttaaag cgcttcactt ccttcataga cactccatag 420tgcaccgaga tatcaagagt gacaacgtgc tgctcggcat gaacggagag gttaagctca 480ccgatatggg attctgtgct cagattcagc cgggatcgaa aagagatact gtcgtcggaa 540ctccatattg gatgtcgccg gagatattga acaagaagca gtacaactat aaggttgaca 600tttggtcgct gggaattatg gctctagaga tgattgatgg agagccacca tatttgagag 660aaacaccttt gaaggctatc tacttgattg ctcaaaacgg gaagccagag atcaagcaac 720gcgacagact gtcttcagag ttcaacaatt tccttgacaa gtgtcttgtt gttgatccgg 780atcagagagc cgatacaacg gagctcttgg cacatccatt cctgaaaaag gcgaagccac 840tctcaagcct gattccatac atcagagccg tccgagaaaa gtagacccag ctttcttgta 900caaagtggtg atatcaagct tatcgatacc gtcgacctcg agggggggcc cggtacccaa 960ttcgccctat agtgagtcgt attacgcgcg ctcactggcc gtcgttttac aacgtcgtga 1020ctgggaaaac cctggcgtta cccaacttaa tcgccttgca gcacatcccc ctttcgccag 1080ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc caacagttgc gcagcctgaa 1140tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg 1200cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt tcttcccttc 1260ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc tccctttagg 1320gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgattagg gtgatggttc 1380acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg agtccacgtt 
1440ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct cggtctattc 1500ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg agctgattta 1560acaaaaattt aacgcgaatt ttaacaaaat attaacgctt acaatttagg tggcactttt 1620cggggaaatg tgcgcggaac ccctatttgt ttatttttct aaatacattc aaatatgtat 1680ccgctcatga gacaataacc ctgataaatg cttcaataat attgaaaaag gaagagtatg 1740agtattcaac atttccgtgt cgcccttatt cccttttttg cggcattttg ccttcctgtt 1800tttgctcacc cagaaacgct ggtgaaagta aaagatgctg aagatcagtt gggtgcacga 1860gtgggttaca tcgaactgga tctcaacagc ggtaagatcc ttgagagttt tcgccccgaa 1920gaacgttttc caatgatgag cacttttaaa gttctgctat gtggcgcggt attatcccgt 1980attgacgccg ggcaagagca actcggtcgc cgcatacact attctcagaa tgacttggtt 2040gagtactcac cagtcacaga aaagcatctt acggatggca tgacagtaag agaattatgc 2100agtgctgcca taaccatgag tgataacact gcggccaact tacttctgac aacgatcgga 2160ggaccgaagg agctaaccgc ttttttgcac aacatggggg atcatgtaac tcgccttgat 2220cgttgggaac cggagctgaa tgaagccata ccaaacgacg agcgtgacac cacgatgcct 2280gtagcaatgg caacaacgtt gcgcaaacta ttaactggcg aactacttac tctagcttcc 2340cggcaacaat taatagactg gatggaggcg gataaagttg caggaccact tctgcgctcg 2400gcccttccgg ctggctggtt tattgctgat aaatctggag ccggtgagcg tgggtctcgc 2460ggtatcattg cagcactggg gccagatggt aagccctccc gtatcgtagt tatctacacg 2520acggggagtc aggcaactat ggatgaacga aatagacaga tcgctgagat aggtgcctca 2580ctgattaagc attggtaact gtcagaccaa gtttactcat atatacttta gattgattta 2640aaacttcatt tttaatttaa aaggatctag gtgaagatcc tttttgataa tctcatgacc 2700aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag accccgtaga aaagatcaaa 2760ggatcttctt gagatccttt ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca 2820ccgctaccag cggtggtttg tttgccggat caagagctac caactctttt tccgaaggta 2880actggcttca gcagagcgca gataccaaat actgtccttc tagtgtagcc gtagttaggc 2940caccacttca agaactctgt agcaccgcct acatacctcg ctctgctaat cctgttacca 3000gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag acgatagtta 3060ccggataagg cgcagcggtc gggctgaacg gggggttcgt gcacacagcc cagcttggag 3120cgaacgacct acaccgaact gagataccta 
cagcgtgagc tatgagaaag cgccacgctt 3180cccgaaggga gaaaggcgga caggtatccg gtaagcggca gggtcggaac aggagagcgc 3240acgagggagc ttccaggggg aaacgcctgg tatctttata gtcctgtcgg gtttcgccac 3300ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac 3360gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc tcacatgttc 3420tttcctgcgt tatcccctga ttctgtggat aaccgtatta ccgcctttga gtgagctgat 3480accgctcgcc gcagccgaac gaccgagcgc agcgagtcag tgagcgagga agc 3533284323DNAArtificial SequenceDescription of Artificial SequenceVector 28aacctggctt atcgaaatta atacgactca ctatagggag accggcagat ctgatatcac 60aagtttgtac aaaaaagcag gctcaaatcg gtgttggagc atccggaact gtattcgttg 120ctaatgtggc cggcagcact gatgtggtgg ctgtgaagag aatggctttc aagactcagc 180cgaagaagga gatgttgctc accgagatta aggttatgaa gcagtatcga cacccgaacc 240tcgtcaacta cattgaatcg tatctggttg atgctgatga tctttgggta gtgatggatt 300atctggaagg tggaaacttg acagatgtcg ttgtgaagac tgagttggac gaaggacaaa 360ttgcagcagt tttgcaagaa tgtcttaaag cgcttcactt ccttcataga cactccatag 420tgcaccgaga tatcaagagt gacaacgtgc tgctcggcat gaacggagag gttaagctca 480ccgatatggg attctgtgct cagattcagc cgggatcgaa aagagatact gtcgtcggaa 540ctccatattg gatgtcgccg gagatattga acaagaagca gtacaactat aaggttgaca 600tttggtcgct gggaattatg gctctagaga tgattgatgg agagccacca tatttgagag 660aaacaccttt gaaggctatc tacttgattg ctcaaaacgg gaagccagag atcaagcaac 720gcgacagact gtcttcagag ttcaacaatt tccttgacaa gtgtcttgtt gttgatccgg 780atcagagagc cgatacaacg gagctcttgg cacatccatt cctgaaaaag gcgaagccac 840tctcaagcct gattccatac atcagagccg tccgagaaaa gtagcaccgc tattgaaatt 900agtaccgaag cggaggtggc tatcaagcag atgaacctga aggatcaacc aaagaaggaa 960ttgatcatta atgagatttt ggtgatgcgt gagaataagc atgcaaatat tgtaaattat 1020ttggattcgt atttggtgtg cgatgaatta tgggtagtga tggagtatct tgccggtgga 1080tcattgactg atgttgtcac ggagtgccag atggaggatg gaattattgc agctgtttgc 1140agagaagttc ttcaagcgct tgaattcctc cacagccgcc acgtcattca cagagatatt 1200aaatctgaca atattctttt gggaatggat ggttcggtga aattgaccga ctttggattc 1260tgtgctcagc tctcgccgga 
gcaaagaaaa cgcacgacaa tggtcggaac tccatactgg 1320atggcgccgg aagtggtgac ccgcaaacaa tacggaccca aggttgatgt gtggtccttg 1380ggaatcatgg cgattgagat ggtcgaagga gaaccgccat atttgaatga aaatccactc 1440agggctatct atctcattgc tacaaatggc aaacccgact tccctggaag agattccatg 1500actttgttgt tcaaggactt tgtcgactct gcgttggaag tacaagttga aaatcgatgg 1560tcggcaagcc aactccttac gcatccattc ctccgatgcg ccaaaccgct tgcttcactg 1620tactacttaa tcgttgcggc gaagaagagc atcgccgaag ctagcaactc ataaacccag 1680ctttcttgta caaagtggtg atatcaagct tatcgatacc gtcgacctcg agggggggcc 1740cggtacccaa ttcgccctat agtgagtcgt attacgcgcg ctcactggcc gtcgttttac 1800aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa tcgccttgca gcacatcccc 1860ctttcgccag ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc caacagttgc 1920gcagcctgaa tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg 1980tggttacgcg cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt 2040tcttcccttc ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc 2100tccctttagg gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgattagg 2160gtgatggttc acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg 2220agtccacgtt ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct 2280cggtctattc ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg 2340agctgattta acaaaaattt aacgcgaatt ttaacaaaat attaacgctt acaatttagg 2400tggcactttt cggggaaatg tgcgcggaac ccctatttgt ttatttttct aaatacattc 2460aaatatgtat ccgctcatga gacaataacc ctgataaatg cttcaataat attgaaaaag 2520gaagagtatg agtattcaac atttccgtgt cgcccttatt cccttttttg cggcattttg 2580ccttcctgtt tttgctcacc cagaaacgct ggtgaaagta aaagatgctg aagatcagtt 2640gggtgcacga gtgggttaca tcgaactgga tctcaacagc ggtaagatcc ttgagagttt 2700tcgccccgaa gaacgttttc caatgatgag cacttttaaa gttctgctat gtggcgcggt 2760attatcccgt attgacgccg ggcaagagca actcggtcgc cgcatacact attctcagaa 2820tgacttggtt gagtactcac cagtcacaga aaagcatctt acggatggca tgacagtaag 2880agaattatgc agtgctgcca taaccatgag tgataacact gcggccaact tacttctgac 2940aacgatcgga ggaccgaagg agctaaccgc ttttttgcac aacatggggg 
atcatgtaac 3000tcgccttgat cgttgggaac cggagctgaa tgaagccata ccaaacgacg agcgtgacac 3060cacgatgcct gtagcaatgg caacaacgtt gcgcaaacta ttaactggcg aactacttac 3120tctagcttcc cggcaacaat taatagactg gatggaggcg gataaagttg caggaccact 3180tctgcgctcg gcccttccgg ctggctggtt tattgctgat aaatctggag ccggtgagcg 3240tgggtctcgc ggtatcattg cagcactggg gccagatggt aagccctccc gtatcgtagt 3300tatctacacg acggggagtc aggcaactat ggatgaacga aatagacaga tcgctgagat 3360aggtgcctca ctgattaagc attggtaact gtcagaccaa gtttactcat atatacttta 3420gattgattta aaacttcatt tttaatttaa aaggatctag gtgaagatcc tttttgataa 3480tctcatgacc aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag accccgtaga 3540aaagatcaaa ggatcttctt gagatccttt ttttctgcgc gtaatctgct gcttgcaaac 3600aaaaaaacca ccgctaccag cggtggtttg tttgccggat caagagctac caactctttt 3660tccgaaggta actggcttca gcagagcgca gataccaaat actgtccttc tagtgtagcc 3720gtagttaggc caccacttca agaactctgt agcaccgcct acatacctcg ctctgctaat 3780cctgttacca gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag 3840acgatagtta ccggataagg cgcagcggtc gggctgaacg gggggttcgt gcacacagcc 3900cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc tatgagaaag 3960cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca gggtcggaac 4020aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata gtcctgtcgg 4080gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct 4140atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc 4200tcacatgttc tttcctgcgt tatcccctga ttctgtggat aaccgtatta ccgcctttga 4260gtgagctgat accgctcgcc gcagccgaac gaccgagcgc agcgagtcag tgagcgagga 4320agc 4323292865DNAArtificial SequenceDescription of Artificial SequenceVector 29aacctggctt atcgaaatta atacgactca ctatagggag accggcagat ctgatatcac 60aagtttgtac aaaaaagcag gctcaaaaat gtttcaaaat agtccgatga tgtacgactg 120gtggaatgac accaccaaac cgaaacacca gcagccgaca cttaacgtgt tgtcaccatg 180gggagcatat ttcaatcaca ttggaaatga actgctaccc agctttcttg tacaaagtgg 240tgatatcaag cttatcgata ccgtcgacct cgaggggggg cccggtaccc aattcgccct 300atagtgagtc gtattacgcg 
cgctcactgg ccgtcgtttt acaacgtcgt gactgggaaa 360accctggcgt tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta 420atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat 480gggacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 540ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 600ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 660ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 720ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 780gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 840tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 900ttaacgcgaa ttttaacaaa atattaacgc ttacaattta ggtggcactt ttcggggaaa 960tgtgcgcgga acccctattt gtttattttt ctaaatacat tcaaatatgt atccgctcat 1020gagacaataa ccctgataaa tgcttcaata atattgaaaa aggaagagta tgagtattca 1080acatttccgt gtcgccctta ttcccttttt tgcggcattt tgccttcctg tttttgctca 1140cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac gagtgggtta 1200catcgaactg gatctcaaca gcggtaagat ccttgagagt tttcgccccg aagaacgttt 1260tccaatgatg agcactttta aagttctgct atgtggcgcg gtattatccc gtattgacgc 1320cgggcaagag caactcggtc gccgcataca ctattctcag aatgacttgg ttgagtactc 1380accagtcaca gaaaagcatc ttacggatgg catgacagta agagaattat gcagtgctgc 1440cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg gaggaccgaa 1500ggagctaacc gcttttttgc acaacatggg ggatcatgta actcgccttg atcgttggga 1560accggagctg aatgaagcca taccaaacga cgagcgtgac accacgatgc ctgtagcaat 1620ggcaacaacg ttgcgcaaac tattaactgg cgaactactt actctagctt cccggcaaca 1680attaatagac tggatggagg cggataaagt tgcaggacca cttctgcgct cggcccttcc 1740ggctggctgg tttattgctg ataaatctgg agccggtgag cgtgggtctc gcggtatcat 1800tgcagcactg gggccagatg gtaagccctc ccgtatcgta gttatctaca cgacggggag 1860tcaggcaact atggatgaac gaaatagaca gatcgctgag ataggtgcct cactgattaa 1920gcattggtaa ctgtcagacc aagtttactc atatatactt tagattgatt taaaacttca 1980tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga ccaaaatccc 
2040ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc 2100ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 2160agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt 2220cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag gccaccactt 2280caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc 2340tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa 2400ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac 2460ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg 2520gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga 2580gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact 2640tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa 2700cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt tctttcctgc 2760gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg ataccgctcg 2820ccgcagccga acgaccgagc gcagcgagtc agtgagcgag gaagc 2865303525DNAArtificial SequenceDescription of Artificial SequenceVector 30aacctggctt atcgaaatta atacgactca ctatagggag accggcagat ctgatatcac 60aagtttgtac aaaaaagcag gctcaaaaat gtcaacttca aaaagttcca aggtgcgaat 120acggaatttc atcgggcgaa tcttctctcc cagcgataaa gacaaggatc gagacgatga 180gatgaagcca tcctcgtccg caatggatat tagtcagcca tataacacag tgcatcgagt 240ccacgttgga tacgacggcc agaagttcag cggactgccg caaccatgga tggatattct 300tctccgagac attagtcttg ccgatcagaa gaaggatccg aacgcggtgg tgactgcgtt 360gaagttctac gcacaatcaa tgaaggagaa cgagaagacg aaattcatga cgacgaatag 420tgttttcacg aatagcgatg acgatgatgt ggacgttcag ttgaccggac aagtcacgga 480acatttgagg aatttgcagt gtagtaatgg ttccgcaact tccccatcta catcagtgtc 540agcttcatct tcttctgctc gtccactgac aaatggaaat aatcatcttt ccacggcgtc 600gtctaccgac acatctctct cattatcgga aaggaataac gttccgtctc cagctccagt 660tccatatagt gaaagtgctc cacaactgaa aacattcacc ggagagactc caaaactgca 720tccacgatct ccgttcccgc ctcaaccgcc agttcttccg caacgaagca aaaccgcatc 780ggcagtggcg acgacgacga cgaatccgac gacttcgaat ggagcaccac 
caccagttcc 840tggatcgaaa ggacccccgg tgccaccgaa accatcaccc agctttcttg tacaaagtgg 900tgatatcaag cttatcgata ccgtcgacct cgaggggggg cccggtaccc aattcgccct 960atagtgagtc gtattacgcg cgctcactgg ccgtcgtttt acaacgtcgt gactgggaaa 1020accctggcgt tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta 1080atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat 1140gggacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 1200ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 1260ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 1320ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 1380ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 1440gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 1500tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 1560ttaacgcgaa ttttaacaaa atattaacgc ttacaattta ggtggcactt ttcggggaaa 1620tgtgcgcgga acccctattt gtttattttt ctaaatacat tcaaatatgt atccgctcat 1680gagacaataa ccctgataaa tgcttcaata atattgaaaa aggaagagta tgagtattca 1740acatttccgt gtcgccctta ttcccttttt tgcggcattt tgccttcctg tttttgctca 1800cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac gagtgggtta 1860catcgaactg gatctcaaca gcggtaagat ccttgagagt tttcgccccg aagaacgttt 1920tccaatgatg agcactttta aagttctgct atgtggcgcg gtattatccc gtattgacgc 1980cgggcaagag caactcggtc gccgcataca ctattctcag aatgacttgg
ttgagtactc 2040accagtcaca gaaaagcatc ttacggatgg catgacagta agagaattat gcagtgctgc 2100cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg gaggaccgaa 2160ggagctaacc gcttttttgc acaacatggg ggatcatgta actcgccttg atcgttggga 2220accggagctg aatgaagcca taccaaacga cgagcgtgac accacgatgc ctgtagcaat 2280ggcaacaacg ttgcgcaaac tattaactgg cgaactactt actctagctt cccggcaaca 2340attaatagac tggatggagg cggataaagt tgcaggacca cttctgcgct cggcccttcc 2400ggctggctgg tttattgctg ataaatctgg agccggtgag cgtgggtctc gcggtatcat 2460tgcagcactg gggccagatg gtaagccctc ccgtatcgta gttatctaca cgacggggag 2520tcaggcaact atggatgaac gaaatagaca gatcgctgag ataggtgcct cactgattaa 2580gcattggtaa ctgtcagacc aagtttactc atatatactt tagattgatt taaaacttca 2640tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga ccaaaatccc 2700ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc 2760ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 2820agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt 2880cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag gccaccactt 2940caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc 3000tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa 3060ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac 3120ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg 3180gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga 3240gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact 3300tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa 3360cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt tctttcctgc 3420gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg ataccgctcg 3480ccgcagccga acgaccgagc gcagcgagtc agtgagcgag gaagc 3525313292DNAArtificial SequenceDescription of Artificial SequenceVector 31aacctggctt atcgaaatta atacgactca ctatagggag accggcagat ctgatatcat 60cgatgaattc gagctccacc gcggtggcgg ccgctctaga actagtggat cccccgggct 120gcaggaattc cgcccgtcgg taaaacgtgt 
ctcctgatat cctacaccac aaacgcattt 180cccggagaat atattccgac ggtattcgac aactactcag caaatgtgat ggtcgacggt 240cggccgataa atctcgggct ctgggataca gctggacagg aagattacga tcgactccga 300ccactgtcat atccacaaac agacgtgttt ctcgtatgct ttgccctgaa caatccggcg 360agttttgaga atgttcgtgc gaaatggtat ccagaagtgt cacatcattg cccgaatacg 420ccgattattt tggttggaac gaaagctgat ctgcgtgagg atcgagatac tgttgaacgg 480ctccgcgaac gccggctcca accagtgagc caaacccagg gctacgtgat ggcaaaggaa 540atcaaggctg tcaagtatct ggagtgctcg gcgctcacgc aacgtggtct gaaacaagtt 600ttcgatgagg cgatccgagc cgtgctcacg ccgccacaaa gagccaaaaa gagcaagtgg 660gcgaattcga tatcaagctt atcgataccg tcgacctcga gggggggccc ggtacccaat 720tcgccctata gtgagtcgta ttacgcgcgc tcactggccg tcgttttaca acgtcgtgac 780tgggaaaacc ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc 840tggcgtaata gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat 900ggcgaatggg acgcgccctg tagcggcgca ttaagcgcgg cgggtgtggt ggttacgcgc 960agcgtgaccg ctacacttgc cagcgcccta gcgcccgctc ctttcgcttt cttcccttcc 1020tttctcgcca cgttcgccgg ctttccccgt caagctctaa atcgggggct ccctttaggg 1080ttccgattta gtgctttacg gcacctcgac cccaaaaaac ttgattaggg tgatggttca 1140cgtagtgggc catcgccctg atagacggtt tttcgccctt tgacgttgga gtccacgttc 1200tttaatagtg gactcttgtt ccaaactgga acaacactca accctatctc ggtctattct 1260tttgatttat aagggatttt gccgatttcg gcctattggt taaaaaatga gctgatttaa 1320caaaaattta acgcgaattt taacaaaata ttaacgctta caatttaggt ggcacttttc 1380ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 1440cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 1500gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 1560ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 1620tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 1680aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgta 1740ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 1800agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 1860gtgctgccat 
aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 1920gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 1980gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 2040tagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 2100ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 2160cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg 2220gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga 2280cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac 2340tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa 2400aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca 2460aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 2520gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 2580cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 2640ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 2700accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 2760tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 2820cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 2880gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 2940ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 3000cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 3060tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 3120ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 3180ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata 3240ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gc 3292323323DNAArtificial SequenceDescription of Artificial SequenceVector 32aacctggctt atcgaaatta atacgactca ctatagggag accggcagat ctgatatcat 60cgatgaattc gagctccacc gcggtggcgg ccgctctaga actagtggat cccccgggct 120gcaggaattc cgccctcgag gcagatcaaa tgtgtagttg ttggagacgg aacagttgga 180aaaacatgca tgttaatatc ttacacaact gactcttttc cagttcagta 
tgtgcctaca 240gtatttgata actattcggc acagatgagt cttgatggga acgttgtgaa cttaggattg 300tgggatactg ctggacagga ggattatgat cgtttacgac cactttccta cccacagacg 360gatgttttca ttctctgctt ctctgtcgtc tcgcccgtat cgtttgacaa tgtggcaagc 420aagtggattc cggaaatacg acagcattgt ccagatgcgc ctgtcattct agttggtacc 480aaactcgatt tgcgcgacga ggccgaaccg atgcgtgctc tgcaggccga aggaaagtcc 540ccaatttcca aaacgcaagg catgaaaatg gctcaaaaaa ttaaagctgt caagtatttg 600gaatgctctg cattgacgca acagggactc acacaggtgt tcgaagacgc cgtacggtcc 660attcttcatc cgaaaccaca gaaaaagaag ggcgaattcg atatcaagct tatcgatacc 720gtcgacctcg agggggggcc cggtacccaa ttcgccctat agtgagtcgt attacgcgcg 780ctcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa 840tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg cccgcaccga 900tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg gacgcgccct gtagcggcgc 960attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg ccagcgccct 1020agcgcccgct cctttcgctt tcttcccttc ctttctcgcc acgttcgccg gctttccccg 1080tcaagctcta aatcgggggc tccctttagg gttccgattt agtgctttac ggcacctcga 1140ccccaaaaaa cttgattagg gtgatggttc acgtagtggg ccatcgccct gatagacggt 1200ttttcgccct ttgacgttgg agtccacgtt ctttaatagt ggactcttgt tccaaactgg 1260aacaacactc aaccctatct cggtctattc ttttgattta taagggattt tgccgatttc 1320ggcctattgg ttaaaaaatg agctgattta acaaaaattt aacgcgaatt ttaacaaaat 1380attaacgctt acaatttagg tggcactttt cggggaaatg tgcgcggaac ccctatttgt 1440ttatttttct aaatacattc aaatatgtat ccgctcatga gacaataacc ctgataaatg 1500cttcaataat attgaaaaag gaagagtatg agtattcaac atttccgtgt cgcccttatt 1560cccttttttg cggcattttg ccttcctgtt tttgctcacc cagaaacgct ggtgaaagta 1620aaagatgctg aagatcagtt gggtgcacga gtgggttaca tcgaactgga tctcaacagc 1680ggtaagatcc ttgagagttt tcgccccgaa gaacgttttc caatgatgag cacttttaaa 1740gttctgctat gtggcgcggt attatcccgt attgacgccg ggcaagagca actcggtcgc 1800cgcatacact attctcagaa tgacttggtt gagtactcac cagtcacaga aaagcatctt 1860acggatggca tgacagtaag agaattatgc agtgctgcca taaccatgag tgataacact 1920gcggccaact tacttctgac aacgatcgga 
ggaccgaagg agctaaccgc ttttttgcac 1980aacatggggg atcatgtaac tcgccttgat cgttgggaac cggagctgaa tgaagccata 2040ccaaacgacg agcgtgacac cacgatgcct gtagcaatgg caacaacgtt gcgcaaacta 2100ttaactggcg aactacttac tctagcttcc cggcaacaat taatagactg gatggaggcg 2160gataaagttg caggaccact tctgcgctcg gcccttccgg ctggctggtt tattgctgat 2220aaatctggag ccggtgagcg tgggtctcgc ggtatcattg cagcactggg gccagatggt 2280aagccctccc gtatcgtagt tatctacacg acggggagtc aggcaactat ggatgaacga 2340aatagacaga tcgctgagat aggtgcctca ctgattaagc attggtaact gtcagaccaa 2400gtttactcat atatacttta gattgattta aaacttcatt tttaatttaa aaggatctag 2460gtgaagatcc tttttgataa tctcatgacc aaaatccctt aacgtgagtt ttcgttccac 2520tgagcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc 2580gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat 2640caagagctac caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat 2700actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct 2760acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt 2820cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc gggctgaacg 2880gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta 2940cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg 3000gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg 3060tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc 3120tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg 3180gccttttgct ggccttttgc tcacatgttc tttcctgcgt tatcccctga ttctgtggat 3240aaccgtatta ccgcctttga gtgagctgat accgctcgcc gcagccgaac gaccgagcgc 3300agcgagtcag tgagcgagga agc 3323331281DNACaenorhabditis elegans 33atgtttcaaa atagtccgat gatgtacgac tggtggaatg acaccaccaa accgaaacac 60cagcagccga cacttaacgt gttgtcacca tggggagcat atttcaatca cattggaaat 120gaactgctgc atctgaaaat cgcatcgtcg acagtatcct cgggatgctc gtctccacaa 180cagtattcgt ctgctcgatc cgttggtaac tcgctctcca acggcagtgt tgtctccaca 240acatcgtcag atggtgatgt gcaattgtcg aataaggaaa attcgaatga caaatcagtt 300ggagacaaga 
atgggaacac caccacaaac aaaacgaccg tcgaaccacc tccaccagaa 360gagccacctg ttcgtgttcg agcatctcat cgtgaaaagc tttctgattc cgaagtgctc 420aatcaactcc gcgagattgt taatccaagt aatccacttg gaaagtacga gatgaagaag 480caaatcggtg ttggagcatc cggaactgta ttcgttgcta atgtggccgg cagcactgat 540gtggtggctg tgaagagaat ggctttcaag actcagccga agaaggagat gttgctcacc 600gagattaagg ttatgaagca gtatcgacac ccgaacctcg tcaactacat tgaatcgtat 660ctggttgatg ctgatgatct ttgggtagtg atggattatc tggaaggtgg aaacttgaca 720gatgtcgttg tgaagactga gttggacgaa ggacaaattg cagcagtttt gcaagaatgt 780cttaaagcgc ttcacttcct tcatagacac tccatagtgc accgagatat caagagtgac 840aacgtgctgc tcggcatgaa cggagaggtt aagctcaccg atatgggatt ctgtgctcag 900attcagccgg gatcgaaaag agatactgtc gtcggaactc catattggat gtcgccggag 960atattgaaca agaagcagta caactataag gttgacattt ggtcgctggg aattatggct 1020ctagagatga ttgatggaga gccaccatat ttgagagaaa cacctttgaa ggctatctac 1080ttgattgctc aaaacgggaa gccagagatc aagcaacgcg acagactgtc ttcagagttc 1140aacaatttcc ttgacaagtg tcttgttgtt gatccggatc agagagccga tacaacggag 1200ctcttggcac atccattcct gaaaaaggcg aagccactct caagcctgat tccatacatc 1260agagccgtcc gagaaaagta g 128134360DNACaenorhabditis elegansmisc_feature(303)..(303)n is a or c or g or t or unknown 34cgacgaaata gtgttttcac gaatagcgat gacgatgatg tggacgttca gttgaccgga 60caagtcacgg aacatttgag gaatttgcag tgtagtaatg gttccgcaac ttccccatct 120acatcagtgt cagcttcatc ttcttctgct cgtccactga caaatggaaa taatcatctt 180tccacggcgt cgtctaccga cacatctctc tcattatcgg aaaggaataa cgttccgtct 240ccagctccag ttccatatag tgaaagtgct ccacaactga aaacattcac cggagagact 300ccnaaactgc atccacgatc tccgttcccg cctcaaccgc cagttcttcc gcaacgaagc 36035300DNACaenorhabditis elegansmisc_feature(33)..(33)a or c or g or t or unknown 35atgtttctgt atattttatg tgaaatgcaa cangaatctt ctagcaaaaa agtacgatgc 60tggcaggtag ttgttggggg atggagagaa ggggagaaac aaaacaaaaa tgacaatagg 120tgataaaaat nataataatg ttttcgccac agttttcgcg cttaattcac aggaaggttt 180ttttttgcat acaataaaat agtgtgaatg ggagagattt ttagagagaa aaaaactaca 240aaaaaaacga 
ggagcaagat ataagggctt gtgtatggta aaacatataa aacgctgtgt 300

36 750 DNA Caenorhabditis elegans 36
atgaagccat cctcgtccgc aatggatatt agtcagccat ataacacagt gcatcgtctt 60
gccgatcaga agaaggatcc gaacgcggtg gtgactgcgt tgaagttcta cgcacaatca 120
atgaaggaga acgagaagac gaaattcatg acgacgaata gtgttttcac gaatagcgat 180
gacgatgatg tggacgttca gttgaccgga caagtcacgg aacatttgag gaatttgcag 240
tgtagtaatg gttccgcaac ttccccatct acatcagtgt cagcttcatc ttcttctgct 300
cgtccactga caaatggaaa taatcatctt tccacggcgt cgtctaccga cacatctctc 360
tcattatcgg aaaggaataa cgttccgtct ccagctccag ttccatatag tgaaagtgct 420
ccacaactga aaacattcac cggagagact ccaaaactgc atccacgatc tccgttcccg 480
cctcaaccgc cagttcttcc gcaacgaagc aaaaccgcat cggcagtggc gacgacgacg 540
acgaatccga cgacttcgaa tggagcacca ccaccagttc ctggatcgaa aggacccccg 600
gtgccaccga aaccatcgac ttcagttatc tcttttcgtg agtgttcact gatttgtgtt 660
ttgatttatg ttgttcgtca aatttgtaga tttgatcttc tcacttccaa gctcggtgca 720
cattgttcaa actctttgca attctggtag 750



