Patent application title: Systems and Methods for Designing RNA Nanostructures and Uses Thereof

Inventors: Rhiju Das (Palo Alto, CA, US) Joseph Yesselman (Stanford, CA, US) Kalli Kappel (Stanford, CA, US)
Assignees: The Board of Trustees of the Leland Stanford Junior University
IPC8 Class: AC12N1511FI
USPC Class: 1 1
Class name:
Publication date: 2022-08-18
Patent application number: 20220259590

Abstract:

Systems and methods for generating RNA nanostructures capable of linking RNA structures and capable of securing aptamers in an active and stable structure are disclosed. Generally, RNA possesses many structural properties to create novel nanostructures and machines. RNA tertiary structure is composed of discrete and recurring components known as tertiary `motifs`. Along with the helices that they interconnect, many of these structural motifs appear highly modular. Systems and methods herein generate a motif library including canonical and noncanonical motifs to design a candidate path to connect one or more RNA molecules. These paths can also be used to secure RNA aptamers to improve aptamer stability and activity.

Claims:

1. A method of designing an RNA nanostructure, comprising: generating a motif library describing a plurality of structural motifs; and designing a candidate path between two points of RNA using individual motifs from the motif library.

2. The method of claim 1, wherein the motif library includes canonical motifs and noncanonical motifs, wherein the canonical motifs are double-stranded RNA helix motifs of variable length.

3.-4. (canceled)

5. The method of claim 2, wherein the noncanonical motifs include one or more of the group consisting of two-way junctions, higher-order junctions, variable-length hairpins, tertiary contacts, and multi-way junctions.

6. The method of claim 1, wherein the designing step includes integrating an aptamer into the candidate path.

7. The method of claim 1, wherein the designing step is performed in a depth-first manner.

8. The method of claim 1, wherein the candidate path is based on motif structure.

9. The method of claim 8, further comprising filling in the candidate path with sequences that best match a target secondary structure.

10. The method of claim 9, wherein the filling in step uses sequences that minimize alternative secondary structures.

11. The method of claim 1, wherein the designing step generates a plurality of candidate paths.

12. The method of claim 11, further comprising filtering the plurality of candidate paths based on at least one limitation.

13. The method of claim 12, wherein the at least one limitation is selected from the group consisting of minimum number of motifs, maximum number of motifs, minimum number of residues, maximum number of residues, minimum stability, and maximum stability.

14. The method of claim 1, further comprising synthesizing an oligonucleotide covering the design of the candidate path.

15. An RNA nanostructure comprising: a plurality of RNA motifs aligned end to end forming a chain, wherein the plurality of RNA motifs are selected from the group consisting of canonical RNA motifs and noncanonical RNA motifs.

16. The RNA nanostructure of claim 15, wherein the plurality of RNA motifs alternate between canonical RNA motifs and noncanonical RNA motifs.

17. (canceled)

18. The RNA nanostructure of claim 15, further comprising two anchor structures, wherein one anchor structure is connected to one end of the chain, and the other anchor structure is connected to the other end of the chain.

19. The RNA nanostructure of claim 18, wherein the two anchor structures are a tetraloop and a tetraloop receptor.

20. The RNA nanostructure of claim 15, further comprising an anchor structure, wherein the plurality of RNA motifs are connected to one end of the anchor structure, and at least one more RNA motif is connected to the other end of the anchor structure.

21. The RNA nanostructure of claim 20, wherein the anchor structure is an aptamer.

22. The RNA nanostructure of claim 15, wherein the canonical RNA motifs are double stranded RNA helix motifs.

23.-24. (canceled)

25. The RNA nanostructure of claim 15, wherein the noncanonical RNA motifs are selected from the group consisting of: two-way junctions, higher-order junctions, variable-length hairpins, tertiary contacts, and multi-way junctions.

Description:

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. Provisional Application Ser. No. 62/894,098, entitled "Methods and Systems for Rational Design of RNA Aptamers and Uses Thereof" to Das et al., filed Aug. 30, 2019 and U.S. Provisional Application Ser. No. 62/835,699, entitled "Systems and Methods for Designing RNA Nanostructures and Uses Thereof" to Das et al., filed Apr. 18, 2019; the disclosures of which are herein incorporated by reference in their entireties.

FIELD OF THE DISCLOSURE

[0003] The present disclosure relates to ribonucleic acid (RNA) aptamers, and in particular methods and systems to design RNA aptamers for increased stability and/or function.

INCORPORATION OF SEQUENCE LISTING

[0004] A computer readable form of the sequence listing, "06060.PRO Construct Sequences_ST25.txt", submitted via EFS-WEB, is herein incorporated by reference in its entirety.

BACKGROUND OF THE DISCLOSURE

[0005] RNA-based nanotechnology is an emerging field that harnesses RNA's unique structural properties to create novel nanostructures and machines. Perhaps more so than for other biomolecules, RNA tertiary structure is composed of discrete and recurring components known as tertiary `motifs`. Along with the helices that they interconnect, many of these structural motifs appear highly modular; that is, each motif folds into a well-defined three-dimensional (3D) structure in a broad range of contexts. By exploiting symmetry, motif repetition, and expert modeling, these motifs have been assembled into novel polyhedra, sheets, and cargo-carrying nanoparticles for biomedical use. Despite these advances, current methods still rely on human intuition in conjunction with simple visualization tools and the field is far from generating RNAs as sophisticated as natural RNA machines, which are asymmetric, too large to be solved by 3D RNA structure prediction methods, and composed of vast repertoires of distinct interacting motifs, most of which are not yet well characterized. (See Guo, P. (2010) The emerging field of RNA nanotechnology. Nat. Nanotechnol. 5, 833-842; Grabow, W. W., and Jaeger, L. (2014) RNA self-assembly and RNA nanotechnology. Acc. Chem. Res. 47, 1871-1880; Leontis, N. B., et al. (2006) The building blocks and motifs of RNA architecture. Curr. Opin. Struct. Biol. 16, 279-287; Jaeger, L., and Chworos, A. (2006) The architectonics of programmable RNA and DNA nanostructures. Curr. Opin. Struct. Biol. 16, 531-543; Jaeger, L., and Leontis, N. B. (2000) Tecto-RNA: One-Dimensional Self-Assembly through Tertiary Interactions. Angew. Chem. Int. Ed. Engl. 39, 2521-2524; Zhang, H., et al. (2013) Crystal structure of 3WJ core revealing divalent ion-promoted thermostability and assembly of the Phi29 hexameric motor pRNA. RNA 19, 1226-1237; Weizmann, Y., and Andersen, E. S. (2017) RNA nanotechnology--The knots and folds of RNA nanoparticle engineering. MRS Bull. 42, 930-935; Jasinski, D., et al. (2017) Advancement of the emerging field of RNA nanotechnology. ACS Nano 11, 1142-1164; Bindewald, E., et al. (2008) Computational strategies for the automated design of RNA nanoscale structures from building blocks using NanoTiler. J Mol Graph Model 27, 299-308; Jossinet, F., et al. (2010) Assemble: an interactive graphical tool to analyze and build RNA architectures at the 2D and 3D levels. Bioinformatics 26, 2057-2059; Wimberly, B. T., et al. (2000) Structure of the 30S ribosomal subunit. Nature 407, 327-339; Nguyen, T. H. D., et al. (2015) The architecture of the spliceosomal U4/U6.U5 tri-snRNP. Nature 523, 47-52; and Miao, Z., et al. (2017) RNA-Puzzles Round III: 3D RNA structure prediction of five riboswitches and one ribozyme. RNA 23, 655-672; the disclosures of which are incorporated herein by reference in their entirety.)

[0006] Additionally, aptamer selection suffers from two critical limitations that prevent its use in engineering scaffolds that do not require target protein reengineering. First, selection experiments are limited by the number of sequences that can be tested, which results in many cases where high quality aptamers cannot be selected. (See e.g., Wang, J. P., et al., Influence of Target Concentration and Background Binding on In Vitro Selection of Affinity Reagents. Plos One, 2012. 7(8); and Gold, L., et al., Aptamer-Based Multiplexed Proteomic Technology for Biomarker Discovery. Plos One, 2010. 5(12); the disclosures of which are incorporated by reference herein in their entireties.) Second, the structure of the aptamer cannot be explicitly controlled, which is undesirable when the goal is to generate an aptamer that can be used to precisely orient proteins relative to each other.

SUMMARY OF THE DISCLOSURE

[0007] This summary is meant to provide examples and is not intended to be limiting of the scope of the invention in any way. For example, any feature included in an example of this summary is not required by the claims, unless the claims explicitly recite the feature. Also, the features described can be combined in a variety of ways. Various features and steps as described elsewhere in this disclosure can be included in the examples summarized here.

[0008] In one embodiment, a method of designing an RNA nanostructure includes generating a motif library describing a plurality of structural motifs, and designing a candidate path between two points of RNA using individual motifs from the motif library.

[0009] In a further embodiment, the motif library includes canonical motifs and noncanonical motifs.

[0010] In another embodiment, the canonical motifs are double stranded RNA helix motifs of variable length.

[0011] In a still further embodiment, the canonical motifs range in size from 1-22 bp.

[0012] In still another embodiment, the noncanonical motifs include one or more of the group consisting of two-way junctions, higher-order junctions, variable-length hairpins, tertiary contacts, and multi-way junctions.

[0013] In a yet further embodiment, the designing step includes integrating an aptamer into the candidate path.

[0014] In yet another embodiment, the designing step is performed in a depth-first manner.

[0015] In a further embodiment again, the candidate path is based on motif structure.

[0016] In another embodiment again, the method further includes filling in the candidate path with sequences that best match a target secondary structure.

[0017] In a further additional embodiment, the filling in step uses sequences that minimize alternative secondary structures.

[0018] In another additional embodiment, the designing step generates a plurality of candidate paths.

[0019] In a still yet further embodiment, the method further includes filtering the plurality of candidate paths based on at least one limitation.

[0020] In still yet another embodiment, the at least one limitation is selected from the group consisting of minimum number of motifs, maximum number of motifs, minimum number of residues, maximum number of residues, minimum stability, and maximum stability.

[0021] In a still further embodiment again, the method further includes synthesizing an oligonucleotide covering the design of the candidate path.

[0022] In still another embodiment again, an RNA nanostructure comprises a plurality of RNA motifs aligned end to end forming a chain, where the plurality of RNA motifs are selected from the group consisting of canonical RNA motifs and noncanonical RNA motifs.

[0023] In a still further additional embodiment, the plurality of RNA motifs alternate between canonical RNA motifs and noncanonical RNA motifs.

[0024] In still another additional embodiment, the RNA nanostructure further includes an anchor structure connected to one end of the chain.

[0025] In a yet further embodiment again, the RNA nanostructure further includes two anchor structures, where one anchor structure is connected to one end of the chain, and the other anchor structure is connected to the other end of the chain.

[0026] In yet another embodiment again, the two anchor structures are a tetraloop and a tetraloop receptor.

[0027] In a yet further additional embodiment, the RNA nanostructure further includes an anchor structure, wherein the plurality of RNA motifs are connected to one end of the anchor structure, and at least one more RNA motif is connected to the other end of the anchor structure.

[0028] In yet another additional embodiment, the anchor structure is an aptamer.

[0029] In a further additional embodiment again, the canonical RNA motifs are double stranded RNA helix motifs.

[0030] In another additional embodiment again, the canonical RNA motifs range in size from 1 base pair to 100 base pairs.

[0031] In a still yet further embodiment again, the canonical RNA motifs range in size from 1 base pair to 22 base pairs.

[0032] In still yet another embodiment again, the noncanonical RNA motifs are selected from the group consisting of: two-way junctions, higher-order junctions, variable-length hairpins, tertiary contacts, and multi-way junctions.

[0033] The foregoing and other objects, features, and advantages of the disclosed technology will become more apparent from the following detailed description, which proceeds with reference to the accompanying figures.

BRIEF DESCRIPTION OF THE DRAWINGS

[0034] FIGS. 1A-1C illustrate problems in RNA nanostructure design in accordance with various embodiments.

[0035] FIG. 2 illustrates a method to design RNA nanostructures in accordance with various embodiments.

[0036] FIG. 3 illustrates a depth-first process for designing an RNA nanostructure in accordance with various embodiments.

[0037] FIGS. 4A-4B illustrate computer performance of various methods for designing an RNA nanostructure in accordance with various embodiments.

[0038] FIGS. 5A-5C illustrate RNA nanostructures to connect a tetraloop/tetraloop receptor (TTR) in accordance with various embodiments.

[0039] FIGS. 6A-6C illustrate RNA nanostructures to connect ribosomal subunits in accordance with various embodiments.

[0040] FIGS. 6D-6E illustrate RNA nanostructures including multi-way junctions in accordance with various embodiments.

[0041] FIG. 7 illustrates RNA nanostructures incorporating an aptamer in accordance with various embodiments.

[0042] FIGS. 8A-8D illustrate RNA nanostructures incorporating an aptamer in accordance with various embodiments.

[0043] FIG. 9A illustrates a method for designing RNA aptamers in accordance with various embodiments.

[0044] FIG. 9B illustrates strategies for increasing binding affinity between RNA aptamers and proteins in accordance with various embodiments.

[0045] FIG. 9C illustrates a schematic for designing RNA aptamers in accordance with various embodiments.

[0046] FIG. 9D illustrates an RNA scaffold designed to bind multiple proteins in accordance with various embodiments.

[0047] FIGS. 10A-10J illustrate exemplary RNA nanostructures in accordance with various embodiments.

[0048] FIGS. 11A-11E illustrate predicted and calculated structures of RNA motifs in accordance with various embodiments.

[0049] FIGS. 12A-12F illustrate RNA nanostructures to connect ribosomal subunits in accordance with various embodiments.

[0050] FIGS. 13A-13C illustrate RNA nanostructures to connect ribosomal subunits in accordance with various embodiments.

[0051] FIGS. 14A-14D illustrate data showing structure and function of an RNA nanostructure incorporating an aptamer in accordance with various embodiments.

[0052] FIG. 15 illustrates data showing function of an RNA nanostructure incorporating an aptamer in accordance with various embodiments.

[0053] FIG. 16 illustrates data showing function of an RNA nanostructure incorporating an aptamer in accordance with various embodiments.

[0054] FIGS. 17A-17B illustrate RNA anchor structures and RNA connecting structures in accordance with various embodiments.

DETAILED DESCRIPTION OF THE DISCLOSURE

[0055] Turning now to the drawings and data, embodiments herein represent a novel approach to 3D RNA design, based on the recognition that numerous recurring problems in the field can be cast into a `pathfinding` problem. (See FIGS. 1A-1C.) Embodiments described herein present a computer-implemented 3D RNA design program, which obviates one or more of the three RNA motif pathfinding problems described below. Additional embodiments are directed to the RNA nanostructures and to structural and functional measurements that test the ability of computationally generated RNA nanostructures, ribosomes, and aptamers to overcome these problems without requiring additional rounds of trial and error. Embodiments of the present disclosure describe methods that operate counter to prevailing human strategies to design RNA nanostructures capable of tethering or linking various RNA sequences securely and over long distances. Additionally, various embodiments improve aptamer function and stability by integrating the aptamer into a linking structure that maintains aptamer conformation.

[0056] First, a founding problem of RNA nanotechnology involves designing a compact nanostructure that aligns the two parts of the tetraloop/tetraloop-receptor (TTR) so that they can form a tertiary contact upon RNA chain folding (FIG. 1A). This task requires finding RNA sequences that interconnect the 5' and 3' ends of the tetraloop (102) to the 3' and 5' ends of the tetraloop receptor, respectively (104, FIG. 1A). The problem has previously been solved through a combination of expert manual modeling and symmetric assembly of multiple chains. (See Jaeger, L., and Leontis, N. B. (2000) Tecto-RNA: One-Dimensional Self-Assembly through Tertiary Interactions. Angew. Chem. Int. Ed. Engl. 39, 2521-2524 and Nasalean, L., et al. (2006) Controlling RNA self-assembly to form filaments. Nucleic Acids Res. 34, 1381-1392; the disclosures of which are incorporated herein by reference in their entirety.) In all cases, an important guiding principle--sometimes called RNA architectonics--has been to design the intermediate RNA chains so that they form RNA modules previously seen in nature, including both canonical double-stranded helices and noncanonical RNA motifs that twist and translate between two desired helical endpoints at the tetraloop and the receptor. This design task is referred to as the `RNA motif pathfinding problem`. The general complexity of this pathfinding task has prevented design of asymmetric, single-chain solutions to the TTR stabilization problem.

[0057] A second problem is highly analogous to the TTR stabilization problem but is more difficult. Efforts to select engineered ribosomes with mRNA decoding, polypeptide synthesis, and protein excretion functions optimized for new substrates might be dramatically accelerated through the design of integrated ribosomes. An important step towards this goal involves tethering the two 23S and 16S rRNAs of the ribosome into a single RNA strand that supports E. coli growth. (See Fried, S. D., et al. (2015) Ribosome subunit stapling for orthogonal translation in E. coli. Angew. Chem. Int. Ed. Engl. 54, 12791-12794; Orelle, C., et al. (2015) Protein synthesis by ribosomes with tethered subunits. Nature 524, 119-124; Carlson, E. D. (2015) Creating Ribo-T: (Design, Build, Test)n. ACS Synth. Biol. 4, 1173-1175; and Schmied, W. H., et al. (2018) Controlling orthogonal ribosome subunit interactions enables evolution of new function. Nature 564, 444-448; the disclosures of which are incorporated herein by reference in their entirety.) Three-dimensional designs for a tether (106) would require solving the RNA motif pathfinding problem (108) over >100 Å distances and avoiding steric collisions with the ribosome's RNA and protein components (110, FIG. 1B). Even after identification of appropriate helix endpoints, this difficult design challenge previously took more than a year to solve using trial-and-error refinement based on in vivo assays or ad hoc combination of noncanonical motifs without explicit 3D modeling.

[0058] A third problem involves a more complex instance of two RNA motif pathfinding problems (112, FIG. 1C). A ubiquitous task in RNA nanotechnology is the selection of `aptamer` RNAs (114) that sense or carry target small molecules, such as adenosine 5'-triphosphate or fluorophores. (See Famulok, M. (1999) Oligonucleotide aptamers that recognize small molecules. Curr. Opin. Struct. Biol. 9, 324-329; the disclosure of which is incorporated herein by reference in its entirety.) Despite recent progress, improving aptamers requires numerous rounds of tedious selections, with few design tools available to guide consistent improvements. The desired stabilizations might be achieved by peripheral tertiary contacts that extend out of either end of an aptamer and encircle these aptamers, bracing them into their functional 3D arrangements (116, FIG. 1C)--analogous to the tertiary contacts that `lock` natural riboswitch aptamers. (See Porter, E. B., et al. (2017) Recurrent RNA motifs as scaffolds for genetically encodable small-molecule biosensors. Nat. Chem. Biol. 13, 295-301; Gotrik, M., et al. (2018) Direct Selection of Fluorescence-Enhancing RNA Aptamers. J. Am. Chem. Soc. 140, 3583-3591; and Montange, R. K., and Batey, R. T. (2008) Riboswitches: emerging themes in RNA structure and function. Annu. Rev. Biophys. 37, 117-133; the disclosures of which are incorporated herein by reference in their entirety.) However, such rational design has not been carried out due to the difficulty of finding the required four strands that interconnect a given aptamer structure and a tertiary contact.

[0059] Additional issues exist in protein scaffolding. Scaffold proteins physically link individual molecules to increase the efficiency of their interaction and have been found to be critical to many cellular signaling processes. (See e.g., Good, M. C., et al., Scaffold proteins: hubs for controlling the flow of cellular information. Science, 2011. 332(6030): p. 680-6; the disclosure of which is incorporated by reference herein in its entirety.) Engineers have realized the potential of these scaffold molecules to reshape cellular behavior and have redesigned scaffold proteins for several applications including altering MAP kinase pathway signaling dynamics and enhancing production of specific metabolites. (See e.g., Dueber, J. E., et al., Synthetic protein scaffolds provide modular control over metabolic flux. Nature Biotechnology, 2009. 27(8): p. 753-U107; and Bashor, C. J., et al., Using engineered scaffold interactions to reshape MAP kinase pathway signaling dynamics. Science, 2008. 319(5869): p. 1539-1543; the disclosures of which are incorporated by reference herein in their entirety.) Synthetic RNA molecules offer increased design flexibility over protein scaffolds and have also been used to spatially arrange proteins to increase metabolic pathway yields and control synthetic transcriptional programs. (See e.g., Delebecque, C. J., et al., Designing and using RNA scaffolds to assemble proteins in vivo. Nature Protocols, 2012. 7(10): p. 1797-1807; Delebecque, C. J., et al., Organization of Intracellular Reactions with Rationally Designed RNA Assemblies. Science, 2011. 333(6041): p. 470-474; Zalatan, J. G., et al., Engineering Complex Synthetic Transcriptional Programs with CRISPR RNA Scaffolds. Cell, 2015. 160(1-2): p. 339-350; and Sachdeva, G., et al., In vivo co-localization of enzymes on RNA scaffolds increases metabolic production in a geometrically dependent manner. Nucleic Acids Research, 2014. 42(14): p. 9493-9503; the disclosures of which are incorporated by reference herein in their entirety.) However, both engineered RNA and protein scaffolds rely on known protein-protein or protein-RNA interactions and thus require protein- or RNA-binding proteins to be fused to the proteins to be scaffolded. This requirement precludes the use of scaffolds for therapeutic applications and makes it much more difficult to control the precise three-dimensional arrangement of the scaffolded proteins.

[0060] Turning to FIG. 2, certain embodiments are directed to computational methods 200 of RNA nanostructure design. In this method, one or more motif libraries are generated at 202. Generated libraries include canonical and/or noncanonical RNA motifs. Canonical motifs are double stranded RNA (dsRNA) helix motifs that vary in sequence and/or length. These motifs possess canonical (e.g., Watson-Crick) base-pairing (e.g., adenosine with uridine and guanosine with cytosine). In some embodiments, the canonical motifs are double stranded RNA molecules with Watson-Crick base pairing. In many embodiments, canonical motifs are at least 1 base pair (bp) but can be up to 20 bp, 22 bp, 25 bp, 30 bp, 50 bp, 75 bp, 100 bp, or longer. Noncanonical motifs include other RNA structures, including two-way junctions, higher-order junctions, variable-length hairpins, tertiary contacts, multi-way junctions (e.g., Phi29 pRNA planar 3-way junction), other branched elements, and any other noncanonical motif. In many embodiments, the canonical and noncanonical motifs are empirically derived (e.g., motifs where structures are identified via X-ray crystallography or other known methods of elucidating RNA structure), while in some embodiments the canonical and noncanonical motifs are computationally derived (e.g., generating motifs based on known structures and/or base pair interactions). In certain embodiments, the canonical motifs are idealized and sequence invariant. Various embodiments maintain multiple libraries representing each of noncanonical and canonical motifs, while certain embodiments will maintain a single library for both canonical and noncanonical motifs. In certain embodiments, the motifs are entered based on sequence, while in many embodiments the motifs are entered based on structure (e.g., crystallographic structure), such as in PDB format. Many embodiments will utilize curated motif libraries of RNA components, such as the RNA 3D Motif Atlas (rna.bgsu.edu/rna3dhub/motifs). 
(See also Petrov, A. I., et al. (2013) Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas. RNA 19, 1327-1340; the disclosure of which is incorporated by reference in its entirety.)
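As an illustration only, the motif library generated at 202 could be represented in software along the following lines. This is a minimal sketch and not the disclosed implementation: the class names, the field names, and the `build_helix_library` helper are hypothetical, and the 1-22 bp default range is taken from one of the embodiments summarized above.

```python
from dataclasses import dataclass, field
from enum import Enum


class MotifKind(Enum):
    CANONICAL = "canonical"        # double-stranded helix motifs
    NONCANONICAL = "noncanonical"  # junctions, hairpins, tertiary contacts, ...


@dataclass(frozen=True)
class Motif:
    # All field names are hypothetical; a real library would also carry
    # 3D coordinates (for example, parsed from a structure file).
    name: str
    kind: MotifKind
    n_residues: int
    source: str = "idealized"


@dataclass
class MotifLibrary:
    motifs: dict = field(default_factory=dict)

    def add(self, motif):
        """Register a motif under its name."""
        self.motifs[motif.name] = motif

    def by_kind(self, kind):
        """Return all motifs of one kind (canonical or noncanonical)."""
        return [m for m in self.motifs.values() if m.kind is kind]


def build_helix_library(min_bp=1, max_bp=22):
    """Create idealized helix entries, one per helix length in base pairs."""
    library = MotifLibrary()
    for bp in range(min_bp, max_bp + 1):
        # A helix of `bp` base pairs contains 2 * bp residues.
        library.add(Motif(f"helix_{bp}bp", MotifKind.CANONICAL, 2 * bp))
    return library
```

A single library holding both kinds, as in some embodiments, would then be filled by calling `add` with noncanonical entries as well; separate libraries per kind would simply be separate `MotifLibrary` instances.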

[0061] At 204, certain embodiments design a candidate RNA structure, or candidate path, connecting two points of RNA, where the path is comprised of one or more RNA motifs in the one or more motif libraries. At 204, the connection points to be linked are defined. These connection points can be on one or more RNA molecules, such as to link two RNA molecules together or to link two ends of a single RNA molecule.

[0062] Various embodiments perform the path design step by step in a depth-first manner, where a first motif is joined to a first point to achieve the closest distance to a second point before a second motif is added, and then a third motif is added to achieve the closest distance to the terminating point. This process is repeated until a candidate path is designed between the first and second points. In various embodiments, the pathfinding will be performed in a bidirectional manner, such that candidate paths will be generated starting at the first point and terminating at the second point, in addition to candidate paths generated starting at the second point and terminating at the first point. Additional embodiments will always begin with a canonical motif, and some embodiments will always end with a canonical motif. Some embodiments will further alternate canonical and noncanonical motifs until a candidate path is identified. Further embodiments will allow for specific settings, such that canonical motifs are selected for larger lengths, while noncanonical motifs are selected for smaller lengths. This pathfinding process is illustrated in FIG. 3, where a canonical motif ("helix") is added to a starting point before a noncanonical motif ("Motif 1") is added, which is subsequently followed by a canonical motif ("helix") and a noncanonical motif ("Motif 2") until the path meets the finishing point.
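The depth-first procedure described above can be sketched in a deliberately simplified form. The example below is a toy under stated assumptions, not the disclosed program: geometry is flattened to two dimensions, each motif is reduced to a length and a turn angle, the motif values (including the 2.8 Å rise per base pair used for helices) are illustrative, and every name in the code is hypothetical. It does follow the text in three respects: the motif that ends closest to the target is tried first, paths begin and end with a canonical motif, and canonical and noncanonical motifs alternate.

```python
import math

# Toy 2D motifs: (name, kind, length in angstroms, turn in degrees).
# All values are illustrative assumptions, not measured motif geometries.
HELICES = [(f"helix_{bp}bp", "canonical", 2.8 * bp, 0.0) for bp in range(1, 23)]
JUNCTIONS = [
    ("kink_left", "noncanonical", 5.0, 45.0),
    ("kink_right", "noncanonical", 5.0, -45.0),
    ("straight", "noncanonical", 5.0, 0.0),
]


def step(state, motif):
    """Advance (x, y, heading) along the heading, then apply the motif's turn."""
    x, y, heading = state
    _, _, length, turn = motif
    rad = math.radians(heading)
    return (x + length * math.cos(rad), y + length * math.sin(rad), heading + turn)


def distance(state, target):
    """Euclidean distance from the current end of the path to the target point."""
    return math.hypot(state[0] - target[0], state[1] - target[1])


def find_path(start, target, tol=2.0, max_motifs=5):
    """Depth-first search that tries the motif ending closest to the target first."""

    def search(state, path):
        # Accept only paths that end on a helix and land within the tolerance.
        if path and path[-1][1] == "canonical" and distance(state, target) <= tol:
            return path
        if len(path) >= max_motifs:
            return None  # bound reached: abandon this branch and backtrack
        # Begin with a helix, then alternate helix / noncanonical motif.
        pool = HELICES if len(path) % 2 == 0 else JUNCTIONS
        ranked = sorted(pool, key=lambda m: distance(step(state, m), target))
        for motif in ranked:
            found = search(step(state, motif), path + [motif])
            if found is not None:
                return found
        return None

    return search(start, [])
```

For example, `find_path((0.0, 0.0, 0.0), (52.8, 19.8))` returns a list of motif tuples whose end point lies within 2 Å of the target in this toy model. A bidirectional variant would call the same search a second time with the start and target roles exchanged.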

[0063] Further embodiments will design the path using structures of specific motifs rather than the RNA sequence of the specific motif to be included into the path. For example, some embodiments will allow a user to specify a specific RNA structure (e.g., an RNA aptamer) to be included in the path in lieu of a canonical or noncanonical motif. In embodiments incorporating a specific RNA structure, the method 200 incorporates a de novo scaffold around the existing structure, which will result in a structure that is more stable and active (in the case of functional structures). This approach runs counter to prevailing methodologies (discussed further below), which attempt to plug RNA structures into known, preconstructed scaffolds and which require vast amounts of effort without much success in generating functional scaffolds.

[0064] In 206, if the candidate path was found based on structure, many embodiments will fill in the candidate path with sequences that best match the target secondary structure. Additional embodiments will fill in the candidate path with sequences that minimize alternative secondary structures.
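The fill-in step can be illustrated with a short sketch, assuming the target secondary structure is given in dot-bracket notation and the candidate path is given as a template in which `N` marks positions still to be assigned. Both conventions, and all function names, are assumptions made for this example. The sketch only guarantees Watson-Crick complementarity at paired positions; cycling through pair types is a crude stand-in for sequence diversity, and an actual design tool would presumably score alternative secondary structures with a folding model, which is not attempted here.

```python
import itertools

WC_PAIRS = [("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")]
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def paired_positions(dot_bracket):
    """Return sorted (i, j) index pairs for matching parentheses."""
    stack, pairs = [], []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(pairs)


def fill_sequence(template, dot_bracket):
    """Replace 'N' positions so every target pair is Watson-Crick."""
    if len(template) != len(dot_bracket):
        raise ValueError("template and structure must have equal length")
    seq = list(template)
    choices = itertools.cycle(WC_PAIRS)  # rotate pair types along the helix
    for i, j in paired_positions(dot_bracket):
        if seq[i] == "N" and seq[j] == "N":
            seq[i], seq[j] = next(choices)
        elif seq[i] == "N":
            seq[i] = COMPLEMENT[seq[j]]
        elif seq[j] == "N":
            seq[j] = COMPLEMENT[seq[i]]
    # Unpaired positions left open default to 'A' (an arbitrary choice).
    return "".join("A" if base == "N" else base for base in seq)
```

For a four-base-pair helix capped by a fixed GAAA tetraloop, `fill_sequence("NNNNGAAANNNN", "((((....))))")` returns `"GCAUGAAAAUGC"`; the fixed loop residues are left untouched.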

[0065] Once the candidate path sequences have been identified, many embodiments filter the candidate paths at 208. In 208, factors or limitations are utilized to limit the total output of method 200. Such factors include minimum and/or maximum number of motifs (e.g., canonical motifs and noncanonical motifs), minimum and/or maximum number of residues (e.g., the number of bases in the entire RNA strand), and/or minimum and/or maximum stability (e.g., number of Watson-Crick base pairs).
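A minimal sketch of the filtering at 208 follows, assuming each candidate path has already been summarized by three counts. The `CandidatePath` record and the `filter_paths` function are hypothetical names introduced for illustration; the number of Watson-Crick base pairs is used as the stability measure because the text gives it as an example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePath:
    name: str
    n_motifs: int
    n_residues: int
    n_wc_pairs: int  # stability proxy: number of Watson-Crick base pairs


def within(value, low=None, high=None):
    """True if value lies inside the optional [low, high] bounds."""
    return (low is None or value >= low) and (high is None or value <= high)


def filter_paths(paths, motifs=(None, None), residues=(None, None),
                 stability=(None, None)):
    """Keep only paths that satisfy every (minimum, maximum) limitation."""
    return [
        p for p in paths
        if within(p.n_motifs, *motifs)
        and within(p.n_residues, *residues)
        and within(p.n_wc_pairs, *stability)
    ]


candidates = [
    CandidatePath("path_a", n_motifs=3, n_residues=48, n_wc_pairs=20),
    CandidatePath("path_b", n_motifs=7, n_residues=110, n_wc_pairs=46),
    CandidatePath("path_c", n_motifs=5, n_residues=72, n_wc_pairs=12),
]
kept = filter_paths(candidates, motifs=(None, 5), stability=(15, None))
```

With these made-up candidates, only `path_a` survives: `path_b` exceeds the maximum number of motifs and `path_c` falls below the minimum stability. A bound of `None` leaves that side unconstrained.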

[0066] At 210 of certain embodiments, oligonucleotides are synthesized representing the designed RNA nanostructure. Various embodiments synthesize the RNA nanostructure chemically via various known technologies, while additional embodiments synthesize the RNA nanostructure biochemically. Example methods of synthesis include phosphoramidite synthesis, T7 polymerase transcription, and any other known or applicable means of synthesizing an RNA nanostructure. In various embodiments, the oligonucleotides will include just the developed path from a starting point to an ending point, while in some embodiments, the oligonucleotide includes a portion (including the entirety) of the molecule at the starting point and/or a portion (including the entirety) of the molecule at the ending point. Certain embodiments will synthesize the oligonucleotide using RNA base pairs, while some embodiments will synthesize the oligonucleotide using DNA base pairs, and additional embodiments will synthesize the oligonucleotide using a combination of RNA and DNA base pairs. Further embodiments synthesize the oligonucleotide as double stranded, single stranded, or a combination of double and single stranded.

[0067] At 212, the RNA nanostructure is put into use. Uses of an RNA nanostructure include, for example, use as a medicament or use to enhance RNA function, as described in depth below.

[0068] It should be noted that in numerous embodiments, some components in method 200 will be performed in a different order, performed simultaneously with prior components, and/or omitted. For example, filtering 208 can be completed simultaneously with the pathfinding 204, such that once a path reaches a certain point (e.g., a maximum length and/or a maximum number of motifs) the path is eliminated, and another path is begun. Additionally, if the motif libraries are based on sequence, 206 will be omitted in some embodiments, as there will be no need to fill in the sequence.

[0069] Certain embodiments of method 200 are implemented on non-transitory machine readable media, where method 200 is encoded as processor instructions. In many of these embodiments, execution of the processor instructions by a processor causes the processor to perform one or more steps embodied in method 200. Additional embodiments are further directed to systems comprising a processor and memory, where the memory contains instructions that when read by the processor direct the processor to perform one or more steps embodied in method 200.

[0070] When implemented on a computer, certain embodiments of method 200 scale linearly with problem size (e.g., distance between starting and ending points). Some embodiments will be performed on a consumer-grade computer (e.g., laptop computer), and FIGS. 4A and 4B illustrate the performance of method 100. Specifically, FIG. 4A illustrates that the run time increases with distance, while FIG. 4B shows that the number of residues (e.g., base pairs) required to complete the distance also increases with the problem size. FIGS. 4A and 4B illustrate that certain embodiments of method 100 will discover exceptionally long dsRNA paths (e.g., long enough to encircle a ribosome) in less than three seconds.

[0071] The resulting products of method 200 possess a number of characteristics, including the ability to fold properly, traverse long distances, and/or hold aptamers into a functional conformation.

RNA Folding

[0072] In various embodiments, the RNA nanostructure folds properly upon synthesis. FIGS. 5A-5D show the ability of embodiments to fold appropriately. Specifically, FIG. 5A illustrates an embodiment of a novel RNA nanostructure designed to link tetraloops and tetraloop receptors ("TTRs"). In FIG. 5A, embodiments of the novel RNA nanostructures to link TTRs will possess a tetraloop 502, tetraloop receptor 504, and the linking region 506. The structures of several embodiments are illustrated in FIG. 5B. Sequences for the embodiments illustrated in FIG. 5B can be found in the attached sequence listing as SEQ_ID NOs: 1-16. Additionally, some embodiments of the RNA nanostructures illustrated in FIG. 5B allow the TTRs to fold appropriately, as illustrated in FIG. 5C. FIG. 5C illustrates a native gel mobility assay of the embodiments illustrated in FIG. 5B. In FIG. 5C, the embodiments in FIG. 5B are labelled at the top of each image and are run in two lanes of the gel, where the left lane is a native tetraloop possessing the sequence GAAA, while the right lane has this sequence mutated to UUCG. Migration of the native-sequence tetraloop further through the gel than the mutant indicates that the linking RNA nanostructure does not disrupt the TTR tertiary fold. Quantification of this information is found below in Table 1.

TABLE-US-00001
TABLE 1. Quantification of properties of TTR linkages

Construct     SHAPE and DMS support    TTR DMS Reactivity    Native Gel Mobility    Mg2+ Folding
              Secondary Structure^a    Fold Change^b         Shift (cm)^c           Midpoints^d
miniTTR 1     95.2%                    3.01                  0.205                  1.12 +0.34/-0.24
miniTTR 2     94.2%                    6.94                  0.247                  0.08 +0.00/-0.00
miniTTR 3     96.6%                    1.63*                 0.055*                 >10*
miniTTR 4     96.6%                    1.74*                 0.204                  >10*
miniTTR 5     98.1%                    4.1                   0.236                  1.64 +0.32/-0.22
miniTTR 6     95.5%                    3.39                  0.382                  0.74 +0.01/-0.02
miniTTR 7     97.2%                    2.66                  0.226                  3.31 +0.79/-0.55
miniTTR 8     98.5%                    1.16*                 -1.117*                >10*
miniTTR 9     98.5%                    6.18                  0.348                  0.84 +0.11/-0.11
miniTTR 10    98.5%                    6.59                  0.405                  0.74 +0.08/-0.06
miniTTR 11    96.7%                    4.79                  0.282                  0.87 +0.13/-0.10
miniTTR 12    96.4%                    5.3                   0.406                  0.50 +0.05/-0.03
miniTTR 13    94.2%                    1.72*                 -0.066*                >10*
miniTTR 14    98.6%                    5.21                  0.408                  0.44 +0.02/-0.01
miniTTR 15    94.2%                    3.79                  -0.108*                0.95 +0.14/-0.14
miniTTR 16    96.2%                    14.47                 0.456                  0.24 +0.08/-0.02

^a Percent of helical residues that have SHAPE and DMS reactivities < 0.5 reactivity units, suggesting they are in base pairs.
^b For DMS chemical mapping with and without 10 mM Mg2+, a 2-fold reduction in mean DMS reactivity at the four TTR adenines was considered to pass screen.
^c Distance traveled in gel of RNA compared to mutant with tetraloop GAAA changed to UUCG. Positive numbers correspond to faster gel mobility (more compact fold) with wild type tetraloop, as expected for correctly folded RNA.
^d RNA that was more than half folded with [Mg2+] < 10 mM was considered to pass screen.
* Considered to not pass screen.

Long Distance Tethering

[0073] Various embodiments have the ability to link molecules across long distances. FIGS. 6A-6C show the ability of embodiments to link ribosomal subunits. Specifically, FIG. 6A illustrates an embodiment of a novel RNA nanostructure designed to link ribosomal subunits. In FIG. 6A, embodiments of the novel RNA nanostructures to link ribosomal subunits will possess a linking structure 602 that connects the 23S ribosomal subunit 604 and 16S ribosomal subunit 606. The structures of several embodiments are illustrated in FIG. 6B. Sequences for the embodiments illustrated in FIG. 6B can be found in the attached sequence listing as SEQ_ID NOs: 17-25. Additionally, FIG. 6C illustrates how the tethering of some embodiments allows the growth of ribosome-deficient bacteria, which would otherwise be unable to grow for lack of functional ribosomes.

Multi-Junction Linkages

[0074] Additional embodiments generate structures including multi-way junctions. An example of such embodiments is illustrated in FIG. 6D, where multi-way junctions 610 are incorporated into linking region 612 that connects the tetraloop-tetraloop receptor 614. Additionally, some embodiments generate multiple linkages off of such multi-way junctions, such as illustrated in FIG. 6E. FIG. 6E illustrates double-stranded RNA (dsRNA) helix 620 possessing four A-minor interactions 622. Certain embodiments include RNA nanostructures 624 to link the various A-minor interactions 622 using multi-way junctions, such as those illustrated in FIG. 6D. Additional embodiments build off of multi-way junctions to design paths 626 linking additional A-minor interactions 622 located on the dsRNA helix 620. Such embodiments generate an "RNA claw," or aptamer, to hold a dsRNA helix. In many embodiments (e.g., FIG. 2, method 200), designs including multi-way junctions still scale linearly (see also FIGS. 4A-4B). Some embodiments including multi-way junctions run faster than embodiments which only use two-way junctions, as multi-way junctions add motifs that have significantly different 6-dimensional orientations between base pair ends.

RNA Aptamer Function and Stability

[0075] RNA aptamers possess the ability to bind small molecules. Unfortunately, prior methods to improve RNA aptamer function have largely been unsuccessful, producing weakened binding affinity or instability in biological environments. Even after multiple rounds of improvement, many prior attempts resulted in diminishing returns. (See, e.g., Carothers, J. M., et al. (2006) Aptamers selected for higher-affinity binding are not more specific for the target ligand. J. Am. Chem. Soc. 128, 7929-7937; Paige, J. S., et al. (2011) RNA mimics of green fluorescent protein. Science 333, 642-646; and Ellington, A. D., and Szostak, J. W. (1990) In vitro selection of RNA molecules that bind specific ligands. Nature 346, 818-822; the disclosures of which are incorporated herein by reference in their entirety.) As such, various embodiments allow for the introduction of RNA aptamers into an RNA nanostructure. Examples of such activity are illustrated in FIGS. 7-8D. Specifically, FIG. 7 illustrates various embodiments of RNA nanostructures incorporating an aptamer 702 specific for adenosine 5'-triphosphate (ATP) and adenosine 5'-monophosphate (AMP). Sequences for the embodiments illustrated in FIG. 7 can be found in the attached sequence listing as SEQ_ID NOs: 26-35. Additionally, the dissociation constant of certain embodiments is reduced by roughly an order of magnitude relative to the ATP aptamer alone, as shown in Table 2.

TABLE-US-00002
TABLE 2. Quantification of properties of ATP/AMP aptamers of some embodiments

Design         DMS reactivity change    Mean DMS reactivity    Formed TTR with ATP      K_d for ATP,
               of A9 and A10 upon       at TTR without         (fold change in DMS      μM^d
               ATP binding^a            ATP^b                  reactivity)^c
ATP-TTR 1^e    n.d.                     n.d.                   n.d.                     n.d.
ATP-TTR 2^e    n.d.                     n.d.                   n.d.                     n.d.
ATP-TTR 3      -0.24                    0.04                   1.00                     1.5 +0.51/-0.38
ATP-TTR 4      -0.24                    0.09                   1.46                     4.1 +1.30/-0.96
ATP-TTR 5      -0.27                    0.17                   1.94                     1.4 +0.46/-0.35
ATP-TTR 6*     0.02                     0.14                   2.28                     n.d.
ATP-TTR 7*     0.04                     0.27                   1.85                     n.d.
ATP-TTR 8      -0.11                    1.28                   1.16                     n.d.
ATP-TTR 9      -0.71                    0.28                   2.84                     n.d.
ATP-TTR 10     -0.22                    1.26                   0.90                     n.d.
ATP aptamer    -0.41                    n.a.                   n.a.                     16.2 +5.70/-4.00

^a Decrease in reactivity beyond 0.2 exceeds experimental error and is considered evidence for ATP binding at the ATP aptamer. Values normalized to DMS reactivity of single-stranded adenosines in reference GAGUA hairpins flanking the design.
^b Mean DMS reactivity less than 0.5 taken as evidence for tetraloop/tetraloop-receptor (TTR) formation.
^c Fold change in DMS reactivity with and without ATP. If both the mean reactivity is under 0.5 and the fold change is under 2, it is considered a success.
^d K_d lower than the reference ATP aptamer demonstrates successful stabilization of the ATP aptamer.
^e Chemical mapping data for ATP-TTR 1 and 2 could not be processed due to strong stops on the capillary electrophoresis readout.
* Construct had strong stops in capillary electrophoresis, making data too weak to be reliable.
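As a worked check of the dissociation constants reported in Table 2, the fold improvement of each design with a measured K_d can be computed relative to the reference ATP aptamer. The sketch below uses only the central K_d values from the table (error ranges are ignored) and the variable names are illustrative.

```python
# Central K_d values in micromolar, taken from Table 2.
kd_reference = 16.2  # ATP aptamer alone
kd_designs = {"ATP-TTR 3": 1.5, "ATP-TTR 4": 4.1, "ATP-TTR 5": 1.4}

# Fold improvement = reference K_d divided by design K_d (larger is tighter binding).
fold = {name: kd_reference / kd for name, kd in kd_designs.items()}

for name, value in fold.items():
    print(f"{name}: {value:.1f}-fold lower K_d than the ATP aptamer alone")
```

On these values, ATP-TTR 3 and ATP-TTR 5 show a reduction of roughly one order of magnitude, while ATP-TTR 4 shows a reduction of approximately four-fold.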

[0076] Additionally, the Spinach RNA aptamer binds an analog of the green fluorescent protein chromophore (Z)-4-(3,5-Difluoro-4-hydroxybenzylidene)-1,2-dimethyl-1H-imidazol-5(4H)-one (DFHBI) within a G-quadruplex. Binding to Spinach enhances the fluorescence of DFHBI by approximately 1,000-fold relative to unbound ligand, making this RNA useful for biological interrogations. (See Paige, J. S., et al. (2011) RNA mimics of green fluorescent protein. Science 333, 642-646 and Kellenberger, C. A., et al. (2015) RNA-Based Fluorescent Biosensors for Live Cell Imaging of Second Messenger Cyclic di-AMP. J. Am. Chem. Soc. 137, 6432-6435; the disclosures of which are incorporated herein by reference in their entirety.) However, the binding affinity, brightness, folding efficiency and biological stability remain poor even after extensive efforts to discover improvements such as the minimized Spinach and Broccoli aptamers. (See Strack, R. L., et al. (2013) A superfolding Spinach2 reveals the dynamic nature of trinucleotide repeat-containing RNA. Nat. Methods 10, 1219-1224; Filonov, G. S., et al. (2014) Broccoli: rapid selection of an RNA mimic of green fluorescent protein by fluorescence-based selection and directed evolution. J. Am. Chem. Soc. 136, 16299-16308; Ketterer, S., et al. (2015) Systematic reconstruction of binding and stability landscapes of the fluorogenic aptamer spinach. Nucleic Acids Res. 43, 9564-9572; and Song, W., et al. (2014) Plug-and-play fluorophores extend the spectral properties of Spinach. J. Am. Chem. Soc. 136, 1198-1201; the disclosures of which are incorporated herein by reference in their entirety.)

[0077] Turning to FIG. 8A, various embodiments of RNA nanostructures incorporating the Spinach aptamer are illustrated. Sequences for the embodiments illustrated in FIG. 8A can be found in the attached sequence listing as SEQ_ID NOs: 36-51. Additionally, FIGS. 8B and 8C illustrate improved fluorescence intensity of some embodiments of Spinach RNA nanostructures (SEQ_ID NOs: 36-51) over just the Spinach aptamer (SEQ_ID NO: 52) as both DFHBI and aptamer concentration are increased. Further, FIG. 8D illustrates improved stability of certain embodiments of Spinach RNA nanostructures (SEQ_ID NOs: 36-51) over both the Spinach (SEQ_ID NO: 52) and Broccoli (SEQ_ID NO: 54) aptamers when the reaction is challenged with cellular lysate, indicating that certain embodiments of RNA nanostructures (SEQ_ID NOs: 36-51) incorporating the Spinach aptamer are more stable than other versions (e.g., Spinach (SEQ_ID NO: 52) and Broccoli (SEQ_ID NO: 54)).

Protein Scaffolding

[0078] A number of embodiments are directed to RNA aptamers to scaffold proteins. In some embodiments, the methods are biased toward sequences that form favorable interactions with target proteins and adopt specific three-dimensional structures. Various embodiments design sequence libraries for in vitro selection experiments. Turning to FIG. 9A, a method 900 to design protein scaffolds is illustrated. At 902, many embodiments select a protein of interest or target protein. Numerous embodiments select the protein, along with sequence, structure, and other protein characteristics, from a database of this information, including such databases as the Protein Data Bank (PDB). Further embodiments select protein complexes when one or more proteins interact or form a complex structure. At 904, many embodiments identify optimal RNA-binding regions on the surface of the target protein.

[0079] Many embodiments start with a target protein 902, then computationally identify optimal RNA-binding regions 904 on the surface of the target protein, then design small "anchor" RNA structures 906 that bind to these regions, likely with low affinity, and finally design RNA structures 908 that connect the anchors. In further embodiments, the affinity of the designed structures is improved by randomizing specific regions and performing selection experiments.

[0080] Many embodiments identify RNA/protein binding regions by predicting interaction sites between RNA structures and regions on proteins. Certain embodiments utilize a custom scoring function to discriminate between native and non-native structures, where the score of a given structure can be calculated as in equation 1:

$$-kT \ln\left(P(\text{structure} \mid \text{sequence})\right) \qquad \text{(eq. 1)}$$

[0081] The embodiments utilize an expression for the probability of a structure given its primary sequence (e.g., P(structure|sequence)). In particular, the joint probability of the structure of each monomer and of the overall complex can be factored as given in equation 2:

$$P(M_1, M_2, C \mid \text{sequence}) = P(C \mid M_1, M_2, \text{sequence})\, P(M_1 \mid M_2, \text{sequence})\, P(M_2 \mid \text{sequence}) \qquad \text{(eq. 2)}$$

where $M_1$ is the structure of the RNA monomer 1, $M_2$ is the structure of the protein monomer 2, and $C$ is the structure of the complex.

[0082] Assuming that $P(M_1 \mid M_2, \text{sequence})$ is approximately equal to $P(M_1 \mid \text{sequence})$, the equation becomes equation 3:

$$P(M_1, M_2, C \mid \text{sequence}) = P(C \mid M_1, M_2, \text{sequence})\, P(\text{RNA structure} \mid \text{sequence})\, P(\text{protein structure} \mid \text{sequence}) \qquad \text{(eq. 3)}$$

[0083] The energy of the RNA/Protein complex is further given by equation 4:

$$E(M_1, M_2, C \mid \text{sequence}) = -kT \ln\left(P(C \mid M_1, M_2, \text{sequence})\right) + \text{Score}_{\text{RNA}} + \text{Score}_{\text{protein}} \qquad \text{(eq. 4)}$$
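The step from equation 3 to equation 4 can be written out explicitly by applying the form of equation 1 to the joint probability. The derivation below is a sketch that assumes the RNA and protein scores stand in for the negative log-probabilities of the respective monomer structures; that identification is an interpretation made for illustration and is not stated in the equations above.

```latex
\begin{aligned}
E(M_1, M_2, C \mid \text{sequence})
  &= -kT \ln P(M_1, M_2, C \mid \text{sequence}) \\
  &= -kT \ln P(C \mid M_1, M_2, \text{sequence})
     - kT \ln P(\text{RNA structure} \mid \text{sequence})
     - kT \ln P(\text{protein structure} \mid \text{sequence}) \\
  &= -kT \ln P(C \mid M_1, M_2, \text{sequence})
     + \text{Score}_{\text{RNA}} + \text{Score}_{\text{protein}}
\end{aligned}
```

The second line follows because the logarithm of the product in equation 3 is the sum of the logarithms of its factors.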

[0084] Medium resolution potentials for both $\text{Score}_{\text{RNA}}$ and $\text{Score}_{\text{protein}}$ have been previously worked out and implemented within Rosetta. (See Das, R., et al., Atomic accuracy in predicting and designing noncanonical RNA structure. Nat Methods, 2010. 7(4): p. 291-4; Simons, K. T., et al., Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. Journal of Molecular Biology, 1997. 268(1): p. 209-225; Simons, K. T., et al., Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins. Proteins-Structure Function and Genetics, 1999. 34(1): p. 82-95; and Das, R. and D. Baker, Automated de novo prediction of native-like RNA tertiary structures. Proceedings of the National Academy of Sciences of the United States of America, 2007. 104(37): p. 14664-14669; the disclosures of which are incorporated herein by reference in their entireties.) Additionally, the expression for $P(C \mid M_1, M_2, \text{sequence})$ can be decomposed similar to protein-protein docking in equation 5: (See Gray, J. J., et al., Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. Journal of Molecular Biology, 2003. 331(1): p. 281-299; the disclosure of which is incorporated herein by reference in its entirety.)

$$P(C \mid M_1, M_2, \text{sequence}) = \frac{P(\text{sequence} \mid C, M_1, M_2) \times P(C \mid M_1, M_2)}{P(\text{sequence} \mid M_1, M_2)} \qquad \text{(eq. 5)}$$

where $P(\text{sequence} \mid M_1, M_2)$ is constant and can be neglected. Additionally, $P(\text{sequence} \mid C, M_1, M_2)$ can be expanded following the framework outlined for the knowledge-based protein score function in Rosetta, as in equation 6: (See

$$P(\text{sequence} \mid C, M_1, M_2) \approx \prod_{r_i \in \text{seq}_1, \text{seq}_2} P(r_i \mid E_i) \times \prod_{r_j \in \text{seq}_1,\, r_k \in \text{seq}_2} \frac{P(r_j, r_k \mid d_{jk}, E_j, E_k)}{P(r_j \mid d_{jk}, E_j, E_k) \times P(r_k \mid d_{jk}, E_j, E_k)} \qquad \text{(eq. 6)}$$

[0085] The first term is the residue environment term ($S_{\text{env}}$) and the second term is the residue pair term ($S_{\text{pair}}$). The environments are defined as interface or non-interface; in addition, protein residues are classified as buried or exposed, and RNA residues as base-paired or not base-paired. Many embodiments use a coarse-grained representation of both the protein and RNA residues in which the sidechains are represented as a single centroid atom. Accordingly, the distances in this potential are computed between these centroid atoms.

[0086] $P(C \mid M_1, M_2)$ is the sequence-independent part of the interaction and includes terms describing well-formed complexes. To start, this includes two terms approximating the attractive and repulsive parts of van der Waals interactions in equation 7:

$$P(C \mid M_1, M_2) \sim e^{-S_{\text{contact}}} + e^{-S_{\text{clash}}} \qquad \text{(eq. 7)}$$

[0087] $S_{\text{contact}}$ is proportional to the number of residues between the two monomers that are within an optimal distance range, to be determined from the training set of structures described below. $S_{\text{clash}}$ is calculated using atom-type-dependent distance cutoffs, $d_{ij}^{0}$, determined from the training set following the same method as for the protein potential, as in equation 8:

$$S_{\text{clash}} = (d_{ij}^{0})^2 - (d_{ij})^2 \qquad \text{(eq. 8)}$$

[0088] This leads to a final expression for the protein-RNA score function in equation 9:

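$$E(M_1, M_2, C \mid \text{sequence}) = w_{\text{env}} S_{\text{env}} + w_{\text{pair}} S_{\text{pair}} + w_{\text{contact}} S_{\text{contact}} + w_{\text{clash}} S_{\text{clash}} + w_{\text{RNA}} \text{Score}_{\text{RNA}} + w_{\text{protein}} \text{Score}_{\text{protein}} \qquad \text{(eq. 9)}$$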

where $w_{\text{env}}$, $w_{\text{pair}}$, $w_{\text{contact}}$, $w_{\text{clash}}$, $w_{\text{RNA}}$, and $w_{\text{protein}}$ are weights that are fit to optimize prediction of native structures.
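The weighted combination in equation 9, together with the per-pair clash term of equation 8, can be sketched in a few lines. The function names, dictionary keys, term values, and weights below are hypothetical placeholders chosen for illustration; fitted weights are not given above. Restricting the clash term to atom pairs closer than the cutoff is likewise an assumption of this sketch.

```python
def clash_score(d_ij, d0_ij):
    """Equation 8 for a single atom pair.

    Assumption of this sketch: the term contributes only when the pair is
    closer than its cutoff d0_ij, and is zero otherwise.
    """
    return d0_ij**2 - d_ij**2 if d_ij < d0_ij else 0.0

def total_score(terms, weights):
    """Equation 9: weighted sum of the individual score terms."""
    return sum(weights[name] * value for name, value in terms.items())

# Hypothetical term values and weights, for illustration only.
terms = {"env": -1.5, "pair": -2.0, "contact": -3.0,
         "clash": 0.75, "rna": -10.0, "protein": -20.0}
weights = {"env": 1.0, "pair": 1.0, "contact": 0.5,
           "clash": 2.0, "rna": 0.1, "protein": 0.1}

score = total_score(terms, weights)
print(score)
print(clash_score(3.0, 4.0))
```

Keeping each term separate until the final sum makes it straightforward to refit the weights against a set of native structures without recomputing the underlying terms.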

[0089] The probabilities of protein/RNA interactions, used to derive $S_{\text{env}}$, $S_{\text{pair}}$, $S_{\text{contact}}$, and $S_{\text{clash}}$, are approximated from the frequencies of these interactions in the non-redundant set of protein/RNA structures found in the Protein Data Bank (PDB). As of June 2016, there were 1283 crystal structures containing both protein and RNA chains, with resolution better than 3.5 Å and less than 70% sequence identity. Additional embodiments further refine the set of structures to ensure it only contains non-redundant structures where the protein and RNA are in the same biological unit.

[0090] The proposed form of $P(C \mid M_1, M_2)$ described here may be insufficient for successful discrimination of native complexes. The protein/RNA complexes from the PDB are analyzed in certain embodiments to identify additional structural features of well-formed RNA/protein complexes such as possible orientation preferences of secondary structure elements. Some embodiments include systematically testing the inclusion of these additional terms to find the score function that best predicts correctly formed protein/RNA structures.

[0091] At 906 of many embodiments, small "anchor" RNA structures are designed. RNA binding proteins with high affinity for their RNA targets are often composed of many modules, each of which binds a short RNA sequence with relatively low affinity. (See e.g., Lunde, B. M., et al., RNA-binding proteins: modular design for efficient function. Nature Reviews Molecular Cell Biology, 2007. 8(6): p. 479-490; the disclosure of which is incorporated by reference herein in its entirety.) Various embodiments design high affinity protein binding RNA aptamers. De novo design of these structures can be accomplished through two different paths in accordance with various embodiments. Some embodiments design small "anchor" RNA structures that bind weakly to specific protein surfaces, while additional embodiments design connecting RNA structures. Certain embodiments combine these paths, to incorporate small, anchor RNA structures with connecting RNA structures. FIG. 9B illustrates a schematic of these paths, where 910 represents a protein bound to native RNA anchors. 912 illustrates modified anchors where certain contacts are removed from native anchors to reduce affinity between a protein and its native anchors. 914 illustrates an embodiment with a connecting RNA structure used on the native anchors to increase affinity between the protein and the native anchors. And, 916 illustrates a design incorporating connecting RNA structures in accordance with some embodiments, where the connecting RNA structure causes the modified anchors to have improved affinity between the protein and the modified anchors.

[0092] By choosing the sites of anchor structures and the paths of the RNA connections between them, embodiments design libraries of RNA aptamers de novo that are likely to have specific structural features. To do this, some embodiments first implement a method for determining specific patches of the protein surface that are most optimal for interacting with RNA, then certain embodiments design RNA structures at the protein surface. Several methods have been developed for predicting the RNA binding sites of RNA binding proteins using both structure and sequence-based approaches. (See e.g., Chen, Y. C., et al., Identifying RNA-binding residues based on evolutionary conserved structural and energetic features. Nucleic Acids Research, 2014. 42(3); Zhao, H. Y., et al., Structure-based prediction of RNA-binding domains and RNA-binding sites and application to structural genomics targets. Nucleic Acids Research, 2011. 39(8): p. 3017-3025; and Perez-Cano, L. and J. Fernandez-Recio, Optimal Protein-RNA Area, OPRA: A propensity-based method to identify RNA-binding sites on proteins. Proteins-Structure Function and Bioinformatics, 2010. 78(1): p. 25-35; the disclosures of which are incorporated by reference herein in their entireties.) Many embodiments adapt a structure-based method to predict patches of an arbitrary protein surface that are most optimal for interacting with RNA. Certain embodiments adapt Optimal protein-RNA area (OPRA) to predict patches of an arbitrary protein surface that are most optimal for interacting with RNA. (See e.g., Perez-Cano, L. and J. Fernandez-Recio, Optimal Protein-RNA Area, OPRA: A propensity-based method to identify RNA-binding sites on proteins. Proteins-Structure Function and Bioinformatics, cited above.) OPRA uses the probability of each amino acid being at an RNA/protein interface, calculated from a training set of RNA/protein complex structures, to assign an energy value to each amino acid. 
Then, for each amino acid on the surface of the protein, these energy values are summed over all of the neighboring residues within a certain distance cutoff, to give a set of patch scores. Some embodiments calculate updated probabilities for each amino acid using novel training sets. Certain embodiments utilize Rosetta to output optimal patch centers as a list of amino acids. (See e.g., Leaver-Fay, A., et al., ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol, 2011. 487: p. 545-74; the disclosure of which is incorporated by reference herein in its entirety.) A number of embodiments utilize these amino acids to aid in designing connecting RNA structures.
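The patch-score calculation described above can be sketched as a neighborhood sum over surface residues. The sketch below is not the OPRA implementation: the function name, the per-residue energy values, the coordinates, and the 10 Angstrom cutoff are all hypothetical, and each residue is counted as a member of its own patch by assumption.

```python
import math

def patch_scores(surface_residues, energy, cutoff):
    """Sum per-residue energy values over all neighbors within the cutoff.

    surface_residues: list of (residue_name, (x, y, z)) tuples.
    energy: mapping from residue name to an interface energy value.
    Each residue counts as its own neighbor (distance zero).
    """
    scores = []
    for _, center in surface_residues:
        total = sum(
            energy[name]
            for name, position in surface_residues
            if math.dist(center, position) <= cutoff
        )
        scores.append(total)
    return scores

# Hypothetical energy values and coordinates (Angstroms), for illustration only.
energy = {"ARG": -1.0, "LYS": -0.8, "ASP": 0.6}
surface = [("ARG", (0.0, 0.0, 0.0)), ("LYS", (3.0, 0.0, 0.0)), ("ASP", (20.0, 0.0, 0.0))]

scores = patch_scores(surface, energy, cutoff=10.0)
print(scores)
```

The residues whose patches receive the most favorable totals would then serve as candidate patch centers.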

[0093] At 908 of many embodiments, connecting structures are designed to connect the anchor RNA structures from 906. In many embodiments, the connecting RNA structures are designed using the structural modularity of RNA motifs to build new RNA structures by combining motifs found in the Protein Data Bank (PDB). Certain methods used in embodiments treat proteins as steric constraints by representing residues of an input structure as beads. However, further embodiments design the optimal connection structures by considering simple interactions with the protein. For example, some embodiments implement a representation for proteins that conserves information about residues and/or include a custom scorer object that rewards favorable interactions between the RNA and the protein for the design of RNA structures around proteins. In various embodiments, favorable interactions are defined as RNA structures that come within approximately 5 Å of positively charged protein residues. Further embodiments use a combination of methods described within this disclosure.
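A scorer of the kind described above can be sketched as a count of positively charged protein residues that lie within the distance threshold of any RNA bead. The sketch below is illustrative only: the function name, the bead representation, the coordinates, and the choice of arginine and lysine as the positively charged residues are assumptions, not details of any embodiment.

```python
import math

# Assumption of this sketch: arginine and lysine are treated as the
# positively charged residues.
POSITIVE_RESIDUES = {"ARG", "LYS"}

def favorable_contacts(rna_beads, protein_residues, cutoff=5.0):
    """Count positively charged residues within `cutoff` Angstroms of any RNA bead."""
    count = 0
    for name, position in protein_residues:
        if name not in POSITIVE_RESIDUES:
            continue
        if any(math.dist(position, bead) <= cutoff for bead in rna_beads):
            count += 1
    return count

# Hypothetical coordinates in Angstroms.
rna_beads = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
protein_residues = [
    ("ARG", (0.0, 4.0, 0.0)),  # 4 Angstroms from the first bead: counted
    ("LYS", (6.0, 0.0, 9.0)),  # 9 Angstroms from the nearest bead: not counted
    ("GLU", (0.0, 1.0, 0.0)),  # close, but not positively charged: ignored
]

n_contacts = favorable_contacts(rna_beads, protein_residues)
print(n_contacts)
```

A path search could subtract a reward proportional to this count from the cost of a candidate path, so that paths passing near positively charged surface residues are preferred.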

[0094] A schematic of method 900 is illustrated in FIG. 9C, where a target protein 920 is selected, then the RNA-binding regions 922 on the surface of the target protein are identified. The small "anchor" RNA structures 924 are shown to interact with the RNA-binding regions 922. Finally, RNA structures 926 connect the anchor RNA structures 924. Additionally, as noted above, certain embodiments bind multiple proteins with a single RNA scaffold, such as illustrated in FIG. 9D. These embodiments design several different connections between two aptamers designed as above, and additional RNA structures are added to connect the aptamers to form a single aptamer that binds to more than one protein.

Embodiments of RNA Nanostructures

[0095] Turning to FIGS. 10A-10J, some embodiments are directed to RNA nanostructures to link or join one or more RNA-containing molecules. Many of these embodiments comprise at least one RNA motif 102, while further embodiments include a plurality of RNA motifs 102 (FIG. 10A), where the RNA motifs are aligned end to end forming a chain. In a variety of embodiments, the RNA motifs are selected from canonical motifs (e.g., A-U and C-G base paired) and noncanonical motifs. FIG. 10B illustrates a number of embodiments where canonical motifs 104 and noncanonical motifs 106 are alternated throughout the RNA nanostructure.

[0096] Further embodiments of RNA nanostructures are connected to at least one anchor structure 108, where the anchor structures are selected from aptamers, tetraloops and/or tetraloop receptors (e.g., TTRs, including mini-TTRs), RNA-protein anchors, ribosomes, and other RNA structures. FIG. 10C illustrates an embodiment where one anchor structure 108 is located at one end of a plurality of RNA motifs 104, 106, while FIG. 10D illustrates an embodiment with two anchor structures, where anchor structures are located at each end of a plurality of RNA motifs 104, 106.

[0097] Certain embodiments of RNA nanostructures comprise an anchor structure located between RNA motifs 102, such as illustrated in FIG. 10E. Such embodiments are capable of holding one structure in a particular conformation (e.g., aptamers) to maintain aptamer function, while certain embodiments are capable of linking numerous anchor structures together. In some of the embodiments with a centrally located anchor structure 110 and with alternating canonical and noncanonical RNA motifs, the anchor structure 110 is flanked by canonical motifs 104 among alternating canonical 104 and noncanonical 106 motifs, effectively taking the place of a noncanonical RNA motif (FIG. 10F), while in other embodiments, anchor structure 110 is flanked by noncanonical motifs 106 among alternating canonical 104 and noncanonical 106 motifs, effectively taking the place of a canonical RNA motif (FIG. 10G).

[0098] Additional embodiments further comprise a combination of one or more centrally located anchor structures 110 flanked by one or more RNA motifs 102, with an anchor structure 108 located at at least one end, such as illustrated in FIG. 10H. FIG. 10I illustrates one such embodiment, where the RNA nanostructure comprises an aptamer 112 flanked by one or more RNA motifs 102 located on each side of the aptamer with a tetraloop 114 located at one end and a tetraloop receptor 116 located at the other end. Additionally, certain embodiments comprise a plurality of centrally located anchor structures (e.g., FIG. 9D), where a plurality of RNA anchors are joined by RNA motifs forming an RNA scaffold.

[0099] It should also be noted that certain embodiments are circularized in structure, such that one "end" of the RNA nanostructure is connected to the distal end of the RNA nanostructure, such as illustrated in FIG. 10J, where dashed line 118 represents a connection between one RNA motif 102 and a second motif 102.

EXEMPLARY EMBODIMENTS

[0100] Although the following embodiments provide details on certain embodiments of the inventions, it should be understood that these are only exemplary in nature, and are not intended to limit the scope of the invention.

EXAMPLE 1

Building RNA Nanostructures

[0101] Methods: To build a curated motif library of all RNA structural components, a set of non-redundant RNA crystal structures managed by the Leontis and Zirbel groups (version 1.45: rna.bgsu.edu/rna3dhub/nrlist/release/1.45) was obtained. (See Petrov, A. I., et al. (2013) Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas. RNA 19, 1327-1340; the disclosure of which is incorporated herein by reference in its entirety.) This set specifically removes redundant RNA structures that are identical to previously solved structures, such as ribosomes crystallized with different antibiotics. Each RNA structure was processed with Dissecting the Spatial Structure of RNA (DSSR) to extract every motif (see Lu, X.-J., et al. (2015) DSSR: an integrated software tool for dissecting the spatial structure of RNA. Nucleic Acids Res. 43, e142; the disclosure of which is incorporated herein by reference in its entirety), using the following command:

x3dna-dssr -i file.pdb -o file_dssr.out

[0102] Each extracted motif was checked to confirm that it was the correct type, as DSSR sometimes classifies tertiary contacts as higher-order junctions and vice versa. For each motif collected from DSSR, the X3DNA find_pair and analyze programs were run to determine the reference frame for the first and last base pair of each motif to allow for alignment between motifs:

[0103] The naming convention for each motif involves the motif classification, the originating PDB accession code, and a unique number to distinguish it from other motifs of the same type, all separated by periods. For example, TWOWAY.1GID.2 is a two-way junction from the PDB structure 1GID and is the third two-way junction to be found in this structure (numbering starts at 0). All motifs retain their original residue numbering, chain IDs and relative position compared to their originating structure.
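The naming convention above can be parsed with a short helper, sketched below. The `MotifName` type and `parse_motif_name` function are hypothetical names introduced for illustration and are not part of any library described herein; zero-based numbering is assumed, consistent with the example in which the suffix 2 denotes the third junction.

```python
from typing import NamedTuple

class MotifName(NamedTuple):
    """Components of a period-separated motif name."""
    motif_type: str  # motif classification, e.g. "TWOWAY"
    pdb_id: str      # originating PDB accession code
    index: int       # distinguishes motifs of the same type in one structure

def parse_motif_name(name):
    """Split a name such as 'TWOWAY.1GID.2' into its three fields."""
    motif_type, pdb_id, index = name.split(".")
    return MotifName(motif_type, pdb_id, int(index))

parsed = parse_motif_name("TWOWAY.1GID.2")
print(parsed)
# Under the zero-based assumption, index + 1 gives the ordinal position.
print(parsed.index + 1)
```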

[0104] In addition to the motifs derived from the PDB, the make-na web server (structure.usc.edu/make-na/server.html) was utilized to generate idealized helices of between 2 and 22 base pairs in length. (See Montange, R. K., and Batey, R. T. (2008) Riboswitches: emerging themes in RNA structure and function. Annu. Rev. Biophys. 37, 117-133; the disclosure of which is incorporated herein by reference in its entirety.) All motifs in these generated libraries are bundled with some embodiments and are grouped together by type (junctions, hairpins, etc.) in sqlite3 databases in the directory RNAMake/RNAMake/resources/motif_libraries/ (github.com/RNAMake/RNAMake/tree/master/RNAMake/resources/motif_libraries_new).

[0105] To build new RNA nanostructures, certain embodiments seek a path of RNA helices and noncanonical motifs that can connect two base pairs separated by a target translation and rotation. A depth-first search algorithm to discover such RNA paths was developed. The algorithm is guided by a heuristic cost function f inspired by prior manual design efforts. (See Grabow, W. W., and Jaeger, L. (2014) RNA self-assembly and RNA nanotechnology. Acc. Chem. Res. 47, 1871-1880; and Dibrov, S. M., et al. (2011) Self-assembling RNA square. Proc. Natl. Acad. Sci. USA 108, 6405-6408; the disclosures of which are incorporated herein by reference in their entirety.) The cost function is composed of two terms:

f(path)=h(path)+g(path) (eq. 1)

[0106] The first term, h(path), describes how close the last base pair in the path is to the target base pair; h(path)=0 corresponds to a perfect overlap in translation and rotation. The functional form for h(path) depends on the spatial position of each base pair's centroid d and an orthonormal coordinate frame R defining the rotational orientation of each base pair:

$$h(\mathrm{path}) = |\vec{d}_1 - \vec{d}_2| + W(|\vec{d}_1 - \vec{d}_2|) \sum_{i}^{3} \sum_{j}^{3} |R_{1ij} - R_{2ij}| \quad \text{(eq. 2)}$$

(See Filonov, G. S., et al. (2014) Broccoli: rapid selection of an RNA mimic of green fluorescent protein by fluorescence-based selection and directed evolution. J. Am. Chem. Soc. 136, 16299-16308; the disclosure of which is incorporated herein by reference in its entirety.)

Here, W(d) is:

[0107] $$W(d) = \begin{cases} 0, & \text{if } d > 150 \\ \log\frac{150}{d}, & \text{if } 1.5 < d < 150 \\ 2, & \text{if } 1.5 > d \end{cases} \quad \text{(eq. 3)}$$

[0108] Here, d is measured in Angstroms. The weight W(d) reduces the importance of rotational alignment between the current base pair and the target base pair when they are spatially far apart. This term conveys the intuition that aligning the two coordinate frames becomes important only as the path of motifs and helices approaches the target base pair. Embodiments readily allow for the exploration of alternative forms of the cost function terms in (eq. 2) and (eq. 3), including more standard rotationally invariant metrics to define rotation matrix differences (see Huynh, D. Q. (2009) Metrics for 3D rotations: comparison and analysis. J. Math. Imaging Vis. 35, 155-164; the disclosure of which is incorporated herein by reference in its entirety) or base-pair-to-base-pair RMSDs based on quaternions (see Karney, C. F. F. (2007) Quaternions in molecular modeling. J Mol Graph Model 25, 595-604; the disclosure of which is incorporated herein by reference in its entirety), but these were not tested in the current study.
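
The distance and orientation terms of (eq. 2) and (eq. 3) can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in the embodiments. It assumes a base-10 logarithm, which makes W(d) continuous at d = 1.5 (log10(150/1.5) = 2), and it assigns the boundary values d = 150 and d = 1.5, which (eq. 3) leaves open, to the neighboring branches. The function names weight and h_cost are hypothetical.

```python
import math


def weight(d):
    """W(d) from eq. 3, with d in Angstroms (base-10 log is an assumption)."""
    if d > 150.0:
        return 0.0
    if d > 1.5:
        return math.log10(150.0 / d)
    return 2.0


def h_cost(d1, R1, d2, R2):
    """h(path) from eq. 2 for centroids d1, d2 and 3x3 coordinate frames R1, R2."""
    dist = math.dist(d1, d2)
    # Sum of absolute element-wise differences between the two frames
    rot_diff = sum(abs(R1[i][j] - R2[i][j]) for i in range(3) for j in range(3))
    return dist + weight(dist) * rot_diff
```

With identical centroids and frames the function returns 0, matching the statement that h(path)=0 corresponds to a perfect overlap in translation and rotation.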

[0109] The second term in the cost function (eq. 1) is g(path), which parameterizes the properties of the non-canonical RNA motifs and helices comprising the path at each stage of the calculation:

$$g(\mathrm{path}) = \frac{S_{ss}(\mathrm{path})}{2} + 2 \times N_{motifs} \quad \text{(eq. 4)}$$

where S_ss is a secondary structure score for all the motifs and helices in the path. This S_ss term favors longer canonical helices as well as motifs with frequently recurring base pairs, as follows. All base pairs found in the RNA motif are scored based on their relative occurrences in all high-resolution crystal structures; all unpaired residues receive a penalty, and Watson-Crick base pairs receive an additional bonus score (Table 3).

TABLE 3 Scoring penalties for each base pair type

X3DNA bp Type    Leontis-Westhof    Energetic Penalty
cm-              N/A                 6.11
cM - M           tHH                 6.11
tW + W           tWW                 3.11
c. + M           N/A                 5.69
.W + W           N/A                 6.11
tW - M           tWH                 2.42
tm - M           tSH                 2.72
cW + M           cWH                 3.33
.W - W           N/A                 4.33
cM + .           N/A                 6.11
c. - M           N/A                 6.11
cM + W           cHW                 4.40
tM + m           N/A                 6.11
tM - W           tHW                 3.02
cm - m           cSS                 5.12
cM - W           tHW                 6.11
cW - W           cWW                -2.00
c. - M           N/A                 5.44
cm + M           cSH                 2.71
cm - M           tSH                 3.23
. . .            N/A                 4.18
cm - W           cSW                 4.37
tM - m           tSH                 2.84
c. - W           N/A                 6.11
cM + m           cHS                 5.69
cM - m           tSH                 3.12

Values were derived based on logarithms of the frequencies of these elements in the crystallographic database, i.e., the inverse Boltzmann approximation (see Finkelstein, A. V., et al. (1995) Why do protein architectures have Boltzmann-like statistics? Proteins 23, 142-150; the disclosure of which is incorporated herein by reference in its entirety), so that the frequency of the elements in some embodiment designs was similar to what is seen in natural RNA tertiary structures. In addition to the secondary structure score, N_motifs penalizes the total number of motifs in the path, here taken as the number of non-canonical motifs plus the number of canonical motifs (e.g., helices, independent of helix length).
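
As a worked illustration of (eq. 4), the following Python sketch combines a few penalties copied from Table 3 into a secondary structure score and then into g(path). It rests on two assumptions that are not stated in this section: that S_ss is a plain sum of per-element scores, and that the penalty for an unpaired residue is supplied by the caller because its value is not given here. The names BP_PENALTY, secondary_structure_score and g_cost are hypothetical.

```python
# A few entries copied from Table 3 (X3DNA base-pair type -> penalty).
BP_PENALTY = {"cW - W": -2.00, "tW + W": 3.11, "tW - M": 2.42, "cW + M": 3.33}


def secondary_structure_score(bp_types, n_unpaired, unpaired_penalty):
    """Assumed form of S_ss: sum of base-pair penalties plus unpaired penalties."""
    return sum(BP_PENALTY[t] for t in bp_types) + n_unpaired * unpaired_penalty


def g_cost(s_ss, n_motifs):
    """g(path) from eq. 4."""
    return s_ss / 2.0 + 2.0 * n_motifs


# Two Watson-Crick pairs and one tW - M pair spread over two motifs
s_ss = secondary_structure_score(["cW - W", "cW - W", "tW - M"], 0, 1.0)
g = g_cost(s_ss, n_motifs=2)
```

In this example S_ss is -1.58 and g is 3.21, showing how the Watson-Crick bonus offsets part of the per-motif penalty.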

[0110] The search adds motifs and helices to the path in a depth-first manner while the total cost function f(path) decreases, back-tracking if f(path) increases. Any solutions with h(path) less than 5, i.e., overlap at approximately nucleotide resolution between the path's last base pair and the target base pair, are accepted into a list of final designs. The balance between g(path) and h(path) allows some embodiments to reduce the number of motif combinations considered, finding most solutions in a few seconds. For each solution, EteRNAbot, a secondary structure optimization algorithm that has undergone extensive empirical tests, was used to fill in helix sequences. (See Lee, J., et al. (2014) RNA design rules from a massive open laboratory. Proc. Natl. Acad. Sci. USA 111, 2122-2127; the disclosure of which is incorporated herein by reference in its entirety.)
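
The search strategy described above can be sketched on a deliberately simplified toy problem. In this Python sketch, building blocks are one-dimensional displacements rather than three-dimensional motifs, h is the remaining distance to the target, and g is reduced to 2 × N_motifs (the S_ss term is omitted). The function name depth_first_paths, the block names and lengths, and the max_motifs limit are all hypothetical; the sketch only illustrates extending a path while f decreases, back-tracking otherwise, and accepting paths with h below 5.

```python
def depth_first_paths(steps, target, accept_h=5.0, max_motifs=10):
    """Toy 1-D search: `steps` is a list of (name, length) building blocks."""
    solutions = []

    def h(pos):
        return abs(target - pos)          # distance to target (stand-in for eq. 2)

    def f(pos, n_motifs):
        return h(pos) + 2.0 * n_motifs    # eq. 1 with g reduced to 2 * N_motifs

    def extend(path, pos, f_parent):
        for name, length in steps:
            new_path = path + [name]
            new_pos = pos + length
            f_new = f(new_pos, len(new_path))
            if f_new >= f_parent:
                continue                  # cost did not decrease: back-track
            if h(new_pos) < accept_h:
                solutions.append(new_path)
            elif len(new_path) < max_motifs:
                extend(new_path, new_pos, f_new)

    extend([], 0.0, f(0.0, 0))
    return solutions


blocks = [("A", 30.0), ("B", 12.0)]       # hypothetical block lengths
found = depth_first_paths(blocks, target=60.0)
```

Because a branch is abandoned as soon as f stops decreasing, most combinations of blocks are never enumerated, which is the pruning effect attributed above to the balance between g(path) and h(path).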

[0111] Proteins that are included in the coordinates supplied to embodiments are represented as steric beads centered at the Cα atom of each amino acid. This representation allows embodiments to avoid steric clashes with proteins, particularly for the ribosome tethering problems.
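
A bead representation of this kind could be derived from PDB-format coordinates as sketched below in Python. The sketch relies on the fixed-column layout of PDB ATOM records (atom name in columns 13-16, coordinates in columns 31-54). The clash cutoff is a caller-supplied, hypothetical parameter, since no bead radius is given here, and the function names ca_beads and clashes are likewise hypothetical.

```python
import math


def ca_beads(pdb_lines):
    """Return (x, y, z) for every C-alpha atom found in PDB ATOM records."""
    beads = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            beads.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return beads


def clashes(rna_atoms, beads, cutoff):
    """True if any RNA atom lies within `cutoff` Angstroms of any protein bead."""
    return any(math.dist(a, b) < cutoff for a in rna_atoms for b in beads)
```

Reducing each amino acid to a single bead keeps the clash test cheap enough to evaluate for every candidate path.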

[0112] Results: The above method generated a multitude of RNA nanostructure designs, as seen in FIGS. 5B, 6B, 7, and 8A, in a relatively short amount of time, as illustrated in FIGS. 4A and 4B.

[0113] Conclusion: Embodiments reveal a novel approach to solving RNA pathfinding problems.

EXAMPLE 2

Design, Synthesis and Experimental Testing of TTR Linking Constructs

[0114] Background: The problem of creating a well-folded RNA nanostructure was first solved two decades ago by repurposing the well-characterized tetraloop/receptor (TTR) tertiary contact to bring together two separate RNA chains, analogous to the P4-P6 domain of the Tetrahymena group I self-splicing intron and other natural functional RNAs. While later RNA nanotechnology studies used the TTR module and other structural motifs to design different nanostructures, the original and later designs have all been multi-chain assemblies. (See Bindewald, E., et al. (2008) Computational strategies for the automated design of RNA nanoscale structures from building blocks using NanoTiler. J Mol Graph Model 27, 299-308; Dibrov, S. M., et al. (2011) Self-assembling RNA square. Proc. Natl. Acad. Sci. USA 108, 6405-6408; Afonin, K. A., et al. (2014) Multifunctional RNA nanoparticles. Nano Lett. 14, 5662-5671; Khisamutdinov, E. F., et al. (2016) Fabrication of RNA 3D nanoprisms for loading and protection of small RNAs and model drugs. Adv. Mater. Weinheim 28, 10079-10087; and Huang, L., and Lilley, D. M. J. (2016) A quasi-cyclic RNA nano-scale molecular object constructed using kink turns. Nanoscale 8, 15189-15195; the disclosures of which are incorporated herein by reference in their entirety.) The TTR problem was chosen for testing embodiments due to the prospect of achieving the first de novo single-chain solutions to this fundamental problem, which we hypothesized might also help crystallization.

[0115] Methods: To generate TTR linking designs, the coordinates from the X-ray crystal structure of a TTR from the P4-P6 domain of the Tetrahymena ribozyme (residues 146-157, 221-246, and 228-252 from PDB 1GID) were first extracted. Embodiments were then used to build structural segments composed of two-way junctions and helices spanning the last base pair of the hairpin (A146-U157) to base pair U221-A252 of the tetraloop-receptor, thus connecting the TTR into a single continuous strand (FIG. 3). Of 200,000 RNA segments generated, sixteen were selected based on two criteria: 1) the smallest number of motifs used in the solution (i.e., only three unique tertiary motifs); and 2) the tightest predicted atom-wise alignment of the TTR linking design to its target spatial and rotational orientations. These computational designs ranged from 75 to 102 nucleotides in size (for full sequences, see sequence list), significantly shorter than the 157 nucleotides of the natural P4-P6 domain RNA.

[0116] To probe the structures of the TTR linking designs generated by embodiments, quantitative chemical mapping with selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) and dimethyl sulfate (DMS) was performed. For all 16 designs illustrated in FIG. 5B, the SHAPE and DMS reactivity of each TTR linking RNA was compared to its respective secondary structure.

[0117] To evaluate the formation of tertiary structure, the change in DMS reactivity of both tetraloop and tetraloop-receptor adenines as a function of Mg²⁺ concentration was investigated. Previous studies have demonstrated that TTR formation in the P4-P6 domain is strongly stabilized by Mg²⁺. As a control for the unfolded state, the DMS reactivities of the tetraloop and tetraloop-receptor adenines of the TTR of the P4-P6 domain without Mg²⁺ (A248, A151, A152, and A153) were measured.

[0118] As an independent test of TTR linking construct folding, each RNA's GAAA tetraloop was replaced with a UUCG tetraloop, which does not form the sequence-specific TTR tertiary contact and is predicted to reduce the RNA's mobility in non-denaturing polyacrylamide gel electrophoresis, as observed for the P4-P6 domain.

[0119] After the gel-based and chemical mapping tests above, we tested whether the embodiment designs might allow crystallization and thereby enable high-resolution characterization of the structural accuracy of the designs. Crystals of miniTTR 6 that diffracted at 2.55 Å resolution (I/σ of 1.0) were obtained. Purified miniTTR 6 RNA diluted in buffer A (30 mM HEPES (pH 7.5), 20 mM MgCl2, and 100 mM KCl) was incubated at 65 °C for 2 min, centrifuged at 13,000 rpm for 2 min, and snap-cooled on ice for approximately 5 min before moving to 25 °C to set up crystallization trays. Within 2-4 weeks, miniTTR 6 crystallized at 25 °C as plates or clusters of plates via sitting-drop vapor diffusion by mixing 2 µL of miniTTR 6 at a concentration of 100 µM with 3 µL of crystallization solution containing 40 mM sodium cacodylate (pH 5.5), 20 mM MgCl2, 2 mM cobalt hexammine, and 40% 2-methyl-2,4-pentanediol (MPD). Crystals of miniTTR 6 grew to maximum dimensions of 700×700×20 µm and were stabilized and cryogenically protected by increasing the MPD to a final concentration of 44%. Crystals were flash-frozen by plunging into liquid nitrogen. Diffraction data were collected at 100 K using synchrotron X-ray radiation at beamline 4.2.2 of the Advanced Light Source, Lawrence Berkeley National Laboratory (Berkeley, Calif.). The data were processed and scaled using X-ray Detector Software (XDS). The scaled data were handled using Collaborative Computational Project programs.

[0120] The initial structural determination of the miniTTR 6 in the C2 space group was carried out by molecular replacement (MR) in Phaser (CCP4), searching for one copy of a 31-nucleotide model of only the tetraloop and receptor with the identical sequence. The rotational and translational Z-scores were somewhat low, 4.6 and 5.9 respectively, but the maps were of sufficient quality to enable the iterative building of all the residues into the 2Fo-Fc and Fo-Fc maps. Composite omit maps in PHENIX were used to help confirm the model and reduce model bias from the initial MR solution. The models were built using COOT and refined using REFMAC5 and PHENIX. The final model was refined in REFMAC5 and ERRASER, and the overall Rwork and Rfree were refined to 22.9% and 27.4%, respectively. The structure derived from the miniTTR was refined to 2.55 Å against a data set scaled to an overall I/σ of 1.0 at the highest resolution shell with 98.5% completeness.

[0121] Results: Of the 1386 nucleotides in the sixteen TTR linking constructs, 1367 (98.6%) were either reactive at target unpaired regions or protected at target helical residues, supporting the predicted secondary structures. All 19 outliers occurred at helix edges (i.e., flanking base pairs of motifs). These data supported the formation of the expected secondary structures for all TTR linking designs (see Table 1).

[0122] Several TTR linking constructs required less than 1 mM Mg²⁺ to fold stably, similar to or better than reported midpoints for natural TTR-containing RNA nanostructures. Indeed, miniTTR 2 and miniTTR 16 exhibited folding stabilities better than the P4-P6 RNA in side-by-side assays. Furthermore, miniTTR 6 has a much sharper Mg²⁺ dependence than P4-P6, with an apparent Hill coefficient of over 10. The adenines of the P4-P6 control without Mg²⁺ (A248, A151, A152, and A153) exhibited reactivities of 1.27, 0.72, 0.70, and 0.90, respectively. The values are normalized to the reactivity of the reference hairpin loops that flank each design. Upon the addition of 10 mM Mg²⁺, the adenines involved in the TTR became protected from DMS modification in the P4-P6 control. As with this folding control, for 12 of the 16 designs (miniTTRs 1, 2, 5-7, 9-12 and 14-16), we observed a more than two-fold decrease in the reactivity of the TTR adenine residues. These results were consistent with Mg²⁺-dependent TTR formation. The remaining designs (miniTTRs 3, 4, 8 and 13) did not demonstrate significant changes in DMS reactivity upon addition of 10 mM Mg²⁺, indicating that the TTR interaction did not form.

[0123] Of the 16 TTR linking constructs tested, 12 designs displayed mobility shifts consistent with the formation of the TTR tertiary contact (See Table 1). Constructs 4 and 15 exhibited mobility shifts that were inconsistent with our chemical mapping results. The UUCG mutant of miniTTR design 4 displayed a mobility shift, but it did not demonstrate a full two-fold decrease in TTR DMS reactivity, suggesting partial folding. Compared to its UUCG mutant, miniTTR design 15 in the wild-type form (GAAA tetraloop) exhibited a wide, slow-mobility band. In all other cases, the electrophoretic mobility measurements were concordant with our quantitative SHAPE and DMS chemical mapping data, supporting the formation of the TTR and a compact tertiary fold.

[0124] The crystal structure and the embodiment model agreed with an all-heavy-atom RMSD of 4.2 Å, better than the nanometer-scale accuracy typically sought in RNA nanotechnology. The primary discrepancy between the modeled 3D structure and the crystal structure was a single motif, a triple mismatch drawn from the large ribosomal subunit. This motif formed multiple consecutive non-canonical base pairs with high B-factors in our miniTTR 6 crystal instead of the conformation found in the ribosomal structure, which involved flipped-out adenosines (residues: O2360-O2363, O2424-O2426, PDB:1S72), as shown in FIGS. 11A and 11B, where FIG. 11A illustrates the modeled motif structure, while FIG. 11B illustrates the crystallographic structure. Other motifs in the design achieved near-atomic accuracy, including the TTR tertiary contact (RMSD 0.45 Å; FIG. 11C), a kink-turn variant drawn from the archaeal 50S ribosomal subunit (RMSD 2.0 Å; FIG. 11D), and a `right angle turn` drawn from a viral internal ribosomal entry site domain (RMSD 1.28 Å; FIG. 11E).

[0125] Conclusion: The stability of the TTR linking designs was particularly notable given that P4-P6 and other natural TTR-containing RNAs are larger than the miniTTR designs and have additional stabilizing tertiary contacts, and that other attempts to make artificial minimized TTR constructs have given significantly worse stabilities.

EXAMPLE 3

Automated 3D Design of Covalently Tethered Ribosomal Subunits

[0126] Background: The ribosome is a ribonucleoprotein machine dominated by two extensive RNA subunits, the 16S and 23S rRNAs. Previous work constructed a tethered ribosome called Ribo-T, in which the large and small subunit rRNAs were connected by an RNA tether to form a single subunit ribosome. In that work, the major bottleneck involved a year of numerous trial-and-error iterations to identify RNA tethers that were not cleaved by ribonucleases in vivo when wild type ribosomes were replaced in the Squires strain (SQ171fg) of E. coli. SQ171fg cells lack genetic rRNA alleles, surviving off plasmids that can be exchanged using positive and negative selections. Early failure rounds involving ribosomes from prior studies are shown in FIGS. 12A-12B and success with Ribo-T in FIG. 12C. Nevertheless, the current tethers in Ribo-T are unstructured and unlikely to remain stable if other modules are incorporated (FIG. 12C). It was hypothesized that automated design by the embodiment might give structured, chemically stable tethers for this design problem.

[0127] Methods: For ribosome tether designs, PDB coordinates 3R8T and 4GD2 were used for the 50S and 30S ribosomal subunit structures respectively. From the 50S coordinates, we removed residues A2854-A2863 and, from the 30S, we removed residues A1445-A1457. These designs contained either four or five noncanonical structural motifs each to tether the H101 helix on a circularly permuted 23S rRNA to the h44 helix on the 16S rRNA (FIG. 6B). Of the nine diverse solutions we tested (RM-Tether 1 to 9), DNA templates for seven could be synthesized, and transformation of these DNA templates into SQ171fg allowed an assay as to whether the generated designs could replace wild type ribosomes deleted from growing bacteria.

[0128] The designed tethers were cloned into plasmid pRibo-T-A2058G. The backbone was generated for each design using forward (f) and reverse (r) primer pairs in separate PCR reactions using plasmid pRibo-T as a template, Phusion polymerase (NEB), and 3% DMSO. PCR cycling was as follows: 98 °C for 3 min; 25 cycles of 98 °C for 30 sec, 55 °C for 30 sec, 72 °C for 2 min; and 72 °C for 10 min. Circularly permuted 23S ribosomal RNA (rRNA) was generated with forward and reverse primer pairs, the pRibo-T template, and the same PCR conditions as described above. Each PCR reaction was purified by gel extraction from a 0.7% agarose gel with an E.Z.N.A. gel extraction kit (Omega). Each purified backbone (50 ng) was assembled with the respective 23S insert in 3-fold molar excess using Gibson assembly. Assembly reactions were transformed into POP2136 cells, and the cells were grown at 30 °C overnight. Colonies were picked and plasmids were isolated using an E.Z.N.A. miniprep kit (Omega) and confirmed with full plasmid sequencing by ACGT, Inc.

[0129] Each purified plasmid (100 ng) was separately transformed into electrocompetent SQ171fg cells containing pCSacB. Cells were recovered in 1 mL of SOC media at 37 °C with shaking for 1 hour. Fresh SOC (1.85 mL) supplemented with 50 µg/mL carbenicillin and 0.25% sucrose was inoculated with 250 µL of recovered cells and incubated overnight at 37 °C with shaking. Cultures (10% and 90%) were plated on LB agar plates supplemented with 50 µg/mL carbenicillin, 5% sucrose and 1 mg/mL erythromycin and incubated at 37 °C.

[0130] After 48 hours with no visible colonies, the plates were replica plated onto fresh LB agar plates supplemented with 50 µg/mL carbenicillin, 5% sucrose and 1 mg/mL erythromycin and incubated at 37 °C. After 72 additional hours, colonies appeared on the plate containing RM-Tether design 4. Eight colonies were streaked onto LB agar supplemented with 50 µg/mL carbenicillin and 1 mg/mL erythromycin and LB agar supplemented with 30 µg/mL kanamycin (to confirm loss of the pCSacB plasmid) and were also used to inoculate 5 mL of LB supplemented with 50 µg/mL carbenicillin and 1 mg/mL erythromycin. Plates were incubated at 37 °C, and cultures were incubated at 37 °C with shaking. The OD600 of the cultures was tracked to generate growth curves (Biochrom Libra S4 spectrophotometer). After 5 days at 37 °C, total RNA was extracted using an RNA extraction kit from Qiagen. Total RNA was analyzed by gel electrophoresis on a 1% agarose gel with GelRed. Total plasmid was extracted from saturated 5 mL cultures with an E.Z.N.A. miniprep kit (Omega) and sequenced to confirm the correct RM-Tether design 4 sequence.

[0131] For in vitro characterization of ribosomes, all constructs (wild type, Ribo-T v1.0, and RM-Tether 4) were cloned to be under control of a T7 promoter. The T7 promoter was introduced into primers and amplified using the wild type, Ribo-T v1.0, and RM-Tether 4 plasmids as templates for PCR amplification. PCR products were blunt-end ligated, transformed into DH5α E. coli cells using electroporation, and plated onto LB-agar/ampicillin plates at 37 °C. Plasmid was recovered from the resulting clones and sequence confirmed.

[0132] In vitro ribosome synthesis, assembly, and translation (iSAT) reactions were set up as previously described. Briefly, eight 15 µL reactions were prepared and incubated for 2 hours at 37 °C, then pooled together.

[0133] Sucrose gradients were prepared from buffer C (10 mM Tris-OAc (pH 7.5 at 4 °C), 60 mM NH4Cl, 7.5 mM Mg(OAc)2, 0.5 mM EDTA, 2 mM DTT) with 10 and 40% sucrose in SW41 polycarbonate tubes using a Biocomp Gradient Master. Gradients were placed in SW41 buckets and chilled to 4 °C. 120 µL of pooled iSAT reactions were loaded onto the gradients. The gradients were ultra-centrifuged at 22,500 rpm for 17 hours at 4 °C, using an Optima L-80 XP ultracentrifuge (Beckman-Coulter) at medium acceleration and braking (setting of 5 for each). Gradients were analyzed with a BR-188 density gradient fractionation system (Brandel) by pushing 60% sucrose into the gradient at 0.75 mL/min (at normal speed). Traces of A254 readings versus elution volumes were obtained for each gradient. Gradient fractions were collected and analyzed for rRNA content by gel electrophoresis in 1% agarose and imaged in a GelDoc Imager (Bio-Rad). Ribosome profile peaks were identified based on the rRNA content as representing 30S or 50S subunits, 70S ribosomes, or polysomes.

[0134] Fractions containing 70S ribosomes and polysomes were collected and pooled. These fractions were recovered as previously described, with pelleted iSAT ribosomes resuspended in iSAT buffer, aliquoted, and flash-frozen. These pelleted fractions were re-run on a 1% agarose gel and imaged in a GelDoc Imager to confirm tethering in monosome and polysome peaks.

[0135] For SHAPE-seq, in vitro ribosome synthesis, assembly, and translation reactions were set up as previously described. (See Jewett, M. C., et al. (2013) In vitro integration of ribosomal RNA synthesis, ribosome assembly, and translation. Mol. Syst. Biol. 9, 678; and Fritz, B. R., et al. (2015) Implications of macromolecular crowding and reducing conditions for in vitro ribosome construction. Nucleic Acids Res. 43, 4774-4784; the disclosures of which are incorporated herein by reference in their entirety.) Briefly, 15 µL iSAT reactions each possessing wild type, Ribo-T, or RM-40 were prepared in triplicate, incubated for 2 hours at 37 °C, and then placed on ice. To perform SHAPE modification, samples were warmed to 37 °C for 5 minutes, and 7.5 µL of each sample was added to 0.83 µL of 65 mM 1-methyl-7-nitroisatoic anhydride (1M7) or 0.83 µL DMSO (control solvent). Reactions were incubated for 2 minutes, then all samples were Trizol extracted, ethanol precipitated, washed twice with 70% ethanol, and resuspended in 10 µL water. Subsequent library preparation steps were performed as described previously with one exception: 2 custom reverse transcription primers were used to simultaneously probe the regions containing T1 (5'-GGTTAAGCCTCACGG-3') and T2 (5'-CCCTACGGTTACCTTGTTACGAC-3'). (See Watters, K. E., et al. (2016) Simultaneous characterization of cellular RNA structure and function with in-cell SHAPE-Seq. Nucleic Acids Res. 44, e12; the disclosure of which is incorporated herein by reference in its entirety.) Following 2×75 bp paired-end Illumina sequencing, SHAPE reactivities were calculated as described by Yu et al., mapping both modification-induced stops and mutations. (See Yu et al. (2018) Estimating RNA structure chemical probing reactivities from reverse transcriptase stops and mutations, BioRxiv; the disclosure of which is incorporated herein by reference in its entirety.)
Raw reactivities were calculated using Spats v1.9.8, and were then linearly re-scaled to account for estimated differences in SHAPE probe concentration between replicates. Specifically, one replicate was first selected as the reference. Reactivities for the other datasets were divided by the reference at each position, then the median value of this ratio was taken as the scale factor. Reactivities across each dataset were divided by their scale factor. The same experimental replicate was used to scale reactivities, and reactivities are presented as the average value over these re-scaled replicates.
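
The re-scaling procedure described above can be sketched in Python as follows. This is an illustrative sketch rather than the code used in the study; it assumes each replicate is a list of reactivities of equal length, and it skips positions where the reference reactivity is zero when forming ratios (an assumption, since the handling of zeros is not described). The function names scale_factor and rescale_and_average are hypothetical.

```python
from statistics import median


def scale_factor(reference, other):
    """Median of position-wise ratios other/reference (zero-reference positions skipped)."""
    ratios = [o / r for o, r in zip(other, reference) if r != 0]
    return median(ratios)


def rescale_and_average(replicates, ref_index=0):
    """Divide each replicate by its scale factor, then average position by position."""
    reference = replicates[ref_index]
    scaled = []
    for rep in replicates:
        s = scale_factor(reference, rep)   # equals 1 for the reference itself
        scaled.append([x / s for x in rep])
    n = len(scaled)
    return [sum(col) / n for col in zip(*scaled)]
```

Using the median ratio rather than the mean makes the scale factor less sensitive to a few positions with unusually high or low reactivity.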

[0136] Results: One of these seven constructs, RM-Tether 4 (FIG. 12D), led to viable growth of bacterial colonies. DNA sequencing confirmed that these colonies harbored the correct RM-Tether 4 plasmid, and RNA electrophoresis confirmed the presence of a single dominant RNA species with the same length as Ribo-T, with no detectable products corresponding to separate 16S or 23S rRNA lengths or other cleavage products. While the growth rate of this strain was low (FIG. 6C), it was independently confirmed that the ribosomes loaded on mRNA in vitro, using integrated synthesis, assembly, and translation (iSAT) in ribosome-free S150 extracts. Similar to Ribo-T, 70S/monosome and polysome peaks (and no 30S or 50S subunits) were detected by separation of iSAT-prepared RM-Tether 4 ribosomes on a sucrose gradient (FIG. 12E). Electrophoresis of the polysome fraction confirmed that it contained an uncleaved rRNA the same size as Ribo-T (FIG. 12F). In addition, SHAPE-Seq mapping on this rRNA confirmed that the RM-Tether 4 can be reverse transcribed from one ribosomal subunit to the other across both strands of the tether and showed chemical reactivity consistent with the design, with one region of flexibility around the middle junction, as seen in FIGS. 13A-13C, where FIG. 13A illustrates a wild-type ribosome, FIG. 13B illustrates a Ribo-T tethered ribosome, and FIG. 13C illustrates a ribosome tethered with RM-Tether 4.

[0137] Conclusion: Taken together, these data demonstrate that embodiment-designed ribosomes with structured, chemically stable tethers can replace wild type ribosomes in vivo and that more than one such ribosome can be loaded onto a single message in vitro. Embodiments obviate the repeated rounds of trial and error that were previously required to achieve these design goals.

EXAMPLE 4

Automated Improvement of ATP-Binding RNA Aptamers

[0138] Background: Small molecules can be bound and sensed by artificially selected RNA aptamers. Unfortunately, these molecules often exhibit weakened binding affinities or instability in biological environments, and additional rounds of selection to improve aptamers typically give diminishing returns. (See Carothers, J. M., et al. (2006) Aptamers selected for higher-affinity binding are not more specific for the target ligand. J. Am. Chem. Soc. 128, 7929-7937; Paige, J. S., et al. (2011) RNA mimics of green fluorescent protein. Science 333, 642-646; and Ellington, A. D., and Szostak, J. W. (1990) In vitro selection of RNA molecules that bind specific ligands. Nature 346, 818-822; the disclosures of which are incorporated herein by reference in their entirety.)

[0139] Methods: Starting with PDB 1AM0 we removed residues A6-A18 and A33-A35 to achieve a minimal ATP aptamer flanked by single Watson-Crick base pairs. We moved these residues into a new PDB `ATP_min.pdb`.

[0140] Results: In all, 5210 designs were generated. As with previous construct designs, designs were selected that maximized motif usage and minimized the chain closure score, i.e., how close the optimized sequence is to the target base pair. In total, 10 ATP aptamers embedded by an embodiment into scaffolds with tetraloop/receptor contacts, which we called ATP-TTR designs, were tested (FIG. 7). Chemical mapping confirmed that four of these RNAs formed the TTR and also retained their ability to bind to ATP, as assessed by DMS protection of aptamer nucleotides A13 and A14 (Table 2). Titrations of ATP read out through chemical mapping (Table 2; FIG. 14A) showed that three designs achieved better ATP dissociation constants (Kd of 1.5, 4.1, and 1.4 µM) than the isolated ATP aptamer under the same conditions (Kd=16.2 µM), improvements by up to an order of magnitude. Three of the ATP-TTRs gave ligand-free DMS reactivity profiles in the aptamer regions similar to the ligand-bound aptamer, suggesting that they pre-form the structure needed for ATP binding rather than requiring conformational rearrangements observed in the isolated ATP aptamer (FIGS. 14B-14C; Table 2).
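
The size of the affinity improvement follows directly from the dissociation constants quoted above. The short Python calculation below is only a worked check of that arithmetic; the variable names are hypothetical.

```python
KD_ISOLATED_UM = 16.2            # isolated ATP aptamer, in micromolar
KD_DESIGNS_UM = [1.5, 4.1, 1.4]  # the three improved ATP-TTR designs

# Fold improvement = Kd of the isolated aptamer / Kd of the design
fold_improvement = [round(KD_ISOLATED_UM / kd, 1) for kd in KD_DESIGNS_UM]
print(fold_improvement)
```

The ratios are about 10.8, 4.0 and 11.6, so two of the three designs improve on the isolated aptamer by roughly an order of magnitude.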

[0141] Conclusion: These results demonstrate that the TTR peripheral contact efficiently couples to enhance binding of ATP in the aptameric region, as desired. As a further test of this coupling, we confirmed that the Mg²⁺ requirement for forming the TTR was reduced in the presence compared to the absence of the small molecule ligand in these constructs (FIG. 14D).

EXAMPLE 5

Automated Improvement of Spinach RNA Aptamers

[0142] Background: Binding to Spinach enhances the fluorescence of DFHBI by ~1,000-fold relative to unbound ligand, making this RNA useful for biological interrogations, although its binding affinity, brightness, folding efficiency and biological stability remain poor even after extensive efforts to discover improvements such as the minimized Spinach and Broccoli aptamers. (See Paige, J. S., et al. (2011) RNA mimics of green fluorescent protein. Science 333, 642-646; Kellenberger, C. A., et al. (2015) RNA-Based Fluorescent Biosensors for Live Cell Imaging of Second Messenger Cyclic di-AMP. J. Am. Chem. Soc. 137, 6432-6435; Strack, R. L., et al. (2013) A superfolding Spinach2 reveals the dynamic nature of trinucleotide repeat-containing RNA. Nat. Methods 10, 1219-1224; Filonov, G. S., et al. (2014) Broccoli: rapid selection of an RNA mimic of green fluorescent protein by fluorescence-based selection and directed evolution. J. Am. Chem. Soc. 136, 16299-16308; Ketterer, S., et al. (2015) Systematic reconstruction of binding and stability landscapes of the fluorogenic aptamer spinach. Nucleic Acids Res. 43, 9564-9572; and Song, W., et al. (2014) Plug-and-play fluorophores extend the spectral properties of Spinach. J. Am. Chem. Soc. 136, 1198-1201; the disclosures of which are incorporated herein by reference in their entirety.)

[0143] Methods: Starting with PDB 6614 we removed residues R19-R31 and R49-R66 to achieve the minimal DFHBI binding aptamer (Spinach_min.pdb).

[0144] A stock of DFHBI (Sigma) was prepared in PBSMKT (1× phosphate buffered saline, 5 mM MgCl2, 100 mM KCl, 0.01% Tween-20, pH 7.2) and its absorbance measured using a UV spectrophotometer (NanoDrop, Thermo Scientific). The DFHBI concentration was calculated using an extinction coefficient of 30,100 M⁻¹ cm⁻¹ at 423 nm as previously reported. (See Paige, J. S., et al. (2011) RNA mimics of green fluorescent protein. Science 333, 642-646; the disclosure of which is incorporated herein by reference in its entirety.) A DFHBI titration was performed in half-area, flat-bottomed black 96-well plates (Corning) at a final RNA concentration of 200 nM with DFHBI concentration ranging from 10 µM to 10 nM prepared in a 1:2 dilution series. After mixing, the plates were covered with an adhesive film to prevent evaporation and temperature-cycled from room temperature to 4 °C twice over the course of 1 hour to allow aptamer-target equilibration while minimizing magnesium-dependent self-cleavage. Measurements were acquired at room temperature; wells were excited at 462±10 nm and emission was measured at 504±15 nm using a Tecan M1000 plate reader. A fluorescence background was obtained at each DFHBI concentration in the absence of RNA and subtracted from the corresponding wells. The corrected signal for each aptamer at every DFHBI concentration was then least-squares fit using a custom MATLAB script using a 1:1 complexation model according to the following equation:

$$F = B_{\max} \cdot \frac{[T]}{[T] + K_d} \qquad \text{(eq. 5)}$$

Here, [T] is the concentration of DFHBI, K_d is the dissociation constant of the given aptamer, and B_max is the maximum brightness obtained for the given concentration of aptamer.
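The custom MATLAB script used for the fit to eq. 5 is not reproduced in this disclosure. Purely as an illustration, the following Python sketch fits the same 1:1 model to a titration series. It assumes a simple strategy that is not taken from the source: because the model is linear in B_max, the best B_max for any trial K_d has a closed form, so only K_d is searched, here on a logarithmic grid. The function names, the grid bounds, and the synthetic data are hypothetical.

```python
def model(t, b_max, kd):
    """Eq. 5: F = B_max * [T] / ([T] + K_d)."""
    return b_max * t / (t + kd)


def best_bmax(ts, fs, kd):
    """Closed-form least-squares B_max for a fixed trial K_d."""
    xs = [t / (t + kd) for t in ts]
    return sum(x * f for x, f in zip(xs, fs)) / sum(x * x for x in xs)


def sse(ts, fs, b_max, kd):
    """Sum of squared residuals between data and eq. 5."""
    return sum((f - model(t, b_max, kd)) ** 2 for t, f in zip(ts, fs))


def fit_eq5(ts, fs, kd_lo=1e-3, kd_hi=1e3, steps=6001):
    """Search log-spaced K_d values (same units as ts); return (B_max, K_d)."""
    best = None
    for i in range(steps):
        kd = kd_lo * (kd_hi / kd_lo) ** (i / (steps - 1))
        b = best_bmax(ts, fs, kd)
        err = sse(ts, fs, b, kd)
        if best is None or err < best[0]:
            best = (err, b, kd)
    return best[1], best[2]


# 1:2 dilution series starting at 10 uM (11 points, ending near 10 nM).
dfhbi_um = [10.0 / 2 ** i for i in range(11)]
# Synthetic, noise-free, background-subtracted signal (hypothetical values).
signal = [model(t, 1000.0, 0.5) for t in dfhbi_um]
b_max_fit, kd_fit = fit_eq5(dfhbi_um, signal)
```

With real plate-reader data, `signal` would hold the background-subtracted fluorescence at each DFHBI concentration. A gradient-based optimizer would normally replace the grid search; the grid is used here only to keep the sketch dependency-free.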

[0145] Next, we prepared an RNA titration assay using identical measurement, equilibration, and buffer conditions, except with the amount of DFHBI held constant at 400 nM and RNA concentrations ranging from 5 µM down to 5 nM prepared in a 1:2 dilution series. A background fluorescence was obtained at 400 nM DFHBI in the absence of RNA and subtracted from each well. The corrected signal was then least-squares fit with a custom MATLAB script to a 1:1 complexation model according to the following equation:

$$F = F_{\max} \left( \frac{[A] f + DT + K_d - \sqrt{([A] f + DT + K_d)^2 - 4 [A] f \, DT}}{2 \, DT} \right) \qquad \text{(eq. 6)}$$

Here, [A] is the concentration of aptamer, f is the folding efficiency, DT is the DFHBI concentration (400 nM), K_d is the dissociation constant calculated for each sequence above, and F_max is the maximum fluorescence signal at dye-binding saturation. Quantum yields were obtained through direct comparison of F_max with the literature value for Broccoli (QY=0.72).
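As an illustration of eq. 6, the following Python sketch evaluates the quadratic 1:1 binding expression over an RNA dilution series. The parameter values are hypothetical. The `relative_qy` helper encodes one possible reading of "direct comparison" with Broccoli, namely proportional scaling of F_max against the reference; that reading is an assumption and is not stated in the source.

```python
import math


def eq6(a, f, dt, kd, f_max):
    """Eq. 6: signal for total aptamer a, folded fraction f, total dye dt."""
    s = a * f + dt + kd
    # Fraction of dye bound, from the physical root of the 1:1 binding quadratic.
    bound_fraction = (s - math.sqrt(s * s - 4.0 * a * f * dt)) / (2.0 * dt)
    return f_max * bound_fraction


def relative_qy(f_max, f_max_ref, qy_ref=0.72):
    """ASSUMPTION: quantum yield scales linearly with F_max against a reference."""
    return qy_ref * f_max / f_max_ref


DT_UM = 0.4  # 400 nM DFHBI, expressed in uM
# 1:2 dilution series starting at 5 uM RNA (11 points, ending near 5 nM).
rna_um = [5.0 / 2 ** i for i in range(11)]
# Hypothetical aptamer: 60% folded, K_d = 0.5 uM, F_max = 1000 (arbitrary units).
curve = [eq6(a, f=0.6, dt=DT_UM, kd=0.5, f_max=1000.0) for a in rna_um]
```

Because K_d is fixed from the DFHBI titration, only f and F_max remain free in this second fit, which is what allows the folding efficiency to be separated from the affinity.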

[0146] Small molecules can be bound and sensed by artificially selected RNA aptamers. Unfortunately, these molecules often exhibit weakened binding affinities or instability in biological environments, and additional rounds of selection to improve aptamers typically give diminishing returns. (See Carothers, J. M., et al. (2006) Aptamers selected for higher-affinity binding are not more specific for the target ligand. J. Am. Chem. Soc. 128, 7929-7937; Paige, J. S., et al. (2011) RNA mimics of green fluorescent protein. Science 333, 642-646; and Ellington, A. D., and Szostak, J. W. (1990) In vitro selection of RNA molecules that bind specific ligands. Nature 346, 818-822; the disclosures of which are incorporated herein by reference in their entirety.)

[0147] Each TTR Spinach aptamer was prepared in 60 µL PBSMKT containing 1.66 µM total RNA, and 30 µL of this was added to 50 µL of 5 µM DFHBI in PBSMKT in two wells per aptamer. Next, 20 µL of PBSMKT was added to one well per aptamer to give a final concentration of 500 nM RNA and 2.5 µM DFHBI in order to provide a baseline fluorescence. Next, 20 µL of 100% frog egg lysate, prepared 4 hours earlier and stored at 4 °C, was added to the other well for each aptamer and pipet mixed. (Higher lysate concentrations were too optically absorbent to allow fluorescence measurements.) Fluorescence measurements were then obtained for every well every 1 minute for 30 minutes, then every 3 minutes for 1 hour, and then every 5 minutes for an additional hour. For evaluation of times to half-fluorescence, the fluorescence of each aptamer in wells containing lysate was normalized to the same aptamer's fluorescence in PBSMKT at every time point in order to account for photobleaching.
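The normalization and half-time evaluation described above can be sketched as follows. This is a minimal illustration, not the analysis actually used: it assumes that the first read is taken at t = 0, that "half-fluorescence" means half of the initial normalized signal, and that the crossing time is found by linear interpolation between adjacent reads. All names and the synthetic decay curves are hypothetical.

```python
import math


def schedule_minutes():
    """Read times: every 1 min for 30 min, every 3 min for 1 h, every 5 min for 1 h."""
    times = list(range(0, 31))          # 0..30 min
    times += list(range(33, 91, 3))     # 33..90 min
    times += list(range(95, 151, 5))    # 95..150 min
    return times


def normalize(lysate, buffer_only):
    """Divide the lysate-well signal by the matched buffer-well signal."""
    return [l / b for l, b in zip(lysate, buffer_only)]


def time_to_half(times, norm):
    """First time the normalized signal reaches half of its initial value."""
    half = 0.5 * norm[0]
    for i in range(1, len(times)):
        if norm[i] <= half:
            t0, t1 = times[i - 1], times[i]
            y0, y1 = norm[i - 1], norm[i]
            # Linear interpolation between the two bracketing reads.
            return t0 + (y0 - half) * (t1 - t0) / (y0 - y1)
    return None  # never dropped to half within the observation window


times = schedule_minutes()
# Synthetic data: slow photobleaching in buffer, plus degradation in lysate
# with a true half-time of 40 minutes.
buffer_only = [1000.0 * math.exp(-0.001 * t) for t in times]
lysate = [b * 2.0 ** (-t / 40.0) for b, t in zip(buffer_only, times)]
t_half = time_to_half(times, normalize(lysate, buffer_only))
```

Dividing by the buffer-only well removes the photobleaching term, which is common to both wells, so the recovered half-time reflects loss of signal attributable to the lysate.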

[0148] Each TTR Spinach aptamer was prepared in PBSMK (1× PBS pH 7.2, 5 mM MgCl2, 100 mM KCl) containing 1 µM RNA and 2.5 µM DFHBI. The RNA/DFHBI mixture was equilibrated on ice for 30 minutes before aliquoting 50 µL into 4 wells per RNA species. As control reactions, 50 µL of PBSMK containing 2.5 µM DFHBI was added to one of these wells per RNA. Immediately prior to use, PBSMLK (1× PBS pH 7.2, 5 mM MgCl2, 40% E. coli lysate, 100 mM KCl) containing 2.5 µM DFHBI was prepared and 50 µL of this mixture was added to each well to give final concentrations of 500 nM RNA, 2.5 µM DFHBI, and 20% E. coli lysate. Immediately upon addition of PBSMLK, fluorescence intensities were obtained for every well and repeated every 30 s for 8 hours using a Tecan M1000 plate reader.
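The final concentrations quoted above follow from simple volume-weighted mixing. The short Python sketch below checks that arithmetic; the `mix` helper and the key names are hypothetical and serve only as a worked example.

```python
def mix(components):
    """Volume-weighted final concentrations.

    components: list of (volume_uL, {species: concentration}) tuples.
    """
    total = sum(volume for volume, _ in components)
    final = {}
    for volume, conc in components:
        for name, c in conc.items():
            final[name] = final.get(name, 0.0) + c * volume / total
    return final


# 50 uL of RNA/DFHBI in PBSMK plus 50 uL of PBSMLK (40% lysate) with DFHBI.
well = mix([
    (50.0, {"RNA_uM": 1.0, "DFHBI_uM": 2.5}),
    (50.0, {"DFHBI_uM": 2.5, "lysate_pct": 40.0}),
])
```

Here `well` evaluates to 0.5 µM RNA (500 nM), 2.5 µM DFHBI, and 20% lysate, matching the stated final conditions. DFHBI is present at the same concentration in both halves, so its concentration is unchanged by the addition.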

[0149] To test the in vivo fluorescence of Spinach-TTR variants, designed sequences were cloned between a T7 promoter and T7 terminator in a plasmid harboring carbenicillin resistance and a ColE1 origin of replication. Plasmids were transformed into chemically competent E. coli strain BL21*(DE3) (F⁻ ompT hsdSB (rB⁻ mB⁻) gal dcm rne131 [DE3]), plated on Difco LB+Agar plates containing 100 µg/mL carbenicillin, and grown overnight at 37 °C. A cellular autofluorescence control containing a blank plasmid was also included. Individual colonies were grown overnight in LB containing 100 µg/mL carbenicillin, then diluted 1:50 into fresh LB. After 1 h, isopropyl-β-D-thiogalactoside (IPTG) was added at a final concentration of 100 µM to induce expression of T7 RNA polymerase. After 4.5 h of additional shaking, cells were diluted 1:200 into 1× Phosphate Buffered Saline (PBS) containing 2 mg/mL kanamycin and 200 µM (Z)-4-(3,5-Difluoro-4-hydroxybenzylidene)-1,2-dimethyl-1H-imidazol-5(4H)-one (DFHBI), then incubated at 37 °C for 5 minutes. A BD Accuri C6 Plus flow cytometer fitted with a high-throughput sampler was then used to measure fluorescence of at least 50,000 events for each sample. Measurements were taken for 4 biological replicates.

[0150] Flow cytometry data analysis was performed using FlowJo (v10.4.1). Cells were gated by FSC-A and SSC-A, and the same gate was used for all samples. The geometric mean fluorescence was calculated for each sample, then all fluorescence measurements were converted to Molecules of Equivalent Fluorescein (MEFL) using CS&T RUO Beads (BD). The average fluorescence (MEFL) of cells expressing blank plasmid (pJBL002) in the presence of DFHBI was then subtracted from each measured fluorescence value.
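The analysis above was carried out in FlowJo. As a hedged illustration of the same three steps (geometric mean, conversion to MEFL, blank subtraction), the following Python sketch may be useful. The log-log calibration form and its `slope` and `intercept` parameters are assumptions standing in for a bead-derived standard curve, and the event values are invented.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive per-event intensities."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def to_mefl(au, slope, intercept):
    """ASSUMED bead calibration: log10(MEFL) = slope * log10(au) + intercept."""
    return 10 ** (slope * math.log10(au) + intercept)


def corrected_mefl(sample_events, blank_events, slope=1.0, intercept=0.0):
    """Geometric mean -> MEFL -> subtract the blank-plasmid control."""
    sample = to_mefl(geometric_mean(sample_events), slope, intercept)
    blank = to_mefl(geometric_mean(blank_events), slope, intercept)
    return sample - blank


# Hypothetical gated events (arbitrary units) for one sample and the blank.
sample_events = [800.0, 1000.0, 1250.0]
blank_events = [90.0, 100.0, 110.0]
signal_mefl = corrected_mefl(sample_events, blank_events, slope=1.0, intercept=0.5)
```

The geometric mean is used rather than the arithmetic mean because per-cell fluorescence is typically distributed approximately log-normally, so a few very bright events would otherwise dominate the summary value.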

[0151] Results: In all, 697 designs were generated, and a subset was again chosen to maximize the number of motifs tested and the chain closure score (how close the designed RNA sequence is to overlaying its target base pair). Out of these designs, 16 `Spinach-TTR` molecules designed by an embodiment to embed the Spinach aptamer into scaffolds with tetraloop/receptor contacts were characterized (FIG. 8A). By carrying out fluorescence assays titrating both RNA and DFHBI concentration, these designs' dissociation constants, brightness, and folding efficiency were evaluated (FIGS. 8B-8C). Seven of the 16 Spinach-TTR designs exhibited 2-fold brighter fluorescence than the original Spinach as well as the brighter Broccoli aptamer (FIG. 8B). Two of these constructs, Spinach-TTR 3 and 8, were not only brighter but also gave higher affinity and improved folding efficiency relative to Broccoli and a minimized Spinach construct, Spinach-min (FIG. 8C).

[0152] Additionally, six of the seven Spinach-TTR constructs exhibited fluorescence longer than control Spinach and Broccoli sequences. Spinach-TTR 3 exhibited particularly high stability, giving a time to half fluorescence of 131 minutes, compared to <20 minutes for Spinach, Spinach-min, and Broccoli (FIG. 8D). This same robust fluorescence of the Spinach-TTRs was observed in 20% E. coli lysate, suggesting a general stabilization in biological environments (FIG. 15). Six Spinach-TTR designs were cloned into a plasmid for T7 RNA polymerase-driven expression. Each Spinach-TTR variant was able to significantly activate expression above background, and several designs exceeded the fluorescence observed for both Spinach and Broccoli in vivo (FIG. 16).

[0153] Conclusion: These results demonstrate that the TTR peripheral contact efficiently couples to enhance binding of DFHBI in the aptameric region, thus increasing fluorescence. As a further test, these aptameric designs were also shown to be more effective than other aptamers at increasing fluorescence, as well as more stable when challenged with cellular lysate, showing that embodiments herein are a vast improvement in the art at stabilizing and improving aptamer function.

EXAMPLE 6

Designing and Characterizing Novel RNAs Binding to Proteins

[0154] Background: Two well-studied RNA binding proteins, MS2 coat protein and PUF3, can be used as model systems for testing the design of RNA connections. MS2 coat protein specifically binds a 19-nucleotide RNA hairpin structure with nanomolar affinity. (See Carey, J., et al., Interaction of R17 coat protein with synthetic variants of its ribonucleic acid binding site. Biochemistry, 1983. 22(20): p. 4723-30; the disclosure of which is incorporated by reference herein in its entirety.) PUF3 binds an 8-nucleotide single-stranded RNA sequence with nanomolar affinity. (See Zhu, D. Y., et al., A 5' cytosine binding pocket in Puf3p specifies regulation of mitochondrial mRNAs. Proceedings of the National Academy of Sciences of the United States of America, 2009. 106(48): p. 20192-20197; the disclosure of which is incorporated by reference herein in its entirety.) Both systems have been extensively characterized and crystal structures of the complexes have been solved. (See e.g., Helgstrand, C., et al., Investigating the structural basis of purine specificity in the structures of MS2 coat protein RNA translational operator hairpins. Nucleic Acids Res, 2002. 30(12): p. 2678-85; the disclosure of which is incorporated by reference herein in its entirety.) Here, designing and testing a library of RNA structures addresses two main questions. First, if key binding residues are removed from the RNA targets, e.g., the tetraloop from the MS2 hairpin structure, how can the remaining RNA target structure, e.g., the MS2 helix, be built on to create new RNA structures that recover the wildtype binding affinity? Second, can the wildtype RNA structures, e.g., the full MS2 hairpin structure, be built on to create new RNA structures that bind to their target proteins with higher affinity?

[0155] Methods: To address these questions, an embodiment designs a library of sequences which systematically varies the RNA anchor structures. Two examples are shown in FIGS. 17A-17B, which show proteins 1702 binding native RNA residues 1704, which are connected to designed RNA structures 1706. The embodiment varies the number of anchor structures, the strength of the anchor structures (by keeping varying numbers of RNA residues that interact with the protein), and the sites of the anchors. For each set of RNA anchor structures, the embodiment designs several thousand distinct RNA connection structures. Within the RNA structures, the embodiment varies the predicted number of contacts with the protein, the length of the connections, and the extent to which they wrap around the protein. The embodiment assesses the success of these designs by measuring the binding affinities to their target proteins using a high throughput RNA array. (See e.g., Buenrostro, J. D., et al., Quantitative analysis of RNA-protein interactions on a massively parallel array reveals biophysical and evolutionary landscapes. Nat Biotechnol, 2014. 32(6): p. 562-8; the disclosure of which is incorporated herein by reference in its entirety.) Successful designs are characterized by high affinity binding to the target protein.

EXAMPLE 7

Developing Rules for Designing More Successful RNAs

[0156] Background: Predicting binding affinity increases the predictive capacity for embodiments to design successful RNAs for binding proteins. In particular, some embodiments identify predictive features of successful designs with the goal of increasing the percentage of successful designs in the future. Binding affinity is defined as the free energy difference between the complex and the unbound components.

[0157] Methods: An embodiment approximates the free energy of the bound complex as a linear combination of various features such as the number of protein/RNA contacts, the extent to which the RNA wraps around the protein, the predicted free energy of the bound RNA secondary structure, and the number and strength of anchor structures. The unbound free energy of the protein is neglected for simplicity, and the unbound free energy of the RNA is estimated as the free energy of all possible secondary structures, i.e., as computed with Vienna. Weights are fit for each of these terms using a simple linear regression to a training subset. The correlation coefficient and the AUC of the resulting model are used to assess its utility.
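A minimal sketch of such a regression is given below, using ordinary least squares solved through the normal equations in plain Python. The feature ordering, the inclusion of an intercept term, the synthetic training values, and the choice of regression target are all assumptions made for illustration. In particular, the target `y` is taken to be the measured binding free energy plus the estimated unbound RNA free energy, which is one reading of the description and not a confirmed detail.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            k = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= k * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_weights(features, y):
    """Ordinary least squares; the last returned weight is the intercept."""
    rows = [list(f) + [1.0] for f in features]
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(n)]
    return solve(xtx, xty)


def predict(weights, f):
    """Linear combination of features plus intercept."""
    return sum(w * x for w, x in zip(weights, list(f) + [1.0]))


def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (t - my) for x, t in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((t - my) ** 2 for t in ys))
    return cov / (sx * sy)


# Hypothetical training subset. Columns: protein/RNA contacts, wrap fraction,
# bound secondary-structure free energy (kcal/mol), number of anchors.
designs = [
    [10, 0.2, -12.0, 1], [14, 0.5, -15.0, 2], [8, 0.1, -9.0, 1],
    [20, 0.8, -20.0, 3], [12, 0.4, -11.0, 2], [16, 0.3, -18.0, 1],
    [6, 0.6, -8.0, 2], [18, 0.7, -14.0, 3],
]
true_w = [-0.3, -2.0, 0.5, -1.0, 1.0]  # invented weights used to make the data
y = [predict(true_w, d) for d in designs]
weights = fit_weights(designs, y)
r = pearson([predict(weights, d) for d in designs], y)
```

On real data the correlation would be computed on held-out designs, not on the training subset, since a model with as many terms as this one can fit a small training set well without being predictive.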

[0158] In silico binding affinity prediction is a very difficult problem: previous work showed that even predicting the relative protein binding affinities of small, closely related RNA sequences is challenging and at best yields results accurate to within 1.5 kcal/mol. Because predicting absolute binding affinity is even more challenging, it is possible that the model described above is not predictive. If that is the case, an embodiment focuses on identifying features that increase the likelihood of a successful design, e.g., designs that detectably bind to the target protein. Again, these features are identified from a training subset of the binding affinity data. As an example, an embodiment may identify that designs that have more protein/RNA contacts are more likely to be successful.
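One way to screen individual features for their association with success is to compute, for each feature, the area under the ROC curve (AUC) against a binary "detectably binds" label. The sketch below does this by direct pairwise comparison. The feature names and values are invented for illustration, and ranking features by single-feature AUC is an assumed procedure, not one specified in the source.

```python
def auc(scores, labels):
    """Probability that a random binder outscores a random non-binder.

    Ties contribute 0.5. labels are truthy for designs that detectably bind.
    """
    pos = [s for s, lab in zip(scores, labels) if lab]
    neg = [s for s, lab in zip(scores, labels) if not lab]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(table, labels):
    """Return (AUC, feature name) pairs sorted from most to least separating."""
    return sorted(((auc(v, labels), k) for k, v in table.items()), reverse=True)


# Hypothetical training subset of six designs.
binds = [0, 1, 0, 1, 0, 1]
table = {
    "n_contacts": [5, 12, 7, 15, 9, 14],
    "connection_length": [30, 45, 50, 35, 40, 55],
}
ranked = rank_features(table, binds)
```

An AUC near 0.5 indicates a feature that does not separate binders from non-binders, while values near 1.0 (or near 0.0, for features that should be minimized) indicate a feature worth encoding in a scoring function.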

[0159] Once the binding affinity model or the predictive features have been established, an embodiment implements a new scoring function to encourage solutions that are predicted to be more successful. The embodiment then designs and tests a new library of RNA structures for MS2 and PUF3, in the same manner as described in Example 1.

EXAMPLE 8

Verifying Structures from a Subset of Designs

[0160] Background: A need exists to assess designs to both measure binding affinity and to examine the structure of the complex. An embodiment verifies this assumption for a small subset of designs deemed successful in other embodiments.

[0161] Methods: The RNA/protein structures are examined by performing one-dimensional SHAPE chemical mapping on the bound complexes. A SHAPE profile consistent with the secondary structure of the design is expected, with reduced reactivity in regions predicted to be bound to the protein. Additionally, for a small subset of design failures, SHAPE chemical mapping in the presence and absence of the protein is performed. By identifying ways in which the designs are failing, design algorithms may be improved.

EXAMPLE 9

Testing Libraries of RNA Aptamers

[0162] Background: Once designed and constructed, aptamer embodiments can be tested for their efficacy in binding the particular proteins they were designed to bind.

[0163] Methods: The aptamers are designed by first identifying several possible RNA anchor structures/sequences using methods such as those described herein. Then, for each of these sets of anchor structures, many different connecting RNA structures are designed. Additionally, each of the libraries contains a subset of sequences with specific randomized portions, for a total of approximately 10^15 sequences in each library. The benchmark set of proteins contains proteins that range in size and for which previous selection attempts have been both successful and unsuccessful. Table 4 lists an initial set of five possible proteins for the benchmark set. Selections are performed for each of these proteins with the designed libraries. This initial benchmark set helps to identify the optimal way in which to incorporate randomized regions into the designed sequences. The success is assessed by the binding affinities of the selected aptamers.

TABLE 4
Benchmark proteins

Protein       | Size (No. of amino acids) | Previous selection yielded aptamers? | Protein PDB ID | Aptamer/protein complex PDB ID
Thrombin      | 288                       | Yes                                  | 5AFY           | 3DD2
Human IgG1    | 211                       | Yes                                  | 4W4N           | 3AGV
MAPK8 (JNK1)  | 371                       | No                                   | 2XRW           | --
MEK1          | 393                       | --                                   | 1S9J           | --
MEK2          | 400                       | --                                   | 1S9I           | --

EXAMPLE 10

Investigating Structures of Successful Aptamers

[0164] Background: If or when successful aptamers are identified, the structures of these aptamers can be examined to identify the specific features that contribute to the success.

[0165] Methods: First, the structures of the RNA are verified by performing one-dimensional SHAPE chemical mapping. By examining the SHAPE profile in the presence and absence of the protein, the regions of the RNA that are likely to be interacting with the protein are identified. In addition to the chemical mapping experiments, experiments are performed to verify that the RNA binds to the protein at the predicted location on its surface. To do this, successful designs that were predicted to leave functional sites accessible are assessed. For these aptamer embodiments, the binding affinity of ligands known to bind to the functional site is assessed after incubating the protein with the RNA aptamer. If the binding affinity of the ligand remains the same when the protein is bound to the RNA aptamer, this would suggest that the functional site is indeed accessible. For example, there are several ligands known to bind to the different binding pockets on thrombin. Aptamers can be designed that should specifically leave one of these binding sites accessible. Thrombin is then incubated with one of the successful aptamers, and the binding affinity of one of the known ligands to the thrombin/aptamer complex is measured.

EXAMPLE 11

Redesigning Aptamers to Increase Affinity

[0166] Background: When selection experiments fail, they generally still yield many low-quality aptamers. This often means aptamers that have high nanomolar or low micromolar affinity to the target protein. Currently, there is no simple strategy for optimizing these aptamers to bind with higher affinity.

[0167] Methods: First, the structure of the RNA aptamer bound to the target protein will be predicted. Using many (~100) of the structures that score best, RNA extensions that should wrap around the protein will be designed. A small library of these designs will then be tested experimentally. It is expected that some of these designs will bind to the target protein with higher affinity than the original aptamer.

EXAMPLE 12

Implementing Sampling Schemes for RNA Fragment Assembly

[0168] Background: Certain embodiments will seek to predict a structure of an RNA/protein complex based on RNA sequence and protein structure.

[0169] Methods: An embodiment will extend the fragment assembly algorithm for RNA structure prediction within Rosetta. This method builds de novo RNA structures by sampling torsion angles from fragments of RNA structures from the PDB in a Monte Carlo simulation. Protein binding will be incorporated using two different strategies: 1) fold the RNA in the presence of the protein, and 2) fold the RNA without the protein and then dock it onto the protein surface and remodel interface residues. Both of these initial strategies will use a coarse-grained representation of the protein and RNA residues.

[0170] The first strategy, folding the RNA in the presence of the protein, will involve both fragment insertion and docking moves. Initially, we will implement a strategy similar to that described previously for the simultaneous folding and docking of symmetric protein complexes, in which every tenth move will be a docking attempt. (See Das, R., et al., Simultaneous prediction of protein folding and docking at high resolution. Proceedings of the National Academy of Sciences of the United States of America, 2009. 106(45): p. 18978-18983; the disclosure of which is incorporated by reference herein in its entirety.) Each move will be scored using the potential described herein.
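The move schedule of this first strategy can be illustrated with a toy Metropolis Monte Carlo loop in which every tenth move is a docking attempt and all other moves are fragment insertions. The sketch below is not the Rosetta implementation: the score function is a placeholder for the RNA/protein potential, fragments are drawn at random instead of from PDB-derived fragment libraries, the rigid-body state is reduced to a single offset, and all names and parameters are hypothetical.

```python
import math
import random


def toy_score(torsions, offset):
    """Placeholder energy standing in for the RNA/protein potential."""
    return sum((t - 1.0) ** 2 for t in torsions) + (offset - 2.0) ** 2


def fold_and_dock(n_res=12, n_moves=5000, kt=0.5, dock_every=10, seed=0):
    """Interleave fragment insertions with a docking attempt every tenth move."""
    rng = random.Random(seed)
    torsions = [rng.uniform(-3.0, 3.0) for _ in range(n_res)]
    offset = 10.0  # one-dimensional stand-in for the rigid-body placement
    score = toy_score(torsions, offset)
    best = score
    counts = {"fragment": 0, "dock": 0}
    for move in range(1, n_moves + 1):
        new_t, new_o = torsions[:], offset
        if move % dock_every == 0:
            kind = "dock"
            new_o = offset + rng.gauss(0.0, 0.5)   # rigid-body perturbation
        else:
            kind = "fragment"
            start = rng.randrange(n_res - 2)
            for i in range(start, start + 3):      # replace a 3-residue window
                new_t[i] = rng.uniform(-3.0, 3.0)
        counts[kind] += 1
        new_score = toy_score(new_t, new_o)
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if new_score <= score or rng.random() < math.exp((score - new_score) / kt):
            torsions, offset, score = new_t, new_o, new_score
            best = min(best, score)
    return best, score, counts


best, final, counts = fold_and_dock()
```

Interleaving the two move types lets the RNA conformation and its placement on the protein be optimized together, which is the point of folding in the presence of the protein as opposed to docking a pre-folded structure.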

[0171] The novel aspect of the second strategy is essentially the flexible docking algorithm. Initially, the RNA structure will be built with the fragment assembly method. Because the protein will not be present at this stage, structures will be evaluated with the RNA-only potential. The resulting RNA structures will then be docked against the protein and interface residues will be resampled with fragment insertion moves. At this stage, structures will be scored with the RNA/protein potential described herein.

[0172] Finally, coarse-grained structures resulting from either of these two strategies will be converted into full-atom representation. The structures will be refined by sampling side chain rotamers in a Monte Carlo simulation and then performing energy minimization using the high-resolution RNA/protein potential described herein.

[0173] These methods will be tested on a benchmark set of RNA/protein complexes with known structures. Varying amounts of input information will be provided for each complex, ranging from just the protein structure and the RNA sequence, to the protein structure with one or more "anchor" RNA residues bound, to the protein structure and parts of the RNA structure. The results over this range of input information will help to evaluate the reliability of this method in various practical situations.

Doctrine of Equivalents

[0174] Having described several embodiments, it will be recognized by those skilled in the art that various modifications, alternative constructions, and equivalents may be used without departing from the spirit of the invention. Additionally, a number of well-known processes and elements have not been described in order to avoid unnecessarily obscuring the present invention. Accordingly, the above description should not be taken as limiting the scope of the invention.

[0175] Those skilled in the art will appreciate that the foregoing examples and descriptions of various preferred embodiments of the present invention are merely illustrative of the invention as a whole, and that variations in the components or steps of the present invention may be made within the spirit and scope of the invention. Accordingly, the present invention is not limited to the specific embodiments described herein, but, rather, is defined by the scope of the appended claims.

Sequence CWU 1

1

551148DNAArtificial SequenceDesigned sequence in accordance with embodiments 1ggaacagctc gagtagagct gaaagttgat atggatagag taagagagat ggaagtctca 60ggggaaactt tgagatggac ggtttacaag ttgtcctaag tcaacaaacg catcgagtag 120atgcgaacaa agaaacaaca acaacaac 1482164DNAArtificial SequenceDesigned sequence in accordance with embodiments 2ggaacagctc gagtagagct gaaagttgat atggataata cgtcaagctt caccgaagaa 60caaatcaggg gaaactttga tttgggaggt gaagaactac ttgacgttgt cctaagtcaa 120caaacgcatc gagtagatgc gaacaaagaa acaacaacaa caac 1643145DNAArtificial SequenceDesigned Sequence 3ggaacagctc gagtagagct gaaagttgat atggatgatt aggacatgca ttgctgaggg 60gaaacttttt gcaatgcaac agccaaatcg tcctaagtca acaaacgcat cgagtagatg 120cgaacaaaga aacaacaaca acaac 1454145DNAArtificial SequenceDesigned sequence in accordance with embodiments 4ggaacagctc gagtagagct gaaagttgat atggatacct aggacatgcc aatctgtggg 60gaaacttatt gattggcaac agccaaggtg tcctaagtca acaaacgcat cgagtagatg 120cgaacaaaga aacaacaaca acaac 1455145DNAArtificial SequenceDesigned sequence in accordance with embodiments 5ggaacagctc gagtagagct gaaagttgat atggattagc aaggacatgc agagcaaggg 60ggaaacttca cctctgcaac agccacctag tcctaagtca acaaacgcat cgagtagatg 120cgaacaaaga aacaacaaca acaac 1456161DNAArtificial SequenceDesigned sequence in accordance with embodiments 6ggaacagctc gagtagagct gaaagttgat atggatttac tccgaggaga cgaactacca 60cgaacagggg aaactctacc cgtggcgtct ccgtttgacg agtaagtcct aagtcaacaa 120acgcatcgag tagatgcgaa caaagaaaca acaacaacaa c 1617169DNAArtificial SequenceDesigned sequence in accordance with embodiments 7ggaacagctc gagtagagct gaaagttgat atgggcaaga attgtgccag actttgaact 60actgcgtctc aggggaaact ttgagatgca gcaaagtcgg taatacaatt cgacccctaa 120gtcaacaaac gcatcgagta gatgcgaaca aagaaacaac aacaacaac 1698162DNAArtificial SequenceDesigned sequence in accordance with embodiments 8ggaacagctc gagtagagct gaaagttgat atggtgaccg caaggatgga agaccaatac 60tatctcaggg gaaactttga gatagtatag gttggacctt gccagtaacc taagtcaaca 120aacgcatcga gtagatgcga acaaagaaac 
aacaacaaca ac 1629165DNAArtificial SequenceDesigned sequence in accordance with embodiments 9ggaacagctc gagtagagct gaaagttgat atggaaaccg agcccgagga tatgcttgaa 60aaactcaggg gaaactttga gttttggcgc atatccgttt gacgggagtt tcctaagtca 120acaaacgcat cgagtagatg cgaacaaaga aacaacaaca acaac 16510166DNAArtificial SequenceDesigned sequence in accordance with embodiments 10ggaacagctc gagtagagct gaaagttgat atggatttga aaccggatgg aaggtgagag 60caacgactgc aggggaaact ttgcagtcgg ctcactggac cgactgacaa atcctaagtc 120aacaaacgca tcgagtagat gcgaacaaag aaacaacaac aacaac 16611150DNAArtificial SequenceDesigned sequence in accordance with embodiments 11ggaacagctc gagtagagct gaaagttgat atggacattc aagttgtgga cgacacatgg 60gggaaacttc atgtagtcga tggaagcaga gaatgtccta agtcaacaaa cgcatcgagt 120agatgcgaac aaagaaacaa caacaacaac 15012142DNAArtificial SequenceDesigned sequence in accordance with embodiments 12ggaacagctc gagtagagct gaaagttgat atggatgtgg tatgccaaca gccatagctg 60ggaaactagc atggacatgg caccacatcc taagtcaaca aacgcatcga gtagatgcga 120acaaagaaac aacaacaaca ac 14213143DNAArtificial SequenceDesigned sequence in accordance with embodiments 13ggaacagctc gagtagagct gaaagttgat atggtggata gtgacatgaa ttctcagggg 60aaactttgag aattcaacag cacaagaagc ctaagtcaac aaacgcatcg agtagatgcg 120aacaaagaaa caacaacaac aac 14314168DNAArtificial SequenceDesigned sequence in accordance with embodiments 14ggaacagctc gagtagagct gaaagttgat atggttaaca cccgatgatg gaaggtagga 60gcaacgttgg caggggaaac tttgccaacg gcctactgga catcggcaag ttaacctaag 120tcaacaaacg catcgagtag atgcgaacaa agaaacaaca acaacaac 16815142DNAArtificial SequenceDesigned sequence in accordance with embodiments 15ggaacagctc gagtagagct gaaagttgat atggaacctg tatggagtaa ccaatgggga 60aacttattgg cccatggaag tattggttcc taagtcaaca aacgcatcga gtagatgcga 120acaaagaaac aacaacaaca ac 14216143DNAArtificial SequenceDesigned sequence in accordance with embodiments 16ggaacagctc gagtagagct gaaagttgat atgggagcta ggacatggga ctttagggaa 60acttaaagta tcccaacagc ctagccgagc ctaagtcaac 
aaacgcatcg agtagatgcg 120aacaaagaaa caacaacaac aac 143177539DNAArtificial SequenceDesigned sequence in accordance with embodiments 17gcggccgcga tctctcacct accaaacaat gcccccctgc aaaaaataaa ttcatataaa 60aaacatacag ataaccatct gcggtgataa attatctctg gcggtgttga cataaatacc 120actggcggtg atactgagca cgggtaccgg ccgctgagaa aaagcgaagc ggcactgctc 180tttaacaatt tatcagacaa tctgtgtggg cactcgaaga tacggattct taacgtcgca 240agacgaaaaa tgaataccaa gtctcaagag tgaacacgta attcattacg aagtttaatt 300ctttgagcgt caaactttta aattgaagag tttgatcatg gctcagattg aacgctggcg 360gcaggcctaa cacatgcaag tcgaacggta acaggaagaa gcttgcttct ttgctgacga 420gtggcggacg ggtgagtaat gtctgggaaa ctgcctgatg gagggggata actactggaa 480acggtagcta ataccgcata acgtcgcaag accaaagagg gggaccttcg ggcctcttgc 540catcggatgt gcccagatgg gattagctag taggtggggt aacggctcac ctaggcgacg 600atccctagct ggtctgagag gatgaccagc cacactggaa ctgagacacg gtccagactc 660ctacgggagg cagcagtggg gaatattgca caatgggcgc aagcctgatg cagccatgcc 720gcgtgtatga agaaggcctt cgggttgtaa agtactttca gcggggagga agggagtaaa 780gttaatacct ttgctcattg acgttacccg cagaagaagc accggctaac tccgtgccag 840cagccgcggt aatacggagg gtgcaagcgt taatcggaat tactgggcgt aaagcgcacg 900caggcggttt gttaagtcag atgtgaaatc cccgggctca acctgggaac tgcatctgat 960actggcaagc ttgagtctcg tagagggggg tagaattcca ggtgtagcgg tgaaatgcgt 1020agagatctgg aggaataccg gtggcgaagg cggccccctg gacgaagact gacgctcagg 1080tgcgaaagcg tggggagcaa acaggattag ataccctggt agtccacgcc gtaaacgatg 1140tcgacttgga ggttgtgccc ttgaggcgtg gcttccggag ctaacgcgtt aagtcgaccg 1200cctggggagt acggccgcaa ggttaaaact caaatgaatt gacgggggcc cgcacaagcg 1260gtggagcatg tggtttaatt cgatgcaacg cgaagaacct tacctggtct tgacatccac 1320ggaagttttc agagatgaga atgtgccttc gggaaccgtg agacaggtgc tgcatggctg 1380tcgtcagctc gtgttgtgaa atgttgggtt aagtcccgca acgagcgcaa cccttatcct 1440ttgttgccag cggtccggcc gggaactcaa aggagactgc cagtgataaa ctggaggaag 1500gtggggatga cgtcaagtca tcatggccct tacgaccagg gctacacacg tgctacaatg 1560gcgcatacaa agagaagcga cctcgcgaga gcaagcggac ctcataaagt 
gcgtcgtagt 1620ccggattgga gtctgcaact cgactccatg aagtcggaat cgctagtaat cgtggatcag 1680aatgccacgg tgaatacgtt cccgggcctt gtacacaccg cccgtcacac catgggagtg 1740ggttgcaaaa gaagtaggta gctgagcaac aggtccgtgc cgaggatttc gatctaagac 1800agtatggggc cccgttgagc taaccggtac taatgaaccg tgaggcttaa ccgagaggtt 1860aagcgactaa gcgtacacgg tggatgccct ggcagtcaga ggcgatgaag gacgtgctaa 1920tctgcgataa gcgtcggtaa ggtgatatga accgttataa ccggcgattt ccgaatgggg 1980aaacccagtg tgtttcgaca cactatcatt aactgaatcc ataggttaat gaggcgaacc 2040gggggaactg aaacatctaa gtaccccgag gaaaagaaat caaccgagat tcccccagta 2100gcggcgagcg aacggggagc agcccagagc ctgaatcagt gtgtgtgtta gtggaagcgt 2160ctggaaaggc gcgcgataca gggtgacagc cccgtacaca aaaatgcaca tgctgtgagc 2220tcgatgagta gggcgggaca cgtggtatcc tgtctgaata tggggggacc atcctccaag 2280gctaaatact cctgactgac cgatagtgaa ccagtaccgt gagggaaagg cgaaaagaac 2340cccggcgagg ggagtgaaaa agaacctgaa accgtgtacg tacaagcagt gggagcacgc 2400ttaggcgtgt gactgcgtac cttttgtata atgggtcagc gacttatatt ctgtagcaag 2460gttaaccgaa taggggagcc gaagggaaac cgagtcttaa ctgggcgtta agttgcaggg 2520tatagacccg aaacccggtg atctagccat gggcaggttg aaggttgggt aacactaact 2580ggaggaccga accgactaat gttgaaaaat tagcggatga cttgtggctg ggggtgaaag 2640gccaatcaaa ccgggagata gctggttctc cccgaaagct atttaggtag cgcctcgtga 2700attcatctcc gggggtagag cactgtttcg gcaagggggt catcccgact taccaacccg 2760atgcaaactg cgaataccgg agaatgttat cacgggagac acacggcggg tgctaacgtc 2820cgtcgtgaag agggaaacaa cccagaccgc cagctaaggt cccaaagtca tggttaagtg 2880ggaaacgatg tgggaaggcc cagacagcca ggatgttggc ttagaagcag ccatcattta 2940aagaaagcgt aatagctcac tggtcgagtc ggcctgcgcg gaagatgtaa cggggctaaa 3000ccatgcaccg aagctgcggc agcgacgctt atgcgttgtt gggtagggga gcgttctgta 3060agcctgcgaa ggtgtgctgt gaggcatgct ggaggtatca gaagtgcgaa tgctgacata 3120agtaacgata aagcgggtga aaagcccgct cgccggaaga ccaagggttc ctgtccaacg 3180ttaatcgggg cagggtgagt cgacccctaa ggcgaggccg aaaggcgtag tcgatgggaa 3240acaggttaat attcctgtac ttggtgttac tgcgaagggg ggacggagaa ggctatgttg 3300gccgggcgac ggttgtcccg 
gtttaagcgt gtaggctggt tttccaggca aatccggaaa 3360atcaaggctg aggcgtgatg acgaggcact acggtgctga agcaacaaat gccctgcttc 3420caggaaaagc ctctaagcat caggtaacat caaatcgtac cccaaaccga cacaggtggt 3480caggtagaga ataccaaggc gcttgagaga actcgggtga aggaactagg caaaatggtg 3540ccgtaacttc gggagaaggc acgctgatat gtaggtgagg tccctcgcgg atggagctga 3600aatcagtcga agataccagc tggctgcaac tgtttattaa aaacacagca ctgtgcaaac 3660acgaaagtgg acgtatacgg tgtgacgcct gcccggtgcc ggaaggttaa ttgatggggt 3720tagcgcaagc gaagctcttg atcgaagccc cggtaaacgg cggccgtaac tataacggtc 3780ctaaggtagc gaaattcctt gtcgggtaag ttccgacctg cacgaatggc gtaatgatgg 3840ccaggctgtc tccacccgag actcagtgaa attgaactcg ctgtgaagat gcagtgtacc 3900cgcggcaaga cggaaagacc ccgtgaacct ttactatagc ttgacactga acattgagcc 3960ttgatgtgta ggataggtgg gaggctttga agtgtggacg ccagtctgca tggagccgac 4020cttgaaatac caccctttaa tgtttgatgt tctaacgttg acccgtaatc cgggttgcgg 4080acagtgtctg gtgggtagtt tgactggggc ggtctcctcc taaagagtaa cggaggagca 4140cgaaggttgg ctaatcctgg tcggacatca ggaggttagt gcaatggcat aagccagctt 4200gactgcgagc gtgacggcgc gagcaggtgc gaaagcaggt catagtgatc cggtggttct 4260gaatggaagg gccatcgctc aacggataaa aggtactccg gggataacag gctgataccg 4320cccaagagtt catatcgacg gcggtgtttg gcacctcgat gtcggctcat cacatcctgg 4380ggctgaagta ggtcccaagg gtatggctgt tcgccattta aagtggtacg cgagctgggt 4440ttagaacgtc gtgagacagt tcggtcccta tctgccgtgg gcgctggaga actgaggggg 4500gctgctccta gtacgagagg accggagtgg acgcatcact ggtgttcggg ttgtcatgcc 4560aatggcactg cccggtagct aaatgcggaa gagataagtg ctgaaagcat ctaagcacga 4620aacttgcccc gagatgagtt ctccctgacc ctttaagggt cctgaaggaa cgttgaagac 4680gacgacgttg ataggccggg tgtgtaagcg gggccccata caatgacgga tcgaaatccg 4740tttgacgcac ggcctggcgg cgcttaccac tttgtgattc atgactgggg tgaagtcgta 4800acaaggtaac cgtaggggaa cctgcggttg gatcacctcc ttaccttaaa gaagcgtact 4860ttgtagtgct cacacagatt gtctgataga aagtgaaaag caaggcgttt acgcgttggg 4920agtgaggctg aagagaataa ggccgttcgc tttctattaa tgaaagctca ccctacacga 4980aaatatcacg caacgcgtga taagcaattt tcgtgtcccc ttcgtctaga 
ggcccaggac 5040accgcccttt cacggcggta acaggggttc gaatccccta ggggacgcca cttgctggtt 5100tgtgagtgaa agtcgccgac cttaatatct caaaactcat cttcgggtga tgtttgagat 5160atttgctctt taaaaatctg gatcaagctg aaaattgaaa cactgaacaa cgagagttgt 5220tcgtgagtct ctcaaatttt cgcaacacga tgatgaatcg aaagaaacat cttcgggttg 5280tgagcttaag cttacaacgc cgaagctgtt ttggcggatg agagaagatt ttcagcctga 5340tacagattaa atcagaacgc agaagcggtc tgataaaaca gaatttgcct ggcggcagta 5400gcgcggtggt cccacctgac cccatgccga actcagaagt gaaacgccgt agcgccgatg 5460gtagtgtggg gtctccccat gcgagagtag ggaactgcca ggcatcaaat aaaacgaaag 5520gctcagtcga aagactgggc ctttcgtttt atctgttgtt tgtcggtgaa cgctctcctg 5580agtaggacaa atccgccggg agcggatttg aacgttgcga agcaacggcc cggagggtgg 5640cgggcaggac gcccgccata aactgccagg catcaaatta agcagaaggc catcctgacg 5700gatggccttt ttgcgtttct acaaactctt cctgtcgtca tatctacaag ccggcgcgcc 5760gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa atatgtatcc 5820gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga agagtatgag 5880tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc ttcctgtttt 5940tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg gtgcacgagt 6000gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc gccccgaaga 6060acgttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat tatcccgtgt 6120tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg acttggttga 6180gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag aattatgcag 6240tgctgcaata accatgagtg ataacactgc ggccaactta cttctgacaa cgatcggagg 6300accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc gccttgatcg 6360ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca cgatgcctgc 6420agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc tagcttcccg 6480gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc tgcgctcggc 6540ccttccggct agctggttta ttgctgataa atctggagcc ggtgagcgtg ggtctcgcgg 6600tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta tctacacgac 6660ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag gtgcctcact 6720gattaagcat tggtaactgc 
agaccaagtt tactcatata tactttagat tgatttaaaa 6780cttcattttt aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa 6840atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga 6900tcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg 6960ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact 7020ggcttcagca gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac 7080cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg 7140gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg 7200gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga 7260acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc 7320gaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg 7380agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc 7440tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc 7500agcaacgcgg cctttttacg gttcctggcc ttttgctgg 7539187559DNAArtificial SequenceDesigned sequence in accordance with embodiments 18gcggccgcga tctctcacct accaaacaat gcccccctgc aaaaaataaa ttcatataaa 60aaacatacag ataaccatct gcggtgataa attatctctg gcggtgttga cataaatacc 120actggcggtg atactgagca cgggtaccgg ccgctgagaa aaagcgaagc ggcactgctc 180tttaacaatt tatcagacaa tctgtgtggg cactcgaaga tacggattct taacgtcgca 240agacgaaaaa tgaataccaa gtctcaagag tgaacacgta attcattacg aagtttaatt 300ctttgagcgt caaactttta aattgaagag tttgatcatg gctcagattg aacgctggcg 360gcaggcctaa cacatgcaag tcgaacggta acaggaagaa gcttgcttct ttgctgacga 420gtggcggacg ggtgagtaat gtctgggaaa ctgcctgatg gagggggata actactggaa 480acggtagcta ataccgcata acgtcgcaag accaaagagg gggaccttcg ggcctcttgc 540catcggatgt gcccagatgg gattagctag taggtggggt aacggctcac ctaggcgacg 600atccctagct ggtctgagag gatgaccagc cacactggaa ctgagacacg gtccagactc 660ctacgggagg cagcagtggg gaatattgca caatgggcgc aagcctgatg cagccatgcc 720gcgtgtatga agaaggcctt cgggttgtaa agtactttca gcggggagga agggagtaaa 780gttaatacct ttgctcattg acgttacccg cagaagaagc accggctaac tccgtgccag 840cagccgcggt 
aatacggagg gtgcaagcgt taatcggaat tactgggcgt aaagcgcacg 900caggcggttt gttaagtcag atgtgaaatc cccgggctca acctgggaac tgcatctgat 960actggcaagc ttgagtctcg tagagggggg tagaattcca ggtgtagcgg tgaaatgcgt 1020agagatctgg aggaataccg gtggcgaagg cggccccctg gacgaagact gacgctcagg 1080tgcgaaagcg tggggagcaa acaggattag ataccctggt agtccacgcc gtaaacgatg 1140tcgacttgga ggttgtgccc ttgaggcgtg gcttccggag ctaacgcgtt aagtcgaccg 1200cctggggagt acggccgcaa ggttaaaact caaatgaatt gacgggggcc cgcacaagcg 1260gtggagcatg tggtttaatt cgatgcaacg cgaagaacct tacctggtct tgacatccac 1320ggaagttttc agagatgaga atgtgccttc gggaaccgtg agacaggtgc tgcatggctg 1380tcgtcagctc gtgttgtgaa atgttgggtt aagtcccgca acgagcgcaa cccttatcct 1440ttgttgccag cggtccggcc gggaactcaa aggagactgc cagtgataaa ctggaggaag 1500gtggggatga cgtcaagtca tcatggccct tacgaccagg gctacacacg tgctacaatg 1560gcgcatacaa agagaagcga cctcgcgaga gcaagcggac ctcataaagt gcgtcgtagt 1620ccggattgga gtctgcaact cgactccatg aagtcggaat cgctagtaat cgtggatcag 1680aatgccacgg tgaatacgtt cccgggcctt gtacacaccg cccgtcacac catgggagtg 1740ggttgcaaaa gaagtaggta gctacagggc cguguggcug ugaacgggau ccacggccgg 1800ggccgaggug gggggccccg ttgagctaac cggtactaat gaaccgtgag gcttaaccga 1860gaggttaagc gactaagcgt acacggtgga tgccctggca gtcagaggcg atgaaggacg 1920tgctaatctg cgataagcgt cggtaaggtg atatgaaccg ttataaccgg cgatttccga 1980atggggaaac ccagtgtgtt tcgacacact atcattaact gaatccatag gttaatgagg 2040cgaaccgggg gaactgaaac atctaagtac cccgaggaaa agaaatcaac cgagattccc 2100ccagtagcgg cgagcgaacg gggagcagcc cagagcctga atcagtgtgt gtgttagtgg 2160aagcgtctgg aaaggcgcgc gatacagggt gacagccccg tacacaaaaa tgcacatgct 2220gtgagctcga tgagtagggc gggacacgtg gtatcctgtc tgaatatggg gggaccatcc 2280tccaaggcta aatactcctg actgaccgat agtgaaccag taccgtgagg gaaaggcgaa 2340aagaaccccg gcgaggggag tgaaaaagaa cctgaaaccg tgtacgtaca agcagtggga 2400gcacgcttag gcgtgtgact gcgtaccttt tgtataatgg gtcagcgact tatattctgt 2460agcaaggtta accgaatagg ggagccgaag ggaaaccgag tcttaactgg gcgttaagtt 2520gcagggtata gacccgaaac ccggtgatct agccatgggc 
aggttgaagg ttgggtaaca 2580ctaactggag gaccgaaccg actaatgttg aaaaattagc ggatgacttg tggctggggg 2640tgaaaggcca atcaaaccgg gagatagctg gttctccccg aaagctattt aggtagcgcc 2700tcgtgaattc atctccgggg gtagagcact gtttcggcaa gggggtcatc ccgacttacc 2760aacccgatgc aaactgcgaa taccggagaa tgttatcacg ggagacacac ggcgggtgct 2820aacgtccgtc gtgaagaggg aaacaaccca gaccgccagc taaggtccca aagtcatggt 2880taagtgggaa acgatgtggg aaggcccaga cagccaggat gttggcttag aagcagccat 2940catttaaaga aagcgtaata gctcactggt cgagtcggcc tgcgcggaag atgtaacggg 3000gctaaaccat gcaccgaagc tgcggcagcg acgcttatgc gttgttgggt aggggagcgt 3060tctgtaagcc tgcgaaggtg tgctgtgagg catgctggag gtatcagaag tgcgaatgct 3120gacataagta acgataaagc gggtgaaaag cccgctcgcc ggaagaccaa gggttcctgt 3180ccaacgttaa tcggggcagg gtgagtcgac ccctaaggcg aggccgaaag gcgtagtcga 3240tgggaaacag gttaatattc ctgtacttgg tgttactgcg aaggggggac ggagaaggct 3300atgttggccg ggcgacggtt gtcccggttt aagcgtgtag gctggttttc caggcaaatc
3360cggaaaatca aggctgaggc gtgatgacga ggcactacgg tgctgaagca acaaatgccc 3420tgcttccagg aaaagcctct aagcatcagg taacatcaaa tcgtacccca aaccgacaca 3480ggtggtcagg tagagaatac caaggcgctt gagagaactc gggtgaagga actaggcaaa 3540atggtgccgt aacttcggga gaaggcacgc tgatatgtag gtgaggtccc tcgcggatgg 3600agctgaaatc agtcgaagat accagctggc tgcaactgtt tattaaaaac acagcactgt 3660gcaaacacga aagtggacgt atacggtgtg acgcctgccc ggtgccggaa ggttaattga 3720tggggttagc gcaagcgaag ctcttgatcg aagccccggt aaacggcggc cgtaactata 3780acggtcctaa ggtagcgaaa ttccttgtcg ggtaagttcc gacctgcacg aatggcgtaa 3840tgatggccag gctgtctcca cccgagactc agtgaaattg aactcgctgt gaagatgcag 3900tgtacccgcg gcaagacgga aagaccccgt gaacctttac tatagcttga cactgaacat 3960tgagccttga tgtgtaggat aggtgggagg ctttgaagtg tggacgccag tctgcatgga 4020gccgaccttg aaataccacc ctttaatgtt tgatgttcta acgttgaccc gtaatccggg 4080ttgcggacag tgtctggtgg gtagtttgac tggggcggtc tcctcctaaa gagtaacgga 4140ggagcacgaa ggttggctaa tcctggtcgg acatcaggag gttagtgcaa tggcataagc 4200cagcttgact gcgagcgtga cggcgcgagc aggtgcgaaa gcaggtcata gtgatccggt 4260ggttctgaat ggaagggcca tcgctcaacg gataaaaggt actccgggga taacaggctg 4320ataccgccca agagttcata tcgacggcgg tgtttggcac ctcgatgtcg gctcatcaca 4380tcctggggct gaagtaggtc ccaagggtat ggctgttcgc catttaaagt ggtacgcgag 4440ctgggtttag aacgtcgtga gacagttcgg tccctatctg ccgtgggcgc tggagaactg 4500aggggggctg ctcctagtac gagaggaccg gagtggacgc atcactggtg ttcgggttgt 4560catgccaatg gcactgcccg gtagctaaat gcggaagaga taagtgctga aagcatctaa 4620gcacgaaact tgccccgaga tgagttctcc ctgacccttt aagggtcctg aaggaacgtt 4680gaagacgacg acgttgatag gccgggtgtg taagcggggc cccccaccgu uugacgcccc 4740gaagccgugg aucccgggag ccacguggcu ggccugucgg cgcttaccac tttgtgattc 4800atgactgggg tgaagtcgta acaaggtaac cgtaggggaa cctgcggttg gatcacctcc 4860ttaccttaaa gaagcgtact ttgtagtgct cacacagatt gtctgataga aagtgaaaag 4920caaggcgttt acgcgttggg agtgaggctg aagagaataa ggccgttcgc tttctattaa 4980tgaaagctca ccctacacga aaatatcacg caacgcgtga taagcaattt tcgtgtcccc 5040ttcgtctaga ggcccaggac accgcccttt 
cacggcggta acaggggttc gaatccccta 5100ggggacgcca cttgctggtt tgtgagtgaa agtcgccgac cttaatatct caaaactcat 5160cttcgggtga tgtttgagat atttgctctt taaaaatctg gatcaagctg aaaattgaaa 5220cactgaacaa cgagagttgt tcgtgagtct ctcaaatttt cgcaacacga tgatgaatcg 5280aaagaaacat cttcgggttg tgagcttaag cttacaacgc cgaagctgtt ttggcggatg 5340agagaagatt ttcagcctga tacagattaa atcagaacgc agaagcggtc tgataaaaca 5400gaatttgcct ggcggcagta gcgcggtggt cccacctgac cccatgccga actcagaagt 5460gaaacgccgt agcgccgatg gtagtgtggg gtctccccat gcgagagtag ggaactgcca 5520ggcatcaaat aaaacgaaag gctcagtcga aagactgggc ctttcgtttt atctgttgtt 5580tgtcggtgaa cgctctcctg agtaggacaa atccgccggg agcggatttg aacgttgcga 5640agcaacggcc cggagggtgg cgggcaggac gcccgccata aactgccagg catcaaatta 5700agcagaaggc catcctgacg gatggccttt ttgcgtttct acaaactctt cctgtcgtca 5760tatctacaag ccggcgcgcc gggaaatgtg cgcggaaccc ctatttgttt atttttctaa 5820atacattcaa atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat 5880tgaaaaagga agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg 5940gcattttgcc ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa 6000gatcagttgg gtgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt 6060gagagttttc gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt 6120ggcgcggtat tatcccgtgt tgacgccggg caagagcaac tcggtcgccg catacactat 6180tctcagaatg acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg 6240acagtaagag aattatgcag tgctgcaata accatgagtg ataacactgc ggccaactta 6300cttctgacaa cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat 6360catgtaactc gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag 6420cgtgacacca cgatgcctgc agcaatggca acaacgttgc gcaaactatt aactggcgaa 6480ctacttactc tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca 6540ggaccacttc tgcgctcggc ccttccggct agctggttta ttgctgataa atctggagcc 6600ggtgagcgtg ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt 6660atcgtagtta tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc 6720gctgagatag gtgcctcact gattaagcat tggtaactgc agaccaagtt tactcatata 
6780tactttagat tgatttaaaa cttcattttt aatttaaaag gatctaggtg aagatccttt 6840ttgataatct catgaccaaa atcccttaac gtgagttttc gttccactga gcgtcagacc 6900ccgtagaaaa gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 6960tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa 7020ctctttttcc gaaggtaact ggcttcagca gagcgcagat accaaatact gtccttctag 7080tgtagccgta gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc 7140tgctaatcct gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg 7200actcaagacg atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 7260cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat 7320gagaaagcgc cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg 7380tcggaacagg agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc 7440ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc 7500ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctgg 7559197555DNAArtificial SequenceDesigned sequence in accordance with embodiments 19gcggccgcga tctctcacct accaaacaat gcccccctgc aaaaaataaa ttcatataaa 60aaacatacag ataaccatct gcggtgataa attatctctg gcggtgttga cataaatacc 120actggcggtg atactgagca cgggtaccgg ccgctgagaa aaagcgaagc ggcactgctc 180tttaacaatt tatcagacaa tctgtgtggg cactcgaaga tacggattct taacgtcgca 240agacgaaaaa tgaataccaa gtctcaagag tgaacacgta attcattacg aagtttaatt 300ctttgagcgt caaactttta aattgaagag tttgatcatg gctcagattg aacgctggcg 360gcaggcctaa cacatgcaag tcgaacggta acaggaagaa gcttgcttct ttgctgacga 420gtggcggacg ggtgagtaat gtctgggaaa ctgcctgatg gagggggata actactggaa 480acggtagcta ataccgcata acgtcgcaag accaaagagg gggaccttcg ggcctcttgc 540catcggatgt gcccagatgg gattagctag taggtggggt aacggctcac ctaggcgacg 600atccctagct ggtctgagag gatgaccagc cacactggaa ctgagacacg gtccagactc 660ctacgggagg cagcagtggg gaatattgca caatgggcgc aagcctgatg cagccatgcc 720gcgtgtatga agaaggcctt cgggttgtaa agtactttca gcggggagga agggagtaaa 780gttaatacct ttgctcattg acgttacccg cagaagaagc accggctaac tccgtgccag 840cagccgcggt aatacggagg gtgcaagcgt 
taatcggaat tactgggcgt aaagcgcacg 900caggcggttt gttaagtcag atgtgaaatc cccgggctca acctgggaac tgcatctgat 960actggcaagc ttgagtctcg tagagggggg tagaattcca ggtgtagcgg tgaaatgcgt 1020agagatctgg aggaataccg gtggcgaagg cggccccctg gacgaagact gacgctcagg 1080tgcgaaagcg tggggagcaa acaggattag ataccctggt agtccacgcc gtaaacgatg 1140tcgacttgga ggttgtgccc ttgaggcgtg gcttccggag ctaacgcgtt aagtcgaccg 1200cctggggagt acggccgcaa ggttaaaact caaatgaatt gacgggggcc cgcacaagcg 1260gtggagcatg tggtttaatt cgatgcaacg cgaagaacct tacctggtct tgacatccac 1320ggaagttttc agagatgaga atgtgccttc gggaaccgtg agacaggtgc tgcatggctg 1380tcgtcagctc gtgttgtgaa atgttgggtt aagtcccgca acgagcgcaa cccttatcct 1440ttgttgccag cggtccggcc gggaactcaa aggagactgc cagtgataaa ctggaggaag 1500gtggggatga cgtcaagtca tcatggccct tacgaccagg gctacacacg tgctacaatg 1560gcgcatacaa agagaagcga cctcgcgaga gcaagcggac ctcataaagt gcgtcgtagt 1620ccggattgga gtctgcaact cgactccatg aagtcggaat cgctagtaat cgtggatcag 1680aatgccacgg tgaatacgtt cccgggcctt gtacacaccg cccgtcacac catgggagtg 1740ggttgcaaaa gaagtaggta gctccuagcg gguagggaga ucauggacgu aacauucgau 1800ccgagggggg gggaccgttg agctaaccgg tactaatgaa ccgtgaggct taaccgagag 1860gttaagcgac taagcgtaca cggtggatgc cctggcagtc agaggcgatg aaggacgtgc 1920taatctgcga taagcgtcgg taaggtgata tgaaccgtta taaccggcga tttccgaatg 1980gggaaaccca gtgtgtttcg acacactatc attaactgaa tccataggtt aatgaggcga 2040accgggggaa ctgaaacatc taagtacccc gaggaaaaga aatcaaccga gattccccca 2100gtagcggcga gcgaacgggg agcagcccag agcctgaatc agtgtgtgtg ttagtggaag 2160cgtctggaaa ggcgcgcgat acagggtgac agccccgtac acaaaaatgc acatgctgtg 2220agctcgatga gtagggcggg acacgtggta tcctgtctga atatgggggg accatcctcc 2280aaggctaaat actcctgact gaccgatagt gaaccagtac cgtgagggaa aggcgaaaag 2340aaccccggcg aggggagtga aaaagaacct gaaaccgtgt acgtacaagc agtgggagca 2400cgcttaggcg tgtgactgcg taccttttgt ataatgggtc agcgacttat attctgtagc 2460aaggttaacc gaatagggga gccgaaggga aaccgagtct taactgggcg ttaagttgca 2520gggtatagac ccgaaacccg gtgatctagc catgggcagg ttgaaggttg ggtaacacta 
2580actggaggac cgaaccgact aatgttgaaa aattagcgga tgacttgtgg ctgggggtga 2640aaggccaatc aaaccgggag atagctggtt ctccccgaaa gctatttagg tagcgcctcg 2700tgaattcatc tccgggggta gagcactgtt tcggcaaggg ggtcatcccg acttaccaac 2760ccgatgcaaa ctgcgaatac cggagaatgt tatcacggga gacacacggc gggtgctaac 2820gtccgtcgtg aagagggaaa caacccagac cgccagctaa ggtcccaaag tcatggttaa 2880gtgggaaacg atgtgggaag gcccagacag ccaggatgtt ggcttagaag cagccatcat 2940ttaaagaaag cgtaatagct cactggtcga gtcggcctgc gcggaagatg taacggggct 3000aaaccatgca ccgaagctgc ggcagcgacg cttatgcgtt gttgggtagg ggagcgttct 3060gtaagcctgc gaaggtgtgc tgtgaggcat gctggaggta tcagaagtgc gaatgctgac 3120ataagtaacg ataaagcggg tgaaaagccc gctcgccgga agaccaaggg ttcctgtcca 3180acgttaatcg gggcagggtg agtcgacccc taaggcgagg ccgaaaggcg tagtcgatgg 3240gaaacaggtt aatattcctg tacttggtgt tactgcgaag gggggacgga gaaggctatg 3300ttggccgggc gacggttgtc ccggtttaag cgtgtaggct ggttttccag gcaaatccgg 3360aaaatcaagg ctgaggcgtg atgacgaggc actacggtgc tgaagcaaca aatgccctgc 3420ttccaggaaa agcctctaag catcaggtaa catcaaatcg taccccaaac cgacacaggt 3480ggtcaggtag agaataccaa ggcgcttgag agaactcggg tgaaggaact aggcaaaatg 3540gtgccgtaac ttcgggagaa ggcacgctga tatgtaggtg aggtccctcg cggatggagc 3600tgaaatcagt cgaagatacc agctggctgc aactgtttat taaaaacaca gcactgtgca 3660aacacgaaag tggacgtata cggtgtgacg cctgcccggt gccggaaggt taattgatgg 3720ggttagcgca agcgaagctc ttgatcgaag ccccggtaaa cggcggccgt aactataacg 3780gtcctaaggt agcgaaattc cttgtcgggt aagttccgac ctgcacgaat ggcgtaatga 3840tggccaggct gtctccaccc gagactcagt gaaattgaac tcgctgtgaa gatgcagtgt 3900acccgcggca agacggaaag accccgtgaa cctttactat agcttgacac tgaacattga 3960gccttgatgt gtaggatagg tgggaggctt tgaagtgtgg acgccagtct gcatggagcc 4020gaccttgaaa taccaccctt taatgtttga tgttctaacg ttgacccgta atccgggttg 4080cggacagtgt ctggtgggta gtttgactgg ggcggtctcc tcctaaagag taacggagga 4140gcacgaaggt tggctaatcc tggtcggaca tcaggaggtt agtgcaatgg cataagccag 4200cttgactgcg agcgtgacgg cgcgagcagg tgcgaaagca ggtcatagtg atccggtggt 4260tctgaatgga agggccatcg ctcaacggat 
aaaaggtact ccggggataa caggctgata 4320ccgcccaaga gttcatatcg acggcggtgt ttggcacctc gatgtcggct catcacatcc 4380tggggctgaa gtaggtccca agggtatggc tgttcgccat ttaaagtggt acgcgagctg 4440ggtttagaac gtcgtgagac agttcggtcc ctatctgccg tgggcgctgg agaactgagg 4500ggggctgctc ctagtacgag aggaccggag tggacgcatc actggtgttc gggttgtcat 4560gccaatggca ctgcccggta gctaaatgcg gaagagataa gtgctgaaag catctaagca 4620cgaaacttgc cccgagatga gttctccctg accctttaag ggtcctgaag gaacgttgaa 4680gacgacgacg ttgataggcc gggtgtgtaa gcgguccccc ccccguuuga cgaucgaaug 4740cccguccaug aucugugaac uacccgcuag aagcggcgct taccactttg tgattcatga 4800ctggggtgaa gtcgtaacaa ggtaaccgta ggggaacctg cggttggatc acctccttac 4860cttaaagaag cgtactttgt agtgctcaca cagattgtct gatagaaagt gaaaagcaag 4920gcgtttacgc gttgggagtg aggctgaaga gaataaggcc gttcgctttc tattaatgaa 4980agctcaccct acacgaaaat atcacgcaac gcgtgataag caattttcgt gtccccttcg 5040tctagaggcc caggacaccg ccctttcacg gcggtaacag gggttcgaat cccctagggg 5100acgccacttg ctggtttgtg agtgaaagtc gccgacctta atatctcaaa actcatcttc 5160gggtgatgtt tgagatattt gctctttaaa aatctggatc aagctgaaaa ttgaaacact 5220gaacaacgag agttgttcgt gagtctctca aattttcgca acacgatgat gaatcgaaag 5280aaacatcttc gggttgtgag cttaagctta caacgccgaa gctgttttgg cggatgagag 5340aagattttca gcctgataca gattaaatca gaacgcagaa gcggtctgat aaaacagaat 5400ttgcctggcg gcagtagcgc ggtggtccca cctgacccca tgccgaactc agaagtgaaa 5460cgccgtagcg ccgatggtag tgtggggtct ccccatgcga gagtagggaa ctgccaggca 5520tcaaataaaa cgaaaggctc agtcgaaaga ctgggccttt cgttttatct gttgtttgtc 5580ggtgaacgct ctcctgagta ggacaaatcc gccgggagcg gatttgaacg ttgcgaagca 5640acggcccgga gggtggcggg caggacgccc gccataaact gccaggcatc aaattaagca 5700gaaggccatc ctgacggatg gcctttttgc gtttctacaa actcttcctg tcgtcatatc 5760tacaagccgg cgcgccggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac 5820attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa taatattgaa 5880aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat 5940tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc 
6000agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag atccttgaga 6060gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg ctatgtggcg 6120cggtattatc ccgtgttgac gccgggcaag agcaactcgg tcgccgcata cactattctc 6180agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat ggcatgacag 6240taagagaatt atgcagtgct gcaataacca tgagtgataa cactgcggcc aacttacttc 6300tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg ggggatcatg 6360taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac gacgagcgtg 6420acaccacgat gcctgcagca atggcaacaa cgttgcgcaa actattaact ggcgaactac 6480ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa gttgcaggac 6540cacttctgcg ctcggccctt ccggctagct ggtttattgc tgataaatct ggagccggtg 6600agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc tcccgtatcg 6660tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga cagatcgctg 6720agataggtgc ctcactgatt aagcattggt aactgcagac caagtttact catatatact 6780ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga 6840taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt 6900agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca 6960aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct 7020ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgtcc ttctagtgta 7080gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 7140aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc 7200aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca 7260gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga 7320aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg 7380aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 7440cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag 7500cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt gctgg 7555207560DNAArtificial SequenceDesigned sequence in accordance with embodiments 20gcggccgcga tctctcacct accaaacaat gcccccctgc aaaaaataaa ttcatataaa 60aaacatacag ataaccatct gcggtgataa 
attatctctg gcggtgttga cataaatacc 120actggcggtg atactgagca cgggtaccgg ccgctgagaa aaagcgaagc ggcactgctc 180tttaacaatt tatcagacaa tctgtgtggg cactcgaaga tacggattct taacgtcgca 240agacgaaaaa tgaataccaa gtctcaagag tgaacacgta attcattacg aagtttaatt 300ctttgagcgt caaactttta aattgaagag tttgatcatg gctcagattg aacgctggcg 360gcaggcctaa cacatgcaag tcgaacggta acaggaagaa gcttgcttct ttgctgacga 420gtggcggacg ggtgagtaat gtctgggaaa ctgcctgatg gagggggata actactggaa 480acggtagcta ataccgcata acgtcgcaag accaaagagg gggaccttcg ggcctcttgc 540catcggatgt gcccagatgg gattagctag taggtggggt aacggctcac ctaggcgacg 600atccctagct ggtctgagag gatgaccagc cacactggaa ctgagacacg gtccagactc 660ctacgggagg cagcagtggg gaatattgca caatgggcgc aagcctgatg cagccatgcc 720gcgtgtatga agaaggcctt cgggttgtaa agtactttca gcggggagga agggagtaaa 780gttaatacct ttgctcattg acgttacccg cagaagaagc accggctaac tccgtgccag 840cagccgcggt aatacggagg gtgcaagcgt taatcggaat tactgggcgt aaagcgcacg 900caggcggttt gttaagtcag atgtgaaatc cccgggctca acctgggaac tgcatctgat 960actggcaagc ttgagtctcg tagagggggg tagaattcca ggtgtagcgg tgaaatgcgt 1020agagatctgg aggaataccg gtggcgaagg cggccccctg gacgaagact gacgctcagg 1080tgcgaaagcg tggggagcaa acaggattag ataccctggt agtccacgcc gtaaacgatg 1140tcgacttgga ggttgtgccc ttgaggcgtg gcttccggag ctaacgcgtt aagtcgaccg 1200cctggggagt acggccgcaa ggttaaaact caaatgaatt gacgggggcc cgcacaagcg 1260gtggagcatg tggtttaatt cgatgcaacg cgaagaacct tacctggtct tgacatccac 1320ggaagttttc agagatgaga atgtgccttc gggaaccgtg agacaggtgc tgcatggctg 1380tcgtcagctc gtgttgtgaa atgttgggtt aagtcccgca acgagcgcaa cccttatcct 1440ttgttgccag cggtccggcc gggaactcaa aggagactgc cagtgataaa ctggaggaag 1500gtggggatga cgtcaagtca tcatggccct tacgaccagg gctacacacg tgctacaatg 1560gcgcatacaa agagaagcga cctcgcgaga gcaagcggac ctcataaagt gcgtcgtagt 1620ccggattgga gtctgcaact cgactccatg aagtcggaat cgctagtaat cgtggatcag 1680aatgccacgg tgaatacgtt cccgggcctt gtacacaccg cccgtcacac catgggagtg 1740ggttgcaaaa gaagtaggta gctuaacacc uguuguggcg cgaaacgacc gauauuuccc 1800ccgauauggc 
gcgaaaccgg gacgcccgtt gagctaaccg gtactaatga accgtgaggc 1860ttaaccgaga ggttaagcga ctaagcgtac acggtggatg ccctggcagt cagaggcgat 1920gaaggacgtg ctaatctgcg ataagcgtcg gtaaggtgat atgaaccgtt ataaccggcg 1980atttccgaat ggggaaaccc agtgtgtttc gacacactat cattaactga atccataggt 2040taatgaggcg aaccggggga actgaaacat ctaagtaccc cgaggaaaag aaatcaaccg 2100agattccccc agtagcggcg agcgaacggg gagcagccca gagcctgaat cagtgtgtgt 2160gttagtggaa gcgtctggaa aggcgcgcga tacagggtga cagccccgta cacaaaaatg 2220cacatgctgt gagctcgatg agtagggcgg gacacgtggt atcctgtctg aatatggggg 2280gaccatcctc caaggctaaa tactcctgac tgaccgatag tgaaccagta ccgtgaggga 2340aaggcgaaaa gaaccccggc gaggggagtg aaaaagaacc tgaaaccgtg tacgtacaag 2400cagtgggagc acgcttaggc gtgtgactgc gtaccttttg tataatgggt cagcgactta 2460tattctgtag caaggttaac cgaatagggg agccgaaggg aaaccgagtc ttaactgggc 2520gttaagttgc agggtataga cccgaaaccc ggtgatctag ccatgggcag gttgaaggtt 2580gggtaacact aactggagga ccgaaccgac taatgttgaa aaattagcgg atgacttgtg 2640gctgggggtg aaaggccaat caaaccggga gatagctggt tctccccgaa agctatttag 2700gtagcgcctc gtgaattcat ctccgggggt agagcactgt ttcggcaagg gggtcatccc 2760gacttaccaa cccgatgcaa actgcgaata ccggagaatg ttatcacggg agacacacgg 2820cgggtgctaa cgtccgtcgt gaagagggaa acaacccaga ccgccagcta aggtcccaaa 2880gtcatggtta agtgggaaac gatgtgggaa ggcccagaca gccaggatgt tggcttagaa 2940gcagccatca tttaaagaaa gcgtaatagc tcactggtcg agtcggcctg cgcggaagat 3000gtaacggggc taaaccatgc accgaagctg cggcagcgac gcttatgcgt tgttgggtag 3060gggagcgttc tgtaagcctg cgaaggtgtg ctgtgaggca tgctggaggt atcagaagtg 3120cgaatgctga cataagtaac gataaagcgg
gtgaaaagcc cgctcgccgg aagaccaagg 3180gttcctgtcc aacgttaatc ggggcagggt gagtcgaccc ctaaggcgag gccgaaaggc 3240gtagtcgatg ggaaacaggt taatattcct gtacttggtg ttactgcgaa ggggggacgg 3300agaaggctat gttggccggg cgacggttgt cccggtttaa gcgtgtaggc tggttttcca 3360ggcaaatccg gaaaatcaag gctgaggcgt gatgacgagg cactacggtg ctgaagcaac 3420aaatgccctg cttccaggaa aagcctctaa gcatcaggta acatcaaatc gtaccccaaa 3480ccgacacagg tggtcaggta gagaatacca aggcgcttga gagaactcgg gtgaaggaac 3540taggcaaaat ggtgccgtaa cttcgggaga aggcacgctg atatgtaggt gaggtccctc 3600gcggatggag ctgaaatcag tcgaagatac cagctggctg caactgttta ttaaaaacac 3660agcactgtgc aaacacgaaa gtggacgtat acggtgtgac gcctgcccgg tgccggaagg 3720ttaattgatg gggttagcgc aagcgaagct cttgatcgaa gccccggtaa acggcggccg 3780taactataac ggtcctaagg tagcgaaatt ccttgtcggg taagttccga cctgcacgaa 3840tggcgtaatg atggccaggc tgtctccacc cgagactcag tgaaattgaa ctcgctgtga 3900agatgcagtg tacccgcggc aagacggaaa gaccccgtga acctttacta tagcttgaca 3960ctgaacattg agccttgatg tgtaggatag gtgggaggct ttgaagtgtg gacgccagtc 4020tgcatggagc cgaccttgaa ataccaccct ttaatgtttg atgttctaac gttgacccgt 4080aatccgggtt gcggacagtg tctggtgggt agtttgactg gggcggtctc ctcctaaaga 4140gtaacggagg agcacgaagg ttggctaatc ctggtcggac atcaggaggt tagtgcaatg 4200gcataagcca gcttgactgc gagcgtgacg gcgcgagcag gtgcgaaagc aggtcatagt 4260gatccggtgg ttctgaatgg aagggccatc gctcaacgga taaaaggtac tccggggata 4320acaggctgat accgcccaag agttcatatc gacggcggtg tttggcacct cgatgtcggc 4380tcatcacatc ctggggctga agtaggtccc aagggtatgg ctgttcgcca tttaaagtgg 4440tacgcgagct gggtttagaa cgtcgtgaga cagttcggtc cctatctgcc gtgggcgctg 4500gagaactgag gggggctgct cctagtacga gaggaccgga gtggacgcat cactggtgtt 4560cgggttgtca tgccaatggc actgcccggt agctaaatgc ggaagagata agtgctgaaa 4620gcatctaagc acgaaacttg ccccgagatg agttctccct gaccctttaa gggtcctgaa 4680ggaacgttga agacgacgac gttgataggc cgggtgtgta agcgggcguc ccgggagcca 4740uaucgggcaa uaucggucgg agccacaauu ggugaaagcg gcgcttacca ctttgtgatt 4800catgactggg gtgaagtcgt aacaaggtaa ccgtagggga acctgcggtt ggatcacctc 
4860cttaccttaa agaagcgtac tttgtagtgc tcacacagat tgtctgatag aaagtgaaaa 4920gcaaggcgtt tacgcgttgg gagtgaggct gaagagaata aggccgttcg ctttctatta 4980atgaaagctc accctacacg aaaatatcac gcaacgcgtg ataagcaatt ttcgtgtccc 5040cttcgtctag aggcccagga caccgccctt tcacggcggt aacaggggtt cgaatcccct 5100aggggacgcc acttgctggt ttgtgagtga aagtcgccga ccttaatatc tcaaaactca 5160tcttcgggtg atgtttgaga tatttgctct ttaaaaatct ggatcaagct gaaaattgaa 5220acactgaaca acgagagttg ttcgtgagtc tctcaaattt tcgcaacacg atgatgaatc 5280gaaagaaaca tcttcgggtt gtgagcttaa gcttacaacg ccgaagctgt tttggcggat 5340gagagaagat tttcagcctg atacagatta aatcagaacg cagaagcggt ctgataaaac 5400agaatttgcc tggcggcagt agcgcggtgg tcccacctga ccccatgccg aactcagaag 5460tgaaacgccg tagcgccgat ggtagtgtgg ggtctcccca tgcgagagta gggaactgcc 5520aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt tatctgttgt 5580ttgtcggtga acgctctcct gagtaggaca aatccgccgg gagcggattt gaacgttgcg 5640aagcaacggc ccggagggtg gcgggcagga cgcccgccat aaactgccag gcatcaaatt 5700aagcagaagg ccatcctgac ggatggcctt tttgcgtttc tacaaactct tcctgtcgtc 5760atatctacaa gccggcgcgc cgggaaatgt gcgcggaacc cctatttgtt tatttttcta 5820aatacattca aatatgtatc cgctcatgag acaataaccc tgataaatgc ttcaataata 5880ttgaaaaagg aagagtatga gtattcaaca tttccgtgtc gcccttattc ccttttttgc 5940ggcattttgc cttcctgttt ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga 6000agatcagttg ggtgcacgag tgggttacat cgaactggat ctcaacagcg gtaagatcct 6060tgagagtttt cgccccgaag aacgttttcc aatgatgagc acttttaaag ttctgctatg 6120tggcgcggta ttatcccgtg ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta 6180ttctcagaat gacttggttg agtactcacc agtcacagaa aagcatctta cggatggcat 6240gacagtaaga gaattatgca gtgctgcaat aaccatgagt gataacactg cggccaactt 6300acttctgaca acgatcggag gaccgaagga gctaaccgct tttttgcaca acatggggga 6360tcatgtaact cgccttgatc gttgggaacc ggagctgaat gaagccatac caaacgacga 6420gcgtgacacc acgatgcctg cagcaatggc aacaacgttg cgcaaactat taactggcga 6480actacttact ctagcttccc ggcaacaatt aatagactgg atggaggcgg ataaagttgc 6540aggaccactt ctgcgctcgg cccttccggc 
tagctggttt attgctgata aatctggagc 6600cggtgagcgt gggtctcgcg gtatcattgc agcactgggg ccagatggta agccctcccg 6660tatcgtagtt atctacacga cggggagtca ggcaactatg gatgaacgaa atagacagat 6720cgctgagata ggtgcctcac tgattaagca ttggtaactg cagaccaagt ttactcatat 6780atactttaga ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt 6840tttgataatc tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac 6900cccgtagaaa agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc 6960ttgcaaacaa aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca 7020actctttttc cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta 7080gtgtagccgt agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct 7140ctgctaatcc tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg 7200gactcaagac gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc 7260acacagccca gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta 7320tgagaaagcg ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg 7380gtcggaacag gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt 7440cctgtcgggt ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg 7500cggagcctat ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg 7560217565DNAArtificial SequenceDesigned sequence in accordance with embodiments 21gcggccgcga tctctcacct accaaacaat gcccccctgc aaaaaataaa ttcatataaa 60aaacatacag ataaccatct gcggtgataa attatctctg gcggtgttga cataaatacc 120actggcggtg atactgagca cgggtaccgg ccgctgagaa aaagcgaagc ggcactgctc 180tttaacaatt tatcagacaa tctgtgtggg cactcgaaga tacggattct taacgtcgca 240agacgaaaaa tgaataccaa gtctcaagag tgaacacgta attcattacg aagtttaatt 300ctttgagcgt caaactttta aattgaagag tttgatcatg gctcagattg aacgctggcg 360gcaggcctaa cacatgcaag tcgaacggta acaggaagaa gcttgcttct ttgctgacga 420gtggcggacg ggtgagtaat gtctgggaaa ctgcctgatg gagggggata actactggaa 480acggtagcta ataccgcata acgtcgcaag accaaagagg gggaccttcg ggcctcttgc 540catcggatgt gcccagatgg gattagctag taggtggggt aacggctcac ctaggcgacg 600atccctagct ggtctgagag gatgaccagc cacactggaa ctgagacacg gtccagactc 
660ctacgggagg cagcagtggg gaatattgca caatgggcgc aagcctgatg cagccatgcc 720gcgtgtatga agaaggcctt cgggttgtaa agtactttca gcggggagga agggagtaaa 780gttaatacct ttgctcattg acgttacccg cagaagaagc accggctaac tccgtgccag 840cagccgcggt aatacggagg gtgcaagcgt taatcggaat tactgggcgt aaagcgcacg 900caggcggttt gttaagtcag atgtgaaatc cccgggctca acctgggaac tgcatctgat 960actggcaagc ttgagtctcg tagagggggg tagaattcca ggtgtagcgg tgaaatgcgt 1020agagatctgg aggaataccg gtggcgaagg cggccccctg gacgaagact gacgctcagg 1080tgcgaaagcg tggggagcaa acaggattag ataccctggt agtccacgcc gtaaacgatg 1140tcgacttgga ggttgtgccc ttgaggcgtg gcttccggag ctaacgcgtt aagtcgaccg 1200cctggggagt acggccgcaa ggttaaaact caaatgaatt gacgggggcc cgcacaagcg 1260gtggagcatg tggtttaatt cgatgcaacg cgaagaacct tacctggtct tgacatccac 1320ggaagttttc agagatgaga atgtgccttc gggaaccgtg agacaggtgc tgcatggctg 1380tcgtcagctc gtgttgtgaa atgttgggtt aagtcccgca acgagcgcaa cccttatcct 1440ttgttgccag cggtccggcc gggaactcaa aggagactgc cagtgataaa ctggaggaag 1500gtggggatga cgtcaagtca tcatggccct tacgaccagg gctacacacg tgctacaatg 1560gcgcatacaa agagaagcga cctcgcgaga gcaagcggac ctcataaagt gcgtcgtagt 1620ccggattgga gtctgcaact cgactccatg aagtcggaat cgctagtaat cgtggatcag 1680aatgccacgg tgaatacgtt cccgggcctt gtacacaccg cccgtcacac catgggagtg 1740ggttgcaaaa gaagtaggta gctcuguagg cgaacuacua acucguacgu augggauuau 1800auucgauaua uggaccuaug gacccccgtt gagctaaccg gtactaatga accgtgaggc 1860ttaaccgaga ggttaagcga ctaagcgtac acggtggatg ccctggcagt cagaggcgat 1920gaaggacgtg ctaatctgcg ataagcgtcg gtaaggtgat atgaaccgtt ataaccggcg 1980atttccgaat ggggaaaccc agtgtgtttc gacacactat cattaactga atccataggt 2040taatgaggcg aaccggggga actgaaacat ctaagtaccc cgaggaaaag aaatcaaccg 2100agattccccc agtagcggcg agcgaacggg gagcagccca gagcctgaat cagtgtgtgt 2160gttagtggaa gcgtctggaa aggcgcgcga tacagggtga cagccccgta cacaaaaatg 2220cacatgctgt gagctcgatg agtagggcgg gacacgtggt atcctgtctg aatatggggg 2280gaccatcctc caaggctaaa tactcctgac tgaccgatag tgaaccagta ccgtgaggga 2340aaggcgaaaa gaaccccggc gaggggagtg 
aaaaagaacc tgaaaccgtg tacgtacaag 2400cagtgggagc acgcttaggc gtgtgactgc gtaccttttg tataatgggt cagcgactta 2460tattctgtag caaggttaac cgaatagggg agccgaaggg aaaccgagtc ttaactgggc 2520gttaagttgc agggtataga cccgaaaccc ggtgatctag ccatgggcag gttgaaggtt 2580gggtaacact aactggagga ccgaaccgac taatgttgaa aaattagcgg atgacttgtg 2640gctgggggtg aaaggccaat caaaccggga gatagctggt tctccccgaa agctatttag 2700gtagcgcctc gtgaattcat ctccgggggt agagcactgt ttcggcaagg gggtcatccc 2760gacttaccaa cccgatgcaa actgcgaata ccggagaatg ttatcacggg agacacacgg 2820cgggtgctaa cgtccgtcgt gaagagggaa acaacccaga ccgccagcta aggtcccaaa 2880gtcatggtta agtgggaaac gatgtgggaa ggcccagaca gccaggatgt tggcttagaa 2940gcagccatca tttaaagaaa gcgtaatagc tcactggtcg agtcggcctg cgcggaagat 3000gtaacggggc taaaccatgc accgaagctg cggcagcgac gcttatgcgt tgttgggtag 3060gggagcgttc tgtaagcctg cgaaggtgtg ctgtgaggca tgctggaggt atcagaagtg 3120cgaatgctga cataagtaac gataaagcgg gtgaaaagcc cgctcgccgg aagaccaagg 3180gttcctgtcc aacgttaatc ggggcagggt gagtcgaccc ctaaggcgag gccgaaaggc 3240gtagtcgatg ggaaacaggt taatattcct gtacttggtg ttactgcgaa ggggggacgg 3300agaaggctat gttggccggg cgacggttgt cccggtttaa gcgtgtaggc tggttttcca 3360ggcaaatccg gaaaatcaag gctgaggcgt gatgacgagg cactacggtg ctgaagcaac 3420aaatgccctg cttccaggaa aagcctctaa gcatcaggta acatcaaatc gtaccccaaa 3480ccgacacagg tggtcaggta gagaatacca aggcgcttga gagaactcgg gtgaaggaac 3540taggcaaaat ggtgccgtaa cttcgggaga aggcacgctg atatgtaggt gaggtccctc 3600gcggatggag ctgaaatcag tcgaagatac cagctggctg caactgttta ttaaaaacac 3660agcactgtgc aaacacgaaa gtggacgtat acggtgtgac gcctgcccgg tgccggaagg 3720ttaattgatg gggttagcgc aagcgaagct cttgatcgaa gccccggtaa acggcggccg 3780taactataac ggtcctaagg tagcgaaatt ccttgtcggg taagttccga cctgcacgaa 3840tggcgtaatg atggccaggc tgtctccacc cgagactcag tgaaattgaa ctcgctgtga 3900agatgcagtg tacccgcggc aagacggaaa gaccccgtga acctttacta tagcttgaca 3960ctgaacattg agccttgatg tgtaggatag gtgggaggct ttgaagtgtg gacgccagtc 4020tgcatggagc cgaccttgaa ataccaccct ttaatgtttg atgttctaac gttgacccgt 
4080aatccgggtt gcggacagtg tctggtgggt agtttgactg gggcggtctc ctcctaaaga 4140gtaacggagg agcacgaagg ttggctaatc ctggtcggac atcaggaggt tagtgcaatg 4200gcataagcca gcttgactgc gagcgtgacg gcgcgagcag gtgcgaaagc aggtcatagt 4260gatccggtgg ttctgaatgg aagggccatc gctcaacgga taaaaggtac tccggggata 4320acaggctgat accgcccaag agttcatatc gacggcggtg tttggcacct cgatgtcggc 4380tcatcacatc ctggggctga agtaggtccc aagggtatgg ctgttcgcca tttaaagtgg 4440tacgcgagct gggtttagaa cgtcgtgaga cagttcggtc cctatctgcc gtgggcgctg 4500gagaactgag gggggctgct cctagtacga gaggaccgga gtggacgcat cactggtgtt 4560cgggttgtca tgccaatggc actgcccggt agctaaatgc ggaagagata agtgctgaaa 4620gcatctaagc acgaaacttg ccccgagatg agttctccct gaccctttaa gggtcctgaa 4680ggaacgttga agacgacgac gttgataggc cgggtgtgta agcggggguc cauaggaugg 4740aaguauaucg aauauaaucc cauacgcaaa uuagcgccua uugcggcgct taccactttg 4800tgattcatga ctggggtgaa gtcgtaacaa ggtaaccgta ggggaacctg cggttggatc 4860acctccttac cttaaagaag cgtactttgt agtgctcaca cagattgtct gatagaaagt 4920gaaaagcaag gcgtttacgc gttgggagtg aggctgaaga gaataaggcc gttcgctttc 4980tattaatgaa agctcaccct acacgaaaat atcacgcaac gcgtgataag caattttcgt 5040gtccccttcg tctagaggcc caggacaccg ccctttcacg gcggtaacag gggttcgaat 5100cccctagggg acgccacttg ctggtttgtg agtgaaagtc gccgacctta atatctcaaa 5160actcatcttc gggtgatgtt tgagatattt gctctttaaa aatctggatc aagctgaaaa 5220ttgaaacact gaacaacgag agttgttcgt gagtctctca aattttcgca acacgatgat 5280gaatcgaaag aaacatcttc gggttgtgag cttaagctta caacgccgaa gctgttttgg 5340cggatgagag aagattttca gcctgataca gattaaatca gaacgcagaa gcggtctgat 5400aaaacagaat ttgcctggcg gcagtagcgc ggtggtccca cctgacccca tgccgaactc 5460agaagtgaaa cgccgtagcg ccgatggtag tgtggggtct ccccatgcga gagtagggaa 5520ctgccaggca tcaaataaaa cgaaaggctc agtcgaaaga ctgggccttt cgttttatct 5580gttgtttgtc ggtgaacgct ctcctgagta ggacaaatcc gccgggagcg gatttgaacg 5640ttgcgaagca acggcccgga gggtggcggg caggacgccc gccataaact gccaggcatc 5700aaattaagca gaaggccatc ctgacggatg gcctttttgc gtttctacaa actcttcctg 5760tcgtcatatc tacaagccgg cgcgccggga 
aatgtgcgcg gaacccctat ttgtttattt 5820ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa 5880taatattgaa aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt 5940tttgcggcat tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat 6000gctgaagatc agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag 6060atccttgaga gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg 6120ctatgtggcg cggtattatc ccgtgttgac gccgggcaag agcaactcgg tcgccgcata 6180cactattctc agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat 6240ggcatgacag taagagaatt atgcagtgct gcaataacca tgagtgataa cactgcggcc 6300aacttacttc tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg 6360ggggatcatg taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac 6420gacgagcgtg acaccacgat gcctgcagca atggcaacaa cgttgcgcaa actattaact 6480ggcgaactac ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa 6540gttgcaggac cacttctgcg ctcggccctt ccggctagct ggtttattgc tgataaatct 6600ggagccggtg agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc 6660tcccgtatcg tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga 6720cagatcgctg agataggtgc ctcactgatt aagcattggt aactgcagac caagtttact 6780catatatact ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga 6840tcctttttga taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt 6900cagaccccgt agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct 6960gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc 7020taccaactct ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgtcc 7080ttctagtgta gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc 7140tcgctctgct aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg 7200ggttggactc aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt 7260cgtgcacaca gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg 7320agctatgaga aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg 7380gcagggtcgg aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt 7440atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag 
7500gggggcggag cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt 7560gctgg 7565227565DNAArtificial SequenceDesigned sequence in accordance with embodiments 22gcggccgcga tctctcacct accaaacaat gcccccctgc aaaaaataaa ttcatataaa 60aaacatacag ataaccatct gcggtgataa attatctctg gcggtgttga cataaatacc 120actggcggtg atactgagca cgggtaccgg ccgctgagaa aaagcgaagc ggcactgctc 180tttaacaatt tatcagacaa tctgtgtggg cactcgaaga tacggattct taacgtcgca 240agacgaaaaa tgaataccaa gtctcaagag tgaacacgta attcattacg aagtttaatt 300ctttgagcgt caaactttta aattgaagag tttgatcatg gctcagattg aacgctggcg 360gcaggcctaa cacatgcaag tcgaacggta acaggaagaa gcttgcttct ttgctgacga 420gtggcggacg ggtgagtaat gtctgggaaa ctgcctgatg gagggggata actactggaa 480acggtagcta ataccgcata acgtcgcaag accaaagagg gggaccttcg ggcctcttgc 540catcggatgt gcccagatgg gattagctag taggtggggt aacggctcac ctaggcgacg 600atccctagct ggtctgagag gatgaccagc cacactggaa ctgagacacg gtccagactc 660ctacgggagg cagcagtggg gaatattgca caatgggcgc aagcctgatg cagccatgcc 720gcgtgtatga agaaggcctt cgggttgtaa agtactttca gcggggagga agggagtaaa 780gttaatacct ttgctcattg acgttacccg cagaagaagc accggctaac tccgtgccag 840cagccgcggt aatacggagg gtgcaagcgt taatcggaat tactgggcgt aaagcgcacg 900caggcggttt gttaagtcag atgtgaaatc cccgggctca acctgggaac tgcatctgat 960actggcaagc ttgagtctcg tagagggggg tagaattcca ggtgtagcgg tgaaatgcgt 1020agagatctgg aggaataccg gtggcgaagg cggccccctg gacgaagact gacgctcagg 1080tgcgaaagcg tggggagcaa acaggattag ataccctggt agtccacgcc gtaaacgatg 1140tcgacttgga ggttgtgccc ttgaggcgtg gcttccggag ctaacgcgtt aagtcgaccg 1200cctggggagt acggccgcaa ggttaaaact caaatgaatt gacgggggcc cgcacaagcg 1260gtggagcatg tggtttaatt cgatgcaacg cgaagaacct tacctggtct tgacatccac 1320ggaagttttc agagatgaga atgtgccttc gggaaccgtg agacaggtgc tgcatggctg 1380tcgtcagctc gtgttgtgaa atgttgggtt aagtcccgca acgagcgcaa cccttatcct 1440ttgttgccag cggtccggcc gggaactcaa aggagactgc cagtgataaa ctggaggaag 1500gtggggatga cgtcaagtca tcatggccct tacgaccagg gctacacacg tgctacaatg 1560gcgcatacaa agagaagcga cctcgcgaga 
gcaagcggac ctcataaagt gcgtcgtagt 1620ccggattgga gtctgcaact cgactccatg aagtcggaat cgctagtaat cgtggatcag 1680aatgccacgg tgaatacgtt cccgggcctt gtacacaccg cccgtcacac catgggagtg 1740ggttgcaaaa gaagtaggta gctcuguagg cgaacuacua acucguacgu augggauuau 1800auucgauaua uggaccuaug gacccccgtt gagctaaccg gtactaatga accgtgaggc 1860ttaaccgaga ggttaagcga ctaagcgtac acggtggatg ccctggcagt cagaggcgat 1920gaaggacgtg ctaatctgcg ataagcgtcg gtaaggtgat atgaaccgtt ataaccggcg 1980atttccgaat ggggaaaccc agtgtgtttc gacacactat cattaactga atccataggt 2040taatgaggcg aaccggggga actgaaacat ctaagtaccc cgaggaaaag aaatcaaccg 2100agattccccc agtagcggcg agcgaacggg gagcagccca gagcctgaat cagtgtgtgt 2160gttagtggaa gcgtctggaa aggcgcgcga tacagggtga cagccccgta cacaaaaatg 2220cacatgctgt gagctcgatg agtagggcgg gacacgtggt atcctgtctg aatatggggg 2280gaccatcctc caaggctaaa tactcctgac tgaccgatag tgaaccagta ccgtgaggga 2340aaggcgaaaa gaaccccggc gaggggagtg aaaaagaacc tgaaaccgtg tacgtacaag 2400cagtgggagc acgcttaggc gtgtgactgc gtaccttttg tataatgggt cagcgactta 2460tattctgtag caaggttaac cgaatagggg agccgaaggg aaaccgagtc ttaactgggc 2520gttaagttgc agggtataga cccgaaaccc ggtgatctag ccatgggcag gttgaaggtt 2580gggtaacact aactggagga ccgaaccgac taatgttgaa aaattagcgg atgacttgtg 2640gctgggggtg aaaggccaat caaaccggga gatagctggt tctccccgaa agctatttag 2700gtagcgcctc gtgaattcat ctccgggggt agagcactgt ttcggcaagg gggtcatccc 2760gacttaccaa cccgatgcaa actgcgaata ccggagaatg ttatcacggg agacacacgg 2820cgggtgctaa cgtccgtcgt gaagagggaa acaacccaga ccgccagcta
aggtcccaaa 2880gtcatggtta agtgggaaac gatgtgggaa ggcccagaca gccaggatgt tggcttagaa 2940gcagccatca tttaaagaaa gcgtaatagc tcactggtcg agtcggcctg cgcggaagat 3000gtaacggggc taaaccatgc accgaagctg cggcagcgac gcttatgcgt tgttgggtag 3060gggagcgttc tgtaagcctg cgaaggtgtg ctgtgaggca tgctggaggt atcagaagtg 3120cgaatgctga cataagtaac gataaagcgg gtgaaaagcc cgctcgccgg aagaccaagg 3180gttcctgtcc aacgttaatc ggggcagggt gagtcgaccc ctaaggcgag gccgaaaggc 3240gtagtcgatg ggaaacaggt taatattcct gtacttggtg ttactgcgaa ggggggacgg 3300agaaggctat gttggccggg cgacggttgt cccggtttaa gcgtgtaggc tggttttcca 3360ggcaaatccg gaaaatcaag gctgaggcgt gatgacgagg cactacggtg ctgaagcaac 3420aaatgccctg cttccaggaa aagcctctaa gcatcaggta acatcaaatc gtaccccaaa 3480ccgacacagg tggtcaggta gagaatacca aggcgcttga gagaactcgg gtgaaggaac 3540taggcaaaat ggtgccgtaa cttcgggaga aggcacgctg atatgtaggt gaggtccctc 3600gcggatggag ctgaaatcag tcgaagatac cagctggctg caactgttta ttaaaaacac 3660agcactgtgc aaacacgaaa gtggacgtat acggtgtgac gcctgcccgg tgccggaagg 3720ttaattgatg gggttagcgc aagcgaagct cttgatcgaa gccccggtaa acggcggccg 3780taactataac ggtcctaagg tagcgaaatt ccttgtcggg taagttccga cctgcacgaa 3840tggcgtaatg atggccaggc tgtctccacc cgagactcag tgaaattgaa ctcgctgtga 3900agatgcagtg tacccgcggc aagacggaaa gaccccgtga acctttacta tagcttgaca 3960ctgaacattg agccttgatg tgtaggatag gtgggaggct ttgaagtgtg gacgccagtc 4020tgcatggagc cgaccttgaa ataccaccct ttaatgtttg atgttctaac gttgacccgt 4080aatccgggtt gcggacagtg tctggtgggt agtttgactg gggcggtctc ctcctaaaga 4140gtaacggagg agcacgaagg ttggctaatc ctggtcggac atcaggaggt tagtgcaatg 4200gcataagcca gcttgactgc gagcgtgacg gcgcgagcag gtgcgaaagc aggtcatagt 4260gatccggtgg ttctgaatgg aagggccatc gctcaacgga taaaaggtac tccggggata 4320acaggctgat accgcccaag agttcatatc gacggcggtg tttggcacct cgatgtcggc 4380tcatcacatc ctggggctga agtaggtccc aagggtatgg ctgttcgcca tttaaagtgg 4440tacgcgagct gggtttagaa cgtcgtgaga cagttcggtc cctatctgcc gtgggcgctg 4500gagaactgag gggggctgct cctagtacga gaggaccgga gtggacgcat cactggtgtt 4560cgggttgtca tgccaatggc 
actgcccggt agctaaatgc ggaagagata agtgctgaaa 4620gcatctaagc acgaaacttg ccccgagatg agttctccct gaccctttaa gggtcctgaa 4680ggaacgttga agacgacgac gttgataggc cgggtgtgta agcggggguc cauaggaugg 4740aaguauaucg aauauaaucc cauacgcaaa uuagcgccua uugcggcgct taccactttg 4800tgattcatga ctggggtgaa gtcgtaacaa ggtaaccgta ggggaacctg cggttggatc 4860acctccttac cttaaagaag cgtactttgt agtgctcaca cagattgtct gatagaaagt 4920gaaaagcaag gcgtttacgc gttgggagtg aggctgaaga gaataaggcc gttcgctttc 4980tattaatgaa agctcaccct acacgaaaat atcacgcaac gcgtgataag caattttcgt 5040gtccccttcg tctagaggcc caggacaccg ccctttcacg gcggtaacag gggttcgaat 5100cccctagggg acgccacttg ctggtttgtg agtgaaagtc gccgacctta atatctcaaa 5160actcatcttc gggtgatgtt tgagatattt gctctttaaa aatctggatc aagctgaaaa 5220ttgaaacact gaacaacgag agttgttcgt gagtctctca aattttcgca acacgatgat 5280gaatcgaaag aaacatcttc gggttgtgag cttaagctta caacgccgaa gctgttttgg 5340cggatgagag aagattttca gcctgataca gattaaatca gaacgcagaa gcggtctgat 5400aaaacagaat ttgcctggcg gcagtagcgc ggtggtccca cctgacccca tgccgaactc 5460agaagtgaaa cgccgtagcg ccgatggtag tgtggggtct ccccatgcga gagtagggaa 5520ctgccaggca tcaaataaaa cgaaaggctc agtcgaaaga ctgggccttt cgttttatct 5580gttgtttgtc ggtgaacgct ctcctgagta ggacaaatcc gccgggagcg gatttgaacg 5640ttgcgaagca acggcccgga gggtggcggg caggacgccc gccataaact gccaggcatc 5700aaattaagca gaaggccatc ctgacggatg gcctttttgc gtttctacaa actcttcctg 5760tcgtcatatc tacaagccgg cgcgccggga aatgtgcgcg gaacccctat ttgtttattt 5820ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa 5880taatattgaa aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt 5940tttgcggcat tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat 6000gctgaagatc agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag 6060atccttgaga gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg 6120ctatgtggcg cggtattatc ccgtgttgac gccgggcaag agcaactcgg tcgccgcata 6180cactattctc agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat 6240ggcatgacag taagagaatt atgcagtgct gcaataacca tgagtgataa 
cactgcggcc 6300aacttacttc tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg 6360ggggatcatg taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac 6420gacgagcgtg acaccacgat gcctgcagca atggcaacaa cgttgcgcaa actattaact 6480ggcgaactac ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa 6540gttgcaggac cacttctgcg ctcggccctt ccggctagct ggtttattgc tgataaatct 6600ggagccggtg agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc 6660tcccgtatcg tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga 6720cagatcgctg agataggtgc ctcactgatt aagcattggt aactgcagac caagtttact 6780catatatact ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga 6840tcctttttga taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt 6900cagaccccgt agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct 6960gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc 7020taccaactct ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgtcc 7080ttctagtgta gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc 7140tcgctctgct aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg 7200ggttggactc aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt 7260cgtgcacaca gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg 7320agctatgaga aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg 7380gcagggtcgg aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt 7440atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag 7500gggggcggag cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt 7560gctgg 7565237564DNAArtificial SequenceDesigned sequence in accordance with embodiments 23gcggccgcga tctctcacct accaaacaat gcccccctgc aaaaaataaa ttcatataaa 60aaacatacag ataaccatct gcggtgataa attatctctg gcggtgttga cataaatacc 120actggcggtg atactgagca cgggtaccgg ccgctgagaa aaagcgaagc ggcactgctc 180tttaacaatt tatcagacaa tctgtgtggg cactcgaaga tacggattct taacgtcgca 240agacgaaaaa tgaataccaa gtctcaagag tgaacacgta attcattacg aagtttaatt 300ctttgagcgt caaactttta aattgaagag tttgatcatg gctcagattg aacgctggcg 360gcaggcctaa 
cacatgcaag tcgaacggta acaggaagaa gcttgcttct ttgctgacga 420gtggcggacg ggtgagtaat gtctgggaaa ctgcctgatg gagggggata actactggaa 480acggtagcta ataccgcata acgtcgcaag accaaagagg gggaccttcg ggcctcttgc 540catcggatgt gcccagatgg gattagctag taggtggggt aacggctcac ctaggcgacg 600atccctagct ggtctgagag gatgaccagc cacactggaa ctgagacacg gtccagactc 660ctacgggagg cagcagtggg gaatattgca caatgggcgc aagcctgatg cagccatgcc 720gcgtgtatga agaaggcctt cgggttgtaa agtactttca gcggggagga agggagtaaa 780gttaatacct ttgctcattg acgttacccg cagaagaagc accggctaac tccgtgccag 840cagccgcggt aatacggagg gtgcaagcgt taatcggaat tactgggcgt aaagcgcacg 900caggcggttt gttaagtcag atgtgaaatc cccgggctca acctgggaac tgcatctgat 960actggcaagc ttgagtctcg tagagggggg tagaattcca ggtgtagcgg tgaaatgcgt 1020agagatctgg aggaataccg gtggcgaagg cggccccctg gacgaagact gacgctcagg 1080tgcgaaagcg tggggagcaa acaggattag ataccctggt agtccacgcc gtaaacgatg 1140tcgacttgga ggttgtgccc ttgaggcgtg gcttccggag ctaacgcgtt aagtcgaccg 1200cctggggagt acggccgcaa ggttaaaact caaatgaatt gacgggggcc cgcacaagcg 1260gtggagcatg tggtttaatt cgatgcaacg cgaagaacct tacctggtct tgacatccac 1320ggaagttttc agagatgaga atgtgccttc gggaaccgtg agacaggtgc tgcatggctg 1380tcgtcagctc gtgttgtgaa atgttgggtt aagtcccgca acgagcgcaa cccttatcct 1440ttgttgccag cggtccggcc gggaactcaa aggagactgc cagtgataaa ctggaggaag 1500gtggggatga cgtcaagtca tcatggccct tacgaccagg gctacacacg tgctacaatg 1560gcgcatacaa agagaagcga cctcgcgaga gcaagcggac ctcataaagt gcgtcgtagt 1620ccggattgga gtctgcaact cgactccatg aagtcggaat cgctagtaat cgtggatcag 1680aatgccacgg tgaatacgtt cccgggcctt gtacacaccg cccgtcacac catgggagtg 1740ggttgcaaaa gaagtaggta gctgtctacc gccccaggga gatatccccc cacaggccat 1800ataagacagc gtatacccgc cgttgagcta accggtacta atgaaccgtg aggcttaacc 1860gagaggttaa gcgactaagc gtacacggtg gatgccctgg cagtcagagg cgatgaagga 1920cgtgctaatc tgcgataagc gtcggtaagg tgatatgaac cgttataacc ggcgatttcc 1980gaatggggaa acccagtgtg tttcgacaca ctatcattaa ctgaatccat aggttaatga 2040ggcgaaccgg gggaactgaa acatctaagt accccgagga aaagaaatca 
accgagattc 2100ccccagtagc ggcgagcgaa cggggagcag cccagagcct gaatcagtgt gtgtgttagt 2160ggaagcgtct ggaaaggcgc gcgatacagg gtgacagccc cgtacacaaa aatgcacatg 2220ctgtgagctc gatgagtagg gcgggacacg tggtatcctg tctgaatatg gggggaccat 2280cctccaaggc taaatactcc tgactgaccg atagtgaacc agtaccgtga gggaaaggcg 2340aaaagaaccc cggcgagggg agtgaaaaag aacctgaaac cgtgtacgta caagcagtgg 2400gagcacgctt aggcgtgtga ctgcgtacct tttgtataat gggtcagcga cttatattct 2460gtagcaaggt taaccgaata ggggagccga agggaaaccg agtcttaact gggcgttaag 2520ttgcagggta tagacccgaa acccggtgat ctagccatgg gcaggttgaa ggttgggtaa 2580cactaactgg aggaccgaac cgactaatgt tgaaaaatta gcggatgact tgtggctggg 2640ggtgaaaggc caatcaaacc gggagatagc tggttctccc cgaaagctat ttaggtagcg 2700cctcgtgaat tcatctccgg gggtagagca ctgtttcggc aagggggtca tcccgactta 2760ccaacccgat gcaaactgcg aataccggag aatgttatca cgggagacac acggcgggtg 2820ctaacgtccg tcgtgaagag ggaaacaacc cagaccgcca gctaaggtcc caaagtcatg 2880gttaagtggg aaacgatgtg ggaaggccca gacagccagg atgttggctt agaagcagcc 2940atcatttaaa gaaagcgtaa tagctcactg gtcgagtcgg cctgcgcgga agatgtaacg 3000gggctaaacc atgcaccgaa gctgcggcag cgacgcttat gcgttgttgg gtaggggagc 3060gttctgtaag cctgcgaagg tgtgctgtga ggcatgctgg aggtatcaga agtgcgaatg 3120ctgacataag taacgataaa gcgggtgaaa agcccgctcg ccggaagacc aagggttcct 3180gtccaacgtt aatcggggca gggtgagtcg acccctaagg cgaggccgaa aggcgtagtc 3240gatgggaaac aggttaatat tcctgtactt ggtgttactg cgaagggggg acggagaagg 3300ctatgttggc cgggcgacgg ttgtcccggt ttaagcgtgt aggctggttt tccaggcaaa 3360tccggaaaat caaggctgag gcgtgatgac gaggcactac ggtgctgaag caacaaatgc 3420cctgcttcca ggaaaagcct ctaagcatca ggtaacatca aatcgtaccc caaaccgaca 3480caggtggtca ggtagagaat accaaggcgc ttgagagaac tcgggtgaag gaactaggca 3540aaatggtgcc gtaacttcgg gagaaggcac gctgatatgt aggtgaggtc cctcgcggat 3600ggagctgaaa tcagtcgaag ataccagctg gctgcaactg tttattaaaa acacagcact 3660gtgcaaacac gaaagtggac gtatacggtg tgacgcctgc ccggtgccgg aaggttaatt 3720gatggggtta gcgcaagcga agctcttgat cgaagccccg gtaaacggcg gccgtaacta 3780taacggtcct aaggtagcga 
aattccttgt cgggtaagtt ccgacctgca cgaatggcgt 3840aatgatggcc aggctgtctc cacccgagac tcagtgaaat tgaactcgct gtgaagatgc 3900agtgtacccg cggcaagacg gaaagacccc gtgaaccttt actatagctt gacactgaac 3960attgagcctt gatgtgtagg ataggtggga ggctttgaag tgtggacgcc agtctgcatg 4020gagccgacct tgaaatacca ccctttaatg tttgatgttc taacgttgac ccgtaatccg 4080ggttgcggac agtgtctggt gggtagtttg actggggcgg tctcctccta aagagtaacg 4140gaggagcacg aaggttggct aatcctggtc ggacatcagg aggttagtgc aatggcataa 4200gccagcttga ctgcgagcgt gacggcgcga gcaggtgcga aagcaggtca tagtgatccg 4260gtggttctga atggaagggc catcgctcaa cggataaaag gtactccggg gataacaggc 4320tgataccgcc caagagttca tatcgacggc ggtgtttggc acctcgatgt cggctcatca 4380catcctgggg ctgaagtagg tcccaagggt atggctgttc gccatttaaa gtggtacgcg 4440agctgggttt agaacgtcgt gagacagttc ggtccctatc tgccgtgggc gctggagaac 4500tgaggggggc tgctcctagt acgagaggac cggagtggac gcatcactgg tgttcgggtt 4560gtcatgccaa tggcactgcc cggtagctaa atgcggaaga gataagtgct gaaagcatct 4620aagcacgaaa cttgccccga gatgagttct ccctgaccct ttaagggtcc tgaaggaacg 4680ttgaagacga cgacgttgat aggccgggtg tgtaagcggc gggtatacgc aatgacgtat 4740ggtggcctgt ggggggatat ctgtgaactg gggcgaagta gccggcgctt accactttgt 4800gattcatgac tggggtgaag tcgtaacaag gtaaccgtag gggaacctgc ggttggatca 4860cctccttacc ttaaagaagc gtactttgta gtgctcacac agattgtctg atagaaagtg 4920aaaagcaagg cgtttacgcg ttgggagtga ggctgaagag aataaggccg ttcgctttct 4980attaatgaaa gctcacccta cacgaaaata tcacgcaacg cgtgataagc aattttcgtg 5040tccccttcgt ctagaggccc aggacaccgc cctttcacgg cggtaacagg ggttcgaatc 5100ccctagggga cgccacttgc tggtttgtga gtgaaagtcg ccgaccttaa tatctcaaaa 5160ctcatcttcg ggtgatgttt gagatatttg ctctttaaaa atctggatca agctgaaaat 5220tgaaacactg aacaacgaga gttgttcgtg agtctctcaa attttcgcaa cacgatgatg 5280aatcgaaaga aacatcttcg ggttgtgagc ttaagcttac aacgccgaag ctgttttggc 5340ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag cggtctgata 5400aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat gccgaactca 5460gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc cccatgcgag 
agtagggaac 5520tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc gttttatctg 5580ttgtttgtcg gtgaacgctc tcctgagtag gacaaatccg ccgggagcgg atttgaacgt 5640tgcgaagcaa cggcccggag ggtggcgggc aggacgcccg ccataaactg ccaggcatca 5700aattaagcag aaggccatcc tgacggatgg cctttttgcg tttctacaaa ctcttcctgt 5760cgtcatatct acaagccggc gcgccgggaa atgtgcgcgg aacccctatt tgtttatttt 5820tctaaataca ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat 5880aatattgaaa aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt 5940ttgcggcatt ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg 6000ctgaagatca gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga 6060tccttgagag ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc 6120tatgtggcgc ggtattatcc cgtgttgacg ccgggcaaga gcaactcggt cgccgcatac 6180actattctca gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg 6240gcatgacagt aagagaatta tgcagtgctg caataaccat gagtgataac actgcggcca 6300acttacttct gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg 6360gggatcatgt aactcgcctt gatcgttggg aaccggagct gaatgaagcc ataccaaacg 6420acgagcgtga caccacgatg cctgcagcaa tggcaacaac gttgcgcaaa ctattaactg 6480gcgaactact tactctagct tcccggcaac aattaataga ctggatggag gcggataaag 6540ttgcaggacc acttctgcgc tcggcccttc cggctagctg gtttattgct gataaatctg 6600gagccggtga gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct 6660cccgtatcgt agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac 6720agatcgctga gataggtgcc tcactgatta agcattggta actgcagacc aagtttactc 6780atatatactt tagattgatt taaaacttca tttttaattt aaaaggatct aggtgaagat 6840cctttttgat aatctcatga ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc 6900agaccccgta gaaaagatca aaggatcttc ttgagatcct ttttttctgc gcgtaatctg 6960ctgcttgcaa acaaaaaaac caccgctacc agcggtggtt tgtttgccgg atcaagagct 7020accaactctt tttccgaagg taactggctt cagcagagcg cagataccaa atactgtcct 7080tctagtgtag ccgtagttag gccaccactt caagaactct gtagcaccgc ctacatacct 7140cgctctgcta atcctgttac cagtggctgc tgccagtggc gataagtcgt gtcttaccgg 7200gttggactca agacgatagt 
taccggataa ggcgcagcgg tcgggctgaa cggggggttc 7260gtgcacacag cccagcttgg agcgaacgac ctacaccgaa ctgagatacc tacagcgtga 7320gctatgagaa agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg 7380cagggtcgga acaggagagc gcacgaggga gcttccaggg ggaaacgcct ggtatcttta 7440tagtcctgtc gggtttcgcc acctctgact tgagcgtcga tttttgtgat gctcgtcagg 7500ggggcggagc ctatggaaaa acgccagcaa cgcggccttt ttacggttcc tggccttttg 7560ctgg 7564247551DNAArtificial SequenceDesigned sequence in accordance with embodiments 24gcggccgcga tctctcacct accaaacaat gcccccctgc aaaaaataaa ttcatataaa 60aaacatacag ataaccatct gcggtgataa attatctctg gcggtgttga cataaatacc 120actggcggtg atactgagca cgggtaccgg ccgctgagaa aaagcgaagc ggcactgctc 180tttaacaatt tatcagacaa tctgtgtggg cactcgaaga tacggattct taacgtcgca 240agacgaaaaa tgaataccaa gtctcaagag tgaacacgta attcattacg aagtttaatt 300ctttgagcgt caaactttta aattgaagag tttgatcatg gctcagattg aacgctggcg 360gcaggcctaa cacatgcaag tcgaacggta acaggaagaa gcttgcttct ttgctgacga 420gtggcggacg ggtgagtaat gtctgggaaa ctgcctgatg gagggggata actactggaa 480acggtagcta ataccgcata acgtcgcaag accaaagagg gggaccttcg ggcctcttgc 540catcggatgt gcccagatgg gattagctag taggtggggt aacggctcac ctaggcgacg 600atccctagct ggtctgagag gatgaccagc cacactggaa ctgagacacg gtccagactc 660ctacgggagg cagcagtggg gaatattgca caatgggcgc aagcctgatg cagccatgcc 720gcgtgtatga agaaggcctt cgggttgtaa agtactttca gcggggagga agggagtaaa 780gttaatacct ttgctcattg acgttacccg cagaagaagc accggctaac tccgtgccag 840cagccgcggt aatacggagg gtgcaagcgt taatcggaat tactgggcgt aaagcgcacg 900caggcggttt gttaagtcag atgtgaaatc cccgggctca acctgggaac tgcatctgat 960actggcaagc ttgagtctcg tagagggggg tagaattcca ggtgtagcgg tgaaatgcgt 1020agagatctgg aggaataccg gtggcgaagg cggccccctg gacgaagact gacgctcagg 1080tgcgaaagcg tggggagcaa acaggattag ataccctggt agtccacgcc gtaaacgatg 1140tcgacttgga ggttgtgccc ttgaggcgtg gcttccggag ctaacgcgtt aagtcgaccg 1200cctggggagt acggccgcaa ggttaaaact caaatgaatt gacgggggcc cgcacaagcg 1260gtggagcatg tggtttaatt cgatgcaacg cgaagaacct tacctggtct 
tgacatccac 1320ggaagttttc agagatgaga atgtgccttc gggaaccgtg agacaggtgc tgcatggctg 1380tcgtcagctc gtgttgtgaa atgttgggtt aagtcccgca acgagcgcaa cccttatcct 1440ttgttgccag cggtccggcc gggaactcaa aggagactgc cagtgataaa ctggaggaag 1500gtggggatga cgtcaagtca tcatggccct tacgaccagg gctacacacg tgctacaatg 1560gcgcatacaa agagaagcga cctcgcgaga gcaagcggac ctcataaagt gcgtcgtagt 1620ccggattgga gtctgcaact cgactccatg aagtcggaat cgctagtaat cgtggatcag 1680aatgccacgg tgaatacgtt cccgggcctt gtacacaccg cccgtcacac catgggagtg 1740ggttgcaaaa gaagtaggta gctgucugac ucguacguag ggcgucacag gagauacuaa 1800gacagaauau accccccgtt gagctaaccg gtactaatga accgtgaggc ttaaccgaga 1860ggttaagcga ctaagcgtac acggtggatg ccctggcagt cagaggcgat gaaggacgtg 1920ctaatctgcg ataagcgtcg gtaaggtgat atgaaccgtt ataaccggcg atttccgaat 1980ggggaaaccc agtgtgtttc gacacactat cattaactga atccataggt taatgaggcg 2040aaccggggga actgaaacat ctaagtaccc cgaggaaaag aaatcaaccg agattccccc 2100agtagcggcg agcgaacggg gagcagccca gagcctgaat cagtgtgtgt gttagtggaa 2160gcgtctggaa aggcgcgcga tacagggtga cagccccgta cacaaaaatg cacatgctgt 2220gagctcgatg agtagggcgg gacacgtggt atcctgtctg aatatggggg gaccatcctc 2280caaggctaaa tactcctgac tgaccgatag tgaaccagta ccgtgaggga aaggcgaaaa 2340gaaccccggc gaggggagtg aaaaagaacc tgaaaccgtg tacgtacaag cagtgggagc 2400acgcttaggc gtgtgactgc gtaccttttg tataatgggt cagcgactta tattctgtag 2460caaggttaac cgaatagggg agccgaaggg aaaccgagtc ttaactgggc gttaagttgc 2520agggtataga cccgaaaccc
ggtgatctag ccatgggcag gttgaaggtt gggtaacact 2580aactggagga ccgaaccgac taatgttgaa aaattagcgg atgacttgtg gctgggggtg 2640aaaggccaat caaaccggga gatagctggt tctccccgaa agctatttag gtagcgcctc 2700gtgaattcat ctccgggggt agagcactgt ttcggcaagg gggtcatccc gacttaccaa 2760cccgatgcaa actgcgaata ccggagaatg ttatcacggg agacacacgg cgggtgctaa 2820cgtccgtcgt gaagagggaa acaacccaga ccgccagcta aggtcccaaa gtcatggtta 2880agtgggaaac gatgtgggaa ggcccagaca gccaggatgt tggcttagaa gcagccatca 2940tttaaagaaa gcgtaatagc tcactggtcg agtcggcctg cgcggaagat gtaacggggc 3000taaaccatgc accgaagctg cggcagcgac gcttatgcgt tgttgggtag gggagcgttc 3060tgtaagcctg cgaaggtgtg ctgtgaggca tgctggaggt atcagaagtg cgaatgctga 3120cataagtaac gataaagcgg gtgaaaagcc cgctcgccgg aagaccaagg gttcctgtcc 3180aacgttaatc ggggcagggt gagtcgaccc ctaaggcgag gccgaaaggc gtagtcgatg 3240ggaaacaggt taatattcct gtacttggtg ttactgcgaa ggggggacgg agaaggctat 3300gttggccggg cgacggttgt cccggtttaa gcgtgtaggc tggttttcca ggcaaatccg 3360gaaaatcaag gctgaggcgt gatgacgagg cactacggtg ctgaagcaac aaatgccctg 3420cttccaggaa aagcctctaa gcatcaggta acatcaaatc gtaccccaaa ccgacacagg 3480tggtcaggta gagaatacca aggcgcttga gagaactcgg gtgaaggaac taggcaaaat 3540ggtgccgtaa cttcgggaga aggcacgctg atatgtaggt gaggtccctc gcggatggag 3600ctgaaatcag tcgaagatac cagctggctg caactgttta ttaaaaacac agcactgtgc 3660aaacacgaaa gtggacgtat acggtgtgac gcctgcccgg tgccggaagg ttaattgatg 3720gggttagcgc aagcgaagct cttgatcgaa gccccggtaa acggcggccg taactataac 3780ggtcctaagg tagcgaaatt ccttgtcggg taagttccga cctgcacgaa tggcgtaatg 3840atggccaggc tgtctccacc cgagactcag tgaaattgaa ctcgctgtga agatgcagtg 3900tacccgcggc aagacggaaa gaccccgtga acctttacta tagcttgaca ctgaacattg 3960agccttgatg tgtaggatag gtgggaggct ttgaagtgtg gacgccagtc tgcatggagc 4020cgaccttgaa ataccaccct ttaatgtttg atgttctaac gttgacccgt aatccgggtt 4080gcggacagtg tctggtgggt agtttgactg gggcggtctc ctcctaaaga gtaacggagg 4140agcacgaagg ttggctaatc ctggtcggac atcaggaggt tagtgcaatg gcataagcca 4200gcttgactgc gagcgtgacg gcgcgagcag gtgcgaaagc aggtcatagt 
gatccggtgg 4260ttctgaatgg aagggccatc gctcaacgga taaaaggtac tccggggata acaggctgat 4320accgcccaag agttcatatc gacggcggtg tttggcacct cgatgtcggc tcatcacatc 4380ctggggctga agtaggtccc aagggtatgg ctgttcgcca tttaaagtgg tacgcgagct 4440gggtttagaa cgtcgtgaga cagttcggtc cctatctgcc gtgggcgctg gagaactgag 4500gggggctgct cctagtacga gaggaccgga gtggacgcat cactggtgtt cgggttgtca 4560tgccaatggc actgcccggt agctaaatgc ggaagagata agtgctgaaa gcatctaagc 4620acgaaacttg ccccgagatg agttctccct gaccctttaa gggtcctgaa ggaacgttga 4680agacgacgac gttgataggc cgggtgtgta agcggggggu auauucaaug acgguauccg 4740agcugugacg cuggccuacg caaaucagcc ggcgcttacc actttgtgat tcatgactgg 4800ggtgaagtcg taacaaggta accgtagggg aacctgcggt tggatcacct ccttacctta 4860aagaagcgta ctttgtagtg ctcacacaga ttgtctgata gaaagtgaaa agcaaggcgt 4920ttacgcgttg ggagtgaggc tgaagagaat aaggccgttc gctttctatt aatgaaagct 4980caccctacac gaaaatatca cgcaacgcgt gataagcaat tttcgtgtcc ccttcgtcta 5040gaggcccagg acaccgccct ttcacggcgg taacaggggt tcgaatcccc taggggacgc 5100cacttgctgg tttgtgagtg aaagtcgccg accttaatat ctcaaaactc atcttcgggt 5160gatgtttgag atatttgctc tttaaaaatc tggatcaagc tgaaaattga aacactgaac 5220aacgagagtt gttcgtgagt ctctcaaatt ttcgcaacac gatgatgaat cgaaagaaac 5280atcttcgggt tgtgagctta agcttacaac gccgaagctg ttttggcgga tgagagaaga 5340ttttcagcct gatacagatt aaatcagaac gcagaagcgg tctgataaaa cagaatttgc 5400ctggcggcag tagcgcggtg gtcccacctg accccatgcc gaactcagaa gtgaaacgcc 5460gtagcgccga tggtagtgtg gggtctcccc atgcgagagt agggaactgc caggcatcaa 5520ataaaacgaa aggctcagtc gaaagactgg gcctttcgtt ttatctgttg tttgtcggtg 5580aacgctctcc tgagtaggac aaatccgccg ggagcggatt tgaacgttgc gaagcaacgg 5640cccggagggt ggcgggcagg acgcccgcca taaactgcca ggcatcaaat taagcagaag 5700gccatcctga cggatggcct ttttgcgttt ctacaaactc ttcctgtcgt catatctaca 5760agccggcgcg ccgggaaatg tgcgcggaac ccctatttgt ttatttttct aaatacattc 5820aaatatgtat ccgctcatga gacaataacc ctgataaatg cttcaataat attgaaaaag 5880gaagagtatg agtattcaac atttccgtgt cgcccttatt cccttttttg cggcattttg 5940ccttcctgtt tttgctcacc 
cagaaacgct ggtgaaagta aaagatgctg aagatcagtt 6000gggtgcacga gtgggttaca tcgaactgga tctcaacagc ggtaagatcc ttgagagttt 6060tcgccccgaa gaacgttttc caatgatgag cacttttaaa gttctgctat gtggcgcggt 6120attatcccgt gttgacgccg ggcaagagca actcggtcgc cgcatacact attctcagaa 6180tgacttggtt gagtactcac cagtcacaga aaagcatctt acggatggca tgacagtaag 6240agaattatgc agtgctgcaa taaccatgag tgataacact gcggccaact tacttctgac 6300aacgatcgga ggaccgaagg agctaaccgc ttttttgcac aacatggggg atcatgtaac 6360tcgccttgat cgttgggaac cggagctgaa tgaagccata ccaaacgacg agcgtgacac 6420cacgatgcct gcagcaatgg caacaacgtt gcgcaaacta ttaactggcg aactacttac 6480tctagcttcc cggcaacaat taatagactg gatggaggcg gataaagttg caggaccact 6540tctgcgctcg gcccttccgg ctagctggtt tattgctgat aaatctggag ccggtgagcg 6600tgggtctcgc ggtatcattg cagcactggg gccagatggt aagccctccc gtatcgtagt 6660tatctacacg acggggagtc aggcaactat ggatgaacga aatagacaga tcgctgagat 6720aggtgcctca ctgattaagc attggtaact gcagaccaag tttactcata tatactttag 6780attgatttaa aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat 6840ctcatgacca aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa 6900aagatcaaag gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca 6960aaaaaaccac cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt 7020ccgaaggtaa ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg 7080tagttaggcc accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc 7140ctgttaccag tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga 7200cgatagttac cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc 7260agcttggagc gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc 7320gccacgcttc ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca 7380ggagagcgca cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg 7440tttcgccacc tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta 7500tggaaaaacg ccagcaacgc ggccttttta cggttcctgg ccttttgctg g 7551257561DNAArtificial SequenceDesigned sequence in accordance with embodiments 25gcggccgcga tctctcacct accaaacaat gcccccctgc aaaaaataaa 
ttcatataaa 60aaacatacag ataaccatct gcggtgataa attatctctg gcggtgttga cataaatacc 120actggcggtg atactgagca cgggtaccgg ccgctgagaa aaagcgaagc ggcactgctc 180tttaacaatt tatcagacaa tctgtgtggg cactcgaaga tacggattct taacgtcgca 240agacgaaaaa tgaataccaa gtctcaagag tgaacacgta attcattacg aagtttaatt 300ctttgagcgt caaactttta aattgaagag tttgatcatg gctcagattg aacgctggcg 360gcaggcctaa cacatgcaag tcgaacggta acaggaagaa gcttgcttct ttgctgacga 420gtggcggacg ggtgagtaat gtctgggaaa ctgcctgatg gagggggata actactggaa 480acggtagcta ataccgcata acgtcgcaag accaaagagg gggaccttcg ggcctcttgc 540catcggatgt gcccagatgg gattagctag taggtggggt aacggctcac ctaggcgacg 600atccctagct ggtctgagag gatgaccagc cacactggaa ctgagacacg gtccagactc 660ctacgggagg cagcagtggg gaatattgca caatgggcgc aagcctgatg cagccatgcc 720gcgtgtatga agaaggcctt cgggttgtaa agtactttca gcggggagga agggagtaaa 780gttaatacct ttgctcattg acgttacccg cagaagaagc accggctaac tccgtgccag 840cagccgcggt aatacggagg gtgcaagcgt taatcggaat tactgggcgt aaagcgcacg 900caggcggttt gttaagtcag atgtgaaatc cccgggctca acctgggaac tgcatctgat 960actggcaagc ttgagtctcg tagagggggg tagaattcca ggtgtagcgg tgaaatgcgt 1020agagatctgg aggaataccg gtggcgaagg cggccccctg gacgaagact gacgctcagg 1080tgcgaaagcg tggggagcaa acaggattag ataccctggt agtccacgcc gtaaacgatg 1140tcgacttgga ggttgtgccc ttgaggcgtg gcttccggag ctaacgcgtt aagtcgaccg 1200cctggggagt acggccgcaa ggttaaaact caaatgaatt gacgggggcc cgcacaagcg 1260gtggagcatg tggtttaatt cgatgcaacg cgaagaacct tacctggtct tgacatccac 1320ggaagttttc agagatgaga atgtgccttc gggaaccgtg agacaggtgc tgcatggctg 1380tcgtcagctc gtgttgtgaa atgttgggtt aagtcccgca acgagcgcaa cccttatcct 1440ttgttgccag cggtccggcc gggaactcaa aggagactgc cagtgataaa ctggaggaag 1500gtggggatga cgtcaagtca tcatggccct tacgaccagg gctacacacg tgctacaatg 1560gcgcatacaa agagaagcga cctcgcgaga gcaagcggac ctcataaagt gcgtcgtagt 1620ccggattgga gtctgcaact cgactccatg aagtcggaat cgctagtaat cgtggatcag 1680aatgccacgg tgaatacgtt cccgggcctt gtacacaccg cccgtcacac catgggagtg 1740ggttgcaaaa gaagtaggta gctgggtaac 
tgtcgtatag atggaagtgg attcacaccc 1800gatataagac agaatatgga ccccgttgag ctaaccggta ctaatgaacc gtgaggctta 1860accgagaggt taagcgacta agcgtacacg gtggatgccc tggcagtcag aggcgatgaa 1920ggacgtgcta atctgcgata agcgtcggta aggtgatatg aaccgttata accggcgatt 1980tccgaatggg gaaacccagt gtgtttcgac acactatcat taactgaatc cataggttaa 2040tgaggcgaac cgggggaact gaaacatcta agtaccccga ggaaaagaaa tcaaccgaga 2100ttcccccagt agcggcgagc gaacggggag cagcccagag cctgaatcag tgtgtgtgtt 2160agtggaagcg tctggaaagg cgcgcgatac agggtgacag ccccgtacac aaaaatgcac 2220atgctgtgag ctcgatgagt agggcgggac acgtggtatc ctgtctgaat atggggggac 2280catcctccaa ggctaaatac tcctgactga ccgatagtga accagtaccg tgagggaaag 2340gcgaaaagaa ccccggcgag gggagtgaaa aagaacctga aaccgtgtac gtacaagcag 2400tgggagcacg cttaggcgtg tgactgcgta ccttttgtat aatgggtcag cgacttatat 2460tctgtagcaa ggttaaccga ataggggagc cgaagggaaa ccgagtctta actgggcgtt 2520aagttgcagg gtatagaccc gaaacccggt gatctagcca tgggcaggtt gaaggttggg 2580taacactaac tggaggaccg aaccgactaa tgttgaaaaa ttagcggatg acttgtggct 2640gggggtgaaa ggccaatcaa accgggagat agctggttct ccccgaaagc tatttaggta 2700gcgcctcgtg aattcatctc cgggggtaga gcactgtttc ggcaaggggg tcatcccgac 2760ttaccaaccc gatgcaaact gcgaataccg gagaatgtta tcacgggaga cacacggcgg 2820gtgctaacgt ccgtcgtgaa gagggaaaca acccagaccg ccagctaagg tcccaaagtc 2880atggttaagt gggaaacgat gtgggaaggc ccagacagcc aggatgttgg cttagaagca 2940gccatcattt aaagaaagcg taatagctca ctggtcgagt cggcctgcgc ggaagatgta 3000acggggctaa accatgcacc gaagctgcgg cagcgacgct tatgcgttgt tgggtagggg 3060agcgttctgt aagcctgcga aggtgtgctg tgaggcatgc tggaggtatc agaagtgcga 3120atgctgacat aagtaacgat aaagcgggtg aaaagcccgc tcgccggaag accaagggtt 3180cctgtccaac gttaatcggg gcagggtgag tcgaccccta aggcgaggcc gaaaggcgta 3240gtcgatggga aacaggttaa tattcctgta cttggtgtta ctgcgaaggg gggacggaga 3300aggctatgtt ggccgggcga cggttgtccc ggtttaagcg tgtaggctgg ttttccaggc 3360aaatccggaa aatcaaggct gaggcgtgat gacgaggcac tacggtgctg aagcaacaaa 3420tgccctgctt ccaggaaaag cctctaagca tcaggtaaca tcaaatcgta ccccaaaccg 
3480acacaggtgg tcaggtagag aataccaagg cgcttgagag aactcgggtg aaggaactag 3540gcaaaatggt gccgtaactt cgggagaagg cacgctgata tgtaggtgag gtccctcgcg 3600gatggagctg aaatcagtcg aagataccag ctggctgcaa ctgtttatta aaaacacagc 3660actgtgcaaa cacgaaagtg gacgtatacg gtgtgacgcc tgcccggtgc cggaaggtta 3720attgatgggg ttagcgcaag cgaagctctt gatcgaagcc ccggtaaacg gcggccgtaa 3780ctataacggt cctaaggtag cgaaattcct tgtcgggtaa gttccgacct gcacgaatgg 3840cgtaatgatg gccaggctgt ctccacccga gactcagtga aattgaactc gctgtgaaga 3900tgcagtgtac ccgcggcaag acggaaagac cccgtgaacc tttactatag cttgacactg 3960aacattgagc cttgatgtgt aggataggtg ggaggctttg aagtgtggac gccagtctgc 4020atggagccga ccttgaaata ccacccttta atgtttgatg ttctaacgtt gacccgtaat 4080ccgggttgcg gacagtgtct ggtgggtagt ttgactgggg cggtctcctc ctaaagagta 4140acggaggagc acgaaggttg gctaatcctg gtcggacatc aggaggttag tgcaatggca 4200taagccagct tgactgcgag cgtgacggcg cgagcaggtg cgaaagcagg tcatagtgat 4260ccggtggttc tgaatggaag ggccatcgct caacggataa aaggtactcc ggggataaca 4320ggctgatacc gcccaagagt tcatatcgac ggcggtgttt ggcacctcga tgtcggctca 4380tcacatcctg gggctgaagt aggtcccaag ggtatggctg ttcgccattt aaagtggtac 4440gcgagctggg tttagaacgt cgtgagacag ttcggtccct atctgccgtg ggcgctggag 4500aactgagggg ggctgctcct agtacgagag gaccggagtg gacgcatcac tggtgttcgg 4560gttgtcatgc caatggcact gcccggtagc taaatgcgga agagataagt gctgaaagca 4620tctaagcacg aaacttgccc cgagatgagt tctccctgac cctttaaggg tcctgaagga 4680acgttgaaga cgacgacgtt gataggccgg gtgtgtaagg gggtccatat tcaatgacgt 4740atcgaaggtg tgaatccatg gactatacga ttgttacccc ggcgcttacc actttgtgat 4800tcatgactgg ggtgaagtcg taacaaggta accgtagggg aacctgcggt tggatcacct 4860ccttacctta aagaagcgta ctttgtagtg ctcacacaga ttgtctgata gaaagtgaaa 4920agcaaggcgt ttacgcgttg ggagtgaggc tgaagagaat aaggccgttc gctttctatt 4980aatgaaagct caccctacac gaaaatatca cgcaacgcgt gataagcaat tttcgtgtcc 5040ccttcgtcta gaggcccagg acaccgccct ttcacggcgg taacaggggt tcgaatcccc 5100taggggacgc cacttgctgg tttgtgagtg aaagtcgccg accttaatat ctcaaaactc 5160atcttcgggt gatgtttgag atatttgctc 
tttaaaaatc tggatcaagc tgaaaattga 5220aacactgaac aacgagagtt gttcgtgagt ctctcaaatt ttcgcaacac gatgatgaat 5280cgaaagaaac atcttcgggt tgtgagctta agcttacaac gccgaagctg ttttggcgga 5340tgagagaaga ttttcagcct gatacagatt aaatcagaac gcagaagcgg tctgataaaa 5400cagaatttgc ctggcggcag tagcgcggtg gtcccacctg accccatgcc gaactcagaa 5460gtgaaacgcc gtagcgccga tggtagtgtg gggtctcccc atgcgagagt agggaactgc 5520caggcatcaa ataaaacgaa aggctcagtc gaaagactgg gcctttcgtt ttatctgttg 5580tttgtcggtg aacgctctcc tgagtaggac aaatccgccg ggagcggatt tgaacgttgc 5640gaagcaacgg cccggagggt ggcgggcagg acgcccgcca taaactgcca ggcatcaaat 5700taagcagaag gccatcctga cggatggcct ttttgcgttt ctacaaactc ttcctgtcgt 5760catatctaca agccggcgcg ccgggaaatg tgcgcggaac ccctatttgt ttatttttct 5820aaatacattc aaatatgtat ccgctcatga gacaataacc ctgataaatg cttcaataat 5880attgaaaaag gaagagtatg agtattcaac atttccgtgt cgcccttatt cccttttttg 5940cggcattttg ccttcctgtt tttgctcacc cagaaacgct ggtgaaagta aaagatgctg 6000aagatcagtt gggtgcacga gtgggttaca tcgaactgga tctcaacagc ggtaagatcc 6060ttgagagttt tcgccccgaa gaacgttttc caatgatgag cacttttaaa gttctgctat 6120gtggcgcggt attatcccgt gttgacgccg ggcaagagca actcggtcgc cgcatacact 6180attctcagaa tgacttggtt gagtactcac cagtcacaga aaagcatctt acggatggca 6240tgacagtaag agaattatgc agtgctgcaa taaccatgag tgataacact gcggccaact 6300tacttctgac aacgatcgga ggaccgaagg agctaaccgc ttttttgcac aacatggggg 6360atcatgtaac tcgccttgat cgttgggaac cggagctgaa tgaagccata ccaaacgacg 6420agcgtgacac cacgatgcct gcagcaatgg caacaacgtt gcgcaaacta ttaactggcg 6480aactacttac tctagcttcc cggcaacaat taatagactg gatggaggcg gataaagttg 6540caggaccact tctgcgctcg gcccttccgg ctagctggtt tattgctgat aaatctggag 6600ccggtgagcg tgggtctcgc ggtatcattg cagcactggg gccagatggt aagccctccc 6660gtatcgtagt tatctacacg acggggagtc aggcaactat ggatgaacga aatagacaga 6720tcgctgagat aggtgcctca ctgattaagc attggtaact gcagaccaag tttactcata 6780tatactttag attgatttaa aacttcattt ttaatttaaa aggatctagg tgaagatcct 6840ttttgataat ctcatgacca aaatccctta acgtgagttt tcgttccact gagcgtcaga 
6900ccccgtagaa aagatcaaag gatcttcttg agatcctttt tttctgcgcg taatctgctg 6960cttgcaaaca aaaaaaccac cgctaccagc ggtggtttgt ttgccggatc aagagctacc 7020aactcttttt ccgaaggtaa ctggcttcag cagagcgcag ataccaaata ctgtccttct 7080agtgtagccg tagttaggcc accacttcaa gaactctgta gcaccgccta catacctcgc 7140tctgctaatc ctgttaccag tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt 7200ggactcaaga cgatagttac cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg 7260cacacagccc agcttggagc gaacgaccta caccgaactg agatacctac agcgtgagct 7320atgagaaagc gccacgcttc ccgaagggag aaaggcggac aggtatccgg taagcggcag 7380ggtcggaaca ggagagcgca cgagggagct tccaggggga aacgcctggt atctttatag 7440tcctgtcggg tttcgccacc tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg 7500gcggagccta tggaaaaacg ccagcaacgc ggccttttta cggttcctgg ccttttgctg 7560g 756126187DNAArtificial SequenceDesigned sequence in accordance with embodiments 26ggaacagctc gagtagagct gaaagcgata tggtacgacc caggagtccg gcatatcacg 60acgaaacaga cctgaggaaa ctcaggtctg tggagtgata tgggaagaaa ctgcggacga 120gaaactgggt cgtacctaag tcgcaaacgc atcgagtaga tgcgaacaaa gaaacaacaa 180caacaac 18727192DNAArtificial SequenceDesigned sequence in accordance with embodiments 27ggaacagctc gagtagagct gaaagcgata tggacaccgt atgtgcgtat acacggctca 60tcactctcat cggaccctgg aaacagggtc cgtgcgtgat gagggaagaa actgcgtgta 120tactctcatc atacggtgtc ctaagtcgca aacgcatcga gtagatgcga acaaagaaac 180aacaacaaca ac 19228196DNAArtificial SequenceDesigned sequence in accordance with embodiments 28ggaacagctc gagtagagct gaaagcgata tggcatggaa tcagctcaag gaactgtgaa 60cgtatatcgg gcaacgacta ggaaactagt cgttgggaag aaactgccga tatacgggag 120ttccttgagc gggagattcc atgcctaagt cgcaaacgca tcgagtagat gcgaacaaag 180aaacaacaac aacaac 19629197DNAArtificial SequenceDesigned sequence in accordance with embodiments 29ggaacagctc gagtagagct gaaagcgata tggtatacat gtgaagcgtg agggacgacg 60aaatataggc tggctttcgc tcggaaacga gcgaaaggga agaaactgca gcctatatgg 120agtccctcac gtggaacatg tatacctaag tcgcaaacgc atcgagtaga tgcgaacaaa 180gaaacaacaa caacaac 
19730169DNAArtificial SequenceDesigned sequence in accordance with embodiments 30ggaacagctc gagtagagct gaaagcgata tggtacgtat agcaccgtga actactccgg 60catgggtcgg aaacgaccca tgggaagaaa ctgcggagca cggttcccta tacgtaccta 120agtcgcaaac gcatcgagta gatgcgaaca aagaaacaac aacaacaac 16931177DNAArtificial SequenceDesigned sequence in accordance with embodiments 31ggaacagctc gagtagagct gaaagcgata tgggtccgtg agtctccgag tatgaccggc 60gtgacgttgg aaacaacgtc acgggaagaa actgcggtca tacagtgaaa ggagtctcac 120ggaccctaag tcgcaaacgc atcgagtaga tgcgaacaaa gaaacaacaa caacaac 17732174DNAArtificial SequenceDesigned sequence in accordance with embodiments 32ggaacagctc gagtagagct gaaagcgata tggaactcgt cgggaggata tatagaccgg 60catatggtgg aaacaccata tgggaagaaa ctgcggtcat atatccgaag aacgacgagt 120tcctaagtcg caaacgcatc gagtagatgc gaacaaagaa acaacaacaa caac 17433192DNAArtificial SequenceDesigned sequence in accordance with embodiments 33ggaacagctc gagtagagct gaaagcgata tggatgggtg tgtgcgaccc aatggcgtat 60ggagggctat gtccacggaa acgtggacat agggaagaaa ctgcctccat acttgaattg 120ggtctctcat cacacccatc ctaagtcgca aacgcatcga gtagatgcga acaaagaaac 180aacaacaaca ac

19234174DNAArtificial SequenceDesigned sequence in accordance with embodiments 34ggaacagctc gagtagagct gaaagcgata tgggagcgtg tgggagcatc cgaactacgt 60atggcattcg ggaaaccgaa tgggaagaaa ctgcatacgc ggatgcgaag aacacacgct 120ccctaagtcg caaacgcatc gagtagatgc gaacaaagaa acaacaacaa caac 17435180DNAArtificial SequenceDesigned sequence in accordance with embodiments 35ggaacagctc gagtagagct gaaagcgata tggatgtggg catgaccgaa gaacggtacc 60ttgaatccgt caaaggaaac tttgacggat ggcggtaccg ggaggtcatg ggaagaaact 120gccacatcct aagtcgcaaa cgcatcgagt agatgcgaac aaagaaacaa caacaacaac 18036107DNAArtificial SequenceDesigned sequence in accordance with embodiments 36ggaacagctc gagtagagct gaaagggttg ggaagaaact gtggcacttc ggtgccagca 60acccaaacgc atcgagtaga tgcgaacaaa gaaacaacaa caacaac 10737203DNAArtificial SequenceDesigned sequence in accordance with embodiments 37ggaacagctc gagtagagct gaaagcgata tggtgaccac gaaggacggg tccacgattc 60aggaagtgta cgaagaaccg gagtagggaa acctactcct cagtattcgt acatggaaga 120atcgtgttga gtagagtgtg agcgtggtca cctaagtcgc aaacgcatcg agtagatgcg 180aacaaagaaa caacaacaac aac 20338196DNAArtificial SequenceDesigned sequence in accordance with embodiments 38ggaacagctc gagtagagct gaaagcgata tggtttctga aggacgggtc ccataagctg 60tgaactatgg agccaggcac gagaggaaac tctcgtgagc aactccatag ggagcttatg 120gttgagtaga gtgtgagcag aaacctaagt cgcaaacgca tcgagtagat gcgaacaaag 180aaacaacaac aacaac 19639196DNAArtificial SequenceDesigned sequence in accordance with embodiments 39ggaacagctc gagtagagct gaaagcgata tggatagaag gacgggtccc atattgcgag 60aaacgttata cggaaccgta cgaaggaaac ttcgtactca gtacgtataa cggagcaata 120tggttgagta gagtgtgagc tatcctaagt cgcaaacgca tcgagtagat gcgaacaaag 180aaacaacaac aacaac 19640190DNAArtificial SequenceDesigned sequence in accordance with embodiments 40ggaacagctc gagtagagct gaaagcgata tggctcgaag gacgggtccc atagggagct 60tgtacgagca acatacgagg aaactcgtat gccaggcgta caagcgaaga actatggttg 120agtagagtgt gagcgagcct aagtcgcaaa cgcatcgagt agatgcgaac aaagaaacaa 180caacaacaac 
19041195DNAArtificial SequenceDesigned sequence in accordance with embodiments 41ggaacagctc gagtagagct gaaagcgata tgggaggaag gacgggtccc acagagcgaa 60gaactattag aggaccgtat ttcggaaacg aaatactagt actctaatag ggagctctgt 120ggttgagtag agtgtgagcc tccctaagtc gcaaacgcat cgagtagatg cgaacaaaga 180aacaacaaca acaac 19542191DNAArtificial SequenceDesigned sequence in accordance with embodiments 42ggaacagctc gagtagagct gaaagcgata tggcttgaag gacgggtccc atacagcgac 60gaaagtgtct tcgagtacag aaggaaactt ctgtatagaa gacactggag ctgtatggtt 120gagtagagtg tgagcaagcc taagtcgcaa acgcatcgag tagatgcgaa caaagaaaca 180acaacaacaa c 19143179DNAArtificial SequenceDesigned sequence in accordance with embodiments 43ggaacagctc gagtagagct gaaagcgata tggctgtgga aggacgggtc ccatggaaga 60cgtcaccgaa gtcggaaacg acttcgcaag acgtcaggaa gtggttgagt agagtgtgag 120ccacagccta agtcgcaaac gcatcgagta gatgcgaaca aagaaacaac aacaacaac 17944196DNAArtificial SequenceDesigned sequence in accordance with embodiments 44ggaacagctc gagtagagct gaaagcgata tggattgaag gacgggtccc taggagcgcg 60aaacttatat aggaaccgtt cgtcggaaac gacgaactca gtactatata aggagctcct 120aggttgagta gagtgtgagc aatcctaagt cgcaaacgca tcgagtagat gcgaacaaag 180aaacaacaac aacaac 19645191DNAArtificial SequenceDesigned sequence in accordance with embodiments 45ggaacagctc gagtagagct gaaagcgata tggcagctaa cagacttata tggagagtcc 60tggaaggacg ggtccgaggg aaacctcgtt gagtagagtg tgagccagga ctcgacgaaa 120tataagtcgg gttagctgcc taagtcgcaa acgcatcgag tagatgcgaa caaagaaaca 180acaacaacaa c 19146197DNAArtificial SequenceDesigned sequence in accordance with embodiments 46ggaacagctc gagtagagct gaaagcgata tgggatcctg gcaagctgta cataagacag 60ctccatcgaa ggacgggtcc gtcggaaacg acgttgagta gagtgtgagc gatggagcaa 120tgacgtgtac agcgacccag gatccctaag tcgcaaacgc atcgagtaga tgcgaacaaa 180gaaacaacaa caacaac 19747198DNAArtificial SequenceDesigned sequence in accordance with embodiments 47ggaacagctc gagtagagct gaaagcgata tggtgctacc tcagtatcct accacatcag 60ctataacgaa ggacgggtcc gaaggaaact tcgttgagta 
gagtgtgagc gttatagcga 120cgacgtggta ggagaaccgg tagcacctaa gtcgcaaacg catcgagtag atgcgaacaa 180agaaacaaca acaacaac 19848196DNAArtificial SequenceDesigned sequence in accordance with embodiments 48ggaacagctc gagtagagct gaaagcgata tggcaagcac aaaagctcta tataagacag 60cttgtaggaa ggacgggtcc gaaggaaact tcgttgagta gagtgtgagc ctacaagcaa 120tgacgtatag agcgctgtgc ttgcctaagt cgcaaacgca tcgagtagat gcgaacaaag 180aaacaacaac aacaac 19649194DNAArtificial SequenceDesigned sequence in accordance with embodiments 49ggaacagctc gagtagagct gaaagcgata tggatcttag gacctcctat ggagtagctt 60gtatgaagga cgggtccgag ggaaacctcg ttgagtagag tgtgagcata caagcgcaga 120taccatagga gaccctaaga tcctaagtcg caaacgcatc gagtagatgc gaacaaagaa 180acaacaacaa caac 19450195DNAArtificial SequenceDesigned sequence in accordance with embodiments 50ggaacagctc gagtagagct gaaagcgata tggagctgta gcaagttgtt tcgtgccagc 60tacttgaagg acgggtccag aggaaactct gttgagtaga gtgtgagcaa gtagcggtaa 120tacgaaacaa cgacctacag ctcctaagtc gcaaacgcat cgagtagatg cgaacaaaga 180aacaacaaca acaac 19551193DNAArtificial SequenceDesigned sequence in accordance with embodiments 51ggaacagctc gagtagagct gaaagcgata tggagctcca ctaagctact gagggagctt 60gtacgaagga cgggtccgaa ggaaacttcg ttgagtagag tgtgagcgta caagctgtga 120actcagtagt atgtggagct cctaagtcgc aaacgcatcg agtagatgcg aacaaagaaa 180caacaacaac aac 19352179DNAArtificial SequenceDesigned sequence in accordance with embodiments 52ggaacagctc gagtagagct gaaagcgata tgggaagtca ccttggtcag gaagtatgga 60aggacgggtc cataggaaac tatgttgagt agagtgtgag ccatatggaa gaccaagcaa 120gacttcccta agtcgcaaac gcatcgagta gatgcgaaca aagaaacaac aacaacaac 1795384DNAArtificial SequenceDesigned sequence in accordance with embodiments 53ggacgcgacc gaaatggtga aggacgggtc cagtgcgaaa cacgcactgt tgagtagagt 60gtgagctccg taactggtcg cgtc 845459DNAArtificial SequenceDesigned sequence in accordance with embodiments 54ggcagtggaa ggacgggtcc ggcgtggaaa cacgccgttg agtagagtgt gagccactg 595551DNAArtificial SequenceDesigned sequence in accordance with 
embodiments 55ggagacggtc gggtccagat attcgtatct gtcgagtaga gtgtgggctc c 51
