Patent application title: METHODS AND SYSTEMS FOR PROCESSING BIOMASS MATERIAL
Corey William Radtke (Katy, TX, US)
Phillip Guy Hamilton (Sugar Land, TX, US)
Keith Michael Kreitman (Houston, TX, US)
SHELL OIL COMPANY
IPC8 Class: AC12P500FI
Class name: Chemistry: molecular biology and microbiology micro-organism, tissue cell culture or enzyme using process to synthesize a desired chemical compound or composition preparing hydrocarbon
Publication date: 2013-11-21
Patent application number: 20130309739
Embodiments of the present invention provide for efficient and economical
production and recovery of volatile organic compounds and hydrocarbons.
One embodiment comprises contacting a solid component of a biomass
material with a solution adapted to facilitate saccharification, and
contacting the at least one fermentable sugar with a microorganism
capable of using the at least one fermentable sugar to generate a
hydrocarbon. The solid component is generated by introducing a biomass
material to a compartment of a solventless recovery system, wherein the
biomass material contains one or more volatile organic compounds;
contacting the biomass material with a superheated vapor stream in the
compartment to vaporize at least a portion of an initial liquid content
in the biomass material; separating a vapor component and a solid
component from the heated biomass material; and retaining at least a
portion of the gas component for use as part of the superheated vapor stream.
1. A method for processing a biomass material comprising: (i) introducing
a biomass material to a compartment of a solventless recovery system,
wherein the biomass material contains one or more volatile organic
compounds; (ii) contacting the biomass material with a superheated vapor
stream in the compartment to vaporize at least a portion of an initial
liquid content in the biomass material, said superheated vapor stream
comprising at least one volatile organic compound; (iii) separating a
vapor component and a solid component from the heated biomass material,
said vapor component comprising at least one volatile organic compound;
retaining at least a portion of the gas component for use as part of the
superheated vapor stream; (iv) discharging the solid component from the
solventless recovery system, wherein the solid component comprises a
lignocellulosic material; (v) generating at least one fermentable sugar
from the lignocellulosic material; and (vi) contacting the at least one
fermentable sugar with a microorganism capable of using the at least one
fermentable sugar to generate a hydrocarbon.
2. The method of claim 1, wherein the microorganism is adapted to express at least one of a fatty acid reductase, a fatty aldehyde synthetase, a fatty acyl transferase, and an aldehyde decarbonylase.
3. The method of claim 2, wherein the microorganism comprises at least one nucleic acid encoding one or more of the amino acid sequences selected from the group consisting of SEQ ID NO:1 to SEQ ID NO:4.
4. The method of claim 3, wherein the microorganism is a yeast or a bacterium.
5. The method of claim 4, wherein the yeast is Saccharomyces cerevisiae and the bacterium is Escherichia coli.
6. The method of claim 2, wherein the microorganism is further adapted to express an acyl-ACP thioesterase.
7. The method of claim 6, wherein the microorganism is a genetically modified microorganism genetically modified to express an exogenous acyl-ACP thioesterase.
8. The method of claim 7, wherein the microorganism comprises at least one nucleic acid encoding the amino acid sequence of SEQ ID NO:5.
9. The method of claim 2, wherein the microorganism is further adapted to express a 3-ketoacyl-ACP synthase III.
10. The method of claim 9, wherein the host cell comprises at least one nucleic acid encoding the amino acid sequence of SEQ ID NO:6.
11. The method of claim 2, wherein the microorganism is further adapted to express a branched-chain ketodehydrogenase complex.
12. The method of claim 11, wherein the microorganism comprises at least one nucleic acid encoding one or more of the amino acid sequences selected from the group consisting of SEQ ID NO:7 to SEQ ID NO:10.
13. The method of claim 1, wherein the microorganism is adapted to express an aldehyde-generating acyl-ACP reductase and fatty aldehyde decarbonylase.
14. The method of claim 1 wherein the biomass material introduced to the compartment is obtained from a fermentation process of a harvested crop.
15. The method of claim 14 wherein the crop is selected from the group consisting of sorghum, sugar cane, corn, tropical corn, sugar beet, energy cane, and any combination thereof.
16. The method of claim 1 wherein the compartment comprises a cylindrical body in a shape of a loop within which the superheated vapor stream flows.
17. The method of claim 1 wherein the separating step is achieved using a cyclone separating component coupled to the compartment, wherein the cyclone separating component is configured to discharge the separated solid component from the compartment.
18. The method of claim 1, wherein the biomass material is generated by adding at least one additive to the biomass, wherein said at least one additive comprises a microbe and, optionally, an acid and/or an enzyme; and storing the prepared biomass material for at least about 24 hours in a storage facility to allow for the production of at least one volatile organic compound from at least a portion of the sugar.
19. A method for processing a biomass material comprising: contacting a solid component of a biomass material with a solution adapted to facilitate saccharification, and contacting the at least one fermentable sugar with a microorganism capable of using the at least one fermentable sugar to generate a hydrocarbon, wherein the solid component is generated by a method comprising: introducing a biomass material to a compartment of a solventless recovery system, wherein the biomass material contains one or more volatile organic compounds; contacting the biomass material with a superheated vapor stream in the compartment to vaporize at least a portion of an initial liquid content in the biomass material, said superheated vapor stream comprising at least one volatile organic compound; separating a vapor component and a solid component from the heated biomass material, said vapor component comprising at least one volatile organic compound; retaining at least a portion of the gas component for use as part of the superheated vapor stream; and discharging the solid component from the solventless recovery system.
CROSS-REFERENCE TO RELATED APPLICATIONS
 The present application claims priority to U.S. Provisional Application No. 61/648,109, filed on May 17, 2012, U.S. Provisional Application No. 61/786,844, filed on Mar. 15, 2013, and U.S. Provisional Application No. 61/786,860, filed on Mar. 15, 2013, the disclosures of which are incorporated by reference herein in their entirety.
FIELD OF THE INVENTION
 Embodiments of this invention relate generally to processing of biomass material and more particularly to manufacturing and recovery of volatile organic compounds and hydrocarbon compounds, using readily available fermentable sugar and fermentable sugar from further processing of lignocellulosic material in the biomass material.
BACKGROUND OF THE INVENTION
 This section is intended to introduce various aspects of the art, which may be associated with exemplary embodiments of the present invention. This discussion is believed to assist in providing a framework to facilitate a better understanding of particular aspects of the present invention. Accordingly, it should be understood that this section should be read in this light, and not necessarily as admissions of any prior art.
 As the world's petroleum supplies continue to diminish, there is a growing need for alternative materials that can be substituted for various petroleum products, particularly transportation fuels. A significant amount of effort has been placed on developing new methods and systems for providing energy from resources other than fossil fuels. Currently, much effort is underway to produce bioethanol and other transportation fuels and chemicals from renewable biomass materials. One type of biomass is plant biomass, which contains a high amount of carbohydrates including sugars, starches, celluloses, lignocelluloses, and hemicelluloses. Efforts have particularly been focused on ethanol from readily available fermentable sugar and ethanol from cellulosic materials.
 Conventional ethanol production from corn using readily available fermentable sugar typically competes with valuable food resources, which can be further amplified by increasingly more severe climate conditions, such as droughts and floods, which negatively impact the amount of crop harvested every year. The competition from conventional ethanol production can drive up food prices. While other crops have served as the biomass material for ethanol production, they usually are not suitable for global implementations due to the climate requirements of such crops. For instance, ethanol can also be efficiently produced from sugar cane, but only in certain areas of the world, such as Brazil, that have a climate that can support near-year-round harvest.
 Further, additional fermentable sugars can be freed from lignocellulosic biomass, which comprises hemicelluloses, cellulose and smaller portions of lignin and protein. Cellulose comprises sugars that can be converted into fuels and valuable chemicals, when they are liberated from the cell walls and polymers that contain them.
 Current processes aiming to process lignocellulosic biomass are limited to feedstock that includes unprocessed biomass materials or municipal solid waste (MSW). Unprocessed biomass includes sugarcane bagasse, forest resources, crop residues, and wet/dry harvested energy crops. These conventional feedstock sources require storage, transportation, particle size reduction, and additional front end processing before they can be introduced for further processing of lignocellulosic material. For example, baling of biomass is costly and can result in hazards such as fire, rodents, dust, unwanted debris (such as rocks), and hantavirus. Further, bales and forest resources are more costly to transport than denser material and more costly to handle than materials that are already particle size reduced and do not need to be further formatted. MSW further has challenges related to contamination with regulated hazardous metals that can contribute to risks of poor fuel quality as well as health and safety risks. Forest resources, such as trees, are cumbersome to transport. Further, forest resources require debarking, chopping to wood chips of desirable thickness, and washing to remove any residual soil, dirt and the like. Therefore, there is still a need for a biomass that addresses these challenges.
SUMMARY OF THE INVENTION
 Embodiments of the invention can address the challenges mentioned above as well as provide other advantages and features. In one embodiment, the feedstock can come from the solid component exiting a volatile organic compound recovery system. In that embodiment, the feedstock is already flowable in an engineered system, which allows the feedstock to be routed directly into the reactor to generate additional fermentable sugars as desired. Embodiments of the invention can allow the volatile organic compound recovery equipment, which recovers products from the fermentation phase, and the equipment for further processing of lignocellulosic material to be located near each other. The further processing can yield additional fermentable sugar that can be converted to hydrocarbons and/or other chemicals. Such embodiments can allow for production of volatile organic compounds from fermentation and further processing of lignocellulosic material, which reduces storage, handling, and transportation costs associated with other feedstock before it can enter the production flow of the further processing of lignocellulosic material. Such embodiments can also provide a continuous supply of feedstock that is already formatted, in contrast to conventional feedstock that often requires storage, transportation, and/or formatting at or prior to arriving at the biomass facility for processing of the lignocellulosic material, which reduces the associated costs.
 The feedstock of certain embodiments can also have lower handling and transportation costs when it is transported to other locations for processing of the lignocellulosic material. Unlike other conventional feedstock sources, such as forest resources, the feedstock of certain embodiments exits the volatile organic compound recovery system in a preformatted manner that is particle-size reduced, which can reduce or eliminate the front end processing costs before the feedstock can enter the processing of lignocellulosic material. The preformatted size distribution of the feedstock of certain embodiments of the invention places it in a denser form than other conventional feedstock sources, which can reduce transportation cost as more of the feedstock of these embodiments can be transported per volume. Embodiments of the invention can provide a supply of feedstock that is available year-round, independent of a harvest period particular to a biomass material, thus reducing storage needs and costs for the further processing plant, and that does not compete with valuable human food sources.
 In one embodiment, a biomass material is prepared to generate volatile organic compounds. The volatile organic compounds are recovered from the prepared biomass material by introducing the prepared biomass material to a compartment of a solventless recovery system; contacting the biomass material with a superheated vapor stream in the compartment to vaporize at least a portion of an initial liquid content in the prepared biomass material, the superheated vapor stream comprising at least one volatile organic compound; separating a vapor component and a solid component from the heated biomass material, where the vapor component comprises at least one volatile organic compound; and retaining at least a portion of the gas component for use as part of the superheated vapor stream. Compounds in the vapor component can be further purified through an appropriate distillation process. At least a portion of the solid component is further processed to generate additional fermentable sugars. In one embodiment, the further processing comprises contacting at least a portion of the solid component with a solution adapted to facilitate saccharification to generate additional fermentable sugars. In one embodiment, the additionally generated fermentable sugars are converted to at least one hydrocarbon compound through fermentation using one or more microorganism.
 In one embodiment, the one or more microorganisms include a recombinant host cell (or microorganism) adapted to produce a hydrocarbon as disclosed by PCT Application No. PCT/EP2013/053600, the disclosure of which is incorporated by reference in its entirety. For example, in one embodiment, the recombinant host cell is adapted to express at least one of a fatty acid reductase, a fatty aldehyde synthetase, a fatty acyl transferase, and an aldehyde decarbonylase, where at least one fatty acid reductase, at least one fatty aldehyde synthetase, and at least one fatty acyl transferase form a fatty acid reductase complex. Contact of a fatty acid substrate with a fatty aldehyde synthetase forms a fatty acid aldehyde. Contact of the fatty acid aldehyde with at least one aldehyde decarbonylase forms a hydrocarbon.
 The fatty acid substrate may be a fatty acid, a fatty acyl-ACP (fatty acyl-acyl carrier protein) or fatty acyl-CoA or a mixture of any of these. The fatty acid reductase complex may comprise a fatty acid reductase enzyme polypeptide having Enzyme Commission (EC) no. 1.2.1.50, for example having at least 50% sequence identity with SEQ ID NO:1 (Photorhabdus luminescens protein LuxC). Additionally or independently, the fatty acid reductase complex may comprise a fatty aldehyde synthetase enzyme polypeptide having EC no. 6.2.1.19, for example having at least 50% sequence identity with SEQ ID NO:2 (P. luminescens protein LuxE). Additionally or independently, the fatty acid reductase complex may comprise a fatty acyl transferase enzyme polypeptide in class EC 2.3.1.-, for example having at least 50% sequence identity to SEQ ID NO:3 (P. luminescens protein LuxD). Additionally or independently, the aldehyde decarbonylase may be in class EC 4.1.99.5, for example it may be a polypeptide having at least 50% sequence identity with SEQ ID NO:4 (Nostoc punctiforme aldehyde decarbonylase protein). In an exemplary embodiment, all of the enzymes having the sequences SEQ ID NOs:1-4 are used.
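 By way of illustration only, the "at least 50% sequence identity" criterion above can be sketched as a simple comparison of two pre-aligned sequences. The sketch below is written in Python and assumes one common convention that the text itself does not specify: identity is the number of identical residue pairs divided by the number of aligned columns, skipping columns where both sequences have a gap. The function names and the toy fragments are hypothetical and are not taken from SEQ ID NOs:1-4.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumption (not from the source text): identity is counted as
    identical residue pairs / aligned columns, ignoring columns in which
    both sequences carry a gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    columns = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap and res_b == gap:
            continue  # column carries no information
        columns += 1
        if res_a == res_b:  # both-gap columns were skipped, so this is a residue match
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_pct: float = 50.0) -> bool:
    """True if the pair reaches the stated minimum percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


# Hypothetical toy fragments for illustration only.
candidate = "MKT-LLVAGG"
reference = "MKTALLIAGA"
print(percent_identity(candidate, reference))
print(meets_identity_threshold(candidate, reference))
```

 In practice the alignment itself would come from a dedicated alignment tool, and other identity conventions (for example, dividing by the length of the shorter sequence) would give different percentages.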
 In one embodiment, the prepared biomass is generated by adding at least one additive to the biomass, wherein said at least one additive comprises a microbe and, optionally, an acid and/or an enzyme; and storing the prepared biomass material for at least about 24 hours in a storage facility to allow for the production of at least one volatile organic compound from at least a portion of the sugar.
 In addition to the features described above, embodiments of the invention allow for economical production of alternative fuels, such as ethanol, other volatile organic compounds, hydrocarbons, and other chemicals, from plants that contain fermentable sugar by addressing challenges, such as costs of storage and transportation, short harvest windows, quick degradation of sugars, and large investment in equipment. Aspects of the embodiments described herein are applicable to any biomass material, such as plants containing fermentable sugars. The features of embodiments of the present invention allow for economical use of various plants to produce alternative fuels and chemicals and are not limited to sorghum and other plants that suffer similar challenges. Such challenging crops are highlighted herein because other methods and systems have not been able to economically use these challenging crops to produce fuels and chemicals. As such, the specific mention of sorghum is not intended to be limiting, but rather illustrates one particular application of embodiments of the invention.
 Embodiments of the invention allow for the recovery facility to run continuously year-round in a controlled manner independent of the harvest window, thereby broadening the geographical locations available to place a recovery facility and/or a facility to process lignocellulosic material, including areas with a relatively short harvest window.
 Other advantages and features of embodiments of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
 These drawings illustrate certain aspects of some of the embodiments of the invention, and should not be used to limit or define the invention.
 FIG. 1 is a diagram of one embodiment to process biomass material according to certain aspects of the present invention.
 FIG. 2 is a diagram of another embodiment to process biomass material according to certain aspects of the present invention.
 FIG. 3 is a diagram of a particular embodiment for saccharification of a solid component according to aspects of the invention.
 FIG. 4 is a schematic detailing the genetic elements (solid lines) introduced into E. coli cells according to certain aspects of the invention.
 FIG. 5 shows the growth curve for various E. coli cultures obtained according to certain aspects of the invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
 Embodiments of the present invention can provide efficient and economical production and recovery of volatile organic compounds, such as ethanol and acetic acid, from solid biomass material, as well as a feedstock for further processing of lignocellulosic material to generate fermentable sugar for conversion by further fermentation, including production of hydrocarbon compounds. According to one aspect of the invention, a biomass material is prepared to generate volatile organic compounds. The volatile organic compounds are recovered from the prepared biomass material by introducing the prepared biomass material to a compartment of a solventless recovery system; contacting the biomass material with a superheated vapor stream in the compartment to vaporize at least a portion of an initial liquid content in the prepared biomass material, the superheated vapor stream comprising at least one volatile organic compound; separating a vapor component and a solid component from the heated biomass material, where the vapor component comprises at least one volatile organic compound; and retaining at least a portion of the gas component for use as part of the superheated vapor stream. At least a portion of the solid component is further processed to generate additional fermentable sugar. In one embodiment, the further processing comprises contacting at least a portion of the solid component with a solution adapted to facilitate saccharification. In one embodiment, the additionally generated fermentable sugars are fermented to produce a hydrocarbon.
 Biomass Preparation
 As used herein, the term "solid biomass" or "biomass" refers at least to biological matter from living, or recently living organisms. Solid biomass includes plant or animal matter that can be converted into fibers or other industrial chemicals, including biofuels. Solid biomass can be derived from numerous types of plants or trees, including miscanthus, switchgrass, hemp, corn, tropical poplar, willow, sorghum, sugarcane, sugar beet, and any energy cane, and a variety of tree species, ranging from eucalyptus to oil palm (palm oil). In one embodiment, the solid biomass comprises at least one fermentable sugar-producing plant. The solid biomass can comprise two or more different plant types, including fermentable sugar-producing plants. In a preferred embodiment not intended to limit the scope of the invention, sorghum is selected, due to its high yield on less productive lands and high sugar content.
 The term "fermentable sugar" refers to oligosaccharides and monosaccharides that can be used as a carbon source (e.g., pentoses and hexoses) by a microorganism to produce an organic product such as alcohols, organic acids, esters, and aldehydes, under anaerobic and/or aerobic conditions. Such production of an organic product can be referred to generally as fermentation. The at least one fermentable sugar-producing plant contains fermentable sugars dissolved in the water phase of the plant material at one point in time during its growth cycle. Non-limiting examples of fermentable sugar-producing plants include sorghum, sugarcane, sugar beet, and energy cane. In particular, sugarcane, energy cane, and sorghum typically contain from about 5% to about 25% soluble sugar w/w in the water phase and have moisture content between about 60% and about 80% on a wet basis when they are near or at their maximum potential fermentable sugar production (e.g., maximum fermentable sugar concentration).
 The term "wet basis" refers at least to the mass percentage that includes water as part of the mass. In a preferred embodiment, the sugar producing plant is sorghum. Any species or variety of the genus Sorghum that provides for the microbial conversion of carbohydrates to volatile organic compounds (VOCs) can be used. For embodiments using sorghum, the plant provides certain benefits, including being water-efficient, as well as drought and heat-tolerant. These properties make the crop suitable for many locations, including various regions across the earth, such as China, Africa, Australia, and in the US, such as portions of the High Plains, the West, and across South Texas.
 In embodiments using sorghum, the sorghum can include any variety or combination of varieties that may be harvested with higher concentrations of fermentable sugar. Certain varieties of sorghum with preferred properties are sometimes referred to as "sweet sorghum." The sorghum can include a variety that may or may not contain enough moisture to support the juicing process in a sugar cane mill operation. In a preferred embodiment, the solid biomass includes a Sugar T sorghum variety commercially produced by Advanta and/or a male parent of Sugar T, which is also a commercially available product of Advanta. In a preferred embodiment, the crop used has from about 5 to about 25 brix, preferably from about 10 to about 20 brix, and more preferably from about 12 to about 18 brix. The term "brix" herein refers at least to the content of glucose, fructose, and sucrose in an aqueous solution where one degree brix is 1 gram of glucose, fructose, and/or sucrose in 100 grams of solution and represents the strength of the solution as percentage by weight (% w/w). In another preferred embodiment, the moisture content of the crop used is from about 50% to 80%, preferably at least 60%.
 In one embodiment, the crop is a male parent of Sugar T with a brix value of about 18 and a moisture content of about 67%. In another embodiment, the crop is Sugar T with a brix value of about 12 at a moisture content of about 73%. In these particular embodiments, the brix and moisture content values were determined by handheld refractometer.
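 As a rough illustration of how the brix and wet-basis moisture definitions above combine, the following Python sketch estimates the soluble sugar carried by a given mass of wet crop. It rests on simplifying assumptions that are not stated in the text: all water in the crop is treated as part of the sugar solution, and the brix reading is treated as the true sugar mass fraction of that solution. The function name is hypothetical.

```python
def soluble_sugar_kg(wet_mass_kg: float, moisture_wet_basis: float, brix: float) -> float:
    """Estimate soluble sugar (kg) in a wet crop.

    moisture_wet_basis: water mass / total wet mass, as a fraction (0 to 1).
    brix: grams of sugar per 100 grams of solution (sugar plus water).

    Assumption: every kilogram of water belongs to the sugar solution,
    so sugar / (sugar + water) = brix / 100.
    """
    if not 0.0 <= moisture_wet_basis <= 1.0:
        raise ValueError("moisture must be a fraction between 0 and 1")
    if not 0.0 <= brix < 100.0:
        raise ValueError("brix must be in the range [0, 100)")
    water_kg = wet_mass_kg * moisture_wet_basis
    # Solve sugar / (sugar + water) = brix / 100 for sugar.
    return water_kg * brix / (100.0 - brix)


# The two crops described above, per 1000 kg of wet material.
male_parent = soluble_sugar_kg(1000.0, 0.67, 18.0)  # brix about 18, moisture about 67%
sugar_t = soluble_sugar_kg(1000.0, 0.73, 12.0)      # brix about 12, moisture about 73%
print(f"male parent of Sugar T: {male_parent:.1f} kg sugar per tonne")
print(f"Sugar T: {sugar_t:.1f} kg sugar per tonne")
```

 Under these assumptions the higher-brix crop carries more sugar per tonne even though it holds less water, which is consistent with the preference for higher-brix crops expressed above.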
 After at least one additive (a microbe and, optionally, an acid and/or enzyme) is added to the solid biomass, it becomes prepared biomass material, where the at least one additive facilitates the conversion of fermentable sugar into a VOC (such as ethanol). As noted above and further described below, the prepared biomass material can be stored for a certain period of time to allow more VOCs to be generated by the conversion process. At least one volatile organic compound is then recovered from the prepared biomass material. Volatile organic compounds are known to those skilled in the art. The U.S. EPA provides descriptions of volatile organic compounds (VOCs), one of which is any compound of carbon, excluding carbon monoxide, carbon dioxide, carbonic acid, metallic carbides or carbonates, and ammonium carbonate, which participates in atmospheric photochemical reactions, except those designated by EPA as having negligible photochemical reactivity (see http://www.epa.gov/iaq/voc2.html#definition). Another description of volatile organic compounds, or VOCs, is any organic chemical compound whose composition makes it possible for it to evaporate under normal indoor atmospheric conditions of temperature and pressure. This is the general definition of VOCs that is used in the scientific literature, and is consistent with the definition used for indoor air quality. Normal indoor atmospheric conditions of temperature and pressure refer to the range of conditions usually found in buildings occupied by people, and thus can vary depending on the type of building and its geographic location. One exemplary normal indoor atmospheric condition is provided by the International Union of Pure and Applied Chemistry (IUPAC) and the National Institute of Standards and Technology (NIST). IUPAC's standard is a temperature of 0° C. (273.15 K, 32° F.) and an absolute pressure of 100 kPa (14.504 psi), and NIST's definition is a temperature of 20° C. (293.15 K, 68° F.) and an absolute pressure of 101.325 kPa (14.696 psi).
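 The unit equivalences given for the IUPAC and NIST conditions above can be checked with a few lines of Python. This is only a sketch; the helper names are hypothetical, and the pressure factor used (1 psi = 6.894757293168 kPa) is the standard conversion factor rather than a value from this document.

```python
KPA_PER_PSI = 6.894757293168  # standard conversion factor, 1 psi in kPa


def celsius_to_kelvin(temp_c: float) -> float:
    """Convert degrees Celsius to kelvin."""
    return temp_c + 273.15


def celsius_to_fahrenheit(temp_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def kpa_to_psi(pressure_kpa: float) -> float:
    """Convert kilopascals to pounds per square inch."""
    return pressure_kpa / KPA_PER_PSI


# IUPAC condition: 0 deg C and 100 kPa; NIST condition: 20 deg C and 101.325 kPa.
for label, temp_c, pressure_kpa in [("IUPAC", 0.0, 100.0), ("NIST", 20.0, 101.325)]:
    print(label,
          round(celsius_to_kelvin(temp_c), 2), "K,",
          round(celsius_to_fahrenheit(temp_c), 1), "deg F,",
          round(kpa_to_psi(pressure_kpa), 3), "psi")
```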
 Since the volatility of a compound is generally higher the lower its boiling point temperature, the volatility of organic compounds is sometimes defined and classified by boiling point. Accordingly, a VOC can be described by its boiling point. A VOC is any organic compound having a boiling point range of about 50 degrees C. to 260 degrees C. measured at a standard atmospheric pressure of about 101.3 kPa. Many volatile organic compounds that can be recovered and/or further processed from VOCs recovered from embodiments of the present invention have applications in the perfume and flavoring industries. Examples of such compounds may be esters, ketones, alcohols, aldehydes, hydrocarbons and terpenes. The following Table 1 further provides non-limiting examples of volatile organic compounds that may be recovered and/or further processed from VOCs recovered from the prepared biomass material.
TABLE 1
Methanol              Ethyl acetate     Acetaldehyde          Diacetyl
2,3-pentanedione      Malic acid        Pyruvic acid          Succinic acid
Butyric acid          Formic acid       Acetic acid           Propionic acid
Isobutyric acid       Valeric acid      Isovaleric acid       2-methylbutyric acid
Hexanoic acid         Heptanoic acid    Octanoic acid         Nonanoic acid
Decanoic acid         Propanol          Isopropanol           Butanol
Isobutanol            Isoamyl alcohol   Hexanol               Tyrosol
Tryptoptanol          2,3-butanediol    Glycerol              Fumaric acid
Phenethyl alcohol     Amyl alcohol      1,2-propanol          1-propanol
Methyl acetate        Ethyl acetate     Propyl acetate        Ethanol
Propyl lactate        Acetone           Ethyl formate         2-butanol
2-methyl-1-propanol   2-propen-1-ol     2,3-methyl-1-butanol  Ethyl lactate
n-propyl alcohol      3-buten-2-ol
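 The boiling-point description of a VOC given above (about 50 degrees C. to about 260 degrees C. at about 101.3 kPa) can be expressed as a simple range check. The Python sketch below is illustrative only: the function name is hypothetical, the limits are treated as hard cut-offs even though the text says "about", and the boiling points used are approximate literature values, not figures from this document.

```python
VOC_BP_MIN_C = 50.0   # lower limit from the boiling-point description
VOC_BP_MAX_C = 260.0  # upper limit from the boiling-point description


def is_voc_by_boiling_point(boiling_point_c: float) -> bool:
    """True if a boiling point (deg C at about 101.3 kPa) falls in the stated range.

    Assumption: the "about" limits are applied as hard cut-offs.
    """
    return VOC_BP_MIN_C <= boiling_point_c <= VOC_BP_MAX_C


# Approximate literature boiling points in deg C (assumed values).
approx_boiling_points_c = {
    "ethanol": 78.4,
    "acetic acid": 118.0,
    "acetone": 56.0,
}

for name, bp_c in approx_boiling_points_c.items():
    print(name, is_voc_by_boiling_point(bp_c))
```

 Because this is only one of several descriptions of a VOC given in the text, a compound falling outside the range under this check is not necessarily excluded by the other descriptions.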
 Ethanol is a preferred volatile organic compound. As such, many examples specifically mention ethanol. This specific mention, however, is not intended to limit the invention. It should be understood that aspects of the invention also equally apply to other volatile organic compounds. Another preferred volatile organic compound is acetic acid.
 Embodiments of the present invention provide for the long term storage of solid biomass material without significant degradation to the volatile organic compounds contained in the prepared biomass material, and they provide for sugar preservation to allow for continued generation of VOCs. As used in this context, "significant" refers at least to within the margin of error when measuring the amount or concentration of the volatile organic compounds in the prepared biomass material. In one embodiment, the margin of error is about 0.5%.
 Accordingly, embodiments of the present invention allow for continuous production of VOCs without dependence on the length of the harvest, thereby eliminating or minimizing down time of a recovery plant in traditional just-in-time harvest and recovery processes. As such, embodiments of the present invention allow for harvest of the crop at its peak without compromises typically made to lengthen the harvest season, such as harvesting slightly earlier or later than peak time. That is, embodiments of the invention allow for harvest at high field yields and high sugar concentrations, such as when the selected crop has reached its peak sugar concentration or amount of fermentable sugars that can be converted into a volatile organic compound, even if this results in a shorter harvest period. In one embodiment, the solid biomass is harvested or prepared when it is at about 80%, about 85%, about 90%, about 95%, or about 100% of its maximum potential fermentable sugar concentration. As such, embodiments of the present invention, particularly the recovery phase, can be operated continuously year-round without time pressure from fear of spoilage of the solid biomass and VOCs contained therein. While embodiments of the present invention allow for harvest of the solid biomass near or at its maximum sugar production potential, the solid biomass material can be harvested at any point when it is deemed to contain a suitable amount of sugar. Further, the harvest window varies depending on the type of crop and the geographical location. For example, the harvest window for sorghum in North America can range from about 1 to 7 months. However, in Brazil and other equatorial and near equatorial areas, the harvest window may be up to twelve months.
 In embodiments using plants as the solid biomass, the solid biomass can be collected or harvested from the field using any suitable means known to those skilled in the art. In one embodiment, the solid biomass comprises a stalk component and a leaf component of the plant. In another embodiment, the solid biomass further comprises a grain component. In a preferred embodiment, the solid biomass is harvested with a forage or silage harvester (a forage or silage chopper). A silage or forage harvester refers to farm equipment used to make silage, which is grass, corn or other plant that has been chopped into small pieces, and compacted together in a storage silo, silage bunker, or in silage bags. A silage or forage harvester has a cutting mechanism, such as either a drum (cutterhead) or a flywheel with a number of knives fixed to it, which chops and transfers the chopped material into a receptacle that is either connected to the harvester or to another vehicle driving alongside. A forage harvester is preferred because it provides benefits over a sugar cane harvester or dry baled system. For example, a forage harvester provides higher density material than a sugar cane harvester, thereby allowing for more efficient transportation of the harvested material. In one embodiment, using a forage harvester results in harvested sorghum with a bulk density of about 400 kg/m3, compared to sugarcane harvested with a sugarcane harvester with density of about 300 kg/m3, and for sorghum harvested with a sugarcane harvester with a density of about 200 kg/m3. In general, higher bulk density material is cheaper to transport, which tends to limit the geographical area in which cane-harvested crop can be sourced.
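 By way of non-limiting illustration, the effect of bulk density on transportation can be sketched numerically. The following Python sketch uses the three bulk densities quoted above. The truck volume of 50 m3 and the total mass of 1,000,000 kg are hypothetical values chosen only for the example, and each load is assumed to be limited by volume rather than by weight.

```python
import math

# Bulk densities (kg/m3) quoted in the text for one embodiment.
BULK_DENSITY_KG_M3 = {
    "sorghum, forage harvester": 400.0,
    "sugarcane, sugarcane harvester": 300.0,
    "sorghum, sugarcane harvester": 200.0,
}


def loads_required(total_mass_kg, density_kg_m3, truck_volume_m3):
    """Return the number of volume-limited truck loads needed to haul total_mass_kg."""
    mass_per_load_kg = density_kg_m3 * truck_volume_m3
    return math.ceil(total_mass_kg / mass_per_load_kg)


if __name__ == "__main__":
    TRUCK_VOLUME_M3 = 50.0     # hypothetical truck volume (assumption)
    TOTAL_MASS_KG = 1_000_000  # hypothetical harvested mass (assumption)
    for label, density in BULK_DENSITY_KG_M3.items():
        print(label, loads_required(TOTAL_MASS_KG, density, TRUCK_VOLUME_M3))
```

 Under these assumptions, sorghum cut with a forage harvester needs half as many loads as sorghum cut with a sugarcane harvester.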
 Thus, a forage harvester is an overall less expensive way to harvest the selected biomass, such as sorghum, than a cane harvester or dry baled system. Not to be bound by theory, it is believed the cost savings are due in part to higher material throughputs and the higher bulk density of the solid biomass harvested by a forage harvester. The solid biomass can be cut in any length. In one embodiment, the chop length of the harvester is set to a range of about 3 mm to about 80 mm, preferably about 3 mm to about 20 mm, with chop lengths of about 3 mm to about 13 mm being most preferred. At these preferred chop lengths, there was no observable aqueous discharge in the forage harvester, so losses were minimal. When a chop length is selected, the harvester provides biomass with an average size or length distribution of about the chop length selected. In one embodiment, the average size distribution of the solid component exiting the recovery system can be adjusted as desired, which can be done by adjusting the chop length of the harvester.
 At least one additive is added to the solid biomass to facilitate and/or expedite the conversion of appropriate carbohydrates into volatile organic compounds. After selected additive(s) have been added, the solid biomass can be referred to as prepared biomass material. In one embodiment, the prepared biomass material can comprise at least one or any combination of fermentable sugar-producing plants listed above. In a preferred embodiment, the selected additive(s) can be conveniently added using the harvester during harvest.
 In one embodiment, at least about 700 tons, preferably at least about 1 million tons, such as at least 1.2 million tons, or more preferably at least about 5 million tons of prepared biomass material is generated in a particular harvest window based on the growing conditions of a specific region, such as about 1 to 7 months in North America for sorghum.
 The at least one additive can be added at any point during and/or after the harvest process. In a preferred embodiment using a forage harvester, additives are added to the solid biomass during the harvest process to generate a prepared biomass material. In particular, forage harvesters are designed for efficiently adding both solid and liquid additives during harvest. As mentioned above, the additives added include at least a microbe (e.g. a yeast), and optionally, an acid and/or an enzyme. In a preferred embodiment, the selected additive(s) are added as solutions. Additional details of the potential additives are further provided below.
 For embodiments using a forage harvester or similar equipment, the selected additive(s) can be added during harvest at all phases, such as before the intake feed rollers, during intake, at chopping, after chopping, through the blower, after the blower, in the accelerator, in the boom (or spout), and/or after the boom. In one embodiment where acid and enzyme are added, the acid is added near the intake feed rollers, and a microbe and the enzyme are added in the boom. In a particular embodiment, a Krone Big X forage harvester with a V12 motor and a header about 30 ft wide is used. In an embodiment using the Krone system, the acid is added as a solution through flexible tubing that discharges the solution just in front of the feed rollers. In this way, the liquid flow could be visually monitored, which showed that the acid solution and solid biomass mixed quickly inside the chopping chamber. In another embodiment, the addition of acid was also demonstrated as a viable practice using a Case New Holland FX 58 forage harvester. In certain embodiments, the forage harvester used can include an onboard rack for containing additives, at least the one(s) selected to be added during harvest. In another embodiment, the selected additive(s) to be added during harvest may be towed behind the harvester on a trailer. For example, in one embodiment, it was demonstrated that a modified utility trailer equipped with tanks containing additive solutions of yeast, enzymes and acid can be employed with minimal interference with normal operations of the harvester, thereby substantially maintaining the expected cost and duration of the harvest process. For example, in one embodiment, a silage harvester that travels at about 4 miles per hour in a normal harvest configuration and biomass yield maintains a similar collection rate of about 4 miles per hour when equipped with certain additives as described above.
 In embodiments of the present invention, the prepared biomass material is eventually transported to a storage facility where it is stored for a period of time to allow for production of at least one volatile organic compound from at least a portion of the fermentable sugar of the solid biomass. The details of the storage phase are further provided below. In certain embodiments, selected additive(s) can also be added at the storage facility. For example, in one embodiment, the selected additive(s) can be added during unloading or after the solid biomass has been unloaded at the storage facility. In one embodiment, a conveyance system is used to assist with the adding of selected additive(s) at the storage facility. Additive(s) added at the storage facility to solid biomass can be one(s) that have not been added or additional amount of one(s) previously added. Accordingly, selected additive(s) can therefore be added at any point from the start of the harvest process to prior to storage of the prepared biomass material at the storage area or facility, such as at points where the material is transferred.
 As mentioned above, additive(s) for embodiments of the present invention include at least a microbe and optionally, an acid and/or an enzyme. Selected additive(s) can be added to the solid biomass in any order. In a preferred embodiment, an acid is added to the solid biomass before adding a microbe to prime the material to provide an attractive growth environment for the microbe.
 In a preferred embodiment, acid is added to reduce the pH of the solid biomass to a range that facilitates and/or expedites selected indigenous or added microbial growth, which increases production of ethanol and/or volatile organic compounds. The acid can also stop or slow plant respiration, which consumes fermentable sugars intended for subsequent VOC production. In one embodiment, acid is added until the pH of the solid biomass is between about 2.5 and about 5.0, preferably in a range of about 3.7 to about 4.3, and more preferably about 4.2. The acid used can include known acids, such as sulfuric acid, formic acid, or phosphoric acid. The following Table 2 provides non-limiting examples of an acid that can be used individually or in combination.
TABLE 2
Sulfuric Acid      Formic Acid    Propionic Acid    Malic Acid
Phosphoric Acid    Maleic Acid    Folic Acid        Citric Acid
 In a preferred embodiment, after the solid biomass has reached the desired pH with the addition of acid, a microbe is added. A microbe in the additive context refers at least to a living organism added to the solid biomass that is capable of impacting or affecting the prepared biomass material. One exemplary impact or effect from added microbe(s) includes providing fermentation or other metabolism to convert fermentable sugars from various sources, including cellulosic material, into ethanol or other volatile organic compounds. Another exemplary impact or effect may be production of certain enzyme(s) that help to deconstruct cellulose in the prepared biomass material into fermentable sugars which can be metabolized to ethanol or other VOC. Yet another exemplary impact or effect provided by a microbe includes production of compounds such as vitamins, co-factors, and proteins that can improve the quality, and thus value, of an eventual by-product that can serve as feed for animals. Further, microbial activity provides heat for the pile. Parts of the microbial cell walls or other catabolite or anabolite may also offer value-added chemicals that may be recovered by a recovery unit. These impacts and effects may also be provided by microbes indigenous to the solid biomass.
 Any microbe that is capable of impacting or affecting the prepared biomass material can be added. In a preferred embodiment, the microbe(s) can include microbes used in the silage, animal feed, wine, and industrial ethanol fermentation applications. In one embodiment, the microbe selected includes yeast, fungi, and bacteria according to application and the desired profile of the organic molecule to be made. In a preferred embodiment, yeast is the selected microbe. In another embodiment, bacteria can be added to make lactic acid or acetic acid. Certain fungi can also be added to make these acids. For example, Acetobacterium acetii can be added to generate acetic acid; Lactobacillus and/or Streptococcus thermophilus can be added to generate lactic acid; Actinobacillus succinogenes, Mannheimia succiniciproducens, and/or Anaerobiospirillum succiniciproducens can be added to generate succinic acid; Clostridium acetobutylicum can be added to generate acetone and butanol; and/or Aerobacter aerogenes can be added to generate butanediol.
 The following Table 3 provides non-limiting examples of preferred microbes, which can be used individually or in combination.
TABLE 3
Saccharomyces fermentatti     Saccharomyces cerevisiae       Saccharomyces japonicas                Saccharomyces bayanus
Saccharomyces exiguous        Saccharomyces chevalieri       Clostridium acetobutylicum             Clostridium amylosaccharobutylpropylicum
Clostridium propyl-butylicum  Clostridium viscifaciens       Clostridium propionicum                Aerobacter species
Aerobacter aerogenes          Zymomonas mobilis              Zymomonas species                      Clostridium species
Saccharomyces species         Bacillus species               Clostridium thermocellum               Lactobacillus buchneri
Lactobacillus plantarum       Enterococcus faecium           Pediococcus species                    Propionibacteria
Acetobacterium acetii         Streptococcus thermophilus     Lactobacillus paracasei                Lactobacillus species
Actinobacillus succinogenes   Mannheimia succiniciproducens  Anaerobiospirillum succiniciproducens
 Preferred microbes also include Saccharomyces cerevisiae strains that can tolerate high ethanol concentrations and are strong competitors in their respective microbial communities. The microbes may be mesophiles or thermophiles. Thermophiles are organisms that grow best at temperatures above about 45° C., and are found in all three domains of life: Bacteria, Archaea and Eukarya. Mesophiles generally are active between about 20° C. and 45° C. In an embodiment using a strain of Saccharomyces cerevisiae, the strain can come from a commercially available source such as Biosaf from Lesaffre, Ethanol Red from Phibro, and Lallemand activated liquid yeast. If the microbe is obtained from a commercial source, the microbe can be added according to the recommended rate of the provider, which is typically based on the expected sugar content per wet ton, where water is included in the mass calculation. The term "wet ton" refers at least to the mass unit including water. The recommended amount can be adjusted according to reaction conditions. The microbe added can comprise one strain or multiple strains of a particular microbe. In one embodiment, the microbes are added at a rate of up to 500 mL per wet ton of solid biomass. In a particular embodiment using commercially available yeast, about 300 mL of Lallemand yeast preparation is added per wet ton of solid biomass. In another embodiment, an additional yeast strain can be added. For example, Ethanol Red can be added at a rate between about 0.001 kg/wet ton to about 0.5 kg/wet ton, particularly about 0.1 kg/wet ton. In yet another embodiment, another yeast strain can be added, e.g., Biosaf, at a rate between about 0.001 kg/wet ton to about 0.5 kg/wet ton, particularly about 0.1 kg/wet ton. It is understood that other amounts of any yeast strain can be added. For example, about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 1.5 times, about 2 times, about 2.5 times, or about 3 times of the provided amounts of microbes can be added.
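 By way of non-limiting illustration, the per-wet-ton rates and scaling factors above can be combined in a simple calculation. The following Python sketch is only an example; the function name and the pile size of 1,000 wet tons are hypothetical and are not taken from any particular embodiment.

```python
def total_additive(rate_per_wet_ton, wet_tons, fraction_of_rate=1.0):
    """Total additive for wet_tons of biomass.

    rate_per_wet_ton: provided rate, in any unit per wet ton (e.g. mL or kg).
    fraction_of_rate: scaling of the provided rate (0.1 for 10%, 1.5 for 1.5 times).
    """
    return rate_per_wet_ton * fraction_of_rate * wet_tons


if __name__ == "__main__":
    WET_TONS = 1000  # hypothetical amount of solid biomass (assumption)
    # About 300 mL of liquid yeast preparation per wet ton, as in one embodiment.
    print(total_additive(300, WET_TONS) / 1000, "L liquid yeast")
    # The same preparation at about 1.5 times the provided amount.
    print(total_additive(300, WET_TONS, 1.5) / 1000, "L liquid yeast at 1.5x")
    # About 0.1 kg of an additional dry yeast strain per wet ton.
    print(total_additive(0.1, WET_TONS), "kg dry yeast")
```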
 In certain embodiments, an enzyme is further added. The enzyme can be one that assists in the generation of fermentable sugars from plant materials that are more difficult for the microbe to metabolize, such as different cellulosic materials, and/or to improve the value of an eventual by-product serving as animal feed, such as by making the feed more digestible. The enzyme can also be an antibiotic, such as a lysozyme as discussed further below. The enzyme added can include one type of enzyme or many types of enzymes. The enzyme can come from commercially available enzyme preparations. Non-limiting examples of enzymes that assist in converting certain difficult to metabolize plant materials into fermentable sugars include cellulases, hemicellulases, ferulic acid esterases, and/or proteases. Additional examples also include other enzymes that either provide or assist the provision for the production of fermentable sugars from the feedstock, or increase the value of the eventual feed by-product.
 In certain embodiments, the enzymes that assist in converting certain difficult to metabolize plant materials into fermentable sugars can be produced by the plant itself, e.g., in-plantae. Examples of plants that can produce cellulases, hemicellulases, and other plant-polymer degrading enzymes within the growing plant are described in patent publications WO2011057159, WO2007100897, and WO9811235, and in U.S. Pat. No. 6,818,803, which show that enzymes for depolymerizing plant cell walls may be produced in plants. In another embodiment, ensilagement can be used to activate such plant-produced enzymes as well as temper the biomass for further processing. One example is described in patent publication WO201096510. If used, such transgenic plants can be included in the harvest in any amount. For example, certain embodiments may employ in-plantae enzymes produced in plants by using particular transgenic plants exclusively as a feedstock, or by incorporating the transgenic plants in an interspersed manner within like or different crops.
 In certain embodiments that include such plant-polymer degrading enzymes, ethanol can be produced from cellulosic fractions of the plant. In a particular embodiment, when Novozymes Cellic CTec2 enzyme was added to a sorghum storage system in excess of the recommended amount, namely about 100 times more than the recommended amount, about 152% of the theoretical ethanol conversion efficiency based on the initial free sugar content was achieved. While such an amount of enzymes can be added using commercially available formulations, doing so can be costly. On the other hand, such an amount of enzymes can be obtained in a more cost effective manner by growing transgenic plants that produce these enzymes at least interspersed among the biomass crop.
 The ethanol production from cellulose occurred during the storage phase, e.g., in silage, and was stable for about 102 days of storage, after which the experiment was terminated. This demonstrates that, under the conditions of that particular experiment, an excess of such enzyme activity results in additional ethanol production, amounting to at least about 52% of the theoretical yield from the initial free sugars, using fermentable sugars from cellulose. Not intended to be bound by theory, for certain embodiments, the immediate addition of acid during harvest in the experiment may have lowered the pH, thereby potentially inducing the enzyme activity, which otherwise could damage the plants if produced while the plants were still growing.
 In a preferred embodiment, if an enzyme is added, the enzyme can be from any family of cellulase preparations. In one embodiment, the cellulase preparation used is Novozymes Cellic CTec2 or CTec3. In another embodiment, a fibrolytic enzyme preparation is used, particularly, Liquicell 2500. If used, the amount of enzyme added to degrade plant polymer can be any amount that achieves the desired conversion of plant material to fermentable sugar, such as the recommended amount. In a particular embodiment, about 80,000 FPU to about 90,000,000 FPU, preferably about 400,000 FPU to about 45,000,000 FPU, more preferably about 800,000 FPU to about 10,000,000 FPU of enzyme is added per wet ton of biomass. The term "FPU" refers to Filter Paper Unit, which refers at least to the amount of enzyme required to liberate 2 mg of reducing sugar (e.g., glucose) from a 50 mg piece of Whatman No. 1 filter paper in 1 hour at 50° C. at approximately pH 4.8.
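 By way of non-limiting illustration, a proposed enzyme dose can be compared against the FPU ranges stated above. The following Python sketch treats the "about" endpoints as exact limits purely for convenience; the labels and the function name are hypothetical.

```python
# Ranges of enzyme activity per wet ton of biomass, narrowest first.
# The endpoints are the "about" values from the text, treated as exact here.
FPU_RANGES = [
    ("more preferred", 800_000, 10_000_000),
    ("preferred", 400_000, 45_000_000),
    ("disclosed", 80_000, 90_000_000),
]


def classify_fpu_dose(fpu_per_wet_ton):
    """Return the narrowest stated range that contains the dose."""
    for label, low, high in FPU_RANGES:
        if low <= fpu_per_wet_ton <= high:
            return label
    return "outside disclosed range"


if __name__ == "__main__":
    for dose in (50_000, 100_000, 500_000, 1_000_000):
        print(dose, classify_fpu_dose(dose))
```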
 In certain other embodiments, selected additive(s) added can include other substances capable of slowing or controlling bacterial growth. Non-limiting examples of these other substances include antibiotics (including antibiotic enzymes), such as Lysovin (lysozyme) and Lactrol® (Virginiamycin, a bacterial inhibitor). Control of bacterial growth can allow the appropriate microbe to expedite and/or provide the production of volatile organic compounds. Antibiotic is a general term for something which suppresses or kills life. An example of an antibiotic is a bacterial inhibitor. In one embodiment, a selective antibiotic that is intended to impact bacteria and not other microbes is used. One example of a selective antibiotic is Lactrol, which affects bacteria but does not affect yeasts.
 In a particular embodiment, if used, Lactrol can be added at rates of about 1 to about 20 parts per million (ppm) w/v (weight Lactrol per volume liquid) as dissolved in the water phase of the prepared biomass material, for example at about 5 ppm w/v. In an embodiment using an enzyme to control bacterial growth, lysozyme is preferably used. The lysozyme can come from a commercial source. An exemplary commercially available lysozyme preparation is Lysovin, which is a preparation of the enzyme lysozyme that has been declared permissible for use in food, such as wine.
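 By way of non-limiting illustration, a ppm w/v target can be converted into a mass of additive. The following Python sketch assumes that 1 ppm w/v corresponds to 1 mg of additive per liter of liquid; the water volume of 640 L per wet ton is a hypothetical value chosen only for the example, since the actual water content depends on the particular biomass.

```python
def additive_grams(ppm_w_v, water_liters):
    """Grams of additive needed to reach ppm_w_v in water_liters of liquid.

    Assumes 1 ppm w/v = 1 mg of additive per liter of liquid.
    """
    milligrams = ppm_w_v * water_liters
    return milligrams / 1000.0


if __name__ == "__main__":
    WATER_L_PER_WET_TON = 640.0  # hypothetical water content (assumption)
    for ppm in (1, 5, 20):
        print(ppm, "ppm w/v ->", additive_grams(ppm, WATER_L_PER_WET_TON), "g per wet ton")
```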
 The enzyme and/or other antibiotic material, if used, can be added independently or in conjunction with one another and/or with the microbe. In certain embodiments, other compounds serving as nutrients to the microbes facilitating and/or providing the volatile organic compound production can also be added as an additive. The following Table 4 provides non-limiting examples of other substances, including antibiotics, which can be added to the solid biomass.
TABLE 4
Potassium Metabisulfite    Potassium Bicarbonate    FermaSure® (from Dupont®) - oxychlorine products including chlorite    Lysovin
Thiamin                    Magnesium Sulfate        Calcium Pantothenate                                                   Diammonium Phosphate
Ammonia                    Antibiotics              Lactrol                                                                Biotin
 Yeasts and other microbes that are attached to solids individually, as small aggregates, or as biofilms have been shown to have increased tolerance to inhibitory compounds. Not intended to be bound by theory, part of the long-term fermentation may be possible or enhanced by such microbial-to-solids binding. As such, the prepared biomass material that includes the microbe optimized for microbial binding as well as additives that may bind microorganisms can experience a greater extent of fermentation and/or efficiency of fermentation. Substances providing and/or facilitating long-term fermentation are different from substances that increase the rate of fermentation. In certain embodiments, an increase in the rate of fermentation is not as important a factor as the long-term fermentation, particularly over a period of many weeks or months.
 The following provides particular amounts of additives applied to one specific embodiment. If used, the rate and amount of adding an acid varies with the buffering capacity of the particular solid biomass to which the particular acid is added. In a particular embodiment using sulfuric acid, 9.3% w/w sulfuric acid is added at rates of up to about 10 liter/ton wet biomass, for example at about 3.8 liter/ton wet biomass to achieve a pH of about 4.2. In other embodiments, the rate will vary depending on the concentration and type of acid, liquid and other content and buffering capacity of the particular solid biomass, and/or desired pH. In this particular embodiment, Lactrol is added at a rate of about 3.2 g/wet ton of solid biomass. Yeasts or other microbes are added according to the recommended rate from the provider, such as according to the expected sugar content per wet ton. In one particular embodiment, Lallemand stabilized liquid yeast is added at about 18 fl oz per wet ton, and Novozymes Cellic CTec2 is added at about 20 fl oz per wet ton.
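 By way of non-limiting illustration, the per-wet-ton rates of this specific embodiment can be scaled to size the additive tanks carried during harvest. The following Python sketch assumes that "fl oz" means US fluid ounces and that "ton wet biomass" and "wet ton" denote the same unit; the amount of 100 wet tons is a hypothetical value.

```python
US_FL_OZ_TO_ML = 29.5735  # assumes US fluid ounces (assumption)

# Rates per wet ton taken from the specific embodiment described above.
RATES_PER_WET_TON = {
    "sulfuric acid solution (L)": 3.8,
    "Lactrol (g)": 3.2,
    "liquid yeast (L)": 18 * US_FL_OZ_TO_ML / 1000.0,
    "Cellic CTec2 (L)": 20 * US_FL_OZ_TO_ML / 1000.0,
}


def additive_totals(wet_tons):
    """Return the total amount of each additive for wet_tons of biomass."""
    return {name: rate * wet_tons for name, rate in RATES_PER_WET_TON.items()}


if __name__ == "__main__":
    for name, amount in additive_totals(100).items():  # 100 wet tons is hypothetical
        print(f"{name}: {amount:.1f}")
```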
 In a preferred embodiment, selected additive(s) are added to the solid biomass stream during harvest according to aspects of the invention described above to generate the prepared biomass material. Preferably, the prepared biomass material is transported to a storage facility to allow for conversion of carbohydrates of the prepared biomass material into volatile organic compounds of the desired amount and/or await recovery of the volatile organic compounds. Any suitable transportation method and/or device can be used, such as vehicles, trains, etc, and any suitable method to place the prepared biomass material onto the transportation means. Non-limiting examples of vehicles that can be used to transport the biomass material include end-unloading dump trucks, side-unloading dump trucks, and self-unloading silage trucks. In a preferred embodiment, a silage truck is used. In embodiments using a forage harvester to collect the biomass, transportation of such solid biomass is more efficient than transportation of materials collected by conventional means, such as sugar cane billets, because the bulk density is higher in the solid biomass cut with a forage harvester. That is, materials chopped into smaller pieces pack more densely than materials in billets. In one embodiment, the range of bulk densities in a silage truck varies between about 150 kg/m3 and about 350 kg/m3, for example about 256 kg/m3. Because in certain embodiments, all selected additives are added during harvest, preferably on the harvester, the microbe may begin to interact with the biomass during transportation, and in this way transportation is not detrimental to the overall process.
 The biomass, whether prepared or not, is delivered to at least one storage area or facility. The storage facility can be located any distance from the harvest site. Selected additive(s) can be added if they have not been added already or if additional amounts or types need to be further added to generate the prepared biomass material. In a preferred embodiment, the prepared biomass is stored in at least one pile on a prepared surface for a period of time. The facility can incorporate man-made or natural topography. Man-made structures can include existing structures at the site not initially designated for silage, such as canals and water treatment ponds. Non-limiting examples of a prepared surface include a concrete, asphalt, fly ash, or soil surface. The at least one pile can have any dimension or shape, which can depend on operating conditions, such as space available, amount of biomass, desired storage duration, etc.
 The conversion process of fermentable sugars is an exothermic reaction. Too much heat, however, can be detrimental to the conversion process if the temperature is in the lethal range for the microbes in the prepared biomass material. However, in an embodiment using about 700 wet tons of biomass and piling up to about 12 feet, ethanol production and stability were satisfactory. Therefore larger piles will likely not suffer from overheating. In one embodiment, an inner portion of the pile maintains a temperature in a range of about 20° C. to about 60° C. for microbes of all types, including thermophiles. In an embodiment not employing thermophiles, an inner portion of the pile maintains a temperature in a range of about 35° C. to about 45° C.
 The prepared biomass material that is stored as at least one pile at the storage facility can also be referred to as a wet stored biomass aggregate. After addition of the selected additive(s), at least a portion of the solid biomass is converted to volatile organic compounds, such as by fermentation of sugars into ethanol. In one embodiment, the prepared biomass material is stored for a period of time sufficient to achieve an anaerobic environment. In a preferred embodiment, the anaerobic environment is achieved in about 24 hours. In another embodiment, the anaerobic environment is achieved in more than about 4 hours. In yet another embodiment, the anaerobic environment is achieved in up to about 72 hours.
 The pile can be free standing or formed in another structure, such as a silage bunker, designed to accept silage, including provisions to collect aqueous runoff and leachate, to place a tarp over the biomass, and to facilitate both efficient initial silage truck unloading into the bunker and removal of the biomass year-round. The individual bunkers may be sized at about the size to support annual feedstock requirements of about 700 wet tons to 10,000,000 wet tons or more. For example, the storage facility may have 50 bunkers, where each individual bunker can accept 100,000 wet tons of prepared biomass material for a total of a maximum of about 5 million wet tons of stored material at any one time. In a preferred embodiment where ethanol is the volatile organic compound of choice, about 14 gallons to about 16 gallons of ethanol are recovered per one wet ton of prepared biomass material. The provided numbers are exemplary and not intended to limit the amount of prepared biomass material a storage facility can accommodate.
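 By way of non-limiting illustration, the exemplary facility size and ethanol recovery stated above can be combined arithmetically. The following Python sketch uses only the exemplary numbers from the text (50 bunkers, 100,000 wet tons per bunker, and about 14 to about 16 gallons per wet ton); the function names are hypothetical.

```python
def facility_capacity(bunkers, wet_tons_per_bunker):
    """Maximum stored material (wet tons) when every bunker is full."""
    return bunkers * wet_tons_per_bunker


def ethanol_range_gal(wet_tons, low_gal_per_ton=14.0, high_gal_per_ton=16.0):
    """Low and high estimates of ethanol recovered (gallons) from wet_tons."""
    return wet_tons * low_gal_per_ton, wet_tons * high_gal_per_ton


if __name__ == "__main__":
    capacity = facility_capacity(50, 100_000)
    low, high = ethanol_range_gal(capacity)
    print(capacity, "wet tons;", low, "to", high, "gallons of ethanol")
```

 With the exemplary numbers, a full facility of about 5 million wet tons corresponds to about 70 million to about 80 million gallons of recovered ethanol.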
 In a particular embodiment, the storage pile further includes a leachate collection system. In one embodiment, the collection system is used to remove leachate collected from the storage pile. For example, the leachate collection system can be adapted to remove liquid from the pile at certain points during the storage period. In another embodiment, the leachate collection system is adapted to circulate the liquid in the storage pile. For example, circulation can involve taking at least a portion of the recovered liquid and routing it back to the pile, preferably at or near the top portion. Such recirculation allows for longer retention time of certain portions of the liquids in the pile, even as the recovery phase of the prepared biomass material begins and portions of the non-liquid component of the prepared biomass material are sent to the recovery unit. The longer retention time results in longer microbial reaction time, and hence, higher concentrations of organic volatile compounds, such as ethanol.
 Any suitable leachate collection system known to those skilled in the art can be employed as described. In a particular embodiment, the leachate collection system comprises at least one trough along the bottom of the pile, preferably positioned near the middle of the storage pile (or of the bunker, if one is used), where the storage pile is prepared at a grade designed to direct liquid from the prepared biomass material to the trough and out to a desired collection receptacle or routed to other applications.
 In another embodiment, the leachate collection system comprises one or more perforated conduits, preferably pipes made of polyvinyl chloride (PVC), that run along the bottom of the pile to allow the liquid collected in the conduits to be directed away from the pile.
 In one embodiment, as the prepared biomass material is added to the bunker or laid on top of the prepared surface, a tractor or other heavy implement is driven over the pile repeatedly to facilitate packing. In one embodiment, the packing ranges from about 7 lbs/ft3 to about 50 lbs/ft3 for the prepared biomass material. In a preferred embodiment, the packing is from about 30 lbs/ft3 to about 50 lbs/ft3, particularly about 44 lbs/ft3. In one embodiment, the compacting of the prepared biomass material in a pile facilitates and/or allows an anaerobic environment to be achieved in the preferred time periods described above. In another embodiment, after the packing is performed or during the time the packing is being performed, an air impermeable membrane is placed on the pile, typically a fit-for-purpose plastic tarp. In a particular embodiment, the tarp is placed on the pile as soon as is practical. For instance, the tarp is placed on the pile within a 24-hour period.
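 By way of non-limiting illustration, the packing densities above, given in lbs/ft3, can be converted to kg/m3 for comparison with the bulk densities stated elsewhere in kg/m3. The following Python sketch uses the exact definitions of the pound and the foot; the function name is hypothetical.

```python
LB_TO_KG = 0.45359237  # kilograms per pound (exact by definition)
FT_TO_M = 0.3048       # meters per foot (exact by definition)


def lb_per_ft3_to_kg_per_m3(density_lb_ft3):
    """Convert a density from pounds per cubic foot to kilograms per cubic meter."""
    return density_lb_ft3 * LB_TO_KG / FT_TO_M ** 3


if __name__ == "__main__":
    for packing in (7, 30, 44, 50):
        print(packing, "lbs/ft3 =", round(lb_per_ft3_to_kg_per_m3(packing), 1), "kg/m3")
```

 For instance, the packing of about 44 lbs/ft3 corresponds to about 705 kg/m3.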
 In one embodiment, the prepared biomass material is stored for at least about 24 hours and preferably at least about 72 hours (or 3 days) to allow for production of volatile organic compounds, such as ethanol. In one embodiment, the prepared biomass material is stored for about three days, preferably ten days, more preferably greater than ten days. In one embodiment, the time period for storage of the prepared biomass is about 1 day to about 700 days, preferably about 10 to 700 days. In another embodiment, the biomass material is stored for up to about three years. In one embodiment, the prepared biomass material is stored for a time period sufficient to allow a conversion efficiency of sugar to at least one volatile organic compound of at least about 95% of the theoretical production efficiency as calculated through a stoichiometric assessment of the relevant biochemical pathway. In another embodiment, the prepared biomass material is stored for a time period sufficient to allow a calculated conversion efficiency of sugar to at least one volatile organic compound of at least about 100%. In yet another embodiment, the prepared biomass material is prepared with certain additives, such as enzymes, that allow a calculated conversion efficiency of sugar to at least one volatile organic compound of up to about 150% of the theoretical value based on the initial amount of available fermentable sugars. Not intended to be bound by theory, it is believed that, at or above 100% efficiency, the volatile organic compound(s) are produced from both the initially available fermentable sugars and fermentable sugars from cellulosic or other polymeric material in the prepared biomass material, which can be achieved by enzymatic hydrolysis or acid hydrolysis facilitated by certain additive(s) applied to the biomass.
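 By way of non-limiting illustration, a calculated conversion efficiency of the kind described above can be sketched for ethanol. The following Python sketch assumes that the initially available fermentable sugars are expressed as hexose (e.g., glucose) equivalents and that the relevant pathway is C6H12O6 -> 2 C2H5OH + 2 CO2; other sugars or other volatile organic compounds would require a different stoichiometric factor. The function names are hypothetical.

```python
ETHANOL_G_PER_MOL = 46.07
HEXOSE_G_PER_MOL = 180.16

# C6H12O6 -> 2 C2H5OH + 2 CO2, i.e. about 0.511 kg ethanol per kg hexose.
THEORETICAL_YIELD = 2 * ETHANOL_G_PER_MOL / HEXOSE_G_PER_MOL


def conversion_efficiency_pct(ethanol_kg, initial_sugar_kg):
    """Ethanol produced as a percentage of the theoretical yield from initial sugars."""
    return 100.0 * ethanol_kg / (initial_sugar_kg * THEORETICAL_YIELD)


def ethanol_beyond_initial_sugar_kg(ethanol_kg, initial_sugar_kg):
    """Ethanol (kg) that cannot be accounted for by the initial sugars alone."""
    return max(0.0, ethanol_kg - initial_sugar_kg * THEORETICAL_YIELD)


if __name__ == "__main__":
    sugar_kg = 100.0    # hypothetical initial fermentable sugar (assumption)
    ethanol_kg = 76.7   # hypothetical measured ethanol (assumption)
    print(round(conversion_efficiency_pct(ethanol_kg, sugar_kg), 1), "% of theoretical")
    print(round(ethanol_beyond_initial_sugar_kg(ethanol_kg, sugar_kg), 1), "kg beyond initial sugars")
```

 A calculated efficiency above 100% in such a sketch indicates ethanol that must have come from sugars other than those initially available, consistent with the explanation above.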
 The produced volatile organic products, such as ethanol, remain stable in the stored prepared biomass material for the duration of the storage period. In particular, the prepared biomass material can be stored up to 700 days without significant degradation to the volatile organic compounds. "Significant" in this context refers at least to within the margin of error when measuring the amount or concentration of the volatile organic compounds in the prepared biomass material. In one embodiment, the margin of error is 0.5%. It has been demonstrated that ethanol remains stable in the pile after at least about 330 days with no significant ethanol losses observed. This aspect of embodiments of the present invention is important because it provides for at least eight months of stable storage, which enables year-round VOC production and recovery with a harvest window of only about four months. Embodiments of the invention provide significant advantages over conventional just-in-time processing, which would only be able to operate during the four-month harvest window each year. That is, embodiments of the invention allow a plant to operate year-round using only a four-month harvest window, thereby reducing capital costs for a plant of the same size as one used for just-in-time processing.
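 The capacity argument can be made concrete with a short calculation. The sketch below is illustrative only: the annual feed tonnage and the 720 operating hours per month are hypothetical assumptions, and the function name is not from the specification. It compares the average processing rate needed to handle the same annual harvest in a four-month window with the rate needed when operating year-round.

```python
def required_capacity(annual_feed_tonnes, operating_months, hours_per_month=720.0):
    """Average processing rate (tonnes per hour) needed to handle the annual
    feed within the given number of operating months.

    Assumption: continuous operation at hours_per_month hours each month.
    """
    if operating_months <= 0 or hours_per_month <= 0:
        raise ValueError("operating_months and hours_per_month must be positive")
    return annual_feed_tonnes / (operating_months * hours_per_month)


ANNUAL_FEED = 360_000.0  # hypothetical annual harvest, tonnes

just_in_time = required_capacity(ANNUAL_FEED, 4)   # harvest window only
year_round = required_capacity(ANNUAL_FEED, 12)    # enabled by stable storage
ratio = just_in_time / year_round

print(f"just-in-time: {just_in_time:.1f} t/h, year-round: {year_round:.1f} t/h")
print(f"capacity ratio: {ratio:.1f}")
```

 Under these assumptions the just-in-time plant needs three times the hourly capacity for the same annual feed, which is the source of the capital cost difference noted above.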
 Also, in an embodiment employing a tarp, it is envisioned that soil or another medium is placed around and on the tarp edges to 1) provide weight for holding the tarp down and 2) act as a biofilter for the off-gas from the pile. In such an embodiment, biofilters are efficient for detoxification/degradation of organics and carbon monoxide. The prepared biomass material can also be stored as compressed modules, drive-over piles, bunkers, silos, bags, tubes, wrapped bales, or other anaerobic storage systems.
 In one embodiment, the off-gas stream from a pile of prepared biomass material was monitored, and it was found that only small levels of organics, and also very low levels of nitrogen oxides, were present. For example, Tables 5.1, 5.2, and 5.3 below show the analysis of various off-gas samples collected during the storage phase of one implementation of certain embodiments of the invention. The designation "BDL" refers to an amount below detectable limit. Summa and Tedlar refer to gas sampling containers commercially available.
TABLE 5.1
Container type    Container ID   % H2   % O2   % N2    % CH4   % CO2   % H2O   Normalized CO2
Tedlar bag        A              BDL    1.72   7.84    BDL     95.90   5.23    85.21
Tedlar bag        B              BDL    2.30   9.12    BDL     89.97   5.97    82.62
Tedlar bag        C              BDL    0.71   3.57    BDL     97.45   5.54    90.18
Tedlar bag        D              BDL    0.72   3.18    BDL     97.50   5.97    90.14
Tedlar bag        E              BDL    1.86   7.24    BDL     91.75   7.64    83.26
Summa Container   EQ #8          0.01   5.74   22.14   0.07    73.74   5.28    66.84
Summa Container   EQ #13         0.09   3.28   12.89   0.33    84.48   5.66    78.18
Summa Container   EQ #16         0.12   3.30   13.01   0.12    84.65   4.99    78.70
TABLE 5.2
Container type   Container ID   % O2   ppmv CO   % CO2   ppmv HC   ppmv NO   ppmv NO2   ppmv NOX   ppmv SO2
Tedlar bag       A              1.6    13        72.7    104       3.8       1.90       5.70       BDL
Tedlar bag       B              4.4    19        66.2    739       2.5       122.90     125.40     6
Tedlar bag       C              0.6    29        75.3    158       8.9       27.20      36.10      4
Tedlar bag       D              0.6    35        75.7    222       7.9       56.50      64.40      5
Tedlar bag       E              4.1    35        66.8    423       3.0       20.30      23.90      4
TABLE 5.3
Container type   Container ID   ppmv CH2O   ppmv C2H4O   ppmv methanol   ppmv 2-propanol   ppmv ethanol   ppmv propanol
Tedlar bag       A              386         870          63.4            0.593             78.5           BDL
Tedlar bag       B              BDL         1299         678             0.186             1065           15.2
Tedlar bag       C              18.2        590          89.2            2.784             171            6.098
Tedlar bag       D              BDL         941          170             3.031             264            7.648
Tedlar bag       E              BDL         819          389             2.512             634            11.3
 Embodiments of the present invention, although relatively uncontained in the bunker, should be environmentally benign. Even so, certain aspects of the present invention fit well with using soil or other media as a biofilter placed around and on the bunkers because the escape of gas from under the tarp is radial in nature. As such, the vapors have a higher amount of surface area in contact with the edges of the pile. In embodiments using a biofilter, vapor phase releases pass through the biofilter (such as soil or compost) placed near the edge mass before entering into the atmosphere. The biofilter retains many potential environmental pollutants and odors released by the storage pile, and it eliminates or greatly reduces the potentially harmful off-gases released from the storage pile.
 In one embodiment, the prepared biomass material is stored until it contains no more than about 80 wt % liquid. In one embodiment, the prepared biomass material is stored until its liquid content is at least about 4% to about 5% higher than the initial liquid content. At this stage, the wet stored biomass aggregate is not yet considered "beer" since it still contains over about 20% solids. In one embodiment, the prepared biomass material is stored until it contains between about 2 wt % and about 50 wt % ethanol, and preferably between about 4 wt % and about 10 wt % ethanol. The balance of the liquid is primarily water but can contain many other organic compounds, such as acetic acid, lactic acid, etc.
 Embodiments of the present invention allow the solid biomass to be harvested in a much shorter harvest window than typical sugar cane juicing operations, which allows for
 1) a much larger geographic area where the facilities could be placed,
 2) harvest of the crop when the crop has its highest yield potential,
 3) harvest of the crop at its highest sugar concentration potential,
 4) a shorter harvest window that is still economical, and
 5) removing the need to extract juice from the biomass for fermentation.
 VOC Recovery
 Once the prepared biomass material has been stored for the desired amount of time and/or contains a desired concentration of volatile organic compounds, such as ethanol, it can be routed to the VOC recovery system for recovery of particular volatile organic compounds. The recovery system and storage facility can be located any distance from one another. Embodiments of systems and methods described herein allow flexibility in the geographical location of both and their locations relative to each other. In a particular embodiment, the recovery system is located about 0.5 to about 2 miles from the storage facility. Any suitable method and/or equipment can be used to transfer the prepared biomass material from the storage facility to the recovery system. In one embodiment, a feed hopper is used. In one embodiment, a silage facer, a front end loader or payloader, a sweep auger, or other auger system can be used to place the prepared biomass material into the feed hopper. The material can be placed directly into the feed hopper or it can be transferred to the feed hopper by a conveyor system, such as a belt system. The feed hopper containing the prepared biomass material can then be driven to the recovery system.
 The recovery system is solventless and uses a superheated vapor stream to vaporize the liquid in the prepared biomass material into a gas component, which can then be collected. A superheated vapor is a vapor that is heated above its saturation temperature at the pressure of operation. In a preferred embodiment, after the recovery system reaches steady state, the superheated vapor stream comprises only vapor previously evaporated from the prepared biomass material, so that no other gas is introduced, thereby reducing the risk of combustion of the volatile organic compounds and/or dilution of the recovered product stream of volatile organic compounds. A portion of the vapor is removed as product and the remainder is recycled back for use in transferring heat to fresh incoming prepared biomass material. The remaining solid component is discharged from the system and can have various subsequent uses. The superheated vapor directly contacts the biomass, transferring energy and vaporizing the liquid present there. The heat or thermal energy source does not directly contact the prepared biomass material. Thus, the VOC recovery system can also be described as providing "indirect" heat contact.
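 The definition of a superheated vapor can be illustrated with a short check. The following Python sketch is an approximation only: it assumes the vapor behaves like pure water (the actual stream also contains volatile organic compounds, which shift the saturation temperature), and it uses commonly tabulated Antoine coefficients for water in the range of roughly 99 to 374° C. The function names are hypothetical.

```python
import math

# Antoine coefficients for water (pressure in mmHg, temperature in deg C),
# commonly tabulated for roughly 99-374 deg C.
# Assumption: the vapor is treated as pure water.
A, B, C = 8.14019, 1810.94, 244.485

MMHG_PER_PSI = 51.7149   # unit conversion
ATM_PSI = 14.696         # standard atmosphere in psi


def saturation_temp_c(pressure_psig):
    """Approximate saturation temperature of water at a gauge pressure."""
    p_mmhg = (pressure_psig + ATM_PSI) * MMHG_PER_PSI
    return B / (A - math.log10(p_mmhg)) - C


def is_superheated(vapor_temp_c, pressure_psig):
    """True if the vapor temperature exceeds the saturation temperature."""
    return vapor_temp_c > saturation_temp_c(pressure_psig)


# Hypothetical operating point: 150 deg C vapor at 40 psig.
print(round(saturation_temp_c(40.0), 1), is_superheated(150.0, 40.0))
```

 For a real water/VOC mixture, the saturation temperature would come from measured or modeled vapor-liquid equilibrium data rather than from this single-component approximation.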
 To provide solventless recovery of volatile organic compounds, the recovery system comprises a compartment that allows superheated vapor to flow in a continuous manner, i.e., as a stream. In one embodiment, the compartment has a loop shape. In another embodiment, the compartment is a rotating drum. The compartment has an inlet through which the prepared biomass material can enter. In one embodiment, the inlet comprises a pressure tight rotary valve, plug screw, or other similar device, which can assist in separating the prepared biomass material to increase the surface area exposed to the superheated vapor stream.
 In yet another embodiment, the system comprises a dewatering mechanism to remove at least a portion of the liquid in the prepared biomass material before the liquid is vaporized. The liquid removal can occur before and/or while the prepared biomass material enters the compartment. The liquid from the prepared biomass material contains at least one volatile organic compound, which can be recovered by further processing the liquid, such as by feeding the liquid to a distillation column. The liquid can be routed directly to a further processing unit, such as a distillation column. Alternatively or in addition, the system further includes a collection unit to collect the liquid removed from the prepared biomass material. Any portion of the collected liquid can then be further processed.
 In one embodiment, the dewatering mechanism comprises a component adapted to squeeze the liquid from the prepared biomass material. In such an embodiment, the squeezing can be performed while the prepared biomass material is being fed into the compartment. For instance, the inlet can comprise a squeezing mechanism to squeeze liquid from the prepared biomass material as it is introduced into the compartment. Alternatively or in addition, the squeezing can be performed separately before the prepared biomass material enters the compartment. A non-limiting example of such a squeezing mechanism is a screw plug feeder.
 In one embodiment, the liquid removal mechanism comprises a mechanical press. Non-limiting examples of types of mechanical presses include belt filter presses, V-type presses, ring presses, screw presses and drum presses. In a particular embodiment of a belt filter press, the prepared biomass material is sandwiched between two porous belts, which are passed over and under rollers to squeeze moisture out. In another particular embodiment, a drum press comprises a perforated drum with a revolving press roll inside it that presses material against the perforated drum. In yet another embodiment, in a bowl centrifuge, the material enters a conical, spinning bowl in which solids accumulate on the perimeter.
 The compartment provides a space where the superheated vapor stream can contact the prepared biomass material to vaporize the liquid from the prepared biomass material. The vaporization of at least a portion of the liquid provides a gas component and a solid component of the prepared biomass material. The system further comprises a separating unit where the solid component of the prepared biomass material can be separated from the gas component, so each component can be removed as desired for further processing. In one embodiment, the separating unit comprises a centrifugal collector. An example of such centrifugal collector is high efficiency cyclone equipment. In a preferred embodiment, the separating unit also discharges the solid component from the solventless recovery system. There is a separate outlet for the gas component where it can exit the system for further processing, such as distillation. In one embodiment, the separating unit is further coupled to a second pressure tight rotary valve or the like to extrude or discharge the solid component. In one embodiment, the superheated vapor is maintained at a desired temperature above its saturation temperature by a heat exchange component coupled to a heat source where the superheated vapor does not contact the heat source. The heat transfer between the heat source and the system occurs via convection to the superheated vapor. In one embodiment, the heat source can include electrical elements or hot vapors through an appropriate heat exchanger. In one embodiment, the operating pressure is in a range from about 1 psig to about 120 psig. In a preferred embodiment, the operating pressure is in a range from about 3 psig to about 40 psig. In a particularly preferred embodiment, the system is at an operating pressure of about 60 psig to force the vapor component from the system.
 In one embodiment, at start-up of the recovery system, the prepared biomass material is introduced into the compartment via the inlet. Steam is initially used as the superheated vapor to vaporize the liquid in the prepared biomass material. The superheated vapor continuously moves through the compartment. When the prepared biomass material enters the superheated vapor stream, it becomes fluidized and flows through the compartment like a fluid. As the prepared biomass material is introduced, it comes into contact with the superheated vapor stream. Heat from the superheated vapor is transferred to the prepared biomass material and vaporizes at least a portion of the liquid in the prepared biomass material; the vaporized liquid is separated from the solid component, which may still contain moisture. The gas component contains volatile organic compound(s) produced in the prepared biomass material. In a preferred embodiment, as liquid from the prepared biomass material begins to vaporize, at least a portion of the vaporized liquid can be recycled in the system as superheated fluid. That is, during any one cycle, at least a portion of the vaporized liquid remains in the compartment to serve as superheated vapor instead of being collected for further processing, until the next cycle where more prepared biomass material is fed into the system.
 In a preferred embodiment, during the initial start up procedure, the superheated fluid can be purged as needed, preferably continuously (intermittently or constantly), until steady state is achieved where the superheated vapor comprises only vaporized liquid of the prepared biomass material. The gas component and solid component can be collected via the respective outlet. Heat can be added continuously (intermittently or constantly) to the system via the heat exchanger coupled to the heat source to maintain the temperature of the superheated vapor, to maintain a desired operating pressure in the system, or to maintain a target vaporization rate. Various conditions of the system, such as flow rate of the superheated vapor stream, pressure, and temperature, can be adjusted to achieve the desired liquid and/or volatile organic compounds removal rate.
 In one embodiment, the collected gas component is condensed for further processing, such as being transferred to a purification process to obtain a higher concentration of the volatile organic compound(s) of choice. In a preferred embodiment, the collected gas component is fed directly into a distillation column, which provides savings of energy not used to condense the gas component. In another embodiment, the gas component is condensed and fed to the next purification step as liquid.
 In one embodiment, before entering the recovery phase, the prepared biomass material has an initial liquid content of at least about 10 wt % and up to about 80 wt % based on the biomass material. In a particular embodiment, the initial liquid content is at least about 50 wt % based on the biomass material. In one embodiment, the initial liquid content comprises from about 2 to about 50 wt %, and preferably from about 4 to about 10 wt %, ethanol based on the initial liquid content.
 In one embodiment, the solid component collected contains from about 5 wt % to about 70 wt %, and preferably from about 30 wt % to about 50 wt %, liquid depending on the ethanol removal target. In another embodiment, the collected gas component contains between about 1 wt % and about 50 wt % ethanol, preferably between about 4 wt % and about 15 wt % ethanol. In one embodiment, the recovery system recovers from about 50% to about 100% of the volatile organic compounds contained in the prepared biomass material. The residence time of the prepared biomass varies based on a number of factors, including the volatile organic compound removal target. In one embodiment, the residence time of the prepared biomass material in the compartment is in a range of about 1 to about 10 seconds. In one embodiment, the recovery system can be operated between about 0.06 barg and about 16 barg. The term "barg" refers to bar gauge as understood by one of ordinary skill in the art, and 1 bar equals 0.1 megapascal. In one embodiment, the gas in the recovery system has a temperature in a range of about 100° C. to about 375° C., particularly from about 104° C. to about 372° C., and the solid component exiting the system has a temperature of less than about 50° C. The collected solid component can be used in other applications. Non-limiting examples include animal feed, feed for a biomass burner to supply process energy or generate electricity, further conversion to ethanol by means of a cellulosic ethanol process (either re-fermentation in a silage pile or feed to a pre-treatment unit for any cellulosic ethanol process), or feed for any other bio-fuel process requiring lignocellulosic biomass.
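 Because operating pressures are stated in this description in both psig and barg, a small conversion helper may be useful when comparing ranges. The Python sketch below relies only on the standard unit definitions (1 bar = 100,000 Pa = 0.1 MPa and 1 psi = 6,894.757 Pa); the function names are hypothetical.

```python
PA_PER_BAR = 100_000.0   # 1 bar = 0.1 MPa
PA_PER_PSI = 6_894.757   # standard definition of the pound per square inch


def barg_to_psig(barg):
    """Convert a gauge pressure from bar to psi."""
    return barg * PA_PER_BAR / PA_PER_PSI


def psig_to_barg(psig):
    """Convert a gauge pressure from psi to bar."""
    return psig * PA_PER_PSI / PA_PER_BAR


def barg_to_mpa_gauge(barg):
    """Convert a gauge pressure from bar to megapascal."""
    return barg * PA_PER_BAR / 1e6


# The barg range stated above, expressed in psig and MPa (gauge).
for barg in (0.06, 16.0):
    print(f"{barg} barg = {barg_to_psig(barg):.2f} psig = {barg_to_mpa_gauge(barg):.3f} MPa")
```

 Gauge values convert by the unit ratio alone; an absolute pressure would additionally require adding atmospheric pressure.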
 The operating conditions of the solventless recovery system include at least one of temperature, pressure, flow velocity, and residence time. Any one or combination of these conditions can be controlled to achieve a desired removal target, such as the amount of the initial liquid content removed or the amount of liquid remaining in the separated solid component exiting the recovery system. In one embodiment, at least one operating condition is controlled to achieve removal of about 10-90 wt %, preferably about 45-65 wt %, and more preferably about 50 wt %, of the initial liquid content.
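 The relationship between the two removal targets mentioned above (fraction of initial liquid removed versus liquid remaining in the discharged solid) follows from a simple mass balance. The Python sketch below is a minimal illustration that assumes the dry solids are conserved and that no solids leave with the vapor; the function name and the sample values are hypothetical.

```python
def solid_liquid_content(initial_liquid_wt_pct, removal_pct):
    """Liquid content (wt %) of the discharged solid component after
    removing removal_pct of the initial liquid.

    Assumptions: dry solids are conserved and none leave with the vapor.
    """
    if not 0.0 <= initial_liquid_wt_pct < 100.0:
        raise ValueError("initial_liquid_wt_pct must be in [0, 100)")
    if not 0.0 <= removal_pct <= 100.0:
        raise ValueError("removal_pct must be in [0, 100]")
    liquid = initial_liquid_wt_pct / 100.0           # per unit mass of feed
    solids = 1.0 - liquid
    remaining = liquid * (1.0 - removal_pct / 100.0)
    return 100.0 * remaining / (solids + remaining)


# Hypothetical case: feed at 80 wt % liquid, 50 wt % of that liquid removed.
print(f"{solid_liquid_content(80.0, 50.0):.1f} wt % liquid in the solid component")
```

 In this hypothetical case, removing half of the liquid from a feed at 80 wt % liquid leaves a solid component at about two-thirds liquid by weight, because the total mass shrinks as liquid is removed.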
 In a preferred embodiment, increasing the temperature of the system at constant pressure will cause the liquid in the biomass to be vaporized more quickly and thus for a given residence time will cause a higher percentage of the liquid in the biomass to be evaporated. The vapor flow rate exiting the system has to be controlled to match the rate of vaporization of liquid from the biomass in order to achieve steady state and can also be used as a mechanism to control the system pressure. Increasing the system pressure will cause more energy to be stored in the vapor phase in the system which can then be used to aid in further processing or to help move the vapor to the next downstream processing unit. Increasing the biomass residence time in the system causes more heat to be transferred from the vapor phase to the biomass resulting in more liquid being vaporized.
 In a specific exemplary embodiment, the recovery system comprises a closed loop pneumatic superheated steam dryer, which can be obtained from commercially available sources. In one embodiment, the closed loop pneumatic superheated steam dryer is an SSD® model of GEA Barr-Rosin Inc. Other suitable commercially available equipment includes the Superheated Steam Processor, SSP®, from GEA Barr-Rosin Inc.; the Ring Dryer from several companies including GEA Barr-Rosin Inc. and Dupps; the Airless Dryer from Dupps; the QuadPass® Rotary Drum Dryer from Dupps; the Evactherm® Vacuum Superheated Steam Drying from Eirich; the rotary drum dryer using superheated vapor from Swiss Combi Ecodry; and the airless dryer from Ceramic Drying Systems Ltd.
 Still other types of indirect dryers that could serve as the volatile organics recovery unit for this process are batch tray dryers, indirect-contact rotary dryers, rotating batch vacuum dryers, and agitated dryers. The basic principle for these dryers is that they will be enclosed and attached to a vacuum system to remove vapors from the solids as they are generated (also by lowering the pressure with the vacuum the volatiles are removed more easily). The wet solids contact a hot surface such as trays or paddles, the heat is transferred to the wet solids causing the liquids to evaporate so they can be collected in the vacuum system and condensed.
 FIG. 1 illustrates an exemplary VOC recovery system and process employing a superheated steam dryer, referenced as system 100. In a particular embodiment, the superheated steam dryer can be obtained from GEA Barr-Rosin Inc. In FIG. 1, prepared biomass material 1 containing ethanol and/or other VOCs following solid state fermentation in the silage piles is fed into compartment 3 through input 2. In the particular embodiment shown, input 2 comprises a screw extruder. As shown in FIG. 1, at least a portion of the liquid of the prepared biomass material 1 is removed prior to entering compartment 3. The dewatering mechanism can be a screw plug feeder through which the prepared biomass material 1 passes. At least a portion of the liquid removed from biomass material 1 can be routed directly to distillation step 11 via stream 15 without going through recovery system 100. Optionally, a delumper coupled to the output of the dewatering mechanism can be used to facilitate introduction of the dewatered biomass material into compartment 3.
 Referring to FIG. 1, recovery system 100 comprises compartment 3, which can be pressurized, shown as a conduit that has an appropriate diameter, length and shape, adapted to provide the desired operating conditions, such as residence time of prepared biomass material 1, heat transfer to the superheated vapor, and operating pressure and temperature. After entering compartment 3, during steady state operation, prepared biomass material 1 contacts superheated vapor flowing through system 100 at a desired temperature and becomes fluidized. As described above, in a preferred embodiment, the superheated vapor, or at least a portion thereof, is vapor component obtained from prepared biomass materials previously fed into system 100 for VOC recovery. The fluidized biomass flows through compartment 3 at a target flow rate and remains in contact with the superheated vapor for a target residence time sufficient to evaporate the desired amount of liquid from prepared biomass material 1. In the embodiment shown, the flow of the superheated vapor and prepared biomass material 1 through system 100 is facilitated by system fan 14. System 100 can have one or more fans. The flow rate or velocity of the superheated vapor and biomass material 1 can be controlled by system fan 14. Biomass material 1 flows through compartment 3 and reaches separating unit 4, which is preferably a cyclone separator, where a vapor component and a solid component of biomass material 1 are separated from each other. As shown, the vapor component is routed away from the solid component via overhead stream 5 and the remaining portion of biomass material 1 is considered a solid component, which is discharged from separating unit 4 as solid component 7, preferably by screw extruder 6. At least a portion of the discharged solid component 7 can be used as animal feed, burner fuel, or biomass feedstock for other bio-fuels processes. 
For example, at least a portion of solid component 7 can serve as feedstock for process 400 that further processes lignocellulosic material contained in solid component 7. Process 400 is illustrated in FIG. 3 and correspondingly further discussed below. Referring to FIG. 1, a portion of the vapor component, referenced as stream 8, is retained and recycled as a portion of the superheated vapor used to vaporize newly introduced prepared biomass material. In the embodiment shown, the retained vapor component in stream 8 is routed through heat exchanger 9 to heat it to the target operating temperature. The heat source can include steam, electricity, hot flue gases or any other applicable heating source known to those skilled in the art.
 In a preferred embodiment, the temperature is controlled such that the pressure in the system is maintained at the target and there is adequate energy present to evaporate the desired amount of liquid. The pressure can also be controlled by the flow rate of the superheated vapor stream and the heat input to heat exchanger 9. Preferably, recovery system 100 operates continuously, where prepared biomass material 1 is continuously fed at a desired rate, and vapor component 10 and solid component 7 are continuously removed. In a preferred embodiment, "fresh" vapor component 8 from one run is retained continuously at a target rate to be used as the superheated vapor stream for the next run. Any of these rates is adjustable to achieve the desired operating conditions. As mentioned, system fan 14 circulates the superheated vapor stream through system 100 and can be adjusted to obtain the target flow rate or velocity.
 Referring to FIG. 1, the remaining portion of vapor component stream 5, represented as numeral 10 is routed to a distillation step 11. Depending on the distillation configuration, vapor component portion 10 may be condensed before further purification or preferably fed directly into the distillation column as a vapor. In a preferred embodiment, the distillation product from distillation step 11 has an ethanol content of about 95.6 wt % ethanol (the ethanol/water azeotrope), which can further be purified to above about 99 wt % using common ethanol dehydration technology, which is shown as step 12. The final ethanol product 13 will then typically be used as a biofuel for blending with gasoline.
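 The dehydration duty implied by the concentrations above can be estimated with a one-line mass balance. The Python sketch below assumes that only water is removed in dehydration step 12 and that no ethanol is lost; the function name is hypothetical, and the concentrations are the approximate values stated above.

```python
def water_removed_per_kg(feed_ethanol_wt_pct, product_ethanol_wt_pct):
    """Mass of water (kg) removed per kg of feed to raise the ethanol
    concentration from the feed value to the product value.

    Assumptions: only water is removed and no ethanol is lost.
    """
    x_in = feed_ethanol_wt_pct / 100.0
    x_out = product_ethanol_wt_pct / 100.0
    if not 0.0 < x_in <= x_out <= 1.0:
        raise ValueError("require 0 < feed <= product <= 100 wt %")
    product_mass = x_in / x_out   # kg product per kg feed (ethanol conserved)
    return 1.0 - product_mass


# From the azeotrope (about 95.6 wt %) to about 99 wt % ethanol.
print(f"{water_removed_per_kg(95.6, 99.0):.4f} kg water removed per kg feed")
```

 Under these assumptions, only about 3 to 4% of the distillate mass needs to be removed as water in the dehydration step.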
 FIG. 2 illustrates another exemplary recovery system and process employing a superheated steam dryer, referenced as system 200, that is representative of the Ring Dryer provided by various manufacturers. Prepared biomass material 201 is fed into system 200 through input 202, which preferably comprises a screw extruder. In one embodiment, at least a portion of the liquid of the prepared biomass material 201 is removed prior to entering system 200. The dewatering mechanism can be a screw plug feeder through which the prepared biomass material 201 passes. At least a portion of the liquid removed from biomass material 201 can be routed directly to distillation step 211 via stream 215 without going through recovery system 200. Optionally, a delumper coupled to the output of the dewatering mechanism can be used to facilitate introduction of the dewatered biomass material into compartment 203.
 Referring to FIG. 2, recovery system 200 comprises compartment 203, which preferably comprises a rotating drum that provides the target operating conditions for VOC recovery, including residence time of prepared biomass material 201, heat transfer to the superheated vapor, and operating pressure and temperature. After entering compartment 203, during steady state operation, prepared biomass material 201 contacts superheated vapor flowing through system 200 at the operating temperature and flow rate and becomes fluidized. As described above, in a preferred embodiment, the superheated vapor, or at least a portion thereof, is the vapor component obtained from prepared biomass material previously fed into system 200 for VOC recovery. The fluidized biomass flows through compartment 203 at a target flow rate and remains in contact with the superheated vapor for the target residence time to achieve the target vaporization of liquid from the biomass. The fluidized biomass then reaches separating unit 204, which is preferably a cyclone separator, where the vapor component and solid component are separated from each other. As shown, the vapor component is routed away from the solid component through overhead stream 205, and solid component 207 is discharged from separating unit 204. As shown, solid component 207 exits system 200 via extruder 206 and at least a portion of it can serve as feedstock for process 400, which further processes lignocellulosic material contained in solid component 207. Process 400 is illustrated in FIG. 3 and correspondingly further discussed below. Solid component 207 can be directly routed to process 400. Alternatively or in addition, solid component 207 can be transported to be fed into process 400. A portion of the vapor component, referenced as stream 208, is retained and recycled as a portion of the superheated vapor used to vaporize newly introduced prepared biomass material.
As shown, retained vapor component 208 is routed through heat exchanger 209 to heat it to the desired temperature. The heat source or thermal energy source can include steam, electricity, hot flue gases or any other desired heating source. As shown, hot flue gas is used. The temperature is controlled such that the pressure in the system is maintained at the target and there is adequate energy present to evaporate the desired amount of liquid. The pressure can also be controlled by the flow rate of the superheated vapor stream and the heat input to heat exchanger 209.
 Referring to FIG. 2, the remaining portion of vapor component stream 205, represented as numeral 210 is routed to a distillation step. Depending on the distillation configuration, vapor component portion 210 may be condensed before further purification or preferably fed directly into the distillation column as a vapor. The product from the distillation step can further be concentrated using known processes.
 Preferably, recovery system 200 operates continuously, where prepared biomass material 201 is continuously fed at a desired rate, and vapor component 210 and solid component 207 are continuously removed. In a preferred embodiment, "fresh" vapor component 208 from one run is retained continuously at a target rate to be used as the superheated vapor stream for the next run. All these rates are adjustable to achieve the desired operating conditions. System fan 214 creates a circulating loop of superheated vapor stream and can be adjusted to obtain the target flow rate.
 By using a solventless recovery system according to aspects of the present invention, the points of heat transfer in the system, i.e., addition of heat to the system and heat transfer to the prepared biomass material, take place in the vapor phase in a preferred embodiment. This provides an advantage because vapor phase heat transfer (convection) is more efficient than solid phase heat transfer (conduction) in the prepared biomass material, which is a poor conductor because it has insulating properties. As mentioned above, in certain embodiments, once steady state is reached, no vapor other than that vaporized from the liquid of the prepared biomass material contacts the solid component and gas component of the prepared biomass material in the system, which prevents or reduces dilution that would come from the addition of process steam or other vapor to replenish the superheated vapor stream. The collected gas component can be fed directly to a distillation column for separation of the desired volatile organic compound(s), which can provide significant energy savings. The advantage of this system is that the vapors that contact the wet solids are only those vapors that have been previously removed from the solids, so that dilution and explosion risk are avoided or reduced.
 Further Processing of Lignocellulosic Material
 Referring to FIGS. 1 and 2, at least a portion of the solid component, such as solid component 7 or solid component 207, discharged from the recovery system, such as system 100 or 200, can serve as feedstock to further processing system 400 and be further processed to generate fermentable sugars. The solid component serving as feedstock to further processing system 400 may be referred to as "bio-based feedstock," "solid component feedstock," or "biomass feedstock." Further processing system 400 treats the lignocellulosic material in the solid component to generate fermentable sugars that can be used in subsequent reactions, such as additional fermentation. In a preferred embodiment, the further processing system, such as system 400, is located near the VOC recovery system, such as system 100 or 200, and is coupled to the VOC recovery system so that at least a portion of the solid component discharged from the recovery system is directly routed as feedstock to further processing system 400, which is preferably operated in a continuous or semi-continuous flow mode. In that preferred embodiment, the solid component feedstock is entrained, i.e., it is already flowing in an engineered system, instead of requiring a mechanism to take it from storage and introduce it to the further processing system. Further, embodiments that couple the VOC recovery system to the further processing system can allow for production of volatile organic compounds from various sources, e.g., readily available fermentable sugars and lignocellulosic material, at one site, which reduces the storage, handling, and transportation costs associated with other feedstock sources that are not already in an entrained system.
Such embodiments can also provide a continuous supply of feedstock that is already particle size reduced in contrast to conventional feedstock that often requires storage, transportation, and/or size reduction at or prior to arriving at the facility for additional processing of lignocellulosic material, which reduces the particular associated costs. Alternatively or in addition, the solid component can be transported to other further processing systems located at a different location. The solid component can be pelletized or further formatted to facilitate transport and/or reduce transportation costs. In embodiments of the invention, the solid component is already particle size reduced, which reduces the cost and difficulties of pelletization or other formatting processes as compared to other feedstock sources.
 In certain embodiments, the further processing comprises contacting at least a portion of the solid component with a solution adapted to facilitate saccharification. The term "saccharification" has its ordinary meaning, which refers at least to the process of converting a complex carbohydrate (such as starch or cellulose) into simple or fermentable sugars. Any saccharification process or any combination of saccharification processes can be used, such as chemical and/or enzymatic. FIG. 3 provides two exemplary saccharification routes for lignocellulosic material: one via concentrated acid hydrolysis and the other via pretreatment and enzymatic hydrolysis. In a preferred embodiment, the saccharification process comprises pretreating the solid component feedstock for subsequent enzymatic hydrolysis. It is understood that the pretreatment of the solid component feedstock can also result in partial or at least some saccharification. Pretreatment is preferred because lignocellulose is recalcitrant to enzymatic hydrolysis owing to its structural complexity. Pretreatment of the solid component feedstock can improve its enzymatic digestibility, typically by removing hemicellulose and making the cellulose more accessible to cellulase enzymes. A variety of chemical and mechanical pretreatment methods are contemplated, including but not limited to dilute acid, hot-water, ammonia, alkali, SPORL, steam explosion, ionic liquid, organosolv, etc., which have been well described in the literature (see, e.g., Zhu and Pan (2010), Bioresource Technology, 101:4992-5002; Hendriks and Zeeman (2009), Bioresource Technology, 100:10-18, the disclosures of both articles being herein incorporated by reference in their entireties for all purposes).
 For example, in one embodiment, pretreatment comprises using hot water in a range from about 170 degrees C. to about 200 degrees C. In another embodiment, pretreatment comprises using a high temperature, dilute-sulfuric acid process, which effectively hydrolyzes the hemicellulosic portion of the biomass to soluble sugars and exposes the cellulose so that enzymatic saccharification can be successful. In one embodiment, the temperature of the pretreatment with the dilute acid solution is in a range from about 140 degrees C. to about 170 degrees C. The parameters which can be employed to control the conditions of the dilute acid pretreatment include time, temperature, and acid loading. These are often combined in a mathematical equation termed the combined severity factor. In general, the higher the acid loading employed, the lower the temperature that can be employed in the pretreatment. Conversely, the lower the temperature used, the longer the pretreatment process takes.
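 The specification does not reproduce the equation for the combined severity factor. As an illustration only, the sketch below assumes a form commonly cited in the pretreatment literature, CSF = log10(t·exp((T − 100)/14.75)) − pH, with t in minutes and T in degrees C. The function names and the example conditions are hypothetical and are not taken from this disclosure.

```python
import math

def combined_severity_factor(time_min: float, temp_c: float, ph: float) -> float:
    """Assumed literature form: CSF = log10(t * exp((T - 100) / 14.75)) - pH."""
    if time_min <= 0:
        raise ValueError("time_min must be positive")
    severity = time_min * math.exp((temp_c - 100.0) / 14.75)
    return math.log10(severity) - ph

def time_for_target_csf(target_csf: float, temp_c: float, ph: float) -> float:
    """Invert the assumed equation: minutes needed to reach target_csf."""
    return 10 ** (target_csf + ph) / math.exp((temp_c - 100.0) / 14.75)

if __name__ == "__main__":
    # Hypothetical conditions within the 140 to 170 degrees C range above.
    for temp in (140.0, 155.0, 170.0):
        minutes = time_for_target_csf(target_csf=1.5, temp_c=temp, ph=1.0)
        print(f"{temp:.0f} C: {minutes:.1f} min to reach CSF 1.5 at pH 1.0")
```

 Under this assumed form, a lower pH (higher acid loading) or a longer time compensates for a lower temperature, which is consistent with the trade-offs described above.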
 In one embodiment, further processing system 400 further includes subjecting at least a portion of the pretreated product to enzymatic hydrolysis to generate additional fermentable sugars. Additional information regarding enzymatic hydrolysis is provided below. In a particular embodiment, the fermentable sugars from further processing of lignocellulosic material can then be fermented using a variety of microbes as described herein, for example, a microbe adapted to produce a hydrocarbon. This can generally be referred to as lignocellulosic fermentation.
 Referring to FIGS. 1 and 2, in one embodiment, at least a portion of liquid from the lignocellulosic fermentation, which contains VOCs, can be routed via stream 430 to join distillation process 11 or 211 of vapor component 10 or 210 and/or liquid product 15 or 215 recovered from prepared biomass 1 or 201 using solventless recovery system 100 or 200 as described above. Likewise, the VOCs in at least a portion of any solid material from the lignocellulosic fermentation in further processing 400 can be recovered using the solventless recovery system 100 or 200, as indicated by stream 432. Accordingly, certain embodiments of the invention can provide for an integrated overall system for generation of VOCs from readily available fermentable sugars in biomass, recovery of those VOCs, processing lignocellulosic material from the first round of fermentation and recovery, generating additional VOCs from lignocellulosic material, and recovery of same. Such a system in those embodiments does not require additional equipment cost, and thus capital investment, because the same equipment can be used for all VOC production.
 In a particularly preferred embodiment, an acid solution comprising at least one α-hydroxysulfonic acid is used. The α-hydroxysulfonic acid is effective for hydrolyzing the biomass to fermentable sugars, such as pentoses like xylose, at lower temperature, e.g., about 100° C. for α-hydroxymethane sulfonic acid or α-hydroxyethane sulfonic acid, producing little to no furfural in the process. A portion of the cellulose has also been shown to hydrolyze under these comparatively mild conditions. It has been found that other polysaccharides such as starch are also readily hydrolyzed to component sugars by α-hydroxysulfonic acids. Further, the α-hydroxysulfonic acid is reversible to readily removable and recyclable materials, unlike mineral acids such as sulfuric, phosphoric, or hydrochloric acid. The lower temperatures and pressures employed in the biomass treatment lead to lower equipment cost. Biomass pretreated in this manner has been shown to be highly susceptible to additional saccharification, especially enzyme mediated saccharification.
 The alpha-hydroxysulfonic acids of the general formula
 R1R2C(OH)SO3H
 where R1 and R2 are individually hydrogen or hydrocarbyl with up to about 9 carbon atoms that may or may not contain oxygen can be used in the treatment of the instant invention. The alpha-hydroxysulfonic acid can be a mixture of the aforementioned acids. The acid can generally be prepared by reacting at least one carbonyl compound or precursor of carbonyl compound (e.g., trioxane and paraformaldehyde) with sulfur dioxide or precursor of sulfur dioxide (e.g., sulfur and oxidant, or sulfur trioxide and reducing agent) and water according to the following general equation 1.
 R1R2C═O + SO2 + H2O ⇌ R1R2C(OH)SO3H (equation 1)
where R1 and R2 are individually hydrogen or hydrocarbyl with up to about 9 carbon atoms or a mixture thereof.
 Illustrative examples of carbonyl compounds useful to prepare the alpha-hydroxysulfonic acids include
 R1═R2═H (formaldehyde)
 R1═H, R2═CH3 (acetaldehyde)
 R1═H, R2═CH2CH3 (propionaldehyde)
 R1═H, R2═CH2CH2CH3 (n-butyraldehyde)
 R1═H, R2═CH(CH3)2 (i-butyraldehyde)
 R1═H, R2═CH2OH (glycolaldehyde)
 R1═H, R2═CHOHCH2OH (glyceraldehyde)
 R1═H, R2═C(═O)H (glyoxal)
 R1═R2═CH3 (acetone)
 R1═CH2OH, R2═CH3 (acetol)
 R1═CH3, R2═CH2CH3 (methyl ethyl ketone)
 R1═CH3, R2═CHC(CH3)2 (mesityl oxide)
 R1═CH3, R2═CH2CH(CH3)2 (methyl i-butyl ketone)
 R1, R2═(CH2)5 (cyclohexanone) or
 R1═CH3, R2═CH2Cl (chloroacetone)
 The carbonyl compounds and their precursors can be a mixture of the compounds described above. For example, the mixture can be a carbonyl compound or a precursor such as, for example, trioxane, which is known to thermally revert to formaldehyde at elevated temperatures, or an alcohol that may be converted to the aldehyde by dehydrogenation by any known method. An example of such a conversion to aldehyde from alcohol is described below. An example of a source of carbonyl compounds may be a mixture of hydroxyacetaldehyde and other aldehydes and ketones produced from fast pyrolysis oil such as described in "Fast Pyrolysis and Bio-oil Upgrading, Biomass-to-Diesel Workshop", Pacific Northwest National Laboratory, Richland, Washington, Sep. 5-6, 2006. The carbonyl compounds and their precursors can also be a mixture of ketones and/or aldehydes with or without alcohols that may be converted to ketones and/or aldehydes, preferably in the range of 1 to 7 carbon atoms.
 The preparation of alpha-hydroxysulfonic acids by the combination of an organic carbonyl compound, SO2 and water is a general reaction and is illustrated in equation 2 for acetone.
 (CH3)2C═O + SO2 + H2O ⇌ (CH3)2C(OH)SO3H (equation 2)
 The alpha-hydroxysulfonic acids appear to be as strong as, if not stronger than, HCl since an aqueous solution of the adduct has been reported to react with NaCl freeing the weaker acid, HCl (see U.S. Pat. No. 3,549,319). The reaction in equation 1 is a true equilibrium, which results in facile reversibility of the acid. That is, when heated, the equilibrium shifts towards the starting carbonyl, sulfur dioxide, and water (component form). If the volatile components (e.g., sulfur dioxide) are allowed to depart the reaction mixture via vaporization or other methods, the acid reaction completely reverses and the solution becomes effectively neutral. Thus, by increasing the temperature and/or lowering the pressure, the sulfur dioxide can be driven off and the reaction completely reverses due to Le Chatelier's principle. The fate of the carbonyl compound is dependent upon the nature of the material employed. If the carbonyl is also volatile (e.g., acetaldehyde), this material is also easily removed in the vapor phase. Carbonyl compounds such as benzaldehyde, which are sparingly soluble in water, can form a second organic phase and be separated by mechanical means. Thus, the carbonyl can be removed by conventional means, e.g., continued application of heat and/or vacuum, steam and nitrogen stripping, solvent washing, centrifugation, etc. Therefore, the formation of these acids is reversible in that as the temperature is raised, the sulfur dioxide and/or aldehyde and/or ketone can be flashed from the mixture and condensed or absorbed elsewhere in order to be recycled. It has been found that these reversible acids, which are approximately as strong as strong mineral acids, are effective in biomass treatment reactions. It has also been found that these treatment reactions produce very few of the undesired byproducts, such as furfurals, produced by other conventional mineral acids. 
Additionally, since the acids are effectively removed from the reaction mixture following treatment, neutralization with base and the formation of salts to complicate downstream processing is substantially avoided. The ability to reverse and recycle these acids also allows the use of higher concentrations than would otherwise be economically or environmentally practical. As a direct result, the temperature employed in biomass treatment can be reduced to diminish the formation of byproducts such as furfural or hydroxymethylfurfural.
 In some embodiments, the reactions described are carried out in any system of suitable design, including systems comprising continuous-flow (such as CSTR and plug flow reactors), batch, semi-batch or multi-system vessels and reactors and packed-bed flow-through reactors. For reasons strictly of economic viability, it is preferable that the invention is practiced using a continuous-flow system at steady-state equilibrium. One advantage of the process, in contrast with dilute acid pretreatment reactions where residual acid is left in the reaction mixture (<1% wt. sulfuric acid), is that the lower temperatures employed using these acids (10 to 20% wt.) result in substantially lower pressures in the reactor, which can allow potentially less expensive processing systems such as plastic lined reactors, duplex stainless reactors, and 2205 type reactors.
 In one embodiment (not shown), at least a portion of product stream of treated lignocellulosic material can further be subject to enzymatic hydrolysis to generate additional fermentable sugars. Additional information regarding enzymatic hydrolysis is further provided below. In a particular embodiment, the fermentable sugars from further processing of lignocellulosic material can then be fermented using a variety of microbes as described above to generate a plurality of volatile organic compounds, including hydrocarbon precursor compounds that can be converted to hydrocarbons. This can generally be referred to as lignocellulosic fermentation.
 In some embodiments, a plurality of reactor vessels may be used to carry out the hydrolysis reaction. These vessels may have any design capable of carrying out a hydrolysis reaction. Suitable reactor vessel designs can include, but are not limited to, batch, trickle bed, co-current, counter-current, stirred tank, or fluidized bed reactors. Staging of reactors can be employed to achieve the optimal or desired economical solution. The remaining biomass feedstock solids may then be optionally separated from the liquid stream to allow more severe processing of the recalcitrant solids or pass directly within the liquid stream to further processing that may include enzymatic hydrolysis, fermentation, extraction, distillation and/or hydrogenation. In another embodiment, a series of reactor vessels may be used with an increasing temperature profile so that a desired sugar fraction is extracted in each vessel. The outlet of each vessel can then be cooled prior to combining the streams, or the streams can be individually fed to the next reaction for conversion.
 Suitable reactor designs can include, but are not limited to, a backmixed reactor (e.g., a stirred tank, a bubble column, and/or a jet mixed reactor), which may be employed if the viscosity and characteristics of the partially digested bio-based feedstock and liquid reaction media are sufficient to operate in a regime where bio-based feedstock solids are suspended in an excess liquid phase (as opposed to a stacked pile digester). It is also conceivable that a trickle bed reactor could be employed with the biomass present as the stationary phase and a solution of alpha-hydroxysulfonic acid passing over the material.
 The treatment reaction product contains fermentable sugar or monosaccharides, such as pentose and/or hexose that is suitable for further processing.
 In one embodiment, the product stream from any pretreatment process can further be hydrolyzed by other methods, for example by enzymes to further hydrolyze the biomass to sugar products containing pentose and hexose (e.g., glucose) and fermented to produce alcohols such as disclosed in US Publication No. 2009/0061490 and U.S. Pat. No. 7,781,191 which disclosures are hereby incorporated by reference.
 The process can be carried out with any type of cellulase enzymes, regardless of their source. Non-limiting examples of cellulases which may be used include those obtained from fungi of the genera Aspergillus, Humicola, Trichoderma, Myceliophthora, and Chrysosporium and from bacteria of the genera Bacillus, Thermobifida and Thermotoga. In some embodiments, the filamentous fungal host cell is an Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, or Trichoderma cell.
 The cellulase enzyme dosage is chosen to convert the cellulose of the pretreated feedstock to glucose. For example, an appropriate cellulase dosage can be about 0.1 to about 40.0 Filter Paper Unit(s) (FPU or IU) per gram of cellulose, or any amount there between. The term Filter Paper Unit(s) refers to the amount of enzyme required to liberate 2 mg of reducing sugar (e.g., glucose) from a 50 mg piece of Whatman No. 1 filter paper in 1 hour at 50° C. at approximately pH 4.8.
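 As an illustration of the dosage arithmetic described above, the following sketch computes the total enzyme activity required for a given mass of cellulose and the corresponding volume of an enzyme preparation. The function names, the preparation activity, and the example quantities are hypothetical; only the dosage range of about 0.1 to about 40.0 FPU per gram of cellulose comes from the description.

```python
def total_fpu_required(cellulose_g: float, dose_fpu_per_g: float) -> float:
    """Total Filter Paper Units needed for a given mass of cellulose."""
    if cellulose_g < 0 or dose_fpu_per_g < 0:
        raise ValueError("inputs must be non-negative")
    return cellulose_g * dose_fpu_per_g

def dose_in_described_range(dose_fpu_per_g: float) -> bool:
    """True if the dose falls in the exemplary 0.1 to 40.0 FPU/g range."""
    return 0.1 <= dose_fpu_per_g <= 40.0

def enzyme_volume_ml(total_fpu: float, activity_fpu_per_ml: float) -> float:
    """Volume of a preparation with a known (assumed) activity per mL."""
    if activity_fpu_per_ml <= 0:
        raise ValueError("activity must be positive")
    return total_fpu / activity_fpu_per_ml

if __name__ == "__main__":
    # Hypothetical batch: 1000 g cellulose dosed at 15 FPU/g,
    # using a preparation assumed to contain 60 FPU/mL.
    fpu = total_fpu_required(1000.0, 15.0)
    print(fpu, enzyme_volume_ml(fpu, 60.0), dose_in_described_range(15.0))
```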
 In practice, the hydrolysis may be carried out in a hydrolysis system, which may include a series of hydrolysis reactors. The number of hydrolysis reactors in the system depends on the cost of the reactors, the volume of the aqueous slurry, and other factors. The enzymatic hydrolysis with cellulase enzymes produces an aqueous sugar stream (hydrolyzate) comprising glucose, unconverted cellulose, lignin and other sugar components. The hydrolysis may be carried out in two stages (see U.S. Pat. No. 5,536,325, which is incorporated herein by reference), or may be performed in a single stage.
 In one embodiment, the treated solid component comprising fermentable sugars can then be fermented by one or more microorganisms to produce a fermentation broth comprising the desired chemical. In the lignocellulosic fermentation system, any one of a number of known microorganisms may be used to convert sugar to the desired fermentation products. The microorganisms convert sugars, including, but not limited to, glucose, mannose and galactose present in the treated solid component or hydrolysate, to a fermentation product. A particular fermentation product is a hydrocarbon. However, other compounds can be generated by adding the appropriate organism.
 In one embodiment, the lignocellulosic fermentation comprises using a microorganism adapted to produce a hydrocarbon. An example of such a microorganism is disclosed in PCT Application No. PCT/EP2013/053600, the disclosure of which is incorporated herein by reference.
 In one embodiment, a recombinant host cell, such as a recombinant micro-organism is adapted to express at least one of the following enzymes: a fatty acid reductase (LuxC), a fatty acyl transferase (LuxD), a fatty aldehyde synthetase (LuxE), and an aldehyde decarbonylase. Coexpression of the fatty acid reductase, fatty aldehyde synthetase, fatty acyl transferase, and aldehyde decarbonylase enzymes can be collectively referred to as CEDDEC. At least one fatty acid reductase, at least one fatty aldehyde synthetase, and at least one fatty acyl transferase forms a fatty acid reductase complex. Contact of a fatty acid substrate with a fatty aldehyde synthetase forms a fatty acid aldehyde. Contact of the fatty acid aldehyde with at least one aldehyde decarbonylase forms a hydrocarbon. The host cell may be recombinant and may, for example, be a genetically modified microorganism to express at least one, and preferably all, of these enzymes: a fatty acid reductase, fatty aldehyde synthetase, a fatty acyl transferase, and an aldehyde decarbonylase. These enzymes may each be expressed by a recombinant host cell, either within the same host cell or in separate host cells. The hydrocarbon may be secreted from the host cell in which it is formed. Hydrocarbons produced can include alkanes and alkenes of the appropriate chain length for diesel or aviation fuel, namely tridecane, pentadecane, pentadecene, hexadecene, heptadecane, and heptadecene.
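 The relationship between the fatty acid substrate and the hydrocarbon product can be sketched as follows. The sketch assumes, as is generally understood for aldehyde decarbonylases, that the hydrocarbon has one carbon atom fewer than the fatty aldehyde from which it is formed. The function names and the lookup table are hypothetical and list only the saturated products named above.

```python
# Alkane names for the saturated products named in the description.
ALKANE_NAMES = {13: "tridecane", 15: "pentadecane", 17: "heptadecane"}

def hydrocarbon_carbons(fatty_acid_carbons: int) -> int:
    """Carbon count of the hydrocarbon formed from a Cn fatty acid.

    Assumes fatty acid -> fatty aldehyde keeps n carbons and the
    decarbonylase step removes one carbon.
    """
    if fatty_acid_carbons < 2:
        raise ValueError("fatty acid must have at least 2 carbons")
    return fatty_acid_carbons - 1

def product_name(fatty_acid_carbons: int) -> str:
    """Name of the saturated product, or a generic label if not tabulated."""
    carbons = hydrocarbon_carbons(fatty_acid_carbons)
    return ALKANE_NAMES.get(carbons, f"C{carbons} hydrocarbon")

if __name__ == "__main__":
    for n in (14, 16, 18):
        print(f"C{n} fatty acid -> {product_name(n)}")
```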
 In one embodiment, at least some of the fatty acid substrate is obtainable by contacting a fatty acyl-ACP with at least one acyl-ACP thioesterase. The term acyl-ACP thioesterase refers to an enzyme in class EC 3.1.2.14, capable of catalysing the release of free fatty acid from fatty acyl-ACP. The acyl-ACP thioesterase may be, for example, a polypeptide having at least 50% sequence identity to SEQ ID NO:5 (thioesterase protein from Cinnamomum camphora).
 In one embodiment, at least some of the fatty acyl-ACP is obtainable by contacting a keto acyl CoA and a malonyl-ACP with at least one 3-ketoacyl-ACP synthase III (KASIII). This is an enzyme in class EC 2.3.1.180, capable of catalysing the reaction of a keto acyl CoA and a malonyl-ACP to form fatty acyl-ACP. The 3-ketoacyl-ACP synthase III may be a polypeptide having at least 50% sequence identity to SEQ ID NO:6 (Bacillus subtilis enzyme KASIII).
 In this embodiment, at least some of the keto acyl-CoA may be obtainable by contacting a keto acid with a branched-chain ketodehydrogenase complex. This is an enzyme or complex of enzymes capable of catalysing the conversion of a keto acid to a keto acyl-CoA. For example, the branched-chain ketodehydrogenase complex may comprise a polypeptide in class EC 1.2.4.4 (for example having at least 50% sequence identity to SEQ ID NO:7; B. subtilis BCKD subunit E1α) and a further polypeptide in class EC 1.2.4.4 (for example having at least 50% sequence identity to SEQ ID NO:8; B. subtilis BCKD subunit E1β) and a polypeptide in class EC 2.3.1.168 (for example having at least 50% sequence identity to SEQ ID NO:9; B. subtilis BCKD subunit E2) and a polypeptide in class EC 1.8.1.4 (for example having at least 50% sequence identity to SEQ ID NO:10; B. subtilis BCKD subunit E3). In an embodiment, the branched-chain ketodehydrogenase complex is a single polypeptide comprising all of the amino acid sequences SEQ ID NOs:7-10.
 These other enzymes described herein (i.e., an acyl-ACP thioesterase and/or a 3-ketoacyl-ACP synthase III and/or a branched-chain ketodehydrogenase complex) may also be expressed by a micro-organism. Preferably, the enzymes are exogenous, i.e., not present in the cell prior to modification, having been introduced using microbiological methods such as are described herein. Furthermore, the enzymes may each be expressed by a recombinant host cell, either within the same host cell or in separate host cells. The hydrocarbon may be secreted from the host cell in which it is formed.
 The host cell may be genetically modified by any manner known to be suitable for this purpose by the person skilled in the art. This includes the introduction of the genes of interest on a plasmid or cosmid or other expression vector which may be capable of reproducing within the host cell. Alternatively, the plasmid or cosmid DNA or part of the plasmid or cosmid DNA or a linear DNA sequence may integrate into the host genome, for example by homologous recombination. To carry out genetic modification, DNA can be introduced or transformed into cells by natural uptake or mediated by well-known processes such as electroporation. Genetic modification can involve expression of a gene under control of an introduced promoter. The introduced DNA may encode a protein which could act as an enzyme or could regulate the expression of further genes.
 In one embodiment, such a host cell may comprise a nucleic acid sequence encoding a fatty acid reductase and/or a fatty aldehyde synthetase and/or a fatty acyl transferase and/or an aldehyde decarbonylase and/or an acyl-ACP thioesterase and/or a 3-ketoacyl-ACP synthase III and/or a branched-chain ketodehydrogenase complex. The nucleic acid sequences encoding the enzymes may be exogenous, i.e., not naturally occurring in the host cell.
 Therefore, in one embodiment, there is a recombinant host cell, such as a micro-organism, comprising at least one polypeptide which is a fatty acid reductase in class EC 1.2.1.50, for example, having an amino acid sequence at least 50% identical to SEQ ID NO:1 (e.g., SEQ ID NO:1, 28 or 29), and comprising at least one polypeptide which is a fatty aldehyde synthetase in class EC 6.2.1.19, for example, having an amino acid sequence at least 50% identical to SEQ ID NO:2 (e.g., SEQ ID NO:2, 32 or 33), and comprising at least one polypeptide which is a fatty acyl transferase in class EC 2.3.1.-, for example, having an amino acid sequence at least 50% identical to SEQ ID NO:3 (e.g., SEQ ID NO:3, 30 or 31). The cell may also comprise at least one polypeptide which is an aldehyde decarbonylase in class EC 4.1.99.5, for example, having an amino acid sequence at least 50% identical to SEQ ID NO:4, or a functional variant or fragment of any of these sequences. The recombinant host cell may comprise a polypeptide comprising all of SEQ ID NOs:1-4 and/or amino acid sequences at least 50% identical to all of SEQ ID NOs:1-3 (e.g., amino acid sequences selected from SEQ ID NOs:28-33, as outlined above) and at least 50% identical to SEQ ID NO:4. The recombinant host cell may comprise the polynucleotide sequences SEQ ID NOs:11-14 and/or the sequences SEQ ID NOs:13 & 15 and/or the sequences SEQ ID NOs:13-16 and/or any combination of these specific combinations.
 The recombinant host cell may further comprise: at least one acyl-ACP thioesterase in class EC 3.1.2.14 (e.g., having an amino acid sequence which is at least 50% identical to SEQ ID NO:5 or a functional variant or fragment thereof); and/or at least one 3-ketoacyl-ACP synthase III in class EC 2.3.1.180 (e.g., having an amino acid sequence which is at least 50% identical to SEQ ID NO:6 or a functional variant or fragment thereof); and/or at least one branched-chain ketodehydrogenase complex comprising enzymes in classes EC 1.2.4.4, 2.3.1.168 and 1.8.1.4 (e.g., comprising one or more amino acid sequence(s) each being at least 50% identical to any of SEQ ID NOs:7-10 or a functional variant or fragment thereof); and/or at least one polynucleotide encoding at least one of these enzymes and/or functional fragments or variants of these. The cell may also be modified to produce increased levels of fatty acid which may be used by the fatty acid reductase and fatty aldehyde synthetase and fatty acyl transferase as a substrate to form a fatty aldehyde which may then be converted to a hydrocarbon by the decarbonylase. The recombinant host cell may also comprise one or more transport proteins for transporting hydrocarbon(s) out of the cell.
 FIG. 4 is a schematic detailing the genetic elements (solid lines) introduced into E. coli cells to produce bespoke alkanes, their relationship with the endogenous genes (dashed lines) and the de novo metabolic pathway (the boxes represent genes whilst circles represent metabolic intermediates. Key to metabolites: ILV, isoleucine, leucine and valine; MDHLA, methyl-butan/propanoyl-dihydrolipoamide-E. Key to genes: ilvE, endogenous branched chain amino acid aminotransferase; E1 and E1β, branched chain alpha keto acid decarboxylase/dehydrogenase E1α and β subunits from B. subtilis; E2, dihydrolipoyl transacylase from B. subtilis; E3, dihydrolipoamide dehydrogenase from B. subtilis (recycles lipoamide-E for use by E1 subunits); KASIII, keto-acyl synthase III (FabH2) from B. subtilis; accA to accD, endogenous acetyl-CoA carboxylase genes; fabH, endogenous beta-Ketoacyl-ACP synthase III; tesA, endogenous long chain thioesterase; thioesterase, Myristoyl-acyl carrier protein thioesterase from C. camphora; luxD, acyl transferase, from P. luminescens; luxC and luxE, fatty acid reductase and acyl-protein synthetase from P. luminescens; AD, aldehyde decarbonylase from N. punctiforme).
 PCT/EP2013/053600 has demonstrated certain aspects of the production of hydrocarbon described herein, including conversion of exogenous fatty acid to alkane via the cyanobacterial alkane biosynthetic pathway, production of alkanes and alkenes via the FAR/NpAD pathway; expression of the camphor FatB1 thioesterase gene in E. coli increases the pool size of tetradecanoic acid, production of tridecane in E. coli cells; production of branched fatty acids in E. coli, and production of branched pentadecane in E. coli cells.
 A suitable polynucleotide may be introduced into the cell by homologous recombination and/or may form part of an expression vector comprising at least one of the polynucleotide sequences SEQ ID NOs:11-25 or a complement thereof. Such an expression vector forms a third aspect of the invention. Suitable vectors for construction of such an expression vector are well known in the art (examples are mentioned above) and may be arranged to comprise the polynucleotide operably linked to one or more expression control sequences, so as to be useful to express the required enzymes in a host cell, for example a micro-organism as described above.
 In some embodiments, the recombinant or genetically modified host cell, as mentioned throughout this specification, may be any micro-organism or part of a micro-organism selected from the group consisting of fungi (such as members of the genus Saccharomyces), protists, algae, bacteria (including cyanobacteria) and archaea. The bacterium may comprise a gram-positive bacterium or a gram-negative bacterium and/or may be selected from the genera Escherichia, Bacillus, Lactobacillus, Rhodococcus, Pseudomonas or Streptomyces. The cyanobacterium may be selected from the group of Synechococcus elongatus, Synechocystis, Prochlorococcus marinus, Anabaena variabilis, Nostoc punctiforme, Gloeobacter violaceus, Cyanothece sp. and Synechococcus sp. The selection of a suitable micro-organism (or other expression system) is within the routine capabilities of the skilled person. Particularly suitable micro-organisms include Escherichia coli and Saccharomyces cerevisiae, for example.
 In a related embodiment of the invention, a fatty acid reductase and/or a fatty aldehyde synthetase and/or a fatty acyl transferase and/or an aldehyde decarbonylase and/or an acyl-ACP thioesterase and/or a 3-ketoacyl-ACP synthase III and/or a branched-chain ketodehydrogenase complex or functional variant or functional fragment of any of these may be expressed in a non-micro-organism cell such as a cultured mammalian cell or a plant cell or an insect cell. Mammalian cells may include CHO cells, COS cells, VERO cells, BHK cells, HeLa cells, CV1 cells, MDCK cells, 293 cells, 3T3 cells, and/or PC12 cells.
 The recombinant host cell or micro-organism may be used to express the enzymes mentioned above and a cell-free extract then obtained by standard methods. Conventional methods and techniques mentioned herein are explained in more detail, for example, in Sambrook et al. (reference number 17, see below).
 The identity of amino acid and nucleotide sequences referred to in this specification is as set out in Table 7 at the end of the description. The terms "polynucleotide", "polynucleotide sequence" and "nucleic acid sequence" are used interchangeably herein. The terms "polypeptide", "polypeptide sequence" and "amino acid sequence" are, likewise, used interchangeably herein. Other sequences encompassed by the invention are provided in the Sequence Listing.
 Enzyme Commission (EC) numbers (also called "classes" herein), referred to throughout this specification, are according to the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) in its resource "Enzyme Nomenclature" (1992, including Supplements 6-17) available, for example, at http://www.chem.qmul.ac.uk/iubmb/enzyme/. This is a numerical classification scheme based on the chemical reactions catalysed by each enzyme class (reference 19).
 The term "fatty acid reductase complex" indicates an enzyme complex capable of catalysing the conversion of free fatty acid, fatty acyl-ACP or fatty acyl-CoA to fatty aldehyde. Typically, the complex comprises a fatty acid reductase enzyme and a fatty aldehyde synthetase enzyme and a fatty acyl transferase enzyme. The term "fatty aldehyde synthetase" indicates an enzyme in class EC 6.2.1.19 capable of catalysing the formation of an acyl-protein thioester from a fatty acid and a protein. The term "fatty acid reductase enzyme" indicates an enzyme in class EC 1.2.1.50, the enzyme being capable of catalysing the formation of a long-chain aldehyde from a fatty acyl-AMP (fatty acyl-adenosine monophosphate) or a fatty acyl-CoA. Fatty acyl-AMP is the intermediate formed by the fatty aldehyde synthetase in this coupled reaction. An example of a fatty acid reductase is the polypeptide having amino acid sequence SEQ ID NO:1; an example of a fatty aldehyde synthetase is the polypeptide having amino acid sequence SEQ ID NO:2. Other suitable fatty acid reductase polypeptides have an amino acid sequence at least 50% identical to SEQ ID NO:1, e.g., SEQ ID NO:28 or 29; other suitable fatty aldehyde synthetase polypeptides have an amino acid sequence at least 50% identical to SEQ ID NO:2, e.g., SEQ ID NO:32 or 33.
 The term "fatty acyl transferase" indicates an enzyme in class EC 2.3.1.-, capable of catalysing the transfer of the acyl moiety of fatty acyl-ACP, acyl-CoA and other activated acyl donors, to the hydroxyl group of a serine on the transferase, followed by the conversion of the ester to a fatty acid through hydrolysis. An example of a fatty acyl transferase is the polypeptide having amino acid sequence SEQ ID NO:3. Other suitable fatty acyl transferase polypeptides have an amino acid sequence at least 50% identical to SEQ ID NO:3, e.g. SEQ ID NO:30 or 31.
 The term "aldehyde decarbonylase" indicates an enzyme in class EC 22.214.171.124, capable of catalysing the conversion of fatty aldehyde to a hydrocarbon, for example an alkane, alkene or mixture thereof. An example of an aldehyde decarbonylase is the polypeptide having amino acid sequence SEQ ID NO:4 or an amino acid sequence at least 50% identical to SEQ ID NO:4.
 A fatty acid is a carboxylic acid with a long unbranched or branched aliphatic tail. The fatty acid can comprise saturated fatty acids and/or unsaturated fatty acids containing one, two, three or more double bonds. The one or more fatty acid(s), fatty acyl-ACP or fatty acyl-CoA may, for example, comprise 4 or more carbon atoms, for example, 8 or more carbon atoms, 10 or more carbon atoms, 12 or more carbon atoms, or 14 or more carbon atoms. The fatty acid may also comprise, for example, 30 or fewer carbon atoms, for example, 26 or fewer carbon atoms, 25 or fewer carbon atoms, 23 or fewer carbon atoms, or 20 or fewer carbon atoms. Fatty acids may, for example, be derived from triacylglycerols or phospholipids, or may be made de novo by a cell, and/or by mechanisms described elsewhere herein.
In certain embodiments, variants of the polypeptides described herein can be used. As used herein, a "variant" means a polypeptide in which the amino acid sequence differs from the base sequence from which it is derived in that one or more amino acids within the sequence are substituted for other amino acids. For example, a variant of SEQ ID NO:1 may have an amino acid sequence at least about 50% identical to SEQ ID NO:1, for example, at least about 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or about 100% identical. The variants and/or fragments are functional variants/fragments in that the variant sequence has similar or identical functional enzyme activity characteristics to the enzyme having the non-variant amino acid sequence specified herein (and this is the meaning of the term "functional variant" as used throughout this specification).
 For example, a functional variant of SEQ ID NO:1 has similar or identical fatty acid reductase characteristics as SEQ ID NO:1, being classified in enzyme class EC 126.96.36.199 by the Enzyme Nomenclature of NC-IUBMB as mentioned above. An example may be that the rate of conversion by a functional variant of SEQ ID NO:1, in the presence of non-variant SEQ ID NO:2, of a free fatty acid to fatty aldehyde may be the same or similar, for example at least about 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or at least about 100% the rate achieved when using the enzyme having amino acid sequence SEQ ID NO:1, in the presence of non-variant SEQ ID NO:2. The rate may be improved when using the variant polypeptide, so that a rate of more than 100% the non-variant rate is achieved. Equivalent analysis of percentage sequence identity and comparative functional variant activity may, likewise, be made for other enzymes mentioned herein.
 For example, a variant of the fatty acyl transferase SEQ ID NO:3 may have an amino acid sequence at least about 50% identical to SEQ ID NO:3, being a functional variant in that it is classified in EC 2.3.1.-; the rate of transfer of the acyl moiety of fatty acyl-ACP, acyl-CoA and other activated acyl donors, to the hydroxyl group of a serine on the transferase, followed by the conversion of the ester to a fatty acid through hydrolysis, may be the same or similar, for example at least about 60%, 70%, 80%, 90% or 95% the rate achieved when using SEQ ID NO:3.
 SEQ ID NOs:28 and 29 may be examples of functional variants of SEQ ID NO:1, as defined herein. SEQ ID NOs:32 and 33 may be examples of functional variants of SEQ ID NO:2, as defined herein. SEQ ID NOs:30 and 31 may be examples of functional variants of SEQ ID NO:3, as defined herein.
The NC-IUBMB classification of the enzymes mentioned herein is, in summary, set out in Table 6 below.
TABLE 6
Description of sequence | SEQ ID NO | EC number
Photorhabdus luminescens LuxC amino acid sequence | 1 | 188.8.131.52
P. luminescens LuxE amino acid sequence | 2 | 184.108.40.206
P. luminescens LuxD amino acid sequence | 3 | 2.3.1.-
Nostoc punctiforme aldehyde decarbonylase amino acid sequence | 4 | 220.127.116.11
Cinnamomum camphora thioesterase amino acid sequence | 5 | 18.104.22.168
Bacillus subtilis KasIII (3-ketoacyl-ACP synthase III) amino acid sequence | 6 | 22.214.171.124
B. subtilis BCKD subunit E1α amino acid sequence | 7 | 126.96.36.199
B. subtilis BCKD subunit E1β amino acid sequence | 8 | 188.8.131.52
B. subtilis BCKD subunit E2 amino acid sequence | 9 | 184.108.40.206
B. subtilis BCKD subunit E3 amino acid sequence | 10 | 220.127.116.11
A functional variant or fragment of any of the above SEQ ID NO amino acid sequences or genes mentioned herein, therefore, is any amino acid sequence which remains within the same enzyme category (i.e., has the same EC number) as the non-variant sequences as set out in Table 6. Methods of determining whether an enzyme falls within a particular category are well known to the skilled person, who can determine the enzyme category without use of inventive skill. Suitable methods may, for example, be obtained from the International Union of Biochemistry and Molecular Biology.
 Amino acid substitutions may be regarded as "conservative" where an amino acid is replaced with a different amino acid with broadly similar properties. Non-conservative substitutions are where amino acids are replaced with amino acids of a different type.
 By "conservative substitution" is meant the substitution of an amino acid by another amino acid of the same class, in which the classes are defined as follows:
Class | Amino acid examples
Nonpolar | A, V, L, I, P, M, F, W
Uncharged polar | G, S, T, C, Y, N, Q
Acidic | D, E
Basic | K, R, H
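The class table above can be applied mechanically. The following is a minimal sketch, not part of the original disclosure, that labels a single substitution as conservative when both residues fall in the same class. The function names `residue_class` and `is_conservative` are hypothetical and chosen only for illustration.

```python
# Amino acid classes transcribed from the table above (one-letter codes).
CLASSES = {
    "nonpolar": set("AVLIPMFW"),
    "uncharged polar": set("GSTCYNQ"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def residue_class(aa: str) -> str:
    """Return the class name for a one-letter amino acid code."""
    for name, members in CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {aa!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when a substitution replaces a residue with a different
    residue of the same class (hypothetical helper for illustration)."""
    if original.upper() == replacement.upper():
        return False  # no substitution took place
    return residue_class(original) == residue_class(replacement)


print(is_conservative("L", "I"))  # Leu and Ile are both nonpolar
print(is_conservative("D", "K"))  # Asp is acidic, Lys is basic
```

Whether a given conservative substitution actually preserves enzyme activity is an experimental question; the sketch only applies the class definition.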
 As is well known to those skilled in the art, altering the primary structure of a polypeptide by a conservative substitution may not significantly alter the activity of that polypeptide because the side-chain of the amino acid which is inserted into the sequence may be able to form similar bonds and contacts as the side chain of the amino acid which has been substituted out. This is so even when the substitution is in a region which is critical in determining the polypeptide's conformation.
 Non-conservative substitutions are possible provided that these do not interrupt the enzyme activities of the polypeptides, as defined elsewhere herein. The substituted versions of the enzymes must retain characteristics such that they remain in the same enzyme class as the non-substituted enzyme, as determined using the NC-IUBMB nomenclature discussed above.
 Broadly speaking, fewer non-conservative substitutions than conservative substitutions will be possible without altering the biological activity of the polypeptides. Determination of the effect of any substitution (and, indeed, of any amino acid deletion or insertion) is wholly within the routine capabilities of the skilled person, who can readily determine whether a variant polypeptide retains the enzyme activity according to the invention. For example, when determining whether a variant of the polypeptide falls within the scope of the invention (i.e., is a "functional variant or fragment" as defined above), the skilled person will determine whether the variant or fragment retains the substrate converting enzyme activity as defined with reference to the NC-IUBMB nomenclature mentioned elsewhere herein. All such variants are within the scope of the invention.
 Using the standard genetic code, further nucleic acid sequences encoding the polypeptides may readily be conceived and manufactured by the skilled person, in addition to those disclosed herein. The nucleic acid sequence may be DNA or RNA, and where it is a DNA molecule, it may for example comprise a cDNA or genomic DNA. The nucleic acid may be contained within an expression vector, as described elsewhere herein.
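As an illustration of why many nucleic acid sequences can encode the same polypeptide, the following sketch counts the synonymous coding sequences for a short peptide under the standard genetic code. It is an illustrative aid rather than part of the original disclosure; the helper names are hypothetical, and the example peptide is the first five residues of SEQ ID NO:1 (Met Asn Lys Lys Ile).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codons_for(aa: str) -> list[str]:
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, encoded in CODON_TABLE.items() if encoded == aa]


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    total = 1
    for aa in peptide:
        options = codons_for(aa)
        if not options:
            raise ValueError(f"unknown amino acid code: {aa!r}")
        total *= len(options)
    return total


print(count_encodings("MNKKI"))  # 1 * 2 * 2 * 2 * 3 = 24
```

Codon optimisation for a particular host amounts to choosing one of these synonymous sequences according to that host's codon usage.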
According to another aspect of the invention, fermentation to produce hydrocarbons includes using a recombinant microorganism adapted to express at least one of an aldehyde-generating acyl-ACP reductase and a fatty aldehyde decarbonylase. The genes for the aldehyde-generating acyl-ACP reductase and fatty aldehyde decarbonylase enzymes preferably come from a cyanobacterium, such as Nostoc punctiforme. Non-limiting examples of the aldehyde-generating acyl-ACP reductase and fatty aldehyde decarbonylase are described in Schirmer A, et al., Microbial biosynthesis of alkanes. Science 329(5991):559-562 (2010), the disclosure of which is incorporated by reference in its entirety. Coexpression of the aldehyde-generating acyl-ACP reductase and fatty aldehyde decarbonylase enzymes can be generally referred to as NpARAD or NpAR/NpAD.
 Embodiments of the invention include variant nucleic acid sequences encoding the polypeptides described herein. The term "variant" in relation to a nucleic acid sequence means any substitution of, variation of, modification of, replacement of, deletion of, or addition of one or more nucleotide(s) from or to a polynucleotide sequence, providing the resultant polypeptide sequence encoded by the polynucleotide exhibits at least the same or similar enzymatic properties as the polypeptide encoded by the basic sequence. The term includes allelic variants and also includes a polynucleotide (a "probe sequence") which substantially hybridises to the polynucleotide sequences described herein. Such hybridisation may occur at or between low and high stringency conditions. In general terms, low stringency conditions can be defined as hybridisation in which the washing step takes place in a 0.330-0.825 M NaCl buffer solution at a temperature of about 40-48 C below the calculated or actual melting temperature (Tm) of the probe sequence (for example, about ambient laboratory temperature to about 55 C), while high stringency conditions involve a wash in a 0.0165-0.0330 M NaCl buffer solution at a temperature of about 5-10 C below the calculated or actual Tm of the probe sequence (for example, about 65 C). The buffer solution may, for example, be SSC buffer (0.15M NaCl and 0.015M tri-sodium citrate), with the low stringency wash taking place in 3×SSC buffer and the high stringency wash taking place in 0.1×SSC buffer. Steps involved in hybridisation of nucleic acid sequences have been described for example in Sambrook et al.
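The wash conditions above can be expressed as simple arithmetic. The sketch below is illustrative only: it scales the stated 1×SSC composition to an n-fold buffer and derives a wash temperature window from a probe Tm. The probe Tm of 75 C used in the example is a hypothetical value, not taken from this specification.

```python
SSC_1X_NACL_M = 0.15      # NaCl molarity of 1x SSC, as stated in the text
SSC_1X_CITRATE_M = 0.015  # tri-sodium citrate molarity of 1x SSC


def ssc_composition(strength: float) -> tuple[float, float]:
    """Return (NaCl M, citrate M) for an n-fold SSC buffer."""
    return strength * SSC_1X_NACL_M, strength * SSC_1X_CITRATE_M


def wash_window(tm_c: float, below_min: float, below_max: float) -> tuple[float, float]:
    """Wash temperature range (deg C) lying between below_max and
    below_min degrees under the probe melting temperature."""
    return tm_c - below_max, tm_c - below_min


low_nacl, _ = ssc_composition(3.0)   # low stringency wash buffer
high_nacl, _ = ssc_composition(0.1)  # high stringency wash buffer
print(f"3x SSC: {low_nacl:.3f} M NaCl; 0.1x SSC: {high_nacl:.3f} M NaCl")

# Hypothetical probe Tm, chosen only to illustrate the 5-10 C rule.
print(wash_window(75.0, 5.0, 10.0))
```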
 Typically, nucleic acid sequence variants have about 55% or more of the nucleotides in common with the nucleic acid sequence of the present invention, more typically 60%, 65%, 70%, 80%, 85%, or even 90%, 95%, 98% or 99% or greater sequence identity.
 Variant nucleic acids of the invention may be codon-optimised for expression in a particular host cell.
 Sequence identity between amino acid sequences can be determined by comparing an alignment of the sequences using the Needleman-Wunsch Global Sequence Alignment Tool available from the National Center for Biotechnology Information (NCBI), Bethesda, Md., USA, for example via http://blast.ncbi.nlm.nih.gov/Blast.cgi, using default parameter settings (for protein alignment, Gap costs Existence:11 Extension:1). Sequence comparisons and percentage identities mentioned in this specification have been determined using this software. When comparing the level of sequence identity to, for example, SEQ ID NO:1, this typically should be done relative to the whole length of SEQ ID NO:1 (i.e., a global alignment method is used), to avoid short regions of high identity overlap resulting in a high overall assessment of identity. For example, a short polypeptide fragment having, for example, five amino acids might have a 100% identical sequence to a five amino acid region within the whole of SEQ ID NO:1, but this does not provide a 100% amino acid identity unless the fragment forms part of a longer sequence which also has identical amino acids at other positions equivalent to positions in SEQ ID NO:1. When an equivalent position in the compared sequences is occupied by the same amino acid, then the molecules are identical at that position. Scoring an alignment as a percentage of identity is a function of the number of identical amino acids at positions shared by the compared sequences. When comparing sequences, optimal alignments may require gaps to be introduced into one or more of the sequences, to take into consideration possible insertions and deletions in the sequences. Sequence comparison methods may employ gap penalties so that, for the same number of identical molecules in sequences being compared, a sequence alignment with as few gaps as possible, reflecting higher relatedness between the two compared sequences, will achieve a higher score than one with many gaps. 
Calculation of maximum percent identity involves the production of an optimal alignment, taking into consideration gap penalties. As mentioned above, the percentage sequence identity may be determined using the Needleman-Wunsch Global Sequence Alignment tool, using default parameter settings. The Needleman-Wunsch algorithm was published in J. Mol. Biol. (1970) vol. 48:443-53.
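The point about global alignment can be illustrated with a small implementation of the Needleman-Wunsch algorithm. This is a simplified sketch and not the NCBI tool: it assumes a match/mismatch score of +1/-1 and a linear gap penalty of -1 instead of a substitution matrix with existence and extension gap costs, and it reports identity as identical positions divided by alignment length. The function names are hypothetical. The example sequence is the first 16 residues of SEQ ID NO:1.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -1) -> tuple[str, str]:
    """Globally align a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned positions as a percentage of alignment length."""
    al_a, al_b = needleman_wunsch(a, b)
    same = sum(x == y and x != "-" for x, y in zip(al_a, al_b))
    return 100.0 * same / len(al_a)


reference = "MNKKISFIINGRVEIF"  # first 16 residues of SEQ ID NO:1
fragment = "MNKKI"              # identical to a five-residue region
print(percent_identity(reference, reference))
print(percent_identity(fragment, reference))
```

The five-residue fragment matches a region of the reference exactly, yet its global identity is far below 100% because the unmatched positions are counted as gaps.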
 Polypeptide and polynucleotide sequences for use in the methods, vectors and host cells described herein are shown in the Sequence Listing.
TABLE 7: Identity of sequences included in application
SEQ ID NO | Description of sequence
1 | Photorhabdus luminescens LuxC amino acid sequence
2 | P. luminescens LuxE amino acid sequence
3 | P. luminescens LuxD amino acid sequence
4 | Nostoc punctiforme aldehyde decarbonylase amino acid sequence
5 | Cinnamomum camphora thioesterase amino acid sequence
6 | Bacillus subtilis KasIII (3-ketoacyl-ACP synthase III) amino acid sequence
7 | B. subtilis BCKD subunit E1α amino acid sequence
8 | B. subtilis BCKD subunit E1β amino acid sequence
9 | B. subtilis BCKD subunit E2 amino acid sequence
10 | B. subtilis BCKD subunit E3 amino acid sequence
11 | P. luminescens LuxC codon-optimised nucleotide sequence
12 | P. luminescens LuxE codon-optimised nucleotide sequence
13 | N. punctiforme aldehyde decarbonylase codon-optimised nucleotide sequence
14 | P. luminescens LuxD codon-optimised nucleotide sequence
15 | P. luminescens LuxCDE operon codon-optimised nucleotide sequence
16 | pACYC LuxCDE
17 | C. camphora thioesterase codon-optimised nucleotide sequence
18 | pETDuet-1 thioesterase
19 | B. subtilis KasIII codon-optimised nucleotide sequence
20 | B. subtilis BCKD subunit E1α codon-optimised nucleotide sequence
21 | B. subtilis BCKD subunit E1β codon-optimised nucleotide sequence
22 | B. subtilis BCKD subunit E2 codon-optimised nucleotide sequence
23 | B. subtilis BCKD subunit E3 codon-optimised nucleotide sequence
24 | KasIII/BCKD operon codon-optimised nucleotide sequence
25 | pETDuet-1 KasIII/BCKD
26 | Amplification primer
27 | Amplification primer
28 | Vibrio harveyi LuxC amino acid sequence
29 | Vibrio fischeri ES114 LuxC amino acid sequence
30 | Vibrio harveyi LuxD amino acid sequence
31 | Vibrio fischeri MJ11 LuxD amino acid sequence
32 | Vibrio harveyi LuxE amino acid sequence
33 | Vibrio fischeri ES114 LuxE amino acid sequence
To facilitate a better understanding of embodiments of the present invention, the following examples of certain aspects of some embodiments are given. In no way should the following examples be read to limit, or define, the entire scope of the invention.
 The following examples all used solid components obtained as described below.
 Biomass Preparation
 In this example, various samples of fresh chopped sorghum are mixed with a variety of added components as listed in Table 8 and are stored in a silage bag for about 20 days. The particular additives and respective addition rates are shown in Table 9.
TABLE 8: 2011 Experiments WITH ACID
Experiment # | 1
Estimated mass | 450 kgs
Moisture Content | 76%
Storage Method | Silage bag
Yeast | Lallemand Liquid Yeast
Bacterial inhibitor | Lactrol
Cellulose to glucose | Novozymes Cellic CTec2
Chop size | 3 mm
Result (gallons Ethanol/initial dry metric tonne) | 50
Days in Storage | ~20

TABLE 9
ADDITIVE | Rates
LACTROL | 3.2 g/wet ton
Lallemand Stabilized Liquid Yeast | 18 fl oz/wet ton
Novozymes Cellic CTec2 | 20 fl oz/wet ton
9.3% Concentrated Sulfuric Acid | 3.8 L/wet ton
 VOC Recovery
 The VOCs from the prepared biomass material of this example were recovered using a GEA SSD® as the solventless recovery unit. Table 10 below provides certain properties of (i) the prepared biomass material fed into the solventless recovery unit, (ii) the solid component exiting the solventless recovery unit, and (iii) the operating conditions of the solventless recovery unit.
TABLE 10
Sample | Property | Value
Feed composition | Liquid in Feed (%) | 80.2%
Solid component | Liquid in Solid component (product) (%) | 31.4%
Solid component | Solid component (product) Temperature (F.) | 90
Operating Conditions | Heater Temperature (F.) | 516
Operating Conditions | Feed Rate (lb/min.) | 5.30
Operating Conditions | Evaporation Rate (lb/min) | 3.93
Operating Conditions | Saturation Temperature (F.) | 287
Operating Conditions | Solid component production rate (lb/min.) | 1.03
Operating Conditions | Vapor Temperature at Inlet (F.) | 428
Operating Conditions | Exhaust Temperature (F.) | 370
Operating Conditions | Operating Pressure (psig) | 40
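Two quantities implied by the operating conditions in Table 10 can be worked out with simple arithmetic: the degree of superheat of the inlet vapor and the share of the liquid fed that is evaporated. The sketch below is illustrative only. It assumes that the saturation temperature listed corresponds to the operating pressure and that the evaporation rate refers to liquid removed from the feed; neither assumption is stated explicitly in the source.

```python
def fahrenheit_to_celsius(t_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (t_f - 32.0) * 5.0 / 9.0


# Values transcribed from Table 10.
VAPOR_INLET_F = 428.0
SATURATION_F = 287.0
FEED_RATE_LB_MIN = 5.30
LIQUID_FRACTION_IN_FEED = 0.802
EVAPORATION_RATE_LB_MIN = 3.93

# Degrees of superheat: how far the inlet vapor is above saturation.
superheat_f = VAPOR_INLET_F - SATURATION_F
superheat_k = superheat_f * 5.0 / 9.0  # a temperature difference, so no offset

# Share of the liquid entering with the feed that leaves as vapor
# (assumes the evaporation rate is liquid removed from the feed).
liquid_fed_lb_min = FEED_RATE_LB_MIN * LIQUID_FRACTION_IN_FEED
fraction_evaporated = EVAPORATION_RATE_LB_MIN / liquid_fed_lb_min

print(f"Inlet vapor: {fahrenheit_to_celsius(VAPOR_INLET_F):.1f} C")
print(f"Superheat: {superheat_f:.0f} F ({superheat_k:.1f} K)")
print(f"Fraction of feed liquid evaporated: {fraction_evaporated:.1%}")
```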
 Further Processing: Saccharification
In a 4 liter bottle, 2160.02 grams of deionized water and 540.12 grams of 40% wt. HESA were mixed to form an 8.5% wt. HESA solution. Into a one gallon Parr Instruments C276 autoclave equipped with a DiComp IR probe was placed 433.82 grams of the solid component of Example C. The solid component was estimated to contain 289.67 grams of bone dried biomass (BDBM). The acid solution was gently poured over the wet biomass in the reactor. The reactor contained a mixture comprising approximately 9.53% wt. dry biomass in contact with a 7.3% wt. HESA solution (based on the total reactor content).
The reaction mixture was heated to 120° C. and held at that temperature for 1 hour. The reactor content was stirred initially at 100 rpm; as the mixture heated to 120° C., the contents thinned and the stir rate was increased to 250 rpm and then to 400 rpm. After the 1 hour hold, the heating was discontinued. The reactor was purged with a slow nitrogen stream for a few minutes to eliminate any sulfur dioxide in the gas cap. The reactor was cooled to room temperature and purged once more with nitrogen.
The reactor content was transferred to a Buchner funnel and vacuum filtered over Whatman 541 hardened ashless 185 mm filter paper. As much liquid as possible was removed from the reactor content. The cumulative weight of the filtrate and liquids removed was obtained. The filtrate was then analyzed by HPLC, and the recovery of materials from the biomass was calculated by comparison to the amount of the precursors present in the biomass. The % of glucose recovered was 11.3%, based on the theoretical amount of glucose available in the biomass. The % of xylose recovered was 91%, based on the theoretical amount of xylose available in the biomass.
The treated sample was further subjected to enzymatic hydrolysis. 144 grams of the material from the HESA treatment were washed 3 times with 500 mL of deionized water. After the first wash, the pH of the material was adjusted to 10. The liquid was then drained, water was added, and the pH was adjusted to 5.6. To 1 L of water containing about 144 grams of washed material, 50 grams of CTEC2 cellulase were added. The solution was shaken at 53 degrees Celsius for 3 days, at which time the contents were measured to be: Cellobiose: 1.93 g/L, Glucose: 52.6 g/L, Xylose: 6.12 g/L, Arabinose: 0 g/L, Glycerol: 1.4 g/L, Acetic Acid: 0.92 g/L, Ethanol: 0.0 g/L.
 The hydrolysis mixture (or hydrolysate) was then fermented, where the hydrolysate was added to the media growing various E. coli cultures to replace the glucose content. The E. coli cultures were adapted to co-express the fatty acid reductase, fatty aldehyde synthetase, fatty acyl transferase, and aldehyde decarbonylase enzymes (CEDDEC) or NpARAD. E. coli host cells without these exogenous genes (BL21) were also grown for reference.
In general, E. coli BL21*(DE3) cells carrying pACYCDuet NpAR/AD, pACYCDuet CEDDEC or no vector (controls) were grown from glycerol stocks at 37 degrees C., with constant shaking (225 rpm) overnight in LB media with 34 μg/mL chloramphenicol as selectable marker. Chloramphenicol was omitted for the E. coli BL21*(DE3) control cells. 200 μL of overnight starter cultures were used to inoculate 20 mL of the following media, containing antibiotic as indicated, and prepared as follows: modified minimal media, without yeast extract and with the 3% glucose carbon source replaced with filter sterilised hydrolysis mix to a final, total carbon (cellobiose+glucose+xylose+glycerol+acetic acid) concentration of 1%. This can be referred to as the standard LB medium. In addition, E. coli BL21*(DE3) cells carrying pACYCDuet CEDDEC were inoculated into modified minimal media with yeast extract (MYE) and with a 3% glucose carbon source as a positive control. This can be referred to as the LB-MYE medium. Three replicates of each treatment were grown.
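The amount of hydrolysis mix needed to reach the stated 1% total carbon concentration follows from the measured hydrolysate composition. The sketch below is an illustrative calculation, not a record of the procedure actually used. It assumes that "1%" means 1% weight per volume (10 g/L) and that the five listed components are simply summed; the function name is hypothetical.

```python
# Hydrolysate composition after enzymatic hydrolysis, in g/L (from the text).
HYDROLYSATE_G_PER_L = {
    "cellobiose": 1.93,
    "glucose": 52.6,
    "xylose": 6.12,
    "glycerol": 1.4,
    "acetic acid": 0.92,
}


def hydrolysate_volume_ml(final_volume_ml: float, target_g_per_l: float,
                          composition: dict[str, float]) -> float:
    """Volume of hydrolysate to dilute into final_volume_ml of medium so that
    the summed carbon sources reach target_g_per_l (C1*V1 = C2*V2)."""
    stock = sum(composition.values())
    if target_g_per_l > stock:
        raise ValueError("target concentration exceeds the stock concentration")
    return final_volume_ml * target_g_per_l / stock


total = sum(HYDROLYSATE_G_PER_L.values())
volume = hydrolysate_volume_ml(20.0, 10.0, HYDROLYSATE_G_PER_L)
print(f"Total carbon sources in hydrolysate: {total:.2f} g/L")
print(f"Hydrolysate per 20 mL culture at 10 g/L: {volume:.2f} mL")
```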
Construction of FAR/NpAD (CEDDEC) Plasmids
The amino acid sequences listed in Table 11 below were reverse translated and codon-optimised for expression in E. coli, providing the nucleic acid sequences also shown in Table 11:
TABLE 11
GenBank** accession number | Sequence name | SEQ ID NO | Codon-optimised nucleic acid SEQ ID NO
AAD05355.1 | Fatty acid reductase | 1 | 11
AAD05359.1 | LuxE E | 2 | 12
P19197.1 | LUXD1_PHOLU | 3 | 13
**The sequences can be retrieved from GenBank at http://www.ncbi.nlm.nih.gov/genbank. GenBank is the NIH genetic sequence database. GenBank is located at the National Center for Biotechnology Information, U.S. National Library of Medicine, 8600 Rockville Pike, Bethesda MD, 20894 USA.
Codon-optimised luxC, luxE and luxD genes for E. coli were synthesised in a three-gene operon (SEQ ID NO:15) inserted into pACYCDuet-1 (commercially obtainable from Merck, the final construct having sequence SEQ ID NO:16) and subsequently digested with the restriction enzymes NcoI and NotI (commercially obtainable) and ligated into pCDFDuet-1 MCS1 (commercially obtainable from Merck).
Genomic DNA was extracted from N. punctiforme using the FAST-DNA SPIN Kit (commercially obtainable from MP Biomedicals). Cultures were centrifuged for 2 min at 4500 rpm and 4° C., and 120 mg of the pellet was re-suspended in 1 ml of buffer Cell Lysis/DNA Solubilizing Solution (CLS-Y). Samples were homogenized with an MP Biomedicals FastPrep-24 (FASTPREP is a trademark) instrument using lysing matrix A (also MP Biomedicals) for 40 sec at a speed setting of 6.0 m/s. All subsequent steps were carried out according to the manufacturer's instructions. After this procedure, the genomic DNA was further purified by phenol-chloroform extraction (using a tris(hydroxymethyl)aminomethane pH 7.5-buffered 50% phenol, 48% chloroform, 2% isoamyl alcohol solution), followed by DNA precipitation using ethanol and sodium acetate. The final DNA samples were adjusted (using water) to a concentration of 8 nanograms per microliter (ng/μl). The gene encoding NpAD (aldehyde decarbonylase) was amplified with PHUSION High-Fidelity DNA Polymerase (PHUSION is a trademark, commercially obtainable from New England Biolabs), using 8 ng of cyanobacterial genomic DNA as template.
 Primers used were CATATGCAGCAGCTTACAGACCAAT (SEQ ID NO:26) and CTCGAGTTAAGCACCTATGAGTCCGTAGG (SEQ ID NO:27), allowing direct cloning into MCS2 (MCS is an abbreviation for Multiple Cloning Site) using NdeI and XhoI sites (underlined).
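The primer design can be checked against the stated restriction sites and against the start of SEQ ID NO:4. The following sketch is illustrative and not part of the original disclosure. It locates the NdeI (CATATG) and XhoI (CTCGAG) recognition sequences in the two primers and translates the forward primer from the ATG contained in the NdeI site. The small codon dictionary covers only the codons occurring in this primer and is not a complete genetic code table; the helper names are hypothetical.

```python
NDEI_SITE = "CATATG"  # NdeI recognition sequence
XHOI_SITE = "CTCGAG"  # XhoI recognition sequence

FORWARD_PRIMER = "CATATGCAGCAGCTTACAGACCAAT"      # SEQ ID NO:26
REVERSE_PRIMER = "CTCGAGTTAAGCACCTATGAGTCCGTAGG"  # SEQ ID NO:27

# Only the codons needed for this example, not the full genetic code.
CODONS = {"ATG": "M", "CAG": "Q", "CAA": "Q", "CTT": "L", "ACA": "T", "GAC": "D"}


def site_position(primer: str, site: str) -> int:
    """Zero-based index of the restriction site in the primer, or -1."""
    return primer.find(site)


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna, ignoring any
    trailing one or two nucleotides."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))


print(site_position(FORWARD_PRIMER, NDEI_SITE))  # site at the 5' end
print(site_position(REVERSE_PRIMER, XHOI_SITE))  # site at the 5' end

# The ATG within the NdeI site serves as the start codon of the insert.
start = FORWARD_PRIMER.find("ATG")
print(translate(FORWARD_PRIMER[start:]))
```

The translated forward primer reads MQQLTDQ, which corresponds to the first seven residues of SEQ ID NO:4 (Met Gln Gln Leu Thr Asp Gln) in the sequence listing.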
Plasmids were transformed into TOP10 competent E. coli cells (commercially obtainable from Invitrogen) using the manufacturer's protocol (as described above for Expression of recombinant enzymes in E. coli), purified using the Qiagen miniprep kit (purified plasmids) and insertions were investigated by polymerase chain reaction (PCR) or restriction digest. The nucleic acid sequence SEQ ID NO:13, encoding NpAD, was confirmed to be present in pACYCDuet-1 luxCED and pCDFDuet-1 luxCED by DNA sequencing (commercially obtainable from Geneservice, U.K.) of purified plasmids.
Similar techniques were used to construct the NpARAD expression plasmid. E. coli cultures containing the FAR/NpAD (CEDDEC) expression plasmid, the NpARAD expression plasmid, or no expression plasmid (BL21) were inoculated and grown in the standard LB medium for at least 6 hours. In addition, an E. coli culture containing the FAR/NpAD (CEDDEC) expression plasmid was also grown in LB-MYE medium for at least 6 hours. Table 12 below shows the optical density readings at 600 nm (OD600) at various time points during the growth of the bacterial cultures.
TABLE-US-00015 TABLE 12 hr post CEDDEC inoculation CEDDEC NpARAD BL21 (MYE) 2 0.24 0.24 0.33 0.03 3.5 0.95 0.93 1.13 0.31 6 0.73
FIG. 5 shows the growth curves of these different E. coli cultures: CEDDEC, NpARAD, BL21, and CEDDEC (MYE), which is E. coli containing the CEDDEC expression plasmid grown in LB-MYE medium. As shown, the growth of the E. coli cultures containing the FAR/NpAD (CEDDEC) and NpARAD expression plasmids was quite robust, essentially the same as that of the standard culture without any expression plasmid. This shows that the glucose in the hydrolysis mixture was available to the microorganisms and further that the hydrolysis mixture was not toxic to the microorganisms.
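One way to put a number on the comparison of growth is to estimate a specific growth rate from the OD600 readings at 2 and 3.5 hours in Table 12. The sketch below is illustrative only and was not part of the reported analysis. It assumes exponential growth between the two time points and that OD600 is proportional to cell density over this range; the function names are hypothetical.

```python
import math

# OD600 readings from Table 12 at 2 h and 3.5 h post inoculation.
OD600 = {
    "CEDDEC": (0.24, 0.95),
    "NpARAD": (0.24, 0.93),
    "BL21": (0.33, 1.13),
    "CEDDEC (MYE)": (0.03, 0.31),
}
T1_H, T2_H = 2.0, 3.5


def specific_growth_rate(od1: float, od2: float, t1: float, t2: float) -> float:
    """Specific growth rate (per hour), assuming exponential growth."""
    return math.log(od2 / od1) / (t2 - t1)


def doubling_time_h(mu: float) -> float:
    """Doubling time in hours for a specific growth rate mu (per hour)."""
    return math.log(2.0) / mu


rates = {
    name: specific_growth_rate(od1, od2, T1_H, T2_H)
    for name, (od1, od2) in OD600.items()
}
for name, mu in rates.items():
    print(f"{name}: mu = {mu:.2f} per h, doubling time = {doubling_time_h(mu):.2f} h")
```

With only two time points per culture and no replicate values in the table, these figures are rough indications rather than fitted growth parameters.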
 While the cultures were not harvested for hydrocarbon extraction and detection, based on the demonstration by PCT/EP2013/053600 of hydrocarbon production by E. coli cultures containing the FAR/NpAD (CEDDEC) and NpARAD expression plasmids, it is expected that the cultures of this example would provide a similar outcome, particularly in light of such robust growth. The hydrocarbons can be extracted and detected by the following method.
8 ml of bacterial culture can be mixed with 8 ml of ethyl acetate and incubated for 2 hours at room temperature (about 20° C.) and 480 rpm to facilitate hydrocarbon extraction. After extraction, samples can be centrifuged at room temperature (about 20° C.) and 700×g for 5 minutes to cause phase separation, and 6 ml of the top phase can be transferred into a fresh vial. The ethyl acetate can be dried under a stream of nitrogen and subsequently the residue can be dissolved in 225 ml dichloromethane (DCM). Separation and identification of hydrocarbons and volatile compounds can be performed using a Trace Gas Chromatography-Mass Spectrometer (GC/MS) 2000 (Thermo Finnigan) equipped with a ZB1-MS column (commercially obtainable from Zebron).
 Further modifications and alternative embodiments of various aspects of the invention will be apparent to those skilled in the art in view of this description. Accordingly, this description is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the general manner of carrying out the invention. It is to be understood that the forms of the invention shown and described herein are to be taken as the presently preferred embodiments. Elements and materials may be substituted for those illustrated and described herein, parts and processes may be reversed, and certain features of the invention may be utilized independently, all as would be apparent to one skilled in the art after having the benefit of this description of the invention. Changes may be made in the elements described herein without departing from the spirit and scope of the invention as described in the following claims.
331480PRTPhotorhabdus luminescens 1Met Asn Lys Lys Ile Ser Phe Ile Ile Asn Gly Arg Val Glu Ile Phe 1 5 10 15 Pro Glu Ser Asp Asp Leu Val Gln Ser Ile Asn Phe Gly Asp Asn Ser 20 25 30 Val His Leu Pro Val Leu Asn Asp Ser Gln Val Lys Asn Ile Ile Asp 35 40 45 Tyr Asn Glu Asn Asn Glu Leu Gln Leu His Asn Ile Ile Asn Phe Leu 50 55 60 Tyr Thr Val Gly Gln Arg Trp Lys Asn Glu Glu Tyr Ser Arg Arg Arg 65 70 75 80 Thr Tyr Ile Arg Asp Leu Lys Arg Tyr Met Gly Tyr Ser Glu Glu Met 85 90 95 Ala Lys Leu Glu Ala Asn Trp Ile Ser Met Ile Leu Cys Ser Lys Gly 100 105 110 Gly Leu Tyr Asp Leu Val Lys Asn Glu Leu Gly Ser Arg His Ile Met 115 120 125 Asp Glu Trp Leu Pro Gln Asp Glu Ser Tyr Ile Arg Ala Phe Pro Lys 130 135 140 Gly Lys Ser Val His Leu Leu Thr Gly Asn Val Pro Leu Ser Gly Val 145 150 155 160 Leu Ser Ile Leu Arg Ala Ile Leu Thr Lys Asn Gln Cys Ile Ile Lys 165 170 175 Thr Ser Ser Thr Asp Pro Phe Thr Ala Asn Ala Leu Ala Leu Ser Phe 180 185 190 Ile Asp Val Asp Pro His His Pro Val Thr Arg Ser Leu Ser Val Val 195 200 205 Tyr Trp Gln His Gln Gly Asp Ile Ser Leu Ala Lys Glu Ile Met Gln 210 215 220 His Ala Asp Val Val Val Ala Trp Gly Gly Glu Asp Ala Ile Asn Trp 225 230 235 240 Ala Val Lys His Ala Pro Pro Asp Ile Asp Val Met Lys Phe Gly Pro 245 250 255 Lys Lys Ser Phe Cys Ile Ile Asp Asn Pro Val Asp Leu Val Ser Ala 260 265 270 Ala Thr Gly Ala Ala His Asp Val Cys Phe Tyr Asp Gln Gln Ala Cys 275 280 285 Phe Ser Thr Gln Asn Ile Tyr Tyr Met Gly Ser His Tyr Glu Glu Phe 290 295 300 Lys Leu Ala Leu Ile Glu Lys Leu Asn Leu Tyr Ala His Ile Leu Pro 305 310 315 320 Asn Thr Lys Lys Asp Phe Asp Glu Lys Ala Ala Tyr Ser Leu Val Gln 325 330 335 Lys Glu Cys Leu Phe Ala Gly Leu Lys Val Glu Val Asp Val His Gln 340 345 350 Arg Trp Met Val Ile Glu Ser Asn Ala Gly Val Glu Leu Asn Gln Pro 355 360 365 Leu Gly Arg Cys Val Tyr Leu His His Val Asp Asn Ile Glu Gln Ile 370 375 380 Leu Pro Tyr Val Arg Lys Asn Lys Thr Gln Thr Ile Ser Val Phe Pro 385 390 395 400 Trp Glu Ala Ala Leu Lys Tyr Arg Asp Leu Leu Ala Leu Lys Gly 
Ala 405 410 415 Glu Arg Ile Val Glu Ala Gly Met Asn Asn Ile Phe Arg Val Gly Gly 420 425 430 Ala His Asp Gly Met Arg Pro Leu Gln Arg Leu Val Thr Tyr Ile Ser 435 440 445 His Glu Arg Pro Ser His Tyr Thr Ala Lys Asp Val Ala Val Glu Ile 450 455 460 Glu Gln Thr Arg Phe Leu Glu Glu Asp Lys Phe Leu Val Phe Val Pro 465 470 475 480 2370PRTPhotorhabdus luminescens 2Met Thr Ser Tyr Val Asp Lys Gln Glu Ile Thr Ala Ser Ser Glu Ile 1 5 10 15 Asp Asp Leu Ile Phe Ser Ser Asp Pro Leu Val Trp Ser Tyr Asp Glu 20 25 30 Gln Glu Lys Ile Arg Lys Lys Leu Val Leu Asp Ala Phe Arg His His 35 40 45 Tyr Lys His Cys Gln Glu Tyr Arg His Tyr Cys Gln Ala His Lys Val 50 55 60 Asp Asp Asn Ile Thr Glu Ile Asp Asp Ile Pro Val Phe Pro Thr Ser 65 70 75 80 Val Phe Lys Phe Thr Arg Leu Leu Thr Ser Asn Glu Asn Glu Ile Glu 85 90 95 Ser Trp Phe Thr Ser Ser Gly Thr Asn Gly Leu Lys Ser Gln Val Pro 100 105 110 Arg Asp Arg Leu Ser Ile Glu Arg Leu Leu Gly Ser Val Ser Tyr Gly 115 120 125 Met Lys Tyr Ile Gly Ser Trp Phe Asp His Gln Met Glu Leu Val Asn 130 135 140 Leu Gly Pro Asp Arg Phe Asn Ala His Asn Ile Trp Phe Lys Tyr Val 145 150 155 160 Met Ser Leu Val Glu Leu Leu Tyr Pro Thr Ser Phe Thr Val Thr Glu 165 170 175 Glu His Ile Asp Phe Val Gln Thr Leu Asn Ser Leu Glu Arg Ile Lys 180 185 190 His Gln Gly Lys Asp Ile Cys Leu Ile Gly Ser Pro Tyr Phe Ile Tyr 195 200 205 Leu Leu Cys Arg Tyr Met Lys Asp Lys Asn Ile Ser Phe Ser Gly Asp 210 215 220 Lys Ser Leu Tyr Ile Ile Thr Gly Gly Gly Trp Lys Ser Tyr Glu Lys 225 230 235 240 Glu Ser Leu Lys Arg Asn Asp Phe Asn His Leu Leu Phe Asp Thr Phe 245 250 255 Asn Leu Ser Asn Ile Asn Gln Ile Arg Asp Ile Phe Asn Gln Val Glu 260 265 270 Leu Asn Thr Cys Phe Phe Glu Asp Glu Met Gln Arg Lys His Val Pro 275 280 285 Pro Trp Val Tyr Ala Arg Ala Leu Asp Pro Glu Thr Leu Lys Pro Val 290 295 300 Pro Asp Gly Met Pro Gly Leu Met Ser Tyr Met Asp Ala Ser Ser Thr 305 310 315 320 Ser Tyr Pro Ala Phe Ile Val Thr Asp Asp Ile Gly Ile Ile Ser Arg 325 330 335 Glu Tyr Gly Gln Tyr Pro Gly Val Leu Val Glu Ile 
Leu Arg Arg Val 340 345 350 Asn Thr Arg Lys Gln Lys Gly Cys Ala Leu Ser Leu Thr Glu Ala Phe 355 360 365 Gly Ser 370 3307PRTPhotorhabdus luminescens 3Met Glu Asn Glu Ser Lys Tyr Lys Thr Ile Asp His Val Ile Cys Val 1 5 10 15 Glu Gly Asn Lys Lys Ile His Val Trp Glu Thr Leu Pro Glu Glu Asn 20 25 30 Ser Pro Lys Arg Lys Asn Ala Ile Ile Ile Ala Ser Gly Phe Ala Arg 35 40 45 Arg Met Asp His Phe Ala Gly Leu Ala Glu Tyr Leu Ser Arg Asn Gly 50 55 60 Phe His Val Ile Arg Tyr Asp Ser Leu His His Val Gly Leu Ser Ser 65 70 75 80 Gly Thr Ile Asp Glu Phe Thr Met Ser Ile Gly Lys Gln Ser Leu Leu 85 90 95 Ala Val Val Asp Trp Leu Thr Thr Arg Lys Ile Asn Asn Phe Gly Met 100 105 110 Leu Ala Ser Ser Leu Ser Ala Arg Ile Ala Tyr Ala Ser Leu Ser Glu 115 120 125 Ile Asn Ala Ser Phe Leu Ile Thr Ala Val Gly Phe Val Asn Leu Arg 130 135 140 Tyr Ser Leu Glu Arg Ala Leu Gly Phe Asp Tyr Leu Ser Leu Pro Ile 145 150 155 160 Asn Glu Leu Pro Asn Asn Leu Asp Phe Glu Gly His Lys Leu Gly Ala 165 170 175 Glu Val Phe Ala Arg Asp Cys Leu Asp Phe Gly Trp Glu Asp Leu Ala 180 185 190 Ser Thr Ile Asn Asn Met Met Tyr Leu Asp Ile Pro Phe Ile Ala Phe 195 200 205 Thr Ala Asn Asn Asp Asn Trp Val Lys Gln Asp Glu Val Ile Thr Leu 210 215 220 Leu Ser Asn Ile Arg Ser Asn Arg Cys Lys Ile Tyr Ser Leu Leu Gly 225 230 235 240 Ser Ser His Asp Leu Ser Glu Asn Leu Val Val Leu Arg Asn Phe Tyr 245 250 255 Gln Ser Val Thr Lys Ala Ala Ile Ala Met Asp Asn Asp His Leu Asp 260 265 270 Ile Asp Val Asp Ile Thr Glu Pro Ser Phe Glu His Leu Thr Ile Ala 275 280 285 Thr Val Asn Glu Arg Arg Met Arg Ile Glu Ile Glu Asn Gln Ala Ile 290 295 300 Ser Leu Ser 305 4232PRTNostoc punctiforme 4Met Gln Gln Leu Thr Asp Gln Ser Lys Glu Leu Asp Phe Lys Ser Glu 1 5 10 15 Thr Tyr Lys Asp Ala Tyr Ser Arg Ile Asn Ala Ile Val Ile Glu Gly 20 25 30 Glu Gln Glu Ala His Glu Asn Tyr Ile Thr Leu Ala Gln Leu Leu Pro 35 40 45 Glu Ser His Asp Glu Leu Ile Arg Leu Ser Lys Met Glu Ser Arg His 50 55 60 Lys Lys Gly Phe Glu Ala Cys Gly Arg Asn Leu Ala Val Thr Pro Asp 65 70 75 80 
Leu Gln Phe Ala Lys Glu Phe Phe Ser Gly Leu His Gln Asn Phe Gln 85 90 95 Thr Ala Ala Ala Glu Gly Lys Val Val Thr Cys Leu Leu Ile Gln Ser 100 105 110 Leu Ile Ile Glu Cys Phe Ala Ile Ala Ala Tyr Asn Ile Tyr Ile Pro 115 120 125 Val Ala Asp Asp Phe Ala Arg Lys Ile Thr Glu Gly Val Val Lys Glu 130 135 140 Glu Tyr Ser His Leu Asn Phe Gly Glu Val Trp Leu Lys Glu His Phe 145 150 155 160 Ala Glu Ser Lys Ala Glu Leu Glu Leu Ala Asn Arg Gln Asn Leu Pro 165 170 175 Ile Val Trp Lys Met Leu Asn Gln Val Glu Gly Asp Ala His Thr Met 180 185 190 Ala Met Glu Lys Asp Ala Leu Val Glu Asp Phe Met Ile Gln Tyr Gly 195 200 205 Glu Ala Leu Ser Asn Ile Gly Phe Ser Thr Arg Asp Ile Met Arg Leu 210 215 220 Ser Ala Tyr Gly Leu Ile Gly Ala 225 230
SEQ ID NO: 5, length 300, PRT, Cinnamomum camphora
Met Leu Glu Trp Lys Pro Lys Pro Asn Pro Pro Gln Leu Leu Asp Asp 1 5 10 15 His Phe Gly Pro His Gly Leu Val Phe Arg Arg Thr Phe Ala Ile Arg 20 25 30 Ser Tyr Glu Val Gly Pro Asp Arg Ser Thr Ser Ile Val Ala Val Met 35 40 45 Asn His Leu Gln Glu Ala Ala Leu Asn His Ala Lys Ser Val Gly Ile 50 55 60 Leu Gly Asp Gly Phe Gly Thr Thr Leu Glu Met Ser Lys Arg Asp Leu 65 70 75 80 Ile Trp Val Val Lys Arg Thr His Val Ala Val Glu Arg Tyr Pro Ala 85 90 95 Trp Gly Asp Thr Val Glu Val Glu Cys Trp Val Gly Ala Ser Gly Asn 100 105 110 Asn Gly Arg Arg His Asp Phe Leu Val Arg Asp Cys Lys Thr Gly Glu 115 120 125 Ile Leu Thr Arg Cys Thr Ser Leu Ser Val Met Met Asn Thr Arg Thr 130 135 140 Arg Arg Leu Ser Lys Ile Pro Glu Glu Val Arg Gly Glu Ile Gly Pro 145 150 155 160 Ala Phe Ile Asp Asn Val Ala Val Lys Asp Glu Glu Ile Lys Lys Pro 165 170 175 Gln Lys Leu Asn Asp Ser Thr Ala Asp Tyr Ile Gln Gly Gly Leu Thr 180 185 190 Pro Arg Trp Asn Asp Leu Asp Ile Asn Gln His Val Asn Asn Ile Lys 195 200 205 Tyr Val Asp Trp Ile Leu Glu Thr Val Pro Asp Ser Ile Phe Glu Ser 210 215 220 His His Ile Ser Ser Phe Thr Ile Glu Tyr Arg Arg Glu Cys Thr Met 225 230 235 240 Asp Ser Val Leu Gln Ser Leu Thr Thr Val Ser Gly Gly Ser Ser Glu 245 250 255 Ala Gly Leu Val Cys Glu His Leu Leu
Gln Leu Glu Gly Gly Ser Glu 260 265 270 Val Leu Arg Ala Lys Thr Glu Trp Arg Pro Lys Leu Thr Asp Ser Phe 275 280 285 Arg Gly Ile Ser Val Ile Pro Ala Glu Ser Ser Val 290 295 300
SEQ ID NO: 6, length 325, PRT, Bacillus subtilis
Met Ser Lys Ala Lys Ile Thr Ala Ile Gly Thr Tyr Ala Pro Ser Arg 1 5 10 15 Arg Leu Thr Asn Ala Asp Leu Glu Lys Ile Val Asp Thr Ser Asp Glu 20 25 30 Trp Ile Val Gln Arg Thr Gly Met Arg Glu Arg Arg Ile Ala Asp Glu 35 40 45 His Gln Phe Thr Ser Asp Leu Cys Ile Glu Ala Val Lys Asn Leu Lys 50 55 60 Ser Arg Tyr Lys Gly Thr Leu Asp Asp Val Asp Met Ile Leu Val Ala 65 70 75 80 Thr Thr Thr Ser Asp Tyr Ala Phe Pro Ser Thr Ala Cys Arg Val Gln 85 90 95 Glu Tyr Phe Gly Trp Glu Ser Thr Gly Ala Leu Asp Ile Asn Ala Thr 100 105 110 Cys Ala Gly Leu Thr Tyr Gly Leu His Leu Ala Asn Gly Leu Ile Thr 115 120 125 Ser Gly Leu His Gln Lys Ile Leu Val Ile Ala Gly Glu Thr Leu Ser 130 135 140 Lys Val Thr Asp Tyr Thr Asp Arg Thr Thr Cys Val Leu Phe Gly Asp 145 150 155 160 Ala Ala Gly Ala Leu Leu Val Glu Arg Asp Glu Glu Thr Pro Gly Phe 165 170 175 Leu Ala Ser Val Gln Gly Thr Ser Gly Asn Gly Gly Asp Ile Leu Tyr 180 185 190 Arg Ala Gly Leu Arg Asn Glu Ile Asn Gly Val Gln Leu Val Gly Ser 195 200 205 Gly Lys Met Val Gln Asn Gly Arg Glu Val Tyr Lys Trp Ala Ala Arg 210 215 220 Thr Val Pro Gly Glu Phe Glu Arg Leu Leu His Lys Ala Gly Leu Ser 225 230 235 240 Ser Asp Asp Leu Asp Trp Phe Val Pro His Ser Ala Asn Leu Arg Met 245 250 255 Ile Glu Ser Ile Cys Glu Lys Thr Pro Phe Pro Ile Glu Lys Thr Leu 260 265 270 Thr Ser Val Glu His Tyr Gly Asn Thr Ser Ser Val Ser Ile Val Leu 275 280 285 Ala Leu Asp Leu Ala Val Lys Ala Gly Lys Leu Lys Lys Asp Gln Ile 290 295 300 Val Leu Leu Phe Gly Phe Gly Gly Gly Leu Thr Tyr Thr Gly Leu Leu 305 310 315 320 Ile Lys Trp Gly Met 325
SEQ ID NO: 7, length 330, PRT, Bacillus subtilis
Met Ser Thr Asn Arg His Gln Ala Leu Gly Leu Thr Asp Gln Glu Ala 1 5 10 15 Val Asp Met Tyr Arg Thr Met Leu Leu Ala Arg Lys Ile Asp Glu Arg 20 25 30 Met Trp Leu Leu Asn Arg Ser Gly Lys Ile Pro Phe Val Ile Ser Cys 35 40 45 Gln Gly
Gln Glu Ala Ala Gln Val Gly Ala Ala Phe Ala Leu Asp Arg 50 55 60 Glu Met Asp Tyr Val Leu Pro Tyr Tyr Arg Asp Met Gly Val Val Leu 65 70 75 80 Ala Phe Gly Met Thr Ala Lys Asp Leu Met Met Ser Gly Phe Ala Lys 85 90 95 Ala Ala Asp Pro Asn Ser Gly Gly Arg Gln Met Pro Gly His Phe Gly 100 105 110 Gln Lys Lys Asn Arg Ile Val Thr Gly Ser Ser Pro Val Thr Thr Gln 115 120 125 Val Pro His Ala Val Gly Ile Ala Leu Ala Gly Arg Met Glu Lys Lys 130 135 140 Asp Ile Ala Ala Phe Val Thr Phe Gly Glu Gly Ser Ser Asn Gln Gly 145 150 155 160 Asp Phe His Glu Gly Ala Asn Phe Ala Ala Val His Lys Leu Pro Val 165 170 175 Ile Phe Met Cys Glu Asn Asn Lys Tyr Ala Ile Ser Val Pro Tyr Asp 180 185 190 Lys Gln Val Ala Cys Glu Asn Ile Ser Asp Arg Ala Ile Gly Tyr Gly 195 200 205 Met Pro Gly Val Thr Val Asn Gly Asn Asp Pro Leu Glu Val Tyr Gln 210 215 220 Ala Val Lys Glu Ala Arg Glu Arg Ala Arg Arg Gly Glu Gly Pro Thr 225 230 235 240 Leu Ile Glu Thr Ile Ser Tyr Arg Leu Thr Pro His Ser Ser Asp Asp 245 250 255 Asp Asp Ser Ser Tyr Arg Gly Arg Glu Glu Val Glu Glu Ala Lys Lys 260
265 270 Ser Asp Pro Leu Leu Thr Tyr Gln Ala Tyr Leu Lys Glu Thr Gly Leu 275 280 285 Leu Ser Asp Glu Ile Glu Gln Thr Met Leu Asp Glu Ile Met Ala Ile 290 295 300 Val Asn Glu Ala Thr Asp Glu Ala Glu Asn Ala Pro Tyr Ala Ala Pro 305 310 315 320 Glu Ser Ala Leu Asp Tyr Val Tyr Ala Lys 325 330
SEQ ID NO: 8, length 327, PRT, Bacillus subtilis
Met Ser Val Met Ser Tyr Ile Asp Ala Ile Asn Leu Ala Met Lys Glu 1 5 10 15 Glu Met Glu Arg Asp Ser Arg Val Phe Val Leu Gly Glu Asp Val Gly 20 25 30 Arg Lys Gly Gly Val Phe Lys Ala Thr Ala Gly Leu Tyr Glu Gln Phe 35 40 45 Gly Glu Glu Arg Val Met Asp Thr Pro Leu Ala Glu Ser Ala Ile Ala 50 55 60 Gly Val Gly Ile Gly Ala Ala Met Tyr Gly Met Arg Pro Ile Ala Glu 65 70 75 80 Met Gln Phe Ala Asp Phe Ile Met Pro Ala Val Asn Gln Ile Ile Ser 85 90 95 Glu Ala Ala Lys Ile Arg Tyr Arg Ser Asn Asn Asp Trp Ser Cys Pro 100 105 110 Ile Val Val Arg Ala Pro Tyr Gly Gly Gly Val His Gly Ala Leu Tyr 115 120 125 His Ser Gln Ser Val Glu Ala Ile Phe Ala Asn Gln Pro Gly Leu Lys 130 135 140 Ile Val Met Pro Ser Thr Pro Tyr Asp Ala Lys Gly Leu Leu Lys Ala 145 150 155 160 Ala Val Arg Asp Glu Asp Pro Val Leu Phe Phe Glu His Lys Arg Ala 165 170 175 Tyr Arg Leu Ile Lys Gly Glu Val Pro Ala Asp Asp Tyr Val Leu Pro 180 185 190 Ile Gly Lys Ala Asp Val Lys Arg Glu Gly Asp Asp Ile Thr Val Ile 195 200 205 Thr Tyr Gly Leu Cys Val His Phe Ala Leu Gln Ala Ala Glu Arg Leu 210 215 220 Glu Lys Asp Gly Ile Ser Ala His Val Val Asp Leu Arg Thr Val Tyr 225 230 235 240 Pro Leu Asp Lys Glu Ala Ile Ile Glu Ala Ala Ser Lys Thr Gly Lys 245 250 255 Val Leu Leu Val Thr Glu Asp Thr Lys Glu Gly Ser Ile Met Ser Glu 260 265 270 Val Ala Ala Ile Ile Ser Glu His Cys Leu Phe Asp Leu Asp Ala Pro 275 280 285 Ile Lys Arg Leu Ala Gly Pro Asp Ile Pro Ala Met Pro Tyr Ala Pro 290 295 300 Thr Met Glu Lys Tyr Phe Met Val Asn Pro Asp Lys Val Glu Ala Ala 305 310 315 320 Met Arg Glu Leu Ala Glu Phe 325
SEQ ID NO: 9, length 424, PRT, Bacillus subtilis
Met Ala Ile Glu Gln Met Thr Met Pro Gln Leu Gly Glu Ser Val Thr 1 5 10 15 Glu Gly Thr Ile Ser Lys Trp Leu Val
Ala Pro Gly Asp Lys Val Asn 20 25 30 Lys Tyr Asp Pro Ile Ala Glu Val Met Thr Asp Lys Val Asn Ala Glu 35 40 45 Val Pro Ser Ser Phe Thr Gly Thr Ile Thr Glu Leu Val Gly Glu Glu 50 55 60 Gly Gln Thr Leu Gln Val Gly Glu Met Ile Cys Lys Ile Glu Thr Glu 65 70 75 80 Gly Ala Asn Pro Ala Glu Gln Lys Gln Glu Gln Pro Ala Ala Ser Glu 85 90 95 Ala Ala Glu Asn Pro Val Ala Lys Ser Ala Gly Ala Ala Asp Gln Pro 100 105 110 Asn Lys Lys Arg Tyr Ser Pro Ala Val Leu Arg Leu Ala Gly Glu His 115 120 125 Gly Ile Asp Leu Asp Gln Val Thr Gly Thr Gly Ala Gly Gly Arg Ile 130 135 140 Thr Arg Lys Asp Ile Gln Arg Leu Ile Glu Thr Gly Gly Val Gln Glu 145 150 155 160 Gln Asn Pro Glu Glu Leu Lys Thr Ala Ala Pro Ala Pro Lys Ser Ala 165 170 175 Ser Lys Pro Glu Pro Lys Glu Glu Thr Ser Tyr Pro Ala Ser Ala Ala 180 185 190 Gly Asp Lys Glu Ile Pro Val Thr Gly Val Arg Lys Ala Ile Ala Ser 195 200 205 Asn Met Lys Arg Ser Lys Thr Glu Ile Pro His Ala Trp Thr Met Met 210 215 220 Glu Val Asp Val Thr Asn Met Val Ala Tyr Arg Asn Ser Ile Lys Asp 225 230 235 240 Ser Phe Lys Lys Thr Glu Gly Phe Asn Leu Thr Phe Phe Ala Phe Phe 245 250 255 Val Lys Ala Val Ala Gln Ala Leu Lys Glu Phe Pro Gln Met Asn Ser 260 265 270 Met Trp Ala Gly Asp Lys Ile Ile Gln Lys Lys Asp Ile Asn Ile Ser 275 280 285 Ile Ala Val Ala Thr Glu Asp Ser Leu Phe Val Pro Val Ile Lys Asn 290 295 300 Ala Asp Glu Lys Thr Ile Lys Gly Ile Ala Lys Asp Ile Thr Gly Leu 305 310 315 320 Ala Lys Lys Val Arg Asp Gly Lys Leu Thr Ala Asp Asp Met Gln Gly 325 330 335 Gly Thr Phe Thr Val Asn Asn Thr Gly Ser Phe Gly Ser Val Gln Ser 340 345 350 Met Gly Ile Ile Asn Tyr Pro Gln Ala Ala Ile Leu Gln Val Glu Ser 355 360 365 Ile Val Lys Arg Pro Val Val Met Asp Asn Gly Met Ile Ala Val Arg 370 375 380 Asp Met Val Asn Leu Cys Leu Ser Leu Asp His Arg Val Leu Asp Gly 385 390 395 400 Leu Val Cys Gly Arg Phe Leu Gly Arg Val Lys Gln Ile Leu Glu Ser 405 410 415 Ile Asp Glu Lys Thr Ser Val Tyr 420
SEQ ID NO: 10, length 474, PRT, Bacillus subtilis
Met Ala Thr Glu Tyr Asp Val Val Ile Leu Gly Gly Gly Thr Gly Gly 1 5
10 15 Tyr Val Ala Ala Ile Arg Ala Ala Gln Leu Gly Leu Lys Thr Ala Val 20 25 30 Val Glu Lys Glu Lys Leu Gly Gly Thr Cys Leu His Lys Gly Cys Ile 35 40 45 Pro Ser Lys Ala Leu Leu Arg Ser Ala Glu Val Tyr Arg Thr Ala Arg 50 55 60 Glu Ala Asp Gln Phe Gly Val Glu Thr Ala Gly Val Ser Leu Asn Phe 65 70 75 80 Glu Lys Val Gln Gln Arg Lys Gln Ala Val Val Asp Lys Leu Ala Ala 85 90 95 Gly Val Asn His Leu Met Lys Lys Gly Lys Ile Asp Val Tyr Thr Gly 100 105 110 Tyr Gly Arg Ile Leu Gly Pro Ser Ile Phe Ser Pro Leu Pro Gly Thr 115 120 125 Ile Ser Val Glu Arg Gly Asn Gly Glu Glu Asn Asp Met Leu Ile Pro 130 135 140 Lys Gln Val Ile Ile Ala Thr Gly Ser Arg Pro Arg Met Leu Pro Gly 145 150 155 160 Leu Glu Val Asp Gly Lys Ser Val Leu Thr Ser Asp Glu Ala Leu Gln 165 170 175 Met Glu Glu Leu Pro Gln Ser Ile Ile Ile Val Gly Gly Gly Val Ile 180 185 190 Gly Ile Glu Trp Ala Ser Met Leu His Asp Phe Gly Val Lys Val Thr 195 200 205 Val Ile Glu Tyr Ala Asp Arg Ile Leu Pro Thr Glu Asp Leu Glu Ile 210 215 220 Ser Lys Glu Met Glu Ser Leu Leu Lys Lys Lys Gly Ile Gln Phe Ile 225 230 235 240 Thr Gly Ala Lys Val Leu Pro Asp Thr Met Thr Lys Thr Ser Asp Asp 245 250 255 Ile Ser Ile Gln Ala Glu Lys Asp Gly Glu Thr Val Thr Tyr Ser Ala 260 265 270 Glu Lys Met Leu Val Ser Ile Gly Arg Gln Ala Asn Ile Glu Gly Ile 275 280 285 Gly Leu Glu Asn Thr Asp Ile Val Thr Glu Asn Gly Met Ile Ser Val 290 295 300 Asn Glu Ser Cys Gln Thr Lys Glu Ser His Ile Tyr Ala Ile Gly Asp 305 310 315 320 Val Ile Gly Gly Leu Gln Leu Ala His Val Ala Ser His Glu Gly Ile 325 330 335 Ile Ala Val Glu His Phe Ala Gly Leu Asn Pro His Pro Leu Asp Pro 340 345 350 Thr Leu Val Pro Lys Cys Ile Tyr Ser Ser Pro Glu Ala Ala Ser Val 355 360 365 Gly Leu Thr Glu Asp Glu Ala Lys Ala Asn Gly His Asn Val Lys Ile 370 375 380 Gly Lys Phe Pro Phe Met Ala Ile Gly Lys Ala Leu Val Tyr Gly Glu 385 390 395 400 Ser Asp Gly Phe Val Lys Ile Val Ala Asp Arg Asp Thr Asp Asp Ile 405 410 415 Leu Gly Val His Met Ile Gly Pro His Val Thr Asp Met Ile Ser Glu 420 425 430 Ala Gly Leu 
Ala Lys Val Leu Asp Ala Thr Pro Trp Glu Val Gly Gln 435 440 445 Thr Ile His Pro His Pro Thr Leu Ser Glu Ala Ile Gly Glu Ala Ala 450 455 460 Leu Ala Ala Asp Gly Lys Ala Ile His Phe 465 470 111443DNAPhotorhabdus luminescens 11atgaacaaga aaatcagctt catcatcaac ggtcgcgtag aaatttttcc ggagtctgat 60gacctggttc aaagcatcaa ttttggtgac aatagcgtcc acctgccggt gctgaacgat 120agccaagtga aaaacattat cgactataat gagaataatg agctgcaact gcacaatatc 180attaactttc tgtataccgt cggtcagcgc tggaaaaacg aagaatacag ccgtcgtcgt 240acctatattc gcgatctgaa gcgttatatg ggctacagcg aggaaatggc gaaactggaa 300gccaattgga ttagcatgat tctgtgctct aaaggtggtt tgtacgatct ggtgaaaaat 360gagctgggca gccgtcacat tatggacgaa tggctgccgc aagacgaaag ctacatccgt 420gccttcccga aaggcaagag cgttcatctg ctgaccggta atgtcccgct gtcgggcgtg 480ctgtccatcc tgcgcgcgat tctgaccaag aaccagtgca tcattaagac gagcagcacg 540gatcctttca cggcgaatgc gctggcgctg agcttcatcg acgttgaccc acatcacccg 600gtgacccgta gcctgtctgt cgtttattgg cagcaccaag gtgacatcag cttggcgaaa 660gagattatgc agcacgccga tgtggtcgtt gcctggggtg gtgaggatgc aattaactgg 720gcggttaaac acgcaccgcc ggatatcgac gtcatgaaat tcggtccgaa aaagagcttc 780tgcatcattg acaacccggt tgacttggtt agcgcagcga ccggcgcagc acacgacgtc 840tgtttttacg atcagcaggc atgctttagc acgcagaaca tctactacat gggctcccat 900tacgaggagt ttaagctggc tttgatcgaa aaactgaatc tgtatgcaca tatcctgcct 960aacaccaaga aggatttcga cgaaaaggca gcttattcct tggtgcaaaa ggagtgtctg 1020ttcgccggtt tgaaagtgga agttgacgtt catcaacgct ggatggttat tgaatccaat 1080gctggcgttg agctgaacca gccgctgggt cgttgtgtgt acttgcatca cgtggataac 1140atcgagcaga ttttgccgta tgtgcgtaag aacaaaaccc agacgattag cgtgtttccg 1200tgggaggctg cgctgaagta ccgcgatctg ctggccctga aaggcgcgga gcgtattgtt 1260gaggcgggta tgaataacat tttccgtgtg ggtggtgcgc acgatggcat gcgtccgctg 1320caacgcctgg tcacttacat tagccacgag cgtccgagcc attacaccgc gaaggacgtc 1380gcggtcgaaa tcgaacagac gcgctttctg gaagaggaca agttcctggt gtttgttcca 1440taa 1443121113DNAPhotorhabdus luminescens 12atgactagct acgtcgacaa acaggaaatc accgcgagca gcgagattga cgacctgatc 
60ttttccagcg atccgttggt gtggtcctat gatgagcaag aaaagattcg caagaaactg 120gtcctggatg cgttccgcca ccactacaag cactgtcaag agtaccgtca ttattgccaa 180gcccataaag tcgacgataa cattacggaa attgacgata tcccggtttt cccgacctct 240gttttcaagt tcacccgtct gctgacctcc aacgagaatg agattgagag ctggtttact 300tcgagcggta ccaatggtct gaaaagccaa gtcccgcgtg atcgtctgag cattgaacgt 360ctgctgggca gcgtgagcta cggcatgaag tacatcggtt cgtggtttga ccatcaaatg 420gagctggtta acttgggtcc ggatcgcttt aatgcccaca acatttggtt caagtacgtt 480atgagcctgg ttgagctgtt gtatccgacg agcttcaccg tgacggaaga gcacatcgac 540ttcgtgcaga cgctgaacag cctggaacgc attaaacatc agggcaaaga catttgtctg 600atcggttctc cgtatttcat ctatctgctg tgccgttaca tgaaggacaa gaacatcagc 660tttagcggtg acaagagcct gtatatcatc accggtggcg gttggaaaag ctacgaaaaa 720gagtccctga agcgtaatga ctttaatcac ctgttgttcg atacgttcaa tctgagcaac 780attaaccaga tccgtgacat ctttaaccag gtcgaactga atacctgttt ctttgaggac 840gagatgcagc gcaaacacgt cccgccgtgg gtatacgcgc gtgcgctgga tcctgaaacc 900ttgaaaccgg ttccagatgg catgcctggt ctgatgagct atatggatgc tagctctacg 960agctacccgg catttatcgt gaccgacgat attggtatta tcagccgcga gtacggtcaa 1020tatccgggcg tgctggttga aattctgcgt cgtgtgaata cccgcaagca gaaaggctgc 1080gcgttgtctc tgacggaggc attcggttcc taa 111313699DNANostoc punctiforme 13atgcagcagc ttacagacca atctaaagaa ttagatttca agagcgaaac atacaaagat 60gcttatagcc ggattaatgc gatcgtgatt gaaggggaac aagaagccca tgaaaattac 120atcacactag cccaactgct gccagaatct catgatgaat tgattcgcct atccaagatg 180gaaagccgcc ataagaaagg atttgaagct tgtgggcgca atttagctgt taccccagat 240ttgcaatttg ccaaagagtt tttctccggc ctacaccaaa attttcaaac agctgccgca 300gaagggaaag tggttacttg tctgttgatt cagtctttaa ttattgaatg ttttgcgatc 360gcagcatata acatttacat ccccgttgcc gacgatttcg cccgtaaaat tactgaagga 420gtagttaaag aagaatacag ccacctcaat tttggagaag tttggttgaa agaacacttt 480gcagaatcca aagctgaact tgaacttgca aatcgccaga acctacccat cgtctggaaa 540atgctcaacc aagtagaagg tgatgcccac acaatggcaa tggaaaaaga tgctttggta 600gaagacttca tgattcagta tggtgaagca ttgagtaaca ttggtttttc 
gactcgcgat 660attatgcgct tgtcagccta cggactcata ggtgcttaa 69914924DNAPhotorhabdus luminescens 14atggaaaacg agagcaagta caaaacgatc gaccacgtaa tctgcgtgga gggtaacaaa 60aagattcacg tgtgggagac tttgccagaa gagaacagcc cgaaacgcaa aaacgcaatc 120attatcgcga gcggtttcgc acgccgcatg gatcattttg cgggcctggc cgaatacctg 180agccgtaacg gcttccacgt tatccgttat gacagcctgc atcacgtcgg cctgtcgtct 240ggtaccatcg acgagttcac gatgagcatc ggcaagcaaa gcctgttggc ggttgttgat 300tggctgacca cgcgtaagat caacaatttt ggtatgctgg cttccagcct gtccgcacgc 360attgcgtacg cttctctgag cgagattaat gccagctttc tgatcaccgc cgtgggtttc 420gtcaatctgc gttatagcct ggagcgtgcg ctgggtttcg attacttgag cctgccgatt 480aacgagctgc cgaataatct ggactttgaa ggccataagt tgggtgcgga ggtctttgcg 540cgtgattgcc tggattttgg ttgggaagat ctggcatcga cgattaacaa tatgatgtat 600ctggatatcc cgtttattgc tttcacggcg aataacgaca attgggttaa gcaagacgag 660gttatcaccc tgctgtctaa cattcgttcc aatcgctgta aaatctatag cttgctgggc 720agcagccacg acttgagcga aaatctggtc gtgctgcgca acttctacca gagcgtgacc 780aaagcagcga ttgcaatgga taacgaccac ctggacattg acgtggatat caccgaaccg 840agcttcgaac atctgaccat cgcgaccgtt aacgaacgtc gtatgcgtat tgagattgag 900aatcaggcca tttccctgag ctaa 924153588DNAArtificial SequenceCodon-optimised operon 15catgggaagg agatatagat atgaacaaga aaatcagctt catcatcaac ggtcgcgtag 60aaatttttcc ggagtctgat gacctggttc aaagcatcaa ttttggtgac aatagcgtcc 120acctgccggt gctgaacgat agccaagtga aaaacattat cgactataat gagaataatg 180agctgcaact gcacaatatc attaactttc tgtataccgt cggtcagcgc tggaaaaacg 240aagaatacag ccgtcgtcgt acctatattc gcgatctgaa gcgttatatg ggctacagcg 300aggaaatggc gaaactggaa gccaattgga ttagcatgat tctgtgctct aaaggtggtt 360tgtacgatct ggtgaaaaat gagctgggca gccgtcacat tatggacgaa tggctgccgc 420aagacgaaag ctacatccgt gccttcccga aaggcaagag cgttcatctg ctgaccggta 480atgtcccgct gtcgggcgtg ctgtccatcc tgcgcgcgat tctgaccaag aaccagtgca 540tcattaagac gagcagcacg gatcctttca cggcgaatgc gctggcgctg agcttcatcg 600acgttgaccc acatcacccg gtgacccgta gcctgtctgt cgtttattgg cagcaccaag 660gtgacatcag cttggcgaaa 
gagattatgc agcacgccga tgtggtcgtt gcctggggtg 720gtgaggatgc aattaactgg gcggttaaac acgcaccgcc ggatatcgac gtcatgaaat 780tcggtccgaa aaagagcttc tgcatcattg acaacccggt tgacttggtt agcgcagcga 840ccggcgcagc acacgacgtc tgtttttacg atcagcaggc atgctttagc acgcagaaca 900tctactacat gggctcccat tacgaggagt ttaagctggc tttgatcgaa aaactgaatc 960tgtatgcaca tatcctgcct aacaccaaga aggatttcga cgaaaaggca gcttattcct 1020tggtgcaaaa ggagtgtctg ttcgccggtt tgaaagtgga agttgacgtt catcaacgct 1080ggatggttat tgaatccaat gctggcgttg agctgaacca gccgctgggt cgttgtgtgt 1140acttgcatca cgtggataac atcgagcaga ttttgccgta tgtgcgtaag aacaaaaccc 1200agacgattag cgtgtttccg tgggaggctg cgctgaagta ccgcgatctg ctggccctga 1260aaggcgcgga gcgtattgtt gaggcgggta tgaataacat tttccgtgtg ggtggtgcgc 1320acgatggcat gcgtccgctg caacgcctgg tcacttacat tagccacgag cgtccgagcc 1380attacaccgc gaaggacgtc gcggtcgaaa tcgaacagac gcgctttctg gaagaggaca 1440agttcctggt gtttgttcca taagaattct aacactgtat aacattaaga aggaggtaaa 1500agatatgact agctacgtcg acaaacagga aatcaccgcg agcagcgaga ttgacgacct 1560gatcttttcc agcgatccgt tggtgtggtc ctatgatgag caagaaaaga ttcgcaagaa 1620actggtcctg gatgcgttcc gccaccacta caagcactgt caagagtacc gtcattattg 1680ccaagcccat aaagtcgacg ataacattac ggaaattgac gatatcccgg ttttcccgac 1740ctctgttttc aagttcaccc gtctgctgac ctccaacgag aatgagattg agagctggtt 1800tacttcgagc ggtaccaatg gtctgaaaag ccaagtcccg cgtgatcgtc tgagcattga 1860acgtctgctg ggcagcgtga gctacggcat gaagtacatc ggttcgtggt ttgaccatca 1920aatggagctg gttaacttgg gtccggatcg ctttaatgcc cacaacattt ggttcaagta 1980cgttatgagc ctggttgagc tgttgtatcc gacgagcttc accgtgacgg aagagcacat 2040cgacttcgtg cagacgctga acagcctgga
acgcattaaa catcagggca aagacatttg 2100tctgatcggt tctccgtatt tcatctatct gctgtgccgt tacatgaagg acaagaacat 2160cagctttagc ggtgacaaga gcctgtatat catcaccggt ggcggttgga aaagctacga 2220aaaagagtcc ctgaagcgta atgactttaa tcacctgttg ttcgatacgt tcaatctgag 2280caacattaac cagatccgtg acatctttaa ccaggtcgaa ctgaatacct gtttctttga 2340ggacgagatg cagcgcaaac acgtcccgcc gtgggtatac gcgcgtgcgc tggatcctga 2400aaccttgaaa ccggttccag atggcatgcc tggtctgatg agctatatgg atgctagctc 2460tacgagctac ccggcattta tcgtgaccga cgatattggt attatcagcc gcgagtacgg 2520tcaatatccg ggcgtgctgg ttgaaattct gcgtcgtgtg aatacccgca agcagaaagg 2580ctgcgcgttg tctctgacgg aggcattcgg ttcctaaaag ctttaacact gtataacatt 2640aagaaggagg taaatataat ggaaaacgag agcaagtaca aaacgatcga ccacgtaatc 2700tgcgtggagg gtaacaaaaa gattcacgtg tgggagactt tgccagaaga gaacagcccg 2760aaacgcaaaa acgcaatcat tatcgcgagc ggtttcgcac gccgcatgga tcattttgcg 2820ggcctggccg aatacctgag ccgtaacggc ttccacgtta tccgttatga cagcctgcat 2880cacgtcggcc tgtcgtctgg taccatcgac gagttcacga tgagcatcgg caagcaaagc 2940ctgttggcgg ttgttgattg gctgaccacg cgtaagatca acaattttgg tatgctggct 3000tccagcctgt ccgcacgcat tgcgtacgct tctctgagcg agattaatgc cagctttctg 3060atcaccgccg tgggtttcgt caatctgcgt tatagcctgg agcgtgcgct gggtttcgat 3120tacttgagcc tgccgattaa cgagctgccg aataatctgg actttgaagg ccataagttg 3180ggtgcggagg tctttgcgcg tgattgcctg gattttggtt gggaagatct ggcatcgacg 3240attaacaata tgatgtatct ggatatcccg tttattgctt tcacggcgaa taacgacaat 3300tgggttaagc aagacgaggt tatcaccctg ctgtctaaca ttcgttccaa tcgctgtaaa 3360atctatagct tgctgggcag cagccacgac ttgagcgaaa atctggtcgt gctgcgcaac 3420ttctaccaga gcgtgaccaa agcagcgatt gcaatggata acgaccacct ggacattgac 3480gtggatatca ccgaaccgag cttcgaacat ctgaccatcg cgaccgttaa cgaacgtcgt 3540atgcgtattg agattgagaa tcaggccatt tccctgagct aagcggcc 3588167511DNAArtificial SequenceExpression vector 16aacattagtg caggcagctt ccacagcaat ggcatcctgg tcatccagcg gatagttaat 60gatcagccca ctgacgcgtt gcgcgagaag attgtgcacc gccgctttac aggcttcgac 120gccgcttcgt tctaccatcg acaccaccac 
gctggcaccc agttgatcgg cgcgagattt 180aatcgccgcg acaatttgcg acggcgcgtg cagggccaga ctggaggtgg caacgccaat 240cagcaacgac tgtttgcccg ccagttgttg tgccacgcgg ttgggaatgt aattcagctc 300cgccatcgcc gcttccactt tttcccgcgt tttcgcagaa acgtggctgg cctggttcac 360cacgcgggaa acggtctgat aagagacacc ggcatactct gcgacatcgt ataacgttac 420tggtttcaca ttcaccaccc tgaattgact ctcttccggg cgctatcatg ccataccgcg 480aaaggttttg cgccattcga tggtgtccgg gatctcgacg ctctccctta tgcgactcct 540gcattaggaa attaatacga ctcactatag gggaattgtg agcggataac aattcccctg 600tagaaataat tttgtttaac tttaataagg agatatacca tgggaaggag atatagatat 660gaacaagaaa atcagcttca tcatcaacgg tcgcgtagaa atttttccgg agtctgatga 720cctggttcaa agcatcaatt ttggtgacaa tagcgtccac ctgccggtgc tgaacgatag 780ccaagtgaaa aacattatcg actataatga gaataatgag ctgcaactgc acaatatcat 840taactttctg tataccgtcg gtcagcgctg gaaaaacgaa gaatacagcc gtcgtcgtac 900ctatattcgc gatctgaagc gttatatggg ctacagcgag gaaatggcga aactggaagc 960caattggatt agcatgattc tgtgctctaa aggtggtttg tacgatctgg tgaaaaatga 1020gctgggcagc cgtcacatta tggacgaatg gctgccgcaa gacgaaagct acatccgtgc 1080cttcccgaaa ggcaagagcg ttcatctgct gaccggtaat gtcccgctgt cgggcgtgct 1140gtccatcctg cgcgcgattc tgaccaagaa ccagtgcatc attaagacga gcagcacgga 1200tcctttcacg gcgaatgcgc tggcgctgag cttcatcgac gttgacccac atcacccggt 1260gacccgtagc ctgtctgtcg tttattggca gcaccaaggt gacatcagct tggcgaaaga 1320gattatgcag cacgccgatg tggtcgttgc ctggggtggt gaggatgcaa ttaactgggc 1380ggttaaacac gcaccgccgg atatcgacgt catgaaattc ggtccgaaaa agagcttctg 1440catcattgac aacccggttg acttggttag cgcagcgacc ggcgcagcac acgacgtctg 1500tttttacgat cagcaggcat gctttagcac gcagaacatc tactacatgg gctcccatta 1560cgaggagttt aagctggctt tgatcgaaaa actgaatctg tatgcacata tcctgcctaa 1620caccaagaag gatttcgacg aaaaggcagc ttattccttg gtgcaaaagg agtgtctgtt 1680cgccggtttg aaagtggaag ttgacgttca tcaacgctgg atggttattg aatccaatgc 1740tggcgttgag ctgaaccagc cgctgggtcg ttgtgtgtac ttgcatcacg tggataacat 1800cgagcagatt ttgccgtatg tgcgtaagaa caaaacccag acgattagcg tgtttccgtg 1860ggaggctgcg 
ctgaagtacc gcgatctgct ggccctgaaa ggcgcggagc gtattgttga 1920ggcgggtatg aataacattt tccgtgtggg tggtgcgcac gatggcatgc gtccgctgca 1980acgcctggtc acttacatta gccacgagcg tccgagccat tacaccgcga aggacgtcgc 2040ggtcgaaatc gaacagacgc gctttctgga agaggacaag ttcctggtgt ttgttccata 2100agaattctaa cactgtataa cattaagaag gaggtaaaag atatgactag ctacgtcgac 2160aaacaggaaa tcaccgcgag cagcgagatt gacgacctga tcttttccag cgatccgttg 2220gtgtggtcct atgatgagca agaaaagatt cgcaagaaac tggtcctgga tgcgttccgc 2280caccactaca agcactgtca agagtaccgt cattattgcc aagcccataa agtcgacgat 2340aacattacgg aaattgacga tatcccggtt ttcccgacct ctgttttcaa gttcacccgt 2400ctgctgacct ccaacgagaa tgagattgag agctggttta cttcgagcgg taccaatggt 2460ctgaaaagcc aagtcccgcg tgatcgtctg agcattgaac gtctgctggg cagcgtgagc 2520tacggcatga agtacatcgg ttcgtggttt gaccatcaaa tggagctggt taacttgggt 2580ccggatcgct ttaatgccca caacatttgg ttcaagtacg ttatgagcct ggttgagctg 2640ttgtatccga cgagcttcac cgtgacggaa gagcacatcg acttcgtgca gacgctgaac 2700agcctggaac gcattaaaca tcagggcaaa gacatttgtc tgatcggttc tccgtatttc 2760atctatctgc tgtgccgtta catgaaggac aagaacatca gctttagcgg tgacaagagc 2820ctgtatatca tcaccggtgg cggttggaaa agctacgaaa aagagtccct gaagcgtaat 2880gactttaatc acctgttgtt cgatacgttc aatctgagca acattaacca gatccgtgac 2940atctttaacc aggtcgaact gaatacctgt ttctttgagg acgagatgca gcgcaaacac 3000gtcccgccgt gggtatacgc gcgtgcgctg gatcctgaaa ccttgaaacc ggttccagat 3060ggcatgcctg gtctgatgag ctatatggat gctagctcta cgagctaccc ggcatttatc 3120gtgaccgacg atattggtat tatcagccgc gagtacggtc aatatccggg cgtgctggtt 3180gaaattctgc gtcgtgtgaa tacccgcaag cagaaaggct gcgcgttgtc tctgacggag 3240gcattcggtt cctaaaagct ttaacactgt ataacattaa gaaggaggta aatataatgg 3300aaaacgagag caagtacaaa acgatcgacc acgtaatctg cgtggagggt aacaaaaaga 3360ttcacgtgtg ggagactttg ccagaagaga acagcccgaa acgcaaaaac gcaatcatta 3420tcgcgagcgg tttcgcacgc cgcatggatc attttgcggg cctggccgaa tacctgagcc 3480gtaacggctt ccacgttatc cgttatgaca gcctgcatca cgtcggcctg tcgtctggta 3540ccatcgacga gttcacgatg agcatcggca agcaaagcct 
gttggcggtt gttgattggc 3600tgaccacgcg taagatcaac aattttggta tgctggcttc cagcctgtcc gcacgcattg 3660cgtacgcttc tctgagcgag attaatgcca gctttctgat caccgccgtg ggtttcgtca 3720atctgcgtta tagcctggag cgtgcgctgg gtttcgatta cttgagcctg ccgattaacg 3780agctgccgaa taatctggac tttgaaggcc ataagttggg tgcggaggtc tttgcgcgtg 3840attgcctgga ttttggttgg gaagatctgg catcgacgat taacaatatg atgtatctgg 3900atatcccgtt tattgctttc acggcgaata acgacaattg ggttaagcaa gacgaggtta 3960tcaccctgct gtctaacatt cgttccaatc gctgtaaaat ctatagcttg ctgggcagca 4020gccacgactt gagcgaaaat ctggtcgtgc tgcgcaactt ctaccagagc gtgaccaaag 4080cagcgattgc aatggataac gaccacctgg acattgacgt ggatatcacc gaaccgagct 4140tcgaacatct gaccatcgcg accgttaacg aacgtcgtat gcgtattgag attgagaatc 4200aggccatttc cctgagctaa gcggccgcat aatgcttaag tcgaacagaa agtaatcgta 4260ttgtacacgg ccgcataatc gaaattaata cgactcacta taggggaatt gtgagcggat 4320aacaattccc catcttagta tattagttaa gtataagaag gagatataca tatggcagat 4380ctcaattgga tatcggccgg ccacgcgatc gctgacgtcg gtaccctcga gtctggtaaa 4440gaaaccgctg ctgcgaaatt tgaacgccag cacatggact cgtctactag cgcagcttaa 4500ttaacctagg ctgctgccac cgctgagcaa taactagcat aaccccttgg ggcctctaaa 4560cgggtcttga ggggtttttt gctgaaacct caggcatttg agaagcacac ggtcacactg 4620cttccggtag tcaataaacc ggtaaaccag caatagacat aagcggctat ttaacgaccc 4680tgccctgaac cgacgaccgg gtcgaatttg ctttcgaatt tctgccattc atccgcttat 4740tatcacttat tcaggcgtag caccaggcgt ttaagggcac caataactgc cttaaaaaaa 4800ttacgccccg ccctgccact catcgcagta ctgttgtaat tcattaagca ttctgccgac 4860atggaagcca tcacagacgg catgatgaac ctgaatcgcc agcggcatca gcaccttgtc 4920gccttgcgta taatatttgc ccatagtgaa aacgggggcg aagaagttgt ccatattggc 4980cacgtttaaa tcaaaactgg tgaaactcac ccagggattg gctgagacga aaaacatatt 5040ctcaataaac cctttaggga aataggccag gttttcaccg taacacgcca catcttgcga 5100atatatgtgt agaaactgcc ggaaatcgtc gtggtattca ctccagagcg atgaaaacgt 5160ttcagtttgc tcatggaaaa cggtgtaaca agggtgaaca ctatcccata tcaccagctc 5220accgtctttc attgccatac ggaactccgg atgagcattc atcaggcggg caagaatgtg 5280aataaaggcc 
ggataaaact tgtgcttatt tttctttacg gtctttaaaa aggccgtaat 5340atccagctga acggtctggt tataggtaca ttgagcaact gactgaaatg cctcaaaatg 5400ttctttacga tgccattggg atatatcaac ggtggtatat ccagtgattt ttttctccat 5460tttagcttcc ttagctcctg aaaatctcga taactcaaaa aatacgcccg gtagtgatct 5520tatttcatta tggtgaaagt tggaacctct tacgtgccga tcaacgtctc attttcgcca 5580aaagttggcc cagggcttcc cggtatcaac agggacacca ggatttattt attctgcgaa 5640gtgatcttcc gtcacaggta tttattcggc gcaaagtgcg tcgggtgatg ctgccaactt 5700actgatttag tgtatgatgg tgtttttgag gtgctccagt ggcttctgtt tctatcagct 5760gtccctcctg ttcagctact gacggggtgg tgcgtaacgg caaaagcacc gccggacatc 5820agcgctagcg gagtgtatac tggcttacta tgttggcact gatgagggtg tcagtgaagt 5880gcttcatgtg gcaggagaaa aaaggctgca ccggtgcgtc agcagaatat gtgatacagg 5940atatattccg cttcctcgct cactgactcg ctacgctcgg tcgttcgact gcggcgagcg 6000gaaatggctt acgaacgggg cggagatttc ctggaagatg ccaggaagat acttaacagg 6060gaagtgagag ggccgcggca aagccgtttt tccataggct ccgcccccct gacaagcatc 6120acgaaatctg acgctcaaat cagtggtggc gaaacccgac aggactataa agataccagg 6180cgtttcccct ggcggctccc tcgtgcgctc tcctgttcct gcctttcggt ttaccggtgt 6240cattccgctg ttatggccgc gtttgtctca ttccacgcct gacactcagt tccgggtagg 6300cagttcgctc caagctggac tgtatgcacg aaccccccgt tcagtccgac cgctgcgcct 6360tatccggtaa ctatcgtctt gagtccaacc cggaaagaca tgcaaaagca ccactggcag 6420cagccactgg taattgattt agaggagtta gtcttgaagt catgcgccgg ttaaggctaa 6480actgaaagga caagttttgg tgactgcgct cctccaagcc agttacctcg gttcaaagag 6540ttggtagctc agagaacctt cgaaaaaccg ccctgcaagg cggttttttc gttttcagag 6600caagagatta cgcgcagacc aaaacgatct caagaagatc atcttattaa tcagataaaa 6660tatttctaga tttcagtgca atttatctct tcaaatgtag cacctgaagt cagccccata 6720cgatataagt tgtaattctc atgttagtca tgccccgcgc ccaccggaag gagctgactg 6780ggttgaaggc tctcaagggc atcggtcgag atcccggtgc ctaatgagtg agctaactta 6840cattaattgc gttgcgctca ctgcccgctt tccagtcggg aaacctgtcg tgccagctgc 6900attaatgaat cggccaacgc gcggggagag gcggtttgcg tattgggcgc cagggtggtt 6960tttcttttca ccagtgagac gggcaacagc tgattgccct 
tcaccgcctg gccctgagag 7020agttgcagca agcggtccac gctggtttgc cccagcaggc gaaaatcctg tttgatggtg 7080gttaacggcg ggatataaca tgagctgtct tcggtatcgt cgtatcccac taccgagatg 7140tccgcaccaa cgcgcagccc ggactcggta atggcgcgca ttgcgcccag cgccatctga 7200tcgttggcaa ccagcatcgc agtgggaacg atgccctcat tcagcatttg catggtttgt 7260tgaaaaccgg acatggcact ccagtcgcct tcccgttccg ctatcggctg aatttgattg 7320cgagtgagat atttatgcca gccagccaga cgcagacgcg ccgagacaga acttaatggg 7380cccgctaaca gcgcgatttg ctggtgaccc aatgcgacca gatgctccac gcccagtcgc 7440gtaccgtctt catgggagaa aataatactg ttgatgggtg tctggtcaga gacatcaaga 7500aataacgccg g 751117906DNACinnamomum camphora 17atgggtctgg aatggaaacc gaagccgaat ccgccacaac tgctggatga tcatttcggt 60ccgcacggcc tggtctttcg ccgtaccttc gcaatccgta gctatgaggt tggcccggac 120cgcagcacgt ctatcgtggc tgttatgaat cacctgcaag aggccgcttt gaaccatgcg 180aaaagcgtcg gcattctggg cgatggcttc ggtaccactt tggaaatgag caagcgcgat 240ctgatctggg tggttaaacg tacgcacgtt gccgtggaac gttacccggc gtggggtgat 300accgtagaag ttgagtgctg ggtcggcgca agcggtaata acggtcgccg tcacgacttt 360ctggtgcgtg actgtaaaac cggtgaaatc ttgacgcgct gtaccagcct gagcgttatg 420atgaacaccc gcacgcgtcg tctgtccaag attcctgaag aggtccgtgg tgagattggt 480ccggcgttca tcgacaacgt ggccgttaag gacgaagaga ttaagaagcc gcagaaattg 540aatgactcta cggcggatta cattcagggt ggtctgacgc cgcgttggaa tgacctggac 600attaaccagc acgtgaacaa tatcaaatat gtcgattgga ttctggaaac cgtgccggac 660agcatttttg agtcgcatca catcagcagc ttcaccattg agtaccgtcg cgagtgcacg 720atggatagcg ttctgcaaag cctgaccact gtgagcggcg gtagctctga ggcgggtctg 780gtgtgcgagc atctgctgca gctggagggt ggcagcgaag ttctgcgtgc aaaaaccgag 840tggcgtccga agctgaccga ctcctttcgt ggcatctccg tcatcccagc ggaaagcagc 900gtctaa 906186291DNAArtificial SequenceExpression vector 18ggggaattgt gagcggataa caattcccct ctagaaataa ttttgtttaa ctttaagaag 60gagatatacc atgggtctgg aatggaaacc gaagccgaat ccgccacaac tgctggatga 120tcatttcggt ccgcacggcc tggtctttcg ccgtaccttc gcaatccgta gctatgaggt 180tggcccggac cgcagcacgt ctatcgtggc tgttatgaat cacctgcaag aggccgcttt 
240gaaccatgcg aaaagcgtcg gcattctggg cgatggcttc ggtaccactt tggaaatgag 300caagcgcgat ctgatctggg tggttaaacg tacgcacgtt gccgtggaac gttacccggc 360gtggggtgat accgtagaag ttgagtgctg ggtcggcgca agcggtaata acggtcgccg 420tcacgacttt ctggtgcgtg actgtaaaac cggtgaaatc ttgacgcgct gtaccagcct 480gagcgttatg atgaacaccc gcacgcgtcg tctgtccaag attcctgaag aggtccgtgg 540tgagattggt ccggcgttca tcgacaacgt ggccgttaag gacgaagaga ttaagaagcc 600gcagaaattg aatgactcta cggcggatta cattcagggt ggtctgacgc cgcgttggaa 660tgacctggac attaaccagc acgtgaacaa tatcaaatat gtcgattgga ttctggaaac 720cgtgccggac agcatttttg agtcgcatca catcagcagc ttcaccattg agtaccgtcg 780cgagtgcacg atggatagcg ttctgcaaag cctgaccact gtgagcggcg gtagctctga 840ggcgggtctg gtgtgcgagc atctgctgca gctggagggt ggcagcgaag ttctgcgtgc 900aaaaaccgag tggcgtccga agctgaccga ctcctttcgt ggcatctccg tcatcccagc 960ggaaagcagc gtctaaggat ccgaattcga gctcggcgcg cctgcaggtc gacaagcttg 1020cggccgcata atgcttaagt cgaacagaaa gtaatcgtat tgtacacggc cgcataatcg 1080aaattaatac gactcactat aggggaattg tgagcggata acaattcccc atcttagtat 1140attagttaag tataagaagg agatatacat atggcagatc tcaattggat atcggccggc 1200cacgcgatcg ctgacgtcgg taccctcgag tctggtaaag aaaccgctgc tgcgaaattt 1260gaacgccagc acatggactc gtctactagc gcagcttaat taacctaggc tgctgccacc 1320gctgagcaat aactagcata accccttggg gcctctaaac gggtcttgag gggttttttg 1380ctgaaaggag gaactatatc cggattggcg aatgggacgc gccctgtagc ggcgcattaa 1440gcgcggcggg tgtggtggtt acgcgcagcg tgaccgctac acttgccagc gccctagcgc 1500ccgctccttt cgctttcttc ccttcctttc tcgccacgtt cgccggcttt ccccgtcaag 1560ctctaaatcg ggggctccct ttagggttcc gatttagtgc tttacggcac ctcgacccca 1620aaaaacttga ttagggtgat ggttcacgta gtgggccatc gccctgatag acggtttttc 1680gccctttgac gttggagtcc acgttcttta atagtggact cttgttccaa actggaacaa 1740cactcaaccc tatctcggtc tattcttttg atttataagg gattttgccg atttcggcct 1800attggttaaa aaatgagctg atttaacaaa aatttaacgc gaattttaac aaaatattaa 1860cgtttacaat ttctggcggc acgatggcat gagattatca aaaaggatct tcacctagat 1920ccttttaaat taaaaatgaa gttttaaatc aatctaaagt 
atatatgagt aaacttggtc 1980tgacagttac caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc 2040atccatagtt gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc 2100tggccccagt gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc 2160aataaaccag ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc 2220catccagtct attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt 2280gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc 2340ttcattcagc tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa 2400aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt 2460atcactcatg gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg 2520cttttctgtg actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc 2580gagttgctct tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa 2640agtgctcatc attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt 2700gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt 2760caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag 2820ggcgacacgg aaatgttgaa tactcatact cttccttttt caatcatgat tgaagcattt 2880atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa 2940taggtcatga ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc agaccccgta 3000gaaaagatca aaggatcttc ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa 3060acaaaaaaac caccgctacc agcggtggtt tgtttgccgg atcaagagct accaactctt 3120tttccgaagg taactggctt cagcagagcg cagataccaa atactgtcct tctagtgtag 3180ccgtagttag gccaccactt caagaactct gtagcaccgc ctacatacct cgctctgcta 3240atcctgttac cagtggctgc tgccagtggc gataagtcgt gtcttaccgg gttggactca 3300agacgatagt taccggataa ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag 3360cccagcttgg agcgaacgac ctacaccgaa ctgagatacc tacagcgtga gctatgagaa 3420agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga 3480acaggagagc gcacgaggga gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc 3540gggtttcgcc acctctgact tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc 3600ctatggaaaa acgccagcaa cgcggccttt ttacggttcc tggccttttg ctggcctttt 3660gctcacatgt 
tctttcctgc gttatcccct gattctgtgg ataaccgtat taccgccttt 3720gagtgagctg ataccgctcg ccgcagccga acgaccgagc gcagcgagtc agtgagcgag 3780gaagcggaag agcgcctgat gcggtatttt ctccttacgc atctgtgcgg tatttcacac 3840cgcatatatg gtgcactctc agtacaatct gctctgatgc cgcatagtta agccagtata 3900cactccgcta tcgctacgtg actgggtcat ggctgcgccc cgacacccgc caacacccgc 3960tgacgcgccc tgacgggctt gtctgctccc ggcatccgct tacagacaag ctgtgaccgt 4020ctccgggagc tgcatgtgtc agaggttttc accgtcatca ccgaaacgcg cgaggcagct 4080gcggtaaagc tcatcagcgt ggtcgtgaag cgattcacag atgtctgcct gttcatccgc 4140gtccagctcg ttgagtttct ccagaagcgt taatgtctgg cttctgataa agcgggccat 4200gttaagggcg gttttttcct gtttggtcac tgatgcctcc gtgtaagggg gatttctgtt 4260catgggggta atgataccga tgaaacgaga gaggatgctc acgatacggg ttactgatga 4320tgaacatgcc cggttactgg aacgttgtga gggtaaacaa ctggcggtat ggatgcggcg 4380ggaccagaga aaaatcactc agggtcaatg ccagcgcttc gttaatacag atgtaggtgt 4440tccacagggt agccagcagc atcctgcgat gcagatccgg aacataatgg tgcagggcgc 4500tgacttccgc gtttccagac tttacgaaac acggaaaccg aagaccattc atgttgttgc 4560tcaggtcgca gacgttttgc agcagcagtc gcttcacgtt cgctcgcgta tcggtgattc 4620attctgctaa ccagtaaggc aaccccgcca gcctagccgg gtcctcaacg acaggagcac 4680gatcatgcta gtcatgcccc gcgcccaccg gaaggagctg actgggttga aggctctcaa 4740gggcatcggt cgagatcccg gtgcctaatg agtgagctaa cttacattaa ttgcgttgcg 4800ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca 4860acgcgcgggg agaggcggtt
tgcgtattgg gcgccagggt ggtttttctt ttcaccagtg 4920agacgggcaa cagctgattg cccttcaccg cctggccctg agagagttgc agcaagcggt 4980ccacgctggt ttgccccagc aggcgaaaat cctgtttgat ggtggttaac ggcgggatat 5040aacatgagct gtcttcggta tcgtcgtatc ccactaccga gatgtccgca ccaacgcgca 5100gcccggactc ggtaatggcg cgcattgcgc ccagcgccat ctgatcgttg gcaaccagca 5160tcgcagtggg aacgatgccc tcattcagca tttgcatggt ttgttgaaaa ccggacatgg 5220cactccagtc gccttcccgt tccgctatcg gctgaatttg attgcgagtg agatatttat 5280gccagccagc cagacgcaga cgcgccgaga cagaacttaa tgggcccgct aacagcgcga 5340tttgctggtg acccaatgcg accagatgct ccacgcccag tcgcgtaccg tcttcatggg 5400agaaaataat actgttgatg ggtgtctggt cagagacatc aagaaataac gccggaacat 5460tagtgcaggc agcttccaca gcaatggcat cctggtcatc cagcggatag ttaatgatca 5520gcccactgac gcgttgcgcg agaagattgt gcaccgccgc tttacaggct tcgacgccgc 5580ttcgttctac catcgacacc accacgctgg cacccagttg atcggcgcga gatttaatcg 5640ccgcgacaat ttgcgacggc gcgtgcaggg ccagactgga ggtggcaacg ccaatcagca 5700acgactgttt gcccgccagt tgttgtgcca cgcggttggg aatgtaattc agctccgcca 5760tcgccgcttc cactttttcc cgcgttttcg cagaaacgtg gctggcctgg ttcaccacgc 5820gggaaacggt ctgataagag acaccggcat actctgcgac atcgtataac gttactggtt 5880tcacattcac caccctgaat tgactctctt ccgggcgcta tcatgccata ccgcgaaagg 5940ttttgcgcca ttcgatggtg tccgggatct cgacgctctc ccttatgcga ctcctgcatt 6000aggaagcagc ccagtagtag gttgaggccg ttgagcaccg ccgccgcaag gaatggtgca 6060tgcaaggaga tggcgcccaa cagtcccccg gccacggggc ctgccaccat acccacgccg 6120aaacaagcgc tcatgagccc gaagtggcga gcccgatctt ccccatcggt gatgtcggcg 6180atataggcgc cagcaaccgc acctgtggcg ccggtgatgc cggccacgat gcgtccggcg 6240tagaggatcg agatcgatct cgatcccgcg aaattaatac gactcactat a 629119978DNABacillus subtilis 19atgagcaagg cgaaaatcac ggcaatcggc acctacgcac caagccgtcg tctgaccaat 60gcggatctgg agaagattgt tgacacctct gatgaatgga tcgttcaacg tacgggtatg 120cgtgaacgtc gtattgccga cgaacatcag ttcacgtctg atctgtgcat cgaagccgtt 180aagaacctga aaagccgtta caaaggcacg ctggatgacg ttgacatgat cctggttgca 240accacgacct ctgactatgc ttttccgagc accgcttgtc 
gtgtgcagga gtatttcggc 300tgggaatcca ctggtgcgct ggatatcaat gccacctgtg cgggtctgac ctacggtctg 360cacctggcca atggcctgat taccagcggc ctgcatcaaa agattctggt tattgcgggc 420gaaacgctga gcaaagttac cgattacacc gatcgcacga cctgcgtttt gtttggcgac 480gcagcgggtg cactgctggt tgagcgcgat gaggaaacgc caggtttcct ggcgagcgtc 540cagggcacta gcggtaacgg tggtgacatc ctgtaccgtg caggtctgcg taacgagatt 600aacggtgtgc agctggtggg ctctggcaag atggtgcaaa atggccgtga ggtttacaag 660tgggctgcgc gcactgttcc gggcgagttc gagcgcctgc tgcacaaagc aggtctgagc 720agcgacgatc tggactggtt tgtgccgcac agcgccaacc tgcgtatgat cgagagcatc 780tgcgaaaaga cgccgttccc aatcgaaaag accttgacga gcgtggagca ttacggtaat 840accagctccg tgtctattgt cctggcgctg gacttggcag tgaaggcagg caaactgaaa 900aaggatcaga tcgttctgct gtttggcttc ggtggtggct tgacctacac gggcctgctg 960atcaaatggg gtatgtaa 97820993DNABacillus subtilis 20atgggcacga accgccacca agcactgggc ctgaccgacc aagaggcggt tgatatgtac 60cgcacgatgc tgctggcgcg caagattgat gagcgtatgt ggctgttgaa tcgttccggc 120aagattccat ttgtgatttc ttgccagggc caagaggcag cacaagttgg tgcagcgttc 180gcgctggatc gtgagatgga ttacgtgctg ccgtactacc gtgatatggg tgtggtgctg 240gcattcggta tgaccgcaaa agatctgatg atgtctggct ttgcaaaagc ggcggaccca 300aacagcggcg gtcgccagat gccaggtcac tttggtcaga agaagaatcg tattgtcacc 360ggtagcagcc cggttacgac gcaggttccg cacgcggttg gtattgcgct ggccggtcgt 420atggaaaaga aagatatcgc cgcgttcgtc acgtttggcg agggtagcag caatcagggt 480gactttcatg agggtgccaa cttcgctgcg gtccataaac tgccggtcat cttcatgtgc 540gaaaacaaca agtacgccat tagcgttccg tacgacaagc aggttgcttg cgagaacatc 600agcgaccgcg cgatcggcta tggtatgccg ggtgtgacgg tcaacggcaa cgatccgctg 660gaggtttatc aagcggttaa agaagcgcgc gagcgtgccc gtcgcggtga gggtccgacg 720ttgatcgaaa ccatttccta tcgtctgacg cctcacagca gcgatgatga tgacagcagc 780taccgtggtc gtgaagaggt cgaagaggcc aaaaagagcg acccgctgct gacctaccaa 840gcgtatctga aagaaacggg tctgctgagc gacgagattg agcaaaccat gctggacgag 900atcatggcaa tcgtgaatga ggcaaccgac gaggcggaga acgcgccgta tgcggcaccg 960gaaagcgcac tggattatgt ctacgcgaag taa 99321984DNABacillus 
subtilis 21atgagcgtaa tgagctacat cgatgcaatc aacctggcca tgaaagaaga aatggaacgc 60gacagccgcg tttttgtttt gggtgaggac gtcggtcgca aaggtggtgt gttcaaagcc 120accgcgggtt tgtacgagca atttggcgaa gagcgtgtca tggatacgcc gctggccgaa 180agcgctattg caggcgtcgg catcggtgcg gctatgtatg gtatgcgtcc gatcgctgaa 240atgcaatttg cagactttat catgccagcc gtcaaccaga tcatcagcga ggcagcgaaa 300atccgttatc gtagcaacaa cgattggagc tgtccgatcg ttgtccgtgc cccgtatggt 360ggtggtgttc acggcgcact gtatcatagc cagagcgttg aagcgatttt cgcaaaccaa 420cctggtctga aaatcgttat gccaagcacc ccgtacgatg cgaagggttt gctgaaagcg 480gcggtgcgcg atgaagatcc ggtgctgttc ttcgagcaca agcgtgcgta ccgtctgatt 540aaaggcgagg tcccggcaga cgactacgtc ttgccgatcg gtaaagcgga tgttaagcgt 600gaaggtgatg atatcaccgt gatcacgtac ggcctgtgcg tgcacttcgc cctgcaagcg 660gccgaacgcc tggagaagga cggcatcagc gcacacgttg tagacctgcg taccgtctac 720ccgttggata aagaagccat catcgaggcg gcgagcaaaa ccggcaaggt gctgctggtc 780acggaagata ccaaagaagg tagcatcatg agcgaggttg cagccatcat tagcgagcac 840tgtttgttcg acttggatgc gccgattaag cgtctggcgg gtccagatat cccggccatg 900ccgtacgcac cgacgatgga gaaatacttt atggtcaacc cggataaggt ggaagcggcc 960atgcgtgagc tggcggagtt ctaa 984221275DNABacillus subtilis 22atggccatcg agcaaatgac catgccgcaa ctgggcgaga gcgtaacgga aggcaccatt 60tccaaatggc tggttgctcc aggtgataaa gtcaacaagt atgacccgat cgctgaggtt 120atgaccgata aggtgaacgc ggaggttccg tcctctttca ctggcaccat taccgaactg 180gtcggcgaag agggtcaaac gctgcaagtc ggcgagatga tctgtaagat tgaaacggag 240ggtgctaatc cggctgaaca aaagcaggag caaccggcag cgtctgaagc ggcagaaaat 300ccagtcgcga agagcgcggg tgccgcagat caaccgaaca aaaagcgtta cagcccggca 360gttttgcgcc tggctggtga gcacggcatc gacctggatc aagtgactgg tacgggcgca 420ggtggccgca ttacccgtaa ggacatccaa cgcttgattg aaacgggtgg tgtccaggaa 480cagaacccgg aggagctgaa aaccgccgca ccggcaccga aaagcgcgag caaaccggag 540ccgaaggaag aaacctctta cccggcgtcc gctgcgggcg ataaggagat tccggttact 600ggcgttcgca aggccatcgc tagcaatatg aagcgcagca agactgagat cccgcacgca 660tggacgatga tggaggtgga tgtgaccaac atggtagcat accgtaatag catcaaggat 
720agcttcaaaa agaccgaagg tttcaacctg acgttctttg ccttctttgt gaaggccgtt 780gcacaggcac tgaaagagtt tccgcaaatg aacagcatgt gggctggcga caagattatt 840caaaagaagg atatcaacat tagcattgca gtcgccaccg aggacagcct gttcgtgccg 900gtaatcaaaa atgctgatga aaagactatc aaaggtattg caaaggacat caccggcctg 960gcgaagaaag ttcgcgacgg taagctgacc gcagatgaca tgcagggtgg cacctttacg 1020gtcaacaaca cgggcagctt tggcagcgtc cagagcatgg gtattatcaa ctatccgcag 1080gcggcaattc tgcaagttga atccatcgtg aaacgcccgg ttgttatgga caacggcatg 1140attgcagttc gtgacatggt aaacttgtgt ctgagcttgg accaccgcgt tctggacggc 1200ctggtctgcg gtcgtttctt gggccgtgtg aaacagatcc tggagagcat tgatgagaaa 1260acgagcgtgt attaa 1275231425DNABacillus subtilis 23atggcaacgg agtacgacgt agtgattttg ggcggtggca cgggcggtta cgtggcggcc 60attcgtgcgg cgcaattggg cctgaaaacg gccgtggtcg aaaaagaaaa actgggcggc 120acctgcctgc acaagggttg tattccgagc aaagccctgt tgcgttccgc ggaggtgtac 180cgtaccgctc gtgaagcgga ccaattcggc gtggaaaccg cgggtgtgtc cctgaacttt 240gagaaagtcc agcagcgtaa acaggcggtg gtggacaaac tggctgcggg tgtcaatcac 300ctgatgaaga agggtaaaat cgatgtgtat accggttatg gccgcatcct gggtccgagc 360attttcagcc cgctgccggg tactatttcc gtggaacgtg gcaacggtga agaaaacgac 420atgttgatcc ctaaacaggt gatcatcgcg accggtagcc gtccgcgcat gctgccaggt 480ctggaagttg acggtaaaag cgtgctgacc agcgatgagg cgctgcaaat ggaggagttg 540ccgcagagca tcatcattgt aggtggcggc gtcattggca ttgagtgggc gagcatgctg 600catgattttg gcgtcaaagt cactgtgatc gagtacgccg accgtattct gccgacggag 660gatttggaga tttccaaaga aatggaaagc ctgctgaaaa agaaaggtat ccaattcatt 720accggtgcta aggttctgcc ggacacgatg accaaaacta gcgacgatat cagcattcaa 780gcagaaaaag atggcgaaac ggtcacctac agcgcggaga aaatgttggt gagcatcggt 840cgtcaggcga atatcgaggg tattggtctg gaaaacaccg acattgttac cgagaatggt 900atgatctccg tcaacgagag ctgccaaacg aaagagtcgc acatctatgc catcggtgac 960gtcatcggtg gcctgcaatt ggcccacgtc gcaagccatg agggtatcat cgcagtagaa 1020catttcgccg gtctgaatcc gcacccgctg gacccgactc tggtccctaa gtgtatctac 1080tccagcccgg aagccgctag cgtaggtctg accgaagatg aggctaaggc gaatggccac 
1140aacgtcaaga ttggcaagtt cccgtttatg gctattggta aggcgctggt gtatggcgag 1200agcgacggtt ttgtcaagat tgtagctgat cgtgataccg acgatattct gggtgtgcac 1260atgatcggtc cgcacgtgac cgacatgatt agcgaagcag gtctggccaa agtactggac 1320gcgaccccgt gggaagtagg ccagaccatt cacccgcatc ctacgctgag cgaagcgatt 1380ggtgaggcgg cattggccgc agacggtaaa gctatccact tctaa 1425245862DNAArtificial SequenceCodon-optimised operon 24ccatgggaag gagatatacc atgggcacga accgccacca agcactgggc ctgaccgacc 60aagaggcggt tgatatgtac cgcacgatgc tgctggcgcg caagattgat gagcgtatgt 120ggctgttgaa tcgttccggc aagattccat ttgtgatttc ttgccagggc caagaggcag 180cacaagttgg tgcagcgttc gcgctggatc gtgagatgga ttacgtgctg ccgtactacc 240gtgatatggg tgtggtgctg gcattcggta tgaccgcaaa agatctgatg atgtctggct 300ttgcaaaagc ggcggaccca aacagcggcg gtcgccagat gccaggtcac tttggtcaga 360agaagaatcg tattgtcacc ggtagcagcc cggttacgac gcaggttccg cacgcggttg 420gtattgcgct ggccggtcgt atggaaaaga aagatatcgc cgcgttcgtc acgtttggcg 480agggtagcag caatcagggt gactttcatg agggtgccaa cttcgctgcg gtccataaac 540tgccggtcat cttcatgtgc gaaaacaaca agtacgccat tagcgttccg tacgacaagc 600aggttgcttg cgagaacatc agcgaccgcg cgatcggcta tggtatgccg ggtgtgacgg 660tcaacggcaa cgatccgctg gaggtttatc aagcggttaa agaagcgcgc gagcgtgccc 720gtcgcggtga gggtccgacg ttgatcgaaa ccatttccta tcgtctgacg cctcacagca 780gcgatgatga tgacagcagc taccgtggtc gtgaagaggt cgaagaggcc aaaaagagcg 840acccgctgct gacctaccaa gcgtatctga aagaaacggg tctgctgagc gacgagattg 900agcaaaccat gctggacgag atcatggcaa tcgtgaatga ggcaaccgac gaggcggaga 960acgcgccgta tgcggcaccg gaaagcgcac tggattatgt ctacgcgaag taaggatccc 1020actgtataac attaagaagg aggtaaaaaa aatgagcgta atgagctaca tcgatgcaat 1080caacctggcc atgaaagaag aaatggaacg cgacagccgc gtttttgttt tgggtgagga 1140cgtcggtcgc aaaggtggtg tgttcaaagc caccgcgggt ttgtacgagc aatttggcga 1200agagcgtgtc atggatacgc cgctggccga aagcgctatt gcaggcgtcg gcatcggtgc 1260ggctatgtat ggtatgcgtc cgatcgctga aatgcaattt gcagacttta tcatgccagc 1320cgtcaaccag atcatcagcg aggcagcgaa aatccgttat cgtagcaaca acgattggag 1380ctgtccgatc 
gttgtccgtg ccccgtatgg tggtggtgtt cacggcgcac tgtatcatag 1440ccagagcgtt gaagcgattt tcgcaaacca acctggtctg aaaatcgtta tgccaagcac 1500cccgtacgat gcgaagggtt tgctgaaagc ggcggtgcgc gatgaagatc cggtgctgtt 1560cttcgagcac aagcgtgcgt accgtctgat taaaggcgag gtcccggcag acgactacgt 1620cttgccgatc ggtaaagcgg atgttaagcg tgaaggtgat gatatcaccg tgatcacgta 1680cggcctgtgc gtgcacttcg ccctgcaagc ggccgaacgc ctggagaagg acggcatcag 1740cgcacacgtt gtagacctgc gtaccgtcta cccgttggat aaagaagcca tcatcgaggc 1800ggcgagcaaa accggcaagg tgctgctggt cacggaagat accaaagaag gtagcatcat 1860gagcgaggtt gcagccatca ttagcgagca ctgtttgttc gacttggatg cgccgattaa 1920gcgtctggcg ggtccagata tcccggccat gccgtacgca ccgacgatgg agaaatactt 1980tatggtcaac ccggataagg tggaagcggc catgcgtgag ctggcggagt tctaaggatc 2040cgaattcact gtataacatt aagaaggagg taaaaaaaat ggccatcgag caaatgacca 2100tgccgcaact gggcgagagc gtaacggaag gcaccatttc caaatggctg gttgctccag 2160gtgataaagt caacaagtat gacccgatcg ctgaggttat gaccgataag gtgaacgcgg 2220aggttccgtc ctctttcact ggcaccatta ccgaactggt cggcgaagag ggtcaaacgc 2280tgcaagtcgg cgagatgatc tgtaagattg aaacggaggg tgctaatccg gctgaacaaa 2340agcaggagca accggcagcg tctgaagcgg cagaaaatcc agtcgcgaag agcgcgggtg 2400ccgcagatca accgaacaaa aagcgttaca gcccggcagt tttgcgcctg gctggtgagc 2460acggcatcga cctggatcaa gtgactggta cgggcgcagg tggccgcatt acccgtaagg 2520acatccaacg cttgattgaa acgggtggtg tccaggaaca gaacccggag gagctgaaaa 2580ccgccgcacc ggcaccgaaa agcgcgagca aaccggagcc gaaggaagaa acctcttacc 2640cggcgtccgc tgcgggcgat aaggagattc cggttactgg cgttcgcaag gccatcgcta 2700gcaatatgaa gcgcagcaag actgagatcc cgcacgcatg gacgatgatg gaggtggatg 2760tgaccaacat ggtagcatac cgtaatagca tcaaggatag cttcaaaaag accgaaggtt 2820tcaacctgac gttctttgcc ttctttgtga aggccgttgc acaggcactg aaagagtttc 2880cgcaaatgaa cagcatgtgg gctggcgaca agattattca aaagaaggat atcaacatta 2940gcattgcagt cgccaccgag gacagcctgt tcgtgccggt aatcaaaaat gctgatgaaa 3000agactatcaa aggtattgca aaggacatca ccggcctggc gaagaaagtt cgcgacggta 3060agctgaccgc agatgacatg cagggtggca cctttacggt 
caacaacacg ggcagctttg 3120gcagcgtcca gagcatgggt attatcaact atccgcaggc ggcaattctg caagttgaat 3180ccatcgtgaa acgcccggtt gttatggaca acggcatgat tgcagttcgt gacatggtaa 3240acttgtgtct gagcttggac caccgcgttc tggacggcct ggtctgcggt cgtttcttgg 3300gccgtgtgaa acagatcctg gagagcattg atgagaaaac gagcgtgtat taagaattcg 3360agctcactgt ataacattaa gaaggaggta aaaaaaatgg caacggagta cgacgtagtg 3420attttgggcg gtggcacggg cggttacgtg gcggccattc gtgcggcgca attgggcctg 3480aaaacggccg tggtcgaaaa agaaaaactg ggcggcacct gcctgcacaa gggttgtatt 3540ccgagcaaag ccctgttgcg ttccgcggag gtgtaccgta ccgctcgtga agcggaccaa 3600ttcggcgtgg aaaccgcggg tgtgtccctg aactttgaga aagtccagca gcgtaaacag 3660gcggtggtgg acaaactggc tgcgggtgtc aatcacctga tgaagaaggg taaaatcgat 3720gtgtataccg gttatggccg catcctgggt ccgagcattt tcagcccgct gccgggtact 3780atttccgtgg aacgtggcaa cggtgaagaa aacgacatgt tgatccctaa acaggtgatc 3840atcgcgaccg gtagccgtcc gcgcatgctg ccaggtctgg aagttgacgg taaaagcgtg 3900ctgaccagcg atgaggcgct gcaaatggag gagttgccgc agagcatcat cattgtaggt 3960ggcggcgtca ttggcattga gtgggcgagc atgctgcatg attttggcgt caaagtcact 4020gtgatcgagt acgccgaccg tattctgccg acggaggatt tggagatttc caaagaaatg 4080gaaagcctgc tgaaaaagaa aggtatccaa ttcattaccg gtgctaaggt tctgccggac 4140acgatgacca aaactagcga cgatatcagc attcaagcag aaaaagatgg cgaaacggtc 4200acctacagcg cggagaaaat gttggtgagc atcggtcgtc aggcgaatat cgagggtatt 4260ggtctggaaa acaccgacat tgttaccgag aatggtatga tctccgtcaa cgagagctgc 4320caaacgaaag agtcgcacat ctatgccatc ggtgacgtca tcggtggcct gcaattggcc 4380cacgtcgcaa gccatgaggg tatcatcgca gtagaacatt tcgccggtct gaatccgcac 4440ccgctggacc cgactctggt ccctaagtgt atctactcca gcccggaagc cgctagcgta 4500ggtctgaccg aagatgaggc taaggcgaat ggccacaacg tcaagattgg caagttcccg 4560tttatggcta ttggtaaggc gctggtgtat ggcgagagcg acggttttgt caagattgta 4620gctgatcgtg ataccgacga tattctgggt gtgcacatga tcggtccgca cgtgaccgac 4680atgattagcg aagcaggtct ggccaaagta ctggacgcga ccccgtggga agtaggccag 4740accattcacc cgcatcctac gctgagcgaa gcgattggtg aggcggcatt ggccgcagac 4800ggtaaagcta 
tccacttcta agagctcgtc gaccactgta taacattaag aaggaggtaa 4860aaaaaatgag caaggcgaaa atcacggcaa tcggcaccta cgcaccaagc cgtcgtctga 4920ccaatgcgga tctggagaag attgttgaca cctctgatga atggatcgtt caacgtacgg 4980gtatgcgtga acgtcgtatt gccgacgaac atcagttcac gtctgatctg tgcatcgaag 5040ccgttaagaa cctgaaaagc cgttacaaag gcacgctgga tgacgttgac atgatcctgg 5100ttgcaaccac gacctctgac tatgcttttc cgagcaccgc ttgtcgtgtg caggagtatt 5160tcggctggga atccactggt gcgctggata tcaatgccac ctgtgcgggt ctgacctacg 5220gtctgcacct ggccaatggc ctgattacca gcggcctgca tcaaaagatt ctggttattg 5280cgggcgaaac gctgagcaaa gttaccgatt acaccgatcg cacgacctgc gttttgtttg 5340gcgacgcagc gggtgcactg ctggttgagc gcgatgagga aacgccaggt ttcctggcga 5400gcgtccaggg cactagcggt aacggtggtg acatcctgta ccgtgcaggt ctgcgtaacg 5460agattaacgg tgtgcagctg gtgggctctg gcaagatggt gcaaaatggc cgtgaggttt 5520acaagtgggc tgcgcgcact gttccgggcg agttcgagcg cctgctgcac aaagcaggtc 5580tgagcagcga cgatctggac tggtttgtgc cgcacagcgc caacctgcgt atgatcgaga 5640gcatctgcga aaagacgccg ttcccaatcg aaaagacctt gacgagcgtg gagcattacg 5700gtaataccag ctccgtgtct attgtcctgg cgctggactt ggcagtgaag gcaggcaaac 5760tgaaaaagga tcagatcgtt ctgctgtttg gcttcggtgg tggcttgacc tacacgggcc 5820tgctgatcaa atggggtatg taatgagtcg acgcggccgc gc 58622511200DNAArtificial SequenceExpression vector 25ggggaattgt gagcggataa caattcccct ctagaaataa ttttgtttaa ctttaagaag 60gagatatacc atgggaagga gatataccat gggcacgaac cgccaccaag cactgggcct 120gaccgaccaa gaggcggttg atatgtaccg cacgatgctg ctggcgcgca agattgatga 180gcgtatgtgg ctgttgaatc gttccggcaa gattccattt gtgatttctt gccagggcca 240agaggcagca caagttggtg cagcgttcgc gctggatcgt gagatggatt acgtgctgcc 300gtactaccgt gatatgggtg tggtgctggc attcggtatg accgcaaaag atctgatgat 360gtctggcttt gcaaaagcgg cggacccaaa cagcggcggt cgccagatgc caggtcactt 420tggtcagaag aagaatcgta ttgtcaccgg tagcagcccg gttacgacgc aggttccgca 480cgcggttggt attgcgctgg ccggtcgtat ggaaaagaaa gatatcgccg cgttcgtcac 540gtttggcgag ggtagcagca atcagggtga ctttcatgag ggtgccaact tcgctgcggt 600ccataaactg ccggtcatct 
tcatgtgcga aaacaacaag tacgccatta gcgttccgta 660cgacaagcag gttgcttgcg agaacatcag cgaccgcgcg atcggctatg gtatgccggg 720tgtgacggtc aacggcaacg atccgctgga ggtttatcaa gcggttaaag aagcgcgcga 780gcgtgcccgt cgcggtgagg gtccgacgtt gatcgaaacc atttcctatc gtctgacgcc 840tcacagcagc gatgatgatg acagcagcta ccgtggtcgt gaagaggtcg aagaggccaa 900aaagagcgac ccgctgctga cctaccaagc gtatctgaaa gaaacgggtc tgctgagcga 960cgagattgag caaaccatgc tggacgagat catggcaatc gtgaatgagg caaccgacga 1020ggcggagaac gcgccgtatg cggcaccgga aagcgcactg gattatgtct acgcgaagta 1080aggatcccac tgtataacat taagaaggag gtaaaaaaaa tgagcgtaat gagctacatc 1140gatgcaatca acctggccat gaaagaagaa atggaacgcg acagccgcgt ttttgttttg 1200ggtgaggacg tcggtcgcaa aggtggtgtg ttcaaagcca ccgcgggttt gtacgagcaa 1260tttggcgaag agcgtgtcat ggatacgccg ctggccgaaa gcgctattgc aggcgtcggc 1320atcggtgcgg ctatgtatgg tatgcgtccg atcgctgaaa tgcaatttgc agactttatc 1380atgccagccg tcaaccagat catcagcgag gcagcgaaaa tccgttatcg tagcaacaac 1440gattggagct gtccgatcgt tgtccgtgcc ccgtatggtg gtggtgttca cggcgcactg 1500tatcatagcc agagcgttga agcgattttc gcaaaccaac ctggtctgaa aatcgttatg 1560ccaagcaccc cgtacgatgc gaagggtttg ctgaaagcgg cggtgcgcga tgaagatccg 1620gtgctgttct tcgagcacaa gcgtgcgtac cgtctgatta aaggcgaggt cccggcagac 1680gactacgtct
tgccgatcgg taaagcggat gttaagcgtg aaggtgatga tatcaccgtg 1740atcacgtacg gcctgtgcgt gcacttcgcc ctgcaagcgg ccgaacgcct ggagaaggac 1800ggcatcagcg cacacgttgt agacctgcgt accgtctacc cgttggataa agaagccatc 1860atcgaggcgg cgagcaaaac cggcaaggtg ctgctggtca cggaagatac caaagaaggt 1920agcatcatga gcgaggttgc agccatcatt agcgagcact gtttgttcga cttggatgcg 1980ccgattaagc gtctggcggg tccagatatc ccggccatgc cgtacgcacc gacgatggag 2040aaatacttta tggtcaaccc ggataaggtg gaagcggcca tgcgtgagct ggcggagttc 2100taaggatccg aattcactgt ataacattaa gaaggaggta aaaaaaatgg ccatcgagca 2160aatgaccatg ccgcaactgg gcgagagcgt aacggaaggc accatttcca aatggctggt 2220tgctccaggt gataaagtca acaagtatga cccgatcgct gaggttatga ccgataaggt 2280gaacgcggag gttccgtcct ctttcactgg caccattacc gaactggtcg gcgaagaggg 2340tcaaacgctg caagtcggcg agatgatctg taagattgaa acggagggtg ctaatccggc 2400tgaacaaaag caggagcaac cggcagcgtc tgaagcggca gaaaatccag tcgcgaagag 2460cgcgggtgcc gcagatcaac cgaacaaaaa gcgttacagc ccggcagttt tgcgcctggc 2520tggtgagcac ggcatcgacc tggatcaagt gactggtacg ggcgcaggtg gccgcattac 2580ccgtaaggac atccaacgct tgattgaaac gggtggtgtc caggaacaga acccggagga 2640gctgaaaacc gccgcaccgg caccgaaaag cgcgagcaaa ccggagccga aggaagaaac 2700ctcttacccg gcgtccgctg cgggcgataa ggagattccg gttactggcg ttcgcaaggc 2760catcgctagc aatatgaagc gcagcaagac tgagatcccg cacgcatgga cgatgatgga 2820ggtggatgtg accaacatgg tagcataccg taatagcatc aaggatagct tcaaaaagac 2880cgaaggtttc aacctgacgt tctttgcctt ctttgtgaag gccgttgcac aggcactgaa 2940agagtttccg caaatgaaca gcatgtgggc tggcgacaag attattcaaa agaaggatat 3000caacattagc attgcagtcg ccaccgagga cagcctgttc gtgccggtaa tcaaaaatgc 3060tgatgaaaag actatcaaag gtattgcaaa ggacatcacc ggcctggcga agaaagttcg 3120cgacggtaag ctgaccgcag atgacatgca gggtggcacc tttacggtca acaacacggg 3180cagctttggc agcgtccaga gcatgggtat tatcaactat ccgcaggcgg caattctgca 3240agttgaatcc atcgtgaaac gcccggttgt tatggacaac ggcatgattg cagttcgtga 3300catggtaaac ttgtgtctga gcttggacca ccgcgttctg gacggcctgg tctgcggtcg 3360tttcttgggc cgtgtgaaac agatcctgga gagcattgat 
gagaaaacga gcgtgtatta 3420agaattcgag ctcactgtat aacattaaga aggaggtaaa aaaaatggca acggagtacg 3480acgtagtgat tttgggcggt ggcacgggcg gttacgtggc ggccattcgt gcggcgcaat 3540tgggcctgaa aacggccgtg gtcgaaaaag aaaaactggg cggcacctgc ctgcacaagg 3600gttgtattcc gagcaaagcc ctgttgcgtt ccgcggaggt gtaccgtacc gctcgtgaag 3660cggaccaatt cggcgtggaa accgcgggtg tgtccctgaa ctttgagaaa gtccagcagc 3720gtaaacaggc ggtggtggac aaactggctg cgggtgtcaa tcacctgatg aagaagggta 3780aaatcgatgt gtataccggt tatggccgca tcctgggtcc gagcattttc agcccgctgc 3840cgggtactat ttccgtggaa cgtggcaacg gtgaagaaaa cgacatgttg atccctaaac 3900aggtgatcat cgcgaccggt agccgtccgc gcatgctgcc aggtctggaa gttgacggta 3960aaagcgtgct gaccagcgat gaggcgctgc aaatggagga gttgccgcag agcatcatca 4020ttgtaggtgg cggcgtcatt ggcattgagt gggcgagcat gctgcatgat tttggcgtca 4080aagtcactgt gatcgagtac gccgaccgta ttctgccgac ggaggatttg gagatttcca 4140aagaaatgga aagcctgctg aaaaagaaag gtatccaatt cattaccggt gctaaggttc 4200tgccggacac gatgaccaaa actagcgacg atatcagcat tcaagcagaa aaagatggcg 4260aaacggtcac ctacagcgcg gagaaaatgt tggtgagcat cggtcgtcag gcgaatatcg 4320agggtattgg tctggaaaac accgacattg ttaccgagaa tggtatgatc tccgtcaacg 4380agagctgcca aacgaaagag tcgcacatct atgccatcgg tgacgtcatc ggtggcctgc 4440aattggccca cgtcgcaagc catgagggta tcatcgcagt agaacatttc gccggtctga 4500atccgcaccc gctggacccg actctggtcc ctaagtgtat ctactccagc ccggaagccg 4560ctagcgtagg tctgaccgaa gatgaggcta aggcgaatgg ccacaacgtc aagattggca 4620agttcccgtt tatggctatt ggtaaggcgc tggtgtatgg cgagagcgac ggttttgtca 4680agattgtagc tgatcgtgat accgacgata ttctgggtgt gcacatgatc ggtccgcacg 4740tgaccgacat gattagcgaa gcaggtctgg ccaaagtact ggacgcgacc ccgtgggaag 4800taggccagac cattcacccg catcctacgc tgagcgaagc gattggtgag gcggcattgg 4860ccgcagacgg taaagctatc cacttctaag agctcgtcga ccactgtata acattaagaa 4920ggaggtaaaa aaaatgagca aggcgaaaat cacggcaatc ggcacctacg caccaagccg 4980tcgtctgacc aatgcggatc tggagaagat tgttgacacc tctgatgaat ggatcgttca 5040acgtacgggt atgcgtgaac gtcgtattgc cgacgaacat cagttcacgt ctgatctgtg 5100catcgaagcc 
gttaagaacc tgaaaagccg ttacaaaggc acgctggatg acgttgacat 5160gatcctggtt gcaaccacga cctctgacta tgcttttccg agcaccgctt gtcgtgtgca 5220ggagtatttc ggctgggaat ccactggtgc gctggatatc aatgccacct gtgcgggtct 5280gacctacggt ctgcacctgg ccaatggcct gattaccagc ggcctgcatc aaaagattct 5340ggttattgcg ggcgaaacgc tgagcaaagt taccgattac accgatcgca cgacctgcgt 5400tttgtttggc gacgcagcgg gtgcactgct ggttgagcgc gatgaggaaa cgccaggttt 5460cctggcgagc gtccagggca ctagcggtaa cggtggtgac atcctgtacc gtgcaggtct 5520gcgtaacgag attaacggtg tgcagctggt gggctctggc aagatggtgc aaaatggccg 5580tgaggtttac aagtgggctg cgcgcactgt tccgggcgag ttcgagcgcc tgctgcacaa 5640agcaggtctg agcagcgacg atctggactg gtttgtgccg cacagcgcca acctgcgtat 5700gatcgagagc atctgcgaaa agacgccgtt cccaatcgaa aagaccttga cgagcgtgga 5760gcattacggt aataccagct ccgtgtctat tgtcctggcg ctggacttgg cagtgaaggc 5820aggcaaactg aaaaaggatc agatcgttct gctgtttggc ttcggtggtg gcttgaccta 5880cacgggcctg ctgatcaaat ggggtatgta atgagtcgac gcggccgcgc ggccgcataa 5940tgcttaagtc gaacagaaag taatcgtatt gtacacggcc gcataatcga aattaatacg 6000actcactata ggggaattgt gagcggataa caattcccca tcttagtata ttagttaagt 6060ataagaagga gatatacata tggcagatct caattggata tcggccggcc acgcgatcgc 6120tgacgtcggt accctcgagt ctggtaaaga aaccgctgct gcgaaatttg aacgccagca 6180catggactcg tctactagcg cagcttaatt aacctaggct gctgccaccg ctgagcaata 6240actagcataa ccccttgggg cctctaaacg ggtcttgagg ggttttttgc tgaaaggagg 6300aactatatcc ggattggcga atgggacgcg ccctgtagcg gcgcattaag cgcggcgggt 6360gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg ccctagcgcc cgctcctttc 6420gctttcttcc cttcctttct cgccacgttc gccggctttc cccgtcaagc tctaaatcgg 6480gggctccctt tagggttccg atttagtgct ttacggcacc tcgaccccaa aaaacttgat 6540tagggtgatg gttcacgtag tgggccatcg ccctgataga cggtttttcg ccctttgacg 6600ttggagtcca cgttctttaa tagtggactc ttgttccaaa ctggaacaac actcaaccct 6660atctcggtct attcttttga tttataaggg attttgccga tttcggccta ttggttaaaa 6720aatgagctga tttaacaaaa atttaacgcg aattttaaca aaatattaac gtttacaatt 6780tctggcggca cgatggcatg agattatcaa aaaggatctt 
cacctagatc cttttaaatt 6840aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc 6900aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg 6960cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct ggccccagtg 7020ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca ataaaccagc 7080cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc atccagtcta 7140ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg cgcaacgttg 7200ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct tcattcagct 7260ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa aaagcggtta 7320gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg 7380ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc ttttctgtga 7440ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg agttgctctt 7500gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa gtgctcatca 7560ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg agatccagtt 7620cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc accagcgttt 7680ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga 7740aatgttgaat actcatactc ttcctttttc aatcatgatt gaagcattta tcagggttat 7800tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat aggtcatgac 7860caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 7920aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 7980accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 8040aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg 8100ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 8160agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 8220accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 8280gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 8340tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 8400cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 8460cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 8520cgccagcaac 
gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 8580ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga 8640taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 8700gcgcctgatg cggtattttc tccttacgca tctgtgcggt atttcacacc gcatatatgg 8760tgcactctca gtacaatctg ctctgatgcc gcatagttaa gccagtatac actccgctat 8820cgctacgtga ctgggtcatg gctgcgcccc gacacccgcc aacacccgct gacgcgccct 8880gacgggcttg tctgctcccg gcatccgctt acagacaagc tgtgaccgtc tccgggagct 8940gcatgtgtca gaggttttca ccgtcatcac cgaaacgcgc gaggcagctg cggtaaagct 9000catcagcgtg gtcgtgaagc gattcacaga tgtctgcctg ttcatccgcg tccagctcgt 9060tgagtttctc cagaagcgtt aatgtctggc ttctgataaa gcgggccatg ttaagggcgg 9120ttttttcctg tttggtcact gatgcctccg tgtaaggggg atttctgttc atgggggtaa 9180tgataccgat gaaacgagag aggatgctca cgatacgggt tactgatgat gaacatgccc 9240ggttactgga acgttgtgag ggtaaacaac tggcggtatg gatgcggcgg gaccagagaa 9300aaatcactca gggtcaatgc cagcgcttcg ttaatacaga tgtaggtgtt ccacagggta 9360gccagcagca tcctgcgatg cagatccgga acataatggt gcagggcgct gacttccgcg 9420tttccagact ttacgaaaca cggaaaccga agaccattca tgttgttgct caggtcgcag 9480acgttttgca gcagcagtcg cttcacgttc gctcgcgtat cggtgattca ttctgctaac 9540cagtaaggca accccgccag cctagccggg tcctcaacga caggagcacg atcatgctag 9600tcatgccccg cgcccaccgg aaggagctga ctgggttgaa ggctctcaag ggcatcggtc 9660gagatcccgg tgcctaatga gtgagctaac ttacattaat tgcgttgcgc tcactgcccg 9720ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga 9780gaggcggttt gcgtattggg cgccagggtg gtttttcttt tcaccagtga gacgggcaac 9840agctgattgc ccttcaccgc ctggccctga gagagttgca gcaagcggtc cacgctggtt 9900tgccccagca ggcgaaaatc ctgtttgatg gtggttaacg gcgggatata acatgagctg 9960tcttcggtat cgtcgtatcc cactaccgag atgtccgcac caacgcgcag cccggactcg 10020gtaatggcgc gcattgcgcc cagcgccatc tgatcgttgg caaccagcat cgcagtggga 10080acgatgccct cattcagcat ttgcatggtt tgttgaaaac cggacatggc actccagtcg 10140ccttcccgtt ccgctatcgg ctgaatttga ttgcgagtga gatatttatg ccagccagcc 10200agacgcagac gcgccgagac agaacttaat gggcccgcta 
acagcgcgat ttgctggtga 10260cccaatgcga ccagatgctc cacgcccagt cgcgtaccgt cttcatggga gaaaataata 10320ctgttgatgg gtgtctggtc agagacatca agaaataacg ccggaacatt agtgcaggca 10380gcttccacag caatggcatc ctggtcatcc agcggatagt taatgatcag cccactgacg 10440cgttgcgcga gaagattgtg caccgccgct ttacaggctt cgacgccgct tcgttctacc 10500atcgacacca ccacgctggc acccagttga tcggcgcgag atttaatcgc cgcgacaatt 10560tgcgacggcg cgtgcagggc cagactggag gtggcaacgc caatcagcaa cgactgtttg 10620cccgccagtt gttgtgccac gcggttggga atgtaattca gctccgccat cgccgcttcc 10680actttttccc gcgttttcgc agaaacgtgg ctggcctggt tcaccacgcg ggaaacggtc 10740tgataagaga caccggcata ctctgcgaca tcgtataacg ttactggttt cacattcacc 10800accctgaatt gactctcttc cgggcgctat catgccatac cgcgaaaggt tttgcgccat 10860tcgatggtgt ccgggatctc gacgctctcc cttatgcgac tcctgcatta ggaagcagcc 10920cagtagtagg ttgaggccgt tgagcaccgc cgccgcaagg aatggtgcat gcaaggagat 10980ggcgcccaac agtcccccgg ccacggggcc tgccaccata cccacgccga aacaagcgct 11040catgagcccg aagtggcgag cccgatcttc cccatcggtg atgtcggcga tataggcgcc 11100agcaaccgca cctgtggcgc cggtgatgcc ggccacgatg cgtccggcgt agaggatcga 11160gatcgatctc gatcccgcga aattaatacg actcactata 112002625DNAArtificial SequencePrimer sequence 26catatgcagc agcttacaga ccaat 252729DNAArtificial SequencePrimer sequence 27ctcgagttaa gcacctatga gtccgtagg 2928477PRTVibrio harveyi 28Met Glu Lys His Leu Pro Leu Ile Val Asn Gly Gln Ile Ile Ser Thr 1 5 10 15 Glu Glu Asn Arg Phe Glu Ile Ser Phe Glu Glu Lys Lys Val Lys Ile 20 25 30 Asp Ser Phe Asn Asn Leu His Leu Thr Gln Met Val Asn His Asp Tyr 35 40 45 Leu Asn Asp Leu Asn Ile Asn Asn Ile Ile Asn Phe Leu Tyr Thr Thr 50 55 60 Gly Gln Arg Trp Lys Ser Glu Glu Tyr Ser Arg Arg Arg Ala Tyr Ile 65 70 75 80 Arg Ser Leu Ile Thr Tyr Leu Gly Tyr Ser Pro Gln Met Ala Lys Leu 85 90 95 Glu Ala Asn Trp Ile Ala Met Ile Leu Cys Ser Lys Ser Ala Leu Tyr 100 105 110 Asp Ile Ile Asp Thr Glu Leu Gly Ser Thr His Ile Gln Asp Glu Trp 115 120 125 Leu Pro Gln Gly Glu Cys Tyr Val Arg Ala Phe Pro Lys Gly Arg Thr 130 135 140 Met His Leu 
Leu Ala Gly Asn Val Pro Leu Ser Gly Val Thr Ser Ile 145 150 155 160 Leu Arg Gly Ile Leu Thr Arg Asn Gln Cys Ile Val Arg Met Ser Ala 165 170 175 Ser Asp Pro Phe Thr Ala His Ala Leu Ala Met Ser Phe Ile Asp Val 180 185 190 Asp Pro Asn His Pro Ile Ser Arg Ser Ile Ser Val Leu Tyr Trp Pro 195 200 205 His Ala Ser Asp Thr Thr Leu Ala Glu Glu Leu Leu Ser His Met Asp 210 215 220 Ala Val Val Ala Trp Gly Gly Arg Asp Ala Ile Asp Trp Ala Val Lys 225 230 235 240 His Ser Pro Ser His Ile Asp Val Leu Lys Phe Gly Pro Lys Lys Ser 245 250 255 Phe Thr Val Leu Asp His Pro Ala Asp Leu Glu Glu Ala Ala Ser Gly 260 265 270 Val Ala His Asp Ile Cys Phe Tyr Asp Gln Asn Ala Cys Phe Ser Thr 275 280 285 Gln Asn Ile Tyr Phe Ser Gly Asp Lys Tyr Glu Glu Phe Lys Leu Lys 290 295 300 Leu Val Glu Lys Leu Asn Leu Tyr Gln Glu Val Leu Pro Lys Ser Lys 305 310 315 320 Gln Ser Phe Asp Asp Glu Ala Leu Phe Ser Met Thr Arg Leu Glu Cys 325 330 335 Gln Phe Ser Gly Leu Lys Val Ile Ser Glu Pro Glu Asn Asn Trp Met 340 345 350 Ile Ile Glu Ser Glu Pro Gly Val Glu Tyr Asn His Pro Leu Ser Arg 355 360 365 Cys Val Tyr Val His Lys Ile Asn Lys Val Asp Asp Val Val Gln Tyr 370 375 380 Ile Glu Lys His Gln Thr Gln Thr Ile Ser Phe Tyr Pro Trp Glu Ser 385 390 395 400 Ser Lys Lys Tyr Arg Asp Ala Phe Ala Ala Lys Gly Val Glu Arg Ile 405 410 415 Val Glu Ser Gly Met Asn Asn Ile Phe Arg Ala Gly Gly Ala His Asp 420 425 430 Ala Met Arg Pro Leu Gln Arg Leu Val Arg Phe Val Ser His Glu Arg 435 440 445 Pro Tyr Asn Phe Thr Thr Lys Asp Val Ser Val Glu Ile Glu Gln Thr 450 455 460 Arg Phe Leu Glu Glu Asp Lys Phe Leu Val Phe Val Pro 465 470 475 29479PRTVibrio fischeri 29Met Ile Lys Cys Ile Pro Met Ile Ile Lys Gly Val Val Gln Asp Phe 1 5 10 15 Asp Asn Asn Ala Cys Lys Glu Ile Asn Leu Asp Ser Gly Asn Lys Ile 20 25 30 Lys Leu Ser Leu Leu Thr Glu Asp Ser Val Leu Arg Ser Leu Asn Ser 35 40 45 Lys Glu Lys Val Asp Leu Asn Leu Asn Gln Ile Val Asn Phe Leu Tyr 50 55 60 Thr Val Gly Gln Arg Trp Lys Asn Glu Glu Tyr Asn Arg Arg Arg Thr 65 70 75 80 Tyr Ile Arg Glu 
Leu Lys Lys Tyr Leu Gly Tyr Ser Asp Glu Met Ala 85 90 95 Arg Leu Glu Ala Asn Trp Ile Ala Met Leu Leu Cys Ser Lys Ser Ala 100 105 110 Leu Tyr Asp Ile Val Asn Tyr Asp Leu Gly Ser Ile His Val Leu Asp 115 120 125 Glu Trp Leu Pro Arg Gly Asp Cys Tyr Val Lys Ala Gln Ala Lys Gly 130 135 140 Val Ser Ile His Leu Leu Ala Gly Asn Val Pro Leu Ser Gly Val Thr 145 150 155 160 Ser Ile Leu Arg Ala Ile Leu Thr Lys Asn Glu Cys Ile Ile Lys Thr 165 170 175 Ser Ser Ser Asp Pro Phe Thr Ala Thr Ala Leu Ala Ser Ser Phe Ile 180 185 190 Asp Val Asn Ala Glu His Pro Ile Thr Lys Ser Met Ser Val Met Tyr 195 200 205 Trp Pro His Asn Glu Asp Met Thr Leu Pro Gln Arg Ile Met Asn His 210 215 220 Ala Asp Ile Val Ile Ala Trp Gly Gly Glu Glu Ala Ile Lys Trp Ala 225 230 235 240 Ala Lys His Ser Pro Pro His Ala Asp Val Leu Lys Phe Gly Pro Lys 245 250 255 Lys Ser Leu Ser Ile Ile Glu Glu Pro Glu Asp Met Glu Glu Ala Ala 260 265 270 Met Gly Val Ala His Asp Ile Cys Phe Tyr Asp Gln Gln Ala Cys Phe 275 280 285 Ser Thr Gln Asp Val Tyr Tyr Ile Gly Glu His Leu Pro Leu Phe Leu 290 295 300 Ser Glu Leu Glu Lys Gln Leu Asp Arg Tyr Ala Lys Ile Leu Pro Lys 305 310 315 320 Gly Leu Lys Asn Phe Asp Glu Lys Ala Ala Phe Ser Leu Thr Glu Arg 325
330 335 Glu Gly Ile Phe Ala Gly Tyr Asp Val Lys Lys Gly Asp Asn Gln Ala 340 345 350 Trp Leu Met Ile Ile Ser Pro Thr Asn Ser Ser Gly Asn Gln Pro Leu 355 360 365 Ser Arg Ser Val Tyr Ile His Gln Val Ser Asp Ile Asn Glu Val Leu 370 375 380 Pro Phe Val Asn Lys Asn Ser Thr Gln Thr Val Ser Ile Tyr Pro Trp 385 390 395 400 Glu Ala Ser Leu Lys Tyr Arg Asp Lys Leu Ala Met Ser Gly Ala Glu 405 410 415 Arg Ile Val Glu Ser Gly Met Asn Asn Ile Phe Arg Val Gly Gly Ala 420 425 430 His Asp Ser Leu Ser Pro Leu Gln Tyr Leu Val Arg Phe Thr Ser His 435 440 445 Glu Arg Pro Phe His Tyr Thr Thr Lys Asp Val Ala Val Glu Ile Glu 450 455 460 Gln Thr Arg Tyr Leu Glu Glu Asp Lys Phe Leu Val Phe Val Pro 465 470 475 30305PRTVibrio harveyi 30Met Asn Asn Gln Cys Lys Thr Ile Ala His Val Leu Arg Val Asn Asn 1 5 10 15 Gly Gln Glu Leu His Val Trp Glu Thr Pro Pro Lys Glu Asn Val Pro 20 25 30 Ser Lys Asn Asn Thr Ile Leu Ile Ala Ser Gly Phe Ala Arg Arg Met 35 40 45 Asp His Phe Ala Gly Leu Ala Glu Tyr Leu Ser Glu Asn Gly Phe His 50 55 60 Val Phe Arg Tyr Asp Ser Leu His His Val Gly Leu Ser Ser Gly Ser 65 70 75 80 Ile Asp Glu Phe Thr Met Thr Thr Gly Lys Asn Ser Leu Cys Thr Val 85 90 95 Tyr His Trp Leu Gln Thr Lys Gly Thr Gln Asn Ile Gly Leu Ile Ala 100 105 110 Ala Ser Leu Ser Ala Arg Val Ala Tyr Glu Val Ile Ser Asp Leu Glu 115 120 125 Leu Ser Phe Leu Ile Thr Ala Val Gly Val Val Asn Leu Arg Asp Thr 130 135 140 Leu Glu Lys Ala Leu Gly Phe Asp Tyr Leu Ser Leu Pro Ile Asp Glu 145 150 155 160 Leu Pro Asn Asp Leu Asp Phe Glu Gly His Lys Leu Gly Ser Glu Val 165 170 175 Phe Val Arg Asp Cys Phe Glu His His Trp Asp Thr Leu Asp Ser Thr 180 185 190 Leu Asp Lys Val Ala Asn Thr Ser Val Pro Leu Ile Ala Phe Thr Ala 195 200 205 Asn Asn Asp Asp Trp Val Lys Gln Glu Glu Val Tyr Asp Met Leu Ala 210 215 220 His Ile Arg Thr Gly His Cys Lys Leu Tyr Ser Leu Leu Gly Ser Ser 225 230 235 240 His Asp Leu Gly Glu Asn Leu Val Val Leu Arg Asn Phe Tyr Gln Ser 245 250 255 Val Thr Lys Ala Ala Ile Ala Met Asp Gly Gly Ser Leu Glu Ile Asp 260 265 
270 Val Asp Phe Ile Glu Pro Asp Phe Glu Gln Leu Thr Ile Ala Thr Val 275 280 285 Asn Glu Arg Arg Leu Lys Ala Glu Ile Glu Ser Arg Thr Pro Glu Met 290 295 300 Ala 305 31307PRTVibrio fischeri 31Met Lys Asp Glu Ser Ala Leu Phe Thr Ile Asp His Ile Ile Lys Leu 1 5 10 15 Asp Asn Gly Gln Ser Ile Arg Val Trp Glu Thr Leu Pro Lys Lys Asn 20 25 30 Val Pro Glu Lys Lys Asn Thr Ile Leu Ile Ala Ser Gly Phe Ala Arg 35 40 45 Arg Met Asp His Phe Ala Gly Leu Ala Glu Tyr Leu Ser Thr Asn Gly 50 55 60 Phe His Val Ile Arg Tyr Asp Ser Leu His His Val Gly Leu Ser Ser 65 70 75 80 Gly Cys Ile Asn Glu Phe Thr Met Ser Ile Gly Lys Asn Ser Leu Leu 85 90 95 Thr Val Val Asp Trp Leu Thr Asp His Gly Val Glu Arg Ile Gly Leu 100 105 110 Ile Ala Ala Ser Leu Ser Ala Arg Ile Ala Tyr Glu Val Val Asn Lys 115 120 125 Ile Lys Leu Ser Phe Leu Ile Thr Ala Val Gly Val Val Asn Leu Arg 130 135 140 Asp Thr Leu Glu Lys Ala Leu Glu Tyr Asp Tyr Leu Gln Leu Pro Ile 145 150 155 160 Ser Glu Leu Pro Glu Asp Leu Asp Phe Glu Gly His Asn Leu Gly Ser 165 170 175 Glu Val Phe Val Thr Asp Cys Phe Lys His Asp Trp Asp Thr Leu Asp 180 185 190 Ser Thr Leu Asn Ser Val Lys Gly Leu Ala Ile Pro Phe Ile Ala Phe 195 200 205 Thr Ala Asn Asp Asp Ser Trp Val Lys Gln Ser Glu Val Ile Glu Leu 210 215 220 Ile Asp Ser Ile Glu Ser Ser Asn Cys Lys Leu Tyr Ser Leu Ile Gly 225 230 235 240 Ser Ser His Asp Leu Gly Glu Asn Leu Val Val Leu Arg Asn Phe Tyr 245 250 255 Gln Ser Val Thr Lys Ala Ala Leu Ala Leu Asp Asp Gly Leu Leu Asp 260 265 270 Leu Glu Ile Asp Ile Ile Glu Pro Arg Phe Glu Asp Val Thr Ser Ile 275 280 285 Thr Val Lys Glu Arg Arg Leu Lys Asn Glu Ile Glu Asn Glu Leu Leu 290 295 300 Glu Leu Ala 305 32378PRTVibrio harveyi 32Met Asp Val Leu Ser Ala Val Lys Gln Glu Asn Ile Ala Ala Ser Thr 1 5 10 15 Glu Ile Asp Asp Leu Ile Phe Met Gly Thr Pro Gln Gln Trp Ser Leu 20 25 30 Gln Glu Gln Lys Gln Leu Thr Ser Arg Leu Val Lys Gly Ala Tyr Gln 35 40 45 Tyr His Tyr His Asn Asn Asp Asp Tyr Arg Gln Phe Cys Glu Arg Leu 50 55 60 Gly Val Gly Glu Val Val Glu Asp Leu 
Asn Asp Ile Pro Val Phe Pro 65 70 75 80 Thr Ser Ile Phe Lys Leu Lys Thr Leu Leu Thr Leu Asp Asp Asp Glu 85 90 95 Val Glu Asn Arg Phe Thr Ser Ser Gly Thr Ser Gly Ile Lys Ser Ile 100 105 110 Val Ala Arg Asp Arg Leu Ser Ile Glu Arg Leu Leu Gly Ser Val Asn 115 120 125 Phe Gly Met Asn Tyr Val Gly Asp Trp Phe Asp His Gln Met Glu Leu 130 135 140 Val Asn Leu Gly Pro Asp Arg Phe Asn Ala Asn Asn Ile Trp Phe Lys 145 150 155 160 Tyr Val Met Ser Leu Val Glu Leu Leu Tyr Pro Thr Ala Phe Thr Val 165 170 175 Thr Glu Asp Glu Ile Asp Phe Glu Ala Thr Leu Ala Asn Met Asn Arg 180 185 190 Ile Lys Gln Ser Gly Lys Thr Ile Cys Leu Ile Gly Pro Pro Tyr Phe 195 200 205 Ile Tyr Leu Leu Cys Cys Phe Met Arg Glu Gln Gly Gln Thr Phe Asn 210 215 220 Gly Gly Arg Asp Leu Tyr Ile Ile Thr Gly Gly Gly Trp Lys Lys His 225 230 235 240 Gln Asp Gln Ser Leu Asp Arg Asp Glu Phe Asn Gln Leu Leu Cys Glu 245 250 255 Thr Phe Thr Leu Glu Ser Pro Glu Gln Ile Arg Asp Thr Phe Asn Gln 260 265 270 Val Glu Leu Asn Thr Cys Phe Phe Glu Asp Thr Glu His Lys Lys Arg 275 280 285 Val Pro Pro Trp Val Phe Ala Arg Ala Leu Asp Pro Lys Thr Leu Lys 290 295 300 Pro Leu Pro His Gly Gln Pro Gly Leu Met Ser Tyr Met Asp Ala Ser 305 310 315 320 Ala Val Ser Tyr Pro Cys Phe Leu Val Thr Asp Asp Ile Gly Ile Val 325 330 335 Arg Glu Glu Glu Gly Asp Arg Pro Gly Thr Thr Val Glu Ile Val Arg 340 345 350 Arg Val Lys Thr Arg Gly Met Lys Gly Cys Ala Leu Ser Met Ser Gln 355 360 365 Ala Phe Thr Ala Lys Ser Glu Gly Gly Asn 370 375 33376PRTVibrio fischeri 33Met Thr Asn His Ile Glu Tyr Lys Lys Asn Gln Ile Ile Ala Ser Ser 1 5 10 15 Glu Ile Asp Asp Leu Ile Phe Met Ser Ala Pro Gln Glu Trp Ser Leu 20 25 30 Glu Glu Gln Lys Glu Ile Gln Asp Lys Leu Val Arg Glu Ala Phe His 35 40 45 Phe His Tyr Asn Arg Asn Glu Lys Tyr Arg Asn Tyr Cys Ile Ser Gln 50 55 60 His Ile Asn Glu Asn Leu His Ser Ile Asp Glu Ile Pro Val Phe Pro 65 70 75 80 Thr Ser Ile Phe Lys His Met Lys Phe His Thr Val Ser Met Gly Asp 85 90 95 Ile Glu Asn Trp His Thr Ser Ser Gly Thr Gln Gly Ile Lys Ser Cys 
100 105 110 Ile Ala Arg Asp Arg Leu Ser Ile Glu Arg Leu Leu Gly Ser Val Asn 115 120 125 Phe Gly Met Lys Tyr Val Gly Asn Trp Phe Glu His Gln Met Glu Leu 130 135 140 Val Asn Leu Gly Pro Asp Arg Phe Ser Ala Ser Asn Val Trp Phe Lys 145 150 155 160 Tyr Val Met Ser Leu Val Glu Leu Leu Tyr Pro Thr Val Phe Thr Val 165 170 175 Asn Asn Asp Lys Ile Asp Phe Glu Glu Thr Val Asn His Leu Tyr Arg 180 185 190 Ile Asn Asn Ser Asn Lys Asp Ile Cys Leu Ile Gly Pro Pro Phe Phe 195 200 205 Val Ser Leu Leu Cys Gln Tyr Met Lys Glu Asn Asn Ile Glu Phe Lys 210 215 220 Gly Glu Asn Arg Leu His Val Ile Thr Gly Gly Gly Trp Lys Ser Asn 225 230 235 240 Glu Asn Ser Ser Leu Asn Arg Gln Asp Phe Asn Gln Leu Ile Met Asp 245 250 255 Thr Phe Gln Leu Asp Asn Val Asn Gln Ile Arg Asp Thr Phe Asn Gln 260 265 270 Val Glu Leu Asn Thr Cys Phe Phe Glu Asp Glu Phe Gln Arg Lys His 275 280 285 Val Pro Pro Trp Val Tyr Ala Arg Ala Leu Asp Pro Glu Thr Leu Lys 290 295 300 Pro Val Ala Asp Gly Glu Leu Gly Leu Leu Ser Tyr Met Asp Ala Ser 305 310 315 320 Ser Thr Ala Tyr Pro Ala Phe Ile Val Thr Asp Asp Ile Gly Ile Val 325 330 335 Arg Glu Ile Arg Glu Pro Asp Pro Tyr Pro Gly Val Thr Val Glu Ile 340 345 350 Val Arg Arg Leu Asn Thr Arg Ala Gln Lys Gly Cys Ala Leu Ser Met 355 360 365 Ala Ser Phe Ile Gln Ser Thr Ile 370 375