Important note: Do NOT send email to the cs.cf.ac.uk address above: it will
be ignored. Corrections and other correspondence should be sent to
TABLE OF CONTENTS OF PART 3
Q2: What applications of EAs are there?
Q3: Who is concerned with EAs?
Q4: How many EAs exist? Which?
Q4.1: What about Alife systems, like Tierra and VENUS?
Q5: What about all this Optimization stuff?
Subject: Q2: What applications of EAs are there?
In principle, EAs can compute any computable function, i.e.
everything a normal digital computer can do.
But EAs are especially badly suited for problems where efficient ways
of solving them are already known, (unless these problems are
intended to serve as benchmarks). Special purpose algorithms, i.e.
algorithms that have a certain amount of problem domain knowledge
hard coded into them, will usually outperform EAs, so there is no
black magic in EC. EAs should be used when there is no other known
problem solving strategy, and the problem domain is NP-complete.
That's where EAs come into play: heuristically finding solutions
where all else fails.
Following is an incomplete (sic!) list of successful EA
ARTIFICIAL LIFE (ALIFE) systems attempt to simulate the kind of
behaviour exhibited by real, living creatures. Not all Artificial
Life systems employ EVOLUTIONARY ALGORITHMs (see Q4.1).
Framsticks is a three-dimensional life SIMULATION project. Both the
physical structure of creatures and their control systems are
evolved. Evolutionary algorithms are used with SELECTION, CROSSOVER
and MUTATION. Finite element methods are used for simulation. Both
spontaneous and directed EVOLUTIONs are possible.
This system uses the standard EA framework to evolve 3D agents
equipped with neural networks. It has proved to be an attractive tool
for people who want to learn about the way evolutionary OPTIMIZATION
This is shareware, but all the evolutionary features are available
free. The project is open, and developers can take part in it, and
also conduct their own experiments (i.e. using their own GENETIC
OPERATORs). There are links to the scientific papers on the web
page, as well as the detailed program documentation. The software is
quite general and can be used to study a range of problems, including
coevolution of body and brain.
For more details, see: http://www.frams.poznan.pl/
Biocomputing, or Bioinformatics, is the field of biology dedicated to
the automatic analysis of experimental data (mostly sequencing data).
Several approaches to specific biocomputing problems have been
described that involve the use of GA, GP and simulated annealing.
General information about biocomputing (software, databases, misc.)
can be found on the server of the European Bioinformatics Institute:
http://www.ebi.ac.uk/ebi_home.html ENCORE has a good selection of
pointers related to this subject. VSCN provides a detailed online
course on bioinformatics: http://www.techfak.uni-
There are three main domains to which GA have been applied in
Bioinformatics: protein folding, RNA folding, sequence alignment.
Proteins are one of the essential components of any form of life.
They are made of twenty different types of amino acid. These amino
acids are chained together in order to form the protein that can
contain from a few to several thousands residues. In most of the
cases, the properties and the function of a protein are a result of
its three dimensional structure. It seems that in many cases this
structure is a direct consequence of the sequence. Unfortunately, it
is still very difficult/impossible to deduce the three dimensional
structure, knowing only the sequence. A part of the VSCN on-line
bioinformatics course is dedicated to the use of GAs in Protein
Folding Prediction. It contains an extensive bibliography and a
detailed presentation of the subject with LOTS of explanations and
on-line papers. The URL is: http://www.techfak.uni-
Koza [KOZA92] gives one example of GP applied to Protein Folding.
Davis [DAVIS91] gives an example of DNA conformation prediction (a
closely related problem) in his Handbook of GAs.
Describing the tertiary structure of an RNA molecule, is about as
hard as for a protein, but describing the intermediate structure
(secondary structure) is somehow easier because RNA molecules are
using the same pairing rules as DNA, (Watson and Crick base pairing).
There exist deterministic algorithms that given a set of constraints
(rules), compute the more stable structure, but: (a) their time and
memory requirement increase quadratically or more with the length of
the sequences, and (b) they require simplified rules. Lots of effort
has recently been put into applying GAs to this problem, and several
papers can be found (on-line if your institute subscribes to these
A genetic Algorithm Based Molecular Modelling Technique For RNA Stem-
loop Structures H. Ogata, Y. Akiyama and M Kanehisa, Nucleic Acid
Research, 1995, vol 23,3 419-426
An Annealing Mutation Operator in the GA for RNA folding B.A Shapiro
and J. C. Wu, CABIOS, 1996, vol 12, 3, 171-180
The computer Simulation of RNA Folding Pathway Using a Genetic
Algorithm A.P. Gultyaev, F.D.H van Batenburg and C. W. A. Pleij in
Journal of Molecular Biology, 1995, vol 250 37-51
Simulated Annealing has also been applied successfully to this
Description of RNA folding by SA M. Schmitz and G. Steger in Journal
of Molecular Biology, 1995, 255, 245-266
Sequence Alignment is another important problem of Bioinformatics.
The aim is to align together several related sequences (from two to
hundreds) given a cost function. For the most widely used cost
functions, the problem has been shown to be NP-complete. Several
attempts have been made using SA:
Multiple Sequence Alignment Using SA J. Kim, Sakti Pramanik and M.J.
Chung, CABIOS, 1994, vol 10, 4, 419-426
Multiple Sequence Alignment by Parallel SA M. Isshikawa, T. Koya and
al, CABIOS, 1993,vol 9, 3, 267-273
SAM, software which uses Hidden Markov Models for Multiple Sequence
Alignment, can use SA to train the model. Several papers have been
published on SAM. The software, documentation and an extensive
bibliography can be found in:
More recently, various software using different methods like Gibbs
sampling or GAs has been developed:
A Gibbs Sampling Strategy for Multiple Alignment C.E. Lawrence, S. F.
Altschull and al, Science, October 1993, vol 262, 208-214
SAGA: Sequence Alignment by Genetic Algorithm C. Notredame and D.G.
Higgins, Nucleic Acid Research, 1995, vol 24, 8,
A beta release of SAGA (along with the paper) is available on the
European Bioinformatics Institute anonymous FTP server:
CELLULAR PROGRAMMING: Evolution of Parallel Cellular Machines
Nature abounds in systems involving the actions of simple, locally-
interacting components, that give rise to coordinated global
behavior. These collective systems have evolved by means of natural
SELECTION to exhibit striking problem-solving capacities, while
functioning within a complex, dynamic ENVIRONMENT. Employing simple
yet versatile parallel cellular models, coupled with EVOLUTIONARY
COMPUTATION techniques, cellular programming is an approach for
constructing man-made systems that exhibit characteristics such as
those manifest by their natural counterparts.
Parallel cellular machines hold potential both scientifically, as
vehicles for studying phenomena of interest in areas such as complex
adaptive systems and ARTIFICIAL LIFE, as well as practically,
enabling the construction of novel systems, endowed with
evolutionary, reproductive, regenerative, and learning capabilities.
Web site: http://lslwww.epfl.ch/~moshes/cp.html
Sipper, M. (1997) "Evolution of Parallel Cellular Machines: The
Cellular Programming Approach", Springer-Verlag, Heidelberg.
Sipper, M. (1996) "Co-evolving Non-Uniform Cellular Automata to
Perform Computations", Physica D, 92, 193-208.
Sipper, M. and Ruppin, E. (1997) "Co-evolving architectures for
cellular machines", Physica D, 99, 428-441.
Sipper, M. and Tomassini, M. (1996) "Generating Parallel Random
Number Generators By Cellular Programming", International Journal of
Modern Physics C, 7(2), 181-190.
Sipper, M. (1997) "Evolving Uniform and Non-uniform Cellular Automata
Networks", in Annual Reviews of Computational Physics, D. Stauffer
The idea of evolving machines, whose origins can be traced to the
cybernetics movement of the 1940s and the 1950s, has recently
resurged in the form of the nascent field of bio-inspired systems and
evolvable hardware. The field draws on ideas from the EVOLUTIONARY
COMPUTATION domain as well as on novel hardware innovations.
Recently, the term evolware has been used to describe such evolving
ware, with current implementations centering on hardware, while
raising the possibility of using other forms in the future, such as
bioware. The inaugural workshop, Towards Evolvable Hardware, took
place in Lausanne, in October 1995, followed by the First
International Conference on Evolvable Systems: From Biology to
Hardware (ICES96) held in Japan, in October 1996. Another major event
in the field, ICES98, was held in Lausanne, Switzerland, in September
Sipper, M. et al (1997) "A Phylogenetic, Ontogenetic, and Epigenetic
View of Bio-Inspired Hardware Systems", IEEE Transactions on
Evolutionary Computation, 1(1).
Sanchez, E. and Tomassini, M. (eds) (1996) "Towards Evolvable
Hardware", Springer-Verlag, Lecture Notes in Computer Science, 1062.
Higuchi, T. et al (1997) "Proceedings of First International
Conference on Evolvable Systems: From Biology to Hardware (ICES96)",
Springer-Verlag, Lecture Notes in Computer Science.
GAs can be used to evolve behaviors for playing games. Work in
evolutionary GAME THEORY typically surrounds the EVOLUTION of a
POPULATION of players who meet randomly to play a game in which they
each must adopt one of a limited number of moves. (Maynard-Smith
1982). Let's suppose it is just two moves, X and Y. The players
receive a reward, analogous to Darwinian FITNESS, depending on which
combination of moves occurs and which move they adopted. In more
complicated models there may be several players and several moves.
The players iterate such a game a series of times, and then move on
to a new partner. At the end of all such moves, the players will have
a cumulative payoff, their fitness. This fitness can then be used to
generate a new population.
The real key in using a GA is to come up with an encoding to
represent player's strategies, one that is amenable to CROSSOVER and
to MUTATION. Possibilities are to suppose at each iteration a player
adopts X with some probability (and Y with one minus such). A player
can thus be represented as a real number, or a bit-string suitably
interpreted as a probability
An alternative characterisation is to model the players as Finite
State Machines, or Finite Automata (FA). These can be though of as a
simple flow chart governing behaviour in the "next" play of the game
depending upon previous plays. For example:
100 Play X
110 If opponent plays X go to 100
120 Play Y
130 If opponent plays X go to 100 else go to 120
represents a strategy that does whatever its opponent did last, and
begins by playing X, known as "Tit-For-Tat." (Axelrod 1982). Such
machines can readily be encoded as bit-strings. Consider the encoding
"1 0 1 0 0 1" to represent TFT. The first three bits, "1 0 1" are
state 0. The first bit, "1" is interpreted as "Play X." The second
bit, "0" is interpreted as "if opponent plays X go to state 1," the
third bit, "1", is interpreted as "if the opponent plays Y, go to
state 1." State 1 has a similar interpretation. Crossing over such
bit-strings always yields valid strategies.
SIMULATIONs in the Prisoner's dilemma have been undertaken (Axelrod
1987, Fogel 1993, Miller 1989) of these machines.
Alternative representations of game players include CLASSIFIER
SYSTEMs (Marimon, McGrattan and Sargent 1990, [GOLD89]), and Neural-
networks (Fogel and Harrald 1994), though not necessarily with a GA.
(Fogel 1993), and Fogel and Harrald 1994 use an Evolutionary
Program). Chellapilla and Fogel (1999) have evolved a neural network
which can play checkers (draughts) at near expert level.
Other methods of evolving a population can be found in Lindgren 1991,
Glance and Huberman 1993 and elsewhere.
A GA for playing the game "Mastermind" has been produced. See
Axelrod, R. (1987) ``The Evolution of Strategies in the Repeated
Prisoner's Dilemma,'' in [DAVIS91]
Axelrod, R (?) ``The Complexity of Cooperation'' (See the web site,
which includes code to implement tournaments:
Chellapilla, K. and Fogel, D.B. (1999) ``Evolution, neural networks,
games, and intelligence'' , Proc. IEEE, Sept., pp. 1471-1496.
Miller, J.H. (1989) ``The Coevolution of Automata in the Repeated
Prisoner's Dilemma'' Santa Fe Institute Working Paper 89-003.
Marimon, Ramon, Ellen McGrattan and Thomas J. Sargent (1990) ``Money
as a Medium of Exchange in an Economy with Artificially Intelligent
Agents'' Journal of Economic Dynamics and Control 14, pp. 329--373.
Maynard-Smith, (1982) Evolution and the Theory of Games, CUP.
Lindgren, K. (1991) ``Evolutionary Phenomena in Simple Dynamics,'' in
Holland, J.H and John Miller (1990) ``Artificially Adaptive Agents in
Economic Theory,'' American Economic Review: Papers and Proceedings
of the 103rd Annual Meeting of the American Economics Association:
Huberman, Bernado, and Natalie S. Glance (1993) "Diversity and
Collective Action" in H. Haken and A. Mikhailov (eds.)
Interdisciplinary Approaches to Nonlinear Systems, Springer.
Fogel (1993) "Evolving Behavior in the Iterated Prisoner's Dilemma"
Evolutionary Computation 1:1, 77-97
Fogel, D.B. and Harrald, P. (1994) ``Evolving Complex Behaviour in
the Iterated Prisoner's Dilemma,'' Proceedings of the Fourth Annual
Meetings of the Evolutionary Programming Society, L.J. Fogel and A.W.
Sebald eds., World Science Press.
Lindgren, K. and Nordahl, M.G. "Cooperation and Community Structure
in Artificial Ecosystems", Artificial Life, vol 1:1&2, 15-38
Stanley, E.A., Ashlock, D. and Tesfatsion, L. (1994) "Iterated
Prisoners Dilemma with Choice and Refusal of Partners in [ALIFEIII]
The Job-Shop Scheduling Problem (JSSP) is a very difficult NP-
complete problem which, so far, seems best addressed by sophisticated
branch and bound search techniques. GA researchers, however, are
continuing to make progress on it. (Davis 85) started off GA
research on the JSSP, (Whitley 89) reports on using the edge
RECOMBINATION operator (designed initially for the TSP) on JSSPs too.
More recent work includes (Nakano 91),(Yamada & Nakano 92), (Fang et
al. 93). The latter three report increasingly better results on
using GAs on fairly large benchmark JSSPs (from Muth & Thompson 63);
neither consistently outperform branch & bound search yet, but seem
well on the way. A crucial aspect of such work (as with any GA
application) is the method used to encode schedules. An important
aspect of some of the recent work on this is that better results have
been obtained by rejecting the conventional wisdom of using binary
representations (as in (Nakano 91)) in favor of more direct
encodings. In (Yamada & Nakano 92), for example, a GENOME directly
encodes operation completion times, while in (Fang et al. 93) genomes
represent implicit instructions for building a schedule. The success
of these latter techniques, especially since their applications are
very important in industry, should eventually spawn advances in GA
Concerning the point of using GAs at all on hard job-shop scheduling
problems, the same goes here as suggested above for `Timetabling':
The GA approach enables relatively arbitrary constraints and
objectives to be incorporated painlessly into a single OPTIMIZATION
method. It is unlikely that GAs will outperform specialized
knowledge-based and/or conventional OR-based approaches to such
problems in terms of raw solution quality, however GAs offer much
greater simplicity and flexibility, and so, for example, may be the
best method for quick high-quality solutions, rather than finding the
best possible solution at any cost. Also, of course, hybrid methods
will have a lot to offer, and GAs are far easier to parallelize than
typical knowledge-based/OR methods.
Similar to the JSSP is the Open Shop Scheduling Problem (OSSP).
(Fang et al. 93) reports an initial attempt at using GAs for this.
Ongoing results from the same source shows reliable achievement of
results within less than 0.23% of optimal on moderately large OSSPs
(so far, up to 20x20), including an improvement on the previously
best known solution for a benchmark 10x10 OSSP. A simpler form of job
shop problem is the Flow-Shop Sequencing problem; recent successful
work on applying GAs to this includes (Reeves 93)."
Other scheduling problems
In contrast to job shop scheduling some maintenance scheduling
problems consider which activities to schedule within a planned
maintenance period, rather than seeking to minimise the total time
taken by the activities. The constraints on which parts may be taken
out of service for maintenance at particular times may be very
complex, particularly as they will in general interact. Some initial
work is given in (Langdon, 1995).
Davis, L. (1985) "Job-Shop Scheduling with Genetic Algorithms",
Muth, J.F. & Thompson, G.L. (1963) "Industrial Scheduling". Prentice
Hall, Englewood Cliffs, NJ, 1963.
Nakano, R. (1991) "Conventional Genetic Algorithms for Job-Shop
Problems", [ICGA91], 474-479.
Reeves, C.R. (1993) "A Genetic Algorithm for Flowshop Sequencing",
Coventry Polytechnic Working Paper, Coventry, UK.
Whitley, D., Starkweather, T. & D'Ann Fuquay (1989) "Scheduling
Problems and Traveling Salesmen: The Genetic Edge Recombination
Operator", [ICGA89], 133-140.
Fang, H.-L., Ross, P., & Corne D. (1993) "A Promising Genetic
Algorithm Approach to Job-Shop Scheduling, Rescheduling & Open-Shop
Scheduling Problems", [ICGA93], 375-382.
Yamada, T. & Nakano, R. (1992) "A Genetic Algorithm Applicable to
Large-Scale Job-Shop Problems", [PPSN92], 281-290.
Langdon, W.B. (1995) "Scheduling Planned Maintenance of the (UK)
National Grid", cs.ucl.ac.uk/genetic/papers/grid_aisb-95.ps
"Applications of EA in management science and closely related fields
like organizational ecology is a domain that has been covered by some
EA researchers - with considerable bias towards scheduling problems.
Since I believe that EA have considerable potential for applications
outside the rather narrow domain of scheduling and related
combinatorial problems, I started collecting references about the
status quo of EA-applications in management science. This report
intends to make available my findings to other researchers in the
field. It is a short overview and lists some 230 references to
current as well as finished research projects. [..]
"At the end of the paper, a questionnaire has been incorporated that
may be used for this purpose. Other comments are also appreciated."
--- from the Introduction of (Nissen 93)
Nissen, V. (1993) "Evolutionary Algorithms in Management Science: An
Overview and List of References", Papers on Economics and Evolution,
edited by the European Study Group for Evolutionary Economics. This
report is also avail. via anon. FTP from
Boulding, K.E. (1991) "What is evolutionary economics?", Journal of
Evolutionary Economics, 1, 9-17.
New connections between GENETIC ALGORITHMs and Non Linear Filtering
Theory have been established. GAs have already been successfully
applied to a large class of non-linear filtering problems such as
RADAR / SONAR/ GPS signal processing. This relatively new branch of
GA application has also lead to new results on the convergence of
GAs: large deviations, fluctuations...
Some preprints and references on this topic are available in the web
The new results also points out some natural connections between:
genetic type algorithms, information theory, non-linear filtering
theory, interacting and branching particle systems.
This has been addressed quite successfully with GAs. A very common
manifestation of this kind of problem is the timetabling of exams or
classes in Universities, etc.
The first application of GAs to the timetabling problem was to build
the schedule of the teachers in an Italian high school. The
research, conducted at the Department of Electronics and Information
of Politecnico di Milano, Italy, showed that a GA was as good as Tabu
Search, and better than simulated annealing, at finding teacher
schedules satisfying a number of hard and soft constraints. The
software package developed is now in current use in some high schools
in Milano. (Colorni et al 1990)
At the Department of Artificial Intelligence, University of
Edinburgh, timetabling the MSc exams is now done using a GA (Corne et
al. 93, Fang 92). An example of the use of GAs for timetabling
classes is (Abramson & Abela 1991).
In the exam timetabling case, the FITNESS function for a GENOME
representing a timetable involves computing degrees of punishment for
various problems with the timetable, such as clashes, instances of
students having to take consecutive exams, instances of students
having (eg) three or more exams in one day, the degree to which
heavily-subscribed exams occur late in the timetable (which makes
marking harder), overall length of timetable, etc. The modular nature
of the fitness function has the key to the main potential strength of
using GAs for this sort of thing as opposed to using conventional
search and/or constraint programming methods. The power of the GA
approach is the ease with which it can handle arbitrary kinds of
constraints and objectives; all such things can be handled as
weighted components of the fitness function, making it easy to adapt
the GA to the particular requirements of a very wide range of
possible overall objectives . Very few other timetabling methods, for
example, deal with such objectives at all, which shows how difficult
it is (without GAs) to graft the capacity to handle arbitrary
objectives onto the basic "find shortest- length timetable with no
clashes" requirement. The proper way to weight/handle different
objectives in the fitness function in relation to the general GA
dynamics remains, however, an important research problem!
GAs thus offer a combination of simplicity, flexibility & speed which
competes very favorably with other approaches, but are unlikely to
outperform knowledge-based (etc) methods if the best possible
solution is required at any cost. Even then, however, hybridisation
may yield the best of both worlds; also, the ease (if the hardware is
available!) of implementing GAs in parallel enhances the possibility
of using them for good, fast solutions to very hard timetabling and
Abramson & Abela (1991) "A Parallel Genetic Algorithm for Solving the
School Timetabling Problem", Technical Report, Division of I.T.,
C.S.I.R.O, April 1991. (Division of Information Technology,
C.S.I.R.O., c/o Dept. of Communication & Electronic Engineering,
Royal Melbourne Institute of Technology, PO BOX 2476V, Melbourne
Colorni A., M. Dorigo & V. Maniezzo (1990). Genetic Algorithms And
Highly Constrained Problems: The Time-Table Case. Proceedings of the
First International Workshop on Parallel Problem Solving from Nature,
Dortmund, Germany, Lecture Notes in Computer Science 496, Springer-
Colorni A., M. Dorigo & V. Maniezzo (1990). Genetic Algorithms: A
New Approach to the Time-Table Problem. NATO ASI Series, Vol.F 82,
COMBINATORIAL OPTIMIZATION, (Ed. M.Akguel and others), Springer-
Colorni A., M. Dorigo & V. Maniezzo (1990). A Genetic Algorithm to
Solve the Timetable Problem. Technical Report No. 90-060,
Politecnico di Milano, Italy.
Corne, D. Fang, H.-L. & Mellish, C. (1993) "Solving the Modular Exam
Scheduling Problem with Genetic Algorithms". Proc. of 6th Int'l
Conf. on Industrial and Engineering Applications of Artificial
Intelligence & Expert Systems, ISAI.
Fang, H.-L. (1992) "Investigating GAs for scheduling", MSc
Dissertation, University of Edinburgh Dept. of Artificial
Intelligence, Edinburgh, UK.
Subject: Q3: Who is concerned with EAs?
EVOLUTIONARY COMPUTATION attracts researchers and people of quite
dissimilar disciplines, i.e. EC is a interdisciplinary research
Want to find out about the properties of sub-symbolic information
processing with EAs and about learning, i.e. adaptive systems in
They also build the hardware necessary to enable future EAs
(precursors are already beginning to emerge) to huge real world
problems, i.e. the term "massively parallel computation" [HILLIS92],
springs to mind.
Of many kinds want to exploit the capabilities of EAs on many areas
to solve their application, esp. OPTIMIZATION problems.
Want to build MOBOTs (MOBile ROBOTs, i.e. R2D2's and #5's cousins)
that navigate through uncertain ENVIRONMENTs, without using built-in
"maps". The MOBOTS thus have to adapt to their surroundings, and
learn what they can do "move-through-door" and what they can't "move-
through-wall" on their own by "trial-and-error".
Might view CFS as a possible apparatus to describe models of thinking
and cognitive systems.
Use EC hardware, e.g. Hillis' (Thinking Machine Corp.'s) Connection
Machine to model real world problems which include thousands of
variables, that run "naturally" in parallel, and thus can be modelled
more easily and esp. "faster" on a parallel machine, than on a
serial "PC" one.
Are finding EAs useful when it comes to protein folding and other
such bio-computational problems (see Q2).
EAs can also be used to model the behaviour of real POPULATIONs of
organisms. Some biologists are hostile to modeling, but an entire
community of Population Biologists arose with the 'evolutionary
synthesis' of the 1930's created by the polymaths R.A. Fisher, J.B.S.
Haldane, and S. Wright. Wright's SELECTION in small populations,
thereby avoiding local optima) is of current interest to both
biologists and ECers -- populations are naturally parallel.
A good exposition of current population Biology modeling is J.
Maynard Smith's text Evolutionary Genetics. Richard Dawkin's Selfish
Gene and Extended Phenotype are unparalleled (sic!) prose expositions
of evolutionary processes. Rob Collins' papers are excellent
parallel GA models of evolutionary processes (available in [ICGA91]
and by FTP from ftp.cognet.ucla.edu/pub/alife/papers/ ).
As fundamental motivation, consider Fisher's comment: "No practical
biologist interested in (e.g.) sexual REPRODUCTION would be led to
work out the detailed consequences experienced by organisms having
three or more sexes; yet what else should [s/]he do if [s/]he wishes
to understand why the sexes are, in fact, always
two?" (Three sexes would make for even weirder grammar, [s/]he
And in particular biochemists and molecular chemists, are interested
in problems such as the conformational analysis of molecular clusters
and related problems in molecular sciences. The application of GAs
to molecular systems has opened an interesting area of research and
the number of chemists involved in it increases day-by-day.
Some typical research topics include:
o protein folding; o conformational analysis and energy
minimization; o docking algorithms for drug-design; o solvent site
prediction in macromolecules;
Several papers have been published in journals such as Journal of
Computational Chemistry and Journal of Computer-Aided Design.
Some interesting WWW sites related to the applications of GAs to
chemistry (or molecular science in general) include:
o http://garage.cps.msu.edu/projects/biochem/biochem.html about GAs
in biochemistry (water site prediction, drug-design and protein
http://www.tc.cornell.edu/Edu/SPUR/SPUR94/Main/John.html about the
application of GAs to the search of conformational energy minima;
o http://cmp.ameslab.gov/cmp/CMP_Theory/gsa/gen2.html By using a
GA in combiation with a Tight-binding model, David Deaven and Kai-
Ming Ho founded fullerene cages (including C60) starting from
See also Q2 for applications in biocomputing.
and some other really curious people may also be interested in EC for
Subject: Q4: How many EAs exist? Which?
The All Stars
There are currently 3 main paradigms in EA research: GENETIC
ALGORITHMs, EVOLUTIONARY PROGRAMMING, and EVOLUTION STRATEGIEs.
CLASSIFIER SYSTEMs and GENETIC PROGRAMMING are OFFSPRING of the GA
community. Besides this leading crop, there are numerous other
different approaches, alongside hybrid experiments, i.e. there exist
pieces of software residing in some researchers computers, that have
been described in papers in conference proceedings, and may someday
prove useful on certain tasks. To stay in EA slang, we should think
of these evolving strands as BUILDING BLOCKs, that when recombined
someday, will produce new offspring and give birth to new EA
One such interesting offspring is the Memetic Algorithm. This is a
hybrid evolutionary algorithm, which makes use of local search
operators. For details, see
As far as "solving complex function and COMBINATORIAL OPTIMIZATION
tasks" is concerned, Davis' work on real-valued representations and
adaptive operators should be mentioned (Davis 89). Moreover Whitley's
Genitor system incorporating ranking and "steady state" mechanism
(Whitley 89), Goldberg's "messy GAs", involves adaptive
representations (Goldberg 91), and Eshelman's CHC algorithm (Eshelman
91). For real FUNCTION OPTIMIZATION, Differential EVOLUTION seems
hard to beat in terms of convergence speed as well as simplicity:
With just three control variables, tuning is particularly easy to do.
For "the design of robust learning systems", i.e. the field
characterized by CFS, Holland's (1986) CLASSIFIER SYSTEM, with its
state-of-the-art implementation CFS-C (Riolo 88), we should note
developments in SAMUEL (Grefenstette 89), GABIL (De Jong & Spears
91), and GIL (Janikow 91).
Davis, L. (1989) "Adapting operator probabilities in genetic
algorithms", [ICGA89], 60-69.
De Jong K.A. & Spears W. (1991) "Learning concept classification
rules using genetic algorithms". Proc. 12th IJCAI, 651-656, Sydney,
Australia: Morgan Kaufmann.
Dorigo M. & E. Sirtori (1991)."ALECSYS: A Parallel Laboratory for
Learning Classifier Systems". Proceedings of the Fourth International
Conference on Genetic Algorithms, San Diego, California, R.K.Belew
and L.B.Booker (Eds.), Morgan Kaufmann, 296-302.
Dorigo M. (1995). "ALECSYS and the AutonoMouse: Learning to Control a
Real Robot by Distributed Classifier Systems". Machine Learning, 19,
Eshelman, L.J. et al. (1991) "Preventing premature convergence in
genetic algorithms by preventing incest", [ICGA91], 115-122.
Goldberg, D. et al. (1991) "Don't worry, be messy", [ICGA91], 24-30.
Grefenstette, J.J. (1989) "A system for learning control strategies
with genetic algorithms", [ICGA89], 183-190.
Holland, J.H. (1986) "Escaping brittleness: The possibilities of
general-purpose learning algorithms applied to parallel rule-based
systems". In R. Michalski, J. Carbonell, T. Mitchell (eds), Machine
Learning: An Artificial Intelligence Approach. Los Altos: Morgan
Janikow C. (1991) "Inductive learning of decision rules from
attribute-based examples: A knowledge-intensive Genetic Algorithm
approach". TR91-030, The University of North Carolina at Chapel Hill,
Dept. of Computer Science, Chapel Hill, NC.
Riolo, R.L. (1988) "CFS-C: A package of domain independent
subroutines for implementing classifier systems in arbitrary, user-
defined environments". Logic of computers group, Division of
computer science and engineering, University of Michigan.
Whitley, D. et al. (1989) "The GENITOR algorithm and selection
pressure: why rank-based allocation of reproductive trials is best",
Subject: Q4.1: What about Alife systems, like Tierra and VENUS?
None of these are EVOLUTIONARY ALGORITHMs, but all of them use the
evolutionary metaphor as their "playing field".
Synthetic organisms have been created based on a computer metaphor of
organic life in which CPU time is the ``energy'' resource and memory
is the ``material'' resource. Memory is organized into informational
patterns that exploit CPU time for self-replication. MUTATION
generates new forms, and EVOLUTION proceeds by natural SELECTION as
different GENOTYPEs compete for CPU time and memory space.
Observation of nature shows that evolution by natural selection is
capable of both OPTIMIZATION and creativity. Artificial models of
evolution have demonstrated the optimizing ability of evolution, as
exemplified by the field of GENETIC ALGORITHMs. The creative aspects
of evolution have been more elusive to model. The difficulty derives
in part from a tendency of models to specify the meaning of the
``genome'' of the evolving entities, precluding new meanings from
emerging. I will present a natural model of evolution demonstrating
both optimization and creativity, in which the GENOME consists of
sequences of executable machine code.
From a single rudimentary ancestral ``creature'', very quickly there
evolve parasites, which are not able to replicate in isolation
because they lack a large portion of the genome. However, these
parasites search for the missing information, and if they locate it
in a nearby creature, parasitize the information from the neighboring
genome, thereby effecting their own replication.
In some runs, hosts evolve immunity to attack by parasites. When
immune hosts appear, they often increase in frequency, devastating
the parasite POPULATIONs. In some runs where the community comes to
be dominated by immune hosts, parasites evolve that are resistant to
Hosts sometimes evolve a response to parasites that goes beyond
immunity, to actual (facultative) hyper-parasitism. The hyper-
parasite deceives the parasite causing the parasite to devote its
energetic resources to replication of the hyper-parastie genome.
This drives the parasites to extinction. Evolving in the absence of
parasites, hyper-parasites completely dominate the community,
resulting in a relatively uniform community characterized by a high
degree of relationship between INDIVIDUALs. Under these
circumstances, sociality evolves, in the form of creatures which can
only replicate in aggregations.
The cooperative behavior of the social hyper-parasites makes them
vulnerable to a new class of parasites. These cheaters, hyper-hyper-
parasites, insert themselves between cooperating social individuals,
deceiving the social creatures, causing them to replicate the genomes
of the cheaters.
The only genetic change imposed on the simulator is random bit flips
in the machine code of the creatures. However, it turns out that
parasites are very sloppy replicators. They cause significant
RECOMBINATION and rearrangement of the genomes. This spontaneous
sexuality is a powerful force for evolutionary change in the system.
One of the most interesting aspects of this instance of life is that
the bulk of the evolution is based on adaptation to the biotic
ENVIRONMENT rather than the physical environment. It is co-evolution
that drives the system.
--- "Tierra announcement" by Tom Ray (1991)
How to get Tierra?
Tierra is available (source and executables, for Unix and NT) from
David Bennett <email@example.com> reported in March 2000: Much new work
has been done in Tierra since 1993. Thomas Ray
<firstname.lastname@example.org> is now working in Japan. I have been using
another similar system called Avida. It has some advantages, and a
significant body of research results. The contact for Avida is
Ray, T. S. (1991) "Is it alive, or is it GA?" in [ICGA91], 527--534.
Ray, T. S. (1991) "An approach to the synthesis of life." in
Ray, T. S. (1991) "Population dynamics of digital organisms." in
Ray, T. S. (1991) "Evolution and optimization of digital
organisms." Scientific Excellence in Supercomputing: The IBM 1990
Contest Prize Papers, Eds. Keith R. Billingsley, Ed Derohanes, Hilton
Brown, III. Athens, GA, 30602, The Baldwin Press, The University of
Ray, T. S. (1992) "Evolution, ecology and optimization of digital
organisms." Santa Fe Institute working paper 92-08-042.
Ray, T. S. "Evolution, complexity, entropy, and artificial reality."
submitted Physica D.
Ray, T. S. (1993) "An evolutionary approach to synthetic biology,
Zen and the art of creating life. Artificial Life 1(1).
Steen Rasmussen's (et al.) VENUS I+II "coreworlds" as described in
[ALIFEII] and [LEVY92], are inspired by A.K. Dewdney's well-known
article (Dewdney 1984). Dewdney proposed a game called "Core Wars",
in which hackers create computer programs that battle for control of
a computer's "core" memory (Strack 93). Since computer programs are
just patterns of information, a successful program in core wars is
one that replicates its pattern within the memory, so that eventually
most of the memory contains its pattern rather than that of the
VENUS is a modification of Core Wars in which the Computer programs
can mutate, thus the pseudo assembler code creatures of VENUS evolve
steadily. Furthermore each memory location is endowed with
"resources" which, like sunshine are added at a steady state. A
program must have sufficient resources in the regions of memory it
occupies in order to execute. The input of resources determines
whether the VENUS ecosystem is a "jungle" or a "desert." In jungle
ENVIRONMENTs, Rasmussen et al. observe the spontaneous emergence of
primitive "copy/split" organisms starting from (structured) random
--- [ALIFEII], p.821
Dewdney, A.K. (1984) "Computer Recreations: In the Game called Core
War Hostile Programs Engage in a Battle of Bits", Sci. Amer. 250(5),
Farmer & Belin (1992) "Artificial Life: The Coming Evolution",
Rasmussen, et al. (1990) "The Coreworld: Emergence and Evolution of
Cooperative Structures in a Computational Chemistry", [FORREST90],
Rasmussen, et al. (1992) "Dynamics of Programmable Matter",
Strack (1993) "Core War Frequently Asked Questions (
rec.games.corewar FAQ)" Avail. by anon. FTP from
Larry Yaeger's PolyWorld as described in [ALIFEIII] and [LEVY92] is
available via anonymous FTP from
"The subdirectories in this "polyworld" area contain the source code
for the PolyWorld ecological simulator, designed and written by Larry
Yaeger, and Copyright 1990, 1991, 1992 by Apple Computer.
PostScript versions of my ARTIFICIAL LIFE III technical paper have
now been added to the directory. These should be directly printable
from most machines. Because some unix systems' "lpr" commands cannot
handle very large files (ours at least), I have split the paper into
Yaeger.ALife3.1.ps and Yaeger.ALife3.2.ps. These files can be ftp-ed
in "ascii" mode. For unix users I have also included compressed
versions of both these files (indicated by the .Z suffix), but have
left the uncompressed versions around for people connecting from non-
unix systems. I have not generated PostScript versions of the
images, because they are color and the resulting files are much too
large to store, retrieve, or print. Accordingly, though I have
removed a Word-formatted version of the textual body of the paper
that used to be here, I have left a Word-formatted version of the
color images. If you wish to acquire it, you will need to use the
binary transfer mode to move it to first your unix host and then to a
Macintosh (unless Word on a PC can read it - I don't know), and you
may need to do something nasty like use ResEdit to set the file type
and creator to match those of a standard Word document (Type = WDBN,
Creator = MSWD). [..]"
--- from the README by Larry Yaeger <email@example.com>
General Alife repositories?
Also, all of the following FTP sites carry ALIFE related info:
ftp.cogs.susx.ac.uk/pub/reports/csrp/ , xyz.lanl.gov/nlin-sys/ ,
Subject: Q5: What about all this Optimization stuff?
Just think of an OPTIMIZATION problem as a black box. A large black
box. As large as, for example, a Coca-Cola vending machine. Now, we
don't know anything about the inner workings of this box, but see,
that there are some regulators to play with, and of course we know,
that we want to have a bottle of the real thing...
Putting this everyday problem into a mathematical model, we proceed
(1) we label all the regulators with x and a number starting from 1;
the result is a vector x, i.e. (x_1,...,x_n), where n is the
number of visible regulators.
(2) we must find an objective function, in this case it's obvious, we
want to get k bottles of the real thing, where k is equal to 1.
[some might want a "greater or equal" here, but we restricted
ourselves to the visible regulators (we all know that sometimes a
"kick in the right place" gets use more than 1, but I have no
idea how to put this mathematically...)]
(3) thus, in the language some mathematicians prefer to speak in:
f(x) = k = 1. So, what we have here is a maximization problem
presented in a form we know from some boring calculus lessons,
and we also know that there at least a dozen utterly
uninteresting techniques to solve problems presented this way...
What can we do in order to solve this problem?
We can either try to gain more knowledge or exploit what we already
know about the interior of the black box. If the objective function
turns out to be smooth and differentiable, analytical methods will
produce the exact solution.
If this turns out to be impossible, we might resort to the brute
force method of enumerating the entire SEARCH SPACE. But with the
number of possibilities growing exponentially in n, the number of
dimensions (inputs), this method becomes infeasible even for low-
Consequently, mathematicians have developed theories for certain
kinds of problems leading to specialized OPTIMIZATION procedures.
These algorithms perform well if the black box fulfils their
respective prerequisites. For example, Dantzig's simplex algorithm
(Dantzig 66) probably represents the best known multidimensional
method capable of efficiently finding the global optimum of a linear,
hence convex, objective function in a search space limited by linear
constraints. (A USENET FAQ on linear programming is maintained by
Professor Robert Fourer <firstname.lastname@example.org> (and "nonlinear-
programming-faq") that is posted monthly to sci.op-research and is
mostly interesting to read. It is also available from http://www-
Gradient strategies are no longer tied to these linear worlds, but
they smooth their world by exploiting the objective function's first
partial derivatives one has to supply in advance. Therefore, these
algorithms rely on a locally linear internal model of the black box.
Newton strategies additionally require the second partial
derivatives, thus building a quadratic internal model. Quasi-Newton,
conjugate gradient and variable metric strategies approximate this
information during the search.
The deterministic strategies mentioned so far cannot cope with
deteriorations, so the search will stop if anticipated improvements
no longer occur. In a multimodal ENVIRONMENT these algorithms move
"uphill" from their respective starting points. Hence, they can only
converge to the next local optimum.
Newton-Raphson-methods might even diverge if a discrepancy between
their internal assumptions and reality occurs. But of course, these
methods turn out to be superior if a given task matches their
requirements. Not relying on derivatives, polyeder strategy, pattern
search and rotating coordinate search should also be mentioned here
because they represent robust non-linear optimization algorithms
Dealing with technical optimization problems, one will rarely be able
to write down the objective function in a closed form. We often need
a SIMULATION model in order to grasp reality. In general, one cannot
even expect these models to behave smoothly. Consequently,
derivatives do not exist. That is why optimization algorithms that
can successfully deal with black box-type situations have been
developed. The increasing applicability is of course paid for by a
loss of "convergence velocity," compared to algorithms specially
designed for the given problem. Furthermore, the guarantee to find
the global optimum no longer exists!
But why turn to nature when looking for more powerful algorithms?
In the attempt to create tools for various purposes, mankind has
copied, more often instinctively than geniously, solutions invented
by nature. Nowadays, one can prove in some cases that certain forms
or structures are not only well adapted to their ENVIRONMENT but have
even reached the optimum (Rosen 67). This is due to the fact that the
laws of nature have remained stable during the last 3.5 billion
years. For instance, at branching points the measured ratio of the
diameters in a system of blood-vessels comes close to the theoretical
optimum provided by the laws of fluid dynamics (2^-1/3). This, of
course, only represents a limited, engineering point of view on
nature. In general, nature performs adaptation, not optimization.
The idea to imitate basic principles of natural processes for optimum
seeking procedures emerged more than three decades ago (cf Q10.3).
Although these algorithms have proven to be robust and direct
OPTIMIZATION tools, it is only in the last five years that they have
caught the researchers' attention. This is due to the fact that many
people still look at organic EVOLUTION as a giantsized game of dice,
thus ignoring the fact that this model of evolution cannot have
worked: a human germ-cell comprises approximately 50,000 GENEs, each
of which consists of about 300 triplets of nucleic bases. Although
the four existing bases only encode 20 different amino acids,
20^15,000,000, ie circa 10^19,500,000 different GENOTYPEs had to be
tested in only circa 10^17 seconds, the age of our planet. So, simply
rolling the dice could not have produced the diversity of today's
complex living systems.
Accordingly, taking random samples from the high-dimensional
parameter space of an objective function in order to hit the global
optimum must fail (Monte-Carlo search). But by looking at organic
evolution as a cumulative, highly parallel sieving process, the
results of which pass on slightly modified into the next sieve, the
amazing diversity and efficiency on earth no longer appears
miraculous. When building a model, the point is to isolate the main
mechanisms which have led to today's world and which have been
subjected to evolution themselves. Inevitably, nature has come up
with a mechanism allowing INDIVIDUALs of one SPECIES to exchange
parts of their genetic information (RECOMBINATION or CROSSOVER), thus
being able to meet changing environmental conditions in a better way.
Dantzig, G.B. (1966) "Lineare Programmierung und Erweiterungen",
Berlin: Springer. (Linear programming and extensions)
Kursawe, F. (1994) " Evolution strategies: Simple models of natural
processes?", Revue Internationale de Systemique, France (to appear).
Rosen, R. (1967) "Optimality Principles in Biologie", London:
Schwefel, H.-P. (1981) "Numerical Optimization of Computer Models",
Copyright (c) 1993-2000 by J. Heitkoetter and D. Beasley, all rights
This FAQ may be posted to any USENET newsgroup, on-line service, or
BBS as long as it is posted in its entirety and includes this
copyright statement. This FAQ may not be distributed for financial
gain. This FAQ may not be included in commercial collections or
compilations without express permission from the author.
End of ai-faq/genetic/part3