|
ISSN 1320-0682 | |||
| Volume 02 | April 1995 | |||
Russell K. Standish
Sydney Regional Centre for Parallel
Computing
ACSU, Division of Information Services
The University of New South Wales
Sydney 2052, Australia
Email: R.Standish@unsw.edu.au
In recent years, theories about biological evolution have undergone many re-interpretations and refinements. For example, perhaps one of the most influential new ideas is that of Punctuated Equilibria [1]. Many controversies cannot be resolved because only one example of evolution can be studied, that the timescale over which it occurs is more vast than an individual researcher's lifetime, and that the fossil record is imperfect.
Our best chance of resolving many of these issues is to turn to simulations of evolution, or to create Artificial Life systems which feature evolvability [2, 4, 3]. Most studies of evolution to date have had an externally imposed fitness function whether the models have been analytic in nature [5, 6] or are simulated on a computer as with genetic algorithms [7, 8]. This fitness function will have a global optimum and, as a consequence, there will be an end to evolution once the system has reached this optimum state. By contrast, in natural evolution, the "goalposts" are constantly being moved due to the rise and fall of co-existing species. A true evolutionary model must take into account ecological factors. Thomas Ray's Tierra World has the advantage that ecological interactions define the "fitness" of the organisms, rather than some externally imposed constraint.
However, it is still advantageous to be able to study evolution by means of simple abstract models in order to separate general principles of the process from the specifics of the particular system at hand (biological or artificial). Obviously, any such model must include ecological interactions, so the starting point should be a generalised version of the Lotke-Volterra equations:
Here,
is the difference between the birth rate and
death rate for each species, in the absence of competition or
symbiosis. It can depend on spatial co-ordinates, as it includes
environmental factors.
is the interaction term
between species, with the diagonal terms referring to the species'
self limitation. This may also be spatially dependent, although it is
less clear what spatial dependence means in this case. n
is the species density. The use of "species" here is arguably a poor
choice of term, given that sexual reproduction is not a feature of
this model. What is really meant is that regions of phenotype space
constitute an entity, with each region identified with its average
values of
,
etc. The radius of these regions
may be large enough to encompass all members of a species, or even a
higher taxonomic level, or alternatively be so tightly set as to
include only a single individual, or at most a few. This latter case
of an individual model [9] may well prove important
in the study of evolution as classical Darwinism works on variation
within the species. It should be pointed out that
Equation (1) is not realistic as an individual model,
because it gives rise to a Poisson distribution of individual
lifetimes. It could be easily augmented by making
into a convolution operator.
The most obvious thing about Equation (1) is its fixed point
where
. For this point to be biologically meaningful, all
components of
must be positive, giving rise to the following
inequalities:
The stability of this point is related to the
negative definiteness of the derivative of
at
. The
components of the derivative are given by
Substituting Equation (2) gives
Stability of the fixed point requires that this matrix should be
negative definite. Since the
are
all negative by virtue of (3), this is equivalent
to
being negative definite, or equivalently, that its
\
eigenvalues all have negative real parts. Taken together with the
Inequalities (3), this implies that
inequalities must be satisfied for the fixed point to be stable. This
point was made by Strobeck [10], in a slightly different
form. (Note that Strobeck implicitly assumes that
, so he comes to the conclusion that
conditions are
required.)
If one were to randomly pick coefficients for a Lotke-Volterra system,
then it has a probability of
of being stable; that is, one
expects ecosystems to become more unstable as the number of species
increases. Turning the argument on its head, to get a system that
suffers extinctions, and so will undergo evolution once we add the
possibility of mutation or migration, we do not want
to be
negative definite. We do not want it to be positive definite either, as
this leads to a system with a strongly unstable equilibrium point,
with species growing to infinity, or crashing to extinction according
to initial conditions. Since diagonally dominant systems are definite,
we can choose an interesting domain for our system by ensuring that
where
is the set of diagonal elements
,
is the set of off-diagonal elements
and
means the average over the set.
The stability of the system under attack from mutation can be analysed
by considering Equation (4) for small values of
,
for a particular species i. Linearising (4) in
,
one gets
So if a new species enters the system satisfying
then it can obtain a foothold in the system. Consider a mutant that is
very similar to its parent, or another species k, such that
, then this condition at
equilibrium is
. Thus, it will have a slim chance of
entering the system unless its reproductive success is much greater
than its competitor, in which case it will displace species k. This
is an example of Gause's principle [11].
Conversely, if the opposite inequality held for some species i, that
species is likely to become extinct if its numbers become too low. For
example, if a species is under intense predatory pressure from species
k, (
and
large), then it is likely to become extinct.
Adding migration and mutation involves adding a couple of terms to Equation (1) (in tensor notation)
The difficulty with adding mutation to this model is how to define the
mapping between genotype space and phenotype space or, in other words,
what defines the embryology. A few studies, including Ray's
Tierra World, do this with an explicit mapping from the genotype to
some particular organism property (for example, interpreted as machine language
instructions, or as weight in a neural net). These organisms then
interact with one another to determine the population dynamics. In
this model, however, we are doing away with the organismal layer, and
so an explicit embryology is impossible. The only possibility remaining is
to use a statistical model of embryology. The mapping between
genotype space and the population parameters
,
and
is expected to look like a rugged
landscape; however, if two genotypes are close together (in a Hamming
sense) then one might expect that the phenotypes are likely to be
similar, as would the population parameters. This I call random
embryology with locality.
In the simple case of point mutations, the probability P(x) of x new
species arising follows a Poisson distribution, as does the phenotypic
displacement from the parent species. The population parameters
,
and
should be
distributed in a Gaussian fashion around the parent species, with the
standard deviation being given by the phenotypic displacement. The
means of the two Poisson processes become parameters of the model that
the experimenter can adjust.
Figure 1: Distribution of species separation in phenotype space
The parameter describing the phenotypic displacement required to
consider two individuals as separate species can be examined by
application of Gause's principle. Gause's principle asserts
that multiple species cannot occupy the same ecological niche. If the
phenotypic parameters
and
are too close for two species
i and k, then we are dealing with essentially the same species.
Figure 1 shows a histogram of species differences
(defined as
) plotted logarithmically for two runs. When the species
difference parameter sp_sep is small, there is a noticeable
lump in the distribution for small
. When the parameter
is an order of magnitude larger, there is no such lump, indicating
that individual species in the model occupy distinct ecological niches.
An unstated assumption of the above model is that the probability of mutation back to a previously existing species is negligible, and can be ignored.
The implementation of this model is complicated by two factors, one
being that the system dimension
may vary through extinction and
mutation, and secondly that the interaction array
is sparse,
and so must be represented by a sparse matrix structure for
efficiency. A sizeable portion of the code is a C++ component that
implements these arrays, and defines addition and multiplication on
them. The advantage in this approach is that only this component needs
to be ported should one decide to use a massively parallel computer
such as the CM5. At this stage, I have not implemented migration (that is,
); should I do so, the power of a parallel computer is
required to perform sufficiently long runs.
The operation
is
implemented as a Tcl command; other Tcl commands include the
mutation operator and analysis commands that return average values of
,
etc. or display the value of n on a graph. This
enables the entire simulation to be written as a Tcl/Tk
script [12], allowing parameters to be varied quickly,
possibly even interactively via X11 widgets.
Figure 2: Connectivity (
) as a function of time step
If n was represented by real numbers, then no species would ever
become extinct. Rather, a particular
will vanish exponentially
towards 0, but remain positive for time. To force extinction, n is
made an integer array, and an interesting variant of the rounding
operator is used. If the conventional rounding operator is used, then
we again may never see extinction. For example, if
, and
, the rounding operation will leave
constant. Similar
problems occur with the truncation operator. Instead, we round the
result up or down stochastically, so that in the above example,
has a 90% chance of remaining the same, and a 10% chance of being
reduced by 1 (and becoming extinct!)
The implementation of mutation follows in a fairly straightforward
fashion (although the coding is a little tortuous) with an additional
consideration to preserve the sparseness of
. If the absolute
values of any component in the new
is less than some threshold
value (set as a Tcl variable), it is identified with 0, and removed
from the sparse structure. The same mechanism allows new connections
between species to arise through mutation.
This paper is meant to be a progress report, so the results presented
here are still somewhat tentative. When running without the mutation
operator, there is a massive extinction from the initial condition
until the system reaches a stable equilibrium and, thereafter, the
system becomes constant. If one chose the values of
and
completely randomly, then one might expect that three quarters of the
species will become extinct. On the other hand, if one chose the values to
satisfy (3) and negative definiteness of
,
then hardly any species will die out. If
is chosen to be
negative semi-definite, then there is the possibilities for limit
cycles (or even strange attractors), as in the original predator-prey
system studied by Volterra [11].
Figure 3:
as a function of time step
A typical run with mutation included is shown in Figures 2,
3 and 4. At first, there is an initial extinction
(
initially), then there is a climb as mutation starts to
search for more stable ecologies. Figure 2 shows the
connectivity (number of off-diagonal elements per row of
), which can be seen is roughly correlated with
. There is also a negative
correlation between
and
. It is
interesting to note that mutation increases the size of
, decreasing the diagonal dominance and
unstabilising the equilibrium point. The other parameters such as
and
are essentially
constant like
.
The analysis in Section 2 can help us understand what is
going on here. If the connectivity is low, then (7) is likely to be satisfied, and the new mutation is
likely to increase the connectivity of the system. However, the higher
the connectivity, the less diagonally dominant
becomes, and
the less stable the system becomes. This balance between two
competing trends dominates the evolution of the system.
Figure 4:
and
as a
function of time step
Much remains to be done. Figure 4 only indirectly indicates
the destabilisation of the equilibrium point. This model should be run
with an eigenvalue routine to plot the maximum eigenvalue of
.
If this eigenvalue becomes positive, then the equilibrium point is
unstable indeed. Presumably, mutation will drive the system to some
critical point dividing stability and complete instability.
Population Models with Random Embryologies as a Paradigm for Evolution
This document was generated using the LaTeX2HTML translator Version 96.1 (Feb 5, 1996) Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The command line arguments were:
latex2html rkspap1.
The translation was initiated by Pam Milliken on Wed Nov 13 17:26:04 EST 1996