Chapter 5
Parameters
5.1 General
5.2 Output
5.3 Size Limits
5.4 Initialization
5.5 Selection Methods
5.6 Breeding
5.7 Operators
5.8 Multiple Populations
A large number of parameters control lil-gp. These parameters are input via parameter files and command line arguments. This chapter lists and describes all the available parameters. In addition, user code may define application-specific parameters (as in the sample artificial ant problem, for instance).
All parameter settings are saved in the checkpoint files, so if you are restarting an aborted run from a checkpoint you do not need to explicitly load all the original parameter files on the command line.
Default values for parameters, where appropriate, have been chosen to correspond with those given in Chapter 7 ("Detailed Description of Genetic Programming") of Koza's (first) book [3].
The following conventions apply:
These parameters govern the overall operation of the run.
max_generations | type: integer | The maximum number of generations for the run. | default: none |
pop_size | type: integer | The population size. For multipop runs, the subpopulation size. | default: none |
random_seed | type: integer | The seed for the random number generator. | default: 1 |
These parameters control the writing of output and checkpoint files.
output.basename | type: string | The base name for the output files. Various three-character extensions are added to create the actual filenames. | default: lilgp |
output.detail | type: integer, 0-100 | The level of detail in output files. 100 is everything, 0 is practically nothing. | default: 50 |
output.stt_interval | type: positive integer | How often, in generations, to write information to the STT file. | default: 1 |
output.bestn | type: positive integer | How many individuals are printed to the BST file (i.e., if set to 5 then the top 5 individuals are written to the file). | default: 1 |
output.digits | type: integer | The number of decimal places with which fitness values are printed in output files. | default: 4 |
checkpoint.interval | type: integer | Specifies how often (in generations) to write a checkpoint file. If not set or negative, no checkpoint files are written. If set to a positive number, checkpoint files are written every (that number) generations and after the last generation. If set to zero, then only one checkpoint file is written, after the last generation. | default: none |
checkpoint.filename | type: string | A printf() format string with exactly one %d specifier, which is replaced with a generation number. The resulting string will be used as the filename for the checkpoint file for that generation. | default: gp%06d.ckp |
checkpoint.compress | type: string | A printf() format string used to generate a command to run on each checkpoint file after it is written. The string should have one %s specifier, which will be replaced with the name of the checkpoint file. The usual printf() conventions for percent signs apply. Typically this command is used to compress the checkpoint file. If this parameter is not defined, then no command is executed. | default: none |
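The checkpoint rules above can be sketched in a few lines of Python. This is an illustration only, not lil-gp's code. The function names and the last_generation argument are hypothetical. The sketch assumes that "every N generations" means generations divisible by N. Python's % operator is used as a stand-in for printf(), and "gzip -f %s" is just an example command.

```python
def checkpoint_generations(interval, last_generation):
    """Generations after which a checkpoint file would be written.

    interval: value of checkpoint.interval, or None if it is not set.
    last_generation: number of the final generation (hypothetical argument).
    """
    if interval is None or interval < 0:
        return []                     # not set or negative: no checkpoints
    if interval == 0:
        return [last_generation]      # zero: a single file after the last generation
    # Assumption: "every N generations" means generations divisible by N.
    gens = list(range(interval, last_generation + 1, interval))
    if last_generation not in gens:
        gens.append(last_generation)  # a file is also written after the last generation
    return gens


def checkpoint_filename(fmt, generation):
    """Expand a checkpoint.filename string containing one %d specifier."""
    return fmt % generation


def compress_command(fmt, filename):
    """Expand a checkpoint.compress string containing one %s specifier."""
    return fmt % filename


if __name__ == "__main__":
    for gen in checkpoint_generations(20, 50):
        name = checkpoint_filename("gp%06d.ckp", gen)
        print(name, "->", compress_command("gzip -f %s", name))
```

With the default checkpoint.filename, generation 12 would expand to gp000012.ckp.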
These parameters set limits on the number of nodes and/or the
depth of individuals in the population, both at initialization
and during evolution. In problems where individuals are composed
of multiple trees, # refers to the tree number.
max_nodes | type: integer | Maximum total number of nodes per individual. If not set, no limit is enforced. | default: none |
tree[#].max_nodes | type: integer | The maximum number of nodes in tree #. If not set, no limit is enforced. | default: none |
max_depth | type: integer | Maximum depth of individual. If not set, no limit is enforced. | default: none |
tree[#].max_depth | type: integer | The maximum depth of tree #. If not set, no limit is enforced. | default: none |
These parameters control generation of the initial random population.
init.tree[#].method | type: half_and_half, grow, or full | Method for generating tree # of each individual in the initial population. If not set, then the value of init.method is used. | default: none |
init.method | type: half_and_half, grow, or full | Default method for generation of initial random population. | default: half_and_half |
init.tree[#].depth | type: depth ramp | A depth ramp for choosing the size of tree # during the generation of the initial population. If not set, then the value of init.depth is used. | default: none |
init.depth | type: string | Default depth ramp for generation of initial random population. | default: 2-6 |
init.random_attempts | type: positive integer | During initial generation, trees that violate size limits or are duplicates are rejected. This parameter is the maximum number of consecutive rejected trees the program will tolerate before generating an error and giving up. | default: 100 |
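The role of init.random_attempts can be pictured with the following Python sketch. It is not lil-gp's implementation. The make_tree and is_legal callables are hypothetical placeholders for tree generation and size-limit checking. The sketch also assumes that the error is raised once the count of consecutive rejections exceeds the limit.

```python
def generate_initial_trees(count, make_tree, is_legal, random_attempts=100):
    """Collect `count` distinct legal trees, giving up after too many
    consecutive rejections.

    make_tree: hypothetical callable returning a new random tree (hashable).
    is_legal:  hypothetical callable returning False for trees that violate
               the node or depth limits.
    """
    accepted = []
    seen = set()
    consecutive_rejections = 0
    while len(accepted) < count:
        tree = make_tree()
        if tree in seen or not is_legal(tree):
            consecutive_rejections += 1
            # Assumption: exactly `random_attempts` rejections in a row are tolerated.
            if consecutive_rejections > random_attempts:
                raise RuntimeError("too many consecutive rejected trees")
            continue
        seen.add(tree)
        accepted.append(tree)
        consecutive_rejections = 0    # any accepted tree resets the counter
    return accepted
```

Because the counter resets on every accepted tree, only an unbroken run of rejections triggers the error.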
A selection method is an algorithm for picking an individual from a population. Selection methods are used in various places throughout the program. This section does not list parameters per se, but rather describes valid values for parameters needing selection methods.
lil-gp currently has seven selection methods available. Some
selection methods have options, which are set by following the
method name with a comma-separated list of "option=value"
pairs, as in:
blahblah.select = tournament,size=7
Whitespace in the string is completely ignored. The previous example is equivalent to all of the following:
blahblah.select = tournament, size=7
blahblah.select = tournament,size = 7
blahblah.select = tournament , size = 7
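As an illustration of this syntax, here is a minimal Python sketch of how such a string could be split into a method name and its options. The function name parse_selection is hypothetical, and this is not lil-gp's parser. Option values are left as strings.

```python
def parse_selection(spec):
    """Split a selection string into (method_name, {option: value})."""
    compact = "".join(spec.split())      # whitespace is completely ignored
    name, *pairs = compact.split(",")
    options = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        options[key] = value
    return name, options


if __name__ == "__main__":
    print(parse_selection("tournament, size = 7"))
```

Since all whitespace is removed first, every spelling of the example yields the same name and options.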
The selection methods available are:
fitness Fitness-proportionate selection. Individuals
are chosen at random, with the probability for an individual proportional to its adjusted fitness.
No options are available.
fitness_overselect Greedy overselection. In a population,
the top individuals accounting for cutoff
of the total adjusted fitness are placed in Group I, the rest
of the population going into Group II. Individuals are randomly
selected from Group I (proportional to adjusted fitness) proportion
of the time, and from Group II (proportional to adjusted fitness)
the rest of the time. cutoff and proportion are
set with options:
cutoff For populations less than 1000, defaults to 0.32. For larger populations, defaults to 320/popsize (i.e., 32% for popsize of 1000, 16% for popsize of 2000, etc.).
proportion Defaults to 0.80.
tournament A number of individuals are chosen at random with uniform probability with reselection allowed. The best of the chosen individuals is selected. Has one option:
size The size of the tournament (how many individuals are randomly chosen to compete). Defaults to 2.
inverse_fitness Individuals are randomly chosen with
probability proportional to 1 divided by the
adjusted fitness.
random Individuals are selected at random with uniform
probability.
best The first time the selection is done, the best
member of the population (as determined by adjusted
fitness) is returned. Subsequent selections return the 2nd, 3rd,
4th, etc. best individuals. Should not be used in a context
which will need to select more individuals than are in the population.
worst Same as best, but returns individuals in
order of increasing adjusted fitness (i.e., worst first).
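Several of the methods above are easy to sketch in Python. The code below is an illustration, not lil-gp's implementation. The function names are hypothetical. Adjusted fitness values are assumed to be positive, with larger values being better. In overselect_groups, the individual that crosses the cutoff is assumed to fall into Group I.

```python
import random


def select_fitness(adjusted, rng):
    """Fitness-proportionate: index i is chosen with weight adjusted[i]."""
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]


def select_inverse_fitness(adjusted, rng):
    """Weight each individual by 1 / adjusted fitness."""
    weights = [1.0 / a for a in adjusted]
    return rng.choices(range(len(adjusted)), weights=weights, k=1)[0]


def select_tournament(adjusted, rng, size=2):
    """Draw `size` entrants uniformly, reselection allowed; return the best."""
    entrants = [rng.randrange(len(adjusted)) for _ in range(size)]
    return max(entrants, key=lambda i: adjusted[i])


def best_order(adjusted):
    """Indices in the order the `best` method would return them."""
    return sorted(range(len(adjusted)), key=lambda i: adjusted[i], reverse=True)


def default_cutoff(pop_size):
    """Default `cutoff` option of fitness_overselect."""
    return 0.32 if pop_size < 1000 else 320.0 / pop_size


def overselect_groups(adjusted, cutoff):
    """Split the population into Group I and Group II for overselection."""
    order = best_order(adjusted)
    threshold = cutoff * sum(adjusted)
    group1, running = [], 0.0
    for i in order:
        if running >= threshold:
            break
        group1.append(i)              # assumption: the crossing individual joins Group I
        running += adjusted[i]
    return group1, order[len(group1):]


if __name__ == "__main__":
    rng = random.Random(1)
    fitnesses = [0.1, 0.9, 0.5, 0.3]
    print(select_tournament(fitnesses, rng, size=7))
    print(overselect_groups(fitnesses, default_cutoff(len(fitnesses))))
```

The worst method would simply traverse best_order in reverse.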
Breeding is the term used in lil-gp for creation of the new population each generation. It is controlled through a number of "phases." Each phase has an operator (such as crossover) and a rate specifying how often that operator occurs.
The breeding parameters described at the end of Chapter 7 in Koza's first book [3] (for populations of fewer than 1000 individuals) can be emulated with the following breeding settings:
breed_phases = 2
breed[1].operator = crossover, select=fitness
breed[1].rate = 0.9
breed[2].operator = reproduction, select=fitness
breed[2].rate = 0.1
The "Simple LISP Code" presented in the back of that book can be emulated with the following parameters (for populations of 1000 or larger, replace all occurrences of fitness with fitness_overselect):
probabilistic_operators = off
breed_phases = 4
; functionpoint crossover 70% of the time
breed[1].operator = crossover, select=fitness, internal=1.0
breed[1].rate = 0.7
; anypoint crossover 20% of the time
breed[2].operator = crossover, select=fitness, internal=0.0, external=0.0
breed[2].rate = 0.2
breed[3].operator = reproduction, select=fitness
breed[3].rate = 0.1
breed[4].operator = mutation, select=fitness, method=grow,depth=4
breed[4].rate = 0.0
In the second book [4], Koza uses defaults equivalent to:
breed_phases = 2
breed[1].operator = crossover, select=(tournament, size=7)
breed[1].rate = 0.9
breed[2].operator = reproduction, select=(tournament, size=7)
breed[2].rate = 0.1
The specific parameters are:
breed_phases | type: integer | Specifies the number of phases. | default: none |
probabilistic_operators | type: binary | When on, phases are selected by chance, with frequency proportional to that phase's "rate" parameter. When off, the number of individuals produced by a given phase is exactly (well, approximately) proportional to that phase's rate. | default: on |
In the following parameter names, "#" should be replaced with a number in the range [1,breed_phases].
breed[#].rate | type: float | The rate for this phase. With probabilistic_operators on, specifies the probability with which this phase is (randomly) chosen. Otherwise, specifies the proportion of individuals in new population to be created with this phase. Note that if the rates for all phases sum to something other than 1, each is divided by the total to normalize them. | default: none |
breed[#].operator | type: operator string | Specifies the operator for this phase, and any arguments it has. | default: none |
The available operators are listed in the next section.
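The two interpretations of the phase rates can be sketched in Python as follows. This is an illustration under stated assumptions, not lil-gp's code. The function names are hypothetical. The way leftover individuals are distributed when probabilistic_operators is off (largest remainder first) is an assumption, since the rounding rule is not specified here.

```python
import random


def normalize_rates(rates):
    """Divide each rate by the total so that the rates sum to 1."""
    total = sum(rates)
    return [r / total for r in rates]


def choose_phase(rates, rng):
    """probabilistic_operators = on: pick a phase index by chance,
    with frequency proportional to its rate."""
    return rng.choices(range(len(rates)), weights=rates, k=1)[0]


def phase_counts(rates, pop_size):
    """probabilistic_operators = off: number of individuals per phase,
    approximately proportional to the normalized rates."""
    shares = [s * pop_size for s in normalize_rates(rates)]
    counts = [int(s) for s in shares]
    leftover = max(pop_size - sum(counts), 0)
    # Assumption: leftover slots go to the phases with the largest remainders.
    by_remainder = sorted(range(len(rates)), key=lambda i: counts[i] - shares[i])
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    print(phase_counts([0.9, 0.1], 500))
    print(choose_phase([0.9, 0.1], random.Random(1)))
```

The counts always sum to the population size, which is why the proportions can only be approximate.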
Operators are picked in a manner identical to that for selection methods: the string consists of the operator name, a comma, then any arguments to that operator as a comma-separated list of "option=value" pairs.
Note that the argument list for an operator may include one or more selection methods. If the selection method itself has arguments, then the entire selection string should be enclosed in parentheses:
blahblah.operator = crossover, select=(tournament, size=7), internal=0.3
This forces the "size" argument to be parsed as an option to the tournament selection method, not to the crossover operator.
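The effect of the parentheses can be illustrated with a small Python sketch that splits an operator string only at top-level commas. The function names are hypothetical and this is not lil-gp's parser. Values are kept as strings, and an enclosing pair of parentheses is removed from a value.

```python
def split_top_level(text):
    """Split on commas that are not inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_operator(spec):
    """Split an operator string into (operator_name, {option: value})."""
    compact = "".join(spec.split())          # whitespace is ignored
    name, *pairs = split_top_level(compact)
    options = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        if value.startswith("(") and value.endswith(")"):
            value = value[1:-1]              # drop the enclosing parentheses
        options[key] = value
    return name, options


if __name__ == "__main__":
    print(parse_operator("crossover, select=(tournament, size=7), internal=0.3"))
    print(parse_operator("crossover, select=tournament, size=7"))
```

In the second call, with no parentheses, "size" ends up as an option of the crossover operator rather than of the tournament selection method.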
Three operators are currently available:
crossover Chooses two parent individuals. Picks a tree
on each one, subject to the restriction that the trees be over
the same function set. Chooses a crossover point on each tree.
Swaps the subtrees rooted at those points, placing the newly created
individuals in the new population. This operator has the following
arguments:
select Specifies the selection method (and arguments) used to pick the first parent. This option is required.
select2 Specifies the selection method (and arguments) used to pick the second parent. If not specified, then defaults to be the same method as is used to pick the first parent.
keep_trying This is a binary argument. It specifies what to do when the crossover operation produces a tree that violates the node and/or depth limits. If on, then it keeps picking new crossover points on the same two parents until it produces legal child trees. If off, then upon failure it just reproduces one of the parents into the new generation in lieu of the child individual. The default is off.
internal Specifies the frequency with which internal points are selected as the crossover point.
external Specifies the frequency with which external points are selected as the crossover point. The defaults for these two options are coupled. If neither is set, then internal is 0.9 and external is 0.1. If one is set but not the other, the unset one is taken as zero. If both are set to zero, then the crossover point is selected uniformly over all points, without regard to their location.
tree Sets the frequency with which a particular tree is selected as the crossover tree. Should be a comma separated list of reals enclosed in parentheses, with a length equal to the number of trees per individual. For instance, if individuals consist of three trees, this argument could be tree=(0.1,0.2,0.7).
treen Sets the frequency with which tree n is selected as the crossover tree. Multiple "tree" and "treen" arguments are allowed and are applied in the order that they appear. If no tree arguments are used, then each tree has the same probability of being selected. If some tree arguments are used, any unspecified trees are given a zero probability of being chosen.
reproduction Chooses an individual and copies it into
the new population. It has only one argument, which is required:
select The selection method (and arguments) used to pick the individual to be reproduced.
mutation Chooses an individual, then chooses a tree within
that individual and a mutation point on that tree. Replaces the
subtree at that point with a randomly generated subtree. Places
the new individual in the new population. It has several arguments:
select The selection method (and arguments) used to pick the individual to be mutated. This argument is required.
keep_trying This is a binary argument. It specifies what to do when the mutation operation produces a tree that violates the node and/or depth limits. If on, then it keeps picking new mutation points and generating replacement subtrees until it produces a legal child tree. If off, then upon failure it just reproduces the original tree into the new generation in lieu of the mutated individual. The default is off.
internal Specifies the frequency with which internal points are selected as the mutation point.
external Specifies the frequency with which external points are selected as the mutation point. The defaults for these two options are coupled. If neither is set, then internal is 0.9 and external is 0.1. If one is set but not the other, the unset one is taken as zero. If both are set to zero, then the mutation point is selected uniformly over all points, without regard to internal or external.
tree Sets the frequency with which a particular tree is selected as the mutated tree. Should be a comma separated list of reals enclosed in parentheses, with a length equal to the number of trees per individual. For instance, if individuals consist of three trees, this argument could be tree=(0.1,0.2,0.7).
treen Sets the frequency with which tree n is selected as the mutated tree. Multiple "tree" and "treen" arguments are allowed and are applied in the order that they appear. If no tree arguments are used, then each tree has the same probability of being selected. If some tree arguments are used, any unspecified trees are given a zero probability of being chosen.
method Selects the method used to generate the
replacement subtree. The allowed values are the same as those
for the init.method parameter. The default is half_and_half.
depth The depth ramp used to generate the replacement
subtree. The default is "0-4".
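The coupled defaults of the internal and external arguments, shared by crossover and mutation, can be written out as a short Python sketch. The function names are hypothetical. Treating two nonzero values as relative weights is an assumption made for this illustration, not a statement about lil-gp's code.

```python
import random


def resolve_point_rates(internal=None, external=None):
    """Apply the coupled defaults; None means the argument was not given."""
    if internal is None and external is None:
        return 0.9, 0.1                      # neither set
    # One set but not the other: the unset one is taken as zero.
    return (0.0 if internal is None else internal,
            0.0 if external is None else external)


def choose_point_kind(internal, external, rng):
    """Return "internal", "external", or "any" (uniform over all points)."""
    if internal == 0.0 and external == 0.0:
        return "any"
    # Assumption: the two values act as relative weights.
    return rng.choices(["internal", "external"],
                       weights=[internal, external], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(1)
    print(choose_point_kind(*resolve_point_rates(), rng))
    print(choose_point_kind(*resolve_point_rates(0.0, 0.0), rng))
```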
lil-gp supports multiple population runs, in which subpopulations evolve separately, exchanging individuals (or parts of individuals) periodically. Breeding parameters can be set individually for each subpop. The frequency of exchange and the exchange topology are set via parameters.
All three of these parameters must be set to use multiple populations:
multiple.subpops | type: integer | The number of subpopulations (each of pop_size individuals) for multipop runs. The default of 1 specifies an ordinary, singlepop run. | default: 1 |
multiple.exch_gen | type: integer | How often (in generations) the subpop exchange takes place. | default: none |
multiple.exchanges | type: integer | The number of sets of exchanges done. | default: none |
Any breeding parameter can be set for a specific subpop by prefixing the parameter name with "subpop[#].". (Occurrences of "#" in the parameter names of this section should be replaced with a subpopulation number, in the range [1, multiple.subpops].) The unprefixed form of the parameter name acts as a default. For instance, when looking for the operator for the first phase for breeding subpop 3, lil-gp will first look for a parameter named subpop[3].breed[1].operator. If that is not found, it will look for a parameter named just breed[1].operator. If that is not found, lil-gp will stop with an error message.
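The lookup order just described can be illustrated with a few lines of Python. The params dictionary and the function name are hypothetical stand-ins for lil-gp's parameter database.

```python
def lookup_breed_parameter(params, subpop, name):
    """Look up a breeding parameter for one subpop, falling back to the
    unprefixed name, and failing if neither is defined."""
    specific = "subpop[%d].%s" % (subpop, name)
    if specific in params:
        return params[specific]
    if name in params:
        return params[name]                  # unprefixed form acts as the default
    raise KeyError("parameter %r is not set" % name)


if __name__ == "__main__":
    params = {
        "breed[1].operator": "crossover, select=fitness",
        "subpop[3].breed[1].operator": "reproduction, select=best",
    }
    print(lookup_breed_parameter(params, 3, "breed[1].operator"))
    print(lookup_breed_parameter(params, 2, "breed[1].operator"))
```

Here subpop 3 gets its own operator, while subpop 2 falls back to the unprefixed setting.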
Exchange of information between subpopulations can take one of two forms. In the first, whole individuals are copied from one subpop to another, replacing some of the individuals in the destination subpop. The other applies only to individuals composed of multiple trees. The exchange process can create a new individual by taking trees from (possibly different) individuals in (possibly different) subpops, and using the resulting composite individual to replace an existing individual.
You specify a set of exchanges by first giving the following three
parameters. ("#" in all of these should be replaced with
a number from 1 to multiple.exchanges.)
exch[#].to | type: integer | The number of the subpop to receive the individuals. | default: none |
exch[#].toselect | type: selection method | The method used to select the individuals to be replaced in the destination subpopulation. | default: none |
exch[#].count | type: integer | How many individuals to replace with this exchange. | default: none |
For the simple transfer of a whole individual, you specify two
more parameters:
exch[#].from | type: integer | The subpop to take the individuals out of. Individuals are always copied; the donor subpop is left unchanged. | default: none |
exch[#].fromselect | type: selection method | The method used to select the individuals to be copied out. | default: none |
For example, consider a ring of three subpopulations. Each subpopulation chooses its five best members and sends them to the next subpop in the ring. Each takes the individuals sent to it and uses them to replace its five worst members. The topology parameters for this would look like:
multiple.subpops = 3
multiple.exch_gen = 10 # exchange every 10 generations
multiple.exchanges = 3

exch[1].from = 1
exch[1].fromselect = best
exch[1].to = 2
exch[1].toselect = worst
exch[1].count = 5

exch[2].from = 2
exch[2].fromselect = best
exch[2].to = 3
exch[2].toselect = worst
exch[2].count = 5

exch[3].from = 3
exch[3].fromselect = best
exch[3].to = 1
exch[3].toselect = worst
exch[3].count = 5
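For larger rings, writing these parameters by hand becomes tedious. The following Python sketch generates the same kind of ring topology for any number of subpopulations. It is a convenience script written for this illustration, not part of lil-gp, and the function name is hypothetical.

```python
def ring_exchange_parameters(subpops, exch_gen, count):
    """Return parameter-file lines for a ring in which each subpop sends
    its `count` best members to replace the next subpop's worst."""
    lines = [
        "multiple.subpops = %d" % subpops,
        "multiple.exch_gen = %d" % exch_gen,
        "multiple.exchanges = %d" % subpops,
    ]
    for i in range(1, subpops + 1):
        to = i % subpops + 1                 # the last subpop wraps around to subpop 1
        lines += [
            "exch[%d].from = %d" % (i, i),
            "exch[%d].fromselect = best" % i,
            "exch[%d].to = %d" % (i, to),
            "exch[%d].toselect = worst" % i,
            "exch[%d].count = %d" % (i, count),
        ]
    return lines


if __name__ == "__main__":
    print("\n".join(ring_exchange_parameters(3, 10, 5)))
```

Called with (3, 10, 5), it produces the settings of the three-subpopulation example.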
To build a new individual from pieces of current ones, you need
to specify a from and/or fromselect for each tree instead:
exch[#].from.tree[#] | type: integer | default: none |
exch[#].fromselect.tree[#] | type: string | default: none |
There are four possibilities, for each tree:
from and fromselect are both set.
In this case, the selection method fromselect is used to
select an individual from the subpop from, and the tree
is taken from that individual.
only from is set.
The parameter exch[#].fromselect (with no tree number)
is examined for a default. If it is found, then it is used as
the selection method as in the first case. If it is not found,
an error message results.
only fromselect is set.
fromselect should be set to the string "asn",
where n is a tree number. This means "take this tree
from the same individual you took tree n from." If
it is set to anything else an error message results.
neither is set.
The tree is taken from the individual selected to be replaced
(i.e., that tree is just left alone).
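The four cases can be summarized in a short Python sketch that decides, for one tree, where it comes from. The params dictionary, the function name, and the tuples it returns are hypothetical conventions chosen for this illustration.

```python
import re


def tree_source(params, exch, tree):
    """Classify the source of one tree of the composite individual.

    Returns ("select", subpop, method), ("same_as", n), or ("keep",).
    """
    prefix = "exch[%d]." % exch
    subpop = params.get(prefix + "from.tree[%d]" % tree)
    select = params.get(prefix + "fromselect.tree[%d]" % tree)
    if subpop is not None and select is not None:
        return ("select", subpop, select)        # both set
    if subpop is not None:                       # only from is set
        default = params.get(prefix + "fromselect")
        if default is None:
            raise ValueError("no selection method for tree %d" % tree)
        return ("select", subpop, default)
    if select is not None:                       # only fromselect is set
        match = re.fullmatch(r"as(\d+)", "".join(select.split()))
        if match is None:
            raise ValueError("expected 'asn' for tree %d" % tree)
        return ("same_as", int(match.group(1)))
    return ("keep",)                             # neither set: leave the tree alone


if __name__ == "__main__":
    params = {
        "exch[1].from.tree[0]": 2,
        "exch[1].fromselect.tree[0]": "fitness",
        "exch[1].fromselect.tree[1]": "as0",
    }
    for t in range(3):
        print(t, tree_source(params, 1, t))
```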
Consider this exotic (and probably not terribly useful) example. We have three subpops and individuals composed of four trees. We want to take the worst individuals in subpop 1, replace their tree 0 with that from an individual in subpop 2 (using the fitness selection method), and replace both trees 1 and 2 with those from a single individual in subpop 3 (using tournament selection with a tournament size of 7). We want to leave tree 3 alone. We want to replace 10 individuals in this manner. The following parameters will set this up:
exch[1].to = 1
exch[1].toselect = worst
exch[1].count = 10
; replace tree 0 with one from an individual in subpop 2, fitness selection
exch[1].from.tree[0] = 2
exch[1].fromselect.tree[0] = fitness
; replace tree 1 with one from an individual in subpop 3, tournament selection
exch[1].from.tree[1] = 3
exch[1].fromselect.tree[1] = tournament, size=7
; replace tree 2 with the one from the individual that you got tree 1 from
exch[1].fromselect.tree[2] = as1
; no parameters for tree 3 means leave it unchanged
Exchanges are done in the order that they are specified in the
parameter file. Individuals that are placed into a subpopulation
(either copied whole or created from different trees) are marked
as ineligible to be written over by another exchange during that
generation. They can, however, contribute part or all of themselves
to other exchanges.