Programming and Variable-Length Genetic Algorithms with Subtree Crossover


Riccardo Poli 
Department of Computer Science  
University of Essex
UK 
rpoli@essex.ac.uk 

Jonathan E. Rowe 
School of Computer Science 
The University of Birmingham  
UK
j.e.rowe@cs.bham.ac.uk 

Christopher R. Stephens
Instituto de Ciencias Nucleares 
UNAM, Mexico  
stephens@nuclecu.unam.mx

Alden H. Wright
Computer Science Dept.
Univ. of Montana
Missoula, MT 59812
wright@cs.umt.edu
(406) 243-4790
http://www.cs.umt.edu/u/wright/wright.htm


Abstract

In this paper we study, theoretically, the search biases 
produced by GP subtree crossover when applied to linear 
representations, such as those used in linear GP or in 
variable length GAs.  The study naturally leads to
generalisations of Geiringer's theorem and of the notion 
of linkage equilibrium, which, until now, were applicable only 
to fixed-length representations. This indicates the presence of a 
diffusion process by which, even in the absence of selective 
pressure and mutation, the alleles in a particular individual 
tend not just to be swapped with those of other individuals in 
the population, but also to diffuse within the representation 
of each individual. More precisely, crossover attempts
to push the population towards distributions of primitives where each
primitive is equally likely to be found in any position in any
individual.