Genetic algorithms as a mathematical apparatus

Genetic algorithms aim to solve optimization and modeling tasks by iterating, variation and combining target parameters with the help of mechanisms similar to biological evolution.

Genetic algorithms aim to solve optimization problems. Neural network training can serve as an example of such a task. It consists in selecting such values of weights at which the minimum error is achieved. The genetic algorithm is based on the random search method. The main disadvantage of random search is that we do not know how long it will take to solve the problem.

To avoid such a waste of time in solving the problem, professionals use the methods discovered in the study of the evolution and origin of species. In the evolution process, the fittest specimens survive. This leads to an increase in the fitness level of the populations, which allows it to survive better in changing conditions.

This algorithm was first proposed by John Holland at the University of Michigan in 1975. It was called the Holland Reproductive Plan and formed the basis for almost all varieties of genetic algorithms. Before we examine it in detail, we should focus on how real-world objects can be encoded for use in genetic algorithms.

Representation of objects

We know from biology that any organism can be represented by its own phenotype, which actually defines what an object is in the real world, and a genotype that contains all the information about the object at the level of the chromosome set. Each gene (that is, an element of genotype information) has its own reflection in the phenotype.

To solve the problems, we need to represent each trait of the object in a form suitable for use in the genetic algorithm. All further functioning of the mechanisms of the genetic algorithm occurs at the genotype level. It allows you to do without information about the internal structure of the object, which causes its wide application in a variety of tasks.

In the most common variation of the genetic algorithm, bit strings are used to represent the genotype of an object. Each trait of an object in the phenotype corresponds to one gene in the genotype of the object. A gene is a bit string, most often of fixed length, that represents the value of this trait.

Encoding of traits (integers)

To encode traits, you can use the simplest option — the bit value of a trait. Then it will be easy for you to use a gene of a certain length, long enough to represent all possible values of such a trait. But, unfortunately, this kind of encoding has its disadvantages.

The main drawback is that neighboring numbers differ in the values of several bits. For example, the numbers 7 and 8 in the bit representation differ in 4 positions, which makes it difficult for the genetic algorithm to function and increases the time required for its convergence. To avoid this problem, it is better to use encoding, in which neighboring numbers differ in a smaller number of positions, ideally in the value of one bit.
Gray code can be used to work out the genetic algorithm. The values of the Gray code are indicated in the table below:

Decimal CodeBinary ValueHexadecimal valueDecimal CodeBinary ValueHexadecimal value

Table 1. Correlation of decimal codes (left) and Gray codes (right)

When encoding an integer trait, we divide it into tetrads and transform each tetrad according to the Gray code.

In practical implementations of genetic algorithms, there is usually no need to convert the values of a trait to the value of a gene. In practice, there is an inverse problem, when we need to determine the value of a trait, based on the value of the gene.

Thus, the task of decoding the values of genes that correspond to integer traits is trivial.

Encoding of traits (real numbers)

The simplest and the most obvious encoding method is to use bit representation. However, this option has the same disadvantages as the method that is applied to integers. Therefore, in practice, we would usually resort to the following sequence of actions:

  1. Divide the entire range of acceptable values of the trait into sections with the required accuracy.
  2. Take the gene value as an integer that defines the interval number (using Gray code).
  3. Take the number that is the middle of this interval as the parameter value.

Let's analyze the above-described sequence of actions using an example.

Let's assume that the values of the trait belong to the range of [0, 1]. To encode the section, we divided it into 256 intervals. To encode their number, we will need 8 bits. Let the gene value be 00100101bG (the capital G letter indicates that Gray code is used). To get started, let's use Gray code to find the corresponding interval number:


Now let's see which interval corresponds to it. By using simple calculations, we get the [0.20703125, 0.2109375] interval. So the value of the parameter will be (0.20703125+0.2109375)/2=0.208984375.

Encoding non-numeric data

When encoding non-numeric data, you must first convert it to numerals.

How to determine the phenotype by genotype

To determine the phenotype of an object (that is, the values of the traits that describe the object), we only need to know the values of the genes that correspond to these traits (that is, the genotype of the object). The set of genes that describes the genotype of the object is a chromosome. In some implementations, it is also called a specimen.

In the implementation of the genetic algorithm, the chromosome is a bit string with a fixed length. Each section of the string corresponds to a certain gen. The length of the genes within the chromosome can be identical or different. Most often, same-length genes are used.

Let's consider an example of a chromosome and the interpretation of its meaning. Let's say that an object has 5 traits, each encoded by a gene with a length of 4 elements. In this case, the length of the chromosome will be 5*4=20 bits:


Now we can determine the values of the traits:

TraitGene valueBinary trait valueDecimal trait value

Basic genetic operators

In the evolution theory, the way in which the characteristics of parents are transmitted to descendants plays an important role. In genetic algorithms, an operator called crossover is responsible for transmitting the characteristics of parents to descendants. This operator defines the transfer of parent traits to descendants.

It works as follows:

  1. Two specimens are selected from the population to become parents.
  2. The point of discontinuity is determined (usually randomly).
  3. The descendant is defined as a concatenation of a part of the first and second parent.

Let's consider the functioning of this operator:


Let's imagine that the discontinuity occurs after the 3rd bit of the chromosome. So, in this case the following values will apply:


Then, with a probability of 0.5, one of the resulting chromosomes is determined as a descendant.

The next genetic operator aims to maintain the diversity of specimens within the population. It is called the mutation operator. When using this operator, each bit in the chromosome is inverted with a certain probability.

Plus, the so-called inversion operator is also used. It divides the chromosome into two parts, and then they are swapped. This can be represented as the following scheme:


Technically, these two genetic operators are sufficient for the functioning of a genetic algorithm. Yet in practice, some additional operators or modifications of these operators are also used.

For example, a crossover may not be single-point (as described above), but multi-point, when several points of discontinuity are formed (most often, two). Besides, in some implementations of the algorithm, the mutation operator is the inversion of only one randomly selected bit of the chromosome.

Operating scheme

ТNow that you know how to interpret the values of genes, we will proceed to the description of the functioning of the genetic algorithm. Let's consider the scheme of functioning of the genetic algorithm in its classical version:

  1. Set the start time as t=0. Randomly generate an initial population consisting of k specimens B_0={A_1,A_2,…,A_k}
  2. Calculate the fitness of each specimen F_{A_i}=fit(A_i), i=1…k and the population as a whole F_t=fit(B_t).The value of this function determines how well the specimen described by this chromosome is suitable for solving the problem.
  3. Select the A_c specimen from the population A_c=Get(B_t).
  4. With a certain probability (crossover probability P_c) select the second specimen from the population A_{c_1}=Get(B_t) and produce the crossover operator A_c=crossing(A_c,A_{c_1}).
  5. With a certain probability (probability of mutation P_m) execute the mutation operator A_c=mutation(A_c).
  6. With a certain probability (probability of inversion P_i) execute the inversion operator A_c=inversion(A_c).
  7. Place the resulting chromosome in a new population insert(B_{t+1},A_c).
  8. Perform the operations starting with step 3, k times.
  9. Increase the number of the current epoch t=t+1.
  10. If the stopping condition is met, then exit. Otherwise, move to step 2.

Now, let's take a closer look at certain stages of the algorithm.

The selection of the parent chromosomes in steps 3 and 4 plays the most important role in the successful functioning of the algorithm. Various options are possible here. The selection method that is used the most often is called the roulette. When using this method, the probability of choosing a chromosome is determined by its fitness, that is P_{Get{A_i}}=Fit(A_i)/Fit(B_t). Using this method leads to the fact that the specimen with greater fitness levels will be more likely to transmit their traits to their descendants.

Another commonly used method is tournament selection. It consists in several specimens (usually 2) being randomly selected from the population. The specimen with the greatest fitness is selected as the winner.

Besides, some implementations of the algorithm rely on the so-called elitism strategy. Specimens with the best fitness levels are guaranteed to move to a new population. The use of elitism usually accelerates the convergence of the genetic algorithm. The disadvantage of using the elitism strategy is that it increases the probability of the algorithm hitting the local minimum.

Another important point is the definition of the stopping criteria. Usually, these are either a restriction on the maximum number of epochs of the algorithm's functioning, or a determination of its convergence, usually by comparing the fitness of the population at several epochs and stopping when this parameter is stabilized.

See also

Scalable CLOPE algorithm for clustering categorical data
Splitting of categorical and transactional data sets into groups with similar attributes into large databases is the most important task of data mining. In most cases, traditional clustering...
Clustering algorithms on Data Mining
This article is an attempt to systematize and give a holistic view of the latest achievements in the development of effective approaches to data clustering. The purpose of this article was not...
Apriori algorithm for association rule learning
Apriori is one of the most popular algorithms for generating association rules. Employing the anti-monotonicity property, it is able to process large volumes of data within a reasonable amount of...