MNIST Classifier using Genetic CNN

Original article was published on Artificial Intelligence on Medium

Almost everyone who wants to work on deep learning starts with classifying the MNIST (Modified National Institute of Standards and Technology database) handwritten digits. The preprocessing and training code is fairly simple, and even a modest model usually reaches an accuracy close to 0.99. To make things more interesting, we will use a genetic algorithm together with a CNN to classify the MNIST dataset.

For all the deep learning enthusiasts interested in new ways of training a model and increasing its efficiency, here is something to try. I have been going through the genetic algorithm and it's amazing. It is purely inspired by nature's way of 'survival of the fittest'. Combining it with deep learning could not only reduce the training time but also increase the model's efficiency. In this tutorial, we will try to classify handwritten digits, but the applications can be immense.

MNIST Dataset

Its long name aside, the popularity of this dataset is not to be underestimated. Nearly every deep learning enthusiast uses it at least once in their career, whether to test a model or a new theory. Similarly, we will be using the handwritten digit dataset to test our Genetic CNN model.


The dataset as a whole contains 70,000 images of the digits 0 to 9. Each input is an image of a fixed size (28x28x1) and each output is the actual digit as a number. The dataset can be loaded directly from the TensorFlow library. To get started, install the library.

Import the library and load the dataset.

Preprocess the data by normalizing the pixel values of the images, which usually helps training converge faster.
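Here is a minimal sketch of the loading and normalization steps. It assumes the Keras helper `tf.keras.datasets.mnist.load_data()`; the function names `preprocess` and `load_mnist` are my own and are not taken from the original code.

```python
def preprocess(images):
    """Scale uint8 pixels to floats in [0, 1] and add a channel axis."""
    images = images.astype("float32") / 255.0
    return images[..., None]  # (N, 28, 28) -> (N, 28, 28, 1)


def load_mnist():
    """Load MNIST through Keras and return preprocessed train/test splits."""
    # Imported here so that TensorFlow is only loaded when the data is needed.
    import tensorflow as tf

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    return (preprocess(x_train), y_train), (preprocess(x_test), y_test)
```

The labels are left as plain integers, which works with a sparse categorical loss later on.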

Deep Learning

The genetic algorithm in itself has a limited reach, but with deep learning acting as the brain it's a different story altogether. Convolutional neural networks will be the base of this project. We will create a simple CNN; since the dataset is MNIST, the exact architecture will not matter much. Start by declaring a new function to initialize the model.
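As an illustration, a small model could be initialized like this. The architecture, layer sizes and the name `build_model` are assumptions on my part and may differ from the original code; I only picked the layer order so that the 1st, 3rd and 5th layers are the ones that carry weights.

```python
def build_model():
    """Build and compile a small CNN for 28x28x1 digit images."""
    # Imported here so that TensorFlow is only loaded when a model is built.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),   # 1st layer, has weights
        tf.keras.layers.MaxPooling2D(2),                    # 2nd layer, no weights
        tf.keras.layers.Conv2D(32, 3, activation="relu"),   # 3rd layer, has weights
        tf.keras.layers.Flatten(),                          # 4th layer, no weights
        tf.keras.layers.Dense(10, activation="softmax"),    # 5th layer, has weights
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # labels are integers 0 to 9
        metrics=["accuracy"],
    )
    return model
```

Every call returns a model with freshly randomized weights, which is what we want for a population of different individuals.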

Now define a new function that takes a model, trains it for 1 epoch, and returns the trained model along with its current loss.
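A sketch of such a function, assuming a compiled Keras model and the standard `fit` call; the name `train_one_epoch` and the batch size are illustrative choices, not the original code.

```python
def train_one_epoch(model, x_train, y_train, batch_size=128):
    """Train the model for a single epoch and return it with its loss."""
    history = model.fit(
        x_train, y_train, epochs=1, batch_size=batch_size, verbose=0
    )
    # Keras records one loss value per epoch, so take the only entry.
    loss = float(history.history["loss"][0])
    return model, loss
```

The returned loss is the training loss of that epoch, which the genetic algorithm can later use to rank individuals.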

With these, we are ready to train the model. The rest will be controlled by the genetic algorithm. If you want to improve the model first, you can train it on its own and make some changes.

Genetic Algorithm

Genetic algorithms are a family of evolutionary algorithms used for optimization and search problems. They are purely inspired by nature's process of evolution. For example, we humans evolved from earlier primates; as conditions and requirements changed, we adapted and became more intelligent. Peahens tend to choose the males with the brightest colors. Long ago there were not many such males, but today ones that are not bright are rare. Here are some more examples; do go through them:

Image by Ilya Kuzovkin via

Now we will go through some of the basics of the genetic algorithm:

Individual: An individual is an entity that tries to solve the given problem.

Genes: A set of properties that characterize the individual. These can be a set of strings or, in our case, the weights of the neural network.

Population: A population is a set of individuals that try to solve the given problem.

Generation: The entire population at a particular time is a generation. Each generation is expected to be better than the last. Each individual in the current generation is either produced from the last generation or randomly selected from it.

Elitism: The fittest individuals of the current generation are the best adapted to the problem and hence are promoted directly to the next one.

Mating: From the current generation, the top individuals are chosen. Two of them are picked at random and their genes are mixed to form a new individual.

Mutation: Genes of a newly formed individual are randomly modified in order to maintain randomness across the generations.

Now we will start coding the genetic algorithm using a neural network as the memory. Make a new Python file and import the previous one in it. Declare a few required variables and continue to the main code.

The layers variable depends on your model. Since the 1st, 3rd and 5th layers of our model have weights, the variable lists exactly those layers.
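The variables could look roughly like this. Only the population size of 10 comes from this article; every other value and every name below is an assumption for illustration. The two helpers rely on the `get_weights` and `set_weights` methods of Keras layers.

```python
POPULATION_SIZE = 10  # the first generation in this article has 10 individuals
GENERATIONS = 20      # assumption: number of evolution rounds
ELITE_COUNT = 2       # assumption: individuals promoted unchanged
PARENT_COUNT = 5      # assumption: top individuals allowed to mate
MUTATION_RATE = 0.1   # assumption: chance that a single gene is modified
LAYERS = [0, 2, 4]    # zero-based indices of the 1st, 3rd and 5th layers


def get_genes(model):
    """Collect the weights of the layers listed in LAYERS."""
    return [model.layers[i].get_weights() for i in LAYERS]


def set_genes(model, genes):
    """Write a set of genes back into the layers listed in LAYERS."""
    for i, weights in zip(LAYERS, genes):
        model.layers[i].set_weights(weights)
```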

Create a new function to perform mutation of the weights.
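To show the idea without any dependencies, here is a sketch where the genes are a flat list of floats. With Keras you would apply the same logic to each weight array returned by `get_weights()`. The function name and the `rate` and `scale` parameters are assumptions, not the original code.

```python
import random


def mutate(genes, rate=0.1, scale=0.05, rng=random):
    """Return a copy of genes where each value is nudged with probability rate."""
    return [
        # Add small Gaussian noise to the selected genes, keep the rest as is.
        g + rng.gauss(0.0, scale) if rng.random() < rate else g
        for g in genes
    ]
```

A small `rate` keeps most of what the individual has already learned, while still adding some randomness to the generation.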

To perform mating and form a new generation of individuals with better weights, write a function that mixes the genes of two parents.
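One common way to mix genes is uniform crossover, where every gene of the child is copied from one of the two parents at random. The sketch below uses that approach on flat lists of floats; it is my choice for illustration and the original code may mix the weights differently.

```python
import random


def crossover(parent_a, parent_b, rng=random):
    """Build a child by taking each gene from one parent at random."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of genes")
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mate(parents, n_children, rng=random):
    """Create n_children by pairing two randomly picked parents each time."""
    children = []
    for _ in range(n_children):
        a, b = rng.sample(parents, 2)  # two distinct parents
        children.append(crossover(a, b, rng))
    return children
```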

Now create another function that sorts the current population to find the elites and the parents needed for creating the new generation.
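A sketch of the sorting step, assuming each individual has a loss from its last training epoch and that a lower loss means a fitter individual. The name `select` and its default counts are illustrative.

```python
def select(population, losses, n_elite=2, n_parents=5):
    """Rank individuals by loss and return (elites, parents)."""
    # Sort on the loss only, so individuals themselves are never compared.
    ranked = [
        individual
        for _, individual in sorted(zip(losses, population), key=lambda p: p[0])
    ]
    return ranked[:n_elite], ranked[:n_parents]
```

In this sketch the elites are also part of the parent pool, so the best individuals both survive and pass on their genes.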

Finally, all that is left is to code the main controlling loop for the whole algorithm. It runs for the chosen number of generations and performs one round of evolution in each iteration.


The first generation of 10 individuals is initialized, each with the same model architecture but with random weights to begin with. Each of them is trained for 1 epoch, and their losses and new weights are stored. Then, using the present generation, new individuals are created through elitism, mating, and mutation. These new individuals again train for 1 epoch and the process repeats.
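The whole loop can be sketched in a few lines. To keep the example self-contained and quick to run, a toy loss (the sum of squared genes) stands in for training the CNN for 1 epoch, and the genes are flat lists of floats. All names and default values are assumptions; only the population size of 10 comes from the description above.

```python
import random


def toy_loss(genes):
    """Stand-in for one epoch of CNN training: lower is better."""
    return sum(g * g for g in genes)


def evolve(pop_size=10, n_genes=8, generations=30, n_elite=2, n_parents=5,
           rate=0.2, scale=0.1, seed=0):
    """Run the genetic loop and return the best loss of each generation."""
    rng = random.Random(seed)
    # First generation: same "architecture" (gene count), random weights.
    population = [
        [rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
        for _ in range(pop_size)
    ]
    best_per_generation = []
    for _ in range(generations):
        # Rank the individuals, fittest (lowest loss) first.
        ranked = sorted(population, key=toy_loss)
        best_per_generation.append(toy_loss(ranked[0]))
        elites = ranked[:n_elite]      # elitism: promoted unchanged
        parents = ranked[:n_parents]   # top individuals allowed to mate
        children = []
        while len(elites) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Mating: each gene comes from one of the two parents.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Mutation: nudge some genes with small Gaussian noise.
            child = [
                g + rng.gauss(0.0, scale) if rng.random() < rate else g
                for g in child
            ]
            children.append(child)
        population = elites + children
    return best_per_generation


if __name__ == "__main__":
    history = evolve()
    print("first generation best loss:", history[0])
    print("last generation best loss:", history[-1])
```

Because the elites are carried over unchanged, the best loss in this sketch can never get worse from one generation to the next. In the real project, `toy_loss` would be replaced by training each model for 1 epoch and reading its loss.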

I hope you learned something new from this blog. If you come across any errors or have doubts, do comment below. The original working code for this project can be found in this GitHub repository: