Apply Deep Neural Network on the Animal Faces dataset

Original article was published on Deep Learning on Medium


animal faces

“A machine learns to classify animals from their images.”

Introduction

I spent all of last month trying to solidify my knowledge of the PyTorch framework with freeCodeCamp and Jovian.ml through their course “Deep Learning with PyTorch: Zero to GANs”. The final step of the course is an independent project, on deep learning of course. I chose the Animal Faces dataset (animal classification).

Data Characteristics

This dataset, also known as Animal Faces-HQ (AFHQ), consists of 16,130 high-quality images at 512×512 resolution.
There are three domains (classes), each providing about 5,000 images. With three domains and diverse images of various breeds in each domain, AFHQ sets a challenging image-to-image translation problem. The classes are cat, dog, and wildlife.

sample image from the dataset

For this project, we will use about 10% of the dataset as the validation set and about 90% as the training set. The loss function will be cross-entropy loss, since this is a classification problem. The optimizer will be stochastic gradient descent (SGD) with a batch size of 20. Stochastic gradient descent is an approximation of gradient descent: the gradient of the loss function is computed on a small batch of training points instead of the whole set, which is much faster. This random sampling of batches introduces noise into the updates, which actually helps prevent the algorithm from getting stuck in narrow local minima.
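To make the idea concrete, here is a minimal sketch of a single SGD step with cross-entropy loss on one mini-batch of 20 samples. The data is random stand-in data and the linear model, feature size and learning rate are assumptions chosen only to keep the example small; they are not the values used in this project.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in data (assumption): 100 flattened "images" with 48 features, 3 classes.
inputs = torch.randn(100, 48)
labels = torch.randint(0, 3, (100,))

model = nn.Linear(48, 3)                        # tiny stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
batch_size = 20

weights_before = model.weight.detach().clone()

# One SGD step: pick a random mini-batch, compute the loss on that batch only,
# backpropagate, and update the weights.
batch_idx = torch.randperm(len(inputs))[:batch_size]
logits = model(inputs[batch_idx])
loss = F.cross_entropy(logits, labels[batch_idx])
loss.backward()
optimizer.step()
optimizer.zero_grad()

print(f"mini-batch loss: {loss.item():.4f}")
```

Each step only looks at 20 samples, so it is cheap, and a different random batch is drawn for every step.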

animal-faces dataset

The dataset is extracted to the directory animal-faces. It contains 2 folders (train and val), holding the training set (14,630 images) and the validation set (1,500 images) respectively, each with 3 classes.
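A quick way to check this layout is to count the files in each class folder. The sketch below uses only the standard library; the helper name count_images_per_class is hypothetical, and the path assumes the data was extracted to animal-faces in the current working directory.

```python
from pathlib import Path


def count_images_per_class(split_dir):
    """Return {class_folder_name: number_of_files} for one split directory."""
    split_dir = Path(split_dir)
    return {
        class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
        for class_dir in sorted(split_dir.iterdir())
        if class_dir.is_dir()
    }


data_dir = Path("animal-faces")  # assumption: adjust to where you extracted the data
for split in ("train", "val"):
    split_path = data_dir / split
    if split_path.is_dir():
        print(split, count_images_per_class(split_path))
```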

Let's look at the training dataset:

Let's look at the validation dataset:

Test set

A test set is used to compare different models, or different types of modeling approaches, and to report the final accuracy of the model. Since there's no predefined test set, we can set aside a portion of the data (5000 images) to be used as the test set. We'll use the random_split helper method from PyTorch to do this. To ensure that we always create the same test set, we'll also set a seed for the random number generator:

We also check the image shape and label of the given dataset.
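The split and the shape check described above can be sketched as follows. A small random TensorDataset stands in for the real image dataset, and the sizes and the seed value are assumptions for illustration (the article sets aside 5000 images).

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset (assumption): 100 tiny random "images" with labels in {0, 1, 2}.
dataset = TensorDataset(torch.randn(100, 3, 8, 8), torch.randint(0, 3, (100,)))

test_size = 30                      # illustrative size only
train_size = len(dataset) - test_size

random_seed = 42                    # arbitrary; any fixed seed gives a repeatable split
generator = torch.Generator().manual_seed(random_seed)
train_ds, test_ds = random_split(dataset, [train_size, test_size], generator=generator)

print(len(train_ds), len(test_ds))

# Check the shape and label of one sample.
image, label = train_ds[0]
print(image.shape, label)
```

Passing a seeded generator to random_split keeps the split reproducible without touching the global random state.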

Training and Validation Dataset

Now we create data loaders for the training and validation sets so that the data is loaded in batches. We set the batch size to 20.
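As a rough sketch, the data loaders could be created like this. The datasets here are small random stand-ins (an assumption) so that the example runs without the real images.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in datasets (assumption): random tensors in place of the real images.
train_ds = TensorDataset(torch.randn(90, 3, 8, 8), torch.randint(0, 3, (90,)))
val_ds = TensorDataset(torch.randn(10, 3, 8, 8), torch.randint(0, 3, (10,)))

batch_size = 20

# Shuffle the training data each epoch; the validation order does not matter.
train_loader = DataLoader(train_ds, batch_size, shuffle=True)
val_loader = DataLoader(val_ds, batch_size)

images, labels = next(iter(train_loader))
print(images.shape, labels.shape)
```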

Preparation for model training

This dataset consists of 16,130 high-quality images at 512×512 resolution in 3 classes, each providing about 5,000 images. There are 1,500 validation images and 14,630 training images. Since we are working with coloured images, each pixel is described by three numeric values, one for each of the red, green and blue (RGB) channels.

Base Model and Training on GPU

First, we create the base model for our neural network, where we define the functions for the training step and the validation step.

Then we define the evaluate function, which reports the progress of our model after each epoch, and the fit function, which updates the weights during each epoch.
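Here is a minimal sketch of what such a base class and the evaluate and fit functions might look like. The names ImageClassificationBase, accuracy, validation_epoch_end and epoch_end are assumptions for illustration and may differ from the original notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def accuracy(outputs, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))


class ImageClassificationBase(nn.Module):
    """Shared training/validation logic; subclasses only define forward()."""

    def training_step(self, batch):
        images, labels = batch
        return F.cross_entropy(self(images), labels)

    def validation_step(self, batch):
        images, labels = batch
        out = self(images)
        return {"val_loss": F.cross_entropy(out, labels).detach(),
                "val_acc": accuracy(out, labels)}

    def validation_epoch_end(self, outputs):
        # Average the per-batch metrics (a simple mean over batches).
        loss = torch.stack([x["val_loss"] for x in outputs]).mean()
        acc = torch.stack([x["val_acc"] for x in outputs]).mean()
        return {"val_loss": loss.item(), "val_acc": acc.item()}

    def epoch_end(self, epoch, result):
        print(f"Epoch [{epoch}], val_loss: {result['val_loss']:.4f}, "
              f"val_acc: {result['val_acc']:.4f}")


@torch.no_grad()
def evaluate(model, val_loader):
    """Run the model over the validation loader and return averaged metrics."""
    model.eval()
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)


def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    """Train for the given number of epochs and return one result dict per epoch."""
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        model.train()
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result)
        history.append(result)
    return history
```

Keeping the loss and metric logic in the base class means the actual network only has to define its layers and its forward pass.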

Thanks to PyTorch, we can use the GPU for training and evaluating our model. GPUs are much more efficient at computing and updating weights, especially when the data consists of images or videos, as it does in our case. So we will move our data to the GPU, since one is available here. If you don't have one, you can use your CPU as usual, but training may be slow.

First we check whether a GPU is available:
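A small sketch of such a check, falling back to the CPU when no GPU is found. The function name get_default_device is an assumption.

```python
import torch


def get_default_device():
    """Pick the GPU if one is available, otherwise fall back to the CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")


device = get_default_device()
print(device)
```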

Now let's define the helper function for moving data to the GPU:

Then we move the data to the GPU:
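These two steps might look roughly like the sketch below. The names to_device and DeviceDataLoader are assumptions, and a small random dataset stands in for the real one. The wrapper moves each batch to the device only when it is requested, so the whole dataset never has to fit in GPU memory at once.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def to_device(data, device):
    """Move a tensor, or a list/tuple of tensors, to the chosen device."""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)


class DeviceDataLoader:
    """Wrap a data loader so each batch is moved to the device as it is yielded."""

    def __init__(self, dl, device):
        self.dl = dl
        self.device = device

    def __iter__(self):
        for batch in self.dl:
            yield to_device(batch, self.device)

    def __len__(self):
        return len(self.dl)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in data (assumption): random tensors in place of the real images.
train_ds = TensorDataset(torch.randn(40, 3, 8, 8), torch.randint(0, 3, (40,)))
train_loader = DeviceDataLoader(DataLoader(train_ds, batch_size=20, shuffle=True), device)

images, labels = next(iter(train_loader))
print(images.device, len(train_loader))
```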

Training the Model:

First we define the input size of our image dataset, which is 512×512 pixels, and we set the output size, which should equal the number of output classes you have (3 here):

Neural network:

Our neural network has an input layer followed by 3 hidden layers, with ReLU as the activation function.
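A network of this shape could be sketched as follows. The class name AnimalFacesModel and the hidden layer widths are assumptions, since the article does not state them, and the demo uses small 32×32 stand-in images so that it runs quickly.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AnimalFacesModel(nn.Module):
    """Feedforward classifier: flatten, 3 hidden layers with ReLU, output layer."""

    def __init__(self, input_size, output_size, hidden_sizes=(256, 128, 64)):
        super().__init__()
        h1, h2, h3 = hidden_sizes  # hypothetical widths, not from the article
        self.linear1 = nn.Linear(input_size, h1)
        self.linear2 = nn.Linear(h1, h2)
        self.linear3 = nn.Linear(h2, h3)
        self.output = nn.Linear(h3, output_size)

    def forward(self, xb):
        out = xb.flatten(start_dim=1)      # (batch, channels * height * width)
        out = F.relu(self.linear1(out))
        out = F.relu(self.linear2(out))
        out = F.relu(self.linear3(out))
        return self.output(out)            # raw scores (logits), one per class


# Demo with small stand-in images (assumption) instead of full-size ones.
input_size = 3 * 32 * 32
output_size = 3
model = AnimalFacesModel(input_size, output_size)
logits = model(torch.randn(20, 3, 32, 32))
print(logits.shape)  # torch.Size([20, 3])
```

Note that flattening a full 512×512 RGB image gives 3 × 512 × 512 = 786,432 input values, so the first linear layer becomes very large at that resolution.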

With these functions defined, we can start training. Remember to move the model to the GPU when you instantiate it. Then we set the values we will use.

First let us see how our model performs before it is trained:

So we have an accuracy of 33.3% before the model is trained, which is what we expect: with three classes of roughly equal size, random guessing is right about one time in three.
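To see why about one in three is the natural starting point for three balanced classes, here is a tiny simulation in plain Python. It treats an untrained model as if it guessed uniformly at random, which is a simplifying assumption.

```python
import random

random.seed(0)
num_classes = 3
num_samples = 30_000

# Balanced labels: each class appears equally often.
labels = [i % num_classes for i in range(num_samples)]
# Simplifying assumption: an untrained model behaves like uniform random guessing.
guesses = [random.randrange(num_classes) for _ in range(num_samples)]

chance_accuracy = sum(g == y for g, y in zip(guesses, labels)) / num_samples
print(f"chance accuracy: {chance_accuracy:.3f}")
```

Anything clearly above this level after training means the model has learned something from the images.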

Then let's train the model and see if there is some improvement.

We train the model using the fit function to reduce the validation loss and improve accuracy, setting the number of epochs to train for and the learning rate:

It seems like 0.0001 is a good learning rate, so we'd better keep it this way.

Excellent, after 7 epochs we're getting an accuracy of 60%. That's good, right?

Let's increase the learning rate to 0.001 and see what happens:

It looks like increasing the learning rate from 0.0001 to 0.001 gives a better result.

If you want, you can continue adjusting this model and try different approaches to see if you can get a better result. Here we are getting 81%, and that's good!

Evaluating the model:

We plot the losses and accuracies and evaluate the first model on the test set:
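A plotting helper along these lines could be used, assuming the training history is a list of dictionaries with val_loss and val_acc keys (an assumption about how the history is stored). The numbers in the example are placeholders that only make the sketch runnable; they are not results from this project.

```python
import matplotlib.pyplot as plt


def plot_history(history):
    """Plot validation loss and accuracy per epoch from a list of result dicts."""
    epochs = range(1, len(history) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))

    ax_loss.plot(epochs, [r["val_loss"] for r in history], "-x")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("validation loss")
    ax_loss.set_title("Loss vs. epochs")

    ax_acc.plot(epochs, [r["val_acc"] for r in history], "-x")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("validation accuracy")
    ax_acc.set_title("Accuracy vs. epochs")

    fig.tight_layout()
    return fig


# Placeholder values only, to make the sketch runnable; not real results.
history = [
    {"val_loss": 1.0, "val_acc": 0.4},
    {"val_loss": 0.8, "val_acc": 0.6},
    {"val_loss": 0.6, "val_acc": 0.7},
]
fig = plot_history(history)
```

In a notebook the figure is displayed automatically; in a script you can call fig.savefig with a file name of your choice.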

On the chart, the accuracy increases sharply and then levels off at around 81% to 85%.

Now let's look at the history chart:

Predictions:

Let us now make some predictions using the test dataset we created and see how the model does on real images.
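A prediction helper might look like the sketch below. The function name predict_image is an assumption, the class names are assumed folder names, and a tiny untrained model with a random image stands in for the trained network and a real test sample, so the printed class is meaningless here.

```python
import torch
import torch.nn as nn


def predict_image(img, model, classes):
    """Return the predicted class name for a single image tensor."""
    device = next(model.parameters()).device
    xb = img.unsqueeze(0).to(device)       # add a batch dimension of size 1
    model.eval()
    with torch.no_grad():
        logits = model(xb)
    pred_index = torch.argmax(logits, dim=1).item()
    return classes[pred_index]


# Stand-ins (assumptions): class names, an untrained model and a random image.
classes = ["cat", "dog", "wild"]
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, len(classes)))
img = torch.randn(3, 8, 8)

print("Predicted:", predict_image(img, model, classes))
```

With the real test set you would pass one image tensor from it, together with the trained model, and compare the returned name with the true label.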