[Deep Learning Lab] Episode-2: CIFAR-10

Let the “Deep Learning Lab” begin!

This is the second episode of the “Deep Learning Lab” story series, which collects my individual deep learning work on different cases.

I would like to work on the CIFAR datasets in this second episode. CIFAR-10 and CIFAR-100 are two datasets with different numbers of classes -please take a hint, it’s all in the names-. First, it’s time to start with CIFAR-10, which is -relatively- easier to work with; working on CIFAR-100 in a different setting is already planned for a later part of the series.

CIFAR-10

Let’s quickly get to know the CIFAR-10 dataset. CIFAR-10 is one of the most well-known image datasets. It contains 60,000 images and was created by the first person that should come to your mind in deep learning, together with his teammates. OFC, I’m talking about Geoffrey Hinton. CIFAR-10 is a labeled subset of the “80 million tiny images” dataset. (G. Hinton, A. Krizhevsky and V. Nair in Canadian Institute for Advanced Research)

Every image in this dataset is 32x32x3 (RGB). If you have no idea what the “3” in the third dimension and the “RGB” in the brackets mean, I strongly recommend reading this article. Moreover, there are 50,000 images for training a model and 10,000 images for evaluating its performance. The classes and 10 randomly selected images of each class can be seen in the picture below.

Classes and 10 randomly selected images of each class

Let me cite the real hero:

Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009

University of Toronto, Technical Report

I am looking forward to creating an accurate deep learning model on the CIFAR-10 dataset. Let’s start coding!

LET’S GOOOOO!

24 randomly selected images from the CIFAR-10 dataset

As in the first episode, I would like to thank Fuat from Deep Learning Turkey, who introduced me to Google Colaboratory and helped me get started with it.

For more information: Google Colab Free GPU Tutorial

The very first move: Importing the libraries

from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import print_summary, to_categorical
import sys
import os

We need to assign a file path on Google Drive to save the model that we train on Google Colaboratory. First, create a new folder named “cifar10” on Google Drive, then run the following code snippet on Google Colab.

sys.path.insert(0, 'drive/cifar10')
os.chdir('drive/cifar10')

Initializing the parameters.

We will feed the convolutional neural network with the images in batches -each batch contains 64 images- for 100 epochs, and eventually the network will output the probability that the image belongs to each of the 10 categories (num_classes).

batch_size = 64
num_classes = 10
epochs = 100
model_name = 'keras_cifar10_model'
save_dir = '/model/' + model_name

Thanks to Keras, we can load the dataset easily.

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

We also need to convert the labels in the dataset from a numpy array of integer class ids into a categorical (one-hot) matrix structure.

y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
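To make the conversion concrete, here is a minimal NumPy sketch of what one-hot encoding does. The helper name `one_hot` and the three sample labels are mine, for illustration only; in the tutorial itself `to_categorical` does this job.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class ids into a (n_samples, num_classes) matrix of 0s and 1s."""
    # CIFAR-10 labels arrive with shape (n, 1), so flatten them first.
    labels = np.asarray(labels, dtype=int).ravel()
    encoded = np.zeros((labels.size, num_classes), dtype='float32')
    # Put a single 1 in each row, at the column of that row's class id.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

sample_labels = np.array([[3], [0], [9]])  # hypothetical labels, shaped like y_train
encoded = one_hot(sample_labels, 10)
print(encoded.shape)
print(encoded[0])
```

Each row now has exactly one non-zero entry, which is the form the softmax output layer is compared against.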

Once bitten, twice shy: we will not forget it this time. We need to normalize the images in the dataset.

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255.0
x_test /= 255.0
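The cast to float32 before dividing is not just cosmetic. The sketch below uses a tiny random stand-in for the dataset (the array `fake_images` is an assumption, not real CIFAR-10 data) to show that NumPy rejects the in-place division on uint8 pixels, and that after the cast all values land in the range [0, 1].

```python
import numpy as np

# A tiny stand-in for a batch of images: 2 images of 32x32x3 uint8 pixels.
fake_images = np.random.randint(0, 256, size=(2, 32, 32, 3), dtype=np.uint8)

# In-place division on uint8 data is rejected by NumPy, because the
# floating-point result cannot be stored back into an integer array.
try:
    fake_images /= 255.0
    in_place_worked = True
except TypeError:
    in_place_worked = False

scaled = fake_images.astype('float32')
scaled /= 255.0  # pixel values 0..255 become 0.0..1.0
print(in_place_worked, scaled.dtype, scaled.min(), scaled.max())
```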

We are now absolutely sure that this is enough preprocessing -for now, LUL-. It is time to create our model. For this episode in the series, I prefer to use the most common neural network architecture in the literature: [CONV] - [MAXP] - ... - [CONV] - [MAXP] - [Dense]

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Conv2D(128, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.4))

model.add(Flatten())
model.add(Dense(80))
model.add(Activation('relu'))
model.add(Dropout(0.3))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

The summary of this model can be seen below:

Summary of model
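As a cross-check on the summary, the parameter counts can be worked out by hand from the layer definitions. This is a plain-Python sketch under the settings used in the model code (3x3 kernels, padding='same', 2x2 max pooling); the helper names `conv_params` and `dense_params` are my own.

```python
def conv_params(kernel, in_channels, filters):
    # Each filter holds kernel*kernel*in_channels weights plus one bias.
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    # One weight per input per unit, plus one bias per unit.
    return (inputs + 1) * units

side, channels = 32, 3          # CIFAR-10 input: 32x32 pixels, 3 channels
layer_params = []
for filters in (32, 64, 128):
    layer_params.append(('conv2d_%d' % filters, conv_params(3, channels, filters)))
    channels = filters          # padding='same' keeps width and height unchanged
    side //= 2                  # each 2x2 max pooling halves width and height

flat = side * side * channels   # length of the vector produced by Flatten()
layer_params.append(('dense_80', dense_params(flat, 80)))
layer_params.append(('dense_10', dense_params(80, 10)))
total_params = sum(count for _, count in layer_params)

for name, count in layer_params:
    print('%-12s %8d' % (name, count))
print('%-12s %8d' % ('total', total_params))
```

The activation, pooling and dropout layers have no trainable weights, so they do not appear in the count. Note that most of the parameters sit in the first dense layer, not in the convolutions.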

I prefer the Stochastic Gradient Descent algorithm to optimize the weights during backpropagation. Set the momentum parameter to 0.9 and leave the others at their defaults. Again, I strongly recommend reading an article, this one, to get more information about the SGD algorithm.

opt = SGD(lr=0.01, momentum=0.9, decay=0, nesterov=False)
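To get a feel for what the momentum term does, here is a minimal sketch of the classic (non-Nesterov) momentum update on a one-dimensional toy loss. The function names and the toy loss (w - 3)^2 are assumptions for illustration; the real optimizer applies the same kind of rule to every weight of the network.

```python
def sgd_momentum_step(w, velocity, grad, lr=0.01, momentum=0.9):
    """One update of classic (non-Nesterov) momentum SGD."""
    # Keep 90% of the previous step and add a small step against the gradient.
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

def grad_toy_loss(w):
    # Gradient of the toy loss (w - 3)^2, whose minimum is at w = 3.
    return 2.0 * (w - 3.0)

w, velocity = 0.0, 0.0
for _ in range(300):
    w, velocity = sgd_momentum_step(w, velocity, grad_toy_loss(w))
print(round(w, 4))
```

With momentum set to 0 the same loop is plain gradient descent, which moves toward the minimum much more slowly at this learning rate.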

We are now ready to compile our model. Categorical crossentropy has been picked as the loss function because we have more than 2 labels and have already prepared the labels in the categorical matrix structure -I confess, I copied it from the first episode-.

model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
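For readers who want to see the loss itself, here is a small NumPy sketch of categorical crossentropy on two made-up predictions over three classes. The clipping constant `eps` and the sample numbers are my assumptions; the sketch only illustrates the formula, not the exact Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: minus the log of the probability given to the true class."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Made-up example: the first prediction is confident and right,
# the second spreads its probability and gives the true class only 0.3.
y_true = np.array([[0, 0, 1], [1, 0, 0]], dtype='float32')
y_pred = np.array([[0.1, 0.2, 0.7], [0.3, 0.4, 0.3]], dtype='float32')
losses = categorical_crossentropy(y_true, y_pred)
print(losses)
```

The less probability the model assigns to the correct class, the larger the loss, which is exactly what the optimizer tries to push down.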

We’ve done a lot, and only one step remains before we begin training our model. This time, I would like to make a different move. I will split the training dataset (50,000 images) into training (40,000 images) and validation (10,000 images) datasets to measure the validation accuracy of our model in a better way. Thus, after each epoch, our neural network model will be evaluated on images that it has never seen during training.

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_split=0.2,
          shuffle=True)
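The arithmetic behind the split can be sketched in plain Python. As far as I know, Keras takes the validation samples from the end of the arrays, before any shuffling, so the sketch slices the last 20% off a list of indices. The helper name `split_point` is mine.

```python
import math

def split_point(n_samples, validation_split):
    # Index where the training part ends and the validation part begins.
    return int(n_samples * (1.0 - validation_split))

n_total = 50000
cut = split_point(n_total, 0.2)

indices = list(range(n_total))
train_idx = indices[:cut]   # first 80% of the samples: used for weight updates
val_idx = indices[cut:]     # last 20% of the samples: only evaluated after each epoch

# With 64 images per batch, this is how many weight updates one epoch makes.
steps_per_epoch = math.ceil(len(train_idx) / 64)
print(len(train_idx), len(val_idx), steps_per_epoch)
```

Because the validation part is simply the tail of the arrays, shuffle=True only reorders the training part between epochs.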

Well, so far so good. We have started to learn. I think we did, didn’t we?

Epochs 1–5

In contrast to the previous episode, training our model took a long time despite using a powerful GPU -thanks to Google Colab-. After about 4 hours of training, it looked like this.

Epochs 95–100

Just before measuring the accuracy of our model with the test dataset, I would like to share with you the achievements obtained using the CIFAR-10 dataset. GO!

We all hold our breath, AND…

scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
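The accuracy number reported by the evaluation boils down to comparing the most probable class with the true class. Here is a small NumPy sketch on four made-up predictions over three classes; the function name `accuracy` and the sample values are assumptions for illustration.

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    """Fraction of samples whose most probable class matches the true class."""
    predicted = np.argmax(y_prob, axis=1)
    actual = np.argmax(y_true_one_hot, axis=1)
    return float(np.mean(predicted == actual))

# Made-up softmax outputs for 4 images and 3 classes.
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3],   # predicted class 1, true class 2: a miss
                   [0.2, 0.2, 0.6]])
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 0, 1]])
toy_accuracy = accuracy(y_true, y_prob)
print(toy_accuracy)
```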

The result:

Our -lovely- model correctly classifies 77.37% of the 10,000 test images.

We -really- need to look at the performances in the literature, and I -indeed- don’t believe that we have fallen far behind. I’m proud of this figure since our model has just 3 convolutional layers with a simple neural network architecture and was trained for only 4 hours.

So what do you think?

What should we do to improve the performance of our model?

The first answer to this question: Train the model for longer.

I’m not a lazy guy -HUH-.

Results:

100 epochs: Accuracy: 77.37%, Loss: 0.670
150 epochs: Accuracy: 78.22%, Loss: 0.646
200 epochs: Accuracy: 77.77%, Loss: 0.670

I won’t continue to train this model any longer -of course- since there was no improvement in the loss values after the 170th-175th epochs. In other words, our model will start to overfit. If it is trained more -as is-, the performance on the test dataset will begin to decrease, which is the last thing we would like to happen.
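The "stop when the loss no longer improves" decision can be automated. Below is a plain-Python sketch of a patience rule; the function name and the loss values are made up for illustration and are not the losses from this training run. Keras ships an `EarlyStopping` callback built on the same idea, which could be passed to `fit` through its `callbacks` argument.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a simple patience rule."""
    best_loss = float('inf')
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember it and keep training.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: they improve, then creep back up.
made_up_losses = [0.90, 0.75, 0.70, 0.68, 0.69, 0.70, 0.71, 0.72]
stop_epoch, best_epoch = early_stopping_epoch(made_up_losses, patience=3)
print(stop_epoch, best_epoch)
```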

Well, what else can we do?

Yes, it is possible: we can still do something to improve the performance of our model.

For example:

  • Data augmentation. We can effectively increase the number of images in the dataset with the help of a class in the Keras library named “ImageDataGenerator”, by augmenting the images with horizontal/vertical flipping, rescaling, rotating, whitening, etc. The more data we have for training, the more accurate a result we could obtain -here, I do not claim that data augmentation always guarantees better accuracy-.
  • Changing the optimizer. Stochastic Gradient Descent is probably not the most appropriate optimization algorithm for this dataset. Another optimizer may increase the performance -attention! I’m not talking about a definite increase-.
  • Changing the learning rate. We could decrease the learning rate a bit after the 170th epoch. It is possible to change the learning rate during training with the help of the Keras callbacks named “LearningRateScheduler” and “ReduceLROnPlateau”.
  • Changing the architecture. If the performance of our model still does not satisfy us, we will have to question the architecture of the model that we are building. We need to make our model resemble more modern neural networks like ResNet and VGGNet, or we need to change the activation functions of the layers in the model. Then, we will have a chance to improve the performance.
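Two of the ideas in this list are easy to sketch without training anything. The block below shows a random horizontal flip written in NumPy and a step-decay schedule that lowers the learning rate after epoch 170. Both function names, the drop factor 0.1 and the random stand-in images are my assumptions. A function that maps an epoch index to a learning rate, like `step_decay`, is the kind of function “LearningRateScheduler” expects.

```python
import numpy as np

def random_horizontal_flip(images, rng, probability=0.5):
    """Mirror each image left-to-right with the given probability."""
    flipped = images.copy()
    mask = rng.random(len(images)) < probability
    # Images are stored as (N, height, width, channels); reverse the width axis.
    flipped[mask] = flipped[mask][:, :, ::-1, :]
    return flipped, mask

def step_decay(epoch, base_lr=0.01, drop=0.1, drop_epoch=170):
    """Keep the base learning rate, then multiply it by `drop` from `drop_epoch` on."""
    return base_lr * drop if epoch >= drop_epoch else base_lr

rng = np.random.default_rng(0)
fake_images = rng.random((8, 32, 32, 3)).astype('float32')  # stand-in for x_train
augmented, was_flipped = random_horizontal_flip(fake_images, rng)
print(was_flipped)
print(step_decay(0), step_decay(170))
```

Flipping fits classes such as cars or horses, whose mirror images are still valid examples. Whether a given augmentation helps on this dataset has to be checked by training.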

Before summing up, I would like to save our model as a file with the extension “.h5”. This way, I can continue to train my model, I can turn the model into software that predicts “real-case” pictures the model has never met before, and I can share the model with other deep learning researchers who would like to use it.

model.save(save_dir + '.h5')
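One practical caveat: writing the file may fail if the target folder does not exist yet, and a path that starts with “/” points to the root of the file system, not to the Drive folder. The sketch below uses a hypothetical helper, `ensure_parent_dir`, to create the folder first; it runs inside a temporary directory so that nothing is written to a real location.

```python
import os
import tempfile

def ensure_parent_dir(file_path):
    """Create the folder that will hold file_path if it is missing."""
    parent = os.path.dirname(file_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return file_path

# Demonstration in a throwaway directory; in the notebook the path would
# be built from save_dir instead.
with tempfile.TemporaryDirectory() as root:
    target = ensure_parent_dir(os.path.join(root, 'model', 'keras_cifar10_model.h5'))
    parent_exists = os.path.isdir(os.path.dirname(target))
print(parent_exists)
```

The returned path could then be passed to `model.save`.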

Well, the second episode of the “Deep Learning Lab” series, CIFAR-10, ends here. Thank you for spending the time with me. For comments and suggestions, please e-mail me. You can also contact me via LinkedIn. Thank you.

fk.


[Deep Learning Lab] Episode-2: CIFAR-10 was originally published in Deep Learning Turkey on Medium.

Source: Deep Learning on Medium