DC Comics logo classifier

Source: Deep Learning on Medium

Training an image classifier from scratch using TensorFlow 2.0. We will train a CNN to classify the logo of a particular character. In this example, I took five characters, namely Batman, Superman, Green Lantern, Wonder Woman and Flash. This will be an end-to-end article, covering everything from collecting data to saving the trained model.

So after giving this image as input, our predicted class will be ‘Batman’.

Prerequisites

  1. Knowledge of Python
  2. Google Account: As we will be using Google Colab

So time to get our hands dirty!

First, we will collect data using GoogleImagesDownload, a very handy Python package for downloading images from Google search. We will download images for each class (here we have five classes: batman, superman, green lantern, wonder woman and flash). Please refer to the documentation linked above for instructions on using the tool.

Here’s a link to ChromeDriver, if you face trouble finding it.

googleimagesdownload --keywords "batman logo" --chromedriver chromedriver --limit 300

I ran the above command in the command prompt, changing the search keywords each time, to obtain images for each class. I then kept only the files with a .jpg extension, since the tool also downloads files in other formats, and I had to manually delete some irrelevant images. Then I renamed the images. I did this for each class. Scripts for renaming and selecting only .jpg files are provided in the GitHub repository; you will just have to adjust the paths before executing them.

Finally, I made a folder named data that consisted of images for each class. The hierarchy of directories looked as shown in the snapshot below.

Hierarchy of directories

Now upload this folder to your Google Drive. After uploading it, we will create a new Colab notebook from this link. Google Colab gives us a Jupyter environment; you can refer to the Jupyter notebook in the GitHub repository. Now we will start with preprocessing, then define the model and train it.

!pip install tensorflow==2.0

So in the first cell, we installed TensorFlow 2.0. Now we will import all the packages we need.

import cv2
import os
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.utils import shuffle
from tensorflow.keras import layers, models
from google.colab import drive
drive.mount('/content/drive')

We use cv2 for processing images and os for dealing with paths. NumPy is used for numpy arrays, and TensorFlow for defining and training the model. Here I have used shuffle from sklearn.utils to shuffle the image data during the train-test split. Finally, drive from google.colab is used to mount Google Drive on the Colab notebook.

After the last line of the above cell is executed, it will provide a link that gives a verification token. Once the token is entered, Google Drive will be mounted on the Colab notebook. I have defined two functions, loadTrain() and readData(); loadTrain() helps with preprocessing the images. Preprocessing includes resizing, normalizing and assigning labels to the corresponding images.

validationSize = 0.2
imageSize = 128
numChannels = 3
dataPath = "/content/drive/My Drive/comic/data"
classes = os.listdir(dataPath)
numClasses = len(classes)
print("Classes : ", classes)
print("Number of classes : ", numClasses)
print("Training data Path : ",dataPath)

Here validationSize is set to 0.2, so 80% of the data will be used for training and 20% for testing. imageSize specifies the dimensions of the images given as input to the model. numChannels is 3, as our images are read in RGB.
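readData() (also defined in the notebook) wraps the loading step and carves off the validation split. A minimal sketch of that shuffle-and-split step, using shuffle from sklearn.utils as imported earlier (the function name and slicing here are my assumptions, not the notebook's exact code):

```python
import numpy as np
from sklearn.utils import shuffle

def train_valid_split(X, y, validationSize=0.2):
    """Shuffle the data, then hold out the last fraction
    of it as validation data."""
    X, y = shuffle(X, y, random_state=42)
    n_train = X.shape[0] - int(validationSize * X.shape[0])
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]
```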

data = readData(dataPath,classes,imageSize,validationSize)

X_train,y_train,names_train,cls_train = data.train.getData()
X_test,y_test,names_test,cls_test = data.valid.getData()

print("Training data X : " , X_train.shape)
print("Training data y : " , y_train.shape)
print("Testing data X : ",X_test.shape)
print("Testing data y : ",y_test.shape)

Now we have our train and test data ready. Time to define our model and train it.

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(5, activation='softmax'))

There’s a Conv layer followed by a max-pooling layer. The input shape is 128×128×3, since we resized our images to 128×128 resolution and 3 is the number of channels. Then we have another Conv layer followed by max-pooling, then one more Conv layer, after which the tensor is flattened. A dense layer then connects to the output layer, which consists of 5 units as we have five classes to classify.

model.summary()

We get a summary of our defined model. Before training, the model needs to be compiled with an optimizer and a loss function; then we train it.

model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # matches one-hot labels
              metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=4,
                    validation_data=(X_test, y_test))

I have also included plots and accuracy metrics; you can check them in my Jupyter notebook.

model.save("comic.h5")

We save our model to a .h5 file, but this is local to the Colab runtime, so we will also save it to Google Drive.

!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
model_file = drive.CreateFile({'title' : 'comic.h5'})
model_file.SetContentFile('comic.h5')
model_file.Upload()
drive.CreateFile({'id': model_file.get('id')})

So now our trained model is saved to Google Drive, from where it can easily be downloaded.

Now, to classify, I have written a script named classify.py. We pass the path of an image as a CLI argument, and the output is the predicted class. The model actually returns probabilities for each class; we select the one with the maximum probability.
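The actual classify.py is in the repository; a rough sketch of the idea is below (the class order and preprocessing here are assumptions and must match how the model was trained):

```python
import sys
import numpy as np

# class order must match the label order used during training
CLASSES = ["batman", "superman", "green lantern", "wonder woman", "flash"]

def predict_class(probs, classes=CLASSES):
    """Return the class whose probability is highest."""
    return classes[int(np.argmax(probs))]

if __name__ == "__main__":
    # heavy imports are only needed when running as a script
    import cv2
    import tensorflow as tf

    model = tf.keras.models.load_model("comic.h5")
    img = cv2.imread(sys.argv[1])
    img = cv2.resize(img, (128, 128)).astype(np.float32) / 255.0
    probs = model.predict(img[np.newaxis, ...])[0]  # batch of one image
    print(predict_class(probs))
```

Run as: python classify.py path/to/image.jpg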

Here we can see that the predicted output for the image below is Batman.

Here’s a link to my GitHub repo.

Further, I will try to write about my experience deploying it as an API on a cloud platform. You can also deploy it on mobile devices by converting the model to a lite version saved as a .tflite file; refer to the TensorFlow Lite documentation for details. Feel free to connect with me on LinkedIn, GitHub, and Instagram. Let me know about any improvements.

Thank You!