Dog Breed Classifier — using Convolutional Neural Networks

Source: Deep Learning on Medium

The things we do for Doggos !! Photo: Shutterstock

The main objective of the project is to create an application that identifies the breed of a dog from an input image. If the image contains a human face instead, the application recognizes that it is a human and returns the dog breed that most resembles that person. It also reports when neither a human nor a dog is present in the image.

The project is carried out in the following steps:

  • Step 0: Import Datasets
  • Step 1: Detect Humans
  • Step 2: Detect Dogs
  • Step 3: Create a CNN to Classify Dog Breeds (from Scratch)
  • Step 4: Use a CNN to Classify Dog Breeds (using Transfer Learning)
  • Step 5: Create a CNN to Classify Dog Breeds (using Transfer Learning)
  • Step 6: Write your Algorithm
  • Step 7: Test Your Algorithm

Step 0: Import Datasets

The project uses two datasets:

  • dog dataset: contains images of different dog breeds, used for training
  • human dataset: contains images of human faces

Analyzing the dog dataset, we find 133 dog categories and 8351 dog images in total.
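The counts above could be reproduced with a few lines of Python. This is a minimal sketch, assuming the dog images are unpacked into one sub-folder per breed; the folder layout, the `count_dataset` name and the list of image suffixes are assumptions for illustration, not the project's own code.

```python
from pathlib import Path

# File extensions treated as images (an assumption; adjust to the dataset).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_dataset(root):
    """Return (number of breed folders, number of image files) under root."""
    root = Path(root)
    # Each immediate sub-directory is treated as one breed category.
    breed_dirs = [d for d in root.iterdir() if d.is_dir()]
    n_images = sum(
        1
        for d in breed_dirs
        for f in d.rglob("*")
        if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
    )
    return len(breed_dirs), n_images
```

If the data is split into separate training, validation and test folders, the function would be called once per split and the image counts summed.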

Step 1: Detect Humans

We use OpenCV’s implementation of Haar feature-based cascade classifiers to detect human faces in images. We downloaded one of these pre-trained detectors for the project and wrote a face detector around it. Assessing it gives the following:

Face Detector Accuracy on human faces: 100.0%
Face Detector Accuracy on dog images: 11.0%
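A face detector of this kind, and the way percentages like the ones above are computed, can be sketched as follows. The cascade file (`haarcascade_frontalface_alt.xml`, as bundled with the `opencv-python` package) and the helper name `detection_rate` are assumptions; the project may use a different cascade and different evaluation code.

```python
def face_detector(img_path, cascade_path=None):
    """Return True if the Haar cascade finds at least one face in the image."""
    import cv2  # imported lazily so detection_rate below works without OpenCV

    if cascade_path is None:
        # Assumed default: the frontal-face cascade shipped with opencv-python.
        cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_alt.xml"
    cascade = cv2.CascadeClassifier(cascade_path)

    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(img_path)
    # Haar cascades operate on grayscale images.
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray)
    return len(faces) > 0


def detection_rate(detector, paths):
    """Percentage of image paths for which detector(path) returns True."""
    paths = list(paths)
    if not paths:
        return 0.0
    hits = sum(1 for p in paths if detector(p))
    return 100.0 * hits / len(paths)
```

Calling `detection_rate(face_detector, ...)` once on a sample of human images and once on a sample of dog images gives two figures of the kind reported above. The value on dog images counts false positives, so lower is better there.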

Image from the project

Step 2: Detect Dogs

We use a pre-trained ResNet-50 model to detect dogs in images. Given an image, this pre-trained ResNet-50 model returns a prediction for the object contained in the image. Assessing the dog detector gives the following:

Dog Detector Accuracy on human faces: 0.0%
Dog Detector Accuracy on dog images: 100.0%
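One way to turn an ImageNet classifier into a dog detector relies on the fact that, in the 1000-class ImageNet labelling, indices 151 through 268 are all dog breeds. The sketch below assumes the Keras build of ResNet-50 with ImageNet weights and a 224x224 input; the function names are illustrative, not taken from the project.

```python
from functools import lru_cache

# ImageNet-1k class indices 151..268 (inclusive) are dog breeds.
DOG_CLASS_RANGE = range(151, 269)


def is_dog_class(class_index):
    """True if an ImageNet class index falls in the dog-breed block."""
    return class_index in DOG_CLASS_RANGE


@lru_cache(maxsize=1)
def _load_resnet50():
    # Heavy import kept local; the model is built once and then reused.
    from tensorflow.keras.applications.resnet50 import ResNet50

    return ResNet50(weights="imagenet")


def dog_detector(img_path):
    """Return True if ResNet-50's top prediction for the image is a dog breed."""
    import numpy as np
    from tensorflow.keras.applications.resnet50 import preprocess_input
    from tensorflow.keras.utils import img_to_array, load_img

    img = load_img(img_path, target_size=(224, 224))
    batch = np.expand_dims(img_to_array(img), axis=0)  # shape (1, 224, 224, 3)
    probs = _load_resnet50().predict(preprocess_input(batch))
    return is_dog_class(int(np.argmax(probs)))
```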

Step 3: Create a CNN to Classify Dog Breeds (from Scratch)

I chose to use the hinted architecture as a base, but extended it by adding one more layer at the start of the network, going from the raw image to 8 filters and then to 16, rather than straight to 16. My thinking was that by having the network learn fewer features in the first step, it might learn some fundamental features well and then use these in subsequent layers to construct more complex filters. The alternative, learning a multitude of filters in the first layer, could leave many of those filters as duplicates or potentially “dead filters”.

Furthermore, I added some dropout layers after witnessing overfitting in my loss plots. This proved useful in the end. The target was to achieve a CNN with >1% accuracy. The network described above achieved 7.76% without any fine-tuning of parameters and without any data augmentation.
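To make the architecture concrete, here is a rough Keras sketch of a network of this shape. Only the 8-then-16 filter progression, the use of dropout and the 133 output classes come from the text; the later filter counts (32, 64), the 2x2 kernels, the 224x224 input, the dropout rate and its placement are all assumptions. The second function is plain arithmetic that counts the trainable parameters of such a stack.

```python
def build_scratch_cnn(filters=(8, 16, 32, 64), num_classes=133, dropout=0.2):
    """Small CNN: conv/pool/dropout blocks, global pooling, softmax output."""
    from tensorflow import keras  # imported lazily; only needed to build the model
    from tensorflow.keras import layers

    model = keras.Sequential()
    model.add(keras.Input(shape=(224, 224, 3)))  # assumed input size
    for n in filters:
        model.add(layers.Conv2D(n, kernel_size=2, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=2))
        model.add(layers.Dropout(dropout))
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def trainable_params(filters=(8, 16, 32, 64), num_classes=133, kernel=2, in_channels=3):
    """Parameter count of the stack above (conv weights + biases, then dense)."""
    total, channels = 0, in_channels
    for n in filters:
        total += kernel * kernel * channels * n + n
        channels = n
    return total + channels * num_classes + num_classes


print(trainable_params((8, 16, 32, 64)))  # with the extra 8-filter layer: 19613
print(trainable_params((16, 32, 64)))     # without it: 19189
```

Under these assumptions the extra first layer adds only a few hundred parameters. For context, random guessing over 133 classes gives roughly 0.75% accuracy (1/133), so the >1% target only asks for slightly better than chance.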

Step 4 and 5: Use/Create a CNN to Classify Dog Breeds (using Transfer Learning)

By using the pre-trained model we get to utilize a much deeper and more complex network architecture without the need to train such a deep network, which would require expensive hardware and lots of time (and finesse).

Instead of blindly picking one model, I evaluate two of the models on a test run to determine which has the best time/accuracy trade-off, and then build around that model. Since Xception and VGG19 are larger in size (nearly 3GB), I expect them to take much longer to train. Let us start with ResNet50 and VGG16 and check whether they give good enough accuracy; otherwise we will fall back to Xception or VGG19. Let us evaluate and plot the results of these two models.

ResNet50 (left) / VGG16 (right)

From the results, it’s clear that ResNet50 is the best choice, with an accuracy of ~81% and a training time of < 40 seconds for 20 epochs. Considering that the requirement for this assignment is an accuracy of >60%, I will use the fastest model that achieves this. So let’s go with ResNet50.
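The selection rule used here (take the fastest model that clears the accuracy requirement) can be written down in a few lines. This is only an illustrative sketch: the function name is hypothetical, and the numbers in the example are placeholders, not measurements from this project.

```python
def pick_fastest_model(results, min_accuracy):
    """Return the name of the fastest model whose accuracy exceeds min_accuracy.

    results maps model name -> (accuracy, training_seconds).
    Returns None if no model clears the threshold.
    """
    eligible = {name: r for name, r in results.items() if r[0] > min_accuracy}
    if not eligible:
        return None
    # Among the eligible models, choose the one with the shortest training time.
    return min(eligible, key=lambda name: eligible[name][1])


# Placeholder values for illustration only (not results from the project).
example = {
    "model_a": (0.81, 40.0),
    "model_b": (0.55, 90.0),
    "model_c": (0.85, 300.0),
}
print(pick_fastest_model(example, min_accuracy=0.60))  # prints "model_a"
```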

Tuning the dropout percentage to reduce over-fitting.

The ResNet50 model shows some heavy overfitting. After trying several dropout percentages to reduce it, I found that the loss graph with dropout=20% looks best, so I continue with the ResNet50 architecture and 20% dropout.

Testing this model gives an accuracy of 81.92%.
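A transfer-learning classifier with tunable dropout might look like the sketch below. It assumes the ResNet50 convolutional base is frozen and its output features are precomputed, so only a small head is trained. The head layout, the `feature_shape` parameter and the "lowest validation loss" criterion in `best_dropout` are assumptions; the text above chose the rate by inspecting loss graphs.

```python
def build_transfer_head(feature_shape, num_classes=133, dropout=0.2):
    """Classifier head trained on features from a frozen, pre-trained base.

    feature_shape is the (height, width, channels) shape of one feature map.
    """
    from tensorflow import keras  # imported lazily; only needed to build the model
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=feature_shape),
        layers.GlobalAveragePooling2D(),
        layers.Dropout(dropout),  # the rate being tuned
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def best_dropout(val_losses_by_rate):
    """Pick the dropout rate whose run reached the lowest validation loss.

    val_losses_by_rate maps dropout rate -> list of per-epoch validation losses.
    """
    return min(val_losses_by_rate, key=lambda rate: min(val_losses_by_rate[rate]))
```

In a sweep, one head would be built and trained per candidate rate, with the validation-loss history of each run collected into the dictionary passed to `best_dropout`.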

Step 6: Write your Algorithm

Here we are creating our algorithm to analyze any image. The algorithm accepts a file path and:

  • if a dog is detected in the image, return the predicted breed.
  • if a human is detected in the image, return the resembling dog breed.
  • if neither is detected in the image, provide output that indicates an error.

The algorithm collects together functions we used previously to create a final output and shows the image.
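The decision logic above can be sketched as a single function. To keep the example self-contained, the two detectors and the breed predictor are passed in as callables; their names and the exact output strings are illustrative assumptions, and displaying the image is left out.

```python
def describe_image(img_path, dog_detector, face_detector, predict_breed):
    """Return a message describing what was found in the image at img_path.

    dog_detector and face_detector take a path and return True/False;
    predict_breed takes a path and returns a breed name.
    """
    if dog_detector(img_path):
        return f"Dog detected. Predicted breed: {predict_breed(img_path)}"
    if face_detector(img_path):
        return f"Human detected. Resembling breed: {predict_breed(img_path)}"
    return "Error: neither a dog nor a human face was detected."
```

Checking for a dog first means an image that triggers both detectors is treated as a dog, which seems the safer default given that the face detector also fires on some dog images.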

Step 7: Test Your Algorithm

The results were pretty good for the images the model was shown. The algorithm was able to identify dog breeds quite accurately.

I used an image of my house and a picture of myself. It found no humans or dogs in the first picture, identified me as a human in the second, and told me I look like a Black Russian Terrier. Oops XD !!

We achieved an accuracy of 85%. By working on the areas listed below, I’m sure we could increase the testing accuracy of the model to above 90%.

To summarise, possible improvements are:

  1. analysis of images used to train and validate the model
  2. data augmentation
  3. fine-tuning of hyperparameters