Dog Identification App

Source: Deep Learning on Medium

In this blog, I will walk through the dog identification app, which uses Convolutional Neural Networks (CNNs)! In this project, I will also build a pipeline to process real-world, user-supplied images.


The aim of the project is to create a web application that can identify a dog’s breed from a photo. Given an image of a dog, the algorithm estimates the canine’s breed; if supplied an image of a human, the code identifies the most resembling dog breed.

The dataset for this project is provided by Udacity. We are using Python 3.6 with Keras on a TensorFlow backend. We’ll also use transfer learning, and clearly see how transfer learning can be an effective way to train a machine learning model with less data and lower compute resources.

The Road Ahead

We break the notebook and this blog into separate steps.

  • Step 0: Import Datasets
  • Step 1: Detect Humans
  • Step 2: Detect Dogs
  • Step 3: Create a CNN to Classify Dog Breeds (from Scratch)
  • Step 4: Use a CNN to Classify Dog Breeds (using Transfer Learning)
  • Step 5: Create a CNN to Classify Dog Breeds (using Transfer Learning)
  • Step 6: Write your Algorithm
  • Step 7: Test Your Algorithm

Code can be downloaded from the Github Repo

Step 0: Import Datasets

In this step, we will import the dog dataset and set up the libraries needed for the app.

from sklearn.datasets import load_files 
from keras.utils import np_utils
import numpy as np
from glob import glob
import cv2
import matplotlib.pyplot as plt
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from tqdm import tqdm
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential
%matplotlib inline
# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
    return dog_files, dog_targets
# load train, test, and validation datasets
train_files, train_targets = load_dataset('../../../data/dog_images/train')
valid_files, valid_targets = load_dataset('../../../data/dog_images/valid')
test_files, test_targets = load_dataset('../../../data/dog_images/test')
# load list of dog names
dog_names = [item[20:-1] for item in sorted(glob("../../../data/dog_images/train/*/"))]
# print statistics about the dataset
print('There are %d total dog categories.' % len(dog_names))
print('There are %s total dog images.\n' % len(np.hstack([train_files, valid_files, test_files])))
print('There are %d training dog images.' % len(train_files))
print('There are %d validation dog images.' % len(valid_files))
print('There are %d test dog images.'% len(test_files))

We rely heavily on Keras to build the CNN; we also use sklearn for dataset loading, OpenCV and PIL for image work, matplotlib for viewing the images, and NumPy for processing tensors.

tqdm provides a smart progress meter so you can see how your for loops are progressing, and glob finds all path names matching a specified pattern.
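As a toy illustration of that pattern matching (made-up breed folders in a temporary directory, not the project’s data), this is what lets load_dataset discover one folder per breed:

```python
# Toy illustration: glob returns every path matching a pattern, here one
# directory per breed, mirroring dog_images/train/<breed>/.
import os
import tempfile
from glob import glob

tmp = tempfile.mkdtemp()
for breed in ("001.Affenpinscher", "002.Afghan_hound", "003.Airedale_terrier"):
    os.makedirs(os.path.join(tmp, breed))

# The trailing "/" restricts matches to directories.
class_dirs = sorted(glob(os.path.join(tmp, "*/")))
print(len(class_dirs))  # 3 breed folders found
```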

The dog_names variable stores a list of the names for the classes which we will use in our final prediction model. If everything has worked, we’ll get the following result:

There are 133 total dog categories.
There are 8351 total dog images.

There are 6680 training dog images.
There are 835 validation dog images.
There are 836 test dog images.

Step 1: Detect Humans

We use OpenCV’s implementation of Haar feature-based cascade classifiers to detect human faces in images. OpenCV provides many pre-trained face detectors, stored as XML files on github. We have downloaded one of these detectors and stored it in the haarcascades directory.

import cv2 
import matplotlib.pyplot as plt
%matplotlib inline
# extract pre-trained face detector
face_cascade = cv2.CascadeClassifier('haarcascades/haarcascade_frontalface_alt.xml')
# load color (BGR) image; human_files is assumed to have been loaded earlier
# in the notebook (e.g. human_files = np.array(glob("lfw/*/*")))
img = cv2.imread(human_files[3])
# convert BGR image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# find faces in image
faces = face_cascade.detectMultiScale(gray)
# print number of faces detected in the image
print('Number of faces detected:', len(faces))
# get bounding box for each detected face
for (x, y, w, h) in faces:
    # add bounding box to color image
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
# convert BGR image to RGB for plotting
cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# display the image, along with bounding boxes
plt.imshow(cv_rgb)
plt.show()
# returns "True" if face is detected in image stored at img_path
def face_detector(img_path):
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray)
    return len(faces) > 0

The code below is just a performance test of our face detector.

human_files_short = human_files[:100]
dog_files_short = train_files[:100]
# Do NOT modify the code above this line.
## TODO: Test the performance of the face_detector algorithm
## on the images in human_files_short and dog_files_short.
detected_faces_in_humans = 0
detected_faces_in_dogs = 0
for ii in range(100):
    if face_detector(human_files_short[ii]):
        detected_faces_in_humans += 1
    if face_detector(dog_files_short[ii]):
        detected_faces_in_dogs += 1
print(f"Detected human faces: {detected_faces_in_humans}%")
print(f"Detected faces in dogs: {detected_faces_in_dogs}%")

The results are not perfect but acceptable.

  • Detected human faces: 100%
  • Detected faces in dogs: 11%

Step 2: Detect Dogs

For the dog detector we have used the pretrained Resnet50 network. The weights used were the standard ones for the dataset imagenet.

For Keras, the images need to be in four dimensions. In the path_to_tensor function we process a single image, so the output is (1, 224, 224, 3): 1 image, 224 pixels wide, 224 pixels high, and 3 colours (red, green and blue).

The image is loaded using the PIL library and resized to 224×224. The img_to_array method separates the colours into (224, 224, 3), and finally we add a dimension at the front using NumPy’s expand_dims function to obtain our (1, 224, 224, 3).

The function paths_to_tensor then stacks the images returned from path_to_tensor into a 4D tensor covering all the images in the training, validation or test dataset, depending on which set of paths is passed in.

from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from tqdm import tqdm
# define ResNet50 model
ResNet50_model = ResNet50(weights='imagenet')

def path_to_tensor(img_path):
    # loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return it
    return np.expand_dims(x, axis=0)

def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in tqdm(img_paths)]
    return np.vstack(list_of_tensors)

Now we are ready to make predictions. The function shown below, after completing the preprocessing steps above, uses the predict function to obtain an array over ImageNet’s 1000 classes. We use NumPy’s argmax function to isolate the class with the highest probability, and ImageNet’s dictionary to identify the name of that class.

from keras.applications.resnet50 import preprocess_input, decode_predictions

def ResNet50_predict_labels(img_path):
    # returns prediction vector for image located at img_path
    img = preprocess_input(path_to_tensor(img_path))
    return np.argmax(ResNet50_model.predict(img))

You will notice that the categories corresponding to dogs appear in an uninterrupted sequence, corresponding to dictionary keys 151–268, and this explains the return value of the dog_detector function below: if the prediction falls within this range (151 to 268), it returns True.

### returns "True" if a dog is detected in the image stored at img_path
def dog_detector(img_path):
    prediction = ResNet50_predict_labels(img_path)
    return ((prediction <= 268) & (prediction >= 151))

### Test the performance of the dog_detector function
### on the images in human_files_short and dog_files_short.
detected_dogs_in_humans = 0
detected_dogs_in_dogs = 0
for ii in range(100):
    if dog_detector(human_files_short[ii]):
        detected_dogs_in_humans += 1
        print(f"This human ({ii}) looks like a dog")
    if dog_detector(dog_files_short[ii]):
        detected_dogs_in_dogs += 1

print(f"Percentage of the images in human_files_short that have a detected dog: {detected_dogs_in_humans}%")
print(f"Percentage of the images in dog_files_short that have a detected dog: {detected_dogs_in_dogs}%")
  • Percentage of the images in human_files_short that have a detected dog: 0%
  • Percentage of the images in dog_files_short that have a detected dog: 100%

Step 3: Create a CNN to Classify Dog Breeds (from Scratch)

We’ll create a CNN model from scratch here.

The network I chose consisted of 3 convolution layers, each followed by a max-pooling layer to reduce the spatial dimensions and increase the depth. The filters used were 16, 32 and 64 respectively.

At the end, a fully connected layer with 133 nodes was added to match our dog-breed classes, with a softmax activation function to obtain probabilities for each class. Dropout was not added, to keep the network very primitive. Adam with its default settings was used as the optimizer.

The target was to achieve a CNN model with >1% accuracy. The network described above achieved 3.466% without any augmentation of the data.
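A sketch of what that network could look like, reconstructed from the description above (the kernel size and the global average pooling before the classifier are my assumptions, not confirmed by the post):

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense

# Three conv blocks: each convolution increases the depth (16 -> 32 -> 64
# filters), each max-pooling layer halves the spatial dimensions.
model = Sequential()
model.add(Conv2D(16, 2, activation='relu', input_shape=(224, 224, 3)))
model.add(MaxPooling2D(2))
model.add(Conv2D(32, 2, activation='relu'))
model.add(MaxPooling2D(2))
model.add(Conv2D(64, 2, activation='relu'))
model.add(MaxPooling2D(2))
# Collapse the feature maps, then classify into the 133 breeds.
model.add(GlobalAveragePooling2D())
model.add(Dense(133, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```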

Step 4: Use a CNN to Classify Dog Breeds (using Transfer Learning)

In this section of the Jupyter notebook, we are walked through using one of the pretrained networks available in Keras.

Bottleneck features come from taking a pre-trained model, in our case VGG16, chopping off its top classifying layers, and then feeding this “chopped” VGG16 as the first stage of our model.

The bottleneck features are the last activation maps in VGG16 (the fully connected classification layers having been cut off), which makes it an effective feature extractor.

I haven’t included any code here, as we will follow the same process in step 5 with the VGG16 pretrained model used for transfer learning. Check out the code in the Github repo.

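In Keras, this “chopping” is a single flag, include_top=False. A minimal sketch (weights=None here just to keep the sketch light; the project uses the ImageNet weights):

```python
from keras.applications.vgg16 import VGG16

# include_top=False removes VGG16's fully-connected classifier, leaving the
# convolutional base. weights=None avoids the ImageNet weight download in
# this sketch; the project would use weights='imagenet'.
vgg16_base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
print(vgg16_base.output_shape)  # the last activation maps: (None, 7, 7, 512)
```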

Step 5: Create a CNN to Classify Dog Breeds (using Transfer Learning)

Now let’s build a CNN to classify dog breeds using transfer learning.

I decided to use the VGG16 model, as I am very comfortable with it.

Udacity has prepared in advance the extraction of bottleneck features for each of the pretrained networks on the dog images. So it is only necessary to download the file for our network.

These bottleneck-feature variables contain images that have already been put through the bottleneck extractor. This will make the training of our model very quick: since the main identifying features have already been isolated, we have only a small number of parameters, or weights, to backpropagate through.

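In the notebook, the pre-computed features are loaded roughly like this (file name taken from the Udacity project template); a tiny synthetic stand-in shows the shape involved:

```python
import numpy as np

# In the notebook (file name from the Udacity template, not downloadable here):
#   bottleneck_features = np.load('bottleneck_features/DogVGG16Data.npz')
#   train_VGG16 = bottleneck_features['train']
# Synthetic stand-in with the same per-image shape as VGG16's last
# activation maps, one (7, 7, 512) block per input image:
train_VGG16 = np.random.rand(8, 7, 7, 512).astype('float32')
print(train_VGG16.shape)  # (8, 7, 7, 512)
```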

Here we have defined the architecture that sits on top of the pre-trained model.

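One plausible version of that architecture (a sketch; the rmsprop optimizer is my assumption): a global-average-pooling layer collapses each 7×7×512 bottleneck block, followed by the 133-way softmax classifier.

```python
from keras.models import Sequential
from keras.layers import GlobalAveragePooling2D, Dense

# Only this small head is trained on the bottleneck features, which is why
# training is fast: ~68k weights instead of VGG16's ~138 million.
VGG16_model = Sequential()
VGG16_model.add(GlobalAveragePooling2D(input_shape=(7, 7, 512)))
VGG16_model.add(Dense(133, activation='softmax'))
VGG16_model.compile(loss='categorical_crossentropy', optimizer='rmsprop',
                    metrics=['accuracy'])
```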

Test accuracy (VGG16): 73.32%

Step 6: Write your Algorithm

Here we create an algorithm that accepts a file path to an image and first determines whether the image contains a human, a dog, or neither. Then,

  • if a dog is detected in the image, return the predicted breed.
  • if a human is detected in the image, return the resembling dog breed.
  • if neither is detected in the image, provide output that indicates an error.

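The dispatch logic can be sketched as follows, with face_detector and dog_detector from Steps 1 and 2 passed in, and predict_breed standing in (a hypothetical name) for the transfer-learning model’s breed prediction:

```python
def identify(img_path, dog_detector, face_detector, predict_breed):
    """Return a message describing what was found in the image."""
    if dog_detector(img_path):
        # a dog: report the predicted breed
        return f"This dog looks like a {predict_breed(img_path)}."
    if face_detector(img_path):
        # a human: report the resembling breed
        return f"This human resembles a {predict_breed(img_path)}."
    # neither: signal an error
    return "Error: neither a dog nor a human was detected."

# Toy usage with stub detectors in place of the real models:
print(identify("photo.jpg", lambda p: False, lambda p: True, lambda p: "Beagle"))
# → This human resembles a Beagle.
```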

Step 7: Test Your Algorithm

Let us now test the algorithm on a few dog images; the app worked fine.

Here are the results