How To Label Where’s Wally/Waldo images using OpenCV and Deep Learning (YOLO v2)

Source: Deep Learning on Medium

There are three steps to achieve our goal: first we build a dataset, then we train a YOLO model, and finally we create an output image highlighting Wally.

Step 1: Wally Dataset

Our dataset was composed of images that we found online and on Kaggle, extended with data augmentation (vertical flip, brightness modification) using Keras (ImageDataGenerator). As a second task, we labeled the images, highlighting the location of Wally in each one. LabelImg was used for this purpose, since it is a great tool which allows annotating images in Pascal VOC format (it generates XML files).
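To make the two augmentations concrete, here is a minimal pure-Python sketch of what they do to a pixel grid. The helper names are my own for illustration; in practice this is what `ImageDataGenerator(vertical_flip=True, brightness_range=...)` handles for you on real image arrays.

```python
# Sketch of the two augmentations on a tiny "image": a nested list of
# grayscale pixel values in the 0-255 range (illustrative only).

def vertical_flip(image):
    """Flip the image top-to-bottom by reversing the row order."""
    return image[::-1]

def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clamping to the valid 0-255 range."""
    return [[min(255, max(0, int(px * factor))) for px in row]
            for row in image]

if __name__ == "__main__":
    img = [[10, 20],
           [30, 40]]
    print(vertical_flip(img))            # [[30, 40], [10, 20]]
    print(adjust_brightness(img, 1.5))   # [[15, 30], [45, 60]]
```

Each augmented copy counts as a new training example, which is how a few hundred source images can be stretched into a more varied dataset.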

Images from our dataset with annotation in LabelImg

As a result of this task, we built a dataset with 350 images.
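For reference, a LabelImg annotation in Pascal VOC format looks roughly like the XML below (the filename, image size, and box coordinates are made up for illustration; the `<name>` tag holds our single class, wally):

```xml
<annotation>
    <folder>images</folder>
    <filename>wally_001.jpg</filename>
    <size>
        <width>1600</width>
        <height>1200</height>
        <depth>3</depth>
    </size>
    <object>
        <name>wally</name>
        <bndbox>
            <xmin>742</xmin>
            <ymin>315</ymin>
            <xmax>790</xmax>
            <ymax>388</ymax>
        </bndbox>
    </object>
</annotation>
```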

Step 2: Applying a Deep Learning Architecture

In order to detect Wally we used the YOLO algorithm, a deep learning object detection architecture based on convolutional neural networks.

YOLO is a single network, trained end to end, that performs a regression task predicting both object bounding boxes and object classes. It was proposed by Joseph Redmon, Ali Farhadi, Ross Girshick and Santosh Divvala in 2015.

The paper is available at this link.
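To give a sense of the regression formulation: in the original YOLO paper, the image is divided into an S × S grid, and the network outputs a single tensor encoding, for each cell, B bounding boxes (each with x, y, w, h and a confidence score) plus C class probabilities:

```latex
% Output tensor size in the original YOLO formulation
S \times S \times (B \cdot 5 + C)
% e.g. S = 7, B = 2, C = 20 gives a 7 x 7 x 30 tensor on PASCAL VOC
```

In our case there is only one class (wally), so C = 1.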

I have used an open source implementation, “Darkflow”, so you don’t need to worry about the details.

Installing Darkflow

I am currently updating the original version of Darkflow, refreshing its dependencies and adding new features. For this reason, I strongly recommend that you use my GitHub repository to follow this tutorial. In the repository, you will find the complete installation guide and other important instructions. Here I am just going to keep the instructions as simple as possible (using conda and pip).

conda install Cython
git clone https://github.com/arthurfortes/darkflow.git
cd darkflow
python setup.py build_ext --inplace
pip install .

Downloading Weights

To train the YOLO model it is recommended that you start from the weights and configuration files of the already-trained deep learning network. There are two ways of downloading these files: you can download them from the official YOLO project webpage, or from my weights repository.

Building and Training the Model

In order to organize our project, we need to follow this directory tree:

Wally_Project
|- built_graph
|- cfg
|---- yolov2-wally.cfg
|---- yolov2-wally.weights
|- ckpt
|- labels.txt

The files downloaded in the last section must be placed in the cfg directory, and the other directories may stay empty. The labels.txt file must contain, one per line, the labels that our model will predict. In our case, the file contains only one line: “wally”.
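If you prefer to script this setup, a small helper along these lines creates the tree and the labels file (the directory names follow the tree above; the function itself is just an illustration):

```python
import os

def create_project(root="Wally_Project"):
    """Create the project tree described above, plus a labels.txt
    containing the single 'wally' class."""
    for sub in ("built_graph", "cfg", "ckpt"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    with open(os.path.join(root, "labels.txt"), "w") as fh:
        fh.write("wally\n")

if __name__ == "__main__":
    create_project()
```

You would still copy the downloaded .cfg and .weights files into cfg by hand.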

As you can see in the darkflow repository, it is quite simple to build the model. First, you need to create a train_model.py file and define an options object. Then, you need to instantiate a TFNet object with those options.

from darkflow.net.build import TFNet
import cv2

options = {
    "model": "cfg/yolov2-wally.cfg",
    "load": "cfg/yolov2-wally.weights",  # use -1 instead to resume from the latest checkpoint
    "batch": 8,
    "epoch": 1000,
    # comment the line below if you don't have a GPU
    "gpu": 0.8,
    "train": True,
    "annotation": "data/total/annotations/",
    "dataset": "data/total/images/"
}

tfnet = TFNet(options)
tfnet.train()

The options object is a specification of the model and its training environment. These and other available settings are described below.

'imgdir': path to testing directory with images 
'binary': path to .weights directory
'config': path to .cfg directory
'dataset': path to dataset directory
'labels': path to labels file
'backup': path to backup folder
'summary': path to TensorBoard summaries directory
'annotation': path to annotation directory
'threshold': detection threshold
'model': configuration of choice
'trainer': training algorithm
'momentum': applicable for rmsprop and momentum optimizers
'verbalise': say out loud while building graph
'train': train the whole net
'load': how to initialize the net? Either from .weights or a checkpoint, or even from scratch
'savepb': save net and weight to a .pb file
'gpu': how much gpu (from 0.0 to 1.0)
'gpuName': GPU device name
'lr': learning rate
'keep': Number of most recent training results to save
'batch': batch size
'epoch': number of epoch
'save': save checkpoint every ? training examples
'demo': demo on webcam
'queue': process demo in batch
'json': Outputs bounding box information in json format
'saveVideo': Records video from input video or camera
'pbLoad': path to .pb protobuf file (metaLoad must also be specified)
'metaLoad': path to .meta file generated during --savepb that corresponds to .pb file
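To make the threshold option concrete: at prediction time, Darkflow returns detections with a confidence score, and anything below the threshold is discarded. A minimal sketch of that filtering step (the sample predictions below are made up for illustration):

```python
def apply_threshold(predictions, threshold):
    """Keep only detections whose confidence meets the threshold,
    mimicking what the 'threshold' option does inside Darkflow."""
    return [p for p in predictions if p["confidence"] >= threshold]

if __name__ == "__main__":
    preds = [
        {"label": "wally", "confidence": 0.87},
        {"label": "wally", "confidence": 0.12},
    ]
    print(apply_threshold(preds, 0.5))  # only the 0.87 detection survives
```

A higher threshold means fewer false Wallys, at the cost of possibly missing him when the model is unsure.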

As I suggested downloading the weights from here, I specified that file in the load option. If you have your own pre-trained weight files, this is where you let the model know (for example, after you train on custom objects, your system will produce a specific weight file).

That done, just run the script with this command:

python train_model.py

and wait a few minutes … hours … days …

Good time to have a coffee and see how the day is outside.