Guide to TensorFlow Object Detection (TensorFlow 2)

Original article was published by Rohit Prakash on Deep Learning on Medium

2. Preparing the Dataset

There are two ways to go about this:

  • Use a Public Labelled Dataset
  • Create a Custom Labelled Dataset

Public labelled datasets are available online. They are already annotated and saved in the right format, ready to be used for training.

For this tutorial, we will be creating our own dataset from scratch.

First things first, gather the images for the dataset. I will assume this step has already been done.

Now we need to label the images. There are many popular labeling tools; we will be using LabelImg.

To install LabelImg, run the following command (do this in your local terminal, since Colab does not support GUI applications):

pip install labelImg

Launch LabelImg in the folder where your images are stored.

labelImg imagesdir

Now you can start labeling your images. For more information on how to label them, see the LabelImg repository.


Create a label map file (label_map.pbtxt) in a text editor such as Notepad. For example, with two classes, cars and bikes:

item {
  id: 1
  name: 'car'
}

item {
  id: 2
  name: 'bike'
}
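For more than a handful of classes, writing the label map by hand gets error-prone. The following is a minimal sketch that generates the same text from a list of class names; the function name build_label_map is our own and not part of the Object Detection API. Ids start at 1, as in the example above.

```python
def build_label_map(class_names):
    """Return label map text with one item per class, ids starting at 1."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append(f"item {{\n  id: {class_id}\n  name: '{name}'\n}}\n")
    return "\n".join(items)


if __name__ == "__main__":
    # Print the label map for the two classes used in this tutorial
    print(build_label_map(["car", "bike"]))
```

Saving the returned string to label_map.pbtxt gives a file equivalent to the hand-written one.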

Now for creating the TFRecord files.

There are two options:

  • Create the TFRecord files ourselves
  • Upload the annotations to Roboflow and get the dataset in TFRecord format

Creating the TFRecords ourselves is a bit tedious, because the XML produced during annotation can vary, so for the sake of ease I suggest using Roboflow for this task. Roboflow also offers optional data augmentation, which increases the size of the dataset.

For your reference, here is a sample .py script to create the TFRecords manually.

Run the above code once for the training images and once for the test images to create train.tfrecord and test.tfrecord respectively, changing the following variables each time (shown here for the test set):

xml_dir = 'images/test'
image_dir = 'images/test'
output_path = 'annotations/test.tfrecord'
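To give an idea of what such a script has to do, here is a minimal sketch of its parsing step only. It reads one Pascal VOC XML annotation (the format LabelImg saves by default) and converts the pixel boxes into the normalized coordinates and class ids that go into a TFRecord. The function name parse_voc_xml, the CLASS_IDS dictionary and the sample annotation are illustrative assumptions, and writing the actual records with TensorFlow is left out.

```python
import xml.etree.ElementTree as ET

# Assumed mapping; it must match the ids in label_map.pbtxt
CLASS_IDS = {"car": 1, "bike": 2}

# Hypothetical annotation in the Pascal VOC layout
SAMPLE_XML = """<annotation>
  <filename>street_01.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>64</xmin><ymin>48</ymin><xmax>320</xmax><ymax>240</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_xml(xml_text, class_ids=CLASS_IDS):
    """Parse one annotation into box lists normalized to the [0, 1] range."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    record = {
        "filename": root.findtext("filename"),
        "width": width,
        "height": height,
        "xmin": [], "xmax": [], "ymin": [], "ymax": [],
        "class_text": [], "class_id": [],
    }
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        # Divide pixel coordinates by the image size to normalize them
        record["xmin"].append(float(box.findtext("xmin")) / width)
        record["xmax"].append(float(box.findtext("xmax")) / width)
        record["ymin"].append(float(box.findtext("ymin")) / height)
        record["ymax"].append(float(box.findtext("ymax")) / height)
        record["class_text"].append(name)
        record["class_id"].append(class_ids[name])
    return record


if __name__ == "__main__":
    print(parse_voc_xml(SAMPLE_XML))
```

A class name that is missing from CLASS_IDS raises a KeyError here, which is a deliberate choice: a typo in a label is better caught before training than silently dropped.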

If you use Roboflow, the TFRecord files are generated for you automatically.

Setting up on Colab

Create folders to store all the necessary files we have just created.

# The my_mobilenet folder is where our training results will be stored
!mkdir -p annotations exported-models pre-trained-models models/my_mobilenet

Now upload the newly created TFRecord files, along with the images and annotations, to Google Colab by clicking upload files.

Alternatively, you can store the necessary files in Google Drive. Once Drive is mounted in Colab, importing them is as simple as running a !cp command.
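If you prefer to do the copy from Python rather than with !cp, a small helper along these lines would work. It is only a sketch: the function name copy_matching and the file patterns are our own choices, and it assumes Drive has already been mounted.

```python
import shutil
from pathlib import Path


def copy_matching(src_dir, dst_dir, patterns=("*.tfrecord", "*.pbtxt")):
    """Copy files matching the glob patterns from src_dir to dst_dir.

    Returns the names of the copied files.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for path in sorted(src.glob(pattern)):
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```

On Colab, Drive is usually mounted under /content/drive, so a call might look like copy_matching('/content/drive/MyDrive/my_dataset', 'annotations'), where my_dataset is a hypothetical folder name.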

Download Pre-Trained Model

There are many models ready to download from the TensorFlow Model Zoo.

Be careful in choosing which model to use as some are not made for Object Detection. For this tutorial we will be using the following model:

SSD MobileNet V2 FPNLite 320×320.

Download it into your Colab Notebook and extract it by executing:

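%cd pre-trained-models
# Paste the model's download URL from the Model Zoo between the empty quotes
!curl "" --output "ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz"

import tarfile

model_name = 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8'
model_file = model_name + '.tar.gz'
# Assumed completion: extract the archive with the standard tarfile module
tar = tarfile.open(model_file)
tar.extractall()
tar.close()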

Your directory structure should now look like this:

├─ models/
│ ├─ community/
│ ├─ official/
│ ├─ orbit/
│ ├─ research/
│ ├─ my_mobilenet/
│ └─ ...
├─ annotations/
│ ├─ train/
│ └─ test/
├─ pre-trained-models/
└─ exported-models/

Editing the Configuration file

In the TF Object Detection API, all the settings and information required for training and evaluating the model are stored in the pipeline.config file.

Let us take a look at it:

The most important settings we will need to change are:

batch_size: 128
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
num_steps: 50000
fine_tune_checkpoint_type: "classification"
train_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED"
  }
}
eval_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED"
  }
}

batch_size is the number of images the model processes in each training step. A suitable value is 8. It can be higher or lower depending on the computing power (mainly GPU memory) available.

A good suggestion given on Stack Overflow is:

Max batch size = available GPU memory bytes / 4 / (size of tensors + trainable parameters)
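As a rough worked example of this rule of thumb, the sketch below plugs in some numbers. Reading the divisor 4 as the number of bytes in a 32-bit float is our interpretation, and the memory size, tensor size and parameter count are hypothetical values chosen only to show the arithmetic, not measurements of the model used here.

```python
def max_batch_size(gpu_memory_bytes, tensor_elements, trainable_params, bytes_per_value=4):
    """Estimate the largest batch size that fits in GPU memory."""
    values_that_fit = gpu_memory_bytes // bytes_per_value
    return values_that_fit // (tensor_elements + trainable_params)


if __name__ == "__main__":
    gpu_memory = 16 * 1024**3      # hypothetical 16 GiB GPU
    tensor_elements = 250_000_000  # hypothetical activation size per image
    trainable_params = 3_000_000   # hypothetical parameter count
    print(max_batch_size(gpu_memory, tensor_elements, trainable_params))
```

Treat the result as an upper bound to start from. In practice, we would try a value, watch for out-of-memory errors and adjust.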

fine_tune_checkpoint is the path to the checkpoint that training starts from (a checkpoint is how TensorFlow stores a model's weights).

If you are starting the training for the first time, set this to the checkpoint of the pre-trained model.

If you want to continue training from a previously trained checkpoint, set it to that checkpoint's path. (Training then resumes from the weights already learned instead of starting from scratch.)
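When continuing training, we need the path prefix of the most recent checkpoint. TensorFlow offers tf.train.latest_checkpoint for this; the sketch below does a similar lookup with the standard library only, assuming the checkpoint files follow the ckpt-N.index naming used in this tutorial. The function name latest_checkpoint_prefix is our own.

```python
import re
from pathlib import Path


def latest_checkpoint_prefix(checkpoint_dir):
    """Return the 'ckpt-N' path prefix with the highest N, or None if there is none."""
    best_number, best_prefix = -1, None
    for index_file in Path(checkpoint_dir).glob("ckpt-*.index"):
        match = re.fullmatch(r"ckpt-(\d+)", index_file.stem)
        if match and int(match.group(1)) > best_number:
            best_number = int(match.group(1))
            # Drop the .index suffix: the config expects the prefix only
            best_prefix = str(index_file.with_suffix(""))
    return best_prefix
```

The numbers are compared as integers, so ckpt-10 is correctly treated as newer than ckpt-2.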

# For fresh training
fine_tune_checkpoint: "pre-trained-models/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/checkpoint/ckpt-0"
# For continuing the training, use this line instead of the one above
# fine_tune_checkpoint: "exported-models/your_latest_batch/checkpoint/ckpt-0"
batch_size: 8 # Increase/decrease this value depending on how fast your training job runs and the compute resources available.
num_steps: 25000 # 25000 is a good number of steps to get a good loss.
fine_tune_checkpoint_type: "detection" # Set this to detection
train_input_reader {
  label_map_path: "annotations/label_map.pbtxt" # Set to location of label map
  tf_record_input_reader {
    input_path: "annotations/train.tfrecord" # Set to location of train TFRecord file
  }
}
# Similarly, do the same for the eval input reader
eval_input_reader {
  label_map_path: "annotations/label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "annotations/test.tfrecord"
  }
}
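Instead of editing pipeline.config by hand, the same changes can be scripted. The sketch below treats the config as plain text and rewrites the fields with regular expressions. This is a shortcut of our own, not the Object Detection API's way of handling configs, and the helper names set_field and configure as well as the trimmed-down sample config are assumptions for illustration.

```python
import re

# Trimmed-down stand-in for the relevant parts of pipeline.config
SAMPLE_CONFIG = '''
train_config {
  batch_size: 128
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
  num_steps: 50000
  fine_tune_checkpoint_type: "classification"
}
train_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED"
  }
}
eval_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED"
  }
}
'''


def set_field(config_text, field, *values):
    """Set each `field: ...` occurrence in order.

    The last value is reused if there are more occurrences than values.
    """
    pattern = re.compile(rf'(?<![\w.]){re.escape(field)}:\s*("[^"]*"|\S+)')
    seen = 0

    def replace(match):
        nonlocal seen
        value = values[min(seen, len(values) - 1)]
        seen += 1
        return f"{field}: {value}"

    return pattern.sub(replace, config_text)


def configure(config_text):
    """Apply the edits described in this section to the config text."""
    checkpoint = ("pre-trained-models/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8"
                  "/checkpoint/ckpt-0")
    text = set_field(config_text, "batch_size", 8)
    text = set_field(text, "num_steps", 25000)
    text = set_field(text, "fine_tune_checkpoint", f'"{checkpoint}"')
    text = set_field(text, "fine_tune_checkpoint_type", '"detection"')
    text = set_field(text, "label_map_path", '"annotations/label_map.pbtxt"')
    # First occurrence is the train reader, second is the eval reader
    text = set_field(text, "input_path",
                     '"annotations/train.tfrecord"', '"annotations/test.tfrecord"')
    return text


if __name__ == "__main__":
    print(configure(SAMPLE_CONFIG))
```

A real pipeline.config may contain a field more than once (for example, a second batch_size in the evaluation settings), so check the result and pass one value per occurrence where the values should differ.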

After editing the config file, we need to add the TensorFlow Object Detection folders to the Python path.

import os

os.environ['PYTHONPATH'] += ':/content/window_detection/models/:/content/window_detection/models/research/:/content/window_detection/models/research/slim/'
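One detail worth knowing: changing os.environ['PYTHONPATH'] affects processes started afterwards (such as scripts run with !python), but not the imports of the notebook's own interpreter, which reads sys.path. The following sketch updates both and also works when PYTHONPATH is not set yet. The helper name add_to_python_path is our own.

```python
import os
import sys


def add_to_python_path(paths):
    """Append paths to PYTHONPATH (for child processes) and sys.path (for this interpreter)."""
    current = [p for p in os.environ.get("PYTHONPATH", "").split(os.pathsep) if p]
    for path in paths:
        if path not in current:
            current.append(path)
        if path not in sys.path:
            sys.path.append(path)
    os.environ["PYTHONPATH"] = os.pathsep.join(current)
    return os.environ["PYTHONPATH"]
```

With the folders from this tutorial, the call would be add_to_python_path(['/content/window_detection/models/', '/content/window_detection/models/research/', '/content/window_detection/models/research/slim/']).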

Setting up TensorBoard on Colab to monitor the training process

Colab has built-in support for TensorBoard, which can be started with a simple magic command as follows:

%load_ext tensorboard
%tensorboard --logdir 'models/my_mobilenet'