Detailed Tutorial: Build Your Custom Real-Time Object Detector

Source: Deep Learning on Medium

8. Configuring the Training Pipeline.

This is the last step before we start training the model, finally! It is also the step where you are likely to spend some time tuning the model.

The Tensorflow Object Detection API repository we downloaded comes with many sample config files. For each model, there is a config file that is ‘almost’ ready to be used.

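The config files are located in models/research/object_detection/samples/configs/ (see the working directory tree at the end of this step).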


ssd_mobilenet_v2_coco.config is the config file for the pretrained model we are using. If you chose another model, you need to use and edit the corresponding config file.

Because we will probably have to tune the config repeatedly, I suggest doing the following:

  • View the content of the sample config file by running:
  • Copy the content of the config file
  • Edit it by using

Or, if you have everything synced, you can open and edit the config file directly on your local machine with any text editor.

Here are the edits that the sample config file requires, along with some suggested edits to improve your model's performance.

Required edits to the config file:

  1. model {} > ssd {}: change num_classes to the number of classes you have.

2. train_config {}: change fine_tune_checkpoint to the checkpoint file path.

Note: No file named exactly model.ckpt exists. It is the checkpoint prefix shared by files such as model.ckpt.index and model.ckpt.meta in the pretrained model folder. This is its relative path:


3. train_input_reader {}: set the path to the train_labels.record and the label map pbtxt file.

4. eval_input_reader {}: set the path to the test_labels.record and the label map pbtxt file.

That’s it! You can skip the optional edits and head to training!
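If you would rather script the four required edits than type them by hand, here is a minimal sketch. It treats the config as plain text and assumes that the first input_path belongs to train_input_reader and the second to eval_input_reader, which is how the sample configs are laid out. The function name apply_required_edits and all paths you pass in are illustrative, not part of the API.

```python
import re


def apply_required_edits(config_text, num_classes, checkpoint,
                         train_record, test_record, label_map):
    """Return config_text with the four required fields filled in."""

    def set_field(text, pattern, value):
        # A function replacement keeps backslashes in paths from being
        # interpreted as regex escapes.
        return re.sub(pattern, lambda m: m.group(1) + value, text)

    text = set_field(config_text, r'(num_classes:\s*)\d+', str(num_classes))
    text = set_field(text, r'(fine_tune_checkpoint:\s*)"[^"]*"',
                     f'"{checkpoint}"')
    # label_map_path appears in both input readers; both get the same file.
    text = set_field(text, r'(label_map_path:\s*)"[^"]*"', f'"{label_map}"')

    # Assumption: first input_path is the training record, second is the
    # evaluation record. Any further matches are left untouched.
    records = iter([train_record, test_record])

    def next_record(match):
        record = next(records, None)
        if record is None:
            return match.group(0)
        return f'{match.group(1)}"{record}"'

    return re.sub(r'(input_path:\s*)"[^"]*"', next_record, text)
```

You would read the sample config into a string, pass it through apply_required_edits, and write the result to your own copy of the config. Open the result and check it before training, since a text-based edit like this cannot validate the file.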

Suggested edits to the config file:

First, you might want to start training the model and see how well it does. If you are overfitting, then you might want to do some more image augmentation.

  • In the sample config file, random_horizontal_flip and ssd_random_crop are added by default. You could try adding these as well, inside train_config {}:

Note: Each image augmentation will increase the training time drastically.

There are many data augmentation options that you can add. Check the full list in the official code (preprocessor.proto under object_detection/protos/).

  • In model {} > ssd {} > box_predictor {}: set use_dropout to true. This will help counter overfitting.
  • In eval_config {}: set num_examples to the number of testing images you have, and remove max_evals to evaluate indefinitely.

Note: The notebook provided explains many more things in regard to tuning the config file. Check it out!
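The suggested edits can be scripted in the same plain-text fashion. The sketch below is only an illustration: the helper names are made up, and random_adjust_brightness is just one augmentation I picked as an example, so check the option names against the official list before using them.

```python
import re


def add_augmentations(config_text, option_names):
    """Insert one data_augmentation_options block per name into train_config."""
    blocks = "".join(
        f"\n  data_augmentation_options {{\n    {name} {{\n    }}\n  }}"
        for name in option_names
    )
    # Place the new blocks right after the opening brace of train_config.
    return re.sub(r'(train_config\s*:?\s*\{)',
                  lambda m: m.group(1) + blocks,
                  config_text, count=1)


def enable_dropout(config_text):
    """Flip use_dropout from false to true wherever it appears."""
    return re.sub(r'use_dropout:\s*false', 'use_dropout: true', config_text)


def tune_eval(config_text, num_examples):
    """Set num_examples and drop the max_evals line."""
    text = re.sub(r'(num_examples:\s*)\d+',
                  lambda m: m.group(1) + str(num_examples),
                  config_text)
    # Without max_evals, evaluation is no longer capped.
    return re.sub(r'[ \t]*max_evals:\s*\d+[ \t]*\n?', '', text)
```

Each helper takes the config text and returns a new string, so you can chain them, for example tune_eval(enable_dropout(add_augmentations(text, ["random_adjust_brightness"])), 120).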

The full working directory:
(Including some files/folders that will be created and used later)

├── data/
│ ├── images/
│ │ └── ...
│ ├── annotations/
│ │ └── ...
│ ├── train_labels/
│ │ └── ...
│ ├── test_labels/
│ │ └── ...
│ ├── label_map.pbtxt
│ ├── test_labels.csv
│ ├── train_labels.csv
│ ├── test_labels.records
│ └── train_labels.records

└── models/
├─ research/
│ ├── fine_tuned_model/
│ │ ├── frozen_inference_graph.pb
│ │ └── ...
│ │
│ ├── pretrained_model/
│ │ ├── frozen_inference_graph.pb
│ │ └── ...
│ │
│ ├── object_detection/
│ │ ├── utils/
│ │ ├── samples/
│ │ │ ├── configs/
│ │ │ │ ├── ssd_mobilenet_v2_coco.config
│ │ │ │ ├── rfcn_resnet101_pets.config
│ │ │ │ └── ...
│ │ │ └── ...
│ │ ├──
│ │ ├──
│ │ └── ...
│ │
│ ├── training/
│ │ ├── events.out.tfevents.xxxxx
│ │ └── ...
│ └── ...
└── ...

9. Tensorboard.

Tensorboard is the place where we can visualize everything that’s happening during training. You can monitor the loss, mAP, AR and many more.

You could also monitor the pictures and the annotations during training. At each evaluation step, you could see how good your model was at detecting the object.

Note: Remember when we set num_visualizations: 20 above? Tensorboard will display that many pictures of the testing images here.

To use Tensorboard on Colab, we need to use it through ngrok. Get it by running:

Next, we specify where the log files are stored and we configure a link to view Tensorboard:

When you run the code above, the end of the output will contain a URL through which you can access Tensorboard.


Notes:

  1. You might get an error instead of a URL when running the above code. Just run the cell again. There is no need to reinstall ngrok.
  2. Tensorboard will not log any files until the training starts.
  3. ngrok allows a maximum of 20 connections per minute, so you will very frequently be unable to access Tensorboard while the model is logging to it.
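Since the URL sometimes fails to show up, it can help to handle that case explicitly. The sketch below assumes that a running ngrok agent exposes its tunnel list as JSON on its local API (typically http://localhost:4040/api/tunnels) with a "tunnels" list whose entries carry a "public_url". The function name extract_public_url is hypothetical.

```python
import json


def extract_public_url(tunnels_json):
    """Return the first tunnel's public_url, or None if no tunnel is up."""
    try:
        tunnels = json.loads(tunnels_json).get("tunnels", [])
    except (ValueError, AttributeError):
        # Not JSON, or not the expected object shape: treat as "no URL yet".
        return None
    if not tunnels:
        return None
    return tunnels[0].get("public_url")
```

You could feed it the body returned by urllib.request.urlopen on the local API address and simply rerun the cell whenever it returns None.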

If you have the project synced to your local machine, you will be able to view Tensorboard without any limitation.

Go to terminal on your local machine and run:

$ pip install tensorboard

Run it and specify where the logging dir is:

# in my case, the path to the training folder is:
tensorboard --logdir=/Users/alaasenjab/Google\ Drive/object_detection/models/research/training

10. Training… Finally!

Training the model is as easy as running the following code. We just need to give it:

  • the training script, which runs the training process
  • pipeline_config_path=Path/to/config/file/model.config
  • model_dir=Path/to/training/


Notes:

  1. If the kernel dies, the training will resume from the last checkpoint, provided you saved the training/ directory somewhere persistent, e.g. GDrive.
  2. If you change the paths below, make sure there is no space between the equal sign = and the path.
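As a rough sketch of how the two flags fit together, the helper below assembles them into an argument list with no space around the equal sign. The script path is a hypothetical placeholder, so substitute the training script you are actually running.

```python
import shlex


def build_train_command(script, pipeline_config_path, model_dir):
    """Build the argument list for the training script."""
    return [
        "python", script,
        # flag=value with no space around '=', as the note above requires
        f"--pipeline_config_path={pipeline_config_path}",
        f"--model_dir={model_dir}",
    ]


# Placeholder paths, matching the bullet list above.
cmd = build_train_command("path/to/training_script.py",
                          "Path/to/config/file/model.config",
                          "Path/to/training/")
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the list to subprocess.run(cmd) avoids shell quoting problems when a path contains spaces, such as a Google Drive folder.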

Now sit back and watch your model train on Tensorboard.

11. Export the trained model.

By default, the model saves a checkpoint every 600 seconds during training and keeps up to 5 checkpoints. As new files are created, older files are deleted.

We can find the last model trained by running this code:
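One way to do this, shown here as a sketch rather than the exact code from the notebook, is to scan the training folder for the checkpoint with the highest step number. It assumes checkpoints are written as model.ckpt-<step>.meta (plus matching .index and .data files). The function name is illustrative.

```python
import os
import re


def latest_checkpoint_prefix(training_dir):
    """Return the prefix of the highest-step checkpoint, or None if none exist."""
    pattern = re.compile(r'^(model\.ckpt-(\d+))\.meta$')
    best_step, best_prefix = -1, None
    for name in os.listdir(training_dir):
        match = pattern.match(name)
        if match and int(match.group(2)) > best_step:
            best_step = int(match.group(2))
            # The prefix (without .meta) is what the export step expects.
            best_prefix = os.path.join(training_dir, match.group(1))
    return best_prefix
```

The returned prefix, for example training/model.ckpt-1500, is the value you would pass as trained_checkpoint_prefix when exporting.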

Then we run the export script to convert the model into a frozen model, frozen_inference_graph.pb, that we can use for inference. This frozen model can’t be used to resume training. However, saved_model.pb gets exported as well, which can be used to resume training as it has all the weights.

  • pipeline_config_path=Path/to/config/file/model.config
  • output_directory=where the model will be saved
  • trained_checkpoint_prefix=Path/to/a/checkpoint

You can access all exported files from this directory:

/gdrive/My Drive/object_detection/models/research/fine_tuned_model/

Or, you can download the frozen graph needed for inference directly from Google Colab:

# Download the frozen model that is needed for inference.
# output_directory is the 'fine_tuned_model' dir specified above.
from google.colab import files
files.download(output_directory + '/frozen_inference_graph.pb')

We also need the label map .pbtxt file:

# Download the label map.
# We specified 'data_base_url' above. It points to the
# 'object_detection/data/' folder.
from google.colab import files
files.download(data_base_url + '/label_map.pbtxt')

12. Webcam Inference.

To run inference with the model using the webcam on your local machine, you need to have the following installed:

Tensorflow = 1.15.0
cv2 = 4.1.2

You also need the Tensorflow models repository downloaded on your local machine (Step 5 above), or you can skip that and navigate to it directly if you have GDrive synced on your local machine.

  • Go to the terminal on your local machine and navigate to object_detection/models/research/object_detection

In my case, I am navigating to the folder in GDrive.

$ cd /Users/alaasenjab/Google\ Drive/object_detection/models/research/object_detection

You can run the following from a Jupyter notebook or by creating a .py file. Either way, change PATH_TO_FROZEN_GRAPH, PATH_TO_LABEL_MAP and NUM_CLASSES to match your setup.

Run the code and smile 🙂
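If you want to draw the detections yourself on each webcam frame, keep in mind that graphs exported by the Object Detection API return boxes as normalized [ymin, xmin, ymax, xmax] values between 0 and 1. Below is a small sketch, separate from the inference script above, that converts them to pixel coordinates and drops low-confidence detections. The function name and the 0.5 threshold are my own choices.

```python
def detections_to_pixels(boxes, scores, classes, width, height, min_score=0.5):
    """Convert normalized boxes to pixel boxes, keeping confident detections."""
    results = []
    for (ymin, xmin, ymax, xmax), score, cls in zip(boxes, scores, classes):
        if score < min_score:
            continue
        results.append({
            "class_id": int(cls),
            "score": float(score),
            # (left, top, right, bottom) in pixels, the order cv2.rectangle
            # expects for its two corner points.
            "box": (int(round(xmin * width)), int(round(ymin * height)),
                    int(round(xmax * width)), int(round(ymax * height))),
        })
    return results
```

With a 640x480 frame, each returned box can be passed to cv2.rectangle as the corner points (left, top) and (right, bottom).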