Source: Deep Learning on Medium
Choosing what to execute:
Before starting this project, I had been reading articles about Google's Object Detection API and how it is a one-stop solution for implementing all the major computer vision models, so I decided to give it a go.
The next most important step was choosing which model to run to detect objects in the images.
The Object Detection API can run many different types of models, ssd_mobilenet_v1 and faster_rcnn_resnet50 to name a few.
The complete list of pre-trained models, along with the datasets they were trained on, can be found in the TensorFlow detection model zoo.
Each model listed there has two columns worth noting beside it. One is “Speed”, which is how quickly the model can make a prediction, and the other is “mAP” (mean Average Precision), which loosely translates to how accurate the model usually is (based on test conditions).
The point to note here is that there is usually a trade-off between speed and accuracy in these models: the faster ones tend to score a lower mAP.
So after reviewing all the models, I decided to go with ssd_inception_v2_coco as it provides a good trade-off between speed and accuracy.
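To make the trade-off a bit more concrete, here is a minimal sketch of how this kind of choice could be expressed in code: keep only the models that fit a speed budget, then take the one with the highest mAP. This is not how I actually picked the model (I simply read the table), and the `ZooEntry` type, the `pick_model` helper, and especially the speed and mAP numbers are made-up placeholders for illustration. Copy the real values from the model zoo table before relying on it.

```python
from typing import List, NamedTuple, Optional


class ZooEntry(NamedTuple):
    """One row of a model comparison table (hypothetical structure)."""
    name: str
    speed_ms: float  # "Speed" column: time per prediction, lower is faster
    map_score: float  # "mAP" column: higher is more accurate


# PLACEHOLDER numbers for illustration only, not taken from the model zoo.
CANDIDATES = [
    ZooEntry("ssd_mobilenet_v1", 30.0, 21.0),
    ZooEntry("ssd_inception_v2_coco", 40.0, 24.0),
    ZooEntry("faster_rcnn_resnet50", 90.0, 30.0),
]


def pick_model(candidates: List[ZooEntry], max_speed_ms: float) -> Optional[ZooEntry]:
    """Return the most accurate model that fits the speed budget, or None."""
    within_budget = [c for c in candidates if c.speed_ms <= max_speed_ms]
    if not within_budget:
        return None
    return max(within_budget, key=lambda c: c.map_score)


if __name__ == "__main__":
    choice = pick_model(CANDIDATES, max_speed_ms=50.0)
    print(choice.name if choice else "no model fits the budget")
```

A tighter budget pushes the choice toward the lighter SSD models, while a looser one lets the slower but more accurate Faster R-CNN variants win, which is the same trade-off described above.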