Live object detection of sea otters (because why not?)

Original article was published on Artificial Intelligence on Medium


Do you know about sea otters? Y’know, the marine mammals that are super cute? Fun fact: they hold hands while they sleep so they don’t float away from each other.

Well, now you know about them. They’re absolutely adorable and probably one of my favorite animals. Monterey Bay Aquarium even has a sea otter cam (but it’s down until June 13, 2020… I’m not crying you’re crying) and they have an educational article about sea otters.

So I used TensorFlow’s object detection API to detect sea otters. That’s a normal thing to do, right? In this article, I’m going to show you how I did it.

Sea otters holding hands ❤ | Courtesy: Mother Nature Network

Object detection vs. image classification

I’ve done image classification before (and I wrote an article on that here) and it’s quite self-explanatory. In my example, it classifies images of hands making rock, paper, or scissors.

But that image classifier can only sort images into the classes it was trained on. If you give the model an image that isn’t a hand making rock, paper, or scissors, it will still label it as one of those three, because it has no way of saying “none of the above.” It just picks the closest match from the classes it was given.
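To see why a classifier always answers with one of its own classes, here is a minimal sketch. It assumes the model ends in a softmax layer (a common setup, but an assumption about my rock-paper-scissors model here), and the helper names and scores are made up for illustration.

```python
import math

# The three classes from the rock-paper-scissors example.
CLASSES = ["rock", "paper", "scissors"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most likely class and its probability.

    Hypothetical helper: there is no "none of the above" option,
    so some class always wins, even for an unrelated image.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Made-up raw scores for an image that is not a hand at all.
label, confidence = classify([0.1, 0.3, 0.2])
print(label, round(confidence, 2))
```

Even with scores this close together, the model still commits to a label. The low confidence is the only hint that the image doesn’t really belong to any class.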

Object detection usually means drawing boxes around particular objects in a picture. In this case, you can give the model any picture and it will try to find the objects it was trained to detect. Each box comes with a probability (a confidence score) that the thing inside it is one of those objects. If that probability is too low, no box is drawn.
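The “only draw a box if the probability is high enough” idea can be sketched in a few lines. This is an illustration, not the API’s own code: the function name, the 0.5 cutoff, and the sample detections are all assumptions, and the boxes are assumed to be normalized [ymin, xmin, ymax, xmax] lists.

```python
def filter_detections(boxes, scores, labels, min_score=0.5):
    """Keep only the detections whose score reaches `min_score`.

    Hypothetical helper; `min_score=0.5` is an assumed cutoff.
    """
    kept = []
    for box, score, label in zip(boxes, scores, labels):
        if score >= min_score:
            kept.append({"box": box, "score": score, "label": label})
    return kept


# Made-up model output: one confident detection, one weak one.
boxes = [[0.10, 0.20, 0.60, 0.70], [0.50, 0.50, 0.90, 0.95]]
scores = [0.92, 0.30]
labels = ["sea_otter", "sea_otter"]

for detection in filter_detections(boxes, scores, labels):
    print(detection["label"], detection["score"], detection["box"])
```

Raising `min_score` gives you fewer, more trustworthy boxes; lowering it gives you more boxes and more false alarms.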

Much like image classifiers, object detection models can be trained to detect multiple objects. One popular dataset used to train object detection models is the COCO dataset (which stands for Common Objects in Context). It’s a dataset with 80 object categories ranging from airplanes to people to dogs and cats.

Courtesy: COCO Dataset

Using TF’s object detection to detect sea otters!

First I grabbed TensorFlow’s models from their GitHub and followed the installation instructions.

Next, I changed some code so that I could run the object detection on my computer screen. I used OpenCV to do this, reading frames the same way you would from a webcam and passing them to the object detection API. Instead of a regular webcam, I used VirtualCam, a plugin for OBS (Open Broadcaster Software), to use my computer screen as the webcam input.
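Here is a rough sketch of what reading frames with OpenCV can look like. It is not my exact code: `read_frames` and `open_camera` are hypothetical helpers, and the camera index that VirtualCam shows up under (0 here) is an assumption that will differ between machines.

```python
def read_frames(capture, max_frames=None):
    """Yield frames from `capture` until it stops or `max_frames` is reached.

    `capture` is anything with a `read()` method that returns
    `(ok, frame)`, which is how `cv2.VideoCapture` behaves.
    """
    count = 0
    while max_frames is None or count < max_frames:
        ok, frame = capture.read()
        if not ok:  # camera closed or no frame available
            break
        yield frame
        count += 1


def open_camera(index=0):
    """Open a camera device with OpenCV (index 0 is an assumption)."""
    import cv2  # imported here so read_frames can be tried without OpenCV

    capture = cv2.VideoCapture(index)
    if not capture.isOpened():
        capture.release()
        raise RuntimeError(f"Could not open camera {index}")
    return capture


# Usage sketch (needs OpenCV and a working camera or virtual camera):
#
# capture = open_camera(0)
# try:
#     for frame in read_frames(capture):
#         ...  # run the detector on `frame` here
# finally:
#     capture.release()
```

Since VirtualCam presents the screen as an ordinary camera device, the reading loop doesn’t need to know whether the frames come from a webcam or from OBS.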

And since there aren’t any datasets of sea otters (at least to my knowledge), I had to make my own. I used labelImg to label my images, which saves the labels for each image as an XML file. Then I used Python scripts from user datitran on GitHub. I used his xml_to_csv.py to convert all the XML files into a CSV file. Then I generated TFRecords by using his generate_tfrecord.py.
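To give a feel for the XML-to-CSV step, here is a small sketch using only Python’s standard library. It is not datitran’s script; it assumes the Pascal VOC layout that labelImg writes, and the function names, file name, and box coordinates in the sample are invented for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Column layout assumed for the CSV (one row per labeled box).
COLUMNS = ["filename", "width", "height", "class",
           "xmin", "ymin", "xmax", "ymax"]


def voc_xml_to_rows(xml_text):
    """Turn one Pascal VOC annotation into a list of CSV rows."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):  # one <object> per labeled box
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


def rows_to_csv(rows):
    """Write the header and rows to a CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample annotation for illustration.
SAMPLE_XML = """<annotation>
  <filename>otter_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>sea_otter</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>400</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""

print(rows_to_csv(voc_xml_to_rows(SAMPLE_XML)))
```

In a real run you would loop over every XML file in the labels folder and collect all the rows into one CSV before generating the TFRecords.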

I also downloaded the pre-trained model ssd_mobilenet_v1_coco_2018_01_28 from the TensorFlow repository to use as a starting point. Then I slapped it all together in the object detection API and BOOM, you get to detect sea otters in pictures and videos.