“Hey, our friends are cliff jumping, wanna come? You only live once man!”
“Nah, I only have to look once to know I’m not doing that.”
You are probably familiar with the saying “YOLO: You Only Live Once”. But that’s not actually the subject of this article. I’m talking about “YOLO: You Only Look Once”.
It is a very clever Convolutional Neural Network (CNN) that is really good at detecting what objects are in a given image or video, and where they are. And it really does only have to look once: it processes the whole image in a single pass through the network.
How does it work?
YOLO first divides a given image or video frame into a grid of N square cells. Then, each grid cell predicts B bounding boxes for objects whose centers fall inside it, along with a confidence score for each box. (A bounding box is just the model’s prediction of where an object is, drawn as a box around it.)
After that, it keeps the boxes whose confidence scores are high enough to likely contain an object, and predicts what each object is, based on its training data. This means YOLO can only detect the kinds of objects it has been trained on, so you first have to give it a labeled dataset of those objects.
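To make the grid idea concrete, here is a minimal sketch that maps a point in the image (for example, the center of an object) to the grid cell that contains it. It assumes the grid is square with `s` cells per side, so N = s × s. The function name and its arguments are made up for this example and do not come from any YOLO library.

```python
def cell_for_point(x, y, img_w, img_h, s):
    """Return (row, col) of the grid cell containing pixel (x, y).

    Assumes an s-by-s grid laid over an img_w x img_h image
    (an assumption for this sketch; N = s * s cells in total).
    """
    if not (0 <= x < img_w and 0 <= y < img_h):
        raise ValueError("point lies outside the image")
    col = int(x * s / img_w)  # which column the x coordinate falls in
    row = int(y * s / img_h)  # which row the y coordinate falls in
    return row, col


# Example: a 7x7 grid over a 448x448 image has 64-pixel cells.
print(cell_for_point(100, 300, 448, 448, 7))
```

In this picture, the cell that contains an object's center is the one expected to predict a box for it.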
This allows you to detect objects, and predict what they are!
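The filtering step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the dictionary layout, the `keep_confident` name, and the 0.5 threshold are all hypothetical, and the sample predictions are invented rather than real model output.

```python
def keep_confident(predictions, threshold=0.5):
    """Keep boxes whose confidence clears the threshold and name each one.

    Each prediction is a dict with a "box", a "confidence" score, and
    "class_probs" (a made-up layout for this sketch).
    """
    detections = []
    for pred in predictions:
        if pred["confidence"] < threshold:
            continue  # too unlikely to be an object, so drop the box
        probs = pred["class_probs"]
        label = max(probs, key=probs.get)  # most probable class
        detections.append((label, pred["box"], pred["confidence"]))
    return detections


# Invented predictions, for illustration only.
sample = [
    {"box": (40, 60, 120, 200), "confidence": 0.91,
     "class_probs": {"person": 0.85, "suitcase": 0.15}},
    {"box": (300, 220, 80, 90), "confidence": 0.12,
     "class_probs": {"person": 0.40, "suitcase": 0.60}},
]

for label, box, confidence in keep_confident(sample):
    print(label, box, confidence)
```

Raising the threshold gives fewer but more reliable detections, while lowering it keeps more boxes at the cost of more false alarms.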
I decided to create an implementation of YOLO that can detect different objects in an airport scene!
I found out about the COCO dataset, a ready-made dataset that is good for detecting general objects like suitcases, people, cars, and skateboards, which made things a lot easier for me.
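Once a detector returns labels, narrowing them down to the ones that matter for a scene takes very little code. The sketch below is not the code from this project. It assumes the detector hands back a plain list of label strings, and both the `AIRPORT_LABELS` set and the sample list are made up for illustration.

```python
from collections import Counter

# Labels of interest for an airport scene (chosen for this example).
AIRPORT_LABELS = {"person", "suitcase", "car"}


def count_airport_objects(labels):
    """Count how often each label of interest appears in the detections."""
    return Counter(label for label in labels if label in AIRPORT_LABELS)


# Invented detector output, for illustration only.
detected = ["person", "suitcase", "person", "skateboard", "suitcase", "person"]
print(count_airport_objects(detected))
```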
Once everything was programmed and YOLO had learned the dataset, it was able to produce this image.
YOLO also has some practical use cases where it can be really helpful! Here are a few:
Recycling Plants
YOLO can be used in recycling plants to help control the robots that sort the waste. Since YOLO is so good at detecting objects, we can train it to recognize the different kinds of waste in a recycling facility.
Self-Driving Cars
YOLO can be especially useful in self-driving cars, where object detectors like it are used to spot cars, people, and traffic lights! Detection alone does not drive the car, but it is one of the building blocks of autonomous control.
Police Investigations
This one is far-fetched, but YOLO could assist in police investigations by helping to figure out who a potential suspect could be. This could be a great help when working through low-quality recordings and the like!
Key Takeaways
- YOLO is a CNN that can find and classify objects by looking at an image only once.
- It scans the image for objects and figures out what they are.
- It needs a dataset to learn which objects to detect.
- It has use cases ranging from recycling plants to self-driving cars.