Measuring Traffic Speed With Deep Learning Object Detection

Source: Deep Learning on Medium


Object Detection

Object detection means locating an object in an image or a video frame. The input to object detection is an image containing the object; the software outputs the position of the object as a bounding box.
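In code, the output of a detector can be represented as a simple record per object. The class and field names below are illustrative, not part of any particular detection library:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One detected object: class label, confidence, and bounding box."""
    label: str
    confidence: float
    # Bounding box in pixel coordinates: top-left corner plus width/height.
    x: int
    y: int
    w: int
    h: int

    def center(self):
        """Center point of the box, used later for speed estimation."""
        return (self.x + self.w / 2, self.y + self.h / 2)

# A detector would emit records like this for each frame.
car = Detection("car", 0.91, x=120, y=200, w=64, h=40)
print(car.center())  # (152.0, 220.0)
```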

Traffic Vision

In this article, we use the YOLOv2 neural network as the basis for our vehicle-detection software. We augmented the YOLO model (converted to a MIVisionX model for AMD GPUs) with vehicle speed estimation and vehicle direction capabilities in the traffic vision app.

High-level modules of the traffic vision app

Detected Box Speed Calibration

Box-speed calibration is simply a mapping from box speed in pixels/sec to vehicle speed in miles/hr (or km/hr, if you prefer metric units).
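Once a calibration constant is known, the mapping itself is a unit conversion. A minimal sketch, where `miles_per_pixel` is a hypothetical per-camera constant obtained from calibration:

```python
def box_speed_to_vehicle_speed(pixels_per_sec: float,
                               miles_per_pixel: float) -> float:
    """Map box speed (pixels/sec) to vehicle speed (miles/hr).

    miles_per_pixel is the calibration constant for this camera view;
    the value used below is purely illustrative.
    """
    return pixels_per_sec * miles_per_pixel * 3600.0  # seconds -> hours

# If calibration found that one pixel spans 1/5280 mile (one foot),
# a box moving 100 pixels/sec corresponds to about 68.2 miles/hr.
mph = box_speed_to_vehicle_speed(100.0, 1.0 / 5280.0)
```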

Here is the algorithm for calibrating up/down speed:

1. Get a sample video with up-down traffic in the frame.
2. Collect box locations and compute the center y-coordinate of each box.
3. Draw a histogram of the y-coordinates (see the histogram below).
4. Find the y-range within which the majority of detected vehicles pass.
5. Remember to exclude regions where vehicles could be stationary.
6. Find the average speed between frames within the y-range.
7. Calibrate the center-pixel distance per frame duration between two consecutive frames to the known speed limit of the road.
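The steps above can be sketched as follows. This is a simplified illustration, not the app's actual implementation: it assumes box centers are already matched across frames (tracking is outside the sketch), and all names are hypothetical:

```python
import numpy as np

def calibrate(centers_per_frame, speed_limit_mph, fps, sigma=1.0):
    """Calibrate pixel displacement against a known road speed limit.

    centers_per_frame: list of per-frame lists of (x, y) box centers,
    where index i refers to the same vehicle in consecutive frames.
    Returns (y_range, miles_per_pixel).
    """
    ys = np.array([y for frame in centers_per_frame for (_, y) in frame])
    # Steps 3-5: find the y-band where most moving vehicles pass.
    lo, hi = ys.mean() - sigma * ys.std(), ys.mean() + sigma * ys.std()
    # Step 6: average per-frame center displacement inside the band.
    deltas = []
    for prev, cur in zip(centers_per_frame, centers_per_frame[1:]):
        for (x0, y0), (x1, y1) in zip(prev, cur):
            if lo <= y0 <= hi:
                deltas.append(np.hypot(x1 - x0, y1 - y0))
    px_per_sec = np.mean(deltas) * fps
    # Step 7: assume traffic in the band moves at the speed limit.
    miles_per_pixel = speed_limit_mph / (px_per_sec * 3600.0)
    return (lo, hi), miles_per_pixel
```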

Vehicle center-point histogram, with data extracted from YOLOv2 detection on live traffic

The picture above shows the center points of detected vehicles as boxes. It is easy to draw a region (shown as the blue rectangle) covering most vehicles; one can use ±½σ, ±σ, or ±2σ as the criterion for drawing it. The left and right sides of this rectangle give the y-range. In our example (shown in the picture), this range is 167–232.

Location Scale Factor

Remember, you’ve calibrated only a small part of the frame. This calibration must be scaled to the rest of the frame, assuming most vehicles travel at the speed limit (a statistical inference). Collect the estimated speeds of areas beyond the calibrated region and divide them by the statistically estimated speed of the calibrated region to find a scale factor based on location. The location-factor graph for our example is shown below.

Pixel travel distance between two consecutive frames
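One way to compute such location scale factors is to bin the observed per-frame pixel displacements by the y-coordinate at which they occurred and compare each bin to the calibrated band. A sketch under those assumptions, with all names hypothetical:

```python
import numpy as np

def location_scale_factors(y_values, px_speeds, y_range,
                           n_bins=10, frame_h=480):
    """Per-y-bin scale factor relative to the calibrated band.

    y_values: center y-coordinate of each observed displacement;
    px_speeds: matching per-frame pixel displacements.
    Assumes most vehicles everywhere travel near the speed limit.
    """
    y_values = np.asarray(y_values)
    px_speeds = np.asarray(px_speeds)
    # Reference speed: mean pixel displacement inside the calibrated band.
    in_band = (y_values >= y_range[0]) & (y_values <= y_range[1])
    band_speed = px_speeds[in_band].mean()
    # Compare each horizontal strip of the frame against the band.
    bins = np.linspace(0, frame_h, n_bins + 1)
    factors = {}
    for b0, b1 in zip(bins, bins[1:]):
        sel = (y_values >= b0) & (y_values < b1)
        if sel.any():
            factors[(b0, b1)] = px_speeds[sel].mean() / band_speed
    return factors
```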


We used the techniques described above to calibrate the street. This calibration is used to estimate vehicle speed, as shown in the video below.
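Putting the pieces together, a per-vehicle speed estimate combines the calibration constant with the location scale factor for where the box sits in the frame. Again, this is an illustrative sketch rather than the app's actual code:

```python
import math

def estimate_speed_mph(center_prev, center_cur, fps,
                       miles_per_pixel, scale_factor=1.0):
    """Estimate vehicle speed from two consecutive box centers.

    miles_per_pixel comes from the band calibration; scale_factor
    corrects for the box's location in the frame (1.0 inside the
    calibrated band). All names here are illustrative.
    """
    px = math.hypot(center_cur[0] - center_prev[0],
                    center_cur[1] - center_prev[1])
    px_per_sec = px * fps
    # Divide by the scale factor: regions that read faster in pixels
    # for the same real speed are normalized back down.
    return px_per_sec * miles_per_pixel * 3600.0 / scale_factor
```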


References

  1. YOLOv2 paper
  2. Tiny YOLO aka Darknet reference network
  3. MIVisionX setup
  4. AMD OpenVX