Detect and track baseball using Detectron2 and SORT

Original article was published by C Kuan on Deep Learning on Medium

How I track the baseballs when there are multiple baseballs in the video.

Baseball detection and tracking.


In my previous post, I trained and built a model to detect baseballs in video using Detectron2.

Baseball detection using Detectron2. (from my previous post)

It was working quite well and was able to capture the ball in most of the frames. However, one potential problem with applying this model to real baseball practice video is that there might be more than one baseball in the video, as shown in the first picture. This makes it difficult to extract information such as ball velocity and flight angle for one specific ball.

One possible approach to the problem is to track the balls and assign a unique ID to each of them. I can then calculate the information for each ball and choose the ball I want.

There are several methods for object tracking. I decided to use SORT (Simple Online and Realtime Tracking) by Alex Bewley. A detailed introduction and the SORT paper can be found in the author's repo. I will focus on the implementation in this post.

Baseball tracking — SORT

I used Google Colab, so I first mounted my Google Drive and copied the SORT code to the working folder. Then I installed the requirement for SORT and imported it.

!cp "gdrive/My Drive/Colab Notebooks/object_tracking/" .
!pip install filterpy
from sort import *

Then I created the Sort tracker object:

mot_tracker1 = Sort(max_age=3, min_hits=1, iou_threshold=0.15)

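max_age, min_hits and iou_threshold are parameters that can be adjusted depending on the requirement. max_age is the maximum number of frames a track is kept alive without a matching detection, min_hits is the minimum number of matched detections before a track is reported, and iou_threshold is the minimum IOU needed to match a detection to an existing track.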

Then I integrated the Sort tracker into the object detection loop by updating the tracker every frame.

track_bbs_ids = mot_tracker1.update(dets)

dets is a numpy array of detections in the format [[x1,y1,x2,y2,score],[x1,y1,x2,y2,score],…].

The call returns an array of [[x1,y1,x2,y2,ID],[x1,y1,x2,y2,ID],…], where ID is the unique ball ID assigned by SORT.

Note that update has to be called for every frame, even when there are no detections. Use np.empty((0, 5)) when a frame has no detection.
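A minimal sketch of preparing that per-frame input, assuming the detector's boxes and scores are already available as plain lists or arrays. The helper name to_sort_dets is hypothetical and is not part of SORT or Detectron2:

```python
import numpy as np

def to_sort_dets(boxes, scores):
    """Stack boxes (N x 4 as x1, y1, x2, y2) and scores (N) into an N x 5 array.

    Returns an empty (0, 5) array when there are no detections, so the
    tracker can still be updated on frames without any ball.
    """
    boxes = np.asarray(boxes, dtype=float)
    if boxes.size == 0:
        return np.empty((0, 5))
    scores = np.asarray(scores, dtype=float).reshape(-1, 1)
    return np.hstack([boxes.reshape(-1, 4), scores])

# Made-up detections for one frame, and a frame with nothing detected.
dets = to_sort_dets([[10, 20, 30, 40], [50, 60, 70, 80]], [0.9, 0.8])
no_dets = to_sort_dets([], [])
```

With Detectron2, the boxes and scores would typically come from outputs["instances"].pred_boxes.tensor and outputs["instances"].scores after moving them to the CPU, and either result above could then be passed to mot_tracker1.update.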

Then I visualized it by drawing bounding boxes and IDs.

Ball Tracking.

The result is quite good. The balls on the ground were detected and assigned IDs. However, the fast-moving ball was not assigned any ID by SORT. This might be because the ball travels too far between frames and is too small in the image. Its bounding boxes in consecutive frames have no overlap, so SORT treated them as different balls.
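To illustrate the overlap problem, here is a small IOU calculation with made-up numbers (a 10-pixel ball that moves 40 pixels between frames versus one that moves 1 pixel). It is a simplification: SORT matches detections against the Kalman-predicted position of each track rather than the raw previous box, but the effect of a small, fast object is similar.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical fast ball: 10 px wide, moves 40 px to the right in one frame.
fast_iou = iou((100, 100, 110, 110), (140, 100, 150, 110))

# Hypothetical ball on the ground: shifts by 1 px between frames.
still_iou = iou((200, 300, 210, 310), (201, 300, 211, 310))

print(fast_iou, still_iou)
```

The fast ball has no overlap at all, so it falls below any positive iou_threshold, while the ball on the ground stays well above the 0.15 used earlier.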

I played around with the Sort parameters shown above, and used a small workaround to address the issue.

To trick SORT, I manually made the bounding boxes bigger, so that the overlap between frames becomes bigger.
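One way this enlargement could look, as a sketch rather than the exact code used here. The helper name enlarge_boxes and the padding of 20 pixels are assumptions for illustration:

```python
import numpy as np

def enlarge_boxes(dets, pad):
    """Return a copy of dets (N x 5) with each box grown by pad pixels per side."""
    out = np.array(dets, dtype=float)  # np.array copies, so dets is left untouched
    out[:, :2] -= pad    # move the top-left corner (x1, y1) up and left
    out[:, 2:4] += pad   # move the bottom-right corner (x2, y2) down and right
    return out

# Made-up detection: a 10 px ball with score 0.9.
dets = np.array([[100.0, 100.0, 110.0, 110.0, 0.9]])
big_dets = enlarge_boxes(dets, pad=20)
```

The enlarged array is what would be passed to the tracker. The padding needs to be large enough to cover the distance the ball travels between frames, and the scores in the last column are left unchanged.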

Ball tracking with enlarged bounding boxes.

Now the flying ball can be captured and assigned to a unique ID.

Finally, I can reverse the enlargement and visualize the real bounding box. I converted the box to a circle so it fits the round ball better. I also added a tail to each ball to show its track.
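A rough sketch of that last step, assuming the boxes were padded by a fixed number of pixels before tracking. PAD, TAIL_LENGTH and both helper names are hypothetical, and since SORT returns filtered box positions, the recovered box is only an approximation of the original detection:

```python
from collections import defaultdict, deque

PAD = 20          # assumed padding that was added before tracking
TAIL_LENGTH = 10  # assumed number of past positions kept per ball

# One fixed-length history of centre points per track ID.
tails = defaultdict(lambda: deque(maxlen=TAIL_LENGTH))

def box_to_circle(x1, y1, x2, y2, pad=PAD):
    """Undo the padding and return (cx, cy, radius) for the ball."""
    x1, y1, x2, y2 = x1 + pad, y1 + pad, x2 - pad, y2 - pad
    cx = (x1 + x2) / 2.0
    cy = (y1 + y2) / 2.0
    radius = max(x2 - x1, y2 - y1) / 2.0
    return cx, cy, radius

def update_tail(track_id, cx, cy):
    """Append the latest centre to the ball's tail and return the tail so far."""
    tails[track_id].append((cx, cy))
    return list(tails[track_id])

# Made-up tracker rows [x1, y1, x2, y2, ID] for the same ball in two frames.
for row in ([80, 80, 130, 130, 7], [120, 78, 170, 128, 7]):
    cx, cy, r = box_to_circle(*row[:4])
    tail = update_tail(row[4], cx, cy)
```

With OpenCV, the circle could be drawn with cv2.circle and the tail with cv2.line between consecutive points, after rounding the coordinates to integers.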

Ball tracking

Looks good! At this point, I have ball detections with IDs for every frame. I stored them in a list during the detection loop and converted the list to a pandas dataframe for further analysis.

Ball detection details in a pandas dataframe.
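A sketch of how the per-frame records might be collected. The column names and the helper tracks_to_rows are assumptions and may differ from the dataframe shown in this post:

```python
def tracks_to_rows(frame_idx, tracks):
    """Turn tracker rows [x1, y1, x2, y2, ID] into dicts tagged with the frame number."""
    rows = []
    for x1, y1, x2, y2, track_id in tracks:
        rows.append({
            "frame": frame_idx,
            "id": int(track_id),
            "x1": float(x1),
            "y1": float(y1),
            "x2": float(x2),
            "y2": float(y2),
        })
    return rows

records = []

# Made-up tracker output for two frames: two balls, then only one.
per_frame_tracks = [
    [[100, 100, 110, 110, 1], [200, 300, 210, 310, 2]],
    [[140, 98, 150, 108, 1]],
]
for frame_idx, tracks in enumerate(per_frame_tracks):
    records.extend(tracks_to_rows(frame_idx, tracks))
```

Passing records to pandas.DataFrame gives one row per ball per frame, and grouping by the id column then isolates the track of a single ball.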


There are still a few things to improve:

  1. The detection was not really good. Some balls were not identified, and some noise such as a white glove or a shoe was identified as a ball. I need to expand the training dataset and improve the detection model.
  2. The detection was a bit slow. I may try a different architecture.
  3. Film the video from the side, so I will be able to calculate the velocity and angle without having to account for the distortion.

Thanks for reading. Feedback and suggestions are welcome!