Highlight Action Area in Soccer using Tensorflow

Source: Deep Learning on Medium


Doing cool things with data!

Introduction

How cool would it be if cameras were intelligent enough to work out the area of action in a sport on their own and capture the right footage? We could mount cameras at soccer fields, especially ones where local games are played, record the games, and automatically create useful highlights so that teams can review their moves and learn new plays. A few companies offer “AI cameras” with these capabilities; Veo is one of them. But these cameras are expensive. A more economical option would be to use your own ordinary, good-resolution cameras together with a custom model that connects to one or more cameras and does the processing in the cloud. This could be a more scalable solution, and it could be extended to sports other than soccer as well. So how do we do this?

The TensorFlow Object Detection API gets us there! I have used a YouTube video to demonstrate this. See below.

Soccer — Highlighting area of action

I like this video since it captures the play from a wide angle. If the camera is high resolution, object detection can highlight the area of action and we can zoom into it, creating a close-angle shot as shown below.

You can find the code I used on my GitHub repo.

Overview of the steps

The TensorFlow Object Detection API is a very powerful resource for quickly building object detection models. If you are not familiar with it, please see my following blog posts, which introduce the API and teach you how to build a custom model with it.

Introduction to Tensorflow Object Detection API

Building a custom model using Tensorflow Object Detection API

Custom Model for Detecting Players and Ball

I built a custom model here for detecting players and the soccer ball. To do this I took about 50 frames from the video and annotated the players and the ball in each one. This took some time since each frame had 15–20 players. I also annotated the soccer ball but, as you can see below, detecting it is a bit difficult since it gets blurry due to the speed at which it moves. Ball detection could be improved by adding more training images that show the blurred ball. I chose the Faster RCNN Inception model from the TensorFlow Object Detection model zoo. Once the model was trained on these roughly 50 frames, I ran it on the full 10-minute video, and the model appears to have generalized well. For the model to work across a variety of camera angles, more training with different images is needed.

Player and Ball detection using Faster RCNN Inception model

You can find the trained model on my GitHub repo.

Identifying Area of Action

Now that we can identify where the players are, along with our best guess of where the ball is, we can do the interesting exercise of working out where the action is.

The TensorFlow Object Detection API gives us classes, boxes and scores for each frame passed to it. We can narrow the output down to only the bounding boxes with a high score. With this information, we know the centroids of most of the players on the field and, in some frames, of the ball.
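Here is a minimal sketch of that filtering step. It is not the exact code from my repo. It assumes the boxes arrive in the API's usual normalized [ymin, xmin, ymax, xmax] layout; the function name `confident_centroids` and the 0.5 threshold are placeholders chosen for illustration.

```python
import numpy as np

def confident_centroids(boxes, scores, classes, frame_w, frame_h, min_score=0.5):
    """Return (class_id, cx, cy) in pixels for detections scoring >= min_score.

    Assumes boxes are normalized [ymin, xmin, ymax, xmax] rows (values in 0..1).
    min_score=0.5 is an illustrative threshold, not a tuned value.
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    scores = np.asarray(scores, dtype=float)
    classes = np.asarray(classes, dtype=int)

    keep = scores >= min_score                 # boolean mask of confident detections
    ymin, xmin, ymax, xmax = boxes[keep].T     # one array per box coordinate

    cx = (xmin + xmax) / 2.0 * frame_w         # box centre, scaled to pixel columns
    cy = (ymin + ymax) / 2.0 * frame_h         # box centre, scaled to pixel rows
    return [(int(c), float(x), float(y)) for c, x, y in zip(classes[keep], cx, cy)]
```

The returned list can then be split by class id into player centroids and, when it was detected, the ball centroid. Which id maps to which class depends on the label map used in training.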

Now we must find the rectangle that contains the most players and the ball, as that is likely where the action is. The approach I took was:

  1. Choosing the highlight area as a fraction of the frame size to keep the code flexible
  2. Once we have the dimensions of the chosen area, iterating over the entire frame, systematically choosing many candidate areas and counting how many players are in each. If an area also contains the ball along with players, then it gets a much higher score
  3. Selecting the area with the highest score and drawing that rectangle on the frame
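The steps above can be sketched as a simple sliding-window search. Treat this as an illustration under stated assumptions rather than the code in my repo: the function name `best_action_window`, the window fractions, the stride and the size of the ball bonus are all hypothetical values picked for the example.

```python
def best_action_window(players, ball, frame_w, frame_h,
                       frac_w=0.4, frac_h=0.5, stride=20, ball_bonus=5):
    """Slide a fixed-size window over the frame and return (box, score).

    players: list of (x, y) pixel centroids; ball: (x, y) or None if undetected.
    frac_w, frac_h, stride and ball_bonus are illustrative defaults.
    The box is (x0, y0, x1, y1) in pixels.
    """
    # Step 1: window size as a fraction of the frame size
    win_w, win_h = int(frame_w * frac_w), int(frame_h * frac_h)
    best_score, best_box = -1, (0, 0, win_w, win_h)

    # Step 2: visit candidate windows on a regular grid and score each one
    for y0 in range(0, frame_h - win_h + 1, stride):
        for x0 in range(0, frame_w - win_w + 1, stride):
            x1, y1 = x0 + win_w, y0 + win_h
            n_players = sum(1 for px, py in players
                            if x0 <= px < x1 and y0 <= py < y1)
            score = n_players
            # The ball only adds its bonus when players share the window
            if (ball is not None and n_players > 0
                    and x0 <= ball[0] < x1 and y0 <= ball[1] < y1):
                score += ball_bonus
            # Step 3: keep the highest-scoring window (first one wins ties)
            if score > best_score:
                best_score, best_box = score, (x0, y0, x1, y1)
    return best_box, best_score
```

The returned corners can be passed to a drawing call such as OpenCV's `cv2.rectangle` to overlay the highlight on the frame. A smaller stride gives finer placement at the cost of more windows to score.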

Conclusion and Next Steps

Awesome. So now you can see how we can use the output from deep learning models to produce interesting results. This is just the beginning. With a bit more time we can improve this code further. Some next steps would be:

  1. Detect the ball better through more training focused on this class
  2. Add the option to rotate the rectangular area of focus to capture angles or views with a different orientation from the one here
  3. Run this code even faster. I would love to use YOLO to accomplish this. A post on that soon.

Give me a ❤️ if you liked this post:) Hope you pull the code and try it yourself.

Other writings: http://deeplearninganalytics.org/blog

PS: I have my own deep learning consultancy and love to work on interesting problems. I have helped many startups deploy innovative AI-based solutions. Check us out at http://deeplearninganalytics.org/.

If you have a project that we can collaborate on, then please contact me through my website or at priya.toronto3@gmail.com
