Parking Space Detection Using Deep Learning

Source: Deep Learning on Medium

AI For Real-world Applications

Overview

Finding a parking space for your vehicle is a major problem in big cities. The rise in car ownership has created an imbalance between parking demand and supply. As a result, a parking management system that can track parking spots has become a necessity for all major cities. Such a system has to be scalable, efficient, reliable, and affordable at the same time. In recent years, advances in deep-learning-powered computer vision algorithms have shown very promising results in a variety of tasks. Similar techniques can be used to address the problem of parking space detection.

In this tutorial, I will show you how to build a simple parking space detection system using deep learning. Let’s get straight to business. We will break down our pipeline into three major components:

  • Detection of parking spots.
  • Detection of cars.
  • Calculation of IoU.

Overview of the system

On each frame of the input video, we will first use the Mask-RCNN object detection model to detect the cars and their bounding boxes. We will then compute the Intersection over Union (IoU) for each pairing of a car’s bounding box with a parking spot’s coordinates. If the IoU value for any parking spot is greater than a certain threshold, we will consider that parking spot occupied.

Now, let me explain each step in a little more detail.

Dependencies

  • Python 3.6
  • TensorFlow ≥1.3.0
  • OpenCV
  • Matplotlib
  • Shapely

1. Detection Of Parking Spots

The very first step in a parking space detection system is to identify the parking spots. There are a few techniques to do this. One is to identify the parking spots by locating the painted parking lines, which can be done using the edge detectors that OpenCV provides. The problem here is that not all parking locations have these pre-defined boundaries.

Another approach is to assume that cars that don’t move for a long time are in parking spaces. In other words, valid parking spaces are just places containing non-moving cars. But this doesn’t seem reliable either. It may lead to false positives and false negatives.

So, what should we do when automation doesn’t seem reliable? We do it manually. Unlike space-based methods that require labeling and training for every distinct parking facility, we only need to mark out parking lot boundaries and surrounding road areas once to configure our system for a new parking facility.

Here we will take a frame from our video/stream of the parking location and mark the parking regions on it. The Python library matplotlib provides a widget called PolygonSelector that does exactly what we need here: it lets us select polygon regions interactively.

I have made a simple Python script to mark the polygon regions on one of the initial frames of our input video. It takes the path of the video as an argument and saves the coordinates of the selected polygon regions in a pickle file as the output.

By default, this script is set to accept only quadrilaterals. With slight modifications, it can be used to mark polygons with any number of sides.
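The marking step described above could look roughly like the sketch below. This is not the author’s script, only a minimal illustration under some assumptions: the names `RegionCollector`, `save_regions`, `load_regions`, `mark_regions`, and the output file name `regions.p` are all hypothetical, and one window is opened per parking spot to keep the logic simple.

```python
import pickle


class RegionCollector:
    """Remembers the most recent polygon that has the expected vertex count."""

    def __init__(self, n_vertices=4):
        self.n_vertices = n_vertices  # 4 = quadrilaterals only
        self.current = None

    def on_select(self, vertices):
        # PolygonSelector passes the finished polygon as a list of (x, y) tuples.
        # Polygons with a different number of sides are ignored.
        if len(vertices) == self.n_vertices:
            self.current = [(float(x), float(y)) for x, y in vertices]


def save_regions(regions, path):
    with open(path, "wb") as f:
        pickle.dump(regions, f)


def load_regions(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def mark_regions(video_path, out_path="regions.p"):
    # Imported here so the helpers above work without OpenCV or matplotlib.
    import cv2
    import matplotlib.pyplot as plt
    from matplotlib.widgets import PolygonSelector

    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()  # first frame of the video
    cap.release()
    if not ok:
        raise IOError(f"could not read a frame from {video_path}")
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV frames are BGR

    regions = []
    while True:
        collector = RegionCollector()
        fig, ax = plt.subplots()
        ax.imshow(rgb)
        ax.set_title("Mark one spot, then close the window "
                     "(close without marking to finish)")
        selector = PolygonSelector(ax, collector.on_select)  # keep a reference
        plt.show()  # blocks until the window is closed
        if collector.current is None:
            break
        regions.append(collector.current)

    save_regions(regions, out_path)
    return regions
```

Calling `mark_regions("parking.mp4")` (a placeholder path) would open the first frame, let you click the corners of each spot, and write the list of polygons to the pickle file once you close a window without marking anything.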

2. Detecting Cars In A Video

As I mentioned earlier, to detect cars in a video we will use Mask-RCNN. It is a convolutional neural network that detects various objects and their boundaries, and pretrained weights are available for large image datasets such as COCO. Mask-RCNN is built on top of the Faster-RCNN object detection model.

In addition to the class label and bounding box coordinates for each detected object, Mask R-CNN will also return the pixel-wise mask for each detected object in an image. This pixel-wise masking is called Instance Segmentation. Instance segmentation powers some of the recent advances that we see in the field of computer vision, including self-driving cars, robotics, and more.

Here we will use the implementation of Mask-RCNN by matterport. The reason I am using M-RCNN is that it has very good accuracy and matterport’s implementation is very easy to use.

M-RCNN will be run on every frame of the video, and for each frame it will return a dictionary that contains the bounding box coordinates, the masks of detected objects, the confidence score for each prediction, and the class_ids of detected objects. Using the class_ids, we will keep only the bounding boxes of cars, trucks, and buses. Then we will use these boxes in the next step to calculate the IoU.

The output of M-RCNN

Check out this official IPython Notebook tutorial to understand the API of M-RCNN.
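To make the class_ids filtering concrete, here is a small sketch. It assumes the result dictionary uses the keys `rois`, `class_ids`, and `scores` with NumPy arrays as values, and that the COCO class list places car, bus, and truck at indices 3, 6, and 8; check both against the class list of the weights you load. The helper name `get_vehicle_boxes` and the `min_score` parameter are hypothetical.

```python
import numpy as np

# Assumed COCO indices: 3 = car, 6 = bus, 8 = truck.
VEHICLE_CLASS_IDS = (3, 6, 8)


def get_vehicle_boxes(result, min_score=0.0):
    """Return the rows of result['rois'] that belong to vehicles.

    Each row is assumed to be (y1, x1, y2, x2) in pixel coordinates.
    """
    is_vehicle = np.isin(result["class_ids"], VEHICLE_CLASS_IDS)
    confident = result["scores"] >= min_score
    return result["rois"][is_vehicle & confident]


# Example with a hand-made result dictionary (not real model output).
fake_result = {
    "rois": np.array([[0, 0, 10, 10], [5, 5, 20, 20], [1, 1, 2, 2]]),
    "class_ids": np.array([3, 1, 8]),  # car, person, truck
    "scores": np.array([0.90, 0.80, 0.95]),
}
vehicle_boxes = get_vehicle_boxes(fake_result)
```

In the hand-made example the second detection is a person, so only the first and third boxes are kept.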

3. Calculating Intersection Over Union (IoU)

Intersection over Union is simply an evaluation metric. As its name suggests, it is the ratio of the area of overlap (intersection) of two regions to the area of their union. It can therefore be computed as:

IoU = Area of Overlap / Area of Union

As I said earlier, we will compute the IoU for every pair of parking spot coordinates and bounding box of cars. If the IoU for a pair is higher than a certain threshold, we will consider that parking spot as occupied.
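As a worked example of the metric, the sketch below computes IoU for two axis-aligned boxes in plain Python. The `(y1, x1, y2, x2)` ordering is an assumption chosen to match the bounding box rows described earlier, and `box_iou` is a hypothetical helper name.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (y1, x1, y2, x2)."""
    ay1, ax1, ay2, ax2 = a
    by1, bx1, by2, bx2 = b

    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    intersection = inter_h * inter_w

    area_a = (ay2 - ay1) * (ax2 - ax1)
    area_b = (by2 - by1) * (bx2 - bx1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Two 10x10 boxes that overlap in a 10x5 strip:
# intersection = 50, union = 100 + 100 - 50 = 150, so IoU = 1/3.
example_iou = box_iou((0, 0, 10, 10), (0, 5, 10, 15))
```

Identical boxes give an IoU of 1 and disjoint boxes give 0, which is why a threshold between the two can separate occupied from empty spots.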

To calculate IoU we are going to use a Python library called Shapely. It comes with an easy-to-use API that we will use to calculate the area of intersection and the area of union of two polygons.
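A minimal sketch of this calculation with Shapely follows. The helper names `polygon_iou` and `occupied_spots` are hypothetical, the threshold of 0.15 is a placeholder rather than a value from this post, and bounding boxes are assumed to arrive as `(y1, x1, y2, x2)`.

```python
from shapely.geometry import Polygon, box


def polygon_iou(spot_vertices, roi):
    """IoU between a parking spot polygon and a car bounding box.

    spot_vertices: list of (x, y) corners, in drawing order.
    roi: bounding box as (y1, x1, y2, x2).
    """
    y1, x1, y2, x2 = roi
    spot = Polygon(spot_vertices)
    car = box(x1, y1, x2, y2)  # shapely's box takes (minx, miny, maxx, maxy)

    union = spot.union(car).area
    if union == 0:
        return 0.0
    return spot.intersection(car).area / union


def occupied_spots(spots, rois, threshold=0.15):
    """Return one boolean per spot: True if any car overlaps it enough."""
    return [
        any(polygon_iou(spot, roi) > threshold for roi in rois)
        for spot in spots
    ]


spot_a = [(0, 0), (10, 0), (10, 10), (0, 10)]
spot_b = [(100, 100), (110, 100), (110, 110), (100, 110)]
car_roi = (0, 5, 10, 15)  # overlaps half of spot_a, none of spot_b
status = occupied_spots([spot_a, spot_b], [car_roi], threshold=0.3)
```

Because the spots are general polygons rather than axis-aligned rectangles, letting Shapely compute the intersection and union avoids writing the polygon clipping by hand. Note that corners clicked out of order produce a self-intersecting polygon, which Shapely treats as invalid.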

Detector Code

Now let’s put together all the pieces in the form of a Python script.

Github repo with IPython Notebook for google colab: https://github.com/sainimohit23/parking
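The full script lives in the repository above. The following is only a rough sketch of how the pieces could fit together, not the author’s code: `read_frames` and `spot_status` are hypothetical names, the detector is passed in as a plain function so the logic does not depend on a particular model, and the class ids and threshold are assumptions.

```python
import numpy as np
from shapely.geometry import Polygon, box

VEHICLE_CLASS_IDS = (3, 6, 8)  # assumed COCO ids for car, bus, truck
IOU_THRESHOLD = 0.15           # placeholder value, tune for your camera


def read_frames(video_path):
    """Yield RGB frames from a video file."""
    import cv2  # imported here so spot_status can be used without OpenCV

    cap = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    finally:
        cap.release()


def spot_status(frame, detect, spots, threshold=IOU_THRESHOLD):
    """Return one boolean per parking spot (True = occupied).

    detect: function mapping a frame to a dict with 'rois' (rows of
            y1, x1, y2, x2) and 'class_ids', both NumPy arrays.
    spots:  list of polygons, each a list of (x, y) corners.
    """
    result = detect(frame)
    keep = np.isin(result["class_ids"], VEHICLE_CLASS_IDS)
    cars = [box(x1, y1, x2, y2) for y1, x1, y2, x2 in result["rois"][keep]]

    status = []
    for vertices in spots:
        spot = Polygon(vertices)
        ious = [spot.intersection(c).area / spot.union(c).area for c in cars]
        status.append(max(ious, default=0.0) > threshold)
    return status
```

With matterport’s implementation, `detect` could be something like `lambda frame: model.detect([frame], verbose=0)[0]`, and the main loop would call `spot_status` on each frame yielded by `read_frames` with the spots loaded from the pickle file.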

Conclusion

In this tutorial, we have explored how to use Mask-RCNN to build a simple parking space detection system. I chose Mask-RCNN for this tutorial only because of its easy-to-use API and higher accuracy. On a single GPU, it can process around 4–5 frames per second. For example, a Tesla K80 GPU with 12 GB of memory can process a 320×240 RGB image in around 200 ms using Mask-RCNN. For a better frame rate, you can use the YOLO object detection model. YOLO is significantly faster than M-RCNN, but it is less accurate.

To get better results, instead of using bounding boxes we can use the masks of detected cars for the calculation of IoU. Using masks in the calculations will result in more accurate IoU values.
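As an illustration of the mask-based variant, the sketch below computes IoU between two boolean pixel masks with NumPy. It assumes the parking spot has already been rasterized into a mask of the same height and width as the frame (for instance with OpenCV’s `fillPoly`), and `mask_iou` is a hypothetical helper name.

```python
import numpy as np


def mask_iou(mask_a, mask_b):
    """IoU of two boolean masks with identical shape (H, W)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)

    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


# Two 4x4 masks: rows 0-1 versus rows 1-2.
# They share one row (4 pixels) out of three covered rows (12 pixels).
spot_mask = np.zeros((4, 4), dtype=bool)
spot_mask[0:2, :] = True
car_mask = np.zeros((4, 4), dtype=bool)
car_mask[1:3, :] = True
example_mask_iou = mask_iou(spot_mask, car_mask)
```

Counting pixels is slower than comparing rectangles, but the mask follows the car’s outline, so background pixels inside the bounding box no longer inflate the overlap.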

The code of this program is very raw. Various features can be added on top of it to make it usable and scalable for real-world scenarios. For example, a web interface can be added to monitor the parking spaces, or we can add a feature that sends push notifications to alert users about empty parking spots.

I hope you liked the post 🙂