Intersection over Union — Object Detection Evaluation Technique

Original article was published on Deep Learning on Medium

This article describes the concept of IoU in object detection problems. It will also walk you through applying it in Python code.

What is IoU?

As we know, any object detection algorithm, be it RCNN, Faster RCNN, or Mask RCNN, will always draw a rectangular bounding box around the object we want to detect in an image.

IoU stands for Intersection over Union. It is an evaluation metric: any algorithm that outputs predicted bounding boxes can be evaluated using IoU.

In order to apply Intersection over Union to evaluate an object detector we need:

  1. The ground-truth bounding boxes (i.e., the hand-labeled bounding boxes from the validation set that specify where in the image our object is). Technically, the ground truth consists of the actual coordinates of the object, taken from the corresponding image’s annotation file (XML or CSV).
  2. The predicted bounding boxes from our model.

Now, how do we draw this ground-truth box manually?

Although there are many open-source annotation tools available on the Internet, I’ll be using one of the most commonly used tools here for demonstration purposes: LabelImg.

Steps to be followed are as below:

  1. Open Command Line Terminal in your local PC and clone the git repository as shown below.
    git clone
    Alternatively, you can download the labelImg-master zip file directly from the GitHub link above.

2. I am using a Windows machine with Anaconda preinstalled, so I executed the commands below one by one.
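The command listing from the original post is not reproduced here. For reference, the LabelImg README suggests roughly the following for a Windows + Anaconda setup, run from inside the cloned labelImg-master folder; treat this as a sketch and check the README for your version:

```shell
# Install the GUI toolkit and the XML library LabelImg depends on
conda install pyqt=5
conda install -c anaconda lxml
# Compile the Qt resource file, then launch the application
pyrcc5 -o libs/resources.py resources.qrc
python labelImg.py
```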

3. It will open the LabelImg Application.

Click on Open Dir and browse to the folder which has all the training images.

4. Select an image from the file list and click on Create Rect Box.

5. Drag the mouse pointer to draw a rectangular box around the object and enter the class name. For my case, it is kangaroo. After that, click on the Save icon to save this annotated xml file.

6. Likewise, we have to create annotated xml files by labelling the objects for all the images.

For cases where there are two kangaroos in an image, we’ll draw two boxes and name the class as kangaroo for both boxes.

For cases where there is one kangaroo and one horse in an image, we’ll draw one box for the kangaroo and one for the horse, naming the classes kangaroo and horse respectively.

7. Now, if we open any of the xml files, we’ll be able to see the actual coordinates of the objects annotated.

In the annotation XML file for one of our images, we have two kangaroos present.
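Such a file (Pascal VOC format, as written by LabelImg) looks roughly like this; the coordinates below are made up for illustration:

```xml
<annotation>
  <filename>kangaroo_01.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>kangaroo</name>
    <bndbox><xmin>25</xmin><ymin>40</ymin><xmax>210</xmax><ymax>300</ymax></bndbox>
  </object>
  <object>
    <name>kangaroo</name>
    <bndbox><xmin>260</xmin><ymin>60</ymin><xmax>420</xmax><ymax>310</ymax></bndbox>
  </object>
</annotation>
```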

Now let’s come back to the concept of IoU.

Formula for IoU:

IoU = Area of Overlap / Area of Union

That is, the area of intersection between the predicted box and the ground-truth box, divided by the total area covered by the two boxes together.
If the IoU is greater than a threshold, say 0.5, then the detected object is counted as a kangaroo; otherwise, it is not an object of our interest.

To compute the intersection area, we first have to get the coordinates of the intersection rectangle in the Python program.

Let us define the get_iou function.
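The original listing is not reproduced here; a minimal sketch of such a function, assuming boxes are given as [x_min, y_min, x_max, y_max], could look like this:

```python
def get_iou(bb1, bb2):
    """Compute Intersection over Union of two boxes given as [x_min, y_min, x_max, y_max]."""
    # Coordinates of the intersection rectangle
    x_left = max(bb1[0], bb2[0])
    y_top = max(bb1[1], bb2[1])
    x_right = min(bb1[2], bb2[2])
    y_bottom = min(bb1[3], bb2[3])

    # No overlap at all
    if x_right < x_left or y_bottom < y_top:
        return 0.0

    intersection = (x_right - x_left) * (y_bottom - y_top)
    area1 = (bb1[2] - bb1[0]) * (bb1[3] - bb1[1])
    area2 = (bb2[2] - bb2[0]) * (bb2[3] - bb2[1])
    union = area1 + area2 - intersection
    return intersection / union
```

Identical boxes give an IoU of 1.0, disjoint boxes give 0.0, and partial overlaps fall in between.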

Now that our get_iou function is ready, let us see how we can fetch the coordinates of the ground-truth (actual) boxes from the annotation file.

First, we need to convert the XML file to CSV.
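The conversion code is not shown in this extract; a sketch using only the standard library (function name, CSV column order, and directory layout are my own choices) could look like this:

```python
import csv
import glob
import os
import xml.etree.ElementTree as ET

def xml_to_csv(xml_dir, csv_path):
    """Convert Pascal VOC annotation XMLs (as produced by LabelImg) into one CSV,
    with one row per annotated object."""
    rows = []
    for xml_file in glob.glob(os.path.join(xml_dir, "*.xml")):
        root = ET.parse(xml_file).getroot()
        filename = root.findtext("filename")
        for obj in root.findall("object"):
            bbox = obj.find("bndbox")
            rows.append([
                filename,
                obj.findtext("name"),
                int(bbox.findtext("xmin")),
                int(bbox.findtext("ymin")),
                int(bbox.findtext("xmax")),
                int(bbox.findtext("ymax")),
            ])
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "class", "xmin", "ymin", "xmax", "ymax"])
        writer.writerows(rows)
    return rows
```

An image with two annotated kangaroos produces two rows in the resulting CSV, one per `<object>` element.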

If there were two kangaroos in an image, then we would have got two rows in the converted CSV file, just like below.

Annotations for an image in CSV format

Now, let’s loop through the rows of the CSV (one row per annotated object; in our case, just one) and fetch the x and y coordinates of the object. After fetching, we will append the values to an array named bb1.

Just for the demonstration of IoU in this article, I am not going to actually predict the values for bb2 (the predicted rectangular box coordinates) through any algorithm. Rather, I will pass hard-coded values manually.

Now, let’s call the IoU function to check for the IoU values.
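Putting these steps together, a sketch could look as follows. The CSV path and the hard-coded bb2 values are illustrative, and the IoU computation described earlier is repeated so the snippet runs on its own:

```python
import csv

def get_iou(bb1, bb2):
    # IoU of two [x_min, y_min, x_max, y_max] boxes
    # (same computation as described earlier, repeated here for a standalone snippet)
    x_left, y_top = max(bb1[0], bb2[0]), max(bb1[1], bb2[1])
    x_right, y_bottom = min(bb1[2], bb2[2]), min(bb1[3], bb2[3])
    if x_right < x_left or y_bottom < y_top:
        return 0.0
    inter = (x_right - x_left) * (y_bottom - y_top)
    area1 = (bb1[2] - bb1[0]) * (bb1[3] - bb1[1])
    area2 = (bb2[2] - bb2[0]) * (bb2[3] - bb2[1])
    return inter / (area1 + area2 - inter)

def iou_per_object(csv_path, bb2):
    """Read each ground-truth box (bb1) from the converted CSV and
    compute its IoU against the hard-coded predicted box bb2."""
    ious = []
    with open(csv_path) as f:
        for row in csv.DictReader(f):
            bb1 = [int(row["xmin"]), int(row["ymin"]),
                   int(row["xmax"]), int(row["ymax"])]
            ious.append(get_iou(bb1, bb2))
    return ious

# Example usage (paths and values are illustrative):
# iou_per_object("annotations.csv", [50, 60, 200, 220])
```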

For images with two objects or kangaroos, we would have got two IoUs as shown below.

Let us now repeat the process while also plotting the ground-truth (actual) and predicted boxes on the original image.

The calculated IoU comes out to 0.0149. Let us change the values of the predicted rectangular box coordinates (bb2) and check again.

The new IoU value for the detected object is now 0.6997.

In real object detection problems, we set a threshold on the IoU value (0.5, as mentioned earlier), so that a predicted rectangular box is kept only when its IoU is above the threshold.

That’s it for this article. I hope you now have a clear understanding of the concept of IoU in object detection problems. In subsequent articles, we will discuss how to actually predict the bounding-box values through algorithms such as RCNN, Faster RCNN, or Mask RCNN, instead of passing hard-coded values.

Thank you!