Object detection on Satellite Imagery using RetinaNet (Part 2) — Inference

Source: Deep Learning on Medium

4. Using the trained model for inference

4.1 Generate detection data about previously unseen images

Once you have achieved an accuracy you are happy with, you can quickly use the trained weights to generate detections on previously unseen (test) images in two easy steps:

i. Convert the model: the training models used by keras-retinanet are stripped-down versions of the inference models we need to make predictions, containing only the layers required for training. That is why we need to convert the training model saved in /snapshots/ before it can be used for inference.

ii. Generate predictions (class and bounding box coordinates) and save them as a .csv file.

For this purpose I create an image_inference_write.py file and start off by constructing an argument parser. Using those arguments, I create variables for the input path, the output path, the desired threshold, the model path, and the path to the previously generated classes.csv. I then load the class labels and the inference model and create a list of images I want to run detections on:

Next, I loop over that list of inference image paths, each time creating a file to store predictions in, then loading, preprocessing and scaling the image before making predictions with model.predict_on_batch(). Weak detections are filtered out according to the desired threshold (the default is 0.5). Each valid detection, described by the image path, a confidence score, its bounding box coordinates and its class, is then written to a CSV file saved in the output directory.

Once the file is created, I can run the following script to generate predicted detection data saved in the specified directory (-o argument):

# convert training model to inference model
!python /content/keras-retinanet/keras_retinanet/bin/convert_model.py '/content/snapshots/resnet50_csv_30.h5' '/content/snapshots/resnet50_csv_30_inference.h5'
inference_model = '/content/snapshots/resnet50_csv_30_inference.h5'

# generate predictions
!python /content/ije_retinanet/image_inference_write.py \
-i '/content/data/test_data_images/test_data_images/test/' \
-t 0.6 \
-m {inference_model} \
-o /content/data/ \
-l /content/images_subset/classes.csv

Note that this program generates a separate CSV file for each image, containing all detections made on that one image. For analysis purposes, it might be useful to combine all output files into a single one, like so:

# combine all csv files into one
import glob

fout = open("/content/data/output/out.csv", "a")
# write a header line here if you want one
# append files
for fi in glob.glob("/content/data/*.csv"):
    f = open(fi)
    for line in f:
        fout.write(line)
    f.close()
fout.close()

4.2 Generate images with bounding boxes around detected objects

Now to the fun part: generating copies of the original test images with bounding boxes drawn around detected objects. The program for this task starts much like the one in the previous section.

I create an image_inference_print.py file and start by setting up the argument parser that lets me point the program to i. the input directory of images I want to detect objects on, ii. the path to the trained weights, and iii. the path to the directory where I want the detections saved. It also provides the option of specifying a custom threshold for filtering out weak detections should I want to experiment with values other than the default of 0.5.

As before, I loop over the list of images I want to produce detections on. This time, however, in addition to preprocessing the image and correcting for image scale, I make a copy of the image to draw on, generate visualisations of detections using keras-retinanet tools based on OpenCV, and save the output to the path specified in the arguments:
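A sketch of that per-image routine follows. The keras-retinanet helpers (preprocess_image, resize_image, draw_box, draw_caption) are the library's real visualisation utilities, but the surrounding structure and the "_detected" output naming are my assumptions. The heavy imports are kept inside the function so the path helper stays importable on its own.

```python
import os


def output_path(image_path, out_dir, suffix="_detected"):
    """Build out_dir/<name>_detected.<ext> for an input image (naming is an assumption)."""
    base, ext = os.path.splitext(os.path.basename(image_path))
    return os.path.join(out_dir, base + suffix + ext)


def draw_detections(image_path, model, label_names, out_dir, threshold=0.5):
    import cv2
    import numpy as np
    from keras_retinanet.utils.image import preprocess_image, resize_image
    from keras_retinanet.utils.visualization import draw_box, draw_caption

    image = cv2.imread(image_path)
    draw = image.copy()                       # copy of the image to draw on
    inp, scale = resize_image(preprocess_image(image))
    boxes, scores, labels = model.predict_on_batch(np.expand_dims(inp, axis=0))
    boxes /= scale                            # rescale boxes to the original image
    for box, score, label in zip(boxes[0], scores[0], labels[0]):
        if score < threshold:
            break                             # scores come back sorted descending
        b = box.astype(int)
        draw_box(draw, b, color=(0, 255, 0))
        draw_caption(draw, b, f"{label_names[label]} {score:.2f}")
    cv2.imwrite(output_path(image_path, out_dir), draw)
```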

With this program in place, I can now easily generate detections on my test images by running the following script:

# create output directory where you want to save images with bounding boxes
!mkdir /content/data/output
# uncomment if you haven't already converted the training model to an inference model
# !python /content/keras-retinanet/keras_retinanet/bin/convert_model.py '/content/snapshots/resnet50_csv_30.h5' '/content/snapshots/resnet50_csv_30_inference.h5'
# inference_model = '/content/snapshots/resnet50_csv_30_inference.h5'
# generate detections on images
!python /content/ije_retinanet/image_inference_print.py \
-i /content/data/test_data_images/test_data_images/test/ \
-t 0.6 \
-m {inference_model} \
-o /content/data/output

Here are a few examples of input vs. output images:

Left: Raw image; Right: Image with bounding boxes around identified objects

Et voilà! Not perfect, but a great start. Feel free to reach out with any questions or comments.