Source: Deep Learning on Medium
Road Segmentation from scratch using Deep Learning towards Autonomous Driving…
Segmentation is the task of classifying every pixel of an image, here to label the objects found in road traffic environments. This article shares a hands-on experience with road segmentation under Indian road traffic conditions.
Data-set: Several datasets exist in the autonomous driving space, including KITTI, Cityscapes, CamVid and Waymo. However, they were captured in well-structured, disciplined traffic environments.
The India Driving Dataset (IDD) was released in 2018 for segmentation and obstacle detection tasks. It was captured mostly in and around the cities of Hyderabad and Bangalore. The initial release of the IDD segmentation dataset has 10,003 images in total, split into 6,993 training, 981 validation and 2,029 test samples. It was the first dataset released specifically for Indian road driving sequences. The IDD segmentation dataset has 34 classes in total.
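As a quick sanity check on the numbers above, the split sizes can be added up and expressed as fractions of the whole. This is only an arithmetic sketch using the counts quoted in this article; the variable names are illustrative.

```python
# Split sizes as quoted in this article for the initial IDD segmentation release.
splits = {"train": 6993, "val": 981, "test": 2029}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

for name, count in splits.items():
    # Show each split with its share of the full dataset.
    print(f"{name:>5}: {count:5d} images ({fractions[name]:.1%})")
print(f"total: {total}")
```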
For the road segmentation challenge, we only need the pixel information of the road class. Notably, the ground-truth mask image carries RGB and A (alpha) channel information. As part of pre-processing, we therefore keep the road pixels and mask the rest of the classes as background.
One simple approach is to find the RGBA value of the class “road” and compare every pixel of the mask against it.
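Below is a short code snippet that compares each pixel against the road color and modifies the ground truth mask (the file name is a placeholder):
from PIL import Image
img = Image.open("mask.png").convert("RGBA")  # placeholder path to one label mask
pixdata = img.load()  # pixel access object, indexed as pixdata[x, y]
for x, y in ((x, y) for x in range(img.width) for y in range(img.height)):
    if pixdata[x, y] == (128, 64, 128, 255):  # road color RGBA information
        pixdata[x, y] = (255, 255, 255, 255)  # repaint road pixels white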
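The per-pixel loop is easy to read but slow on full-resolution masks. A vectorized variant with NumPy is sketched below. It is not the code used for this article: the function name `road_only_mask` is hypothetical, and unlike the loop above it also paints every non-road pixel opaque black, which is one possible reading of "masking the rest of the classes as background".

```python
import numpy as np

# Road color as quoted in this article (RGBA).
ROAD_RGBA = (128, 64, 128, 255)

def road_only_mask(rgba, road_color=ROAD_RGBA):
    """Return an RGBA array with road pixels white and all other pixels black.

    `rgba` is an (H, W, 4) uint8 array, for example
    np.array(Image.open(path).convert("RGBA")) when using Pillow.
    """
    rgba = np.asarray(rgba, dtype=np.uint8)
    # True where all four channels match the road color.
    is_road = np.all(rgba == np.array(road_color, dtype=np.uint8), axis=-1)
    out = np.zeros_like(rgba)
    out[..., 3] = 255                      # opaque black background (an assumption)
    out[is_road] = (255, 255, 255, 255)    # road pixels become white
    return out

# Tiny 2x2 example: two road pixels, two pixels of other colors.
demo = np.array(
    [[[128, 64, 128, 255], [0, 0, 0, 255]],
     [[244, 35, 232, 255], [128, 64, 128, 255]]],
    dtype=np.uint8,
)
result = road_only_mask(demo)
```

The comparison requires an exact color match, so it assumes the label masks were saved losslessly (for example as PNG) and not resized with interpolation.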
Once the original mask is preprocessed, we get the revised segmentation ground truth as follows:
This image can then be converted to black and white, as required for binary classification. An example of a black and white segmentation mask is given below:
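The black and white conversion can be sketched as follows, assuming the road pixels have already been repainted white as described earlier. Both function names are hypothetical, and the 0/1 float target is just one common way to feed a binary mask to a network, not necessarily what was used here.

```python
import numpy as np

def white_to_binary(mask_rgba):
    """Collapse a recolored mask to one channel: 255 where the pixel is white, else 0."""
    rgb = np.asarray(mask_rgba, dtype=np.uint8)[..., :3]
    is_white = np.all(rgb == 255, axis=-1)
    return np.where(is_white, 255, 0).astype(np.uint8)

def as_training_target(binary_mask):
    """Scale a 0/255 mask to 0.0/1.0 floats and add a trailing channel axis."""
    return (np.asarray(binary_mask) / 255.0).astype(np.float32)[..., np.newaxis]

# 2x2 example: white road pixels on the diagonal, other class colors elsewhere.
recolored = np.array(
    [[[255, 255, 255, 255], [70, 70, 70, 255]],
     [[244, 35, 232, 255], [255, 255, 255, 255]]],
    dtype=np.uint8,
)
binary = white_to_binary(recolored)
target = as_training_target(binary)
```

With Pillow, the single-channel result can be written out with `Image.fromarray(binary).save(...)`. Note that any class whose original label color is pure white would be mistaken for road by this check.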
The major part is the segmentation model. Among the many state-of-the-art models, we started with U-Net, a popular architecture that was originally introduced for biomedical image segmentation.
The architecture of U-Net is given below. One advantage of U-Net is that, being fully convolutional, it can be tried with different image dimensions.
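To make the point about image dimensions concrete, the sketch below traces the feature-map sizes through a U-Net-style encoder and decoder. It assumes the commonly used variant with size-preserving ("same"-padded) convolutions, 2x2 pooling and 2x upsampling; the original U-Net paper uses unpadded convolutions with cropping instead. The function name and the default depth of 4 are illustrative.

```python
def unet_feature_sizes(height, width, depth=4):
    """Trace spatial sizes through `depth` poolings and the matching upsamplings.

    Returns (encoder_sizes, decoder_sizes, compatible), where `compatible`
    is True when every upsampled map matches its skip connection exactly.
    """
    # Encoder: each 2x2 pooling halves the size (rounding down).
    encoder = [(height, width)]
    for _ in range(depth):
        h, w = encoder[-1]
        encoder.append((h // 2, w // 2))

    # Decoder: each upsampling doubles the size, starting from the bottleneck.
    decoder = []
    h, w = encoder[-1]
    for _ in range(depth):
        h, w = h * 2, w * 2
        decoder.append((h, w))

    # Skip connections pair decoder outputs with encoder levels, deepest first.
    skips = list(reversed(encoder[:-1]))
    compatible = all(d == s for d, s in zip(decoder, skips))
    return encoder, decoder, compatible

enc, dec, ok = unet_feature_sizes(256, 256, depth=4)
odd_ok = unet_feature_sizes(250, 250, depth=4)[2]
```

Under these assumptions the skip connections line up without cropping or padding only when both dimensions are divisible by 2 to the power of the depth (16 for a depth of 4), so other input sizes are usually resized or padded first.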
Training on the IDD dataset with Inception-ResNet and DenseNet, learning rates of 0.005 and 0.0001, and Nadam as the optimizer, followed by further fine-tuning, gave a decent score of 87%. The results are visualized below: