In this article, we are going to download the dataset we will use to train our model.
This article is the second part of the series about BiSeNet, an amazing network for the task of semantic segmentation. BiSeNet brings a novel approach that decouples the functions of spatial information preservation (high-resolution features) and receptive field enlargement by offering two paths. Specifically, the paper proposes a Bilateral Segmentation Network (BiSeNet) with a Spatial Path (SP) and a Context Path (CP). These two paths fix a shortcoming of previous approaches to semantic segmentation, which compromised accuracy for speed.
Without further ado, let’s get this show on the road!
First things first, we have got to have data to train our Artificial Intelligence (AI) algorithm. In supervised learning, for example, we want the algorithm to learn the mapping between the input (x) and the output (y), so that given a new input (x) that is not part of the dataset it was trained on, it can predict the output (y).
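To make the idea concrete, here is a minimal sketch of that input-to-output mapping: we fit a tiny one-parameter model on a few (x, y) examples and then predict on an input the model has never seen. The numbers here are made up purely for illustration.

```python
# A minimal sketch of supervised learning: learn the mapping between
# inputs (x) and outputs (y) from examples, then predict on a new input.
xs = [1.0, 2.0, 3.0, 4.0]        # training inputs (x)
ys = [2.0, 4.0, 6.0, 8.0]        # training outputs (y) -- here y = 2 * x

# Fit a one-parameter model y = w * x by least squares:
# w = sum(x * y) / sum(x * x)
w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

new_x = 5.0                      # an input NOT in the training set
prediction = w * new_x
print(prediction)                # -> 10.0
```

Real models (like BiSeNet) have millions of parameters instead of one, but the principle is the same: learn from (x, y) pairs, then generalize to unseen x.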
In order to get a dataset, we can either collect it ourselves (for example via web scraping) or download a dataset someone else has put together. There are some pretty famous dataset repositories which offer a variety of datasets for many purposes, such as:
- CamVid (which is the dataset we will be using)
- COCO stuff
- Kaggle
Contrary to what many think, Kaggle does not only host Data Science competitions; it is also a dataset and kernel repository, where Data Scientists share their datasets along with kernels that give more insight into those datasets.
For our implementation of BiSeNet, we are going to use a dataset called CamVid, which is the one used in the research paper by the researchers who invented BiSeNet.
The CamVid dataset is a street-scene dataset captured from the perspective of a driving automobile. It contains 701 images in total, of which 367 are for training, 101 for validation and 233 for testing. The images have a resolution of 960×720, and there are 11 semantic categories/labels.
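It is worth writing these figures down as constants, both as a quick sanity check on the split and because we will need the image size and class count later:

```python
# The CamVid figures quoted above, as constants plus a quick sanity check.
TRAIN_SIZE, VAL_SIZE, TEST_SIZE = 367, 101, 233
WIDTH, HEIGHT = 960, 720
NUM_CLASSES = 11

total = TRAIN_SIZE + VAL_SIZE + TEST_SIZE
print(total)   # -> 701, matching the dataset's total image count
```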
To download the dataset and labels click here.
Preprocessing and Visualizing
Pre-processing refers to the transformations applied to our data before feeding it to the algorithm.
Data preprocessing is a technique used to convert raw data into a clean data set. In other words, data gathered from different sources arrives in a raw format that is not feasible for analysis. Furthermore, we might not use the entire dataset, because some features are not important or relevant to our problem.
For example, if we want an algorithm to distinguish dogs from cats, and our dataset contains their pictures in one column and the names of the owners in another column, we can discard the owner-name column because it does not contribute at all to distinguishing dogs from cats. All we need are features like the shape of the ears, the nose and the type of fur, which are in the pictures column.
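The dog/cat example above can be sketched in a few lines. The records and field names here are hypothetical, invented only to illustrate dropping an uninformative feature:

```python
# A hypothetical pet dataset kept as a list of records. The "owner" field
# carries no signal for telling dogs from cats, so we drop it before training.
records = [
    {"picture": "dog_01.png", "owner": "Alice", "label": "dog"},
    {"picture": "cat_01.png", "owner": "Bob",   "label": "cat"},
    {"picture": "dog_02.png", "owner": "Carol", "label": "dog"},
]

# Keep only the informative fields.
features = [{k: v for k, v in r.items() if k != "owner"} for r in records]
print(features[0])   # -> {'picture': 'dog_01.png', 'label': 'dog'}
```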
In this section, I’m going to present the code I use to load the files, convert them into NumPy arrays and plot them in a Jupyter notebook.
The first step is to import the libraries we are going to use to manipulate the data, and to set the path where the dataset is saved, whether on your laptop or in the cloud.
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# image_path is defined as a global variable
image_path = "path to where you saved your dataset"
After this, we are going to write a function which receives the path as a parameter, gathers all the image files under that path and puts them in lists.
def loadImages(path):
    image_files = sorted([os.path.join(path, 'train', file)
                          for file in os.listdir(path + "/train") if file.endswith('.png')])
    annotation_files = sorted([os.path.join(path, 'trainannot', file)
                               for file in os.listdir(path + "/trainannot") if file.endswith('.png')])
    image_val_files = sorted([os.path.join(path, 'val', file)
                              for file in os.listdir(path + "/val") if file.endswith('.png')])
    annotation_val_files = sorted([os.path.join(path, 'valannot', file)
                                   for file in os.listdir(path + "/valannot") if file.endswith('.png')])
    return image_files, annotation_files, image_val_files, annotation_val_files
Then we can write the main code that calls the loadImages() function, passing it the path. Algorithms can’t read an image as it is; we need to convert each one into a NumPy array so we can feed it to the algorithm and visualize it.
# calling the global variable image_path
# The variable dataset is a tuple of size 4 holding the file lists from the
# different folders, so we can access them using dataset[0...3]
dataset = loadImages(image_path)
train_set = dataset[0]                  # the list of training images

img_1 = mpimg.imread(train_set[0])      # read the first image into a NumPy array
display = plt.imshow(img_1)
plt.show()
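Since CamVid pairs every image with an annotation map, it is also handy to plot the two side by side. This is only a sketch: the real file paths come from loadImages(), but here I fake a tiny image and label map with random data so the snippet runs on its own.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")   # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Sketch of viewing an image next to its annotation. The arrays below are
# random stand-ins with CamVid's shape (720x960, 11 classes); in practice
# you would load them with mpimg.imread() from the loadImages() lists.
img = np.random.rand(720, 960, 3)                    # fake RGB street scene
annot = np.random.randint(0, 11, size=(720, 960))    # fake 11-class label map

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.imshow(img)
ax1.set_title("image")
ax2.imshow(annot, cmap="tab20", vmin=0, vmax=10)     # one color per class
ax2.set_title("annotation")
fig.savefig("camvid_pair.png")
```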
For the full code used in this post, here is the link to my GitHub.
This repo contains the code for my attempt to implement this amazing research paper. …github.com
This concludes Part II of this series about BiSeNet. Stay tuned for more amazing content and for the next part, with the code for implementing this state-of-the-art real-time semantic segmentation network.
Thank you for reading! If you have any thoughts, comments or critiques, please leave them down below.
If you liked it, please give me a round of applause 👏👏👏 and share it with your friends.
Source: Deep Learning on Medium