Computer Vision 101… And More! (Part 2)

Source: Deep Learning on Medium

Preprocessing is boring, right? | Photo by Jo Szczepanska on Unsplash

Okay, so you’ve decided to explore Part 2 with me after completing Part 1, and damn, that is one prudent decision. Without further ado, let’s get down to it! First, we’ll import all the required packages:


Oh, a side note: the first three lines install three packages. Prefixing a line with ‘!’ tells the notebook to run it as a shell command, as if we were typing at the command line (pretty sick, right?!). Also, we’re importing skmultilearn to split multi-label data, and MLB is the custom Multi Label Generator. I’ll be using the VGG16 model for our deep neural net. Finally, ‘xx’ and ‘yy’ are the image height and width.
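If you ever run this outside a notebook, the ‘!’ trick won’t work, so here is a minimal sketch of the same idea in plain Python. The helper names `pip_install_command` and `pip_install` are made up for illustration, and the package list in the comment is an assumption: only pyunpack and skmultilearn (published on PyPI as scikit-multilearn) are named in this post.

```python
import subprocess
import sys


def pip_install_command(packages):
    """Build the command that a notebook's `!pip install ...` line boils down to."""
    # sys.executable makes sure we install into the interpreter that is running.
    return [sys.executable, "-m", "pip", "install", *packages]


def pip_install(packages):
    """Run pip as a child process and raise CalledProcessError if it fails."""
    subprocess.check_call(pip_install_command(packages))


# Hypothetical usage (package names are an assumption, see the note above):
# pip_install(["pyunpack", "scikit-multilearn"])
```

In a notebook, sticking with ‘!pip install …’ is perfectly fine; this is only handy when the same setup has to live in a regular script.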

Extraction and Organisation

You might’ve noticed that Kaggle gave us files with a .tar.7z extension alongside the usual .zip ones. We’ll create a function, extract_dir, that extracts all the files and neatly organises (yes, I prefer the British ‘s’ over the American ‘organiZe’. Sue me!) them using the pyunpack module. Note that extracting a .tar.7z file takes two steps: first it is unpacked to a .tar file, and then that .tar file is extracted to get the actual contents.
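The two-step extraction can be sketched roughly like this. It is not necessarily the exact extract_dir used here: the `intermediate_tar` helper is hypothetical, and the sketch assumes the inner .tar has the same name as the archive minus the ‘.7z’ suffix, which may not hold for every archive. pyunpack also needs patool and a 7z binary available on the machine.

```python
from pathlib import Path


def intermediate_tar(archive_path, dest_dir):
    """Guess where the .tar lands after the first pass (assumed naming)."""
    name = Path(archive_path).name
    return Path(dest_dir) / name[: -len(".7z")]


def extract_dir(archive_path, dest_dir):
    """Extract a .zip or .tar.7z archive into dest_dir."""
    # Imported here so the module loads even if pyunpack is not installed.
    from pyunpack import Archive

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    if str(archive_path).endswith(".tar.7z"):
        # Step 1: .tar.7z -> .tar
        Archive(str(archive_path)).extractall(str(dest))
        # Step 2: .tar -> actual files, then drop the leftover .tar
        tar_path = intermediate_tar(archive_path, dest)
        Archive(str(tar_path)).extractall(str(dest))
        tar_path.unlink()
    else:
        # .zip (and other single-layer archives) need just one pass
        Archive(str(archive_path)).extractall(str(dest))
```

Deleting the intermediate .tar is a design choice to save disk space; skip the `unlink()` call if you’d rather keep it around.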

Following is the directory structure:

Multi-Label Splitting


The problem with randomly splitting multi-label data is that the training set can end up with labels that never appear in the test set, and vice versa. To cope with this, there’s a nifty package called skmultilearn. ’Tis a little slow, but hey, it gets the job done. We’ve used the MultiLabelBinarizer to convert the labels into a format that it can easily work with. Here are the docs for this splitting strategy.
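Here’s a minimal sketch of both halves of that paragraph: a tiny check for labels that got stranded on one side of a split, and a stratified split. I’m assuming the MultiLabelBinarizer is scikit-learn’s and that the splitting strategy is skmultilearn’s `iterative_train_test_split`; the function names `missing_labels` and `stratified_split` are made up for illustration.

```python
def missing_labels(train_labels, test_labels):
    """Return (labels only in train, labels only in test)."""
    train_set = set().union(*train_labels)
    test_set = set().union(*test_labels)
    return train_set - test_set, test_set - train_set


def stratified_split(image_ids, label_lists, test_size=0.2):
    """Split image ids so label combinations stay balanced across both sets."""
    # Third-party imports live here so missing_labels works without them.
    import numpy as np
    from sklearn.preprocessing import MultiLabelBinarizer
    from skmultilearn.model_selection import iterative_train_test_split

    mlb = MultiLabelBinarizer()
    y = mlb.fit_transform(label_lists)  # 0/1 matrix, one column per label

    # The splitter indexes X as a 2D array, so pass row indices as a column.
    X = np.arange(len(image_ids)).reshape(-1, 1)
    X_train, y_train, X_test, y_test = iterative_train_test_split(
        X, y, test_size=test_size
    )

    train_ids = [image_ids[i] for i in X_train.ravel()]
    test_ids = [image_ids[i] for i in X_test.ravel()]
    return train_ids, y_train, test_ids, y_test, mlb.classes_
```

Stratification makes stranded labels much less likely, but it can’t conjure a second example of a label that appears only once, so running `missing_labels` on the result is a cheap sanity check.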

With this, we’re done with all the preprocessing related to getting ourselves ready for the fun stuff (yep, I’m talking ‘bout deep learning). *opens Part 3*

Definitely leave a comment about any line of code that you found difficult to understand. Happy to help! 😄