Image Augmentation to Build a Powerful Image Classification Model

Original article was published on Deep Learning on Medium


Objective:

In this article, we will learn how to build a powerful image classification model even when we don’t have much training data. Specifically, we will learn how to generate more training data from an existing dataset.

Let’s Start:

In my previous article, I explained how to build a deep neural network model that can classify images. A deep neural network needs a large amount of training data to achieve good performance. But what if we don’t have much training data? In many real-world use cases, even small-scale data collection can be extremely expensive or nearly impossible (e.g. in medical imaging).

We can deal with this type of problem using the image augmentation technique.

Image Augmentation:

Image augmentation is a technique that artificially creates new training images by transforming existing ones in different ways, such as rotations, shears, shifts, and flips. It is used to expand the training dataset.

Different types of Image Augmentation techniques:

i) Shift (Horizontal and Vertical):

A horizontal shift moves all pixels of the image horizontally, and a vertical shift moves all pixels vertically. In both cases, the image dimensions remain the same, so some pixels are clipped off one edge of the image, while on the opposite edge a region appears for which new pixel values have to be specified.
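To make the mechanics concrete, here is a minimal NumPy sketch of a horizontal shift with constant fill. This is only an illustration of the idea, not the Keras implementation; the helper name `shift_horizontal` is my own.

```python
import numpy as np

def shift_horizontal(img, dx, fill=0):
    """Shift image columns by dx pixels; pixels shifted past the edge
    are clipped off, and the vacated columns get a constant fill value."""
    out = np.full_like(img, fill)
    if dx > 0:
        out[:, dx:] = img[:, :-dx]   # shift right, clip rightmost columns
    elif dx < 0:
        out[:, :dx] = img[:, -dx:]   # shift left, clip leftmost columns
    else:
        out[:] = img
    return out

img = np.arange(16).reshape(4, 4)
shifted = shift_horizontal(img, 1)  # first column is now the fill value
```

A vertical shift works the same way on the row axis.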

ii) Flip (Horizontal and Vertical):

Flipping means reversing the rows of pixels (vertical flip) or the columns of pixels (horizontal flip). But check your images before applying a flip: a flip operation on, say, a car number plate is not recommended.
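In NumPy terms, both flips are just reversed slices (a small illustrative sketch, not the Keras implementation):

```python
import numpy as np

img = np.array([[1, 2, 3],
                [4, 5, 6]])

# horizontal flip: reverse the columns of each row
h_flip = img[:, ::-1]

# vertical flip: reverse the order of the rows
v_flip = img[::-1, :]

print(h_flip.tolist())  # [[3, 2, 1], [6, 5, 4]]
print(v_flip.tolist())  # [[4, 5, 6], [1, 2, 3]]
```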

iii) Image Rotation:

A rotation will likely move pixels out of the image frame and leave areas of the frame with no pixel data, which must be filled in. We can apply a random rotation of anywhere between 0 and 360 degrees.
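As a sketch of what happens under the hood, here is a small nearest-neighbour rotation in plain NumPy. It is illustrative only; `rotate_nn`, its fill behaviour, and the rotation direction are my own choices, not the Keras implementation.

```python
import numpy as np

def rotate_nn(img, degrees, fill=0):
    """Rotate a 2-D image about its centre using nearest-neighbour
    sampling; output pixels that map outside the source frame are
    filled with a constant value."""
    h, w = img.shape
    theta = np.deg2rad(degrees)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.full_like(img, fill)
    ys, xs = np.indices((h, w))
    # inverse-map each output pixel back into the source image
    sx = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    sy = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    sxi = np.rint(sx).astype(int)
    syi = np.rint(sy).astype(int)
    valid = (sxi >= 0) & (sxi < w) & (syi >= 0) & (syi < h)
    out[ys[valid], xs[valid]] = img[syi[valid], sxi[valid]]
    return out
```

Rotating by an angle that is not a multiple of 90 degrees leaves the corners of the frame mapped outside the source image, which is exactly where the fill value appears.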

iv) Change of Brightness:

We can also generate images with different brightness levels by randomly darkening or brightening them.
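A brightness change is just a multiplicative scaling of pixel values, clipped back into the valid 0-255 range. A minimal sketch follows; the helper `adjust_brightness` is my own illustration (in Keras, the `brightness_range` argument of ImageDataGenerator handles this):

```python
import numpy as np

def adjust_brightness(img, factor):
    """Scale pixel intensities by `factor` and clip to the valid 0-255 range."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

img = np.array([[50, 100], [200, 250]], dtype=np.uint8)

darker = adjust_brightness(img, 0.5)    # [[25, 50], [100, 125]]
brighter = adjust_brightness(img, 1.5)  # [[75, 150], [255, 255]] (clipped)

# for augmentation, draw the factor at random for each image
rng = np.random.default_rng()
augmented = adjust_brightness(img, rng.uniform(0.5, 1.5))
```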

v) Apply Zooming:

A zoom augmentation randomly zooms the image in or out; zooming out adds new pixel values around the image, while zooming in interpolates between existing pixel values.
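As a sketch of a zoom-in, we can crop the central region and upscale it back to the original size with nearest-neighbour pixel repetition. This is illustrative NumPy only; `zoom_in_2x` is my own helper, while Keras handles zooming through the `zoom_range` argument.

```python
import numpy as np

def zoom_in_2x(img):
    """Zoom in by a factor of 2: crop the central half of a 2-D image,
    then upscale it back to the original size by repeating pixels
    (nearest-neighbour interpolation)."""
    h, w = img.shape
    crop = img[h // 4:h // 4 + h // 2, w // 4:w // 4 + w // 2]
    return np.repeat(np.repeat(crop, 2, axis=0), 2, axis=1)

img = np.arange(16).reshape(4, 4)
zoomed = zoom_in_2x(img)  # same 4x4 shape, built from the central 2x2 block
```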

Augmented image generator:

Using the ImageDataGenerator class in Keras we can easily create an augmented image generator. Let’s see how the ImageDataGenerator class works.

# Importing the class
from keras.preprocessing.image import ImageDataGenerator

# Creating an instance of the ImageDataGenerator class
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
  • rotation_range is a value in degrees (0-180), a range within which to randomly rotate pictures
  • width_shift and height_shift are ranges (as a fraction of total width or height) within which to randomly translate pictures horizontally or vertically
  • rescale is a value by which we multiply the data before any other processing. Our original images consist of RGB coefficients in the 0-255 range, but such values would be too high for our model to process (given a typical learning rate), so we target values between 0 and 1 instead by scaling with a 1/255 factor.
  • shear_range is for randomly applying shearing transformations
  • zoom_range is for randomly zooming inside pictures
  • horizontal_flip is for randomly flipping half of the images horizontally; relevant when there are no assumptions of horizontal asymmetry (e.g. real-world pictures).
  • fill_mode is the strategy used for filling in newly created pixels, which can appear after a rotation or a width/height shift.
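For example, the rescale factor maps raw 8-bit pixel values into the [0, 1] range:

```python
import numpy as np

# raw 8-bit pixel intensities
pixels = np.array([0, 127, 255], dtype=np.float32)

# apply the same 1/255 scaling that rescale performs
scaled = pixels * (1. / 255)  # approximately [0.0, 0.498, 1.0]
```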

Let’s Start Coding:

i) Importing necessary libraries:

from numpy import expand_dims
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot

ii) Setting a sample image input:

%matplotlib inline
import matplotlib.image as mpimg
img=mpimg.imread('/image/rabbit.jpg')
pyplot.figure(figsize=(10,10))
imgplot = pyplot.imshow(img)
pyplot.show()

In this example, we will set rotation_range and horizontal_flip and check the results.

# loading the image
img = load_img('/image/rabbit.jpg')
# converting to numpy array
data = img_to_array(img)
# expanding the dimension to one sample
samples = expand_dims(data, 0)
# creating image data augmentation generator
datagen = ImageDataGenerator(rotation_range=30, horizontal_flip=True)
# preparing iterator
it = datagen.flow(samples, batch_size=1)
# setting the figure size
pyplot.figure(figsize=(15, 15))

# generating samples and plotting
for i in range(9):
    # defining subplot
    pyplot.subplot(330 + 1 + i)
    # generating a batch of images
    batch = next(it)
    # converting to unsigned integers for viewing
    image = batch[0].astype('uint8')
    # plotting raw pixel data
    pyplot.imshow(image)
# showing the figure
pyplot.show()

Now we can build an image classification model (https://medium.com/ai-in-plain-english/digit-mnist-image-classification-using-deep-neural-network-and-keras-ae1d4533db8a) using those augmented images.
