Part 4: Image Classification using Features Extracted by Transfer Learning in Keras

Source: Deep Learning on Medium

By Ahmed F. Gad, Alibaba Cloud Community Blog author

Welcome to a new part of the series, in which the Fruits360 dataset is classified in Keras, running in a Jupyter notebook, using features extracted by transfer learning from MobileNet, a pre-trained convolutional neural network (CNN).

In Part 3, the MobileNet model was downloaded and prepared for transferring its learning to the Fruits360 dataset. This was done by replacing its top layers with new layers that fit the number of classes in the new dataset, which is 102. By freezing all layers of MobileNet and allowing learning only for the newly added layers, the model was made able to work with the new dataset. Finally, the model was saved for use in this tutorial.

In this tutorial, which is Part 4 and the last part in the series, the saved model is loaded again for extracting features from the dataset. In Part 2, the dataset images were saved into NumPy arrays. In this tutorial, these arrays are fed to the model. The goal is not to predict the class label but to extract features from each image. The features extracted from the entire dataset are then used for training and testing an artificial neural network (ANN).

The sections covered in this tutorial are as follows:

  • Loading the Dataset NumPy Arrays.
  • Loading the Model.
  • Making Predictions.
  • Preparing the Model for Feature Extraction.
  • Extracting Features from the Dataset.
  • Building and Training an ANN using Extracted Features.

Let’s get started.

Loading the Dataset NumPy Arrays

In Part 2, the dataset images are read and saved into NumPy arrays. For the training data, 2 files were produced which are:

  1. train_dataset_array.npy: Training images.
  2. train_dataset_array_labels.npy: Training labels.

Regarding the test data, its 2 files are as follows:

  1. test_dataset_array.npy: Test images.
  2. test_dataset_array_labels.npy: Test labels.

In this tutorial, we are going to feed the NumPy arrays that hold the training and test images to the pre-trained model for feature extraction. Before doing this, we have to load the files listed above, provided they already exist on your machine.

If any of the 4 NumPy arrays listed above is missing, rerun its code to generate it again. For reference, here is the code from Part 2 that produces all 4 NumPy arrays and saves them to the notebook disk.

import numpy
import keras
import os
import tensorflow as tf

def images_to_array(dataset_dir, image_size):
    dataset_array = []
    dataset_labels = []

    class_counter = 0

    # Note: os.listdir() returns entries in arbitrary order, so the class index
    # assigned to each folder depends on the order returned on your machine.
    classes_names = os.listdir(dataset_dir)
    for current_class_name in classes_names:
        class_dir = os.path.join(dataset_dir, current_class_name)
        images_in_class = os.listdir(class_dir)

        print("Class index", class_counter, ", ", current_class_name, ":" , len(images_in_class))

        for image_file in images_in_class:
            if image_file.endswith(".jpg"):
                image_file_dir = os.path.join(class_dir, image_file)

                # Load the image resized to (image_size, image_size) and convert it to an array.
                img = keras.preprocessing.image.load_img(image_file_dir, target_size=(image_size, image_size))
                img_array = keras.preprocessing.image.img_to_array(img)

                # Rescale the pixel values to the range 0.0 to 1.0.
                img_array = img_array/255.0

                dataset_array.append(img_array)
                dataset_labels.append(class_counter)
        # All images in the same folder share the same class index.
        class_counter = class_counter + 1
    dataset_array = numpy.array(dataset_array)
    dataset_labels = numpy.array(dataset_labels)
    return dataset_array, dataset_labels

zip_file = tf.keras.utils.get_file(origin="https://storage.googleapis.com/kaggle-datasets/5857/414958/fruits-360_dataset.zip?GoogleAccessId=web-data@kaggle-161607.iam.gserviceaccount.com&Expires=1566314394&Signature=rqm4aS%2FIxgWrNSJ84ccMfQGpjJzZ7gZ9WmsKok6quQMyKin14vJyHBMSjqXSHR%2B6isrN2POYfwXqAgibIkeAy%2FcB2OIrxTsBtCmKUQKuzqdGMnXiLUiSKw0XgKUfvYOTC%2F0gdWMv%2B2xMLMjZQ3CYwUHNnPoDRPm9GopyyA6kZO%2B0UpwB59uwhADNiDNdVgD3GPMVleo4hPdOBVHpaWl%2F%2B%2BPDkOmQdlcH6b%2F983JHaktssmnCu8f0LVeQjzZY96d24O4H85x8wdZtmkHZCoFiIgCCMU%2BKMMBAbTL66QiUUB%2FW%2FpULPlpzN9sBBUR2yydB3CUwqLmSjAcwz3wQ%2FpIhzg%3D%3D",
fname="fruits-360.zip", extract=True)
base_dir, _ = os.path.splitext(zip_file)

train_dir = "/root/.keras/datasets/fruits-360/Training"
image_size = 128
train_dataset_array, train_dataset_array_labels = images_to_array(dataset_dir=train_dir, image_size=image_size)
print("Training Data Array Shape :", train_dataset_array.shape)
numpy.save("train_dataset_array.npy", train_dataset_array)
numpy.save("train_dataset_array_labels.npy", train_dataset_array_labels)

test_dir = "/root/.keras/datasets/fruits-360/Test"
test_dataset_array, test_dataset_array_labels = images_to_array(dataset_dir=test_dir, image_size=image_size)
print("Test Data Array Shape :", test_dataset_array.shape)
numpy.save("test_dataset_array.npy", test_dataset_array)
numpy.save("test_dataset_array_labels.npy", test_dataset_array_labels)

After making sure the data exists, we can start loading the model for extracting features from the dataset, as discussed in the next section.
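One way to check for the 4 files before going further is sketched below. The function name find_missing_files and the REQUIRED_FILES list are hypothetical helpers introduced for this illustration; only the file names come from Part 2. The check looks in the current working directory, which is an assumption about where the notebook saved the arrays.

```python
import os

# The four files produced in Part 2.
REQUIRED_FILES = [
    "train_dataset_array.npy",
    "train_dataset_array_labels.npy",
    "test_dataset_array.npy",
    "test_dataset_array_labels.npy",
]

def find_missing_files(directory, filenames=REQUIRED_FILES):
    """Return the names in `filenames` that do not exist in `directory`."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(directory, name))]

# Assumption: the arrays were saved to the current working directory.
missing = find_missing_files(".")
if missing:
    print("Rerun the Part 2 code to regenerate:", missing)
else:
    print("All 4 NumPy array files are available.")
```

If the list of missing files is not empty, only the corresponding part of the Part 2 code needs to be rerun.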

Loading the Model

In Part 3, the model was saved in a file named MobileNet_TransferLearning_Fruits360v48.h5. In this section, we are going to load this model according to the code listed below. The model is loaded using the load_model() function.

from tensorflow.keras.models import load_model

base_model = load_model('MobileNet_TransferLearning_Fruits360v48.h5')
base_model.summary()

To make sure the model is loaded successfully, we can call the summary() method. Its output is listed below. It successfully printed the model information discussed in Part 3 and this proves the model is loaded successfully.

_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
mobilenet_1.00_128 (Model) (None, 4, 4, 1024) 3228864
_________________________________________________________________
global_average_pooling2d_20 (None, 1024) 0
_________________________________________________________________
dense_18 (Dense) (None, 102) 104550
=================================================================
Total params: 3,333,414
Trainable params: 104,550
Non-trainable params: 3,228,864

After preparing the images in NumPy arrays and loading the model, you might think that the next step is to feed the NumPy arrays into the model to extract the features. This is not right. At the current state, we can just make predictions as discussed in the next section.
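As a quick check on the summary printed above, the parameter counts can be reproduced by hand. The short sketch below uses only the numbers shown in the summary; the variable names are chosen for this illustration.

```python
# Values taken from the model summary.
feature_length = 1024       # output size of global_average_pooling2d_20
num_classes = 102           # neurons in dense_18
mobilenet_params = 3228864  # non-trainable (frozen) MobileNet parameters

# A Dense layer has one weight per (input, neuron) pair plus one bias per neuron.
dense_params = feature_length * num_classes + num_classes
total_params = mobilenet_params + dense_params

print("Dense (trainable) params:", dense_params)
print("Total params:", total_params)
```

The two printed values match the "Trainable params" and "Total params" lines of the summary, which confirms that only the newly added FC layer is trainable.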

Making Predictions

When a new sample is fed to a pre-trained model, the model returns the output of its last layer. According to the model architecture summarized previously, the last layer is the fully connected (FC) layer named dense_18. Note that the name of this layer is assigned dynamically and thus it might be different for you.

This layer has 102 neurons which represent the 102 classes in the dataset. For each sample fed to the model, a vector of 102 values is returned holding the class scores for that sample. The sample is classified according to the class with the highest score. In the code below, the first 2 samples from the training data are fed to the model using the predict() method.

The predict() method accepts a NumPy array holding one or more samples. In our project, the NumPy array must be 4D. The first dimension represents the number of samples and the 3 remaining dimensions represent the shape of the image. In this code, the first 2 samples are fed to the predict() method and thus the shape of the input to this method is (2, 128, 128, 3).

This method returns a NumPy array of the class scores for all samples. Because the last layer in the model, the FC layer named dense_18, has 102 neurons, a vector of length 102 is returned for each sample. Because there are 2 samples, the shape of the returned NumPy array is (2, 102).
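The shape rules described above can be checked with NumPy alone, before involving the model. The sketch below uses an array of zeros as a stand-in for the real dataset (an assumption made to keep the example small) and shows why slicing gives a valid 4D batch while indexing a single image does not.

```python
import numpy

# Stand-in for the real dataset: 5 blank "images" of size 128x128 with 3 channels.
images = numpy.zeros((5, 128, 128, 3), dtype=numpy.float32)

# Slicing keeps the sample dimension, so this is a valid batch of 2 samples.
first_two = images[0:2, :]
print(first_two.shape)

# Indexing drops the sample dimension, leaving a 3D array.
single_image = images[0]
print(single_image.shape)

# Add the sample dimension back to get a batch that holds 1 sample.
single_batch = numpy.expand_dims(single_image, axis=0)
print(single_batch.shape)
```

To predict a single image, it is therefore safer to pass a slice such as images[0:1] or to use numpy.expand_dims() as shown.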

from tensorflow.keras.models import load_model
import numpy

base_model = load_model('MobileNet_TransferLearning_Fruits360v48.h5')
base_model.summary()

train_dataset_array = numpy.load("train_dataset_array.npy")
train_dataset_array_labels = numpy.load("train_dataset_array_labels.npy")

prediction = base_model.predict(train_dataset_array[0:2, :])

class_labels = ['Banana', 'Melon Piel de Sapo', 'Cherry 2', 'Raspberry', 'Pitahaya Red', 'Apple Granny Smith', 'Pomelo Sweetie', 'Quince', 'Apple Red Yellow 1', 'Plum', 'Grapefruit White', 'Grape Pink', 'Tomato 4', 'Apple Crimson Snow', 'Apricot', 'Limes', 'Cherry Wax Black', 'Pepper Red', 'Lemon', 'Mulberry', 'Carambula', 'Apple Red 2', 'Tamarillo', 'Pear Williams', 'Tomato Cherry Red', 'Walnut', 'Apple Red Yellow 2', 'Chestnut', 'Strawberry Wedge', 'Cactus fruit', 'Pineapple Mini', 'Plum 2', 'Passion Fruit', 'Cherry Wax Yellow', 'Cherry 1', 'Apple Red 3', 'Kiwi', 'Guava', 'Rambutan', 'Cocos', 'Apple Golden 3', 'Hazelnut', 'Orange', 'Lemon Meyer', 'Kumquats', 'Banana Red', 'Apple Braeburn', 'Papaya', 'Lychee', 'Kaki', 'Plum 3', 'Tomato 3', 'Cantaloupe 2', 'Peach', 'Mango', 'Pomegranate', 'Grape White 2', 'Granadilla', 'Redcurrant', 'Strawberry', 'Pear Kaiser', 'Apple Red Delicious', 'Apple Pink Lady', 'Cantaloupe 1', 'Salak', 'Maracuja', 'Pear', 'Grape Blue', 'Clementine', 'Apple Red 1', 'Huckleberry', 'Cherry Wax Red', 'Avocado', 'Banana Lady Finger', 'Pineapple', 'Nectarine', 'Cherry Rainier', 'Tomato Maroon', 'Pear Monster', 'Peach Flat', 'Grape White 3', 'Tomato 1', 'Physalis with Husk', 'Avocado ripe', 'Physalis', 'Tomato 2', 'Grapefruit Pink', 'Tangelo', 'Pepper Green', 'Kohlrabi', 'Apple Golden 1', 'Pear Red', 'Mandarine', 'Pepper Yellow', 'Mangostan', 'Pepino', 'Apple Golden 2', 'Peach 2', 'Grape White', 'Grape White 4', 'Pear Abate', 'Dates']

predicted_class_ID = numpy.where(prediction[0] == numpy.max(prediction[0]))[0][0]
predicted_class_name = class_labels[predicted_class_ID]
print("Predicted Class ID", predicted_class_ID)
print("Predicted Class Name", predicted_class_name)

correct_class_ID = train_dataset_array_labels[0]
correct_class_name = class_labels[correct_class_ID]
print("Correct Class ID", correct_class_ID)
print("Correct Class Name", correct_class_name)

After making predictions, a list named class_labels is prepared with the names of all 102 classes. It is used to print the predicted class name.

In the vector of length 102 returned for each sample, the highest score marks the class to which the sample is assigned. The index of the class with the highest score for the first sample in the training data is stored in the predicted_class_ID variable. After that, the predicted class name is found by indexing the class_labels list with that variable, and it is stored in the predicted_class_name variable. After printing the predicted class ID and name, the correct class ID and name of the first training sample are found and stored in the correct_class_ID and correct_class_name variables, respectively.
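The same lookup can be done for all samples at once with numpy.argmax(). The sketch below uses made-up scores for 2 samples over only 4 classes, purely as an illustration; the real model returns 102 scores per sample and the label list here is hypothetical.

```python
import numpy

# Hypothetical scores for 2 samples over 4 classes (the real model returns 102).
class_labels = ["Banana", "Kiwi", "Lemon", "Mango"]
prediction = numpy.array([[0.10, 0.70, 0.15, 0.05],
                          [0.05, 0.05, 0.10, 0.80]])

# Same approach as the tutorial code, for the first sample only.
id_where = numpy.where(prediction[0] == numpy.max(prediction[0]))[0][0]

# argmax along axis 1 gives the index of the highest score for every sample.
predicted_ids = numpy.argmax(prediction, axis=1)
predicted_names = [class_labels[i] for i in predicted_ids]

print(id_where, predicted_ids, predicted_names)
```

Both approaches agree for the first sample; argmax is simply shorter and handles the whole batch in one call.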

Note that the goal of this series is not using the model for making predictions but for feature extraction. Preparing the NumPy arrays of the images and loading the model is not enough and there is an intermediate step that must be fulfilled. This step is preparing the model for feature extraction which will be discussed in the next section.

Preparing the Model for Feature Extraction

As discussed in the previous section, when a sample is fed to the model using the predict() method, the output from the last layer in the architecture is what is returned by default. The last layer in our case is the FC layer named dense_18. This layer returns a vector of length 102 representing the class scores. But we are interested in returning the features, not the class scores. What can we do? It is very simple.

If the default behavior of the predict() method is to return the outputs from the last layer, which in this case is the dense_18 layer, why not change this default behavior? Looking at the network architecture given again below, the layer that precedes the FC layer dense_18 is the pooling layer named global_average_pooling2d_20. The features are returned simply by making the predict() method return the output of the global_average_pooling2d_20 layer rather than the dense_18 layer. How do we change the layer from which predict() returns its outputs? Let’s discuss this.

_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
mobilenet_1.00_128 (Model) (None, 4, 4, 1024) 3228864
_________________________________________________________________
global_average_pooling2d_20 (None, 1024) 0
_________________________________________________________________
dense_18 (Dense) (None, 102) 104550
=================================================================

Keras offers a class named Model. The constructor of this class accepts 2 arguments which are an input tensor and an output tensor. The Model class traces the layers that connect the input tensor to the output tensor, so it knows how to calculate the output from the input.

In our case, we pass our model inputs and the outputs from the desired layer to the Model class constructor, as given in the next line. The inputs argument is assigned the model input, which is returned using the input property. Because the layer whose outputs we want is the pooling layer named global_average_pooling2d_20, the outputs from this layer are assigned to the outputs argument. This layer is returned using the get_layer() method and its output is returned using the output property. If you want to build a model that returns the output of another layer when the predict() method is called, then specify its name in the get_layer() method.

model2 = tensorflow.keras.models.Model(inputs=base_model.input, outputs=base_model.get_layer('global_average_pooling2d_20').output)

The Model class constructor returns a new model which is saved in the model2 variable. When the predict() method of this model is called, the outputs from the pooling layer are returned. The outputs from this layer represent the features extracted from the image. Because this layer outputs 1,024 values, one per channel of the (None, 4, 4, 1024) output of MobileNet, the feature vector length is 1,024. Note that a pooling layer has no neurons of its own, so this length is fixed by the number of channels in the preceding layer; getting a feature vector of a different length would require a different architecture.
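To see where the 1,024 values come from, global average pooling can be reproduced with NumPy. The sketch below uses random numbers as a stand-in for the (4, 4, 1024) output of MobileNet (an assumption, since no model is loaded here) and averages over the two spatial axes.

```python
import numpy

# Stand-in for MobileNet's output for 2 samples: shape (2, 4, 4, 1024).
rng = numpy.random.default_rng(0)
feature_maps = rng.random((2, 4, 4, 1024))

# Global average pooling: average over the two spatial axes (height and width),
# which leaves one value per channel.
features = feature_maps.mean(axis=(1, 2))

print(features.shape)
```

Each of the 1,024 features is therefore the average of the 16 values in one 4x4 feature map.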

In order to return the features from the samples, just call the predict() method from model2 rather than from base_model as given in the code below. Now, the predict() method returns a feature vector of length 1,024 for each sample. Because the first 2 samples are passed to the predict() method, the shape of the returned NumPy array features is (2, 1024).

import tensorflow.keras
import numpy

base_model = tensorflow.keras.models.load_model('MobileNet_TransferLearning_Fruits360v48.h5')
base_model.summary()

model2 = tensorflow.keras.models.Model(inputs=base_model.input, outputs=base_model.get_layer('global_average_pooling2d_20').output)

train_dataset_array = numpy.load("train_dataset_array.npy")
train_dataset_array_labels = numpy.load("train_dataset_array_labels.npy")

features = model2.predict(train_dataset_array[0:2, :])

After preparing the model to return the features from the input samples rather than the class scores, the next section extracts the features from the entire dataset.
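Because layer names such as global_average_pooling2d_20 are assigned dynamically, a hard-coded name may not match your model. One possible workaround, sketched below, is to search the layer names by prefix. The helper find_layer_name is hypothetical and written for this illustration; with a loaded model, the list of names would come from the layers property of the model, as noted in the comment.

```python
def find_layer_name(layer_names, prefix="global_average_pooling2d"):
    """Return the last name in `layer_names` that starts with `prefix`."""
    matches = [name for name in layer_names if name.startswith(prefix)]
    if not matches:
        raise ValueError("No layer name starts with " + repr(prefix))
    return matches[-1]

# With a loaded model, the names would come from:
#   layer_names = [layer.name for layer in base_model.layers]
# Here the names from the summary are typed in by hand.
layer_names = ["mobilenet_1.00_128", "global_average_pooling2d_20", "dense_18"]
pooling_layer_name = find_layer_name(layer_names)
print(pooling_layer_name)
```

The returned name can then be passed to the get_layer() method in place of the hard-coded string.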

Extracting Features from the Dataset

The dataset images are split into 2 NumPy arrays which are train_dataset_array for the training images and test_dataset_array for the test images. We can start by extracting the features of the training images.

In order to extract the features from the entire training data, just pass the train_dataset_array NumPy array to the predict() method as given in the next code. The extracted features from the training images are saved in the train_features NumPy array. The shape of this array is (52718, 1024) because there are 52,718 training images and a feature vector of length 1,024 is extracted from each image. Finally, this NumPy array is saved in a file named MobileNet_TransferLearning_Fruits360v48_Train_Features.npy.

import tensorflow.keras
import numpy

base_model = tensorflow.keras.models.load_model('MobileNet_TransferLearning_Fruits360v48.h5')
base_model.summary()

model2 = tensorflow.keras.models.Model(inputs=base_model.input, outputs=base_model.get_layer('global_average_pooling2d_20').output)

train_dataset_array = numpy.load("train_dataset_array.npy")

train_features = model2.predict(train_dataset_array)
numpy.save('MobileNet_TransferLearning_Fruits360v48_Train_Features.npy', train_features)
print(train_features.shape)

After the above code completes successfully, the .npy file will be saved.

After extracting the features from the training images, next is to repeat the same work but on testing images according to the code below. At first, the test_dataset_array.npy file is loaded which returns a NumPy array with all test images in the test_dataset_array variable. This NumPy array is then fed to the predict() method called using model2 to return the features which are stored in the NumPy array test_features. Finally, this NumPy array is saved in a file named MobileNet_TransferLearning_Fruits360v48_Test_Features.npy.

import tensorflow.keras
import numpy

base_model = tensorflow.keras.models.load_model('MobileNet_TransferLearning_Fruits360v48.h5')
base_model.summary()

model2 = tensorflow.keras.models.Model(inputs=base_model.input, outputs=base_model.get_layer('global_average_pooling2d_20').output)

test_dataset_array = numpy.load("test_dataset_array.npy")

test_features = model2.predict(test_dataset_array)
numpy.save('MobileNet_TransferLearning_Fruits360v48_Test_Features.npy', test_features)
print(test_features.shape)

After this code runs successfully, the test features file will be saved. The next section creates an ANN that will be trained and tested according to such features.
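Before training on the saved features, it can help to confirm that there is exactly one feature vector per label. The sketch below is a minimal check under that assumption; check_features is a hypothetical helper, and small stand-in arrays are used instead of the real .npy files.

```python
import numpy

def check_features(features, labels, feature_length=1024):
    """Return True if there is one feature vector of the expected length per label."""
    return (features.ndim == 2
            and features.shape[1] == feature_length
            and features.shape[0] == labels.shape[0])

# Stand-in arrays; in practice, load the saved .npy files with numpy.load().
demo_features = numpy.zeros((6, 1024))
demo_labels = numpy.array([0, 0, 1, 1, 2, 2])
print(check_features(demo_features, demo_labels))
```

A result of False would point to a mismatch between the features file and the labels file, for example when one of them was regenerated and the other was not.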

Building and Training an ANN using Extracted Features

Using the extracted features, an ANN is created using scikit-learn as listed below. At first, the features NumPy arrays are loaded. Then the classifier is created using the MLPClassifier class constructor and the instance is returned in the classifier variable. The constructor accepts many arguments but just 3 of them are used which are:

  1. solver: Adam solver or optimizer is used.
  2. hidden_layer_sizes: It accepts a tuple specifying the number of hidden neurons in the hidden layers. I just built a network with 2 hidden layers where the first layer has 500 neurons and the second layer has 150 neurons.
  3. max_iter: Maximum number of iterations after which the network will return. It is set to 20,000 iterations in the code below.

import numpy
from sklearn.neural_network import MLPClassifier
import joblib

train_features = numpy.load('MobileNet_TransferLearning_Fruits360v48_Train_Features.npy')
train_dataset_array_labels = numpy.load("train_dataset_array_labels.npy")
print("Train Features Shape", train_features.shape)

classifier = MLPClassifier(solver='adam', hidden_layer_sizes=(500, 150), max_iter=20000)

classifier.fit(train_features, train_dataset_array_labels)

joblib.dump(classifier, 'MobileNet_TransferLearning_Fruits360v48_ANN_500_150.joblib')

train_predictions = classifier.predict(train_features)

train_correct_predictions = numpy.array(numpy.where(train_predictions == train_dataset_array_labels))
train_accuracy = numpy.round((train_correct_predictions.shape[1]/train_features.shape[0])*100, 2)
print("Train Accuracy : ", train_accuracy)

The ANN is trained by passing both the NumPy array of the features train_features and the class labels NumPy array train_dataset_array_labels to the fit() method.

After the model is trained, it is saved using the joblib library for later use.

In order to make sure the network is able to make correct predictions for the training data, the network is made to predict the class labels for all training samples using the predict() method. Remember that a machine learning model that is not able to make correct predictions for the training data, which it has already seen, is unlikely to do better on the test data, which is unseen. Based on the predictions saved in the train_predictions variable, the training classification accuracy is calculated, and it exceeds 97.8%.
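The accuracy calculation can be illustrated on a handful of made-up predictions. The sketch below repeats the numpy.where() approach used in the tutorial code and compares it with an equivalent shortcut based on numpy.mean(); the 5 predictions and labels are hypothetical.

```python
import numpy

# Hypothetical predictions and labels for 5 samples.
predictions = numpy.array([0, 1, 2, 2, 1])
labels = numpy.array([0, 1, 1, 2, 1])

# Approach used in the tutorial: count the indices where both arrays agree.
correct = numpy.array(numpy.where(predictions == labels))
accuracy_where = numpy.round((correct.shape[1] / labels.shape[0]) * 100, 2)

# Equivalent shortcut: the mean of a boolean array is the fraction of True values.
accuracy_mean = numpy.round(numpy.mean(predictions == labels) * 100, 2)

print(accuracy_where, accuracy_mean)
```

Both values are the same because 4 of the 5 hypothetical predictions match their labels.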

After training and saving the ANN completes, next is to test it using the test data according to the code below. You need to load the test data features file named MobileNet_TransferLearning_Fruits360v48_Test_Features.npy. After that, the test data labels are loaded from the test_dataset_array_labels.npy file. In addition to loading the data, the trained ANN saved previously using joblib is loaded again. After that, the classifier predicts the class labels of the test data samples using the predict() method. Finally, the test accuracy is calculated and printed which is about 60%.

import numpy
from sklearn.neural_network import MLPClassifier
import joblib
## Testing ##
test_features = numpy.load('MobileNet_TransferLearning_Fruits360v48_Test_Features.npy')
test_dataset_array_labels = numpy.load("test_dataset_array_labels.npy")
print("Test Features Shape", test_features.shape)
classifier = joblib.load('MobileNet_TransferLearning_Fruits360v48_ANN_500_150.joblib')
test_predictions = classifier.predict(test_features)
test_correct_predictions = numpy.array(numpy.where(test_predictions == test_dataset_array_labels))
test_accuracy = numpy.round((test_correct_predictions.shape[1]/test_features.shape[0])*100, 2)
print("Test Accuracy : ", test_accuracy)

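There are a number of possible reasons why the test accuracy is not that high. The large gap between the training accuracy (above 97.8%) and the test accuracy (about 60%) suggests that the ANN fits the training features much better than it generalizes to unseen ones. One likely reason is that the extracted features are used as they are, without any feature engineering. The raw features extracted from MobileNet after transfer learning may include some bad features. By removing such bad features, the accuracy might increase.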
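The tutorial does not say which features count as bad or how to remove them. One possible reading, sketched below as an assumption rather than the author's method, is to drop features that barely vary across the training samples. The helper select_by_variance and its threshold value are hypothetical, and there is no guarantee that this step improves the accuracy.

```python
import numpy

def select_by_variance(train_features, test_features, threshold=1e-6):
    """Keep only the columns whose variance on the training data exceeds `threshold`."""
    keep = train_features.var(axis=0) > threshold
    # The mask is computed on the training data only and reused for the test data.
    return train_features[:, keep], test_features[:, keep], keep

# Stand-in data: column 1 is constant in the training set, so it carries no information.
demo_train = numpy.array([[0.1, 5.0, 0.3],
                          [0.9, 5.0, 0.7],
                          [0.4, 5.0, 0.2]])
demo_test = numpy.array([[0.5, 5.0, 0.6]])

train_kept, test_kept, keep_mask = select_by_variance(demo_train, demo_test)
print(keep_mask, train_kept.shape, test_kept.shape)
```

The same mask must be applied to both the training and test features so that the ANN sees the same columns in both.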

After both training and testing the network, the complete code for building, training, saving, loading, and testing the ANN is listed below.

import numpy
from sklearn.neural_network import MLPClassifier
import joblib

train_features = numpy.load('MobileNet_TransferLearning_Fruits360v48_Train_Features.npy')
train_dataset_array_labels = numpy.load("train_dataset_array_labels.npy")
print("Train Features Shape", train_features.shape)

classifier = MLPClassifier(solver='adam', hidden_layer_sizes=(500, 150), max_iter=20000)

classifier.fit(train_features, train_dataset_array_labels)

joblib.dump(classifier, 'MobileNet_TransferLearning_Fruits360v48_ANN_500_150.joblib')

train_predictions = classifier.predict(train_features)

train_correct_predictions = numpy.array(numpy.where(train_predictions == train_dataset_array_labels))
train_accuracy = numpy.round((train_correct_predictions.shape[1]/train_features.shape[0])*100, 2)
print("Train Accuracy : ", train_accuracy)

## Testing ##
test_features = numpy.load('MobileNet_TransferLearning_Fruits360v48_Test_Features.npy')
test_dataset_array_labels = numpy.load("test_dataset_array_labels.npy")
print("Test Features Shape", test_features.shape)

classifier = joblib.load('MobileNet_TransferLearning_Fruits360v48_ANN_500_150.joblib')
test_predictions = classifier.predict(test_features)

test_correct_predictions = numpy.array(numpy.where(test_predictions == test_dataset_array_labels))
test_accuracy = numpy.round((test_correct_predictions.shape[1]/test_features.shape[0])*100, 2)
print("Test Accuracy : ", test_accuracy)

Conclusion

This tutorial is the fourth and last part in the series on building an image classifier using features extracted by transfer learning of MobileNet using Keras. Here is a quick summary of all 4 tutorials in the series.

  • Part 1: the machine learning pipeline is discussed to highlight the disadvantages of manual feature extraction. After that, transfer learning is introduced by defining it, highlighting its advantages, and explaining when it is useful.
  • Part 2: the Fruits360 dataset is downloaded and its images are read into NumPy arrays, to be fed later to MobileNet for feature extraction after transfer learning.
  • Part 3: the MobileNet model is downloaded, its learning is transferred to the new dataset, and the model is saved.
  • Part 4 [this tutorial]: the new model is loaded and prepared for extracting features rather than predicting class labels. After that, the features from the entire dataset are extracted. Based on the extracted features, an artificial neural network is created using scikit-learn, trained, and tested.

Together, the tutorials in this series cover the complete steps required to do transfer learning.