Original article was published on Artificial Intelligence on Medium
Transfer Learning 101: Part 1
Loading a State-of-the-Art Model and Evaluating It in TF 2.0
“Standing on the shoulders of giants”
Transfer learning is defined as:
Transfer learning and domain adaptation refer to the situation where what has been learned in one setting … is exploited to improve generalization in another setting— Page 526, Deep Learning, 2016.
To put it simply, it is the process of leveraging state-of-the-art models from the relevant field for the problem at hand. Let us go through the complete process of using this approach, and leave the theory to books and courses 😉
Step 1: Identifying the Problem
First we should know which type of problem we are dealing with. For this blog I will take an image classification problem where we need to classify a cell image as either “Uninfected” or “Parasitized” (infected). This dataset is part of the malaria datasets. Sample labeled images are below:
Thus our aim is to create a model to classify between the two classes, so our problem is image classification.
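Before modeling, the cell images need to be resized to a common shape, scaled, and batched. Here is a minimal input-pipeline sketch; the random tensors below are stand-ins for the real malaria images, and the label encoding (0 = Uninfected, 1 = Parasitized) is an assumption for illustration:

```python
import tensorflow as tf

# Synthetic stand-ins for the real cell images (8 RGB images, 128x128);
# in practice these would be loaded from the malaria dataset.
images = tf.random.uniform((8, 128, 128, 3), maxval=255.0)
labels = tf.constant([0, 1, 0, 1, 0, 1, 0, 1])  # assumed: 0=Uninfected, 1=Parasitized

def preprocess(image, label):
    image = tf.image.resize(image, (100, 100))  # match the model's input shape
    image = image / 255.0                       # scale pixels to [0, 1]
    return image, tf.one_hot(label, depth=2)    # one-hot target for two classes

ds = (tf.data.Dataset.from_tensor_slices((images, labels))
      .map(preprocess)
      .batch(4))

x, y = next(iter(ds))
print(x.shape, y.shape)  # (4, 100, 100, 3) (4, 2)
```

Each batch now has the 100 x 100 x 3 shape the model below expects.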
Step 2: Choosing the Best-Suited Model
One might think of building a CNN model from scratch, which can be feasible or even required in some cases. But here we will use a pre-trained model. We select ResNet50V2 as our base, as it seems to perform better out of the box compared to InceptionNet. The code to load it is:
# adding the Resnet50v2 model
Resnet = tf.keras.applications.ResNet50V2(include_top=False,
                                          weights='imagenet',
                                          input_shape=(100, 100, 3))
The above call downloads ResNet50V2. include_top=False means the classification layer is not included in the model; weights='imagenet' means the model comes with the weights obtained when it was trained on the ImageNet dataset; and input_shape should equal the shape of the input images we intend to provide, in our case 100 x 100 x 3 (width x height x channels).
The whole model that we create is given in the code below:
# adding the Resnet50v2 model
Resnet = tf.keras.applications.ResNet50V2(include_top=False,
                                          weights='imagenet',
                                          input_shape=(100, 100, 3))
Resnet.trainable = False
# print(Resnet.summary())
myResnet = tf.keras.models.Sequential(name='MyResnet')
myResnet.add(Resnet)
myResnet.add(tf.keras.layers.Flatten())
myResnet.add(tf.keras.layers.Dropout(0.5))
myResnet.add(tf.keras.layers.Dense(256, activation='relu'))
myResnet.add(tf.keras.layers.Dropout(0.5))
myResnet.add(tf.keras.layers.Dense(64, activation='relu'))
myResnet.add(tf.keras.layers.Dense(2, activation='softmax'))
print(myResnet.summary())
Here we add the new head layers to suit our task: the softmax layer at the end with 2 neurons classifies between the two classes. Since the classes are only two and mutually exclusive, a single neuron with sigmoid activation could also be used, and it takes comparatively less time to train. But we will go with the 2 softmax neurons. 😎
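For reference, the single-sigmoid alternative mentioned above would look like this. Note this sketch uses weights=None purely so it runs without downloading the ImageNet weights; in real use you would keep weights='imagenet' as in the article:

```python
import tensorflow as tf

# Random weights here only so the sketch runs offline;
# the article's model uses weights='imagenet'.
base = tf.keras.applications.ResNet50V2(include_top=False,
                                        weights=None,
                                        input_shape=(100, 100, 3))
base.trainable = False

binary_head = tf.keras.models.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # one neuron instead of two
])

# A sigmoid output pairs with binary_crossentropy (not
# categorical_crossentropy), and labels stay plain 0/1 values.
binary_head.compile(optimizer='adam', loss='binary_crossentropy',
                    metrics=['accuracy'])
print(binary_head.output_shape)  # (None, 1)
```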
Step 3: Training the Model
Even though the ResNet base is trained, the newly added head layers (yes, by convention the output part is the head of the network and the input is the tail 🤷♂️) are not, so we need to train them to suit our dataset. After training, the graph is shown below:
Yes, there is still scope for more epochs, but for now we will stop here, as we have other plans. See Part 2: Fine-Tuning.
Step 4: Evaluation
The evaluation results are below:
The accuracy of the MyResnet custom model is: 0.9361393323657474
The mean squared error of the MyResnet custom model is: 0.06386066763425254
The mean squared log error of the MyResnet custom model is: 0.03068205023570517
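Metrics like these can be computed with scikit-learn from the model's hard class predictions. Here is a sketch with made-up labels and probabilities; in the article these would come from the test set and myResnet.predict:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, mean_squared_error,
                             mean_squared_log_error)

# Made-up stand-ins: true labels and predicted "Parasitized"
# probabilities (in reality, test labels and myResnet.predict output).
y_true = np.array([0, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.9, 0.8, 0.6, 0.4, 0.2])
y_pred = (y_prob >= 0.5).astype(int)  # threshold to hard 0/1 predictions

print(accuracy_score(y_true, y_pred))          # fraction of correct labels
print(mean_squared_error(y_true, y_pred))      # MSE on the 0/1 labels
print(mean_squared_log_error(y_true, y_pred))  # MSLE on the 0/1 labels
```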
Now we will try to increase these numbers even further by fine-tuning the model to our needs. Stay tuned for Part 2.