Source: Deep Learning on Medium
Transfer Learning with ResNet
This is my first post on Deep Learning for Computer Vision (Convolutional Neural Networks).
I would like to share a bit of my understanding of what Transfer Learning is in the Deep Learning space.
In simple terms, Transfer learning is “Leveraging the knowledge of a neural network learned by training on one task to apply it for another task.”
There are different strategies to perform transfer learning depending upon the application.
For example, if the goal is to build an image-classification application where the target images are similar to those in established datasets such as ImageNet or CIFAR-10/100, it makes sense to start from a model pre-trained on those datasets instead of training from scratch.
Fine-tuning v/s Freezing neural-network layers:
It is a common strategy to Freeze the layers that are to be reused from the pre-trained model as-is, so that no weight updates happen on these layers.
This is done keeping in mind that the contributions from these layers are similar to those they would make if the new network were trained from the ground up. For example, the first few layers of any image classifier learn edges and simple patterns, so it makes sense to keep these frozen in most cases.
Unlike freezing, fine-tuning means allowing weight updates to happen. In other words, these layers are un-frozen and participate in the transfer-learning process, but with one difference: they are initialized with the weights of the pre-trained model rather than with a random initializer or zero weights.
This matters significantly: starting from a good set of weights speeds up convergence, which would not be the case if the network were trained from scratch.
In this post, I demonstrate my strategy for Transfer Learning using a pre-trained ResNet50 model from Keras on the CIFAR-100 dataset.
- Load the pre-trained ResNet50 model built into Keras as below.
(Though the input_shape can be anything, remember that ResNet50 was trained on the ImageNet dataset, which consists of 224×224 RGB images.)
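The loading step might look like the following (a minimal sketch using tf.keras; the post's exact code is not shown):

```python
from tensorflow.keras.applications import ResNet50

# Load ResNet50 pre-trained on ImageNet, dropping the final
# ImageNet-specific classification head (include_top=False).
# ImageNet training used 224x224 RGB images, hence this input_shape.
base_model = ResNet50(weights="imagenet",
                      include_top=False,
                      input_shape=(224, 224, 3))
```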
- Each neural net expects some sort of pre-processing, which typically involves normalizing the input dataset to zero mean and unit variance. The best thing is you don't have to worry about doing this manually, as Keras already provides this feature.
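For ResNet50, Keras ships the matching pre-processing as `preprocess_input`; a short sketch (the random batch here is just illustrative):

```python
import numpy as np
from tensorflow.keras.applications.resnet50 import preprocess_input

# preprocess_input applies the same channel-wise normalization that was
# used when ResNet50 was originally trained on ImageNet.
batch = np.random.randint(0, 256, size=(2, 224, 224, 3)).astype("float32")
batch = preprocess_input(batch)
```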
- Fine-tuning could be done using either the Sequential or the Functional way of defining a Keras model; I have chosen the Sequential one in this post. Since the CIFAR-100 dataset consists of 32×32 RGB images, up-sampling is done in order to match the ImageNet input size.
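A sketch of this wiring: a factor-7 up-sampling takes 32×32 CIFAR-100 images to the 224×224 size ResNet50 expects (the exact up-sampling factor used in the post is not shown, so 7 is an assumption that makes the sizes match):

```python
from tensorflow.keras import Sequential, layers
from tensorflow.keras.applications import ResNet50

base_model = ResNet50(weights="imagenet",
                      include_top=False,
                      input_shape=(224, 224, 3))

model = Sequential([
    layers.Input(shape=(32, 32, 3)),   # CIFAR-100 images
    layers.UpSampling2D(size=(7, 7)),  # 32x32 -> 224x224, matching ImageNet
    base_model,
])
```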
- The ResNet50 neural net has batch-normalization (BN) layers, and using the pre-trained model causes issues with BN layers if the target dataset the model is being trained on differs from the original training dataset. This is because frozen BN layers keep using the statistics of the original (ImageNet) training data, which do not match the statistics of the new data at inference time.
TL;DR: Keep the batch-normalization layers of the pre-trained model trainable, i.e., allow their weights to be updated during training.
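One way to sketch this (assuming the non-BN layers stay frozen, which the post implies but does not show explicitly):

```python
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import BatchNormalization

base_model = ResNet50(weights="imagenet",
                      include_top=False,
                      input_shape=(224, 224, 3))

# Freeze everything except the BatchNormalization layers, so the BN
# statistics can adapt to the new (CIFAR-100) data distribution.
for layer in base_model.layers:
    layer.trainable = isinstance(layer, BatchNormalization)
```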
- The final classification layer is replaced with Global Average Pooling (GAP), Dropout and Batch-Norm, followed by a Dense soft-max layer.
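The replacement head could look like this (the dropout rate of 0.5 is an assumption, as the post does not state it):

```python
from tensorflow.keras import Sequential, layers

# New classification head: GAP, then Dropout and Batch-Norm, then a
# Dense soft-max layer over the 100 CIFAR-100 classes.
head = Sequential([
    layers.Input(shape=(7, 7, 2048)),  # output shape of the ResNet50 base
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.5),               # rate is an assumption
    layers.BatchNormalization(),
    layers.Dense(100, activation="softmax"),
])
```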
- As a regularization technique, CutOut is used to improve validation accuracy.
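CutOut zeroes out a random square patch of each training image. A minimal NumPy sketch (the 8-pixel patch size is an assumption, not taken from the post):

```python
import numpy as np

def cutout(image, mask_size=8, rng=None):
    """Return a copy of `image` with a random square patch zeroed out.

    A sketch of CutOut regularization; mask_size is a hypothetical default.
    """
    rng = rng or np.random.default_rng()
    img = image.copy()
    h, w = img.shape[:2]
    # Pick a random patch center and clip the patch to the image bounds.
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y1, y2 = max(0, cy - mask_size // 2), min(h, cy + mask_size // 2)
    x1, x2 = max(0, cx - mask_size // 2), min(w, cx + mask_size // 2)
    img[y1:y2, x1:x2, :] = 0
    return img
```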
- Training parameters: Batch-size = 64, Number of Epochs = 15, Optimizer = ‘adam’
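The compile/fit call with these parameters might look as follows. To keep the sketch self-contained and fast, a tiny stand-in model and random data are used here, and only 1 epoch is run instead of the post's 15; the hyper-parameters are the ones from the post.

```python
import numpy as np
from tensorflow.keras import Sequential, layers

# Tiny stand-in model (the real one would be the ResNet50-based model).
model = Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.GlobalAveragePooling2D(),
    layers.Dense(100, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data; in the post this would be the CIFAR-100 arrays.
x_train = np.random.rand(64, 32, 32, 3).astype("float32")
y_train = np.random.randint(0, 100, size=(64,))

# Post's settings: batch_size=64, epochs=15 (1 here for a quick demo).
history = model.fit(x_train, y_train, batch_size=64, epochs=1, verbose=0)
```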
Results: Validation accuracy of 80.16% in 15 epochs.
Further improvements could be made with better regularization techniques, and the per-epoch time could be reduced significantly by using a larger batch size.