How to build a robust fruit classifier using Deep Learning

Today, we will build an image classifier for the Fruit-360 dataset using a deep learning library. The workflow has seven steps:


  1. Load the data and the pretrained model.
  2. Use lr_find() to find a suitable learning rate and train the last layer with precompute = True for 1–2 epochs.
  3. Train the last layer with data augmentation for several epochs.
  4. Unfreeze all layers. Set lower learning rates for the earlier layer blocks and perform a full training of the model.
  5. Use lr_find() again. Train the full network for several more epochs.
  6. Use test time augmentation to improve the predictions.
  7. Understand the classifier using a confusion matrix.

Step 1: Load the data and the pretrained model.

First, we plot the class distribution of the data:

Train samples distribution (left) and Validation samples distribution (right)
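
As a rough illustration of how such a distribution can be computed, here is a minimal sketch that counts image files per class. It assumes one sub-folder per class inside the training and validation folders; the function names `class_distribution` and `text_histogram` are hypothetical helpers, not part of any library.

```python
from collections import Counter
from pathlib import Path


def class_distribution(root):
    """Count files per class, assuming one sub-folder per class under `root`."""
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            # Every file inside the class folder counts as one sample.
            counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts


def text_histogram(counts, width=40):
    """Render the counts as a simple text bar chart, largest class first."""
    largest = max(counts.values(), default=0) or 1
    lines = []
    for name, n in counts.most_common():
        bar = "#" * max(1, round(width * n / largest))
        lines.append(f"{name:<20} {n:>5} {bar}")
    return "\n".join(lines)
```

Calling `print(text_histogram(class_distribution(path)))` once for the training folder and once for the validation folder gives a quick check that no class is badly under-represented.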

Some samples from the data:

Apple Braeburn

Because a fruit can appear in any orientation, we randomly rotate the images, side to side and upside down, to create good augmented data.
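
To make the idea concrete, here is a small sketch of this kind of augmentation on a toy image stored as a list of rows. It uses quarter turns and flips as a stand-in; the library's own transforms are more general, and all function names here are hypothetical.

```python
import random


def rotate90(img):
    """Rotate a 2-D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_lr(img):
    """Mirror the image side to side."""
    return [row[::-1] for row in img]


def flip_ud(img):
    """Mirror the image upside down."""
    return [list(row) for row in img[::-1]]


def random_augment(img, rng=random):
    """Apply a random number of quarter turns, then random flips."""
    out = [list(row) for row in img]
    for _ in range(rng.randrange(4)):
        out = rotate90(out)
    if rng.random() < 0.5:
        out = flip_lr(out)
    if rng.random() < 0.5:
        out = flip_ud(out)
    return out
```

Each call to `random_augment` returns a new view of the same fruit with the same pixels in a different orientation, which is why the label stays valid.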

Set up data and model:

So what is precompute = True?
When precompute is on, the library computes the activations of the frozen layers once, up to the penultimate layer, and saves them. Then, when we train the last layer, we only have to feed these saved activations into it; we do not have to run the forward and backward passes through every layer of the network each time. That saves a lot of time!

Augmentation will not work if precompute = True.
This is because the precomputed activations were calculated for specific inputs and are only valid for exactly those inputs. Augmentation produces a new random variant of each input every time, so it is effectively disabled when precompute = True.
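
The following toy sketch shows the caching idea and why augmentation has no effect on it. `FrozenBackbone` is a hypothetical stand-in for the frozen pretrained layers and the "head" is a tiny linear model; none of this is the library's actual implementation.

```python
class FrozenBackbone:
    """Stand-in for the frozen pretrained layers; counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, image):
        self.calls += 1
        # Hypothetical "activation": two cheap summary features of the input.
        return (sum(image) / len(image), max(image) - min(image))


def precompute_activations(backbone, images):
    """Run the backbone once per image and cache the result by image index."""
    return {i: backbone(img) for i, img in enumerate(images)}


def train_head(cache, labels, epochs=5, lr=0.01):
    """Train a tiny linear head using only the cached activations."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for i, feats in cache.items():
            pred = sum(wj * fj for wj, fj in zip(w, feats))
            err = pred - labels[i]
            w = [wj - lr * err * fj for wj, fj in zip(w, feats)]
    return w
```

However many epochs `train_head` runs, the backbone is called only once per image. The cache is keyed by the original image, so a randomly augmented copy would never reach the head unless the backbone were run again.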

Step 2: Find the learning rate and train the last layer.

We should choose a learning rate somewhat smaller than the one at which lr_find() shows the lowest loss. Here we choose a learning rate of 0.004 and train the last layer for the first time with precomputed activations.
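
For intuition, a learning rate range test can be sketched on a one-parameter regression problem: the rate grows exponentially at every step, the loss is recorded, and the sweep stops once the loss blows up. This is only an illustration of the idea behind lr_find(), not its implementation; the stopping threshold and the `safety` factor are assumptions.

```python
def lr_range_test(xs, ys, lr_start=1e-5, lr_end=10.0, steps=100):
    """Increase the learning rate exponentially and record the loss at each step."""
    factor = (lr_end / lr_start) ** (1.0 / (steps - 1))
    w, lr = 0.0, lr_start
    history = []  # (learning rate, loss) pairs
    best = float("inf")
    for _ in range(steps):
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        history.append((lr, loss))
        best = min(best, loss)
        if loss > 4 * best:  # the loss is blowing up, stop the sweep
            break
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
        lr *= factor
    return history


def suggest_lr(history, safety=10.0):
    """Pick a rate `safety` times smaller than the one with the lowest loss."""
    lr_at_min, _ = min(history, key=lambda pair: pair[1])
    return lr_at_min / safety
```

Backing off from the minimum matters because the lowest recorded loss sits right before the point where training becomes unstable.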

epoch  trn_loss  val_loss  accuracy
0      0.031786  0.061239  0.986524

Step 3: Turn off precompute and train the model with augmented data

First, we check the augmented images to make sure they make sense.

Augmented data (random rotation)

Set precompute = False and train the last layer for 3 epochs:

epoch  trn_loss  val_loss  accuracy
0      0.051406  0.041234  0.987822
1      0.033894  0.035761  0.987822
2      0.027753  0.036705  0.987389

Step 4: Unfreeze all layers and perform a full training of the model.

We divide the model into 3 blocks with 3 different learning rates. We set lower learning rates for the earlier blocks because we do not want to destroy the weights pretrained on ImageNet (a dataset with a very large number of images). We just want to adjust them slightly so they fit our data better.
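
A minimal sketch of what per-block learning rates mean in a single gradient step is shown below. The three rates, each 10x apart, and the toy weights are hypothetical values chosen for illustration, not the ones used in the notebook.

```python
def sgd_step(layer_groups, grads, lrs):
    """One SGD update where each layer group uses its own learning rate."""
    if not (len(layer_groups) == len(grads) == len(lrs)):
        raise ValueError("need one gradient list and one learning rate per group")
    return [
        [w - lr * g for w, g in zip(weights, group_grads)]
        for weights, group_grads, lr in zip(layer_groups, grads, lrs)
    ]


# Hypothetical values: three blocks, earliest first, each rate 10x the previous.
lrs = [1e-4, 1e-3, 1e-2]
groups = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # early, middle, last block
grads = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]   # same gradient everywhere
updated = sgd_step(groups, grads, lrs)
```

With identical gradients, the earliest block moves 100 times less than the last one, so the general features learned during pretraining are only nudged while the later, more task-specific layers adapt faster.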

epoch  trn_loss  val_loss  accuracy
0      0.033246  0.023907  0.989623
1      0.034452  0.02385   0.9902
2      0.021151  0.024252  0.989335
3      0.015466  0.02015   0.990632
4      0.023543  0.02202   0.989407
5      0.012539  0.02229   0.989263

Step 5: Find the learning rate again and train for several more epochs.

epoch  trn_loss  val_loss  accuracy
0      0.014929  0.023934  0.988975
1      0.037037  0.022498  0.989911
2      0.015485  0.026213  0.990055
3      0.012607  0.023107  0.989263
4      0.014334  0.022035  0.990055
5      0.009438  0.020433  0.9902

Step 6: Use test time augmentation (TTA) to improve the prediction.

TTA generates augmented versions of each test image. By averaging the output scores over these versions, we get a slightly better result.
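
The averaging step can be sketched as follows. Here `model` is any callable that returns one score per class and `augmentations` is a list of functions that each return a transformed copy of the image; both are placeholders for illustration, not the library's TTA API.

```python
def predict_with_tta(model, image, augmentations):
    """Average class scores over the original image and its augmented copies."""
    views = [image] + [aug(image) for aug in augmentations]
    scores = [model(view) for view in views]
    n_classes = len(scores[0])
    return [sum(s[c] for s in scores) / len(scores) for c in range(n_classes)]


def argmax(values):
    """Index of the largest score, i.e. the predicted class."""
    return max(range(len(values)), key=values.__getitem__)
```

The predicted class is then `argmax(predict_with_tta(model, image, augmentations))`. Averaging helps when a single view happens to be misleading, because the other views can outvote it.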

acc = 0.9904878576061108

Step 7: Understand the classifier using a confusion matrix

Confusion matrix

From the confusion matrix, we can see that some pairs of fruit are often confused with each other, for example Cherry 1 vs Cherry 2 and Pepino vs Grape White.

Cherry 1 vs Cherry 2
Apple Braeburn vs Apple Golden 2

It’s easy to understand why our model misclassified these pairs!
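
Pairs like these can also be extracted programmatically from the true and predicted labels. The sketch below builds a confusion matrix and lists the most frequent mistakes; the helper names are hypothetical.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def most_confused(y_true, y_pred, min_count=1):
    """List (true, predicted, count) for wrong predictions, most frequent first."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return [(t, p, n) for (t, p), n in errors.most_common() if n >= min_count]
```

Correct predictions land on the diagonal of the matrix, so every off-diagonal cell with a large count points to a pair of classes worth inspecting side by side.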

You can see the full notebook here:

Reference: course: note by Tim Lee: lesson 1 notebook:

Source: Deep Learning on Medium