Up to 100% accuracy for cellular histopathology image classification… with FastAI and Kimia 960…

Source: Deep Learning on Medium


FastAI + ResNet152 and differential learning rates.

Using recent deep learning techniques such as differential learning rates and automatic learning rate detection, with FastAI and an ImageNet-pretrained ResNet152 CNN, I was able to build and train a neural network model showing up to 100% accuracy in classifying the Kimia 960 histopathology dataset.

Previous papers have shown 91% to 94% peak accuracy using AI (CNNs), and lower rates for classical machine learning (e.g. 75–81%), so achieving up to 100% was quite exciting. This article provides some insight into the process and results!

Background: the Kimia 960 dataset consists of 960 histopathology images of muscle, epithelial and connective tissue, selected from a larger set of whole-slide scans. It is considered a challenging dataset due to its large ‘intra-class variability’, as shown in this paper:

Each row is the same type of tissue…you can see the large differences in appearance. (image from paper: https://www.researchgate.net/publication/320192322_A_Comparative_Study_of_CNN_BoVW_and_LBP_for_Classification_of_Histopathological_Images)

When reviewing some of the papers using AI for classification, I noticed that although they were as recent as six months old, their architectures and techniques did not leverage some of the newer breakthroughs that FastAI has brought to the fore, notably progressive resizing, cyclical learning rate annealing with restarts and differential learning rates. I decided this was an excellent challenge and proceeded to prep the dataset for use with FastAI.

ResNet50 = Could not get higher than 98%, not enough horsepower to solve: I initially used ResNet50, believing it would be sufficient. A few hours in, however, it became clear that ResNet50 could not get past 98% accuracy. The issue was consistent confusion between classes A and D in the dataset. Even with multiple changes and various learning rate adjustments, ResNet50 could not reliably tell the two apart.

When you view some of the images from these two classes, you can see they are quite similar:

Two different classes but very similar: this is where ResNet50 became stuck.

Thus, I decided the root issue might be that the CNN simply needed more raw horsepower, and restarted with ResNet152, which adds roughly 100 more layers of neural ‘power’ (about 3x the total layers).
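The layer counts can be checked with some quick arithmetic. This is a rough sketch, assuming the standard bottleneck configurations from the original ResNet paper (block counts of 3, 4, 6, 3 for ResNet50 and 3, 8, 36, 3 for ResNet152, three convolutions per block, plus the stem convolution and the final fully connected layer). The helper name `resnet_depth` is made up for illustration.

```python
# Block counts per stage for the two bottleneck ResNets (assumed standard configs).
BLOCKS = {
    "resnet50": (3, 4, 6, 3),
    "resnet152": (3, 8, 36, 3),
}


def resnet_depth(name):
    """Weighted layers: 3 convolutions per block + stem conv + final FC layer."""
    return 3 * sum(BLOCKS[name]) + 2


d50 = resnet_depth("resnet50")
d152 = resnet_depth("resnet152")
extra_layers = d152 - d50
depth_ratio = d152 / d50

print(f"ResNet50: {d50} layers, ResNet152: {d152} layers")
print(f"Extra layers: {extra_layers}, ratio: {depth_ratio:.2f}x")
```

Under these assumptions the difference works out to 102 layers, or just over 3x the depth.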

That proved to be the right change: impressively, ResNet152 never got stuck the way ResNet50 did.

Some minor issues after initial training, but no specific concentration of confusion (the diagonal shows correct classifications; anything off the diagonal is a misclassification).
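For readers new to confusion matrices, here is a minimal sketch of how one is built and read. The labels below are made up purely for illustration (they are not results from this project), and the helper names are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows are the true class, columns are the predicted class."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def accuracy(matrix):
    """Correct predictions sit on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def confused_pairs(matrix, classes):
    """Off-diagonal cells: (count, true class, predicted class), largest first."""
    n = len(classes)
    pairs = [
        (matrix[i][j], classes[i], classes[j])
        for i in range(n)
        for j in range(n)
        if i != j and matrix[i][j] > 0
    ]
    return sorted(pairs, reverse=True)


# Made-up example: a model that mixes up classes A and D.
classes = ["A", "B", "C", "D"]
actual = ["A", "A", "A", "B", "B", "C", "C", "D", "D", "D"]
predicted = ["A", "D", "A", "B", "B", "C", "C", "D", "A", "D"]

matrix = confusion_matrix(actual, predicted, classes)
acc = accuracy(matrix)
pairs = confused_pairs(matrix, classes)

for label, row in zip(classes, matrix):
    print(label, row)
print("accuracy:", acc)
print("confused pairs:", pairs)
```

A concentration of counts in one off-diagonal pair (as with A and D in this toy data) is the kind of pattern that signals a model is stuck on two similar classes.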

Differential learning rates and auto learning rate finder, for the win: While the initial training results were pretty steady, ResNet152 began to peak around 97–98%, and signs of potential over-training were starting to appear in the validation results.

Thus, I had to become very conservative with the learning rates, and this is where differential learning rates appeared to really excel in guiding ResNet152 to 100% accuracy.

The process was to repeatedly run the FastAI learning rate finder, and then use differential learning rates so that the initial layers were only gently coaxed, while the middle and final layers received a bit more push from the training epochs. By using differential learning rates, you don’t overly disturb what’s working, while still providing more feedback to the layers that need it. This is a huge improvement in finesse over the more traditional fixed learning rate for the entire network, and in this case it appeared to be a key difference.
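The idea behind a learning rate finder can be illustrated with a toy example. The sketch below is not FastAI's implementation: it assumes a one-parameter quadratic loss, plain gradient descent, and hypothetical helper names (`lr_range_test`, `suggest_lr`). It raises the learning rate exponentially at each step, records the loss, stops once the loss blows up, and then suggests the rate at which the loss was falling fastest.

```python
import math


def lr_range_test(loss_fn, grad_fn, w0, lr_min=1e-5, lr_max=10.0, steps=100):
    """Raise the learning rate exponentially each step; record (lr, loss)."""
    w = w0
    best = float("inf")
    history = []
    for i in range(steps):
        lr = lr_min * (lr_max / lr_min) ** (i / (steps - 1))
        loss = loss_fn(w)
        if not math.isfinite(loss) or loss > 4 * best:
            break  # loss has diverged: the learning rate is now too high
        best = min(best, loss)
        history.append((lr, loss))
        w = w - lr * grad_fn(w)  # one gradient descent step at this rate
    return history


def suggest_lr(history):
    """Pick the rate where loss drops fastest per unit of log10(lr)."""
    best_lr, best_slope = None, 0.0
    for (lr0, loss0), (lr1, loss1) in zip(history, history[1:]):
        slope = (loss1 - loss0) / (math.log10(lr1) - math.log10(lr0))
        if slope < best_slope:
            best_lr, best_slope = lr0, slope
    return best_lr


# Toy problem (an assumption for illustration): loss(w) = w^2, gradient = 2w.
history = lr_range_test(lambda w: w * w, lambda w: 2 * w, w0=1.0)
suggested = suggest_lr(history)
print(f"recorded {len(history)} steps, suggested lr ~ {suggested:.3g}")
```

In practice the finder is re-run as training progresses, because the curve (and therefore the sensible range) shifts as the model improves.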

With learning rates steadily scaling down as the learning rate finder’s curves changed, the final training runs used a learning rate range of 1e-8 to 1e-6 (1e-8 for the first layers, 1e-7 for the middle layers and 1e-6 for the final layers). The validation results slowly climbed until the exciting moment of repeated 100% accuracy on the validation set:
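To make that split concrete, a range of learning rates can be spread across layer groups in even multiplicative steps. This is a minimal sketch, assuming three layer groups and geometric spacing, which reproduces the 1e-8 / 1e-7 / 1e-6 split described above. The helper name `spread_learning_rates` is hypothetical, not a FastAI function.

```python
import math


def spread_learning_rates(lowest, highest, n_groups):
    """Geometrically spaced rates: lowest for the earliest group, highest for the last."""
    if n_groups == 1:
        return [highest]
    ratio = (highest / lowest) ** (1 / (n_groups - 1))
    return [lowest * ratio ** i for i in range(n_groups)]


# Assumed grouping: early layers, middle layers, final layers.
groups = ("first layers", "middle layers", "final layers")
rates = spread_learning_rates(1e-8, 1e-6, len(groups))

for group, lr in zip(groups, rates):
    print(f"{group}: {lr:.0e}")
```

Each group here trains ten times faster than the one before it, so the pretrained early layers barely move while the later, more task-specific layers keep adapting.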

First glimpse of perfect accuracy….
Success!

As the training results above show, 100% accuracy was repeatable and with that I froze the model for future use.

To be safe, I titled this “up to 100%” accuracy, as there’s no guarantee that 100% accuracy would be maintained on even larger datasets. However, the model did reach 100% on consecutive runs, and in addition, 98.7% was very stable. These results exceed previously published CNN results ranging from 91–94%, as well as other non-deep-learning approaches to this dataset (e.g. 75–82% with Support Vector Machine learning).

We can thus see that Deep Learning (or AI), with the techniques leveraged in FastAI, continues to rapidly improve and evolve, and holds the prospect of improving health care for all in the future.

References (previous papers using AI (CNN) and Kimia 960):

1. https://www.researchgate.net/publication/320192322_A_Comparative_Study_of_CNN_BoVW_and_LBP_for_Classification_of_Histopathological_Images

2. https://arxiv.org/pdf/1805.05837