Original article can be found here (source): Deep Learning on Medium
The dataset consisted of 73 posterior-anterior (PA) chest x-rays from patients with Covid-19 and 73 PA chest x-rays from patients with non-covid pneumonia, for a total of 146 images.
Covid-19 images were taken from a number of sources, including an Italian radiology website discussing 60 Covid-19 cases and research articles found on PubMed. The non-covid pneumonia images were taken from the training images in the RSNA Pneumonia Detection Challenge on Kaggle. Note that there is another nicely labeled pneumonia dataset available on Kaggle, but I believe using it in this setting would be a mistake because its patients are pediatric. Chest x-rays (CXRs) of adults and children are quite easily distinguishable, particularly in the pattern of the rib cage, and it is possible the algorithm would separate the classes using this feature rather than features intrinsic to the disease.
Let us take a look at some sample data.
As you can see, the two classes look quite similar.
Transfer Learning with ResNet34:
Transfer learning was used to compensate for the small dataset. ResNet34 was chosen as the base architecture and trained for 7 epochs, with the results below:
The validation accuracy is about 86%.
The confusion matrix for the validation set shows that only 4 of the 29 images are misclassified!
That is pretty good for such a small dataset.
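The accuracy figure follows directly from the confusion-matrix counts. The short sketch below redoes that arithmetic; the function name is hypothetical, and only the totals (29 images, 4 errors) come from the article.

```python
def accuracy_from_errors(total: int, errors: int) -> float:
    """Fraction of correctly classified images given the error count."""
    if total <= 0 or not 0 <= errors <= total:
        raise ValueError("need total > 0 and 0 <= errors <= total")
    return (total - errors) / total


# 29 validation images, 4 misclassified: 25 of 29 are correct.
val_accuracy = accuracy_from_errors(total=29, errors=4)
print(f"{val_accuracy:.1%}")  # prints 86.2%
```

With a validation set this small, each individual misclassification shifts the accuracy by roughly 3.4 percentage points (1 / 29), which is worth keeping in mind when comparing runs.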
Taking a closer look at the mistakes: below on the left is a case of pneumonia that the model confidently labeled as Covid-19, and on the right is a case of Covid-19 that it confidently labeled as pneumonia:
Next, let’s try fine-tuning.
We unfreeze the layers of the network and allow their weights to change, but only through very small updates (in practice, a very small learning rate). This lets the pretrained parameters adapt to the dataset while hopefully preventing overfitting.
We fine-tune for 2 epochs and our accuracy goes from 86% to 89.7%!
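The unfreezing step can be sketched in plain PyTorch as follows. One common way to keep the updates to pretrained layers small is to give them a lower learning rate than the new head. The helper name, the use of Adam, and the specific learning rates are all assumptions chosen for illustration; the article does not state the values or the optimizer actually used.

```python
import torch
import torch.nn as nn


def unfreeze_for_fine_tuning(
    model: nn.Module,
    head: nn.Module,
    backbone_lr: float = 1e-5,  # assumed value: small steps for pretrained layers
    head_lr: float = 1e-4,      # assumed value: larger steps for the new head
) -> torch.optim.Optimizer:
    """Hypothetical helper: make all layers trainable with per-group learning rates."""
    # Unfreeze: every parameter receives gradient updates again.
    for param in model.parameters():
        param.requires_grad = True

    # Separate the head's parameters from the rest of the network.
    head_ids = {id(p) for p in head.parameters()}
    backbone_params = [p for p in model.parameters() if id(p) not in head_ids]

    # One optimizer, two parameter groups, each with its own learning rate.
    return torch.optim.Adam(
        [
            {"params": backbone_params, "lr": backbone_lr},
            {"params": list(head.parameters()), "lr": head_lr},
        ]
    )
```

For a torchvision ResNet34, the head would be `model.fc`, so a call might look like `unfreeze_for_fine_tuning(model, model.fc)`, followed by a short training loop of a couple of epochs.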