Original article was published on Deep Learning on Medium
Can Artificial Intelligence help diagnose COVID-19?
The unforeseen emergence of a novel contagion spreading across nations all over the world has caused one of the greatest pandemics in recorded history. COVID-19 has brought the world to a staggering halt, as countries struggle to cope with collapsing economies and overburdened healthcare systems. With a vaccine still some distance away in development, the only way to slow the spread of the virus is to test people in large numbers and isolate the infected from the healthy. Because regular tests are expensive for much of the general populace, there is an urgent need for innovative testing solutions that are cheaper, quicker and more effective.
Machine learning is an application of artificial intelligence (AI) that gives systems the ability to automatically learn and improve from experience without being explicitly programmed. It focuses on the development of computer programs that can access data and use it to learn for themselves. There is no dearth of applications of machine learning in the real world; examples include data science, recommender systems, anomaly detection and time-series forecasting.
Deep learning is a sub-branch of machine learning, inspired by the structure of the brain. In recent years deep learning techniques have continued to show impressive performance in medical image processing, as in many other fields. The primary difference between the two is that classical machine learning relies on manually supplied ‘features’, i.e. the factors or variables the model should take into account when making a prediction on unseen data. Deep learning, on the other hand, can be seen as an end-to-end approach to machine learning: given enough data, the model itself figures out which bits of information in the data are relevant and generates features automatically. Deep learning techniques are effective at drawing meaningful results from medical data, and deep learning models have been used successfully for tasks such as classification, segmentation and lesion detection.
Convolutional Neural Networks (CNNs) are deep neural networks designed specifically for working with images. They have proven effective for tasks like image classification, object detection, image segmentation and face verification. These networks work by sliding a small patch (aka a kernel), much smaller than the image and composed of numbers called ‘weights’, over the image. At each position the weights are multiplied by the corresponding pixel values and summed up to form one value of the output, called a feature map. This process is repeated over multiple layers, with special functions (activations) acting on the outputs, depending on the architecture of the model being used, and the final layer turns the result into probability scores for the various classes. In an ideal case, an image of a positive case should yield a high probability score for the positive class, whereas a negative one should not. However, this does not happen spontaneously on the first run through the model. During training the neural network sees hundreds of images along with their corresponding ‘correct answers’, i.e. the true labels of those images, and adjusts its weights to reduce its errors. In general, the more data that is fed into the network, the better its performance gets.
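To make the sliding-kernel idea concrete, here is a minimal sketch in plain Python. The 4x4 toy image, the hand-picked 2x2 kernel and the function names are illustrative assumptions, not taken from any of the models discussed in this article; in a real CNN the kernel weights are learned during training rather than chosen by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    Strictly speaking this is cross-correlation (the kernel is not flipped),
    which is what deep learning libraries usually call 'convolution'.
    """
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    feature_map = []
    for top in range(img_h - k_h + 1):
        row = []
        for left in range(img_w - k_w + 1):
            # Multiply each weight by the pixel underneath it and sum the products.
            total = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


def relu(feature_map):
    """A common 'special function' (activation): negative values become zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


# Toy 4x4 'image' (assumption): dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel (assumption) that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(convolve2d(image, kernel))
for row in feature_map:
    print(row)  # each row is [0.0, 2.0, 0.0]: the edge sits in the middle column
```

The feature map is largest exactly where the patch straddles the dark-to-bright boundary, which is the sense in which a kernel ‘detects’ a pattern. A full network stacks many such kernels and layers before producing class probabilities.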
In the COVID-19 context, CNNs are being used to classify chest X-ray images according to whether or not the patient is infected. Numerous individuals and organisations have conducted studies based on this approach, training a convolutional neural net on hundreds of publicly available X-ray images, both of COVID-19 patients and of healthy patients, to identify COVID-19 cases. Researchers have also proposed various novel network architectures meant specifically for diagnosing COVID-19, with classification accuracies going over 95% in several studies and some touching 98%. There are ongoing attempts to procure large amounts of chest X-ray imaging data for COVID-19 cases, of which only a very limited quantity is openly available on the internet right now.
A novel architecture proposed for recognizing COVID-19 from chest X-ray imaging data is CoroNet:
CoroNet is based on the Xception CNN architecture, shown in the figure below, with a dropout layer and two fully-connected layers added at the end. Xception, which stands for Extreme version of Inception (its predecessor model), is a 71-layer-deep CNN architecture pre-trained on the ImageNet dataset, an open-source benchmark dataset with over a million images.
Here’s a summary of results on how well it performs at recognizing COVID-19:
Another popular time-tested convolutional neural network is AlexNet, visualized below:
One of the major advantages of this network is that it is relatively lightweight, i.e. computationally inexpensive, while still performing well at COVID-19 identification. One study found it to deliver 83% classification accuracy.
Some other commonly used CNNs and their respective performance figures are shown below:
The InceptionV3 architecture looks like this:
These are the architectures of ResNet50 and Inception-ResNetV2:
In the table above, TP, TN, FP and FN stand for True Positives, True Negatives, False Positives and False Negatives respectively. Acc is accuracy, Spe is specificity, Pre is precision and F1 is the F1-score. All of these are metrics used for evaluating a model’s performance. Why is accuracy alone not an adequate measure? Here’s an example where it may give misleading results. Suppose you have a dataset of 100 images, where 95 are of healthy patients and 5 are COVID-19 positive. If you choose accuracy as the performance metric, you might be reasonably pleased to find 95% accuracy on the data. In the background, however, the model might have learnt to predict ‘healthy’ for every image, actually learning nothing and missing every infected patient! Choosing alternative evaluation metrics matters most when dealing with unbalanced datasets, which is the case for most COVID-19 deep learning studies.
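The 95/5 example can be checked with a few lines of Python. This is a minimal sketch: the function names and the convention of reporting an undefined ratio (0/0) as 0.0 are assumptions made here for illustration, and recall (the fraction of infected patients correctly identified) is added alongside the metrics named above because it exposes the problem most directly.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = COVID-19 positive, 0 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    """Return 0.0 when the ratio is undefined (assumed convention for this sketch)."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "specificity": safe_div(tn, tn + fp),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# 5 infected and 95 healthy patients, as in the example above.
y_true = [1] * 5 + [0] * 95
# A 'model' that has learnt nothing and always predicts healthy.
y_pred = [0] * 100

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy comes out at 0.95 and specificity at 1.00, yet precision, recall and F1 are all 0.00: the model never identifies a single infected patient, which accuracy alone hides.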
The use of artificial intelligence in the global battle against COVID-19 is thus a promising avenue to pursue, given the inexpensive nature of the technology and the high accuracies reported in these studies.
Governments all over the world are contemplating the use of AI as a cheaper alternative to conduct testing in large numbers for COVID-19.