Covid-19 lung X-Ray classification and CT detection demos in 10 minutes

Original article was published on Deep Learning on Medium

```python
# import the necessary packages
from tensorflow.keras.layers import AveragePooling2D, Dropout, Flatten, Dense, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from tensorflow.keras import optimizers, models, layers
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import cv2
import os
```
```python
# set learning rate, epochs and batch size
INIT_LR = 1e-5  # this value is specific to the chosen model: Inception, VGG or ResNet etc.
EPOCHS = 50
BS = 8

print("Loading images...")
imagePath = "./Covid_M/all/train"  # change to your local path for the sample images
imagePaths = list(paths.list_images(imagePath))
data = []
labels = []

# read all X-rays in the specified path, and resize them all to 256x256
for imagePath in imagePaths:
    label = imagePath.split(os.path.sep)[-2]
    image = cv2.imread(imagePath)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (256, 256))
    data.append(image)
    labels.append(label)

# normalise pixel values to real numbers between 0.0 and 1.0
data = np.array(data) / 255.0
labels = np.array(labels)

# perform one-hot encoding for multi-class labelling
label_encoder = LabelEncoder()
integer_encoded = label_encoder.fit_transform(labels)
labels = to_categorical(integer_encoded)
print("... ... ", len(data), "images loaded in multiple classes:")
print(label_encoder.classes_)
```

```
Loading images...
... ...  200 images loaded in 3x classes:
['covid' 'normal' 'pneumonia_bac']
```
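As a sanity check, the `LabelEncoder` + `to_categorical` pair above can be mirrored in plain NumPy. The sketch below is my own illustration, not part of the original notebook; `one_hot_encode` is a hypothetical helper that reproduces the alphabetical class ordering `LabelEncoder` uses:

```python
import numpy as np

def one_hot_encode(labels):
    # sorted unique class names, matching LabelEncoder's alphabetical ordering
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    # one row per sample, one column per class, 1.0 in the sample's class column
    encoded = np.zeros((len(labels), len(classes)), dtype="float32")
    for row, lab in enumerate(labels):
        encoded[row, index[lab]] = 1.0
    return classes, encoded

classes, y = one_hot_encode(["covid", "normal", "covid", "pneumonia_bac"])
print(classes)   # ['covid', 'normal', 'pneumonia_bac']
print(y[0])      # [1. 0. 0.]
```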

3. Add basic data augmentation, re-compose the model, then train it

```python
# split the data between train and validation
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.20,
                                                  stratify=labels, random_state=42)

# add a simple augmentation. Note: too much augmentation doesn't actually
# help in this case, as I found during testing.
trainAug = ImageDataGenerator(rotation_range=15, fill_mode="nearest")

# use the InceptionV3 model with transfer learning of the pre-trained "ImageNet" weights.
# note: if you choose VGG16 or ResNet you may need to reset the initial learning rate at the top.
baseModel = InceptionV3(weights="imagenet", include_top=False,
                        input_tensor=Input(shape=(256, 256, 3)))
#baseModel = VGG16(weights="imagenet", include_top=False, input_tensor=Input(shape=(256, 256, 3)))
#baseModel = ResNet50(weights="imagenet", include_top=False, input_tensor=Input(shape=(256, 256, 3)))

# add a couple of custom CNN layers on top of the InceptionV3 model
headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(4, 4))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(64, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(3, activation="softmax")(headModel)

# compose the final model
model = Model(inputs=baseModel.input, outputs=headModel)

# leave the pre-trained Inception "ImageNet" weights unfrozen for re-training,
# since I got an Nvidia T4 GPU to play with anyway
#for layer in baseModel.layers:
#    layer.trainable = False

print("Compiling model...")
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

# train the full model, since we left the pre-trained weights unfrozen above
print("Training the full stack model...")
H = model.fit_generator(
    trainAug.flow(trainX, trainY, batch_size=BS),
    steps_per_epoch=len(trainX) // BS,
    validation_data=(testX, testY),
    validation_steps=len(testX) // BS,
    epochs=EPOCHS)
```

```
... ...
Compiling model...
Training the full stack model...
... ...
Use tf.cast instead.
```
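A note on `Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)`: the legacy Keras optimizers apply a simple time-based decay per update step, which is why the initial learning rate matters for the chosen backbone. A minimal sketch of that schedule (my own illustration, not code from the notebook):

```python
def decayed_lr(init_lr, decay, iteration):
    # Keras-style time-based decay: lr_t = init_lr / (1 + decay * t),
    # where t counts optimizer update steps (batches), not epochs
    return init_lr / (1.0 + decay * iteration)

INIT_LR, EPOCHS = 1e-5, 50
decay = INIT_LR / EPOCHS

print(decayed_lr(INIT_LR, decay, 0))     # the full 1e-05 at the first update
print(decayed_lr(INIT_LR, decay, 1000))  # slightly smaller after 1000 updates
```

With `decay = INIT_LR / EPOCHS` the rate shrinks very gently, which suits fine-tuning all the unfrozen ImageNet weights.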
```
Epoch 1/50
40/40 [==============================] - 1s 33ms/sample - loss: 1.1898 - acc: 0.3000
20/20 [==============================] - 16s 800ms/step - loss: 1.1971 - acc: 0.3812 - val_loss: 1.1898 - val_acc: 0.3000
Epoch 2/50
40/40 [==============================] - 0s 6ms/sample - loss: 1.1483 - acc: 0.3750
20/20 [==============================] - 3s 143ms/step - loss: 1.0693 - acc: 0.4688 - val_loss: 1.1483 - val_acc: 0.3750
Epoch 3/50
... ...
... ...
Epoch 49/50
40/40 [==============================] - 0s 5ms/sample - loss: 0.1020 - acc: 0.9500
20/20 [==============================] - 3s 148ms/step - loss: 0.0680 - acc: 0.9875 - val_loss: 0.1020 - val_acc: 0.9500
Epoch 50/50
40/40 [==============================] - 0s 6ms/sample - loss: 0.0892 - acc: 0.9750
20/20 [==============================] - 3s 148ms/step - loss: 0.0751 - acc: 0.9812 - val_loss: 0.0892 - val_acc: 0.9750
```

4. Plot the confusion matrix for the validation results

```python
print("Evaluating the trained model ...")
predIdxs = model.predict(testX, batch_size=BS)
predIdxs = np.argmax(predIdxs, axis=1)
print(classification_report(testY.argmax(axis=1), predIdxs,
                            target_names=label_encoder.classes_))

# calculate a basic confusion matrix
cm = confusion_matrix(testY.argmax(axis=1), predIdxs)
total = sum(sum(cm))
acc = (cm[0, 0] + cm[1, 1] + cm[2, 2]) / total
sensitivity = cm[0, 0] / (cm[0, 0] + cm[0, 1] + cm[0, 2])
specificity = (cm[1, 1] + cm[1, 2] + cm[2, 1] + cm[2, 2]) / \
              (cm[1, 0] + cm[1, 1] + cm[1, 2] + cm[2, 0] + cm[2, 1] + cm[2, 2])

# show the confusion matrix, accuracy, sensitivity, and specificity
print(cm)
print("acc: {:.4f}".format(acc))
print("sensitivity: {:.4f}".format(sensitivity))
print("specificity: {:.4f}".format(specificity))

# plot the training loss and accuracy
N = EPOCHS
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy on COVID-19 Dataset")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig("./Covid19/s-class-plot.png")
```
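The accuracy, sensitivity and specificity arithmetic above can be expressed more compactly with NumPy slicing, treating row 0 of the confusion matrix as the "covid" class (which matches the alphabetical class order here). This is an illustrative rewrite with a made-up toy matrix, not the model's actual output:

```python
import numpy as np

def covid_metrics(cm):
    # accuracy, sensitivity and specificity exactly as in the snippet above,
    # assuming row/column 0 of cm is the 'covid' class
    cm = np.asarray(cm, dtype=float)
    acc = np.trace(cm) / cm.sum()                  # correct predictions / all samples
    sensitivity = cm[0, 0] / cm[0].sum()           # covid cases predicted as covid
    specificity = cm[1:, 1:].sum() / cm[1:].sum()  # non-covid cases not flagged as covid
    return acc, sensitivity, specificity

# toy 3x3 matrix: rows = true class, columns = predicted class;
# one normal lung mis-read as bacterial pneumonia, everything else correct
cm = [[12, 0, 0],
      [0, 13, 1],
      [0, 0, 14]]
acc, sens, spec = covid_metrics(cm)
print(acc, sens, spec)  # 0.975 1.0 1.0
```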

From the figure above, we can see that with the benefit of "transfer learning", even on a small dataset and with a quick training run of less than 5 minutes, the result is not bad at all: all 12 Covid-19 lungs are classified correctly, and only 1 normal lung out of 40 in total is wrongly classified as a "bacterial pneumonia" lung.

5. Plot the confusion matrix for testing on some real X-rays

Now why don't we go one step further and send in some real X-rays to test how effective this lightly trained classifier can be?

So I uploaded 27 X-ray images into the model that had not been used in the training or validation sets above:

9 Covid-19 lungs vs. 9 normal lungs vs. 9 bacterial-pneumonia lungs. (These images are attached to this post too.)

I only changed one line of the code in step 2, to make sure it loads the test images from a different path:

```python
...
imagePathTest = "./Covid_M/all/test"
...
```

Then we use the trained model above to predict:

```python
predTest = model.predict(dataTest, batch_size=BS)
print(predTest)
predClasses = predTest.argmax(axis=-1)
print(predClasses)
...
```
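`argmax(axis=-1)` simply picks the highest-scoring class per row of the softmax output; the resulting integers map back to names via `label_encoder.classes_`. A small self-contained illustration with made-up softmax scores (the values are hypothetical, not the model's actual predictions):

```python
import numpy as np

# class order as produced by LabelEncoder in the training step
class_names = np.array(["covid", "normal", "pneumonia_bac"])

# hypothetical softmax scores, one row per test image
predTest = np.array([[0.90, 0.05, 0.05],   # looks like covid
                     [0.10, 0.70, 0.20],   # looks like normal
                     [0.20, 0.10, 0.70]])  # looks like bacterial pneumonia

predClasses = predTest.argmax(axis=-1)     # index of the winning class per row
print(predClasses)                 # [0 1 2]
print(class_names[predClasses])    # ['covid' 'normal' 'pneumonia_bac']
```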

Finally we can re-calculate the confusion matrix as in step 4:

```python
testX = dataTest
testY = labelsTest
... ...
```

We got some real test results:

Again, the trained model seems able to classify all Covid-19 lungs correctly. That's not bad for such a small dataset.

6. Some further observations

I played with various datasets over the Easter weekend, and noticed that Covid-19 lungs seem to have some distinct features from the AI classifier's point of view: it is relatively easy to tell them apart from normal lungs and from bacterial or viral (flu) pneumonia lungs.

I also noticed in some quick tests that it is really difficult to distinguish between, say, bacterial and viral (ordinary flu) pneumonia lungs. If I had the time, I would have to go for XGBoost or an ensemble approach to chase down their differences, just as other Kaggle competitors would probably do in those circumstances.

Is the above really true from a clinical point of view? Do Covid-19 lungs really have some distinct features on X-ray? I am not so sure. I would have to ask real chest radiologists for their opinions. For now, I'd rather assume the current dataset is simply too small to draw any conclusion yet.

Next: I would love to collect more real-site X-rays and give them a serious look with XGBoost, AutoML or our new IRIS IntegratedML workbenches. Even more, I hope we could further classify Covid-19 lungs by severity, such as Level 1, Level 2 and Level 3, for clinicians and A&E triage doctors.

Anyway, I have attached the dataset and the Jupyter notebook above.


The above touches on a simple starting point for some quick setups in this "medical imaging" field. This Covid-19 front is actually the third I have tried to look into over the past year, on weekends and long holidays. The others were an "AI-assisted bone fracture detection" system and an "AI-assisted eye (ophthalmology) retina diagnosis" system.

The above model might be too simple to bother with for now, but sooner than we might think we will be unable to avoid the common question: how are we going to deploy it as a kind of "AI service"?

It is about technology stacks and service life cycles, but also about the actual "use case": what problems are we trying to solve, and what real value can it provide? The answers are sometimes not as clear as the technology itself.

A UK RCR (Royal College of Radiologists) draft proposed two simple use cases: a "radiologist's AI assistant", and "AI triage in A&E or primary care settings". To be honest, I personally agree and think the second one, "AI triage", would provide more value for now. Fortunately, today's developers are far more empowered than ever before by the cloud, Docker, AI, and certainly our HealthShare, to address this sort of case.

For example, the screen capture below shows an enterprise-grade "AI-assisted CT detection of Covid-19 lungs" service hosted in AWS, and how it can be embedded directly into a HealthShare Clinical Viewer for demo purposes. As with X-rays, a CT set in DICOM can be uploaded or sent directly into this open PACS viewer; then, with one click on "AI diagnosis", it gives a quantified indication of Covid-19 probability in less than 10 seconds, based on the trained models, working 24x7 for the quick "AI triage" use case. The X-ray classification models can be deployed and invoked in the same way on top of the existing PACS viewer, within the same patient's context, to help front-line clinicians.


Again, the test images are from the Covid-19 lung X-ray set by Joseph Paul Cohen, plus some clean lungs from the open Kaggle chest X-ray sets, collected by Adrian Yu at the GradientCrescent repository. I also reuse the structure from Adrian at PyImageSearch, with my own improved training as listed under the "Test" section. And thanks to HYM for providing the AWS cloud-based open PACS viewer with AI modules for X-ray and CT images, used to look into the test datasets.

What’s Next

Today AI has reached into almost every aspect of human health and daily life. In my oversimplified view, AI applications in healthcare largely fall into the following few directions:

  • Medical imaging: X-ray, CT or MRI images of the chest, heart, eye, brain, etc.
  • NLP comprehension: mining, understanding and learning from vast text assets and knowledge bases.
  • Population health: trend prediction, analysis and modelling, including epidemiology.
  • Personalised AI: a set of AI/ML/DL models specially trained for and dedicated to an individual, growing up and growing old with him or her, as a personal health assistant?
  • Other AI(s): such as AlphaGo, or even AlphaFold for 3D protein structure prediction, also fighting against Covid-19. I am really impressed by those cutting-edge breakthroughs.

We will see what we can pick up along the journey. It might just be a wish list anyway, unless we stay at home for far too long.

Appendix — File uploads. The archive includes the images used above and the Jupyter Notebook file. It might take a couple of hours to set up and run from scratch during a weekend.