Anomaly Detection in Capsule Endoscopy Images: Feature selection and extraction

I wrote my first blog a few days back. I had kept my expectations rather realistic: by choosing a technical topic in an uncommon domain, I had seriously narrowed down the reader base. I wanted to write, and more than that, I wanted people to read what I wrote, because readers are a writer's soul.

The response took me by surprise. It was nothing like I had expected! I got in touch with so many wonderful people through my last blog. I can't thank you all enough for motivating me. A big thanks to everyone who read it, and an even bigger one to all those who let me know they were looking forward to the next article. So here I am, typing away in the middle of the night during my exam week, excitement surpassing all preparation worries and enthusiasm making me underestimate the remaining syllabus. I guess it's that point in my life where I have to choose between what I have to do and what I want to do (which doesn't mean I won't study).

Now, picking up from where I left off last time: feature selection and extraction, the most decisive aspect of designing a good machine learning model, was the biggest problem I faced. Well, not feature extraction per se; extracting "relevant" and "distinguishable" features was.

Usually, in medical images, the background is static, i.e., a major portion of the image (lighting, angles, etc.) won't change much across successive images. For example, consider lung CT scans for detecting cancerous nodules. The most relevant feature is then spotting where cells are abnormally clumped together. That's about it: one feature serves the purpose. Here, by "feature" I mean something that can be perceived by the human eye. I perceive it as one feature, but to let the model learn it efficiently I would use something like a Grey-Level Co-occurrence Matrix (GLCM). Ultimately, I'd be feeding multiple columns of numeric data (features) to the model in order to make it learn what I call "a single feature". When features are distinguishable, feature selection becomes rather easy.
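To make that concrete, here is a minimal sketch of turning one perceptual feature (texture) into numeric columns with a GLCM. It assumes scikit-image (which renamed greycomatrix/greycoprops to graycomatrix/graycoprops in version 0.19); the distances, angles, and properties are illustrative choices, not a prescription.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_patch):
    """One perceptual 'feature' (texture) -> several numeric columns."""
    # Co-occurrences at a few offsets and angles make the features
    # less dependent on orientation.
    glcm = graycomatrix(
        gray_patch,
        distances=[1, 2],
        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
        levels=256,
        symmetric=True,
        normed=True,
    )
    # Each property yields one value per (distance, angle) pair.
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# A random stand-in for a greyscale CT patch.
patch = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print(glcm_features(patch).shape)  # (32,) = 4 props x 2 distances x 4 angles
```

One human-perceived texture becomes 32 numeric columns here, which is exactly the "single feature, many columns" situation described above.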

Doesn't that defeat the whole purpose of using an AI algorithm to solve the problem? It is supposed to notice the patterns that our eyes miss. That's precisely when we turn to convolutional approaches and let the network figure out what is going on.
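For illustration, here is what that looks like in code: a tiny convolutional classifier (PyTorch assumed, layer sizes hypothetical) where learned filters take over the role of the hand-crafted GLCM columns.

```python
import torch
import torch.nn as nn

# Learned convolutional filters replace hand-crafted texture features.
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 2),  # e.g., normal vs. anomalous frame
)

logits = cnn(torch.rand(1, 3, 256, 256))  # stand-in for one RGB frame
print(logits.shape)  # torch.Size([1, 2])
```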

Now, I am going to highlight a few drawbacks of this approach; some are general, whereas others are specific to this problem statement. Firstly, deep learning techniques based on convolutional neural networks incur high computation costs and require a large dataset for proper training. A high-end GPU can still be arranged, but obtaining a good amount of data was a big challenge. Medical data is hard to obtain for several reasons: doctor-patient confidentiality, the potential misuse of data, erroneous data, the man-hours involved, etc. Picking a problem statement that already has a publicly available dataset is the way to go.

Secondly, deep learning architectures don't generalize well, i.e., they are not invariant to small transformations. In other words, the model will find it hard to adjust if the data comes from another vendor's pill whose specifications differ from the one it was trained on. Most importantly, this approach might still work wonders with labeled data, but I didn't have access to that either.

Now, consider using an unsupervised learning architecture like a convolutional autoencoder. It performs well for a given type of data when there isn't too much variation among samples. However, we're talking about capsule endoscopy here: a pill traversing our digestive organs while the background color and texture change continuously. The problem wasn't too much variation; rather, it was that 90% of the variation carried no information. I had to teach the model to focus on the other 10% of the variation and ignore the rest.
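To show the idea (a minimal sketch, not my exact architecture), here is a convolutional autoencoder in PyTorch: train it on normal frames only, then flag frames whose reconstruction error is high. The layer sizes and the 256x256 input are assumptions.

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1),   # 3x256x256 -> 16x128x128
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # -> 32x64x64
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),  # reconstruct RGB values in [0, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Anomaly score = per-frame reconstruction error: a high error means the
# frame doesn't resemble the "normal" data the model was trained on.
model = ConvAutoencoder()
frame = torch.rand(1, 3, 256, 256)  # stand-in for an endoscopy frame
score = nn.functional.mse_loss(model(frame), frame)
print(float(score))
```

The catch, as described above, is that a plain reconstruction error like this is dominated by the 90% of variation that carries no diagnostic information.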

A capsule endoscopy procedure is used to diagnose a variety of diseases and internal conditions: polyps, bleeding, stomach worms, ulcers, lesions, inflammation, strictures, etc. Each of these categories has its own unique visual identifiers, but that doesn't mean they are easy to spot. Difficulties arise when the capsule's vision is occluded by bubbles, water, or active bleeding; when blood changes color over time, from red to black; or when the capsule's battery drains, reducing the illumination provided by the mounted light. Sometimes, food particles or stool may be misinterpreted as an anomaly.

Many standard transformations and metrics for image data are designed to work on greyscale images. Rescaling to lower dimensions or decreasing the resolution is also often recommended to ease computation. Neither was an option here, because both the color and the texture carried important information: converting to greyscale would have erased what the color encodes, and lowering the resolution would have further degraded the texture features.
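Here is a tiny illustration (my own example, not from the original analysis) of what greyscale conversion throws away: two clearly different colors can collapse to almost the same grey level.

```python
import numpy as np
from skimage.color import rgb2gray

red = np.array([[[1.0, 0.0, 0.0]]])      # a bright red pixel
green = np.array([[[0.0, 0.297, 0.0]]])  # a dim green pixel

# rgb2gray uses luminance weights ~0.2125 R + 0.7154 G + 0.0721 B,
# so these two very different colors land on nearly the same grey value.
print(rgb2gray(red)[0, 0], rgb2gray(green)[0, 0])  # both ~0.2125
```

In a domain where the difference between fresh blood and healthy mucosa is largely a difference in color, that collapse is exactly the information loss that ruled greyscale out.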

Problems like the lack of labeled data, unclean and noisy images, and high variance, clubbed with the numerous classes to identify, make targeting just one or two classes look like a better and more logical approach. We can always debate this; responses are most welcome.

Another blog will be posted soon; till then, stay tuned!