Anomaly Detection in Capsule Endoscopy Images: An Insight


Hey everyone! This is my very first article on Medium. I have wanted to write for a long time, but I was always caught up in a dilemma: "To blog for myself?" or "To blog for the community?" I always wanted to give something back to the community.

After working for over a year on various Machine Learning and Deep Learning projects, I finally feel that I am in a position to contribute something through the series of blog posts I am planning to write. So, this will be a short read: less of the technical part and more of an intuitive insight into the domain of Biomedical Image Processing and how I, as a beginner, experienced it.

Capsule endoscopy is a non-invasive method of performing endoscopy. The patient ingests a capsule, which takes anywhere between 5 and 8 hours to be excreted. Along the way, the capsule records video, typically 5 to 6 hours of footage, and the physician watches the entire recording to make a diagnosis. That takes an experienced radiologist nearly 3 to 4 hours, and even longer for less experienced ones. It is therefore a fairly time-consuming procedure, and even though a patient is more likely to opt for the non-invasive alternative, it is not always the best option in the doctor's eyes.
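To get a sense of what that raw data looks like to a machine: before any model can run, those hours of video have to be broken into individual frames. Here is a minimal sketch with OpenCV; the file name and sampling rate are made-up placeholders, not details from the actual project.

```python
import os

import cv2  # pip install opencv-python

# Hypothetical file name and sampling rate, purely for illustration.
VIDEO_PATH = "capsule_study.avi"
SECONDS_BETWEEN_FRAMES = 1.0

os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture(VIDEO_PATH)
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if metadata is missing
step = max(1, int(fps * SECONDS_BETWEEN_FRAMES))

saved = index = 0
while True:
    ok, frame = cap.read()
    if not ok:  # end of video (or unreadable file)
        break
    if index % step == 0:
        cv2.imwrite(f"frames/frame_{saved:06d}.png", frame)
        saved += 1
    index += 1
cap.release()
print(f"Saved {saved} frames from about {index / fps / 3600:.1f} hours of video")
```

Even at one frame per second, a single study yields roughly twenty thousand images, which is what makes watching (or analyzing) it end to end so expensive.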

I talked to a few radiologists and learned that there are quite a few drawbacks associated with the capsule endoscopy procedure: the movement of the device cannot be controlled externally, which might be needed in case of occluded vision, and there is no provision for collecting biopsy samples through the capsule, something its invasive counterpart makes possible. Moreover, when I discussed the idea of building an Artificial Intelligence-based automated system for anomaly detection, I found that the doctors were rather reluctant to put their faith in it. According to them, the system would be of absolutely no use unless it achieved more than 99% accuracy. This was not at all surprising, because we are talking about detecting diseases here; anything less than 98% or 99% essentially puts the patient's life at risk.

I was just embarking on this journey, and learning that the experts of this very field were not in support of a technical solution somewhat demoralized me. I couldn't just sit there and pitch 99% accuracy before even working on the project; I wasn't even sure anymore. The dataset was already a big problem: I was struggling to get a sufficient amount of data to train and test models. And it was not just raw data that posed problems, because unannotated data could still be gathered, but annotated data required the doctors' cooperation, which translates to a lot of time spent with a junior or senior doctor labeling the data and generating frame-wise masks (a sketch of what such annotations look like follows below). Doctors are busy people, and it's only fair that they don't readily agree to invest so much of their precious time into an idea they don't fully believe in. It was a vicious cycle: the doctors needed accuracy to be convinced about the solution, and I needed data to get that kind of accuracy. After a series of meetings and a lot of discussions, I did get my hands on a small dataset, and that's when I finally started working on what seemed like a complicated problem.
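To make "frame-wise masks" concrete: supervised training needs, for every frame, a pixel-level label image marking the anomaly, so the dataset is really a set of image/mask pairs. A minimal loader sketch, assuming a hypothetical directory layout with matching file names (none of this reflects the confidential project data):

```python
from pathlib import Path

import numpy as np
from PIL import Image  # pip install pillow

# Hypothetical layout: every frame in images/ has a same-named binary mask in masks/.
IMAGE_DIR, MASK_DIR = Path("images"), Path("masks")

def load_pair(name: str):
    """Load one annotated example: the frame and its frame-wise anomaly mask."""
    image = np.asarray(Image.open(IMAGE_DIR / name).convert("RGB"), dtype=np.float32) / 255.0
    mask = np.asarray(Image.open(MASK_DIR / name).convert("L"), dtype=np.float32) / 255.0
    return image, (mask > 0.5).astype(np.float32)  # binarize: anomaly vs. background

# Every single mask below had to be drawn and verified by a doctor.
pairs = [load_pair(path.name) for path in sorted(IMAGE_DIR.glob("*.png"))]
```

Multiply one hand-drawn mask by thousands of frames per study and the cost of a doctor's time becomes obvious.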

This is a technical topic and it is supposed to be accompanied by code snippets and graphs, but unfortunately I won't be able to share any of that with regard to this project; the entire thing is confidential until a publication is submitted. I have been working on this project for the past six months, and in that time I have developed my own opinions about a lot of the situations and problems I encountered. That is what I want to share with all of you. It is the one thing that can't be copy-pasted, and I didn't want to start blogging someone else's views.

I'll be writing more posts on the same topic, discussing the problems I faced; which topics I cover will depend on the kind of response I get. I'd love to share some technical insights as well, like the issues faced while choosing the right kind of models, the struggle for annotated data, how that led me to look for approaches that did not require annotations, and how I experimented with unsupervised learning methods, desperately hoping to find a solution there (the sketch below gives a flavor of that idea). Biomedical Image Processing is a fascinating field. It wasn't an easy journey, but what kept me going was the constant realization that I was working in a domain that has the potential to improve healthcare and benefit real lives.
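For readers curious what an unsupervised approach to anomaly detection can look like, here is one common pattern (not necessarily what this project used): train an autoencoder only on frames assumed to be normal, then treat per-frame reconstruction error as an anomaly score, since the model never learns to reproduce what it has never seen. A minimal PyTorch sketch with placeholder data:

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Tiny convolutional autoencoder for 128x128 RGB frames."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),   # 128 -> 64
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 64 -> 32
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),     # 32 -> 64
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),   # 64 -> 128
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = ConvAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder "normal" frames; in practice these would come from real studies.
normal_frames = torch.rand(64, 3, 128, 128)
for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(normal_frames), normal_frames)
    loss.backward()
    optimizer.step()

# At test time, per-frame reconstruction error becomes the anomaly score.
with torch.no_grad():
    test = torch.rand(8, 3, 128, 128)  # placeholder test frames
    errors = ((model(test) - test) ** 2).mean(dim=(1, 2, 3))
    flagged = errors > errors.mean() + 2 * errors.std()  # crude threshold
```

The appeal of this family of methods is exactly what the annotation struggle above suggests: it needs only frames, not masks.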

Stay tuned for the next article, in which I'll discuss how feature extraction posed the biggest problem in this case, and how important it was to first frame the problem statement in the right way.