Stomach Wars: The ConvNet Awakens

Written by Andrew Huang and Matthew Sun

Screenshot of Kylo Ren in Star Wars: The Last Jedi. (Lucasfilm); Abdomen scans from Li et al. (2017)

There’s been an awakening…Have you felt it?

As with many tasks in computer vision, convolutional neural networks (CNNs) have been very effective at image segmentation, the task of assigning each pixel of an image to a class. However, if you look at the leaderboards of prominent abdominal organ segmentation challenges (PROMISE12, SLiver07), CNNs only began to perform favorably against traditional semi-automatic or interactive image analysis techniques as recently as 2016. Even 3D CNNs based on U-Nets, a convolutional network architecture designed specifically for biomedical image segmentation, were challenging to train effectively with small datasets (Yu et al. 2017). Although CNNs had the advantages of fast computation and end-to-end design control, they were unable to outperform far less heralded methods. So why were CNNs initially unsuccessful? What changed?

At each downsampling step of the U-Net (the left side of the “U”), the number of feature channels is doubled. Every step in the expansive path (the right side of the “U”) consists of an upsampling of the feature map that halves the number of feature channels and a concatenation with the feature map from the contracting path (Ronneberger et al. 2015).
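
To make the channel arithmetic in the caption above concrete, here is a minimal Python sketch that only tracks channel counts along the two sides of the "U". The starting width of 64 channels and the depth of 4 are assumptions chosen for illustration, and the helper name `unet_channel_plan` is hypothetical; this is bookkeeping, not a trainable network.

```python
def unet_channel_plan(base_channels=64, depth=4):
    """Return channel counts for a U-Net-style contracting and expansive path.

    base_channels and depth are illustrative assumptions, not values
    taken from this article.
    """
    # Contracting path: the channel count doubles at every downsampling step.
    encoder = [base_channels * 2 ** level for level in range(depth + 1)]

    decoder = []
    channels = encoder[-1]
    # Expansive path: pair each upsampling step with the matching encoder level.
    for skip_channels in reversed(encoder[:-1]):
        upsampled = channels // 2                   # upsampling halves the channels
        after_concat = upsampled + skip_channels    # concatenation with the skip map
        decoder.append({"upsampled": upsampled, "after_concat": after_concat})
        # Assumption: later convolutions bring the width back to `upsampled`.
        channels = upsampled
    return encoder, decoder


encoder, decoder = unet_channel_plan()
print("contracting path:", encoder)
for step in decoder:
    print("expansive step:", step)
```

The concatenation is why each expansive step briefly has twice as many channels as the upsampled map alone: the feature map from the contracting path is stacked on top of it.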

Much of the problem lies in the difficulty inherent in segmenting organs in the abdomen. In addition to a lack of data (an issue common across medical imaging tasks) and variation across individuals, the human abdomen is packed with organs that have similar-looking tissue with fuzzy boundaries. Even once these organs have been segmented, tumor segmentation is considered an even tougher challenge. For example, tumors in the liver vary in size, shape, location, and number within a single patient’s liver, whereas segmenting the liver only requires finding one object’s boundaries in one consistent area of the body.

A common practice in radiation treatment planning is manual segmentation and volume analysis of organs, particularly the liver, on computed tomography (CT) and magnetic resonance (MR) scans. Abdominal organ segmentation is a crucial first step for computer-aided diagnosis (CAD), clinical interventions, radiotherapy, treatment planning, and post-treatment evaluation (Moghbel et al. 2017). For example, accurately determining a patient’s liver and liver tumor volume is essential for reducing the risk of delivering an insufficient or excess amount of radiation therapy. Manual segmentation, however, varies from practitioner to practitioner, must be performed slice-by-slice, and requires large amounts of time and labor. Performing liver segmentation on CT scans may take up to 90 minutes for a single patient (Gotra et al. 2017).

Image of liver segmentation and volumetric analysis task. (Source)
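
Once a segmentation mask exists, the volume analysis mentioned above reduces to counting voxels and multiplying by the physical size of one voxel. The sketch below shows that arithmetic in plain Python; the toy mask, the voxel spacing, and the function name `mask_volume_ml` are all made up for illustration, and a real pipeline would read the spacing from the scan's metadata.

```python
def mask_volume_ml(mask, spacing_mm):
    """Volume of a binary 3D mask in millilitres.

    mask: nested lists indexed [slice][row][col], with 1 marking organ voxels.
    spacing_mm: (slice thickness, row spacing, column spacing) in millimetres.
    """
    voxel_count = sum(value for slice_ in mask for row in slice_ for value in row)
    voxel_volume_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    return voxel_count * voxel_volume_mm3 / 1000.0  # 1 mL = 1000 mm^3


# Toy example: two 3x3 slices with 7 organ voxels in total (hypothetical data).
toy_mask = [
    [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
    [[0, 1, 0], [0, 1, 1], [0, 0, 0]],
]
toy_spacing_mm = (5.0, 1.0, 1.0)  # assumed spacing, not from any real scan

volume_ml = mask_volume_ml(toy_mask, toy_spacing_mm)
print(volume_ml)
```

Because the result scales directly with the voxel count, any boundary error in the segmentation translates into a volume error, which is part of why variation between practitioners matters.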

The creation of a fast, accurate, and automatic system for abdominal organ and tumor segmentation would have a vast, positive impact on the healthcare industry. Liver cancer alone is the second most lethal cancer for men and the sixth most lethal cancer for women, with 696,000 deaths worldwide in 2008. In the United States, colorectal cancer is the third highest cause of cancer occurrence and death for men and women combined. The production of fast and fully automatic models that require no expert knowledge to preprocess would be especially beneficial for hospitals in many developing countries, where the facilities are often understaffed and have few trained radiologists.

A New Hope

This graph tracks the maximum score on the SLIVER07 tumor segmentation challenge achieved by different methods each year, up to and including that year. Since 2013, the performance of CNN-based automatic methods in the SLIVER07 tumor segmentation challenge has surpassed human-based interactive methods and has nearly caught up to semi-automatic methods. Modern challenges, like PROMISE12 and LiTS, are dominated by CNN-based approaches.

Generally speaking, the approaches researchers have recently used to achieve leading performance in abdominal organ segmentation with CNNs have consisted of novel, hybrid network architectures that take advantage of the 3D structure of the scans. For instance, one group (Yu et al. 2017) used long and short residual connections in a neural network, resulting in an architecture that was a hybrid between a U-Net and a ResNet. These residual connections vastly improve gradient propagation throughout the network, which is essential for coping with limited training data. This proved to be an excellent design choice: not only did they achieve much better performance than older methods, but the residual connections also led to faster training. In fact, the model was able to unseat an algorithm that had held the top spot in the PROMISE12 Prostate MRI Segmentation Challenge for five years.

Yu et al. (2017)
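
The claim that residual connections improve gradient propagation can be illustrated with a deliberately tiny toy model. The sketch below is not the network from Yu et al.; it is a one-dimensional stand-in where each "layer" is a scaled tanh, with weights, depth, and function names chosen purely for illustration. It compares how much the output changes with the input (a numerical gradient) for a plain stack versus a stack with identity shortcuts.

```python
import math


def layer(x, weight):
    """A toy one-dimensional layer: scale the input, then squash with tanh."""
    return math.tanh(weight * x)


def plain_stack(x, weights):
    """Apply the layers one after another with no shortcuts."""
    for w in weights:
        x = layer(x, w)
    return x


def residual_stack(x, weights):
    """Same layers, but each output is added to its input (identity shortcut)."""
    for w in weights:
        x = x + layer(x, w)
    return x


def numerical_gradient(fn, x, weights, eps=1e-6):
    """Central-difference estimate of d(output)/d(input)."""
    return (fn(x + eps, weights) - fn(x - eps, weights)) / (2 * eps)


weights = [0.5] * 10  # ten layers with small weights (assumed values)
plain_grad = numerical_gradient(plain_stack, 0.1, weights)
residual_grad = numerical_gradient(residual_stack, 0.1, weights)
print("plain stack gradient:   ", plain_grad)
print("residual stack gradient:", residual_grad)
```

In the plain stack the gradient is a product of ten factors below one, so it shrinks toward zero. With the shortcut, every factor is one plus something non-negative, so the signal cannot vanish in this toy setting.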

Similarly, Li et al. (2017), whose model currently holds the top spot in the MICCAI 2017 Liver Tumor Segmentation Challenge by a large margin, proposed a hybrid densely connected UNet with a hybrid feature fusion (HFF) layer to jointly optimize the 2D and 3D DenseUNet layers. Other approaches include using pretrained CNNs like CaffeNet to generate features, and then feeding those features into SVMs or random forest classifiers. For example, Zhang et al. (2017) were able to surpass endoscopist performance in detecting and classifying colorectal polyps by training an SVM on features from the first 5 layers of CaffeNet. These models also perform well on dataset challenges, but are harder to fine-tune, as researchers don’t have end-to-end control over the model.

Data: The Sacred Texts

The problem that has most consistently hindered abdominal segmentation research is the limited amount of data available. One of the most prominent datasets, SLiver07, only contains 20 liver CT volumes for training and 10 CT volumes for testing (Lee et al. 2007). In the MICCAI PROMISE12 challenge, just 50 prostate MRI scans were available for download. Despite efforts at data augmentation, such as rotation, translation, and the addition of Gaussian noise, researchers acknowledge that training an efficient CNN under limited training data for medical image analysis is a fundamental challenge (Yu et al. 2017). Indeed, it is generally acknowledged that for a complex problem such as medical image segmentation, thousands, as opposed to tens, of data points are required to train a CNN for optimal performance.
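
The three augmentations named above (rotation, translation, and Gaussian noise) can be sketched in a few lines of plain Python on a single 2D slice. This is a simplified illustration under stated assumptions: rotation is restricted to 90 degrees, shifts are at most one pixel, the noise level is arbitrary, and all function names are hypothetical. The one point worth noticing is that geometric transforms are applied to the image and its label mask together, while noise touches only the image.

```python
import random


def rotate_90(image):
    """Rotate a 2D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def translate(image, shift_rows, shift_cols, fill=0.0):
    """Shift an image, filling newly exposed pixels with a constant."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            new_r, new_c = r + shift_rows, c + shift_cols
            if 0 <= new_r < height and 0 <= new_c < width:
                shifted[new_r][new_c] = image[r][c]
    return shifted


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel."""
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in image]


def augment(image, mask, rng, sigma=0.05):
    """Return one randomly augmented (image, mask) pair."""
    if rng.random() < 0.5:
        image, mask = rotate_90(image), rotate_90(mask)
    shift_r, shift_c = rng.randint(-1, 1), rng.randint(-1, 1)
    image = translate(image, shift_r, shift_c)
    mask = translate(mask, shift_r, shift_c, fill=0)
    image = add_gaussian_noise(image, sigma, rng)  # noise on the image only
    return image, mask


# Hypothetical 3x3 slice and its binary organ mask.
toy_image = [[0.1, 0.8, 0.9], [0.2, 0.7, 0.8], [0.1, 0.1, 0.2]]
toy_label = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
rng = random.Random(0)
aug_image, aug_label = augment(toy_image, toy_label, rng)
print(aug_label)
```

Augmentation of this kind produces new views of the same few patients, which helps explain why it eases but does not remove the need for more scans.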

Promisingly, however, public and private datasets appear to be growing: the 2017 MICCAI Liver Tumor Segmentation Challenge had 130 downloadable CT scans, and Yang et al. (2017) published the first liver segmentation study trained on 1000+ 3D CT scans, achieving excellent performance compared to state-of-the-art methods. Notably, Yang et al. also implemented a novel form of adversarial training to improve the performance of DI2IN, their multi-layer CNN. After training DI2IN, they used an adversarial network designed to distinguish between ground truth and the CNN output, allowing them to further update the parameters of DI2IN for higher performance without additional data.

The ConvNet Strikes Back

So far, we have focused primarily on liver and liver tumor segmentation, the most popular area of study in abdominal image segmentation. However, we believe that the advances made in this domain can be generalized to other domains, such as prostate, kidney, and pancreas segmentation, due to the similarity of these organs in appearance in CT scans. In fact, only in the past two years have we begun to see studies successfully using CNNs for multi-organ segmentation in the abdomen. In 2017, Roth et al. achieved the highest mean Dice score on a dataset of manually labeled CT scans containing seven abdominal structures using a 3D U-Net, beating out other approaches, such as 2D fully convolutional networks or random forests with graph cut. During the same year, Hu et al. (2017) and Larsson et al. (2017) both published studies using 3D deep convolutional neural networks for high-accuracy multi-organ segmentation as well. Multi-organ segmentation may increase in popularity as a problem of interest, especially if researchers attempt to use some of the aforementioned hybrid network architectures that have been successfully applied to liver tumor segmentation.
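
For readers unfamiliar with the Dice score mentioned above, it measures overlap between a predicted region and the ground truth as twice the size of the intersection divided by the sum of the two region sizes, and a mean Dice averages this over the organs. The sketch below computes it on flattened label maps; the toy labels and function names are made up for illustration, and treating a label absent from both maps as a perfect score is an assumed convention.

```python
def dice_score(prediction, truth, label):
    """Dice overlap for one label between two equally sized flat label maps."""
    pred_count = sum(1 for p in prediction if p == label)
    truth_count = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(prediction, truth) if p == label and t == label)
    if pred_count + truth_count == 0:
        return 1.0  # assumed convention: label absent from both maps
    return 2.0 * overlap / (pred_count + truth_count)


def mean_dice(prediction, truth, labels):
    """Average the per-label Dice scores over the given organ labels."""
    return sum(dice_score(prediction, truth, label) for label in labels) / len(labels)


# Hypothetical flattened label maps: 0 = background, 1 and 2 = two organs.
truth = [0, 1, 1, 1, 2, 2, 0, 0]
prediction = [0, 1, 1, 0, 2, 2, 2, 0]

organ_labels = [1, 2]
per_organ = {label: dice_score(prediction, truth, label) for label in organ_labels}
average = mean_dice(prediction, truth, organ_labels)
print(per_organ, average)  # both organs score 0.8 in this toy case
```

Here the prediction misses one voxel of the first organ and adds one spurious voxel to the second, and both mistakes cost the same amount of Dice.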

The Last Radiologist?

It’s important to note that organ segmentation is a common practice in radiation oncology and interventional radiology, but isn’t often performed by radiologists on a day-to-day basis. Advances in fully automated abdominal organ segmentation shouldn’t be taken as an indication that radiologists are in imminent danger of losing their jobs; instead, we hope these breakthroughs will ease the time and labor burden involved in radiation treatment planning. By freeing up clinicians from intensive, menial, and repetitive tasks, we hope that AI tools will allow them to focus on long-term treatment strategy, serve more patients, and ultimately, save as many lives as possible.


We’d like to express our gratitude to everyone who reviewed this article and provided feedback. First, we’d like to thank Matt Lungren MD MPH, Assistant Professor of Radiology at the Stanford University Medical Center, for providing his expert suggestions. We also thank Pranav Rajpurkar and Jeremy Irvin from the teaching staff for the AI for Healthcare Bootcamp, as well as our fellow bootcampers Norah Borus, Chris Lin, and Henrik Marklund for reviewing our initial drafts. Lastly, we’d like to give a shout-out to the Stanford ML Group for the opportunity to participate in the AI for Healthcare Bootcamp!

Works Cited

American Cancer Society. Cancer Facts & Figures 2017. Atlanta: American Cancer Society; 2011. [Accessed January 21, 2017]

Gotra, A., Sivakumaran, L., Chartrand, G., Vu, K. N., Vandenbroucke-Menu, F., Kauffmann, C., … & Tang, A. (2017). Liver segmentation: indications, techniques and future directions. Insights into Imaging, 1–16.

Hu, P., Wu, F., Peng, J., Bao, Y., Chen, F., & Kong, D. (2017). Automatic abdominal multi-organ segmentation using deep convolutional neural network and time-implicit level sets. International journal of computer assisted radiology and surgery, 12(3), 399–411.

Larsson, M., Zhang, Y., & Kahl, F. (2017, June). Robust Abdominal Organ Segmentation Using Regional Convolutional Neural Networks. In Scandinavian Conference on Image Analysis (pp. 41–52). Springer, Cham.

Lee, J., Kim, N., Lee, H., Seo, J. B., Won, H. J., Shin, Y. M., Shin, Y. G., & Kim, S. H. (2007). Efficient liver segmentation using a level-set method with optimal detection of the initial liver boundary from level-set speed images. Computer Methods and Programs in Biomedicine, 88(1), 26–28.

Li, X., Chen, H., Qi, X., Dou, Q., Fu, C. W., & Heng, P. A. (2017). H-DenseUNet: Hybrid densely connected UNet for liver and liver tumor segmentation from CT volumes. arXiv preprint arXiv:1709.07330.

Moghbel, M., Mashohor, S., Mahmud, R., & Saripan, M. I. B. (2017). Review of liver segmentation and computer assisted detection/diagnosis methods in computed tomography. Artificial Intelligence Review, 1–41.

Ronneberger, O., Fischer, P., & Brox, T. (2015, October). U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 234–241). Springer, Cham.

Roth, H. R., Oda, H., Hayashi, Y., Oda, M., Shimizu, N., Fujiwara, M., … & Mori, K. (2017). Hierarchical 3D fully convolutional networks for multi-organ segmentation. arXiv preprint arXiv:1704.06382.

Yang, D., Xu, D., Zhou, S. K., Georgescu, B., Chen, M., Grbic, S., … & Comaniciu, D. (2017, September). Automatic liver segmentation using an adversarial image-to-image network. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 507–515). Springer, Cham.

Yu, L., Yang, X., Chen, H., Qin, J., & Heng, P. A. (2017, February). Volumetric ConvNets with Mixed Residual Connections for Automated Prostate Segmentation from 3D MR Images. In AAAI (pp. 66–72).

Zhang, R., Zheng, Y., Mak, T. W. C., Yu, R., Wong, S. H., Lau, J. Y., & Poon, C. C. (2017). Automatic detection and classification of colorectal polyps by transferring low-level CNN features from nonmedical domain. IEEE journal of biomedical and health informatics, 21(1), 41–47.

Stomach Wars: The ConvNet Awakens was originally published in Stanford AI for Healthcare on Medium.
