Review: Suggestive Annotation — Deep Active Learning Framework (Biomedical Image Segmentation)

Source: Deep Learning on Medium

Reducing the Annotation Effort and Cost of Biomedical Experts such as Radiographers

Glands Segmentation in Colon Histology Images (Left) & Lymph Nodes Segmentation in Ultrasound Images (Right)

In this story, Suggestive Annotation (SA) is reviewed. Colon cancer and cancer in the lymph nodes (lymphoma), for example, are two common types of cancer that cause death. Accurate segmentation is essential for knowing the size and shape of the objects of interest, e.g. for diagnosis or cancer grading/staging. Conventionally, annotating a medical image requires an expert from the biomedical field, which takes considerable effort and cost.

As annotation is expensive, a deep active learning framework is applied to the biomedical field in order to train the deep neural network with fewer annotated samples. This is a 2017 MICCAI paper with more than 40 citations. (Sik-Ho Tsang @ Medium)


  1. Problems of Annotation in Biomedical Imaging by Experts
  2. What is Active Learning?
  3. From Active Learning to Deep Active Learning Framework
  4. Proposed Deep Active Learning Framework Using Suggestive Annotation
  5. Results

1. Problems of Annotation in Biomedical Imaging by Experts

Annotation in Biomedical Imaging by Experts
  • Only trained biomedical experts can annotate the data.
  • Extensive manual efforts (Time & Cost).
  • Human Mistakes.

2. What is Active Learning?

Active Learning
  • Annotation/labeling is an expensive activity, especially in the biomedical area.
  • Active learning is suggested as a remedy. It is described in the 2010 technical report “Active Learning Literature Survey”, which has over 3000 citations.
  • As shown above, the human annotates some samples from the unlabeled pool, and those annotated samples are used as input for training.
  • After training, the machine learning model picks out the samples for which it has high uncertainty and sends them back to the unlabeled pool, to be annotated next.
  • Thus, the human can avoid annotating samples that the machine learning model already predicts with high certainty, which saves the annotator's effort and cost.
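The loop described above can be sketched in a few lines of Python. This is only a toy illustration, not code from the survey or the paper: the 1-D threshold "model", the `uncertainty` score, and the names `fit_threshold`, `active_learning` and `oracle` (standing in for the human annotator) are all assumptions made for the example.

```python
def fit_threshold(labeled):
    """Toy 1-D model (an assumption): threshold midway between the two classes."""
    largest_negative = max(x for x, y in labeled if y == 0)
    smallest_positive = min(x for x, y in labeled if y == 1)
    return (largest_negative + smallest_positive) / 2.0


def uncertainty(x, threshold):
    """Samples close to the threshold are the ones the toy model is least sure about."""
    return 1.0 / (1.0 + abs(x - threshold))


def active_learning(pool, oracle, seed_labeled, rounds=3, batch=2):
    """Pool-based active learning: train, query the most uncertain samples, repeat."""
    labeled = list(seed_labeled)   # (sample, label) pairs annotated so far
    unlabeled = list(pool)         # samples nobody has annotated yet
    queried = []
    for _ in range(rounds):
        if not unlabeled:
            break
        threshold = fit_threshold(labeled)
        # Rank the unlabeled pool from most to least uncertain.
        unlabeled.sort(key=lambda x: uncertainty(x, threshold), reverse=True)
        query, unlabeled = unlabeled[:batch], unlabeled[batch:]
        # Only the queried samples cost annotation effort.
        labeled.extend((x, oracle(x)) for x in query)
        queried.extend(query)
    return fit_threshold(labeled), queried
```

Calling `active_learning(pool, oracle, seed_labeled)` with a pool of 19 points and a hidden class boundary annotates only 6 of them, yet the learned threshold ends up close to the boundary, because the queries concentrate where the model is unsure.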

3. From Active Learning to Deep Active Learning Framework

3.1. Employ Junior for Annotation

Employ Junior for Annotation
  • As shown above, similar to the active learning framework in the previous section, the trained junior annotates samples from the unannotated sample pool.
  • The samples with high uncertainty are then selected, and the junior asks his/her senior, i.e. the expert, to annotate them.
  • With the expert's annotations, the junior can learn more and become a better-trained junior.
  • With better learning/training, the trained junior is expected to be better at annotating the remaining unannotated samples in the pool.
  • With the above framework, we can save the effort and cost of the expert.

3.2. Employ More Juniors for Annotation

Employ More Juniors for Annotation
  • To speed up the annotation, we can employ more juniors to work on the annotation task.
  • Only those samples about which the trained juniors are collectively uncertain are sent to the expert for annotation.
  • Thus, we can further save the effort and cost of the expert.

3.3. FCN Replacing Juniors

FCN Replacing Juniors
  • To make the process more automatic, Fully Convolutional Networks (FCNs) replace the human juniors.
  • This becomes the framework proposed in the paper.

4. Proposed Deep Active Learning Framework Using Suggestive Annotation

Deep Active Learning Framework
  • There are 3 main parts as shown above: FCN Architecture, Uncertainty Measure, and Similarity Estimation.

4.1. FCN Architecture

FCN Architecture
  • Input: Unannotated Image
  • Outputs: Annotated label map (what we want) and a 1024-d image descriptor, which is used for similarity estimation.
  • The architecture used is an FCN-like architecture with residual blocks.
  • Bootstrapping (sampling with replacement) is used so that each FCN is trained on different training data.
  • Simple cases: Multiple FCNs will come up with similar outputs
  • Difficult cases: Multiple FCNs will have diverse outputs
  • 4 FCNs are used in SA due to the use of 4 NVidia Tesla P100 GPUs.
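The bootstrapping step can be illustrated with a small sketch. It assumes the training images are referred to by integer ids and that each bootstrap set has the same size as the original training set. The function name `bootstrap_sets` and the fixed seed are choices made for this example, not details from the paper.

```python
import random


def bootstrap_sets(train_ids, n_models=4, seed=0):
    """Draw one training set per model by sampling with replacement."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    n = len(train_ids)
    # Each set has n draws; some ids repeat and some are left out entirely,
    # so every model sees a slightly different view of the data.
    return [rng.choices(train_ids, k=n) for _ in range(n_models)]


# One resampled set per FCN, e.g. for a training set of 85 images.
sets = bootstrap_sets(list(range(85)), n_models=4)
```

Because each network is trained on a different resample, the networks tend to agree on easy images and disagree on hard ones, which is what the uncertainty measure exploits.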

4.2. Uncertainty Measure

Uncertainty Measure
  • When the uncertainty of a pixel (measured as the standard deviation, s.d., of the FCNs' predictions) is low, the accuracy at that pixel tends to be high, and vice versa.
  • To measure the uncertainty of an image, the mean uncertainty over all of its pixels is used.
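As a rough sketch of this measure, the snippet below takes one probability map per FCN, computes the standard deviation across the networks at every pixel, and averages those values over the image. The nested-list input format, the use of the population standard deviation (`statistics.pstdev`), and the name `image_uncertainty` are assumptions for illustration. The paper's exact estimator may differ.

```python
from statistics import pstdev


def image_uncertainty(prob_maps):
    """prob_maps: one 2-D list of foreground probabilities per model, all the same size.

    Returns (mean uncertainty of the image, per-pixel s.d. map).
    """
    n_rows, n_cols = len(prob_maps[0]), len(prob_maps[0][0])
    # Per-pixel disagreement: s.d. of the models' predictions at that pixel.
    pixel_sd = [
        [pstdev([m[r][c] for m in prob_maps]) for c in range(n_cols)]
        for r in range(n_rows)
    ]
    # Image-level score: mean of the per-pixel uncertainties.
    mean_sd = sum(sum(row) for row in pixel_sd) / (n_rows * n_cols)
    return mean_sd, pixel_sd
```

If all four networks output the same map, the score is 0. If two predict 0 and two predict 1 at every pixel, the score is 0.5, the largest disagreement possible for probabilities in [0, 1].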

4.3. Similarity Estimation

  • As mentioned, there is another output: the 1024-d image descriptor. This descriptor contains rich and accurate shape information.
  • Cosine similarity is used for similarity estimation.
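For reference, cosine similarity between two descriptors is their dot product divided by the product of their lengths. The helper below is a plain-Python sketch. The name `cosine_similarity` and the choice to return 0.0 for an all-zero vector are assumptions for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat an all-zero descriptor as similar to nothing
    return dot / (norm_a * norm_b)
```

The value depends only on the direction of the descriptors, not on their magnitude, so two images whose descriptors differ only by a scale factor count as fully similar.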

4.4. Suggestive Annotation (SA)

Suggestive Annotation (SA)
  • Among all unannotated images Su, the uncertainty measure is used to select the K images with the top-K uncertainty scores as the candidate set Sc (K=16).
  • Thus, we have selected the K images on which the FCNs produce the most diverse outputs.
  • Within these K images, a greedy approach is used to find Sa (the set we want to suggest to the expert for annotation).
  • Initially Sa is the empty set, i.e. Sa=∅ and F(Sa, Su) = 0.
  • Iteratively add the image Ii ∈ Sc that maximizes F(Sa ∪ {Ii}, Su), until Sa contains k images (k=8).
  • Hence, the selected set Sa consists of images that have uncertain outputs but are also similar to the unannotated images.
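The two-stage selection can be sketched as follows. The post does not spell out F, so this sketch assumes F(Sa, Su) is the sum, over every unannotated image, of its highest similarity to any image in Sa. That is my reading of the paper and should be checked against it. The function names and the precomputed inputs (`uncertainty` as a list of scores, `sim` as a full similarity matrix) are likewise illustrative.

```python
def representativeness(selected, sim, all_ids):
    """Assumed F(Sa, Su): each unannotated image counts its best match in Sa."""
    if not selected:
        return 0.0  # F(empty set, Su) = 0
    return sum(max(sim[i][j] for i in selected) for j in all_ids)


def suggest(uncertainty, sim, K=16, k=8):
    """Pick the K most uncertain images, then greedily keep the k most representative."""
    all_ids = list(range(len(uncertainty)))
    # Stage 1: candidate set Sc = top-K images by uncertainty score.
    candidates = sorted(all_ids, key=lambda i: uncertainty[i], reverse=True)[:K]
    # Stage 2: greedily grow Sa, adding the candidate that raises F the most.
    selected = []
    while len(selected) < k and candidates:
        best = max(
            candidates,
            key=lambda i: representativeness(selected + [i], sim, all_ids),
        )
        selected.append(best)
        candidates.remove(best)
    return selected
```

The second stage matters when the most uncertain images resemble each other: the greedy step then prefers a slightly less uncertain image that covers a different part of the unannotated pool over a near-duplicate of one already chosen.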

5. Results

5.1. 2015 MICCAI Gland Challenge dataset

  • 85 training images
  • 80 testing images with 60 in Part A (normal glands), 20 in Part B (abnormal glands)
Comparison with full training data for gland segmentation
  • When 100% of the training data is used, SA (Our method) outperforms MultiChannel and CUMedVision2 / DCAN, which shows the effectiveness of the FCN architecture.
Comparison using limited training data for gland segmentation

When only 50% of the training data is used, SA is already better than the state-of-the-art (SOTA) result (green).

5.2. Lymph node dataset

  • 37 training images, and 37 testing images
  • With only 50% of the training data, the framework achieves better segmentation performance than U-Net, CUMedVision1 and CFS-FCN.
  • CFS-FCN needs extra labelling effort for the intermediate label maps, so it can be regarded as using 200% of the training data.

The active learning framework can help to improve prediction accuracy when only a small dataset is available.


[2017 MICCAI] [SA]
Suggestive Annotation: A Deep Active Learning Framework for Biomedical Image Segmentation

My Previous Reviews

Image Classification
[LeNet] [AlexNet] [ZFNet] [VGGNet] [Highway] [SPPNet] [PReLU-Net] [STN] [DeepImage] [GoogLeNet / Inception-v1] [BN-Inception / Inception-v2] [Inception-v3] [Inception-v4] [Xception] [MobileNetV1] [ResNet] [Pre-Activation ResNet] [RiR] [RoR] [Stochastic Depth] [WRN] [FractalNet] [Trimps-Soushen] [PolyNet] [ResNeXt] [DenseNet] [PyramidNet] [DRN] [DPN] [Residual Attention Network] [MSDNet]

Object Detection
[OverFeat] [R-CNN] [Fast R-CNN] [Faster R-CNN] [MR-CNN & S-CNN] [DeepID-Net] [CRAFT] [R-FCN] [ION] [MultiPathNet] [NoC] [Hikvision] [GBD-Net / GBD-v1 & GBD-v2] [G-RMI] [TDM] [SSD] [DSSD] [YOLOv1] [YOLOv2 / YOLO9000] [YOLOv3] [FPN] [RetinaNet] [DCN]

Semantic Segmentation
[FCN] [DeconvNet] [DeepLabv1 & DeepLabv2] [CRF-RNN] [SegNet] [ParseNet] [DilatedNet] [DRN] [RefineNet] [PSPNet] [DeepLabv3]

Biomedical Image Segmentation
[CUMedVision1] [CUMedVision2 / DCAN] [U-Net] [CFS-FCN] [U-Net+ResNet] [MultiChannel] [V-Net] [3D U-Net] [M²FCN]

Instance Segmentation
[SDS] [Hypercolumn] [DeepMask] [SharpMask] [MultiPathNet] [MNC] [InstanceFCN] [FCIS]

Super Resolution

Human Pose Estimation
 [DeepPose] [Tompson NIPS’14] [Tompson CVPR’15]