Reading: ESPNet — Efficient Spatial Pyramid of Dilated Convolutions (Semantic Segmentation)

Original article was published by Sik-Ho Tsang on Artificial Intelligence on Medium


In this story, “ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation” (ESPNet), by Allen Institute for AI and XNOR.AI, is briefly presented. In this paper:

  • A new convolutional module, efficient spatial pyramid (ESP), is introduced.
  • ESPNet is 22 times faster (on a standard GPU) and 180 times smaller than the state-of-the-art semantic segmentation network PSPNet, while its category-wise accuracy is only 8% lower.
  • ESPNet outperforms current efficient CNNs such as MobileNet, ShuffleNet, and ENet.
  • ESPNet can process high-resolution images at 112 frames per second on a standard GPU and 9 frames per second on an edge device.
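
To give a feel for why a pyramid of dilated convolutions can be efficient, here is a minimal sketch of the standard effective-kernel-size formula for a dilated convolution, (n - 1) * d + 1. The 3x3 kernel, the five branches, and the power-of-two dilation rates are assumptions chosen for illustration, and the helper name `effective_kernel_size` is hypothetical; this is not the authors' code.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Side length of the input region spanned by a dilated square kernel."""
    return (kernel_size - 1) * dilation + 1


# Assumed for illustration: 3x3 kernels in 5 parallel branches
# with dilation rates 1, 2, 4, 8, 16.
KERNEL_SIZE = 3
NUM_BRANCHES = 5

dilations = [2 ** k for k in range(NUM_BRANCHES)]
sizes = [effective_kernel_size(KERNEL_SIZE, d) for d in dilations]

for d, s in zip(dilations, sizes):
    # Every branch uses the same number of weights, whatever the dilation.
    weights = KERNEL_SIZE * KERNEL_SIZE
    print(f"dilation {d:2d}: spans {s}x{s} pixels with {weights} weights")
```

Each branch keeps the same 9 weights per filter while the area it spans grows with the dilation rate, which is the general intuition behind covering a large receptive field at low cost.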

This is a 2018 ECCV paper with about 200 citations. (Sik-Ho Tsang @ Medium)