(ECCV 2018) Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

Keyword [DeepLabv3+] [Encoder-Decoder] [Xception]

Chen L C, Zhu Y, Papandreou G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]//Proceedings of the European conference on computer vision (ECCV). 2018: 801-818.

1. Overview

This paper proposes DeepLabv3+, which extends DeepLabv3 in three ways:

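  • Adopts an Encoder-Decoder structure, with DeepLabv3 as the encoder.
  • Adapts the Xception model as the backbone.
  • Applies Depthwise Separable Conv to both the ASPP and the decoder modules.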
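
To see why Depthwise Separable Conv is attractive, the short sketch below compares weight counts for a standard convolution and its separable counterpart. The function names and the 256-channel example are illustrative assumptions, not values taken from these notes, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a pointwise 1 x 1 conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 256 input and 256 output channels.
std = standard_conv_params(3, 256, 256)
sep = separable_conv_params(3, 256, 256)
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For a 3 x 3 kernel the saving approaches a factor of 9 as the number of output channels grows, because the pointwise term dominates the separable cost.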

1.1. Architecture

1) Encoder output: the last feature map before the logits in DeepLabv3.
2) Apply a $1 \times 1$ Conv to the low-level features to reduce their number of channels, since a large number of low-level channels may outweigh the encoder output and make training harder.
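
The channel-reduction step can be illustrated with simple bookkeeping at the point where the low-level features are concatenated with the encoder output. This is a minimal sketch: `concat_channels` is a hypothetical helper, and the channel counts used (256 for the encoder, 512 for the low-level features, 48 after the $1 \times 1$ Conv) are assumed example values rather than figures stated in these notes.

```python
def concat_channels(enc_ch, low_ch, reduced_ch=None):
    """Return (total channels, fraction contributed by low-level features).

    If reduced_ch is given, a 1 x 1 conv is assumed to map the low-level
    features from low_ch to reduced_ch channels before concatenation.
    """
    low = low_ch if reduced_ch is None else reduced_ch
    total = enc_ch + low
    return total, low / total


# Assumed example values, see the note above.
total_raw, share_raw = concat_channels(enc_ch=256, low_ch=512)
total_red, share_red = concat_channels(enc_ch=256, low_ch=512, reduced_ch=48)

print(f"no reduction:   {total_raw} channels, low-level share {share_raw:.2f}")
print(f"with 1x1 conv:  {total_red} channels, low-level share {share_red:.2f}")
```

Under these assumptions, the low-level features make up about two thirds of the concatenated tensor without the reduction, and well under one fifth with it.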

1.2. Xception

Make a few changes:
1) Deeper Xception.
2) Replace all MaxPool operations with strided Depthwise Separable Conv, which makes it possible to apply Atrous Separable Conv and extract feature maps at an arbitrary resolution.
3) Add BN and ReLU after each $3 \times 3$ Depthwise Conv.
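
The link between strided convolutions and atrous convolutions in change 2) can be checked with output-size arithmetic. The sketch below assumes "same"-style padding for odd kernels and a hypothetical stack of four $3 \times 3$ layers on a 513-pixel input; it is not the actual Xception layer layout.

```python
def effective_kernel(k, rate):
    """Extent covered by a k-tap kernel with atrous (dilation) rate `rate`."""
    return k + (k - 1) * (rate - 1)


def conv_out_size(n, k=3, stride=1, rate=1):
    """Output length along one axis, assuming 'same'-style padding for odd k."""
    pad = rate * (k - 1) // 2
    return (n + 2 * pad - effective_kernel(k, rate)) // stride + 1


def feature_sizes(n, layers):
    """Apply a list of (stride, rate) layers and record each feature-map size."""
    sizes = [n]
    for stride, rate in layers:
        n = conv_out_size(n, stride=stride, rate=rate)
        sizes.append(n)
    return sizes


# Four stride-2 layers: the input is downsampled by 16 overall.
strided = feature_sizes(513, [(2, 1)] * 4)
# Same stack, but the last layer swaps stride 2 for atrous rate 2.
atrous = feature_sizes(513, [(2, 1)] * 3 + [(1, 2)])

print(strided)
print(atrous)
```

Swapping the stride of the last layer for an atrous rate keeps its receptive extent large while leaving the feature map at the previous resolution, which is the kind of resolution control a fixed MaxPool layer does not offer.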

1.3. Comparison