Multi-Level Contextual Network for Biomedical Image Segmentation

13 November 2018

Authors: Amirhossein Dadashzadeh and Alireza Tavakoli Targhi. Department of Computer Science, Shahid Beheshti University, Tehran, Iran

Abstract: Accurate and reliable image segmentation is an essential part of biomedical image analysis. In this paper, we consider the problem of biomedical image segmentation using deep convolutional neural networks. We propose a new end-to-end network architecture that effectively integrates local and global contextual patterns of histologic primitives to obtain a more reliable segmentation result. Specifically, we introduce a deep fully convolutional residual network with a new skip connection strategy to control the contextual information passed forward. Moreover, our trained model is also computationally inexpensive due to its small number of network parameters. We evaluate our method on two public datasets for epithelium segmentation and tubule segmentation tasks. Our experimental results show that the proposed method provides a fast and effective way of producing a pixel-wise dense prediction of biomedical images.

These encoder-decoder based architectures directly concatenate feature maps from earlier layers to recover image detail. However, this strategy cannot model long-range context, which can be crucial for tasks such as segmentation of tubules in histology images. Moreover, these models use a complicated decoder module, which usually needs substantial computing resources.

To address this problem, in this paper we propose a deep multi-level contextual network, a CNN-based architecture that effectively integrates multi-level long-range (global) and short-range (local) contextual information to achieve a more reliable segmentation result without incurring much computational cost.

In summary, our paper makes three main contributions:

  • We take advantage of the residual learning technique, proposed by [], to build a deeper FCN-like network with fewer parameters and powerful representational ability.
  • We propose a new skip connection strategy using pyramid dilated convolution (PDC) to encode the multi-scale and multi-level contextual information passed forward.
  • We perform extensive experiments on two digital pathology tasks, epithelium and tubule segmentation, to demonstrate the effectiveness of the proposed model.

In this work, we take advantage of the residual learning technique to make a deep encoder network with 46 convolution layers. For this purpose, as shown in Fig. 4, we use 5 residual groups, each of which contains 3 residual units stacked together. The structure of these units is shown in Fig. 2.
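The text gives the group and unit counts but does not break down the figure of 46. The sketch below shows one way the number could arise. It assumes that each bottleneck unit holds three convolutions (1x1, 3x3, 1x1) and that a single convolution precedes the first group. Both are guesses modelled on common residual designs, and any convolutions on shortcut paths are left out.

```python
# Counts taken from the text.
RESIDUAL_GROUPS = 5
UNITS_PER_GROUP = 3

# Assumptions, not stated in the text.
CONVS_PER_UNIT = 3  # 1x1 reduce, 3x3, 1x1 restore
STEM_CONVS = 1      # one convolution before the first residual group


def count_encoder_convs(groups=RESIDUAL_GROUPS, units=UNITS_PER_GROUP,
                        convs_per_unit=CONVS_PER_UNIT, stem=STEM_CONVS):
    """Number of convolution layers on the main path of the encoder."""
    return groups * units * convs_per_unit + stem


total_convs = count_encoder_convs()
print(total_convs)
```

Under these assumptions the count matches the 46 layers quoted above, though the paper's own accounting may differ.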

As Fig. 2 shows, we use a bottleneck architecture for each residual unit, built around two 1x1 convolution layers. These layers first reduce the dimensionality and then restore it. The downsampling operation is performed by the first 1x1 convolution, with a stride of 2.
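To see why a bottleneck keeps the parameter count low, the sketch below compares the weights in a bottleneck unit with those in a plain unit of two 3x3 convolutions. The 3x3 middle layer, the reduction factor of 4, and the channel width of 256 are illustrative assumptions, not values taken from the paper.

```python
def conv_params(in_ch, out_ch, kernel):
    """Number of weights in a square 2D convolution, ignoring the bias."""
    return in_ch * out_ch * kernel * kernel


def bottleneck_params(channels, reduction=4):
    """1x1 reduce -> 3x3 -> 1x1 restore (reduction factor is an assumption)."""
    mid = channels // reduction
    return (conv_params(channels, mid, 1)
            + conv_params(mid, mid, 3)
            + conv_params(mid, channels, 1))


def plain_params(channels):
    """Two 3x3 convolutions at full width, for comparison."""
    return 2 * conv_params(channels, channels, 3)


channels = 256  # hypothetical width
print(bottleneck_params(channels))  # 69632
print(plain_params(channels))       # 1179648
```

At this width the bottleneck needs roughly 17 times fewer weights than the plain unit, because the costly 3x3 convolution runs on the reduced channel count.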

In this work, we combine different scales of contextual information using a module called pyramid dilated convolution (PDC). The PDC module used in this paper consists of four parallel dilated convolutions with different dilation rates.

Also, for the first layer of this module, we apply a 3x3 convolution layer with stride s to control the resolution of the input feature maps.
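The parallel-branch idea can be sketched in one dimension with plain Python. The dilation rates (1, 2, 4, 8) are hypothetical, since the text does not list them. The sketch also keeps each branch's output as a separate channel, which is an assumption about how the branches are merged.

```python
def effective_kernel_size(kernel, rate):
    """Span covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel + (kernel - 1) * (rate - 1)


def dilated_conv1d(signal, kernel, rate):
    """Zero-padded, same-length 1D dilated convolution (odd kernel length)."""
    half = (len(kernel) // 2) * rate
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(w * padded[i + j * rate] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def pdc_1d(signal, kernel, rates=(1, 2, 4, 8)):
    """Run one dilated branch per rate; rates here are hypothetical."""
    return [dilated_conv1d(signal, kernel, r) for r in rates]


impulse = [0.0] * 4 + [1.0] + [0.0] * 4
branches = pdc_1d(impulse, [1.0, 1.0, 1.0])
for rate, out in zip((1, 2, 4, 8), branches):
    print(rate, effective_kernel_size(3, rate), out)
```

Each branch sees the same input, but the taps of the three-element kernel are spread further apart as the rate grows. The module therefore gathers short-range and long-range context at the same resolution without adding weights.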

Furthermore, the final feature maps output by the PDC module are 1/16 the size of the input image. The final module structure is shown in Fig. 3.
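The 1/16 ratio is what is usually called an output stride of 16: the product of all strides along the path from the input image to the feature map. A minimal sketch, assuming four stride-2 steps and a 512-pixel input; which layers actually carry those strides is not specified here, so the placement is hypothetical.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def output_stride(strides):
    """Overall downsampling factor, the product of the per-layer strides."""
    total = 1
    for s in strides:
        total *= s
    return total


strides = [2, 2, 2, 2]  # hypothetical: four stride-2 steps
size = 512              # hypothetical input height/width
for s in strides:
    # A 1x1 convolution with stride 2 halves an even-sized feature map.
    size = conv_output_size(size, kernel=1, stride=s)

print(output_stride(strides), size)  # 16 32
```

With these values a 512x512 image yields a 32x32 feature map, which is 1/16 of the input along each spatial dimension.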

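One-sentence summary: residual network, dilated convolution, pyramid, network structure.

Open questions:

  • How is the feature map reduced in size as it passes through a residual group?
  • What does output_stride = 16 after the PDC module mean?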