Residual Deconvolutional Networks for Brain Electron Microscopy Image Segmentation

12 November 2018

Accepted: IEEE TRANSACTIONS ON MEDICAL IMAGING

Authors: Ahmed Fakhry, Tao Zeng, and Shuiwang Ji, Senior Member, IEEE

Index Terms - Residual learning, deconvolution networks, deep learning, image segmentation, electron microscopy, brain circuit reconstruction

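Abstract: Accurate reconstruction of the anatomical connections between neurons in the brain using electron microscopy (EM) images is considered the gold standard for circuit mapping. A key step in obtaining the reconstruction is the ability to automatically segment neurons with close to human-level accuracy. Despite recent advances in EM image segmentation, most methods rely to some extent on hand-crafted, data-specific features, which limits their ability to generalize. Here, we propose a simple yet powerful technique for EM image segmentation that is trained end-to-end and does not rely on prior knowledge of the data. Our proposed deconvolutional network contains two information pathways that capture full-resolution features and contextual information, respectively. The results show that the model effectively resolves the conflicting goals in dense output prediction, namely preserving full-resolution predictions and including sufficient contextual information. We applied our method to the ongoing open challenge of 3D neurite segmentation in EM images, where it achieved one of the top results. We demonstrated the generality of our technique by evaluating it on the 2D neurite segmentation challenge dataset, where it obtained consistently good performance. We therefore expect our method to generalize well to other dense output prediction problems.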

Contextual information can be increased by enlarging the patch size, but excessively large patches tend to compromise the full-resolution, pixel-level predictions.

Thus, dense output prediction problems face the conflicting goals of full-resolution prediction and incorporation of sufficient contextual information.

Our proposed model naturally balances the tradeoff between increasing the contextual window required for multi-scale reasoning and preserving the pixel-level resolution and accuracy expected for dense output prediction.
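
To make the tradeoff concrete, the following minimal sketch computes how the contextual window (receptive field) of one output pixel grows as convolution and pooling layers are stacked. The function name `receptive_field` and the layer lists are illustrative assumptions, not taken from the paper; the recurrence itself is the standard receptive-field arithmetic for stacked layers.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    `layers` is a list of (kernel_size, stride) pairs, applied in order.
    This is an illustrative helper, not part of the paper.
    """
    field = 1  # a single input pixel sees itself
    jump = 1   # distance in input pixels between adjacent units at this depth
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


# Three 3x3 convolutions without pooling versus two convolutions,
# one 2x2 pooling layer, and one more convolution.
no_pooling = [(3, 1), (3, 1), (3, 1)]
with_pooling = [(3, 1), (3, 1), (2, 2), (3, 1)]
```

Pooling enlarges the window faster for the same number of layers, but it also halves the resolution of the feature map, which is exactly the conflict described above.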

We achieved these goals by adding multiple residual shortcut paths to a fully deconvolutional network with minimal additional computation. This allows for the training of very deep deconvolutional networks that incorporate sufficient contextual information, while the multi-scale full-resolution features are extracted and provided through the residual paths.

Fully convolutional networks (FCNs) are efficient approaches to generate dense predictions for image segmentation. The idea is to reconstruct the full-sized input by performing several deconvolutional operations at multiple scales through aggregated bilinear interpolation. The segmentation performance of FCNs is limited by the absence of real deconvolution, and full-resolution features are not well preserved. To address this limitation, deconvolution networks have been proposed recently that perform actual deconvolution. The pooling layers are reversed in the decoding stage by unpooling layers, which keep track of the maximum activation position selected during the pooling operation. While both of these approaches are attempts to design novel deep models specifically for dense prediction problems, they do not have explicit mechanisms to address the conflicting goals in dense prediction problems. They still suffer from loss of information due to excessive reduction of resolution, as we show in our experiments.
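
The unpooling step described above can be sketched as follows. This is a minimal pure-Python illustration, assuming 2x2 max pooling with stride 2 on a single-channel map with even height and width; the function names are hypothetical and this is not the authors' implementation.

```python
def max_pool_2x2(x):
    """2x2 max pooling with stride 2.

    Returns the pooled map and, for each pooled value, the (row, col)
    position in the input where the maximum was found (the "switch").
    Assumes even height and width.
    """
    pooled, switches = [], []
    for i in range(0, len(x), 2):
        pooled_row, switch_row = [], []
        for j in range(0, len(x[0]), 2):
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in range(2) for dj in range(2)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            switch_row.append(position)
        pooled.append(pooled_row)
        switches.append(switch_row)
    return pooled, switches


def max_unpool_2x2(pooled, switches, out_shape):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    height, width = out_shape
    out = [[0.0] * width for _ in range(height)]
    for pooled_row, switch_row in zip(pooled, switches):
        for value, (i, j) in zip(pooled_row, switch_row):
            out[i][j] = value
    return out
```

The unpooled map has the original size but is sparse: only the positions of the maxima are filled, which is why deconvolution layers follow each unpooling layer to densify it.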

Residual Deconvolutional Network Model

In the design of our model, we intend to achieve three goals: (1) Generate dense predictions equal in size to any arbitrary-sized input. (2) Increase the contextual information used to make pixel-level decisions. (3) Achieve pixel-level accuracy by incorporating high-resolution feature information.

We enhance the performance of deconvolution networks by adding residual connections between every several stacks of convolution or deconvolution layers. For our dense prediction network architecture, we propose to introduce projection shortcuts not just in the convolutional stage responsible for extracting the feature representations, but also in the deconvolutional stage responsible for reconstructing the shape and producing the object segmentation. We believe that with this design, our network is able to acquire more multi-scale contextual information while reducing the effect of the degradation problem.
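
A residual connection with a projection shortcut can be sketched as below. This is a simplified illustration, not the paper's layer configuration: the residual branch uses two per-pixel linear maps (1x1 convolutions) instead of spatial convolutions, the placement of the ReLU after the addition follows common ResNet practice and is an assumption here, and the weight names `w1`, `w2`, and `w_shortcut` are hypothetical.

```python
def matvec(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def relu(vector):
    return [max(0.0, v) for v in vector]


def residual_block(pixels, w1, w2, w_shortcut):
    """Compute relu(F(x) + W_s x) for every pixel's channel vector x.

    F(x) = w2 * relu(w1 * x) is the residual branch. W_s is the
    projection shortcut, needed when the number of channels changes
    so that the two branches can be added.
    """
    outputs = []
    for x in pixels:
        residual = matvec(w2, relu(matvec(w1, x)))
        shortcut = matvec(w_shortcut, x)
        outputs.append(relu([r + s for r, s in zip(residual, shortcut)]))
    return outputs
```

If the residual branch outputs zeros, the block reduces to the projected input, which is the property that makes very deep stacks easier to optimize.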

We also propose the use of a novel resolution-preserving path to facilitate the reconstruction of full-resolution output. The resolution-preserving paths are essentially the projection mapping of the pooling layer outputs added to the corresponding deconvolution layer before performing the unpooling operation. These paths are responsible for transferring the missing high-resolution information from the encoding stages to the decoding stages. Together, the context-growing and the resolution-preserving paths have significantly boosted the performance of non-residual deconvolutional networks. An illustration of the RDN architecture is shown in Figure 1.
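
The resolution-preserving path can be illustrated with a small 1D, single-channel sketch: the encoder's pooling output is projected, added to the decoder feature of the same size, and only then unpooled with the recorded switch positions. The scalar `w_proj` stands in for the projection mapping and, like all function names here, is a hypothetical simplification rather than the paper's implementation.

```python
def max_pool_1d(x):
    """Max pooling with window 2 and stride 2; also returns the argmax indices."""
    pooled, switches = [], []
    for i in range(0, len(x) - 1, 2):
        k = i if x[i] >= x[i + 1] else i + 1
        pooled.append(x[k])
        switches.append(k)
    return pooled, switches


def unpool_1d(values, switches, length):
    """Write each value back to its recorded position; zeros elsewhere."""
    out = [0.0] * length
    for value, k in zip(values, switches):
        out[k] = value
    return out


def decode_with_resolution_path(decoder_feat, pooled, switches, length, w_proj=1.0):
    """Add the projected pooling output to the decoder feature, then unpool."""
    merged = [d + w_proj * p for d, p in zip(decoder_feat, pooled)]
    return unpool_1d(merged, switches, length)
```

With `w_proj=0.0` the function falls back to a plain deconvolution-network decoder step, so the path adds only one projection and one addition per stage.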

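Summary in one sentence:

  • Residual connections are added in both the convolution and deconvolution stages.
  • In the decoding stage, an unpooling layer and several deconvolution layers form a block.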

Open questions:

  • How is unpooling implemented?
  • How is deconvolution implemented?
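
On the deconvolution question: in this line of work, "deconvolution" usually refers to a transposed convolution, where every input value scatters a scaled copy of the kernel into a larger output. The 1D sketch below assumes that interpretation; the function name is hypothetical, and the paper's exact layer settings are not given in these notes.

```python
def conv_transpose_1d(x, kernel, stride=1):
    """1D transposed convolution without padding.

    Each input value x[i] adds x[i] * kernel to the output, starting at
    position i * stride. Overlapping contributions are summed.
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, x_value in enumerate(x):
        for j, k_value in enumerate(kernel):
            out[i * stride + j] += x_value * k_value
    return out
```

For example, an input of length 2 with a kernel of length 3 and stride 2 produces an output of length 5, so the layer increases resolution while its kernel weights remain learnable.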