Accepted: Bioinformatics, 32(15), 2016, 2352–2358
Authors: Ahmed Fakhry, Hanchuan Peng and Shuiwang Ji. Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA; Allen Institute for Brain Science, Seattle, WA 98103, USA; and School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164, USA
Motivation: Accurate segmentation of brain electron microscopy (EM) images is a critical step in dense circuit reconstruction. Although deep neural networks (DNNs) have been widely used in a number of computer vision applications, most models that have proved effective on image classification tasks cannot be applied directly to EM image segmentation, because the objectives of these tasks differ. It is therefore desirable to develop an optimized architecture that uses the full power of DNNs and is tailored specifically for EM image segmentation.
Results: In this work, we propose a novel DNN design for this task. We trained a pixel classifier that operates on raw pixel intensities, with no preprocessing, to generate the probability of each pixel being a membrane. Although the use of neural networks in image segmentation is not completely new, we developed novel insights and model architectures that allow us to achieve superior performance on EM image segmentation tasks. Our submission based on these insights to the 2D EM Image Segmentation Challenge achieved the best performance consistently across all three evaluation metrics. This challenge is still ongoing, and the results in this paper are as of June 5, 2015.
In this work, we developed a DNN model architecture that is highly optimized for ssTEM image segmentation. The key contributions are the model itself and the novel insights about the specific kernel configurations that lead to substantially improved results. We evaluated the effect of model configuration, along with kernel structures and depth, on the final segmentation outcome.
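The pixel-wise classification scheme can be sketched as follows. This is a minimal illustration, not the paper's implementation: the window size of 65 and the `classify_patch` function, which stands in for the trained DNN, are assumptions introduced here for clarity.

```python
import numpy as np

def membrane_probabilities(image, classify_patch, window=65):
    """Slide a context window over a raw EM image and score each pixel.

    `classify_patch` stands in for the trained DNN: it maps a
    (window x window) patch of raw intensities to the probability
    that the patch's center pixel belongs to a membrane.
    """
    pad = window // 2
    # mirror-pad so border pixels also get a full context window
    padded = np.pad(image, pad, mode="reflect")
    probs = np.zeros(image.shape, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + window, j:j + window]
            probs[i, j] = classify_patch(patch)
    return probs
```

The output is a probability map with the same spatial dimensions as the input image, which can then be thresholded or post-processed into a segmentation.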
Although DNN architectures usually lead to good performance on similar tasks, careful design of the network architecture and the choice of kernel sizes and placement are the key to utilizing the full power of the model.
In terms of the data itself, EM images are characterized by their high density and the invariability of the objects they contain, unlike the natural images that are regularly used in classification.
The key observation about this task is how important context information is for building discriminative features, especially in the bottom layers of a DNN for EM image segmentation. Another key insight is the impact of non-linearity and network depth on overall network performance.
We also introduced back-to-back convolution layers interleaved only by ReLU as the non-linear transformation in all architectures. Instead of using a single convolution layer with a very large kernel size in every block, we chose to stack multiple convolution layers with moderate kernel sizes on top of each other, adding ReLU layers in between them. This was done mainly to increase the non-linearity in the model and thus encourage it to learn more complex features. In addition, breaking a single large kernel down into several smaller ones reduced the total number of parameters that need to be trained, thus reducing the overall computation time.
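The parameter savings can be illustrated with a rough back-of-the-envelope calculation. The 7×7 and 3×3 kernel sizes and the channel width of 48 below are hypothetical choices for illustration only, not the paper's actual configuration: three stacked 3×3 convolutions cover the same 7×7 receptive field as one large-kernel layer while using fewer weights.

```python
def conv_params(in_ch, out_ch, k):
    # weights + biases for a single 2D convolution layer with a k x k kernel
    return in_ch * out_ch * k * k + out_ch

# Hypothetical channel width, for illustration only.
ch = 48

# One convolution with a large 7x7 kernel...
single_large = conv_params(ch, ch, 7)

# ...versus three stacked 3x3 convolutions (same 7x7 receptive field),
# with a ReLU inserted between each pair to add non-linearity.
stacked_small = 3 * conv_params(ch, ch, 3)

# The stacked design needs fewer trainable parameters.
print(single_large, stacked_small)
```

The stack also applies the non-linearity twice more per block, which is exactly the added expressiveness described above.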
The full architectures of the four networks can be found in Table 1.