熊炜, 童磊, 李利荣, 李敏. 基于可分离空洞卷积与联合归一化的语义分割算法研究[J]. 微电子学与计算机, 2020, 37(10): 18-23.
XIONG Wei, TONG Lei, LI Li-rong, LI Ming. Semantic segmentation algorithm based on separable dilated convolution and joint normalization method[J]. Microelectronics & Computer, 2020, 37(10): 18-23.

Semantic segmentation algorithm based on separable dilated convolution and joint normalization method

  • Abstract: Image semantic segmentation is an important part of image understanding and is widely applied in scenarios such as autonomous driving. To address information loss and slow segmentation speed, this paper proposes a semantic segmentation algorithm based on separable dilated convolution and joint normalization, using ResNet101 as the base network. First, separable convolution and dilated convolution are combined to extract the outputs of the last three layers of ResNet101; compared with standard dilated convolution, separable dilated convolution accelerates the training, validation, and prediction of the network. Then, instance normalization is applied to semantic segmentation and compared with batch normalization, which verifies the effectiveness of instance normalization. Finally, two joint normalization methods that combine batch normalization and instance normalization are proposed and shown to improve segmentation quality. Experiments on the Pascal VOC 2012 dataset show that the method accelerates the training, validation, and prediction of the network and reaches a highest mean intersection over union (mIoU) of 80.62%.
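The speed-up attributed to separable dilated convolution can be motivated by a simple weight count. The sketch below is illustrative only and is not taken from the paper. It assumes the usual depthwise-separable factorization (a k×k depthwise convolution followed by a 1×1 pointwise convolution) and ignores bias terms. The function names and the example sizes (3×3 kernel, 2048 input channels, 256 output channels, dilation 12) are hypothetical choices, not values reported by the authors.

```python
def effective_kernel_size(k: int, dilation: int) -> int:
    """Spatial extent covered by a k x k kernel with the given dilation.

    Dilation inserts (dilation - 1) gaps between kernel taps, so the
    receptive field grows while the number of weights stays the same.
    """
    return k + (k - 1) * (dilation - 1)


def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k plus pointwise 1 x 1 convolution (bias ignored)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    k, c_in, c_out, dilation = 3, 2048, 256, 12
    std = standard_conv_params(k, c_in, c_out)
    sep = separable_conv_params(k, c_in, c_out)
    print("effective kernel size:", effective_kernel_size(k, dilation))
    print("standard weights:     ", std)
    print("separable weights:    ", sep)
    print("reduction factor:      %.2f" % (std / sep))
```

Under these assumed sizes the separable form needs several times fewer weights for the same dilated receptive field, which is consistent with the faster training, validation, and prediction reported in the abstract. The actual layer configuration is given in the full paper.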

     
