LI B, NIU D, WU Z H, et al. Real image denoising algorithm based on convolutional neural network and easy hardware implementation[J]. Microelectronics & Computer, 2023, 40(2): 87-93. doi: 10.19304/J.ISSN1000-7180.2022.0297

Real image denoising algorithm based on convolutional neural network and easy hardware implementation

Abstract: To accommodate the limited computing resources of mobile devices, this paper adopts a U-shaped network as the backbone for image denoising and proposes CBDNet+, a new denoising algorithm for real images. Building on CBDNet, CBDNet+ uses the wavelet transform in the down-sampling and up-sampling stages, which reduces the use of multipliers, makes the network easier to implement on resource-constrained mobile devices, and yields some improvement in denoising performance over CBDNet. To meet resource and low-power constraints, the trained network is pruned and compressed with 8-bit quantization, which improves the efficiency of the algorithm and reduces its storage requirements. On top of the algorithm, a dedicated neural network accelerator for mobile devices is studied and designed with attention to hardware architecture, on-chip cache, performance, and power consumption. For the convolutional neural network denoising algorithm with wavelet and inverse wavelet transforms, a dedicated accelerator structure reduces on-chip and off-chip memory bandwidth, parallel computation improves the efficiency of the inverse wavelet transform, and inference is accelerated while balancing resource usage and speed. The real image denoising system is implemented on the AX7350 ZYNQ platform. Results show that, at a 100 MHz clock, the system achieves an average computing performance of 55.2 GOPS with a power consumption of 1.93 W. On the DND test set, the system achieves a signal-to-noise ratio of 36.21 dB and a structural similarity of 0.9435.
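The claim that wavelet-based down-sampling and up-sampling reduce multiplier use can be illustrated with a small sketch. The abstract does not say which wavelet CBDNet+ uses, so the unnormalized 2D Haar transform below is an assumption chosen for illustration, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. With integer inputs, the forward transform needs only additions and subtractions, and the inverse needs only additions, subtractions, and a 2-bit right shift.

```python
from typing import List, Tuple

Image = List[List[int]]


def haar_dwt2(img: Image) -> Tuple[Image, Image, Image, Image]:
    """Unnormalized 2D Haar analysis (assumed wavelet, not confirmed by the paper).

    Maps one H x W integer map to four (H/2) x (W/2) sub-bands using
    only additions and subtractions, i.e. no multipliers.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = ([[0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]          # top row of the 2x2 block
            c, d = img[i + 1][j], img[i + 1][j + 1]  # bottom row of the 2x2 block
            y, x = i // 2, j // 2
            ll[y][x] = a + b + c + d  # low-pass average (scaled by 4)
            lh[y][x] = a - b + c - d  # left-minus-right difference
            hl[y][x] = a + b - c - d  # top-minus-bottom difference
            hh[y][x] = a - b - c + d  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll: Image, lh: Image, hl: Image, hh: Image) -> Image:
    """Inverse of haar_dwt2: four sub-bands -> one map of twice the size.

    Each combination below equals exactly 4x the original pixel, so the
    right shift by 2 is lossless. The four output pixels of a block depend
    only on the four sub-band values at the same position, so they can be
    computed independently of each other.
    """
    h2, w2 = len(ll), len(ll[0])
    out = [[0] * (2 * w2) for _ in range(2 * h2)]
    for y in range(h2):
        for x in range(w2):
            p, q, r, s = ll[y][x], lh[y][x], hl[y][x], hh[y][x]
            out[2 * y][2 * x] = (p + q + r + s) >> 2
            out[2 * y][2 * x + 1] = (p - q + r - s) >> 2
            out[2 * y + 1][2 * x] = (p + q - r - s) >> 2
            out[2 * y + 1][2 * x + 1] = (p - q - r + s) >> 2
    return out


if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    bands = haar_dwt2(image)
    assert haar_idwt2(*bands) == image  # perfect reconstruction
```

Because the four outputs of each inverse block are mutually independent, a hardware design could evaluate them with four adder trees in the same cycle. Whether the accelerator described in the paper is organized this way is not stated in the abstract.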

     
