Design and Implementation of a Convolutional Neural Network Accelerator Based on FPGA

Abstract: FPGAs are widely used to implement hardware accelerators for convolutional neural networks (CNNs). This paper presents an FPGA-based CNN accelerator. The design exploits the parallelism inherent in CNNs to reduce the bandwidth and resource usage required by real-time embedded applications, and is implemented on the resource-constrained ZC706 development board. The results show that, at an operating frequency of 150 MHz, the FPGA reaches a peak computing speed of 0.54 GOP/s with very low power consumption.