Citation: LI Bin, XU Hua-jie, WU Zhao-hui. Design and implementation of FPGA-based real time multiple face detection system[J]. Microelectronics & Computer, 2021, 38(4): 57-62.

Design and implementation of FPGA-based real time multiple face detection system

Abstract: A compact, high-accuracy face detection algorithm is designed on the basis of MobileNetV2-SSDlite, a deep-learning object detection algorithm for mobile devices. Building on this algorithm, a dedicated neural network accelerator for mobile use is studied and designed with respect to hardware architecture, on-chip buffering, performance, and power consumption. A real-time multi-face detection accelerator system is simulated and implemented on the ALINX AX7350 SoC platform. The results show that, at a 100 MHz clock, the system achieves an average computing performance of 56.01 GOPS with a power consumption of 7.3 W. Compared with an existing MobileNetV2-SSDlite accelerator, operation speed is 28.7% higher, resource usage is 44.46% lower on average, and power consumption is 26.3% lower. At a resolution of 224×224, the system reaches 83.4 FPS, which meets the real-time requirement.
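The headline figures can be related to one another with simple arithmetic. The sketch below is not from the paper: the function name and the derived quantities are illustrative, and it assumes the average GOPS, the frame rate, and the power figure were all measured on the same workload, which the abstract does not state.

```python
def derived_metrics(gops: float, fps: float, clock_mhz: float, power_w: float) -> dict:
    """Derive secondary figures from the abstract's reported numbers.

    Assumption (not stated in the abstract): all four inputs describe
    the same steady-state workload.
    """
    ops_per_second = gops * 1e9
    cycles_per_second = clock_mhz * 1e6
    return {
        # Average operations completed per clock cycle.
        "ops_per_cycle": ops_per_second / cycles_per_second,
        # Energy efficiency in GOPS per watt.
        "gops_per_watt": gops / power_w,
        # Time per frame in milliseconds at the reported frame rate.
        "frame_latency_ms": 1000.0 / fps,
        # Implied operations per frame, in GOP.
        "gop_per_frame": gops / fps,
        # Energy per frame in millijoules.
        "energy_per_frame_mj": power_w / fps * 1000.0,
    }


# Reported values: 56.01 GOPS, 83.4 FPS, 100 MHz clock, 7.3 W.
metrics = derived_metrics(gops=56.01, fps=83.4, clock_mhz=100.0, power_w=7.3)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under that assumption, the reported numbers imply roughly 560 operations per clock cycle, about 7.67 GOPS/W, and a per-frame time of about 12 ms.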

     
