LI Yong-bo, WANG Qin, JIANG Jian-fei. Design of sparse convolutional neural network accelerator[J]. Microelectronics & Computer, 2020, 37(6): 30-34,39.

Design of sparse convolutional neural network accelerator

Abstract: To reduce the latency and energy consumption of convolutional neural network inference, dynamic network surgery is used to obtain a sparse network, and a high-energy-efficiency sparse convolutional neural network accelerator is designed. To address the problem of unbalanced computing load, a dataflow suited to sparse computation is proposed. To reduce the high latency of the convolution operation, a 16×16 processing-element array increases computational parallelism, index units skip invalid operations, a systolic input layer strengthens data reuse, and ping-pong buffers reduce data waiting. Synthesis results show that, in a TSMC 28 nm process, the chip operates at up to 500 MHz with a power consumption of 249.7 mW, a peak convolution performance of 256 GOPS, and an energy efficiency of 1.03 TOPS/W.
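As a quick sanity check (not part of the paper), the reported peak performance follows directly from the array size and clock frequency if one assumes each processing element completes one multiply-accumulate, counted as two operations, per cycle:

```python
# Sanity-check the headline figures, assuming (our assumption, not stated
# in the abstract) that each of the 16x16 processing elements completes
# one multiply-accumulate (2 ops) per cycle.
pes = 16 * 16            # processing elements in the array
ops_per_mac = 2          # one multiply + one accumulate
freq_hz = 500e6          # reported clock frequency

peak_gops = pes * ops_per_mac * freq_hz / 1e9
print(peak_gops)         # 256.0, matching the reported 256 GOPS peak

power_w = 249.7e-3       # reported power consumption
efficiency_tops_per_w = peak_gops / 1e3 / power_w
print(round(efficiency_tops_per_w, 2))  # 1.03, matching the reported TOPS/W
```

This consistency (256 GOPS at 249.7 mW gives 1.025 ≈ 1.03 TOPS/W) is why the figures above hang together as a peak, not measured-workload, number.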

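The accelerator's RTL is not shown on this page, but the role of the index unit, feeding only nonzero weights to the multipliers so that pruned (zero) weights cost no cycles, can be sketched in software. The compressed value/index format and function names below are illustrative assumptions, not the paper's design:

```python
# Illustrative sketch of zero-skipping via an index unit (assumed
# value/index compressed format; not the paper's actual hardware).
def compress(weights):
    """Split a pruned weight vector into nonzero values and their positions."""
    pairs = [(w, i) for i, w in enumerate(weights) if w != 0]
    return [w for w, _ in pairs], [i for _, i in pairs]

def sparse_dot(values, indices, activations):
    """Multiply-accumulate over nonzero weights only (2 MACs here, not 8)."""
    return sum(v * activations[i] for v, i in zip(values, indices))

# A pruned filter slice: 6 of 8 weights are zero, so only 2 MACs execute.
weights = [0, 0, 3, 0, 0, 0, -1, 0]
activations = [5, 2, 4, 7, 1, 9, 6, 8]
vals, idx = compress(weights)
print(sparse_dot(vals, idx, activations))  # 3*4 + (-1)*6 = 6
```

The same skipping in hardware is what makes the load across processing elements uneven (each PE sees a different nonzero count), which is the imbalance the proposed dataflow targets.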