ZHAO Kang, LI Xiangfeng, LI Gaoyang, ZUO Dunwen. Dynamic gesture recognition based on lightweight (2+1)D convolution structure[J]. Microelectronics & Computer, 2022, 39(9): 46-54. DOI: 10.19304/J.ISSN1000-7180.2022.0115

Dynamic gesture recognition based on lightweight (2+1)D convolution structure

  • Dynamic gesture recognition based on convolutional neural networks has made great progress. However, such models have a large number of parameters and high computation and memory costs, which makes them difficult to deploy on devices with limited resources. To reduce the computation and the parameter count, a lightweight (2+1)D convolution structure is proposed. Building on the (2+1)D convolution structure, the 3D convolutions are replaced with 3D depthwise separable convolutions, which further reduces the computation and parameter count of the structure while leaving the dimension of the output vector unchanged. To compensate for the limitations of spatio-temporal features in representing dynamic gestures, an attention mechanism module that focuses on extracting motion features is integrated. Combined with the spatio-temporal features extracted by the lightweight (2+1)D convolution structure, these motion features represent gestures better. Experimental results show that inserting the attention mechanism module further improves the recognition accuracy of the model without adding much computation or memory cost. On the 20BN-jester, EgoGesture and IsoGD datasets, the model based on this structure achieves recognition accuracies of 96.62%, 91.83% and 60.1%, respectively. The model has 5.05M parameters and requires 12.81 GFLOPs of floating-point operations, which greatly reduces the computation cost and memory footprint. In real-time gesture recognition, the recognition speed is 70 frames per second.
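
The parameter savings described in the abstract can be illustrated with a rough weight count. The sketch below is not the authors' implementation. It assumes a 3×3×3 kernel, bias-free layers, and a (2+1)D block made of a 1×d×d spatial convolution followed by a t×1×1 temporal convolution, each of which is then swapped for a depthwise convolution plus a 1×1×1 pointwise convolution. The function names and the intermediate channel width `c_mid` are hypothetical and chosen only for illustration.

```python
# Hypothetical weight-count comparison; layer layout and names are assumptions,
# not taken from the paper.

def conv3d_params(c_in, c_out, kernel, groups=1):
    """Weights in a bias-free 3D convolution with the given kernel and groups."""
    kt, kh, kw = kernel
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * kt * kh * kw


def full_3d(c_in, c_out, t=3, d=3):
    """Standard t x d x d 3D convolution."""
    return conv3d_params(c_in, c_out, (t, d, d))


def r2plus1d(c_in, c_out, c_mid, t=3, d=3):
    """(2+1)D block: 1 x d x d spatial conv, then t x 1 x 1 temporal conv."""
    spatial = conv3d_params(c_in, c_mid, (1, d, d))
    temporal = conv3d_params(c_mid, c_out, (t, 1, 1))
    return spatial + temporal


def depthwise_separable(c_in, c_out, kernel):
    """Depthwise conv (one filter per input channel) plus 1x1x1 pointwise conv."""
    depthwise = conv3d_params(c_in, c_in, kernel, groups=c_in)
    pointwise = conv3d_params(c_in, c_out, (1, 1, 1))
    return depthwise + pointwise


def light_r2plus1d(c_in, c_out, c_mid, t=3, d=3):
    """(2+1)D block with both convolutions made depthwise separable."""
    spatial = depthwise_separable(c_in, c_mid, (1, d, d))
    temporal = depthwise_separable(c_mid, c_out, (t, 1, 1))
    return spatial + temporal


if __name__ == "__main__":
    c = 64  # illustrative channel width, not a value from the paper
    print("3D conv:           ", full_3d(c, c))
    print("(2+1)D:            ", r2plus1d(c, c, c))
    print("lightweight (2+1)D:", light_r2plus1d(c, c, c))
```

With 64 channels throughout, these illustrative counts are 110,592, 49,152 and 8,960 weights. The figures only show the direction of the savings under the stated assumptions and are unrelated to the 5.05M parameters reported for the full model.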