DUAN Y,SHAO Y B,LONG H,et al. Language identification based on joint decision of nonlinear spectrograms[J]. Microelectronics & Computer,2024,41(5):99-108. doi: 10.19304/J.ISSN1000-7180.2023.0298

Language identification based on joint decision of nonlinear spectrograms

  • The gray-scale logarithmic speech spectrogram over-stretches the fundamental-frequency region, which limits the identification rate for short utterances. To address this problem, a language identification method based on joint decision of nonlinear speech spectrograms is proposed. First, the logarithmic power spectrum is extracted with energy normalization, and a nonlinear speech spectrogram is obtained by mapping the frequency scale nonlinearly according to human auditory perception. Then, the nonlinear speech spectrogram is split into equal intervals according to word association characteristics, and a joint decision layer is added at the back end of the ResNet network. Finally, the language of the speech is output. Experimental results show that the proposed method effectively mitigates the shortcomings of the gray-scale logarithmic speech spectrogram and achieves higher recognition performance than the speech spectrogram and improved features. The best results are obtained with speech samples cut to 1.0 s, for which the recognition rate reaches 94.25% on the broadcast audio data set and 98.94% on the VoxForge public corpus.
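The front end described in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the perceptual frequency scale, the normalization, or the joint decision rule, so the mel scale, unit-RMS normalization, and averaging of per-segment posteriors are all assumed here. The function names and parameter values (`n_fft`, `hop`, `n_bins`, `seg_frames`) are hypothetical, and the ResNet classifier itself is omitted.

```python
import numpy as np

def hz_to_mel(f):
    # Standard mel-scale formula (assumed perceptual scale).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def nonlinear_spectrogram(signal, sr=16000, n_fft=512, hop=160, n_bins=64):
    """Log power spectrogram resampled onto a mel-spaced frequency axis."""
    signal = np.asarray(signal, dtype=np.float64)
    # Energy normalization (assumption: scale the signal to unit RMS).
    signal = signal / (np.sqrt(np.mean(signal ** 2)) + 1e-12)
    # Slice the signal into overlapping Hann-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Logarithmic power spectrum, shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_power = 10.0 * np.log10(power + 1e-10)
    # Nonlinear frequency mapping: interpolate each frame at mel-spaced
    # frequencies, giving low frequencies more rows than high ones.
    lin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    warped_freqs = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_bins))
    warped = np.stack([np.interp(warped_freqs, lin_freqs, row)
                       for row in log_power])
    return warped.T  # shape (n_bins, n_frames)

def split_equal(spec, seg_frames):
    """Split a spectrogram into equal-width segments along the time axis."""
    n_seg = spec.shape[1] // seg_frames
    return [spec[:, i * seg_frames:(i + 1) * seg_frames] for i in range(n_seg)]

def joint_decision(segment_posteriors):
    """Combine per-segment class posteriors by averaging (assumed rule)."""
    mean_post = np.mean(np.asarray(segment_posteriors, dtype=np.float64), axis=0)
    return int(np.argmax(mean_post)), mean_post
```

In use, each segment returned by `split_equal` would be passed to the classifier, and the resulting per-segment posteriors would be handed to `joint_decision` to produce one language label per utterance.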
