Emotional Speech Recognition Based on the PAD Emotion Model

Abstract: This paper briefly describes five types of speech features: Mel-frequency cepstral coefficients (MFCC), linear prediction coefficients (LPC), prosodic features, formant frequencies, and zero crossings with peak amplitudes (ZCPA), and applies them to emotional speech recognition. Based on the recognition results, a correlation analysis along the three dimensions of the PAD emotion model yields a weight coefficient for each feature. The recognition results are then fused and mapped into the three-dimensional PAD emotional space, which gives the PAD values of the emotional speech. With these PAD values, emotional speech can be described and analyzed in terms of continuous emotion theory, and the quantitative approach reveals the location of, and the relationships among, the emotion categories in the emotional space.
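
The weighting-and-fusion step summarized above can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything here is an assumption made for illustration: each feature-based recognizer is taken to output one (P, A, D) estimate per utterance, the weight of a feature on a given dimension is taken to be proportional to its non-negative Pearson correlation with reference PAD values, and fusion is a per-dimension weighted sum. The names `pearson`, `dimension_weights`, and `fuse_pad` are hypothetical and do not come from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov / (std_x * std_y)


def dimension_weights(predictions, reference):
    """Weights for ONE PAD dimension (assumed scheme, not the paper's formula).

    predictions: {feature_name: [predicted value per utterance]}
    reference:   [reference value per utterance]
    Negative correlations are clamped to zero, then the scores are
    normalized so that the weights sum to 1.
    """
    scores = {f: max(pearson(v, reference), 0.0) for f, v in predictions.items()}
    total = sum(scores.values())
    if total == 0:
        # No feature correlates positively: fall back to uniform weights.
        return {f: 1.0 / len(scores) for f in scores}
    return {f: s / total for f, s in scores.items()}


def fuse_pad(per_feature_pad, weights):
    """Fuse per-feature (P, A, D) estimates into a single PAD triple.

    per_feature_pad: {feature_name: (P, A, D)}
    weights:         list of three dicts, one {feature_name: weight} per dimension
    """
    return tuple(
        sum(weights[d][f] * per_feature_pad[f][d] for f in per_feature_pad)
        for d in range(3)
    )
```

Under these assumptions, `dimension_weights` would be called three times (once each for P, A, and D) on held-out recognition results, and the three resulting dictionaries would be passed to `fuse_pad` together with the per-feature estimates for a new utterance.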