LI Jinbiao, HOU Jin, LI Chen, CHEN Zirui, HE Chuan. Research on text classification method based on BERT-AWC[J]. Microelectronics & Computer, 2022, 39(6): 41-50. DOI: 10.19304/J.ISSN1000-7180.2021.1264

Research on text classification method based on BERT-AWC

  • Abstract: To address the low classification accuracy, large parameter counts, and training difficulty of existing text classification algorithms on Chinese data, this work optimizes the BERT algorithm. Because BERT cannot extract word-vector features when processing Chinese text, a uniform word-vector convolution module (AWC) is proposed: an attention mechanism is introduced into a conventional convolutional neural network to extract reliable word-vector features, from which local features of the text are then obtained, compensating for BERT's inability to extract word vectors. BERT's built-in self-attention network extracts global features that highlight the key meaning of the full text; the local features contributed by AWC are fused with these global features according to their importance, yielding a richer text representation. The fused features are fed into a softmax layer to produce the classification result. A balanced multi-head design, a hierarchical parameter-sharing mechanism, and fully-connected-layer optimization greatly reduce the number of model parameters while preserving accuracy, resulting in BERT-AWC, a lightweight text classification algorithm based on a hybrid attention mechanism. Experiments on several public datasets show that, compared with the baseline BERT, BERT-AWC improves prediction accuracy by 1% to 5% while using only 3.6% of BERT's parameters, meeting the design goals.
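
The abstract describes the architecture only at a high level. As a concrete illustration, the following PyTorch sketch shows one plausible reading of the AWC module (attention-reweighted character embeddings feeding multi-scale convolutions) and the importance-based fusion of local and global features. The class names, the sigmoid-gate fusion, and all dimensions are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch of the AWC module and local/global feature fusion,
# assuming a PyTorch implementation. Names, dimensions, and the
# sigmoid-gate fusion are assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AWC(nn.Module):
    """Attention-weighted convolution over (character-level) embeddings."""
    def __init__(self, hidden=768, n_filters=128, kernel_sizes=(2, 3, 4)):
        super().__init__()
        # Attention scores decide how much each character contributes
        # to the word-level local representation.
        self.attn = nn.Linear(hidden, 1)
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, n_filters, k, padding=k // 2)
            for k in kernel_sizes)
        self.proj = nn.Linear(n_filters * len(kernel_sizes), hidden)

    def forward(self, x):                       # x: (batch, seq_len, hidden)
        w = torch.softmax(self.attn(x), dim=1)  # (batch, seq_len, 1)
        x = (x * w).transpose(1, 2)             # reweight, then (batch, hidden, seq)
        feats = [F.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.proj(torch.cat(feats, dim=1))    # (batch, hidden)

class BertAWCClassifier(nn.Module):
    """Fuses AWC local features with BERT's global [CLS] feature."""
    def __init__(self, bert, hidden=768, n_classes=10):
        super().__init__()
        self.bert = bert          # any encoder exposing .last_hidden_state
        self.awc = AWC(hidden)
        self.gate = nn.Linear(2 * hidden, hidden)    # importance-based fusion
        self.cls = nn.Linear(hidden, n_classes)

    def forward(self, input_ids, attention_mask=None):
        seq = self.bert(input_ids,
                        attention_mask=attention_mask).last_hidden_state
        global_feat = seq[:, 0]       # [CLS] summarizes the whole text
        local_feat = self.awc(seq)    # word-level local features from AWC
        g = torch.sigmoid(self.gate(
            torch.cat([local_feat, global_feat], dim=1)))
        fused = g * local_feat + (1 - g) * global_feat
        return self.cls(fused)        # logits; softmax applied in the loss
```

Max-pooled multi-scale convolutions are a standard way to turn reweighted character embeddings into local features, and the learned gate g lets each sample weight local against global evidence, matching the abstract's fusion by importance.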
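
Of the lightweighting techniques the abstract lists, the hierarchical parameter-sharing mechanism is the one most responsible for shrinking the model. The sketch below shows one common realization, ALBERT-style cross-layer weight sharing; the sharing granularity (all layers versus groups of layers) and the depth are assumptions, since the abstract does not specify them.

```python
# Hypothetical sketch of cross-layer parameter sharing: a single
# Transformer layer is reused at every depth, so the encoder holds
# roughly 1/depth of the parameters of an unshared 12-layer stack.
import torch.nn as nn

class SharedEncoder(nn.Module):
    def __init__(self, hidden=768, n_heads=12, depth=12):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=n_heads, batch_first=True)
        self.depth = depth

    def forward(self, x, pad_mask=None):   # x: (batch, seq_len, hidden)
        for _ in range(self.depth):        # same weights applied `depth` times
            x = self.layer(x, src_key_padding_mask=pad_mask)
        return x
```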
