• Peking University Core Journal (listed in A Guide to the Core Journals of China, 2017 edition)
  • China Science and Technology Core Journal (statistical source journal for Chinese science and technology papers)
  • Indexed in the JST (Japan Science and Technology Agency) database


Research on Question Generation Model Based on Improved Attention Mechanism

YI Yenan, BIAN Yijie

Citation: YI Yenan, BIAN Yijie. Research on question generation model based on improved attention mechanism[J]. Microelectronics & Computer, 2022, 39(4): 49-57. doi: 10.19304/J.ISSN1000-7180.2021.1082


doi: 10.19304/J.ISSN1000-7180.2021.1082
Funding projects:

Smart Campus Special Project of Jiangsu Province Modern Educational Technology Research

"Industry-Education Integration" Special Research Project of the National Vocational Education Teacher Enterprise Practice Base, 2020-R-84366

Details
    About the authors:

    YI Yenan  male, born in 1990, Ph.D. candidate. Research interests: information management and e-commerce. E-mail: yi_yenan@hhu.edu.cn

    BIAN Yijie  male, born in 1964, professor, Ph.D. Research interests: information management and e-commerce

  • CLC number: TP391

Research on question generation model based on improved attention mechanism

  • Abstract:

    Question generation is a natural language generation task with a wide range of applications. Most existing studies adopt sequence-to-sequence models built on recurrent neural networks. Owing to the inherent "long-term dependency" problem of recurrent neural networks, the encoder cannot effectively capture the relationships between words when building a representation of the input sentence. Moreover, during decoding, the decoder usually computes global attention weights from only a single encoder layer or the top encoder layer, and therefore cannot fully exploit the syntactic and semantic information extracted from the original input sentence. To address these two shortcomings, this paper proposes a question generation model based on an improved attention mechanism. The model adds a self-attention mechanism to the encoder to capture the relationships between input words, and when the decoder generates question words it computes the global attention weights jointly from multiple encoder layers, making full use of syntactic and semantic information to improve decoding. Experiments on the SQuAD dataset show that the improved model outperforms the baseline model under both automatic and manual evaluation, and sample analysis further shows that the natural language questions generated by the improved model are of higher quality.
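The decoder-side change described in the abstract (computing global attention weights jointly from multiple encoder layers rather than from the top layer alone) can be illustrated with a toy sketch. The following pure-Python example is our own illustration, not the authors' code: it scores each source position by dotting the decoder state against the concatenation of that position's bottom-layer and top-layer encoder states, so both layers jointly shape the attention distribution. All names, the dot-product scoring, and the concatenation scheme are assumptions for illustration; the paper's exact formulation may differ.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def joint_global_attention(decoder_state, enc_layer1, enc_layer2):
    """Global attention whose scores depend on TWO encoder layers at once.

    Each source position i is scored by dotting the decoder state with the
    concatenation [h1_i; h2_i] of its layer-1 and layer-2 hidden states,
    so lower-layer (more syntactic) and top-layer (more semantic) features
    jointly determine the weights. The context vector is then the
    weighted sum of the top-layer states.
    """
    scores = []
    for h1, h2 in zip(enc_layer1, enc_layer2):
        key = h1 + h2  # list concatenation: [h1_i; h2_i]
        scores.append(sum(q * k for q, k in zip(decoder_state, key)))
    weights = softmax(scores)
    dim = len(enc_layer2[0])
    context = [sum(w * h[d] for w, h in zip(weights, enc_layer2))
               for d in range(dim)]
    return weights, context

# Toy example: 3 source positions, hidden size 2, query size 2*2.
enc_layer1 = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # bottom-layer states
enc_layer2 = [[0.2, 0.1], [0.9, 0.3], [0.4, 0.4]]  # top-layer states
dec_state = [1.0, 0.0, 0.0, 1.0]                   # decoder query
weights, context = joint_global_attention(dec_state, enc_layer1, enc_layer2)
```

Because the query attends to both layers' features at once, a position that is salient in either layer can attract attention, which is one way to read the paper's claim of "fully utilizing" the extracted syntactic and semantic information.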

     

  • Figure 1.  The encoder structure of the question generation model

    Figure 2.  The influence of referential relationships between words on question generation

    Figure 3.  The decoder structure of the question generation model

    Figure 4.  Examples of natural language questions generated by each model

    Figure 5.  Global attention weight heat map of the question and the original sentence in Example 1

    Table 1.  BLEU scores of each model on the test set

    Model                              BLEU-1   BLEU-2   BLEU-3   BLEU-4
    s2s+att+rich-f+copy (baseline)     0.4191   0.2763   0.2034   0.1559
    s2s+2l-att+rich-f+copy             0.4238   0.2796   0.2060   0.1580
    s2s+2l-att+rich-f+copy+self-att    0.4216   0.2792   0.2069   0.1597
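The BLEU-1 through BLEU-4 columns above are modified n-gram precision scores. For reference, here is a minimal single-reference, sentence-level BLEU sketch in pure Python; it is a simplified illustration of the standard definition (uniform weights, brevity penalty, no smoothing), not the evaluation script used in the paper, which presumably computes corpus-level scores per n-gram order.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of modified n-gram precisions
    (n = 1..max_n) times a brevity penalty. Single reference, no smoothing,
    so any zero precision makes the whole score zero."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())  # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_avg)

# A candidate identical to the reference scores 1.0.
ref = "the model generates a natural question".split()
perfect = bleu(ref, ref)
```

BLEU-n in the table corresponds to restricting the precision terms to a maximum n-gram order of n, which is why the scores shrink as n grows.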

    Table 2.  Manual scoring results of each model

    Model                              Average score   Kappa coefficient
    Human-written questions            2.89            0.72
    s2s+att+rich-f+copy (baseline)     1.78            0.53
    s2s+2l-att+rich-f+copy+self-att    1.87            0.59
Publication history
  • Received:  2021-09-10
  • Revised:  2021-10-12
  • Published online:  2022-05-12
