Research on question generation model based on improved attention mechanism
-
Abstract: Question generation is a widely applied natural language generation task. Most existing studies use sequence-to-sequence models built on recurrent neural networks. Because of the "long-term dependency" problem inherent in recurrent networks, the encoder cannot effectively capture the relational information between words when modeling the input sentence. In addition, during decoding, the decoder usually uses only a single layer or the top layer of the encoder's outputs to compute the global attention weights, and therefore cannot make full use of the syntactic and semantic information extracted from the input sentence. To address these two defects, a question generation model based on an improved attention mechanism is proposed. The model adds a self-attention mechanism to the encoder to capture the relational information between input words, and when the decoder generates question words, it computes the global attention weights jointly from the multi-layer outputs of the encoder, exploiting syntactic and semantic information more fully to improve decoding. Experiments on the SQuAD dataset show that the improved model outperforms the baseline model under both automatic and manual evaluation, and example analysis shows that the natural language questions it generates are of higher quality.
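The paper's exact formulation is not included in this excerpt, but the two modifications described in the abstract (self-attention inside the encoder, and global attention computed jointly over multiple encoder layers) can be sketched as follows. This is a minimal PyTorch sketch under assumed dimensions; the class names SelfAttentiveEncoder and JointGlobalAttention, the scaled dot-product form of the self-attention, and the additive mixing of the two encoder layers are illustrative assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn

class SelfAttentiveEncoder(nn.Module):
    """Two stacked BiLSTM layers; both layers' outputs are returned so the
    decoder can attend over them jointly, and a scaled dot-product
    self-attention pass injects word-to-word relation information."""
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=600):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.layer1 = nn.LSTM(emb_dim, hidden_dim // 2,
                              bidirectional=True, batch_first=True)
        self.layer2 = nn.LSTM(hidden_dim, hidden_dim // 2,
                              bidirectional=True, batch_first=True)
        self.scale = hidden_dim ** 0.5

    def forward(self, src):                        # src: (B, T) token ids
        h1, _ = self.layer1(self.embed(src))       # lower-layer states (B, T, H)
        h2, _ = self.layer2(h1)                    # top-layer states   (B, T, H)
        # self-attention over the top layer: each word attends to all words
        scores = h2 @ h2.transpose(1, 2) / self.scale          # (B, T, T)
        h2 = torch.softmax(scores, dim=-1) @ h2    # relation-aware states
        return h1, h2

class JointGlobalAttention(nn.Module):
    """Luong-style global attention whose scores mix the lower- and
    top-layer encoder outputs instead of using the top layer alone."""
    def __init__(self, hidden_dim=600):
        super().__init__()
        self.w1 = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.w2 = nn.Linear(hidden_dim, hidden_dim, bias=False)

    def forward(self, dec_state, h1, h2):          # dec_state: (B, H)
        # one score per source position, combining both encoder layers
        scores = (self.w1(h1) + self.w2(h2)) @ dec_state.unsqueeze(-1)
        weights = torch.softmax(scores.squeeze(-1), dim=-1)     # (B, T)
        context = (weights.unsqueeze(1) @ (h1 + h2)).squeeze(1) # (B, H)
        return context, weights

# toy usage with random token ids and a zero decoder state
enc, att = SelfAttentiveEncoder(vocab_size=10000), JointGlobalAttention()
h1, h2 = enc(torch.randint(0, 10000, (2, 12)))
ctx, w = att(torch.zeros(2, 600), h1, h2)
print(ctx.shape, w.shape)      # torch.Size([2, 600]) torch.Size([2, 12])
```

At each decoding step the context vector would be fed, together with the decoder state, into the output projection (and into the copy mechanism implied by the "+copy" model names in Table 1), but those parts follow the standard attention-based seq2seq recipe and are omitted here.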
-
Table 1. BLEU scores of each model on the test set

Model                               BLEU-1    BLEU-2    BLEU-3    BLEU-4
s2s+att+rich-f+copy (baseline)      0.4191    0.2763    0.2034    0.1559
s2s+2l-att+rich-f+copy              0.4238    0.2796    0.2060    0.1580
s2s+2l-att+rich-f+copy+self-att     0.4216    0.2792    0.2069    0.1597

Table 2. Manual scoring results of each model

Model                               Average score    Kappa coefficient
Human-written questions             2.89             0.72
s2s+att+rich-f+copy (baseline)      1.78             0.53
s2s+2l-att+rich-f+copy+self-att     1.87             0.59
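For reference, the BLEU-1 to BLEU-4 columns in Table 1 are standard corpus-level BLEU with uniform n-gram weights. The sketch below shows how such scores can be computed with NLTK; the tokenized reference and generated questions are made-up examples, not items from SQuAD.

```python
# Corpus-level BLEU-1..4 with uniform n-gram weights, as in Table 1.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# each hypothesis may have several references, hence the nested list
references = [[["what", "year", "was", "the", "university", "founded", "?"]]]
hypotheses = [["when", "was", "the", "university", "founded", "?"]]

smooth = SmoothingFunction().method1   # avoids zero scores on tiny corpora
for n in range(1, 5):
    weights = tuple([1.0 / n] * n)     # uniform weights over 1..n-grams
    score = corpus_bleu(references, hypotheses,
                        weights=weights, smoothing_function=smooth)
    print(f"BLEU-{n}: {score:.4f}")
```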