Research on a Chinese Entity Relation Extraction Method Based on Sentence-Entity Features and BERT Fusion
-
Abstract: Relation extraction, a key step in information extraction, aims to extract the relationships between entities from unstructured text. Deep-learning-based entity relation extraction has achieved promising results, but its feature extraction is not comprehensive enough, and there is still considerable room for improvement on the usual evaluation metrics. Unlike tasks such as text classification and named entity recognition, entity relation extraction depends mainly on the sentence and on the information of the two target entities. Based on these characteristics, this paper proposes the SEF-BERT relation extraction model (Fusion Sentence-Entity Features and BERT Model). The model builds on pre-trained BERT: after the text is encoded by the pre-trained BERT model, sentence features and entity features are further extracted and then fused, so that the fused feature vector carries the features of the sentence and of both entities at once, strengthening the model's ability to process feature vectors. Finally, the model was trained and tested on a general-domain dataset and a medical-domain dataset. The experimental results show that, compared with other existing models, the SEF-BERT model performs better on both datasets.
-
Key words:
- natural language processing /
- relation extraction /
- deep learning /
- BERT /
- transformer
-
Algorithm 1: SEF-BERT model workflow
Input: sentence H; start/end positions i, j of entity 1; start/end positions k, m of entity 2
Output: relation mapping vector p
$\boldsymbol{H}_{0} \leftarrow \mathrm{BERT}(H)$ (feed sentence H through the BERT model to obtain the sentence feature vector)
$\boldsymbol{H}_{1} \leftarrow \sum\limits_{t=i}^{j} \boldsymbol{H}_{t}$ (sum the BERT output vectors over entity 1's positions to obtain its feature vector)
$\boldsymbol{H}_{2} \leftarrow \sum\limits_{t=k}^{m} \boldsymbol{H}_{t}$ (likewise, sum over entity 2's positions to obtain its feature vector)
$\boldsymbol{h}'_{0} \leftarrow \mathrm{activation}(\boldsymbol{H}_{0})$ (pass the sentence feature vector through the activation function)
$\boldsymbol{h}'_{1} \leftarrow \tanh(\boldsymbol{H}_{1} \boldsymbol{H}_{0})$ (multiply H1 with H0, then apply the activation function)
$\boldsymbol{h}'_{2} \leftarrow \tanh(\boldsymbol{H}_{2} \boldsymbol{H}_{0})$ (multiply H2 with H0, then apply the activation function)
$\boldsymbol{H}'' \leftarrow \mathrm{concat}(\boldsymbol{h}'_{0}, \boldsymbol{h}'_{1}, \boldsymbol{h}'_{2})$ (fuse the three feature vectors)
$p \leftarrow \mathrm{softmax}(\boldsymbol{H}'')$ (normalise the fused feature vector into the relation mapping)
return p
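The flow of Algorithm 1 can be sketched in plain numpy. A random matrix stands in for the BERT encoder output; the hidden size, entity positions, the use of the first token vector as the sentence vector, and the elementwise product standing in for the paper's vector product are all illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def sef_fusion(H, i, j, k, m):
    """Fuse sentence and entity features as in Algorithm 1.

    H    : (seq_len, hidden) token vectors from the encoder
    i, j : start/end token positions of entity 1 (inclusive)
    k, m : start/end token positions of entity 2 (inclusive)
    Returns the softmax-normalised relation vector p.
    """
    H0 = H[0]                      # sentence vector (e.g. the [CLS] token)
    H1 = H[i:j + 1].sum(axis=0)    # entity-1 vector: sum over its tokens
    H2 = H[k:m + 1].sum(axis=0)    # entity-2 vector: sum over its tokens
    h0 = np.tanh(H0)               # sentence feature through the activation
    h1 = np.tanh(H1 * H0)          # entity 1 fused with the sentence
    h2 = np.tanh(H2 * H0)          # entity 2 fused with the sentence
    Hf = np.concatenate([h0, h1, h2])   # feature fusion
    e = np.exp(Hf - Hf.max())           # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(0)
H = rng.normal(size=(10, 8))       # stand-in for BERT output
p = sef_fusion(H, 2, 3, 6, 8)
print(p.shape, round(float(p.sum()), 6))   # (24,) 1.0
```

In the full model the fused vector would pass through a learned classification layer before the softmax; this sketch applies the softmax directly, as the algorithm listing does.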
Table 1. SEF-BERT training results on the general-domain dataset
| No. | Relation class | P | R | F1 |
| --- | --- | --- | --- | --- |
| 1 | Part-Whole | 0.91 | 0.81 | 0.85 |
| 2 | Near | 0.62 | 0.59 | 0.60 |
| 3 | Social | 0.88 | 0.77 | 0.81 |
| 4 | Create | 0.91 | 0.79 | 0.84 |
| 5 | Use | 0.90 | 0.95 | 0.93 |
| 6 | Located | 0.78 | 0.77 | 0.78 |
| 7 | General-Special | 0.89 | 0.97 | 0.93 |
| 8 | Family | 0.89 | 0.97 | 0.93 |
| 9 | Ownership | 0.74 | 0.86 | 0.79 |
| 10 | Macro avg | 0.83 | 0.82 | 0.82 |
Table 2. Comparison of models on the general-domain dataset
| Model | Feature set | F1 |
| --- | --- | --- |
| SVM[19] | Word embedding, NER, WordNet, HowNet, POS, dependency parse, Google n-gram | 0.489 |
| RNN[20] | Word embedding + POS, NER, WordNet | 0.483 |
| CNN[21] | Word embedding + position embedding, NER, WordNet | 0.476 |
| CR-CNN[22] | Word embedding + position embedding | 0.527 |
| SDP-LSTM[23] | Word embedding + POS, NER, WordNet | 0.549 |
| DepCNN[24] | Word embedding, WordNet | 0.552 |
| Att-BLSTM[25] | Character embedding + position embedding, entity sense | 0.562 |
| BERT | Token embedding, position embedding, segment embedding | 0.780 |
| SEF-BERT | Token embedding, position embedding, segment embedding | 0.820 |
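The last two rows of Table 2 list BERT's input feature set: token, position, and segment embeddings, which BERT sums elementwise per position. A minimal sketch with random lookup tables (the vocabulary size, sequence length, hidden size, and toy token ids are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, max_len, segments, hidden = 100, 16, 2, 8

tok_table = rng.normal(size=(vocab, hidden))     # token embeddings
pos_table = rng.normal(size=(max_len, hidden))   # position embeddings
seg_table = rng.normal(size=(segments, hidden))  # segment embeddings

token_ids = np.array([2, 17, 45, 3])             # toy id sequence
seg_ids = np.zeros(len(token_ids), dtype=int)    # single-sentence input

# BERT input representation: sum of the three embeddings at each position
x = tok_table[token_ids] + pos_table[np.arange(len(token_ids))] + seg_table[seg_ids]
print(x.shape)   # (4, 8)
```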
Table 3. SEF-BERT training results on the medical-domain dataset (partial)
| No. | Relation class | P | R | F1 |
| --- | --- | --- | --- | --- |
| 1 | Prevention | 0.89 | 0.83 | 0.86 |
| 2 | Stage | 0.92 | 0.87 | 0.89 |
| 3 | Consulting department | 1.00 | 0.95 | 0.98 |
| 4 | Synonym | 0.99 | 0.98 | 0.98 |
| 5 | Adjuvant therapy | 0.87 | 0.87 | 0.87 |
| 6 | Chemotherapy | 0.87 | 0.71 | 0.78 |
| 7 | Radiotherapy | 0.84 | 0.84 | 0.84 |
| 8 | Surgical treatment | 0.89 | 0.90 | 0.90 |
| 9 | Laboratory test | 0.88 | 0.88 | 0.88 |
| 10 | Imaging examination | 0.91 | 0.95 | 0.93 |
| 11 | Auxiliary examination | 0.82 | 0.73 | 0.77 |
| 12 | Histological examination | 0.68 | 0.78 | 0.73 |
| 13 | Endoscopy | 0.90 | 0.83 | 0.86 |
| 14 | Screening | 0.56 | 0.62 | 0.59 |
| 15 | Susceptible population | 0.92 | 0.88 | 0.90 |
Table 4. Comparison of models on the medical-domain dataset
| Model | P(%) | R(%) | F1(%) |
| --- | --- | --- | --- |
| CNN | 69 | 60 | 62 |
| RNN | 42 | 41 | 40 |
| BERT | 82 | 73 | 75 |
| SEF-BERT | 85 | 82 | 83 |
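Tables 1, 3, and 4 report per-class precision, recall, F1, and their unweighted macro average. A minimal sketch of how those scores are computed from gold and predicted labels (the label values and counts here are toy data, not the paper's):

```python
from collections import Counter

def macro_prf(gold, pred):
    """Per-class precision/recall/F1 and their unweighted macro average."""
    classes = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p predicted where it was not the gold label
            fn[g] += 1          # gold label g missed
    scores = {}
    for c in classes:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[c] = (prec, rec, f1)
    # macro average: mean over classes, each class weighted equally
    macro = tuple(sum(s[i] for s in scores.values()) / len(classes) for i in range(3))
    return scores, macro

gold = ["Use", "Use", "Near", "Family", "Near", "Use"]
pred = ["Use", "Near", "Near", "Family", "Near", "Use"]
scores, macro = macro_prf(gold, pred)
print(round(macro[2], 3))   # 0.867
```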
Table 5. Case comparison of predictions from the three models
| Medical text | Entity 1 | Entity 2 | CNN | BERT | SEF-BERT | Gold |
| --- | --- | --- | --- | --- | --- | --- |
| Sensory disturbances include numbness, formication, pins-and-needles sensation, and burning sensation. | sensory disturbance | numbness | Clinical manifestation | Clinical manifestation | Clinical manifestation | Clinical manifestation |
| Low-dose radiation leads to isolated growth hormone deficiency, whereas high-dose radiation leads to panhypopituitarism. | hypopituitarism | high-dose radiation | Transmission route | Etiology | Etiology | Etiology |
| For ischemic stroke, some specialized stroke treatment centers use MRI instead of CT as the initial imaging examination of choice. | ischemic stroke | MRI | Laboratory test | Laboratory test | Imaging examination | Imaging examination |
-
[1] ZHU X L, XIE Z. Automatic construction of knowledge graph based on massive text data[J]. Journal of Jilin University (Engineering and Technology Edition), 2021, 51(4): 1358-1363. https://www.cnki.com.cn/Article/CJFDTOTAL-JLGY202104022.htm
[2] PAN D S. Research on fuzzy mining algorithm for massive text data under uncertain noise[J]. Microelectronics & Computer, 2017, 34(9): 129-132. http://www.journalmc.com/article/id/39b80dfe-9ea2-4a97-99ae-d602f8b7e87c
[3] LI D M, ZHANG Y, LI D Y, et al. Review of entity relation extraction methods[J]. Journal of Computer Research and Development, 2020, 57(7): 1424-1448. DOI: 10.7544/issn1000-1239.2020.20190358
[4] LIU H, JIANG Q J, GUI Q J, et al. Review of research progress of entity relationship extraction[J]. Application Research of Computers, 2020, 37(S2): 1-5. https://www.cnki.com.cn/Article/CJFDTOTAL-JSYJ2020S2001.htm
[5] SUN W, RUMSHISKY A, UZUNER O. Evaluating temporal relations in clinical text: 2012 i2b2 challenge[J]. Journal of the American Medical Informatics Association, 2013, 20(5): 806-813. DOI: 10.1136/amiajnl-2013-001628
[6] CHANG Y C, DAI H J, WU C Y, et al. TEMPTING system: a hybrid method of rule and machine learning for temporal relation extraction in patient discharge summaries[J]. Journal of Biomedical Informatics, 2013, 46: S54-S62. DOI: 10.1016/j.jbi.2013.09.007
[7] KAMBHATLA N. Combining lexical, syntactic, and semantic features with maximum entropy models for extracting relations[C]. 2004. DOI: 10.3115/1219044.1219066
[8] CHE W X, LIU T, LI S. Automatic entity relation extraction[J]. Journal of Chinese Information Processing, 2005, 19(2): 1-6. DOI: 10.3969/j.issn.1003-0077.2005.02.001
[9] MINTZ M, BILLS S, SNOW R, et al. Distant supervision for relation extraction without labeled data[C]//Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP. 2009: 1003-1011.
[10] SOCHER R, HUVAL B, MANNING C D, et al. Semantic compositionality through recursive matrix-vector spaces[C]. 2012: 1201-1211.
[11] HENDRICKX I, KIM S N, KOZAREVA Z, et al. SemEval-2010 task 8: multi-way classification of semantic relations between pairs of nominals[C]//Proceedings of the 5th International Workshop on Semantic Evaluation. 33-38.
[12] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//31st Conference on Neural Information Processing Systems (NIPS 2017). 2017.
[13] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[J]. arXiv: 1810.04805, 2018.
[14] LAN Z Z, CHEN M D, GOODMAN S, et al. ALBERT: a lite BERT for self-supervised learning of language representations[J]. 2019. https://arxiv.org/abs/1909.11942v3
[15] ZHANG Z, HAN X, LIU Z, et al. ERNIE: enhanced language representation with informative entities[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
[16] KUMAR S. A survey of deep learning methods for relation extraction[J]. arXiv: 1705.03645, 2017.
[17] HAN X, ZHU H, YU P, et al. FewRel: a large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation[C]. 2018. DOI: 10.18653/v1/D18-1514
[18] GUAN T, ZAN H, ZHOU X, et al. CMeIE: construction and evaluation of Chinese medical information extraction dataset[M]. 2020. DOI: 10.1007/978-3-030-60450-9_22
[19] SOCHER R, PENNINGTON J, HUANG E H, et al. Semi-supervised recursive autoencoders for predicting sentiment distributions[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2011: 151-161.
[20] ZENG D, LIU K, LAI S, et al. Relation classification via convolutional deep neural network[C]. 2014: 2335-2344.
[21] SANTOS C D, XIANG B, ZHOU B. Classifying relations by ranking with convolutional neural networks[C]. 2015: 626-634. DOI: 10.3115/v1/P15-1061
[22] XU Y, MOU L, LI G, et al. Classifying relations via long short term memory networks along shortest dependency paths[C]. 2015: 1785-1794.
[23] LIN D, WU X. Phrase clustering for discriminative learning[C]. 1030-1038.
[24] CAI R, ZHANG X, WANG H. Bidirectional recurrent convolutional neural network for relation classification[C]. DOI: 10.18653/v1/P16-1072
[25] ZHANG, HAO, TANG. A multi-feature fusion model for Chinese relation extraction with entity sense[J]. Knowledge-Based Systems, 2020, 206: 106348. DOI: 10.1016/j.knosys.2020.106348