Abstract:
Medical event extraction is an important foundation for constructing medical knowledge graphs. To address the scarcity of labeled data in the medical domain, a joint medical event extraction model based on a Transformer encoder, BiLSTM, and an attention mechanism is constructed, and a pseudo-label confidence selection algorithm for selecting high-confidence data is proposed. First, the joint extraction model is trained and used to predict unlabeled data, generating pseudo-labeled data. Second, high-confidence pseudo-labeled data are selected by calculating the pseudo-label consensus probability P and are added to the original data to retrain the joint extraction model. Finally, the updated joint extraction model is used to extract primary site, lesion size, and metastatic site events from electronic medical records, and majority voting is used to obtain the final extraction results. Experiments on the corpus of the medical event extraction task for Chinese electronic medical records from the 2020 National Knowledge Graph and Semantic Computing Conference (CCKS2020) show that the proposed method achieves better medical event extraction results.
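
The selection step can be illustrated with a minimal sketch. The abstract does not define how the consensus probability P is computed, so the code below assumes, purely for illustration, that several predictions are available for each unlabeled sample and that P is the fraction of them agreeing with the most frequent prediction. The function names `consensus` and `select_pseudo_labels` and the threshold value are hypothetical and are not taken from the paper.

```python
from collections import Counter


def consensus(predictions):
    """Return the majority prediction and the fraction of votes it received.

    Assumption: P is the share of predictions that agree with the most
    frequent one. The paper may define P differently.
    Predictions must be hashable (e.g. strings or tuples of spans).
    """
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)


def select_pseudo_labels(samples, predictions_per_sample, threshold=0.8):
    """Keep only samples whose consensus probability reaches the threshold.

    The threshold value is a hypothetical default, not a value from the paper.
    """
    selected = []
    for sample, preds in zip(samples, predictions_per_sample):
        label, p = consensus(preds)
        if p >= threshold:
            # The majority prediction becomes the pseudo-label.
            selected.append((sample, label))
    return selected
```

Under these assumptions, the same `consensus` function also covers the final majority-voting step: the label it returns is the majority vote, and the selected pairs would be appended to the labeled data before retraining.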