ALBERT-Based Named Entity Recognition of Chinese Medical Records

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Issue:
2021, Issue 01
Page:
36-43
Research Field:
Computer Science and Technology

Info

Title:
ALBERT-Based Named Entity Recognition of Chinese Medical Records
Author(s):
Chen Jie (1), Xi Xuefeng (1,2), Pi Zhou (1), Victor S Sheng (3), Cui Zhiming (1,2)
(1. School of Electronic and Computer Engineering, Suzhou University of Science and Technology, Suzhou 215009, China) (2. Suzhou Smart City Research Institute, Suzhou 215009, China) (3. Computer Science Department, Texas Tech University, Texas 79431, USA)
Keywords:
ALBERT; named entity recognition; clinical electronic medical records; BiLSTM; CRF
PACS:
TP181
DOI:
10.3969/j.issn.1672-1292.2021.01.006
Abstract:
The main task of named entity recognition on medical records is to convert unstructured text into structured data, which provides fundamental support for data mining in the medical field. This paper proposes a named entity recognition method for Chinese medical records based on ALBERT and a fusion model. First, we expand the sample dataset through manual labeling and fine-tune ALBERT on it. Second, a bidirectional long short-term memory (BiLSTM) network extracts the global features of the text. Finally, a conditional random field (CRF) model assigns sequence tags to the named entities. Experimental results on the standard dataset show that the proposed method further improves the accuracy of named entity recognition on medical text and greatly reduces the time overhead.
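
The abstract gives no implementation details, so the following is only a minimal sketch of the final CRF step: Viterbi decoding over per-character scores, followed by collecting tagged spans into structured entities. Everything in it is an assumption made for illustration: the tag set (`O`, `B-DIS`, `I-DIS`), the sample sentence, and the hand-written `EMISSIONS` and `TRANSITIONS` numbers, which stand in for what an ALBERT + BiLSTM encoder and a trained CRF would produce.

```python
from typing import List, Optional, Tuple

# Hypothetical BIO tag set; DIS marks a disease mention.
TAGS = ["O", "B-DIS", "I-DIS"]


def viterbi(emissions: List[List[float]], transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   score of tag j at position t (placeholder for BiLSTM output)
    transitions[i][j] score of moving from tag i to tag j (placeholder CRF weights)
    """
    if not emissions:
        return []
    n_tags = len(transitions)
    score = list(emissions[0])
    backptr: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            # Best previous tag for ending in tag j at position t.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        backptr.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def extract_entities(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group B-/I- tagged characters into (text, label) pairs."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    label: Optional[str] = None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            if current:
                entities.append(("".join(current), label))
            current, label = [ch], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(ch)
        else:
            if current:
                entities.append(("".join(current), label))
            current, label = [], None
    if current:
        entities.append(("".join(current), label))
    return entities


# Illustrative sentence ("the patient has a history of diabetes"), one tag per character.
CHARS = list("患者有糖尿病史")

# Made-up scores, one row per character, columns ordered as TAGS.
EMISSIONS = [
    [2.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.5, 1.0, 1.2],  # emission alone slightly prefers I-DIS here
    [0.0, 0.0, 2.0],
    [0.0, 0.0, 2.0],
    [2.0, 0.0, 0.0],
]

# Made-up transition scores; O -> I-DIS is heavily penalised as an invalid BIO move.
TRANSITIONS = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, -1.0, 1.0],   # from B-DIS
    [0.0, -1.0, 1.0],   # from I-DIS
]

decoded = [TAGS[i] for i in viterbi(EMISSIONS, TRANSITIONS)]
entities = extract_entities(CHARS, decoded)
print(decoded)
print(entities)
```

With these placeholder scores, the fourth character would be tagged `I-DIS` if each position were labelled independently, which is not a valid start of an entity. The transition penalty steers the decoder to `B-DIS` instead, so the span 糖尿病 is recovered as a single `DIS` entity. This is the kind of sequence-level constraint that a CRF layer adds on top of per-token scores.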

Memo

Last Update: 2021-03-15