Feng Linhui, Qiao Linbo, Kan Zhigang. Design and Implementation of a Pretraining Active Learning Model for Unstructured Event Detection[J]. Journal of Nanjing Normal University (Engineering and Technology), 2022, 22(02): 41-47. [doi:10.3969/j.issn.1672-1292.2022.02.007]

Design and Implementation of a Pretraining Active Learning Model for Unstructured Event Detection

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Volume: 22
Issue: 2022(02)
Pages: 41-47
Section: Computer Science and Technology
Publication date: 2022-06-30

Article Info

Title:
Design and Implementation of a Pretraining Active Learning Model for Unstructured Event Detection
Article ID:
1672-1292(2022)02-0041-07
Author(s):
Feng Linhui, Qiao Linbo, Kan Zhigang
(National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China)
Keywords:
active learning; event detection; pre-trained model; sample selection strategy; fine-tuning
CLC number:
O643; X703
DOI:
10.3969/j.issn.1672-1292.2022.02.007
Document code:
A
Abstract:
With the rapid growth of online information, finding key information has become increasingly important. Event detection focuses on extracting event triggers from unstructured natural language text. Deep learning has achieved great success on event detection, but such models rely on large amounts of labeled data, which are difficult to obtain: because events carry structured information and rich label representations, annotation is expensive. To improve annotation efficiency and reduce the number of labeled samples required for training, this paper proposes EDPAL, an event detection model that combines active learning with a pre-trained model. To handle the cold-start problem of active learning, a sample selection strategy based on fused uncertainty is designed to estimate each sample's potential contribution to fine-tuning the downstream event detection task. On the one hand, the rich semantic information the pre-trained model carries over from its original task avoids redesigning the network structure or training from scratch; on the other hand, actively selecting information-rich samples fine-tunes the pre-trained model more effectively while reducing the cost of data labeling. Experimental results on the ACE 2005 corpus demonstrate the effectiveness of the proposed EDPAL algorithm.

References:

[1]HONG Y,ZHANG J F,MA B,et al. Using cross-entity inference to improve event extraction[C]//Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics:Human Language Technologies. Portland,USA:ACL,2011.
[2]LI Q,JI H,HUANG L. Joint event extraction via structured prediction with global features[C]//Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. Sofia,Bulgaria:ACL,2013.
[3]WU J G,ZHOU F K,ZHANG X Y. Event attribute information extraction combining HMM and syntactic parsing[J]. Journal of Nanjing Normal University(Natural Science Edition),2014,37(1):30-34.(in Chinese)
[4]CHEN Y B,XU L H,LIU K,et al. Event extraction via dynamic multi-pooling convolutional neural networks[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. Beijing,China:ACL,2015.
[5]FENG X C,QIN B,LIU T. A language-independent neural network for event detection[J]. Science China Information Sciences,2018,61(9):81-92.
[6]NGUYEN T H,CHO K,GRISHMAN R. Joint event extraction via recurrent neural networks[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies. San Diego,USA:NAACL,2016.
[7]REN P Z,XIAO Y,CHANG X J,et al. A survey of deep active learning[J]. arXiv preprint arXiv:2009.00236,2020.
[8]SEUNG H S,OPPER M,SOMPOLINSKY H. Query by committee[C]//Proceedings of the Fifth Annual Workshop on Computational Learning Theory. Pittsburgh,USA:ACM,1992.
[9]LIAO S S,GRISHMAN R. Using prediction from sentential scope to build a pseudo co-testing learner for event extraction[C]//Proceedings of the 5th International Joint Conference on Natural Language Processing. Chiang Mai,Thailand:ACL,2011.
[10]QIU Y Y,HONG Y,ZHOU W X,et al. A joint deep and active learning method for event extraction[J]. Journal of Chinese Information Processing,2018,32(6):98-106.(in Chinese)
[11]DEVLIN J,CHANG M W,LEE K,et al. BERT:pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies. Minneapolis,USA:ACL,2019.
[12]PETERS M,NEUMANN M,IYYER M,et al. Deep contextualized word representations[C]//Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies. New Orleans,USA:ACL,2018.
[13]VASWANI A,SHAZEER N,PARMAR N,et al. Attention is all you need[C]//Proceedings of the 31st Conference on Neural Information Processing Systems. Long Beach,USA:Curran Associates,2017.
[14]HUANG S J,ZHAO J W,LIU Z Y. Cost-effective training of deep CNNs with active model adaptation[C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. London,UK:ACM,2018.
[15]MERCHANT A,RAHIMTOROGHI E,PAVLICK E,et al. What happens to BERT embeddings during fine-tuning?[J]. arXiv preprint arXiv:2004.14448,2020.
[16]MARTINEZ-CANTIN R,DE FREITAS N,DOUCET A,et al. Active policy learning for robot planning and exploration under uncertainty[C]//Proceedings of Robotics:Science and Systems III. Atlanta,USA:MIT Press,2007.
[17]SCHEFFER T,DECOMAIN C,WROBEL S. Active hidden Markov models for information extraction[C]//Proceedings of the 4th International Conference on Advances in Intelligent Data Analysis. Berlin,Germany:Springer,2001.
[18]LIU M Y,TU Z Y,ZHANG T,et al. LTP:a new active learning strategy for CRF-based named entity recognition[J]. arXiv preprint arXiv:2001.02524,2020.
[19]LIU J,CHEN Y B,LIU K. Exploiting the ground-truth:an adversarial imitation based knowledge distillation approach for event detection[J]. Proceedings of the AAAI Conference on Artificial Intelligence,2019,33(1):6754-6761.
[20]SCHEIN A I,UNGAR L H. Active learning for logistic regression:an evaluation[J]. Machine Learning,2007,68(3):235-265.

Memo:
Received: 2021-08-31.
Corresponding author: Qiao Linbo, PhD, assistant research fellow. Research interests: online and distributed optimization, large-scale machine learning, event logic graph construction and mining. E-mail: qiaolinbo@nudt.edu.cn