[1] Xiong Fulin, Deng Yihao, Tang Xiaosheng. The Architecture of Word2vec and Its Applications[J]. Journal of Nanjing Normal University (Engineering and Technology), 2015, 15(01): 43-48.

The Architecture of Word2vec and Its Applications

Journal of Nanjing Normal University (Engineering and Technology) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Vol. 15
Issue:
2015, No. 01
Pages:
43-48
Column:
Computer Engineering
Publication Date:
2015-03-20

Article Info

Title:
The Architecture of Word2vec and Its Applications
Author(s):
Xiong Fulin¹, Deng Yihao¹, Tang Xiaosheng²
(1. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China)
(2. WTI Laboratory, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Keywords:
NLP; Word2vec; CBOW; Skip-gram; Chinese language processing
CLC number:
TP391.1
Document code:
A
Abstract:
The neural probabilistic language model is an emerging approach to natural language processing. By learning from a training corpus, the model obtains word vectors and a probability density function. Word vectors are multi-dimensional real-valued vectors that capture the semantic and syntactic relations of natural language: the cosine distance between two word vectors reflects how closely the corresponding words are related, and algebraic addition and subtraction of word vectors amounts to the computer "choosing words and building phrases". Neural probabilistic language models have advanced rapidly in recent years, and Word2vec is a collection of the latest techniques, comprising the CBOW and Skip-gram architectures. This paper first introduces these two core architectures; it then trains Word2vec models on an English corpus extracted from Wikipedia and compares the two architectures; finally, it explores the application of Word2vec to Chinese-language processing and presents the corresponding results.
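The abstract's two central claims about word vectors — that cosine distance reflects semantic relatedness, and that vector addition and subtraction perform analogy-style "word choice" — can be illustrated with a minimal sketch. The toy 4-dimensional vectors below are hypothetical, hand-crafted for illustration only; a real Word2vec model learns vectors of hundreds of dimensions from a corpus:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors; closer to 1 means more related."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy vectors; dimensions stand for [royalty, male, female, vehicle].
vectors = {
    "king":  np.array([1.0, 1.0, 0.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0, 0.0]),
    "man":   np.array([0.0, 1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0, 0.0]),
    "car":   np.array([0.0, 0.0, 0.0, 1.0]),
}

# Cosine distance reflects relatedness: "king" is closer to "queen" than to "car".
assert cosine_similarity(vectors["king"], vectors["queen"]) > \
       cosine_similarity(vectors["king"], vectors["car"])

# Vector arithmetic: king - man + woman should land nearest to queen.
target = vectors["king"] - vectors["man"] + vectors["woman"]
best = max((w for w in vectors if w not in ("king", "man", "woman")),
           key=lambda w: cosine_similarity(target, vectors[w]))
print(best)  # → queen
```

With vectors trained by an actual CBOW or Skip-gram model the same two operations — nearest-neighbour lookup by cosine similarity and additive analogies — are what the paper's experiments evaluate.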

References:

[1] Bengio Y, Ducharme R, Vincent P. A neural probabilistic language model[J]. Journal of Machine Learning Research, 2003, 3(7): 1137-1155.
[2] Gutmann M U, Hyvärinen A. Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics[J]. Journal of Machine Learning Research, 2012, 13(2): 307-361.
[3] Mikolov T, Chen K, Corrado G. Efficient estimation of word representations in vector space[EB/OL]. (2013-08-18)[2013-09-07]. http://arxiv.org/abs/1301.3781.
[4] Bengio Y, LeCun Y. Scaling Learning Algorithms Towards AI[M]//Large-Scale Kernel Machines. Cambridge: MIT Press, 2007.
[5] Mikolov T, Karafiát M, Burget L, et al. Recurrent neural network based language model[C]//Proceedings of Interspeech. Chiba, Japan: ISCA, 2010: 131-138.
[6] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality[EB/OL]. [2013-10-16]. http://arxiv.org/abs/1310.4546.
[7] Elman J. Finding structure in time[J]. Cognitive Science, 1990, 14(7): 179-211.
[8] Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors[J]. Nature, 1986, 323(9): 533-536.
[9] Li Lei. Character recognition research based on artificial intelligence and machine learning[D]. Chengdu: School of Mechatronics Engineering, University of Electronic Science and Technology of China, 2013. (in Chinese)
[10] Mnih A, Teh Y W. A fast and simple algorithm for training neural probabilistic language models[EB/OL]. (2009-10-12)[2012-06-10]. http://arxiv.org/ftp/arxiv/papers/12061.
[11] Morin F, Bengio Y. Hierarchical probabilistic neural network language model[C]//Proceedings of the International Workshop on Artificial Intelligence and Statistics. Barbados: MIT Press, 2005: 246-252.
[12] Mikolov T, Kopecký J, Burget L, et al. Neural network based language models for highly inflective languages[C]//Proc. ICASSP. Taipei: IEEE, 2009: 126-129.
[13] Hinton G E, McClelland J L, Rumelhart D E. Distributed Representations[M]//Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge: MIT Press, 1986.
[14] Xu Yan, Jin Zhi, Li Ge, et al. Acquiring topical concept network from multiple Web information sources[J]. Journal of Computer Research and Development, 2013, 50(9): 1843-1854. (in Chinese)


Memo:
Received: 2014-08-16. Corresponding author: Xiong Fulin, master's degree; research interests: data mining. E-mail: fulinxiong@gmail.com
Last Update: 2015-03-30