The Architecture of Word2vec and Its Applications

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Issue:
2015, No. 01
Page:
43-48
Research Field:
Computer Engineering
Publishing date:

Info

Title:
The Architecture of Word2vec and Its Applications
Author(s):
Xiong Fulin 1, Deng Yihao 1, Tang Xiaosheng 2
(1. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China) (2. Wireless Technology Innovation, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Keywords:
NPL; Word2vec; CBOW; Skip-gram; Chinese language processing
PACS:
TP391.1
DOI:
-
Abstract:
Word2vec builds on the neural probabilistic language model and, in terms of architecture, comprises the CBOW model and the Skip-gram model. This paper introduces the Word2vec technique. First, the theory behind the Word2vec architectures is elaborated; second, an English corpus extracted from Wikipedia is used to train the model and a set of results is shown; finally, the application of Word2vec to the Chinese language is explored and the corresponding results are presented.
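
The paper itself gives no code; the sketch below is only an illustrative example of the workflow the abstract describes (training CBOW and Skip-gram word vectors on an English corpus, then applying the same technique to Chinese after word segmentation). The use of the gensim and jieba libraries, the toy sentences, and all parameter values are assumptions for illustration, not the authors' implementation.

```python
# Minimal illustrative sketch (not the paper's code): CBOW and Skip-gram
# training with gensim (>= 4.0 parameter names), plus Chinese handling via
# jieba segmentation, since Word2vec expects pre-tokenized sentences.
from gensim.models import Word2Vec
import jieba  # Chinese word segmentation

# Tiny stand-in for the English Wikipedia corpus mentioned in the abstract.
english_sentences = [
    ["word2vec", "learns", "distributed", "word", "representations"],
    ["cbow", "predicts", "a", "word", "from", "its", "context"],
    ["skip", "gram", "predicts", "the", "context", "from", "a", "word"],
]

# sg=0 selects the CBOW architecture, sg=1 selects Skip-gram.
cbow_model = Word2Vec(english_sentences, vector_size=100, window=5,
                      min_count=1, sg=0, workers=2)
skipgram_model = Word2Vec(english_sentences, vector_size=100, window=5,
                          min_count=1, sg=1, workers=2)

# Nearest neighbours of a word in the learned vector space.
print(cbow_model.wv.most_similar("word", topn=3))

# Chinese text has no whitespace between words, so segment it first.
chinese_sentences = ["词向量可以表示词语的语义", "词向量模型包括CBOW和Skip-gram"]
tokenized = [jieba.lcut(s) for s in chinese_sentences]
chinese_model = Word2Vec(tokenized, vector_size=100, window=5,
                         min_count=1, sg=1, workers=2)
print(chinese_model.wv.index_to_key[:10])  # learned Chinese vocabulary
```

In practice a real corpus (e.g. a Wikipedia dump) would be streamed sentence by sentence rather than held in a list, and min_count would be raised to filter rare words; the toy settings above exist only to make the example self-contained.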

Memo

Memo:
-
Last Update: 2015-03-30