
Research of Disambiguating Combinational Ambiguity in Chinese Word Segmentation Based on CRF(PDF)

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Issue:
2008, Issue 04
Page:
73-76
Research Field:
Publishing date:

Info

Title:
Research of Disambiguating Combinational Ambiguity in Chinese Word Segmentation Based on CRF
Author(s):
Ding Dexin 1, Qu Weiguang 1, Xu Tao 1, Dong Yu 2
1. School of Mathematics and Computer Science, Nanjing Normal University, Nanjing 210097, China; 2. Longpan School, Jinling Institute of Technology, Nanjing 211169, China
Keywords:
Chinese word segmentation; combinational ambiguity; CRF
PACS:
TP311
DOI:
-
Abstract:
Combinational ambiguity is one of the difficult points in Chinese word segmentation. Based on the CRF (Conditional Random Fields) model, this paper establishes feature templates from the contextual words and parts of speech surrounding the ambiguous word. Ten frequently used ambiguous words are tested using half of the 1998 "People's Daily" corpus, and the average accuracy is 96.35%. The experimental results show that the model is effective for disambiguation.
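The abstract does not give the feature templates themselves. As a rough illustration of what a template over contextual words and parts of speech might look like, the following Python sketch collects the words and tags in a window around a candidate ambiguous string. The window size of 2, the feature names `w[k]` and `p[k]`, the `<PAD>` marker, and the example sentence containing 才能 are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); tag set is assumed


def context_features(tokens: List[Token], index: int, window: int = 2) -> Dict[str, str]:
    """Collect the words and POS tags around tokens[index].

    Each relative offset k in [-window, window], k != 0, yields two
    features: w[k] (the word) and p[k] (its tag). Positions outside
    the sentence are filled with a hypothetical "<PAD>" marker.
    """
    features: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the ambiguous string itself
        position = index + offset
        if 0 <= position < len(tokens):
            word, tag = tokens[position]
        else:
            word, tag = "<PAD>", "<PAD>"
        features[f"w[{offset:+d}]"] = word
        features[f"p[{offset:+d}]"] = tag
    return features


# Illustrative sentence (not from the paper): the string at index 2 could be
# one word (才能, "talent") or two words (才 + 能, "only then can").
sentence: List[Token] = [("他", "r"), ("的", "u"), ("才能", "n"), ("很", "d"), ("高", "a")]
features = context_features(sentence, index=2)
```

Feature dictionaries of this kind, paired with a label saying whether the string should be kept whole or split, are the sort of input a CRF toolkit would be trained on; the actual templates and labels used in the paper may differ.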

Memo

Memo:
-
Last Update: 2013-04-24