
Applying CLUCENE in Corpus Building

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Issue:
2008, No. 04
Page:
118-122
Research Field:
Publishing date:

Info

Title:
Applying CLUCENE in Corpus Building
Author(s):
He Sheng1, Qu Weiguang2, Lu Yajun3
1. School of Chinese Language and Literature, Nanjing Normal University, Nanjing 210097, China; 2. School of Mathematics and Computer Science, Nanjing Normal University, Nanjing 210097, China; 3. School of Tibetan Language and Culture, Northwest University for Nation
Keywords:
CLucene; corpus; corpus building
PACS:
TP391.1
DOI:
-
Abstract:
This paper examines in depth the construction models used in current corpus design and the functions a corpus should provide. A new corpus design based on the file system and the CLucene full-text search engine package is proposed. Experiments show that CLucene provides various types of interfaces and can be easily extended to handle large quantities of data. These characteristics make the package a promising platform for corpus building.

References:

[1] He Tingting. Study on data management of corpus[C]//Proceedings of the 1st Students Workshop on Computational Linguistics. Beijing: Tsinghua University Press, 2002: 307-310. (in Chinese)
[2] Jin Tianrong. Research on the document database and relationship database[J]. Microcomputer Information, 2008(3): 173-174. (in Chinese)
[3] Fu Aiping. Study and application summarization of corpus[DB/OL]. [2007-10-22]. http://ccl.pku.edu.cn/doubtfire/CorpusLinguistics/Introduction/FuAiping-Corpus-introduction.pdf. (in Chinese)
[4] He Sheng. Research of full-text retrieval system for large-scale corpus[J]. Library & Information, 2008(4): 93-97. (in Chinese)
[5] He Sheng. Chinese full-text retrieval system based on Lucene[J]. Chinese University Technology Transfer, 2007(6): 142-144. (in Chinese)
[6] CLucene - a C++ Search Engine[EB/OL]. [2007-10-12]. http://sourceforge.net/projects/clucene.

Memo

Memo:
-
Last Update: 2013-04-24