An Improved Efficient Bayesian Short Message Text Classifier

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Issue:
2014, Issue 03
Page:
70-
Research Field:
Publishing date:

Info

Title:
An Improved Efficient Bayesian Short Message Text Classifier
Author(s):
Zhang Yongjun, Liu Jinling
College of Computer Engineering, Huaiyin Institute of Technology, Huai'an 223003, China
Keywords:
short message; text classification; Bayesian; SVM; category energy space
PACS:
TP181
DOI:
-
Abstract:
A Bayesian classifier model is proposed to classify short messages according to their content. The concept of a category energy space is introduced, and each word feature is converted to an energy unit in that space. A short message is then represented as an energy vector based on its words. To obtain the probability of each category, the density of the energy vector is calculated and substituted into the Bayesian probability formula. When the category probabilities are not very different, an SVM model is used to reclassify the short message. The experimental results show that the proposed model achieves better classification results than the other classification methods compared.
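
The two-stage decision described in the abstract (Bayesian posterior first, a second classifier only when the top categories are close) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the category energy space, the energy vector density, or the closeness threshold, so the sketch substitutes a standard multinomial naive Bayes with Laplace smoothing for the first stage. The names `train_nb`, `posteriors`, `classify`, the `margin` parameter, and the `fallback` callable (standing in for the SVM) are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Collect counts for a multinomial naive Bayes model.

    docs: iterable of (tokens, label) pairs, where tokens is a list of words.
    Assumption: plain word counts replace the paper's energy units.
    """
    class_docs = Counter()              # number of messages per category
    word_counts = defaultdict(Counter)  # word frequencies per category
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def posteriors(tokens, model):
    """Return normalized category probabilities P(category | tokens)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    log_scores = {}
    for label, n_docs in class_docs.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for t in tokens:
            # Laplace (add-one) smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][t] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalize in log space to avoid underflow on longer messages.
    peak = max(log_scores.values())
    norm = sum(math.exp(s - peak) for s in log_scores.values())
    return {label: math.exp(s - peak) / norm for label, s in log_scores.items()}


def classify(tokens, model, fallback, margin=0.2):
    """Use the Bayesian result unless the top two categories are too close.

    fallback: hypothetical callable tokens -> label, standing in for the SVM.
    margin:   hypothetical threshold; the abstract gives no value.
    """
    post = posteriors(tokens, model)
    ranked = sorted(post.values(), reverse=True)
    if len(ranked) > 1 and ranked[0] - ranked[1] < margin:
        return fallback(tokens)
    return max(post, key=post.get)
```

In use, `fallback` would wrap a trained SVM's prediction function; any callable that maps a token list to a label works for experimentation. The choice of `margin` trades off how often the more expensive second stage runs against how many borderline messages it gets to reconsider.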

Memo

Memo:
-
Last Update: 2014-09-30