[1] Liu Jiali, Xu Jianhua. An Empirical Comparison of Label Detection Techniques for Multi-Label Classification[J]. Journal of Nanjing Normal University (Engineering and Technology), 2012, 12(04): 55-61.

An Empirical Comparison of Label Detection Techniques for Multi-Label Classification

Journal of Nanjing Normal University (Engineering and Technology) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
12
Issue:
2012, No. 04
Pages:
55-61
Publication date:
2012-12-20

Article Info

Title:
An Empirical Comparison of Label Detection Techniques for Multi-Label Classification
Author(s):
Liu Jiali, Xu Jianhua
School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023, China
Keywords:
multi-label classification; k-nearest neighbor algorithm; linear regression threshold function; multi-output linear regression; logistic regression; discrete Bayesian rule
CLC number:
TP18
Abstract:
Some current multi-label classification algorithms are, in essence, a cascade of two classification techniques: the first stage builds a label ranking system, and the second stage detects the relevant labels while further improving classification performance. This paper studies the label detection stage. We collect and implement four general label detection techniques: the linear regression threshold method, multi-output linear regression, logistic regression, and the discrete Bayesian rule. With the k-nearest neighbor algorithm as the baseline, we compare them experimentally on ten benchmark data sets. The results show that multi-output linear regression is the recommended method in terms of both computational time and classification performance.
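The two-stage cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the neighbor-fraction scoring, the bias column, and the 0.5 cutoff are all assumptions made for illustration. Stage 1 ranks labels with k-nearest-neighbor scores; stage 2 fits a multi-output linear regression from those scores to the label indicators and marks a label as relevant when its regression output exceeds the cutoff.

```python
import numpy as np

def knn_label_scores(X_train, Y_train, X_query, k=3, exclude_self=False):
    """Stage 1 (ranking): score each label by the fraction of the
    k nearest training neighbors that carry it."""
    # Squared Euclidean distances, shape (n_query, n_train).
    d = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    if exclude_self:
        # Only meaningful when X_query is X_train: ignore each point itself.
        np.fill_diagonal(d, np.inf)
    idx = np.argsort(d, axis=1)[:, :k]
    return Y_train[idx].mean(axis=1)

def fit_multi_output_regression(S, Y):
    """Stage 2 (detection): least-squares map from label scores S
    to label indicators Y, with an added bias column (an assumption)."""
    A = np.hstack([S, np.ones((S.shape[0], 1))])
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W

def detect_labels(S, W, cutoff=0.5):
    """Mark a label relevant when its regression output exceeds the
    cutoff (0.5 is an assumed value, not taken from the paper)."""
    A = np.hstack([S, np.ones((S.shape[0], 1))])
    return (A @ W > cutoff).astype(int)

if __name__ == "__main__":
    # Toy data: two clusters, each with its own single label.
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
    Y = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])
    S_train = knn_label_scores(X, Y, X, k=2, exclude_self=True)
    W = fit_multi_output_regression(S_train, Y)
    X_new = np.array([[0.2, 0.1], [5.5, 5.5]])
    print(detect_labels(knn_label_scores(X, Y, X_new, k=2), W))
```

Because the detection stage here is a single least-squares solve shared by all labels, it is cheap to train, which is consistent with the abstract's remark about computational time; the sketch makes no claim about the paper's actual formulation or results.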


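Memo:
Funding: National Natural Science Foundation of China (60875001).
Corresponding author: Xu Jianhua, Ph.D., Professor; research interests: pattern recognition, machine learning, and bioinformatics. E-mail: xujianhua@njnu.edu.cn
Last Update: 2013-03-21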