[1] Yang Yang, Liu Huidong. An Improved Algorithm for Feature Selection Based on Pairwise Constraint[J]. Journal of Nanjing Normal University (Engineering and Technology), 2011, 11(01): 56-61.

An Improved Algorithm for Feature Selection Based on Pairwise Constraint

Journal of Nanjing Normal University (Engineering and Technology) [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
Vol. 11
Issue:
No. 01, 2011
Pages:
56-61
Publication Date:
2011-03-09

Article Information

Title:
An Improved Algorithm for Feature Selection Based on Pairwise Constraint
Author(s):
Yang Yang 1, Liu Huidong 2
1. Intensification Culture School, Nanjing Normal University, Nanjing 210046, China; 2. School of Computer Science and Technology, Nanjing Normal University, Nanjing 210046, China
Keywords:
machine learning; feature selection; pairwise constraint; classification
CLC Number:
TP181
Abstract (Chinese, translated):
Feature selection algorithms based on pairwise constraints obtain a feature ranking by measuring the importance of each individual feature, but a subset composed of individually important features is not necessarily the most effective one. This paper therefore proposes an improved pairwise-constraint-based feature selection algorithm that evaluates feature subsets instead: in each round it selects the feature that makes the enlarged subset most effective, thereby producing an effective feature ranking. Experiments show that the proposed algorithm is effective and feasible.
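The baseline approach referred to above is, per reference [11], the Constraint Score, which ranks features one at a time by comparing each feature's spread over must-link pairs with its spread over cannot-link pairs. The sketch below (Python; the function name, the small eps term, and the exact normalisation are illustrative assumptions, not the paper's code) shows how such a single-feature score can be computed and used to rank features.

```python
# Minimal sketch of a single-feature pairwise-constraint score (baseline idea of ref. [11]).
# All identifiers are illustrative; they are not taken from the paper.
import numpy as np

def constraint_score(feature_values, must_link, cannot_link, eps=1e-12):
    """Score one feature; lower is better (small must-link spread, large cannot-link spread)."""
    f = np.asarray(feature_values, dtype=float)
    ml = sum((f[i] - f[j]) ** 2 for i, j in must_link)    # spread over same-class pairs
    cl = sum((f[i] - f[j]) ** 2 for i, j in cannot_link)  # spread over different-class pairs
    return ml / (cl + eps)

# Ranking features individually, as the baseline algorithm does:
# scores = [constraint_score(X[:, r], M, C) for r in range(X.shape[1])]
# ranking = np.argsort(scores)
```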
Abstract:
Feature selection is a key issue in machine learning. Compared with unsupervised feature selection methods, supervised approaches generally perform better. However, most existing supervised feature selection algorithms assume that class labels are available as the supervision, so they cannot be applied when the supervision takes the form of pairwise constraints. In real applications, pairwise constraints are easier to obtain than labels. Researchers have therefore proposed a feature selection algorithm based on pairwise constraints, which obtains a feature ranking by measuring the significance of each single feature; in practice, however, the subset formed by the individually most important features may not be an effective feature subset. In this paper we introduce an improved feature selection algorithm based on pairwise constraints that evaluates the importance of a feature subset rather than of a single feature: it starts from the empty subset and, in each round, extends the subset with the feature whose inclusion is most effective, thereby producing an effective feature ranking. Experimental results show that the proposed algorithm is effective and feasible.
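The improved strategy described above can be illustrated with the following minimal sketch. The subset criterion used here (a constraint score computed jointly over the selected features) and every identifier are assumptions made for illustration; the paper's exact measure of subset effectiveness may differ.

```python
# Minimal sketch of greedy forward selection over feature subsets, as described in the abstract.
# The subset score below is an assumed generalisation of the single-feature constraint score.
import numpy as np

def subset_constraint_score(X, subset, must_link, cannot_link, eps=1e-12):
    """Constraint score of a feature subset; lower is better."""
    Xs = X[:, subset]
    ml = sum(np.sum((Xs[i] - Xs[j]) ** 2) for i, j in must_link)
    cl = sum(np.sum((Xs[i] - Xs[j]) ** 2) for i, j in cannot_link)
    return ml / (cl + eps)

def greedy_forward_selection(X, must_link, cannot_link):
    """Start from the empty subset; in each round add the feature whose inclusion
    gives the best (lowest) subset score, yielding a ranked feature list."""
    remaining = list(range(X.shape[1]))
    ranking = []
    while remaining:
        best = min(remaining, key=lambda r: subset_constraint_score(
            X, ranking + [r], must_link, cannot_link))
        ranking.append(best)
        remaining.remove(best)
    return ranking
```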

References:

[1] Liu H, Motoda H. Feature Selection for Knowledge Discovery and Data Mining[M]. Boston: Kluwer, 1998.
[2] Yu L, Liu H. Feature selection for high-dimensional data: a fast correlation-based filter solution[C]// Proceedings of the 20th International Conference on Machine Learning. Washington DC, 2003: 856-863.
[3] Kohavi R, John G. Wrappers for feature subset selection[J]. Artificial Intelligence, 1997, 97(1/2): 273-324.
[4] Mao Yong, Zhou Xiaobo, Xia Zheng, et al. A survey for study of feature selection algorithms[J]. Pattern Recognition & Artificial Intelligence, 2007, 20(2): 211-218. (in Chinese)
[5] Zhu Haodong, Li Hongchan, Zhong Yong. New unsupervised feature selection method[J]. Journal of University of Electronic Science and Technology of China, 2010, 39(3): 412-415. (in Chinese)
[6] Mitra P, Murthy C A, Pal S K. Unsupervised feature selection using feature similarity[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(3): 301-312.
[7] Bishop C M. Neural Networks for Pattern Recognition[M]. Oxford: Oxford University Press, 1995.
[8] Wang Bo, Huang Jiuming, Jia Yan, et al. Research on a common feature selection method for multiple supervised models[J]. Journal of Computer Research and Development, 2010, 47(9): 1548-1557. (in Chinese)
[9] Zhao Z, Liu H. Semi-supervised feature selection via spectral analysis[C]// Proceedings of the 7th SIAM International Conference on Data Mining. Minneapolis, MN, 2007: 641-646.
[10] Xing E P, Ng A Y, Jordan M I, et al. Distance metric learning, with application to clustering with side-information[C]// Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS). 2002: 505-512.
[11] Zhang D, Chen S, Zhou Z H. Constraint Score: a new filter method for feature selection with pairwise constraints[J]. Pattern Recognition, 2008, 41(5): 1440-1451.
[12] Witten I H, Frank E. Data Mining: Practical Machine Learning Tools and Techniques[M]. 2nd ed. San Francisco: Morgan Kaufmann, 2005.

Similar Articles:

[1] Wan Wenqiang, Zhang Lingwei. Privacy Preserving Feature Selection in Distributed Environment[J]. Journal of Nanjing Normal University (Engineering and Technology), 2012, 12(03): 060.
[2] Yang Yang, Lü Jing. Some Studies on Feature Selection for High Dimensional Data[J]. Journal of Nanjing Normal University (Engineering and Technology), 2012, 12(01): 057.
[3] Zhao Hongyan, Qu Weiguang, et al. Chinese Verb Metaphor Recognition Based on Machine Learning and Semantic Knowledge[J]. Journal of Nanjing Normal University (Engineering and Technology), 2011, 11(03): 059.
[4] Bai Hongquan, Han Qingnian. Application of Machine Learning in Adaptive Instructional System[J]. Journal of Nanjing Normal University (Engineering and Technology), 2007, 07(04): 076.
[5] Ling Xiaohan, Ji Genlin. A Clustering Ensemble Based Unsupervised Feature Selection Approach[J]. Journal of Nanjing Normal University (Engineering and Technology), 2007, 07(03): 060.
[6] Sun Liangjun, Fan Jianfeng, Yang Wanqi, et al. Group Lasso-Based Feature Selection for Off-network Analysis in Multisource Teledata[J]. Journal of Nanjing Normal University (Engineering and Technology), 2014, 14(04): 077.
[7] Wu Xinghui, Wu Di, Zhou Yuping, et al. Photocatalytic Activity Prediction of Rare Earth Doped TiO2 Based on Machine Learning Algorithm[J]. Journal of Nanjing Normal University (Engineering and Technology), 2017, 17(03): 087. [doi:10.3969/j.issn.1672-1292.2017.03.013]
[8] Liu Jinjing, Wang Liying. Regression Model Research on Posting Quality Evaluation in Online Learning Community[J]. Journal of Nanjing Normal University (Engineering and Technology), 2020, 20(01): 033. [doi:10.3969/j.issn.1672-1292.2020.01.006]
[9] Zong Ying, Li Yufeng, Liu Hongyu. A Study of Coastal Wetland Vegetation Classification Based on Object-oriented Random Forest Method[J]. Journal of Nanjing Normal University (Engineering and Technology), 2021, 21(04): 047. [doi:10.3969/j.issn.1672-1292.2021.04.008]
[10] Qiu Yu, Li Wenkui, Li Mian. Intelligent Subjective Evaluation of Virtual Road Load Data Accuracy Based on Random Forest[J]. Journal of Nanjing Normal University (Engineering and Technology), 2022, 22(04): 045. [doi:10.3969/j.issn.1672-1292.2022.04.006]

Memo:
Fund project: Nanjing Normal University Student Science Fund, 2010. Corresponding author: Liu Huidong, master's degree; research interests: sparsification and multi-task learning. E-mail: huidong-liu@163.com
Last Update: 2013-03-21