[1] Kong Lingwei, Cai Linsheng, Lin Shaojie, et al. Online Hierarchical Streaming Feature Selection Based on Feature Interaction[J]. Journal of Nanjing Normal University (Engineering and Technology), 2024, 24(02): 034-42. [doi:10.3969/j.issn.1672-1292.2024.02.005]

Online Hierarchical Streaming Feature Selection Based on Feature Interaction

Journal of Nanjing Normal University (Engineering and Technology) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
24
Issue:
No. 02, 2024
Pages:
034-42
Section:
Computer Science and Technology
Publication date:
2024-06-15

Article Information

Title:
Online Hierarchical Streaming Feature Selection Based on Feature Interaction
Article ID:
1672-1292(2024)02-0034-09
Author(s):
Kong Lingwei 1,2, Cai Linsheng 1,2, Lin Shaojie 1,2, Lin Yaojin 1,2
(1. School of Computer Science, Minnan Normal University, Zhangzhou 363000, China)
(2. Fujian Key Laboratory of Data Science and Intelligence Application, Minnan Normal University, Zhangzhou 363000, China)
Keywords:
online streaming feature selection; hierarchical classification; feature interaction; sibling strategy; neighborhood rough set
CLC number:
TP181
DOI:
10.3969/j.issn.1672-1292.2024.02.005
Document code:
A
Abstract:
In classification tasks in open, dynamic environments, the feature space changes over time and the label space has a hierarchical structure. Existing online streaming feature selection algorithms for hierarchical classification can select good feature subsets, but they ignore the interactions between features. This paper therefore proposes an online streaming feature selection algorithm for hierarchical classification based on feature interaction. First, a method based on hierarchical neighborhood dependency is designed to judge whether features interact. Second, for hierarchically structured data, a neighborhood rough set model is defined according to the sibling relationships between nodes in the hierarchy. Finally, an online streaming framework for hierarchical classification is designed, comprising online importance analysis, online redundancy analysis, and online interaction analysis, to select a subset of features that are strongly relevant or interact with one another. Experiments on six hierarchical datasets show that the proposed algorithm achieves better overall performance.
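As a rough illustration of two ingredients named in the abstract, the Python sketch below computes a neighborhood dependency under a sibling strategy and uses it to test whether two features interact. The abstract does not give the paper's formulas, so everything here is an assumption rather than the authors' method: the function names `neighborhood_dependency` and `interaction_gain`, the radius `delta`, the Euclidean distance, the rule that a sample counts as consistent when no sibling-class sample falls inside its neighborhood, and the toy data are all hypothetical.

```python
from math import dist


def neighborhood_dependency(X, y, siblings, features, delta):
    """Share of samples whose delta-neighborhood, measured on `features`,
    contains no sample from a sibling class (assumed definition)."""
    if not features:
        return 0.0  # convention for this sketch: no features, no dependency
    points = [[row[f] for f in features] for row in X]
    consistent = 0
    for i, p in enumerate(points):
        # Sibling strategy (assumed): only samples whose class shares a
        # parent with y[i] are treated as negatives for sample i.
        clash = any(
            y[j] in siblings[y[i]] and dist(p, q) <= delta
            for j, q in enumerate(points)
            if j != i
        )
        consistent += not clash
    return consistent / len(X)


def interaction_gain(X, y, siblings, f, selected, delta):
    """Extra dependency that feature f contributes jointly with `selected`,
    beyond what f and `selected` achieve separately (assumed criterion)."""
    def dep(fs):
        return neighborhood_dependency(X, y, siblings, fs, delta)
    return dep(selected + [f]) - dep(selected) - dep([f])


# Hypothetical toy data: two sibling classes whose label is the XOR of
# two binary features, so neither feature is informative on its own.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = ["cat", "dog", "dog", "cat"]
siblings = {"cat": {"dog"}, "dog": {"cat"}}

print(neighborhood_dependency(X, y, siblings, [0], 0.5))
print(neighborhood_dependency(X, y, siblings, [0, 1], 0.5))
print(interaction_gain(X, y, siblings, 1, [0], 0.5))
```

On this toy set each feature alone yields a dependency of 0 while the pair yields 1, so the gain is positive. A filter that scored features only one at a time would discard both, which is the kind of case an interaction analysis is meant to catch.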



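Memo

Received: 2023-11-14.
Funding: General Program of the National Natural Science Foundation of China (62076116).
Corresponding author: Lin Yaojin, Ph.D., professor; research interests: data mining and granular computing. E-mail: zzlinyaojin@163.com
Last Update: 2024-06-15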