
Online Hierarchical Streaming Feature Selection Based on Feature Interaction

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Issue:
2024, Issue 02
Page:
34-42
Research Field:
Computer Science and Technology
Publishing date:

Info

Title:
Online Hierarchical Streaming Feature Selection Based on Feature Interaction
Author(s):
Kong Lingwei (1,2), Cai Linsheng (1,2), Lin Shaojie (1,2), Lin Yaojin (1,2)
(1. School of Computer Science, Minnan Normal University, Zhangzhou 363000, China)
(2. Fujian Key Laboratory of Data Science and Intelligence Application, Minnan Normal University, Zhangzhou 363000, China)
Keywords:
online streaming feature selection; hierarchical classification; feature interaction; sibling strategy; neighborhood rough set
PACS:
TP181
DOI:
10.3969/j.issn.1672-1292.2024.02.005
Abstract:
In classification tasks in open, dynamic environments, the feature space changes over time and the label space has a hierarchical structure. Existing online streaming feature selection algorithms for hierarchical classification can select a good feature subset, but they ignore the interactions between features. This paper therefore proposes an online streaming feature selection algorithm for hierarchical classification based on feature interaction. First, a measure based on hierarchical neighborhood dependency is designed to judge feature interaction. Second, for hierarchically structured data, a neighborhood rough set model is defined on the basis of sibling relationships between nodes in the hierarchy. Finally, an online streaming framework for hierarchical classification is designed, combining online importance analysis, online redundancy analysis, and online interaction analysis, to select a subset of features that are strongly correlated and interact with one another. Experiments on six hierarchical datasets show that the proposed algorithm achieves superior overall performance.
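
The idea of a neighborhood dependency computed under a sibling strategy can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `sibling_neighborhood_dependency`, the `parent` mapping, the Euclidean distance, and the radius `delta` are assumptions made for illustration, and the abstract does not give the paper's exact definitions. The sketch assumes that a sample is only compared against samples of its own class or of sibling classes (classes sharing the same parent node), and that it counts as consistent when every such neighbor within `delta` has the same class.

```python
import numpy as np

def sibling_neighborhood_dependency(X, y, parent, delta=0.5):
    """Fraction of samples whose delta-neighborhood contains only their own class.

    Assumption (illustrative, not from the paper): the neighborhood of a sample
    is searched only among samples whose class has the same parent node.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if X.ndim == 1:
        X = X[:, None]  # treat a single feature as one column
    parents = np.array([parent[c] for c in y])
    consistent = 0
    for i in range(len(y)):
        # Candidates: the sample's own class and its sibling classes.
        mask = parents == parents[i]
        dist = np.linalg.norm(X[mask] - X[i], axis=1)
        neighbors = y[mask][dist <= delta]
        consistent += int(np.all(neighbors == y[i]))
    return consistent / len(y)

# Toy hierarchy: classes "a" and "b" are siblings under "p"; "c" sits under "q".
parent = {"a": "p", "b": "p", "c": "q"}
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0], [0, 0]], dtype=float)
y = np.array(["a", "a", "b", "b", "c"])

dep_f0 = sibling_neighborhood_dependency(X[:, [0]], y, parent)  # feature 0 alone
dep_f1 = sibling_neighborhood_dependency(X[:, [1]], y, parent)  # feature 1 alone
dep_both = sibling_neighborhood_dependency(X, y, parent)        # both features
print(dep_f0, dep_f1, dep_both)  # 0.2 0.2 1.0
```

In this toy data, classes "a" and "b" follow an XOR pattern, so each feature alone separates almost nothing (dependency 0.2) while the pair separates every sample (dependency 1.0). A gap of this kind between joint and individual dependency is one simple way to read "feature interaction"; the paper's own criterion may differ.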


Last Update: 2024-06-15