
Weighted k-means Algorithm Based on Outlier Detection

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Issue:
2022, Issue 01
Page:
75-80
Research Field:
Machine Learning
Publishing date:

Info

Title:
Weighted k-means Algorithm Based on Outlier Detection
Author(s):
Hu Haojie1, Chen Hui2, Mu Tingting3, Yao Minli1, He Fang1, Zhang Fenggan1
(1. Rocket Force Engineering University, Xi’an 710025, China) (2. The Fourth Academy of China Aerospace Science and Technology Corporation, Xi’an 710025, China) (3. Beijing New Era Global Import and Export Co., Ltd., Beijing 100027, China)
Keywords:
clustering; k-means; outlier detection; 0-norm
PACS:
TP391
DOI:
10.3969/j.issn.1672-1292.2022.01.011
Abstract:
In k-means clustering, a few outliers can easily destroy the cluster structure and cause the obtained centroids to deviate significantly. To address this problem, we assign different weights to the data points based on their distance from the potential cluster center, which alleviates the negative impact of outliers on the data structure. We also incorporate outlier detection into the clustering model by imposing a 0-norm constraint on the weight assignments. To optimize the model, we introduce an efficient alternating minimization algorithm. Extensive experiments on both synthetic and real datasets show the effectiveness of the proposed model.
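
The abstract does not state the model's objective or update rules, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes binary weights, so that the 0-norm constraint reduces to "exactly n - n_outliers points carry nonzero weight", and it alternates between assigning points, re-selecting the zero-weight points, and recomputing the centers. This simplification amounts to a trimmed k-means variant. The function name and the parameters `n_outliers` and `init_centers` are hypothetical.

```python
import numpy as np


def weighted_kmeans_outliers(X, k, n_outliers, n_iter=50, seed=0, init_centers=None):
    """Trimmed k-means sketch: binary weights w with ||w||_0 = n - n_outliers.

    Assumption: this is a simplified stand-in for a weighted k-means model
    with a 0-norm constraint, not the method from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if init_centers is None:
        # Random initialization from the data (may pick an outlier).
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.array(init_centers, dtype=float)
    weights = np.ones(n)

    for _ in range(n_iter):
        # Assignment step: squared distance from each point to each center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        nearest = d2[np.arange(n), labels]

        # Weight step: give zero weight to the n_outliers farthest points.
        new_weights = np.ones(n)
        if n_outliers > 0:
            new_weights[np.argsort(nearest)[n - n_outliers:]] = 0.0

        # Center step: mean of the nonzero-weight points in each cluster.
        new_centers = centers.copy()
        for j in range(k):
            mask = (labels == j) & (new_weights > 0)
            if mask.any():
                new_centers[j] = X[mask].mean(axis=0)

        converged = (np.allclose(new_centers, centers)
                     and np.array_equal(new_weights, weights))
        centers, weights = new_centers, new_weights
        if converged:
            break

    return centers, labels, weights


# Usage: two tight clusters plus two far-away points.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11],
              [50, 50], [-40, 30]], dtype=float)
centers, labels, weights = weighted_kmeans_outliers(
    X, k=2, n_outliers=2, init_centers=[[0, 0], [10, 10]])
```

With the explicit initial centers above, the two distant points end up with zero weight and do not pull the centers away from the two clusters. With random initialization an outlier can be chosen as a starting center, so in practice several restarts or a more careful initialization would likely be needed.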


Memo

Last Update: 2022-03-15