ABSTRACT
Feature selection aims to reduce the dimensionality of a data set and to obtain a feature subset that performs better for the target learner. Unsupervised feature selection is more challenging because label information is unavailable. In this paper, the idea of the decision graph is applied to unsupervised feature selection. Specifically, a 3-D decision graph model of features is proposed to reveal the characteristics of each feature and the relationships among features. A movable hyperplane is then constructed to select a specified number of features from the original feature space to form the final feature subset. Experiments on benchmark data sets, in comparison with both traditional and recent algorithms, show that the proposed method selects feature subsets with both lower redundancy and higher performance.
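The abstract does not spell out which three coordinates the decision graph uses or how the hyperplane moves, so the following is only a minimal sketch under stated assumptions: each feature is placed in a 3-D space built from a density/distance pair borrowed from density-peaks clustering plus raw variance as a relevance proxy, and the "movable hyperplane" is modeled as a fixed weight vector whose threshold is lowered until exactly k features lie above it (equivalent to taking the top-k scores). None of these choices should be read as the paper's actual construction.

```python
import numpy as np

def feature_decision_graph_select(X, k):
    """Illustrative sketch: rank features by three decision-graph-style
    coordinates, then keep the k features above a swept hyperplane.
    The coordinates and weights are assumptions, not the paper's method."""
    n_samples, n_features = X.shape
    # Standardize features so pairwise distances between them are comparable.
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    # Treat each feature (column) as a point; compute pairwise distances.
    F = Xs.T  # shape (n_features, n_samples)
    D = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
    # Coordinate 1: local density of each feature (Gaussian kernel),
    # as in density-peaks clustering; dense features have many near-duplicates.
    sigma = np.median(D[D > 0]) if np.any(D > 0) else 1.0
    rho = np.exp(-(D / sigma) ** 2).sum(axis=1) - 1.0  # drop self-contribution
    # Coordinate 2: distance to the nearest feature of higher density.
    delta = np.zeros(n_features)
    for i in range(n_features):
        denser = np.where(rho > rho[i])[0]
        delta[i] = D[i, denser].min() if denser.size else D[i].max()
    # Coordinate 3: raw variance as a crude relevance proxy (assumption).
    var = X.var(axis=0)
    def unit(v):
        rng = v.max() - v.min()
        return (v - v.min()) / (rng if rng > 0 else 1.0)
    # "Movable hyperplane" w . (rho, delta, var) >= b with equal weights;
    # lowering b until k features pass is the same as taking the top-k scores.
    score = unit(rho) + unit(delta) + unit(var)
    return np.argsort(score)[::-1][:k]
```

Sweeping the threshold rather than fixing it is what makes the selection size controllable: the hyperplane slides along its normal until exactly k feature points sit on the selected side.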
Index Terms
- Unsupervised Feature Selection Based on 3-D Feature Decision Graph for High-dimensional Data