ABSTRACT
In high-dimensional multi-labeled data, each instance is associated with a set of class labels and is described by a large number of features, many of them noisy or irrelevant. Feature selection has been shown to improve classification performance in machine learning. In multi-label learning, selecting the features that discriminate among multiple labels raises several challenges: interdependent labels, label correlations that differ across instances, correlated features, and missing or flawed labels. This work is part of a project at the Tumour Bank of The Children's Hospital at Westmead (TB-CHW), Australia, to explore the genomics of childhood leukaemia. In this paper, we propose CMFS (Correlated- and Multi-label Feature Selection), a method based on non-negative matrix factorization (NMF) that performs feature selection while addressing these challenges. A major advantage of our method is that it exploits the correlation information contained in features, labels and instances to select the features relevant across multiple labels. Furthermore, ℓ2,1-norm regularization is incorporated in the objective function to perform feature selection by imposing sparsity on the rows of the feature matrix. CMFS decomposes the data and multi-label matrices into a low-dimensional space. To optimize the objective function, we propose an efficient iterative algorithm with guaranteed convergence. Finally, extensive experiments are conducted on high-dimensional multi-labeled datasets. The experimental results demonstrate that our method significantly outperforms state-of-the-art multi-label feature selection methods.
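To make the two building blocks named in the abstract concrete, the following is a minimal sketch and not the CMFS algorithm itself. It shows (a) the ℓ2,1-norm, the sum of the Euclidean norms of a matrix's rows, which is why penalizing it pushes whole rows (features) toward zero, and (b) a plain NMF with the standard multiplicative updates. The function names, the ranking rule by row norm, and the random initialization are illustrative assumptions; the paper's actual objective, its correlation terms, and its update rules are not reproduced here.

```python
import numpy as np

def l21_norm(W):
    """l2,1-norm: sum over rows of each row's Euclidean norm."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())

def rank_features(W, k):
    """Indices of the k rows of W with the largest Euclidean norm.

    Assumes each row of W corresponds to one feature (an illustrative
    convention, not taken from the paper).
    """
    row_norms = np.sqrt((W ** 2).sum(axis=1))
    return np.argsort(-row_norms, kind="stable")[:k]

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF, X ~ W @ H, using multiplicative updates.

    This is the unregularized baseline only; it has no l2,1 penalty
    and no correlation terms.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.random((n, rank))
    H = rng.random((rank, d))
    for _ in range(n_iter):
        # Element-wise updates keep both factors non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

For example, for a matrix with rows (3, 4), (0, 0) and (1, 0), the row norms are 5, 0 and 1, so the ℓ2,1-norm is 6 and the all-zero row ranks last: a feature whose row is driven to zero by the penalty is effectively discarded.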
Index Terms
- Multi-Label Feature Selection using Correlation Information