Abstract
Many real-world datasets are multi-label: each instance carries several class labels, and these labels are correlated rather than independent. Because such data are usually high-dimensional and existing multi-label feature selection methods are not sufficiently precise, a new feature selection method is needed. In this paper, for the first time, we model multi-label feature selection as a bipartite graph matching problem. The proposed method constructs a bipartite graph, called the Feature-Label Graph (FLG), with features as the left vertices and labels as the right vertices. Each feature is connected to the labels, and the weight of the edge between a feature and a label equals their correlation. The Hungarian algorithm then finds the best matching in the FLG. The features selected in each matching are sorted by weighted correlation distance and added to the ranking vector. To select discriminative features, the proposed method considers both the redundancy among features and the relevance of each feature to the class labels. The results indicate that the proposed method outperforms the compared methods on classification measures.
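The matching step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions: the function names `feature_label_weights` and `rank_features_from_weights` are hypothetical, absolute Pearson correlation stands in for the unspecified correlation measure, features within a round are ordered by their matched edge weight rather than by the paper's weighted correlation distance, and the redundancy term is omitted. SciPy's `linear_sum_assignment` solves the same assignment problem as the Hungarian algorithm, although it uses a different algorithm internally.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def feature_label_weights(X, Y):
    """Edge weights of the feature-label graph.

    Assumption: absolute Pearson correlation between each feature column
    of X (n_samples, n_features) and each label column of Y
    (n_samples, n_labels). Returns an (n_features, n_labels) matrix.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    norms = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Yc, axis=0))
    norms[norms == 0] = 1.0  # constant columns end up with weight 0
    return np.abs(Xc.T @ Yc) / norms


def rank_features_from_weights(W):
    """Rank features by repeated maximum-weight bipartite matching.

    Each round matches the remaining features to the labels, appends the
    matched features to the ranking, and removes them from the graph.
    Assumption: within a round, features are ordered by matched edge
    weight (descending), a stand-in for weighted correlation distance.
    """
    remaining = np.arange(W.shape[0])
    ranking = []
    while remaining.size > 0:
        sub = W[remaining]
        rows, cols = linear_sum_assignment(sub, maximize=True)
        order = np.argsort(-sub[rows, cols], kind="stable")
        ranking.extend(remaining[rows][order].tolist())
        remaining = np.delete(remaining, rows)
    return ranking
```

Given a data matrix `X` and a binary label matrix `Y`, calling `rank_features_from_weights(feature_label_weights(X, Y))` returns feature indices from most to least preferred under these simplifying assumptions; each round contributes at most as many features as there are labels.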
Cite this article
Hashemi, A., Dowlatshahi, M.B. & Nezamabadi-Pour, H. A bipartite matching-based feature selection for multi-label learning. Int. J. Mach. Learn. & Cyber. 12, 459–475 (2021). https://doi.org/10.1007/s13042-020-01180-w
DOI: https://doi.org/10.1007/s13042-020-01180-w