Abstract
Label ranking addresses prediction problems in which each instance is mapped to a ranking over a fixed set of labels. A common way to improve the prediction performance of label ranking models is to use ensembles, which combine the outcomes of multiple simple base models into a single aggregated prediction. Here, ensemble learning is realized by combining nearest-neighbor estimation with a bagging approach that resamples the instances of the training data. The rationale for this label ranking ensemble is established through a detailed analysis of suitable existing algorithms. The results show that the parameter settings used in the k-nearest neighbor label ranker yield better prediction performance than existing label ranking ensembles, with accuracies of 85% to 99% on 21 label ranking datasets. Any such ensemble can be improved further through a voting rule selection procedure. Integrating the Voting Rule Selector (VRS) algorithm and seven commonly used voting rules with the k-nearest neighbor label ranker, this study finds that VRS and Copeland aggregation work more efficiently than Borda aggregation in dataset-level learning. The k-nearest neighbor label ranker with VRS or Copeland aggregation ranks first on most of the datasets. At the dataset level, VRS achieves an average improvement of 48.02% over the simple k-nearest neighbor model, almost equal to Copeland's 47.84%.
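The aggregation step can be illustrated with a minimal sketch of two of the voting rules the study compares, Borda and Copeland. This is an assumption-laden illustration rather than the paper's implementation: in the actual ensemble, the input rankings would come from the bagged nearest-neighbor base models, and VRS would select which rule to apply per dataset.

```python
from itertools import combinations

def borda(rankings):
    """Aggregate a list of rank vectors (1 = best) with the Borda count."""
    n = len(rankings[0])
    scores = [0] * n
    for r in rankings:
        for label, pos in enumerate(r):
            scores[label] += n - pos  # a label ranked 1st among n labels earns n-1 points
    return _scores_to_ranks(scores)

def copeland(rankings):
    """Aggregate with the Copeland rule: pairwise wins minus pairwise losses."""
    n = len(rankings[0])
    scores = [0] * n
    for a, b in combinations(range(n), 2):
        a_wins = sum(1 for r in rankings if r[a] < r[b])  # voters ranking a above b
        b_wins = len(rankings) - a_wins
        if a_wins > b_wins:
            scores[a] += 1
            scores[b] -= 1
        elif b_wins > a_wins:
            scores[b] += 1
            scores[a] -= 1
    return _scores_to_ranks(scores)

def _scores_to_ranks(scores):
    """Turn per-label scores into a consensus rank vector (1 = best)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, label in enumerate(order, start=1):
        ranks[label] = rank
    return ranks

# Three "voters" (e.g. nearest neighbors) each rank three labels:
votes = [[1, 2, 3], [1, 3, 2], [2, 1, 3]]
print(borda(votes))     # -> [1, 2, 3]
print(copeland(votes))  # -> [1, 2, 3]
```

Both rules reduce the neighbors' rankings to per-label scores and then sort; they can disagree when pairwise majorities and positional scores diverge, which is exactly the situation a per-dataset selector such as VRS is meant to exploit.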
Availability of data and material
The datasets are publicly available and can be accessed from the KEBI Data Repository of the Philipps University of Marburg (https://www.uni-marburg.de/de/fb12) or from https://en.cs.uni-paderborn.de/is/research/research-projects/software/label-ranking-datasets
Code availability
The package Scikit-lr is publicly available at https://pypi.org/project/scikit-lr/ or on GitHub at https://github.com/alfaro96/scikit-lr
The source code of the VRS algorithm is publicly available in the Shared Resources section of http://bigdatalab.tau.ac.il/
Ethics declarations
Conflicts of interest/competing interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Suchithra, M.S., Pai, M.L. Evaluating the performance of bagging-based k-nearest neighbor ensemble with the voting rule selection method. Multimed Tools Appl 81, 20741–20762 (2022). https://doi.org/10.1007/s11042-022-12716-3