Abstract
Label ranking addresses prediction problems in which each instance is mapped to a ranking over a fixed set of labels. A common way to improve the prediction performance of label ranking models is to use ensembles, which combine the outcomes of multiple simple base models into a single aggregated prediction. Here, ensemble learning is realized by combining nearest-neighbor estimation with a bagging approach that resamples the instances of the training data. The rationale for this label ranking ensemble is established through a detailed analysis of suitable existing algorithms. The results show that the parameter settings used in the k-nearest neighbor label ranker yield better prediction performance than existing label ranking ensembles, with accuracies of 85% to 99% on 21 label ranking datasets. Any such ensemble can be improved further through a voting rule selection procedure. Integrating the Voting Rule Selector (VRS) algorithm and seven commonly used voting rules with the k-nearest neighbor label ranker, this study finds that VRS and Copeland aggregation work more efficiently than Borda aggregation in dataset-level learning. The k-nearest neighbor label ranker with VRS or Copeland aggregation ranks first on most of the datasets. At the dataset level, VRS achieves an average improvement of 48.02% over the simple k-nearest neighbor model, almost equal to Copeland's 47.84%.
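The aggregation step can be illustrated with a minimal sketch of two of the voting rules the study compares, Borda and Copeland. This is an assumption-laden illustration rather than the paper's implementation: in the actual ensemble, the input rankings would come from the bagged nearest-neighbor base models, and VRS would select which rule to apply per dataset.

```python
from itertools import combinations

def borda(rankings):
    """Aggregate a list of rank vectors (1 = best) with the Borda count."""
    n = len(rankings[0])
    scores = [0] * n
    for r in rankings:
        for label, pos in enumerate(r):
            scores[label] += n - pos  # a label ranked 1st among n labels earns n-1 points
    return _scores_to_ranks(scores)

def copeland(rankings):
    """Aggregate with the Copeland rule: pairwise wins minus pairwise losses."""
    n = len(rankings[0])
    scores = [0] * n
    for a, b in combinations(range(n), 2):
        a_wins = sum(1 for r in rankings if r[a] < r[b])  # voters ranking a above b
        b_wins = len(rankings) - a_wins
        if a_wins > b_wins:
            scores[a] += 1
            scores[b] -= 1
        elif b_wins > a_wins:
            scores[b] += 1
            scores[a] -= 1
    return _scores_to_ranks(scores)

def _scores_to_ranks(scores):
    """Turn per-label scores into a consensus rank vector (1 = best)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, label in enumerate(order, start=1):
        ranks[label] = rank
    return ranks

# Three "voters" (e.g. nearest neighbors) each rank three labels:
votes = [[1, 2, 3], [1, 3, 2], [2, 1, 3]]
print(borda(votes))     # -> [1, 2, 3]
print(copeland(votes))  # -> [1, 2, 3]
```

Both rules reduce the neighbors' rankings to per-label scores and then sort; they can disagree when pairwise majorities and positional scores diverge, which is exactly the situation a per-dataset selector such as VRS is meant to exploit.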
Availability of data and material
The datasets are publicly available and can be accessed from the KEBI Data Repository of the Philipps University of Marburg (https://www.uni-marburg.de/de/fb12) or from https://en.cs.uni-paderborn.de/is/research/research-projects/software/label-ranking-datasets
Code availability
The package Scikit-lr is publicly available at https://pypi.org/project/scikit-lr/ or on GitHub at https://github.com/alfaro96/scikit-lr
The source code of the VRS algorithm is publicly available in the Shared Resources section of http://bigdatalab.tau.ac.il/
Ethics declarations
Conflicts of interest/competing interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Suchithra, M.S., Pai, M.L. Evaluating the performance of bagging-based k-nearest neighbor ensemble with the voting rule selection method. Multimed Tools Appl 81, 20741–20762 (2022). https://doi.org/10.1007/s11042-022-12716-3