
Evaluating the performance of bagging-based k-nearest neighbor ensemble with the voting rule selection method

Published in Multimedia Tools and Applications (2022)

Abstract

Label ranking problems require learning a mapping from instances to rankings over a finite set of labels. The principal way to improve the prediction performance of label ranking models is to use ensembles, which train multiple simple base models and combine their individual outputs into a single aggregated prediction. Here, the diversity that ensemble learning requires is obtained by applying nearest neighbor estimation with a bagging approach that bootstrap-samples the training instances. A detailed analysis of suitable existing algorithms motivates the choice of this label ranking ensemble approach. The results show that the parameter settings of the k-nearest neighbor label ranker yield better prediction performance than existing label ranking ensembles, with accuracies of 85% to 99% on 21 label ranking datasets. Any such ensemble can be improved further by a voting rule selection procedure. Integrating the Voting Rule Selector (VRS) algorithm and seven commonly used voting rules with the k-nearest neighbor label ranker, this study finds that VRS and Copeland aggregation work more effectively than Borda aggregation in dataset-level learning. The k-nearest neighbor label ranker with VRS or Copeland aggregation ranks first on most of the datasets. At the dataset level, VRS obtains an average improvement of 48.02% over the simple k-nearest neighbor model, almost equal to Copeland's 47.84%.
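To make the approach concrete, the following is a minimal illustrative sketch of a bagging-based k-nearest neighbor label ranker with Borda and Copeland aggregation. All class and function names here are hypothetical; this is not the authors' implementation or the scikit-lr API, and it simplifies the method (Euclidean distances, strict-majority Copeland with pairwise ties ignored):

```python
# Illustrative sketch only: hypothetical names, simplified logic.
import numpy as np

def borda(rankings):
    """Borda aggregation of rankings (n_voters, n_labels); rank 1 = best.
    Each label earns (n_labels - rank) points per voter."""
    rankings = np.asarray(rankings)
    n_labels = rankings.shape[1]
    scores = (n_labels - rankings).sum(axis=0)
    order = np.argsort(-scores)                # best label first
    ranks = np.empty(n_labels, dtype=int)
    ranks[order] = np.arange(1, n_labels + 1)  # back to a rank vector, 1 = best
    return ranks

def copeland(rankings):
    """Copeland aggregation: +1 for each pairwise strict-majority win
    (pairwise ties are ignored in this simplified version)."""
    rankings = np.asarray(rankings)
    n_voters, n_labels = rankings.shape
    wins = np.zeros(n_labels)
    for a in range(n_labels):
        for b in range(n_labels):
            if a != b and (rankings[:, a] < rankings[:, b]).sum() > n_voters / 2:
                wins[a] += 1                   # a is ranked above b by a majority
    order = np.argsort(-wins)
    ranks = np.empty(n_labels, dtype=int)
    ranks[order] = np.arange(1, n_labels + 1)
    return ranks

class BaggedKNNLabelRanker:
    """Hypothetical bagged k-NN label ranker: each base model is the k-NN rule
    on one bootstrap sample; base predictions are combined by `aggregate`."""
    def __init__(self, n_estimators=50, k=5, aggregate=borda, seed=0):
        self.n_estimators, self.k, self.aggregate = n_estimators, k, aggregate
        self.rng = np.random.default_rng(seed)

    def fit(self, X, Y):
        self.X, self.Y = np.asarray(X, float), np.asarray(Y)
        n = len(self.X)
        # one bootstrap sample (indices drawn with replacement) per base model
        self.bags = [self.rng.integers(0, n, n) for _ in range(self.n_estimators)]
        return self

    def predict_one(self, x):
        votes = []
        for idx in self.bags:
            Xb, Yb = self.X[idx], self.Y[idx]
            nn = np.argsort(((Xb - x) ** 2).sum(axis=1))[: self.k]  # k nearest in bag
            votes.append(borda(Yb[nn]))        # bag-level prediction from neighbors
        return self.aggregate(np.array(votes)) # combine bag votes with voting rule
```

Swapping the `aggregate` argument, e.g. `BaggedKNNLabelRanker(aggregate=copeland)`, changes only the final combination step, which is exactly the degree of freedom that voting rule selection exploits.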



Availability of data and material

The datasets are publicly available from the KEBI Data Repository of the Philipps University of Marburg (https://www.uni-marburg.de/de/fb12) and from https://en.cs.uni-paderborn.de/is/research/research-projects/software/label-ranking-datasets

Code availability

The package Scikit-lr is publicly available at https://pypi.org/project/scikit-lr/ and on GitHub at https://github.com/alfaro96/scikit-lr

The source code of the VRS algorithm is publicly available in the Shared Resources section of http://bigdatalab.tau.ac.il/
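As a rough illustration of the idea behind voting rule selection, a dataset-level selector can evaluate each candidate aggregation rule on a validation split and keep the best one. This is a simplification for exposition only: the VRS implementation linked above has its own selection procedure, and the `model_cls` interface and mean Kendall tau scoring used here are assumptions.

```python
# Simplified, hypothetical sketch of dataset-level voting rule selection.
# Not the actual VRS algorithm.
from scipy.stats import kendalltau

def select_voting_rule(model_cls, rules, X_train, Y_train, X_val, Y_val):
    best_rule, best_score = None, -2.0         # Kendall tau lies in [-1, 1]
    for rule in rules:
        model = model_cls(aggregate=rule).fit(X_train, Y_train)
        taus = [kendalltau(model.predict_one(x), y)[0]
                for x, y in zip(X_val, Y_val)]
        score = sum(taus) / len(taus)          # mean rank correlation on validation
        if score > best_score:
            best_rule, best_score = rule, score
    return best_rule, best_score
```

For instance, `select_voting_rule(BaggedKNNLabelRanker, [borda, copeland], ...)` would choose between the two rules sketched earlier based on held-out ranking accuracy.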

Author information


Corresponding author

Correspondence to M. S. Suchithra.

Ethics declarations

Conflicts of interest/competing interests

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Suchithra, M.S., Pai, M.L. Evaluating the performance of bagging-based k-nearest neighbor ensemble with the voting rule selection method. Multimed Tools Appl 81, 20741–20762 (2022). https://doi.org/10.1007/s11042-022-12716-3
