Abstract
This paper proposes a novel extension of the harmonic mean-based adaptive k-nearest neighbors (HMAKNN) algorithm, called scaled HMAKNN (SHMAKNN), which builds on HMAKNN’s strengths to achieve improved multi-class classification accuracy. HMAKNN uses a modified voting mechanism based on the harmonic mean and adaptive k-value selection to address issues such as sensitivity to k-value selection and the limitations of majority voting. SHMAKNN further improves the decision process by adjusting the components of the harmonic mean, focusing on voting values and the average distances of each class label. Additionally, SHMAKNN applies a re-scaling process to adjust the distances of the nearest neighbors within a specific range, enhancing the consistency of distances at different scales. These improvements help align the elements of the harmonic mean more effectively, leading to a balanced and less biased classification process. The study used 26 benchmark datasets, carefully curated to ensure accuracy and consistency and selected from diverse domains, to evaluate the proposed method on real-world problems. These datasets were chosen to represent challenges such as noise, imbalance, and sparsity, ensuring robustness in handling common data complexities. Additionally, small to medium-sized datasets were used to reduce the computational burden and allow for efficient evaluation. The evaluation results show that the proposed SHMAKNN models outperform existing methods in both accuracy and F1-score for datasets with four or more classes. Specifically, SHMAKNN achieved the highest average accuracy and F1-score (86.36% and 86.16%) compared to HMAKNN (86.10% and 85.74%) and traditional k-nearest neighbors (84.87% and 84.69%). The performance improvements were validated using Friedman’s test at a significance level of 0.05, confirming the statistical significance of the results.
Consequently, the findings indicate that the proposed algorithm exhibits remarkable performance, thereby confirming its reliability and validity in the context of real-world applications, particularly those involving multiple classes.
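The abstract does not state the formulas behind SHMAKNN, so the following is only a minimal sketch of the general idea it describes: rescale the neighbor distances to a common range, then combine each class's vote share with its average distance through a harmonic mean. The min-max scaling to [0, 1], the "closeness" term (one minus the mean scaled distance), the fixed k, and all function names are assumptions made for illustration and are not taken from the paper. The adaptive k-value selection is omitted.

```python
import math
from collections import defaultdict


def rescale(distances):
    """Min-max rescale distances to [0, 1] (assumed range, not from the paper)."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        # All neighbors are equally far away; treat them all as closest.
        return [0.0 for _ in distances]
    return [(d - lo) / (hi - lo) for d in distances]


def harmonic_mean(a, b):
    """Harmonic mean of two non-negative numbers, defined as 0 when both are 0."""
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)


def predict(train_X, train_y, query, k=5):
    """Score each class among the k nearest neighbors (hypothetical scoring rule)."""
    # k nearest neighbors by Euclidean distance, as (distance, label) pairs.
    neighbors = sorted(
        ((math.dist(x, query), y) for x, y in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    scaled = rescale([d for d, _ in neighbors])

    # Group the rescaled distances by class label.
    per_class = defaultdict(list)
    for s, (_, label) in zip(scaled, neighbors):
        per_class[label].append(s)

    scores = {}
    for label, ds in per_class.items():
        vote = len(ds) / len(neighbors)      # share of the k votes
        closeness = 1.0 - sum(ds) / len(ds)  # high when the class is near
        scores[label] = harmonic_mean(vote, closeness)
    return max(scores, key=scores.get), scores


# Toy data (made up): class "B" holds the majority of the 5 neighbors,
# but the two "A" points lie much closer to the query.
train_X = [(0.1, 0.0), (0.2, 0.0), (1.0, 0.0), (1.1, 0.0), (1.2, 0.0)]
train_y = ["A", "A", "B", "B", "B"]

label, scores = predict(train_X, train_y, (0.0, 0.0), k=5)
print(label, scores)
```

On this toy input, plain majority voting would pick "B" (three of five neighbors), whereas the sketched score favors "A" because the harmonic mean penalizes a class whose neighbors are comparatively far away. This illustrates why the two components need to lie on comparable scales before they are combined.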
Data Availability
The data that support the findings of this study are openly available in the UCI machine learning repository at https://archive.ics.uci.edu.
Notes
UCI Machine Learning Repository, https://archive.ics.uci.edu/, 2024
Funding
No funding was received for conducting this study.
Author information
Contributions
Mustafa Açıkkar: Conceptualization, Methodology, Software, Writing - Original Draft, Review and Editing. Selçuk Tokgöz: Literature Review, Data Curation, Software, Writing - Original Draft.
Ethics declarations
Competing Interest
The authors have no competing interests to declare that are relevant to the content of this article.
Ethics Approval
The data utilized in this study were sourced from publicly available repositories. This article does not involve any research with human participants or animals conducted by the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Açıkkar, M., Tokgöz, S. Improving multi-class classification: scaled extensions of harmonic mean-based adaptive k-nearest neighbors. Appl Intell 55, 168 (2025). https://doi.org/10.1007/s10489-024-06109-2
DOI: https://doi.org/10.1007/s10489-024-06109-2