Abstract
In the medical field, several sources of big data, such as patient medical reports, hospital information, diagnostic test results, and data from IoT (Internet of Things)-based smart medical devices, form a vast asset for future analysis. Big data produced and recorded in biomedical research is primarily relevant to comprehensive health services. Because big data requires enormous storage, careful administration, and substantial computing resources, it demands effective and advanced data analytics tools. Most of the robust algorithms and tools used to analyze large datasets and extract insights from them come from the Machine Learning (ML) domain. In recent times, numerous academicians and researchers have proposed ML techniques for medical data analytics to increase the accuracy of assessments. Although most of the existing approaches were employed independently and produced acceptable outcomes, they still lacked optimal accuracy. To improve accuracy, this article presents an ensemble learning strategy in which the outcomes of two ML algorithms, namely 'Edge Detection Instance preference (EDIP)' and 'Extreme Gradient Boosting (XGBoost)', are integrated. A voting technique is used to aggregate their outputs. The results indicate that the proposed learning technique achieves higher accuracy than the existing methods it was compared with. The ensemble attains a better accuracy rate on all four tested datasets, namely blood cancer (98.43%), liver cancer (97.47%), heart disease (96.34%), and diabetes (98.14%).
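The voting step described above can be illustrated with a minimal sketch. The abstract does not say whether hard or soft voting is used, so this example assumes soft voting (a weighted average of class probabilities), which avoids ties when only two models vote. The function name `soft_vote`, the equal default weights, and the probability values are hypothetical and are not taken from the article.

```python
from typing import List, Sequence


def soft_vote(prob_a: Sequence[Sequence[float]],
              prob_b: Sequence[Sequence[float]],
              weights: Sequence[float] = (0.5, 0.5)) -> List[int]:
    """Average two models' class-probability rows and pick the top class.

    Assumption: each row holds one sample's per-class probabilities, and
    both models use the same class order.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same samples")
    w_a, w_b = weights
    labels = []
    for row_a, row_b in zip(prob_a, prob_b):
        # Weighted average of the two models' probabilities, class by class.
        combined = [w_a * pa + w_b * pb for pa, pb in zip(row_a, row_b)]
        # The predicted label is the index of the largest combined score.
        labels.append(max(range(len(combined)), key=combined.__getitem__))
    return labels


# Made-up probabilities for three samples and two classes (illustration only).
edip_like = [[0.9, 0.1], [0.4, 0.6], [0.55, 0.45]]
xgb_like = [[0.8, 0.2], [0.3, 0.7], [0.2, 0.8]]
predictions = soft_vote(edip_like, xgb_like)
```

In the third sample the two models disagree, and the averaged probabilities decide the label. Unequal `weights` would let the more reliable model dominate such cases.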
Funding
Not applicable.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Namamula, L.R., Chaytor, D. Effective ensemble learning approach for large-scale medical data analytics. Int J Syst Assur Eng Manag 15, 13–20 (2024). https://doi.org/10.1007/s13198-021-01552-7
DOI: https://doi.org/10.1007/s13198-021-01552-7