Effective ensemble learning approach for large-scale medical data analytics

  • Original article

International Journal of System Assurance Engineering and Management

Abstract

In the medical field, several sources of big data, such as patient medical reports, hospital information, diagnostic test results, and data from IoT (Internet of Things)-based smart medical devices, form a vast asset for future analysis. Big data produced and recorded in biomedical research are primarily relevant to comprehensive health services. Because of the enormous storage, administration, and computing resources they require, big data demand effective and advanced analytical tools. Most of the robust algorithms and tools used to extract insights from large datasets come from the Machine Learning (ML) domain. In recent years, numerous academics and researchers have proposed ML techniques for medical data analytics to increase the accuracy of assessments. Although most existing approaches were applied independently and produced acceptable outcomes, they still fell short of optimal accuracy. To improve accuracy, this article presents an ensemble learning strategy in which the outcomes of two ML algorithms, namely 'Edge Detection Instance preference (EDIP)' and 'Extreme Gradient Boosting (XGBoost)', are integrated. A voting technique is used to aggregate their predictions. The results indicate that the proposed learning technique achieves the highest accuracy among the methods compared. The ensemble attains the following accuracy on the four tested datasets: blood cancer (98.43%), liver cancer (97.47%), heart disease (96.34%), and diabetes (98.14%).
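
The abstract does not say which voting rule is used, so the following is only a minimal sketch of one common choice, soft voting, in which the class probabilities from two classifiers are averaged and the highest-scoring class is selected. The function names `soft_vote` and `accuracy`, the equal weights, and all probability values are hypothetical illustrations; they are not the authors' implementation, and the numbers are not taken from the article's experiments. The two probability tables merely stand in for the outputs of an EDIP-style model and an XGBoost model.

```python
from typing import List, Optional, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]],
              weights: Optional[Sequence[float]] = None) -> List[int]:
    """Combine per-model class probabilities by weighted averaging.

    prob_sets[m][i][c] is model m's probability that sample i is in class c.
    Returns, for each sample, the class index with the highest average.
    """
    n_models = len(prob_sets)
    if weights is None:
        # Assumption: every model gets the same say in the vote.
        weights = [1.0 / n_models] * n_models
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    labels = []
    for i in range(n_samples):
        avg = [sum(w * probs[i][c] for w, probs in zip(weights, prob_sets))
               for c in range(n_classes)]
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical class probabilities for four samples and two classes.
model_a = [[0.90, 0.10], [0.40, 0.60], [0.55, 0.45], [0.20, 0.80]]
model_b = [[0.70, 0.30], [0.30, 0.70], [0.30, 0.70], [0.60, 0.40]]
y_true = [0, 1, 1, 1]

ensemble_pred = soft_vote([model_a, model_b])
print("model A alone:", accuracy(y_true, soft_vote([model_a])))
print("model B alone:", accuracy(y_true, soft_vote([model_b])))
print("ensemble:     ", accuracy(y_true, ensemble_pred))
```

In this toy data each model misclassifies a different sample, so averaging their probabilities recovers both. That is the general motivation for voting ensembles, although real gains depend on how correlated the base models' errors are.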

Funding

Not applicable.

Author information

Corresponding author

Correspondence to Lakshmana Rao Namamula.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Namamula, L.R., Chaytor, D. Effective ensemble learning approach for large-scale medical data analytics. Int J Syst Assur Eng Manag 15, 13–20 (2024). https://doi.org/10.1007/s13198-021-01552-7

  • DOI: https://doi.org/10.1007/s13198-021-01552-7
