Big data analytics enhanced healthcare systems: a review

The Journal of Supercomputing

Abstract

There is increased interest in deploying big data technology in the healthcare industry to manage massive collections of heterogeneous health datasets such as electronic health records and sensor data, which are increasing in volume and variety due to the commoditization of digital devices such as mobile phones and wireless sensors. The modern healthcare system requires an overhaul of traditional healthcare software/hardware paradigms, which are ill-equipped to cope with the volume and diversity of modern health data and must be augmented with new “big data” computing and analysis capabilities. For researchers, healthcare data analytics offers an opportunity to study this vast amount of data, find patterns and trends within it, and provide solutions for improving healthcare, thereby reducing costs, democratizing health access, and saving valuable human lives. In this paper, we present a comprehensive survey of healthcare systems that integrate big data analytics and describe the various applicable healthcare data analytics algorithms, techniques, and tools that may be deployed in wireless, cloud, and Internet of Things settings. Finally, we propose SmartHealth as a convergence point for all these platforms, which could contribute to a unified, standard learning healthcare system in the future.

Corresponding author

Correspondence to Sarah Shafqat.

Cite this article

Shafqat, S., Kishwer, S., Rasool, R.U. et al. Big data analytics enhanced healthcare systems: a review. J Supercomput 76, 1754–1799 (2020). https://doi.org/10.1007/s11227-017-2222-4
