Abstract
The main objective of this paper is to review existing research on Big Data sources and techniques in the health sector and to identify which of these techniques are most used in the prediction of chronic diseases. Academic databases and systems such as IEEE Xplore, Scopus, PubMed and Science Direct were searched, considering publications dated from 2006 to the present. Several search criteria were established, such as ‘techniques’ OR ‘sources’ AND ‘Big Data’ AND ‘medicine’ OR ‘health’, and ‘techniques’ AND ‘Big Data’ AND ‘chronic diseases’. Papers were selected if they were of interest regarding the description of Big Data techniques and sources in healthcare. The search found a total of 110 articles on techniques and sources of Big Data in health, of which only 32 were identified as relevant. Many of the articles describe Big Data platforms, sources and the databases used, and identify the techniques most used in the prediction of chronic diseases. The review of the analyzed articles shows that the sources and techniques of Big Data used in the health sector are a relevant factor in terms of effectiveness, since they allow the application of predictive analysis techniques to tasks such as identifying patients at risk of readmission or preventing hospital infections or chronic diseases, and obtaining predictive models of good quality.



Acknowledgements
This research has been partially supported by the European Commission and the Ministry of Industry, Energy and Tourism under the project AAL-20125036 named “WetakeCare: ICT-based Solution for (Self-) Management of Daily Living”, by National Funding from the FCT – Fundação para a Ciência e a Tecnologia through the UID/EEA/500008/2013 Project, by the Government of the Russian Federation, Grant 074-U01, and by Finep, with resources from Funttel, Grant No. 01.14.0231.00, under the Centro de Referência em Radiocomunicações - CRR project of the Instituto Nacional de Telecomunicações (Inatel), Brazil.
Ethics declarations
Conflict of Interest
The authors declare that they have no competing interests.
Additional information
This article is part of the Topical Collection on Systems-Level Quality Improvement
Cite this article
Alonso, S.G., de la Torre Díez, I., Rodrigues, J.J.P.C. et al. A Systematic Review of Techniques and Sources of Big Data in the Healthcare Sector. J Med Syst 41, 183 (2017). https://doi.org/10.1007/s10916-017-0832-2
DOI: https://doi.org/10.1007/s10916-017-0832-2