Medical Big Data Warehouse: Architecture and System Design, a Case Study: Improving Healthcare Resources Distribution

  • Transactional Processing Systems
Journal of Medical Systems

Abstract

The rapid growth in medical devices and clinical applications that generate enormous volumes of data has made managing, processing, and mining this data a major challenge. Traditional data warehousing frameworks cannot cope effectively with the volume, variety, and velocity of data produced by current medical applications. As a result, many existing data warehouses struggle with medical data, and a number of challenges remain to be addressed. New solutions have emerged; Hadoop is one of the best examples and can be used to process these streams of medical data. However, without an efficient system design and architecture, the performance gains will not be significant or valuable for medical managers. In this paper, we provide a short review of the literature on research issues in traditional data warehouses and present some important Hadoop-based data warehouses. In addition, we give a Hadoop-based architecture and a conceptual data model for designing a medical Big Data warehouse. In our case study, we provide implementation details of a big data warehouse built on the Apache Hadoop platform according to the proposed architecture and data model, with the aim of ensuring an optimal allocation of health resources.

Acknowledgements

This work was partially supported by the Ministry of Higher Education and Scientific Research of Algeria and the University of Bejaia, under the project CNEPRU (Ref. B*00620140066/2015-2018).

Author information

Corresponding author

Correspondence to Abderrazak Sebaa.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

This article is part of the Topical Collection on Transactional Processing Systems

About this article

Cite this article

Sebaa, A., Chikh, F., Nouicer, A. et al. Medical Big Data Warehouse: Architecture and System Design, a Case Study: Improving Healthcare Resources Distribution. J Med Syst 42, 59 (2018). https://doi.org/10.1007/s10916-018-0894-9

  • DOI: https://doi.org/10.1007/s10916-018-0894-9
