Abstract
In recent years, data science has played an important role in the health-care domain in providing cost-effective and better treatment procedures. The data management system contributes substantially to this goal by controlling, arranging, storing and preprocessing large volumes of health data. Many approaches have already been investigated and designed to support big data applications in different domains. Still, managing big data remains a challenging task for data scientists because of the complex characteristics of the data and the demands of the applications. In this survey paper, we discuss the challenges that arise and their possible solutions by considering the entities related to data services. The survey is intended to help data scientists understand the supporting parameters of data storage systems when designing a big data management system.
Cite this article
Mondal, A.S., Neogy, S., Mukherjee, N. et al. A survey of issues and solutions of health data management systems. Innovations Syst Softw Eng 15, 155–166 (2019). https://doi.org/10.1007/s11334-019-00336-4
DOI: https://doi.org/10.1007/s11334-019-00336-4