Data Science in Healthcare: Benefits, Challenges and Opportunities

  • Chapter
Data Science for Healthcare

Abstract

The advent of digital medical data has brought an exponential increase in the information available for each patient, allowing novel knowledge generation methods to emerge. Tapping into this data brings clinical research and clinical practice closer together, as data generated in ordinary clinical practice can be used towards rapid-learning healthcare systems that continuously improve and personalize healthcare. In this context, the recent use of Data Science technologies for healthcare is providing benefits to both patients and medical professionals, improving prevention and treatment for several kinds of diseases. However, the adoption and usage of Data Science solutions for healthcare still require social capacity, knowledge and higher acceptance. The goal of this chapter is to provide an overview of needs, opportunities, recommendations and challenges of using (Big) Data Science technologies in the healthcare sector. This contribution is based on a recent whitepaper (http://www.bdva.eu/sites/default/files/Big%20Data%20Technologies%20in%20Healthcare.pdf) that focuses on the challenges and impact that (Big) Data Science may have on the entire healthcare chain. The whitepaper was provided by the Big Data Value Association (BDVA) (http://www.bdva.eu/), the private counterpart to the European Commission (EC) in implementing the Big Data Value Public-Private Partnership (BDV PPP) programme.

Authors are listed in alphabetical order because they contributed equally.

Author information

Corresponding author

Correspondence to Sergio Consoli.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Abedjan, Z. et al. (2019). Data Science in Healthcare: Benefits, Challenges and Opportunities. In: Consoli, S., Reforgiato Recupero, D., Petković, M. (eds) Data Science for Healthcare. Springer, Cham. https://doi.org/10.1007/978-3-030-05249-2_1

  • DOI: https://doi.org/10.1007/978-3-030-05249-2_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05248-5

  • Online ISBN: 978-3-030-05249-2

  • eBook Packages: Computer Science, Computer Science (R0)
