A Public Health Surveillance Platform Exploiting Free-Text Sources via Natural Language Processing and Linked Data: Application in Adverse Drug Reaction Signal Detection Using PubMed and Twitter

  • Conference paper
Knowledge Representation for Health Care (ProHealth 2016, KR4HC 2016)

Abstract

This paper presents a platform that enables the systematic exploitation of diverse free-text data sources for public health surveillance applications. The platform relies on Natural Language Processing (NLP) and a micro-services architecture, and uses Linked Data as its data representation formalism. To perform NLP in an extensible and modular fashion, the platform employs the Apache Unstructured Information Management Architecture (UIMA) and semantically annotates the results through a newly developed UIMA Semantic Common Analysis Structure Consumer (SCC). The SCC output is a graph represented in the Resource Description Framework (RDF), based on the W3C Web Annotation Data Model (WADM) and SNOMED CT. We also demonstrate the platform through an exemplar application scenario: the detection of adverse drug reaction (ADR) signals using data retrieved from PubMed and Twitter.
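
To make the kind of output described above more concrete, the following is a minimal sketch of a single WADM-style annotation that links a text span to a SNOMED CT concept URI. It is not the SCC implementation: the function name `make_annotation`, the `example.org` source URI, the sample sentence, and the choice of JSON-LD as the RDF serialization are all assumptions made for illustration, and the concept identifier passed in should be checked against a SNOMED CT release before any real use.

```python
import json

# JSON-LD context published by the W3C for the Web Annotation Data Model.
ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"
# Base of the SNOMED CT URI scheme (concept URIs are this base plus the concept id).
SNOMED_URI_BASE = "http://snomed.info/id/"


def make_annotation(source_uri, text, mention, sctid):
    """Build a Web Annotation (as a JSON-LD dict) tagging `mention` in `text`.

    Hypothetical helper for illustration only: a real NLP pipeline would
    supply the character offsets itself instead of searching for the string.
    """
    start = text.find(mention)
    if start < 0:
        raise ValueError("mention not found in text")
    end = start + len(mention)  # end offset is exclusive in TextPositionSelector
    return {
        "@context": ANNO_CONTEXT,
        "type": "Annotation",
        "motivation": "tagging",
        # The body is the IRI of the concept the span is annotated with.
        "body": SNOMED_URI_BASE + sctid,
        "target": {
            "source": source_uri,
            "selector": {
                "type": "TextPositionSelector",
                "start": start,
                "end": end,
            },
        },
    }


if __name__ == "__main__":
    # Illustrative input; the URI and the concept id are placeholders.
    sample = "Started the new medication last week and now I have a headache."
    annotation = make_annotation(
        "https://example.org/posts/1", sample, "headache", "25064002"
    )
    print(json.dumps(annotation, indent=2))
```

Because JSON-LD is an RDF serialization, a document of this shape can be loaded into an RDF store as triples; how the platform itself serializes and stores its graph is not specified in the abstract.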

Author information

Corresponding author

Correspondence to Pantelis Natsiavas.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Natsiavas, P., Maglaveras, N., Koutkias, V. (2017). A Public Health Surveillance Platform Exploiting Free-Text Sources via Natural Language Processing and Linked Data: Application in Adverse Drug Reaction Signal Detection Using PubMed and Twitter. In: Riaño, D., Lenz, R., Reichert, M. (eds) Knowledge Representation for Health Care. ProHealth 2016, KR4HC 2016. Lecture Notes in Computer Science, vol 10096. Springer, Cham. https://doi.org/10.1007/978-3-319-55014-5_4

  • DOI: https://doi.org/10.1007/978-3-319-55014-5_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55013-8

  • Online ISBN: 978-3-319-55014-5

  • eBook Packages: Computer Science, Computer Science (R0)
