Exploring the Automatisation of Animal Health Surveillance Through Natural Language Processing

  • Conference paper
Artificial Intelligence XXXVI (SGAI 2019)

Abstract

The Animal and Plant Health Agency (APHA) conducts post-mortem examinations (PMEs) of farm animal species as part of routine scanning surveillance for new and re-emerging diseases that may pose a threat to animal and public health. This paper investigates whether relevant veterinary medical terms can be automatically identified in the free-text summaries that Veterinary Investigation Officers (VIOs) enter on PME reports. Two natural language processing tasks were performed: (1) named entity recognition, in which terms within the free text were mapped to concepts in the Unified Medical Language System (UMLS) Metathesaurus; and (2) measurement of semantic similarity and relatedness, also using UMLS. For this pilot study, we focused on two diagnostic codes: Salmonellosis (S. Dublin) and Pneumonia NOS (Not Otherwise Specified). The outputs were manually evaluated by VIOs. The results highlight the potential value of natural language processing for identifying key concepts and pertinent veterinary medical terms that can be used for scanning surveillance over large volumes of free-text data. We also discuss issues arising from the inherent bias of UMLS towards human medical terms when it is used for animal health monitoring.
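To make the two tasks concrete, the following is a minimal illustrative sketch and not the pipeline used in the paper. It assumes a tiny hand-made lexicon and is-a hierarchy in place of the UMLS Metathesaurus; the concept identifiers (`TOY:…`), the hierarchy, and the function names are all hypothetical. Task (1) is approximated by dictionary lookup, and task (2) by one common path-based similarity measure, the reciprocal of the number of nodes on the shortest is-a path between two concepts.

```python
import re

# Hypothetical toy lexicon: surface term -> placeholder concept ID.
# These IDs are invented for illustration and are NOT real UMLS CUIs.
TOY_LEXICON = {
    "bronchopneumonia": "TOY:0002",
    "pneumonia": "TOY:0001",
    "enteritis": "TOY:0003",
}

# Hypothetical is-a hierarchy (child -> parent), also invented.
TOY_PARENT = {
    "TOY:0002": "TOY:0001",  # bronchopneumonia is-a pneumonia
    "TOY:0001": "TOY:0010",  # pneumonia is-a respiratory disorder
    "TOY:0003": "TOY:0011",  # enteritis is-a digestive disorder
    "TOY:0010": "TOY:0100",  # respiratory disorder is-a disease
    "TOY:0011": "TOY:0100",  # digestive disorder is-a disease
}


def find_concepts(text):
    """Return (term, concept_id) pairs for lexicon terms found in text,
    ordered by position. Matching is case-insensitive on whole words."""
    hits = []
    lowered = text.lower()
    for term, concept_id in TOY_LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            hits.append((match.start(), term, concept_id))
    hits.sort()
    return [(term, concept_id) for _, term, concept_id in hits]


def ancestors(concept_id):
    """Return the chain from a concept up to the root, concept first."""
    chain = [concept_id]
    while concept_id in TOY_PARENT:
        concept_id = TOY_PARENT[concept_id]
        chain.append(concept_id)
    return chain


def path_similarity(a, b):
    """Path-based similarity: 1 / (number of nodes on the shortest
    is-a path between a and b). Returns 0.0 if they share no ancestor."""
    chain_a = ancestors(a)
    chain_b = ancestors(b)
    for steps_a, node in enumerate(chain_a):
        if node in chain_b:
            steps_b = chain_b.index(node)
            return 1.0 / (steps_a + steps_b + 1)
    return 0.0


if __name__ == "__main__":
    summary = "Severe bronchopneumonia with enteritis."
    print(find_concepts(summary))
    print(path_similarity("TOY:0002", "TOY:0001"))
    print(path_similarity("TOY:0001", "TOY:0003"))
```

A real system would replace the toy lexicon with a UMLS-backed annotator and the toy hierarchy with relations drawn from a UMLS source vocabulary; the sketch only shows the shape of the inputs and outputs for each task.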

Acknowledgements

This project was funded by the N8 Research Partnership through an AgriFood pump-priming award from the University of Manchester. The authors would also like to thank the Animal and Plant Health Agency (APHA) for providing a suitable free-text dataset.

Author information

Corresponding author

Correspondence to Mercedes Arguello-Casteleiro.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Arguello-Casteleiro, M., Jones, P.H., Robertson, S., Irvine, R.M., Twomey, F., Nenadic, G. (2019). Exploring the Automatisation of Animal Health Surveillance Through Natural Language Processing. In: Bramer, M., Petridis, M. (eds) Artificial Intelligence XXXVI. SGAI 2019. Lecture Notes in Computer Science, vol 11927. Springer, Cham. https://doi.org/10.1007/978-3-030-34885-4_17

  • DOI: https://doi.org/10.1007/978-3-030-34885-4_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34884-7

  • Online ISBN: 978-3-030-34885-4

  • eBook Packages: Computer Science, Computer Science (R0)
