AHIAP: An Agile Medical Named Entity Recognition and Relation Extraction Framework Based on Active Learning

  • Conference paper
Health Information Science (HIS 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12435)

Abstract

Knowledge graphs play a significant role in many domains by providing a wide range of assistance. In the medical domain, clinical guidelines, academic papers, Electronic Medical Records (EMRs) and data crawled from the Internet contain essential information. These data are vital to knowledge graph construction but are usually unstructured. Constructing a knowledge graph from unstructured data requires a large number of medical experts to annotate the data based on their prior experience and knowledge. The quality of a knowledge graph depends heavily on the performance of medical named entity recognition and relation extraction, both of which rely on data annotation. However, given the sheer volume of data, manual labelling is a task with high labor cost. In addition, data is generated rapidly, so annotation and extraction must be fast enough to keep pace with data accumulation. We therefore propose AHIAP, a named entity recognition and relation extraction framework that addresses these problems. AHIAP uses an active learning method to reduce the labor cost of the annotation process while maintaining annotation quality. It consists of two modules: an active learning module that reduces labor cost and a measurement module that controls quality. With active learning, AHIAP needs only 200 samples to reach an accuracy of 70%, whereas the standard learning strategy needs 4000 records to reach the same accuracy.
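To make the active learning idea concrete, the following is a minimal sketch of one common query strategy, least-confidence sampling. The abstract does not say which query strategy AHIAP uses, so the strategy itself, the function names and the toy probabilities are all assumptions for illustration, not the authors' implementation. The last helper simply restates the reported 200 versus 4000 sample counts as a relative saving.

```python
from typing import List, Sequence


def least_confidence_scores(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Uncertainty per sample: 1 minus the highest predicted class probability."""
    return [1.0 - max(p) for p in probabilities]


def select_for_annotation(probabilities: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Return indices of the `batch_size` most uncertain unlabeled samples.

    Hypothetical helper: these are the samples a human annotator would be
    asked to label next in an active learning round.
    """
    scores = least_confidence_scores(probabilities)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


def annotation_savings(active_samples: int, baseline_samples: int) -> float:
    """Fraction of labeling effort saved relative to the baseline."""
    return 1.0 - active_samples / baseline_samples


# Toy class probabilities from an assumed model over four unlabeled samples.
pool_predictions = [
    [0.95, 0.03, 0.02],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],  # most uncertain
]

chosen = select_for_annotation(pool_predictions, batch_size=2)
savings = annotation_savings(active_samples=200, baseline_samples=4000)

print("Samples to send to annotators:", chosen)
print(f"Labeling effort saved: {savings:.0%}")
```

In a full loop, the selected samples would be labeled by experts, added to the training set, and the model retrained before the next selection round. Using the sample counts reported in the abstract, reaching 70% accuracy with 200 instead of 4000 labeled samples corresponds to labeling 95% fewer records.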

Acknowledgement

This work was supported by NSFC (91646202), National Key R&D Program of China (2018YFB1404401, 2018YFB1402701).

Author information

Corresponding author

Correspondence to Jing Dong.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Sheng, M. et al. (2020). AHIAP: An Agile Medical Named Entity Recognition and Relation Extraction Framework Based on Active Learning. In: Huang, Z., Siuly, S., Wang, H., Zhou, R., Zhang, Y. (eds) Health Information Science. HIS 2020. Lecture Notes in Computer Science, vol 12435. Springer, Cham. https://doi.org/10.1007/978-3-030-61951-0_7

  • DOI: https://doi.org/10.1007/978-3-030-61951-0_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61950-3

  • Online ISBN: 978-3-030-61951-0

  • eBook Packages: Computer Science (R0)
