Supporting Deep Learning-Based Named Entity Recognition Using Cloud Resource Management

Hartmann, Benedict; Tamla, Philippe; Hemmje, Matthias

doi:10.1007/978-3-031-48057-7_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14059)

Included in the following conference series:

International Conference on Human-Computer Interaction

Abstract

This paper presents a system for managing cloud resources such as memory and CPU/GPU, which is used to develop, train, and customize Deep Learning-based Named Entity Recognition (NER) models in domains like health care. The increasing digitization of healthcare services has made electronic health records (EHRs) a significant component of healthcare data management. NER is a machine learning technique that can be applied to EHRs to extract information such as drugs and treatments, helping to support clinical decision making. The paper addresses the difficulty domain experts face in using cloud technologies for NER tasks, which often demand technical expertise and add management overhead. It presents a system for configuring cloud resources for NER training using the spaCy framework and AWS compute services. The research follows Nunamaker’s methodology, which structures software development into four phases: observation, theory building, systems development, and experimentation. The paper identifies problem statements and research questions and maps them to the objectives of the methodology: researching the state of the art of NER and cloud technologies, analyzing the architecture of motivating research projects, defining user requirements and the system architecture, and implementing the system. The system is designed using User Centered Systems Design and is based on previously identified user requirements. Two main user groups are considered: NER Experts and Medical Domain Experts. The system is implemented using the Model-View-Controller architecture pattern. It supports training Transformer models, selecting compute resources, and adjusting the training configuration and hyperparameters, and it is designed so that compute and storage resources can scale.
The paper also describes the technical implementation and the user interface. The system is evaluated through a cognitive walkthrough and experiments with Transformer-based models, and the results are analyzed to gain insights.
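
As an illustration of what "selecting compute resources and adjusting training configuration and hyperparameters" could look like in code, the following is a minimal sketch and not the authors' implementation. It assumes AWS Batch as the compute service and builds a request dictionary shaped like a Batch SubmitJob call whose container command runs spaCy's `train` CLI. The class names `ComputeProfile` and `TrainingRequest`, the helper functions, the file paths, and the queue and job-definition names are all hypothetical; no AWS call is made.

```python
from dataclasses import dataclass, field
import shlex


@dataclass(frozen=True)
class ComputeProfile:
    """Hypothetical compute selection a user might make in the UI."""
    vcpus: int
    memory_mib: int
    gpus: int = 0


@dataclass
class TrainingRequest:
    """Hypothetical training request: data paths plus config overrides."""
    config_path: str
    train_path: str
    dev_path: str
    output_dir: str
    overrides: dict = field(default_factory=dict)


def spacy_train_command(req, profile):
    """Build the argument list for spaCy's `train` command-line interface."""
    cmd = [
        "python", "-m", "spacy", "train", req.config_path,
        "--output", req.output_dir,
        "--paths.train", req.train_path,
        "--paths.dev", req.dev_path,
    ]
    if profile.gpus > 0:
        # Use the first GPU of the container when one was requested.
        cmd += ["--gpu-id", "0"]
    # spaCy accepts config overrides as "--section.key value" pairs.
    for key, value in sorted(req.overrides.items()):
        cmd += [f"--{key}", str(value)]
    return cmd


def batch_job_request(job_name, queue, job_definition, req, profile):
    """Assemble keyword arguments shaped like an AWS Batch SubmitJob request."""
    resources = [
        {"type": "VCPU", "value": str(profile.vcpus)},
        {"type": "MEMORY", "value": str(profile.memory_mib)},
    ]
    if profile.gpus > 0:
        resources.append({"type": "GPU", "value": str(profile.gpus)})
    return {
        "jobName": job_name,
        "jobQueue": queue,
        "jobDefinition": job_definition,
        "containerOverrides": {
            "command": spacy_train_command(req, profile),
            "resourceRequirements": resources,
        },
    }


# Example: one GPU, two hyperparameter overrides (all values are made up).
profile = ComputeProfile(vcpus=8, memory_mib=32768, gpus=1)
request = TrainingRequest(
    config_path="config.cfg",
    train_path="train.spacy",
    dev_path="dev.spacy",
    output_dir="output",
    overrides={"training.max_steps": 2000, "training.dropout": 0.1},
)
job = batch_job_request("ner-train-demo", "demo-queue", "demo-job-def",
                        request, profile)
print(shlex.join(job["containerOverrides"]["command"]))
```

Under these assumptions, the resulting dictionary could be passed to a Batch client, for example `boto3.client("batch").submit_job(**job)`, once a real job queue and job definition exist. Keeping request construction separate from submission means the mapping from user choices to cloud resources can be checked without cloud credentials.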

Acknowledgements

The author, Benedict Hartmann, acknowledges the financial support provided by Allianz Technology SE to attend HCI International 2023.

Author information

Authors and Affiliations

University of Hagen, Universitätsstr. 11, 58084, Hagen, Germany
Benedict Hartmann, Philippe Tamla & Matthias Hemmje

Corresponding author

Correspondence to Benedict Hartmann.

Editor information

Editors and Affiliations

Siemens Corporation, Princeton, NJ, USA
Helmut Degen
Foundation for Research and Technology – Hellas (FORTH), Heraklion, Crete, Greece
Stavroula Ntoa
San Jose State University, San Jose, CA, USA
Abbas Moallem

About this paper

Cite this paper

Hartmann, B., Tamla, P., Hemmje, M. (2023). Supporting Deep Learning-Based Named Entity Recognition Using Cloud Resource Management. In: Degen, H., Ntoa, S., Moallem, A. (eds) HCI International 2023 – Late Breaking Papers. HCII 2023. Lecture Notes in Computer Science, vol 14059. Springer, Cham. https://doi.org/10.1007/978-3-031-48057-7_6

DOI: https://doi.org/10.1007/978-3-031-48057-7_6
Published: 26 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48056-0
Online ISBN: 978-3-031-48057-7
eBook Packages: Computer Science, Computer Science (R0)
