A Universal Prompting Strategy for Extracting Process Model Information from Natural Language Text Using Large Language Models

Neuberger, Julian; Ackermann, Lars; van der Aa, Han; Jablonski, Stefan

doi:10.1007/978-3-031-75872-0_3

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15238)

Included in the following conference series:

International Conference on Conceptual Modeling

Abstract

Over the past decade, extensive research efforts have been dedicated to the extraction of information from textual process descriptions. Despite the remarkable progress witnessed in natural language processing (NLP), information extraction within the Business Process Management domain remains predominantly reliant on rule-based systems and machine learning methodologies. Data scarcity has so far prevented the successful application of deep learning techniques. However, the rapid progress in generative large language models (LLMs) makes it possible to solve many NLP tasks with very high quality without the need for extensive data. Therefore, we systematically investigate the potential of LLMs for extracting information from textual process descriptions, targeting the detection of process elements such as activities and actors, and relations between them. Based on a novel prompting strategy, we show that LLMs are able to outperform state-of-the-art machine learning approaches with absolute performance improvements of up to 8% $F_1$ score across three different datasets. We evaluate our prompting strategy on eight different LLMs, showing it is universally applicable, while also analyzing the impact of certain prompt parts on extraction quality. The number of example texts, the specificity of definitions, and the rigour of format instructions are identified as key for improving the accuracy of extracted information. Our code, prompts, and data are publicly available at https://github.com/JulianNeuberger/llm-process-generation/tree/er2024.
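
The abstract names three prompt parts as key to extraction quality: the number of example texts, the specificity of definitions, and the rigour of format instructions. As a minimal sketch of how such parts could be combined into one few-shot prompt, consider the following. It is not the authors' prompt or code (those are in the linked repository). The `Example` class, the `build_prompt` function, the section labels, and the sample definitions are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One few-shot example: an input text and its expected extraction."""
    text: str
    extraction: str


def build_prompt(definitions: dict[str, str],
                 format_instructions: str,
                 examples: list[Example],
                 document: str) -> str:
    """Assemble a few-shot extraction prompt (hypothetical layout)."""
    parts = ["Extract the following process elements from the text."]
    # Definitions of each element type; illustrative wording only.
    parts.append("Definitions:")
    for name, definition in definitions.items():
        parts.append(f"- {name}: {definition}")
    # Description of the expected output format.
    parts.append("Output format:")
    parts.append(format_instructions)
    # Few-shot examples; how many to include is a tunable choice.
    for i, example in enumerate(examples, start=1):
        parts.append(f"Example {i}:")
        parts.append(f"Text: {example.text}")
        parts.append(f"Output: {example.extraction}")
    # The document to process comes last, followed by an open output slot.
    parts.append(f"Text: {document}")
    parts.append("Output:")
    return "\n".join(parts)


prompt = build_prompt(
    definitions={
        "Activity": "a task performed within the process",
        "Actor": "a person or system that performs an activity",
    },
    format_instructions="One element per line, formatted as <type>: <text span>.",
    examples=[Example("The clerk checks the invoice.",
                      "Actor: The clerk\nActivity: checks")],
    document="The manager approves the request.",
)
print(prompt)
```

Keeping the parts as separate arguments makes it straightforward to vary one of them, such as the number of examples, while holding the others fixed.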

Notes

1. This is inspired by the paradigm of interlingua-based machine translation [27], which reduces the number of translation systems for $n$ languages from $n^2$ to $2n$.
2. https://www.omg.org/bpmn/, accessed June 2, 2024.
3. Code at https://github.com/JulianNeuberger/llm-process-generation/tree/er2024.
4. See OpenAI’s source code, accessed June 3, 2024.
5. See https://tatsu-lab.github.io/alpaca_eval/, last accessed May 30, 2024.

References

1. van der Aa, H., Carmona Vargas, J., Leopold, H., Mendling, J., Padró, L.: Challenges and opportunities of applying natural language processing in business process management. In: COLING (2018)
2. van der Aa, H., Di Ciccio, C., Leopold, H., Reijers, H.A.: Extracting declarative process models from natural language. In: CAiSE (2019)
3. van der Aa, H., Leopold, H., Reijers, H.A.: Checking process compliance against natural language specifications using behavioral spaces. IS (2018)
4. van der Aa, H., Leopold, H., van de Weerd, I., Reijers, H.A.: Causes and consequences of fragmented process information: Insights from a case study. In: AMCIS (2017)
5. van der Aalst, W.: Process Mining. Springer Berlin Heidelberg, Berlin, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49851-4
6. Ackermann, L., Neuberger, J., Jablonski, S.: Data-driven annotation of textual process descriptions based on formal meaning representations. In: La Rosa, M., Sadiq, S., Teniente, E. (eds.) Advanced Information Systems Engineering: 33rd International Conference, CAiSE 2021, Melbourne, VIC, Australia, June 28 – July 2, 2021, Proceedings, pp. 75–90. Springer International Publishing, Cham (2021). https://doi.org/10.1007/978-3-030-79382-1_5
7. Ackermann, L., Neuberger, J., Käppel, M., Jablonski, S.: Bridging research fields: An empirical study on joint, neural relation extraction techniques. In: CAiSE (2023)
8. Bellan, P., Dragoni, M., Ghidini, C.: Extracting business process entities and relations from text using pre-trained language models and in-context learning. In: EDOC (2022)
9. Bellan, P., Ghidini, C., Dragoni, M., Ponzetto, S.P., van der Aa, H.: Process extraction from natural language text: the PET dataset and annotation guidelines. In: NL4AI (2022)
10. Bender, E.M., Gebru, T., McMillan-Major, A., Shmitchell, S.: On the dangers of stochastic parrots: Can language models be too big? In: ACM FAccT (2021)
11. Cui, L., Wu, Y., Liu, J., Yang, S., Zhang, Y.: Template-based named entity recognition using BART. arXiv preprint arXiv:2106.01760 (2021)
12. Davies, I., Green, P., Rosemann, M., Indulska, M., Gallo, S.: How do practitioners use conceptual modeling in practice? Data Knowl. Eng. 58(3), 358–380 (2006)
13. Dubois, Y., et al.: AlpacaFarm: A simulation framework for methods that learn from human feedback. Adv. Neural Inf. Process. Syst. 36 (2024)
14. Ferreira, R.C.B., Thom, L.H., Fantinato, M.: A semi-automatic approach to identify business process elements in natural language texts. In: ICEIS (2017)
15. Franceschetti, M., Seiger, R., López, H.A., Burattin, A., García-Bañuelos, L., Weber, B.: A characterisation of ambiguity in BPM. In: Almeida, J.P.A., Borbinha, J., Guizzardi, G., Link, S., Zdravkovic, J. (eds.) Conceptual Modeling: 42nd International Conference, ER 2023, Lisbon, Portugal, November 6–9, 2023, Proceedings, pp. 277–295. Springer Nature Switzerland, Cham (2023). https://doi.org/10.1007/978-3-031-47262-6_15
16. Friedrich, F., Mendling, J., Puhlmann, F.: Process model generation from natural language text. In: CAiSE (2011)
17. Kaddour, J., Harris, J., Mozes, M., Bradley, H., Raileanu, R., McHardy, R.: Challenges and applications of large language models. arXiv preprint (2023)
18. Kourani, H., Berti, A., Schuster, D., van der Aalst, W.M.: Process modeling with large language models. arXiv preprint arXiv:2403.07541 (2024)
19. Leopold, H., van der Aa, H., Pittke, F., Raffel, M., Mendling, J., Reijers, H.A.: Searching textual and model-based process descriptions based on a unified data format. SoSyM 18, 1179–1194 (2019)
20. Lewis, P., et al.: Retrieval-augmented generation for knowledge-intensive NLP tasks. Adv. Neural Inf. Process. Syst. 33, 9459–9474 (2020)
21. López-Acosta, H.A., Hildebrandt, T., Debois, S., Marquard, M.: The process highlighter: From texts to declarative processes and back. In: CEUR Workshop Proceedings, pp. 66–70. CEUR Workshop Proceedings (2018)
22. Min, B., et al.: Recent advances in natural language processing via large pre-trained language models: a survey. ACM Comput. Surv. 56(2), 1–40 (2023)
23. Neuberger, J., Ackermann, L., Jablonski, S.: Beyond rule-based named entity recognition and relation extraction for process model generation from natural language text. In: CoopIS (2023)
24. Pesic, M., Schonenberg, H., Van der Aalst, W.M.: Declare: full support for loosely-structured processes. In: 11th IEEE International Enterprise Distributed Object Computing Conference (EDOC 2007), pp. 287–287. IEEE (2007)
25. Qian, C., et al.: An approach for process model extraction by multi-grained text classification. In: CAiSE (2020)
26. Quishpi, L., Carmona, J., Padró, L.: Extracting annotations from textual descriptions of processes. In: BPM 2020 (2020)
27. Richens, R.H.: Interlingual machine translation. Comput. J. 1(3), 144–147 (1958)
28. Sànchez-Ferreres, J., Burattin, A., Carmona, J., Montali, M., Padró, L., Quishpi, L.: Unleashing textual descriptions of business processes. In: SoSyM (2021)
29. Sukthanker, R., Poria, S., Cambria, E., Thirunavukarasu, R.: Anaphora and coreference resolution: A review. Information Fusion (2020)
30. Ter Hofstede, A.H., et al.: Process-data quality: The true frontier of process mining. In: ACM JDIQ (2023)
31. Törnberg, P.: Best practices for text annotation with large language models. arXiv preprint arXiv:2402.05129 (2024)
32. Wei, J., et al.: Chain-of-thought prompting elicits reasoning in large language models. In: NIPS (2022)
33. White, J., et al.: A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv preprint arXiv:2302.11382 (2023)
34. Xu, F., Uszkoreit, H., Du, Y., Fan, W., Zhao, D., Zhu, J.: Explainable AI: brief survey on history, research areas, approaches and challenges. In: NLPCC (2019)

Author information

Authors and Affiliations

University of Bayreuth, Bayreuth, Germany
Julian Neuberger, Lars Ackermann & Stefan Jablonski
University of Vienna, Vienna, Austria
Han van der Aa

Corresponding author

Correspondence to Julian Neuberger.

Editor information

Editors and Affiliations

Saarland University and German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
Wolfgang Maass
Illinois State University, Normal, IL, USA
Hyoil Han
Software Engineering Institute – Carnegie Mellon University, Pittsburgh, PA, USA
Hasan Yasar
Pacific Northwest National Laboratory, Richland, WA, USA
Nick Multari

About this paper

Cite this paper

Neuberger, J., Ackermann, L., van der Aa, H., Jablonski, S. (2025). A Universal Prompting Strategy for Extracting Process Model Information from Natural Language Text Using Large Language Models. In: Maass, W., Han, H., Yasar, H., Multari, N. (eds) Conceptual Modeling. ER 2024. Lecture Notes in Computer Science, vol 15238. Springer, Cham. https://doi.org/10.1007/978-3-031-75872-0_3

DOI: https://doi.org/10.1007/978-3-031-75872-0_3
Published: 21 October 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-75871-3
Online ISBN: 978-3-031-75872-0
eBook Packages: Computer Science, Computer Science (R0)
