Extracting Semantic Process Information from the Natural Language in Event Logs

  • Conference paper
Advanced Information Systems Engineering (CAiSE 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12751)


Process mining focuses on the analysis of recorded event data in order to gain insights about the true execution of business processes. While foundational process mining techniques treat such data as sequences of abstract events, more advanced techniques depend on the availability of specific kinds of information, such as resources in organizational mining and business objects in artifact-centric analysis. However, this information is generally not readily available, but rather associated with events in an ad hoc manner, often even as part of unstructured textual attributes. Given the size and complexity of event logs, this calls for automated support to extract such process information and, thereby, enable advanced process mining techniques. In this paper, we present an approach that achieves this through so-called semantic role labeling of event data. We combine the analysis of textual attribute values, based on a state-of-the-art language model, with a novel attribute classification technique. In this manner, our approach extracts information about up to eight semantic roles per event. We demonstrate the approach’s efficacy through a quantitative evaluation using a broad range of event logs and demonstrate the usefulness of the extracted information in a case study.
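To make the idea of semantic role labeling of event data more concrete, the following is a minimal sketch of what tagging a textual event label with roles could look like. It is not the approach from the paper: the word list `ACTION_WORDS`, the role names `action` and `business_object`, and the functions `tag_label` and `group_roles` are all hypothetical, and a simple lexicon lookup stands in for the language model mentioned above.

```python
from typing import Dict, List

# Hypothetical, tiny verb lexicon used only for illustration; the approach
# described in the abstract uses a language model, not a word list.
ACTION_WORDS = {"create", "approve", "send", "check", "reject", "submit"}


def tag_label(label: str) -> List[Dict[str, str]]:
    """Assign a toy semantic role to each token of an event label."""
    tagged = []
    for token in label.lower().replace("_", " ").split():
        # Assumption: every token that is not a known action word
        # belongs to the business object.
        role = "action" if token in ACTION_WORDS else "business_object"
        tagged.append({"token": token, "role": role})
    return tagged


def group_roles(tagged: List[Dict[str, str]]) -> Dict[str, str]:
    """Join the tokens of each role into one value per role."""
    roles: Dict[str, List[str]] = {}
    for item in tagged:
        roles.setdefault(item["role"], []).append(item["token"])
    return {role: " ".join(tokens) for role, tokens in roles.items()}


if __name__ == "__main__":
    print(group_roles(tag_label("Create Purchase Order")))
```

With such role-level values attached to each event, downstream analyses could, for example, group events by business object rather than by the raw label text.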


  1. We kindly refer to Sect. 4.1 for further information on the event logs referenced here.

  7. For reproducibility, the gold standard is published alongside the implementation.



Author information


Corresponding author

Correspondence to Adrian Rebmann.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Rebmann, A., van der Aa, H. (2021). Extracting Semantic Process Information from the Natural Language in Event Logs. In: La Rosa, M., Sadiq, S., Teniente, E. (eds) Advanced Information Systems Engineering. CAiSE 2021. Lecture Notes in Computer Science, vol 12751. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-79381-4

  • Online ISBN: 978-3-030-79382-1

  • eBook Packages: Computer Science (R0)
