Abstract
Process mining focuses on the analysis of recorded event data in order to gain insights about the true execution of business processes. While foundational process mining techniques treat such data as sequences of abstract events, more advanced techniques depend on the availability of specific kinds of information, such as resources in organizational mining and business objects in artifact-centric analysis. However, this information is generally not readily available; instead, it is associated with events in an ad hoc manner, often even as part of unstructured textual attributes. Given the size and complexity of event logs, this calls for automated support to extract such process information and, thereby, enable advanced process mining techniques. In this paper, we present an approach that achieves this through so-called semantic role labeling of event data. We combine the analysis of textual attribute values, based on a state-of-the-art language model, with a novel attribute classification technique. In this manner, our approach extracts information about up to eight semantic roles per event. We demonstrate the approach’s efficacy through a quantitative evaluation using a broad range of event logs and illustrate the usefulness of the extracted information in a case study.
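To make the idea of semantic role labeling of event data more concrete, the following is a minimal sketch, not the approach presented in the paper. It assumes a toy verb lexicon (`ACTION_WORDS`) in place of a language model, and the role names `action`, `business_object`, and `resource` as well as the helper `label_event` are hypothetical choices for illustration. Only the attribute keys `concept:name` and `org:resource` are taken from the XES standard.

```python
from dataclasses import dataclass, field

# Hypothetical verb lexicon; a stand-in for a trained language model.
ACTION_WORDS = {"create", "approve", "send", "reject", "check"}


@dataclass
class RoleAnnotation:
    """Toy container for the roles found in one event (names are assumptions)."""
    action: list[str] = field(default_factory=list)
    business_object: list[str] = field(default_factory=list)
    resource: list[str] = field(default_factory=list)


def label_event(event: dict[str, str]) -> RoleAnnotation:
    """Assign toy semantic roles to the parts of a single event."""
    annotation = RoleAnnotation()
    # Textual attribute: tag every token of the activity label individually.
    for token in event.get("concept:name", "").lower().split():
        if token in ACTION_WORDS:
            annotation.action.append(token)
        else:
            annotation.business_object.append(token)
    # Attribute-level role: the entire attribute value plays a single role.
    if "org:resource" in event:
        annotation.resource.append(event["org:resource"])
    return annotation


event = {"concept:name": "Create purchase order", "org:resource": "Clerk 7"}
annotation = label_event(event)
print(annotation)
```

The sketch separates the two levels mentioned in the abstract: roles found inside a textual attribute value, and roles assigned to an attribute as a whole. A realistic implementation would replace the lexicon lookup with a learned sequence tagger and would have to classify attributes whose names are not standardized.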
Notes
- 1.
We kindly refer to Sect. 4.1 for further information on the event logs referenced here.
- 7.
For reproducibility, the gold standard is published alongside the implementation.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Rebmann, A., van der Aa, H. (2021). Extracting Semantic Process Information from the Natural Language in Event Logs. In: La Rosa, M., Sadiq, S., Teniente, E. (eds) Advanced Information Systems Engineering. CAiSE 2021. Lecture Notes in Computer Science, vol 12751. Springer, Cham. https://doi.org/10.1007/978-3-030-79382-1_4
DOI: https://doi.org/10.1007/978-3-030-79382-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79381-4
Online ISBN: 978-3-030-79382-1
eBook Packages: Computer Science, Computer Science (R0)