Data-Driven Annotation of Textual Process Descriptions Based on Formal Meaning Representations

  • Conference paper
Advanced Information Systems Engineering (CAiSE 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12751)

Abstract

Business process management encompasses a variety of tasks that can be supported by software but usually require formal process representations, i.e., process models. However, learning a formal process modeling language such as BPMN requires significant effort. This is one reason why companies often still rely on informal textual process descriptions. In contrast to formal models, however, information in natural language text usually cannot be processed automatically by algorithms. Hence, recent research also focuses on annotated textual process descriptions, which make text machine-processable.

While still human-readable, annotated textual process descriptions additionally contain annotations that follow a formal scheme. They thus also enable automated processing, for instance by formal reasoning and simulation. State-of-the-art techniques for automatically annotating textual process descriptions are based either on hand-crafted rule sets or on artificial neural networks. Maintaining complex rule sets requires significant manual effort, and the approaches using neural networks suffer from rather low result quality. In this paper, we present an approach based on Semantic Parsing and Graph Convolutional Networks that avoids manually defined rules and provides significantly better results than existing techniques based on neural networks. A comprehensive evaluation using multiple data sets from both academia and industry shows encouraging results and differentiates between several applied text features.

Notes

  1. Literature refers to Clause Classification and Clause Semantics Recognition as Sentence Classification and Sentence Semantics Recognition, which suggests processing of whole sentences, though the discussed approach operates on clauses instead.

  2. Our code can be accessed at https://github.com/JulianNeuberger/UCCA4BPM.

  3. See https://universaldependencies.org/u/pos/, accessed 2020/12/5.

  4. Using token-based node features, inner nodes use the zero vector as their feature, since they do not have a corresponding token. Therefore, two edges need to be traversed before the incoming edge information is aggregated in a terminal node: the artificial inverse edge “up” the UCCA structure and only then the edge in question, see Sect. 4.

  5. See https://universaldependencies.org/u/feat/index.html, accessed 2020/12/5.
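The two-step propagation described in note 4 can be illustrated with a small sketch. This is not the authors' implementation: it uses plain sum aggregation without learned weights, relation types, or nonlinearities, and the toy graph, node names, and feature values are all hypothetical. It only shows why a terminal node receives nothing useful from a zero-vector inner node in the first round, and does receive information once the inner node has been filled via the artificial inverse edges.

```python
# Toy UCCA-like graph: one inner node "p" (no token) with two terminals.
# All names and feature values are hypothetical.
features = {
    "p": [0.0, 0.0],   # inner node: zero vector, no corresponding token
    "t1": [1.0, 0.0],  # terminal for token 1
    "t2": [0.0, 1.0],  # terminal for token 2
}

# Original parent -> child edges plus artificial inverse ("up") edges.
down_edges = [("p", "t1"), ("p", "t2")]
up_edges = [(child, parent) for parent, child in down_edges]
edges = down_edges + up_edges


def incoming_messages(state, edges):
    """Sum, per node, the states of all nodes that have an edge pointing to it."""
    dim = len(next(iter(state.values())))
    messages = {node: [0.0] * dim for node in state}
    for source, target in edges:
        messages[target] = [m + s for m, s in zip(messages[target], state[source])]
    return messages


def propagate(state, edges):
    """One simplified round: keep the node's own state and add incoming messages."""
    messages = incoming_messages(state, edges)
    return {
        node: [own + msg for own, msg in zip(state[node], messages[node])]
        for node in state
    }


# Round 1: terminals only receive the zero vector from "p",
# while "p" collects terminal features over the inverse edges.
received_round_1 = incoming_messages(features, edges)
state_1 = propagate(features, edges)

# Round 2: the edge p -> t2 now carries non-zero information.
received_round_2 = incoming_messages(state_1, edges)

print(received_round_1["t2"])  # [0.0, 0.0]
print(received_round_2["t2"])  # [1.0, 1.0]
```

In this simplified setting, a single propagation round would leave the message arriving at each terminal empty, which is consistent with the note's remark that two edges must be traversed.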

Acknowledgements

We thank Omri Abend (HUJI) and Daniel Hershcovich (UCPH) for their assistance with UCCA, Lluís Padró, Luis Quishpi and Josep Carmona (UPC) for valuable advice regarding their approach, and the DBIS Chair (UBT) for assistance creating the new dataset.

Correspondence to Lars Ackermann or Julian Neuberger.

© 2021 Springer Nature Switzerland AG

Cite this paper

Ackermann, L., Neuberger, J., Jablonski, S. (2021). Data-Driven Annotation of Textual Process Descriptions Based on Formal Meaning Representations. In: La Rosa, M., Sadiq, S., Teniente, E. (eds) Advanced Information Systems Engineering. CAiSE 2021. Lecture Notes in Computer Science, vol 12751. Springer, Cham. https://doi.org/10.1007/978-3-030-79382-1_5

  • DOI: https://doi.org/10.1007/978-3-030-79382-1_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-79381-4

  • Online ISBN: 978-3-030-79382-1

  • eBook Packages: Computer Science (R0)
