Abstract
Despite recent advances in process mining, making this technology accessible to non-technical users remains a challenge. Process maps and dashboards still seem to intimidate many line-of-business professionals. To democratize this technology, we propose a natural language querying interface that allows non-technical users to retrieve relevant information and insights about their processes by simply asking questions in plain English. In this work, we propose a reference architecture that supports natural language questions and provides answers by integrating with existing process mining tools. We combine classic natural language processing techniques (such as entity recognition and semantic parsing) with an abstract logical representation for process mining queries. We also provide a compilation of real natural language questions and an implementation of the architecture that interfaces with an existing commercial tool, Everflow. In addition, we introduce a taxonomy of process mining questions and use it as a background grid to analyze the performance of this experiment. Finally, we point to opportunities for future work in this field.









Data availability
The question dataset analysed during the current study is available at https://ic.unicamp.br/~luciana.barbieri/20220306-classifiedpmquestions.csv.
Acknowledgements
We would like to thank Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, for providing the financial support for this work.
Ethics declarations
Conflicts of interest
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Barbieri, L., Madeira, E., Stroeh, K. et al. A natural language querying interface for process mining. J Intell Inf Syst 61, 113–142 (2023). https://doi.org/10.1007/s10844-022-00759-9
DOI: https://doi.org/10.1007/s10844-022-00759-9