Abstract
Predictive process monitoring aims to predict how the execution of a running process instance will evolve until its completion. Deep learning techniques have been shown to perform well for various prediction tasks, such as next activity prediction, remaining time prediction, and outcome prediction. However, the quality and performance of these models are highly dependent on the amount of available training data, as deep learning models require a lot of data to generalize well. In practice, the available event logs usually contain only a few thousand records with more or less redundancy, which is insufficient given the large number of parameters that must be estimated during training. For this reason, data augmentation is often used in machine learning research to increase the amount of available training data by applying transformations to existing samples and thereby creating new samples synthetically. Since data augmentation is still largely unexplored in predictive process monitoring, this paper proposes an initial set of simple noise-based transformations that can be applied to any event log and that boost the performance of existing predictive process monitoring approaches. Our experimental evaluation shows that predictive process monitoring approaches for predicting the next activity benefit from this data augmentation technique in terms of both performance and stability of the training process.
Our work is supported by the Bavarian Research Foundation (grant no. AZ-1390-19).
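To make the general idea in the abstract concrete, the following is a minimal sketch of noise-based augmentation on an event log. It is not the authors' implementation (their code is in the repository linked in the notes). The trace representation (a list of event dictionaries), the function names `swap_adjacent_events`, `jitter_timestamps`, and `augment_log`, and the choice of these two transformations are all assumptions made for illustration.

```python
import random
from copy import deepcopy
from datetime import datetime, timedelta

# Assumed representation (not from the paper): a trace is a list of events,
# each event a dict with "activity", "resource", and "timestamp" keys.


def swap_adjacent_events(trace, rng):
    """Copy the trace and swap activity/resource of two neighbouring events.

    Timestamps stay in place, so the copy remains chronologically ordered.
    """
    augmented = deepcopy(trace)
    if len(augmented) < 2:
        return augmented
    i = rng.randrange(len(augmented) - 1)
    for key in ("activity", "resource"):
        augmented[i][key], augmented[i + 1][key] = (
            augmented[i + 1][key],
            augmented[i][key],
        )
    return augmented


def jitter_timestamps(trace, rng, max_shift_seconds=60.0):
    """Copy the trace and add uniform noise to every timestamp.

    The noisy timestamps are sorted before being written back, so the
    event order of the copy is unchanged.
    """
    augmented = deepcopy(trace)
    shifted = sorted(
        event["timestamp"]
        + timedelta(seconds=rng.uniform(-max_shift_seconds, max_shift_seconds))
        for event in augmented
    )
    for event, timestamp in zip(augmented, shifted):
        event["timestamp"] = timestamp
    return augmented


def augment_log(log, factor=2, seed=0):
    """Return the original traces plus (factor - 1) synthetic copies of each."""
    rng = random.Random(seed)
    transformations = [swap_adjacent_events, jitter_timestamps]
    augmented_log = list(log)
    for _ in range(factor - 1):
        for trace in log:
            transform = rng.choice(transformations)
            augmented_log.append(transform(trace, rng))
    return augmented_log


# Usage with a hypothetical one-trace log.
start = datetime(2023, 1, 1, 9, 0, 0)
example_log = [
    [
        {"activity": name, "resource": "clerk", "timestamp": start + timedelta(minutes=10 * k)}
        for k, name in enumerate(["register", "check", "approve"])
    ]
]
augmented_example = augment_log(example_log, factor=3, seed=42)
print(len(augmented_example))
```

Because the transformations operate only on the log, a sketch like this is independent of the prediction model trained afterwards. Which transformations are actually useful, and how strongly to apply them, is an empirical question that the paper's evaluation addresses.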
Notes
- 1. Our code and detailed results can be found in the supplementary material at https://github.com/mkaep/pbpm-ssl-suite.
- 2.
- 3. We do not apply commonly used edit distance metrics such as the Damerau-Levenshtein distance, since they would not provide much insight in our case. These metrics cannot handle timestamps, and the number of edits required for the activity and resource attributes is obvious in most cases (with the exception of loop and fragment augmentation, each transformation requires one edit for each attribute).
- 4.
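The remark in note 3 about edit distances can be checked with a small sketch. The function below computes the optimal string alignment distance, a restricted variant of the Damerau-Levenshtein distance, over activity sequences. The activity names and the two example transformations (replacing one activity, swapping two neighbours) are hypothetical and only serve to show that such a transformation costs a single edit while timestamps play no role in the metric.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between two sequences.

    Counts insertions, deletions, substitutions, and transpositions of
    adjacent items, each with cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


# Hypothetical activity sequences of one trace and two augmented variants.
original = ["register", "check", "approve", "pay"]
replaced = ["register", "check", "reject", "pay"]
swapped = ["register", "approve", "check", "pay"]

print(osa_distance(original, replaced))  # 1
print(osa_distance(original, swapped))   # 1
```

Both variants are one edit away from the original, and the result would be the same regardless of how the timestamps were changed, which is why such a metric adds little information here.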
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Käppel, M., Jablonski, S. (2023). Model-Agnostic Event Log Augmentation for Predictive Process Monitoring. In: Indulska, M., Reinhartz-Berger, I., Cetina, C., Pastor, O. (eds) Advanced Information Systems Engineering. CAiSE 2023. Lecture Notes in Computer Science, vol 13901. Springer, Cham. https://doi.org/10.1007/978-3-031-34560-9_23
DOI: https://doi.org/10.1007/978-3-031-34560-9_23
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34559-3
Online ISBN: 978-3-031-34560-9
eBook Packages: Computer Science, Computer Science (R0)