Model-Agnostic Event Log Augmentation for Predictive Process Monitoring

  • Conference paper
Advanced Information Systems Engineering (CAiSE 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13901)

Abstract

Predictive process monitoring aims to predict how the execution of a running process instance will evolve until its completion. Deep learning techniques have been shown to perform well for various prediction tasks, such as next activity prediction, remaining time prediction, or outcome prediction. However, the quality and performance of these models are highly dependent on the amount of available training data, as deep learning models require a lot of data to generalize well. In practice, the available event logs usually contain only a few thousand records with more or less redundancy, which is insufficient given the large number of parameters that need to be estimated during training. For this reason, data augmentation is often used in machine learning research to increase the amount of available training data by applying transformations to it and thereby creating new samples synthetically. Since data augmentation is still largely unexplored in predictive process monitoring, this paper proposes an initial set of simple noise-based transformations that can be applied to any event log and boost the performance of existing predictive process monitoring approaches. Our experimental evaluation shows that predictive process monitoring approaches for predicting the next activity benefit from this data augmentation technique in terms of performance and stability of the training process.
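
To make the idea of noise-based event log augmentation more concrete, the following Python sketch shows what such transformations might look like. It is only an illustration under stated assumptions: the event schema (a dict with "activity", "resource", and "timestamp"), the two transformations `random_swap` and `random_delete`, and the helper `augment_log` are hypothetical and are not taken from the paper or its supplementary code.

```python
import copy
import random
from datetime import datetime, timedelta

# Assumed minimal schema: a trace is a list of events, and an event is a dict
# with "activity", "resource", and "timestamp" keys (hypothetical, for illustration).

def random_swap(trace, rng):
    """Copy the trace and exchange activity and resource of two adjacent events.

    Timestamps stay in place, so the copy remains chronologically ordered.
    """
    new = copy.deepcopy(trace)
    if len(new) < 2:
        return new
    i = rng.randrange(len(new) - 1)
    for key in ("activity", "resource"):
        new[i][key], new[i + 1][key] = new[i + 1][key], new[i][key]
    return new

def random_delete(trace, rng):
    """Copy the trace and drop one randomly chosen event."""
    new = copy.deepcopy(trace)
    if len(new) < 2:
        return new
    del new[rng.randrange(len(new))]
    return new

def augment_log(log, factor, seed=0):
    """Return the original traces plus synthetic ones.

    The result holds roughly `factor` times as many traces as `log`.
    """
    rng = random.Random(seed)
    transforms = [random_swap, random_delete]
    augmented = list(log)
    n_new = int(len(log) * (factor - 1))
    for _ in range(n_new):
        trace = rng.choice(log)           # pick a trace to perturb
        transform = rng.choice(transforms)  # pick a noise transformation
        augmented.append(transform(trace, rng))
    return augmented

# Usage: one trace A -> B -> C -> D, tripled in size by augmentation.
start = datetime(2023, 1, 1)
log = [[{"activity": a, "resource": "r1", "timestamp": start + timedelta(hours=i)}
        for i, a in enumerate("ABCD")]]
augmented_log = augment_log(log, factor=3)
```

Because the transformations operate on the log itself rather than on a model's internal encoding, the augmented log could in principle be passed to any prediction approach unchanged, which is the sense in which such a scheme is model-agnostic.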

Our work is supported by the Bavarian Research Foundation (grant no. AZ-1390-19).

Notes

  1. Our code and detailed results can be found in the supplementary material at https://github.com/mkaep/pbpm-ssl-suite.

  2. https://data.4tu.nl/.

  3. We do not apply commonly used edit distance metrics such as the Damerau-Levenshtein distance, since they would provide little insight in our case. These metrics cannot handle timestamps, and the number of edits required for the activity and resource attributes is obvious in most cases: with the exception of loop and fragment augmentation, each transformation requires one edit per attribute.

  4. https://search.r-project.org/CRAN/refmans/DescTools/html/StuartMaxwellTest.html.
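
The argument in note 3 can be illustrated with a small sketch. The function below computes the optimal string alignment variant of the Damerau-Levenshtein distance on activity sequences. The example traces and the three perturbations (replacement, adjacent swap, deletion) are hypothetical stand-ins, not the paper's transformations; they only show that a single local change yields a distance of one, and that timestamps play no role in the metric.

```python
def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions, and adjacent transpositions
    needed to turn sequence `a` into sequence `b`.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition of the last two symbols.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]

# Hypothetical activity sequences of an original and three perturbed traces.
original = ["A", "B", "C", "D"]
replaced = ["A", "X", "C", "D"]
swapped = ["A", "C", "B", "D"]
deleted = ["A", "C", "D"]

distances = [osa_distance(original, t) for t in (replaced, swapped, deleted)]
```

Each perturbed trace is exactly one edit away from the original, so the metric would report the same value for nearly every augmented trace.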

Author information

Corresponding author

Correspondence to Martin Käppel.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Käppel, M., Jablonski, S. (2023). Model-Agnostic Event Log Augmentation for Predictive Process Monitoring. In: Indulska, M., Reinhartz-Berger, I., Cetina, C., Pastor, O. (eds) Advanced Information Systems Engineering. CAiSE 2023. Lecture Notes in Computer Science, vol 13901. Springer, Cham. https://doi.org/10.1007/978-3-031-34560-9_23

  • DOI: https://doi.org/10.1007/978-3-031-34560-9_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34559-3

  • Online ISBN: 978-3-031-34560-9

  • eBook Packages: Computer Science, Computer Science (R0)
