Skip to main content

On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery

  • Conference paper
On the Move to Meaningful Internet Systems: OTM 2012 (OTM 2012)

Abstract

Process discovery algorithms typically aim at discovering process models from event logs that best describe the recorded behavior. Often, the quality of a process discovery algorithm is measured by quantifying to what extent the resulting model can reproduce the behavior in the log, i.e. replay fitness. At the same time, there are many other metrics that compare a model with recorded behavior in terms of the precision of the model and the extent to which the model generalizes the behavior in the log. Furthermore, several metrics exist to measure the complexity of a model irrespective of the log.

In this paper, we show that existing process discovery algorithms typically consider at most two out of the four main quality dimensions: replay fitness, precision, generalization and simplicity. Moreover, existing approaches can not steer the discovery process based on user-defined weights for the four quality dimensions.

This paper also presents the ETM algorithm which allows the user to seamlessly steer the discovery process based on preferences with respect to the four quality dimensions. We show that all dimensions are important for process discovery. However, it only makes sense to consider precision, generalization and simplicity if the replay fitness is acceptable.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Carmona, J., van Dongen, B.F., van der Aalst, W.M.P., Adriansyah, A., Munoz-Gama, J.: Alignment Based Precision Checking. BPM Center Report BPM-12-10. BPMcenter.org (2012)

    Google Scholar 

  2. van der Aalst, W., Adriansyah, A., van Dongen, B.: Replaying history on process models for conformance checking and performance analysis. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2(2), 182–192 (2012)

    Article  Google Scholar 

  3. van der Aalst, W.M.P.: Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer, Berlin (2011)

    MATH  Google Scholar 

  4. van der Aalst, W.M.P., de Medeiros, A.K.A., Weijters, A.J.M.M.: Genetic Process Mining. In: Ciardo, G., Darondeau, P. (eds.) ICATPN 2005. LNCS, vol. 3536, pp. 48–69. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  5. van der Aalst, W.M.P., Weijters, A.J.M.M., Maruster, L.: Workflow Mining: Discovering Process Models from Event Logs. IEEE Transactions on Knowledge and Data Engineering 16(9), 1128–1142 (2004)

    Article  Google Scholar 

  6. Adriansyah, A., van Dongen, B., van der Aalst, W.M.P.: Conformance Checking using Cost-Based Fitness Analysis. In: IEEE International Enterprise Computing Conference (EDOC 2011), pp. 55–64. IEEE Computer Society (2011)

    Google Scholar 

  7. Bergenthum, R., Desel, J., Lorenz, R., Mauser, S.: Process Mining Based on Regions of Languages. In: Alonso, G., Dadam, P., Rosemann, M. (eds.) BPM 2007. LNCS, vol. 4714, pp. 375–383. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  8. Buijs, J.C.A.M., van Dongen, B.F., van der Aalst, W.M.P.: A Genetic Algorithm for Discovering Process Trees. In: Proceedings of the 2012 IEEE World Congress on Computational Intelligence. IEEE (to appear, 2012)

    Google Scholar 

  9. Calders, T., Günther, C.W., Pechenizkiy, M., Rozinat, A.: Using minimum description length for process mining. In: Proceedings of the 2009 ACM Symposium on Applied Computing, SAC 2009, pp. 1451–1455. ACM, New York (2009)

    Chapter  Google Scholar 

  10. van Dongen, B.F.: Process Mining and Verification. Phd thesis, Eindhoven University of Technology (2007)

    Google Scholar 

  11. van Dongen, B.F., Alves de Medeiros, A.K., Wen, L.: Process Mining: Overview and Outlook of Petri Net Discovery Algorithms. In: Jensen, K., van der Aalst, W.M.P. (eds.) ToPNoC II. LNCS, vol. 5460, pp. 225–242. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  12. Eiben, A.E., Smith, J.E.: Introduction to evolutionary computing. Springer (2003)

    Google Scholar 

  13. Günther, C.W., van der Aalst, W.M.P.: Fuzzy Mining – Adaptive Process Simplification Based on Multi-perspective Metrics. In: Alonso, G., Dadam, P., Rosemann, M. (eds.) BPM 2007. LNCS, vol. 4714, pp. 328–343. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  14. Alves de Medeiros, A.K., Weijters, A.J.M.M., van der Aalst, W.M.P.: Genetic Process Mining: An Experimental Evaluation. Data Mining and Knowledge Discovery 14(2), 245–304 (2007)

    Article  MathSciNet  Google Scholar 

  15. Mendling, J., Verbeek, H.M.W., van Dongen, B.F., van der Aalst, W.M.P., Neumann, G.: Detection and Prediction of Errors in EPCs of the SAP Reference Model. Data and Knowledge Engineering 64(1), 312–329 (2008)

    Article  Google Scholar 

  16. Rozinat, A., van der Aalst, W.M.P.: Conformance Checking of Processes Based on Monitoring Real Behavior. Information Systems 33(1), 64–95 (2008)

    Article  Google Scholar 

  17. Vanhatalo, J., Völzer, H., Koehler, J.: The Refined Process Structure Tree. Data and Knowledge Engineering 68(9), 793–818 (2009)

    Article  Google Scholar 

  18. Verbeek, H.M.W., Buijs, J.C.A.M., van Dongen, B.F., van der Aalst, W.M.P.: XES, XESame, and ProM 6. In: Soffer, P., Proper, E. (eds.) CAiSE Forum 2010. LNBIP, vol. 72, pp. 60–75. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  19. Weijters, A.J.M.M., van der Aalst, W.M.P.: Rediscovering Workflow Models from Event-Based Data using Little Thumb. Integrated Computer-Aided Engineering 10(2), 151–162 (2003)

    Google Scholar 

  20. Wen, L., van der Aalst, W.M.P., Wang, J., Sun, J.: Mining Process Models with Non-Free-Choice Constructs. Data Mining and Knowledge Discovery 15(2), 145–180 (2007)

    Article  MathSciNet  Google Scholar 

  21. van der Werf, J.M.E.M., van Dongen, B.F., Hurkens, C.A.J., Serebrenik, A.: Process Discovery using Integer Linear Programming. Fundamenta Informaticae 94, 387–412 (2010)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Buijs, J.C.A.M., van Dongen, B.F., van der Aalst, W.M.P. (2012). On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery. In: Meersman, R., et al. On the Move to Meaningful Internet Systems: OTM 2012. OTM 2012. Lecture Notes in Computer Science, vol 7565. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33606-5_19

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-33606-5_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33605-8

  • Online ISBN: 978-3-642-33606-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics