On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery

Buijs, Joos C. A. M.; van Dongen, Boudewijn F.; van der Aalst, Wil M. P.

doi:10.1007/978-3-642-33606-5_19

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7565)

Included in the following conference series:

OTM Confederated International Conferences "On the Move to Meaningful Internet Systems"

Abstract

Process discovery algorithms typically aim to discover, from event logs, the process models that best describe the recorded behavior. The quality of a process discovery algorithm is often measured by quantifying the extent to which the resulting model can reproduce the behavior in the log, i.e., replay fitness. At the same time, many other metrics compare a model with the recorded behavior in terms of the precision of the model and the extent to which the model generalizes the behavior in the log. Furthermore, several metrics exist to measure the complexity of a model irrespective of the log.
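As a rough illustration of the first two dimensions, the sketch below scores a toy model against a toy log. It assumes the model can be represented as a finite set of allowed traces, and the functions `trace_fitness` and `trace_precision` are hypothetical, deliberately simplified stand-ins; they are not the replay-based metrics the paper evaluates.

```python
def trace_fitness(log, model_language):
    """Fraction of log traces (duplicates counted) that the model allows."""
    if not log:
        raise ValueError("log must contain at least one trace")
    replayable = sum(1 for trace in log if trace in model_language)
    return replayable / len(log)


def trace_precision(log, model_language):
    """Fraction of the traces allowed by the model that occur in the log."""
    if not model_language:
        raise ValueError("model must allow at least one trace")
    observed = set(log)
    seen = sum(1 for trace in model_language if trace in observed)
    return seen / len(model_language)


# Toy event log: each trace is a tuple of activity names (assumed data).
log = [("a", "b", "c"), ("a", "c", "b"), ("a", "b", "c"), ("a", "d")]

# Toy model: the finite set of traces it allows (assumed data).
model_language = {("a", "b", "c"), ("a", "c", "b"), ("a", "b", "b", "c")}

fitness = trace_fitness(log, model_language)      # 3 of 4 log traces replay
precision = trace_precision(log, model_language)  # 2 of 3 model traces observed
```

In this toy setting, a model that allows every possible trace would score perfectly on fitness and poorly on precision, which is why the two dimensions have to be read together.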

In this paper, we show that existing process discovery algorithms typically consider at most two of the four main quality dimensions: replay fitness, precision, generalization and simplicity. Moreover, existing approaches cannot steer the discovery process based on user-defined weights for the four quality dimensions.

This paper also presents the ETM algorithm, which allows the user to seamlessly steer the discovery process based on preferences with respect to the four quality dimensions. We show that all dimensions are important for process discovery. However, it only makes sense to consider precision, generalization and simplicity if the replay fitness is acceptable.
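One way to picture steering by user-defined weights is a weighted average over the four dimension scores. The abstract does not state how the scores are combined, so the sketch below is an assumption: the `Quality` class, the `overall_quality` function, the candidate scores and the weight values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

DIMENSIONS = ("replay_fitness", "precision", "generalization", "simplicity")


@dataclass(frozen=True)
class Quality:
    """Scores in [0, 1] for the four quality dimensions of one candidate model."""
    replay_fitness: float
    precision: float
    generalization: float
    simplicity: float


def overall_quality(quality, weights):
    """Weighted average of the four scores (an assumed combination rule)."""
    if any(weights[name] < 0 for name in DIMENSIONS):
        raise ValueError("weights must be non-negative")
    total = sum(weights[name] for name in DIMENSIONS)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    weighted = sum(weights[name] * getattr(quality, name) for name in DIMENSIONS)
    return weighted / total


# Two hypothetical candidate models with made-up scores.
candidate_a = Quality(replay_fitness=0.95, precision=0.60,
                      generalization=0.70, simplicity=0.80)
candidate_b = Quality(replay_fitness=0.50, precision=1.00,
                      generalization=0.90, simplicity=0.90)

# Two hypothetical user preferences.
equal_weights = {name: 1.0 for name in DIMENSIONS}
fitness_heavy = {**equal_weights, "replay_fitness": 10.0}

prefers_b_when_equal = (overall_quality(candidate_b, equal_weights)
                        > overall_quality(candidate_a, equal_weights))
prefers_a_when_fitness_heavy = (overall_quality(candidate_a, fitness_heavy)
                                > overall_quality(candidate_b, fitness_heavy))
```

With equal weights the poorly fitting candidate ranks higher, while emphasizing replay fitness reverses the ranking. This matches the abstract's point that the other three dimensions are only meaningful once replay fitness is acceptable.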

Author information

Authors and Affiliations

Eindhoven University of Technology, The Netherlands
Joos C. A. M. Buijs, Boudewijn F. van Dongen & Wil M. P. van der Aalst

Editor information

Editors and Affiliations

Department of Computer Science, Semantic Technology and Application Research Laboratory (STARLab), Vrije Universiteit Brussel, Building G-10, Pleinlaan 2, 1050, Brussels, Belgium
Robert Meersman
Research Centre for Automatic Control, School of Engineering in Information Technology, University of Lorraine, CNRS, Campus scientifique, BP 70239, 54506, Vandoeuvre-les-Nancy, France
Hervé Panetto
La Trobe University, Melbourne, VIC, Australia
Tharam Dillon
Faculty of Computer Science, University of Vienna, 1010, Vienna, Austria
Stefanie Rinderle-Ma
Institute of Databases and Information Systems, Ulm University, Germany
Peter Dadam
School of Information Technology and Electrical Engineering, University of Queensland, 4072, Brisbane, QLD, Australia
Xiaofang Zhou
HP Labs, Bristol, UK
Siani Pearson
Johannes Kepler University, Linz, Austria
Alois Ferscha
Università di Modena e Reggio Emilia, Modena, Italy
Sonia Bergamaschi
ADVIS Lab, Department of Computer Science, University of Illinois at Chicago, Chicago, IL, USA
Isabel F. Cruz

About this paper

Cite this paper

Buijs, J.C.A.M., van Dongen, B.F., van der Aalst, W.M.P. (2012). On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery. In: Meersman, R., et al. On the Move to Meaningful Internet Systems: OTM 2012. OTM 2012. Lecture Notes in Computer Science, vol 7565. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33606-5_19

DOI: https://doi.org/10.1007/978-3-642-33606-5_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33605-8
Online ISBN: 978-3-642-33606-5
eBook Packages: Computer Science, Computer Science (R0)