Comparative Analysis of Process Mining Algorithms in Python

  • Conference paper
Smart Objects and Technologies for Social Good (GOODTECHS 2021)

Abstract

In many sectors, large amounts of data are collected and stored but never analyzed. Health care is a good example. This situation is undesirable, as the data can provide historical information or reveal trends that may help to improve organizations' performance in the future. Process mining allows knowledge to be extracted from the data generated and stored in information systems.

This work aims to contribute to this knowledge extraction by comparing different process mining algorithms on health care processes and data. The results showed that Inductive Miner and Heuristic Miner are the algorithms with the best results. In terms of execution time, the Petri net is the type of model that takes longest to produce, but it is also the one that allows a better analysis.

Notes

  1. Noise is the result of data quality problems, such as registration errors, which infrequently manifest themselves in the behavior of the process [13].

  2. A Petri net has two types of elements, places and transitions. A place can contain zero or more tokens. A transition is enabled if every one of its input places (the places connected to it by an incoming arc) contains at least one token [14].

  3. A process tree is a tree-structured process model, where leaf nodes represent activities and non-leaf nodes represent control-flow operators [28].

  4. A transition system is used to describe the potential behavior of discrete systems. It consists of states and transitions between states [29].
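
The enabling rule for Petri nets described in note 2 can be illustrated with a small token game. The following is a minimal sketch in plain Python, not taken from the paper and not the PM4Py implementation. The class `PetriNet`, the helper `build_example_net`, and the place and transition names are hypothetical and serve only as an illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PetriNet:
    """Toy Petri net: a marking plus the input and output places of each transition."""

    marking: dict[str, int]        # place name -> number of tokens
    inputs: dict[str, list[str]]   # transition -> its input places
    outputs: dict[str, list[str]]  # transition -> its output places

    def is_enabled(self, transition: str) -> bool:
        # A transition is enabled only if every input place holds at least one token.
        return all(self.marking.get(place, 0) >= 1 for place in self.inputs[transition])

    def fire(self, transition: str) -> None:
        # Firing consumes one token per input place and produces one per output place.
        if not self.is_enabled(transition):
            raise ValueError(f"transition {transition!r} is not enabled")
        for place in self.inputs[transition]:
            self.marking[place] -= 1
        for place in self.outputs[transition]:
            self.marking[place] = self.marking.get(place, 0) + 1


def build_example_net() -> PetriNet:
    # Hypothetical two-step patient flow; the names are illustrative only.
    return PetriNet(
        marking={"arrived": 1, "triaged": 0, "discharged": 0},
        inputs={"triage": ["arrived"], "discharge": ["triaged"]},
        outputs={"triage": ["triaged"], "discharge": ["discharged"]},
    )


if __name__ == "__main__":
    net = build_example_net()
    print(net.is_enabled("discharge"))  # no token in "triaged" yet
    net.fire("triage")                  # moves the token from "arrived" to "triaged"
    print(net.is_enabled("discharge"))
```

In this sketch, "discharge" only becomes enabled after "triage" has fired, because its single input place is empty in the initial marking. Arc weights and final markings are deliberately left out to keep the example short.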

References

  1. Hendricks, R.: Process mining of incoming patients with sepsis. Online J. Public Health Inform. 11(2) (2019). https://doi.org/10.5210/ojphi.v11i2.10151

  2. Fraunhofer Institute for Applied Information Technology: Process Mining for Python (PM4Py) (2021). https://pypi.org/project/pm4py/

  3. Kurniati, A.P., Hall, G., Hogg, D., Johnson, O.: Process mining in oncology using the MIMIC-III dataset. In: Journal of Physics: Conference Series, vol. 971, no. 1 (2018). https://doi.org/10.1088/1742-6596/971/1/012008

  4. Bolt, A., van der Aalst, W.M.P., de Leoni, M.: Finding process variants in event logs. In: Panetto, H., et al. (eds.) OTM 2017. LNCS, vol. 10573, pp. 45–52. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69462-7_4

  5. Van Der Aalst, W.: Process mining: overview and opportunities. ACM Trans. Manag. Inf. Syst. 3(2), 1–17 (2012). https://doi.org/10.1145/2229156.2229157

  6. Mans, R.S., Van Der Aalst, W.M.P., Vanwersch, R.J.B.: Process Mining in Healthcare (2015)

  7. Pegoraro, M., Uysal, M.S., van der Aalst, W.M.P.: Discovering process models from uncertain event data. In: Di Francescomarino, C., Dijkman, R., Zdun, U. (eds.) BPM 2019. LNBIP, vol. 362, pp. 238–249. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-37453-2_20

  8. Batista, E., Solanas, A.: Process mining in healthcare: a systematic review. In: 2018 9th International Conference on Information, Intelligence, Systems and Applications, IISA 2018, pp. 1–6 (2019). https://doi.org/10.1109/IISA.2018.8633608

  9. Rojas, E., Cifuentes, A., Burattin, A., Munoz-Gama, J., Sepúlveda, M., Capurro, D.: Performance analysis of emergency room episodes through process mining. Int. J. Environ. Res. Public Health 16(7) (2019). https://doi.org/10.3390/ijerph16071274

  10. Shinde, S.A., Rajeswari, P.R.: Intelligent health risk prediction systems using machine learning: a review. Int. J. Eng. Technol. (UAE) 7(3), 1019–1023 (2018). https://doi.org/10.14419/ijet.v7i3.12654

  11. Wang, L., Du, Y., Qi, L.: Efficient deviation detection between a process model and event logs. IEEE/CAA J. Automatica Sinica 6(6), 1352–1364 (2019). https://doi.org/10.1109/JAS.2019.1911750

  12. Sundari, M.S., Nayak, R.K.: Process mining in healthcare systems: a critical review and its future. Int. J. Emerg. Trends Eng. Res. 8(9), 5197–5208 (2020). https://doi.org/10.30534/ijeter/2020/50892020

  13. Conforti, R., La Rosa, M., ter Hofstede, A.H.M.: Noise Filtering of Process Execution Logs based on Outliers Detection. Institute for Future Environments, School of Information Systems; Science & Engineering Faculty, pp. 1–16 (2015)

  14. Wikipédia: Rede de Petri. Wikipédia, a enciclopédia livre (2019). https://pt.wikipedia.org/w/index.php?title=Rede_de_Petri&oldid=55172483

  15. Van Der Aalst, W.M.P.: A practitioner’s guide to process mining: limitations of the directly-follows graph. Procedia Comput. Sci. 164, 321–328 (2019). https://doi.org/10.1016/j.procs.2019.12.189

  16. Weijters, A.J.M.M., van der Aalst, W.M.P., de Medeiros, A.K.A.: Process Mining with the Heuristics Miner Algorithm. Beta Working Papers (2006)

  17. Bogarín, A., Cerezo, R., Romero, C.: Discovering learning processes using inductive miner: a case study with learning management systems (LMSs). Psicothema 30(3), 322–329 (2018). https://doi.org/10.7334/psicothema2018.116

  18. Breitmayer, M.: Applying Process Mining Algorithms in the Context of Data Collection Scenarios (2018)

  19. Veiga, G.M., Ferreira, D.R.: Understanding spaghetti models with sequence clustering for ProM. In: Rinderle-Ma, S., Sadiq, S., Leymann, F. (eds.) BPM 2009. LNBIP, vol. 43, pp. 92–103. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-12186-9_10

  20. Berti, A., Van Zelst, S.J., Van Der Aalst, W.M.P.: Process mining for Python (PM4Py): bridging the gap between process- and data science. In: CEUR Workshop Proceedings, vol. 2374, pp. 13–16 (2019)

  21. van Dongen, B.F., de Medeiros, A.K.A., Verbeek, H.M.W., Weijters, A.J.M.M., van der Aalst, W.M.P.: The ProM framework: a new era in process mining tool support. In: Ciardo, G., Darondeau, P. (eds.) ICATPN 2005. LNCS, vol. 3536, pp. 444–454. Springer, Heidelberg (2005). https://doi.org/10.1007/11494744_25

  22. Lohmann, N.M.: Discover Your Processes Disc. Proceedings, September (2012)

  23. Badakhshan, P., Geyer-Klingeberg, J., El-Halaby, M., Lutzeyer, T., Affonseca, G.V.L.: Celonis process repository: a bridge between business process management and process mining. In: CEUR Workshop Proceedings, vol. 2673, pp. 67–71 (2020)

  24. Van Der Aalst, W., Weijters, T., Maruster, L.: Workflow mining: discovering process models from event logs. IEEE Trans. Knowl. Data Eng. 16(9), 1128–1142 (2004). https://doi.org/10.1109/TKDE.2004.47

  25. Adriansyah, A., Sidorova, N., Van Dongen, B.F.: Cost-based fitness in conformance checking. In: Proceedings - International Conference on Application of Concurrency to System Design, ACSD 2011, pp. 57–66 (2011). https://doi.org/10.1109/ACSD.2011.19

  26. Pohl, T.: An Inductive Miner Implementation for the PM4PY Framework, pp. 1–66 (2019)

  27. van der Aalst, W.M.P., Song, M.: Mining social networks: uncovering interaction patterns in business processes. In: Desel, J., Pernici, B., Weske, M. (eds.) BPM 2004. LNCS, vol. 3080, pp. 244–260. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25970-1_16

  28. Arriagada-Benítez, M., Sepúlveda, M., Munoz-Gama, J., Buijs, J.C.A.M.: Strategies to automatically derive a process model from a configurable process model based on event data. Appl. Sci. (Switzerland) 7(10) (2017). https://doi.org/10.3390/app7101023

  29. See, E.: Transition System on, pp. 1–17 (2005)

  30. Munoz-Gama, J., Carmona, J.: Enhancing precision in process conformance: stability, confidence, and severity. In: IEEE SSCI 2011: Symposium Series on Computational Intelligence - CIDM 2011: 2011 IEEE Symposium on Computational Intelligence and Data Mining, pp. 184–191 (2011). https://doi.org/10.1109/CIDM.2011.5949451

  31. Berti, A., van der Aalst, W.M.P.: A novel token-based replay technique to speed up conformance checking and process enhancement. In: Koutny, M., Kordon, F., Pomello, L. (eds.) Transactions on Petri Nets and Other Models of Concurrency XV. LNCS, vol. 12530, pp. 1–26. Springer, Heidelberg (2021). https://doi.org/10.1007/978-3-662-63079-2_1

  32. Bloemen, V., Van Zelst, S., Van Der Aalst, W.: Aligning Observed and Modeled Behavior by Maximizing Synchronous Moves and Using Milestones (2019)

  33. Buijs, J.C.A.M., Van Dongen, B.F., Van Der Aalst, W.M.P.: Quality dimensions in process discovery: the importance of fitness, precision, generalization, and simplicity. Int. J. Coop. Inf. Syst. 23(1), 1–39 (2014). https://doi.org/10.1142/S0218843014400012

Acknowledgements

This work is funded by National Funds through the FCT - Foundation for Science and Technology, I.P., within the scope of the project Ref. UIDB/05583/2020. Furthermore, we would like to thank the Research Centre in Digital Services (CISeD) and the Polytechnic of Viseu for their support.

This work is also funded by National Funds through the FCT - Foundation for Science and Technology, I.P., within the scope of the project Ref. UIDB/05507/2020. Furthermore, we would like to thank the Centre for Studies in Education and Innovation (CI&DEI) and the Polytechnic of Viseu for their support.

Author information

Corresponding author

Correspondence to Joana Rita da Silva Fialho.

Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Gomes, A.F.D., de Lacerda, A.C.W.G., da Silva Fialho, J.R. (2021). Comparative Analysis of Process Mining Algorithms in Python. In: Pires, I.M., Spinsante, S., Zdravevski, E., Lameski, P. (eds) Smart Objects and Technologies for Social Good. GOODTECHS 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 401. Springer, Cham. https://doi.org/10.1007/978-3-030-91421-9_3

  • DOI: https://doi.org/10.1007/978-3-030-91421-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-91420-2

  • Online ISBN: 978-3-030-91421-9

  • eBook Packages: Computer Science (R0)
