Skip to main content

Using Suffix-Tree to Identify Patterns and Cluster Traces from Event Log

  • Conference paper
Signal Processing and Information Technology (SPIT 2011)

Abstract

Process mining refers to the extraction process models from event logs. Traditional process mining algorithms have problems dealing with event logs that are produced from unstructured real-life processes and generate spaghetti-like and incomprehensible process models. One means making traces more structural is to extract commonly used process model constructs (common patterns) in the event log and transform traces basing on such constructs. Another way of pre-processing traces is to categorize traces in event log into clusters such that process traces in each cluster can be adequately represented by a process model. Nevertheless, current approaches for trace clustering have many problems such as ignoring context process and huge computational overhead. In this paper, suffix-tree is firstly utilized for discovering common patterns. The traces in event log are transformed with common patterns. Thereafter suffix-trees are applied to categorize transformed traces. The trace clustering algorithm has a linear-time computational complexity. The process models mined from the clustered traces show a high degree of fitness and comprehensibility.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. van der Aalst, W.M.P., Weijters, A.J.M.M., Maruster, L.: Workflow Mining: Discovering Process Models from Event Logs. IEEE Trans. Knowl. Data Eng. 16(9), 1128–1142 (2004)

    Article  Google Scholar 

  2. Greco, G., Guzzo, A., Pontieri, L.: Mining Hierarchies of Models: From Abstract Views to Concrete Specifications. In: van der Aalst, W.M.P., Benatallah, B., Casati, F., Curbera, F. (eds.) BPM 2005. LNCS, vol. 3649, pp. 32–47. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  3. Greco, G., Guzzo, A., Pontieri, L.: Mining Taxonomies of Process Models. Data Knowl. Eng. 67(1), 74 (2008)

    Article  Google Scholar 

  4. Jagadeesh Chandra Bose, R.P., van der Aalst, W.M.P.: Abstractions in Process Mining: A Taxonomy of Patterns. In: Dayal, U., Eder, J., Koehler, J., Reijers, H.A. (eds.) BPM 2009. LNCS, vol. 5701, pp. 159–175. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  5. Jain, A.K., Murty, M.N., Flynn: Data Clustering: A Review. ACM Computing Surveys 31(3), 264–323 (1999)

    Article  Google Scholar 

  6. Song, M., Günther, C.W., van der Aalst, W.M.P.: Trace Clustering in Process Mining. In: Ardagna, D., Mecella, M., Yang, J. (eds.) BPM 2008 Workshops. LNBIP, vol. 17, pp. 109–120. Springer, Heidelberg (2009)

    Google Scholar 

  7. Greco, G., Guzzo, A., Pontieri, L., Sacca, D.: Disco-covering Expressive Process Models by Clusering Log Traces. IEEE Trans. Knowl. Data Eng., 1010–1027 (2006)

    Google Scholar 

  8. Jagadeesh Chandra Bose, R.P., van der Aalst, W.M.P.: Context Aware Trace Clustering: Towards Improving Process Mining Results. In: Proceedings of the SIAM International Conference on Data Mining, SDM, pp. 401–412 (2009)

    Google Scholar 

  9. Song, M., Günther, C.W., van der Aalst, W.M.P.: Trace Clustering in Process Mining. In: Ardagna, D., Mecella, M., Yang, J. (eds.) BPM 2008 Workshops. LNBIP, vol. 17, pp. 109–120. Springer, Heidelberg (2009)

    Google Scholar 

  10. Bose, R.P.J.C., van der Aalst, W.M.P.: Trace Clustering Based on Conserved Patterns: Towards Achieving Better Process Models. In: Rinderle-Ma, S., Sadiq, S., Leymann, F. (eds.) BPM 2009 Workshops. LNBIP, vol. 43, pp. 170–181. Springer, Heidelberg (2010)

    Google Scholar 

  11. Hammouda, K.M., Kamel, M.S.: Efficient phrase-based document indexing for web document clustering. IEEE Transactions on Knowledge and Data Engineering 16(10), 1279–1296 (2004)

    Article  Google Scholar 

  12. Zamir, O., Etzioni, O.: Web document clustering: a feasibility demonstration. In: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 46–54 (1998)

    Google Scholar 

  13. Wen, L., van der Aalst, W.M.P., Wang, J., Sun, J.: Mining Process Models with Non-Free Choice Constructs. Data Min. Knowl. Discov. 15(2), 145–182 (2007)

    Article  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Wang, X., Zhang, L., Cai, H. (2012). Using Suffix-Tree to Identify Patterns and Cluster Traces from Event Log. In: Das, V.V., Ariwa, E., Rahayu, S.B. (eds) Signal Processing and Information Technology. SPIT 2011. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 62. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32573-1_20

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-32573-1_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32572-4

  • Online ISBN: 978-3-642-32573-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics