Finding Composite Episodes

  • Conference paper
Mining Complex Data (MCD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4944)

Abstract

Mining frequent patterns is a major topic in data mining research, resulting in many seminal papers and algorithms on item set and episode discovery. The combination of these, called composite episodes, has attracted far less attention in the literature, however. The main reason is that the well-known frequent pattern explosion is far worse for composite episodes than it is for item sets or episodes. Yet there are many applications where composite episodes are required, e.g., in developmental biology, where sequences containing gene activity sets over time are analyzed.

This paper introduces an effective algorithm for discovering a small, descriptive set of composite episodes. It builds on our earlier work, which employed the minimum description length (MDL) principle to find such sets for item sets and for episodes. Combining the two yields an optimization problem: for the best results, the descriptive power of the components has to be balanced. This problem, too, is solved using MDL.

Editor information

Zbigniew W. Raś, Shusaku Tsumoto, Djamel Zighed

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bathoorn, R., Siebes, A. (2008). Finding Composite Episodes. In: Raś, Z.W., Tsumoto, S., Zighed, D. (eds) Mining Complex Data. MCD 2007. Lecture Notes in Computer Science, vol 4944. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68416-9_13

  • DOI: https://doi.org/10.1007/978-3-540-68416-9_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68415-2

  • Online ISBN: 978-3-540-68416-9

  • eBook Packages: Computer Science, Computer Science (R0)
