Skip to main content
Log in

A unified view of the apriori-based algorithms for frequent episode discovery

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

Frequent episode discovery framework is a popular framework in temporal data mining with many applications. Over the years, many different notions of frequencies of episodes have been proposed along with different algorithms for episode discovery. In this paper, we present a unified view of all the apriori-based discovery methods for serial episodes under these different notions of frequencies. Specifically, we present a unified view of the various frequency counting algorithms. We propose a generic counting algorithm such that all current algorithms are special cases of it. This unified view allows one to gain insights into different frequencies, and we present quantitative relationships among different frequencies. Our unified view also helps in obtaining correctness proofs for various counting algorithms as we show here. It also aids in understanding and obtaining the anti-monotonicity properties satisfied by the various frequencies, the properties exploited by the candidate generation step of any apriori-based method. We also point out how our unified view of counting helps to consider generalization of the algorithm to count episodes with general partial orders.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Achar A, Laxman S, Raajay V, Sastry PS (2009) Discovering general partial orders from event streams. Technical Report arXiv: 0902.1227v2 [cs.AI] http://arxiv.org, Dec 2009

  2. Bouqata Bouchra, Caraothers Christopher D, Szymanski Boleslaw K, Zaki Mohammed J. (2006) Vogue: a novel variable order-gap state machine for modeling sequences. In: Proceedings of European conference principles and practice of knowledge discovery in databases (PKDD’06), pp 42–54, Sep 2006

  3. Casas-Garriga G (2003) Discovering unbounded episodes in sequential data. In: Proceedings of European conference principles and practice of knowledge discovery in databases (PKDD’03), pp 83–94, Sep 2003

  4. Das G, Fleischer R, Ga̧sieniec L, Gunopulos D, Kärkkäinen J (1997) Episode matching. In: Proceedings of annual symposium combinatorial pattern matching(CPM’97), pp 12–27, June–July 1997

  5. Gwadera R, Atallah MJ, Szpankowski W (2003) Reliable detection of episodes in event sequences. In: Proceedings of IEEE international conference data mining (ICDM’03), pp 67–74, Nov 2003

  6. Gwadera R, Atallah MJ, Szpankowski W (2005) Reliable detection of episodes in event sequences. Knowl Inf Syst 7(4): 415–437

    Article  Google Scholar 

  7. Huang K-Y, Chang C-H (2008) Efficient mining of frequent episodes from complex sequences. Inf Syst 33(1): 96–114

    Article  Google Scholar 

  8. Iwanuma K, Takano Y, Nabeshima H (2004) On anti-monotone frequency measures for extracting sequential patterns from a single very-long sequence. In: Proceedings of IEEE conference cybernetics and intelligent systems, pp 213–217, Dec 2004

  9. Laxman S (2006) Discovering frequent episodes: fast algorithms, connections with HMMs and generalizations. PhD thesis, Bangalore, India, Sep 2006

  10. Laxman S, Sastry PS, Unnikrishnan KP (2005) Discovering frequent episodes and learning hidden Markov models: a formal connection. IEEE Trans Knowl Data Eng 17(11): 1505–1517

    Article  Google Scholar 

  11. Laxman S, Sastry PS, Unnikrishnan KP (2007) Discovering frequent generalized episodes when events persist for different durations. IEEE Trans Knowl Data Eng 19(9): 1188–1201

    Article  Google Scholar 

  12. Laxman S, Sastry PS, Unnikrishnan KP (2007) A fast algorithm for finding frequent episodes in event streams. In: Proceedings of ACM SIGKDD international conference knowledge discovery and data mining (KDD’07), pp 410–419, Aug 2007

  13. Laxman S, Tankasali V, White RW (2008) Stream prediction using a generative model based on frequent episodes in event sequences. In: Proceedings of ACM SIGKDD international conference knowledge discovery and data mining (KDD’08), pp 453–461, Jul 2008

  14. Luo J, Bridges SM (2000) Mining fuzzy association rules and fuzzy frequent episodes for intrusion detection. Int J Intell Syst 15(8): 687–703

    Article  MATH  Google Scholar 

  15. Karypis G, Joshi MV, Kumar V (1999) Universal formulation of sequential patterns. Technical Report TR 99-021, Department of Computer Science, University of Minnesota, Minneapolis

  16. Mannila H, Toivonen H (1996) Discovering generalized episodes using minimal occurrences. In: Proceedings of international conference knowledge discovery and data mining (KDD’96), pp 146–151, Aug 1996

  17. Mannila H, Toivonen H, Verkamo AI (1997) Discovery of frequent episodes in event sequences. Data Mining Knowl Discov 1(3): 259–289

    Article  Google Scholar 

  18. Meger N, Rigotti C (2004) Constraint-based mining of episode rules and optimal window sizes. In: Proceedings of European conference principles and practice of knowledge discovery in databases (PKDD’04), Sep 2004

  19. Morchen F (2007) Unsupervised pattern mining from symbolic temporal data. In: SIGKDD explorations, vol 9, pp 41–55, Jun 2007

  20. Nag A, A Wai-Chee Fu (2003) Mining freqeunt episodes for relating financial events and stock trends. In: Proceedings of Pacific-Asia conference knowledge discovery and data mining, (PAKDD 2003), pp 27–39

  21. Orlando S, Perego R, Silvestri C (2004) A new algorithm for gap constrained sequence mining. In: Proceedings of ACM symposium applied computing, pp 540–547, Mar 2004

  22. Papapetrou P, Kollios G, Sclaroff S, Gunopulos D (2009) Mining frequent arrangements of temporal intervals. Knowl Inf Syst 21: 133–171

    Article  Google Scholar 

  23. Patnaik D, Laxman S, Ramakrishnan N (2010) Discovering excitatory relationships using dynamic bayesian networks. Knowl Inf Syst, pp 1–31. doi:10.1007/s10115-010-0344-6

  24. Patnaik D, Sastry PS, Unnikrishnan KP (2008) Inferring neuronal network connectivity from spike data: a temporal data mining approach. Sci Programm 16(1): 49–77

    Google Scholar 

  25. Sastry PS, Unnikrishnan KP (2010) Conditinal probability based significance tests for sequential patterns in multi-neuronal spike trains. Neural Comput 22(4): 1025–1059

    Article  MATH  Google Scholar 

  26. Srikanth R, Agrawal R (1996) Mining sequential patterns: generalizations and performance improvements. In: Proceedings of international conference extending database technology (EDBT), Mar 1996

  27. Tatti N (2008) Maximum entropy based significance of itemsets. Knowl Inf Syst 17(1): 57–77

    Article  Google Scholar 

  28. Tatti N (2009) Significance of episodes based on minimal windows. In: Proceedings of IEEE international conference data mining (ICDM’09), Dec 2009

  29. Unnikrishnan KP, Shadid BQ, Sastry PS, Laxman S (2009) Root cause diagnostics using temporal datamining. U.S.Patent no. 7509234, 24 Mar 2009

  30. Wang M-F, Wu Y-C, Tsai M-F (2008) Exploiting frequent episodes in weighted suffix tree to improve intrusion detection system. In: Proceedings of international conference advanced information networking and applications(AINA’08), pp 1246–1252, Mar 2008

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Avinash Achar.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Achar, A., Laxman, S. & Sastry, P.S. A unified view of the apriori-based algorithms for frequent episode discovery. Knowl Inf Syst 31, 223–250 (2012). https://doi.org/10.1007/s10115-011-0408-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-011-0408-2

Keywords

Navigation