Abstract
Research on traditional association rules has attracted great attention over the past decade. Generally, an association rule A → B is used to predict that B is likely to occur when A occurs. This is a kind of strong correlation, indicating that the two events will probably happen simultaneously. However, in real-world applications such as bioinformatics and medical research, there are many follow-up correlations between itemsets A and B; for example, B is likely to occur n times after A has occurred m times. That is, the correlated itemsets do not belong to the same transaction. We refer to this relation as a follow-up correlation pattern (FCP). Mining FCPs is harder to do efficiently than conventional pattern discovery, because the number of potentially interesting patterns becomes extremely large once patterns are no longer bounded by the length of a single transaction. In this paper, we develop an efficient algorithm to identify FCPs in time-related databases. We also evaluate our approach experimentally and report extensive results on mining this new kind of pattern.
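To make the notion of "B occurs n times after A has occurred m times" concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a time-ordered list of transactions and a fixed look-ahead `window`; the function name `follow_up_confidence`, the `window` parameter, and the way occurrences of A are grouped are all illustrative assumptions that the abstract does not specify.

```python
from typing import FrozenSet, Sequence


def follow_up_confidence(
    transactions: Sequence[FrozenSet[str]],
    a: FrozenSet[str],
    b: FrozenSet[str],
    m: int,
    n: int,
    window: int,
) -> float:
    """Fraction of 'A occurred m times' events followed by >= n occurrences of B.

    Assumptions (hypothetical, for illustration only):
    - transactions are ordered by time;
    - occurrences of A are grouped into non-overlapping runs of m;
    - B is counted in the `window` transactions after the m-th A of each run.
    """
    # Indices of transactions that contain every item of A.
    a_positions = [i for i, t in enumerate(transactions) if a <= t]
    # Index of the m-th, 2m-th, 3m-th, ... occurrence of A.
    group_ends = a_positions[m - 1 :: m]
    if not group_ends:
        return 0.0

    hits = 0
    for end in group_ends:
        following = transactions[end + 1 : end + 1 + window]
        b_count = sum(1 for t in following if b <= t)
        if b_count >= n:
            hits += 1
    return hits / len(group_ends)


# Toy time-related database: one itemset per time step.
toy_db = [frozenset(s) for s in ("a", "a", "b", "b", "c", "a", "a", "c", "b")]
toy_conf = follow_up_confidence(
    toy_db, frozenset("a"), frozenset("b"), m=2, n=2, window=3
)
```

In the toy database, A occurs twice in two separate runs, and only the first run is followed by two occurrences of B within three transactions, so the estimate is 0.5. Note that A and B never share a transaction here, which is exactly the situation a conventional association rule cannot capture.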
Additional information
This work is partially supported by Australian large ARC grants (DP0449535, DP0559536 and DP0667060), a China NSF major research program (60496327), a China NSF grant (60463003), an Overseas Outstanding Talent Research Program of the Chinese Academy of Sciences (06S3011S01), an Overseas-Returning High-level Talent Research Program of the China Human-Resource Ministry, and an Innovation Project of Guangxi Graduate Education (2006106020812M35).
About this article
Cite this article
Zhang, S., Huang, Z., Zhang, J. et al. Mining follow-up correlation patterns from time-related databases. Knowl Inf Syst 14, 81–100 (2008). https://doi.org/10.1007/s10115-007-0086-2
DOI: https://doi.org/10.1007/s10115-007-0086-2