Mining interesting imperfectly sporadic rules

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Detecting association rules with low support but high confidence is a difficult data mining problem. To find such rules with approaches like the Apriori algorithm, minimum support must be set very low, which produces a large number of redundant rules. We are interested in sporadic rules, i.e. those that fall below a maximum support level but above the level of support expected from random coincidence. There are two types of sporadic rules: perfectly sporadic and imperfectly sporadic. Here we are mainly concerned with imperfectly sporadic rules, where the support of the antecedent as a whole falls below maximum support but individual items may have quite high support, e.g. {fever, headache, stiff neck} → meningitis. In this paper, we introduce an algorithm called Mining Interesting Imperfectly Sporadic Rules (MIISR) to find such rules efficiently. Our proposed method uses item constraints and coincidence pruning to discover these rules in reasonable time. This paper is an expanded version of Koh et al. [Advances in knowledge discovery and data mining: 10th Pacific-Asia Conference (PAKDD 2006), Singapore. Lecture Notes in Computer Science 3918, Springer, Berlin, pp 473–482].
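
To make the definition above concrete, here is a minimal sketch of a check for whether a single candidate rule is imperfectly sporadic. It is not the MIISR algorithm, which is not specified in this abstract. The function names, the toy transactions, and the thresholds are hypothetical. Two further assumptions are made: "above the level of support expected from random coincidence" is approximated by a simple independence baseline (rule support greater than the product of antecedent and consequent support), and "imperfectly" is read as at least one antecedent item having support at or above the maximum support level.

```python
from fractions import Fraction


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def is_imperfectly_sporadic(antecedent, consequent, transactions,
                            max_support, min_confidence):
    """Check one candidate rule against the definition in the abstract.

    Assumptions (not taken from the paper): the coincidence test is a plain
    independence baseline, and "imperfect" means at least one antecedent
    item is individually at or above max_support.
    """
    ante = frozenset(antecedent)
    cons = frozenset(consequent)
    s_ante = support(ante, transactions)
    s_cons = support(cons, transactions)
    s_rule = support(ante | cons, transactions)

    # The antecedent as a whole must be rare (below maximum support).
    if s_ante == 0 or s_ante >= max_support:
        return False
    # The rule must still be reliable (high confidence).
    if s_rule / s_ante < min_confidence:
        return False
    # The rule must occur more often than independence would predict.
    if s_rule <= s_ante * s_cons:
        return False
    # Imperfect: some antecedent item is individually common.
    return any(support({item}, transactions) >= max_support for item in ante)


# Hypothetical toy data: 20 transactions.
transactions = (
    [frozenset({"fever", "headache", "stiff neck", "meningitis"})] * 2
    + [frozenset({"fever", "headache"})] * 8
    + [frozenset({"fever"})] * 4
    + [frozenset({"headache"})] * 3
    + [frozenset({"cough"})] * 3
)

MAX_SUPPORT = Fraction(1, 5)
MIN_CONFIDENCE = Fraction(4, 5)

print(is_imperfectly_sporadic({"fever", "headache", "stiff neck"},
                              {"meningitis"}, transactions,
                              MAX_SUPPORT, MIN_CONFIDENCE))
print(is_imperfectly_sporadic({"stiff neck"}, {"meningitis"}, transactions,
                              MAX_SUPPORT, MIN_CONFIDENCE))
```

In this toy data, fever and headache are individually common while the three-item antecedent is rare, so the first rule passes the check. The second rule has a rare single-item antecedent, which under the reading assumed here would make it perfectly rather than imperfectly sporadic.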

References

  1. Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Buneman P, Jajodia S (eds) Proceedings of the ACM SIGMOD international conference on management of data, Washington, DC, pp 207–216

  2. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Bocca JB, Jarke M, Zaniolo C (eds) Proceedings of the 20th international conference on very large databases (VLDB 1994), Santiago de Chile, Chile, pp 487–499

  3. Bayardo RJ, Agrawal R, Gunopulos D (2000) Constraint-based rule mining in large, dense databases. Data Mining Knowl Discov 4(2/3):217–240

  4. Bonchi F, Giannotti F, Mazzanti A, Pedreschi D (2005) Efficient breadth-first mining of frequent pattern with monotone constraints. Knowl Inf Syst 8(2):131–153

  5. Bonchi F, Lucchese C (2006) On condensed representations of constrained frequent patterns. Knowl Inf Syst 9(2):180–201

  6. Bower KM (2003) When to use Fisher’s exact test. Am Soc Qual Six Sigma Forum Mag 2(4):35–37

  7. Buckland M, Gey F (1994) The relationship between recall and precision. J Am Soc Inf Sci 45(1):12–19

  8. Everitt B (1992) The analysis of contingency tables. Monographs on statistics and applied probability. Chapman and Hall, London, pp 11–36

  9. Fisher RA (1970) Statistical methods for research workers. Oliver and Boyd, Edinburgh, UK

  10. Flouvat F, Marchi FD, Petit J-M (2004) ABS: Adaptive borders search of frequent itemsets. In: Bayardo RJ Jr, Goethals B, Zaki MJ (eds) Proceedings of the IEEE ICDM workshop on frequent itemset mining implementations (FIMI 2004), Brighton, UK

  11. Koh YS, Rountree N (2005) Finding sporadic rules using apriori-inverse. In: Ho TB, Cheung D, Liu H (eds) Advances in knowledge discovery and data mining: 9th Pacific-Asia conference (PAKDD 2005), Hanoi, Vietnam. Lecture notes in computer science, vol 3518. Springer, Berlin Heidelberg New York, pp 97–106

  12. Koh YS, Rountree N, O’Keefe R (2006) Finding non-coincidental sporadic rules using apriori-inverse. Int J Data Warehousing Mining 2(2):38–54

  13. Koh YS, Rountree N, O’Keefe R (2006) Mining interesting imperfectly sporadic rules. In: Ng WK, Kitsuregawa M, Li J, Chang K (eds) Advances in knowledge discovery and data mining: 10th Pacific-Asia conference (PAKDD 2006), Singapore. Lecture notes in computer science, vol 3918. Springer, Berlin Heidelberg New York, pp 473–482

  14. Li J, Zhang X, Dong G, Ramamohanarao K, Sun Q (1999) Efficient mining of high confidence association rules without support threshold. In: Zytkow JM, Rauch J (eds) Principles of data mining and knowledge discovery: Third European conference (PKDD 1999), Prague, Czech Republic. Lecture notes in computer science, vol 1704. Springer, Berlin Heidelberg New York, pp 406–411

  15. Liu B, Hsu W, Ma Y (1999) Pruning and summarizing the discovered associations. In: Proceedings of the 5th ACM SIGKDD international conference on knowledge discovery and data mining (KDD 1999), San Diego, CA, pp 125–134

  16. Newman D, Hettich S, Blake C, Merz C (1998) UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html

  17. Ng RT, Lakshmanan LVS, Han J, Pang A (1998) Exploratory mining and pruning optimizations of constrained associations rules. In: Tiwary A, Franklin M (eds) Proceedings of the 1998 ACM SIGMOD international conference on management of data (SIGMOD 1998), Seattle, WA, pp 13–24

  18. Rahal I, Ren D, Wu W, Perrizo W (2004) Mining confident minimal rules with fixed consequents. In: Proceedings of the 16th IEEE international conference on tools with artificial intelligence (ICTAI 2004), Boca Raton, FL, pp 6–13

  19. Srikant R, Agrawal R (1995) Mining generalized association rules. In: Dayal U, Gray PMD, Nishio S (eds) Proceedings of the 21st international conference on very large data bases (VLDB 1995), Zurich, Switzerland, pp 407–419

  20. Srikant R, Vu Q, Agrawal R (1997) Mining association rules with item constraints. In: Heckerman D, Mannila H, Pregibon D, Uthurusamy R (eds) Proceedings of the third international conference on knowledge discovery and data mining (KDD 1997). AAAI Press, Menlo Park, CA, pp 67–73

  21. Uno T, Kiyomi M, Arimura H (2004) LCM ver. 2: efficient mining algorithms for frequent/closed/maximal itemsets. In: Bayardo RJ Jr, Goethals B, Zaki MJ (eds) Proceedings of the IEEE ICDM workshop on frequent itemset mining implementations (FIMI 2004), Brighton, UK

  22. Wang H, Perng C-S, Ma S, Yu PS (2005) Demand-driven frequent itemset mining using pattern structures. Knowl Inf Syst 8(1):82–102

  23. Weisstein E (2005) Fisher’s exact test. MathWorld – a Wolfram Web resource. http://mathworld.wolfram.com/FishersExactTest.html

  24. Zou Q, Chu W, Johnson D, Chiu H (2002) A pattern decomposition algorithm for data mining of frequent patterns. Knowl Inf Syst 4(4):466–482

Author information

Corresponding author

Correspondence to Yun Sing Koh.

Additional information

Yun Sing Koh is currently a Ph.D. student at the Department of Computer Science, University of Otago, New Zealand. Her main research interest is in association rule mining with particular interest in generating hard-to-find association rules and interestingness measures. She holds a B.Sc. (Honours) degree in computer science and a Master’s degree in software engineering, both from the University of Malaya, Malaysia.

Nathan Rountree has been a faculty member of the Department of Computer Science at the University of Otago, Dunedin, since 1999. His research interests are in the fields of data mining, artificial neural networks, and computer science education. He is also a consulting software engineer for Profiler Corporation, a Dunedin-based company specialising in data mining and knowledge discovery.

Richard A. O’Keefe holds a B.Sc. (Honours) degree in mathematics and physics, majoring in statistics, and an M.Sc. degree in physics (underwater acoustics), both obtained from the University of Auckland, New Zealand. He received his Ph.D. degree in artificial intelligence from the University of Edinburgh. He is the author of “The Craft of Prolog” (MIT Press). Dr. O’Keefe is now a lecturer at the University of Otago, New Zealand. His computing interests include declarative programming languages, especially Prolog and Erlang; statistical applications, including data mining and information retrieval; and applications of logic. He is also a member of the editorial board of Theory and Practice of Logic Programming.

About this article

Cite this article

Koh, Y.S., Rountree, N. & O’Keefe, R.A. Mining interesting imperfectly sporadic rules. Knowl Inf Syst 14, 179–196 (2008). https://doi.org/10.1007/s10115-007-0074-6

  • DOI: https://doi.org/10.1007/s10115-007-0074-6
