
Implications of Probabilistic Data Modeling for Mining Association Rules

  • Conference paper
From Data and Information Analysis to Knowledge Engineering

Abstract

Mining association rules is an important technique for discovering meaningful patterns in transaction databases. In the current literature, the properties of algorithms to mine association rules are discussed in great detail. We present a simple probabilistic framework for transaction data which can be used to simulate transaction data when no associations are present. We use such data and a real-world grocery database to explore the behavior of confidence and lift, two popular interest measures used for rule mining. The results show that confidence is systematically influenced by the frequency of the items in the left-hand side of rules and that lift performs poorly at filtering random noise in transaction data. The probabilistic data modeling approach presented in this paper is not only a valuable framework for analyzing interest measures but also a starting point for further research into new interest measures that are based on statistical tests and geared towards the specific properties of transaction data.
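
The idea of simulating transaction data without associations and then inspecting confidence and lift can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not spell out the model, so the sketch assumes that each item enters a transaction independently with a fixed probability. The item names, probabilities, and function names are hypothetical.

```python
import random


def simulate_transactions(item_probs, n_transactions, seed=0):
    """Draw transactions in which every item occurs independently.

    Assumption (not taken from the paper): item i is included in each
    transaction with fixed probability item_probs[i].
    """
    rng = random.Random(seed)
    return [
        frozenset(item for item, p in item_probs.items() if rng.random() < p)
        for _ in range(n_transactions)
    ]


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """supp(lhs and rhs together) / supp(lhs), an estimate of P(rhs | lhs)."""
    both = support(transactions, set(lhs) | set(rhs))
    return both / support(transactions, lhs)


def lift(transactions, lhs, rhs):
    """confidence(lhs -> rhs) / supp(rhs); close to 1 when items are independent."""
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)


# Hypothetical item probabilities: two frequent items and one rare item.
item_probs = {"milk": 0.4, "bread": 0.3, "saffron": 0.01}
transactions = simulate_transactions(item_probs, 20000, seed=42)

for lhs, rhs in [({"milk"}, {"bread"}), ({"saffron"}, {"bread"})]:
    print(
        sorted(lhs), "->", sorted(rhs),
        "confidence=%.3f" % confidence(transactions, lhs, rhs),
        "lift=%.3f" % lift(transactions, lhs, rhs),
    )
```

Because the simulated items are independent by construction, any lift value that deviates from 1 is sampling noise. Under these assumptions the estimate for a rule with a rare left-hand side rests on far fewer transactions and therefore fluctuates more, which is one way to see why a fixed lift threshold may struggle to separate noise from real associations.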

Copyright information

© 2006 Springer Berlin · Heidelberg

About this paper

Cite this paper

Hahsler, M., Hornik, K., Reutterer, T. (2006). Implications of Probabilistic Data Modeling for Mining Association Rules. In: Spiliopoulou, M., Kruse, R., Borgelt, C., Nürnberger, A., Gaul, W. (eds) From Data and Information Analysis to Knowledge Engineering. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31314-1_73