Abstract
We propose the share-confidence framework for knowledge discovery from databases, which addresses the problem of mining itemsets from market basket data. Our goal is twofold: (1) to present new itemset measures that are practical and useful alternatives to the commonly used support measure; and (2) to discover not only the buying patterns of customers but also customer profiles, by partitioning customers into distinct classes. We present a new algorithm for classifying itemsets based on characteristic attributes extracted from census or lifestyle data. The algorithm combines the Apriori algorithm for discovering association rules between items in large databases with the AOG algorithm for attribute-oriented generalization in large databases. We suggest how characterized itemsets can be generalized according to concept hierarchies associated with the characteristic attributes. Finally, we present experimental results that demonstrate the utility of the share-confidence framework.
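To illustrate why a share-style measure can differ from support, the following minimal sketch compares the two on toy data. The abstract does not define the share measures, so `quantity_share` is only an assumed, quantity-based reading of the idea, not the paper's definition; the transactions, item names, and function names are all hypothetical.

```python
# Hypothetical toy data: each transaction maps an item to the quantity purchased.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "butter": 3},
    {"milk": 4, "butter": 1, "bread": 1},
    {"milk": 2},
]


def contains(transaction, itemset):
    """True if the transaction includes every item in the itemset."""
    return all(item in transaction for item in itemset)


def support(itemset, transactions):
    """Fraction of transactions that contain the whole itemset."""
    hits = sum(1 for t in transactions if contains(t, itemset))
    return hits / len(transactions)


def quantity_share(itemset, transactions):
    """Assumed share-style measure (not taken from the paper): units of the
    itemset's items sold in transactions containing the whole itemset,
    divided by all units sold across all transactions."""
    total_units = sum(sum(t.values()) for t in transactions)
    local_units = sum(
        sum(t[item] for item in itemset)
        for t in transactions
        if contains(t, itemset)
    )
    return local_units / total_units


if __name__ == "__main__":
    for itemset in [("milk",), ("bread", "milk")]:
        print(itemset, support(itemset, transactions),
              quantity_share(itemset, transactions))
```

Under these assumptions, support counts only whether an itemset occurs in a transaction, while the share-style measure also reflects how many units were purchased, so two itemsets with equal support can have different shares.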
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hilderman, R.J., Carter, C.L., Hamilton, H.J., Cercone, N. (1998). Mining market basket data using share measures and characterized itemsets. In: Wu, X., Kotagiri, R., Korb, K.B. (eds) Research and Development in Knowledge Discovery and Data Mining. PAKDD 1998. Lecture Notes in Computer Science, vol 1394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64383-4_14
DOI: https://doi.org/10.1007/3-540-64383-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64383-8
Online ISBN: 978-3-540-69768-8
eBook Packages: Springer Book Archive