Abstract
Current approaches to market basket simulation neglect the fact that empty transactions are typically not recorded and therefore should not occur in simulated data. This paper suggests how the simulation framework without associations can be extended to avoid empty transactions and explores the possible consequences for several measures of interestingness used in association rule filtering.
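To illustrate the idea, the following is a minimal sketch, not the chapter's actual procedure. It assumes each item enters a transaction independently with a fixed probability (the "no associations" setting) and avoids empty transactions by simple rejection. The function names, the item probabilities, and the rejection approach itself are assumptions made for illustration.

```python
import random


def simulate_baskets(item_probs, n_transactions, seed=0):
    """Draw transactions with independent items, discarding empty ones.

    Assumption: rejection sampling is used here only as the simplest way
    to rule out empty transactions; the chapter may proceed differently.
    """
    if not any(p > 0 for p in item_probs):
        raise ValueError("at least one item probability must be positive")
    rng = random.Random(seed)
    baskets = []
    while len(baskets) < n_transactions:
        # Each item is included independently with its own probability.
        basket = [i for i, p in enumerate(item_probs) if rng.random() < p]
        if basket:  # keep only non-empty transactions
            baskets.append(basket)
    return baskets


def conditional_item_prob(item_probs, i):
    """P(item i in basket | basket is non-empty) under independence."""
    p_empty = 1.0
    for p in item_probs:
        p_empty *= 1.0 - p
    return item_probs[i] / (1.0 - p_empty)


# Hypothetical item probabilities, chosen only for the example.
item_probs = [0.05, 0.10, 0.20]
baskets = simulate_baskets(item_probs, 20000, seed=1)

# Observed relative item frequencies (support) versus the closed form.
observed = [
    sum(i in b for b in baskets) / len(baskets)
    for i in range(len(item_probs))
]
expected = [conditional_item_prob(item_probs, i) for i in range(len(item_probs))]
```

In this sketch, every item's support is inflated by the factor 1 / (1 - P(empty)) relative to its nominal probability. The items are also no longer exactly independent once empty transactions are excluded: the lift of any item pair equals 1 - P(empty), which is below 1. This is the kind of effect on interestingness measures that the abstract refers to, shown here only for the rejection-based variant assumed above.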
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Buchta, C. (2007). Improving the Probabilistic Modeling of Market Basket Data. In: Decker, R., Lenz, H.J. (eds) Advances in Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70981-7_47
DOI: https://doi.org/10.1007/978-3-540-70981-7_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70980-0
Online ISBN: 978-3-540-70981-7
eBook Packages: Mathematics and Statistics (R0)