Abstract
We introduce a model that differs from the well-known multivariate logit (MVL) model, which is used to analyze cross-category dependence in market baskets, by the addition of binary hidden variables. This model, called the restricted Boltzmann machine (RBM), is new to the marketing literature. Extant applications of the MVL model to larger numbers of categories typically follow a two-step approach because simultaneous maximum likelihood estimation is computationally infeasible. In contrast to the MVL model, the RBM can be estimated simultaneously by maximum likelihood even for a larger number of categories, as long as the number of hidden variables is moderate. We measure cross-category dependence by pairwise marginal cross effects, which are obtained from the estimated coefficients and sampled baskets. In the empirical study, we analyze market baskets consisting of the 60 most frequently purchased categories in a supermarket's assortment. On a validation data set, the RBM performs better than the MVL model estimated by maximum pseudo-likelihood. For our data, about 75% of the baskets are reproduced by the model without cross-category dependence, but 25% of the baskets cannot be adequately modeled if cross effects are ignored. Moreover, both the number of significant cross effects and their relationships can be grasped rather easily.
Notes
Brijs et al. (2004) analyzed four product categories by means of a multivariate Poisson mixture model. Wang et al. (2007) introduced a multivariate Poisson-lognormal model and studied cross-category effects between five product categories. Niraj et al. (2008) dichotomized the purchase quantity by distinguishing purchases of one unit from purchases of at least two units. On the basis of this dichotomization, these authors specified a two-stage bivariate logit model of purchase incidence and quantity for two product categories.
Song and Chintagunta (2007) developed an integrated model based on a utility-maximizing framework for multicategory purchase incidence, brand choice, and purchase quantity. In their empirical study, these authors analyzed four product categories.
References
Besag J (1974) Spatial interaction and the statistical analysis of lattice systems. J R Stat Soc Ser B 36:192–236
Betancourt R, Gautschi D (1990) Demand complementarities, household production, and retail assortments. Mark Sci 9:146–161
Boztuğ Y, Hildebrandt L (2008) Modeling joint purchases with a multivariate MNL approach. Schmalenbach Bus Rev 60:400–422
Boztuğ Y, Reutterer T (2008) A combined approach for segment-specific market basket analysis. European J Operational Res 187:294–312
Brijs T, Karlis D, Swinnen G, Vanhoof K, Wets G, Manchanda P (2004) A multivariate Poisson mixture model for marketing applications. Statist Neerl 58:322–348
Cameron AC, Trivedi PK (2005) Microeconometrics. Cambridge University Press, New York
Carlin BP, Louis TA (2000) Bayes and empirical Bayes methods for data analysis. Cambridge University Press, Cambridge
Casella G, George EI (1992) Explaining the Gibbs sampler. Am Stat 46:167–174
Chib S, Seetharaman PB, Strijnev A (2002) Analysis of multi-category purchase incidence decisions using IRI market basket data. In: Franses PH, Montgomery AL (eds) Econometric models in marketing. JAI, Amsterdam, p 92
Comets F (1992) On consistency of a class of estimators for exponential families of Markov random fields on the lattice. Ann Stat 20:455–468
Cox DR (1972) The analysis of multivariate binary data. J R Stat Soc Ser C 21:113–120
Duvvuri SD, Gruca TS (2010) A Bayesian multi-level factor analytic model of consumer price sensitivities across categories. Psychometrika 75:558–578
Duvvuri SD, Ansari A, Gupta S (2007) Consumers’ price sensitivities across complementary categories. Manag Sci 53:1933–1945
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:721–741
Greene WH (2003) Econometric analysis, 6th edn. Prentice Hall, Upper Saddle River
Hertz J, Krogh A, Palmer RG (1991) Introduction to the theory of neural computation. Addison-Wesley, Redwood City
Hinton GE (2002) Training products of experts by minimizing contrastive divergence. Neural Comp 14:1771–1800
Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313:504–507
Hruschka H, Lukanowicz M, Buchta C (1999) Cross-category sales promotion effects. J Retail Consumer Serv 6:99–105
Larochelle H, Bengio Y, Turian J (2010) Tractable multivariate binary density estimation and the restricted Boltzmann forest. Neural Comp 22:2285–2307
Laud PW, Ibrahim JG (1995) Predictive model selection. J R Stat Soc Ser B 57:247–262
Manchanda P, Ansari A, Gupta S (1999) The “Shopping Basket”: a model for multi-category purchase incidence decisions. Mark Sci 18:95–114
Niraj R, Padmanabhan V, Seetharaman PB (2008) A cross-category model of households’ incidence and quantity decisions. Mark Sci 27:225–235
Russell GJ, Kamakura WA (1997) Modeling multiple category brand preference with household basket data. J Retail 73:439–461
Russell GJ, Petersen A (2000) Analysis of cross category dependence in market basket selection. J Retail 76:369–392
Russell GJ, Bell D, Bodapati A, Brown CL, Chiang J, Gaeth G, Gupta S, Manchanda P (1997) Perspectives on multiple category choice. Marketing Lett 8:297–305
Russell GJ, Ratneshwar S, Shocker AD, Bell D, Bodapati A, Degeratu A, Hildebrandt L, Kim N, Ramaswami S, Shankar VH (1999) Multiple category decision-making: review and synthesis. Marketing Lett 10:319–332
Salakhutdinov RR, Hinton GE (2010) An efficient learning procedure for deep Boltzmann machines. Technical Report, Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge
Smolensky P (1986) Information processing in dynamical systems: foundations of harmony theory. In: Rumelhart DE, McClelland JL (eds) Parallel distributed processing: explorations in the microstructure of cognition, vol 1: foundations. MIT Press, Cambridge, pp 194–281
Song I, Chintagunta PK (2007) A discrete-continuous model for multicategory purchase behavior of households. J Marketing Res 44:595–612
Wang H, Kalwani MU, Akçura T (2007) A Bayesian multivariate Poisson regression model of cross-category store brand purchasing behavior. J Retail Consumer Serv 14:369–382
Wedel M, Kamakura WA (1998) Market segmentation. Kluwer, Boston
Acknowledgments
I want to thank two anonymous reviewers for their suggestions which helped me to improve this article. Thanks also go to Thomas Reutterer for his comments on an early version.
Appendices
Appendix 1: Maximum likelihood estimation of restricted Boltzmann machines
We apply the BFGS algorithm (see, e.g., Greene 2003) with 50 random restarts and select the estimated coefficients that yield the maximum log likelihood value on the estimation data across these restarts.
Initial values \(b_j, j=1,\ldots ,J\) are set equal to the ML estimates for the independence model according to expression (10). Coefficients \(W_{kj}\) (\(k=1, \ldots ,K\) and \(j=1,\ldots ,J \)) are initialized to random numbers from the normal distribution with mean zero and standard deviation \(0.5\).
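The start values for a single restart could be generated along the following lines. This is a minimal sketch, not the author's code. Expression (10) is not reproduced in this section, so the sketch assumes that the ML estimate of \(b_j\) in the independence model is the logit of the purchase share of category \(j\). The function name `init_rbm`, the clipping of shares, and the seeding scheme are illustrative assumptions.

```python
import math
import random


def init_rbm(baskets, K, sd=0.5, seed=0):
    """Return start values (b, W) for one restart.

    baskets: list of 0/1 lists, one entry per category (J categories).
    b[j]:    logit of the purchase share of category j
             (assumed form of the independence-model ML estimate).
    W[k][j]: independent draws from a normal with mean 0 and std. dev. sd.
    """
    rng = random.Random(seed)
    n, J = len(baskets), len(baskets[0])
    b = []
    for j in range(J):
        share = sum(y[j] for y in baskets) / n
        # Assumption: clip shares away from 0 and 1 so the logit stays finite.
        share = min(max(share, 1e-6), 1.0 - 1e-6)
        b.append(math.log(share / (1.0 - share)))
    W = [[rng.gauss(0.0, sd) for _ in range(J)] for _ in range(K)]
    return b, W
```

Calling `init_rbm(baskets, K, seed=r)` for `r = 0, ..., 49` would give 50 different draws of \(W_{kj}\) with identical \(b_j\), one per restart.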
The first derivative of the log likelihood (9) of an RBM with respect to any coefficient \(\theta \) is:
In the following, we derive the components of first derivatives of the coefficients. From expression (8) we obtain the unnormalized log probability of a basket:
The various derivatives of the unnormalized log probability are therefore:
We investigate \(\partial Z_\mathrm{{RBM}} / \partial \theta \) because of
We write the normalization constant as sum over \(2^K\) terms \(Z_\mathrm{{RBM}}^{(n)}\):
Because of \(\partial Z_\mathrm{{RBM}} / \partial \theta = \sum _{n=1}^{2^K} \partial Z_\mathrm{{RBM}}^{(n)} / \partial \theta \) we give the various derivatives \(\partial Z_\mathrm{{RBM}}^{(n)} / \partial \theta \) with \( h_{k}^{(n)}\) denoting the value that hidden variable \(k\) assumes in configuration \(n\):
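The derivation above can be illustrated with a small numerical sketch. Expressions (8) and (9) are not reproduced in this section, so the code assumes the common RBM form without hidden-variable constants, in which the unnormalized log probability of a basket \(y\) is \(\sum_j b_j y_j + \sum_k \log (1 + \exp (\sum_j W_{kj} y_j))\) and each term of the normalization constant is \(Z_\mathrm{{RBM}}^{(n)} = \prod_j (1 + \exp (b_j + \sum_k h_k^{(n)} W_{kj}))\). All function names are hypothetical, and the code is meant as a check of the logic rather than as the estimation routine used in the study.

```python
import itertools
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def log_pstar(y, b, W):
    """Unnormalized log probability of basket y (assumed form, see lead-in)."""
    val = sum(bj * yj for bj, yj in zip(b, y))
    for Wk in W:
        act = sum(w * yj for w, yj in zip(Wk, y))
        val += math.log1p(math.exp(act))
    return val


def z_and_grad(b, W):
    """Z and its first derivatives, summed over all 2^K hidden configurations."""
    K, J = len(W), len(b)
    Z = 0.0
    dZ_db = [0.0] * J
    dZ_dW = [[0.0] * J for _ in range(K)]
    for h in itertools.product((0, 1), repeat=K):
        # Input to each category given hidden configuration h.
        s = [b[j] + sum(h[k] * W[k][j] for k in range(K)) for j in range(J)]
        Zn = math.prod(1.0 + math.exp(sj) for sj in s)
        Z += Zn
        for j in range(J):
            t = Zn * sigmoid(s[j])          # dZ^(n)/db_j
            dZ_db[j] += t
            for k in range(K):
                dZ_dW[k][j] += h[k] * t     # dZ^(n)/dW_kj
    return Z, dZ_db, dZ_dW


def loglik_and_grad(baskets, b, W):
    """Log likelihood of all baskets and its gradient w.r.t. b and W."""
    K, J, n = len(W), len(b), len(baskets)
    Z, dZ_db, dZ_dW = z_and_grad(b, W)
    ll = -n * math.log(Z)
    # d log Z / d theta = (1 / Z) * dZ / d theta, once per basket.
    g_b = [-n * d / Z for d in dZ_db]
    g_W = [[-n * d / Z for d in row] for row in dZ_dW]
    for y in baskets:
        ll += log_pstar(y, b, W)
        for j in range(J):
            g_b[j] += y[j]
        for k in range(K):
            pk = sigmoid(sum(W[k][j] * y[j] for j in range(J)))
            for j in range(J):
                g_W[k][j] += pk * y[j]
    return ll, g_b, g_W
```

The cost of `z_and_grad` grows with \(2^K\) but only linearly with \(J\), which is why exact maximum likelihood stays feasible for many categories as long as the number of hidden variables is moderate. The returned gradient can be passed to any BFGS implementation (after a sign change if the routine minimizes).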
Appendix 2: Maximum pseudo-likelihood estimation of the multivariate logit model
We estimate the MVL model by means of the BFGS algorithm (see, e.g., Greene 2003) using first derivatives of the log pseudo-likelihood. After rewriting and simplifying we obtain the following expressions for first derivatives of the log pseudo-likelihood of any basket \(i\) with respect to constants \(a_j\) and cross-category coefficients \(V_{jl}\) from expressions (12) and (2):
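The pseudo-likelihood gradient for a single basket can be sketched as follows. Expressions (12) and (2) are not reproduced in this section, so the code assumes the usual MVL conditional probability \(P(y_j = 1 \mid y_{-j}) = 1 / (1 + \exp (-(a_j + \sum_{l \ne j} V_{jl} y_l)))\) with symmetric cross-category coefficients \(V_{jl} = V_{lj}\). The function name and the convention of storing derivatives only for \(j < l\) are illustrative choices.

```python
import math


def mvl_logpl_and_grad(y, a, V):
    """Log pseudo-likelihood of one basket y and its first derivatives.

    a: constants a_j.
    V: symmetric J x J list of lists; the diagonal is ignored.
    Returns (logpl, g_a, g_V), with g_V[j][l] filled for j < l only.
    """
    J = len(a)
    logpl = 0.0
    resid = []
    for j in range(J):
        u = a[j] + sum(V[j][l] * y[l] for l in range(J) if l != j)
        p = 1.0 / (1.0 + math.exp(-u))
        # y*log(p) + (1-y)*log(1-p) rewritten as y*u - log(1 + exp(u)).
        logpl += y[j] * u - math.log1p(math.exp(u))
        resid.append(y[j] - p)
    g_a = resid[:]
    g_V = [[0.0] * J for _ in range(J)]
    for j in range(J):
        for l in range(j + 1, J):
            # V_jl enters the conditional models of both category j and l.
            g_V[j][l] = resid[j] * y[l] + resid[l] * y[j]
    return logpl, g_a, g_V
```

Summing the three return values over all baskets gives the objective and gradient needed by a BFGS routine. No normalization constant appears, which is what makes pseudo-likelihood estimation tractable for many categories.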
Cite this article
Hruschka, H. Analyzing market baskets by restricted Boltzmann machines. OR Spectrum 36, 209–228 (2014). https://doi.org/10.1007/s00291-012-0303-6
DOI: https://doi.org/10.1007/s00291-012-0303-6