Analyzing market baskets by restricted Boltzmann machines

  • Regular Article
  • OR Spectrum

Abstract

We introduce a model that differs from the well-known multivariate logit (MVL) model, which is used to analyze cross-category dependence in market baskets, by the addition of binary hidden variables. This model, called the restricted Boltzmann machine (RBM), is new to the marketing literature. Extant applications of the MVL model to larger numbers of categories typically follow a two-step approach because simultaneous maximum likelihood estimation is computationally infeasible. In contrast to the MVL model, the RBM can be estimated simultaneously by maximum likelihood even for a larger number of categories, as long as the number of hidden variables is moderate. We measure cross-category dependence by pairwise marginal cross effects, which are obtained from the estimated coefficients and sampled baskets. In the empirical study, we analyze market baskets consisting of the 60 most frequently purchased categories of the assortment of a supermarket. On a validation data set, the RBM performs better than the MVL model estimated by maximum pseudo-likelihood. For our data, about 75% of the baskets are reproduced by the model without cross-category dependence, but 25% of the baskets cannot be adequately modeled if cross effects are ignored. Moreover, both the number of significant cross effects and their relationships can be grasped rather easily.

Notes

  1. Brijs et al. (2004) analyzed four product categories by means of a multivariate Poisson mixture model. Wang et al. (2007) introduced a multivariate Poisson-lognormal model and studied cross-category effects between five product categories. Niraj et al. (2008) dichotomized the purchase quantity by distinguishing purchases of one unit from purchases of at least two units. On the basis of this dichotomization, these authors specified a two-stage bivariate logit model of purchase incidence and quantity for two product categories.

  2. Song and Chintagunta (2007) developed an integrated model based on a utility-maximizing framework for multicategory purchase incidence, brand choice, and purchase quantity. In their empirical study, these authors analyzed four product categories.

References

  • Besag J (1974) Spatial interaction and the statistical analysis of lattice systems. J R Stat Soc Ser B 36:192–236

  • Betancourt R, Gautschi D (1990) Demand complementarities, household production, and retail assortments. Mark Sci 9:146–161

  • Boztuğ Y, Hildebrandt L (2008) Modeling joint purchases with a multivariate MNL approach. Schmalenbach Bus Rev 60:400–422

  • Boztuğ Y, Reutterer T (2008) A combined approach for segment-specific market basket analysis. European J Operational Res 187:294–312

  • Brijs T, Karlis D, Swinnen G, Vanhoof K, Wets G, Manchanda P (2004) A multivariate Poisson mixture model for marketing applications. Statist Neerl 58:322–348

  • Cameron AC, Trivedi PK (2005) Microeconometrics. Cambridge University Press, New York

  • Carlin BP, Louis TA (2000) Bayes and empirical Bayes methods for data analysis. Cambridge University Press, Cambridge

  • Casella G, George EI (1992) Explaining the Gibbs sampler. Am Stat 46:167–174

  • Chib S, Seetharaman PB, Strijnev A (2002) Analysis of multi-category purchase incidence decisions using IRI market basket data. In: Franses PH, Montgomery AL (eds) Econometric models in marketing. JAI, Amsterdam, p 92

  • Comets F (1992) On consistency of a class of estimators for exponential families of Markov random fields on the lattice. Ann Stat 20:455–468

  • Cox DR (1972) The analysis of multivariate binary data. J R Stat Soc Ser C 21:113–120

  • Duvvuri SD, Gruca TS (2010) A Bayesian multi-level factor analytic model of consumer price sensitivities across categories. Psychometrika 75:558–578

  • Duvvuri SD, Ansari A, Gupta S (2007) Consumers’ price sensitivities across complementary categories. Manag Sci 53:1933–1945

  • Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:721–741

  • Greene WH (2003) Econometric analysis, 6th edn. Prentice Hall, Upper Saddle River

  • Hertz J, Krogh A, Palmer RG (1991) Introduction to the theory of neural computation. Addison-Wesley, Redwood City

  • Hinton GE (2002) Training products of experts by minimizing contrastive divergence. Neural Comp 14:1771–1800

  • Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313:504–507

  • Hruschka H, Lukanowicz M, Buchta C (1999) Cross-category sales promotion effects. J Retail Consumer Serv 6:99–105

  • Larochelle H, Bengio Y, Turian J (2010) Tractable multivariate binary density estimation and the restricted Boltzmann forest. Neural Comp 22:2285–2307

  • Laud PW, Ibrahim JG (1995) Predictive model selection. J R Stat Soc Ser B 57:247–262

  • Manchanda P, Ansari A, Gupta S (1999) The “Shopping Basket”: a model for multi-category purchase incidence decisions. Mark Sci 18:95–114

  • Niraj R, Padmanabhan V, Seetharaman PB (2008) A cross-category model of households’ incidence and quantity decisions. Mark Sci 27:225–235

  • Russell GJ, Kamakura WA (1997) Modeling multiple category brand preference with household basket data. J Retail 73:439–461

  • Russell GJ, Petersen A (2000) Analysis of cross category dependence in market basket selection. J Retail 76:369–392

  • Russell GJ, Bell D, Bodapati A, Brown CL, Chiang J, Gaeth G, Gupta S, Manchanda P (1997) Perspectives on multiple category choice. Marketing Lett 8:297–305

  • Russell GJ, Ratneshwar S, Shocker AD, Bell D, Bodapati A, Degeratu A, Hildebrandt L, Kim N, Ramaswami S, Shankar VH (1999) Multiple category decision-making: review and synthesis. Marketing Lett 10:319–332

  • Salakhutdinov RR, Hinton GE (2010) An efficient learning procedure for deep Boltzmann machines. Technical Report, Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge

  • Smolensky P (1986) Information processing in dynamical systems: foundations of harmony theory. In: Rumelhart DE, McClelland JL (eds) Parallel distributed processing: explorations in the microstructure of cognition, vol 1: foundations. MIT Press, Cambridge, pp 194–281

  • Song I, Chintagunta PK (2007) A discrete-continuous model for multicategory purchase behavior of households. J Marketing Res 44:595–612

  • Wang H, Kalwani MU, Akçura T (2007) A Bayesian multivariate Poisson regression model of cross-category store brand purchasing behavior. J Retail Consumer Serv 14:369–382

  • Wedel M, Kamakura WA (1998) Market segmentation. Kluwer, Boston

Acknowledgments

I want to thank two anonymous reviewers for their suggestions which helped me to improve this article. Thanks also go to Thomas Reutterer for his comments on an early version.

Author information

Correspondence to Harald Hruschka.

Appendices

Appendix 1: Maximum likelihood estimation of restricted Boltzmann machines

We apply the BFGS algorithm (see, e.g., Greene 2003) with 50 random restarts and select the estimated coefficients that yield the maximum log likelihood value across these restarts for the estimation data.

Initial values \(b_j, j=1,\ldots ,J\) are set equal to the ML estimates for the independence model according to expression (10). Coefficients \(W_{kj}\) (\(k=1, \ldots ,K\) and \(j=1,\ldots ,J \)) are initialized to random numbers from the normal distribution with mean zero and standard deviation equal to \(0.5.\)
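
As an illustration of this starting-value scheme, the following Python sketch draws one set of initial coefficients and keeps the best result over several restarts. It assumes that the ML estimate of \(b_j\) in the independence model is the logit of the observed purchase share of category \(j\) (expression (10) is not reproduced here). The names `init_coefficients`, `best_of_restarts`, and the optimizer hook `fit` are hypothetical; an actual BFGS routine would be supplied as `fit`.

```python
import math
import random


def init_coefficients(baskets, K, rng, sd=0.5):
    """Return starting values (b, W) for one restart.

    baskets: list of 0/1 lists, one per basket, each of length J.
    b_j is the logit of the purchase share of category j (assumed form of
    the independence-model ML estimate); W is K x J with N(0, sd^2) entries.
    """
    I = len(baskets)
    J = len(baskets[0])
    b = []
    for j in range(J):
        share = sum(y[j] for y in baskets) / I
        # Clip shares of exactly 0 or 1 so the logit stays finite (assumption).
        share = min(max(share, 1e-6), 1.0 - 1e-6)
        b.append(math.log(share / (1.0 - share)))
    W = [[rng.gauss(0.0, sd) for _ in range(J)] for _ in range(K)]
    return b, W


def best_of_restarts(baskets, K, fit, n_restarts=50, seed=0):
    """Run `fit` from n_restarts starting points; keep the highest log likelihood.

    `fit(b, W)` is a hypothetical optimizer hook returning (log_lik, b, W).
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        b0, W0 = init_coefficients(baskets, K, rng)
        result = fit(b0, W0)
        if best is None or result[0] > best[0]:
            best = result
    return best
```

Restarting from several random \(W\) matrices guards against local maxima of the log likelihood; only the \(W_{kj}\) differ between restarts, since the \(b_j\) start from the same data-based values each time.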

The first derivative of the log likelihood (9) of an RBM with respect to any coefficient \(\theta \) is:

$$\begin{aligned} \frac{\partial \text{ LL}}{\partial \theta } = \sum _ i \frac{\partial \log p^*({\varvec{y}}_i)}{\partial \theta } - I \frac{\partial \log Z_\mathrm{{RBM}}}{\partial \theta } \end{aligned}$$
(19)

In the following, we derive the components of the first derivatives with respect to the coefficients. From expression (8) we obtain the unnormalized log probability of a basket:

$$\begin{aligned} \log p^*({\varvec{y}}_i) = {\varvec{b}}^{^{\prime }} {\varvec{y}}_i + \sum _{k=1}^{K} \log \left( 1+ \exp ({\varvec{W}}_{k ,\cdot } {\varvec{y}}_i ) \right) \end{aligned}$$
(20)
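
Expression (20) can be evaluated directly. The sketch below is a minimal Python version, assuming a basket is a list of 0/1 values, `b` a list of length \(J\), and `W` a list of \(K\) rows of length \(J\); the function names are illustrative only.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_p_star(y, b, W):
    """Unnormalized log probability of one basket y, following expression (20)."""
    visible = sum(bj * yj for bj, yj in zip(b, y))
    hidden = sum(softplus(sum(wkj * yj for wkj, yj in zip(row, y))) for row in W)
    return visible + hidden
```

The stable `softplus` form avoids overflow when \({\varvec{W}}_{k ,\cdot } {\varvec{y}}_i\) is large, which can happen during unconstrained optimization.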

The various derivatives of the unnormalized log probability are therefore:

$$\begin{aligned} \frac{\partial \log p^*({\varvec{y}}_i)}{\partial b_j}&= y_{ij} \end{aligned}$$
(21)
$$\begin{aligned} \frac{\partial \log p^*({\varvec{y}}_i)}{\partial W_{kj}}&= 1\,\big /\left(1+\exp \left(-{\varvec{W}}_{k ,\cdot } {\varvec{y}}_i \right)\right)\, y_{ij} \end{aligned}$$
(22)
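
As a rough sketch, the derivatives of the unnormalized log probability can be coded by differentiating expression (20) term by term. The function names are hypothetical, and the data layout (0/1 lists for baskets, `W` as \(K\) rows of length \(J\)) is an assumption of this example.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)), evaluated without overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def grad_log_p_star(y, b, W):
    """Return (d/db, d/dW) of the unnormalized log probability of basket y."""
    grad_b = [float(yj) for yj in y]                      # expression (21)
    grad_W = []
    for row in W:
        # Logistic function of the hidden unit's input W_{k,.} y
        act = sigmoid(sum(wkj * yj for wkj, yj in zip(row, y)))
        grad_W.append([act * yj for yj in y])             # expression (22)
    return grad_b, grad_W
```

Categories that are absent from a basket (\(y_{ij}=0\)) contribute nothing to either derivative, so only purchased categories need to be visited in practice.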

We investigate \(\partial Z_\mathrm{{RBM}} / \partial \theta \) because of

$$\begin{aligned} \frac{\partial \log Z_\mathrm{{RBM}}}{\partial \theta } = \frac{1}{Z_\mathrm{{RBM}}} \frac{\partial Z_\mathrm{{RBM}}}{\partial \theta } \end{aligned}$$
(23)

We write the normalization constant as a sum over \(2^K\) terms \(Z_\mathrm{{RBM}}^{(n)}\):

$$\begin{aligned} Z_\mathrm{{RBM}}&= \sum _{n=1}^{2^K} \prod _{j=1}^{J} \left(1+ \exp ({\varvec{W}}_{\cdot ,j }^{^{\prime }} {\varvec{h}}^{(n)} + b_j) \right) = \sum _{n=1}^{2^K} Z_\mathrm{{RBM}}^{(n)} \end{aligned}$$
(24)
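
To make the cost argument concrete, the following Python sketch computes \(\log Z_\mathrm{{RBM}}\) by enumerating the \(2^K\) hidden configurations as in expression (24), and checks it against a brute-force sum of \(p^*({\varvec{y}})\) over all \(2^J\) baskets, which is feasible only for tiny \(J\). Function names and the list-based data layout are assumptions of this example.

```python
import itertools
import math


def log_z_rbm(b, W):
    """log Z_RBM via expression (24): sum over all 2^K hidden configurations."""
    K, J = len(W), len(b)
    log_terms = []
    for h in itertools.product((0, 1), repeat=K):
        # log Z^(n) = sum_j log(1 + exp(W_{.,j}' h + b_j))
        log_zn = 0.0
        for j in range(J):
            a = b[j] + sum(W[k][j] * h[k] for k in range(K))
            log_zn += max(a, 0.0) + math.log1p(math.exp(-abs(a)))
        log_terms.append(log_zn)
    m = max(log_terms)                       # log-sum-exp for stability
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def log_z_bruteforce(b, W):
    """Same constant by summing p*(y) over all 2^J baskets (check only)."""
    total = 0.0
    for y in itertools.product((0, 1), repeat=len(b)):
        lp = sum(bj * yj for bj, yj in zip(b, y))
        lp += sum(math.log1p(math.exp(sum(w * yj for w, yj in zip(row, y))))
                  for row in W)
        total += math.exp(lp)
    return math.log(total)
```

The first routine needs \(2^K \cdot J\) evaluations and the second \(2^J \cdot K\), which is why exact maximum likelihood remains feasible for many categories as long as the number of hidden variables is moderate.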

Because \(\partial Z_\mathrm{{RBM}} / \partial \theta = \sum _{n=1}^{2^K} \partial Z_\mathrm{{RBM}}^{(n)} / \partial \theta \), we give the various derivatives \(\partial Z_\mathrm{{RBM}}^{(n)} / \partial \theta \), with \( h_{k}^{(n)}\) denoting the value that hidden variable \(k\) assumes in configuration \(n\):

$$\begin{aligned} \frac{\partial Z_\mathrm{{RBM}}^{(n)}}{\partial W_{kj}}&= 1\Big /\left(1+\exp \left(-\left({\varvec{W}}_{\cdot ,j }^{\prime } {\varvec{h}}^{(n)} + b_j\right)\right)\right) \, Z_\mathrm{{RBM}}^{(n)} h_{k}^{(n)}\end{aligned}$$
(25)
$$\begin{aligned} \frac{\partial Z_\mathrm{{RBM}}^{(n)}}{\partial b_{j}}&= 1\Big /\left(1+\exp \left(-\left({\varvec{W}}_{\cdot ,j }^{\prime } {\varvec{h}}^{(n)} + b_j\right)\right)\right) \, Z_\mathrm{{RBM}}^{(n)} \end{aligned}$$
(26)
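
Putting the pieces together, a gradient of the log likelihood in the sense of expression (19) might be assembled as in the Python sketch below. It is an illustration under stated assumptions, not the implementation used for the reported results: the function names are hypothetical, baskets are 0/1 lists, and the plain `exp` calls are not hardened against overflow for very large coefficients.

```python
import itertools
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def grad_log_z(b, W):
    """d log Z / d b_j and d log Z / d W_kj from expressions (23), (25), (26)."""
    K, J = len(W), len(b)
    z = 0.0
    dz_b = [0.0] * J
    dz_W = [[0.0] * J for _ in range(K)]
    for h in itertools.product((0, 1), repeat=K):
        acts = [b[j] + sum(W[k][j] * h[k] for k in range(K)) for j in range(J)]
        zn = math.prod(1.0 + math.exp(a) for a in acts)   # Z^(n)
        z += zn
        for j in range(J):
            s = sigmoid(acts[j]) * zn
            dz_b[j] += s                                   # expression (26)
            for k in range(K):
                dz_W[k][j] += s * h[k]                     # expression (25)
    return [d / z for d in dz_b], [[d / z for d in row] for row in dz_W]


def grad_log_lik(baskets, b, W):
    """Gradient of the log likelihood per expression (19)."""
    I = len(baskets)
    gz_b, gz_W = grad_log_z(b, W)
    # Start from the -I * d log Z / d theta term, then add the data terms.
    g_b = [-I * g for g in gz_b]
    g_W = [[-I * g for g in row] for row in gz_W]
    for y in baskets:
        for j, yj in enumerate(y):
            g_b[j] += yj
        for k, row in enumerate(W):
            act = sigmoid(sum(w * yj for w, yj in zip(row, y)))
            for j, yj in enumerate(y):
                g_W[k][j] += act * yj
    return g_b, g_W
```

The derivative of \(\log Z_\mathrm{{RBM}}\) does not depend on the data, so it is computed once per gradient evaluation and scaled by the number of baskets \(I\).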

Appendix 2: Maximum pseudo-likelihood estimation of the multivariate logit model

We estimate the MVL model by means of the BFGS algorithm (see, e.g., Greene 2003) using first derivatives of the log pseudo-likelihood. After rewriting and simplifying we obtain the following expressions for first derivatives of the log pseudo-likelihood of any basket \(i\) with respect to constants \(a_j\) and cross-category coefficients \(V_{jl}\) from expressions (12) and (2):

$$\begin{aligned} \frac{\partial \text{ LPL}_i }{\partial a_j}&= y_{ij} - p(y_{ij} | {\varvec{y}}_{i,-j}) \quad \text{ for} \ j=1, \ldots , J \end{aligned}$$
(27)
$$\begin{aligned} \frac{\partial \text{ LPL}_i }{\partial V_{jl}}&= 2 y_{ij} y_{il} - y_{il} p(y_{ij} | {\varvec{y}}_{i,-j}) - y_{ij} p(y_{il} | {\varvec{y}}_{i,-l}) \quad \text{ for} \ l>j \end{aligned}$$
(28)
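
The two derivatives can be illustrated with a short Python sketch. Since expressions (2) and (12) are not reproduced here, the code assumes the usual MVL conditional, namely that \(p(y_{ij} | {\varvec{y}}_{i,-j})\) is the probability of purchasing category \(j\), given by the logistic function of \(a_j + \sum _{l \ne j} V_{jl} y_{il}\) with symmetric \(V\). All function names are hypothetical.

```python
import math


def cond_prob(y, a, V, j):
    """Assumed MVL conditional probability that category j is purchased."""
    eta = a[j] + sum(V[j][l] * y[l] for l in range(len(y)) if l != j)
    return 1.0 / (1.0 + math.exp(-eta))


def log_pseudo_lik(y, a, V):
    """Log pseudo-likelihood of one basket: sum of conditional log probabilities."""
    total = 0.0
    for j, yj in enumerate(y):
        p = cond_prob(y, a, V, j)
        total += math.log(p) if yj else math.log(1.0 - p)
    return total


def grad_log_pseudo_lik(y, a, V):
    """Expressions (27) and (28); the V gradient is filled for l > j only."""
    J = len(y)
    p = [cond_prob(y, a, V, j) for j in range(J)]
    g_a = [y[j] - p[j] for j in range(J)]
    g_V = [[0.0] * J for _ in range(J)]
    for j in range(J):
        for l in range(j + 1, J):
            g_V[j][l] = 2 * y[j] * y[l] - y[l] * p[j] - y[j] * p[l]
    return g_a, g_V
```

Each coefficient \(V_{jl}\) enters two conditional distributions, that of category \(j\) and that of category \(l\), which is why expression (28) contains two probability terms.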

Cite this article

Hruschka, H. Analyzing market baskets by restricted Boltzmann machines. OR Spectrum 36, 209–228 (2014). https://doi.org/10.1007/s00291-012-0303-6