In most previous applications of brand choice models, possible time-varying effects in consumer behavior are ignored by merely imposing constant parameters. However, it is very likely that trends or short-term variations in consumers’ intrinsic brand utilities or sensitivities to marketing instruments occur. For example, preferences for specific brands or price elasticities in product categories such as coffee or chocolate may vary in the run-up to festive occasions like Easter or Christmas. In this paper, we employ flexible multinomial logit models for estimating time-varying effects in brand choice behavior. Time-varying brand intercepts and time-varying effects of covariates are modeled using penalized splines, a flexible, yet parsimonious, nonparametric smoothing technique. The estimation is data driven; the flexible functions, as well as the corresponding degrees of smoothness, are determined simultaneously in a unified approach. Our model further allows for alternative-specific time-varying effects of covariates and can mimic state-space approaches with random walk parameter dynamics. In an empirical application for ground coffee, we compare the performance of the proposed approach to a number of benchmark models regarding in-sample fit, information criteria, and in particular out-of-sample fit. Interestingly, the most complex P-spline model with time-varying brand intercepts and brand-specific time-varying covariate effects outperforms all other specifications both in- and out-of-sample. We further present results from a sensitivity analysis on how the number of knots and other P-spline settings affect the model performance, and we provide guidelines for the model building process about the many options for model specification using P-splines. Finally, the resulting parameter paths provide valuable insights for marketing managers.

We focus on brand choice here because most of the effects of marketing variables (e.g., price, promotion activities) on demand can be attributed to changes in secondary demand (i.e., brand switching) as opposed to primary demand (i.e., purchase incidence and purchase quantity). See in particular Gupta (1988) and Bell et al. (1999).
Please note that even though the latent utility is a linear function of the covariates, the choice probability is a nonlinear function of the latent utility in the MNL model [see Eq. (3)]. Therefore, marginal effects on response variables (e.g., demand or market share) are nonlinear functions of the covariates (see Train 2009, pp. 57–59).
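To illustrate this nonlinearity, the following minimal sketch computes standard MNL choice probabilities and the implied marginal effects of a covariate with a generic coefficient. The function names, the three brands, and all numerical values are hypothetical and are not taken from our application; the sketch also assumes a single, constant coefficient rather than the time-varying effects of the proposed model.

```python
import numpy as np

def mnl_probabilities(utilities):
    """Standard MNL choice probabilities (softmax of the deterministic utilities)."""
    v = np.asarray(utilities, dtype=float)
    expv = np.exp(v - v.max())  # subtract the maximum for numerical stability
    return expv / expv.sum()

def marginal_effects(probs, beta):
    """Matrix with entry (j, k) = dP_j / dx_k for a generic coefficient beta.

    Own effects:   beta * P_j * (1 - P_j)
    Cross effects: -beta * P_j * P_k
    """
    p = np.asarray(probs, dtype=float)
    return beta * (np.diag(p) - np.outer(p, p))

# Hypothetical values for three brands (assumptions for illustration only).
intercepts = np.array([0.5, 0.0, -0.3])
prices = np.array([4.0, 3.5, 3.0])
beta_price = -1.2

probs = mnl_probabilities(intercepts + beta_price * prices)
effects = marginal_effects(probs, beta_price)
```

Although utility is linear in price, each entry of `effects` depends on the current choice probabilities, so the same price change has a different effect at different price levels.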
Using normally distributed error terms would lead to the multinomial probit (MNP) model (Train 2009; Paetz and Steiner 2017). The probit model, if correlations are allowed, does not exhibit the IIA property. However, it lacks a closed-form solution for choice probabilities, and thus, estimation is much more complicated and time-consuming.
From our experience, \( M = 40 \) generally works well for datasets with weekly data and a time span of 52–78 weeks. For datasets with longer time windows, we suggest starting with at least \( M = 0.3 \cdot {\rm T} \) as a default value to ensure sufficient flexibility.
Note that in case of brand choice data only discrete responses are available and the continuous utility values are latent.
Note that retailers nowadays have access to similar datasets at the disaggregated level, given the rise of loyalty card programs in which individual transactions can be associated with households.
We divided the brands into two classes (brands 1, 2, and 4 versus brands 3 and 5) to test whether estimation results stay robust as compared to the full set of brands. We estimated both MNL models with constant parameters and MNL models with time-varying parameters for the different subsets of brands. Our results indicate that the estimated parameters and parameter paths are not overly sensitive to the particular set of brands, implying that the IIA property inherent to models of the MNL type is not a major concern in our application. Detailed results can be obtained from the authors upon request.
Observed prices and the use or non-use of promotions are strongly correlated. As a result, the correlations between promotions and related behavioral price terms such as reference prices, gains, and losses (see Sect. 3.2) also turn out to be substantial. To prevent multicollinearity problems, we therefore do not explicitly consider information on promotions for model estimation.
Note that purchases of multiple packs (even of the same brand) are considered in our dataset as multiple single brand choices; however, we do not account for quantity effects (in the sense that consumers may have chosen different pack sizes) in our market share calculations. Given that most coffee purchases are single-unit purchases of the most popular package size of 500 g, we argue that our market share calculations are (fairly) representative. On the other hand, we acknowledge that this is a limitation of our empirical study and that the incorporation of primary demand effects would be desirable for applications in a retailing context. We thank one reviewer for pointing this out.
Via the smoothing parameter, we can continuously vary the effective number of parameters between the total number of basis functions (smoothing parameter equal to zero) and the dimension of the polynomial that is left unpenalized by the difference order (corresponding to a total of r parameters for rth-order differences). For example, with 50 basis functions and a second-order difference penalty, the effective number of parameters can vary continuously in the range between 2 and 50. The effective number of parameters can then be determined as the trace of the product between the unpenalized Fisher information and the inverse of the penalized Fisher information; see Gray (1992) for details.
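The range stated in this example can be checked numerically with a minimal sketch of the trace formula. The function names are hypothetical, and the identity matrix used as unpenalized Fisher information is only a placeholder assumption; in an actual MNL fit, this matrix would come from the likelihood evaluated at the current estimates.

```python
import numpy as np

def difference_penalty(n_basis, order):
    """Penalty matrix K = D'D built from order-th differences of adjacent coefficients."""
    D = np.diff(np.eye(n_basis), n=order, axis=0)  # shape (n_basis - order, n_basis)
    return D.T @ D

def effective_df(fisher_unpen, penalty, lam):
    """Trace of (penalized Fisher information)^{-1} times unpenalized Fisher information."""
    fisher_pen = fisher_unpen + lam * penalty
    return np.trace(np.linalg.solve(fisher_pen, fisher_unpen))

n_basis, order = 50, 2
K = difference_penalty(n_basis, order)
F = np.eye(n_basis)  # placeholder for the unpenalized Fisher information (assumption)

edf_unpenalized = effective_df(F, K, 0.0)  # equals the number of basis functions
edf_moderate = effective_df(F, K, 1.0)     # strictly between the two limits
edf_heavy = effective_df(F, K, 1e8)        # approaches the difference order
```

With a smoothing parameter of zero the trace equals 50, and for a very large smoothing parameter it approaches 2, the dimension of the linear polynomial left unpenalized by second-order differences.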
We thank one reviewer for suggesting this model extension.
We thank one reviewer for suggesting this sensitivity analysis. The full results are available from the authors upon request.
We have further estimated corresponding model versions with time-varying brand intercepts only and with time-varying covariate effects only. However, these versions showed a worse performance (in- and out-of-sample), thus replicating the order of models as in Sect. 3.4.1. Results are available from the authors upon request.
We thank one reviewer for pointing us to this model extension.
