Decision Support
Analysis of a chance-constrained new product risk model with multiple customer classes

https://doi.org/10.1016/j.ejor.2018.07.042

Highlights

  • We consider a cost-minimization nonlinear program featuring multiplicative noise.

  • We derive a closed-form expression for the optimal solution in a special case.

  • For a related special case, we reduce the problem to a sequence of linear programs.

  • For other cases, we propose a heuristic and compare it to sample average approximation.

  • Numerical experiments illustrate how various parameters influence the optimal cost.

Abstract

We consider the non-convex problem of minimizing a linear deterministic cost objective subject to a probabilistic requirement that a nonlinear multivariate stochastic expression attain or exceed a given threshold. The stochastic expression represents the output of a noisy system featuring the product of mutually independent, uniform random parameters, each raised to a linear function of one of the decision vector’s constituent variables. We prove a connection to (i) the probability measure on the superposition of a finite collection of uncorrelated exponential random variables, and (ii) an entropy-like affine function. We then determine special cases for which the optimal solution exists in closed form, or is accessible via sequential linear programming. These special cases inspire the design of a gradient-based heuristic procedure that guarantees a feasible solution for instances failing to meet any of the special-case conditions. The application motivating our study is a consumer goods firm seeking to cost-effectively manage a certain aspect of its new product risk. We test our heuristic on a real problem and compare its overall performance to that of an asymptotically optimal Monte-Carlo-based method called sample average approximation. Numerical experimentation on synthetic problem instances sheds light on the interplay between the optimal cost and various parameters, including the probabilistic requirement and the required threshold.

Introduction

Traditional mathematical optimization models conveniently ignore the possibility of uncertainty in the parameters that influence the optimality or feasibility of the decision vectors involved. In practice, this implicit assumption of perfect knowledge often significantly diminishes the real-world fidelity of optimal solutions to such classical models. Chance-constrained optimization has thus emerged as an approach for coping with parameter uncertainty in optimization problems. The goal of a chance-constrained model is to optimize a function subject to one or more chance constraints, which demand explicit probabilistic guarantees on the satisfaction of linear or nonlinear inequalities featuring noisy parameters. That is, chance-constrained models aspire not just to optimize performance, but also to be robust in the face of uncertainty. If, in the absence of uncertainty, either the objective function or at least one of the constraint functions is nonlinear, then we have a nonlinear program, and by infecting any of its constraint parameters with randomness, we turn the optimization problem into a chance-constrained nonlinear program (CCNLP).

The discipline of chance-constrained optimization traces its origins to the pioneering papers of Charnes, Cooper, and Symonds (1958) and Charnes and Cooper (1959), both of which addressed stochastic problems in heating oil inventory planning. Over the years, the body of theoretical and computational knowledge has grown severalfold, with notable contributions including those of Prékopa (1973, 1993, 1995), Prékopa, Vizvári, and Badics (1998), Prékopa, Yoda, and Subasi (2011), Seppälä (1972), Dentcheva, Prékopa, and Ruszczyński (2000), Nemirovski and Shapiro (2006, 2007), and Luedtke, Ahmed, and Nemhauser (2010), to name a few (see Geletu, Klöppel, Zhang, & Li, 2013 for a more comprehensive survey). More recently, techniques developed to solve chance-constrained problems have become increasingly popular among engineers and decision-makers as tools for handling uncertainty in various application domains such as power systems management (Kargarian, Fu, & Wu, 2016), robotic path control (Blackmore, Ono, & Williams, 2011), network machine learning (Doppa, Yu, Tadepalli, & Getoor, 2010), portfolio optimization (Fakhar, Mahyarinia, & Zafarani, 2018), and forest disaster response (Wei, Bevers, Belval, & Bird, 2015).

However, coming up with exact algorithms for chance-constrained models is quite a challenging task, principally because (i) chance constraints are in general non-convex, and (ii) we typically lack a closed-form probability formula with which to a priori check the feasibility of a given candidate solution (Prékopa, 1995). Hence, the absence of generalizable exact methods frequently compels us to resort to heuristic alternatives such as back-mapping (Li, Arellano-Garcia, & Wozny, 2008; Wendt, Li, & Wozny, 2002; Zhang & Li, 2011), or to asymptotically optimal Monte-Carlo-based methods like sample average approximation (Luedtke & Ahmed, 2008; Pagnoncelli, Ahmed, & Shapiro, 2009). Nevertheless, special chance-constrained linear programs (such as those with Gaussian uncertainty and known covariances) do possess the desirable properties of chance-constraint convexity and easy a priori feasibility checking (Ahmed & Shapiro, 2008; Zhang, Jiang, & Shen). An interesting question, therefore, is whether it is possible for a CCNLP to be endowed with at least one of these desirable properties. The answer to that question might be useful in exploring, for instance, a (generally non-convex) CCNLP of the form:

$$z(\mathbf{x}^*) \triangleq \min_{\mathbf{x}} \sum_{n\in N} c_n x_n \tag{1}$$

subject to

$$\Pr\Bigl(1-\prod_{n\in N}(\tilde a_n)^{(k_n/a_n^+)\,x_n} \ge b\Bigr) \ge p, \tag{2}$$

$$x_n \ge l_n \quad \text{for all } n\in N, \tag{3}$$

where $b\in(0,1)$, $p\in(0,1)$, and $c_n>0$, $k_n\in(0,1]$, $a_n^+>1$, $l_n\ge 0$ are deterministic scalars for all $n\in N$, with $|N|\ge 1$ and $\sum_{n\in N}k_n=1$. The constrained probability measure (2) is our chance constraint, and the quantity $\tilde a_n \in A \triangleq \{\tilde a_1,\tilde a_2,\ldots,\tilde a_{|N|}\}$ is a continuous random variable that is uniformly distributed on the support $(0,a_n^+]$. To keep the analysis tractable, we assume mutual independence of all elements of $A$, thereby implying that $\operatorname{Cov}[\tilde a_n,\tilde a_m] \triangleq \mathrm{E}[\tilde a_n \tilde a_m]-\mathrm{E}[\tilde a_n]\mathrm{E}[\tilde a_m]=0$ for all $n\neq m$, and we justify the imposition of a continuous uniform distribution on $\tilde a_n\in A$ by the continuous uniform distribution’s maximum differential entropy property. The differential entropy of a random variable $\tilde\omega$ (distributed on a continuous domain $\Omega$) with density function $f_{\tilde\omega}(\cdot)$ is defined as $H(\tilde\omega) \triangleq -\int_{\Omega} f_{\tilde\omega}(\omega)\log f_{\tilde\omega}(\omega)\,d\omega$, and, roughly speaking, inversely relates to the amount of prior information intrinsically built into $\tilde\omega$. Being the distribution with the greatest differential entropy, the continuous uniform distribution therefore assumes the least amount of prior information when serving as a model for the uncertainty of any unknown continuous quantity with known bounds.
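To make the chance constraint (2) concrete, the following C++ fragment estimates its left-hand side by straightforward Monte Carlo simulation for a candidate $\mathbf{x}$. This is our own sanity-check sketch under hypothetical parameter values (the two-class data in main() are placeholders, not the firm’s), and it is not part of the paper’s solution machinery, which instead evaluates this probability exactly (Proposition 5).

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <random>
#include <vector>

// Monte Carlo estimate of the left-hand side of chance constraint (2),
//   Pr( 1 - prod_n (a~_n)^{(k_n / a_n^+) x_n} >= b ),
// with each a~_n drawn uniformly from (0, a_n^+]. Illustrative sketch only.
double estimate_lhs(const std::vector<double>& x, const std::vector<double>& k,
                    const std::vector<double>& a_plus, double b,
                    std::size_t samples, std::mt19937& rng) {
    std::size_t hits = 0;
    for (std::size_t s = 0; s < samples; ++s) {
        double prod = 1.0;
        for (std::size_t n = 0; n < x.size(); ++n) {
            // Endpoint behavior of the uniform draw is immaterial (measure zero).
            std::uniform_real_distribution<double> unif(0.0, a_plus[n]);
            prod *= std::pow(unif(rng), (k[n] / a_plus[n]) * x[n]);
        }
        if (1.0 - prod >= b) ++hits;
    }
    return static_cast<double>(hits) / static_cast<double>(samples);
}

int main() {
    std::mt19937 rng(42);
    // Hypothetical two-class instance (placeholder numbers, not the firm's data).
    std::vector<double> x = {30.0, 40.0}, k = {0.5, 0.5}, a_plus = {2.0, 3.0};
    const double b = 0.5, p = 0.4;
    double lhs = estimate_lhs(x, k, a_plus, b, 200000, rng);
    std::printf("Pr(desirability >= b) ~= %.4f; feasible for (2): %s\n",
                lhs, lhs >= p ? "yes" : "no");
    return 0;
}
```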

The motivation for studying Problem (1) arises out of a certain consumer goods firm’s new product risk model, in which the expression $1-\prod_{n\in N}(\tilde a_n)^{(k_n/a_n^+)\,x_n}\in[0,1)$ represents a fractional and stochastic index measure reflecting the market’s aggregate perception of the attractiveness (i.e., desirability) of a new product in the firm’s portfolio. The product is marketed to $|N|$ disparate customer classes, and its aggregate desirability in the market (measured via survey response data) is taken to depend on two sets of ingredients. The first set is stored in a decision vector of real numbers $\mathbf{x}=[x_1\; x_2\;\cdots\; x_{|N|}]$ in the non-negative orthant $\mathbb{R}_+^{|N|}$, for which $x_n\ge 0$ represents a firm-provided incentive that entices class $n\in N$ members to promote the new product on the firm’s behalf. Furthermore, the unit cost of an incentive destined for class $n$ members is $c_n>0$, and out of strategic or operational necessity, the firm is constrained by a lower bound $l_n\ge 0$ on each decision variable $x_n$. The second set of ingredients is a collection $\{k_n, a_n^+, \tilde a_n\}_{n\in N}$ of deterministic and stochastic parameters which govern how members of the various customer classes typically react to the firm’s incentives.

More concretely, the decision variable $x_n\ge 0$ models the time and effort expended by the firm to induce class $n\in N$ customers to effectively act as a surrogate arm of the firm’s salesforce, whereas $k_n$, $a_n^+$ and $\tilde a_n$ dictate the predictable and random aspects of how sensitive class $n$ customers are to $x_n$. The model $1-\prod_{n\in N}(\tilde a_n)^{(k_n/a_n^+)\,x_n}$ therefore maps an input vector $\mathbf{x}$ of costly firm-provided promotion incentives to a random (and hence risky) output fraction that measures the new product’s overall desirability in the market.

Expressions such as the posited stochastic model $1-\prod_{n\in N}(\tilde a_n)^{(k_n/a_n^+)\,x_n}$ are effectively probabilistic versions of models belonging to the family of so-called production functions extensively used in economics (Grassetti & Hunanyan, 2018; Shahbaz, Benkraiem, Miloudi, & Lahiani, 2017), agronomy (Araya, Kisekka, Gowda, & Prasad, 2018; Zhang, Feng, Ahuja, Kong, Ouyang, Adeli, et al., 2018) and engineering (Wibe, 1984). At heart, a production function describes the relationship between a valuable output and a set of controllable input factors which collectively determine the output quantity (e.g., an agricultural production function could model how a certain crop yield depends on necessary inputs like labor, capital and technology).
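For orientation (our addition, not a claim from the paper), the most familiar member of this family is the Cobb–Douglas production function,

$$Q = A\,L^{\alpha}K^{\beta}, \qquad A>0,\quad \alpha,\beta\in(0,1),$$

where output $Q$ grows monotonically, with diminishing marginal returns, in the labor and capital inputs $L$ and $K$; the desirability model above inherits exactly this multiplicative, diminishing-returns character.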

To capture economic reality, a production function must have the attributes of essentiality, monotonicity, upper semi-continuity, attainability, and quasi-concavity in every input. In what immediately follows, we show that the conditional value of $1-\prod_{n\in N}(\tilde a_n)^{(k_n/a_n^+)\,x_n}$ qualifies as a production function.

Let $\mathbf{a} \triangleq [a_1\; a_2\;\cdots\; a_{|N|}]$ be a vector of hypothesized values of the respective random parameters in $A=\{\tilde a_1,\tilde a_2,\ldots,\tilde a_{|N|}\}$, where $0<a_n\le a_n^+$ and $a_n\neq 1$ for all $n\in N$. Now, suppose that $\mathbf{0}\in\mathbb{R}^{|N|}$ is a vector of zeros and define the mapping $P_{\mathrm{Des}}:\mathbb{R}_+^{|N|}\to[0,1)$ to be $P_{\mathrm{Des}}(\mathbf{x}\,|\,\mathbf{a}) \triangleq 1-\prod_{n\in N}(a_n)^{(k_n/a_n^+)\,x_n}$. Then, it is clear that $P_{\mathrm{Des}}$ has the following properties (a numerical sanity check of these properties, in C++, appears right after the list).

  • Essentiality: $P_{\mathrm{Des}}$ vanishes in the absence of any inputs, i.e., $P_{\mathrm{Des}}(\mathbf{0}\,|\,\mathbf{a}) = 1-\prod_{n\in N}(a_n)^{(k_n/a_n^+)(0)} = 1-1^{|N|} = 1-1 = 0$.

  • Strict monotonicity (and therefore, monotonicity): $P_{\mathrm{Des}}$ is strictly increasing in all inputs, i.e., $$\frac{\partial P_{\mathrm{Des}}(\mathbf{x}\,|\,\mathbf{a})}{\partial x_n} = \frac{\partial}{\partial x_n}\Bigl(1-\prod_{m\in N}(a_m)^{(k_m/a_m^+)\,x_m}\Bigr) = -\frac{k_n}{a_n^+}\,(a_n)^{(k_n/a_n^+)\,x_n}\log(a_n)\prod_{\substack{m\in N\\ m\neq n}}(a_m)^{(k_m/a_m^+)\,x_m} > 0$$ for all $x_n\ge 0$, $n\in N$.

  • Continuity (and therefore, upper semi-continuity): $P_{\mathrm{Des}}$ is smooth over its entire domain, i.e., $$\Bigl|\frac{\partial^t P_{\mathrm{Des}}(\mathbf{x}\,|\,\mathbf{a})}{\partial x_n^t}\Bigr| = \Bigl|(-1)^{t-1}\Bigl(\frac{k_n}{a_n^+}\Bigr)^{t}(a_n)^{(k_n/a_n^+)\,x_n}\log^{t}(a_n)\prod_{\substack{m\in N\\ m\neq n}}(a_m)^{(k_m/a_m^+)\,x_m}\Bigr| > 0$$ for all $x_n\ge 0$, $n\in N$ and $t=1,2,\ldots$

  • Attainability: it is always possible to meet any value in $P_{\mathrm{Des}}$’s range by proportionally and sufficiently increasing all inputs, i.e., $$P_{\mathrm{Des}}(\mathbf{x}\,|\,\mathbf{a}) = 1-\prod_{n\in N}(a_n)^{(k_n/a_n^+)\,x_n} > 0 \;\Longrightarrow\; \lim_{\lambda\to\infty} P_{\mathrm{Des}}(\lambda\mathbf{x}\,|\,\mathbf{a}) = \lim_{\lambda\to\infty}\Bigl(1-\prod_{n\in N}(a_n)^{(k_n/a_n^+)\,\lambda x_n}\Bigr) = \sup_{\mathbf{x}\in\mathbb{R}_+^{|N|}} P_{\mathrm{Des}}(\mathbf{x}\,|\,\mathbf{a}) = 1,$$ where $\lambda>0$ and $\sup_{\mathbf{x}\in\mathbb{R}_+^{|N|}} P_{\mathrm{Des}}(\mathbf{x}\,|\,\mathbf{a}) = \lim_{x_n\to\infty}\bigl(1-\prod_{n\in N}(a_n)^{(k_n/a_n^+)\,x_n}\bigr) = 1$ for all $n\in N$.

  • Strict concavity (and therefore, quasi-concavity) in every input: $P_{\mathrm{Des}}$ obeys the law of diminishing marginal returns with respect to every input, i.e., $$\frac{\partial^2 P_{\mathrm{Des}}(\mathbf{x}\,|\,\mathbf{a})}{\partial x_n^2} = \frac{\partial^2}{\partial x_n^2}\Bigl(1-\prod_{m\in N}(a_m)^{(k_m/a_m^+)\,x_m}\Bigr) = -\Bigl(\frac{k_n}{a_n^+}\Bigr)^{2}(a_n)^{(k_n/a_n^+)\,x_n}\log^{2}(a_n)\prod_{\substack{m\in N\\ m\neq n}}(a_m)^{(k_m/a_m^+)\,x_m} < 0$$ for all $x_n\ge 0$, $n\in N$.
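As promised above, here is a small numerical sanity check of these properties in C++. It is our illustration only: the instance data are hypothetical, and we pick $a_n<1$ so that the signs claimed above hold for the sampled $\mathbf{a}$.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// P_Des(x | a) = 1 - prod_n (a_n)^{(k_n / a_n^+) x_n}
double p_des(const std::vector<double>& x, const std::vector<double>& a,
             const std::vector<double>& k, const std::vector<double>& a_plus) {
    double prod = 1.0;
    for (std::size_t n = 0; n < x.size(); ++n)
        prod *= std::pow(a[n], (k[n] / a_plus[n]) * x[n]);
    return 1.0 - prod;
}

int main() {
    // Hypothetical instance; a_n < 1 keeps the sampled P_Des increasing in x_n.
    std::vector<double> a = {0.5, 0.7}, k = {0.5, 0.5}, a_plus = {2.0, 3.0};
    std::vector<double> x = {10.0, 20.0};

    std::printf("essentiality: P_Des(0|a) = %.6f\n",
                p_des({0.0, 0.0}, a, k, a_plus));

    // Central finite differences in x_0 approximate the first two partials.
    const double h = 1e-4;
    std::vector<double> xp = x, xm = x;
    xp[0] += h; xm[0] -= h;
    const double f  = p_des(x,  a, k, a_plus);
    const double fp = p_des(xp, a, k, a_plus);
    const double fm = p_des(xm, a, k, a_plus);
    std::printf("monotonicity: dP/dx_0    = %.6f (expected > 0)\n",
                (fp - fm) / (2 * h));
    std::printf("concavity:    d2P/dx_0^2 = %.6f (expected < 0)\n",
                (fp - 2 * f + fm) / (h * h));
    return 0;
}
```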

Hence, for a given new product in the firm’s portfolio, the model $\tilde P_{\mathrm{Des}}(\mathbf{x}) \triangleq 1-\prod_{n\in N}(\tilde a_n)^{(k_n/a_n^+)\,x_n}$ is a stochastic production function that translates the firm’s incentivization strategy $\mathbf{x}$ into a fractional desirability measure for that product. In the firm’s context, the uniform random variable $\tilde a_n\in(0,a_n^+]$ (where $a_n^+>1$) is called the propensity of class $n$ because the magnitude of its logarithm represents the uncertain (i.e., noisy) component of the degree to which a typical class $n$ member will react to infinitesimal changes in the firm’s incentives (when these incentives are still low). To see how this is the case, note that $\lim_{x_n\to\infty}\Pr\bigl((\tilde a_n)^{(k_n/a_n^+)\,x_n}=0\bigr)=1$ almost surely, but for low values of $x_n$ we obtain the first-order approximation $(\tilde a_n)^{(k_n/a_n^+)\,x_n}\approx 1+(k_n/a_n^+)\,x_n\log(\tilde a_n)$, and taking the first derivative with respect to $x_n$ yields the gradient magnitude $|(k_n/a_n^+)\log(\tilde a_n)|$. The random variable $|\log(\tilde a_n)|$ is then the stochastic component of $|(k_n/a_n^+)\log(\tilde a_n)|$. In contrast, the dimensionless ratio $k_n/a_n^+>0$ (with $k_n$ being a relative weight) is called the responsivity of class $n$, and is the deterministic component of the gradient magnitude $|(k_n/a_n^+)\log(\tilde a_n)|$. Essentially, in the firm’s multiplicative sensitivity model, $k_n/a_n^+$ approximates the predictable element pertaining to how sensitive class $n$ customers are to its incentives, whereas $\log(\tilde a_n)$ is an approximate description of the probabilistic component of that sensitivity.

At this juncture, it is important to mention that the point of discussing the first-order approximation $(\tilde a_n)^{(k_n/a_n^+)\,x_n}\approx 1+(k_n/a_n^+)\,x_n\log(\tilde a_n)$ in the preceding paragraph was to introduce and explain the genesis of the terms “propensity” and “responsivity,” rather than to suggest that a first-order approximation is sufficiently powerful to capture all the model’s moving parts (in fact, $1-\prod_{n\in N}(\tilde a_n)^{(k_n/a_n^+)\,x_n}$ involves an infinite power series, and so employing a first-order approximation for each factor $(\tilde a_n)^{(k_n/a_n^+)\,x_n}$ would coarsen the model). Note that a class’s maximum propensity value $a_n^+$ (which is linearly proportional to $\sqrt{\operatorname{Var}[\tilde a_n]}=\frac{a_n^+}{2\sqrt 3}$) inversely relates to the class’s responsivity $k_n/a_n^+$. This inverse relationship models the firm’s empirical data, which suggest that greater uncertainty over the true value of $\tilde a_n$ is plausibly explainable in large part by weaker deterministic sensitivity to changes in the incentive level.
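As a quick numerical illustration with hypothetical values (not the firm’s data): a class with weight $k_n=1/9$ and maximum propensity $a_n^+=4$ has

$$r_n=\frac{k_n}{a_n^+}=\frac{1}{36}, \qquad \sqrt{\operatorname{Var}[\tilde a_n]}=\frac{a_n^+}{2\sqrt 3}=\frac{4}{2\sqrt 3}\approx 1.155,$$

so doubling $a_n^+$ to $8$ doubles the noise scale to $\approx 2.309$ while halving the responsivity to $r_n=1/72$, which is precisely the inverse relationship just described.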

The firm’s objective, for a specific new product in its portfolio, is to find the least costly feasible incentivization strategy $\mathbf{x}$ capable of ensuring that the product in question will enjoy an aggregate desirability index of at least $b\in(0,1)$ with minimum reliability (i.e., probability) $p\in(0,1)$. Equivalently, the firm’s goal is to select $\mathbf{x}$ such that the risk of violating $1-\prod_{n\in N}(\tilde a_n)^{(k_n/a_n^+)\,x_n}\ge b$ is at most $1-p$, where $p$ is a chosen probabilistic feasibility standard. To ease the notation, we introduce for class $n$’s responsivity the symbol $$r_n \triangleq \frac{k_n}{a_n^+}, \qquad r_n\in(0,1) \;\text{ since } k_n\in(0,1],\; a_n^+>1 \text{ for all } n\in N,$$ and note that $$\frac{r_n}{k_n} = \frac{1}{a_n^+} \;\Longrightarrow\; \frac{r_n}{k_n}\in(0,1) \;\text{ since } a_n^+>1 \text{ for all } n\in N.$$

We confirm that (2) is well-behaved because, for all $n\in N$ with $0\le x_n<\infty$ and $a_n^+>1$,

$$\Pr\Bigl(\frac{\partial}{\partial (c_n x_n)}\Bigl(1-\prod_{m\in N}(\tilde a_m)^{r_m x_m}\Bigr)>0\Bigr) = \Pr\Bigl(-\frac{r_n(\tilde a_n)^{r_n x_n}\log(\tilde a_n)}{c_n}\prod_{\substack{m\in N\\ m\neq n}}(\tilde a_m)^{r_m x_m}>0\Bigr)>0 \;\Longrightarrow\; \frac{\partial}{\partial (c_n x_n)}\Pr\Bigl(1-\prod_{m\in N}(\tilde a_m)^{r_m x_m}\ge b\Bigr)>0,$$

and

$$\begin{cases}\displaystyle \lim_{x_1\to 0,\ldots,x_{|N|}\to 0}\sum_{n\in N}c_n x_n = 0, & \displaystyle \lim_{x_1\to 0,\ldots,x_{|N|}\to 0}\Pr\Bigl(1-\prod_{n\in N}(\tilde a_n)^{r_n x_n}=0\Bigr)=1,\\[6pt] \displaystyle \lim_{x_1\to\infty,\ldots,x_{|N|}\to\infty}\sum_{n\in N}c_n x_n = \infty, & \displaystyle \lim_{x_1\to\infty,\ldots,x_{|N|}\to\infty}\Pr\Bigl(1-\prod_{n\in N}(\tilde a_n)^{r_n x_n}=1\Bigr)=1 \text{ a.s.}\end{cases}$$

Verbally, the probability of an arbitrary incentivization strategy x being feasible for (2) is monotonically increasing with respect to each cost component of that strategy. Also, a zero cost strategy will always violate (2) whereas an infinitely costly one will satisfy (2) almost surely. In this paper, Proposition 5 provides a closed-form expression for the left-hand side of (2), thereby showing that Problem (1) belongs to the family of chance-constrained programs with easy a priori feasibility checking. We thus give a positive answer to the question raised in the third paragraph of Section 1.1. More broadly, we offer the following four contributions to the literature:

  • Given a vector of decisions x, Proposition 1 asserts that feasibility for Problem (1) is strongly connected to the probability that the sum of mutually-uncorrelated exponential random variables (collectively dependent on x) does not exceed a certain entropy-like affine function evaluation of x.

  • Under two restrictive assumptions, we derive a closed-form expression for Problem (1)’s optimal solution (Proposition 2), and for the chance constraint’s shadow price (Corollary 1). We then weaken one of the assumptions to usher in a regime in which Problem (1) is amenable to a linear programming solution framework (Proposition 3). We show that an important parameter in that framework is the lone positive root of a certain quadratic equation (Proposition 4). To summarize, we identify special (albeit restrictive) cases in which we can quickly and exactly determine Problem (1)’s optimal solution. These special cases are valuable for two reasons. First, they show that at least for the class of problems of the form of Problem (1), there are situations in which we could do better than resorting to (asymptotically optimal) Monte-Carlo-based solution techniques. Second, they provide useful information for designing heuristic procedures aimed at pragmatically addressing Problem (1) in very general situations.

  • Building on the preceding contribution, we exploit the result in Corollary 1 to design a heuristic that covers cases outside the jurisdiction of the previously-mentioned special cases. The proposed heuristic begins by solving a special case problem that yields a “warm start” initial infeasible solution. Then, using estimated shadow price information at each iteration, the heuristic gradually works its way toward the feasible space with the goal of minimizing the journey’s cumulative cost. Using real parameter settings from a consumer goods firm, we compare our heuristic procedure to the Monte-Carlo-based sample average approximation approach of Luedtke and Ahmed (2008).

  • We artificially generate problem instances that conform to the special cases. Then, we experiment on these instances so as to validate the theoretical analysis and to better understand how the optimal cost reacts to changes in the propensity regime.

Outline. The organization of this paper is as follows. In Section 2, we analyze the problem theoretically and derive some structural properties. Section 3 then provides a special-case solution algorithm that exploits these properties. In Section 4, we propose a heuristic procedure to cover instances beyond our special-case algorithm’s reach, and we derive a mixed-integer linear programming formulation that embodies an asymptotically optimal solution technique known as sample average approximation. Section 5 experimentally verifies the findings on a real problem instance, as well as on numerous replications of randomly-generated problem instances. In Section 6, we conclude the discussion and suggest potentially fruitful extensions of our work.

Notation. Throughout this paper, lowercase boldface shall denote vectors, and the tilde ( ∼ ) superscript over a letter will indicate randomness; on its own, ∼ will be an abbreviation for the phrase “is distributed as”. For instance, $\tilde\theta_n(x_n)\sim\mathrm{Exp}\bigl(\frac{1}{r_n x_n}\bigr)$ means that the random variable $\tilde\theta_n(x_n)$ is exponentially distributed with rate parameter $\frac{1}{r_n x_n}$. We will use the lowercase variable $t$ either as a local operating index or as the variable of a Riemann integral.

Section snippets

Main results

We begin by characterizing feasibility for (2) in terms of the probability measure belonging to the superposition of a set of mutually-uncorrelated exponential random variables. To set the stage for this characterization, we associate to each incentivization strategy $\mathbf{x}$ an index set $N_{>0}(\mathbf{x})$, a set $T(\mathbf{x})$ of mutually-uncorrelated exponential random variables, and an entropy-like real-valued affine function $h(\mathbf{x})$. Their definitions follow.

Definition 1

$$N_{>0}(\mathbf{x}) \triangleq \{n\in N : x_n>0\}\subseteq N, \qquad T(\mathbf{x}) \triangleq \{\tilde\theta_n(x_n) : n\in N_{>0}(\mathbf{x})\} \quad\text{where}\quad \tilde\theta_n(x_n)\sim\mathrm{Exp}\Bigl(\frac{1}{r_n x_n}\Bigr)$$
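Proposition 1 (per the contributions listed in Section 1) connects feasibility to the event $\sum_{n\in N_{>0}(\mathbf{x})}\tilde\theta_n(x_n)\le h(\mathbf{x})$. For independent exponentials with pairwise-distinct rates $\lambda_n = 1/(r_n x_n)$, the CDF of their sum is the standard hypoexponential formula sketched below in C++. We supply this textbook formula purely for orientation; it is not the paper’s Proposition 5, and the distinct-rate assumption is ours.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// CDF of a sum of independent exponentials with pairwise-distinct rates
// (the hypoexponential distribution):
//   F(t) = 1 - sum_i [ prod_{j != i} lambda_j / (lambda_j - lambda_i) ] e^{-lambda_i t}.
// Numerically fragile when two rates nearly coincide; adequate for a sketch.
double hypoexp_cdf(const std::vector<double>& lambda, double t) {
    if (t <= 0.0) return 0.0;
    double f = 1.0;
    for (std::size_t i = 0; i < lambda.size(); ++i) {
        double coef = 1.0;
        for (std::size_t j = 0; j < lambda.size(); ++j)
            if (j != i) coef *= lambda[j] / (lambda[j] - lambda[i]);
        f -= coef * std::exp(-lambda[i] * t);
    }
    return f;
}

int main() {
    // Hypothetical rates lambda_n = 1 / (r_n x_n) for n in N_{>0}(x).
    std::vector<double> lambda = {1.0, 0.5, 0.25};
    const double h_of_x = 4.0; // placeholder for the entropy-like value h(x)
    std::printf("Pr(sum_n theta_n <= h(x)) ~ %.6f\n", hypoexp_cdf(lambda, h_of_x));
    return 0;
}
```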

An exact solution algorithm for the second special case

On the assumption that we meet the requirements of special case 2 (see Table 1), let $\{\mathbf{x}[q]\}_{0\le q\le Q}$ be a sequence of output vectors for some algorithm solving Problem (15) (i.e., $\mathbf{x}^*=\mathbf{x}[Q]$), where $Q\ge 0$. Because $F^{-1}_{\sum_{n\in N_{>0}(\mathbf{x}[q])}\tilde\theta_n(x_n[q])}(p)$ is monotonically increasing and bounded over $p\in(0,1)$, it follows that there must exist two bounded sequences $\{v_{LB}[q]\}_{0\le q\le Q}$, $\{v_{UB}[q]\}_{0\le q\le Q}$ that approach each other and guarantee the validity of $v_{LB}[q]\le F^{-1}_{\sum_{n\in N_{>0}(\mathbf{x})}\tilde\theta_n(x_n)}(p)\le v_{UB}[q]$ for all $0\le q\le Q$. However,
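The snippet cuts off here, but the bracketing sequences $v_{LB}[q]\le F^{-1}(p)\le v_{UB}[q]$ suggest a natural way to evaluate the quantile $F^{-1}(p)$ of the exponential sum numerically: bisect on $t$ against the monotone CDF from the previous sketch. The routine below (compiled together with that sketch) is our own illustration of this bracketing idea, not the paper’s algorithm.

```cpp
#include <vector>

double hypoexp_cdf(const std::vector<double>& lambda, double t); // previous sketch

// Quantile F^{-1}(p) of the exponential sum via bisection: the bracket
// [lo, hi] plays the role of the bounding sequences v_LB[q] and v_UB[q],
// shrinking toward each other around the true quantile.
double hypoexp_quantile(const std::vector<double>& lambda, double p) {
    double lo = 0.0, hi = 1.0;
    while (hypoexp_cdf(lambda, hi) < p) hi *= 2.0;   // grow until F(hi) >= p
    for (int q = 0; q < 200 && hi - lo > 1e-10; ++q) {
        const double mid = 0.5 * (lo + hi);
        (hypoexp_cdf(lambda, mid) < p ? lo : hi) = mid;
    }
    return 0.5 * (lo + hi);
}
```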

A gradient-based heuristic

Our solution techniques thus far only guarantee optimality when $p<1-\frac{r_{\underline{n}}}{k_{\underline{n}}}$. Hence, they are only practically useful for the firm as long as $1-\frac{r_{\underline{n}}}{k_{\underline{n}}}$ is above, e.g., 0.95. When $1-\frac{r_{\underline{n}}}{k_{\underline{n}}}$ is not high enough, the next best course of action is to seek low-cost feasible solutions for situations in which $p\ge 1-\frac{r_{\underline{n}}}{k_{\underline{n}}}$. We propose Algorithm 2 as a heuristic procedure whose goal is to achieve that objective.

Algorithm 2 works as follows. Choose an appropriate initial minimum reliability π and use it to
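The snippet truncates at this point. Purely for intuition, the toy loop below mimics the narrative given in the contributions list (warm start, then estimated-gain-guided steps toward the feasible region). It is emphatically not the paper’s Algorithm 2: the greedy probing rule, the step size, and all instance data are our hypothetical stand-ins.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <random>
#include <vector>

// Monte Carlo estimate of Pr(1 - prod_n a~_n^{r_n x_n} >= b), as before.
static double reliability(const std::vector<double>& x, const std::vector<double>& r,
                          const std::vector<double>& a_plus, double b,
                          std::mt19937& rng) {
    const std::size_t S = 20000;
    std::size_t hits = 0;
    for (std::size_t s = 0; s < S; ++s) {
        double prod = 1.0;
        for (std::size_t n = 0; n < x.size(); ++n) {
            std::uniform_real_distribution<double> u(0.0, a_plus[n]);
            prod *= std::pow(u(rng), r[n] * x[n]);
        }
        if (1.0 - prod >= b) ++hits;
    }
    return static_cast<double>(hits) / static_cast<double>(S);
}

int main() {
    std::mt19937 rng(7);
    // Hypothetical two-class instance (placeholder data, not the firm's).
    std::vector<double> c = {1.0, 2.0}, k = {0.5, 0.5}, a_plus = {2.0, 3.0};
    std::vector<double> r = {k[0] / a_plus[0], k[1] / a_plus[1]};
    std::vector<double> x = {0.0, 0.0};   // warm start at lower bounds l_n = 0
    const double b = 0.5, p = 0.4, step = 5.0;

    double rel = reliability(x, r, a_plus, b, rng);
    while (rel < p) {
        std::size_t best = 0;
        double best_gain = -1e30;
        for (std::size_t n = 0; n < x.size(); ++n) {
            std::vector<double> y = x;
            y[n] += step;   // probe: raise class n's incentive by one step
            double gain = (reliability(y, r, a_plus, b, rng) - rel) / (c[n] * step);
            if (gain > best_gain) { best_gain = gain; best = n; }
        }
        x[best] += step;    // take the cheapest estimated-improvement step
        rel = reliability(x, r, a_plus, b, rng);
    }
    std::printf("feasible x = (%.1f, %.1f), cost = %.2f, reliability ~= %.3f\n",
                x[0], x[1], c[0] * x[0] + c[1] * x[1], rel);
    return 0;
}
```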

Numerical experiments and discussion

Fig. 1 is a concise illustration of the ensemble of methods that we propose for tackling Problem (1).

We investigated a real problem instance for a firm with $|N|=9$ customer classes that required an aggregate product desirability index of at least $b=0.95$. The firm assigned to each class $n\in N$ an equal weight of $k_n=\frac{1}{|N|}=\frac{1}{9}$, with $c_n$, $a_n^+$, and $l_n$ listed for all $n\in N$ in Table 2.

We coded all algorithms in C++ and ran all experiments on a 3.4-gigahertz computer with 32 gigabytes of RAM. Also, we employed

Conclusion

In this paper, we adopted a chance-constrained optimization framework to address a nonlinear program plagued by multiplicative noise. Chance-constrained models are a subset of non-convex programs, which in and of themselves constitute a class of quite difficult non-traditional optimization problems. We proved the existence of a weakly restrictive special case for which it is possible to determine the optimal solution by solving a sequence of appropriately-designed linear programs. In this

Acknowledgments

The authors are grateful to both anonymous referees for their constructive and insightful comments which significantly improved the paper’s quality. This research was partially supported by the University of Mary Washington Faculty Development Fund.

References (33)

  • A. Charnes et al., Cost horizons and certainty equivalents: An approach to stochastic programming of heating oil, Management Science (1958)

  • D. Dentcheva et al., Concavity and efficient points of discrete distributions in probabilistic programming, Mathematical Programming (2000)

  • J.R. Doppa et al., Learning algorithms for link prediction based on chance constraints, Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I (2010)

  • M. Fakhar et al., On nonsmooth robust multiobjective optimization under generalized convexity with applications to portfolio optimization, European Journal of Operational Research (2018)

  • F. Grassetti et al., On the economic growth theory with Kadiyala production function, Communications in Nonlinear Science and Numerical Simulation (2018)

  • A. Kargarian et al., Chance-constrained system of systems based operation of power systems, IEEE Transactions on Power Systems (2016)