Abstract
We consider smooth stochastic programs and develop a discrete-time optimal-control problem for adaptively selecting sample sizes in a class of algorithms based on variable sample average approximations (VSAA). The control problem aims to minimize the expected computational cost to obtain a near-optimal solution of a stochastic program and is solved approximately using dynamic programming. The optimal-control problem depends on unknown parameters such as rate of convergence, computational cost per iteration, and sampling error. Hence, we implement the approach within a receding-horizon framework where parameters are estimated and the optimal-control problem is solved repeatedly during the calculations of a VSAA algorithm. The resulting sample-size selection policy consistently produces near-optimal solutions in short computing times as compared to other plausible policies in several numerical examples.
References
Alexander, S., Coleman, T.F., Li, Y.: Minimizing CVaR and VaR for a portfolio of derivatives. J. Bank. Finance 30, 583–605 (2006)
Attouch, H., Wets, R.J.-B.: Epigraphical processes: laws of large numbers for random lsc functions. In: Séminaire d'Analyse Convexe, Montpellier, pp. 13.1–13.29 (1990)
Bastin, F., Cirillo, C., Toint, P.L.: An adaptive Monte Carlo algorithm for computing mixed logit estimators. Comput. Manag. Sci. 3(1), 55–79 (2006)
Bayraksan, G., Morton, D.P.: A sequential sampling procedure for stochastic programming. Oper. Res. 59(4), 898–913 (2011)
Bertsekas, D.P.: Dynamic Programming and Optimal Control, 3rd edn. Athena Scientific, Belmont (2007)
Betts, J.T., Huffman, W.P.: Mesh refinement in direct transcription methods for optimal control. Optim. Control Appl. Methods 19, 1–21 (1998)
Billingsley, P.: Probability and Measure. Wiley, New York (1995)
Deng, G., Ferris, M.C.: Variable-number sample-path optimization. Math. Program., Ser. B 117, 81–109 (2009)
Ermoliev, Y.: Stochastic quasigradient methods. In: Ermoliev, Y., Wets, R.J.-B. (eds.) Numerical Techniques for Stochastic Optimization. Springer, New York (1988)
Gill, P.E., Hammarling, S.J., Murray, W., Saunders, M.A., Wright, M.H.: LSSOL 1.0 User's guide. Technical Report SOL-86-1, Systems Optimization Laboratory, Stanford University, Stanford, CA (1986)
Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 1.21 (2010). http://cvxr.com/cvx
He, L., Polak, E.: Effective diagonalization strategies for the solution of a class of optimal design problems. IEEE Trans. Autom. Control 35(3), 258–267 (1990)
Higle, J.L., Sen, S.: Stochastic Decomposition: a Statistical Method for Large Scale Stochastic Linear Programming. Springer, New York (1996)
Holmstrom, K.: TOMLAB optimization (2009). http://tomopt.com
Homem-de-Mello, T.: Variable-sample methods for stochastic optimization. ACM Trans. Model. Comput. Simul. 13(2), 108–133 (2003)
Homem-de-Mello, T., Shapiro, A., Spearman, M.L.: Finding optimal material release times using simulation-based optimization. Manag. Sci. 45(1), 86–102 (1999)
Hu, J., Fu, M.C., Marcus, S.I.: A model reference adaptive search method for global optimization. Oper. Res. 55(3), 549–568 (2007)
Infanger, G.: Planning Under Uncertainty: Solving Large-Scale Stochastic Linear Programs. Thomson Learning, Washington (1994)
Kall, P., Meyer, J.: Stochastic Linear Programming, Models, Theory, and Computation. Springer, Berlin (2005)
Kohn, W., Zabinsky, Z.B., Brayman, V.: Optimization of algorithmic parameters using a meta-control approach. J. Glob. Optim. 34, 293–316 (2006)
Kushner, H.J., Yin, G.G.: Stochastic Approximation and Recursive Algorithms and Applications, 2nd edn. Springer, New York (2003)
Lan, G.: Convex optimization under inexact first-order information. PhD thesis, Georgia Institute of Technology, Atlanta, GA (2009)
Linderoth, J., Shapiro, A., Wright, S.: The empirical behavior of sampling methods for stochastic programming. Ann. Oper. Res. 142, 215–241 (2006)
Mak, W.K., Morton, D.P., Wood, R.K.: Monte Carlo bounding techniques for determining solution quality in stochastic programs. Oper. Res. Lett. 24, 47–56 (1999)
Molvalioglu, O., Zabinsky, Z.B., Kohn, W.: The interacting-particle algorithm with dynamic heating and cooling. J. Glob. Optim. 43, 329–356 (2009)
Munakata, T., Nakamura, Y.: Temperature control for simulated annealing. Phys. Rev. E 64(4), 046127 (2001)
Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)
Norkin, V.I., Pflug, G.C., Ruszczynski, A.: A branch and bound method for stochastic global optimization. Math. Program. 83, 425–450 (1998)
Oppen, J., Woodruff, D.L.: Parametric models of local search progress. Int. Trans. Oper. Res. 16, 627–640 (2009)
Pasupathy, R.: On choosing parameters in retrospective-approximation algorithms for stochastic root finding and simulation optimization. Oper. Res. 58, 889–901 (2010)
Pee, E.Y., Royset, J.O.: On solving large-scale finite minimax problems using exponential smoothing. J. Optim. Theory Appl. 148(2), 390–421 (2011)
Pironneau, O., Polak, E.: Consistent approximations and approximate functions and gradients in optimal control. SIAM J. Control Optim. 41(2), 487–510 (2002)
Polak, E.: Optimization: Algorithms and Consistent Approximations. Springer, New York (1997)
Polak, E., Royset, J.O.: Efficient sample sizes in stochastic nonlinear programming. J. Comput. Appl. Math. 217, 301–310 (2008)
Polak, E., Royset, J.O., Womersley, R.S.: Algorithms with adaptive smoothing for finite minimax problems. J. Optim. Theory Appl. 119(3), 459–484 (2003)
Rockafellar, R.T., Uryasev, S.: Conditional value-at-risk for general loss distributions. J. Bank. Finance 26, 1443–1471 (2002)
Royset, J.O.: Optimality functions in stochastic programming. Math. Program. 135(1–2), 293–321 (2012)
Royset, J.O., Polak, E.: Implementable algorithm for stochastic programs using sample average approximations. J. Optim. Theory Appl. 122(1), 157–184 (2004)
Royset, J.O., Polak, E.: Extensions of stochastic optimization results from problems with simple to problems with complex failure probability functions. J. Optim. Theory Appl. 133(1), 1–18 (2007)
Royset, J.O., Polak, E.: Sample average approximations in reliability-based structural optimization: theory and applications. In: Papadrakakis, M., Tsompanakis, Y., Lagaros, N.D. (eds.) Structural Design Optimization Considering Uncertainties, pp. 307–334. Taylor & Francis, London (2008)
Royset, J.O., Polak, E., Der Kiureghian, A.: Adaptive approximations and exact penalization for the solution of generalized semi-infinite min-max problems. SIAM J. Optim. 14(1), 1–34 (2003)
Rubinstein, R.Y., Kroese, D.P.: The Cross-Entropy Method: a Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning. Springer, New York (2004)
Sastry, K., Goldberg, D.E.: Let’s get ready to rumble redux: crossover versus mutation head to head on exponentially scaled problems. In: GECCO’07: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation, pp. 1380–1387. ACM, New York (2007)
Schwartz, A., Polak, E.: Consistent approximations for optimal control problems based on Runge-Kutta integration. SIAM J. Control Optim. 34(4), 1235–1269 (1996)
Shapiro, A.: Asymptotic analysis of stochastic programs. Ann. Oper. Res. 30, 169–186 (1991)
Shapiro, A., Dentcheva, D., Ruszczynski, A.: Lectures on Stochastic Programming: Modeling and Theory. SIAM, Philadelphia (2009)
Shapiro, A., Homem-de-Mello, T.: A simulation-based approach to two-stage stochastic programming with recourse. Math. Program. 81, 301–325 (1998)
Shapiro, A., Wardi, Y.: Convergence analysis of stochastic algorithms. Math. Oper. Res. 21(3), 615–628 (1996)
Spall, J.C.: Introduction to Stochastic Search and Optimization. Wiley, New York (2003)
Verweij, B., Ahmed, S., Kleywegt, A.J., Nemhauser, G., Shapiro, A.: Sample average approximation method applied to stochastic routing problems: a computational study. Comput. Optim. Appl. 24(2–3), 289–333 (2003)
Washburn, A.R.: Search and Detection, 4th edn. INFORMS, Linthicum (2002)
Xu, H., Zhang, D.: Smooth sample average approximation of stationary points in nonsmooth stochastic optimization and applications. Math. Program. 119, 371–401 (2009)
Xu, S.: Smoothing method for minimax problems. Comput. Optim. Appl. 20, 267–279 (2001)
Acknowledgements
This study is supported by AFOSR Young Investigator grant F1ATA08337G003. The author is grateful for valuable discussions with Roberto Szechtman, Naval Postgraduate School. The author also thanks Alexander Shapiro, Georgia Institute of Technology, for assistance with two technical results.
Appendix
This appendix includes proofs of results in Sect. 4.
Proof of Proposition 2
By assumption, for any \(i=0,1,\ldots,n_{k}-1\) and \(a\in(0,1)\)
Hence, \(\phi(a) < f_{N_{k}}(x_{n_{k}}^{k})\) and
for any \(i=0,1,\ldots,n_{k}\) and \(a\in(0,1)\). Consequently, the logarithmic transformation of the data in Step 2 of Subroutine B is permissible when \(a_{j}\in(0,1)\), and the regression coefficients \(\log a_{j+1}\) and \(\log b_{j+1}\), \(j=0,1,\ldots\), are given by the standard linear least-squares regression formulae. Specifically,
Since the denominator in (57) simplifies to \((n_{k}^{3} + 3n_{k}^{2} + 2n_{k})/12\), we obtain, using the definition \(\alpha_{i} = 12(i-n_{k}/2)/(n_{k}^{3} + 3n_{k}^{2} + 2n_{k})\) from Proposition 2, that
We find that
and consequently
By definition, \(\alpha_{i} = 12(i-n_{k}/2)/(n_{k}^{3} + 3n_{k}^{2} + 2n_{k}) = -12(n_{k}-i-n_{k}/2)/(n_{k}^{3} + 3n_{k}^{2} + 2n_{k}) = -\alpha_{n_{k}-i}\). Hence,
where we use the fact that \(\alpha_{n_{k}/2} = 0\) when \(n_{k}\) is an even number. The expression for \(g(\cdot)\) then follows by combining the two products. The positivity of \(g(a_{j})\) follows trivially from (56), as \(g(a_{j})\) is a product of positive numbers. Since \(f_{N_{k}}(x_{i}^{k})>f_{N_{k}}(x_{i+1}^{k})\) for all \(i=0,1,\ldots,n_{k}-1\), we have \((f_{N_{k}}(x_{i}^{k})-\phi(a))/(f_{N_{k}}(x_{n_{k}-i}^{k})-\phi(a))<1\) for all \(i=n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\). Moreover, \(\alpha_{i}>0\) for all \(i=n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\). Hence, it follows that \(g(a)<1\). □
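To make the construction concrete, the following is a minimal Python sketch of the fixed-point map \(g\) and of the iteration in Subroutine B, reconstructed from the proof above. The intercept estimate \(\phi(\cdot)\) is defined earlier in the paper and is not reproduced here, so it is passed in as a callable; the function names and the choice \(n_{k}^{0}=\lfloor n_{k}/2\rfloor + 1\) (the first index with \(\alpha_{i}>0\)) are our assumptions.

```python
import numpy as np

def alpha_weights(n):
    """Regression weights from Proposition 2: alpha_i = 12(i - n/2)/(n^3 + 3n^2 + 2n)."""
    i = np.arange(n + 1)
    return 12.0 * (i - n / 2.0) / (n**3 + 3.0 * n**2 + 2.0 * n)

def g(a, f, phi):
    """Fixed-point map of Proposition 2, as reconstructed from the proof:
    the product over i = n0,...,n of the ratios (f_i - phi(a))/(f_{n-i} - phi(a)),
    each raised to alpha_i, where f[i] = f_{N_k}(x_i^k)."""
    f = np.asarray(f, dtype=float)
    n = len(f) - 1
    alpha = alpha_weights(n)
    c = f - phi(a)                        # centered values; positive by (56)
    idx = np.arange(n // 2 + 1, n + 1)    # indices with alpha_i > 0
    return float(np.prod((c[idx] / c[n - idx]) ** alpha[idx]))

def subroutine_b(f, phi, a0=0.5, tol=1e-10, max_iter=10_000):
    """Iterate a_{j+1} = g(a_j) until the change falls below tol; Theorem 4
    below shows the iterates are monotone and convergent for a0 in (0,1)."""
    a = a0
    for _ in range(max_iter):
        a_next = g(a, f, phi)
        if abs(a_next - a) <= tol:
            return a_next
        a = a_next
    return a
```

The resulting fixed point serves as the estimate of the rate-of-convergence coefficient; Theorems 4 and 6 below address convergence of the iterates.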
Proof of Theorem 4
We first show that the derivative \(dg(a)/da\) exists and is positive on (0,1). For any \(a\in(0,1)\) and \(i = n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\), let \(h_{i}:(0,1)\to\mathbb{R}\) be defined by
By (56), \(h_{i}(a)>0\) for any \(a\in(0,1)\) and \(i = n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\), and consequently
By straightforward differentiation we obtain that
where
Since \(f_{N_{k}}(x_{n_{k}}^{k})-f_{N_{k}}(x_{i}^{k})<0\) for all \(i=0,1,\ldots,n_{k}-1\) by assumption, it follows that \(d\phi(a)/da<0\) for all \(a\in(0,1)\). Again by assumption, \(f_{N_{k}}(x_{i}^{k})-f_{N_{k}}(x_{n_{k}-i}^{k})<0\) for all \(i = n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\). Hence, by (56) and the fact that \(\alpha_{i}>0\) for all \(i = n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\), we conclude that \(dg(a)/da>0\) for any \(a\in(0,1)\).
Since \(\{a_{j}\}_{j=0}^{\infty}\) is contained in the compact set [0,1] by Proposition 2, there exists a subsequence \(\{a_{j}\}_{j\in J}\), with \(J\subset\mathbb{N}\), and an \(a^{*}\in[0,1]\) such that \(a_{j}\to^{J} a^{*}\) as \(j\to\infty\). By the mean value theorem, for every \(j=0,1,2,\ldots\) there exists an \(s_{j}\in[0,1]\) such that
Since \(dg(a_{j}+s_{j}(a_{j+1}-a_{j}))/da>0\), it follows that \(\{a_{j}\}_{j=0}^{\infty}\) generated by Subroutine B initialized with \(a_{0}\in(0,1)\) is either strictly increasing or strictly decreasing. That is, if \(a_{0}<a_{1}\), then \(a_{j}<a_{j+1}\) for all \(j\in\mathbb{N}\), and if \(a_{0}>a_{1}\), then \(a_{j}>a_{j+1}\) for all \(j\in\mathbb{N}\). Hence, \(a_{j}\to a^{*}\) as \(j\to\infty\). Similarly, \(g(a_{j})\in(0,1)\) by Proposition 2, and there must exist a subsequence of \(\{g(a_{j})\}_{j=0}^{\infty}\) that converges to a point \(g^{*}\in[0,1]\). Since \(a_{j+1}=g(a_{j})\), \(\{g(a_{j})\}_{j=0}^{\infty}\) is either strictly increasing or strictly decreasing, and therefore \(g(a_{j})\to g^{*}\) as \(j\to\infty\). Since \(a_{j+1}=g(a_{j})\) for all \(j\in\mathbb{N}\), \(a_{j}\to a^{*}\), and \(g(a_{j})\to g^{*}\) as \(j\to\infty\), we have that \(a^{*}=g^{*}\). By continuity of \(g(\cdot)\) on (0,1), if \(a^{*}\in(0,1)\), then \(g(a_{j})\to g(a^{*})\) as \(j\to\infty\), and hence \(g(a^{*})=g^{*}=a^{*}\). If \(a^{*}=0\), then \(g^{*}=0\); if \(a^{*}=1\), then \(g^{*}=1\). Since by definition \(g(0)=0\) and \(g(1)=1\), it follows that \(a^{*}=g(a^{*})\) in these two cases too. The finite termination of Subroutine B follows directly from the fact that \(\{a_{j}\}_{j=0}^{\infty}\) converges. □
Proof of Theorem 5
Since \(f_{N_{k}}(x_{i}^{k})= f_{N_{k}}^{*} + (\theta_{N_{k}})^{i}(f_{N_{k}}(x_{0}^{k}) - f_{N_{k}}^{*})\) for all \(i=0,1,2,\ldots\),
It then follows by (60) that
Since \(\sum_{i=0}^{n_{k}}i^{2} = (n_{k}+1)(n_{k}^{2}/3 + n_{k}/6)\),
Since \(\sum_{i=0}^{n_{k}}\alpha_{i}=0\) by (59), the conclusion follows. □
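For the record, both summation facts invoked in this proof can be verified directly:

\[
\sum_{i=0}^{n_{k}} i^{2} = \frac{n_{k}(n_{k}+1)(2n_{k}+1)}{6}
= (n_{k}+1)\biggl(\frac{n_{k}^{2}}{3}+\frac{n_{k}}{6}\biggr),
\qquad
\sum_{i=0}^{n_{k}} \alpha_{i} \propto \sum_{i=0}^{n_{k}}\biggl(i-\frac{n_{k}}{2}\biggr)
= \frac{n_{k}(n_{k}+1)}{2} - (n_{k}+1)\frac{n_{k}}{2} = 0.
\]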
Proof of Theorem 6
By Theorem 5 and (66), \(\theta_{N_{k}} = g(\theta_{N_{k}})\) and \(\phi(\theta_{N_{k}})=f_{N_{k}}^{*}\). Consequently, it follows from (63) and the assumption of exact linear rate that
Since
we obtain that
If \(n_{k}\) is even, then using (59) we find that
If \(n_{k}\) is odd, then using (59) we find that
Since \(1/(n_{k}+1)<(n_{k}+1)/(n_{k}(n_{k}+2))\), as \((n_{k}+1)^{2}=n_{k}(n_{k}+2)+1>n_{k}(n_{k}+2)\),
for all \(n_{k}=2,3,\ldots\).
The first multiplicative term in (69) decomposes as follows:
Since \(1/(1-\theta_{N_{k}}^{i'})\leq 1/(1-\theta_{N_{k}})\) for any \(i'\in\mathbb{N}\), we obtain from (69), using (70) and (71), that
Using the mean of the geometric distribution (equivalently, by differentiating the geometric series \(\sum_{i=0}^{\infty}\theta^{i}=1/(1-\theta)\) and multiplying by \(\theta\)), we deduce that \(\sum_{i = 1}^{\infty}i\theta_{N_{k}}^{i} = \theta_{N_{k}}/(1-\theta_{N_{k}})^{2}\). Hence,
Consequently, if \(n_{k} > (1+\sqrt{1+72\beta})/6\), where \(\beta= \theta_{N_{k}}/(1-\theta_{N_{k}})^{3}\), then the right-hand side in (73) is less than one. Hence, for \(n_{k} > (1+\sqrt{1+72\beta})/6\), \(dg(\theta_{N_{k}})/da<1\). Since \(\theta_{N_{k}}/(1-\theta_{N_{k}})^{3}\leq 0.99/(1-0.99)^{3}\) for all \(\theta_{N_{k}} \in[0, 0.99]\), it follows that when \(n_{k}\geq 1408\), \(dg(\theta_{N_{k}})/da<1\) for any \(\theta_{N_{k}} \in[0, 0.99]\). It then follows by the fixed-point theorem that, under the assumption that \(n_{k}\geq 1408\), \(a_{j}\to\theta_{N_{k}}\) as \(j\to\infty\) whenever \(a_{0}\) is sufficiently close to \(\theta_{N_{k}}\).
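As a quick numerical check of the stated threshold, a minimal Python computation (the constant follows directly from the bound above):

```python
import math

# Worst case of beta = theta/(1 - theta)^3 over theta in [0, 0.99]
beta = 0.99 / (1 - 0.99) ** 3              # = 990,000
threshold = (1 + math.sqrt(1 + 72 * beta)) / 6
print(threshold)                           # approx. 1407.3, so n_k >= 1408 suffices
```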
It appears difficult to examine \(dg(\theta_{N_{k}})/da\) analytically for \(2<n_{k}<1408\). However, we show that \(dg(\theta_{N_{k}})/da<1\) for all \(\theta_{N_{k}} \in(0, 0.99]\) and \(2\leq n_{k}<1408\) using the following numerical scheme. (The case \(n_{k}=2\) is easily checked analytically, but we omit the details for brevity.) We consider the function \(\gamma:[0,1)\to\mathbb{R}\) defined for any \(\theta\in[0,1)\) by
Obviously, for \(\theta_{N_{k}}\in(0,1)\), \(\gamma(\theta_{N_{k}})=dg(\theta_{N_{k}})/da\). Straightforward differentiation yields that
Hence, for any \(\theta_{\max}\in(0,1)\),
for all \(\theta\in(0,\theta_{\max}]\). Consequently, \(\gamma(\cdot)\) is Lipschitz continuous on \((0,\theta_{\max}]\) with Lipschitz constant \(L\). Hence, it suffices to check \(dg(\theta_{N_{k}})/da\) for \(n_{k}\in\{2,3,\ldots,1407\}\) and a finite number of values of \(\theta_{N_{k}}\) to verify that \(dg(\theta_{N_{k}})/da<1\) for all \(\theta_{N_{k}}\in(0,\theta_{\max}]\). Let \(\tilde{\theta}_{1}, \tilde{\theta}_{2}, \ldots, \tilde{\theta}_{\tilde{k}}\) be these values, computed recursively starting with \(\tilde{\theta}_{1} = 0\) and then by \(\tilde{\theta}_{k+1} = \tilde{\theta}_{k} + (1-\gamma(\tilde{\theta}_{k}))/L\), \(k=1,2,\ldots\), until a value no smaller than \(\theta_{\max}\) is obtained. Let \(\theta_{\max}=0.99\). Since we find that \(\gamma(\tilde{\theta}_{k}) < 1\) for all \(k\) in this case, the Lipschitz continuity of \(\gamma(\cdot)\) on \((0,0.99]\) with constant \(L\) implies that \(\gamma(\theta)<1\) for all \(\theta\in[0,0.99]\). Hence, \(dg(\theta_{N_{k}})/da<1\) for all \(\theta_{N_{k}} \in(0,0.99]\) and \(n_{k}=2,3,\ldots,1407\). The conclusion then follows by the fixed-point theorem. □
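The verification scheme admits a compact implementation. Below is a minimal Python sketch under the assumptions above: `gamma` evaluates \(\gamma(\cdot)\) for a fixed \(n_{k}\) (its formula is not reproduced in this appendix), `L` is the corresponding Lipschitz constant, and the function name is illustrative.

```python
def certify_gamma_below_one(gamma, L, theta_max=0.99):
    """Certify gamma(theta) < 1 on [0, theta_max] for an L-Lipschitz gamma.

    If gamma(t) < 1, then gamma(s) <= gamma(t) + L*(s - t) < 1 for all
    s in [t, t + (1 - gamma(t))/L), which is exactly the step rule
    theta_{k+1} = theta_k + (1 - gamma(theta_k))/L used in the proof.
    """
    t = 0.0
    while t < theta_max:
        g_t = gamma(t)
        if g_t >= 1.0:
            return False              # certificate fails at this grid point
        t += (1.0 - g_t) / L          # largest step the Lipschitz bound covers
    return gamma(theta_max) < 1.0     # also check the right endpoint
```

In the proof, this check is repeated once for each \(n_{k}\in\{2,3,\ldots,1407\}\).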
Proof of Proposition 4
By Proposition 1, \(N^{1/2}(f_{N}^{*} - f^{*})\Rightarrow \mathcal{N}(0, \sigma^{2}(x^{*}))\) as \(N\to\infty\). Let \(\{N_{l}(S_{k})\}_{S_{k}=1}^{\infty}\), \(l=1,2,\ldots,k\), be such that \(N_{l}(S_{k})\in\mathbb{N}\) for all \(S_{k}\in\mathbb{N}\) and \(l=1,2,\ldots,k\), \(\sum_{l=1}^{k} N_{l}(S_{k}) = S_{k}\), and \(N_{l}(S_{k})/S_{k}\to\beta_{l}\in[0,1]\) as \(S_{k}\to\infty\). Consequently, \(\sum_{l=1}^{k} \beta_{l} = 1\). By Slutsky's theorem (see, e.g., Exercise 25.7 of [7]), it then follows that for all \(l=1,2,\ldots,k\),
as \(S_{k}\to\infty\).
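To see how Slutsky's theorem enters, note that for any stage \(l\) with \(\beta_{l}>0\), the displayed limit can be reconstructed as follows (a reconstruction consistent with Proposition 1, not the paper's typeset display):

\[
S_{k}^{1/2}\bigl(f_{N_{l}(S_{k})}^{*} - f^{*}\bigr)
= \biggl(\frac{S_{k}}{N_{l}(S_{k})}\biggr)^{1/2} N_{l}(S_{k})^{1/2}\bigl(f_{N_{l}(S_{k})}^{*} - f^{*}\bigr)
\Rightarrow \mathcal{N}\bigl(0, \sigma^{2}(x^{*})/\beta_{l}\bigr),
\]

since \(S_{k}/N_{l}(S_{k})\to 1/\beta_{l}\).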
Since the sequences \(\{f_{N_{l}}(x_{i}^{l})\}_{i=0}^{n_{l}}\), l=1,2,…,k, converge exactly linearly with coefficient \(\hat{\theta}_{l+1}\), it follows that the minimization in (43) can be ignored and \(\hat{m}_{l} = f_{N_{l}}^{*}\), l=1,2,…,k. Using the recursive formula for \(\hat{f}_{k+1}^{*}\) in Step 2 of Subroutine C, we find that \(\hat{f}_{k+1}^{*} = \sum_{l=1}^{k} N_{l}(S_{k}) f_{N_{l}}^{*}/S_{k}\). Consequently,
It then follows by the continuous mapping theorem and the independence of samples across stages that
as \(S_{k}\to\infty\). The conclusion then follows from the fact that \(\sum_{l=1}^{k} \beta_{l} = 1\). □
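For intuition, the closed form \(\hat{f}_{k+1}^{*} = \sum_{l=1}^{k} N_{l}(S_{k}) f_{N_{l}}^{*}/S_{k}\) is what a sample-size-weighted running average produces. A minimal Python sketch (the exact recursive formula in Step 2 of Subroutine C is not reproduced here, so this update rule is an assumption):

```python
def update_fstar(fstar_prev, S_prev, m_l, N_l):
    """Sample-size-weighted running average: one plausible form of the
    recursive update in Step 2 of Subroutine C. Unrolling the recursion
    gives fstar = sum_l N_l * m_l / S_k."""
    S = S_prev + N_l
    return (S_prev * fstar_prev + N_l * m_l) / S, S

# Usage sketch with illustrative stage estimates m_l and sample sizes N_l:
fstar, S = 0.0, 0
for m_l, N_l in [(1.30, 100), (1.22, 200), (1.18, 400)]:
    fstar, S = update_fstar(fstar, S, m_l, N_l)
print(fstar, S)   # weighted average of the m_l, total sample size S_k
```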
Cite this article
Royset, J.O. On sample size control in sample average approximations for solving smooth stochastic programs. Comput Optim Appl 55, 265–309 (2013). https://doi.org/10.1007/s10589-012-9528-1