On sample size control in sample average approximations for solving smooth stochastic programs

Abstract

We consider smooth stochastic programs and develop a discrete-time optimal-control problem for adaptively selecting sample sizes in a class of algorithms based on variable sample average approximations (VSAA). The control problem aims to minimize the expected computational cost of obtaining a near-optimal solution of a stochastic program and is solved approximately using dynamic programming. The optimal-control problem depends on unknown parameters such as the rate of convergence, the computational cost per iteration, and the sampling error. Hence, we implement the approach within a receding-horizon framework in which these parameters are estimated and the optimal-control problem is re-solved repeatedly during the calculations of a VSAA algorithm. In several numerical examples, the resulting sample-size selection policy consistently produces near-optimal solutions in shorter computing times than other plausible policies.

References

  1. Alexander, S., Coleman, T.F., Li, Y.: Minimizing CVaR and VaR for a portfolio of derivatives. J. Bank. Finance 30, 583–605 (2006)

  2. Attouch, H., Wets, R.J.-B.: Epigraphical processes: laws of large numbers for random lsc functions. In: Séminaire d'Analyse Convexe, Montpellier, pp. 13.1–13.29 (1990)

  3. Bastin, F., Cirillo, C., Toint, P.L.: An adaptive Monte Carlo algorithm for computing mixed logit estimators. Comput. Manag. Sci. 3(1), 55–79 (2006)

  4. Bayraksan, G., Morton, D.P.: A sequential sampling procedure for stochastic programming. Oper. Res. 59(4), 898–913 (2011)

  5. Bertsekas, D.P.: Dynamic Programming and Optimal Control, 3rd edn. Athena Scientific, Belmont (2007)

  6. Betts, J.T., Huffman, W.P.: Mesh refinement in direct transcription methods for optimal control. Optim. Control Appl. 19, 1–21 (1998)

  7. Billingsley, P.: Probability and Measure. Wiley, New York (1995)

  8. Deng, G., Ferris, M.C.: Variable-number sample-path optimization. Math. Program., Ser. B 117, 81–109 (2009)

  9. Ermoliev, Y.: Stochastic quasigradient methods. In: Ermoliev, Y., Wets, R.J.-B. (eds.) Numerical Techniques for Stochastic Optimization. Springer, New York (1988)

  10. Gill, P.E., Hammarling, S.J., Murray, W., Saunders, M.A., Wright, M.H.: LSSOL 1.0 User’s guide. Technical Report SOL-86-1, System Optimization Laboratory, Stanford University, Stanford, CA (1986)

  11. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 1.21 (2010). http://cvxr.com/cvx

  12. He, L., Polak, E.: Effective diagonalization strategies for the solution of a class of optimal design problems. IEEE Trans. Autom. Control 35(3), 258–267 (1990)

  13. Higle, J.L., Sen, S.: Stochastic Decomposition: a Statistical Method for Large Scale Stochastic Linear Programming. Springer, New York (1996)

  14. Holmstrom, K.: Tomlab optimization (2009). http://tomopt.com

  15. Homem-de-Mello, T.: Variable-sample methods for stochastic optimization. ACM Trans. Model. Comput. Simul. 13(2), 108–133 (2003)

  16. Homem-de-Mello, T., Shapiro, A., Spearman, M.L.: Finding optimal material release times using simulation-based optimization. Manag. Sci. 45(1), 86–102 (1999)

  17. Hu, J., Fu, M.C., Marcus, S.I.: A model reference adaptive search method for global optimization. Oper. Res. 55(3), 549–568 (2007)

  18. Infanger, G.: Planning Under Uncertainty: Solving Large-Scale Stochastic Linear Programs. Thomson Learning, Washington (1994)

  19. Kall, P., Meyer, J.: Stochastic Linear Programming, Models, Theory, and Computation. Springer, Berlin (2005)

  20. Kohn, W., Zabinsky, Z.B., Brayman, V.: Optimization of algorithmic parameters using a meta-control approach. J. Glob. Optim. 34, 293–316 (2006)

  21. Kushner, H.J., Yin, G.G.: Stochastic Approximation and Recursive Algorithms and Applications, 2nd edn. Springer, New York (2003)

  22. Lan, G.: Convex optimization under inexact first-order information. PhD thesis, Georgia Institute of Technology, Atlanta, GA (2009)

  23. Linderoth, J., Shapiro, A., Wright, S.: The empirical behavior of sampling methods for stochastic programming. Ann. Oper. Res. 142, 215–241 (2006)

  24. Mak, W.K., Morton, D.P., Wood, R.K.: Monte Carlo bounding techniques for determining solution quality in stochastic programs. Oper. Res. Lett. 24, 47–56 (1999)

  25. Molvalioglu, O., Zabinsky, Z.B., Kohn, W.: The interacting-particle algorithm with dynamic heating and cooling. J. Glob. Optim. 43, 329–356 (2009)

  26. Munakata, T., Nakamura, Y.: Temperature control for simulated annealing. Phys. Rev. E 64(4), 046127 (2001)

  27. Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)

  28. Norkin, V.I., Pflug, G.C., Ruszczynski, A.: A branch and bound method for stochastic global optimization. Math. Program. 83, 425–450 (1998)

  29. Oppen, J., Woodruff, D.L.: Parametric models of local search progress. Int. Trans. Oper. Res. 16, 627–640 (2009)

  30. Pasupathy, R.: On choosing parameters in retrospective-approximation algorithms for stochastic root finding and simulation optimization. Oper. Res. 58, 889–901 (2010)

  31. Pee, E.Y., Royset, J.O.: On solving large-scale finite minimax problems using exponential smoothing. J. Optim. Theory Appl. 148(2), 390–421 (2011)

  32. Pironneau, O., Polak, E.: Consistent approximations and approximate functions and gradients in optimal control. SIAM J. Control Optim. 41(2), 487–510 (2002)

  33. Polak, E.: Optimization: Algorithms and Consistent Approximations. Springer, New York (1997)

  34. Polak, E., Royset, J.O.: Efficient sample sizes in stochastic nonlinear programming. J. Comput. Appl. Math. 217, 301–310 (2008)

  35. Polak, E., Royset, J.O., Womersley, R.S.: Algorithms with adaptive smoothing for finite minimax problems. J. Optim. Theory Appl. 119(3), 459–484 (2003)

  36. Rockafellar, R.T., Uryasev, S.: Conditional value-at-risk for general loss distributions. J. Bank. Finance 26, 1443–1471 (2002)

  37. Royset, J.O.: Optimality functions in stochastic programming. Math. Program. 135(1–2), 293–321 (2012)

  38. Royset, J.O., Polak, E.: Implementable algorithm for stochastic programs using sample average approximations. J. Optim. Theory Appl. 122(1), 157–184 (2004)

  39. Royset, J.O., Polak, E.: Extensions of stochastic optimization results from problems with simple to problems with complex failure probability functions. J. Optim. Theory Appl. 133(1), 1–18 (2007)

  40. Royset, J.O., Polak, E.: Sample average approximations in reliability-based structural optimization: theory and applications. In: Papadrakakis, M., Tsompanakis, Y., Lagaros, N.D. (eds.) Structural Design Optimization Considering Uncertainties, pp. 307–334. Taylor & Francis, London (2008)

  41. Royset, J.O., Polak, E., Der Kiureghian, A.: Adaptive approximations and exact penalization for the solution of generalized semi-infinite min-max problems. SIAM J. Optim. 14(1), 1–34 (2003)

  42. Rubinstein, R.Y., Kroese, D.P.: The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning. Springer, New York (2004)

  43. Sastry, K., Goldberg, D.E.: Let’s get ready to rumble redux: crossover versus mutation head to head on exponentially scaled problems. In: GECCO’07: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation, pp. 1380–1387. ACM, New York (2007)

  44. Schwartz, A., Polak, E.: Consistent approximations for optimal control problems based on Runge-Kutta integration. SIAM J. Control Optim. 34(4), 1235–1269 (1996)

  45. Shapiro, A.: Asymptotic analysis of stochastic programs. Ann. Oper. Res. 30, 169–186 (1991)

  46. Shapiro, A., Dentcheva, D., Ruszczynski, A.: Lectures on Stochastic Programming: Modeling and Theory. SIAM, Philadelphia (2009)

  47. Shapiro, A., Homem-de-Mello, T.: A simulation-based approach to two-stage stochastic programming with recourse. Math. Program. 81, 301–325 (1998)

  48. Shapiro, A., Wardi, Y.: Convergence analysis of stochastic algorithms. Math. Oper. Res. 21(3), 615–628 (1996)

  49. Spall, J.C.: Introduction to Stochastic Search and Optimization. Wiley, New York (2003)

  50. Verweij, B., Ahmed, S., Kleywegt, A.J., Nemhauser, G., Shapiro, A.: Sample average approximation method applied to stochastic routing problems: a computational study. Comput. Optim. Appl. 24(2–3), 289–333 (2003)

  51. Washburn, A.R.: Search and Detection, 4th edn. INFORMS, Linthicum (2002)

  52. Xu, H., Zhang, D.: Smooth sample average approximation of stationary points in nonsmooth stochastic optimization and applications. Math. Program. 119, 371–401 (2009)

  53. Xu, S.: Smoothing method for minimax problems. Comput. Optim. Appl. 20, 267–279 (2001)

Acknowledgements

This study is supported by AFOSR Young Investigator grant F1ATA08337G003. The author is grateful for valuable discussions with Roberto Szechtman, Naval Postgraduate School. The author also thanks Alexander Shapiro, Georgia Institute of Technology, for assistance with two technical results.

Author information

Correspondence to Johannes O. Royset.

Appendix

This appendix includes proofs of results in Sect. 4.

Proof of Proposition 2

By assumption, for any i=0,1,…,n_k−1 and a∈(0,1)

$$ \frac{f_{N_k}(x_{n_k}^k) - a^{n_k-i}f_{N_k}(x_{i}^k)}{1-a^{n_k-i}} < \frac{f_{N_k}(x_{n_k}^k) - a^{n_k-i}f_{N_k}(x_{n_k}^k)}{1-a^{n_k-i}} = f_{N_k}\bigl(x_{n_k}^k \bigr). $$
(55)

Hence, \(\phi(a) < f_{N_{k}}(x_{n_{k}}^{k})\) and

$$ f_{N_k}\bigl(x_i^k\bigr)- \phi(a)>0 $$
(56)

for any i=0,1,…,n_k and a∈(0,1). Consequently, the logarithmic transformation of the data in Step 2 of Subroutine B is permissible when a_j∈(0,1), and the regression coefficients log a_{j+1} and log b_{j+1}, j=0,1,…, are given by the standard linear least-squares regression formulae. Specifically,

$$ \log a_{j+1} = \frac{\sum_{i=0}^{n_k} (i-n_k/2) \bigl[\log\bigl(f_{N_k}(x_i^k)-\phi(a_j)\bigr) - \frac{1}{n_k+1}\sum_{i'=0}^{n_k}\log\bigl(f_{N_k}(x_{i'}^k)-\phi(a_j)\bigr)\bigr]}{\sum_{i=0}^{n_k} (i-n_k/2)^2}. $$
(57)

Since the denominator in (57) simplifies to \((n_{k}^{3} + 3n_{k}^{2} + 2n_{k})/12\), we obtain using the definition of \(\alpha_{i} = 12(i-n_{k}/2)/(n_{k}^{3} + 3n_{k}^{2} + 2n_{k})\) in Proposition 2 that

$$ \log a_{j+1} = \sum_{i=0}^{n_k} \alpha_i \Bigl[\log\bigl(f_{N_k}\bigl(x_i^k\bigr)-\phi(a_j)\bigr) - \frac{1}{n_k+1}\sum_{i'=0}^{n_k}\log\bigl(f_{N_k}\bigl(x_{i'}^k\bigr)-\phi(a_j)\bigr)\Bigr]. $$
(58)
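As a side check (a verification added here, not part of the original proof), the stated simplification of the denominator in (57) follows from

$$ \sum_{i=0}^{n_k}\Bigl(i-\frac{n_k}{2}\Bigr)^2 = \frac{n_k(n_k+1)(2n_k+1)}{6} - \frac{n_k^2(n_k+1)}{2} + \frac{(n_k+1)n_k^2}{4} = \frac{n_k(n_k+1)(n_k+2)}{12} = \frac{n_k^3+3n_k^2+2n_k}{12}. $$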

We find that

$$ \sum_{i=0}^{n_k} \alpha_i = 0 $$
(59)

and consequently

$$ a_{j+1} = \prod_{i=0}^{n_k} \bigl(f_{N_k}\bigl(x_i^k\bigr) - \phi(a_j)\bigr)^{\alpha_i}. $$
(60)

By definition, \(\alpha_{i} = 12(i-n_{k}/2)/(n_{k}^{3} + 3n_{k}^{2} + 2n_{k}) = -12(n_{k}-i-n_{k}/2)/(n_{k}^{3} + 3n_{k}^{2} + 2n_{k}) = -\alpha_{n_{k}-i}\). Hence,

$$ a_{j+1} = \prod_{i=0}^{n_k} \bigl(f_{N_k}\bigl(x_i^k\bigr) - \phi(a_j)\bigr)^{\alpha_i} = \prod_{i=n_k^0}^{n_k} \bigl(f_{N_k}\bigl(x_i^k\bigr) - \phi(a_j)\bigr)^{\alpha_i} \prod_{i=n_k^0}^{n_k} \bigl(f_{N_k}\bigl(x_{n_k-i}^k\bigr) - \phi(a_j)\bigr)^{-\alpha_i}, $$

where we use the fact that \(\alpha_{n_{k}/2} = 0\) when n_k is an even number. The expression for g(⋅) then follows by combining the two products. The positivity of g(a_j) follows trivially from (56), as g(a_j) is a product of positive numbers. Since \(f_{N_{k}}(x_{i}^{k})>f_{N_{k}}(x_{i+1}^{k})\) for all i=0,1,…,n_k−1, \((f_{N_{k}}(x_{i}^{k})-\phi(a))/(f_{N_{k}}(x_{n_{k}-i}^{k})-\phi(a))<1\) for all \(i=n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\). Moreover, α_i>0 for all \(i=n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\). Hence, g(a)<1. □
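To make the fixed-point map in (60) concrete, here is a small numerical sketch (added for illustration; it is not part of the original paper). It builds synthetic function values that decay exactly linearly, f_i = f^* + θ^i(f_0 − f^*), forms φ and g as above with the sample-size index suppressed, and iterates a_{j+1} = g(a_j). By Theorem 5 below, θ is then a fixed point of g, so the iterates should settle near θ when started reasonably close to it; the values θ = 0.8, f^* = 1, f_0 = 5, n_k = 20 and a_0 = 0.7 are arbitrary choices.

```python
import numpy as np

# Synthetic, exactly linearly decaying "function values": f_i = f* + theta^i (f_0 - f*).
# Illustration only; in the algorithm these values come from the SAA iterates x_i^k.
theta, f_star, f0, n_k = 0.8, 1.0, 5.0, 20
f = f_star + theta ** np.arange(n_k + 1) * (f0 - f_star)

def phi(a):
    # phi(a) as defined before (55): average of the extrapolated limits.
    i = np.arange(n_k)
    return np.mean((f[n_k] - a ** (n_k - i) * f[i]) / (1.0 - a ** (n_k - i)))

def g(a):
    # The fixed-point map (60): a_{j+1} = prod_i (f_i - phi(a_j))^{alpha_i}.
    i = np.arange(n_k + 1)
    alpha = 12.0 * (i - n_k / 2.0) / (n_k ** 3 + 3 * n_k ** 2 + 2 * n_k)
    return np.prod((f - phi(a)) ** alpha)

a = 0.7   # a_0 in (0, 1); Theorem 6 only guarantees convergence locally around theta
for j in range(100):
    a = g(a)
print(a)  # should be close to theta = 0.8
```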

Proof of Theorem 4

We first show that the derivative dg(a)/da exists and is positive on (0,1). For any a∈(0,1) and \(i = n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\), let h_i:(0,1)→ℝ be defined by

$$ h_i(a) := \frac{f_{N_k}(x_i^k)-\phi(a)}{f_{N_k}(x_{n_k-i}^k)-\phi(a)}. $$
(61)

By (56), h_i(a)>0 for any a∈(0,1) and \(i = n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\) and consequently

$$ g(a) = \exp \Biggl(\sum_{i=n_k^0}^{n_k} \alpha_i\log h_i(a) \Biggr). $$
(62)

By straightforward derivation we obtain that

$$ \frac{dg(a)}{da} = g(a)\sum_{i=n_k^0}^{n_k} \alpha_i \frac{\bigl(f_{N_k}\bigl(x_i^k\bigr) - f_{N_k}\bigl(x_{n_k-i}^k\bigr)\bigr)\,d\phi(a)/da}{\bigl(f_{N_k}\bigl(x_i^k\bigr)-\phi(a)\bigr)\bigl(f_{N_k}\bigl(x_{n_k-i}^k\bigr)-\phi(a)\bigr)}, $$
(63)

where

$$ \frac{d\phi(a)}{da} = \frac{1}{n_k}\sum _{i=0}^{n_k-1} \frac {(n_k-i)a^{n_k-i-1}(f_{N_k}(x_{n_k}^k)-f_{N_k}(x_{i}^k))}{(1-a^{n_k-i})^2}. $$
(64)

Since \(f_{N_{k}}(x_{n_{k}}^{k})-f_{N_{k}}(x_{i}^{k})<0\) for all i=0,1,…,n_k−1 by assumption, it follows that dφ(a)/da<0 for all a∈(0,1). Again, by assumption, \(f_{N_{k}}(x_{i}^{k})-f_{N_{k}}(x_{n_{k}-i}^{k})<0\) for all \(i = n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\). Hence, by (56) and the fact that α_i>0 for all \(i = n_{k}^{0}, n_{k}^{0}+1, \ldots, n_{k}\), we conclude that dg(a)/da>0 for any a∈(0,1).

Since \(\{a_{j}\}_{j=0}^{\infty}\) is contained in the compact set [0,1] by Proposition 2, it follows that there exists a subsequence \(\{a_{j}\}_{j\in J}\), with J⊂ℕ, and an a^*∈[0,1] such that a_j→a^*, as j→∞, j∈J. By the mean value theorem, we obtain that for every j=0,1,2,…, there exists an s_j∈[0,1] such that

$$ a_{j+2} - a_{j+1} = g(a_{j+1}) - g(a_j) = \frac{dg(a_j + s_j(a_{j+1}-a_j))}{da}(a_{j+1}-a_j). $$
(65)

Since dg(a_j+s_j(a_{j+1}−a_j))/da>0, it follows that \(\{a_{j}\}_{j=0}^{\infty}\) generated by Subroutine B initialized with a_0∈(0,1) is either strictly increasing or strictly decreasing. That is, if a_0<a_1, then a_j<a_{j+1} for all j∈ℕ. If a_0>a_1, then a_j>a_{j+1} for all j∈ℕ. Hence, a_j→a^* as j→∞. Similarly, g(a_j)∈(0,1) by Proposition 2 and there must exist a convergent subsequence of \(\{g(a_{j})\}_{j=0}^{\infty}\) that converges to a point g^*∈[0,1]. Since a_{j+1}=g(a_j), \(\{g(a_{j})\}_{j=0}^{\infty}\) is either strictly increasing or strictly decreasing and therefore g(a_j)→g^*, as j→∞. Since a_{j+1}=g(a_j) for all j∈ℕ and a_j→a^* and g(a_j)→g^*, as j→∞, we have that a^*=g^*. By continuity of g(⋅) on (0,1), if a^*∈(0,1), then g(a_j)→g(a^*), as j→∞. Hence, g(a^*)=g^*=a^*. If a^*=0, then g^*=0. If a^*=1, then g^*=1. Since by definition g(0)=0 and g(1)=1, it follows that a^*=g(a^*) in these two cases too. The finite termination of Subroutine B follows directly from the fact that \(\{a_{j}\}_{j=0}^{\infty}\) converges. □

Proof of Theorem 5

Since \(f_{N_{k}}(x_{i}^{k})= f_{N_{k}}^{*} + (\theta_{N_{k}})^{i}(f_{N_{k}}(x_{0}^{k}) - f_{N_{k}}^{*})\) for all i=0,1,2,…,

$$ \phi(\theta_{N_k}) = \frac{1}{n_k}\sum _{i=0}^{n_k-1} \frac {f_{N_k}(x_{n_k}^k) - (\theta_{N_k})^{n_k-i}f_{N_k}(x_{i}^k)}{1-(\theta_{N_k})^{n_k-i}} = f_{N_k}^*. $$
(66)

It then follows by (60) that

$$ g(\theta_{N_k}) = \prod_{i=0}^{n_k} \bigl(f_{N_k}\bigl(x_i^k\bigr) - \phi(\theta_{N_k})\bigr)^{\alpha_i} = \prod_{i=0}^{n_k} \bigl((\theta_{N_k})^i\bigl(f_{N_k}\bigl(x_0^k\bigr) - f_{N_k}^*\bigr)\bigr)^{\alpha_i} = (\theta_{N_k})^{\sum_{i=0}^{n_k}\alpha_i i}\bigl(f_{N_k}\bigl(x_0^k\bigr) - f_{N_k}^*\bigr)^{\sum_{i=0}^{n_k}\alpha_i}. $$
Since \(\sum_{i=0}^{n_{k}}i^{2} = (n_{k}+1)(n_{k}^{2}/3 + n_{k}/6)\),

$$ \sum_{i=0}^{n_k}\alpha_i i = \frac{12}{n_k^3+3n_k^2+2n_k}\sum_{i=0}^{n_k}\Bigl(i-\frac{n_k}{2}\Bigr)i = \frac{12}{n_k^3+3n_k^2+2n_k}\Bigl[(n_k+1)\Bigl(\frac{n_k^2}{3}+\frac{n_k}{6}\Bigr)-\frac{n_k}{2}\,\frac{n_k(n_k+1)}{2}\Bigr] = 1. $$
Since \(\sum_{i=0}^{n_{k}}\alpha_{i}=0\) by (59), the conclusion follows. □

Proof of Theorem 6

By Theorem 5 and (66), \(\theta_{N_{k}} = g(\theta_{N_{k}})\) and \(\phi(\theta_{N_{k}})=f_{N_{k}}^{*}\). Consequently, it follows from (63) and the assumption of exact linear rate that

$$ \frac{dg(\theta_{N_k})}{da} = \theta_{N_k}\sum _{i=n_k^0}^{n_k} \alpha_i\frac{(\theta_{N_k}^i - \theta_{N_k}^{n_k-i})d\phi(\theta_{N_k})/da}{\theta_{N_k}^{n_k}(f_{N_k}(x_0^k)-f_{N_k}^*)}. $$
(67)

Since

$$ \frac{d\phi(\theta_{N_k})}{da} = \frac{1}{n_k}\sum_{i=0}^{n_k-1} \frac{(n_k-i)(\theta_{N_k})^{n_k-i-1}\bigl((\theta_{N_k})^{n_k}-(\theta_{N_k})^{i}\bigr)\bigl(f_{N_k}\bigl(x_0^k\bigr)-f_{N_k}^*\bigr)}{(1-(\theta_{N_k})^{n_k-i})^2} = -\frac{(\theta_{N_k})^{n_k-1}\bigl(f_{N_k}\bigl(x_0^k\bigr)-f_{N_k}^*\bigr)}{n_k}\sum_{i'=1}^{n_k}\frac{i'}{1-(\theta_{N_k})^{i'}}, $$
(68)

we obtain that

$$ \frac{dg(\theta_{N_k})}{da} = \Biggl(\frac{1}{n_k}\sum_{i'=1}^{n_k}\frac{i'}{1-(\theta_{N_k})^{i'}}\Biggr)\sum_{i=n_k^0}^{n_k}\alpha_i\bigl((\theta_{N_k})^{n_k-i}-(\theta_{N_k})^{i}\bigr). $$
(69)

If n_k is even, then using (59) we find that

$$ \sum_{i=n_k^0}^{n_k}\alpha_i = \frac{12}{n_k^3+3n_k^2+2n_k}\,\frac{n_k(n_k+2)}{8} = \frac{3}{2}\,\frac{1}{n_k+1}. $$

If n_k is odd, then using (59) we find that

$$ \sum_{i=n_k^0}^{n_k}\alpha_i = \frac{12}{n_k^3+3n_k^2+2n_k}\,\frac{(n_k+1)^2}{8} = \frac{3}{2}\,\frac{n_k+1}{n_k(n_k+2)}. $$

Since 1/(n_k+1)<(n_k+1)/(n_k(n_k+2)),

$$ \sum_{i=n_k^0}^{n_k}\alpha_i\bigl((\theta_{N_k})^{n_k-i}-(\theta_{N_k})^{i}\bigr) \leq \sum_{i=n_k^0}^{n_k}\alpha_i \leq \frac{3}{2}\,\frac{n_k+1}{n_k(n_k+2)} $$
(70)

for all n_k=2,3,… .

The first multiplicative term in (69) decomposes as follows:

$$ \frac{1}{n_k}\sum_{i'=1}^{n_k}\frac{i'}{1-(\theta_{N_k})^{i'}} = \frac{1}{n_k}\sum_{i'=1}^{n_k} i' + \frac{1}{n_k}\sum_{i'=1}^{n_k}\frac{i'(\theta_{N_k})^{i'}}{1-(\theta_{N_k})^{i'}} = \frac{n_k+1}{2} + \frac{1}{n_k}\sum_{i'=1}^{n_k}\frac{i'(\theta_{N_k})^{i'}}{1-(\theta_{N_k})^{i'}}. $$
(71)

Since \(1/(1-\theta_{N_{k}}^{i'})\leq1/(1-\theta_{N_{k}})\) for any i′∈ℕ, we obtain from (69) using (70) and (71) that

$$ \frac{dg(\theta_{N_k})}{da} \leq \Biggl( \frac{n_k+1}{2} + \frac {1}{n_k}\frac{1}{1-\theta_{N_k}}\sum_{i'=1}^{\infty} i'\theta_{N_k}^{i'} \Biggr) \frac{3}{2} \frac{n_k+1}{n_k(n_k+2)}. $$
(72)

Using the mean of the geometric distribution, we deduce that \(\sum_{i = 1}^{\infty}i\theta_{N_{k}}^{i} = \theta_{N_{k}}/(1-\theta_{N_{k}})^{2}\). Hence,

$$ \frac{dg(\theta_{N_k})}{da} \leq \Biggl(\frac{n_k+1}{2} + \frac{1}{n_k}\frac{\theta_{N_k}}{(1-\theta_{N_k})^3}\Biggr)\frac{3}{2}\frac{n_k+1}{n_k(n_k+2)}. $$
(73)

Consequently, if \(n_{k} > (1+ \sqrt{1+72\beta})/6\), where \(\beta= \theta_{N_{k}}/(1-\theta_{N_{k}})^{3}\), then the right-hand side in (73) is less than one. Hence, for \(n_{k} > (1+ \sqrt{1+72\beta})/6\), \(dg(\theta_{N_{k}})/da<1\). Since \(\theta_{N_{k}}/(1-\theta_{N_{k}})^{3}\leq0.99/(1-0.99)^{3}\) for all \(\theta_{N_{k}} \in[0, 0.99]\), it follows that when n_k≥1408, \(dg(\theta_{N_{k}})/da<1\) for any \(\theta_{N_{k}} \in[0, 0.99]\). It then follows by the fixed-point theorem that, under the assumption that n_k≥1408, \(a_{j}\to\theta_{N_{k}}\), as j→∞, whenever a_0 is sufficiently close to \(\theta_{N_{k}}\).

It appears difficult to examine \(dg(\theta_{N_{k}})/da\) analytically for 2<n_k<1408. However, we show that \(dg(\theta_{N_{k}})/da<1\) for all \(\theta_{N_{k}} \in(0, 0.99]\) and 2≤n_k<1408 using the following numerical scheme. (We note that the case with n_k=2 is easily checked analytically, but we do not show that for brevity.) We consider the function γ:[0,1)→ℝ defined for any θ∈[0,1) by

$$ \gamma(\theta) := \Biggl( \frac{1}{n_k}\sum_{i'=1}^{n_k} \frac {i'}{1-\theta^{i'}} \Biggr)\sum_{i=n_k^0}^{n_k} \alpha_i\bigl(\theta^{n_k-i}-\theta^i\bigr). $$
(74)

Obviously, for \(\theta_{N_{k}}\in(0,1)\), \(\gamma(\theta_{N_{k}})=dg(\theta_{N_{k}})/da\). Straightforward derivation yields that

$$ \frac{d\gamma(\theta)}{d\theta} = \sum_{i'=1}^{n_k}\sum_{i=n_k^0}^{n_k} \frac{\alpha_i i'}{n_k}\, \frac{\bigl((n_k-i)\theta^{n_k-i-1} - i\theta^{i-1}\bigr)\bigl(1-\theta^{i'}\bigr) + \bigl(\theta^{n_k-i}-\theta^{i}\bigr)i'\theta^{i'-1}}{(1-\theta^{i'})^2}. $$
(75)

Hence, for any θ_max∈(0,1),

$$ \frac{d\gamma(\theta)}{d\theta} \leq L := \sum_{i'=1}^{n_k} \sum_{i=n_k^0}^{n_k} \frac{\alpha_i i'}{n_k} \frac{(n_k-i)\theta_{\max }^{n_k-i-1} + i'\theta_{\max}^{n_k-i}\theta_{\max}^{i'-1}}{(1-\theta_{\max}^{i'})^2} $$
(76)

for all θ∈(0,θ_max]. Consequently, γ(⋅) is Lipschitz continuous on (0,θ_max] with Lipschitz constant L. Hence, it suffices to check \(dg(\theta_{N_{k}})/da\) for n_k∈{2,3,…,1407} and a finite number of values for \(\theta_{N_{k}}\) to verify that \(dg(\theta_{N_{k}})/da<1\) for all \(\theta_{N_{k}}\in(0,\theta_{\max}]\). Let \(\tilde{\theta}_{1}\), \(\tilde{\theta}_{2}\), …, \(\tilde{\theta}_{\tilde{k}}\) be these values, which are computed recursively starting with \(\tilde{\theta}_{1} = 0\) and then by \(\tilde{\theta}_{k+1} = \tilde{\theta}_{k} + (1-\gamma(\tilde{\theta}_{k}))/L\), k=1,2,…, until a value no smaller than θ_max is obtained. Let θ_max=0.99. Since we find that \(\gamma(\tilde{\theta}_{k}) < 1\) for all k in this case, it follows from the fact that γ(⋅) is Lipschitz continuous on (0,0.99] with Lipschitz constant L that γ(θ)<1 for all θ∈[0,0.99]. Hence, \(dg(\theta_{N_{k}})/da<1\) for all \(\theta_{N_{k}} \in(0,0.99]\) and n_k=2,3,…,1407. The conclusion then follows by the fixed-point theorem. □
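The verification scheme just described can be sketched in a few lines of Python (an illustrative reconstruction added here, not the code used for the paper). The sketch assumes that n_k^0 is the smallest integer exceeding n_k/2, checks a single value of n_k, and uses θ_max = 0.9 instead of 0.99 so that the Lipschitz constant L, and hence the number of grid points, stays small enough for a quick run.

```python
import numpy as np

def gamma(theta, n):
    # gamma(theta) from (74); n plays the role of n_k and n0 of n_k^0,
    # assumed here to be the smallest integer exceeding n/2.
    n0 = n // 2 + 1
    i = np.arange(n0, n + 1)
    alpha = 12.0 * (i - n / 2.0) / (n**3 + 3 * n**2 + 2 * n)
    ip = np.arange(1, n + 1)                       # i' = 1, ..., n
    first = np.mean(ip / (1.0 - theta**ip))        # (1/n) sum_{i'} i'/(1 - theta^{i'})
    return first * np.sum(alpha * (theta**(n - i) - theta**i))

def lipschitz_bound(theta_max, n):
    # The constant L from (76): a bound on d gamma / d theta on (0, theta_max].
    n0 = n // 2 + 1
    L = 0.0
    for ip in range(1, n + 1):
        for i in range(n0, n + 1):
            alpha_i = 12.0 * (i - n / 2.0) / (n**3 + 3 * n**2 + 2 * n)
            num = (n - i) * theta_max**(n - i - 1) + ip * theta_max**(n - i + ip - 1)
            L += alpha_i * ip / n * num / (1.0 - theta_max**ip) ** 2
    return L

def verify(n, theta_max):
    # Grid check: theta_{k+1} = theta_k + (1 - gamma(theta_k))/L.  If gamma < 1 at
    # every grid point, Lipschitz continuity gives gamma < 1 on all of (0, theta_max].
    L = lipschitz_bound(theta_max, n)
    theta = 0.0
    while theta < theta_max:
        g = gamma(theta, n)
        if g >= 1.0:
            return False                           # bound violated at a grid point
        theta += (1.0 - g) / L
    return True

print(verify(10, 0.9))   # expected: True, i.e., dg/da < 1 for n_k = 10 on (0, 0.9]
```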

Proof of Proposition 4

By Proposition 1, \(N^{1/2}(f_{N}^{*} - f^{*})\Rightarrow \mathcal {N}(0, \sigma^{2}(x^{*}))\), as N→∞. Let \(\{N_{l}(S_{k})\}_{S_{k}=1}^{\infty}\), l=1,2,…,k, be such that N_l(S_k)∈ℕ for all S_k∈ℕ and l=1,2,…,k, \(\sum_{l=1}^{k} N_{l}(S_{k}) = S_{k}\), and N_l(S_k)/S_k→β_l∈[0,1], as S_k→∞. Consequently, \(\sum_{l=1}^{k} \beta_{l} = 1\). By Slutsky’s theorem (see, e.g., Exercise 25.7 of [7]), it then follows that for all l=1,2,…,k,

$$ \biggl(\frac{N_l(S_k)}{S_k} \biggr)^{1/2} N_l(S_k)^{1/2} \bigl(f_{N_l(S_k)}^* - f^*\bigr)\Rightarrow\beta_l^{1/2} \mathcal {N}\bigl(0, \sigma^2\bigl(x^*\bigr)\bigr), $$
(77)

as S_k→∞.

Since the sequences \(\{f_{N_{l}}(x_{i}^{l})\}_{i=0}^{n_{l}}\), l=1,2,…,k, converge exactly linearly with coefficient \(\hat{\theta}_{l+1}\), it follows that the minimization in (43) can be ignored and \(\hat{m}_{l} = f_{N_{l}}^{*}\), l=1,2,…,k. Using the recursive formula for \(\hat{f}_{k+1}^{*}\) in Step 2 of Subroutine C, we find that \(\hat{f}_{k+1}^{*} = \sum_{l=1}^{k} N_{l}(S_{k}) f_{N_{l}}^{*}/S_{k}\). Consequently,

$$ S_k^{1/2}\bigl(\hat{f}_{k+1}^* - f^*\bigr) = \sum _{l=1}^k \biggl(\frac {N_l(S_k)}{S_k} \biggr)^{1/2}N_l(S_k)^{1/2} \bigl(f_{N_l(S_k)}^* - f^*\bigr). $$
(78)

It then follows by the continuous mapping theorem and the independence of samples across stages that

$$ S_k^{1/2}\bigl(\hat{f}_{k+1}^* - f^*\bigr) \Rightarrow \mathcal{N}\Biggl(0, \sum_{l=1}^k \beta_l\sigma^2\bigl(x^*\bigr)\Biggr), $$
(79)

as S_k→∞. The conclusion then follows from the fact that \(\sum_{l=1}^{k} \beta_{l} = 1\). □
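As an informal numerical illustration of (79) (added here, not from the paper), the snippet below draws stage-wise optimal values directly from the limiting distribution in Proposition 1, i.e., f^*_{N_l} ≈ f^* + σZ_l/√N_l with independent standard normal Z_l, combines them as in Step 2 of Subroutine C, and checks that the standard deviation of S_k^{1/2}(f̂^*_{k+1} − f^*) is roughly σ; the values of σ, f^*, and N_l are made-up numbers.

```python
import numpy as np

# Illustrative only: the stage optimal values f*_{N_l} are sampled from the
# asymptotic distribution of Proposition 1 rather than obtained by solving
# SAA problems.
rng = np.random.default_rng(0)
sigma, f_star = 2.0, 1.0
N = np.array([1000, 3000, 6000])                 # N_l(S_k), l = 1, 2, 3
S = N.sum()                                      # S_k = 10000

reps = 20000
Z = rng.standard_normal((reps, len(N)))          # independent across stages
f_stage = f_star + sigma * Z / np.sqrt(N)        # simulated f*_{N_l}
f_hat = (N * f_stage).sum(axis=1) / S            # hat f*_{k+1} as in Subroutine C
print(np.std(np.sqrt(S) * (f_hat - f_star)))     # roughly sigma = 2.0
```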

Cite this article

Royset, J.O. On sample size control in sample average approximations for solving smooth stochastic programs. Comput Optim Appl 55, 265–309 (2013). https://doi.org/10.1007/s10589-012-9528-1
