Abstract
We consider the problem of estimating the optimal parameter trajectory over a finite time interval in a parameterized stochastic differential equation (SDE), and propose a simulation-based algorithm for this purpose. To this end, we discretize the SDE at finitely many time instants and reformulate the problem as one of finding an optimal parameter at each of these instants. A stochastic approximation algorithm based on the smoothed functional technique is adapted to this setting to find the optimal parameter trajectory. We present a proof of convergence of the algorithm and report results of numerical experiments in two different settings, in which the algorithm is seen to perform well. We also extend our framework to the problem of finding optimal parameterized feedback policies for controlled SDEs and present numerical results for this scenario as well.
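The two ingredients named in the abstract, time discretization of the SDE and a smoothed functional gradient estimate, can be illustrated with a minimal sketch. It is not the paper's algorithm. The toy SDE dX = (theta_t - X) dt + sigma dW, the quadratic tracking cost, the step-size schedule, the projection bound, and the names `simulate_cost` and `sf_optimize` are all assumptions made for illustration. The sketch uses a two-sided Gaussian smoothed functional estimate and reuses the same Brownian draws for both perturbed simulations, a variance-reduction choice that the abstract does not specify.

```python
import math
import random


def simulate_cost(theta, noise, x0=0.0, target=1.0, dt=0.1, sigma=0.1):
    """Euler-Maruyama rollout of the toy SDE dX = (theta_t - X) dt + sigma dW.

    theta[k] is the parameter applied at time instant k and noise[k] is a
    standard normal draw for the k-th Brownian increment. Returns the
    discretized running cost sum_k (X_{k+1} - target)^2 * dt.
    """
    x, cost = x0, 0.0
    for theta_k, z in zip(theta, noise):
        x += (theta_k - x) * dt + sigma * math.sqrt(dt) * z
        cost += (x - target) ** 2 * dt
    return cost


def sf_optimize(n_steps=10, iters=2000, beta=0.1, bound=5.0, seed=0):
    """Two-sided Gaussian smoothed functional descent on the trajectory."""
    rng = random.Random(seed)
    theta = [0.0] * n_steps  # one parameter per time instant
    for n in range(iters):
        step = 0.5 / (1.0 + 0.01 * n)  # diminishing step size (assumed schedule)
        eta = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]    # SF perturbation
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]  # shared Brownian draws
        plus = [t + beta * e for t, e in zip(theta, eta)]
        minus = [t - beta * e for t, e in zip(theta, eta)]
        diff = simulate_cost(plus, noise) - simulate_cost(minus, noise)
        # The gradient estimate for instant k is eta_k * diff / (2 * beta);
        # each update is projected back onto the box [-bound, bound].
        theta = [
            min(bound, max(-bound, t - step * e * diff / (2.0 * beta)))
            for t, e in zip(theta, eta)
        ]
    return theta


if __name__ == "__main__":
    theta_hat = sf_optimize()
    zero = [0.0] * len(theta_hat)
    print("noise-free cost at start:", simulate_cost(zero, zero))
    print("noise-free cost at end:  ", simulate_cost(theta_hat, zero))
```

Evaluating the cost with the noise set to zero, as in the usage lines at the bottom, gives a quick check that the estimated trajectory improves on the all-zero starting point in this toy setting.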
Index Terms
- Optimal parameter trajectory estimation in parameterized SDEs: An algorithmic procedure