research-article

Optimal parameter trajectory estimation in parameterized SDEs: An algorithmic procedure

Published: 23 March 2009

Abstract

We consider the problem of estimating the optimal parameter trajectory over a finite time interval in a parameterized stochastic differential equation (SDE), and propose a simulation-based algorithm for this purpose. To this end, we discretize the SDE over a finite set of time instants and reformulate the problem as one of finding an optimal parameter at each of these instants. A stochastic approximation algorithm based on the smoothed functional technique is adapted to this setting to find the optimal parameter trajectory. We present a proof of convergence of the algorithm and report numerical experiments in two different settings, in which the algorithm performs well. We also extend the framework to finding optimal parameterized feedback policies for controlled SDEs and present numerical results for this scenario as well.
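
The discretize-then-optimize idea in the abstract can be illustrated with a minimal sketch. It is not the algorithm of the article, whose details are not given on this page. The toy SDE dX = θ_k dt + σ dW, the quadratic cost, the two-sided Gaussian smoothed functional estimate, the reuse of the same Brownian increments for both perturbed runs, the step-size schedule, the projection bound, and all function names are assumptions made for illustration only.

```python
import random

def simulate_cost(theta, increments, x0=2.0, dt=0.1):
    """Euler-Maruyama path of the toy SDE dX = theta_k dt + sigma dW.

    Returns the discretized cost sum_k X_{k+1}^2 * dt (an assumed cost).
    """
    x, cost = x0, 0.0
    for th, dw in zip(theta, increments):
        x += th * dt + dw          # one Euler-Maruyama step
        cost += x * x * dt
    return cost

def brownian_increments(rng, n, dt=0.1, sigma=0.2):
    """Scaled Brownian increments sigma * dW, with dW ~ N(0, dt)."""
    return [sigma * rng.gauss(0.0, dt ** 0.5) for _ in range(n)]

def sf_optimize(n_steps=10, iters=500, beta=0.1, bound=5.0, seed=0):
    """Tune one parameter per time instant with a two-sided smoothed
    functional gradient estimate (illustrative settings, not the paper's)."""
    rng = random.Random(seed)
    theta = [0.0] * n_steps
    for n in range(iters):
        eta = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]  # Gaussian perturbation
        dw = brownian_increments(rng, n_steps)               # shared by both runs
        plus = [t + beta * e for t, e in zip(theta, eta)]
        minus = [t - beta * e for t, e in zip(theta, eta)]
        diff = simulate_cost(plus, dw) - simulate_cost(minus, dw)
        step = 5.0 / (n + 10)                                # decaying step size
        # Gradient estimate eta * diff / (2 * beta); project onto [-bound, bound].
        theta = [min(bound, max(-bound, t - step * e * diff / (2.0 * beta)))
                 for t, e in zip(theta, eta)]
    return theta

def mean_cost(theta, n_paths=2000, seed=1):
    """Monte Carlo estimate of the expected cost for a parameter trajectory."""
    rng = random.Random(seed)
    total = sum(simulate_cost(theta, brownian_increments(rng, len(theta)))
                for _ in range(n_paths))
    return total / n_paths
```

Calling `sf_optimize()` returns one parameter value per time instant, and `mean_cost` can be used to compare the tuned trajectory against the all-zero starting trajectory. Sharing the Brownian increments between the two perturbed simulations is a variance-reduction choice made for this sketch.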

          • Published in

            ACM Transactions on Modeling and Computer Simulation, Volume 19, Issue 2
            March 2009
            142 pages
            ISSN: 1049-3301
            EISSN: 1558-1195
            DOI: 10.1145/1502787

            Copyright © 2009 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 23 March 2009
            • Accepted: 1 June 2008
            • Revised: 1 January 2008
            • Received: 1 May 2007

            Qualifiers

            • research-article
            • Research
            • Refereed
