Structured policies in the sequential design of experiments

Rieder, Ulrich; Wagner, Hartmut

doi:10.1007/BF02204833

Published: December 1991

Volume 32, pages 165–188, (1991)

Annals of Operations Research

Ulrich Rieder¹ &
Hartmut Wagner¹

Abstract

A general control model under uncertainty is considered. Using a Bayesian approach and dynamic programming, we investigate structural properties of optimal decision rules. In particular, we show the monotonicity of the total expected reward and of the so-called Gittins index. We extend the stopping rule and the stay-on-a-winner rule, which are well known in bandit problems. Our approach is based on the multivariate likelihood ratio order and TP₂ functions.
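As a rough illustration of the two structural properties named in the abstract, the sketch below solves a small two-armed Bernoulli bandit by Bayesian dynamic programming and checks them numerically. It is not the general control model of the paper: the independent Beta priors, the finite horizon with undiscounted rewards, and the function names `q_values`, `value`, and `stay_on_a_winner_violations` are all assumptions made for this example.

```python
from functools import lru_cache
from itertools import product

# Illustrative assumption: arm i has unknown success probability with a
# Beta(a_i, b_i) prior; a success pays 1, a failure pays 0, no discounting.

@lru_cache(maxsize=None)
def q_values(n, a1, b1, a2, b2):
    """Expected total reward of pulling arm 1 (resp. arm 2) first with n
    pulls remaining, then continuing optimally."""
    if n == 0:
        return (0.0, 0.0)
    p1 = a1 / (a1 + b1)  # posterior mean of arm 1
    p2 = a2 / (a2 + b2)  # posterior mean of arm 2
    q1 = (p1 * (1 + value(n - 1, a1 + 1, b1, a2, b2))
          + (1 - p1) * value(n - 1, a1, b1 + 1, a2, b2))
    q2 = (p2 * (1 + value(n - 1, a1, b1, a2 + 1, b2))
          + (1 - p2) * value(n - 1, a1, b1, a2, b2 + 1))
    return (q1, q2)

def value(n, a1, b1, a2, b2):
    """Optimal total expected reward (Bayesian dynamic programming)."""
    return max(q_values(n, a1, b1, a2, b2))

def stay_on_a_winner_violations(max_n=6, max_param=3, tol=1e-9):
    """List states where arm 1 is optimal, yet after a success on arm 1
    it is no longer optimal at the next stage."""
    bad = []
    grid = range(1, max_param + 1)
    for n, a1, b1, a2, b2 in product(range(2, max_n + 1), grid, grid, grid, grid):
        q1, q2 = q_values(n, a1, b1, a2, b2)
        if q1 >= q2 - tol:  # arm 1 is optimal now
            r1, r2 = q_values(n - 1, a1 + 1, b1, a2, b2)  # state after a success
            if r1 < r2 - tol:
                bad.append((n, a1, b1, a2, b2))
    return bad

if __name__ == "__main__":
    print(value(2, 1, 1, 1, 1))
    print(stay_on_a_winner_violations())
```

In this toy setting, raising `a1` by one moves the Beta prior of arm 1 upward in the likelihood ratio order, so comparing `value(n, a1 + 1, b1, a2, b2)` with `value(n, a1, b1, a2, b2)` gives a simple numerical check of the monotonicity of the total expected reward.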

References

H. Benzing, K. Hinderer and M. Kolonko, On the k-armed Bernoulli bandit: Monotonicity of the total reward under an arbitrary prior distribution, Math. Operationsforschung Statistik, Ser. Optimization 15(1984)583–595.
H. Benzing and M. Kolonko, Structured policies for a sequential design problem with general distributions, Math. Oper. Res. 12(1987)60–71.
D.A. Berry and B. Fristedt, Bandit Problems (Chapman and Hall, London, 1985).
D.P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models (Prentice Hall, Englewood Cliffs, NJ, 1987).
K. Hinderer, Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter (Springer, Berlin, 1970).
S. Karlin and Y. Rinott, Classes of orderings of measures and related correlation inequalities. I. Multivariate totally positive distributions, J. Multivariate Anal. 10(1980)467–498.
M. Kolonko, A note on a general stopping rule in dynamic programming with finite horizon, Statist. Decisions 4(1986)379–387.
U. Rieder, Bayesian dynamic programming, Adv. Appl. Prob. 7(1975)330–348.
U. Rieder, Bayessche Kontrollmodelle, Skript Universität Ulm (1988).
H. Wagner, Strukturuntersuchungen in Bayesschen Semi-Markoffschen Kontrollmodellen, Dissertation, Universität Ulm (1988).
W. Whitt, Multivariate monotone likelihood ratio and uniform conditional stochastic order, J. Appl. Prob. 19(1982)695–701.
P. Whittle, Optimization over Time, Vol. 1 (Wiley, New York, 1982).

Author information

Authors and Affiliations

Department of Mathematics, University of Ulm, D-7900, Ulm, Germany
Ulrich Rieder & Hartmut Wagner

About this article

Cite this article

Rieder, U., Wagner, H. Structured policies in the sequential design of experiments. Ann Oper Res 32, 165–188 (1991). https://doi.org/10.1007/BF02204833

Issue Date: December 1991
DOI: https://doi.org/10.1007/BF02204833
