
Optimal Myopic Policy for Restless Bandit: A Perspective of Eigendecomposition



Abstract:

This paper considers the restless multiarmed bandit (RMAB) problem involving partially observed Markov decision processes with heterogeneous state transition matrices in discrete time slots. Due to the notorious PSPACE-hardness of the generic RMAB, we turn to the myopic policy, which schedules the apparently best processes in each slot; it has linear computational complexity and is widely adopted in practice. Intuitively, the myopic policy is not optimal because it only exploits information without exploring. In this paper, to establish the optimality of the myopic policy, we first generalize the concept of first-order stochastic dominance ordering and then decompose each state transition matrix into a weighted sum of a series of eigenmatrices, in which the weights are determined by the corresponding eigenvalues. Furthermore, we identify two sets of sufficient conditions on the eigenvalues and eigenmatrices that guarantee the optimality of the myopic policy for two instances, respectively. The theoretical results provide some insights into the structure of the optimal policy for the generic RMAB.
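The decomposition the abstract refers to can be illustrated numerically: a diagonalizable state transition matrix P factors as P = V diag(λ) V⁻¹, which rewrites as a weighted sum Σᵢ λᵢ Mᵢ of rank-one eigenmatrices Mᵢ built from the right and left eigenvectors. The following is a minimal sketch of that identity; the 2-state matrix and all variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical 2-state transition matrix of one Markov process
# (values chosen only for illustration).
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])

# Eigendecomposition: P = V diag(lam) V^{-1}.
lam, V = np.linalg.eig(P)
W = np.linalg.inv(V)  # rows of W are the left eigenvectors

# Rank-one eigenmatrices M_i = v_i w_i^T, so that P = sum_i lam_i * M_i.
M = [np.outer(V[:, i], W[i, :]) for i in range(len(lam))]
P_rebuilt = sum(l * Mi for l, Mi in zip(lam, M))

assert np.allclose(P_rebuilt, P)
```

Since P is row-stochastic, one eigenvalue equals 1 and the remaining eigenvalues have modulus at most 1; the paper's sufficient conditions are stated on these eigenvalues and the associated eigenmatrices.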
Published in: IEEE Journal of Selected Topics in Signal Processing ( Volume: 16, Issue: 3, April 2022)
Page(s): 420 - 433
Date of Publication: 13 January 2022
