Finding good stochastic factored policies for factored Markov decision processes

Radoszycki, Julia; Peyrard, Nathalie; Sabbadin, Régis

doi:10.3233/978-1-61499-419-0-1083

Abstract

We propose a framework for approximately solving Markov decision processes (MDPs) with a factored state space, a factored action space and additive reward (factored-action factored MDPs, FA-FMDPs). The framework is based on (i) considering stochastic factored policies (SFPs) with a given structure, (ii) using variational approximations to estimate SFP values and (iii) using local continuous optimization algorithms to compute “good” SFPs. We have implemented and tested an algorithm (CA-LBP) that combines a loopy belief propagation algorithm with a coordinate ascent procedure. Experiments show that CA-LBP performs as well as a state-of-the-art algorithm dedicated to a specific sub-class of FA-FMDPs, and that CA-LBP can be applied to general FA-FMDPs with up to 100 binary state variables and 100 binary action variables.
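
To make the notion of a stochastic factored policy more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that an SFP with a given structure can be written as a product of local conditional distributions, one per binary action variable, each depending only on a small set of parent state variables. The toy sizes, the `parents` structure, the `theta` parameter table and all function names are hypothetical choices made for illustration. Neither the variational value estimate nor the coordinate ascent step of CA-LBP is shown.

```python
import itertools
import random

# Hypothetical toy problem: 3 binary state variables, 2 binary action variables.
# parents[i] lists the state variables that action variable i is allowed to see.
# This fixed list plays the role of the "given structure" of the policy.
parents = [(0, 1), (2,)]

# theta[i][parent_values] = probability that action variable i equals 1.
# Random values stand in for parameters that an optimizer would tune.
param_rng = random.Random(0)
theta = [
    {pv: param_rng.random() for pv in itertools.product((0, 1), repeat=len(pa))}
    for pa in parents
]

def local_prob(i, a_i, state):
    """Probability of value a_i for action variable i, given its parents in state."""
    p_one = theta[i][tuple(state[j] for j in parents[i])]
    return p_one if a_i == 1 else 1.0 - p_one

def joint_prob(action, state):
    """Factored policy: the joint probability is a product of local terms."""
    prob = 1.0
    for i, a_i in enumerate(action):
        prob *= local_prob(i, a_i, state)
    return prob

def sample_action(state, rng):
    """Draw each action variable independently given its parent state values."""
    return tuple(
        1 if rng.random() < local_prob(i, 1, state) else 0
        for i in range(len(parents))
    )

state = (1, 0, 1)
# The local terms are normalized, so the joint probabilities sum to one.
total = sum(joint_prob(a, state) for a in itertools.product((0, 1), repeat=2))
n_params = sum(len(table) for table in theta)
sampled = sample_action(state, random.Random(1))
```

In this toy setting the factored policy has 6 free parameters (4 for the first action variable, 2 for the second), whereas an unrestricted stochastic policy over the same variables would need 3 free probabilities for each of the 8 joint states, that is 24. This gap grows quickly with the number of variables, which is one reason a fixed policy structure helps at the scale of 100 state and 100 action variables.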

IOS Press Copyright 2024