Simulation-based uniform value function estimates of discounted and average-reward MDPs

IEEE Conference Publication, IEEE Xplore