Abstract:
Define an environment as a set of convex constraint functions that vary arbitrarily over time, and consider a cost function that is also convex and arbitrarily varying. Agents that operate in this environment intend to select actions that are feasible for all times while minimizing the cost's time average. Such an action is said to be optimal and can be computed offline if the cost and the environment are known a priori. An online policy is one that depends causally on the cost and the environment. To compare online policies to the optimal offline action, define the fit of a trajectory as a vector that integrates the constraint violations over time, and its regret as the accumulated difference between its cost and that of the optimal action. Fit measures the extent to which an online policy succeeds in learning feasible actions, while regret measures its success in learning optimal actions. This paper proposes the use of online policies computed from a saddle point controller. It is shown that this controller produces policies with bounded regret and fit that grows at a sublinear rate. These properties indicate that the controller finds trajectories that are feasible and optimal in a relaxed sense. Concepts are illustrated throughout with the problem of a shepherd that wants to stay close to all sheep in a herd. Numerical experiments show that the saddle point controller allows the shepherd to do so.
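The controller described in the abstract can be sketched with a standard primal-dual (Arrow-Hurwicz) update: descend on the Lagrangian in the action variable and ascend in the dual variables, accumulating constraint violations as fit. The sketch below is illustrative only, not the paper's exact controller; the cost, constraint radius `r`, step size `eta`, and the random-walk sheep dynamics are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
T, eta, r = 2000, 0.05, 2.0  # horizon, step size, allowed sheep distance (assumed)

n_sheep = 3
sheep = rng.normal(size=(n_sheep, 2))  # initial sheep positions (assumed)

x = np.zeros(2)          # shepherd position (primal variable)
lam = np.zeros(n_sheep)  # dual variables, one per constraint
fit = np.zeros(n_sheep)  # integrated constraint violation per sheep
for t in range(T):
    sheep += 0.01 * rng.normal(size=sheep.shape)  # environment varies over time
    diffs = x - sheep
    g = np.sum(diffs**2, axis=1) - r**2  # constraint g_i(x) <= 0: stay within r of sheep i
    grad_f = 2.0 * x                     # cost f_t(x) = ||x||^2: stay near the origin (assumed)
    grad_g = 2.0 * diffs                 # gradients of the constraint functions
    # Saddle point update: primal descent on the Lagrangian, projected dual ascent.
    x = x - eta * (grad_f + grad_g.T @ lam)
    lam = np.maximum(lam + eta * g, 0.0)
    fit += g  # discrete-time analogue of integrating constraint violations

print(fit)
```

In this sketch the dual variables grow while a constraint is violated, steering the shepherd back toward the offending sheep, so the accumulated violation (fit) grows sublinearly rather than linearly in the horizon.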
Published in: 2015 54th IEEE Conference on Decision and Control (CDC)
Date of Conference: 15-18 December 2015
Date Added to IEEE Xplore: 11 February 2016