Machine Learning Proceedings 1994
Proceedings of the Eleventh International Conference, Rutgers University, New Brunswick, NJ, July 10–13, 1994
1994, Pages 266-274
On the Worst-case Analysis of Temporal-difference Learning Algorithms
Cited by (5)
Exponentiated Gradient versus Gradient Descent for Linear Predictors
1997, Information and Computation
Guided Policy Exploration for Markov Decision Processes Using an Uncertainty-Based Value-of-Information Criterion
2018, IEEE Transactions on Neural Networks and Learning Systems
Sparse Q-learning with mirror descent
2012, Uncertainty in Artificial Intelligence - Proceedings of the 28th Conference, UAI 2012
Near-optimal reinforcement learning in polynomial time
2002, Machine Learning
Copyright © 1994 Morgan Kaufmann Publishers, Inc. Published by Elsevier Inc. All rights reserved.