Cited By
- Weinstein A., Littman M. (2012). Bandit-based planning and learning in continuous-action Markov decision processes. Proceedings of the Twenty-Second International Conference on Automated Planning and Scheduling, pp. 306-314. doi:10.5555/3038546.3038582. Online publication date: 25-Jun-2012.