ABSTRACT
This research evaluates reward shaping with unconventional formats of human input. Conventionally, a human teacher is assumed to provide numeric rewards throughout task training. This conventional format has two limitations: first, the continuous demand for numeric rewards is onerous for the teacher; second, it extracts only a narrow portion of the teacher's useful knowledge. In this research, we test three unconventional formats of human input: two that increase the social appeal of reward shaping, and one that extracts knowledge from humans more efficiently. Preliminary results on simulated domains validate the usefulness of these formats in terms of user comfort and learning performance.
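To make the core mechanism concrete, the sketch below shows potential-based reward shaping in tabular Q-learning on a toy chain domain. All names and the potential function here are illustrative assumptions, not details from this paper: the human teacher's background knowledge is encoded once as a potential phi(s), and the shaped reward r + gamma*phi(s') - phi(s) injects that knowledge at every step without the teacher supplying per-step numeric rewards.

```python
import random

# Toy 1-D chain: states 0..N, goal at state N. Illustrative sketch only;
# the domain, hyperparameters, and phi are assumptions for this example.
N, GAMMA, ALPHA, EPS = 10, 0.95, 0.5, 0.1

def phi(s):
    # Hypothetical human-supplied potential: "closer to the goal is better".
    return s / N

def step(s, a):
    # Actions are -1 (left) and +1 (right); reward 1 only at the goal.
    s2 = max(0, min(N, s + a))
    r = 1.0 if s2 == N else 0.0
    return s2, r, s2 == N

def train(episodes=300, shaped=True, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N + 1) for a in (-1, 1)}
    for _ in range(episodes):
        s = 0
        for _ in range(200):
            # Epsilon-greedy action selection.
            if rng.random() < EPS:
                a = rng.choice((-1, 1))
            else:
                a = max((-1, 1), key=lambda x: Q[(s, x)])
            s2, r, done = step(s, a)
            if shaped:
                # Potential-based shaping term (Ng et al., 1999):
                # preserves the optimal policy, densifies the reward.
                r += GAMMA * phi(s2) - phi(s)
            target = r + (0.0 if done else GAMMA * max(Q[(s2, x)] for x in (-1, 1)))
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])
            s = s2
            if done:
                break
    return Q
```

Because the shaping term is a difference of potentials, the teacher's knowledge is captured in a single static function rather than a stream of per-step rewards, which is exactly the teacher burden the conventional format imposes.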
Index Terms
- Unconventional Formats of Background Knowledge from Human Teacher in Reward Shaping