Modeling reward functions for incomplete state representations via echo state networks


Abstract:

This paper investigates an echo state network (ESN) (Jaeger, 2001; Maass and Markram, 2002) architecture as an approximation of the Q-function for temporally dependent rewards embedded in a linear dynamical system, the mass-spring-damper (MSD). This problem has been solved using feed-forward neural networks (FNN) when all state information necessary to specify the dynamics is provided as input (Kretchmar, 2000). Time-delay neural networks (TDNN) solve this problem with finite-size windows of incomplete state information. Our research demonstrates that the ESN architecture, given incomplete state information, represents the Q-function of the MSD system as well as current feed-forward networks do when given either the complete state or a temporally windowed, incomplete state vector. The remainder of this paper is organized as follows. We introduce basic concepts of reinforcement learning and the echo state network architecture. The MSD system simulation is defined in Section IV. Experimental results for learning state quality given incomplete state information are presented in Section V. Results for learning estimates of all future state qualities given incomplete state information are presented in Section VI. Section VII discusses the potential of the ESN for use in reinforcement learning and outlines current and future research directions.
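To make the setup concrete, the following is a minimal sketch (not the authors' code) of the general technique the abstract describes: a fixed random ESN reservoir driven by an incomplete MSD observation (position only, velocity hidden), with a linear readout trained by ridge regression to predict a scalar value signal standing in for a Q-value estimate. The reservoir size, spectral radius, toy trajectory, and surrogate target are all assumptions made for illustration.

```python
# Minimal echo state network sketch (illustrative only, not the paper's
# implementation): fixed random reservoir + trained linear readout.
import numpy as np

rng = np.random.default_rng(0)

# --- Reservoir setup (sizes and scales are assumptions) --------------------
n_in, n_res = 2, 100                        # input = [observation, action]
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1
                                                 # (echo state property heuristic)

def run_reservoir(inputs):
    """Drive the reservoir with an input sequence; return all internal states."""
    x = np.zeros(n_res)
    states = []
    for u in inputs:
        x = np.tanh(W_in @ u + W @ x)       # standard tanh reservoir update
        states.append(x.copy())
    return np.array(states)

# --- Toy data: damped oscillation, observed through position only ----------
t = np.linspace(0, 20, 500)
pos = np.exp(-0.1 * t) * np.cos(t)          # MSD-like trajectory (assumed)
action = np.zeros_like(pos)                 # zero control force, for illustration
inputs = np.column_stack([pos, action])
target = -pos**2                            # stand-in "state quality" signal (assumed)

# --- Train the linear readout with ridge regression ------------------------
X = run_reservoir(inputs)
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ target)

pred = X @ W_out
print("readout MSE:", np.mean((pred - target) ** 2))
```

The point of the sketch is the architectural idea the abstract relies on: the recurrent reservoir retains a memory of past inputs, so a simple linear readout can recover temporally dependent quantities even though the instantaneous observation is an incomplete state representation.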
Date of Conference: 31 July 2005 - 04 August 2005
Date Added to IEEE Xplore: 27 December 2005
Print ISBN:0-7803-9048-2

Conference Location: Montreal, QC, Canada
