
Reduction of state space in reinforcement learning by sensor selection

  • Original Article
  • Published in Artificial Life and Robotics

Abstract

Much research has been conducted on applying reinforcement learning to robots, and learning time is a major concern. In reinforcement learning, information from sensors is projected onto a state space, and a robot learns the correspondence between each state and action in that space in order to find the best correspondence. As the state space expands with the number of sensors, the number of correspondences the robot must learn grows, and learning the best correspondence becomes time consuming. In this study, we focus on the importance of each sensor for the task a robot performs. The sensors relevant to a task differ from task to task, and a robot does not need all of its installed sensors to perform a given task; the state space should consist only of the sensors essential to that task. With a state space built from only the important sensors, a robot can learn correspondences faster than with a state space built from all installed sensors. We therefore propose a learning system in which a robot autonomously selects the sensors essential to a task and constructs a state space from only those sensors. We define the importance of a sensor for a task as the correlation coefficient between the sensor's value and the reward in reinforcement learning. The robot judges the importance of each sensor from this correlation and reduces the state space accordingly, so that it can learn correspondences efficiently in the reduced state space. We confirm the effectiveness of the proposed system through a simulation.
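To make the selection rule concrete, the following is a minimal Python sketch of the idea described in the abstract: sensor importance is taken as the correlation coefficient between each sensor's value and the reward, and only sufficiently correlated sensors are used to build the (discretized) state space. The logged data, the 0.3 threshold, and the 10-bin discretization are illustrative assumptions for this sketch, not details taken from the paper.

```python
# Sketch of correlation-based sensor selection for state-space reduction.
# SENSOR_LOG, REWARD_LOG, the threshold, and the bin count are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical logs gathered while the robot acts:
# one row per time step, one column per installed sensor.
SENSOR_LOG = rng.random((500, 6))                      # 6 installed sensors
REWARD_LOG = 2.0 * SENSOR_LOG[:, 0] - SENSOR_LOG[:, 3] + 0.1 * rng.random(500)

def sensor_importance(sensors: np.ndarray, rewards: np.ndarray) -> np.ndarray:
    """Importance of each sensor = |correlation coefficient| with the reward."""
    importance = np.empty(sensors.shape[1])
    for i in range(sensors.shape[1]):
        importance[i] = abs(np.corrcoef(sensors[:, i], rewards)[0, 1])
    return importance

def select_sensors(importance: np.ndarray, threshold: float = 0.3) -> np.ndarray:
    """Keep only sensors whose importance exceeds the threshold (assumed rule)."""
    return np.flatnonzero(importance > threshold)

def reduced_state(observation: np.ndarray, selected: np.ndarray,
                  bins: int = 10) -> tuple:
    """Discretize only the selected sensors; the others are dropped, so the
    state space shrinks from bins**6 states to bins**len(selected) states."""
    clipped = np.clip(observation[selected], 0.0, 1.0 - 1e-9)
    return tuple((clipped * bins).astype(int))

importance = sensor_importance(SENSOR_LOG, REWARD_LOG)
selected = select_sensors(importance)
print("importance:", np.round(importance, 2))
print("selected sensors:", selected)
print("example reduced state:", reduced_state(SENSOR_LOG[0], selected))
```

In this sketch only the sensors correlated with the reward contribute to the state, so a tabular learner such as Q-learning has far fewer state-action pairs to estimate.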

Author information

Corresponding author

Correspondence to Kentarou Kurashige.

About this article

Cite this article

Kishima, Y., Kurashige, K. Reduction of state space in reinforcement learning by sensor selection. Artif Life Robotics 18, 7–14 (2013). https://doi.org/10.1007/s10015-013-0092-2


  • DOI: https://doi.org/10.1007/s10015-013-0092-2
