
Research on task decomposition and state abstraction in reinforcement learning

Artificial Intelligence Review

Abstract

Task decomposition and state abstraction are crucial components of reinforcement learning. State abstraction allows an agent to ignore aspects of its current state that are irrelevant to its current decision, and therefore speeds up dynamic programming and learning. This paper presents the SVI algorithm, which uses a dynamic Bayesian network model to construct an influence graph that indicates the relationships between state variables. SVI performs state abstraction for each subtask by ignoring irrelevant state variables and lower-level subtasks. Experimental results show that the task decomposition introduced by SVI can significantly accelerate the construction of a near-optimal policy. This general framework can be applied to a broad spectrum of complex real-world problems such as robotics, industrial manufacturing, and games.
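
The abstract describes SVI only at a high level. As a rough illustration of the general idea of influence-graph-based state abstraction (not the authors' implementation), the sketch below derives a per-subtask abstraction from a DBN-style dependency table: a subtask keeps only the state variables that can influence its target variables through the graph. The dbn_parents table, the taxi-style variable names, and the relevant_variables / abstract_state helpers are hypothetical assumptions made for this example.

    # Minimal sketch of influence-graph-based state abstraction, assuming a
    # hand-written DBN dependency table; not the paper's SVI implementation.
    from collections import deque

    # DBN dependency model: for each state variable, the set of variables at
    # time t that influence its value at time t+1 (edges of the influence graph).
    dbn_parents = {
        "taxi_pos":      {"taxi_pos"},
        "passenger_loc": {"passenger_loc", "taxi_pos"},
        "destination":   {"destination"},
        "fuel":          {"fuel", "taxi_pos"},
    }

    def relevant_variables(targets, parents):
        """Return every state variable that can influence the target variables,
        found by backward reachability over the influence graph."""
        relevant, frontier = set(targets), deque(targets)
        while frontier:
            var = frontier.popleft()
            for parent in parents.get(var, ()):
                if parent not in relevant:
                    relevant.add(parent)
                    frontier.append(parent)
        return relevant

    def abstract_state(state, targets, parents):
        """Project a full state onto the variables relevant to a subtask,
        ignoring everything its outcome cannot depend on."""
        keep = relevant_variables(targets, parents)
        return {var: value for var, value in state.items() if var in keep}

    # Example: a hypothetical "navigate" subtask whose outcome depends only on taxi_pos.
    full_state = {"taxi_pos": (3, 1), "passenger_loc": "R", "destination": "B", "fuel": 7}
    print(abstract_state(full_state, {"taxi_pos"}, dbn_parents))  # {'taxi_pos': (3, 1)}

Pruning each subtask's state this way shrinks its effective state space, which is what allows the hierarchical decomposition to accelerate planning and learning, as the abstract claims.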



Author information

Corresponding author

Correspondence to Yu Lasheng.


About this article

Cite this article

Lasheng, Y., Zhongbin, J. & Kang, L. Research on task decomposition and state abstraction in reinforcement learning. Artif Intell Rev 38, 119–127 (2012). https://doi.org/10.1007/s10462-011-9243-9
