Multi-criteria expertness based cooperative Q-learning

  • Published in: Applied Intelligence

Abstract

One of the most influential factors in cooperative learning is the type of information exchanged. When the information exchanged among agents is rich, cooperation yields better results. To extract appropriate knowledge from agents during the cooperation process, expertness measures that assign expertness levels to the other agents are used. In this paper, a new method named Multi-Criteria Expertness based cooperative Q-learning (MCE) is proposed that utilizes all of the expertness measures and enriches the exchanged information more effectively. In MCE, all expertness measures are considered simultaneously, and the collective knowledge is the combination of the knowledge learned under each of the expertness measures. The experimental results confirm the outstanding performance of the proposed method on a sample maze world and a hunter-prey problem.
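To make the combination step concrete, the sketch below gives one plausible reading of it in Python: each agent's Q-table is weighted by its normalized expertness under several criteria (algebraic, absolute, positive, and negative reward sums, as commonly defined in expertness-based cooperative Q-learning), a fused table is formed per criterion, and the per-criterion tables are then averaged into the collective table. This is an illustrative assumption rather than the paper's exact algorithm; the function names and the final averaging step are hypothetical.

```python
import numpy as np

# Hypothetical sketch of multi-criteria expertness-weighted Q-table fusion.
# The four expertness criteria follow definitions common in the
# expertness-based cooperative Q-learning literature; the MCE combination
# (average of per-criterion fused tables) is an assumption for illustration.

MEASURES = ("normal", "absolute", "positive", "negative")

def expertness(reward_history, measure):
    """Scalar expertness of one agent under a single criterion."""
    r = np.asarray(reward_history, dtype=float)
    if measure == "normal":      # algebraic sum of rewards
        return r.sum()
    if measure == "absolute":    # sum of reward magnitudes
        return np.abs(r).sum()
    if measure == "positive":    # sum of positive rewards only
        return r[r > 0].sum()
    if measure == "negative":    # magnitude of accumulated punishments
        return -r[r < 0].sum()
    raise ValueError(f"unknown measure: {measure}")

def fuse_q_tables(q_tables, reward_histories):
    """Weight peers' Q-tables by normalized expertness under each criterion,
    then average the per-criterion results into the collective table."""
    n = len(q_tables)
    per_criterion = []
    for m in MEASURES:
        e = np.array([max(expertness(h, m), 0.0) for h in reward_histories])
        # Fall back to uniform weights if no agent has positive expertness.
        w = e / e.sum() if e.sum() > 0 else np.full(n, 1.0 / n)
        per_criterion.append(sum(wi * q for wi, q in zip(w, q_tables)))
    # Collective knowledge: combination of the per-criterion fused tables.
    return np.mean(per_criterion, axis=0)

# Example: three agents on a tiny 4-state, 2-action task.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q_tables = [rng.random((4, 2)) for _ in range(3)]
    rewards = [[1, -1, 2], [3, 0, 1], [-2, -1, 0]]
    print(fuse_q_tables(q_tables, rewards))
```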



Author information

Corresponding author

Correspondence to Esmat Pakizeh.

About this article

Cite this article

Pakizeh, E., Palhang, M. & Pedram, M.M. Multi-criteria expertness based cooperative Q-learning. Appl Intell 39, 28–40 (2013). https://doi.org/10.1007/s10489-012-0392-6
