Abstract
One of the most influential factors in cooperative learning is the kind of information that agents exchange. The richer the exchanged information, the better the results of cooperation. To extract the appropriate knowledge from agents during cooperation, expertness measures are used to assign an expertness level to each of the other agents. This paper proposes a new method, Multi-Criteria Expertness based cooperative Q-learning (MCE), which uses all of the expertness measures and attempts to enrich the exchanged information more efficiently. In MCE, all expertness measures are considered simultaneously, and the collective knowledge is the combination of the knowledge learned under each expertness measure. Experimental results on a sample maze world and a hunter-prey problem confirm the strong performance of the proposed method.
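The idea of combining knowledge across several expertness measures can be illustrated with a minimal sketch. The abstract does not give the combination rule, so everything below is an assumption made for illustration: each measure weights the agents' Q-tables in proportion to non-negative expertness scores, and the per-measure results are then averaged with equal weight. The function names, the measure names, and the scores are hypothetical and are not taken from the paper.

```python
import numpy as np

def weighted_share(q_tables, expertness):
    """Combine Q-tables with weights proportional to expertness scores.

    Assumption: scores are non-negative; if they sum to zero, every
    agent receives the same weight.
    """
    scores = np.asarray(expertness, dtype=float)
    total = scores.sum()
    if total > 0:
        weights = scores / total
    else:
        weights = np.full(len(scores), 1.0 / len(scores))
    # Weighted sum over the agent axis: (n,) x (n, S, A) -> (S, A)
    return np.tensordot(weights, np.stack(q_tables), axes=1)

def multi_criteria_share(q_tables, expertness_by_measure):
    """Average the per-measure combined tables (assumed combination rule)."""
    per_measure = [
        weighted_share(q_tables, scores)
        for scores in expertness_by_measure.values()
    ]
    return np.mean(per_measure, axis=0)

# Two agents, each with a 2-state x 2-action Q-table (toy values).
q_tables = [np.zeros((2, 2)), np.ones((2, 2))]

# Hypothetical expertness scores for the two agents under two measures.
expertness_by_measure = {
    "measure_a": [1.0, 3.0],  # weights 0.25 and 0.75
    "measure_b": [1.0, 1.0],  # weights 0.50 and 0.50
}

combined = multi_criteria_share(q_tables, expertness_by_measure)
print(combined)
```

In this toy case the first measure trusts the second agent more and the second measure trusts both equally, so the combined table lies between the two per-measure results. An equal-weight average is only one possible choice; the paper's own rule may differ.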
Cite this article
Pakizeh, E., Palhang, M. & Pedram, M.M. Multi-criteria expertness based cooperative Q-learning. Appl Intell 39, 28–40 (2013). https://doi.org/10.1007/s10489-012-0392-6
DOI: https://doi.org/10.1007/s10489-012-0392-6