Abstract
Many works provide intrinsic rewards to cope with sparse extrinsic rewards in reinforcement learning. However, because of the non-stationarity of multi-agent systems, existing single-agent methods cannot be applied to multi-agent reinforcement learning directly. In this paper, a fuzzy curiosity-driven mechanism is proposed for multi-agent reinforcement learning, with which agents can explore more efficiently in scenarios with sparse extrinsic rewards. First, we improve the variational auto-encoder so that it predicts the next joint state from the agents' joint state and joint action. Then several fuzzy partitions are built over the next joint state, which aim at assigning the prediction error to different agents, so that each agent in the multi-agent environment receives its own individual intrinsic reward. We elaborate on the proposed method for partially observable and fully observable environments separately. Experimental results show that agents learn joint policies more efficiently with the proposed fuzzy curiosity-driven mechanism, and that it also helps them find better policies during training.
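To make the mechanism described above concrete, the following is a minimal illustrative sketch, not the authors' implementation: a forward model (standing in for the improved variational auto-encoder) predicts the next joint state, and the resulting prediction error is split into per-agent intrinsic rewards through fuzzy partitions over the next joint state. The Gaussian membership functions, the per-agent partition centers, and the scaling factor `eta` are assumptions introduced here for illustration only.

```python
import numpy as np

def gaussian_membership(x, center, sigma):
    """Degree to which the next joint state x belongs to one fuzzy partition."""
    return np.exp(-np.sum((x - center) ** 2) / (2.0 * sigma ** 2))

def split_curiosity_reward(pred_next, true_next, agent_centers, sigma=1.0, eta=0.1):
    """
    pred_next, true_next : predicted / observed next joint state (1-D arrays)
    agent_centers        : list with one array of partition centers per agent
    Returns one intrinsic reward per agent (assumed splitting rule).
    """
    error = np.sum((pred_next - true_next) ** 2)          # joint prediction error
    weights = np.array([
        sum(gaussian_membership(true_next, c, sigma) for c in centers)
        for centers in agent_centers                      # memberships of each agent's partitions
    ])
    weights = weights / (weights.sum() + 1e-8)            # normalize shares across agents
    return eta * error * weights                          # each agent's intrinsic reward

# Tiny usage example with two agents and a 4-dimensional joint state.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_next = rng.normal(size=4)
    pred_next = true_next + rng.normal(scale=0.3, size=4) # imperfect forward model
    agent_centers = [rng.normal(size=(3, 4)), rng.normal(size=(3, 4))]
    print(split_curiosity_reward(pred_next, true_next, agent_centers))
```

In this sketch the prediction error is shared out in proportion to how strongly the observed next joint state activates each agent's fuzzy partitions; how the partitions are constructed and how the error is attributed in the paper itself is described in the full text.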








Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grants 62076202 and 61976178.
Cite this article
Chen, W., Shi, H., Li, J. et al. A Fuzzy Curiosity-Driven Mechanism for Multi-Agent Reinforcement Learning. Int. J. Fuzzy Syst. 23, 1222–1233 (2021). https://doi.org/10.1007/s40815-020-01035-0