A Fuzzy Curiosity-Driven Mechanism for Multi-Agent Reinforcement Learning

Published in: International Journal of Fuzzy Systems

Abstract

Many works provide intrinsic rewards to deal with sparse rewards in reinforcement learning. However, due to the non-stationarity of multi-agent systems, existing methods cannot be applied directly to multi-agent reinforcement learning. In this paper, a fuzzy curiosity-driven mechanism is proposed for multi-agent reinforcement learning, with which agents can explore more efficiently in scenarios with sparse extrinsic rewards. First, we extend the variational auto-encoder to predict the next joint state from the agents' joint state and joint action. Then several fuzzy partitions are built over the next joint state in order to assign the prediction error to different agents. With the proposed method, each agent in the multi-agent environment receives its own individual intrinsic reward. We describe the proposed method for partially observable and fully observable environments separately. Experimental results show that agents learn joint policies more efficiently with the proposed fuzzy curiosity-driven mechanism, and that it also helps them find better policies during training.
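As a rough illustration of the idea described in the abstract, the sketch below splits a forward model's prediction error on the next joint state into per-agent intrinsic rewards using fuzzy memberships. It is a minimal sketch under assumed details: the function names, the Gaussian membership form, the product t-norm, and the scaling factor `eta` are illustrative choices for this example, not the authors' implementation.

```python
# Illustrative sketch only: splitting a curiosity (prediction-error) bonus among agents
# with fuzzy memberships. Gaussian memberships, the product t-norm, and `eta` are
# assumptions made for this example, not the paper's exact formulation.
import numpy as np


def gaussian_membership(x, center, width):
    """Element-wise Gaussian fuzzy membership of x in the partition (center, width)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


def fuzzy_intrinsic_rewards(pred_next, true_next, centers, widths, eta=1.0):
    """Assign the joint prediction error to individual agents.

    pred_next, true_next : (state_dim,) predicted vs. observed next joint state,
        where pred_next would come from a learned forward model (e.g. a VAE-style predictor).
    centers, widths      : (n_agents, state_dim) parameters of each agent's fuzzy partition.
    Returns one intrinsic reward per agent.
    """
    # Curiosity signal: squared prediction error over the whole joint next state.
    total_error = np.mean((pred_next - true_next) ** 2)

    # Firing strength of each agent's partition at the observed next joint state
    # (product t-norm over state dimensions).
    strengths = np.array([
        gaussian_membership(true_next, centers[i], widths[i]).prod()
        for i in range(len(centers))
    ])
    weights = strengths / (strengths.sum() + 1e-8)  # normalised memberships

    return eta * total_error * weights  # individual intrinsic reward for each agent


# Toy usage: two agents, a 4-dimensional joint state.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_next = rng.normal(size=4)
    pred_next = true_next + 0.1 * rng.normal(size=4)  # imperfect forward-model prediction
    centers = rng.normal(size=(2, 4))
    widths = np.ones((2, 4))
    print(fuzzy_intrinsic_rewards(pred_next, true_next, centers, widths))
```

Normalising the firing strengths makes the per-agent rewards sum to the total curiosity bonus, which is one reasonable reading of "assigning the prediction error to different agents"; the paper's actual partitioning and reward assignment may differ.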



Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grants 62076202 and 61976178.

Author information

Corresponding author

Correspondence to Kao-Shing Hwang.


About this article


Cite this article

Chen, W., Shi, H., Li, J. et al. A Fuzzy Curiosity-Driven Mechanism for Multi-Agent Reinforcement Learning. Int. J. Fuzzy Syst. 23, 1222–1233 (2021). https://doi.org/10.1007/s40815-020-01035-0

