Research article
DOI: 10.1145/3653081.3653082

Research on the application of multi-intelligence collaborative method based on imitation learning

Published: 03 May 2024

ABSTRACT

Deep learning enables traditional reinforcement learning methods to tackle complex problems with large state spaces. However, deep reinforcement learning has notable drawbacks: the trained agent has limited exploration ability, training consumes large amounts of computing resources, and learning may become stuck in local optima. In this paper, imitation learning is incorporated into the training process to increase the exploration ability of multiple deep reinforcement learning agents. Experiments are carried out in the multi-agent Overcooked environment. The results show that the performance of multiple agents trained cooperatively with an expert policy is significantly improved, which suggests that expert-assisted training can help reinforcement learning agents solve complex problems.
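The abstract does not give implementation details, but a common way to fold imitation learning into deep RL training is to add a behavior-cloning loss on expert demonstrations to the agent's policy-gradient objective. The sketch below illustrates that general idea only; the network size, the `bc_weight` coefficient, and the REINFORCE-style RL term are illustrative assumptions, not the authors' method.

```python
# Minimal sketch (not the paper's implementation): a policy-gradient loss on the
# agent's own rollouts combined with a behavior-cloning loss on expert data.
# `bc_weight`, the toy network, and the batch shapes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PolicyNet(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)  # action logits


def combined_loss(policy, obs, actions, advantages,
                  expert_obs, expert_actions, bc_weight=0.5):
    """REINFORCE-style RL loss plus a behavior-cloning loss toward the expert."""
    # RL term: maximize advantage-weighted log-probability of the taken actions.
    logits = policy(obs)
    log_probs = torch.distributions.Categorical(logits=logits).log_prob(actions)
    rl_loss = -(advantages.detach() * log_probs).mean()

    # Imitation term: cross-entropy toward the expert's actions.
    expert_logits = policy(expert_obs)
    bc_loss = F.cross_entropy(expert_logits, expert_actions)

    return rl_loss + bc_weight * bc_loss


if __name__ == "__main__":
    # Random tensors stand in for rollout and demonstration batches.
    policy = PolicyNet(obs_dim=16, n_actions=6)
    optim = torch.optim.Adam(policy.parameters(), lr=3e-4)
    obs = torch.randn(32, 16)
    actions = torch.randint(0, 6, (32,))
    advantages = torch.randn(32)
    expert_obs = torch.randn(32, 16)
    expert_actions = torch.randint(0, 6, (32,))
    loss = combined_loss(policy, obs, actions, advantages,
                         expert_obs, expert_actions)
    optim.zero_grad()
    loss.backward()
    optim.step()
    print(f"combined loss: {loss.item():.3f}")
```

In a multi-agent setting such as Overcooked, each learning agent would typically carry its own copy of such a combined objective, with the imitation term anchoring exploration toward the expert policy early in training.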


Published in
IoTAAI '23: Proceedings of the 2023 5th International Conference on Internet of Things, Automation and Artificial Intelligence
November 2023, 902 pages
ISBN: 9798400716485
DOI: 10.1145/3653081
Copyright © 2023 Owner/Author. This work is licensed under a Creative Commons Attribution 4.0 International License.
Publisher: Association for Computing Machinery, New York, NY, United States