ABSTRACT
Deep learning enables traditional reinforcement learning methods to handle complex problems with large state spaces. However, deep reinforcement learning has notable drawbacks: the trained agent has limited exploration ability, training requires a large amount of computing resources, and learning may converge to a local optimum. In this paper, imitation learning is incorporated into the training process to increase the exploration ability of multiple deep reinforcement learning agents. Experiments are carried out in the multi-agent Overcooked environment. The results show that the performance of multiple agents cooperatively learning alongside an expert strategy is significantly improved. This suggests that expert-assisted training can help reinforcement learning agents solve complex problems.
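The paper does not specify how the imitation signal is combined with the reinforcement learning objective, but a common way to realize "expert-assisted training" of this kind is to add a behavior-cloning term to the agent's loss. The sketch below is a minimal, hypothetical illustration (the function names, the mixing weight `bc_weight`, and the additive combination are all assumptions, not the authors' method): the expert's demonstrated actions are scored under the current policy, and their negative log-likelihood is added to whatever RL loss the agent already optimizes.

```python
import numpy as np

def bc_loss(policy_probs, expert_actions):
    """Behavior-cloning loss: negative log-likelihood of the expert's
    actions under the current policy's action distribution.

    policy_probs   : (batch, n_actions) array of action probabilities
    expert_actions : (batch,) array of action indices taken by the expert
    """
    rows = np.arange(len(expert_actions))
    # Small epsilon guards against log(0) for actions the policy never takes.
    return -np.mean(np.log(policy_probs[rows, expert_actions] + 1e-8))

def combined_loss(rl_loss, policy_probs, expert_actions, bc_weight=0.5):
    """Hypothetical joint objective: RL loss plus a weighted imitation term.

    In practice bc_weight is often annealed toward 0 so the expert guides
    early exploration without constraining the final policy.
    """
    return rl_loss + bc_weight * bc_loss(policy_probs, expert_actions)

# Toy example: a near-uniform policy pays a large imitation penalty,
# a policy that already matches the expert pays almost none.
uniform = np.array([[0.5, 0.5], [0.5, 0.5]])
matched = np.array([[0.99, 0.01], [0.01, 0.99]])
expert = np.array([0, 1])
print(combined_loss(1.0, uniform, expert))  # RL loss + sizeable BC term
print(combined_loss(1.0, matched, expert))  # close to the bare RL loss
```

Because the imitation term only shapes the gradient, the same scheme applies per-agent in a multi-agent setting, with each learner imitating the expert for its own role.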