Limited Information Opponent Modeling

  • Conference paper
  • First Online:
Artificial Neural Networks and Machine Learning – ICANN 2023 (ICANN 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14261))

Abstract

The goal of opponent modeling is to model the opponent's policy so as to maximize the reward of the main agent. Most prior works fail to handle scenarios in which opponent information is limited. To this end, we propose a Limited Information Opponent Modeling (LIOM) approach that extracts opponent policy representations across episodes using only self-observations. LIOM introduces a novel policy-based data augmentation method that extracts opponent policy representations offline via contrastive learning and incorporates them as additional inputs for training a general response policy. During online testing, LIOM responds dynamically to opponent policies by extracting opponent policy representations from recent historical trajectory data and combining them with the general policy. Moreover, LIOM ensures a lower bound on expected rewards by balancing conservatism and exploitation. Experimental results demonstrate that LIOM accurately extracts opponent policy representations even when the opponent's information is limited, generalizes to some extent to unseen policies, and outperforms existing opponent modeling algorithms.
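As a rough illustration of the mechanism the abstract describes, the sketch below encodes windows of self-observations into embeddings trained with an InfoNCE-style contrastive objective (segments collected against the same opponent policy are treated as positive pairs) and conditions a response policy on the resulting embedding. The module names, shapes, pairing scheme, and the specific choice of InfoNCE are assumptions made for this example and are not details taken from the paper.

```python
# Minimal sketch (not the authors' code) of contrastive opponent-embedding
# learning from self-observation trajectories. All names, shapes, and the
# "same opponent policy => positive pair" scheme are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrajectoryEncoder(nn.Module):
    """Encodes a fixed-length window of self-observations into a policy embedding."""
    def __init__(self, obs_dim, embed_dim=64, hidden=128):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, embed_dim)

    def forward(self, traj):               # traj: (batch, T, obs_dim)
        _, h = self.gru(traj)              # h: (1, batch, hidden)
        z = self.head(h.squeeze(0))        # (batch, embed_dim)
        return F.normalize(z, dim=-1)      # unit-norm embeddings for cosine similarity

def info_nce(anchor_z, positive_z, temperature=0.1):
    """InfoNCE loss: each anchor's positive is a segment drawn from the same
    opponent policy; the other positives in the batch act as negatives."""
    logits = anchor_z @ positive_z.t() / temperature           # (B, B) similarity matrix
    labels = torch.arange(anchor_z.size(0), device=anchor_z.device)
    return F.cross_entropy(logits, labels)

class ConditionedPolicy(nn.Module):
    """Response policy that takes the current observation plus the opponent embedding."""
    def __init__(self, obs_dim, embed_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, z):
        return self.net(torch.cat([obs, z], dim=-1))            # action logits

# Toy usage: two trajectory segments per opponent policy form a positive pair.
obs_dim, act_dim, T, B = 8, 4, 20, 16
enc = TrajectoryEncoder(obs_dim)
pol = ConditionedPolicy(obs_dim, 64, act_dim)
seg_a = torch.randn(B, T, obs_dim)   # segment 1 for each opponent policy
seg_b = torch.randn(B, T, obs_dim)   # segment 2 from the same opponent policy
loss = info_nce(enc(seg_a), enc(seg_b))
loss.backward()

# At test time, embed the most recent window of self-observations and
# condition the general response policy on it.
with torch.no_grad():
    z = enc(seg_a[:1])
action_logits = pol(torch.randn(1, obs_dim), z)
```

The offline/online split in the abstract would correspond to training the encoder and conditioned policy on a pre-collected pool of episodes, then at test time re-encoding only the most recent trajectory window; how the paper balances conservatism and exploitation is not reflected in this toy example.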

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grant No. 62106172), the “New Generation of Artificial Intelligence” Major Project of Science & Technology 2030 (Grant No. 2022ZD0116402), and the Science and Technology on Information Systems Engineering Laboratory (Grant Nos. WDZC20235250409 and WDZC20205250407).

Author information

Corresponding author

Correspondence to Jianye Hao.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lv, Y., Yu, Y., Zheng, Y., Hao, J., Wen, Y., Yu, Y. (2023). Limited Information Opponent Modeling. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14261. Springer, Cham. https://doi.org/10.1007/978-3-031-44198-1_42

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-44198-1_42

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44197-4

  • Online ISBN: 978-3-031-44198-1

  • eBook Packages: Computer Science, Computer Science (R0)
