Abstract
Ensemble reinforcement learning, which combines the decisions of a set of base agents, is proposed to improve decision making and reduce training time. Many studies indicate that an ensemble may outperform a single agent because the base agents complement one another: the error of one agent may be corrected by the others. However, the fusion method is a fundamental issue in ensemble learning. Existing studies focus mainly on static fusion, which either assumes that all agents have the same ability or discards agents with poor average performance. As a result, static fusion overlooks base agents that perform poorly overall but excel in particular scenarios, so the ability of these agents is not fully utilized. This study proposes a dynamic fusion method that utilizes each base agent according to its local competence on test states. The performance of a base agent on the validation set is measured by the rewards it achieves in the next n steps. The similarity between a validation state and a new state is quantified by the Euclidean distance in the latent space, and the weight of each base agent is updated according to its performance on validation states and their similarity to the new state. The experimental studies confirm that the proposed dynamic fusion method outperforms both its base agents and static fusion methods. To the best of our knowledge, this is the first dynamic fusion method proposed for deep reinforcement learning, extending the study of dynamic fusion from classification to reinforcement learning.
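The weighting scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each agent's competence on every validation state is pre-computed as its n-step reward, and it uses a Gaussian kernel over the Euclidean latent distance plus normalization as one plausible way to turn distances into similarity weights (the kernel choice and the hypothetical `sigma` parameter are assumptions, as the abstract only specifies Euclidean distance).

```python
import numpy as np

def dynamic_fusion_weights(z_new, z_val, perf, sigma=1.0):
    """Weight each base agent by its local competence near a new state.

    z_new : (d,)    latent embedding of the new (test) state
    z_val : (m, d)  latent embeddings of m validation states
    perf  : (m, k)  n-step reward of each of k agents on each validation state
    sigma : kernel bandwidth (illustrative assumption, not from the paper)
    """
    # Euclidean distance in latent space -> similarity of validation states
    d2 = np.sum((z_val - z_new) ** 2, axis=1)
    sim = np.exp(-d2 / (2.0 * sigma ** 2))
    sim /= sim.sum()
    # Agent competence = similarity-weighted validation performance
    w = sim @ perf
    w = np.clip(w, 0.0, None)  # guard against negative rewards
    if w.sum() == 0.0:
        return np.full(perf.shape[1], 1.0 / perf.shape[1])
    return w / w.sum()

def fused_action(q_values, weights):
    """Pick the action maximizing the weighted sum of the agents' Q-values.

    q_values : (k, n_actions) Q-value vector of each base agent
    weights  : (k,)           output of dynamic_fusion_weights
    """
    return int(np.argmax(weights @ q_values))
```

With a test state near a validation state on which agent 0 earned high reward, agent 0 receives the larger weight and dominates the fused Q-values, which is the intended behavior: an agent that is weak on average still drives the decision in states where it is locally competent.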
Acknowledgements
This paper is supported by the Natural Science Foundation of Guangdong Province, China (No. 2018A030313203) and the Fundamental Research Funds for the Central Universities (No. 2018ZD32).
Cite this article
Chan, P.P.K., Xiao, M., Qin, X. et al. Dynamic fusion for ensemble of deep Q-network. Int. J. Mach. Learn. & Cyber. 12, 1031–1040 (2021). https://doi.org/10.1007/s13042-020-01218-z