DOI: 10.1145/3106426.3109426

Deep deformable Q-Network: an extension of deep Q-Network

Published: 23 August 2017

Abstract

The performance of deep reinforcement learning (DRL) algorithms is often constrained by instability and variability. In this work, we present an extension of the Deep Q-Network (DQN), called the Deep Deformable Q-Network, which is based on the deformable convolution mechanism. The new algorithm can readily be built on existing models and trained end-to-end by standard back-propagation. Extensive experiments on Atari games validate the feasibility and effectiveness of the proposed Deep Deformable Q-Network.
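The abstract only names the mechanism; the paper's full text is not included here. As a rough, hypothetical illustration of how a deformable convolution samples its input (following the general formulation of deformable convolutional networks, not necessarily the authors' exact architecture), the sketch below computes a single 3×3 deformable-convolution output in NumPy: each kernel tap is displaced by a learned fractional offset, and the feature map is read by bilinear interpolation, which keeps the whole operation differentiable and hence trainable end-to-end by back-propagation, as the abstract notes. All function and variable names are illustrative.

```python
import numpy as np

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D feature map at fractional (y, x)."""
    H, W = feature.shape
    y = min(max(y, 0.0), H - 1)  # clamp to the valid coordinate range
    x = min(max(x, 0.0), W - 1)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * feature[y0, x0]
            + (1 - wy) * wx * feature[y0, x1]
            + wy * (1 - wx) * feature[y1, x0]
            + wy * wx * feature[y1, x1])

def deformable_conv_output(feature, weights, offsets, py, px):
    """One output value of a 3x3 deformable convolution centred at (py, px).

    `offsets` has shape (3, 3, 2): a learned fractional (dy, dx) per kernel
    tap that displaces the regular sampling grid, so the effective receptive
    field can deform to fit the input. Bilinear interpolation keeps the
    whole operation differentiable with respect to both the weights and
    the offsets.
    """
    out = 0.0
    for ky in range(3):
        for kx in range(3):
            dy, dx = offsets[ky, kx]
            out += weights[ky, kx] * bilinear_sample(
                feature, py + (ky - 1) + dy, px + (kx - 1) + dx)
    return out
```

With all offsets zero this reduces to an ordinary 3×3 convolution; in a full network the offsets are themselves produced by a separate convolutional branch and learned jointly with the task, so a DQN-style feature extractor can adopt them without changing the training loop.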




Published In

WI '17: Proceedings of the International Conference on Web Intelligence
August 2017
1284 pages
ISBN:9781450349512
DOI:10.1145/3106426
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. deep Q-Network
  2. deep learning
  3. deformable convolution layer
  4. reinforcement learning

Qualifiers

  • Research-article

Conference

WI '17

Acceptance Rates

WI '17 Paper Acceptance Rate (also the overall acceptance rate): 118 of 178 submissions, 66%


Cited By

  • (2025) Empowering RIS-assisted NOMA networks with deep learning for user clustering and phase shifter optimization. Wireless Networks. DOI: 10.1007/s11276-025-03936-0. Online publication date: 6-Mar-2025.
  • (2024) Exponential gannet firefly optimization algorithm enabled deep learning for diabetic retinopathy detection. Biomedical Signal Processing and Control, 87:105376. DOI: 10.1016/j.bspc.2023.105376. Online publication date: Jan-2024.
  • (2023) RIFATA: Remora improved invasive feedback artificial tree algorithm-enabled hybrid deep learning approach for root disease classification. Biomedical Signal Processing and Control, 82:104578. DOI: 10.1016/j.bspc.2023.104578. Online publication date: Apr-2023.
  • (2019) Autonomous Obstacle Avoidance Algorithm of UAVs for Automatic Terrain Following Application. 2019 IEEE International Conference on Unmanned Systems and Artificial Intelligence (ICUSAI), pp. 309-314. DOI: 10.1109/ICUSAI47366.2019.9124741. Online publication date: Nov-2019.
  • (2018) Reinforcement Learning-Based Satellite Attitude Stabilization Method for Non-Cooperative Target Capturing. Sensors, 18(12):4331. DOI: 10.3390/s18124331. Online publication date: 7-Dec-2018.
  • (2018) Comprehensive Control System for Gathering Pipe Network Operation Based on Reinforcement Learning. Proceedings of the 2018 VII International Conference on Network, Communication and Computing, pp. 34-39. DOI: 10.1145/3301326.3301375. Online publication date: 14-Dec-2018.
