DOI: 10.1145/3106426.3109426

Deep deformable Q-Network: an extension of deep Q-Network

Published: 23 August 2017

Abstract

The performance of deep reinforcement learning (DRL) algorithms is often constrained by instability and variability. In this work, we present an extension of the Deep Q-Network (DQN), called the Deep Deformable Q-Network, which is based on the deformable convolution mechanism. The new algorithm can readily be built on existing models and trained end-to-end by standard back-propagation. Extensive experiments on Atari games validate the feasibility and effectiveness of the proposed Deep Deformable Q-Network.
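The abstract only names the mechanism; the paper's full text is not included here. As a rough, hypothetical illustration of how a deformable convolution samples its input (following the general formulation of deformable convolutional networks, not necessarily the authors' exact architecture), the sketch below computes a single 3×3 deformable-convolution output in NumPy: each kernel tap is displaced by a learned fractional offset, and the feature map is read by bilinear interpolation, which keeps the whole operation differentiable and hence trainable end-to-end by back-propagation, as the abstract notes. All function and variable names are illustrative.

```python
import numpy as np

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D feature map at fractional (y, x)."""
    H, W = feature.shape
    y = min(max(y, 0.0), H - 1)  # clamp to the valid coordinate range
    x = min(max(x, 0.0), W - 1)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * feature[y0, x0]
            + (1 - wy) * wx * feature[y0, x1]
            + wy * (1 - wx) * feature[y1, x0]
            + wy * wx * feature[y1, x1])

def deformable_conv_output(feature, weights, offsets, py, px):
    """One output value of a 3x3 deformable convolution centred at (py, px).

    `offsets` has shape (3, 3, 2): a learned fractional (dy, dx) per kernel
    tap that displaces the regular sampling grid, so the effective receptive
    field can deform to fit the input. Bilinear interpolation keeps the
    whole operation differentiable with respect to both the weights and
    the offsets.
    """
    out = 0.0
    for ky in range(3):
        for kx in range(3):
            dy, dx = offsets[ky, kx]
            out += weights[ky, kx] * bilinear_sample(
                feature, py + (ky - 1) + dy, px + (kx - 1) + dx)
    return out
```

With all offsets zero this reduces to an ordinary 3×3 convolution; in a full network the offsets are themselves produced by a separate convolutional branch and learned jointly with the task, so a DQN-style feature extractor can adopt them without changing the training loop.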




Published In

WI '17: Proceedings of the International Conference on Web Intelligence
August 2017
1284 pages
ISBN:9781450349512
DOI:10.1145/3106426
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. deep Q-Network
  2. deep learning
  3. deformable convolution layer
  4. reinforcement learning

Qualifiers

  • Research-article

Conference

WI '17

Acceptance Rates

WI '17 Paper Acceptance Rate (also the overall acceptance rate): 118 of 178 submissions, 66%


Cited By

  • (2025) Empowering RIS-assisted NOMA networks with deep learning for user clustering and phase shifter optimization. Wireless Networks. DOI: 10.1007/s11276-025-03936-0. Online publication date: 6-Mar-2025.
  • (2024) Exponential gannet firefly optimization algorithm enabled deep learning for diabetic retinopathy detection. Biomedical Signal Processing and Control, 87:105376. DOI: 10.1016/j.bspc.2023.105376. Online publication date: Jan-2024.
  • (2023) RIFATA: Remora improved invasive feedback artificial tree algorithm-enabled hybrid deep learning approach for root disease classification. Biomedical Signal Processing and Control, 82:104578. DOI: 10.1016/j.bspc.2023.104578. Online publication date: Apr-2023.
  • (2019) Autonomous Obstacle Avoidance Algorithm of UAVs for Automatic Terrain Following Application. 2019 IEEE International Conference on Unmanned Systems and Artificial Intelligence (ICUSAI), pp. 309-314. DOI: 10.1109/ICUSAI47366.2019.9124741. Online publication date: Nov-2019.
  • (2018) Reinforcement Learning-Based Satellite Attitude Stabilization Method for Non-Cooperative Target Capturing. Sensors, 18(12):4331. DOI: 10.3390/s18124331. Online publication date: 7-Dec-2018.
  • (2018) Comprehensive Control System for Gathering Pipe Network Operation Based on Reinforcement Learning. Proceedings of the 2018 VII International Conference on Network, Communication and Computing, pp. 34-39. DOI: 10.1145/3301326.3301375. Online publication date: 14-Dec-2018.
