short-paper

Open access

gPSRM: A generative propensity score-based replay memory for deep reinforcement learnings

Authors:

Bin Yi

BCB '24: Proceedings of the 15th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics

Article No.: 45, Pages 1 - 8

https://doi.org/10.1145/3698587.3701524

Published: 16 December 2024

Abstract

In recent years, reinforcement learning (RL) methods have made significant strides in medical decision optimization, including treatments for AIDS, cancer, and diabetes. However, designing an effective reward function remains a significant challenge in RL for medicine. This study proposes a data-driven approach to reward function design. The replay memory plays a crucial role in methods based on the deep Q-network (DQN). To address imbalanced sample distributions across categories, we introduce a novel replay memory design, based on propensity scores, for both DQN and Dueling DQN. We applied these innovations to AI decision-making in skin cancer diagnosis using the public ISIC2018 dataset. The experimental results showed substantial improvements in accuracy: the modified DQN model improved from 88.2% to 91.4%, and the modified Dueling DQN model improved further, from 87.9% to 92.8%. To assess the generalization ability of the gPSRM method and its applicability to other domains, we also trained models to predict the COVID-19 status of patients presenting to hospital emergency departments. In these experiments, Q-imb with gPSRM improved accuracy from 79.2% to 84.6%. These results underscore the potential of the proposed methods to enhance the performance of RL algorithms in medical decision-making tasks.
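To make the replay-memory idea concrete, the following is a minimal Python sketch of a class-aware replay buffer. The abstract does not specify the gPSRM algorithm, so everything here is an assumption made for illustration: the propensity score is taken to be the empirical frequency of each class label in the buffer, transitions are sampled with inverse-propensity weights, and the generative component is omitted entirely. The class name `PropensityReplayMemory`, its methods, and the labels `"benign"` and `"malignant"` are hypothetical.

```python
import random
from collections import Counter, deque


class PropensityReplayMemory:
    """Illustrative replay buffer that samples rare classes more often.

    Assumption: the "propensity" of a transition is the empirical
    frequency of its class label in the buffer. This is a stand-in,
    not the propensity model used in the paper.
    """

    def __init__(self, capacity, seed=None):
        # Oldest transitions are evicted once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, label):
        self.buffer.append((state, action, reward, next_state, label))

    def propensities(self):
        # Fraction of stored transitions that carry each label.
        counts = Counter(t[4] for t in self.buffer)
        total = len(self.buffer)
        return {label: n / total for label, n in counts.items()}

    def sample(self, batch_size):
        if not self.buffer:
            raise ValueError("cannot sample from an empty replay memory")
        scores = self.propensities()
        # Inverse-propensity weights: rare labels get larger weights.
        weights = [1.0 / scores[t[4]] for t in self.buffer]
        # Sampling is with replacement.
        return self.rng.choices(list(self.buffer), weights=weights, k=batch_size)


if __name__ == "__main__":
    memory = PropensityReplayMemory(capacity=1000, seed=0)
    for i in range(90):
        memory.push(i, 0, 1.0, i + 1, "benign")
    for i in range(10):
        memory.push(i, 1, 1.0, i + 1, "malignant")
    batch = memory.sample(2000)
    share = sum(t[4] == "malignant" for t in batch) / len(batch)
    print(f"malignant share in sampled batch: {share:.2f}")
```

In this toy setup the minority label makes up 10% of the stored transitions, but inverse-frequency weighting gives both labels the same total sampling weight, so a sampled batch is roughly balanced in expectation. Whether gPSRM balances classes in this way is not stated in the abstract.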

Published In

BCB '24: Proceedings of the 15th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics

November 2024

614 pages

ISBN:9798400713026

DOI:10.1145/3698587

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

Sponsors

SIGBio: ACM Special Interest Group on Bioinformatics

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper
Research
Refereed limited

Funding Sources

National Science Foundations of China
Chongqing Municipal Natural Science Foundation
Chongqing Talents Project

Conference

BCB '24: 15th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics

Sponsor: SIGBio

November 22 - 25, 2024

Shenzhen, China

Acceptance Rates

Overall Acceptance Rate 254 of 885 submissions, 29%
