DOI: 10.1145/3698587.3701524
short-paper
Open access

gPSRM: A generative propensity score-based replay memory for deep reinforcement learnings

Published: 16 December 2024

Abstract

In recent years, reinforcement learning (RL) methods have made significant strides in medical decision optimization, including treatments for AIDS, cancer, and diabetes. However, designing an effective reward function remains a major challenge in RL for medicine. This study proposes a data-driven approach to reward function design. Because the replay memory plays a crucial role in deep Q-network (DQN) based methods, we also introduce a propensity score-based design for the replay memory in both DQN and Dueling DQN, intended to address imbalanced sample distributions across categories. We applied both contributions to AI decision-making in skin cancer diagnosis using the public ISIC2018 dataset. Accuracy improved in both cases: the modified DQN model rose from 88.2% to 91.4%, and the modified Dueling DQN model rose from 87.9% to 92.8%. To assess how well the gPSRM method generalizes to other domains, we also trained models to predict the COVID-19 status of patients presenting to hospital emergency departments, where Q-imb with gPSRM improved accuracy from 79.2% to 84.6%. These results underscore the potential of the proposed methods to enhance the performance of RL algorithms in medical decision-making tasks.
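The abstract does not spell out how the propensity score-based replay memory is built, so the following is only a minimal sketch of the general idea of rebalancing a replay buffer across classes. It assumes, purely for illustration, that the propensity of a transition is the empirical frequency of its class in the buffer and that transitions are sampled with inverse-propensity weights. The class name `BalancedReplayMemory`, the labels, and the weighting rule are all hypothetical and are not taken from the paper; the generative component of gPSRM is not modeled at all.

```python
import random
from collections import Counter, deque


class BalancedReplayMemory:
    """Replay buffer sampled with inverse class-frequency weights.

    Hypothetical illustration: the weighting rule is an assumed stand-in
    for a propensity score, not the gPSRM procedure from the paper.
    """

    def __init__(self, capacity, seed=None):
        # deque(maxlen=...) silently evicts the oldest transition when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, label):
        """Store one transition together with its class label."""
        self.buffer.append((state, action, reward, next_state, label))

    def class_propensities(self):
        """Empirical share of each class among the stored transitions."""
        counts = Counter(t[4] for t in self.buffer)
        total = len(self.buffer)
        return {label: n / total for label, n in counts.items()}

    def sample(self, batch_size):
        """Draw a batch (with replacement) weighted by 1 / propensity."""
        if not self.buffer:
            raise ValueError("cannot sample from an empty replay memory")
        propensity = self.class_propensities()
        weights = [1.0 / propensity[t[4]] for t in self.buffer]
        return self.rng.choices(list(self.buffer), weights=weights, k=batch_size)


# Toy data with a 9:1 class imbalance (labels are made up for the example).
memory = BalancedReplayMemory(capacity=1000, seed=0)
for i in range(900):
    memory.push(i, 0, 1.0, i + 1, "benign")
for i in range(100):
    memory.push(i, 1, 1.0, i + 1, "malignant")

batch = memory.sample(2000)
share = sum(1 for t in batch if t[4] == "malignant") / len(batch)
print(f"minority share in sampled batch: {share:.2f}")
```

Under this weighting every class carries the same total sampling weight, so the minority class makes up roughly half of each batch even though it is only 10% of the buffer. Uniform sampling would instead reproduce the 9:1 imbalance in every training batch.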

Published In

BCB '24: Proceedings of the 15th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
November 2024
614 pages
ISBN:9798400713026
DOI:10.1145/3698587
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Deep reinforcement learning
  2. Propensity score
  3. Replay memory
  4. Skin cancer

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Funding Sources

  • National Science Foundations of China
  • Chongqing Municipal Natural Science Foundation
  • Chongqing Talents Project

Conference

BCB '24

Acceptance Rates

Overall Acceptance Rate 254 of 885 submissions, 29%
