DOI: 10.1145/3394171.3413761

Medical Visual Question Answering via Conditional Reasoning

Published: 12 October 2020

Abstract

Medical visual question answering (Med-VQA) aims to accurately answer a clinical question presented with a medical image. Despite its enormous potential for the healthcare industry and services, the technology is still in its infancy and far from practical use. Med-VQA tasks are highly challenging due to the massive diversity of clinical questions and the disparate visual reasoning skills required by different question types. In this paper, we propose a novel conditional reasoning framework for Med-VQA, aiming to automatically learn effective reasoning skills for various Med-VQA tasks. In particular, we develop a question-conditioned reasoning module to guide importance selection over multimodal fusion features. Considering the different nature of closed-ended and open-ended Med-VQA tasks, we further propose a type-conditioned reasoning module to learn a distinct set of reasoning skills for each of the two task types. Our conditional reasoning framework can be easily applied to existing Med-VQA systems to bring performance gains. In the experiments, we build our system on top of a recent state-of-the-art Med-VQA model and evaluate it on the VQA-RAD benchmark [23]. Remarkably, our system achieves significantly higher accuracy on both closed-ended and open-ended questions, with a 10.8% absolute accuracy gain on open-ended questions. The source code can be downloaded from https://github.com/awenbocc/med-vqa.
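The two modules in the abstract can be sketched as a small gating scheme: a question embedding produces importance weights over the multimodal fusion feature, and a separate set of gate parameters is kept per question type. This is a minimal illustrative sketch only; the class name, shapes, sigmoid-gate formulation, and parameterization are assumptions of ours, not the authors' implementation (see the linked repository for that).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ConditionalReasoning:
    """Question-conditioned gating over a multimodal fusion feature,
    with separate gate parameters for closed- vs. open-ended questions
    (the type-conditioned reasoning idea)."""

    def __init__(self, q_dim, f_dim, seed=0):
        rng = np.random.default_rng(seed)
        # One gate projection per question type; in practice these
        # would be trained jointly with the rest of the model.
        self.W = {t: rng.standard_normal((f_dim, q_dim)) * 0.1
                  for t in ("closed", "open")}

    def __call__(self, q_emb, fusion_feat, q_type):
        # Importance weights conditioned on the question embedding.
        gate = sigmoid(self.W[q_type] @ q_emb)   # shape (f_dim,), in (0, 1)
        return gate * fusion_feat                # reweighted fusion feature

reasoner = ConditionalReasoning(q_dim=4, f_dim=6)
out_open = reasoner(np.ones(4), np.ones(6), q_type="open")
out_closed = reasoner(np.ones(4), np.ones(6), q_type="closed")
```

The same question and image features thus yield different reweighted features depending on the question type, which is the intuition behind learning separate reasoning skills for closed-ended and open-ended questions.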

Supplementary Material

MP4 File (3394171.3413761.mp4)
A brief introduction to our work.

References

[1]
Asma Ben Abacha, Soumya Gayen, Jason J. Lau, Sivaramakrishnan Rajaraman, and Dina Demner-Fushman. 2018. NLM at ImageCLEF 2018 Visual Question Answering in the Medical Domain. In Working Notes of CLEF 2018 - Conference and Labs of the Evaluation Forum (CEUR Workshop Proceedings, Vol. 2125). CEUR-WS.org, Avignon, France.
[2]
Asma Ben Abacha, Sadid A. Hasan, Vivek V. Datla, Joey Liu, Dina Demner-Fushman, and Henning Müller. 2019. VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019. In Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum (CEUR Workshop Proceedings, Vol. 2380). CEUR-WS.org, Lugano, Switzerland.
[3]
Peter Anderson, Xiaodong He, Chris Buehler, Damien Teney, Mark Johnson, Stephen Gould, and Lei Zhang. 2018. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR. IEEE Computer Society, Salt Lake City, UT, USA, 6077--6086.
[4]
Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Dan Klein. 2016. Neural Module Networks. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR. IEEE Computer Society, Las Vegas, NV, USA, 39--48.
[5]
Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh. 2015. VQA: Visual Question Answering. In IEEE International Conference on Computer Vision, ICCV. IEEE Computer Society, Santiago, Chile, 2425--2433.
[6]
Kan Chen, Jiang Wang, Liang-Chieh Chen, Haoyuan Gao, Wei Xu, and Ram Nevatia. 2015. ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering. arXiv e-prints (Nov. 2015), arXiv:1511.05960.
[7]
Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the Properties of Neural Machine Translation: Encoder-Decoder Approaches. In Proceedings of SSST@EMNLP 2014, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. Association for Computational Linguistics, Doha, Qatar, 103--111.
[8]
Xuanyi Dong, Linchao Zhu, De Zhang, Yi Yang, and Fei Wu. 2018. Fast Parameter Adaptation for Few-shot Image Captioning and Visual Question Answering. In 2018 ACM Multimedia Conference on Multimedia Conference, MM. ACM, Seoul, Republic of Korea, 54--62.
[9]
Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. In Proceedings of the 34th International Conference on Machine Learning, ICML (Proceedings of Machine Learning Research, Vol. 70). PMLR, Sydney, NSW, Australia, 1126--1135.
[10]
Akira Fukui, Dong Huk Park, Daylen Yang, Anna Rohrbach, Trevor Darrell, and Marcus Rohrbach. 2016. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP. The Association for Computational Linguistics, Austin, Texas, USA, 457--468.
[11]
Yash Goyal, Tejas Khot, Douglas Summers-Stay, Dhruv Batra, and Devi Parikh. 2017. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering. In 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR. IEEE Computer Society, Honolulu, HI, USA, 6325--6334.
[12]
Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Computation, Vol. 9, 8 (1997), 1735--1780.
[13]
Ronghang Hu, Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Kate Saenko. 2017. Learning to Reason: End-to-End Module Networks for Visual Question Answering. In IEEE International Conference on Computer Vision, ICCV. IEEE Computer Society, Venice, Italy, 804--813.
[14]
Drew A. Hudson and Christopher D. Manning. 2018. Compositional Attention Networks for Machine Reasoning. In 6th International Conference on Learning Representations, ICLR. OpenReview.net, Vancouver, BC, Canada.
[15]
Drew A. Hudson and Christopher D. Manning. 2019. GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR. Computer Vision Foundation / IEEE, Long Beach, CA, USA, 6700--6709.
[16]
Bogdan Ionescu, Henning Müller, Mauricio Villegas, Alba Garcia Seco de Herrera, Carsten Eickhoff, Vincent Andrearczyk, Yashin Dicente Cid, Vitali Liauchuk, Vassili Kovalev, Sadid A. Hasan, Yuan Ling, Oladimeji Farri, Joey Liu, Matthew P. Lungren, Duc-Tien Dang-Nguyen, Luca Piras, Michael Riegler, Liting Zhou, Mathias Lux, and Cathal Gurrin. 2018. Overview of ImageCLEF 2018: Challenges, Datasets and Evaluation. In Experimental IR Meets Multilinguality, Multimodality, and Interaction - 9th International Conference of the CLEF Association, CLEF (Lecture Notes in Computer Science, Vol. 11018). Springer, Avignon, France, 309--334.
[17]
Yu Jiang, Vivek Natarajan, Xinlei Chen, Marcus Rohrbach, Dhruv Batra, and Devi Parikh. 2018. Pythia v0.1: the Winning Entry to the VQA Challenge 2018. arXiv e-prints (July 2018), arXiv:1807.09956.
[18]
Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, and Ross B. Girshick. 2017. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning. In 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR. IEEE Computer Society, Honolulu, HI, USA, 1988--1997.
[19]
Kushal Kafle and Christopher Kanan. 2017. Visual question answering: Datasets, algorithms, and future challenges. Comput. Vis. Image Underst., Vol. 163 (2017), 3--20.
[20]
Jin-Hwa Kim, Jaehyun Jun, and Byoung-Tak Zhang. 2018. Bilinear Attention Networks. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems, NeurIPS. NeurIPS, Montréal, Canada, 1571--1581.
[21]
Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations, ICLR, Conference Track Proceedings. OpenReview.net, San Diego, CA, USA.
[22]
Siwei Lai, Liheng Xu, Kang Liu, and Jun Zhao. 2015. Recurrent Convolutional Neural Networks for Text Classification. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. AAAI Press, Austin, Texas, USA, 2267--2273.
[23]
Jason J Lau, Soumya Gayen, Asma Ben Abacha, and Dina Demner-Fushman. 2018. A dataset of clinically generated visual questions and answers about radiology images. Scientific data, Vol. 5, 1 (2018), 1--10.
[24]
Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen A. W. M. van der Laak, Bram van Ginneken, and Clara I. Sánchez. 2017. A survey on deep learning in medical image analysis. Medical Image Anal., Vol. 42 (2017), 60--88.
[25]
Fei Liu, Jing Liu, Richang Hong, and Hanqing Lu. 2019. Erasing-based Attention Learning for Visual Question Answering. In Proceedings of the 27th ACM International Conference on Multimedia, MM. ACM, Nice, France, 1175--1183.
[26]
Jiasen Lu, Jianwei Yang, Dhruv Batra, and Devi Parikh. 2016. Hierarchical Question-Image Co-Attention for Visual Question Answering. In Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems, NeurIPS. Barcelona, Spain, 289--297.
[27]
Jiayuan Mao, Chuang Gan, Pushmeet Kohli, Joshua B. Tenenbaum, and Jiajun Wu. 2019. The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision. In 7th International Conference on Learning Representations, ICLR. OpenReview.net, New Orleans, LA, USA.
[28]
David Mascharka, Philip Tran, Ryan Soklaski, and Arjun Majumdar. 2018. Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR. IEEE Computer Society, Salt Lake City, UT, USA, 4942--4950.
[29]
Jonathan Masci, Ueli Meier, Dan C. Ciresan, and Jürgen Schmidhuber. 2011. Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction. In Artificial Neural Networks and Machine Learning - ICANN 2011 - 21st International Conference on Artificial Neural Networks, Proceedings, Part I (Lecture Notes in Computer Science, Vol. 6791). Springer, Espoo, Finland, 52--59.
[30]
Binh D. Nguyen, Thanh-Toan Do, Binh X. Nguyen, Tuong Do, Erman Tjiputra, and Quang D. Tran. 2019. Overcoming Data Limitation in Medical Visual Question Answering. In Medical Image Computing and Computer Assisted Intervention - MICCAI 2019 - 22nd International Conference, Part IV (Lecture Notes in Computer Science, Vol. 11767). Springer, Shenzhen, China, 522--530.
[31]
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems, NeurIPS. Vancouver, BC, Canada, 8024--8035.
[32]
Liang Peng, Yang Yang, Zheng Wang, Xiao Wu, and Zi Huang. 2019. CRA-Net: Composed Relation Attention Network for Visual Question Answering. In Proceedings of the 27th ACM International Conference on Multimedia, MM. ACM, Nice, France, 1202--1210.
[33]
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. Glove: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP, A meeting of SIGDAT, a Special Interest Group of the ACL. ACL, Doha, Qatar, 1532--1543.
[34]
Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, and Aaron C. Courville. 2018. FiLM: Visual Reasoning with a General Conditioning Layer. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence. AAAI Press, New Orleans, Louisiana, USA, 3942--3951.
[35]
Maithra Raghu, Chiyuan Zhang, Jon M. Kleinberg, and Samy Bengio. 2019. Transfusion: Understanding Transfer Learning for Medical Imaging. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems, NeurIPS. Vancouver, BC, Canada, 3342--3352.
[36]
Shaoqing Ren, Kaiming He, Ross B. Girshick, and Jian Sun. 2015. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems, NeurIPS. Montreal, Quebec, Canada, 91--99.
[37]
Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Chunyuan Li, Ricardo Henao, and Lawrence Carin. 2018. Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL, Volume 1: Long Papers. Association for Computational Linguistics, Melbourne, Australia, 440--450.
[38]
Lei Shi, Feifan Liu, and Max P. Rosen. 2019. Deep Multimodal Learning for Medical Visual Question Answering. In Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum (CEUR Workshop Proceedings, Vol. 2380). CEUR-WS.org, Lugano, Switzerland.
[39]
Robik Shrestha, Kushal Kafle, and Christopher Kanan. 2019. Answer Them All! Toward Universal Visual Question Answering Models. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR. Computer Vision Foundation / IEEE, Long Beach, CA, USA, 10472--10481.
[40]
Rupesh Kumar Srivastava, Klaus Greff, and Jürgen Schmidhuber. 2015. Highway Networks. arXiv e-prints (May 2015), arXiv:1505.00387.
[41]
Huijuan Xu and Kate Saenko. 2016. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering. In Computer Vision - ECCV 2016 - 14th European Conference, Proceedings, Part VII (Lecture Notes in Computer Science, Vol. 9911). Springer, Amsterdam, The Netherlands, 451--466.
[42]
Xin Yan, Lin Li, Chulin Xie, Jun Xiao, and Lin Gu. 2019. Zhejiang University at ImageCLEF 2019 Visual Question Answering in the Medical Domain. In Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum (CEUR Workshop Proceedings, Vol. 2380). CEUR-WS.org, Lugano, Switzerland.
[43]
Zichao Yang, Xiaodong He, Jianfeng Gao, Li Deng, and Alexander J. Smola. 2016. Stacked Attention Networks for Image Question Answering. In 2016 IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, Las Vegas, NV, USA, 21--29.
[44]
Kexin Yi, Chuang Gan, Yunzhu Li, Pushmeet Kohli, Jiajun Wu, Antonio Torralba, and Joshua B. Tenenbaum. 2020. CLEVRER: Collision Events for Video Representation and Reasoning. In 8th International Conference on Learning Representations, ICLR. OpenReview.net, Addis Ababa, Ethiopia.
[45]
Kexin Yi, Jiajun Wu, Chuang Gan, Antonio Torralba, Pushmeet Kohli, and Josh Tenenbaum. 2018. Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems, NeurIPS. Montréal, Canada, 1039--1050.
[46]
Zhou Yu, Jun Yu, Jianping Fan, and Dacheng Tao. 2017. Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering. In IEEE International Conference on Computer Vision, ICCV. IEEE Computer Society, Venice, Italy, 1839--1848.
[47]
Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, and Rob Fergus. 2015. Simple Baseline for Visual Question Answering. arXiv e-prints (Dec. 2015), arXiv:1512.02167.
[48]
Yangyang Zhou, Xin Kang, and Fuji Ren. 2018. Employing Inception-Resnet-v2 and Bi-LSTM for Medical Domain Visual Question Answering. In Working Notes of CLEF 2018 - Conference and Labs of the Evaluation Forum (CEUR Workshop Proceedings, Vol. 2125). CEUR-WS.org, Avignon, France.



    Published In

    MM '20: Proceedings of the 28th ACM International Conference on Multimedia
    October 2020
    4889 pages
    ISBN:9781450379885
    DOI:10.1145/3394171
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. attention mechanism
    2. conditional reasoning
    3. medical visual question answering

    Qualifiers

    • Research-article

    Conference

    MM '20

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


    Cited By

    • Foundation Model for Advancing Healthcare: Challenges, Opportunities and Future Directions. IEEE Reviews in Biomedical Engineering 18 (2025), 172--191. DOI: 10.1109/RBME.2024.3496744
    • Consistency Conditioned Memory Augmented Dynamic Diagnosis Model for Medical Visual Question Answering. IEEE Journal of Biomedical and Health Informatics 29, 2 (Feb. 2025), 1357--1370. DOI: 10.1109/JBHI.2024.3492141
    • Answer Distillation Network With Bi-Text-Image Attention for Medical Visual Question Answering. IEEE Access 13 (2025), 16455--16465. DOI: 10.1109/ACCESS.2025.3532308
    • A Multimodal Biomedical Foundation Model Trained from Fifteen Million Image--Text Pairs. NEJM AI 2, 1 (Jan. 2025). DOI: 10.1056/AIoa2400640
    • VG-CALF: A vision-guided cross-attention and late-fusion network for radiology images in Medical Visual Question Answering. Neurocomputing 613 (Jan. 2025), 128730. DOI: 10.1016/j.neucom.2024.128730
    • UnICLAM: Contrastive representation learning with adversarial masking for unified and interpretable Medical Vision Question Answering. Medical Image Analysis 101 (Apr. 2025), 103464. DOI: 10.1016/j.media.2025.103464
    • Targeted Visual Prompting for Medical Visual Question Answering. In Applications of Medical Artificial Intelligence (Feb. 2025), 64--73. DOI: 10.1007/978-3-031-82007-6_7
    • Detecting any instruction-to-answer interaction relationship. In Proceedings of the 41st International Conference on Machine Learning (Jul. 2024), 53909--53927. DOI: 10.5555/3692070.3694281
    • Developing ChatGPT for biology and medicine: a complete review of biomedical question answering. Biophysics Reports (2024). DOI: 10.52601/bpr.2024.240004
    • Image to Label to Answer: An Efficient Framework for Enhanced Clinical Applications in Medical Visual Question Answering. Electronics 13, 12 (Jun. 2024), 2273. DOI: 10.3390/electronics13122273
