Abstract:
In the field of medical imaging analysis, particularly in interpreting chest X-rays, deep learning models have shown remarkable progress. Nonetheless, these models often face challenges such as limited annotated data and underutilization of public data resources. This is particularly apparent with databases containing multimodal data, such as images and medical reports, where effective integration of this multimodal information remains difficult. To address these limitations, we propose the Neighbor-Assisted Multimodal Attention Network (NAMAN), a novel approach designed to leverage retrieval augmentation techniques to enhance disease classification performance. NAMAN combines nearest neighbor search with multimodal fusion, utilizing both visual features from similar X-ray images and textual information from corresponding medical records. The experimental results demonstrate the efficacy of incorporating retrieved neighbor information and multimodal integration mechanisms in NAMAN. Our ablation studies offer insights into the optimal configuration of the model, including the effects of various attention mechanisms and the number of retrieved neighbors. This work contributes to the expanding field of retrieval-augmented approaches in medical imaging, presenting a promising avenue for leveraging large-scale, multimodal medical databases to enhance diagnostic accuracy and reliability.
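The abstract does not spell out NAMAN's architecture, but the described idea (a query X-ray attending over the visual and textual embeddings of its retrieved nearest neighbors before classification) can be sketched as follows. All module names, dimensions, and layer choices below are illustrative assumptions, not the authors' actual implementation.

import torch
import torch.nn as nn

class RetrievalAugmentedFusion(nn.Module):
    """Hypothetical sketch of retrieval-augmented multimodal fusion:
    a query X-ray embedding attends over the embeddings of its k retrieved
    neighbor images and their report texts, then feeds a classification head.
    Dimensions and layer choices are illustrative, not NAMAN's actual design."""
    def __init__(self, dim=512, num_heads=8, num_classes=14):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, query_img, neighbor_imgs, neighbor_texts):
        # query_img: (B, dim); neighbor_imgs / neighbor_texts: (B, k, dim)
        memory = torch.cat([neighbor_imgs, neighbor_texts], dim=1)  # (B, 2k, dim)
        q = query_img.unsqueeze(1)                                  # (B, 1, dim)
        fused, _ = self.cross_attn(q, memory, memory)               # attend over neighbors
        logits = self.classifier((q + fused).squeeze(1))            # residual + linear head
        return logits

# Usage with random tensors standing in for image-encoder embeddings of the
# query X-ray and the image/report embeddings of its k=5 retrieved neighbors.
B, k, dim = 2, 5, 512
model = RetrievalAugmentedFusion(dim=dim)
logits = model(torch.randn(B, dim), torch.randn(B, k, dim), torch.randn(B, k, dim))
print(logits.shape)  # torch.Size([2, 14])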
Date of Conference: 03-06 December 2024
Date Added to IEEE Xplore: 10 January 2025