research-article

Modal-aware Visual Prompting for Incomplete Multi-modal Brain Tumor Segmentation

Authors:

Zheng Wang

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 3228 - 3239

https://doi.org/10.1145/3581783.3611712

Published: 27 October 2023

Abstract

In medical imaging, different magnetic resonance imaging (MRI) modalities provide complementary information. However, one or more modalities are often absent because of image corruption, artifacts, acquisition protocols, allergies to contrast agents, or cost constraints, which makes it challenging for incomplete-modality segmentation to perceive the modality-absent state. In this work, we introduce a novel incomplete multi-modal segmentation framework called Modal-aware Visual Prompting (MAVP), which draws inspiration from the widely used pre-training and prompt adjustment protocol employed in natural language processing (NLP). In contrast to previous prompts that typically use textual network embeddings, we use as prompts the embeddings generated by a modality state classifier that focuses on the missing modality states. Additionally, we integrate these modality state prompts into both the extraction stage of each modality and the modality fusion stage to facilitate intra- and inter-modal adaptation. Our approach achieves state-of-the-art performance in various modality-incomplete scenarios compared to incomplete modality-specific solutions.
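
To make the idea of a modality state prompt concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The four modality names, the lookup table that stands in for the modality state classifier's embedding, the additive way of injecting the prompt, and the mean fusion are all assumptions chosen for illustration; the abstract specifies none of them.

```python
from itertools import product
import random

# Assumed modality set for illustration; the abstract does not list the modalities.
MODALITIES = ("FLAIR", "T1", "T1ce", "T2")


def modality_states(n):
    """Enumerate every availability mask with at least one modality present."""
    return [mask for mask in product((0, 1), repeat=n) if any(mask)]


def state_index(mask):
    """Map an availability mask to the class index a state classifier would predict."""
    return modality_states(len(mask)).index(tuple(mask))


def make_prompt_table(num_states, dim, seed=0):
    """Hypothetical stand-in for classifier embeddings: one small vector per state."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_states)]


def add_prompt(feature, prompt):
    """Inject a prompt into a feature vector by element-wise addition (an assumption)."""
    return [f + p for f, p in zip(feature, prompt)]


def fuse(features, mask):
    """Average the feature vectors of the modalities that are actually available."""
    present = [f for f, m in zip(features, mask) if m]
    return [sum(values) / len(present) for values in zip(*present)]


# Usage: T1 is missing, so the mask is (FLAIR=1, T1=0, T1ce=1, T2=1).
states = modality_states(len(MODALITIES))
table = make_prompt_table(len(states), dim=4)
mask = (1, 0, 1, 1)
prompt = table[state_index(mask)]

# Toy per-modality features; the entry for the missing modality is ignored by fuse().
features = [[1.0] * 4, [0.0] * 4, [2.0] * 4, [3.0] * 4]
prompted = [add_prompt(f, prompt) for f in features]   # intra-modal stage
fused = add_prompt(fuse(prompted, mask), prompt)       # inter-modal (fusion) stage
```

Under these assumptions, four modalities give 15 non-empty availability states, and the same state prompt conditions both the per-modality features and the fused feature, mirroring the two injection points the abstract describes.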

Index Terms

Modal-aware Visual Prompting for Incomplete Multi-modal Brain Tumor Segmentation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

October 2023

9913 pages

ISBN:9798400701085

DOI:10.1145/3581783

General Chairs:
Abdulmotaleb El Saddik, University of Ottawa, Canada & MBZUAI, UAE
Tao Mei, HiDream.ai, China
Rita Cucchiara, University of Modena and Reggio Emilia, Italy

Program Chairs:
Marco Bertini, University of Florence, Italy
Diana Patricia Tobon Vallejo, Universidad de Medellin, Colombia
Pradeep K. Atrey, University at Albany, State University of New York, USA
M. Shamim Hossain, King Saud University, KSA

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Natural Science Foundation of China
CAAI-Huawei MindSpore Open Fund
National Key R&D Project
Hubei Key R&D Project

Conference

MM '23

Sponsor:

SIGMM

MM '23: The 31st ACM International Conference on Multimedia

October 29 - November 3, 2023

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
