DOI: 10.1145/3581783.3612337
research-article

Enhancing Adversarial Robustness of Multi-modal Recommendation via Modality Balancing

Published: 27 October 2023

Abstract

Multi-modal recommender systems are now widely deployed in real-world scenarios such as e-commerce. Existing multi-modal recommendation methods exploit the multi-modal content of items as auxiliary information and fuse it to boost performance. Despite the superior performance of these models, their robustness to adversarial attacks is currently not understood. In this work, we first identify the vulnerability of existing multi-modal recommendation models. Next, we show that the key reason for this vulnerability is modality imbalance: under adversarial attack, the prediction score margin between positive and negative samples in the sensitive modality drops dramatically, and the other modalities fail to compensate for it. Finally, based on this finding, we propose a novel defense method that enhances the robustness of multi-modal recommendation models through modality balancing. Specifically, we first adopt embedding distillation to obtain a pair of content-similar but prediction-different item embeddings in the sensitive modality, and calculate the score margin that reflects the modality's vulnerability. We then optimize the model to use the score margin between positive and negative samples in the other modalities to compensate for this vulnerability. The proposed method can serve as a plug-and-play module and can be applied flexibly to a wide range of multi-modal recommendation models. Extensive experiments on two real-world datasets demonstrate that our method significantly improves the robustness of multi-modal recommendation models with nearly no performance degradation on clean data.
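The notion of a per-modality score margin, and of other modalities compensating for a drop in the sensitive one, can be illustrated with a toy sketch. This is not the paper's implementation: the dot-product scoring, the modality names, the toy embeddings, and the hinge-style penalty in the hypothetical `compensation_penalty` helper are all assumptions made for illustration only.

```python
import numpy as np

def modality_margins(user, pos_item, neg_item):
    """Per-modality margin s(u, i+) - s(u, i-), with dot-product scores (assumed)."""
    return {m: float(user[m] @ pos_item[m] - user[m] @ neg_item[m]) for m in user}

def compensation_penalty(margins, sensitive, margin_drop):
    """Hypothetical hinge penalty: zero once the margins of the other
    modalities together cover the drop observed in the sensitive modality."""
    other = sum(v for m, v in margins.items() if m != sensitive)
    return max(0.0, margin_drop - other)

# Toy embeddings (assumed values) for one user, one positive and one negative item.
user = {"visual": np.array([1.0, 0.0]), "text": np.array([0.0, 1.0])}
pos = {"visual": np.array([2.0, 0.0]), "text": np.array([0.0, 1.0])}
neg = {"visual": np.array([1.0, 0.0]), "text": np.array([0.0, 0.5])}

clean = modality_margins(user, pos, neg)

# Simulate an attack that perturbs only the visual embedding of the positive item.
pos_attacked = dict(pos, visual=np.array([0.5, 0.0]))
attacked = modality_margins(user, pos_attacked, neg)

# How far the sensitive-modality margin fell, and how much of that fall
# the text modality leaves uncovered.
drop = clean["visual"] - attacked["visual"]
penalty = compensation_penalty(attacked, sensitive="visual", margin_drop=drop)

print("clean margins:", clean)
print("attacked margins:", attacked)
print("uncompensated drop:", penalty)
```

In this toy setting the text margin is smaller than the drop in the visual margin, so the penalty is positive. That is the imbalance the abstract describes: a training objective in this spirit would push the other modalities to carry enough margin to offset the sensitive one.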


Cited By

  • (2024) Multimodal Recommender Systems: A Survey. ACM Computing Surveys 57:2 (1-17). DOI: 10.1145/3695461. Online publication date: 10-Oct-2024
  • (2024) FINEST: Stabilizing Recommendations by Rank-Preserving Fine-Tuning. ACM Transactions on Knowledge Discovery from Data 18:9 (1-22). DOI: 10.1145/3695256. Online publication date: 1-Nov-2024
  • (2024) PIP: Detecting Adversarial Examples in Large Vision-Language Models via Attention Patterns of Irrelevant Probe Questions. Proceedings of the 32nd ACM International Conference on Multimedia (11175-11183). DOI: 10.1145/3664647.3685510. Online publication date: 28-Oct-2024

    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    ISBN:9798400701085
    DOI:10.1145/3581783

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. adversarial robustness
    2. modality balancing
    3. multi-modal recommendation

    Qualifiers

    • Research-article

    Funding Sources

    • The National Natural Science Foundation of China
    • The National Key R&D Program of China

    Conference

    MM '23
    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 254
    • Downloads (Last 6 weeks): 25
    Reflects downloads up to 05 Mar 2025
