DOI: 10.1145/3589334.3645554

Zero-shot Image Classification with Logic Adapter and Rule Prompt

Published: 13 May 2024

Abstract

Zero-shot image classification, which aims to predict unseen classes whose samples never appear during training, is crucial in the Web domain because new web images constantly appear on various websites. Attributes, as annotations of class-level characteristics, are widely used semantic information for this task. However, most current methods fail to capture the discriminative image features that distinguish similar images from different classes, because they focus solely on limited visual-attribute feature alignment, which leads to unsatisfactory zero-shot classification results. Therefore, we propose a Zero-Shot image Classification method with a Logic adapter and Rule prompts, called ZSCLR, which uses the logic adapter and rule prompts to encourage the model to capture discriminative image features and to perform reasoning. Specifically, ZSCLR consists of a visual perception module and a logic adapter. The visual perception module extracts image features from the training data, while the logic adapter uses a Markov logic network to encode the extracted image features together with the rule prompts and thereby refine the discriminative image features. Because the predicates of the rule prompts represent symbolic discriminative features, the model can focus on these features and classify images more precisely. In addition, the reasoning of the Markov logic network allows the logic adapter to transfer the model from recognizing images of seen classes to recognizing images of unseen classes. We conduct experiments on three standard zero-shot image classification benchmarks, where ZSCLR achieves competitive performance. Furthermore, ZSCLR can explain its predictions through the rule prompts.
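
To make the role of the logic adapter and rule prompts concrete, the sketch below (Python, not the paper's code) illustrates the general Markov-logic-network scoring idea: weighted rules over attribute predicates turn attribute evidence into class scores, and rules written for unseen classes can be applied without any training images of those classes. Every name in the sketch (the attribute names, the rules, the weights, and the visual_perception stub) is a made-up placeholder introduced only for illustration.

    # A minimal, hypothetical sketch of Markov-logic-style rule scoring; it is
    # NOT the authors' implementation. Attribute names, rules, weights, and the
    # visual_perception stub below are placeholders for illustration only.
    import numpy as np

    def visual_perception(image):
        """Stand-in for the visual perception module: maps an image to
        per-attribute confidence scores in [0, 1] (here, random toy values)."""
        rng = np.random.default_rng(0)
        return {attr: float(rng.uniform()) for attr in ("striped", "hooved", "aquatic")}

    # Rule prompts expressed as weighted clauses. In a Markov logic network,
    # every satisfied grounding of a rule adds the rule's weight to the
    # log-score of a possible world (Richardson & Domingos, 2006).
    RULES = [
        # (weight, class, attribute predicates) -- hypothetical examples
        (2.0, "zebra",   ("striped", "hooved")),
        (1.5, "dolphin", ("aquatic",)),
        (0.5, "horse",   ("hooved",)),
    ]

    def logic_adapter(attr_scores):
        """Soft MLN-style inference: a class's log-score is the sum of rule
        weights, each scaled by how strongly its attribute predicates are
        observed in the image; a softmax then yields class probabilities."""
        log_scores = {}
        for weight, cls, attrs in RULES:
            satisfaction = float(np.prod([attr_scores[a] for a in attrs]))
            log_scores[cls] = log_scores.get(cls, 0.0) + weight * satisfaction
        classes = list(log_scores)
        z = np.array([log_scores[c] for c in classes])
        probs = np.exp(z - z.max())
        probs /= probs.sum()
        return dict(zip(classes, probs))

    if __name__ == "__main__":
        attrs = visual_perception(image=None)   # toy attribute confidences
        print(logic_adapter(attrs))             # class probabilities, e.g. {'zebra': ...}

In ZSCLR itself the rule prompts and the visual features are encoded jointly by the model rather than scored with hand-written weights; the sketch only illustrates why weighted symbolic rules over discriminative attribute predicates can transfer from seen classes to unseen ones.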

Supplemental Material

MP4 File
Supplemental video




    Published In

    WWW '24: Proceedings of the ACM Web Conference 2024
    May 2024
    4826 pages
    ISBN: 9798400701719
    DOI: 10.1145/3589334


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. image classification
    2. logic adapter
    3. markov logic network
    4. rule prompt
    5. zero-shot learning

    Qualifiers

    • Research-article

    Funding Sources

    • National Key R&D Program of China
    • National Natural Science Foundation of China

    Conference

    WWW '24
    Sponsor:
    WWW '24: The ACM Web Conference 2024
    May 13 - 17, 2024
    Singapore, Singapore

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

