DOI: 10.1145/3573942.3574099
research-article

Few-Shot Image Classification Based on Cross-Dimensional Interactive Attention

Published: 16 May 2023

Abstract

In recent years, deep learning has achieved great success in conventional image classification tasks. However, with only a small amount of labeled data, deep models often struggle to achieve good results and are prone to overfitting. Scholars have therefore turned to image classification methods based on few-shot learning. The prototype network takes the mean of the support-set samples of each class as the class prototype and classifies a query-set sample by computing its distance to each prototype. To enhance the feature representation capability of the prototype network, this paper proposes a few-shot image classification method based on cross-dimensional interactive attention. The algorithm uses a pre-trained ResNet-12 model to extract deep image features and introduces a cross-dimensional interactive attention mechanism that links information across the channel and spatial dimensions through triplet attention, strengthening the interaction among dimensions. To address the insufficient generalization ability of the prototype network, the algorithm also uses a gradient centralization optimization algorithm that zero-means the weight gradients, which improves the network's generalization ability and classification accuracy. Extensive experimental results show that the proposed algorithm performs well on few-shot image classification tasks.
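The two mechanisms the abstract describes most concretely, nearest-prototype classification and zero-meaning of weight gradients, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy 2-D embeddings, and the use of Euclidean distance are assumptions made for illustration, as is the choice to centralize each row of the gradient (one row per output unit). In the method described above the embeddings would come from the ResNet-12 backbone with attention, which is not modeled here.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def class_prototypes(support):
    """Map each class label to the element-wise mean of its support embeddings."""
    prototypes = {}
    for label, vectors in support.items():
        n = len(vectors)
        # zip(*vectors) groups the values of each feature dimension together
        prototypes[label] = [sum(column) / n for column in zip(*vectors)]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))


def centralize_gradient(grad):
    """Subtract the row mean from every entry of a weight-gradient matrix.

    Assumption: each row holds the gradient for one output unit.
    """
    centered = []
    for row in grad:
        mean = sum(row) / len(row)
        centered.append([g - mean for g in row])
    return centered


# Hypothetical 2-way 2-shot episode with hand-made 2-D embeddings
support = {
    "cat": [[1.0, 0.0], [3.0, 0.0]],
    "dog": [[0.0, 2.0], [0.0, 4.0]],
}
prototypes = class_prototypes(support)
prediction = classify([1.5, 0.5], prototypes)

# Hypothetical gradient for a layer with 2 output units and 3 inputs
centered = centralize_gradient([[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]])
```

In this toy episode the query lies nearer to the "cat" prototype, and each row of the centralized gradient sums to zero. A real training loop would apply the centralization step to the gradients before the optimizer update.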



    Published In

    AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition
    September 2022
    1221 pages
    ISBN:9781450396899
    DOI:10.1145/3573942

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Cross-dimensional interactive attention
    2. Few-shot learning
    3. Image classification
    4. Information interaction
    5. Prototype network

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Xi'an University of Posts and Telecommunications

    Conference

    AIPR 2022
