Few-Shot Image Classification Based on Cross-Dimensional Interactive Attention

Authors:

Hengchang Zhang,

Xin Che

AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition

Pages 810 - 815

https://doi.org/10.1145/3573942.3574099

Published: 16 May 2023

Abstract

In recent years, deep learning techniques have achieved great success in traditional image classification tasks. However, with only a small amount of labeled data, they often struggle to achieve good results and are prone to overfitting. Scholars have therefore begun to focus on image classification methods based on few-shot learning. The prototype network uses the mean of the support-set samples of each class as that class's prototype and classifies each query-set sample by computing its distance to the prototypes. To enhance the feature representation capability of the prototype network, this paper proposes a few-shot image classification method based on cross-dimensional interactive attention. The algorithm uses a pre-trained ResNet-12 model to extract deep image features and introduces a cross-dimensional interactive attention mechanism that links information between the channel and spatial dimensions through triplet attention, strengthening the interaction among dimensions. In addition, to address the prototype network's insufficient generalization ability, the algorithm uses gradient centralization, an optimization technique that zero-centers the weight gradients, which improves the network's generalization and its classification accuracy. Extensive experimental results show that the proposed algorithm performs well on few-shot image classification tasks.
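The two generic building blocks named in the abstract, prototype-based classification and zero-centering of weight gradients, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the squared Euclidean distance is an assumed metric (the abstract only says "distance"), and centering each gradient slice over all axes except the first is an assumed convention. The ResNet-12 feature extractor and the attention module are omitted, so the inputs stand in for already-extracted feature vectors.

```python
import numpy as np


def class_prototypes(support_feats, support_labels, n_classes):
    """Return one prototype per class: the mean of that class's support features."""
    return np.stack(
        [support_feats[support_labels == c].mean(axis=0) for c in range(n_classes)]
    )


def classify_queries(query_feats, prototypes):
    """Assign each query to its nearest prototype.

    Squared Euclidean distance is an assumption made for this sketch.
    """
    diffs = query_feats[:, None, :] - prototypes[None, :, :]  # shape (Q, C, D)
    dists = (diffs ** 2).sum(axis=-1)                          # shape (Q, C)
    return dists.argmin(axis=1)


def centralize_gradient(grad):
    """Zero-center a weight gradient.

    Assumed convention: subtract the mean taken over every axis except the
    first, so each first-axis slice of the gradient has zero mean.
    One-dimensional gradients (e.g. biases) are returned unchanged.
    """
    if grad.ndim < 2:
        return grad
    axes = tuple(range(1, grad.ndim))
    return grad - grad.mean(axis=axes, keepdims=True)


# Toy 2-way 2-shot episode with 2-D features standing in for deep features.
support = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]])
labels = np.array([0, 0, 1, 1])
queries = np.array([[1.0, 1.0], [4.0, 5.0]])

prototypes = class_prototypes(support, labels, n_classes=2)
predictions = classify_queries(queries, prototypes)
print(prototypes)   # class means: [0, 1] and [5, 4]
print(predictions)  # nearest-prototype labels: [0 1]
```

In a training loop, a function like `centralize_gradient` would be applied to each weight gradient just before the optimizer's update step; how the paper integrates it with its optimizer is not stated in the abstract.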

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published In

AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition

September 2022

1221 pages

ISBN:9781450396899

DOI:10.1145/3573942

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Xi'an University of Posts and Telecommunications

Conference

AIPR 2022

AIPR 2022: 2022 5th International Conference on Artificial Intelligence and Pattern Recognition

September 23 - 25, 2022

Xiamen, China
