Abstract
A general framework for tackling few-shot learning is meta-learning, which aims to train a well-generalized meta-learner (or backbone network) that can learn a base-learner for each future task from a small amount of training data. Although much prior work has produced relatively good results, several challenges remain for few-shot image classification. First, meta-learning is a learning problem over a collection of tasks, and the meta-learner is usually shared among all tasks. To classify images of novel classes in different tasks, a base-learner must be learned for each task. Under these circumstances, how to make the base-learner specialized, so that it responds to different inputs in a highly task-specific manner, is a major open challenge. Second, classification networks tend to identify local regions from the most discriminative object parts rather than the whole objects, resulting in incomplete feature representations. To address the first challenge, we propose a task-wise attention (TWA) module that guides the base-learner to extract task-specific image features. To address the second challenge, under the guidance of TWA, we propose a part complementary learning (PCL) module that extracts and fuses the features of multiple complementary parts of target objects, thereby obtaining more specific and complete information. In addition, the proposed TWA and PCL modules can be embedded into a unified network for end-to-end training. Extensive experiments on two commonly used benchmark datasets and comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed method.
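The idea of conditioning the base-learner on the task can be illustrated with a minimal sketch: derive a channel-attention vector from the task's support-set features and use it to re-weight query features. This is only a simplified, squeeze-and-excitation-style analogy, not the paper's actual TWA architecture; the function and weight names (`task_wise_attention`, `w1`, `w2`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def task_wise_attention(support_feats, query_feats, w1, w2):
    """Re-weight query features with channel attention derived from the
    task's support set (a simplified, hypothetical gate, not the paper's TWA)."""
    # Task embedding: average of all support features in this episode.
    task_emb = support_feats.mean(axis=0)                # shape (C,)
    # Two-layer MLP producing per-channel attention weights in (0, 1).
    attn = sigmoid(w2 @ np.maximum(w1 @ task_emb, 0.0))  # shape (C,)
    # Scale every query feature channel-wise by the task-specific gate.
    return query_feats * attn                            # shape (N, C)

rng = np.random.default_rng(0)
C, hidden = 8, 4
support = rng.normal(size=(5, C))   # e.g. 5-shot support features
queries = rng.normal(size=(3, C))   # 3 query features
w1 = rng.normal(size=(hidden, C))
w2 = rng.normal(size=(C, hidden))
out = task_wise_attention(support, queries, w1, w2)
print(out.shape)  # (3, 8)
```

Because the gate lies in (0, 1), each output channel is a damped copy of the corresponding query channel; a different support set produces a different gate, so the same query features are processed in a task-dependent way.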
Acknowledgements
This work was supported by Science, Technology and Innovation Commission of Shenzhen Municipality (Grant No. JCYJ20180306171131643) and National Natural Science Foundation of China (Grant No. 61772425).
Cite this article
Cheng, G., Li, R., Lang, C. et al. Task-wise attention guided part complementary learning for few-shot image classification. Sci. China Inf. Sci. 64, 120104 (2021). https://doi.org/10.1007/s11432-020-3156-7