
Task-wise attention guided part complementary learning for few-shot image classification

  • Research Paper
  • Published in Science China Information Sciences

Abstract

Meta-learning is a general framework for tackling few-shot learning: it trains a well-generalized meta-learner (or backbone network) that can learn a base-learner for each future task from only a few training examples. Although many methods have achieved promising results, few-shot image classification still faces two challenges. First, meta-learning is a learning problem over a collection of tasks, and the meta-learner is usually shared among all tasks. To classify novel classes in different tasks, a base-learner must be learned for each task; how to specialize the base-learner so that it responds to inputs in a truly task-wise manner remains a major challenge. Second, classification networks tend to identify local regions from the most discriminative object parts rather than whole objects, resulting in incomplete feature representations. To address the first challenge, we propose a task-wise attention (TWA) module that guides the base-learner to extract task-specific image features. To address the second challenge, under the guidance of TWA, we propose a part complementary learning (PCL) module that extracts and fuses the features of multiple complementary parts of target objects, yielding more specific and complete information. Moreover, the TWA and PCL modules can be embedded into a unified network for end-to-end training. Extensive experiments on two commonly used benchmark datasets and comparisons with state-of-the-art methods demonstrate the effectiveness of our method.
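
The abstract describes the two modules only at a high level; the PyTorch sketch below illustrates one plausible reading of them, not the authors' implementation. The pooling-based task embedding, the sigmoid channel gate, the erasing threshold, and fusion by concatenation are all illustrative assumptions.

```python
# Minimal sketch of the two ideas named in the abstract (assumptions
# throughout: layer sizes, gating form, and erasing threshold are
# hypothetical, not the paper's exact TWA/PCL design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskWiseAttention(nn.Module):
    """Pool the episode's support features into a task descriptor and
    use it to re-weight channels, making features task-specific."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, feats, support_feats):
        # feats: (N, C, H, W); support_feats: (S, C, H, W)
        task_embed = support_feats.mean(dim=(0, 2, 3))   # (C,) task descriptor
        weights = self.gate(task_embed)                  # (C,) channel attention
        return feats * weights.view(1, -1, 1, 1)

class PartComplementaryLearning(nn.Module):
    """Suppress the most activated region so a second, complementary
    object part is represented, then fuse the two part features."""
    def __init__(self, threshold: float = 0.7):
        super().__init__()
        self.threshold = threshold

    def forward(self, feats):
        # feats: (N, C, H, W) task-attended features from TWA
        act = feats.mean(dim=1, keepdim=True)                        # (N, 1, H, W)
        act = act / act.amax(dim=(2, 3), keepdim=True).clamp(min=1e-6)
        mask = (act < self.threshold).float()                        # erase the peak region
        part1 = F.adaptive_avg_pool2d(feats, 1).flatten(1)           # dominant part
        part2 = F.adaptive_avg_pool2d(feats * mask, 1).flatten(1)    # complementary part
        return torch.cat([part1, part2], dim=1)                      # fused representation
```

In a 5-way 1-shot episode, for example, `support_feats` would hold the backbone features of the five support images; the fused part vector could then feed any metric-based or linear base-learner for classification.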



Acknowledgements

This work was supported by the Science, Technology and Innovation Commission of Shenzhen Municipality (Grant No. JCYJ20180306171131643) and the National Natural Science Foundation of China (Grant No. 61772425).

Author information

Corresponding author

Correspondence to Junwei Han.


About this article

Cite this article

Cheng, G., Li, R., Lang, C. et al. Task-wise attention guided part complementary learning for few-shot image classification. Sci. China Inf. Sci. 64, 120104 (2021). https://doi.org/10.1007/s11432-020-3156-7

