Abstract
In this paper, we tackle the long-tailed visual recognition problem from the categorical prototype perspective by proposing a prototype-based classifier learning (PCL) method. Owing to their generalization ability and robustness, categorical prototypes are well suited to representing category semantics; their inherently class-balanced nature also makes them promising for handling data imbalance. In PCL, we generate the categorical classifiers from the prototypes via a learnable mapping function. To further alleviate the impact of imbalance on classifier generation, two classifier calibration approaches are designed, at the prototype level and at the example level. Extensive experiments on five benchmark datasets, including the large-scale iNaturalist, Places-LT, and ImageNet-LT, show that the proposed PCL outperforms state-of-the-art methods. Furthermore, validation experiments demonstrate the effectiveness of the tailored designs in PCL for long-tailed problems.
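The core idea, generating each class's classifier from its prototype rather than learning it directly from the imbalanced data, can be illustrated with a minimal sketch. Everything below is hypothetical and not the authors' implementation: toy Gaussian features stand in for learned representations, and a fixed linear map stands in for PCL's trained mapping function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy long-tailed setup: a head class with 100 examples, a tail class with 5.
feats = {0: rng.normal(loc=1.0, size=(100, 8)),
         1: rng.normal(loc=-1.0, size=(5, 8))}

# Categorical prototypes: per-class feature means. One prototype per class,
# regardless of class frequency, hence class-balanced by construction.
prototypes = np.stack([feats[c].mean(axis=0) for c in (0, 1)])

# Classifier generation: map prototypes to classifier weights. Here an
# identity matrix stands in for the learnable mapping function of PCL.
W = np.eye(8)
classifiers = prototypes @ W

def predict(x):
    """Classify by cosine similarity to each generated classifier."""
    w = classifiers / np.linalg.norm(classifiers, axis=1, keepdims=True)
    return int(np.argmax(w @ (x / np.linalg.norm(x))))
```

Because the tail class contributes a prototype on equal footing with the head class, the generated tail classifier is not dwarfed by the head classifier, e.g., `predict(np.ones(8))` returns 0 and `predict(-np.ones(8))` returns 1 in this toy setting.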
Acknowledgements
This work was supported by National Key R&D Program of China (Grant No. 2021YFA1001100), National Natural Science Foundation of China (Grant Nos. 61925201, 62132001, U21B2025, 61871226), Natural Science Foundation of Jiangsu Province of China (Grant No. BK20210340), Fundamental Research Funds for the Central Universities (Grant No. 30920041111), CAAI-Huawei MindSpore Open Fund, and Beijing Academy of Artificial Intelligence (BAAI). We gratefully acknowledge the support of MindSpore, CANN (Compute Architecture for Neural Networks), and Ascend AI Processor used for this research.
Cite this article
Wei, XS., Xu, SL., Chen, H. et al. Prototype-based classifier learning for long-tailed visual recognition. Sci. China Inf. Sci. 65, 160105 (2022). https://doi.org/10.1007/s11432-021-3489-1