research-article

Nonlocal Hybrid Network for Long-tailed Image Classification

Authors:

Rongjiao Liang,

Jinyun Tang

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 4

Article No.: 108, Pages 1–22

https://doi.org/10.1145/3630256

Published: 11 January 2024

Abstract

Long-tailed data poses a significant challenge for image classification. This article proposes a nonlocal hybrid network (NHN) that addresses both feature learning and classifier learning. The NHN captures dependencies between spatially distant locations and alleviates, to some extent, the impact of long-tailed data on the model. First, a nonlocal module obtains the dependencies between distant pixels to extract richer feature representations. Then, a learnable soft class center is introduced to balance the supervised contrastive loss and reduce the impact of long-tailed data on feature learning. For efficiency, a logit adjustment strategy is adopted to correct the bias caused by the differing label distributions of the training and test sets, yielding a classifier better suited to long-tailed data. Finally, extensive experiments are conducted on two benchmark datasets with imbalanced label distributions: the long-tailed CIFAR and the large-scale real-world iNaturalist 2018. The experimental results show that the proposed NHN model is efficient and promising.
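Two of the ingredients named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The function names `nonlocal_response` and `logit_adjust` and the parameter `tau` are hypothetical. The nonlocal step uses identity embeddings in place of learned ones, and the logit step assumes the post-hoc form of logit adjustment (subtracting the scaled log class prior). The abstract does not say which variant NHN uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def nonlocal_response(features):
    """Dot-product nonlocal operation over a list of position feature vectors.

    Every output position is a softmax-weighted sum over ALL positions, so
    distant positions can influence each other in a single step.
    Assumption: learned embeddings are replaced by identity maps.
    """
    out = []
    for query in features:
        scores = [sum(a * b for a, b in zip(query, key)) for key in features]
        weights = softmax(scores)
        mixed = [
            sum(w * value[d] for w, value in zip(weights, features))
            for d in range(len(query))
        ]
        # Residual connection: add the aggregated context to the input.
        out.append([x + c for x, c in zip(query, mixed)])
    return out


def logit_adjust(logits, class_counts, tau=1.0):
    """Post-hoc logit adjustment (assumed variant).

    Subtracts tau * log(prior) from each logit, where the prior is the
    class frequency in the training set, so rare classes are boosted.
    """
    total = sum(class_counts)
    return [z - tau * math.log(n / total) for z, n in zip(logits, class_counts)]


# Hypothetical two-class example: class 0 is the head, class 1 the tail.
raw = [2.0, 1.5]
adjusted = logit_adjust(raw, class_counts=[900, 100])
print(raw.index(max(raw)), adjusted.index(max(adjusted)))
```

In this toy case the raw logits favor the head class, while the adjusted logits favor the tail class, which is the kind of bias correction the abstract describes.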

Cited By

Xi P, Cheng D, Lu G, Deng Z, Zhang G, Zhang S (2025). Identifying local useful information for attribute graph anomaly detection. Neurocomputing 617 (128900). Online publication date: Feb-2025
https://doi.org/10.1016/j.neucom.2024.128900
Wu Y, Lu Q, Wang W, Wang W, Li J, Xu X, Che K (2024). Recognition of dispersed organic matter macerals using YOLOv5m model with convolutional block attention module. Fuel 378 (132899). Online publication date: Dec-2024
https://doi.org/10.1016/j.fuel.2024.132899
Li Z, Zhang W, Song J, Chen B, Hu Y, Zhang S (2024). WS-GCA: A Synergistic Framework for Precise Semantic Segmentation with Comprehensive Supervision. Web and Big Data (435-450). Online publication date: 31-Aug-2024
https://dl.acm.org/doi/10.1007/978-981-97-7232-2_29

Index Terms

Nonlocal Hybrid Network for Long-tailed Image Classification
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 20, Issue 4

April 2024

676 pages

EISSN: 1551-6865

DOI: 10.1145/3613617

Editor:
Abdulmotaleb El Saddik
Mohamed Bin Zayed University of Artificial Intelligence, UAE and University of Ottawa, Canada

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 January 2024

Online AM: 02 November 2023

Accepted: 21 October 2023

Revised: 02 August 2023

Received: 18 January 2023

Published in TOMM Volume 20, Issue 4

Qualifiers

Research-article

Funding Sources

Project of Guangxi Science and Technology
Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security

Bibliometrics

Total Citations: 3
Total Downloads: 249
Downloads (Last 12 months): 156
Downloads (Last 6 weeks): 9

Reflects downloads up to 19 Feb 2025
