Feature channel interaction long-tailed image classification model based on dual attention

  • Original Paper
Signal, Image and Video Processing

Abstract

Real-world data often follow a long-tailed distribution, and this imbalance biases the learned model toward the head classes. To reduce the effect of long-tailed distributions on image classification, this paper proposes a feature channel interaction model for long-tailed image classification based on dual attention. First, a dual attention module captures the autocorrelation and spatial-dimension information of the feature map, and enhanced images are obtained through transformation and class activation maps. Image preprocessing is then applied to the enhanced dataset to reduce overfitting to the head classes, so that the model learns features that are more conducive to classifying the tail classes. Finally, each feature channel interacts with its adjacent local channels to extract the correlation between channels, which yields more robust features. The method achieves good performance on the CIFAR10-LT, CIFAR100-LT and ImageNet datasets, demonstrating the effectiveness of the model.
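
The abstract does not spell out how the local channel interaction is computed, so the following is only a minimal sketch of one common way to realise the idea: pool each channel to a single descriptor, mix every descriptor with those of its neighbouring channels using a small 1-D kernel, and rescale the channels with the resulting sigmoid weights. The function names, the zero padding at the channel edges, and the fixed `kernel` values are assumptions made for illustration; in a trained network the kernel would be learned, and the authors' actual module may differ.

```python
import math


def global_average_pool(feature_map):
    """Reduce each channel (a 2-D list of floats) to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def local_channel_weights(descriptor, kernel):
    """Mix each channel descriptor with its neighbours, then apply a sigmoid.

    `kernel` is a hypothetical odd-length list of weights (an assumption of
    this sketch); a real model would learn these values during training.
    """
    half = len(kernel) // 2
    weights = []
    for c in range(len(descriptor)):
        total = 0.0
        for offset, w in enumerate(kernel):
            j = c + offset - half
            if 0 <= j < len(descriptor):  # zero padding at the channel edges
                total += w * descriptor[j]
        weights.append(1.0 / (1.0 + math.exp(-total)))
    return weights


def channel_interaction(feature_map, kernel):
    """Rescale every channel by its locally mixed attention weight."""
    weights = local_channel_weights(global_average_pool(feature_map), kernel)
    return [
        [[value * w for value in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]


# Toy input: 3 channels of size 2x2; only the first channel is active.
toy_map = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
toy_weights = local_channel_weights(global_average_pool(toy_map), [1.0, 1.0, 1.0])
rescaled = channel_interaction(toy_map, [1.0, 1.0, 1.0])
```

In this toy input the second channel is empty, yet it receives the same weight as the first because the kernel lets it see its active neighbour, while the third channel, which has no active neighbour, stays at the neutral weight of 0.5. That neighbour-driven weighting is the sense in which the interaction is local.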

Data availability

Some or all data, models, or code generated or used during the study are available from the corresponding author by request.

Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Contributions

K.W. and K.L. conducted research and wrote the paper. All authors made substantial contributions to the concept, design, and revision of the paper.

Corresponding author

Correspondence to Keer Wang.

Ethics declarations

Conflict of interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liao, K., Wang, K., Zheng, Y. et al. Feature channel interaction long-tailed image classification model based on dual attention. SIViP 18, 1661–1670 (2024). https://doi.org/10.1007/s11760-023-02848-w

  • DOI: https://doi.org/10.1007/s11760-023-02848-w
