Gradient optimization for object detection in learning with noisy labels

Applied Intelligence

Abstract

Deep neural networks have made significant progress by benefiting from large-scale, correctly human-labeled datasets. However, large-scale human-labeled datasets are often ambiguous because annotators' limited experience can lead to mislabeled classes. Most research on learning with noisy labels concentrates on image classification, whereas we focus on object detection, which also suffers from noisy labels. In this paper, we propose gradient optimization for object detection (GOOD), a method that aims to combat the poor generalization caused by noisy labels in object detection. Usually, a detection task is divided into a foreground-background subtask and a foreground-object subtask. Hence, gradient descent with cross-entropy exploits corrected gradient guidance for the foreground-background subtask, while dynamic gradient underweighted ascent with cross-entropy and variant gradient clipping with improved symmetric cross-entropy are jointly employed to prevent incorrect gradient guidance for the foreground-object subtask. We conducted extensive experiments on PASCAL VOC 2012 and COCO 2017, which demonstrate the effectiveness of GOOD. Furthermore, we extend GOOD to instance segmentation, and competitive results on Cityscapes show that it is also appropriate for that task. Specifically, we achieved improvements of 9.4% on PASCAL VOC 2012, 5.2% on COCO 2017, and 4.3% on Cityscapes.
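
To make two of the ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard symmetric cross-entropy of Wang et al. (2019), not the "improved" variant used in GOOD, and a SIGUA-style rule (Han et al., 2020) in which small-loss samples receive ordinary gradient descent and large-loss samples receive a down-weighted gradient ascent. The names `keep_ratio`, `gamma`, `alpha`, `beta`, and `log_zero` are hypothetical parameters chosen for illustration; the dynamic scheduling and the gradient clipping described in the abstract are not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label, eps=1e-7):
    """Standard cross-entropy for one sample: -log p(label)."""
    return -math.log(max(probs[label], eps))


def reverse_cross_entropy(probs, label, log_zero=-4.0):
    """Reverse cross-entropy: -sum_k p_k * log q_k for a one-hot label q.

    log(1) = 0 for the labeled class; log(0) is replaced by the
    constant log_zero (an assumed value) for all other classes.
    """
    return -sum(p * log_zero for k, p in enumerate(probs) if k != label)


def symmetric_cross_entropy(probs, label, alpha=1.0, beta=1.0):
    """Standard SCE = alpha * CE + beta * RCE (assumed weights)."""
    ce = cross_entropy(probs, label)
    rce = reverse_cross_entropy(probs, label)
    return alpha * ce + beta * rce


def signed_weights(losses, keep_ratio, gamma):
    """SIGUA-style per-sample weights (an assumption, not GOOD itself).

    The keep_ratio fraction of samples with the smallest loss gets
    weight +1 (gradient descent); the rest get -gamma (underweighted
    gradient ascent).
    """
    n_keep = int(round(keep_ratio * len(losses)))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    weights = [-gamma] * len(losses)
    for i in order[:n_keep]:
        weights[i] = 1.0
    return weights


# Toy batch: per-sample class logits and (possibly noisy) labels.
batch_logits = [[2.0, 0.1, -1.0], [0.2, 0.3, 0.1], [-1.0, 3.0, 0.0]]
batch_labels = [0, 2, 0]

losses = [
    symmetric_cross_entropy(softmax(z), y)
    for z, y in zip(batch_logits, batch_labels)
]
weights = signed_weights(losses, keep_ratio=2 / 3, gamma=0.01)

# Scalar objective whose gradient descends on small-loss samples and
# ascends (scaled by gamma) on the large-loss sample.
objective = sum(w * l for w, l in zip(weights, losses)) / len(losses)
```

In this toy batch the third sample, whose label disagrees strongly with the prediction, has the largest loss and is the only one assigned the negative weight. In a real detector the same weighting would be applied to the classification loss of each region proposal inside an automatic-differentiation framework.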

Data Availability

The code is available at https://github.com/QiangqiangXia/GOOD.


Acknowledgment

This work is partially supported by JSPS KAKENHI Grant Number 22K12079.

Author information

Corresponding authors

Correspondence to Feifei Lee or Qiu Chen.

Ethics declarations

Competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xia, Q., Hu, C., Lee, F. et al. Gradient optimization for object detection in learning with noisy labels. Appl Intell 54, 4248–4259 (2024). https://doi.org/10.1007/s10489-024-05357-6

  • DOI: https://doi.org/10.1007/s10489-024-05357-6
