Abstract
Deep neural networks have made significant progress by benefiting from large-scale, correctly human-labeled datasets. However, large-scale human-labeled datasets often contain ambiguous or incorrect labels, because annotators with limited experience can mislabel classes. Most research on learning with noisy labels concentrates on image classification, whereas we focus on object detection, which also suffers from noisy labels. In this paper, we propose a method that applies gradient optimization for object detection (GOOD), aiming to combat the poor generalization caused by noisy labels in object detection. A detection task is usually divided into a foreground-background subtask and a foreground-object subtask. Accordingly, gradient descent with cross-entropy exploits correct gradient guidance for the foreground-background subtask, while dynamic underweighted gradient ascent with cross-entropy and variant gradient clipping with improved symmetric cross-entropy are employed together to prevent incorrect gradient guidance for the foreground-object subtask. We conducted extensive experiments on PASCAL VOC 2012 and COCO 2017, demonstrating the effectiveness of GOOD. Furthermore, we extend GOOD to instance segmentation, and competitive results on Cityscapes show that it is also suitable for that task. Specifically, we achieved a 9.4% improvement on PASCAL VOC 2012, 5.2% on COCO 2017, and 4.3% on Cityscapes.
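To make the loss and gradient terms named in the abstract concrete, the following is a minimal sketch of the standard symmetric cross-entropy (cross-entropy plus reverse cross-entropy) and of plain norm-based gradient clipping for a single classification sample. It is not the GOOD implementation: the abstract does not specify the "improved" symmetric cross-entropy or the "variant" clipping rule, so the function names, the weights `alpha` and `beta`, the `log_zero` constant, and `max_norm` are all illustrative assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def symmetric_cross_entropy(logits, label, alpha=1.0, beta=1.0,
                            log_zero=-4.0, eps=1e-7):
    """Standard symmetric cross-entropy: alpha * CE + beta * reverse CE.

    Assumption: log(0) for the one-hot target is replaced by the
    constant `log_zero`, a common convention for reverse CE.
    """
    p = softmax(logits)
    ce = -math.log(max(p[label], eps))
    # Reverse CE: -sum_k p_k * log(q_k) with q one-hot, so log(q_label) = 0
    # and every other class contributes -p_k * log_zero.
    rce = -sum(p_k * log_zero for k, p_k in enumerate(p) if k != label)
    return alpha * ce + beta * rce

def cross_entropy_grad(logits, label):
    """Gradient of cross-entropy w.r.t. the logits: softmax(logits) - one_hot."""
    p = softmax(logits)
    return [p_k - (1.0 if k == label else 0.0) for k, p_k in enumerate(p)]

def clip_by_norm(grad, max_norm):
    """Generic L2-norm clipping (hypothetical stand-in for a clipping rule)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]

if __name__ == "__main__":
    logits, noisy_label = [2.0, 0.5, -1.0], 1
    print("SCE loss:", symmetric_cross_entropy(logits, noisy_label))
    print("clipped CE gradient:",
          clip_by_norm(cross_entropy_grad(logits, noisy_label), max_norm=0.5))
```

The reverse term is bounded by `-log_zero`, so a confidently misclassified (possibly mislabeled) sample cannot grow the loss without limit. Clipping likewise caps the size of the update any single sample can contribute.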
Data Availability
The code is available at https://github.com/QiangqiangXia/GOOD.
Acknowledgment
This work is partially supported by JSPS KAKENHI Grant Number 22K12079.
Ethics declarations
Competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Xia, Q., Hu, C., Lee, F. et al. Gradient optimization for object detection in learning with noisy labels. Appl Intell 54, 4248–4259 (2024). https://doi.org/10.1007/s10489-024-05357-6
DOI: https://doi.org/10.1007/s10489-024-05357-6