ABSTRACT
Deep Neural Networks (DNNs) achieve impressive performance when trained on large-scale data with high-quality annotations. In practice, however, collected annotations inevitably contain inaccurate labels due to time and budget constraints, which causes DNNs to generalize poorly on the test set. To combat noisy labels in deep learning, label correction methods simultaneously update model parameters and correct noisy labels, where the noisy labels are typically corrected based on model predictions, the topological structure of the data, or the aggregation of multiple models. However, such a self-training manner cannot guarantee that the direction of label correction is always reliable. In view of this, we propose a novel label correction method that supervises and guides the correction process. Specifically, the proposed label correction is an online two-fold process performed at each iteration purely through back-propagation. The first correction minimizes the empirical risk on noisy training data using a noise-tolerant loss function, and the second correction adopts a meta-learning paradigm to rectify the direction of the first correction so that the model performs optimally in the evaluation procedure. Extensive experiments demonstrate the effectiveness of the proposed method on synthetic datasets with varying noise types and noise rates. Notably, our method achieves a test accuracy of 77.37% on the real-world Clothing1M dataset.
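The two-fold correction loop described above can be sketched in code. The following is an illustrative first-order approximation in plain NumPy, not the paper's algorithm: the logistic-regression model, the probe-based meta step standing in for the meta-gradient, the trusted validation set, and all hyperparameters (`lr`, `label_lr`, the toy data) are assumptions made for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy binary task: clean labels come from a linear model, then ~30% are flipped.
n, d, k = 200, 5, 2
X = rng.normal(size=(n, d))
w_true = rng.normal(size=(d, k))
y_clean = softmax(X @ w_true).argmax(axis=1)
y_noisy = y_clean.copy()
flip = rng.random(n) < 0.3
y_noisy[flip] = 1 - y_noisy[flip]

# A small trusted set with clean labels plays the role of the evaluation data
# that guides the meta step (an assumption common in meta-learning setups).
X_val, y_val = X[:20], y_clean[:20]
Y_val = np.eye(k)[y_val]

Y_soft = np.eye(k)[y_noisy].astype(float)   # learnable soft labels
W = np.zeros((d, k))                        # linear classifier weights
lr, label_lr = 0.5, 0.5

for step in range(200):
    # First correction: update the model on the current soft labels,
    # i.e. minimize the empirical risk on the noisy training data.
    P = softmax(X @ W)
    W -= lr * X.T @ (P - Y_soft) / n

    # Second correction (meta step, first-order approximation): briefly
    # adapt a probe of the current model on the trusted set, then pull the
    # soft labels toward the probe's predictions, rectifying the direction
    # of the first correction toward lower evaluation loss.
    W_probe = W.copy()
    for _ in range(5):
        Pv = softmax(X_val @ W_probe)
        W_probe -= lr * X_val.T @ (Pv - Y_val) / len(y_val)
    Y_soft = (1 - label_lr) * Y_soft + label_lr * softmax(X @ W_probe)

acc = (softmax(X @ W).argmax(axis=1) == y_clean).mean()
```

The key design choice is that the soft labels are never overwritten outright: each iteration blends them with targets anchored to trusted data, so the correction direction is supervised rather than purely self-training.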
Correct Twice at Once: Learning to Correct Noisy Labels for Robust Deep Learning