Abstract
To distinguish the subtle differences among fine-grained categories, a large number of well-labeled images is typically required. However, manual annotation of fine-grained categories is extremely difficult, as it usually demands professional knowledge. To this end, we propose to directly leverage web images for fine-grained visual recognition. Nevertheless, training fine-grained classification models directly on web images tends to yield poor performance because of label noise. In this work, we propose an end-to-end method that combines uncertainly dynamic loss correction and global sample selection to address label noise. Specifically, we use a deep neural network to predict all samples, record the predictions from several recent epochs, and calculate the uncertainly dynamic loss, which is then used for global sample selection over the whole epoch. We conduct experiments on three commonly used noisy fine-grained datasets: Web-Aircraft, Web-Bird and Web-Cars. The average classification accuracies are 75.40%, 78.53% and 82.19%, improvements of 1.20%, 2.16% and 3.43%, respectively.
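The abstract does not give the formulas behind the uncertainly dynamic loss, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that uncertainty is measured as the normalized entropy of the labels predicted for a sample over the last few epochs, that the corrected loss is the cross-entropy plus a weighted uncertainty term, and that global selection keeps a fixed fraction of lowest-loss samples ranked over the whole epoch rather than per mini-batch. The names `prediction_uncertainty`, `corrected_loss`, `global_select`, `weight` and `keep_ratio` are hypothetical.

```python
import math
from collections import Counter, deque


def prediction_uncertainty(history, num_classes):
    """Normalized entropy (0 to 1) of the labels predicted in recent epochs.

    Assumption: disagreement between recent predictions signals a noisy sample.
    """
    if not history or num_classes < 2:
        return 0.0
    counts = Counter(history)
    total = len(history)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(num_classes)


def corrected_loss(ce_loss, history, num_classes, weight=1.0):
    """Hypothetical correction: cross-entropy plus a weighted uncertainty term."""
    return ce_loss + weight * prediction_uncertainty(history, num_classes)


def global_select(losses, keep_ratio):
    """Indices of the keep_ratio fraction of samples with the smallest loss,
    ranked once over the whole epoch instead of within each mini-batch."""
    n_keep = max(1, int(round(keep_ratio * len(losses))))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:n_keep])


# Toy epoch: four samples, three classes, predictions from the last three epochs.
histories = [
    deque([0, 0, 0], maxlen=3),  # stable prediction
    deque([0, 1, 2], maxlen=3),  # prediction changes every epoch
    deque([1, 1, 1], maxlen=3),  # stable prediction, but high cross-entropy
    deque([2, 0, 2], maxlen=3),  # partly unstable prediction
]
ce_losses = [0.2, 0.3, 0.9, 0.4]

losses = [corrected_loss(l, h, num_classes=3) for l, h in zip(ce_losses, histories)]
selected = global_select(losses, keep_ratio=0.5)
print(selected)  # [0, 2]
```

In this toy run, sample 1 has a low cross-entropy but is dropped because its predictions never agree across epochs, while sample 2 is kept despite a higher cross-entropy because its predictions are stable. This is the kind of behavior a history-based correction is meant to produce. The actual weighting and selection rule used in the paper may differ.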
Data Availability Statement
The datasets analyzed during the current study are available in the AAAI 2020 paper “Web-supervised network with softly update-drop training for fine-grained visual classification” [35]. These datasets were derived from the following resource: https://github.com/z337-408/WSNFGVC.
References
D. Arpit, S. Jastrzebski, N. Ballas, D. Krueger, E. Bengio, M. S. Kanwal, T. Maharaj, A. Fischer, A. Courville, Y. Bengio et al., A closer look at memorization in deep networks. in International Conference on Machine Learning (2017), pp. 233–242
Y. Cui, Y. Song, C. Sun, A. Howard, S. Belongie, Large scale fine-grained categorization and domain-specific transfer learning. in IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 4109–4118
Y. Cui, F. Zhou, Y. Lin, S. Belongie, Fine-grained categorization and dataset bootstrapping using deep metric learning with humans in the loop. in IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 1153–1162
J. Fu, H. Zheng, T. Mei, Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. in IEEE Conference on Computer Vision and Pattern Recognition (2017), pp. 4438–4446
B. Han, Q. Yao, X. Yu, G. Niu, M. Xu, W. Hu, I. Tsang, M. Sugiyama, Co-teaching: robust training of deep neural networks with extremely noisy labels. in The Conference on Neural Information Processing Systems (2018), pp. 8527–8537
L. Jiang, Z.-Y. Zhou, T. Leung, L.-J. Li, F.-F. Li, Mentornet: Learning data-driven curriculum for very deep neural networks on corrupted labels. in International Conference on Machine Learning (2017), pp. 1–20
J. Krause, M. Stark, J. Deng, L. Fei-Fei, 3D object representations for fine-grained categorization. in IEEE International Conference on Computer Vision (2013), pp. 554–561
T.-Y. Lin, A. RoyChowdhury, S. Maji, Bilinear CNN models for fine-grained visual recognition. in IEEE International Conference on Computer Vision (2015), pp. 1449–1457
S. Maji, E. Rahtu, J. Kannala, M. Blaschko, A. Vedaldi, Fine-grained visual classification of aircraft (2013). arXiv:1306.5151
E. Malach, S. Shalev-Shwartz, Decoupling “when to update” from “how to update”. in The Conference on Neural Information Processing Systems (2017), pp. 960–970
L. Niu, A. Veeraraghavan, A. Sabharwal, Webly supervised learning meets zero-shot learning: a hybrid approach for fine-grained classification. in IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 7171–7180
G. Patrini, A. Rozza, A. Krishna Menon, R. Nock, L. Qu, Making deep neural networks robust to label noise: A loss correction approach. in IEEE Conference on Computer Vision and Pattern Recognition (2017), pp. 1944–1952
S. Reed, H. Lee, D. Anguelov, C. Szegedy, D. Erhan, A. Rabinovich, Training deep neural networks on noisy labels with bootstrapping. in The International Conference on Learning Representations (2015), pp. 1–9
A. Shrivastava, A. Gupta, R. Girshick, Training region-based object detectors with online hard example mining. in IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 761–769
M. Simon, E. Rodner, Neural activation constellations: Unsupervised part model discovery with convolutional networks. in IEEE International Conference on Computer Vision (2015), pp. 1143–1151
H. Song, M. Kim, J.-G. Lee, Selfie: refurbishing unclean samples for robust deep learning. in International Conference on Machine Learning (2019), pp. 5907–5915
Z. Sun, X. Hua, Y. Yao, X. Wei, G. Hu, J. Zhang, CRSSC: salvage reusable samples from noisy data for robust learning. in ACM International Conference on Multimedia (2020), pp. 92–101
Z. Sun, Y. Yao, X. Wei, Y. Zhang, F. Shen, J. Wu, J. Zhang, H. Shen, Webly supervised fine-grained recognition: benchmark datasets and an approach. in IEEE International Conference on Computer Vision (2021), pp. 10602–10611
C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie, The caltech-ucsd birds-200-2011 dataset. Technical Report, (CNS-TR-2011-001, California Institute of Technology, 2011)
Y. Wang, W. Liu, X. Ma, J. Bailey, H. Zha, L. Song, S.-T. Xia, Iterative learning with open-set noisy labels. in IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 8688–8696
Y. Wang, V. I. Morariu, L. S. Davis, Learning a discriminative filter bank within a CNN for fine-grained recognition. in IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 4148–4157
X.-S. Wei, C.-W. Xie, J. Wu, C. Shen, Mask-CNN: localizing parts and selecting descriptors for fine-grained bird species categorization. Pattern Recogn. 76, 704–714 (2018)
T. Xiao, Y. Xu, K. Yang, J. Zhang, Y. Peng, Z. Zhang, The application of two-level attention models in deep convolutional neural network for fine-grained image classification. in IEEE Conference on Computer Vision and Pattern Recognition (2015), pp. 842–850
Z. Xu, S. Huang, Y. Zhang, D. Tao, Webly-supervised fine-grained visual categorization via deep domain adaptation. IEEE Trans. Pattern Anal. Mach. Intell. 40(5), 1100–1113 (2016)
Y. Yao, T. Chen, G. Xie, C. Zhang, F. Shen, Q. Wu, Z. Tang, J. Zhang, Non-salient region object mining for weakly supervised semantic segmentation. in IEEE Conference on Computer Vision and Pattern Recognition (2021), pp. 2623–2632
Y. Yao, X. Hua, G. Gao, Z. Sun, Z. Li, J. Zhang, Bridging the web data and fine-grained visual recognition via alleviating label noise and domain mismatch. in ACM International Conference on Multimedia (2020), pp. 1735–1744
Y. Yao, X. Hua, F. Shen, J. Zhang, Z. Tang, A domain robust approach for image dataset construction. in ACM International Conference on Multimedia (2016), pp. 212–216
Y. Yao, F. Shen, G. Xie, L. Liu, F. Zhu, J. Zhang, H. Shen, Exploiting web images for multi-output classification: from category to subcategories. IEEE Trans. Neural Netw. Learn. Syst. 31(7), 2348–2360 (2020)
Y. Yao, F. Shen, J. Zhang, L. Liu, Z. Tang, L. Shao, Extracting multiple visual senses for web learning. IEEE Trans. Multimedia 21(1), 184–196 (2019)
Y. Yao, F. Shen, J. Zhang, L. Liu, Z. Tang, L. Shao, Extracting privileged information for enhancing classifier learning. IEEE Trans. Image Process. 28(1), 436–450 (2019)
Y. Yao, Z. Sun, C. Zhang, F. Shen, Q. Wu, J. Zhang, Z. Tang, Jo-SRC: a contrastive approach for combating noisy labels. in IEEE Conference on Computer Vision and Pattern Recognition (2021), pp. 5192–5201
Y. Yao, J. Zhang, F. Shen, X. Hua, J. Xu, Z. Tang, Exploiting web images for dataset construction: A domain robust approach. IEEE Trans. Multimedia 19(8), 1771–1784 (2017)
Y. Yao, J. Zhang, F. Shen, L. Liu, F. Zhu, D. Zhang, H. Shen, Towards automatic construction of diverse, high-quality image datasets. IEEE Trans. Knowl. Data Eng. 32(6), 1199–1211 (2020)
C. Zhang, S. Bengio, M. Hardt, B. Recht, O. Vinyals, Understanding deep learning requires rethinking generalization. Commun. ACM 64(3), 107–115 (2016)
C. Zhang, Y. Yao, H. Liu, G.-S. Xie, X. Shu, T. Zhou, Z. Zhang, F. Shen, Z. Tang, Web-supervised network with softly update-drop training for fine-grained visual classification. in AAAI Conference on Artificial Intelligence (2020), pp. 12781–12788
C. Zhang, Y. Yao, X. Xu, J. Shao, J. Song, Z. Li, Z. Tang, Extracting useful knowledge from noisy web images via data purification for fine-grained recognition. in ACM International Conference on Multimedia (2021)
N. Zhang, J. Donahue, R. Girshick, T. Darrell, Part-based R-CNNs for fine-grained category detection. in European Conference on Computer Vision (2014), pp. 834–849
Y. Zhang, X.-S. Wei, J. Wu, J. Cai, J. Lu, V.-A. Nguyen, M.N. Do, Weakly supervised fine-grained categorization with part-based image representation. IEEE Trans. Image Process. 25(4), 1713–1725 (2016)
X. Zhu, X. Wu, Class noise vs. attribute noise: a quantitative study. Artif. Intell. Rev. 22(3), 177–210 (2004)
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Guo, J., Ding, M., Wang, Q. et al. An Uncertainly Dynamic Loss Correction and Global Sample Selection Method for Webly Supervised Fine-Grained Visual Classification. Circuits Syst Signal Process 41, 3265–3281 (2022). https://doi.org/10.1007/s00034-021-01928-x
DOI: https://doi.org/10.1007/s00034-021-01928-x