Label-Smooth Learning for Fine-Grained Visual Categorization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12046)

Abstract

Fine-Grained Visual Categorization (FGVC) is challenging because of the high similarity among categories and the large within-category variance. Existing work tackles this problem by designing self-localization modules in an end-to-end DCNN to learn semantic part features. However, the efficiency of this strategy decreases significantly as the number of categories grows, because more parts are needed to offset the impact of the additional categories. In this paper, we propose a label-smooth learning method that improves a model's applicability to large numbers of categories by maximizing its prediction diversity. Based on the similarity among fine-grained categories, a KL divergence between the uniform and prediction distributions is established to reduce the model's confidence on the ground-truth category while raising its confidence on similar categories. Minimizing this divergence exploits information from similar categories during model learning, thus diminishing the effects of a growing number of categories. Experiments on five benchmark datasets, four mid-scale (CUB-200-2011, Stanford Dogs, Stanford Cars, and FGVC-Aircraft) and one large-scale (NABirds), show a clear advantage for the proposed label-smooth learning and demonstrate comparable or state-of-the-art performance. Code is available at https://github.com/Cedric-Mo/LS-for-FGVC.
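
The KL term described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' released code: the function names, the smoothing weight `epsilon`, and the way the KL term is combined with the usual negative log-likelihood are all assumptions made for illustration, following the common label-smoothing formulation.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw class scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_uniform_to_prediction(log_probs):
    """KL(u || p) for uniform u over K classes, given log p_k.

    KL(u || p) = sum_k (1/K) * (log(1/K) - log p_k)
               = -log K - (1/K) * sum_k log p_k
    """
    k = len(log_probs)
    return -math.log(k) - sum(log_probs) / k

def label_smooth_loss(logits, target, epsilon=0.1):
    """Hypothetical combined loss (an assumption, not the paper's exact form):
    (1 - epsilon) * negative log-likelihood of the ground-truth class
    + epsilon * KL(uniform || prediction)."""
    log_probs = log_softmax(logits)
    nll = -log_probs[target]
    kl = kl_uniform_to_prediction(log_probs)
    return (1.0 - epsilon) * nll + epsilon * kl

# A very confident prediction is far from uniform, so its KL term is large;
# a perfectly uniform prediction has a KL term of zero.
confident = kl_uniform_to_prediction(log_softmax([10.0, 0.0, 0.0, 0.0]))
uniform = kl_uniform_to_prediction(log_softmax([0.0, 0.0, 0.0, 0.0]))
```

Under these assumptions, the KL term penalizes predictions that place nearly all probability mass on one class, which is the sense in which minimizing it lowers confidence on the ground-truth category and spreads mass to the others.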

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61702197 and in part by the Natural Science Foundation of Guangdong Province under Grant 2017A030310261.

Author information

Corresponding author

Correspondence to Wei Luo.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Mo, X., Wei, T., Zhang, H., Huang, Q., Luo, W. (2020). Label-Smooth Learning for Fine-Grained Visual Categorization. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol 12046. Springer, Cham. https://doi.org/10.1007/978-3-030-41404-7_2

  • DOI: https://doi.org/10.1007/978-3-030-41404-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41403-0

  • Online ISBN: 978-3-030-41404-7

  • eBook Packages: Computer Science, Computer Science (R0)
