DOI: 10.1145/3512353.3512359
research-article

Deep Learning for Fine-Grained Image Recognition: A Comprehensive Study

Published: 14 March 2022

ABSTRACT

In computer vision, image recognition is an active and rapidly developing research area. Its principal task is to automatically predict which pre-defined categories an image belongs to. Traditional image recognition aims to classify images into diverse, highly distinguishable categories. Fine-Grained Image Recognition (FGIR), in contrast, aims to recognize the differences among images in subordinate classes, e.g., species of birds, types of cars, or species of flowers, which in certain respects are equivalent to “species” in taxonomy. As a result, FGIR models are required to extract features at a finer granularity. Conventional methods apply special feature encodings to explore discriminative attributes, while recent FGIR methods have advanced greatly with the help of deep learning, which has itself developed remarkably. In this paper, we provide a new organization of the current leading FGIR models according to how they advance FGIR. We classify them into five main categories, compare their performance on three popular datasets, and analyze the results. To encourage further development of this topic, we point out some open problems worth further exploration.

  • Published in

    APIT '22: Proceedings of the 2022 4th Asia Pacific Information Technology Conference
    January 2022
    239 pages
    ISBN:9781450395571
    DOI:10.1145/3512353

    Copyright © 2022 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited