PFNet: a novel part fusion network for fine-grained visual categorization

Liang, Jingyun; Guo, Jinlin; Guo, Yanming; Lao, Songyang

doi:10.1007/s11042-018-7047-5

Published: 15 December 2018

Volume 79, pages 33397–33416, (2020)

Multimedia Tools and Applications

Abstract

Existing methods in fine-grained visual categorization focus on integrating multiple deep CNN models or complicated attention mechanisms, resulting in increasingly cumbersome networks. In addition, most methods rely on part annotations, which require expensive expert guidance. In this paper, we propose a novel part fusion network (PFNet) that effectively fuses discriminative image parts for classification without extra annotation. More specifically, PFNet consists of a part feature extractor, which extracts part features, and a two-level classification network, which uses part-level and image-level features simultaneously. Part-level features are trained with the weighted part loss, which embeds a weighting mechanism based on the characteristics of different parts. Easy parts, hard parts and background parts are proposed and used in different ways for classification. Moreover, part-level features are fused to form an image-level feature so as to introduce global supervision and generate final predictions. Experiments on three popular benchmark datasets show that our framework achieves competitive performance compared with the state of the art. Code is available at https://github.com/MichaelLiang12/PFNet-FGVC.
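
The two-level idea in the abstract (a weighted loss over part-level predictions plus a fused image-level feature) can be illustrated with a small sketch. The abstract does not give the exact form of the weighted part loss or of the fusion step, so everything below is an assumption for illustration only: a focal-style weight `(1 - p) ** gamma` stands in for the weighting mechanism, an element-wise mean stands in for the fusion, and background parts are not modelled. The function names `weighted_part_loss` and `fuse_parts` and the parameter `gamma` are hypothetical and are not taken from the paper or its repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_part_loss(part_logits, label, gamma=2.0):
    """Mean cross-entropy over parts with a focal-style weight.

    ASSUMPTION: the weight (1 - p_true) ** gamma is a stand-in for the
    paper's weighting mechanism; it down-weights parts that are already
    classified confidently ("easy") relative to "hard" ones.
    """
    losses = []
    for logits in part_logits:
        p_true = softmax(logits)[label]
        weight = (1.0 - p_true) ** gamma
        losses.append(-weight * math.log(p_true))
    return sum(losses) / len(losses)


def fuse_parts(part_features):
    """Fuse part-level features into one image-level feature.

    ASSUMPTION: element-wise mean; the actual fusion may differ.
    """
    n = len(part_features)
    return [sum(column) / n for column in zip(*part_features)]


# Two parts, two classes, true class 0: one easy part and one hard part.
part_logits = [[5.0, 0.0], [0.0, 1.0]]
loss = weighted_part_loss(part_logits, label=0)

# Two 3-dimensional part features fused into one image-level feature.
image_feature = fuse_parts([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
```

With `gamma=0.0` the weight is 1 for every part and the function reduces to plain mean cross-entropy, which makes the effect of the weighting easy to isolate. In a full model the fused feature would feed a second, image-level classifier whose loss is added to the part-level loss.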

Acknowledgments

This work was supported by the National Natural Science Foundation of China (grant 61571453).

Author information

Authors and Affiliations

College of System Engineering, National University of Defense Technology, Changsha, China
Jingyun Liang, Jinlin Guo, Yanming Guo & Songyang Lao

Corresponding author

Correspondence to Jinlin Guo.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liang, J., Guo, J., Guo, Y. et al. PFNet: a novel part fusion network for fine-grained visual categorization. Multimed Tools Appl 79, 33397–33416 (2020). https://doi.org/10.1007/s11042-018-7047-5

Received: 14 June 2018
Revised: 09 November 2018
Accepted: 07 December 2018
Published: 15 December 2018
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11042-018-7047-5
