Exploring part-aware segmentation for fine-grained visual categorization

Multimedia Tools and Applications

Abstract

It is challenging to segment fine-grained objects because of appearance variations and background clutter. Most existing segmentation methods cannot separate small parts of an instance from its background with sufficient accuracy. However, such small parts usually carry important semantic information, which is crucial for fine-grained categorization. Observing that fine-grained objects share almost the same configuration of parts, we present a novel part-aware segmentation method that explicitly detects semantic parts and preserves them during segmentation. We first design a hybrid part localization method that generates accurate part proposals at moderate computational cost. We then iteratively update the segmentation outputs and the part proposals, which yields better foreground segmentation results. Experiments demonstrate the superiority of the proposed method over state-of-the-art segmentation approaches for fine-grained categorization.
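The alternation between segmentation and part proposals can be illustrated with a toy sketch. The abstract does not specify the actual algorithm, so everything below is an assumption made for illustration: the `Part` box type, the per-pixel foreground scores, the two thresholds, and the overlap test are hypothetical stand-ins, not the authors' hybrid part localization or their segmentation model. The sketch only shows the general idea of letting accepted part proposals relax the foreground threshold so that weak but semantically important pixels survive, and then re-checking the proposals against the updated mask.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """Hypothetical part proposal: a named box with exclusive bottom/right edges."""
    name: str
    top: int
    left: int
    bottom: int
    right: int


def segment(scores, parts, fg_thresh=0.5, part_thresh=0.2):
    """Threshold foreground scores, using a lower threshold inside kept part boxes."""
    height, width = len(scores), len(scores[0])
    in_part = [[False] * width for _ in range(height)]
    for p in parts:
        for r in range(p.top, p.bottom):
            for c in range(p.left, p.right):
                in_part[r][c] = True
    return [
        [scores[r][c] >= (part_thresh if in_part[r][c] else fg_thresh)
         for c in range(width)]
        for r in range(height)
    ]


def overlap(mask, p):
    """Fraction of the part box that is currently labeled foreground."""
    area = (p.bottom - p.top) * (p.right - p.left)
    inside = sum(mask[r][c]
                 for r in range(p.top, p.bottom)
                 for c in range(p.left, p.right))
    return inside / area


def part_aware_segmentation(scores, proposals, min_overlap=0.3, max_iters=10):
    """Alternate between filtering proposals and re-segmenting until the mask is stable."""
    kept = []
    mask = segment(scores, kept)  # initial mask without any part information
    for _ in range(max_iters):
        # Keep only proposals that are supported by the current foreground.
        kept = [p for p in proposals if overlap(mask, p) >= min_overlap]
        new_mask = segment(scores, kept)
        if new_mask == mask:
            break
        mask = new_mask
    return mask, kept


# Toy input (made up): a confident "body" region, a weak "beak" attached to it,
# and equally weak background clutter that is not attached to the body.
scores = [
    [0.9, 0.9, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.9, 0.9, 0.3, 0.3],
    [0.9, 0.9, 0.9, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.3, 0.3],
]
proposals = [Part("beak", 1, 3, 2, 6), Part("clutter", 3, 4, 4, 6)]

mask, kept = part_aware_segmentation(scores, proposals)
print([p.name for p in kept])
for row in mask:
    print("".join("#" if v else "." for v in row))
```

In this toy input the weak beak pixels are recovered because their proposal overlaps the confident body region, while the equally weak clutter pixels stay in the background because their proposal has no foreground support. A real system would replace the thresholding with a proper segmentation model and the overlap test with a learned part detector.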

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Projects No. 61472103, No. 61772158, and No. 61702136.

Author information

Corresponding author

Correspondence to Hongxun Yao.

About this article

Cite this article

Pang, C., Yao, H., Sun, X. et al. Exploring part-aware segmentation for fine-grained visual categorization. Multimed Tools Appl 77, 30291–30310 (2018). https://doi.org/10.1007/s11042-018-5957-x

  • DOI: https://doi.org/10.1007/s11042-018-5957-x
