Improved hierarchical conditional random field model for object segmentation

  • Original Paper
  • Machine Vision and Applications

Abstract

Although the hierarchical conditional random field (HCRF) model has been successfully applied to multi-class object segmentation, it has three limitations. First, its pairwise potential tends to over-smooth the boundaries of regions that are similar to their neighbors. Second, the higher-order potential associated with multiple unsupervised segments is prone to giving incorrect guidance to inference when the image is under-segmented. Third, the co-occurrence potential, which measures inter-object relationships, cannot completely suppress some uncommon combinations of object classes, because it is optimized jointly with the other potentials in the cost function. To alleviate these problems, we propose an improved HCRF model that efficiently combines information from the global, middle and local scales for object segmentation. At the global scale, a scene categorization technique is adopted to recognize the scene category of an image. Scene consistency then constrains object segmentation at the local and middle scales to labels that are feasible in that scene. Furthermore, an improved pairwise potential and a segment-reliable consistency potential are developed at the local and middle scales, respectively. These potentials rectify the over-smoothing by propagating the believed labeling from the unary potential, and they perform coherent inference by ensuring reliable segment consistency. Experimental results on the MSRC-21 dataset demonstrate that the improved HCRF model achieves better subjective results, as well as state-of-the-art objective results, with a global accuracy of 87.98% and an average accuracy of 81.43%.
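The two objective measures quoted above can be illustrated with a minimal sketch. It assumes the definitions commonly used for MSRC-21 style evaluation: global accuracy is the fraction of labeled pixels classified correctly, and average accuracy is the mean of the per-class accuracies. The function names, the void label value of 255, and the choice to skip unlabeled pixels and classes absent from the ground truth are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes, void_label=255):
    """Count pixels by (ground-truth class, predicted class).

    Rows index ground-truth classes, columns index predicted classes.
    Pixels whose ground truth equals void_label are ignored (assumption).
    """
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    keep = gt != void_label
    # Encode each (gt, pred) pair as a single index, then histogram it.
    idx = gt[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


def global_accuracy(cm):
    """Fraction of all labeled pixels that were classified correctly."""
    return np.trace(cm) / cm.sum()


def average_accuracy(cm):
    """Mean of per-class accuracies (correct pixels / ground-truth pixels)."""
    per_class_total = cm.sum(axis=1)
    present = per_class_total > 0  # skip classes absent from the ground truth
    per_class = np.diag(cm)[present] / per_class_total[present]
    return per_class.mean()


# Toy example with two classes and one void pixel (hypothetical data).
gt = [0, 0, 0, 1, 255]
pred = [0, 0, 1, 1, 0]
cm = confusion_matrix(gt, pred, num_classes=2)
print(cm)
print(global_accuracy(cm), average_accuracy(cm))
```

In the toy example three of the four labeled pixels are correct, so the global accuracy is 0.75, while the per-class accuracies are 2/3 and 1, giving an average accuracy of about 0.83. The gap between the two measures shows why both are reported: global accuracy is dominated by frequent classes, whereas average accuracy weights every class equally.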

Acknowledgments

This research was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China, under Project HKU718912E.

Author information

Corresponding author

Correspondence to Li-Li Wang.

Cite this article

Wang, LL., Yung, N.H.C. Improved hierarchical conditional random field model for object segmentation. Machine Vision and Applications 26, 1027–1043 (2015). https://doi.org/10.1007/s00138-015-0708-8

  • DOI: https://doi.org/10.1007/s00138-015-0708-8
