Abstract
Although the hierarchical conditional random field (HCRF) model has been successfully applied to multi-class object segmentation, there is still room for improvement. Firstly, the pairwise potential in the HCRF model tends to over-smooth the boundaries of regions that are similar to their neighbors. Secondly, the higher-order potential associated with multiple unsupervised segments is prone to giving incorrect guidance to inference when the image is under-segmented. Finally, the co-occurrence potential, as a measure of inter-object relationships, cannot completely suppress some uncommon combinations of object classes because of the joint optimization of a multi-potential cost function. To alleviate these problems, in this paper we propose an improved HCRF model that efficiently combines information from the global, middle and local scales for object segmentation. At the global scale, a scene categorization technique is adopted to recognize the scene category of an image. Scene consistency then forces the object segmentation at the local and middle scales to align with the labels that are feasible in that specific scene. Furthermore, an improved pairwise potential and a segment-reliable consistency potential are developed at the local and middle scales, respectively. These potentials rectify the over-smoothing issue by propagating the believed labeling from the unary potential, and they perform coherent inference by ensuring reliable segment consistency. Experimental results on the MSRC-21 dataset demonstrate that the improved HCRF model achieves better subjective results, as well as state-of-the-art objective results, with a global accuracy of 87.98% and an average accuracy of 81.43%.
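The two objective measures quoted above are commonly computed from a pixel-level confusion matrix: global accuracy as the fraction of all labeled pixels that are classified correctly, and average accuracy as the mean of the per-class accuracies. The following is a minimal sketch under that assumption; the function name and the toy three-class matrix are hypothetical illustrations and are not taken from the paper or from the MSRC-21 results.

```python
def segmentation_accuracies(confusion):
    """Return (global_accuracy, average_accuracy) for a square confusion matrix.

    confusion[i][j] is assumed to hold the number of pixels whose true class
    is i and whose predicted class is j (an assumed convention).
    """
    total_pixels = sum(sum(row) for row in confusion)
    correct = [confusion[i][i] for i in range(len(confusion))]

    # Global accuracy: correctly labeled pixels over all labeled pixels.
    global_acc = sum(correct) / total_pixels

    # Average accuracy: mean of per-class accuracies, skipping classes
    # that have no ground-truth pixels.
    per_class = [
        correct[i] / sum(row)
        for i, row in enumerate(confusion)
        if sum(row) > 0
    ]
    average_acc = sum(per_class) / len(per_class)
    return global_acc, average_acc


# Toy example with made-up counts for three classes.
toy_confusion = [
    [90, 10, 0],
    [5, 40, 5],
    [0, 10, 10],
]
global_acc, average_acc = segmentation_accuracies(toy_confusion)
print(f"global accuracy:  {global_acc:.4f}")
print(f"average accuracy: {average_acc:.4f}")
```

In this toy case the large first class dominates the global figure, while the poorly segmented third class pulls the average figure down, which is why the two numbers are usually reported together.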
Acknowledgments
This research was supported by a grant from the Research Grant Council of the Hong Kong Special Administrative Region, China, under Project HKU718912E.
Cite this article
Wang, LL., Yung, N.H.C. Improved hierarchical conditional random field model for object segmentation. Machine Vision and Applications 26, 1027–1043 (2015). https://doi.org/10.1007/s00138-015-0708-8
DOI: https://doi.org/10.1007/s00138-015-0708-8