Abstract
Using context to aid object detection is becoming more popular among computer vision researchers. Our physical world is structured, and human perception does not neglect contextual information. In this paper, we propose a framework that simultaneously detects and segments objects of different classes using context. Context is incorporated into our model as long-range pairwise interactions between pixels, which impose a prior on the labeling. Long-range interactions have seldom been used in the computer vision literature, and we show how to use them to encode contextual information in our segmentation. Our framework formulates the multi-class image segmentation task as an energy minimization problem and, under certain conditions, finds a globally optimal solution using a single graph cut. We experimentally evaluate the performance of our model on two publicly available datasets, MSRC-1 and CorelB. Our results show the applicability of our model to the multi-class segmentation problem.
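The idea of minimizing a labeling energy with a single graph cut can be illustrated on a toy problem. The sketch below is not the model proposed in the paper: it assumes a binary labeling of six pixels on a line, non-negative integer unary costs, and Potts-style pairwise penalties that include one long-range pair. All names (`energy`, `min_cut_labels`, `unary`, `pairs`) and all cost values are hypothetical. Because every pairwise weight is non-negative, the energy is submodular, which is one example of the kind of condition under which a minimum cut yields a global optimum.

```python
from collections import deque
from itertools import product


def energy(labels, unary, pairs):
    """E(x) = sum_i unary[i][x_i] + sum over (i, j, w) of w * [x_i != x_j]."""
    e = sum(unary[i][x] for i, x in enumerate(labels))
    e += sum(w for i, j, w in pairs if labels[i] != labels[j])
    return e


def min_cut_labels(unary, pairs):
    """Minimize the energy above with one s-t min cut (Edmonds-Karp max flow).

    Assumes non-negative unary costs and non-negative pair weights.
    """
    n = len(unary)
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # make sure the reverse residual edge exists

    for i, (cost0, cost1) in enumerate(unary):
        add_edge(s, i, cost1)  # cut when pixel i lands on the sink side (label 1)
        add_edge(i, t, cost0)  # cut when pixel i stays on the source side (label 0)
    for i, j, w in pairs:
        add_edge(i, j, w)  # paid once when i and j take different labels
        add_edge(j, i, w)

    def reachable_from_source():
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_source()
        if t not in parent:
            break  # no augmenting path left: the flow is maximal
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        flow += bottleneck

    # Pixels still reachable from the source get label 0, the rest label 1.
    labels = [0 if i in parent else 1 for i in range(n)]
    return labels, flow


# Hypothetical costs: (cost of label 0, cost of label 1) per pixel.
unary = [(0, 4), (1, 3), (2, 2), (3, 1), (4, 0), (0, 4)]
# Neighbouring pairs plus one long-range pair linking pixels 0 and 5.
pairs = [(i, i + 1, 1) for i in range(5)] + [(0, 5, 3)]

labels, flow = min_cut_labels(unary, pairs)
brute_force_min = min(energy(x, unary, pairs) for x in product((0, 1), repeat=len(unary)))
print(labels, flow, brute_force_min)
```

On a problem this small, the exhaustive search over all 64 labelings serves as a check that the cut value equals the minimum energy. For real images a dedicated max-flow implementation would be used instead of this pure-Python loop.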
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant No. 51104157), the Ph.D. Programs Foundation of Ministry of Education of China (Grant No. 20110095120008), the China Postdoctoral Science Foundation (Grant No. 20100481181), the Fundamental Research Funds for the Central Universities (Grant No. 2011QNA30), and the Jiangsu Overseas Research & Training Program for University Prominent Young & Middle-aged Teachers and Presidents.
Cite this article
Wei, C., Jiang, X., Tang, Z. et al. Context-based global multi-class semantic image segmentation by wireless multimedia sensor networks. Artif Intell Rev 43, 579–591 (2015). https://doi.org/10.1007/s10462-013-9394-y
DOI: https://doi.org/10.1007/s10462-013-9394-y