Object Recognition by Sequential Figure-Ground Ranking

Carreira, João; Li, Fuxin; Sminchisescu, Cristian

doi:10.1007/s11263-011-0507-2

Published: 19 November 2011

Volume 98, pages 243–262, (2012)

International Journal of Computer Vision


Abstract

We present an approach to visual object-class segmentation and recognition based on a pipeline that combines multiple figure-ground hypotheses with large object spatial support, generated by bottom-up computational processes that do not exploit knowledge of specific categories, and sequential categorization based on continuous estimates of the spatial overlap between the image segment hypotheses and each putative class. We differ from existing approaches not only in our seemingly unreasonable assumption that good object-level segments can be obtained in a feed-forward fashion, but also in formulating recognition as a regression problem. Instead of focusing on a one-vs.-all winning margin that may not preserve the ordering of segment qualities inside the non-maximum (non-winning) set, our learning method produces a globally consistent ranking with close ties to segment quality, and hence to the extent to which entire object or part hypotheses are likely to spatially overlap the ground truth. We demonstrate results beyond the current state of the art for image classification, object detection and semantic segmentation on a number of challenging datasets, including Caltech-101 and ETHZ-Shape as well as PASCAL VOC 2009 and 2010.
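To make the ranking idea concrete, the following minimal sketch computes a continuous overlap score between candidate segments and a ground-truth mask, then orders the candidates by that score. It rests on several assumptions that the abstract does not state: overlap is measured here as intersection-over-union, masks are represented as sets of pixel coordinates, and the helper names (`overlap`, `rank_segments`, `box`) and toy masks are hypothetical. In a trained system the scores would come from a learned regressor rather than from the ground truth, which is only available during training.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def overlap(segment: Set[Pixel], ground_truth: Set[Pixel]) -> float:
    """Intersection-over-union of two pixel sets (assumed overlap measure)."""
    union = segment | ground_truth
    if not union:
        return 0.0
    return len(segment & ground_truth) / len(union)


def rank_segments(scores: Dict[str, float]) -> List[str]:
    """Return segment names ordered from highest to lowest score."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)


def box(r0: int, c0: int, r1: int, c1: int) -> Set[Pixel]:
    """Hypothetical helper: rectangular mask covering rows r0..r1-1, cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Toy data: one ground-truth object and three candidate figure-ground segments.
ground_truth = box(0, 0, 4, 4)
segments = {
    "tight": box(0, 0, 4, 4),       # coincides with the object
    "shifted": box(0, 2, 4, 6),     # covers half of the object
    "background": box(6, 6, 8, 8),  # misses the object entirely
}

# Continuous targets that a regressor could be trained to predict.
targets = {name: overlap(mask, ground_truth) for name, mask in segments.items()}
ranking = rank_segments(targets)
```

Unlike a one-vs.-all margin, which only needs the best segment to win, a continuous target of this kind also orders the non-winning segments: here the half-covering segment is placed above the one that misses the object.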



Author information

Authors and Affiliations

INS, University of Bonn, Wegelerstrasse 6, Bonn, 53115, Germany
João Carreira, Fuxin Li & Cristian Sminchisescu


Corresponding author

Correspondence to Cristian Sminchisescu.

Additional information

The first two authors contributed equally.


About this article

Cite this article

Carreira, J., Li, F. & Sminchisescu, C. Object Recognition by Sequential Figure-Ground Ranking. Int J Comput Vis 98, 243–262 (2012). https://doi.org/10.1007/s11263-011-0507-2


Received: 19 February 2011
Accepted: 08 November 2011
Published: 19 November 2011
Issue Date: July 2012
DOI: https://doi.org/10.1007/s11263-011-0507-2
