Geometric Image Parsing in Man-Made Environments

Tretyak, Elena; Barinova, Olga; Kohli, Pushmeet; Lempitsky, Victor

doi:10.1007/s11263-011-0488-1

Published: 08 September 2011

Volume 97, pages 305–321 (2012)

International Journal of Computer Vision

Elena Tretyak¹,
Olga Barinova¹,
Pushmeet Kohli² &
Victor Lempitsky³

Abstract

We present a new optimization-based parsing framework for the geometric analysis of a single image of a man-made environment. The framework models the scene as a composition of geometric primitives spanning several layers, from low level (edges) through mid level (line segments, lines, and vanishing points) to high level (the zenith and the horizon). Inference in this model jointly estimates (a) the grouping of edges into line segments, (b) the grouping of line segments into straight lines, (c) the grouping of lines into parallel families, and (d) the positions of the horizon and the zenith in the image. Because of this unified treatment, uncertainty information propagates between the layers of the model. This contrasts with most previous approaches to the same problem, which either ignore the middle levels (line segments or lines) altogether or use a bottom-up, step-by-step pipeline.

For the evaluation, we consider the publicly available York Urban dataset of “Manhattan” scenes and also introduce a new, harder dataset of 103 urban outdoor images containing many non-Manhattan scenes. The comparative evaluation on the horizon estimation task demonstrates that our method attains higher accuracy and robustness than current state-of-the-art approaches.
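
The relationship between lines, parallel families, and the horizon mentioned in the abstract can be illustrated with standard projective geometry. The sketch below is not the joint optimization proposed in the article; it is a minimal bottom-up illustration, assuming that lines have already been grouped into two families of horizontal parallel lines. All function names and the segment coordinates are hypothetical and chosen only for the example. In homogeneous coordinates, the line through two points and the intersection of two lines are both given by a cross product, and the horizon is the line through the vanishing points of horizontal directions.

```python
def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through points p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def intersect(l1, l2):
    """Intersection of two homogeneous lines as (x, y), or None if parallel in the image."""
    x, y, w = cross(l1, l2)
    if abs(w) < 1e-12:
        return None  # the lines meet at infinity
    return (x / w, y / w)


def vanishing_point(segments):
    """Vanishing point of a family, taken here from its first two segments only."""
    l1 = line_through(*segments[0])
    l2 = line_through(*segments[1])
    return intersect(l1, l2)


# Hypothetical line segments, each given by two endpoints (x, y).
# Both families are assumed to come from horizontal directions in the scene.
family_a = [((0.0, 0.0), (2.0, 1.0)), ((0.0, 4.0), (2.0, 3.0))]
family_b = [((0.0, 0.0), (-3.0, 1.0)), ((0.0, 4.0), (-3.0, 3.0))]

vp_a = vanishing_point(family_a)
vp_b = vanishing_point(family_b)

# The horizon is the image line passing through both horizontal vanishing points.
horizon = line_through(vp_a, vp_b)

print("vanishing points:", vp_a, vp_b)
print("horizon (a, b, c):", horizon)
```

With noisy segments, a family would normally contain more than two lines, and the vanishing point would be fitted to all of them (for example by least squares) instead of being read off a single pair. A pipeline of this kind also commits to each grouping before the next step runs, which is the step-by-step behaviour the abstract contrasts with joint inference.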

Author information

Authors and Affiliations

Lomonosov Moscow State University, Moscow, Russia
Elena Tretyak & Olga Barinova
Microsoft Research Cambridge, Cambridge, UK
Pushmeet Kohli
University of Oxford, Oxford, UK
Victor Lempitsky

Corresponding author

Correspondence to Elena Tretyak.

Additional information

Elena Tretyak, Olga Barinova, and Victor Lempitsky are supported by Microsoft Research programs in Russia. Victor Lempitsky is also supported by the EU under ERC grant VisRec no. 228180.

About this article

Cite this article

Tretyak, E., Barinova, O., Kohli, P. et al. Geometric Image Parsing in Man-Made Environments. Int J Comput Vis 97, 305–321 (2012). https://doi.org/10.1007/s11263-011-0488-1

Received: 03 March 2011
Accepted: 29 July 2011
Published: 08 September 2011
Issue Date: May 2012
DOI: https://doi.org/10.1007/s11263-011-0488-1
