Stereo Matching—State-of-the-Art and Research Challenges

Bleyer, Michael; Breiteneder, Christian

doi:10.1007/978-1-4471-5520-1_6

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

Stereo matching denotes the problem of finding dense correspondences in pairs of images in order to perform 3D reconstruction. In this chapter, we provide a review of stereo methods with a focus on recent developments and our own work. We start with a discussion of local methods and introduce our algorithms: geodesic stereo, cost filtering and PatchMatch stereo. Although local algorithms have recently become very popular, they cannot handle large untextured regions, where a global smoothness prior is required. In the discussion of such global methods, we briefly describe standard optimization techniques. However, the real problem lies not in the optimization, but in finding an energy function that represents a good model of the stereo problem. In this context, we investigate the data and smoothness terms of standard energies to identify the best-suited implementation of each. We then describe our own work on finding a good model. This includes our combined stereo and matting approach, Surface Stereo, Object Stereo, as well as a new method that incorporates physics-based reasoning in stereo matching.

Notes

1. The distance between the cameras is thereby referred to as the stereo baseline.
2. Rectification can be accomplished using standard methods (e.g., [29]), once the stereo camera system has been calibrated.
3. This is the reason why local algorithms are also sometimes referred to as area-based methods in the literature.
4. It is interesting to note that the majority of recent submissions to the Middlebury benchmark [67] are adaptive support weight techniques.
5. Hirschmüller stresses a relationship to local methods. He calls the method semi-global matching, as he uses the aggregation step of local methods, but aggregates (global) DP path costs instead of pixel dissimilarities of spatially close pixels.
6. It is likely that the approach would run in real time if implemented on a modern GPU.
7. Note that although we speak about disparity, these move-making algorithms can be applied to arbitrary computer vision labeling problems outside stereo vision.
8. This is why these algorithms are referred to as graph cuts.
9. Due to the NP-hardness of the problem, QPBO can only guarantee to find a part of the globally optimal solution and will, in general, leave a subset of pixels unlabeled. The autarky property of QPBO guarantees that, by assigning unlabeled pixels to the disparities of proposal 1, the energy of the fusion result will be lower than or equal to that of proposal 1. Alternative ways of handling these unlabeled pixels are QPBO-I and QPBO-P [64].
10. In practice, it is sufficient to compute the gradient in x-direction, as vertical edges contain more disparity information than horizontal ones.
11. Hirschmüller [32] uses a hierarchical approach to speed up this iterative procedure.
12. In our study, the extension of AD and SD to color works by computing AD and SD for each color channel separately. We then sum up the differences over the 3 color channels.
13. The stereo problem is symmetrical.
14. In general, such a construction works if the smoothness term is convex [40].
15. Note that in [81], the smoothness term is also truncated to make it discontinuity preserving.
16. There are also alternative methods (e.g., cooperative cuts [41]).
17. It is interesting to note some similarity to the cost filter approaches discussed in Sect. 6.2.
18. This term forms a higher-order clique in the optimization step. In general, optimization of such higher-order cliques is intractable. We take advantage of the fact that these cliques are sparse, i.e., only one state generates 0 costs and all other states generate constant costs. There exist graph-cut-based algorithms that can "efficiently" optimize such sparse higher-order cliques [44, 65] and we make use of them.
19. By optimal we mean the solution that leads to the largest energy decrease according to our energy function among all possible fusion moves.
20. "Small number" means that we apply the MDL term above to penalize the number of objects.
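
Note 12 describes how AD and SD are extended to color: each measure is computed per channel and the results are summed over the three channels. A minimal Python sketch of that rule, assuming pixels are given as RGB tuples; the function names `color_ad` and `color_sd` are illustrative and do not come from the chapter:

```python
def color_ad(p, q):
    """Absolute difference (AD), summed over the color channels of two pixels."""
    return sum(abs(a - b) for a, b in zip(p, q))


def color_sd(p, q):
    """Squared difference (SD), summed over the color channels of two pixels."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


# Hypothetical RGB values for a pixel in the left image and its
# candidate match in the right image.
left_pixel = (120, 64, 200)
right_pixel = (118, 70, 190)

ad = color_ad(left_pixel, right_pixel)  # |2| + |6| + |10| = 18
sd = color_sd(left_pixel, right_pixel)  # 4 + 36 + 100 = 140
```

Because SD squares each channel difference, a single strongly deviating channel dominates the cost more than it does under AD.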

References

[1] Agarwal S, Snavely N, Simon I, Seitz S, Szeliski R (2009) Building Rome in a day. In: ICCV
[2] Baker S, Szeliski R, Anandan P (1998) A layered approach to stereo reconstruction. In: CVPR, pp 434–441
[3] Baker S, Scharstein D, Lewis J, Roth S, Black M, Szeliski R (2011) A database and evaluation methodology for optical flow. Int J Comput Vis 92(1):1–31
[4] Barnes C, Shechtman E, Finkelstein A, Goldman D (2009) PatchMatch: a randomized correspondence algorithm for structural image editing. ACM Trans Graph 28(3):24 (SIGGRAPH Proc)
[5] Birchfield S, Tomasi C (1998) A pixel dissimilarity measure that is insensitive to image sampling. IEEE Trans Pattern Anal Mach Intell 20(4):401–406
[6] Birchfield S, Tomasi C (1999) Depth discontinuities by pixel-to-pixel stereo. Int J Comput Vis 35(3):269–293
[7] Bleyer M, Chambon S (2010) Does color really help in dense stereo matching? In: International symposium on 3D data processing, visualization and transmission (3DPVT)
[8] Bleyer M, Gelautz M (2005) A layered stereo matching algorithm using image segmentation and global visibility constraints. ISPRS J Photogramm Remote Sens 59(3):128–150
[9] Bleyer M, Gelautz M (2007) Graph-cut-based stereo matching using image segmentation with symmetrical treatment of occlusions. Signal Process Image Commun 22(2):127–143
[10] Bleyer M, Gelautz M (2008) Simple but effective tree structures for dynamic programming-based stereo matching. In: VISAPP, vol 2, pp 415–422
[11] Bleyer M, Chambon S, Poppe U, Gelautz M (2008) Evaluation of different methods for using colour information in global stereo matching. In: International archives of the photogrammetry, remote sensing and spatial information sciences, vol XXXVII, pp 415–422
[12] Bleyer M, Gelautz M, Rother C, Rhemann C (2009) A stereo approach that handles the matting problem via image warping. In: CVPR, pp 501–508
[13] Bleyer M, Rother C, Kohli P (2010) Surface stereo with soft segmentation. In: CVPR
[14] Bleyer M, Rhemann C, Rother C (2011) PatchMatch stereo—stereo matching with slanted support windows. In: BMVC
[15] Bleyer M, Rother C, Kohli P, Scharstein D, Sinha S (2011) Object stereo—joint stereo matching and object segmentation. In: CVPR
[16] Bleyer M, Rhemann C, Rother C (2012) Extracting 3D scene-consistent object proposals and depth from stereo images. In: ECCV
[17] Bobick A, Intille S (1999) Large occlusion stereo. Int J Comput Vis 33(3):181–200
[18] Boykov Y, Veksler O, Zabih R (2001) Fast approximate energy minimization via graph cuts. IEEE Trans Pattern Anal Mach Intell 23(11):1222–1239
[19] Carreira J, Li F, Sminchisescu C (2012) Object recognition by sequential figure-ground ranking. Int J Comput Vis 98(3):243–262
[20] Deng Y, Yang Q, Lin X, Tang X (2005) A symmetric patch-based correspondence model for occlusion handling. In: ICCV, pp 542–567
[21] Faugeras O, Hotz B, Mathieu H, Viéville T, Zhang Z, Fua P, Théron E, Moll L, Berry G, Vuillemin J, Bertin P, Proy C (1996) Real time correlation based stereo: algorithm implementations and applications. Technical report, RR-2013, INRIA
[22] Felzenszwalb P, Huttenlocher D (2006) Efficient belief propagation for early vision. Int J Comput Vis 70(1):41–54
[23] Fua PV (1991) Combining stereo and monocular information to compute dense depth maps that preserve depth discontinuities. In: International joint conference on artificial intelligence, pp 1292–1298
[24] Fusiello A, Roberto V, Trucco E (1997) Efficient stereo with multiple windowing. In: CVPR, pp 858–863
[25] Gallup D, Frahm J, Mordohai P, Yang Q, Pollefeys M (2007) Real-time plane-sweeping stereo with multiple sweeping directions. In: CVPR
[26] Gehrig S, Franke U (2007) Improving sub-pixel accuracy for long range stereo. In: ICCV VRML workshop
[27] Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? The KITTI vision benchmark suite. In: CVPR
[28] Gupta A, Efros A, Hebert M (2010) Blocks world revisited: image understanding using qualitative geometry and mechanics. In: ECCV
[29] Hartley R, Zisserman A (2003) Multiple view geometry in computer vision
[30] Hasinoff S, Kang SB, Szeliski R (2006) Boundary matting for view synthesis. Comput Vis Image Underst 103(1):22–32
[31] He K, Sun J, Tang X (2010) Guided image filtering. In: ECCV
[32] Hirschmüller H (2005) Accurate and efficient stereo processing by semi-global matching and mutual information. In: CVPR, vol 2, pp 807–814
[33] Hirschmüller H, Scharstein D (2009) Evaluation of stereo matching costs on images with radiometric differences. IEEE Trans Pattern Anal Mach Intell 31:1582–1599
[34] Hirschmüller H, Innocent P, Garibaldi J (2002) Real-time correlation-based stereo vision with reduced border errors. Int J Comput Vis 47:229–246
[35] Hong L, Chen G (2004) Segment-based stereo matching using graph cuts. In: CVPR, vol 1, pp 74–81
[36] Hosni A, Bleyer M, Gelautz M, Rhemann C (2009) Local stereo matching using geodesic support weights. In: ICIP
[37] Hosni A, Bleyer M, Gelautz M (2010) Near real-time stereo with adaptive support weight approaches. In: 3DPVT
[38] Hosni A, Rhemann C, Bleyer M, Gelautz M (2011) Temporally consistent disparity and optical flow via efficient spatio-temporal filtering. In: PSIVT
[39] Hosni A, Rhemann C, Bleyer M, Rother C, Gelautz M (2013) Fast cost-volume filtering for visual correspondence and beyond. IEEE Trans Pattern Anal Mach Intell 35(2):504–511
[40] Ishikawa H (2000) Global optimization using embedded graphs. PhD thesis, New York University
[41] Jegelka S, Bilmes J (2011) Submodularity beyond submodular energies: coupling edges in graph cuts. In: CVPR
[42] Ju M, Kang H (2009) Constant time stereo matching. In: MVIP, pp 13–17
[43] Klaus A, Sormann M, Karner K (2006) Segment-based stereo matching using belief propagation and a self-adapting dissimilarity measure. In: ICPR, pp 15–18
[44] Kohli P, Kumar M, Torr P (2007) P3 & beyond: solving energies with higher order cliques. In: CVPR
[45] Kolmogorov V, Rother C (2007) Minimizing non-submodular functions with graph cuts—a review. IEEE Trans Pattern Anal Mach Intell 29(7):1274–1279
[46] Kolmogorov V, Zabih R (2002) Computing visual correspondence with occlusions using graph cuts. In: ICCV, vol 2, pp 508–515
[47] Kolmogorov V, Zabih R (2002) Multi-camera scene reconstruction via graph cuts. In: ECCV
[48] Kolmogorov V, Zabih R (2004) What energy functions can be minimized via graph cuts? IEEE Trans Pattern Anal Mach Intell 26(2):147–159
[49] Krähenbühl P, Koltun V (2011) Efficient inference in fully connected CRFs with Gaussian edge potentials. In: Advances in neural information processing systems
[50] Lempitsky V, Rother C, Blake A (2007) Logcut—efficient graph cut optimization for Markov random fields. In: ICCV
[51] Li G, Zucker SW (2006) Surface geometric constraints for stereo in belief propagation. In: CVPR, pp 2355–2362
[52] Lin M, Tomasi C (2003) Surfaces with occlusions from layered stereo. In: CVPR, pp 710–717
[53] Mei X, Sun X, Zhou M, Jiao S, Wang H, Zhang X (2011) On building an accurate stereo matching system on graphics hardware. In: GPUCV, pp 467–474
[54] Meltzer T, Yanover C, Weiss Y (2005) Globally optimal solutions for energy minimization in stereo vision using reweighted belief propagation. In: ICCV, pp 428–435
[55] Mühlmann K, Maier D, Hesser J, Männer R (2002) Calculating dense disparity maps from color stereo images, an efficient implementation. Int J Comput Vis 47(1):79–88
[56] Newcombe R, Izadi S, Hilliges O, Molyneaux D, Kim D, Davison A, Kohli P, Shotton J, Hodges S, Fitzgibbon A (2011) KinectFusion: real-time dense surface mapping and tracking. In: ISMAR
[57] Ogale AS, Aloimonos Y (2004) Stereo correspondence with slanted surfaces: critical implications of horizontal slant. In: CVPR, pp 568–573
[58] Ohta Y, Kanade T (1985) Stereo by intra- and inter-scanline search. IEEE Trans Pattern Anal Mach Intell 7(2):139–154
[59] Paris S, Durand F (2009) A fast approximation of the bilateral filter using a signal processing approach. Int J Comput Vis 81:24–52
[60] Porikli F (2005) Integral histogram: a fast way to extract histograms in Cartesian spaces. In: CVPR, vol 1, pp 829–836
[61] Rhemann C, Hosni A, Bleyer M, Rother C, Gelautz M (2011) Fast cost-volume filtering for visual correspondence and beyond. In: CVPR
[62] Richardt C, Orr D, Davies I, Criminisi A, Dodgson NA (2010) Real-time spatiotemporal stereo matching using the dual-cross-bilateral grid. In: ECCV, vol 6313, pp 510–523
[63] Rother C, Kolmogorov V, Blake A (2004) GrabCut: interactive foreground extraction using iterated graph cuts. ACM Trans Graph 23:309–314
[64] Rother C, Kolmogorov V, Lempitsky V, Szummer M (2007) Optimizing binary MRFs via extended roof duality. In: CVPR
[65] Rother C, Kohli P, Feng W, Jia J (2009) Minimizing sparse higher order energy functions of discrete variables. In: CVPR, pp 1382–1389
[66] Roy S, Cox I (1998) A maximum-flow formulation of the n-camera stereo correspondence problem. In: ICCV, pp 492–499
[67] Scharstein D, Szeliski R (2002) A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int J Comput Vis 47(1/2/3):7–42. http://vision.middlebury.edu/stereo/
[68] Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R, Kipman A, Blake A (2011) Real-time human pose recognition in parts from a single depth image. In: CVPR
[69] Smith B, Zhang L, Jin H (2009) Stereo matching with nonparametric smoothness priors in feature space. In: CVPR, pp 485–492
[70] Snavely N, Seitz S, Szeliski R (2006) Photo tourism: exploring photo collections in 3D. ACM Trans Graph 25:835–846 (SIGGRAPH Proc)
[71] Sun J, Zheng NN, Shum HY (2003) Stereo matching using belief propagation. IEEE Trans Pattern Anal Mach Intell 25(7):787–800
[72] Sun J, Li Y, Kang SB, Shum HY (2005) Symmetric stereo matching for occlusion handling. In: CVPR, vol 25, pp 399–406
[73] Szeliski R, Golland P (1998) Stereo matching with transparency and matting. In: ICCV, pp 517–525
[74] Szeliski R, Zabih R, Scharstein D, Veksler O, Kolmogorov V, Agarwala A, Tappen M, Rother C (2006) A comparative study of energy minimization methods for Markov random fields. In: ECCV, vol 2, pp 19–26
[75] Taguchi Y, Wilburn B, Zitnick L (2008) Stereo reconstruction with mixed pixels using adaptive over-segmentation. In: CVPR, pp 1–8
[76] Tao H, Sawhney H, Kumar R (2001) A global matching framework for stereo computation. In: ICCV, pp 532–539
[77] Tappen MF, Freeman WT (2003) Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters. In: ICCV, vol 2, pp 900–906
[78] Veksler O (2002) Stereo correspondence with compact windows via minimum ratio cycle. IEEE Trans Pattern Anal Mach Intell 24(12):1654–1660
[79] Veksler O (2005) Stereo correspondence by dynamic programming on a tree. In: CVPR, pp 384–390
[80] Wainwright M, Jaakkola T, Willsky A (2003) Tree reweighted belief propagation algorithms and approximate ML estimation by pseudo-moment matching. In: AISTATS
[81] Woodford O, Torr P, Reid I, Fitzgibbon A (2008) Global stereo reconstruction under second order smoothness priors. In: CVPR
[82] Xiong W, Jia J (2007) Stereo matching on objects with fractional boundary. In: CVPR, pp 1–8
[83] Yang Q, Wang L, Yang R, Wang S, Liao M, Nister D (2006) Real-time global stereo matching using hierarchical belief propagation. In: BMVC
[84] Yang Q, Yang R, Davis J, Nister D (2007) Spatial-depth super resolution for range images. In: CVPR
[85] Yang Q, Wang L, Yang R, Stewenius H, Nister D (2009) Stereo matching with color-weighted correlation, hierarchical belief propagation and occlusion handling. IEEE Trans Pattern Anal Mach Intell 31(3):492–504
[86] Yoon KJ, Kweon IS (2005) Locally adaptive support-weight approach for visual correspondence search. In: CVPR
[87] Zhang Y, Gong M, Yang Y (2008) Local stereo matching with 3D adaptive cost aggregation for slanted surface modeling and sub-pixel accuracy. In: ICPR
[88] Zhang K, Lu J, Lafruit G (2009) Cross-based local stereo matching using orthogonal integral images. IEEE Trans Circuits Syst Video Technol 19:1073–1079
[89] Zhang K, Lafruit G, Lauwereins R, Gool L (2010) Joint integral histograms and its application in stereo matching. In: ICIP, pp 817–820
[90] Zitnick L, Kang S, Uyttendaele M, Winder S, Szeliski R (2004) High-quality video view interpolation using a layered representation. ACM Trans Graph 23(3):600–608

Acknowledgements

This work was supported in part by the Vienna Science and Technology Fund (WWTF) under project ICT08-019.

Author information

Authors and Affiliations

Vienna University of Technology, Vienna, Austria
Michael Bleyer & Christian Breiteneder
Microsoft Redmond, Redmond, USA
Michael Bleyer

Corresponding author

Correspondence to Michael Bleyer.

Editor information

Editors and Affiliations

Dipartimento di Matematica e Informatica, Università di Catania, Catania, Italy
Giovanni Maria Farinella
Dipartimento di Matematica e Informatica, Università di Catania, Catania, Italy
Sebastiano Battiato
Department of Engineering, University of Cambridge, Cambridge, United Kingdom
Roberto Cipolla

About this chapter

Cite this chapter

Bleyer, M., Breiteneder, C. (2013). Stereo Matching—State-of-the-Art and Research Challenges. In: Farinella, G., Battiato, S., Cipolla, R. (eds) Advanced Topics in Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-5520-1_6

DOI: https://doi.org/10.1007/978-1-4471-5520-1_6
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5519-5
Online ISBN: 978-1-4471-5520-1
eBook Packages: Computer Science, Computer Science (R0)
