Abstract
Stereo matching denotes the problem of finding dense correspondences in pairs of images in order to perform 3D reconstruction. In this chapter, we provide a review of stereo methods with a focus on recent developments and our own work. We start with a discussion of local methods and introduce our algorithms: geodesic stereo, cost filtering and PatchMatch stereo. Although local algorithms have recently become very popular, they are not capable of handling large untextured regions where a global smoothness prior is required. In the discussion of such global methods, we briefly describe standard optimization techniques. However, the real problem is not in the optimization, but in finding an energy function that represents a good model of the stereo problem. In this context, we investigate the data and smoothness terms of standard energies to find the best-suited implementation of each. We then describe our own work on finding a good model. This includes our combined stereo and matting approach, Surface Stereo, Object Stereo, as well as a new method that incorporates physics-based reasoning in stereo matching.
Notes
1. The distance between the cameras is referred to as the stereo baseline.
2. Rectification can be accomplished using standard methods (e.g., [29]) once the stereo camera system has been calibrated.
3. This is why local algorithms are sometimes also referred to as area-based methods in the literature.
4. It is interesting to note that the majority of recent submissions to the Middlebury benchmark [67] are adaptive support weight techniques.
5. Hirschmüller stresses a relationship to local methods. He calls the method semi-global matching because it uses the aggregation step of local methods, but aggregates (global) DP path costs instead of the pixel dissimilarities of spatially close pixels.
6. It is likely that the approach would run in real time if implemented on a modern GPU.
7. Note that although we speak about disparity, these move-making algorithms can be applied to arbitrary computer vision labeling problems outside stereo vision.
8. This is why these algorithms are referred to as graph cuts.
9. Due to the NP-hardness of the problem, QPBO can only guarantee to find a part of the globally optimal solution and will, in general, leave a subset of pixels unlabeled. The autarky property of QPBO guarantees that, by assigning unlabeled pixels to the disparities of proposal 1, the energy of the fusion result will be lower than or equal to that of proposal 1. Alternative ways of handling these unlabeled pixels are QPBO-I and QPBO-P [64].
10. In practice, it is sufficient to compute the gradient in x-direction, as vertical edges contain more disparity information than horizontal ones.
11. Hirschmüller [32] uses a hierarchical approach to speed up this iterative procedure.
12. In our study, the extension of AD and SD to color works by computing AD and SD for each color channel separately. We then sum up the differences over the 3 color channels.
13. The stereo problem is symmetrical.
14. In general, such a construction works if the smoothness term is convex [40].
15. Note that in [81], the smoothness term is also truncated to make it discontinuity preserving.
16. There are also alternative methods (e.g., cooperative cuts [41]).
17. It is interesting to note some similarity to the cost filter approaches discussed in Sect. 6.2.
18. This term forms a higher-order clique in the optimization step. In general, optimization of such higher-order cliques is intractable. We take advantage of the fact that these cliques are sparse, i.e., only one state generates 0 costs and all other states generate constant costs. There exist graph-cut-based algorithms that can "efficiently" optimize such sparse higher-order cliques [44, 65], and we make use of them.
19. By optimal we mean the solution that leads to the largest energy decrease according to our energy function among all possible fusion moves.
20. Small number means that we apply the MDL term above to penalize the number of objects.
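As an illustration of notes 10 and 12, the per-pixel dissimilarities could be sketched as follows. This is a minimal sketch and not the implementation used in the study: the function names `color_ad`, `color_sd` and `gradient_x` are hypothetical, pixels are assumed to be tuples of channel values, and the central difference with clamped borders is only one possible way to compute the x-gradient.

```python
def color_ad(p, q):
    """Absolute difference (AD) of two color pixels.

    AD is computed per channel and then summed over the channels,
    as described in note 12.
    """
    return sum(abs(a - b) for a, b in zip(p, q))


def color_sd(p, q):
    """Squared difference (SD) of two color pixels, summed over the channels."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def gradient_x(row):
    """Gradient in x-direction for one image row (note 10).

    Assumption: a central difference is used, and neighbor indices
    are clamped to the row at the borders.
    """
    n = len(row)
    grad = []
    for x in range(n):
        left = row[max(x - 1, 0)]
        right = row[min(x + 1, n - 1)]
        grad.append((right - left) / 2.0)
    return grad
```

For example, `color_ad((10, 20, 30), (13, 18, 30))` sums the channel differences 3, 2 and 0. A gradient-based dissimilarity would then compare the values of `gradient_x` at corresponding pixels of the left and right rows.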
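The handling of unlabeled pixels described in note 9 can be sketched as below. The sketch assumes that a QPBO solver has already been run and has returned one binary label per pixel; no solver is called here. The function name `fuse_proposals` and the label convention (0 selects proposal 1, 1 selects proposal 2, `None` marks an unlabeled pixel) are assumptions made for illustration.

```python
def fuse_proposals(proposal1, proposal2, labels):
    """Fuse two disparity proposals given per-pixel binary labels.

    Assumed convention: label 0 keeps the disparity of proposal 1,
    label 1 takes the disparity of proposal 2, and None marks a
    pixel that QPBO left unlabeled. Unlabeled pixels fall back to
    proposal 1, the choice that note 9 states cannot increase the
    energy relative to proposal 1.
    """
    if not (len(proposal1) == len(proposal2) == len(labels)):
        raise ValueError("proposals and labels must have the same length")
    fused = []
    for d1, d2, label in zip(proposal1, proposal2, labels):
        fused.append(d2 if label == 1 else d1)
    return fused
```

The disparity maps are flattened to plain lists here to keep the sketch short; the same per-pixel rule applies to 2D maps.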
References
Agarwal S, Snavely N, Simon I, Seitz S, Szeliski R (2009) Building Rome in a day. In: ICCV
Baker S, Szeliski R, Anandan P (1998) A layered approach to stereo reconstruction. In: CVPR, pp 434–441
Baker S, Scharstein D, Lewis J, Roth S, Black M, Szeliski R (2011) A database and evaluation methodology for optical flow. Int J Comput Vis 92(1):1–31
Barnes C, Shechtman E, Finkelstein A, Goldman D (2009) PatchMatch: a randomized correspondence algorithm for structural image editing. ACM Trans Graph 28(3):24 (SIGGRAPH Proc)
Birchfield S, Tomasi C (1998) A pixel dissimilarity measure that is insensitive to image sampling. IEEE Trans Pattern Anal Mach Intell 20(4):401–406
Birchfield S, Tomasi C (1999) Depth discontinuities by pixel-to-pixel stereo. Int J Comput Vis 35(3):269–293
Bleyer M, Chambon S (2010) Does color really help in dense stereo matching? In: International symposium on 3D data processing, visualization and transmission (3DPVT)
Bleyer M, Gelautz M (2005) A layered stereo matching algorithm using image segmentation and global visibility constraints. ISPRS J Photogramm Remote Sens 59(3):128–150
Bleyer M, Gelautz M (2007) Graph-cut-based stereo matching using image segmentation with symmetrical treatment of occlusions. Signal Process Image Commun 22(2):127–143
Bleyer M, Gelautz M (2008) Simple but effective tree structures for dynamic programming-based stereo matching. In: VISAPP, vol 2, pp 415–422
Bleyer M, Chambon S, Poppe U, Gelautz M (2008) Evaluation of different methods for using colour information in global stereo matching. In: International archives of the photogrammetry, remote sensing and spatial information sciences, vol XXXVII, pp 415–422
Bleyer M, Gelautz M, Rother C, Rhemann C (2009) A stereo approach that handles the matting problem via image warping. In: CVPR, pp 501–508
Bleyer M, Rother C, Kohli P (2010) Surface stereo with soft segmentation. In: CVPR
Bleyer M, Rhemann C, Rother C (2011) PatchMatch stereo—stereo matching with slanted support windows. In: BMVC
Bleyer M, Rother C, Kohli P, Scharstein D, Sinha S (2011) Object stereo—joint stereo matching and object segmentation. In: CVPR
Bleyer M, Rhemann C, Rother C (2012) Extracting 3D scene-consistent object proposals and depth from stereo images. In: ECCV
Bobick A, Intille S (1999) Large occlusion stereo. Int J Comput Vis 33(3):181–200
Boykov Y, Veksler O, Zabih R (2001) Fast approximate energy minimization via graph cuts. IEEE Trans Pattern Anal Mach Intell 23(11):1222–1239
Carreira J, Li F, Sminchisescu C (2012) Object recognition by sequential figure-ground ranking. Int J Comput Vis 98(3):243–262
Deng Y, Yang Q, Lin X, Tang X (2005) A symmetric patch-based correspondence model for occlusion handling. In: ICCV, pp 542–567
Faugeras O, Hotz B, Mathieu H, Viéville T, Zhang Z, Fua P, Théron E, Moll L, Berry G, Vuillemin J, Bertin P, Proy C (1996) Real time correlation based stereo: algorithm implementations and applications. Technical report, RR-2013, INRIA
Felzenszwalb P, Huttenlocher D (2006) Efficient belief propagation for early vision. Int J Comput Vis 70(1):41–54
Fua PV (1991) Combining stereo and monocular information to compute dense depth maps that preserve depth discontinuities. In: International joint conference on artificial intelligence, pp 1292–1298
Fusiello A, Roberto V, Trucco E (1997) Efficient stereo with multiple windowing. In: CVPR, pp 858–863
Gallup D, Frahm J, Mordohai P, Yang Q, Pollefeys M (2007) Real-time plane-sweeping stereo with multiple sweeping directions. In: CVPR
Gehrig S, Franke U (2007) Improving sub-pixel accuracy for long range stereo. In: ICCV VRML workshop
Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? The KITTI vision benchmark suite. In: CVPR
Gupta A, Efros A, Hebert M (2010) Blocks world revisited: image understanding using qualitative geometry and mechanics. In: ECCV
Hartley R, Zisserman A (2003) Multiple view geometry in computer vision
Hasinoff S, Kang SB, Szeliski R (2006) Boundary matting for view synthesis. Comput Vis Image Underst 103(1):22–32
He K, Sun J, Tang X (2010) Guided image filtering. In: ECCV
Hirschmüller H (2005) Accurate and efficient stereo processing by semi-global matching and mutual information. In: CVPR, vol 2, pp 807–814
Hirschmüller H, Scharstein D (2009) Evaluation of stereo matching costs on images with radiometric differences. IEEE Trans Pattern Anal Mach Intell 31:1582–1599
Hirschmüller H, Innocent P, Garibaldi J (2002) Real-time correlation-based stereo vision with reduced border errors. Int J Comput Vis 47:229–246
Hong L, Chen G (2004) Segment-based stereo matching using graph cuts. In: CVPR, vol 1, pp 74–81
Hosni A, Bleyer M, Gelautz M, Rhemann C (2009) Local stereo matching using geodesic support weights. In: ICIP
Hosni A, Bleyer M, Gelautz M (2010) Near real-time stereo with adaptive support weight approaches. In: 3DPVT
Hosni A, Rhemann C, Bleyer M, Gelautz M (2011) Temporally consistent disparity and optical flow via efficient spatio-temporal filtering. In: PSIVT
Hosni A, Rhemann C, Bleyer M, Rother C, Gelautz M (2013) Fast cost-volume filtering for visual correspondence and beyond. IEEE Trans Pattern Anal Mach Intell 35(2):504–511
Ishikawa H (2000) Global optimization using embedded graphs. PhD thesis, New York University
Jegelka S, Bilmes J (2011) Submodularity beyond submodular energies: coupling edges in graph cuts. In: CVPR
Ju M, Kang H (2009) Constant time stereo matching. In: MVIP, pp 13–17
Klaus A, Sormann M, Karner K (2006) Segment-based stereo matching using belief propagation and a self-adapting dissimilarity measure. In: ICPR, pp 15–18
Kohli P, Kumar M, Torr P (2007) P3 & beyond: solving energies with higher order cliques. In: CVPR
Kolmogorov V, Rother C (2007) Minimizing non-submodular functions with graph cuts—a review. IEEE Trans Pattern Anal Mach Intell 29(7):1274–1279
Kolmogorov V, Zabih R (2002) Computing visual correspondence with occlusions using graph cuts. In: ICCV, vol 2, pp 508–515
Kolmogorov V, Zabih R (2002) Multi-camera scene reconstruction via graph cuts. In: ECCV
Kolmogorov V, Zabih R (2004) What energy functions can be minimized via graph cuts? IEEE Trans Pattern Anal Mach Intell 26(2):147–159
Krähenbühl P, Koltun V (2011) Efficient inference in fully connected CRFs with Gaussian edge potentials. In: Advances in neural information processing systems
Lempitsky V, Rother C, Blake A (2007) Logcut—efficient graph cut optimization for Markov random fields. In: ICCV
Li G, Zucker SW (2006) Surface geometric constraints for stereo in belief propagation. In: CVPR, pp 2355–2362
Lin M, Tomasi C (2003) Surfaces with occlusions from layered stereo. In: CVPR, pp 710–717
Mei X, Sun X, Zhou M, Jiao S, Wang H, Zhang X (2011) On building an accurate stereo matching system on graphics hardware. In: GPUCV, pp 467–474
Meltzer T, Yanover TC, Weiss Y (2005) Globally optimal solutions for energy minimization in stereo vision using reweighted belief propagation. In: ICCV, pp 428–435
Mühlmann K, Maier D, Hesser J, Männer R (2002) Calculating dense disparity maps from color stereo images, an efficient implementation. Int J Comput Vis 47(1):79–88
Newcombe R, Izadi S, Hilliges O, Molyneaux D, Kim D, Davison A, Kohli P, Shotton J, Hodges S, Fitzgibbon A (2011) KinectFusion: real-time dense surface mapping and tracking. In: ISMAR
Ogale AS, Aloimonos Y (2004) Stereo correspondence with slanted surfaces: critical implications of horizontal slant. In: CVPR, pp 568–573
Ohta Y, Kanade T (1985) Stereo by intra- and inter-scanline search. IEEE Trans Pattern Anal Mach Intell 7(2):139–154
Paris S, Durand F (2009) A fast approximation of the bilateral filter using a signal processing approach. Int J Comput Vis 81:24–52
Porikli F (2005) Integral histogram: a fast way to extract histograms in Cartesian spaces. In: CVPR, vol 1, pp 829–836
Rhemann C, Hosni A, Bleyer M, Rother C, Gelautz M (2011) Fast cost-volume filtering for visual correspondence and beyond. In: CVPR
Richardt C, Orr D, Davies I, Criminisi A, Dodgson NA (2010) Real-time spatiotemporal stereo matching using the dual-cross-bilateral grid. In: ECCV, vol 6313, pp 510–523
Rother C, Kolmogorov V, Blake A (2004) GrabCut: interactive foreground extraction using iterated graph cuts. ACM Trans Graph 23:309–314
Rother C, Kolmogorov V, Lempitsky V, Szummer M (2007) Optimizing binary MRFs via extended roof duality. In: CVPR
Rother C, Kohli P, Feng W, Jia J (2009) Minimizing sparse higher order energy functions of discrete variables. In: CVPR, pp 1382–1389
Roy S, Cox I (1998) A maximum-flow formulation of the n-camera stereo correspondence problem. In: ICCV, pp 492–499
Scharstein D, Szeliski R (2002) A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int J Comput Vis 47(1/2/3):7–42. http://vision.middlebury.edu/stereo/
Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R, Kipman A, Blake A (2011) Real-time human pose recognition in parts from a single depth image. In: CVPR
Smith B, Zhang L, Jin H (2009) Stereo matching with nonparametric smoothness priors in feature space. In: CVPR, pp 485–492
Snavely N, Seitz S, Szeliski R (2006) Photo tourism: exploring photo collections in 3D. ACM Trans Graph 25:835–846 (SIGGRAPH Proc)
Sun J, Zheng NN, Shum HY (2003) Stereo matching using belief propagation. IEEE Trans Pattern Anal Mach Intell 25(7):787–800
Sun J, Li Y, Kang SB, Shum HY (2005) Symmetric stereo matching for occlusion handling. In: CVPR, vol 25, pp 399–406
Szeliski R, Golland P (1998) Stereo matching with transparency and matting. In: ICCV, pp 517–525
Szeliski R, Zabih R, Scharstein D, Veksler O, Kolmogorov V, Agarwala A, Tappen M, Rother C (2006) A comparative study of energy minimization methods for Markov random fields. In: ECCV, vol 2, pp 19–26
Taguchi Y, Wilburn B, Zitnick L (2008) Stereo reconstruction with mixed pixels using adaptive over-segmentation. In: CVPR, pp 1–8
Tao H, Sawhney H, Kumar R (2001) A global matching framework for stereo computation. In: ICCV, pp 532–539
Tappen MF, Freeman WT (2003) Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters. In: ICCV, vol 2, pp 900–906
Veksler O (2002) Stereo correspondence with compact windows via minimum ratio cycle. IEEE Trans Pattern Anal Mach Intell 24(12):1654–1660
Veksler O (2005) Stereo correspondence by dynamic programming on a tree. In: CVPR, pp 384–390
Wainwright M, Jaakkola T, Willsky A (2003) Tree reweighted belief propagation algorithms and approximate ML estimation by pseudo-moment matching. In: AISTATS
Woodford O, Torr P, Reid I, Fitzgibbon A (2008) Global stereo reconstruction under second order smoothness priors. In: CVPR
Xiong W, Jia J (2007) Stereo matching on objects with fractional boundary. In: CVPR, pp 1–8
Yang Q, Wang L, Yang R, Wang S, Liao M, Nister D (2006) Real-time global stereo matching using hierarchical belief propagation. In: BMVC
Yang Q, Yang R, Davis J, Nister D (2007) Spatial-depth super resolution for range images. In: CVPR
Yang Q, Wang L, Yang R, Stewenius H, Nister D (2009) Stereo matching with color-weighted correlation, hierarchical belief propagation and occlusion handling. IEEE Trans Pattern Anal Mach Intell 31(3):492–504
Yoon KJ, Kweon IS (2005) Locally adaptive support-weight approach for visual correspondence search. In: CVPR
Zhang Y, Gong M, Yang Y (2008) Local stereo matching with 3D adaptive cost aggregation for slanted surface modeling and sub-pixel accuracy. In: ICPR
Zhang K, Lu J, Lafruit G (2009) Cross-based local stereo matching using orthogonal integral images. IEEE Trans Circuits Syst Video Technol 19:1073–1079
Zhang K, Lafruit G, Lauwereins R, Gool L (2010) Joint integral histograms and its application in stereo matching. In: ICIP, pp 817–820
Zitnick L, Kang S, Uyttendaele M, Winder S, Szeliski R (2004) High-quality video view interpolation using a layered representation. ACM Trans Graph 23(3):600–608
Acknowledgements
This work was supported in part by the Vienna Science and Technology Fund (WWTF) under project ICT08-019.
Copyright information
© 2013 Springer-Verlag London
About this chapter
Cite this chapter
Bleyer, M., Breiteneder, C. (2013). Stereo Matching—State-of-the-Art and Research Challenges. In: Farinella, G., Battiato, S., Cipolla, R. (eds) Advanced Topics in Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-5520-1_6
DOI: https://doi.org/10.1007/978-1-4471-5520-1_6
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5519-5
Online ISBN: 978-1-4471-5520-1
eBook Packages: Computer Science, Computer Science (R0)