Robust non-local stereo matching for outdoor driving images using segment-simple-tree

https://doi.org/10.1016/j.image.2015.09.012

Highlights

  • We propose a non-local stereo matching algorithm for driver assistance systems.

  • The disparity characteristics of outdoor driving images are demonstrated by analysis.

  • We introduce a segment-simple-tree that is better suited to outdoor driving images than a minimum spanning tree.

  • Qualitative and quantitative comparisons with existing methods are provided over three datasets.

Abstract

Non-local cost aggregation has recently emerged as a promising approach for stereo matching and has attracted much interest over the past few years. Most non-local algorithms reportedly outperform state-of-the-art local algorithms on high-quality indoor images. However, the accuracy of non-local algorithms is still limited for outdoor images. Computing disparity maps for outdoor images in driver assistance systems is one of the most actively researched topics in the field of stereo vision. In this paper, we present a robust non-local stereo matching algorithm that improves the performance of non-local approaches for outdoor driving images. The proposed algorithm is inspired by the non-local cost aggregation method based on a minimum spanning tree, and it improves the estimation accuracy by introducing an alternative segment-simple-tree that is better suited to outdoor driving images than the minimum spanning tree. Experimental results show that the proposed algorithm is superior to the existing local and non-local algorithms, and is comparable to semi-global matching.

Introduction

Binocular stereo matching is one of the most important problems in computer vision because it gives computers a depth perception ability similar to that of humans. The goal is to estimate a disparity map from two rectified images of the same scene taken from left and right viewpoints. Binocular stereo matching has been extensively studied over the past few decades, and numerous algorithms have been proposed [1], [2], [3]. Nevertheless, it is still an active area of research because new challenges have surfaced and a variety of problems remain unsolved. Among these challenges, computing disparity maps for outdoor images in driver assistance systems (DAS) is one of the most actively researched topics in this field [4], [5], [6].

In general, a stereo matching algorithm consists of four steps: cost computation, cost aggregation, disparity optimization, and post-processing. In this paper, we mainly focus on the cost aggregation step since it has the greatest impact on the accuracy of the estimated disparity map. Most existing cost aggregation algorithms define a local 2-D support window for each pixel and perform summing/averaging operations using the information obtained from the pixels inside of that window. State-of-the-art local algorithms include the adaptive support-weight approach [7], geodesic diffusion [8], and fast cost-volume filtering [9]. These were respectively inspired by bilateral filtering [10], anisotropic diffusion [11], and guided image filtering [12], which are the three most well-known edge-aware image filtering methods.
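To make the four steps concrete, the following is a minimal NumPy sketch of a purely local pipeline: a truncated absolute difference cost, box-window averaging as the aggregation step, and winner-takes-all disparity selection. It is an illustration only and not the method of this paper; the function names, the truncation value `tau`, and the window radius are assumptions, and post-processing is omitted.

```python
import numpy as np

def tad_cost_volume(left, right, max_disp, tau=20.0):
    """Cost computation: truncated absolute difference per pixel and disparity."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    h, w = left.shape
    # Pixels with no valid match at disparity d keep the truncation value tau.
    cost = np.full((max_disp + 1, h, w), tau, dtype=np.float64)
    for d in range(max_disp + 1):
        # Left pixel (y, x) is compared with right pixel (y, x - d).
        diff = np.abs(left[:, d:] - right[:, :w - d])
        cost[d, :, d:] = np.minimum(diff, tau)
    return cost

def box_aggregate(cost, radius=2):
    """Cost aggregation: average each disparity slice over a square window."""
    k = 2 * radius + 1
    pad = ((0, 0), (radius, radius), (radius, radius))
    padded = np.pad(cost, pad, mode="edge")
    h, w = cost.shape[1:]
    out = np.zeros_like(cost)
    for dy in range(k):
        for dx in range(k):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / (k * k)

def winner_takes_all(cost):
    """Disparity optimization: pick the lowest aggregated cost at each pixel."""
    return np.argmin(cost, axis=0)
```

A typical call would be `winner_takes_all(box_aggregate(tad_cost_volume(left, right, max_disp=64)))` on two rectified grayscale images. The edge-aware local methods cited above replace the uniform window average with weights that depend on the image content.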

Semi-global matching (SGM) [13] is one of the best performing stereo matching algorithms for outdoor driving images [4]. SGM performs the cost aggregation step using several global paths instead of the local pixel similarities used in the local approach. Even though this is an optimization-based approach that uses dynamic programming for each path, it can still be considered as a cost aggregation step [14].
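The path-wise dynamic programming of SGM can be sketched for a single left-to-right path as follows. This is a simplified illustration of the recurrence in [13], assuming a cost volume indexed as (row, column, disparity); the function name and the penalty values `p1` and `p2` are placeholders, and a full SGM implementation sums the results of several path directions.

```python
import numpy as np

def aggregate_path_left_to_right(cost, p1=1.0, p2=8.0):
    """Accumulate path costs along +x for a cost volume of shape (H, W, D)."""
    cost = np.asarray(cost, dtype=np.float64)
    h, w, d = cost.shape
    agg = np.empty_like(cost)
    agg[:, 0, :] = cost[:, 0, :]
    for x in range(1, w):
        prev = agg[:, x - 1, :]                       # (H, D) costs of previous pixel
        prev_min = prev.min(axis=1, keepdims=True)    # (H, 1) best previous cost
        # Previous costs at disparity d-1 and d+1 (infinite outside the range).
        minus = np.pad(prev, ((0, 0), (1, 0)), constant_values=np.inf)[:, :-1]
        plus = np.pad(prev, ((0, 0), (0, 1)), constant_values=np.inf)[:, 1:]
        jump = np.broadcast_to(prev_min + p2, prev.shape)
        # Same disparity is free, a change of 1 costs p1, any larger jump costs p2.
        best = np.minimum.reduce([prev, minus + p1, plus + p1, jump])
        # Subtracting prev_min keeps the accumulated values bounded.
        agg[:, x, :] = cost[:, x, :] + best - prev_min
    return agg
```

Because each pixel only looks at its predecessor on the path, the cost at one pixel is influenced by every pixel earlier on that path, which is why the procedure can be read as a form of cost aggregation.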

In the past few years, non-local cost aggregation has emerged as a promising approach because it is reportedly both more accurate and faster than existing local algorithms. The key idea behind its success is that it aggregates the matching cost using information from all of the pixels in the image rather than only those inside a specific window, as the aforementioned local algorithms do [7], [8], [9]. Cigla et al. [15] proposed an information permeability algorithm with separable successive weighted summation along the horizontal and vertical directions, and an improved spatio-temporal information permeability filter is presented in [16]. Pham et al. [17] applied the domain transform [18] and employed a sequence of 1-D filters to reduce the computation and memory costs relative to conventional 2-D filters.

Yang's method [19], [20], the most notable non-local algorithm, carries out the cost aggregation on a minimum spanning tree (MST). The MST connects all pixels using the subset of edges whose total edge weight is minimal, and the matching cost is then aggregated on this tree in two passes, leaf-to-root and root-to-leaf. Mei et al. [21] improved upon this algorithm by incorporating image segmentation [22] in the tree construction procedure. This results in a new tree structure called the segment-tree (ST). Although these methods generate accurate disparity maps for indoor images as shown in [19], [20], [21], they fail to do the same for outdoor driving images. Neither MST nor ST is adequate for outdoor driving images, as will be demonstrated by analysis in the later sections. In this paper, we present a robust non-local algorithm that can meet this challenge. Specifically, we perform cost aggregation on an alternative segment-simple-tree (SST) that is more suitable for outdoor driving images than MST and ST.
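The MST construction described above can be sketched with Kruskal's algorithm on the 4-connected pixel grid. This is a minimal illustration, assuming a grayscale image and an edge weight equal to the absolute intensity difference between neighbouring pixels; the function name `build_grid_mst` is hypothetical, and the sketch does not include the segmentation used by ST or the SST proposed in this paper.

```python
import numpy as np

def build_grid_mst(gray):
    """Return the MST of the 4-connected pixel grid as (u, v, weight) tuples."""
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    idx = np.arange(h * w).reshape(h, w)
    # Horizontal edges followed by vertical edges, as flat pixel indices.
    u = np.concatenate([idx[:, :-1].ravel(), idx[:-1, :].ravel()])
    v = np.concatenate([idx[:, 1:].ravel(), idx[1:, :].ravel()])
    flat = g.ravel()
    weight = np.abs(flat[u] - flat[v])
    order = np.argsort(weight, kind="stable")

    parent = list(range(h * w))  # union-find forest over pixels

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    mst = []
    for a, b, wt in zip(u[order].tolist(), v[order].tolist(), weight[order].tolist()):
        ra, rb = find(a), find(b)
        if ra != rb:                 # the edge joins two components, keep it
            parent[ra] = rb
            mst.append((a, b, wt))
    return mst
```

Since low-weight edges are accepted first, pixels inside a homogeneous region are linked before any edge that crosses a strong intensity boundary, so the tree tends to follow image structure.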

The remainder of this paper is structured as follows. Section 2 presents the non-local cost aggregation on an MST, which is the inspiration for our algorithm. Section 3 presents the proposed algorithm. Section 4 presents the cost computation, disparity optimization, and post-processing. Section 5 presents the experimental results that compare our algorithm to those proposed in prior literature. Section 6 concludes the paper.

Section snippets

Non-local cost aggregation on MST

In this section, we present the MST-based non-local algorithm [19], [20], which directly inspired our algorithm. Specifically, we summarize the construction of the MST and the two-pass cost aggregation. The proposed algorithm performs a two-pass cost aggregation similar to that of the MST. However, the tree is constructed differently by employing an alternate, effective segment-simple-tree structure (SST) that is more suitable than MST for outdoor driving images.
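The two-pass aggregation can be illustrated on an arbitrary tree as follows. This is a sketch based on the update rules of [19], assuming the similarity between two connected nodes is exp(-w / sigma) for edge weight w, so that the similarity between any two nodes is the product of the similarities along their tree path. The function name, the (u, v, weight) edge format, and the default `sigma` are assumptions made for this example.

```python
import numpy as np
from collections import deque

def aggregate_on_tree(n_nodes, edges, cost, sigma=0.1, root=0):
    """Two-pass cost aggregation on a tree.

    edges: iterable of (u, v, weight); cost: array of shape (n_nodes, D).
    Returns, for every node p, the sum over all nodes q of S(p, q) * cost[q].
    """
    adj = [[] for _ in range(n_nodes)]
    for u, v, w in edges:
        s = float(np.exp(-w / sigma))
        adj[u].append((v, s))
        adj[v].append((u, s))

    # Breadth-first traversal fixes a parent for each node and a visiting order.
    parent = [-1] * n_nodes
    sim = [1.0] * n_nodes          # similarity between a node and its parent
    seen = [False] * n_nodes
    seen[root] = True
    order, queue = [], deque([root])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, s in adj[u]:
            if not seen[v]:
                seen[v], parent[v], sim[v] = True, u, s
                queue.append(v)

    # Pass 1 (leaf-to-root): each node collects the costs of its subtree.
    up = np.array(cost, dtype=np.float64)
    for v in reversed(order[1:]):
        up[parent[v]] += sim[v] * up[v]

    # Pass 2 (root-to-leaf): add the contribution from the rest of the tree.
    agg = up.copy()
    for v in order[1:]:
        agg[v] = sim[v] * agg[parent[v]] + (1.0 - sim[v] ** 2) * up[v]
    return agg
```

Each node is visited once per pass, so the aggregation cost is linear in the number of pixels for every disparity level, independent of how far the support extends.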

To begin, we denote G=(V,E) as a

Analysis of outdoor driving images

In this subsection, we first analyze the disparity characteristics of outdoor driving images to understand the key idea behind the SST proposed for non-local stereo matching in DAS. Fig. 2 illustrates this analysis using a synthesized stereo image pair obtained from the EISATS dataset [4]. Fig. 2(a) and (b) show the left image and the corresponding ground truth data, respectively. To examine the disparity characteristic on the road surface, we plot the disparities on the road regions in the

Cost computation

As previously mentioned, cost computation is the first step of a general four-step stereo matching pipeline. Cost computation is essential and strongly influences the accuracy of the disparity results. There are several existing metrics for computing matching costs; these metrics include truncated absolute difference (TAD), a combination of TAD using intensity and gradient [9], mutual information (MI) [13], and Census [30]. TAD is one of the simplest measures that has been widely used, but it

Experimental setup

We compared the proposed non-local algorithm using SST (NL-SST) with state-of-the-art cost aggregation algorithms including cost-volume filtering (CostFilter) [9], domain transform-based cost aggregation (DTAggr) [17], non-local cost aggregation on an MST (NL-MST) [19], non-local cost aggregation on a segment-tree (NL-ST) [21], and semi-global matching (SGM) [13]. The reasons for this choice are as follows: (1) CostFilter is a state-of-the-art local algorithm, (2) DTAggr, NL-MST, and NL-ST

Conclusions

In this paper, we presented an effective segment-simple-tree for non-local stereo matching in driver assistance systems. In contrast to the original non-local algorithm that only builds a single MST for the entire image, our approach constructs multiple MSTs for non-road segments and some simple-trees for the segments on the road. The cost aggregation is carried out on the proposed SST in two passes in a manner similar to that of the original approach using MST. The advantages of SST are

Acknowledgments

This work was supported by Samsung Research Funding Center of Samsung Electronics under Project Number SRFC-IT1402-12.

References (32)

  • N. Kiryati et al., A probabilistic Hough transform, Pattern Recognit. (1991)
  • D. Scharstein et al., A taxonomy and evaluation of dense two-frame stereo correspondence algorithms, Int. J. Comput. Vis. (2002)
  • R. Szeliski, Computer Vision: Algorithms and Applications (2011)
  • D. Scharstein, R. Szeliski, Middlebury Stereo Evaluation - Version 2 〈http://vision.middlebury.edu/stereo/eval〉,...
  • R. Klette et al., Performance of correspondence algorithms in vision-based driver assistance using an online image sequence database, IEEE Trans. Veh. Technol. (2011)
  • S. Meister, B. Jahne, D. Kondermann, Outdoor stereo camera system for the generation of real-world benchmark data sets,...
  • A. Geiger, P. Lenz, R. Urtasun, Are we ready for autonomous driving? The KITTI vision benchmark suite, In: Proceedings...
  • K.-J. Yoon et al., Adaptive support-weight approach for correspondence search, IEEE Trans. Pattern Anal. Mach. Intell. (2006)
  • L. De-Maeztu et al., Near real-time stereo matching using geodesic diffusion, IEEE Trans. Pattern Anal. Mach. Intell. (2012)
  • C. Rhemann, A. Hosni, M. Bleyer, C. Rother, M. Gelautz, Fast Cost-Volume Filtering for Visual Correspondence and...
  • C. Tomasi, R. Manduchi, Bilateral filtering for gray and color images, In: Proceedings of the International Conference...
  • P. Perona et al., Scale-space and edge detection using anisotropic diffusion, IEEE Trans. Pattern Anal. Mach. Intell. (1990)
  • K. He, J. Sun, X. Tang, Guided image filtering, In: Proceedings of ECCV 2010 of LNCS, vol. 6311, 2010, pp....
  • H. Hirschmuller, Stereo processing by semiglobal matching and mutual information, IEEE Trans. Pattern Anal. Mach. Intell. (2008)
  • M. Bleyer, C. Breiteneder, Stereo matching—state-of-the-art and research challenges, In: Advanced Topics in Computer...
  • C. Cigla, A.A. Alatan, Efficient edge-preserving stereo matching, In: ICCV Workshop on LDRMV, 2011, pp....