Oriented-linear-tree based cost aggregation for stereo matching

Multimedia Tools and Applications

Abstract

Matching cost aggregation is one of the most important steps in dense stereo correspondence, and non-local cost aggregation methods based on tree structures have been widely studied in recent years. In this paper, we analyze the shortcomings of both local window-based aggregation methods and non-local tree-based aggregation methods, and propose a novel oriented linear tree structure, built for each pixel, to perform non-local cost aggregation. Firstly, each pixel in the image is the root of an oriented linear tree, and each oriented linear tree consists of multiple 1D paths from different directions. Unlike other spanning trees, our oriented linear trees do not need to be constructed beforehand, since they are naturally embedded in the original image. Moreover, each root pixel receives support not only from adjacent pixels within its local support window, but also from the other pixels along all of its 1D paths. Secondly, the aggregated costs of all pixels lying on the same 1D path can be computed simultaneously by traversing the path back and forth twice. Finally, the aggregated cost for each root pixel is obtained by summing the aggregated costs from all of its 1D paths. Performance evaluation on the Middlebury and KITTI datasets shows that the proposed method outperforms current state-of-the-art aggregation methods.
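
The path-wise aggregation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weight function, the number of path directions, or the exact traversal scheme, so everything specific here is an assumption: the exponential intensity-difference weight and the parameter `sigma` follow common practice in non-local aggregation, only horizontal and vertical paths are used, and the function names `aggregate_path` and `aggregate_image` are hypothetical. It should be read as one plausible instance of two-pass aggregation along 1D paths, not as the authors' formulation.

```python
import math


def aggregate_path(costs, intensities, sigma=0.1):
    """Aggregate matching costs along one 1D path with two sweeps.

    costs[i] is the raw matching cost of pixel i at one fixed disparity;
    intensities[i] is its gray value. The exponential edge weight and
    sigma are assumptions, not taken from the paper.
    """
    n = len(costs)
    # Similarity weight between each pair of consecutive pixels.
    w = [math.exp(-abs(intensities[i + 1] - intensities[i]) / sigma)
         for i in range(n - 1)]

    # Forward sweep: each pixel collects support from pixels before it.
    fwd = list(costs)
    for i in range(1, n):
        fwd[i] += w[i - 1] * fwd[i - 1]

    # Backward sweep: each pixel collects support from pixels after it.
    bwd = list(costs)
    for i in range(n - 2, -1, -1):
        bwd[i] += w[i] * bwd[i + 1]

    # A pixel's own cost appears in both sweeps, so subtract it once.
    return [f + b - c for f, b, c in zip(fwd, bwd, costs)]


def aggregate_image(cost, image, sigma=0.1):
    """Sum path-wise aggregated costs over rows and columns.

    cost and image are lists of rows for one disparity level. Using only
    two directions is an assumption made to keep the sketch short; the
    root's raw cost is counted once per path here.
    """
    h, w = len(cost), len(cost[0])
    rows = [aggregate_path(cost[y], image[y], sigma) for y in range(h)]
    cols = [aggregate_path([cost[y][x] for y in range(h)],
                           [image[y][x] for y in range(h)], sigma)
            for x in range(w)]
    return [[rows[y][x] + cols[x][y] for x in range(w)] for y in range(h)]
```

With these assumed weights, a pixel on a path of constant intensity receives the full sum of the raw costs on that path, while a strong intensity edge almost completely blocks support from the far side. Each path is processed in time linear in its length, and every pixel on it gets its aggregated cost from the same two sweeps.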

Notes

  1. http://xing-mei.net/resource/page/segment-tree.html

  2. https://www.ims.tuwien.ac.at/publications/tuw-202088

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61673318, 61703301, 61771386, 61801005) and by a research project of the Hubei Provincial Department of Education (B2017080), China.

Author information

Corresponding author

Correspondence to Hong Zhu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, W., Zhu, H. & Zhang, Q. Oriented-linear-tree based cost aggregation for stereo matching. Multimed Tools Appl 78, 15779–15800 (2019). https://doi.org/10.1007/s11042-018-6993-2
