Fast implementation of dense stereo vision algorithms on a highly parallel SIMD architecture

  • Original Research Paper
Journal of Real-Time Image Processing

Abstract

In this paper, we present a faster-than-real-time implementation of a class of dense stereo vision algorithms on a low-power, massively parallel SIMD architecture, the CSX700. With two cores, each containing 96 processing elements, this architecture provides a peak computation power of 96 GFLOPS while consuming only 9 W, making it an excellent candidate for embedded computing applications. Exploiting the full feature set of this architecture, we have developed schemes for an efficient parallel implementation with minimal overhead. For the sum of squared differences (SSD) algorithm on VGA (640 × 480) images with disparity ranges of 16 and 32, we achieve 179 and 94 frames per second (fps), respectively. For HDTV (1,280 × 720) images with the same disparity ranges, we achieve 67 and 35 fps, respectively. We have also implemented more accurate, and hence more computationally expensive, variants of SSD; in most cases, particularly for VGA images, these also run faster than real time. Our results demonstrate that, with carefully developed parallelization schemes, the CSX architecture can provide excellent performance and flexibility for various embedded vision applications.
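
For readers unfamiliar with SSD matching, the following is a minimal sequential sketch of winner-take-all SSD block matching on a rectified image pair. It is not the CSX700 implementation described in the paper, whose parallel schemes are not shown on this page. The function names, the square window, and the `radius` parameter are assumptions made for illustration. A second helper multiplies the image size, disparity range, and frame rate quoted in the abstract to give the number of pixel-disparity evaluations per second.

```python
def ssd_disparity(left, right, max_disp, radius=1):
    """Winner-take-all SSD block matching (illustrative sketch only).

    left, right: equal-sized grayscale images as lists of rows of numbers,
                 assumed rectified so matches lie on the same row.
    max_disp:    number of candidate disparities, tested as 0 .. max_disp - 1.
    radius:      half-width of the square matching window (assumed parameter).
    Border pixels that cannot hold a full window keep disparity 0.
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            best_cost, best_d = None, 0
            # Test only disparities that keep the right-image window in bounds.
            for d in range(min(max_disp, x - radius + 1)):
                cost = 0
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        diff = left[y + dy][x + dx] - right[y + dy][x + dx - d]
                        cost += diff * diff
                # Strict comparison: ties go to the smallest disparity.
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity


def disparity_evals_per_second(width, height, disp_range, fps):
    """Pixel-disparity cost evaluations per second for a given frame rate."""
    return width * height * disp_range * fps


if __name__ == "__main__":
    # Synthetic pair: the left image is the right image shifted by 2 pixels.
    right = [[x * x + y for x in range(10)] for y in range(5)]
    left = [[(x - 2) ** 2 + y if x >= 2 else 0 for x in range(10)]
            for y in range(5)]
    for row in ssd_disparity(left, right, max_disp=4):
        print(row)
    # Frame rates quoted in the abstract for VGA and HDTV images.
    for w, h, d, fps in [(640, 480, 16, 179), (640, 480, 32, 94),
                         (1280, 720, 16, 67), (1280, 720, 32, 35)]:
        print(w, h, d, fps, disparity_evals_per_second(w, h, d, fps))
```

Applied to the figures in the abstract, the helper gives roughly 0.88 to 1.03 billion pixel-disparity evaluations per second across the four reported configurations, which suggests that throughput stays close to constant as image size and disparity range grow.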

Author information

Correspondence to Fouzhan Hosseini.

About this article

Cite this article

Hosseini, F., Fijany, A., Safari, S. et al. Fast implementation of dense stereo vision algorithms on a highly parallel SIMD architecture. J Real-Time Image Proc 8, 421–435 (2013). https://doi.org/10.1007/s11554-011-0211-z

  • DOI: https://doi.org/10.1007/s11554-011-0211-z
