Bayesian real-time perception algorithms on GPU

Real-time implementation of Bayesian models for multimodal perception using CUDA

  • Special Issue
Journal of Real-Time Image Processing

Abstract

In this text, we present the real-time implementation of a Bayesian framework for robotic multisensory perception on a graphics processing unit (GPU) using the Compute Unified Device Architecture (CUDA). As an additional objective, we intend to show the benefits of parallel computing for similar problems (i.e. probabilistic grid-based frameworks), and the user-friendly nature of CUDA as a programming tool. Inspired by the study of biological systems, several Bayesian inference algorithms for artificial perception have been proposed. Their high computational cost has been a prohibitive factor for real-time implementations. However, in some cases the bottleneck lies in the large data structures involved, rather than in the Bayesian inference per se. We will demonstrate that the SIMD (single-instruction, multiple-data) features of GPUs provide a means for taking a complicated framework of relatively simple and highly parallelisable algorithms operating on large data structures, which might take up to several minutes of execution with a regular CPU implementation, and arriving at an implementation that executes on the order of tenths of a second. The implemented multimodal perception module (including stereovision, binaural sensing and inertial sensing) builds an egocentric representation of occupancy and local motion, the Bayesian Volumetric Map (BVM), based on which gaze shift decisions are made to perform active exploration and reduce the entropy of the BVM. Experimental results show that the real-time implementation successfully drives the robotic system to explore areas of the environment mapped with high uncertainty.
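The pattern described in the abstract (simple per-cell computations repeated over a large grid, followed by an entropy-driven choice of where to look) can be illustrated with a minimal sketch. This is not the authors' BVM formulation: the binary occupancy model, the flat list of cells, and the function names `bayes_update`, `binary_entropy`, `update_grid` and `most_uncertain_cell` are all assumptions made for illustration, and local motion is ignored. The point is only that each cell is updated independently of the others, which is the property that would let a GPU assign one thread per cell.

```python
import math


def bayes_update(prior, p_z_given_occ, p_z_given_empty):
    """Posterior P(occupied | z) for one cell via Bayes' rule.

    Hypothetical binary occupancy model, not the paper's sensor models.
    """
    numerator = p_z_given_occ * prior
    evidence = numerator + p_z_given_empty * (1.0 - prior)
    # If the measurement is impossible under both hypotheses, keep the prior.
    return numerator / evidence if evidence > 0.0 else prior


def binary_entropy(p):
    """Shannon entropy, in bits, of a binary occupancy probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def update_grid(priors, likelihoods):
    """Update every cell independently.

    No cell reads another cell's value, so this loop is data-parallel:
    on a GPU each iteration could be handled by its own thread.
    """
    return [
        bayes_update(prior, p_occ, p_empty)
        for prior, (p_occ, p_empty) in zip(priors, likelihoods)
    ]


def most_uncertain_cell(grid):
    """Index of the cell with the highest entropy (a stand-in for a gaze target)."""
    entropies = [binary_entropy(p) for p in grid]
    return max(range(len(grid)), key=entropies.__getitem__)


if __name__ == "__main__":
    priors = [0.5, 0.5, 0.5]
    # (P(z | occupied), P(z | empty)) per cell; illustrative values only.
    likelihoods = [(0.9, 0.1), (0.5, 0.5), (0.2, 0.8)]
    posterior = update_grid(priors, likelihoods)
    print("posterior:", posterior)
    print("next gaze target (cell index):", most_uncertain_cell(posterior))
```

In this toy run the middle cell receives an uninformative measurement, so it stays at maximum entropy and is selected as the next target, mirroring the idea of steering exploration towards poorly mapped regions.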

Notes

  1. However, the new Fermi GPUs from NVIDIA will have configurable L1 and unified L2 caches [32]. Refer to the concluding section for further discussion on the subject.

  2. These sensor models are, in fact, Bayesian subprograms of the BVM.

  3. Set of cells, each belonging to a particular line-of-sight (θ_C, ϕ_C) in the BVM, just preceding the first occupied cell in that direction.

  4. CUDA streams are concurrent lanes of execution that allow parallel execution of multiple kernels on the GPU.

References

  1. GPU4Vision—Accelerating Computer Vision. http://gpu4vision.icg.tugraz.at/ (2009)

  2. Aloimonos, J., Weiss, I., Bandyopadhyay, A.: Active vision. Int. J. Comput. Vis. 1, 333–356 (1987)

  3. Bajcsy, R.: Active perception vs passive perception. In: Third IEEE Workshop on Computer Vision, Bellair, Michigan, pp 55–59 (1985)

  4. Barber, M.J., Clark, J.W., Anderson, C.H.: Neural representation of probabilistic information. Neural Comput. 15(8), 1843–1864 (2003). doi:10.1162/08997660360675062

  5. Bessière, P., Laugier, C., Siegwart, R. (eds.): Probabilistic reasoning and decision making in sensory-motor systems. In: Springer Tracts in Advanced Robotics, vol 46. Springer. ISBN: 978-3-540-79006-8 (2008)

  6. Box, G.E.P., Muller, M.: A note on the generation of random normal deviates. Ann. Math. Stat. 29(2), 610–611 (1958)

  7. Carpenter, R.H.S.: The saccadic system: a neurological microcosm. Adv. Clin. Neurosci. Rehabil. 4, 6–8 (2004)

  8. Caspi, A., Beutter, B.R., Eckstein, M.P.: The time course of visual information accrual guiding eye movement decisions. Proc. Natl. Acad. Sci. USA 101(35), 13086–13090 (2004)

  9. Cutting, J.E., Vishton, P.M.: Perceiving layout and knowing distances: the integration, relative potency, and contextual use of different information about depth. In: Epstein, W., Rogers, S. (eds.) Handbook of Perception and Cognition, vol 5; Perception of Space and Motion. Academic Press, New York (1995)

  10. Denève, S., Latham, P.E., Pouget, A.: Reading population codes: a neural implementation of ideal observers. Nat. Neurosci. 2(8), 740–745 (1999). doi:10.1038/11205

  11. Elfes, A.: Using occupancy grids for mobile robot perception and navigation. IEEE Comput. 22(6), 46–57 (1989)

  12. Farrugia, J.P., Horain, P., Guehenneux, E., Alusse, Y.: GpuCV: a framework for image processing acceleration with graphics processors. In: 2006 IEEE International Conference on Multimedia and Expo, pp 585–588 (2006)

  13. Ferreira, J.F., Bessière, P., Mekhnacha, K., Lobo, J., Dias, J., Laugier, C.: Bayesian models for multimodal perception of 3D structure and motion. In: International Conference on Cognitive Systems (CogSys 2008), University of Karlsruhe, Karlsruhe, Germany, pp 103–108 (2008)

  14. Ferreira, J.F., Pinho, C., Dias, J.: Active exploration using Bayesian models for multimodal perception. In: Campilho, A., Kamel, M. (eds.) Image Analysis and Recognition. Lecture Notes in Computer Science Series (Springer LNCS), International Conference ICIAR 2008, pp 369–378 (2008)

  15. Ferreira, J.F., Pinho, C., Dias, J.: Bayesian sensor model for egocentric stereovision. In: 14a Conferência Portuguesa de Reconhecimento de Padrões Coimbra (RECPAD 2008) (2008)

  16. Ferreira, J.F., Pinho, C., Dias, J.: Implementation and calibration of a Bayesian binaural system for 3D localisation. In: 2008 IEEE International Conference on Robotics and Biomimetics (ROBIO 2008), Bangkok, Thailand (2009)

  17. Fung, J., Mann, S.: Using graphics devices in reverse: GPU-based image processing and computer vision. In: IEEE Int’l Conference on Multimedia and Expo, Hannover, Germany (2008)

  18. Fung, J., Mann, S., Aimone, C.: OpenVIDIA: parallel GPU computer vision. In: ACM Multimedia 2005, Singapore, pp 849–852 (2005)

  19. Hussein, M., Varshney, A., Davis, L.: On implementing graph cuts on CUDA. In: First Workshop on General Purpose Processing on Graphics Processing Units, Boston, MA (2007)

  20. Hwu, W.M., Rodrigues, C., Ryoo, S., Stratton, J.: Compute unified device architecture application suitability. Comput. Sci. Eng. 11(3), 16–26 (2009)

  21. Jang, H., Park, A., Jung, K.: Neural network implementation using CUDA and OpenMP. In: Proceedings of the 2008 Digital Image Computing: Techniques and Applications, pp 155–161. IEEE Computer Society Washington, DC, USA (2008)

  22. Knill, D.C., Pouget, A.: The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci. 27(12), 712–719 (2004)

  23. Koene, A., Morén, J., Trifa, V., Cheng, G.: Gaze shift reflex in a humanoid active vision system. In: 5th International Conference on Computer Vision Systems (ICVS 2007), Applied Computer Science Group, Bielefeld University, Germany. ISBN:978-3-00-020933-8 (2007)

  24. Laurens, J., Droulez, J.: Bayesian processing of vestibular information. Biol. Cybernet. 96(4), 389–404 (2007)

  25. Lebeltel, O.: Programmation Bayésienne des robots. PhD thesis, Institut National Polytechnique de Grenoble, Grenoble, France (1999)

  26. L’Ecuyer, P.: Maximally equidistributed combined Tausworthe generators. Math. Comput. 65(213), 202–213 (1996)

  27. Lobo, J., Ferreira, J.F., Dias, J.: Bioinspired visuovestibular artificial perception system for independent motion segmentation. In: Second International Cognitive Vision Workshop, ECCV 9th European Conference on Computer Vision, Graz, Austria. http://dib.joanneum.at/icvw2006/ (2006)

  28. Lobo, J., Ferreira, J.F., Dias, J.: Robotic implementation of biological Bayesian models towards visuo-inertial image stabilization and gaze control. In: 2008 IEEE International Conference on Robotics and Biomimetics (ROBIO 2008), Bangkok, Thailand (2009)

  29. Lu, Y.C., Christensen, H., Cooke, M.: Active binaural distance estimation for dynamic sources. In: Interspeech 2007, Antwerp, Belgium (2007)

  30. Mekhnacha, K., Ahuactzin, J.M., Bessière, P., Mazer, E., Smail, L.: Exact and approximate inference in ProBT. Revue d’Intelligence Artificielle 21(3), 295–332 (2007)

  31. NVIDIA (2007) CUDA Programming Guide ver 1.2

  32. NVIDIA (2009) NVIDIA’s Next Generation CUDA™ Compute Architecture: Fermi™. Whitepaper, NVIDIA. http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf

  33. Owens, J.D., Luebke, D., Govindaraju, N., Harris, M., Krüger, J., Lefohn, A.E., Purcell, T.: A survey of general-purpose computation on graphics hardware. Comput. Graph. Forum 26(5), 80–113 (2007)

  34. Owens, J.D., Houston, M., Luebke, D., Green, S., Stone, J.E., Phillips, J.C.: GPU computing. In: Proceedings of the IEEE, vol 96, pp 879–899 (2008)

  35. Pinho, C., Ferreira, J.F., Bessière, P., Dias, J.: A Bayesian binaural system for 3D sound-source localisation. In: International Conference on Cognitive Systems (CogSys 2008), pp 109–114. University of Karlsruhe, Karlsruhe, Germany (2008)

  36. Pouget, A., Dayan, P., Zemel, R.: Information processing with population codes. Nat. Rev. Neurosci. 1, 125–132 (2000)

  37. Reinbothe, C., Boubekeur, T., Alexa, M.: Hybrid ambient occlusion. In: Proceedings of the Eurographics Symposium on Rendering (2009)

  38. Tay, C., Mekhnacha, K., Chen, C., Yguel, M., Laugier, C.: An efficient formulation of the Bayesian occupation filter for target tracking in dynamic environments. Int. J. Veh. Auton. Syst. 6, 155–171 (2008)

  39. Yguel, M., Aycard, O., Laugier, C.: Efficient GPU-based construction of occupancy grids using several laser range-finders. Int. J. Veh. Auton. Syst. 6(1–2), 48–83 (2007)

Acknowledgments

The authors would like to thank José Marinho for his invaluable assistance with the implementation of the inertial sensor model particle filter. This publication has been supported by EC-contract number FP6-IST-027140, Action line: Cognitive Systems. The contents of this text reflect only the authors’ views. The European Community is not liable for any use that may be made of the information contained herein.

Author information

Corresponding author

Correspondence to Jorge Dias.

About this article

Cite this article

Ferreira, J.F., Lobo, J. & Dias, J. Bayesian real-time perception algorithms on GPU. J Real-Time Image Proc 6, 171–186 (2011). https://doi.org/10.1007/s11554-010-0156-7

  • DOI: https://doi.org/10.1007/s11554-010-0156-7
