Abstract
In this paper, we describe a real-time vision machine that takes stereo camera input and generates visual information at two levels of abstraction. The system provides low-level and mid-level visual information: dense stereo and optical flow, egomotion, an indication of image areas containing independently moving objects, and a condensed geometric description of the scene. The system operates at more than 20 Hz on a hybrid architecture consisting of one dual-GPU card and one quad-core CPU. The individual processing stages have rather different characteristics, which in some cases makes fine-grained parallelization on a GPU less applicable. However, most of the stages that cannot be implemented efficiently on a GPU are amenable to coarse-grained parallelization across multiple CPU cores. We show that such hybrid parallelism achieves a speedup of approximately a factor of 90 and a latency reduction of a factor of 26 compared to processing on a single CPU core. Because the vision machine provides generic visual information, it can be used in many contexts. It is currently used in a driver assistance context as well as in two robotic applications.
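The throughput speedup and the latency reduction quoted above differ, which is what one would expect if the processing stages overlap across consecutive frames. The following minimal sketch illustrates that distinction under the assumption of a simple linear pipeline; the function name and all stage timings are hypothetical and are not measurements from the system described here.

```python
def pipeline_metrics(serial_stage_ms, parallel_stage_ms):
    """Compare a serial single-core run with an overlapped multi-stage pipeline.

    Both arguments are per-stage processing times in milliseconds for one
    frame. Returns (throughput_speedup, latency_reduction, frame_rate_hz).
    Assumes (hypothetically) that in the parallel case every stage works on
    a different frame at the same time.
    """
    # Single core: stages run back to back, so the time between two
    # finished frames equals the time one frame spends in the system.
    serial_latency = sum(serial_stage_ms)
    serial_period = serial_latency

    # Pipeline: a frame still passes through every stage (latency is the
    # sum), but a new frame completes each time the slowest stage finishes.
    parallel_latency = sum(parallel_stage_ms)
    parallel_period = max(parallel_stage_ms)

    throughput_speedup = serial_period / parallel_period
    latency_reduction = serial_latency / parallel_latency
    frame_rate_hz = 1000.0 / parallel_period
    return throughput_speedup, latency_reduction, frame_rate_hz


# Hypothetical timings for three stages, chosen only for illustration.
speedup, latency_gain, rate = pipeline_metrics([400, 300, 200], [10, 20, 15])
print(speedup, latency_gain, rate)
```

With these illustrative numbers the throughput improves more than the latency does, because the frame period is bounded by the slowest stage while the latency is the sum of all stages.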









Notes
Compiled with g++ version 4.3.3 using the flags ‘-O3 -march=native -m64 -fPIC’, and using an optimized library for FFT computations when possible.
Acknowledgments
This work was supported by the European Commission FP6 Project DRIVSCO (IST-016276-2) and the Danish project Robo-Packman.
Cite this article
Jensen, L.B.W., Kjær-Nielsen, A., Pauwels, K. et al. A two-level real-time vision machine combining coarse- and fine-grained parallelism. J Real-Time Image Proc 5, 291–304 (2010). https://doi.org/10.1007/s11554-010-0159-4
DOI: https://doi.org/10.1007/s11554-010-0159-4