Toward a smart camera for fast high-level structure extraction

  • Special Issue Paper
Journal of Real-Time Image Processing

Abstract

This paper presents an initial framework for extracting high-level structures from man-made environments, using a novel methodology that combines stereo vision, binary descriptors and parallel processing on a GPU. High-level structures such as planes, spheres and cubes provide vital information about the world and are essential to applications in robotics, augmented reality and computer vision. However, extracting them poses several computational challenges, mainly because the application context imposes real-time and operating-environment constraints. Stereo vision-based approaches have been proposed, but they have not achieved real-time performance because they require a rectification stage that runs on a frame-to-frame basis, which increases the computational burden. In contrast to typical stereo algorithms, the proposed methodology is built on a semi-calibrated stereo rig, which means that the rectification stage is avoided. The computational budget can therefore be spent on critical stages, and the whole process achieves a frame rate of up to 50 fps.
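The abstract does not spell out how binary descriptors are compared between the two views, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes 256-bit descriptors packed into 32 bytes (as produced by detectors such as ORB or BRIEF) and uses brute-force Hamming matching in NumPy on the CPU; the function names `hamming_matrix` and `match_descriptors` and the `max_distance` threshold are hypothetical. Every left/right pair is compared independently, which is the kind of work that maps naturally onto GPU threads.

```python
import numpy as np


def hamming_matrix(desc_left, desc_right):
    """Pairwise Hamming distances between packed binary descriptors.

    Both inputs are uint8 arrays of shape (n, n_bytes).
    """
    # XOR every left descriptor with every right descriptor via broadcasting;
    # set bits in the result mark positions where the descriptors differ.
    xor = np.bitwise_xor(desc_left[:, None, :], desc_right[None, :, :])
    # Unpack bytes into bits and count the differing bits per pair.
    return np.unpackbits(xor, axis=2).sum(axis=2)


def match_descriptors(desc_left, desc_right, max_distance=64):
    """Nearest-neighbour matching; returns (left_idx, right_idx, distance)."""
    dist = hamming_matrix(desc_left, desc_right)
    nearest = dist.argmin(axis=1)
    best = dist[np.arange(len(desc_left)), nearest]
    # max_distance is an assumed rejection threshold, not a value from the paper.
    return [
        (i, int(j), int(d))
        for i, (j, d) in enumerate(zip(nearest, best))
        if d <= max_distance
    ]


# Synthetic data: five random 256-bit descriptors in the right image,
# three of which reappear in the left image (one with a single flipped bit).
rng = np.random.default_rng(0)
right = rng.integers(0, 256, size=(5, 32), dtype=np.uint8)
left = right[[3, 0, 4]].copy()
left[0, 0] ^= 1

matches = match_descriptors(left, right)
```

Because the comparison reduces to XOR and bit counting, it is far cheaper than the floating-point distances needed for descriptors such as SIFT, which is one reason binary descriptors suit real-time pipelines.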

Acknowledgements

The first author is supported by the Mexican National Council for Science and Technology (CONACyT) studentship number 627047. The second author is thankful for the support received through his Royal Society-Newton Advanced Fellowship with reference NA140454.

Author information

Corresponding author

Correspondence to Roberto de Lima.

About this article

Cite this article

de Lima, R., Martinez-Carranza, J., Morales-Reyes, A. et al. Toward a smart camera for fast high-level structure extraction. J Real-Time Image Proc 14, 685–699 (2018). https://doi.org/10.1007/s11554-017-0704-5

  • DOI: https://doi.org/10.1007/s11554-017-0704-5
