Abstract
This paper presents an initial framework for extracting high-level structures from man-made environments by means of a novel methodology that combines stereo vision, binary descriptors and parallel processing implemented on a GPU. High-level structures such as planes, spheres and cubes provide vital information about the world, which is essential for applications in robotics, augmented reality and computer vision. However, their extraction involves several computational challenges, especially because these applications impose real-time and operating-environment constraints. Stereo vision-based approaches have been proposed, but they have not achieved real-time performance because they require a rectification stage that runs on a frame-by-frame basis, increasing the computational burden. In contrast to typical stereo algorithms, the proposed methodology is built on a semi-calibrated stereo rig, which means that the rectification stage is avoided. The computational budget can thus be invested in critical stages, achieving a frame rate of up to 50 fps for the whole process.
Acknowledgements
The first author is supported by the Mexican National Council for Science and Technology (CONACyT) studentship number 627047. The second author is thankful for the support received through his Royal Society-Newton Advanced Fellowship with reference NA140454.
Cite this article
de Lima, R., Martinez-Carranza, J., Morales-Reyes, A. et al. Toward a smart camera for fast high-level structure extraction. J Real-Time Image Proc 14, 685–699 (2018). https://doi.org/10.1007/s11554-017-0704-5
DOI: https://doi.org/10.1007/s11554-017-0704-5