Abstract
Visual Simultaneous Localization and Mapping (vSLAM) is the process of using an optical sensor to map a robot’s observable surroundings while simultaneously estimating the robot’s pose relative to that map. The accuracy and speed of vSLAM computations significantly affect the performance and effectiveness of the downstream tasks the robot must execute, making vSLAM a key building block of current robotic designs. Applying vSLAM to humanoid robots is particularly difficult because of the robot’s unsteady locomotion. This paper introduces a pose graph optimization module based on RGB (ORB) features, as an extension of the KinectFusion pipeline (a well-known vSLAM algorithm), to help recover the robot’s pose during unstable gait patterns when KinectFusion tracking fails. We develop and test a wide range of embedded MPSoC FPGA designs, and we investigate numerous architectural improvements, both precise and approximate, to study their impact on performance and accuracy. Extensive design space exploration reveals that properly designed approximations, which exploit domain knowledge and efficient management of CPU and FPGA fabric resources, enable real-time vSLAM at more than 30 fps on humanoid robots with high energy efficiency and without compromising robot tracking or map construction. This is the first FPGA design to achieve robust, real-time dense SLAM operation specifically targeting humanoid robots. An open-source release of our implementations and data can be found in [1].
- [1] Oct 2021. https://github.com/csl-uth/PG-SLAM_fpga. (Oct 2021).
- [2] 2018. Embedding SLAM algorithms: Has it come of age? Robotics and Autonomous Systems 100 (2018), 14–26.
- [3] 2015. Decentralized active information acquisition: Theory and application to multi-robot SLAM. IEEE International Conference on Robotics and Automation (ICRA) (2015), 4775–4782.
- [4] 2016. Blur image detection using Laplacian operator and Open-CV. 2016 International Conference System Modeling & Advancement in Research Trends (SMART) (2016), 63–67.
- [5] 1992. A method for registration of 3-D shapes. IEEE Trans. Pattern Analysis and Machine Intelligence 14, 2 (1992).
- [6] 2018. SLAMBench2: Multi-objective head-to-head benchmarking for visual SLAM. CoRR abs/1808.06820.
- [7] 2016. Semi-dense SLAM on an FPGA SoC. In 26th International Conference on Field Programmable Logic and Applications (FPL), Lausanne, Switzerland, August 29–September 2, 2016. IEEE, 1–4.
- [8] 2017. A high-performance system-on-chip architecture for direct tracking for SLAM. In 27th International Conference on Field Programmable Logic and Applications (FPL), Ghent, Belgium, September 4–8. 1–7.
- [9] 2019. A scalable FPGA-based architecture for depth estimation in SLAM. In 15th International Symposium on Applied Reconfigurable Computing (ARC), Darmstadt, Germany, April 9–11. 181–196.
- [10] 2009. A floating-point extended Kalman filter implementation for autonomous mobile robots. J. Signal Process. Syst. 56, 1 (2009), 41–50.
- [11] 2017. Simultaneous localization and mapping: A survey of current trends in autonomous driving. IEEE Transactions on Intelligent Vehicles 2 (2017), 194–220.
- [12] 2003. Humanoid robots. In Encyclopedia of Physical Science and Technology (Third Edition). Academic Press, New York, 401–425.
- [13] 2019. SLAMBench 3.0: Systematic automated reproducible evaluation of SLAM systems for robot vision challenges and scene understanding. In International Conference on Robotics and Automation, ICRA 2019, Montreal, QC, Canada, May 20–24, 2019. IEEE, 6351–6358.
- [14] 2016. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Transactions on Robotics 32, 6 (2016), 1309–1332.
- [15] 1996. A volumetric method for building complex models from range images. In 23rd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), New Orleans, LA, USA, August 4–9, 1996. 303–312.
- [16] 2018. SOFT-SLAM: Computationally efficient stereo visual simultaneous localization and mapping for autonomous unmanned aerial vehicles. J. Field Robotics 35 (2018), 578–595.
- [17] 2007. MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (2007), 1052–1067.
- [18] 2018. SuperPoint: Self-supervised interest point detection and description. In IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2018, Salt Lake City, UT, USA, June 18–22, 2018. 224–236.
- [19] 2014. LSD-SLAM: Large-scale direct monocular SLAM. In Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part II (Lecture Notes in Computer Science), Vol. 8690. Springer, 834–849.
- [20] 2017. FPGA-based ORB feature extraction for real-time visual SLAM. In International Conference on Field Programmable Technology (FPT), Melbourne, Australia, December 11–13, 2017. 275–278.
- [21] 2021. Energy-efficient FPGA-accelerated LiDAR-based SLAM for embedded robotics. In International Conference on Field-Programmable Technology (FPT), Auckland, New Zealand, December 6–10, 2021. IEEE, 1–6.
- [22] 2013. Collaborative monocular SLAM with multiple Micro Aerial Vehicles. 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (2013), 3962–3970.
- [23] 2019. FPGA architectures for real-time dense SLAM. In 30th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), New York, NY, USA, July 15–17, 2019. 83–90.
- [24] 2014. Real-time 3D reconstruction for FPGAs: A case study for evaluating the performance, area, and programmability trade-offs of the Altera OpenCL SDK. In 2014 International Conference on Field-Programmable Technology, FPT Shanghai, China, December 10–12, 2014. 326–329.
- [25] 2010. Multi-robot visual SLAM using a Rao-Blackwellized particle filter. Robotics Auton. Syst. 58 (2010), 68–80.
- [26] 2021. FPGA architectures for approximate dense SLAM computing. In 24th Conference on Design, Automation and Test in Europe (DATE), Virtual Conference, February 1–3, 2021.
- [27] 2015. An FPGA-based real-time simultaneous localization and mapping system. In International Conference on Field Programmable Technology (FPT), Queenstown, New Zealand, December 7–9, 2015. 200–203.
- [28] 2014. A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM. In ICRA Hong Kong, China, May.
- [29] 2010. Humanoid robot localization in complex indoor environments. IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings, 1690–1695.
- [30] 2011. KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera. UIST’11 - Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, 559–568.
- [31] 2015. Very high frame rate volumetric integration of depth images on mobile devices. IEEE Trans. Vis. Comput. Graph. 21, 11 (2015).
- [32] 2013. Dense visual SLAM for RGB-D cameras. 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (2013), 2100–2106.
- [33] 2011. G2o: A general framework for graph optimization. Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA), 3607–3613.
- [34] 2019. Survey and evaluation of monocular visual-inertial SLAM algorithms for augmented reality. Virtual Real. Intell. Hardw. 1 (2019), 386–410.
- [35] 2018. ICE-BA: Incremental, consistent and efficient bundle adjustment for visual-inertial SLAM. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, June 18–22, 2018. Computer Vision Foundation / IEEE Computer Society, 1974–1982.
- [36] 2019. eSLAM: An energy-efficient accelerator for real-time ORB-SLAM on FPGA platform. In 56th Annual Design Automation Conference (DAC), Las Vegas, NV, USA, June 02–06, 2019. ACM, 193.
- [37] 2014. Characterizations of noise in Kinect depth images: A review. IEEE Sensors Journal 14, 6 (2014), 1731–1740.
- [38] 2016. A survey of techniques for approximate computing. ACM Comput. Surv. 48, 4 (2016).
- [39] 2003. FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In IJCAI.
- [40] 2009. Fast approximate nearest neighbors with automatic algorithm configuration. VISAPP 2009 - Proceedings of the 4th International Conference on Computer Vision Theory and Applications 1, 331–340.
- [41] 2015. ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Trans. Robotics 31, 5 (2015).
- [42] 2017. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics 33 (2017), 1255–1262.
- [43] 2015. Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM. In International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May. 5783–5790.
- [44] 2014. A synchronized visual-inertial sensor system with FPGA pre-processing for accurate real-time SLAM. In International Conference on Robotics and Automation (ICRA), Hong Kong, China, May 31–June 7, 2014.
- [45] 2016. Energy-efficient simultaneous localization and mapping via compounded approximate computing. In IEEE International Workshop on Signal Processing Systems (SiPS), Dallas, TX, USA, October 26–28, 2016.
- [46] 2012. Vision-based odometric localization for humanoids using a kinematic EKF. In 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012). 153–158.
- [47] 2005. Using visual odometry to create 3D maps for online footstep planning. In 2005 IEEE International Conference on Systems, Man and Cybernetics, Vol. 3. 2643–2648.
- [48] 2019. SLAMBooster: An application-aware online controller for approximation in dense SLAM. In 28th International Conference on Parallel Architectures and Compilation Techniques (PACT), Seattle, WA, USA, September 23–26, 2019.
- [49] 2020. A methodology for principled approximation in visual SLAM. In International Conference on Parallel Architectures and Compilation Techniques (PACT), Virtual Event, GA, USA, October 3–7, 2020. 373–386.
- [50] 2019. Humanoid Robot Dense RGB-D SLAM for Embedded Devices. (04 2019).
- [51] 2011. ORB: An efficient alternative to SIFT or SURF. Proceedings of the IEEE International Conference on Computer Vision, 2564–2571.
- [52] 2018. Navigating the landscape for real-time localization and mapping for robotics and virtual and augmented reality. Proc. IEEE 106, 11 (2018), 2020–2039.
- [53] 2013. SLAM++: Simultaneous localisation and mapping at the level of objects. 2013 IEEE Conference on Computer Vision and Pattern Recognition (2013), 1352–1359.
- [54] 2017. Multi-UAV collaborative monocular SLAM. 2017 IEEE International Conference on Robotics and Automation (ICRA) (2017), 3863–3870.
- [55] 2019. BAD SLAM: Bundle adjusted direct RGB-D SLAM. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019), 134–144.
- [56] 2011. Managing performance vs. accuracy trade-offs with loop perforation. In ESEC/FSE, Szeged, Hungary, Sept. 2011.
- [57] 2015. Discriminative learning of deep convolutional feature point descriptors. In 2015 IEEE International Conference on Computer Vision, ICCV Santiago, Chile, December 7–13, 2015. 118–126.
- [58] 1987. Estimating uncertain spatial relationships in robotics. Proceedings. 1987 IEEE International Conference on Robotics and Automation 4 (1987), 850–850.
- [59] 2019. Navion: A 2-mW fully integrated real-time visual-inertial odometry accelerator for autonomous navigation of nano drones. IEEE Journal of Solid-State Circuits 54, 4 (2019).
- [60] 2004. 3D map building for a humanoid robot by using visual odometry. In 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583), Vol. 5. 4444–4449.
- [61] 2018. A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK. International Conference on Computing, Mathematics and Engineering Technologies (iCoMET) (2018), 1–10.
- [62] 2014. FPGA design and implementation of a matrix multiplier based accelerator for 3D EKF SLAM. In International Conference on ReConFigurable Computing and FPGAs, ReConFig14, Cancun, Mexico, December 8–10, 2014. IEEE, 1–6.
- [63] 2016. FPGA design of EKF block accelerator for 3D visual SLAM. Comput. Electr. Eng. 55 (2016), 123–137.
- [64] 2006. Localisation for autonomous humanoid navigation. In 2006 6th IEEE-RAS International Conference on Humanoid Robots. 13–19.
- [65] 1996. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society (Series B) 58 (1996).
- [66] 1998. Bilateral filtering for gray and color images. In 6th International Conference on Computer Vision (ICCV), Bombay, India, January 4–7, 1998.
- [67] 2012. Real time simultaneous localization and mapping: Towards low-cost multiprocessor embedded systems. EURASIP J. Embed. Syst. 2012 (2012), 5.
- [68] 2015. ElasticFusion: Dense SLAM without a pose graph. In Robotics: Science and Systems.
- [69] 2013. Humanoid robot navigation: From a visual SLAM to a visual compass. In 10th IEEE International Conference on Networking, Sensing and Control, ICNSC 2013, Evry, France, April 10–12, 2013. 678–683.
- [70] 2020. CNN-based feature-point extraction for real-time visual SLAM on embedded FPGA. In 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Fayetteville, AR, USA, May 3–6, 2020. 33–37.
- [71] 2018. Dense RGB-D SLAM for humanoid robots in the dynamic humans environment. In IEEE-RAS International Conference on Humanoid Robots, Humanoids. Beijing, China, November 6–9, 2018. 270–276.
Index Terms
- Reconfigurable System-on-Chip Architectures for Robust Visual SLAM on Humanoid Robots