Abstract
We present a Direct Visual Odometry (VO) algorithm for multi-camera rigs that allows for flexible connections between cameras and runs in real time at high frame rates on a GPU for stereo setups. In contrast to feature-based VO methods, Direct VO aligns images directly to depth-enhanced previous images based on the photoconsistency of all high-contrast pixels. By using a multi-camera setup, we can introduce an absolute scale into our reconstruction. Multiple views also allow us to obtain depth from multiple disparity sources: static disparity between the different cameras of the rig, and temporal disparity obtained by exploiting rig motion. We propose a joint optimization of the rig poses and the camera poses within the rig, which enables working with flexible rigs. We show that sub-pixel rigidity is difficult to manufacture for cameras with 720p or higher resolution, which makes this feature important, particularly in current and future (semi-)autonomous cars or drones. Consequently, we evaluate our approach on our own real-world and synthetic datasets that exhibit flexibility in the rig, in addition to sequences from the established KITTI dataset.
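The two ideas the abstract relies on, metric depth from static stereo disparity and photoconsistency of high-contrast pixels, can be illustrated with a minimal sketch. This is not the authors' GPU implementation and it omits the joint optimization of rig and camera poses. It assumes a rectified pinhole camera, and all names (`depth_from_disparity`, `photometric_residuals`, `grad_thresh`, the intrinsics tuple `K`) are hypothetical choices made for this illustration.

```python
import math

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Metric depth for a rectified stereo pair: z = f * b / d.
    The known baseline is what introduces an absolute scale."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

def bilinear(img, x, y):
    """Sample a row-major grayscale image (at least 2x2) at a sub-pixel
    location; returns None if the location is outside the image."""
    h, w = len(img), len(img[0])
    if not (0 <= x <= w - 1 and 0 <= y <= h - 1):
        return None
    x0, y0 = min(int(x), w - 2), min(int(y), h - 2)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x0 + 1]
    bot = (1 - ax) * img[y0 + 1][x0] + ax * img[y0 + 1][x0 + 1]
    return (1 - ay) * top + ay * bot

def photometric_residuals(ref, cur, depth, K, R, t, grad_thresh=10.0):
    """Intensity differences cur(warp(p)) - ref(p) for reference pixels
    whose gradient magnitude is at least grad_thresh (an assumed
    stand-in for 'high-contrast pixels').

    ref, cur: grayscale images as lists of rows; depth: per-pixel depth
    of ref; K = (fx, fy, cx, cy); R (3x3) and t (3) map points from the
    reference camera frame into the current camera frame.
    """
    fx, fy, cx, cy = K
    h, w = len(ref), len(ref[0])
    residuals = []
    for v in range(1, h - 1):
        for u in range(1, w - 1):
            # Central-difference gradient selects high-contrast pixels.
            gx = 0.5 * (ref[v][u + 1] - ref[v][u - 1])
            gy = 0.5 * (ref[v + 1][u] - ref[v - 1][u])
            if math.hypot(gx, gy) < grad_thresh:
                continue
            z = depth[v][u]
            # Back-project the pixel to a 3D point in the reference frame.
            p = [(u - cx) / fx * z, (v - cy) / fy * z, z]
            # Apply the candidate rigid-body motion.
            q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            if q[2] <= 0:
                continue  # point is behind the current camera
            # Project into the current image and compare intensities.
            sample = bilinear(cur, fx * q[0] / q[2] + cx, fy * q[1] / q[2] + cy)
            if sample is not None:
                residuals.append(sample - ref[v][u])
    return residuals
```

A direct method would minimize the sum of squared (or robustly weighted) residuals over the pose parameters, for example with a Levenberg-Marquardt style solver. The sketch only evaluates the residuals for a given pose, so that the cost being minimized is explicit.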
Acknowledgements
This work was supported by Daimler AG, Germany. Real-world flexible stereo rig datasets were kindly provided by Dr. Senya Polikovsky, OSLab, Max Planck Institute for Intelligent Systems Tübingen.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Resch, B., Wei, J., Lensch, H.P.A. (2017). Real Time Direct Visual Odometry for Flexible Multi-camera Rigs. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10114. Springer, Cham. https://doi.org/10.1007/978-3-319-54190-7_31
DOI: https://doi.org/10.1007/978-3-319-54190-7_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54189-1
Online ISBN: 978-3-319-54190-7
eBook Packages: Computer Science (R0)