
Less restrictive camera odometry estimation from monocular camera

Published in: Multimedia Tools and Applications

Abstract

This paper addresses the problem of estimating camera motion from an uncalibrated monocular camera. In contrast to existing methods that rely on restrictive assumptions, we propose a method that estimates camera motion under far fewer restrictions by adopting new example-based techniques that compensate for the missing information. Specifically, we estimate the focal length of the camera by referring to visually similar training images whose focal lengths are known. For each step of camera motion estimation, we refer to stationary points (landmark points) whose depths are estimated from RGB-D candidates. In addition to landmark points, moving objects can also serve as an information source for estimating the camera motion. Our method therefore simultaneously estimates the camera motion in a video and the 3D trajectories of the objects in it, using Reversible Jump Markov Chain Monte Carlo (RJ-MCMC) particle filtering. Evaluation on challenging datasets demonstrates the method's effectiveness and efficiency.
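The example-based focal-length step described above can be sketched as a weighted nearest-neighbour regression over training images with known focal lengths. The sketch below is illustrative only: the descriptor vectors, the distance weighting, and the `estimate_focal_length` helper are assumptions for the sake of a runnable toy example, not the paper's exact formulation.

```python
import numpy as np

def estimate_focal_length(query_feat, train_feats, train_focals, k=5):
    """Estimate a focal length as the distance-weighted average over the
    k training images whose appearance descriptors are most similar."""
    d = np.linalg.norm(train_feats - query_feat, axis=1)  # distances to all training images
    idx = np.argsort(d)[:k]                               # k most similar images
    w = 1.0 / (d[idx] + 1e-8)                             # closer neighbours weigh more
    return float(np.sum(w * train_focals[idx]) / np.sum(w))

# Toy data: 100 "training images" with random descriptors and focal lengths.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 64))
focals = rng.uniform(400, 1200, size=100)

# A query whose descriptor is close to training image 3 should receive
# an estimate close to that image's focal length.
query = feats[3] + 0.01 * rng.normal(size=64)
print(estimate_focal_length(query, feats, focals))
```

In practice the descriptors would come from a holistic scene representation computed on real images, and the retrieved neighbours would share the camera geometry of the query scene; the weighting scheme here is one simple choice among several.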



Notes

  1. https://github.com/BVLC/caffe/tree/master/models/bvlc_reference_caffenet

  2. http://www.cvlibs.net/datasets/kitti/eval_odometry.php


Author information


Corresponding author

Correspondence to Zeyd Boukhers.


About this article


Cite this article

Boukhers, Z., Shirahama, K. & Grzegorzek, M. Less restrictive camera odometry estimation from monocular camera. Multimed Tools Appl 77, 16199–16222 (2018). https://doi.org/10.1007/s11042-017-5195-7

