Visual-model-based, real-time 3D pose tracking for autonomous navigation: methodology and experiments

de Ruiter, Hans; Benhabib, Beno

doi:10.1007/s10514-008-9094-7

Published: 09 July 2008

Autonomous Robots, volume 25, pages 267–286 (2008)

Abstract

This paper presents a novel 3D-model-based computer-vision method for tracking the full six degree-of-freedom (dof) pose (position and orientation) of a rigid body in real time. The methodology targets autonomous navigation tasks, such as interception of, or rendezvous with, mobile targets. Because it tracks an object’s complete six-dof pose, the proposed algorithm remains useful even when targets are not restricted to planar motion (e.g., flying or rough-terrain navigation). Tracking is achieved by combining textured model projection with optical flow. The main contribution of our work is the novel combination of optical flow with the z-buffer depth information produced during model projection, which allows six-dof tracking with a single camera.
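To illustrate why per-pixel depth makes single-camera six-dof tracking well posed, the sketch below uses the standard rigid-body motion-field equations: with known depth, each flow vector gives two linear constraints on the six velocity unknowns. This is a minimal illustration and not the authors' implementation. It assumes normalized image coordinates (focal length 1), depth already converted from z-buffer values to camera-frame distance, and motion expressed in the camera frame; the function names `flow_rows`, `solve_linear` and `estimate_twist` are hypothetical.

```python
def flow_rows(x, y, depth):
    """Two rows of the linear system relating image flow (u, v) at (x, y)
    to the object's velocity [vx, vy, vz, wx, wy, wz] in the camera frame."""
    inv_z = 1.0 / depth
    row_u = [inv_z, 0.0, -x * inv_z, -x * y, 1.0 + x * x, -y]
    row_v = [0.0, inv_z, -y * inv_z, -(1.0 + y * y), x * y, x]
    return row_u, row_v


def solve_linear(a, b):
    """Solve a square system a * x = b by Gaussian elimination with
    partial pivoting (no check for a singular matrix in this sketch)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        residual = m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = residual / m[i][i]
    return x


def estimate_twist(samples):
    """Least-squares estimate of [vx, vy, vz, wx, wy, wz] from samples of
    (x, y, depth, u, v), accumulated through the normal equations."""
    ata = [[0.0] * 6 for _ in range(6)]
    atb = [0.0] * 6
    for x, y, depth, u, v in samples:
        for row, rhs in zip(flow_rows(x, y, depth), (u, v)):
            for i in range(6):
                atb[i] += row[i] * rhs
                for j in range(6):
                    ata[i][j] += row[i] * row[j]
    return solve_linear(ata, atb)
```

At least three well-spread pixels are needed for the six unknowns; in practice many more would be used so that noise in the flow averages out. Without depth, the translational terms are known only up to scale, which is the ambiguity that the depth information removes.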

A localized illumination normalization filter has also been developed to improve robustness to shading. Real-time operation is achieved using GPU-based filters and a new data-reduction algorithm based on colour-gradient redundancy, developed as part of this project. Colour-gradient redundancy is an important property of colour images: the gradients of all colour channels are generally aligned. Exploiting this property provides a threefold increase in speed. A processing rate of approximately 80 to 100 fps was obtained on both synthetic and real target-motion sequences. Sub-pixel accuracies were obtained in tests performed under different lighting conditions.
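The colour-gradient redundancy property can be checked numerically. The abstract does not describe the data-reduction algorithm itself, so the sketch below only illustrates the property: it measures how well the three channel gradients at a pixel are aligned and picks the channel with the strongest gradient as a single representative. Keeping one channel instead of three is one plausible reading of the threefold saving and is an assumption here. The image layout (nested lists of `(r, g, b)` tuples) and the function names are hypothetical.

```python
import math


def channel_gradients(image, row, col):
    """Central-difference gradient (gx, gy) of each colour channel at an
    interior pixel of an image stored as image[row][col] = (r, g, b)."""
    grads = []
    for ch in range(3):
        gx = (image[row][col + 1][ch] - image[row][col - 1][ch]) / 2.0
        gy = (image[row + 1][col][ch] - image[row - 1][col][ch]) / 2.0
        grads.append((gx, gy))
    return grads


def strongest_channel(grads):
    """Index of the channel whose gradient has the largest magnitude."""
    return max(range(3), key=lambda ch: math.hypot(*grads[ch]))


def alignment(grads, eps=1e-12):
    """Smallest |cos(angle)| between the strongest gradient and the other
    channel gradients; 1.0 means all non-zero gradients are parallel."""
    ref = grads[strongest_channel(grads)]
    ref_norm = math.hypot(*ref)
    worst = 1.0
    for gx, gy in grads:
        norm = math.hypot(gx, gy)
        if norm < eps or ref_norm < eps:
            continue  # a flat channel carries no direction to compare
        cosine = abs(gx * ref[0] + gy * ref[1]) / (norm * ref_norm)
        worst = min(worst, cosine)
    return worst
```

When the alignment is close to 1, the per-channel gradient constraints at that pixel point in nearly the same direction, so processing only the strongest channel discards little directional information.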

Author information

Authors and Affiliations

Computer Integrated Manufacturing Laboratory, Department of Mechanical and Industrial Engineering, University of Toronto, 5 King’s College Road, Toronto, Ontario, M5S 3G8, Canada
Hans de Ruiter & Beno Benhabib

Corresponding author

Correspondence to Hans de Ruiter.

About this article

Cite this article

de Ruiter, H., Benhabib, B. Visual-model-based, real-time 3D pose tracking for autonomous navigation: methodology and experiments. Auton Robot 25, 267–286 (2008). https://doi.org/10.1007/s10514-008-9094-7

Received: 14 March 2007
Accepted: 05 June 2008
Published: 09 July 2008
Issue Date: October 2008
DOI: https://doi.org/10.1007/s10514-008-9094-7
