
The Video Yardstick

  • Conference paper
Modelling and Motion Capture Techniques for Virtual Environments (CAPTECH 1998)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 1537))


Abstract

Given uncalibrated video sequences, how can we recover rich descriptions of the scene content, beyond two-dimensional (2D) measurements such as color/texture or motion fields, to descriptions of shape and three-dimensional (3D) motion? This is the well-known structure from motion (SFM) problem. Until now, SFM algorithms have proceeded in two well-defined steps: the first and most important step recovers the rigid transformation between two views, and the second uses this transformation to compute the structure of the scene in view. This paper introduces a novel approach to structure from motion in which both steps are accomplished in a synergistic manner. It addresses the classical SFM problem for a calibrated camera as well as the extension to an uncalibrated optical device.

Existing approaches to estimating the viewing geometry are mostly based on optic flow, which, however, poses a problem at the locations of depth discontinuities. If we knew where the depth discontinuities were, we could (using any of a multitude of approaches based on smoothness constraints) accurately estimate flow values for image patches corresponding to smooth scene patches; but finding the discontinuities requires solving the structure from motion problem first. In the past this dilemma has been addressed by improving the estimation of flow through sophisticated optimization techniques, whose performance often depends on the scene in view.

In this paper we follow a different approach. We directly utilize the image derivatives and employ constraints that involve the 3D motion and shape of the scene, leading to a geometric and statistical estimation problem. The main idea rests on the interaction between 3D motion and shape, which allows us to estimate the 3D motion while at the same time segmenting the scene. If we use a wrong 3D motion estimate to compute depth, we obtain a distorted version of the depth function. The distortion, however, is such that the worse the motion estimate, the more likely we are to obtain depth estimates that are locally unsmooth, i.e., that vary more than the correct ones. Since local variability of depth is due either to the existence of a discontinuity or to a wrong 3D motion estimate, being able to differentiate between these two cases provides the correct motion, which yields the "smoothest" estimated depth as well as the image locations of scene discontinuities.

We analyze the new constraints introduced by our approach and show their relationship to the minimization of the epipolar constraint, which becomes a special case of our theory. Finally, we present a number of experimental results with real image sequences indicating the robustness of our method and the improvement over traditional methods. The resulting system is a video yardstick that can be applied to any video sequence to recover first the calibration parameters of the camera that captured the video and, subsequently, the structure of the scene.
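The abstract's central observation, that depth computed with a wrong 3D motion estimate comes out distorted and locally inconsistent, can be illustrated with a small numerical sketch. The code below is not the authors' algorithm: it assumes a calibrated camera (focal length 1), a pure translation, and exact synthetic flow, and all names (`roughness`, `t_true`, etc.) are illustrative. It shows only that the depth estimates obtained from the two flow components agree under the correct motion and disagree under a wrong one.

```python
import numpy as np

# Sketch of the idea: depth recovered under the wrong 3D motion is more
# variable/inconsistent than depth recovered under the correct motion.
# Calibrated camera (f = 1), pure translation t = (tx, ty, tz), and the
# standard translational flow model:
#   u = (x*tz - f*tx)/Z,   v = (y*tz - f*ty)/Z.

f = 1.0
n = 50
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))

# A smooth scene (no discontinuities) and the true camera translation.
Z_true = 3.0 + 0.5 * np.sin(2 * x) + 0.3 * np.cos(2 * y)
t_true = np.array([1.0, 0.8, 0.2])

# Synthesize the exact flow field induced by the true motion.
u = (x * t_true[2] - f * t_true[0]) / Z_true
v = (y * t_true[2] - f * t_true[1]) / Z_true

def roughness(t):
    """Inconsistency of depth recovered under candidate translation t.

    Each flow component gives its own depth estimate; for the correct
    motion the two agree everywhere (up to a global scale), while a
    wrong motion yields the "distorted depth" described in the paper.
    """
    tx, ty, tz = t / np.linalg.norm(t)      # direction only (scale is unrecoverable)
    Zu = (x * tz - f * tx) / u              # depth from horizontal flow
    Zv = (y * tz - f * ty) / v              # depth from vertical flow
    num = np.mean((Zu - Zv) ** 2)
    den = np.mean(Zu ** 2 + Zv ** 2) + 1e-12   # makes the score scale-invariant
    return num / den

score_true = roughness(t_true)
score_wrong = roughness(np.array([0.2, 1.0, 0.8]))
print(score_true, score_wrong)   # the correct motion scores (near) zero
```

Searching over candidate motion directions for the minimizer of such a variability score is, in spirit, how a wrong motion can be rejected without first computing a segmented flow field; the paper's actual constraints operate on image derivatives directly rather than on a precomputed flow.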

Patent pending.






Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brodský, T., Fermüller, C., Aloimonos, Y. (1998). The Video Yardstick. In: Magnenat-Thalmann, N., Thalmann, D. (eds) Modelling and Motion Capture Techniques for Virtual Environments. CAPTECH 1998. Lecture Notes in Computer Science(), vol 1537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49384-0_12

  • DOI: https://doi.org/10.1007/3-540-49384-0_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65353-0

  • Online ISBN: 978-3-540-49384-6

  • eBook Packages: Springer Book Archive
