Stochastic Approximation and Rate-Distortion Analysis for Robust Structure and Motion Estimation

Chowdhury, Amit K. Roy; Chellappa, R.

doi:10.1023/A:1024488407740

Published: October 2003

Volume 55, pages 27–53 (2003)

International Journal of Computer Vision

Amit K. Roy Chowdhury¹ & R. Chellappa¹

Abstract

Recent research on structure and motion recovery has focused on issues related to sensitivity and robustness of existing techniques. One possible reason is that in practical applications, the underlying assumptions made by existing algorithms are often violated. In this paper, we propose a framework for 3D reconstruction from short monocular video sequences taking into account the statistical errors in reconstruction algorithms. Detailed error analysis is especially important for this problem because the motion between pairs of frames is small and slight perturbations in its estimates can lead to large errors in 3D reconstruction. We focus on the following issues: physical sources of errors, their experimental and theoretical analysis, robust estimation techniques and measures for characterizing the quality of the final reconstruction. We derive a precise relationship between the error in the reconstruction and the error in the image correspondences. The error analysis is used to design a robust, recursive multi-frame fusion algorithm using “stochastic approximation” as the framework since it is capable of dealing with incomplete information about errors in observations. Rate-distortion analysis is proposed for evaluating the quality of the final reconstruction as a function of the number of frames and the error in the image correspondences. Finally, to demonstrate the effectiveness of the algorithm, examples of depth reconstruction are shown for different video sequences.
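
As a rough illustration of the kind of recursive fusion the abstract refers to, the sketch below applies the classical Robbins-Monro stochastic approximation update to a stream of noisy scalar depth estimates. It is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `robbins_monro_fuse`, the gain schedule a_k = 1/k, the Gaussian noise model, and the simulated depth value are all hypothetical choices made for the example.

```python
import random


def robbins_monro_fuse(estimates, gain=lambda k: 1.0 / k):
    """Recursively fuse noisy scalar estimates (hypothetical helper).

    Update rule: x_k = x_{k-1} - a_k * (x_{k-1} - y_k).
    With the default gain a_k = 1/k, x_k equals the running mean of y_1..y_k.
    Returns the fused estimate after each new observation.
    """
    x = 0.0  # initial value is irrelevant for a_k = 1/k because a_1 = 1
    history = []
    for k, y in enumerate(estimates, start=1):
        x = x - gain(k) * (x - y)
        history.append(x)
    return history


if __name__ == "__main__":
    # Assumed setup: one scene point at depth 5.0, observed through
    # independent zero-mean Gaussian errors, one estimate per frame pair.
    rng = random.Random(0)
    true_depth = 5.0
    observations = [true_depth + rng.gauss(0.0, 1.0) for _ in range(200)]
    fused = robbins_monro_fuse(observations)
    # Squared error of the fused estimate after 1, 10, 100 and 200 frames.
    for n in (1, 10, 100, 200):
        print(n, (fused[n - 1] - true_depth) ** 2)
```

Printing the squared error against the number of fused frames gives a crude view of how reconstruction quality depends on the number of frames. The paper's rate-distortion analysis addresses that dependence formally, and this toy loop does not reproduce it.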

Author information

Authors and Affiliations

Center for Automation Research and Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, 20742
Amit K. Roy Chowdhury & R. Chellappa

About this article

Cite this article

Chowdhury, A.K.R., Chellappa, R. Stochastic Approximation and Rate-Distortion Analysis for Robust Structure and Motion Estimation. International Journal of Computer Vision 55, 27–53 (2003). https://doi.org/10.1023/A:1024488407740

Issue Date: October 2003
DOI: https://doi.org/10.1023/A:1024488407740
