Stochastic Approximation and Rate-Distortion Analysis for Robust Structure and Motion Estimation

International Journal of Computer Vision

Abstract

Recent research on structure and motion recovery has focused on issues related to sensitivity and robustness of existing techniques. One possible reason is that in practical applications, the underlying assumptions made by existing algorithms are often violated. In this paper, we propose a framework for 3D reconstruction from short monocular video sequences taking into account the statistical errors in reconstruction algorithms. Detailed error analysis is especially important for this problem because the motion between pairs of frames is small and slight perturbations in its estimates can lead to large errors in 3D reconstruction. We focus on the following issues: physical sources of errors, their experimental and theoretical analysis, robust estimation techniques and measures for characterizing the quality of the final reconstruction. We derive a precise relationship between the error in the reconstruction and the error in the image correspondences. The error analysis is used to design a robust, recursive multi-frame fusion algorithm using “stochastic approximation” as the framework since it is capable of dealing with incomplete information about errors in observations. Rate-distortion analysis is proposed for evaluating the quality of the final reconstruction as a function of the number of frames and the error in the image correspondences. Finally, to demonstrate the effectiveness of the algorithm, examples of depth reconstruction are shown for different video sequences.
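As a rough illustration of the recursive fusion idea mentioned above, the following minimal sketch applies the classical Robbins-Monro stochastic approximation recursion to a stream of noisy depth estimates for a single scene point. It is not the authors' algorithm: the function name `fuse_depth_estimates`, the scalar depth model, the Gaussian noise, and the step size a_k = 1/k are all assumptions chosen for illustration.

```python
import random


def fuse_depth_estimates(estimates, initial=0.0):
    """Fuse noisy scalar estimates with a Robbins-Monro recursion.

    Update rule (illustrative assumption, not taken from the paper):
        x_k = x_{k-1} - a_k * (x_{k-1} - y_k),  with a_k = 1 / k
    """
    x = initial
    for k, y in enumerate(estimates, start=1):
        gain = 1.0 / k            # sum of a_k diverges, sum of a_k^2 converges
        x = x - gain * (x - y)    # step toward the newest noisy observation
    return x


if __name__ == "__main__":
    rng = random.Random(0)        # fixed seed so the run is repeatable
    true_depth = 5.0              # hypothetical ground-truth depth
    # Hypothetical per-frame-pair depth estimates corrupted by Gaussian noise.
    noisy = [true_depth + rng.gauss(0.0, 1.0) for _ in range(200)]
    print(fuse_depth_estimates(noisy))
```

With the step size a_k = 1/k the recursion reduces to the running sample mean, so the noise distribution never has to be specified. This is the sense in which such a scheme can work with incomplete information about observation errors. A robust variant would replace the plain residual with a bounded or median-based update.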

About this article

Cite this article

Chowdhury, A.K.R., Chellappa, R. Stochastic Approximation and Rate-Distortion Analysis for Robust Structure and Motion Estimation. International Journal of Computer Vision 55, 27–53 (2003). https://doi.org/10.1023/A:1024488407740

  • DOI: https://doi.org/10.1023/A:1024488407740