Abstract
This paper presents a simple but robust model-based approach to estimating the kinematics of a moving camera and the structure of the objects in a stationary environment using long, noisy, monocular image sequences. Both batch and recursive algorithms are presented, and the problem of occlusion is addressed. The approach represents the constant translational velocity and constant angular velocity of the camera motion using nine rectilinear motion parameters: the 3-D vectors of the position of the rotation center, the linear velocity, and the angular velocity. The structure parameters are the 3-D coordinates of the salient feature points in the inertial coordinate system. Due to redundancies in the parameterization, the total number of independent parameters to be estimated is 3M+7, where M is the number of feature points. The image plane coordinates of these feature points are first detected in each frame and matched over the frames. These noisy image coordinates serve as the input to our algorithms. Because perspective projection is nonlinear, the batch algorithm is formulated as a nonlinear least squares problem, and a conjugate gradient method is applied to find the solution. A recursive method using an Iterated Extended Kalman Filter (IEKF) for incremental estimation of motion and structure is also presented. Since the plant model is simple in our formulation, closed form solutions for the state and covariance transition equations are easily derived. Experimental results for simulated imagery as well as several real image sequences are included.
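The batch formulation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unit frame interval, the focal length default, and the convention that a point is mapped into the camera frame by the transpose of the accumulated rotation are all assumptions made for illustration. Occlusion is handled here simply by leaving unobserved (frame, point) pairs out of the sum, which is one plausible reading of the abstract rather than a confirmed detail.

```python
import math

def num_independent_parameters(m):
    """Independent parameter count stated in the abstract: 3M + 7."""
    return 3 * m + 7

def rodrigues(omega, t):
    """Rotation matrix after rotating at constant angular velocity omega for time t."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    theta = rate * t
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = wx / rate, wy / rate, wz / rate
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]

def project(point, center0, velocity, omega, t, focal=1.0):
    """Perspective projection of an inertial-frame point at time t (assumed convention)."""
    center = [center0[i] + velocity[i] * t for i in range(3)]
    rot = rodrigues(omega, t)
    d = [point[i] - center[i] for i in range(3)]
    # Camera-frame coordinates: transpose of the rotation applied to the offset.
    cam = [sum(rot[j][i] * d[j] for j in range(3)) for i in range(3)]
    return focal * cam[0] / cam[2], focal * cam[1] / cam[2]

def reprojection_cost(center0, velocity, omega, points, observations, focal=1.0):
    """Sum of squared image-plane residuals over the observed (frame, point) pairs."""
    total = 0.0
    for (frame, idx), (x_obs, y_obs) in observations.items():
        x, y = project(points[idx], center0, velocity, omega, float(frame), focal)
        total += (x - x_obs) ** 2 + (y - y_obs) ** 2
    return total

# Synthetic example: three points, five frames, one occluded observation.
true_center0 = (0.0, 0.0, 0.0)
true_velocity = (0.1, 0.0, 0.05)
true_omega = (0.0, 0.02, 0.0)
points = [(1.0, 0.5, 5.0), (-1.0, 0.2, 6.0), (0.3, -0.7, 4.0)]
observations = {}
for frame in range(5):
    for idx, p in enumerate(points):
        if (frame, idx) == (2, 1):
            continue  # simulated occlusion: this measurement is missing
        observations[(frame, idx)] = project(
            p, true_center0, true_velocity, true_omega, float(frame))

cost_at_truth = reprojection_cost(
    true_center0, true_velocity, true_omega, points, observations)
cost_perturbed = reprojection_cost(
    true_center0, (0.2, 0.0, 0.05), true_omega, points, observations)
```

A cost of this kind is what a nonlinear least squares solver, such as the conjugate gradient method mentioned in the abstract, would minimize over the motion and structure parameters. The sketch only evaluates the cost and does not perform the minimization or remove the parameter redundancies.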
Additional information
The support of the Advanced Research Projects Agency (ARPA order No. 8459), the U.S. Army Topographic Engineering Center under contract DACA 76-92-C-0009, and the Department of Electrical Engineering at the University of Maryland is gratefully acknowledged.
Cite this article
Wu, TH., Chellappa, R. & Zheng, Q. Experiments on estimating egomotion and structure parameters using long monocular image sequences. Int J Comput Vision 15, 77–103 (1995). https://doi.org/10.1007/BF01450850
DOI: https://doi.org/10.1007/BF01450850