Abstract
This paper describes an approach for tracking rigid and articulated objects using a view-based representation. The approach builds on and extends work on eigenspace representations, robust estimation techniques, and parameterized optical flow estimation. First, we note that the least-squares image reconstruction of standard eigenspace techniques has a number of problems, and we reformulate the reconstruction problem as one of robust estimation. Second, we define a “subspace constancy assumption” that allows us to exploit techniques for parameterized optical flow estimation to simultaneously solve for the view of an object and the affine transformation between the eigenspace and the image. To account for large affine transformations between the eigenspace and the image, we define a multi-scale eigenspace representation and a coarse-to-fine matching strategy. Finally, we use these techniques to track objects over long image sequences in which the objects simultaneously undergo both affine image motions and changes of view. In particular, we use this “EigenTracking” technique to track and recognize the gestures of a moving hand.
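The contrast between least-squares and robust reconstruction can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `robust_coefficients`, the Geman-McClure error norm, the scale `sigma`, and the use of iteratively reweighted least squares (IRLS) as the solver are all assumptions chosen for illustration. The basis `U` is assumed to have orthonormal columns, with the image flattened to a vector of the same length as a basis column.

```python
import numpy as np

def robust_coefficients(U, image, sigma=0.1, iters=20):
    """Estimate eigenspace coefficients c that minimize sum_i rho(image_i - (U c)_i).

    Assumptions (not taken from the abstract): rho is the Geman-McClure norm
    rho(r) = r^2 / (sigma^2 + r^2), and the minimization uses IRLS.
    """
    # Standard least-squares projection (valid because U is orthonormal).
    c = U.T @ image
    for _ in range(iters):
        r = image - U @ c                              # per-pixel residuals
        w = 2.0 * sigma**2 / (sigma**2 + r**2) ** 2    # IRLS weight psi(r) / r
        Uw = U * w[:, None]                            # rows of U scaled by weights
        # Weighted normal equations: (U^T W U) c = U^T W image
        c = np.linalg.solve(U.T @ Uw, Uw.T @ image)
    return c

# Hypothetical usage: a 3-dimensional basis over 200 pixels, 20 of them corrupted.
rng = np.random.default_rng(0)
U_demo, _ = np.linalg.qr(rng.standard_normal((200, 3)))
c_demo = np.array([1.0, -2.0, 0.5])
image_demo = U_demo @ c_demo
image_demo[:20] += 5.0                                 # simulated outlier pixels
print("least squares:", U_demo.T @ image_demo)
print("robust       :", robust_coefficients(U_demo, image_demo))
```

In this sketch the corrupted pixels receive weights close to zero after the first iteration, so they have little influence on the recovered coefficients, whereas the plain projection `U.T @ image` spreads their error across every coefficient.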
Cite this article
Black, M.J., Jepson, A.D. EigenTracking: Robust Matching and Tracking of Articulated Objects Using a View-Based Representation. International Journal of Computer Vision 26, 63–84 (1998). https://doi.org/10.1023/A:1007939232436