Tracking While Zooming Using Affine Transfer and Multifocal Tensors

International Journal of Computer Vision

Abstract

This paper presents algorithms for tracking unknown objects in the presence of zoom. Since prior models are unavailable, point and line matches in affine views are used to characterize the structure and to transfer a fixation point into new images in a sequence. Because any affine projection matrix is permitted, the intrinsic camera parameters such as focal length may change freely. Also, since the techniques do not require long feature tracks, a further desirable property is insensitivity to partial occlusion caused, for instance, by part of the object falling off the image plane while zooming in. If only point matches are available, a previous method based on factorization is applied. When also incorporating lines, the affine trifocal and quadrifocal tensors are used for tracking in monocular and stereo systems respectively. Methods for computing the tensors, minimizing algebraic error, are developed. In comparison with their projective counterparts, the affine tensors offer significant advantages in terms of computation time and convenience of parameterization, and the relations between the different tensors are shown to be much simpler. Successful tracking is demonstrated on several real image sequences.
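The point-only case described above (recover affine structure from matches, then transfer a fixation point into a new view) can be illustrated with a minimal sketch. It assumes a generic rank-3 factorization in the style of Tomasi and Kanade, noise-free synthetic data, and exact matches; it is not the authors' implementation. The names `factorize`, `transfer_fixation`, and `synthetic_check` are hypothetical and chosen only for illustration.

```python
import numpy as np


def factorize(W):
    """Rank-3 affine factorization of a (2m, n) matrix of n points in m views."""
    t = W.mean(axis=1, keepdims=True)             # per-view centroid of the points
    U, s, Vt = np.linalg.svd(W - t, full_matrices=False)
    M = U[:, :3] * s[:3]                          # (2m, 3) stacked affine cameras
    X = Vt[:3]                                    # (3, n) affine structure
    return M, t[:, 0], X


def transfer_fixation(W, fix, x_new, visible):
    """Transfer a fixation point into a new affine view.

    W       : (2m, n) point matches in the m earlier views
    fix     : (2m,) fixation point in those same views
    x_new   : (2, n) point positions in the new view
    visible : (n,) boolean mask of points actually seen in the new view
    """
    M, t, X = factorize(W)
    # Affine coordinates of the fixation point in the recovered frame.
    Xf, *_ = np.linalg.lstsq(M, fix - t, rcond=None)
    # Fit the new view's affine camera from the visible points only
    # (at least four points in general position are needed).
    k = int(visible.sum())
    A = np.hstack([X[:, visible].T, np.ones((k, 1))])
    P, *_ = np.linalg.lstsq(A, x_new[:, visible].T, rcond=None)
    return np.append(Xf, 1.0) @ P


def synthetic_check(seed=0):
    """Noise-free check with a change of zoom and partial occlusion (assumed data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(3, 8))                   # eight scene points
    F = rng.normal(size=3)                        # scene point being fixated

    def view(zoom):
        A = zoom * rng.normal(size=(2, 3))        # arbitrary affine camera
        b = rng.normal(size=2)
        return A @ X + b[:, None], A @ F + b

    (x1, f1), (x2, f2), (x3, f3) = view(1.0), view(1.5), view(4.0)
    W = np.vstack([x1, x2])
    fix = np.concatenate([f1, f2])
    visible = np.array([True] * 5 + [False] * 3)  # three points fall out of view
    return transfer_fixation(W, fix, x3, visible), f3
```

Because the new camera is fitted as an unconstrained 2x4 affine matrix, a change of focal length is absorbed into the fit, and only the points still visible in the new image are used, which is the property the abstract relies on when part of the object leaves the image during zoom.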



About this article

Cite this article

Hayman, E., Thórhallsson, T. & Murray, D. Tracking While Zooming Using Affine Transfer and Multifocal Tensors. International Journal of Computer Vision 51, 37–62 (2003). https://doi.org/10.1023/A:1020988723254


  • DOI: https://doi.org/10.1023/A:1020988723254