Abstract
This paper addresses the recovery of structure and motion from uncalibrated images of a scene under full perspective or affine projection. Particular emphasis is placed on the two-view configuration, while the extension to $N$ views is given in the Appendix. A unified expression of the fundamental matrix is derived which is valid for any projection model without lens distortion (including full perspective and affine cameras). Affine reconstruction is treated as a special case of projective reconstruction. The theory is developed so that any reader with a knowledge of linear algebra can follow the discussion without difficulty. A new technique for affine reconstruction is developed, which consists of first estimating the affine epipolar geometry and then triangulating each point match with respect to an implicit common affine basis.
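To make the first step named above concrete, the following is a minimal sketch of one common way to estimate affine epipolar geometry from point matches. It is not necessarily the estimator used in this paper. It assumes noise-free or mildly noisy matches with no outliers, and fits the affine epipolar constraint $a x' + b y' + c x + d y + e = 0$ by orthogonal least squares. The function names `estimate_affine_fundamental` and `epipolar_residuals` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def estimate_affine_fundamental(pts1, pts2):
    """Fit an affine fundamental matrix to N >= 4 point matches.

    pts1, pts2: (N, 2) arrays of matched image points (x, y) and (x', y').
    Returns a 3x3 matrix F with x2_h^T F x1_h = a*x' + b*y' + c*x + d*y + e.
    This is an illustrative linear fit, not the paper's own procedure.
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    if pts1.ndim != 2 or pts1.shape[1] != 2 or pts1.shape != pts2.shape:
        raise ValueError("pts1 and pts2 must both have shape (N, 2)")
    if len(pts1) < 4:
        raise ValueError("at least 4 point matches are required")

    # Each match becomes one 4-vector (x', y', x, y).
    X = np.hstack([pts2, pts1])
    centroid = X.mean(axis=0)

    # The hyperplane normal (a, b, c, d) is the right singular vector
    # associated with the smallest singular value of the centered data.
    _, _, vt = np.linalg.svd(X - centroid)
    normal = vt[-1]
    a, b, c, d = normal

    # The best-fit hyperplane passes through the centroid.
    e = -normal @ centroid

    return np.array([[0.0, 0.0, a],
                     [0.0, 0.0, b],
                     [c,   d,   e]])


def epipolar_residuals(F, pts1, pts2):
    """Algebraic residual x2_h^T F x1_h for every match."""
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    h1 = np.column_stack([pts1, np.ones(len(pts1))])
    h2 = np.column_stack([pts2, np.ones(len(pts2))])
    return np.einsum("ij,jk,ik->i", h2, F, h1)
```

Because the normal vector returned by the SVD has unit length, each residual is the signed distance of the 4-vector $(x', y', x, y)$ from the fitted hyperplane. With real data, a robust scheme would normally be wrapped around such a fit to handle mismatches.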
Cite this article
Zhang, Z., Xu, G. A Unified Theory of Uncalibrated Stereo for Both Perspective and Affine Cameras. Journal of Mathematical Imaging and Vision 9, 213–229 (1998). https://doi.org/10.1023/A:1008341803636