Abstract
We describe a linear-time algorithm that recovers absolute camera orientations and positions, along with uncertainty estimates, for networks of terrestrial image nodes spanning hundreds of meters in outdoor urban scenes. The algorithm produces pose estimates globally consistent to roughly 0.1° (2 milliradians) and 5 centimeters on average, or about four pixels of epipolar alignment.
We assume that adjacent nodes observe overlapping portions of the scene, and that at least two distinct vanishing points are observed by each node. The algorithm decouples registration into pure rotation and translation stages. The rotation stage aligns nodes to commonly observed scene line directions; the translation stage assigns node positions consistent with locally estimated motion directions, then registers the resulting network to absolute (Earth) coordinates.
The paper's principal contributions include: extension of classic registration methods to large scale and dimensional extent; a consistent probabilistic framework for modeling projective uncertainty; and a new hybrid of Hough transform and expectation maximization algorithms.
We assess the algorithm's performance on synthetic and real data, and draw several conclusions. First, by fusing thousands of observations the algorithm achieves accurate registration even in the face of significant lighting variations, low-level feature noise, and error in initial pose estimates. Second, the algorithm's robustness and accuracy increase with image field of view. Third, the algorithm surmounts the usual tradeoff between speed and accuracy; it is both faster and more accurate than manual bundle adjustment.
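As a rough illustration of the rotation stage described above, in which nodes are aligned to commonly observed scene line directions, the following sketch recovers a single camera rotation from matched vanishing-point directions. It uses the standard SVD-based orthogonal Procrustes (Kabsch) solution, which is a stand-in and not necessarily the estimator used in the paper. The function name `rotation_from_directions`, the helper `rot_z`, and the test data are hypothetical. The sketch also assumes that direction correspondences and signs are already known, whereas the paper treats both as uncertain.

```python
import numpy as np

def rotation_from_directions(src, dst):
    """Least-squares rotation R such that R @ src[i] ~= dst[i].

    src, dst: (N, 3) arrays of matched unit directions, with N >= 2 and
    at least two non-parallel directions (the abstract's assumption of
    two distinct vanishing points per node).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Cross-covariance of the matched directions.
    H = src.T @ dst
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so the result is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T

def rot_z(theta):
    """Rotation by theta radians about the z axis (hypothetical test helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Two scene line directions (say, one horizontal and one vertical) as
# seen in a reference frame and in a camera rotated by 10 degrees.
true_R = rot_z(np.deg2rad(10.0))
scene_dirs = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
camera_dirs = scene_dirs @ true_R.T  # each row is true_R @ direction

R_est = rotation_from_directions(scene_dirs, camera_dirs)
print(np.allclose(R_est, true_R))
```

Two non-parallel directions are enough to fix all three rotational degrees of freedom, which is consistent with the requirement that each node observe at least two distinct vanishing points. With noisy directions the same routine returns the least-squares rotation, but it provides no uncertainty estimate.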
Cite this article
Antone, M., Teller, S. Scalable Extrinsic Calibration of Omni-Directional Image Networks. International Journal of Computer Vision 49, 143–174 (2002). https://doi.org/10.1023/A:1020141505696
DOI: https://doi.org/10.1023/A:1020141505696