Abstract
This paper describes the automatic acquisition of three-dimensional architectural models from short image sequences. The approach is Bayesian and model-based. Bayesian methods require a prior distribution to be formulated; however, designing a generative model for buildings is a difficult task. To overcome this, a building is described as a set of walls together with a ‘Lego’ kit of parameterised primitives, such as doors or windows. A prior on wall layout and a prior on the parameters of each primitive can then be defined. Part of this prior is learnt from training data and part comes from expert architects. The validity of the prior is tested by generating example buildings using Markov chain Monte Carlo (MCMC) and verifying that plausible buildings are generated under varying conditions. The same MCMC machinery can also be used to optimise the structure recovery, this time generating a range of possible solutions from the posterior. Presenting a range of solutions allows the user to select the best one when the structure recovery is ambiguous.
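The idea of testing a prior by sampling from it can be illustrated with a toy sketch. This is not the authors' implementation: the single "window" primitive, the wall dimensions, the Gaussian preferences in `log_prior`, and the plain random-walk Metropolis-Hastings sampler are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical wall size in metres (an assumption, not taken from the paper).
WALL_W, WALL_H = 10.0, 3.0


def log_prior(width, height):
    """Unnormalised log prior over one toy window primitive."""
    # Hard constraint: the window has positive size and fits on the wall.
    if not (0.0 < width < WALL_W and 0.0 < height < WALL_H):
        return -math.inf
    # Soft preferences (made-up values): width near 1.2 m,
    # height-to-width ratio near 1.5.
    lp_width = -0.5 * ((width - 1.2) / 0.3) ** 2
    lp_ratio = -0.5 * ((height / width - 1.5) / 0.2) ** 2
    return lp_width + lp_ratio


def metropolis_hastings(log_density, start, n_samples, step=0.1, seed=0):
    """Random-walk Metropolis-Hastings over a tuple of real parameters."""
    rng = random.Random(seed)
    current = tuple(start)
    current_lp = log_density(*current)
    if current_lp == -math.inf:
        raise ValueError("start point has zero density")
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal around the current state.
        proposal = tuple(x + rng.gauss(0.0, step) for x in current)
        proposal_lp = log_density(*proposal)
        # Accept with probability min(1, density ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


if __name__ == "__main__":
    draws = metropolis_hastings(log_prior, start=(1.0, 1.5), n_samples=2000)
    mean_width = sum(w for w, _ in draws) / len(draws)
    print(f"mean sampled window width: {mean_width:.2f} m")
```

Inspecting the draws is the toy analogue of checking that the prior generates plausible buildings. Passing a function that returns the log prior plus a log likelihood of the image data would make the same sampler draw from a posterior instead.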
About this article
Cite this article
Dick, A., Torr, P. & Cipolla, R. Modelling and Interpretation of Architecture from Several Images. International Journal of Computer Vision 60, 111–134 (2004). https://doi.org/10.1023/B:VISI.0000029665.07652.61