Modelling and Interpretation of Architecture from Several Images

Dick, A.R.; Torr, P.H.S.; Cipolla, R.

doi:10.1023/B:VISI.0000029665.07652.61

Published: November 2004

Volume 60, pages 111–134, (2004)

International Journal of Computer Vision

Abstract

This paper describes the automatic acquisition of three-dimensional architectural models from short image sequences. The approach is Bayesian and model-based. Bayesian methods require a prior distribution; however, designing a generative model for buildings is a difficult task. To overcome this, a building is described as a set of walls together with a ‘Lego’ kit of parameterised primitives, such as doors or windows. A prior on wall layout and a prior on the parameters of each primitive can then be defined. Part of this prior is learnt from training data and part comes from expert architects. The validity of the prior is tested by generating example buildings using MCMC and verifying that plausible buildings are generated under varying conditions. The same MCMC machinery can also be used to optimise the structure recovery, this time generating a range of possible solutions from the posterior. Because a range of solutions can be presented, the user can select the best one when the structure recovery is ambiguous.
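
The idea of checking a prior by sampling from it with MCMC can be illustrated with a minimal sketch. This is not the authors' sampler or their prior: the wall dimensions, the single window primitive with parameters (x, width, height), the Gaussian preference terms and the function names `log_prior` and `metropolis` are all assumptions made up for illustration. It uses plain random-walk Metropolis with a symmetric Gaussian proposal.

```python
import math
import random

# Hypothetical wall size in metres (an assumption, not from the paper).
WALL_WIDTH, WALL_HEIGHT = 10.0, 3.0


def log_prior(window):
    """Unnormalised log prior for a toy window primitive (x, width, height)."""
    x, w, h = window
    # Hard constraint: the window has positive size and lies inside the wall.
    if w <= 0 or h <= 0 or x < 0 or x + w > WALL_WIDTH or h > WALL_HEIGHT:
        return -math.inf
    # Soft preferences: made-up Gaussian terms standing in for learnt statistics.
    return -0.5 * ((w - 1.2) / 0.3) ** 2 - 0.5 * ((h - 1.5) / 0.4) ** 2


def metropolis(log_density, start, steps, step_size=0.2, seed=0):
    """Random-walk Metropolis over a tuple of real parameters.

    `start` must have finite log density, otherwise no move is ever accepted.
    """
    rng = random.Random(seed)
    current, current_lp = start, log_density(start)
    samples = []
    for _ in range(steps):
        # Symmetric Gaussian proposal around the current state.
        proposal = tuple(v + rng.gauss(0.0, step_size) for v in current)
        proposal_lp = log_density(proposal)
        delta = proposal_lp - current_lp
        # Accept with probability min(1, p(proposal) / p(current)).
        if delta >= 0 or rng.random() < math.exp(delta):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


samples = metropolis(log_prior, start=(4.0, 1.2, 1.5), steps=2000)
mean_width = sum(s[1] for s in samples) / len(samples)
print(f"mean sampled window width: {mean_width:.2f} m")
```

Inspecting the sampled windows is the toy analogue of checking that the prior generates plausible buildings. Under the same assumptions, passing a function that adds an image log-likelihood to `log_prior` would make the same routine draw from a posterior instead, giving a range of candidate solutions rather than a single estimate.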

Author information

Authors and Affiliations

Department of Engineering, University of Cambridge, Cambridge, CB2 1PZ, UK
A.R. Dick & R. Cipolla
Department of Computing, Oxford Brookes University, Wheatley, Oxford, OX33 1HX, UK
P.H.S. Torr

About this article

Cite this article

Dick, A., Torr, P. & Cipolla, R. Modelling and Interpretation of Architecture from Several Images. International Journal of Computer Vision 60, 111–134 (2004). https://doi.org/10.1023/B:VISI.0000029665.07652.61

Issue Date: November 2004
DOI: https://doi.org/10.1023/B:VISI.0000029665.07652.61
