Abstract
Recovering 3D structure and camera motion from images has been a long-standing focus of computer vision research and is known as Structure-from-Motion (SfM). Solutions to this problem are categorized into incremental and global approaches. Until now, the most popular systems have followed the incremental paradigm because of its superior accuracy and robustness, even though global approaches are drastically more scalable and efficient. In this work, we revisit the problem of global SfM and propose GLOMAP, a new general-purpose system that outperforms the state of the art in global SfM. In terms of accuracy and robustness, we achieve results on par with or superior to COLMAP, the most widely used incremental SfM system, while being orders of magnitude faster. We share our system as an open-source implementation at https://github.com/colmap/glomap.
Acknowledgment
The authors thank Philipp Lindenberger for the thoughtful discussions and comments on the text. This work was partially funded by the Hasler Stiftung Research Grant via the ETH Zurich Foundation and the ETH Zurich Career Seed Award. Linfei Pan was supported by gift funding from Microsoft.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pan, L., Baráth, D., Pollefeys, M., Schönberger, J.L. (2025). Global Structure-from-Motion Revisited. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15098. Springer, Cham. https://doi.org/10.1007/978-3-031-73661-2_4
DOI: https://doi.org/10.1007/978-3-031-73661-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73660-5
Online ISBN: 978-3-031-73661-2
eBook Packages: Computer Science (R0)