Global Structure-from-Motion Revisited

Pan, Linfei; Baráth, Dániel; Pollefeys, Marc; Schönberger, Johannes L.

doi:10.1007/978-3-031-73661-2_4

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15098)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Recovering 3D structure and camera motion from images has been a long-standing focus of computer vision research and is known as Structure-from-Motion (SfM). Solutions to this problem are categorized into incremental and global approaches. Until now, the most popular systems have followed the incremental paradigm due to its superior accuracy and robustness, while global approaches are drastically more scalable and efficient. With this work, we revisit the problem of global SfM and propose GLOMAP as a new general-purpose system that outperforms the state of the art in global SfM. In terms of accuracy and robustness, we achieve results on par with or superior to COLMAP, the most widely used incremental SfM system, while being orders of magnitude faster. We share our system as an open-source implementation at https://github.com/colmap/glomap.

Acknowledgment

The authors thank Philipp Lindenberger for the thoughtful discussions and comments on the text. This work was partially funded by the Hasler Stiftung Research Grant via the ETH Zurich Foundation and the ETH Zurich Career Seed Award. Linfei Pan was supported by gift funding from Microsoft.

Author information

Authors and Affiliations

ETH Zurich, Zürich, Switzerland
Linfei Pan, Dániel Baráth & Marc Pollefeys
Microsoft, Redmond, USA
Marc Pollefeys & Johannes L. Schönberger

Corresponding author

Correspondence to Linfei Pan.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

About this paper

Cite this paper

Pan, L., Baráth, D., Pollefeys, M., Schönberger, J.L. (2025). Global Structure-from-Motion Revisited. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15098. Springer, Cham. https://doi.org/10.1007/978-3-031-73661-2_4

DOI: https://doi.org/10.1007/978-3-031-73661-2_4
Published: 10 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73660-5
Online ISBN: 978-3-031-73661-2
eBook Packages: Computer Science, Computer Science (R0)
