Applications of structure from motion: a survey

  • Review
Journal of Zhejiang University SCIENCE C

Abstract

Structure from motion (SfM) has been an active research area in computer vision for decades, and numerous practical applications benefit from this research. Because no previous work has summarized the applications appearing in the literature, this paper presents a comprehensive overview of recent applications of SfM, classifying them into 10 categories: augmented reality, autonomous navigation/guidance, motion capture, hand-eye calibration, image/video processing, image-based 3D modeling, remote sensing, image organization/browsing, segmentation and recognition, and military applications. The goal is to help researchers position their work more appropriately in the context of existing techniques and to identify both new applications and relevant research problems.

Author information

Corresponding author

Correspondence to Ying-mei Wei.

Additional information

Project (No. 61070140) supported by the National Natural Science Foundation of China

About this article

Cite this article

Wei, Ym., Kang, L., Yang, B. et al. Applications of structure from motion: a survey. J. Zhejiang Univ. - Sci. C 14, 486–494 (2013). https://doi.org/10.1631/jzus.CIDE1302

  • DOI: https://doi.org/10.1631/jzus.CIDE1302
