Abstract
Structure from motion (SFM) is an effective approach for reconstructing large-scale 3D scenes from multiple images. Many local feature methods have been proposed to detect feature points and compute descriptors, so choosing a suitable feature among the existing methods is an important problem when designing a robust SFM system. In this paper, we aim to help users make this choice experimentally for large-scale 3D reconstruction, where many high-resolution images are captured. To this end, we conduct a comprehensive evaluation of several local features on ground-truth datasets. Experimental results show that SIFT and SURF perform better than some binary features such as ORB and BRISK.
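The two feature families compared above differ in how descriptors are matched: float descriptors such as SIFT and SURF are usually compared with Euclidean distance, while binary descriptors such as ORB and BRISK are compared with Hamming distance. The following is a minimal sketch of that difference, not the evaluation protocol used in the paper. The toy descriptors, the helper names `hamming`, `euclidean`, and `match`, and the 0.8 ratio-test threshold are all assumptions made for illustration.

```python
from math import dist


def hamming(a: int, b: int) -> int:
    """Count differing bits between two binary descriptors packed as ints."""
    return bin(a ^ b).count("1")


def euclidean(a, b) -> float:
    """Euclidean (L2) distance between two float descriptors."""
    return dist(a, b)


def match(query, train, distance, ratio=0.8):
    """Brute-force matching with a nearest/second-nearest ratio test.

    Returns (query_index, train_index) pairs. The ratio value is an
    illustrative assumption, not a setting taken from the paper.
    """
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted((distance(q, t), ti) for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        (d1, best), (d2, _) = ranked[0], ranked[1]
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches


# Toy descriptors (hypothetical): real SIFT vectors have 128 floats,
# and real ORB descriptors have 256 bits.
float_query = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
float_train = [(0.1, 0.9, 0.0), (0.9, 0.1, 0.0), (0.5, 0.5, 0.5)]
binary_query = [0b11110000, 0b00001111]
binary_train = [0b11110001, 0b00001110, 0b10101010]

float_matches = match(float_query, float_train, euclidean)
binary_matches = match(binary_query, binary_train, hamming)
print(float_matches, binary_matches)
```

The same `match` routine serves both families because only the distance function changes. Hamming distance reduces to an XOR and a bit count, which is one reason binary features are attractive when speed matters more than matching accuracy.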







Acknowledgments
This work is partly supported by grants from the National Science Foundation of China (Nos. 61673157 and 61402018), by grants from the Natural Science Foundation of Anhui Province (Nos. KJ2014ZD27 and JZ2015AKZR0664), and by the National Key Research and Development Plan under Grant No. 2016YFC0800100.
Cite this article
Cao, M., Cao, L., Jia, W. et al. Evaluation of Local Features for Structure from Motion. Multimed Tools Appl 77, 10979–10993 (2018). https://doi.org/10.1007/s11042-018-5864-1
DOI: https://doi.org/10.1007/s11042-018-5864-1