Abstract
Augmented reality (AR) technologies enhance users' knowledge of their immediate surroundings by presenting contextualized and spatially relevant information. This augmentation is enabled by the automatic estimation of the device's pose with respect to the environment. Many simple AR applications for smartphones are well served by the pose estimate provided by GPS and inertial sensors, while others demand a higher accuracy, which can be obtained with vision-based methods. The application presented in this paper, which aims to augment pictures of mountainous landscapes with geo-referenced data, belongs to the latter category. Our application is based on a novel approach for image-to-world registration which jointly relies on inertial and visual sensors. In a nutshell, GPS and inertial sensors are first used to compute a rough estimate of the device's position and pose, which is then refined using visual data. Specifically, a learning-based edge detection algorithm extracts mountain profiles from the picture of interest, and a novel registration algorithm based on a robust optimization framework aligns the extracted profiles to synthetic ones rendered from Digital Elevation Models (DEMs). Our experiments, conducted on a novel dataset of manually aligned pictures which we make publicly available, demonstrate that the proposed registration method is more accurate than competing methods and computationally efficient when implemented on a smartphone.
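The profile-alignment step described above can be illustrated with a minimal sketch. Assume the photo's horizon is represented as a 1-D array of elevation angles sampled over azimuth, and the DEM has been rendered into a wider synthetic profile around the GPS-based position; a robust (Huber-style) loss over the residuals can then be minimized with respect to the azimuth offset. All names, the 1-D profile representation, and the loss parameters below are illustrative assumptions, not the paper's actual implementation, which optimizes the full camera pose within a robust framework.

```python
import numpy as np

def huber(r, delta=0.05):
    """Huber-style robust penalty: quadratic near zero, linear in the tails,
    so gross outliers (e.g. spurious edges) contribute only linearly."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def align_profiles(photo_prof, synth_prof, max_shift=60):
    """Return the azimuth shift (in samples) that best aligns the photo's
    horizon profile to the DEM-rendered one under the robust loss.
    Both inputs are 1-D arrays of elevation angles sampled over azimuth;
    the synthetic profile covers a wider azimuth range than the photo's."""
    n = len(photo_prof)
    best_shift, best_cost = 0, np.inf
    for s in range(max_shift):
        window = synth_prof[s:s + n]          # candidate azimuth alignment
        cost = huber(window - photo_prof).sum()
        if cost < best_cost:
            best_cost, best_shift = cost, s
    return best_shift
```

The robust loss is the key design choice here: with a plain squared error, a single badly detected edge point could dominate the cost and pull the alignment away from the true offset.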
Notes
Venturi Mountain Dataset: https://venturi.fbk.eu/results/public-datasets/mountain-dataset/.
Cite this article
Porzi, L., Bulò, S.R., Lanz, O. et al. An automatic image-to-DEM alignment approach for annotating mountains pictures on a smartphone. Machine Vision and Applications 28, 101–115 (2017). https://doi.org/10.1007/s00138-016-0808-0