Semantically Guided Geo-location and Modeling in Urban Environments

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

The problem of image localization and geo-location estimation has a long history in both robotics and computer vision. With the availability of large amounts of geo-referenced image data, several image retrieval approaches have been deployed to tackle this problem. In this work, we show how semantic labeling of both the query views and the reference dataset, obtained by means of semantic segmentation, can (1) aid the retrieval of views that are similar to, and possibly overlapping with, the query and (2) guide the recognition and discovery of commonly occurring scene layouts in the reference dataset. We demonstrate the effectiveness of these semantic representations on examples of localization, semantic concept discovery, and intersection recognition in images of urban scenes.

Acknowledgments

Supported by the Intelligence Advanced Research Projects Activity (IARPA) via Air Force Research Laboratory, contract FA8650-12-C-7212. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, AFRL, or the U.S. Government.

Author information

Corresponding author

Correspondence to Gautam Singh.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Singh, G., Košecká, J. (2016). Semantically Guided Geo-location and Modeling in Urban Environments. In: Zamir, A., Hakeem, A., Van Gool, L., Shah, M., Szeliski, R. (eds) Large-Scale Visual Geo-Localization. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-25781-5_6

  • DOI: https://doi.org/10.1007/978-3-319-25781-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25779-2

  • Online ISBN: 978-3-319-25781-5

  • eBook Packages: Computer Science, Computer Science (R0)
