Abstract
We investigate publicly available geo-referenced photo collections for land cover classification. Mapping land cover is a fundamental task in the geographic sciences and is typically done through manual annotation of remote sensing (overhead) imagery. Here we propose a novel alternative approach based on proximate sensing. The goal of proximate sensing is to map what-is-where on the surface of the Earth using ground-level images of objects and scenes. It has the potential to map phenomena that are not observable through remote sensing. We perform an extensive case study on using ground-level images for binary land cover classification into developed and undeveloped regions. We investigate visual features and text annotations for labeling individual images or sets of images with these two classes. Because the location of each image is known, we can generate land cover maps, which we evaluate quantitatively against ground truth maps. We apply our approach to two photo collections: Flickr, the popular photo sharing website, and the Geograph project, whose goal is to collect geographically informative photos. Comparing these two collections allows us to measure the impact of photographer intent. We use a weakly supervised learning framework that eliminates the need for manually labeled training data. We also investigate methods for filtering out images that are unlikely to be geographically informative. Our results are promising and validate proximate sensing as a novel alternative approach to geographic discovery.
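As a rough illustration of the mapping and evaluation steps outlined above, the following minimal Python sketch aggregates per-image developed/undeveloped labels into a gridded land cover map and scores it against a ground truth map. The regular latitude/longitude grid, the fraction threshold, and the names `cell_of`, `build_map`, and `accuracy` are all assumptions made for illustration; the abstract does not specify how images are aggregated or how maps are compared.

```python
import math
from collections import defaultdict


def cell_of(lat, lon, cell_deg):
    """Map a coordinate to a cell index on a regular lat/lon grid (assumed tiling)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def build_map(photos, cell_deg=1.0, threshold=0.5):
    """Aggregate per-image labels into a per-cell binary map.

    photos: iterable of (lat, lon, is_developed), with is_developed in {0, 1}.
    A cell is labeled developed (1) when the fraction of its images labeled
    developed exceeds `threshold` (a hypothetical decision rule).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [developed count, total count]
    for lat, lon, is_developed in photos:
        entry = counts[cell_of(lat, lon, cell_deg)]
        entry[0] += int(is_developed)
        entry[1] += 1
    return {cell: int(dev / total > threshold) for cell, (dev, total) in counts.items()}


def accuracy(predicted, ground_truth):
    """Fraction of cells present in both maps whose labels agree."""
    shared = predicted.keys() & ground_truth.keys()
    if not shared:
        return 0.0
    return sum(predicted[c] == ground_truth[c] for c in shared) / len(shared)
```

Cells that contain no images are simply absent from the predicted map in this sketch, so `accuracy` is computed only over cells covered by both maps.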
Notes
We use the term geo-referenced to indicate that a multimedia object has at least approximate location metadata associated with it.
Acknowledgements
This work was funded in part by a National Science Foundation CAREER grant (IIS-1150115) and a US Department of Energy Early Career Scientist and Engineer/PECASE award. We thank the anonymous reviewers for their informative feedback.
About this article
Cite this article
Leung, D., Newsam, S. Land cover classification using geo-referenced photos. Multimed Tools Appl 74, 11741–11761 (2015). https://doi.org/10.1007/s11042-014-2261-2
DOI: https://doi.org/10.1007/s11042-014-2261-2