ABSTRACT
User-generated content, such as photos and videos, is often annotated by users with free-text labels, called tags. Increasingly, such content is also georeferenced, i.e., it is associated with geographic coordinates. The implicit relationships between tags and their locations can tell us much about how people conceptualize places and relations between them. However, extracting such knowledge from social annotations presents many challenges, since annotations are often ambiguous, noisy, uncertain and spatially inhomogeneous. We introduce a probabilistic framework for modeling georeferenced annotations and a method for learning model parameters from data. The framework is flexible and general, and can be used in a variety of applications that mine geospatial knowledge from user-generated content. Specifically, we study three problems: extracting place semantics, predicting locations of photos and learning part-of relations between places. We show our method performs well compared to state-of-the-art approaches developed for the first two problems, and offers a novel solution to the problem of learning relations between places.
- A probabilistic approach to mining geospatial knowledge from social annotations