Knowledge-Based and Data-Driven Approaches for Georeferencing of Informal Documents

  • Conference paper
Text, Speech, and Dialogue (TSD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9302)

Abstract

This paper describes the Knowledge-Based and Data-Driven approaches we have followed for generic Textual Georeferencing of Informal Documents. Textual georeferencing consists of assigning a set of geographical coordinates to formal (news, reports, ...) or informal (blogs, social networks, chats, tagsets, ...) texts and documents. The system presented in this paper has been designed to deal with informal documents from social sites. The paper describes four georeferencing approaches, the experiments and results obtained at the MediaEval 2014 Placing Task (ME2014PT) evaluation, and subsequent experiments. The task consisted of predicting the most probable geographical coordinates of Flickr images and videos using their visual, audio, and associated metadata features. Our approaches used only the textual metadata annotations and tagsets provided by Flickr users. The four approaches used for this task were: 1) a Geographical Knowledge-Based (GeoKB) approach that uses Toponym Disambiguation heuristics, 2) the Hiemstra Language Model (HLM), TFIDF, and BM25 Information Retrieval (IR) approaches with Re-Ranking, 3) a combination of the GeoKB and the IR models with Re-Ranking (GeoFusion), and 4) a combination of GeoFusion with an HLM model derived from the georeferenced pages of the English Wikipedia. The HLM approach with Re-Ranking achieved the best accuracy at distance-error margins from 10 m to 1 km. The GeoFusion approaches achieved the best accuracy at margins from 10 km to 5,000 km. Both approaches achieved state-of-the-art results at the ME2014PT evaluation and in subsequent experiments, including the best results at the 1,000 km and 5,000 km margins in the task where only the official training dataset may be used to predict the coordinates.
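The accuracy figures above are reported at distance-error margins (10 m up to 5,000 km). As a rough illustration of how such a measure can be computed, the sketch below scores a list of predicted coordinates against the true ones. It assumes the error is the great-circle (haversine) distance on a sphere of radius 6,371 km; the abstract does not state the exact distance formula, and the function names `haversine_km` and `accuracy_within` are hypothetical, not taken from the paper.

```python
import math

# Assumption of this sketch: spherical Earth with mean radius 6371 km.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def accuracy_within(predicted, actual, thresholds_km):
    """Fraction of predictions whose error is within each threshold (km)."""
    errors = [haversine_km(p[0], p[1], a[0], a[1])
              for p, a in zip(predicted, actual)]
    return {t: sum(e <= t for e in errors) / len(errors)
            for t in thresholds_km}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    predicted = [(41.39, 2.17), (48.85, 2.35), (40.0, -3.0)]
    actual = [(41.40, 2.16), (51.51, -0.13), (40.0, -3.0)]
    margins = [0.01, 0.1, 1, 10, 100, 1000, 5000]  # 10 m ... 5,000 km
    for margin, acc in accuracy_within(predicted, actual, margins).items():
        print(f"within {margin} km: {acc:.2f}")
```

Reporting one accuracy value per margin, rather than a single mean error, makes it possible to see that one method can be better at fine-grained margins while another is better at coarse ones, which is the pattern the abstract describes.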

Author information

Corresponding author

Correspondence to Daniel Ferrés.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ferrés, D., Rodríguez, H. (2015). Knowledge-Based and Data-Driven Approaches for Georeferencing of Informal Documents. In: Král, P., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2015. Lecture Notes in Computer Science, vol 9302. Springer, Cham. https://doi.org/10.1007/978-3-319-24033-6_51

  • DOI: https://doi.org/10.1007/978-3-319-24033-6_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24032-9

  • Online ISBN: 978-3-319-24033-6

  • eBook Packages: Computer Science, Computer Science (R0)