Abstract
There is growing interest in information retrieval research that adds new dimensions, beyond text-based retrieval, to Web search engines. Geographical Information Retrieval (GIR) aims to index Web resources by their geographic context. Identifying this context starts with detecting the different types of geographic references associated with a document, such as occurrences of place names. This paper presents a model, based on a set of heuristics, for detecting geographic references in Web documents. It also addresses new concepts and methods for disambiguating distinct places that share the same name. Finally, a prototype called GeoSEn was built to validate the effectiveness of the proposed model.
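To make the two steps named in the abstract concrete (detecting place names, then disambiguating places that share a name), the following minimal Python sketch may help. It is not the paper's model or the GeoSEn implementation: the toy gazetteer, the function names, and the co-occurrence voting heuristic are all assumptions chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical toy gazetteer (an assumption, not data from the paper):
# place name -> list of candidate places as (name, containing region).
GAZETTEER = {
    "Springfield": [("Springfield", "Illinois"), ("Springfield", "Massachusetts")],
    "Chicago": [("Chicago", "Illinois")],
    "Boston": [("Boston", "Massachusetts")],
}


def detect_place_names(text):
    """Return the gazetteer names that occur in the text as whole words."""
    return [
        name
        for name in GAZETTEER
        if re.search(r"\b" + re.escape(name) + r"\b", text)
    ]


def disambiguate(text):
    """Map each detected name to a single candidate place.

    Assumed heuristic: every unambiguous name votes for its region, and
    an ambiguous name is resolved to the candidate whose region collected
    the most votes.
    """
    names = detect_place_names(text)
    votes = Counter(
        GAZETTEER[name][0][1] for name in names if len(GAZETTEER[name]) == 1
    )
    resolved = {}
    for name in names:
        candidates = GAZETTEER[name]
        # max() returns the first candidate on ties, so gazetteer order
        # acts as the fallback when the text gives no supporting evidence.
        resolved[name] = max(candidates, key=lambda c: votes[c[1]])
    return resolved


if __name__ == "__main__":
    print(disambiguate("The train from Chicago stops in Springfield."))
```

In this sketch a document that mentions both Chicago and Springfield resolves Springfield to the Illinois candidate, because the only unambiguous place name in the text lies in that region. A real system would need a full gazetteer and far richer evidence than name co-occurrence.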
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Campelo, C.E.C., de Souza Baptista, C. (2009). A Model for Geographic Knowledge Extraction on Web Documents. In: Heuser, C.A., Pernul, G. (eds) Advances in Conceptual Modeling - Challenging Perspectives. ER 2009. Lecture Notes in Computer Science, vol 5833. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04947-7_38
DOI: https://doi.org/10.1007/978-3-642-04947-7_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04946-0
Online ISBN: 978-3-642-04947-7
eBook Packages: Computer Science, Computer Science (R0)