Abstract
Finding automatic ways of attaching geographical scopes to on-line resources, also called “geo-referencing” documents, is a challenging problem that is receiving increasing attention [1,5,3]. Here we present a system architecture and a process for identifying the geographical scope of Web pages, defining a scope as the region where more people than average would find that page relevant. We rely on typical Web IR heuristics (i.e., feature weighting, hypertext topic locality, anchor description) and on assumptions about how people use geographical references in documents. The method involves three major steps. First, geographical named entities are identified in the text. Next, the named entities found are propagated through the Web linkage graph. Finally, a geographical ontology is used to disambiguate among the named entities associated with a document, thereby selecting the most likely scope. In the future, we plan to use scopes in new location-aware search tools.
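The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy gazetteer, the part-of ontology, the one-pass propagation along outgoing links, the `damping` factor, and the coverage `threshold` are all assumptions made here for illustration, and every function name is hypothetical.

```python
from collections import Counter

# Hypothetical toy gazetteer: lower-cased place name -> ontology node.
GAZETTEER = {"lisbon": "Lisbon", "porto": "Porto", "portugal": "Portugal"}

# Hypothetical part-of ontology: child node -> parent node.
PART_OF = {"Lisbon": "Portugal", "Porto": "Portugal", "Portugal": "Europe"}


def extract_places(text):
    """Step 1: count gazetteer matches (a stand-in for real named entity recognition)."""
    tokens = (t.strip(".,;:!?()\"'").lower() for t in text.split())
    return Counter(GAZETTEER[t] for t in tokens if t in GAZETTEER)


def propagate(evidence, links, damping=0.5):
    """Step 2: each page receives a damped share of the evidence of pages it links to.

    The direction and the single pass are assumptions of this sketch.
    """
    result = {page: Counter(counts) for page, counts in evidence.items()}
    for page, targets in links.items():
        page_counts = result.setdefault(page, Counter())
        for target in targets:
            for place, count in evidence.get(target, {}).items():
                page_counts[place] += damping * count
    return result


def ancestors(node):
    """Return the chain of ontology ancestors of a node, nearest first."""
    chain = []
    while node in PART_OF:
        node = PART_OF[node]
        chain.append(node)
    return chain


def select_scope(weights, threshold=0.6):
    """Step 3: pick the most specific node covering at least `threshold` of the evidence."""
    total = sum(weights.values())
    if total == 0:
        return None
    covered = Counter()
    for place, weight in weights.items():
        covered[place] += weight
        for ancestor in ancestors(place):
            covered[ancestor] += weight
    candidates = [n for n, w in covered.items() if w / total >= threshold]
    # Deeper nodes (more ancestors) are more specific; ties go to higher coverage.
    return max(candidates, key=lambda n: (len(ancestors(n)), covered[n]), default=None)


def assign_scopes(pages, links):
    """Run the three steps over a dict of page id -> text and a dict of outgoing links."""
    evidence = {page: extract_places(text) for page, text in pages.items()}
    return {page: select_scope(w) for page, w in propagate(evidence, links).items()}


if __name__ == "__main__":
    pages = {
        "a": "Hotels in Lisbon and Porto.",
        "b": "Lisbon nightlife guide, Lisbon bars.",
        "c": "Our partners",
    }
    links = {"c": ["a", "b"]}
    print(assign_scopes(pages, links))
```

On this toy data, page `a` mentions two sibling cities and is assigned their common parent, while page `c` contains no place names of its own and receives a scope only through the pages it links to.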
This research was partially supported by Fundação para a Ciência e Tecnologia, under grants POSI/SRI/40193/2001 and SFRH/BD/10757/2002.
References
1. Amitay, E., Har’El, N., Sivan, R., Soffer, A.: Web-a-where: Geotagging Web content. In: Proceedings of SIGIR 2004, the 27th Annual International Conference on Research and Development in Information Retrieval, pp. 273–280. ACM Press, New York (2004)
2. Chaves, M., Martins, B., Silva, M.J.: Grease Knowledge Base. DI/FCUL TR 04–XX, Department of Informatics, University of Lisbon (November 2004)
3. Ding, J., Gravano, L., Shivakumar, N.: Computing geographical scopes of web resources. In: Proceedings of VLDB 2000, the 26th International Conference on Very Large Data Bases, pp. 545–556. Morgan Kaufmann Publishers Inc., San Francisco (2000)
4. Hill, L.L., Frew, J., Zheng, Q.: Geographic names: the implementation of a gazetteer in a georeferenced digital library. D-Lib Magazine 5(1) (January 1999)
5. Jones, C.B., Purves, R., Ruas, A., Sanderson, M., Sester, M., van Kreveld, M., Weibel, R.: Spatial information retrieval and geographical ontologies: An overview of the SPIRIT project. In: Proceedings of SIGIR 2002, the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, August 2002, pp. 387–388. ACM Press, New York (2002)
6. Mikheev, A., Moens, M., Grover, C.: Named entity recognition without gazetteers. In: Proceedings of EACL 1999, the 9th Conference of the European Chapter of the Association for Computational Linguistics (1999)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martins, B., Chaves, M., Silva, M.J. (2005). Assigning Geographical Scopes To Web Pages. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_52
DOI: https://doi.org/10.1007/978-3-540-31865-1_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25295-5
Online ISBN: 978-3-540-31865-1
eBook Packages: Computer Science, Computer Science (R0)