Abstract
This paper presents an information-retrieval-based method for enriching a corpus using bootstrapping techniques. Starting from a manually validated supervised corpus, snippets are retrieved from the Web to increase the size of the initial corpus. Although this technique has already been reported in the literature, the main objective of this work is to apply it to the specific task of GEO/NO-GEO toponym disambiguation. The disambiguation procedure is evaluated with a classification model, with favorable results.
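The enrichment loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the classifier, the features, or the criterion for accepting a snippet, so everything below is an assumption made for illustration only: a small bag-of-words Naive Bayes model, a confidence threshold of 0.75, and the hypothetical names NaiveBayes and enrich. Snippets are passed in as a plain list rather than retrieved from the Web.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in classifier)."""

    def fit(self, examples):
        self.word_counts = {}
        self.doc_counts = Counter()
        self.vocab = set()
        for text, label in examples:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def predict_proba(self, text):
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((counts[tok] + 1) / denom)
            log_scores[label] = score
        # Normalize the log scores into probabilities.
        top = max(log_scores.values())
        exp = {label: math.exp(s - top) for label, s in log_scores.items()}
        z = sum(exp.values())
        return {label: v / z for label, v in exp.items()}


def enrich(seed, snippets, threshold=0.75, rounds=3):
    """Grow the labeled seed corpus with confidently classified snippets."""
    corpus = list(seed)
    pool = list(snippets)
    for _ in range(rounds):
        model = NaiveBayes().fit(corpus)
        remaining, added = [], 0
        for snippet in pool:
            probs = model.predict_proba(snippet)
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:
                corpus.append((snippet, label))  # accept the snippet with its predicted label
                added += 1
            else:
                remaining.append(snippet)  # keep uncertain snippets for a later round
        pool = remaining
        if added == 0:
            break  # nothing new was accepted, so retraining would change nothing
    return corpus, pool


# Hypothetical seed corpus: "washington" as a place (GEO) or a person (NO-GEO).
seed = [
    ("he flew to washington city yesterday", "GEO"),
    ("the river near washington city flooded", "GEO"),
    ("president washington signed the letter", "NO-GEO"),
    ("general washington wrote a letter", "NO-GEO"),
]
snippets = [
    "the city of washington flooded",
    "washington signed a letter",
    "washington",
]
corpus, pool = enrich(seed, snippets)
```

In this toy run the two snippets with disambiguating context are added to the corpus, while the bare toponym stays in the pool because the model cannot label it with enough confidence. A threshold of this kind is one common way to limit the label noise that bootstrapping can introduce; whether the paper uses one is not stated in the abstract.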
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Priego Sánchez, B., Somodevilla, M.J., Guzmán Cabrera, R., Pineda, I.H., Carrillo, M. (2012). A Classification Model with Corpus Enrichment for Toponym Disambiguation. In: Pavón, J., Duque-Méndez, N.D., Fuentes-Fernández, R. (eds) Advances in Artificial Intelligence – IBERAMIA 2012. IBERAMIA 2012. Lecture Notes in Computer Science, vol 7637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34654-5_48
DOI: https://doi.org/10.1007/978-3-642-34654-5_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34653-8
Online ISBN: 978-3-642-34654-5
eBook Packages: Computer Science, Computer Science (R0)