ABSTRACT
In this paper we present a study of the toponyms contained in an Italian news collection, carried out to determine how ambiguous these toponyms are and how difficult their ambiguities are to resolve. The results show that frequent toponyms are usually less ambiguous than rare toponyms. The resolution of ambiguities on a sample of 1,042 toponyms with different features confirms that ambiguous toponyms are spatially autocorrelated.
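The two notions used in the abstract, degree of ambiguity and spatial autocorrelation, can be illustrated with a minimal sketch. It is not the authors' method. The `GAZETTEER` dictionary, its entries and approximate coordinates, and the functions `ambiguity` and `resolve` are all hypothetical names introduced here for illustration. Ambiguity is taken to be the number of candidate referents a gazetteer lists for a name, and resolution picks the candidate closest on average to the unambiguous toponyms found in the same text.

```python
import math

# Toy gazetteer: name -> list of (referent, latitude, longitude).
# Illustrative entries with approximate coordinates; NOT the resource used in the paper.
GAZETTEER = {
    "Paris": [("Paris, France", 48.86, 2.35), ("Paris, Texas, USA", 33.66, -95.56)],
    "Versailles": [("Versailles, France", 48.80, 2.13)],
}


def ambiguity(toponym):
    """Degree of ambiguity: number of candidate referents in the gazetteer."""
    return len(GAZETTEER.get(toponym, []))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def resolve(toponym, context):
    """Pick the candidate nearest, on average, to unambiguous toponyms in the context.

    This relies on the assumption that places mentioned together tend to be
    close to each other (spatial autocorrelation).
    """
    candidates = GAZETTEER.get(toponym, [])
    if not candidates:
        return None
    # Anchors are context toponyms with exactly one candidate referent.
    anchors = [GAZETTEER[t][0] for t in context if ambiguity(t) == 1]
    if not anchors:
        # No spatial evidence available: fall back to the first listed referent.
        return candidates[0][0]

    def mean_distance(candidate):
        _, lat, lon = candidate
        return sum(haversine_km(lat, lon, a[1], a[2]) for a in anchors) / len(anchors)

    return min(candidates, key=mean_distance)[0]


if __name__ == "__main__":
    print(ambiguity("Paris"))
    print(resolve("Paris", ["Versailles"]))
```

With a real gazetteer, the same `ambiguity` count could be set against corpus frequency to examine the relation between frequency and ambiguity that the abstract reports.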
Index Terms
- Grounding toponyms in an Italian local news corpus