ABSTRACT
When processing toponym information in natural language text, it is crucial to have a good gazetteer. There are several well-organized gazetteers for English text, but they do not cover Japanese local toponyms. In this paper, we introduce a Japanese gazetteer built from Open Data (e.g., the toponym database distributed by Japanese ministries, Wikipedia, and GeoNames) and propose a toponym disambiguation framework that uses the constructed gazetteer. We also evaluate our approach on a blog corpus that contains highly ambiguous place names.
Index Terms
- Construction of a Japanese gazetteers for Japanese local toponym disambiguation