Abstract
Many bilingual dictionaries have been released on the WWW. However, these dictionaries do not sufficiently cover new and domain-specific terminology. In our demonstration, we present a dictionary constructed by analyzing the link structure of Wikipedia, a huge-scale encyclopedia containing a large number of links between articles in different languages. We not only analyzed these interlanguage links but also extracted further translation candidates from redirect page and link text information. In an experiment, we have already demonstrated the advantages of our dictionary over manually created dictionaries and over bilingual terminology extracted from parallel corpora.
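The following minimal sketch illustrates how the three kinds of link information named above could be combined into translation candidates. It is an assumption-laden illustration, not the authors' implementation: the function names `invert` and `translation_candidates`, the dictionary-based input format, and the English/German sample data are all hypothetical, and the sketch performs no scoring or filtering of the candidates.

```python
from collections import defaultdict


def invert(mapping):
    """Group aliases by target: {alias: target} -> {target: {aliases}}."""
    grouped = defaultdict(set)
    for alias, target in mapping.items():
        grouped[target].add(alias)
    return grouped


def translation_candidates(interlanguage, src_redirects, tgt_redirects,
                           tgt_anchor_texts):
    """Collect target-language candidates for each source-language term.

    interlanguage:    {source article title: target article title}
    src_redirects:    {source redirect title: source article title}
    tgt_redirects:    {target redirect title: target article title}
    tgt_anchor_texts: {target article title: [link texts pointing to it]}
    (All four input shapes are assumptions made for this sketch.)
    """
    src_aliases = invert(src_redirects)
    tgt_aliases = invert(tgt_redirects)
    candidates = defaultdict(set)
    for src_title, tgt_title in interlanguage.items():
        # Source side: the article title plus every redirect pointing to it.
        src_terms = {src_title} | src_aliases.get(src_title, set())
        # Target side: title, redirects, and link texts of the linked article.
        tgt_terms = ({tgt_title}
                     | tgt_aliases.get(tgt_title, set())
                     | set(tgt_anchor_texts.get(tgt_title, ())))
        for term in src_terms:
            candidates[term] |= tgt_terms
    return dict(candidates)


# Invented sample data, for illustration only.
sample = translation_candidates(
    interlanguage={"Car": "Automobil"},
    src_redirects={"Automobile": "Car"},
    tgt_redirects={"Kraftwagen": "Automobil"},
    tgt_anchor_texts={"Automobil": ["Auto"]},
)
for term in sorted(sample):
    print(term, sorted(sample[term]))
```

In this sketch a single interlanguage link yields candidates for every source-side redirect as well, which is one plausible reading of how redirect and link text information can extend coverage beyond the interlanguage links themselves.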
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Erdmann, M., Nakayama, K., Hara, T., Nishio, S. (2008). A Bilingual Dictionary Extracted from the Wikipedia Link Structure. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds) Database Systems for Advanced Applications. DASFAA 2008. Lecture Notes in Computer Science, vol 4947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78568-2_63
DOI: https://doi.org/10.1007/978-3-540-78568-2_63
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78567-5
Online ISBN: 978-3-540-78568-2
eBook Packages: Computer Science, Computer Science (R0)