DOI: 10.1145/1526709.1526902
poster

Crawling English-Japanese person-name transliterations from the web

Published: 20 April 2009

Abstract

Automatic lexicon compilation is a dream of lexicon compilers and lexicon users alike. This paper proposes a system that crawls English-Japanese person-name transliterations from the Web and works as a back-end collector for the automatic compilation of a bilingual person-name lexicon. Our crawler collected 561K transliterations in five months. From these, an English-Japanese person-name lexicon with 406K entries was compiled by automatic post-processing. This lexicon is much larger than other similar resources, including the English-Japanese lexicon of HeiNER obtained from Wikipedia.

References

[1] S. Kaide and S. Sato. A person-name classifier by using probability difference (in Japanese). In Proc. of NLP-09, 2009.
[2] M. Nagata, T. Saito, and K. Suzuki. Using the Web as a bilingual dictionary. In Proc. of the workshop on Data-driven methods in machine translation, pages 1-8, 2001.
[3] Y. Sakakibara and S. Sato. Automatic compilation of a bilingual person-name lexicon (in Japanese). In Proc. of NLP-07, pages 879-882, 2007.
[4] W. Wentland, J. Knopp, C. Silberer, and M. Hartung. Building a multilingual lexical resource for named entity disambiguation, translation and transliteration. In Proc. of LREC-08, 2008.

Cited By

  • (2023) Translating the List of Participants in the 2020 Tokyo Olympic Games into Japanese (2020 東京オリンピック参加者名簿の翻訳). Journal of Natural Language Processing 30:2 (748-772). DOI: 10.5715/jnlp.30.748. Online publication date: 2023
  • (2009) Web-Based Transliteration of Person Names. Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 01 (273-278). DOI: 10.1109/WI-IAT.2009.47. Online publication date: 15-Sep-2009

      Published In

      WWW '09: Proceedings of the 18th international conference on World wide web
      April 2009
      1280 pages
      ISBN:9781605584874
      DOI:10.1145/1526709

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. automatic lexicon compilation
      2. mining transliteration pairs
      3. person name

      Qualifiers

      • Poster

      Conference

      WWW '09
      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

