L&H Lexicography Toolkit for Machine Translation

Meekhof, Timothy; Clements, David

doi:10.1007/3-540-39965-8_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1934)

Included in the following conference series:

Conference of the Association for Machine Translation in the Americas

Abstract

One of the most important components of any machine translation system is the translation lexicon. The size and quality of the lexicon, as well as the coverage of the lexicon for a particular use, greatly influence the applicability of machine translation for a user. The high cost of lexicon development limits the extent to which even mature machine translation vendors can expand and specialize their lexicons, and frequently prevents users from building extensive lexicons at all. To address the high cost of lexicography for machine translation, L&H is building a Lexicography Toolkit that includes tools that can significantly improve the process of creating custom lexicons. The toolkit is based on the concept of using automatic methods of data acquisition, using text corpora, to generate lexicon entries. Of course, lexicon entries must be accurate, so the work of the toolkit must be checked by human experts at several stages. However, this checking mostly consists of removing erroneous results, rather than adding data and entire entries. This article will explore how the Lexicography Toolkit would be used to create a lexicon that is specific to the user’s domain.
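
The abstract does not say which acquisition algorithms the toolkit uses, so the following is only a generic sketch of the workflow it outlines: candidate entries are proposed automatically from a corpus, and a human reviewer then removes the wrong ones rather than writing entries by hand. The sketch assumes a small sentence-aligned bilingual corpus and scores word pairs with the Dice coefficient of their co-occurrence; the function names `propose_entries` and `review`, the threshold, and the sample sentences are all hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import product


def propose_entries(aligned_pairs, min_score=0.5):
    """Propose (source, target, score) candidates from sentence-aligned text.

    Assumption: word pairs are scored by the Dice coefficient of their
    sentence-level co-occurrence. This is a stand-in technique, not the
    toolkit's documented method.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_sentence, tgt_sentence in aligned_pairs:
        src_words = set(src_sentence.lower().split())
        tgt_words = set(tgt_sentence.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))

    candidates = []
    for (src, tgt), both in pair_count.items():
        dice = 2 * both / (src_count[src] + tgt_count[tgt])
        if dice >= min_score:
            candidates.append((src, tgt, round(dice, 2)))
    # Highest-scoring candidates first, then alphabetical for stable output.
    return sorted(candidates, key=lambda c: (-c[2], c[0], c[1]))


def review(candidates, rejected):
    """Human check: the expert only removes bad candidates, never adds any."""
    return [c for c in candidates if (c[0], c[1]) not in rejected]


# Hypothetical three-sentence English/French corpus.
corpus = [
    ("the valve is open", "la vanne est ouverte"),
    ("close the valve", "fermez la vanne"),
    ("the pump is open", "la pompe est ouverte"),
]
candidates = propose_entries(corpus, min_score=0.8)
# The reviewer strikes a function-word pair that is not wanted in a domain lexicon.
lexicon = review(candidates, rejected={("the", "la")})
```

On a corpus this small the candidate list also contains spurious pairs, which is the reason the review step exists: the expert's effort goes into deleting noise, not into authoring entries.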

Author information

Authors and Affiliations

Lernout & Hauspie Speech Products, 4375 Jautland Drive, San Diego, CA 92121
Timothy Meekhof & David Clements

Editor information

Editors and Affiliations

Litton PRC, 1500 PRC Drive, McLean, VA 22102, USA
John S. White

Cite this paper

Meekhof, T., Clements, D. (2000). L&H Lexicography Toolkit for Machine Translation. In: White, J.S. (ed.) Envisioning Machine Translation in the Information Future. AMTA 2000. Lecture Notes in Computer Science, vol 1934. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39965-8_24

DOI: https://doi.org/10.1007/3-540-39965-8_24
Published: 02 July 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41117-8
Online ISBN: 978-3-540-39965-0
eBook Packages: Springer Book Archive
