Large-Scale Dictionary Construction for Foreign Language Tutoring and Interlingual Machine Translation

Machine Translation

Abstract

This paper describes techniques for automatic construction of dictionaries for use in large-scale foreign language tutoring (FLT) and interlingual machine translation (MT) systems. The dictionaries are based on a language-independent representation called “lexical conceptual structure” (LCS). A primary goal of the LCS research is to demonstrate that synonymous verb senses share distributional patterns. We show how the syntax–semantics relation can be used to develop a lexical acquisition approach that contributes both toward the enrichment of existing online resources and toward the development of lexicons containing more complete information than is provided in any of these resources alone. We start by describing the structure of the LCS and showing how this representation is used in FLT and MT. We then focus on the problem of building LCS dictionaries for large-scale FLT and MT. First, we describe authoring tools for manual and semi-automatic construction of LCS dictionaries; we then present a more sophisticated approach that uses linguistic techniques for building word definitions automatically. These techniques have been implemented as part of a set of lexicon-development tools used in the MILT FLT project.
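The idea that verbs with related senses share distributional patterns can be illustrated with a toy sketch. The code below is not the paper's acquisition tool; it only groups verbs whose sets of permitted syntactic alternations are identical. The frame inventory, the four verbs, and the name `group_by_frames` are assumptions chosen for illustration, loosely following the familiar contrast between break-type and hit-type verbs.

```python
from collections import defaultdict

# Toy data (an assumption for illustration, not taken from the paper):
# each verb maps to the set of syntactic alternations it is taken to allow.
VERB_FRAMES = {
    "break":   frozenset({"transitive", "causative_inchoative", "middle"}),
    "shatter": frozenset({"transitive", "causative_inchoative", "middle"}),
    "hit":     frozenset({"transitive", "conative"}),
    "kick":    frozenset({"transitive", "conative"}),
}


def group_by_frames(verb_frames):
    """Group verbs whose alternation sets are identical.

    Returns a sorted list of sorted verb lists, one list per candidate class.
    """
    groups = defaultdict(list)
    for verb, frames in verb_frames.items():
        groups[frames].append(verb)
    return sorted(sorted(verbs) for verbs in groups.values())


classes = group_by_frames(VERB_FRAMES)
print(classes)  # [['break', 'shatter'], ['hit', 'kick']]
```

Exact set equality is the simplest possible grouping criterion; a realistic system would need to tolerate missing or noisy frame observations and to distinguish multiple senses of the same verb.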

About this article

Cite this article

Dorr, B.J. Large-Scale Dictionary Construction for Foreign Language Tutoring and Interlingual Machine Translation. Machine Translation 12, 271–322 (1997). https://doi.org/10.1023/A:1007965530302

  • DOI: https://doi.org/10.1023/A:1007965530302
