An NLP Lexicon as a Largely Language-Independent Resource

  • Original Article
Machine Translation

Abstract

This paper describes salient aspects of the OntoSem lexicon of English, a lexicon whose semantic descriptions can either be grounded in a language-independent ontology, rely on extra-ontological expressive means, or exploit a combination of the two. The variety of descriptive means, as well as the conceptual complexity of semantic description to begin with, necessitates that OntoSem lexicons be compiled primarily manually. However, once a semantic description is created for a lexeme in one language, it can be reused in others, often with little or no modification. Said differently, the challenge in building a semantic lexicon is describing semantics; once the semantics are described, it is relatively straightforward to connect given meanings to the appropriate head words in other languages. In this paper we provide a brief overview of the OntoSem lexicon and processing environment, orient our approach to lexical semantics among others in the field, and describe in more detail what we mean by the largely language-independent lexicon. Finally, we suggest reasons why our resources might be of interest to the larger community.

Author information

Corresponding author

Correspondence to Marjorie McShane.

About this article

Cite this article

McShane, M., Nirenburg, S. & Beale, S. An NLP Lexicon as a Largely Language-Independent Resource. Machine Translation 19, 139–173 (2005). https://doi.org/10.1007/s10590-006-9001-y

  • DOI: https://doi.org/10.1007/s10590-006-9001-y
