Abstract
Wordnets, which are repositories of lexical semantic knowledge containing semantically linked synsets and lexically linked words, are indispensable for work in computational linguistics and natural language processing. While building wordnets for Hindi and Marathi, two major Indo-European languages, we observed that the verb hierarchy in the Princeton WordNet was rather shallow. We therefore set out to construct a verb knowledge base for Hindi that arranges Hindi verbs in a hierarchy based on the is-a (hypernymy) relation. In doing so, we realized that there are phenomena unique to Indian languages that bear upon the choice between lexicalizing an expression and deriving it syntactically. One such phenomenon is the occurrence of conjunct and compound verbs (called complex predicates), which are found in all Indian languages. This paper presents our experience in constructing lexical knowledge bases for Indian languages, with special attention to Hindi. The question of storing versus deriving complex predicates is addressed both linguistically and computationally. We have constructed empirical tests to decide whether a combination of two words, the second of which is a verb, is a complex predicate. Such tests provide a principled way of deciding the status of complex predicates in Indian language wordnets.
Notes
Capital letters are used to represent the retroflexed series of consonants of Hindi.
Cite this article
Bhattacharyya, P., Chakrabarti, D. & Sarma, V.M. Complex predicates in Indian languages and wordnets. Lang Resources & Evaluation 40, 331–355 (2006). https://doi.org/10.1007/s10579-007-9032-x
DOI: https://doi.org/10.1007/s10579-007-9032-x