Abstract
We discuss the mathematical structure of various levels of representation of Sanskrit text in order to guide the design of computer aids aiming at useful processing of the digitalised Sanskrit corpus. Two main levels are identified, called the linear level and the functional level. The design space of these two levels is sketched, and the computational implications of the main design choices are discussed. Current solutions to the problems of mechanical segmentation, tagging, and parsing of Sanskrit text are briefly surveyed in this light. An analysis of the requirements of relevant linguistic resources is provided, with a view to justifying standards that allow interoperability of computer tools.
This paper does not attempt to provide definitive solutions to the representation of Sanskrit at the various levels. It should rather be considered as a survey of various choices, allowing an open discussion of such issues in a formally precise general framework.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Huet, G. (2009). Formal Structure of Sanskrit Text: Requirements Analysis for a Mechanical Sanskrit Processor. In: Huet, G., Kulkarni, A., Scharf, P. (eds) Sanskrit Computational Linguistics. ISCLS 2007, ISCLS 2008. Lecture Notes in Computer Science, vol 5402. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00155-0_6
DOI: https://doi.org/10.1007/978-3-642-00155-0_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00154-3
Online ISBN: 978-3-642-00155-0
eBook Packages: Computer Science, Computer Science (R0)