Abstract
This paper presents work on automatic converb detection in Old Braj poetry from the 15th–17th centuries. It is part of research on non-finite verbal forms in early New Indo-Aryan (NIA) language corpora comprising data from Old Rajasthani, Awadhi, Braj, Dakkhini and Pahari [8]. The goal of the detection mechanism is to classify a plaintext word as a converb or a non-converb. Such a mechanism facilitates further description and analysis of converbs, which is of great importance in research on the historical syntax of NIA. To develop the automatic detector, a selection of state-of-the-art statistical classification mechanisms was used.
Notes
- 1.
- 2. Here we consistently use two of the three Dixonian primitive terms [2], i.e. A (subject of a transitive verb) and O (object of a transitive verb).
References
Bickel, B.: Capturing particulars and universals in clause linkage: a multivariate analysis. In: Bril, I. (ed.) Clause Linking and Clause Hierarchy: Syntax and Pragmatics, pp. 51–102. No. 121 in Studies in Language Companion Series, John Benjamins, Amsterdam (2010). http://dx.doi.org/10.5167/uzh-48989
Dixon, R.M.: Ergativity. Cambridge Studies in Linguistics, Cambridge University Press (1994). https://books.google.pl/books?id=fKfSAu6v5LYC
Dvivedī, L.: Viṣṇudās kavkiṛt Rāmāyana kathā. Sāhitya bhavan limited (1972)
Dwarikesh, D.P.S.: Historical syntax of the conjunctive participle phrase in New Indo-Aryan dialects of Madhyadesa (Midland) of Northern India. Ph.D. dissertation, University of Chicago (1971)
Emeneau, M.: India as a linguistic area. Language 32, 3–16 (1956)
Haspelmath, M.: The converb as a cross-linguistically valid category. In: Haspelmath, M., König, E. (eds.) Converbs in Cross-Linguistic Perspective: Structure and Meaning of Adverbial Verb Forms - Adverbial Participles, Gerunds, pp. 1–55. No. 13 in Empirical Approaches to Language Typology, Mouton de Gruyter, Berlin (1995)
Jaworski, R., Jassem, K., Stroński, K.: Manual and automatic tagging of Indo-Aryan languages. In: Human Language Technologies as a Challenge for Computer Science and Linguistics, pp. 550–554 (2015)
Jaworski, R., Stroński, K.: New perspectives in annotating early New Indo-Aryan texts. In: Proceedings of the 32nd South Asian Languages Analysis Round Table SALA-32, Lisbon, Portugal, pp. 66–68 (2016)
Jaworski, R., Stroński, K.: Recognition and multi-layered analysis of converbs in early NIA. In: Proceedings of the 33rd South Asian Languages Analysis Round Table SALA-33, Poznań, Poland, pp. 55–56 (2017)
Langford, J., Li, L., Zhang, T.: Sparse online learning via truncated gradient. In: Advances in Neural Information Processing Systems, pp. 905–912 (2009)
Masica, C.P.: Defining a Linguistic Area: South Asia. University of Chicago Press, Chicago (1976)
McGregor, R.: The Language of Indrajit of Orchā: A Study of Early Braj Bhāṣā Prose. University of Cambridge Oriental Publications, Cambridge University Press (1968). https://books.google.pl/books?id=EjI3vgAACAAJ
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. CoRR abs/1301.3781 (2013). http://arxiv.org/abs/1301.3781
Misra, V.P.: Bhusana granthavali. Nai Dilli, Vani Prakasan (1994)
Stroński, K., Tokaj, J., Verbeke, S.: A diachronic account of converbal constructions in Old Rajasthani. In: Cennamo, M., Fabrizio, C. (eds.) Historical Linguistics 2015, Selected papers from the 22nd International Conference on Historical Linguistics, Naples, 27–31 July 2015, pp. 424–441. No. 348 in Current Issues in Linguistic Theory, John Benjamins, Amsterdam/Philadelphia (2019)
Subbārāo, K.: South Asian Languages: A Syntactic Typology. Cambridge University Press (2012). https://books.google.pl/books?id=ZCfiGYvpLOQC
Tikkanen, B.: The Sanskrit gerund: a synchronic, diachronic, and typological analysis. Studia Orientalia, Finnish Oriental Society (1987). https://books.google.pl/books?id=XTkqAQAAIAAJ
Wallace, W.D.: Object-marking in the history of Nepali: a case of syntactic diffusion. Stud. Linguist. Sci. 11(2), 107–128 (1981)
Acknowledgements
This research was supported by the Polish National Science Centre, grant 2013/10/M/HS2/00553.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Jaworski, R., Stroński, K. (2020). Experiments with Automatic and Semi-automatic Detection of Sparse Word Forms in Old Braj. In: Vetulani, Z., Paroubek, P., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2017. Lecture Notes in Computer Science, vol 12598. Springer, Cham. https://doi.org/10.1007/978-3-030-66527-2_9
DOI: https://doi.org/10.1007/978-3-030-66527-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66526-5
Online ISBN: 978-3-030-66527-2
eBook Packages: Computer Science, Computer Science (R0)