A Computational Approach for Corpus Based Analysis of Reduplicated Words in Bengali

Senapati, Apurbalal; Garain, Utpal

doi:10.1007/978-3-319-18111-0_34

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9041)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

Reduplication is an important phenomenon in language studies, especially in Indian languages. It is defined as the partial or complete repetition of a linguistic unit, i.e., a phoneme, morpheme, word, phrase, clause, or the utterance as a whole, and it conveys a different meaning at the syntactic as well as the semantic level. Reduplicated words play an important role in many natural language processing (NLP) applications, such as machine translation (MT), text summarization, and identification of multiword expressions. This article presents an algorithm for identifying reduplicated words in a text corpus and reports descriptive statistics of the reduplicated words frequently used in Bengali.
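The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that candidate reduplications are adjacent token pairs: identical neighbours count as complete reduplication, and neighbours that differ only in their first character count as a rough stand-in for partial (echo-type) reduplication. The names `tokenize`, `is_partial_match`, and `find_reduplications` are hypothetical.

```python
import re
from collections import Counter

# Assumption: tokens are separated by whitespace, hyphens, common punctuation,
# or the Bengali danda (U+0964). A real corpus may need a proper tokenizer.
_SPLIT = re.compile(r"[\s\-\u0964,.;:!?()]+")


def tokenize(text):
    """Split text into non-empty tokens."""
    return [tok for tok in _SPLIT.split(text) if tok]


def is_partial_match(a, b):
    """Heuristic for echo-type pairs: same length, different first
    character, identical remainder (e.g. 'jol' / 'tol')."""
    return len(a) == len(b) and len(a) >= 2 and a != b and a[1:] == b[1:]


def find_reduplications(text):
    """Count adjacent token pairs that look like reduplication.

    Returns a Counter keyed by (kind, first_token, second_token),
    where kind is 'complete' or 'partial'.
    """
    tokens = tokenize(text)
    counts = Counter()
    for first, second in zip(tokens, tokens[1:]):
        if first == second:
            counts[("complete", first, second)] += 1
        elif is_partial_match(first, second):
            counts[("partial", first, second)] += 1
    return counts


if __name__ == "__main__":
    # 'ধীরে ধীরে' (dhire dhire, "slowly") and 'জল টল' (jol tol, "water and such")
    sample = "ধীরে ধীরে জল টল"
    for key, freq in find_reduplications(sample).most_common():
        print(key, freq)
```

Because the sketch compares Unicode code points and ignores sentence boundaries, it will both miss and over-report cases in real text; frequency counts from the `Counter` are only the starting point for the kind of descriptive statistics the abstract mentions.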

Author information

Authors and Affiliations

Central Institute of Technology, BTAD, Kokrajhar, 783370, Assam, India
Apurbalal Senapati
Indian Statistical Institute, 203, B.T.Road, Kolkata, 700108, India
Utpal Garain

Corresponding author

Correspondence to Apurbalal Senapati.

Editor information

Editors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico DF, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Senapati, A., Garain, U. (2015). A Computational Approach for Corpus Based Analysis of Reduplicated Words in Bengali. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_34

DOI: https://doi.org/10.1007/978-3-319-18111-0_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18110-3
Online ISBN: 978-3-319-18111-0
eBook Packages: Computer Science, Computer Science (R0)
