
Minimum data generation for Telugu speech recognition

International Journal of Speech Technology

Abstract

A morphologically rich language has hundreds of forms of each word, which makes storing and maintaining them costly in time and resources. The many word forms also cause confusion during recognition, which raises the word error rate. These issues make it difficult to build speech recognition applications for such languages, so there is a need for a phonetically balanced minimal data set. This paper describes the generation of a minimum dataset for Telugu, the second most widely spoken language in India. Minimum data generation is treated as a set covering problem, and a variety of datasets are generated based on different criteria. Among the available set covering algorithms, the greedy algorithm is chosen. The criterion used for final data selection is the frequency of occurrence of words. Because set covering requires a large pool of data from which the minimum data is selected, a 15-million-word text corpus has been created. This corpus is analysed thoroughly to ensure that the generated set is phonetically balanced. The generated minimum dataset consists of 21 words and covers every phoneme of the Telugu language. Telugu speech technology researchers can use this dataset to build phoneme-level speech recognition applications with less manual recording effort and time. The paper discusses the role of a minimum dataset in LVSR systems, the details of the text corpus created, and the proposed algorithm for minimum data generation.
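The set covering formulation in the abstract can be illustrated with a minimal sketch of greedy selection. This is not the authors' algorithm: the function name `greedy_cover`, the placeholder words (`w1` to `w5`), the phoneme symbols, and the frequency counts are all hypothetical, and using word frequency as a tie-breaker is an assumption about how the frequency criterion might be applied.

```python
from typing import Dict, List, Set


def greedy_cover(word_phonemes: Dict[str, Set[str]],
                 word_freq: Dict[str, int]) -> List[str]:
    """Greedy set cover over phonemes (illustrative sketch only).

    The universe is every phoneme that appears in at least one word.
    At each step the word covering the most still-uncovered phonemes
    is chosen; ties go to the more frequent word (an assumption), and
    remaining ties to the alphabetically first word.
    """
    uncovered: Set[str] = set().union(*word_phonemes.values())
    chosen: List[str] = []
    while uncovered:
        # sorted() fixes the scan order, so max() resolves full ties
        # deterministically in favour of the alphabetically first word.
        best = max(
            sorted(word_phonemes),
            key=lambda w: (len(word_phonemes[w] & uncovered),
                           word_freq.get(w, 0)),
        )
        chosen.append(best)
        uncovered -= word_phonemes[best]
    return chosen


# Toy input: placeholder words and phoneme labels, not real Telugu data.
toy_phonemes = {
    "w1": {"a", "k", "m"},
    "w2": {"a", "k"},
    "w3": {"i", "t"},
    "w4": {"m", "i", "t", "u"},
    "w5": {"u"},
}
toy_freq = {"w1": 50, "w2": 120, "w3": 80, "w4": 10, "w5": 300}

print(greedy_cover(toy_phonemes, toy_freq))  # ['w4', 'w2']
```

In the toy run, `w4` is picked first because it covers four phonemes; `w1` and `w2` then tie on the two remaining phonemes, and the more frequent `w2` wins. Greedy selection gives a small cover but is not guaranteed to find the smallest possible one.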



Author information

Corresponding author

Correspondence to K. V. N. Sunitha.

Cite this article

Sunitha, K.V.N., Sharada, A. Minimum data generation for Telugu speech recognition. Int J Speech Technol 18, 217–230 (2015). https://doi.org/10.1007/s10772-014-9262-4

  • DOI: https://doi.org/10.1007/s10772-014-9262-4
