Minimum data generation for Telugu speech recognition

Sunitha, K. V. N.; Sharada, A.

doi:10.1007/s10772-014-9262-4

Published: 30 November 2014

Volume 18, pages 217–230, (2015)

International Journal of Speech Technology


Abstract

A morphologically rich language has hundreds of forms of each word, which makes storing and maintaining them costly in time and resources. It also leads to confusion during recognition, which raises the word error rate. These issues make it difficult to build speech recognition applications for such languages; hence there is a need for a phonetically balanced minimal data set. This paper describes the generation of a minimum dataset for Telugu, the second most widely spoken language in India. Treating minimum data generation as a set covering problem, a variety of datasets are generated based on different criteria. Of the various set covering algorithms, the greedy algorithm is chosen. The criterion used for final data selection is the frequency of occurrence of words. Because set covering requires a large pool of data from which the minimum data is selected, a 15-million-word text corpus has been created. A thorough analysis of this text corpus is carried out to ensure that the generated set is phonetically balanced. The generated minimum dataset consists of 21 words and covers every phoneme of the Telugu language. Telugu speech technology researchers can use this data set to build phoneme-level speech recognition applications with less manual recording effort and time. This paper discusses the role of the minimum data set in LVSR systems, the details of the text corpus created, and the proposed algorithm for minimum data generation.
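The set covering formulation in the abstract can be illustrated with a minimal sketch. The preview does not give the paper's actual procedure, so what follows is only the standard greedy set cover heuristic, with word frequency used as a tie-breaker; that tie-breaking rule is an assumption about how the frequency criterion might enter. The function name `greedy_min_wordset`, the romanized toy words, their letter-level "phoneme" sets, and the frequency counts are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_min_wordset(
    word_phonemes: Dict[str, Set[str]],
    word_freq: Dict[str, int],
) -> List[str]:
    """Greedily pick words until every phoneme in the lexicon is covered.

    Each round selects the word that covers the most still-uncovered
    phonemes; ties go to the more frequent word (assumed criterion),
    then to the alphabetically later word so the result is deterministic.
    """
    # The universe to cover is every phoneme that appears in some word.
    uncovered: Set[str] = set().union(*word_phonemes.values())
    chosen: List[str] = []
    while uncovered:
        best = max(
            word_phonemes,
            key=lambda w: (
                len(word_phonemes[w] & uncovered),
                word_freq.get(w, 0),
                w,
            ),
        )
        chosen.append(best)
        uncovered -= word_phonemes[best]
    return chosen


# Hypothetical toy lexicon: romanized words with made-up phoneme sets.
lexicon = {
    "amma": {"a", "m"},
    "nanna": {"n", "a"},
    "illu": {"i", "l", "u"},
    "nillu": {"n", "i", "l", "u"},
    "mana": {"m", "a", "n"},
}
# Hypothetical corpus frequencies.
freq = {"amma": 50, "nanna": 40, "illu": 30, "nillu": 5, "mana": 20}

selected = greedy_min_wordset(lexicon, freq)
print(selected)
```

On this toy input the first pick is the word with the widest phoneme coverage, and the second pick is decided by frequency between two words that cover the same remaining phonemes. Greedy selection gives a small cover but not necessarily the smallest possible one.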

Author information

Authors and Affiliations

BVRIT Hyderabad College of Engineering for Women, Bachupally, Hyderabad, AP, India
K. V. N. Sunitha
G. Narayanamma Institute of Technology & Science for Women, Shaikpet, Hyderabad, AP, India
A. Sharada

Corresponding author

Correspondence to K. V. N. Sunitha.

About this article

Cite this article

Sunitha, K.V.N., Sharada, A. Minimum data generation for Telugu speech recognition. Int J Speech Technol 18, 217–230 (2015). https://doi.org/10.1007/s10772-014-9262-4

Received: 18 July 2014
Accepted: 01 November 2014
Published: 30 November 2014
Issue Date: June 2015
DOI: https://doi.org/10.1007/s10772-014-9262-4
