Automatic phonetic segmentation of Hindi speech using hidden Markov model

Balyan, Archana; Agrawal, S. S.; Dev, Amita

doi:10.1007/s00146-012-0386-2

Open Forum
Published: 17 February 2012

Volume 27, pages 543–549 (2012)

AI & SOCIETY

Archana Balyan¹,
S. S. Agrawal² &
Amita Dev³

Abstract

In this paper, we study the performance of a baseline hidden Markov model (HMM) for the segmentation of speech signals. The model is applied to a single-speaker segmentation task using a Hindi speech database. The automatic phoneme segmentation framework we developed imitates the human phoneme segmentation process. A set of 44 Hindi phonemes was chosen for the segmentation experiment, in which we used continuous density hidden Markov models (CDHMMs) with Gaussian mixture distributions. A left-to-right topology with no skip states was selected because it is effective in speech recognition, being consistent with the natural way spoken words are articulated. The system accepts speech utterances along with their orthographic transcriptions and generates segmentation information for the speech. The corpus was used to develop context-independent HMMs for each of the Hindi phonemes. The system was trained on numerous sentences of the kind used to provide information to passengers of the Metro Rail, and it was validated against a few manually segmented speech utterances. The experiments show that the best performance is obtained with a combination of two Gaussian mixtures and five HMM states. A category-wise phoneme error analysis has been performed, and the performance of the phonetic segmentation is reported. The HMM modeling was implemented in C++ using Microsoft Visual Studio 2005, and the system is designed to run on the Windows operating system. The goal of this study is the automatic segmentation of speech at the phonetic level.
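
To make the left-to-right, no-skip topology concrete, the following is a minimal sketch of how a known phoneme sequence can be aligned to speech frames with the Viterbi algorithm. It is not the authors' implementation (which the abstract says was written in C++). The function names, the one-dimensional features, the single unit-variance Gaussian per state (used here in place of a Gaussian mixture), the equal stay/advance probabilities, and the two states per phoneme are all assumptions chosen to keep the example small.

```python
import math

NEG_INF = float("-inf")


def gauss_loglik(x, mean, var=1.0):
    """Log-density of a 1-D Gaussian (toy stand-in for a mixture emission)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def forced_align(loglik, log_stay=math.log(0.5), log_move=math.log(0.5)):
    """Viterbi alignment through a left-to-right chain with no skip transitions.

    loglik[t][s] is the emission log-likelihood of frame t under chain state s.
    The path starts in state 0, ends in the last state, and at every frame
    either stays in the current state or advances by exactly one state.
    """
    T, S = len(loglik), len(loglik[0])
    if T < S:
        raise ValueError("need at least one frame per state")
    delta = [[NEG_INF] * S for _ in range(T)]  # best log-score ending in (t, s)
    back = [[0] * S for _ in range(T)]         # predecessor state for backtrace
    delta[0][0] = loglik[0][0]
    for t in range(1, T):
        for s in range(S):
            stay = delta[t - 1][s] + log_stay
            move = delta[t - 1][s - 1] + log_move if s > 0 else NEG_INF
            if move > stay:
                delta[t][s], back[t][s] = move, s - 1
            else:
                delta[t][s], back[t][s] = stay, s
            delta[t][s] += loglik[t][s]
    # Backtrace from the final state at the final frame.
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def phone_segments(path, phones, states_per_phone):
    """Convert a state path into (phone, start_frame, end_frame_exclusive)."""
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        cur = path[t - 1] // states_per_phone
        nxt = path[t] // states_per_phone if t < len(path) else None
        if nxt != cur:
            segments.append((phones[cur], start, t))
            start = t
    return segments


def demo():
    """Toy example: two hypothetical phones, two states each, 1-D features."""
    phones = ["k", "a"]                 # transcription of the utterance
    state_means = [0.0, 1.0, 5.0, 6.0]  # made-up emission means per state
    frames = [0.1, -0.2, 0.9, 1.1, 5.2, 4.8, 6.1, 5.9]
    loglik = [[gauss_loglik(x, m) for m in state_means] for x in frames]
    path = forced_align(loglik)
    return path, phone_segments(path, phones, states_per_phone=2)


if __name__ == "__main__":
    path, segments = demo()
    print(path)
    print(segments)
```

In this toy run, the recovered path spends two frames in each state, so the boundary between the two phones falls at frame 4. Multiplying frame indices by the frame shift of the front end (a value the abstract does not give) would convert these boundaries to times that can be compared with manual segmentation.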

Author information

Authors and Affiliations

Maharaja Surajmal Institute of Technology, Guru Gobind Singh Indraprastha University, C-4, Janakpuri, New Delhi, 110058, India
Archana Balyan
KIIT College of Engineering, KIIT Campus, Sohna Road, Gurgaon, Haryana, India
S. S. Agrawal
Bhai Parmanand Institute of Business Studies, Shakurpur, Delhi, India
Amita Dev

Corresponding author

Correspondence to Archana Balyan.

About this article

Cite this article

Balyan, A., Agrawal, S.S. & Dev, A. Automatic phonetic segmentation of Hindi speech using hidden Markov model. AI & Soc 27, 543–549 (2012). https://doi.org/10.1007/s00146-012-0386-2

Received: 14 October 2010
Accepted: 17 January 2012
Published: 17 February 2012
Issue Date: November 2012
DOI: https://doi.org/10.1007/s00146-012-0386-2
