Automatic speech segmentation in syllable centric speech recognition system

Panda, Soumya Priyadarsini; Nayak, Ajit Kumar

doi:10.1007/s10772-015-9320-6

Published: 21 November 2015

Volume 19, pages 9–18, (2016)

International Journal of Speech Technology

Abstract

Speech recognition is the process by which a computer interprets natural human speech. A syllable-centric speech recognition system identifies the syllable boundaries in the input speech and converts the resulting segments into the corresponding text units. Accurate segmentation of the acoustic speech signal into syllabic units is therefore an important step in building a highly accurate speech recognition system. This paper presents an automatic syllable-based technique for segmenting continuous speech signals in Indian languages at syllable boundaries. To analyze the performance of the proposed technique, a set of experiments was carried out on speech samples in three Indian languages (Hindi, Bengali, and Odia), and the results were compared with those of the existing group-delay-based segmentation technique and with manual segmentation. The experimental results indicate that the proposed technique segments syllable units from the original speech samples effectively relative to the existing techniques.
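The abstract does not say how detected boundaries are scored against manual segmentation. As an illustration only, the sketch below shows one common way such a comparison can be made: each automatically detected boundary is matched to at most one manually marked boundary within a time tolerance, and the matches are summarized as precision, recall, and F-score. The function names, the 20 ms tolerance, and the sample boundary times are all assumptions made for this example; none of them come from the paper.

```python
from typing import List, Tuple


def match_boundaries(
    detected: List[float],
    reference: List[float],
    tolerance: float = 0.02,  # assumed tolerance in seconds, not from the paper
) -> Tuple[int, int, int]:
    """Match detected boundaries to reference boundaries one-to-one.

    Both lists hold boundary times in seconds. Returns a tuple
    (hits, insertions, deletions), where an insertion is a detected
    boundary with no reference boundary within the tolerance and a
    deletion is a reference boundary that was not detected.
    """
    detected = sorted(detected)
    reference = sorted(reference)
    i = j = hits = 0
    while i < len(detected) and j < len(reference):
        diff = detected[i] - reference[j]
        if abs(diff) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # detected boundary lies before any usable reference
        else:
            j += 1  # reference boundary was missed
    return hits, len(detected) - hits, len(reference) - hits


def f_score(hits: int, insertions: int, deletions: int) -> float:
    """Harmonic mean of boundary precision and recall."""
    precision = hits / (hits + insertions) if hits + insertions else 0.0
    recall = hits / (hits + deletions) if hits + deletions else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical boundary times (seconds) for one utterance.
manual = [0.25, 0.50, 0.75, 1.00]
automatic = [0.26, 0.58, 0.74]

hits, insertions, deletions = match_boundaries(automatic, manual)
print(hits, insertions, deletions)
print(round(f_score(hits, insertions, deletions), 3))
```

The same scoring can be applied to the output of any segmentation method, which lets a proposed technique and a baseline be compared against one manual reference under identical conditions.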

Author information

Authors and Affiliations

Department of CSE, Institute of Technical Education and Research, Siksha ‘O’ Anusandhan University, Bhubaneswar, Odisha, India
Soumya Priyadarsini Panda
Department of CS&IT, Institute of Technical Education and Research, Siksha ‘O’ Anusandhan University, Bhubaneswar, Odisha, India
Ajit Kumar Nayak

Corresponding author

Correspondence to Soumya Priyadarsini Panda.

About this article

Cite this article

Panda, S.P., Nayak, A.K. Automatic speech segmentation in syllable centric speech recognition system. Int J Speech Technol 19, 9–18 (2016). https://doi.org/10.1007/s10772-015-9320-6

Received: 13 August 2015
Accepted: 12 November 2015
Published: 21 November 2015
Issue Date: March 2016
DOI: https://doi.org/10.1007/s10772-015-9320-6
