Identification of Nasalization and Nasal Assimilation from Children’s Speech

Ramteke, Pravin Bhaskar; Supanekar, Sujata; Aithal, Venkataraja; Koolagudi, Shashidhar G.

doi:10.1007/978-3-030-66187-8_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11987)

Included in the following conference series:

International Conference on Mining Intelligence and Knowledge Exploration

Abstract

In children, nasalization is a commonly observed phonological process in which non-nasal sounds are substituted with nasal sounds. This work attempts to identify nasalization and nasal assimilation. The properties of nasal sounds and nasalized voiced sounds are explored using MFCCs extracted from the Hilbert envelope of the numerator of group delay (HNGD) spectrum. The HNGD spectrum highlights the formants in speech and the extra nasal formant in the vicinity of the first formant in nasalized voiced sounds. Features extracted from correctly pronounced and mispronounced words are compared using the Dynamic Time Warping (DTW) algorithm. The deviation of the DTW comparison path from its diagonal behavior is analyzed to identify mispronunciation. The combination of FFT-based MFCCs and HNGD-spectrum-based MFCCs is observed to achieve the highest accuracy, 82.22%, within a tolerance range of ±50 ms.
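
The DTW comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dtw_path` and `diagonal_deviation` are hypothetical, the local cost is assumed to be the Euclidean distance between frames, and the inputs are assumed to be precomputed feature matrices (one row per frame), with no MFCC or HNGD extraction performed here. The sketch aligns a reference and a test sequence and measures how far the warping path strays from the straight diagonal.

```python
import numpy as np


def dtw_path(ref, test):
    """Align two feature sequences (frames x dims) with classic DTW.

    Returns the accumulated cost and the warping path as a list of
    (ref_index, test_index) pairs. Euclidean frame distance is an
    assumption of this sketch.
    """
    n, m = len(ref), len(test)
    # Pairwise Euclidean distance between every reference and test frame.
    dist = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=2)

    # Accumulated cost matrix with an extra border row/column of infinity.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1],  # diagonal step
                acc[i - 1, j],      # advance in the reference only
                acc[i, j - 1],      # advance in the test only
            )

    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
        path.append((i - 1, j - 1))
    return acc[n, m], path[::-1]


def diagonal_deviation(path, n, m):
    """Distance (in test frames) of each path point from the straight
    line joining (0, 0) and (n - 1, m - 1)."""
    pts = np.asarray(path, dtype=float)
    expected_j = pts[:, 0] * (m - 1) / max(n - 1, 1)
    return np.abs(pts[:, 1] - expected_j)


# Toy data: the test sequence holds the third frame three frames longer.
ref = np.arange(6, dtype=float)[:, None]
test = np.array([0, 1, 2, 2, 2, 2, 3, 4, 5], dtype=float)[:, None]
cost, path = dtw_path(ref, test)
deviation = diagonal_deviation(path, len(ref), len(test))
print(cost, path, deviation.max())
```

In this toy case the path runs horizontally while the held frame repeats, so the deviation grows over that stretch. How such deviations are thresholded or mapped to time (for example, to the ±50 ms tolerance) depends on the frame shift and decision rule, which are not specified in the abstract.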

Acknowledgment

The authors would like to thank the Cognitive Science Research Initiative (CSRI), Department of Science & Technology, Government of India, Grant no. SR/CSRI/49/2015, for its financial support of this work.

Author information

Authors and Affiliations

National Institute of Technology Karnataka, Surathkal, India
Pravin Bhaskar Ramteke, Sujata Supanekar & Shashidhar G. Koolagudi
Department of Speech and Hearing, SOAHS, Manipal, India
Venkataraja Aithal

Corresponding author

Correspondence to Pravin Bhaskar Ramteke.

Editor information

Editors and Affiliations

National Institute of Technology, Goa, India
Purushothama B. R.
National Institute of Technology, Goa, India
Veena Thenkanidiyoor
Indian Institute of Information Technology, Sri City, India
Rajendra Prasath
Indian Institute of Information Technology, Sri City, India
Odelu Vanga

About this paper

Cite this paper

Ramteke, P.B., Supanekar, S., Aithal, V., Koolagudi, S.G. (2020). Identification of Nasalization and Nasal Assimilation from Children’s Speech. In: B. R., P., Thenkanidiyoor, V., Prasath, R., Vanga, O. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2019. Lecture Notes in Computer Science, vol 11987. Springer, Cham. https://doi.org/10.1007/978-3-030-66187-8_23

DOI: https://doi.org/10.1007/978-3-030-66187-8_23
Published: 20 December 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66186-1
Online ISBN: 978-3-030-66187-8
eBook Packages: Computer Science, Computer Science (R0)
