Analysis-By-Synthesis Modeling of Bengali Intonation

Pakrashi, Moumita; Mahanta, Shakuntala

doi:10.1007/978-3-031-20980-2_45

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13721)

Included in the following conference series:

International Conference on Speech and Computer


Abstract

The main challenge in producing natural-sounding synthesized speech lies in objectively mapping the relation between the formal and functional representations of prosody in human speech. Alongside stress, rhythm, and duration, intonation is the most vital part of prosody contributing to the naturalness of synthetic speech. Recent prosodic studies of Bengali and their applications have used the Autosegmental-Metrical and Fujisaki models, but there remains much scope for improving the naturalness of synthetic speech in existing TTS systems. In this paper, we study Bengali intonation patterns with Momel-INTSINT, a language-independent, hybrid phonetic-phonological model. The analysis-by-synthesis paradigm involves automatic symbolic coding of the prosodic form by INTSINT (INternational Transcription System for INTonation); this coding is derived from the output of the Momel (Modelling Melody) algorithm, which stylizes the raw F0 curve to reduce the complex acoustic data to a simplified model. The symbolic representation then becomes the input to the ProZed tool for generating synthetic speech. Our study is based on the prosodically representative set of Bengali sentences developed by C-DAC Kolkata. The automatic labeling framework of INTSINT tones helps in precise modeling of intonation patterns within the hierarchical prosodic units of accentual, intermediate, and intonation phrases in Bengali utterances.
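
As a rough illustration of the synthesis half of this paradigm, the sketch below turns a sequence of INTSINT tone labels back into F0 targets. It assumes the tone-to-target mapping commonly given in Hirst's descriptions of INTSINT (absolute tones T, M, B defined by a speaker's key and span; relative tones H, L, S, U, D defined from the preceding target); it is not taken from this paper and is not the ProZed implementation. The function name `intsint_to_f0`, the example key of 200 Hz, and the span of one octave are all hypothetical.

```python
import math


def intsint_to_f0(tones, key_hz, span_octaves):
    """Map INTSINT tone labels to F0 targets in Hz.

    Assumed mapping (not from the paper):
      T = key * sqrt(2**span), M = key, B = key / sqrt(2**span)
      H = sqrt(prev * T), L = sqrt(prev * B), S = prev
      U = sqrt(prev * sqrt(prev * T)), D = sqrt(prev * sqrt(prev * B))
    """
    top = key_hz * math.sqrt(2 ** span_octaves)
    bottom = key_hz / math.sqrt(2 ** span_octaves)

    targets = []
    # Assumption: if the sequence starts with a relative tone,
    # the key is used as the preceding target.
    prev = key_hz
    for tone in tones:
        if tone == "T":
            f0 = top
        elif tone == "M":
            f0 = key_hz
        elif tone == "B":
            f0 = bottom
        elif tone == "H":
            f0 = math.sqrt(prev * top)
        elif tone == "L":
            f0 = math.sqrt(prev * bottom)
        elif tone == "S":
            f0 = prev
        elif tone == "U":
            f0 = math.sqrt(prev * math.sqrt(prev * top))
        elif tone == "D":
            f0 = math.sqrt(prev * math.sqrt(prev * bottom))
        else:
            raise ValueError(f"unknown INTSINT tone: {tone!r}")
        targets.append(f0)
        prev = f0
    return targets


if __name__ == "__main__":
    # Hypothetical speaker: key 200 Hz, span 1 octave.
    for tone, f0 in zip("MTDLB", intsint_to_f0("MTDLB", 200.0, 1.0)):
        print(f"{tone}: {f0:.1f} Hz")
```

In a full analysis-by-synthesis loop, targets like these would be interpolated into a smooth contour and compared with the stylized F0 curve; that step is omitted here.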


Acknowledgements

We would like to thank C-DAC (Centre for Development of Advanced Computing) for providing the Bengali speech data. Special thanks go to the Speech Processing Lab, C-DAC Kolkata, for developing this prosodically enriched Bengali speech database, a valuable resource for speech research.

Author information

Authors and Affiliations

Centre for Linguistic Science and Technology, IIT Guwahati, Assam, India
Moumita Pakrashi
Department of Humanities and Social Sciences, IIT Guwahati, Assam, India
Shakuntala Mahanta

Corresponding author

Correspondence to Moumita Pakrashi.

Editor information

Editors and Affiliations

Indian Institute of Technology Dharwad, Dharwad, India
S. R. Mahadeva Prasanna
St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Koneru Lakshmaiah Education Foundation, Vaddeswaram, India
K. Samudravijaya
KIIT Group of Colleges, Gurugram, India
Shyam S. Agrawal

About this paper

Cite this paper

Pakrashi, M., Mahanta, S. (2022). Analysis-By-Synthesis Modeling of Bengali Intonation. In: Prasanna, S.R.M., Karpov, A., Samudravijaya, K., Agrawal, S.S. (eds) Speech and Computer. SPECOM 2022. Lecture Notes in Computer Science, vol 13721. Springer, Cham. https://doi.org/10.1007/978-3-031-20980-2_45

DOI: https://doi.org/10.1007/978-3-031-20980-2_45
Published: 10 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20979-6
Online ISBN: 978-3-031-20980-2
eBook Packages: Computer Science, Computer Science (R0)
