An Effective Algorithm for Determining Pitch Markers of Vietnamese Speech Sentences

Ta, Thai Yen; Van Nguyen, Hung; Van Dao, Tuyet; Ngo, Huy Hoang; Sergey, Ablameyko

doi:10.1007/978-3-319-92537-0_72

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10878)

Included in the following conference series:

International Symposium on Neural Networks

Abstract

Synthesizing Vietnamese tones plays an important role in Vietnamese text-to-speech systems. The first important step is to determine the pitch markers of voice utterances, which remains an open problem. In this paper, we propose a simple and efficient algorithm that locates the pitch markers at the peaks of the cumulative signal of each voiced part of the input utterance. Experiments show that the proposed algorithm places pitch markers with high accuracy. Based on this result, we have synthesized complex Vietnamese tones, such as the drop and broken tones, for isolated syllables that sound clear to listeners.
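
The abstract gives only the core idea (pitch markers at the peaks of the cumulative signal of a voiced part), not the details. The following Python sketch illustrates that idea under stated assumptions and should not be read as the authors' implementation. It assumes that "cumulative signal" means the running sum of the mean-removed samples, that the voiced segment has already been isolated, and that peaks closer together than one period of a hypothetical maximum F0 (`max_f0`) are merged. The function names `cumulative_signal`, `pick_peaks`, and `pitch_markers` are illustrative.

```python
import math


def cumulative_signal(samples):
    """Running sum of the samples after removing their mean.

    Assumption: this is what 'cumulative signal' refers to. Removing the
    mean keeps the running sum from drifting steadily up or down.
    """
    mean = sum(samples) / len(samples)
    total = 0.0
    out = []
    for s in samples:
        total += s - mean
        out.append(total)
    return out


def pick_peaks(signal, min_distance):
    """Indices of local maxima that are at least min_distance apart.

    When two maxima are too close, the higher one is kept.
    """
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] >= signal[i + 1]
    ]
    kept = []
    # Visit candidates from highest to lowest so stronger peaks win.
    for i in sorted(candidates, key=lambda k: signal[k], reverse=True):
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)
    return sorted(kept)


def pitch_markers(voiced_segment, sample_rate, max_f0=400.0):
    """Sample indices of pitch markers for one voiced segment.

    max_f0 is a hypothetical upper bound on the fundamental frequency,
    used only to set the minimum spacing between markers.
    """
    min_distance = max(1, int(sample_rate / max_f0))
    return pick_peaks(cumulative_signal(voiced_segment), min_distance)


# Synthetic stand-in for a voiced segment: a 100 Hz sinusoid sampled
# at 8 kHz, five full periods long (80 samples per period).
sample_rate = 8000
segment = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(400)]
print(pitch_markers(segment, sample_rate))
```

On this synthetic input the sketch returns one marker per period, about 80 samples apart. Real speech would additionally need voiced/unvoiced segmentation and probably smoothing before peak picking, neither of which is shown here.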

Author information

Authors and Affiliations

Hanoi University of Business and Technology, Hanoi, Vietnam
Thai Yen Ta
Military Institute of Science and Technology, Hanoi, Vietnam
Hung Van Nguyen
Belarusian State University, Minsk, Belarus
Tuyet Van Dao & Ablameyko Sergey
Vietnam National Space Center, Vietnam Academy of Science and Technology, Hanoi, Vietnam
Tuyet Van Dao
Electric Power University, Vietnam Ministry of Industry and Trade, 235 Hoang Quoc Viet Street, Hanoi, Vietnam
Huy Hoang Ngo
United Institute of Informatics Problems of the National Academy of Sciences of Belarus, 6, Surganova Street, Minsk, 220012, Belarus
Ablameyko Sergey

Corresponding author

Correspondence to Thai Yen Ta.

Editor information

Editors and Affiliations

Texas A&M University at Qatar, Doha, Qatar
Tingwen Huang
Sichuan University, Chengdu, China
Jiancheng Lv
Southeast University, Nanjing, China
Changyin Sun
United Institute of Informatics Problems, Minsk, Belarus
Alexander V. Tuzikov

About this paper

Cite this paper

Ta, T.Y., Van Nguyen, H., Van Dao, T., Ngo, H.H., Sergey, A. (2018). An Effective Algorithm for Determining Pitch Markers of Vietnamese Speech Sentences. In: Huang, T., Lv, J., Sun, C., Tuzikov, A. (eds) Advances in Neural Networks – ISNN 2018. ISNN 2018. Lecture Notes in Computer Science, vol 10878. Springer, Cham. https://doi.org/10.1007/978-3-319-92537-0_72

DOI: https://doi.org/10.1007/978-3-319-92537-0_72
Published: 26 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92536-3
Online ISBN: 978-3-319-92537-0
eBook Packages: Computer Science, Computer Science (R0)
