Improvement of time alignment of the speech signals to be used in voice conversion

Mozaffari, Fatemeh; Sayadian, Abolghasem

doi:10.1007/s10772-018-9490-0

Published: 15 January 2018

Volume 21, pages 79–84, (2018)

International Journal of Speech Technology

Abstract

One of the main applications of time alignment is parallel-corpus-based voice conversion. Various methods, such as dynamic time warping (DTW) and hidden Markov models, have been suggested in the literature for time alignment of two speech signals. In this paper, we introduce modifications to DTW that decrease the time alignment error: refinement by applying a threshold, normalization, and comparison between preceding and following frames, so as to establish a sound correspondence between the utterances of two speakers in a parallel corpus. Evaluation on a set of corpus sentences indicates a significant improvement in time alignment: the error decreases by at least about 4%, and in some cases by 15%, compared with DTW.
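
For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal sketch of classic DTW only, assuming Euclidean frame distance and the standard three-way step pattern; the function name `dtw_align` and the feature layout are illustrative assumptions. The paper's own modifications (thresholded refinement, normalization, neighbouring-frame comparison) are not reproduced here, because the abstract does not specify them.

```python
import numpy as np

def dtw_align(x, y):
    """Align two feature sequences with classic DTW (baseline only).

    x: array-like of shape (n, d); y: array-like of shape (m, d).
    Returns (total_cost, path), where path is a list of (i, j) frame pairs.
    Assumptions: Euclidean frame distance, steps (1,1), (1,0), (0,1).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)

    # Pairwise Euclidean distances between all frames of x and y.
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)

    # acc[i, j] = minimal accumulated cost of aligning x[:i] with y[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1],  # advance in both sequences
                acc[i - 1, j],      # advance in x only
                acc[i, j - 1],      # advance in y only
            )

    # Backtrack from the last frame pair to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return acc[n, m], path
```

In a voice conversion setting, `x` and `y` would typically be per-frame spectral features of the source and target utterances, and the returned path gives the frame pairs used for training; the choice of features is outside what the abstract states.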


Author information

Authors and Affiliations

Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran
Fatemeh Mozaffari & Abolghasem Sayadian

Corresponding author

Correspondence to Fatemeh Mozaffari.

About this article

Cite this article

Mozaffari, F., Sayadian, A. Improvement of time alignment of the speech signals to be used in voice conversion. Int J Speech Technol 21, 79–84 (2018). https://doi.org/10.1007/s10772-018-9490-0

Received: 06 July 2017
Accepted: 07 January 2018
Published: 15 January 2018
Issue Date: March 2018
DOI: https://doi.org/10.1007/s10772-018-9490-0
