A Hybrid GMM and Codebook Mapping Method for Spectral Conversion

  • Conference paper
Affective Computing and Intelligent Interaction (ACII 2005)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3784)

Abstract

This paper proposes a new mapping method that combines GMM and codebook mapping to transform the spectral envelope in a voice conversion system. After analyzing the over-smoothing problem of the GMM mapping method in detail, we propose converting the basic spectral envelope with the GMM method and converting the envelope-subtracted spectral details with GMM and phone-tied codebook mapping. Objective evaluations based on performance indices show that the proposed mapping method improves on the GMM mapping method by 27.2017% on average, and listening tests confirm that it effectively reduces the over-smoothing problem of the GMM method while avoiding the discontinuity problem of the codebook mapping method.
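
The split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the diagonal-covariance GMM, the per-dimension `slopes`, the plain nearest-neighbour codebook lookup, the dictionary of codebooks keyed by phone label, and all function names and toy values below are assumptions made for illustration. The sketch also simplifies the detail path to a codebook lookup only, whereas the abstract states that the details are converted by GMM and phone-tied codebook mapping together.

```python
import numpy as np

def gmm_posteriors(x, weights, means, variances):
    """Posterior P(component | x) for a diagonal-covariance GMM."""
    diff = x[None, :] - means                                    # (M, D)
    log_pdf = -0.5 * np.sum(diff**2 / variances
                            + np.log(2 * np.pi * variances), axis=1)
    log_post = np.log(weights) + log_pdf
    log_post -= log_post.max()                                   # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

def gmm_convert(x, weights, means_x, vars_x, means_y, slopes):
    """Soft mapping: posterior-weighted sum of per-component linear maps.

    Averaging over components is what makes the output smooth.
    """
    post = gmm_posteriors(x, weights, means_x, vars_x)
    per_component = means_y + slopes * (x[None, :] - means_x)    # (M, D)
    return post @ per_component

def codebook_convert(detail, src_codebook, tgt_codebook):
    """Hard mapping: pick the target entry paired with the nearest source entry."""
    idx = np.argmin(np.sum((src_codebook - detail) ** 2, axis=1))
    return tgt_codebook[idx]

def hybrid_convert(envelope, detail, phone, gmm, codebooks):
    """Convert the basic envelope with the GMM and the details with the
    codebook tied to the current phone label, then add the two parts."""
    src_cb, tgt_cb = codebooks[phone]
    return gmm_convert(envelope, **gmm) + codebook_convert(detail, src_cb, tgt_cb)

# Toy parameters (hypothetical values, 2 components, 2 dimensions).
gmm = {
    "weights": np.array([0.5, 0.5]),
    "means_x": np.array([[0.0, 0.0], [10.0, 10.0]]),
    "vars_x": np.ones((2, 2)),
    "means_y": np.array([[1.0, 1.0], [20.0, 20.0]]),
    "slopes": np.ones((2, 2)),
}
codebooks = {
    "a": (np.array([[0.0, 0.0], [1.0, 1.0]]),    # source detail entries
          np.array([[0.5, 0.5], [2.0, 2.0]])),   # paired target detail entries
}

converted = hybrid_convert(np.array([0.0, 0.0]), np.array([0.9, 0.8]),
                           "a", gmm, codebooks)
print(converted)
```

The design point the sketch tries to show is the contrast between the two mappings: the GMM output is a weighted average and therefore continuous but smoothed, while the codebook output is a single stored entry and therefore detailed but prone to jumps between frames.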

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kang, Y., Shuang, Z., Tao, J., Zhang, W., Xu, B. (2005). A Hybrid GMM and Codebook Mapping Method for Spectral Conversion. In: Tao, J., Tan, T., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2005. Lecture Notes in Computer Science, vol 3784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573548_39

  • DOI: https://doi.org/10.1007/11573548_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29621-8

  • Online ISBN: 978-3-540-32273-3

  • eBook Packages: Computer Science, Computer Science (R0)
