An Algorithm for Phase Manipulation in a Speech Signal

  • Conference paper
Speech and Computer (SPECOM 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9811)

Abstract

While the human auditory system is predominantly sensitive to the amplitude spectrum of an incoming sound, a number of sound perception studies have shown that the phase spectrum is also perceptually relevant. In the case of speech, its relevance can be established through experiments with speech vocoding or parametric speech synthesis, where particular ways of manipulating the phase of the voiced excitation (i.e. setting it to zero or to random values) can be shown to affect voice quality. In such experiments the phase should be manipulated with as little distortion of the amplitude spectrum as possible. Otherwise, a degradation in voice quality that is perceived in listening tests but is actually caused by distortion of the amplitude spectrum may be incorrectly attributed to the influence of phase. The paper presents an algorithm for phase manipulation of a speech signal, based on inverse filtering, which introduces negligible distortion into the amplitude spectrum, and demonstrates its accuracy on a number of examples.
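The constraint stated in the abstract (change the phase, leave the amplitude spectrum alone) can be illustrated on a single frame with a plain DFT. The sketch below is not the inverse-filtering algorithm of the paper, whose details are not given on this page; it is a minimal frame-level illustration, and the names `dft`, `idft` and `set_phase` are hypothetical helpers introduced only for this example.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, the counterpart of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def set_phase(frame, mode="zero", seed=0):
    """Return a real frame with the same amplitude spectrum as `frame`
    but with its phase set to zero (mode="zero") or to random values
    (any other mode). Illustrative assumption, not the paper's method."""
    n = len(frame)
    spectrum = dft(frame)
    rng = random.Random(seed)
    new_spectrum = [0j] * n
    for k in range(n // 2 + 1):
        magnitude = abs(spectrum[k])
        is_dc_or_nyquist = k == 0 or (n % 2 == 0 and k == n // 2)
        if mode == "zero" or is_dc_or_nyquist:
            # DC and Nyquist bins must stay real for the output to be real.
            phase = 0.0
        else:
            phase = rng.uniform(-math.pi, math.pi)
        new_spectrum[k] = cmath.rect(magnitude, phase)
        if 0 < k < n - k:
            # Mirror into the negative-frequency bin (conjugate symmetry).
            new_spectrum[n - k] = new_spectrum[k].conjugate()
    return [value.real for value in idft(new_spectrum)]
```

Calling `set_phase(frame, "zero")` or `set_phase(frame, "random")` on a short list of samples and comparing `abs()` of the bins of `dft(frame)` with those of the result shows the magnitudes agreeing up to rounding error. Frame-by-frame processing of a real utterance additionally involves windowing and overlap-add, which can themselves disturb the amplitude spectrum; that is the kind of distortion the abstract says the proposed algorithm keeps negligible.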

Notes

  1. A review of the most important early studies in phase perception can be found, e.g., in [3].

References

  1. Ohm, G.S.: Über die Definition des Tones, nebst daran geknüpfter Theorie der Sirene und ähnlicher tonbildender Vorrichtungen. Annalen der Physik und Chemie 135(8), 513–565 (1843)

  2. von Helmholtz, H.L.F.: Über die Klangfarbe der Vocale. Annalen der Physik und Chemie 18, 280–290 (1859)

  3. Plomp, R., Steeneken, H.J.M.: Effect of phase on the timbre of complex tones. J. Acoust. Soc. Am. 46(2B), 409–421 (1969)

  4. Schroeder, M.R.: Models of hearing. Proc. IEEE 63, 1332–1350 (1975)

  5. Oppenheim, A.V., Lim, J.S.: The importance of phase in signals. Proc. IEEE 69, 529–541 (1981)

  6. Patterson, R.D.: A pulse ribbon model of monaural phase perception. J. Acoust. Soc. Am. 82(5), 1560–1586 (1987)

  7. Paliwal, K.K., Alsteris, L.D.: On the usefulness of STFT phase spectrum in human listening tests. Speech Commun. 45(2), 153–170 (2005)

  8. Lim, J.S., Oppenheim, A.V.: Enhancement and bandwidth compression of noisy speech. Proc. IEEE 67, 1586–1604 (1979)

  9. Wang, D.L., Lim, J.S.: The unimportance of phase in speech enhancement. IEEE Trans. Acoust. Speech Signal Process. 30(4), 679–681 (1982)

  10. Pobloth, H., Kleijn, W.B.: On phase perception in speech. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 1, pp. 29–32 (1999)

  11. Shi, G., Shanechi, M.M., Aarabi, P.: On the importance of phase in human speech recognition. IEEE Trans. Audio Speech Lang. Process. 14(5), 1867–1874 (2006)

  12. Schluter, R., Ney, H.: Using phase spectrum information for improved speech recognition performance. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 1, pp. 133–136 (2001)

  13. Raitio, T., Juvela, L., Suni, A., Vainio, M., Alku, P.: Phase perception of the glottal excitation and its relevance in statistical parametric speech synthesis. Speech Commun. (in press, 2016)

  14. Sečujski, M., Ostrogonac, S., Suzić, S., Pekar, D.: Speech database production and tagset design aimed at expressive text-to-speech in Serbian. In: Proceedings of Digital Signal and Image Processing (DOGS), Novi Sad, Serbia, pp. 51–54 (2014)

  15. Zen, H., Nose, T., Yamagishi, J., Sako, S., Masuko, T., Black, A.W., Tokuda, K.: The HMM-based speech synthesis system version 2.0. In: Proceedings of ISCA Speech Synthesis Workshop (2007)

Acknowledgments

The presented study was supported in part by the Ministry of Education and Science of the Republic of Serbia (grant TR32035), in part by the project “SP2: SCOPES Project for Speech Prosody” (No. CRSII2-147611/1), financed by the Swiss National Science Foundation, and in part by the company Speech Morphing, Inc. from Campbell, CA, USA, which also provided some of the speech corpora used in the experiments.

Author information

Corresponding author

Correspondence to Milan Sečujski.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Pekar, D., Suzić, S., Mak, R., Friedlander, M., Sečujski, M. (2016). An Algorithm for Phase Manipulation in a Speech Signal. In: Ronzhin, A., Potapova, R., Németh, G. (eds) Speech and Computer. SPECOM 2016. Lecture Notes in Computer Science, vol 9811. Springer, Cham. https://doi.org/10.1007/978-3-319-43958-7_10

  • DOI: https://doi.org/10.1007/978-3-319-43958-7_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43957-0

  • Online ISBN: 978-3-319-43958-7

  • eBook Packages: Computer Science, Computer Science (R0)
