Interpolation of pitch contour using temporal decomposition

Ghaemmaghami, Shahrokh; Deriche, Mohamed; Boashash, Boualem

doi:10.1007/BF02111209

Published: September 1998

Volume 2, pages 215–225, (1998)

International Journal of Speech Technology

Abstract

This paper presents a method for predicting the pitch contour of a speech signal from a small number of pitch values, intended for very low rate speech coding. The method relies on the correlation between phonetic evolution and pitch variation during voiced speech segments. Temporal Decomposition (TD) is used to track the phonetic evolution and to identify perceptually significant time points. TD supplies the information needed both to determine the critical pitch values and to estimate the pitch contour: it detects event functions, which serve as interpolation paths, and their centroids, which mark the steadiest points in the spectral parameter space. The proposed method is shown to reduce the amount of pitch information to about one-tenth of that required by conventional frame-by-frame techniques, with less than 5% error in pitch approximation.
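The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no formulas, so the bell-shaped event functions, the normalized weighted-sum interpolation rule, the synthetic pitch track, and all names (`event_function`, `interpolate_pitch`, `mean_relative_error`) are assumptions made purely for illustration. Pitch is kept only at the assumed event centroids and every other frame is filled in by blending those values along the event functions. The numbers it produces are synthetic and say nothing about the paper's reported results.

```python
import math


def event_function(center, width, n_frames):
    """Hypothetical bell-shaped event function that peaks at its centroid frame."""
    return [math.exp(-0.5 * ((n - center) / width) ** 2) for n in range(n_frames)]


def interpolate_pitch(anchor_pitch, events):
    """Blend the anchor pitch values frame by frame.

    Assumed rule (not taken from the paper): each frame's pitch is the
    average of the anchor values, weighted by the event functions there.
    """
    n_frames = len(events[0])
    contour = []
    for n in range(n_frames):
        weights = [phi[n] for phi in events]
        total = sum(weights)
        contour.append(sum(w * p for w, p in zip(weights, anchor_pitch)) / total)
    return contour


def mean_relative_error(reference, estimate):
    """Average of |reference - estimate| / reference over all frames."""
    return sum(abs(r - e) / r for r, e in zip(reference, estimate)) / len(reference)


# Synthetic, smoothly varying pitch track in Hz over 50 voiced frames.
n_frames = 50
true_pitch = [120.0 + 20.0 * math.sin(math.pi * n / 50) for n in range(n_frames)]

# Assumed centroid positions; in the paper these would come from TD analysis.
centroids = [0, 12, 25, 37, 49]
events = [event_function(c, 6.0, n_frames) for c in centroids]

# Only the pitch values at the centroids are kept (5 of 50 frames).
anchors = [true_pitch[c] for c in centroids]
estimate = interpolate_pitch(anchors, events)

ratio = len(anchors) / n_frames
error = mean_relative_error(true_pitch, estimate)
print(f"kept {ratio:.0%} of pitch values, mean relative error {error:.2%}")
```

Because the weights at each frame are normalized, every interpolated value stays between the smallest and largest anchor pitch. A real system would replace the Gaussian shapes with the event functions actually produced by TD.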

Author information

Authors and Affiliations

Signal Processing Research Centre, School of Electrical and Electronic Systems Engineering, Queensland University of Technology, Brisbane, Australia
Shahrokh Ghaemmaghami, Mohamed Deriche & Boualem Boashash

About this article

Cite this article

Ghaemmaghami, S., Deriche, M. & Boashash, B. Interpolation of pitch contour using temporal decomposition. Int J Speech Technol 2, 215–225 (1998). https://doi.org/10.1007/BF02111209

Received: 25 April 1997
Revised: 25 February 1998
Accepted: 18 May 1998
Issue Date: September 1998
DOI: https://doi.org/10.1007/BF02111209
