An Improved Multipitch Tracking Algorithm with Empirical Mode Decomposition

Jiang, Wei; Liu, Wen-Ju; Tan, Ying-Wei; Liang, Shan

doi:10.1007/978-3-662-45643-9_22

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 484))

Included in the following conference series:

Chinese Conference on Pattern Recognition

Abstract

Multipitch tracking is beneficial for speech separation, audio transcription, and many other tasks. In this paper, we greatly improve a state-of-the-art multipitch tracking algorithm. Whereas previous algorithms used the amplitudes and individual peak positions of the autocorrelation function (ACF), we propose a novel feature based on the average frequency of each time-frequency (T-F) unit, computed by an empirical mode decomposition (EMD) method. This feature is used to form the conditional probabilities of the hidden Markov model (HMM) given the pitch state of each frame, and the most likely state sequence is then found. Quantitative evaluations show that the novel feature is more effective and that our algorithm significantly outperforms the previous one.
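
The last step of the abstract, finding the most likely HMM state sequence, is commonly carried out with the Viterbi algorithm. The abstract does not name the search procedure, so the following is only a minimal sketch under that assumption. The function name `viterbi`, the two toy states, and all probability values are hypothetical and are not taken from the paper; in the actual algorithm the observation likelihoods would come from the proposed average-frequency feature.

```python
import math


def viterbi(log_init, log_trans, log_obs):
    """Return (best_path, best_log_prob) for an HMM, working in the log domain.

    log_init[s]     : log P(state_0 = s)
    log_trans[r][s] : log P(state_t = s | state_{t-1} = r)
    log_obs[t][s]   : log P(observation_t | state_t = s)
    """
    n_states = len(log_init)
    # score[s] is the best log probability of any path ending in state s.
    score = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    backpointers = []
    for frame in log_obs[1:]:
        prev = score
        score, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at this frame.
            best_r = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best_r)
            score.append(prev[best_r] + log_trans[best_r][s] + frame[s])
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


def to_log(rows):
    """Convert a table of probabilities to log probabilities."""
    return [[math.log(p) for p in row] for row in rows]


# Hypothetical toy model: state 0 = "no pitch", state 1 = "one pitch".
log_init = [math.log(0.6), math.log(0.4)]
log_trans = to_log([[0.7, 0.3], [0.4, 0.6]])
# One row per frame: assumed likelihood of the observed feature under each state.
log_obs = to_log([[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]])

path, log_prob = viterbi(log_init, log_trans, log_obs)
# path == [0, 1, 1]
```

Working with log probabilities avoids numerical underflow when many frames are multiplied together, which matters for utterances that span hundreds of frames.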

Author information

Authors and Affiliations

NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Wei Jiang, Wen-Ju Liu, Ying-Wei Tan & Shan Liang

Editor information

Editors and Affiliations

College of Electrical and Information Engineering, Hunan University, 410082, Changsha, P.R. China
Shutao Li
Chinese Academy of Sciences, Beijing, China
Chenglin Liu
College of Electrical and Information Engineering, Hunan University, 410082, Changsha, P.R. China
Yaonan Wang

Cite this paper

Jiang, W., Liu, WJ., Tan, YW., Liang, S. (2014). An Improved Multipitch Tracking Algorithm with Empirical Mode Decomposition. In: Li, S., Liu, C., Wang, Y. (eds) Pattern Recognition. CCPR 2014. Communications in Computer and Information Science, vol 484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45643-9_22

DOI: https://doi.org/10.1007/978-3-662-45643-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45642-2
Online ISBN: 978-3-662-45643-9
eBook Packages: Computer Science, Computer Science (R0)