Abstract
This paper presents a novel and robust pitch estimation method. The basic idea is to reshape the speech signal by combining dominant harmonic modification (DHM) with data-adaptive time-domain filtering. The noisy speech signal is first filtered within the range of fundamental frequencies to obtain the pre-filtered signal (PFS). The dominant harmonic (DH) of the PFS is then determined and its amplitude is enhanced. The normalized autocorrelation function (NACF) of the modified signal is computed, and empirical mode decomposition (EMD) based data-adaptive time-domain filtering is applied to the NACF signal. A partial reconstruction is performed in the EMD domain, and the pitch period is determined from the partially reconstructed signal. Experimental results show that the proposed method performs better than other recently developed methods on both noisy and clean speech signals in terms of gross and fine pitch errors.
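To make the NACF stage concrete, the following is a minimal sketch of picking a pitch period from a normalized autocorrelation function. It is not the authors' implementation: it omits the pre-filtering, dominant harmonic modification, and EMD-based filtering stages entirely, and it assumes one common normalization (division by the zero-lag energy), which may differ from the definition used in the paper. The function names `nacf` and `pitch_period`, the 50–500 Hz search range, the 8 kHz sampling rate, and the synthetic test frame are all hypothetical choices made for illustration.

```python
import math


def nacf(frame, max_lag):
    """Autocorrelation of `frame` for lags 0..max_lag, divided by the zero-lag energy.

    Assumption: this is one common normalization, not necessarily the paper's.
    """
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return [0.0] * (max_lag + 1)
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]


def pitch_period(frame, fs, f0_min=50.0, f0_max=500.0):
    """Return the lag (in samples) of the largest NACF value inside the assumed F0 range."""
    min_lag = int(fs / f0_max)                       # shortest period considered
    max_lag = min(int(fs / f0_min), len(frame) - 1)  # longest period considered
    r = nacf(frame, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical clean test frame: a 125 Hz fundamental plus a weaker second
# harmonic, sampled at 8 kHz, so the true period is 8000 / 125 = 64 samples.
fs = 8000
f0 = 125.0
frame = [
    math.sin(2 * math.pi * f0 * n / fs)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * n / fs)
    for n in range(480)
]

lag = pitch_period(frame, fs)
estimated_f0 = fs / lag
print(lag, estimated_f0)
```

Because the sum for each lag runs over fewer samples as the lag grows, this estimator tapers toward longer lags, which favors the true period over its multiples on a clean periodic frame. On noisy speech a plain NACF peak is much less reliable, which is the situation the harmonic modification and EMD-domain filtering described in the abstract are meant to address.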
Roy, S.K., Molla, M.K.I., Hirose, K. et al. Harmonic modification and data adaptive filtering based approach to robust pitch estimation. Int J Speech Technol 14, 339–349 (2011). https://doi.org/10.1007/s10772-011-9112-6
DOI: https://doi.org/10.1007/s10772-011-9112-6