Abstract
This paper compares the performance of three pitch detection algorithms (PDAs) by implementing each of them in an LPC-based speech analysis-synthesis system. The three PDAs are based on the weighted autocorrelation function (WACF), the Empirical Mode Decomposition based autocorrelation function (EMD-ACF), and the Empirical Mode Decomposition based average magnitude difference function (EMD-AMDF). Speech quality measurement is essential for ensuring and maintaining quality of service in speech processing applications such as modern telecommunication. The methods are therefore compared through the quality of the output speech, using an objective measure (the perceptual evaluation of speech quality test) and subjective assessments (the Mean Opinion Score test, the diagnostic rhyme test, and inspection of the synthesized speech waveforms). The results show that the EMD-ACF and EMD-AMDF based PDAs yield better speech quality than the WACF based PDA. The work presented in this paper is beneficial to the telecommunication and speech recognition research communities.
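To make the lag-domain functions named in the abstract concrete, the following is a minimal sketch of pitch estimation on a single frame using the autocorrelation function (ACF), the average magnitude difference function (AMDF), and a weighted ACF. It is an illustration under stated assumptions, not the authors' implementation: the weighting is assumed to be the commonly used form ACF divided by (AMDF + eps), and the function names, the search range `fmin`/`fmax`, and the constant `eps` are hypothetical choices. The Empirical Mode Decomposition step that precedes the ACF and AMDF in the EMD-based variants is not shown.

```python
import math


def acf(frame, lag):
    """Short-time (unnormalised) autocorrelation of the frame at one lag."""
    n = len(frame)
    return sum(frame[i] * frame[i + lag] for i in range(n - lag))


def amdf(frame, lag):
    """Average magnitude difference of the frame at one lag."""
    n = len(frame)
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag)) / (n - lag)


def estimate_pitch(frame, fs, method="wacf", fmin=60.0, fmax=400.0, eps=1.0):
    """Return a pitch estimate in Hz for one voiced frame.

    method: "acf"  -> lag that maximises the ACF
            "amdf" -> lag that minimises the AMDF
            "wacf" -> lag that maximises ACF / (AMDF + eps)  (assumed weighting)
    fmin, fmax and eps are illustrative defaults, not values from the paper.
    """
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(frame) - 1)
    lags = range(lag_min, lag_max + 1)

    if method == "acf":
        best = max(lags, key=lambda k: acf(frame, k))
    elif method == "amdf":
        best = min(lags, key=lambda k: amdf(frame, k))
    elif method == "wacf":
        best = max(lags, key=lambda k: acf(frame, k) / (amdf(frame, k) + eps))
    else:
        raise ValueError("unknown method: " + method)
    return fs / best


# Synthetic test frame: 40 ms at 8 kHz, 200 Hz fundamental plus one harmonic.
FS = 8000
frame = [
    math.sin(2 * math.pi * 200 * i / FS) + 0.5 * math.sin(2 * math.pi * 400 * i / FS)
    for i in range(320)
]
pitch_acf = estimate_pitch(frame, FS, "acf")
pitch_amdf = estimate_pitch(frame, FS, "amdf")
pitch_wacf = estimate_pitch(frame, FS, "wacf")
```

On a perfectly periodic frame such as this one, the AMDF is near zero at every multiple of the pitch period, so the plain AMDF minimum can land on a multiple of the true period. Dividing the ACF by the AMDF sharpens the peak at the true period, which is the usual motivation for the weighted form.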





Cite this article
Kumar, S., Singh, S., Agarwal, P. et al. Speech quality evaluation for different pitch detection algorithms in LPC speech analysis–synthesis system. Int J Speech Technol 24, 545–551 (2021). https://doi.org/10.1007/s10772-020-09765-0
DOI: https://doi.org/10.1007/s10772-020-09765-0