Toward Improving the Performance of Epoch Extraction from Telephonic Speech

Gurugubelli, Krishna; Javid, Mohammad Hashim; Alluri, K. N. R. K. Raju; Vuppala, Anil Kumar

doi:10.1007/s00034-020-01551-2

Short Paper
Published: 23 September 2020

Volume 40, pages 2050–2064, (2021)

Circuits, Systems, and Signal Processing

Krishna Gurugubelli ORCID: orcid.org/0000-0002-7658-065X¹,
Mohammad Hashim Javid¹,
K. N. R. K. Raju Alluri¹ &
…
Anil Kumar Vuppala¹

Abstract

An epoch is the abrupt glottal closure event within a glottal cycle at which significant excitation of the vocal-tract system occurs during the production of voiced speech. The state-of-the-art zero frequency filtering technique is a simple and efficient method that extracts epochs robustly from clean speech. However, it performs poorly on telephone-quality speech because spurious zero crossings in the epoch evidence lead to a high false alarm rate. Recently, the zero-phase zero frequency resonator (ZP-ZFR) was proposed as an alternative to the zero frequency filter that provides a stable implementation of the zero frequency filtering technique. In this study, a higher-order ZP-ZFR is investigated to improve the performance of zero frequency filtering for epoch extraction from telephonic speech. The proposed ZP-ZFR method is evaluated quantitatively on telephonic speech simulated from six standard databases that have simultaneous electroglottograph recordings as ground truth. Experimental results suggest that the proposed method performs significantly better than state-of-the-art methods in terms of identification rate and false alarm rate.
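
For orientation, the conventional zero frequency filtering pipeline that the abstract takes as its baseline can be sketched as follows. This is a minimal sketch of the baseline technique only, not the higher-order ZP-ZFR method proposed in this study. The function name `zff_epochs`, the 8 ms average pitch period, and the three trend-removal passes are illustrative assumptions rather than values taken from the article.

```python
import numpy as np


def zff_epochs(speech, fs, avg_pitch_ms=8.0, n_trend_passes=3):
    """Conventional zero frequency filtering (not the ZP-ZFR variant).

    Returns (epoch_indices, zff_signal). `avg_pitch_ms` and
    `n_trend_passes` are illustrative defaults, not values from the paper.
    """
    s = np.asarray(speech, dtype=np.float64)

    # 1. Difference the signal to remove any slowly varying bias.
    x = np.diff(s, prepend=s[0])

    # 2. Cascade of two ideal zero-frequency resonators. Each resonator
    #    y[n] = 2*y[n-1] - y[n-2] + x[n] is a double integrator, so the
    #    cascade is equivalent to four cumulative sums.
    y = x
    for _ in range(4):
        y = np.cumsum(y)

    # 3. Remove the polynomial trend by repeatedly subtracting a local mean
    #    taken over a window of about one average pitch period.
    half = max(1, int(round(avg_pitch_ms * 1e-3 * fs / 2)))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(n_trend_passes):
        y = y - np.convolve(y, kernel, mode="same")

    # 4. Epoch candidates are the negative-to-positive zero crossings.
    epochs = np.flatnonzero((y[:-1] < 0) & (y[1:] >= 0)) + 1
    return epochs, y
```

On band-limited, noisy telephone speech the trend-removed signal can cross zero more than once per glottal cycle, and each extra crossing is counted as a false alarm, which is the failure mode the abstract describes. The unbounded growth of the cumulative sums is also why a stable implementation is of interest. Samples within a few window lengths of either end of the signal are affected by the zero padding of the moving average and should be ignored.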

Availability of data

The current study used publicly available datasets for the analysis. The datasets are available in the APLAWDW repository: http://www.commsp.ee.ic.ac.uk/sap/resources/aplawdw/ and the CMU Arctic repository: http://www.festvox.org/cmu_arctic/index.html.

Notes

The APLAWDW database is available at https://www.commsp.ee.ic.ac.uk/~sap/resources/aplawdw/.
The COVAREP toolbox is an open-source repository of advanced speech processing algorithms; it can be obtained from https://github.com/covarep/covarep.git.

Acknowledgements

The authors would like to thank the anonymous reviewers and the Editor-in-Chief, M. N. S. Swamy, for their support and constructive criticism, which helped improve the quality of this article.

Author information

Authors and Affiliations

Speech Processing Laboratory, LTRC, KCIS, International Institute of Information Technology, Hyderabad, 500032, India
Krishna Gurugubelli, Mohammad Hashim Javid, K. N. R. K. Raju Alluri & Anil Kumar Vuppala

Corresponding author

Correspondence to Krishna Gurugubelli.

About this article

Cite this article

Gurugubelli, K., Javid, M.H., Alluri, K.N.R.K.R. et al. Toward Improving the Performance of Epoch Extraction from Telephonic Speech. Circuits Syst Signal Process 40, 2050–2064 (2021). https://doi.org/10.1007/s00034-020-01551-2

Received: 07 February 2020
Revised: 10 September 2020
Accepted: 13 September 2020
Published: 23 September 2020
Issue Date: April 2021
DOI: https://doi.org/10.1007/s00034-020-01551-2
