Glottal opening instants detection using zero frequency resonator

International Journal of Speech Technology

Abstract

The zero frequency resonator (ZFR) was proposed earlier for extracting glottal closure instants (GCIs) (Murty and Yegnanarayana 2008). The output of the ZFR is an exponentially growing/decaying signal, and its trend can be removed to obtain the resolution needed to detect the relevant information. With a trend-removal window of typically 1–2 pitch periods, the trend-removed signal mainly exhibits information related to GCIs. This work proposes two methods for detecting glottal opening instants (GOIs) using the ZFR. In the first method, the trend-removal window is reduced (to, say, 0.33 \(\times \) the pitch period), and the possibility of hypothesizing GOIs is demonstrated. In the second method, the window size remains in the range of 1–2 pitch periods, but the input to the ZFR is modified to remove GCI information. The proposed methods are evaluated on the CMU-Arctic database and compared with existing methods for GOI detection. GOI detection performance is comparable to that of GCI detection and to that of existing methods.
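The ZFR pipeline summarized above can be illustrated with a minimal sketch. It follows the general zero frequency filtering recipe of Murty and Yegnanarayana (2008) as commonly described: difference the input, pass it through two cascaded resonators with poles at zero frequency, then subtract a local mean to remove the trend. It is not the authors' implementation. The function names, the three trend-removal passes, the edge handling, and the synthetic impulse-train input are all assumptions made for illustration. The second proposed method is not sketched, because the abstract does not say how the ZFR input is modified.

```python
import numpy as np


def zfr_trend_removed(x, fs, pitch_period_s, window_factor=1.5, passes=3):
    """Hypothetical helper: ZFR output after local-mean trend removal.

    window_factor scales the trend-removal window relative to the pitch
    period (1-2 for the GCI setting, about 0.33 for the reduced window).
    """
    x = np.asarray(x, dtype=np.float64)
    # Difference the input to suppress any slowly varying offset.
    d = np.diff(x, prepend=x[0])
    # Each resonator y[n] = x[n] + 2*y[n-1] - y[n-2] equals two running sums,
    # so a cascade of two resonators equals four running sums.
    y = d
    for _ in range(4):
        y = np.cumsum(y)
    # Symmetric local-mean window of 2*half + 1 samples.
    half = max(1, int(round(window_factor * pitch_period_s * fs / 2)))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(passes):
        y = y - np.convolve(y, kernel, mode="same")
    # Zero-padded convolution corrupts passes*half samples at each end,
    # so those samples are blanked (an assumption of this sketch).
    edge = passes * half
    if edge > 0:
        y[:edge] = 0.0
        y[-edge:] = 0.0
    return y


def negative_to_positive_crossings(y):
    """Indices n where y[n-1] < 0 and y[n] > 0 (exact zeros are ignored)."""
    return np.flatnonzero((y[:-1] < 0) & (y[1:] > 0)) + 1


# Synthetic stand-in for voiced speech: one impulse per 10 ms at 8 kHz.
fs = 8000
period = 80
x = np.zeros(fs)
x[40::period] = 1.0

# Window of 1.5 pitch periods: the setting associated with GCIs.
y_gci = zfr_trend_removed(x, fs, period / fs, window_factor=1.5)
epochs = negative_to_positive_crossings(y_gci)

# Reduced window of 0.33 pitch periods, as in the first proposed method.
y_short = zfr_trend_removed(x, fs, period / fs, window_factor=0.33)
```

With the wider window, the negative-to-positive zero crossings recur once per pitch period; Murty and Yegnanarayana (2008) associate such crossings with GCIs, assuming the appropriate signal polarity. The shorter window leaves more within-period detail in the trend-removed signal, which is the property the first proposed method relies on. How GOIs are then picked from that signal is not specified in the abstract and is left out here.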

References

  • Alku, P. (1992). Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering. Speech Communication, 11(2–3), 109–118, Eurospeech ’91. http://www.sciencedirect.com/science/article/pii/016763939290005R.

  • Ananthapadmanabha, T. V., & Yegnanarayana, B. (1975). Epoch extraction of voiced speech. IEEE Transactions on Acoustics, Speech and Signal Processing, 23(6), 562–570.

  • Atal, B. S., & Hanauer, S. L. (1971). Speech analysis and synthesis by linear prediction of the speech wave. The Journal of the Acoustical Society of America, 50, 637–655.

  • Bouzid, A., & Ellouze, N. (2004). Glottal opening instant detection from speech signal. In Proceedings Eusipco.

  • Brookes, M. (2013). http://www.ee.ic.ac.uk/hp/staff/dmb/voicemail/voicebox.html.

  • Childers, D. G., & Krishnamurthy, A. K. (1985). A critical review of electroglottography. CRC Critical Reviews in Biomedical Engineering, 12, 131–161.

  • Childers, D. G., & Lee, C. K. (1991). Vocal quality factors: Analysis, synthesis, and perception. The Journal of the Acoustical Society of America, 90, 2394–2410.

  • Cohen, L. (1995). Time-frequency analysis: Theory and applications. Englewood Cliffs: Prentice-Hall.

  • Deepak, K. T., Ramesh, K., Adiga, N., & Prasanna, S. R. M. (2015). Speech and EGG polarity detection using Hilbert envelope. Submitted to TENCON (pp. 1–5).

  • Drugman, T. (2013). http://tcts.fpms.ac.be/~drugman/.

  • Drugman, T., & Dutoit, T. (2009). Glottal closure and opening instant from speech signals. In Proceedings Interspeech.

  • Ellouze, N., & Bouzid, A. (2007). Open quotient measurements based on multiscale product of speech signal wavelet transform. Research Letter in Signal Processing, 7, 1–5.

  • Govind, D., Prasanna, S. R. M., & Pati, D. (2011). Epoch extraction in high pass filtered speech using Hilbert envelope. In Proceedings Interspeech.

  • Henrich, N., Doval, B., & d’Alessandro, C. (1999). Glottal open quotient estimation using linear prediction. In Proceedings of the workshop on models and analysis of vocal emissions for biomedical applications.

  • Kominek, J., & Black, A. W. (2004). The CMU ARCTIC speech databases. In Proceedings of the 5th ISCA speech synthesis workshop (pp. 223–224). http://festvox.org/cmu_arctic/index.html.

  • Murty, K. S. R., & Yegnanarayana, B. (2008). Epoch extraction from speech signals. IEEE Transactions on Audio, Speech and Language Processing, 16(8), 1602–1614.

  • Narendra, N., & Rao, K. S. (2015). Robust voicing detection and F0 estimation for HMM based speech synthesis. Circuits, Systems, and Signal Processing, 34, 1–23.

  • Naylor, P. A., Kounoudes, A., Gudnason, J., & Brookes, M. (2007). Estimation of glottal closure instants in voiced speech using DYPSA algorithm. IEEE Transactions on Audio, Speech and Language Processing, 15(1), 34–43.

  • Prasanna, S. R. M., Govind, D., Rao, K. S., & Yegnanarayana, B. (2010). Fast prosody modification using instants of significant excitation. In Proceedings Speech Prosody.

  • Quatieri, T. F. (2004). Discrete-time speech signal processing. Delhi: Pearson Education.

  • Ramesh, K., Prasanna, S. R. M., & Govind, D. (2013). Detection of glottal opening instants using Hilbert envelope. In Proceedings Interspeech, Lyon.

  • Rao, K. S., Prasanna, S. R. M., & Yegnanarayana, B. (2007). Determination of instants of significant excitation in speech using Hilbert envelope and group delay function. IEEE Signal Processing Letters, 14, 762–765.

  • Smits, R., & Yegnanarayana, B. (1995a). Determination of instants of significant excitation in speech using group delay function. IEEE Transactions on Acoustics, Speech and Signal Processing, 4, 325–333.

  • Smits, R., & Yegnanarayana, B. (1995b). Determination of instants of significant excitation in speech using group delay function. IEEE Transactions on Speech and Audio Processing, 3(5), 325–333.

  • Thomas, M. R. P., Gudnason, J., & Naylor, P. A. (2012). Estimation of glottal closing and opening instants in voiced speech using the YAGA algorithm. IEEE Transactions on Audio, Speech, and Language Processing, 20(1), 82–91.

  • Thomas, M. R. P., & Naylor, P. A. (2009). The SIGMA algorithm: A glottal activity detector for electroglottographic signals. IEEE Transactions on Audio, Speech, and Language Processing, 17(8), 1557–1566.

  • Yegnanarayana, B., & Murty, K. S. R. (2009). Event-based instantaneous fundamental frequency estimation from speech signals. IEEE Transactions on Audio, Speech and Language Processing, 17(4), 614–625.

Author information

Corresponding author

Correspondence to K. Ramesh.

Cite this article

Ramesh, K., Prasanna, S.R.M. Glottal opening instants detection using zero frequency resonator. Int J Speech Technol 20, 127–141 (2017). https://doi.org/10.1007/s10772-016-9383-z

  • DOI: https://doi.org/10.1007/s10772-016-9383-z
