Accurate Estimation of Glottal Closure Instants and Glottal Opening Instants from Electroglottographic Signal Using Variational Mode Decomposition

Circuits, Systems, and Signal Processing

Abstract

The objective of the proposed work is to accurately estimate the glottal closure instants (GCIs) and glottal opening instants (GOIs) from electroglottographic (EGG) signals, and to address the shortcomings of existing EGG-based GCI/GOI detection methods. GCIs are the instants at which the excitation of the vocal tract is maximum, whereas the excitation at GOIs is minimal in comparison. Both instants recur at the fundamental frequency, which is defined for each glottal cycle of a given EGG signal. Accurate detection of these instants from the EGG signal is essential because they are used to evaluate the performance of GCI and GOI estimation directly from the speech signal. This work proposes a new method for accurate detection of GCIs and GOIs from the EGG signal using the variational mode decomposition (VMD) algorithm. The EGG signal is decomposed into sub-signals (modes) using VMD. It is shown that one of the VMD modes has a center frequency close to the fundamental frequency of the EGG signal. This property allows GCIs and GOIs to be estimated from that mode. In addition, the instantaneous pitch frequency is estimated from the obtained GCIs. The proposed method has been evaluated on the CMU-arctic database for GCI/GOI estimation and on the Keele pitch extraction reference database for instantaneous pitch frequency estimation. Comparison with state-of-the-art methods confirms its effectiveness: experimental results show that the proposed method achieves a higher accuracy and identification rate.
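The last step mentioned in the abstract, estimating instantaneous pitch frequency from the detected GCIs, can be illustrated with a minimal sketch. It assumes the common convention that the pitch of one glottal cycle is the reciprocal of the interval between two consecutive GCIs; the function name `instantaneous_pitch` and the choice to time-stamp each value at the later GCI are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def instantaneous_pitch(gci_times_s):
    """Estimate one pitch value per glottal cycle from GCI times in seconds.

    Assumption (not from the paper): pitch for a cycle is 1 / (interval
    between consecutive GCIs), reported at the time of the later GCI.
    """
    gci = np.asarray(gci_times_s, dtype=float)
    if gci.ndim != 1 or gci.size < 2:
        raise ValueError("need a 1-D sequence with at least two GCI times")

    periods_s = np.diff(gci)  # duration of each glottal cycle
    if np.any(periods_s <= 0):
        raise ValueError("GCI times must be strictly increasing")

    f0_hz = 1.0 / periods_s   # reciprocal of the cycle duration
    return gci[1:], f0_hz


if __name__ == "__main__":
    # Hypothetical GCI locations: two 10 ms cycles followed by a 12.5 ms cycle.
    times_s, f0_hz = instantaneous_pitch([0.010, 0.020, 0.030, 0.0425])
    for t, f in zip(times_s, f0_hz):
        print(f"t = {t:.4f} s, f0 = {f:.1f} Hz")
```

In this sketch the hypothetical input yields pitch values of roughly 100 Hz, 100 Hz and 80 Hz. Intervals that span an unvoiced gap would produce spuriously low values, so in practice the function would likely be applied to each voiced segment separately.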

Acknowledgements

We gratefully acknowledge the generous funding provided by Amrita University. The authors would like to thank Dr. K. P. Soman and Ms. M. Neethu for their help in understanding the VMD algorithm. The authors also thank Mr. M. A. Huckvale for providing the Speech Filing System toolbox, Mr. M. Brookes for providing easy access to the VOICEBOX toolbox, and Mr. F. Plante and Mr. J. Kominek for the EGG reference databases used.

Author information

Corresponding author

Correspondence to E. A. Gopalakrishnan.

About this article

Cite this article

Lal, G.J., Gopalakrishnan, E.A. & Govind, D. Accurate Estimation of Glottal Closure Instants and Glottal Opening Instants from Electroglottographic Signal Using Variational Mode Decomposition. Circuits Syst Signal Process 37, 810–830 (2018). https://doi.org/10.1007/s00034-017-0582-x

  • DOI: https://doi.org/10.1007/s00034-017-0582-x
