Wavelet sub-band features for voice disorder detection and classification

Gidaye, Girish; Nirmal, Jagannath; Ezzine, Kadria; Frikha, Mondher

doi:10.1007/s11042-020-09424-1

Published: 04 August 2020

Volume 79, pages 28499–28523, (2020)

Multimedia Tools and Applications

Girish Gidaye^1,2,3, Jagannath Nirmal^3, Kadria Ezzine^4 & Mondher Frikha^4


Abstract

Acoustic analysis of the speech signal enables non-intrusive, affordable, unbiased and fast assessment of voice pathologies. This assessment provides complementary information to the otolaryngologist for preliminary diagnosis of a pathological larynx. Several voice impairment assessment systems based on acoustic analysis have been introduced in recent years. Nevertheless, these systems are tested using only one or two datasets and are not independent of database and human bias. In this paper, a unified wavelet-based framework is proposed for evaluating voice disorders, which is independent of database and human bias. The stationary wavelet transform (SWT) is used to decompose the speech signal, since it offers good time and frequency localization. Energy and statistical features are extracted from each sub-band after multilevel decomposition. The higher the decomposition level, the higher the dimension of the feature vector. To reduce the dimension of the feature vector, an information gain (IG) based feature selection technique is used to select the most relevant features and discard redundant ones. The reduced feature vector is evaluated using support vector machine (SVM), stochastic gradient descent (SGD) and artificial neural network (ANN) classifiers. Recordings of the vowel /a/, vocalized at natural pitch by both healthy and pathological subjects, are drawn from German, English, Arabic and Spanish speech databases. In the first phase of experiments, the input speech signal is classified as healthy or pathological. The second phase classifies input speech samples as healthy, cyst, paralysis or polyp. Experimental results demonstrate that the extracted energy and statistical features can serve as cues for voice disorder evaluation. The most important aspect of the proposed method is that the features are independent of the fundamental frequency. The detection and classification rates attained are comparable to other state-of-the-art approaches.
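
The sub-band feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the mother wavelet, the number of levels, or the exact statistical features, so the Haar filter, the 1/2 normalization, the three-level depth, the mean and standard deviation, and the synthetic test signal are all assumptions made for illustration. The function names (haar_swt, subband_features, feature_vector) are hypothetical.

```python
import math
import statistics


def haar_swt(signal, levels):
    """Undecimated (stationary) Haar decomposition with periodic extension.

    Returns [D1, ..., DL, AL]; every sub-band keeps the length of the input.
    Assumption: Haar filter with a 1/2 normalization, chosen for simplicity.
    """
    n = len(signal)
    approx = list(signal)
    bands = []
    for level in range(levels):
        shift = 2 ** level  # dilate the filter instead of downsampling the signal
        shifted = [approx[(i + shift) % n] for i in range(n)]
        detail = [(a - b) / 2 for a, b in zip(approx, shifted)]
        approx = [(a + b) / 2 for a, b in zip(approx, shifted)]
        bands.append(detail)
    bands.append(approx)
    return bands


def subband_features(band):
    """Energy plus two simple statistics (assumed choices) for one sub-band."""
    return [
        sum(v * v for v in band),   # energy
        statistics.fmean(band),     # mean
        statistics.pstdev(band),    # standard deviation
    ]


def feature_vector(signal, levels):
    """Concatenate the features of all sub-bands into one vector."""
    vector = []
    for band in haar_swt(signal, levels):
        vector.extend(subband_features(band))
    return vector


# Synthetic stand-in for a sustained vowel: two sinusoids sampled at 8 kHz.
fs = 8000
signal = [
    math.sin(2 * math.pi * 150 * t / fs) + 0.3 * math.sin(2 * math.pi * 1200 * t / fs)
    for t in range(1024)
]

bands = haar_swt(signal, 3)
vec = feature_vector(signal, 3)
print(len(bands), len(vec))  # 4 sub-bands, 12 features
```

With three features per sub-band, an L-level decomposition yields 3(L + 1) values, which shows why the feature dimension grows with the decomposition level and why a selection step such as information gain becomes useful.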



Author information

Authors and Affiliations

Vidyalankar Institute of Technology, Research scholar at K. J. Somaiya College of Engineering, Mumbai, India
Girish Gidaye
Department of Electronics Engineering, Vidyalankar Institute of Technology, Mumbai, Maharashtra, India
Girish Gidaye
K. J. Somaiya College of Engineering, Mumbai, Maharashtra, India
Girish Gidaye & Jagannath Nirmal
ATISP, ENET’COM, Sfax University, Sfax, Tunisia
Kadria Ezzine & Mondher Frikha

Corresponding author

Correspondence to Girish Gidaye.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gidaye, G., Nirmal, J., Ezzine, K. et al. Wavelet sub-band features for voice disorder detection and classification. Multimed Tools Appl 79, 28499–28523 (2020). https://doi.org/10.1007/s11042-020-09424-1

Received: 06 December 2019
Revised: 29 June 2020
Accepted: 21 July 2020
Published: 04 August 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s11042-020-09424-1
