Deep Genetic Algorithm-Based Voice Pathology Diagnostic System

Ghoniem, Rania M.

doi:10.1007/978-3-030-23281-8_18

Conference paper
First Online: 21 June 2019

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11608)

Abstract

Automatic voice pathology diagnosis is widely investigated by the research community. Most solutions recently proposed in the literature combine robust feature descriptors with machine learning algorithms. Despite their success, it is difficult in practice to design handcrafted features that are optimal for a specific classification task. Deep learning approaches, particularly deep Convolutional Neural Networks (CNNs), have recently achieved significant breakthroughs in recognition tasks. In this study, the deep CNN, which has mainly been explored for image recognition, is applied to speech recognition. An approach is proposed for voice pathology recognition using both a deep CNN and a Genetic Algorithm (GA). The CNN weights are initialized using the solutions produced by the GA, which minimizes the classification error and increases the ability to discriminate voice pathologies. Moreover, three popular deep CNN architectures that have been investigated in the literature for image recognition are adapted for voice pathology diagnosis, namely AlexNet, VGG16, and ResNet34. The performance of the hybrid CNN-GA algorithm is compared with that of the conventional CNN and with other approaches based on hybridizing a deep CNN with meta-heuristic methods. Experimental results show that the proposed method improves voice pathology classification accuracy by 5.4% over the basic CNN and by up to 4.27% over the other meta-heuristic-based algorithms. The proposed approach also outperforms state-of-the-art works on the same dataset, with an overall accuracy of 99.37%.
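
The idea of letting a GA choose the initial weights of a classifier can be illustrated with a minimal sketch. This is not the paper's implementation: a single linear threshold unit stands in for the CNN, the data are synthetic, and every name and parameter here (`classification_error`, `ga_initial_weights`, the population size, the mutation settings) is a hypothetical choice made for illustration. The fitness is the classification error, which the GA tries to minimize, and the best individual would then serve as the starting point for ordinary gradient-based training.

```python
import random


def classification_error(weights, samples, labels):
    """Fraction of samples misclassified by a linear threshold unit.

    The last entry of `weights` is the bias term.
    """
    errors = 0
    for x, y in zip(samples, labels):
        score = sum(w * xi for w, xi in zip(weights[:-1], x)) + weights[-1]
        predicted = 1 if score > 0 else 0
        errors += predicted != y
    return errors / len(samples)


def ga_initial_weights(samples, labels, n_weights, pop_size=30,
                       generations=40, mutation_rate=0.1, seed=0):
    """Search for low-error initial weights with a simple elitist GA."""
    rng = random.Random(seed)

    def fitness(individual):
        return classification_error(individual, samples, labels)

    # Each individual is one candidate weight vector.
    population = [[rng.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged.
        population.sort(key=fitness)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied independently to each gene.
            child = [w + rng.gauss(0, 0.3) if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        population = parents + children
    return min(population, key=fitness)


# Synthetic two-feature data: class 1 when the features sum to a positive value.
data_rng = random.Random(1)
samples = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(100)]
labels = [1 if x0 + x1 > 0 else 0 for x0, x1 in samples]

best = ga_initial_weights(samples, labels, n_weights=3)
print("error of GA-selected initial weights:",
      classification_error(best, samples, labels))
```

Because the better half of each generation is carried over unchanged, the best error in the population never increases from one generation to the next. In a real network the weight vector would be far larger, so the cost of evaluating the fitness of each individual becomes the main design constraint.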

Author information

Authors and Affiliations

Department of Computer, Mansoura University, Mansoura, Egypt
Rania M. Ghoniem
Department of Information Technology, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Kingdom of Saudi Arabia
Rania M. Ghoniem

Corresponding author

Correspondence to Rania M. Ghoniem.

Editor information

Editors and Affiliations

Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais
University of Salford, Salford, UK
Farid Meziane
University of Salford, Salford, UK
Sunil Vadera
Oakland University, Rochester, MI, USA
Vijayan Sugumaran
CSE, University of Salford, Salford, UK
Mohamad Saraee

About this paper

Cite this paper

Ghoniem, R.M. (2019). Deep Genetic Algorithm-Based Voice Pathology Diagnostic System. In: Métais, E., Meziane, F., Vadera, S., Sugumaran, V., Saraee, M. (eds) Natural Language Processing and Information Systems. NLDB 2019. Lecture Notes in Computer Science, vol 11608. Springer, Cham. https://doi.org/10.1007/978-3-030-23281-8_18

DOI: https://doi.org/10.1007/978-3-030-23281-8_18
Published: 21 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23280-1
Online ISBN: 978-3-030-23281-8
eBook Packages: Computer Science, Computer Science (R0)
