Deep Genetic Algorithm-Based Voice Pathology Diagnostic System

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11608)

Abstract

Automatic voice pathology diagnosis is widely investigated by the research community. Most recently proposed solutions combine robust feature descriptors with machine learning algorithms. Despite their success, it is difficult in practice to design handcrafted features that are optimal for a specific classification task. Deep learning approaches, particularly deep Convolutional Neural Networks (CNNs), have achieved significant breakthroughs in recognition tasks. In this study, the deep CNN, which has mainly been explored for image recognition, is applied to speech recognition. An approach is proposed for voice pathology recognition that uses both a deep CNN and a Genetic Algorithm (GA). The CNN weights are initialized using the solutions produced by the GA, which minimizes the classification error and increases the ability to discriminate the voice pathology. Moreover, three popular deep CNN architectures that have been investigated in the literature for image recognition, namely AlexNet, VGG16, and ResNet34, are adapted for voice pathology diagnosis. The performance of the hybrid CNN-GA algorithm is compared with that of the conventional CNN and with other approaches based on hybridizing a deep CNN with meta-heuristic methods. Experimental results show that the proposed method improves voice pathology classification accuracy by 5.4% over the basic CNN and by up to 4.27% over other meta-heuristic-based algorithms. The proposed approach also outperforms state-of-the-art works on the same dataset, with an overall accuracy of 99.37%.
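
The idea of letting a GA search for initial weights that minimize classification error can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not specify the GA operators, the encoding, or the fitness details, so everything below is an assumption made for illustration. A linear threshold unit on synthetic two-feature data stands in for the CNN, and the function names (`make_toy_data`, `classification_error`, `evolve_initial_weights`), truncation selection, one-point crossover, and Gaussian mutation are all hypothetical choices.

```python
import random


def make_toy_data(n=200, seed=1):
    """Synthetic two-feature samples (assumption: stands in for voice features)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        label = 1 if x[0] + x[1] > 0 else 0
        samples.append((x, label))
    return samples


def classification_error(weights, samples):
    """Fraction of samples misclassified by a linear threshold unit.

    The last entry of `weights` is the bias. In a CNN setting the flat
    vector would instead be reshaped into the network's layer weights.
    """
    *w, b = weights
    wrong = 0
    for features, label in samples:
        score = sum(wi * xi for wi, xi in zip(w, features)) + b
        predicted = 1 if score > 0 else 0
        wrong += predicted != label
    return wrong / len(samples)


def evolve_initial_weights(samples, n_weights, pop_size=30, generations=40,
                           mutation_rate=0.1, seed=0):
    """Evolve a weight vector whose fitness is the classification error."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=lambda ind: classification_error(ind, samples))
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied gene by gene.
            child = [g + rng.gauss(0, 0.3) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return min(population, key=lambda ind: classification_error(ind, samples))


data = make_toy_data()
best = evolve_initial_weights(data, n_weights=3)  # two weights plus a bias
print("error of GA-selected initial weights:", classification_error(best, data))
```

In a hybrid scheme of the kind the abstract describes, the vector returned by the GA would serve only as the starting point: it would be loaded into the network and then refined by ordinary gradient-based training.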



Author information

Corresponding author

Correspondence to Rania M. Ghoniem.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ghoniem, R.M. (2019). Deep Genetic Algorithm-Based Voice Pathology Diagnostic System. In: Métais, E., Meziane, F., Vadera, S., Sugumaran, V., Saraee, M. (eds) Natural Language Processing and Information Systems. NLDB 2019. Lecture Notes in Computer Science, vol 11608. Springer, Cham. https://doi.org/10.1007/978-3-030-23281-8_18

  • DOI: https://doi.org/10.1007/978-3-030-23281-8_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23280-1

  • Online ISBN: 978-3-030-23281-8

  • eBook Packages: Computer Science, Computer Science (R0)
