Adaptive neuro-fuzzy inference system for evaluating dysarthric automatic speech recognition (ASR) systems: a case study on MVML-based ASR

  • Methodologies and Application
  • Soft Computing

Abstract

Owing to improvements in dysarthric automatic speech recognition (ASR) over the last few decades, the demand for assessing and evaluating such technologies has increased significantly. Evaluation methods for ASR systems are now required to consider multiple qualitative and quantitative metrics. In this study, exploratory factor analysis is conducted to classify the evaluation metrics applied by researchers. Metrics with a high Pearson correlation coefficient (\(r > .9\)) are placed in the same group, which reduces the number of metrics from 23 to six main metrics. Artificial neural networks (ANNs) do not require any internal knowledge of system parameters and provide solutions for multi-variable problems while delivering fast calculations; hence, they can be used as an alternative to analytical approaches based on the obtained evaluation metrics. Here, the adaptive neuro-fuzzy inference system (ANFIS) was employed for ASR performance evaluation; it applies an ANN to estimate the membership function parameters of the fuzzy inference system (FIS). The proposed algorithm was implemented in MATLAB and used to measure the performance of two dysarthric ASR systems based on the MVML and MVSL active learning theories. The assessment results presented in this paper show the effectiveness of the developed method.
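
The grouping step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the paper uses exploratory factor analysis and a MATLAB implementation, whereas this sketch simply links any two metrics whose pairwise Pearson coefficient exceeds the stated threshold (\(r > .9\)) and returns the resulting connected groups. The function name `group_correlated_metrics`, the metric names, and the score values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def group_correlated_metrics(scores, names, threshold=0.9):
    """Group metrics whose pairwise Pearson r exceeds `threshold`.

    scores: array-like of shape (n_observations, n_metrics).
    names:  list of n_metrics metric names (hypothetical labels).
    Returns a sorted list of groups, each a sorted list of names.
    """
    scores = np.asarray(scores, dtype=float)
    # Pearson correlation matrix between columns (metrics).
    r = np.corrcoef(scores, rowvar=False)
    n = len(names)

    # Union-find: metrics linked by r > threshold end up in one group.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if r[i, j] > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(names[i])
    return sorted(sorted(g) for g in groups.values())


# Illustrative data only: six observations of five made-up metrics.
a = np.array([1, 2, 3, 4, 5, 6], dtype=float)
b = np.array([3, 1, 4, 1, 5, 2], dtype=float)
c = np.array([5, 3, 6, 2, 1, 4], dtype=float)
demo_names = ["wer", "cer", "latency", "response_time", "satisfaction"]
demo_scores = np.column_stack([a, 2 * a + 1, b, 3 * b, c])

if __name__ == "__main__":
    for group in group_correlated_metrics(demo_scores, demo_names):
        print(group)
```

Linking metrics transitively is a design choice of this sketch: if A correlates with B and B with C, all three land in one group even when A and C fall below the threshold. A factor-analytic grouping, as used in the study, may assign metrics differently.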

Acknowledgements

This paper was funded by University of Malaya Research Grant (UMRG), Project No. RP003B-13ICT and UM High Impact Research Grant UM-MOHE UM.C/HIR/MOHE/FCSIT/05 from the Ministry of Higher Education, Malaysia.

Author information

Corresponding author

Correspondence to Adeleh Asemi.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Asemi, A., Salim, S.S.B., Shahamiri, S.R. et al. Adaptive neuro-fuzzy inference system for evaluating dysarthric automatic speech recognition (ASR) systems: a case study on MVML-based ASR. Soft Comput 23, 3529–3544 (2019). https://doi.org/10.1007/s00500-018-3013-4

  • DOI: https://doi.org/10.1007/s00500-018-3013-4
