Adaptive neuro-fuzzy inference system for evaluating dysarthric automatic speech recognition (ASR) systems: a case study on MVML-based ASR

Asemi, Adeleh; Salim, Siti Salwah Binti; Shahamiri, Seyed Reza; Asemi, Asefeh; Houshangi, Narjes

doi:10.1007/s00500-018-3013-4

Methodologies and Application
Published: 01 February 2018

Volume 23, pages 3529–3544, (2019)
Soft Computing

Adeleh Asemi ORCID: orcid.org/0000-0002-9193-2430¹,
Siti Salwah Binti Salim²,
Seyed Reza Shahamiri³,
Asefeh Asemi⁴ &
…
Narjes Houshangi⁵

747 Accesses
12 Citations

Abstract

Owing to improvements in dysarthric automatic speech recognition (ASR) over the last few decades, the demand for assessment and evaluation of such technologies has increased significantly. Evaluation methods for ASR systems are now required to consider multiple qualitative and quantitative metrics. In this study, exploratory factor analysis is conducted to classify the evaluation metrics applied by researchers. Metrics with a high Pearson correlation coefficient (\(r > .9\)) are placed in the same group, reducing the number of metrics from 23 to six main metrics. Artificial neural networks (ANNs) do not require any internal knowledge of system parameters and provide solutions for multi-variable problems while delivering fast calculations; hence, they can be used as an alternative to analytical approaches based on the obtained evaluation metrics. Here, the adaptive neuro-fuzzy inference system (ANFIS) is employed for ASR performance evaluation; it applies an ANN to estimate the membership function parameters of the fuzzy inference system (FIS). The proposed algorithm was implemented in MATLAB and used to measure the performance of two dysarthric ASR systems based on the MVML and MVSL active learning theories. The assessment results presented in this paper show the effectiveness of the developed method.
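
The grouping step mentioned in the abstract (metrics with a Pearson correlation of \(r > .9\) placed in the same group) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation or their factor analysis procedure: the metric names, the toy values, the function names and the choice to merge groups transitively are all assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def group_metrics(scores, threshold=0.9):
    """Merge metrics whose pairwise correlation exceeds the threshold.

    `scores` maps a metric name to its values over the same set of systems.
    Merging is transitive (union-find); that is an assumption of this sketch,
    not something stated in the abstract.
    """
    parent = {name: name for name in scores}

    def find(name):
        # Follow parent links to the group representative (path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(scores, 2):
        if pearson(scores[a], scores[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in scores:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical toy data: four metrics measured on five systems.
toy_scores = {
    "word_accuracy": [0.60, 0.70, 0.80, 0.85, 0.90],
    "sentence_accuracy": [0.50, 0.61, 0.70, 0.76, 0.80],
    "response_time": [2.0, 1.0, 3.0, 1.5, 2.5],
    "satisfaction": [3.0, 4.0, 2.0, 5.0, 1.0],
}
groups = group_metrics(toy_scores)
```

In this toy data only the two accuracy metrics correlate above the threshold, so they collapse into one group while the other two metrics stay separate. The comparison uses the signed coefficient, following the \(r > .9\) condition as written; strongly negative correlations are therefore not merged.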

Acknowledgements

This paper was funded by University of Malaya Research Grant (UMRG), Project No. RP003B-13ICT and UM High Impact Research Grant UM-MOHE UM.C/HIR/MOHE/FCSIT/05 from the Ministry of Higher Education, Malaysia.

Author information

Authors and Affiliations

Department of IT and Computer Engineering, Safahan Institute of Higher Education, Isfahan, Iran
Adeleh Asemi
Department of Software Engineering, Faculty of Computer Science and Information Technology, University of Malaya, 50603, Lembah Pantai, Kuala Lumpur, Malaysia
Siti Salwah Binti Salim
Faculty of Business and Information Technology, Manukau Institute of Technology, MIT Manukau, Cnr of Manukau Station Rd Davies Ave, Manukau, Private Bag 94006, Manukau, Auckland, 2241, New Zealand
Seyed Reza Shahamiri
Department of Knowledge and Information Science, University of Isfahan, Isfahan, Iran
Asefeh Asemi
Department of Occupational Therapy, Arak University of Medical Science, Arak, Iran
Narjes Houshangi

Corresponding author

Correspondence to Adeleh Asemi.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Asemi, A., Salim, S.S.B., Shahamiri, S.R. et al. Adaptive neuro-fuzzy inference system for evaluating dysarthric automatic speech recognition (ASR) systems: a case study on MVML-based ASR. Soft Comput 23, 3529–3544 (2019). https://doi.org/10.1007/s00500-018-3013-4

Issue Date: 01 May 2019
DOI: https://doi.org/10.1007/s00500-018-3013-4
