Abstract
Speech signals convey a speaker’s neurodevelopmental state along with phonological information. Recognizing a speech disorder by analyzing speech is essential for human–machine interaction. This paper proposes a subject-independent speech recognition system for neurodevelopmental disorders that combines voice features extracted with a MATLAB toolbox, spectral characteristics, and feature selection algorithms. Feature selection is widely applied to overcome the challenges of high dimensionality. This work presents a novel particle swarm optimization (PSO) based algorithm for feature selection. The experiments were conducted on a speech database of children with intellectual disability and age-matched typically developing children, and reliability was validated using 10-fold cross-validation. The database consists of 141 speech features extracted from linear predictive coding (LPC) based cepstral parameters and Mel-frequency cepstral coefficients (MFCC). Three classification models were applied with the PSO feature selection algorithm, obtaining recognition accuracies of 90.30% with an artificial neural network (ANN), 98.00% with a support vector machine (SVM), and 91.00% with random forest. The results support the usefulness of the proposed multivariate feature selection algorithm compared with the filter approach.
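The abstract does not specify the PSO variant, fitness function, or hyperparameters, so the following is only a minimal sketch of a generic wrapper-style binary PSO for feature selection, not the authors' algorithm. The names `cv_accuracy` and `binary_pso_select`, the sigmoid transfer rule, the inertia and acceleration constants, and the nearest-centroid classifier (a lightweight stand-in for the ANN, SVM, and random forest models) are all assumptions made for illustration. Fitness is the mean 10-fold cross-validation accuracy on the selected feature subset.

```python
import numpy as np

def cv_accuracy(X, y, mask, folds=10, seed=0):
    """Mean k-fold accuracy of a nearest-centroid classifier on masked features.
    (Assumed stand-in classifier; the paper uses ANN, SVM and random forest.)"""
    if not mask.any():
        return 0.0  # an empty feature subset is treated as worthless
    Xs = X[:, mask]
    idx = np.random.default_rng(seed).permutation(len(y))
    scores = []
    for test in np.array_split(idx, folds):
        train = np.setdiff1d(idx, test)
        classes = np.unique(y[train])
        centroids = np.stack([Xs[train][y[train] == c].mean(axis=0) for c in classes])
        # squared Euclidean distance from each test sample to each class centroid
        dists = ((Xs[test][:, None, :] - centroids[None]) ** 2).sum(axis=2)
        scores.append(np.mean(classes[dists.argmin(axis=1)] == y[test]))
    return float(np.mean(scores))

def binary_pso_select(X, y, n_particles=12, n_iter=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO with a sigmoid transfer function (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = (rng.random((n_particles, d)) < 0.5).astype(float)  # 1.0 = feature kept
    vel = rng.uniform(-1.0, 1.0, (n_particles, d))
    pbest = pos.copy()
    pbest_fit = np.array([cv_accuracy(X, y, p.astype(bool)) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, d))
        # velocity update: inertia + pull toward personal and global bests
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4.0, 4.0)
        # sigmoid of velocity gives the probability that a bit is set to 1
        pos = (rng.random((n_particles, d)) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
        fit = np.array([cv_accuracy(X, y, p.astype(bool)) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), float(gbest_fit)
```

Calling `binary_pso_select(X, y)` on a feature matrix `X` (samples by features, for example the 141 LPC and MFCC features) and a label vector `y` returns a boolean mask of retained features and the cross-validated accuracy of that subset. Because the classifier is evaluated inside the search, this is a wrapper method, in contrast to the filter approach the abstract compares against.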
Cite this article
Aggarwal, G., Monga, R. & Gochhayat, S.P. A Novel Hybrid PSO Assisted Optimization for Classification of Intellectual Disability Using Speech Signal. Wireless Pers Commun 113, 1955–1971 (2020). https://doi.org/10.1007/s11277-020-07301-6
DOI: https://doi.org/10.1007/s11277-020-07301-6