Abstract
The purpose of a brain–computer interface (BCI) is to augment or restore function for people with disabilities, and BCIs have accordingly been applied to tasks such as prosthesis control and identification of mental state. One application aimed at providing a means of communication for disabled individuals is the recognition of silent speech (also known as imagined speech): speech that originates in the brain of an individual but is never vocalized. The proposed work is concerned with classifying silent speech from brain activity recorded using an electroencephalogram (EEG). EEG data were collected from 45 subjects while they imagined the English vowels /a/, /e/, /i/, /o/, and /u/ without vocalization, and from 22 subjects who imagined the five Bengali vowels /আ/, /ই/, /উ/, /এ/, and /ও/ without vocalization; the selected Bengali vowels are pronounced similarly to the English ones. Various temporal and spectral features were extracted from the EEG recordings and then classified using a stacked autoencoder (SAE). The SAE achieved accuracies of 75.56% and 73.6% in classifying silent speech in the English and Bengali languages, respectively. Moreover, the proposed SAE was observed to outperform conventional methods such as common spatial pattern (CSP) and support vector machine (SVM) classifiers.
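The classification pipeline described above can be illustrated with a minimal sketch of greedy layer-wise SAE pre-training. This is not the authors' implementation: the layer sizes, learning rate, and the synthetic stand-in for the extracted temporal/spectral feature vectors are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.5, epochs=200):
    """Train one tied-weight sigmoid autoencoder layer by gradient descent.

    Returns the encoder weights/bias and the per-epoch reconstruction loss.
    """
    n_samples, n_in = X.shape
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b = np.zeros(n_hidden)   # encoder bias
    c = np.zeros(n_in)       # decoder bias
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W + b)            # encode
        R = sigmoid(H @ W.T + c)          # decode with tied weights
        err = R - X
        losses.append(0.5 * np.mean(err ** 2))
        dR = err * R * (1.0 - R)          # gradient at decoder pre-activation
        dHp = (dR @ W) * H * (1.0 - H)    # gradient at encoder pre-activation
        W -= lr * (X.T @ dHp + dR.T @ H) / n_samples
        b -= lr * dHp.mean(axis=0)
        c -= lr * dR.mean(axis=0)
    return W, b, losses

# Hypothetical stand-in for temporal/spectral feature vectors from EEG trials.
X = rng.random((100, 16))

# Greedy layer-wise pre-training: each layer learns to reconstruct the
# representation produced by the layer below it.
W1, b1, losses1 = train_autoencoder(X, 8)
H1 = sigmoid(X @ W1 + b1)
W2, b2, losses2 = train_autoencoder(H1, 4)
H2 = sigmoid(H1 @ W2 + b2)  # compact code that a final classifier layer would consume
```

In a full SAE classifier, a softmax (or similar) output layer would be placed on top of the stacked encoders and the whole network fine-tuned on the labeled vowel classes.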
Funding
This research work was done under the CRS project entitled "Monitoring Stress in Students Using EEG" (CRS application ID: 1-5770264050). The project is funded by NPIU (through TEQIP III, NPIU, MHRD, Govt. of India).
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical Statement
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional ethics committee.
Informed Consent
Informed consent was obtained from all participating human subjects.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Ghosh, R., Sinha, N. & Phadikar, S. Classification of Silent Speech in English and Bengali Languages Using Stacked Autoencoder. SN COMPUT. SCI. 3, 389 (2022). https://doi.org/10.1007/s42979-022-01274-y