Abstract
The purpose of a brain–computer interface (BCI) is to augment or restore function for people with disabilities, and BCIs have accordingly been applied to tasks such as prosthesis control and identification of mental state. One application aimed at providing a means of communication for disabled individuals is the recognition of silent speech (also known as imagined speech): speech that originates in the brain of an individual but is never vocalized. The proposed work is concerned with classifying silent speech from brain activity recorded using an electroencephalogram (EEG). EEG data were collected from 45 subjects while they imagined the English vowels /a/, /e/, /i/, /o/, and /u/ without vocalization, and from 22 subjects who imagined the five Bengali vowels /আ/, /ই/, /উ/, /এ/, and /ও/ without vocalization; the selected Bengali vowels are pronounced similarly to the English ones. Various temporal and spectral features were extracted from the EEG recordings and then classified using a stacked autoencoder (SAE). The SAE achieved accuracies of 75.56% and 73.6% in classifying silent speech in the English and Bengali languages, respectively. Moreover, the proposed SAE was observed to outperform conventional methods such as common spatial pattern (CSP) and support vector machine (SVM) classifiers.
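The classification pipeline described above can be illustrated with a minimal sketch of greedy layer-wise SAE pre-training. This is not the authors' implementation: the layer sizes, learning rate, and the synthetic stand-in for the extracted temporal/spectral feature vectors are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.5, epochs=200):
    """Train one tied-weight sigmoid autoencoder layer by gradient descent.

    Returns the encoder weights/bias and the per-epoch reconstruction loss.
    """
    n_samples, n_in = X.shape
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b = np.zeros(n_hidden)   # encoder bias
    c = np.zeros(n_in)       # decoder bias
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W + b)            # encode
        R = sigmoid(H @ W.T + c)          # decode with tied weights
        err = R - X
        losses.append(0.5 * np.mean(err ** 2))
        dR = err * R * (1.0 - R)          # gradient at decoder pre-activation
        dHp = (dR @ W) * H * (1.0 - H)    # gradient at encoder pre-activation
        W -= lr * (X.T @ dHp + dR.T @ H) / n_samples
        b -= lr * dHp.mean(axis=0)
        c -= lr * dR.mean(axis=0)
    return W, b, losses

# Hypothetical stand-in for temporal/spectral feature vectors from EEG trials.
X = rng.random((100, 16))

# Greedy layer-wise pre-training: each layer learns to reconstruct the
# representation produced by the layer below it.
W1, b1, losses1 = train_autoencoder(X, 8)
H1 = sigmoid(X @ W1 + b1)
W2, b2, losses2 = train_autoencoder(H1, 4)
H2 = sigmoid(H1 @ W2 + b2)  # compact code that a final classifier layer would consume
```

In a full SAE classifier, a softmax (or similar) output layer would be placed on top of the stacked encoders and the whole network fine-tuned on the labeled vowel classes.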
Funding
This research work was done under the CRS project entitled "Monitoring Stress in Students Using EEG" (CRS application ID: 1-5770264050). The project is funded by NPIU (through TEQIP III, NPIU, MHRD, Govt. of India).
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical Statement
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional ethics committee.
Informed Consent
Informed consent was obtained from all participating human subjects.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Ghosh, R., Sinha, N. & Phadikar, S. Classification of Silent Speech in English and Bengali Languages Using Stacked Autoencoder. SN COMPUT. SCI. 3, 389 (2022). https://doi.org/10.1007/s42979-022-01274-y