
Classification of Silent Speech in English and Bengali Languages Using Stacked Autoencoder

Original Research · SN Computer Science

Abstract

The purpose of a brain–computer interface (BCI) is to enhance or restore function for disabled people, and BCIs have accordingly been used in a variety of applications, such as prosthesis control and identification of mental state. One application aimed at providing disabled individuals with a means of communication is the recognition of silent speech (also known as imagined speech). Silent speech is speech that originates in the brain but is never vocalized. The proposed work classifies silent speech from brain activity recorded with an electroencephalogram (EEG). EEG data were collected from 45 subjects while they imagined the English vowels /a/, /e/, /i/, /o/, and /u/ without vocalization, and from 22 subjects who imagined the five Bengali vowels /আ/, /ই/, /উ/, /এ/, and /ও/, likewise without vocalization. The selected Bengali vowels have pronunciations similar to those of the English vowels. Various temporal and spectral features were extracted from the EEG recordings and classified using a stacked autoencoder (SAE). The SAE achieved accuracies of 75.56% and 73.6% in classifying silent speech in the English and Bengali languages, respectively, and outperformed conventional methods such as the common spatial pattern (CSP) and the support vector machine (SVM).
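The sketch below illustrates the two-stage training scheme conventionally used for stacked autoencoders, applied to a classification task like the one described above: each encoder layer is first pretrained without labels to reconstruct its input, then the stacked encoders are topped with a classification layer and fine-tuned with the vowel labels. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation; the feature dimensionality, layer sizes, epoch counts, and learning rates are illustrative.

import torch
import torch.nn as nn

N_FEATURES = 64   # assumed length of the temporal + spectral feature vector per trial
N_CLASSES = 5     # five imagined vowels

def pretrain_autoencoder(data, in_dim, hidden_dim, epochs=50, lr=1e-3):
    """Greedily train one autoencoder layer to reconstruct its input."""
    encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Sigmoid())
    decoder = nn.Linear(hidden_dim, in_dim)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(decoder(encoder(data)), data)
        loss.backward()
        opt.step()
    # Return the trained encoder and its hidden codes, which feed the next layer
    return encoder, encoder(data).detach()

# Toy data standing in for per-trial EEG feature vectors (one row per trial)
x = torch.randn(200, N_FEATURES)
y = torch.randint(0, N_CLASSES, (200,))

# Stage 1: unsupervised, layer-wise pretraining of two encoder layers
enc1, h1 = pretrain_autoencoder(x, N_FEATURES, 32)
enc2, _ = pretrain_autoencoder(h1, 32, 16)

# Stage 2: stack the encoders, add a classification layer, fine-tune with labels
model = nn.Sequential(enc1, enc2, nn.Linear(16, N_CLASSES))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()  # applies softmax to the logits internally
for _ in range(100):
    opt.zero_grad()
    loss = ce(model(x), y)
    loss.backward()
    opt.step()

pred = model(x).argmax(dim=1)  # predicted vowel class per trial

The unsupervised pretraining stage lets the network learn a compact representation of the EEG features before any labels are used, which is the usual motivation for preferring an SAE over training the same network from random initialization.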




Funding

This research work was carried out under the CRS project entitled "Monitoring stress in students using EEG" (CRS application ID: 1-5770264050). The project was funded by NPIU (through TEQIP III, NPIU, MHRD, Govt. of India).

Author information


Corresponding author

Correspondence to Rajdeep Ghosh.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Statement

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional ethics committee.

Informed Consent

Informed consent was obtained from all participating human subjects.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Ghosh, R., Sinha, N. & Phadikar, S. Classification of Silent Speech in English and Bengali Languages Using Stacked Autoencoder. SN COMPUT. SCI. 3, 389 (2022). https://doi.org/10.1007/s42979-022-01274-y

