Abstract
For the last two decades, brain-computer interface (BCI) research has worked towards practical and useful applications for communication and control. Yet many BCI communication approaches suffer from unnatural interaction or time-consuming user training. Because continuous speech is a very natural way to communicate, it has been a long-standing question whether BCIs can be developed that perform speech recognition from cortical activity. Imagined speech as a BCI paradigm for locked-in patients would mean a large improvement in communication speed and usability, without the need for cumbersome letter-by-letter spelling. We showed for the first time that automatic speech recognition from neural signals is possible. Here, we evaluate the feasibility of speech recognition from neural signals using only temporal offsets associated with speech production and omitting information from speech perception. This analysis provides first insights into the potential use of imagined speech processes for speech recognition, for which no perceptive activity is present.
Copyright information
© 2017 The Author(s)
About this chapter
Cite this chapter
Herff, C., de Pesters, A., Heger, D., Brunner, P., Schalk, G., Schultz, T. (2017). Towards Continuous Speech Recognition for BCI. In: Guger, C., Allison, B., Ushiba, J. (eds) Brain-Computer Interface Research. SpringerBriefs in Electrical and Computer Engineering. Springer, Cham. https://doi.org/10.1007/978-3-319-57132-4_3
DOI: https://doi.org/10.1007/978-3-319-57132-4_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57131-7
Online ISBN: 978-3-319-57132-4
eBook Packages: Computer Science, Computer Science (R0)