Towards Continuous Speech Recognition for BCI

Herff, Christian; de Pesters, Adriana; Heger, Dominic; Brunner, Peter; Schalk, Gerwin; Schultz, Tanja

doi:10.1007/978-3-319-57132-4_3


Part of the book series: SpringerBriefs in Electrical and Computer Engineering (BRIEFSELECTRIC)

Abstract

For the last two decades, brain-computer interface (BCI) research has worked towards practical and useful applications for communication and control. Yet many BCI communication approaches suffer from unnatural interaction or time-consuming user training. Because continuous speech is a very natural way to communicate, it has been a long-standing question whether BCIs can be developed that perform speech recognition from cortical activity. Imagined speech as a BCI paradigm for locked-in patients would mean a large improvement in communication speed and usability, without the need for cumbersome letter-by-letter spelling. We showed for the first time that automatic speech recognition from neural signals is possible. Here, we evaluate the feasibility of speech recognition from neural signals using only temporal offsets associated with speech production and omitting information from speech perception. This analysis provides first insights into the potential use of imagined speech processes for speech recognition, for which no perceptive activity is present.
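The idea of restricting recognition to production-related temporal offsets can be illustrated with a minimal sketch. It is not the chapter's implementation: the function name `stack_offsets`, the frame-by-channel feature layout, and the specific offsets are hypothetical. The sketch assumes that neural activity related to production precedes the acoustic output (offsets at or before the current frame), while activity related to perception follows it (positive offsets), so keeping only non-positive offsets omits perceptive information.

```python
import numpy as np

def stack_offsets(features, offsets):
    """Concatenate, for each frame t, the feature rows at t + o for every o in offsets.

    features: array of shape (n_frames, n_channels), one row per time frame.
    offsets:  iterable of integer frame offsets (negative = earlier neural activity).
    Rows that would fall outside the recording are left as zeros.
    """
    n_frames, n_channels = features.shape
    offsets = list(offsets)
    stacked = np.zeros((n_frames, n_channels * len(offsets)), dtype=features.dtype)
    for i, o in enumerate(offsets):
        # Source rows t + o must lie inside [0, n_frames).
        src_start = max(0, o)
        src_end = min(n_frames, n_frames + o)
        # Destination rows are the frames t that receive those source rows.
        dst_start = src_start - o
        dst_end = src_end - o
        cols = slice(i * n_channels, (i + 1) * n_channels)
        stacked[dst_start:dst_end, cols] = features[src_start:src_end]
    return stacked

# Toy data: 5 frames, 2 channels (values are placeholders, not real recordings).
features = np.arange(10, dtype=float).reshape(5, 2)

# Assumed production-only context: the current frame and the two frames before it.
production_offsets = [-2, -1, 0]
stacked = stack_offsets(features, production_offsets)
print(stacked.shape)
```

A perception-inclusive variant would simply add positive offsets such as `[1, 2]` to the list; comparing the two feature sets is one way to probe how much a recognizer depends on perceptive activity.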

Author information

Authors and Affiliations

Cognitive Systems Lab, University of Bremen (formerly at Karlsruhe Institute of Technology), Enrique-Schmidt-Str. 5, 28359, Bremen, Germany
Christian Herff, Dominic Heger & Tanja Schultz
New York State Department of Health, National Resource Center for Adaptive Neurotechnologies, Wadsworth Center, Albany, USA
Adriana de Pesters, Peter Brunner & Gerwin Schalk
Department of Neurology, Albany Medical College, Albany, USA
Peter Brunner & Gerwin Schalk

Corresponding author

Correspondence to Christian Herff.

Editor information

Editors and Affiliations

g.tec Guger Technologies OG, 4521 Schiedlberg, Austria
Christoph Guger
g.tec Guger Technologies OG, 4521 Schiedlberg, Austria
Brendan Allison
Biosciences and Informatics, Keio University, 223-8522 Yokohama, Kanagawa, Japan
Junichi Ushiba

About this chapter

Cite this chapter

Herff, C., de Pesters, A., Heger, D., Brunner, P., Schalk, G., Schultz, T. (2017). Towards Continuous Speech Recognition for BCI. In: Guger, C., Allison, B., Ushiba, J. (eds) Brain-Computer Interface Research. SpringerBriefs in Electrical and Computer Engineering. Springer, Cham. https://doi.org/10.1007/978-3-319-57132-4_3

DOI: https://doi.org/10.1007/978-3-319-57132-4_3
Published: 30 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57131-7
Online ISBN: 978-3-319-57132-4
eBook Packages: Computer Science, Computer Science (R0)
