
Delving into Temporal-Spectral Connections in Spike-LFP Decoding by Transformer Networks

  • Conference paper
  • Human Brain and Artificial Intelligence (HBAI 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1692)


Abstract

Invasive brain-computer interfaces (iBCIs) have demonstrated great potential in neural function restoration by decoding intention from brain signals for external device control. Spike trains and local field potentials (LFPs) are two typical intracortical neural signals with good complementarity from time and frequency domains. However, existing studies mostly focused on a single type of signal, and the interaction between the two signals has not been well studied. This study proposes a temporal-spectral transformer network (TSNet) to model the temporal (with spikes), spectral (with LFPs), and mutual (with both signals) connections in spike-LFPs towards robust neural decoding. Experiments with clinical neural signals demonstrate that the attention-based connection model enables the dynamic temporal-spectral compensation in spike and LFP signals, which improves the robustness against temporal shifts and noises in neural decoding.


References

  1. Abbaspourazad, H., Hsieh, H.L., Shanechi, M.M.: A multiscale dynamical modeling and identification framework for spike-field activity. IEEE Trans. Neural Syst. Rehabil. Eng. 27(6), 1128–1138 (2019)


  2. Andrew, G., Arora, R., Bilmes, J., Livescu, K.: Deep canonical correlation analysis. In: International Conference on Machine Learning, pp. 1247–1255. PMLR (2013)


  3. Bansal, A.K., Truccolo, W., Vargas-Irwin, C.E., Donoghue, J.P.: Decoding 3D reach and grasp from hybrid signals in motor and premotor cortices: spikes, multiunit activity, and local field potentials. J. Neurophysiol. 107(5), 1337–1355 (2012)


  4. Chapin, J.K., Moxon, K.A., Markowitz, R.S., Nicolelis, M.A.: Real-time control of a robot arm using simultaneously recorded neurons in the motor cortex. Nat. Neurosci. 2(7), 664–670 (1999)


  5. Collinger, J.L., et al.: High-performance neuroprosthetic control by an individual with tetraplegia. Lancet 381(9866), 557–564 (2013)


  6. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

  7. Gilja, V., et al.: A high-performance neural prosthesis enabled by control algorithm design. Nat. Neurosci. 15(12), 1752–1757 (2012)


  8. Gilja, V., et al.: Clinical translation of a high-performance neural prosthesis. Nat. Med. 21(10), 1142 (2015)


  9. Hochberg, L.R., et al.: Reach and grasp by people with tetraplegia using a neurally controlled robotic arm. Nature 485(7398), 372–375 (2012)


  10. Jackson, A., Hall, T.M.: Decoding local field potentials for neural interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 25(10), 1705–1714 (2016)


  11. Li, Y., Qi, Y., Wang, Y., Wang, Y., Xu, K., Pan, G.: Robust neural decoding by kernel regression with Siamese representation learning. J. Neural Eng. 18(5), 056062 (2021)


  12. Liu, Q., et al.: Efficient representations of EEG signals for SSVEP frequency recognition based on deep multiset CCA. Neurocomputing 378, 36–44 (2020)


  13. Lu, J., Batra, D., Parikh, D., Lee, S.: ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. arXiv preprint arXiv:1908.02265 (2019)

  14. Pandarinath, C., et al.: High performance communication by people with paralysis using an intracortical brain-computer interface. Elife 6, e18554 (2017)


  15. Qi, Y., et al.: Dynamic ensemble Bayesian filter for robust control of a human brain-machine interface. IEEE Trans. Biomed. Eng., 1–11 (2022). https://doi.org/10.1109/TBME.2022.3182588

  16. Rickert, J., de Oliveira, S.C., Vaadia, E., Aertsen, A., Rotter, S., Mehring, C.: Encoding of movement direction in different frequency ranges of motor cortical local field potentials. J. Neurosci. 25(39), 8815–8824 (2005)


  17. Serruya, M.D., Hatsopoulos, N.G., Paninski, L., Fellows, M.R., Donoghue, J.P.: Instant neural control of a movement signal. Nature 416(6877), 141–142 (2002)


  18. Shi, Z., Chen, X., Zhao, C., He, H., Stuphorn, V., Wu, D.: Multi-view broad learning system for primate oculomotor decision decoding. IEEE Trans. Neural Syst. Rehabil. Eng. 28(9), 1908–1920 (2020)


  19. So, K., Dangi, S., Orsborn, A.L., Gastpar, M.C., Carmena, J.M.: Subject-specific modulation of local field potential spectral power during brain-machine interface control in primates. J. Neural Eng. 11(2), 026002 (2014)


  20. Taylor, D.M., Tillery, S.I.H., Schwartz, A.B.: Direct cortical control of 3D neuroprosthetic devices. Science 296(5574), 1829–1832 (2002)


  21. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)


  22. Wang, W., Arora, R., Livescu, K., Bilmes, J.: On deep multi-view representation learning. In: International Conference on Machine Learning, pp. 1083–1092. PMLR (2015)


  23. Wang, Y., Lin, K., Qi, Y., Lian, Q., Feng, S., Wu, Z., Pan, G.: Estimating brain connectivity with varying-length time lags using a recurrent neural network. IEEE Trans. Biomed. Eng. 65(9), 1953–1963 (2018)


  24. Willett, F.R., et al.: Hand knob area of premotor cortex represents the whole body in a compositional way. Cell 181(2), 396–409 (2020)



Acknowledgements

The authors would like to thank Kedi Xu, Junming Zhu and Jianmin Zhang for their support in the clinical experiments. This work was supported in part by grants from the National Key R&D Program of China (2018YFA0701400), the Key R&D Program of Zhejiang (2022C03011), the Natural Science Foundation of China (61906166, 61925603), the Fundamental Research Funds for the Central Universities, and the Starry Night Science Fund of Zhejiang University Shanghai Institute for Advanced Study (SN-ZJU-SIAS-002).

Author information

Corresponding author: Yueming Wang.

Appendices

A Detailed Settings of Neural Decoders

We compare the performance of different neural decoders, including SVM, MLP, LSTM, CCA, DCCA, and ridge regression. The detailed model settings are as follows:

  • SVM: We use the one-vs-rest (ovr) strategy for multi-class classification, and the regularization parameter C is chosen from \(\left\{ 10^{-6}, 10^{-4}, \cdots , 10^{6}\right\} \). Spikes and LFPs are flattened into vectors as model input. For the fusion of spikes and LFPs, we concatenate the two signals into one vector and feed it to the SVM.

  • MLP: We use a one-layer MLP whose hidden size is selected from \(\left\{ 50, 100, 200, 300, 400, 500\right\} \). For fusion, we train two individual models on spikes and LFPs, respectively, and concatenate their representations as the input of a one-layer MLP with the hidden size selected from \(\left\{ 50, 100, 200\right\} \).

  • LSTM: We use a one-layer LSTM whose hidden size is selected from \(\left\{ 50, 100, 200, 300, 400, 500\right\} \) to extract the sequence representation, taking the last hidden state as the final representation. The fusion strategy is the same as for the MLP.

  • CCA: We select the linear projection dimension from \(\{10, 20, 30, 40, 50, 100, 200\}\) and use an ECOC-SVM classifier as in [22].

  • DCCA: We use a one-layer neural network whose hidden size is selected from \(\left\{ 50, 100, 200, 300, 400, 500, 800, 1000\right\} \); the linear projection dimension and classifier are the same as for CCA.

  • Ridge regression: The regularization strength \(\alpha \) is chosen from \(\left\{ 10^{-6}, 10^{-4}, \cdots , 10^{6}\right\} \).

Ten-fold cross-validation is used for parameter selection and performance evaluation. To train the MLP and LSTM, we set the batch size to 5 and use Adam as the optimizer with a learning rate of \({10^{-3}}\) and a weight decay of \({10^{-4}}\). The loss function is cross-entropy. Early stopping is applied to avoid overfitting.
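As an illustration, the hyperparameter-selection procedure for the fused SVM baseline can be sketched as below. This is a minimal sketch with synthetic stand-in data (the clinical recordings and their feature dimensions are not public, so the shapes here are assumptions), using scikit-learn's grid search with 10-fold cross-validation over the C grid stated above.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-ins for flattened spike and LFP feature vectors
# (hypothetical shapes; not the paper's clinical data).
rng = np.random.default_rng(0)
n_trials, n_classes = 100, 4
spikes = rng.random((n_trials, 30))   # flattened spike features
lfps = rng.random((n_trials, 50))     # flattened LFP features
labels = rng.integers(0, n_classes, n_trials)

# Fusion baseline: concatenate the two signals into one feature vector.
X = np.concatenate([spikes, lfps], axis=1)

# One-vs-rest SVM with C searched over {1e-6, 1e-4, ..., 1e6},
# selected by 10-fold cross-validation as described in the appendix.
grid = GridSearchCV(
    SVC(decision_function_shape="ovr"),
    param_grid={"C": [10.0 ** k for k in range(-6, 7, 2)]},
    cv=10,
)
grid.fit(X, labels)
print(grid.best_params_["C"])
```

The same grid-search pattern applies to the other decoders by swapping in the estimator and the hidden-size or projection-dimension grids listed above.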

B Estimating Movement Conduction Durations With Neuron Responses

We use neurons’ responses to estimate the true durations of movement intention conduction. We first select the neurons that respond strongly to specific motor tasks, using the criterion of large response variation across the ‘Go’ period. Fig. 4(a) shows two example electrodes that tune to “MouthOpen” and “RightToeTip”, respectively. We then collect the trial samples for the selected channel and task and apply min-max normalization to transform the responses into values between 0 and 1, shown in Fig. 4(b) in the corresponding color. We mark the time at which the firing rate first exceeds 0.5 as the start of motor intention. We then check the firing rate at 800 ms after the start: if it is below 0.5, we mark that time as the end; otherwise, we search forward for the first time the firing rate falls below 0.5 and mark it as the end. The black dashed lines in Fig. 4(b) show the start and end marks in a trial. The connection weights are plotted along with the firing rate using the blue line.
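The thresholding rule above can be sketched as follows. This is a minimal sketch on a toy trace; the bin width `dt_ms` and the function name are assumptions for illustration, since the paper does not state the binning used.

```python
import numpy as np

def estimate_duration(firing_rate, dt_ms=50.0, lookahead_ms=800.0):
    """Estimate movement-intention start/end bins from a min-max normalized
    firing-rate trace (values in [0, 1]), following the 0.5-threshold rule.
    `dt_ms` is a hypothetical bin width; the paper does not specify it."""
    rate = np.asarray(firing_rate, dtype=float)
    above = np.flatnonzero(rate > 0.5)
    if above.size == 0:
        return None  # no clear intention onset in this trial
    start = above[0]
    # Check the firing rate 800 ms after the detected start.
    probe = min(start + int(round(lookahead_ms / dt_ms)), rate.size - 1)
    if rate[probe] < 0.5:
        end = probe
    else:
        # Otherwise search forward for the first bin below 0.5.
        later = np.flatnonzero(rate[probe:] < 0.5)
        end = probe + later[0] if later.size else rate.size - 1
    return start, end

# Toy trace: 40 bins of 50 ms with a sustained response in the middle.
trace = np.concatenate([np.zeros(5), np.ones(20), np.zeros(15)])
print(estimate_duration(trace))
```

On this toy trace the start is the first bin above threshold and the end is found by the forward search, since the response is still high 800 ms after onset.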

Fig. 6. Performance evaluation with Gaussian noise in spikes. (a) Dynamic connections adapt to noise. The first row illustrates spike signals with Gaussian masks. The second and third rows are the temporal (self-attention) and spectral-to-temporal (cross-attention) connection weights under different proportions of noise (solid lines) compared with the weights without noise (dashed lines). (b) The average connection weights across trials. (c) Accuracy of different methods with Gaussian noise. (d) Accuracy descent ratio of different methods with Gaussian noise.

C Robustness to Gaussian Noise

Here, we analyze the effect of masking a proportion of the spike signals with Gaussian noise. In Fig. 6(a–b), we illustrate how the connection weights in the temporal-spectral representations adapt under different proportions of Gaussian-noise masking. Similar to the results with signal loss using zero-masks, both the temporal (self-attention) and spectral-to-temporal (cross-attention) connection weights adjust dynamically to cope with changes in the signals, improving the robustness and accuracy of neural decoding.
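A sketch of the masking procedure: a random proportion of time bins in each trial is replaced with Gaussian noise. The function name, tensor layout, and noise scale (zero mean, unit variance) are assumptions for illustration, as the paper does not specify them.

```python
import numpy as np

def mask_with_gaussian(spikes, proportion, rng=None):
    """Replace a random proportion of time bins in each trial with
    Gaussian noise. `spikes` is a (trials, bins, channels) array;
    the unit-variance noise scale is an assumption."""
    rng = np.random.default_rng(rng)
    noisy = np.array(spikes, dtype=float, copy=True)
    n_trials, n_bins, n_channels = noisy.shape
    n_mask = int(round(proportion * n_bins))
    for t in range(n_trials):
        # Choose distinct bins to corrupt in this trial.
        idx = rng.choice(n_bins, size=n_mask, replace=False)
        noisy[t, idx, :] = rng.normal(0.0, 1.0, size=(n_mask, n_channels))
    return noisy

# Mask 40% of the time bins in a toy spike-count tensor.
spikes = np.zeros((2, 10, 3))
noisy = mask_with_gaussian(spikes, 0.4, rng=0)
```

With a zero input tensor, the masked bins are exactly those with nonzero values, which makes the 40% proportion easy to verify per trial.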

We evaluate the performance of TSNet under different noise conditions in spikes. The results, shown in Fig. 6(c) and Fig. 6(d), are consistent with the signal loss conditions. First, the fusion of signals improves robustness against noise: at a loss proportion of 40%, using both signals reduces the accuracy descent relative to using spikes alone by 13.2%, 12.2%, 23.2%, and 16.2% for SVM, MLP, LSTM, and TSNet, respectively. Second, TSNet achieves the best accuracy with high stability in spike-LFP fusion. As shown in Fig. 6(c), TSNet achieves 77% accuracy at a 40% loss proportion, outperforming LSTM and MLP by 6.8% and 4.0%, respectively. TSNet also achieves the lowest accuracy descent ratio (Fig. 6(d)), demonstrating our method’s robustness.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Sun, H., Qi, Y., Wang, Y. (2023). Delving into Temporal-Spectral Connections in Spike-LFP Decoding by Transformer Networks. In: Ying, X. (eds) Human Brain and Artificial Intelligence. HBAI 2022. Communications in Computer and Information Science, vol 1692. Springer, Singapore. https://doi.org/10.1007/978-981-19-8222-4_2


  • DOI: https://doi.org/10.1007/978-981-19-8222-4_2


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8221-7

  • Online ISBN: 978-981-19-8222-4

