Abstract
We present a simple deep learning-based framework commonly used in computer vision and demonstrate its effectiveness for cross-dataset transfer learning in mental imagery decoding tasks that are common in the field of Brain-Computer Interfaces (BCI). We use this framework to characterize the compatibility for transfer learning between twelve motor-imagery datasets.
Challenges. Deep learning models typically require long training times and are data-hungry, which impedes their use in BCI systems that must minimize the recording time for (training) examples and are subject to constraints induced by experiments involving human subjects. A solution to both issues is transfer learning, but it comes with its own challenge: substantial data distribution shifts between datasets, between subjects, and even between subsequent sessions of the same subject.
Approach. For every pair of pre-training (donor) and test (receiver) datasets, we first train a model on the donor dataset and then train merely an additional new linear classification layer based on a few receiver trials. The performance of this transfer approach is then tested on the remaining trials of the receiver dataset. We compile these results in a table characterizing the compatibility between the different datasets.
Significance. Our characterization of the compatibility between datasets can be used as a reference for future researchers to make informed donor choices. To strengthen this claim, we present an applied example of such usage. Finally, we lower the threshold for using transfer learning between motor imagery datasets: the overall framework is extremely simple and nevertheless obtains decent classification scores.
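As an illustration of this two-step scheme, consider the minimal PyTorch sketch below. The function name, its signature, and the attribute name `classifier` are illustrative assumptions for an EEGNet-like donor model, not the identifiers used in our actual implementation.

```python
from torch import nn

def attach_receiver_head(pretrained: nn.Module, embedding_dim: int,
                         n_classes_receiver: int) -> nn.Module:
    # Freeze all parameters of the donor-trained feature extractor.
    for param in pretrained.parameters():
        param.requires_grad = False
    # Replace the donor classification layer with a fresh linear layer for
    # the receiver classes; its parameters are trainable by default, so an
    # optimizer over `pretrained.classifier.parameters()` trains only the
    # new head on the few receiver calibration trials.
    pretrained.classifier = nn.Linear(embedding_dim, n_classes_receiver)
    return pretrained
```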
Notes
1. Source code: https://gitlab.com/PierreGtch/motor_embedding_benchmark.
2. Pre-trained models and results: https://huggingface.co/PierreGtch/EEGNetv4.
3. Pre-trained models: https://huggingface.co/PierreGtch/EEGNetv4.
4. For a comprehensive understanding of the state-of-the-art, the reader is advised to consult the MOABB benchmark [48], in conjunction with its forthcoming 2024 update.
5.
6. Code of example application: https://github.com/PierreGtch/CrossDataset_Xie2023.
References
Luck, S.J.: An Introduction to the Event-Related Potential Technique, 2nd edn. MIT Press, Cambridge (2014)
Höhne, J., Krenzlin, K., Dähne, S., Tangermann, M.: Natural stimuli improve auditory BCIs with respect to ergonomics and performance. J. Neural Eng. 9(4), 045003 (2012)
Van Der Waal, M., Severens, M., Geuze, J., Desain, P.: Introducing the tactile speller: an ERP-based brain-computer interface for communication. J. Neural Eng. 9(4), 045002 (2012)
Clerc, M., Bougrain, L., Lotte, F.: Brain-Computer Interfaces 2. Wiley-ISTE (2016)
Müller, K.-R., Tangermann, M., Dornhege, G., Krauledat, M., Curio, G., Blankertz, B.: Machine learning for real-time single-trial EEG-analysis: from brain-computer interfacing to mental state monitoring. J. Neurosci. Methods 167(1), 82–90 (2008)
Meinel, A., Castaño-Candamil, S., Reis, J., Tangermann, M.: Pre-trial EEG-based single-trial motor performance prediction to enhance neuroergonomics for a hand force task. Front. Hum. Neurosci. 10, 170 (2016)
Kolkhorst, H., Tangermann, M., Burgard, W.: Decoding perceived hazardousness from user’s brain states to shape human-robot interaction. In: Proceedings of the Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, Vienna, Austria, pp. 349–350. ACM (2017)
Mane, R., Chouhan, T., Guan, C.: BCI for stroke rehabilitation: motor and beyond. J. Neural Eng. 17(4), 041001 (2020)
Musso, M., et al.: Aphasia recovery by language training using a brain–computer interface: a proof-of-concept study. Brain Commun. 4(1), fcac008 (2022)
Van Erp, J., Lotte, F., Tangermann, M.: Brain-computer interfaces: beyond medical applications. Computer 45(4), 26–34 (2012)
Tangermann, M., et al.: Playing pinball with non-invasive BCI. In: Advances in Neural Information Processing Systems, vol. 21. Curran Associates, Inc. (2008)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
Jayaram, V., Alamgir, M., Altun, Y., Schölkopf, B., Grosse-Wentrup, M.: Transfer learning in brain-computer interfaces. IEEE Comput. Intell. Mag. 11(1), 20–31 (2016)
Zhu, Y., Li, Y., Lu, J., Li, P.: EEGNet with ensemble learning to improve the cross-session classification of SSVEP based BCI from ear-EEG. IEEE Access 9, 15295–15303 (2021)
Samek, W., Meinecke, F.C., Müller, K.-R.: Transferring subspaces between subjects in brain-computer interfacing. IEEE Trans. Biomed. Eng. 60(8), 2289–2298 (2013)
Jeon, E., Ko, W., Yoon, J.S., Suk, H.-I.: Mutual information-driven subject-invariant and class-relevant deep representation learning in BCI. IEEE Trans. Neural Netw. Learn. Syst. 34, 1–11 (2021)
Kobler, R., Hirayama, J.-I., Zhao, Q., Kawanabe, M.: SPD domain-specific batch normalization to crack interpretable unsupervised domain adaptation in EEG. In: Advances in Neural Information Processing Systems, vol. 35, pp. 6219–6235 (2022)
Wei, X., et al.: 2021 BEETL competition: advancing transfer learning for subject independence and heterogenous EEG data sets. In: Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, pp. 205–219. PMLR (2022)
Xie, Y., et al.: Cross-dataset transfer learning for motor imagery signal classification via multi-task learning and pre-training. J. Neural Eng. 20(5), 056037 (2023)
Aristimunha, B., de Camargo, R.Y., Pinaya, W.H.L., Chevallier, S., Gramfort, A., Rommel, C.: Evaluating the structure of cognitive tasks with transfer learning (2023)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009)
Aristimunha, B., et al.: Mother of all BCI benchmarks. Zenodo (2023)
Lawhern, V.J., Solon, A.J., Waytowich, N.R., Gordon, S.M., Hung, C.P., Lance, B.J.: EEGNet: a compact convolutional network for EEG-based brain-computer interfaces. J. Neural Eng. 15(5), 056013 (2018)
Xu, F., et al.: A transfer learning framework based on motor imagery rehabilitation for stroke. Sci. Rep. 11(1), 19783 (2021)
Raza, H., Chowdhury, A., Bhattacharyya, S., Samothrakis, S.: Single-trial EEG classification with EEGNet and neural structured learning for improving BCI performance. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2020)
Schneider, T., Wang, X., Hersche, M., Cavigelli, L., Benini, L.: Q-EEGNet: an energy-efficient 8-bit quantized parallel EEGNet implementation for edge motor-imagery brain-machine interfaces. In: 2020 IEEE International Conference on Smart Computing (SMARTCOMP), pp. 284–289 (2020)
Riyad, M., Khalil, M., Adib, A.: Incep-EEGNet: a ConvNet for motor imagery decoding. In: El Moataz, A., Mammass, D., Mansouri, A., Nouboud, F. (eds.) ICISP 2020. LNCS, vol. 12119, pp. 103–111. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-51935-3_11
Deng, X., Zhang, B., Yu, N., Liu, K., Sun, K.: Advanced TSGL-EEGNet for motor imagery EEG-based brain-computer interfaces. IEEE Access 9, 25118–25130 (2021)
Barachant, A.: Commande robuste d’un effecteur par une interface cerveau machine EEG asynchrone. Ph.D. dissertation, Université de Grenoble (2012)
Tangermann, M., et al.: Review of the BCI competition IV. Front. Neurosci. 6, 55 (2012)
Leeb, R., Lee, F., Keinrath, C., Scherer, R., Bischof, H., Pfurtscheller, G.: Brain-computer communication: motivation, aim, and impact of exploring a virtual apartment. IEEE Trans. Neural Syst. Rehabil. Eng. 15(4), 473–482 (2007)
Faller, J., Vidaurre, C., Solis-Escalante, T., Neuper, C., Scherer, R.: Autocalibration and recurrent adaptation: towards a plug and play online ERD-BCI. IEEE Trans. Neural Syst. Rehabil. Eng. 20(3), 313–319 (2012)
Scherer, R., et al.: Individually adapted imagery improves brain-computer interface performance in end-users with disability. PLoS ONE 10(5), e0123727 (2015)
Cho, H., Ahn, M., Ahn, S., Kwon, M., Jun, S.C.: EEG datasets for motor imagery brain–computer interface. GigaScience 6(7) (2017)
Ofner, P., Schwarz, A., Pereira, J., Müller-Putz, G.R.: Upper limb movements can be decoded from the time-domain of low-frequency EEG. PLoS ONE 12(8), e0182578 (2017)
Goldberger, A.L., et al.: PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101(23), e215–e220 (2000)
Schirrmeister, R.T., et al.: Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp. 38(11), 5391–5420 (2017)
Yi, W., et al.: Evaluation of EEG oscillatory patterns and cognitive process during simple and compound limb motor imagery. PLoS ONE 9(12), e114853 (2014)
Zhou, B., Wu, X., Lv, Z., Zhang, L., Guo, X.: A fully automated trial selection method for optimization of motor imagery based brain-computer interface. PLoS ONE 11(9), e0162657 (2016)
Guetschel, P., Papadopoulo, T., Tangermann, M.: Embedding neurophysiological signals. In: 2022 IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence, and Neural Engineering (MetroXRAINE), Rome, pp. 169–174. IEEE (2022)
Kumar, A., Raghunathan, A., Jones, R., Ma, T., Liang, P.: Fine-tuning can distort pretrained features and underperform out-of-distribution. In: International Conference on Learning Representations (ICLR) (2022)
Provost, F., Fawcett, T., Kohavi, R.: The case against accuracy estimation for comparing induction algorithms. In: ICML, vol. 98, pp. 445–453 (1998)
Conover, W.J.: Practical Nonparametric Statistics. Wiley, Hoboken (1999)
Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Stat. Soc.: Ser. B (Methodol.) 57(1), 289–300 (1995)
Sajda, P., Gerson, A., Müller, K.-R., Blankertz, B., Parra, L.: A data analysis competition to evaluate machine learning algorithms for use in brain-computer interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 11(2), 184–185 (2003)
Blankertz, B., et al.: The BCI competition 2003: progress and perspectives in detection and discrimination of EEG single trials. IEEE Trans. Biomed. Eng. 51(6), 1044–1051 (2004)
Blankertz, B., et al.: The BCI competition III: validating alternative approaches to actual BCI problems. IEEE Trans. Neural Syst. Rehabil. Eng. 14(2), 153–159 (2006)
Jayaram, V., Barachant, A.: MOABB: trustworthy algorithm benchmarking for BCIs. J. Neural Eng. 15(6), 066011 (2018)
Bauer, R., Fels, M., Vukelić, M., Ziemann, U., Gharabaghi, A.: Bridging the gap between motor imagery and motor execution with a brain-robot interface. Neuroimage 108, 319–327 (2015)
Miller, K.J., Schalk, G., Fetz, E.E., den Nijs, M., Ojemann, J.G., Rao, R.P.N.: Cortical activity during motor execution, motor imagery, and imagery-based online feedback. Proc. Natl. Acad. Sci. 107(9), 4430–4435 (2010)
Batula, A.M., Mark, J.A., Kim, Y.E., Ayaz, H.: Comparison of brain activation during motor imagery and motor movement using fNIRS. Comput. Intell. Neurosci. 2017, 1–12 (2017)
Wriessnegger, S.C., Kurzmann, J., Neuper, C.: Spatio-temporal differences in brain oxygenation between movement execution and imagery: a multichannel near-infrared spectroscopy study. Int. J. Psychophysiol. 67(1), 54–63 (2008)
Guetschel, P., Moreau, T., Tangermann, M.: S-JEPA: towards seamless cross-dataset transfer through dynamic spatial attention (2024)
Hübner, D., Verhoeven, T., Müller, K.-R., Kindermans, P.-J., Tangermann, M.: Unsupervised learning for brain-computer interfaces based on event-related potentials: review and online comparison [research frontier]. IEEE Comput. Intell. Mag. 13(2), 66–77 (2018)
Martínez-Cagigal, V., Thielen, J., Santamaría-Vázquez, E., Pérez-Velasco, S., Desain, P., Hornero, R.: Brain-computer interfaces based on code-modulated visual evoked potentials (c-VEP): a literature review. J. Neural Eng. 18(6), 061002 (2021)
Sosulski, J., Tangermann, M.: Introducing block-Toeplitz covariance matrices to remaster linear discriminant analysis for event-related potential brain-computer interfaces. J. Neural Eng. 19(6), 066001 (2022)
Sosulski, J., Tangermann, M.: UMM: Unsupervised Mean-difference Maximization (2023)
Acknowledgement
This work is in part supported by the Donders Center for Cognition (DCC) and is part of the project Dutch Brain Interface Initiative (DBI2), project number 024.005.022 of the research programme Gravitation, which is (partly) financed by the Dutch Research Council (NWO).
A Appendix
A.1 Description of the Datasets
Below, we list the datasets used in this study, each with a link to a concise online description on the MOABB website; a minimal loading example via MOABB follows the list. For the full description of each dataset, we refer the reader to the corresponding original publications cited in Table 1.
- AlexMI: http://moabb.neurotechx.com/docs/generated/moabb.datasets.AlexMI.html#moabb.datasets.AlexMI
- BNCI2014001: http://moabb.neurotechx.com/docs/generated/moabb.datasets.BNCI2014001.html#moabb.datasets.BNCI2014001
- BNCI2014004: http://moabb.neurotechx.com/docs/generated/moabb.datasets.BNCI2014004.html#moabb.datasets.BNCI2014004
- BNCI2015001: http://moabb.neurotechx.com/docs/generated/moabb.datasets.BNCI2015001.html#moabb.datasets.BNCI2015001
- BNCI2015004: http://moabb.neurotechx.com/docs/generated/moabb.datasets.BNCI2015004.html#moabb.datasets.BNCI2015004
- Cho2017: http://moabb.neurotechx.com/docs/generated/moabb.datasets.Cho2017.html#moabb.datasets.Cho2017
- Lee2019 MI: http://moabb.neurotechx.com/docs/generated/moabb.datasets.Lee2019_MI.html#moabb.datasets.Lee2019_MI
- Ofner2017: http://moabb.neurotechx.com/docs/generated/moabb.datasets.Ofner2017.html#moabb.datasets.Ofner2017
- PhysionetMI: http://moabb.neurotechx.com/docs/generated/moabb.datasets.PhysionetMI.html#moabb.datasets.PhysionetMI
- Schirrmeister2017: http://moabb.neurotechx.com/docs/generated/moabb.datasets.Schirrmeister2017.html#moabb.datasets.Schirrmeister2017
- Weibo2014: http://moabb.neurotechx.com/docs/generated/moabb.datasets.Weibo2014.html#moabb.datasets.Weibo2014
- Zhou2016: http://moabb.neurotechx.com/docs/generated/moabb.datasets.Zhou2016.html#moabb.datasets.Zhou2016
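For reference, the sketch below shows how any dataset in this list can be loaded through MOABB. BNCI2014001 and a default MotorImagery paradigm serve as an arbitrary example; the exact paradigm parameters of our study are not reproduced here.

```python
from moabb.datasets import BNCI2014001
from moabb.paradigms import MotorImagery

# Instantiate one of the twelve datasets and a motor-imagery paradigm,
# then retrieve the epoched trials, labels, and per-trial metadata.
dataset = BNCI2014001()
paradigm = MotorImagery()
X, y, metadata = paradigm.get_data(dataset=dataset, subjects=[1])
print(X.shape, sorted(set(y)))
```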
A.2 Detailed Compatibility Tables
We present here the tables in which we collected our results. They can be used to estimate the compatibility for transfer learning between the different motor imagery datasets we analysed. For editorial reasons, the compatibility tables had to be split according to the number of calibration trials per class and to be provided in black and white; a colour, un-split version of these tables is available online (see footnote 5).
The AUC scores for the feet vs. right-hand classification task with 1, 2, 4, 8, 16, 32, and 64 calibration trials per class can be found in Tables 2a, 3a, 4a, 5a, 6a, 7a, and 8a, respectively. The AUC scores for the left-hand vs. right-hand classification task can be found in Tables 2b, 3b, 4b, 5b, 6b, 7b, and 8b. Finally, the accuracy scores on all classes of the test dataset can be found in Tables 2c, 3c, 4c, 5c, 6c, 7c, and 8c.
A.3 Experimental Details of the Example Application
We attempted to reproduce the results that Xie and colleagues [19] presented in Sect. 4.1 of their article (entitled Classification accuracy of the DL models when fine-tuning on the whole training set), in particular in their Table 2. Because the corresponding source code was not published, we had to re-create the experimental settings from what was reported in the article. Unfortunately, we faced several obstacles in this process, which we describe in this section.
The first difficulty was to establish which fine-tuning scheme had been used in the experiment of Sect. 4.1. In Sect. 2.3, four different fine-tuning schemes were presented, but it was not clear from the text which one was used to obtain the results in Table 2. Later in the article, the four fine-tuning schemes are compared (only with Schirrmeister2017 as donor) in Table 4, and the results of the fourth scheme appear relatively similar to those presented in column “PT_by_HGD” of Table 2, despite not being perfectly equal. We therefore assumed that the whole Table 2 was obtained using fine-tuning scheme 4 and used this scheme in our replication as well.
Then, it is mentioned that the same training strategy as Schirrmeister et al. [37] was used. However, the article follows up by describing a training strategy that differs from the one presented by Schirrmeister and colleagues. The strategy of Xie and colleagues reads as follows: “In the first stage, only training sets were used for model training. When the loss on the training set does not decrease for some epochs, we stop training and record the lowest loss on the training set. Then, training and validation sets were used to train the model and stop training when the loss on the validation set dropped to the same value as the lowest loss at the first training stage.” This suggests that the early-stopping mechanism of their first stage may not take the validation loss into account. We assumed this was a simple inaccuracy in the textual description and chose to implement the training strategy originally described by Schirrmeister and colleagues. Additionally, the number of epochs without improvement to wait for before ending the first stage was not provided; we chose the same value for early stopping as mentioned earlier in their text, albeit in another context, i.e., 100 epochs.
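For clarity, a minimal sketch of the two-stage strategy as we implemented it is given below. The helpers `train_one_epoch` and `loss_on`, the dataset objects, and the `max_epochs` cap are hypothetical placeholders, not part of any specific library.

```python
import copy
from torch import nn

def two_stage_fit(model: nn.Module, train_one_epoch, loss_on,
                  train_set, valid_set, patience: int = 100,
                  max_epochs: int = 1000) -> nn.Module:
    # Stage 1: train on the training set only; early-stop once the
    # validation loss has not improved for `patience` (here 100) epochs.
    best_valid, best_train, best_state, wait = float("inf"), None, None, 0
    for _ in range(max_epochs):
        train_one_epoch(model, train_set)
        valid_loss = loss_on(model, valid_set)
        if valid_loss < best_valid:
            best_valid, wait = valid_loss, 0
            best_train = loss_on(model, train_set)  # training loss at best epoch
            best_state = copy.deepcopy(model.state_dict())
        else:
            wait += 1
            if wait >= patience:
                break
    model.load_state_dict(best_state)
    # Stage 2: continue on training + validation data until the validation
    # loss drops to the lowest training loss recorded in stage 1.
    for _ in range(max_epochs):
        train_one_epoch(model, train_set + valid_set)
        if loss_on(model, valid_set) <= best_train:
            break
    return model
```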
Concerning the cross-validation procedure, it is mentioned that “Each configuration was evaluated twenty times using different random seeds”. However, it was not clear whether the different folds over a given test subject were all based on the same pre-trained model or whether a new pre-training was done for each fold. We chose the first option, as it is significantly faster to compute.
The construction of the train/validation/test data splits was also relatively obscure. In the absence of source code, the exact data splits are unknown. For pre-training, it is mentioned in their Sect. 3.4.1 that the data of all the subjects was “divided into 90% training and 10% validation sets”. We assumed the examples of all subjects were first shuffled, so that every subject is present in both sets; nevertheless, it is highly unlikely that we reached the exact same split as the original authors. For fine-tuning, it is mentioned in Sect. 4.1 that the “train-test splition [sic] is the same as in the BCI Competition IV and the original paper [37]”. These train/test splits are annotated in the MOABB library, so we used the library's annotations to obtain them. Then, it is stated that the training data is split into an “80% training set and 20% validation set”. For each subject, we split the training data 20 times, each time with a different random seed and in a stratified way, to create 20 folds; again, it is highly unlikely that we reached the exact same splits as the original authors. As a sanity check, we reproduced Table 1 of the original work, which compiles the amount of data in the training, validation and testing sets of each dataset; this leads to our Table 9. The numbers in the two tables do not match, but we suspect that the main difference can be explained by a confusion in the authors' table: their line “Our dataset” is relatively similar, except regarding N_Subjects, to the Schirrmeister2017 line in our Table 9. The other disagreements are much smaller and could be explained by a difference in counting methods. Regardless, we conducted our experiments despite the potential disagreement between the two tables.
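The sketch below shows how the 20 stratified 80/20 train/validation folds can be constructed from one subject's competition training data. Using the seed values 0 to 19 is our own assumption, since the original seeds were not published.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

def make_finetuning_folds(y_train: np.ndarray, n_folds: int = 20,
                          valid_size: float = 0.2):
    # One stratified 80%/20% train/validation split per random seed,
    # as in our replication; only the labels are needed for splitting.
    folds = []
    for seed in range(n_folds):
        splitter = StratifiedShuffleSplit(n_splits=1, test_size=valid_size,
                                          random_state=seed)
        train_idx, valid_idx = next(splitter.split(np.zeros(len(y_train)),
                                                   y_train))
        folds.append((train_idx, valid_idx))
    return folds
```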
Finally, some choices seemed rather unconventional to us. First, the band-pass frequency filtering is applied after the epoching step, which can cause edge artefacts. Second, the signal is fed to the neural networks at 250 Hz. While Deep ConvNet and Shallow ConvNet were indeed developed for signals at this frequency [37], EEGNet was developed for signals at 128 Hz [23]; it would typically be recommended to adapt the hyperparameters of EEGNet before using it on such signals. Third, an electrode-wise exponential moving standardisation (EMS) step is also applied after epoching, using an initialisation on the first 1000 samples. At 250 Hz, this means the EMS is initialised on an interval of 4 s and only “moves” over the last 0.5 s of each epoch. Nevertheless, we implemented these preprocessing steps the way they were described in the original article.
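A sketch of these preprocessing steps follows. The band edges are placeholder values, and the braindecode function `exponential_moving_standardize` is one available EMS implementation; whether it matches the original authors' implementation exactly is unknown.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt
from braindecode.preprocessing import exponential_moving_standardize

def preprocess_epoch(epoch: np.ndarray, sfreq: float = 250.0,
                     band: tuple = (4.0, 38.0)) -> np.ndarray:
    # `epoch` has shape (n_channels, n_times). The band-pass filter is
    # applied *after* epoching, as described in the original article
    # (this is what can cause the edge artefacts discussed above).
    sos = butter(4, band, btype="bandpass", fs=sfreq, output="sos")
    filtered = sosfiltfilt(sos, epoch, axis=-1)
    # Electrode-wise exponential moving standardisation, initialised on
    # the first 1000 samples, i.e. the first 4 s at 250 Hz.
    return exponential_moving_standardize(filtered, init_block_size=1000)
```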
The rest of the experimental settings strictly follow what was described in the original article by Xie and colleagues [19]. The source code of our reproduction attempt, along with the list of libraries we used and their versions, can be found on GitHub (see footnote 6).