Abstract
Self-supervised learning has emerged as a powerful paradigm for representation learning. In this study, we design a masked autoencoder (MAE) that guides deep learning models to learn representations of electroencephalography (EEG) signals. Our MAE consists of an encoder and a decoder. A given proportion of the input EEG signals is randomly masked and fed to the MAE, whose goal is to reconstruct the masked signals. After this self-supervised pre-training, the encoder is fine-tuned on downstream tasks. We evaluate our MAE on the EEGEyeNet gaze estimation task and find that it is an effective brain-signal learner that also markedly improves learning efficiency: compared with the same model trained from scratch, the pre-trained model reaches equal performance in one-third of the training time and surpasses it within half the training time. Our study shows that self-supervised learning is a promising research direction for EEG-based applications, as it has been in other fields (natural language processing, computer vision, robotics, etc.), and we therefore expect foundation models to succeed in the EEG domain as well.
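The pre-training recipe described above (randomly mask part of the EEG input, then reconstruct the masked part) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the patch-based masking scheme, the 50% default mask ratio, the MSE reconstruction loss, and all function names are assumptions introduced here for clarity; the paper states only that a proportion of the input is masked and recovered.

```python
import numpy as np

def random_mask(x, mask_ratio=0.5, seed=None):
    """Randomly hide a proportion of time patches from one EEG trial.

    x is assumed to have shape (num_patches, patch_dim), i.e. the trial is
    split into non-overlapping time patches. Returns the visible patches
    plus the index sets needed to score reconstruction on masked patches.
    """
    rng = np.random.default_rng(seed)
    num_patches = x.shape[0]
    num_masked = int(round(num_patches * mask_ratio))
    perm = rng.permutation(num_patches)          # random patch ordering
    masked_idx = np.sort(perm[:num_masked])      # patches hidden from the encoder
    visible_idx = np.sort(perm[num_masked:])     # patches the encoder sees
    return x[visible_idx], visible_idx, masked_idx

def masked_reconstruction_loss(pred, target, masked_idx):
    """MSE evaluated only on the masked patches (loss choice illustrative)."""
    diff = pred[masked_idx] - target[masked_idx]
    return float(np.mean(diff ** 2))
```

During pre-training, the encoder receives only the visible patches; the decoder predicts the full signal, and the loss is computed on the masked indices alone, so the model cannot succeed by simply copying its input.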
Notes
- 1.
We also experimented with the mean squared error (MSE) loss function; the performance gain it brings is not substantial.
- 2.
Here "EEGViT" is equivalent to "EEGViT Pre-trained" in Table 4 of [29]. This applies to subsequent mentions as well.
References
Altaheri, H., et al.: Deep learning techniques for classification of electroencephalogram (EEG) motor imagery (MI) signals: a review. Neural Comput. Appl. 35(20), 14681–14722 (2023)
Bao, H., Dong, L., Piao, S., Wei, F.: BEiT: BERT pre-training of image transformers. arXiv preprint arXiv:2106.08254 (2021)
Bashivan, P., Rish, I., Yeasin, M., Codella, N.: Learning representations from EEG with deep recurrent-convolutional neural networks. arXiv preprint arXiv:1511.06448 (2015)
Bommasani, R., et al.: On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2021)
Brown, T., et al.: Language models are few-shot learners. Adv. Neural. Inf. Process. Syst. 33, 1877–1901 (2020)
Chen, M., et al.: Generative pretraining from pixels. In: International Conference on Machine Learning. pp. 1691–1703. PMLR (2020)
Chien, H.Y.S., Goh, H., Sandino, C.M., Cheng, J.Y.: MAEEG: masked auto-encoder for EEG representation learning. arXiv preprint arXiv:2211.02625 (2022)
Craik, A., He, Y., Contreras-Vidal, J.L.: Deep learning for electroencephalogram (EEG) classification tasks: a review. J. Neural Eng. 16(3), 031001 (2019)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
Dosovitskiy, A., et al.: An image is worth 16×16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
Firoozi, R., et al.: Foundation models in robotics: applications, challenges, and the future. arXiv preprint arXiv:2312.07843 (2023)
He, K., Chen, X., Xie, S., Li, Y., Dollár, P., Girshick, R.: Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16000–16009 (2022)
Kastrati, A., et al.: EEGEyeNet: a simultaneous electroencephalography and eye-tracking dataset and benchmark for eye movement prediction. In: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1) (2021)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT (2019)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kostas, D., Aroca-Ouellette, S., Rudzicz, F.: BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data. Front. Hum. Neurosci. 15, 653659 (2021)
Lawhern, V.J., Solon, A.J., Waytowich, N.R., Gordon, S.M., Hung, C.P., Lance, B.J.: EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces. J. Neural Eng. 15(5), 056013 (2018)
Li, C., et al.: Multimodal foundation models: from specialists to general-purpose assistants. arXiv preprint arXiv:2309.10020 (2023)
Mao, W., Fathurrahman, H., Lee, Y., Chang, T.: EEG dataset classification using CNN method. In: Journal of Physics: Conference Series, vol. 1456, p. 012017. IOP Publishing (2020)
Murungi, N.K., Pham, M.V., Dai, X.C., Qu, X.: Empowering computer science students in electroencephalography (EEG) analysis: a review of machine learning algorithms for EEG datasets (2023)
OpenAI: GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023)
Peng, R., et al.: Wavelet2vec: a filter bank masked autoencoder for EEG-based seizure subtype classification. In: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. IEEE (2023)
Pulver, D., Angkan, P., Hungler, P., Etemad, A.: EEG-based cognitive load classification using feature masked autoencoding and emotion transfer learning. In: Proceedings of the 25th International Conference on Multimodal Interaction, pp. 190–197 (2023)
Radford, A., Narasimhan, K., Salimans, T., Sutskever, I., et al.: Improving language understanding by generative pre-training (2018)
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al.: Language models are unsupervised multitask learners. OpenAI blog 1(8), 9 (2019)
Roy, Y., Banville, H., Albuquerque, I., Gramfort, A., Falk, T.H., Faubert, J.: Deep learning-based electroencephalography analysis: a systematic review. J. Neural Eng. 16(5), 051001 (2019)
Weng, N., Płomecka, M.B., Kaufmann, M., Kastrati, A., Wattenhofer, R., Langer, N.: An interpretable attention-based method for gaze estimation using electroencephalography (2023)
Xiao, G., Shi, M., Ye, M., Xu, B., Chen, Z., Ren, Q.: 4D attention-based neural network for EEG emotion recognition. Cogn. Neurodyn. 1–14 (2022)
Yang, R., Modesitt, E.: ViT2EEG: leveraging hybrid pretrained vision transformers for EEG data. arXiv preprint arXiv:2308.00454 (2023)
Yang, S., Nachum, O., Du, Y., Wei, J., Abbeel, P., Schuurmans, D.: Foundation models for decision making: problems, methods, and opportunities. arXiv preprint arXiv:2303.04129 (2023)
Yi, L., Qu, X.: Attention-based CNN capturing EEG recording’s average voltage and local change. In: Degen, H., Ntoa, S. (eds.) HCII 2022. LNCS, vol. 13336, pp. 448–459. Springer, Heidelberg (2022). https://doi.org/10.1007/978-3-031-05643-7_29
Zhou, C., et al.: A comprehensive survey on pretrained foundation models: a history from BERT to ChatGPT. arXiv preprint arXiv:2302.09419 (2023)
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zhou, Y., Liu, S. (2024). Enhancing Representation Learning of EEG Data with Masked Autoencoders. In: Schmorrow, D.D., Fidopiastis, C.M. (eds) Augmented Cognition. HCII 2024. Lecture Notes in Computer Science(), vol 14695. Springer, Cham. https://doi.org/10.1007/978-3-031-61572-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-61571-9
Online ISBN: 978-3-031-61572-6