
Advancing EEG-Based Gaze Prediction Using Depthwise Separable Convolution and Enhanced Pre-processing

  • Conference paper
Augmented Cognition (HCII 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14695)


Abstract

In EEG-based gaze prediction, applying deep learning to interpret complex neural data poses significant challenges. This study evaluates the effectiveness of pre-processing techniques and of adding depthwise separable convolutions to EEG vision transformers (ViTs) within a pretrained model architecture. We introduce a novel method, the EEG Deeper Clustered Vision Transformer (EEG-DCViT), which combines depthwise separable convolutional neural networks (CNNs) with vision transformers, enriched by a pre-processing strategy involving data clustering. The new approach achieves superior performance, setting a new benchmark with a Root Mean Square Error (RMSE) of 51.6 mm. This result underscores the impact of pre-processing and model refinement on EEG-based applications.

M. L. Key and T. Mehtiyev contributed equally to this work.

Full source code is available at https://github.com/GWU-CS/EEG-DCViT.
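The repository holds the full EEG-DCViT implementation. As a rough orientation only, the sketch below (in PyTorch, assumed here rather than taken from the repository) shows a generic depthwise separable convolution block of the kind the model adds in front of its ViT backbone, plus one common convention for computing gaze RMSE in millimetres. The input shape, channel counts, and activation choice are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv2d(nn.Module):
    """Depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        # groups=in_ch makes this depthwise: each input channel is filtered
        # independently, cutting parameters relative to a standard convolution.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        # The 1x1 pointwise convolution then mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Hypothetical EEG trial viewed as a 2D image: (batch, channels, electrodes, time).
x = torch.randn(8, 16, 129, 500)
print(DepthwiseSeparableConv2d(16, 32)(x).shape)  # torch.Size([8, 32, 129, 500])

# One common convention for 2D gaze RMSE: the root of the mean squared
# Euclidean distance between predicted and true (x, y) positions in mm.
pred = torch.randn(8, 2) * 50  # dummy predictions, mm
true = torch.randn(8, 2) * 50  # dummy ground truth, mm
rmse_mm = torch.sqrt(((pred - true) ** 2).sum(dim=1).mean())
print(f"RMSE: {rmse_mm.item():.1f} mm")
```

The factorisation is the standard efficiency argument for depthwise separable convolutions: a k x k depthwise filter plus a 1x1 pointwise mix needs roughly in_ch*k^2 + in_ch*out_ch weights, versus in_ch*out_ch*k^2 for a full convolution.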



Author information

Correspondence to Matthew L. Key.

Ethics declarations

Disclosure of Interests

The authors declare no competing interests.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Key, M.L., Mehtiyev, T., Qu, X. (2024). Advancing EEG-Based Gaze Prediction Using Depthwise Separable Convolution and Enhanced Pre-processing. In: Schmorrow, D.D., Fidopiastis, C.M. (eds.) Augmented Cognition. HCII 2024. Lecture Notes in Computer Science (LNAI), vol. 14695. Springer, Cham. https://doi.org/10.1007/978-3-031-61572-6_1

  • DOI: https://doi.org/10.1007/978-3-031-61572-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-61571-9

  • Online ISBN: 978-3-031-61572-6

  • eBook Packages: Computer Science, Computer Science (R0)
