A dual transfer learning method based on 3D-CNN and vision transformer for emotion recognition

Published in: Applied Intelligence

Abstract

In the domain of medical science, emotion recognition based on electroencephalogram (EEG) signals is widely used in affective computing. Despite the prevalence of deep learning in EEG signal analysis, standard convolutional and recurrent neural networks fall short in processing EEG data effectively because of their limited ability to capture global dependencies and to handle the non-linear, non-stationary characteristics of EEG signals. We propose a dual transfer learning method that combines a 3D Convolutional Neural Network (3D-CNN) with a Vision Transformer (ViT) to enhance emotion recognition. The 3D-CNN captures the spatial characteristics of EEG signals and reduces data covariance, extracting shallow features, while the ViT improves the model's ability to capture long-range dependencies, enabling deep feature extraction. The methodology is a two-stage process: first, the front end of a pre-trained 3D-CNN serves as a shallow feature extractor that mitigates EEG data covariance and transformer biases, focusing on low-level feature detection; second, the ViT acts as a deep feature extractor that models the global structure of EEG signals and applies attention mechanisms for precise classification. We also present a data-mapping algorithm for transfer learning that ensures consistent feature representation across the spatial and temporal dimensions. This approach significantly improves global feature processing and long-range dependency detection, and the integration of color channels increases the model's sensitivity to signal variations. In 10-fold cross-validation on the DEAP dataset, the proposed method achieves classification accuracies of 92.44% and 92.85% on the valence and arousal dimensions, respectively, and four-class accuracies across valence and arousal of HVHA: 88.01%, HVLA: 88.27%, LVHA: 90.89%, and LVLA: 78.84%. On the SEED dataset it achieves an accuracy of 98.69%. Overall, this methodology holds substantial potential for advancing emotion recognition tasks and contributes to the broader field of affective computing.
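To make the two-stage design described in the abstract concrete, the following is a minimal PyTorch sketch of the pipeline: a 3D-CNN front end extracts shallow spatio-temporal features, which are flattened into tokens and classified by a ViT-style Transformer encoder. The tensor layout (three color channels over a 9x9 electrode grid), the layer sizes, and the token construction are illustrative assumptions, not the authors' released implementation; the paper's pre-training, transfer learning, and data-mapping steps are omitted.

import torch
import torch.nn as nn


class CNN3DFrontEnd(nn.Module):
    """Shallow feature extractor: 3D convolutions over (time, height, width)."""

    def __init__(self, in_channels=3, out_channels=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, padding=1),
            nn.BatchNorm3d(32),  # normalization helps with covariate shift in EEG
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(2, 1, 1)),  # downsample time, keep the grid
            nn.Conv3d(32, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # x: (batch, color_channels, time, height, width)
        return self.features(x)


class ViTHead(nn.Module):
    """Deep feature extractor: ViT-style attention over tokens from the 3D map."""

    def __init__(self, feat_dim=64, n_tokens=324, num_classes=2, depth=4, heads=8):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, feat_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_tokens + 1, feat_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=heads,
            dim_feedforward=4 * feat_dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, feat):
        # feat: (batch, feat_dim, time, height, width) -> (batch, n_tokens, feat_dim)
        b = feat.shape[0]
        tokens = feat.flatten(2).transpose(1, 2)
        tokens = torch.cat([self.cls_token.expand(b, -1, -1), tokens], dim=1)
        tokens = tokens + self.pos_embed
        return self.head(self.encoder(tokens)[:, 0])  # classify from the CLS token


# Toy batch: 4 samples, 3 color channels, 8 time frames on a 9x9 electrode grid.
cnn, vit = CNN3DFrontEnd(), ViTHead(num_classes=2)  # e.g. high vs. low valence
x = torch.randn(4, 3, 8, 9, 9)
logits = vit(cnn(x))   # after time pooling: 4 * 9 * 9 = 324 tokens
print(logits.shape)    # torch.Size([4, 2])

In this sketch the 3D-CNN halves only the temporal axis, so each remaining (time, row, column) cell of the feature map becomes one token; a learned class token then aggregates global context through self-attention before classification.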




Availability of data and materials

All data generated or analyzed during this study are available from the following sources:

  • DEAP dataset: http://www.eecs.qmul.ac.uk/mmv/datasets/deap/download.html
  • SEED dataset: https://bcmi.sjtu.edu.cn/~seed/index.html

Code availability

The code can be made available upon reasonable request.


Acknowledgements

The authors are grateful to the editor and referees for their valuable comments and suggestions for improving the paper.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61836011 and in part by the Fundamental Research Funds for the Central Universities of China under Grant N2104001.

Author information


Contributions

Zhifen Guo: Conceptualization, Investigation, Methodology, Formal analysis, Data curation, Coding, Writing - original draft. Jiao Wang: Conceptualization, Supervision, Funding acquisition. Bin Zhang: Conceptualization, Investigation, Formal analysis, Data curation, Coding. Yating Ku: Conceptualization, Validation, Writing - review & editing. Fengbin Ma: Methodology, Writing - review & editing.

Corresponding author

Correspondence to Jiao Wang.

Ethics declarations

Conflicts of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Ethics approval

Not applicable

Consent to participate

Not applicable

Consent for publication

All authors read and approved the final manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Guo, Z., Wang, J., Zhang, B. et al. A dual transfer learning method based on 3D-CNN and vision transformer for emotion recognition. Appl Intell 55, 200 (2025). https://doi.org/10.1007/s10489-024-05976-z
