
MF-Net: a multimodal fusion network for emotion recognition based on multiple physiological signals

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Research on emotion recognition has shown that fusing multi-modal data improves the accuracy and robustness of recognition compared with single-modal methods. Despite the promising results of existing methods, significant challenges remain in effectively fusing data from multiple modalities. First, existing works tend to focus on generating a joint representation from the fused multi-modal data, and few methods preserve the specific characteristics of each modality. Second, most methods fail to fully capture the intricate correlations among modalities, often resorting to simplistic combinations of latent features. To address these challenges, we propose a novel fusion network for multi-modal emotion recognition that enhances the efficacy of multi-modal fusion while preserving the distinct characteristics of each modality. Specifically, a dual-stream multi-scale feature encoding (MFE) module is designed to extract emotional information from temporal slices of electroencephalogram (EEG) and peripheral physiological signals (PPS). A cross-modal global–local feature fusion module (CGFFM) then integrates global and local information from the multi-modal data and assigns a different importance to each modality, so that the fused representation is biased toward the more informative modalities. In parallel, a transformer module is employed to further learn modality-specific information. Finally, an adaptive collaboration block (ACB) leverages both modality-specific and cross-modality relations for enhanced integration and feature representation. In extensive experiments on the DEAP and DREAMER multi-modal datasets, our model achieves state-of-the-art performance.
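The abstract describes the architecture only at a high level, and no reference implementation is provided here. The following is a minimal PyTorch sketch of the general pattern it outlines: dual-branch multi-scale 1-D convolutional encoders for EEG and PPS temporal slices, a cross-modal attention step with learned per-modality weighting, and a classification head. All module names (MultiScaleEncoder, CrossModalFusion, FusionEmotionNet), kernel sizes, dimensions, and head counts are illustrative assumptions, not the authors' MF-Net implementation; the transformer branch and the ACB are simplified away.

```python
# Illustrative sketch only: every design choice below (kernel sizes, feature
# dimensions, attention layout) is an assumption, not the published MF-Net.
import torch
import torch.nn as nn


class MultiScaleEncoder(nn.Module):
    """Dual-kernel 1-D conv encoder for one modality (hypothetical MFE analogue)."""

    def __init__(self, in_ch: int, dim: int = 64):
        super().__init__()
        self.branch_s = nn.Conv1d(in_ch, dim // 2, kernel_size=3, padding=1)   # short-range scale
        self.branch_l = nn.Conv1d(in_ch, dim // 2, kernel_size=15, padding=7)  # long-range scale
        self.norm = nn.BatchNorm1d(dim)
        self.act = nn.GELU()

    def forward(self, x):                        # x: (batch, channels, time)
        z = torch.cat([self.branch_s(x), self.branch_l(x)], dim=1)
        return self.act(self.norm(z)).transpose(1, 2)   # (batch, time, dim)


class CrossModalFusion(nn.Module):
    """Cross-attention plus learned modality weighting (hypothetical CGFFM analogue)."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.attn_e2p = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.attn_p2e = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.gate = nn.Linear(2 * dim, 2)         # one importance score per modality

    def forward(self, eeg, pps):                  # both: (batch, time, dim)
        e, _ = self.attn_e2p(eeg, pps, pps)       # EEG queries attend to PPS
        p, _ = self.attn_p2e(pps, eeg, eeg)       # PPS queries attend to EEG
        pooled = torch.cat([e.mean(1), p.mean(1)], dim=-1)
        w = torch.softmax(self.gate(pooled), dim=-1)          # (batch, 2)
        return w[:, :1] * e.mean(1) + w[:, 1:] * p.mean(1)    # weighted fusion


class FusionEmotionNet(nn.Module):
    """End-to-end sketch: encode each modality, fuse, classify (e.g. valence/arousal)."""

    def __init__(self, eeg_ch: int = 32, pps_ch: int = 8, dim: int = 64, n_classes: int = 2):
        super().__init__()
        self.enc_eeg = MultiScaleEncoder(eeg_ch, dim)
        self.enc_pps = MultiScaleEncoder(pps_ch, dim)
        self.fusion = CrossModalFusion(dim)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, eeg, pps):
        return self.head(self.fusion(self.enc_eeg(eeg), self.enc_pps(pps)))


if __name__ == "__main__":
    # DEAP-style shapes: 32 EEG channels, 8 peripheral channels, a 1-s slice at 128 Hz.
    model = FusionEmotionNet()
    logits = model(torch.randn(4, 32, 128), torch.randn(4, 8, 128))
    print(logits.shape)   # torch.Size([4, 2])
```

The softmax gate is one simple way to realize the abstract's idea of steering the fused representation toward the more informative modality; the actual CGFFM and ACB may use different mechanisms.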



Data availability

The authors do not have permission to share data.


Acknowledgements

We sincerely appreciate all the editors and reviewers for their insightful comments and constructive suggestions. This work was supported by the Key Research and Development Project of Zhejiang Province (Grant No. 2020C04009), the Laboratory of Brain Machine Collaborative (Grant No. 2020E10010), and the Zhejiang Provincial Natural Science Foundation of China (Grant No. LGF22H090004).

Author information

Authors and Affiliations

Authors

Contributions

All authors have contributed equally.

Corresponding author

Correspondence to Lei Zhu.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Zhu, L., Ding, Y., Huang, A. et al. MF-Net: a multimodal fusion network for emotion recognition based on multiple physiological signals. SIViP 19, 58 (2025). https://doi.org/10.1007/s11760-024-03632-0

