Multimodal Emotion Recognition Using Compressed Graph Neural Networks

  • Conference paper
Speech and Computer (SPECOM 2024)

Abstract

As electronic devices have become an integral part of everyday life, there is a growing need to make communication between a human and a machine as similar as possible to communication between two people. Because interpersonal relationships are built on feelings and empathy, training machines to understand emotions and to respond in accordance with the user's emotional state has become an active area of technology development. To gain a more comprehensive understanding of a person's emotional state, different modalities such as audio, text, and video can be used simultaneously and then processed by a graph neural network (GNN), an approach that has recently become popular because it is well suited to tracking a conversation. However, small IoT devices commonly have limited computational capability, memory, and power budgets, so running such a complex multimodal algorithm in real time may be difficult. In this research, we examine the use of binarization and 8-bit floating-point arithmetic for compressing the state-of-the-art GNN-based model COGMEN. We demonstrate that, for the multimodal emotion recognition task, such constrained models can provide significant data savings while maintaining relatively high performance, as shown through experiments on the IEMOCAP dataset.
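The two compression ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the binarization rule or the 8-bit layout, so the code below assumes plain sign binarization and a toy 8-bit float with 1 sign bit, 4 exponent bits, and 3 mantissa bits under IEEE-754-style conventions. The function names `binarize`, `quantize_fp8`, and `compression_ratio` are hypothetical.

```python
import math


def binarize(weights):
    """Map each weight to +1 or -1 by its sign (assumption: zero maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]


def quantize_fp8(x, exp_bits=4, man_bits=3):
    """Round x to the nearest value of a toy 8-bit float format.

    Assumed layout: 1 sign bit, exp_bits exponent bits, man_bits mantissa
    bits, with the top exponent code reserved and out-of-range values
    saturated to the largest finite magnitude.
    """
    if x == 0.0:
        return 0.0
    bias = 2 ** (exp_bits - 1) - 1
    min_exp = 1 - bias                      # exponent of the smallest normal value
    max_exp = (2 ** exp_bits - 2) - bias    # largest usable exponent
    max_val = (2 - 2 ** -man_bits) * 2.0 ** max_exp
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), max_val)              # saturate instead of overflowing
    # math.frexp gives mag = m * 2**e with 0.5 <= m < 1, so floor(log2(mag)) = e - 1
    e = math.frexp(mag)[1] - 1
    e = max(e, min_exp)                     # subnormals share the minimum exponent
    step = 2.0 ** (e - man_bits)            # spacing of representable values near mag
    return sign * min(round(mag / step) * step, max_val)


def compression_ratio(bits_from, bits_to):
    """Storage ratio when each parameter shrinks from bits_from to bits_to bits."""
    return bits_from / bits_to


if __name__ == "__main__":
    weights = [0.7, -0.2, 0.0, -3.1]
    print(binarize(weights))                       # one bit per weight
    print([quantize_fp8(w) for w in weights])      # eight bits per weight
    print(compression_ratio(32, 1), compression_ratio(32, 8))
```

Relative to 32-bit floating-point storage, one bit per parameter is a 32-fold reduction and eight bits per parameter is a 4-fold reduction, which is the kind of data saving the abstract refers to. The accuracy cost depends on the model and the training procedure and is not captured by this sketch.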

Acknowledgments

This study was funded by the European Union through the project Multilingual and Cross-cultural interactions for context-aware, and bias-controlled dialogue systems for safety-critical applications (ELOQUENCE), Grant agreement No. 101135916. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or European Commission-EU. Neither the European Union nor the granting authority can be held responsible for them.

This research was also supported by the Science Fund of the Republic of Serbia, Grant No. 7449, Multimodal multilingual human-machine speech communication (AI-SPEAK).

Disclosure of Interests.

The authors have no competing interests to declare that are relevant to the content of this article.

Author information

Corresponding author

Correspondence to Nikola Simić.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Đurkić, T., Simić, N., Suzić, S., Bajović, D., Perić, Z., Delić, V. (2025). Multimodal Emotion Recognition Using Compressed Graph Neural Networks. In: Karpov, A., Delić, V. (eds) Speech and Computer. SPECOM 2024. Lecture Notes in Computer Science(), vol 15300. Springer, Cham. https://doi.org/10.1007/978-3-031-78014-1_9

  • DOI: https://doi.org/10.1007/978-3-031-78014-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78013-4

  • Online ISBN: 978-3-031-78014-1

  • eBook Packages: Computer Science, Computer Science (R0)
