Abstract
Analyzing and detecting human intentions and emotions is an important means of improving communication between users and machines in the areas of human-computer interaction (HCI) and human-robot interaction (HRI). Despite significant progress with state-of-the-art (SOTA) Transformer-based models, obstacles persist in managing complicated input interdependencies and extracting intricate contextual semantics. Moreover, existing methods lack practical applicability and struggle to accurately capture and effectively manage the inherent complexity and unpredictability of human emotions. In recognition of these research gaps, we introduce a robust and innovative fuzzy multi-modal Transformer (FMMT) model. Our fuzzy Transformer model heightens the comprehension of emotional contexts by concurrently analyzing audio, visual, and text data through three distinct branches. By incorporating fuzzy mathematical theory and introducing a unique temporal embedding technique to trace the evolution of emotional states, it effectively handles the inherent uncertainty in human emotions, thereby filling a significant void in emotional AI. Building upon the FMMT model, we further explored the emotion expression approach. Furthermore, a performance comparison with SOTA baseline methods and a detailed ablation study were performed; the results show that the proposed FMMT outperforms the baseline methods. Finally, we conducted detailed experimental verification and empirical analyses of the practicality of the designed method by verifying emotion uncertainty and analyzing emotional state transitions combined with personalized factors. Overall, our research makes a significant contribution to emotion analysis through a novel fuzzy Transformer model that enhances emotion perception and advances the methods for analyzing emotional expression, giving it an edge over prior studies.
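The abstract's core idea, using fuzzy set theory to represent uncertain emotional states and fusing three modality branches, can be illustrated with a minimal sketch. This is not the authors' FMMT implementation; the triangular membership functions, the three fuzzy sets, and the confidence-weighted late fusion are all illustrative assumptions chosen for brevity.

```python
import numpy as np

def triangular(x, a, b, c):
    # Triangular fuzzy membership: rises on [a, b], falls on [b, c], 0 elsewhere.
    return float(np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0))

def fuzzify(valence):
    # Map a crisp valence score in [0, 1] to membership degrees over three
    # hypothetical fuzzy emotion sets: negative, neutral, positive.
    return np.array([
        triangular(valence, -0.5, 0.0, 0.5),  # negative
        triangular(valence,  0.0, 0.5, 1.0),  # neutral
        triangular(valence,  0.5, 1.0, 1.5),  # positive
    ])

def fuse(branch_probs, branch_conf):
    # Confidence-weighted late fusion of per-modality (audio, visual, text)
    # emotion distributions: weight each branch, then renormalize.
    w = np.asarray(branch_conf, dtype=float)
    w = w / w.sum()
    fused = np.einsum("m,mc->c", w, np.asarray(branch_probs, dtype=float))
    return fused / fused.sum()

# A valence of 0.5 is fully "neutral" under these membership functions.
print(fuzzify(0.5))
# Fuse three hypothetical branch outputs over three emotion classes.
print(fuse([[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.6, 0.3, 0.1]],
           [0.9, 0.5, 0.8]))
```

A score near a set boundary (e.g. 0.25) yields partial membership in two sets at once, which is how a fuzzy formulation expresses uncertainty that a hard classifier would discard.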
Data availability
The participants of this study did not give written consent for their data to be shared publicly; therefore, the data are not available.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare in relation to the content of this article.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, J., Ang, M.C., Chaw, J.K. et al. Personalized emotion analysis based on fuzzy multi-modal transformer model. Appl Intell 55, 227 (2025). https://doi.org/10.1007/s10489-024-05954-5