Abstract
The rapid evolution of the digital era has greatly transformed social media, resulting in more diverse emotional expressions and increasingly complex public discourse. Consequently, identifying relationships within multimodal data has become more challenging. Most current multimodal sentiment analysis (MSA) methods concentrate on merging data from diverse modalities into an integrated feature representation to enhance recognition performance by leveraging the complementary nature of multimodal data. However, these approaches often overlook prediction reliability. To address this, we propose the uncertainty estimation fusion network (UEFN), a reliable MSA method based on uncertainty estimation. UEFN combines the Dirichlet distribution and Dempster-Shafer evidence theory (DSET) to predict the probability distribution and uncertainty of text, speech, and image modalities, fusing the predictions at the decision level. Specifically, the method first represents the contextual features of the text, speech, and image modalities separately. It then employs a fully connected neural network to transform the features of each modality into evidence. Subsequently, it parameterizes the evidence of each modality via the Dirichlet distribution and estimates the per-modality probability distribution and uncertainty. Finally, we use DSET to fuse the predictions, obtaining the sentiment analysis results together with an uncertainty estimate; this constitutes the multimodal decision fusion layer (MDFL). Additionally, on the basis of the modality uncertainty produced by subjective logic theory, we compute feature weights, apply them to the corresponding features, concatenate the weighted features, and feed them into a feedforward neural network for sentiment classification, forming the adaptive weight fusion layer (AWFL). MDFL and AWFL are then trained jointly in a multitask setting.
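The evidence-to-uncertainty pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation; it only shows the standard subjective-logic mapping (evidence to Dirichlet parameters to belief and uncertainty masses) and the reduced two-opinion Dempster-Shafer combination rule commonly used in evidential fusion, with hypothetical evidence vectors standing in for the per-modality network outputs.

```python
import numpy as np

def evidence_to_opinion(evidence):
    """Map non-negative evidence to a subjective-logic opinion.

    Dirichlet parameters: alpha_k = e_k + 1, Dirichlet strength S = sum(alpha).
    Belief mass b_k = e_k / S, uncertainty mass u = K / S, so sum(b) + u = 1.
    """
    alpha = evidence + 1.0
    S = alpha.sum()
    belief = evidence / S
    u = len(evidence) / S
    return belief, u

def ds_combine(b1, u1, b2, u2):
    """Reduced Dempster-Shafer rule for combining two opinions.

    C is the conflict mass (belief assigned to different classes by the
    two opinions); the result is renormalized by 1 - C.
    """
    C = b1.sum() * b2.sum() - (b1 * b2).sum()
    scale = 1.0 - C
    b = (b1 * b2 + b1 * u2 + b2 * u1) / scale
    u = u1 * u2 / scale
    return b, u

# Hypothetical binary-sentiment evidence from two modalities.
b_text, u_text = evidence_to_opinion(np.array([4.0, 1.0]))
b_audio, u_audio = evidence_to_opinion(np.array([3.0, 1.0]))
b_fused, u_fused = ds_combine(b_text, u_text, b_audio, u_audio)
```

When the two modalities agree, the fused uncertainty mass is lower than either input's, which is what makes the fused prediction more trustworthy than any single modality's.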
Experimental comparisons demonstrate that the UEFN not only achieves excellent performance but also provides uncertainty estimation along with the predictions, enhancing the reliability and interpretability of the results.
Data Availability
The datasets supporting our research are publicly accessible at the following URLs: (https://paperswithcode.com/dataset/cmu-mosi) and (https://paperswithcode.com/dataset/cmu-mosei).
Acknowledgements
The authors gratefully acknowledge financial support from the Applied Research Project of Yuncheng University (Grant No. YY-202312, 2023). We would also like to express our sincere gratitude to Dr. Miao Xia Chen for her valuable contributions to this work.
Author information
Contributions
Shuai Wang contributed to conceptualization, methodology, data collection and analysis, writing the original draft, review, and editing, as well as funding acquisition. K. Ratnavelu and Abdul Samad Bin Shibghatullah provided project supervision, administration, and essential resources. All authors reviewed and approved the final manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, S., Ratnavelu, K. & Bin Shibghatullah, A.S. UEFN: Efficient uncertainty estimation fusion network for reliable multimodal sentiment analysis. Appl Intell 55, 171 (2025). https://doi.org/10.1007/s10489-024-06113-6