Affective Video Content Analysis With Adaptive Fusion Recurrent Network


Abstract:

Affective video content analysis is an important research topic in video content analysis and has extensive applications. Intuitively, multimodal features can depict elicited emotions, and the accumulation of temporal inputs influences the viewer's emotion. Although a number of methods have been proposed for this task, the adaptive weighting of modalities and the correlation of temporal inputs are still not well studied. To address these issues, a novel framework is designed to learn the weights of modalities and temporal inputs from video data. Specifically, three network layers are designed: a statistical-data layer to improve the robustness of the data, a temporal-adaptive-fusion layer to fuse temporal inputs, and a multimodal-adaptive-fusion layer to combine multiple modalities. In particular, the feature vectors of the three input modalities are extracted from three pre-trained convolutional neural networks, respectively, and then fed to three statistical-data layers. The output vectors of these three statistical-data layers are separately connected to three recurrent layers, and the corresponding outputs are fed to a fully-connected layer that shares parameters across modalities and temporal inputs. Finally, the outputs of the fully-connected layer are fused by the temporal-adaptive-fusion layer and then combined by the multimodal-adaptive-fusion layer. To capture the correlations of both multiple modalities and temporal inputs, adaptive weights of modalities and temporal inputs are introduced into the loss functions for model training, and these weights are learned by an optimization algorithm. Extensive experiments are conducted on two challenging datasets and demonstrate that the proposed method achieves better performance than baseline and other state-of-the-art methods.
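
The pipeline described in the abstract can be approximated with a short PyTorch sketch. The code below is illustrative only and is not the authors' implementation: the class name AdaptiveFusionNet, the feature dimensions, the use of GRU recurrent layers, mean/std pooling as the statistical-data layer, and softmax-normalized learnable weights standing in for the adaptive temporal and modality weights are all assumptions made for the sketch.

# Illustrative sketch of the fusion pipeline described in the abstract.
# Assumptions (not from the paper): GRU recurrent layers, mean/std pooling as the
# "statistical-data" layer, and softmax-normalized learnable parameters standing in
# for the adaptive temporal/modality weights learned by the paper's optimization algorithm.
import torch
import torch.nn as nn


class AdaptiveFusionNet(nn.Module):
    def __init__(self, feat_dims=(2048, 2048, 128), hidden_dim=256, num_steps=10, num_classes=2):
        super().__init__()
        # One statistical-data projection and one recurrent layer per modality.
        self.stat_layers = nn.ModuleList([nn.Linear(2 * d, hidden_dim) for d in feat_dims])
        self.rnns = nn.ModuleList([nn.GRU(hidden_dim, hidden_dim, batch_first=True) for _ in feat_dims])
        # Fully-connected layer shared across modalities and temporal inputs.
        self.shared_fc = nn.Linear(hidden_dim, num_classes)
        # Learnable fusion weights, normalized by softmax at forward time.
        self.temporal_w = nn.Parameter(torch.zeros(num_steps))
        self.modal_w = nn.Parameter(torch.zeros(len(feat_dims)))

    def forward(self, inputs):
        # inputs: list of per-modality CNN features, each of shape (batch, steps, frames, feat_dim)
        per_modality = []
        for x, stat, rnn in zip(inputs, self.stat_layers, self.rnns):
            # Statistical-data layer: summarize the frames within each temporal segment.
            stats = torch.cat([x.mean(dim=2), x.std(dim=2)], dim=-1)   # (batch, steps, 2*feat_dim)
            h, _ = rnn(stat(stats))                                    # (batch, steps, hidden_dim)
            logits = self.shared_fc(h)                                 # (batch, steps, num_classes)
            # Temporal-adaptive fusion over the segment (time-step) axis.
            t_w = torch.softmax(self.temporal_w, dim=0).view(1, -1, 1)
            per_modality.append((logits * t_w).sum(dim=1))             # (batch, num_classes)
        # Multimodal-adaptive fusion over the modality axis.
        m_w = torch.softmax(self.modal_w, dim=0)
        return sum(w * p for w, p in zip(m_w, per_modality))

In the paper, the adaptive weights are introduced into the loss functions and learned by a dedicated optimization algorithm; the softmax parameterization above is a simplification so that the sketch can be trained end-to-end with a standard optimizer.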
Published in: IEEE Transactions on Multimedia ( Volume: 22, Issue: 9, September 2020)
Page(s): 2454-2466
Date of Publication: 22 November 2019
