Affective Video Content Analysis Based on Two Compact Audio-Visual Features

  • Conference paper
  • In: Digital TV and Wireless Multimedia Communication (IFTC 2019)
  • Part of the book series: Communications in Computer and Information Science (CCIS, volume 1181)

Abstract

In this paper, we propose a new framework for affective video content analysis based on two compact audio-visual features. In the proposed framework, the eGeMAPS feature set is first computed as a global audio feature, and the key frames of the optical-flow images are then fed to a VGG19 network for transfer learning and visual feature extraction. Finally, logistic regression is employed for affective video content classification. In the experiments, we evaluate the audio and visual features on the dataset of the Affective Impact of Movies Task 2015 (AIMT15) and compare our results with those of the teams that participated in AIMT15. The comparison shows that the proposed framework achieves classification results comparable to the first-place entry of AIMT15 with a total feature dimension of 344, only about one thousandth of the feature dimension used by that entry.

This work is supported by the National Natural Science Foundation of China under Grant Nos. 61801440 and 61631016, and the Fundamental Research Funds for the Central Universities under Grant Nos. 2018XNG1824 and YLSZ180226.
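To make the pipeline in the abstract concrete, below is a minimal Python sketch of its three stages, assuming the openSMILE Python bindings for eGeMAPS, torchvision's ImageNet-pretrained VGG19, and scikit-learn's logistic regression. The VGG19 layer choice (the penultimate fully connected layer), the averaging over key frames, and the helper names are illustrative assumptions, not the authors' exact configuration; in particular, the reported total of 344 dimensions, given eGeMAPS's 88 functionals, suggests the visual feature was reduced to roughly 256 dimensions, a step not reproduced here.

```python
# Minimal sketch of the described pipeline (illustrative assumptions:
# openSMILE Python bindings, torchvision VGG19, scikit-learn).
import numpy as np
import opensmile
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.linear_model import LogisticRegression

# 1) Global audio feature: eGeMAPS functionals computed over the whole clip.
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)

def audio_feature(wav_path):
    # One 88-dimensional functionals vector per audio file.
    return smile.process_file(wav_path).to_numpy().ravel()

# 2) Visual feature: ImageNet-pretrained VGG19 applied to key frames of the
#    optical-flow images (transfer learning; no fine-tuning in this sketch).
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
# Reuse everything up to the penultimate fully connected layer (4096-d output).
extractor = torch.nn.Sequential(
    vgg.features, vgg.avgpool, torch.nn.Flatten(),
    *list(vgg.classifier.children())[:-1],
).eval()
preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def visual_feature(flow_key_frames):
    # flow_key_frames: list of PIL images of the optical-flow key frames.
    with torch.no_grad():
        batch = torch.stack([preprocess(img) for img in flow_key_frames])
        return extractor(batch).mean(dim=0).numpy()  # average over key frames

# 3) Fuse the two compact features and classify with logistic regression.
def clip_feature(wav_path, flow_key_frames):
    return np.concatenate([audio_feature(wav_path),
                           visual_feature(flow_key_frames)])

clf = LogisticRegression(max_iter=1000)
# X = np.stack([clip_feature(w, f) for w, f in clips]); clf.fit(X, labels)
```

In this sketch, the audio branch yields an 88-dimensional eGeMAPS vector per clip, the visual branch averages VGG19 activations over the optical-flow key frames, and concatenating the two before fitting a logistic regression classifier mirrors the fusion-by-concatenation design implied by the single 344-dimensional feature.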



Author information

Correspondence to Wei Zhong.


Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Guo, X., Zhong, W., Ye, L., Fang, L., Zhang, Q. (2020). Affective Video Content Analysis Based on Two Compact Audio-Visual Features. In: Zhai, G., Zhou, J., Yang, H., An, P., Yang, X. (eds) Digital TV and Wireless Multimedia Communication. IFTC 2019. Communications in Computer and Information Science, vol 1181. Springer, Singapore. https://doi.org/10.1007/978-981-15-3341-9_29


  • DOI: https://doi.org/10.1007/978-981-15-3341-9_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-3340-2

  • Online ISBN: 978-981-15-3341-9

  • eBook Packages: Computer Science, Computer Science (R0)
