research-article

Multi-modal Variational Auto-Encoder Model for Micro-video Popularity Prediction

Authors:
Zhuoran Zhang

Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China; School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; National Engineering Research Center for Mobile Internet Security Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

ORCID: 0000-0002-3789-1797

Shibiao Xu

Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China; School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; National Engineering Research Center for Mobile Internet Security Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

ORCID: 0000-0003-4037-9900

Li Guo

Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China; School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; National Engineering Research Center for Mobile Internet Security Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

ORCID: 0000-0002-9723-3294

Wenke Lian

Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China; School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; National Engineering Research Center for Mobile Internet Security Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

ORCID: 0000-0002-3527-8120

ICCIP '22: Proceedings of the 8th International Conference on Communication and Information Processing, November 2022, Pages 9–16. https://doi.org/10.1145/3571662.3571664

Published: 03 January 2023

ABSTRACT

Popularity prediction for micro-videos on multimedia platforms is a widely studied topic, owing to the widespread use of video upload and sharing services. It is also a challenging task, because popularity patterns are affected by multiple factors and are hard to model. The goal of this paper is to use feature extraction techniques and a variational auto-encoder (VAE) framework to predict the popularity of online micro-videos. First, we identify four declarable modalities that are important for adaptability and expansibility. Then, we design a multi-modal VAE-based regression model (MASSL) to exploit the domestic and foreign information extracted from heterogeneous features. The model can be applied to large-scale multimedia platforms, even in scenarios where a modality is absent. Extensive experiments on a dataset originally derived from the most popular video-sharing website in China demonstrate the effectiveness of the proposed model in comparison with baseline approaches.
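To make the idea of a multi-modal VAE regressor more concrete, the following is a minimal NumPy sketch and not the MASSL model itself, whose architecture the abstract does not describe. It assumes that each modality encoder outputs a Gaussian mean and log-variance, that missing modalities are handled by averaging over the ones that are present, and that the loss is a squared error on the popularity score plus a weighted KL term. The function names, the masking scheme, the `beta` weight, and the linear regressor weights `w` are all hypothetical.

```python
import numpy as np


def fuse_modalities(mus, logvars, present):
    """Average per-modality Gaussian parameters over the modalities present.

    mus, logvars: arrays of shape (num_modalities, latent_dim).
    present: booleans of length num_modalities; False marks a missing modality.
    Averaging is an assumption made for this sketch, not the paper's method.
    """
    mask = np.asarray(present, dtype=float)[:, None]
    count = mask.sum()
    if count == 0:
        raise ValueError("at least one modality must be present")
    mu = (mus * mask).sum(axis=0) / count
    logvar = (logvars * mask).sum(axis=0) / count
    return mu, logvar


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps drawn from N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)


def vae_regression_loss(y_true, y_pred, mu, logvar, beta=1.0):
    """Squared error on the popularity score plus a beta-weighted KL term."""
    return (y_true - y_pred) ** 2 + beta * kl_to_standard_normal(mu, logvar)


# Usage with four modalities, the third one missing (random stand-in values).
rng = np.random.default_rng(0)
mus = rng.standard_normal((4, 8))
logvars = np.zeros((4, 8))
mu, logvar = fuse_modalities(mus, logvars, [True, True, False, True])
z = reparameterize(mu, logvar, rng)
w = rng.standard_normal(8)  # hypothetical linear regressor weights
loss = vae_regression_loss(1.0, float(z @ w), mu, logvar)
```

Masking before averaging means a missing modality contributes nothing to the fused latent distribution, which is one simple way to keep the regressor usable when a modality is absent. Other fusion rules, such as a product of experts, would fit the same interface.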

Index Terms

Multi-modal Variational Auto-Encoder Model for Micro-video Popularity Prediction
1. Computing methodologies
  1. Artificial intelligence
2. Information systems
  1. Information systems applications
    1. Multimedia information systems

Published in

ICCIP '22: Proceedings of the 8th International Conference on Communication and Information Processing
November 2022
219 pages
ISBN:9781450397100
DOI:10.1145/3571662

Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep learning
popularity prediction
social media
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
ICCIP '22 Paper Acceptance Rate: 61 of 301 submissions, 20%. Overall Acceptance Rate: 61 of 301 submissions, 20%.