ABSTRACT
Predicting the popularity of micro-videos on multimedia platforms is a widely studied topic, driven by the widespread use of video upload and sharing services. It is also a challenging task, because popularity patterns are affected by multiple factors and are hard to model. The goal of this paper is to use feature extraction techniques and a variational auto-encoder (VAE) framework to predict the popularity of online micro-videos. First, we identify four declarable modalities that are important for adaptability and expansibility. Then, we design a multi-modal VAE-based regression model (MASSL) to exploit the domestic and foreign information extracted from heterogeneous features. The model can be applied to large-scale multimedia platforms, even in scenarios where a modality is absent. Extensive experiments on a dataset collected from the most popular video-sharing website in China demonstrate the effectiveness of the proposed model in comparison with baseline approaches.
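The abstract does not say how MASSL combines modalities or copes with a missing one. As an illustration only, the sketch below shows one common way a multi-modal VAE regressor can tolerate absent modalities: each available modality contributes a diagonal Gaussian posterior, and the posteriors are fused by a product of experts together with a standard-normal prior. The fusion rule, the function names, the placeholder modality names (`modality_1` to `modality_4`), and the linear regression head are all assumptions made for this example, not details taken from the paper.

```python
import math
import random


def fuse_gaussians(experts, dim):
    """Product-of-experts fusion of diagonal Gaussians (assumed fusion rule).

    `experts` maps a modality name to a (mu, logvar) pair of lists of length
    `dim`, or to None when that modality is absent. A standard-normal prior
    expert is always included, so the result is defined even when every
    modality is missing.
    """
    precision = [1.0] * dim   # prior N(0, 1): precision 1, mean 0
    weighted_mu = [0.0] * dim
    for params in experts.values():
        if params is None:    # absent modality: contributes nothing
            continue
        mu, logvar = params
        for d in range(dim):
            p = math.exp(-logvar[d])      # precision = 1 / variance
            precision[d] += p
            weighted_mu[d] += mu[d] * p
    fused_mu = [weighted_mu[d] / precision[d] for d in range(dim)]
    fused_logvar = [-math.log(precision[d]) for d in range(dim)]
    return fused_mu, fused_logvar


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), the usual VAE regularizer."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))


def sample_latent(mu, logvar, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def predict_popularity(z, weights, bias):
    """Hypothetical linear regression head on the latent code."""
    return sum(w * x for w, x in zip(weights, z)) + bias


# Usage: two of four placeholder modalities are missing for this video.
# The (mu, logvar) values stand in for the outputs of per-modality encoders.
experts = {
    "modality_1": ([0.4, -0.2], [0.0, 0.0]),
    "modality_2": None,
    "modality_3": ([0.1, 0.3], [-1.0, -1.0]),
    "modality_4": None,
}
mu, logvar = fuse_gaussians(experts, dim=2)
z = sample_latent(mu, logvar, random.Random(0))
score = predict_popularity(z, weights=[0.5, 0.25], bias=1.0)
regularizer = kl_to_standard_normal(mu, logvar)
```

Because a missing modality simply drops out of the product, the same code path serves videos with any subset of modalities available, which is the property the abstract highlights. A trained model would replace the hand-written `(mu, logvar)` pairs with learned encoder outputs and would fit the regression head jointly with the KL term.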
Index Terms
- Multi-modal Variational Auto-Encoder Model for Micro-video Popularity Prediction