DOI: 10.1145/3571662.3571664
research-article

Multi-modal Variational Auto-Encoder Model for Micro-video Popularity Prediction

Published: 03 January 2023

ABSTRACT

Predicting the popularity of micro-videos on multimedia platforms is a widely studied problem, driven by the widespread use of video upload and sharing services. It is also a challenging task, because popularity patterns are affected by multiple factors and are hard to model. The goal of this paper is to use feature extraction techniques and a variational auto-encoder (VAE) framework to predict the popularity of online micro-videos. First, we identify four declarable modalities that are important for adaptability and expansibility. Then, we design a multi-modal VAE-based regression model (MASSL) to exploit the domestic and foreign information extracted from heterogeneous features. The model can be applied to large-scale multimedia platforms, even in scenarios where a modality is absent. Extensive experiments on a dataset originally collected from the most popular video-sharing website in China demonstrate the effectiveness of the proposed model in comparison with baseline approaches.
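
The abstract does not spell out how MASSL fuses modalities or what loss it optimizes, so the following is only a minimal sketch of the general idea of a multi-modal VAE regressor that tolerates missing modalities. It assumes a product-of-experts fusion of per-modality Gaussian posteriors and a squared-error regression term plus a KL penalty; these choices, and every name in the code (`fuse_gaussians`, `kl_to_standard_normal`, `sample_latent`, `predict_popularity`, `vae_regression_loss`, `beta`), are illustrative assumptions and not taken from the paper.

```python
import math
import random


def fuse_gaussians(experts, dim):
    """Product-of-experts fusion of diagonal Gaussians (assumed fusion rule).

    `experts` holds one (mean, variance) pair per *available* modality.
    A standard-normal prior expert is always included, so an empty list
    (all modalities missing) still yields a valid distribution.
    """
    precision = [1.0] * dim        # prior N(0, 1) has precision 1
    weighted_mean = [0.0] * dim    # prior mean 0 contributes nothing
    for mean, var in experts:
        for i in range(dim):
            precision[i] += 1.0 / var[i]
            weighted_mean[i] += mean[i] / var[i]
    fused_var = [1.0 / p for p in precision]
    fused_mean = [m * v for m, v in zip(weighted_mean, fused_var)]
    return fused_mean, fused_var


def kl_to_standard_normal(mean, var):
    """KL( N(mean, var) || N(0, 1) ) for a diagonal Gaussian."""
    return 0.5 * sum(v + m * m - 1.0 - math.log(v) for m, v in zip(mean, var))


def sample_latent(mean, var, rng):
    """Reparameterization trick: z = mean + sqrt(var) * eps, eps ~ N(0, 1)."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, var)]


def predict_popularity(z, weights, bias):
    """Hypothetical linear regression head on the latent code."""
    return sum(w * zi for w, zi in zip(weights, z)) + bias


def vae_regression_loss(y_true, y_pred, mean, var, beta=1.0):
    """Squared error plus a beta-weighted KL term (assumed objective)."""
    return (y_true - y_pred) ** 2 + beta * kl_to_standard_normal(mean, var)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two of the four modalities are present; the other two are simply omitted.
    available = [([2.0, 0.0], [1.0, 1.0]), ([0.0, -1.0], [0.5, 2.0])]
    mean, var = fuse_gaussians(available, dim=2)
    z = sample_latent(mean, var, rng)
    y_hat = predict_popularity(z, weights=[0.3, -0.2], bias=1.0)
    print(vae_regression_loss(1.5, y_hat, mean, var))
```

Under this fusion rule, dropping a modality only removes one term from the precision sum, which is one way a model could remain usable when a modality is absent.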

      • Published in

        ICCIP '22: Proceedings of the 8th International Conference on Communication and Information Processing
        November 2022
        219 pages
        ISBN: 9781450397100
        DOI: 10.1145/3571662

        Copyright © 2022 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        ICCIP '22 Paper Acceptance Rate: 61 of 301 submissions, 20%
        Overall Acceptance Rate: 61 of 301 submissions, 20%