DOI: 10.1145/3581783.3612839

Enhanced CatBoost with Stacking Features for Social Media Prediction

Published: 27 October 2023


The Social Media Prediction (SMP) challenge aims to predict the future popularity of online posts by leveraging social media data. Social media data contains multimodal information, such as text, images, and time series. Previous methods have proposed many feature extraction and feature construction techniques to represent this multimodal information and thereby predict the popularity of posts. Despite their success in extracting features from social media data, these features tend to be predominantly lower-order, which makes it difficult to accurately capture the rich information contained in text and images. In this paper, we propose a more diverse feature mining method and introduce a stacking block module to capture higher-order feature information contained in text and images. Here, "lower-order" refers to the original high-dimensional embedding representation, while "higher-order" refers to the impact on a post's social popularity that tree models capture from text or images. We conducted extensive experiments to evaluate the effectiveness of our proposed method and found that the stacking block module significantly improved performance.
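The stacking idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a depth-1 regression tree (`fit_stump`) stands in for the tree model, the data is synthetic, and the names `fit_stump`, `out_of_fold_feature`, `text_emb`, `profile`, and `stacked` are hypothetical. The sketch fits a tree model on a high-dimensional embedding, takes its out-of-fold predictions as a single higher-order feature, and appends that feature to the post profile features that a final regressor (CatBoost in the paper's title) would consume.

```python
import random


def fit_stump(X, y):
    """Fit a depth-1 regression tree; a toy stand-in for a boosted tree model."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no valid split: predict the global mean
        m = sum(y) / len(y)
        return lambda row: m
    _, j, t, lm, rm = best
    return lambda row: lm if row[j] <= t else rm


def out_of_fold_feature(X, y, n_folds=5, seed=0):
    """Return one prediction per row, each made by a model that never saw that row."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    oof = [0.0] * len(X)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_stump([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            oof[i] = model(X[i])
    return oof


# Synthetic data (assumption): an 8-dimensional "text embedding" per post,
# with popularity driven by the first embedding dimension.
rng = random.Random(42)
text_emb = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(100)]
popularity = [1.0 if e[0] > 0 else 0.0 for e in text_emb]
profile = [[rng.random(), rng.random()] for _ in range(100)]

# The stacked feature compresses the embedding into one popularity-aware score.
text_score = out_of_fold_feature(text_emb, popularity)
stacked = [p + [s] for p, s in zip(profile, text_score)]
print(len(stacked), len(stacked[0]))
```

Out-of-fold prediction is used here so that the stacked feature for a post never comes from a model trained on that post's own label, which would otherwise leak the target into the final regressor. Whether the paper uses exactly this scheme is not stated in the abstract.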

Supplemental Material

MP4 File
In this video, we introduce our work in the 6th SMP challenge, including the challenge introduction, previous methods, our framework, and ablation experiments. In our framework, we propose a stacking block to handle text and image features, which is one of the key improvements of our model. The video also gives a detailed description of the overall framework, including input data, post profile feature extraction, text feature extraction, image feature extraction, and the regression model.









    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.



    Association for Computing Machinery

    New York, NY, United States




    Author Tags

    1. popularity prediction
    2. social media
    3. stacking features


    • Research-article


    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa ON, Canada



    Bibliometrics & Citations


    Article Metrics

    • Downloads (Last 12 months): 89
    • Downloads (Last 6 weeks): 4
    Reflects downloads up to 05 Mar 2025



    Cited By

    • (2024) Higher-Order Vision-Language Alignment for Social Media Prediction. Proceedings of the 32nd ACM International Conference on Multimedia, 11457-11463. DOI: 10.1145/3664647.3688999. Online publication date: 28-Oct-2024
    • (2024) Dual-Stream Pre-Training Transformer to Enhance Multimodal Learning for Social Media Prediction. Proceedings of the 32nd ACM International Conference on Multimedia, 11450-11456. DOI: 10.1145/3664647.3688998. Online publication date: 28-Oct-2024
    • (2024) MMF: Winning Solution to Social Media Popularity Prediction Challenge 2024. Proceedings of the 32nd ACM International Conference on Multimedia, 11445-11449. DOI: 10.1145/3664647.3688997. Online publication date: 28-Oct-2024
    • (2024) SMP Challenge Summary: Social Media Prediction Challenge. Proceedings of the 32nd ACM International Conference on Multimedia, 11442-11444. DOI: 10.1145/3664647.3688996. Online publication date: 28-Oct-2024
    • (2024) Evidential Reasoning Approach for Predicting Popularity of Instagram Posts. IEEE Access 12, 182603-182617. DOI: 10.1109/ACCESS.2024.3510637. Online publication date: 2024
    • (2024) Sentiment and hashtag-aware attentive deep neural network for multimodal post popularity prediction. Neural Computing and Applications 37:4, 2799-2824. DOI: 10.1007/s00521-024-10755-5. Online publication date: 9-Dec-2024
