Abstract
We present a Layer-wise Deep Stacking (LDS) model for predicting the popularity of social image posts such as those on Flickr. LDS stacks multiple regression models across multiple layers so that the models complement and reinforce one another. To avoid overfitting, a dropout module randomly selects which data are fed into the regression models in each layer. A detector determines the depth of LDS automatically by monitoring the performance of the features produced by each layer. Extensive experiments on a public dataset of 432K Flickr image posts demonstrate the effectiveness and significance of the LDS model and its components. LDS achieves a Spearman’s Rho of 83.50%, an MAE of 1.038, and an MSE of 2.011, outperforming state-of-the-art approaches for social image popularity prediction.
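The abstract names three ingredients: regression models stacked layer by layer, dropout on the data fed to each model, and a detector that chooses the depth. The following is a minimal sketch of how such a scheme could be wired together, not the authors' implementation. It rests on several assumptions that are not stated in the abstract: the base learner is a plain linear regressor fitted by gradient descent, dropout is applied to feature columns, each layer appends its predictions to the feature set seen by the next layer, and the detector stops as soon as the validation MSE stops improving. The names `fit_linear`, `fit_lds`, and `predict_lds` and the synthetic data are hypothetical.

```python
import random

def fit_linear(X, y, cols, epochs=150, lr=0.05):
    """Linear regressor on the active columns, fitted by gradient descent (assumed base learner)."""
    w, b, n = [0.0] * len(cols), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(cols), 0.0
        for row, target in zip(X, y):
            err = b + sum(wj * row[c] for wj, c in zip(w, cols)) - target
            gb += err
            for j, c in enumerate(cols):
                gw[j] += err * row[c]
        b -= lr * gb / n
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
    return lambda row: b + sum(wj * row[c] for wj, c in zip(w, cols))

def mse(preds, y):
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

def fit_lds(X_tr, y_tr, X_va, y_va, n_models=3, keep=0.7, max_layers=4, seed=0):
    rng = random.Random(seed)
    layers, history, best = [], [], float("inf")
    for _ in range(max_layers):
        n_cols = len(X_tr[0])
        models = []
        for _ in range(n_models):
            # Dropout (assumed to act on feature columns): each model sees a random subset.
            cols = [c for c in range(n_cols) if rng.random() < keep] or [rng.randrange(n_cols)]
            models.append(fit_linear(X_tr, y_tr, cols))
        # Stacking: the layer's predictions become extra features for the next layer.
        new_tr = [row + [m(row) for m in models] for row in X_tr]
        new_va = [row + [m(row) for m in models] for row in X_va]
        # Layer output: mean of the layer's model predictions.
        score = mse([sum(r[n_cols:]) / n_models for r in new_va], y_va)
        # Depth detector (assumed rule): stop once validation MSE no longer improves.
        if score >= best:
            break
        best = score
        layers.append(models)
        history.append(score)
        X_tr, X_va = new_tr, new_va
    return layers, history

def predict_lds(layers, row):
    row = list(row)
    for models in layers:
        preds = [m(row) for m in models]
        row = row + preds
    return sum(preds) / len(preds)

def make_data(n, rng):
    # Synthetic stand-in for post features and popularity scores.
    X = [[rng.random() for _ in range(3)] for _ in range(n)]
    y = [r[0] + 0.5 * r[1] * r[2] + rng.gauss(0, 0.05) for r in X]
    return X, y

rng = random.Random(1)
X_tr, y_tr = make_data(120, rng)
X_va, y_va = make_data(60, rng)
layers, history = fit_lds(X_tr, y_tr, X_va, y_va)
print("depth chosen:", len(layers))
print("validation MSE per layer:", [round(h, 4) for h in history])
```

For brevity, the sketch feeds in-sample training predictions to the next layer; a more careful stacking setup would typically use out-of-fold predictions to limit leakage between layers.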
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61703109, No. 91748107), and the Guangdong Innovative Research Team Program (No. 2014ZT05G157).
Cite this article
Lin, Z., Huang, F., Li, Y. et al. A layer-wise deep stacking model for social image popularity prediction. World Wide Web 22, 1639–1655 (2019). https://doi.org/10.1007/s11280-018-0590-1
DOI: https://doi.org/10.1007/s11280-018-0590-1