Abstract
With the development of social media networks, micro-videos, an emerging form of user-generated content (UGC), are attracting growing interest. Some of them spread widely, while others draw little attention. Popular micro-videos have significant commercial potential in areas such as online advertising and bandwidth allocation. In recent years, popularity prediction for long videos, web images and texts has gained abundant theoretical support and achieved great practical success. However, little research has been conducted on micro-videos. Three difficulties arise in dealing with the problem: (1) micro-videos are short in duration; (2) the quality of micro-videos is relatively poor; (3) micro-videos are described by multiple heterogeneous features involving social, visual, acoustic and textual modalities. To address these difficulties, we present a feature-discrimination transductive model (FDTM). The proposed method separates the multi-view features into two kinds: low-level features and attribute features. The attribute features are used to divide the micro-videos into different levels of popularity, and the low-level features are used to predict the popularity scores precisely. Moreover, during prediction, we seek a latent common feature subspace in which the micro-videos can be comprehensively represented. This latent subspace aggregates the information from the multiple low-level features to alleviate the problem of information insufficiency. Extensive experiments on a public dataset show that the proposed method achieves significant improvements over the best-known models.
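The idea of a latent common subspace can be illustrated with a minimal sketch. It is not the FDTM formulation, which the abstract does not specify. It assumes, purely for illustration, that the shared subspace is approximated by a truncated SVD of the standardized, concatenated views and that scores are predicted with ridge regression. The function names, the synthetic data and all parameter values are hypothetical.

```python
import numpy as np


def latent_subspace(views, k):
    """Map several feature views to one shared k-dimensional representation.

    Assumption: a truncated SVD of the standardized, concatenated views
    stands in for the learned common subspace.
    """
    scaled = []
    for v in views:
        v = v - v.mean(axis=0)          # center each feature
        std = v.std(axis=0)
        std[std == 0] = 1.0             # guard against constant features
        scaled.append(v / std)
    x = np.hstack(scaled)               # samples x (sum of view dimensions)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :k] * s[:k]             # coordinates in the top-k subspace


def ridge_fit(z, y, lam=1.0):
    """Closed-form ridge regression weights (no intercept)."""
    d = z.shape[1]
    return np.linalg.solve(z.T @ z + lam * np.eye(d), z.T @ y)


# Synthetic stand-in for four modalities (e.g. social, visual, acoustic, textual).
rng = np.random.default_rng(0)
n = 200
shared = rng.normal(size=(n, 3))        # hidden factors common to all views
views = [
    shared @ rng.normal(size=(3, d)) + 0.1 * rng.normal(size=(n, d))
    for d in (8, 5, 6, 4)
]
y = shared @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=n)

# Transductive flavour: the subspace uses the features of all samples,
# while only the first 150 popularity scores are used for fitting.
z = latent_subspace(views, k=3)
w = ridge_fit(z[:150], y[:150])
pred = z[150:] @ w
corr = np.corrcoef(pred, y[150:])[0, 1]
print(f"correlation on held-out samples: {corr:.3f}")
```

In this sketch the unlabeled samples contribute their features, but not their scores, to the subspace, which is the sense in which the setup is transductive.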
Additional information
Communicated by F. Wu.
About this article
Cite this article
Su, Y., Li, Y., Bai, X. et al. Predicting the popularity of micro-videos via a feature-discrimination transductive model. Multimedia Systems 26, 519–534 (2020). https://doi.org/10.1007/s00530-020-00660-x
DOI: https://doi.org/10.1007/s00530-020-00660-x