Learning from visual content and style: an image-enhanced recommendation model

  • Regular Paper
  • CCF Transactions on Pervasive Computing and Interaction

Abstract

Image-based platforms have become popular in recent years. With the large number of images on these platforms, recommending images that suit each user’s interests is a key problem for recommender systems. A simple approach is to apply collaborative filtering to image recommendation, but it does not fully utilize visual information and suffers from data sparsity. Recently, following the success of Convolutional Neural Networks (CNNs) in image analysis, some researchers have proposed leveraging image content information for recommendation. Specifically, Visual Bayesian Personalized Ranking (VBPR) (He and McAuley, in: The Association for the Advancement of Artificial Intelligence, 2016) is a state-of-the-art visual-based recommendation model that learns users’ preferences for items from two spaces: a visual content space learned from CNNs, and a latent space learned from classical collaborative filtering models. VBPR and its variants achieve better recommendation performance by modeling image content. In the real world, when browsing images, users care not only about image content but also about how well the image style matches their preferences. Compared with image content, the role of visual style has been largely ignored in the image recommendation community. Therefore, in this paper, we study the problem of learning both visual content and style for image recommendation. We leverage advances in computer vision to learn visual content and style representations, and propose how to combine these visual signals with users’ collaborative data. Finally, experimental results on a real-world dataset show the effectiveness of our proposed model.
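The idea of scoring an image from several preference spaces can be illustrated with a minimal sketch. This is not the model proposed in the paper, whose formulation is not given in the abstract. It assumes a VBPR-like additive score (latent term plus projected content term) extended with a hypothetical style term, a Gram-matrix style descriptor in the spirit of Gatys et al., and the pairwise BPR loss of Rendle et al. All names (`gram_style_feature`, `preference_score`, `bpr_loss`, the projection matrices `E_c` and `E_s`) and all dimensions are assumptions made for illustration.

```python
import numpy as np


def gram_style_feature(feature_map):
    """Flattened Gram matrix of a CNN feature map shaped (channels, height, width).

    Assumption: style is summarized by channel correlations, as in Gatys et al.
    """
    channels, height, width = feature_map.shape
    flat = feature_map.reshape(channels, height * width)
    gram = flat @ flat.T / (height * width)
    return gram.reshape(-1)


def preference_score(gamma_u, gamma_i, theta_u, content_emb, phi_u, style_emb):
    """Additive score: latent term + visual content term + (hypothetical) style term."""
    return gamma_u @ gamma_i + theta_u @ content_emb + phi_u @ style_emb


def bpr_loss(score_pos, score_neg):
    """Pairwise BPR loss, -log(sigmoid(score_pos - score_neg)), computed stably."""
    return np.logaddexp(0.0, -(score_pos - score_neg))


# Toy usage with random stand-ins for CNN outputs (no real features are used here).
rng = np.random.default_rng(0)
k = 4                                      # embedding size (assumed)
E_c = rng.normal(size=(k, 16))             # projects content features into k dims
E_s = rng.normal(size=(k, 9))              # projects 3x3 Gram features into k dims
gamma_u, theta_u, phi_u = rng.normal(size=(3, k))


def score_item(gamma_i, content_feat, conv_map):
    # Project raw content and style features, then combine the three terms.
    style_feat = gram_style_feature(conv_map)
    return preference_score(gamma_u, gamma_i, theta_u,
                            E_c @ content_feat, phi_u, E_s @ style_feat)


s_pos = score_item(rng.normal(size=k), rng.normal(size=16), rng.normal(size=(3, 5, 5)))
s_neg = score_item(rng.normal(size=k), rng.normal(size=16), rng.normal(size=(3, 5, 5)))
print(float(bpr_loss(s_pos, s_neg)))
```

In a trained system the user vectors and projection matrices would be learned by minimizing the pairwise loss over observed and unobserved user-image pairs; here they are random, so the printed loss carries no meaning beyond showing that the pieces fit together.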

References

  • Bartolini, I., Moscato, V., Pensa, R.G., Penta, A., Picariello, A., Sansone, C., Sapino, M.L.: Recommending multimedia visiting paths in cultural heritage applications. Multimedia Tools Appl. 75(7), 3813–3842 (2016)

  • Bell, R., Koren, Y., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 8, 30–37 (2009)

  • Canini, L., Benini, S., Leonardi, R.: Affective recommendation of movies based on selected connotative features. IEEE Trans. Circuits Syst. Video Technol. 23(4), 636–647 (2012)

  • Chen, J., Zhang, H., He, X., Nie, L., Liu, W., Chua, T.S.: Attentive collaborative filtering: multimedia recommendation with item- and component-level attention. In: The ACM Conference on Research and Development in Information Retrieval, ACM, pp. 335–344 (2017)

  • Chen, T., He, X., Kan, M.Y.: Context-aware image tweet modelling and recommendation. In: ACM Multimedia, ACM, pp. 1018–1027 (2016)

  • Chu, W.T., Wu, Y.L.: Image style classification based on learnt deep correlation features. IEEE Trans. Multimedia 20(9), 2491–2502 (2018)

  • Chua, T.S., Tang, J., Hong, R., Li, H., Luo, Z., Zheng, Y.: NUS-WIDE: a real-world web image database from National University of Singapore. In: The ACM International Conference on Image and Video Retrieval, ACM, p. 48 (2009)

  • Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., Darrell, T.: DeCAF: a deep convolutional activation feature for generic visual recognition. In: International Conference on Machine Learning, pp. 647–655 (2014)

  • Fan, J., Keim, D.A., Gao, Y., Luo, H., Li, Z.: JustClick: personalized image recommendation via exploratory search from large-scale Flickr images. IEEE Trans. Circuits Syst. Video Technol. 19(2), 273–288 (2008)

  • Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015)

  • Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: The IEEE Conference on Computer Vision and Pattern Recognition (2016)

  • Gatys, L.A., Ecker, A.S., Bethge, M., Hertzmann, A., Shechtman, E.: Controlling perceptual factors in neural style transfer. In: The IEEE Conference on Computer Vision and Pattern Recognition, pp. 3985–3993 (2017)

  • Gelli, F., Uricchio, T., He, X., Del Bimbo, A., Chua, T.S.: Beyond the product: discovering image posts for brands in social media. In: The ACM Multimedia Conference (2018)

  • Guo, G., Meng, Y., Zhang, Y., Han, C., Li, Y.: Visual semantic image recommendation. IEEE Access 7, 33424–33433 (2019)

  • He, R., McAuley, J.: VBPR: visual Bayesian personalized ranking from implicit feedback. In: The Association for the Advancement of Artificial Intelligence, pp. 144–150 (2016)

  • He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T.S.: Neural collaborative filtering. In: The International Conference on World Wide Web, pp. 173–182 (2017)

  • Hsiao, W.L., Grauman, K.: Learning the latent “look”: Unsupervised discovery of a style-coherent embedding from fashion images. In: IEEE International Conference on Computer Vision, IEEE, pp. 4213–4222 (2017)

  • Jiang, M., Cui, P., Wang, F., Zhu, W., Yang, S.: Scalable recommendation with social contextual information. IEEE Trans. Knowl. Data Eng. 26(11), 2789–2802 (2014)

  • Koren, Y.: Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: The International Conference on Knowledge Discovery and Data Mining, ACM, pp. 426–434 (2008)

  • Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)

  • Liu, Q., Wu, S., Wang, L.: Deepstyle: learning user preferences for visual recommendation. In: The ACM Conference on Research and Development in Information Retrieval, ACM, pp. 841–844 (2017)

  • Luo, H., Zhang, X., Chen, B., Guo, G.: Multi-view visual Bayesian personalized ranking from implicit feedback. In: The Conference on User Modeling, Adaptation and Personalization, ACM, pp. 361–362

  • Niu, W., Caverlee, J., Lu, H.: Neural personalized ranking for image recommendation. In: ACM International Conference on Web Search and Data Mining, ACM, pp. 423–431 (2018)

  • Radenović, F., Tolias, G., Chum, O.: Fine-tuning CNN image retrieval with no human annotation. IEEE Trans. Pattern Anal. Mach. Intell. 41(7), 1655–1668 (2018)

  • Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)

  • Rendle, S., Freudenthaler, C., Gantner, Z., Schmidt-Thieme, L.: BPR: Bayesian personalized ranking from implicit feedback. In: The Conference on Uncertainty in Artificial Intelligence, pp. 452–461 (2009)

  • Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J., et al.: Item-based collaborative filtering recommendation algorithms. Int. World Wide Web Conf. 1, 285–295 (2001)

  • Shankar, D., Narumanchi, S., Ananya, H., Kompalli, P., Chaudhury, K.: Deep learning based large scale visual recommendation and search for e-commerce. arXiv preprint arXiv:1703.02344 (2017)

  • Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (2015)

  • Tzelepi, M., Tefas, A.: Deep convolutional learning for content based image retrieval. Neurocomputing 275, 2467–2478 (2018)

  • Wang, S., Wang, Y., Tang, J., Shu, K., Ranganath, S., Liu, H.: What your images reveal: exploiting visual contents for point-of-interest recommendation. In: The International Conference on World Wide Web, pp. 391–400 (2017)

  • Wu, L., Chen, L., Hong, R., Fu, Y., Xie, X., Wang, M.: A hierarchical attention model for social contextual image recommendation. IEEE Trans. Knowl. Data Eng. (2019)

Funding

Funding was provided by NSAF Joint Fund (Grant No. 61602147).

Author information

Corresponding author

Correspondence to Le Wu.

Cite this article

Luo, S., Chen, L. & Wu, L. Learning from visual content and style: an image-enhanced recommendation model. CCF Trans. Pervasive Comp. Interact. 1, 275–284 (2019). https://doi.org/10.1007/s42486-019-00017-y

  • DOI: https://doi.org/10.1007/s42486-019-00017-y