ABSTRACT
Personalization is essential for enhancing the customer experience in retrieval tasks. In this paper, we develop a novel method, CA-GCN, for personalized image retrieval in the Adobe Stock image system. CA-GCN leverages user behavior data in a Graph Convolutional Neural Network (GCN) model to learn user and image embeddings simultaneously. A standard GCN performs poorly on sparse user-image interaction graphs because of the limited knowledge gained from less representative neighbors. To address this challenge, we propose to augment the sparse user-image interaction data by considering the similarities among images. Specifically, we detect clusters of similar images and introduce a set of hidden super-nodes in the graph to represent these clusters. We show that such an augmented graph structure can significantly improve retrieval performance on real-world data collected from the Adobe Stock service. In particular, when testing the proposed method on real users' stock image retrieval sessions, the average click position improves from 70 to 51.
Index Terms
- Personalized Image Retrieval with Sparse Graph Representation Learning