ABSTRACT
In industrial recommendation systems, both data sizes and computational resources vary across scenarios. In scenarios with limited data, data sparsity can degrade model performance. Heterogeneous knowledge distillation-based transfer learning can be used to transfer knowledge from models trained in data-rich domains. In recommendation systems, however, the target domain possesses specific privileged features that contribute significantly to the model, and existing knowledge distillation methods do not take these features into consideration, leading to suboptimal transfer weights. To overcome this limitation, we propose a novel algorithm called Uncertainty-based Heterogeneous Privileged Knowledge Distillation (UHPKD). Our method quantifies the knowledge of both the source and target models, represented by their uncertainty, and derives transfer weights from the knowledge gain, i.e., the difference in knowledge between the source and target domains. Experiments conducted on both public and industrial datasets demonstrate the superiority of our UHPKD algorithm over other state-of-the-art methods.
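To make the high-level idea concrete, below is a minimal sketch of one plausible instantiation: it assumes Monte Carlo dropout as the proxy for model uncertainty (i.e., knowledge) and a simple ReLU-normalized difference of uncertainties as the knowledge gain that weights a soft-label distillation loss. All function names, shapes, and the weighting scheme are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def mc_dropout_uncertainty(model, x, n_samples=8):
    """Estimate per-sample predictive uncertainty with Monte Carlo dropout:
    keep dropout active and take the variance over repeated stochastic
    forward passes. A stand-in for the paper's knowledge quantification."""
    model.train()  # keep dropout layers stochastic at inference time
    with torch.no_grad():
        preds = torch.stack([torch.sigmoid(model(x)) for _ in range(n_samples)])
    return preds.var(dim=0)  # shape: (batch, 1) for a binary CTR model

def knowledge_gain_weights(src_unc, tgt_unc, eps=1e-8):
    """Per-sample transfer weights from the 'knowledge gain': transfer more
    where the source (teacher) is more certain than the target (student),
    and suppress transfer where it is not. Hypothetical weighting scheme."""
    gain = tgt_unc - src_unc  # positive => source knows more than target
    return torch.relu(gain) / (gain.abs().max() + eps)

def weighted_kd_loss(student_logits, teacher_logits, weights, T=2.0):
    """Soft-label distillation loss for a binary CTR task, modulated by
    the per-sample knowledge-gain weights."""
    s = torch.sigmoid(student_logits / T)
    t = torch.sigmoid(teacher_logits / T).detach()
    kd = F.binary_cross_entropy(s, t, reduction="none")
    return (weights * kd).mean() * (T * T)

# Hypothetical usage: teacher trained on the data-rich source domain,
# student trained on the sparse target domain with privileged features.
# src_unc = mc_dropout_uncertainty(teacher, x_shared)
# tgt_unc = mc_dropout_uncertainty(student, x_target)  # incl. privileged features
# w = knowledge_gain_weights(src_unc, tgt_unc)
# loss = ctr_loss + weighted_kd_loss(student(x_target), teacher(x_shared), w)
```

Any calibrated uncertainty estimate (e.g., deep ensembles) could replace the MC-dropout proxy here; the essential ingredient is that transfer strength grows with the teacher's certainty advantage over the student.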