research-article

Task Aware Feature Extraction Framework for Sequential Dependence Multi-Task Learning

Authors:

Bing Han

RecSys '23: Proceedings of the 17th ACM Conference on Recommender Systems

Pages 151 - 160

https://doi.org/10.1145/3604915.3608772

Published: 14 September 2023

Abstract

In online recommendation, financial services, and similar domains, the most common application of multi-task learning (MTL) is multi-step conversion estimation. A core property of multi-step conversion is the sequential dependence among tasks. However, most existing work concentrates on the specific cases of post-view click-through rate (CTR) and post-click conversion rate (CVR) estimation and neglects the generalization to sequential dependence multi-task learning (SDMTL). In addition, the performance of SDMTL frameworks is degraded by interference from implicitly conflicting information passed between adjacent tasks. This paper establishes, for the first time, a systematic learning paradigm for the SDMTL problem. The paradigm transforms SDMTL into a general MTL problem with constraints and applies to more general multi-step conversion scenarios with stronger task dependence. The distribution dependence between adjacent task spaces is also analyzed from a theoretical point of view. On the implementation side, an SDMTL architecture named Task Aware Feature Extraction (TAFE) is developed to enable dynamic task representation learning from a sample-wise view. TAFE selectively reconstructs the implicit shared information corresponding to each sample and performs explicit task-specific extraction under dependence constraints. Extensive offline experiments on public and real-world industrial datasets, together with online A/B tests, demonstrate the effectiveness and applicability of the proposed theoretical and implementation frameworks.
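The sequential dependence mentioned in the abstract can be illustrated with a small sketch. This is not the TAFE architecture and not the paper's formulation; it only shows the generic property that, in a multi-step funnel (for example impression, click, conversion), a later step can be positive only if the preceding step is positive, so the step probabilities are non-increasing along the chain. The function names `chain_probabilities` and `violates_dependence`, and the example numbers, are hypothetical and chosen for illustration.

```python
import numpy as np


def chain_probabilities(conditional_probs):
    """Chain per-step conditional probabilities into step probabilities.

    Assumption for this sketch: conditional_probs[:, t] holds
    P(y_t = 1 | y_{t-1} = 1, x) for step t, and for t = 0 it is simply
    P(y_0 = 1 | x). The cumulative product along the step axis then
    gives P(y_t = 1 | x) for every step.
    """
    conditional_probs = np.asarray(conditional_probs, dtype=float)
    return np.cumprod(conditional_probs, axis=1)


def violates_dependence(labels):
    """Flag samples whose labels break the sequential dependence.

    A sample violates the dependence if some step is positive while
    the step immediately before it is negative.
    """
    labels = np.asarray(labels)
    return np.any(labels[:, 1:] > labels[:, :-1], axis=1)


# Hypothetical conditional probabilities for two samples and three steps.
cond = np.array([[0.20, 0.50, 0.10],
                 [0.05, 0.40, 0.25]])
probs = chain_probabilities(cond)

# Step probabilities never increase along the chain.
monotone = bool(np.all(np.diff(probs, axis=1) <= 0))

# The first label row respects the dependence, the second does not.
flags = violates_dependence([[1, 1, 0], [0, 1, 0]])
```

Multiplying conditional probabilities is one common way to keep predictions consistent with the dependence; the constrained formulation the paper proposes may differ.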

Cited By

Wu Y, Fu R, Xing T, Yu Z, Yin F (2025). A user behavior-aware multi-task learning model for enhanced short video recommendation. Neurocomputing 617 (129076). Online publication date: Feb-2025.
https://doi.org/10.1016/j.neucom.2024.129076

Tang X, Qiao Y, Lyu F, Liu D, He X (2024). Touch the Core: Exploring Task Dependence Among Hybrid Targets for Recommendation. Proceedings of the 18th ACM Conference on Recommender Systems (329-339). Online publication date: 8-Oct-2024.
https://dl.acm.org/doi/10.1145/3640457.3688101

Fu C, Wang K, Wu J, Chen Y, Huzhang G, Ni Y, Zeng A, Zhou Z, Baeza-Yates R, Bonchi F (2024). Residual Multi-Task Learner for Applied Ranking. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (4974-4985). Online publication date: 25-Aug-2024.
https://dl.acm.org/doi/10.1145/3637528.3671523

Published In

RecSys '23: Proceedings of the 17th ACM Conference on Recommender Systems

September 2023

1406 pages

ISBN: 9798400702419

DOI: 10.1145/3604915

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

RecSys '23

RecSys '23: Seventeenth ACM Conference on Recommender Systems

September 18 - 22, 2023

Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 254 of 1,295 submissions, 20%

Article Metrics

Total Citations: 3
Total Downloads: 475
Downloads (Last 12 months): 188
Downloads (Last 6 weeks): 8

Reflects downloads up to 20 Feb 2025
