short-paper

FINAL: Factorized Interaction Layer for CTR Prediction

Authors:

Rui Zhang

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 2006 - 2010

https://doi.org/10.1145/3539618.3591988

Published: 18 July 2023

Abstract

Multi-layer perceptron (MLP) serves as a core component in many deep models for click-through rate (CTR) prediction. However, vanilla MLP networks are inefficient in learning multiplicative feature interactions, making feature interaction learning an essential topic for CTR prediction. Existing feature interaction networks are effective in complementing the learning of MLPs, but they often fall short of the performance of MLPs when applied alone. Thus, their integration with MLP networks is necessary to achieve improved performance. This situation motivates us to explore a better alternative to the MLP backbone that could potentially replace MLPs. Inspired by factorization machines, in this paper, we propose FINAL, a factorized interaction layer that extends the widely-used linear layer and is capable of learning 2nd-order feature interactions. Similar to MLPs, multiple FINAL layers can be stacked into a FINAL block, yielding feature interactions with an exponential degree growth. We unify feature interactions and MLPs into a single FINAL block and empirically show its effectiveness as a replacement for the MLP block. Furthermore, we explore the ensemble of two FINAL blocks as an enhanced two-stream CTR model, setting a new state-of-the-art on open benchmark datasets. FINAL can be easily adopted as a building block and has achieved business metric gains in multiple applications at Huawei. Our source code will be made available at MindSpore/models and FuxiCTR/model_zoo.
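
The abstract does not spell out the layer's equation, so the following Python sketch is only one hypothetical reading of "a linear layer extended with 2nd-order interactions": the elementwise product of two linear projections of the input, plus one of the projections as a linear term. The names `linear`, `factorized_interaction_layer`, and `stack`, and the omission of any nonlinearity, are assumptions made for illustration and are not taken from the paper or its released code. With scalar unit weights the layer computes x*x + x, so stacking two such layers gives a degree-4 polynomial in x, which illustrates the exponential degree growth the abstract mentions.

```python
def linear(x, W, b):
    """Plain linear layer: y[j] = sum_i W[j][i] * x[i] + b[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]


def factorized_interaction_layer(x, W1, b1, W2, b2):
    """Hypothetical 2nd-order layer (an assumption, not the paper's formula).

    Computes (W1 x + b1) * (W2 x + b2) + (W1 x + b1) elementwise, so every
    output contains products x[i] * x[k] in addition to linear terms.
    """
    h1 = linear(x, W1, b1)
    h2 = linear(x, W2, b2)
    return [a * g + a for a, g in zip(h1, h2)]


def stack(x, layers):
    """Apply layers in sequence; each layer is a (W1, b1, W2, b2) tuple."""
    for W1, b1, W2, b2 in layers:
        x = factorized_interaction_layer(x, W1, b1, W2, b2)
    return x


# Scalar example with unit weights and zero biases: one layer maps x to x*x + x.
unit = ([[1.0]], [0.0], [[1.0]], [0.0])
one_layer = stack([2.0], [unit])         # 2*2 + 2 = 6
two_layers = stack([2.0], [unit, unit])  # 6*6 + 6 = 42, a degree-4 polynomial in x
```

Each layer at most doubles the polynomial degree of its input, so a stack of L such layers can reach degree 2^L, whereas a stack of purely linear layers stays at degree 1.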

Index Terms

FINAL: Factorized Interaction Layer for CTR Prediction
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Recommender systems

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2023

3567 pages

ISBN: 9781450394086

DOI: 10.1145/3539618

General Chairs:
Hsin-Hsi Chen (National Taiwan University),
Wei-Jou (Edward) Duh (National Taiwan University),
Hen-Hsen Huang (Academia Sinica)

Program Chairs:
Makoto P. Kato (Spotify),
Josiane Mothe (Universite de Toulouse),
Barbara Poblete (University of Chile and Amazon Visiting Academic)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Conference

SIGIR '23

Sponsor:

SIGIR

SIGIR '23: The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 23 - 27, 2023

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Article Metrics

Total Citations: 18
Total Downloads: 767
Downloads (Last 12 months): 321
Downloads (Last 6 weeks): 36

Reflects downloads up to 05 Mar 2025

Citations

Cited By

Ma H, Li M, Qin C, Shen D, Zhu H, Zhang X, Xiong H (2025). Joint Ability Assessment for Talent Recruitment: A Neural Cognitive Diagnosis Approach. ACM Transactions on Management Information Systems. DOI: 10.1145/3714414. Online publication date: 24-Jan-2025.
https://dl.acm.org/doi/10.1145/3714414
Wang Y, Chen B (2025). FinalGNN: A dual feature graph enhanced model for CTR prediction. Neurocomputing 619 (129181). DOI: 10.1016/j.neucom.2024.129181. Online publication date: Feb-2025.
https://doi.org/10.1016/j.neucom.2024.129181
Dang K, Tran T, Son T, Anh T, Nguyen D, Son N (2025). CoreNet: Leveraging context-aware representations via MLP networks for CTR prediction. Knowledge-Based Systems 312 (113154). DOI: 10.1016/j.knosys.2025.113154. Online publication date: Mar-2025.
https://doi.org/10.1016/j.knosys.2025.113154
Wu S, Du L, Yang J, Wang Y, Zhan D, Zhao S, Sun Z, Kiyavash N, Mooij J (2024). RE-SORT. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence (3816-3828). DOI: 10.5555/3702676.3702854. Online publication date: 15-Jul-2024.
https://dl.acm.org/doi/10.5555/3702676.3702854
Dai Y, Shen J, Zhai Z, Liu D, Chen J, Sun Y, Li P, Zhang J, Zhang K, Salakhutdinov R, Kolter Z, Heller K, Weller A, Oliver N, Scarlett J, Berkenkamp F (2024). High-order contrastive learning with fine-grained comparative levels for sparse ordinal tensor completion. Proceedings of the 41st International Conference on Machine Learning (9856-9871). DOI: 10.5555/3692070.3692461. Online publication date: 21-Jul-2024.
https://dl.acm.org/doi/10.5555/3692070.3692461
Yuan Q, Zhu M, Li Y, Liu H, Guo S (2024). Feature-Interaction-Enhanced Sequential Transformer for Click-Through Rate Prediction. Applied Sciences 14:7 (2760). DOI: 10.3390/app14072760. Online publication date: 26-Mar-2024.
https://doi.org/10.3390/app14072760
Li H, Sang L, Zhang Y, Zhang X, Zhang Y (2024). CETN: Contrast-enhanced Through Network for Click-Through Rate Prediction. ACM Transactions on Information Systems 43:1 (1-34). DOI: 10.1145/3688571. Online publication date: 27-Nov-2024.
https://dl.acm.org/doi/10.1145/3688571
Zhang Q, Zhu J, Sun J, Cai G, Yu R, He B, Li L (2024). Enhancing News Recommendation with Real-Time Feedback and Generative Sequence Modeling. Proceedings of the Recommender Systems Challenge 2024 (32-36). DOI: 10.1145/3687151.3687158. Online publication date: 14-Oct-2024.
https://dl.acm.org/doi/10.1145/3687151.3687158
Li H, Sang L, Zhang Y, Zhang Y, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D (2024). SimCEN: Simple Contrast-enhanced Network for CTR Prediction. Proceedings of the 32nd ACM International Conference on Multimedia (2311-2320). DOI: 10.1145/3664647.3681203. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3681203
Wang Y, Piao H, Dong D, Yao Q, Zhou J, Baeza-Yates R, Bonchi F (2024). Warming Up Cold-Start CTR Prediction by Learning Item-Specific Feature Interactions. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (3233-3244). DOI: 10.1145/3637528.3671784. Online publication date: 25-Aug-2024.
https://dl.acm.org/doi/10.1145/3637528.3671784
