A new interest extraction method based on multi-head attention mechanism for CTR prediction

  • Regular Paper
Knowledge and Information Systems

Abstract

Click-through rate (CTR) prediction plays a vital role in recommendation systems. Most existing models pay little attention to the relationships between target items in the user behavior sequence, and the attention units they use cannot fully capture the context information that reflects variations in user interests. To address these problems, we propose a new model for CTR prediction, an interest extraction method based on the multi-head attention mechanism (IEN). Specifically, we design an interest extraction module consisting of two sub-modules: the item representation module (IRM) and the context–item interaction module (CIM). In IRM, a multi-head attention mechanism learns the relationships between target items in the user behavior sequence. The user representation is then obtained by integrating the refined item representations with position information, and the correlation between the user and the target item is used to reflect user interests. In CIM, the context information carries valuable temporal features that reflect variations in user interests, so user interests are further captured through feature interaction between the context and the target item. The learned relevance and the feature interaction are then fed to a multi-layer perceptron (MLP) for prediction. Experiments on four Amazon datasets evaluate the effectiveness of our method in capturing user interests. The results show that the proposed method outperforms state-of-the-art methods in terms of AUC and RI on the CTR prediction task.
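The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact operators, so the additive position embedding, the mean pooling over the sequence, the element-wise products used for the user–target correlation and the context–item interaction, the layer sizes, and all function and parameter names (`ien_forward`, `init_params`, `params`) are assumptions made here for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, num_heads):
    """Scaled dot-product self-attention over a (seq_len, d) sequence."""
    seq_len, d = x.shape
    d_head = d // num_heads

    def split(w):
        # Project, then split the feature axis into heads: (heads, seq_len, d_head).
        return (x @ w).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(wq), split(wk), split(wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq_len, seq_len)
    out = softmax(scores, axis=-1) @ v                   # (heads, seq_len, d_head)
    # Concatenate the heads back into a (seq_len, d) matrix.
    return out.transpose(1, 0, 2).reshape(seq_len, d)

def init_params(seq_len, d, hidden, rng):
    # Small random weights; all shapes are illustrative assumptions.
    s = 0.1
    return {
        "wq": rng.normal(0, s, (d, d)),
        "wk": rng.normal(0, s, (d, d)),
        "wv": rng.normal(0, s, (d, d)),
        "pos": rng.normal(0, s, (seq_len, d)),   # position embeddings
        "w1": rng.normal(0, s, (2 * d, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0, s, hidden),
        "b2": 0.0,
    }

def ien_forward(behaviors, target, context, params, num_heads=2):
    # IRM: refine item representations with multi-head self-attention.
    refined = multi_head_self_attention(
        behaviors, params["wq"], params["wk"], params["wv"], num_heads)
    # Assumption: add position embeddings and mean-pool into a user vector.
    user = (refined + params["pos"]).mean(axis=0)
    # Assumption: element-wise product as the user-target correlation.
    relevance = user * target
    # CIM: assumed element-wise interaction between context and target item.
    interaction = context * target
    # MLP: one ReLU hidden layer followed by a sigmoid output.
    h = np.concatenate([relevance, interaction])
    h = np.maximum(0.0, h @ params["w1"] + params["b1"])
    logit = h @ params["w2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)
seq_len, d = 5, 8
params = init_params(seq_len, d, hidden=16, rng=rng)
behaviors = rng.normal(size=(seq_len, d))  # embedded user behavior sequence
target = rng.normal(size=d)                # embedded target item
context = rng.normal(size=d)               # embedded context features
p_click = ien_forward(behaviors, target, context, params)
print(p_click)  # predicted click probability, a value in (0, 1)
```

The sketch runs on random embeddings for a single sample; a trained model would learn the embeddings and weights by minimizing a loss on click labels.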

Notes

  1. http://jmcauley.ucsd.edu/data/amazon/.

Acknowledgements

The work was supported by the National Natural Science Foundation of China (Grant No. U1931209), the Central Government Guides Local Science and Technology Development Funds (Grant No. 20201070), and the Fundamental Research Program of Shanxi Province (Grant Nos. 20210302123223, 202103021224275).

Author information

Corresponding author

Correspondence to Jianghui Cai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethics approval

The study is original and has not been submitted to any other journal/conference.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, H., Yao, L., Cai, J. et al. A new interest extraction method based on multi-head attention mechanism for CTR prediction. Knowl Inf Syst 65, 3337–3352 (2023). https://doi.org/10.1007/s10115-023-01867-w

  • DOI: https://doi.org/10.1007/s10115-023-01867-w
