DOI: 10.1145/3366424.3386195
research-article

Mixed Negative Sampling for Learning Two-tower Neural Networks in Recommendations

Published: 20 April 2020

Abstract

Learning query and item representations is important for building large-scale recommendation systems. In many real applications with a huge catalog of items to recommend, the problem of efficiently retrieving the top k items for a user’s query from a deep corpus leads to a family of factorized modeling approaches in which queries and items are jointly embedded into a low-dimensional space. In this paper, we first showcase how to apply a two-tower neural network framework, also known as a dual encoder in the natural language community, to improve a large-scale production app recommendation system. Furthermore, we offer a novel negative sampling approach called Mixed Negative Sampling (MNS). In particular, unlike commonly used batch or unigram sampling methods, MNS uses a mixture of batch and uniformly sampled negatives to tackle the selection bias of implicit user feedback. We conduct extensive offline experiments on a large-scale production dataset and show that MNS outperforms other baseline sampling methods. We also conduct online A/B testing and demonstrate that the two-tower retrieval model based on MNS significantly improves retrieval quality by encouraging more high-quality app installs.
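The mixing idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the toy random embedding table standing in for both towers, and the batch sizes are all assumptions made for illustration, and any sampling-probability correction is omitted. Each query is scored against the items of the other examples in its batch (batch negatives) plus a set of items drawn uniformly from the catalog, and a softmax cross-entropy is taken over the combined candidate set.

```python
import numpy as np


def mixed_negative_logits(query_emb, batch_item_emb, uniform_item_emb):
    """Score every query against in-batch items plus uniformly sampled items.

    query_emb:        (B, d) query-tower outputs
    batch_item_emb:   (B, d) item-tower outputs; row i is the positive for query i
    uniform_item_emb: (B_u, d) item-tower outputs for uniformly sampled items
    Returns logits of shape (B, B + B_u).
    """
    candidates = np.concatenate([batch_item_emb, uniform_item_emb], axis=0)
    return query_emb @ candidates.T


def softmax_cross_entropy(logits):
    """Mean cross-entropy where the positive for row i sits in column i."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    rows = np.arange(logits.shape[0])
    return float(-log_probs[rows, rows].mean())


# Toy setup (hypothetical sizes): random vectors stand in for tower outputs.
rng = np.random.default_rng(0)
num_items, dim, batch_size, num_uniform = 1000, 8, 4, 6

item_table = rng.normal(size=(num_items, dim))
query_emb = rng.normal(size=(batch_size, dim))

# Positives come from the (simulated) interaction log; the extra negatives are
# drawn uniformly from the whole catalog, so a uniform draw may occasionally
# coincide with a positive item. This sketch does not filter such collisions.
positive_ids = rng.integers(0, num_items, size=batch_size)
uniform_ids = rng.integers(0, num_items, size=num_uniform)

logits = mixed_negative_logits(
    query_emb, item_table[positive_ids], item_table[uniform_ids]
)
loss = softmax_cross_entropy(logits)
```

In this sketch the ratio of `num_uniform` to `batch_size` controls how much weight the uniformly sampled part of the catalog gets relative to items that appear in the interaction log.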

Published In

WWW '20: Companion Proceedings of the Web Conference 2020
April 2020
854 pages
ISBN: 9781450370240
DOI: 10.1145/3366424

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Context-aware Recommender Systems
  2. Extreme Classification
  3. Information Retrieval
  4. Neural Networks

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '20
WWW '20: The Web Conference 2020
April 20 - 24, 2020
Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
