Research article
DOI: 10.1145/3219819.3219897

Scalable Query N-Gram Embedding for Improving Matching and Relevance in Sponsored Search

Published: 19 July 2018

Abstract

Sponsored search has been the major source of revenue for commercial web search engines. It is crucial for a sponsored search engine to retrieve ads that are relevant to user queries in order to attract clicks, as advertisers pay only when their ads are clicked. Retrieving relevant ads for a query typically involves first matching related ads to the query and then filtering out irrelevant ones. Both steps require understanding the semantic relationship between a query and an ad. In this work, we propose a novel embedding of queries and ads in sponsored search. The query embeddings are generated from constituent word n-gram embeddings that are trained to optimize an event-level word2vec objective over a large volume of search data. We show through a query rewriting task that the proposed query n-gram embedding model outperforms state-of-the-art word embedding models at capturing query semantics. This allows us to apply the proposed model to improve query-ad matching and relevance in sponsored search. First, we use the similarity between a query and an ad derived from the query n-gram embeddings as an additional feature in the query-ad relevance model used in Yahoo Search. We show through an online A/B test that using the new relevance model to filter irrelevant ads offline leads to a 0.47% increase in CTR and a 0.32% increase in revenue. Second, we propose a novel online query-to-ads matching system, built on an open-source big-data serving engine [30], using the learned query n-gram embeddings. An online A/B test shows that the new matching technique increases search revenue by 2.32%, as it significantly increases ad coverage for tail queries.
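To make the idea of building a query embedding from its constituent word n-gram embeddings more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the n-gram vectors are combined or how ads are embedded, so averaging, cosine similarity, the toy 2-dimensional vectors, the ad identifiers, and all function names below are assumptions made purely for illustration.

```python
import math
from typing import Dict, List, Optional

Vector = List[float]

# Hypothetical toy embedding tables; real tables would be learned from search logs.
NGRAM_TABLE: Dict[str, Vector] = {
    "cheap": [1.0, 0.0],
    "flights": [0.0, 1.0],
    "cheap flights": [1.0, 1.0],
}
AD_VECTORS: Dict[str, Vector] = {
    "ad_airfare": [1.0, 1.0],
    "ad_shoes": [1.0, -1.0],
}


def word_ngrams(query: str, max_n: int = 2) -> List[str]:
    """Return all word n-grams of the query for n = 1..max_n."""
    tokens = query.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def embed_query(query: str, table: Dict[str, Vector], max_n: int = 2) -> Optional[Vector]:
    """Average the vectors of the n-grams found in the table (averaging is an assumption)."""
    vectors = [table[g] for g in word_ngrams(query, max_n) if g in table]
    if not vectors:
        return None  # no known n-gram, so no embedding can be built
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_ads(query_vec: Vector, ad_vectors: Dict[str, Vector], k: int = 1) -> List[str]:
    """Rank ads by cosine similarity to the query vector and keep the best k."""
    scored = sorted(((cosine(query_vec, v), ad) for ad, v in ad_vectors.items()), reverse=True)
    return [ad for _, ad in scored[:k]]


query_vec = embed_query("cheap flights", NGRAM_TABLE)
best_ads = top_k_ads(query_vec, AD_VECTORS, k=1)
```

In this sketch the similarity score plays both roles described in the abstract: it could be fed to a relevance model as one feature, or used directly to retrieve candidate ads. A production system would replace the exhaustive scan in `top_k_ads` with an index in a serving engine.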

References

[1]
Luca Maria Aiello, Ioannis Arapakis, Ricardo A. Baeza-Yates, Xiao Bai, Nicola Barbieri, Amin Mantrach, and Fabrizio Silvestri. 2016. The Role of Relevance in Sponsored Search. In Proceedings of the 25th ACM CIKM (CIKM '16). 185--194.
[2]
Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. 2003. A Neural Probabilistic Language Model. J. Mach. Learn. Res., Vol. 3 (March 2003), 1137--1155.
[3]
Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2016. Enriching Word Vectors with Subword Information. arXiv preprint arXiv:1607.04606 (2016).
[4]
Andrei Broder, Peter Ciccolo, Evgeniy Gabrilovich, Vanja Josifovski, Donald Metzler, Lance Riedel, and Jeffrey Yuan. 2009. Online Expansion of Rare Queries for Sponsored Search. In Proceedings of the 18th International Conference on World Wide Web (WWW '09). 511--520.
[5]
Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, and Richard Harshman. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science, Vol. 41, 6 (1990), 391--407.
[6]
Benjamin Edelman, Michael Ostrovsky, and Michael Schwarz. 2007. Internet Advertising and the Generalized Second-Price Auction: Selling Billions of Dollars Worth of Keywords. American Economic Review, Vol. 97, 1 (March 2007), 242--259.
[7]
Jianfeng Gao, Xiaodong He, and Jian-Yun Nie. 2010. Clickthrough-based Translation Models for Web Search: From Word Models to Phrase Models. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM '10). 1139--1148.
[8]
Daniel Gayo-Avello. 2009. A Survey on Session Detection Methods in Query Logs and a Proposal for Future Evaluation. Inf. Sci., Vol. 179, 12 (May 2009), 1822--1843.
[9]
Mihajlo Grbovic, Nemanja Djuric, Vladan Radosavljevic, Fabrizio Silvestri, Ricardo Baeza-Yates, Andrew Feng, Erik Ordentlich, Lee Yang, and Gavin Owens. 2016. Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '16). 375--384.
[10]
Mihajlo Grbovic, Nemanja Djuric, Vladan Radosavljevic, Fabrizio Silvestri, and Narayan Bhamidipati. 2015. Context- and Content-aware Embeddings for Query Rewriting in Sponsored Search. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '15). 383--392.
[11]
Dustin Hillard, Stefan Schroedl, Eren Manavoglu, Hema Raghavan, and Chris Leggetter. 2010. Improving Ad Relevance in Sponsored Search. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM '10). 361--370.
[12]
Shan Jiang, Yuening Hu, Changsung Kang, Tim Daly, Jr., Dawei Yin, Yi Chang, and Chengxiang Zhai. 2016. Learning Query and Document Relevance from a Web-scale Click Graph. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '16). 185--194.
[13]
Rosie Jones, Benjamin Rey, Omid Madani, and Wiley Greiner. 2006. Generating Query Substitutions. In Proceedings of the 15th International Conference on World Wide Web (WWW '06). 387--396.
[14]
Armand Joulin, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov. 2016. Bag of Tricks for Efficient Text Classification. arXiv preprint arXiv:1607.01759 (2016).
[15]
Rémi Lebret and Ronan Collobert. 2015. "The Sum of Its Parts": Joint Learning of Word and Phrase Representations with Autoencoders. arXiv preprint arXiv:1506.05703 (2015).
[16]
Kevin Lund and Curt Burgess. 1996. Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, Vol. 28, 2 (1996), 203--208.
[17]
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
[18]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013b. Distributed Representations of Words and Phrases and Their Compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2 (NIPS'13). 3111--3119.
[19]
Mo Yu and Mark Dredze. 2015. Learning Composition Models for Phrase Embeddings. Transactions of the Association for Computational Linguistics, Vol. 3 (2015), 227--242.
[20]
Erik Ordentlich, Lee Yang, Andy Feng, Peter Cnudde, Mihajlo Grbovic, Nemanja Djuric, Vladan Radosavljevic, and Gavin Owens. 2016. Network-Efficient Distributed Word2Vec Training System for Large Vocabularies. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (CIKM '16). 1139--1148.
[21]
Xiaochang Peng and Daniel Gildea. 2016. Exploring phrase-compositionality in skip-gram models. arXiv preprint arXiv:1607.06208 (2016).
[22]
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. In Proceedings of EMNLP. 1532--1543.
[23]
Filip Radlinski, Andrei Broder, Peter Ciccolo, Evgeniy Gabrilovich, Vanja Josifovski, and Lance Riedel. 2008. Optimizing Relevance and Revenue in Ad Search: A Query Substitution Approach. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08). 403--410.
[24]
Hema Raghavan and Rukmini Iyer. 2010. Probabilistic First Pass Retrieval for Search Advertising: From Theory to Practice. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM '10). 1019--1028.
[25]
Stephen E. Robertson, Steve Walker, Susan Jones, Micheline Hancock-Beaulieu, and Mike Gatford. 1994. Okapi at TREC-3. In TREC, Donna K. Harman (Ed.), Vol. Special Publication 500-225. 109--126.
[26]
Douglas L. T. Rohde, Laura M. Gonnerman, and David C. Plaut. 2006. An improved model of semantic similarity based on lexical co-occurrence. Communications of the ACM, Vol. 8 (2006), 627--633.
[27]
David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. 1988. Neurocomputing: Foundations of Research. MIT Press, Chapter Learning Representations by Back-propagating Errors, 696--699.
[28]
Fabrizio Sebastiani. 2002. Machine Learning in Automated Text Categorization. ACM Comput. Surv., Vol. 34, 1 (2002), 1--47.
[29]
Joseph Turian, Lev Ratinov, and Yoshua Bengio. 2010. Word Representations: A Simple and General Method for Semi-supervised Learning. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL '10). 384--394.
[30]
Vespa. 2017. http://docs.vespa.ai/documentation/overview.html. Open source. (2017).

Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN:9781450355520
DOI:10.1145/3219819
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Apache Spark
  2. Vespa
  3. distributed training
  4. n-gram embedding
  5. query-ad matching
  6. query-ad relevance
  7. sponsored search

Qualifiers

  • Research-article

Conference

KDD '18
Acceptance Rates

KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Downloads (Last 12 months): 24
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 05 Mar 2025

Cited By

  • (2024) Software development for contextual advertising of listings in the real estate domain. Problems in Programming (180-189). DOI: 10.15407/pp2024.02-03.180. Online publication date: Sep-2024
  • (2024) Transformer based contextual text representation framework for intelligent information retrieval. Expert Systems with Applications, 238 (121629). DOI: 10.1016/j.eswa.2023.121629. Online publication date: Mar-2024
  • (2023) Unified Generative & Dense Retrieval for Query Rewriting in Sponsored Search. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (4745-4751). DOI: 10.1145/3583780.3615459. Online publication date: 21-Oct-2023
  • (2023) PASS: Personalized Advertiser-aware Sponsored Search. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (4924-4936). DOI: 10.1145/3580305.3599882. Online publication date: 6-Aug-2023
  • (2023) Feynman: Federated Learning-Based Advertising for Ecosystems-Oriented Mobile Apps Recommendation. IEEE Transactions on Services Computing, 16:5 (3361-3372). DOI: 10.1109/TSC.2023.3285935. Online publication date: Sep-2023
  • (2021) CHASE. Proceedings of the 30th ACM International Conference on Information & Knowledge Management (4352-4361). DOI: 10.1145/3459637.3481902. Online publication date: 26-Oct-2021
  • (2021) Diversity driven Query Rewriting in Search Advertising. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (3423-3431). DOI: 10.1145/3447548.3467202. Online publication date: 14-Aug-2021
  • (2021) TextGNN: Improving Text Encoder via Graph Neural Network in Sponsored Search. Proceedings of the Web Conference 2021 (2848-2857). DOI: 10.1145/3442381.3449842. Online publication date: 19-Apr-2021
  • (2021) DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents. Proceedings of the 14th ACM International Conference on Web Search and Data Mining (31-39). DOI: 10.1145/3437963.3441810. Online publication date: 8-Mar-2021
  • (2021) AdsGNN: Behavior-Graph Augmented Relevance Modeling in Sponsored Search. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (223-232). DOI: 10.1145/3404835.3462926. Online publication date: 11-Jul-2021
