
Text Matching Indexers in Taobao Search

Authors: Sen Li, Fuyu Lv, Ruqing Zhang, Dan Ou, Zhixuan Zhang, Maarten de Rijke

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Pages 5339 - 5350

https://doi.org/10.1145/3637528.3671654

Published: 24 August 2024


Abstract

Product search is an important service on Taobao, the largest e-commerce platform in China. Through this service, users can easily find products relevant to their specific needs. Coping with billion-size query loads, Taobao product search has traditionally relied on classical term-based retrieval models due to their powerful and interpretable indexes. In essence, efficient retrieval hinges on the proper storage of the inverted index. Recent successes involve reducing the size (pruning) of the inverted index, but the construction and deployment of lossless static index pruning in practical product search still pose non-trivial challenges.

In this work, we introduce a novel SMart INDexing (SMIND) solution in Taobao product search. SMIND is designed to reduce information loss during the static pruning process by incorporating user search preferences. Specifically, we first construct "user-query-item" hypergraphs for four different search preferences, namely purchase, click, exposure, and relevance. Then, we develop an efficient TermRank algorithm applied to these hypergraphs, to preserve relevant items based on specific user preferences during the pruning of the inverted indexer. Our approach offers fresh insights into the field of product search, emphasizing that term dependencies in user search preferences go beyond mere text relevance. Moreover, to address the vocabulary mismatch problem inherent in term-based models, we also incorporate a multi-granularity semantic retrieval model to facilitate semantic matching. Empirical results from both offline evaluation and online A/B tests showcase the superiority of SMIND over state-of-the-art methods, especially in commerce metrics with significant improvements of 1.34% in Pay Order Count and 1.50% in Gross Merchandise Value. Besides, SMIND effectively mitigates the Matthew effect of user queries and has been in service for hundreds of millions of daily users since November 2022.
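To make the idea of static index pruning concrete, the following is a minimal Python sketch, not the SMIND or TermRank method itself, whose details the abstract does not give. It assumes a hypothetical per-(term, item) preference score (for example, one derived from click or purchase logs) and keeps only the highest-scoring items in each posting list. The function names, the `keep` parameter, and the toy data are all illustrative assumptions.

```python
from collections import defaultdict


def build_inverted_index(items):
    """Map each term to the set of item ids whose title contains it."""
    index = defaultdict(set)
    for item_id, title in items.items():
        for term in title.lower().split():
            index[term].add(item_id)
    return index


def prune_index(index, scores, keep):
    """Static pruning: keep at most `keep` highest-scoring items per term.

    `scores` is a hypothetical {(term, item_id): float} preference score;
    missing pairs default to 0.0. Ties are broken by item id.
    """
    pruned = {}
    for term, postings in index.items():
        ranked = sorted(postings, key=lambda i: (-scores.get((term, i), 0.0), i))
        pruned[term] = ranked[:keep]
    return pruned


def retrieve(pruned, query):
    """Conjunctive (AND) term matching over the pruned posting lists."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(pruned.get(terms[0], []))
    for term in terms[1:]:
        result &= set(pruned.get(term, []))
    return result


# Toy catalogue and assumed preference scores (illustrative only).
items = {1: "red dress", 2: "red shoes", 3: "blue dress", 4: "red summer dress"}
scores = {
    ("red", 1): 0.9, ("red", 2): 0.2, ("red", 4): 0.7,
    ("dress", 1): 0.8, ("dress", 3): 0.1, ("dress", 4): 0.6,
}
pruned = prune_index(build_inverted_index(items), scores, keep=2)
```

With these assumed scores, `retrieve(pruned, "red dress")` still returns items 1 and 4, but `retrieve(pruned, "red shoes")` returns nothing, because item 2 was dropped from the posting list for "red". This is the kind of information loss that pruning by a per-term score can cause, and which the abstract says SMIND aims to reduce.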

