DOI: 10.1145/3308558.3313706
research-article

Personalized Online Spell Correction for Personal Search

Published: 13 May 2019

ABSTRACT

Spell correction is a must-have feature for any modern search engine in applications such as web or e-commerce search. Typical spell correction solutions used in production systems consist of large indexed lookup tables based on a global model trained across many users over a large-scale web corpus or a query log.

For search over personal corpora, such as email, this global solution is not sufficient, as it ignores the user's personal lexicon. Without personalization, global spelling fails to correct tail queries drawn from a user's own, often idiosyncratic, lexicon. Personalization using existing algorithms is difficult due to resource constraints and unavailability of sufficient data to build per-user models.

In this work, we propose a simple and effective personalized spell correction solution that augments existing global solutions for search over private corpora. Our event-driven spell correction candidate generation method is specifically designed with personalization as the key construct. Our novel spell correction and query completion algorithms do not require complex model training and are highly efficient. The proposed solution has shown over 30% click-through rate gain on affected queries when evaluated against a range of strong commercial personal search baselines: Google's Gmail, Drive, and Calendar search production systems.


Published in

    WWW '19: The World Wide Web Conference
    May 2019
    3620 pages
    ISBN:9781450366748
    DOI:10.1145/3308558

    Copyright © 2019 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
