DOI: 10.1145/2661829.2661943

Improving Tail Query Performance by Fusion Model

Published: 03 November 2014

Abstract

Tail queries, which occur with low frequency, make up a large fraction of unique queries and often affect users' experience of Web search. Because of data sparseness, there is too little information to leverage for tail queries, so improving their performance is both important and difficult. According to our observation, 26% of tail queries are not essentially scarce: they are expressed in an unusual way, but the information needs behind them are not rare. In this study, we improve tail query performance by fusing the results of the original query with those of its query reformulation candidates. Beyond re-ranking existing results, the fusion model can introduce new results. We emphasize that bad queries are not the only queries that can be improved, and we propose to extract features that predict whether performance can be improved. We then use a learning-to-rank method, trained to directly optimize a retrieval metric, to fuse the documents and obtain a final result list. We conducted experiments using data from two popular Chinese search engines. The results indicate that our fusion method significantly improves the performance of tail queries and outperforms state-of-the-art approaches on the same reformulations. Experiments also show that our method is effective for non-tail queries.
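
As a rough illustration of what fusing the result lists of an original query and its reformulations can look like, the sketch below uses a simple weighted sum of min-max normalized scores (a CombSUM-style baseline). This is not the learning-to-rank fusion model described in the abstract; the function names, the per-list weights, and the example scores are all assumptions made for illustration.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one result list's scores to [0, 1]; a constant list maps to 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(result_lists, weights=None):
    """Fuse several {doc_id: score} result lists into one ranked list.

    Hypothetical baseline: each document's fused score is the weighted sum
    of its normalized scores across the lists in which it appears.
    Returns (doc_id, fused_score) pairs, best first, ties broken by doc_id.
    """
    if weights is None:
        weights = [1.0] * len(result_lists)
    fused = defaultdict(float)
    for weight, scores in zip(weights, result_lists):
        if not scores:
            continue  # an empty result list contributes nothing
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += weight * s
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up retrieval scores for an original query and one reformulation.
original = {"d1": 2.0, "d2": 1.0}
reformulated = {"d2": 3.0, "d3": 2.0, "d4": 1.0}
ranking = [doc for doc, _ in fuse([original, reformulated])]
```

In this toy example, documents d3 and d4 appear only in the reformulation's results, so the fused list contains results that the original query never returned, which is the effect the abstract describes. A learned model would replace the fixed weights with a ranking function trained on query and document features.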

Published In

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
November 2014
2152 pages
ISBN:9781450325981
DOI:10.1145/2661829

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. fusion model
  2. learning to rank
  3. query reformulation
  4. results refinement
  5. tail query

Qualifiers

  • Research-article

Conference

CIKM '14

Acceptance Rates

CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
