Skip to main content

Caching for Realtime Search

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6611))

Abstract

Modern search engines feature real-time indices, which incorporate changes to content within seconds. As search engines also cache search results for reducing user latency and back-end load, without careful real-time management of search results caches, the engine might return stale search results to users despite the efforts invested in keeping the underlying index up to date. A recent paper proposed an architectural component called CIP – the cache invalidation predictor. CIPs invalidate supposedly stale cache entries upon index modifications. Initial evaluation showed the ability to keep the performance benefits of caching without sacrificing much the freshness of search results returned to users. However, it was conducted on a synthetic workload in a simplified setting, using many assumptions. We propose new CIP heuristics, and evaluate them in an authentic environment – on the real evolving corpus and query stream of a large commercial news search engine. Our CIPs operate in conjunction with realistic cache settings, and we use standard metrics for evaluating cache performance. We show that a classical cache replacement policy, LRU, completely fails to guarantee freshness over time, whereas our CIPs serve 97% of the queries with fresh results. Our policies incur a negligible impact on the baseline’s cache hit rate, in contrast with traditional age-based invalidation, which must severely reduce the cache performance in order to achieve the same freshness. We demonstrate that the computational overhead of our algorithms is minor, and that they even allow reducing the cache’s memory footprint.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Aho, A.V., Denning, P.J., Ullman, J.D.: Principles of Optimal Page Replacement. JACM 18, 80–93 (1971)

    Article  MathSciNet  MATH  Google Scholar 

  2. Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval. ACM Press / Addison Wesley, New York (1999)

    Google Scholar 

  3. Blanco, R., Bortnikov, E., Junqueira, F., Lempel, R., Telloli, L., Zaragoza, H.: Caching Search Engine Results over Incremental Indices. In: SIGIR, pp. 82–89 (2010)

    Google Scholar 

  4. Cambazoglu, B., Junqueira, F., Plachouras, V., Banachowski, S., Cui, B., Lim, S., Bridge, B.: A Refreshing Perspective of Search Engine Caching. In: WWW, pp. 181–190 (2010)

    Google Scholar 

  5. Dong, A., Chang, Y., Zheng, Z., Mishne, G., Bai, J., Zhang, R., Buchner, K., Liao, C., Diaz, F.: Towards recency ranking in web search. In: WSDM, pp. 11–20 (2010)

    Google Scholar 

  6. Fagni, T., Perego, R., Silvestri, F., Orlando, S.: Boosting the Performance of Web Search Engines: Caching and Prefetching Query Results by Exploiting Historical Usage Data. ACM Trans. Inf. Syst. 24(1), 51–78 (2006)

    Article  Google Scholar 

  7. Gan, Q., Suel, T.: Improved techniques for result caching in web search engines. In: WWW, pp. 431–440 (2009)

    Google Scholar 

  8. Handy, J.: The Cache Memory Book. Academic Press, London (1998)

    MATH  Google Scholar 

  9. Lempel, R., Moran, S.: Predictive Caching and Prefetching of Query Results in Search Engines. In: WWW, pp. 19–28 (2003)

    Google Scholar 

  10. Markatos, E.P.: On Caching Search Engine Query Results. Computer Communications 24(2), 137–143 (2001)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bortnikov, E., Lempel, R., Vornovitsky, K. (2011). Caching for Realtime Search. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_12

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-20161-5_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20160-8

  • Online ISBN: 978-3-642-20161-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics