DOI: 10.1145/2911451.2911513
research-article

Improved Caching Techniques for Large-Scale Image Hosting Services

Published: 07 July 2016

ABSTRACT

Commercial image serving systems, such as Flickr and Facebook, rely on large image caches to avoid, as much as possible, retrieving requested images from the costly backend image store. Such systems serve the same image in different resolutions, and thus in different sizes, to different clients, depending on the properties of the clients' devices. Each requested resolution of an image can be cached individually, as in traditional caches, which reduces the backend workload. However, a potentially better approach is to store relatively high-resolution images in the cache and resize them at retrieval time to obtain lower-resolution images. This on-the-fly resizing capability enables image serving systems to deploy more sophisticated caching policies and further improve their serving performance. In this paper, we formalize the static caching problem in image serving systems that provide on-the-fly image resizing functionality in their edge caches or regional caches. We propose two gain-based caching policies that construct a static, fixed-capacity cache to reduce the average serving time of images. The basic idea in the proposed policies is to identify the best resolution(s) of each image to cache so that the average serving time for future image retrieval requests is reduced. We conduct extensive experiments using real-life data access logs obtained from Flickr. We show that one of the proposed caching policies reduces the average response time of the service by up to 4.2% with respect to the best-performing baseline, which relies mainly on access frequency information to make its caching decisions. This improvement implies about a 25% reduction in cache size under similar serving time constraints.
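The abstract does not spell out the two proposed policies, so the following is only a minimal sketch of the general idea it describes: fill a fixed-capacity static cache with the image resolutions that save the most serving time, given that a cached higher resolution can be resized to answer requests for lower ones. Everything specific here is an assumption made for illustration and is not taken from the paper: the three cost constants, the greedy gain-per-byte selection rule, and the names serving_time, total_time, and build_static_cache.

```python
# Hypothetical per-request serving costs in milliseconds (assumed values).
BACKEND_MS = 100.0  # miss: fetch the image from the backend store
CACHE_MS = 5.0      # hit: the exact requested resolution is cached
RESIZE_MS = 20.0    # hit: a cached higher resolution is downscaled


def serving_time(res, cached):
    """Cost of one request for `res`, given the cached resolutions of that image."""
    if res in cached:
        return CACHE_MS
    if any(c > res for c in cached):
        return RESIZE_MS
    return BACKEND_MS


def total_time(freq, cached):
    """Total cost for one image; `freq` maps resolution -> request count."""
    return sum(n * serving_time(r, cached) for r, n in freq.items())


def build_static_cache(freqs, sizes, capacity):
    """Greedily add the (image, resolution) with the best time saved per byte.

    freqs: {image: {resolution: request count}} from a past access log
    sizes: {image: {resolution: size in bytes}}
    Returns {image: set of cached resolutions}.
    """
    cached = {img: set() for img in freqs}
    used = 0
    while True:
        best, best_ratio = None, 0.0
        for img, freq in freqs.items():
            base = total_time(freq, cached[img])
            for res, size in sizes[img].items():
                if res in cached[img] or used + size > capacity:
                    continue  # already cached, or does not fit
                gain = base - total_time(freq, cached[img] | {res})
                ratio = gain / size
                if ratio > best_ratio:
                    best, best_ratio = (img, res), ratio
        if best is None:  # nothing fits or nothing helps any more
            return cached
        img, res = best
        cached[img].add(res)
        used += sizes[img][res]
```

For example, calling build_static_cache({"a": {1: 50, 2: 30, 3: 5}, "b": {1: 10}}, {"a": {1: 10, 2: 40, 3: 160}, "b": {1: 10}}, 60) weighs each candidate resolution by the serving time it would save per byte. A baseline driven only by access frequency would rank candidates by request count and ignore both the size of each resolution and the option of resizing.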

      • Published in

        SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
        July 2016
        1296 pages
        ISBN: 9781450340694
        DOI: 10.1145/2911451

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        SIGIR '16 paper acceptance rate: 62 of 341 submissions (18%). Overall acceptance rate: 792 of 3,983 submissions (20%).
