ABSTRACT
Commercial image serving systems, such as Flickr and Facebook, rely on large image caches to avoid, as much as possible, retrieving requested images from the costly backend image store. Such systems serve the same image in different resolutions and, thus, in different sizes to different clients, depending on the properties of the clients' devices. The requested resolutions of images can be cached individually, as in traditional caches, reducing the backend workload. However, a potentially better approach is to store relatively high-resolution images in the cache and resize them at retrieval time to obtain lower-resolution images. This on-the-fly image resizing capability enables image serving systems to deploy more sophisticated caching policies and further improve their serving performance. In this paper, we formalize the static caching problem in image serving systems that provide on-the-fly image resizing functionality in their edge caches or regional caches. We propose two gain-based caching policies that construct a static, fixed-capacity cache to reduce the average serving time of images. The basic idea in the proposed policies is to identify the best resolution(s) of each image to cache so that the average serving time for future image retrieval requests is reduced. We conduct extensive experiments using real-life data access logs obtained from Flickr. We show that one of the proposed caching policies reduces the average response time of the service by up to 4.2% with respect to the best-performing baseline, which relies mainly on access frequency information to make its caching decisions. This improvement implies about a 25% reduction in cache size under similar serving time constraints.
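The gain-based idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's actual policies, which the abstract does not specify. It assumes a hypothetical cost model (a fixed cost for a cache hit, an extra cost for on-the-fly downscaling, and a fixed cost for a backend fetch) and a simple greedy rule that repeatedly adds the image resolution with the largest reduction in total serving time per byte of cache space. All names, constants, and the greedy rule itself are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# (image id, resolution level); a higher level means a larger image.
Key = Tuple[str, int]


def serve_cost(image: str, level: int, cached: Set[Key],
               hit: float, resize: float, miss: float) -> float:
    """Time to serve one request for (image, level) given the cached set."""
    if (image, level) in cached:
        return hit
    # Assumption: any cached copy at a higher resolution can be downscaled.
    if any(img == image and lvl > level for img, lvl in cached):
        return hit + resize
    return miss


def total_cost(freq: Dict[Key, int], cached: Set[Key],
               hit: float, resize: float, miss: float) -> float:
    """Total serving time over all expected requests in freq."""
    return sum(n * serve_cost(img, lvl, cached, hit, resize, miss)
               for (img, lvl), n in freq.items())


def build_static_cache(freq: Dict[Key, int], size: Dict[Key, int],
                       capacity: int, hit: float = 1.0,
                       resize: float = 5.0, miss: float = 50.0) -> Set[Key]:
    """Greedily fill a fixed-capacity cache by gain per byte (hypothetical rule)."""
    cached: Set[Key] = set()
    used = 0
    while True:
        base = total_cost(freq, cached, hit, resize, miss)
        best_key, best_ratio = None, 0.0
        for key in size:
            if key in cached or used + size[key] > capacity:
                continue
            # Gain: serving time saved if this resolution were also cached.
            gain = base - total_cost(freq, cached | {key}, hit, resize, miss)
            ratio = gain / size[key]
            if ratio > best_ratio:
                best_key, best_ratio = key, ratio
        if best_key is None:
            # Nothing else fits, or nothing else reduces serving time.
            return cached
        cached.add(best_key)
        used += size[best_key]
```

As a toy example, with `freq = {("a", 0): 5, ("a", 1): 5}`, `size = {("a", 0): 3, ("a", 1): 4}`, and `capacity=4`, this sketch caches only the higher resolution `("a", 1)`, because that single copy serves both resolutions, one directly and one through resizing. A policy that looks only at access frequency cannot distinguish between the two entries in this case.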