Pareto-based cache replacement for YouTube

World Wide Web

Abstract

YouTube, which serves a wide variety of videos to users worldwide, has recently become one of the most popular social-networking systems. YouTube employs a distributed memory caching system called Memcached to cache videos, and uses the Least Recently Used (LRU) algorithm to evict the least recently watched video when Memcached runs out of space. However, LRU may cause a high miss count, that is, the number of times a video requested by users cannot be found in Memcached. Because each missed video must be retrieved from the remote back-end database, a high miss count not only increases network overhead but can also degrade YouTube's service quality. To solve these problems, in this paper we classify videos into popular and unpopular videos and propose two cache replacement algorithms based on the Pareto principle: the Pareto-based Least Frequently Used algorithm (PLFU) and the Pareto-based Least Recently Used algorithm (PLRU). Both algorithms always keep several of the most popular videos of each video category in Memcached to reduce the miss count. When Memcached has insufficient space to hold a video requested by a user, PLFU and PLRU repeatedly evict unpopular videos from Memcached, selected by LFU and LRU respectively, until the requested video fits. Our simulation results based on a real-world YouTube trace show that PLFU performs best among all tested algorithms in terms of miss count and video-retrieval time. The results also indicate that PLRU provides the second-best performance when it is used for a longer time.
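The eviction behaviour described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the class name `ParetoCache`, the `pinned_ids` argument (standing in for the top popular videos of each category, however they are chosen), the size units, and the choice to count access frequency only while a video is cached are all assumptions made for illustration. The sketch keeps pinned videos in the cache once loaded and evicts only unpinned videos, by LFU or by LRU, until the requested video fits.

```python
from collections import OrderedDict


class ParetoCache:
    """Illustrative cache: pinned (popular) videos are never evicted."""

    def __init__(self, capacity, pinned_ids, policy="lfu"):
        if policy not in ("lfu", "lru"):
            raise ValueError("policy must be 'lfu' or 'lru'")
        self.capacity = capacity        # total space, arbitrary size units
        self.pinned = set(pinned_ids)   # assumed: top popular videos per category
        self.policy = policy
        self.used = 0
        self.miss_count = 0
        self.sizes = OrderedDict()      # video_id -> size, least recently used first
        self.freq = {}                  # video_id -> hits while cached (assumption)

    def _pick_victim(self):
        # Only unpopular (unpinned) videos are eviction candidates.
        candidates = [v for v in self.sizes if v not in self.pinned]
        if not candidates:
            return None
        if self.policy == "lru":
            return candidates[0]
        # LFU: lowest count; ties go to the least recently used candidate.
        return min(candidates, key=lambda v: self.freq[v])

    def request(self, video_id, size):
        """Return True on a hit, False on a miss."""
        if video_id in self.sizes:
            self.sizes.move_to_end(video_id)
            self.freq[video_id] += 1
            return True
        self.miss_count += 1
        evictable = sum(s for v, s in self.sizes.items() if v not in self.pinned)
        if self.used - evictable + size > self.capacity:
            return False  # cannot fit even after evicting every unpopular video
        while self.used + size > self.capacity:
            victim = self._pick_victim()
            self.used -= self.sizes.pop(victim)
            del self.freq[victim]
        self.sizes[video_id] = size
        self.freq[video_id] = 1
        self.used += size
        return False
```

With `policy="lru"` the sketch behaves like the PLRU idea and with `policy="lfu"` like the PLFU idea; `miss_count` corresponds to the miss count metric named in the abstract.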

Author information

Corresponding author

Correspondence to Fang-Yie Leu.

About this article

Cite this article

Lee, MC., Leu, FY. & Chen, Yp. Pareto-based cache replacement for YouTube. World Wide Web 18, 1523–1540 (2015). https://doi.org/10.1007/s11280-014-0318-9

  • DOI: https://doi.org/10.1007/s11280-014-0318-9
