The LCD interconnection of LRU caches and its analysis

https://doi.org/10.1016/j.peva.2005.05.003

Abstract

In a multi-level cache such as those used for web caching, a hit at level l leads to the caching of the requested object in all intermediate caches on the reverse path (levels l−1, …, 1). This paper shows that a simple modification to this de facto behavior, in which only the level l−1 cache gets to store a copy, can lead to significant performance gains. The modified caching behavior is called Leave Copy Down (LCD); it avoids both the amplification of replacement errors and the unnecessary repeated caching of the same objects at multiple levels. Simulation results against other cache interconnections show that when LCD is applied under typical web workloads, it reduces the average hit distance. We construct an approximate analytic model for the case of LCD interconnection of LRU caches and use it to gain better insight into why the LCD interconnection yields improved performance.

Introduction

A cache is a fast-access memory that mediates between a consumer of information and a slower memory where information is stored on a permanent basis. The function of the cache is to maintain a set of the most valuable objects so that they may be accessed promptly, thus avoiding access to the slower and/or remote (permanent) memory. Caching is one of the most pervasive ideas in computer science. It has been studied and applied in many different domains: in computer architecture, to speed up the communication between the central processing unit (CPU) and the main memory (RAM); in operating systems, to perform paging, i.e., to keep in RAM the most valuable blocks from the permanent storage devices (hard disks); in distributed file systems, to keep frequently accessed files closer to the clients; and in the world wide web, to allow clients to receive web content from local proxy servers, thus avoiding access to the remote origin servers through the network.

Caches are often employed at multiple levels. Examples include multi-level cache architectures in modern CPUs, multi-level caches in RAID disk arrays, and multi-level caches in the world wide web. Under the modus operandi of such multi-level systems, requests are first received at the lowest level cache (the one closest to the client), and then routed upwards until they reach a cache that stores the requested object. A hit is said to have occurred in that case. Following the hit, the requested object is sent downwards on the reverse path to the client, and each cache on this path gets to store a local copy of the object.

Leaving copies everywhere on the reverse path (hereafter abbreviated LCE) has been considered the de facto behavior. Despite the vast bibliography on caching, we are aware of only a few works that have questioned this de facto behavior [1], [2], [3]. This paper continues this line of research, investigating whether caching a local copy in all intermediate caches on the reverse path is indeed a good idea, or whether there are reasons to revise it and instead keep copies only in a subset of intermediate caches. Our answer to this question is that, as far as web caching is concerned, LCE is not always the best choice and that simple alternative algorithms can outperform it in a variety of common scenarios. In [3], we proposed such an algorithm, which we call Leave Copy Down (LCD), that appears to be superior to LCE and other potential algorithms across a wide range of parameters. The operation of LCD is quite simple: instead of storing a copy in all intermediate caches, only the immediate downstream neighbor of the hit cache gets to store one. This way, objects move gradually from the origin server towards the clients, with each request advancing them by one hop.
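The difference between the two behaviors can be illustrated with a minimal sketch of a chain of LRU caches. The names `LRUCache` and `request` are hypothetical and chosen for illustration only; the sketch assumes a single path of caches with the origin server above the top level, and it is not the simulator used in [3].

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity LRU cache that stores object identifiers only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # least recently used first

    def lookup(self, obj):
        """Return True on a hit and mark obj as most recently used."""
        if obj in self.items:
            self.items.move_to_end(obj)
            return True
        return False

    def insert(self, obj):
        """Store obj as most recently used, evicting the LRU object if full."""
        if self.capacity <= 0:
            return
        self.items[obj] = True
        self.items.move_to_end(obj)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)


def request(caches, obj, policy):
    """Serve one request on a chain of caches (caches[0] is level 1).

    Returns the 1-based level of the hit, or len(caches) + 1 when the
    request is served by the origin server.
    """
    hit_level = len(caches) + 1
    for level, cache in enumerate(caches, start=1):
        if cache.lookup(obj):
            hit_level = level
            break
    if policy == "LCE":
        # Leave Copy Everywhere: every cache below the hit stores a copy.
        for cache in caches[:hit_level - 1]:
            cache.insert(obj)
    elif policy == "LCD":
        # Leave Copy Down: only the cache directly below the hit stores a copy.
        if hit_level > 1:
            caches[hit_level - 2].insert(obj)
    else:
        raise ValueError("policy must be 'LCE' or 'LCD'")
    return hit_level
```

With two levels, three consecutive requests for the same object are served from the origin server, then level 2, then level 1 under LCD, whereas under LCE the second request already hits at level 1.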

The current work focuses on LCD and takes an analytic look at its workings, with the aim of deepening our understanding of why this particular algorithm yields improved performance. Developing an appropriate analytic performance evaluation model makes clear why this is the case. The enhanced performance of LCD stems from: (1) its ability to avoid the amplification of replacement errors (it confines a replacement error to a single cache instead of allowing it to spread to an entire chain of caches); (2) its ability to provide for exclusive caching (it allows each cache on a chain of caches to hold a potentially different set of objects, thus avoiding the repeated replication of the same few objects). Analyzing an LCD interconnection of LRU caches requires introducing several approximation techniques in order to overcome problems such as the combinatorial hardness of analyzing LRU replacement, the correlation in the miss streams that flow from cache to cache, and the coupling of cache states under LCD (the state of a cache depends on the state of its downstream neighbor and vice versa). By carefully combining the various approximation techniques, our final analytic model is able to predict the performance of the real system satisfactorily.

The LCD interconnection is proposed here for the application of web caching; for this specific application we have provided extensive experimental results in [3]. The analysis in the current article, however, is much more general. Apart from the assumption that requests are not correlated, which is a typical assumption under which replacement algorithms are analyzed, we avoid making additional assumptions that are specific to web caching. Thus parts of our analysis may lend themselves to the analysis of other applications of caching as well (possibly with modifications when additional operating rules are introduced).

The remainder of the article is structured as follows. Section 2 introduces several cache interconnection algorithms (called meta algorithms), elaborates on the desired properties for such algorithms, and presents a performance comparison via simulation. Section 3 gives an overview of previous approaches for the analysis of LRU caching. Section 4 presents the basic theory for the analysis of isolated LRU caches; this theory is employed as a building block in later parts. Section 5 presents the analysis of LCD-interconnected tandems of LRU caches, and the required modifications to handle more general tree topologies. Section 6 demonstrates results from the application of the final analytic model. Section 7 concludes the article.

Section snippets

Meta algorithm for multi-level caches

The question of whether to cache an object at an intermediate cache is one that may be posed independently of the specific replacement algorithm operating on the cache. For this reason, the algorithms that are studied here may be characterized as meta algorithms for multi-level caches (or just meta algorithms) to differentiate them from the much discussed and well understood replacement algorithms, and to stress the fact that they operate independently of the latter.

In the following, we

Analytic models of LRU in the literature

To put the presented LCD/LRU analytic model for interconnected caches into perspective, we first review previous attempts to model LRU caching. We focus on analyses assuming independent identically distributed requests (the aforementioned IRM), that attempt to derive the expected behavior of LRU in steady-state; the reader is referred to Motwani and Raghavan [16] for analyses from the perspective of theoretical computer science, aiming at establishing worst case performance bounds for the

The Che et al. approximation for individual LRUs

This section presents the approximate analytic model for individual LRU caches that has been proposed recently by one of the authors (H. Che) in [1]. The model and its concepts will be used as building blocks on various occasions during the analysis of interconnected LCD/LRU caches that follows in subsequent sections. To this end, the presentation is adapted to the requirements of the current work. At certain points, it even adds details which do not appear in the original paper, with the aim
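For orientation, the approximation for an isolated LRU cache is commonly stated as follows: under independent requests with per-object request probability p_i, object i is found in a cache of capacity C with probability approximately 1 − exp(−p_i · t_C), where the characteristic time t_C solves Σ_i (1 − exp(−p_i · t_C)) = C. The sketch below assumes this commonly cited form, which may differ in detail from the presentation in this article; the function names are hypothetical.

```python
import math


def characteristic_time(probs, capacity, rel_tol=1e-9):
    """Solve sum_i (1 - exp(-p_i * t)) = capacity for t by bisection."""
    positive = sum(1 for p in probs if p > 0)
    if not 0 < capacity < positive:
        raise ValueError("capacity must lie strictly between 0 and the "
                         "number of objects with positive probability")

    def occupancy(t):
        # Expected number of distinct objects requested within time t.
        return sum(1.0 - math.exp(-p * t) for p in probs)

    lo, hi = 0.0, 1.0
    while occupancy(hi) < capacity:  # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if occupancy(mid) < capacity:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lru_hit_probabilities(probs, capacity):
    """Approximate per-object hit probabilities of an isolated LRU cache."""
    t_c = characteristic_time(probs, capacity)
    return [1.0 - math.exp(-p * t_c) for p in probs]
```

By construction the per-object hit probabilities sum to the cache capacity, which is a quick sanity check on any implementation of this approximation.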

Analysis of a two-level LCD/LRU tandem

This section develops an approximate analytic model for LCD-interconnected LRU caches.

Numerical results

To assess the predictive power of the analytic model for the LCD/LRU tandem, analytic numerical results are compared against simulation results. The comparison is carried out under Zipf-like requests. The topology is as shown in Fig. 2, and the access distances are d_1 = 0, d_2 = 1, d_os = 2 (level 1, level 2, and origin server, respectively). Such a selection of distances amounts to a hop-count distance, where the clients are co-located with the level 1 cache. The average hit distance is computed using Eq. (23).
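As a rough illustration of the quantities involved, the sketch below builds a Zipf-like request distribution and computes an average hit distance as the distance-weighted mean over where requests are served. Eq. (23) is not reproduced in this excerpt, so this is only the generic expectation and not necessarily the article's exact expression; the function names and the sample hit fractions are assumptions made for illustration.

```python
def zipf_like(n_objects, alpha):
    """Request probabilities p_i proportional to 1 / i**alpha, i = 1..n_objects."""
    weights = [1.0 / rank ** alpha for rank in range(1, n_objects + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def average_hit_distance(hit_fractions, distances):
    """Distance-weighted mean over the locations that serve requests.

    hit_fractions[k] is the fraction of requests served at location k
    (level 1, level 2, ..., origin server) and must sum to 1.
    """
    if len(hit_fractions) != len(distances):
        raise ValueError("one distance is needed per serving location")
    if abs(sum(hit_fractions) - 1.0) > 1e-9:
        raise ValueError("hit fractions must sum to 1")
    return sum(f * d for f, d in zip(hit_fractions, distances))


# Hypothetical example: 50% of requests hit at level 1, 30% at level 2,
# and 20% go to the origin server, with distances d_1 = 0, d_2 = 1, d_os = 2.
example = average_hit_distance([0.5, 0.3, 0.2], [0, 1, 2])
```

With hop-count distances and clients co-located with the level 1 cache, level 1 hits contribute nothing to the average, so the metric is driven entirely by the traffic that reaches level 2 and the origin server.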

We have

Conclusions

This paper has taken an analytic look at the LCD way of interconnecting LRU caches. Despite its simplicity, LCD has demonstrated surprisingly good performance under various workloads and interconnection topologies. Aiming at explaining the results of simulation experiments, we have constructed an analytic performance evaluation model. Despite the simplicity of the LCD algorithm itself, its exact analysis under LRU replacement is far beyond the borders of tractability. For this reason, we have


References (24)

  • A. Mahanti et al., Temporal locality and its impact on web proxy cache performance, Perform. Eval. (2000)
  • D. Starobinski et al., Probabilistic methods for web caching, Perform. Eval. (2001)
  • P. Flajolet et al., Birthday paradox, coupon collectors, caching algorithms and self-organizing search, Discrete Appl. Math. (1992)
  • H. Che et al., Hierarchical web caching systems: modeling, design and experimental results, IEEE J. Selected Areas Commun. (2002)
  • T.M. Wong et al., My cache or yours? Making storage more exclusive
  • N. Laoutaris et al., Meta algorithms for hierarchical web caches
  • S. Podlipnig et al., A survey of web cache replacement strategies, ACM Comput. Surveys (2003)
  • L. Ramaswamy et al., An expiration age-based document placement scheme for cooperative web caching, IEEE Trans. Knowledge Data Eng. (2004)
  • A. Mahanti, C. Williamson, D. Eager, Traffic analysis of a web proxy caching hierarchy, IEEE Network Magazine (2000)
  • X. Tang et al., Coordinated en-route web caching, IEEE Trans. Comput. (2002)
  • Kang-Won Lee, Sambit Sahu, Khalil Amiri, Chitra Venkatramani, Understanding the potential benefits of cooperation among...
  • M.R. Korupolu et al., Coordinated placement and replacement for large-scale distributed caches, IEEE Trans. Knowledge Data Eng. (2002)

    Nikos Laoutaris received the Ph.D. degree from the Department of Informatics and Telecommunications of the University of Athens, Greece, in 2004, for his work in the area of Content Networking. He also holds an M.Sc. degree in Telecommunications and Computer Networks (2001) and a B.Sc. degree in Computer Science (1998), both from the same department. His main research interests are in the analysis of algorithms and the performance evaluation of Internet content distribution systems (CDN, P2P, web caching) and multimedia streaming applications.

    Hao Che received the B.S. degree from Nanjing University, Nanjing, China, in 1984, the M.S. degree in physics from the University of Texas at Arlington, TX, in 1994, and Ph.D. degree in electrical engineering from the University of Texas at Austin, TX, in 1998. He was an Assistant Professor of Electrical Engineering at the Pennsylvania State University, University Park, PA, from 1998 to 2000, and a System Architect with Santera Systems, Inc., Plano, TX, from 2000 to 2002. Since September 2002, he has been an Assistant Professor of Computer Science and Engineering at the University of Texas at Arlington, TX. His current research interests include network architecture and design, network resource management, multiservice switching architecture, and network processor design.

    Prof. Ioannis Stavrakakis: Diploma in Electrical Engineering, Aristotelian University of Thessaloniki (Greece), 1983; Ph.D. in EE, University of Virginia (USA), 1988; assistant professor in CSEE, University of Vermont (USA), 1988–1994; associate professor of ECE, Northeastern University, Boston (USA), 1994–1999; associate professor of Informatics and Telecommunications, University of Athens (Greece), 1999–2002 and professor since 2002. Teaching and research interests are focused on resource allocation protocols and traffic management for communication networks, with recent emphasis on continuous media applications and ad hoc networking. His past research has been published in over 100 scientific journals and conference proceedings and was funded by NSF, DARPA, GTE, BBN and Motorola (USA) as well as Greek and European Union (IST) funding agencies. He has served repeatedly on NSF and IST research proposal review panels and has been involved in the organization of numerous conferences sponsored by IEEE, ACM, ITC and IFIP societies. He is a senior member of IEEE, a member of (and has served as an elected officer for) the IEEE Technical Committee on Computer Communications (TCCC) and the chairman of IFIP WG6.3. He has served as a co-organizer of the 1996 International Teletraffic Congress (ITC) Mini-Seminar, the organizer of the 1999 IFIP WG6.3 workshop, a technical program co-chair for the IFIP Networking’2000 conference, the Vice-General Chair for the Networking’2002 conference and the organizer of the COST-IST(EU)/NSF(USA)-sponsored NeXtworking’03. He is an associate editor for the IEEE/ACM Transactions on Networking, the ACM/Baltzer Wireless Networks Journal and the Computer Networks Journal.

    The work of N. Laoutaris and I. Stavrakakis has been supported in part by the IST Program of the European Union under contracts IST-6475 (ACCA) and FP6-506869 (E-NEXT). The work of H. Che has been supported by NSF under grant ANI-0125653.
