Elsevier

Computer Networks

Volume 51, Issue 11, 8 August 2007, Pages 2917-2937
Mistreatment-resilient distributed caching

https://doi.org/10.1016/j.comnet.2006.11.030

Abstract

The distributed partitioning of autonomous, self-aware nodes into cooperative groups, within which scarce resources could be effectively shared for the benefit of the group, is increasingly emerging as a hallmark of many newly-proposed overlay and peer-to-peer applications. Distributed caching protocols, in which group members cooperate to satisfy local requests for objects, are a canonical example of such applications. In our recent work, we identified mistreatment as a potentially serious problem for nodes participating in such cooperative caching arrangements. Mistreatment materializes when a node’s access cost for fetching objects worsens as a result of cooperation. To that end, we outlined an emulation-based framework for the development of mistreatment-resilient distributed selfish caching schemes. Under this framework, a node opts to participate in the group only if its individual access cost is less than the one achieved while in isolation. In this paper, we argue against the use of such static “all or nothing” approaches, which force an individual node to either join or not join a cooperative group. Instead, we advocate the use of a smoother approach, whereby the level of cooperation is tied to the benefit that a node derives from joining a group. To that end, we propose a distributed and easily deployable feedback-control scheme which mitigates mistreatment. Under our proposed adaptive scheme, a node independently emulates its performance as if it were acting in a greedy local manner and then adapts its caching policy in the direction of reducing its measured access cost below its emulated greedy local cost. Using control-theoretic analysis, we show that our proposed scheme converges to the minimal access cost, and indeed outperforms any static scheme. We also show that our scheme results in insignificant degradation to the performance of the caching group under typical operating scenarios.

Introduction

Network applications often rely on distributed resources available within a cooperative grouping of nodes to ensure scalability and efficiency. As is typical in many applications such as web server farms or content distribution networks, the grouping of nodes is dictated by a common strategic objective, and as such, the payoff from cooperation is assessed by the overall benefit to the group as opposed to the benefit reaped by individual nodes in the group (which in this case are not presumed to be selfish). More recently, however, new classes of network applications have emerged for which the grouping of nodes is more “ad hoc” in the sense that it is not dictated by organizational boundaries or strategic goals. Examples include the various overlay and peer-to-peer (P2P) applications [2], [3]. For such applications, the grouping of nodes is not governed by a common objective, but rather by the individual (selfish) objectives of the constituent nodes. Under such a setting, and as we have shown in our prior work [4], [5], it is possible for a node (or a set of nodes) to be “mistreated” in the sense that its participation in the group (while advantageous to the group) would not be advantageous to its own objective. In this paper, we show how to design mistreatment-resilient cooperative applications.

As part of our recent work on mistreatment in distributed cooperative settings [4], [5], we focused on content networking applications, whereby the distributed resource being shared amongst a group of nodes is storage. In particular, we considered a group of nodes that store information objects and make them available to their local users as well as to remote nodes. A user’s request is first received by the local node. If the requested object is stored locally, it is returned to the requesting user immediately, thereby incurring a minimal access cost. Otherwise, the requested object is searched for in, and fetched from, the other nodes of the group, at a potentially higher access cost. If the object cannot be located anywhere in the group, it is retrieved from an origin server, which is assumed to be outside the group, thus incurring a maximal access cost. Contrary to most previous work in the field, we considered selfish nodes, i.e., nodes that cater strictly and only to the minimization of the access cost for their local client population (disregarding any consequences for the performance of the group as a whole).
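The three-level lookup order described in this paragraph (local node, then the group, then the origin server) can be illustrated with a minimal Python sketch. The names `AccessCosts`, `access_cost` and `expected_access_cost`, as well as the numeric cost values in the usage note, are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessCosts:
    """Hypothetical per-request costs; only their ordering matters."""
    local: float   # object found at the local node (minimal cost)
    remote: float  # object fetched from another node of the group
    origin: float  # object fetched from the origin server (maximal cost)


def access_cost(obj, local_cache, group_caches, costs):
    """Cost of one request, following the local -> group -> origin order."""
    if obj in local_cache:
        return costs.local
    if any(obj in cache for cache in group_caches):
        return costs.remote
    return costs.origin


def expected_access_cost(demand, local_cache, group_caches, costs):
    """Average cost per request for a demand given as {object: probability}."""
    return sum(
        prob * access_cost(obj, local_cache, group_caches, costs)
        for obj, prob in demand.items()
    )
```

For instance, with assumed costs `AccessCosts(local=0.0, remote=1.0, origin=10.0)`, a node that caches only object `"a"` while some other group member caches `"b"` pays the remote cost for every request to `"b"` and the origin cost for any object that nobody in the group stores.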

In [4], [5] we established the vulnerability of many socially optimal (SO) object replication/caching schemes to mistreatment problems. A mistreated node was defined as a node whose access cost under some cooperative scheme is higher than the corresponding minimal access cost that the node can guarantee for itself by being uncooperative. Unlike centrally designed/controlled groups where all constituent nodes have to abide by the ultimate goal of optimizing the social utility of the group, an autonomous, selfish node will not tolerate such a mistreatment. Indeed, the emergence of such mistreatments may cause selfish nodes to secede from the replication group, resulting in severe inefficiencies for both the individual users as well as the entire group.

Proactive replication strategies such as those studied in [4] are not practical in a highly dynamic content networking setting, which is likely to be the case for most of the Internet overlays and P2P applications we envision. There are several reasons for this: (1) Fluid group membership makes it impractical for nodes to decide what to replicate based on what (and where) objects are replicated in the group. (2) Access patterns as well as access costs may be highly dynamic (due to bursty network/server load), necessitating that the selection of replicas and their placement be done continuously, which is not practical. (3) Both the identification of the appropriate re-invocation times [6] and the estimation of the non-stationary demands (or equivalently, the timescale for a stationarity assumption to hold) [7] are non-trivial problems. (4) Content objects may be dynamic and/or may expire, necessitating the use of “pull” (i.e., on-demand caching) as opposed to “push” (i.e., pro-active replication) approaches. Using on-demand caching is the most widely accepted and natural solution to all of these issues because it requires no a priori knowledge of local/group demand patterns and, as a consequence, responds dynamically to changes in these patterns over time (e.g., introduction of new objects, reduction in the popularity of older ones, etc.).

Therefore, in [5] we considered the problem of Distributed Selfish Caching (DSC), which could be seen as the on-line equivalent of the Distributed Selfish Replication (DSR) problem [4]. In DSC, we adopted an object caching model, whereby a node used demand-driven temporary storage of objects, combined with replacement. Based on that model, we uncovered the operational characteristics of a DSC group that can give rise to mistreatment problems. And, while we argued that under stationary conditions, simple parametric versions of already established protocols and mechanisms are capable of mitigating these problems, we did not prescribe an integrated approach for regulating these parameters in a manner that adapts to the constantly changing conditions of the group (e.g., varying group size, node capacities, delays, and demand patterns).

In this paper, we take our work one significant, constructive step further by proposing a general control-theoretic framework, which enables the parametrization of DSC protocols so as to make these protocols resilient to mistreatment, even under the aforementioned fluid group conditions. Through extensive analysis and simulation experiments, we show that our adaptive scheme not only mitigates mistreatment that may evolve in a distributed caching group, but also guarantees an access cost for each individual node that is lower than the one achieved by any static scheme. We also show that the impact of this adaptive scheme on the performance of the distributed caching group is minimal under typical operating scenarios.
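To make the feedback idea concrete, the following is a minimal Python sketch of a single adaptation step, under stated assumptions. It assumes that the tunable parameter is the q of the LRU(q) scheme named in the evaluation section, with q = 0 standing for full cooperation and q = 1 for no cooperation, and it uses a plain integral-style update rather than the PID controller of Section 5.5, whose details are not part of this excerpt. The function name `update_q` and the `gain` parameter are hypothetical.

```python
def update_q(q, measured_cost, greedy_cost, gain=0.1):
    """One integral-style adaptation step (illustrative assumption only).

    q             -- current cooperation parameter in [0, 1];
                     assumed: 0 = full cooperation, 1 = no cooperation
    measured_cost -- access cost the node currently observes in the group
    greedy_cost   -- cost the node emulates for greedy local operation
    gain          -- hypothetical step size of the controller
    """
    if greedy_cost <= 0:
        raise ValueError("greedy_cost must be positive")
    # Positive error: the node does worse than its emulated greedy local
    # behaviour (it is being mistreated), so it cooperates less (q grows).
    # Negative error: cooperation pays off, so q moves back toward 0.
    error = (measured_cost - greedy_cost) / greedy_cost
    return min(1.0, max(0.0, q + gain * error))
```

Calling `update_q` once per measurement interval moves q toward 1 while the measured cost exceeds the emulated greedy local cost and back toward 0 otherwise; the clamping keeps q inside its valid range.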

The rest of the paper is organized as follows. Section 2 summarizes related work in cooperative caching and dynamic schemes used to optimize caching. In Section 3 we describe our model of a distributed caching group. In Section 4 we demonstrate the causes of mistreatment in distributed caching groups. The design of a generic feedback controller for the mitigation of mistreatment is covered in Section 5. In Section 6 we argue that a node equipped with our feedback controller achieves the minimum access cost (compared to any static scheme). In Section 7, we study the impact of such a controller on the overall performance of the distributed group. We evaluate the performance of our controller with extensive simulations in Section 8. Section 9 concludes the paper.

Related work

Cooperative caching [8], [9] allows multiple caches to cooperate in servicing each other’s requests. Monitoring and controlling the number of copies of the same documents across different caches has been studied in [10] for the web and [11] for wireless ad hoc networks. All aforementioned studies were concerned with the minimization of the aggregated cost of the group. Apart from our previous work [5], [1], we are aware of only two additional works that deal with dynamic schemes to optimize

Model of a distributed caching group

In this section we present the model of a distributed caching group that we consider in our study. Let $o_i$, $1 \le i \le N$, and $v_j$, $1 \le j \le n$, denote the $i$th unit-sized object and the $j$th node, and let $O = \{o_1, \ldots, o_N\}$ and $V = \{v_1, \ldots, v_n\}$ denote the corresponding sets. Node $v_j$ is assumed to have storage capacity for up to $C_j$ unit-sized objects, a total request rate $\lambda_j$ (total number of requests per unit time, across all objects), and a demand described by a probability distribution over $O$, $p_j = \{p_{1j}, \ldots, p_{Nj}\}$, where $p_{ij}$

Mistreatment in distributed caching groups

The examination of the operational characteristics of a group of nodes involved in a distributed caching solution enabled us to identify two key culprits for the emergence of mistreatment phenomena [5]: (1) the use of a common caching scheme across all the nodes of the group, irrespective of the particular capabilities and characteristics of each individual node, and (2) the mutual state interaction between replacement algorithms running on different nodes.

Towards mistreatment-resilient caching

From the exposition so far, it should be clear that there exist situations under which an inappropriate, or enforced, scheme may mistreat some of the nodes. While we have focused on detecting and analyzing two causes of mistreatment which appear to be important (namely, due to the adoption of a common cache management scheme and cache state interactions), it should be evident that mistreatments may well arise through other causes. For example, we have not investigated the possibility of

Convergence of the controller

In this section we argue that the access cost of a node equipped with our adaptive mechanism converges to a value that is lower than the one under any static scheme, and we analytically estimate this value. We consider the scenario with the outlier node that was presented in Section 4.1.

The effect of individual controllers on the overall performance of the group

In this section, we turn our attention to the performance implications resulting from the use of individual-cost minimizing controllers at different nodes. In particular, we look at the impact on the (global) group’s performance by comparing the aggregated steady-state access cost of the distributed caching group ($AAC_{PID}$) when all constituent nodes are equipped with the PID controller described in Section 5.5 with the corresponding cost of the distributed caching group ($AAC$) when nodes are not

Evaluation of the controller

In order to evaluate our adaptive scheme, we compare its steady-state average access cost to the corresponding cost of one of the two extreme static schemes (LRU(q = 0) or LRU(q = 1)), corresponding to full- or no-cooperation, respectively. To that end, we define the following performance metric:

$$\text{minimum cost reduction}\,(\%) = 100 \cdot \frac{\mathrm{cost}_{\mathrm{static}} - \mathrm{cost}_{\mathrm{adaptive}}}{\mathrm{cost}_{\mathrm{static}}},$$

where $\mathrm{cost}_{\mathrm{adaptive}}$ is the access cost of our adaptive mechanism, and $\mathrm{cost}_{\mathrm{static}}$ is the minimum cost of the two static schemes: $\mathrm{cost}_{\mathrm{static}} = \min($
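As a worked illustration of this metric, the short Python sketch below computes the minimum cost reduction from three steady-state costs. The function and argument names are hypothetical, and the numbers in the usage note are made up for illustration; they are not results from the paper.

```python
def min_cost_reduction_pct(cost_adaptive, cost_full_coop, cost_no_coop):
    """Minimum cost reduction (%) of the adaptive scheme.

    cost_full_coop and cost_no_coop stand for the steady-state costs of the
    two extreme static schemes, LRU(q = 0) and LRU(q = 1); the adaptive
    scheme is compared against the better (cheaper) of the two.
    """
    cost_static = min(cost_full_coop, cost_no_coop)
    if cost_static <= 0:
        raise ValueError("static cost must be positive")
    return 100.0 * (cost_static - cost_adaptive) / cost_static
```

With illustrative costs of 8.0 for the adaptive scheme and 10.0 and 12.0 for the two static schemes, the better static cost is 10.0 and the metric evaluates to 20%. A negative value would mean that the adaptive scheme is costlier than the better static scheme.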

Conclusion

Recent work in the literature [5] has uncovered the susceptibility of nodes participating in a distributed caching group to being mistreated, when a node’s access cost for fetching objects while participating in the group is higher than that accrued when operating in isolation. Mistreatment emerges as a result of the adoption of a common caching scheme or as a result of the mutual state interaction between replacement algorithms used by members of the group. Preliminary mistreatment-resilient

Georgios Smaragdakis received the Diploma in Electronic and Computer Engineering from the Technical University of Crete, Crete, Greece. He is currently working toward the Ph.D. degree in Computer Science at Boston University, Boston, MA. His research interests include the design and analysis of Scalable Network Systems with main applications in Overlay Network Creation, Maintenance, and Resource Allocation and Sharing.

References (31)

  • S. Jin, A. Bestavros, Sources and characteristics of Web temporal locality, in: Proceedings of Mascots’2000: The...
  • P. Rodriguez et al., Analysis of web caching architectures: hierarchical and distributed caching, IEEE/ACM Transactions on Networking (2001)
  • M.R. Korupolu et al., Coordinated placement and replacement for large-scale distributed caches, IEEE Transactions on Knowledge and Data Engineering (2002)
  • L. Fan et al., Summary cache: a scalable wide-area web cache sharing protocol, IEEE/ACM Transactions on Networking (2000)
  • L. Yin, G. Cao, Supporting cooperative caching in ad hoc networks, in: Proceedings of the Conference on Computer...
    Nikos Laoutaris received the Ph.D. degree from the Department of Informatics and Telecommunications of the University of Athens, Greece, in 2004, for his work in the area of Content Networking. He also holds an M.Sc. degree in Telecommunications and Computer Networks (2001) and a B.Sc. degree in Computer Science (1998), both from the same department. His main research interests are in the analysis of algorithms and the performance evaluation of Internet content distribution systems (CDN, P2P, web caching) and multimedia streaming applications. He is currently a Marie Curie Outgoing International post-doctoral fellow at the Computer Science Department of Boston University.

    Azer Bestavros obtained his Ph.D. in Computer Science from Harvard University in 1992. He is Professor and Chair of CS at Boston University. His research interests are in networking and real-time systems. His seminal works include his generalization of classical rate-monotonic analysis to accommodate probabilistic guarantees, his pioneering of the push model for Internet content distribution adopted years later by CDNs, and his characterization of Web traffic self-similarity and reference locality. His research work has culminated so far in 10 Ph.D. theses, over 80 masters and undergraduate student projects, five US patents, and two startup companies. With over 3000 citations, CiteSeer ranks him in the top 5% of its list of 10,000 most-cited authors, and since 1999, WebBib has ranked his publications as constituting one of the top three bodies of web-related research by a single author. His research has been funded by over $15M of government and industry grants. He has served as general chair, PC chair, officer, or PC member of most major conferences in networking and real-time systems. He received distinguished service awards from both the ACM and the IEEE, and is a distinguished speaker of the IEEE.

    Ibrahim Matta received his Ph.D. in computer science from the University of Maryland at College Park in 1995. He is an associate professor of computer science at Boston University. His research involves routing and transport protocols, focusing on resiliency and safety aspects. He has published over 70 refereed technical papers, and was guest co-editor of three special journal issues. He received the National Science Foundation CAREER award in 1997. He is on the Editorial Board of the Computer Networks Journal. He was General Chair of WiOpt’06, Technical Program Co-chair of ICNP’05, Technical Program Co-chair of SenMetrics’05, Internet Co-chair of Infocom’05, Publication Chair of Infocom’03, and Tutorial and Panel Chair of Hot Interconnects’01. He was co-organizer and Technical Program Co-chair of the EU-US NeXtworking’03.

    Ioannis Stavrakakis received the Diploma in Electrical Engineering from the Aristotelian University of Thessaloniki (Greece) in 1983 and the Ph.D. in EE from the University of Virginia (USA) in 1988. He was an Asst. Prof. in CSEE, University of Vermont (USA), 1988–1994; an Assoc. Prof. of ECE, Northeastern University, Boston (USA), 1994–1999; and an Assoc. Prof. of Informatics and Telecommunications, University of Athens (Greece), 1999–2002, where he has been a Prof. since 2002. His teaching and research interests are focused on resource allocation protocols and traffic management for communication networks, with recent emphasis on peer-to-peer, wireless, sensor and ad hoc networking. His past research has been published in over 130 scientific journals and conference proceedings and was funded by NSF, DARPA, GTE, BBN and Motorola (USA) as well as Greek and European Union (IST) funding agencies. He has served repeatedly on NSF and IST research proposal review panels and has been involved in the organization of numerous conferences sponsored by IEEE, ACM, ITC and IFIP societies. He is a Fellow of the IEEE, a member of (and has served as an elected officer for) the IEEE Technical Committee on Computer Communications (TCCC) and the chairman of IFIP WG6.3. He has served as a co-organizer of the 1996 International Teletraffic Congress (ITC) Mini-Seminar, the organizer of the 1999 IFIP WG6.3 workshop, a technical program co-chair for the IFIP Networking’2000, EWC’04 and IFIP WiOpt’05 conferences, the Vice-General Chair for the Networking’2002 conference, and the organizer of the COST-IST(EU)/NSF(USA)-sponsored NeXtworking’03 and the Workshop on Autonomic Communications (WAC2005). He is an associate editor for the IEEE/ACM Transactions on Networking, the ACM/Baltzer Wireless Networks Journal and the Computer Networks Journal.

    A. Bestavros and I. Matta are supported in part by a number of NSF awards, including CNS Cybertrust Award #0524477, CNS NeTS Award #0520166, CNS ITR Award #0205294, and EIA RI Award 0202067. I. Stavrakakis is supported in part by EU IST projects CASCADAS and E-NEXT. N. Laoutaris is supported by a Marie Curie Outgoing International Fellowship of the EU MOIF-CT-2005-007230. A preliminary version of this work appeared in the proceedings of 2006 IFIP Networking Conference [1].
