Computer Communications

Volume 30, Issue 16, 3 November 2007, Pages 3107-3116

Design of a peer-to-peer system for optimized content replication

https://doi.org/10.1016/j.comcom.2007.05.041

Abstract

This paper introduces a peer-to-peer (p2p) system for content replication with built-in optimization techniques. These techniques improve the overall system performance and make better use of the available resources. In addition, they can help mitigate leeching, i.e., the consumption of remote users' resources without contributing anything in return. Simulations compare the proposed optimized strategy to more conventional ones.

Introduction

Many applications in the modern Internet rely on a client–server framework. Client–server architectures are built around a centralized entity that receives requests, processes them according to some policy, and sends an answer back to the original requestor. Although easy to deploy, such architectures have notable weaknesses: the central entity is a single point of failure and limits scalability. The antithetical approach is the distributed one: centralized components are reduced to the minimum (when not completely absent), or employed only for coordination purposes. Consequently, the core functionality is spread among all the network participants. This concept leads to peer-to-peer (p2p) systems, where all hosts have the same capabilities and responsibilities; accordingly, all entities are called peers. To date, p2p systems have mainly been adopted for file-sharing applications [1], [2], which have showcased their core characteristics, such as:

  • Owing to their highly autonomic and distributed nature, they are intrinsically resilient to churn, i.e., the continuous process of peer arrival and departure;

  • They scale remarkably well, seamlessly supporting from tens up to millions of users;

  • Many p2p systems are prone to selfish user behavior, since users can stop participating as soon as they complete the download of a content, thus leaving without contributing data. Nevertheless, several effective mechanisms have been introduced to raise users' trust and participation in such systems.

Moreover, a recent trend in p2p file-sharing services is to employ a centralized component that keeps track of users and provides coordination functionalities.

Exploiting the presence of this centralized component, this work addresses the optimization of the content replication process from the point of view of the tracker, i.e., we discuss how the performance of a typical content-sharing scheme can be improved, in terms of measures of interest such as download times, uniform distribution of the chunks, and amount of transferred data, depending on how the peers' bandwidths are allocated throughout the process.

Given the number of peers involved in the system and the complexity of the swarming process, it is crucial to exploit the shared resources efficiently, which in turn reduces download costs and times for all peers.

To this end, we model the evolution of the process as a discrete-time system in which decisions are taken by the tracker at the beginning of each temporal stage. We adopt a single-stage optimization approach because peers can enter and leave the system unpredictably at any time, which prevents the decision maker from planning over a horizon of stages.
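As a concrete illustration of this per-stage decision scheme, the following minimal Python sketch shows the tracker loop; the names get_swarm_state, optimize_single_stage, and apply_allocation are illustrative placeholders, not the paper's actual interfaces.

    def tracker_loop(stages, get_swarm_state, optimize_single_stage, apply_allocation):
        """At the beginning of each stage the tracker observes the current swarm
        state x(t) and solves a single-stage problem; no planning over a horizon
        of stages is attempted because peers may join or leave unpredictably."""
        for t in range(stages):
            x_t = get_swarm_state(t)          # chunk ownership, peer bandwidths, etc.
            u_t = optimize_single_stage(x_t)  # source/chunk/rate assignment for stage t
            apply_allocation(t, u_t)          # tracker instructs the peers for this stage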

To cope with the intrinsic complexity of the problem without resorting to mixed-integer nonlinear programming, which would be infeasible under the strict time constraints, we employ a hybrid random/nonlinear programming scheme that aims at finding (sub)optimal solutions quickly and with an affordable computational effort.
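A minimal Python sketch of such a hybrid scheme is given below, assuming that the combinatorial part of the decision (which peer serves which chunk to whom) is drawn at random and that, for each feasible sample, the continuous transfer rates are refined by a standard nonlinear-programming solver; all callables are hypothetical placeholders rather than the paper's actual routines.

    def hybrid_optimize(x_t, sample_assignment, is_feasible, optimize_rates, cost,
                        n_samples=50):
        """One hybrid random/nonlinear programming step: sample the discrete
        assignment at random, tune the continuous rates with an NLP solver,
        and keep the best (assignment, rates) pair found."""
        best_u, best_cost = None, float("inf")
        for _ in range(n_samples):
            theta = sample_assignment(x_t)       # random source/chunk assignment
            if not is_feasible(theta, x_t):      # admissibility (Definition 2)
                continue
            q = optimize_rates(theta, x_t)       # continuous NLP over the rates q
            c = cost(x_t, theta, q)              # single-stage cost J(x(t), u)
            if c < best_cost:
                best_u, best_cost = (theta, q), c
        return best_u

In a sketch of this kind, the number of random samples trades solution quality for computation time, which is consistent with the strict per-stage time budget mentioned above.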

Moreover, to assess the efficiency of the proposed method and the importance of optimization for p2p systems, we compare the optimization algorithm against "fair" strategies that resemble the resource allocation strategies implemented by widely deployed p2p systems relying on centralized components, such as BitTorrent.
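For reference, a simple "fair" baseline can be sketched as an even split of each uploader's capacity among its current requesters; the data structures below are illustrative assumptions, and the scheme only loosely mimics BitTorrent-like allocation rather than reproducing it.

    def fair_allocation(uploaders, downloads):
        """Even-split baseline: each uploader divides its upload bandwidth
        equally among the peers currently downloading from it.
        uploaders: peer id -> upload bandwidth (illustrative structure)
        downloads: peer id -> list of requesting peers (illustrative structure)"""
        rates = {}
        for peer, bandwidth in uploaders.items():
            requesters = downloads.get(peer, [])
            if not requesters:
                continue
            share = bandwidth / len(requesters)   # even split of the upload capacity
            for r in requesters:
                rates[(peer, r)] = share
        return rates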

The remainder of the paper is structured as follows: Section 2 deals with the background and the related work in content replication with p2p techniques. Section 3 explains the system architecture and offers a quick comparison with the de-facto standard system for p2p content replication. Section 4 introduces the analytical model adopted to develop optimization techniques as well as the numerical analysis, with an extension to congested environments. Section 5 presents the optimization procedure adopted in this paper. Section 6 contains experimental results focused on the comparison between the optimization scheme and a control scheme that is close to the typical behavior of real p2p systems. Section 7 concludes the paper.

Section snippets

Background and related work

As mentioned, despite their popularity, recent p2p technologies have mainly been developed for file-sharing systems, resulting in a lack of standardization. Classifying p2p systems is therefore difficult, but they can be grouped into three kinds: unstructured, structured, and hybrid. In unstructured systems, such as Gnutella [3], peers organize themselves without any external enforcement, resulting in a highly chaotic infrastructure; as a consequence, it is hard to locate a resource. Hence,

Architecture of the system

Before introducing the analytical model and the optimization method, we discuss the architecture adopted as a foundation for the content replication service. The framework has its roots in typical deployed p2p systems, such as BitTorrent. The latter was introduced in 2002 to deliver a content distribution system immune to the scalability problems of classical centralized solutions. Briefly, BitTorrent is not a classical file-sharing system, since its purpose is solely to

Description of the model

In this section we introduce the model used to describe the evolution of the swarming process and to optimize the system performance. We start by summarizing the notation used throughout the rest of the work.

Optimization of the model

Once the performance index to be optimized has been chosen, the optimization problem at stage $t$ becomes: find
$$\min_{u \in \{1,\ldots,N\}^{N_S} \times \{1,\ldots,K\}^{N_S} \times \mathbb{R}^{N_S}} J(x(t), u)$$
subject to the following constraints:

  • (1) The network must be admissible: $\Theta(t) \in A$, where $A = \{\Theta \in \{1,\ldots,N\}^{N_S} \times \{1,\ldots,K\}^{N_S} \text{ such that } \Theta \text{ is feasible according to Definition 2}\}$.

  • (2) Transfer rates must be non-negative: $q_i(t) \ge 0$ for all $i$.

  • (3) The total upload rate of each peer $i$ must not exceed $b_i^u$: $\sum_{s=1}^{S} q_{i,s}(t) \le 1$ for all $i$.

  • (4) The total download rate of each peer $i$ must not exceed $b_i^d$: $\sum_{j \ne i} \sum_{s} R_{j,i}\, q_{j,s}(t) \le b_i^d$ for all $i$.

(A feasibility-check sketch for constraints (2)-(4) is given after the list.)
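To make the constraint set concrete, the following minimal Python sketch checks a candidate rate allocation against constraints (2)-(4). The dictionary-based structures (q, R, peers, b_down) and the normalization of upload shares are assumptions made for illustration, not the paper's definitions; the admissibility constraint (1) is omitted because it depends on Definition 2, which is not reproduced in this snippet.

    def satisfies_rate_constraints(q, R, peers, b_down):
        """q[(i, s)]: share of peer i's upload capacity devoted to transfer s
        (assumed normalized, so a peer's shares sum to at most 1).
        R[(j, i)]: per-share rate received by peer i from peer j (assumption).
        b_down[i]: download capacity b_i^d of peer i."""
        # (2) transfer rates must be non-negative
        if any(rate < 0 for rate in q.values()):
            return False
        # (3) each peer's upload shares must sum to at most 1 (its capacity b_i^u)
        for i in peers:
            if sum(rate for (p, _s), rate in q.items() if p == i) > 1.0:
                return False
        # (4) the aggregate rate flowing into peer i must not exceed b_i^d
        for i in peers:
            incoming = sum(R.get((j, i), 0.0) * rate
                           for (j, _s), rate in q.items() if j != i)
            if incoming > b_down[i]:
                return False
        return True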

Simulations

To prove the effectiveness of our optimized system, simulation tests were performed. Six different scenarios were considered: four are representative of congestion-free environments, while the remaining two involve a congested network. All simulations were run on a 3 GHz Pentium-D PC with 2 GB of RAM, and the optimizations were carried out using standard functions from the Matlab Optimization Toolbox. Table 1 describes the specific deployment. For every

Conclusions and future work

In this paper, we designed and modeled a p2p architecture for managing distributed content replication in an optimized way, based on a discrete-time dynamic system. We also compared our framework with two strategies similar to those underlying BitTorrent-like architectures. In every trial, our algorithm outperformed the others, even in the presence of congestion. Future work will aim at extending the model and performing a more detailed simulation campaign.

References (24)

  • C. Cervellera et al., Optimization of a large-scale water reservoir network by stochastic dynamic programming with efficient state space discretization, European Journal of Operational Research, 2006.
  • S. Sen et al., Analysing peer-to-peer traffic across large networks, IEEE/ACM Transactions on Networking, 2004.
  • T. Karagiannis, A. Broido, N. Brownlee, K.C. Claffy, M. Faloutsos, Is p2p dying or just hiding?, in: Proceedings of...
  • M. Ripeanu et al., Mapping the Gnutella network: properties of large-scale peer-to-peer systems and implications for system design, IEEE Internet Computing Journal, 2002.
  • L.A. Adamic et al., Search in power law networks, Physical Review E, 2001.
  • N. Bisnik, A. Abouzeid, Modeling and analysis of random walk search algorithms in P2P networks, in: Proceedings of the...
  • I. Stoica, R. Morris, D. Karger, M. Kaashoek, H. Balakrishnan, Chord: a scalable peer-to-peer lookup service for...
  • P. Maymounkov, D. Mazieres, Kademlia: a peer-to-peer information system based on the XOR metric, in: Proceedings of the...
  • B. Cohen, The BitTorrent Homepage, 2006, ...
  • X. Yang, G. de Veciana, Service capacity of peer to peer networks, in: Proceedings of the 23rd Annual Joint Conference...
  • R. Ahuja et al., Network Flows, 1993.
  • D.P. Bertsekas et al., An analysis of stochastic shortest path problems, Mathematics of Operations Research, 1991.

Luca Caviglione (M.Sc. 2002, Ph.D. 2006) has participated in several research projects funded by the EU, ESA, and Siemens COM AG. He is author and co-author of many academic publications about TCP/IP networking, p2p systems, QoS architectures, and wireless networks. He serves on several TPCs and gives talks about IPv6 and p2p. In 2006 he was with the Italian National Consortium for Telecommunications – Genoa Research Unit. He is a member of the Italian IPv6 Task Force.

    Cristiano Cervellera received the M.Sc. degree in electronic engineering from the University of Genoa, Genoa, Italy, in 1998 and the Ph.D. degree in electronic engineering and computer science in 2002. He has been a Researcher at the Istituto di Studi sui Sistemi Intelligenti per l’Automazione, Italian National Research Council, Genoa, Italy, since 2002. His research interests include neural networks, optimal control of large-scale nonlinear systems, and number–theoretic methods for optimization and approximation.
