Resilience and reliability analysis of P2P network systems

https://doi.org/10.1016/j.orl.2009.09.006

Abstract

This paper conducts a resilience and reliability analysis of P2P networks by studying the isolation probability and the durable time of a single user. A network whose users’ lifetimes have a stronger new worse than used in expectation (NWUE) property is proved to be more resilient. Further, both graphical and nonparametric methods are developed to test the NWUE order between two data sets.

Introduction

A Peer-to-Peer (P2P) network allows a group of computer users running the same networking program to connect with each other and directly access files on one another’s hard drives; it works by connecting individual computers together to share files instead of going through a central server. P2P networks can be classified by their use, for instance content delivery, file sharing, telephony, media streaming (audio, video), and discussion forums.

Among current studies of P2P networks, Kaashoek and Karger [1] and Stoica et al. [2] investigate single-node isolation, while Aspnes et al. [3], Ganesh and Massoulié [4], Gummadi et al. [5], and Massoulié et al. [6] discuss the disconnection of the entire graph. Bhagwan et al. [7] are among the first to pay attention to more realistic P2P failure models, which go beyond the traditional binary metrics and take into account the intrinsic behavior of internet users as well as departures caused by more complex factors such as attention span and browsing habits. In order to investigate the performance of such systems, Leonard et al. [8] recently introduced a node failure model based on users’ lifetimes and studied the stochastic resilience of P2P networks. In their passive model, it is assumed that a broken link is never repaired and that a user stays online for a random period of time, until either all neighbors of this user go offline or the user leaves the network on his own initiative. As a consequence, the random lifetime for which an arriving user stays online reflects both the behavior of the user and the duration of the service to the entire P2P network community. This model suits scenarios in which the time to seek a new link or repair a broken link is prohibitively long compared with the typical lifetime of a new user, or in which peers are not equipped with any neighbor-recovery strategy. For example, due to the rapid increase in size (number of peers), the central server of a Napster network usually cannot be updated and maintained in time; consequently, it takes longer to repair or replace a broken link.

This paper proposes a more general passive model for P2P networks, in which a user is isolated only when a certain number of the user’s neighbors leave the network; the passive model in [8] is thus included as a special case. Further, we investigate both the stochastic resilience and the reliability properties of this new model, and we prove that a network whose users’ lifetimes have a stronger NWUE property is more resilient. The rest of this paper is organized as follows: Section 2 introduces the new P2P network model. Section 3 studies the stochastic resilience by using the isolation probability of a single user, and Section 4 discusses some reliability properties of the durable time of a single user in the network. Finally, both graphical and nonparametric methods are developed in Section 5 to detect the NWUE order between two data sets. All main conclusions are consistent with those obtained through experiments in the literature.

Throughout this paper, the term increasing is used instead of monotone nondecreasing and the term decreasing is used instead of monotone non-increasing, and all random variables are assumed to be absolutely continuous and to have 0 as the common left end point of their supports.

Model description

This section describes the new model of P2P networks in detail and presents its rationale. For an internet user, let X, a nonnegative random variable, be the lifetime spent online in the network, either for the user’s own purposes or to provide service to other peers. Saroiu et al. [9] and Bustamante and Qiao [10] pointed out that the distribution of a user’s lifetime in practical P2P networks accords very well with the Pareto distribution, which possesses the well-known new worse than used in expectation

Resilience analysis

Resilience analysis of random graphs and various types of deterministic networks has attracted considerable interest from researchers over the past several decades; for more details, see [13], [14], [15] and others. Along this line of study, one of the most important problems is to determine under what failure conditions the network becomes disconnected or delivers noticeably lower performance to users. Assuming uniformly random node failure, Stoica et al. [2], Bollobás [13] and Gummadi

Reliability analysis

The mean durable time of a user entering a P2P network system is derived as follows.

Proposition 1

For a P2P network with user’s lifetime $X$, $E[T_r(X)] = \dfrac{r\mu_F}{k+1}$, $1 \le r \le k$.

Proof

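For any $1 \le r \le k$,

$$
\begin{aligned}
E[T_r(X)] &= \int_0^\infty \bar F(x)\,\bar F_{\tilde X_{r,k}}(x)\,dx \\
&= r\binom{k}{r}\int_0^\infty \bar F(x)\left[\int_{F_{\tilde X}(x)}^{1} u^{r-1}(1-u)^{k-r}\,du\right]dx \\
&= r\binom{k}{r}\int_0^1 u^{r-1}(1-u)^{k-r}\left[\int_0^{F_{\tilde X}^{-1}(u)} \bar F(x)\,dx\right]du \\
&= r\mu_F\binom{k}{r}\int_0^1 u^{r-1}(1-u)^{k-r}\left[\int_0^{F_{\tilde X}^{-1}(u)} dF_{\tilde X}(x)\right]du \\
&= r\mu_F\binom{k}{r}\int_0^1 u^{r}(1-u)^{k-r}\,du \\
&= \frac{r\mu_F}{k+1}.
\end{aligned}
$$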
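The last equality in the proof is a Beta integral. Written out, and assuming only the standard identity $\int_0^1 u^{a}(1-u)^{b}\,du = \dfrac{a!\,b!}{(a+b+1)!}$ for nonnegative integers $a$ and $b$, the step reads:

$$
r\binom{k}{r}\int_0^1 u^{r}(1-u)^{k-r}\,du
= r\cdot\frac{k!}{r!\,(k-r)!}\cdot\frac{r!\,(k-r)!}{(k+1)!}
= \frac{r}{k+1}.
$$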

This proposition reveals that the mean durable time of a new user in a P2P network is distribution free. More precisely, for a P2P network with any type of
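The distribution-free claim of Proposition 1 can be checked numerically. The sketch below is a Monte Carlo illustration, not the authors' procedure. It assumes, reading off the first line of the proof, that the durable time is the minimum of the user's own lifetime and the r-th smallest of k independent residual (equilibrium) lifetimes of the neighbors. The function names, the parameter values, and the choice of exponential and Lomax (shifted Pareto) lifetimes are all illustrative assumptions.

```python
import random


def exponential_sampler(mean):
    """Sampler for an exponential law; its equilibrium law is the same exponential."""
    def draw(rng):
        return rng.expovariate(1.0 / mean)
    return draw


def lomax_sampler(shape, scale):
    """Sampler for the Lomax law with survival function (1 + x/scale)^(-shape).

    A Lomax(shape, scale) lifetime has mean scale/(shape - 1) for shape > 1,
    and its equilibrium law is Lomax(shape - 1, scale).
    """
    def draw(rng):
        u = 1.0 - rng.random()  # uniform on (0, 1], so the power below is finite
        return scale * (u ** (-1.0 / shape) - 1.0)
    return draw


def mean_durable_time(draw_lifetime, draw_residual, k, r, n=100_000, seed=0):
    """Estimate E[min(X, r-th smallest of k residual lifetimes)] by simulation.

    Assumed model: a user with k neighbors is isolated once r of them leave.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        own = draw_lifetime(rng)
        residuals = sorted(draw_residual(rng) for _ in range(k))
        total += min(own, residuals[r - 1])
    return total / n


if __name__ == "__main__":
    k, r, mu = 5, 2, 2.0
    print("formula r*mu/(k+1):", r * mu / (k + 1))
    print("exponential       :",
          mean_durable_time(exponential_sampler(mu), exponential_sampler(mu), k, r))
    # Lomax(3, 4) has mean 4/(3-1) = 2; its equilibrium law is Lomax(2, 4).
    print("Lomax (Pareto)    :",
          mean_durable_time(lomax_sampler(3.0, 4.0), lomax_sampler(2.0, 4.0), k, r))
```

Both lifetime laws share the mean 2, so under the stated assumptions both estimates should land close to the same value r·μ/(k+1), up to simulation noise.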

Identification of NWUE order

Since the NWUE-ness of the lifetime of the user in a network has a direct impact on resilience of the corresponding P2P network, it is of interest to tell whether one life distribution is more NWUE than the other. In this section, we build some methods to statistically detect the strict NWUE order between two lifetime distributions based on samples $\mathbf{X}_n=(X_1,\ldots,X_n)$ and $\mathbf{Y}_m=(Y_1,\ldots,Y_m)$ from independent and continuous populations $X$ and $Y$, respectively.
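The paper's own graphical and nonparametric procedures are not reproduced in this excerpt. As a loosely related illustration only, and not the authors' method, the sketch below computes the empirical scaled total-time-on-test (TTT) plot of a single sample, a standard graphical diagnostic: for an NWUE distribution the scaled TTT transform lies on or below the diagonal, while for an NBUE distribution it lies on or above it. The function name `scaled_ttt` is hypothetical, and the sketch does not by itself test the NWUE order between two samples.

```python
def scaled_ttt(sample):
    """Return the points (i/n, S_i/S_n) of the empirical scaled TTT plot.

    S_i = sum_{j<=i} (n - j + 1) * (x_(j) - x_(j-1)), with x_(0) = 0,
    where x_(1) <= ... <= x_(n) are the ordered observations.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")
    running, previous, totals = 0.0, 0.0, []
    for j, x in enumerate(xs, start=1):
        running += (n - j + 1) * (x - previous)  # time accumulated by units still alive
        totals.append(running)
        previous = x
    grand_total = totals[-1]  # equals the sum of all observations
    if grand_total <= 0:
        raise ValueError("sample must contain a positive lifetime")
    return [(i / n, s / grand_total) for i, s in enumerate(totals, start=1)]


if __name__ == "__main__":
    for u, phi in scaled_ttt([1.0, 1.0, 1.0, 10.0]):
        print(f"u = {u:.2f}  scaled TTT = {phi:.4f}")
```

Plotting these points against the diagonal for each of the two samples gives a rough visual impression of how far each lifetime distribution departs from the exponential case in the NWUE direction; with small samples the plot is noisy and should be read with care.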

Acknowledgement

The authors are indebted to the anonymous referee for his insightful comments, which have greatly improved the presentation of this manuscript.

References (28)

  • M.F. Kaashoek, D. Karger, Koorde: A simple degree-optimal distributed hash table, in: IPTPS,...
  • I. Stoica, R. Morris, D. Karger, M.F. Kaashoek, H. Balakrishnan, Chord: A scalable Peer-to-Peer lookup service for...
  • J. Aspnes, Z. Diamadi, G. Shah, Fault tolerant routing in Peer to Peer systems, in: ACM PODC,...
  • A. Ganesh, L. Massoulié, Failure resilience in balanced overlay networks, in: Allerton Conference on Communication,...
  • K. Gummadi, R. Gummadi, S. Gribble, S. Ratnasamy, S. Shenker, I. Stoica, The impact of DHT routing geometry on...
  • L. Massoulié, A.M. Kermarrec, A. Ganesh, Network awareness and failure resilience in self-organising overlay networks,...
  • R. Bhagwan, S. Savage, G.M. Voelker, Understanding availability, in: Proceedings of the Second International Workshop...
  • D. Leonard et al., On lifetime-based node failure and stochastic resilience of decentralized Peer-to-Peer networks, IEEE/ACM Transactions on Networking (2007)
  • S. Saroiu, P.K. Gummadi, S.D. Gribble, A measurement study of Peer-to-Peer file sharing systems, in: MMCN,...
  • F.E. Bustamante, Y. Qiao, Friendships that last: Peer lifespan and its role in P2P protocols, in: Intl. Workshop on Web...
  • M. Harchol-Balter et al., Exploiting process lifetime distributions for dynamic load balancing, ACM Transactions on Computer Systems (1997)
  • H.A. David et al., Order Statistics (2003)
  • B. Bollobás, Random Graphs (2001)
  • Yu.D. Burtin, Connection probability of a random subgraph of an n-dimensional cube, Problemy Peredachi Informatsii (1977)