Elsevier

Computer Communications

Volume 31, Issue 2, 5 February 2008, Pages 190-200

Analysis of hybrid P2P overlay network topology,☆☆

https://doi.org/10.1016/j.comcom.2007.08.014

Abstract

Modeling peer-to-peer (P2P) networks is a challenge for P2P researchers. In this paper, we provide a detailed analysis of large-scale hybrid P2P overlay network topology, using Gnutella as a case study. First, we re-examine the power-law distributions of the Gnutella network discovered by previous researchers. Our results show that the current Gnutella network deviates from the earlier power-laws, suggesting that the Gnutella network topology may have evolved substantially over time. Second, we identify important trends with regard to the evolution of the Gnutella network between September 2005 and February 2006. After analyzing the limitations of the power-laws, we provide a novel two-layered approach to study the topology of the Gnutella network. We divide the Gnutella network into two layers, namely the mesh and the forest, to model the hybrid and highly dynamic architecture of the current Gnutella network. We give a detailed analysis of the two-layered overlay and present six power-laws and one empirical law to characterize the topology. Using the two-layered approach and the proposed laws, realistic topologies can be generated and the realism of artificial topologies can be validated.

Introduction

Modeling the topologies of peer-to-peer (P2P) networks is an important open problem. An accurate topological model can have significant influence on P2P research. First, we can gain detailed insight into the nature of the underlying system. Second, the model can enable detailed analysis of algorithms and facilitate design of more efficient protocols that take advantage of topology properties. Third, we can generate more accurate artificial topologies for simulation purposes. Furthermore, we can predict future trends and thereby address potential problems in advance.

Previous researchers [2], [7] tended to use power-laws to characterize the topology of P2P networks. Recent advances in P2P networks have resulted in hybrid architectures, represented by the success of Gnutella protocol 0.6 [3] and Kazaa [4]. In this paper, we provide a detailed analysis of large-scale hybrid P2P network topology, giving results concerning major topology properties and main distributions. We choose Gnutella as a case study because it has a large user community and an open architecture. Our work can be summarized by the following points.

First, we re-examine the power-law distributions of the Gnutella network discovered by previous researchers. Our results show that the current Gnutella network deviates from the earlier power-laws. This observation suggests that the Gnutella network topology may have evolved substantially over time.

Second, we identify important trends with regard to the evolution of the Gnutella network between September 2005 and February 2006.

As our primary contribution, we provide a novel two-layered approach to study the topology of the Gnutella network. Due to the limitations of the power-laws, we divide the Gnutella network into two layers, namely the mesh and the forest, to model the hybrid and highly dynamic architecture of the current Gnutella network. We give a detailed analysis of the two-layered overlay and present six power-laws and one empirical law to characterize the topology.

Finally, we focus on the generation of realistic topologies and the validation of artificial topologies using our approach and the proposed laws.

The rest of this paper is organized as follows. Section 2 presents background and previous work. In Section 3, we present our traces of the Gnutella network. In Section 4, we re-examine the power-law distributions discovered by previous researchers and identify the trends concerning the evolution of the Gnutella network. In Section 5, we analyze the limitations of the power-laws and introduce our new two-layered approach to study the topology of the Gnutella network. In Section 6, we analyze the topological properties of the mesh and present two power-laws concerning the mesh topology. In Section 7, we examine the topology properties of the forest and provide one empirical law concerning the tree size. In Section 8, we present two power-laws concerning the overlay network as a whole and discuss the practical uses of our approach and laws. Finally, Section 9 concludes our work.

Gnutella Protocol and the crawler

Gnutella protocol 0.4 [5] employs a purely decentralized model. In this model, individual nodes, also called servents, are equal in terms of functionality. They not only perform server-side roles such as matching incoming queries against their local resources and responding with applicable results, but also offer client-side functions such as issuing queries and collecting search results. All servents are connected to each other randomly. Fig. 1 illustrates the topology of the Gnutella 0.4 network.

Our Gnutella Network Traces

We developed a crawler to collect topology information of the Gnutella network, taking advantage of the message communication mechanisms of both protocol 0.4 and protocol 0.6. The crawler is based on the Limewire [6] open source client and performs a parallel breadth-first search of the network. It can discover more than 100,000 nodes in minutes.
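The parallel breadth-first crawl described above can be illustrated with a minimal Python sketch. It is not the authors' Limewire-based crawler: `get_neighbors` is a hypothetical callable standing in for whatever protocol-level query returns a peer's neighbor list, and the thread pool size is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl(seeds, get_neighbors, max_workers=8):
    """Level-by-level BFS in which each frontier is queried in parallel.

    get_neighbors(node) is a hypothetical stand-in for the network
    query a real crawler would send; it returns an iterable of peers.
    """
    edges = set()              # undirected edges stored as frozensets
    seen = set(seeds)          # every node discovered so far
    frontier = list(seen)      # nodes to query in the current BFS level
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # Query all nodes of the current level concurrently.
            results = pool.map(get_neighbors, frontier)
            next_frontier = []
            for node, neighbors in zip(frontier, results):
                for peer in neighbors:
                    edges.add(frozenset((node, peer)))
                    if peer not in seen:
                        seen.add(peer)
                        next_frontier.append(peer)
            frontier = next_frontier
    return seen, edges


# Toy in-memory "network" used in place of real peers.
toy = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
nodes, edges = crawl(["a"], lambda n: toy.get(n, []))
```

Storing each edge as a `frozenset` collapses the two directions of a connection into one undirected edge, which suits a graph in which adjacency is symmetric.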

We can build the graph of nodes by analyzing the collected data on the Gnutella network. We model two adjacent nodes that have at least one connection

Current Gnutella network topology

In this section, we examine the power-laws of the Gnutella network described in the previous literature against our two traces. The goal is to find out whether the topology of the current Gnutella network accords with the early power-laws.

We use linear regression to fit a line to a set of two-dimensional points using the least-squares method. The validity of the approximation is quantified by the correlation coefficient, which ranges from −1.0 to 1.0. The absolute value of the
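As an illustration of the fitting procedure, the sketch below computes a least-squares line and its correlation coefficient in plain Python. It assumes, as is customary when testing for a power-law, that the line is fitted to the points after taking base-10 logarithms of both coordinates; the function name and the sample data are illustrative and not taken from the traces.

```python
import math


def fit_power_law(xs, ys):
    """Fit log10(y) = slope * log10(x) + intercept by least squares.

    Returns (slope, intercept, r), where r is the Pearson correlation
    coefficient of the log-transformed points. All inputs must be > 0.
    """
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Synthetic data that follows y = 100 * x^(-2) exactly.
xs = [1, 2, 4, 8, 16]
ys = [100 * x ** -2 for x in xs]
slope, intercept, r = fit_power_law(xs, ys)
```

For data that follows an exact power-law the fitted slope equals the exponent and the correlation coefficient has absolute value 1; weaker agreement shows up as an absolute value further from 1.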

The two-layered approach

In this section, we first discuss the limitations of the power-laws and then present a new approach to study the topology of the Gnutella network.
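To make the idea of a mesh layer and a forest layer concrete, here is a hedged sketch of one possible decomposition. The exact definition used in the paper is not given in this excerpt, so the rule below is an assumption: nodes that can be stripped away by repeatedly removing nodes of degree at most one are treated as the forest, and the remaining nodes, all of which lie on or between cycles, are treated as the mesh.

```python
def split_mesh_forest(adj):
    """Split an undirected graph into (mesh, forest) node sets.

    adj maps each node to the set of its neighbors. The peeling rule
    (repeatedly remove nodes of degree <= 1) is an assumed definition,
    not necessarily the one used by the authors.
    """
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    forest = set()
    stack = [node for node, d in degree.items() if d <= 1]
    while stack:
        node = stack.pop()
        if node in forest:
            continue
        forest.add(node)
        # Removing a node lowers the degree of its remaining neighbors.
        for nbr in adj[node]:
            if nbr not in forest:
                degree[nbr] -= 1
                if degree[nbr] <= 1:
                    stack.append(nbr)
    mesh = set(adj) - forest
    return mesh, forest


# A triangle (a, b, c) with a two-node tail (d, e) hanging off c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
mesh, forest = split_mesh_forest(graph)
```

Under this assumed rule, the triangle survives as the mesh and the tail is assigned to the forest.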

Mesh topology analysis

In this section, we study the topology properties concerning the mesh in the Gnutella network. In Table 2, we present some basic statistics about the mesh in our traces. In Table 2, p(m) represents the percentage of nodes in the mesh, l represents average shortest distance, and k represents average degree.
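The two statistics named above, average degree k and average shortest distance l, can be computed as in the following sketch. It assumes an undirected, unweighted graph stored as an adjacency dictionary and counts only reachable node pairs; the function names are illustrative.

```python
from collections import deque


def average_degree(adj):
    """Mean number of neighbors per node (k)."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def average_shortest_distance(adj):
    """Mean hop count over all ordered pairs of reachable nodes (l)."""
    total = 0
    pairs = 0
    for src in adj:
        # Breadth-first search gives shortest hop counts from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1   # exclude the source itself
    return total / pairs


# A three-node path a - b - c.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
k = average_degree(path)
l = average_shortest_distance(path)
```

Running one breadth-first search per node is quadratic in the worst case, so for graphs with hundreds of thousands of nodes a sampled subset of source nodes would likely be needed.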

Forest topology analysis

In this section, we study the topology properties concerning the forest in the Gnutella network. In Table 3, we present some basic statistics about the forest in our traces. In Table 3, p(t) represents the percentage of nodes in the forest.

Discussion

In this section, we first present two more power-laws concerning all the nodes (including both in-mesh nodes and in-tree nodes) in the Gnutella network. Then we focus on the generation of synthetic topologies of P2P networks.

Conclusion and future work

In this paper, we study the hybrid P2P network topology from the mesh perspective and the forest perspective. Using the two-layered approach and the proposed laws, realistic topologies can be generated.

References (19)

  • C. Xie, Y. Pan, Analysis of large-scale hybrid peer-to-peer network topology, in: Proc. IEEE GLOBECOM’06, San...
  • M.A. Jovanovic, Modelling large-scale peer-to-peer networks and a case study of gnutella, Master’s thesis, University...
  • Gnutella, The gnutella protocol v0.6,...
  • The KaZaA website,...
  • Clip2, The Gnutella protocol specification v0.4,...
  • The Limewire website,...
  • L.A. Adamic et al., Search in power-law networks, Physical Review E (2001)
  • M. Faloutsos, P. Faloutsos, C. Faloutsos, On power-law relationships of the internet topology, in: Proc. ACM...
  • D. Magoni et al., Analysis of the autonomous system network topology, ACM SIGCOMM Computer Communication Review (2001)
There are more references available in the full text version of this article.

Chao Xie currently is a Ph.D. student in the Department of Computer Science at University of Wisconsin-Madison. He obtained his M.S. degree in Computer Science from Georgia State University, USA, in 2007, obtained his M.Eng. degree in Computer Science from Huazhong University of Science and Technology, China, in 2005, and obtained his B.S. degree in Mechanical Engineering from Huazhong University of Science and Technology, China, in 2001.

His main research interests include computer networks, distributed systems, parallel computing and data mining.

Chao Xie is a member of the Association for Computing Machinery and the IEEE Computer Society.

Guihai Chen obtained his B.S. degree from Nanjing University, M.Eng. from Southeast University, and Ph.D. from the University of Hong Kong. He visited Kyushu Institute of Technology, Japan in 1998 as a research fellow, and University of Queensland, Australia in 2000 as a visiting professor. From September 2001 to August 2003, he was a visiting professor at Wayne State University. He is now a full professor and deputy chair of the Department of Computer Science, Nanjing University. Prof. Chen has published more than 100 papers in peer-reviewed journals and refereed conference proceedings in the areas of wireless sensor networks, high-performance computer architecture, peer-to-peer computing and performance evaluation. He has also served on technical program committees of numerous international conferences. He is a member of the IEEE Computer Society.

Art Vandenberg was born in Grasonville, Maryland, 1950. Education includes B.A. English Literature, Swarthmore College, Swarthmore, PA, 1972; M.V.A Painting and Drawing, Georgia State University, Atlanta, GA 1979; and M.S. Information and Computer Systems, Georgia Institute of Technology, Atlanta, GA 1985.

He has worked in library systems, research and administrative computing since 1976, including 15 years in information technology positions at Georgia Institute of Technology. Since 1997 he has been with Information Systems & Technology at Georgia State University, as Director of Advanced Campus Services charged with deploying middleware and research computing infrastructure. His current activities include deploying grid computing solutions and establishing high-performance computing cyberinfrastructure. Recent research grants include an NSF ITR Award 0312636 as Co-PI investigating a unique approach to resolving metadata heterogeneity for information integration by combining monitoring, clustering and visualization to discover patterns or trends. He is a member of Georgia State’s IT Risk Management Research Group, the Georgia State Information Integration Lab, and serves as Chair of SURAgrid, a regional grid initiative of the Southeastern Universities Research Association.

Mr. Vandenberg is a member of the Association for Computing Machinery and the IEEE Computer Society.

Yi Pan is the chair and a professor in the Department of Computer Science and a professor in the Department of Computer Information Systems at Georgia State University. Dr. Pan received his B.Eng. and M.Eng. degrees in computer engineering from Tsinghua University, China, in 1982 and 1984, respectively, and his Ph.D. degree in computer science from the University of Pittsburgh, USA, in 1991. Dr. Pan’s research interests include parallel and distributed computing, optical networks, wireless networks, and bioinformatics. Dr. Pan has published more than 100 journal papers with 30 papers published in various IEEE journals. In addition, he has published over 100 papers in refereed conferences (including IPDPS, ICPP, ICDCS, INFOCOM, and GLOBECOM). He has also co-authored/co-edited 30 books (including proceedings) and contributed several book chapters. His pioneering work on computing using reconfigurable optical buses has inspired extensive subsequent work by many researchers, and his research results have been cited by more than 100 researchers worldwide in books, theses, journal and conference papers. He is a co-inventor of three U.S. patents (pending) and 5 provisional patents, and has received many awards from agencies such as NSF, AFOSR, JSPS, IISF and Mellon Foundation. His recent research has been supported by NSF, NIH, NSFC, AFOSR, AFRL, JSPS, IISF and the states of Georgia and Ohio. He has served as a reviewer/panelist for many research foundations/agencies such as the U.S. National Science Foundation, the Natural Sciences and Engineering Research Council of Canada, the Australian Research Council, and the Hong Kong Research Grants Council. Dr. Pan has served as an editor-in-chief or editorial board member for 15 journals including 5 IEEE Transactions and a guest editor for 10 special issues for 9 journals including 2 IEEE Transactions. He has organized several international conferences and workshops and has also served as a program committee member for several major international conferences such as INFOCOM, GLOBECOM, ICC, IPDPS, and ICPP. Dr. Pan has delivered over 10 keynote speeches at many international conferences. Dr. Pan is an IEEE Distinguished Speaker (2000-2002), a Yamacraw Distinguished Speaker (2002), a Shell Oil Colloquium Speaker (2002), and a senior member of IEEE. He is listed in Men of Achievement, Who’s Who in Midwest, Who’s Who in America, Who’s Who in American Education, Who’s Who in Computational Science and Engineering, and Who’s Who of Asian Americans.

This paper extends and supplants the earlier version of this paper presented at IEEE GLOBECOM’06 [1].

☆☆

Guihai Chen’s work is supported by China NSF under Grant 60573131, China Jiangsu Provincial NSF under Grant BK2005208, China 973 projects under Grants 2006CB303000 and 2002CB312002, and Nokia Bridging the World Program. Yi Pan’s work is supported in part by the National Science Foundation (NSF) under Grants ECS-0196569, ECS-0334813, and CCF-0514750. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the NSF, China NSF or Nokia.
