QoS enhancements for global replication management in peer to peer networks

https://doi.org/10.1016/j.future.2011.02.011

Abstract

Replica management is key to reducing bandwidth consumption, improving data availability and maintaining data consistency in large distributed systems. Global Replica Management (GRM) maintains data consistency across the entire network and is particularly suited to multi-group distributed systems. However, GRM is unfavorable for many applications because its replica management processes require a very large number of message passes. In this paper, an interconnection structure called the Distributed Spanning Tree (DST) is employed to reduce the number of message passes needed for an efficient GRM strategy. Applying the DST converts the peer network into logical layered structures and thereby provides a hierarchical mechanism for replication management. It is proved that this hierarchical approach improves data availability and consistency across the entire network. It is also proved that the proposed approach reduces data latency and the number of message passes required for any specific application in the network.

Highlights

► A better solution for effective Replica Management in P2P networks.
► A good approach for enhanced data consistency.
► Global Replica Management using the Distributed Spanning Tree (DST) approach.
► Contribution of DST in improving the scalability of the overall environment.
► Increased availability of the data items in the anticipated environment.

Introduction

Peer-to-Peer (P2P) systems are distributed systems in which nodes of equal roles and capabilities exchange information and services directly with each other. Because the peer network is sparse, a node typically needs multiple message passes to exchange data with another node across the network, which can increase access latency. Data replication is a common and vital approach to reducing access latency and transmission intensity in networks [1], [2], [3]. Given the proven capabilities of peer networks and the demand for them, replication management is becoming important for all kinds of distributed applications. To maximize availability, most replication methods replicate the data at all nodes, or at least at two endpoints on a relevant query path. However, a large number of replicas consumes massive amounts of memory and increases the unit cost of the individual nodes. It also reduces the performance of the individual nodes, in addition to the overall data access performance. These limitations call for optimized solutions that reduce the number of replicas without compromising overall data access performance. Many solutions have been proposed; they differ in terms of their location, permanency, scope and applicability [4], [5].

Based on scope and applicability, replication methods can be classified as local or global. Local replication schemes have limited scope and apply to intra-group management, whereas global replication schemes apply to inter-group management. Local Replication Management (LRM) systems are associated with a single group, in which the number of nodes and peers is fixed and limited. Global Replica Management (GRM), by contrast, is required for applications that support multi-peer collaborative work, where the number of nodes is neither fixed nor defined in advance [6]. The peer that holds the replica must therefore be selected dynamically, and if that peer fails, its content has to be transferred to another node so that replica availability is not affected [7]. On the other hand, replicas of a data item may become inconsistent in an environment where data updates are frequent, particularly in GRM environments. Because a huge number of message passes is required to achieve global consistency, the network becomes overloaded and the majority of its nodes become congested [6]. For these reasons, solutions for GRM are critical, and the field has attracted the attention of many researchers. Shared data is often replicated for responsiveness and availability, and in such an environment consistency control is critical for the correct functioning of the whole system.

In [8], Ren Xun-yi et al. proposed a consistency technique based on a replica clustering coefficient that classifies replica nodes into multiple levels. Replica consistency is maintained by applying an update to the first-level replica nodes first and then propagating it to the next levels of nodes in sequence. Although its efficiency is demonstrated in terms of response time and the number of message passes required, performance is expected to degrade as the network grows at runtime. In [9], a Dynamic Maintenance Service (DMS) is proposed that adjusts the locations of data items based on request frequency, thus reducing the time required to obtain the needed data. A One-way Replica Consistency Service (ORCS) is also proposed, which maintains replica consistency to improve access performance. The proposed algorithm allows parallel downloads, but adjusting the location of a data item based on demand consumes more message passes, given the dynamic nature of any distributed environment. Rahman et al. [10] proposed a static algorithm that places replicas at the optimal nodes in the network to minimize the total response time of each node. To cope with the dynamic nature of the network, user requests and network latency are used to decide where to place the consistent replica for efficient access. This technique was designed for grid environments, but it is not suitable because it requires more message passes and therefore more computing power to determine the locations of new replicas.

In [11], Sang-Min Park et al. proposed a dynamic replica maintenance algorithm that divides sites into many regions, grouping sites that are close to one another into the same region. The algorithm terminates the replication process if the replica is already present at another site in the same region. Thus, files will have at most two copies in each region, which means only two links will be available for parallel data downloading in any one region. This limitation leads to high costs, thus reducing availability and adversely affecting overall performance. Quorum-based consistency management has been proposed in which quorums are constructed from fewer mobile hosts while managing strict consistency among replicas [12], [6], [13]. A hybrid algorithm is used to propagate updated files to the associated nodes, where flooding is replaced by a technique called rumor spreading to reduce communication overhead [14]. At every step of rumor spreading, a node pushes updates to a subset of the related nodes it knows, which provides only partial consistency at any particular time. CFS [15] is a P2P system that avoids cache inconsistency problems by using a content hashing technique. Each client has to validate the freshness of a received file by itself, and inconsistent replicas are removed from caches by LRU replacement. This method of consistency maintenance becomes inefficient as the number of nodes in the network increases.
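The push step of rumor spreading described above can be sketched as follows. This is only an illustration under stated assumptions, not the hybrid algorithm of [14]: the function name `rumor_spread`, the `fanout` and `rounds` parameters, and the six-node ring overlay are all hypothetical.

```python
import random


def rumor_spread(neighbors, source, fanout, rounds, seed=0):
    """Push-style rumor spreading (illustrative sketch).

    In every round, each node that already holds the update pushes it
    to at most `fanout` randomly chosen neighbors. Returns the set of
    informed nodes and the number of messages sent.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    informed = {source}
    messages = 0
    for _ in range(rounds):
        newly_informed = set()
        for node in sorted(informed):
            peers = neighbors[node]
            targets = rng.sample(peers, min(fanout, len(peers)))
            messages += len(targets)
            newly_informed.update(targets)
        informed |= newly_informed
    return informed, messages


# Hypothetical overlay: six peers arranged in a ring.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

partial, partial_msgs = rumor_spread(ring, source=0, fanout=1, rounds=2)
full, full_msgs = rumor_spread(ring, source=0, fanout=2, rounds=3)
```

With a fan-out of 1 and two rounds, only some of the six peers hold the update, which corresponds to the partial consistency noted above; pushing to both ring neighbors for three rounds reaches every peer at the cost of more messages.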

In summary, although all of the reviewed methodologies offer a solution for consistency maintenance in replication management, they suffer from either a huge number of message passes or a high volume of computation. The work described in this paper therefore aims to develop a GRM system using a Distributed Spanning Tree for peer network collaborative work environments, in which the peers that hold the replica are selected dynamically and with a minimal number of message passes. The paper is organized as follows: Section 2 defines the proposed system with the background information needed, including the formulation of the Distributed Spanning Tree (DST) with an illustration. Section 3 discusses the proposed mechanism for Global Replica Management (GRM). Section 4 provides experimentation and result analysis, and Section 5 concludes the proposed work and outlines enhancements.

Section snippets

Background information needed

As discussed in the previous section, in a peer network the nodes exchange information and services directly with each other without any dedicated intermediary, which creates bottlenecks in the network because of the huge volume of messages being exchanged. This could be avoided if the number of messages across the network were optimized. In this paper, it is proposed to convert the graph-like topology of the peer network into the set of spanning trees called the Distributed Spanning Tree
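The general idea of replacing a graph-like peer topology with a tree can be sketched with an ordinary breadth-first spanning tree. The snippet above is truncated before the DST construction is given, so this is not the paper's DST; the function `bfs_spanning_tree` and the four-peer graph are hypothetical and only show how a tree cuts the number of links a message has to travel.

```python
from collections import deque


def bfs_spanning_tree(graph, root):
    """Return a parent map describing a breadth-first spanning tree.

    `graph` maps each peer to the list of peers it is connected to.
    This is a plain BFS tree, used here as a stand-in assumption for
    the tree-building step; it is not the DST formulation itself.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in graph[node]:
            if peer not in parent:  # first visit fixes the tree edge
                parent[peer] = node
                queue.append(peer)
    return parent


# Hypothetical peer network with redundant links.
peers = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}

tree = bfs_spanning_tree(peers, "A")
tree_edges = sum(1 for p in tree.values() if p is not None)
graph_edges = sum(len(v) for v in peers.values()) // 2
```

In this example the graph has five links but the tree keeps only three, so an update relayed along tree edges reaches every peer exactly once.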

DST in global replica management

Global Replica Management (GRM) refers to the process of maintaining a data item in a globally consistent fashion through appropriate read and write operations performed by the members of the network. In the model described in this paper, each read or write operation consists of three phases, that is, three rounds of message exchanges: (1) a lock request/its reply; (2) read/write operations/its acknowledgment; and (3) a message for commit and lock release/its
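The three rounds of message exchange listed above can be sketched for a write operation as follows. This is a simplified, single-process illustration under stated assumptions: the `Replica` and `Coordinator` classes, the message names, and the reply to the third round (called `reply` here because the snippet is truncated at that point) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Replica:
    value: object = None
    pending: object = None
    locked_by: Optional[str] = None


@dataclass
class Coordinator:
    replicas: dict
    log: list = field(default_factory=list)

    def _round(self, request, reply):
        # One round = one request and one reply per replica.
        for name in self.replicas:
            self.log.append((request, name))
            self.log.append((reply, name))

    def write(self, client, value):
        holders = self.replicas.values()
        # Phase 1: lock request / its reply.
        if any(r.locked_by not in (None, client) for r in holders):
            return False  # another client holds a lock
        for r in holders:
            r.locked_by = client
        self._round("lock_request", "lock_reply")
        # Phase 2: write operation / its acknowledgment.
        for r in holders:
            r.pending = value
        self._round("write", "ack")
        # Phase 3: commit and lock release / reply (name assumed).
        for r in holders:
            r.value, r.pending, r.locked_by = r.pending, None, None
        self._round("commit_release", "reply")
        return True


coordinator = Coordinator({name: Replica() for name in ("p1", "p2", "p3")})
ok = coordinator.write("client-1", "v1")
```

In this flat arrangement, every write costs three rounds of two messages for each replica holder, so the message count grows linearly with the number of replicas.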

Theoretical analysis

A typical GRM in a peer network proceeds such that a peer holding the latest replica of a data item acts as a server for all other peers. On receiving requests from other peers, it communicates the latest replica to the requesting peer in the network. Generally, in replica management systems, write operations are given higher priority than read operations so that read operations are served with the latest replicas. From this communication perspective, the
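Giving writes priority over reads can be sketched with a priority queue. The scheduling mechanism actually used in the paper is not visible in this snippet, so the `RequestQueue` class and the peer names below are hypothetical and show only one possible way to express the rule.

```python
import heapq
from itertools import count

WRITE, READ = 0, 1  # lower number = served first


class RequestQueue:
    """Serve all pending writes before any pending read (sketch)."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker keeps arrival order

    def submit(self, kind, peer):
        heapq.heappush(self._heap, (kind, next(self._arrival), peer))

    def drain(self):
        served = []
        while self._heap:
            kind, _, peer = heapq.heappop(self._heap)
            served.append(("write" if kind == WRITE else "read", peer))
        return served


queue = RequestQueue()
queue.submit(READ, "p1")
queue.submit(WRITE, "p2")
queue.submit(READ, "p3")
queue.submit(WRITE, "p4")
order = queue.drain()
```

Both writes are served before either read even though a read arrived first, so the reads see the data item after the latest update.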

Conclusion

The work presented in this paper described an effective approach for GRM in P2P networks using DST structures. The proposed model has been proved in terms of QoS parameters such as data availability, bandwidth consumption and number of message passes. By employing the DST structures in the peer network, consistency and replication efficiency can be achieved at the cost of fewer message passes than with traditional approaches. It is also proved that the scalability of the peer network can also

Acknowledgements

This work is a part of the Research Project sponsored under the Fast Track Scheme for Young Scientists, DST, India. Reference No: D.O.No.SR/FTP/ETA-112/2010. The authors would like to express their thanks for the support offered by the sponsoring agency.

References (18)

  • Xin Sun, Jun Zheng, Qiongxin Liu, Yushu Liu, Dynamic data replication based on access cost in distributed systems, in:...
  • H. Stockinger et al., File and object replication in data grids, J. Cluster Comput. (2002)
  • A. Chervenak et al., The data grid: towards an architecture for the distributed management and analysis of large scientific datasets, J. Netw. Comput. Appl. (2001)
  • A.A. Helal et al., Replication Techniques in Distributed Systems (1996)
  • B. Ciciani et al., Analysis of replication in distributed database systems, IEEE Trans. Knowl. Data Eng. (1990)
  • Takahiro Hara et al., Consistency management strategies for data replication in mobile ad hoc networks, IEEE Trans. Mob. Comput. (2009)
  • S.Y. Hwang, K.K.S. Lee, Y.H. Chin, Data replication in a distributed system: a performance study, in: Proc. Conf....
  • Ren Xun-yi et al., Efficient model for replica consistency maintenance in data grids
  • Chao-Tung Yang et al., File replication, maintenance, and consistency management services in data grids, J. Supercomput. (2009)
There are more references available in the full text version of this article.

P. Victer Paul is a Master's degree student in the Department of Computer Science, Pondicherry University, Pondicherry, India. He obtained his B.Tech. in Information Technology from Pondicherry University, India. His research areas include Software Engineering, Web Services, Distributed Systems and Cloud Computing.

N. Saravanan is an Assistant Professor at Thiruvalluvar College of Engineering & Tech, Tamil Nadu, India. He obtained his M.E. in Software Engineering from College of Engineering, Guindy, Chennai, India and is currently pursuing his Ph.D. at Anna University, India. He has more than 10 years of experience as an academician, and his research areas include Software Engineering, Software Testing, Computer Networks and Internet Programming.

S.K.V. Jayakumar is an Assistant Professor in the Department of Computer Science, Pondicherry University, India. He obtained his M.E. from Vellore Institute of Technology and is currently pursuing his Ph.D. in Computer Science and Engineering at Pondicherry University. He has more than 10 years of experience as an academician, and his research areas include Cloud Computing, Web Services and Distributed Systems. He has published around 50 research papers in national and international journals and conferences.

Dr. P. Dhavachelvan is a Professor in the Department of Computer Science, Pondicherry University, India. He obtained his B.E. in Electrical and Electronics Engineering from University of Madras, India, and his M.E. and Ph.D. in Computer Science and Engineering from Anna University, Chennai, India. He has about 15 years of experience as an academician, and his research areas include Software Engineering & Standards and Web Service Computing. To his credit, he has more than 125 research papers published in reputed international and national journals and conferences. He has also obtained patents and proposed standards in the domain of Software Engineering.

Dr. R. Baskaran is currently an Assistant Professor in the Department of Computer Science and Engineering, Anna University. He completed his Ph.D. in Computer Science and Engineering at Anna University, Chennai. He has more than 10 years of experience in research. He has numerous publications in international, Asian and national journals and in international and national conferences. He is a member of various national and international committees, including the Institution of Electronics and Telecommunication Engineers, Computer Society of India (CSI), INDO–European Union, World Academy of Science, Engineering and Technology (WASET), International Network for Engineering Education and Research, International Association of Engineers, and the International Congress for Global Science and Technology.
