Failure Recovery Mechanism in Neighbor Replica Distribution Architecture

Noor, Ahmad Shukri Mohd; Deris, Mustafa Mat

doi:10.1007/978-3-642-16167-4_6

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6377)

Abstract

Replication provides an effective way to enhance performance, availability, and fault tolerance in distributed systems. A number of fault-tolerance and failure-recovery techniques are based on replication. Some of them, such as the recovery technique of Netarkivet's data grid and the fast disaster recovery mechanism for volume replication systems, were implemented on the two-replica distribution technique (TRDT), also known as the primary-backup architecture. However, these techniques have weaknesses, because they inherit irrecoverable scenarios from TRDT, such as double faults, both copies of a file being damaged or lost, a content index missing from the index server table, and the index server generating a checksum error in the content index. In this paper we propose a failure recovery mechanism based on the Neighbor Replication Distribution Technique (NRDT) to recover from these otherwise irrecoverable scenarios and to improve recovery performance. The technique relies on neighbors holding the replicated data, and thus maximizes fault tolerance as well as reliability in failure recovery. It also outperforms TRDT in failure recovery by reducing the number of irrecoverable cases. Because it keeps more than one backup or replica, it also tolerates failures such as server failures, site failures, and even network partitioning.
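
The difference between the two schemes in a double-fault case can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the ring layout of sites, the choice of the next site as the TRDT backup, the use of SHA-256 as the checksum, and the names `replica_sites` and `recover` are all assumptions made only for illustration. The expected checksum is also assumed to come from a trusted source.

```python
import hashlib
from typing import Dict, List, Optional


def checksum(data: bytes) -> str:
    """Checksum used to detect a damaged copy (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def replica_sites(primary: int, n_sites: int, scheme: str) -> List[int]:
    """Sites that hold a copy of the data whose primary site is `primary`.

    Hypothetical layout: sites are numbered 0..n_sites-1 on a logical ring.
    """
    if scheme == "TRDT":
        # Two-replica distribution: the primary plus one backup.
        return [primary, (primary + 1) % n_sites]
    if scheme == "NRDT":
        # Neighbor replication: the primary plus both of its neighbors.
        return [primary, (primary - 1) % n_sites, (primary + 1) % n_sites]
    raise ValueError(f"unknown scheme: {scheme}")


def recover(primary: int, n_sites: int, scheme: str,
            stores: Dict[int, Optional[bytes]],
            expected: str) -> Optional[bytes]:
    """Return the first intact copy, or None if the case is irrecoverable.

    A missing key or a None value in `stores` models a failed site or a
    lost copy; a checksum mismatch models a damaged copy.
    """
    for site in replica_sites(primary, n_sites, scheme):
        data = stores.get(site)
        if data is not None and checksum(data) == expected:
            return data
    return None


# Double fault: the copy on site 2 is lost and the copy on site 3 is damaged.
original = b"payload"
stores = {1: original, 2: None, 3: b"corrupted"}
expected = checksum(original)

trdt_result = recover(2, 5, "TRDT", stores, expected)
nrdt_result = recover(2, 5, "NRDT", stores, expected)
```

Under these assumptions the TRDT lookup finds no intact copy, while the NRDT lookup recovers the data from the remaining neighbor, which is the kind of case the abstract describes as irrecoverable under TRDT.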

References

Chervenak, A., et al.: The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Data Sets. Journal of Network and Computer Applications, 187–200 (2001)
Hoschek, W., et al.: Data management in an international data grid project. In: Proceedings of GRID Workshop, pp. 77–90 (2000)
Stockinger, H., et al.: File and object replication in data grids. In: Tenth IEEE Symposium on High Performance and Distributed Computing, pp. 305–314 (2001)
Dabrowski, C.: Reliability in grid computing systems. Concurrency and Computation: Practice and Experience. Wiley InterScience, Hoboken (2009), www.interscience.wiley.com
Zhang, Q., et al.: Dynamic Replica Location Service Supporting Data Grid Systems. In: Sixth IEEE International Conference on Computer and Information Technology (CIT 2006), p. 61 (2006)
Erciyes, K.: A Replication-Based Fault Tolerance Protocol Using Group Communication for the Grid. In: Guo, M., Yang, L.T., Di Martino, B., Zima, H.P., Dongarra, J., Tang, F. (eds.) ISPA 2006. LNCS, vol. 4330, pp. 672–681. Springer, Heidelberg (2006)
Shen, H.H., Chen, S.M., Zheng, W.M., Shi, S.M.: A Communication Model for Data Availability on Server Clusters. In: Proc. Int’l. Symposium on Distributed Computing and Application, Wuhan, pp. 169–171 (2001)
Christensen, N.H.: A formal analysis of recovery in a preservational data grid. In: 14th NASA Goddard - 23rd IEEE Conference on Mass Storage Systems and Technologies, College Park, Maryland, USA, May 15-18 (2006)
Wang, Y., Li, Z.-h., Lin, W.: A Fast Disaster Recovery Mechanism for Volume Replication Systems. In: Perrott, R., Chapman, B.M., Subhlok, J., de Mello, R.F., Yang, L.T. (eds.) HPCC 2007. LNCS, vol. 4782, pp. 732–743. Springer, Heidelberg (2007)

Author information

Authors and Affiliations

Department of Computer Science, Faculty of Science and Technology, Universiti Malaysia Terengganu, 21030, Kuala Terengganu, Malaysia
Ahmad Shukri Mohd Noor
Faculty of Multimedia and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Parit Raja, Batu Pahat, Johor Darul Takzim, Malaysia
Mustafa Mat Deris

Editor information

Editors and Affiliations

College of Computer Science, South-Central University for Nationalities, Minyuan Road 708, 430073, Wuhan, China
Rongbo Zhu
Victoria University, 8001, Melbourne, VIC, Australia
Yanchun Zhang
College of Science, He’Bei Polytechnic University, Tangshan, 063000, Hebei, China
Baoxiang Liu & Chunfeng Liu

About this paper

Cite this paper

Noor, A.S.M., Deris, M.M. (2010). Failure Recovery Mechanism in Neighbor Replica Distribution Architecture. In: Zhu, R., Zhang, Y., Liu, B., Liu, C. (eds) Information Computing and Applications. ICICA 2010. Lecture Notes in Computer Science, vol 6377. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16167-4_6

DOI: https://doi.org/10.1007/978-3-642-16167-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16166-7
Online ISBN: 978-3-642-16167-4
eBook Packages: Computer Science, Computer Science (R0)
