Abstract
A distributed system that employs checkpointing and rollback-recovery as its fault tolerance mechanism incurs overhead attributable to that technique. The authors of [4] propose a technique that automatically identifies a checkpoint and recovery protocol based on a pre-estimated database of overhead measures. The technique depends on computing the similarity between a pair of communication patterns. The computation first partitions both communication patterns into small pieces, or splices. A pair of splices, one taken from each of the two communication patterns in question, is then compared to compute a similarity measure. Splicing a communication pattern is an important step in the method because it strongly influences the later steps of the computation. This paper introduces a new method for splicing. Experimental results show that the technique yields better similarity measure values than those reported in [4].
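The splice-then-compare computation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the representation of a message as a `(sender, receiver)` pair, the fixed-size windowing in `splice`, and the multiset Jaccard score in `splice_similarity` are placeholders, not the splicing method introduced in this paper nor the similarity measure of [4].

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical representation: a communication pattern is an ordered list
# of messages, each recorded as a (sender, receiver) pair.
Message = Tuple[int, int]


def splice(pattern: List[Message], size: int) -> List[List[Message]]:
    """Cut a pattern into consecutive splices of at most `size` messages.

    Fixed-size windowing is a placeholder, not the paper's splicing method.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [pattern[i:i + size] for i in range(0, len(pattern), size)]


def splice_similarity(a: List[Message], b: List[Message]) -> float:
    """Multiset Jaccard similarity of the channels used by two splices.

    This score is a stand-in for the similarity measure of [4].
    """
    ca, cb = Counter(a), Counter(b)
    union = sum((ca | cb).values())
    if union == 0:
        return 1.0  # two empty splices are treated as identical
    return sum((ca & cb).values()) / union


def pattern_similarity(p: List[Message], q: List[Message], size: int) -> float:
    """Average the similarity of position-aligned splice pairs.

    A splice with no counterpart in the other pattern is compared
    against an empty splice and therefore contributes 0.
    """
    sp, sq = splice(p, size), splice(q, size)
    n = max(len(sp), len(sq))
    if n == 0:
        return 1.0
    total = 0.0
    for i in range(n):
        a = sp[i] if i < len(sp) else []
        b = sq[i] if i < len(sq) else []
        total += splice_similarity(a, b)
    return total / n


if __name__ == "__main__":
    p = [(0, 1), (1, 2), (2, 0), (0, 1)]
    q = [(0, 1), (1, 2), (0, 2), (0, 1)]
    print(pattern_similarity(p, q, size=2))
```

Even in this toy form, the outcome depends on where the splice boundaries fall, which is why the choice of splicing method matters for the later comparison steps.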
References
Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Communications of the ACM 21(7), 558–565 (1978)
Elnozahy, E.N., Alvisi, L., Wang, Y., Johnson, D.B.: A survey of rollback-recovery protocols in message-passing systems. ACM Computing Surveys 34(3), 375–408 (2002)
Netzer, R.H.B., Xu, J.: Necessary and sufficient conditions for consistent global snapshots. IEEE Transactions on Parallel and Distributed Systems 6(2), 165–169 (1995)
Paul, H.S., Gupta, A., Sharma, A.: Finding a suitable checkpoint and recovery protocol for a distributed application. J. Parallel and Distributed Computing 66(5), 732–749 (2006)
Paul, H.S., Gupta, A., Badrinath, R.: Performance comparison of checkpoint and recovery protocols. Concurrency and Computation: Practice and Experience 15(15), 1363–1386 (2003)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Paul, H.S. (2010). Causal Cycle Based Communication Pattern Matching. In: Kant, K., Pemmaraju, S.V., Sivalingam, K.M., Wu, J. (eds) Distributed Computing and Networking. ICDCN 2010. Lecture Notes in Computer Science, vol 5935. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11322-2_28
DOI: https://doi.org/10.1007/978-3-642-11322-2_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11321-5
Online ISBN: 978-3-642-11322-2
eBook Packages: Computer Science, Computer Science (R0)