Abstract
Collaborative applications are characterized by high levels of data sharing. Optimistic replication has been suggested as a mechanism to enable highly concurrent access to the shared data while providing full application-defined consistency guarantees. A growing number of emerging cooperative applications are well suited to Peer-to-Peer (P2P) networks. However, deploying such applications in P2P networks requires a mechanism that handles their high level of data sharing in a dynamic, scalable and available way. Previous work on optimistic replication has mainly concentrated on centralized systems. Centralized approaches are inappropriate for a P2P setting because of their limited availability and their vulnerability to failures and network partitions. In this paper, we focus on a reconciliation algorithm designed for deployment in large-scale cooperative applications, such as P2P Wiki. The main contribution of this paper is a distributed reconciliation algorithm designed for P2P networks (P2P-reconciler). Other important contributions are: a basic cost model for computing communication costs in a DHT overlay network; a strategy for computing the cost of each reconciliation step taking into account the cost model; and an algorithm that dynamically selects the best nodes for each reconciliation step. Furthermore, since P2P networks are built independently of the underlying topology, which may cause high latencies and large overheads that degrade performance, we also propose a topology-aware variant of our P2P-reconciler algorithm and show the significant gains obtained by using it. Our P2P-reconciler solution enables high levels of concurrency thanks to semantic reconciliation and yields high availability and excellent scalability, with acceptable performance and limited overhead.
Additional information
Recommended by Ahmed K. Elmagarmid.
Work partially funded by the ARA “Massive Data” of the French ministry of research (project Respire), the European Strep Grid4All project, the CAPES-COFECUB Daad project and the CNPq-INRIA Gridata project.
About this article
Cite this article
Martins, V., Pacitti, E., El Dick, M. et al. Scalable and topology-aware reconciliation on P2P networks. Distrib Parallel Databases 24, 1–43 (2008). https://doi.org/10.1007/s10619-008-7029-0
DOI: https://doi.org/10.1007/s10619-008-7029-0