Abstract
The evolution of peer-to-peer (P2P) systems has triggered the building of large-scale distributed applications. The main application domain is data sharing across a very large number of highly autonomous participants. Building such data sharing systems is particularly challenging because of the “extreme” characteristics of P2P infrastructures: massive distribution, high churn rate, no global control, and potentially untrusted participants. This article focuses on declarative querying support, query optimization, and data privacy in a major class of P2P systems, those based on Distributed Hash Tables (P2P DHT). The usual approaches and algorithms used by classic distributed systems and databases to provide data privacy and querying services are not well suited to P2P DHT systems. A considerable amount of work was required to adapt them to the new challenges such systems present. This paper describes the most important solutions found. It also identifies important future research trends in data management in P2P DHT systems.
This work is supported by the ECOS C07M02 action.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Roncancio, C., del Pilar Villamil, M., Labbé, C., Serrano-Alvarado, P. (2009). Data Sharing in DHT Based P2P Systems. In: Hameurlain, A., Küng, J., Wagner, R. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems I. Lecture Notes in Computer Science, vol 5740. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03722-1_13
DOI: https://doi.org/10.1007/978-3-642-03722-1_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03721-4
Online ISBN: 978-3-642-03722-1
eBook Packages: Computer Science (R0)