Abstract
Non-trivial search predicates that go beyond mere equality are a current focus of P2P research. Structured queries, an important type of non-trivial search, have so far been studied extensively, mainly for unstructured P2P systems. Because unstructured P2P systems do not use indexing, structured queries are very easy to implement there: they can be treated like any other type of query. However, this comes at the expense of very high bandwidth consumption and of limits on the guarantees and expressiveness that can be provided. Structured P2P systems are an efficient alternative, as they typically offer search complexity that is logarithmic in the number of peers. Although the use of a distributed index (typically a distributed hash table) makes structured queries more efficient, it also introduces considerable complexity, and thus only a few approaches exist so far. In this paper we present a first solution for efficiently supporting structured queries, more specifically XPath queries, in structured P2P systems. For the moment we focus on supporting queries with descendant axes (“//”) and wildcards (“*”) and do not address joins. The results presented in this paper provide basic functionality that higher-level query engines can use to support complex queries more efficiently.
The work presented in this paper was (partly) carried out in the framework of the EPFL Center for Global Computing and supported by the Swiss National Funding Agency OFES as part of the European project BRICKS No 507457.
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Skobeltsyn, G., Hauswirth, M., Aberer, K. (2005). Efficient Processing of XPath Queries with Structured Overlay Networks. In: Meersman, R., Tari, Z. (eds) On the Move to Meaningful Internet Systems 2005: CoopIS, DOA, and ODBASE. OTM 2005. Lecture Notes in Computer Science, vol 3761. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11575801_20
DOI: https://doi.org/10.1007/11575801_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29738-3
Online ISBN: 978-3-540-32120-0
eBook Packages: Computer Science (R0)