XCube: Processing XPath queries in a hypercube overlay network

Peer-to-Peer Networking and Applications

Abstract

In this paper, we present the design and performance of XCube, a tag-based system for managing XML data in a hypercube overlay network. In XCube, each node in a d-dimensional hypercube is identified by a d-bit vector. A peer manages a smaller hypercube with dimension d′ < d. An XML document is compactly represented as a structure summary and a content summary. The structure summary comprises a d-bit vector derived from the distinct tag names in the document and a synopsis capturing the structure of the document. The content summary consists of a bit map that summarizes the document content. The metadata of a document, i.e., owner IP, document identifier, structure summary and content summary, is indexed at its anchor peer (the peer that manages the node with the matching bit vector). In addition, the structure summary is indexed at all peers that manage nodes whose bit vectors are covered by the document’s bit vector. An XPath query is processed in four phases. In phase 1, the query is routed to its anchor peer according to the bit vector of the query. In phase 2, the query is evaluated against all the synopses stored at its anchor peer and forwarded to the anchor peers of the matching synopses. In phase 3, the anchor peer of each matching synopsis evaluates the query against the associated bit maps and forwards the query to the corresponding owner peers. Finally, in phase 4, the owner peers evaluate the query on the XML documents and return answers to the querying peer. We also present a scheme that dynamically partitions the hypercube to balance the load across peers. We further exploit the partition history to remove redundant messages. We conduct a comprehensive experimental study, and the results show the efficiency of XCube.
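
The tag-based indexing described above can be illustrated with a minimal sketch. It is not the paper's implementation: the dimension D = 8, the SHA-1 based mapping from tag names to bit positions, the regular expression used to pull tag names out of a query, and all function names are assumptions made here for illustration. The sketch also reads "covered" as bitwise containment (every 1-bit of the node's vector is also set in the document's vector), and it does not model peers, sub-hypercubes of dimension d′, synopses or bit maps.

```python
import hashlib
import re

D = 8  # hypothetical hypercube dimension d (assumption, not from the paper)


def tag_bit(tag: str, d: int = D) -> int:
    """Map a tag name to a bit position in [0, d) (assumed hashing scheme)."""
    digest = hashlib.sha1(tag.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % d


def bit_vector(tags, d: int = D) -> int:
    """Build a d-bit vector (as an int) from a set of distinct tag names."""
    vec = 0
    for tag in tags:
        vec |= 1 << tag_bit(tag, d)
    return vec


def covers(doc_vec: int, node_vec: int) -> bool:
    """True if every 1-bit of node_vec is also set in doc_vec."""
    return node_vec & ~doc_vec == 0


def covered_nodes(doc_vec: int):
    """Yield every node identifier whose bit vector is covered by doc_vec."""
    sub = doc_vec
    while True:
        yield sub
        if sub == 0:
            break
        sub = (sub - 1) & doc_vec  # next smaller submask of doc_vec


def query_tags(xpath: str) -> set:
    """Extract tag names from a simple path such as /book//author.

    Handles only plain child/descendant steps (no predicates, axes or
    functions); the structure of the query is ignored.
    """
    return set(re.findall(r"[A-Za-z_][\w.-]*", xpath))


doc_vec = bit_vector({"book", "title", "author"})
query_vec = bit_vector(query_tags("/book//author"))

# The query's anchor node is covered by the document's bit vector, so the
# document's structure summary would be indexed at the peer managing it.
print(covers(doc_vec, query_vec))
print(sorted(covered_nodes(doc_vec)))
```

Under these assumptions, a document whose tags include all tags of a query always covers the query's bit vector, so phase 1 routing reaches a peer that holds the document's structure summary. Because distinct tags may hash to the same bit, the bit vector alone can produce false positives, which is consistent with the later phases evaluating the query against synopses, bit maps and finally the documents themselves.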

Notes

  1. A structure/path query can be mapped into a tag-based query by ignoring the structure.

  2. We will use the terms document structure, synopsis, and XML tree interchangeably in the paper.

  3. http://inex.is.informatik.uni-duisburg.de.

  4. http://monetdb.cwi.nl/xml.

  5. http://www.nitf.org.

  6. If a document can be summarized with multiple bit maps, the bit maps can be built with more bits, and thus they are more accurate.

  7. The number of edges in a tree is (T − 1), where T is the number of nodes in the tree. Therefore, the number of 1-bits in the bit vector is bounded by 2N.

  8. http://xml.apache.org/xalan-j/.

Acknowledgements

Both Yingguang Li and Kian-Lee Tan are partially supported by a university research grant R-252-000-237-112.

Author information

Corresponding author

Correspondence to Yingguang Li.

About this article

Cite this article

Li, Y., Özsu, M.T. & Tan, KL. XCube: Processing XPath queries in a hypercube overlay network. Peer-to-Peer Netw. Appl. 2, 128–145 (2009). https://doi.org/10.1007/s12083-008-0025-3

  • DOI: https://doi.org/10.1007/s12083-008-0025-3
