DOI: 10.1145/1031453.1031464

XPath lookup queries in P2P networks

Published: 12 November 2004


We address the problem of querying XML data over a P2P network. P2P networks usually allow only exact-match queries over file names. We discuss the extensions needed to handle XML data and XPath queries. A single peer can hold either a whole document or a partial or complete fragment of it. Each XML fragment or document is identified by a distinct path expression, which is encoded in a distributed hash table. Our framework differs from content-based routing mechanisms, which are biased towards finding the most relevant peers holding the data. We perform fragment placement and enable fragment lookup by exploiting only a few path expressions stored on each peer. By taking advantage of quasi-zero replication of global catalogs, our system supports fast full and partial XPath querying. To this end, we have extended the Chord simulator and performed an experimental evaluation of our approach.
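The idea of identifying each fragment by a path expression and encoding that expression in a distributed hash table can be illustrated with a minimal sketch. It is not the authors' implementation: the names `chord_id`, `Ring`, `place`, and `lookup`, the 16-bit identifier space, and the use of SHA-1 to hash path expressions are all assumptions made for illustration. The ring lives in a single process, performs no finger-table routing, and handles only exact matches on a full path expression, not partial XPath queries.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an assumed, deliberately small value


def chord_id(key: str, m: int = M) -> int:
    """Map a string (peer name or path expression) onto the identifier ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


class Ring:
    """Single-process stand-in for a Chord-style ring (hypothetical helper)."""

    def __init__(self, peer_names):
        # Peers sorted by their position on the identifier ring.
        self.peers = sorted((chord_id(name), name) for name in peer_names)
        self.ids = [pid for pid, _ in self.peers]
        # Each peer keeps a small table: path expression -> fragment.
        self.store = {name: {} for name in peer_names}

    def successor(self, ident: int) -> str:
        """Return the first peer whose id is >= ident, wrapping around the ring."""
        i = bisect_left(self.ids, ident)
        return self.peers[i % len(self.peers)][1]

    def place(self, path: str, fragment: str) -> str:
        """Store a fragment on the peer responsible for its path expression."""
        peer = self.successor(chord_id(path))
        self.store[peer][path] = fragment
        return peer

    def lookup(self, path: str):
        """Return (responsible peer, fragment or None) for a full path expression."""
        peer = self.successor(chord_id(path))
        return peer, self.store[peer].get(path)


ring = Ring(["peer-a", "peer-b", "peer-c"])
owner = ring.place("/site/people/person", "<person/>")
found_on, fragment = ring.lookup("/site/people/person")
```

Because placement and lookup hash the same path expression, a lookup reaches the peer that stored the fragment without consulting a replicated global catalog. This matches the abstract's emphasis on keeping only a few path expressions on each peer.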








Published In

WIDM '04: Proceedings of the 6th annual ACM international workshop on Web information and data management
November 2004
168 pages



Association for Computing Machinery

New York, NY, United States


Author Tags

  1. P2P networks
  2. XML querying
  3. XPath
  4. distributed XML indexes


  • Article


CIKM04: Conference on Information and Knowledge Management
November 12 - 13, 2004
Washington DC, USA


Cited By

  • (2014) A Survey on XML Fragmentation. ACM SIGMOD Record 43:3 (24-35). DOI: 10.1145/2694428.2694434. Online publication date: 4-Dec-2014
  • (2014) Distributed discovery of user handles with privacy. 2014 IEEE Global Communications Conference (2947-2953). DOI: 10.1109/GLOCOM.2014.7037256. Online publication date: Dec-2014
  • (2014) Large scale arbitrary search with rendezvous search systems. Peer-to-Peer Networking and Applications 8:2 (229-240). DOI: 10.1007/s12083-014-0247-5. Online publication date: 13-Feb-2014
  • (2014) A gossip-based approach for Internet-scale cardinality estimation of XPath queries over distributed semistructured data. The VLDB Journal — The International Journal on Very Large Data Bases 23:1 (51-76). DOI: 10.1007/s00778-013-0314-1. Online publication date: 1-Feb-2014
  • (2013) Visual Evaluation of XPath Queries. Proceedings of the 2013 International Conference on Computational and Information Sciences (434-437). DOI: 10.1109/ICCIS.2013.121. Online publication date: 21-Jun-2013
  • (2012) FoXtrot. ACM Transactions on the Web 6:3 (1-34). DOI: 10.1145/2344416.2344419. Online publication date: 2-Oct-2012
  • (2011) A Survey of Distributed Search Techniques in Large Scale Distributed Systems. IEEE Communications Surveys & Tutorials 13:2 (150-167). DOI: 10.1109/SURV.2011.040410.00097. Online publication date: 2011
  • (2010) Selectivity-based XML query processing in structured peer-to-peer networks. Proceedings of the Fourteenth International Database Engineering & Applications Symposium (236-244). DOI: 10.1145/1866480.1866513. Online publication date: 16-Aug-2010
  • (2009) Fragmenting very large XML data warehouses via K-means clustering algorithm. International Journal of Business Intelligence and Data Mining 4:3/4 (301-328). DOI: 10.1504/IJBIDM.2009.029076. Online publication date: 1-Nov-2009
  • (2009) Enhancing XML data warehouse query performance by fragmentation. Proceedings of the 2009 ACM symposium on Applied Computing (1555-1562). DOI: 10.1145/1529282.1529630. Online publication date: 8-Mar-2009
