Abstract
Concept hierarchies greatly help in the organization and reuse of information and are widely used in a variety of information systems applications. In this paper, we describe a method for efficiently storing and querying data organized into concept hierarchies and dispersed over a distributed hash table (DHT). In our method, peers individually decide on the level of indexing according to the granularity of the incoming queries. Roll-up and drill-down operations are performed on a per-node basis in order to minimize the bandwidth required to answer queries at variable aggregation levels. We motivate our approach by applying it to a large-scale Grid system: our fully decentralized scheme creates, queries and updates large volumes of hierarchical data on-line, replacing the traditional centralized and strictly indexed information systems. Our extensive experimental results support this argument across many diverse configurations: the system proves very efficient under skewed workloads, over both single and multiple hierarchy levels at the same time. It adapts to sudden changes in popularity and effectively stores and updates large amounts of data at very low cost.
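As a rough illustration of the idea summarized above, the following sketch shows how a peer might pick its indexing level from the granularity of incoming queries and how a roll-up aggregates data to a coarser level. It is not the authors' implementation: the three-level hierarchy (`country`, `city`, `site`), the `Peer` class, the `dht_key` and `roll_up` helpers, and the "most frequently queried level" policy are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict

# Hypothetical concept hierarchy, ordered from coarsest to finest level.
LEVELS = ("country", "city", "site")


def dht_key(level: str, value: str) -> str:
    """Hash a (level, value) pair to a hex identifier, as a DHT might place it."""
    return hashlib.sha1(f"{level}:{value}".encode("utf-8")).hexdigest()


class Peer:
    """Toy peer that tracks which hierarchy level incoming queries target."""

    def __init__(self, index_level: str = "site") -> None:
        self.index_level = index_level
        self.query_counts = defaultdict(int)

    def record_query(self, level: str) -> None:
        self.query_counts[level] += 1

    def adapt(self) -> str:
        """Assumed policy: index at the level queried most often so far."""
        if self.query_counts:
            self.index_level = max(self.query_counts, key=self.query_counts.get)
        return self.index_level


def roll_up(facts, level: str) -> dict:
    """Aggregate (path, measure) facts up to the given hierarchy level."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for path, measure in facts:
        totals[path[:depth]] += measure
    return dict(totals)


facts = [
    (("GR", "Athens", "s1"), 5),
    (("GR", "Athens", "s2"), 3),
    (("GR", "Patras", "s3"), 2),
]

peer = Peer()
for queried_level in ("city", "site", "city"):
    peer.record_query(queried_level)

chosen = peer.adapt()            # level this peer would now index at
summary = roll_up(facts, chosen) # data aggregated to that level
```

In this toy setting the peer moves from the finest level to `city` because most queries target it; a drill-down would be the reverse step, re-indexing at a finer level when queries become more specific.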
Additional information
This work was partly supported by the European Commission through the GREDIA FP6 IST Project (FP6-34363).
About this article
Cite this article
Asiki, A., Tsoumakos, D. & Koziris, N. Distributing and searching concept hierarchies: an adaptive DHT-based system. Cluster Comput 13, 257–276 (2010). https://doi.org/10.1007/s10586-010-0136-5
DOI: https://doi.org/10.1007/s10586-010-0136-5