Abstract
Hierarchical clustering has proven to be an effective means of physically organizing large fact tables, since it significantly reduces the I/O cost of ad hoc OLAP query evaluation. In this paper, we propose the CUBE File, a novel multidimensional file structure for organizing the most detailed data of a cube. The CUBE File achieves hierarchical clustering of the data, enabling fast access via hierarchical restrictions. Moreover, it imposes a low storage cost and adapts perfectly to the extensive sparseness of the data space, achieving a high compression rate. Our results show that the CUBE File outperforms the most effective method proposed to date for hierarchically clustering the cube, requiring 7-9 times fewer I/Os on average for all workloads tested, which indicates a higher degree of hierarchical clustering. Moreover, the CUBE File imposes a 2-3 times lower storage cost.
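As a rough illustration of why hierarchical clustering reduces I/O, the sketch below compares how many pages a hierarchical restriction touches under two physical layouts of the same toy fact table. It is not the CUBE File structure itself: the two-level hierarchy (region, city), the page size, and all function names are assumptions made up for this example, and sorting by hierarchy path merely stands in for whatever clustering a real structure applies.

```python
import random

PAGE_SIZE = 4  # hypothetical number of fact records per disk page


def build_facts():
    """Create toy facts keyed by an assumed (region, city) hierarchy path."""
    hierarchy = {
        "North": ["Oslo", "Bergen"],
        "South": ["Athens", "Patras"],
        "West": ["Lisbon", "Porto"],
    }
    facts = []
    for region, cities in hierarchy.items():
        for city in cities:
            for day in range(4):
                facts.append(
                    {"region": region, "city": city, "day": day, "sales": 10 * day}
                )
    return facts


def paginate(facts):
    """Split a physical ordering of facts into fixed-size pages."""
    return [facts[i:i + PAGE_SIZE] for i in range(0, len(facts), PAGE_SIZE)]


def pages_touched(pages, region):
    """Count pages holding at least one fact that satisfies region == <value>."""
    return sum(1 for page in pages if any(f["region"] == region for f in page))


facts = build_facts()

# Layout 1: facts stored in arbitrary arrival order (fixed seed for repeatability).
arrival_order = facts[:]
random.Random(0).shuffle(arrival_order)
unclustered_pages = paginate(arrival_order)

# Layout 2: facts stored in hierarchy-path order, so siblings are adjacent.
clustered = sorted(facts, key=lambda f: (f["region"], f["city"], f["day"]))
clustered_pages = paginate(clustered)

for region in ("North", "South", "West"):
    print(
        region,
        "unclustered pages:", pages_touched(unclustered_pages, region),
        "clustered pages:", pages_touched(clustered_pages, region),
    )
```

In the clustered layout each region's eight facts occupy exactly two consecutive pages, so a restriction on one region reads the minimum possible number of pages. The arrival-order layout can scatter the same eight facts over many more pages.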
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Karayannidis, N., Sellis, T., Kouvaras, Y. (2004). CUBE File: A File Structure for Hierarchically Clustered OLAP Cubes. In: Bertino, E., et al. Advances in Database Technology - EDBT 2004. EDBT 2004. Lecture Notes in Computer Science, vol 2992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24741-8_36
DOI: https://doi.org/10.1007/978-3-540-24741-8_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21200-3
Online ISBN: 978-3-540-24741-8
eBook Packages: Springer Book Archive