Abstract
A range-sum query computes aggregate information over a data cube within a query range specified by a user. Existing methods based on the prefix-sum approach use an additional cube, called the prefix-sum cube (PC), to store the cumulative sums of the data, which incurs a high space overhead. This overhead not only raises the cost of storage devices but also causes additional update propagation and longer access times on physical devices. In this paper, we propose a new cube, called the Partial Prefix-sum Cube (PPC), that drastically reduces the space required by the PC in a large data warehouse. The PPC also decreases the update propagation caused by dependencies between the values in cells of the PC. We perform extensive experiments with respect to various dimensions of the data cube and query sizes, and examine the effectiveness and performance of the proposed method. Experimental results show that the PPC drastically reduces the space requirements while maintaining reasonable query performance.
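To make the baseline concrete, the following is a minimal sketch of the conventional prefix-sum approach on a two-dimensional cube. It illustrates the full PC that the abstract describes, not the proposed PPC, whose construction is not given here. The function names (`build_prefix_sum`, `range_sum`, `cells_affected_by_update`) and the sample data are assumptions chosen for illustration.

```python
def build_prefix_sum(cube):
    """Return P with P[i][j] = sum of cube[0..i][0..j] (same size as cube)."""
    rows, cols = len(cube), len(cube[0])
    prefix = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        row_total = 0  # running sum of the current row up to column j
        for j in range(cols):
            row_total += cube[i][j]
            prefix[i][j] = row_total + (prefix[i - 1][j] if i > 0 else 0)
    return prefix


def range_sum(prefix, lo_row, lo_col, hi_row, hi_col):
    """Sum over the inclusive box, using inclusion-exclusion on four PC cells."""
    def p(i, j):
        # Cells outside the cube (index -1) contribute zero.
        return prefix[i][j] if i >= 0 and j >= 0 else 0

    return (p(hi_row, hi_col) - p(lo_row - 1, hi_col)
            - p(hi_row, lo_col - 1) + p(lo_row - 1, lo_col - 1))


def cells_affected_by_update(rows, cols, row, col):
    """Count PC cells that depend on cube[row][col] and change when it changes."""
    return (rows - row) * (cols - col)


# Hypothetical 3 x 3 cube of measure values.
cube = [[3, 5, 1],
        [7, 3, 2],
        [2, 4, 2]]
prefix = build_prefix_sum(cube)
total = range_sum(prefix, 1, 1, 2, 2)  # 3 + 2 + 4 + 2
```

The sketch shows both costs named in the abstract: the PC is as large as the data cube itself, and a change to a single cell near the origin forces an update of almost every PC cell.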
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chun, SJ. (2003). Partial Prefix Sum Method for Large Data Warehouses. In: Zhong, N., Raś, Z.W., Tsumoto, S., Suzuki, E. (eds) Foundations of Intelligent Systems. ISMIS 2003. Lecture Notes in Computer Science, vol 2871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39592-8_67
DOI: https://doi.org/10.1007/978-3-540-39592-8_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20256-1
Online ISBN: 978-3-540-39592-8
eBook Packages: Springer Book Archive