Partial Prefix Sum Method for Large Data Warehouses

Chun, Seok-Ju

doi:10.1007/978-3-540-39592-8_67

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2871)

Included in the following conference series:

International Symposium on Methodologies for Intelligent Systems

Abstract

A range-sum query computes aggregate information over the cells of a data cube that fall within a user-specified query range. Existing methods based on the prefix-sum approach use an additional cube, called the prefix-sum cube (PC), to store the cumulative sums of the data, which causes a high space overhead. This overhead not only raises storage costs but also leads to additional update propagation and longer access times on physical devices. In this paper, we propose a new cube, called the Partial Prefix-sum Cube (PPC), that drastically reduces the space occupied by the PC in a large data warehouse. The PPC also decreases the update propagation caused by dependencies between the values in cells of the PC. We perform extensive experiments over data cubes of various dimensionalities and over various query sizes, and examine the effectiveness and performance of the proposed method. The experimental results show that the PPC drastically reduces space requirements while maintaining reasonable query performance.
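
To make the baseline concrete, the following is a minimal sketch of the conventional prefix-sum approach that the abstract contrasts against, restricted to a two-dimensional cube. It is not the PPC, whose construction the abstract does not describe. The function names (`build_prefix_cube`, `range_sum`, `cells_touched_by_update`) and the sample data are assumptions made purely for illustration.

```python
def build_prefix_cube(cube):
    """Return PC with PC[i][j] = sum of cube[0..i][0..j], bounds inclusive."""
    rows, cols = len(cube), len(cube[0])
    pc = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        running = 0  # cumulative sum along the current row
        for j in range(cols):
            running += cube[i][j]
            pc[i][j] = running + (pc[i - 1][j] if i > 0 else 0)
    return pc


def range_sum(pc, lo, hi):
    """Sum of the cube cells in rows lo[0]..hi[0] and columns lo[1]..hi[1].

    Uses inclusion-exclusion over four PC lookups, so the cost does not
    depend on the size of the query range.
    """
    (r1, c1), (r2, c2) = lo, hi

    def at(i, j):
        # Treat indices below zero as an empty prefix.
        return pc[i][j] if i >= 0 and j >= 0 else 0

    return at(r2, c2) - at(r1 - 1, c2) - at(r2, c1 - 1) + at(r1 - 1, c1 - 1)


def cells_touched_by_update(shape, cell):
    """Count the PC cells that depend on one cube cell (hypothetical helper).

    Every PC[x][y] with x >= cell[0] and y >= cell[1] includes that cell.
    """
    return (shape[0] - cell[0]) * (shape[1] - cell[1])


# Illustrative data only; not taken from the paper.
cube = [
    [3, 5, 1],
    [7, 3, 2],
    [2, 4, 2],
]
pc = build_prefix_cube(cube)
total = range_sum(pc, (1, 1), (2, 2))               # 3 + 2 + 4 + 2
worst_case = cells_touched_by_update((3, 3), (0, 0))
```

In this sketch the PC has as many cells as the cube itself, and changing the cell at the origin forces every PC cell to be recomputed. These are the space overhead and update propagation that the abstract attributes to the PC.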

Author information

Authors and Affiliations

Department of Internet Information, Ansan College, 752, Il-Dong, Sangrok-Ku, Ansan, Korea
Seok-Ju Chun

Editor information

Editors and Affiliations

The International WIC Institute, Beijing University of Technology, China
Ning Zhong
Department of Computer Science, University of North Carolina, Charlotte, NC 28223, USA
Zbigniew W. Raś
Shimane University, 89-1 Enya-cho Izumo, 6938501, Shimane, Japan
Shusaku Tsumoto
Department of Informatics, Graduate School of Information Science and Electrical Engineering, Kyushu University, 744 Motooka, Nishi, 819-0395, Fukuoka, Japan
Einoshin Suzuki

About this paper

Cite this paper

Chun, SJ. (2003). Partial Prefix Sum Method for Large Data Warehouses. In: Zhong, N., Raś, Z.W., Tsumoto, S., Suzuki, E. (eds) Foundations of Intelligent Systems. ISMIS 2003. Lecture Notes in Computer Science, vol 2871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39592-8_67

DOI: https://doi.org/10.1007/978-3-540-39592-8_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20256-1
Online ISBN: 978-3-540-39592-8
eBook Packages: Springer Book Archive
