Optimization of Multidimensional Aggregates in Data Warehouses

Russel Pears, Bryan Houliston
Copyright: © 2007 | Volume: 18 | Issue: 1 | Pages: 25
ISSN: 1063-8016 | EISSN: 1533-8010 | EISBN13: 9781615200474 | DOI: 10.4018/jdm.2007010104
Cite Article

MLA

Pears, Russel, and Bryan Houliston. "Optimization of Multidimensional Aggregates in Data Warehouses." Journal of Database Management (JDM), vol. 18, no. 1, 2007, pp. 69-93. http://doi.org/10.4018/jdm.2007010104

APA

Pears, R., & Houliston, B. (2007). Optimization of Multidimensional Aggregates in Data Warehouses. Journal of Database Management (JDM), 18(1), 69-93. http://doi.org/10.4018/jdm.2007010104

Chicago

Pears, Russel, and Bryan Houliston. "Optimization of Multidimensional Aggregates in Data Warehouses." Journal of Database Management (JDM) 18, no. 1 (2007): 69-93. http://doi.org/10.4018/jdm.2007010104


Abstract

The computation of multidimensional aggregates is a common operation in OLAP applications. The major bottleneck is the large volume of data that needs to be processed, which leads to prohibitively expensive query execution times. On the other hand, data analysts are primarily concerned with discerning trends in the data, so a system that provides approximate answers in a timely fashion would suit their requirements better. In this article we present the prime factor scheme, a novel method for compressing data in a warehouse. Our data compression method is based on aggregating data on each dimension of the data warehouse. We used both real-world and synthetic data to compare our scheme against the Haar wavelet, and our experiments on range-sum queries show that it outperforms the latter scheme with respect to both decoding time and error rate, while maintaining comparable compression ratios. One encouraging feature is the stability of the error rate when compared to the Haar wavelet. Although wavelets have been shown to be effective at compressing data, the approximate answers they provide vary widely, even for identical types of queries on nearly identical values in distinct parts of the data. This problem has been attributed to the thresholding technique used to reduce the size of the encoded data, which is an integral part of the wavelet compression scheme. In contrast, the prime factor scheme does not rely on thresholding but keeps a smaller version of every data element from the original data, and it is thus able to achieve a much higher degree of error stability, which is important from a data analyst's point of view.
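The thresholding step the abstract blames for the Haar wavelet's error instability can be seen in a minimal one-dimensional sketch of the standard technique: decompose into averages and detail coefficients, keep only the k largest-magnitude details, and reconstruct. This is purely illustrative of the baseline scheme, not the paper's prime factor method; all function names here are our own.

```python
def haar_decompose(data):
    """Full Haar decomposition of a list whose length is a power of two.

    Returns [overall average, coarsest detail, ..., finest details].
    """
    coeffs = []
    avgs = list(data)
    while len(avgs) > 1:
        pairs = [(avgs[i], avgs[i + 1]) for i in range(0, len(avgs), 2)]
        # Detail = half-difference, average = midpoint of each pair.
        coeffs = [(a - b) / 2 for a, b in pairs] + coeffs
        avgs = [(a + b) / 2 for a, b in pairs]
    return avgs + coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose: expand averages level by level."""
    avgs, rest = coeffs[:1], coeffs[1:]
    while rest:
        n = len(avgs)
        details, rest = rest[:n], rest[n:]
        # Each (average, detail) pair expands to (avg + d, avg - d).
        avgs = [v for a, d in zip(avgs, details) for v in (a + d, a - d)]
    return avgs


def threshold_topk(coeffs, k):
    """Keep the overall average plus the k largest-magnitude details;
    zero the rest. This is the lossy step that causes error instability."""
    details = coeffs[1:]
    keep = set(sorted(range(len(details)),
                      key=lambda i: abs(details[i]), reverse=True)[:k])
    return coeffs[:1] + [d if i in keep else 0.0
                         for i, d in enumerate(details)]


data = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
coeffs = haar_decompose(data)
exact = haar_reconstruct(coeffs)            # lossless roundtrip
approx = haar_reconstruct(threshold_topk(coeffs, 3))  # lossy approximation
```

Because the dropped coefficients depend on the local data distribution, two queries over nearly identical regions can see very different residual errors, which is the instability the prime factor scheme is designed to avoid.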
