Supporting Online Queries in ROLAP

Barbará, Daniel; Wu, Xintao

doi:10.1007/3-540-44466-1_23

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1874)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Data warehouses are becoming a powerful tool for analyzing enterprise data. A critical demand of data warehouse users is that the time to get an answer after posing a query (the latency) be as short as possible. Arguably, a quick, albeit approximate, answer that can be refined over time is much better than a perfect answer for which a user has to wait a long time. In this paper we address the issue of online support for data warehouse queries, meaning the ability to reduce the latency of the answer at the expense of having an approximate answer that can be refined as the user is looking at it. Previous work has addressed online support by using sampling techniques. We argue that a better way is to preclassify the cells of the data cube into error bins and bring the target data for a query in “waves,” i.e., by fetching the data in those bins one after the other. The cells are classified into bins using a data model (e.g., linear regression, log-linear models) that allows the system to obtain an approximate value for each of the data cube cells. The difference between the estimated value and the true value is the estimation error, and its magnitude determines the bin to which the cell belongs. The estimated value given by the model provides a very quick, yet approximate, answer that is then refined online by bringing in cells from the error bins. Experiments show that this technique is a good way to support online aggregation.
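The binning and wave-based refinement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `assign_bins` and `online_sum`, the bin thresholds, the toy cell values, and the choice to fetch the largest-error bin first are all assumptions made for illustration, and the model estimates are simply given rather than fitted by regression or a log-linear model.

```python
from bisect import bisect_right

def assign_bins(true_vals, est_vals, thresholds):
    """Group cells by |true - estimate|; a higher bin index means a larger error.

    `thresholds` is a sorted list of error cut-offs (hypothetical values).
    """
    bins = {}
    for cell, true in true_vals.items():
        err = abs(true - est_vals[cell])
        bins.setdefault(bisect_right(thresholds, err), []).append(cell)
    return bins

def online_sum(query_cells, true_vals, est_vals, bins):
    """Yield successively refined SUM answers for the queried cells.

    The first answer uses model estimates only. Each later answer follows
    one "wave": the cells of one error bin are fetched and their estimates
    are replaced by true values. Assumption: largest-error bin goes first.
    """
    wanted = set(query_cells)
    answer = sum(est_vals[c] for c in wanted)
    yield answer
    for b in sorted(bins, reverse=True):
        for cell in bins[b]:
            if cell in wanted:
                # Correct the running answer by this cell's estimation error.
                answer += true_vals[cell] - est_vals[cell]
        yield answer

# Toy data cube with four cells (made-up values).
true_vals = {"a": 10.0, "b": 20.0, "c": 30.0, "d": 40.0}
est_vals = {"a": 10.5, "b": 17.0, "c": 30.0, "d": 50.0}

bins = assign_bins(true_vals, est_vals, thresholds=[1.0, 5.0])
waves = list(online_sum(true_vals.keys(), true_vals, est_vals, bins))
```

In this toy run the first answer comes from the estimates alone, and the last one equals the exact sum because every bin has been fetched by then. Fetching high-error bins first is one plausible ordering, since those cells contribute most to the error of the running answer.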

Author information

Authors and Affiliations

Information and Software Engineering Department, George Mason University, Fairfax, VA, 22303
Daniel Barbará & Xintao Wu

Editor information

Editors and Affiliations

Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto, 606-8501, Japan
Yahiko Kambayashi
Computer Science Department, Western Michigan University, Kalamazoo, MI, 49008, USA
Mukesh Mohania
Vienna University of Technology, IFS, Favoritenstr. 9-11/188, 1040, Vienna, Austria
A. Min Tjoa

About this paper

Cite this paper

Barbará, D., Wu, X. (2000). Supporting Online Queries in ROLAP. In: Kambayashi, Y., Mohania, M., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2000. Lecture Notes in Computer Science, vol 1874. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44466-1_23

DOI: https://doi.org/10.1007/3-540-44466-1_23
Published: 06 July 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67980-6
Online ISBN: 978-3-540-44466-4
eBook Packages: Springer Book Archive
