Abstract
In recent years, wavelets have been shown to be effective data synopses. We are concerned with the problem of efficiently finding wavelet synopses for massive data sets when information about the query workload is available. We present linear-time, I/O-optimal algorithms for building optimal workload-based wavelet synopses for point queries. The synopses are based on a novel construction of weighted inner products and use weighted wavelets that are adapted to those products. The synopses are optimal in the sense that the subset of retained coefficients is the best possible for the bases in use, with respect to either the mean-squared absolute error or the mean-squared relative error. For the latter, this is the first optimal wavelet synopsis even for the regular, non-workload-based case. Experimental results demonstrate the advantage of the new optimal wavelet synopses, as well as their robustness to deviations in the actual query workload.
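To make the idea of a weighted inner product concrete, the following is a minimal sketch, not the paper's algorithm. It assumes strictly positive per-point weights (for example, point-query probabilities), a data length that is a power of two, and a weighted (unbalanced) Haar basis that is orthonormal under <u, v>_w = sum_i w_i u_i v_i. All function names are hypothetical. The sketch materializes the basis explicitly, so it runs in quadratic time and only illustrates why keeping the largest coefficients is the best choice for such a basis; it does not reproduce the linear-time, I/O-optimal construction.

```python
import math

def weighted_inner(u, v, weights):
    # Weighted inner product <u, v>_w = sum_i w_i * u_i * v_i.
    return sum(w * x * y for w, x, y in zip(weights, u, v))

def weighted_haar_basis(weights):
    # Build Haar-like vectors that are orthonormal under weighted_inner.
    # Assumption: len(weights) is a power of two and every weight is > 0.
    n = len(weights)
    assert n > 0 and (n & (n - 1)) == 0, "length must be a power of two"
    assert all(w > 0 for w in weights), "weights must be strictly positive"
    total = sum(weights)
    basis = [[1.0 / math.sqrt(total)] * n]  # constant (scaling) vector
    size = n
    while size >= 2:
        half = size // 2
        for start in range(0, n, size):
            w_left = sum(weights[start:start + half])
            w_right = sum(weights[start + half:start + size])
            # Heights chosen so the vector has weighted mean 0 and weighted norm 1.
            a = math.sqrt(w_right / (w_left * (w_left + w_right)))
            b = math.sqrt(w_left / (w_right * (w_left + w_right)))
            vec = [0.0] * n
            for i in range(start, start + half):
                vec[i] = a
            for i in range(start + half, start + size):
                vec[i] = -b
            basis.append(vec)
        size = half
    return basis

def build_synopsis(data, weights, budget):
    # Keep the `budget` coefficients with the largest absolute value.
    basis = weighted_haar_basis(weights)
    coeffs = [weighted_inner(data, vec, weights) for vec in basis]
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    return {k: coeffs[k] for k in order[:budget]}, basis

def reconstruct(synopsis, basis):
    # Approximate the data from the retained coefficients only.
    n = len(basis[0])
    return [sum(c * basis[k][i] for k, c in synopsis.items()) for i in range(n)]

def weighted_sse(data, approx, weights):
    # Workload-weighted sum of squared point-query errors.
    return sum(w * (d - a) ** 2 for w, d, a in zip(weights, data, approx))
```

Because the basis is orthonormal under the weighted inner product, Parseval's identity gives a weighted squared error equal to the sum of the squares of the dropped coefficients, so retaining the largest-magnitude coefficients minimizes that error for this basis. Calling `build_synopsis(data, weights, budget)` followed by `reconstruct` and `weighted_sse` lets one check this on small inputs.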
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Matias, Y., Urieli, D. (2004). Optimal Workload-Based Weighted Wavelet Synopses. In: Eiter, T., Libkin, L. (eds) Database Theory - ICDT 2005. ICDT 2005. Lecture Notes in Computer Science, vol 3363. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30570-5_25
DOI: https://doi.org/10.1007/978-3-540-30570-5_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24288-8
Online ISBN: 978-3-540-30570-5
eBook Packages: Computer Science (R0)