Abstract
Given a signal A of N dimensions, the problem is to obtain a representation R for it that is a linear combination of vectors in the dictionary H of Haar wavelets. The quality of the representation R is determined by B, the number of vectors from H used, and δ, the error between R and A. Traditionally, δ has been the sum squared error \(\epsilon_{{\mathbf R}}=\sum_i ({\mathbf R}[i]-{\mathbf A}[i])^2\), in which case Parseval’s theorem from 1799 helps solve the problem of finding the R with smallest \(\epsilon_{{\mathbf R}}\) in O(N) time.
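The role of Parseval’s theorem in the sum squared error case can be illustrated with a minimal sketch. It assumes N is a power of two and an orthonormal Haar basis; the function names (`haar_transform`, `inverse_haar`, `best_b_term`) are hypothetical and not taken from the paper. Because the transform preserves the sum of squares, the error of a B-term representation equals the energy of the dropped coefficients, so keeping the B largest coefficients in magnitude is optimal. The sketch sorts for brevity, which costs O(N log N); a linear-time selection would be needed to match the O(N) bound.

```python
import math

def haar_transform(a):
    """Orthonormal Haar transform; len(a) is assumed to be a power of two."""
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    coeffs = [float(x) for x in a]
    out = [0.0] * n
    s = 1.0 / math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        for i in range(half):
            x, y = coeffs[2 * i], coeffs[2 * i + 1]
            out[i] = (x + y) * s          # scaled pairwise average
            out[half + i] = (x - y) * s   # scaled pairwise difference
        coeffs[:length] = out[:length]
        length = half
    return coeffs

def inverse_haar(c):
    """Inverse of haar_transform."""
    n = len(c)
    coeffs = list(c)
    s = 1.0 / math.sqrt(2.0)
    length = 2
    while length <= n:
        half = length // 2
        out = [0.0] * length
        for i in range(half):
            avg, det = coeffs[i], coeffs[half + i]
            out[2 * i] = (avg + det) * s
            out[2 * i + 1] = (avg - det) * s
        coeffs[:length] = out
        length *= 2
    return coeffs

def best_b_term(a, b):
    """Keep the b largest-magnitude Haar coefficients and reconstruct."""
    c = haar_transform(a)
    # Sorting is O(N log N); used here only to keep the sketch short.
    keep = sorted(range(len(c)), key=lambda i: abs(c[i]), reverse=True)[:b]
    kept = [0.0] * len(c)
    for i in keep:
        kept[i] = c[i]
    return inverse_haar(kept)

signal = [2, 2, 0, 2, 3, 5, 4, 4]
approx = best_b_term(signal, 1)
sse = sum((r - x) ** 2 for r, x in zip(approx, signal))
```

For this example signal, the single retained coefficient is the overall average term, so the reconstruction is the constant signal at the mean 2.75 and the sum squared error is 17.5, up to floating-point rounding.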
Recently, motivated by database applications, researchers have sought other notions of error such as
- workload-aware error, or \(\epsilon_{{\mathbf R}}^{\pi}=\sum_i \pi[i] ({\mathbf R}[i]-{\mathbf A}[i])^2\), where π[i] is the workload or the weight for i, and
- maximum pointwise absolute error, e.g., \(\epsilon_{{\mathbf R}}^{\infty}=\max_i |{\mathbf R}[i]-{\mathbf A}[i]|\).
Recent results give \(\Omega(N^2)\) time algorithms for finding R that minimize these errors.
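The three error measures can be written out directly as a small sketch. The function names are hypothetical, and signals and weights are assumed to be plain Python sequences of equal length. The code only evaluates the errors for a given R; it does not search for the minimizing R.

```python
def sum_squared_error(r, a):
    """Sum over i of (R[i] - A[i])^2."""
    return sum((x - y) ** 2 for x, y in zip(r, a))

def workload_error(r, a, pi):
    """Sum over i of pi[i] * (R[i] - A[i])^2, with pi[i] the weight of i."""
    return sum(p * (x - y) ** 2 for x, y, p in zip(r, a, pi))

def max_abs_error(r, a):
    """Maximum over i of |R[i] - A[i]|."""
    return max(abs(x - y) for x, y in zip(r, a))

a = [1.0, 4.0, 2.0]
r = [1.0, 2.0, 3.0]
pi = [1.0, 0.5, 2.0]
errors = (sum_squared_error(r, a), workload_error(r, a, pi), max_abs_error(r, a))
```

With uniform weights π[i] = 1, the workload-aware error reduces to the sum squared error.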
We present subquadratic algorithms for versions of these problems. We present a near-linear time algorithm to minimize \(\epsilon_{{\mathbf R}}^{\pi}\) when π is compressible. To minimize \(\epsilon_{{\mathbf R}}^{\infty}\), we give an \(O(N^{2-\epsilon})\) time algorithm. These algorithms follow a natural dynamic programming approach developed recently, but the improvements come from exploiting local structural properties, which we identify, of the Haar wavelet representations of signals.
Sparse approximation theory is a mature area of Mathematics that has traditionally studied signal representations with Haar wavelets. It is interesting that the past few years have seen new problems in this area motivated by Computer Science concerns: we pose a few additional problems and present some partial results.
References
Aboulnaga, A., Chaudhuri, S.: Self-tuning histograms: building histograms without looking at data. In: Proc. SIGMOD, pp. 181–192 (1999)
Beauchamp, K.G.: Walsh functions and their applications (1975)
Deligiannakis, A., Garofalakis, M., Roussopoulos, N.: A fast approximation scheme for probabilistic wavelet synopses. In: Proc. of SSDBM (2005)
Deligiannakis, A., Roussopoulos, N.: Extended wavelets for multiple measures. In: Proc. SIGMOD (2003)
Devore, R., Lorentz, G.: Constructive approximation. Springer, Heidelberg (1991)
Egiazarian, K., Astola, J.: Tree-structured Haar transforms. Journal of Mathematical Imaging and Vision 16, 269–279 (2002)
Ganti, V., Lee, M., Ramakrishnan, R.: ICICLES: Self-tuning samples for approximate query answering. VLDB Journal, 176–187 (2000)
Garofalakis, M., Kumar, A.: Deterministic wavelet thresholding for maximum error metrics. In: Proc. PODS (2004)
Garofalakis, M., Gibbons, P.: Wavelet Synopses with Error Guarantees. In: Proc. of ACM SIGMOD, pp. 476–487 (2002)
Gilbert, A., Guha, S., Indyk, P., Kotidis, Y., Muthukrishnan, S., Strauss, M.: Fast, small space algorithms for approximate histogram maintenance. In: Proc. STOC, pp. 389–398 (2002)
Gilbert, A., Muthukrishnan, S., Strauss, M.: Approximation of functions over redundant dictionaries using coherence. In: Proc. ACM-SIAM SODA (2003)
Guha, S.: Space Efficiency in Synopsis Construction Algorithms. In: Proc. VLDB (2005)
Guha, S., Harb, B.: Wavelet Synopsis for Data Streams: Minimizing Non-Euclidean Error. In: Proc. KDD (2005)
Guha, S., Harb, B.: Approximation Algorithms for Wavelet Transform Coding of Data Streams. To appear in Proc. ACM-SIAM SODA (2006)
Jagadish, H., Koudas, N., Muthukrishnan, S., Poosala, V., Sevcik, K., Suel, T.: Optimal Histograms with Quality Guarantees. In: Proc. VLDB, pp. 275–286 (1998)
Haar, A.: Zur Theorie der orthogonalen Funktionensysteme. Math. Ann. 69, 331–371 (1910)
Markl, V., Lohman, G., Raman, V.: LEO: An automatic query optimizer for DB2. IBM Systems Journal 42(1) (2003); Also, Proc. VLDB (2002)
Matias, Y., Urieli, D.: Optimal workload-based wavelet synopses. In: Proc. Intl Conf on Database Technology (2004)
Matias, Y., Vitter, J., Wang, M.: Wavelet-based histograms for selectivity estimation. In: Proc. ACM SIGMOD, pp. 448–459 (1998)
Muthukrishnan, S.: Workload-optimal wavelet synopsis. DIMACS Technical Report 2004-25 (May 2004)
Muthukrishnan, S., Strauss, M., Zhang, X.: Workload-aware histograms on streams. To appear in Proc. ESA (2005); Also, DIMACS TR 2005
Parseval, M.: (1799), http://encyclopedia.thefreedictionary.com/Parseval’s+theorem
Schmidt, R., Shahabi, C.: How to evaluate multiple range-sum queries progressively. In: Proc. PODS (2002)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Muthukrishnan, S. (2005). Subquadratic Algorithms for Workload-Aware Haar Wavelet Synopses. In: Sarukkai, S., Sen, S. (eds) FSTTCS 2005: Foundations of Software Technology and Theoretical Computer Science. FSTTCS 2005. Lecture Notes in Computer Science, vol 3821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11590156_23
DOI: https://doi.org/10.1007/11590156_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30495-1
Online ISBN: 978-3-540-32419-5
eBook Packages: Computer Science (R0)