Abstract
The problem of estimating frequency moments of a data stream has attracted a lot of attention since the onset of streaming algorithms [AMS99]. While the space complexity of approximately computing the pth moment has been settled for p ∈ (0,2] [KNW10], the exact complexity for p > 2 remains open. For p > 2 the current best algorithm uses O(n^(1−2/p) log n) words of space [AKO11, BO10], whereas the lower bound is Ω(n^(1−2/p)) [BJKS04].
In this paper, we show a tight lower bound of Ω(n^(1−2/p) log n) words for the class of algorithms based on linear sketches, which store only a sketch Ax of the input vector x for some (possibly randomized) matrix A. We note that all known algorithms for this problem are linear sketches.
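To make the term "linear sketch" concrete, the following minimal Python sketch maintains Ax under a stream of updates to x and checks that it matches the product computed from the final vector. It is an illustration only, not the construction or the estimator studied in the paper: the random ±1 matrix, the toy stream, and the names `make_sketch_matrix`, `apply_update`, and `frequency_moment` are all assumptions made for this example.

```python
import random


def make_sketch_matrix(m, n, seed=0):
    """Return an m x n matrix of random +/-1 entries (an illustrative choice of A)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]


def apply_update(sketch, A, i, delta):
    """Process the stream update x[i] += delta using only the sketch.

    By linearity, Ax changes by delta times column i of A.
    """
    for r in range(len(sketch)):
        sketch[r] += delta * A[r][i]


def frequency_moment(x, p):
    """Exact p-th frequency moment: the sum of |x_i|^p over all coordinates."""
    return sum(abs(v) ** p for v in x)


n, m = 8, 3                       # dimension of x and number of sketch rows
A = make_sketch_matrix(m, n)
stream = [(0, 2), (3, 1), (0, -1), (5, 4), (3, 2)]  # hypothetical (index, delta) updates

x = [0] * n                       # kept here only to verify the sketch
sketch = [0] * m                  # the only state a linear-sketch algorithm stores
for i, delta in stream:
    x[i] += delta
    apply_update(sketch, A, i, delta)

# Ax computed directly from the final vector, for comparison with the streamed sketch.
direct = [sum(A[r][j] * x[j] for j in range(n)) for r in range(m)]
```

Here `sketch` and `direct` agree for any choice of A, which is what lets such an algorithm discard x and keep only m numbers. How to recover an estimate of the pth moment from Ax, and how large m must be, is the question the bounds above address.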
References
Andoni, A., Krauthgamer, R., Onak, K.: Streaming algorithms from precision sampling. In: Proceedings of the Symposium on Foundations of Computer Science, FOCS (2011), Full version appears on arXiv:1011.1263
Alon, N., Matias, Y., Szegedy, M.: The space complexity of approximating the frequency moments. J. Comp. Sys. Sci. 58, 137–147 (1999), Previously appeared in STOC 1996
Bar-Yossef, Z.: The complexity of massive data set computations. PhD thesis, UC Berkeley (2002)
Bhuvanagiri, L., Ganguly, S., Kesh, D., Saha, C.: Simpler algorithm for estimating frequency moments of data streams. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 708–713 (2006)
Bar-Yossef, Z., Jayram, T.S., Kumar, R., Sivakumar, D.: An information statistics approach to data stream and communication complexity. J. Comput. Syst. Sci. 68(4), 702–732 (2004)
Brown, L.D., Low, M.G.: A constrained risk inequality with applications to nonparametric functional estimation. The Annals of Statistics 24, 2524–2535 (1996)
Braverman, V., Ostrovsky, R.: Recursive sketching for frequency moments. CoRR, abs/1011.2571 (2010)
Braverman, V., Ostrovsky, R.: Approximating large frequency moments with pick-and-drop sampling. CoRR, abs/1212.0202 (2012)
Chakrabarti, A., Khot, S., Sun, X.: Near-optimal lower bounds on the multi-party communication complexity of set disjointness. In: IEEE Conference on Computational Complexity, pp. 107–117 (2003)
Cai, T.T., Low, M.G.: Testing composite hypotheses, Hermite polynomials and optimal estimation of a nonsmooth functional. The Annals of Statistics 39(2), 1012–1041 (2011)
Csiszár, I.: Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hungar. 2, 299–318 (1967)
Ganguly, S.: Polynomial estimators for high frequency moments. arXiv, 1104.4552 (2011)
Ganguly, S., Cormode, G.: On estimating frequency moments of data streams. In: Charikar, M., Jansen, K., Reingold, O., Rolim, J.D.P. (eds.) APPROX and RANDOM 2007. LNCS, vol. 4627, pp. 479–493. Springer, Heidelberg (2007)
Indyk, P.: Stable distributions, pseudorandom generators, embeddings and data stream computation. J. ACM 53(3), 307–323 (2006); Previously appeared in FOCS 2000
Ingster, Y.I., Suslina, I.A.: Nonparametric goodness-of-fit testing under Gaussian models. Springer, New York (2003)
Indyk, P., Woodruff, D.: Tight lower bounds for the distinct elements problem. In: Proceedings of the Symposium on Foundations of Computer Science (FOCS), pp. 283–290 (2003)
Indyk, P., Woodruff, D.: Optimal approximations of the frequency moments of data streams. In: Proceedings of the Symposium on Theory of Computing, STOC (2005)
Jowhari, H., Saglam, M., Tardos, G.: Tight bounds for L_p samplers, finding duplicates in streams, and related problems. In: Proceedings of the ACM Symposium on Principles of Database Systems (PODS), pp. 49–58 (2011), Previously http://arxiv.org/abs/1012.4889
Kane, D.M., Nelson, J., Porat, E., Woodruff, D.P.: Fast moment estimation in data streams in optimal space. In: Proceedings of the Symposium on Theory of Computing (STOC) (2011); A previous version appeared as ArXiv:1007.4191, http://arxiv.org/abs/1007.4191
Kane, D.M., Nelson, J., Woodruff, D.P.: On the exact space complexity of sketching small norms. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, SODA (2010)
Le Cam, L.: Asymptotic methods in statistical decision theory. Springer, New York (1986)
Li, P.: Estimators and tail bounds for dimension reduction in l_p (0 < p ≤ 2) using stable random projections. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, SODA (2008)
Low, M.G.: Chi-square lower bounds. In: Borrowing Strength: Theory Powering Applications - A Festschrift for Lawrence D. Brown, pp. 22–31 (2010)
Monemizadeh, M., Woodruff, D.: 1-pass relative-error l_p-sampling with applications. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, SODA (2010)
Nelson, J., Woodruff, D.: Fast Manhattan sketches in data streams. In: Proceedings of the ACM Symposium on Principles of Database Systems, PODS (2010)
Price, E., Woodruff, D.P.: Applications of the Shannon-Hartley theorem to data streams and sparse recovery. In: Proceedings of the 2012 IEEE International Symposium on Information Theory, pp. 1821–1825 (2012)
Tsybakov, A.B.: Introduction to Nonparametric Estimation. Springer, New York (2009)
Woodruff, D.: Optimal space lower bounds for all frequency moments. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, SODA (2004)
Woodruff, D.: Personal communication (February 2013)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Andoni, A., Nguyễn, H.L., Polyanskiy, Y., Wu, Y. (2013). Tight Lower Bound for Linear Sketches of Moments. In: Fomin, F.V., Freivalds, R., Kwiatkowska, M., Peleg, D. (eds) Automata, Languages, and Programming. ICALP 2013. Lecture Notes in Computer Science, vol 7965. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39206-1_3
DOI: https://doi.org/10.1007/978-3-642-39206-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39205-4
Online ISBN: 978-3-642-39206-1
eBook Packages: Computer Science, Computer Science (R0)