Abstract
The sliding window model generalizes the standard streaming model and often performs better in applications where recent data is more important or more accurate than data that arrived prior to a certain time. We study the problem of approximating symmetric norms (norms on \(\mathbb {R}^n\) that are invariant under sign-flips and coordinate-wise permutations) in the sliding window model, where only the W most recent updates define the underlying frequency vector. Whereas standard norm estimation algorithms for sliding windows rely on the smooth histogram framework of Braverman and Ostrovsky (FOCS 2007), analyzing the smoothness of general symmetric norms seems to be a challenging obstacle. Instead, we observe that the symmetric norm streaming algorithm of Braverman et al. (STOC 2017) can be reduced to identifying and approximating the frequency of heavy-hitters in a number of substreams. We introduce a heavy-hitter algorithm that gives a \((1+\epsilon )\)-approximation to each of the reported frequencies in the sliding window model, thus obtaining the first algorithm for general symmetric norm estimation in the sliding window model. Our algorithm is a universal sketch that simultaneously approximates all symmetric norms in a parametrizable class and also improves upon the smooth histogram framework for estimating \(L_p\) norms, for a range of large p. Finally, we consider overconstrained linear regression in the case where the loss function is an Orlicz norm, a symmetric norm that can be interpreted as a scale-invariant version of M-estimators. We give the first sublinear space algorithms that produce \((1+\epsilon )\)-approximate solutions to the linear regression problem for loss functions that are Orlicz norms in both the streaming and sliding window models.
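To make the objects in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the sublinear-space algorithm of the paper: it stores the whole window and computes norms exactly, which is only useful as a reference point. All function names are hypothetical, and it assumes each stream item is an insertion that increments one coordinate of the frequency vector by one. The Orlicz norm uses the standard definition \(\Vert x\Vert _G = \inf \{\alpha > 0 : \sum _i G(|x_i|/\alpha ) \le 1\}\) for a convex, nondecreasing G with \(G(0)=0\).

```python
from collections import Counter, deque


def window_frequencies(stream, W):
    """Exact frequencies of the W most recent updates (stores the window)."""
    # Assumption: each item is an insertion adding 1 to its coordinate.
    window = deque(stream, maxlen=W)  # the deque discards older items itself
    return Counter(window)


def lp_norm(x, p):
    """L_p norm, a symmetric norm for any p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def top_k_norm(x, k):
    """Top-k norm: sum of the k largest absolute values (also symmetric)."""
    return sum(sorted((abs(v) for v in x), reverse=True)[:k])


def orlicz_norm(x, G, iters=100):
    """Orlicz norm inf{a > 0 : sum_i G(|x_i| / a) <= 1}, found by bisection.

    Assumes G is convex, nondecreasing, and satisfies G(0) = 0.
    """
    if not any(x):
        return 0.0

    def total(a):
        return sum(G(abs(v) / a) for v in x)

    lo, hi = 0.0, 1.0
    while total(hi) > 1:  # grow hi until the constraint holds
        hi *= 2
    for _ in range(iters):
        mid = (lo + hi) / 2
        if total(mid) > 1:
            lo = mid
        else:
            hi = mid
    return hi


# Usage: only the last W = 4 updates of the stream define the vector.
freq = window_frequencies([1, 2, 1, 3, 1, 2], W=4)
x = list(freq.values())
print(dict(freq), lp_norm(x, 1), top_k_norm(x, 2))
```

Sign flips and permutations of `x` leave `lp_norm`, `top_k_norm`, and `orlicz_norm` unchanged, which is the symmetry property the abstract refers to. With `G(t) = t**2` the Orlicz norm coincides with the \(L_2\) norm, which gives a simple sanity check for the bisection.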
References
Alon, N., Matias, Y., Szegedy, M.: The space complexity of approximating the frequency moments. J. Comput. Syst. Sci. 58(1), 137–147 (1999)
Andoni, A., Lin, C., Sheng, Y., Zhong, P., Zhong, R.: Subspace embedding and linear regression with Orlicz norm. In: Proceedings of the 35th International Conference on Machine Learning, ICML, pp. 224–233 (2018)
Argyriou, A., Foygel, R., Srebro, N.: Sparse prediction with the \(k\)-support norm. In: Advances in Neural Information Processing Systems 25: Annual Conference on Neural Information Processing Systems, pp. 1466–1474 (2012)
Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and issues in data stream systems. In: Proceedings of the Twenty-first ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 1–16 (2002)
Blasiok, J., Braverman, V., Chestnut, S.R., Krauthgamer, R., Yang, L.F.: Streaming symmetric norms via measure concentration. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC, pp. 716–729 (2017)
Borassi, M., Epasto, A., Lattanzi, S., Vassilvitskii, S., Zadimoghaddam, M.: Better sliding window algorithms to maximize subadditive and diversity objectives. In: Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS, pp. 254–268 (2019)
Borassi, M., Epasto, A., Lattanzi, S., Vassilvitskii, S., Zadimoghaddam, M.: Sliding window algorithms for k-clustering problems. In: Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems, NeurIPS (2020)
Braverman, V., et al.: Near optimal linear algebra in the online and sliding window models. In: 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS, pp. 517–528 (2020)
Braverman, V., Gelles, R., Ostrovsky, R.: How to catch \(L_2\)-heavy-hitters on sliding windows. Theor. Comput. Sci. 554, 82–94 (2014)
Braverman, V., Grigorescu, E., Lang, H., Woodruff, D.P., Zhou, S.: Nearly optimal distinct elements and heavy hitters on sliding windows. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM, pp. 7:1–7:22 (2018)
Braverman, V., Lang, H., Levin, K., Monemizadeh, M.: Clustering on sliding windows in polylogarithmic space. In: 35th IARCS Annual Conference on Foundation of Software Technology and Theoretical Computer Science, FSTTCS, pp. 350–364 (2015)
Braverman, V., Lang, H., Levin, K., Monemizadeh, M.: Clustering problems on sliding windows. In: Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pp. 1374–1390 (2016)
Braverman, V., Ostrovsky, R.: Smooth histograms for sliding windows. In: Proceedings of 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 283–293 (2007)
Braverman, V., Ostrovsky, R., Roytman, A.: Zero-one laws for sliding windows and universal sketches. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM, pp. 573–590 (2015)
Braverman, V., Ostrovsky, R., Zaniolo, C.: Optimal sampling from sliding windows. J. Comput. Syst. Sci. 78(1), 260–272 (2012)
Chakrabarti, A., Ba, K.D., Muthukrishnan, S.: Estimating entropy and entropy norm on data streams. Internet Math. 3(1), 63–78 (2006)
Chen, J., Nguyen, H.L., Zhang, Q.: Submodular maximization over sliding windows. CoRR, abs/1611.00129 (2016)
Cormode, G.: The continuous distributed monitoring model. SIGMOD Rec. 42(1), 5–14 (2013)
Cormode, G., Garofalakis, M.N.: Streaming in a connected world: querying and tracking distributed data streams. In: EDBT 2008, Proceedings of 11th International Conference on Extending Database Technology, p. 745 (2008)
Cormode, G., Muthukrishnan, S.: What’s new: finding significant differences in network data streams. IEEE/ACM Trans. Netw. 13(6), 1219–1232 (2005)
Datar, M., Motwani, R.: The sliding-window computation model and results. In: Garofalakis, M., Gehrke, J., Rastogi, R. (eds.) Data Stream Management. DSA, pp. 149–165. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-540-28608-0_7
Epasto, A., Lattanzi, S., Vassilvitskii, S., Zadimoghaddam, M.: Submodular optimization over sliding windows. In: Proceedings of the 26th International Conference on World Wide Web, WWW, pp. 421–430 (2017)
Feigenbaum, J., Kannan, S., Strauss, M., Viswanathan, M.: An approximate l1-difference algorithm for massive data streams. SIAM J. Comput. 32(1), 131–151 (2002)
Feldman, D., Monemizadeh, M., Sohler, C., Woodruff, D.P.: Coresets and sketches for high dimensional subspace approximation problems. In: Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pp. 630–649 (2010)
Indyk, P., Woodruff, D.P.: Optimal approximations of the frequency moments of data streams. In: Proceedings of the 37th Annual ACM Symposium on Theory of Computing, pp. 202–208 (2005)
Jayaram, R., Woodruff, D.P., Zhou, S.: Truly perfect samplers for data streams and sliding windows. CoRR, abs/2108.12017 (2021)
Krauthgamer, R., Reitblat, D.: Almost-smooth histograms and sliding-window graph algorithms. CoRR, abs/1904.07957 (2019)
Krishnamurthy, B., Sen, S., Zhang, Y., Chen, Y.: Sketch-based change detection: methods, evaluation, and applications. In: Proceedings of the 3rd ACM SIGCOMM Internet Measurement Conference, IMC, pp. 234–247 (2003)
Manku, G.S., Motwani, R.: Approximate frequency counts over data streams. PVLDB 5(12), 1699 (2012)
McDonald, A.M., Pontil, M., Stamos, D.: Spectral k-support norm regularization. In: Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems, pp. 3644–3652 (2014)
Osborne, M., et al.: Real-time detection, tracking and monitoring of automatically discovered events in social media. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (2014)
Papapetrou, O., Garofalakis, M., Deligiannakis, A.: Sketching distributed sliding-window data streams. VLDB J. 24(3), 345–368 (2015). https://doi.org/10.1007/s00778-015-0380-7
Song, Z., Wang, R., Yang, L.F., Zhang, H., Zhong, P.: Efficient symmetric norm regression via linear sketching. In: Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems, pp. 828–838 (2019)
Thorup, M., Zhang, Y.: Tabulation based 4-universal hashing with applications to second moment estimation. In: Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pp. 615–624 (2004)
Wei, Z., Liu, X., Li, F., Shang, S., Du, X., Wen, J.-R.: Matrix sketching over sliding windows. In: Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference, pp. 1465–1480 (2016)
Woodruff, D.P., Zhang, Q.: Distributed statistical estimation of matrix products with applications. In: Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pp. 383–394 (2018)
Woodruff, D.P., Zhou, S.: Tight bounds for adversarially robust streams and sliding windows via difference estimators. CoRR, abs/2011.07471 (2020)
Woodruff, D.P., Zhou, S.: Separations for estimating large frequency moments on data streams. In: 48th International Colloquium on Automata, Languages, and Programming, ICALP, pp. 112:1–112:21 (2021)
Wu, B., Ding, C., Sun, D., Toh, K.-C.: On the Moreau-Yosida regularization of the vector k-norm related functions. SIAM J. Optim. 24(2), 766–794 (2014)
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Braverman, V., Wei, V., Zhou, S. (2021). Symmetric Norm Estimation and Regression on Sliding Windows. In: Chen, CY., Hon, WK., Hung, LJ., Lee, CW. (eds) Computing and Combinatorics. COCOON 2021. Lecture Notes in Computer Science, vol 13025. Springer, Cham. https://doi.org/10.1007/978-3-030-89543-3_44
DOI: https://doi.org/10.1007/978-3-030-89543-3_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89542-6
Online ISBN: 978-3-030-89543-3
eBook Packages: Computer Science (R0)