Symmetric Norm Estimation and Regression on Sliding Windows

  • Conference paper
Computing and Combinatorics (COCOON 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13025)

Abstract

The sliding window model generalizes the standard streaming model and often performs better in applications where recent data is more important or more accurate than data that arrived prior to a certain time. We study the problem of approximating symmetric norms (norms on \(\mathbb{R}^n\) that are invariant under sign-flips and coordinate-wise permutations) in the sliding window model, where only the W most recent updates define the underlying frequency vector. Whereas standard norm estimation algorithms for sliding windows rely on the smooth histogram framework of Braverman and Ostrovsky (FOCS 2007), analyzing the smoothness of general symmetric norms seems to be a challenging obstacle. Instead, we observe that the symmetric norm streaming algorithm of Braverman et al. (STOC 2017) can be reduced to identifying and approximating the frequency of heavy-hitters in a number of substreams. We introduce a heavy-hitter algorithm that gives a \((1+\epsilon)\)-approximation to each of the reported frequencies in the sliding window model, thus obtaining the first algorithm for general symmetric norm estimation in the sliding window model. Our algorithm is a universal sketch that simultaneously approximates all symmetric norms in a parametrizable class and also improves upon the smooth histogram framework for estimating \(L_p\) norms, for a range of large p. Finally, we consider the overconstrained linear regression problem in the case where the loss function is an Orlicz norm, a symmetric norm that can be interpreted as a scale-invariant version of M-estimators. We give the first sublinear space algorithms that produce \((1+\epsilon)\)-approximate solutions to the linear regression problem for loss functions that are Orlicz norms in both the streaming and sliding window models.
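To make the objects named in the abstract concrete, the following is a minimal Python sketch of the definitions only. It is not the paper's algorithm: it stores the whole window in linear space, whereas the paper's contribution is sublinear-space approximation. All function names are hypothetical, the top-k norm is chosen here simply as one familiar symmetric norm, and the Orlicz norm is assumed to be in its standard Luxemburg form \(\Vert x\Vert_G = \inf \{\alpha > 0 : \sum_i G(|x_i|/\alpha) \le 1\}\) for a convex, nondecreasing \(G\) with \(G(0)=0\).

```python
from collections import Counter, deque


def window_frequencies(stream, W):
    """Exact frequency vector of the W most recent updates (O(W) space)."""
    window = deque(maxlen=W)  # older updates are dropped automatically
    for item in stream:
        window.append(item)
    return Counter(window)


def top_k_norm(values, k):
    """Sum of the k largest absolute values: one example of a symmetric norm.

    Taking absolute values gives sign-flip invariance; sorting gives
    invariance under permutations of the coordinates.
    """
    return sum(sorted((abs(v) for v in values), reverse=True)[:k])


def orlicz_norm(x, G, tol=1e-9):
    """Assumed Luxemburg form: inf{a > 0 : sum_i G(|x_i| / a) <= 1}.

    G is assumed convex and nondecreasing with G(0) = 0, so the sum is
    nonincreasing in a and bisection on a applies.
    """
    if all(v == 0 for v in x):
        return 0.0

    def total(a):
        return sum(G(abs(v) / a) for v in x)

    lo, hi = 0.0, 1.0
    while total(hi) > 1:  # grow hi until the constraint is satisfied
        hi *= 2
    while hi - lo > tol * hi:  # shrink [lo, hi] around the infimum
        mid = (lo + hi) / 2
        if total(mid) > 1:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    stream = ["a", "b", "a", "c", "a", "b", "c", "c"]
    freqs = window_frequencies(stream, W=5)
    print(dict(freqs))
    print(top_k_norm(freqs.values(), k=2))
    # With G(t) = t^2 the Orlicz norm coincides with the Euclidean norm.
    print(orlicz_norm([3, 4], lambda t: t * t))
```

Choosing \(G(t) = t^p\) in this sketch recovers the \(L_p\) norm, which is a convenient sanity check on the assumed definition.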

References

  1. Alon, N., Matias, Y., Szegedy, M.: The space complexity of approximating the frequency moments. J. Comput. Syst. Sci. 58(1), 137–147 (1999)

  2. Andoni, A., Lin, C., Sheng, Y., Zhong, P., Zhong, R.: Subspace embedding and linear regression with Orlicz norm. In: Proceedings of the 35th International Conference on Machine Learning, ICML, pp. 224–233 (2018)

  3. Argyriou, A., Foygel, R., Srebro, N.: Sparse prediction with the \(k\)-support norm. In: Advances in Neural Information Processing Systems 25: Annual Conference on Neural Information Processing Systems, pp. 1466–1474 (2012)

  4. Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and issues in data stream systems. In: Proceedings of the Twenty-first ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 1–16 (2002)

  5. Blasiok, J., Braverman, V., Chestnut, S.R., Krauthgamer, R., Yang, L.F.: Streaming symmetric norms via measure concentration. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC, pp. 716–729 (2017)

  6. Borassi, M., Epasto, A., Lattanzi, S., Vassilvitskii, S., Zadimoghaddam, M.: Better sliding window algorithms to maximize subadditive and diversity objectives. In: Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS, pp. 254–268 (2019)

  7. Borassi, M., Epasto, A., Lattanzi, S., Vassilvitskii, S., Zadimoghaddam, M.: Sliding window algorithms for k-clustering problems. In: Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems, NeurIPS (2020)

  8. Braverman, V., et al.: Near optimal linear algebra in the online and sliding window models. In: 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS, pp. 517–528 (2020)

  9. Braverman, V., Gelles, R., Ostrovsky, R.: How to catch \(L_2\)-heavy-hitters on sliding windows. Theor. Comput. Sci. 554, 82–94 (2014)

  10. Braverman, V., Grigorescu, E., Lang, H., Woodruff, D.P., Zhou, S.: Nearly optimal distinct elements and heavy hitters on sliding windows. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM, pp. 7:1–7:22 (2018)

  11. Braverman, V., Lang, H., Levin, K., Monemizadeh, M.: Clustering on sliding windows in polylogarithmic space. In: 35th IARCS Annual Conference on Foundation of Software Technology and Theoretical Computer Science, FSTTCS, pp. 350–364 (2015)

  12. Braverman, V., Lang, H., Levin, K., Monemizadeh, M.: Clustering problems on sliding windows. In: Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pp. 1374–1390 (2016)

  13. Braverman, V., Ostrovsky, R.: Smooth histograms for sliding windows. In: Proceedings of 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 283–293 (2007)

  14. Braverman, V., Ostrovsky, R., Roytman, A.: Zero-one laws for sliding windows and universal sketches. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM, pp. 573–590 (2015)

  15. Braverman, V., Ostrovsky, R., Zaniolo, C.: Optimal sampling from sliding windows. J. Comput. Syst. Sci. 78(1), 260–272 (2012)

  16. Chakrabarti, A., Ba, K.D., Muthukrishnan, S.: Estimating entropy and entropy norm on data streams. Internet Math. 3(1), 63–78 (2006)

  17. Chen, J., Nguyen, H.L., Zhang, Q.: Submodular maximization over sliding windows. CoRR, abs/1611.00129 (2016)

  18. Cormode, G.: The continuous distributed monitoring model. SIGMOD Rec. 42(1), 5–14 (2013)

  19. Cormode, G., Garofalakis, M.N.: Streaming in a connected world: querying and tracking distributed data streams. In: EDBT 2008, Proceedings of 11th International Conference on Extending Database Technology, p. 745 (2008)

  20. Cormode, G., Muthukrishnan, S.: What’s new: finding significant differences in network data streams. IEEE/ACM Trans. Netw. 13(6), 1219–1232 (2005)

  21. Datar, M., Motwani, R.: The sliding-window computation model and results. In: Garofalakis, M., Gehrke, J., Rastogi, R. (eds.) Data Stream Management. DSA, pp. 149–165. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-540-28608-0_7

  22. Epasto, A., Lattanzi, S., Vassilvitskii, S., Zadimoghaddam, M.: Submodular optimization over sliding windows. In: Proceedings of the 26th International Conference on World Wide Web, WWW, pp. 421–430 (2017)

  23. Feigenbaum, J., Kannan, S., Strauss, M., Viswanathan, M.: An approximate l1-difference algorithm for massive data streams. SIAM J. Comput. 32(1), 131–151 (2002)

  24. Feldman, D., Monemizadeh, M., Sohler, C., Woodruff, D.P.: Coresets and sketches for high dimensional subspace approximation problems. In: Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pp. 630–649 (2010)

  25. Indyk, P., Woodruff, D.P.: Optimal approximations of the frequency moments of data streams. In: Proceedings of the 37th Annual ACM Symposium on Theory of Computing, pp. 202–208 (2005)

  26. Jayaram, R., Woodruff, D.P., Zhou, S.: Truly perfect samplers for data streams and sliding windows. CoRR, abs/2108.12017 (2021)

  27. Krauthgamer, R., Reitblat, D.: Almost-smooth histograms and sliding-window graph algorithms. CoRR, abs/1904.07957 (2019)

  28. Krishnamurthy, B., Sen, S., Zhang, Y., Chen, Y.: Sketch-based change detection: methods, evaluation, and applications. In: Proceedings of the 3rd ACM SIGCOMM Internet Measurement Conference, IMC, pp. 234–247 (2003)

  29. Manku, G.S., Motwani, R.: Approximate frequency counts over data streams. PVLDB 5(12), 1699 (2012)

  30. McDonald, A.M., Pontil, M., Stamos, D.: Spectral k-support norm regularization. In: Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems, pp. 3644–3652 (2014)

  31. Osborne, M., et al.: Real-time detection, tracking and monitoring of automatically discovered events in social media. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (2014)

  32. Papapetrou, O., Garofalakis, M., Deligiannakis, A.: Sketching distributed sliding-window data streams. VLDB J. 24(3), 345–368 (2015). https://doi.org/10.1007/s00778-015-0380-7

  33. Song, Z., Wang, R., Yang, L.F., Zhang, H., Zhong, P.: Efficient symmetric norm regression via linear sketching. In: Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems, pp. 828–838 (2019)

  34. Thorup, M., Zhang, Y.: Tabulation based 4-universal hashing with applications to second moment estimation. In: Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pp. 615–624 (2004)

  35. Wei, Z., Liu, X., Li, F., Shang, S., Du, X., Wen, J.-R.: Matrix sketching over sliding windows. In: Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference, pp. 1465–1480 (2016)

  36. Woodruff, D.P., Zhang, Q.: Distributed statistical estimation of matrix products with applications. In: Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pp. 383–394 (2018)

  37. Woodruff, D.P., Zhou, S.: Tight bounds for adversarially robust streams and sliding windows via difference estimators. CoRR, abs/2011.07471 (2020)

  38. Woodruff, D.P., Zhou, S.: Separations for estimating large frequency moments on data streams. In: 48th International Colloquium on Automata, Languages, and Programming, ICALP, pp. 112:1–112:21 (2021)

  39. Wu, B., Ding, C., Sun, D., Toh, K.-C.: On the Moreau-Yosida regularization of the vector k-norm related functions. SIAM J. Optim. 24(2), 766–794 (2014)

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Braverman, V., Wei, V., Zhou, S. (2021). Symmetric Norm Estimation and Regression on Sliding Windows. In: Chen, CY., Hon, WK., Hung, LJ., Lee, CW. (eds) Computing and Combinatorics. COCOON 2021. Lecture Notes in Computer Science, vol 13025. Springer, Cham. https://doi.org/10.1007/978-3-030-89543-3_44

  • DOI: https://doi.org/10.1007/978-3-030-89543-3_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89542-6

  • Online ISBN: 978-3-030-89543-3

  • eBook Packages: Computer Science, Computer Science (R0)