DOI: 10.1145/3294052.3319701
research-article

Better Sliding Window Algorithms to Maximize Subadditive and Diversity Objectives

Published: 25 June 2019

ABSTRACT

The streaming computation model is a standard model for large-scale data analysis: the input arrives one element at a time, and the goal is to maintain an approximately optimal solution using only constant or, at worst, polylogarithmic space.

In practice, however, recency plays a large role, and one often wishes to consider only the last w elements that have arrived, the so-called sliding window problem. A trivial approach is to simply store the last w elements in a buffer; our goal is to develop algorithms with space and update time sublinear in w. Two general frameworks exist for this regime, exponential histograms and smooth histograms, and both can be used to obtain sliding window algorithms for families of functions satisfying certain properties.
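The trivial buffer-based approach mentioned above can be sketched as follows. This is only an illustration of the baseline that the paper aims to improve on, not of the paper's algorithms; the class name `BufferedWindow` and the example objective `distinct_count` are hypothetical names chosen for this sketch.

```python
from collections import deque
from typing import Callable, Deque, Hashable, List


class BufferedWindow:
    """Baseline: keep the last w elements explicitly (space linear in w)."""

    def __init__(self, w: int, objective: Callable[[List[Hashable]], int]) -> None:
        if w <= 0:
            raise ValueError("window size w must be positive")
        # A deque with maxlen=w silently evicts the oldest element on overflow.
        self._buf: Deque[Hashable] = deque(maxlen=w)
        self._objective = objective

    def update(self, x: Hashable) -> None:
        """Process one arriving stream element."""
        self._buf.append(x)

    def query(self) -> int:
        """Recompute the objective from scratch over the current window."""
        return self._objective(list(self._buf))


def distinct_count(window: List[Hashable]) -> int:
    # Example objective (an assumption for this sketch): number of distinct items.
    return len(set(window))


win = BufferedWindow(3, distinct_count)
results = []
for x in [1, 1, 2, 3, 3, 3]:
    win.update(x)
    results.append(win.query())
print(results)  # [1, 1, 2, 3, 2, 1]
```

Both the memory footprint and the cost of each query grow with w here, which is exactly the cost that sublinear sliding window algorithms are meant to avoid.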

Unfortunately, these frameworks have limitations and cannot always be applied directly. A prominent example is the problem of maximizing a submodular function under a cardinality constraint. While some of these difficulties can be rectified on a case-by-case basis, here we describe an alternative approach to designing efficient sliding window algorithms for maximization problems. We then instantiate this approach on a wide range of problems, yielding better algorithms for submodular function optimization, diversity optimization, and general subadditive optimization. In doing so, we improve on state-of-the-art results obtained using problem-specific algorithms.
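To make the submodular example concrete, the sketch below runs the classical greedy algorithm for maximum coverage (a monotone submodular objective) with a cardinality constraint k over an explicitly stored window. It is not the approach described in this paper: it assumes the whole window is available in memory, and the function name `greedy_max_coverage` and the sample data are hypothetical. For monotone submodular functions this greedy rule is well known to achieve a (1 - 1/e) approximation.

```python
from typing import FrozenSet, List, Sequence, Set


def greedy_max_coverage(window: Sequence[FrozenSet[int]], k: int) -> List[FrozenSet[int]]:
    """Pick at most k sets from the window, greedily maximizing the covered elements."""
    chosen: List[FrozenSet[int]] = []
    covered: Set[int] = set()
    remaining = list(window)
    for _ in range(min(k, len(remaining))):
        # Choose the set with the largest marginal gain over what is covered so far.
        best = max(remaining, key=lambda s: len(s - covered))
        if not (best - covered):
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= best
        remaining.remove(best)
    return chosen


# Hypothetical window contents: each stream element is a set of covered items.
window = [
    frozenset({1, 2, 3}),
    frozenset({3, 4}),
    frozenset({4, 5, 6, 7}),
    frozenset({1, 2}),
]
picked = greedy_max_coverage(window, 2)
covered_count = len(set().union(*picked))
print(picked, covered_count)
```

With k = 2 the greedy rule first takes the four-element set and then the set that adds three new items, covering seven items in total.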

Published in

PODS '19: Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
June 2019
494 pages
ISBN: 9781450362276
DOI: 10.1145/3294052

Copyright © 2019 Owner/Author

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

PODS '19 paper acceptance rate: 29 of 87 submissions, 33%. Overall acceptance rate: 642 of 2,707 submissions, 24%.
