DOI: 10.1145/3514221.3526123
research-article

Approximate Range Thresholding

Published: 11 June 2022

Abstract

In this paper, we study the (approximate) Range Thresholding (RT) problem over streams. Each stream element is a d-dimensional point with a positive integer weight. An RT query q specifies a d-dimensional axis-parallel rectangular range R(q) and a positive integer threshold τ(q). Once q is registered in the system, define s(q) as the total weight of the elements that (i) arrive after q's registration and (ii) fall in the range R(q). Given a real number 0 < ε < 1, the task of the system is to capture an arbitrary moment in the period between the first moment when s(q) ≥ (1 - ε)⋅τ(q) and the first moment when s(q) ≥ τ(q). The challenge is to support a large number of RT queries simultaneously while achieving sub-quadratic overall running time and near-linear space consumption at all times.
We propose a new algorithm called FastRTS, which reduces the exponent in the poly-logarithmic factor of the state-of-the-art QGT algorithm from d+1 to d, at the cost of slightly increasing the log term itself. In addition, we propose two highly effective optimization techniques that significantly improve the practical performance of FastRTS. Experimental results show that FastRTS outperforms its competitors by up to three orders of magnitude in both running time and peak memory usage.
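To make the problem definition concrete, the following is a minimal brute-force sketch of RT query monitoring. It is not FastRTS or QGT; it is a naive baseline that checks every arriving element against every live query, so its total cost grows with (number of elements) × (number of queries), which is the behavior the abstract aims to avoid. The names `RTQuery`, `NaiveRTMonitor`, and `may_report` are hypothetical and chosen only for illustration; ranges are assumed to be closed on both sides.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class RTQuery:
    """One registered RT query: range R(q), threshold tau(q), and running s(q)."""
    lo: Tuple[float, ...]   # lower corner of R(q), one value per dimension
    hi: Tuple[float, ...]   # upper corner of R(q), assumed inclusive
    tau: int                # positive integer threshold tau(q)
    s: int = 0              # total in-range weight seen since registration
    matured: bool = False   # set once the query has been reported

    def contains(self, point: Sequence[float]) -> bool:
        # The point is in R(q) if every coordinate lies within [lo, hi].
        return all(l <= x <= h for l, x, h in zip(self.lo, point, self.hi))


def may_report(s: int, tau: int, eps: float) -> bool:
    """True once s(q) >= (1 - eps) * tau(q), i.e. the capture window has opened."""
    return s >= (1 - eps) * tau


class NaiveRTMonitor:
    """Brute-force baseline (illustrative only): scan all live queries per element."""

    def __init__(self) -> None:
        self.queries: List[RTQuery] = []

    def register(self, lo: Sequence[float], hi: Sequence[float], tau: int) -> RTQuery:
        # Only elements inserted after this call contribute to s(q).
        q = RTQuery(tuple(lo), tuple(hi), tau)
        self.queries.append(q)
        return q

    def insert(self, point: Sequence[float], weight: int) -> List[RTQuery]:
        """Process one stream element and return the queries that mature on it."""
        reported = []
        for q in self.queries:
            if not q.contains(point):
                continue
            q.s += weight
            # Reporting exactly when s(q) >= tau(q) is the latest moment
            # allowed by the approximate definition.
            if q.s >= q.tau:
                q.matured = True
                reported.append(q)
        # Matured queries no longer need to be monitored.
        self.queries = [q for q in self.queries if not q.matured]
        return reported
```

This baseline always reports at the latest admissible moment, when s(q) first reaches τ(q). The approximate setting also accepts any earlier moment for which `may_report` is already true, and that slack is what an approximate algorithm can exploit.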


Cited By

  • (2024) Efficient Algorithms for Top-k Stabbing Queries on Weighted Interval Data. Database and Expert Systems Applications, 146-152. DOI: 10.1007/978-3-031-68309-1_12. Online publication date: 26-Aug-2024

Published In

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data
June 2022
2597 pages
ISBN:9781450392495
DOI:10.1145/3514221

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. algorithms
  2. data structures
  3. range thresholding
  4. streams


Conference

SIGMOD/PODS '22

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%


