Sliding-window top-k queries on uncertain streams

  • Regular Paper
The VLDB Journal

Abstract

Because the data generated by many streaming applications, such as sensor networks, is inherently imprecise, query processing on uncertain data streams has recently become an important problem. However, all existing work on uncertain data streams studies unbounded streams. In this paper, we take the first step towards the important and challenging problem of answering sliding-window queries on uncertain data streams, with a focus on one of the most important types of queries: top-k queries. Finding an efficient solution is nontrivial, because the challenges not only stem from the strict space and time requirements of processing both arriving and expiring tuples in high-speed streams, but also arise from the exponential blowup in the number of possible worlds induced by the uncertain data model. We design a unified framework for processing sliding-window top-k queries on uncertain streams. We show that all the existing top-k definitions in the literature can be plugged into our framework, resulting in several succinct synopses that use space much smaller than the window size while also being highly efficient in terms of processing time. We also extend our framework to answer multiple top-k queries. In addition to the theoretical space and time bounds that we prove for these synopses, we present a thorough experimental report to verify their practical efficiency on both synthetic and real data.

Author information

Corresponding author

Correspondence to Cheqing Jin.

About this article

Cite this article

Jin, C., Yi, K., Chen, L. et al. Sliding-window top-k queries on uncertain streams. The VLDB Journal 19, 411–435 (2010). https://doi.org/10.1007/s00778-009-0171-0

  • DOI: https://doi.org/10.1007/s00778-009-0171-0
