Continuous ranking on uncertain streams

Jin, Cheqing; Zhang, Jingwei; Zhou, Aoying

doi:10.1007/s11704-012-1227-7

Research Article
Published: 14 November 2012

Volume 6, pages 686–699, (2012)

Frontiers of Computer Science

Cheqing Jin¹,
Jingwei Zhang¹ &
Aoying Zhou¹


Abstract

Data uncertainty is widespread in web applications, financial applications, and sensor networks. Ranking queries, which return the tuples with the highest ranking scores, are important in database management. Most existing work proposes static solutions for various ranking semantics over uncertain data. This paper instead addresses continuous ranking queries on uncertain data streams, in which each newly arrived tuple is tested so that highly ranked tuples can be output. The challenge is twofold: the possible world space grows exponentially as new tuples arrive, and a streaming environment demands low space and time complexity. We first study how to answer such queries exactly, and then propose a novel method, exponential sampling, to estimate the expected rank of a tuple with high quality. Theoretical analysis and detailed experiments evaluate the proposed methods.
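The abstract does not define the expected rank or describe the exponential sampling method, so the following is only a minimal sketch of why the possible world space is a problem. It assumes a tuple-level uncertainty model with independent tuples, each a hypothetical `(score, probability)` pair with distinct scores, and assumes one common definition of rank from the literature: within a possible world, a present tuple's rank is the number of present tuples with a higher score, and an absent tuple's rank is the size of the world. The function names are illustrative and are not taken from the paper.

```python
from itertools import product


def expected_rank_bruteforce(tuples, i):
    """Expected rank of tuple i by enumerating all 2^n possible worlds.

    tuples: list of (score, existence_probability) pairs, assumed independent.
    """
    n = len(tuples)
    total = 0.0
    for world in product([0, 1], repeat=n):
        # Probability of this world under the independence assumption.
        pr = 1.0
        for present, (_, p) in zip(world, tuples):
            pr *= p if present else 1.0 - p
        if world[i]:
            # Rank = number of present tuples that outscore tuple i.
            rank = sum(
                1 for j in range(n)
                if world[j] and tuples[j][0] > tuples[i][0]
            )
        else:
            # Assumed convention: an absent tuple ranks after the whole world.
            rank = sum(world)
        total += pr * rank
    return total


def expected_rank_linear(tuples, i):
    """Same quantity in O(n) time via linearity of expectation."""
    s_i, p_i = tuples[i]
    higher = sum(p for j, (s, p) in enumerate(tuples) if j != i and s > s_i)
    others = sum(p for j, (_, p) in enumerate(tuples) if j != i)
    return p_i * higher + (1.0 - p_i) * others


if __name__ == "__main__":
    data = [(10, 0.5), (8, 0.8), (5, 0.4)]  # made-up example tuples
    for idx in range(len(data)):
        print(idx, expected_rank_bruteforce(data, idx), expected_rank_linear(data, idx))
```

The brute-force version doubles its work with every arriving tuple, which is the exponential growth the abstract refers to. The linear version sidesteps enumeration under these simplifying assumptions, but it still reads every tuple seen so far, so it does not by itself meet a streaming space bound.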

Author information

Authors and Affiliations

Shanghai Key Laboratory of Trustworthy Computing, Software Engineering Institute, East China Normal University, Shanghai, 200062, China
Cheqing Jin, Jingwei Zhang & Aoying Zhou

Corresponding author

Correspondence to Cheqing Jin.

Additional information

Cheqing Jin received his BS and MS in computer science from Zhejiang University, China, in 1999 and 2002, respectively. He received his PhD in computer science from Fudan University, China, in 2005. He is currently a professor at the Software Engineering Institute, East China Normal University, China. His research interests include streaming data, uncertain databases, location-based services, and data quality.

Jingwei Zhang received his MS in computer science from Guilin University of Electronic Technology, Guilin, China, in 2004. He is currently a PhD candidate in computer science at East China Normal University, Shanghai. His research interests include web data management and analysis, massive data management, and data stream mining.

Aoying Zhou is a professor and deputy dean of the Software School at East China Normal University, Shanghai, where he also heads the Institute of Massive Computing. He is a winner of the National Science Fund for Distinguished Young Scholars supported by NSFC and holds a professorship appointment under the Chang Jiang Scholars Program sponsored by the Ministry of Education. He is the vice-director of ACM SIGMOD China and of the Database Technology Committee of the China Computer Federation. He serves as the associate editor-in-chief of the China Journal of Computer, and as a member of the editorial boards of several prestigious academic journals, such as the VLDB Journal, WWW Journal, and FCS. His research interests include Web data management, data management for data-intensive computing, management of uncertain data, data mining and data streams, distributed storage, and P2P computing.

About this article

Cite this article

Jin, C., Zhang, J. & Zhou, A. Continuous ranking on uncertain streams. Front. Comput. Sci. 6, 686–699 (2012). https://doi.org/10.1007/s11704-012-1227-7

Received: 18 October 2011
Accepted: 08 June 2012
Published: 14 November 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s11704-012-1227-7
