Abstract
Numerous real-life applications continually generate huge amounts of uncertain data (e.g., sensor or RFID readings). As a result, top-k queries that return only the k most promising probabilistic tuples have become an important means of monitoring and analyzing such data. These “top” tuples should have both a high score in terms of some ranking function and a high occurrence probability. Previous work on ranking semantics is not entirely satisfactory: existing semantics either require user-specified parameters other than k, cannot be evaluated efficiently enough for real-time use, or even generate results that violate the underlying probability model. To overcome these deficiencies, we propose a new semantics called U-Popk, based on a simpler but more fundamental property inherent in the underlying probability model. We then develop an efficient algorithm to evaluate U-Popk. Extensive experiments confirm that U-Popk ensures high ranking quality and supports efficient evaluation of top-k queries on probabilistic tuples.
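To make the tension between score and occurrence probability concrete, the following is a minimal sketch, not the U-Popk algorithm itself (the abstract does not define it). It assumes that tuples exist independently of one another and have distinct scores, and it computes the probability that each tuple is the highest-scored tuple that actually occurs. The function name `top1_probabilities` and the sample readings are hypothetical.

```python
from typing import List, Tuple

# Each probabilistic tuple is (id, score, occurrence probability).
# Assumption for illustration only: tuples occur independently and scores are distinct.
ProbTuple = Tuple[str, float, float]


def top1_probabilities(tuples: List[ProbTuple]) -> List[Tuple[str, float]]:
    """Return (id, Pr[tuple is the top-scored occurring tuple]) in descending score order."""
    ranked = sorted(tuples, key=lambda t: t[1], reverse=True)
    result = []
    none_higher = 1.0  # probability that no higher-scored tuple occurs
    for tid, _score, prob in ranked:
        # The tuple is top-1 if it occurs and every higher-scored tuple is absent.
        result.append((tid, prob * none_higher))
        none_higher *= 1.0 - prob
    return result


# Hypothetical sensor readings: t1 has the best score but rarely occurs.
readings = [("t1", 95.0, 0.2), ("t2", 90.0, 0.9), ("t3", 80.0, 0.5)]
top1 = top1_probabilities(readings)
best_id = max(top1, key=lambda pair: pair[1])[0]
```

In this toy data the highest-scored tuple is unlikely to occur, so a tuple with a slightly lower score but a much higher occurrence probability is the more likely top-1 answer. Ranking semantics for uncertain data must resolve exactly this kind of trade-off.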
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yan, D., Ng, W. (2011). Robust Ranking of Uncertain Data. In: Yu, J.X., Kim, M.H., Unland, R. (eds) Database Systems for Advanced Applications. DASFAA 2011. Lecture Notes in Computer Science, vol 6587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20149-3_20
DOI: https://doi.org/10.1007/978-3-642-20149-3_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20148-6
Online ISBN: 978-3-642-20149-3
eBook Packages: Computer Science, Computer Science (R0)