Abstract
Recently, several approaches to finding probabilistic skylines on uncertain data have been proposed. In these approaches, a data object is composed of instances, each associated with a probability. The probabilistic skyline is then defined as the set of objects whose probability of not being dominated is greater than or equal to a given threshold. In many applications, data are generated as continuous data streams. Accordingly, this paper makes the first attempt to study the problem of continuously returning probabilistic skylines over uncertain data streams, under the sliding window model. To avoid recomputing, for each uncertain object, the probability of not being dominated from the instances contained in the current window, our main idea is to estimate bounds on these probabilities so as to determine early which objects can be pruned or returned as results. We first propose a basic algorithm, adapted from an existing approach to answering skyline queries on static, certain data, which updates these bounds by repeatedly processing the instances of each object. We then design a novel data structure that keeps the dominance relations between some instances in order to tighten these bounds rapidly, and propose a progressive algorithm based on this structure. Both algorithms are also adapted to solve the problem of continuously maintaining top-k probabilistic skylines. Finally, a set of experiments is performed to evaluate these algorithms; the results show that the progressive algorithm substantially outperforms the basic one, demonstrating the effectiveness of the newly designed structure.
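To make the definition above concrete, the following is a minimal Python sketch of a threshold-based probabilistic skyline over a single static window. It is not the basic or progressive algorithm of the paper. It assumes that smaller values are better in every dimension, that the instance probabilities of each object sum to 1, and that objects are independent. All names (`dominates`, `instance_survival`, `skyline_probability`, `probability_bounds`, `probabilistic_skyline`) are hypothetical. The bounds function only illustrates the general idea of bounding a probability before all instances are examined: the lower bound is the mass accumulated so far, and the upper bound adds the probability mass of the instances not yet examined.

```python
from math import prod


def dominates(a, b):
    """True if point a dominates point b (smaller is better, by assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def instance_survival(point, owner, objects):
    """Probability that `point` is not dominated by any object other than `owner`."""
    return prod(
        1.0 - sum(q for other_point, q in instances if dominates(other_point, point))
        for name, instances in objects.items()
        if name != owner
    )


def skyline_probability(owner, objects):
    """Probability that object `owner` is not dominated, summed over its instances."""
    return sum(p * instance_survival(pt, owner, objects) for pt, p in objects[owner])


def probability_bounds(owner, objects, k):
    """Lower and upper bounds after examining only the first k instances of `owner`."""
    lower = sum(
        p * instance_survival(pt, owner, objects) for pt, p in objects[owner][:k]
    )
    unseen_mass = sum(p for _, p in objects[owner][k:])
    return lower, lower + unseen_mass


def probabilistic_skyline(objects, threshold):
    """Names of objects whose skyline probability is at least `threshold`."""
    return sorted(
        name for name in objects if skyline_probability(name, objects) >= threshold
    )


# Hypothetical window contents: object name -> list of (point, probability).
window = {
    "A": [((1, 1), 0.5), ((4, 4), 0.5)],
    "B": [((2, 2), 1.0)],
    "C": [((3, 3), 0.5), ((5, 1), 0.5)],
}

for name in window:
    print(name, skyline_probability(name, window))
print(probabilistic_skyline(window, 0.5))
```

In this sketch an object can be returned as soon as its lower bound reaches the threshold, and pruned as soon as its upper bound falls below it, without examining its remaining instances.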
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Su, H.Z., Wang, E.T., Chen, A.L.P. (2010). Continuous Probabilistic Skyline Queries over Uncertain Data Streams. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15364-8_8
DOI: https://doi.org/10.1007/978-3-642-15364-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15363-1
Online ISBN: 978-3-642-15364-8
eBook Packages: Computer Science (R0)