Abstract
Given a set of k-dimensional objects, the skyline query finds the objects that are not dominated by any other object. In practice, different users may be interested in different dimensions of the data and may issue queries on any subset of the k dimensions in stream environments. This paper focuses on supporting concurrent and unpredictable subspace skyline queries over data streams. Naively computing and storing the skyline objects of every subspace incurs a high update cost in stream environments. To balance query cost against update cost, we maintain only the full space skyline. We first propose an efficient maintenance algorithm and several novel pruning techniques. We then propose an efficient and scalable two-phase algorithm that processes skyline queries in different subspaces based on the full space skyline. Finally, we present theoretical analysis and extensive experiments demonstrating that our method is both efficient and effective.
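To make the dominance definition concrete, the following is a minimal sketch of a naive subspace skyline computation. It is not the maintenance or two-phase algorithm proposed in the paper. It assumes that smaller values are preferred on every dimension, and the names `dominates`, `skyline`, and the `hotels` data are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point, dims: Sequence[int]) -> bool:
    """Return True if a dominates b in the subspace given by dims.

    Assumption: smaller is better. a dominates b when a is no worse
    than b on every chosen dimension and strictly better on at least one.
    """
    no_worse = all(a[d] <= b[d] for d in dims)
    strictly_better = any(a[d] < b[d] for d in dims)
    return no_worse and strictly_better


def skyline(points: Sequence[Point], dims: Sequence[int]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p, dims) for q in points)]


# Hypothetical data: (price, distance) pairs, both to be minimized.
hotels = [(50, 8), (60, 3), (80, 2), (90, 5), (55, 9)]

full_space = skyline(hotels, dims=(0, 1))   # query on both dimensions
price_only = skyline(hotels, dims=(0,))     # query on the price subspace
print(full_space)
print(price_only)
```

The same `skyline` function answers a query on any subset of dimensions by changing `dims`, which illustrates why recomputing or storing every subspace result becomes costly as objects arrive and expire in a stream.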
Cite this article
Huang, Z., Sun, S. & Wang, W. Efficient mining of skyline objects in subspaces over data streams. Knowl Inf Syst 22, 159–183 (2010). https://doi.org/10.1007/s10115-008-0185-8
DOI: https://doi.org/10.1007/s10115-008-0185-8