Efficient mining of skyline objects in subspaces over data streams

  • Regular Paper
Knowledge and Information Systems

Abstract

Given a set of k-dimensional objects, the skyline query finds the objects that are not dominated by any other object. In practice, different users may be interested in different dimensions of the data and may issue queries on any subset of the k dimensions in stream environments. This paper focuses on supporting concurrent and unpredictable subspace skyline queries over data streams. Simply computing and storing the skyline objects of every subspace in a stream environment incurs an expensive update cost. To balance query cost against update cost, we maintain only the full-space skyline. We first propose an efficient maintenance algorithm and several novel pruning techniques. We then propose an efficient and scalable two-phase algorithm that processes skyline queries in different subspaces based on the full-space skyline. Finally, we present theoretical analyses and extensive experiments that demonstrate that our method is both efficient and effective.
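
For intuition only, the dominance relation and a subspace skyline query can be sketched as below. This is a naive quadratic illustration of the definitions, not the maintenance algorithm or the two-phase algorithm proposed in the paper. The names `dominates` and `skyline`, the sample `hotels` data, and the convention that smaller values are preferred on every dimension are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point, dims: Sequence[int]) -> bool:
    """Return True if p dominates q on the given dimensions.

    Assumption: smaller is better. p dominates q when p is no worse
    than q on every chosen dimension and strictly better on at least one.
    """
    no_worse = all(p[d] <= q[d] for d in dims)
    strictly_better = any(p[d] < q[d] for d in dims)
    return no_worse and strictly_better


def skyline(points: List[Point], dims: Sequence[int]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [
        p for p in points
        if not any(dominates(q, p, dims) for q in points)
    ]


# Hypothetical objects: (price, distance to beach, noise level).
hotels = [(50, 8, 3), (60, 2, 5), (80, 1, 1), (90, 9, 6), (55, 8, 4)]

full_space = skyline(hotels, dims=(0, 1, 2))  # all k = 3 dimensions
subspace = skyline(hotels, dims=(1, 2))       # distance and noise only

print(full_space)  # [(50, 8, 3), (60, 2, 5), (80, 1, 1)]
print(subspace)    # [(80, 1, 1)]
```

Recomputing a skyline from scratch like this for every subspace and every stream update is the kind of cost the abstract describes as expensive, which is the stated motivation for maintaining only the full-space skyline.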


Author information

Corresponding author

Correspondence to Zhenhua Huang.

About this article

Cite this article

Huang, Z., Sun, S. & Wang, W. Efficient mining of skyline objects in subspaces over data streams. Knowl Inf Syst 22, 159–183 (2010). https://doi.org/10.1007/s10115-008-0185-8

  • DOI: https://doi.org/10.1007/s10115-008-0185-8
