research-article

Online subspace skyline query processing using the compressed skycube

Published: 04 June 2012 Publication History

Abstract

The skyline query can help identify the “best” objects in a multi-attribute dataset. During the past decade, this query has received considerable attention in the database research community. Most research has focused on computing the “skyline” of a dataset, that is, the set of “skyline objects” that are not dominated by any other object. Such algorithms are not appropriate in an online system, which should respond in real time to skyline query requests over arbitrary subsets of the attributes (also called subspaces). To guarantee real-time response, an online system should precompute the skylines for all subspaces and look up a skyline upon query. Unfortunately, because the number of subspaces is exponential in the number of attributes, such precomputation incurs very high storage and update costs. We propose the Compressed SkyCube (CSC), which is much more compact yet can still return the skyline of any subspace without consulting the base table. The CSC therefore combines the advantage of precomputation, in that it can respond to queries in real time, with the advantage of no precomputation, in that it has low space and update costs. This article presents the CSC data structures, the CSC query algorithm, the CSC update algorithm, and the CSC initial computation scheme. A solution that extends the approach to high-dimensional data is also proposed.
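To make the terms in the abstract concrete, the following is a minimal sketch of dominance, a subspace skyline, and the naive "precompute every subspace" approach whose cost the abstract describes. It is not the CSC structure or any algorithm from the article. The function names, the hotel data, and the convention that smaller values are better are all assumptions made for illustration.

```python
from itertools import combinations

def dominates(p, q, subspace):
    """True if p dominates q on the attribute indices in `subspace`.

    Assumption: smaller values are better. p must be no worse than q on
    every attribute of the subspace and strictly better on at least one.
    """
    no_worse = all(p[i] <= q[i] for i in subspace)
    strictly_better = any(p[i] < q[i] for i in subspace)
    return no_worse and strictly_better

def subspace_skyline(points, subspace):
    """Naive quadratic skyline: keep the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p, subspace) for q in points)]

def all_subspaces(d):
    """Yield every non-empty subset of d attribute indices (2**d - 1 of them)."""
    for k in range(1, d + 1):
        yield from combinations(range(d), k)

# Hypothetical objects with attributes (price, distance, noise).
hotels = [(50, 8, 3), (80, 2, 4), (60, 9, 5), (90, 1, 1)]

# Skyline over the full space and over the (price, distance) subspace.
full_space = subspace_skyline(hotels, (0, 1, 2))
price_distance = subspace_skyline(hotels, (0, 1))

# Naive full precomputation: one stored skyline per subspace.
naive_skycube = {s: subspace_skyline(hotels, s) for s in all_subspaces(3)}
```

With three attributes the dictionary holds 2^3 - 1 = 7 skylines; with d attributes it holds 2^d - 1, and the same object can appear in many of them. That redundancy is the storage and update cost the abstract attributes to full precomputation.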



    Published In

    ACM Transactions on Database Systems  Volume 37, Issue 2
    May 2012
    326 pages
    ISSN:0362-5915
    EISSN:1557-4644
    DOI:10.1145/2188349

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 04 June 2012
    Accepted: 01 February 2012
    Revised: 01 December 2011
    Received: 01 December 2010
    Published in TODS Volume 37, Issue 2


    Author Tags

    1. Skyline
    2. compressed
    3. subspace
    4. update support
    5. workload

    Qualifiers

    • Research-article
    • Research
    • Refereed
