Abstract
The skyline query helps identify the “best” objects in a multi-attribute dataset. During the past decade, this query has received considerable attention in the database research community. Most research has focused on computing the “skyline” of a dataset, that is, the set of “skyline objects” that are not dominated by any other object. Such algorithms are not appropriate for an online system, which should respond in real time to skyline query requests over arbitrary subsets of the attributes (also called subspaces). To guarantee real-time response, an online system should precompute the skylines for all subspaces and look up the relevant skyline at query time. Unfortunately, because the number of subspaces is exponential in the number of attributes, such precomputation incurs very high storage and update costs. We propose the Compressed SkyCube (CSC), which is much more compact yet can still return the skyline of any subspace without consulting the base table. The CSC therefore combines the advantage of precomputation, in that it can respond to queries in real time, with the advantage of no precomputation, in that it has low space and update costs. This article presents the CSC data structures, the CSC query algorithm, the CSC update algorithm, and the CSC initial computation scheme. A solution for extending the approach to high-dimensional data is also proposed.
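To make the notions of dominance, subspace skyline, and the exponential number of subspaces concrete, here is a minimal Python sketch. It is not the CSC structure or any algorithm from the article; it is the naive quadratic baseline that recomputes a skyline from the base table. It assumes the usual convention that smaller values are preferred and that an object dominates another when it is no worse on every attribute of the subspace and strictly better on at least one. The function names and the `hotels` data (price, distance, noise) are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def dominates(p, q, subspace):
    """True if p dominates q on the attribute indices in `subspace`.

    Assumption: smaller values are better on every attribute.
    """
    return (all(p[i] <= q[i] for i in subspace)
            and any(p[i] < q[i] for i in subspace))

def subspace_skyline(objects, subspace):
    """Naive O(n^2) skyline: keep objects not dominated by any other object."""
    return [p for p in objects
            if not any(dominates(q, p, subspace) for q in objects)]

def all_subspaces(d):
    """Yield every non-empty subset of the d attribute indices (2^d - 1 of them)."""
    for k in range(1, d + 1):
        yield from combinations(range(d), k)

# Hypothetical base table: (price, distance, noise), smaller is better.
hotels = [(50, 8, 3), (60, 2, 4), (80, 1, 1), (90, 9, 5)]

# Full precomputation stores one skyline per subspace, which is what
# makes the storage and update costs grow exponentially with d.
skycube = {s: subspace_skyline(hotels, s) for s in all_subspaces(3)}
```

With three attributes the dictionary already holds seven skylines; with 20 attributes it would hold more than a million, which is the cost that motivates a more compact structure.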
Index Terms
- Online subspace skyline query processing using the compressed skycube