research-article

Online subspace skyline query processing using the compressed skycube

Published: 04 June 2012

Abstract

The skyline query helps identify the “best” objects in a multi-attribute dataset. Over the past decade, this query has received considerable attention in the database research community. Most research has focused on computing the “skyline” of a dataset, that is, the set of “skyline objects” that are not dominated by any other object. Such algorithms are not appropriate for an online system, which should respond in real time to skyline query requests over arbitrary subsets of the attributes (also called subspaces). To guarantee real-time response, an online system should precompute the skylines for all subspaces and look up a skyline upon query. Unfortunately, because the number of subspaces is exponential in the number of attributes, such precomputation incurs very high storage and update costs. We propose the Compressed SkyCube (CSC), which is much more compact yet can still return the skyline of any subspace without consulting the base table. The CSC therefore combines the advantage of precomputation, in that it can respond to queries in real time, with the advantage of no precomputation, in that it has low space and update costs. This article presents the CSC data structures, the CSC query algorithm, the CSC update algorithm, and the CSC initial computation scheme. A solution that extends the approach to high-dimensional data is also proposed.
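
To make the terms in the abstract concrete, the following is a minimal sketch of the naive approach that the CSC is meant to improve on: it computes the skyline of one subspace straight from the definition of dominance, then precomputes a skyline for every non-empty subspace. It is not the CSC structure or any algorithm from the article. The function names, the hotel data, and the convention that smaller values are better are all assumptions made for illustration.

```python
from itertools import combinations


def dominates(p, q, subspace):
    """Return True if p dominates q on the given attribute indices.

    Assumption: smaller values are preferred on every attribute.
    p dominates q if it is no worse everywhere and strictly better somewhere.
    """
    no_worse = all(p[i] <= q[i] for i in subspace)
    strictly_better = any(p[i] < q[i] for i in subspace)
    return no_worse and strictly_better


def skyline(points, subspace):
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p, subspace) for q in points)]


def all_subspaces(num_attributes):
    """Yield every non-empty subset of attribute indices (2^d - 1 in total)."""
    for size in range(1, num_attributes + 1):
        yield from combinations(range(num_attributes), size)


# Hypothetical data: (price, distance_to_beach, noise_level)
hotels = [(100, 5, 3), (80, 7, 4), (120, 2, 2), (90, 6, 5)]

# Full precomputation: one stored skyline per subspace.
full_skycube = {u: skyline(hotels, u) for u in all_subspaces(3)}
```

With three attributes the dictionary already holds seven skylines, and each additional attribute roughly doubles that count. This growth is the storage and update burden the abstract refers to.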

    • Published in

      ACM Transactions on Database Systems, Volume 37, Issue 2
      May 2012
      326 pages
      ISSN: 0362-5915
      EISSN: 1557-4644
      DOI: 10.1145/2188349

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 June 2012
      • Accepted: 1 February 2012
      • Revised: 1 December 2011
      • Received: 1 December 2010

      Qualifiers

      • research-article
      • Research
      • Refereed
