ABSTRACT
During the last decade, data management and storage have become increasingly distributed. Given the huge amount of data available in such systems, advanced query operators, such as skyline queries, are necessary to help users process the data. For example, a user who is interested in buying a car wants to find a good trade-off between minimum age and minimum price. It is not obvious how much cheaper a car should be if it is one year older than another car. The skyline query therefore retrieves the set of data items that represent the best trade-offs with respect to the user's preferences. The skyline operator was proposed about a decade ago, but research on skyline queries, especially in distributed scenarios, is still ongoing.
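The car example can be made concrete with a minimal sketch. It assumes each car is an (age, price) pair in which smaller values are preferred in both dimensions; the names `dominates` and `skyline` and the sample data are illustrative and not taken from the tutorial. The naive quadratic scan below only illustrates the definition; it is not one of the algorithms surveyed.

```python
from typing import List, Tuple

# Hypothetical representation: (age in years, price); lower is better in both.
Car = Tuple[float, float]


def dominates(a: Car, b: Car) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(cars: List[Car]) -> List[Car]:
    """Keep every car that is not dominated by any other car (naive O(n^2) scan)."""
    return [c for c in cars if not any(dominates(other, c) for other in cars)]


# Made-up sample data for illustration only.
cars = [(1, 30000), (2, 22000), (3, 24000), (5, 12000), (6, 15000)]
print(skyline(cars))
```

In this sample, the car (3, 24000) is dropped because (2, 22000) is both younger and cheaper, whereas (1, 30000) and (5, 12000) both remain: neither is better than the other in every dimension, which is exactly the trade-off the user has to judge.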
Query processing in distributed environments poses inherent challenges and requires non-traditional techniques due to the distribution of content and the lack of global knowledge. In this tutorial, we will outline the objectives and the main principles that any distributed skyline approach has to fulfill, leading to useful guidelines for the design of efficient distributed skyline algorithms. More importantly, distributed processing of other query types shares the same objectives and principles; therefore, several of the guidelines are also applicable to other query types. Furthermore, this tutorial will provide a broad survey of the state of the art in distributed skyline processing, present a categorization of the existing approaches based on their characteristics, and point out open research challenges in distributed skyline processing.
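One way to picture the lack of global knowledge is a sketch in which each site sees only its own data. The example below relies on a general property of skylines rather than on any specific approach from the tutorial: a point dominated within its own site cannot be in the global skyline, so each site may ship only its local skyline to a coordinator, which merges the candidates. The function names, the coordinator role, and the data are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, ...]  # lower values are assumed to be better in every dimension


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse in all dimensions, strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Naive skyline: keep the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def distributed_skyline(sites: List[List[Point]]) -> List[Point]:
    """Hypothetical coordinator: merge the local skylines, then filter once more."""
    # Step 1: each site prunes locally, so fewer points need to be transferred.
    local_skylines = [skyline(site) for site in sites]
    # Step 2: the coordinator removes candidates dominated by points of other sites.
    candidates = [p for local in local_skylines for p in local]
    return skyline(candidates)


# Made-up (age, price) data spread over three sites.
sites = [
    [(1, 30000), (3, 24000), (4, 31000)],
    [(2, 22000), (6, 15000)],
    [(5, 12000), (4, 26000)],
]
print(distributed_skyline(sites))
```

The result matches the skyline computed over the union of all sites. The sketch leaves out everything that makes the problem hard in practice, such as choosing which sites to contact and limiting the data transferred between them.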
Index Terms
- Distributed skyline processing: a trend in database research still going strong