Skip to main content
Log in

Restricting skyline sizes using weak Pareto dominance

  • Themenheft Datenbanksysteme
  • Published:
Informatik - Forschung und Entwicklung

Abstract

Skyline queries have recently received a lot of attention due to their intuitive query formulation: users can state preferences with respect to several attributes. Unlike numerical or score-based preferences, preferences over discrete value domains do not show an inherent total order, but have to rely on partial orders as stated by the user. In such orders typically many object values are incomparable, increasing the size of skyline sets significantly, and making their computation expensive. In this paper we explore how to enable interactive tasks like query refinement or relevance feedback by providing interesting subsets of the full Pareto skyline, which give users a good overview over the skyline. To be practical these subsets have to be small, efficient to compute, suitable for higher numbers of query predicates, and representative. The key to improved performance and reduced result set sizes is the relaxation of Pareto semantics to the concept of weak Pareto dominance. We argue that this relaxation yields intuitive results and show how it opens up the use of efficient and scalable query processing algorithms. We first derive the complete skyline subset given by weak Pareto dominance called ‘restricted skyline’ and then considering the individual performance of objects limit this further to a subset called ‘focused skyline’. Assessing the practical impact our experiments show that our approach indeed leads to lean result set sizes and outperforms Pareto skyline computations by up to two orders of magnitude.

Zusammenfassung

Skyline-Queries stehen in den letzten Jahren im Fokus der Aufmerksamkeit, da sie eine besonders intuitive Form der Anfragestellung unterstützen. Der Benutzer spezifiziert Präferenzen nur in qualitativer Weise im Bezug auf alle gewünschten Attribute. Im Gegensatz zu numerischen oder Score-basierten Präferenzen, definieren attribut-basierte Präferenzen dabei nicht immer eine totale Ordnung auf den Datenbankobjekten. Sie basieren typischerweise lediglich auf partiellen Ordnungen, die der Benutzer angibt. In solchen Ordnung sind daher viele Objekte unvergleichbar, was die durchschnittliche Größe der Skyline signifikant anwachsen lässt und die Effizienz ihrer Berechnung deutlich verbessert. In dieser Arbeit untersuchen wir die Ableitung interessanter Teilmengen der Pareto-Skyline, die Benutzern einen guten Überblick über die zu erwartenden Objekte für interaktive Aufgaben wie Anfrageverfeinerung oder Relevanz-Feedback bieten. Um nützlich zu sein, müssen diese Teilmengen klein, effizient zu berechnen, für hoch-dimensionale Anfragen geeignet und möglichst repräsentativ sein. Der Schlüssel zu guten Berechnungszeiten und deutlich verkleinerten Skylines ist dabei die Abschwächung der Pareto-Optimalität auf das Konzept der schwachen Pareto-Dominanz. Wir zeigen, dass diese Abschwächung zu intuitiven Ergebnissen führt und die Verwendung effizienter und skalierbarer Anfrageauswertungsalgorithmen erlaubt. Wir werten zuerst die komplette Menge der nicht schwach dominierten Objekte (Restricted Skyline) aus und leiten dann daraus eine besondere Teilmenge jener Objekte ab, die viel versprechend bezüglich des Benutzerinteresses sind (Focused Skyline). Unsere Experimente zur Überprüfung der praktischen Anwendbarkeit unseres Ansatzes zeigen, dass nicht nur die Größe der Resultatmengen deutlich schrumpft, sondern auch die notwendige Auswertungszeit um etwa zwei Größenordnungen reduziert werden kann.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  1. Balke WT, Güntzer U (2004) Multi-objective Query Processing for Database Systems. In: Proc Int Conf on Very Large Databases (VLDB), Toronto, Canada

  2. Balke WT, Güntzer U (2005) Efficient Skyline Queries under Weak Pareto Dominance. In: Proc IJCAI-05 Multidisciplinary Workshop on Advances in Preference Handling (PREFERENCE), Edinburgh, UK

  3. Balke WT, Güntzer U, Siberski W (2006) Exploiting Indifference for Customization of Partial Order Skylines. In: Proc of the Int Database Engineering & Applications Symposium (IDEAS 2006), Delhi, India

  4. Balke WT, Güntzer U, Siberski W (2007) Getting Prime Cuts from Skylines over Partially Ordered Domains. In: Proc of the GI-Fachtagung für Datenbanksysteme in Business, Technologie und Web (BTW 2007), Aachen, Germany

  5. Balke WT, Güntzer U, Zheng J (2004) Efficient Distributed Skylining for Web Information Systems. In: Proc of the Int Conf on Extending Database Technology (EDBT), LNCS 2992, Heraklion, Crete, Greece

  6. Balke WT, Wagner M (2004) Through Different Eyes – Assessing Multiple Conceptual Views for Querying Web Services. In: Proc of the Int World Wide Web Conference (WWW 2004), New York, USA, ACM

  7. Balke WT, Zheng J, Güntzer U (2005) Approaching the Efficient Frontier: Cooperative Database Retrieval Using High-Dimensional Skylines. In: Proc of the Int Conf on Database Systems for Advanced Applications (DASFAA), Beijing, China

  8. Bentley J, Kung H, Schkolnick M, Thompson C (1978) On the Average Number of Maxima in a Set of Vectors and Applications. J ACM 25(4):536–543

    Article  MATH  Google Scholar 

  9. Börzsönyi S, Kossmann D, Stocker K (2001) The Skyline Operator. In: Proc of the Int Conf on Data Engineering (ICDE), Heidelberg, Germany

  10. Chan C, Eng P, Tan K (2005) Stratified Computation of Skylines with Partially Ordered Domains. In: Proc of the Int Conf on Management of Data (SIGMOD), Baltimore, MD, USA

  11. Chomicki J (2002) Querying with Intrinsic Preferences. In: Proc of the Int Conf on Extending Database Technology (EDBT), LNCS 2287, Prague, Czech Republic

  12. Chomicki J (2003) Preference Formulas in Relational Queries. ACM Trans Database Syst 28(4):427–466

    Article  Google Scholar 

  13. Fagin R, Lotem A, Naor M (2001) Optimal Aggregation Algorithms for Middleware. In: ACM Symp on Principles of Database Systems (PODS), Santa Barbara, USA

  14. Fishburn P (1999) Preference Structures and their Numerical Representations. Theor Comput Sci 217:359–383

    Article  MATH  Google Scholar 

  15. Güntzer U, Balke WT, Kießling W (2000) Optimizing Multi-Feature Queries for Image Data-bases. In: Proc of the Int Conf on Very Large Databases (VLDB), Cairo, Egypt

  16. Huang X, Jensen C (2004) In-Route Skyline Querying for Location-Based Services. In: Proc of the Int Workshop on Web and Wireless Geographical Information Systems (W2GIS), Goyang, Korea

  17. Kießling W (2002) Foundations of Preferences in Database Systems. In: Proc of the Int Conf on Very Large Databases (VLDB), Hong Kong, China

  18. Kießling W (2005) Preference Queries with SV-Semantics. In: Proc of the Int Conf on Management of Data (COMAD), Goa, India

  19. Köhncke B, Balke WT (2006) Personalized Digital Item Adaptation in Service-Oriented Environments. In: Proc of the Int Workshop on Semantic Media Adaptation and Personalization (SMAP 2006), Athens, Greece

  20. Koltun V, Papadimitriou C (2005) Approximately Dominating Representatives. In: Proc of the Int Conf on Database Theory (ICDT), Edinburgh, UK

  21. Kossmann D, Ramsak F, Rost S (2002) Shooting Stars in the Sky: An Online Algorithm for Skyline Queries. In: Proc of the Int Conf on Very Large Data Bases (VLDB), Hong Kong, China

  22. Lacroix M, Lavency P (1987) Preferences: Putting more Knowledge into Queries. In: Proc of the Int Conf on Very Large Databases (VLDB), Brighton, UK

  23. Motro A (1988) VAGUE: A User Interface to Relational Databases that Permits Vague Queries. ACM Trans Inf Syst 6(3):187–214

    Article  Google Scholar 

  24. Papadias D, Tao Y, Fu G et al. (2003) An Optimal and Progressive Algorithm for Skyline Queries. In: Proc of the Int ACM SIGMOD Conf (SIGMOD’03), San Diego, USA

  25. Tan KL, Eng PK, Ooi BC (2001) Efficient Progressive Skyline Computation. In: Proc of Conf on Very Large Data Bases (VLDB’01), Rome, Italy

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wolf Tilo Balke.

Additional information

This paper is an extended version of [4].

CR subject classification

H.3.3

Rights and permissions

Reprints and permissions

About this article

Cite this article

Balke, W.T., Güntzer, U. & Siberski, W. Restricting skyline sizes using weak Pareto dominance . Informatik Forsch. Entw. 21, 165–178 (2007). https://doi.org/10.1007/s00450-007-0025-1

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00450-007-0025-1

Keywords

Schlagworte

Navigation