Skip to main content

Advertisement

Log in

Integrating multi-objective genetic algorithm based clustering and data partitioning for skyline computation

  • Published:
Applied Intelligence Aims and scope Submit manuscript

Abstract

Skyline computation in databases has been a hot topic in the literature because of its interesting applications. The basic idea is to find non-dominated values within a database. The task is mainly a multi-objective optimization process as described in this paper. This motivated for our approach that employs a multi-objective genetic algorithm based clustering approach to find the pareto-optimal front which allows us to locate skylines within a given data. To tackle large data, we simply split the data into manageable subsets and concentrate our analysis on the subsets instead of the whole data at once. The proposed approach produced interesting results as demonstrated by the outcome from the conducted experiments.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  1. Aksoy Y, Butler TW, Minor ED (1996) Comparitive studies in interactive multiple objective mathematical programming. Eur J Oper Res 89:408–422

    Article  MATH  Google Scholar 

  2. Alhajj R, Kaya M (2008) Multi-objective genetic algorithms based automated clustering for fuzzy association rules mining. J Intell Inf Syst 31(3):243–264

    Article  Google Scholar 

  3. Armentano VA, Claudio JE (2004) An application of a multi-objective tabu search algorithm to a bicriteria flowshop problem. J Heuristics 10(5):463–481

    Article  MATH  Google Scholar 

  4. Berge C (1962) The theory of graphs and its applications. Methuen, London

    MATH  Google Scholar 

  5. Borzsonyi S, Kossmann D, Stocker K (2001) The skyline operator. In: Proceedings of IEEE international conference on data engineering, Heidelberg, Germany, pp 421–430

  6. Buchanan J, Daellenbach HG (1987) A camparative evaluation of interactive solution methods for multiple objective decision models. Eur J Oper Res 24:353–359

    Article  Google Scholar 

  7. Chan C-Y, Eng P-K, Tan K-L (2005) Stratified computation of skylines with partially-ordered domains. In: Proceedings of ACM-SIGMOD international conference on management of data, 2005

  8. Chan C-Y, Jagadish HV, Tan K-L, Tung AKH, Zhang Z (2006) Finding k-dominant skylines in high dimensional space. In: Proceedings of ACM-SIGMOD international conference on management of data, 2006

  9. Chaudhuri S, Dalvi N, Kaushik R (2006) Robust cardinality and cost estimation for skyline operator. In: Proceedings of IEEE international conference on data engineering, 2006

  10. Chavez E, Navarro G (2005) A compact space decomposition for effective metric indexing. Pattern Recogn Lett 26(9):1363–1376

    Article  Google Scholar 

  11. Chomicki J, Godfrey P, Gryz J, Liang D (2003) Skyline with presorting. In: Proceedings of IEEE international conference on data engineering, 2003

  12. Coello CA, Lamont GB (2004) Recent developments and future research directions. In: Applications of multi-objective evolutionary algorithms. World Scientific, Singapore

    Google Scholar 

  13. Deb K, Agrawal S, Pratab A, Meyarivan T (2000) A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: Nsga-II. In: Proceedings of the parallel problem solving from nature, number 1917, Paris, France

  14. Elegbede C, Adjallah K (2003) Availability allocation to repairable systems with genetic algorithms: a multi-objective formulation. Reliab Eng Syst Saf 82(3):319–330

    Article  Google Scholar 

  15. Fonseca CM, Fleming PJ (1998) Multiobjective optimization and multiple constraint handling with evolutionary algorithms—Part I: a unified formulation. IEEE Trans Syst Man Cybern, Part A 28:26–37

    Article  Google Scholar 

  16. Hjaltason G, Samet H (1999) Distance browsing in spatial databases. ACM Trans Database Syst 24(2):265–318

    Article  Google Scholar 

  17. Huang Z, Lu H, Ooi BC, Tung AK (2006) Continuous skyline queries for moving objects. IEEE Trans Knowl Data Eng 18(12):1645–1658

    Article  Google Scholar 

  18. Lin X, Yuan Y, Zhang Q, Zhang Y (2007) Selecting stars: The k most representative skyline operator. In: Proceedings of IEEE international conference on data engineering, 2007

  19. Kalefa ME, Mokbel MF, Levandoski JJ (2008) Skyline query processing for incomplete data. In: Proceedings of IEEE international conference on data engineering, 2008

  20. Kossmann D, Ramsak F, Rost S (2002) Shooting stars in the sky: An online algorithm for skyline queries. In: Proceedings of the international conference on very large databases, 2002

  21. Kumar R, Banerjee N (2003) Multicriteria network design using evolutionary algorithm. In: Proceedings of genetic and evolutionary computations conference (GECCO-03). LNCS, vol 2023. Springer, Berlin, pp 2179–2190

    Google Scholar 

  22. Kumar R, Parida PP, Gupta M (2002) Topological design of communication networks using multiobjective genetic optimization. In: Proceedings of the congress Evolutionary Computation (CEC-2002). IEEE Press, New York, pp 425–430

    Chapter  Google Scholar 

  23. Kung HT, Luccio F, Preparata FP (1975) On finding the maxima of a set of vectors. J ACM 22(4):469–476

    Article  MathSciNet  MATH  Google Scholar 

  24. Lee KCK, Zheng B, Li H, Lee W-C (2007) Approaching the skyline in Z order. In: Proceedings of the international conference on very large databases, 2007

  25. Likas A, Vlassis N, Verbeek J (2003) The global k-means clustering algorithm. Pattern Recogn 36(2):451–461

    Article  Google Scholar 

  26. Lu Y, Lu S, Fotouhi F, Deng Y, Brown S (2004) FGKA: A fast genetic k-means clustering algorithm. In: Proceedings of ACM symposium on applied computing, pp. 162–163, 2004

  27. Mamede M (2005) Recursive lists of clusters: A dynamic data structure for range queries in metric spaces. Proceedings of the 20th international symposium on computer and information sciences, pp. 843–853

  28. Matousek J (1991) Computing dominances in En. Inf Process Lett 38(5):277–278

    Article  MathSciNet  MATH  Google Scholar 

  29. Morse M, Patel JM, Jagadish HV (2007) Efficient skyline computation over low-cardinality domains. In: Proceedings of the international conference on very large databases, 2007

  30. Murata T, Ishibuchi H, Tanaka H (1996) Multi-objective genetic algorithm and its applications to flowshop scheduling. Comput Ind Eng 30(4):957–968

    Article  Google Scholar 

  31. Özyer T, Alhajj R (2006) Clustering by integrating multi-objective optimization with weighted k-means and validity analysis. In: Proceedings of the international conference on intelligent data engineering and automated learning, 2006

  32. Özyer T, Alhajj R (2006) Achieving natural clustering by validating results of iterative evolutionary clustering approach. In: Proceedings of the IEEE conference on intelligent systems, 2006

  33. Özyer T, Alhajj R (2008) Deciding on number of clusters by multi-objective optimization and validity analysis. J Multi-Valued Log Soft Comput 14(3–5):457–474

    Google Scholar 

  34. Özyer T, Alhajj R (2009) Parallel clustering of high dimensional data by integrating multi-objective genetic algorithm with divide and conquer. J Appl Intell 31(3):318–331

    Article  Google Scholar 

  35. Papadias D, Tao Y, Fu G, Seeger B (2005) Progressive skyline computation in database systems. ACM Trans Database Syst 30(1):41–82

    Article  Google Scholar 

  36. Pei J, Jin W, Ester M, Tao Y (2005) Catching the best views of skyline: A semantic approach based on decisive subspaces. In: Proceedings of the international conference on very large databases, 2005

  37. Pei J, Fu AW-C, Lin X, Wang H (2007) Computing compressed multidimensional skyline cubes efficiently. In: Proceedings of IEEE international conference on data engineering, 2007

  38. Tamura K, Miura S (1979) Necessary and sufficient conditions for local and global non dominated solutions in decision problems with multi-objectives. J Optim Theory Appl 28(4):501–523

    Article  MathSciNet  MATH  Google Scholar 

  39. Tan KL, Eng PK, Ooi BC (2001) Efficient progressive skyline computation. In: Proceedings of the international conference on very large databases, 2001

  40. Tao Y, Papadias D (2006) Maintaining sliding window skyline on data streams. IEEE Trans Knowl Data Eng 18(2):377–391

    Google Scholar 

  41. Tao Y, Xiao X, Pei J (2006) SUBSKY: Efficient computation of skylines in subspaces. In: Proceedings of IEEE international conference on data engineering, 2006

  42. Vlachou A, Doulkeridis C, Koidis Y (2008) Angle-based space partitioning for efficient parallel skyline computation. In: Proceedings of ACM-SIGMOD international conference on management of data, 2008

  43. Wong RC-W, Fu AW-C, Pei J, Ho YS, Wong T, Liu Y (2008) Efficient skyline querying with variable user preference on nominal attributes. In: Proceedings of the international conference on very large databases, 2008

  44. Wu P, Agrawal D, Egecioglu O, Abbadi AE (2007) DeltaSky: Optimal maintenance of skyline deletions without exclusive dominance region generation. In: Proceedings of IEEE international conference on data engineering, pp 486–495

  45. Xia T, Zhang D (2006) Refreshing the sky: The compressed skycube with efficient support for frequent updates. In: Proceedings of ACM-SIGMOD international conference on management of data, 2006

  46. Yang J-E, Hwang M-J, Sung T-Y, Jin Y (1999) Application of genetic algorithm for reliability allocation in nuclear power plants. Reliab Eng Syst Saf 65(3):229–238

    Article  Google Scholar 

  47. Yuan Y, Lin X, Liu Q, Wang W, Yu JX, Zhang Q (2005) Efficient computation of the skyline cube. In: Proceedings of the international conference on very large databases, 2005

  48. Zitzler E (1999) Evolutionary algorithms for multiobjective optimization: Methods and applications. PhD thesis, Zurich: Swiss Federal Institute of Technology (ETH), Aachen, Germany

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Reda Alhajj.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Özyer, T., Zhang, M. & Alhajj, R. Integrating multi-objective genetic algorithm based clustering and data partitioning for skyline computation. Appl Intell 35, 110–122 (2011). https://doi.org/10.1007/s10489-009-0206-7

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10489-009-0206-7

Navigation