Improved multi-objective clustering with automatic determination of the number of clusters

  • Original Article
  • Neural Computing and Applications

Abstract

This work improves the multi-objective clustering with automatic determination of the number of clusters (MOCK) approach through an empirical comparison of three multi-objective evolutionary algorithms, each of which replaces the algorithm originally used in MOCK. The results of two experiments on seven real data sets from the UCI repository are reported: (1) an evaluation using two multi-objective optimization performance metrics (hypervolume and two-set coverage) and (2) an evaluation of clustering quality using the F-measure and the silhouette coefficient. The results are compared against the original version of MOCK and against other algorithms representative of the state of the art. They indicate that the new versions are highly competitive and able to deal with different types of data sets.
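To make the two optimization performance metrics named above concrete, the following is a minimal Python sketch. It is not the authors' implementation. It assumes exactly two objectives that are both minimized, and the function names (`weakly_dominates`, `two_set_coverage`, `hypervolume_2d`) and the reference point are illustrative choices, not taken from the article.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def weakly_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective (minimization assumed)."""
    return all(x <= y for x, y in zip(a, b))


def two_set_coverage(front_a: List[Point], front_b: List[Point]) -> float:
    """C(A, B): fraction of points in B weakly dominated by some point in A.

    The measure is not symmetric, so C(A, B) and C(B, A) are usually
    reported together.
    """
    if not front_b:
        raise ValueError("front_b must contain at least one point")
    covered = sum(
        1 for b in front_b if any(weakly_dominates(a, b) for a in front_a)
    )
    return covered / len(front_b)


def hypervolume_2d(front: List[Point], ref: Point) -> float:
    """Area dominated by `front` and bounded by the reference point `ref`.

    Two minimized objectives are assumed. Points that do not strictly
    improve on `ref` in both objectives contribute nothing.
    """
    inside = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]  # lowest second-objective value seen so far
    for f1, f2 in inside:
        if f2 < best_f2:  # dominated points fail this check and are skipped
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


if __name__ == "__main__":
    a = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
    b = [(2.0, 3.0), (3.0, 3.0), (0.5, 5.0)]
    print(two_set_coverage(a, b), two_set_coverage(b, a))
    print(hypervolume_2d(a, ref=(4.0, 4.0)))
```

A larger hypervolume indicates a front that is closer to the ideal point and better spread, while C(A, B) close to 1 indicates that most of B is covered by A. Both values depend on the assumptions stated above, in particular on the choice of reference point.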

Acknowledgments

The first author acknowledges financial support from CONACyT through scholarship No. 258800 and academic support from the University of Veracruz to pursue graduate studies. The second author acknowledges support from CONACyT through Project No. 220522.

Author information

Corresponding author

Correspondence to María-Guadalupe Martínez-Peñaloza.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this article.

About this article

Cite this article

Martínez-Peñaloza, MG., Mezura-Montes, E., Cruz-Ramírez, N. et al. Improved multi-objective clustering with automatic determination of the number of clusters. Neural Comput & Applic 28, 2255–2275 (2017). https://doi.org/10.1007/s00521-016-2191-1

  • DOI: https://doi.org/10.1007/s00521-016-2191-1