Skip to main content

An Investigation of Representations and Operators for Evolutionary Data Clustering with a Variable Number of Clusters

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4193))

Abstract

This paper analyses the properties of four alternative representation/operator combinations suitable for data clustering algorithms that keep the number of clusters variable. These representations are investigated in the context of their performance when used in a multiobjective evolutionary clustering algorithm (MOCK), which we have described previously. To shed light on the resulting performance differences observed, we consider the relative size of the search space and heuristic bias inherent to each representation, as well as its locality and heritability under the associated variation operators. We find that the representation that performs worst when a random initialization is employed, is nevertheless the best overall performer given the heuristic initialization normally used in MOCK. This suggests there are strong interaction effects between initialization, representation and operators in this problem.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Cole, R.M.: Clustering with genetic algorithms. Master’s thesis, University of Western Australia, Australia (1998)

    Google Scholar 

  2. Corne, D.W., Knowles, J.D., Oates, M.J.: PESA-II: Region-based selection in evolutionary multiobjective optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 283–290. ACM Press, New York (2001)

    Google Scholar 

  3. Falkenauer, E.: Genetic Algorithms and Grouping Problems. John Wiley and Son Ltd., New York (1998)

    MATH  Google Scholar 

  4. Goulden, I.P., Jackson, D.M.: Combinatorial Enumeration, p. 192, 3.3.28. John Wiley and Sons, Inc, New York (1983)

    Google Scholar 

  5. Handl, J., Knowles, J.: Evolutionary multiobjective clustering. In: Proceedings of the Eighth International Conference on Parallel Problem Solving from Nature, pp. 1081–1091. Springer, Berlin, Germany (2004)

    Chapter  Google Scholar 

  6. Handl, J., Knowles, J.: An evolutionary approach to multiobjective clustering. IEEE Transactions on Evolutionary Computation (in press, 2006)

    Google Scholar 

  7. Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs (1988)

    MATH  Google Scholar 

  8. Ma, P.C.H., Chan, K.C.C., Yao, X., Chiu, D.K.Y.: An evolutionary clustering algorithm for gene expression microarray data analysis. IEEE Transactions on Evolutionary Computation (in press, 2006)

    Google Scholar 

  9. MacQueen, L.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297. University of California Press, Berkeley (1967)

    Google Scholar 

  10. Park, Y.-J., Song, M.-S.: A genetic algorithm for clustering problems. In: Proceedings of the Third Annual Conference on Genetic Programming, pp. 568–575. Morgan Kaufmann, San Francisco (1998)

    Google Scholar 

  11. Radcliffe, N.J., Surry, P.D.: Fitness variance of formae and performance prediction. In: Foundations of Genetic Algorithms, vol. 3, pp. 51–72. Morgan Kaufmann Publishers, San Mateo (1995)

    Google Scholar 

  12. Raidl, G.R., Gottlieb, J.: Empirical analysis of locality, heritability and heuristic bias in evolutionary algorithms: A case study for the multidimensional knapsack problem. Evolutionary Computation 13(4), 441–475 (2005)

    Article  Google Scholar 

  13. Rothlauf, F., Goldberg, D.E.: Redundant representations in evolutionary computation. evolutionary computation 11(4), 381–415 (2003)

    Article  Google Scholar 

  14. Sloane, N.J.A.: Series A060281 in The On-Line Encyclopedia of Integer Sequences

    Google Scholar 

  15. Zitzler, E.: Evolutionary algorithms for multiobjective optimization: methods and applications. PhD thesis, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland (1999)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Handl, J., Knowles, J. (2006). An Investigation of Representations and Operators for Evolutionary Data Clustering with a Variable Number of Clusters. In: Runarsson, T.P., Beyer, HG., Burke, E., Merelo-Guervós, J.J., Whitley, L.D., Yao, X. (eds) Parallel Problem Solving from Nature - PPSN IX. PPSN 2006. Lecture Notes in Computer Science, vol 4193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11844297_85

Download citation

  • DOI: https://doi.org/10.1007/11844297_85

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-38990-3

  • Online ISBN: 978-3-540-38991-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics