Improving Clustering via a Fine-Grained Parallel Genetic Algorithm with Information Sharing

  • Conference paper
Data Mining (AusDM 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1127))

Abstract

Clustering is a very common unsupervised machine learning task, used to organise datasets into groups that can provide useful insight. Genetic algorithms (GAs) are often applied to the task of clustering as they are effective at finding viable solutions to optimization problems. Parallel genetic algorithms (PGAs) are an existing approach that maximizes the effectiveness of GAs by running them in parallel with multiple independent subpopulations. The subpopulations can also communicate by exchanging information throughout the genetic process, enhancing their overall effectiveness. PGAs offer greater performance by mitigating some of the weaknesses of GAs. Firstly, having multiple subpopulations enables the algorithm to explore the solution space more widely. This can reduce the probability of converging to poor-quality local optima, while increasing the chance of finding high-quality local optima. Secondly, PGAs offer improved execution time, as each subpopulation is processed in parallel on a separate thread. Our technique advances an existing GA-based method called GenClust++ by employing a PGA along with a novel information sharing technique. We also compare our technique with 2 alternative information sharing functions, as well as with no information sharing. On 5 commonly researched datasets, our approach consistently yields improved cluster quality and a markedly reduced runtime compared to GenClust++.
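
The general idea of subpopulations that evolve in parallel and periodically exchange information can be illustrated with a minimal sketch. It is not the paper's fine-grained algorithm, GenClust++, or its information sharing function: it is a simple island-style GA in which each chromosome is a set of k centroids, fitness is the sum of squared errors (SSE), and the best chromosome of each subpopulation is copied to its ring neighbour. All names (`sse`, `evolve`, `parallel_ga`) and all parameter values are assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, ctr)) for ctr in centroids)
        for pt in points
    )

def evolve(pop, points, generations, rng):
    """Run a basic elitist GA on one subpopulation; return it sorted best-first."""
    for _ in range(generations):
        pop.sort(key=lambda ch: sse(ch, points))
        survivors = pop[: len(pop) // 2]           # keep the better half
        children = []
        while len(survivors) + len(children) < len(pop):
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(a))         # single-point crossover (needs k >= 2)
            child = [list(c) for c in a[:cut] + b[cut:]]
            i = rng.randrange(len(child))          # perturb one centroid
            child[i] = [x + rng.gauss(0, 0.1) for x in child[i]]
            children.append(child)
        pop = survivors + children
    pop.sort(key=lambda ch: sse(ch, points))
    return pop

def parallel_ga(points, k=2, islands=4, pop_size=8, epochs=5, gens=5, seed=0):
    """Hypothetical island-style PGA; returns (best, best_sse, initial_best_sse)."""
    rngs = [random.Random(seed + i) for i in range(islands)]  # one RNG per island
    pops = [
        [[list(p) for p in rng.sample(points, k)] for _ in range(pop_size)]
        for rng in rngs
    ]
    initial_best = min(sse(ch, points) for pop in pops for ch in pop)
    # One thread per subpopulation. In CPython the GIL limits the speed-up
    # for pure-Python code, so this only shows the structure.
    with ThreadPoolExecutor(max_workers=islands) as pool:
        for _ in range(epochs):
            jobs = [(pop, points, gens, rng) for pop, rng in zip(pops, rngs)]
            pops = list(pool.map(lambda args: evolve(*args), jobs))
            # Information sharing (assumed scheme): each island's worst
            # chromosome is replaced by a copy of its ring neighbour's best.
            bests = [[list(c) for c in pop[0]] for pop in pops]
            for i, pop in enumerate(pops):
                pop[-1] = bests[i - 1]
    best = min((ch for pop in pops for ch in pop), key=lambda ch: sse(ch, points))
    return best, sse(best, points), initial_best

if __name__ == "__main__":
    data_rng = random.Random(42)
    data = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(30)]
    data += [(data_rng.gauss(5, 0.5), data_rng.gauss(5, 0.5)) for _ in range(30)]
    best, best_sse, initial_best = parallel_ga(data)
    print("initial best SSE:", initial_best, "final best SSE:", best_sse)
```

Because each island keeps its best half and migration only overwrites the worst chromosome, the best SSE in this sketch can never get worse between epochs. Setting `islands=1` removes the information sharing and gives a plain GA for comparison.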

Corresponding author

Correspondence to Storm Bartlett.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

Cite this paper

Bartlett, S., Islam, M.Z. (2019). Improving Clustering via a Fine-Grained Parallel Genetic Algorithm with Information Sharing. In: Le, T., et al. Data Mining. AusDM 2019. Communications in Computer and Information Science, vol 1127. Springer, Singapore. https://doi.org/10.1007/978-981-15-1699-3_1

  • DOI: https://doi.org/10.1007/978-981-15-1699-3_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1698-6

  • Online ISBN: 978-981-15-1699-3

  • eBook Packages: Computer Science, Computer Science (R0)
